Contents
I. Environment Information
II. Pre-deployment Preparation
III. Deploying the Prometheus Monitoring System
IV. Deploying the Node_exporter Component
V. Deploying the Kube_state_metrics Component
VI. Deploying the Grafana Visualization Platform
VII. Connecting Grafana to Prometheus Data
VIII. Adding Monitoring Templates in Grafana
IX. Extensions

I. Environment Information
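1. Servers and Kubernetes version

| IP address     | Hostname | Role        | Version |
|----------------|----------|-------------|---------|
| 192.168.40.180 | master1  | master node | 1.27    |
| 192.168.40.181 | node1    | worker node | 1.27    |
| 192.168.40.182 | node2    | worker node | 1.27    |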
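2. Component versions

| No. | Name               | Version | Purpose                                                                          |
|-----|--------------------|---------|----------------------------------------------------------------------------------|
| 1   | Prometheus         | v2.33.5 | Collects, stores, and processes metric data                                      |
| 2   | Node_exporter      | v0.16.0 | Collects server metrics such as CPU, memory, disk, and network                   |
| 3   | Kube-state-metrics | v1.9.0  | Collects K8S resource metrics such as Pod, Node, Deployment, and Service         |
| 4   | Grafana            | v8.4.5  | Visualizes the data collected by Prometheus                                      |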
II. Pre-deployment Preparation
1. Create the namespace. All resources below are placed in it.
kubectl create ns prometheus

2. Create a ServiceAccount and bind it to the cluster-admin ClusterRole (the Prometheus Deployment references this account).
kubectl create serviceaccount prometheus -n prometheus
kubectl create clusterrolebinding prometheus-clusterrolebinding -n prometheus --clusterrole=cluster-admin --serviceaccount=prometheus:prometheus
kubectl create clusterrolebinding prometheus-clusterrolebinding-1 -n prometheus --clusterrole=cluster-admin --user=system:serviceaccount:prometheus:prometheus

3. Create the Prometheus data directory. Note: Prometheus is deployed on node1, so run this step on node1.
mkdir /data
chmod -R 777 /data

4. Create the Grafana data directory. Grafana is also deployed on node1, so run this step on node1 as well.
mkdir /var/lib/grafana/ -p
chmod 777 /var/lib/grafana/

III. Deploying the Prometheus Monitoring System
1. Create the ConfigMap resource
vim prometheus-cfg.yaml
---
kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus-config
  namespace: prometheus
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s       # how often targets are scraped
      scrape_timeout: 10s        # scrape timeout (default 10s)
      evaluation_interval: 1m    # how often alerting rules are evaluated (default 1m)
    scrape_configs:
    - job_name: 'kubernetes-node'
      kubernetes_sd_configs:     # Kubernetes-based service discovery
      - role: node               # discover one target per node
      relabel_configs:           # relabeling rules
      - source_labels: [__address__]    # the discovered address (IP:port)
        regex: '(.*):10250'             # port 10250 is the kubelet port
        replacement: '${1}:9100'        # rewrite it to the node_exporter port 9100
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: 'kubernetes-node-cadvisor'   # cAdvisor reports resource usage and performance metrics for the containers running on each node
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap                           # copy the matching labels onto the target
        regex: __meta_kubernetes_node_label_(.+)   # every label that starts with __meta_kubernetes_node_label_
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-apiserver'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints          # discover targets through Kubernetes Endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep             # scrape only the targets that match; all others are dropped
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name

Apply the manifest
kubectl apply -f prometheus-cfg.yaml

View the ConfigMap
kubectl get configmap -n prometheus prometheus-config

2. Create the Deployment resource
vim prometheus-deploy.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: prometheus
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
      component: server
  template:
    metadata:
      labels:
        app: prometheus
        component: server
      annotations:
        prometheus.io/scrape: 'false'
    spec:
      nodeName: node1                   # schedule onto node1
      serviceAccountName: prometheus    # the ServiceAccount created earlier
      containers:
      - name: prometheus
        image: prom/prometheus:v2.33.5
        imagePullPolicy: IfNotPresent
        command:                        # command run at startup
        - prometheus
        - --config.file=/etc/prometheus/prometheus.yml   # configuration file
        - --storage.tsdb.path=/prometheus                # data directory
        - --storage.tsdb.retention=720h                  # keep data for 720 hours (30 days)
        - --web.enable-lifecycle                         # enable hot reload
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/prometheus    # mount the prometheus-config volume at /etc/prometheus
          name: prometheus-config
        - mountPath: /prometheus/
          name: prometheus-storage-volume
      volumes:
      - name: prometheus-config         # expose the prometheus-config ConfigMap as a volume
        configMap:
          name: prometheus-config
      - name: prometheus-storage-volume
        hostPath:
          path: /data
          type: Directory

Apply the manifest
kubectl apply -f prometheus-deploy.yaml

View the Deployment
kubectl get deployment prometheus-server -n prometheus

3. Create the Service resource
vim prometheus-svc.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-svc
  namespace: prometheus
  labels:
    app: prometheus
spec:
  type: NodePort
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 31090
    protocol: TCP
  selector:
    app: prometheus
    component: server

Apply the manifest
kubectl apply -f prometheus-svc.yaml

View the Service
kubectl get svc prometheus-svc -n prometheus

4. Open http://IP:31090 in a browser
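
On the Status > Targets page, the kubernetes-node job should list each node on port 9100 rather than on the kubelet port. That is the effect of the first relabel rule in the ConfigMap from step 1. The rewrite can be reproduced outside Prometheus with a minimal sketch. Here `relabel_node_address` is a hypothetical helper name, sed's `\1` stands in for Prometheus's `${1}`, and the `^` and `$` are written out because Prometheus anchors relabel regexes at both ends.

```shell
#!/bin/sh
# Hypothetical helper that mimics the kubernetes-node relabel rule:
#   regex: (.*):10250   replacement: ${1}:9100
# Addresses that do not end in :10250 pass through unchanged, which is
# also what "action: replace" does when the regex does not match.
relabel_node_address() {
  printf '%s\n' "$1" | sed -E 's/^(.*):10250$/\1:9100/'
}

relabel_node_address "192.168.40.181:10250"   # prints 192.168.40.181:9100
relabel_node_address "192.168.40.181:6443"    # prints 192.168.40.181:6443
```

If a node shows up on the Targets page with port 10250 instead, the regex in the ConfigMap is the first thing to check.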
IV. Deploying the Node_exporter Component
Use a DaemonSet resource.
vim node-export.yaml
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: prometheus
  labels:
    name: node-exporter
spec:
  selector:
    matchLabels:
      name: node-exporter
  template:
    metadata:
      labels:
        name: node-exporter
    spec:
      hostPID: true
      hostIPC: true
      # Use the host's network: the Pod takes the IP address of whichever node it is scheduled onto
      hostNetwork: true
      containers:
      - name: node-exporter
        image: prom/node-exporter:v0.16.0
        imagePullPolicy: IfNotPresent
        ports:
        # Exposed port
        - containerPort: 9100
        resources:
          requests:
            cpu: 0.15
        securityContext:
          privileged: true
        args:
        - --path.procfs
        - /host/proc
        - --path.sysfs
        - /host/sys
        - --collector.filesystem.ignored-mount-points
        - '^/(sys|proc|dev|host|etc)($|/)'
        volumeMounts:
        - name: dev
          mountPath: /host/dev
        - name: proc
          mountPath: /host/proc
        - name: sys
          mountPath: /host/sys
        - name: rootfs
          mountPath: /rootfs
        - name: localtime
          mountPath: /etc/localtime
      # Tolerations that allow scheduling onto the master node
      tolerations:
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Exists"
        effect: "NoSchedule"
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: dev
        hostPath:
          path: /dev
      - name: sys
        hostPath:
          path: /sys
      - name: rootfs
        hostPath:
          path: /
      - name: localtime
        hostPath:
          path: /etc/localtime
          type: File

Note: adjust the tolerations to match your environment so that the Pod may be scheduled onto the master node; nothing else needs to change.
You can check which taints master1 carries (for example with kubectl describe node master1 | grep Taints) and mirror them in the tolerations above.

Apply the manifest
kubectl apply -f node-export.yaml

View the resources. Normally all three nodes run a node_exporter Pod; if the master node is missing, recheck the tolerations above.
kubectl get pods -n prometheus -o wide

V. Deploying the Kube_state_metrics Component
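What is kube-state-metrics? kube-state-metrics watches the API Server and generates state metrics about resource objects such as Nodes and Pods. Note that it only exposes these metrics and does not store them, so we use Prometheus to scrape and store the data. Its focus is business-related metadata such as Pod replica status: how many replicas were scheduled, and how many are available now? How many Pods are in the running/stopped/terminated state? How many times has a Pod restarted? How many jobs are currently running?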
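
Each of those questions maps onto a metric that kube-state-metrics exposes. As a rough sketch, to be run once the component is deployed and Prometheus has scraped it, the helper below picks a PromQL expression for a question and sends it to the Prometheus HTTP API at /api/v1/query. The function names and the question keywords are made up for illustration, and the default address assumes the Prometheus NodePort 31090 defined earlier.

```shell
#!/bin/sh
# Assumption: Prometheus is reachable through its NodePort on master1.
PROM_URL="${PROM_URL:-http://192.168.40.180:31090}"

# Hypothetical helper: map a question keyword to a PromQL expression
# built from kube-state-metrics metric names.
ksm_promql() {
  case "$1" in
    available) echo 'kube_deployment_status_replicas_available' ;;
    phases)    echo 'sum by (phase) (kube_pod_status_phase)' ;;
    restarts)  echo 'sum by (namespace, pod) (kube_pod_container_status_restarts_total)' ;;
    jobs)      echo 'sum(kube_job_status_active)' ;;
    *)         echo "unknown question: $1" >&2; return 1 ;;
  esac
}

# Run an instant query. -G with --data-urlencode makes curl URL-encode
# the expression and append it to the query string.
ksm_query() {
  promql=$(ksm_promql "$1") || return 1
  curl -s -G "$PROM_URL/api/v1/query" --data-urlencode "query=$promql"
}
```

For example, `ksm_query restarts` would return the per-Pod restart counters as JSON, and `ksm_query jobs` the number of active jobs.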
vim kube-state-metrics.yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "services", "resourcequotas", "replicationcontrollers", "limitranges", "persistentvolumeclaims", "persistentvolumes", "namespaces", "endpoints"]
  verbs: ["list", "watch"]
- apiGroups: ["extensions"]
  resources: ["daemonsets", "deployments", "replicasets"]
  verbs: ["list", "watch"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["list", "watch"]
- apiGroups: ["batch"]
  resources: ["cronjobs", "jobs"]
  verbs: ["list", "watch"]
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: prometheus
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
      - name: kube-state-metrics
        image: quay.io/coreos/kube-state-metrics:v1.9.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'    # lets the kubernetes-service-endpoints job discover this Service
  name: kube-state-metrics
  namespace: prometheus
  labels:
    app: kube-state-metrics
spec:
  ports:
  - name: kube-state-metrics
    port: 8080
    protocol: TCP
  selector:
    app: kube-state-metrics

Apply the manifest
kubectl apply -f kube-state-metrics.yaml

View the resources
kubectl get pods -n prometheus

VI. Deploying the Grafana Visualization Platform
Note: set nodeName so that Grafana is deployed on node1; nothing else needs to change.
vim grafana.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-server
  namespace: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      task: monitoring
      k8s-app: grafana
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: grafana
    spec:
      nodeName: node1    # deploy onto node1
      containers:
      - name: grafana
        image: grafana/grafana:8.4.5
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/ssl/certs
          name: ca-certificates
          readOnly: true
        - mountPath: /var
          name: grafana-storage
        - mountPath: /var/lib/grafana/
          name: lib
        env:
        - name: INFLUXDB_HOST
          value: monitoring-influxdb
        - name: GF_SERVER_HTTP_PORT
          value: "3000"
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          value: /
      volumes:
      - name: ca-certificates
        hostPath:
          path: /etc/ssl/certs
      - name: grafana-storage
        emptyDir: {}
      - name: lib
        hostPath:
          path: /var/lib/grafana/
          type: DirectoryOrCreate
---
apiVersion: v1
kind: Service
metadata:
  labels:
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: grafana-svc
  namespace: prometheus
spec:
  ports:
  - port: 80
    targetPort: 3000
    nodePort: 31091
  selector:
    k8s-app: grafana
  type: NodePort

Apply the manifest
kubectl apply -f grafana.yaml

View the resources
kubectl get pods -n prometheus

Open http://IP:31091 in a browser. If Grafana loads, everything up to this step is working.
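
The same check can be scripted instead of done in a browser. The sketch below polls Grafana's /api/health endpoint until it answers. `retry` and `grafana_ready` are hypothetical helper names, and the default address assumes NodePort 31091 on master1.

```shell
#!/bin/sh
# Assumption: Grafana is reachable through its NodePort on master1.
GRAFANA_URL="${GRAFANA_URL:-http://192.168.40.180:31091}"

# retry ATTEMPTS DELAY COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS tries have been used,
# sleeping DELAY seconds between tries.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    "$@" && return 0
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# curl -f turns an HTTP error status into a non-zero exit code.
grafana_ready() {
  curl -sf "$GRAFANA_URL/api/health" >/dev/null
}
```

A typical call would be `retry 30 2 grafana_ready && echo "Grafana is up"`, which gives the Pod up to about a minute to start.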
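VII. Connecting Grafana to Prometheus Data
1. Click Settings > Data Sources > Add data source and select Prometheus.
2. Fill in the Name and URL fields. For the URL, use the Service's DNS name, which has the form <Service name>.<namespace>.svc
http://prometheus-svc.prometheus.svc:9090
3. Scroll down and click Save & test.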
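
The same data source can also be created without the UI. What follows is only a sketch: it builds the JSON body for Grafana's POST /api/datasources endpoint and sends it with curl. It assumes that the anonymous Admin access configured in grafana.yaml is enough to authorize the call. `datasource_payload` and `add_datasource` are hypothetical names, and the data source name used in the usage line is an arbitrary choice.

```shell
#!/bin/sh
# Assumption: Grafana is reachable through its NodePort on master1.
GRAFANA_URL="${GRAFANA_URL:-http://192.168.40.180:31091}"

# Build the request body from a name and a URL.
# access=proxy means the Grafana server itself calls the URL, which is
# why the in-cluster Service DNS name can be used.
datasource_payload() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1" "$2"
}

# Send the body to Grafana; --data @- reads it from standard input.
add_datasource() {
  datasource_payload "$1" "$2" |
    curl -s -X POST -H 'Content-Type: application/json' --data @- "$GRAFANA_URL/api/datasources"
}
```

Usage would look like `add_datasource Prometheus http://prometheus-svc.prometheus.svc:9090`. If Grafana rejects the request, fall back to the UI steps above.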
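VIII. Adding Monitoring Templates in Grafana

| No. | Template file                 | Notes                           |
|-----|-------------------------------|---------------------------------|
| 1   | node_exporter.json            | Server monitoring template 2    |
| 2   | docker_rev1.json              | Docker monitoring template      |
| 3   | Kubernetes-1577674936972.json | K8S cluster monitoring template |
| 4   | Kubernetes-1577691996738.json | K8S cluster monitoring template |

1. Import node_exporter.json (server monitoring template 2)
2. Import docker_rev1.json (Docker monitoring template)
3. Import Kubernetes-1577691996738.json (K8S monitoring template 2)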
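IX. Extensions
1. Hot reload
curl -XPOST http://192.168.40.180:31090/-/reload

2. Add a new Service
A Service is discovered by Prometheus only if it carries the scrape annotation. This follows from the ConfigMap we defined: the kubernetes-service-endpoints job keeps only targets whose prometheus_io_scrape annotation is true.
Example: take the prometheus-svc Service defined above and add the prometheus_io_scrape annotation.
vim prometheus-svc.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-svc
  namespace: prometheus
  labels:
    app: prometheus
  annotations:
    prometheus_io_scrape: "true"    # required for Prometheus to discover this Service
spec:
  type: NodePort
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 31090
    protocol: TCP
  selector:
    app: prometheus
    component: server

Update the manifest
kubectl apply -f prometheus-svc.yaml

Hot-reload Prometheus
curl -XPOST http://192.168.40.180:31090/-/reload

OK, Prometheus is now monitoring the Service!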
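
Why does the annotation key prometheus_io_scrape work here, when the kube-state-metrics Service earlier in this guide uses prometheus.io/scrape? Service discovery exposes each annotation as a `__meta_kubernetes_service_annotation_<name>` label, and characters that are not valid in a label name are replaced with underscores. Both spellings therefore end up as the same label, which is the one the keep rule inspects. A minimal sketch of that mapping follows. The helper name is hypothetical, and the sed expression only approximates Prometheus's sanitizing rather than reproducing its code.

```shell
#!/bin/sh
# Hypothetical helper: derive the discovery label for a Service
# annotation key by replacing every character outside [a-zA-Z0-9_]
# with an underscore.
annotation_meta_label() {
  printf '__meta_kubernetes_service_annotation_%s\n' \
    "$(printf '%s' "$1" | sed 's/[^a-zA-Z0-9_]/_/g')"
}

# Both lines print __meta_kubernetes_service_annotation_prometheus_io_scrape
annotation_meta_label "prometheus.io/scrape"
annotation_meta_label "prometheus_io_scrape"
```

The same reasoning applies to the prometheus.io/scheme, prometheus.io/path, and prometheus.io/port annotations that the other relabel rules in the job read.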