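Introduction
There are several ways to deploy Prometheus monitoring on a Kubernetes (k8s) cluster:
- Write the YAML manifests by hand. This is tedious and involves too many details, so it is not recommended.
- Deploy with the open-source project prometheus-operator.
- Deploy with the open-source project kube-prometheus.
prometheus-operator contains only the operator itself, which manages and operates Prometheus and Alertmanager clusters. Project page: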
https://github.com/prometheus-operator/prometheus-operator
kube-prometheus builds on the Prometheus Operator plus a set of manifests, and helps you quickly deploy a Prometheus monitoring stack in a Kubernetes cluster. Project page:
https://github.com/prometheus-operator/kube-prometheus
Here I chose kube-prometheus to deploy the monitoring stack.
Download the kube-prometheus project
# The version I used is release-0.11
https://github.com/prometheus-operator/kube-prometheus/tree/release-0.11
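One way to fetch that branch is a shallow git clone. The following is a minimal sketch, assuming git is installed and GitHub is reachable. The destination directory name and the function name fetch_kube_prometheus are my own choices (the directory is picked to match the shell prompt shown in the install step), not something the project prescribes.

```shell
#!/bin/sh
# Sketch: download the release-0.11 branch of kube-prometheus.
REPO_URL="https://github.com/prometheus-operator/kube-prometheus.git"
BRANCH="release-0.11"
# Assumption: directory name chosen to match the prompt used later in this post.
DEST="kube-prometheus-${BRANCH}"

# Hypothetical helper: shallow-clone only the wanted branch.
fetch_kube_prometheus() {
  git clone --depth 1 --branch "$BRANCH" "$REPO_URL" "$DEST"
}

# Run the download with:
#   fetch_kube_prometheus && cd "$DEST"
```

Downloading the branch as an archive from the GitHub page works just as well; the clone is only convenient when you expect to update later.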
Install kube-prometheus
# The installation steps are given in the project's GitHub README
[root@master kube-prometheus-release-0.11]# kubectl apply --server-side -f manifests/setup
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com serverside-applied
namespace/monitoring serverside-applied
[root@master kube-prometheus-release-0.11]# until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
No resources found
kubectl apply -f manifests/
...
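The until loop above simply polls until the ServiceMonitor CRD is being served, so that the second apply does not fail on unknown resource kinds. A sketch of the same three-step sequence using kubectl wait instead of polling, assuming a kubectl that supports --server-side and wait, and that it is run from the root of the kube-prometheus directory. The function name install_kube_prometheus and the 120s timeout are hypothetical choices.

```shell
# Sketch: apply CRDs and namespace first, wait for the CRDs, then apply the rest.
install_kube_prometheus() {
  # Step 1: CRDs and the monitoring namespace.
  kubectl apply --server-side -f manifests/setup || return 1
  # Step 2: block until every CRD reports the Established condition.
  kubectl wait --for=condition=Established --all CustomResourceDefinition --timeout=120s || return 1
  # Step 3: the remaining components (Prometheus, Alertmanager, Grafana, exporters).
  kubectl apply -f manifests/
}

# Run with:
#   install_kube_prometheus
```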
Installation is complete. Check the resources:
[root@master kube-prometheus-release-0.11]# kubectl get all -n monitoring
NAME READY STATUS RESTARTS AGE
pod/alertmanager-main-0 2/2 Running 0 65s
pod/alertmanager-main-1 2/2 Running 0 65s
pod/alertmanager-main-2 2/2 Running 0 63s
pod/blackbox-exporter-559db48fd-4c6rf 3/3 Running 0 2m40s
pod/grafana-546559f668-ft5zs 1/1 Running 0 2m15s
pod/kube-state-metrics-576b75c6f7-dx8vs 3/3 Running 0 2m9s
pod/node-exporter-fzwzs 2/2 Running 0 2m
pod/node-exporter-qstbq 2/2 Running 0 2m
pod/node-exporter-r9w26 2/2 Running 0 2m1s
pod/prometheus-adapter-5f68766c85-hvvhn 1/1 Running 0 86s
pod/prometheus-adapter-5f68766c85-vkh7l 1/1 Running 0 86s
pod/prometheus-k8s-0 2/2 Running 0 49s
pod/prometheus-k8s-1 0/2 PodInitializing 0 49s
pod/prometheus-operator-68845dfbbf-ldvvz 2/2 Running 0 81s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-main ClusterIP 10.0.0.120 <none> 9093/TCP,8080/TCP 2m46s
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 66s
service/blackbox-exporter ClusterIP 10.0.0.164 <none> 9115/TCP,19115/TCP 2m41s
service/grafana ClusterIP 10.0.0.80 <none> 3000/TCP 2m18s
service/kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 2m10s
service/node-exporter ClusterIP None <none> 9100/TCP 2m2s
service/prometheus-adapter ClusterIP 10.0.0.213 <none> 443/TCP 91s
service/prometheus-k8s ClusterIP 10.0.0.28 <none> 9090/TCP,8080/TCP 100s
service/prometheus-operated ClusterIP None <none> 9090/TCP 51s
service/prometheus-operator ClusterIP None <none> 8443/TCP 84s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/node-exporter 3 3 3 3 3 kubernetes.io/os=linux 2m4s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/blackbox-exporter 1/1 1 1 2m49s
deployment.apps/grafana 1/1 1 1 2m25s
deployment.apps/kube-state-metrics 1/1 1 1 2m19s
deployment.apps/prometheus-adapter 2/2 2 2 100s
deployment.apps/prometheus-operator 1/1 1 1 93s
NAME DESIRED CURRENT READY AGE
replicaset.apps/blackbox-exporter-559db48fd 1 1 1 2m51s
replicaset.apps/grafana-546559f668 1 1 1 2m27s
replicaset.apps/kube-state-metrics-576b75c6f7 1 1 1 2m20s
replicaset.apps/prometheus-adapter-5f68766c85 2 2 2 102s
replicaset.apps/prometheus-operator-68845dfbbf 1 1 1 95s
NAME READY AGE
statefulset.apps/alertmanager-main 2/3 75s
statefulset.apps/prometheus-k8s 1/2 59s
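The output above shows that a monitoring namespace was created automatically and the Pods have been created; a few (such as prometheus-k8s-1) were still initializing when the snapshot was taken.
How to access Grafana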
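Rather than re-running kubectl get all until everything is Ready, you can let kubectl block until the rollout finishes. A minimal sketch, assuming the StatefulSet names shown in the output above; the function name wait_for_monitoring and the 300s timeout are hypothetical.

```shell
# Sketch: wait until the monitoring stack is fully rolled out.
NAMESPACE="monitoring"

wait_for_monitoring() {
  # StatefulSets created by the operator (names taken from the output above).
  for sts in alertmanager-main prometheus-k8s; do
    kubectl rollout status "statefulset/${sts}" -n "$NAMESPACE" --timeout=300s || return 1
  done
  # Everything else: all Pods in the namespace must report Ready.
  kubectl wait --for=condition=Ready pods --all -n "$NAMESPACE" --timeout=300s
}

# Run with:
#   wait_for_monitoring && echo "monitoring stack is ready"
```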
[root@master kube-prometheus-release-0.11]# kubectl get svc -n monitoring
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-main ClusterIP 10.0.0.120 <none> 9093/TCP,8080/TCP 4m59s
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 3m19s
blackbox-exporter ClusterIP 10.0.0.164 <none> 9115/TCP,19115/TCP 4m54s
grafana ClusterIP 10.0.0.80 <none> 3000/TCP 4m31s
kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 4m23s
node-exporter ClusterIP None <none> 9100/TCP 4m15s
prometheus-adapter ClusterIP 10.0.0.213 <none> 443/TCP 3m44s
prometheus-k8s ClusterIP 10.0.0.28 <none> 9090/TCP,8080/TCP 3m53s
prometheus-operated ClusterIP None <none> 9090/TCP 3m4s
prometheus-operator ClusterIP None <none> 8443/TCP 3m37s
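By default every Service is of type ClusterIP and cannot be reached from outside the cluster. The best approach is to expose the services through an Ingress, but my cluster has no Ingress controller installed, so I switch the Services to NodePort instead.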
vim manifests/grafana-service.yaml
spec:
  ports:
  - name: http
    port: 3000
    targetPort: http
  type: NodePort
# Make the same change in these two files, so that grafana, alertmanager, and prometheus all use type: NodePort
manifests/alertmanager-service.yaml
manifests/prometheus-service.yaml
kubectl apply -f manifests/grafana-service.yaml
kubectl apply -f manifests/alertmanager-service.yaml
kubectl apply -f manifests/prometheus-service.yaml
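If you would rather not edit the manifest files, the same switch can be made on the live Service objects with kubectl patch. This is a sketch of an alternative, not the method used above; it leaves the files on disk unchanged, so the files and the cluster will then differ. The function name expose_as_nodeport is hypothetical, and the Service names are the ones from the kubectl get svc output.

```shell
# Sketch: change the three Services to NodePort in place.
NAMESPACE="monitoring"

expose_as_nodeport() {
  for svc in grafana alertmanager-main prometheus-k8s; do
    # Strategic merge patch: only spec.type is changed, ports are kept as they are.
    kubectl patch service "$svc" -n "$NAMESPACE" -p '{"spec":{"type":"NodePort"}}' || return 1
  done
}

# Run with:
#   expose_as_nodeport
```

For a quick look without changing any Service, kubectl -n monitoring port-forward svc/grafana 3000 is another option.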
Check the Service information again:
alertmanager-main NodePort 10.0.0.120 <none> 9093:47927/TCP,8080:31539/TCP 10m
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 9m12s
blackbox-exporter ClusterIP 10.0.0.164 <none> 9115/TCP,19115/TCP 10m
grafana NodePort 10.0.0.80 <none> 3000:37010/TCP 10m
kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 10m
node-exporter ClusterIP None <none> 9100/TCP 10m
prometheus-adapter ClusterIP 10.0.0.213 <none> 443/TCP 9m37s
prometheus-k8s NodePort 10.0.0.28 <none> 9090:40124/TCP,8080:42004/TCP 9m46s
prometheus-operated ClusterIP None <none> 9090/TCP 8m57s
prometheus-operator ClusterIP None <none> 8443/TCP 9m30s
Here you can see that Grafana, Prometheus, and Alertmanager are now of type NodePort.
Pick the IP address of any node and open each service on its NodePort:
Grafana
Prometheus
Alertmanager
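The address to open in the browser is a node IP plus the NodePort assigned to the service. A small sketch that looks the ports up and prints the three URLs, assuming plain HTTP and that the first port of each Service is its web port (which matches the output above); the function names are hypothetical.

```shell
# Sketch: print browser URLs for the three NodePort services.
NAMESPACE="monitoring"

# Pure helper: format a URL from a node IP and a port.
nodeport_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Look up the NodePort assigned to a Service's first port.
service_nodeport() {
  kubectl get service "$1" -n "$NAMESPACE" -o jsonpath='{.spec.ports[0].nodePort}'
}

# Print one URL per service for the given node IP.
print_urls() {
  node_ip="$1"
  for svc in grafana prometheus-k8s alertmanager-main; do
    port=$(service_nodeport "$svc") || return 1
    nodeport_url "$node_ip" "$port"
  done
}

# Usage (replace the placeholder with one of your node IPs):
#   print_urls <node-ip>
```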
How to uninstall Prometheus
kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup
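With this, the resources in the Kubernetes cluster are being monitored. One problem I ran into along the way: when installing kube-prometheus, many of the images could not be downloaded. In the next post I will explain how to pull images hosted on k8s.gcr.io.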
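Before dealing with image pulls, it helps to know exactly which images the manifests reference. A minimal sketch that scans the manifests directory for image: fields; the function name list_images is hypothetical, and images passed only as command-line arguments inside a manifest are not picked up by this simple pattern.

```shell
# Sketch: print each distinct image referenced under a manifests directory.
list_images() {
  dir="${1:-manifests}"
  # -r recurse, -h hide file names, -o print only the match, -E extended regex.
  grep -rhoE 'image:[[:space:]]*[^[:space:]]+' "$dir" \
    | sed -E 's/^image:[[:space:]]*//' \
    | sort -u
}

# Usage: show only the images hosted on k8s.gcr.io
#   list_images manifests | grep 'k8s.gcr.io'
```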