Kubernetes Prometheus

Introduction

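There are several ways to deploy Prometheus monitoring on a k8s platform:

  1. Write all the YAML deployment manifests by hand. This is tedious and involves too many details, so it is not recommended.
  2. Deploy with the open-source project prometheus-operator.
  3. Deploy with the open-source project kube-prometheus.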

prometheus-operator contains only the operator itself, which manages and operates Prometheus and Alertmanager clusters. Project URL:
https://github.com/prometheus-operator/prometheus-operator

kube-prometheus builds on the Prometheus Operator plus a collection of manifests, so that you can quickly deploy a Prometheus monitoring system in a Kubernetes cluster. Project URL:
https://github.com/prometheus-operator/kube-prometheus

Here I use kube-prometheus to deploy the monitoring stack.

Download the kube-prometheus project

# The version I use is release-0.11
https://github.com/prometheus-operator/kube-prometheus/tree/release-0.11
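
One way to fetch that branch from the command line is sketched below. The helper names (`kp_archive_url`, `fetch_kube_prometheus`) are hypothetical, and the URL pattern is GitHub's generic branch-archive scheme rather than anything specific to this project. Unpacking the archive should produce a kube-prometheus-release-0.11 directory, which matches the shell prompt in the steps that follow.

```shell
#!/bin/sh
# Sketch: download a pinned kube-prometheus branch as a tarball.
# KP_REPO is the project URL from the article; KP_BRANCH defaults to release-0.11.
KP_REPO="https://github.com/prometheus-operator/kube-prometheus"
KP_BRANCH="${KP_BRANCH:-release-0.11}"

# Build GitHub's branch-archive URL for the given branch name.
kp_archive_url() {
  printf '%s/archive/refs/heads/%s.tar.gz\n' "$KP_REPO" "$1"
}

# Download and unpack the branch into the current directory.
# GitHub names the top-level directory <repo>-<branch>.
fetch_kube_prometheus() {
  curl -fsSL "$(kp_archive_url "$KP_BRANCH")" | tar -xz
}
```

After sourcing the snippet, `fetch_kube_prometheus && cd "kube-prometheus-$KP_BRANCH"` puts you in the project root. A `git clone --branch release-0.11 --depth 1` of the same repository works as well, but the checkout directory is then named kube-prometheus.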

Install kube-prometheus

# The installation steps are already documented on GitHub
[root@master kube-prometheus-release-0.11]# kubectl apply --server-side -f manifests/setup
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com serverside-applied
namespace/monitoring serverside-applied


[root@master kube-prometheus-release-0.11]# until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
No resources found

kubectl apply -f manifests/
...
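
The three steps above (apply the CRDs, wait, apply the rest) can be wrapped into a single function. This is a sketch rather than the project's official script: the name `install_kube_prometheus` and the 120s timeout are assumptions, and `kubectl wait --for=condition=Established` is swapped in for the until-loop as another way to make sure the CRDs are registered before the remaining manifests are applied.

```shell
#!/bin/sh
# Sketch: run from the root of the kube-prometheus directory.
install_kube_prometheus() {
  # 1. CRDs and the monitoring namespace, using server-side apply as above.
  kubectl apply --server-side -f manifests/setup || return 1

  # 2. Block until the CRDs are Established (alternative to the until-loop).
  #    Note: --all covers every CRD in the cluster, not only the monitoring ones.
  kubectl wait --for=condition=Established --all customresourcedefinition \
    --timeout=120s || return 1

  # 3. Everything else: Prometheus, Alertmanager, Grafana, and the exporters.
  kubectl apply -f manifests/
}
```

Stopping at the first failing step keeps the remaining manifests from being applied against CRDs that are not registered yet.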

Installation is complete; let's look at the resources

[root@master kube-prometheus-release-0.11]# kubectl get all -n monitoring
NAME                                       READY   STATUS            RESTARTS   AGE
pod/alertmanager-main-0                    2/2     Running           0          65s
pod/alertmanager-main-1                    2/2     Running           0          65s
pod/alertmanager-main-2                    2/2     Running           0          63s
pod/blackbox-exporter-559db48fd-4c6rf      3/3     Running           0          2m40s
pod/grafana-546559f668-ft5zs               1/1     Running           0          2m15s
pod/kube-state-metrics-576b75c6f7-dx8vs    3/3     Running           0          2m9s
pod/node-exporter-fzwzs                    2/2     Running           0          2m
pod/node-exporter-qstbq                    2/2     Running           0          2m
pod/node-exporter-r9w26                    2/2     Running           0          2m1s
pod/prometheus-adapter-5f68766c85-hvvhn    1/1     Running           0          86s
pod/prometheus-adapter-5f68766c85-vkh7l    1/1     Running           0          86s
pod/prometheus-k8s-0                       2/2     Running           0          49s
pod/prometheus-k8s-1                       0/2     PodInitializing   0          49s
pod/prometheus-operator-68845dfbbf-ldvvz   2/2     Running           0          81s

NAME                            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-main       ClusterIP   10.0.0.120   <none>        9093/TCP,8080/TCP            2m46s
service/alertmanager-operated   ClusterIP   None         <none>        9093/TCP,9094/TCP,9094/UDP   66s
service/blackbox-exporter       ClusterIP   10.0.0.164   <none>        9115/TCP,19115/TCP           2m41s
service/grafana                 ClusterIP   10.0.0.80    <none>        3000/TCP                     2m18s
service/kube-state-metrics      ClusterIP   None         <none>        8443/TCP,9443/TCP            2m10s
service/node-exporter           ClusterIP   None         <none>        9100/TCP                     2m2s
service/prometheus-adapter      ClusterIP   10.0.0.213   <none>        443/TCP                      91s
service/prometheus-k8s          ClusterIP   10.0.0.28    <none>        9090/TCP,8080/TCP            100s
service/prometheus-operated     ClusterIP   None         <none>        9090/TCP                     51s
service/prometheus-operator     ClusterIP   None         <none>        8443/TCP                     84s

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/node-exporter   3         3         3       3            3           kubernetes.io/os=linux   2m4s

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/blackbox-exporter     1/1     1            1           2m49s
deployment.apps/grafana               1/1     1            1           2m25s
deployment.apps/kube-state-metrics    1/1     1            1           2m19s
deployment.apps/prometheus-adapter    2/2     2            2           100s
deployment.apps/prometheus-operator   1/1     1            1           93s

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/blackbox-exporter-559db48fd      1         1         1       2m51s
replicaset.apps/grafana-546559f668               1         1         1       2m27s
replicaset.apps/kube-state-metrics-576b75c6f7    1         1         1       2m20s
replicaset.apps/prometheus-adapter-5f68766c85    2         2         2       102s
replicaset.apps/prometheus-operator-68845dfbbf   1         1         1       95s

NAME                                 READY   AGE
statefulset.apps/alertmanager-main   2/3     75s
statefulset.apps/prometheus-k8s      1/2     59s

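As shown above, a monitoring namespace was created automatically and all the pods have been created. At the time of this snapshot prometheus-k8s-1 is still in PodInitializing and the StatefulSets report 2/3 and 1/2 ready, so give them a little longer to come up.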
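
When a few pods are still starting, as in the listing above, it can help to print only the ones that are not fully ready. A minimal sketch follows; `not_ready_pods` is a hypothetical helper that reads the output of `kubectl get pods --no-headers` (columns NAME, READY, STATUS, ...) on stdin.

```shell
#!/bin/sh
# Sketch: print the names of pods that are not Running or not fully ready.
not_ready_pods() {
  awk '{
    split($2, ready, "/")   # READY column, e.g. "0/2"
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}
```

Typical use would be `kubectl get pods -n monitoring --no-headers | not_ready_pods`; empty output means everything is up. To simply block until the pods are ready, `kubectl wait --for=condition=Ready pods --all -n monitoring --timeout=300s` is an alternative.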

How to access Grafana

[root@master kube-prometheus-release-0.11]# kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
alertmanager-main       ClusterIP   10.0.0.120   <none>        9093/TCP,8080/TCP            4m59s
alertmanager-operated   ClusterIP   None         <none>        9093/TCP,9094/TCP,9094/UDP   3m19s
blackbox-exporter       ClusterIP   10.0.0.164   <none>        9115/TCP,19115/TCP           4m54s
grafana                 ClusterIP   10.0.0.80    <none>        3000/TCP                     4m31s
kube-state-metrics      ClusterIP   None         <none>        8443/TCP,9443/TCP            4m23s
node-exporter           ClusterIP   None         <none>        9100/TCP                     4m15s
prometheus-adapter      ClusterIP   10.0.0.213   <none>        443/TCP                      3m44s
prometheus-k8s          ClusterIP   10.0.0.28    <none>        9090/TCP,8080/TCP            3m53s
prometheus-operated     ClusterIP   None         <none>        9090/TCP                     3m4s
prometheus-operator     ClusterIP   None         <none>        8443/TCP                     3m37s

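By default every Service is of type ClusterIP and cannot be reached from outside the cluster. The best approach here is to expose the services through an ingress; since my cluster has no ingress installed, I switch them to NodePort instead.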
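
For quick, temporary access from a workstation, `kubectl port-forward` is another option that needs neither an ingress nor NodePort. A sketch, where `forward_monitoring_ui` is a hypothetical helper and the Service names and ports are the ones from the listing above:

```shell
#!/bin/sh
# Sketch: forward one of the monitoring UIs to localhost.
forward_monitoring_ui() {
  case "$1" in
    grafana)      svc=grafana;           port=3000 ;;
    prometheus)   svc=prometheus-k8s;    port=9090 ;;
    alertmanager) svc=alertmanager-main; port=9093 ;;
    *) echo "usage: forward_monitoring_ui grafana|prometheus|alertmanager" >&2
       return 2 ;;
  esac
  # Forwards localhost:<port> to the Service until interrupted with Ctrl-C.
  kubectl -n monitoring port-forward "svc/$svc" "$port"
}
```

For example, `forward_monitoring_ui grafana` makes Grafana reachable at http://localhost:3000 for as long as the command keeps running.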

vim manifests/grafana-service.yaml
spec:
  ports:
  - name: http
    port: 3000
    targetPort: http
  type: NodePort    # add this line

# Make the same change in the following two files, so that grafana,
# alertmanager, and prometheus are all configured with type: NodePort:
#   manifests/alertmanager-service.yaml
#   manifests/prometheus-service.yaml

kubectl apply -f manifests/grafana-service.yaml
kubectl apply -f manifests/alertmanager-service.yaml 
kubectl apply -f manifests/prometheus-service.yaml
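
As an alternative to editing the three YAML files, the live Service objects can be patched directly. This is only a sketch: `expose_as_nodeport` is a hypothetical name, and the Service names are taken from the `kubectl get svc` output above. Note that the change then lives only in the cluster, not in the manifest files.

```shell
#!/bin/sh
# Sketch: switch the three UI Services to NodePort without editing files.
expose_as_nodeport() {
  for svc in grafana alertmanager-main prometheus-k8s; do
    kubectl -n monitoring patch svc "$svc" \
      -p '{"spec":{"type":"NodePort"}}' || return 1
  done
}
```

Editing the manifests, as done above, has the advantage that the NodePort setting survives a reinstall from the same directory.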

Check the Service information again

alertmanager-main       NodePort    10.0.0.120   <none>        9093:47927/TCP,8080:31539/TCP   10m
alertmanager-operated   ClusterIP   None         <none>        9093/TCP,9094/TCP,9094/UDP      9m12s
blackbox-exporter       ClusterIP   10.0.0.164   <none>        9115/TCP,19115/TCP              10m
grafana                 NodePort    10.0.0.80    <none>        3000:37010/TCP                  10m
kube-state-metrics      ClusterIP   None         <none>        8443/TCP,9443/TCP               10m
node-exporter           ClusterIP   None         <none>        9100/TCP                        10m
prometheus-adapter      ClusterIP   10.0.0.213   <none>        443/TCP                         9m37s
prometheus-k8s          NodePort    10.0.0.28    <none>        9090:40124/TCP,8080:42004/TCP   9m46s
prometheus-operated     ClusterIP   None         <none>        9090/TCP                        8m57s
prometheus-operator     ClusterIP   None         <none>        8443/TCP                        9m30s

Here you can see that Grafana, Prometheus, and Alertmanager are now all of type NodePort.

Pick the IP address of any node and open each service on its NodePort
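
The URL for each UI is a node IP combined with the Service's NodePort. A small sketch for looking it up: `service_url` is a hypothetical helper, and 192.168.1.10 in the usage note is a placeholder node IP, not one taken from this cluster.

```shell
#!/bin/sh
# Sketch: print the NodePort URL of a Service in the monitoring namespace.
# Arguments: <node-ip> <service-name> <service-port>
service_url() {
  node_ip=$1; svc=$2; port=$3
  # Look up the nodePort that is mapped to the given Service port.
  node_port=$(kubectl -n monitoring get svc "$svc" \
    -o jsonpath="{.spec.ports[?(@.port==$port)].nodePort}") || return 1
  printf 'http://%s:%s\n' "$node_ip" "$node_port"
}
```

For example, `service_url 192.168.1.10 grafana 3000`, `service_url 192.168.1.10 prometheus-k8s 9090`, and `service_url 192.168.1.10 alertmanager-main 9093` cover the three UIs. With the Service listing shown above, the Grafana URL would end in :37010.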

Grafana

[Screenshot: K8s Grafana]

Prometheus

[Screenshot: K8s Prometheus]

Alertmanager

[Screenshot: K8s Alertmanager]


How to uninstall Prometheus

kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup

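With this in place, the k8s resources are being monitored. One issue remains: when I installed kube-prometheus, many of the images could not be downloaded. In the next post I will explain how to download images hosted on k8s.gcr.io.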