Prometheus Metrics Monitoring: Installation and Deployment (Step-by-Step Guide)
<h2>Environment</h2>
<p>Bastion host: 10.1.90.9
Public IP: 180.184.138.201
OS: CentOS 7.9
Time must be synchronized across all hosts (important): ntpdate cn.ntp.org.cn</p>
<h1>I. Allow the Public Egress IPs</h1>
<pre><code class="language-bash"># Public egress IP of the bastion host
iptables -A RH-Firewall-1-INPUT -s 180.184.138.201/32 -p tcp -m state --state NEW -m tcp --dport 9090 -j ACCEPT
# Public egress IP of the company network
iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 9090 -j ACCEPT</code></pre>
<h1>II. Install Prometheus</h1>
<h2>1. Disable the firewall and SELinux</h2>
<pre><code class="language-bash">systemctl stop firewalld &amp;&amp; systemctl disable firewalld
sed -i &#039;s/=enforcing/=disabled/g&#039; /etc/selinux/config &amp;&amp; setenforce 0
</code></pre>
<h2>2. Download the Prometheus tarball</h2>
<p>Official download page: <a href="https://prometheus.io/download/#prometheus">https://prometheus.io/download/#prometheus</a></p>
<pre><code class="language-bash">[root@Prometheus opt]$ ll
total 85744
-rw-r--r-- 1 root root 87797143 Dec 12 17:26 prometheus-2.40.6.linux-amd64.tar.gz
[root@Prometheus opt]$ tar -zxvf prometheus-2.40.6.linux-amd64.tar.gz
[root@Prometheus opt]$ mv /opt/prometheus-2.40.6.linux-amd64 /usr/local/prometheus
</code></pre>
<h2>3. Configure prometheus.yml</h2>
<pre><code class="language-bash">[root@Prometheus opt]$ cd /usr/local/prometheus/
[root@Prometheus prometheus]$ vim /usr/local/prometheus/prometheus.yml
....
scrape_configs:
  - job_name: &quot;prometheus&quot;
    static_configs:
      - targets: [&quot;180.184.138.201:9090&quot;]</code></pre>
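<p>For orientation, the part elided by the dots above normally includes a global section. The fragment below is a minimal sketch of a complete file; the 15s intervals are the values found in the stock prometheus.yml and are assumptions here, not settings taken from this deployment.</p>
<pre><code class="language-yaml"># Minimal prometheus.yml sketch (interval values are assumptions)
global:
  scrape_interval: 15s      # how often targets are scraped
  evaluation_interval: 15s  # how often alerting and recording rules are evaluated
scrape_configs:
  - job_name: &quot;prometheus&quot;
    static_configs:
      - targets: [&quot;180.184.138.201:9090&quot;]</code></pre>
<p>YAML indentation is significant: each job is a list item under scrape_configs, and static_configs is nested inside its job.</p>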
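<h2>4. Create the systemd unit file and enable start on boot</h2>
<pre><code class="language-bash">[root@Prometheus prometheus]$ vim /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io
After=network.target
[Service]
Type=simple
# Query tuning flags used in ExecStart (comments cannot follow a trailing backslash, so they are listed here):
#   --query.max-concurrency  maximum concurrent queries; default 20, raised here to 100 (rule of thumb: CPU cores x 2-3)
#   --query.timeout          query timeout; default 2m, raised here to 5m
#   --query.max-samples      maximum samples one query may load into memory; default 50000000, raised here to 100000000 to allow larger queries
ExecStart=/usr/local/prometheus/prometheus \
  --config.file=/usr/local/prometheus/prometheus.yml \
  --storage.tsdb.path=/data/prometheus/tsdb_data/ \
  --storage.tsdb.retention.time=60d \
  --query.max-concurrency=100 \
  --query.timeout=5m \
  --query.max-samples=100000000 \
  --web.enable-lifecycle \
  --web.enable-admin-api \
  --web.enable-remote-write-receiver \
  --log.level=error
LimitNOFILE=10240
# StandardOutput=file: needs systemd 236 or later; on older versions (CentOS 7 ships 219) read the logs with journalctl -u prometheus
StandardOutput=file:/var/log/prometheus.log
StandardError=file:/var/log/prometheus_error.log
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
[Install]
WantedBy=multi-user.target
[root@Prometheus prometheus]$ mkdir -p /data/prometheus/tsdb_data
[root@Prometheus prometheus]$ systemctl daemon-reload
[root@Prometheus prometheus]$ systemctl start prometheus
[root@Prometheus prometheus]$ systemctl enable prometheus.service</code></pre>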
<h2>5. Check the listening port</h2>
<pre><code class="language-bash">netstat -luntp | grep 9090</code></pre>
<h2>6. Forward samples with remote_write</h2>
<pre><code>remote_write:
  - url: &quot;http://xxx.xxx.xxx.xxx:9090/api/v1/write&quot;</code></pre>
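<p>If the receiving side cannot keep up, or only part of the data is needed, remote_write can be tuned. A hedged sketch: the queue values and the go_.* filter below are illustrative assumptions; only the URL placeholder comes from the configuration above.</p>
<pre><code class="language-yaml">remote_write:
  - url: &quot;http://xxx.xxx.xxx.xxx:9090/api/v1/write&quot;
    # Assumed example values; tune them against the capacity of the receiver
    queue_config:
      capacity: 10000
      max_shards: 50
      max_samples_per_send: 2000
    # Assumed example filter: do not forward Go runtime metrics
    write_relabel_configs:
      - source_labels: [__name__]
        regex: &quot;go_.*&quot;
        action: drop</code></pre>
<p>When the receiver is another Prometheus, it has to be started with --web.enable-remote-write-receiver, otherwise it rejects writes to /api/v1/write.</p>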
<h1>III. Install node_exporter</h1>
<pre><code class="language-bash">[root@Prometheus opt]$ ll
total 95688
-rw-r--r-- 1 root root 10181045 Dec 12 17:26 node_exporter-1.5.0.linux-amd64.tar.gz
-rw-r--r-- 1 root root 87797143 Dec 12 17:26 prometheus-2.40.6.linux-amd64.tar.gz
[root@Prometheus opt]$ tar -zxvf node_exporter-1.5.0.linux-amd64.tar.gz
[root@Prometheus opt]$ mv node_exporter-1.5.0.linux-amd64 /usr/local/bin/</code></pre>
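<h2>1. Start node_exporter</h2>
<pre><code class="language-bash">[root@Prometheus opt]$ cd /usr/local/bin/node_exporter-1.5.0.linux-amd64/
# The trailing &amp; runs node_exporter in the background (without it the process stays in the foreground); nohup keeps it running after the shell exits
[root@Prometheus node_exporter-1.5.0.linux-amd64]$ nohup ./node_exporter &amp;
# The nohup ./node_exporter &amp; line can be added to rc.local so that node_exporter starts again after a reboot
# Alternatively, run it as a systemd service (adjust User and the paths in ExecStart to where node_exporter is actually installed)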
vim /etc/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=shiyue
ExecStart=/home/shiyue/node_exporter/node_exporter --collector.processes --collector.filesystem.ignored-mount-points=^/(sys|proc|dev|host|etc)($$|/) --collector.textfile.directory=/home/shiyue/node_exporter/metrics
Restart=on-failure
[Install]
WantedBy=multi-user.target
</code></pre>
<h2>Open the port</h2>
<pre><code class="language-bash">iptables -A RH-Firewall-1-INPUT -s 180.184.138.201/32 -p tcp -m state --state NEW -m tcp --dport 9100 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 9100 -j ACCEPT
</code></pre>
<h2>2. Integrate with Prometheus</h2>
<pre><code class="language-bash">vim /usr/local/prometheus/prometheus.yml
...
  - job_name: &quot;node_exporter&quot;
    static_configs:
      - targets: [&quot;180.184.138.201:9100&quot;]</code></pre>
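<p>When hosts are added or removed often, file-based service discovery avoids editing prometheus.yml and restarting for every change. A minimal sketch, assuming a hypothetical directory /usr/local/prometheus/targets/ that is not part of the setup above:</p>
<pre><code class="language-yaml"># In prometheus.yml, in place of static_configs:
  - job_name: &quot;node_exporter&quot;
    file_sd_configs:
      - files:
          - &quot;/usr/local/prometheus/targets/node_*.yml&quot;   # hypothetical path
        refresh_interval: 1m

# In /usr/local/prometheus/targets/node_hosts.yml (hypothetical file):
- targets: [&quot;180.184.138.201:9100&quot;]
  labels:
    env: &quot;prod&quot;   # assumed example label</code></pre>
<p>Prometheus re-reads the target files on change, so new hosts appear without restarting the server.</p>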
<h2>3. Restart the service</h2>
<pre><code class="language-bash">systemctl restart prometheus.service</code></pre>
<h2>4. Verify the data Prometheus scrapes from port 9100</h2>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=99fa622c4d6899baf1a46c675c80f414&amp;file=file.png" alt="" /></p>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=870e2c105bbc280fd4f5efbc7b8c7c3d&amp;file=file.png" alt="" /></p>
<h1>IV. Install Grafana</h1>
<p>Official download page: <a href="https://grafana.com/grafana/download?pg=get&plcmt=selfmanaged-box1-cta1">https://grafana.com/grafana/download?pg=get&plcmt=selfmanaged-box1-cta1</a></p>
<h2>1. Download and install Grafana</h2>
<pre><code class="language-bash">wget https://dl.grafana.com/enterprise/release/grafana-enterprise-9.3.1-1.x86_64.rpm
yum install grafana-enterprise-9.3.1-1.x86_64.rpm</code></pre>
<h2>2. Start the service</h2>
<pre><code class="language-bash">systemctl start grafana-server.service
systemctl enable grafana-server.service</code></pre>
<h2>3. Open the port</h2>
<pre><code class="language-bash">iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 3000 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -s 180.184.138.201/32 -p tcp -m state --state NEW -m tcp --dport 3000 -j ACCEPT</code></pre>
<h2>4. Test</h2>
<p><a href="http://180.184.138.201:3000">http://180.184.138.201:3000</a>
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=9ba33f58e129b06154c8ebb6c9fae089&amp;file=file.png" alt="" /></p>
<h2>5. Add a data source</h2>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=3015dba5bcb722f04195f1efdf0522b0&amp;file=file.png" alt="" />
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=dabeedcdd7117b458cd97dfb36a5e409&amp;file=file.png" alt="" />
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=97f11985d577b15a5b61a96e0d94c9fb&amp;file=file.png" alt="" />
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=13e4911a45be04ea0a34089620946a3b&amp;file=file.png" alt="" /></p>
<h2>6. Add a dashboard</h2>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=90a783b6a6bcbf6f5481b3206fa599fe&amp;file=file.png" alt="" />
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=a2fd54c13f7bc0408c4714155423a20f&amp;file=file.png" alt="" /></p>
<h1>V. Install MySQL 5.7</h1>
<h3>1. Download the tarball</h3>
<pre><code class="language-bash">wget https://cdn.mysql.com//Downloads/MySQL-5.7/mysql-5.7.40-linux-glibc2.12-x86_64.tar.gz</code></pre>
<h3>2. Extract into /usr/local/</h3>
<pre><code class="language-bash">tar -xvf mysql-5.7.40-linux-glibc2.12-x86_64.tar.gz -C /usr/local/
</code></pre>
<h3>3. Rename the directory</h3>
<pre><code class="language-bash">cd /usr/local &amp;&amp; mv mysql-5.7.40-linux-glibc2.12-x86_64 mysql</code></pre>
<h3>4. Create the group and user</h3>
<pre><code class="language-bash">groupadd mysql
useradd -r -g mysql mysql
</code></pre>
<h3>5. Create the MySQL data directory</h3>
<pre><code class="language-bash"># Create the directory
mkdir -p /data/mysqldata
# Give ownership to the mysql user
chown mysql:mysql -R /data/mysqldata
</code></pre>
<h3>6. Edit my.cnf</h3>
<pre><code class="language-bash">vim /etc/my.cnf
[mysqld]
user=mysql
datadir=/data/mysqldata
basedir=/usr/local/mysql
port=3306
max_connections=200
max_connect_errors=10
character-set-server=utf8
default-storage-engine=INNODB
default_authentication_plugin=mysql_native_password
lower_case_table_names=0
group_concat_max_len=102400
[mysql]
default-character-set=utf8
[client]
port=3306
default-character-set=utf8
</code></pre>
<h3>7. Initialize the database</h3>
<pre><code class="language-bash">cd /usr/local/mysql/bin
./mysqld --defaults-file=/etc/my.cnf --basedir=/usr/local/mysql/ --datadir=/data/mysqldata/ --user=mysql --initialize
</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=1d373c24b16519e8c23553cca0a7b263&amp;file=file.png" alt="" /></p>
<h3>8. Register the mysqld service with the system</h3>
<pre><code class="language-bash"> cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysql
</code></pre>
<h3>9. Start MySQL</h3>
<pre><code class="language-bash">service mysql start</code></pre>
<h3>10. Make the mysql client available on the PATH</h3>
<pre><code class="language-bash">ln -s /usr/local/mysql/bin/mysql /usr/bin
</code></pre>
<h3>11. Log in to MySQL</h3>
<pre><code class="language-bash">mysql -uroot -p
# Enter the temporary password generated during initialization</code></pre>
<h3>12. Change the root password</h3>
<pre><code class="language-bash">mysql&gt; ALTER USER &#039;root&#039;@&#039;localhost&#039; IDENTIFIED WITH mysql_native_password BY &#039;sy1212&#039;;
</code></pre>
<h3>13. Allow root to connect from any host</h3>
<pre><code class="language-bash">use mysql;
update user set host=&#039;%&#039; where user = &#039;root&#039;;
flush privileges;
</code></pre>
<h3>14. MySQL service commands</h3>
<pre><code class="language-bash"># Start MySQL
service mysql start
# Stop MySQL
service mysql stop
# Restart MySQL
service mysql restart
</code></pre>
<h1>VI. Deploy mysqld_exporter</h1>
<p><a href="https://github.com/prometheus/mysqld_exporter.git">https://github.com/prometheus/mysqld_exporter.git</a></p>
<h3>1. Install</h3>
<pre><code class="language-bash">[root@Prometheus opt]$ tar zxvf mysqld_exporter-0.14.0.linux-amd64.tar.gz -C /usr/local
[root@Prometheus opt]$ cd /usr/local/
[root@Prometheus local]$ mv mysqld_exporter-0.14.0.linux-amd64/ mysqld_exporter</code></pre>
<h3>2. Create a monitoring user in MySQL and grant privileges</h3>
<pre><code class="language-bash">create user &#039;exporter&#039;@&#039;localhost&#039; IDENTIFIED BY &#039;sy1212&#039;;
GRANT SELECT, PROCESS, SUPER, REPLICATION CLIENT, RELOAD ON *.* TO &#039;exporter&#039;@&#039;localhost&#039;;</code></pre>
<h3>3. Create a configuration file for mysqld_exporter and start it</h3>
<pre><code class="language-bash">[root@Prometheus local]$ vim /usr/local/mysqld_exporter/mysqld_exporter.cnf
[client]
user=exporter
password=sy1212
[root@Prometheus local]$ nohup /usr/local/mysqld_exporter/mysqld_exporter --config.my-cnf=/usr/local/mysqld_exporter/mysqld_exporter.cnf &amp;</code></pre>
<h3>4. Check the listening port</h3>
<pre><code class="language-bash">[root@Prometheus local]$ netstat -luntp | grep 9104</code></pre>
<h3>5. Open the port</h3>
<pre><code class="language-bash">iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 9104 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -s 180.184.138.201/32 -p tcp -m state --state NEW -m tcp --dport 9104 -j ACCEPT</code></pre>
<h3>6. Test</h3>
<p><a href="http://180.184.138.201:9104/metrics">http://180.184.138.201:9104/metrics</a></p>
<h3>7. Integrate with Prometheus</h3>
<pre><code class="language-bash">vim /usr/local/prometheus/prometheus.yml
...
  - job_name: &quot;mysqld_exporter&quot;
    static_configs:
      - targets: [&quot;180.184.138.201:9104&quot;]</code></pre>
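<p>Dashboards usually group panels by the instance label, which by default is the full host:port address. As an optional sketch (not part of the original setup), a relabel rule can strip the port so the same host carries one instance value across exporters:</p>
<pre><code class="language-yaml">  - job_name: &quot;mysqld_exporter&quot;
    static_configs:
      - targets: [&quot;180.184.138.201:9104&quot;]
    relabel_configs:
      # Copy the host part of __address__ (everything before the colon) into instance
      - source_labels: [__address__]
        regex: &quot;([^:]+):[0-9]+&quot;
        target_label: instance
        replacement: &quot;${1}&quot;</code></pre>
<p>The scrape address itself is unchanged; only the label stored with the samples differs.</p>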
<h3>8. Restart the service</h3>
<pre><code class="language-bash">systemctl restart prometheus</code></pre>
<p>Grafana dashboard template ID: 7362</p>
<h1>VII. Install Nginx 1.21.4</h1>
<h3>1. Install nginx from source</h3>
<pre><code class="language-bash">[root@master ~]# wget http://nginx.org/download/nginx-1.21.4.tar.gz
[root@master ~]# cd &amp;&amp; tar xf nginx-1.21.4.tar.gz &amp;&amp; cd nginx-1.21.4
[root@master nginx-1.21.4]# ./configure &amp;&amp; make -j2 &amp;&amp; make install
[root@master nginx-1.21.4]# /usr/local/nginx/sbin/nginx -v
[root@master nginx-1.21.4]# ps -ef | grep nginx &amp;&amp; netstat -luntp | grep nginx</code></pre>
<h3>2. Build nginx with the nginx-module-vts module</h3>
<pre><code class="language-bash">[root@master ~]# unzip nginx-module-vts-master.zip
[root@master ~]# cd nginx-1.21.4
[root@master nginx-1.21.4]# ./configure --add-module=/root/nginx-module-vts-master
[root@master nginx-1.21.4]# make -j2
[root@master nginx-1.21.4]# /root/nginx-1.21.4/objs/nginx -v</code></pre>
<h3>3. Edit the configuration</h3>
<pre><code class="language-bash">vim /usr/local/nginx/conf/nginx.conf
...
# Add inside the http block
vhost_traffic_status_zone;
vhost_traffic_status_filter_by_host on;
# Inside the server block that listens on port 80, add a status endpoint
location /status {
    vhost_traffic_status_display;
    vhost_traffic_status_display_format html;
}
...</code></pre>
<h3>4. Test the configuration with the newly built binary</h3>
<pre><code class="language-bash">/root/nginx-1.21.4/objs/nginx -t
nginx: the configuration file /usr/local/nginx/conf/nginx.conf syntax is ok
nginx: configuration file /usr/local/nginx/conf/nginx.conf test is successful</code></pre>
<h3>5. Back up the original binary</h3>
<pre><code class="language-bash">cp /usr/local/nginx/sbin/nginx /usr/local/nginx/sbin/nginx.bak</code></pre>
<h3>6. Replace the nginx binary</h3>
<pre><code class="language-bash">cp -f ~/nginx-1.21.4/objs/nginx /usr/local/nginx/sbin/nginx
cp: overwrite ‘/usr/local/nginx/sbin/nginx’? y</code></pre>
<h3>7. Verify the configuration and restart nginx</h3>
<pre><code class="language-bash">/usr/local/nginx/sbin/nginx -t
nginx: the configuration file /usr/local/nginx/conf/nginx.conf syntax is ok
nginx: configuration file /usr/local/nginx/conf/nginx.conf test is successful
# Restart nginx (killall is provided by the psmisc package)
yum install -y psmisc
killall nginx &amp;&amp; /usr/local/nginx/sbin/nginx</code></pre>
<h3>8. Check the build result</h3>
<pre><code class="language-bash">[root@master ~]# /usr/local/nginx/sbin/nginx -V
nginx version: nginx/1.21.4
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)
configure arguments: --add-module=/root/nginx-module-vts-master</code></pre>
<p>Test: 180.184.138.201/status</p>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=c8ac051d4a16eec6b0f09246edf261d5&amp;file=file.png" alt="" /></p>
<h3>9. Integrate with Prometheus</h3>
<pre><code class="language-bash">...
  - job_name: &quot;Nginx 1.21.4&quot;
    metrics_path: /status/format/prometheus
    static_configs:
      - targets: [&quot;180.184.138.201:80&quot;]</code></pre>
<h3>10. Restart the service</h3>
<pre><code class="language-bash">systemctl restart prometheus</code></pre>
<h1>VIII. Install Redis 7.0.4</h1>
<p>Tarball download page: <a href="http://download.redis.io/releases/">http://download.redis.io/releases/</a></p>
<h3>1. Upload the installation package</h3>
<pre><code class="language-bash"> [root@Prometheus opt]# ll
-rw-r--r-- 1 root root 2963216 Dec 15 09:18 redis-7.0.4.tar.gz</code></pre>
<h3>2. Extract</h3>
<pre><code class="language-bash">[root@Prometheus opt]# tar -zxvf redis-7.0.4.tar.gz -C /usr/local</code></pre>
<h3>3. Enter the installation directory</h3>
<pre><code class="language-bash">[root@Prometheus opt]# cd /usr/local/redis-7.0.4</code></pre>
<h3>4. Compile and install</h3>
<pre><code class="language-bash">[root@Prometheus redis-7.0.4]# make
[root@Prometheus redis-7.0.4]# make install
# Note: if the build fails with an error that jemalloc/jemalloc.h does not exist, make sure gcc is installed, then run make distclean and build again.</code></pre>
<h3>5. Start Redis in the foreground</h3>
<pre><code>[root@Prometheus ~]# redis-server</code></pre>
<h3>6. Handling startup warnings</h3>
<pre><code class="language-bash"># Warning 1: maximum number of open files
5068:M 13 Sep 2022 22:00:28.245 * Increased maximum number of open files to 10032 (it was originally set to 1024).
# Fix
[root@serverc ~]# vim /etc/security/limits.conf
root soft nofile 10032
root hard nofile 10032
# Warning 2: TCP listen backlog
5068:M 13 Sep 2022 22:00:28.246 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
# Fix
[root@serverc ~]# vim /etc/sysctl.conf
net.core.somaxconn=1024
# Warning 3: memory overcommit
5068:M 13 Sep 2022 22:00:28.246 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add &#039;vm.overcommit_memory = 1&#039; to /etc/sysctl.conf and then reboot or run the command &#039;sysctl vm.overcommit_memory=1&#039; for this to take effect.
5068:M 13 Sep 2022 22:00:28.247 * Ready to accept connections
# Fix
[root@serverc ~]# vim /etc/sysctl.conf
vm.overcommit_memory=1
# Apply the sysctl.conf changes without rebooting
[root@serverc ~]# sysctl -p</code></pre>
<h3>7. Start Redis in the background</h3>
<pre><code class="language-bash"># The Redis source directory contains a redis.conf file; copy it to /etc/:
[root@Prometheus ~]# cp /usr/local/redis-7.0.4/redis.conf /etc/
# Then edit /etc/redis.conf and set daemonize to yes.
[root@Prometheus ~]# vim /etc/redis.conf
daemonize yes
# Save and exit, then start the service with the following command.
[root@Prometheus ~]# redis-server /etc/redis.conf</code></pre>
<h3>8. Verify the service</h3>
<h4>1. Connect to the Redis service with redis-cli</h4>
<pre><code class="language-bash">[root@Prometheus ~]# redis-cli -p 6379
# Run the following command
127.0.0.1:6379&gt; ping
PONG
# The connection works</code></pre>
<h3>9. Stop the service</h3>
<pre><code class="language-bash">[root@Prometheus ~]# redis-cli shutdown
# Alternatively, shut down from inside the redis-cli prompt
127.0.0.1:6379&gt; shutdown
ps -ef | grep redis</code></pre>
<h3>10. Start on boot</h3>
<h4>Redis does not install a boot-time service by itself, so one has to be written. Create a redis.service file in /etc/systemd/system.</h4>
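<pre><code class="language-bash">cd /etc/systemd/system
vim redis.service
# Add the following content
[Unit]
# Service description
Description=Redis Server Manager
# Start only after the network is up
After=network.target
[Service]
# Redis forks into the background (matches daemonize yes in redis.conf)
Type=forking
# Command that starts the service
ExecStart=/usr/local/bin/redis-server /etc/redis.conf
# Give the service its own private temporary directory
PrivateTmp=true
[Install]
# Target the service is installed under; multi-user.target corresponds to runlevel 3
WantedBy=multi-user.target</code></pre>
<h3>11. Start and enable the service</h3>
<pre><code class="language-bash">systemctl start redis.service # Start the Redis service
systemctl enable redis.service # Enable start on boot</code></pre>
<h3>12. Set a password (optional)</h3>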
<pre><code class="language-bash"># Assumes requirepass sy1212 has been set in /etc/redis.conf and Redis has been restarted
[root@Prometheus local]$ redis-cli -p 6379
127.0.0.1:6379&gt; ping
(error) NOAUTH Authentication required.
127.0.0.1:6379&gt; auth sy1212
OK
127.0.0.1:6379&gt; ping
PONG
127.0.0.1:6379&gt; exit</code></pre>
<h1>IX. Deploy redis_exporter</h1>
<h3>1. Download and extract the tarball</h3>
<pre><code class="language-bash">[root@Prometheus opt]$ wget https://github.com/oliver006/redis_exporter/releases/download/v1.6.1/redis_exporter-v1.6.1.linux-amd64.tar.gz
[root@Prometheus opt]$ tar -xf redis_exporter-v1.6.1.linux-amd64.tar.gz -C /usr/local/
[root@Prometheus opt]$ cd /usr/local/
[root@Prometheus local]$ mv redis_exporter-v1.6.1.linux-amd64 redis_exporter</code></pre>
<h3>2. Create the redis_exporter systemd service</h3>
<pre><code class="language-bash">[root@Prometheus local]$ cat &gt; /usr/lib/systemd/system/redis_exporter.service &lt;&lt;EOF
&gt; [Unit]
&gt; Description=redis_exporter
&gt; Documentation=https://github.com/oliver006/redis_exporter
&gt; After=network.target
&gt; [Service]
&gt; Type=simple
&gt; User=prometheus
&gt; ExecStart=/usr/local/redis_exporter/redis_exporter -redis.addr 180.184.138.201:6379 -redis.password sy1212
&gt; Restart=on-failure
&gt; [Install]
&gt; WantedBy=multi-user.target
&gt; EOF
</code></pre>
<h3>3. Start redis_exporter</h3>
<pre><code class="language-bash">$ systemctl daemon-reload
$ systemctl start redis_exporter
$ systemctl status redis_exporter
$ systemctl enable redis_exporter
$ netstat -luntp | grep 9121
</code></pre>
<h3>4. Check the metrics</h3>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=ce4ba8598db3af890a26dbbce2c14b3c&amp;file=file.png" alt="" /></p>
<h3>5. Integrate with Prometheus</h3>
<pre><code class="language-bash">...
  - job_name: &quot;redis_exporter&quot;
    scrape_interval: 30s
    scrape_timeout: 30s
    static_configs:
      - targets: [&quot;180.184.138.201:9121&quot;]
        labels:
          group: &#039;increased_timeout&#039;</code></pre>
<h3>6. Import into Grafana</h3>
<p>Dashboard template ID: 6908</p>
<h1>X. Install PHP 8.2.0 (incomplete)</h1>
<p>Official download page: <a href="https://www.php.net/downloads.php">https://www.php.net/downloads.php</a></p>
<h3>1. Extract</h3>
<pre><code class="language-bash">tar -xf php-8.2.0.tar.bz2
</code></pre>
<h3>2. Configure</h3>
<pre><code class="language-bash">cd php-8.2.0
./configure --prefix=/usr/local/php8 --with-config-file-path=/usr/local/php8/etc \
--with-curl --with-freetype --enable-gd --with-jpeg --with-gettext --with-kerberos --with-libdir=lib64 --with-libxml \
--with-mysqli --with-openssl --with-pdo-mysql --with-pdo-sqlite --with-pear --enable-sockets --with-mhash --with-ldap-sasl \
--with-xsl --with-zlib --with-zip --with-bz2 --with-iconv --enable-fpm --enable-pdo --enable-bcmath --enable-mbregex \
--enable-mbstring --enable-opcache --enable-pcntl --enable-shmop --enable-soap --enable-sockets --enable-sysvsem \
--enable-xml --enable-sysvsem --enable-cli --enable-opcache --enable-intl --enable-calendar --enable-static --enable-mysqlnd</code></pre>
<p>The following output indicates that the configuration succeeded:
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=13bb3fc84510f640688c6e7aceddd957&amp;file=file.png" alt="" /></p>
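<p>If the error below appears during configuration, run export PKG_CONFIG_PATH="/usr/local/lib/pkgconfig/" so that the path is part of the environment. The error occurs because the libzip files could not be found or verified.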
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=56b032008b4fa5da00b0ee8466152e61&amp;file=file.png" alt="" /></p>
<h3>3. Compile</h3>
<pre><code class="language-bash">[root@Prometheus php-8.2.0]$ make
# Output when make finishes:
Build complete.
Don&#039;t forget to run &#039;make test&#039;.</code></pre>
<h3>4. Install</h3>
<pre><code class="language-bash">[root@Prometheus php-8.2.0]$ make install</code></pre>
<h1>XI. Install Erlang</h1>
<p>Source package download: <a href="http://erlang.org/download/otp_src_20.0.tar.gz">http://erlang.org/download/otp_src_20.0.tar.gz</a></p>
<h3>1. Extract</h3>
<pre><code class="language-bash">[root@Prometheus opt]$ tar -xf otp_src_20.0.tar.gz
[root@Prometheus opt]$ chmod -R 777 otp_src_20.0</code></pre>
<h3>2. Configure</h3>
<pre><code class="language-bash">mkdir /usr/local/erlang
cd /opt/otp_src_20.0/
./configure --prefix=/usr/local/erlang --with-ssl --enable-threads --enable-smp-support --enable-kernel-poll --enable-hipe --without-javac</code></pre>
<h3>3. Compile and install</h3>
<pre><code class="language-bash">[root@Prometheus otp_src_20.0]$ make &amp;&amp; make install</code></pre>
<h3>4. Configure environment variables</h3>
<pre><code class="language-bash">[root@Prometheus otp_src_20.0]$ vim ~/.bash_profile
PATH=$PATH:/usr/local/erlang/bin
[root@Prometheus otp_src_20.0]$ source ~/.bash_profile</code></pre>
<h3>5. Verify the Erlang installation</h3>
<pre><code class="language-bash">[root@Prometheus otp_src_20.0]$ erl
Erlang/OTP 20 [erts-9.0] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V9.0 (abort with ^G)
1&gt; </code></pre>
<h1>XII. Install RabbitMQ</h1>
<p>Download: <a href="https://github.com/rabbitmq/rabbitmq-server/releases/download/v3.7.8/rabbitmq-server-generic-unix-3.7.8.tar.xz">https://github.com/rabbitmq/rabbitmq-server/releases/download/v3.7.8/rabbitmq-server-generic-unix-3.7.8.tar.xz</a> </p>
<h3>1. Extract</h3>
<pre><code class="language-bash">[root@Prometheus local]$ mkdir -p /usr/local/rabbitmq
[root@Prometheus local]$ tar -xf /opt/rabbitmq-server-generic-unix-3.7.8.tar.xz -C /usr/local/rabbitmq</code></pre>
<h3>2. Configure environment variables</h3>
<pre><code class="language-bash">vim /etc/profile
export PATH=$PATH:/usr/local/rabbitmq/rabbitmq_server-3.7.8/sbin
# Reload the profile so the change takes effect
source /etc/profile</code></pre>
<h3>3. Start RabbitMQ in the background</h3>
<pre><code class="language-bash">[root@Prometheus ~]$ cd /usr/local/rabbitmq/rabbitmq_server-3.7.8
# -detached starts the server in the background
[root@Prometheus rabbitmq_server-3.7.8]$ ./sbin/rabbitmq-server -detached</code></pre>
<h3>4. Enable the web management plugin</h3>
<pre><code class="language-bash">[root@Prometheus rabbitmq_server-3.7.8]$ ./sbin/rabbitmq-plugins enable rabbitmq_management</code></pre>
<h3>5. Open the RabbitMQ web port</h3>
<pre><code class="language-bash">iptables -A RH-Firewall-1-INPUT -s 180.184.138.201/32 -p tcp -m state --state NEW -m tcp --dport 15672 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 15672 -j ACCEPT</code></pre>
<h3>6. Test</h3>
<p><strong>180.184.138.201:15672</strong>
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=e055e4709f8757d686c25521169534f6&amp;file=file.png" alt="" />
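<strong>The default RabbitMQ username and password are both guest; an administrator account has to be created manually.</strong></p>
<h3>7. Create a RabbitMQ login account and password</h3>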
<pre><code class="language-bash">[root@Prometheus rabbitmq_server-3.7.8]$ ./sbin/rabbitmqctl add_user admin sy1212
Adding user &quot;admin&quot; ...
# Tag the admin user as administrator (required, otherwise the login reports missing administrator privileges)
[root@Prometheus rabbitmq_server-3.7.8]$ ./sbin/rabbitmqctl set_user_tags admin administrator
Setting tags for user &quot;admin&quot; to [administrator] ...</code></pre>
<h3>8. Page shown after a successful login</h3>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=96a595403d1752aeebbcbb0febe5fa13&amp;file=file.png" alt="" /></p>
<h3>9. Enable start on boot</h3>
<pre><code class="language-bash">[root@Prometheus rabbitmq_server-3.7.8]$ vim /usr/lib/systemd/system/rabbitmq-server.service
[Unit]
Description=RabbitMQ broker
After=syslog.target network.target
[Service]
#Type=notify
User=root
Group=root
WorkingDirectory=/usr/local/rabbitmq/rabbitmq_server-3.7.8
ExecStart=/usr/local/rabbitmq/rabbitmq_server-3.7.8/sbin/rabbitmq-server
ExecStop=/usr/local/rabbitmq/rabbitmq_server-3.7.8/sbin/rabbitmqctl stop
[Install]
WantedBy=multi-user.target
[root@Prometheus rabbitmq_server-3.7.8]$ systemctl start rabbitmq-server.service
[root@Prometheus rabbitmq_server-3.7.8]$ systemctl enable rabbitmq-server.service</code></pre>
<h1>XIII. Install rabbitmq_exporter</h1>
<p>Tarball download: <a href="https://github.com/kbudde/rabbitmq_exporter/releases/download/v1.0.0-RC19/rabbitmq_exporter_1.0.0-RC19_linux_amd64.tar.gz">https://github.com/kbudde/rabbitmq_exporter/releases/download/v1.0.0-RC19/rabbitmq_exporter_1.0.0-RC19_linux_amd64.tar.gz</a></p>
<h3>1. Extract</h3>
<pre><code class="language-bash">[root@Prometheus opt]$ tar xf rabbitmq_exporter_1.0.0-RC19_linux_amd64.tar.gz
[root@Prometheus opt]$ mv rabbitmq_exporter /usr/local/</code></pre>
<h3>2. Run rabbitmq_exporter as a systemd service</h3>
<pre><code class="language-bash">[root@Prometheus opt]$ vim /usr/lib/systemd/system/rabbitmq_exporter.service
[Unit]
Description=rabbitmq_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
# Environment= takes space-separated VAR=value pairs (DefaultEnvironment= is a manager-level setting, not a [Service] option)
Environment=RABBIT_USER=admin RABBIT_PASSWORD=sy1212 OUTPUT_FORMAT=JSON PUBLISH_PORT=9419 RABBIT_URL=http://180.184.138.201:15672
ExecStart=/usr/local/rabbitmq_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
[root@Prometheus opt]$ systemctl daemon-reload
[root@Prometheus opt]$ systemctl start rabbitmq_exporter
[root@Prometheus opt]$ systemctl enable rabbitmq_exporter</code></pre>
<h3>3. Check the listening port</h3>
<pre><code class="language-bash">[root@Prometheus opt]$ netstat -luntp | grep 9419
tcp6 0 0 :::9419 :::* LISTEN 27253/rabbitmq_expo</code></pre>
<h3>4. Open the port</h3>
<pre><code class="language-bash">iptables -A RH-Firewall-1-INPUT -s 180.184.138.201/32 -p tcp -m state --state NEW -m tcp --dport 9419 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 9419 -j ACCEPT</code></pre>
<h3>5. Integrate with Prometheus</h3>
<pre><code class="language-bash">[root@Prometheus opt]$ vim /usr/local/prometheus/prometheus.yml
...
  - job_name: &quot;rabbitmq_exporter&quot;
    static_configs:
      - targets: [&quot;180.184.138.201:9419&quot;]
[root@Prometheus opt]$ systemctl restart prometheus.service</code></pre>
<h1>XIV. Install Alertmanager</h1>
<h3>1. Extract the binary package</h3>
<pre><code class="language-bash">[root@Prometheus opt]$ tar -xf alertmanager-0.24.0.linux-amd64.tar.gz -C /usr/local/
[root@Prometheus opt]$ cd /usr/local/
[root@Prometheus local]$ mv alertmanager-0.24.0.linux-amd64/ alertmanager</code></pre>
<h3>2. Manage the Alertmanager service with systemd</h3>
<pre><code class="language-bash">[root@Prometheus local]$ vim /usr/lib/systemd/system/alertmanager.service
[Unit]
Description=alertmanager
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/alertmanager/alertmanager --config.file /usr/local/alertmanager/alertmanager.yml --storage.path=/var/lib/alertmanager
Restart=on-failure
[Install]
WantedBy=multi-user.target
</code></pre>
<h3>3. Start the Alertmanager service</h3>
<pre><code class="language-bash">[root@Prometheus local]$ systemctl daemon-reload
[root@Prometheus local]$ systemctl start alertmanager.service
[root@Prometheus local]$ systemctl enable alertmanager.service
Created symlink from /etc/systemd/system/multi-user.target.wants/alertmanager.service to /usr/lib/systemd/system/alertmanager.service.</code></pre>
<h3>4. Open the port (Alertmanager listens on 9093 by default)</h3>
<pre><code class="language-bash">iptables -A RH-Firewall-1-INPUT -s 180.184.138.201/32 -p tcp -m state --state NEW -m tcp --dport 9093 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 9093 -j ACCEPT</code></pre>
<h1>XV. Alerting Through a WeCom (Enterprise WeChat) Bot</h1>
<h2>1. Install Docker</h2>
<h3>1.1 Install the required package</h3>
<pre><code class="language-bash">yum install -y yum-utils</code></pre>
<h3>1.2 Configure the package repository</h3>
<pre><code class="language-bash"># The Aliyun mirror is recommended
yum-config-manager \
--add-repo \
http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
</code></pre>
<h3>1.3 Update the yum package index</h3>
<pre><code class="language-bash">yum makecache fast</code></pre>
<h3>1.4 Install dependencies</h3>
<pre><code class="language-bash">yum install container-selinux fuse-overlayfs slirp4netns -y</code></pre>
<h3>1.5 Install Docker (docker-ce is the community edition, docker-ee the enterprise edition; ce is recommended)</h3>
<pre><code class="language-bash">yum install docker-ce docker-ce-cli containerd.io -y</code></pre>
<h3>1.6 Start and verify</h3>
<pre><code class="language-bash">systemctl start docker
docker version
systemctl enable docker</code></pre>
<h2>2. Configure alerting rules</h2>
<pre><code class="language-bash">vim /usr/local/prometheus/prometheus.yml
# Alertmanager configuration
# Point this at the Alertmanager address
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 180.184.138.201:9093 # 9093 is the Alertmanager port
# Rule files to load
rule_files:
  - &quot;/usr/local/prometheus/rules/*.yml&quot;</code></pre>
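<h3>2.1 Add rule files ending in .yml under /usr/local/prometheus/rules/; Prometheus evaluates these files to raise alerts</h3>
<h4>Rule template explained</h4>
<pre><code class="language-bash"># One rule file can contain several groups
groups:
  - name: example # group name
    # List of rules in this group
    rules:
      - alert: HighErrorRate # alert name; must be unique within the group
        expr: job:request_latency_seconds:mean5m{job=&quot;myjob&quot;} &gt; 0.5 # condition that triggers the alert
        for: 10m # how long the condition must hold before the alert fires
        # Extra labels attached to the alert
        labels:
          severity: page
        # Extra annotations attached to the alert
        annotations:
          summary: High request latency</code></pre>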
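<p>The expression in the template refers to job:request_latency_seconds:mean5m, which follows the level:metric:operations naming used for recording rules; it only exists if a recording rule produces it. A hedged sketch of such a rule, assuming a hypothetical summary or histogram named request_latency_seconds that is not part of this deployment:</p>
<pre><code class="language-yaml">groups:
  - name: example_recording   # hypothetical group name
    rules:
      # Precompute the mean request latency per job over the last 5 minutes
      - record: job:request_latency_seconds:mean5m
        expr: |
          sum by (job) (rate(request_latency_seconds_sum[5m]))
          /
          sum by (job) (rate(request_latency_seconds_count[5m]))</code></pre>
<p>Alerting on a precomputed series keeps the alert expression short and avoids recalculating the ratio at every evaluation.</p>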
<h3>2.1.1 node_alived.yml</h3>
<pre><code class="language-bash">vim /usr/local/prometheus/rules/node_alived.yml
groups:
  - name: &quot;Instance liveness alert rules&quot;
    rules:
      - alert: &quot;Instance liveness alert&quot;
        expr: up == 0
        for: 1m
        labels:
          user: prometheus
          severity: warning
        annotations:
          summary: &quot;Host is down!&quot;
          description: &quot;This instance has been down for more than 1 minute.&quot;</code></pre>
<h3>2.1.2 memory_over.yml</h3>
<pre><code class="language-bash">vim /usr/local/prometheus/rules/memory_over.yml
groups:
  - name: &quot;Memory alert rules&quot;
    rules:
      - alert: &quot;Memory usage alert&quot;
        expr: (1-(node_memory_MemAvailable_bytes / (node_memory_MemTotal_bytes))) * 100 &gt; 10
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: &quot;Server is running low on available memory&quot;
          description: &quot;Memory usage is above 10% (current value: {{ $value }}%)&quot;
</code></pre>
<h3>2.1.3 cpu_over.yml</h3>
<pre><code class="language-bash">vim /usr/local/prometheus/rules/cpu_over.yml
groups:
  - name: &quot;CPU alert rules&quot;
    rules:
      - alert: &quot;CPU usage alert&quot;
        expr: 100 - (avg by (instance)(irate(node_cpu_seconds_total{mode=&quot;idle&quot;}[1m]) )) * 100 &gt; 10
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: &quot;CPU usage is climbing&quot;
          description: &quot;CPU usage is above 10% (current value: {{ $value }}%)&quot;
</code></pre>
<h3>2.1.4 disk_over.yml</h3>
<pre><code class="language-bash">vim /usr/local/prometheus/rules/disk_over.yml
groups:
  - name: &quot;Disk usage alert rules&quot;
    rules:
      - alert: &quot;Disk usage alert&quot;
        expr: 100 - node_filesystem_free_bytes{fstype=~&quot;xfs|ext4&quot;} / node_filesystem_size_bytes{fstype=~&quot;xfs|ext4&quot;} * 100 &gt; 80
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: &quot;Disk partition usage is too high&quot;
          description: &quot;Partition usage is above 80% (current value: {{ $value }}%)&quot;
</code></pre>
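<p>The same pattern extends to the services monitored earlier. The sketch below assumes a hypothetical file /usr/local/prometheus/rules/service_down.yml and relies on the mysql_up and redis_up gauges exposed by mysqld_exporter and redis_exporter; the severity value and the wording are illustrative choices, not part of the original setup.</p>
<pre><code class="language-yaml">groups:
  - name: &quot;service_down_rules&quot;   # hypothetical group name
    rules:
      - alert: &quot;MySQLDown&quot;
        # mysql_up is 0 when the exporter cannot reach the MySQL server
        expr: mysql_up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: &quot;MySQL is unreachable&quot;
          description: &quot;The exporter at {{ $labels.instance }} has not reached MySQL for 1 minute.&quot;
      - alert: &quot;RedisDown&quot;
        # redis_up is 0 when the exporter cannot reach the Redis server
        expr: redis_up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: &quot;Redis is unreachable&quot;
          description: &quot;The exporter at {{ $labels.instance }} has not reached Redis for 1 minute.&quot;</code></pre>
<p>These rules complement up == 0, which only reports that the exporter itself stopped answering, not that the service behind it is down.</p>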
<h2>3. Restart Prometheus</h2>
<pre><code class="language-bash">systemctl restart prometheus</code></pre>
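<p>Log in to the Prometheus web UI and open the Alerts page to check the rules.</p>
<p><strong>Inactive</strong>: the threshold has not been reached
<strong>Pending</strong>: the threshold has been reached, but not yet for the full duration set with for
<strong>Firing</strong>: the threshold has been reached for the full duration set with for</p>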
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=c022a8fd41b18421a70c36597fb57c3c&amp;file=file.png" alt="" /></p>
<h2>4. Run the WeCom alerting plugin (webhook-adapter) with Docker and start a bot container named wechat</h2>
<pre><code class="language-bash"># Replace xxxx with the key of your own WeCom bot
docker run -d --name wechat \
--restart always -p 8080:80 \
guyongquan/webhook-adapter \
--adapter=/app/prometheusalert/wx.js=/wx=https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxx</code></pre>
<h2>5. Configure alertmanager.yml</h2>
<pre><code class="language-bash">global:
  resolve_timeout: 5m
route:
  group_by: [&#039;alertname&#039;]
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 5m
  receiver: &#039;web.hook&#039;
receivers:
  - name: &#039;web.hook&#039;
    webhook_configs:
      #- send_resolved: true
      - url: &#039;http://180.184.138.201:8080/adapter/wx&#039;
inhibit_rules:
  - source_match:
      severity: &#039;critical&#039;
    target_match:
      severity: &#039;warning&#039;
    equal: [&#039;alertname&#039;, &#039;dev&#039;, &#039;instance&#039;]</code></pre>
<h2>6. Start the alertmanager service</h2>
<pre><code class="language-bash"># The --storage.path value can be checked with: systemctl status alertmanager
/usr/local/alertmanager/alertmanager --config.file=/usr/local/alertmanager/alertmanager.yml --storage.path=/var/lib/alertmanager/data/ &amp;&gt; /usr/local/alertmanager/access.log &amp;</code></pre>
<h2>7. Open port 9094 (the Alertmanager cluster port; its web UI and API listen on 9093)</h2>
<pre><code class="language-bash">iptables -A RH-Firewall-1-INPUT -s 180.184.138.201/32 -p tcp -m state --state NEW -m tcp --dport 9094 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 9094 -j ACCEPT</code></pre>
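<p>The same rule is repeated throughout this guide with only the source IP and port changing. A small dry-run helper can generate the lines for review before they are applied. print_allow_rules is a hypothetical name; it only prints commands and does not touch the firewall.</p>

```shell
# print_allow_rules PORT SOURCE_IP...
# Prints (does not execute) one ACCEPT rule per source IP for the given TCP port.
print_allow_rules() {
  port=$1
  shift
  for ip in "$@"; do
    echo "iptables -A RH-Firewall-1-INPUT -s ${ip}/32 -p tcp -m state --state NEW -m tcp --dport ${port} -j ACCEPT"
  done
}

print_allow_rules 9094 180.184.138.201 113.108.148.74
```

<p>Once the printed rules look right, the output can be piped to bash to apply them.</p>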
<h1>16. node_exporter remote monitoring and alerting example</h1>
<p>[A] 180.184.138.201 Prometheus
[B] 42.192.10.73 node_exporter</p>
<h3>1. After node_exporter is deployed on B, open the following ports</h3>
<pre><code class="language-bash">iptables -A RH-Firewall-1-INPUT -s 42.192.10.73/32 -p tcp -m state --state NEW -m tcp --dport 9100 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 9100 -j ACCEPT
# Important: the IP of the Prometheus server must also be allowed to reach this port
iptables -A RH-Firewall-1-INPUT -s 180.184.138.201/32 -p tcp -m state --state NEW -m tcp --dport 9100 -j ACCEPT
</code></pre>
<h3>2. On A, add the node_exporter target on B</h3>
<pre><code class="language-bash">[root@Prometheus ~]$ vim /usr/local/prometheus/prometheus.yml
...
  - job_name: &quot;node_exporter&quot;
    static_configs:
      - targets: [&quot;180.184.138.201:9100&quot;,&quot;42.192.10.73:9100&quot;]
[root@Prometheus ~]$ systemctl restart prometheus.service
[root@Prometheus ~]$ iptables -A RH-Firewall-1-INPUT -s 42.192.10.73/32 -p tcp -m state --state NEW -m tcp --dport 9100 -j ACCEPT</code></pre>
<h3>3. Result</h3>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=d7b25077645876d9116ece95ad37cefb&amp;file=file.png" alt="" />
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=3c9b8cebd4179629c2f7dcec4353471c&amp;file=file.png" alt="" /></p>
<h2>4. Alerting</h2>
<h3>4.1 After Alertmanager is installed on B, run vim /usr/local/prometheus/prometheus.yml on A</h3>
<pre><code class="language-bash">...
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 180.184.138.201:9093
            - 42.192.10.73:9093 # add the IP and port of the remote Alertmanager</code></pre>
<h3>4.2 Open the port on B</h3>
<pre><code class="language-bash">iptables -A RH-Firewall-1-INPUT -s 42.192.10.73/32 -p tcp -m state --state NEW -m tcp --dport 9093 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 9093 -j ACCEPT</code></pre>
<h1>17. Install Consul</h1>
<p>Download: <a href="https://www.consul.io/downloa">https://www.consul.io/downloa</a></p>
<h2>Installation script</h2>
<pre><code class="language-bash">#!/bin/bash
unzip /opt/consul_1.10.3_linux_amd64.zip
mv consul /usr/local/bin
sudo mkdir -p /var/lib/consul /etc/consul.d
sudo chmod -R 775 /var/lib/consul /etc/consul.d
cat &lt;&lt;EOF &gt;/etc/systemd/system/consul.service
[Unit]
Description=Consul Service Discovery Agent
Documentation=https://www.consul.io/
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/local/bin/consul agent -server -ui \
-bootstrap-expect=1 \
-data-dir=/var/lib/consul \
-node=consul-server \
-client=0.0.0.0 \
-bind=10.14.0.126 \
-config-dir=/etc/consul.d
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
# Enable at boot, then (re)start so the new unit file takes effect
systemctl enable consul
systemctl restart consul
systemctl status consul</code></pre>
<h2>1. Connect Prometheus to Consul</h2>
<pre><code class="language-bash">...
  - job_name: &quot;prometheus-consul&quot;
    consul_sd_configs:
      - server: 10.14.0.126:8500
...</code></pre>
<h2>2. Restart Prometheus</h2>
<pre><code class="language-bash">systemctl restart prometheus.service</code></pre>
<h2>3. Register services</h2>
<pre><code class="language-bash">vim exporter_registry.yml
---
- name: exporter_registry
  hosts: all
  tasks:
    - name: exporter_registry
      shell: curl --request PUT -d &#039;{&quot;ID&quot;:&quot;{{ ansible_hostname }}-{{ ansible_default_address }}-{{ Prometheus_exporter_tag }}&quot;,&quot;Name&quot;:&quot;{{ ansible_hostname }}-{{ ansible_default_address }}&quot;,&quot;Tags&quot;:[&quot;{{ Prometheus_game_tag }}&quot;],&quot;Address&quot;:&quot;{{ ansible_default_address }}&quot;,&quot;Port&quot;:{{ exporter_port }},&quot;Meta&quot;:{&quot;Name&quot;:&quot;{{ Prometheus_exporter_tag }}&quot;,&quot;group&quot;:&quot;{{ Prometheus_exporter_group_tag }}&quot;},&quot;EnableTagOverride&quot;:false,&quot;Check&quot;:{&quot;HTTP&quot;:&quot;http://{{ ansible_default_address }}:{{ exporter_port }}/metrics&quot;,&quot;Interval&quot;:&quot;10s&quot;},&quot;Weights&quot;:{&quot;Passing&quot;:10,&quot;Warning&quot;:1}}&#039; {{ Prometheus_server_consulIP }}/v1/agent/service/register?replace-existing-checks=1
      delegate_to: localhost
      become: yes
vim nginx_registry.yml
- name: nginx_registry
  hosts: all
  tasks:
    - name: nginx_registry
      shell: curl --request PUT -d &#039;{&quot;ID&quot;:&quot;{{ ansible_hostname }}-{{ ansible_default_address }}-{{ Prometheus_exporter_tag }}&quot;,&quot;Name&quot;:&quot;{{ ansible_hostname }}-{{ ansible_default_address }}&quot;,&quot;Tags&quot;:[&quot;{{ Prometheus_game_tag }}&quot;],&quot;Address&quot;:&quot;{{ ansible_default_address }}&quot;,&quot;Port&quot;:{{ exporter_port }},&quot;Meta&quot;:{&quot;Name&quot;:&quot;{{ Prometheus_exporter_tag }}&quot;,&quot;group&quot;:&quot;{{ Prometheus_exporter_group_tag }}&quot;,&quot;__metrics_path__&quot;:&quot;{{ path }}&quot;},&quot;EnableTagOverride&quot;:false,&quot;Check&quot;:{&quot;HTTP&quot;:&quot;http://{{ ansible_default_address }}:{{ exporter_port }}/status&quot;,&quot;Interval&quot;:&quot;10s&quot;},&quot;Weights&quot;:{&quot;Passing&quot;:10,&quot;Warning&quot;:1}}&#039; {{ Prometheus_server_consulIP }}/v1/agent/service/register?replace-existing-checks=1
      delegate_to: localhost
      become: yes</code></pre>
<h3>Preview</h3>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=45138d26392884d2db4afd89c5bd0708&amp;file=file.png" alt="" /></p>
<h2>4. Deregister a service</h2>
<pre><code class="language-bash">#!/bin/bash
# node_exporter is the service ID that was used at registration
curl --request PUT http://180.184.138.201:8500/v1/agent/service/deregister/node_exporter
</code></pre>
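<h2>5. Label management</h2>
<p>Labels discovered through mechanisms such as Consul service discovery usually need some processing before they fit our needs. This processing is called relabeling, and it provides the following actions:
replace: matches the regex against the source label values and, on a match, writes replacement (which may reference capture groups) to the target label.
labelmap: matches the regex against all label names and copies the values of the matching labels to the label names given by replacement.</p>
<p>[Keeping or removing labels]
labeldrop: matches the regex against all label names; matching labels are removed.
labelkeep: matches the regex against all label names; labels that do not match are removed.</p>
<p>[Target management]
keep: targets whose source labels do not match the regex are dropped.
drop: targets whose source labels match the regex are dropped.</p>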
<h2>6. Relabeling defaults</h2>
<p>regex defaults to (.*),
replacement defaults to $1,
separator defaults to ;,
and action defaults to replace.</p>
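<p>With these defaults, a rule that lists several source labels and a target label simply joins the source values with the separator and writes the result to the target. The sketch below imitates only that joining step; join_labels is a hypothetical helper and the sample values are assumptions.</p>

```shell
# join_labels SEPARATOR VALUE...
# Joins the source label values with the separator, as relabeling does
# before the regex is matched against the joined string.
join_labels() {
  sep=$1
  shift
  out=""
  first=1
  for v in "$@"; do
    if [ "$first" -eq 1 ]; then
      out=$v
      first=0
    else
      out="${out}${sep}${v}"
    fi
  done
  printf '%s\n' "$out"
}

# Default separator ";" with two source label values (sample data).
join_labels ';' 180.184.138.201 node_exporter
```

<p>Because the default regex (.*) matches the whole joined string and the default replacement is $1, the target label ends up with exactly this joined value.</p>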
<h2>7. Using drop to control targets</h2>
<p>Drops every target whose label value matches the regex.</p>
<pre><code class="language-bash">- job_name: &#039;consul&#039;
  consul_sd_configs:
    - server: 180.184.138.201:8500
  relabel_configs:
    - source_labels: [__address__]
      regex: 180.184.138.201:8300
      action: drop</code></pre>
<h2>7. Using keep</h2>
<p>The opposite of drop: only targets whose label value matches the regex are kept.</p>
<pre><code class="language-bash">- job_name: &#039;consul&#039;
  consul_sd_configs:
    - server: 180.184.138.201:8500
  relabel_configs:
    - source_labels: [__address__]
      regex: 180.184.138.201:8300
      action: keep</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=fcc8c9cd730a4331faa2992958eedadc&amp;file=file.png" alt="" />
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=93a066bc207e185fed3a64f1c0be2093&amp;file=file.png" alt="" /></p>
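<p>The effect of keep and drop on a single target can be sketched as follows. target_action is a hypothetical helper; it uses grep -x to imitate the fact that relabeling regexes must match the whole value, and grep's extended regex syntax is only an approximation of the RE2 syntax Prometheus uses.</p>

```shell
# target_action ACTION REGEX ADDRESS
# Prints "kept" or "dropped" for one target under a keep/drop rule
# that uses __address__ as its source label.
target_action() {
  if printf '%s\n' "$3" | grep -Eqx -- "$2"; then
    matched=1
  else
    matched=0
  fi
  case "$1" in
    keep) if [ "$matched" -eq 1 ]; then echo kept; else echo dropped; fi ;;
    drop) if [ "$matched" -eq 1 ]; then echo dropped; else echo kept; fi ;;
    *) echo "unknown action: $1" >&2; return 2 ;;
  esac
}

target_action drop '180\.184\.138\.201:8300' 180.184.138.201:8300
target_action drop '180\.184\.138\.201:8300' 42.192.10.73:9100
target_action keep '180\.184\.138\.201:8300' 42.192.10.73:9100
```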
<h2>8. Using labeldrop</h2>
<p>Removes the specified labels.</p>
<pre><code class="language-bash">    - source_labels: [__meta_consul_node]
      action: replace
      target_label: node_name
    - action: labelmap
      regex: __meta_consul_(.+)
    - regex: tg.* # every label whose name starts with tg is removed by labeldrop
      action: labeldrop</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=d04fd8856d1f35e48da7dccee899273f&amp;file=file.png" alt="" />
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=aa81984f100f78457d07b44b66172766&amp;file=file.png" alt="" /></p>
<h2>9. Using labelmap</h2>
<p>Copies the value of every label whose name matches the regex to a new label named after the part captured by (.+): __meta_consul_service_address="180.184.138.201" becomes service_address="180.184.138.201"</p>
<pre><code class="language-bash">    - action: labelmap
      regex: __meta_consul_(.+)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=0db2cb160cb23591a2f660f736a91535&amp;file=file.png" alt="" /></p>
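<p>The renaming that labelmap performs can be imitated on plain text with sed. This is only an illustration on name="value" lines, with labelmap_consul as a hypothetical helper; Prometheus applies the rule to its internal label set, not to text.</p>

```shell
# labelmap_consul
# Reads name="value" pairs on stdin. For every label whose name matches
# __meta_consul_(.+), prints a copy named after the captured group.
labelmap_consul() {
  sed -n -E 's/^__meta_consul_([^=]+)=(.*)$/\1=\2/p'
}

printf '%s\n' \
  '__meta_consul_service_address="180.184.138.201"' \
  '__address__="180.184.138.201:9100"' \
  | labelmap_consul
```

<p>Only the first input line matches, so the sketch prints a single service_address pair and leaves __address__ out.</p>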
<h2>10. Using replace</h2>
<p>Renames a label: the value of __meta_consul_tags is written to a new label named my_tags.</p>
<pre><code class="language-bash">    - source_labels: [__meta_consul_tags]
      action: replace
      target_label: my_tags</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=e5d78da28b1c62673b097a9748cd10e8&amp;file=file.png" alt="" /></p>
<h2>11. Summary</h2>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=6d2df05edd3932152d064dce074b4f9d&amp;file=file.png" alt="" /></p>
<p>Addendum:</p>
<pre><code class="language-bash"># Drop a label so that it does not show up in alert messages
relabel_configs:
  - action: labeldrop
    regex: &quot;job&quot;
# or
metric_relabel_configs:
  - action: labeldrop
    regex: &quot;job&quot;
  - action: labeldrop
    regex: &quot;instance&quot;
</code></pre>
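<p><strong>Difference between relabel_configs and metric_relabel_configs:</strong>
<strong>relabel_configs is applied to the labels of a target before it is scraped. metric_relabel_configs is applied to the scraped samples after the scrape and before they are stored.</strong></p>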
<h1>18. Registering clients with Consul on the server</h1>
<h2>1. Playbook scripts</h2>
<pre><code class="language-yaml">vim exporter_registry.yml
---
- name: exporter_registry
  hosts: all
  tasks:
    - name: metrics_registry
      shell: curl --request PUT -d &#039;{&quot;ID&quot;:&quot;{{ ansible_hostname }}-{{ ansible_default_address }}-{{ Prometheus_exporter_tag }}&quot;,&quot;Name&quot;:&quot;{{ ansible_hostname }}-{{ ansible_default_address }}&quot;,&quot;Tags&quot;:[&quot;{{ Prometheus_game_tag }}&quot;],&quot;Address&quot;:&quot;{{ ansible_default_address }}&quot;,&quot;Port&quot;:{{ exporter_port }},&quot;Meta&quot;:{&quot;Name&quot;:&quot;{{ Prometheus_exporter_tag }}&quot;,&quot;group&quot;:&quot;{{ Prometheus_exporter_group_tag }}&quot;},&quot;EnableTagOverride&quot;:false,&quot;Check&quot;:{&quot;HTTP&quot;:&quot;http://{{ ansible_default_address }}:{{ exporter_port }}/metrics&quot;,&quot;Interval&quot;:&quot;10s&quot;},&quot;Weights&quot;:{&quot;Passing&quot;:10,&quot;Warning&quot;:1}}&#039; {{ Prometheus_server_consulIP }}/v1/agent/service/register?replace-existing-checks=1
      delegate_to: localhost
      become: yes
</code></pre>
<pre><code class="language-yaml">vim nginx_registry.yml
- name: nginx_registry
  hosts: all
  tasks:
    - name: nginx_registry
      shell: curl --request PUT -d &#039;{&quot;ID&quot;:&quot;{{ ansible_hostname }}-{{ ansible_default_address }}-{{ Prometheus_exporter_tag }}&quot;,&quot;Name&quot;:&quot;{{ ansible_hostname }}-{{ ansible_default_address }}&quot;,&quot;Tags&quot;:[&quot;{{ Prometheus_exporter_tag }}&quot;],&quot;Address&quot;:&quot;{{ ansible_default_address }}&quot;,&quot;Port&quot;:{{ exporter_port }},&quot;Meta&quot;:{&quot;Name&quot;:&quot;{{ Prometheus_exporter_tag }}&quot;,&quot;group&quot;:&quot;{{ Prometheus_exporter_group_tag }}&quot;,&quot;__metrics_path__&quot;:&quot;{{ path }}&quot;},&quot;EnableTagOverride&quot;:false,&quot;Check&quot;:{&quot;HTTP&quot;:&quot;http://{{ ansible_default_address }}:{{ exporter_port }}/status&quot;,&quot;Interval&quot;:&quot;10s&quot;},&quot;Weights&quot;:{&quot;Passing&quot;:10,&quot;Warning&quot;:1}}&#039; {{ Prometheus_server_consulIP }}/v1/agent/service/register?replace-existing-checks=1
      delegate_to: localhost
      become: yes</code></pre>
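<p>Before running the playbooks it can help to see the JSON body they send. The sketch below rebuilds the exporter payload with printf so it can be inspected locally. register_payload is a hypothetical helper, the host name node-b and the tag demo are made-up sample values, and no escaping is done, so the values must not contain double quotes.</p>

```shell
# register_payload HOSTNAME ADDRESS PORT EXPORTER_TAG GAME_TAG GROUP_TAG
# Prints the same JSON structure that exporter_registry.yml sends to Consul.
register_payload() {
  printf '{"ID":"%s-%s-%s","Name":"%s-%s","Tags":["%s"],"Address":"%s","Port":%s,' \
    "$1" "$2" "$4" "$1" "$2" "$5" "$2" "$3"
  printf '"Meta":{"Name":"%s","group":"%s"},"EnableTagOverride":false,' "$4" "$6"
  printf '"Check":{"HTTP":"http://%s:%s/metrics","Interval":"10s"},' "$2" "$3"
  printf '"Weights":{"Passing":10,"Warning":1}}\n'
}

# Sample values (assumed): host "node-b", tags "demo".
register_payload node-b 42.192.10.73 9100 node_exporter demo demo
```

<p>The printed body can be sent by hand with curl --request PUT -d to the /v1/agent/service/register endpoint used in the playbooks.</p>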
<h2>2. Running the playbooks</h2>
<pre><code class="language-bash"># exporter_registry
ansible-playbook -i 42.192.10.73:2020, -e &quot;ansible_default_address=42.192.10.73&quot; -e &quot;Prometheus_exporter_tag=node_exporter&quot; -e &quot;Prometheus_game_tag=永夜降临&quot; -e &quot;Prometheus_exporter_group_tag=永夜降临&quot; -e &quot;exporter_port=9100&quot; -e &quot;Prometheus_server_consulIP=180.184.138.201:8500&quot; -e &quot;ansible_ssh_user=shiyue&quot; exporter_registry.yml
# nginx_registry
ansible-playbook -i 180.184.138.201:2020, \
  -e &quot;Prometheus_exporter_tag=nginx-vts-module&quot; \
  -e &quot;ansible_default_address=180.184.138.201&quot; \
  -e &quot;Prometheus_exporter_group_tag=永夜降临&quot; \
  -e &quot;exporter_port=80&quot; \
  -e &quot;path=/status/format/prometheus&quot; \
  -e &quot;Prometheus_server_consulIP=180.184.138.201:8500&quot; \
  -e &quot;ansible_ssh_user=shiyue&quot; \
  nginx_registry.yml
</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=6ab866f7d59dce2e18a5fcff146e0554&amp;file=file.png" alt="" />
<img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=c1745a66bdaa123b6cc76a3be0a662ed&amp;file=file.png" alt="" /></p>
<h2>3. Deregistering</h2>
<pre><code class="language-bash">[root@Prometheus consul_scripts]$ cat delet_register.sh
#!/bin/bash
curl --request PUT http://180.184.138.201:8500/v1/agent/service/deregister/$1
# Usage example: bash delet_register.sh Prometheus-180.184.138.201-mysqld_exporter</code></pre>
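<p>The ID passed to the deregister script has to match the ID built at registration, which the playbooks compose as hostname-address-exporter tag. The sketch below rebuilds that ID and the resulting URL without calling Consul; service_id and deregister_url are hypothetical helper names.</p>

```shell
# service_id HOSTNAME ADDRESS EXPORTER_TAG
# Rebuilds the ID in the "<hostname>-<address>-<exporter tag>" form used at registration.
service_id() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

# deregister_url CONSUL_ADDR SERVICE_ID
deregister_url() {
  printf 'http://%s/v1/agent/service/deregister/%s\n' "$1" "$2"
}

id=$(service_id Prometheus 180.184.138.201 mysqld_exporter)
deregister_url 180.184.138.201:8500 "$id"
# To actually deregister:
#   curl --request PUT "$(deregister_url 180.184.138.201:8500 "$id")"
```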
<h1>19. Custom monitoring</h1>
<h2>1. Install Pushgateway</h2>
<p>Download: <a href="https://github.com/prometheus/pushgateway/releases/tag/v1.5.1">https://github.com/prometheus/pushgateway/releases/tag/v1.5.1</a></p>
<pre><code class="language-bash">tar -xzvf pushgateway-1.5.1.linux-amd64.tar.gz
mv pushgateway-1.5.1.linux-amd64 /usr/local/pushgateway</code></pre>
<h3>1.1 Open the port</h3>
<pre><code class="language-bash">iptables -A RH-Firewall-1-INPUT -s 180.184.138.201 -p tcp -m state --state NEW -m tcp --dport 9091 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -s 113.108.148.74/32 -p tcp -m state --state NEW -m tcp --dport 9091 -j ACCEPT</code></pre>
<h3>1.2 Create the systemd service</h3>
<pre><code class="language-bash">cat &gt; /usr/lib/systemd/system/pushgateway.service &lt;&lt;EOF
[Unit]
Description=pushgateway
Documentation=https://github.com/prometheus/pushgateway
After=network.target

[Service]
ExecStart=/usr/local/pushgateway/pushgateway
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable pushgateway
systemctl start pushgateway</code></pre>
<h3>1.3 Check the port</h3>
<pre><code class="language-bash">netstat -luntp | grep 9091
tcp6 0 0 :::9091 :::* LISTEN 2448/./pushgateway</code></pre>
<h3>1.4 Open the web page at ip:9091</h3>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=54758f7e99b83b78629cd1e68487ba07&amp;file=file.png" alt="" /></p>
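<h2>2. Example 1: custom metric for the number of users online on a host</h2>
<h3>2.1 Cron job</h3>
<pre><code class="language-bash">[root@Prometheus ~]$ crontab -l
#pushgateway: peopleonline
# Note: this counts the accounts listed in /etc/passwd; use `who | wc -l` to count logged-in sessions instead
*/1 * * * * echo &quot;node_people_online `cat /etc/passwd | wc -l`&quot; | curl --data-binary @- http://180.184.138.201:9091/metrics/job/node_people_online/instance/180.184.138.201:9091/group/长安幻想
# The server must allow the client IP to reach port 9091</code></pre>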
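<p>The push URL follows the pattern /metrics/job/JOB followed by optional LABEL/VALUE pairs that form the grouping key. The sketch below assembles such a URL; pushgateway_url is a hypothetical helper, and it assumes label values without a slash (Pushgateway expects values containing a slash to be passed in its base64 form).</p>

```shell
# pushgateway_url BASE JOB [LABEL VALUE]...
# Builds the push URL from the job name and the grouping labels.
pushgateway_url() {
  url="$1/metrics/job/$2"
  shift 2
  while [ "$#" -ge 2 ]; do
    url="$url/$1/$2"
    shift 2
  done
  printf '%s\n' "$url"
}

pushgateway_url http://180.184.138.201:9091 node_people_online instance 180.184.138.201:9091
# Push one sample with it:
#   echo "node_people_online 3" | curl --data-binary @- "$(pushgateway_url http://180.184.138.201:9091 node_people_online instance 180.184.138.201:9091)"
```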
<h3>2.2 Alert rule</h3>
<pre><code class="language-bash">- alert: &quot;Host online user count alert&quot;
  expr: node_people_online &gt; 50
  labels:
    severity: warning
  annotations:
    description: &quot;The number of online users on this host has reached the threshold (current value: {{ $value }})&quot;</code></pre>
<h2>3. Example 2: custom monitoring of ports above 45000</h2>
<h3>3.1 Shell script</h3>
<pre><code class="language-bash">#!/bin/bash
echo &quot;# TYPE port_more_than_45000 gauge&quot;
pub_ip=`dig +short myip.opendns.com @resolver1.opendns.com`
name=&quot;port_more_than_45000&quot;
# Collect the ports above 45000 that are in the closed state
port=(`nmap $pub_ip | sed &#039;1,6d&#039; | sed &#039;$d&#039; | sed &#039;/^$/d&#039; | awk &#039;{print $1,$2}&#039; | sed &#039;s/\/tcp//g&#039; | sed -n &#039;/closed$/p&#039; | awk &#039;{if($1&gt;45000) print $1}&#039;`)
for i in ${port[@]}
do
  # closed port: value 0
  echo $name&#039;{instance=&#039;\&quot;${pub_ip}\&quot;&#039;,group=&quot;长安幻想&quot;,mode=&#039;\&quot;${i}\&quot;&#039;}&#039; 0
done
# Collect the ports above 45000 that are in the open state
port=(`nmap $pub_ip | sed &#039;1,6d&#039; | sed &#039;$d&#039; | sed &#039;/^$/d&#039; | awk &#039;{print $1,$2}&#039; | sed &#039;s/\/tcp//g&#039; | sed -n &#039;/open$/p&#039; | awk &#039;{if($1&gt;45000) print $1}&#039;`)
for j in ${port[@]}
do
  # open port: value 1
  echo $name&#039;{instance=&#039;\&quot;${pub_ip}\&quot;&#039;,group=&quot;长安幻想&quot;,mode=&#039;\&quot;${j}\&quot;&#039;}&#039; 1
done
</code></pre>
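<p>The parsing step can be tried without running nmap by feeding it sample port lines. The sketch below is an alternative single-pass awk version, not the script above: ports_to_metrics is a hypothetical helper, the input lines are made-up samples in the usual nmap port-table layout, and the group label is left out for brevity.</p>

```shell
# ports_to_metrics IP MIN_PORT
# Reads nmap-style port lines ("45001/tcp open unknown") on stdin and prints
# one sample per TCP port above MIN_PORT: 1 for open, 0 for closed.
ports_to_metrics() {
  awk -v ip="$1" -v min="$2" '
    $1 ~ /^[0-9]+\/tcp$/ && ($2 == "open" || $2 == "closed") {
      split($1, parts, "/")
      if (parts[1] + 0 > min + 0)
        printf "port_more_than_45000{instance=\"%s\",mode=\"%s\"} %d\n", ip, parts[1], ($2 == "open")
    }'
}

# Sample input (assumed), including a header line and a port below the threshold.
printf '%s\n' \
  'PORT      STATE  SERVICE' \
  '22/tcp    open   ssh' \
  '45001/tcp open   unknown' \
  '45002/tcp closed unknown' \
  | ports_to_metrics 203.0.113.10 45000
```

<p>Scanning once and classifying each line in awk avoids running nmap twice, at the cost of differing slightly from the original pipeline.</p>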
<h3>3.2 Write the output to a txt file</h3>
<pre><code class="language-bash">bash new_port.sh &gt; new_port.txt</code></pre>
<h3>3.3 Push the data</h3>
<pre><code class="language-bash">curl -XPOST --data-binary @/root/pro_script/port_more_than_45000/new_port.txt http://180.184.138.201:9091/metrics/job/port_more_than_45000
# Note: to avoid losing data when pushgateway restarts or crashes, the pushed metrics can be persisted with the --persistence.file and --persistence.interval flags.</code></pre>
<h3>3.4 Cron jobs</h3>
<pre><code class="language-bash">crontab -e
*/1 * * * * bash /root/pro_script/port_more_than_45000/new_port.sh &gt; /root/pro_script/port_more_than_45000/new_port.txt
*/2 * * * * bash /root/pro_script/port_more_than_45000/push.sh</code></pre>
<h2>4. Custom monitoring of Wazuh</h2>
<h3>4.1 Shell script</h3>
<pre><code class="language-bash">#!/bin/bash
# Get the number of CPU cores
cpu=`cat /proc/cpuinfo | grep -w processor | wc -l`
agent_pid=`ps -ef | grep wazuh-agentd | grep -v grep | awk &#039;{print $2}&#039;`
agent_cpu=`top -b -n 1 | awk &#039;{if($1==&#039;$agent_pid&#039;) print $9}&#039;`
agent_mem=`top -b -n 1 | awk &#039;{if($1==&#039;$agent_pid&#039;) print $10}&#039;`
execd_cpu=`top -b -n 1 | grep -w wazuh-execd | awk &#039;{print $9}&#039;`
execd_mem=`top -b -n 1 | grep -w wazuh-execd | awk &#039;{print $10}&#039;`
syscheckd_pid=`ps -ef | grep wazuh-syscheckd | grep -v grep | awk &#039;{print $2}&#039;`
syscheckd_cpu=`top -b -n 1 | awk &#039;{if($1==&#039;$syscheckd_pid&#039;) print $9}&#039;`
syscheckd_mem=`top -b -n 1 | awk &#039;{if($1==&#039;$syscheckd_pid&#039;) print $10}&#039;`
logcollector_pid=`ps -ef | grep wazuh-logcollector | grep -v grep | awk &#039;{print $2}&#039;`
logcollector_cpu=`top -b -n 1 | awk &#039;{if($1==&#039;$logcollector_pid&#039;) print $9}&#039;`
logcollector_mem=`top -b -n 1 | awk &#039;{if($1==&#039;$logcollector_pid&#039;) print $10}&#039;`
modulesd_pid=`ps -ef | grep wazuh-modulesd | grep -v grep | awk &#039;{print $2}&#039;`
modulesd_cpu=`top -b -n 1 | awk &#039;{if($1==&#039;$modulesd_pid&#039;) print $9}&#039;`
modulesd_mem=`top -b -n 1 | awk &#039;{if($1==&#039;$modulesd_pid&#039;) print $10}&#039;`
n1=$(echo &quot;scale=1;$agent_cpu / $cpu&quot; | bc)
n2=$(echo &quot;scale=1;$execd_cpu / $cpu&quot; | bc)
n3=$(echo &quot;scale=1;$syscheckd_cpu / $cpu&quot; | bc)
n4=$(echo &quot;scale=1;$logcollector_cpu / $cpu&quot; | bc)
n5=$(echo &quot;scale=1;$modulesd_cpu / $cpu&quot; | bc)
echo &quot;#TYPE wazuh_agent_cpu_use gauge&quot;
#echo &quot;wazuh_agent_cpu_use &quot; `echo | awk &quot;{print $agent_cpu / $cpu}&quot;`
echo &quot;wazuh_agent_cpu_use ${n1}&quot;
echo &quot;#TYPE wazuh_agent_mem_use gauge&quot;
echo &quot;wazuh_agent_mem_use ${agent_mem}&quot;
#
echo &quot;#TYPE wazuh_execd_cpu_use gauge&quot;
#echo &quot;wazuh_execd_cpu_use &quot; `echo | awk &quot;{print $execd_cpu / $cpu}&quot;`
echo &quot;wazuh_execd_cpu_use ${n2}&quot;
echo &quot;#TYPE wazuh_execd_mem_use gauge&quot;
echo &quot;wazuh_execd_mem_use ${execd_mem}&quot;
echo &quot;#TYPE wazuh_syscheckd_cpu_use gauge&quot;
#echo &quot;wazuh_syscheckd_cpu_use &quot; `echo | awk &quot;{print $syscheckd_cpu / $cpu}&quot;`
echo &quot;wazuh_syscheckd_cpu_use ${n3}&quot;
echo &quot;#TYPE wazuh_syscheckd_mem_use gauge&quot;
echo &quot;wazuh_syscheckd_mem_use ${syscheckd_mem}&quot;
echo &quot;#TYPE wazuh_logcollector_cpu_use gauge&quot;
#echo &quot;wazuh_logcollector_cpu_use &quot; `echo | awk &quot;{print $logcollector_cpu / $cpu}&quot;`
echo &quot;wazuh_logcollector_cpu_use ${n4}&quot;
echo &quot;#TYPE wazuh_logcollector_mem_use gauge&quot;
echo &quot;wazuh_logcollector_mem_use ${logcollector_mem}&quot;
#
echo &quot;#TYPE wazuh_modulesd_cpu_use gauge&quot;
#echo &quot;wazuh_modulesd_cpu_use &quot; `echo | awk &quot;{print $modulesd_cpu / $cpu}&quot;`
echo &quot;wazuh_modulesd_cpu_use ${n5}&quot;
echo &quot;#TYPE wazuh_modulesd_mem_use gauge&quot;
echo &quot;wazuh_modulesd_mem_use ${modulesd_mem}&quot;</code></pre>
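<p>When one of the Wazuh processes is not running, the corresponding variable is empty and the bc division fails. A guarded division could look like the sketch below. per_core_pct is a hypothetical helper, and it rounds to one decimal with awk, whereas bc with scale=1 truncates.</p>

```shell
# per_core_pct CPU_PERCENT CORES
# Divides a top-style %CPU value by the core count.
# An empty value (process not running) is treated as 0.
per_core_pct() {
  awk -v pct="${1:-0}" -v cores="$2" 'BEGIN { printf "%.1f\n", (cores > 0 ? pct / cores : 0) }'
}

per_core_pct 13.2 4   # a process using 13.2% of one core on a 4-core host
per_core_pct "" 4     # process not found
```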
<p><strong>[Single host]</strong></p>
<h3>4.2 Run the script and save the output to a txt file</h3>
<pre><code class="language-bash">bash wazuh_cpu_mem.sh &gt; wazuh.txt</code></pre>
<h3>4.3 Push the data and schedule the cron jobs</h3>
<pre><code class="language-bash">*/1 * * * * sleep 25;bash /root/pro_script/wazuh/wazuh_cpu_mem.sh &gt; /root/pro_script/wazuh/wazuh.txt
*/2 * * * * sleep 40;curl -XPOST --data-binary @/root/pro_script/wazuh/wazuh.txt http://180.184.138.201:9091/metrics/job/wazuh/instance/180.184.138.201</code></pre>
<p><strong>[Cluster]</strong>
Run the playbook on the server.</p>
<pre><code class="language-bash">vim wazuh.yml
---
- name: wazuh
  hosts: all
  #remote_user: root
  tasks:
    - name: create wazuh directory
      file:
        path: /home/shiyue/wazuh_monitor
        state: directory
        mode: 0755
    - name: copy wazuh_cpu_mem.sh
      copy:
        src: /root/pro_script/wazuh/wazuh_cpu_mem.sh
        dest: /home/shiyue/wazuh_monitor/
        mode: 0755
    - name: crontab
      cron:
        name: &quot;get data&quot;
        minute: &quot;*/1&quot;
        user: shiyue
        state: present
        job: &quot;sleep 25;sudo /home/shiyue/wazuh_monitor/wazuh_cpu_mem.sh &gt; /home/shiyue/wazuh_monitor/wazuh.txt&quot;
      #become: yes
    - name: crontab 2
      cron:
        name: &quot;push data&quot;
        minute: &quot;*/1&quot;
        state: present
        job: &quot;sleep 40;curl -XPOST --data-binary @/home/shiyue/wazuh_monitor/wazuh.txt http://180.184.138.201:9091/metrics/job/wazuh/instance/{{ public_network }}&quot;
      #become: yes
</code></pre>
<p>Run the playbook</p>
<pre><code class="language-bash">ansible-playbook -i 123.60.51.109:2020, -e &quot;ansible_ssh_user=shiyue&quot; -e &quot;public_network=123.60.51.109&quot; wazuh.yml</code></pre>