
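# Example - Monitoring and Alerting on a Host

## Prerequisites

Node Exporter is installed on the host.

## Defining the Alerting Rules

Edit `prometheus.yml` and register the rule file `host.yml`: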

```
rule_files:
  - /etc/prometheus/rule_files/host.yml
```

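Create the file `/usr/local/prometheus/rule_files/host.yml` with the content below. Prometheus must be able to read this file at the path listed under `rule_files` in `prometheus.yml`, so if the two locations differ on your system, adjust one of them so that they point to the same file.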

```
groups:
- name: Host
  rules:
  # CPU usage per instance: 100% minus the average idle fraction across all cores.
  - alert: HostCPU
    expr: 100 * (1 - avg(irate(node_cpu_seconds_total{mode="idle"}[2m])) by(instance)) > 80
    for: 5m
    labels:
      severity: high
    annotations:
      summary: "{{$labels.instance}}: High CPU Usage Detected"
      description: "{{$labels.instance}}: CPU usage is {{$value}}, above 80%"

  # Memory usage: share of total memory that is not available.
  - alert: HostMemory
    expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 80
    for: 5m
    labels:
      severity: middle
    annotations:
      summary: "{{$labels.instance}}: High Memory Usage Detected"
      description: "{{$labels.instance}}: Memory usage is {{ $value }}, above 80%"

  # Disk usage per xfs/ext4 filesystem: share of the filesystem size that is not available.
  - alert: HostDisk
    expr: 100 * (node_filesystem_size_bytes{fstype=~"xfs|ext4"} - node_filesystem_avail_bytes) / node_filesystem_size_bytes > 80
    for: 5m
    labels:
      severity: low
    annotations:
      summary: "{{$labels.instance}}: High Disk Usage Detected"
      description: "{{$labels.instance}}, mountpoint {{$labels.mountpoint}}: Disk usage is {{ $value }}, above 80%"
```
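To make the arithmetic behind these expressions concrete, here is a minimal Python sketch that mirrors the CPU and memory/disk formulas on plain numbers. The function names and sample values are hypothetical and only illustrate the calculation; Prometheus itself evaluates the PromQL expressions, not this code.

```python
def cpu_usage_percent(idle_rates):
    """Mirror 100 * (1 - avg(idle rate)) for a single instance.

    idle_rates holds the per-core idle fraction (0.0 to 1.0), which is what
    irate(node_cpu_seconds_total{mode="idle"}[2m]) yields for each core.
    """
    return 100 * (1 - sum(idle_rates) / len(idle_rates))


def used_percent(total_bytes, available_bytes):
    """Mirror (total - available) / total * 100, as used for memory and disk."""
    return (total_bytes - available_bytes) / total_bytes * 100


# Hypothetical sample values, not taken from a real host.
cpu = cpu_usage_percent([0.10, 0.20, 0.05, 0.25])  # four cores
mem = used_percent(16 * 1024**3, 4 * 1024**3)      # 16 GiB total, 4 GiB available

# Each rule only becomes active when its value exceeds the threshold of 80.
print(f"cpu={cpu:.1f}% exceeds_threshold={cpu > 80}")
print(f"mem={mem:.1f}% exceeds_threshold={mem > 80}")
```

With these sample numbers the CPU value is above the threshold and the memory value is below it, so only the CPU rule would start its `for` timer.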

Then restart Prometheus.
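As an alternative to a full restart, Prometheus can re-read its configuration and rule files through its `/-/reload` HTTP endpoint, but only when it was started with the `--web.enable-lifecycle` flag. The sketch below assumes that flag is set and that Prometheus listens on `http://localhost:9090`; both are assumptions about your setup, and the helper names are hypothetical.

```python
import urllib.request


def build_reload_request(base_url="http://localhost:9090"):
    """Build the POST request for the reload endpoint (base_url is an assumption)."""
    return urllib.request.Request(f"{base_url}/-/reload", method="POST")


def reload_prometheus(base_url="http://localhost:9090"):
    """Send the reload request; return True on HTTP 200.

    Raises urllib.error.HTTPError or URLError if the endpoint is disabled
    or Prometheus is unreachable.
    """
    with urllib.request.urlopen(build_reload_request(base_url), timeout=5) as resp:
        return resp.status == 200


# Usage (needs a running Prometheus with --web.enable-lifecycle):
#   reload_prometheus()
```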

Open the UI and check the current state of the alerts. As shown below, all of them are `Inactive`:

![](/files/-MA1FfBj6wa5uQTQTYhE)
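The same states can also be read from the Prometheus HTTP API at `/api/v1/rules`, which is handy for scripted checks. The following is a minimal sketch, assuming Prometheus is reachable at `http://localhost:9090`; the helper names and the hand-written sample payload are illustrative and trimmed to the fields the code reads.

```python
import json
import urllib.request


def fetch_rules(base_url="http://localhost:9090"):
    """GET /api/v1/rules and return the decoded JSON body (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/rules", timeout=5) as resp:
        return json.load(resp)


def rule_states(payload):
    """Map each alerting rule name to its state: inactive, pending or firing."""
    states = {}
    for group in payload["data"]["groups"]:
        for rule in group["rules"]:
            # Recording rules have no state, so only alerting rules are kept.
            if rule.get("type") == "alerting":
                states[rule["name"]] = rule["state"]
    return states


# Hand-written sample in the shape of the API response, trimmed for brevity.
sample = {
    "status": "success",
    "data": {
        "groups": [
            {
                "name": "Host",
                "rules": [
                    {"name": "HostCPU", "type": "alerting", "state": "inactive"},
                    {"name": "HostMemory", "type": "alerting", "state": "inactive"},
                    {"name": "HostDisk", "type": "alerting", "state": "inactive"},
                ],
            }
        ]
    },
}

print(rule_states(sample))
# Against a live server: rule_states(fetch_rules())
```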

Now we can manually drive up CPU usage on the host to verify the Prometheus alerting flow. Run the following command on the host (run `pkill -9 dd` to stop the processes it starts):

```
$ for i in $(seq 1 $(grep -c ^processor /proc/cpuinfo)); do dd if=/dev/zero of=/dev/null & done
```

After about 15 seconds (the Prometheus scrape interval is set to 15s here), the UI shows CPU usage at 100%:

![](/files/-MA1FfBk-2IM-wFn8ago)

At the same time, the `Alerts` page shows `HostCPU` in the `Pending` state:

![](/files/-MA1FfBlTziF6pqZvr8c)

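Because the rule sets a `for` duration of 5 minutes, `HostCPU` does not switch to `Firing` yet. If we run `pkill -9 dd` at this point so that CPU usage drops, `HostCPU` goes back to `Inactive`.

If the load keeps running instead, then five minutes later `HostCPU` changes from `Pending` to `Firing`: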

![](/files/-MA1FfBmxvx9AjaF_3qe)
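The `Inactive`, `Pending`, `Firing` progression can be sketched as a small simulation. This is a simplified model rather than Prometheus's actual implementation: it assumes the rule is evaluated every 15 seconds (an assumption about the evaluation interval) and that an alert fires once the expression has held for the full `for` duration. The function name is hypothetical.

```python
def simulate_alert(values, threshold=80.0, for_seconds=300, interval=15):
    """Return the alert state after each evaluation of a `value > threshold` rule.

    values      -- one sample per evaluation, e.g. CPU usage in percent
    for_seconds -- the rule's `for` duration (5m in host.yml)
    interval    -- seconds between evaluations (assumed to be 15)
    """
    states = []
    active_since = None  # time at which the condition first became true
    for step, value in enumerate(values):
        now = step * interval
        if value > threshold:
            if active_since is None:
                active_since = now
            held = now - active_since
            states.append("firing" if held >= for_seconds else "pending")
        else:
            # Any evaluation below the threshold resets the timer.
            active_since = None
            states.append("inactive")
    return states


# One idle evaluation followed by sustained load.
sustained = simulate_alert([10] + [100] * 21)
print(sustained[0], sustained[1], sustained[-1])

# Load stopped early (like running `pkill -9 dd`), then started again.
interrupted = simulate_alert([100] * 10 + [10] + [100] * 5)
print(interrupted[9], interrupted[10], interrupted[11])
```

The second run shows why stopping the load before five minutes have passed returns the alert to `Inactive`: the timer restarts from zero the next time the threshold is crossed.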

The next section describes how to send alerts out through AlertManager.


