Beats: Demystifying the setup Command in Filebeat

Elastic China Community Official Blog

Published 2020-10-20 14:47:19

License: CC 4.0 BY-SA

This is an original article by the blog author and may not be reproduced without the author's permission.

Article link: https://blog.csdn.net/UbuntuTouch/article/details/109178666

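This article explains what the key command ./filebeat setup does while installing and configuring Filebeat: creating the index template, loading dashboards, and setting up machine learning. It also looks at how the ingest pipelines work.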

In an earlier tutorial, I described in detail how to start Filebeat and monitor system logs. One very important step when starting Filebeat is:

./filebeat setup

This step is essential, yet it is only sparsely documented. Why is it needed, and what exactly does it do?

First, the command prints output like the following:

$ ./filebeat setup
Overwriting ILM policy is disabled. Set `setup.ilm.overwrite: true` for enabling.

Index setup finished.
Loading dashboards (Kibana must be running and reachable)
Loaded dashboards
Setting up ML using setup --machine-learning is going to be removed in 8.0.0. Please use the ML app instead.
See more: https://www.elastic.co/guide/en/machine-learning/current/index.html
Loaded machine learning job configurations
Loaded Ingest pipelines
liuxg:filebeat-7.9.1-darwin-x86_64 liuxg$ ./filebeat modules enable system
Enabled system

From this output we can see that the step does several things:

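1) Index setup finished

In this step, Filebeat creates an index template along with the matching index pattern. In Kibana we can see that a new index pattern has been created:

In other words, every document that Filebeat ingests is automatically reachable through this index pattern.

It also generates a matching Index Lifecycle Management (ILM) policy:

We can click Actions to view the content of this policy, and we can even modify it.

We can also use the following command to view the index template created for Filebeat: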

GET _template/filebeat-7.9.1

From the response we can see that this index template has a rollover_alias named filebeat-7.9.1. We can inspect the alias as follows:

GET _alias/filebeat-7.9.1

The command above returns:

{
  "filebeat-7.9.1-2020.10.20-000001" : {
    "aliases" : {
      "filebeat-7.9.1" : {
        "is_write_index" : true
      }
    }
  }
}
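The alias response above maps one index to the alias and flags it with is_write_index. As a minimal sketch of how that response could be read programmatically, the snippet below picks out the write index from a dictionary shaped like the response. The function name find_write_index is hypothetical and exists only for this illustration; the data is copied from the output above.

```python
from typing import Optional


def find_write_index(alias_response: dict, alias: str) -> Optional[str]:
    """Return the index flagged as the write index for `alias`, or None."""
    for index_name, body in alias_response.items():
        alias_info = body.get("aliases", {}).get(alias)
        if alias_info is not None and alias_info.get("is_write_index"):
            return index_name
    return None


# Response copied from the `GET _alias/filebeat-7.9.1` output above.
response = {
    "filebeat-7.9.1-2020.10.20-000001": {
        "aliases": {"filebeat-7.9.1": {"is_write_index": True}}
    }
}

print(find_write_index(response, "filebeat-7.9.1"))
```

New documents written through the alias go to the index returned here, while searches through the alias cover every index attached to it.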

We can use the filebeat-7.9.1 alias to reach all Filebeat documents:

GET filebeat-7.9.1/_search

2) Loaded dashboards

This means that all Filebeat-related dashboards have been loaded for us:

After running the setup command, the ready-made dashboards are available under Dashboards in Kibana.

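3) Setting up ML using setup

This configures machine learning. As the output notes, setting up ML through setup is going to be removed in 8.0.0 in favor of the ML app. Machine learning is a Platinum feature, so we have to start the 30-day trial to use it:

For details, see https://www.elastic.co/guide/en/machine-learning/current/index.html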

4) Loaded Ingest pipelines

This means the ingest pipelines have been loaded. Before any module is enabled, nothing shows up in Kibana, but once we enable a module as follows:

./filebeat modules enable system

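Above, we enabled the system module. In Kibana we can now see:

Two ingest pipelines have been created. We can click the three dots on the right to edit one:

We can inspect the current definition of the ingest pipeline, and we can also click Test pipeline to test it directly. To do so, open the following file on your own machine: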

macOS:

/var/log/system.log

Ubuntu:

/var/log/syslog
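As a small convenience sketch, the two paths above can be selected from the operating system name that Python reports. The mapping only covers the two systems mentioned in this article; treating every Linux system like Ubuntu is an assumption of this example, and the helper names default_syslog_path and first_line are hypothetical.

```python
import platform
from typing import Optional

# Paths taken from the article; "Linux" is assumed to behave like Ubuntu here.
SYSLOG_PATHS = {
    "Darwin": "/var/log/system.log",
    "Linux": "/var/log/syslog",
}


def default_syslog_path(system_name: Optional[str] = None) -> Optional[str]:
    """Return the log path for the given (or current) OS, or None if unknown."""
    name = system_name or platform.system()
    return SYSLOG_PATHS.get(name)


def first_line(path: str) -> str:
    """Read the first log line; may require elevated permissions on some systems."""
    with open(path, "r", errors="replace") as handle:
        return handle.readline().rstrip("\n")


print(default_syslog_path("Darwin"))
```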

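Copy one of the messages:

Then enter it in the Test pipeline window as follows:

Click Run the pipeline:

We can see the output above. It shows that this ingest pipeline parses the system.log/syslog message correctly.

We can retrieve the content of this ingest pipeline as follows: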

GET _ingest/pipeline/filebeat-7.9.1-system-syslog-pipeline

The command above returns:

{
  "filebeat-7.9.1-system-syslog-pipeline" : {
    "description" : "Pipeline for parsing Syslog messages.",
    "processors" : [
      {
        "grok" : {
          "field" : "message",
          "patterns" : [
            """%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\[%{POSINT:process.pid:long}\])?: %{GREEDYMULTILINE:system.syslog.message}""",
            "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{GREEDYMULTILINE:system.syslog.message}",
            """%{TIMESTAMP_ISO8601:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\[%{POSINT:process.pid:long}\])?: %{GREEDYMULTILINE:system.syslog.message}"""
          ],
          "pattern_definitions" : {
            "GREEDYMULTILINE" : """(.|
)*"""
          },
          "ignore_missing" : true
        }
      },
      {
        "remove" : {
          "field" : "message"
        }
      },
      {
        "rename" : {
          "field" : "system.syslog.message",
          "target_field" : "message",
          "ignore_missing" : true
        }
      },
      {
        "date" : {
          "if" : "ctx.event.timezone == null",
          "field" : "system.syslog.timestamp",
          "target_field" : "@timestamp",
          "formats" : [
            "MMM  d HH:mm:ss",
            "MMM dd HH:mm:ss",
            "MMM d HH:mm:ss",
            "ISO8601"
          ],
          "on_failure" : [
            {
              "append" : {
                "field" : "error.message",
                "value" : "{{ _ingest.on_failure_message }}"
              }
            }
          ]
        }
      },
      {
        "date" : {
          "field" : "system.syslog.timestamp",
          "target_field" : "@timestamp",
          "formats" : [
            "MMM  d HH:mm:ss",
            "MMM dd HH:mm:ss",
            "MMM d HH:mm:ss",
            "ISO8601"
          ],
          "timezone" : "{{ event.timezone }}",
          "on_failure" : [
            {
              "append" : {
                "field" : "error.message",
                "value" : "{{ _ingest.on_failure_message }}"
              }
            }
          ],
          "if" : "ctx.event.timezone != null"
        }
      },
      {
        "remove" : {
          "field" : "system.syslog.timestamp"
        }
      },
      {
        "set" : {
          "value" : "event",
          "field" : "event.type"
        }
      }
    ],
    "on_failure" : [
      {
        "set" : {
          "field" : "error.message",
          "value" : "{{ _ingest.on_failure_message }}"
        }
      }
    ]
  }
}
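To make the first grok pattern in the definition above more concrete, here is a rough Python approximation of it using a regular expression. This is only a sketch: the sub-patterns standing in for SYSLOGTIMESTAMP, SYSLOGHOST, and POSINT are simplified by hand and are not the real grok definitions, and the function name parse_syslog_line is hypothetical.

```python
import re
from typing import Optional

# Hand-written stand-ins for the grok building blocks (simplified assumptions):
#   SYSLOGTIMESTAMP -> month name, day, HH:mm:ss
#   SYSLOGHOST      -> word characters, dots, hyphens
#   DATA            -> lazy ".*?"
#   GREEDYMULTILINE -> "(.|\n)*", as in pattern_definitions
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>[\w.\-]+) "
    r"(?P<process>.*?)(?:\[(?P<pid>[1-9]\d*)\])?: "
    r"(?P<message>(?:.|\n)*)$"
)


def parse_syslog_line(line: str) -> Optional[dict]:
    """Split one syslog line into fields named like the pipeline's output."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    doc = {
        "system": {"syslog": {"timestamp": match["timestamp"]}},
        "host": {"hostname": match["hostname"]},
        "process": {"name": match["process"]},
        "message": match["message"],
    }
    if match["pid"] is not None:
        # The grok pattern casts the pid with ":long"; int() plays that role here.
        doc["process"]["pid"] = int(match["pid"])
    return doc


sample = (
    'Oct 20 00:41:22 liuxg syslogd[122]: Configuration Notice: ASL Module '
    '"com.apple.cdscheduler" claims selected messages.'
)
print(parse_syslog_line(sample))
```

The optional group around the pid mirrors the `(?:\[...\])?` part of the grok pattern, so lines from processes that log without a pid still match.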

Likewise, we can test this ingest pipeline as follows:

POST _ingest/pipeline/filebeat-7.9.1-system-syslog-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "Oct 20 00:41:22 liuxg syslogd[122]: Configuration Notice: ASL Module \"com.apple.cdscheduler\" claims selected messages. Those messages may not appear in standard system log files or in the ASL database."
      }
    }
  ]
}
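The same _simulate request can also be assembled outside of Kibana. The sketch below builds the path and JSON body with the Python standard library; the helper names build_simulate_request and simulate are hypothetical, and the base URL you pass in (for example an unsecured local node at http://localhost:9200) is an assumption about your own setup rather than anything stated in this article.

```python
import json
import urllib.request


def build_simulate_request(pipeline_id: str, messages: list) -> tuple:
    """Return (path, body) for simulating `pipeline_id` on raw log messages."""
    path = f"/_ingest/pipeline/{pipeline_id}/_simulate"
    body = {"docs": [{"_source": {"message": m}} for m in messages]}
    return path, body


def simulate(base_url: str, pipeline_id: str, messages: list) -> dict:
    """POST the request to Elasticsearch; base_url is an assumption of this sketch."""
    path, body = build_simulate_request(pipeline_id, messages)
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


path, body = build_simulate_request(
    "filebeat-7.9.1-system-syslog-pipeline",
    ["Oct 20 00:41:22 liuxg syslogd[122]: Configuration Notice"],
)
print(path)
print(json.dumps(body, indent=2))
```

Only the request is built and printed here; calling simulate requires a reachable cluster, and a secured cluster would also need credentials that this sketch does not handle.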

The command above returns:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "process" : {
            "name" : "syslogd",
            "pid" : 122
          },
          "system" : {
            "syslog" : {
              "timestamp" : "Oct 20 00:41:22"
            }
          },
          "host" : {
            "hostname" : "liuxg"
          },
          "message" : """Configuration Notice: ASL Module "com.apple.cdscheduler" claims selected messages. Those messages may not appear in standard system log files or in the ASL database.""",
          "error" : {
            "message" : ""
          }
        },
        "_ingest" : {
          "timestamp" : "2020-10-20T06:45:00.940665Z"
        }
      }
    }
  ]
}

Clearly, this ingest pipeline parses our syslog line into structured fields such as process.name, process.pid, and host.hostname.
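The date processors in the pipeline definition list formats such as "MMM dd HH:mm:ss", which carry neither a year nor a time zone. The following sketch shows the same kind of conversion in Python for the timestamp string seen in the simulated result. The year and the UTC offset are explicit parameters here; that is an assumption of this example and not a statement about how Elasticsearch fills in the missing parts. The function name parse_syslog_timestamp is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_syslog_timestamp(
    text: str, year: int, utc_offset_hours: float = 0.0
) -> datetime:
    """Parse 'Oct 20 00:41:22' style text into a timezone-aware datetime."""
    # Whitespace in the format matches one or more spaces, so both
    # "Oct  5 ..." and "Oct 20 ..." are accepted by the same format string.
    naive = datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")
    return naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))


# Year and offset are supplied by hand for this illustration.
print(parse_syslog_timestamp("Oct 20 00:41:22", 2020, 8).isoformat())
```

The month abbreviation is parsed with the current locale, so the sketch assumes an English or C locale.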