ELK ingest plugins pipeline
Filebeat + Elasticsearch + Kibana: a lightweight log collection and display system
https://wzyboy.im/post/1111.html?utm_source=tuicool&utm_medium=referral
The post mentions that instead of
beat -> logstash -> elk
you can use
beat -> elk ingest plugins (Elasticsearch Ingest Node)
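Elasticsearch Ingest Node is a feature added in Elasticsearch 5.0. Before Ingest Node existed, people usually put a Logstash Indexer in front of ES to preprocess data. With Ingest Node, most of what the Logstash Indexer did can be replaced: processors familiar to Logstash users, such as grok and geoip, are also available in Ingest Node. For ES users with small data volumes, saving the cost of a Logstash machine is naturally welcome. For ES users with large data volumes, Ingest Nodes, like Master Nodes and Data Nodes, can be given dedicated nodes and scaled horizontally, so performance bottlenecks are not a concern either.
Ingest Node currently supports dozens of processors, of which the script processor is the most flexible.
Like /_template, the Ingest API lives under /_ingest. Once a pipeline definition has been submitted, Beats can name that pipeline as the preprocessor for its data.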
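As a rough illustration of the two steps above, the sketch below builds (but does not send) the HTTP requests involved: a PUT that stores a pipeline under /_ingest/pipeline/<id>, and an index request that names the pipeline through the `pipeline` query parameter. The host `http://localhost:9200`, the index name `logs`, and the helper function names are assumptions made for this example only.

```python
import json
import urllib.parse
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local single-node cluster


def put_pipeline_request(pipeline_id, definition):
    """Build a PUT /_ingest/pipeline/<id> request that stores a pipeline."""
    url = f"{ES_URL}/_ingest/pipeline/{urllib.parse.quote(pipeline_id)}"
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def index_request(index, doc, pipeline_id):
    """Build an index request that runs `doc` through a stored pipeline."""
    query = urllib.parse.urlencode({"pipeline": pipeline_id})
    url = f"{ES_URL}/{index}/_doc?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


definition = {
    "description": "Add geoip information",
    "processors": [{"geoip": {"field": "clientip", "ignore_missing": True}}],
}
put_req = put_pipeline_request("geoippipeline", definition)
idx_req = index_request("logs", {"clientip": "8.8.0.0"}, "geoippipeline")

# Nothing is sent here; passing a request to urllib.request.urlopen() would send it.
print(put_req.get_method(), put_req.full_url)
print(idx_req.get_method(), idx_req.full_url)
```

Beats takes care of the second step itself once a pipeline is named in its output configuration, so in practice only the PUT has to be issued by hand.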
FROM docker.elastic.co/elasticsearch/elasticsearch-oss:6.4.2
already has these built in:
https://www.elastic.co/guide/en/elasticsearch/plugins/6.5/ingest-geoip.html
https://www.elastic.co/guide/en/elasticsearch/plugins/6.5/ingest-user-agent.html
===============
.filebeat
filebeat.yml
Add something like the following example:
output.elasticsearch:
  hosts: ["http://localhost:9200/"]
  pipelines:
    - pipeline: nginx.access
      when.equals:
        fields.type: nginx.access
    - pipeline: nginx.error
      when.equals:
        fields.type: nginx.error
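To make the selection rules above easier to reason about, here is a small Python sketch that mimics how a `pipelines` list with `when.equals` conditions picks a pipeline for an event, with the first matching rule winning. It is not Filebeat's implementation: the functions `get_path` and `select_pipeline` are hypothetical, and the sample events assume the input sets a custom `type` under `fields`.

```python
RULES = [
    {"pipeline": "nginx.access", "when.equals": {"fields.type": "nginx.access"}},
    {"pipeline": "nginx.error", "when.equals": {"fields.type": "nginx.error"}},
]


def get_path(event, dotted_key):
    """Look up a dotted key such as 'fields.type' in a nested dict."""
    current = event
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def select_pipeline(event, rules, default=None):
    """Return the pipeline of the first rule whose conditions all match."""
    for rule in rules:
        conditions = rule.get("when.equals", {})
        if all(get_path(event, key) == value for key, value in conditions.items()):
            return rule["pipeline"]
    # No rule matched: fall back to the default (None means "no pipeline").
    return default


access_event = {"message": "GET / 200", "fields": {"type": "nginx.access"}}
error_event = {"message": "upstream timed out", "fields": {"type": "nginx.error"}}
other_event = {"message": "hello", "fields": {"type": "syslog"}}

print(select_pipeline(access_event, RULES))
print(select_pipeline(error_event, RULES))
print(select_pipeline(other_event, RULES))
```

An event whose `fields.type` matches neither rule is sent without a pipeline in this sketch, so the inputs need to set that field for the routing to take effect.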
Next, create the pipelines as described below.
.pipeline
https://www.elastic.co/guide/en/elasticsearch/reference/current/simulate-pipeline-api.html
https://qbox.io/blog/indexing-elastic-stack-5-0-ingest-apis
https://dev.classmethod.jp/server-side/elasticsearch/elasticsearch-ingest-node/
https://qbox.io/blog/how-to-index-geographical-location-of-ip-addresses-to-elasticsearch-5-0-1
Get a pipeline
GET _ingest/pipeline/geoippipeline
Create a pipeline
PUT _ingest/pipeline/geoippipeline
{
  "description" : "Add geoip information to the given IP address",
  "processors": [
    {
      "geoip" : {
        "field" : "ip",
        "ignore_missing": true
      }
    },
    {
      "geoip" : {
        "field" : "src_ip",
        "ignore_missing": true
      }
    },
    {
      "geoip" : {
        "field" : "clientip",
        "ignore_missing": true
      }
    },
    {
      "set" : {
        "field" : "location",
        "value" : "{{geoip.location.lon}}, {{geoip.location.lat}}"
      }
    }
  ]
}
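Because the three geoip processors differ only in their source field, the same definition can be generated from a list of field names. The sketch below is one possible way to do that. The function name `build_geoip_pipeline` is hypothetical, and the output is meant to match the body of the PUT request above. Note that each geoip processor writes to its default target field `geoip`, so the `set` template reads the result of whichever lookup ran last.

```python
import json


def build_geoip_pipeline(ip_fields):
    """Build a pipeline body with one geoip processor per candidate IP field."""
    processors = [
        {"geoip": {"field": name, "ignore_missing": True}} for name in ip_fields
    ]
    # Same `set` processor as in the hand-written definition.
    processors.append(
        {
            "set": {
                "field": "location",
                "value": "{{geoip.location.lon}}, {{geoip.location.lat}}",
            }
        }
    )
    return {
        "description": "Add geoip information to the given IP address",
        "processors": processors,
    }


pipeline = build_geoip_pipeline(["ip", "src_ip", "clientip"])
# Paste the printed JSON as the body of PUT _ingest/pipeline/geoippipeline.
print(json.dumps(pipeline, indent=2))
```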
Run the stored pipeline against test data to check that it works.
POST _ingest/pipeline/geoippipeline/_simulate
{
  "docs":[
    {
      "_source": {
        "ip": "8.8.0.0",
        "src_ip": "8.8.0.0",
        "clientip": "8.8.0.0"
      }
    }
  ]
}
Developer test: simulate an inline pipeline definition without storing it
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description" : "parse multiple patterns",
    "processors": [
      {
        "geoip" : {
          "field" : "ip",
          "ignore_missing": true
        }
      },
      {
        "geoip" : {
          "field" : "src_ip",
          "ignore_missing": true
        }
      },
      {
        "geoip" : {
          "field" : "clientip",
          "ignore_missing": true
        }
      },
      {
        "set" : {
          "field" : "location",
          "value" : "{{geoip.location.lon}}, {{geoip.location.lat}}"
        }
      }
    ]
  },
  "docs":[
    {
      "_source": {
        "ip": "8.8.0.0",
        "src_ip": "8.8.0.0",
        "clientip": "8.8.0.0"
      }
    }
  ]
}
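When checking the simulate output from a script rather than by eye, it can help to separate documents that were transformed from those that failed. The following sketch assumes the response lists one entry per input document under `docs`, each holding either a `doc` (with the transformed `_source`) or an `error`. The function name `summarize_simulation` is hypothetical, and the sample response uses placeholder values, not real lookup output.

```python
def summarize_simulation(response):
    """Split a _simulate response into transformed sources and errors."""
    sources, errors = [], []
    for item in response.get("docs", []):
        if "doc" in item:
            sources.append(item["doc"]["_source"])
        else:
            errors.append(item.get("error"))
    return sources, errors


# Placeholder response for illustration only; the coordinates are not real.
sample_response = {
    "docs": [
        {
            "doc": {
                "_source": {
                    "clientip": "8.8.0.0",
                    "geoip": {"location": {"lat": 1.5, "lon": 2.5}},
                    "location": "2.5, 1.5",
                }
            }
        },
        {"error": {"reason": "placeholder failure"}},
    ]
}

sources, errors = summarize_simulation(sample_response)
print(len(sources), "transformed;", len(errors), "failed")
for source in sources:
    print(source.get("location"))
```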