Logstash grok Configuration Debugging
grok is a tool that combines multiple predefined regular expressions to match and split text and map the pieces to named fields. It is typically used to preprocess log data; the grok plugin in Logstash's filter stage is one implementation.
Logstash's built-in grok patterns are listed at https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns . grok also supports user-defined patterns, which provides the flexibility to handle custom log formats.
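A grok expression takes the form %{SYNTAX:SEMANTIC}: SYNTAX names a pattern (built-in or custom) and SEMANTIC names the field the matched text is stored in. A minimal sketch using only built-in patterns (the client, method, and request field names are illustrative):
filter {
  grok {
    # IP, WORD, and URIPATHPARAM are built-in patterns from grok-patterns
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request}" }
  }
}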
Debugging via Log Output
Route events that fail grok parsing (Logstash tags them with _grokparsefailure) to a failure log file, while successfully parsed events go to Elasticsearch and the console:
input {
  ...
}
filter {
  ...
}
output {
  if "_grokparsefailure" in [tags] {
    file { path => "/data/logs/logstash/grok_failures.txt" } # log lines that grok failed to parse
  } else {
    elasticsearch {
      hosts => ["192.168.165.239:9200"]
      index => "%{type}"
    }
    stdout {
      codec => rubydebug # also print events to the console
    }
  }
}
You can then start Logstash in the foreground and watch the console output:
# bin/logstash -f config_file/log.conf
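Before running against live traffic, the pipeline file's syntax can also be validated without starting the pipeline (this flag is available in Logstash 5.x and later):
# bin/logstash -f config_file/log.conf --config.test_and_exit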
Kibana Dev Tools
Console
Query the data collected into ES to verify that it matches the format expected from the grok split:
GET /appblog/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [{ "@timestamp": { "order": "desc" } }]
}
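Once the fields are being extracted, they can be queried directly to spot-check the split; for example, to pull only entries whose loglvl field parsed as ERROR (loglvl is one of the fields extracted by the grok pattern used in this article):
GET /appblog/_search
{
  "query": {
    "match": { "loglvl": "ERROR" }
  }
}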
Grok Debugger
Sample Data: enter a sample log line, e.g.
2019-05-25 15:23:32.009 [cn-appblog-provider-channel-gateway-alipay][ INFO ] [65117] [nio-8851-exec-8] [47a999cec484e6b5] [0ea76f03cdf92c57] [true] --- [cn.appblog.provider.channel.gateway.alipay.helper.XStreamHelper] [parseAlipayCreateReturn] [39] : This is log content
Grok Pattern: enter the matching expression, e.g.
%{TIME_STAMP_A:logtime}\s+\[%{APP_NAME:appname}\]\[\s+%{LOG_LVL:loglvl}\s+\]\s+\[%{PROCESS_ID:pid}\]\s+\[%{PROCESS_NAME:pname}\]\s+\[%{TRACE_ID:traceid}\]\s+\[%{SPAN_ID:spanid}\]\s+\[%{SPAN_EXPORTABLE}\]\s+---\s+\[%{CLASS_PATH:classpath}\]\s+\[%{METHOD_NAME:methodname}\]\s+\[%{CODE_LINE:codeline}\]\s+:\s+%{CONTENT:content}
Custom Patterns: enter the custom pattern definitions, e.g.
TIME_STAMP_A \d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}
TIME_STAMP_T \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z
TIME_STAMP_P \d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}
TIME_STAMP_S \d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}
HOST_NAME_PATTERN [a-zA-Z0-9._-]+
APP_NAME [a-zA-Z0-9._-]+
LOG_LVL [a-zA-Z0-9._-]+
CORRELATION_ID [0-9a-f-]{36}
CIP ((?:(?:25[0-5]|2[0-4]\d|((1\d{2})|([1-9]?\d)))\.){3}(?:25[0-5]|2[0-4]\d|((1\d{2})|([1-9]?\d))))
ID_PATTERN [0-9a-f\-]{36}
RPC_ID_PATTERN [0-9\.]+
APP_OR_METHOD [/a-zA-Z0-9._-]+
TRACE_ID [0-9a-f]*
SPAN_ID [0-9a-f]*
PROCESS_ID \d{0,5}
PROCESS_NAME [a-zA-Z0-9._-]+
SPAN_EXPORTABLE [a-z]{0,5}
CLASS_PATH [a-zA-Z0-9._]+
METHOD_NAME [a-zA-Z0-9_]+
CODE_LINE \d{1,5}
CONTENT [\s\S]*$
Click Simulate to get the Structured Data:
{
  "traceid": "47a999cec484e6b5",
  "classpath": "cn.appblog.provider.channel.gateway.alipay.helper.XStreamHelper",
  "loglvl": "INFO",
  "pname": "nio-8851-exec-8",
  "pid": "65117",
  "content": "This is log content",
  "codeline": "39",
  "spanid": "0ea76f03cdf92c57",
  "appname": "cn-appblog-provider-channel-gateway-alipay",
  "logtime": "2019-05-25 15:23:32.009",
  "methodname": "parseAlipayCreateReturn"
}
Configuration Example
Sample log line to be parsed:
2019-05-23 11:50:36.022 [cn-appblog-provider-channel-core][ INFO ] [21992] [nio-8888-exec-1] [143da285c068e5e1] [cb964a4c7b09ee0e] [true] --- [cn.appblog.provider.channel.core.helper.ChannelInfoHelper] [checkChannelInfo] [35] : ChannelPayRequest.checkChannelInfo [MerchantId: 142019050800009001, TransSerialNo: 122019052300016001, ChnlCode: alipay_offline_payment]
input {
  kafka {
    bootstrap_servers => "192.168.1.10:9092"
    topics => "logstash"
    group_id => "logstash"
    consumer_threads => 5
    decorate_events => true
    codec => json
    type => "thaipay"
    #auto_offset_reset => "smallest"
    #reset_beginning => true
  }
}
filter {
  if [type] == "thaipay" {
    if [message] =~ "^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}\s+\[[a-zA-Z0-9._-]+\]\s*\[\s*[a-zA-Z0-9._-]+\s*\][\s\S]*$" {
      grok {
        patterns_dir => "/data/server/logstash/config_file/patterns"
        #add_field => {"logmatch" => "100001"}
        #match => { "message" => "%{TIME_STAMP_A:logtime}" }
        #match => { "message" => "%{TIME_STAMP_A:logtime}\s+\[%{APP_NAME:appname}\]\s+\[%{LOG_LVL:loglvl}\]" }
        #match => { "message" => "%{TIME_STAMP_A:logtime}\s+\[%{APP_NAME:appname}\]\[\s+%{LOG_LVL:loglvl}\s+\]\s+\[%{PROCESS_ID:pid}\]\s+\[%{PROCESS_NAME:pname}\]\s+\[%{TRACE_ID:traceid}\]\s+\[%{SPAN_ID:spanid}\]\s+\[%{SPAN_EXPORTABLE}\]\s+---\s+\[%{CLASS_PATH:classpath}\]\s+\[%{METHOD_NAME:methodname}\]\s+\[%{CODE_LINE:codeline}\]" }
        match => { "message" => "%{TIME_STAMP_A:logtime}\s+\[\s*%{APP_NAME:appname}\s*\]\[\s*%{LOG_LVL:loglvl}\s*\]\s+\[\s*%{PROCESS_ID:pid}\s*\]\s+\[\s*%{PROCESS_NAME:pname}\s*\]\s+\[\s*%{TRACE_ID:traceid}\s*\]\s+\[\s*%{SPAN_ID:spanid}\s*\]\s+\[\s*%{SPAN_EXPORTABLE}\s*\]\s+---\s+\[\s*%{CLASS_PATH:classpath}\s*\]\s+\[\s*%{METHOD_NAME:methodname}\s*\]\s+\[\s*%{CODE_LINE:codeline}\s*\]\s+:\s+%{CONTENT:content}" }
      }
      #date {
      #  match => ["logtime", "yyyy-MM-dd HH:mm:ss.SSS"]
      #  target => "messagetime"
      #  locale => "en"
      #  timezone => "+00:00"
      #  remove_field => ["logtime"]
      #}
    }
  }
}
output {
  if "_grokparsefailure" in [tags] {
    file { path => "/data/logs/logstash/grok_failures.txt" }
  } else {
    elasticsearch {
      hosts => ["192.168.1.10:9200"]
      index => "%{type}"
    }
    stdout {
      codec => rubydebug
    }
  }
}
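If the commented-out date filter above is enabled, the logtime string captured by grok is parsed into a proper date field; a minimal sketch (the target and timezone values are assumptions to adjust for your environment):
date {
  match => ["logtime", "yyyy-MM-dd HH:mm:ss.SSS"]
  target => "messagetime"       # field to hold the parsed date; use "@timestamp" to overwrite the event time
  timezone => "+00:00"          # assumption: set to the timezone the logs are written in
  remove_field => ["logtime"]   # drop the raw string once parsed
}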
The custom pattern definitions referenced by patterns_dir are stored in a file under /data/server/logstash/config_file/patterns:
TIME_STAMP_A \d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}
TIME_STAMP_T \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z
TIME_STAMP_P \d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}
TIME_STAMP_S \d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}
HOST_NAME_PATTERN [a-zA-Z0-9._-]+
APP_NAME [a-zA-Z0-9._-]+
LOG_LVL [a-zA-Z0-9._-]+
CORRELATION_ID [0-9a-f-]{36}
CIP ((?:(?:25[0-5]|2[0-4]\d|((1\d{2})|([1-9]?\d)))\.){3}(?:25[0-5]|2[0-4]\d|((1\d{2})|([1-9]?\d))))
ID_PATTERN [0-9a-f\-]{36}
RPC_ID_PATTERN [0-9\.]+
APP_OR_METHOD [/a-zA-Z0-9._-]+
TRACE_ID [0-9a-f]*
SPAN_ID [0-9a-f]*
PROCESS_ID \d{3,5}
PROCESS_NAME [a-zA-Z0-9._-]+
SPAN_EXPORTABLE [a-z]{0,5}
CLASS_PATH [a-zA-Z0-9._]+
METHOD_NAME [a-zA-Z0-9_]+
CODE_LINE \d{1,5}
CONTENT [\s\S]*$
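Logstash reads every file in the directory given by patterns_dir, so these definitions can live in any file under that directory. Pattern files may also compose patterns out of other patterns; a hypothetical example reusing the definitions above:
# hypothetical composite pattern built from the custom patterns above
LOG_PREFIX %{TIME_STAMP_A}\s+\[%{APP_NAME}\]\[\s*%{LOG_LVL}\s*\]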