官术网_书友最值得收藏!

3.1 Nginx訪問日志

訪問日志處理分析絕對是使用ELK stack時最常見的需求。默認(rèn)的處理方式下,性能和精確度都不夠好。本節(jié)會列舉對Nginx訪問日志的幾種不同處理方式,并闡明其優(yōu)劣。

3.1.1 grok處理方式

Logstash默認(rèn)自帶了Apache標(biāo)準(zhǔn)日志的grok正則:

COMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{NOTSPACE:auth}\[%{HTTPDATE:
    timestamp}\] “(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?
    |%{DATA:rawrequest})” %{NUMBER:response} (?:%{NUMBER:bytes}|-)
COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}

對于Nginx標(biāo)準(zhǔn)日志格式,可以發(fā)現(xiàn)只是最后多了一個$http_x_forwarded_for變量。所以Nginx標(biāo)準(zhǔn)日志的grok正則定義是:

MAINNGINXLOG %{COMBINEDAPACHELOG} %{QS:x_forwarded_for}

自定義的日志格式,可以照此修改。

3.1.2 split處理方式

Nginx日志因?yàn)椴糠肿兞恐袃?nèi)含空格,所以很多時候只能使用%{QS}正則來做分隔,性能和細(xì)度都不太好。如果能自定義一個比較少見的字符作為分隔符,那么處理起來就簡單多了。假設(shè)定義的日志格式如下:

log_format main “$http_x_forwarded_for | $time_local | $request | $status |
    $body_bytes_sent | ”“$request_body | $content_length | $http_referer | $http_user_agent | $nuid | ”“$http_cookie | $remote_addr | $hostname | $upstream_addr | $upstream_response_
    time | $request_time”;

實(shí)際日志如下:

117.136.9.248 | 08/Apr/2015:16:00:01 +0800 | POST /notice/newmessage?sign=cba4f614e05db285850cadc696fcdad0&token=JAGQ92Mjs3--gik_b_DsPIQHcyMKYGpD&did=b749736ac70f12df700b18cd6d051d5&osn=android&osv=4.0.4&appv=3.0.1&net=460-02-2g&longitude=120.393006&latitude=36.178329&ch=360&lp=1&ver=1&ts=1428479998151&im=869736012353958&sw=0&sh=0&la=zh-CN&lm=weixin&dt=vivoS11tHTTP/1.1| 200 | 132 | abcd-sign-v1://dd03c57f8cb6fcef919ab5df66f2903f:d51asq5yslwnyz5t/{\x22type\x22:4,\x22uid\x22:7567306} | 89 | - | abcd/3.0.1, Android/4.0.4, vivo S11t | nuid=0C0A0A0A01E02455EA7CF47E02FD072C1428480001.157| - | 10.10.10.13 | bnx02.abcdprivate.com | 10.10.10.22:9999 | 0.022 | 0.022 59.50.44.53 | 08/Apr/2015:16:00:01 +0800 | POST /feed/pubList?appv=3.0.3&did=89da72550de488328e2aba5d97850e9f&dt=iPhone6%2C2&im=B48C21F3-487E-4071-9742-DC6D61710888&la=cn&latitude=0.000000&lm=weixin&longitude=0.000000&lp=-1.000000&net=0-0-wifi&osn=iOS&osv=8.1.3&sh=568.000000&sw=320.000000&token=7NobA7asg3Jb6n9o4ETdPXyNNiHwMs4J&ts=1428480001275 HTTP/1.1 | 200 | 983 | abcd-sign-v1://b398870a0b25b29aae65cd553addc43d:72214ee85d7cca22/{\x22nextkey\x22:\x22\x22,\x22uid\x22:\x2213062545\x22,\x22token\x22:\x227NobA7asg3Jb6n9o4ETdPXyNNiHwMs4J\x22}| 139 | - | Shopping/3.0.3 (iPhone; iOS 8.1.3; Scale/2.00) | nuid=0C0A0A0A81-DF2455017D548502E48E2E1428480001.154 | nuid=CgoKDFUk34GFVH0BLo7kAg== | 10.10.10.11 | bnx02.abcdprivate.com | 10.10.10.35:9999 | 0.025 | 0.026

然后還可以針對request做更細(xì)致的切分。比如URL參數(shù)部分。很明顯,URL參數(shù)中的字段順序是亂的。第一行問號之后的第一個字段是sign,第二行問號之后的第一個字段是appv。所以需要將字段進(jìn)行切分,取出每個字段對應(yīng)的值。官方自帶grok滿足不了要求,最終采用的Logstash配置如下:

filter {
    ruby {
        init =>“@kname =['http_x_forwarded_for','time_local','request','status',
            'body_bytes_sent','request_body','content_length','http_referer','http_
            user_agent','nuid','http_cookie','remote_addr','hostname','upstream_
            addr','upstream_response_time','request_time']”
        code =>“event.append(Hash[@kname.zip(event['message'].split(‘ | ’))])”
    }
    if [request] {
        ruby {
            init =>“@kname = ['method','uri','verb']”
            code =>“event.append(Hash[@kname.zip(event['request'].split(‘ ’))])”
        }
        if [uri] {
            ruby {
                init =>“@kname = ['url_path','url_args']”
                code =>“event.append(Hash[@kname.zip(event['request'].split(‘?’))])”
            }
            kv {
                prefix =>“url_”
                source =>“url_args”
                field_split =>“&”
                remove_field => [ “url_args”,“uri”,“request” ]
            }
        }
    }
    mutate {
        convert => [“body_bytes_sent” , “integer”,“content_length”, “integer”,“upstream_response_time”, “float”,“request_time”, “float”
        ]
    }
    date {
        match => [ “time_local”, “dd/MMM/yyyy:hh:mm:ss Z” ]
        locale =>“en”
    }
}

最終結(jié)果如下:

{“message” =>“1.43.3.188 | 08/Apr/2015:16:00:01 +0800 | POST /search/sug
    gest?appv=3.0.3&did=dfd5629d705d400795f698055806f01d&dt=iPhone7%2C2&im=
    AC926907-27AA-4A10-9916-C5DC75F29399&la=cn&latitude=-33.903867&lm=
    sina&longitude=151.208137&lp=-1.000000&net=0-0-wifi&osn=iOS&osv=8.1.3&sh=66
    7.000000&sw=375.000000&token=_ovaPz6Ue68ybBuhXustPbG-xf1WbsPO&ts=
    1428480001567 HTTP/1.1 | 200 | 353 | abcd-sign-v1://a24b478486d3bb92ed89a-
    901541b60a5:b23e9d2c14fe6755/{\\x22key\\x22:\\x22last\\x22,\\x22offset\\x22:
    \\x220\\x22,\\x22token\\x22:\\x22_ovaPz6Ue68ybBuhXustPbG-xf1WbsPO\\x22,
    \\x22limit\\x22:\\x2220\\x22} | 148 | - | abcdShopping/3.0.3 (iPhone; iOS
    8.1.3; Scale/2.00) | nuid=0B0A0A0A9A64AF54F97634640230944E1428480001.113
    | nuid=CgoKC1SvZJpkNHb5TpQwAg== | 10.10.10.11 | bnx02.abcdprivate.com | 
    10.10.10.26:9999 | 0.070 | 0.071”,“@version” =>“1”,“@timestamp” =>“2015-04-08T08:00:01.000Z”,“type” =>“nginxapiaccess”,“host” =>“blog05.abcdprivate.com”,“path” =>“/home/nginx/logs/api.access.log”,“http_x_forwarded_for” =>“1.43.3.188”,“time_local” =>“ 08/Apr/2015:16:00:01 +0800”,“status” =>“200”,“body_bytes_sent” => 353,“request_body” =>“abcd-sign-v1://a24b478486d3bb92ed89a901541b60a5:b23e9d2c1
    4fe6755/{\\x22key\\x22:\\x22last\\x22,\\x22offset\\x22:\\x220\\x22,\\x22token
    \\x22:\\x22_ovaPz6Ue68ybBuhXustPbG-xf1WbsPO\\x22,\\x22limit\\x22:\\x2220\\x22}”,“content_length” => 148,“http_referer” =>“-”,“http_user_agent” =>“abcdShopping/3.0.3 (iPhone; iOS 8.1.3; Scale/2.00)”,“nuid” =>“nuid=0B0A0A0A9A64AF54F97634640230944E1428480001.113”,“http_cookie” =>“nuid=CgoKC1SvZJpkNHb5TpQwAg==”,“remote_addr” =>“10.10.10.11”,“hostname” =>“bnx02.abcdprivate.com”,“upstream_addr” =>“10.10.10.26:9999”,“upstream_response_time” => 0.070,“request_time” => 0.071,“method” =>“POST”,“verb” =>“HTTP/1.1”,“url_path” =>“/search/suggest”,“url_appv” =>“3.0.3”,“url_did” =>“dfd5629d705d400795f698055806f01d”,“url_dt” =>“iPhone7%2C2”,“url_im” =>“AC926907-27AA-4A10-9916-C5DC75F29399”,“url_la” =>“cn”,“url_latitude” =>“-33.903867”,“url_lm” =>“sina”,“url_longitude” =>“151.208137”,“url_lp” =>“-1.000000”,“url_net” =>“0-0-wifi”,“url_osn” =>“iOS”,“url_osv” =>“8.1.3”,“url_sh” =>“667.000000”,“url_sw” =>“375.000000”,“url_token” =>“_ovaPz6Ue68ybBuhXustPbG-xf1WbsPO”,“url_ts” =>“1428480001567”
}

如果URL參數(shù)過多,可以不使用kv切分,或者預(yù)先定義成nested object后改成數(shù)組形式:

if [uri] {
    ruby {
        init =>“@kname = ['url_path','url_args']”
        code =>“event.append(Hash[@kname.zip(event['request'].split(‘?’))])”
    }
    if [url_args] {
        ruby {
            init =>“@kname = ['key','value']”
            code =>“event['nested_args'] = event['url_args'].split(‘&’)。collect
                {|i| Hash[@kname.zip(i.split(‘=’))]}”
            remove_field => [ “url_args”,“uri”,“request” ]
        }
    }
}

采用nested object的優(yōu)化原理和nested object的使用方式,請閱讀后面Elasticsearch調(diào)優(yōu)章節(jié)。

3.1.3 json格式

自定義分隔符雖好,但是配置寫起來畢竟復(fù)雜很多。其實(shí)對Logstash來說,Nginx日志還有另一種更簡便的處理方式,就是自定義日志格式時,通過手工拼寫直接輸出成JSON格式:

log_format json '{“@timestamp”:“$time_iso8601”,'
             ‘“host”:“$server_addr”,'
             ’“clientip”:“$remote_addr”,'
             ‘“size”:$body_bytes_sent,'
             ’“responsetime”:$request_time,'
             ‘“upstreamtime”:“$upstream_response_time”,'
             ’“upstreamhost”:“$upstream_addr”,'
             ‘“http_host”:“$host”,'
             ’“url”:“$uri”,'
             ‘“xff”:“$http_x_forwarded_for”,'
             ’“referer”:“$http_referer”,'
             ‘“agent”:“$http_user_agent”,'
             ’“status”:“$status”}';

然后采用下面的Logstash配置即可:

input {
    file {
        path =>“/var/log/nginx/access.log”
        codec => json
    }
}
filter {
    mutate {
        split => [ “upstreamtime”, “,” ]
    }
    mutate {
        convert => [ “upstreamtime”, “float” ]
    }
}

這里采用多個mutate插件,是因?yàn)閡pstreamtime可能有多個數(shù)值,所以先切割成數(shù)組以后,再分別轉(zhuǎn)換成浮點(diǎn)型數(shù)值。而在mutate中,convert函數(shù)的執(zhí)行優(yōu)先級高于split函數(shù),所以只能分開兩步寫。mutate內(nèi)各函數(shù)的優(yōu)先級順序,之前插件介紹章節(jié)有詳細(xì)說明,讀者可以返回去閱讀。

3.1.4 syslog方式發(fā)送

Nginx從1.7版開始,加入了syslog支持,Tengine則更早。這樣,我們可以通過syslog直接發(fā)送日志。Nginx上的配置如下:

access_log syslog:server=unix:/data0/rsyslog/nginx.sock locallog;

或者直接發(fā)送給遠(yuǎn)程Logstash機(jī)器:

access_log syslog:server=192.168.0.2:5140,facility=local6,tag=nginx-access,
    severity=info logstashlog;

默認(rèn)情況下,Nginx將使用local7.info等級,以nginx為標(biāo)簽發(fā)送數(shù)據(jù)。注意,采用syslog發(fā)送日志的時候,無法配置buffer=16k選項(xiàng)。

主站蜘蛛池模板: 谢通门县| 青龙| 广灵县| 灵川县| 余江县| 皮山县| 深州市| 新密市| 博兴县| 博湖县| 道孚县| 沂南县| 偃师市| 内江市| 红桥区| 东兰县| 巫山县| 色达县| 阳谷县| 宝应县| 南澳县| 芜湖市| 克东县| 宝清县| 永德县| 麻江县| 荆州市| 永济市| 中宁县| 衡阳县| 静乐县| 丽江市| 成武县| 奇台县| 巢湖市| 隆回县| 隆德县| 宝应县| 衡阳县| 阿瓦提县| 华安县|