
Title: ELK stack權威指南 (The Definitive Guide to the ELK Stack)
Author: 饒琛琳 (Rao Chenlin)
Chapter length (original): 2541 characters
Updated: 2018-12-31 21:08:12

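2.2 Codec Configuration

Codec is a concept that Logstash introduced in version 1.3.0 (the name is a contraction of the two words coder/decoder).

Before then, Logstash accepted only plain-text input and relied on filters to process it. With the codec setting, different types of data can now be handled at the input stage.

Codecs make it easier for Logstash to coexist with other operations products that use their own data formats, such as graphite, fluent, netflow, and collectd, as well as products that use general-purpose formats such as msgpack, json, and edn.

In fact, we already used a codec in the first "Hello World" example: rubydebug is a codec! It is normally used only in the stdout plugin, as a tool for testing configurations or debugging.

Tip

This five-stage description of the pipeline comes from the design of the Perl version of Logstash (later renamed the Message::Passing module). Section 5.8 of this book briefly introduces that module.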

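2.2.1 JSON Codec

In early versions, one way of reducing the CPU load of Logstash filters was popular in the community (the Cookbook at the time had a dedicated section on it): feed in predefined JSON data directly, so that the filter/grok configuration can be omitted altogether!

This advice still holds, but the current version needs a small configuration change, because there is now a dedicated codec setting.

1. Configuration example

Community examples usually use Apache's customlog, but I consider Nginx a newer web server that is more widely used than Apache, so I will use nginx.conf as the example here: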

log_format json '{"@timestamp":"$time_iso8601",'
                '"@version":"1",'
                '"host":"$server_addr",'
                '"client":"$remote_addr",'
                '"size":$body_bytes_sent,'
                '"responsetime":$request_time,'
                '"domain":"$host",'
                '"url":"$uri",'
                '"status":"$status"}';
access_log /var/log/nginx/access.log_json json;

Note that the $request_time and $body_bytes_sent variables are not wrapped in double quotes ("): these two values should be numeric types in JSON!
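
To see why the missing quotes matter, the following minimal Python sketch parses one line as Nginx might write it with the format above. The sample line is made up for illustration (it is not real log data), and Python's json module merely stands in for any strict JSON parser.

```python
import json

# A hypothetical access-log line in the format defined above.
sample_line = (
    '{"@timestamp":"2014-03-21T18:52:25+08:00",'
    '"@version":"1",'
    '"host":"192.168.0.1",'
    '"client":"123.125.74.53",'
    '"size":8096,'
    '"responsetime":0.04,'
    '"domain":"www.domain.com",'
    '"url":"/path/to/file.suffix",'
    '"status":"200"}'
)

event = json.loads(sample_line)

# Unquoted values decode as numbers; quoted values stay strings.
print(type(event["size"]).__name__)          # int
print(type(event["responsetime"]).__name__)  # float
print(type(event["status"]).__name__)        # str
```

Because "status" is quoted in the log format, it arrives as the string "200" rather than a number, which matches the sample output in this section.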

Restart Nginx, then change your input/file section to the following:

input {
    file {
        path => "/var/log/nginx/access.log_json"
        codec => "json"
    }
}

2. Result

Now visit a web page served by your Nginx, and you will see the Logstash process print something like this:

{
    "@timestamp" => "2014-03-21T18:52:25.000+08:00",
    "@version" => "1",
    "host" => "raochenlindeMacBook-Air.local",
    "client" => "123.125.74.53",
    "size" => 8096,
    "responsetime" => 0.04,
    "domain" => "www.domain.com",
    "url" => "/path/to/file.suffix",
    "status" => "200"
}

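3. Log format problems when Nginx runs as a proxy

For the access log of an ordinary web server, this already seems to work well. But if Nginx runs as a proxy server, some variables in the access log, such as $upstream_response_time, are not always numbers: they may also be the string "-"! Written without quotes, a bare - is not a valid JSON value, so Logstash raises an exception when it validates the input data.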
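
The failure is easy to reproduce outside Logstash. The sketch below feeds two made-up proxy log lines to Python's JSON parser, using the field name upstreamtime that appears in this section's sed command. Logstash's own parser and error handling differ in detail, and the helper name try_parse is hypothetical.

```python
import json

def try_parse(line):
    """Return the decoded event, or None if the line is not valid JSON."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return None

good_line = '{"status":"200","upstreamtime":0.123}'
bad_line = '{"status":"502","upstreamtime":-}'  # no upstream response time recorded

print(try_parse(good_line))
print(try_parse(bad_line))
```

The first line decodes to a dictionary, while the second is rejected because a lone minus sign is not a JSON number.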

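There are two ways to solve this:

1) Use sed to replace - with 0 before the data reaches Logstash. The Logstash process then reads from standard input instead of a file, so the command becomes: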

tail -F /var/log/nginx/proxy_access.log_json \
    | sed 's/upstreamtime":-/upstreamtime":0/' \
    | /usr/local/logstash/bin/logstash -f /usr/local/logstash/etc/proxylog.conf
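
For readers who would rather preprocess in a script than in a shell pipeline, the same substitution can be sketched in Python. The function name fix_upstreamtime is hypothetical; like the sed expression, which has no g flag, it rewrites only the first match in each line.

```python
import re

# Same pattern as the sed expression: a bare "-" right after the key.
_BARE_DASH = re.compile(r'upstreamtime":-')

def fix_upstreamtime(line):
    """Replace the first bare '-' value of upstreamtime with 0."""
    return _BARE_DASH.sub('upstreamtime":0', line, count=1)

print(fix_upstreamtime('{"status":"502","upstreamtime":-}'))
print(fix_upstreamtime('{"status":"200","upstreamtime":0.123}'))
```

Lines that already carry a numeric value pass through unchanged.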

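2) Record every value as a string in the log format (that is, wrap them all in double quotes), and then use the filter/mutate plugin in Logstash to convert the fields that should be numeric.

LogStash::Filters::Mutate is covered later in this book.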
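
The second approach amounts to parsing everything as strings and converting afterwards. The sketch below shows only the idea, in Python: it is not how the mutate filter is implemented, the sample line is invented, and mapping "-" to None (rather than 0) is an assumption made for illustration.

```python
import json

def to_float_or_none(value):
    """Convert a string field to float; return None for '-' or other non-numeric text."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

# Every value is quoted, so the line is always valid JSON.
line = '{"status":"502","responsetime":"0.040","upstreamtime":"-"}'
event = json.loads(line)

for field in ("responsetime", "upstreamtime"):
    event[field] = to_float_or_none(event[field])

print(event)
```

Quoting every value moves the type decision out of the log format and into the processing stage, where a missing measurement can be handled explicitly.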

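2.2.2 Multiline Event Encoding

Sometimes an application's debug log is very rich and prints many lines for a single event. Logs like this are usually hard to analyze with command-line parsing.

Logstash provides the codec/multiline plugin for exactly this case! The multiline plugin can of course also be used for other stack-style output, such as Linux kernel logs.

A configuration example follows: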

input {
    stdin {
        codec => multiline {
            pattern => "^\["
            negate => true
            what => "previous"
        }
    }
}

Run the Logstash process, then type the following lines into the terminal that is waiting for input:

[Aug/08/08 14:54:03] hello world
[Aug/08/09 14:54:04] hello logstash
    hello best practice
    hello raochenlin
[Aug/08/10 14:54:05] the end

You will see Logstash print the following:

{
    "@timestamp" => "2014-08-09T13:32:03.368Z",
    "message" => "[Aug/08/08 14:54:03] hello world\n",
    "@version" => "1",
    "host" => "raochenlindeMacBook-Air.local"
}
{
    "@timestamp" => "2014-08-09T13:32:24.359Z",
    "message" => "[Aug/08/09 14:54:04] hello logstash\n\n    hello best practice\n\n    hello raochenlin\n",
    "@version" => "1",
    "tags" => [
        [0] "multiline"
    ],
    "host" => "raochenlindeMacBook-Air.local"
}

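As you can see, the second event stores three lines of data in its "message" field!

Note

The output contains no event for the final "the end" line. This is because the newline \n you typed last does not match the configured regular expression ^\[, so Logstash has to wait for a following line that does match before it emits this event.

The principle behind this plugin is simple: append the current line to the previous one, until a newly arrived line matches the regular expression ^\[. The pattern can also be a grok expression, which you will learn about shortly. For the Java log patterns, see: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/java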
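
The merging rule can be sketched in a few lines of Python. This is a simplified model of the behaviour described here, not the plugin's code: the function name merge_multiline is hypothetical, lines are joined with a single \n, and tagging is left out.

```python
import re

def merge_multiline(lines, pattern=r"^\["):
    """Group lines into events, mimicking negate => true, what => "previous".

    A line that matches `pattern` starts a new event; any other line is
    appended to the event before it. The last event stays in `pending`
    because nothing has arrived yet to show that it is complete.
    """
    start = re.compile(pattern)
    events, pending = [], []
    for line in lines:
        if start.search(line) and pending:
            events.append("\n".join(pending))
            pending = []
        pending.append(line)
    return events, pending

lines = [
    "[Aug/08/08 14:54:03] hello world",
    "[Aug/08/09 14:54:04] hello logstash",
    "    hello best practice",
    "    hello raochenlin",
    "[Aug/08/10 14:54:05] the end",
]

events, pending = merge_multiline(lines)
print(len(events))                # 2
print(events[1].count("\n") + 1)  # 3
print(pending)                    # ['[Aug/08/10 14:54:05] the end']
```

The pending list is the reason the "the end" line has not been emitted yet: only a later line starting with [ would close that event.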

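When it comes to application logs, Log4j is surely the first that comes to mind, and codec/multiline is indeed one way to handle it.

However, if you are a developer yourself, or can push for changes to the program, Logstash offers another way to handle Log4j: input/log4j. Unlike codec/multiline, this plugin directly uses org.apache.log4j.spi.LoggingEvent to process data received on a TCP port. A later chapter describes Log4j usage in detail.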

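2.2.3 Network Flow Encoding

NetFlow is a data exchange method invented by Cisco. It provides a session-level view of network traffic, recording information about every TCP/IP transaction. Its purpose is not to provide a complete record of network traffic as tcpdump does, but to aggregate it into flow and volume analysis that is easier to manage and read.

For how to configure NetFlow on Cisco devices, refer to the documentation for the specific device. The main step is to set the collector's address and port to those of the host running the Logstash service (port 9995 in this example).

A Logstash configuration for collecting NetFlow data looks like this: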

input {
    udp {
        port => 9995
        codec => netflow {
            definitions => "/opt/logstash-1.4.2/lib/logstash/codecs/netflow/netflow.yaml"
            versions => [5]
        }
    }
}
output {
    elasticsearch {
        index => "logstash_netflow5-%{+YYYY.MM.dd}"
        host => "localhost"
    }
}
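
To give a feel for what has to be decoded from each UDP datagram, the sketch below unpacks the 24-byte header of a NetFlow version 5 packet with Python's struct module. It is an illustration, not the codec's code: the dictionary keys are chosen to resemble field names in this section's index template, the sample packet is fabricated, and the flow records that follow the header are ignored.

```python
import struct

# NetFlow v5 header: version, count, sys_uptime, unix_secs, unix_nsecs,
# flow_sequence, engine_type, engine_id, sampling (2-bit mode + 14-bit interval).
V5_HEADER = struct.Struct("!HHIIIIBBH")

def parse_v5_header(datagram):
    """Decode the fixed-size header at the start of a NetFlow v5 datagram."""
    (version, count, uptime_ms, secs, nsecs,
     seq, engine_type, engine_id, sampling) = V5_HEADER.unpack_from(datagram)
    if version != 5:
        raise ValueError("not a NetFlow v5 datagram: version=%d" % version)
    return {
        "version": version,
        "flow_records": count,
        "uptime_ms": uptime_ms,
        "unix_secs": secs,
        "unix_nsecs": nsecs,
        "flow_seq_num": seq,
        "engine_type": engine_type,
        "engine_id": engine_id,
        "sampling_algorithm": sampling >> 14,
        "sampling_interval": sampling & 0x3FFF,
    }

# A fabricated header for demonstration only.
packet = V5_HEADER.pack(5, 1, 360000, 1407591123, 0, 42, 0, 0, (1 << 14) | 100)
header = parse_v5_header(packet)
print(header["flow_records"], header["flow_seq_num"], header["sampling_interval"])
```

Each flow record after the header adds its own set of addresses, ports, counters, and timestamps, which is why a decoded event ends up with so many fields.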

Because this plugin generates a large number of fields, it is advisable to submit a dedicated Elasticsearch index template as well:

# curl -XPUT localhost:9200/_template/logstash_netflow5 -d '{
  "template" : "logstash_netflow5-*",
  "settings": {
    "index.refresh_interval": "5s"
  },
  "mappings" : {
    "_default_" : {
      "_all" : {"enabled" : false},
      "properties" : {
        "@version": { "index": "analyzed", "type": "integer" },
        "@timestamp": { "index": "analyzed", "type": "date" },
        "netflow": {
          "dynamic": true,
          "type": "object",
          "properties": {
            "version": { "index": "analyzed", "type": "integer" },
            "flow_seq_num": { "index": "not_analyzed", "type": "long" },
            "engine_type": { "index": "not_analyzed", "type": "integer" },
            "engine_id": { "index": "not_analyzed", "type": "integer" },
            "sampling_algorithm": { "index": "not_analyzed", "type": "integer" },
            "sampling_interval": { "index": "not_analyzed", "type": "integer" },
            "flow_records": { "index": "not_analyzed", "type": "integer" },
            "ipv4_src_addr": { "index": "analyzed", "type": "ip" },
            "ipv4_dst_addr": { "index": "analyzed", "type": "ip" },
            "ipv4_next_hop": { "index": "analyzed", "type": "ip" },
            "input_snmp": { "index": "not_analyzed", "type": "long" },
            "output_snmp": { "index": "not_analyzed", "type": "long" },
            "in_pkts": { "index": "analyzed", "type": "long" },
            "in_bytes": { "index": "analyzed", "type": "long" },
            "first_switched": { "index": "not_analyzed", "type": "date" },
            "last_switched": { "index": "not_analyzed", "type": "date" },
            "l4_src_port": { "index": "analyzed", "type": "long" },
            "l4_dst_port": { "index": "analyzed", "type": "long" },
            "tcp_flags": { "index": "analyzed", "type": "integer" },
            "protocol": { "index": "analyzed", "type": "integer" },
            "src_tos": { "index": "analyzed", "type": "integer" },
            "src_as": { "index": "analyzed", "type": "integer" },
            "dst_as": { "index": "analyzed", "type": "integer" },
            "src_mask": { "index": "analyzed", "type": "integer" },
            "dst_mask": { "index": "analyzed", "type": "integer" }
          }
        }
      }
    }
  }
}'

Elasticsearch index templates are covered in detail in Section 12.6 of this book.
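
The template takes effect because every daily index name produced by the output's index setting matches the template pattern logstash_netflow5-*. The sketch below imitates that naming in Python, on the understanding that Logstash fills %{+YYYY.MM.dd} from the event's @timestamp in UTC. The function name index_name is hypothetical, and fnmatchcase only approximates Elasticsearch's wildcard matching.

```python
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatchcase

TEMPLATE_PATTERN = "logstash_netflow5-*"

def index_name(timestamp):
    """Build the daily index name from an aware datetime, using its UTC date."""
    utc = timestamp.astimezone(timezone.utc)
    return utc.strftime("logstash_netflow5-%Y.%m.%d")

# 02:00 on 10 August in UTC+8 is still 9 August in UTC.
local = datetime(2014, 8, 10, 2, 0, tzinfo=timezone(timedelta(hours=8)))
name = index_name(local)
print(name)
print(fnmatchcase(name, TEMPLATE_PATTERN))
```

Events recorded shortly after local midnight in a timezone ahead of UTC therefore land in the previous day's index.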
