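Istio provides out-of-the-box log collection. Logs can be written to the standard output of the Mixer component or stored in a file on the Mixer component. Logs can also be sent to Fluentd and then collected into a central log store for search and analysis; this approach is recommended for production environments.
The following example collects request logs for HTTP services and writes them to standard output: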
 1 apiVersion: config.istio.io/v1alpha2
 2 kind: logentry
 3 metadata:
 4   name: myaccesslog
 5   namespace: istio-system
 6 spec:
 7   severity: '"Default"'
 8   timestamp: request.time
 9   variables:
10     source_ip: source.ip | ip("0.0.0.0")
11     destination_ip: destination.ip | ip("0.0.0.0")
12     source_user: source.principal | ""
13     method: request.method | ""
14     url: request.path | ""
15     protocol: request.scheme | "http"
16     response_code: response.code | 0
17     response_size: response.size | 0
18     request_size: request.size | 0
19     latency: response.duration | "0ms"
20   monitored_resource_type: '"UNSPECIFIED"'
21 ---
22 apiVersion: config.istio.io/v1alpha2
23 kind: stdio
24 metadata:
25   name: myaccessloghandler
26   namespace: istio-system
27 spec:
28   severity_levels:
29     info: 1
30   outputAsJson: true
31 ---
32 apiVersion: config.istio.io/v1alpha2
33 kind: rule
34 metadata:
35   name: myaccesslog-logstdio
36   namespace: istio-system
37 spec:
38   match: "true"
39   actions:
40   - handler: myaccessloghandler.stdio
41     instances:
42     - myaccesslog.logentry
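Lines 1-20 define a logentry instance named myaccesslog. severity sets the log level to Default; this field is used by adapters that support log levels. timestamp uses the request time as the log timestamp. variables defines the data collected in each log entry. monitored_resource_type sets the monitored resource type to "UNSPECIFIED".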
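The `|` operator in the variables above selects the first operand that has a value and otherwise falls back to the literal default on the right. The following Python sketch illustrates that fallback behavior only; the function name `first_non_empty` and the attribute dictionary are hypothetical and are not part of Istio.

```python
from typing import Any, Mapping, Sequence


def first_non_empty(attributes: Mapping[str, Any],
                    candidates: Sequence[str],
                    default: Any) -> Any:
    """Return the first candidate attribute that is present and non-empty,
    otherwise the default (mirrors expressions such as request.path | "")."""
    for name in candidates:
        value = attributes.get(name)
        if value is not None and value != "":
            return value
    return default


# Hypothetical attribute bag for one request; request.path is missing.
request_attributes = {"request.method": "GET", "response.code": 200}

log_variables = {
    "method": first_non_empty(request_attributes, ["request.method"], ""),
    "url": first_non_empty(request_attributes, ["request.path"], ""),
    "response_code": first_non_empty(request_attributes, ["response.code"], 0),
}
print(log_variables)
```

Because request.path is absent in this sample, url falls back to the empty string, just as the default in the configuration would apply.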
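Lines 22-30 define a stdio adapter named myaccessloghandler. The stdio adapter writes the collected data to the standard output of the Mixer component. severity_levels maps the info severity to level 1, and setting outputAsJson to true emits logs in JSON format. The output level can be specified with the outputLevel field. Additional options can write logs to a file with automatic rotation; this experiment does not use them, and interested readers can consult the official documentation [1].
Lines 32-42 define a rule named myaccesslog-logstdio, which sends the logs collected by the myaccesslog instance to the myaccessloghandler adapter for processing.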
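The relationship between the three resources (instance, handler, rule) can be pictured with a small Python model. This is only a conceptual sketch, not Mixer's implementation: `build_instance`, `stdio_handler`, and `apply_rule` are hypothetical names, and the JSON shape is an assumption apart from the "instance" key, which appears in the grep filter used later in this section.

```python
import json
from typing import Any, Callable, Dict, List

Attributes = Dict[str, Any]


def build_instance(attrs: Attributes) -> Dict[str, Any]:
    """Stand-in for the logentry instance: map request attributes to variables."""
    return {
        "method": attrs.get("request.method", ""),
        "url": attrs.get("request.path", ""),
        "response_code": attrs.get("response.code", 0),
    }


def stdio_handler(instance_name: str, variables: Dict[str, Any]) -> str:
    """Stand-in for the stdio adapter with outputAsJson: true."""
    return json.dumps({"instance": instance_name, **variables})


def apply_rule(attrs: Attributes,
               match: Callable[[Attributes], bool],
               instance_name: str) -> List[str]:
    """Stand-in for the rule: if match holds, hand the instance to the handler."""
    if not match(attrs):
        return []
    return [stdio_handler(instance_name, build_instance(attrs))]


lines = apply_rule(
    {"request.method": "GET", "request.path": "/env", "response.code": 200},
    match=lambda attrs: True,  # corresponds to match: "true" in the rule
    instance_name="myaccesslog.logentry.istio-system",
)
for line in lines:
    print(line)
```

With a match function that always returns True, every request produces one log line, which is what match: "true" expresses in the rule.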
The following example collects service request logs and sends them to Fluentd:
 1 apiVersion: config.istio.io/v1alpha2
 2 kind: logentry
 3 metadata:
 4   name: newlog
 5   namespace: istio-system
 6 spec:
 7   severity: '"info"'
 8   timestamp: request.time
 9   variables:
10     source: source.labels["app"] | source.workload.name | "unknown"
11     user: source.principal | "unknown"
12     destination: destination.labels["app"] | destination.service.name | "unknown"
13     response_code: response.code | 0
14     response_size: response.size | 0
15     latency: response.duration | "0ms"
16   monitored_resource_type: '"UNSPECIFIED"'
17 ---
18 apiVersion: config.istio.io/v1alpha2
19 kind: fluentd
20 metadata:
21   name: fluentdhandler
22   namespace: istio-system
23 spec:
24   address: "fluentd-es.logging:24224"
25 ---
26 apiVersion: config.istio.io/v1alpha2
27 kind: rule
28 metadata:
29   name: newlogtofluentd
30   namespace: istio-system
31 spec:
32   match: "true"
33   actions:
34   - handler: fluentdhandler.fluentd
35     instances:
36     - newlog.logentry
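Lines 1-16 define a logentry instance named newlog. It is similar to the definition used for collecting HTTP request logs, so it is not described again here.
Lines 18-24 define a fluentd adapter named fluentdhandler. address specifies the address of the Fluentd service.
Lines 26-36 define a rule named newlogtofluentd, which sends the log data collected by the newlog instance to the fluentdhandler adapter.
[Experiment 1] Writing logs to Mixer standard output
1) Create the Pod used to send requests: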
$ kubectl apply -f kubernetes/fortio.yaml
2) Create the rule that collects service logs to Mixer standard output:
$ kubectl apply -f istio/telemetry/log-http-access-log.yaml
3) Open a new terminal and watch the collected service logs:
$ kubectl -n istio-system logs -f $(kubectl -n istio-system get pods -l istio-mixer-type=telemetry -o jsonpath='{.items[0].metadata.name}') -c mixer | grep \"instance\":\"myaccesslog.logentry.istio-system\"
4) Open another terminal and send concurrent requests to the service:
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -curl http://service-python/env
HTTP/1.1 200 OK
content-type: application/json
content-length: 176
server: envoy
date: Fri, 18 Jan 2019 13:53:26 GMT
x-envoy-upstream-service-time: 595

{"message":"python v2","upstream":[{"message":"lua v2","response_time":0.1},{"message":"node v1","upstream":[{"message":"go v2","response_time":"0.30"}],"response_time":0.76}]}

$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -qps 10 -n 100 -loglevel Error http://service-python/env
14:01:11 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 10 queries per second, 2->2 procs, for 100 calls: http://service-python/env
Aggregated Sleep Time : count 96 avg -3.8814565 +/- 2.688 min -9.370066353 max 0.046896141 sum -372.619823
# range, mid point, percentile, count
>= -9.37007 <= -0.001 , -4.68553 , 98.96, 95
> 0.044 <= 0.0468961 , 0.0454481 , 100.00, 1
# target 50% -4.68553
WARNING 98.96% of sleep were falling behind
Aggregated Function Time : count 100 avg 0.73897862 +/- 0.4143 min 0.097562604 max 1.980383317 sum 73.8978625
# target 50% 0.669231
# target 75% 1
# target 90% 1.58823
# target 99% 1.94117
# target 99.9% 1.97646
Sockets used: 5 (for perfect keepalive, would be 4)
Code 200 : 99 (99.0 %)
Code 503 : 1 (1.0 %)
All done 100 calls (plus 0 warmup) 738.979 ms avg, 5.1 qps
5) Clean up:
$ kubectl delete -f istio/telemetry/log-http-access-log.yaml
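The grep filter in step 3 keeps only the lines whose "instance" field is myaccesslog.logentry.istio-system. When the output is JSON, the same selection can be done in Python and extended with further conditions. The sketch below is illustrative: the function name `select_instance` is hypothetical, and the sample lines are invented stand-ins, not captured Mixer output; only the "instance" key is taken from the grep pattern above.

```python
import json
from typing import Iterable, Iterator

INSTANCE = "myaccesslog.logentry.istio-system"


def select_instance(lines: Iterable[str],
                    instance: str = INSTANCE) -> Iterator[dict]:
    """Yield parsed JSON log records that belong to the given instance.
    Lines that are not valid JSON objects are skipped."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and record.get("instance") == instance:
            yield record


# Hypothetical sample input; real Mixer log lines carry more fields.
sample_lines = [
    "plain text line that is not JSON",
    '{"instance":"myaccesslog.logentry.istio-system","url":"/env","response_code":200}',
    '{"instance":"other.logentry.istio-system","response_code":200}',
    '{"instance":"myaccesslog.logentry.istio-system","url":"/env","response_code":503}',
]

server_errors = [r for r in select_instance(sample_lines)
                 if r["response_code"] >= 500]
print(len(server_errors))
```

In practice the lines would be read from the piped output of the kubectl logs command instead of a hard-coded list.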
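[Experiment 2] Writing logs to Fluentd standard output
Because of limited machine resources, this experiment uses only Fluentd to collect logs and writes them to Fluentd's standard output. In production, logs are collected through Fluentd, stored in an ElasticSearch cluster, and searched, viewed, and analyzed in the Kibana web interface. If the machines have enough resources, loging-stack.yaml can be used to deploy an EFK log collection platform that simulates the production scenario.
1) Deploy Fluentd: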
$ kubectl apply -f kubernetes/loging-fluentd-stdout.yaml
$ kubectl get pod -n logging
NAME                          READY   STATUS    RESTARTS   AGE
fluentd-es-6cd547b4bc-sjqmk   1/1     Running   0          3m13s
2) Create the log collection rule that sends service logs to Fluentd:
$ kubectl apply -f istio/telemetry/log-fluentd.yaml
3) Open a terminal and watch the service logs:
$ kubectl -n logging logs -f $(kubectl -n logging get pods -l app=fluentd-es -o jsonpath='{.items[0].metadata.name}') | grep newlog.logentry.istio-system
4) Open another terminal and send concurrent requests to the service:
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -curl http://service-python/env
HTTP/1.1 200 OK
content-type: application/json
content-length: 178
server: envoy
date: Fri, 18 Jan 2019 14:08:10 GMT
x-envoy-upstream-service-time: 667

{"message":"python v1","upstream":[{"message":"lua v1","response_time":1.03},{"message":"node v2","response_time":1.95,"upstream":[{"message":"go v2","response_time":"1.04"}]}]}

$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -qps 10 -n 100 -loglevel Error http://service-python/env
14:08:51 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 10 queries per second, 2->2 procs, for 100 calls: http://service-python/env
Aggregated Sleep Time : count 96 avg -3.8814565 +/- 2.688 min -9.370066353 max 0.046896141 sum -372.619823
# range, mid point, percentile, count
>= -9.37007 <= -0.001 , -4.68553 , 98.96, 95
> 0.044 <= 0.0468961 , 0.0454481 , 100.00, 1
# target 50% -4.68553
WARNING 98.96% of sleep were falling behind
Aggregated Function Time : count 100 avg 0.73897862 +/- 0.4143 min 0.097562604 max 1.980383317 sum 73.8978625
# target 50% 0.669231
# target 75% 1
# target 90% 1.58823
# target 99% 1.94117
# target 99.9% 1.97646
Sockets used: 5 (for perfect keepalive, would be 4)
Code 200 : 99 (99.0 %)
Code 503 : 1 (1.0 %)
All done 100 calls (plus 0 warmup) 738.979 ms avg, 5.1 qps
5) Clean up:
$ kubectl delete -f kubernetes/loging-fluentd-stdout.yaml
$ kubectl delete -f istio/telemetry/log-fluentd.yaml
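The Fortio summary in step 4 ends with one "Code NNN : count (percent %)" line per HTTP status. When comparing runs, it can help to extract those counts programmatically and cross-check them against the collected logs. The following Python sketch parses the status lines quoted from the run above; the function name `parse_status_counts` is hypothetical and the regular expression is an assumption based only on the output format shown in this section.

```python
import re
from typing import Dict

# Matches lines such as "Code 200 : 99 (99.0 %)".
STATUS_LINE = re.compile(r"Code\s+(\d+)\s*:\s*(\d+)\s*\(([\d.]+)\s*%\)")


def parse_status_counts(fortio_output: str) -> Dict[int, int]:
    """Map each HTTP status code in a Fortio summary to its request count."""
    return {int(code): int(count)
            for code, count, _percent in STATUS_LINE.findall(fortio_output)}


# Tail of the Fortio summary shown in step 4.
summary = """\
Sockets used: 5 (for perfect keepalive, would be 4)
Code 200 : 99 (99.0 %)
Code 503 : 1 (1.0 %)
All done 100 calls (plus 0 warmup) 738.979 ms avg, 5.1 qps
"""

counts = parse_status_counts(summary)
total = sum(counts.values())
error_rate = sum(n for code, n in counts.items() if code >= 500) / total
print(counts, error_rate)
```

The number of 503 responses found this way should correspond to the number of log entries with response_code 503 collected for the same run.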
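[Experiment 3] Writing logs to EFK
For this experiment, each virtual machine was allocated 4 CPU cores and 4 GB of memory. Readers without sufficient resources can skip this experiment.
1) Deploy the EFK log collection platform: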
$ kubectl apply -f kubernetes/loging-stack.yaml
2) Create the log collection rule that sends service logs to Fluentd:
$ kubectl apply -f istio/telemetry/log-fluentd.yaml
3) Send concurrent requests to the service:
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -curl http://service-python/env
HTTP/1.1 200 OK
content-type: application/json
content-length: 178
server: envoy
date: Fri, 18 Jan 2019 14:10:25 GMT
x-envoy-upstream-service-time: 179

{"message":"python v1","upstream":[{"message":"lua v1","response_time":0.15},{"message":"node v2","response_time":0.51,"upstream":[{"message":"go v1","response_time":"0.40"}]}]}

$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -qps 10 -n 300 -loglevel Error http://service-python/env
14:11:50 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 10 queries per second, 2->2 procs, for 300 calls: http://service-python/env
Aggregated Sleep Time : count 296 avg -33.639106 +/- 23.5 min -84.517839391 max -0.225980787 sum -9957.17528
# range, mid point, percentile, count
>= -84.5178 <= -0.225981 , -42.3719 , 100.00, 296
# target 50% -42.5148
WARNING 100.00% of sleep were falling behind
Aggregated Function Time : count 300 avg 1.4654142 +/- 1.06 min 0.009331473 max 7.796983286 sum 439.624248
# target 50% 1.26667
# target 75% 1.98095
# target 90% 2.81132
# target 99% 6.5
# target 99.9% 7.70789
Sockets used: 10 (for perfect keepalive, would be 4)
Code 200 : 293 (97.7 %)
Code 503 : 7 (2.3 %)
All done 300 calls (plus 0 warmup) 1465.414 ms avg, 2.6 qps
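4) View the log data in Kibana.
Open http://11.11.11.111:32142/ to configure Kibana and view the logs. Select Management in the navigation bar and click Index Patterns to create an index pattern, as shown in Figure 11-4.
Enter logstash-* to match the log indices stored in ElasticSearch, as shown in Figure 11-5.
Select @timestamp as the time field, as shown in Figure 11-6.
View the logs in the Discover navigation bar, as shown in Figure 11-7.
Figure 11-4 Viewing log data in Kibana
Figure 11-5 Matching the log indices stored in ElasticSearch
Figure 11-6 Selecting the time field
5) Clean up: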
$ kubectl delete -f istio/telemetry/log-fluentd.yaml
$ kubectl delete -f kubernetes/loging-stack.yaml
$ kubectl delete -f kubernetes/fortio.yaml
Figure 11-7 Viewing logs in the Discover navigation bar
[1] The official documentation is at https://istio.io/docs/reference/config/policy-and-telemetry/adapters/stdio/.