Istio提供了开箱即用的日志收集组件,可以把日志输出到Mixer组件上的标准输出或者存储到Mixer组件上的文件里。也可以把日志发送到Fluentd组件,然后收集存储到日志中心,用于检索分析日志,生产环境推荐使用这种方式来收集日志。
收集HTTP协议服务的请求日志输出到标准输出示例:
1 apiVersion: config.istio.io/v1alpha2
2 kind: logentry
3 metadata:
4 name: myaccesslog
5 namespace: istio-system
6 spec:
7 severity: '"Default"'
8 timestamp: request.time
9 variables:
10 source_ip: source.ip | ip("0.0.0.0")
11 destination_ip: destination.ip | ip("0.0.0.0")
12 source_user: source.principal | ""
13 method: request.method | ""
14 url: request.path | ""
15 protocol: request.scheme | "http"
16 response_code: response.code | 0
17 response_size: response.size | 0
18 request_size: request.size | 0
19 latency: response.duration | "0ms"
20 monitored_resource_type: '"UNSPECIFIED"'
21 ---
22 apiVersion: config.istio.io/v1alpha2
23 kind: stdio
24 metadata:
25 name: myaccessloghandler
26 namespace: istio-system
27 spec:
28 severity_levels:
29 info: 1
30 outputAsJson: true
31 ---
32 apiVersion: config.istio.io/v1alpha2
33 kind: rule
34 metadata:
35 name: myaccesslog-logstdio
36 namespace: istio-system
37 spec:
38 match: "true"
39 actions:
40 - handler: myaccessloghandler.stdio
41 instances:
42 - myaccesslog.logentry
第1~20行定义了名为了myaccesslog的logentry实例。severity表示收集的日志级别为Default,此字段供支持日志级别的适配器使用。timestamp表示使用请求的时间作为日志的时间。variables定义了日志收集的数据。monitored_resource_type定义了被监控资源的类型为"UNSPECIFIED"。
第22~30行定义了名为myaccessloghandler的stdio适配器。stdio适配器会把收集到的数据输出到Mixer组件的标准输出,severity_levels定义了日志的级别为info,outputAsJson设置为true表示以JSON的格式输出日志,可以通过设置outputLevel字段指定输出的级别。可以通过更多的配置选项输出日志到文件中并自动轮转日志,此次实验并没有使用这种方法,有兴趣的读者可以查看官方文档 [1] 。
第32~42行定义了名为myaccesslog-logstdio的规则,表明把myaccesslog实例收集到的日志发送给myaccessloghandler适配器处理。
收集服务请求日志发送到Fluentd组件示例:
1 apiVersion: config.istio.io/v1alpha2 2 kind: logentry 3 metadata: 4 name: newlog 5 namespace: istio-system 6 spec: 7 severity: '"info"' 8 timestamp: request.time 9 variables: 10 source: source.labels["app"] | source.workload.name | "unknown" 11 user: source.principal | "unknown" 12 destination:destination.labels["app"] | destination.service.name | "unknown" 13 response_code: response.code | 0 14 response_size: response.size | 0 15 latency: response.duration | "0ms" 16 monitored_resource_type: '"UNSPECIFIED"' 17 --- 18 apiVersion: config.istio.io/v1alpha2 19 kind: fluentd 20 metadata: 21 name: fluentdhandler 22 namespace: istio-system 23 spec: 24 address: "fluentd-es.logging:24224" 25 --- 26 apiVersion: config.istio.io/v1alpha2 27 kind: rule 28 metadata: 29 name: newlogtofluentd 30 namespace: istio-system 31 spec: 32 match: "true" 33 actions: 34 - handler: fluentdhandler.fluentd 35 instances: 36 - newlog.logentry
第1~16行定义了名为newlog的logentry实例。与收集HTTP协议服务的请求日志定义类似,在此不再赘述。
第18~24行定义了名为fluentdhandler的fluentd适配器。address定义了fluentd服务地址。
第26~36行定义了名为newlogtofluentd的规则,表明把newlog实例收集到的日志数据发送到fluentdhandler适配器。
【实验一】 测试日志输出到Mixer标准输出
1)创建用于请求的Pod:
$ kubectl apply -f kubernetes/fortio.yaml
2)创建服务日志收集到Mixer标准输出的规则:
$ kubectl apply -f istio/telemetry/log-http-access-log.yaml
3)打开新终端查看收集到的服务日志:
$ kubectl -n istio-system logs -f $(kubectl -n istio-system get pods -l istio-mixer-type=telemetry -o jsonpath='{.items[0].metadata.name}') -c mixer | grep \"instance\":\"myaccesslog.logentry.istio-system\"
4)另外打开一个终端并发请求服务:
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -curl http://service-python/env
HTTP/1.1 200 OK
content-type: application/json
content-length: 176
server: envoy
date: Fri, 18 Jan 2019 13:53:26 GMT
x-envoy-upstream-service-time: 595
{"message":"python v2","upstream":[{"message":"lua v2","response_time":0.1},{"message":"node v1","upstream":[{"message":"go v2","response_time":"0.30"}],"response_time":0.76}]}
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -qps 10 -n 100 -loglevel Error http://service-python/env
14:01:11 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 10 queries per second, 2->2 procs, for 100 calls: http://service-python/env
Aggregated Sleep Time : count 96 avg -3.8814565 +/- 2.688 min -9.370066353 max 0.046896141 sum -372.619823
# range, mid point, percentile, count
>= -9.37007 <= -0.001 , -4.68553 , 98.96, 95
> 0.044 <= 0.0468961 , 0.0454481 , 100.00, 1
# target 50% -4.68553
WARNING 98.96% of sleep were falling behind
Aggregated Function Time : count 100 avg 0.73897862 +/- 0.4143 min 0.097562604 max 1.980383317 sum 73.8978625
# target 50% 0.669231
# target 75% 1
# target 90% 1.58823
# target 99% 1.94117
# target 99.9% 1.97646
Sockets used: 5 (for perfect keepalive, would be 4)
Code 200 : 99 (99.0 %)
Code 503 : 1 (1.0 %)
All done 100 calls (plus 0 warmup) 738.979 ms avg, 5.1 qps
5)清理:
$ kubectl delete -f istio/telemetry/log-http-access-log.yaml
【实验二】 测试日志输出到Fluentd标准输出
由于机器性能问题,本次实验环境只使用Fluentd收集日志,并输出到Fluentd的标准输出上,在生产环境中,把日志通过Fluentd收集,然后保存到ElasticSearch集群中,通过Kibana在Web中搜索查看分析日志。在机器性能足够的情况下,可以使用loging-stack.yaml部署EFK日志收集平台来模拟生产环境中的日志收集场景。
1)部署Fluentd:
$ kubectl apply -f kubernetes/loging-fluentd-stdout.yaml $ kubectl get pod -n logging NAME READY STATUS RESTARTS AGE fluentd-es-6cd547b4bc-sjqmk 1/1 Running 0 3m13s
2)创建服务日志发送到Fluentd的日志收集规则:
$ kubectl apply -f istio/telemetry/log-fluentd.yaml
3)打开终端查看服务日志:
$ kubectl -n logging logs -f $(kubectl -n logging get pods -l app=fluentd-es -o jsonpath='{.items[0].metadata.name}') | grep newlog.logentry.istio-system
4)另外打开一个终端并发请求服务:
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -curl http://service-python/env
HTTP/1.1 200 OK
content-type: application/json
content-length: 178
server: envoy
date: Fri, 18 Jan 2019 14:08:10 GMT
x-envoy-upstream-service-time: 667
{"message":"python v1","upstream":[{"message":"lua v1","response_time":1.03},{"message":"node v2","response_time":1.95,"upstream":[{"message":"go v2","response_time":"1.04"}]}]}
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -qps 10 -n 100 -loglevel Error http://service-python/env
14:08:51 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 10 queries per second, 2->2 procs, for 100 calls: http://service-python/env
Aggregated Sleep Time : count 96 avg -3.8814565 +/- 2.688 min -9.370066353 max 0.046896141 sum -372.619823
# range, mid point, percentile, count
>= -9.37007 <= -0.001 , -4.68553 , 98.96, 95
> 0.044 <= 0.0468961 , 0.0454481 , 100.00, 1
# target 50% -4.68553
WARNING 98.96% of sleep were falling behind
Aggregated Function Time : count 100 avg 0.73897862 +/- 0.4143 min 0.097562604 max 1.980383317 sum 73.8978625
# target 50% 0.669231
# target 75% 1
# target 90% 1.58823
# target 99% 1.94117
# target 99.9% 1.97646
Sockets used: 5 (for perfect keepalive, would be 4)
Code 200 : 99 (99.0 %)
Code 503 : 1 (1.0 %)
All done 100 calls (plus 0 warmup) 738.979 ms avg, 5.1 qps
5)清理:
$ kubectl delete -f kubernetes/loging-fluentd-stdout.yaml $ kubectl delete -f istio/telemetry/log-fluentd.yaml
【实验三】 测试日志输出到EFK
进行此实验步骤时,每台虚拟机分配了4核CPU和4G内存,资源不够的读者可以跳过此步骤实验。
1)部署EFK日志收集平台:
$ kubectl apply -f kubernetes/loging-stack.yaml
2)创建服务日志发送到Fluentd日志收集规则:
$ kubectl apply -f istio/telemetry/log-fluentd.yaml
3)并发请求服务:
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -curl http://service-python/env
HTTP/1.1 200 OK
content-type: application/json
content-length: 178
server: envoy
date: Fri, 18 Jan 2019 14:10:25 GMT
x-envoy-upstream-service-time: 179
{"message":"python v1","upstream":[{"message":"lua v1","response_time":0.15},{"message":"node v2","response_time":0.51,"upstream":[{"message":"go v1","response_time":"0.40"}]}]}
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -qps 10 -n 300 -loglevel Error http://service-python/env
14:11:50 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 10 queries per second, 2->2 procs, for 300 calls: http://service-python/env
Aggregated Sleep Time : count 296 avg -33.639106 +/- 23.5 min -84.517839391 max -0.225980787 sum -9957.17528
# range, mid point, percentile, count
>= -84.5178 <= -0.225981 , -42.3719 , 100.00, 296
# target 50% -42.5148
WARNING 100.00% of sleep were falling behind
Aggregated Function Time : count 300 avg 1.4654142 +/- 1.06 min 0.009331473 max 7.796983286 sum 439.624248
# target 50% 1.26667
# target 75% 1.98095
# target 90% 2.81132
# target 99% 6.5
# target 99.9% 7.70789
Sockets used: 10 (for perfect keepalive, would be 4)
Code 200 : 293 (97.7 %)
Code 503 : 7 (2.3 %)
All done 300 calls (plus 0 warmup) 1465.414 ms avg, 2.6 qps
4)在Kibana上查看日志数据。
访问地址http://11.11.11.111:32142/ ,设置并查看日志。选择Management导航栏,点击Index Patterns创建Index Pattern,如图11-4所示。
填入logstash-*匹配ElasticSearch中存储的日志索引,如图11-5所示。
选择@timestamp作为时间字段,如图11-6所示。
在Discover导航栏查看日志,如图11-7所示。
图11-4 在Kibana上查看日志数据
图11-5 匹配ElasticSearch中存储的日志索引
图11-6 选择时间字段
5)清理:
$ kubectl delete -f istio/telemetry/log-fluentd.yaml $ kubectl delete -f kubernetes/loging-stack.yaml $ kubectl delete -f kubernetes/fortio.yaml
图11-7 在Discover导航栏查看日志
[1] 官方文档地址为https://istio.io/docs/reference/config/policy-and-telemetry/adapters/stdio/。