Istio implements circuit breaking by combining connection pooling with instance health detection. When a backend instance fails, it is ejected from the load-balancing pool; once no healthy instance remains in the pool, requests immediately receive a "service unavailable" response code, and at that point the service is in the tripped (circuit-open) state. When an ejected instance's ejection period expires, the instance is added back to the load-balancing pool and waits for the next round of health checks.
Configuration example:
 1 apiVersion: networking.istio.io/v1alpha3
 2 kind: DestinationRule
 3 metadata:
 4   name: service-go
 5 spec:
 6   host: service-go
 7   trafficPolicy:
 8     connectionPool:
 9       tcp:
10         maxConnections: 10
11       http:
12         http2MaxRequests: 10
13         maxRequestsPerConnection: 10
14     outlierDetection:
15       consecutiveErrors: 3
16       interval: 3s
17       baseEjectionTime: 3m
18       maxEjectionPercent: 100
19   subsets:
20   - name: v1
21     labels:
22       version: v1
23   - name: v2
24     labels:
25       version: v2
Lines 8~13 define the connection pool settings, limiting the service to 10 concurrent requests.
Lines 14~18 define the backend instance health detection (outlier detection) settings and allow up to 100% of the instances to be ejected from the pool.
[Experiment]
1) Start the Pod used for the concurrency test:
$ kubectl apply -f kubernetes/fortio.yaml
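The manifest referenced in this step is not listed in this section. Below is a minimal sketch of what kubernetes/fortio.yaml could look like, assuming a bare Pod named fortio running the public fortio/fortio image; the actual file in the repository may differ, and sidecar injection must be enabled so that the istio-proxy container is added:

apiVersion: v1
kind: Pod
metadata:
  name: fortio
  labels:
    app: fortio
spec:
  containers:
  - name: fortio            # container name matches the -c fortio flag used below
    image: fortio/fortio    # assumption: public Fortio image from Docker Hub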
2) Create the routing rules for the service-go service:
$ kubectl apply -f istio/route/virtual-service-go.yaml
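The routing rule itself is also not reproduced here. A minimal sketch of what istio/route/virtual-service-go.yaml might contain is shown below; it simply routes all HTTP traffic for service-go to the service (the actual file may instead split traffic across the v1 and v2 subsets defined in the DestinationRule):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service-go
spec:
  hosts:
  - service-go
  http:
  - route:
    - destination:
        host: service-go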
3) Create the circuit breaking rule:
$ kubectl apply -f istio/resilience/destination-rule-go-cb.yaml
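As an optional check (not part of the original steps), you can confirm that the limits from the DestinationRule have been pushed to the Envoy sidecar by querying its admin interface; the circuit breaker thresholds for the service-go cluster should appear in the /clusters output:

$ kubectl exec fortio -c istio-proxy -- curl -s localhost:15000/clusters | grep service-go | grep max_

Entries such as max_connections and max_requests should reflect the values configured above (10), though the exact output format depends on the Envoy version.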
4) Access the service:
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -curl http://service-go/env
HTTP/1.1 200 OK
content-type: application/json; charset=utf-8
date: Wed, 16 Jan 2019 10:22:35 GMT
content-length: 19
x-envoy-upstream-service-time: 3
server: envoy

{"message":"go v2"}

# 20 concurrent connections
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -c 20 -qps 0 -n 200 -loglevel Error http://service-go/env
10:25:21 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 0 queries per second, 2->2 procs, for 200 calls: http://service-go/env
Aggregated Function Time : count 200 avg 0.023687933 +/- 0.01781 min 0.002302379 max 0.082312522 sum 4.73758658
# target 50% 0.0175385
# target 75% 0.029375
# target 90% 0.0533333
# target 99% 0.0766667
# target 99.9% 0.08185
Sockets used: 22 (for perfect keepalive, would be 20)
Code 200 : 198 (99.0 %)
Code 503 : 2 (1.0 %)
All done 200 calls (plus 0 warmup) 23.688 ms avg, 631.3 qps

# 30 concurrent connections
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -c 30 -qps 0 -n 300 -loglevel Error http://service-go/env
10:26:49 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 0 queries per second, 2->2 procs, for 300 calls: http://service-go/env
Aggregated Function Time : count 300 avg 0.055940327 +/- 0.04215 min 0.001836339 max 0.207798702 sum 16.782098
# target 50% 0.0394737
# target 75% 0.0776471
# target 90% 0.123333
# target 99% 0.18
# target 99.9% 0.205459
Sockets used: 94 (for perfect keepalive, would be 30)
Code 200 : 236 (78.7 %)
Code 503 : 64 (21.3 %)
All done 300 calls (plus 0 warmup) 55.940 ms avg, 486.3 qps

# 40 concurrent connections
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -c 40 -qps 0 -n 400 -loglevel Error http://service-go/env
10:27:17 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 0 queries per second, 2->2 procs, for 400 calls: http://service-go/env
Aggregated Function Time : count 400 avg 0.034048003 +/- 0.02541 min 0.001808212 max 0.144268023 sum 13.6192011
# target 50% 0.028587
# target 75% 0.0415789
# target 90% 0.0588889
# target 99% 0.132
# target 99.9% 0.143414
Sockets used: 203 (for perfect keepalive, would be 40)
Code 200 : 225 (56.2 %)
Code 503 : 175 (43.8 %)
All done 400 calls (plus 0 warmup) 34.048 ms avg, 951.0 qps

# Check istio-proxy stats
$ kubectl exec fortio -c istio-proxy -- curl -s localhost:15000/stats | grep service-go | grep pending
cluster.outbound|80|v1|service-go.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|80|v1|service-go.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
cluster.outbound|80|v1|service-go.default.svc.cluster.local.upstream_rq_pending_overflow: 0
cluster.outbound|80|v1|service-go.default.svc.cluster.local.upstream_rq_pending_total: 0
cluster.outbound|80|v2|service-go.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|80|v2|service-go.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
cluster.outbound|80|v2|service-go.default.svc.cluster.local.upstream_rq_pending_overflow: 0
cluster.outbound|80|v2|service-go.default.svc.cluster.local.upstream_rq_pending_total: 0
cluster.outbound|80||service-go.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|80||service-go.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
cluster.outbound|80||service-go.default.svc.cluster.local.upstream_rq_pending_overflow: 551
cluster.outbound|80||service-go.default.svc.cluster.local.upstream_rq_pending_total: 1282
The experiment deploys two versions of the service-go service, one Pod per version. Each Pod is limited to 10 concurrent requests, so the total maximum concurrency is 20. The load-test results show that as concurrency increases, the share of "service unavailable" (503) responses grows. They also show that the circuit breaker does not cut off requests exactly at the configured concurrency limit; Istio allows some requests above the threshold to slip through.
5) Clean up:
$ kubectl delete -f kubernetes/fortio.yaml
$ kubectl delete -f istio/route/virtual-service-go.yaml
$ kubectl delete -f istio/resilience/destination-rule-go-cb.yaml