Istio implements circuit breaking by combining connection pooling with instance health detection. When a backend instance fails, it is ejected from the load-balancing pool; once no healthy instance remains in the pool, requests immediately receive a "service unavailable" response code, and at that point the service is in the tripped (circuit-open) state. When an ejected instance's ejection period expires, the instance is added back to the load-balancing pool and waits for the next round of health checks.
Configuration example:
 1 apiVersion: networking.istio.io/v1alpha3
 2 kind: DestinationRule
 3 metadata:
 4   name: service-go
 5 spec:
 6   host: service-go
 7   trafficPolicy:
 8     connectionPool:
 9       tcp:
10         maxConnections: 10
11       http:
12         http2MaxRequests: 10
13         maxRequestsPerConnection: 10
14     outlierDetection:
15       consecutiveErrors: 3
16       interval: 3s
17       baseEjectionTime: 3m
18       maxEjectionPercent: 100
19   subsets:
20   - name: v1
21     labels:
22       version: v1
23   - name: v2
24     labels:
25       version: v2
Lines 8~13 define the connection pool settings, limiting the service to 10 concurrent requests.
Lines 14~18 define the backend instance health detection (outlier detection) settings and allow up to 100% of the instances to be ejected from the pool.
[Experiment]
1) Start the Pod used for the concurrency test:
$ kubectl apply -f kubernetes/fortio.yaml
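The manifest referenced in this step is not listed in this section. Below is a minimal sketch of what kubernetes/fortio.yaml could look like, assuming a bare Pod named fortio running the public fortio/fortio image; the actual file in the repository may differ, and sidecar injection must be enabled so that the istio-proxy container is added:

apiVersion: v1
kind: Pod
metadata:
  name: fortio
  labels:
    app: fortio
spec:
  containers:
  - name: fortio            # container name matches the -c fortio flag used below
    image: fortio/fortio    # assumption: public Fortio image from Docker Hub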
2) Create the routing rules for the service-go service:
$ kubectl apply -f istio/route/virtual-service-go.yaml
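The routing rule itself is also not reproduced here. A minimal sketch of what istio/route/virtual-service-go.yaml might contain is shown below; it simply routes all HTTP traffic for service-go to the service (the actual file may instead split traffic across the v1 and v2 subsets defined in the DestinationRule):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service-go
spec:
  hosts:
  - service-go
  http:
  - route:
    - destination:
        host: service-go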
3) Create the circuit breaking rule:
$ kubectl apply -f istio/resilience/destination-rule-go-cb.yaml
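As an optional check (not part of the original steps), you can confirm that the limits from the DestinationRule have been pushed to the Envoy sidecar by querying its admin interface; the circuit breaker thresholds for the service-go cluster should appear in the /clusters output:

$ kubectl exec fortio -c istio-proxy -- curl -s localhost:15000/clusters | grep service-go | grep max_

Entries such as max_connections and max_requests should reflect the values configured above (10), though the exact output format depends on the Envoy version.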
4) Access the service:
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -curl http://service-go/env
HTTP/1.1 200 OK
content-type: application/json; charset=utf-8
date: Wed, 16 Jan 2019 10:22:35 GMT
content-length: 19
x-envoy-upstream-service-time: 3
server: envoy

{"message":"go v2"}

# 20 concurrent connections
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -c 20 -qps 0 -n 200 -loglevel Error http://service-go/env
10:25:21 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 0 queries per second, 2->2 procs, for 200 calls: http://service-go/env
Aggregated Function Time : count 200 avg 0.023687933 +/- 0.01781 min 0.002302379 max 0.082312522 sum 4.73758658
# target 50% 0.0175385
# target 75% 0.029375
# target 90% 0.0533333
# target 99% 0.0766667
# target 99.9% 0.08185
Sockets used: 22 (for perfect keepalive, would be 20)
Code 200 : 198 (99.0 %)
Code 503 : 2 (1.0 %)
All done 200 calls (plus 0 warmup) 23.688 ms avg, 631.3 qps

# 30 concurrent connections
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -c 30 -qps 0 -n 300 -loglevel Error http://service-go/env
10:26:49 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 0 queries per second, 2->2 procs, for 300 calls: http://service-go/env
Aggregated Function Time : count 300 avg 0.055940327 +/- 0.04215 min 0.001836339 max 0.207798702 sum 16.782098
# target 50% 0.0394737
# target 75% 0.0776471
# target 90% 0.123333
# target 99% 0.18
# target 99.9% 0.205459
Sockets used: 94 (for perfect keepalive, would be 30)
Code 200 : 236 (78.7 %)
Code 503 : 64 (21.3 %)
All done 300 calls (plus 0 warmup) 55.940 ms avg, 486.3 qps

# 40 concurrent connections
$ kubectl exec fortio -c fortio /usr/local/bin/fortio -- load -c 40 -qps 0 -n 400 -loglevel Error http://service-go/env
10:27:17 I logger.go:97> Log level is now 4 Error (was 2 Info)
Fortio 1.0.1 running at 0 queries per second, 2->2 procs, for 400 calls: http://service-go/env
Aggregated Function Time : count 400 avg 0.034048003 +/- 0.02541 min 0.001808212 max 0.144268023 sum 13.6192011
# target 50% 0.028587
# target 75% 0.0415789
# target 90% 0.0588889
# target 99% 0.132
# target 99.9% 0.143414
Sockets used: 203 (for perfect keepalive, would be 40)
Code 200 : 225 (56.2 %)
Code 503 : 175 (43.8 %)
All done 400 calls (plus 0 warmup) 34.048 ms avg, 951.0 qps

# Check istio-proxy stats
$ kubectl exec fortio -c istio-proxy -- curl -s localhost:15000/stats | grep service-go | grep pending
cluster.outbound|80|v1|service-go.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|80|v1|service-go.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
cluster.outbound|80|v1|service-go.default.svc.cluster.local.upstream_rq_pending_overflow: 0
cluster.outbound|80|v1|service-go.default.svc.cluster.local.upstream_rq_pending_total: 0
cluster.outbound|80|v2|service-go.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|80|v2|service-go.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
cluster.outbound|80|v2|service-go.default.svc.cluster.local.upstream_rq_pending_overflow: 0
cluster.outbound|80|v2|service-go.default.svc.cluster.local.upstream_rq_pending_total: 0
cluster.outbound|80||service-go.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|80||service-go.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
cluster.outbound|80||service-go.default.svc.cluster.local.upstream_rq_pending_overflow: 551
cluster.outbound|80||service-go.default.svc.cluster.local.upstream_rq_pending_total: 1282
The experiment deploys two versions of the service-go service, one Pod per version. Each Pod is limited to 10 concurrent requests, so the total maximum concurrency is 20. The load-test results show that as concurrency increases, the share of "service unavailable" (503) responses grows. They also show that the circuit breaker does not cut off requests exactly at the configured concurrency limit; Istio allows some requests above the threshold to slip through.
5) Clean up:
$ kubectl delete -f kubernetes/fortio.yaml
$ kubectl delete -f istio/route/virtual-service-go.yaml
$ kubectl delete -f istio/resilience/destination-rule-go-cb.yaml