Upgrading Istio can cause service interruptions. To keep downtime to a minimum, make sure the Istio control-plane components and the services inside the mesh are all deployed with multiple replicas. A major-version upgrade may additionally involve new configuration and adjustments to new APIs; this lab demonstrates only a minor-version upgrade, taking Istio from version 1.0.3 to 1.0.5.
This lab is performed on a cluster where Istio was deployed using the official demo manifests.
(1) Download and install Istio
Download the Istio release package:
$ sudo yum install -y wget
$ wget https://github.com/istio/istio/releases/download/1.0.5/istio-1.0.5-linux.tar.gz
Extract and install it:
$ tar xf istio-1.0.5-linux.tar.gz
$ sudo mv istio-1.0.5 /usr/local
$ sudo rm -f /usr/local/istio
$ sudo ln -sv /usr/local/istio-1.0.5/ /usr/local/istio
Add the binaries to the PATH:
$ echo 'export PATH=/usr/local/istio/bin:$PATH' | sudo tee /etc/profile.d/istio.sh
Verify the installation:
$ source /etc/profile.d/istio.sh
$ istioctl version
Version: 1.0.5
GitRevision: c1707e45e71c75d74bf3a5dec8c7086f32f32fad
User: root@6f6ea1061f2b
Hub: docker.io/istio
GolangVersion: go1.10.4
BuildStatus: Clean
(2) Configure Istio
Make a copy of the original file so it can be restored if something goes wrong:
$ cp /usr/local/istio/install/kubernetes/istio-demo.yaml /usr/local/istio/install/kubernetes/istio-demo.yaml.ori
Modify the Istio deployment configuration. The virtual machines used in this lab have only 2 GB of memory each, while by default the pilot Deployment requests 2 GB of memory. So that the lab can proceed, change the istio-pilot memory configuration (around line 13531) to the following:
      containers:
        - name: discovery
          image: "docker.io/istio/pilot:1.0.5"
          imagePullPolicy: IfNotPresent
          args:
            - "discovery"
          ports:
          ...
            initialDelaySeconds: 5
            periodSeconds: 30
            timeoutSeconds: 5
          ...
          resources:
            requests:
              cpu: 500m
              memory: 500Mi
          volumeMounts:
            - name: config-volume
              mountPath: /etc/istio/config
            - name: istio-certs
              mountPath: /etc/certs
              readOnly: true
Configure istio-egressgateway to run multiple replicas:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: istio-egressgateway
  namespace: istio-system
...
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: istio-egressgateway
        istio: egressgateway
Configure istio-ingressgateway to run multiple replicas:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: istio-ingressgateway
  namespace: istio-system
  labels:
...
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: istio-ingressgateway
        istio: ingressgateway
Configure istio-pilot to run multiple replicas:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: istio-pilot
  namespace: istio-system
  # TODO: default template doesn't have this, which one is right ?
  labels:
...
spec:
  replicas: 2
  template:
    metadata:
      labels:
        istio: pilot
        app: pilot
Configure istio-policy to run multiple replicas:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: istio-policy
  namespace: istio-system
...
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: policy
        istio: mixer
        istio-mixer-type: policy
Configure istio-telemetry to run multiple replicas:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: istio-telemetry
  namespace: istio-system
...
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: telemetry
        istio: mixer
        istio-mixer-type: telemetry
Apply all of the changes above from the command line:
$ sed -i '13531 s/memory: 2048Mi/memory: 500Mi/g' /usr/local/istio/install/kubernetes/istio-demo.yaml
$ sed -i '12838 s/replicas: 1/replicas: 2/g' /usr/local/istio/install/kubernetes/istio-demo.yaml
$ sed -i '12977 s/replicas: 1/replicas: 2/g' /usr/local/istio/install/kubernetes/istio-demo.yaml
$ sed -i '13482 s/replicas: 1/replicas: 2/g' /usr/local/istio/install/kubernetes/istio-demo.yaml
$ sed -i '13228 s/replicas: 1/replicas: 2/g' /usr/local/istio/install/kubernetes/istio-demo.yaml
$ sed -i '13368 s/replicas: 1/replicas: 2/g' /usr/local/istio/install/kubernetes/istio-demo.yaml
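Note that the line numbers above match the 1.0.5 istio-demo.yaml and will drift in other releases. A pattern-anchored sed is more robust; the sketch below demonstrates the idea on a small synthetic sample file rather than the real manifest:

```shell
# Create a synthetic two-Deployment sample (a stand-in for istio-demo.yaml).
cat > /tmp/sample.yaml <<'EOF'
kind: Deployment
metadata:
  name: istio-pilot
spec:
  replicas: 1
---
kind: Deployment
metadata:
  name: istio-citadel
spec:
  replicas: 1
EOF

# Bump replicas only inside the istio-pilot block, anchoring on the
# Deployment name and the document separator instead of line numbers.
sed -i '/name: istio-pilot/,/^---$/ s/replicas: 1/replicas: 2/' /tmp/sample.yaml

grep -c 'replicas: 2' /tmp/sample.yaml   # prints 1: only istio-pilot changed
```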
Check the result:
$ grep -B5 'memory: 500Mi' /usr/local/istio/install/kubernetes/istio-demo.yaml
        - name: PILOT_TRACE_SAMPLING
          value: "100"
        resources:
          requests:
            cpu: 500m
            memory: 500Mi
$ grep -A5 'replicas: 2' /usr/local/istio/install/kubernetes/istio-demo.yaml
  replicas: 2
  template:
    metadata:
      labels:
        app: istio-egressgateway
        istio: egressgateway
--
  replicas: 2
  template:
    metadata:
      labels:
        app: istio-ingressgateway
        istio: ingressgateway
--
  replicas: 2
  template:
    metadata:
      labels:
        app: policy
        istio: mixer
--
  replicas: 2
  template:
    metadata:
      labels:
        app: telemetry
        istio: mixer
--
  replicas: 2
  template:
    metadata:
      labels:
        istio: pilot
        app: pilot
Switch to a domestic (China) image mirror to speed up deployment by running the following command:
$ sed -i 's@quay.io/coreos/hyperkube:v1.7.6_coreos.0@registry.cn-shanghai.aliyuncs.com/gcr-k8s/hyperkube:v1.7.6_coreos.0@g' /usr/local/istio/install/kubernetes/istio-demo.yaml
(3) Run multiple replicas of the Istio control-plane components
Set the Istio control-plane components to two replicas each so that the control plane is highly available. Citadel manages certificates and keys, Galley validates the configuration of the other Istio components, and Sidecar-injector performs automatic Envoy sidecar injection for Pods. Istio upgrades are normally carried out during periods of low traffic, with changes to the mesh's services forbidden; these three components are briefly unavailable while they update, but that does not affect traffic between services, so this lab does not configure them for high availability:
$ kubectl scale --replicas=2 -n istio-system deployment istio-egressgateway
$ kubectl scale --replicas=2 -n istio-system deployment istio-ingressgateway
$ kubectl scale --replicas=2 -n istio-system deployment istio-pilot
$ kubectl scale --replicas=2 -n istio-system deployment istio-policy
$ kubectl scale --replicas=2 -n istio-system deployment istio-telemetry
$ kubectl get pod -n istio-system
NAME                                     READY   STATUS      RESTARTS   AGE
grafana-546d9997bb-4xs5s                 1/1     Running     0          25d
istio-citadel-6955bc9cb7-dsl78           1/1     Running     0          25d
istio-cleanup-secrets-ntxn8              0/1     Completed   0          25d
istio-egressgateway-7dc5cbbc56-lpkqn     1/1     Running     0          24s
istio-egressgateway-7dc5cbbc56-rc5pl     1/1     Running     0          25d
istio-galley-545b6b8f5b-5pd4n            1/1     Running     0          25d
istio-grafana-post-install-97s5m         0/1     Completed   0          25d
istio-ingressgateway-7958d776b5-ccxxq    1/1     Running     0          24s
istio-ingressgateway-7958d776b5-qfpg7    1/1     Running     0          25d
istio-pilot-64958c46fc-5ct7q             2/2     Running     0          24s
istio-pilot-64958c46fc-cdk62             2/2     Running     0          25d
istio-policy-5c689f446f-l87kl            2/2     Running     0          25d
istio-policy-5c689f446f-zbl2r            2/2     Running     0          23s
istio-security-post-install-j8xr7        0/1     Completed   0          25d
istio-sidecar-injector-99b476b7b-58lk7   1/1     Running     0          25d
istio-telemetry-55d68b5dfb-4xjlp         2/2     Running     0          23s
istio-telemetry-55d68b5dfb-6v27b         2/2     Running     0          25d
istio-tracing-6445d6dbbf-p5nt5           1/1     Running     0          25d
prometheus-65d6f6b6c-2lf6n               1/1     Running     0          25d
servicegraph-57c8cbc56f-c2g2j            1/1     Running     3          25d
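The five scale commands can also be generated with a loop. The sketch below only prints the commands, so it is safe to run without a cluster:

```shell
# Print (rather than run) one scale command per control-plane Deployment.
for d in istio-egressgateway istio-ingressgateway istio-pilot istio-policy istio-telemetry; do
  echo "kubectl scale --replicas=2 -n istio-system deployment $d"
done
```

Piping the output to `sh` would execute them.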
(4) Testing service availability during the upgrade
Deploy and scale out the service-go service:
$ kubectl apply -f service/go/service-go.yaml
$ kubectl scale --replicas=2 deployment service-go-v1
$ kubectl scale --replicas=2 deployment service-go-v2
$ kubectl get pod
NAME                             READY   STATUS    RESTARTS   AGE
service-go-v1-7cc5c6f574-4xjlp   2/2     Running   0          26s
service-go-v1-7cc5c6f574-sj2mn   2/2     Running   0          25d
service-go-v2-7656dcc478-f8zwc   2/2     Running   0          25d
service-go-v2-7656dcc478-zbl2r   2/2     Running   0          26s
Create routing rules for the service-go service. Services may become momentarily unavailable while Istio is being upgraded; configuring a retry rule for service-go reduces these transient failures. Also add RBAC and mTLS configuration for service-go, to test whether mTLS and RBAC keep working during the upgrade:
$ kubectl apply -f istio/upgrade/virtual-service-go-rbac-mtls-retry.yaml
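The contents of virtual-service-go-rbac-mtls-retry.yaml are not shown here; the retry portion of such a rule might look roughly like the following sketch (the host and destination names are assumptions):

```yaml
# Hypothetical sketch of a retry rule for service-go (Istio 1.0 API).
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: service-go
spec:
  hosts:
    - service-go
  http:
    - route:
        - destination:
            host: service-go
      retries:
        attempts: 3          # retry a failed request up to three times
        perTryTimeout: 2s    # timeout for each individual attempt
```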
Create a Pod for testing:
$ kubectl create ns testing
$ kubectl apply -f <(istioctl kube-inject -f kubernetes/sleep.yaml) -n testing
$ kubectl get pod -n testing
Open a new terminal and request the service continuously:
$ SLEEP_POD=$(kubectl get pod -l app=sleep -n testing -o jsonpath={.items..metadata.name})
$ kubectl exec -n testing $SLEEP_POD -c sleep -- sh -c 'while true;do date | tr -s "\n" " ";curl -s http://service-go.default/env -o /dev/null -w "%{http_code}\n";sleep 1;done'
Sat Jan 19 04:16:33 UTC 2019 200
Sat Jan 19 04:16:34 UTC 2019 200
Sat Jan 19 04:16:36 UTC 2019 200
...
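A probe log like the one above can be summarized into an availability figure with awk; this sketch uses a synthetic four-line sample instead of real cluster output:

```shell
# Synthetic sample of the probe log (timestamp + HTTP status code).
cat > /tmp/access.log <<'EOF'
Sat Jan 19 04:16:33 UTC 2019 200
Sat Jan 19 04:16:34 UTC 2019 503
Sat Jan 19 04:16:35 UTC 2019 200
Sat Jan 19 04:16:36 UTC 2019 200
EOF

# The status code is the last field; count 200s against the total.
awk '{ total++; if ($NF == 200) ok++ }
     END { printf "availability: %.1f%% (%d/%d)\n", 100*ok/total, ok, total }' /tmp/access.log
# -> availability: 75.0% (3/4)
```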
(5) Upgrade Istio
1) Update the Istio CRDs:
$ kubectl apply -f /usr/local/istio/install/kubernetes/helm/istio/templates/crds.yaml
2) View the Istio CRDs:
$ kubectl get crd
NAME                                    CREATED AT
adapters.config.istio.io                2019-01-28T06:44:40Z
...
rules.config.istio.io                   2019-01-28T06:44:39Z
servicecontrolreports.config.istio.io   2019-01-28T06:44:40Z
...
tracespans.config.istio.io              2019-01-28T06:44:40Z
virtualservices.networking.istio.io     2019-01-28T06:44:39Z
3) Upgrade the Istio components:
$ kubectl apply -f /usr/local/istio/install/kubernetes/istio-demo.yaml
Because some fields of the Job type are immutable, three errors related to Jobs may appear. They do not affect the upgrade and can be safely ignored.
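One optional way to avoid those errors is to delete the completed Jobs before re-applying the manifest. The Job names below are inferred from the earlier pod listing, so confirm them with `kubectl get job -n istio-system` first; the sketch only prints the commands:

```shell
# Print delete commands for the three completed Jobs (names assumed from
# the earlier pod listing; verify before running them for real).
for j in istio-cleanup-secrets istio-grafana-post-install istio-security-post-install; do
  echo "kubectl delete job -n istio-system $j"
done
```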
4) Check the status of the Istio control-plane components:
$ kubectl get svc -n istio-system
NAME                   TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                                                                                                                   AGE
...
istio-egressgateway    ClusterIP      108.226.165      <none>        80/TCP,443/TCP                                                                                                            2m
istio-galley           ClusterIP      10.106.189.177   <none>        443/TCP,9093/TCP                                                                                                          72m
istio-ingressgateway   LoadBalancer   10.105.194.10    <pending>     80:31380/TCP,443:31390/TCP,31400:31400/TCP,15011:31606/TCP,8060:30116/TCP,853:31117/TCP,15030:31051/TCP,15031:30994/TCP   72m
istio-pilot            ClusterIP      10.111.192.14    <none>        15010/TCP,15011/TCP,8080/TCP,9093/TCP                                                                                     72m
...
$ kubectl get pod -n istio-system
grafana-7ffdd5fb74-sjqgg                 1/1   Running     0   7m58s
istio-citadel-55cdfdd57c-qr8ts           1/1   Running     0   7m54s
istio-egressgateway-7798845f5d-ckqcf     1/1   Running     0   7m57s
istio-egressgateway-7798845f5d-xbrl5     1/1   Running     0   7m58s
istio-galley-76bbb946c8-n6fts            1/1   Running     0   7m59s
istio-grafana-post-install-d7qwp         0/1   Completed   0   62m
istio-ingressgateway-78c6d8b8d7-jkt9c    1/1   Running     0   7m58s
istio-ingressgateway-78c6d8b8d7-rj67m    1/1   Running     0   7m57s
istio-pilot-865fd9c96-2zfbh              2/2   Running     0   7m53s
istio-pilot-865fd9c96-qvtrw              2/2   Running     0   7m56s
istio-policy-7b6cc95d7b-9tw2h            2/2   Running     0   7m55s
istio-policy-7b6cc95d7b-ldmjc            2/2   Running     0   7m57s
istio-sidecar-injector-9c6698858-9b7j6   1/1   Running     0   7m52s
istio-telemetry-bfc9ff784-gs9gg          2/2   Running     0   7m56s
istio-telemetry-bfc9ff784-kh4cj          2/2   Running     0   7m55s
istio-tracing-6445d6dbbf-4cbbj           1/1   Running     0   62m
prometheus-65d6f6b6c-xgfkn               1/1   Running     0   62m
servicegraph-5c6f47859-vq7wx             1/1   Running     0   7m53s
5) Observe the service access log while the entire control plane is being upgraded:
...
Sat Jan 19 04:19:39 UTC 2019 200
Sat Jan 19 04:19:40 UTC 2019 200
...
Sat Jan 19 04:19:41 UTC 2019 200
Sat Jan 19 04:19:43 UTC 2019 200
...
Sat Jan 19 04:19:44 UTC 2019 503
Sat Jan 19 04:19:45 UTC 2019 200
Sat Jan 19 04:19:46 UTC 2019 200
...
The access log shows only a very small number of error responses (503), which indicates that throughout the control-plane upgrade, the services in the mesh are almost unaffected and continue to serve traffic normally.
6) Upgrade the services' Envoy proxies:
$ TERMINATION_SECONDS=$(kubectl get deploy service-go-v1 -o jsonpath='{.spec.template.spec.terminationGracePeriodSeconds}')
$ TERMINATION_SECONDS=$(($TERMINATION_SECONDS+1))
$ patch_string="{\"spec\":{\"template\":{\"spec\":{\"terminationGracePeriodSeconds\":$TERMINATION_SECONDS}}}}"
$ kubectl patch deploy service-go-v1 -p "$patch_string"
$ kubectl patch deploy service-go-v2 -p "$patch_string"
Perform this step only after all of the control-plane components have finished upgrading.
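Why the patch above triggers a sidecar upgrade: bumping terminationGracePeriodSeconds changes the Pod template, so the Deployment performs a rolling update and the replacement Pods are injected with the new 1.0.5 sidecar. The arithmetic and patch construction can be simulated locally (30 is an assumed starting value in place of the cluster lookup):

```shell
# Simulate the patch construction with a fixed value; the real command
# reads the current grace period from the Deployment via jsonpath.
TERMINATION_SECONDS=30
TERMINATION_SECONDS=$((TERMINATION_SECONDS + 1))
patch_string="{\"spec\":{\"template\":{\"spec\":{\"terminationGracePeriodSeconds\":$TERMINATION_SECONDS}}}}"
echo "$patch_string"
# -> {"spec":{"template":{"spec":{"terminationGracePeriodSeconds":31}}}}
```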
7) Observe the service access log while the Envoy proxies are being upgraded:
...
Sat Jan 19 04:27:33 UTC 2019 200
Sat Jan 19 04:27:34 UTC 2019 200
Sat Jan 19 04:27:35 UTC 2019 503
Sat Jan 19 04:27:36 UTC 2019 200
...
Sat Jan 19 04:27:53 UTC 2019 200
Sat Jan 19 04:27:54 UTC 2019 200
Sat Jan 19 04:27:55 UTC 2019 200
...
Again the access log shows only a very small number of error responses (503): throughout the Envoy proxy upgrade, the services in the mesh are almost unaffected and continue to serve traffic normally.
8) Clean up:
$ kubectl delete -f istio/upgrade/virtual-service-go-rbac-mtls-retry.yaml
$ kubectl delete ns testing
$ kubectl scale --replicas=1 -n istio-system deployment istio-egressgateway
$ kubectl scale --replicas=1 -n istio-system deployment istio-ingressgateway
$ kubectl scale --replicas=1 -n istio-system deployment istio-pilot
$ kubectl scale --replicas=1 -n istio-system deployment istio-policy
$ kubectl scale --replicas=1 -n istio-system deployment istio-telemetry