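Kubernetes Garbage Collection Mechanism

Discovering the problem

In a test environment, images on a node were suddenly deleted, so I looked into the mechanism behind the deletion.

Garbage collection mechanism

The kubelet's GC cleans up unused images and containers. The kubelet runs container GC every minute and image GC every five minutes. External garbage collection tools are not recommended, because they can break the kubelet's behavior.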

image

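Kubernetes manages the lifecycle of all images through the ImageManager, in cooperation with cAdvisor.
The image GC policy has a high threshold and a low threshold. Disk usage above the high threshold triggers GC, which deletes the least recently used images until usage falls to the low threshold.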
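
The threshold policy described above can be sketched as a small simulation. This is an illustrative model, not the kubelet's code: the `Image` class, the `images_to_delete` function, and the byte figures are hypothetical names and values chosen for the example, and treating usage exactly equal to the high threshold as a trigger is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Image:
    name: str
    size_bytes: int
    last_used: float      # timestamp of last use; smaller means used longer ago
    in_use: bool = False  # only unused images are candidates for removal


def images_to_delete(images, capacity_bytes, used_bytes,
                     high_percent=85, low_percent=80):
    """Return the names of images to delete, least recently used first."""
    usage_percent = used_bytes * 100 / capacity_bytes
    if usage_percent < high_percent:
        return []  # below the high threshold: GC is not triggered

    # Free enough bytes to bring disk usage down to the low threshold.
    to_free = used_bytes - capacity_bytes * low_percent / 100

    deleted, freed = [], 0
    candidates = sorted((i for i in images if not i.in_use),
                        key=lambda i: i.last_used)
    for image in candidates:
        if freed >= to_free:
            break
        deleted.append(image.name)
        freed += image.size_bytes
    return deleted
```

With a disk of 100 units at 90% usage and the default thresholds, the sketch asks for 10 units to be freed and walks the unused images from the least recently used onward until that amount is reached.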

container

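Container GC is governed by three user-defined variables:
MinAge: the minimum age a container must reach before it can be garbage collected
MaxPerPodContainer: the maximum number of dead containers allowed per pod container; a "pod container" here means a single container within a pod, not the pod itself
MaxContainers: the maximum total number of dead containers
Setting MinAge to 0, or MaxPerPodContainer or MaxContainers to a value less than 0, disables the corresponding variable.
GC applies to containers that are unidentified, deleted, or outside the bounds set by the three user-defined variables.
The oldest containers are generally removed first.
If keeping MaxPerPodContainer dead containers for every container would exceed MaxContainers, MaxPerPodContainer is adjusted downward, in the worst case to 1, and the oldest containers are evicted.
Containers owned by pods that have been deleted are removed once they are older than MinAge.
Containers not managed by the kubelet are not subject to container garbage collection.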
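
The interaction of the three variables is easier to follow in code. The following is a simplified sketch of the rules listed above, not the kubelet's implementation: `DeadContainer` and `select_for_gc` are hypothetical names, and the way the per-container limit is lowered (total limit divided by the number of groups, never below 1) is an assumption about how the adjustment works.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DeadContainer:
    id: str
    pod_uid: str
    name: str
    age_seconds: float        # time since the container was created
    pod_deleted: bool = False


def select_for_gc(dead, min_age=0, max_per_pod_container=1, max_containers=-1):
    """Return the set of ids of dead containers one GC pass would remove."""
    # Containers younger than MinAge are never candidates.
    evictable = [c for c in dead if c.age_seconds >= min_age]
    removed = set()

    # Group by (pod, container name), newest first inside each group.
    groups = defaultdict(list)
    for c in evictable:
        if c.pod_deleted:
            removed.add(c.id)  # owning pod is gone: remove outright
        else:
            groups[(c.pod_uid, c.name)].append(c)
    for members in groups.values():
        members.sort(key=lambda c: c.age_seconds)

    def trim_groups(limit):
        for members in groups.values():
            while len(members) > limit:
                removed.add(members.pop().id)  # pop() drops the oldest

    if max_per_pod_container >= 0:
        trim_groups(max_per_pod_container)

    total = sum(len(m) for m in groups.values())
    if max_containers >= 0 and total > max_containers:
        # The two limits conflict: lower the per-container limit, at worst to 1.
        trim_groups(max(1, max_containers // len(groups)))
        # Still too many: evict the oldest containers overall.
        remaining = sorted((c for m in groups.values() for c in m),
                           key=lambda c: c.age_seconds)
        while len(remaining) > max_containers:
            removed.add(remaining.pop().id)
    return removed
```

For one container with three dead instances, the default limit of 1 removes the two oldest, a limit of 2 removes only the oldest, and a negative limit removes none.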

Configuring GC

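GC is configured through kubelet flags.
Image GC flags:
image-gc-high-threshold: the high threshold for image GC, as a percentage of disk usage; defaults to 85%
image-gc-low-threshold: the low threshold for image GC, as a percentage of disk usage; defaults to 80%
Container GC flags:
minimum-container-ttl-duration: the MinAge parameter; defaults to 0
maximum-dead-containers-per-container: MaxPerPodContainer; defaults to 1
maximum-dead-containers: MaxContainers; defaults to -1, meaning no limit
A container may be garbage collected before it stops being useful, and dead containers hold logs and other data that help with troubleshooting. Setting maximum-dead-containers-per-container and maximum-dead-containers to sufficiently large values is strongly recommended.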
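
As a minimal sketch of how these settings fit together, the helper below renders the five flags listed above and rejects threshold pairs that make no sense for the policy (a low threshold at or above the high one). The function name `kubelet_gc_flags` and its parameter names are hypothetical; only the flag names and defaults come from the list above, and the duration is written as `0s` on the assumption that the flag takes a duration string.

```python
def kubelet_gc_flags(high=85, low=80, min_age="0s",
                     max_per_container=1, max_containers=-1):
    """Render kubelet GC flags after a basic sanity check of the thresholds."""
    for value in (high, low):
        if not 0 <= value <= 100:
            raise ValueError("thresholds are percentages between 0 and 100")
    if low >= high:
        # GC frees space from the high threshold down to the low one,
        # so the low threshold has to sit below the high threshold.
        raise ValueError("low threshold must be less than high threshold")
    return [
        f"--image-gc-high-threshold={high}",
        f"--image-gc-low-threshold={low}",
        f"--minimum-container-ttl-duration={min_age}",
        f"--maximum-dead-containers-per-container={max_per_container}",
        f"--maximum-dead-containers={max_containers}",
    ]


if __name__ == "__main__":
    # Keep two dead instances of every container for troubleshooting.
    print(" ".join(kubelet_gc_flags(max_per_container=2)))
```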

GC in action

Define a container that fails after 10 seconds, in the file pod-gc.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: gc-test
spec:
  containers:
  - name: busybox-gc-1
    image: busybox:v1
    command:
    - /bin/sh
    - -c
    - 'sleep 10 && hello'

Create the pod from the YAML file:

kubectl apply -f pod-gc.yaml

Check the container state with docker. As soon as there are two exited containers, GC is triggered and the oldest one is reclaimed:

# docker ps -a | grep busybox-gc
8ddbfcf4ebee 59788edf1f3e "/bin/sh -c 'sleep 1…" 2 seconds ago Up 1 second k8s_busybox-gc-1_gc-test_default_3651c167-75f0-11e9-bc74-52540005f38a_5
7d7c1ade82af 59788edf1f3e "/bin/sh -c 'sleep 1…" About a minute ago Exited (127) About a minute ago k8s_busybox-gc-1_gc-test_default_3651c167-75f0-11e9-bc74-52540005f38a_4
# docker ps -a | grep busybox-gc
8ddbfcf4ebee 59788edf1f3e "/bin/sh -c 'sleep 1…" 10 seconds ago Up 9 seconds k8s_busybox-gc-1_gc-test_default_3651c167-75f0-11e9-bc74-52540005f38a_5
7d7c1ade82af 59788edf1f3e "/bin/sh -c 'sleep 1…" About a minute ago Exited (127) About a minute ago k8s_busybox-gc-1_gc-test_default_3651c167-75f0-11e9-bc74-52540005f38a_4
# docker ps -a | grep busybox-gc
8ddbfcf4ebee 59788edf1f3e "/bin/sh -c 'sleep 1…" 11 seconds ago Exited (127) Less than a second ago k8s_busybox-gc-1_gc-test_default_3651c167-75f0-11e9-bc74-52540005f38a_5
7d7c1ade82af 59788edf1f3e "/bin/sh -c 'sleep 1…" About a minute ago Exited (127) About a minute ago k8s_busybox-gc-1_gc-test_default_3651c167-75f0-11e9-bc74-52540005f38a_4
# docker ps -a | grep busybox-gc
8ddbfcf4ebee 59788edf1f3e "/bin/sh -c 'sleep 1…" 12 seconds ago Exited (127) 1 second ago k8s_busybox-gc-1_gc-test_default_3651c167-75f0-11e9-bc74-52540005f38a_5
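
The container names in this output carry enough information to tell which instance was reclaimed. The sketch below splits a name into its parts, assuming the layout `k8s_<container>_<pod>_<namespace>_<pod UID>_<attempt>`, which is inferred from the names shown above; `KubeContainerName` and `parse_container_name` are hypothetical helpers.

```python
from typing import NamedTuple


class KubeContainerName(NamedTuple):
    container: str
    pod: str
    namespace: str
    pod_uid: str
    attempt: int  # trailing counter; it grows with every restart


def parse_container_name(name: str) -> KubeContainerName:
    """Split a kubelet-created docker container name into its fields."""
    parts = name.split("_")
    if len(parts) != 6 or parts[0] != "k8s":
        raise ValueError(f"not a kubelet-managed container name: {name!r}")
    _, container, pod, namespace, pod_uid, attempt = parts
    return KubeContainerName(container, pod, namespace, pod_uid, int(attempt))


if __name__ == "__main__":
    name = ("k8s_busybox-gc-1_gc-test_default_"
            "3651c167-75f0-11e9-bc74-52540005f38a_4")
    print(parse_container_name(name))
```

In the listing above, the instance with attempt 4 disappears once attempt 5 has exited, which matches the oldest dead container being removed first.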

Edit the configuration file /var/lib/kubelet/kubeadm-flags.env and add the flag --maximum-dead-containers-per-container=2:

KUBELET_KUBEADM_ARGS=--cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1 --maximum-dead-containers-per-container=2
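
If the edit has to be repeated on several nodes, it can be scripted. The following is a rough sketch that rewrites a single `KUBELET_KUBEADM_ARGS` line; `add_kubelet_flag` is a hypothetical helper, it splits arguments on whitespace (so it does not handle values containing spaces), and it keeps surrounding double quotes if the line has them.

```python
def add_kubelet_flag(env_line: str, flag: str) -> str:
    """Return env_line with flag set, replacing an earlier value of that flag."""
    key, sep, value = env_line.strip().partition("=")
    if key != "KUBELET_KUBEADM_ARGS" or not sep:
        raise ValueError("expected a KUBELET_KUBEADM_ARGS=... line")

    # Preserve double quotes around the value when they are present.
    quoted = len(value) >= 2 and value[0] == '"' and value[-1] == '"'
    args = value[1:-1] if quoted else value

    # Drop any existing occurrence of the same flag, then append the new one.
    flag_name = flag.split("=", 1)[0]
    kept = [a for a in args.split() if a.split("=", 1)[0] != flag_name]
    kept.append(flag)

    quote = '"' if quoted else ""
    return f"{key}={quote}{' '.join(kept)}{quote}"


if __name__ == "__main__":
    line = "KUBELET_KUBEADM_ARGS=--cgroup-driver=systemd --network-plugin=cni"
    print(add_kubelet_flag(line, "--maximum-dead-containers-per-container=2"))
```

Because an existing occurrence of the flag is replaced rather than duplicated, running the helper twice gives the same line as running it once.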

Restart the kubelet service:

systemctl restart kubelet

Delete the pod created earlier:

kubectl delete -f pod-gc.yaml

Create the pod again:

kubectl apply -f pod-gc.yaml

Now GC is triggered only when the number of exited containers reaches three:

# docker ps -a | grep busybox-gc
c8677098e9aa af2f74c517aa "/bin/sh -c 'sleep 1…" 8 seconds ago Up 8 seconds k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_2
f0bc1b7893bd af2f74c517aa "/bin/sh -c 'sleep 1…" 34 seconds ago Exited (127) 23 seconds ago k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_1
9e7fa118b1d3 af2f74c517aa "/bin/sh -c 'sleep 1…" 45 seconds ago Exited (127) 35 seconds ago k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_0
# docker ps -a | grep busybox-gc
c8677098e9aa af2f74c517aa "/bin/sh -c 'sleep 1…" 9 seconds ago Up 9 seconds k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_2
f0bc1b7893bd af2f74c517aa "/bin/sh -c 'sleep 1…" 35 seconds ago Exited (127) 24 seconds ago k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_1
9e7fa118b1d3 af2f74c517aa "/bin/sh -c 'sleep 1…" 46 seconds ago Exited (127) 36 seconds ago k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_0
# docker ps -a | grep busybox-gc
c8677098e9aa af2f74c517aa "/bin/sh -c 'sleep 1…" 10 seconds ago Exited (127) Less than a second ago k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_2
f0bc1b7893bd af2f74c517aa "/bin/sh -c 'sleep 1…" 36 seconds ago Exited (127) 25 seconds ago k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_1
9e7fa118b1d3 af2f74c517aa "/bin/sh -c 'sleep 1…" 47 seconds ago Exited (127) 36 seconds ago k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_0
# docker ps -a | grep busybox-gc
c8677098e9aa af2f74c517aa "/bin/sh -c 'sleep 1…" 11 seconds ago Exited (127) 1 second ago k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_2
f0bc1b7893bd af2f74c517aa "/bin/sh -c 'sleep 1…" 37 seconds ago Exited (127) 26 seconds ago k8s_busybox-gc-1_gc-test_default_3425a55b-75f5-11e9-bc74-52540005f38a_1