*** fuentess has quit IRC | 00:15 | |
*** th0din has quit IRC | 01:10 | |
*** th0din has joined #kata-dev | 01:15 | |
kata-irc-bot | <leevn2011> Hi there, I am wondering why we only allow running `one pod per VM`. What is the main motivation for running the container inside a VM (OCI compatibility?) Since we already have the VM and can only run 1 container inside it, why don't we just run the applications inside the VM directly? Why bother creating a container inside? Any reference would be greatly appreciated! | 07:05 |
kata-irc-bot | <caoruidong> Kata is "the speed of containers, the security of VMs". If you run in the VM directly, you're back to the age before containers | 07:19 |
kata-irc-bot | <leevn2011> When you say `speed`, do you mean the speed of deployment and packaging libraries? It seems to me that running a container inside the VM might impose some overhead (perhaps not much); in principle it should be slower than running applications inside the `kata lightweight VM` directly. | 07:25 |
kata-irc-bot | <leevn2011> Another interesting thing is that with `one pod per one VM` the memory footprint cost is actually very high. | 07:30 |
*** sameo has joined #kata-dev | 07:31 | |
kata-irc-bot | <fidencio> the memory footprint is around 300MB per pod, when using QEMU as VMM. | 07:34 |
kata-irc-bot | <leevn2011> Indeed, compared to a traditional QEMU VM, the memory footprint cost is high. @caoruidong Maybe frank can elaborate more? | 07:40 |
*** dklyle has quit IRC | 07:41 | |
*** sgarzare has joined #kata-dev | 08:07 | |
*** jodh has joined #kata-dev | 08:13 | |
*** snir has quit IRC | 08:21 | |
*** snir has joined #kata-dev | 08:22 | |
kata-irc-bot | <christophe> Since a VM is a pod, and you can run multiple containers per pod (see https://www.howtoforge.com/multi-container-pods-in-kubernetes/ on how to do it), why do you think we can't run multiple containers in a VM? | 08:34 |
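A minimal sketch of what christophe describes: a two-container pod scheduled onto the Kata runtime, so both containers end up inside the same VM (the pod sandbox). The RuntimeClass name `kata` is an assumption and may differ in your deployment.

```bash
# Hypothetical two-container pod; both containers share the same Kata VM (the pod sandbox).
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: kata-multi-container
spec:
  runtimeClassName: kata   # assumption: your RuntimeClass is called "kata"
  containers:
  - name: web
    image: nginx
  - name: sidecar
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
EOF

# Both containers should show up in the single pod:
kubectl get pod kata-multi-container -o jsonpath='{.status.containerStatuses[*].name}'
```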
kata-irc-bot | <christophe> If you tried it and it failed, I believe this is a bug and you should report it as such. | 08:35 |
*** fgiudici has joined #kata-dev | 08:36 | |
kata-irc-bot | <leevn2011> Oh thanks, that makes sense. According to the official documentation, in the case of docker, `kata-runtime` creates a single container per pod. https://github.com/kata-containers/documentation/blob/master/design/architecture.md | 08:57 |
*** davidgiluk has joined #kata-dev | 09:03 | |
kata-irc-bot | <caoruidong> Kata uses a VM for security; you have to pay something for that. Pod is a k8s concept; docker doesn't know about it. So with docker it's one container per pod | 09:54 |
kata-irc-bot | <christophe> @leevn2011 This is _in the case of docker_ because docker has no real idea of a pod. | 10:40 |
kata-irc-bot | <leevn2011> @christophe @caoruidong Thank you both for the clarification. It makes sense to me now! | 10:53 |
*** yyyeer has joined #kata-dev | 11:10 | |
yyyeer | ping eric | 11:11 |
yyyeer | https://github.com/kata-containers/kata-containers/issues/1171 | 11:12 |
yyyeer | we might decide which way to go | 11:12 |
kata-irc-bot | <christophe> FYI, silly experiment of the day: it looks like the lowest `default_memory` I can use and still boot `alpine` correctly is 184M. With that, `cat /proc/meminfo` gives me: ```MemTotal: 147616 kB MemFree: 34104 kB MemAvailable: 42972 kB``` I wonder why there is always a delta between the `default_memory` value and what we see as `MemTotal` in the guest. | 11:35 |
kata-irc-bot | <christophe> That same setup also boots `fedora` with: ```MemTotal: 147616 kB MemFree: 28292 kB MemAvailable: 41628 kB``` | 11:37 |
kata-irc-bot | <christophe> However, with `fedora`, you can't use `dnf install` with such a low memory footprint. It gets `Killed`. By contrast, `apk add emacs` in `alpine` works OK. | 11:38 |
*** yyyeer has quit IRC | 11:38 | |
kata-irc-bot | <christophe> ```[ 119.140604] Out of memory: Killed process 215 (dnf) total-vm:299452kB, anon-rss:28180kB, file-rss:8kB, shmem-rss:0kB, UID:0 pgtables:208kB oom_score_adj:0 [ 119.142402] oom_reaper: reaped process 215 (dnf), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB``` Will try with straight rpm. | 11:38 |
kata-irc-bot | <christophe> Straight `rpm` seems to work, although it means you need to find the URLs and dependencies manually. | 11:43 |
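A rough way to reproduce this experiment with containerd's `ctr`; a sketch only. The configuration path and the `io.containerd.kata.v2` runtime name are assumptions (the config may instead live under /usr/share/defaults/kata-containers/).

```bash
# Lower the VM's default memory to 184 MiB and inspect what the guest actually sees.
CFG=/etc/kata-containers/configuration.toml          # assumption: local override path
sudo sed -i 's/^default_memory = .*/default_memory = 184/' "$CFG"

sudo ctr image pull docker.io/library/alpine:latest
sudo ctr run --rm --runtime io.containerd.kata.v2 \
     docker.io/library/alpine:latest mem-probe \
     sh -c 'head -3 /proc/meminfo'
```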
kata-irc-bot | <fidencio> @christophe, out of curiosity, what exactly are you trying to measure? The minimal value for `default_memory` that would work for the most basic operations in the most common container images? | 11:49 |
kata-irc-bot | <christophe> Yes. | 13:01 |
kata-irc-bot | <fidencio> @christophe, and are you testing with our own kernel or with upstream kernel? | 13:08 |
kata-irc-bot | <christophe> On Fedora with the host kernel | 13:15 |
kata-irc-bot | <fidencio> *If* you have the time and the interest, it would be nice to know those numbers *also* with: • upstream kernel; • 2.x agent. Both things should bring that number down. | 13:20 |
kata-irc-bot | <wmoschet> Hi folks! I noticed that `make cri-containerd` has failed on some jobs. Is it a known issue? Is someone already working on a fix? | 13:44 |
kata-irc-bot | <fidencio> I guess it's part of the issues faced since last week, Wainer. | 13:47 |
kata-irc-bot | <wmoschet> @fidencio it might be. I noticed the URL to download the cri-containerd binaries is broken, so the script builds it from source. I'm not sure it is building the correct version, so this might be one of the problems | 13:50 |
kata-irc-bot | <fidencio> Also, I've noticed kube* not being installed, which would also cause a different failure | 13:51 |
kata-irc-bot | <wmoschet> @fidencio hmm... so, let's try to fix it? | 13:52 |
kata-irc-bot | <fidencio> Do you have cycles for that? | 13:52 |
kata-irc-bot | <fidencio> If so, please, go ahead. | 13:52 |
kata-irc-bot | <wmoschet> @fidencio yes, I am blocked by some PRs not getting merged...so... | 13:53 |
*** fuentess has joined #kata-dev | 13:56 | |
*** crobinso has joined #kata-dev | 14:05 | |
kata-irc-bot | <eric.ernst> Just to clarify one more thing @leevn2011 - we're looking at defense in depth here. | 15:40 |
kata-irc-bot | <eric.ernst> And in the pod case, we still want to provide the same isolation between containers within a single pod. | 15:41 |
kata-irc-bot | <eric.ernst> 300 MB seems high @fidencio? | 15:41 |
kata-irc-bot | <fidencio> As @christophe is investigating, it does depend on the guest kernel being used, but that's what I was able to benchmark with the distro kernel. | 15:44 |
kata-irc-bot | <christophe> @eric.ernst At the moment, roughly: 150-200M required for the VM payload, 30-50M of overhead for qemu, another 20M for shim/runtime and for virtiofsd. Some variations depending on kernel and qemu build, but these are the rough estimates. | 15:47 |
kata-irc-bot | <fidencio> @christophe, was your evaluation done with 1.x? (mine was). | 15:49 |
kata-irc-bot | <eric.ernst> this is with the releases packages (kernel/guest/qemu)? | 15:50 |
kata-irc-bot | <eric.ernst> (sorry, apparently i can only read up two messages at once -- I see -- distro kernel) | 15:50 |
kata-irc-bot | <wmoschet> https://github.com/kata-containers/tests/pull/3126 | 15:51 |
kata-irc-bot | <wmoschet> running some jobs...let's see | 15:51 |
kata-irc-bot | <eric.ernst> that number hurts, but i guess it depends on expectations. I would hope that we're on the order of 100 MB best case, and was concerned as we slipped to ~160. | 15:51 |
kata-irc-bot | <eric.ernst> You may run into issues if you try running a pod with container memory requests, as well. ie, you'll be limited on how much you can hotplug. | 15:53 |
kata-irc-bot | <eric.ernst> ...perhaps this should be for a different thread though... | 15:54 |
kata-irc-bot | <eric.ernst> overheads :thread: | 15:55 |
kata-irc-bot | <eric.ernst> I was looking at this a few weeks ago, as I was uncomfortable with the numbers. | 15:55 |
kata-irc-bot | <eric.ernst> virtiofsd was ~4 MB each (2 per pod); shim is ~25 MB; VMM ~147 MB | 15:56 |
kata-irc-bot | <eric.ernst> this was with shim-v2. | 15:56 |
kata-irc-bot | <eric.ernst> and with Kata 2.0 | 15:56 |
kata-irc-bot | <eric.ernst> with 1.x, we weren't building with PIE, so the shim was smaller. When PIE is enabled (it should be!), it is near 25 MB as well | 15:57 |
kata-irc-bot | <fidencio> And VMM being statically QEMU? | 15:57 |
kata-irc-bot | <eric.ernst> (PIE is expensive) | 15:57 |
kata-irc-bot | <eric.ernst> yes. | 15:57 |
kata-irc-bot | <eric.ernst> from dmesg: ```10242 K kernel code 1864 K rodata 501 K rwdata 852 K init Total of kernel stuff: ~13.1 MB``` | 15:57 |
kata-irc-bot | <eric.ernst> And then, looking more at costs in the guest: ```expected total memory: 1179648 (128MiB default + 1GiB for container) Memory total reported in /proc/meminfo: 1152084 Difference (presumable kernel structures??): 26.9 MB. ``` | 15:59 |
kata-irc-bot | <eric.ernst> (note, I wouldn't suggest a memory default of 128 MiB, it was just used during this measurement experiment) | 16:00 |
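For reference, the delta quoted above is just this arithmetic (no new data, only redoing the calculation from the numbers in the message):

```bash
# 128 MiB default_memory + 1 GiB hotplugged for the container, vs. MemTotal in the guest.
awk 'BEGIN {
    expected = (128 + 1024) * 1024      # 1179648 kB
    reported = 1152084                  # kB, from /proc/meminfo in the guest
    printf "unaccounted for: %.1f MiB\n", (expected - reported) / 1024   # ~26.9 MiB
}'
```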
kata-irc-bot | <eric.ernst> if folks have time, i think it'd be pretty useful to standardize on how we measure each item, and to look at improving. | 16:00 |
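One possible way to standardize the host-side part of that measurement; a sketch only, and the process names are assumptions (the QEMU binary name in particular varies between builds).

```bash
#!/usr/bin/env bash
# Sum proportional set size (PSS) per Kata component for the pods on this host.
# Requires /proc/<pid>/smaps_rollup (kernel >= 4.14) and root to read other users' maps.
set -euo pipefail

for name in qemu-system-x86_64 virtiofsd containerd-shim-kata-v2; do
    total_kb=0
    for pid in $(pgrep -f "$name" || true); do
        pss_kb=$(awk '/^Pss:/ {print $2}' "/proc/$pid/smaps_rollup" 2>/dev/null || echo 0)
        total_kb=$((total_kb + pss_kb))
    done
    printf '%-26s %5d MiB\n' "$name" $((total_kb / 1024))
done
```

PSS (rather than RSS) avoids double-counting pages shared between the two virtiofsd processes and the VMM.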
kata-irc-bot | <fidencio> Sure, we're in a meeting right now, which will take 51 more minutes. | 16:09 |
kata-irc-bot | <fidencio> In short, what I measured was far less specific than what you or @christophe are doing. What I ended up doing was programmatically starting a number of pods, checking (via prometheus) how much the memory went up, and redoing it several times. | 16:14 |
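Roughly, that coarse-grained approach looks like the sketch below when done with `free` on the worker node instead of Prometheus. The pod names, image, and the `kata` RuntimeClass are assumptions, and it has to run on the node where the pods land.

```bash
#!/usr/bin/env bash
# Start N Kata pods, then divide the node's memory growth by N.
set -euo pipefail
N=${N:-10}

used_kb() { free -k | awk '/^Mem:/ {print $3}'; }

before=$(used_kb)
for i in $(seq 1 "$N"); do
    kubectl run "kata-probe-$i" --image=busybox --restart=Never \
        --overrides='{"apiVersion":"v1","spec":{"runtimeClassName":"kata"}}' -- sleep 3600
done
for i in $(seq 1 "$N"); do
    kubectl wait --for=condition=Ready "pod/kata-probe-$i" --timeout=180s
done
sleep 30   # let the guests and agents settle
after=$(used_kb)

echo "approx. overhead per pod: $(( (after - before) / N / 1024 )) MiB"
```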
kata-irc-bot | <eric.ernst> Yeah, which makes sense for first pass. I just want to understand the exact breakdown, so we can begin to investigate improvements, and understand where we are ... sub .. optimal. | 16:21 |
*** dklyle has joined #kata-dev | 16:52 | |
*** dklyle has quit IRC | 17:30 | |
*** dklyle has joined #kata-dev | 17:30 | |
*** sgarzare has quit IRC | 17:59 | |
*** jodh has quit IRC | 18:00 | |
*** fgiudici has quit IRC | 18:28 | |
kata-irc-bot | <christophe> Also, note that we have a non-technical constraint: in our case, we need to run as well as possible with the host kernel. | 18:55 |
kata-irc-bot | <fidencio> @jose.carlos.venegas.m, @gabriela.cervantes.te, @wmoschet, Seems that sonobuoy timeout failures have been increasing a lot. Are those timeouts counted from the time the k8s cluster is up, or do they take the whole job time into consideration? | 19:25 |
kata-irc-bot | <wmoschet> @fidencio I glanced at this problem...IIUC at some point the test script runs `sonobuoy run (...)` and the timeout happens on that operation. So I think it is a sonobuoy timeout | 19:27 |
kata-irc-bot | <wmoschet> supposedly sonobuoy by default should wait hours before timing out...but it seems to wait only a few minutes | 19:27 |
kata-irc-bot | <fidencio> Ack, I see it here. Let me increase the time and see if it helps. | 19:31 |
kata-irc-bot | <fidencio> @wmoschet, as you have to rerun the tests anyway in your last PR about the skipping test, mind also adding the following patch before rerunning? ```fidencio@machado ~/go/src/github.com/kata-containers/tests $ git diff
diff --git a/integration/kubernetes/e2e_conformance/run.sh b/integration/kubernetes/e2e_conformance/run.sh
index 36f2e40d..a8390a2d 100755
--- a/integration/kubernetes/e2e_conformance/run.sh
+++ b/integration/kubernetes/e2e_conformance/run.sh
@@ -29,7 +29,7 @@
 MINIMAL_CONTAINERD_K8S_E2E="${MINIMAL_CONTAINERD_K8S_E2E:-false}"
 KATA_HYPERVISOR="${KATA_HYPERVISOR:-}"
 # Overall Sonobuoy timeout in minutes.
-WAIT_TIME=${WAIT_TIME:-180}
+WAIT_TIME=${WAIT_TIME:-300}
 JOBS_FILE="${SCRIPT_PATH}/e2e_k8s_jobs.yaml"``` | 19:33 |
kata-irc-bot | <wmoschet> @fidencio sure. btw, I saw that code... 180 is supposed to be in hours but I believe it is in seconds | 19:39 |
kata-irc-bot | <fidencio> I also thought that's in seconds | 19:40 |
kata-irc-bot | <fidencio> let me re-check | 19:40 |
kata-irc-bot | <fidencio> Yep, the wait is actually in minutes. | 19:46 |
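If anyone wants to double-check the unit, the flag's own help text should settle it (assuming `sonobuoy` is on the PATH):

```bash
# Sonobuoy documents --wait in minutes, which matches the
# "Overall Sonobuoy timeout in minutes" comment in run.sh.
sonobuoy run --help 2>&1 | grep -A1 -- '--wait'
```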
kata-irc-bot | <fidencio> Well, the problem is not exactly the sonobuoy timeout for finishing the run ... take a look ... ```20:10:30 + echo 'running: sonobuoy run --wait=180 --e2e-focus="ConfigMap should be consumable from pods in volume|ConfigMap should be consumable from pods in volume as non-root|Kubelet when scheduling a busybox command that always fails in a pod should be possible to delete|Kubectl client Kubectl apply should reuse port when apply to an existing SVC"'
20:10:30 running: sonobuoy run --wait=180 --e2e-focus="ConfigMap should be consumable from pods in volume|ConfigMap should be consumable from pods in volume as non-root|Kubelet when scheduling a busybox command that always fails in a pod should be possible to delete|Kubectl client Kubectl apply should reuse port when apply to an existing SVC"
20:10:30 + eval 'sonobuoy run --wait=180 --e2e-focus="ConfigMap should be consumable from pods in volume|ConfigMap should be consumable from pods in volume as non-root|Kubelet when scheduling a busybox command that always fails in a pod should be possible to delete|Kubectl client Kubectl apply should reuse port when apply to an existing SVC"'
20:10:30 ++ sonobuoy run --wait=180 '--e2e-focus=ConfigMap should be consumable from pods in volume|ConfigMap should be consumable from pods in volume as non-root|Kubelet when scheduling a busybox command that always fails in a pod should be possible to delete|Kubectl client Kubectl apply should reuse port when apply to an existing SVC'
20:10:30 time="2020-12-16T19:10:30Z" level=info msg="created object" name=sonobuoy namespace= resource=namespaces
20:10:30 time="2020-12-16T19:10:30Z" level=info msg="created object" name=sonobuoy-serviceaccount namespace=sonobuoy resource=serviceaccounts
20:10:30 time="2020-12-16T19:10:30Z" level=info msg="created object" name=sonobuoy-serviceaccount-sonobuoy namespace= resource=clusterrolebindings
20:10:30 time="2020-12-16T19:10:30Z" level=info msg="created object" name=sonobuoy-serviceaccount-sonobuoy namespace= resource=clusterroles
20:10:30 time="2020-12-16T19:10:30Z" level=info msg="created object" name=sonobuoy-config-cm namespace=sonobuoy resource=configmaps
20:10:30 time="2020-12-16T19:10:30Z" level=info msg="created object" name=sonobuoy-plugins-cm namespace=sonobuoy resource=configmaps
20:10:31 time="2020-12-16T19:10:31Z" level=info msg="created object" name=sonobuoy namespace=sonobuoy resource=pods
20:10:32 time="2020-12-16T19:10:31Z" level=info msg="created object" name=sonobuoy-aggregator namespace=sonobuoy resource=services
20:11:51 time="2020-12-16T19:11:42Z" level=error msg="error attempting to run sonobuoy: waiting for run to finish: failed to get status: failed to get namespace sonobuoy: etcdserver: request timed out"``` The problem here is the timeout hit while starting sonobuoy; it seems to time out because of etcdserver. @eric.ernst, any idea why it may be timing out like this? | 19:53 |
kata-irc-bot | <fidencio> I remember facing similar issues when the VM I use for testing happens to lack resources | 19:55 |
kata-irc-bot | <fidencio> @jose.carlos.venegas.m, do we happen to have an easy way to increase the number of vCPUs / RAM in the guest? | 19:55 |
kata-irc-bot | <jose.carlos.venegas.m> @fidencio I am not familiar with how to do it, @salvador.fuentes is the one | 19:56 |
kata-irc-bot | <jose.carlos.venegas.m> I think we have to create a new machine alias in jenkins to point to a new VM machine type | 19:57 |
kata-irc-bot | <fidencio> @jose.carlos.venegas.m, Could we have the -k8s-minimal tests not be blockers for now? | 19:57 |
kata-irc-bot | <fidencio> Those errors started happening when a new version of Ubuntu was deployed, didn't they? I wonder if the resource consumption increased | 19:58 |
kata-irc-bot | <fidencio> A way to debug would be to create one of those VMs, manually run the process, and see how stressed the system is. | 19:58 |
kata-irc-bot | <fidencio> > Those errors started happening when a new version of Ubuntu was deployed, didn't they? I wonder if the resource consumption increased Hmm. Let me take this out. Fedora faces exactly the same issue. | 20:02 |
kata-irc-bot | <jose.carlos.venegas.m> so not distro specific ? | 20:02 |
kata-irc-bot | <jose.carlos.venegas.m> can you share a job URL with me so we can see if we can find something more? | 20:03 |
kata-irc-bot | <fidencio> I'd say it's not distro specific, but seems to happen more with Ubuntu. | 20:03 |
kata-irc-bot | <fidencio> http://jenkins.katacontainers.io/job/kata-containers-2.0-tests-ubuntu-PR-containerd-k8s-minimal/228/ | 20:03 |
kata-irc-bot | <fidencio> http://jenkins.katacontainers.io/job/kata-containers-2.0-tests-fedora-PR-crio-k8s-e2e-minimal/27/ | 20:03 |
kata-irc-bot | <jose.carlos.venegas.m> ```20:11:51 time="2020-12-16T19:11:42Z" level=error msg="error attempting to run sonobuoy: waiting for run to finish: failed to get status: failed to get namespace sonobuoy: etcdserver: request timed out"``` Looks like a very early setup fail | 20:04 |
kata-irc-bot | <jose.carlos.venegas.m> thx | 20:04 |
kata-irc-bot | <jose.carlos.venegas.m> also just seen in 2.0, correct? | 20:04 |
kata-irc-bot | <fidencio> Yes, just seen in 2.0 | 20:05 |
kata-irc-bot | <jose.carlos.venegas.m> ok, just from googling quickly, at least based on https://github.com/kubernetes-sigs/kind/issues/717 | 20:11 |
kata-irc-bot | <jose.carlos.venegas.m> it seems this could happen depending on resources | 20:11 |
kata-irc-bot | <jose.carlos.venegas.m> and this is happening randomly | 20:12 |
kata-irc-bot | <jose.carlos.venegas.m> at that point kata should not be a blocker, so it looks like an env/setup issue | 20:12 |
kata-irc-bot | <fidencio> Exactly, it's too early in the process. | 20:14 |
*** davidgiluk has quit IRC | 20:17 | |
*** sameo has quit IRC | 20:33 | |
kata-irc-bot | <archana.m.shinde> @fidencio @jose.carlos.venegas.m Yes, I have seen those errors myself when running on a smaller machine | 21:43 |
kata-irc-bot | <archana.m.shinde> searching on the web does point to resources running out | 21:44 |
kata-irc-bot | <archana.m.shinde> could be due to a slower disk | 21:44 |
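A hypothetical way to check the slow-disk theory on one of those CI VMs; the etcd certificate paths below assume a kubeadm-style layout and may differ on the CI images.

```bash
# Watch I/O wait and per-device latency while a job is running.
vmstat 5 3            # "wa" column = time spent waiting on I/O
iostat -x 5 3         # high await / %util points at the disk

# etcd exposes disk-latency histograms; slow fsyncs line up with
# "etcdserver: request timed out" errors.
curl -sk https://127.0.0.1:2379/metrics \
     --cert /etc/kubernetes/pki/etcd/server.crt \
     --key  /etc/kubernetes/pki/etcd/server.key |
  grep -E 'etcd_disk_(wal_fsync|backend_commit)_duration_seconds_(sum|count)'
```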
kata-irc-bot | <archana.m.shinde> @salvador.fuentes Want to make sure we are not running out of space on those VMs | 21:45 |
kata-irc-bot | <archana.m.shinde> @jose.carlos.venegas.m That issue does mention increasing the inotify limit `sysctl -w fs.inotify.max_user_watches=524288` | 21:48 |
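For completeness, a sketch of applying that bump immediately and persisting it across reboots; the value is the one from the kind issue, not something tuned for this CI.

```bash
sudo sysctl -w fs.inotify.max_user_watches=524288
echo 'fs.inotify.max_user_watches=524288' | sudo tee /etc/sysctl.d/99-inotify.conf
sudo sysctl --system   # re-read all sysctl config files
```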
kata-irc-bot | <jose.carlos.venegas.m> @archana.m.shinde @fidencio hey sorry, I got distracted. I think we need to run in an azure VM to identify the real cause | 21:50 |
kata-irc-bot | <jose.carlos.venegas.m> I can spend some time on it tomorrow in case you are not looking into it now | 21:50 |
*** dklyle has quit IRC | 22:04 | |
*** dklyle has joined #kata-dev | 22:21 | |
*** crobinso has quit IRC | 22:52 | |
*** fuentess has quit IRC | 23:23 | |
kata-irc-bot | <salvador.fuentes> @fidencio, @jose.carlos.venegas.m, @archana.m.shinde, @eric.ernst hey, taking into consideration @archana.m.shinde's comment that the issue could be due to a slow disk, I went through the azure configuration and checked a flag to use an Ephemeral OS disk (according to azure, they provide lower I/O latency) and have restarted some jobs. I see that the 3 jobs I restarted have all passed: | 23:31 |
kata-irc-bot | http://jenkins.katacontainers.io/job/kata-containers-2.0-tests-ubuntu-PR-containerd-k8s-minimal/. It seems that this helped, but I still want to be cautious and see how they behave between today and tomorrow. | 23:31 |
kata-irc-bot | <archana.m.shinde> great @salvador.fuentes. Yeah, let's monitor them over a day | 23:33 |
kata-irc-bot | <fidencio> @salvador.fuentes thanks a lot for taking a look into this! | 23:50 |
kata-irc-bot | <liubin0329> Glad to see some jobs have succeeded | 23:54 |