*** dklyle has joined #kata-dev | 01:32 | |
kata-irc-bot | <mnaser> @salvador.fuentes http://jenkins.katacontainers.io/job/kata-containers-tests-fedora-28-master/1/console :slightly_smiling_face: | 01:34 |
---|---|---|
kata-irc-bot | <mnaser> clarkb: i'm slowly rolling out this updated kernel across all of sjc1, so i think it would be a good time to pick up zuul work again | 01:35 |
kata-irc-bot | <mnaser> @salvador.fuentes looks like job failed but i dunno why.. but it works. | 01:59 |
kata-irc-bot | <salvador.fuentes> thanks @mnaser, I'll check the error | 03:30 |
*** dklyle has quit IRC | 03:58 | |
kata-irc-bot | <eric.ernst> @fupan @bergwolf - we have any arch docs for containerd-v2-shim yet? | 04:06 |
kata-irc-bot | <eric.ernst> whether in github, gdoc, or preferably .md? | 04:06 |
kata-irc-bot | <eric.ernst> @xu ^^ | 04:07 |
kata-irc-bot | <fupan> Currently we have a doc about how to deploy and run containerd-shim-kata-v2 with containerd and cri | 04:17 |
kata-irc-bot | <fupan> https://gist.github.com/gnawux/d06c34b845aa3350799cbeaeb3c1270e | 04:17 |
kata-irc-bot | <eric.ernst> thanks @fupan | 04:28 |
kata-irc-bot | <eric.ernst> what are gaps between what's on hyperhq and what's on kata-containers/runtime now? | 04:29 |
kata-irc-bot | <eric.ernst> should be at parity with the PR (finally!!!) merging? | 04:30 |
kata-irc-bot | <eric.ernst> I'd like to get this up tomorrow, though perhaps using runtimeClass and 1.12 | 04:30 |
kata-irc-bot | <eric.ernst> btw, nice job on the prep for the kata, v2 shim, gvisor talk @fupan | 04:31 |
kata-irc-bot | <fupan> By now there is almost | 04:36 |
kata-irc-bot | <fupan> By now there isn’t any gap between hyperhq and kata/runtime , and what’s merging into kata/runtime is also the latest updates in hyperhq. | 04:39 |
kata-irc-bot | <fupan> By now there is only one PR related to shimv2 hasn’t been merged. https://github.com/kata-containers/runtime/pull/940 | 04:40 |
*** eernst has joined #kata-dev | 04:45 | |
*** eernst has quit IRC | 04:55 | |
kata-irc-bot | <eric.ernst> ok, thanks @fupan | 05:39 |
*** jodh has joined #kata-dev | 06:51 | |
*** zerocoolback has joined #kata-dev | 07:10 | |
*** zerocoolback has quit IRC | 07:12 | |
*** shrasool has quit IRC | 07:18 | |
*** sameo has joined #kata-dev | 08:13 | |
*** dims has quit IRC | 08:32 | |
*** dims has joined #kata-dev | 08:33 | |
*** gwhaley has joined #kata-dev | 08:59 | |
*** davidgiluk has joined #kata-dev | 09:06 | |
kata-irc-bot | <graham.whaley> @mnaser @salvador.fuentes back on the boot speed with atomic - your 3.7s is on the high end of boot times that we see - but then, I normally measure the time 'into the workload', as your time also has the container shutdown time (which is debatable if anybody is really interested in ;) ). @mnaser, you could run up the metrics report (https://github.com/kata-containers/tests/tree/master/metrics/report) to compare the two - if you | 09:24 |
kata-irc-bot | are only interested in boot times and maybe footprint, use the '-t -d' options on the grabdata script. You end up with a PDF report showing you a comparison, and you get some more details of breakdown. | 09:24 |
kata-irc-bot | <graham.whaley> btw, great news on the debug and kernel update everybody - woot | 09:25 |
davidgiluk | graham.whaley: I guess something like container-shutdown-time probably shows up indirectly in something like a container density measurement | 09:43 |
*** shrasool has joined #kata-dev | 09:44 | |
gwhaley | davidgiluk: I guess if we were doing some sort of rolling dynamic density test, yes it could have an effect (if you launch faster than you die for instance). Right now we don't have a test like that :-) our density tests are static (run n containers, take measures, do math...) | 09:45 |
gwhaley | when most folks talk about 'boot speed' though, what they really care about is time to get to the workload. So, when somebody measures 'time docker run busybox true', they are also measuring the quit time, so they skew the numbers a little. | 09:46 |
davidgiluk | yep | 09:46 |
gwhaley | the report tool tries to generate a nice little graph with a breakdown for us - nicely somebody posted a snippet on a PR last night, so we have a handy example: https://github.com/kata-containers/runtime/pull/768#issuecomment-442613869 | 09:47 |
davidgiluk | that's pretty | 09:51 |
davidgiluk | I wonder if it's worth running 'systemd-analyze blame' as the task to see if it says where the time is going in the systemd world | 09:53 |
gwhaley | I have a feeling @devimc has done that before in the past. we've also used bootchart to have a look - but, I don't think we've done that and had an optimisation cycle for a while | 09:53 |
gwhaley | also, for fun, you'd want to run the systemd analyse inside the VM but not inside the container (in that 'little space' that is the mini-OS sat in the VM around the container)... that is a fun space to try and debug ;-) We have a doc on the repos somewhere describing how to gain yourself a console to that space.. | 09:54 |
davidgiluk | yeh at one point I did have that debug shell working | 10:02 |
davidgiluk | but then I rebuilt the rootfs/initrd and lost the magic change | 10:03 |
*** shrasool has quit IRC | 10:24 | |
*** lpetrut has joined #kata-dev | 10:38 | |
*** gwhaley has quit IRC | 12:09 | |
*** gwhaley has joined #kata-dev | 13:12 | |
*** shrasool has joined #kata-dev | 13:34 | |
*** shrasool has quit IRC | 13:38 | |
kata-irc-bot | <mnaser> @salvador.fuentes i see a job that seems to have passed under fedora 28? :slightly_smiling_face: | 13:38 |
kata-irc-bot | <salvador.fuentes> @mnaser, yes it passed :slightly_smiling_face: :tada: | 13:40 |
kata-irc-bot | <salvador.fuentes> thanks | 13:40 |
*** shrasool has joined #kata-dev | 13:50 | |
kata-irc-bot | <mnaser> would anyone have some free cycles to throw questions at in order to try and get magnum integrated with kata? | 14:27 |
kata-irc-bot | <mnaser> i've made ok progress in terms of getting it installed, but some of the containers (pause) launch, but the others are not launching | 14:27 |
kata-irc-bot | <mnaser> maybe just even a pointer to where i can get logs to whats happening | 14:28 |
kata-irc-bot | <mnaser> they just sit in "Created" status in `docker ps` | 14:28 |
*** kailun has quit IRC | 14:29 | |
gwhaley | @mnaser: if you enable debug in the kata toml config file, then you get reams of stuff in the journal, which you can then extract and post on an Issue. | 14:31 |
gwhaley | https://github.com/kata-containers/documentation/blob/master/Developer-Guide.md#enable-full-debug | 14:32 |
kata-irc-bot | <mnaser> gwhaley: lovely, thank you | 14:32 |
gwhaley | btw, watch out for any journal rate limits you have - let me find the note for that as well (otherwise we lose debug ;-)) | 14:33 |
gwhaley | oh, it is the second block on that link above ;-) | 14:33 |
kata-irc-bot | <mnaser> `level=error msg="Create container failed with error: oci runtime error: Error bridging virtual endpoint"` | 14:34 |
kata-irc-bot | <mnaser> i probably missed that before | 14:34 |
kata-irc-bot | <mnaser> https://github.com/kata-containers/runtime/blob/master/virtcontainers/veth_endpoint.go#L88-L97 | 14:35 |
kata-irc-bot | <mnaser> seems like the veth attach is somehow not happening right | 14:35 |
gwhaley | probably want @amshinde @archana on that - maybe open an Issue on github | 14:36 |
kata-irc-bot | <mnaser> https://github.com/kata-containers/runtime/blob/23e75f0f03c7357cec6c77f904e610ad37d1d179/virtcontainers/network.go#L504-L534 | 14:37 |
*** fuentess has joined #kata-dev | 14:37 | |
kata-irc-bot | <mnaser> let me see what type of interface its trying | 14:37 |
kata-irc-bot | <mnaser> internetworking_model="macvtap" | 14:39 |
kata-irc-bot | <mnaser> i guess that's the default | 14:39 |
kata-irc-bot | <mnaser> ```Nov 29 14:40:04 my-cluster-pojfxyiqeafi-minion-0.vexxhost.local kata-runtime[28165]: time="2018-11-29T14:40:04.702259779Z" level=error msg="Error bridging virtual endpoint" arch=amd64 command=create container=85a88894c36f6a686088b28ccb2de815dc719c93643b6736b94b9889aabe720a error="Could not create TAP interface: LinkAdd() failed for macvtap name tap2_kata: file exists" name=kata-runtime pid=28165 source=virtcontainers subsystem=network | 14:42 |
kata-irc-bot | Nov 29 14:40:04 my-cluster-pojfxyiqeafi-minion-0.vexxhost.local kata-runtime[28165]: time="2018-11-29T14:40:04.70235696Z" level=error msg="Could not create TAP interface: LinkAdd() failed for macvtap name tap2_kata: file exists" arch=amd64 command=create container=85a88894c36f6a686088b28ccb2de815dc719c93643b6736b94b9889aabe720a name=kata-runtime pid=28165 source=runtime ``` | 14:42 |
kata-irc-bot | <mnaser> i wonder if its not properly cleaning up macvtap devices | 14:42 |
kata-irc-bot | <mnaser> i wonder if it failed to launch at some point and didn't clean up properly | 14:44 |
gwhaley | @mnaser: definitely a possibility. If you open an issue, please detail what you ran - and, as nobody else probably has atomic set up, maybe some ideas about how many containers of what sort it was launching and if in parallel etc. | 14:46 |
kata-irc-bot | <mnaser> gwhaley: i'm just going to try and delete all CNIs and containers first, just to see if there is a root cause *before* this | 14:46 |
kata-irc-bot | <mnaser> aka something happened which resulted in things not cleaning up properly in the first place | 14:46 |
kata-irc-bot | <mnaser> ``` time="2018-11-29T14:48:23.065660185Z" level=error msg="Error bridging virtual endpoint" arch=amd64 command=create container=04701f1e4e6c8326d109cd18eb448a2eb94bdc7e246c932e415c51a720a91f4b error="Could not get veth interface: tap0_kata: Incorrect link type macvtap, expecting veth" name=kata-runtime pid=31008 source=virtcontainers subsystem=network``` | 14:49 |
kata-irc-bot | <mnaser> and then later | 14:50 |
kata-irc-bot | <mnaser> ```time="2018-11-29T14:48:40.386621979Z" level=error msg="Could not create TAP interface: LinkAdd() failed for macvtap name tap0_kata: file exists" arch=amd64 command=create container=0f1f8fd314c61d0b81be833a4114681f9eaadeb8c7151c2c4a3f05bcd1757611 name=kata-runtime pid=31838 source=runtime``` | 14:50 |
kata-irc-bot | <mnaser> it almost looks like its a race condition where multiple containers go up at the same time and the tap indexes get mucked up | 14:51 |
*** lpetrut has quit IRC | 14:58 | |
gwhaley | mnaser - that would have been one of my guesses - hence the note about noting parallelism... ;-) yeah, we need @amshinde on the case I think | 14:58 |
kata-irc-bot | <mnaser> prelim https://github.com/kata-containers/runtime/issues/953 | 15:00 |
kata-irc-bot | <graham.whaley> @archana.m.shinde ^^ | 15:02 |
kata-irc-bot | <mnaser> woo | 15:07 |
kata-irc-bot | <mnaser> i think i got a reproducer | 15:07 |
kata-irc-bot | <mnaser> added a reproducer to the bug, it'd be nice if someone can confirm on their side too | 15:14 |
kata-irc-bot | <mnaser> not sure where to file this bug, but 1.4 doesnt seem to be in https://build.opensuse.org/project/show/home:katacontainers:release | 15:18 |
*** sameo has quit IRC | 15:30 | |
*** shrasool has quit IRC | 15:36 | |
*** sameo has joined #kata-dev | 15:50 | |
*** dklyle has joined #kata-dev | 16:01 | |
kata-irc-bot | <salvador.fuentes> @jose.carlos.venegas.m ^ | 16:11 |
kata-irc-bot | <jose.carlos.venegas.m> @mnaser: @salvador.fuentes: that is true, let me update it there | 16:13 |
kata-irc-bot | <mnaser> ok so i'm not seeing the netmon process go up | 16:31 |
kata-irc-bot | <mnaser> which can explain why things werent working | 16:31 |
gwhaley | mnaser: not sure if the netmon only landed in 1.4, and you are on 1.3 I think - but, others may know for sure. | 16:32 |
kata-irc-bot | <mnaser> netmon seems disabled by default | 16:32 |
kata-irc-bot | <mnaser> it is in 1.3 | 16:32 |
kata-irc-bot | <mnaser> ```[netmon] # If enabled, the network monitoring process gets started when the # sandbox is created. This allows for the detection of some additional # network being added to the existing network namespace, after the # sandbox has been created. # (default: disabled) #enable_netmon = true``` | 16:33 |
*** sameo has quit IRC | 16:35 | |
*** fiddletwix has joined #kata-dev | 16:43 | |
*** eernst has joined #kata-dev | 17:03 | |
*** david-lyle has joined #kata-dev | 17:04 | |
*** dklyle has quit IRC | 17:05 | |
kata-irc-bot | <eric.ernst> Extra motivation for passing (semi)static files over VSOCK and eliminating requirement for 9p: https://nabla-containers.github.io/2018/11/28/fs/ | 17:15 |
davidgiluk | hm, now would we have hit the same thing in our world | 17:24 |
*** sameo has joined #kata-dev | 17:25 | |
*** sameo has quit IRC | 17:32 | |
*** sameo has joined #kata-dev | 17:40 | |
*** jodh has quit IRC | 18:01 | |
*** david-lyle has quit IRC | 18:15 | |
*** gwhaley has quit IRC | 18:18 | |
*** eernst has quit IRC | 18:40 | |
*** eernst has joined #kata-dev | 18:42 | |
*** dklyle has joined #kata-dev | 18:45 | |
*** fiddletwix has quit IRC | 18:58 | |
*** lpetrut has joined #kata-dev | 18:58 | |
kata-irc-bot | <mike> is there a best practice on mitigating things like this on the host? strict seccomp? or would that break kata? pretty new to using it | 19:01 |
*** eernst has quit IRC | 19:07 | |
*** dklyle has quit IRC | 19:07 | |
*** fuentess has quit IRC | 19:08 | |
*** eernst has joined #kata-dev | 19:13 | |
*** eernst has quit IRC | 19:18 | |
*** eernst has joined #kata-dev | 19:19 | |
*** eernst has quit IRC | 19:21 | |
*** shrasool has joined #kata-dev | 19:26 | |
*** sameo has quit IRC | 19:41 | |
*** shrasool has quit IRC | 19:48 | |
*** shrasool has joined #kata-dev | 19:49 | |
*** shrasool has quit IRC | 19:51 | |
*** fiddletwix has joined #kata-dev | 20:04 | |
*** shrasool has joined #kata-dev | 20:18 | |
*** eernst has joined #kata-dev | 20:19 | |
*** shrasool has quit IRC | 20:19 | |
*** davidgiluk has quit IRC | 20:20 | |
*** eernst has quit IRC | 20:21 | |
kata-irc-bot | <eric.ernst> I think moving to block device is one major step | 20:21 |
kata-irc-bot | <eric.ernst> I need to look more re: other actual mitigations | 20:22 |
*** eernst has joined #kata-dev | 20:25 | |
*** eernst has quit IRC | 20:29 | |
*** fuentess has joined #kata-dev | 20:30 | |
*** dklyle has joined #kata-dev | 21:10 | |
*** dklyle has quit IRC | 21:39 | |
*** fuentess has quit IRC | 22:12 | |
*** dklyle has joined #kata-dev | 22:19 | |
*** dklyle has quit IRC | 22:24 | |
*** dklyle has joined #kata-dev | 22:32 | |
*** dklyle has quit IRC | 22:44 | |
*** lpetrut has quit IRC | 23:07 | |
*** eernst has joined #kata-dev | 23:17 | |
*** eernst has quit IRC | 23:22 | |
*** dhellmann_ has joined #kata-dev | 23:26 | |
*** eernst has joined #kata-dev | 23:26 | |
*** dhellmann has quit IRC | 23:26 | |
*** eernst has quit IRC | 23:27 | |
*** dhellmann_ is now known as dhellmann | 23:30 |
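
Note on the journal output pasted at 14:42 and 14:50: the kata-runtime lines are `key=value` records with double-quoted values, so fields such as `msg`, `container` and `pid` can be pulled out mechanically before posting them on an Issue. The following is a minimal sketch, not a Kata tool; the function name `parse_logfmt` is made up for illustration, and it assumes values never contain unbalanced quotes. The sample line is the one pasted at 14:50.

```python
import shlex

# Sample record copied from the 14:50 message in this log.
SAMPLE = (
    'time="2018-11-29T14:48:40.386621979Z" level=error '
    'msg="Could not create TAP interface: LinkAdd() failed for macvtap '
    'name tap0_kata: file exists" arch=amd64 command=create '
    'container=0f1f8fd314c61d0b81be833a4114681f9eaadeb8c7151c2c4a3f05bcd1757611 '
    'name=kata-runtime pid=31838 source=runtime'
)


def parse_logfmt(line):
    """Split one log line into a dict of its key=value fields.

    Hypothetical helper: shlex handles the double-quoted values, and any
    token without '=' (for example the journal timestamp and host prefix)
    is ignored.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value
    return fields


if __name__ == "__main__":
    record = parse_logfmt(SAMPLE)
    print(record["level"], record["pid"], record["msg"])
```

Filtering on `level == "error"` and grouping by `container` would be one way to see which sandbox hit which failure first.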
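
Note on the 14:51 hypothesis (several containers coming up at once and the tap indexes colliding): the toy model below shows how a check-then-create sequence can yield a "file exists" error of the kind seen in the log. It is only a sketch of the hypothesis, not Kata's code. `FakeNetns`, `first_free_tap_name` and `add_with_retry` are hypothetical names, and picking the first free `tapN_kata` index is an assumption about the naming scheme, not something taken from the runtime source. The "Incorrect link type macvtap, expecting veth" error from 14:49 is not modelled.

```python
class FakeNetns:
    """In-memory stand-in for a network namespace's link table (hypothetical)."""

    def __init__(self):
        self.links = {}

    def link_add(self, name, kind):
        # Mimic a duplicate link name being refused with "file exists".
        if name in self.links:
            raise FileExistsError(
                f"LinkAdd() failed for {kind} name {name}: file exists"
            )
        self.links[name] = kind


def first_free_tap_name(netns):
    """Return the lowest tapN_kata name not present yet (assumed scheme)."""
    idx = 0
    while f"tap{idx}_kata" in netns.links:
        idx += 1
    return f"tap{idx}_kata"


def racy_interleaving():
    """Replay the bad ordering: both creators pick a name before either adds."""
    netns = FakeNetns()
    name_a = first_free_tap_name(netns)
    name_b = first_free_tap_name(netns)  # B checks before A has created
    netns.link_add(name_a, "macvtap")
    try:
        netns.link_add(name_b, "macvtap")
    except FileExistsError as err:
        return str(err)
    return None


def add_with_retry(netns, kind, attempts=8):
    """One possible mitigation: treat 'exists' as 'pick again', not as fatal."""
    for _ in range(attempts):
        name = first_free_tap_name(netns)
        try:
            netns.link_add(name, kind)
            return name
        except FileExistsError:
            continue
    raise RuntimeError("no free tap name found")


if __name__ == "__main__":
    print(racy_interleaving())
```

Whether the real failure follows this ordering is exactly what the reproducer attached to issue 953 would need to confirm.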