*** sameo has quit IRC | 00:00 | |
kata-irc-bot | <eric.ernst> On 1.x branch I'm seeing a lot of CI failures for docker-ce as well. Not sure if that's a broken package now? Anyone able to reproduce / or seeing this? | 02:03 |
---|---|---|
kata-irc-bot | <eric.ernst> @bergwolf @liubin0329 @archana.m.shinde ^^ ? | 02:03 |
kata-irc-bot | <liubin0329> Not only 1.x; 2.0 is also failing many tests. | 05:13 |
kata-irc-bot | <liubin0329> for example https://github.com/kata-containers/runtime/pull/3079 | 05:15 |
kata-irc-bot | <liubin0329> The same in 2.0 https://github.com/kata-containers/kata-containers/pull/1186 | 05:31 |
kata-irc-bot | <liubin0329> and the same error ```dpkg: error processing package docker-ce (--configure): installed docker-ce package post-installation script subprocess returned error exit status 1``` | 05:32 |
*** pcaruana has joined #kata-dev | 06:19 | |
*** sameo has joined #kata-dev | 06:46 | |
*** dklyle has quit IRC | 07:37 | |
*** fgiudici has joined #kata-dev | 08:12 | |
*** ailan__ has joined #kata-dev | 08:30 | |
*** jodh has joined #kata-dev | 08:34 | |
*** ailan has joined #kata-dev | 08:46 | |
*** ailan__ has quit IRC | 08:46 | |
kata-irc-bot | <liubin0329> I ignored the install error, and the CI passed. https://github.com/kata-containers/tests/pull/3107 | 08:54 |
kata-irc-bot | <liubin0329> So I think something went wrong, but it did not prevent Docker from working correctly. | 08:56 |
*** davidgiluk has joined #kata-dev | 09:07 | |
*** sgarzare has joined #kata-dev | 09:17 | |
*** ailan has quit IRC | 09:24 | |
*** ailan has joined #kata-dev | 09:24 | |
*** sgarzare has quit IRC | 10:19 | |
*** th0din has quit IRC | 10:26 | |
*** sgarzare has joined #kata-dev | 10:29 | |
*** Yarboa has quit IRC | 10:38 | |
*** th0din has joined #kata-dev | 10:49 | |
*** Yarboa has joined #kata-dev | 10:50 | |
*** Yarboa has quit IRC | 10:56 | |
*** davidgiluk has quit IRC | 11:10 | |
*** Yarboa has joined #kata-dev | 11:13 | |
*** th0din has quit IRC | 11:23 | |
*** davidgiluk has joined #kata-dev | 11:24 | |
*** ailan has quit IRC | 11:33 | |
*** ailan has joined #kata-dev | 11:34 | |
*** th0din has joined #kata-dev | 11:34 | |
*** Yarboa has quit IRC | 11:38 | |
*** Yarboa has joined #kata-dev | 11:40 | |
*** davidgiluk has quit IRC | 11:43 | |
*** davidgiluk has joined #kata-dev | 11:58 | |
*** ailan has quit IRC | 12:33 | |
*** devimc has joined #kata-dev | 13:22 | |
*** Yarboa has quit IRC | 13:27 | |
*** Yarboa has joined #kata-dev | 13:29 | |
*** fuentess has joined #kata-dev | 13:54 | |
*** crobinso has joined #kata-dev | 14:10 | |
*** Yarboa has quit IRC | 14:37 | |
*** tobberydberg has quit IRC | 14:38 | |
*** tobberydberg has joined #kata-dev | 14:39 | |
*** Yarboa has joined #kata-dev | 14:47 | |
*** davidgiluk has quit IRC | 15:11 | |
*** davidgiluk has joined #kata-dev | 15:12 | |
*** dklyle has joined #kata-dev | 15:27 | |
*** Yarboa has quit IRC | 15:47 | |
*** Yarboa has joined #kata-dev | 15:49 | |
*** pcaruana has quit IRC | 16:10 | |
*** devimc has quit IRC | 16:23 | |
*** devimc has joined #kata-dev | 16:23 | |
kata-irc-bot | <eric.ernst> Yeah. I guess we could stop and wonder why we're installing docker in 2.0 at this point, too. | 16:39 |
*** sameo has quit IRC | 16:50 | |
*** davidgiluk has quit IRC | 16:54 | |
*** davidgiluk has joined #kata-dev | 16:55 | |
*** sameo has joined #kata-dev | 17:03 | |
*** Yarboa has quit IRC | 17:17 | |
*** Yarboa has joined #kata-dev | 17:17 | |
*** jodh has quit IRC | 18:02 | |
kata-irc-bot | <fidencio> @jose.carlos.venegas.m, @gabriela.cervantes.te, buenas tardes! :slightly_smiling_face: May I ask your help to take a look at https://github.com/kata-containers/tests/issues/3108? Something changed, under the hood, that started triggering 2.x jobs on what should be a 1.x CI. | 18:13 |
kata-irc-bot | <fidencio> Regardless of the failure, there's something fishy going on there as it's trying to build the rust agent, and it shouldn't. | 18:14 |
kata-irc-bot | <gabriela.cervantes.te> mmm weird I have not modified that job | 18:15 |
kata-irc-bot | <fidencio> So, brainstorming here. What can cause such change? • A change in the jenkins jobs itself? • A change in the tests repo? • A change in the runtime repo? | 18:15 |
kata-irc-bot | <gabriela.cervantes.te> not sure what happened | 18:16 |
kata-irc-bot | <gabriela.cervantes.te> I am not familiar with that job | 18:28 |
kata-irc-bot | <archana.m.shinde> @salvador.fuentes @gabriela.cervantes.te There are the errors I mentioned: https://github.com/kata-containers/runtime/pull/3103#issuecomment-742717210 | 18:44 |
kata-irc-bot | <archana.m.shinde> I saw those errors for 1.12 and 1.11 branches | 18:45 |
kata-irc-bot | <archana.m.shinde> I believe @eric.ernst was seeing those errors on 2.0 as well | 18:45 |
kata-irc-bot | <salvador.fuentes> @archana.m.shinde, @eric.ernst taking a look... | 18:49 |
kata-irc-bot | <eric.ernst> FYI i setup a VM to debug, let me know if you identify the issue. | 19:08 |
kata-irc-bot | <eric.ernst> issue for failures: https://github.com/kata-containers/tests/issues/3103 | 19:09 |
kata-irc-bot | <salvador.fuentes> @eric.ernst only difference I see between passed jobs and the failed ones is that docker default version for ubuntu is now 20.10 instead of 19.03 | 19:10 |
kata-irc-bot | <salvador.fuentes> that version is being installed in the VM initialization, and then in the CI scripts we install the one on our versions.yaml, maybe related | 19:11 |
kata-irc-bot | <salvador.fuentes> I have specified in the VM initialization script to install the version we have in our versions.yaml. I have sent a couple of failed jobs, let's see if that fixes the issue | 19:12 |
kata-irc-bot | <eric.ernst> Yeah, that's all I noticed as well, and we just end up removing anyway? | 19:14 |
kata-irc-bot | <eric.ernst> docker_version_full=$(apt-cache madison $pkg_name | grep "$docker_version" | awk '{print $3}' | head -1) | 19:14 |
kata-irc-bot | <eric.ernst> what's madison? I don't know how apt-cache works maybe | 19:14 |
kata-irc-bot | <eric.ernst> from https://github.com/kata-containers/tests/blob/master/cmd/container-manager/manage_ctr_mgr.sh#L116 | 19:14 |
kata-irc-bot | <salvador.fuentes> with madison you get all versions available for a specific package | 19:14 |
kata-irc-bot | <eric.ernst> i see. | 19:15 |
kata-irc-bot | <salvador.fuentes> so with that command we get the correct version to install | 19:15 |
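The `apt-cache madison` pipeline quoted above can be tried without touching apt by feeding it canned input. A minimal sketch, assuming madison's usual `package | version | source` layout; the sample lines and the helper names `pick_version` and `sample_madison` are illustrative and not taken from manage_ctr_mgr.sh:

```shell
# Pick the first listed version that contains the wanted string,
# reading `apt-cache madison`-style lines from stdin.
# (Hypothetical helper; uses grep -F so dots are matched literally.)
pick_version() {
    wanted="$1"
    grep -F "$wanted" | awk '{print $3}' | head -1
}

# Illustrative sample of what `apt-cache madison docker-ce` might print.
sample_madison() {
    cat <<'EOF'
 docker-ce | 5:20.10.0~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable amd64 Packages
 docker-ce | 5:19.03.14~3-0~ubuntu-bionic | https://download.docker.com/linux/ubuntu bionic/stable amd64 Packages
 docker-ce | 18.06.3~ce~3-0~ubuntu | https://download.docker.com/linux/ubuntu bionic/stable amd64 Packages
EOF
}

# Field 3 is the version because awk counts the "|" separator as field 2.
sample_madison | pick_version "18.06"
```

On a real system the input would come from `apt-cache madison docker-ce` instead of `sample_madison`, and the selected string is what gets passed to `apt-get install docker-ce=<version>`.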
kata-irc-bot | <salvador.fuentes> but as 20.10 is already installed, we need to uninstall and then install the correct one. not sure if there are some incompatibility issues between 20.10 and 18.06 which is the one we have in the versions.yaml | 19:16 |
kata-irc-bot | <eric.ernst> Yeah. I'm thinking we probably can skip the downgrade for 2.0, but since we need to fix for 1.x as well.... | 19:17 |
kata-irc-bot | <eric.ernst> On a fresh ubuntu bionic VM i don't see docker installed? | 19:18 |
kata-irc-bot | <eric.ernst> ie: ```$ apt list --installed | grep docker``` | 19:19 |
kata-irc-bot | <salvador.fuentes> no, it is not, but we do it in our VM initialization scripts | 19:19 |
kata-irc-bot | <salvador.fuentes> in jenkins | 19:19 |
kata-irc-bot | <eric.ernst> Where do those live? | 19:19 |
kata-irc-bot | <eric.ernst> happy to tmux/debug in real time if it is more helpful for you. | 19:20 |
kata-irc-bot | <salvador.fuentes> it is a template where cloud instances are configured | 19:20 |
kata-irc-bot | <eric.ernst> i'm on: https://portal.azure.com/#@katacontainersoutlook.onmicrosoft.com/resource/subscriptio[…]iders/Microsoft.Compute/virtualMachines/docker-test/overview | 19:20 |
kata-irc-bot | <eric.ernst> is that in ci repo, or just on the jenkins instance? | 19:21 |
kata-irc-bot | <salvador.fuentes> you can take a look at the backup file, but it is better if you go to jenkins - manage jenkins - manage nodes and clouds - go to the ubuntu1804-azure template | 19:23 |
kata-irc-bot | <salvador.fuentes> https://github.com/kata-containers/ci/blob/master/jenkins/config.xml#L758 | 19:23 |
kata-irc-bot | <eric.ernst> This happens on VM boot for each one, then? | 19:23 |
kata-irc-bot | <eric.ernst> I'm not sure if that'll buy us anything? | 19:24 |
kata-irc-bot | <salvador.fuentes> yeah, I remember we added that as we needed to add the jenkins user to the docker group, since it was needed to run non-root unit tests | 19:24 |
kata-irc-bot | <salvador.fuentes> if it is not added there, then you cannot really run the unit tests without root | 19:25 |
kata-irc-bot | <eric.ernst> Hmm. What about just creating the group manually and adding? | 19:25 |
kata-irc-bot | <salvador.fuentes> yeah, I guess that will work, without installing docker | 19:25 |
kata-irc-bot | <eric.ernst> ```sudo groupadd docker; sudo usermod -aG docker $USER``` | 19:26 |
kata-irc-bot | <eric.ernst> then later on docker should just make use of that one? | 19:26 |
kata-irc-bot | <eric.ernst> I can run a quick test to verify the group isn't removed, etc. | 19:26 |
kata-irc-bot | <salvador.fuentes> yeah, would be helpful if you can run in your system | 19:26 |
kata-irc-bot | <salvador.fuentes> and we can modify the vm init script | 19:27 |
kata-irc-bot | <eric.ernst> bad that their downgrade fails. | 19:27 |
kata-irc-bot | <eric.ernst> do we do a remove then install, or downgrade? | 19:27 |
kata-irc-bot | <eric.ernst> I see groups setup fine still | 19:28 |
kata-irc-bot | <eric.ernst> i.e., I did: groupadd docker; usermod -aG docker $USER; apt install -y docker-ce; groups $USER (see it is applied); newgrp docker (update); docker works | 19:29 |
kata-irc-bot | <eric.ernst> that'll save us some time on booting. | 19:29 |
kata-irc-bot | <salvador.fuentes> we do a purge and then install | 19:29 |
kata-irc-bot | <eric.ernst> each VM, which is nice. | 19:29 |
kata-irc-bot | <salvador.fuentes> yeah, let me modify | 19:29 |
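The group-only setup discussed above could look roughly like this in a VM init script. This is a sketch under assumptions, not the actual Jenkins template: the `run` and `ensure_docker_group` helpers and the `DRY_RUN` switch are hypothetical, and `jenkins` is used as the user name because that is the user mentioned in the conversation. With `DRY_RUN=1` (the default here) the commands are only printed, so it is safe to try without root.

```shell
# Print commands instead of running them unless DRY_RUN is set to 0.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create the docker group and add a user to it, without installing Docker.
ensure_docker_group() {
    user="$1"
    # groupadd -f exits successfully if the group already exists.
    run sudo groupadd -f docker
    # -aG appends the group to the user's supplementary groups.
    run sudo usermod -aG docker "$user"
}

ensure_docker_group jenkins
```

When docker-ce is installed later by the CI scripts, it should find the group already present, which matches what was observed in the manual test above. The new membership only applies to new login sessions (or after `newgrp docker`).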
kata-irc-bot | <eric.ernst> interesting, so maybe the purge is borked. Let's see. | 19:30 |
kata-irc-bot | <salvador.fuentes> btw, seems that the issue is now gone, but... now we are facing the crio issue... | 19:30 |
kata-irc-bot | <eric.ernst> aaaaaah f'n CI issues. | 19:31 |
kata-irc-bot | <eric.ernst> by the time I dig in to it they dissapear | 19:31 |
kata-irc-bot | <eric.ernst> disappear* | 19:31 |
kata-irc-bot | <eric.ernst> classic | 19:31 |
kata-irc-bot | <jose.carlos.venegas.m> hey sorry I was afk, looking now | 19:32 |
kata-irc-bot | <eric.ernst> I can reproduce the issue, btw. | 19:33 |
kata-irc-bot | <eric.ernst> after purging and re-installing 18.06-03 | 19:33 |
kata-irc-bot | <salvador.fuentes> ohh ok, cool | 19:33 |
kata-irc-bot | <salvador.fuentes> I'll delete the installation from the VM init script and change with only creating the group | 19:34 |
kata-irc-bot | <salvador.fuentes> and that should work | 19:34 |
kata-irc-bot | <eric.ernst> I think that'll be cleaner anyway. | 19:34 |
kata-irc-bot | <eric.ernst> Let's see. | 19:34 |
kata-irc-bot | <salvador.fuentes> now we need to figure out the cri-o issue | 19:34 |
kata-irc-bot | <eric.ernst> Interesting: installing 20.02, purging, then installing 18.06 fails. If I then remove 18.06 and reinstall, it passes. | 19:37 |
kata-irc-bot | <eric.ernst> Either way....... let's just go with what you're doing. | 19:37 |
kata-irc-bot | <eric.ernst> Thanks Salvador | 19:37 |
kata-irc-bot | <jose.carlos.venegas.m> I just took a look at the config in jenkins but I don't see anything that may point to 2.0 | 19:39 |
kata-irc-bot | <jose.carlos.venegas.m> so I wonder if something in the test repo changed? | 19:39 |
kata-irc-bot | <salvador.fuentes> np :slightly_smiling_face: | 19:40 |
kata-irc-bot | <salvador.fuentes> yeah, just removed the install part from the init script | 19:40 |
kata-irc-bot | <jose.carlos.venegas.m> same for tests or runtime | 19:41 |
kata-irc-bot | <jose.carlos.venegas.m> oh sorry I did not see the rest of the conversation was related to this thread | 19:42 |
*** crobinso has quit IRC | 19:53 | |
kata-irc-bot | <fidencio> Sorry, I'm back now | 19:57 |
kata-irc-bot | <fidencio> @salvador.fuentes, which one of the cri-o issues? :slightly_smiling_face: | 19:59 |
kata-irc-bot | <salvador.fuentes> sorry, not cri-o specific issue, it is the one that Peter reported regarding the image not built correctly | 20:09 |
kata-irc-bot | <fidencio> @salvador.fuentes, that one is not exactly related to the image, but rather something changed that it takes the 2.x path instead of taking the 1.x path | 20:10 |
kata-irc-bot | <fidencio> Because till yesterday the rust agent would never ever be built (and, indeed, it shouldn't) as that job is for Kata 1.x | 20:11 |
kata-irc-bot | <salvador.fuentes> @fidencio I am taking a look and I see that in past jobs, rust is also being installed as part of the image although as you say it shouldn't. Not really sure why this is happening | 20:16 |
kata-irc-bot | <salvador.fuentes> e.g. http://jenkins.katacontainers.io/job/kata-containers-runtime-ubuntu-18-04-PR/2146/consoleText | 20:16 |
kata-irc-bot | <salvador.fuentes> another thing that caught my attention is that we were supposed to build the clearlinux image by default, but I see in the logs that the image being built is using Fedora as the base | 20:17 |
kata-irc-bot | <salvador.fuentes> ```13:27:32 INFO: Image to generate: kata-containers-clearlinux-34080-osbuilder-24e88b8-agent-25d7471.img 13:27:32 INFO: Latest cached image: kata-containers-clearlinux-34000-osbuilder-24e88b8-agent-25d7471.img 13:27:32 Set runtime as default runtime to build the image 13:27:32 manage_ctr_mgr.sh - INFO: docker_info: version: 18.06.3-ce 13:27:32 manage_ctr_mgr.sh - INFO: docker_info: default runtime: runc 13:27:32 | 20:18 |
kata-irc-bot | manage_ctr_mgr.sh - INFO: docker_info: package name: docker-ce 13:27:32 manage_ctr_mgr.sh - INFO: configure_docker: runc is already configured as default runtime 13:27:32 Script finished 13:27:32 INFO: Detecting agent go version 13:27:32 INFO: Detecting runtime version using https://raw.githubusercontent.com/kata-containers/agent/25d7471/VERSION 13:27:33 INFO: Getting golang version from | 20:18 |
kata-irc-bot | https://raw.githubusercontent.com/kata-containers/runtime/1.12.0-rc0/versions.yaml 13:27:33 Required Go version: 1.11.10 13:27:33 INFO: Detecting agent rust version 13:27:33 INFO: Detecting runtime version using https://raw.githubusercontent.com/kata-containers/agent/25d7471/VERSION 13:27:33 INFO: Getting rust version from https://raw.githubusercontent.com/kata-containers/runtime/1.12.0-rc0/versions.yaml 13:27:33 Required rust version: 1.38.0 | 20:18 |
kata-irc-bot | 13:27:33 INFO: Detecting cmake version 13:27:33 INFO: Detecting runtime version using https://raw.githubusercontent.com/kata-containers/agent/25d7471/VERSION 13:27:33 INFO: Getting cmake version from https://raw.githubusercontent.com/kata-containers/runtime/1.12.0-rc0/versions.yaml 13:27:33 Required cmake version: 3.15.3 13:27:33 INFO: Detecting musl version 13:27:33 INFO: Detecting runtime version using | 20:18 |
kata-irc-bot | https://raw.githubusercontent.com/kata-containers/agent/25d7471/VERSION 13:27:33 INFO: Getting musl version from https://raw.githubusercontent.com/kata-containers/runtime/1.12.0-rc0/versions.yaml 13:27:33 Required musl version: 1.1.23 13:27:33 /tmp/jenkins/workspace/kata-containers-runtime-ubuntu-18-04-PR/go/src/github.com/kata-containers/osbuilder/rootfs-builder/clearlinux | 20:18 |
kata-irc-bot | /tmp/jenkins/workspace/kata-containers-runtime-ubuntu-18-04-PR/go/src/github.com/kata-containers/osbuilder 13:27:33 /tmp/jenkins/workspace/kata-containers-runtime-ubuntu-18-04-PR/go/src/github.com/kata-containers/osbuilder 13:27:33 Sending build context to Docker daemon 6.656kB 13:27:33 Step 1/11 : From docker.io/fedora:30 13:27:34 30: Pulling from library/fedora``` | 20:18 |
kata-irc-bot | <fidencio> So, let me get here the latest working one. | 20:19 |
kata-irc-bot | <fidencio> @salvador.fuentes, this is from Yesterday: http://jenkins.katacontainers.io/job/kata-containers-crio-PR/16024/console | 20:20 |
kata-irc-bot | <fidencio> the last one that worked, still doing 1.x work without any issue | 20:20 |
kata-irc-bot | <fidencio> Makes me wonder, what changed in the Azure images that could be causing this confusion? | 20:20 |
kata-irc-bot | <eric.ernst> fedora is being used to build clearlinux. | 20:21 |
kata-irc-bot | <eric.ernst> (if that makes sense) | 20:21 |
kata-irc-bot | <eric.ernst> we're still building clearlinux there. | 20:21 |
kata-irc-bot | <eric.ernst> I think this is just part of USE_DOCKER=true. | 20:22 |
kata-irc-bot | <eric.ernst> right? | 20:22 |
*** davidgiluk has quit IRC | 20:24 | |
*** fuentess has quit IRC | 20:38 | |
kata-irc-bot | <salvador.fuentes> @fidencio @eric.ernst the issue is only seen when we build the image... in the log above, the image used was not built, it was downloaded from our cache artifacts. | 21:09 |
kata-irc-bot | <eric.ernst> hmm. | 21:22 |
kata-irc-bot | <eric.ernst> btw, rerunning my PRs, now I see a whole bunch of different errors :S. https://github.com/kata-containers/runtime/pull/3101#issuecomment-742805033 | 21:22 |
*** Yarboa has quit IRC | 21:28 | |
*** fuentess has joined #kata-dev | 21:30 | |
*** Yarboa has joined #kata-dev | 21:32 | |
kata-irc-bot | <eric.ernst> I see that same failure on 2.0 | 21:40 |
kata-irc-bot | <eric.ernst> Wondering if this is a new test, or why CI is trying to kill us all. | 21:40 |
kata-irc-bot | <fidencio> @salvador.fuentes, what was the fix in the end? | 21:52 |
kata-irc-bot | <fidencio> Because it has passed the problematic part now | 21:52 |
kata-irc-bot | <eric.ernst> For docker, it was to not install docker in the first place when creating the VM | 21:53 |
*** Yarboa has quit IRC | 21:57 | |
*** Yarboa has joined #kata-dev | 22:07 | |
*** devimc has quit IRC | 22:16 | |
*** sgarzare has quit IRC | 22:25 | |
kata-irc-bot | <salvador.fuentes> @fidencio for the image issue: the problem may still exist... We have a job that builds the kata-containers image and caches it. When that cached image is used in the 1.x jobs, we don't have the issue, as we don't build the image, we just download it. But I suspect that when the cached image is not the latest one (due to a change in the clearlinux version, or a new commit in agent or osbuilder), the job will build the new one | 22:37 |
kata-irc-bot | and it may fail again... | 22:37 |
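The download-or-build decision described above can be illustrated with a small sketch. It assumes a plain comparison of image names built from the clearlinux version and the osbuilder and agent commits, following the naming pattern seen in the CI log; the real job may decide differently. The helper names `image_name` and `image_action` and the hash `abc1234` are made up for illustration.

```shell
# Hypothetical helper: compose an image name from its version components,
# following the naming pattern visible in the CI log.
image_name() {
    echo "kata-containers-clearlinux-$1-osbuilder-$2-agent-$3.img"
}

# Print "download" when the cached image matches the wanted one, else "build".
# Assumption: a plain name comparison; the actual CI script may differ.
image_action() {
    wanted="$1"
    cached="$2"
    if [ "$wanted" = "$cached" ]; then
        echo "download"
    else
        echo "build"
    fi
}

cached="$(image_name 34080 24e88b8 25d7471)"

# Same components as the cached image: the job can just download it.
image_action "$(image_name 34080 24e88b8 25d7471)" "$cached"

# A new agent commit (made-up hash) changes the name, so a build is needed.
image_action "$(image_name 34080 24e88b8 abc1234)" "$cached"
```

Under this assumption, any change to one of the three components sends the job down the build path, which is where the failure was seen.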
kata-irc-bot | <fidencio> @salvador.fuentes, ack, but seems we have a cached image right now and it unblocks CRI-O, am I right? | 22:42 |
kata-irc-bot | <salvador.fuentes> yes, that is right | 22:43 |
kata-irc-bot | <fidencio> If so, I'll try to get someone from our side to work on solving the build problem | 22:43 |
kata-irc-bot | <fidencio> I know it's not something that we'll hit all the time and I think Pavel may be interested in working on this (unless you already have a fix for that) | 22:44 |
kata-irc-bot | <salvador.fuentes> I have also increased the number of times the nightly job is run, to make sure we have the latest image cached and to minimize the CI failures, but we still need to figure out what is going on with that issue | 22:44 |
kata-irc-bot | <salvador.fuentes> it would be good if you have someone on your side to help on this | 22:44 |
kata-irc-bot | <salvador.fuentes> most of us will be offline tomorrow and Mon, so we'll not be able to help until mid next week | 22:45 |
kata-irc-bot | <fidencio> Ack! For the record, Pavel is off Tomorrow as well, but this is one of the areas he's comfortable with. | 22:46 |
*** sameo has quit IRC | 22:46 | |
kata-irc-bot | <salvador.fuentes> cool, thanks :slightly_smiling_face: | 22:46 |
*** fuentess has quit IRC | 23:10 | |
*** sameo has joined #kata-dev | 23:22 |