*** nicolasbock has quit IRC | 00:04 | |
*** xek has quit IRC | 00:18 | |
*** hamzy_ has joined #openstack-infra | 00:52 | |
*** tetsuro has joined #openstack-infra | 00:54 | |
*** hamzy has quit IRC | 00:55 | |
openstackgerrit | Mohammed Naser proposed openstack/project-config master: opendev: add openstack/devstack https://review.opendev.org/713123 | 01:04 |
mnaser | ^ i wonder if it makes sense for us to get our own zuul tenant at this point... | 01:04 |
openstackgerrit | Mohammed Naser proposed openstack/project-config master: opendev: add openstack/devstack https://review.opendev.org/713123 | 01:06 |
openstackgerrit | Mohammed Naser proposed openstack/project-config master: opendev: move vexxhost to seperate tenant https://review.opendev.org/713123 | 01:10 |
openstackgerrit | Mohammed Naser proposed openstack/project-config master: opendev: move vexxhost to seperate tenant https://review.opendev.org/713123 | 01:11 |
mnaser | there, instead of adding it, i'm proposing moving it to a separate tenant to stay out and not pollute opendev tenant too much | 01:14 |
*** yamamoto has joined #openstack-infra | 01:25 | |
mnaser | there's only 6 jobs in all of opendev (all tenants) and my jobs are taking forever in 'queued' :\ | 01:54 |
mnaser | almost 5 minutes in this case | 01:54 |
*** tetsuro has quit IRC | 01:55 | |
fungi | waiting for an unavailable node type? | 02:00 |
fungi | i guess the functional job is supposed to start once the image build job is paused? | 02:03 |
*** ociuhandu has joined #openstack-infra | 02:05 | |
*** ociuhandu has quit IRC | 02:10 | |
*** yamamoto has quit IRC | 02:19 | |
clarkb | ya and it queues a new image request at that point iirc | 02:30 |
clarkb | also its shared capacity | 02:30 |
clarkb | are the other tenants busy? | 02:30 |
fungi | sounded like no | 02:36 |
fungi | "...only 6 jobs in all of opendev (all tenants) ..." | 02:36 |
clarkb | er new node request not image | 02:38 |
clarkb | also if a cloud is having trouble we retry 3 times | 02:38 |
prometheanfire | sometimes I like uwsgi | 02:54 |
prometheanfire | Sat Mar 14 21:54:10 2020 - worker 1 (pid: 11481) is taking too much time to die...NO MERCY !!! | 02:54 |
*** dave-mccowan has quit IRC | 03:16 | |
fungi | uwsgi takes no prisoners | 03:18 |
*** diablo_rojo has quit IRC | 04:15 | |
*** diablo_rojo has joined #openstack-infra | 04:58 | |
*** tetsuro has joined #openstack-infra | 05:26 | |
*** yamamoto has joined #openstack-infra | 05:31 | |
*** yamamoto has quit IRC | 05:34 | |
*** evrardjp has quit IRC | 05:35 | |
*** evrardjp has joined #openstack-infra | 05:36 | |
*** admcleod has quit IRC | 06:10 | |
*** diablo_rojo has quit IRC | 06:33 | |
*** admcleod has joined #openstack-infra | 06:51 | |
*** yamamoto has joined #openstack-infra | 07:14 | |
*** admcleod has quit IRC | 07:16 | |
*** yamamoto has quit IRC | 07:20 | |
*** tetsuro has quit IRC | 07:22 | |
*** tetsuro has joined #openstack-infra | 07:23 | |
*** yamamoto has joined #openstack-infra | 07:25 | |
*** slaweq has joined #openstack-infra | 07:44 | |
*** admcleod has joined #openstack-infra | 07:47 | |
*** yamamoto has quit IRC | 08:14 | |
*** slaweq has quit IRC | 08:15 | |
*** tetsuro has quit IRC | 08:22 | |
*** slaweq has joined #openstack-infra | 08:28 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Keep doc/source/roles.rst sorted https://review.opendev.org/713128 | 08:30 |
*** yamamoto has joined #openstack-infra | 08:34 | |
*** slaweq has quit IRC | 08:56 | |
*** rcernin has quit IRC | 08:57 | |
*** slaweq has joined #openstack-infra | 09:05 | |
*** slaweq has quit IRC | 09:10 | |
*** yamamoto has quit IRC | 09:12 | |
*** Lucas_Gray has joined #openstack-infra | 09:13 | |
*** ociuhandu has joined #openstack-infra | 09:44 | |
*** elod has quit IRC | 09:48 | |
*** ociuhandu has quit IRC | 09:49 | |
*** apetrich has quit IRC | 10:13 | |
*** apetrich has joined #openstack-infra | 10:19 | |
*** yamamoto has joined #openstack-infra | 10:28 | |
*** yamamoto has quit IRC | 10:35 | |
*** Lucas_Gray has quit IRC | 10:52 | |
*** Lucas_Gray has joined #openstack-infra | 10:53 | |
*** ociuhandu has joined #openstack-infra | 10:55 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: Improve ensure-tox role https://review.opendev.org/708642 | 11:06 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded https://review.opendev.org/690057 | 11:26 |
*** yamamoto has joined #openstack-infra | 11:29 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded https://review.opendev.org/690057 | 11:33 |
*** yamamoto has quit IRC | 11:46 | |
*** ociuhandu has quit IRC | 11:51 | |
zbr | something really fishy: https://zuul.opendev.org/t/zuul/build/b8dc5b0963774d1a973714c0f724c996 -- I am unable to figure-out what went wrong with it | 11:59 |
*** yamamoto has joined #openstack-infra | 11:59 | |
*** n0on1 has joined #openstack-infra | 11:59 | |
zbr | zero files collected, clueless | 12:00 |
*** yamamoto has quit IRC | 12:04 | |
*** ociuhandu has joined #openstack-infra | 12:07 | |
zbr | clearly there is an issue with the fedora-30 image, but i am unable to identify what is wrong with it because there are no logs | 12:23 |
*** ociuhandu has quit IRC | 12:37 | |
*** ociuhandu has joined #openstack-infra | 12:38 | |
*** ociuhandu has quit IRC | 12:43 | |
*** ociuhandu has joined #openstack-infra | 12:50 | |
*** ociuhandu has quit IRC | 13:12 | |
*** ociuhandu has joined #openstack-infra | 13:13 | |
*** yamamoto has joined #openstack-infra | 13:15 | |
*** yamamoto has quit IRC | 13:17 | |
*** ociuhandu has quit IRC | 13:18 | |
fungi | can you get the job to rerun and then pull up the console log stream? | 13:21 |
fungi | might yield a clue | 13:21 |
*** tosky has joined #openstack-infra | 13:36 | |
*** yamamoto has joined #openstack-infra | 13:36 | |
*** ricolin has quit IRC | 13:50 | |
*** yamamoto has quit IRC | 14:02 | |
*** yamamoto has joined #openstack-infra | 14:02 | |
*** ianychoi has joined #openstack-infra | 14:05 | |
openstackgerrit | Jeremy Stanley proposed opendev/system-config master: Set up LE certs for docs.airshipit.org static site https://review.opendev.org/706600 | 14:14 |
openstackgerrit | Jeremy Stanley proposed opendev/system-config master: Add a new docs.airshipit.org vhost on static01 https://review.opendev.org/706601 | 14:14 |
*** ociuhandu has joined #openstack-infra | 14:19 | |
*** Lucas_Gray has quit IRC | 14:22 | |
*** ociuhandu has quit IRC | 14:24 | |
*** yamamoto has quit IRC | 14:52 | |
*** Jeffrey4l has quit IRC | 14:56 | |
*** Jeffrey4l has joined #openstack-infra | 14:56 | |
AJaeger | zbr, fungi, fedora-30 images are still broken, aren't they? | 14:58 |
zbr | yep, see http://paste.openstack.org/show/790711/ | 14:58 |
zbr | apparently it's not f30's fault, but still a blocker | 14:59 |
zbr | dmsimard|off: ^ this started in the last ~48h | 15:00 |
AJaeger | zbr: As far as I understand the discussion here and in #opendev, we had a weeks-old f30 image, deleted it by accident, and have now built a new one, and that new one seems broken. But I only glanced over the discussion, so I might have misunderstood something. | 15:03 |
AJaeger | mordred, ianw, clarkb, FYI, f30 is broken ^ | 15:04 |
fungi | i got the impression the f30 image we had was months old, but regardless the result is the same | 15:12 |
*** hwoarang has quit IRC | 15:14 | |
*** hwoarang has joined #openstack-infra | 15:14 | |
AJaeger | fungi: it was months old - and we deleted it by accident and built a new one - didn't we? | 15:17 |
*** xek has joined #openstack-infra | 15:18 | |
mnaser | can i help with troubleshooting this somehow? :\ | 15:18 |
* mnaser is blocked by this | 15:18 | |
fungi | AJaeger: yeah, months old is still also technically weeks old | 15:19 |
fungi | mnaser: i believe the current challenge revolves around containerizing the nodepool builder so we can run it on a new enough platform to have support for the compression algorithm red hat switched the fedora rpms to a few months back | 15:20 |
clarkb | fungi: apparently the rpm switch is f31. f30 was stopped due to segfaults installing packages | 15:22 |
fungi | oh, got it | 15:22 |
clarkb | that mysteriously went away with newer build userspace | 15:22 |
fungi | well, regardless, this is the first i'd heard that there are problems with the new image which got uploaded early utc yesterday | 15:23 |
clarkb | in this case the nodes are coming up enough to share an ssh host key otherwise nodepool wouldn't mark them ready | 15:23 |
fungi | so some details would probably help | 15:23 |
clarkb | but then failing a proper ssh connection | 15:23 |
fungi | and yeah, looking at zbr's paste, i concur | 15:24 |
fungi | looks like it may be an auth problem | 15:25 |
clarkb | the executor log may be more verbose? but ya auth somehow | 15:25 |
fungi | Permission denied (publickey,gssapi-keyex,gssapi-with-mic) | 15:25 |
clarkb | at this point it's using the shared zuul key as it isn't far enough to switch to the job specific key | 15:25 |
fungi | i doubt the executor log will have much more. first guesses would be either the zuul account doesn't exist or isn't configured with the authorized key | 15:26 |
clarkb | we should be able to test that off a manually booted image | 15:26 |
clarkb | fungi: ya except every other image uses that same element and works fine | 15:26 |
clarkb | (so in general the user and key are ok, what makes f30 special) | 15:27 |
fungi | loose permissions and incorrect ownership on ~zuul/.ssh or ~zuul/.ssh/authorized_keys could also be a cause | 15:27 |
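The permissions theory above can be checked on a manually booted node. A minimal Python sketch, assuming sshd's default `StrictModes` behaviour of refusing key files or directories that are writable by group or others; the helper name `too_permissive` is hypothetical and not part of any tool mentioned in this log, and it does not cover the ownership half of the check:

```python
import os
import stat


def too_permissive(path):
    """Return True if `path` is writable by group or others.

    Assumption: this approximates the write-bit part of sshd's
    StrictModes check on ~/.ssh and ~/.ssh/authorized_keys.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))
```

On a test node this could be called as `too_permissive(os.path.expanduser("~zuul/.ssh/authorized_keys"))`, and likewise for the `.ssh` directory itself.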
clarkb | wouldnt that affect all the distros though? | 15:28 |
mnaser | is /home/nodepool/.ssh mounted in the new containerized nodepool? | 15:29 |
fungi | unless there's some fedora conditional in the element, i would expect so yeah | 15:29 |
fungi | mnaser: not sure what you mean by that | 15:29 |
clarkb | mnaser: hrm is that how the key is injected? if so that may be it | 15:29 |
mnaser | yes so im looking at the logs | 15:29 |
mnaser | https://nb01.openstack.org/fedora-30-0000000391.log | 15:29 |
clarkb | fungi: that would be where we load the key from | 15:29 |
fungi | ahh | 15:29 |
mnaser | ctrl+f ZUUL_USER_SSH_PUBLIC_KEY | 15:29 |
mnaser | '[' -f /home/nodepool/.ssh/id_rsa.pub ']' | 15:29 |
mnaser | and then it doesnt actually do anything after that | 15:30 |
clarkb | I don't think that path is mounted | 15:30 |
mnaser | actually, i'm wrong though, i think there _is_ an ssh key here though, because it does the cat, but yeah, if its the wrong key.. | 15:30 |
clarkb | easy enough to update the docker compose file with that mount and restart the service | 15:30 |
clarkb | but I'm not in a great spot for that right now | 15:31 |
mnaser | i wonder why other jobs haven't all failed though | 15:31 |
clarkb | other jobs on f30? | 15:31 |
mnaser | no just other jobs inside the new containerized nodepool | 15:31 |
clarkb | there are no other jobs | 15:32 |
fungi | that's the only image it's building | 15:32 |
mnaser | oh gotcha! | 15:32 |
mnaser | i can't find a url that has the build logs | 15:33 |
clarkb | iirc the use local.ssh key is a convenience thing and you can set actual key data too | 15:33 |
clarkb | I'm not sure which way production builds expect it | 15:33 |
clarkb | mnaser: https://nb01.opendev.org/ | 15:34 |
mnaser | ahhh i was going for .openstack.org and went through 1-3 :p | 15:34 |
mnaser | oh boy | 15:35 |
mnaser | ok so because it can't find '/var/lib/nodepool/.ssh/id_rsa.pub' | 15:35 |
mnaser | it then treats that as the key | 15:35 |
mnaser | so it injects the _literal_ string "/var/lib/nodepool/.ssh/id_rsa.pub" as the ssh key | 15:35 |
mnaser | (because you can supply both a file _or_ ssh key in that element) | 15:36 |
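The failure mode described above (a value that may be either a file path or literal key text, with a missing file silently falling through to the literal branch) can be sketched in Python. This is an illustration only, not the diskimage-builder element's actual code; `resolve_public_key`, `looks_like_public_key` and `KEY_TYPES` are hypothetical names, and the key type list is a small, non-exhaustive assumption:

```python
import os

# Assumption: a few common OpenSSH public key type prefixes, not a full list.
KEY_TYPES = (
    "ssh-rsa",
    "ssh-ed25519",
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
)


def resolve_public_key(value):
    """Return key material from a file path, else treat `value` as the key.

    This reproduces the pitfall: if the file is missing, the path string
    itself is returned as if it were the key.
    """
    if os.path.isfile(value):
        with open(value) as handle:
            return handle.read().strip()
    return value


def looks_like_public_key(text):
    """Cheap sanity check: '<known-type> <base64-blob> [comment]'."""
    fields = text.split()
    return len(fields) >= 2 and fields[0] in KEY_TYPES
```

Running `looks_like_public_key(resolve_public_key(path))` before writing `authorized_keys` would flag the case where an unmounted path ends up injected as the key.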
clarkb | neat | 15:36 |
mnaser | i'm not finding anything inside opendev/system-config to push a patch for | 15:37 |
clarkb | mnaser: playbooks/roles/nodepool-builder/templates/something.j2 | 15:37 |
* mnaser clone locally | 15:38 | |
clarkb | but that server isnt directly managed right now so will need to have the compose file manually updated then stop started | 15:38 |
clarkb | (should record the change either way though) | 15:38 |
mnaser | clarkb: is this opendev/system-config ? | 15:39 |
AJaeger | mnaser: yes | 15:39 |
mnaser | oh playbooks/roles not roles/ -- okay | 15:39 |
AJaeger | mnaser: and replace something by one of the two files ;) | 15:39 |
clarkb | its the compose file | 15:39 |
clarkb | we also need to make sure the ssh key is written to the host | 15:40 |
clarkb | (I'm not sure of that) | 15:40 |
AJaeger | playbooks/roles/nodepool-builder/templates/docker-compose.yaml.j2 | 15:40 |
mnaser | yeah i found that file AJaeger , i think on the host it seems that /home/nodepool exists, so i'll just go with the assumption the key will be stored there and someone will get it there somehow | 15:40 |
openstackgerrit | Mohammed Naser proposed opendev/system-config master: nodepool-builder: mount SSH keys into the container https://review.opendev.org/713136 | 15:43 |
mnaser | clarkb, fungi, AJaeger: i believe that should do it, will just need someone to go in and do the thing | 15:44 |
*** slaweq has joined #openstack-infra | 15:44 | |
fungi | we also have to pass a special hostname when we do it, right? | 15:44 |
fungi | maybe better to resume this discussion in #opendev where it was happening on friday | 15:45 |
mnaser | ah yes | 15:45 |
*** ociuhandu has joined #openstack-infra | 15:47 | |
*** dkehn has joined #openstack-infra | 16:01 | |
*** n0on1 has quit IRC | 16:13 | |
openstackgerrit | Merged opendev/system-config master: Set up LE certs for docs.airshipit.org static site https://review.opendev.org/706600 | 16:19 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-operator master: Update to dhall lang v14 https://review.opendev.org/710649 | 16:20 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-operator master: Add missing volumes for the web and merger service https://review.opendev.org/712811 | 16:20 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-operator master: Add missing input defaults https://review.opendev.org/713138 | 16:20 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-operator master: Add a render.dhall function https://review.opendev.org/713139 | 16:21 |
*** ociuhandu has quit IRC | 16:24 | |
*** ociuhandu has joined #openstack-infra | 16:25 | |
*** ociuhandu has quit IRC | 16:26 | |
*** ociuhandu has joined #openstack-infra | 16:26 | |
zbr | mnaser: are you working on https://review.opendev.org/#/c/713136/ ? | 16:33 |
zbr | mainly i want to know what can be done to fix the broken fedora image | 16:33 |
zbr | which blocks other changes | 16:33 |
clarkb | zbr: we switched the conversation over to #opendev as noted above | 16:34 |
mnaser | ^ | 16:34 |
clarkb | zbr: from there we've manually updated the server and triggered a rebuild | 16:34 |
clarkb | re https://review.opendev.org/#/c/713136/ specifically I've updated https://storyboard.openstack.org/#!/story/2007407 with task 39072 and I think we should wait for ianw's thoughts on how we should proceed | 16:34 |
zbr | ok, ~1000msg backlog, reading.... | 16:34 |
clarkb | we may or may not proceed with that change depending on how people feel about fixing this | 16:35 |
zbr | clarkb: ok. but what to do meanwhile? making fedora-30 jobs non-voting? | 16:36 |
clarkb | zbr: or just wait about an hour | 16:36 |
zbr | ah, no problem with that. thanks | 16:37 |
fungi | granted it's been months since we last rebuilt f30, so i wouldn't be surprised if we encounter new issues which have crept in since then | 16:40 |
zbr | fungi: i got into similar problems today for similar reasons: base-os containers not rebuilt in a long time, same result, some got bit-rotten. | 16:43 |
zbr | i am inclined to believe it is best to rebuild them weekly, test and push them, keeping them fresh. | 16:43 |
zbr | at least one would know, that on a specific day of the week containers are updated, also this being good for performance. i guess the same applies to qcow2 images. | 16:44 |
*** slaweq has quit IRC | 16:46 | |
clarkb | fwiw the dnf install dib is doing may have decided it is lunch break time | 16:47 |
*** auristor has quit IRC | 16:47 | |
clarkb | its been ~12 minutes since it logged anything | 16:47 |
*** armax has joined #openstack-infra | 16:47 | |
clarkb | debugging dnf is likely well beyond what I want to spend my weekend doing :) | 16:48 |
*** auristor has joined #openstack-infra | 16:48 | |
fungi | hopefully it's just quiet | 16:51 |
*** ociuhandu has quit IRC | 17:19 | |
*** ociuhandu has joined #openstack-infra | 17:20 | |
*** lbragstad has joined #openstack-infra | 17:21 | |
*** ociuhandu has quit IRC | 17:23 | |
*** ociuhandu has joined #openstack-infra | 17:23 | |
*** ijw has quit IRC | 17:29 | |
mtreinish | rm_work: well, it looks like it's going ok so far. The u-c bump has already merged: https://review.opendev.org/#/c/713126/ | 17:29 |
mtreinish | that being said the diffs were pretty small for both releases | 17:31 |
*** matt_kosut has joined #openstack-infra | 17:32 | |
*** evrardjp has quit IRC | 17:35 | |
*** evrardjp has joined #openstack-infra | 17:36 | |
*** ociuhandu has quit IRC | 17:40 | |
*** ralonsoh has joined #openstack-infra | 17:44 | |
AJaeger | fungi, clarkb, it took 2hours 20 mins lunch break in the build before as well, see #opendev | 17:45 |
*** rosmaita has quit IRC | 17:54 | |
*** armax has quit IRC | 17:59 | |
*** ijw has joined #openstack-infra | 18:06 | |
*** ijw has quit IRC | 18:11 | |
*** ociuhandu has joined #openstack-infra | 18:12 | |
openstackgerrit | Ghanshyam Mann proposed openstack/hacking master: Remove pypy jobs https://review.opendev.org/710349 | 18:18 |
*** ijw has joined #openstack-infra | 18:38 | |
*** adam_g has quit IRC | 18:39 | |
*** melwitt has quit IRC | 18:39 | |
*** johnsom has quit IRC | 18:39 | |
*** kevinz has quit IRC | 18:39 | |
*** noonedeadpunk has quit IRC | 18:39 | |
*** aspiers has quit IRC | 18:39 | |
*** jkt has quit IRC | 18:39 | |
*** corvus has quit IRC | 18:39 | |
*** jonher has quit IRC | 18:39 | |
*** jkt has joined #openstack-infra | 18:40 | |
*** adam_g has joined #openstack-infra | 18:40 | |
*** jonher has joined #openstack-infra | 18:40 | |
*** melwitt has joined #openstack-infra | 18:40 | |
*** corvus has joined #openstack-infra | 18:40 | |
*** johnsom has joined #openstack-infra | 18:40 | |
*** kevinz has joined #openstack-infra | 18:40 | |
*** noonedeadpunk has joined #openstack-infra | 18:40 | |
*** ijw has quit IRC | 18:42 | |
*** ociuhandu has quit IRC | 18:51 | |
*** ociuhandu has joined #openstack-infra | 18:52 | |
*** aspiers has joined #openstack-infra | 18:56 | |
*** ociuhandu has quit IRC | 18:58 | |
*** ijw has joined #openstack-infra | 19:09 | |
*** ociuhandu has joined #openstack-infra | 19:13 | |
*** ijw has quit IRC | 19:13 | |
*** lbragstad has quit IRC | 19:16 | |
*** stevebaker has quit IRC | 19:32 | |
*** stevebaker has joined #openstack-infra | 19:32 | |
*** arif-ali has joined #openstack-infra | 19:32 | |
*** arxcruz|rover has quit IRC | 19:36 | |
*** jpena|off has quit IRC | 19:36 | |
*** dalvarez has quit IRC | 19:36 | |
*** spotz has quit IRC | 19:36 | |
*** jpena|off has joined #openstack-infra | 19:37 | |
*** arxcruz has joined #openstack-infra | 19:39 | |
*** stevebaker has quit IRC | 19:47 | |
*** ociuhandu has quit IRC | 19:48 | |
*** ociuhandu has joined #openstack-infra | 19:48 | |
*** ociuhandu has quit IRC | 19:51 | |
*** ociuhandu has joined #openstack-infra | 19:51 | |
*** ijw has joined #openstack-infra | 20:13 | |
*** ralonsoh has quit IRC | 20:18 | |
*** ijw has quit IRC | 20:18 | |
*** ociuhandu has quit IRC | 20:26 | |
*** ociuhandu has joined #openstack-infra | 20:26 | |
*** mordred has quit IRC | 20:30 | |
*** andreaf has quit IRC | 20:30 | |
*** dtantsur|afk has quit IRC | 20:30 | |
*** andreaf has joined #openstack-infra | 20:30 | |
*** mordred has joined #openstack-infra | 20:30 | |
*** ociuhandu has quit IRC | 20:33 | |
*** dtantsur has joined #openstack-infra | 20:35 | |
*** ijw has joined #openstack-infra | 20:44 | |
*** ijw has quit IRC | 20:49 | |
*** ociuhandu has joined #openstack-infra | 20:50 | |
*** ociuhandu has quit IRC | 20:54 | |
*** ijw has joined #openstack-infra | 21:01 | |
ianw | thanks everyone, i see fedora-30-0000000558.log built, with hopefully that key | 21:09 |
ianw | 2020-03-15 16:35:12.982 | > Installing : haveged-1.9.1-11.fc30.x86_64 247/253 | 21:10 |
ianw | 2020-03-15 18:53:00.176 | > Running scriptlet: haveged-1.9.1-11.fc30.x86_64 247/253 | 21:10 |
ianw | that's a pretty suspicious package to sit installing for so long ... anything where entropy might be involved | 21:11 |
*** jaicaa has quit IRC | 21:11 | |
ianw | 2020-03-15 03:21:09.290 | > Installing : haveged-1.9.1-11.fc30.x86_64 247/253 | 21:12 |
ianw | 2020-03-15 05:38:56.378 | > Running scriptlet: haveged-1.9.1-11.fc30.x86_64 247/253 | 21:12 |
*** bdodd has quit IRC | 21:12 | |
*** jaicaa has joined #openstack-infra | 21:13 | |
*** AJaeger has quit IRC | 21:13 | |
*** bdodd has joined #openstack-infra | 21:15 | |
mordred | ianw: oh - I wonder if we should bind-mount /dev/random (or do some other setting or something?) | 21:17 |
ianw | mordred: yeah, i'm wondering if it's more a general issue with /dev and this is just a symptom ... it seems likely | 21:18 |
mordred | ianw: possible - although I did see random people on the internet suggesting adding a -v /dev/urandom:/dev/random to the docker invocation | 21:18 |
mordred | but I'm not sure I trust those random people yet | 21:19 |
ianw | dd if=/dev/urandom of=/tmp/foo bs=1K count=1024 | 21:20 |
mordred | ianw: oh - wait | 21:20 |
ianw | 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0110084 s, 95.3 MB/s | 21:20 |
mordred | ianw: we're not running haveged on nb01 | 21:20 |
ianw | in the container, so it seems to be giving data fast enough | 21:20 |
ianw | (that's in the container) | 21:21 |
mordred | yeah - but the host may have built up enough entropy at this point | 21:21 |
mordred | do we need haveged on our build hosts? | 21:21 |
ianw | hrm, so possibly haveged install using /dev/random, from host, which is emptyish? i don't think any of the others do | 21:21 |
mordred | yeah. I agree - the others also don't seem to be | 21:23 |
ianw | ... possibly operations in the container namespaces aren't contributing to the entropy pool? whereas they do when built in a straight chroot? a bit far-fetched | 21:24 |
ianw | i think we need to get an strace when this has hung to really see that it is this anyway | 21:24 |
ianw | i can trigger a build | 21:25 |
*** AJaeger has joined #openstack-infra | 21:26 | |
ianw | oh, interesting ... Exception: Skipping build request for image fedora-30; paused | 21:27 |
fungi | yeah, gotta do it from the new builder | 21:27 |
fungi | clarkb worked out the docker run invocation above | 21:28 |
fungi | er, not above in here, in #opendev | 21:29 |
fungi | sudo docker exec nodepoolbuildercompose_nodepool-builder_1 nodepool image-build fedora-30 | 21:29 |
fungi | ianw: ^ | 21:29 |
ianw | yeah, i'm in the container poking :) | 21:29 |
ianw | 559 is building ... i hope this hangs | 21:30 |
ianw | # cat /proc/sys/kernel/random/entropy_avail | 21:30 |
ianw | 3302 | 21:30 |
ianw | i'll see if that starts to disappear | 21:30 |
*** factor has joined #openstack-infra | 21:31 | |
*** rcernin has joined #openstack-infra | 21:32 | |
ianw | (doesn't seem so and afaics the value looks the same inside & outside the container, so they're not seeing different things) | 21:33 |
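The manual `cat /proc/sys/kernel/random/entropy_avail` check above could be repeated over time to see whether the pool drains during a build. A minimal Python sketch, assuming a Linux host that exposes that procfs file; `read_entropy_avail` and `sample_entropy` are hypothetical helper names:

```python
import time

ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def read_entropy_avail(path=ENTROPY_PATH):
    """Return the kernel's reported available entropy as an integer."""
    with open(path) as handle:
        return int(handle.read().strip())


def sample_entropy(samples, interval, path=ENTROPY_PATH):
    """Collect `samples` readings, sleeping `interval` seconds between them."""
    readings = []
    for index in range(samples):
        readings.append(read_entropy_avail(path))
        if index < samples - 1:
            time.sleep(interval)
    return readings
```

Running `sample_entropy(10, 5)` both inside and outside the container while a build is at the haveged step would show whether the two views diverge or the value drops.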
*** ijw has quit IRC | 21:41 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Switch back to docker for gerrit and nodepool-builder https://review.opendev.org/713101 | 21:41 |
*** ijw has joined #openstack-infra | 21:41 | |
ianw | mordred: that 1) changed the testinfra for nodepool to look for the container with docker and 2) removed the "-it" on the docker run for review-dev ... which I don't believe should be there for ansible running the container | 21:42 |
ianw | "stderr": "the input device is not a TTY" | 21:42 |
ianw | @ https://zuul.opendev.org/t/openstack/build/12dba267f8f44c2eae1ddaabfdd7ad27/log/job-output.txt#3880 | 21:42 |
*** bdodd has quit IRC | 21:43 | |
ianw | otherwise i think that's the right way to go. we clearly have enough going on without also having to debug podman/podman-compose | 21:44 |
*** bdodd has joined #openstack-infra | 21:45 | |
ianw | ok ... left field ... seems to be "vgs" somehow called from /usr/bin/bash /bin/kernel-install add 5.5.8-100.fc30.x86_64 /lib/modules/5.5.8-100.fc30.x86_64/vmlinuz ... http://paste.openstack.org/show/790720/ | 21:49 |
ianw | ok, so "apt-get install lvm" inside the nb container seems to hang in the same way, but in a different part | 22:02 |
ianw | this gets into docker, udev sync and devicemapper i think :/ | 22:13 |
ianw | # docker run --entrypoint /bin/bash zuul/nodepool-builder -c "sudo apt-get update; sudo apt-get install -y lvm2; echo 'running vgs'; sudo vgs --options vg_uuid,pv_name --noheadings --separator : ; echo 'done'" | 22:36 |
ianw | does *not* replicate the problem | 22:36 |
ianw | but, running that in the currently running nodepool-builder container *does* just hang vgs | 22:37 |
ianw | ergo ... the currently running container is either in some bad state wrt to udev/volumes/dm/lvs/????? that might just go away if we restart it OR ... this is something like the first build works but screws up state, and following builds don't | 22:38 |
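To compare a fresh container against the long-running one without leaving a shell stuck, a probe such as `vgs` can be wrapped in a timeout. A minimal Python sketch using the standard library's `subprocess.run`; the helper name `run_with_timeout` is hypothetical, and a process blocked in uninterruptible sleep may not actually exit when killed, so this is an assumption-laden convenience rather than a guarantee:

```python
import subprocess


def run_with_timeout(cmd, seconds):
    """Run `cmd` (a list of arguments); return its exit code.

    Returns None if the command did not finish within `seconds`,
    which is treated here as "hung".
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        return None
    return result.returncode
```

For example, `run_with_timeout(["sudo", "vgs", "--noheadings"], 30)` returning `None` in the running builder container but `0` in a fresh one would match the behaviour described above.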
*** dave-mccowan has joined #openstack-infra | 22:43 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Switch back to docker for gerrit and nodepool-builder https://review.opendev.org/713101 | 22:44 |
rm_work | mtreinish: such excitement! I won't have to go in and patch the testtools files in every virtualenv for every project just to be able to properly run unit tests in Pycharm \o/ | 22:45 |
*** tkajinam has joined #openstack-infra | 22:49 | |
fungi | ianw: could it be hostname-related? (given it was restarted with an overridden hostname) | 22:52 |
ianw | fungi: ... maybe? i think that we have some idea of what's going on now, but we'll be best getting the container into a redeployable state first before we debug further | 22:53 |
ianw | i'm just working on config file deployment | 22:53 |
fungi | yeah, it does call for a bit of sanity | 22:54 |
*** jamesmcarthur has joined #openstack-infra | 22:54 | |
*** dave-mccowan has quit IRC | 22:58 | |
*** jamesmcarthur has quit IRC | 23:03 | |
*** dave-mccowan has joined #openstack-infra | 23:08 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: nodepool-builder: put container configs in /etc https://review.opendev.org/713148 | 23:15 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: nodepool-builder: put container configs in /etc https://review.opendev.org/713148 | 23:22 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Switch back to docker for gerrit and nodepool-builder https://review.opendev.org/713101 | 23:27 |
*** dmellado has quit IRC | 23:28 | |
*** dmellado has joined #openstack-infra | 23:29 | |
*** dchen has joined #openstack-infra | 23:37 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: bindep: remove lsb-release https://review.opendev.org/713150 | 23:39 |
*** sshnaidm has joined #openstack-infra | 23:57 | |
kevinz | ianw: there? | 23:59 |
ianw | kevinz: hey, yep, how's it going? | 23:59 |