Friday, 2024-02-02

<tonyb> Should nodepool be 'pre-booting' nodes in the inmotion cloud? I expect to see some number in the 'Available' state. [00:00]
<fungi> nodepool only "pre-boots" nodes that have a min-ready count and only enough across all providers to meet that count [00:01]
<tonyb> Ah okay [00:01]
<fungi> what's the max-servers set to in inmotion? [00:02]
<fungi> in our nodepool configs i mean [00:02]
<tonyb> I'll check but I think the max is 51 [00:02]
<fungi> a nonzero value? [00:02]
<fungi> we sometimes set that to 0 to temporarily disable booting nodes in a provider, is why i ask [00:02]
<corvus> https://grafana.opendev.org/d/53e8120f2a/nodepool3a-inmotion?orgId=1 says 51 [00:03]
<fungi> so if you think we should be booting nodes there already and aren't, the next place to look would be the debug log for the launcher that provider is tied to. see if it's turning down node requests there for a specified reason [00:05]
<corvus> the ready node launch attempts graph shows values for the past 2 hours which indicates successful launches [00:07]
<tonyb> max-servers: 51 ... no min-servers so that matches [00:07]
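Annotation: the behaviour fungi describes can be pictured with a small toy model. This is only a sketch of the idea, not nodepool's actual code; the function names and the dictionary of per-provider ready counts are hypothetical and chosen for illustration.

```python
def nodes_to_preboot(min_ready: int, ready_by_provider: dict[str, int]) -> int:
    """Toy model: extra nodes needed so one label meets its min-ready count.

    The count is satisfied across all providers together, so a cloud may
    show no idle nodes even though it is healthy.
    """
    already_ready = sum(ready_by_provider.values())
    return max(0, min_ready - already_ready)


def provider_may_launch(max_servers: int, in_use: int) -> bool:
    """Toy model: a max-servers of 0 disables launches in a provider."""
    return max_servers > 0 and in_use < max_servers


if __name__ == "__main__":
    # No min-ready count configured: nothing is pre-booted.
    print(nodes_to_preboot(0, {"inmotion": 0}))
    # max-servers of 51 with 40 nodes in use: launches are still allowed.
    print(provider_may_launch(51, 40))
```

With no min-ready count the model pre-boots nothing, which is consistent with seeing no idle nodes in a provider whose max-servers is 51.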
<corvus> there were errors between 2200 and 2300 but not since 2300 [00:08]
<corvus> i think 2300 is your approximate "done" time, right? so those 2 graphs look good. [00:08]
<tonyb> I was mostly done yesterday; today was just cleaning things out. [00:10]
<tonyb> I'll look at the errors in the last 24 hours [00:11]
<tonyb> Yeah I think the third stage fix: https://etherpad.opendev.org/p/opendev-inmotion_debugging#L121 [01:01]
<tonyb> has cleaned up the errors [01:01]
<tonyb> the periodic pipeline will go off in about 1 hour; that'll be a good test [01:01]
<tonyb> and since I completed that the number of nodes launched in this cloud seems to be a little higher [01:08]
*** dtantsur_ is now known as dtantsur [01:50]
<opendevreview> Takashi Kajinami proposed openstack/diskimage-builder master: Get rid of 3rd party mock  https://review.opendev.org/c/openstack/diskimage-builder/+/907513 [02:29]
<tonyb> I think inmotion is doing better.  for the last 3+ hours it's sitting on 40+ nodes in use, there were a few errors in the last 30mins but far fewer than before [05:08]
<ykarel> Hi, is this a known issue: infra mirrors for centos-stream not updated since 16th January? [07:32]
<ykarel> http://mirror.iad.rax.opendev.org/centos-stream/timestamp.txt [07:32]
<ykarel> vs http://mirror.rackspace.com/centos-stream/timestamp.txt [07:33]
<ykarel> http://mirror.iad.rax.opendev.org/logs/rsync-mirrors/centos-stream.log [07:39]
<ykarel> shows rsync: close failed on "/afs/.openstack.org/mirror/centos-stream/9-stream/CRB/x86_64/os/Packages/.dotnet-sdk-7.0-source-built-artifacts-7.0.115-2.el9.x86_64.rpm.5ic9Kr": Disk quota exceeded (122) [07:40]
<ykarel> fungi, frickler can you check ^ [07:42]
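Annotation: a minimal sketch of how a mirror log could be scanned for the failure ykarel quotes. The regular expression is built only from the single log line shown above, so treating every quota failure as having exactly this shape is an assumption; the function name is hypothetical.

```python
import re

# Pattern derived from the one rsync error line quoted in the log above.
QUOTA_ERROR = re.compile(
    r'rsync: close failed on "(?P<path>[^"]+)": Disk quota exceeded \(122\)'
)


def find_quota_errors(log_text: str) -> list[str]:
    """Return the file paths named in 'Disk quota exceeded' rsync errors."""
    return [match.group("path") for match in QUOTA_ERROR.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        'rsync: close failed on "/afs/.openstack.org/mirror/centos-stream/'
        "9-stream/CRB/x86_64/os/Packages/.dotnet-sdk-7.0-source-built-"
        'artifacts-7.0.115-2.el9.x86_64.rpm.5ic9Kr": Disk quota exceeded (122)'
    )
    for path in find_quota_errors(sample):
        print(path)
```

Any hit means the rsync run could not finish writing into the volume, which would explain a mirror timestamp that stops advancing.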
<frickler> ykarel: it has been known that some volumes are running close to their limits for some time, but I didn't know that this has already happened [09:55]
<frickler> tonyb: did you set up your AFS credentials yet? ^^ might be a good opportunity to get a bit of practice. otherwise I'll just do a small quota bump as a quick workaround [09:58]
<tonyb> frickler: I haven't done anything with AFS credentials.  so I guess point me in the direction of doing that and I'll do it ASAP [10:00]
<frickler> tonyb: https://docs.opendev.org/opendev/system-config/latest/afs.html is the general doc, I must admit I'm not too deep into this myself, so if you need help better wait for fungi or clarkb [10:14]
<frickler> the command I would run would be "fs setquota /afs/.openstack.org/mirror/centos-stream -max 350000000", current value is 300M [10:16]
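Annotation: a small sketch of the arithmetic behind that quota bump. It assumes that the -max value of fs setquota is counted in kilobyte blocks, as the OpenAFS documentation describes; the helper names are hypothetical, and the command is only assembled as a list, never executed.

```python
def kb_to_gib(kilobytes: int) -> float:
    """Convert a quota given in 1024-byte blocks to GiB."""
    return kilobytes / (1024 * 1024)


def setquota_command(path: str, max_kb: int) -> list[str]:
    """Build (but do not run) the fs setquota invocation quoted above."""
    return ["fs", "setquota", path, "-max", str(max_kb)]


if __name__ == "__main__":
    old_kb, new_kb = 300_000_000, 350_000_000
    print(f"old quota: {kb_to_gib(old_kb):.1f} GiB")
    print(f"new quota: {kb_to_gib(new_kb):.1f} GiB")
    print(" ".join(setquota_command("/afs/.openstack.org/mirror/centos-stream", new_kb)))
```

Under that assumption the proposed change raises the volume quota from roughly 286 GiB to roughly 334 GiB.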
<tonyb> frickler: thanks [10:55]
<ykarel> thx frickler tonyb [12:05]
<ykarel> so is it updated? [12:05]
<frickler> ykarel: not yet, is this causing actual job failures or was it just something you noticed? [12:17]
<ykarel> frickler, noticed with weekly job https://zuul.openstack.org/builds?job_name=neutron-fullstack-with-uwsgi-fips&branch=master&skip=0 [12:18]
<frickler> oh, important gotcha when debugging a held system-config-run-* job: the nodes are configured with our usual sysadmin accounts, so you can't log in as root as with "normal" held nodes. but I guess you all already knew this and I'm just late to the party [12:42]
<frickler> aah, that's a nice failure. if we merge/test https://review.opendev.org/c/opendev/system-config/+/907500, the page https://opendev.org/opendev/system-config/ does actually contain 'Internal Server Error' because that text is in the commit message which is then shown on that page. so we'll need to temporarily disable this check [12:52]
<frickler> https://opendev.org/opendev/system-config/src/branch/master/testinfra/test_gitea.py#L133 unless someone has a better idea [12:52]
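Annotation: the collision frickler describes can be reproduced with a toy example. This sketch is neither the code in test_gitea.py nor the fix that was eventually made; both helper functions are hypothetical. They only contrast a check on the page body with a check on the HTTP status code, which is one possible way to avoid matching text that legitimately appears in a commit message.

```python
def body_check_fails(body: str) -> bool:
    """Naive check: flag any page whose body mentions the error string."""
    return "Internal Server Error" in body


def status_check_fails(status_code: int) -> bool:
    """Alternative check: flag only real HTTP 5xx responses."""
    return 500 <= status_code <= 599


if __name__ == "__main__":
    # A healthy page that displays a commit message mentioning the error text.
    status, body = 200, "<p>Latest commit: fix 'Internal Server Error' responses</p>"
    print("body check fails:", body_check_fails(body))
    print("status check fails:", status_check_fails(status))
```

The body check reports a failure for a page that was served successfully, while the status check does not.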
<frickler> ykarel: ok, I've done some small quota increase now, will check after the next rsync run. I think we still have some other quota bumps for tonyb to look at [12:56]
<ykarel> thx frickler [12:57]
*** blarnath is now known as d34dh0r53 [14:53]
<fungi> infra-root: ems has confirmed the terms of our matrix homeserver business plan and will be officially upgrading us on wednesday next week (2024-02-07). we shouldn't expect any downtime or other impact to the service, the only outwardly visible change will, i think, be the increase in our user quota which we weren't using all of before anyway [15:31]
<fungi> i'll try to remember to remind folks the day before, in the weekly meeting, too [15:32]
<fungi> popping out for a lunch break, may take a slightly longer one than usual since it's friday and i also have some quick errands to run, but should still be back by 18:00 at the latest (probably sooner) [16:08]
<clarkb> fungi: thank you for dealing with that [16:23]
<clarkb> frickler: I'll work on adjusting the test to be less likely to collide [16:23]
<opendevreview> Clark Boylan proposed opendev/system-config master: Increase gitea db connection limit  https://review.opendev.org/c/opendev/system-config/+/907500 [16:40]
<clarkb> this should pass testing now I hope [16:40]
<clarkb> that gitea db change passes now [18:15]
<opendevreview> Merged opendev/system-config master: Retire the OpenInfra Labs mailing list  https://review.opendev.org/c/opendev/system-config/+/907103 [18:30]
<opendevreview> Elod Illes proposed openstack/project-config master: [relmgt] Update reno when cutting unmaintained branch  https://review.opendev.org/c/openstack/project-config/+/907626 [19:15]
*** priteau_ is now known as priteau [21:44]
