Tuesday, 2024-01-09

opendevreviewJulia Kreger proposed openstack/ironic master: RBAC: Fix allocation check  https://review.opendev.org/c/openstack/ironic/+/90503801:09
opendevreviewJulia Kreger proposed openstack/ironic master: DNM: Change to enforced policy by default  https://review.opendev.org/c/openstack/ironic/+/90200901:09
opendevreviewMerged openstack/ironic-lib master: Drop lower-constraints.txt (again)  https://review.opendev.org/c/openstack/ironic-lib/+/90499302:45
TheJuliaAny thoughts on removing ironic-inspector-tempest-managed-non-standalone? It is failing a ton, and nobody has fixed it. It looks like the base image is configured to rely upon netboot. Given we've basically deprecated inspector itself at this point, it just seems like something we don't really need to be spending resources on.04:56
opendevreviewJulia Kreger proposed openstack/ironic master: CI: Remove ironic-inspector-tempest-managed-non-standalone  https://review.opendev.org/c/openstack/ironic/+/90505705:04
TheJuliaIf someone that knows the inner config of the metal3-integration job could take a look at https://zuul.opendev.org/t/openstack/build/a0d63ddc5a8c4aa2b1acfd900df55489  It seems entirely unrelated to changing the policy default for RBAC since it no-ops in that config. All signs point to disk space for the namespace?05:05
rpittaugood morning ironic! o/08:02
rpittauTheJulia: re metal3-integration job: from a quick look it seems like the issue is indeed related to the lack of disk space 08:27
opendevreviewRiccardo Pittau proposed openstack/bifrost master: Uplift default Ansible version to 8.x  https://review.opendev.org/c/openstack/bifrost/+/90395008:33
opendevreviewRiccardo Pittau proposed openstack/bifrost master: Uplift default Ansible version to 8.x  https://review.opendev.org/c/openstack/bifrost/+/90395008:35
opendevreviewRiccardo Pittau proposed openstack/ironic bugfix/23.1: Handle LLDP parse Unicode error  https://review.opendev.org/c/openstack/ironic/+/90486008:40
opendevreviewDmitry Tantsur proposed openstack/ironic-python-agent master: [WIP] Make inspection URL optional if the collectors are provided  https://review.opendev.org/c/openstack/ironic-python-agent/+/90402609:09
opendevreviewDmitry Tantsur proposed openstack/ironic master: DNM api debugging  https://review.opendev.org/c/openstack/ironic/+/90511312:44
opendevreviewDmitry Tantsur proposed openstack/ironic-python-agent master: [WIP] Make inspection URL optional if the collectors are provided  https://review.opendev.org/c/openstack/ironic-python-agent/+/90402612:45
opendevreviewDmitry Tantsur proposed openstack/bifrost master: Switch the dibipa jobs to Redfish  https://review.opendev.org/c/openstack/bifrost/+/90511412:49
dtantsurrpittau: I'm not super fond of ^^^, but I don't see other options12:50
rpittaummm alright :/13:24
TheJuliagood morning14:04
TheJuliarpittau: ack, okay. I guess it is good to see the job fail again on another job change entirely unrelated. 14:06
rpittauTheJulia: oooook that's not good for me :D14:16
* dtantsur is curious how many people actually test iPXE with IPv614:17
TheJuliait is used quite a bit, we just only have one job running it in our CI, why?14:19
TheJuliarpittau: sorry :(14:19
rpittauTheJulia: no worries! what's the other failing change?14:20
dtantsurTheJulia: I see a customer bug which seems to boil down to iPXE sending a malformed host header14:20
dtantsurwithout brackets around the IP14:20
opendevreviewJulia Kreger proposed openstack/ironic-inspector master: DNM: Change policy to enforce only new policy  https://review.opendev.org/c/openstack/ironic-inspector/+/90511914:20
TheJuliadtantsur: interesting, could it be the uefi network stack doing it?14:21
dtantsurWho knows? Possibly?14:22
TheJuliaJust thinking it gets a ton of networking stuffs from that, and it seems rather... odd14:22
dtantsurDoes it use HTTP implementation from the stack?14:22
TheJuliaI think it might...14:23
dtantsurCould explain why we don't see it more often14:23
TheJuliaso, it does not, it does most of it from what I'm seeing14:24
TheJuliawhat header/field is malformed?14:24
dtantsurHost14:24
TheJuliahttps://github.com/ipxe/ipxe/blob/master/src/net/tcp/httpcore.c#L601-L65514:27
dtantsurI'm looking at that too, but it's not trivial to understand if the host is bracketed anywhere14:28
TheJuliait doesn't reformat it. I suspect it was geared around v4/v6 or an FQDN, and they never had anyone have anything which did any sort of host header check. Looks like http_connect in httpconn.c just copies whatever is in the field14:30
dtantsurThey do have tests for this case https://github.com/ipxe/ipxe/blob/master/src/tests/uri_test.c#L586-L59614:30
TheJuliaheh, they are expecting to get the url with brackets14:30
TheJuliawhich is valid14:30
dtantsurCould it be that we're not supplying it correctly? but then how does it work at all?14:33
TheJuliaThat is a distinct possibility. I would sort of expect code to go "oh, I need to drop the brackets" too, so I'm wondering if it is just hidden c library smarts when it gets linked14:35
TheJuliaThere is also the interface binding which is occurring, not sure what all that is about14:35
dtantsurdaaaaamn14:40
TheJuliafind something major?14:40
dtantsuryeah, it's our fault. we don't escape the host in one case.. which should not really be hit, but it is14:40
TheJuliadoh14:40
dtantsurs/escape/put in brackets/ you got it14:40
dtantsurhttps://github.com/metal3-io/ironic-image/blob/main/ironic-config/inspector.ipxe.j2#L814:40
TheJuliadoh14:41
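The bracketing rule discussed above can be sketched as follows. This is a minimal illustration in Python, not the actual fix to the linked template: the helper names `format_host_for_url` and `build_url`, the port, and the `/callback` path are all hypothetical. The idea is that an IPv6 literal must be wrapped in brackets when placed in a URL (and therefore in the Host header derived from it), while IPv4 addresses and hostnames are left alone.

```python
import ipaddress


def format_host_for_url(host: str) -> str:
    """Return host in a form that is safe to embed in a URL.

    IPv6 literals get brackets; IPv4 addresses and hostnames are unchanged.
    """
    bare = host.strip("[]")  # tolerate input that is already bracketed
    try:
        addr = ipaddress.ip_address(bare)
    except ValueError:
        return host  # not an IP literal, assume a hostname or FQDN
    return f"[{bare}]" if addr.version == 6 else bare


def build_url(host: str, port: int, path: str = "/callback") -> str:
    # "/callback" is a placeholder path for illustration only.
    return f"http://{format_host_for_url(host)}:{port}{path}"


print(build_url("fd00::1", 5050))
print(build_url("192.0.2.10", 5050))
```

Without the brackets, `fd00::1:5050` is ambiguous, because the port cannot be told apart from the last group of the address.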
dtantsurmy only question is why they hit unmanaged inspection14:43
TheJuliawrong addresses?14:46
dtantsurjust 3G of logs to download, and I'll probably know..14:47
TheJuliawheeeeeee14:47
dtantsuryep, you're right, wrong bootMACAddress15:01
dtantsurTIL people use these commands https://docs.openstack.org/python-ironic-inspector-client/latest/cli/index.html#list-interface-data so we probably need to migrate them too15:06
dtantsurMeanwhile, do I really want to know why the dibipa jobs have both glean and cloud-init on the ramdisks? maybe I do, maybe I don't?15:16
TheJuliashort answer, I've been trying to fix that, and I need to have a long discussion with you about it in general15:16
dtantsur\o/15:16
TheJuliathe longer answer is cloud-init is not explicitly excised in builds/images when it comes from OS vendors15:18
* dtantsur watches a situation where the whole inventory becomes a *KEY* in a **kwargs dictionary and slowly loses his mind15:23
dtantsurhaha, missing content-type makes ironic go insane15:30
TheJuliacan  you blame it?15:38
opendevreviewDmitry Tantsur proposed openstack/ironic-python-agent master: Add missing headers to the inspection callback  https://review.opendev.org/c/openstack/ironic-python-agent/+/90512615:39
dtantsurgiven the state of my own sanity? not really :)15:39
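For illustration, a JSON callback that states its content type explicitly might look like the sketch below. It only builds the request object and does not send it. The URL, the payload shape, and the function name `build_callback_request` are assumptions made for this example, not the ironic-python-agent implementation.

```python
import json
import urllib.request


def build_callback_request(url: str, inventory: dict) -> urllib.request.Request:
    """Build a POST request whose body is JSON and is labelled as such."""
    # Hypothetical payload shape: the inventory nested under one key.
    body = json.dumps({"inventory": inventory}).encode("utf-8")
    headers = {
        # Without this header the server has to guess how to decode the body.
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_callback_request(
    "http://ironic.example.com/callback",  # placeholder URL
    {"cpu": {"count": 4}},
)
print(req.get_method(), req.header_items())
```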
opendevreviewDmitry Tantsur proposed openstack/ironic-python-agent master: Support several API and Inspector URLs  https://review.opendev.org/c/openstack/ironic-python-agent/+/90399915:45
opendevreviewDmitry Tantsur proposed openstack/ironic-python-agent master: [WIP] Make inspection URL optional if the collectors are provided  https://review.opendev.org/c/openstack/ironic-python-agent/+/90402615:45
opendevreviewJulia Kreger proposed openstack/ironic-inspector master: Change policy to enforce only new policy  https://review.opendev.org/c/openstack/ironic-inspector/+/90511915:54
TheJuliarpittau: so, do we have an idea on the forward path w/r/t the metal3-integration job?15:59
JayFTheJulia: we often set it -nv when it's broken like this for ironic-unrelated reasons, I'd +2A such a patch16:03
JayFTheJulia: I'll note they just enabled e2e tests on all PRs in metal3, per slack over there, so I suspect that could be causing infrastructural issues16:03
rpittauTheJulia: I'm not sure what changed, I also see it's not always failing16:03
TheJuliaIndeed, but if there is a quick fix there is no real reason to NV it if we can just fix it quickly16:03
rpittauJayF: 16:03
rpittauI don't think that impacts our CI though16:04
JayFI assumed it might be running on the same machine(s)16:04
rpittauoh ok, no it's running on opendev CI16:04
TheJuliaShouldn't, could it be location on disk being used? aren't some of the machines out there only like 40 or 60gb, and then there are ones with a giant /opt ?16:04
rpittauthis actually passed today https://zuul.opendev.org/t/openstack/build/bfee7cf06545478781baa9e3c0fd0bed16:04
rpittauTheJulia: yeah, I was wondering that too16:05
TheJuliaokay, Then in that case I'll rev my dnm path once the patch on inspector to do the same finishes running16:05
TheJuliaand just kind of live with it for now as an intermittent failure16:06
rpittauthe disk space required seems also variable16:06
rpittaujust checked a couple of failed jobs16:06
TheJuliathat seems... weird16:07
clarkbgithub is having an outage/had a recent outage. Many of the recent failures are due to failures accessing github resources16:07
rpittauclarkb: yes, that's the other failures, but not clear why suddenly we're seeing the disk space issue too, or how it can be related16:08
clarkbok without links to where you see the disk is full (or info on which path has no space and where the job ran) it is hard for me to help beyond that. Just noticed that a handful of the latest failures appear related to the github outage16:16
rpittauclarkb: thanks for the info, we saw the failures in various changes, for example https://zuul.opendev.org/t/openstack/build/776964071b3f44bcb3d5efaeee7a31c6 the directory involved is /shared which should not be a separate partition 16:18
clarkbrpittau: can you link to where you see /shared is full?16:19
clarkbthat job ran on ovh-bhs1 which has a single partition and filesystem.16:19
clarkbthat filesystem should be roughly 80GB large iirc16:20
rpittauclarkb: sure, and sorry, I got the wrong change!16:20
rpittauin the ironic logs here https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_a0d/902009/7/check/metal3-integration/a0d63dd/controller/before_pivoting/ironic.log16:20
rpittauFailed to deploy instance: Disk volume where '/shared/tftpboot/master_images' is located doesn't have enough disk space. Required 446 MiB, only 225 MiB available space present.16:21
clarkbnote you can link directly to that message like this: https://zuul.opendev.org/t/openstack/build/a0d63ddc5a8c4aa2b1acfd900df55489/log/controller/before_pivoting/ironic.log#155116:23
rpittauclarkb: yep, thanks16:24
clarkbthat job ran on rax-ord which has two disks. One for / and then another that is typically mounted under /opt16:24
clarkbHowever the job is responsible for partitioning and formatting a filesystem on the ephemeral disk and mounting it so I can't say for certain if /opt is a different fs in that particular job. The / disk is about 20GB and the ephemeral disk should be around 80GB iirc16:24
clarkblooks like these jobs collect system info at the end. May want to include `df` output16:25
rpittauclarkb: yes, I will do that16:26
rpittauthe space there can quickly decrease as we have a lot of images and containers running16:26
rpittauclarkb: thanks!16:26
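The kind of check behind the "Required 446 MiB, only 225 MiB available" message can be approximated with the standard library. The sketch below is an assumption about the general technique, not Ironic's actual code, and the function names are hypothetical. It compares free space on the filesystem holding a path against a requirement, which is roughly what `df` on that path would show.

```python
import shutil

MIB = 1024 * 1024


def free_mib(path: str) -> int:
    """Free space, in MiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free // MIB


def is_sufficient(available_mib: int, required_mib: int) -> bool:
    return available_mib >= required_mib


# The numbers from the log message above: 225 MiB free, 446 MiB required.
print(is_sufficient(225, 446))

# Checking a real path; "." stands in for a directory such as the image cache.
print(free_mib("."))
```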
opendevreviewRiccardo Pittau proposed openstack/ironic master: Add df logs to metal3 integration job  https://review.opendev.org/c/openstack/ironic/+/90513316:29
clarkblooking at https://zuul.opendev.org/t/openstack/build/a0d63ddc5a8c4aa2b1acfd900df55489/log/controller/system/docker-images.txt some of those images look fairly large. The ironic and sushy-tools in particular stand out to me. I think our (opendev and zuul) python images tend to be closer to 200-300MB. One exception is zuul-executor where we install multiple versions of16:30
clarkbansible and for reasons ansible is quite large16:30
clarkbthat said free space on / is going to be less than 20GB so optimizing the image size may not solve this particular problem16:30
rpittauclarkb: these images come from the metal3 project and they follow different rules, for example they're based on CentOS Stream 916:31
clarkbhopefully they are doing multistage builds and installing wheels into the final production images16:32
rpittauclarkb: it's actually a mix of things, but wheels are indeed installed in the final image16:33
clarkbrpittau: taking a quick glance the sushy-tools image appears to be based on debian and has two versions of python installed (3.9 and 3.11) as well as git and gcc. I suspect there is a fair bit of pruning that can be done16:40
clarkbbut again I'm not sure that will address the job failing issue. However, it does seem like a fairly bloated image16:40
rpittauclarkb: oh yeah, I was thinking about the main ironic image16:40
clarkbI suspect they aren't doing a multistage build and installing from binary packages (wheels) which is why you end up with git and gcc installed. I have no idea why you need two versions of python16:41
rpittauclarkb: which dockerfile are you looking at though? we're basing sushy-tools on python3.9 https://github.com/metal3-io/ironic-image/blob/main/resources/sushy-tools/Dockerfile16:43
clarkbrpittau: I'm not looking at a dockerfile. I am running bash in quay.io/metal3-io/sushy-tools and using du to inspect disk consumption16:44
JayFrpittau: is that a version where apt is going to install recommends by default?16:44
JayFrpittau: if so, that explains ~everything clarkb has seen16:44
rpittauclarkb: that image may be outdated16:44
clarkbrpittau: that's the exact image in your job? https://zuul.opendev.org/t/openstack/build/a0d63ddc5a8c4aa2b1acfd900df55489/log/controller/system/docker-images.txt#1416:45
rpittaummm alright then something's not right16:45
clarkboh though I guess the image above it is a local image with the same id so maybe built locally and tagged that way?16:45
rpittauthe size of the sushy-tools image there is way too big16:46
rpittauanyway I don't think this is the root cause for the disk space issue, but good to know, thanks clarkb 16:47
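A rough equivalent of the `du` inspection described above, for cases where only Python is available inside the image. This is a sketch with hypothetical function names. It sums apparent file sizes rather than allocated blocks and counts hard links more than once, so its numbers will not match `du` exactly.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the apparent sizes of all files below root (symlinks not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable, skip it
    return total


def largest_subdirs(root: str, top: int = 5):
    """Return up to `top` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size_bytes(entry.path), entry.path))
    return sorted(sizes, reverse=True)[:top]


for size, path in largest_subdirs("."):
    print(f"{size / (1024 * 1024):8.1f} MiB  {path}")
```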
rpittaugood night! o/17:07
*** jlvillal_ is now known as jlvillal17:13
opendevreviewJulia Kreger proposed openstack/ironic master: Change to enforced policy by default  https://review.opendev.org/c/openstack/ironic/+/90200919:43
TheJuliaso two relatively quick and easy bug fixes I'd like to get some eyes on: https://review.opendev.org/c/openstack/ironic/+/905022 and https://review.opendev.org/c/openstack/ironic/+/905038 basically issues encountered when changing the default policy enforcement over to the new RBAC policy19:45
JayFI looked at 905038 last night and it made no sense, it makes perfect sense now19:48
opendevreviewJulia Kreger proposed openstack/ironic master: Disable legacy RBAC policy by default.  https://review.opendev.org/c/openstack/ironic/+/90200919:48
JayFtemporal nature of code review lol19:48
TheJulialol19:49
JayF+2 both of those :)19:49
TheJuliait is true!19:49
JayFlet's make sure the release note for swapping the defaults19:49
JayFis really really loud19:49
JayFI haven't looked at that change yet so maybe it is but just saying it out loud :D 19:49
JayFlike jay in a loud bar with a headcold trying to get someone's attention on the other side of the room loud lol19:50
TheJuliaoh, both changes are very loud release notes19:51
TheJuliawhy is jay in the irc bar with a headcold though?19:51
JayFjay in a bar in general would be B.C. in general anyway lol19:55
JayFI haven't been in a bar that wasn't attached to somewhere like an Applebees or BWW in so long lol19:55
TheJulia.... We started to visit speakeasys19:55
TheJuliaspeaking of...19:55
* TheJulia feels a need to meet the editor19:56
JayFI used up my liver's capacity for alcohol in my 20s, so while speakeasys are cool I have to make sure my drinks are easy lol19:56
TheJuliamakes tons of sense19:56
JayFalright, gotta go walk the dog while there's a lull in the wind/rain19:57
TheJulia++19:59
JayFI'd like to suggest someone who has knowledge of bifrost take a look at this and verify we don't have any dangling dependencies https://review.opendev.org/c/openstack/governance/+/877132/3/reference/projects.yaml21:40
JayFI'm 99.9999% sure we're OK, but don't trust myself to know what I may not know21:40
TheJuliayeah, none of that was used afaik21:43
JayFhttps://review.opendev.org/c/openstack/governance/+/905145/1/reference/projects.yaml21:45
JayFwrong link, but I think you're right anyway21:45
opendevreviewJay Faulkner proposed openstack/ironic master: Basic support for OVN VTEP switches  https://review.opendev.org/c/openstack/ironic/+/90056823:29