opendevreview | Verification of a change to openstack/ironic master failed: CI: Remove ubuntu focal job https://review.opendev.org/c/openstack/ironic/+/894014 | 02:25 |
---|---|---|
opendevreview | Merged openstack/ironic master: CI: Remove ubuntu focal job https://review.opendev.org/c/openstack/ironic/+/894014 | 05:18 |
johnthetubaguy | JayF: I can be available later to give you a hand | 09:33 |
opendevreview | Alex Welsh proposed openstack/bifrost master: Improve downloaded deployment image support https://review.opendev.org/c/openstack/bifrost/+/884888 | 09:44 |
JayF | johnthetubaguy: I'd appreciate that. I'll be starting in about an hour | 13:14 |
opendevreview | Dmitry Tantsur proposed openstack/ironic master: [WIP] Generic API for attaching/detaching virtual media https://review.opendev.org/c/openstack/ironic/+/894918 | 13:20 |
samcat116 | I'm trying to do an L3 deployment using virtual media. My servers here need to use an 802.3ad bond to the switches, so I'm trying to use the --network-data feature to create those bonds. However the IPA doesn't seem to be grabbing it. I can see the json I've created on the node properties, however the IPA doesn't seem to grab it at all. Is there a location on the IPA where this should be set? I can see on the IPA journal logs that | 13:26 |
samcat116 | the first thing it says is it cannot find a config drive, so I am assuming thats part of it. Do I need to do anything special to enable that? | 13:26 |
TheJulia | good morning | 13:30 |
opendevreview | Alex Welsh proposed openstack/bifrost master: Improve downloaded deployment image support https://review.opendev.org/c/openstack/bifrost/+/884888 | 13:31 |
opendevreview | Alex Welsh proposed openstack/bifrost master: Improve downloaded deployment image support https://review.opendev.org/c/openstack/bifrost/+/884888 | 13:39 |
TheJulia | Anyone have the ptg etherpad link handy? And can we please add it to the whiteboard | 13:41 |
TheJulia | found in the eavesdrop logs, added | 13:43 |
samcat116 | Am I correct in that the --config-drive argument affects the end user image and not the ramdisk/IPA image? | 14:01 |
samcat116 | And that if I specify something with --network-data the conductor will build a config drive for the IPA for me, and then I can use --config-drive to send cloud-init data to the final end user image | 14:02 |
opendevreview | Alex Welsh proposed openstack/bifrost master: Improve downloaded deployment image support https://review.opendev.org/c/openstack/bifrost/+/884888 | 14:09 |
TheJulia | samcat116: give me a little bit and I can look, I woke up with a migraine this morning | 14:11 |
samcat116 | Sure no worries at all | 14:12 |
TheJulia | so my understanding is that is exactly how it works. There are likely rough edges because the person doing the work was successful, did some presentations, and after it was initially released, but really before folks started looking to use it, they became ill | 14:21 |
TheJulia | so IPA might not be grabbing it for a couple different reasons, and it really is not IPA, we set the stage for it. Is simple-init included in your IPA? | 14:22 |
TheJulia | the DIB element simple-init ? | 14:22 |
dtantsur | samcat116, all sounds correct to me | 14:25 |
samcat116 | I built an ipa and made sure to add the flag to include glean | 14:28 |
samcat116 | Which is what the spec says network-data needs | 14:29 |
TheJulia | samcat116: were you able to capture the boot log from the host? | 14:29 |
TheJulia | perhaps over IPMI serial over lan? | 14:30 |
samcat116 | I just had a crash card hooked up to the server in the data center | 14:31 |
TheJulia | and the screen is going to scroll *really* fast | 14:31 |
samcat116 | I think my next step is to throw ironic into debug mode and see if it catches anything there, as it sounds like it would be something on that side not building the iso correctly with the config drive | 14:31 |
samcat116 | Yep that was annoying | 14:31 |
TheJulia | yeah, that is likely a good first step | 14:31 |
TheJulia | I'd also see if you can get an ipmi SOL console and capture that output to a text file | 14:31 |
TheJulia | that way you can see what the agent is doing/reporting | 14:31 |
TheJulia | as well as glean/simple-init | 14:32 |
TheJulia | it might be there is a silly syntax error or something that is not being spotted | 14:32 |
TheJulia | that it finds when it attempts to do the needful. | 14:32 |
samcat116 | I’ll try that as well | 14:32 |
TheJulia | Between the two, that *should* give you all the information needed to figure out what is going on | 14:35 |
TheJulia | ... in theory :) | 14:35 |
TheJulia | Well, I'm on an etherpad roll today. 6 new items so far today | 14:58 |
JayF | TheJulia: can you spare 10 minutes for me please? | 15:26 |
JayF | TheJulia: relatively urgent and I have johnthetubaguy on the zoom | 15:26 |
TheJulia | give me a couple of minutes, to get to a stopping point | 15:26 |
TheJulia | JayF: link? | 15:30 |
opendevreview | Jay Faulkner proposed openstack/ironic master: [CI] Support for running with shards https://review.opendev.org/c/openstack/ironic/+/894460 | 15:37 |
JayF | FYI: Nova sharding is being reverted due to several issues https://etherpad.opendev.org/p/nova-sharding-rca | 17:30 |
dtantsur | sadpanda | 17:45 |
JayF | would be a sadder panda if we had landed it broken | 17:53 |
iurygregory | perfect... I was able to upgrade using the firmware interface, but.... | 18:39 |
iurygregory | | last_error | Node b1bbbbbb-84b1-5856-bfb6-6b5f2cd3dd11 failed step {'interface': 'firmware', 'step': 'update', 'args': | | 18:39 |
iurygregory | | | {'settings': [{'component': 'bmc', 'url': 'http://10.19.130.157:8080/ilo5281.bin'}]}, 'abortable': False, | | 18:39 |
iurygregory | | | 'priority': 0}: Failed to set node power state to power on. | 18:39 |
iurygregory | .-. need to figure out this now... | 18:39 |
samcat116 | So getting back to my network-data troubleshooting, I can see the network data being built into the ISO under /openstack/latest/network-data.json. I don't know how that file on the iso gets mapped into a config drive as that's what glean is expecting from what I can tell. Working on getting the ipa logs next | 19:23 |
samcat116 | Ok definitely an error from Glean, however the error is just `Error: <mac-address I have in the file>` | 20:05 |
TheJulia | well, that is vague | 20:07 |
TheJulia | any chance you can share what you set for the network-data ? | 20:07 |
samcat116 | https://paste.opendev.org/show/bBOWUBSEHA5cSPd04PJo/ | 20:14 |
TheJulia | oh... it is literally just "Error: 00:11:22:33:44:55" | 20:15 |
TheJulia | I've seen this before | 20:15 |
samcat116 | heres the glean logs | 20:15 |
samcat116 | https://paste.opendev.org/show/bX30K9ODurwpuNQoe9AH/ | 20:15 |
JayF | samcat116: omit the mac for bond0 | 20:16 |
JayF | samcat116: note I say that with more confidence and authority than I have; but I would 100% believe glean can't differentiate because they have the same mac across 3 links | 20:16 |
JayF | but that's mostly a guess | 20:17 |
TheJulia | bingo most likely, actually | 20:17 |
TheJulia | yeah, two interfaces with the same mac | 20:17 |
samcat116 | i was thinking that too and will try it, but it seemed like it errored on both interfaces with the opposing mac, which made me suspicious | 20:17 |
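A minimal sketch of the duplicate-MAC guess above, for checking a network_data document before handing it to Ironic. It assumes the nova metadata layout, with a top-level `links` list whose entries carry `id` and `ethernet_mac_address`. The helper name `duplicate_macs` and the sample values are made up for illustration and are not taken from the paste.

```python
from collections import defaultdict


def duplicate_macs(network_data):
    """Return {mac: [link ids]} for every MAC used by more than one link."""
    by_mac = defaultdict(list)
    for link in network_data.get("links", []):
        mac = link.get("ethernet_mac_address")
        if mac:
            # Compare case-insensitively; MACs are often written either way.
            by_mac[mac.lower()].append(link.get("id"))
    return {mac: ids for mac, ids in by_mac.items() if len(ids) > 1}


# Hypothetical sample: the bond reuses the MAC of its first member.
sample = {
    "links": [
        {"id": "eno1", "type": "phy",
         "ethernet_mac_address": "00:11:22:33:44:55"},
        {"id": "eno2", "type": "phy",
         "ethernet_mac_address": "00:11:22:33:44:66"},
        {"id": "bond0", "type": "bond", "bond_links": ["eno1", "eno2"],
         "ethernet_mac_address": "00:11:22:33:44:55"},
    ]
}

print(duplicate_macs(sample))
```

A non-empty result only says that two links share an address. Whether glean actually trips over that is the open question in this discussion.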
* TheJulia wonders if glean happily handles bonding | 20:17 | |
samcat116 | I cmd + F'd the glean source and found a bunch of instances, so, hopefully? | 20:18 |
samcat116 | in my mind thats like feature number 2 after setting a static IP | 20:18 |
samcat116 | Looking again it definitely does | 20:19 |
samcat116 | it doesn't validate any of the bond settings it seems, just happily passes them along | 20:20 |
JayF | makes sense | 20:20 |
JayF | this format was originally specified at the request of a user who had bonded networks w/vlans on top of the bond | 20:20 |
JayF | (it me) | 20:21 |
JayF | https://review.opendev.org/c/openstack/releases/+/894228 probably last chance to look at cycle highlights if folks wanna take another look | 20:29 |
samcat116 | JayF so i removed the mac for bond0, but now im getting a json validation error | 20:32 |
samcat116 | 'Invalid network_data: {''id'': ''bond0'', ''type'': ''bond'', ''bond_links'': [''eno1'', ''eno2''], ''bond_mode'': ''802.1ad'', ''bond_xmit_hash_policy'': ''layer3+4'', ''bond_miimon'': 100} is not valid under any of the given schemas (HTTP 400)' | 20:32 |
JayF | I wonder what happens if you have the key but set the value to null | 20:32 |
samcat116 | let me try that | 20:33 |
JayF | yours is right per the original spec, fwiw https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/metadata-service-network-info.html#rest-api-impact | 20:34 |
JayF | I've only ever used this with cloud-init interpreting the user-data, not glean, so I'm not sure exactly where the edges are | 20:34 |
JayF | what OS samcat116 | 20:34 |
samcat116 | centos | 20:35 |
samcat116 | I found the json schema for this the other day but can't find it again | 20:41 |
JayF | likely the link I just pasted to nova-specs? | 20:41 |
JayF | samcat116: I can find no reason your original wouldn't have worked | 20:44 |
JayF | samcat116: and I just read the glean codepath it (should) be taking | 20:44 |
samcat116 | null or empty string aren't accepted there | 20:44 |
JayF | Sep 13 15:05:41 localhost python-glean[764]: DEBUG:g lean: Interface matched: eno2 (08:94 :ef:a:7b:9f) | 20:44 |
JayF | Sep 13 15:05:41 localhost puthon-glean[764]: Error: '08:94 :ef:aa:7b:9e' | 20:44 |
JayF | are those spaces added by you? | 20:44 |
JayF | oh, this is SOL output | 20:45 |
JayF | isn't it | 20:45 |
samcat116 | yes it is | 20:45 |
samcat116 | i dont have sol so its a screenshot then macos ocr | 20:45 |
samcat116 | but messed that bit | 20:45 |
JayF | yeah it's nbd, just making sure the bungled output was in the transmission | 20:45 |
JayF | don't wanna go digging for breakages if you have a bad stick o'ram | 20:46 |
samcat116 | does that list of interfaces need to be exhaustive? | 20:47 |
JayF | any unmatched ones get a dhcp iirc | 20:48 |
samcat116 | k | 20:48 |
TheJulia | JayF: so, trying to get a unit test to verify Node.list in the sdk seems... elusive :\ | 20:48 |
JayF | so idk how python-y you are | 20:48 |
JayF | TheJulia: I am sorta off that category of thing for the day, need fresh brain | 20:48 |
JayF | samcat116: https://opendev.org/opendev/glean/src/commit/528f7216c64b459c2028a2ba81149548862b58b5/glean/__main__.py#L24 | 20:49 |
TheJulia | I don't blame you, just thought I'd share since it has sort of stumped my hurting brain too | 20:49 |
JayF | samcat116: so what that tells me is something raised an unexpected exception | 20:49 |
JayF | what would have an e.message = mac address | 20:49 |
JayF | https://opendev.org/opendev/glean/src/commit/528f7216c64b459c2028a2ba81149548862b58b5/glean/cmd.py#L1233 I wonder what e.message is on that IOError | 20:50 |
JayF | it never printed Writing Output Files that run, so it never got that far | 20:51 |
JayF | samcat116: I'd suggest filing a public bug against glean with the config and the error; it's a bug at a minimum that we don't return a better err | 20:53 |
JayF | samcat116: I just can't figure out what's throwing the unexpected exception :( | 20:53 |
TheJulia | I do believe that is where I've seen things go sideways with glean before | 20:54 |
TheJulia | #PatchesWelcome | 20:54 |
JayF | samcat116: if you could hack in an import traceback; traceback.print_exception(e) after that exception catch | 20:54 |
JayF | samcat116: you'd get a MUCH better error that likely we could pinpoint the error on | 20:54 |
JayF | https://opendev.org/opendev/glean/src/commit/528f7216c64b459c2028a2ba81149548862b58b5/glean/__main__.py#L25 at line 24 and a half ;) | 20:55 |
samcat116 | im not sure if i can build glean from source for dib? | 20:55 |
samcat116 | i guess i can try it live on this booted host | 20:55 |
JayF | glean expects to be run from udev; not sure how well that'll work | 20:56 |
JayF | what I'd do if it were me | 20:56 |
JayF | and by NO MEANS is this an endorsement of this, I know it's a hack | 20:56 |
JayF | I'd try to mount up that IPA ramdisk, navigate to the IPA venv, and just edit the code inside lib/site-packages/glean/[] | 20:56 |
JayF | if it's not in the IPA venv, it's a package, probably same would apply | 20:56 |
JayF | samcat116: 'import traceback; traceback.print_exc()' is an even shorter slug | 20:57 |
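The traceback hack suggested above can be sketched as follows. This is an illustration of the pattern only, not glean's code: `lookup` is a hypothetical stand-in for whatever raised, and a `KeyError` is used purely because its string form is just the quoted key, which shows how a handler that prints only the exception message can end up as terse as the log line quoted earlier.

```python
import traceback


def lookup(interfaces, mac):
    # Hypothetical stand-in for the failing code path; a plain dict lookup
    # raises KeyError when the MAC is not present.
    return interfaces[mac]


def main():
    try:
        lookup({}, "00:11:22:33:44:55")
    except Exception as e:
        # What a bare "print the message" handler produces:
        terse = "Error: %s" % e
        # What the suggested hack adds: the exception type and the full
        # call stack. traceback.print_exc() prints this same text to stderr.
        detailed = traceback.format_exc()
        print(terse)
        print(detailed)
        return terse, detailed


terse, detailed = main()
```

Inside an `except` block, `traceback.print_exc()` needs no arguments, which makes it the safer one-liner to paste in. Calling `traceback.print_exception(e)` with only the exception object relies on Python 3.10 or newer.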
samcat116 | Ok i'll give that a whack | 21:00 |
samcat116 | im also trying with just a random mac for the bond | 21:00 |
JayF | After reading the code, I have some nonzero confidence you were already doing the right thing | 21:01 |
JayF | but try all the permutations you can manage :D | 21:01 |
samcat116 | no glean commits in over a year or bugs filed in several years :( | 21:02 |
TheJulia | we should fix that | 21:02 |
samcat116 | I wish i could just use regular cloud-init here and not network_data and glean | 21:03 |
TheJulia | .. i thought cloud-init could use the network metadata | 21:03 |
JayF | samcat116: you likely can, if you configured cloud-init to only do network | 21:05 |
JayF | samcat116: and I have personally used a more complex config than this against cloud-init | 21:05 |
JayF | samcat116: the downside here for cloud-init over glean for IPA use cases is that cloud-init wants to do *all the things* and all you need is networking | 21:05 |
JayF | Hmm also you may need to let cloud-init somehow know where to find the configdrive | 21:05 |
JayF | but I'm unsure how we shape that for network data on ipa | 21:05 |
TheJulia | we label the ISO as config-2 | 21:06 |
JayF | so it should just work | 21:06 |
TheJulia | ... so sort of cheating, kind of :) | 21:06 |
samcat116 | JayF whats the best way to mount the ipa.initramfs so I can edit it like you said? | 21:29 |
JayF | samcat116: file ipa.initramfs | 21:30 |
JayF | I hope it's a qcow, nbd, or raw. If it's a squashfs it's harder. It probably is a squashfs tho | 21:31 |
samcat116 | ipa.initramfs: gzip compressed data, from Unix, original size modulo 2^32 781161472 | 21:33 |
JayF | yeah it fits | 21:33 |
JayF | https://github.com/openstack/ironic-python-agent-builder/blob/master/dib/ironic-ramdisk-base/cleanup.d/99-ramdisk-create#L63 | 21:33 |
JayF | it's a compressed CPIO archive | 21:33 |
JayF | samcat116: https://linuxconfig.org/how-to-uncompress-and-list-an-initramfs-content-on-linux something like this, in an isolated dir, then modify the command from ^ Line 63 to recreate it | 21:34 |
JayF | samcat116: full disclosure: I'm not 100% sure this will work but it'll be fun to see :D | 21:34 |
samcat116 | oof | 21:34 |
JayF | yeah | 21:34 |
JayF | I was hoping it was a straight image | 21:34 |
JayF | you can loopback mount those | 21:34 |
JayF | which makes it 1000x easier to do something like this | 21:34 |
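As a rough sketch of the first half of that unpacking step: the helpers below confirm a file is gzip-compressed, decompress it, and check whether the result starts with the "new ASCII" cpio magic. The function names are hypothetical, and the assumption that the archive is in newc format is mine, not something stated above. Python's standard library has no cpio reader, so extraction and repacking would still be done with the `cpio` tool itself.

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of any gzip stream
CPIO_NEWC_MAGIC = b"070701"  # header magic of a "newc" format cpio archive


def is_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def gunzip_to(src, dst):
    """Decompress src into dst and return dst."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def looks_like_newc_cpio(path):
    """Return True if the file starts with the newc cpio magic."""
    with open(path, "rb") as f:
        return f.read(6) == CPIO_NEWC_MAGIC
```

For example, `gunzip_to("ipa.initramfs", "ipa.cpio")` followed by `looks_like_newc_cpio("ipa.cpio")` would confirm the layout before extracting it in a scratch directory.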
JayF | samcat116: if you're already using cloud-init, I encourage going down that path | 21:35 |
JayF | samcat116: it's likely easier than getting this working at this point | 21:35 |
samcat116 | so how do i send cloud init from ironic to the ipa | 21:35 |
JayF | so the data is the exact same | 21:35 |
samcat116 | Oh i see | 21:35 |
JayF | they both are tools that read configdrives+network data | 21:35 |
JayF | just when you install cloud-init in your IPA image, you'll likely need to put some config alongside to say "don't install e.g. an ssh key" | 21:36 |
JayF | but I'm not sure what that full set is; but there's no technical reason I can posit it wouldn't work | 21:36 |
samcat116 | ah so i need to use dib manually to include that and ipa, and likely cant use the ironic-python-agent-builder | 21:36 |
JayF | samcat116: https://github.com/openstack/ironic-python-agent-builder/blob/master/ironic_python_agent_builder/__init__.py#L49 you can pass -e to ipa-builder to add elements of your own | 21:37 |
JayF | samcat116: so you should be able to keep ipa-b as the way you execute it, just create a sidecar element (in one of your own repos) that has your custom bits | 21:37 |
JayF | this is more or less how we've done it everywhere I've worked that had custom IPA images -- you sorta end up with a meta-builder repo with a couple of bash scripts and a custom element or three | 21:38 |
samcat116 | I found the dib element to build glean from source, so I might try that too | 21:40 |
opendevreview | Jay Faulkner proposed openstack/ironic master: [releasenotes] Prelude for 2023.2/bobcat https://review.opendev.org/c/openstack/ironic/+/895007 | 21:57 |
JayF | samcat116: oh nice, yeah, that would certainly work especially if you forked it to your own repo and made that one change | 21:58 |
JayF | samcat116: there are magic DIB variables for pointing those things at your own git repos instead of the upstream | 21:58 |
JayF | samcat116: I think it's reasonably well doc'd but if you can't find it I'll help you look | 21:58 |
TheJulia | Hi, Delta, why you say one of my flights is on a C-212 Aviocar?!? #IHaveConcerns | 21:59 |
JayF | it's run by a surprising number of civil aviation services | 22:03 |
TheJulia | And now it says it is an A220 | 22:07 |
TheJulia | .... *weird* | 22:07 |
JayF | at this rate, it'll be an A380 or 787 by the time you leave | 22:08 |
JayF | if you booked it three months ago, it would've said "carrier pigeon" | 22:08 |
TheJulia | lolz | 22:20 |
opendevreview | Julia Kreger proposed openstack/ironic master: Enable OVN CI https://review.opendev.org/c/openstack/ironic/+/885087 | 22:38 |
TheJulia | JayF: w/r/t https://review.opendev.org/c/openstack/networking-generic-switch/+/888051 <-- any thoughts on disabling the DLM test until we can sort through what is going on there? | 23:34 |