Tuesday, 2022-06-28

opendevreview Steve Baker proposed openstack/diskimage-builder master: Fix BLS entries for /boot partitions  https://review.opendev.org/c/openstack/diskimage-builder/+/846838 02:41
stevebaker[m] ianw: hey, this one change fixes a downstream CI issue, when you have the chance to review ^^ 02:48
ianw stevebaker[m]: lgtm, thanks.  if causing issues feel free to +w it 05:13
stevebaker[m] ianw: It's not blocking the pipeline, just failing one CI job in my team, sometimes 05:26
stevebaker[m] TheJulia: could you take a look at https://review.opendev.org/c/openstack/diskimage-builder/+/846838 also when you get the chance? 05:27
TheJulia ianw: back to the networkmanager discussion yesterday, it is starting to look like networkmanager is just evolving away from all of the controls we would want in short lived/temporary machines. top level options deprecated with no alternative or interface level "know the interface in advance" options which are not viable. :\ 16:33
clarkb TheJulia: what sorts of controls? Seems a bit odd to me since network manager's primary use case (in my mind at least) is a mobile laptop which from a network standpoint mimics short lived instances (since its connectivity to a network is short lived) 16:45
TheJulia ignore-carrier is deprecated and now an interface level setting. Retries has been completely restructured 16:45
TheJulia I think for ironic, specifically, we can just teach it to bounce the interfaces 17:10
TheJulia which would address spanning tree blocking interface issues 17:10
TheJulia well.. kind of 17:10
TheJulia maybe 17:10
opendevreview Julia Kreger proposed openstack/diskimage-builder master: Use internal dhcp client for centos 9-stream and beyond  https://review.opendev.org/c/openstack/diskimage-builder/+/848017 18:08
JayF TheJulia: ... if you're talking about a networking restart, or equivalent, I think that usually flips the link state on the ironic end, right? 20:22
TheJulia JayF: of us maybe needing to add one? 20:24
TheJulia Or… of the dib patch itself? 20:24
JayF Lemme look at the dib patch directly, I'm sorry I thought this was -ironic 20:24
TheJulia So using dhclient actually toggles the link state explicitly 20:25
TheJulia So line carrier goes down and back up 20:25
JayF which is hell on STP/loop prevention stuff 20:25
JayF and that's why you're using the internal implementation on centos9? 20:25
JayF so basically you're like 5 steps ahead of me as usual, got it LOL 20:26
TheJulia JayF: hell and also needed if the switch does things like block traffic on port preference for PXE purposes 20:26
JayF > block traffic on port preference for PXE purposes 20:27
TheJulia because that can reset the cycle, and ipxe actually can start the timer by issuing a patch down the port 20:27
JayF I know all those words but don't understand :) 20:27
TheJulia err, packet 20:27
* JayF can just read the backlog from yesterday, I was OOO yesterday 20:28
TheJulia eh, that was in #opendev mostly, fwiw 20:28
TheJulia I think 20:28
* JayF is everywhere 20:28
TheJulia JayF: you know all and see all? 20:29
JayF I'm not omnipotent, just omnipresent 20:30
JayF I am everywhere, and read just enough to ask bad questions of TheJulia in random IRC channels ;) 20:30
TheJulia JayF: well, present is half the battle 20:31
TheJulia JayF: https://storyboard.openstack.org/#!/story/2008001 is what I was referring to, fwiw 20:31
JayF TheJulia: we had this exact problem in OnMetal, and fixed it by hitting the switch and/or the engineer configuring them until it stopped being so breaky 20:37
JayF TheJulia: in fact, we made architectural changes in v2 to avoid this kinda failure state 20:37
JayF (on a network level) 20:37
JayF like we never released with this bug, but we absolutely had to massage network (and maybe images too?) to prevent it 20:37
TheJulia Yeah, we had a hardware vendor encounter it in their labs and they had to port mirror and send us pcap files before we finally realized exactly what was going on 20:38
TheJulia luckily, networkmanager attempts to assemble lacp groups now aiui 20:39
TheJulia so... it might just be "fine" moving forward 20:39
JayF wait, are we publishing stream-9 IPA images and folks are using them? 20:46
JayF Or is this with your red hat on :D 20:46
TheJulia JayF: we build them and post them for use in our own CI 21:23
TheJulia So… the fedora is not on… at the moment 21:23
JayF It's arguably pretty rough to release a stream-based image for reasons like this (and CI-maintenance sanity). I wonder if we've considered (for ironic) going to release images, but I know that's problematic too 21:26
* JayF flips a table 21:26
TheJulia A new one drops once a month and gave us advance awareness of some issues…. Sooo in the grand scheme of things it seems to be helping us 21:27
* TheJulia prints a time stone on the resin printer, and returns the table to its full upright, and locked position 21:28
TheJulia And honestly, we had the same issues without them, just the packages would get updated somewhere along the way and we would be scratching our heads assuming nothing had changed 21:30
JayF You are right that the 'stability' of !stream is ... not always so, at least for the level that ironic operates 21:32
TheJulia Yeah, testing the last few steps down to hardware is unfortunately hard all around 21:32
* TheJulia remembers grub int 19h issues 21:33
clarkb I've said this in a few venues now that stream seems to be working well for catching upcoming issues, but for a stable CI platform it is difficult. Almost like we need a mix of platforms with a group that is actively ready to address the down the pipeline problems that stream exposes. 21:34
clarkb It has also been difficult because the round trip time on getting bugs fixed in stream is much longer than one would expect for a rolling release 21:35
clarkb but upstream says they are improving that, just have to keep fingers crossed those improvements happen 21:35
TheJulia Yeah, we caught an issue in stream about 6 weeks before it hit rhel…. Even then the delay was mostly on the rhel side for the already shipped stable versions 21:37
TheJulia That same issue appeared in stable released centos8.X eventually too 21:39
clarkb I run tumbleweed locally and I get antsy if updates take more than a couple of days so that's sort of what I'm measuring against 21:40
TheJulia :) 21:41
opendevreview Merged openstack/diskimage-builder master: Fix BLS entries for /boot partitions  https://review.opendev.org/c/openstack/diskimage-builder/+/846838 22:30
