opendevreview | Steve Baker proposed openstack/diskimage-builder master: Fix BLS entries for /boot partitions https://review.opendev.org/c/openstack/diskimage-builder/+/846838 | 02:41 |
---|---|---|
stevebaker[m] | ianw: hey, this one change fixes a downstream CI issue, when you have the chance to review ^^ | 02:48 |
ianw | stevebaker[m]: lgtm, thanks. if causing issues feel free to +w it | 05:13 |
stevebaker[m] | ianw: It's not blocking the pipeline, just failing one CI job in my team, sometimes | 05:26 |
stevebaker[m] | TheJulia: could you take a look at https://review.opendev.org/c/openstack/diskimage-builder/+/846838 also when you get the chance? | 05:27 |
TheJulia | ianw: back to the networkmanager discussion yesterday, it is starting to look like networkmanager is just evolving away from all of the controls we would want in short lived/temporary machines. top level options deprecated with no alternative or interface level "know the interface in advance" options which are not viable. :\ | 16:33 |
clarkb | TheJulia: what sorts of controls? Seems a bit odd to me since network manager's primary use case (in my mind at least) is a mobile laptop, which from a network standpoint mimics short lived instances (since its connectivity to a network is short lived) | 16:45 |
TheJulia | ignore-carrier is deprecated and now an interface level setting. Retries has been completely restructured | 16:45 |
TheJulia | I think for ironic, specifically, we can just teach it to bounce the interfaces | 17:10 |
TheJulia | which would address spanning tree blocking interface issues | 17:10 |
TheJulia | well.. kind of | 17:10 |
TheJulia | maybe | 17:10 |
opendevreview | Julia Kreger proposed openstack/diskimage-builder master: Use internal dhcp client for centos 9-stream and beyond https://review.opendev.org/c/openstack/diskimage-builder/+/848017 | 18:08 |
JayF | TheJulia: ... if you're talking about a networking restart, or equivalent, I think that usually flips the link state on the ironic end, right? | 20:22 |
TheJulia | JayF: of us maybe needing to add one? | 20:24 |
TheJulia | Or… of the dib patch itself? | 20:24 |
JayF | Lemme look at the dib patch directly, I'm sorry I thought this was -ironic | 20:24 |
TheJulia | So using dhclient actually toggles the link state explicitly | 20:25 |
TheJulia | So line carrier goes down and back up | 20:25 |
JayF | which is hell on STP/loop prevention stuff | 20:25 |
JayF | and that's why you're using the internal implementation on centos9? | 20:25 |
JayF | so basically you're like 5 steps ahead of me as usual, got it LOL | 20:26 |
TheJulia | JayF: hell and also needed if the switch does things like block traffic on port preference for PXE purposes | 20:26 |
JayF | > block traffic on port preference for PXE purposes | 20:27 |
TheJulia | because that can reset the cycle, and ipxe actually can start the timer by issuing a patch down the port | 20:27 |
JayF | I know all those words but don't understand :) | 20:27 |
TheJulia | err, packet | 20:27 |
* JayF can just read the backlog from yesterday, I was OOO yesterday | 20:28 | |
TheJulia | eh, that was in #opendev mostly, fwiw | 20:28 |
TheJulia | I think | 20:28 |
* JayF is everywhere | 20:28 | |
TheJulia | JayF: you know all and see all? | 20:29 |
JayF | I'm not omnipotent, just omnipresent | 20:30 |
JayF | I am everywhere, and read just enough to ask bad questions of TheJulia in random IRC channels ;) | 20:30 |
TheJulia | JayF: well, present is half the battle | 20:31 |
TheJulia | JayF: https://storyboard.openstack.org/#!/story/2008001 is what I was referring to, fwiw | 20:31 |
JayF | TheJulia: we had this exact problem in OnMetal, and fixed it by hitting the switch and/or the engineer configuring them until it stopped being so breaky | 20:37 |
JayF | TheJulia: in fact, we made architectural changes in v2 to avoid this kinda failure state | 20:37 |
JayF | (on a network level) | 20:37 |
JayF | like we never released with this bug, but we absolutely had to massage network (and maybe images too?) to prevent it | 20:37 |
TheJulia | Yeah, we had a hardware vendor encounter it in their labs and they had to port mirror and send us pcap files before we finally realized exactly what was going on | 20:38 |
TheJulia | luckily, networkmanager attempts to assemble lacp groups now aiui | 20:39 |
TheJulia | so... it might just be "fine" moving forward | 20:39 |
JayF | wait, are we publishing stream-9 IPA images and folks are using them? | 20:46 |
JayF | Or is this with your red hat on :D | 20:46 |
TheJulia | JayF: we build them and post them for use in our own CI | 21:23 |
TheJulia | So… the fedora is not on… at the moment | 21:23 |
JayF | It's arguably pretty rough to release a stream-based image for reasons like this (and CI-maintenance sanity). I wonder if we've considered (for ironic) going to release images, but I know that's problematic too | 21:26 |
* JayF flips a table | 21:26 | |
TheJulia | A new one drops once a month and gave us advance awareness of some issues…. Sooo in the grand scheme of things it seems to be helping us | 21:27 |
* TheJulia prints a time stone on the resin printer, and returns the table to its full upright, and locked position | 21:28 | |
TheJulia | And honestly, we had the same issues without them, just the packages would get updated somewhere along the way and we would be scratching our heads assuming nothing had changed | 21:30 |
JayF | You are right that the 'stability' of !stream is ... not always so, at least for the level that ironic operates | 21:32 |
TheJulia | Yeah, testing the last few steps down to hardware is unfortunately hard all around | 21:32 |
* TheJulia remembers grub int 19h issues | 21:33 | |
clarkb | I've said this in a few venues now that stream seems to be working well for catching upcoming issues, but for stable CI platform it is difficult. Almost like we need a mix of platforms with a group that is actively ready to address the down the pipeline problems that stream exposes. | 21:34 |
clarkb | It has also been difficult because the round trip time on getting bugs fixed in stream is much longer than one would expect for a rolling release | 21:35 |
clarkb | but upstream says they are improving that, just have to keep fingers crossed those improvements happen | 21:35 |
TheJulia | Yeah, we caught an issue in stream about 6 weeks before it hit rhel…. Even then the delay was mostly on the rhel side for the already shipped stable versions | 21:37 |
TheJulia | That same issue appeared in stable released centos8.X eventually too | 21:39 |
clarkb | I run tumbleweed locally and I get antsy if updates take more than a couple of days so that's sort of what I'm measuring against | 21:40 |
TheJulia | :) | 21:41 |
opendevreview | Merged openstack/diskimage-builder master: Fix BLS entries for /boot partitions https://review.opendev.org/c/openstack/diskimage-builder/+/846838 | 22:30 |