Monday, 2023-12-18

opendevreviewTakashi Kajinami proposed openstack/ironic-inspector master: Suppress logs from stevedore  https://review.opendev.org/c/openstack/ironic-inspector/+/903853 03:28
adam-metal3Hey, since last Monday (I have been told) we are seeing this https://paste.openstack.org/raw/bxMHCSI7kO3fgJh9etbI/ error in the Metal3 CI with the ubuntu tests, so Ironic and sushy-tools are running as containers on the host (not in a K8s cluster) 07:21
adam-metal3could you give me some pointers please as to what could cause this?07:23
rpittaugood morning ironic! o/07:50
dtantsuradam-metal3: I wonder how we can enable this "auto selection disabled"07:58
dtantsuradam-metal3: I assume it's a regression in https://opendev.org/openstack/sushy-tools/commit/361e0eef99671cff2c5273649a10ce4367fa7610 08:04
dtantsurI had concerns about it, but was assured it's fine...08:04
dtantsurI'm talking with the author of the patch08:13
dtantsurjm1[m]: this is what I mentioned on slack ^^^08:13
jm1[m]adam-metal3: Hi, sorry for the mess! Which Ubuntu version are you testing with?08:21
Nisha_AgarwalHello Ironic!!!08:31
Nisha_AgarwalNeed quick help... Is there a way we can configure the session auth token expiry? 08:32
Nisha_Agarwalin ironic08:32
Nisha_Agarwalfor redfish driver08:33
adam-metal3dtantsur: thanks for the info08:43
adam-metal3jm1[m]: we are using 22.0408:43
adam-metal3but Ironic runs in a container for us08:44
adam-metal3so in our case Ubuntu should not matter much, containerized Ironic talking to libvirt VMs via sushy-tools08:45
adam-metal3via*08:45
jm1[m]adam-metal3: but libvirtd is running on ubuntu 22.04, right? i am trying to reproduce it08:49
adam-metal3jm1[m]: yes you are right that runs on ubuntu 22.0408:49
rpittaudtantsur: for the dhcp issue in bifrost, I collected the dnsmasq config here https://b0f9ae0491e974b2315d-45537cc5c7120f43f6e626c6c78dc0c0.ssl.cf5.rackcdn.com/903755/2/check/bifrost-integration-dibipa-debian-centos-9/3bc1373/logs/dnsmasq_config/index.html 09:03
rpittauI don't see anything wrong but we can compare it with a working one09:03
opendevreviewRiccardo Pittau proposed openstack/ironic-inspector master: [WIP] Handle LLDP parse Unicode error  https://review.opendev.org/c/openstack/ironic-inspector/+/903760 09:13
jm1[m]adam-metal3: could you please point me to a log output? something is odd, e.g. the snippet you posted above says "Setting boot mode to bios failed for". but in bios mode, the nvram xml tag should not be set at all09:15
adam-metal3jm1[m]: I have asked for a link to the specific build, that will have a log tar file 09:17
opendevreviewRiccardo Pittau proposed openstack/ironic master: Handle LLDP parse Unicode error  https://review.opendev.org/c/openstack/ironic/+/903861 09:19
opendevreviewRiccardo Pittau proposed openstack/ironic-inspector master: Handle LLDP parse Unicode error  https://review.opendev.org/c/openstack/ironic-inspector/+/903760 09:19
adam-metal3jm1[m]: https://jenkins.nordix.org/view/Metal3/job/metal3_capm3_main_integration_test_ubuntu/879/ 09:26
adam-metal3there is the archive https://jenkins.nordix.org/view/Metal3/job/metal3_capm3_main_integration_test_ubuntu/879/artifact/logs-jenkins-metal3_capm3_main_integration_test_ubuntu-879.tgz 09:26
adam-metal3and in the archive there is a "docker" directory and that will have the ironic logs09:27
jm1[m]adam-metal3: thank you! will have a look09:27
adam-metal3jm1[m]: Thank you !09:34
jm1[m]adam-metal3: where can i find the sushy-tools version you are using?09:40
jm1[m]adam-metal3: or rather the code responsible for pulling in sushy-tools09:40
adam-metal3https://github.com/metal3-io/ironic-image/blob/cf3c71cd0f0e1bd5af710f5f6af45036966641d9/resources/sushy-tools/Dockerfile#L3 09:40
jm1[m]dtantsur: we have not merged 1.1.0 yet, so i am wondering how my code could be responsible?!?09:43
jm1[m]dtantsur adam-metal3 maybe we have to look somewhere else. ironic wants to boot in bios mode. my patch removes/changes nvram_path but it does not mess with loader_path09:46
jm1[m]the error log complains about loader_path though09:46
adam-metal3jm1[m]: what is the use case of "loader_path"? I am not familiar with this variable09:47
dtantsurjm1[m]: oh, the image does not yet use 1.1.0? that's interesting, I thought we did that already09:49
jm1[m]adam-metal3: a verbose explanation 😅  https://libvirt.org/formatdomain.html#bios-bootloader 09:49
* dtantsur hopes it's not a regression in ubuntu09:49
TheJuliao/ morning folks09:50
* TheJulia is just in a hotel, very bored and trying to avoid sending records in for a car accident on Saturday09:51
jm1[m]adam-metal3: is this the code which creates the domain xml? https://github.com/metal3-io/metal3-dev-env/blob/main/vm-setup/roles/libvirt/templates/baremetalvm.xml.j209:51
* TheJulia has a load shedding idea that needs to get put into BZ09:52
TheJuliaErr LP09:52
zigoHi there! Can someone take over this ? https://review.opendev.org/c/openstack/ironic-lib/+/903815 09:55
zigoThe issue is with zeroconf 0.129, probably this needs a global-requirements fix first ...09:55
zigoFYI, I'm super busy with Python 3.12 compat, so I can't really take care of that one...09:55
adam-metal3jm1[m] yes 09:57
adam-metal3jm1[m] okay so it is about the EFI/BIOS boot firmware, got it, I just didn't catch at first what firmware it was referring to but it is clear now, thanks10:02
dtantsuradam-metal3: shooting in the dark really, but maybe https://github.com/metal3-io/metal3-dev-env/pull/1325 will help 10:06
dtantsurif it's not a sushy-tools regression, I'm completely puzzled what caused it10:06
dtantsuradam-metal3: is there any real reason why we configure testing nodes in UEFI but trying to use BIOS afterwards?10:08
dtantsurI'd expect most people to use UEFI nowadays unless they have some very specific reason not to (like a broken firmware)10:08
adam-metal3dtantsur: I don't think mixing these 2 is intentional, for a long time Metal3 CI was only testing with BIOS10:10
adam-metal3but dev-env had the ability to provide a choice for the user10:11
adam-metal3AFAIK dev-env is defaulting to bios10:14
dtantsuradam-metal3: the easiest way to "fix" the CI is to switch to UEFI IMO10:14
adam-metal3dtantsur: sure I can do that but I have to check something first then, I don't remember if I have pushed the fix for the UEFI firmware path, because for some time in the UEFI case it was loading the secure UEFI firmware10:16
adam-metal3yeah okay I have merged the UEFI firmware path fix so we can switch no problem10:17
adam-metal3dtantsur: I have made this https://github.com/metal3-io/metal3-dev-env/pull/1326 I think UEFI default for dev-env is reasonable in general anyways10:25
dtantsuradam-metal3: quick grep also shows export LIBVIRT_FIRMWARE="bios"10:25
adam-metal3in network10:26
dtantsurah, it's in a condition. okay.10:26
dtantsuradam-metal3: then line 49 export BOOT_MODE="${BOOT_MODE:-legacy}"10:26
dtantsurand update config_example.sh10:27
adam-metal3okay 10:27
adam-metal3this is so badly organized, boot mode defaults in network config, what the hell....10:28
adam-metal3but also in ansible10:28
dtantsuradam-metal3: it's because you cannot boot over IPv6 network in legacy mode10:29
dtantsurif we default to UEFI, this logic can be simplified10:29
adam-metal3yes10:29
adam-metal3dtantsur: I think now the PR has what is minimally needed to hopefully unclog the CI, but I think I will move the boot mode stuff somewhere else, it is very weird in network config, ofc I get the IPv6 part but the general selection logic is here also10:32
adam-metal3but I don't want to spam Ironic irc with metal3 madness10:33
dtantsur:)10:33
* TheJulia tries to wake up with the worst hotel room coffee ever10:40
TheJuliazigo: maybe after the holidays I can, I'm sort of occupied this week unfortunately.10:41
opendevreviewMerged openstack/bifrost stable/2023.2: ironic: Perform online data migrations with localhost DB  https://review.opendev.org/c/openstack/bifrost/+/901296 10:43
* dtantsur is wondering if the mdns idea was good in the end...10:44
TheJuliaI still think it was10:44
TheJuliabut a valid question is "is anyone *really* using it". A conundrum though: they might not know at this point10:45
TheJuliaor easily know until it is gone10:45
TheJuliamaybe s/easily/painfully/10:45
TheJuliait does make some things like manual introspection data updates super easy for folks like arne's group10:52
TheJuliaIn theory, of course10:52
iurygregorygood morning Ironic11:21
TheJuliaJulia's crazy idea from over the weekend: https://bugs.launchpad.net/ironic/+bug/2046803 11:30
TheJuliaand it is kind of multiple ideas11:30
dtantsurI'll bookmark it until I have some time for a long read :)11:31
TheJuliait is definitely a high level idea11:31
TheJuliathat can kind of spawn, but sort of seems like "operationally reasonable"11:31
TheJuliadunno, it is out there11:31
* dtantsur keeps producing RFEs for minor improvements: https://bugs.launchpad.net/ironic/+bug/2046428 11:37
TheJulianot sure worth microversioning given the redaction, then again it can be turned off, but still not sure we should support "you turned off redaction!"11:39
* TheJulia wonders if harald is rebooting, or if his irc connection just dislikes the world today11:42
dtantsurheh11:42
iurygregoryconnection issues probably =D11:42
dtantsurMy IRC bouncer is on the Synology NAS here, so it's mostly unaffected by laptop reboots11:42
hjensasrebooting :)11:56
Nisha_Agarwaldtantsur, hi11:58
Nisha_AgarwalTheJulia, hi11:58
Nisha_AgarwalOne quick question on session timeout11:59
Nisha_AgarwalIs there a way we can configure the session timeout for redfish calls to the baremetal?11:59
Nisha_Agarwalin sushy11:59
dtantsursession timeout?12:01
Nisha_AgarwalYes auth token timeout12:01
Nisha_AgarwalWe were trying to certify one of the HPE servers on RHOSP 17.1 12:02
Nisha_Agarwaland since it's kolla-based we see the session auth token expires very soon, leading to "missing attribute" error12:02
dtantsurNisha_Agarwal: it's something the server controls though?12:02
Nisha_Agarwalfor resolving this we have changed the ironic.conf to use auth_type as basic12:03
Nisha_Agarwaldtantsur, nope12:03
dtantsurI guess the problem is not in the expiration itself, but rather in the wrong error that sushy does not retry?12:03
Nisha_AgarwalYes12:03
Nisha_AgarwalSo actual flow is that sushy gets session response as 40112:03
Nisha_Agarwaland instead of retrying here it sends that session error response to the called12:04
Nisha_Agarwalcaller*12:04
Nisha_Agarwalthen caller tries to parse the attributes12:04
Nisha_Agarwaland fails with missing attribute error12:04
Nisha_AgarwalThe issue is very prominent when sushy is used from inside the container12:05
dtantsurI'm not sure I understand how a container can be related..12:05
Nisha_Agarwalif you try the same thing outside the container, the issue is seen much less... you can still hit the issue, but only when you add pdb12:05
Nisha_Agarwalmay be some timing issue12:06
Nisha_Agarwalthat's my observation12:06
dtantsurdifferent versions?12:06
Nisha_Agarwalwe tried stable wallaby (3.7.6) inside and outside the container, and latest sushy 4.7.0 outside the container12:06
Nisha_Agarwaland could hit the issue when we used pdb outside the container12:07
Nisha_Agarwalinside the container issue is seen without pdb very frequently12:07
Nisha_Agarwaland to resolve this i could only change the authentication mechanism to "basic"12:08
Nisha_AgarwalI was thinking if we could increase (configure) the session expiry and make it work, probably that would have been better12:08
Nisha_Agarwaldtantsur, https://paste.openstack.org/show/bKBx58BeU1dYEyXPnJdJ/ 12:11
Nisha_Agarwalcheck the logs here12:11
TheJuliaNisha_Agarwal: please open a bugzilla and add all the details possible, I’m basically out until the new year.12:11
Nisha_AgarwalTheJulia, Yes my colleague is already posting all this in the query to RedHat12:12
TheJuliaBz,  or a support query.12:12
TheJuliaErr, not support request.12:12
Nisha_AgarwalSo we are now basically just questioning if we could certify the server with "basic" as auth_type?12:12
Nisha_Agarwalor it has to be "auto"12:13
TheJuliaIt is far from ideal, and truthfully it could only be done as a workaround really.12:13
Nisha_Agarwalyes so the issue is seen in latest sushy as well if we just add pdb12:13
TheJuliaCan you certify with a workaround is. Or a question I can answer12:14
Nisha_AgarwalYes thats what we had the query...12:14
Nisha_AgarwalAnyway all the queries are being added in the support request12:14
TheJuliaNot a question I mean. I’m on my phone right now. Sorry for typos.12:14
Nisha_Agarwal:)12:14
Nisha_Agarwalnp12:15
TheJuliaSupport cannot answer that question, either.12:15
Nisha_Agarwalhmm then?12:15
Nisha_Agarwalbecause the fix has to be done in master12:15
Nisha_Agarwaland then backported 12:15
Nisha_Agarwalto wallaby in RHOSP12:15
TheJuliaIt comes down to the certification requirements. Basically has to work without workarounds as I understand it.12:15
Nisha_Agarwalhmmm12:15
TheJuliaBecause the certification is determined through automated tooling.12:16
Nisha_Agarwalyes i understand...12:16
Nisha_Agarwalwe couldn't find a way to configure this parameter when doing the undercloud installation12:16
TheJuliaYeah, it is not possible AFAIK.12:17
Nisha_Agarwalhmmm12:17
TheJuliaIdeally, session auth should also be used, so we will need to take a look once the root cause is fully understood on master branch.12:17
Nisha_Agarwalit does session auth, but only once12:20
Nisha_Agarwalafter that when it gets the session auth and it hits the GET call on "redfish/v1/Systems/Partition0" the session has expired by then12:20
Nisha_Agarwalso actually there it gets the "invalid session error" with status 40112:21
Nisha_Agarwaland here instead of retrying the session token it just passes the response to the caller i.e. get_system() of sushy12:21
Nisha_Agarwaland that fails with missing attribute error12:22
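The flow Nisha_Agarwal describes — sushy gets a 401 for an expired session and hands the error body to the caller instead of re-authenticating — suggests a retry-once pattern. A minimal sketch; `do_get` and `reauthenticate` are hypothetical stand-ins for the connector internals, not sushy's real API:

```python
def get_with_retry(do_get, reauthenticate, path):
    """On a 401 (expired Redfish session), re-authenticate once and retry
    the request instead of returning the 401 body to the caller.

    do_get(path) -> (status_code, body); reauthenticate() refreshes the
    session token.  Both callables are illustrative assumptions.
    """
    status, body = do_get(path)
    if status == 401:
        reauthenticate()              # establish a fresh session token
        status, body = do_get(path)   # retry exactly once
    if status >= 400:
        raise RuntimeError('GET %s failed with HTTP %d' % (path, status))
    return body


# Demo with a fake BMC whose session has already expired: the first GET
# returns 401, re-auth marks the token valid, the retry succeeds.
token = {'valid': False}

def fake_get(path):
    if token['valid']:
        return 200, {'Id': 'Partition0'}
    return 401, {'error': 'invalid session'}

def fake_reauth():
    token['valid'] = True

body = get_with_retry(fake_get, fake_reauth, '/redfish/v1/Systems/Partition0')
```

Without the retry, the caller (e.g. `get_system()`) would try to parse the 401 error body as a resource and fail with the "missing attribute" error seen in the logs.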
TheJuliaPlease file a BZ if you can, I can look after the first of the year. If you develop a patch in the mean time, please add me to it.12:25
TheJuliaIt seems rather odd the session is invalidated so quickly as well. Anyhow, I need to get going and shouldn't be working when I'm off.12:27
Nisha_Agarwal:)12:28
Nisha_Agarwalit is happening more frequently when running inside the container...when we do it normally , we dont hit this issue unless we add a pdb12:29
TheJuliaa container shouldn't impact its behavior at all, but a very detailed BZ will help us tremendously, along with information about what hardware and firmware this has been reproduced against12:33
jm1[m]adam-metal3: dtantsur i was reading up on your discussion. did you find the root cause for this loader path issue?13:35
adam-metal3jm1[m]: I think the workaround I have made is more of an "ignore it" type solution, we don't have a hard requirement for testing with legacy bios, so I will change the CI and dev-env to use uefi and I will go now and check the logs whether that has helped or not13:39
jm1[m]adam-metal3: ack. it must be some kind of edge case. if sushy-tools encounters that the loader xml tag has been defined but no text (loader_path) has been set, then it would log a warning. but it does not, hence it is not sushy-tools that removes that loader_path.14:32
jm1[m]something sets the loader xml tag but without the path. libvirtd is unhappy about that14:33
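The shape jm1 describes — a `<loader>` tag present in the libvirt domain XML but with no firmware path text inside it — is easy to check mechanically. A purely illustrative sketch:

```python
import xml.etree.ElementTree as ET

def has_empty_loader(domain_xml):
    """Detect an <os><loader> element in a libvirt domain XML that has no
    firmware path text -- the shape libvirtd rejects in the error above."""
    root = ET.fromstring(domain_xml)
    loader = root.find('os/loader')
    if loader is None:
        return False  # no loader tag at all is fine for plain BIOS boot
    return not (loader.text or '').strip()

# A broken definition: loader tag present, path text missing.
broken = """<domain type='kvm'>
  <os>
    <type arch='x86_64'>hvm</type>
    <loader readonly='yes' type='pflash'></loader>
  </os>
</domain>"""
```

Per the libvirt docs linked earlier, the `<loader>` text is the path to the firmware blob (e.g. an OVMF image); an empty element is neither valid BIOS nor valid UEFI configuration.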
jm1[m]anyway, glad you found a workaround. then maybe my patch for sushy-tools 1.1.0 in ironic-image can finally be merged. and THEN we might see some fallout from my nvram patch ;)14:35
adam-metal3jm1[m]: I am still testing , who knows I might haunt you with this issue in the future also :D14:38
adam-metal3but thanks for all the help so far14:39
jm1[m]adam-metal3: np :D14:45
JayFI might be a couple minutes late in starting the meeting today. Anyone with rights can feel free to start it on time or I should have it started before 5 minutes after.14:58
JayF#startmeeting ironic15:01
opendevmeetMeeting started Mon Dec 18 15:01:28 2023 UTC and is due to finish in 60 minutes.  The chair is JayF. Information about MeetBot at http://wiki.debian.org/MeetBot. 15:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:01
opendevmeetThe meeting name has been set to 'ironic'15:01
JayF#topic Announcements/Reminder15:01
JayF#info Standing reminder to review patches tagged ironic-week-prio and to hashtag your patches; https://tinyurl.com/ironic-weekly-prio-dash15:01
JayF#info The next two Ironic meetings (Dec 25, Jan 1 2024) are cancelled.15:02
JayF#topic Review Action Items15:02
JayF#info JayF emailed list about cancelled meetings15:02
iurygregoryo/15:02
rpittauo/15:02
JayF#topic Caracal Release Schedule15:02
JayF#info Next Milestone R-17, Caracal-@ on Jan 1115:02
JayFAny other comments on the release schedule? Anything we need to consider?15:03
dtantsuro/15:03
dtantsurwhere do we stand with intermediate releases?15:03
JayFI've cut none.15:03
dtantsurrpittau: ^^15:03
rpittauwe cut bugfix releases in December15:04
rpittaunext ones will be at the end of February15:04
dtantsur\o/15:04
rpittauand thanks to that we also released ironic-image in metal3 :)15:04
JayFAck, sounds like we're on track then.15:04
rpittauyup15:04
JayFOn this general topic; how is bugfix support in release automation / retiring the old ones?15:04
JayFI know that was in process but sorta lost the thread on it during my vacation15:05
rpittauJayF: I've opened a patch for that but I didn't get the talk going with the release team after an initial discussion15:05
rpittauthis is the patch btw https://review.opendev.org/c/openstack/releases/+/90081015:05
JayFack; so in progress just low priority and not moving quickly it seems15:06
JayFbasically what I expected15:06
rpittauyeah :/15:06
JayF#topic OpenInfra Meetup at CERN June 6 202415:06
JayFLooks like someone added an item suggesting a meetup for Ironic be done during this.15:06
rpittauyes!15:06
JayFSounds like a good idea. I would try to go but will note that "I will try to go" still means very low likelihood15:06
JayFso please someone else own this :D 15:07
rpittau:D15:07
rpittauI proposed it, I will own it :)15:07
JayFawesome \o/15:07
JayFGotta go see some protons go boom15:07
rpittauarne_wiebalck: this ^ probably is of your interest15:07
rpittauI guess a good date would be June 5 as people will probably travel on Friday (June 7)15:08
iurygregoryMeetup is probably complicated I would say .-., even Summit is complicated to get $budget15:08
JayFI would say picking a date is probably getting ahead of ourselves15:08
JayFmaybe just send out an email and get feelers?15:09
rpittauyeah, that's the intention, I was just thinking out loud15:09
JayFI know if I went, I'd probably have to combine a UK trip with it, so I might actually be more able to go on the 7th15:09
JayFAnything else on this topic?15:10
JayF#topic Review Ironic CI Status15:11
dtantsurBifrost DHCP jobs are broken, presumably since updating ansible-collection-openstack. We don't know why.15:11
JayFI'll note it broke for a couple days last week due to an Ironic<>Nova driver chain being tested on the *tip* but an intermediate patch broke it.15:11
JayFNow that whole chain of the openstacksdk migration has landed and those jobs are happy15:12
rpittaudtantsur: can we rebase the revert on top of https://review.opendev.org/c/openstack/bifrost/+/903755 to collect the dnsmasq config ?15:12
dtantsurdoing15:12
rpittautnx15:12
JayF#info Bifrost DHCP jobs broke by ansible-collection-openstack upgrade; revert and investigation in progress.15:12
JayFAnything else on the gate?15:13
opendevreviewDmitry Tantsur proposed openstack/bifrost master: DNM Revert "Support ansible-collections-openstack 2 and later"  https://review.opendev.org/c/openstack/bifrost/+/903694 15:13
JayF#topic Bug Deputy15:14
JayFrpittau was bug deputy this week; anything interesting to report?15:14
rpittaunothing new, it was really calm, I triaged a couple of old things15:15
JayFAny volunteers to take the baton this week?15:15
JayFIf not I think this it is reasonable to say "community" can do it through the holiday?15:15
dtantsuryeah15:16
rpittauyep15:16
JayF#info No specific bug deputy assigned through holiday weeks; Ironic community members encouraged to triage as they are working and have time.15:16
JayF#topic RFE Review15:16
JayFOne for dtantsur 15:16
JayF#link https://bugs.launchpad.net/ironic/+bug/2046428 Move configdrive to an auxiliary table15:16
JayFdtantsur: my big concern about this is how nasty is the migration going to be15:16
dtantsurIt's a small one, but has an API visibility15:16
dtantsurwell15:16
dtantsurWe won't migrate existing configdrives; the code will need to handle both locations for a good while15:17
JayFI don't think it's going to be small for scaled up deployments with lots of active configdrive instances :)15:17
JayFoooh, so we're not going to migrate the field outta node?15:17
dtantsurWell, there is no "field"15:17
dtantsurIt's just something in instance_info currently15:17
JayF*pulls up an api ref*15:18
dtantsurSo, new code will stop inserting configdrive into instance_info, but will keep reading it from both places15:18
JayFThis is ~trivial15:18
JayFinstance_info is not microversioned, we really can't microversion it15:18
dtantsur*nod*15:18
JayFunless we want to make changes in our nova/ironic driver harder than they already are15:18
dtantsur:D15:19
JayFWould we still support storing configdrives in swift?15:19
dtantsurAbsolutely15:19
JayFWould we ever use this table in that case?15:19
JayFe.g. I can't reach swift; is this table now a fallback?15:19
dtantsurI don't know how many people store that in swift, to be honest. It's opt-in.15:19
JayFThat's fair. I think from that perspective because my two largest environments did15:20
JayFbut I'm sure my downstream now doesn't15:20
JayFand swift usage is much lower15:20
dtantsurmy downstream definitely does not either :)15:20
JayFI am +2 on the feature, and like, +.999999 to it without a spec15:20
JayFlet me put it this way: there's no way I'd be able to implement this safely without a spec15:20
JayFbut you may be able to15:20
dtantsurThe patch is likely going to be shorter than even a short spec.15:21
rpittaunot sure about the spec either, but probably not needed15:21
JayFI think my big concern is more around code we might need to write but don't know than the code we'd know we need to write :)15:22
JayFbut you can't reduce my concern around unknown unknowns lol15:22
dtantsurI'm afraid I cannot :D15:22
JayFany objection to an approval as it sits, then?15:22
iurygregorynone from me15:22
JayF#info RFE 2046428 approved15:22
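The transition dtantsur described for the approved RFE — new writes go only to the auxiliary table, while reads keep checking both the table and the legacy `instance_info` location — is a common dual-read migration pattern. A minimal sketch with hypothetical names (a plain dict stands in for the table; this is not Ironic's actual schema):

```python
def get_configdrive(node, configdrive_table):
    """Prefer the new auxiliary table, fall back to the legacy location
    inside instance_info.  Table and key names are illustrative."""
    value = configdrive_table.get(node['uuid'])
    if value is not None:
        return value
    # Legacy location: configdrives stored before the change landed.
    return node.get('instance_info', {}).get('configdrive')


def set_configdrive(node, configdrive_table, value):
    """New writes go only to the auxiliary table, never to instance_info."""
    configdrive_table[node['uuid']] = value
```

Because no bulk data migration runs, old nodes keep working via the fallback branch until their configdrives age out, which is why the change needs no microversion on the (unversioned) `instance_info` field.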
JayF#topic Open Discussion15:23
JayFAnything for open discussion?15:23
dtantsurWanna chat about https://review.opendev.org/c/openstack/ironic/+/902801 ? 15:23
dtantsurI may be missing the core of your objections to it15:24
JayFI don't like the shape of that change and I don't know how to express it15:24
JayFI think you are15:24
JayFand I think I am, to an extent15:24
dtantsur(and would happily hear other opinions; no need to read the code, the summary should be enough)15:24
JayFSo basically we have a pie of threads15:24
JayFright now, we have AFAICT, two config options to control how that pie is setup15:25
dtantsurone?15:25
JayF"how big is the pie" (how many threads) and "how much of the pie do periodic workers get to use"15:25
dtantsurthe latter is not a thing15:25
JayFthat is untrue, I looked it up, gimme a sec and I'll link15:25
dtantsurhttps://review.opendev.org/c/openstack/ironic/+/902801/2/ironic/conductor/base_manager.py#335 15:26
dtantsurthat's the same executor...15:26
JayFhttps://opendev.org/openstack/ironic/src/branch/master/ironic/conf/conductor.py#L89 15:26
JayFit's the same executor, but we allow you to limit how much of that executor that the periodics will use15:26
dtantsur*each periodic*15:26
JayFOH15:27
dtantsur1 periodic can use 8 threads. 100 periodics can use 800 threads.15:27
dtantsurThis was done for power sync IIRC15:27
JayFThis conversation helps me get to the core of my point though, actually, which is nice15:27
dtantsurit's used like this https://opendev.org/openstack/ironic/src/branch/master/ironic/conductor/manager.py#L1415-L1424 15:28
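The arithmetic dtantsur spells out — one shared executor, but a per-periodic cap, so 100 periodics at 8 threads each can demand 800 — can be illustrated with a plain `ThreadPoolExecutor` and a per-periodic semaphore. The sizes and names here are stand-ins, not Ironic's actual option values:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One shared pool, as in the conductor (size is illustrative).
shared_pool = ThreadPoolExecutor(max_workers=300)

class Periodic:
    """Each periodic may have up to `max_workers` tasks in flight at once,
    but all periodics draw from the same shared pool -- so the worst-case
    global demand is (number of periodics) x (per-periodic max_workers)."""

    def __init__(self, max_workers=8):
        self._slots = threading.Semaphore(max_workers)

    def submit(self, fn, *args):
        self._slots.acquire()            # block if this periodic is at its cap
        future = shared_pool.submit(fn, *args)
        future.add_done_callback(lambda _f: self._slots.release())
        return future
```

The cap is per `Periodic` instance, not global: nothing here stops many periodics from collectively exhausting the shared pool, which is the concern being discussed.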
JayFI worry that we are going to make it extremely difficult to figure out sane values for this in scaled up environments15:28
JayFhmm but you didn't want it to be configurable15:28
dtantsurI do have a percentage15:29
JayFyou just wanted to reserve 5% of the pie at all times for user-interactive-apis15:29
dtantsurit's a config https://review.opendev.org/c/openstack/ironic/+/902801/2/ironic/conf/conductor.py#31 15:29
JayFI'm going to reorient my question15:29
dtantsur5% of the default 300 is 15, which matches my personal definition of "several" :)15:29
JayFDo these configs exist in a post-eventlet world?15:29
dtantsurPossibly?15:30
JayFAs laid out in the current draft in governance (if you've read it)15:30
dtantsurWe may want to limit the concurrency for any asynchronous approach we take15:30
dtantsurOtherwise, we may land in the situation where Ironic is doing so much in parallel that it never gets to the bottom of its backlog15:30
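The concurrency limit dtantsur wants would carry over naturally to an asyncio design: a semaphore around the task body keeps the backlog from running everything at once. A sketch under the assumption of a plain-asyncio conductor, not a plan of record; `sync_power_state` is a hypothetical stand-in:

```python
import asyncio

async def run_bounded(coros, limit):
    """Run coroutines concurrently, but never more than `limit` at once,
    so a large backlog cannot monopolize the event loop's resources."""
    sem = asyncio.Semaphore(limit)

    async def bounded(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(bounded(c) for c in coros))

async def sync_power_state(node_id):
    await asyncio.sleep(0)   # stand-in for the actual BMC round-trip
    return node_id

# 10 queued tasks, at most 3 in flight at any moment.
results = asyncio.run(run_bounded((sync_power_state(i) for i in range(10)), limit=3))
```

Unlike the eventlet thread-pool options, this cap lives in the code path itself, which is why such limits could survive (in some form) even after the eventlet removal discussed below.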
JayFI think I'm just trying to close the barn door when the horse has already escaped w/r/t operational complexity :( 15:31
JayFand everytime we add something like this, it gets a little harder for a new user to understand how Ironic performs, and we'll never get rid of it15:31
dtantsurI cannot fully agree with both statements15:31
dtantsurWe *can* get rid of configuration options for sure. Removing eventlet will have a huge impact already.15:32
JayFwell agree or not, it's basically an exasperated "I give up" because I don't have a better answer and I don't want to stand in your way15:32
dtantsurWell, it's not super critical for me. If nobody thinks it's a good idea, I'll happily walk away from it.15:32
dtantsur(Until the next time someone tries to deploy 3500 nodes within a few hours, lol)15:32
JayFI think it's a situation where we're maybe putting a bandaid on a wound that needs stitches, right?15:33
JayFbut the last thing we need is another "lets take a look at this from another angle" sorta thing15:33
JayFand with eventlet's retirement from openstack on the horizon, there's no point15:33
dtantsur"on the horizon" ;)15:33
dtantsurI keep admiring your optimism :)15:34
JayFso kick the can down the road is probably the right call; whether that means for me to stop fighting and drop my -1 or for us to just be OK with the concurrency chokeout bug until it's gone15:34
JayFdtantsur: we don't have a choice15:34
JayFdtantsur: have you looked at how bad eventlet is on 3.12?15:34
dtantsurNot beyond what you shared with us15:34
JayFdtantsur: I have optimism only because staying on eventlet is harder than migrating off in the medium term15:34
dtantsurI know we must do it; I just don't know if we can practically do it15:34
JayFwhich isn't exactly "optimism" so much as "out of the fire and into the pan"15:34
dtantsur:D15:34
JayFdtantsur: smart people have already answered the question "yes we can, and here's how"15:35
dtantsur\o/15:35
JayFI think code is already written which makes asyncio and eventlet code work together15:35
JayFusing eventlet/aiohub (iirc)15:35
JayFhttps://github.com/eventlet/aiohub15:35
* dtantsur doesn't want to imagine potential issues that may arise from it...15:35
JayFhberaud is working on it, along with some others (including itamarst from GR-OSS)15:35
JayFdtantsur: I'm thinking the opposite. I'm looking at the other side of this, and seeing any number of "recheck random BS failure" things disappearing15:36
dtantsurBut.. if eventlet stays in some form, so do these options?15:36
JayFdtantsur: I'm telling you, eventlet's status today is miserable15:36
JayFdtantsur: probably, yeah :/ 15:36
JayFdtantsur: so I am like, going to pull my -1 off that. I'm not +1/+2 to the change but don't have a better idea15:36
dtantsurOkay, let's see what the quiet people here say :) if someone actually decided it's a good idea, we'll do it. otherwise, I'll silently abandon it the next time I clean up my backlog.15:37
JayFAs another note for open discussion15:38
JayFI believe I'm meeting downstream with a potential doc contractor15:38
JayFthat we sorta put in motion with my downstream a few weeks ago15:38
dtantsur\o/15:38
JayFmaybe I'll ask them how to make a decoder ring for 902801 :P 15:38
JayFAnything else for open discussion?15:38
JayFThanks everyone, have a good holiday o/15:40
JayF#endmeeting15:40
opendevmeetMeeting ended Mon Dec 18 15:40:22 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:40
opendevmeetMinutes:        https://meetings.opendev.org/meetings/ironic/2023/ironic.2023-12-18-15.01.html15:40
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/ironic/2023/ironic.2023-12-18-15.01.txt15:40
opendevmeetLog:            https://meetings.opendev.org/meetings/ironic/2023/ironic.2023-12-18-15.01.log.html15:40
JayFFWIW; I should be around this week, likely taking PTO Friday unless I have something come up. Similar next week; other than Monday I'll be around. First week of Jan I'll be gone most of the week.15:40
dtantsurI'll be out from Friday, through the next week and on the 1st15:42
rpittauI'll also be out from Friday and be in only 2 days the first week of January (3-4)15:42
iurygregoryI'm out from 02-12 January =)15:46
* JayF has a friend flying in 1 Jan to come with him to https://www.nhl.com/kraken/fans/winter-classic15:47
rpittaurelease the kraken!15:52
rpittaugood night! o/16:56
JayFo/17:13
opendevreviewVasyl Saienko proposed openstack/networking-baremetal master: Do not try to bind port when we can't  https://review.opendev.org/c/openstack/networking-baremetal/+/90325217:59
JayFdtantsur: I'll note, github.com/eventlet/eventlet is now alive again, you can potentially evaluate other solutions for https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/NMCPYYHUPG766V5MGUUEKNIDEV6RCELC/#H5QDJULOMS73WX34SZUD5AOCV3GDIAQA 21:02
JayFdtantsur: from my chat with itamarst, it sounds like python 3.7 is as far back as they are testing new eventlet changes, so I'm not sure I'd be confident a new release would be keen on 3.6, and I'm not sure we could just remove the one 2.x compat piece without a whole PR to address it21:03
JayFdtantsur: just thinking out loud and letting you know I haven't forgotten this (yet) :D 21:03
JayFdtantsur: I was looking into this, got to the point where I really wanted to see the centos stream 8 patch for the eventlet RPM ... and I don't have access to the source rpms because they are paywall'd (well, login-wall'd), so I punted :/21:09
JayFdtantsur: essentially itamar said, more or less "PRs welcome" in my downstream slack; so if there's a fix I suspect new-upstream might be amenable to it. Whether or not it's safe to use on python 3.6 is a question you'd have to test for :D 21:12
JayFHmm. I got ahold of that source; it doesn't look like the eventlet RPM from https://buildlogs.centos.org/centos/8-stream/cloud/x86_64/openstack-xena/Packages/p/ was patched at all21:24
* JayF maybe didn't understand something, or is the CVE patched version elsewhere21:24
JayFI realized that searching for SOURCE rpms for python packages was a little silly. 21:24
* JayF inspects all the .py files for ones and zeros /s21:24
JayFFYI Ironic folks: https://blueprints.launchpad.net/nova/+spec/ironic-guest-metadata has some Ironic work items in it now too; thought it'd be wise to share it around here some too22:23
JayFnothing we haven't talked about; but wanted to ensure it's been spread around22:23
JayFI'll put the Ironic half of this in RFE bugs once we have agreement on the nova half22:23
JayFSharding re-proposed in nova; on top of https://review.opendev.org/c/openstack/nova/+/900831/ -- now that I have a stack to test it on, I'll point my attention in the direction of tempest23:12

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!