Thursday, 2024-02-22

TheJuliaThat itself, shouldn't cause this. it is super weird though00:53
TheJuliadtantsur: any resolution regarding https://review.opendev.org/c/openstack/ironic/+/906113 ?00:58
TheJuliaomg I see it01:14
TheJuliatake a look at notify_conductor_resume_operation01:14
TheJuliawe can't remove the methods, for some reason we have a conductor heartbeat to trigger another rpc action code path01:15
* TheJulia pours whiskey into a glass01:15
TheJuliaJayF: cid: ^^^ ironic/conductor/utils.py notify_conductor_resume_operation (so the conductor calls itself.... gah!)01:16
TheJuliaoh, like 947, at least on my local branch right now01:16
JayFI'm very confused as to why that only broke in one job though. That must only be used by the ansible driver01:40
JayFI'm curious if there's a forward looking fix we could make, but we would still have to delay the removal of the methods01:41
JayFSomething to look at tomorrow morning at a computer01:41
TheJuliawe could likely unwire it from using the call01:48
TheJuliaI'm not sure why it does it to begin with since the call should already be on the conductor01:48
TheJulia... I didn't realize we had conductor to conductor call logic anywhere01:48
TheJuliabut... eh01:48
TheJuliabrraaaains01:48
JayFThe only thing I can think of is if there was some reason you wanted to allow for a hash ring change02:18
JayFBecause I bet there are cases where you can RPC to a different conductor to continue the next step02:18
ashinclouds[m]That would be a weird case to still be working and pick the work up elsewhere02:30
ashinclouds[m]But, I could see cases where that might make sense02:31
opendevreviewJacob Anders proposed openstack/sushy-tools master: Replace hardcoded BiosVersion with an updatable field  https://review.opendev.org/c/openstack/sushy-tools/+/90948706:38
opendevreviewJacob Anders proposed openstack/sushy-tools master: [WIP] Add support for BIOS update emulation  https://review.opendev.org/c/openstack/sushy-tools/+/90950006:43
opendevreviewJacob Anders proposed openstack/sushy-tools master: [WIP] Add support for BIOS update emulation  https://review.opendev.org/c/openstack/sushy-tools/+/90950006:55
dtantsurTheJulia: re https://review.opendev.org/c/openstack/ironic/+/906113: I did not have time to properly work on it. I did file https://bugs.launchpad.net/ironic/+bug/2049913 as a large piece of work that can help.07:46
rpittaugood morning ironic! o/08:21
*** nfedorov is now known as jingvar09:22
dtantsurJayF: is it expected that I don't have +2 on unmaintained/yoga?10:48
opendevreviewcid proposed openstack/ironic master: Fix multiple assignment of redfish_system_id during node creation  https://review.opendev.org/c/openstack/ironic/+/90985110:57
opendevreviewcid proposed openstack/ironic master: Fix multiple assignment of redfish_system_id during node creation  https://review.opendev.org/c/openstack/ironic/+/90985111:33
jingvarHi folks12:58
jingvarI'm having an issue with ironic/ironic-inspector12:59
jingvarIronic was deployed with kayobe/kolla ansible without openstack services on 3 control nodes13:00
jingvarOnly rabbit+ironic+maria+(finally keystone + glance)13:02
JayFdtantsur: no13:02
jingvarWhen I start inspection, Inspector sends a message over rabbitmq to someone 13:03
jingvaroslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID13:04
jingvarIt breaks inspection process13:04
jingvarBut all consumers are present in Rabbit 13:05
jingvarThere is a strange thing - inspection is Failed, but Conductor starts powering on the node etc13:08
jingvarI've tried Zed and Bobcat13:11
jingvarFile "/var/lib/kolla/venv/lib64/python3.9/site-packages/ironic_inspector/main.py", line 379, in api_introspection13:13
jingvarclient.call({}, 'do_introspection', node_id=node_id13:13
jingvarThis step13:13
jingvarIt looks like misconfigured exchanges13:15
SvenKieskedo you happen to run rabbitmq with quorum queues enabled?13:29
dtantsurjingvar: I seriously wonder whether (and if yes - WHY) kolla configured inspector in an HA setup..13:33
SvenKieskewhat do you mean by HA? only because something is deployed on three nodes doesn't make it HA :)13:37
dtantsurSvenKieske: inspector does not need rabbitmq in a non-HA scenario13:37
dtantsur(also, inspector's HA support is an experimental feature)13:37
dtantsurquorum queues at least might be enabled https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/ironic/templates/ironic-inspector.conf.j2#L24-L2613:38
TheJuliagood morning13:43
SvenKieskeit seems it was at least partly enabled without any conditionals on HA usage, but I have to dig up the details: https://review.opendev.org/c/openstack/kolla-ansible/+/63236913:44
jingvarSvenKieske: yes quorum by default13:44
dtantsurjingvar: will it work if you revert https://review.opendev.org/c/openstack/kolla-ansible/+/632369 locally?13:44
SvenKieskethose are not all the commits which enable the rabbitmq stuff for ironic-inspector, I _guess_..13:46
jingvardtantsur: how does inspector connect to conductor?13:46
dtantsurjingvar: public API (HTTP+JSON, the normal way)13:47
jingvarI have several conductors and one inspector13:47
dtantsurjingvar: a correction: try setting transport_url=fake://13:47
jingvarwill try13:48
SvenKieskedtantsur: I guess this is the culprit: https://review.opendev.org/c/openstack/kolla-ansible/+/868305/5/ansible/roles/ironic/templates/ironic-inspector.conf.j213:53
SvenKieskeI mean this was even not really correct before, because there is no check to enable the rabbitmq transport only for HA deployments, it was conditional on TLS instead..13:55
SvenKieskejingvar: would you be so kind to file a bug against kolla-ansible for this? I happen to be a maintainer and would like to fix this.13:55
dtantsurProbably, but in the end, you should use fake:// if it's a standalone (no API/worker split) inspector without HA13:56
jingvarI want conductor HA13:57
SvenKieskedtantsur: do you have any docs around this on your side? if I fix this on our side. I want to do it correctly :)13:57
TheJuliagood morning13:58
jingvarHA for deployment purposes13:58
jingvarinspector can be non HA13:58
dtantsuryeah, we're only talking about inspector now13:58
SvenKieskejingvar: alright, you might need this patch as well: https://review.opendev.org/c/openstack/networking-baremetal/+/90399513:58
dtantsurironic can and should be deployed with rabbit and stuff13:59
dtantsurSvenKieske: https://specs.openstack.org/openstack/ironic-inspector-specs/specs/splitting-service-on-API-and-worker.html might be the best we have14:00
dtantsurSvenKieske: but do keep https://specs.openstack.org/openstack/ironic-specs/specs/approved/merge-inspector.html in mind for the future14:00
SvenKieskethx, still, a bugreport would be nice. I don't know if I get around to it myself today, as I'm swamped with meetings.14:02
jingvarI don't use ssl14:03
SvenKieskejingvar: are you referring to the commit dtantsur asked you to revert? I guess this is just a refactoring artifact. I guess this is really a bug on how we deploy ironic-conductor in kolla. we can also move to #openstack-kolla if you like14:04
jingvarsorry which one?14:06
jingvartransport url to fake&14:07
jingvar?14:07
SvenKieskeno, sorry, this is probably a misunderstanding. I'm in the middle of a meeting, will respond later..14:07
jingvarthx14:08
SvenKieskejingvar: I created a bug report regarding the unnecessary transport via rabbitmq for non HA deployments of ironic-inspector, if you are interested: https://bugs.launchpad.net/kolla-ansible/+bug/205470514:55
*** nfedorov is now known as jingvar15:04
jingvardtantsur: transport_url fake:// in inspector.conf helped me, thanks a lot15:12
dtantsurjingvar: okay, so FYI SvenKieske ^^^15:13
jingvarwhat about inspector HA? 15:14
dtantsurDo you really *need* it?15:14
jingvaron Zed I faced a failed deploy with a broken inspector, but it works on Bobcat15:15
jingvarprobably it is IPA behavior15:15
jingvaron Zed it stops the deploy when it gets a timeout on update introspection15:16
jingvarBobcat ignores the error and does the work15:17
SvenKieskejingvar: could you post your actual error message (trace) and tested openstack release to the above bug report? That would help me a lot. if you don't have an account you can also use paste.openstack.org and I copy it15:17
jingvarof course, give me a minute15:18
SvenKieskethank you15:19
jingvarfirst env https://paste.opendev.org/show/bscQiIEnqqfwzW2Vv3yx/15:21
jingvarnext one https://paste.opendev.org/show/bTfJ1XRpmkqDw2JisoZr/15:23
dtantsurIt looks like an unpatched queue. I wonder if it's the eventlet error that JayF was dealing with recently.15:23
jingvartake a look at the last lines - inspection started15:23
dtantsurSo, just to clarify: hasn't using fake:// made the problem go away?15:24
jingvardtantsur: As I wrote above, transport_url fake:// in inspector.conf helped me, thanks a lot15:26
dtantsurSvenKieske: my recommendation for Kolla would be: use fake:// transport and start only one copy of inspector (as you probably already do).15:27
dtantsurFair for the inspector merge work to happen for HA15:27
dtantsurs/Fair/Wait/ (brain y u so)15:27
SvenKieske:D alright, thanks!15:28
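
Editor's sketch of the workaround discussed above, i.e. setting transport_url to fake:// in the [DEFAULT] section of inspector.conf for a standalone, non-HA inspector. The helper name set_fake_transport and the example rabbit:// URL are hypothetical and used only for illustration; the option name and section come from the configuration link SvenKieske posts later in this log. It uses only Python's standard configparser, which drops comments and does not handle repeated keys the way oslo.config does, so treat it as an illustration rather than a deployment tool.

```python
import configparser
import io


def set_fake_transport(conf_text: str) -> str:
    """Return conf_text with [DEFAULT] transport_url set to fake://.

    Hypothetical helper for illustration only; comments in the input
    are not preserved by configparser.
    """
    # Disable interpolation so '%' characters in URLs are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep option names exactly as written instead of lowercasing them.
    parser.optionxform = str
    parser.read_string(conf_text)
    parser["DEFAULT"]["transport_url"] = "fake://"
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Assumed example content; a real inspector.conf will contain more options.
example = (
    "[DEFAULT]\n"
    "transport_url = rabbit://user:pass@controller:5672/\n"
    "debug = true\n"
)
updated = set_fake_transport(example)
print(updated)
```

Editing the file by hand achieves the same result; the point is only that a single inspector without the API/worker split does not need a real message bus.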
jingvarSvenKieske: stable/2023.215:28
SvenKieskethank you, I take it these pastes are not set to expire? :)15:29
jingvarKayobe uses one inspector mgoddard: comment it15:29
SvenKieskeyeah, I noticed that15:29
SvenKieskein the code, that is15:30
jingvarSvenKieske: I'm not sure, just a random pastebin15:31
SvenKieskehum15:44
SvenKieskedtantsur: may I kindly ask, why "rabbit://" is the default transport_url then? I guess this is how it ended up in our config, because we try to use upstream defaults where possible.15:45
SvenKieskehttps://docs.openstack.org/ironic-inspector/2023.2/configuration/ironic-inspector.html#DEFAULT.transport_url15:45
*** nfedorov_ is now known as jingvar15:45
jingvarThere is a hole and there is a rabbit15:48
dtantsurthis ^^^ :)15:49
dtantsurSvenKieske: we had different plans back in the day. They never came to reality, instead we have the plan to freeze inspector as it is and migrate its functionality gradually into Ironic.15:50
SvenKieskeyeah, but maybe that would be a case where the default transport_url should be switched, given it doesn't really seem to work? or am I missing something else here? or at least document that most users don't want the default but instead "fake://"?15:52
dtantsurYeah, the documentation is lacking for sure15:55
dtantsurIt's designed to work with the API/worker split (which was thought to be a future mode of operation similar to the Ironic's API/conductor split)15:56
SvenKieskealright. I don't want to promise a docs patch myself as my backlog is already growing, maybe I get around to it some day..15:57
jingvar:) ^^^16:08
* TheJulia needs even more coffee today16:23
JayFdtantsur: I'm not certain it's an eventlet issue at a cursory glance. I won't have time to look in depth this week16:24
rpittaugood night! o/16:25
cidlittle troubleshooting, but it's not very clear, did I make a breaking change: https://review.opendev.org/c/openstack/ironic/+/90985117:12
TheJuliacid, an opportunity to learn :)17:13
cidhonestly, been doing a ton of learning recently.17:14
JayFcid: so since you have node options defined inside the loop, it's still getting the same problem: that same node_options is gonna be used for each time around17:15
JayFone potential solution would be to keep deploy options outta node_options, and add deploy_options (probably nicer to name node_deploy_options?) to the call on Line 2620-ish17:17
TheJuliamaybe being mindful of the desire to mix ipmi+redfish might also change the overall approach to fix it, but that is also a bit more of a lift structure/formatting/pattern wise17:21
JayFoooh17:25
JayFthat's a good point17:25
TheJuliaJust looking at it, we're pre-preparing, maybe we just need to prepare each time17:25
JayFmake the while loop longer?17:26
JayFthat makes sense17:26
TheJuliayeah17:26
cidhmm, makes sense (based on my understanding).17:30
cidstarts working on patch set 317:30
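
Editor's sketch of the pattern JayF and TheJulia describe above: options that are prepared once and then reused or extended on every pass through a loop, versus options that are prepared fresh for each node. The code under review is not shown in this log, so this is only a generic illustration; every name here (build_nodes_shared, build_nodes_fresh, the option strings) is hypothetical and does not reflect the actual change in review 909851.

```python
def build_nodes_shared(system_ids):
    """Problem pattern: one options list is prepared up front and mutated per node."""
    node_options = ["--driver redfish"]
    commands = []
    for system_id in system_ids:
        # Each pass appends to the same list, so earlier IDs leak into later nodes.
        node_options.append(f"--driver-info redfish_system_id={system_id}")
        commands.append(" ".join(node_options))
    return commands


def build_nodes_fresh(system_ids):
    """Suggested pattern: prepare the options each time around the loop."""
    commands = []
    for system_id in system_ids:
        node_options = [
            "--driver redfish",
            f"--driver-info redfish_system_id={system_id}",
        ]
        commands.append(" ".join(node_options))
    return commands


shared = build_nodes_shared(["sys-1", "sys-2"])
fresh = build_nodes_fresh(["sys-1", "sys-2"])
print(shared[1])
print(fresh[1])
```

In the shared version the second node ends up with two redfish_system_id assignments, which matches the symptom named in the patch title; rebuilding the options per iteration avoids it.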
JayFI'll note17:32
JayFwell, nevermind17:33
JayFwas about to make a bad suggestion :D17:33
cid"bad"17:34
JayFwas going to suggest that you can likely test quickly outside of devstack17:35
JayFbut realized you'd likely need to set 9000 environment variables17:35
cidset 9,000 environment variables once?17:36
TheJuliaeh, 9000 is not awful, but 9001 is concerning ;)17:37
JayFThe reason the idea was bad is I thought it'd simplify things, but it wouldn't have been simpler :D 17:40
cidyea17:41
cidbecause I was about to ask/enquire, I would love to provide updates to my code based on the test results, it's just the resource intensity.17:42
cidit's the machines that will feel my wrath :)17:42
cidhence the nudge needing questions.17:42
JayFone of the tricks if you're worried about eating up too many CI resources is you can edit zuul.d/[projectname].yaml and comment out jobs from check/gate; but I wouldn't worry about it much17:45
cidowkay, will probably try that.17:49
JayFDon't spend too much time on it. CI time is important and limited, but so is your time.17:55
opendevreviewJulia Kreger proposed openstack/ironic master: WIP/DNM: See what explodes when cleaning is enabled for functional tests  https://review.opendev.org/c/openstack/ironic/+/90991817:59
TheJuliaWheeeeee https://bugs.launchpad.net/ironic/+bug/205472218:00
TheJulia(and by Wheee, I mean a sarcastic sound of joy which is actually pain)18:00
JayFif fake: pass18:01
JayFdid I fix it? /s18:01
JayFhonestly fake agent in sushy seems more appealing lol18:01
JayF*sushy-tools18:01
TheJuliamight be useful, but sort of also disjointed from the fact the tests expect cleaning disabled on a prod cloud18:14
opendevreviewcid proposed openstack/ironic master: Fix multiple assignment of redfish_system_id during node creation  https://review.opendev.org/c/openstack/ironic/+/90985119:22
opendevreviewJulia Kreger proposed openstack/ironic master: neutron: do not error if no cleaning/provisioning on launch  https://review.opendev.org/c/openstack/ironic/+/90993721:14
opendevreviewJulia Kreger proposed openstack/ironic master: fix errors messaging around network mappings  https://review.opendev.org/c/openstack/ironic/+/90993821:14
TheJuliaso the neutron one is one I care about and want to backport. I went to add new testing and realized the underlying callers/methods were also fairly well tested so only added a note.21:15
opendevreviewJulia Kreger proposed openstack/ironic stable/2023.1: WIP/DNM: See what explodes when cleaning is enabled for functional tests  https://review.opendev.org/c/openstack/ironic/+/90982821:22
TheJuliaIn today's "did I already fix this?!"21:23
JayFthose are fun changes to "show blame" on21:32
JayFand you can explicitly see we added that *before* validate existed21:32
TheJuliayup21:34
JayFSometime around 2pm pacific, I'm going to be talking to adamcarthur5 about finding some tasks to extract eventlet dependencies from IPA21:38
JayFif you want in on that conversation, lmk21:38
* JayF suspects "find a new WSGI server" is likely the winner there21:38
JayFany patches that do substantive changes we'll likely hold until after C is branched 21:39
opendevreviewJulia Kreger proposed openstack/ironic-tempest-plugin master: Invoke allocation tests with 'fake' deploy interface  https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/90993921:45
TheJuliaJayF: I happen to be available until 322:01
JayFTheJulia: https://us06web.zoom.us/j/89271237336?pwd=IqQRbboSzDyqhRWsgXUy12Kp9vhqde.122:03
TheJuliaI need to do one thing, but I can do it in the background22:03
opendevreviewAdam McArthur proposed openstack/ironic-python-agent master: Adding support for viewing individual cpu process info  https://review.opendev.org/c/openstack/ironic-python-agent/+/90934622:06
opendevreviewJulia Kreger proposed openstack/ironic master: WIP/DNM: Don't setup fake first!!!  https://review.opendev.org/c/openstack/ironic/+/90991822:11
JayFhttps://etherpad.opendev.org/p/ipa-eventlet-wsgi is our notes coming out of that chat; I think the plan will be for adamcarthur5 to do some exploration of possibilities and come to PTG with ideas and maybe (time permitting) prototypes. If you have strong opinions and want to share them, etherpad is a good place23:10
JayFI wonder if another low-hanging-fruit for migration could be removing the eventlet monkey_patch and explicitly importing the greened versions of socket for use in image streaming (I suspect this'd have to be stacked after 'get rid of eventlet wsgi')23:11
TheJuliaso yeah, I think I want to burn down the existence of the pure functional test jobs23:42
TheJuliasince... they default to run fake which breaks if a non-fake driver gets loaded for... say... deploy_interface23:42
