Monday, 2023-11-20

02:40 <opendevreview> Boushra Sondos Bettir proposed openstack/ironic master: Support both OVN switches, logical and physical switches.  https://review.opendev.org/c/openstack/ironic/+/900568
08:21 <rpittau> good morning ironic! o/
08:58 <masghar> Good morning!
10:28 <dtantsur> JayF: https://launchpad.net/~virtualpdu-bugs/+members#active is not us, should we ask someone to fix it or should we drop virtualpdu from the triaging dashboard?
10:28 <dtantsur> JayF: also please check my comment on https://bugs.launchpad.net/ironic/+bug/2040236
11:24 <iurygregory> good morning Ironic
11:24 <iurygregory> dtantsur, if I recall correctly he will be off this week
11:24 <dtantsur> ah, thanksgiving
12:28 <iurygregory> ok, time to go back to the multipath bug in IPA \o/
12:28 <iurygregory> and drink my second coffee (it's not even 10 am)
14:05 <iurygregory> I've created https://bugs.launchpad.net/ironic-python-agent/+bug/2043992 for the bug I'm working on
14:25 <opendevreview> Mark Goddard proposed openstack/bifrost stable/2023.2: ironic: Perform online data migrations with localhost DB  https://review.opendev.org/c/openstack/bifrost/+/901296
14:25 <opendevreview> Mark Goddard proposed openstack/bifrost stable/2023.1: ironic: Perform online data migrations with localhost DB  https://review.opendev.org/c/openstack/bifrost/+/901297
14:26 <opendevreview> Mark Goddard proposed openstack/bifrost stable/zed: ironic: Perform online data migrations with localhost DB  https://review.opendev.org/c/openstack/bifrost/+/901298
14:26 <opendevreview> Mark Goddard proposed openstack/bifrost stable/yoga: ironic: Perform online data migrations with localhost DB  https://review.opendev.org/c/openstack/bifrost/+/901299
14:58 <dtantsur> Do we even have the meeting today?
15:00 <iurygregory> dtantsur, now I'm wondering the same thing
15:00 * dtantsur checks IRC logs
15:01 <dtantsur> TheJulia to run the 11/20 meeting
15:01 <dtantsur> okay, let's do it
15:02 <dtantsur> #startmeeting ironic
15:02 <opendevmeet> Meeting started Mon Nov 20 15:02:00 2023 UTC and is due to finish in 60 minutes.  The chair is dtantsur. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:02 <opendevmeet> The meeting name has been set to 'ironic'
15:02 <dtantsur> #chair TheJulia
15:02 <opendevmeet> Current chairs: TheJulia dtantsur
15:02 <TheJulia> o/
15:02 <iurygregory> o/
15:02 <dtantsur> TheJulia: wanna take it from here?
15:02 <TheJulia> sorry, slightly distracted this morning
15:02 <TheJulia> sure
15:02 <TheJulia> #topic Announcements / Reminders
15:03 <dtantsur> #info https://github.com/dtantsur/ironic-bug-dashboard is revived and can be used for triaging (run locally)
15:03 <TheJulia> A standing reminder to review patches tagged with ironic-week-prio. The dashboard will be linked shortly.
15:03 <TheJulia> #link https://tinyurl.com/ironic-weekly-prio-dash
15:03 <TheJulia> Does anyone else have anything to announce or remind us of?
15:03 <dtantsur> We need to add a section for bug triaging, don't we?
15:03 <TheJulia> It appears we don't have any action items
15:04 <iurygregory> dtantsur, ++
15:04 <TheJulia> Has the policy proposal change merged?
15:04 <dtantsur> I don't believe so (if you mean re bugs)
15:05 <dtantsur> #link https://review.opendev.org/c/openstack/ironic/+/900449 Bug deputy proposal
15:05 <dtantsur> But I was a deputy nonetheless and ready to provide an update :)
15:05 <TheJulia> okay, fair! :)
15:06 <dtantsur> #topic Bug deputy update
15:06 <dtantsur> First, as I've mentioned already, the bug dashboard is functional again
15:06 <dtantsur> We don't have a place to host it yet, but it can be trivially run locally either with `tox -erun` or with the provided Dockerfile
15:06 <dtantsur> #link https://github.com/dtantsur/ironic-bug-dashboard bug dashboard
15:07 <dtantsur> Second, I've done a major bug clean up:
15:07 <dtantsur> Ironic: 184 bugs (-31) + 189 wishlist items (+9). 29 new (-29), 96 in progress (-19), 3 critical (-1), 23 high (+2) and 13 incomplete (+1)
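[As an editorial aside: the deputy summary line above is mechanical enough to generate. A small hypothetical Python helper, not part of ironic-bug-dashboard; the counts and deltas are copied from the log, everything else is invented for illustration.]

```python
# Hypothetical helper that renders a bug-deputy summary line like the one
# above. Counts/deltas are copied from the log; the function itself is
# illustrative and not part of ironic-bug-dashboard.
def summary(project: str, counts: dict, deltas: dict) -> str:
    """Render 'N category (+/-delta)' items into one summary line."""
    items = [f"{counts[c]} {c} ({deltas[c]:+d})" for c in counts]
    return f"{project}: " + ", ".join(items)

line = summary(
    "Ironic",
    {"new": 29, "in progress": 96, "critical": 3, "high": 23, "incomplete": 13},
    {"new": -29, "in progress": -19, "critical": -1, "high": 2, "incomplete": 1},
)
print(line)
# Ironic: 29 new (-29), 96 in progress (-19), 3 critical (-1), 23 high (+2), 13 incomplete (+1)
```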
15:07 <dtantsur> Nova bugs with Ironic tag: 23. 0 new, 0 critical, 1 high
15:07 <dtantsur> A lot of bugs got stuck in the open state despite being merged, probably because of the transition to storyboard back then
15:08 <dtantsur> I don't think I've cleaned up everything, but I made some progress
15:08 <dtantsur> Third, and this is a reminder:
15:08 <dtantsur> #info Please triage your bugs if you're a constant member of the team. This includes setting status to Triaged, setting priority and updating tags.
15:09 <dtantsur> Fourth, we have virtualpdu on our radar, but nobody here has ACL for its bugs.
15:09 <dtantsur> rpittau: was it you show worked on the virtualpdu ownership the last time?
15:09 <dtantsur> s/show/who/
15:09 <TheJulia> I believe it was rpittau
15:09 <rpittau> o/
15:10 <rpittau> yes
15:10 <rpittau> it was me
15:10 <dtantsur> rpittau: can you ping your contacts again re launchpad ownership?
15:10 <rpittau> of course!
15:11 <dtantsur> #action rpittau to ask to change the launchpad ownership for virtualpdu to us
15:11 <dtantsur> Last but not least, does anyone want to be the bug deputy this week?
15:12 <TheJulia> My schedule is moderately crazy this week and I've got a long weekend
15:12 <dtantsur> The crickets tell me that it's on me again :)
15:12 <TheJulia> I likely can next week
15:12 <iurygregory> I can give it a try dtantsur
15:12 <dtantsur> nice!
15:12 <iurygregory> if I have questions I will ping you =P
15:13 <dtantsur> iurygregory: something I never got to: check the storyboard for things that we might want transferred
15:13 <dtantsur> and yes, never hesitate to ping me
15:13 <dtantsur> #action iurygregory is the bug deputy this week (and TheJulia potentially next time)
15:13 <iurygregory> ack
15:13 <TheJulia> Onward?
15:13 <dtantsur> Unless there are questions for me already
15:13 <iurygregory> yes
15:13 <TheJulia> #topic Caracal release schedule
15:14 <TheJulia> #link https://releases.openstack.org/caracal/schedule.html
15:14 <TheJulia> Last week was Caracal-1. Caracal-2 is the week of January 8th.
15:14 <TheJulia> #topic Review Ironic CI status & update whiteboard if needed
15:14 <dtantsur> Hmm, is it about the time we make our intermediary release?
15:14 <TheJulia> I suspect it was last week or is around now
15:15 <TheJulia> that being said, we had the metal3-integration job broken last week.
15:15 <rpittau> I'm going for a bugfix release next week
15:15 <TheJulia> rpittau: ack, thanks
15:15 <dtantsur> Bobcat was Oct 02, so we should aim for around Dec 02, I assume?
15:15 <TheJulia> What is the word on the metal3-integration CI job?
15:15 <rpittau> dtantsur: yeah, I'll propose it at the beginning of next week, and it should merge by the end of the week
15:16 <dtantsur> but yeah, next week will make the math somewhat better down the road
15:16 <rpittau> TheJulia: it's fixed
15:16 <TheJulia> cool, so we should be in a better place this week. That was the only issue I was aware of last week.
15:16 <TheJulia> Onward?
15:16 <dtantsur> ++
15:16 <rpittau> go!
15:17 <TheJulia> #topic Bug Deputy role proposal
15:17 <TheJulia> #link https://review.opendev.org/c/openstack/ironic/+/900449
15:17 <TheJulia> The document is still outstanding, please review this week.
15:18 <TheJulia> I've rechecked it to hopefully clear the -1 on it
15:18 <TheJulia> Since we have no RFEs to review, we shall proceed to Open Discussion if there is no further discussion?
15:19 <dtantsur> ++
15:19 <TheJulia> #topic Open Discussion
15:19 <TheJulia> so, httpboot is looking good, grub just acts weird though
15:19 <dtantsur> stop giving catnip to your grub!
15:20 <dtantsur> (sorry)
15:20 <TheJulia> lol
15:20 <iurygregory> not the first time grub acts weird lol
15:20 <TheJulia> well, I pinged some maintainers, got asked questions which I had already tried to answer
15:20 <dtantsur> absolutely unheard of!
15:20 <dtantsur> anything we can help with?
15:20 <TheJulia> and then opened the code, looks like it is edk2 or substrate networking
15:20 <TheJulia> so it might be okay, I think I'll try to polish the patches
15:21 <TheJulia> I just need to get the redfish sushy and sushy-tools changes merged since I'll need to update requirements.txt
15:21 <dtantsur> links?
15:21 <TheJulia> sure
15:22 <iurygregory> I will try to review them this week
15:22 <TheJulia> https://review.opendev.org/c/openstack/sushy-tools/+/901208 <-- this really fixes the prior change
15:22 <TheJulia> #link https://review.opendev.org/c/openstack/sushy/+/718276
15:23 <TheJulia> Once I can get the base ironic change updated with a released sushy-tools, the jobs will go green
15:23 <dtantsur> nice
15:23 <TheJulia> I did deviate a little from the spec on the redfish and dhcp driven network booting paths, but nothing horrible
15:24 <TheJulia> and ultimately it gives us the ability to still sort of do the ipxe logic dance for those who really want openstack integrated ipxe
15:24 <TheJulia> I'll keep sorting out grub, it does the needful though. It is something in grub where we might just want to document that the substrate is there and it sort of works, but we've seen some issues which are expected to be unrelated
15:25 * dtantsur is a bit worried by the amount of "sort of" :)
15:25 <TheJulia> Well, it loads shim, shim chains to grub
15:25 <TheJulia> and then grub sort of falls down on step 5 of packet processing semi-randomly
15:25 <TheJulia> which leverages the http logic handler from UEFI
15:25 <TheJulia> so......
15:26 <TheJulia> ¯\_(ツ)_/¯
15:26 <dtantsur> fun
15:26 <TheJulia> Yeah, it doesn't block us from merging an interface, but it blocks us from having verbose "it works!" docs
15:27 <rpittau> "it sort of works" :)
15:27 <TheJulia> the underlying code is all identical though, so high confidence if whatever ubuntu's grub is doing can be sorted out
15:27 <dtantsur> make a good sacrifice to the bootloader gods
15:27 <TheJulia> "It sort of works, go complain to your vendor if it doesn't"
15:27 <rpittau> lol
15:27 <dtantsur> TheJulia: can we try with another distro?
15:27 <TheJulia> I believe the bootloader gods are the elder gods
15:28 <TheJulia> dtantsur: I can, the CI changes themselves are way down the series of changes
15:28 <TheJulia> I'm also tempted to write a new "exercise all available boot interfaces" tempest job so we can do it in one shot versus scenario test after scenario test
15:28 <dtantsur> What I'm curious about is whether this is a fundamental grub problem or just something in the ubuntu build
15:29 <TheJulia> dtantsur: I'm honestly suspecting environmental grub + environment + ed2k
15:29 <TheJulia> err, edk2
15:29 <TheJulia> Next up, Cthulhu will appear to discuss grub
15:30 <rpittau> non-euclidean geometry is our last chance
15:31 <TheJulia> will this permit us to fold space and time, finally?
15:31 <TheJulia> ... it is clear, the meeting is over, we've folded it away.
15:31 <dtantsur> \o/
15:31 <TheJulia> Anything else, folks, before I wrap up today's meeting?
15:31 <rpittau> one small thing
15:32 <TheJulia> yes?
15:32 <rpittau> the API for the attach/detach virtual media patch has got its first +2 https://review.opendev.org/c/openstack/ironic/+/894918
15:32 <TheJulia> \o/
15:33 <rpittau> we're a bit in a rush with that, if anyone has a moment for a review that would be great :)
15:33 <rpittau> that's all, thanks!
15:35 <TheJulia> Thanks
15:35 <TheJulia> Hopefully we can do the next release with some httpboot stuffs :)
15:35 <TheJulia> (that would be epic)
15:35 <dtantsur> True
15:36 <TheJulia> Well, if there is nothing else, it seems we have code review to do
15:37 <TheJulia> Thanks everyone, have a wonderful week
15:37 <TheJulia> Oh, one last thing
15:38 <TheJulia> Anyone want to run the meeting next week?
15:38 <dtantsur> won't Jay be back?
15:38 <TheJulia> He will be, but that doesn't mean we can't volunteer someone so he doesn't have to worry about it
15:38 <rpittau> I can run it
15:39 <TheJulia> ack, thanks. due to daylight savings time, it is a bit early for him too
15:39 <TheJulia> Anyhow, thanks everyone!
15:39 <TheJulia> #endmeeting
15:39 <opendevmeet> Meeting ended Mon Nov 20 15:39:26 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
15:39 <opendevmeet> Minutes:        https://meetings.opendev.org/meetings/ironic/2023/ironic.2023-11-20-15.02.html
15:39 <opendevmeet> Minutes (text): https://meetings.opendev.org/meetings/ironic/2023/ironic.2023-11-20-15.02.txt
15:39 <opendevmeet> Log:            https://meetings.opendev.org/meetings/ironic/2023/ironic.2023-11-20-15.02.log.html
15:46 <rpittau> dtantsur: I sent a mail about the virtualpdu launchpad space ownership
15:46 <dtantsur> thanks!
15:58 *** dmellado2 is now known as dmellado
16:16 <arne_wiebalck> Good morning, Ironic!
16:16 <arne_wiebalck> I am trying to get some backports wrapped up: https://review.opendev.org/q/I13617ae77515a9d34bc4bb3caf9fae73d5e4e578
16:18 <arne_wiebalck> The Yoga one fails in tempest tests, though, and I am not really clear on whether it is due to the patch or something unrelated ... if someone has a moment to push me in the right direction ...
16:29 <dtantsur> TheJulia: do I recall it right that we want to deprecate WSMAN support? I was not in that session.
16:31 <Sandzwerg[m]> Hi Ironic. We had an issue today when a deployment failed because of "Node b95c6b54-15a0-4cc4-a7c9-9566f2bd9fbd failed to validate deploy image info. Some parameters were missing. Missing are: ['instance_info.image_source']"}, power: {'result': True}, storage: {'result': True}" https://paste.opendev.org/show/borGja9FA4jBiqvMLSjt/ But from my understanding of that code & OpenStack
16:31 <Sandzwerg[m]> https://github.com/sapcc/ironic/blob/stable/xena-m3/ironic/drivers/modules/agent.py#L466-L473 / https://opendev.org/openstack/ironic/src/branch/master/ironic/drivers/modules/agent.py#L436-L443 the image source is not an image parameter but something set by nova(?). Some time later I was able to make a successful deployment, using the exact same image as during this error.
16:33 <dtantsur> rpittau: yay https://launchpad.net/virtualpdu
16:33 <rpittau> \o/
16:38 <TheJulia> dtantsur: yeah, basically I think since there are not really humans around anymore
16:39 <TheJulia> dtantsur: and dell has basically chosen a redfish first/only path forward
16:40 <TheJulia> Sandzwerg[m]: how did you attempt to perform the deployment each time? generally, yes, we expect image source to be a UUID of an image, and it missing seems very problematic for a deployment
16:42 <TheJulia> arne_wiebalck: I suspect unrelated; based upon the error on the agent console, it looks eventlet/ssl related :\
16:42 <TheJulia> https://www.irccloud.com/pastebin/c0lwJ9gX/
16:45 <Sandzwerg[m]> TheJulia: There is some automation that did it. When I look at the request spec of the instance in nova I see the image data including the UUID, that's why I'm a bit puzzled that nova/ironic thinks it's missing. https://paste.opendev.org/show/bANjk3pJ7yFMUdecTURz/ has the instance spec
16:45 <arne_wiebalck> thanks for checking, TheJulia! The same tests failed on the first attempt, so I was hesitant to do a bare recheck again.
16:46 <dtantsur> TheJulia: it would be good to have some written artefact for it. There is a certain impact for us in Metal3.
16:46 <Sandzwerg[m]> I also don't remember seeing this issue before. As the image seems to work on ironic, the image itself should be okay as well. I also checked if some ironic service was restarting during that timeframe but found nothing. Haven't checked nova yet, will do that quickly.
16:47 <TheJulia> Sandzwerg[m]: the only thing I can think of is that there was some race condition, possibly between scheduling across multiple distinct nova computes?! which helped create that condition.
16:47 <TheJulia> dtantsur: yeah, it is on my todo list
16:47 <dtantsur> nice
16:47 <dtantsur> same with ibmc and ilo4, if that ever happens
16:48 <TheJulia> Sandzwerg[m]: if you can pin down what exactly, that might help, but the only thing I can think of is that two locks maybe came in at the same time from two distinctly different updates
16:48 <TheJulia> and one either failed, or just didn't complete, or $something, dunno.
16:48 <TheJulia> or one started, failed, and yeah
16:49 <TheJulia> Sandzwerg[m]: if you have multiple nova computes, I would start looking there, especially if peer_list is not set
16:49 <TheJulia> dtantsur: Yeah, I was sort of thinking a whole list of them
16:49 <Sandzwerg[m]> TheJulia: Interesting, as we currently have a 1:1 connection between a block of nodes (a small group of nodes) and an ironic & nova conductor, so two conductors working on it in parallel should be impossible, but yeah, something like a race condition sounds sensible
16:50 <TheJulia> Sandzwerg[m]: well, you can have one nova conductor, but many nova-computes
16:50 <TheJulia> the computes do the actual lift
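[An editorial aside on TheJulia's peer_list hint above: it refers to how multiple nova-compute services can share one set of ironic nodes via a hash ring. A hedged nova.conf sketch follows; the option names are recalled from nova's [ironic] configuration group and the values are invented, so treat this as illustration rather than a verified reference.]

```ini
# Illustrative only: a nova.conf fragment for grouping several
# nova-compute services over one set of ironic nodes. Option names are
# assumptions based on nova's [ironic] group; values are made up.
[ironic]
# Computes that share a partition_key divide the nodes via a hash ring;
# peer_list enumerates the compute service hostnames in that group.
partition_key = example-rack-1
peer_list = compute-1,compute-2
```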
16:50 <masghar> TheJulia: dtantsur: So would it be safe to say that mentions of the "idrac-wsman" raid interface can now become "idrac-redfish" in metal3? Or am I jumping ahead?
16:50 <dtantsur> masghar: jumping ahead a bit :)
16:50 <dtantsur> we'll need to make sure that the way we configure the node is compatible with Redfish
16:50 <TheJulia> the first thing is for us to define *what* exactly is deprecated
16:51 <masghar> Alright. And we also keep ilo5 as it is in metal3? Yeah exactly
16:51 <Sandzwerg[m]> TheJulia: for us that is also 1:1 at the time, so 1 block - 1 ironic conductor and 1 nova-compute-ironic
16:51 <TheJulia> and then go from there along to what dtantsur is saying
16:51 <dtantsur> masghar: yeah, ignore both for now
16:51 <TheJulia> Sandzwerg[m]: weird....
16:51 <masghar> Alright ^
16:51 <TheJulia> Sandzwerg[m]: do you have a read replica?
16:51 <TheJulia> for your database?
16:53 <Sandzwerg[m]> No, we have a single mariadb for a region; that will eventually change to a galera cluster but so far it hasn't been our bottleneck. I'm aware that our setup is not the usual way to do it, and we're thinking about switching to a more "default" one
16:54 <TheJulia> Nah, that is fine, just trying to understand where things went sideways in the workflow
16:54 <TheJulia> maybe a thread got lost?!
16:54 <TheJulia> arne_wiebalck: unfortunately I'm not sure rechecking will just make it work. Might be one of those things where all we can really do is unblock the branch to merge general fixes at this point
16:55 <TheJulia> dtantsur might recognize or be aware of the error, but it being the yoga branch I suspect we're not going to see it fixed with how it builds in CI
16:56 <arne_wiebalck> TheJulia: the other option is to abandon that specific backport and we carry fwd our downstream patch instead (for this one version)
16:56 <arne_wiebalck> TheJulia: no need to break stable/yoga :)
16:56 <TheJulia> well, I think it is already broken :)
16:56 <TheJulia> but yeah
16:56 <arne_wiebalck> heh
16:57 <TheJulia> so, this sort of explains what I think I saw on ironic-tempest-plugin, but instead we just routed around it by fixing the job configs to match which branches were voting
16:58 <Sandzwerg[m]> TheJulia: It might be fine, but we'll probably change our setup at some point nonetheless. It makes things more resilient if there is more than one conductor/nova-compute and probably also uses fewer resources. Regarding a lost thread: dunno, I'll look at the surrounding logs to see if I missed anything else. It's not a big issue right now, but I never saw it before and thought I'd ask here so I don't miss anything. Thanks for your help
16:58 <Sandzwerg[m]> :)
16:58 <TheJulia> dtantsur: there is no better time than now to write deprecations
16:59 <TheJulia> Sandzwerg[m]: I think I've seen something like that when two computes were interacting with the same node, but yeah.... that wouldn't be in your environment, so I don't know.
17:00 <rpittau> good night! o/
17:10 <Sandzwerg[m]> TheJulia: I should have checked the log better before, instead of focusing on the error message. It seems they tried to do a boot-from-volume, which obviously won't work with ironic. There is a flag in the instance spec
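[To summarize the thread above: the deployment failed Ironic's deploy-image validation because instance_info.image_source was never populated, which Sandzwerg[m] eventually traced to a boot-from-volume request. A simplified, hypothetical sketch of that style of check follows; the real logic lives in the ironic/drivers/modules/agent.py files linked earlier and covers more than this.]

```python
# Simplified, hypothetical sketch of Ironic's deploy-image validation;
# the real check is in ironic/drivers/modules/agent.py (linked above)
# and covers more parameters than this illustration does.
def missing_deploy_params(instance_info: dict) -> list[str]:
    """Return the deploy parameters missing from a node's instance_info."""
    required = ["image_source"]
    return [
        f"instance_info.{name}"
        for name in required
        if not instance_info.get(name)
    ]

# A node that never received an image reference from nova fails validation:
print(missing_deploy_params({}))  # ['instance_info.image_source']
# With an image UUID set (a made-up value), nothing is missing:
print(missing_deploy_params({"image_source": "0a1b2c3d-made-up-uuid"}))  # []
```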
17:36 <opendevreview> Merged openstack/ironic-python-agent stable/2023.2: Conditional creation of RAIDed ESP for UEFI Software RAID  https://review.opendev.org/c/openstack/ironic-python-agent/+/899325
18:03 <opendevreview> Merged openstack/ironic-python-agent stable/zed: Conditional creation of RAIDed ESP for UEFI Software RAID  https://review.opendev.org/c/openstack/ironic-python-agent/+/899860
19:05 <opendevreview> Julia Kreger proposed openstack/ironic master: Multiple driver related deprecations  https://review.opendev.org/c/openstack/ironic/+/901501
19:05 <opendevreview> Julia Kreger proposed openstack/ironic master: Deprecate configuration molds  https://review.opendev.org/c/openstack/ironic/+/901502
19:09 <opendevreview> Julia Kreger proposed openstack/sushy-tools master: Simplify UEFI logic and change the UefiHttp flow  https://review.opendev.org/c/openstack/sushy-tools/+/901208
19:10 <TheJulia> dtantsur: revised, with error messages
19:10 <TheJulia> good issue to spot

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!