Monday, 2020-05-11

*** livelace has joined #openstack-ironic00:02
*** livelace has quit IRC00:12
openstackgerrityuanliu proposed openstack/ironic master: If the [conductor]XXX_timeout is less than 0,disable periodic task  https://review.opendev.org/72379501:05
openstackgerrityuanliu proposed openstack/ironic master: If the "[conductor]XXX_timeout" is less than 0,disable periodic task  https://review.opendev.org/72379501:06
openstackgerrityuanliu proposed openstack/ironic master: If the "[conductor]XXX_timeout" is less than 0,disable periodic task  https://review.opendev.org/72379501:08
*** stevebaker has quit IRC01:18
*** yaawang has quit IRC01:35
*** yaawang has joined #openstack-ironic01:36
*** stevebaker has joined #openstack-ironic01:51
*** ricolin_ has joined #openstack-ironic01:54
*** ricolin_ has quit IRC03:25
*** uzumaki has quit IRC03:35
*** uzumaki has joined #openstack-ironic03:37
SpamapSTheJulia: aw I missed your IRC ping because I've kind of been ignoring le IRC. :-P04:14
*** ricolin has quit IRC05:22
*** yolanda has joined #openstack-ironic06:18
*** michchap has joined #openstack-ironic06:22
*** Qianbiao has joined #openstack-ironic06:38
arne_wiebalckGood morning, ironic!06:57
arne_wiebalckTheJulia: Yes, but we do not use Ironic's tfp infra, but point to our own central PXE/tftp infrastructure.06:59
arne_wiebalck*tftp06:59
arne_wiebalckTheJulia: force_raw_images is not set07:08
arne_wiebalckTheJulia: not sure what is the default07:08
arne_wiebalckTheJulia: what about this one: parallel_image_downloads07:09
arne_wiebalckTheJulia: Seems there is already some means to throttle?07:09
iurygregorygood morning arne_wiebalck and Ironic o/07:16
arne_wiebalckhey iurygregory o/07:34
*** rpittau|afk is now known as rpittau07:36
rpittaugood morning ironic! o/07:36
*** uzumaki has quit IRC07:37
iurygregorymorning rpittau o/07:42
rpittauhey iurygregory :)07:42
*** lucasagomes has joined #openstack-ironic07:56
*** dtantsur|afk is now known as dtantsur08:04
rpittaudtantsur|afk: you were asking for different distro support for devstack, centos8 was added -> https://review.opendev.org/726647 (haven't tested it yet)08:04
patchbotpatch 726647 - devstack (stable/ussuri) - CentOS 8 support - 2 patch sets08:04
dtantsurmorning ironic08:04
dtantsurw00t!08:04
rpittauhey dtantsur :)08:04
*** ericlei has joined #openstack-ironic08:06
dtantsurarne_wiebalck: commented on https://storyboard.openstack.org/#!/story/2007646. I don't think I like a generic reset-state API, to be honest. I'd rather fix any missing transitions we have.08:08
dtantsurand if we're fine with letting people bypass cleaning, maybe we allow setting node.automated_clean=False?08:10
arne_wiebalckdtantsur: I think the main transition we want is error->"something which does not require direct DB manipulation"08:11
dtantsurarne_wiebalck: you can use "deleted" on "error", no?08:12
dtantsurat least according to the state machine?08:12
*** ericlei has quit IRC08:12
* arne_wiebalck has a deja-vu deja-vu08:12
arne_wiebalckdtantsur: the command would be which one?08:13
dtantsur`openstack baremetal node undeploy`08:13
arne_wiebalckok, I don't think I ever tried this on error08:14
arne_wiebalckif this works, we covered at least my point 100%08:14
dtantsurour commands/transitions don't seem discoverable enough indeed08:14
arne_wiebalckwe discussed this one exactly, hence the double dja-vu08:14
dtantsurI wonder what we could do to improve that08:14
arne_wiebalckdeja -vu08:15
dtantsurheh08:15
dtantsurarne_wiebalck: okay, so what did you try? which options seemed obvious to you?08:15
arne_wiebalcknone, I was looking for something that would reset the state :-)08:15
*** akahat has quit IRC08:15
*** SpamapS has quit IRC08:15
arne_wiebalckundeploy does not sound like fixing nodes in error08:16
arne_wiebalckesp. as the node was not deployed, deployment failed08:16
*** SpamapS has joined #openstack-ironic08:16
arne_wiebalckI see now how this nicely includes cleaning08:17
dtantsurso, I'm pondering smth like `openstack baremetal node recover` which would apply the necessary transition08:17
dtantsuryeah08:17
arne_wiebalcksounds like sth I might have found08:18
arne_wiebalck;)08:18
dtantsurarne_wiebalck: may I edit your RFE to reflect this discussion and some other ideas we talked about on Friday?08:19
arne_wiebalckoh, sure!08:19
*** hjensas|afk has quit IRC08:20
*** alexmcleod has joined #openstack-ironic08:22
gudrutis2Morning all08:25
dtantsurarne_wiebalck: how does it look now? https://storyboard.openstack.org/#!/story/200764608:30
dtantsurmorning gudrutis208:30
dtantsurarne_wiebalck: next, I'm thinking about aborting `cleaning` and `deploying`, but these are tough08:31
dtantsurit's not hard to break a lock and update the state, but how to stop conductor from doing whatever it's doing now?08:31
openstackgerritDmitry Tantsur proposed openstack/ironic stable/ussuri: Add timeout and retries to JSON RPC client  https://review.opendev.org/72675308:33
openstackgerritDmitry Tantsur proposed openstack/ironic stable/train: Add timeout and retries to JSON RPC client  https://review.opendev.org/72675408:33
arne_wiebalckdtantsur: very nice, thank you!08:35
*** hjensas has joined #openstack-ironic08:35
gudrutis2dtantsur: I have a question about our beloved introspection rules. If I have an array of disks, can I count them and check the number in the condition?08:36
arne_wiebalckdtantsur: I checked my notes: we used undeploy for ACTIVE nodes without an instance, not for nodes in ERROR.08:36
dtantsurgudrutis2: I don't remember off the top of my head, sorry08:37
*** derekh has joined #openstack-ironic08:37
* dtantsur has found https://storyboard.openstack.org/#!/story/2003158 and O_o08:41
*** k_mouza has joined #openstack-ironic08:43
*** livelace has joined #openstack-ironic08:44
iurygregorymorning dtantsur o/08:45
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent-builder master: [WIP] Build tinyipa on focal  https://review.opendev.org/72579908:53
openstackgerritRiccardo Pittau proposed openstack/ironic-inspector master: Convert jobs to dib  https://review.opendev.org/71251608:54
openstackgerritRiccardo Pittau proposed openstack/python-ironic-inspector-client master: Convert job to dib  https://review.opendev.org/71869808:55
*** ericlei has joined #openstack-ironic09:08
openstackgerritRiccardo Pittau proposed openstack/networking-baremetal master: Convert networking-baremetal job to dib  https://review.opendev.org/71869709:09
openstackgerritRiccardo Pittau proposed openstack/networking-baremetal master: Convert networking-baremetal job to dib  https://review.opendev.org/71869709:10
*** ericlei12 has joined #openstack-ironic09:10
openstackgerritRiccardo Pittau proposed openstack/ironic-inspector master: [WIP] Use latest version of python construct  https://review.opendev.org/72482209:12
*** ericlei has quit IRC09:13
*** sshnaidm|off is now known as sshnaidm09:15
*** ricolin has joined #openstack-ironic09:20
*** ericlei12 has quit IRC09:25
openstackgerritDmitry Tantsur proposed openstack/ironic master: [WIP] Replace retrying with tenacity  https://review.opendev.org/37657409:33
*** priteau has joined #openstack-ironic09:44
*** hjensas is now known as hjensas|afk09:46
openstackgerritDmitry Tantsur proposed openstack/ironic-python-agent-builder master: Install linux-firmware in DIB images  https://review.opendev.org/72676909:54
openstackgerritDmitry Tantsur proposed openstack/ironic-python-agent-builder master: Install linux-firmware in DIB images  https://review.opendev.org/72676909:55
openstackgerritDmitry Tantsur proposed openstack/ironic-lib master: image_convert: retry resource unavailable and make RLIMIT configurable  https://review.opendev.org/72637810:09
*** rpittau is now known as rpittau|bbl10:15
openstackgerritIury Gregory Melo Ferreira proposed openstack/ironic-tempest-plugin master: Add standalone redfish jobs  https://review.opendev.org/72067510:57
openstackgerritIury Gregory Melo Ferreira proposed openstack/ironic-tempest-plugin master: Add standalone redfish jobs  https://review.opendev.org/72067510:59
*** iurygregory has quit IRC11:37
openstackgerritMerged openstack/ironic stable/ussuri: Native zuulv3 grenade job for ironic  https://review.opendev.org/72664311:38
openstackgerritDmitry Tantsur proposed openstack/ironic-python-agent-builder master: Install linux-firmware in DIB images  https://review.opendev.org/72676911:43
openstackgerritDmitry Tantsur proposed openstack/ironic-python-agent-builder master: Install linux-firmware in DIB images  https://review.opendev.org/72676911:45
*** rpittau|bbl is now known as rpittau11:58
*** iurygregory has joined #openstack-ironic11:58
* iurygregory is back, the internet company came to change the router and modem XD11:59
rpittauiurygregory: can you tell them to come and change mine too please? :P11:59
iurygregoryrpittau, sure, UPC was bought by Vodafone and they want to change the equipment, I can call your provider if it helps =)12:00
*** rh-jelabarre has joined #openstack-ironic12:00
openstackgerritMerged openstack/ironic stable/ussuri: Silence debug messages from oslo_messaging  https://review.opendev.org/72663312:01
openstackgerritMerged openstack/ironic-inspector master: Convert jobs to dib  https://review.opendev.org/71251612:06
arne_wiebalckSo nice when Ironic deploys on a disk which the BIOS does not consider (or cannot even see) for booting ...12:09
dtantsurOo12:09
iurygregory*magic*12:09
dtantsuriurygregory: vodafone is buying everything, our provider (unitymedia) also belongs to them12:10
iurygregorydtantsur, wow =O12:10
TheJuliaarne_wiebalck: force_raw_images defaults to true, what is parallel_image_downloads set to?12:12
arne_wiebalckTheJulia: also not touched, so default12:12
*** akahat has joined #openstack-ironic12:12
arne_wiebalckboot issue: the node has local drives plus an enclosure, but some of the disks in the enclosure are smaller than the disks in the node ...12:13
dtantsurmorning TheJulia12:13
dtantsurarne_wiebalck: so, root device hints needed?12:14
arne_wiebalck... so I guess Ironic happily picked one of the small ones in the enclosure, but then the BIOS goes: what would you like me to boot from?12:14
arne_wiebalckdtantsur: yes, this is what I have done now, worked in the past :)12:14
arne_wiebalckdtantsur: crossing fingers it works here as well ...12:14
dtantsurWe need to somehow make it clear that root device hints are highly recommended for machines with different disks12:15
* iurygregory thinks rpittau is happy since https://review.opendev.org/712516 merged12:16
patchbotpatch 712516 - ironic-inspector - Convert jobs to dib (MERGED) - 18 patch sets12:16
rpittauwas about time12:16
rpittau:)_12:16
rpittaustill have 3 to go12:16
*** priteau has quit IRC12:16
*** bfournie has left #openstack-ironic12:17
openstackgerritRiccardo Pittau proposed openstack/networking-baremetal master: Convert networking-baremetal job to dib  https://review.opendev.org/71869712:23
arne_wiebalckdtantsur: usually the arrays have larger/slower drives, so not often an issue12:24
arne_wiebalckdtantsur: this time the node has a h/w RAID (so the "disk" is larger) and the enclosure has some additional very small and fast devices12:25
arne_wiebalckdtantsur: what would be cool would be if Ironic would prefer local drives over ones in an enclosure12:26
*** hjensas|afk is now known as hjensas12:26
arne_wiebalckdtantsur: or even configurable12:26
dtantsurarne_wiebalck: can we detect that?12:28
dtantsurthe default algorithm is bad, but we've been unable to find anything that will satisfy more people12:28
arne_wiebalckdtantsur: I think the default is not too bad12:29
arne_wiebalckdtantsur: it works for the vast majority12:29
arne_wiebalckdtantsur: we have some code to count enclosures, so it should be doable to see which disks are where12:30
dtantsurI'd not object to a patch that puts them behind in the priority list12:30
dtantsurif we can detect them reliably12:30
arne_wiebalckdtantsur: yeah ... not sure about this latter point12:30
dtantsurfalse negatives are fine, I'm worried about false positives12:30
dtantsurif we can rule out the latter it will be a big improvement already12:31
*** tkajinam has quit IRC12:31
arne_wiebalckhmm, yeah, I can have a look12:31
arne_wiebalckdo the root device hints provide sth already which would help with this?12:32
dtantsurI doubt it12:33
arne_wiebalckscsi address maybe12:33
TheJuliacan change if bus init order changes12:33
TheJuliaor drive rload order12:34
arne_wiebalckhmm, yeah, but the current "/dev/sda" suffers from the very same issue12:34
* TheJulia needs a caffeine IV12:35
*** bfournie has joined #openstack-ironic12:35
arne_wiebalckhints can be ANDed, right?12:35
arne_wiebalckyes12:36
arne_wiebalckdoes not help much if the devices are behind a RAID controller12:37
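
The point that root device hints are ANDed can be illustrated with a minimal sketch. It is not Ironic's implementation: the disk records, their values, and the helper names `matches_all_hints` and `pick_root_device` are invented for illustration, and only exact-match comparison is shown.

```python
# Illustrative sketch only: this is NOT Ironic's root device selection code.
# The disk records and hint values below are invented for the example.


def matches_all_hints(disk, hints):
    """Return True only if the disk satisfies every hint (logical AND)."""
    return all(disk.get(key) == value for key, value in hints.items())


def pick_root_device(disks, hints):
    """Return the first disk matching all hints, or None if none match."""
    for disk in disks:
        if matches_all_hints(disk, hints):
            return disk
    return None


disks = [
    # A small, fast device, e.g. one sitting in an external enclosure.
    {"name": "/dev/sda", "size": 200, "rotational": False, "hctl": "1:0:0:0"},
    # A larger local volume, e.g. one exposed by a hardware RAID controller.
    {"name": "/dev/sdb", "size": 1800, "rotational": True, "hctl": "0:2:0:0"},
]

# Both conditions must hold for a disk to be selected.
hints = {"size": 1800, "rotational": True}
chosen = pick_root_device(disks, hints)
print(chosen["name"])
```

As noted in the chat, combining hints only helps when the attributes really differ between the devices; disks hidden behind a RAID controller may not expose anything useful to match on.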
iurygregoryarne_wiebalck, I'm wondering if you answered yourself =)12:41
arne_wiebalckyes :)12:42
iurygregorygood =)12:42
* arne_wiebalck is not sure if this is a good thing ...12:42
iurygregoryif your answer is correct it's a good thing =)12:42
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent-builder master: [WIP] Build tinyipa on focal  https://review.opendev.org/72579912:56
openstackgerritIury Gregory Melo Ferreira proposed openstack/ironic-inspector master: Merge jobs  https://review.opendev.org/72612112:56
dtantsurfolks, I may need some brainstorming for the ironic-inspector uwsgi issue12:58
rpittaudtantsur: sure thing12:58
dtantsurnote that I accept "just remove oslo.msg and use JSON RPC" as a valid answer :)12:58
dtantsurthe situation is, we're seeing pretty regular failures https://zuul.openstack.org/builds?job_name=ironic-inspector-non-standalone-tempest12:58
dtantsurinspection fails because ironic received HTTP 502 when accessing inspector12:59
dtantsurlooks like https://zuul.openstack.org/build/2b7625959ac641b2a1f8ae476e0454cf/log/controller/logs/screen-ir-cond.txt#102612:59
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent-builder master: Build tinyipa on focal  https://review.opendev.org/72579913:00
dtantsurI cannot find anything particular in the inspector API logs, nor any difference from how ironic API is deployed13:00
openstackgerritMerged openstack/ironic stable/train: Silence debug messages from oslo_messaging  https://review.opendev.org/72663413:00
dtantsurwell, this is the only clue: https://zuul.openstack.org/build/2b7625959ac641b2a1f8ae476e0454cf/log/controller/logs/screen-ironic-inspector-api.txt#79113:01
dtantsurthe timing doesn't match well, however13:01
*** iurygregory has quit IRC13:01
*** iurygregory has joined #openstack-ironic13:02
*** livelace has quit IRC13:04
*** livelace has joined #openstack-ironic13:04
*** kbaegis has joined #openstack-ironic13:04
*** kbaegis has quit IRC13:09
*** kbaegis has joined #openstack-ironic13:10
*** Goneri has joined #openstack-ironic13:21
*** zzzeek has quit IRC13:25
*** zzzeek has joined #openstack-ironic13:26
*** zzzeek has quit IRC13:26
*** zzzeek has joined #openstack-ironic13:27
*** rloo has joined #openstack-ironic13:34
*** kbaegis has quit IRC13:36
*** kbaegis has joined #openstack-ironic13:41
*** beekneemech is now known as bnemec13:47
*** tzumainn has joined #openstack-ironic13:47
*** kbaegis has quit IRC13:49
openstackgerritRiccardo Pittau proposed openstack/python-ironic-inspector-client master: Convert job to dib  https://review.opendev.org/71869813:55
*** jdandrea has joined #openstack-ironic13:57
openstackgerritVerification of a change to openstack/ironic failed: Limit the number of ipmitool retries  https://review.opendev.org/72595414:00
*** uzumaki has joined #openstack-ironic14:01
*** livelace has quit IRC14:04
*** cdearborn has joined #openstack-ironic14:09
dtantsurrpittau: bad news, if we return linux-firmware to DIB images, we start running out of 2G of RAM :(14:12
rpittaudtantsur: I'm prone to think by experience that that is an issue on rabbitmq itself, it might require some tweaking on rabbitmq parameters14:12
rpittauoh gosh14:12
dtantsurwe probably need to use 3G instead14:12
TheJuliahow big are the images being generated?14:13
dtantsur387M after re-adding14:13
rpittauwell no alternative if we need that14:13
dtantsurhttps://zuul.opendev.org/t/openstack/build/f0abf4f8c5a74e46977eb7d8cd961099/log/job-output.txt#4429514:13
dtantsurwe've already got reports that without linux-firmware our images don't work on some bare metal machines14:14
TheJuliadtantsur: uncompressed?14:14
dtantsurTheJulia: compressed14:14
TheJuliayeah, what is the uncompressed value14:14
dtantsurI'm not sure how to figure it out, we don't record that in our jobs, I think14:14
TheJuliaok14:14
dtantsurcpio outputs 2160091 blocks14:15
dtantsurmaybe somebody can make any sense out of it?14:15
TheJuliayeah, seeking in my braincells14:15
rpittauoO14:15
dtantsurprobably * 51214:15
TheJuliamore like 1k if memory serves14:15
rpittau2g and a half? roughly ?14:16
TheJuliaif it is 51214:16
dtantsura bit more than 1G14:16
TheJuliathen... 3GB should work14:16
TheJuliabut barely14:16
TheJuliait may OOM if it tries to do anything super fancy in startup14:17
dtantsur2G works with current 267M images14:17
dtantsurif we can fit 4G testing VMs into the CI, we can do that as well14:17
*** dtantsur is now known as dtantsur|brb14:17
TheJuliayeah, there, I think the formula is 2xcompressed size+50%14:18
TheJuliaerr14:18
TheJulia2x uncompressed + 50%14:18
iurygregoryI think we can go for 4GB14:23
iurygregorywe are only creating 1 VM14:23
iurygregorynot sure how infra will see this =)14:24
TheJuliahonestly they will be unlikely to notice unless we start failing our jobs in general14:26
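
The sizing arithmetic from this exchange can be written out as a short sketch. The block count is the cpio figure quoted above; the block size was uncertain in the chat, so both 512 bytes and 1 KiB are tried. Reading "2x uncompressed + 50%" as (2 x size) x 1.5 is an assumption about what was meant, and the function names are made up for this example.

```python
# Rough sizing sketch for the ramdisk memory discussion.
# The interpretation of the rule of thumb is an assumption, not a confirmed formula.

BLOCKS = 2160091  # block count reported by cpio in the chat
GIB = 1024 ** 3


def uncompressed_bytes(blocks, block_size):
    """Convert a cpio block count into bytes for a given block size."""
    return blocks * block_size


def ram_estimate(uncompressed, factor=2.0, overhead=0.5):
    """One reading of '2x uncompressed + 50%': (factor * size) * (1 + overhead)."""
    return uncompressed * factor * (1 + overhead)


for block_size in (512, 1024):
    size = uncompressed_bytes(BLOCKS, block_size)
    print(
        f"block size {block_size}: "
        f"uncompressed {size / GIB:.2f} GiB, "
        f"estimated RAM {ram_estimate(size) / GIB:.2f} GiB"
    )
```

Under this reading, 512-byte blocks give an uncompressed image slightly above 1 GiB and an estimate slightly above 3 GiB, which fits the "3GB should work, but barely" remark.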
openstackgerritDhuldev Valekar proposed openstack/ironic master: DRAC: Added redfish management clean steps  https://review.opendev.org/72159314:28
*** jhesketh has quit IRC14:35
*** uzumaki has quit IRC14:37
*** uzumaki has joined #openstack-ironic14:37
iurygregoryyay for CI14:37
iurygregorya lot of "E: Unable to locate package <name>" XD14:37
TheJuliaugh14:38
TheJuliadid the image suddenly change from bionic? Or is the mirror just missing packages?14:39
TheJuliai.e. 404s14:39
*** kaifeng has joined #openstack-ironic14:43
openstackgerritRiccardo Pittau proposed openstack/networking-baremetal master: Convert networking-baremetal job to dib  https://review.opendev.org/71869714:46
*** zaneb has quit IRC14:46
rpittaujust a bunch of connection timeout everywhere14:47
*** zaneb has joined #openstack-ironic14:47
openstackgerritRiccardo Pittau proposed openstack/ironic master: [WIP] Fix grub2 pxe job with native bionic ovmf  https://review.opendev.org/71688914:51
iurygregorywell I hope infra will send an email when they are going to switch to focal LOL14:53
*** aedc has joined #openstack-ironic14:54
kaifengfocal is a new cloud provider?14:55
rpittaukaifeng: ubuntu focal LTS 20.04 :)14:56
*** sshnaidm is now known as sshnaidm|afk14:57
kaifengouch.. I haven't tried 20.04 yet, last week I tried to install it into my macbook mid2009 and it failed to boot :(14:57
*** dtantsur|brb is now known as dtantsur14:59
*** stendulker has joined #openstack-ironic14:59
rpittauiurygregory: the patch to add the new nodeset is still WIP, so I guess we have some time14:59
iurygregoryrpittau, well I hope we will have time to solve the problems =)15:00
rpittauI'm much interested in building tinyipa with tinycore 11.x, that's why the tests15:00
*** livelace has joined #openstack-ironic15:00
TheJulia#startmeeting ironic15:00
openstackMeeting started Mon May 11 15:00:18 2020 UTC and is due to finish in 60 minutes.  The chair is TheJulia. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
TheJuliao/15:00
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
*** openstack changes topic to " (Meeting topic: ironic)"15:00
iurygregoryo/15:00
rpittauo/15:00
openstackThe meeting name has been set to 'ironic'15:00
TheJulia\o15:00
kaifengo/15:00
stendulkero/15:00
TheJuliaGood morning everyone!15:00
erbarro/15:00
arne_wiebalcko/15:00
rlooo/15:00
TheJuliaOur agenda can be found on the wiki.15:01
TheJulia#link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_next_meeting15:01
ajyao/15:01
TheJulia#topic Announcements / Reminders15:01
*** openstack changes topic to "Announcements / Reminders (Meeting topic: ironic)"15:01
TheJuliaThree items to note this week!15:01
mgoddard\o15:01
rpiosoo/15:02
TheJulia#info Victoria Priorities Document under discussion, please join that discussion in review!15:02
dtantsuro/15:02
TheJulia#link https://review.opendev.org/#/c/720100/15:02
patchbotpatch 720100 - ironic-specs - WIP - Victoria Cycle Priorit(y|ies) - 1 patch set15:02
*** aedc_ has joined #openstack-ironic15:02
*** aedc has quit IRC15:02
TheJulia#info Baremetal Whitepaper effort will be holding its third session on Tuesday May 12th, at 2PM UTC15:02
TheJulia#link https://cern.zoom.us/j/9424877058015:02
*** uzumaki has quit IRC15:03
TheJulia#info PTG Topics etherpad is up for comments/additions/thoughts/crazy ideas.15:03
TheJulia#link https://etherpad.opendev.org/p/Ironic-VictoriaPTG-Planning15:03
TheJuliaDoes anyone have anything else to announce or remind us of this week?15:03
* iurygregory doesn't15:05
* TheJulia goes and sees if we had any action items from last week15:05
TheJuliaNo meeting related action items, so it seems like we could skip to reviewing subteam status.15:06
dtantsur++15:06
* TheJulia wonders if we need to file a "Take over the world" or "world domination" action item every week.15:06
dtantsurwon't hurt at least, will it?15:06
TheJuliaIt will be more words to type every monday morning :)15:07
iurygregory++15:07
iurygregoryI can type the action if needed15:07
TheJuliaOkay, well then I guess we'll move on!15:07
TheJulia#topic Review sub-team status reports15:08
*** openstack changes topic to "Review sub-team status reports (Meeting topic: ironic)"15:08
TheJulia#link https://etherpad.opendev.org/p/IronicWhiteBoard15:08
-openstackstatus- NOTICE: Our CI mirrors in OVH BHS1 and GRA1 regions were offline between 12:55 and 14:35 UTC, any failures there due to unreachable mirrors can safely be rechecked15:08
TheJulia\o/15:08
iurygregory\o/15:08
dtantsurwell, at least they're back :)15:08
rajiniro/15:08
TheJuliaStarting around line 220 in the etherpad15:08
TheJuliaAny thoughts on logging the wsme changes on the items to review?15:10
dtantsur+115:10
rpittauyeah15:10
iurygregory++15:10
TheJuliak, if someone wants to add some of those links now it would be awesome15:10
TheJuliaI guess we can nuke the software raid item?15:11
arne_wiebalckyes15:12
*** aedc_ has quit IRC15:12
TheJuliadone!15:12
iurygregoryDo we want to keep track of the old items from the Grenade work, or can I remove old things? =)15:12
TheJuliaiurygregory: remove the old things!15:12
iurygregoryTheJulia, sure15:12
dtantsurusually stuff older than 2 weeks can be safely removed15:12
dtantsur(unless it's still up-to-date)15:12
TheJulia++15:13
iurygregorydtantsur, ok =)15:13
TheJuliaLooks like the v6 ci jobs needs a rebase15:13
iurygregoryI think we are done with Python315:13
iurygregorywdyt rpittau ?15:13
iurygregoryI think we can remove the topic =)15:14
rpittauwe can remove that yes15:14
*** aedc has joined #openstack-ironic15:15
TheJuliaLooks like two of the dib changes need reviews15:15
rpittauyes, please, they're the last 215:16
TheJuliaAdding to the list15:16
*** riuzen has joined #openstack-ironic15:17
TheJuliaI wonder if there is any interest on deployment state callbacks with nova15:17
TheJuliaI guess that really depends on the number of new deployments being performed against baremetal15:18
*** zbitter has joined #openstack-ironic15:18
arne_wiebalckAnd the power state callbacks were quite some work already.15:19
dtantsurcould be a huge scalability improvement15:19
dtantsurwait for CERN to start hitting problems with periodic sync? :-P15:19
*** zbitter has quit IRC15:19
TheJuliaYeah, I'd just prefer it be driven by those actually actively encountering it15:19
*** zbitter has joined #openstack-ironic15:19
arne_wiebalck:)15:20
TheJuliaAnyway, shall we proceed to priorities for the coming week?15:20
dtantsur++15:21
*** zaneb has quit IRC15:21
TheJulia#topic Priorities for the coming week15:21
*** openstack changes topic to "Priorities for the coming week (Meeting topic: ironic)"15:21
TheJulia#link https://etherpad.opendev.org/p/IronicWhiteBoard15:21
dtantsurcan we get the release model spec there please?15:22
TheJuliadtantsur: please add it15:22
dtantsurdone15:23
rpittauTheJulia: the ci failure on the dib conversion patch was because of the timeout15:23
dtantsurTheJulia: https://review.opendev.org/#/c/688299/ needs removing -2 (not urgent)15:24
patchbotpatch 688299 - python-ironicclient - Add `network_data` ironic node attribute support - 9 patch sets15:24
TheJuliaThanks15:24
TheJuliaDone15:24
TheJuliaThat looks good to me, but does anyone see anything we're missing or that needs to be added?15:25
dtantsurnope, looks fine15:26
rpittaulgtm :)15:26
TheJuliadtantsur: what about timeouts/retries to jsonrpc?15:26
dtantsurmm, yeah, there are stable backports to merge15:26
* dtantsur is sleepy today15:27
iurygregoryit's monday =)15:27
TheJuliait is a monday15:27
dtantsuroh, https://review.opendev.org/725867 another fun one15:27
patchbotpatch 725867 - ironic - Mark more configuration options as reloadable - 2 patch sets15:27
dtantsurand https://review.opendev.org/#/c/726378/ could use opinions15:28
patchbotpatch 726378 - ironic-lib - image_convert: retry resource unavailable and make... - 3 patch sets15:28
* dtantsur loves his own definition of "nope"15:28
iurygregoryhehehe15:28
dtantsurit's not just Monday, it's Monday after night when a storm tried to make a music instrument out of our balcony15:29
iurygregorywow15:29
kaifengan idea for 726378, maybe perform reduced memory limit on retry?15:29
dtantsurkaifeng: how will reducing help? if anything, it will increase the chance of failing15:30
*** zbitter is now known as zaneb15:30
dtantsuranyway, let's discuss on the patch15:30
kaifengif memory is low, keep retry with the same resource limit seems identical15:31
kaifengnp15:31
TheJuliaI looked at the source of qemu-img, and it dynamically scales as it assembles, so stepping down the amount of ram may not help if it truly needs to map out something basically fragmeneted15:32
TheJuliafragmented15:32
TheJuliaWhich kind of caused me to think of https://review.opendev.org/#/c/726483/ as a guard so we hopefully avoid OOM conditions15:33
patchbotpatch 726483 - ironic - WIP: Guard conductor from consuming all of the ram - 1 patch set15:33
TheJuliaAnyway, the list looks good to me. Are we good to proceed?15:33
dtantsur++15:33
iurygregory++15:33
kaifeng++15:34
TheJuliaWe have no explicit topics. I believe we've also basically covered the SIG item, unless there is more arne_wiebalck ?15:34
TheJuliaWhich I guess takes us to RFE Review15:34
dtantsurI actually did have a topic..15:34
arne_wiebalcknope15:34
* dtantsur not sure where it went15:34
dtantsurIt was about the new release model proposal, maybe we should just ask people to review it15:34
TheJuliadtantsur: I moved it to announcements earlier, unless you really think there is something to discuss right now?15:35
dtantsurah, ok15:35
dtantsurno, I think it requires careful reading15:35
dtantsurI just hope people do read it :)15:35
TheJuliaOkay, then RFE review it is!15:36
TheJulia#topic RFE Review15:36
*** openstack changes topic to "RFE Review (Meeting topic: ironic)"15:36
TheJuliaLooks like we have two items that have been proposed, would anyone like to introduce them?15:37
dtantsuryeah15:37
dtantsurThey're not mine, but I've added them, soo15:37
dtantsur#link https://storyboard.openstack.org/#!/story/2007646 More convenient state transitions in case of failures15:37
dtantsurthis came from the Friday's SPUC (one of the action items)15:37
dtantsurtwo pretty minor additions that may potentially make newcomers' life easier15:37
iurygregorythis one sounds interesting =)15:38
TheJuliaseems reasonable, at least from a 10,000 foot view15:39
kaifengwe have just discussed this kind of guard today :)15:39
dtantsurto be clear, it doesn't add anything new, just aliases for existing things15:39
dtantsurbut ones that are hopefully easier to discover and remember15:39
dtantsurI had another RFE with aliases for provisioning verbs, but I've lost it in storyboard....15:39
dtantsur... found it, thank you firefox15:39
dtantsur#link https://storyboard.openstack.org/#!/story/2007551 An RFE in a similar spirit with 2 more actions15:40
TheJulia+100015:41
dtantsurboth are pretty easy to implement. so, while I can do it, I encourage anyone who wants to get into ironic development to take them15:41
*** livelace has quit IRC15:41
dtantsurany objections to either of these two?15:41
iurygregoryCan I take the second? =)15:42
dtantsuriurygregory: sure, assign yourself15:42
TheJuliano objections at all15:42
iurygregorydone =)15:42
*** livelace has joined #openstack-ironic15:42
kaifengno objection, in addition I'd like to seek a transition from deploy wait to deploy failed15:42
dtantsurkaifeng: mmm, interesting, maybe we should make 'abort' do that?15:43
dtantsurrather than being an alias for 'deleted'?15:43
dtantsur(this is re 1st RFE action #2)15:43
*** uzumaki has joined #openstack-ironic15:43
rpittauabort suggests a return to a precedent state though15:43
rloofor 2007646, (sorry if this is bikeshedding), but I wonder if 'recover' is the right word. this is just moving the node's provision state, right?15:43
rloooh wait. what does 'initial proposal (obsolete)" mean?15:43
kaifengdtantsur: quite alike, there is no way to cancel, we can only wait until it fails by itself.15:43
dtantsurrloo: the initial text of the RFE as filed by arne_wiebalck15:44
rloowait. i have to actually READ it...15:44
rpittaumaybe call it explicitly 'fail' ?15:44
dtantsurrpittau: we even have a 'fail' action, we just don't expose it15:44
dtantsurrloo: 'recover' is based on the questions I receive pretty often: "How do I recover a node from the error state?" and similar15:45
dtantsurI agree that it's far from being perfect15:45
kaifenginspector uses abort to fail an inspection, so maybe it suits ironic too15:45
dtantsurabort on clean wait causes clean failed15:45
dtantsuryeah, I agree, abort on deploy wait should end up in deploy failed15:46
dtantsurupdated https://storyboard.openstack.org/#!/story/200764615:47
kaifenglooks like i misunderstood the recover, so it just moves node out of a failed state, do we care about the clean up too?15:47
rlooi'm confused. in the rfe, it sez 'a node can be recovered by applying the deleted transition'. i don't think that's what the initial proposal is. or maybe i misunderstand.15:47
dtantsurthe proposed 'recover' action is an alias for whatever action moved a failed node to a non-failed state15:48
dtantsure.g. deploy failed -> (deleted) -> available15:48
dtantsurbased on the current state and target_provision_state15:48
dtantsurit won't have any new logic behind it15:48
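
The proposed 'recover' alias can be sketched as a plain lookup from a failed state to an existing provisioning verb. This is a hypothetical illustration, not code from Ironic: the mapping contains only the transitions mentioned in this log, `RECOVERY_VERBS` and `recover_verb` are invented names, and the sketch keys on the current state alone, whereas the proposal also mentions target_provision_state.

```python
# Hypothetical sketch of 'recover' as an alias for existing verbs.
# Only transitions mentioned in the chat are listed; this is not Ironic code.

RECOVERY_VERBS = {
    "deploy failed": "deleted",  # deploy failed -> (deleted) -> available
    "error": "deleted",          # "deleted" on "error", per the state machine
    "clean failed": "manage",    # clean failed -> (manage) -> manageable
}


def recover_verb(provision_state):
    """Return the existing verb that moves a failed node to a non-failed state."""
    try:
        return RECOVERY_VERBS[provision_state]
    except KeyError:
        raise ValueError(
            f"no recovery transition known for state {provision_state!r}"
        ) from None


print(recover_verb("deploy failed"))
```

Because the alias only selects an existing verb, it adds no new logic to the state machine, which matches the intent described above.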
rlooi don't think that's what was desired. i think it is more like 'deploying' -> error, but the user wants the node to go back to active state.15:49
rloomore like nova's reset, i forgot the exact command.15:49
dtantsurI'd quite like a 'retry' action15:49
kaifengoh, so it's an alias to avoid the old word which is kind of misleading, but maybe it doesn't help non-newcomers :)15:49
dtantsurarne_wiebalck pointed out that "undeploy" is not an obvious way to make a failed node available again15:50
dtantsurbtw do we have an agreement on the 2nd half, i.e. supporting abort on 'deploy wait'?15:50
arne_wiebalckdtantsur: yes15:50
rlooabort works on 'clean failed' and 'inspect failed' ?15:51
dtantsurrloo: yes15:51
dtantsurwait15:51
dtantsuron 'wait'15:51
JayFabort is to get you from *_wait -> * failed15:51
dtantsurabort works on 'clean wait' and 'inspect wait', but not on 'deploy wait'15:51
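
The abort behaviour described here, from a *wait state to the matching *failed state, can be sketched as follows. It is an illustration under assumptions: the table and function names are invented, and the 'deploy wait' entry represents the change proposed in this meeting, not existing behaviour.

```python
# Illustrative sketch of 'abort' semantics as described in the chat.
# Not Ironic code; 'deploy wait' is the proposed addition only.

ABORT_TARGETS = {
    "clean wait": "clean failed",
    "inspect wait": "inspect failed",
}

PROPOSED_ABORT_TARGETS = {
    "deploy wait": "deploy failed",
}


def abort_target(provision_state, include_proposed=False):
    """Return the state a node lands in after 'abort', or None if not allowed."""
    targets = dict(ABORT_TARGETS)
    if include_proposed:
        targets.update(PROPOSED_ABORT_TARGETS)
    return targets.get(provision_state)


print(abort_target("clean wait"))
print(abort_target("deploy wait"))
print(abort_target("deploy wait", include_proposed=True))
```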
TheJuliaQuick note, we only have about 8 minutes left.15:51
arne_wiebalckrloo: the original idea was to have reset-state command, but maybe that is too big of a hammer15:52
rlooam looking at our state xsition diagram, and don't see 'abort' from 'clean failed': https://docs.openstack.org/ironic/latest/_images/states.svg15:52
TheJuliaarne_wiebalck: but the purpose of a big hammer is actually what is needed15:52
JayFclean failed is the target state after an abort15:52
*** kbaegis has joined #openstack-ironic15:52
JayFclean failed is restored by going manage->provide15:52
JayFwhich puts it in manageable and restarts cleaning15:52
dtantsurTheJulia: I'm not convinced the big hammer is needed if we get other things right15:52
dtantsura good thing about editing the database is that people know they're in danger..15:53
TheJuliadtantsur: that presumes we can see and handle all edge cases15:53
dtantsurthat's our mission :)15:53
JayFdtantsur: that's a really unreasonable answer for the real world, IMO15:53
TheJuliawe already... for a very long time now, have had people delete nodes completely as their big hammer... except that is also the last thing we want them to ever do15:53
dtantsurJayF: that's why we have an API and many levels of protection15:53
rloodownstream we've seen where the node is in a wedged state. i don't recall the details now.15:54
dtantsurtrust me, your position on "don't let people disable cleaning" is not shared by a lot of field folks either :)15:54
JayFdtantsur: and we made that configurable :)15:54
dtantsurand people just disabled the heck out of it15:54
JayFdtantsur: real talk, not from current job, but I've seen manual DB edits cause downtime when someone fat fingered something15:54
JayFdtantsur: I don't think that's a reasonable alternative15:55
rlooi think this discussion might be more relevant if we provided details wrt when a node gets wedged. (haven't had time to dig into that)15:55
dtantsurthe edge case I don't know how to handle (a PTG topic) is how to break out from stuck "deploying"15:55
TheJuliarloo: ++15:55
dtantsurbut just editing it to 'available' may get us into trouble15:55
TheJuliadtantsur: ++15:55
rlooi don't think our usecase was to move a node to available.15:55
arne_wiebalckrloo: one example is the conductor crashing during deploy15:56
arne_wiebalckrloo: this leaves the nodes in error15:56
dtantsurerror is not a stuck state15:56
rlooit was that a bm node has an instance. maybe it was rebuilding and something happened in the code (I don't recall). the node is still 'active' (from nova sense) but we can't update the ironic state.15:56
rlooie, i think we can put the instance into active in nova, but no equiv in ironic.15:56
dtantsurmy objection is based on the fact that we don't know what will happen to a node if we just edit the state in the DB, bypassing locks, cleaning, and so on15:57
arne_wiebalckrloo: in nova and cinder you can set the state to whatever the admin thinks is right15:57
rloodtantsur: error isn't a stuck state BUT there isn't any way to put that ironic node into 'active' :-(15:57
dtantsurrloo: 'deleted'15:57
TheJuliarloo: that is actually the adopt feature, except it only starts from ?two? possible states, not later on in the ownership of the node15:57
dtantsurwhich is very obscure, hence the 'recover' proposal15:57
rloono. deleted won't put it into active, it'll make it available.15:57
JayFdtantsur: I'd assume any such API would cancel locks and such as well, although that is pretty complex15:57
dtantsurah, sorry15:58
rlooor did something change recently?15:58
dtantsurrloo: how do you imagine putting a half-deleted node back to active?15:58
dtantsurit might have had its VIFs already removed..15:58
rloodtantsur: how does nova put a half-deployed (NOT DELETED) node back to active?15:58
rloodtantsur: it relies on the admin knowing what they are doing.15:58
TheJuliaEveryone, we have two minutes left and this really seems like a topic that needs higher bandwidth discussion15:58
rloodtantsur:  we (me anyway) are not talking about deleting anything.15:59
TheJuliaand an understanding of failure cases that people presently encounter15:59
dtantsurrloo: 'error' happens when deleting fails15:59
rloo++ agree with Julia.15:59
rloowe're talking about (I am) rebuilding.15:59
dtantsurand 'rebuild' can recover it back to 'active'15:59
dtantsurnote that deleting->(fail)->error is the only way to get to error15:59
kaifengsome *-ing is transitional states and there is no way to quit, restarting conductor can recover the state but I believe it would be nice to have some guarding task.16:00
rlooi don't really think this discussion is productive. i think we need real use cases.16:00
TheJulias/use/failure/16:00
rloocuz we're mixing deploy/delete/error/failure/active/available.16:00
TheJuliaAnyway, does anyone have anything else to discuss or raise before we end the meeting?16:00
TheJuliarloo: ++++16:00
* dtantsur will respond kaifeng after the meeting16:00
TheJuliarloo: a chart is needed, honestly16:00
rlooi like our state diagram :)16:00
*** kbaegis has quit IRC16:01
dtantsurif anybody wants to take the 'abort' part - feel welcome16:01
dtantsurthis one seems not controversial16:01
TheJuliaThanks everyone! Have a wonderful week!16:01
kaifengit's a simple change anyone can take it :)16:02
rloodtantsur: maybe a new RFE, I think it means allow ironic node in 'deploy wait' to be aborted?16:02
rloo'deploy wait' -> abort -> 'deploy failed' ?16:02
dtantsurrloo: https://storyboard.openstack.org/#!/story/200764616:02
TheJuliaI think deploy already kind of allows it16:02
TheJuliaI _think_16:02
dtantsurno16:02
dtantsurit has 'deleted', but that brings us all the way through cleaning16:03
TheJuliaahh16:03
TheJuliafor an admin that is likely "okay"16:03
rlooin fact, it isn't 'deploy wait' i think it is 'callback wait' or something odd like that. would be great to rename that state...16:03
dtantsurwait call-back, yeah16:03
rloodtantsur: thx, i see you updated 2007646. I'm good with 2. But not with 1. Not yet anyway.16:04
rlooI'll comment in the rfe.16:04
dtantsurthanks!16:04
*** gyee has joined #openstack-ironic16:04
dtantsurTheJulia: time for #endmeeting?16:05
*** alexmcleod has quit IRC16:05
TheJulia#endmeeting16:05
*** openstack changes topic to "Bare Metal Provisioning | Status: http://bit.ly/ironic-whiteboard | Docs: http://docs.openstack.org/ironic/ | Bugs: https://storyboard.openstack.org/#!/project_group/75 | Contributors are generally present between 6 AM and 12 AM UTC, If we do not answer, please feel free to pose questions to openstack-discuss mailing list."16:05
openstackMeeting ended Mon May 11 16:05:50 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:05
openstackMinutes:        http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-05-11-15.00.html16:05
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-05-11-15.00.txt16:05
openstackLog:            http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-05-11-15.00.log.html16:05
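The abort semantics discussed in the meeting can be summarized as a small lookup table. This is only a sketch of what was said in the channel, not ironic's state machine code: the dictionaries and the `abort` helper are hypothetical names, and 'deploy wait' is the name used in the discussion for the state that rloo and dtantsur note is actually called 'wait call-back'.

```python
# Hypothetical sketch of the transitions described in the meeting;
# none of these names come from ironic's code base.

# Per the discussion, abort currently works from these wait states.
ABORT_TRANSITIONS = {
    "clean wait": "clean failed",
    "inspect wait": "inspect failed",
}

# Part 2 of the RFE proposes allowing abort from 'deploy wait' as well
# (the real state name is 'wait call-back', per the log).
PROPOSED_ABORT_TRANSITIONS = {
    **ABORT_TRANSITIONS,
    "deploy wait": "deploy failed",
}


def abort(state, transitions=ABORT_TRANSITIONS):
    """Return the state a node lands in after 'abort', or raise ValueError."""
    try:
        return transitions[state]
    except KeyError:
        raise ValueError(f"abort is not allowed from {state!r}") from None
```

With the current table, `abort("deploy wait")` raises `ValueError`; passing `PROPOSED_ABORT_TRANSITIONS` makes it return 'deploy failed', which is the behavior the participants agreed on.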
dtantsurkaifeng: the problem with -ing states is that ironic is doing something with the nodes at this point16:06
dtantsurand we have no way to tell it to stop16:06
dtantsuri.e. it may be writing an image or even doing something out-of-band16:06
dtantsurI was thinking about poisoning a TaskManager object, i.e. modifying it in a way that the next access to task.node will blow up16:06
kaifengyep, i have experienced available -> deploying and stuck there16:06
dtantsurbut that requires some kind of cross-greenthread synchronization16:07
dtantsurand won't help with something long-running16:07
dtantsurso dunno. ideas welcome.16:07
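The "poisoning" idea dtantsur describes could look roughly like the sketch below. `PoisonableTask` and `TaskPoisoned` are hypothetical stand-ins, not ironic's TaskManager, and the sketch deliberately ignores the two problems raised in the channel: cross-greenthread synchronization, and long-running work that never touches `task.node` again.

```python
class TaskPoisoned(Exception):
    """Raised when the node of a poisoned task is accessed."""


class PoisonableTask:
    """Hypothetical stand-in for a task object wrapping a node."""

    def __init__(self, node):
        self._node = node
        self._poisoned = False

    def poison(self):
        # Mark the task so the next access to .node blows up.
        self._poisoned = True

    @property
    def node(self):
        if self._poisoned:
            raise TaskPoisoned("task was poisoned; stop working on this node")
        return self._node
```

The worker is only stopped at its next access to `task.node`, so an operation such as writing an image would keep running until it returns to code that reads the node again.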
openstackgerritRiccardo Pittau proposed openstack/ironic master: [WIP] Fix grub2 pxe job with native bionic ovmf  https://review.opendev.org/71688916:08
rpittaugood night! o/16:09
*** rpittau is now known as rpittau|afk16:09
*** ianychoi_ is now known as ianychoi16:09
kaifengrarely seen recently though, we'll have massive ironic adoption this year, I donno, things may emerge if something is still hidden there16:10
dtantsuryeah, it's not impossible for -ing states to get stuck16:10
dtantsurand I'm very keen on finding a solution better than restarting a conductor16:11
* dtantsur wonders if it's possible to kill a green thread16:12
kaifenghmm, maybe a guarding task from the outside, but this needs the controversial rfe talked above, which can explicitly set a node to a target state..16:13
*** stendulker has quit IRC16:13
dtantsurwell, we can do things internally16:13
dtantsurIt doesn't necessarily have to be exposed to the API16:13
dtantsurThe question is, how to make ironic stop whatever it's doing to the node16:13
*** lucasagomes has quit IRC16:15
* dtantsur senses a PTG topic16:15
*** Qianbiao has quit IRC16:15
kaifengit needs more thought, as we even don't know whether it behaves correctly16:15
dtantsurkaifeng: I've added to the PTG etherpad, hopefully we can discuss it16:17
kaifengdtantsur: thanks!16:18
*** dtantsur is now known as dtantsur|afk16:19
dtantsur|afko/16:19
*** dking has joined #openstack-ironic16:31
*** uzumaki has quit IRC16:32
*** derekh has quit IRC17:01
openstackgerritVerification of a change to openstack/ironic-python-agent failed: Convert jobs to dib  https://review.opendev.org/71862717:03
*** sshnaidm|afk is now known as sshnaidm17:12
arne_wiebalckbye everyone o/17:39
*** riuzen has quit IRC17:47
*** k_mouza has quit IRC17:48
iurygregoryclarkb, ignore_basepython_conflict can be in the tox section in tox.ini correct? e.g https://opendev.org/openstack/sushy-cli/src/branch/master/tox.ini#L517:48
clarkbiurygregory: yes that should make it apply to all tox targets (that looks fine to me)17:49
clarkbiurygregory: I think most projects have figured it out already, but we keep having these questions pop up semi regularly so I wanted to point it out again17:49
iurygregoryclarkb, cool!17:49
iurygregorywe are missing in 4 projects it seems17:49
iurygregoryI don't think it's worth sending to spec repos XD17:50
iurygregoryso it will be less17:50
openstackgerritIury Gregory Melo Ferreira proposed openstack/metalsmith master: Add ignore_basepython_conflict  https://review.opendev.org/72691517:51
iurygregoryclarkb, do I need to set minversion ? ^17:53
clarkbya minversion should be 3.1.0 based on https://tox.readthedocs.io/en/latest/config.html#conf-ignore_basepython_conflict I should've mentioned that. Oh well17:53
openstackgerritIury Gregory Melo Ferreira proposed openstack/ironic-python-agent-builder master: Update tox.ini  https://review.opendev.org/72691617:55
openstackgerritIury Gregory Melo Ferreira proposed openstack/metalsmith master: Update tox.ini  https://review.opendev.org/72691517:56
iurygregoryclarkb, tks!17:58
*** k_mouza has joined #openstack-ironic18:16
*** k_mouza has quit IRC18:17
*** kaifeng has quit IRC18:23
*** Lucas_Gray has joined #openstack-ironic18:23
*** iurygregory has quit IRC18:25
*** iurygregory has joined #openstack-ironic18:38
*** Lucas_Gray has quit IRC19:38
openstackgerritVerification of a change to openstack/ironic failed: Limit the number of ipmitool retries  https://review.opendev.org/72595420:19
*** k_mouza has joined #openstack-ironic20:32
*** k_mouza has quit IRC20:36
*** zaneb has quit IRC21:07
*** zaneb has joined #openstack-ironic21:13
*** k_mouza has joined #openstack-ironic21:28
*** k_mouza has quit IRC21:32
*** livelace has quit IRC21:48
*** sshnaidm is now known as sshnaidm|afk22:11
*** tkajinam has joined #openstack-ironic22:55
*** sshnaidm|afk has quit IRC23:14
*** sshnaidm has joined #openstack-ironic23:15
*** sshnaidm is now known as sshnaidm|afk23:16
stevebakerTheJulia: I have a basic auth middleware working locally :)23:24
TheJuliaKick-Ass! Post it! :)23:24
stevebakerI should proooobably write some tests first23:25
TheJuliaYeeeaaahhh!23:26
stevebakerTheJulia: I've only implemented the bcrypt digest format so far, because it's the only one which is secure and standard https://httpd.apache.org/docs/current/misc/password_encryptions.html23:30
TheJuliaodds are you could beat me to hacking on the client at this rate23:38
stevebakerTheJulia: where are you first thinking of adding client support?23:41
openstackgerritJulia Kreger proposed openstack/ironic-python-agent master: Hint 404 lookup failures for Operators  https://review.opendev.org/72697623:43
TheJuliastevebaker: yeah, I talked to mordred and cmurphy last week about it23:44
TheJuliathe consensus was to add a openstacksdk plugin for keystoneauth1, and basically have it define and pass in23:44
stevebakerTheJulia: ok, I'll let you know if I start looking at that23:46
TheJuliastevebaker: if you have bandwidth and seems like something fun go right ahead, I'm basically maxed out on bandwidth23:50
stevebakerTheJulia: ok, as long as I'm not taking something fun you were looking forward to23:51
TheJuliayou wouldn't be23:51
openstackgerritJulia Kreger proposed openstack/ironic master: Add ussuri release notes version  https://review.opendev.org/72697823:54
