Tuesday, 2022-11-08

vanougood morning ironic01:08
arne_wiebalckGood morning vanou and Ironic!07:27
TheJuliagood morning07:28
* TheJulia misses sleep07:28
* TheJulia is very tired07:28
arne_wiebalckHey good morning, TheJulia o/07:29
* TheJulia waves good morning to arne_wiebalck 07:31
TheJuliaarne_wiebalck: out of curiosity, at CERN, what is the average OS image size you folks are deploying to machines?07:31
arne_wiebalckTheJulia: erm ...07:31
arne_wiebalck*needs to check ...*07:31
* arne_wiebalck needs to check ... and is confused by different chat systems07:32
TheJuliaheh07:32
TheJuliaconfusion is a state of being07:32
TheJuliaand it is a totally valid state at that07:32
TheJuliaarne_wiebalck: specifically virtual size and actual compressed image file size if you have it07:33
arne_wiebalckvirtual size is 4G for our CentOS images, 1G for Windows07:37
arne_wiebalckwe use raw images07:37
TheJuliaImpressive07:37
TheJuliaOkay, Thanks.07:37
arne_wiebalcknp :)07:39
TheJulianobodycam spotted really just a horrible performance with qemu-img convert writing qcows out on nvme devices07:39
TheJuliahttps://storyboard.openstack.org/#!/story/2010397  07:39
TheJuliaSo I was trying to get a sense of what you folks were doing/experiencing as a data point, but with just raw images, you have none of those issues.07:40
arne_wiebalckI *think* we had this at some point as well, qemu-img convert driving the controller OOM07:59
arne_wiebalck(while it should not convert)08:00
arne_wiebalckis directsync what we use in Ironic?08:15
kubajjGood morning Ironic!08:17
arne_wiebalckhey kubajj o/08:24
TheJuliaarne_wiebalck: oh, we fixed that issue in ironic-lib for the most part, then again, in massive concurrency I could see it08:25
TheJuliaso... I believe o_direct is used08:25
TheJuliaI don't know what directsync is in this context08:25
TheJuliabut the tl;dr of nobodycam's issue is it is writing out zeros with o_direct which is painfully slow on nvme's since by default you can't work with the buffer in that case08:27
TheJuliasince o_direct is basically "go direct to the medium, don't cache"08:27
Nisha_AgarwalTheJulia, GM08:38
TheJuliagood morning Nisha_Agarwal 08:39
Nisha_AgarwalIsnt it late night for u?08:39
TheJuliaI'm in Brno, CZ this week08:39
Nisha_Agarwal:) ok08:39
TheJuliaTrying to stay awake :)08:39
Nisha_Agarwal:)08:39
Nisha_AgarwalTheJulia, When you get some time could you review https://review.opendev.org/c/openstack/ironic/+/860055/5...the anaconda patch08:40
TheJuliaI might not be able to this week, for what it is worth08:40
TheJuliameetings all week08:40
Nisha_Agarwal:) np08:41
opendevreviewMerged openstack/virtualbmc master: remove python-dev from bindep  https://review.opendev.org/c/openstack/virtualbmc/+/86381808:42
rpittaugood morning ironic! o/09:01
rpittauif any core has a moment please review https://review.opendev.org/c/openstack/sushy/+/863828 thanks!09:49
dtantsurTheJulia: I was under impression that derekh has fixed the issue with zeroing in qemu-img...10:12
TheJuliaI think he partially did10:16
TheJuliabut nobodycam did mention more options/behavior10:16
derekhTheJulia dtantsur https://storyboard.openstack.org/#!/story/2009227 10:16
dtantsurajya: hi! were you able to confirm if changing secure boot returns a task or any way to track the execution?10:20
derekhThe example in https://storyboard.openstack.org/#!/story/2010397 doesn't have the "-S 0" 10:20
derekhNobodyCam ^10:21
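[Editor's sketch] For context on the "-S 0" mentioned above: as far as the qemu-img documentation describes it, `-S` sets the sparse size for `qemu-img convert`, and a value of 0 means the source is not scanned for zero sectors, so the destination is fully allocated. `-t` selects the cache mode (for example the "directsync" arne_wiebalck asks about). Below is a minimal Python sketch that only assembles such a command line. The helper name `build_convert_cmd`, the file paths, and the defaults are hypothetical, and this is not ironic-lib's actual code.

```python
def build_convert_cmd(source, dest, out_format="raw", cache=None, sparse_size=None):
    """Assemble a qemu-img convert command line as a list of arguments.

    Hypothetical helper for illustration; it does not run qemu-img.
    """
    cmd = ["qemu-img", "convert", "-O", out_format]
    if cache is not None:
        # -t: cache mode used when writing the destination
        cmd += ["-t", cache]
    if sparse_size is not None:
        # -S: sparse size; 0 disables zero-sector detection
        cmd += ["-S", str(sparse_size)]
    cmd += [source, dest]
    return cmd


if __name__ == "__main__":
    # Paths are placeholders, not taken from the story.
    print(" ".join(build_convert_cmd("image.qcow2", "/dev/nvme0n1",
                                     cache="directsync", sparse_size=0)))
```

Comparing the output with and without `sparse_size=0` makes it easy to see whether a given reproduction, such as the one in the story, includes the flag.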
arne_wiebalckTheJulia: at least in fio, there is direct and sync (one is bypassing the cache, the other is sync'ing after every write I think)10:31
TheJuliaarne_wiebalck: yeah, sync is a fun one because it is device dependent, example some raid controllers treat sync as "ahh, yes, that is committed to my battery backed buffer"10:33
ajyaHi dtantsur iDRAC returns task URI in Location header, only thing it is OEM task (same task id, different URL). From that generic task can be extracted, but it's specific code for iDRAC (sushy-oem-idrac has something like that already for one OEM endpoint). I'm asking firmware team to change this to generic task URL for future versions.10:33
TheJuliaand still needs to write that out.10:33
* TheJulia has had much fun with this over the years10:33
dtantsurajya: ouch :( thank you!10:33
arne_wiebalckTheJulia: right ... I think a sync ack *should* 10:34
arne_wiebalckguarantee persistent storage10:34
TheJuliayup10:34
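[Editor's sketch] The "direct" versus "sync" distinction discussed above maps, on Linux, to the `O_DIRECT` and `O_SYNC` open flags: the former bypasses the page cache, the latter makes each write return only once the device reports the data as stored. The Python sketch below shows only the sync side, since `O_DIRECT` needs aligned buffers and filesystem support. The function name `write_with_sync` is hypothetical, and whether "stored" really means persistent still depends on the device, as noted in the conversation.

```python
import os


def write_with_sync(path, data):
    """Write bytes to path, asking the kernel to commit them before returning.

    Hypothetical helper for illustration. O_SYNC is used where the platform
    offers it; os.fsync() is called as well so the intent holds either way.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        written = os.write(fd, data)
        # Flush file data to the device; a controller with a battery-backed
        # buffer may acknowledge this before the data reaches the medium.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```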
ajyadtantsur: also I haven't got to reproducing the issue locally yet to confirm that this is the only issue, will try this week 10:34
arne_wiebalckTheJulia: all this is a constant source of interesting phenomena :)10:35
TheJuliaindeed10:35
arne_wiebalckTheJulia: since many years, at least for me 10:35
TheJuliaAnd why I had liquor in my office when I had data centers I could walk to10:36
arne_wiebalckheh10:36
dtantsurajya: could you check if SecureBootEnable is changed instantly or only after a reboot? If the latter, we can probably try rebooting and checking the updated value.10:36
ajyadtantsur: only after reboot, PATCHing only creates a job in iDRAC that is in Scheduled state. Only during reboot it is started. The workflow is the same as for BIOS attribute update because it is a BIOS attribute change (probably, could do the same with BIOS clean step).10:41
dtantsurajya: okay, so if I update the redfish code to check if the value has changed immediately, reboot if not, and check again, it will work?10:57
ajyadtantsur: yes, GET /SecureBoot returns old value until job is finished11:06
dtantsurooookay, time for some ugly hacks \o/11:12
dtantsurthanks again ajya 11:12
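[Editor's sketch] The flow agreed on above (check whether the secure boot value changed immediately, reboot if not, then check again) can be sketched in Python roughly as follows. This is not the Ironic redfish code: `get_state` and `reboot` are hypothetical callables supplied by the caller, and the timeout and polling interval are arbitrary assumptions.

```python
import time


def wait_for_secure_boot(get_state, reboot, target, timeout=600, interval=15,
                         sleep=time.sleep, clock=time.monotonic):
    """Return False if the state already matches, True if a reboot was needed.

    get_state: hypothetical callable returning the current secure boot state.
    reboot:    hypothetical callable that triggers a reboot of the node.
    """
    if get_state() == target:
        return False  # the change applied immediately, no reboot required

    # The BMC only scheduled the change; it is applied during a reboot.
    reboot()
    deadline = clock() + timeout
    while clock() < deadline:
        if get_state() == target:
            return True
        sleep(interval)
    raise TimeoutError("secure boot state did not change within %s seconds" % timeout)
```

Passing `sleep` and `clock` as parameters keeps the sketch testable without real delays.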
iurygregorymorning Ironic11:32
vanouHi arne_wiebalck and all11:35
opendevreviewMike Raineri proposed openstack/ironic master: Create 'redfish' driver Redfish Interop Profile  https://review.opendev.org/c/openstack/ironic/+/75406111:54
dtantsurajya: how long do you think the job application can take?12:25
ajyadtantsur: looking at the logs - around 6 mins12:33
dtantsurwow, okay12:34
ajyait includes rebooting system that takes time12:36
opendevreviewDmitry Tantsur proposed openstack/ironic master: [WIP] Wait for secure boot state change if it's not immediate  https://review.opendev.org/c/openstack/ironic/+/86399912:45
dtantsurajya: true. Could you take a quick look at the direction here ^^?12:45
ajyadtantsur: will this reboot somehow conflict with reboot when setting the boot device and launching for pxe or vm? I think for direct deploy there is one reboot that handles both secure boot and boot device changes and then launches IPA. Now it will reboot twice? 13:07
ajyaor users will have to set 0 to avoid that?13:08
ajyaWhat will be the flow for ramdisk deploy? 13:08
dtantsurajya: so, I checked PXE and redfish-virtual-media, and both do the secure boot business before setting the boot device13:09
dtantsurgood question re ramdisk13:09
ajyadtantsur: ok at least reboot will not clear boot device because it will be set later, it will only slow down direct deploy when changing secure boot as there will be 2 reboots (both roughly those 6 minutes)13:10
dtantsurajya: yeah, but only if you request the secure boot change13:12
ajyayes, users maybe will need to also increase their clean, deploy timeouts because of this addition 13:13
dtantsurgood point, I'll update the docs13:14
dtantsurramdisk deploy goes through the same prepare_instance call, which manages secure boot as one of the first things13:14
ajyawhat I don't know because I haven't tried - why secure boot does not work with ramdisk - there is no reboot because system is already running?13:14
dtantsurajya: my only guess is that it's because BMO uses force_persistent_boot_device=Never13:15
dtantsurwhich is still weird, but maybe the temporary boot device is actually reset?13:15
ajyadtantsur: so that would happen after applying secure boot in the middle of booting? Maybe, have to check. Otherwise, the proposed solution would work, but drawback is making things slower. However, if the boot setting is cleared in the middle of booting there is no other way around as reboot again13:29
ajyamaybe could do the reboot&waiting for force_persistent_boot_device=Never only but I have to confirm if that's the cause13:30
dtantsurright...13:41
dtantsurit's also complicated since set_secure_boot can be called from the API directly13:42
*** rcastillo|rover_ is now known as rcastillo13:49
*** rcastillo is now known as rcastillo|rover_13:49
*** rcastillo|rover_ is now known as rcastillo|rover13:51
opendevreviewDmitry Tantsur proposed openstack/ironic master: [WIP] [PoC] A metal3 CI job  https://review.opendev.org/c/openstack/ironic/+/86387313:52
ajyayeah, it should be self-contained unless there is a way to determine if additional reboot needed or not based on the context it is called from and other settings13:52
arozmanHi Ironic!14:30
arozmanHi, I have a question related to Ironic API, I have this request towards a Ironic server  curl -u 'username:passwd' -X PATCH -H "Content-Type: application/json" -d '[{"op":"add","path":"/boot_interface","value":"redfish-virtual-media"}]' -k https://172.22.0.1:6385/v1/nodes/96848965-e6ef-47d6-b02a-ca2f5989e4e0  and it returns a 406 error. Is there some option in the Ironic config that I have to enable to allow me to use this 14:56
arozmanendpoint?14:56
arozmanthis is the error message: {"error_message": "{\"faultcode\": \"Client\", \"faultstring\": \"Request not acceptable.\", \"debuginfo\": null}"}14:57
ashinclouds[m]arozman: ummm what is your hardware type. Btw, that should be op update 15:04
ashinclouds[m]Also, have you considered the bare metal command?15:05
JayFI think arozman works on metal3... At least based on my interpretation of that nickname 😀15:06
ashinclouds[m]Should still work if you invoke it properly15:07
JayFYeah I'm just guessing that they're trying to figure out the actual API call... Although you're right that using the CLI with verbose will probably get you good results15:08
arozman@JayF yeah I am from the metal3 tribe true :D but in this case it is a pure Ironic use-case . I am trying to help some folks downstream with some testing.15:09
arozman@JayF, ashinclouds[m] Thanks I was looking for this info! I tried op: replace and add because those were present here https://docs.openstack.org/api-ref/baremetal/?expanded=change-node-boot-mode-detail,update-node-detail,create-node-detail15:10
arozmanbut then I will do the testing first with the baremetal cli 15:10
arozmanand also I will try op: update, thanks a lot !15:10
dtantsurarozman: you're missing the correct API version15:30
dtantsurarozman: 1.31 in your case: https://docs.openstack.org/ironic/latest/contributor/webapi-version-history.html#ocata-7-0-015:30
dtantsurit's provided via X-OpenStack-Ironic-Api-Version header15:30
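[Editor's sketch] To illustrate the point about the version header, here is a minimal Python sketch that builds (but does not send) the PATCH request from the question with the API version header added. The helper name `build_node_patch` is hypothetical. The URL, node UUID, and patch body are copied from the request quoted above. Authentication and TLS handling are left out on purpose.

```python
import json
import urllib.request


def build_node_patch(base_url, node_uuid, patch, api_version="1.31"):
    """Build a PATCH request for a node, including the API version header.

    Hypothetical helper for illustration; the request is not sent here.
    """
    return urllib.request.Request(
        url="%s/v1/nodes/%s" % (base_url, node_uuid),
        data=json.dumps(patch).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # Header name as given in the chat; HTTP header names are
            # case-insensitive.
            "X-OpenStack-Ironic-Api-Version": api_version,
        },
    )


if __name__ == "__main__":
    req = build_node_patch(
        "https://172.22.0.1:6385",
        "96848965-e6ef-47d6-b02a-ca2f5989e4e0",
        [{"op": "add", "path": "/boot_interface", "value": "redfish-virtual-media"}],
    )
    print(req.get_method(), req.full_url)
    print(req.header_items())
```

The equivalent change to the original curl command would be one more `-H` option carrying the same header and value.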
arozman@dtantsur Thank you!!!15:30
dtantsurbut yeah, I'd recommend you rather use the client (we even have the ironic-client container!)15:31
arozmanyeah I will use the client for debugging that will help me a lot, but downstream folks will also develop Ironic driver for some closed source infra management service so they also need to know how the raw requests fit together, so thanks again! 15:33
opendevreviewDmitry Tantsur proposed openstack/ironic master: [WIP] [PoC] A metal3 CI job  https://review.opendev.org/c/openstack/ironic/+/86387315:43
JayFarozman: we do have the ironic-staging-drivers OSS repo which stuff like that can live if desired15:45
JayFarozman: https://opendev.org/x/ironic-staging-drivers if whoever is developing that driver for a closed source system wants to share the code15:46
arozman@JayF Thanks I would prefer everything to be OSS but even the service is not public even I am not allowed to know how does it work. It lives inside the belly of the megacorp :D15:47
JayFheh, got it :) 15:47
JayFarozman: you are from metal3, right?15:47
arozmanyes15:47
JayFI remember your name from that slack channel15:47
JayFokay, good stuff :)15:47
dtantsurJayF: JFYI there are weekly metal3 meetings on zoom, Wed 14:00 UTC15:50
JayFif that was not 6am local time, I might would kibitz it. I'm not sure I'd provide enough value to be worth taking that early of a morning :D 15:51
JayFbut I'm always happy to help or attend if there's something specific I can do15:51
arozman@JayF well in that case may I offer you https://www.youtube.com/channel/UC_xneeYbo-Dl4g-U78xW15g15:52
JayFI see your youtube link, and raise you a youtube live stream, my OSS Office Hours, starting in 7 minutes https://youtube.com/jayofdoom15:53
arozmanI fold, well then at least I have now something to listen to while I am working15:54
JayFarozman: knikolla[m]: I think I fixed it :| 16:15
arozmanokay16:16
rpittaubye o/17:16
opendevreviewMerged openstack/sushy master: Increase server side retries  https://review.opendev.org/c/openstack/sushy/+/86382817:33
opendevreviewJakub Jelinek proposed openstack/ironic master: Implements node inventory: database  https://review.opendev.org/c/openstack/ironic/+/86256917:50
kubajjTheJulia: dtantsur: It seems that the mysql_as_long causes issues because it is LONGTEXT for MySQL and TEXT for PostgreSQL18:13
TheJuliakubajj: if I remember correctly, text in Postgres is not size constrained18:34
kubajjTheJulia: does it make sense to split the asserts for the two fields into the TestMigrationsMySQL and TestMigrationsPostgresSQL?18:39
TheJuliakubajj: yes, I think there is another field that needs that already18:40
opendevreviewJakub Jelinek proposed openstack/ironic master: Implements node inventory: database  https://review.opendev.org/c/openstack/ironic/+/86256918:44
opendevreviewJakub Jelinek proposed openstack/ironic master: Implements node inventory: database  https://review.opendev.org/c/openstack/ironic/+/86256918:45
opendevreviewJakub Jelinek proposed openstack/ironic master: WIP: Get inventory from Inspector  https://review.opendev.org/c/openstack/ironic/+/86405719:15
