Wednesday, 2022-05-11

*** mat_fechner is now known as matfechner04:24
kamlesh6808cGood morning Ironic !04:25
arne_wiebalckGood morning kamlesh6808c and Ironic!06:56
arne_wiebalckstevebaker[m]: thanks for picking this up!06:56
rpittaugood morning ironic! o/07:41
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent master: Multipath Hardware path handling  https://review.opendev.org/c/openstack/ironic-python-agent/+/83703907:50
rpittaucan I please get a quick review on https://review.opendev.org/c/openstack/ironic-python-agent/+/841220 ?07:51
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.4: Use Yoga constraints for bugfix/8.4  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129008:09
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.4: Use Yoga constraints for bugfix/8.4  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129008:58
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.4: Use Yoga constraints for bugfix/8.4  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129008:59
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.4: Use Yoga constraints for bugfix/8.4  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129009:20
rpittaudtantsur: https://review.opendev.org/c/openstack/ironic-python-agent/+/841290 is green, not without some effort09:57
dtantsurrpittau: even more effort required, sorry. with this IPA is also checked out at stable/yoga, which means we're not testing patches11:11
iurygregorygood morning Ironic11:11
dtantsurI wonder if we need required-projects with IPA explicitly set to bugfix/whatever11:11
dtantsurmorning iurygregory 11:11
rpittauso we need to override by project, not by job :/11:11
rpittauyeah11:12
rpittauhey iurygregory :)11:12
dtantsurrpittau: given that we only care about requirements and IPA, it may be easier to do by project11:12
rpittauok, I'll have a look after lunch11:12
dtantsurthx!11:12
rpiosoGood morning, ironic :)12:31
rpiosoarne_wiebalck: Thank you! That's consistent with the reply I got about remote presentation.12:31
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.4: Use Yoga constraints for bugfix/8.4  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129012:49
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent master: Multipath Hardware path handling  https://review.opendev.org/c/openstack/ironic-python-agent/+/83703913:08
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.4: Use Yoga constraints for bugfix/8.4  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129013:11
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.4: Use Yoga constraints for bugfix/8.4  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129013:20
rpittauI'm the king of typos....13:20
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.4: Use Yoga constraints for bugfix/8.4  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129013:26
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.4: Use Yoga constraints for bugfix/8.4  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129013:28
TheJuliagood morning13:47
rpittaugood morning TheJulia :)13:51
rpittaudtantsur: can you please double-check when you have a moment? https://review.opendev.org/c/openstack/ironic-python-agent/+/84129013:52
rpittauTheJulia: I've updated the multipath patch https://review.opendev.org/c/openstack/ironic-python-agent/+/837039 I think it's ready now13:53
iurygregoryrpittau,  wow re https://review.opendev.org/c/openstack/ironic-python-agent/+/84129013:59
rpittauheh....13:59
opendevreviewMerged openstack/ironic-python-agent bugfix/8.4: Use Yoga constraints for bugfix/8.4  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129014:26
arne_wiebalckrpioso: ++14:32
ayoungWhere would I drop an SSL cert in the ipa image in order to get a successful SSL (TLS) connection back during the clean stage?14:38
ayoungI am assuming in the initrd somewhere14:38
TheJuliaayoung: locally signed endpoint ssl certificates?14:45
ayoungTheJulia, I think so, yes14:45
ayoungI installed via bifrost with enable TLS = true so whatever that does14:46
ayoungThe IPA image generation is x86 specific still and I want to test before working on making that aarch64 enabled14:46
dtantsurayoung: by default IPA is configured with insecure=True so you don't have to bother about this too much14:52
dtantsurunless you're messing with configuration14:52
TheJuliaThat call is also just the heartbeat14:53
dtantsur(yes, bifrost generates self-signed certificates)14:53
TheJuliaso potentially acceptable opsec risk14:53
ayoungwell, the ipa instances are sticking around in clean wait14:53
dtantsurayoung: can you check the generated kernel params (in the httpboot directory, location depends on version)14:54
ayoungpretty sure TLS is the only difference between the one I had working a week or two ago and now14:54
dtantsurwe have a CI job with TLS enabled14:54
ayoungok, so in httpboot, where do I look?14:55
dtantsurthere should be a weirdly named script, just grep for "^kernel"?14:55
dtantsurprobably in pxelinux.cfg/<MAC>14:56
ayoungjust the sha14:56
ayoungboot.ipxe  ?14:56
dtantsurthis is the generic one, you need the per-node conf14:56
ayoungah...already gone...let me reclean14:57
dtantsur$ sudo grep kernel /var/lib/ironic/httpboot/pxelinux.cfg/52-54-00-eb-42-e6 | head -114:57
TheJuliayeah, it will be in /httpboot/<node-uuid>/config if memory serves14:57
dtantsurkernel http://192.168.122.1:8080//4e41df61-84b1-5856-bfb6-6b5f2cd3dd11/deploy_kernel selinux=0 troubleshoot=0 text nofb nomodeset systemd.journald.forward_to_console=yes console=ttyS0 ipa-insecure=1 ipa-debug=1 ipa-api-url=http://192.168.122.1:6385 ipa-global-request-id=req-5db4d02d-4589-4db4-b3f3-68f7b15c1e33 BOOTIF=${mac} initrd=deploy_ramdisk || goto retry14:57
dtantsuror that14:57
TheJuliathe pxelinux.cfg files are just links14:57
dtantsurone of them is a symlink to the other14:57
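
Annotation (not part of the channel log): the kernel line dtantsur pasted above can also be inspected programmatically. The following is a minimal sketch, assuming the per-node config contains a single `kernel` line in the shape shown in the log; `parse_kernel_params` is a hypothetical helper name, not part of Ironic or IPA, and the sample line is abbreviated from the one in the log.

```python
def parse_kernel_params(line):
    """Split a boot-config 'kernel' line into (image, params).

    Hypothetical helper for illustration only.
    """
    # Drop an iPXE fallback suffix such as "|| goto retry", if present.
    line = line.split("||", 1)[0]
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "kernel":
        raise ValueError("not a kernel line")
    image = tokens[1]
    params = {}
    for token in tokens[2:]:
        key, sep, value = token.partition("=")
        # Bare flags such as "text" or "nofb" carry no value.
        params[key] = value if sep else True
    return image, params


# Sample line abbreviated from the one pasted in the log.
sample = (
    "kernel http://192.168.122.1:8080//4e41df61-84b1-5856-bfb6-6b5f2cd3dd11/deploy_kernel "
    "selinux=0 text nofb ipa-insecure=1 ipa-debug=1 "
    "ipa-api-url=http://192.168.122.1:6385 BOOTIF=${mac} "
    "initrd=deploy_ramdisk || goto retry"
)
image, params = parse_kernel_params(sample)

# Only the ipa-* options matter for the TLS question discussed here.
ipa_params = {k: v for k, v in params.items() if k.startswith("ipa-")}
print(ipa_params)
```

Filtering on the `ipa-` prefix makes it quick to see whether `ipa-insecure=1` was actually passed to the ramdisk and which `ipa-api-url` it was told to use.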
ayoungI think it actually just worked...14:58
ayoungthe node was in available...hrm...14:58
dtantsurI have a guess14:58
dtantsuris it possible that the node had the agent running when you enabled TLS?14:59
dtantsure.g. from previous cleanings?14:59
ayoungNo, it rebooted14:59
dtantsurI mean, when you had the error?14:59
dtantsurthat it rebooted now may be the reason it started working :)15:00
ayoungI'll keep playing see what data I can generate15:00
* dtantsur is trying to have a meeting from the balcony15:02
TheJuliabalcony++ ?15:02
TheJuliaayoung: why did they reboot?15:02
TheJuliawhat triggered it?15:02
TheJuliaayoung: if an unexpected reboot occurs, your agent token can't be retrieved again15:03
ayoungenP4p3s0u1u3c2   who came up with this naming scheme?  Can whomever sits closest to that person deliver a swift kick to the shins for me?15:18
ayoungTheJulia, just the initial PXE boot kicked off by the clean, I think15:18
ayoungit seems to be working ok but not consistently15:18
ayoungwhich might be hardware, as it is a new cluster15:18
ayoungand cluster is only half the word15:19
dtantsurwhen you retry cleaning after a failure, a reboot is always done, even in fast-track mode15:20
ayoungOK.  But this was not after a failure.  I triggered the clean manually15:21
ayoungopenstack baremetal node clean   mystique14-r116  --clean-steps '[{"interface": "deploy", "step": "erase_devices"}]'15:21
ayoungI really was testing whether the PXE boot stage works, which it does15:21
ayounggonna try a full deploy15:21
ayoungonce I see if this clean succeeds or fails, but it looks good on one node so far15:22
TheJuliaokay15:22
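
Annotation (not part of the channel log): the `--clean-steps` argument in ayoung's command is a JSON list of step dictionaries. A small sketch of building that argument and the full command line from Python, assuming the same node name and step as in the log; nothing here calls the Ironic API, it only assembles the string a shell would receive.

```python
import json
import shlex

# The single manual clean step used in the log.
clean_steps = [{"interface": "deploy", "step": "erase_devices"}]

cmd = [
    "openstack", "baremetal", "node", "clean",
    "mystique14-r116",
    "--clean-steps", json.dumps(clean_steps),
]

# shlex.join quotes the JSON so it survives the shell as one argument.
command_line = shlex.join(cmd)
print(command_line)
```

Serializing with `json.dumps` rather than typing the JSON by hand avoids quoting mistakes once steps start carrying arguments.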
ayoungso...node in clean wait state, sol on the node shows it at the debian login prompt15:35
ayoungcleaned 2 nodes.  one is in the manageable state now, the other in clean wait15:36
ayoungnot exactly success, not quite failure.  Too early for whisky, too late for coffee15:36
dtantsurexactly right for coffee with whisky?15:37
TheJuliacould it still be cleaning?15:38
TheJuliacould the block device be... damaged?15:38
dtantsurerase_devices can take as long as it wants if there is no hardware-assisted secure erase15:48
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.3: Use Yoga constraints for bugfix/8.3  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129115:49
TheJuliaIndeed15:49
TheJuliaalso, we default to a single thread15:49
* TheJulia has a change... someplace... to make it 4 threads15:49
TheJuliaHere change... when I finally get back to code15:49
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.1: Use Xena constraints for bugfix/8.1  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129315:54
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.1: Use Xena constraints for bugfix/8.1  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129315:54
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.3: Use Yoga constraints for bugfix/8.3  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129115:55
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent bugfix/8.3: Use Yoga constraints for bugfix/8.3  https://review.opendev.org/c/openstack/ironic-python-agent/+/84129115:56
ayoungI have a cluster of 14 machines.  So different machines get different results, and yet they are all supposed to be identical.  The joys of new hardware15:59
opendevreviewRiccardo Pittau proposed openstack/ironic-python-agent master: Multipath Hardware path handling  https://review.opendev.org/c/openstack/ironic-python-agent/+/83703916:04
rpittaugood night! o/16:12
opendevreviewMerged openstack/ironic-python-agent master: The Python 3.6 and Python 3.7Support has been dropped since yaga  https://review.opendev.org/c/openstack/ironic-python-agent/+/84107316:12
iurygregorytypo :D16:17
dtantsuriurygregory: not necessarily: https://en.wikipedia.org/wiki/Baba_Yaga16:27
iurygregoryLOL16:29
iurygregoryI wasn't expecting that16:29
* dtantsur is happy to help16:29
TheJuliaOh wow17:02
JayFI just keep hearing the Weird Al song except "Y-A-G-A Yaaahhhggaaaa"17:12
TheJuliaI keep thinking of angry women17:24
* TheJulia feels just exhausted sending a bunch of emails17:37
opendevreviewMerged openstack/ironic-python-agent stable/yoga: Collect a full lsblk output in the ramdisk logs  https://review.opendev.org/c/openstack/ironic-python-agent/+/84093717:44
opendevreviewMerged openstack/ironic-python-agent bugfix/8.6: Collect a full lsblk output in the ramdisk logs  https://review.opendev.org/c/openstack/ironic-python-agent/+/84093817:44
opendevreviewVerification of a change to openstack/ironic-python-agent stable/xena failed: Use a pre-defined partition UUID to detect configdrive on GPT  https://review.opendev.org/c/openstack/ironic-python-agent/+/84034817:44
ayounglast_error: 'Node failed to start the first cleaning step: Connection to agent failed:18:20
ayoung  Failed to connect to the agent running on node af8ebf65-50cf-4aad-8799-8e5678d2574e18:20
ayoung  for invoking command clean.get_clean_steps. Error: HTTPSConnectionPool(host=''192.168.116.47'',18:20
ayoung  port=9999): Max retries exceeded with url: /v1/commands/?wait=true&agent_token=nKxmsok4Whd2ko9Ra5T9slzhGaEfsPUz_JznjIJK3Vw18:20
ayoung  (Caused by SSLError(SSLCertVerificationError(1, ''[SSL: CERTIFICATE_VERIFY_FAILED]18:20
ayoung  certificate verify failed: certificate is not yet valid (_ssl.c:1131)'')))'18:20
ayoungthat is what I saw before.  About Half the nodes failed with errors like this18:20
TheJuliaclocks?18:26
TheJuliado you have a time server?18:26
ayoungNot that I know of18:27
TheJuliaso, I suspect it might be that your system local clocks are just not consistent18:27
TheJuliawe have logic in the agent to force the clock to update someplace18:28
ayoungOK, I can handle that.  And I can get a clock server if needs be18:28
TheJuliahttps://github.com/openstack/ironic-python-agent/blob/fcb65cae18f4a6b4b05fb70677e2fa114e0558a9/releasenotes/notes/set-clock-prior-to-poweroff-af6ec210aad8b45a.yaml18:28
TheJuliawe also save the time at power-off18:28
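
Annotation (not part of the channel log): "certificate is not yet valid" means the verifying side's clock is earlier than the certificate's notBefore time, which is why TheJulia asks about clocks. A minimal sketch of that comparison using only the Python standard library; the timestamps are made-up illustration values, and `seconds_until_valid` is a hypothetical helper, not an Ironic or IPA function.

```python
import calendar
import ssl
import time


def seconds_until_valid(not_before, now=None):
    """Return seconds remaining until a certificate becomes valid.

    not_before uses the format returned by SSLSocket.getpeercert(),
    e.g. "May 11 18:00:00 2022 GMT". A positive result means the
    verifier's clock is behind the certificate's start time.
    """
    if now is None:
        now = time.time()
    return ssl.cert_time_to_seconds(not_before) - now


# Illustration values (assumptions): a certificate that starts at 18:00 UTC,
# checked by a machine whose clock reads 17:00 UTC.
not_before = "May 11 18:00:00 2022 GMT"
verifier_clock = calendar.timegm((2022, 5, 11, 17, 0, 0))

skew = seconds_until_valid(not_before, now=verifier_clock)
if skew > 0:
    print(f"certificate not yet valid for another {skew} seconds")
```

With roughly half the nodes failing, comparing each node's clock against the machine doing the verification is a cheap first check before setting up a time server.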
NobodyCamgood afternoon Ironic folks19:05
NobodyCamcrazy question did deploy steps work in ussuri?19:06
TheJuliaNobodyCam: that is a great question19:20
NobodyCamLOL19:20
NobodyCamdocs seem to be all over the place19:21
TheJuliawell, clean steps were a thing, and deploy steps were almost a thing between releases19:21
NobodyCamyea I saw this: `Starting with the Victoria release cycle, deployment can be customized similarly to cleaning. `19:22
TheJuliaYeah, I was just looking at code suggesting victoria as well19:22
NobodyCamso there are deploy templates19:23
NobodyCam`Starting with the Stein release, with Bare Metal API version 1.55, deploy templates offer a way to define a set of one or more deploy steps to be executed with particular sets of arguments and priorities.`19:23
NobodyCamnot sure I've ever used the deploy templates19:24
TheJuliaoh jeeze I'm trying to remember how that maps through19:26
NobodyCamLOL19:26
TheJuliaso mgoddard has a video someplace demonstrating deploy time raid19:26
NobodyCamoh19:26
NobodyCamhappen to know if a custom deploy template overwrites the defaults or is added to them?19:28
TheJuliaI have no idea19:30
NobodyCamwe'll find out !19:32
NobodyCamhehehee19:32
NobodyCamwell I guess if I read the doc it would tell me: `During deployment, if any of the traits in a node’s instance_info.traits field match the name of a deploy template, then the steps from that deploy template will be added to the list of steps to be executed by the node.`19:33
TheJuliaThat sounds right!19:54
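
Annotation (not part of the channel log): the documentation sentence NobodyCam quotes describes a simple name match between instance traits and deploy templates, with matching steps added to the list. A rough sketch of that rule, assuming templates are held as a plain name-to-steps mapping; the trait names and the `example_step` step are hypothetical, and this is not Ironic's actual implementation.

```python
def steps_from_templates(instance_traits, templates):
    """Collect steps from every template whose name matches a trait."""
    steps = []
    for name, template_steps in templates.items():
        if name in instance_traits:
            # Matching templates contribute steps; nothing is replaced.
            steps.extend(template_steps)
    return steps


# Hypothetical data for illustration.
templates = {
    "CUSTOM_EXAMPLE": [
        {"interface": "deploy", "step": "example_step", "args": {}, "priority": 50},
    ],
    "CUSTOM_UNUSED": [
        {"interface": "raid", "step": "example_step", "args": {}, "priority": 10},
    ],
}
instance_traits = ["CUSTOM_EXAMPLE", "HW_CPU_X86_VMX"]

extra_steps = steps_from_templates(instance_traits, templates)
print(extra_steps)
```

Read this way, the quoted sentence answers the earlier question: template steps are added to the steps to be executed rather than replacing them.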
TheJuliacryptic crash of the day https://b514d6c133582c0af7f3-9355d865b880bd099576064727df95b2.ssl.cf2.rackcdn.com/841275/2/check/ipa-tempest-uefi-redfish-vmedia-src/d0afc64/controller/logs/ironic-bm-logs/node-1_console_2022-05-10-23%3A02%3A29_log.txt20:54
opendevreviewJulia Kreger proposed openstack/networking-generic-switch master: CI: use pre-existing ssh key on multinode jobs  https://review.opendev.org/c/openstack/networking-generic-switch/+/84126521:00
stevebaker[m]arne_wiebalck, iurygregory : Hey the SIG video is live https://www.youtube.com/watch?v=_K-aPdKnt1Y22:35
NobodyCamnow to figure out how to correctly add a step: `Validation of deploy steps from deploy templates matching this node's instance traits failed. Matching deploy templates: CUSTOM_ZACK_DEPLOY_STEP. Errors: node does not support this deploy step`22:46
