Tuesday, 2021-03-02

*** tosky has quit IRC00:17
*** k_mouza has joined #openstack-ironic00:21
*** k_mouza has quit IRC00:25
*** jamesdenton has quit IRC00:26
*** jamesden_ has joined #openstack-ironic00:26
mnaserTheJulia, arne_wiebalck: I believe I figured out the culprit, we don't erase metadata on RAID member drives (nor do we wipe them), so it ends up ignoring those devices and then running shred on the RAID array00:55
mnaser`smartctl -d ata /dev/md127 -g security`00:56
mnaserRunning cmd (subprocess): shred --force --zero --verbose --iterations 1 /dev/md127 execute /opt/ironic-python-agent/lib64/python3.6/site-packages/oslo_concurrency/processutils.py:38400:56
mnaserCMD "shred --force --zero --verbose --iterations 1 /dev/md127" returned: 0 in 1848.313s execute /opt/ironic-python-agent/lib64/python3.6/site-packages/oslo_concurrency/processutils.py:42300:56
mnaser30 minutes to run shred against the RAID array00:57
mnaserIMHO, the better approach is to just run delete_configuration, erase_devices_metadata, erase_devices, create_configuration on every cleaning instead?00:57
mnaserdoes automated cleaning not do that by default?00:57
mnaseraha!01:02
mnaserynchronous command get_clean_steps completed: {'clean_steps': {'GenericHardwareManager': [{'step': 'erase_devices', 'priority': 10, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}, {'step': 'erase_devices_metadata', 'priority': 99, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}, {'step': 'delete_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested':01:02
mnaserFalse, 'abortable': True}, {'step': 'create_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}]}, 'hardware_manager_version': {'generic_hardware_manager': '1.1'}}01:02
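To make the pasted get_clean_steps result concrete, here is a minimal sketch. It assumes that automated cleaning skips any step whose priority is 0 and runs the remaining steps from highest to lowest priority. The helper name automated_cleaning_order is hypothetical; the step names and priorities are copied from the paste above.

```python
# Step names and priorities copied from the get_clean_steps paste above.
CLEAN_STEPS = [
    {"step": "erase_devices", "priority": 10, "interface": "deploy"},
    {"step": "erase_devices_metadata", "priority": 99, "interface": "deploy"},
    {"step": "delete_configuration", "priority": 0, "interface": "raid"},
    {"step": "create_configuration", "priority": 0, "interface": "raid"},
]


def automated_cleaning_order(steps):
    """Hypothetical helper: return step names in the order they would run.

    Assumption: priority 0 means "disabled for automated cleaning" and
    higher priorities run first.
    """
    enabled = [s for s in steps if s["priority"] > 0]
    ordered = sorted(enabled, key=lambda s: s["priority"], reverse=True)
    return [s["step"] for s in ordered]


if __name__ == "__main__":
    print(automated_cleaning_order(CLEAN_STEPS))
```

Under these assumptions neither RAID step runs during automated cleaning, which matches the behaviour mnaser describes: the array still exists when erase_devices runs.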
*** paras333 has joined #openstack-ironic01:06
*** iurygregory has quit IRC01:07
*** paras333 has quit IRC01:10
mnaseri guess there's no way of changing the prio of delete_configuration and create_configuration ?01:11
eanderssonAnyone seeing this error? We believe it started after rebasing stable/victoria01:13
eandersson> Unexpected response from the agent for node <uuid>: the running command list does not include prepare_image or its result is malformed01:13
mnasereandersson: did you update your ipa too?01:20
eanderssonYea - should be, but can double check.01:21
mnasereandersson: /var/log/ironic/deploy/*.tar.gz might help too?01:22
eanderssonah yea good call01:22
mnaserthat'll also help identify the ipa version too01:22
eanderssonCould be related to IPA version. I see "deploy_results", but not "command_results" in the logs.01:26
eanderssonIt didn't create a log for some reason under /var/log/ironic/deploy :'(01:28
*** iurygregory has joined #openstack-ironic01:30
eanderssonmnaser does the image building pipeline work for you? :D01:35
eandersson> https://bootstrap.pypa.io/3.5/3.5/get-pip.py01:35
eanderssonIt's trying to get this for us and throwing a 40401:35
eanderssonoh lol nvm01:35
mnasereandersson: which image building one exactly?01:35
mnaserfor ipa?01:35
eandersson3.5/3.5 not sure why this is part of it01:35
eanderssonYea01:35
mnaseri can run a build now01:35
eanderssonI think I figured it out haha we manually patched a bug in the ipa build process that has been fixed now01:36
mnasereandersson: you using ironic-python-agent-builder?01:36
mnaserthat's what i use :)01:36
eanderssonI think we started with that, but switched to just using the disk image builder at some point01:40
mnasereandersson: i mean it pretty much just runs DIB :)01:42
eanderssonYea - was gonna say it's probably the same, since we use the elements from the ipa-builder01:48
*** zzzeek has quit IRC01:49
*** zzzeek has joined #openstack-ironic01:51
*** paras333 has joined #openstack-ironic01:57
*** paras333 has quit IRC02:01
openstackgerritMerged openstack/ironic-python-agent stable/ussuri: Pin version of ipa-builder when publishing image  https://review.opendev.org/c/openstack/ironic-python-agent/+/77802102:18
*** rcernin has quit IRC02:34
*** rcernin has joined #openstack-ironic02:47
*** rloo has quit IRC02:50
*** mkrai has joined #openstack-ironic03:16
*** jamesden_ is now known as jamesdenton03:29
*** zzzeek has quit IRC04:32
*** zzzeek has joined #openstack-ironic04:33
*** rh-jlabarre has quit IRC04:35
*** tzumainn has quit IRC05:11
openstackgerritJacob Anders proposed openstack/ironic master: Add support for using NVMe specific cleaning  https://review.opendev.org/c/openstack/ironic/+/77813405:27
*** anuradha1904 has joined #openstack-ironic05:29
openstackgerritJacob Anders proposed openstack/ironic-python-agent master: Remove nvme-cli warning and delay on nvme-format  https://review.opendev.org/c/openstack/ironic-python-agent/+/77813605:41
openstackgerritYogesh proposed openstack/ironic master: Add idrac HW type IPMI interface support  https://review.opendev.org/c/openstack/ironic/+/77186205:53
*** lbragstad_ has joined #openstack-ironic06:03
*** lbragstad has quit IRC06:06
*** zzzeek has quit IRC06:10
*** zzzeek has joined #openstack-ironic06:11
*** k_mouza has joined #openstack-ironic06:21
*** k_mouza has quit IRC06:25
openstackgerritankit proposed openstack/ironic master: Adds config parameter kernel_append_param for iLO  https://review.opendev.org/c/openstack/ironic/+/75518906:27
*** gyee has quit IRC06:47
*** rcernin has quit IRC06:57
*** paras333 has joined #openstack-ironic07:24
*** jawad_axd has joined #openstack-ironic07:25
*** moshiur has joined #openstack-ironic07:28
arne_wiebalckmnaser: this sequence is exactly what we do, in our downstream h/w manager so it is part of automated cleaning07:37
mnaserarne_wiebalck: i'm working on a patch that allows overriding the prio for delete and create config07:38
arne_wiebalckmnaser: I think we always assume cleaning is happening between deploys07:40
mnaserarne_wiebalck: right, but the cleaning by default has prio of 0 for delete and create config07:41
mnaserand what happens is cleaning ignores raid array drives07:41
arne_wiebalckmnaser: the fact that RAID cleaning is not done automatically is intentional, to be consistent with h/w RAID07:41
mnaserah07:41
mnaseri guess it makes sense to have a tunable so that it can be run as part of normal cleaning for software raid07:41
arne_wiebalckmnaser: we discussed several times already to have this done automatically07:41
arne_wiebalckmnaser: I think there is a patch which does what you suggest for deploy steps07:42
openstackgerritMohammed Naser proposed openstack/ironic master: Allow users to configure priority for {create,delete}_configuration  https://review.opendev.org/c/openstack/ironic/+/77814507:43
arne_wiebalckmnaser: I think RAID devices are skipped for erase as a) we assumed they were not there anymore as delete_configuration was run before and b) they would not be able to do fast erase07:43
mnaserarne_wiebalck: the RAID devices which are part of the raid array are skipped.. the raid device itself (/dev/md127) ended up running through shred07:44
arne_wiebalckmnaser: yeah ... not sure this makes sense as the underlying disks may also run shred07:45
arne_wiebalckmnaser: or some sort of erase07:45
mnaseryeah ideally what i'd like to do with my patch is run this order: delete_configuration, erase_devices_metadata, erase_devices, create_configuration -- that way, when it reaches erase_devices, there's no raid array, and it will run a quick secure erase07:46
arne_wiebalckmnaser: yes, this is how I have the order in our h/w manager07:47
mnaserso for me to avoid writing a hardware manager, i'd have those options that i can tweak :P07:47
arne_wiebalckmnaser: yes, that makes sense07:47
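The ordering discussed above can be sketched as follows. This is only an illustration: the default priorities are the ones from the earlier get_clean_steps paste, while the override values (100 and 1) and the names OVERRIDES and cleaning_order are hypothetical and not taken from the proposed patch. It assumes priority 0 disables a step and higher priorities run first.

```python
# Defaults as reported by get_clean_steps earlier in the log.
DEFAULT_PRIORITIES = {
    "erase_devices": 10,
    "erase_devices_metadata": 99,
    "delete_configuration": 0,
    "create_configuration": 0,
}

# Hypothetical operator overrides, chosen only so that the RAID array is
# torn down first and rebuilt last.
OVERRIDES = {
    "delete_configuration": 100,
    "create_configuration": 1,
}


def cleaning_order(defaults, overrides):
    """Merge overrides into defaults and return enabled steps, highest first."""
    merged = {**defaults, **overrides}
    enabled = {name: prio for name, prio in merged.items() if prio > 0}
    return sorted(enabled, key=enabled.get, reverse=True)


if __name__ == "__main__":
    print(cleaning_order(DEFAULT_PRIORITIES, OVERRIDES))
```

With these values, erase_devices only runs once the array has been deleted, so it sees the member disks rather than /dev/md127.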
*** Qianbiao has joined #openstack-ironic07:50
openstackgerritvinay50muddu proposed openstack/ironic master: Add clean/deploy steps to manage certificates  https://review.opendev.org/c/openstack/ironic/+/76379107:50
openstackgerritArun S A G proposed openstack/ironic master: Add agent_state and agent_status params to heartbeat  https://review.opendev.org/c/openstack/ironic/+/77805808:01
zer0c00lThe anaconda deploy driver is ready for review - only thing missing is config drive related stuff. https://review.opendev.org/q/topic:%22anaconda-deploy-driver%22+(status:open%20OR%20status:merged)08:02
zer0c00li will be at the review jam tomorrow!08:02
openstackgerritArne Wiebalck proposed openstack/ironic master: Lazy-load node details from the DB  https://review.opendev.org/c/openstack/ironic/+/77693008:12
*** mkrai has quit IRC08:18
*** mkrai has joined #openstack-ironic08:18
*** rpittau|afk is now known as rpittau08:22
rpittaugood morning ironic! o/08:22
jandersgood morning rpittau o/08:25
rpittauhey janders :)08:25
rpittauiurygregory: unfortunately https://review.opendev.org/c/openstack/ironic-python-agent/+/778021 doesn't work so I'm going to revert it08:26
rpittauiurygregory: https://zuul.opendev.org/t/openstack/build/ab1e98708986424d9b31ab4db4abe6b708:26
*** tosky has joined #openstack-ironic08:35
*** ociuhandu has joined #openstack-ironic08:44
*** dougsz has joined #openstack-ironic08:50
*** jamesdenton has quit IRC08:56
*** jamesdenton has joined #openstack-ironic08:57
*** lucasagomes has joined #openstack-ironic09:01
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder  https://review.opendev.org/c/openstack/ironic-python-agent/+/77815309:15
rpittauok maybe we don't need to revert with this ^09:15
openstackgerritRiccardo Pittau proposed openstack/ironic master: [WIP] Prepare to use tinycore 12 for tinyipa  https://review.opendev.org/c/openstack/ironic/+/77734209:18
openstackgerritRiccardo Pittau proposed openstack/ironic master: Prepare to use tinycore 12 for tinyipa  https://review.opendev.org/c/openstack/ironic/+/77734209:20
openstackgerritRiccardo Pittau proposed openstack/ironic master: Prepare to use tinycore 12 for tinyipa  https://review.opendev.org/c/openstack/ironic/+/77734209:29
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent-builder master: Use tinycore 12 to build tinyipa  https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/77658709:30
moshiurHi rpittau: I am able to build the IPA image with opensuse base image.09:40
*** ociuhandu has quit IRC09:50
rpittaumoshiur: great :)09:59
rpittaumoshiur: patches welcome :)10:02
*** derekh has joined #openstack-ironic10:05
*** paras333 has quit IRC10:06
*** ociuhandu has joined #openstack-ironic10:06
moshiurThanks rpittau: I will try to add two patches each in https://github.com/openstack/diskimage-builder and  https://github.com/openstack/ironic-python-agent-builder.10:25
rpittaumoshiur: not sure how familiar you are with gerrit, but you should use the opendev repositories, not github, they're just mirrors10:26
moshiurrpittau: oh, I am not familiar with gerrit, but will give a try to do this.10:31
rpittaumoshiur: you can find a lot of info on the internet, you can start from https://www.gerritcodereview.com/ and https://wiki.openstack.org/wiki/How_To_Contribute10:33
openstackgerritDerek Higgins proposed openstack/ironic-python-agent master: Increase the memory limit for qemu-img  https://review.opendev.org/c/openstack/ironic-python-agent/+/77803510:33
*** mkrai has quit IRC10:46
*** mkrai_ has joined #openstack-ironic10:46
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder  https://review.opendev.org/c/openstack/ironic-python-agent/+/77815310:53
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder  https://review.opendev.org/c/openstack/ironic-python-agent/+/77815310:54
*** k_mouza has joined #openstack-ironic10:56
jandersTheJulia when you're online and have the time please let me know if this doco addition for NVMe cleaning would be sufficient to address the doco gap you pointed out: https://review.opendev.org/c/openstack/ironic/+/778134 thanks! :)11:01
janderssee you tomorrow Ironic o/11:01
rpittaubye janders :)11:01
janderssee you rpittau11:06
iurygregorygood morning ironic11:12
iurygregoryrpittau, hey shouldn't we run on ubuntu-bionic?11:12
rpittauiurygregory: yeah, probably better, I'll add it to the patch11:13
iurygregoryrpittau++ =)11:13
iurygregoryand using UPPER_CONSTRAINTS_FILE makes a lot of sense!11:14
iurygregorygoing to grab coffee brb11:14
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder  https://review.opendev.org/c/openstack/ironic-python-agent/+/77815311:17
openstackgerritDerek Higgins proposed openstack/ironic-python-agent master: Install doc/requirements.txt in testenv:venv  https://review.opendev.org/c/openstack/ironic-python-agent/+/77817311:35
*** uzumaki has joined #openstack-ironic11:38
openstackgerritMerged openstack/ironic master: secure-rbac - minor follow-up for project scoped tests  https://review.opendev.org/c/openstack/ironic/+/77803311:38
openstackgerritMerged openstack/ironic-python-agent master: Added comment about IPA logs being uploaded to Ironic  https://review.opendev.org/c/openstack/ironic-python-agent/+/77803111:38
openstackgerritDerek Higgins proposed openstack/ironic-python-agent master: Increase the memory limit for qemu-img  https://review.opendev.org/c/openstack/ironic-python-agent/+/77803511:39
*** ociuhandu has quit IRC11:48
*** hoonetorg has quit IRC11:49
*** ociuhandu has joined #openstack-ironic11:50
*** ociuhandu has quit IRC11:50
*** ociuhandu has joined #openstack-ironic11:52
*** ociuhandu has quit IRC11:57
*** hoonetorg has joined #openstack-ironic12:03
anuradha1904Hi everyone, My name is Anuradha and I was an Outreachy intern for December 2020 round for OpenStack. My mentors were iurygregory, and TheJulia, Today is my last day of internship and I want to thank each one of you for this amazing community, I had the most amazing times learning and growing here. I will continue with my contributions and give back as much as possible. I want to thank my amazing mentor12:06
anuradha1904Iurygregory for being the best mentor I could ever ask for. He helped me with the smallest of doubts by helping me with examples, pseudo-codes, and explanations without complaining and with extraordinary patience. It never felt like his first experience as a mentor. Thank you my amazing mentor TheJulia for constantly motivating me and helping me solve my doubts by guiding me with steps such that I self12:06
anuradha1904learn and correct myself, Thank you tosin: for being a great friend, I am ready to grow and learn some more with you and Finally, all the amazing members of the community who reviewed my code, you all were a part of an amazing experience for a beginner who will try to learn to learn and grow. :)12:06
iurygregoryanuradha1904, thank you for your hard work! you did a great job =) congratulations!12:08
anuradha1904iurygregory, Thank you so much, could not have been possible at all without your help :)12:09
*** ociuhandu has joined #openstack-ironic12:12
*** ociuhandu has quit IRC12:17
*** ociuhandu has joined #openstack-ironic12:18
*** zzzeek has quit IRC12:20
*** ociuhandu has quit IRC12:22
*** zzzeek has joined #openstack-ironic12:23
*** paras333_ has joined #openstack-ironic12:35
*** ociuhandu has joined #openstack-ironic12:35
*** mkrai_ has quit IRC12:38
*** ociuhandu has quit IRC12:44
*** rh-jlabarre has joined #openstack-ironic13:04
*** ociuhandu has joined #openstack-ironic13:27
*** uzumaki has quit IRC13:40
*** lbragstad_ is now known as lbragstad13:41
TheJuliagood morning13:49
rpittaugood morning TheJulia :)13:49
iurygregorygood morning TheJulia =)13:51
*** rloo has joined #openstack-ironic13:59
*** rloo has quit IRC13:59
*** rloo has joined #openstack-ironic13:59
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder  https://review.opendev.org/c/openstack/ironic-python-agent/+/77815314:02
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder  https://review.opendev.org/c/openstack/ironic-python-agent/+/77815314:07
rpittauTheJulia: I think we're good to go for the releases, I'm going to review the release notes one more time14:09
rpittauoh they were actually already done :D14:09
rpittauexcept metalsmith14:10
rpittauI'll request that14:10
TheJuliarpittau: thanks14:12
*** lmcgann has joined #openstack-ironic14:14
*** rloo has quit IRC14:17
*** rloo has joined #openstack-ironic14:18
*** fdegir has joined #openstack-ironic14:19
openstackgerritRiccardo Pittau proposed openstack/metalsmith master: Fix release versions  https://review.opendev.org/c/openstack/metalsmith/+/77819414:20
rpittaumaybe we can just quickly merge this before ? ^14:20
TheJuliaApproved14:24
rpittauthanks14:25
*** jawad_axd has quit IRC14:31
*** jawad_axd has joined #openstack-ironic14:32
openstackgerritMerged openstack/ironic-python-agent master: Remove nvme-cli warning and delay on nvme-format  https://review.opendev.org/c/openstack/ironic-python-agent/+/77813614:36
openstackgerritMerged openstack/metalsmith master: Fix release versions  https://review.opendev.org/c/openstack/metalsmith/+/77819414:39
*** tzumainn has joined #openstack-ironic14:42
TheJuliaSo... March 367th is today?14:42
* iurygregory - no reference found =(14:45
TheJuliamnaser: So, I feel like maybe the enumeration of priority should look for software raid and reset the available steps as such14:45
tzumainndtantsur, the change required to allow instance_info to override *_interface values turned out to be suspiciously simple14:54
*** uzumaki has joined #openstack-ironic14:58
iurygregorytzumainn, Dmitry is on PTO this week =)15:01
iurygregorybut he will be happy to hear this when he comes back :D15:01
iurygregorys/hear/read15:02
tzumainnhaha, okay!15:02
iurygregoryif you have the change up feel free to add ironic-week-prio in the hashtag field =)15:03
tzumainniurygregory, done, thanks for the heads up!15:04
iurygregoryty!15:05
TheJuliacan we hold off on permission changes until after the new project scoped rbac work merges??15:05
iurygregoryI really don't want to look at the possible merge conflicts in  https://review.opendev.org/c/openstack/ironic/+/776540 :D15:08
iurygregoryTheJulia, I think it makes sense15:08
TheJuliaI ask mainly because I don't want to inadvertently squash something brand new and I'd prefer to limit the delta of changes. Plus a lot of the old style of permissions rules need to be ripped out in the grand scheme of the universe15:10
*** mkrai has joined #openstack-ironic15:11
iurygregoryI'm ok with this approach15:12
iurygregoryI know it's also a pain to solve merge conflicts etc15:13
*** jawad_axd has quit IRC15:15
*** jawad_axd has joined #openstack-ironic15:16
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent master: Remove default parameter from execute  https://review.opendev.org/c/openstack/ironic-python-agent/+/77820115:20
*** jawad_axd has quit IRC15:24
*** Qianbiao has quit IRC15:33
*** openstackgerrit has quit IRC15:35
mnaserTheJulia: that is far more resilient. I couldn’t find where the steps get enumerated or generated though :(15:53
*** mkrai has quit IRC15:57
*** moshiur has quit IRC15:59
*** openstackgerrit has joined #openstack-ironic16:03
openstackgerritMerged openstack/ironic-python-agent master: Increase the memory limit for qemu-img  https://review.opendev.org/c/openstack/ironic-python-agent/+/77803516:03
TheJuliamnaser: I'll try to take a look between meetings today16:05
TheJuliabut today is a meeting day16:05
*** uzumaki has quit IRC16:08
*** uzumaki has joined #openstack-ironic16:21
openstackgerritRiccardo Pittau proposed openstack/ironic-python-agent stable/ussuri: Use UPPER_CONSTRAINTS_FILE to deal with ipa-builder  https://review.opendev.org/c/openstack/ironic-python-agent/+/77815316:29
rpittausometimes my typos really amaze me16:29
openstackgerritRiccardo Pittau proposed openstack/ironic master: Prepare to use tinycore 12 for tinyipa  https://review.opendev.org/c/openstack/ironic/+/77734216:42
TheJuliabrain processing is always a fun topic16:46
openstackgerritDerek Higgins proposed openstack/ironic-python-agent master: Install doc/requirements.txt in testenv:venv  https://review.opendev.org/c/openstack/ironic-python-agent/+/77817316:49
*** anuradha1904 has quit IRC16:58
*** sshnaidm is now known as sshnaidm|afk17:00
openstackgerritJay Faulkner proposed openstack/ironic-specs master: No Conductor to IPA Communication spec  https://review.opendev.org/c/openstack/ironic-specs/+/77717217:01
*** lucasagomes has quit IRC17:06
*** ociuhandu_ has joined #openstack-ironic17:09
openstackgerritDerek Higgins proposed openstack/ironic-python-agent master: Install doc/requirements.txt in testenv:venv  https://review.opendev.org/c/openstack/ironic-python-agent/+/77817317:10
*** ociuhandu has quit IRC17:12
*** ociuhandu_ has quit IRC17:13
TheJuliaSo my last meeting ran over and I need to run an errand. With regards to the review jam I may be late or we can skip it today17:17
TheJuliaup for either option, as anyone can start/run it17:17
eanderssonThere is a check that warns if the ipa is too old. Does that check work with custom built versions of the ipa that do not necessarily have a tag (e.g. X.Y.Z.dev5)?17:24
TheJuliaI don't remember17:29
TheJuliaI take it you're outside of the supported matrix?17:29
* TheJulia takes car to mechanic17:29
TheJuliadefinitely going to be very late for review jam now17:30
TheJulia:(17:30
eanderssonTrying to figure out a weird bug that started hitting us and noticed that we get a warning about not running a Victoria or newer IPA, but we are on the latest stable/victoria.17:31
*** dougsz has quit IRC17:32
TheJuliahmm17:33
TheJuliacould be a bug or the settings playing out to the message being logged17:33
eanderssonThis is the version we are running17:34
eandersson> ironic-python-agent==6.4.4.dev2417:34
eanderssonGuessing it is just a red herring17:35
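Whether an untagged build such as 6.4.4.dev24 passes a "too old" check depends on how the `.devN` suffix is compared. The sketch below is not Ironic's actual check; it only illustrates PEP 440-style ordering, where a dev build sorts before the final release of the same number but after earlier releases. The function name version_key and the minimum version 6.4.0 are placeholders for illustration.

```python
import re


def version_key(version):
    """Sort key for 'X.Y.Z' or 'X.Y.Z.devN' strings (PEP 440-style order)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?:\.dev(\d+))?", version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    dev = match.group(2)
    # A dev build sorts before the final release with the same number.
    if dev is not None:
        return (release, 0, int(dev))
    return (release, 1, 0)


def is_at_least(version, minimum):
    """True if version is not older than minimum."""
    return version_key(version) >= version_key(minimum)


if __name__ == "__main__":
    # "6.4.0" is a placeholder minimum, not a value taken from Ironic.
    print(is_at_least("6.4.4.dev24", "6.4.0"))
    print(is_at_least("6.4.4.dev24", "6.4.4"))
```

A check that instead compares the raw strings, or that rejects anything it cannot parse, could warn on a dev build even when the code is recent, which would be consistent with the red herring guess above.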
JayFTheJulia: I think zer0c00l was planning on showing up to talk about anaconda, as mentioned yesterday in the upstream meeting17:36
JayFI will attend and hope others do in order to get his stuff moving17:36
rpittauI can't attend :/17:37
rpittaugood night! o/17:42
*** rpittau is now known as rpittau|afk17:42
arne_wiebalckbye everyone o/17:43
*** bnemec has quit IRC17:49
eandersson> Agent is busy: executing command execute_deploy_step17:54
eanderssonWhatever issue we are having, it is causing this on all steps... which also prevents those sweet logs from getting collected ;'(17:55
eandersson> Agent command standby.get_partition_uuids for node <uuid> failed. Expected 2xx HTTP status code, got 409.17:56
*** derekh has quit IRC18:01
JayFReview jam: https://meetpad.opendev.org/ironic18:03
JayFyou all are the peanut butter, join us :D18:04
eanderssonAre we really handling 409s correctly here? They seem to map to "AgentIsBusy"18:07
eanderssonShouldn't Ironic just retry if the agent is still "busy"?18:07
JayFNo, the bug is that it's trying to run two commands at the same time.18:08
JayFI don't know why/how, you could call GET /v1/commands on the agent if you can to see what is still running18:08
JayFbut generally speaking, there should never be a case when a conductor is trying to issue a command to a running agent that already has a command in progress18:08
JayFI don't know anything about why it's happening in your case; but it usually indicates some metadata was never written to a node, or perhaps (guessing here) something going upside-down with the agent fast track support18:09
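The behaviour JayF describes can be modelled with a toy in-memory agent. This is not IPA code: the class ToyAgent, its method names, and the status strings are hypothetical, and the sketch only assumes what is stated above, namely that a second command issued while one is running gets a 409, while listing commands keeps working.

```python
class ToyAgent:
    """Toy model of an agent that runs one command at a time."""

    def __init__(self):
        self.commands = []  # command history, newest last

    def _busy(self):
        return bool(self.commands) and self.commands[-1]["status"] == "RUNNING"

    def execute(self, name):
        """Start a command; refuse with 409 if another is still running."""
        if self._busy():
            current = self.commands[-1]["name"]
            return 409, {"error": f"Agent is busy: executing command {current}"}
        command = {"name": name, "status": "RUNNING"}
        self.commands.append(command)
        return 200, command

    def list_commands(self):
        """Listing is always allowed, even while a command is running."""
        return 200, list(self.commands)

    def finish_current(self):
        """Mark the running command as done (stands in for real work)."""
        if self._busy():
            self.commands[-1]["status"] = "SUCCEEDED"


if __name__ == "__main__":
    agent = ToyAgent()
    print(agent.execute("execute_deploy_step"))
    print(agent.execute("get_partition_uuids"))  # refused while busy
    print(agent.list_commands())
    agent.finish_current()
    print(agent.execute("get_partition_uuids"))  # accepted now
```

In this model a 409 says nothing about the first command failing; it only means the caller issued a second command too early, which is why the question becomes who sent it and why.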
eanderssonThanks helps a lot.18:13
eanderssonMy current best guess is that this is a new bug with deploy and reboot_requested.18:16
*** bnemec has joined #openstack-ironic18:16
JayFI'm not generally familiar with deploy steps, but if I recall I absolutely saw that (years ago) with clean steps + reboot requested18:17
JayFso you may be on the right track18:17
*** k_mouza has quit IRC18:36
*** k_mouza_ has joined #openstack-ironic18:36
eanderssonInteresting. It looks like it fails when it does not reboot as requested.18:36
*** paras333_ has quit IRC18:39
TheJuliaeandersson: uhh, we should be. This seems really familiar18:53
TheJuliawell heartbeats still occur which can trigger the next step18:53
eanderssonThe only thing we have found so far is that it fails with the above message when the reboot for some unknown reason isn't triggered.18:59
eanderssonAnd our script is dead simple. It's literally a bash script with exit 0.19:00
eanderssonThat gets executed by the IPA19:00
TheJuliawould you be up for talking through it?19:01
* TheJulia thinks more coffee is needed19:01
lbragstadTheJulia o/ it looks like y'all are making some good progress on the secure rbac patches - i only see a few left?19:01
JayFTheJulia: at least for cleaning, heartbeat won't trigger the next step unless the previous one has completed19:01
TheJulialbragstad: yup, we need to do cleanup and likely look at db stuffs later, but yeah19:02
TheJuliathe problem I think is it is getting the 409 on getting the command status19:02
TheJuliabut I'm trying to grok eandersson's exact case because I thought we fixed the bug19:02
JayF Agent command standby.get_partition_uuids for node <uuid> failed. Expected 2xx HTTP status code,19:02
JayF                         | got 409.19:02
JayFSorry, didn't mean to paste that before cleaning it up, but you see ^^ it's not calling for command status19:03
JayFand calling /v1/commands, while a command is running, succeeds (at least in any agent I've tried it on, up to ussuri)19:03
eanderssonhttp://paste.openstack.org/show/OzMCHhFjjCmN0mD3EcV4/19:04
*** k_mouza_ has quit IRC19:04
*** k_mouza has joined #openstack-ironic19:05
eanderssonThis is the log from two runs, one successful and one failure.19:05
eanderssonYou can see that it properly reboots the node when it is successful, but for some reason does not the second time we deploy it.19:05
TheJuliaThis is sounding super deja-vu'ey19:05
lbragstadTheJulia ok - https://review.opendev.org/q/topic:secure-rbac+project:openstack/ironic+status:open is still an accurate list of what needs to land for system-admin, system-reader, project-member, project-reader?19:06
*** tzumainn has quit IRC19:07
TheJulialbragstad: it is, system-[admin, member, reader] are done, it is all project stuffs right now and we've got one more to go which is still in development19:07
TheJulialbragstad: keep in mind, we don't have to have everything merged by m319:07
lbragstadoh - sweet19:07
TheJuliaat least, in ironic we don't have to19:08
lbragstadyeah - and ironic is pretty much an admin-only API19:08
lbragstadright?19:08
TheJuliawell, becoming less and less admin only, especially with this work19:08
TheJuliaplus ironic operates by different release rules19:08
lbragstadok19:08
*** tzumainn has joined #openstack-ironic19:09
TheJuliablarg19:16
TheJuliaeandersson: I see what is going on :(19:19
eanderssonSomething easy to fix? :D19:22
TheJuliamaybe19:23
TheJuliaare you getting a "Conductor attempted to process deploy step" error?19:24
TheJuliahttps://github.com/openstack/ironic/blob/6e0682377ce433e1f9e6acf863e2bf73728a75ae/ironic/conductor/deployments.py#L27319:24
eanderssonI don't but this is Victoria19:25
eanderssonand I don't think that exists in Victoria19:25
eanderssonMaybe > Expected 2xx HTTP status code, got 409. is the same message in Victoria19:26
TheJuliahttps://github.com/openstack/ironic/blob/6e0682377ce433e1f9e6acf863e2bf73728a75ae/ironic/drivers/modules/agent_base.py#L38019:27
TheJuliaare you getting that error that could be raised?19:28
eanderssonI don't see it in the logs at least19:28
*** k_mouza has quit IRC19:29
eanderssonThis causes logs to not be shipped to the ironic server as well19:29
eanderssonand having a difficult time catching it while it is happening19:29
TheJuliaI think our hard failure of things is more all on the ironic side of the universe19:29
TheJuliawe should handle the 40919:29
JayFIs there ever a valid case where we should get a 409 from the  agent?19:30
TheJuliayes, when the agent is still working but is heartbeating19:30
JayFCan you lay out that case explicitly? I can 100% hit /v1/commands while  a command is in progress on the agent.19:31
TheJuliasomewhere we're failing things fairly hardcore and I think I know where19:31
JayFWhich is the only agent endpoint that should be hit while a command is running19:31
TheJuliawell, you can hit it, but you can't ask it to execute the command status command19:31
TheJuliasince that is a separate command19:31
TheJuliaand only one command can run at a time19:31
JayFUh. Let me look at the code19:31
TheJulia++19:31
JayFIIRC we hit /v1/commands (the list endpoint) and take the first value19:31
eanderssonThe error message was confusing to me. Is busy to me just sounds like hey I am still working on this.19:32
TheJuliaeandersson: it likely is19:32
JayFhttps://opendev.org/openstack/ironic/src/branch/master/ironic/drivers/modules/agent_client.py#L253 I see no evidence we ever hit /v1/commands/{command_uuid}19:33
JayFand /v1/commands works when commands are in progress19:33
JayFI'm fairly certain this has to be running multiple actual commands, even if the commands are merely informational (like get_clean_steps, for example)19:33
JayFI'm out this afternoon; but I would love to know where this all leads -- I'll read scrollback but if there are any patches/stories filed, please feel free to ping them to me directly19:34
TheJuliaerbarr: just to confirm, the failure is https://github.com/openstack/ironic/blob/8604f84fd7bda4e30d3f07005c4901f3662303a7/ironic/common/exception.py#L62819:35
openstackgerritJulia Kreger proposed openstack/ironic stable/victoria: Handle agent still doing the prior command  https://review.opendev.org/c/openstack/ironic/+/77823719:37
* TheJulia whistles19:37
TheJuliathat is why it is deja vu19:37
JayF[-] Tried to execute standby.get_partition_uuids, agent is still executing Command name:19:37
JayF    execute_deploy_step, params: {'step': {'interface': 'deploy', 'step': 'write_image',19:37
JayFthat is 100% two commands at the same time19:37
JayFnot just getting a command status19:37
JayF(from the original patch to master)19:37
TheJuliayup19:39
TheJuliaeandersson: if you can try out 778237 and see if that clears up your issue, that would be good19:40
JayFthanks for that, I'll review that older master patch and if the victoria one isn't landed after that, I'll vote on it19:41
eanderssonWe will try it out today19:42
TheJuliaokay19:42
eanderssonBuilding the container now :D19:46
TheJuliaWow, the mobile command center's AC finally turned on19:50
TheJulia(it is hooked up outside the home office window)19:51
eanderssonRebuilding 4 nodes now. Fingerscrossed.20:06
*** juanoterocas has joined #openstack-ironic20:10
eanderssonStill failing unfortunately20:12
TheJulia*sigh*20:12
eanderssonbut it looks different now20:13
stevebakermorning20:13
*** mcarden has joined #openstack-ironic20:14
TheJuliayay 6 rbac tests failing20:15
TheJuliaeandersson: oh?!?20:15
TheJuliaas much detail as possible would be greatly appreciated20:15
eanderssonIt failed and then just stopped this time. Before that patch it kept going.20:15
TheJulia*sigh*20:15
eanderssonI think it's just completely stuck now.20:22
eanderssonIt just failed and gave up20:23
eanderssonYea - the state engine got messed up and just booted back into the original OS20:24
eandersson(since we don't do the clean step at the moment the old OS was still there)20:25
eanderssonI'll try to get you some logs20:26
TheJuliamuch appreciated20:26
TheJuliaI've got calls the next 1.5 hours20:26
TheJuliafwiw20:26
TheJuliain a moment of something surprising, we actually don't call the conductor on patching an allocation20:26
iurygregory<surprise face> O.o20:28
eanderssonhttp://paste.openstack.org/show/hHCvcChRvxTL1xkaZo8W/20:30
eanderssonI noticed this showing up twice. Not sure if it has any significance.20:31
eandersson> Agent on node <node-uuid> returned deploy command success, moving to next step20:31
eanderssonIt almost feels like it is meant to wait for the reboot, but does not wait long enough and just moves on to the next state.20:58
openstackgerritVerification of a change to openstack/ironic failed: Project Scoping Node endpoint  https://review.opendev.org/c/openstack/ironic/+/77392420:59
eandersson90% sure that is what is happening here. It goes to DEPLOYWAIT and then the agent instantly moves to deploying due to some race condition.21:04
eanderssonI almost feel like we are missing something in the state machine to protect it here.21:21
*** hoonetorg has quit IRC21:21
openstackgerritJacob Anders proposed openstack/ironic master: Add support for using NVMe specific cleaning  https://review.opendev.org/c/openstack/ironic/+/77813421:25
janders^ NVMe cleaning doco fixes21:25
jandersgood morning Ironic o/21:25
*** k_mouza has joined #openstack-ironic21:29
iurygregorygood morning janders o/21:30
eanderssonMaybe we need to disable heartbeats before it goes into the waiting state.21:31
jandersgood morning iurygregory o/21:33
*** k_mouza has quit IRC21:34
*** hoonetorg has joined #openstack-ironic21:42
*** gyee has joined #openstack-ironic21:43
TheJuliabrraaains22:03
TheJuliaeandersson: Uhh... hmm22:05
TheJuliaeandersson: I think I understand what is going on and I think it is a variation22:06
TheJuliafor my context, update_firmware is running before the deployment as it is priority 70 correct?22:06
TheJuliaand that is *still* in progress22:06
eanderssonYea22:12
eanderssonIf you look at my logs you can see that there are two different reqs and they happen at almost the same moment (unfortunately I removed timestamp)22:12
*** frigo has joined #openstack-ironic22:13
eanderssonOne puts it into DEPLOY-WAITING and the other almost instantly moves it back into ACTIVE22:13
eanderssonSo it isn't in progress (at least not the firmware call), but the transition from ACTIVE -> DEPLOY-WAITING -> REBOOT -> ACTIVE is still in progress22:14
eanderssonbut because it moves DEPLOY-WAITING to ACTIVE it never has time to trigger the REBOOT22:14
*** lmcgann has quit IRC22:15
TheJuliaerr22:17
TheJuliaThat seems like a distinctly different issue from what I'm thinking22:17
TheJuliabut I think we've got two issues playing together in not fun ways22:18
eanderssonI could be wrong as I base this on reading the logs22:18
TheJuliaI can toss up a patch a little later for what I think I see in the code, but I need to get the current context out of my head first22:18
eanderssonGonna dig into it today as well. We didn't see this in the beginning so not sure what changed. We are thinking maybe rebasing the victoria branch a few weeks ago caused it, but that is just the only known change.22:21
* TheJulia heats up lunch22:21
TheJuliawhat was the beginning?22:21
TheJuliameaning ipa/ironic versions22:21
eanderssonPretty sure the beginning was like Victoria rc1 or 2 of Ironic22:22
eanderssonBefore Victoria was actually released22:23
TheJuliahm, we don't do RC's22:23
TheJuliaahh22:23
TheJuliaso maybe a point release before22:23
TheJuliahmmm22:23
eanderssonWe add our own features to Ironic so we just build based on the stable branches22:26
eanderssonThe version I deployed now does not have any custom features though22:27
TheJuliahmm, could there have been overlapping changes maybe?22:29
TheJuliaI'm thinking there were some last minute things to victoria, but the stable branch hasn't really changed much22:29
TheJuliaand stables are always based on our final cycle release unless "something else" has to happen22:30
eanderssonA possibility is that we just got really lucky as well. As this isn't happening to 100% of the rebuilds.22:30
TheJuliahmm, could be22:31
TheJuliaokay, let me get my current thing out of my head and then I'll put the other patch up22:31
TheJuliathat I think is needed22:31
*** rcernin has joined #openstack-ironic22:33
*** juanoterocas has quit IRC22:50
*** frigo has quit IRC23:00
*** pmannidi has joined #openstack-ironic23:07
*** pmannidi_ has quit IRC23:08
*** zzzeek has quit IRC23:13
*** zzzeek has joined #openstack-ironic23:17
*** k_mouza has joined #openstack-ironic23:30
*** k_mouza has quit IRC23:34
openstackgerritJulia Kreger proposed openstack/ironic master: Project Scoping Node endpoint  https://review.opendev.org/c/openstack/ironic/+/77392423:45
openstackgerritJulia Kreger proposed openstack/ironic master: Port/Portgroup project scoped access  https://review.opendev.org/c/openstack/ironic/+/77546523:45
openstackgerritJulia Kreger proposed openstack/ironic master: Volume targets/connectors Project Scoped RBAC  https://review.opendev.org/c/openstack/ironic/+/77631423:45
openstackgerritJulia Kreger proposed openstack/ironic master: Project scope driver vendor pass-through  https://review.opendev.org/c/openstack/ironic/+/77676723:45
openstackgerritJulia Kreger proposed openstack/ironic master: Follow-up on project scoped trait tests  https://review.opendev.org/c/openstack/ironic/+/77676823:45
openstackgerritJulia Kreger proposed openstack/ironic master: WIP: Allocation support for project scoped RBAC  https://review.opendev.org/c/openstack/ironic/+/77834023:45
TheJuliastevebaker: please take a look at the allocation patch above, still a wip, but it is a drastically different endpoint, so the more eyes the better23:45
