opendevreview | Merged openstack/ironic master: Use monotonic time for hashring reset https://review.opendev.org/c/openstack/ironic/+/882261 | 00:22 |
---|---|---|
stevebaker[m] | Nisha_Agarwal: these three changes have been lined up, which makes stable/zed green. I don't actually have approve rights on that repo so if there are active reviewers other than TheJulia can you let them know? https://review.opendev.org/c/x/proliantutils/+/882510 | 01:57 |
stevebaker[m] | oh TheJulia doesn't have +2 either? | 01:57 |
TheJulia | I have no rights there | 01:58 |
opendevreview | Merged openstack/metalsmith stable/yoga: Get ports by 'binding:host_id' query filter https://review.opendev.org/c/openstack/metalsmith/+/877633 | 02:49 |
opendevreview | Merged openstack/metalsmith stable/xena: Get ports by 'binding:host_id' query filter https://review.opendev.org/c/openstack/metalsmith/+/877634 | 02:49 |
opendevreview | Merged openstack/metalsmith stable/wallaby: Get ports by 'binding:host_id' query filter https://review.opendev.org/c/openstack/metalsmith/+/877635 | 02:52 |
opendevreview | Merged openstack/ironic master: Fix api-ref v1-indicators https://review.opendev.org/c/openstack/ironic/+/882620 | 03:04 |
opendevreview | Verification of a change to openstack/ironic master failed: Remove use of nomodeset by default https://review.opendev.org/c/openstack/ironic/+/881576 | 03:10 |
JayF | stevebaker[m]: I emailed the core team recently, they promised to review many items. I'd suggest emailing them directly. DM me tomorrow (when I'm on my work laptop) and I can send you the email of the person who responded, so you can follow up. | 03:56 |
stevebaker[m] | Ok thanks | 03:57 |
opendevreview | OpenStack Proposal Bot proposed openstack/ironic-inspector master: Imported Translations from Zanata https://review.opendev.org/c/openstack/ironic-inspector/+/882656 | 04:01 |
opendevreview | OpenStack Proposal Bot proposed openstack/ironic-ui master: Imported Translations from Zanata https://review.opendev.org/c/openstack/ironic-ui/+/882662 | 04:22 |
opendevreview | OpenStack Proposal Bot proposed openstack/ironic master: Imported Translations from Zanata https://review.opendev.org/c/openstack/ironic/+/882666 | 04:29 |
opendevreview | Merged openstack/ironic master: Remove autocommit, again. https://review.opendev.org/c/openstack/ironic/+/862832 | 04:46 |
opendevreview | Merged openstack/ironic-inspector master: Imported Translations from Zanata https://review.opendev.org/c/openstack/ironic-inspector/+/882656 | 04:46 |
opendevreview | Merged openstack/ironic master: Imported Translations from Zanata https://review.opendev.org/c/openstack/ironic/+/882666 | 05:06 |
opendevreview | Verification of a change to openstack/ironic master failed: Support longer checksums for redfish firmware upgrade https://review.opendev.org/c/openstack/ironic/+/882163 | 05:21 |
opendevreview | Merged openstack/bifrost master: Remove extra symbols accidentally added https://review.opendev.org/c/openstack/bifrost/+/879547 | 06:09 |
opendevreview | Merged openstack/ironic master: Remove use of nomodeset by default https://review.opendev.org/c/openstack/ironic/+/881576 | 06:29 |
rpittau | good morning ironic! o/ | 06:37 |
MikeCTZA | been a while since I've been here on IRC ... made progress with our ironic deployment and now have about 60 nodes in production, not as many as some but it's good for us | 07:02 |
MikeCTZA | having a problem with Dell servers and UEFI boot, was wondering if anyone had experience and could give any advice? | 07:03 |
opendevreview | Verification of a change to openstack/ironic master failed: Support longer checksums for redfish firmware upgrade https://review.opendev.org/c/openstack/ironic/+/882163 | 08:02 |
Sandzwerg[m] | <MikeCTZA> "having a problem with Dell..." <- What kind of issue? Also what nodes and which firmware version do you use? | 08:12 |
MikeCTZA | we have tried it with R640 and another model, it crashes the BIOS, they are all updated to the latest f/w | 08:12 |
Sandzwerg[m] | They crash? Haven't seen that yet. At which step? | 08:14 |
MikeCTZA | can I share a link to a video showing it? | 08:14 |
Sandzwerg[m] | Sure | 08:15 |
MikeCTZA | https://www.dropbox.com/s/5jbn1qpylxaevqb/uefiboot2.mov?dl=0 | 08:15 |
MikeCTZA | this box isn't 100% updated firmware but the other was, we were just trying various permutations | 08:16 |
MikeCTZA | all of our ironic nodes we have deployed with BIOS as the boot mode, but I see that from Yoga (we are on Xena) the default is UEFI, so for the future we want to get this going. Also, newer boxes prefer UEFI; I can't even get a manual deploy working on our new R6625, which is what started this process | 08:22 |
Sandzwerg[m] | Hmm you seem to use the grub mode to deploy, we use a ram disk that writes the image to disk. Haven't seen that issue there. But our Dells are also a different version. Sorry but I have no advice here | 08:53 |
opendevreview | Harald Jensås proposed openstack/ironic-tempest-plugin master: Fix rbac indicator tests https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/882619 | 08:53 |
jrosser | MikeCTZA: we have r730 (old but they were available) in a test lab setup here and the idrac driver has been horribly unreliable, eventually getting the BMC into some weird state that means ironic can’t communicate with it | 08:57 |
jrosser | this week we will revert it all to the ipmi driver and see if we have better luck with that | 08:58 |
MikeCTZA | Sandzwerg[m] not used that method/seen that, can you elaborate on how that works? we have the UEFI method working on our test boxes which are R630 older servers, it's a diff deployment but mirrors our prod setup | 09:01 |
opendevreview | Maksim Malchuk proposed openstack/bifrost master: Create the log file for the disk-image-create command https://review.opendev.org/c/openstack/bifrost/+/822895 | 09:03 |
opendevreview | Merged openstack/ironic-python-agent master: Add support for CentOS SUM files https://review.opendev.org/c/openstack/ironic-python-agent/+/882152 | 09:03 |
hjensas | rpittau: our conflicting srbac tempest, I fixed the api-ref last night - https://review.opendev.org/c/openstack/ironic/+/882620 - do you want to test indicators/{component} specifically? Or should we just go with the change that gets {indicator}@{component}? | 09:08 |
opendevreview | Harald Jensås proposed openstack/ironic-tempest-plugin master: Advance tempest plugin tests to Zed (mostly) https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/882311 | 09:17 |
opendevreview | Harald Jensås proposed openstack/ironic-tempest-plugin master: Fix rbac tests, take 2 https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/882452 | 09:17 |
opendevreview | Harald Jensås proposed openstack/ironic-tempest-plugin master: Fix rbac indicator tests https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/882619 | 09:17 |
opendevreview | Harald Jensås proposed openstack/ironic-tempest-plugin master: Add RBAC specific tempest jobs to gate plugin https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/882312 | 09:17 |
hjensas | rpittau: See comments on - https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/882452 - I will rebase without your change and propose a patch to api-ref doc to address it there. | 09:38 |
opendevreview | Harald Jensås proposed openstack/ironic-tempest-plugin master: Fix rbac indicator tests https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/882619 | 09:53 |
opendevreview | Harald Jensås proposed openstack/ironic-tempest-plugin master: Add RBAC specific tempest jobs to gate plugin https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/882312 | 09:53 |
dtantsur | TheJulia: hey, some clarification on the md5 deprecation. Is your intent to remove it completely, even from os_hash_algo (not just from image_checksum)? This was not done in the IPA patch, hence the question. | 10:43 |
opendevreview | Dmitry Tantsur proposed openstack/ironic stable/2023.1: Handle MissingAttributeError when using OOB inspections to fetch MACs https://review.opendev.org/c/openstack/ironic/+/882522 | 10:54 |
opendevreview | Dmitry Tantsur proposed openstack/ironic stable/zed: Handle MissingAttributeError when using OOB inspections to fetch MACs https://review.opendev.org/c/openstack/ironic/+/882523 | 10:54 |
opendevreview | Dmitry Tantsur proposed openstack/ironic bugfix/21.3: Handle MissingAttributeError when using OOB inspections to fetch MACs https://review.opendev.org/c/openstack/ironic/+/882524 | 10:54 |
opendevreview | Dmitry Tantsur proposed openstack/ironic bugfix/21.2: Handle MissingAttributeError when using OOB inspections to fetch MACs https://review.opendev.org/c/openstack/ironic/+/882525 | 10:55 |
opendevreview | Dmitry Tantsur proposed openstack/ironic bugfix/21.0: Handle MissingAttributeError when using OOB inspections to fetch MACs https://review.opendev.org/c/openstack/ironic/+/882526 | 10:55 |
dtantsur | janders: ^^^ | 10:55 |
iurygregory | good morning Ironic | 11:24 |
Nisha_ | o/ ironic!!! | 11:26 |
Nisha_ | dtantsur, hi... we were doing and backporting the tox4 changes for proliantutils....we see in ironic it has been backported till stable/train release....any specific reason? | 11:28 |
Nisha_ | dtantsur, i referred to this patch https://review.opendev.org/c/openstack/ironic/+/876409 | 11:29 |
dtantsur | Nisha_: I think it was because tox 4.0 is used everywhere in the CI | 11:29 |
Nisha_ | openstack main support is beyond 3 releases? | 11:30 |
Nisha_ | dtantsur, no other reason? if not then it would be ok for us(proliantutils) to support tox 4.0 only till yoga? or it is required for us also to support till train? | 11:32 |
dtantsur | Nisha_: it's up to you. tox is only used for testing, after all. | 11:32 |
Nisha_ | dtantsur, ok | 11:33 |
Sandzwerg[m] | <MikeCTZA> "Sandzwerg not used that method/..." <- It's the default IPMI/pxe/tftp Setup | 11:39 |
MikeCTZA | thanks, maybe we are just doing things non-standard I found the docs (https://docs.openstack.org/ironic/xena/admin/ramdisk-boot.html) on it so will give it a read and check that option out as it could help us | 11:40 |
Sandzwerg[m] | No that's not it. I think our setup is not that different but you seem not to boot a deploy ramdisk with IPA but something else | 11:43 |
MikeCTZA | ah we create our ironic nodes with openstack baremetal node create --driver ipmi --name $HOSTNAME --driver-info ipmi_port=623 --driver-info ipmi_username=root --driver-info ipmi_password='ourpassword' --driver-info ipmi_address=$MGMTIP --resource-class baremetal-resource-class --property cpus=32 --property memory_mb=256000 --property local_gb=20 --property cpu_arch=x86_64 --driver-info deploy_ramdisk=$(openstack image | 11:50 |
MikeCTZA | show deploy-initrd3 -f value -c id) --driver-info deploy_kernel=$(openstack image show deploy-vmlinuz -f value -c id) | 11:50 |
MikeCTZA | the images we are using there we obtained from https://tarballs.opendev.org/openstack/ironic-python-agent/dib/files/ - I've even tried the latest ones (centos9), which I wasn't sure would work on Xena but tried them anyway | 11:56 |
rpittau | hjensas: I will abandon mine, no problem | 11:57 |
hjensas | rpittau: np, pita when docs don't match actual implementation. :( | 11:57 |
rpittau | very quick approval if anyone has a moment, thanks! https://review.opendev.org/c/openstack/virtualpdu/+/881483 | 12:33 |
opendevreview | Merged openstack/virtualpdu master: Use openstackdocs theme for docs https://review.opendev.org/c/openstack/virtualpdu/+/881483 | 12:46 |
Sandzwerg[m] | <MikeCTZA> "the images we used we are..." <- Interesting. We don't use these images but build our own, but that shouldn't change anything fundamentally. I'm a little bit confused by the two grub screens one after the other. Does the second still belong to the deployment? Then your IPA image seems to have an issue | 13:01 |
MikeCTZA | Sandzwerg[m] wouldn't think it would be a big difference. I've not watched the boot in detail but I'm pretty sure we don't see either screen with those; I had never seen them before I was testing the UEFI process. I think I need to do a bit more playing around to see what's going on. I'll come back and ask more/give feedback once I have had time to check it a bit more | 13:07 |
opendevreview | Harald Jensås proposed openstack/ironic master: Remove indicators list by component from api-ref https://review.opendev.org/c/openstack/ironic/+/882710 | 13:33 |
TheJulia | Good morning | 14:06 |
TheJulia | MikeCTZA: as an aside, has anyone mentioned portfast? | 14:11 |
TheJulia | Sandzwerg[m]: is secure boot enabled? | 14:12 |
TheJulia | err, MikeCTZA is secure boot disabled | 14:12 |
TheJulia | dtantsur: my intent is to move it to a default state where it cannot be used by default any longer, which means even using the hash algo it will be rejected at some point unless operators update to a newer checksum algorithm | 14:13 |
dtantsur | TheJulia: okay, just needed to confirm that. thanks! (and good morning) | 14:14 |
TheJulia | good morning! | 14:14 |
* TheJulia is still waking up | 14:14 | |
Sandzwerg[m] | Ah secure boot is a good idea | 14:20 |
zigo | What does "Failed to prepare node <UUID> for cleaning: No available PXE-enabled ports on node <UUID>." mean, and how do I fix it? | 14:37 |
zigo | TheJulia: JayF: BTW, I'm going to Vancouver ! :) | 14:37 |
zigo | Oh, just found out... | 14:44 |
zigo | Gosh, things are so manual with Ironic ! :) | 14:45 |
zigo | When setting --driver-info deploy_kernel=file:///images/deploy.vmlinuz, this path is relative to what? The tftp server?!? | 14:48 |
TheJulia | zigo: awesome | 14:57 |
TheJulia | zigo: you need a pxe enabled port :) | 14:57 |
TheJulia | zigo: you need a port in general as well, there needs to be at least one to setup for pxe booting | 14:57 |
TheJulia | oh, you got it | 14:57 |
zigo | Yeah, did that, but now: "Failed to prepare node <UUID> for cleaning: statvfs: path should be string, bytes, os.PathLike or integer, not NoneType" | 14:57 |
TheJulia | I think it is absolute path, tbh | 14:58 |
zigo | How does Ironic do? | 14:58 |
TheJulia | we *generally* expect you to provide a URL | 14:58 |
TheJulia | or like a glance image | 14:58 |
TheJulia | using files on filesystem creates pain cross-conductor | 14:58 |
zigo | I did upload to Glance, and set the UUID in the ironic.conf ... | 14:59 |
zigo | So I don't get why I also need to do --driver-info ... | 14:59 |
TheJulia | you shouldn't need to set it there | 14:59 |
zigo | Now... ironic.common.exception.ImageRefValidationFailed: Validation of image href /srv/tftp/vmlinuz-6.1.0-8-amd64 failed, reason: Scheme-less image href is not a UUID. | 15:00 |
zigo | :/ | 15:00 |
TheJulia | wait... wut | 15:00 |
TheJulia | what do you have each setting set to? | 15:01 |
zigo | I had it set to http://10.4.38.13:8088/vmlinuz-6.1.0-8-amd64 | 15:02 |
zigo | But it didn't work ... :( | 15:02 |
zigo | Now I can't unset... | 15:03 |
zigo | Ah, unset worked... | 15:04 |
zigo | Now I get this again (after unset...): https://paste.opendev.org/show/bgYudK0QkpYS4E1ssI2a/ | 15:06 |
zigo | TypeError: statvfs: path should be string, bytes, os.PathLike or integer, not NoneType | 15:06 |
zigo | I can't use the file backend of glance or what?!? | 15:07 |
zigo | Must it be with Swift? | 15:07 |
zigo | (so then Ironic can use an URL...) | 15:07 |
zigo | Oh, the issue looks like being the Ironic cache, no? | 15:08 |
TheJulia | yes, by default we keep a folder to cache items so we don't download them over and over | 15:13 |
TheJulia | I'm guessing that is not configured ? | 15:13 |
TheJulia | or is explicitly None'd | 15:13 |
TheJulia | ? | 15:13 |
zigo | Oh, node is booting ... | 15:17 |
zigo | This looks nicer. | 15:17 |
TheJulia | oh good | 15:19 |
zigo | TheJulia: Ironic wrote into my /srv/tftp folder, but the server can't pass grub ... | 15:20 |
zigo | Grub has: configfile /srv/tftp/$net_default_mac.conf | 15:20 |
zigo | Nothing more... | 15:20 |
zigo | Then when I press enter, nothing happens. | 15:20 |
zigo | It just goes back to the grub menu. | 15:20 |
TheJulia | did Ironic create the folder? | 15:21 |
zigo | It created /srv/tftp/<UUID> yes... | 15:22 |
TheJulia | well, and the link from $net_default_mac.conf to uuid/config ? | 15:22 |
zigo | Yeah. | 15:22 |
zigo | But still... | 15:22 |
zigo | My server isn't booting :( | 15:22 |
TheJulia | tftp log files ? | 15:22 |
TheJulia | since.... you're using grub | 15:22 |
zigo | Oh, I know what's wrong. | 15:23 |
zigo | It's linking to /srv/tftp/<MAC>, though it should only be /<MAC> from the point of view of TFTP. | 15:23 |
zigo | Where is this configured ? | 15:24 |
zigo | Also, I need to add extra params for my debian-live image ... | 15:25 |
*** dmellado1 is now known as dmellado | 15:25 | |
TheJulia | ... wait, it should be symlinks so the tftpclient shouldn't be exposed to it unless it is chrooted in to just /srv/tftp ? | 15:26 |
zigo | Well, my /etc/default/tftpd-hpa has TFTP_DIRECTORY="/srv/tftp", so my clients only see what's in it, but can't see the toplevel rootfs ... | 15:29 |
TheJulia | ... so... hmm | 15:29 |
TheJulia | so this is an old tftp issue | 15:29 |
TheJulia | but I think we don't see it because most people run grub and it knows the root mapping | 15:29 |
zigo | I'll find out... :) | 15:32 |
zigo | I'm kind of super close to a working setup, I believe. | 15:32 |
TheJulia | you are | 15:33 |
zigo | Is it IPA that does the node cleaning job too? | 15:33 |
* TheJulia whispers "dnsmasq's tftp service" | 15:33 | |
TheJulia | yes | 15:34 |
zigo | Ah, shit, still got the SSL error with IPA, I need to fix my PKI, somehow... | 15:37 |
zigo | Node is cleaning (I can see it's running shred...). | 15:42 |
TheJulia | \o/ | 15:42 |
TheJulia | you may want to explore tuning that if it is a nvme device, fwiw | 15:43 |
rpittau | good night! o/ | 16:03 |
sean-k-mooney | TheJulia: dansmith i took a look at the ironic grenade job. | 16:40 |
JayF | oooh, fun, did you find anything interesting? | 16:41 |
sean-k-mooney | the server create happens here at 11:06:34 https://zuul.openstack.org/build/d4c556c7b254451cad14489ba90f0563/log/controller/logs/grenade.sh_log.txt#1665 the nova-api then starts processing it at 11:06:36 https://zuul.openstack.org/build/d4c556c7b254451cad14489ba90f0563/log/controller/logs/screen-n-api.txt#2707 | 16:41 |
sean-k-mooney | where it calls neutron | 16:41 |
sean-k-mooney | to validate networks, ports, quotas, etc. | 16:42 |
sean-k-mooney | https://zuul.openstack.org/build/d4c556c7b254451cad14489ba90f0563/log/controller/logs/screen-q-svc.txt#3714 | 16:42 |
sean-k-mooney | the last request in the neutron log with that request id is 11:06:37.042222 | 16:42 |
sean-k-mooney | but there is nothing in the nova api log after 11:06:46 | 16:43 |
TheJulia | for the neutron ports by tenant id? | 16:43 |
sean-k-mooney | i was using the request-id to correlate the nova api request to the neutron ones | 16:43 |
sean-k-mooney | we can see this log in the nova api logs https://github.com/openstack/nova/blob/0c397d60e79e47da05fd9dcee173514b2b8dc2cc/nova/network/neutron.py#L2636 | 16:44 |
* TheJulia guesses likely then, since that is the last query to neutron which we've seen requests come in for | 16:44 | |
TheJulia | Yeah, we've seen it get past https://github.com/openstack/nova/blob/0c397d60e79e47da05fd9dcee173514b2b8dc2cc/nova/network/neutron.py#L2654 based upon the queries/responses | 16:44 |
TheJulia | well, to that line, beyond that in nova-compute is unknown | 16:45 |
TheJulia | err | 16:45 |
TheJulia | nova-api | 16:45 |
sean-k-mooney | ya so the next step should be calling glance to check the image | 16:46 |
TheJulia | I don't *think* I've spotted that, tbh | 16:46 |
sean-k-mooney | i'm going to check it now and see but it looks almost like nova-api did not get resumed after it called neutron via neutron client | 16:47 |
samuelkunkel[m] | <TheJulia> "samuelkunkel: o/ did you ever..." <- Ah just saw your Question. Sorry was on a longer „sick-spree“. | 16:49 |
samuelkunkel[m] | Are you refering to the broken HPE RL300 with the Ampere CPU? | 16:49 |
sean-k-mooney | there are requests for the image at the right time in glance https://zuul.openstack.org/build/d4c556c7b254451cad14489ba90f0563/log/controller/logs/screen-g-api.txt#1485 | 16:50 |
opendevreview | Julia Kreger proposed openstack/ironic master: Fix self_owned_node policy check https://review.opendev.org/c/openstack/ironic/+/882597 | 16:51 |
TheJulia | sean-k-mooney: my feeling was the same, which seems... weird in itelf | 16:52 |
TheJulia | itself | 16:52 |
* TheJulia needs more coffee | 16:52 | |
TheJulia | samuelkunkel[m]: yes, I went ahead and revised the sushy patch anyway | 16:52 |
sean-k-mooney | TheJulia: im wondering if this is somehow related to the db profiling | 16:55 |
sean-k-mooney | the last thing in the nova log is | 16:55 |
sean-k-mooney | DEBUG dbcounter [-] [85203] Writing DB stats nova_api:SELECT=4 {{(pid=85203) stat_writer /usr/local/lib/python3.10/dist-packages/dbcounter.py:114}} | 16:55 |
TheJulia | oslo_db.exception.DBError: (sqlite3.InterfaceError) Cursor needed to be reset because of commit/rollback and can no longer be fetched from. <-- *ugh* (unrelated to current discussion, related to jammy most likely) | 16:56 |
dansmith | easy to disable the db profiling to confirm | 16:58 |
samuelkunkel[m] | TheJulia: yes both are running again. Shall I test your revised patch? | 16:58 |
TheJulia | samuelkunkel[m]: I only added unit tests, but you said you wanted to test one more time to be sure, it is all good | 16:59 |
TheJulia | ... can't be worse than the vendor which elicits an HTTP Error 428 now.. *blink* *blink* | 16:59 |
sean-k-mooney | we would need to disable these plugins here right https://zuul.openstack.org/build/d4c556c7b254451cad14489ba90f0563/log/controller/logs/etc/nova/nova_conf.txt#53-57 | 17:01 |
samuelkunkel[m] | Yes, that was indeed my plan. I guess its not merged yet (so I could just backport it) ;) | 17:02 |
dansmith | sean-k-mooney: there's a devstack flag to turn it off | 17:02 |
sean-k-mooney | MYSQL_GATHER_PERFORMANCE | 17:02 |
dansmith | yup | 17:02 |
sean-k-mooney | yep just found it | 17:03 |
sean-k-mooney | https://github.com/openstack/devstack/blob/34afa91fc9f830fc8e1fdc4d76e7aa6d4248eaaa/lib/databases/mysql#L252C13-L256 | 17:03 |
sean-k-mooney | i guess that is on by default? | 17:03 |
dansmith | it is | 17:03 |
dansmith | you're thinking what - that we're blocked on mysql at some point? | 17:03 |
sean-k-mooney | ya the stat reporting is happening right around when i would expect us to be looking up the keypairs under a reader lock | 17:04 |
sean-k-mooney | so i don't know if it could be causing an issue or not | 17:04 |
dansmith | yeah, it's just that we're doing that against a different table and I think that log happens after we've done it | 17:04 |
dansmith | but it's certainly worth a try :) | 17:05 |
sean-k-mooney | part of the problem is that particular part of the code has almost no logs | 17:05 |
samuelkunkel[m] | <TheJulia> "samuelkunkel: I only added..." <- Also we have other gen 11 HPE incoming with ILO 6 (AMD Genoa) so your fix will be of good use <3 | 17:06 |
sean-k-mooney | it's hanging somewhere here https://github.com/openstack/nova/blob/0c397d60e79e47da05fd9dcee173514b2b8dc2cc/nova/compute/api.py#L1080-L1115 assuming the process is not just suspended by apache/uwsgi waiting on a request, but i think it finished interacting with neutron | 17:06 |
sean-k-mooney | so we probably need to add some debug lines there to narrow down exactly where it's failing, as many of those functions have no log lines | 17:07 |
TheJulia | sean-k-mooney: I was thinking exactly the same thing when I was looking at it friday morning | 17:11 |
sean-k-mooney | realistically i think we would need to instrument the code with more logging to really figure out what's happening | 17:13 |
sean-k-mooney | i didn't check but i take it this is after the initial install and before the upgrade has run? | 17:14 |
TheJulia | correct | 17:14 |
sean-k-mooney | ok so we would have to propose a patch to stable and make grenade use that patch as its base for that to help | 17:14 |
TheJulia | or... modify grenade to pull in a patch directly | 17:15 |
sean-k-mooney | well zuul/Depends-On might be able to do this for us but you mean just have it apply the patch to add logs | 17:16 |
sean-k-mooney | honestly i didn't see much else odd so it might be worth disabling the db profiling to see if the issue goes away | 17:17 |
sean-k-mooney | if it does not then we would need to modify nova to investigate. adding more logs there is probably not a bad idea anyway | 17:18 |
opendevreview | Julia Kreger proposed openstack/ironic master: DNM: Disable mysql counters for grenade https://review.opendev.org/c/openstack/ironic/+/882731 | 17:21 |
TheJulia | we've had luck shine upon us the last 24 hours, so we shall see I guess :) | 17:21 |
opendevreview | Julia Kreger proposed openstack/ironic master: DNM: Don't return the in-flight SQL handler https://review.opendev.org/c/openstack/ironic/+/882732 | 17:24 |
TheJulia | I knew I should have fixed those last week... | 17:24 |
opendevreview | Julia Kreger proposed openstack/sushy stable/2023.1: Retry on ilo state error https://review.opendev.org/c/openstack/sushy/+/882528 | 18:33 |
opendevreview | Julia Kreger proposed openstack/ironic master: Don't return the in-flight SQL handler https://review.opendev.org/c/openstack/ironic/+/882732 | 19:19 |
opendevreview | Julia Kreger proposed openstack/sushy stable/zed: Retry on ilo state error https://review.opendev.org/c/openstack/sushy/+/882746 | 19:24 |
opendevreview | Julia Kreger proposed openstack/sushy stable/yoga: Retry on ilo state error https://review.opendev.org/c/openstack/sushy/+/882747 | 19:31 |
opendevreview | Julia Kreger proposed openstack/sushy stable/xena: Retry on ilo state error https://review.opendev.org/c/openstack/sushy/+/882748 | 19:32 |
opendevreview | Julia Kreger proposed openstack/sushy stable/wallaby: Retry on ilo state error https://review.opendev.org/c/openstack/sushy/+/882749 | 19:32 |
TheJulia | my ilo karma increases :) | 20:12 |
TheJulia | so it looks like neutron is no longer compatible with postgres | 20:18 |
TheJulia | https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_291/882164/1/check/ironic-tempest-pxe_ipmitool-postgres/291f2f8/controller/logs/screen-q-svc.txt | 20:18 |
sschmitt_ | Quick networking question: when deploying nodes with the flat networking type, you have the provider network that nodes are supposed to PXE boot on. Do people tend to also put their BMCs in an IP on this network? Since the provider network interface on your controller can't have an IP address, do you have another interface with a proper IP on it for BMC communication? | 20:42 |
TheJulia | we advise to never do so | 20:44 |
TheJulia | we advise to put BMCs on entirely separate networks, with access routed through some sort of ACL rules | 20:45 |
JayF | ++ | 20:45 |
sschmitt_ | Gotcha that makes sense | 20:45 |
TheJulia | because you never want anyone but the operator to touch the BMC directly | 20:45 |
TheJulia | that whole... soft underbelly of the host | 20:45 |
sschmitt_ | Can the interface given to inspector's dnsmasq be the same as the one used for the provider network? | 20:46 |
TheJulia | in theory, yes | 20:48 |
TheJulia | it needs an IP address though | 20:48 |
TheJulia | that may, or may not be an issue for you | 20:48 |
sschmitt_ | sounds like im gonna need a bunch more interfaces on my controllers | 20:55 |
TheJulia | it doesn't need to be dedicated, you can create a bond and leverage it | 20:56 |
TheJulia | or a vlan | 20:56 |
TheJulia | well, flat networking is... | 20:56 |
TheJulia | it is what it is | 20:56 |
sschmitt_ | yeah thats what I am doing already | 20:57 |
TheJulia | but some folks have done dynamic networking and put the new machines on so they end up with vlan 1 | 20:57 |
TheJulia | and then everything ends up getting assigned off to elsehwere | 20:57 |
TheJulia | elsewhere | 20:57 |
sschmitt_ | eventually this will be neutron based instead of flat, but trying to test out latest versions with OVN before going down the switch interaction path | 20:57 |
* TheJulia can't think of the vendor-esq name for a default vlan | 20:57 | |
TheJulia | ++ | 20:58 |
TheJulia | good plan | 20:58 |
samuelkunkel[m] | A vendor esq name? Of putting all ports in native vlan 1? | 21:03 |
JayF | many of them would call a default vlan the untagged vlan (vs tagged vlans) | 21:11 |
opendevreview | Verification of a change to openstack/ironic master failed: Support longer checksums for redfish firmware upgrade https://review.opendev.org/c/openstack/ironic/+/882163 | 21:14 |
JayF | TheJulia: did you reach out to neutron folks about it? | 21:22 |
JayF | TheJulia: if intended; we just have to drop the job :| | 21:23 |
TheJulia | I asked in channel, no response | 21:23 |
TheJulia | if I don't have any replies tomorrow, I'll send an email | 21:23 |
JayF | ack; I'm in there now I'll see if someone responds | 21:23 |
JayF | thanks for taking care of it | 21:23 |
JayF | I've been ill a bit this week | 21:23 |
JayF | trying to get useful things done anyway :) | 21:23 |
opendevreview | Julia Kreger proposed openstack/ironic-specs master: Follow-up on DPU Management Change https://review.opendev.org/c/openstack/ironic-specs/+/882760 | 21:25 |
opendevreview | Merged openstack/ironic-tempest-plugin master: Advance tempest plugin tests to Zed (mostly) https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/882311 | 21:26 |
TheJulia | \o/ | 21:36 |
TheJulia | at least CI seems to be slowly moving to happier overall | 21:36 |
JayF | Hard for it to have gotten any worse (while we actually made progress lol) | 21:37 |
TheJulia | yeah | 21:38 |
opendevreview | Merged openstack/ironic-specs master: Framework for DPU management/orchustration https://review.opendev.org/c/openstack/ironic-specs/+/874189 | 21:39 |
TheJulia | suddenly, I'm very worried about our level of testing around the deploy template db table | 22:10 |
JayF | what's got you worried? | 22:11 |
TheJulia | added columns | 22:12 |
TheJulia | nothing went boom | 22:12 |
JayF | oh that's always a bad feeling | 22:12 |
* TheJulia is going to go to the post office, and revisit | 22:12 | |
* TheJulia has new cool light led lights to pickup | 22:12 | |
TheJulia | okay, two failing tests now, I feel slightly better | 22:30 |
opendevreview | Merged openstack/ironic master: Support longer checksums for redfish firmware upgrade https://review.opendev.org/c/openstack/ironic/+/882163 | 23:45 |
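
A note on the TFTP path mismatch zigo diagnosed at 15:23: the generated grub config referenced `/srv/tftp/<MAC>`, while a tftpd-hpa daemon started with `TFTP_DIRECTORY="/srv/tftp"` only exposes paths below that directory, so the client needs `/<MAC>`. The Python sketch below only illustrates that translation. The function name `tftp_client_path` and the example MAC are made up for illustration; this is not Ironic code and not necessarily how Ironic resolves the path.

```python
import posixpath


def tftp_client_path(fs_path: str, tftp_root: str = "/srv/tftp") -> str:
    """Map an absolute filesystem path to the path a TFTP client must request.

    Hypothetical helper: assumes the TFTP daemon serves only files below
    tftp_root and presents that directory to clients as "/".
    """
    fs_path = posixpath.normpath(fs_path)
    tftp_root = posixpath.normpath(tftp_root)
    # Refuse anything the daemon could not serve from its root directory.
    if posixpath.commonpath([fs_path, tftp_root]) != tftp_root:
        raise ValueError(f"{fs_path} is outside the TFTP root {tftp_root}")
    rel = posixpath.relpath(fs_path, tftp_root)
    return "/" if rel == "." else "/" + rel


# Example MAC address is a placeholder, not taken from the log.
print(tftp_client_path("/srv/tftp/52:54:00:12:34:56.conf"))
```

A path that falls outside the root, such as `/etc/passwd`, raises `ValueError` instead of being silently rewritten, which makes a misconfigured root easier to spot.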