Wednesday, 2021-08-04

*** rpittau|afk is now known as rpittau07:13
lyarwoodFWIW https://review.opendev.org/c/openstack/nova/+/803322 has hit both https://bugs.launchpad.net/nova/+bug/1938021 and https://bugs.launchpad.net/nova/+bug/1936849 now, I've not had a chance to look at these really so if anyone has time this morning please take a look.08:06
gibilyarwood: ack, those bugs are on my radar too, but I have conflicting priorities :/ 08:50
gibiI will try to look at them this week08:50
* kashyap wonders if it's worth it at all to do the tedious deprecation stuff for floppy, instead of just ripping it out. I looked at tens of thousands of guests from OSP data I have, and I have not seen a single floppy disk, floppy bus, or floppy boot device.08:53
bauzaslyarwood: i can try to take a look08:54
bauzasin the meantime, can someone explain to me the weird issue with mypy in https://review.opendev.org/c/openstack/nova/+/802918/7/nova/virt/libvirt/driver.py#511 ?08:55
bauzasoh, because I used a lambda function08:55
* bauzas facepalms08:56
bauzasgibi: any idea how I could tell mypy to *not* ask for a type annotation for a lambda function ?08:57
gibibauzas: either you give it a ty.Any annotation08:59
gibior08:59
gibithere is some comment to silence mypy...08:59
gibi# type: ignore08:59
gibibauzas: why don't you want to define the type? is it a complicated one?08:59
bauzasgibi: I don't get why mypy doesn't ask for an annotation here: https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L47409:00
bauzasand why it asks for https://review.opendev.org/c/openstack/nova/+/802918/7/nova/virt/libvirt/driver.py#51109:00
bauzasoh, the type annotation is for the new attribute09:03
bauzasI guess because mypy can't statically guess it09:03
bauzasas it uses a lambda function09:03
gibibauzas: you can check what mypy can guess with reveal_type(<variable>)09:03
gibias far as I see, local variables are only resolved if the function is resolved by mypy09:04
gibiso if the function has no annotation on the signature then mypy will not check the locals09:04
bauzaswhich is the case for __init__09:05
bauzasbut09:05
gibiwould be interesting to see what type self._sysinfo_serial_func got from mypy09:05
gibibut anyhow self.mdev_class_mapping is a Dict[str, str] isn't it?09:06
bauzaslet me run reveal_type on both mdev_class_mapping and pgpu_class_mapping09:06
bauzasgibi: exactly, the annotation is simple09:06
bauzashum, very interesting09:08
bauzasgibi: https://paste.opendev.org/show/807878/09:10
bauzasif I ask reveal_type() this forces mypy to introspect local variables09:10
bauzasso I guess we now need to add annotations every time09:11
opendevreviewSylvain Bauza proposed openstack/nova master: Provide the mdev class for every PCI device  https://review.opendev.org/c/openstack/nova/+/80291809:14
opendevreviewSylvain Bauza proposed openstack/nova master: Provide and use other RCs for mdevs if needed  https://review.opendev.org/c/openstack/nova/+/80323309:14
opendevreviewSylvain Bauza proposed openstack/nova master: Expose the mdev class  https://review.opendev.org/c/openstack/nova/+/80174309:14
opendevreviewSylvain Bauza proposed openstack/nova master: WIP: Cleanup GPU vs. mdev wording  https://review.opendev.org/c/openstack/nova/+/80337909:14
gibibauzas: I think that only means that mypy did not assign any type to either of those variables, but mypy needed some type for mdev_class_mapping for something else later, so mypy asked for the type hint from you. When you added reveal_type you forced mypy to try to assign a type to both variables, hence it asked for type hints for both from you09:21
bauzasgibi: yup, that's what I found09:22
bauzasstatic typing, my love09:22
bauzasanyway, this is fixed in the last rev, I added an annotation09:22
gibiack09:23
gibiI'm done with the re-review of the mdev series so far so good :)09:25
bauzassee, I haven't grumbled about mypy 09:25
bauzassure, it's important to tell mypy that a variable using a collections.defaultdict is a dict :p09:26
bauzasjust in case people don't know :D09:26
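A minimal sketch of the mypy behaviour discussed above, using made-up names rather than nova's real driver code: mypy only type-checks a function body when the signature is annotated, and it cannot infer a type for an attribute assigned from a lambda-built defaultdict, so an explicit annotation (or a "# type: ignore" comment) is needed.

    # Illustrative only; FakeDriver and mdev_class_mapping are stand-ins,
    # not the actual nova libvirt driver attributes.
    import collections
    import typing as ty

    class FakeDriver:
        def __init__(self) -> None:
            # Because __init__ has an annotated signature, mypy checks its body.
            # The value below comes from a lambda-built defaultdict, which mypy
            # cannot infer a type for, so without the explicit Dict[str, str]
            # annotation it reports "Need type annotation".
            self.mdev_class_mapping: ty.Dict[str, str] = collections.defaultdict(
                lambda: 'VGPU')
            # Alternatives: annotate with ty.Any, or append "# type: ignore"
            # to silence the error on that line.

    driver = FakeDriver()
    print(driver.mdev_class_mapping['some-mdev-type'])  # -> 'VGPU'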
bauzasgibi: I replied to your (good) concern https://review.opendev.org/c/openstack/nova/+/803233/3//COMMIT_MSG#1709:41
stephenfinkashyap: not an option, I'm afraid https://governance.openstack.org/tc/reference/tags/assert_follows-standard-deprecation.html09:54
kashyapYeah, I know "standards"09:57
sean-k-mooneykashyap: well in this case it's OpenStack's standard10:32
sean-k-mooneywhich nova does follow10:32
sean-k-mooneykashyap: for what it's worth, if qemu upstream doesn't remove it I'm not really in a rush to remove it from nova. Deprecating floppy usage, sure, but it does not cost us much, if anything, to keep it, and distros can still drop support even if we support it upstream10:36
kashyapsean-k-mooney: Hey; I see what you mean, but as noted it is more of a liability than anything at this point10:39
kashyap(Given past CVEs.  Upstream QEMU too, it's discouraged)10:39
kashyap(Downstream distros of OpenStack can also simply declare it out of scope / deprecated.)10:40
opendevreviewMerged openstack/nova master: zuul: Increase GLANCE_LIMIT_IMAGE_SIZE_TOTAL for nova-lvm  https://review.opendev.org/c/openstack/nova/+/80332210:55
lyarwood\o/11:00
opendevreviewLee Yarwood proposed openstack/nova master: libvirt: Handle silent failures to extend volume within os-brick  https://review.opendev.org/c/openstack/nova/+/80171411:37
opendevreviewLee Yarwood proposed openstack/nova master: Add functional test for bug 1937375  https://review.opendev.org/c/openstack/nova/+/80201111:37
opendevreviewLee Yarwood proposed openstack/nova master: compute: Avoid duplicate BDMs during reserve_block_device_name  https://review.opendev.org/c/openstack/nova/+/80199011:37
opendevreviewLee Yarwood proposed openstack/nova master: fup: Move _wait_for_volume_attach into InstanceHelperMixin  https://review.opendev.org/c/openstack/nova/+/80262311:37
opendevreviewLee Yarwood proposed openstack/nova master: Add regression test for bug 1938326  https://review.opendev.org/c/openstack/nova/+/80280111:38
opendevreviewLee Yarwood proposed openstack/nova master: compute: Do not mark disabled but down services as in maintenance  https://review.opendev.org/c/openstack/nova/+/80231711:38
lyarwoodgibi / bauzas ; now that the gate is fixed, reviews on ^ would be appreciated this week if you have time11:38
* lyarwood -> lunch11:41
gibilyarwood: on it11:45
gibilyarwood: after your lunch, is this extended race scenario possible? https://review.opendev.org/c/openstack/nova/+/801990/6/nova/compute/manager.py#698512:01
lyarwoodgibi: yeah that's also possible but I wonder if we want to treat it as a separate fix? 12:12
lyarwoodactually thinking about it, is it an issue if we have racing requests against the same volume 12:13
lyarwoodmultiattach races would be caught by c-api later12:14
gibilyarwood: I would treat it as a separate fix. What you proposed is a valid fix for the scenario reported in the bug12:15
gibihm, I guess we have the instance.uuid lock as a pattern, we have that for most of the operations12:15
gibibut here a volume.id lock would be better12:16
gibiI'm not sure we even need the instance.uuid lock here12:16
gibibut could be12:16
stephenfinbauzas: Can we trade reviews for your generic mdev series and my DB series? I'm really eager to close out as much as I can this week, especially with a few people on PTO next week12:27
bauzasstephenfin: for the moment, I'm working on the functest 12:27
bauzasfor the mdev series, asked by gibi12:27
bauzasbut in one hour, we could do12:28
sean-k-mooneygibi: i think we take an instance lock for volume stuff to avoid concurrent volume attach/detach operations on the same instance12:29
sean-k-mooneymulti attach is an interesting edge case12:30
gibisean-k-mooney: how can a reserve that only creates a new BDM race with an attach that already has the BDM created, or a detach that will delete a BDM that should already exist?12:30
gibistill, I'm not against keeping the instance.uuid lock12:31
sean-k-mooneygibi: well, I was just using attach as an example, not suggesting it would race with it12:31
gibiit is a sensible safety measure12:31
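A minimal sketch of the per-instance locking pattern discussed above, assuming oslo.concurrency is available; the function name and lock keys are illustrative, not nova's actual reserve_block_device_name implementation.

    # Illustrative sketch only: shows an instance.uuid-keyed lock (the common
    # nova pattern) versus a narrower volume_id-keyed lock.
    from oslo_concurrency import lockutils

    def reserve_bdm(instance_uuid: str, volume_id: str) -> None:
        # Lock keyed on the instance UUID: serializes all volume operations
        # against this instance.
        with lockutils.lock(instance_uuid):
            # A narrower alternative would key the lock on the volume id,
            # serializing only requests touching the same volume:
            #   with lockutils.lock(volume_id): ...
            print(f"creating BDM for {volume_id} on {instance_uuid}")

    reserve_bdm('11111111-2222-3333-4444-555555555555', 'vol-1')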
opendevreviewAlexey Stupnikov proposed openstack/nova master: Fix incorrect reservation IDs in unit tests  https://review.opendev.org/c/openstack/nova/+/80336312:34
aarentsHi nova,13:09
sean-k-mooneyo/13:09
aarentsI'm a bit stuck with this change https://review.opendev.org/c/openstack/nova/+/799606 (see my self comment) which is only a partial fix to the bug https://bugs.launchpad.net/nova/+bug/1934742  I have in operation,13:09
aarentsI'm not sure how to properly fix that, if someone have idea I can dig..13:09
sean-k-mooneylet me read, but you're having an issue with people deleting ports via neutron rather than port detach, right?13:10
sean-k-mooneytechnically we don't really support that13:10
aarentsyep port deletion when attached13:11
sean-k-mooneywe have the network-vif-deleted event to try and help fix things13:11
sean-k-mooneybut deletion when attached is strongly discouraged13:11
sean-k-mooneythe correct way to fix it would be to either block it in neutron or have nova call detach internally13:12
sean-k-mooneyaarents: strictly speaking, from a nova perspective, deleting an interface that is attached to an instance via neutron is a user error13:14
sean-k-mooneyand the code path you are looking at is a workaround we put in place to try and partly handle it13:15
aarentsSo we can imagine calling detach_interface, but it will try to unbind and will probably get a 404 from neutron, which is not a big deal13:15
sean-k-mooneywe would have to handle that, yeah, but it's something we could handle13:16
sean-k-mooneythis came up a few weeks ago though in the neutron drivers meeting and i was pushing to block deleting attached ports in the neutron api13:16
sean-k-mooneylong term that is the correct approach13:17
sean-k-mooneythis is really a neutron bug, not a nova one, but nova can do a better job at cleaning up when you incorrectly delete attached ports13:18
sean-k-mooneythe same is true for attached volumes, by the way13:18
sean-k-mooneyyou should never try to delete an attached volume, but in that case i believe cinder will protect you13:18
sean-k-mooneygibi: you remember that conversation about 2-3 weeks ago in the drivers meeting, right? i think the resolution was to write a neutron spec for the new extension13:19
aarentssean-k-mooney: I see your point about it not being a nova issue. My issue is I'd like to avoid this possible leak because it generates un-live-migratable instances in the infra..13:19
sean-k-mooneyyep im sure it does and yes that is why we try to clean it up13:20
sean-k-mooneyaarents: it has other problems if the port that was deleted uses SR-IOV13:20
sean-k-mooneysince the pci device will not be released properly13:20
aarentsI see13:21
sean-k-mooneyaarents: SR-IOV detach does not work properly in any release before wallaby13:21
sean-k-mooneyand i think it still won't work properly if you delete an attached neutron port, although i'll admit i have not looked at that code path to check13:22
sean-k-mooneyi just suspect that we did not test that when we added sriov attach/detach support13:22
sean-k-mooneyaarents: anyway your partial fix is to just add a lock13:23
sean-k-mooneythat might help in limited cases but i don't think it will fix this in general13:24
aarentsI agree it is not enough13:24
aarentssean-k-mooney: thanks for the useful info, I will dig more into blocking or trying a detach13:30
sean-k-mooneyaarents: by the way, if we can't use detach directly because all the fields can't be constructed, we have 2 options.13:30
sean-k-mooneywe could have neutron send the full port info in the detach payload, or we could construct as much data as we can and pass incomplete objects13:31
sean-k-mooneybut blocking it in neutron would be the best approach, or having neutron call detach, which is rather messy but would work better13:31
aarents"neutron send the full port info in the detach payload" is interesting yes13:32
sean-k-mooneyone thing we discussed was moving when neutron sends the event13:32
sean-k-mooneyif neutron sent the deleted event and waited for nova to ack it13:32
sean-k-mooneythen that would also fix it13:33
sean-k-mooneyto a degree at least13:33
sean-k-mooneyanyway this is a known issue that we just have not focused on, so sorry you are hitting it13:33
aarents"construct as much data as we can and pass incomplete objects" <- I tried that, but it needs details like vif_type and bridge name; this info is no longer available when the port is deleted, and I don't know how to regenerate it13:34
sean-k-mooneyah yes we do need those13:35
sean-k-mooneywhich may or may not be still in the info cache as you have found13:35
opendevreviewMerged openstack/nova master: Add functional test for bug 1937375  https://review.opendev.org/c/openstack/nova/+/80201113:35
sean-k-mooneyit's a hack, but we might be able to reconstruct the required data by inspecting the libvirt xml, but i would prefer to avoid that13:36
aarentssean-k-mooney: yes.. it was the workaround I was considering, kind of a detach_device_by_mac: we iterate over the interfaces and delete the right one13:38
sean-k-mooneyya, by mac works for most things except sriov PFs, but even in that case you can kind of still figure it out in some cases13:39
sean-k-mooneyit really depends on how much of the db info we have access to13:39
sean-k-mooneywe can't get the pci address from the port since it's gone, and the info cache may be empty, but we recently started recording the neutron port uuid as the requester id in the pci_devices table, i believe13:40
aarentsthe info we can grab in the db is the entry in virtual_interfaces:13:41
sean-k-mooneywhich means we should be able to get the pci address by the port uuid13:41
aarents+---------------------+------------+------------+----+--------------------------------------------------------+------------+--------------------------------------+--------------------------------------+---------+------+13:41
aarents| created_at          | updated_at | deleted_at | id | address                                                | network_id | uuid                                 | instance_uuid                        | deleted | tag  |13:41
aarents+---------------------+------------+------------+----+--------------------------------------------------------+------------+--------------------------------------+--------------------------------------+---------+------+13:41
aarents| 2021-08-03 15:44:23 | NULL       | NULL       | 62 | fa:16:3e:fb:2d:7b/4ae693fd-23fb-46c6-a8a5-369270029fc6 |       NULL | 4ae693fd-23fb-46c6-a8a5-369270029fc6 | d07df8d0-fc98-4087-9b25-3b4a55cdcac4 |       0 | NULL |13:41
aarents+---------------------+------------+------------+----+--------------------------------------------------------+------------+--------------------------------------+--------------------------------------+---------+------+13:41
sean-k-mooneyyep13:41
opendevreviewSylvain Bauza proposed openstack/nova master: Expose the mdev class  https://review.opendev.org/c/openstack/nova/+/80174313:42
opendevreviewSylvain Bauza proposed openstack/nova master: WIP: Cleanup GPU vs. mdev wording  https://review.opendev.org/c/openstack/nova/+/80337913:42
sean-k-mooneyso with that info and the pci_devices table we can, with some effort, construct most of what we need to inspect the xml and then figure out what the vif_type etc. was in order to unplug13:42
sean-k-mooneybut since the virt drivers are not really meant to talk to the db, that is tricky to do13:43
sean-k-mooneywe would have to implement a new function in the compute manager to do that, more than likely13:43
sean-k-mooneyand then extend the virt driver api13:43
sean-k-mooneywell, it depends on how we approach it13:44
aarentssean-k-mooney: it is good to know the topic is open with neutron; in the short term I will find a hack to detach the interface with the little info I have in the db13:54
aarentssean-k-mooney: thks!13:55
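A minimal sketch of the "detach by MAC" idea mentioned above: find the <interface> element in a guest's libvirt domain XML whose MAC matches the deleted port. This is illustrative only (the sample XML and helper name are made up); nova's real detach path works through the virt driver and guest objects rather than ad-hoc XML parsing like this.

    # Parse a libvirt domain XML snippet and return the interface element
    # matching the given MAC address, or None if not found.
    import xml.etree.ElementTree as ET

    DOMAIN_XML = """
    <domain>
      <devices>
        <interface type='bridge'>
          <mac address='fa:16:3e:fb:2d:7b'/>
          <source bridge='br-int'/>
          <target dev='tap4ae693fd-23'/>
        </interface>
      </devices>
    </domain>
    """

    def find_interface_by_mac(domain_xml: str, mac: str):
        root = ET.fromstring(domain_xml)
        for iface in root.findall('./devices/interface'):
            mac_el = iface.find('mac')
            if mac_el is not None and mac_el.get('address') == mac:
                return ET.tostring(iface, encoding='unicode')
        return None

    print(find_interface_by_mac(DOMAIN_XML, 'fa:16:3e:fb:2d:7b'))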
kashyapsean-k-mooney: Remind me again, why did you suggest to leave this as-is to 'cirrus'? - https://review.opendev.org/c/openstack/nova/+/798680/3/nova/virt/libvirt/config.py14:00
sean-k-mooneybecause we only want to change the behavior for new instances14:00
sean-k-mooneyalso, for most code paths this gets overridden14:00
sean-k-mooneynone might actually be a better default14:01
sean-k-mooneyi dont think this will ever get used currently at least not on x8614:01
kashyapsean-k-mooney: I'm actually going to do a bunch of migration tests with a variant of this patch to see if there's *actual* breakage or are we talking only theoretical stuff14:01
sean-k-mooneywell, what we approved to proceed with was no change to existing instances14:03
sean-k-mooneyif you want to change the default for existing instances then we should discuss this change again with the wider team14:03
kashyapWell, let's see what breaks, if anything.  "What was approved" was all based on theory.  I'd like to see some actual evidence14:03
sean-k-mooneyso i think you should leave it at cirrus or defer this to yoga14:03
kashyapDon't worry, I am concerned as much as you to not break any valid cases or upgrades, etc.14:04
kashyapOh, totally forgot: I said this before but I've gotten a Red Hat QE to do some tests w/ Cirrus and VirtIO to change for existing instances - for Windows and Linux.14:09
kashyapI'll add a note in the review14:09
sean-k-mooneythey would need to test both migration and hard reboots after the fact with the updated xmls14:10
sean-k-mooneyfor live migration obviously we can't change the device model; it could change on a hard reboot after the migration but we generally want to avoid that14:10
kashyapsean-k-mooney: Right, the test that was done so far was this:14:10
kashyapHave a Linux and Windows guest on source with 'cirrus', change the video model to 'virtio'; reboot the guest, and then live-migrate -- it all succeeds 14:10
sean-k-mooneyack. we had discussed that in some cases defaulting to virtio might be fine due to the vga fallback14:11
sean-k-mooneyhave you tested this where we have specified vram14:12
sean-k-mooneyor other extra specs14:12
sean-k-mooney* other image properties14:12
kashyapAnd on hard-reboot, a guest can pick up new device-related bits sometimes; can't avoid it in some cases14:12
kashyapsean-k-mooney: Can you spell out a bit more on what do you want tested with the vRAM?14:12
sean-k-mooneyright personally i feel like that is a bug when that happens14:13
kashyap(I mean, it's a bug if it breaks anything user-visible; if not, I'd say it's fine)14:13
sean-k-mooneydepends on who you talk too14:13
kashyapsean-k-mooney: Likewise, what props you want to test in this case?  How are these other image props related?14:13
sean-k-mooneysome customers treat it as a bug since it would require recertification of the workload, others don't care14:14
sean-k-mooneyhw_video_ram14:14
sean-k-mooneyi believe virtio is limited to 8MB14:14
sean-k-mooneycirrus i think is larger14:14
sean-k-mooneyi think cirrus can support 24-64 MB, something in that region14:15
sean-k-mooneykashyap: that is the main one i'm concerned about currently14:15
kashyapsean-k-mooney: You mean hw_video_ram is the thing you're concerned about?14:16
sean-k-mooneyyes14:16
kashyapsean-k-mooney: Sure, adding a note to test that too (and summarizing what we talked here on the change)14:16
sean-k-mooneyi think the max vram that you can use with virtio is less than that for cirrus14:16
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/objects/image_meta.py#L41714:16
kashyapsean-k-mooney: What I'm not clear on is the impact of hw_video_ram on the switch to virtio14:18
* kashyap clicks14:18
kashyapMaybe virtio doesn't need that max vRAM.  Anyway, to be tested :)14:20
sean-k-mooneykashyap: if you had it set to 16 with cirrus it would work, but when you switch to virtio your vm would not boot14:20
sean-k-mooneykashyap: let's confirm, since i thought the virt team said it did on that downstream mail thread today14:21
kashyapAnother q. is: in what scenarios would one bother to set this at all?14:21
kashyapYes, I will ask the virt graphics maintainer about it14:21
sean-k-mooneyto improve guest performance14:21
sean-k-mooneyit allows more frame buffers to be created in the graphics device, which is needed for higher resolutions14:22
sean-k-mooneywithout a large enough vram allocation you can't do double or triple buffering in the graphics device, and xorg has to fall back to copying buffers to/from guest ram14:23
kashyapI see; I really wonder how many users actually know this, and how many, if any at all, change this..14:24
sean-k-mooneykashyap: from greg: "vga compatibility mode is limited when compared to stdvga. It has 8 MB fixed video memory, whereas stdvga has 16 MB by default and can be configured to have more if needed (for example in case you want to use 4k)."14:24
sean-k-mooneythat was in context of virtio-vga14:24
sean-k-mooneyvs plain vga14:25
sean-k-mooneybut i think cirrus also supports more than 8MB14:25
sean-k-mooneyas does QXL14:25
kashyapLet's confirm if Cirrus actually does14:25
sean-k-mooneylooking at https://libvirt.org/formatdomain.html#video-devices yes at least 16MB14:26
sean-k-mooneyFor a guest of type "kvm", the default video is: type with value "cirrus", vram with value "16384" and heads with value "1". 14:26
kashyapsean-k-mooney: Ah, yep14:27
* kashyap bbaib; thanks for the discussion, sean-k-mooney :)14:28
sean-k-mooneylooking at https://www.kraxel.org/blog/2019/09/display-devices-in-qemu/#qxl-vga by the way, qxl defaults to 64MB, which was one of the reasons it performed better than other options in the past; i know much of that advantage has now been surpassed by virtio-gpu or std vga14:31
sean-k-mooneyactually, from the std vga section14:33
sean-k-mooney"The linux driver supports page-flipping, so having room for 3-4 framebuffers is a good idea. The driver can leave the framebuffers in vram then instead of swapping them in and out. FullHD (1920x1080) for example needs a bit more than 8 MB for a single framebuffer, so 32 or 64 MB would be a good choice for that. "14:33
sean-k-mooneyyou can see that 8MB is not quite large enough for 1080p14:33
sean-k-mooneyso without the virtio gpu driver your performance would likely regress going from cirrus to virtio-vga if you tried to use a 1080p resolution today14:34
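A quick back-of-the-envelope check of the numbers quoted above (a minimal sketch, not nova code): one 32-bit framebuffer at 1920x1080 is a bit over 8 MB, so with 3-4 buffers for page-flipping, 32 or 64 MB of vram is the sensible range.

    # Rough framebuffer memory arithmetic; 4 bytes per pixel (32-bit color) assumed.
    def framebuffer_mb(width: int, height: int, bytes_per_pixel: int = 4) -> float:
        return width * height * bytes_per_pixel / 1e6

    single = framebuffer_mb(1920, 1080)
    print(f"single 1080p framebuffer: {single:.1f} MB")             # ~8.3 MB
    print(f"4 framebuffers for page-flipping: {4 * single:.1f} MB")  # ~33 MB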
kashyapsean-k-mooney: Back ... reading14:43
kashyapI really doubt the seriousness of the "regression" from performance going from 'cirrus' to 'virtio-vga'.  If people want performance, you better make sure you're not using deadly-old Linux, and got the virtio-gpu driver.14:45
kashyapI'll check w/ Gerd (the author of the above post) and post my summary on the change14:48
sean-k-mooneyack14:49
gibiaarents: sean-k-mooney: here is the neutron drivers meeting log about deleting bound ports https://meetings.opendev.org/meetings/neutron_drivers/2021/neutron_drivers.2021-07-02-14.00.log.html#l-6714:56
gibithe agreement was that it needs specs14:57
gibithere is a summary in the rfe https://bugs.launchpad.net/neutron/+bug/193086615:00
sean-k-mooneygibi: right that is what i recalled 15:00
bauzasgibi: I tried to look at how to verify whether we would have existing allocations for the VGPU RC in case the operator modifies the options, but honestly it's difficult to do as we don't get the allocations when calling update_provider_tree() without ReshapeNeeded15:05
bauzashonestly, we already don't support operators modifying the inventories by modifying the options15:06
bauzasbut I'll test this15:09
aarentsgibi: thank you for the links15:09
gibibauzas: if we cannot prevent it then at least a big red warning should be added to the docs.15:17
bauzasagreed 15:17
bauzasI think we would then have orphaned allocations15:17
gibiprobably until the instance is deleted or migrated15:18
bauzasgibi: we *could* try to hardstop in update_provider_tree() 15:18
bauzaswhich is called by the compute service just before we start the RPC service15:18
bauzas(called by pre_start_hook)15:19
bauzasexactly like we do for reshapes15:19
bauzasbut then we would need to pass allocations even without reshaping15:19
bauzasor, using the ReshapeNeeded exception 15:20
bauzaslike a reshape15:20
bauzaseither way, not sure I could do it in 2 days15:20
gibiyeah I don't expect that we initiate a reshape just to reject it :)15:22
gibiI suggest to check what will happen (e.g. orphaned allocation) and add a big warning about it in the docs15:23
bauzasgibi: that's just what I'm testing :)15:36
gibicoolio15:36
bauzasfortunately I have an environment <315:36
bauzasI don't know yet how long, but... :p15:36
*** whoami-rajat__ is now known as whoami-rajat15:37
opendevreviewAlexandre arents proposed openstack/nova master: libvirt: Abort live-migration job when monitoring fails  https://review.opendev.org/c/openstack/nova/+/76443516:07
*** rpittau is now known as rpittau|afk16:14
opendevreviewBalazs Gibizer proposed openstack/nova master: Support move ops with extended resource request  https://review.opendev.org/c/openstack/nova/+/80008716:49
opendevreviewBalazs Gibizer proposed openstack/nova master: [func test] refactor interface attach with qos  https://review.opendev.org/c/openstack/nova/+/80008816:50
opendevreviewBalazs Gibizer proposed openstack/nova master: Support interaface attach / detach with new resource request format  https://review.opendev.org/c/openstack/nova/+/80008916:50
bauzasgibi: still around ?16:51
gibibauzas: a bit16:51
bauzasgibi: good news, we refuse to change the RC if the operator modifies it16:52
bauzasgibi: https://paste.opendev.org/show/807890/16:52
gibi\o/16:52
bauzasgibi: we accept doing this if there are no allocations, but if we have some, we get this ^16:52
gibiawesome16:52
gibithis is what we need16:52
opendevreviewBalazs Gibizer proposed openstack/nova master: [func test] move unshelve test to the proper place  https://review.opendev.org/c/openstack/nova/+/79362116:53
gibidoes it cause the compute to refuse to start?16:53
bauzasthe compute service continues to work, tho16:53
gibi;/16:53
bauzasgibi: no16:53
bauzasgibi: but that's fine16:53
gibibauzas: do we only hit it in the periodic but not in init_host?16:54
opendevreviewBalazs Gibizer proposed openstack/nova master: WIP support extended res req in heal port allocation  https://review.opendev.org/c/openstack/nova/+/80206016:54
bauzasgibi: yup, because of update_provider_tree()16:54
bauzasgibi: but this method is also called when restarting the compute16:54
bauzasand every 60 secs16:54
bauzasand then*16:54
gibiOK, so we don't create orphans, we are loud in the log about the issue periodically so the admin will notice it.16:56
gibithat is OK to me16:56
bauzasalso, when trying to delete the instance, we get an error16:56
bauzasgibi: https://paste.opendev.org/show/807891/16:57
bauzasso the operator needs to first re-modify the options to use the VGPU RC again16:57
gibiauch16:58
gibithat is ugly but meh16:58
bauzasanyway, I'll continue to look at it16:58
bauzaswe could hardstop the compute by the exception maybe16:59
gibithat would be ideal if possible16:59
gibianyhow I will drop off soon. I think you made good progress with the mdev patches. I can look at them tomorrow again if there are updates17:00
bauzas++17:01
sean-k-mooneyhard stopping the compute i guess is due to some configuration error or something discovered when update_provider_tree is run?17:47
sean-k-mooneybauzas: the only way we allow operators to modify the resource provider, by the way, is via provider.yaml17:48
sean-k-mooneyif they tried to do that via placement or similar they are off in undefined behavior land and get to keep whatever magical bugs they find there17:49
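A rough sketch of the kind of guard discussed above: refuse to swap the resource class reported for an inventory when allocations already exist against the old class. The exception and function names here are hypothetical, not nova's or placement's real API; nova's actual behaviour surfaces around update_provider_tree().

    # Hypothetical names throughout; this only illustrates the "refuse the RC
    # change while allocations exist" behaviour described in the pastes above.
    class ResourceClassChangeRefused(Exception):
        pass

    def check_rc_change(old_rc: str, new_rc: str, allocations: dict) -> None:
        """allocations: {consumer_uuid: {resource_class: amount}} on the provider."""
        if old_rc == new_rc:
            return
        consumers = [c for c, res in allocations.items() if res.get(old_rc, 0) > 0]
        if consumers:
            raise ResourceClassChangeRefused(
                f"cannot report {new_rc} instead of {old_rc}: "
                f"allocations exist for consumers {consumers}")

    # Example: an instance still holds a VGPU allocation, so the change is refused.
    try:
        check_rc_change('VGPU', 'CUSTOM_MY_MDEV', {'inst-1': {'VGPU': 1}})
    except ResourceClassChangeRefused as exc:
        print(exc)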
*** ricolin_ is now known as ricolin18:02
opendevreviewArtom Lifshitz proposed openstack/nova master: WIP: PCI resize tests  https://review.opendev.org/c/openstack/nova/+/80350619:01
NobodyCamGood afternoon Nova folks, I am running into a situation where ironic nodes appear to be available in both Ironic and Nova, but when attempting to provision we are hitting errors with placement records not being cleaned up... My question: is there an event we can listen for, or an API we can check, that could be used to verify the node is ready for provisioning, placement / inventory wise?21:06
NobodyCamI should add that we listen for compute.instance.delete.start and compute.instance.delete.end21:13
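One way to check the "is the node clean in placement" part of the question above is to ask placement for the resource provider's current usages before provisioning. A minimal sketch with the requests library; the endpoint URL, token, and provider UUID are placeholders you would obtain from keystone/ironic, and the microversion used is just an assumption for illustration.

    # Sketch: query placement for a resource provider's usages; any non-zero
    # usage means stale allocations are still recorded against the node.
    # PLACEMENT_URL, TOKEN and RP_UUID are placeholders, not real values.
    import requests

    PLACEMENT_URL = 'http://placement.example.com/placement'
    TOKEN = '<keystone-token>'
    RP_UUID = '<ironic-node-resource-provider-uuid>'

    def node_has_allocations(placement_url: str, token: str, rp_uuid: str) -> bool:
        resp = requests.get(
            f'{placement_url}/resource_providers/{rp_uuid}/usages',
            headers={'X-Auth-Token': token,
                     'OpenStack-API-Version': 'placement 1.19'})
        resp.raise_for_status()
        usages = resp.json().get('usages', {})
        return any(amount > 0 for amount in usages.values())

    if node_has_allocations(PLACEMENT_URL, TOKEN, RP_UUID):
        print('node still has placement allocations; not safe to provision yet')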
opendevreviewAde Lee proposed openstack/nova master: Add check job for FIPS  https://review.opendev.org/c/openstack/nova/+/79051922:59

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!