melwitt | bauzas: this patch looks relevant to your interests https://review.opendev.org/c/openstack/nova/+/864674 | 01:57 |
---|---|---|
opendevreview | Jorhson Deng proposed openstack/nova master: Optimize the small pagesize in numa_fit_instance_to_host https://review.opendev.org/c/openstack/nova/+/864812 | 05:59 |
opendevreview | Jorhson Deng proposed openstack/nova master: Optimize the small pagesize in numa_fit_instance_to_host https://review.opendev.org/c/openstack/nova/+/864812 | 08:04 |
opendevreview | Jorhson Deng proposed openstack/nova master: Optimize the small pagesize in numa_fit_instance_to_host https://review.opendev.org/c/openstack/nova/+/864812 | 08:42 |
johnthetubaguy | When you run functional tests on your dev box, and you get loads of errors due to no valid host, it probably means I am doing something stupid, has anyone else hit that at all please? | 09:44 |
bauzas | johnthetubaguy: which testcases ? | 10:09 |
johnthetubaguy | good question, mostly the ones that use placement | 10:11 |
johnthetubaguy | there are loads of them in the functional suite, one example is: nova.tests.functional.wsgi.test_services.TestServicesAPI.test_resize_revert_after_deleted_source_compute | 10:11 |
johnthetubaguy | pretty sure its my environment being broken, as so many fail | 10:11 |
johnthetubaguy | ah, so I think it was by default python3 being 3.8, oops! | 10:19 |
johnthetubaguy | curiously, unit tests are just fine, but functional tests, not so much | 10:20 |
bauzas | johnthetubaguy: sorry was on a meeting | 10:22 |
johnthetubaguy | no worries, the fix was simple enough in the end | 10:22 |
bauzas | johnthetubaguy: yesterday I ran some reshape function tests locally and I had no problem | 10:22 |
johnthetubaguy | yeah, its my bad default python version, thats all | 10:22 |
bauzas | johnthetubaguy: oh, so due to py38 ? strange if so | 10:22 |
johnthetubaguy | yeah | 10:23 |
bauzas | I guess you recreated your venv too ? | 10:23 |
johnthetubaguy | I suspect something had failed to start and I didn't notice that, all I noticed was lots of no valid host exceptions | 10:23 |
johnthetubaguy | so tox spotted it and did the re-create | 10:23 |
bauzas | and it worked then ? | 10:23 |
johnthetubaguy | in tox.ini the base python is python3 rather than python3.9 I guess, which caused my "fun" | 10:24 |
johnthetubaguy | yeah, its working fine now | 10:24 |
bauzas | I suspect some os-resource-classes or os-traits library wasn't updated | 10:24 |
bauzas | hence placement failing and then the novalidhosts eventually | 10:24 |
bauzas | doing the tox -r updated the deps | 10:24 |
johnthetubaguy | ah... got you, very possible | 10:24 |
johnthetubaguy | actually, I remember seeing some os-traits error | 10:25 |
bauzas | yeah, we hardfail on the number of traits | 10:25 |
johnthetubaguy | it was a missing attribute error I think, which is basically the same thing | 10:25 |
bauzas | cool | 10:25 |
johnthetubaguy | well mystery solved, thank you! | 10:25 |
bauzas | np, glad you fixed it by yourself :D | 10:26 |
johnthetubaguy | (makes bashing with a hammer noises) | 10:26 |
bauzas | :) | 10:27 |
opendevreview | John Garbutt proposed openstack/nova master: Functional test test_boot_reschedule_with_proper_pci_device_count https://review.opendev.org/c/openstack/nova/+/760354 | 10:44 |
opendevreview | John Garbutt proposed openstack/nova master: Fix PCI passthrough race on reschedule (refresh) https://review.opendev.org/c/openstack/nova/+/710848 | 10:44 |
opendevreview | John Garbutt proposed openstack/nova master: Fix PCI passthrough race on reschedule (claims) https://review.opendev.org/c/openstack/nova/+/710847 | 10:44 |
johnthetubaguy | gibi: I think you reviewed those in the past, I am not sure it answers all your questions, but I put the functional test first, in an attempt to work out which patches are needed. Its a nasty bug that on re-schedule you try to get the wrong PCI device, but fail with in-use errors. | 11:02 |
sean-k-mooney | johnthetubaguy: so i was thinking about your ironic patch for reserving on schedule over night | 11:03 |
sean-k-mooney | i think in general its a good idea | 11:04 |
sean-k-mooney | there was some concern about whether cleaning is used or not in large clouds, right, extending the time the nodes would be unavailable | 11:04 |
johnthetubaguy | yeah, gibi was mentioning that, and well, I don't disagree | 11:05 |
sean-k-mooney | if we wanted to cater for that we could make this configurable but i think it's probably ok to reserve by default or unconditionally | 11:05 |
johnthetubaguy | yeah, it feels like a future workaround config, if its a problem for people | 11:05 |
johnthetubaguy | interestingly, I think it fixes an extra case I should add to the commit... | 11:05 |
sean-k-mooney | oh what one | 11:06 |
johnthetubaguy | when you mark an in-use node as in maintenance mode as its broken, user gets to delete their instance when they are ready, and that goes into clean failed (depending on your ironic config), we don't hit the race with our placement updates any more either | 11:07 |
johnthetubaguy | we had the same window with that, once the allocation is removed when the instance is deleted | 11:07 |
sean-k-mooney | oh ok so clean failed happens because it's in maintenance | 11:08 |
sean-k-mooney | and can't actually start cleaning? | 11:08 |
johnthetubaguy | more that, we start sending new instances to the node that is in maintenance, shortly after the user deletes their nova server | 11:08 |
johnthetubaguy | its basically the same race condition, but with a slightly different reason | 11:09 |
sean-k-mooney | nice i always like it when one fix fixes multiple bugs | 11:09 |
johnthetubaguy | totally | 11:10 |
sean-k-mooney | so between https://review.opendev.org/c/openstack/nova/+/842478 (the retry) and https://review.opendev.org/c/openstack/nova/+/864773 (reserving) we have two fixes. the retry is certainly backportable | 11:11 |
sean-k-mooney | reserving honestly probably is too but to backport that i think we would need the workaround option | 11:11 |
johnthetubaguy | yeah, I think we need both, although the new one makes the older one less important | 11:12 |
johnthetubaguy | i.e. the older one only matters where available nodes go no longer available, and we don't spot it right away, since we remove the issue with automatic cleaning also causing that problem | 11:12 |
sean-k-mooney | yes so i was about to approve the old patch and then soft -1 the second one asking for the workaround option if that works for you. | 11:14 |
sean-k-mooney | the only thing i was wondering about for the first patch is should it have a release note | 11:14 |
sean-k-mooney | although its only a partial fix | 11:14 |
sean-k-mooney | the second patch should have one | 11:14 |
johnthetubaguy | yeah, second patch needs one for sure | 11:15 |
johnthetubaguy | the first one might be worth advertising via a release note I guess, it is handy | 11:15 |
johnthetubaguy | but merging is better for me, obviously :) | 11:16 |
sean-k-mooney | shall i hold +2w for you to add one or just go for it | 11:16 |
sean-k-mooney | i think its fine as is | 11:16 |
johnthetubaguy | yeah, lets get that first one in, I will add some workaround stuff in the second one | 11:17 |
sean-k-mooney | i like to have release notes for closes-bug | 11:17 |
sean-k-mooney | i treat them as optional for partials or related | 11:17 |
johnthetubaguy | ah, fair enough | 11:17 |
sean-k-mooney | cool first one is on its way | 11:17 |
sean-k-mooney | ill comment on the second | 11:17 |
johnthetubaguy | sweet, thank you | 11:18 |
sean-k-mooney | i owe you a review of the ironic spec too, i didnt get to it on the review day so once im done with this ill take a look at that next | 11:19 |
johnthetubaguy | Ah, that would be great, thank you. I am sure it needs some refinement. I have half a plan to do some POC work on that soon, but these other bugs keep distracting me. | 11:24 |
sean-k-mooney | you're looking at a third bug related to pci claims too right. did you pick that up from mark? | 11:31 |
sean-k-mooney | ya its https://review.opendev.org/q/topic:bug%252F1860555 | 11:32 |
johnthetubaguy | sort of yes, trying to restack that on top of the functional test | 11:32 |
johnthetubaguy | I need to decide which of these we backport in various downstreams | 11:33 |
sean-k-mooney | i haven't looked at it in a while but i thought it would be backportable upstream | 11:34 |
sean-k-mooney | in case that helps. | 11:35 |
johnthetubaguy | yeah, I agree, I think it should be | 11:36 |
johnthetubaguy | I am not totally sure about the claims stuff, and how critical that is, its very possible we leak PCI devices without that fix up | 11:36 |
sean-k-mooney | leak is not quite right | 11:37 |
sean-k-mooney | we can end up claiming more than we request | 11:38 |
sean-k-mooney | they will all get freed when the vm is deleted | 11:38 |
sean-k-mooney | but not before | 11:38 |
sean-k-mooney | without it | 11:38 |
johnthetubaguy | yeah, without that object refresh, that is certainly true | 11:38 |
sean-k-mooney | we have seen this persist even after the vm is shelved in downstream bug reports | 11:39 |
johnthetubaguy | the functional test probably needs more checks on the PCI claims I guess | 11:39 |
sean-k-mooney | not necessarily because of the reschedule, there were issues with resize in the past that had a similar effect | 11:39 |
johnthetubaguy | ah, good to know, certainly believable | 11:39 |
johnthetubaguy | ah, interesting | 11:40 |
sean-k-mooney | i have not looked, i added RP+1 to get it on my list. i probably won't get to it this week but ill try and see if i can take a look again on monday | 11:40 |
johnthetubaguy | thank you | 11:41 |
opendevreview | Konrad Gube proposed openstack/nova-specs master: Add API for assisted volume extend https://review.opendev.org/c/openstack/nova-specs/+/855490 | 11:43 |
sean-k-mooney | elodilles: im planning to fix a trivial docs issue with one of our config options but i want to also test this in ci https://bugs.launchpad.net/nova/+bug/1996094 | 12:21 |
sean-k-mooney | i chatted to bauzas about this a bit downstream and we were unsure how you and other stable cores would feel about it when it comes to backporting | 12:22 |
sean-k-mooney | elodilles: is it ok to do both in one patch or would you prefer we did not backport the ci change | 12:22 |
sean-k-mooney | i would prefer to backport both and if we are backporting both i would prefer to have it be in one patch | 12:22 |
sean-k-mooney | my current plan is to set heal_instance_info_cache_interval=0 in nova-next | 12:23 |
opendevreview | John Garbutt proposed openstack/nova master: Ironic nodes with instance reserved in placement https://review.opendev.org/c/openstack/nova/+/864773 | 12:36 |
johnthetubaguy | gibi sean-k-mooney I have added a release note and the workaround config, slightly more intrusive, but not by much I guess. | 12:39 |
* gibi reads back | 12:39 | |
sean-k-mooney | ack, most of the way through the spec, not much feedback so far beyond what is already there | 12:40 |
sean-k-mooney | just got to the nova manage command | 12:40 |
bauzas | sean-k-mooney: elodilles: yup, I just wondered if we were ok for backporting a .zuul file :) | 12:48 |
sean-k-mooney | im pretty sure i have done that before | 12:49 |
sean-k-mooney | backported a job change, we definitely did it to fix the train gate | 12:50 |
elodilles | sean-k-mooney bauzas : as far as i remember, traditionally, doc change backports were not accepted, but i think it is OK to backport them. about the CI I'm a bit hesitant, though. but it depends on the change, i would say | 13:01 |
gibi | johnthetubaguy: left reply in the pci re-schedule bugfix | 13:01 |
gibi | johnthetubaguy: that single instance.refresh() feels strange to me | 13:01 |
sean-k-mooney | elodilles: it literally is disabling a periodic task | 13:01 |
sean-k-mooney | elodilles: we periodically heal the network info cache but that in practice should not be required as neutron tells us when something changes | 13:02 |
sean-k-mooney | elodilles: so the ci change is setting one config value to 0 | 13:02 |
johnthetubaguy | gibi: agreed, its crazy that fixes so much, drops all the transient changes before a save, roughly. I want more of that claims stuff in the functional test I think | 13:03 |
sean-k-mooney | in one job nova-next | 13:03 |
elodilles | sean-k-mooney: so it won't be a new CI job, but a config value change in 1(?) job as I understand then | 13:03 |
sean-k-mooney | yes just one additional config override in the nova-next job | 13:03 |
gibi | johnthetubaguy: if the PciDevice.instance_uuid field is already updated to point to this instance then simply resetting the instance with instance.refresh() cannot be enough | 13:03 |
sean-k-mooney | to set the existing config option to 0 instead of the default 60 | 13:03 |
elodilles | sean-k-mooney: and you state that it won't introduce any instability o:) | 13:04 |
sean-k-mooney | well thats why we want to have it runing in ci | 13:04 |
sean-k-mooney | but i don't believe it will | 13:04 |
elodilles | maby let's see it first in master branch then :) | 13:04 |
elodilles | * maybe | 13:05 |
sean-k-mooney | ok so two patches, one for the docs change to correct the help text and a separate one for zuul | 13:05 |
sean-k-mooney | i can do that | 13:05 |
elodilles | sean-k-mooney: yep, that sounds good | 13:05 |
sean-k-mooney | cool ill do that then thanks | 13:09 |
opendevreview | Amit Uniyal proposed openstack/nova stable/train: functional: Change order of two classes https://review.opendev.org/c/openstack/nova/+/864672 | 13:36 |
opendevreview | Amit Uniyal proposed openstack/nova stable/train: functional: Rework '_delete_server' https://review.opendev.org/c/openstack/nova/+/864721 | 13:36 |
opendevreview | Amit Uniyal proposed openstack/nova stable/train: functional: Unify '_build_minimal_create_server_request' implementations https://review.opendev.org/c/openstack/nova/+/864713 | 13:36 |
opendevreview | Amit Uniyal proposed openstack/nova stable/train: Extend NeutronFixture to allow live migration with ports https://review.opendev.org/c/openstack/nova/+/864900 | 13:36 |
opendevreview | Danylo Vodopianov proposed openstack/nova master: Napatech SmartNIC support https://review.opendev.org/c/openstack/nova/+/859577 | 14:04 |
opendevreview | Danylo Vodopianov proposed openstack/os-vif master: MTU support for DPDK port added https://review.opendev.org/c/openstack/os-vif/+/859574 | 14:09 |
opendevreview | John Garbutt proposed openstack/nova master: Ironic nodes with instance reserved in placement https://review.opendev.org/c/openstack/nova/+/864773 | 14:11 |
opendevreview | Merged openstack/nova stable/ussuri: add regression test case for bug 1978983 https://review.opendev.org/c/openstack/nova/+/862603 | 14:16 |
*** dasm|off is now known as dasm | 14:22 | |
opendevreview | Merged openstack/nova stable/ussuri: For evacuation, ignore if task_state is not None https://review.opendev.org/c/openstack/nova/+/862604 | 14:36 |
opendevreview | Merged openstack/nova master: DOC update remote console access https://review.opendev.org/c/openstack/nova/+/860687 | 14:37 |
johnthetubaguy | gibi: I don't think the PCI device objects are getting saved in time, I am trying to prove that in the functional test at the moment | 14:47 |
opendevreview | Merged openstack/nova master: Replace "db archive" with "db archive_deleted_raws" https://review.opendev.org/c/openstack/nova/+/847963 | 15:11 |
opendevreview | sean mooney proposed openstack/nova stable/yoga: refactor: remove duplicated logic https://review.opendev.org/c/openstack/nova/+/855022 | 15:15 |
opendevreview | sean mooney proposed openstack/nova stable/yoga: Record SRIOV PF MAC in the binding profile https://review.opendev.org/c/openstack/nova/+/855023 | 15:15 |
opendevreview | sean mooney proposed openstack/nova stable/yoga: Remove double mocking https://review.opendev.org/c/openstack/nova/+/855024 | 15:16 |
opendevreview | sean mooney proposed openstack/nova stable/yoga: Remove double mocking... again https://review.opendev.org/c/openstack/nova/+/855025 | 15:16 |
opendevreview | sean mooney proposed openstack/nova stable/yoga: Add compute restart capability for libvirt func tests https://review.opendev.org/c/openstack/nova/+/855026 | 15:16 |
opendevreview | sean mooney proposed openstack/nova stable/yoga: enable blocked VDPA move operations https://review.opendev.org/c/openstack/nova/+/855035 | 15:16 |
sean-k-mooney | gibi: finally got around to fixing the commit message on the double mocking patch. i also rebased to the tip of stable yoga | 15:16 |
sean-k-mooney | im going to try backporting that to xena and wallaby now | 15:16 |
gibi | sean-k-mooney: I'm +2 on https://review.opendev.org/q/topic:bug%252F1970467 | 15:20 |
* sean-k-mooney first patch to backport to xena already has a conflict yeah... | 15:25 | |
sean-k-mooney | ok including https://review.opendev.org/c/openstack/nova/+/829974 fixes it and that applies cleanly | 15:28 |
opendevreview | John Garbutt proposed openstack/nova master: Functional test test_boot_reschedule_with_proper_pci_device_count https://review.opendev.org/c/openstack/nova/+/760354 | 16:24 |
opendevreview | John Garbutt proposed openstack/nova master: Fix PCI passthrough race on reschedule (refresh) https://review.opendev.org/c/openstack/nova/+/710848 | 16:24 |
opendevreview | John Garbutt proposed openstack/nova master: Fix PCI passthrough race on reschedule (refresh) https://review.opendev.org/c/openstack/nova/+/710848 | 16:30 |
johnthetubaguy | gibi: I double checked with mgoddard, it turns out he found either patch will fix the issue, we just get to pick which one to merge. The functional tests seem to suggest mark is totally correct, i.e. either patch will fix it. I prefer the instance.refresh() one myself. | 16:34 |
gibi | johnthetubaguy: I'm wondering when we add the pci dev to the instance.pci_devices list. If we do that during the claim, then claim.abort should be the one to clean that up | 16:58 |
opendevreview | sean mooney proposed openstack/nova stable/xena: Fix migration with remote-managed ports & add FT https://review.opendev.org/c/openstack/nova/+/864931 | 18:51 |
opendevreview | sean mooney proposed openstack/nova stable/xena: refactor: remove duplicated logic https://review.opendev.org/c/openstack/nova/+/864932 | 18:51 |
opendevreview | sean mooney proposed openstack/nova stable/xena: Record SRIOV PF MAC in the binding profile https://review.opendev.org/c/openstack/nova/+/864933 | 18:51 |
opendevreview | sean mooney proposed openstack/nova stable/xena: Remove double mocking https://review.opendev.org/c/openstack/nova/+/864934 | 19:08 |
opendevreview | sean mooney proposed openstack/nova stable/xena: Remove double mocking... again https://review.opendev.org/c/openstack/nova/+/864935 | 19:08 |
opendevreview | sean mooney proposed openstack/nova stable/xena: Add compute restart capability for libvirt func tests https://review.opendev.org/c/openstack/nova/+/864936 | 19:08 |
opendevreview | sean mooney proposed openstack/nova stable/xena: enable blocked VDPA move operations https://review.opendev.org/c/openstack/nova/+/864937 | 19:08 |
opendevreview | Ghanshyam proposed openstack/nova master: Update gate jobs as per the 2023.1 cycle testing runtime https://review.opendev.org/c/openstack/nova/+/861111 | 20:30 |
opendevreview | Ghanshyam proposed openstack/nova master: Update gate jobs as per the 2023.1 cycle testing runtime https://review.opendev.org/c/openstack/nova/+/861111 | 20:31 |
opendevreview | Ghanshyam proposed openstack/placement master: Update gate jobs as per the 2023.1 cycle testing runtime https://review.opendev.org/c/openstack/placement/+/861471 | 20:39 |
opendevreview | Ghanshyam proposed openstack/placement master: Update gate jobs as per the 2023.1 cycle testing runtime https://review.opendev.org/c/openstack/placement/+/861471 | 20:41 |
opendevreview | Ghanshyam proposed openstack/os-traits master: Update python classifier for python 3.10 https://review.opendev.org/c/openstack/os-traits/+/861466 | 20:47 |
opendevreview | Ghanshyam proposed openstack/python-novaclient master: Update python classifier for python 3.10 https://review.opendev.org/c/openstack/python-novaclient/+/861469 | 20:57 |
opendevreview | Ghanshyam proposed openstack/osc-placement master: Update gate jobs as per the 2023.1 cycle testing runtime https://review.opendev.org/c/openstack/osc-placement/+/861470 | 21:01 |
opendevreview | Ghanshyam proposed openstack/os-vif master: Update gate jobs as per the 2023.1 cycle testing runtime https://review.opendev.org/c/openstack/os-vif/+/861468 | 21:09 |
opendevreview | Ghanshyam proposed openstack/nova master: Update gate jobs as per the 2023.1 cycle testing runtime https://review.opendev.org/c/openstack/nova/+/861111 | 21:48 |
*** dasm is now known as dasm|off | 22:47 |