*** jangutter has joined #openstack-nova | 00:06 | |
*** jangutter_ has quit IRC | 00:07 | |
*** tosky has quit IRC | 00:14 | |
*** macz_ has quit IRC | 00:17 | |
*** martinkennelly has quit IRC | 00:21 | |
*** martinkennelly has joined #openstack-nova | 00:21 | |
*** tbachman has joined #openstack-nova | 00:29 | |
*** ccstone has quit IRC | 00:57 | |
*** eandersson has quit IRC | 00:57 | |
*** ccstone has joined #openstack-nova | 00:57 | |
*** eandersson has joined #openstack-nova | 00:57 | |
*** whoami-rajat___ has joined #openstack-nova | 01:06 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add os-volume_attachments reference docs https://review.opendev.org/760971 | 01:09 |
---|---|---|
openstackgerrit | MaAoyu proposed openstack/os-traits master: bump py37 to py38 in tox.ini https://review.opendev.org/757432 | 01:24 |
*** macz_ has joined #openstack-nova | 01:27 | |
*** macz_ has quit IRC | 01:32 | |
*** k_mouza has joined #openstack-nova | 01:35 | |
*** Liang__ has joined #openstack-nova | 01:37 | |
*** Liang__ has quit IRC | 01:39 | |
*** k_mouza has quit IRC | 01:40 | |
*** sapd1 has joined #openstack-nova | 01:41 | |
*** rcernin_ has joined #openstack-nova | 01:42 | |
*** rcernin has quit IRC | 01:43 | |
*** LinPeiWen has joined #openstack-nova | 01:48 | |
*** martinkennelly has quit IRC | 01:58 | |
*** spatel has joined #openstack-nova | 02:10 | |
*** spatel has quit IRC | 02:11 | |
*** spatel has joined #openstack-nova | 02:12 | |
*** spatel has quit IRC | 02:14 | |
*** kaisers has quit IRC | 02:45 | |
*** swp20 has joined #openstack-nova | 02:49 | |
*** xinranwang has joined #openstack-nova | 03:03 | |
*** xinranwang has quit IRC | 03:04 | |
*** xinranwang has joined #openstack-nova | 03:04 | |
*** hamalq has quit IRC | 03:21 | |
*** rcernin_ has quit IRC | 03:24 | |
*** rcernin_ has joined #openstack-nova | 03:34 | |
*** rcernin_ has quit IRC | 03:47 | |
*** ratailor has joined #openstack-nova | 03:56 | |
*** sapd1 has quit IRC | 03:57 | |
*** k_mouza has joined #openstack-nova | 04:11 | |
*** Liang__ has joined #openstack-nova | 04:13 | |
*** k_mouza has quit IRC | 04:15 | |
*** rcernin_ has joined #openstack-nova | 04:19 | |
*** rcernin_ has quit IRC | 04:19 | |
*** rcernin has joined #openstack-nova | 04:19 | |
*** vishalmanchanda has joined #openstack-nova | 04:31 | |
*** rcernin has quit IRC | 05:13 | |
*** rcernin has joined #openstack-nova | 05:14 | |
*** whoami-rajat___ is now known as whoami-rajat__ | 05:32 | |
*** evrardjp has quit IRC | 05:33 | |
*** evrardjp has joined #openstack-nova | 05:33 | |
openstackgerrit | Wenping Song proposed openstack/nova-specs master: Support vGPU management by Cyborg https://review.opendev.org/750116 | 05:43 |
*** ratailor has quit IRC | 06:33 | |
*** rcernin has quit IRC | 06:37 | |
*** rcernin has joined #openstack-nova | 06:50 | |
*** Liang__ has quit IRC | 07:06 | |
*** Liang__ has joined #openstack-nova | 07:07 | |
*** lpetrut has joined #openstack-nova | 07:11 | |
*** rcernin has quit IRC | 07:12 | |
*** melwitt has quit IRC | 07:20 | |
*** melwitt has joined #openstack-nova | 07:21 | |
*** dklyle has quit IRC | 07:24 | |
*** swp20 has quit IRC | 07:30 | |
*** rcernin has joined #openstack-nova | 07:33 | |
*** rcernin has quit IRC | 07:43 | |
xinranwang | gibi: Hi gibi, I have replied to your comment about smartnic support, and there are some open questions that need your input; please check it when you have time. Thanks in advance. https://review.opendev.org/#/c/742785/6/specs/wallaby/approved/support-sriov-smartnic.rst | 07:47 |
*** slaweq has joined #openstack-nova | 07:54 | |
*** swp20 has joined #openstack-nova | 07:56 | |
*** ralonsoh has joined #openstack-nova | 07:59 | |
gibi | xinranwang: ack, I will look at it today | 08:04 |
xinranwang | gibi: great, thanks | 08:04 |
*** luksky has joined #openstack-nova | 08:09 | |
*** tosky has joined #openstack-nova | 08:13 | |
bauzas | good morning Nova | 08:13 |
*** andrewbonney has joined #openstack-nova | 08:14 | |
gibi | bauzas: o/ | 08:17 |
*** hoonetorg has quit IRC | 08:21 | |
*** tesseract has joined #openstack-nova | 08:21 | |
openstackgerrit | Merged openstack/nova master: Use subqueryload() instead of joinedload() for (system_)metadata https://review.opendev.org/758928 | 08:22 |
*** hoonetorg has joined #openstack-nova | 08:26 | |
openstackgerrit | Sylvain Bauza proposed openstack/nova master: Add a regression test for 5.12 compute API issue https://review.opendev.org/761457 | 08:36 |
openstackgerrit | Sylvain Bauza proposed openstack/nova master: Fix the compute RPC 5.12 issue https://review.opendev.org/761458 | 08:36 |
*** macz_ has joined #openstack-nova | 08:37 | |
*** rpittau|afk is now known as rpittau | 08:39 | |
bauzas | gibi: you probably missed my pings yesterday night, but I spotted a critical upgrade issue in victoria | 08:40 |
bauzas | https://bugs.launchpad.net/nova/+bug/1902925 | 08:41 |
openstack | Launchpad bug 1902925 in OpenStack Compute (nova) "Upgrades to compute RPC API 5.12 are broken" [Critical,In progress] - Assigned to Sylvain Bauza (sylvain-bauza) | 08:41 |
*** macz_ has quit IRC | 08:42 | |
bauzas | made a better explanation of the impact https://bugs.launchpad.net/nova/+bug/1902925/comments/3 | 08:47 |
openstack | Launchpad bug 1902925 in OpenStack Compute (nova) "Upgrades to compute RPC API 5.12 are broken" [Critical,In progress] - Assigned to Sylvain Bauza (sylvain-bauza) | 08:47 |
*** songwenping_ has joined #openstack-nova | 08:51 | |
*** swp20 has quit IRC | 08:54 | |
gibi | bauzas: thanks, now I read back. good catch | 08:57 |
bauzas | well, just found it when writing the RPC major bump | 08:58 |
bauzas | when you know the RPC usage, it's simple | 08:58 |
bauzas | oh shit, I forgot to add the conditional I promised to dansmith ^_^ | 08:59 |
*** ociuhandu has joined #openstack-nova | 09:06 | |
bauzas | actually, we don't need it \o/ | 09:06 |
*** ociuhandu has quit IRC | 09:16 | |
*** jangutter_ has joined #openstack-nova | 09:22 | |
*** ociuhandu has joined #openstack-nova | 09:22 | |
*** martinkennelly has joined #openstack-nova | 09:25 | |
*** ociuhandu has quit IRC | 09:25 | |
*** jangutter has quit IRC | 09:25 | |
*** ociuhandu has joined #openstack-nova | 09:26 | |
gibi | bauzas: I'm confused about the naming here https://review.opendev.org/#/c/761457/2/nova/tests/functional/regressions/test_bug_1902925.py@31 | 09:26 |
bauzas | that's what happens when you copy/paste some methods... | 09:27 |
*** k_mouza has joined #openstack-nova | 09:34 | |
gibi | bauzas: when you respin it, could you update the doc here too https://review.opendev.org/#/c/761458/2/nova/compute/manager.py@3355 | 09:35 |
gibi | besides these, the fix looks good to me | 09:37 |
*** suryasingh has joined #openstack-nova | 09:40 | |
*** macz_ has joined #openstack-nova | 09:43 | |
gibi | lyarwood, elod: when the bugfix ^^ is merged to victoria we need to push a point release | 09:44 |
gibi | as this is a critical upgrade issue to V | 09:44 |
*** derekh has joined #openstack-nova | 09:45 | |
gibi | bauzas: btw, one more request, could you add an upgrade reno to the fix? It would help making visible that upgrading to V needs this fix | 09:46 |
bauzas | gibi: sure for both | 09:47 |
gibi | thanks | 09:47 |
bauzas | I was just about to upload but I killed it | 09:47 |
*** macz_ has quit IRC | 09:48 | |
lyarwood | gibi: ack | 09:51 |
*** ociuhandu has quit IRC | 09:57 | |
*** ociuhandu has joined #openstack-nova | 09:58 | |
*** ratailor has joined #openstack-nova | 10:00 | |
*** ociuhandu has quit IRC | 10:01 | |
*** ociuhandu has joined #openstack-nova | 10:02 | |
*** kaisers has joined #openstack-nova | 10:02 | |
lyarwood | so are we not testing rebuild in grenade? | 10:02 |
openstackgerrit | Sylvain Bauza proposed openstack/nova master: Add a regression test for 5.12 compute API issue https://review.opendev.org/761457 | 10:02 |
openstackgerrit | Sylvain Bauza proposed openstack/nova master: Fix the compute RPC 5.12 issue https://review.opendev.org/761458 | 10:02 |
bauzas | gibi: done ^ | 10:03 |
elod | gibi: thx, I've planned to propose release patches for stein + train + ussuri + victoria today, but then I'll wait with the victoria release patch :] | 10:03 |
elod | lyarwood: fyi ^^^ | 10:03 |
bauzas | elod: hopefully, we'll merge it today | 10:03 |
bauzas | the fix is simple | 10:04 |
elod | bauzas: \o/ | 10:04 |
elod | then we just have to wait for the gate :] | 10:04 |
lyarwood | elod: ack thanks | 10:04 |
lyarwood | I guess we don't test rebuilds in a mixed upgrade state | 10:04 |
lyarwood | and that's why grenade multinode didn't hit this | 10:05 |
gibi | bauzas: looking | 10:05 |
bauzas | lyarwood: stephenfin: gibi: I'm not saying you were bad at reviewing (I also sometimes miss issues), but maybe it would be nice for you to look at both the fix https://review.opendev.org/#/c/761458/ and https://review.opendev.org/#/c/761452/ to understand how the RPC API works | 10:06 |
bauzas | again, no worries at all | 10:06 |
bauzas | it's more to give you folks some background on how RPC versions work | 10:07 |
stephenfin | ah, I knew that and forgot about it :( | 10:07 |
bauzas | if you knew it, all good then | 10:07 |
gibi | bauzas: yeah, thanks for the pointers. I'm wondering if we can make some test enhancements to catch these in the future | 10:08 |
lyarwood | I didn't even review the broken patch here so I'm not sure what you're trying to say | 10:08 |
lyarwood | .... | 10:08 |
bauzas | but hopefully the proxy change I'm providing is nice for knowing how to have a major version | 10:08 |
bauzas | lyarwood: not about any previous reviews, just for helping you to know what to review when you have a change with a RPC modification | 10:09 |
lyarwood | sure | 10:09 |
*** Liang__ has quit IRC | 10:10 | |
lyarwood | bauzas: look forward to your reference docs patches | 10:11 |
stephenfin | gibi: We could probably hash the signature or something? | 10:14 |
bauzas | actually, I could write something in https://docs.openstack.org/nova/latest/contributor/code-review.html | 10:14 |
bauzas | stephenfin: gibi: testing it is not simple | 10:14 |
stephenfin | i.e. identify all the position, non-optional arguments and generate/save a hash for those | 10:14 |
stephenfin | then compare each time, like we do for o.vos | 10:15 |
bauzas | since the arguments are different between RPC versions | 10:15 |
gibi | stephenfin: that would be the ovo way yes | 10:15 |
stephenfin | *positional | 10:15 |
lyarwood | bauzas: https://docs.openstack.org/nova/latest/reference/rpc.html I was thinking more in here | 10:15 |
lyarwood | bauzas: but either way | 10:15 |
bauzas | stephenfin: we have non-positional arguments that are unrelated to RPC versions | 10:15 |
bauzas | we just keep them optional | 10:16 |
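The signature-hashing idea stephenfin describes above could look roughly like the sketch below. It is only an illustration, not nova code: `required_positional_fingerprint` and the three `rebuild_*` functions are hypothetical names, and skipping `self` and `context` is an assumption. It hashes the names of the required positional parameters, so a stored hash changes when a new required argument appears and stays the same when the new argument is optional.

```python
import hashlib
import inspect


def required_positional_fingerprint(func):
    """Hash the names of the required positional parameters of ``func``."""
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    required = [
        param.name
        for param in inspect.signature(func).parameters.values()
        if param.kind in positional_kinds
        and param.default is inspect.Parameter.empty
        # Assumption: these two are present on every call, so ignore them.
        and param.name not in ("self", "context")
    ]
    return hashlib.sha256(",".join(required).encode("utf-8")).hexdigest()


# Hypothetical stand-ins for successive revisions of a manager method.
def rebuild_before(self, context, instance, image_ref):
    pass


def rebuild_optional_arg(self, context, instance, image_ref, extra=None):
    pass


def rebuild_required_arg(self, context, instance, image_ref, extra):
    pass


stored = required_positional_fingerprint(rebuild_before)
# Adding an optional argument keeps the fingerprint stable.
optional_ok = required_positional_fingerprint(rebuild_optional_arg) == stored
# Adding a required argument changes it, which a test could flag.
required_ok = required_positional_fingerprint(rebuild_required_arg) == stored
```

As bauzas notes, a check like this cannot tell which optional arguments are tied to an RPC version, so at best it would flag the required-argument case.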
bauzas | lyarwood: oh, TIL this page exists | 10:16 |
* gibi needs to think about the testing | 10:16 | |
bauzas | gibi: we fixed it by code reviews | 10:17 |
bauzas | ah, this is already documented https://docs.openstack.org/nova/latest/contributor/code-review.html#rpc-api-versions | 10:17 |
gibi | bauzas: sure, code review is the fallback, human intelligence is king, but if we can automate it then we could avoid failing humans like me at the original code review | 10:17 |
lyarwood | bauzas: ah cool | 10:18 |
bauzas | but I guess "The manager-side method needs to tolerate older calls as well as newer calls" is maybe too much overall, and we need to explain it more | 10:18 |
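A minimal sketch of what "tolerate older calls as well as newer calls" can mean in practice, assuming a hypothetical `rebuild` method that gained an `extra` argument in 5.12. None of these classes are nova's; `FakeClient.can_send_version` only mimics the idea of checking the pinned version before sending a new argument, and 5.11 is used as an illustrative older pin.

```python
def _as_tuple(version):
    """Turn '5.12' into (5, 12) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


class StrictManager:
    # The new argument is required: a caller pinned to an older
    # version, which does not send it, hits a TypeError.
    def rebuild(self, context, instance, extra):
        return ("rebuilt", instance, extra)


class TolerantManager:
    # The new argument is optional: older and newer callers both work.
    def rebuild(self, context, instance, extra=None):
        return ("rebuilt", instance, extra)


class FakeClient:
    """Hypothetical client that drops new arguments when pinned low."""

    def __init__(self, pinned_version):
        self.pinned_version = pinned_version

    def can_send_version(self, version):
        return _as_tuple(self.pinned_version) >= _as_tuple(version)

    def rebuild(self, manager, context, instance, extra=None):
        kwargs = {"context": context, "instance": instance}
        # Only send the new argument if the pin allows 5.12.
        if self.can_send_version("5.12"):
            kwargs["extra"] = extra
        return manager.rebuild(**kwargs)


pinned_old = FakeClient("5.11")
pinned_new = FakeClient("5.12")

tolerant_old = pinned_old.rebuild(TolerantManager(), None, "vm1", extra="x")
tolerant_new = pinned_new.rebuild(TolerantManager(), None, "vm1", extra="x")

try:
    pinned_old.rebuild(StrictManager(), None, "vm1", extra="x")
    strict_failed = False
except TypeError:
    strict_failed = True
```

A functional test that pins the client to the older version and calls the method is enough to surface the `TypeError` in the strict case.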
*** songwenping_ has quit IRC | 10:18 | |
bauzas | gibi: we could enforce owners to propose functional tests | 10:18 |
bauzas | for testing the RPC pins | 10:18 |
bauzas | like I did in my regression test | 10:19 |
bauzas | this would be the simplest approach | 10:19 |
gibi | I guess enforce by code review | 10:19 |
bauzas | that, yeah | 10:20 |
gibi | I agree | 10:20 |
bauzas | but from what I've seen, nobody is really doing it | 10:20 |
gibi | still I want to automate it if possible :D | 10:20 |
lyarwood | shouldn't we cover mixed compute upgrades in the multinode grenade job? | 10:20 |
bauzas | gibi: well, we don't really set new versions a lot right? | 10:20 |
gibi | because all our code review rules are only as good as the way we enforce them | 10:20 |
gibi | bauzas: we do it less and less, I agree | 10:21 |
bauzas | like, we have only had one rpc minor bump per release for a while | 10:21 |
bauzas | gibi: well, we have a code review documentation | 10:21 |
bauzas | and I expect cores to know it at least | 10:21 |
bauzas | I mean, it's a breaking change to accept an RPC change | 10:21 |
bauzas | maybe we rushed into accepting a feature that was long overdue, but requiring a functest would put the burden on code owners | 10:22 |
gibi | bauzas: you are correct that we assume that core reviews catch these kinds of problems, but they don't, as you found | 10:22 |
*** CeeMac has joined #openstack-nova | 10:23 | |
gibi | if we do these things less and less, it means it will be easier to forget what to look at in these changes | 10:23 |
gibi | as we don't exercise this knowledge | 10:24 |
*** jangutter has joined #openstack-nova | 10:25 | |
gibi | so I agree that one thing is to raise awareness of this issue as you did. | 10:26 |
gibi | but also I will think about some kind of automation as I cannot promise I won't forget this rule again 6 months from now when we bump the next | 10:27 |
*** jangutter_ has quit IRC | 10:29 | |
*** noonedeadpunk has quit IRC | 10:32 | |
*** noonedeadpunk has joined #openstack-nova | 10:32 | |
bauzas | gibi: ahah lol, i had to rush off home because I forgot my kids at the school :whoops: | 10:44 |
bauzas | gibi: fwiw, the change we merged was a bit hairy, so I do understand that it was difficult to find the problem | 10:44 |
bauzas | gibi: that's why I said we should at least ask to provide a functional test, that's it | 10:45 |
gibi | yeah, I should not forget to ask for a functional test pinning to the old RPC version when a new RPC version is proposed | 10:45 |
*** jangutter_ has joined #openstack-nova | 10:58 | |
*** ratailor_ has joined #openstack-nova | 10:58 | |
*** jangutter has quit IRC | 11:01 | |
*** ratailor has quit IRC | 11:01 | |
*** ratailor__ has joined #openstack-nova | 11:08 | |
*** ratailor_ has quit IRC | 11:11 | |
*** ratailor__ has quit IRC | 11:17 | |
*** ratailor has joined #openstack-nova | 11:17 | |
*** ratailor has quit IRC | 11:18 | |
*** ratailor has joined #openstack-nova | 11:21 | |
*** noonedeadpunk has quit IRC | 11:21 | |
*** ratailor has quit IRC | 11:21 | |
*** noonedeadpunk has joined #openstack-nova | 11:25 | |
*** ratailor has joined #openstack-nova | 11:26 | |
*** dtantsur|afk is now known as dtantsur | 11:26 | |
*** ratailor has quit IRC | 11:27 | |
*** ratailor has joined #openstack-nova | 11:27 | |
*** ratailor has quit IRC | 11:31 | |
*** ratailor has joined #openstack-nova | 11:31 | |
*** ratailor_ has joined #openstack-nova | 11:40 | |
*** ratailor has quit IRC | 11:43 | |
*** ratailor__ has joined #openstack-nova | 11:47 | |
*** ratailor_ has quit IRC | 11:50 | |
*** xinranwang has quit IRC | 11:53 | |
*** tbachman has quit IRC | 11:58 | |
*** ociuhandu has quit IRC | 12:00 | |
*** ratailor_ has joined #openstack-nova | 12:00 | |
*** ratailor_ has quit IRC | 12:01 | |
*** ratailor has joined #openstack-nova | 12:01 | |
*** ociuhandu has joined #openstack-nova | 12:01 | |
*** ratailor__ has quit IRC | 12:03 | |
brinzhang0 | gibi: hi good morning | 12:05 |
brinzhang0 | gibi: Hope you can review | 12:05 |
brinzhang0 | Cyborg shelve/unshelve support patch https://review.opendev.org/#/c/729563/ :D | 12:06 |
*** ociuhandu has quit IRC | 12:06 | |
gibi | brinzhang0: add to my queue | 12:06 |
gibi | added | 12:06 |
brinzhang0 | gibi: thanks | 12:06 |
*** JamesBenson has joined #openstack-nova | 12:10 | |
*** raildo has joined #openstack-nova | 12:23 | |
*** ratailor has quit IRC | 12:25 | |
*** jangutter has joined #openstack-nova | 12:59 | |
*** jangutter_ has quit IRC | 13:03 | |
*** ociuhandu has joined #openstack-nova | 13:14 | |
*** sapd1 has joined #openstack-nova | 13:18 | |
*** tbachman has joined #openstack-nova | 13:28 | |
*** nweinber has joined #openstack-nova | 13:28 | |
*** suryasingh has quit IRC | 13:33 | |
bauzas | brinzhang0: gibi: hah, this time the new argument is nullable :p | 13:36 |
bauzas | but maybe it's time to ask for a functional testclass verifying the RPC API ? :) | 13:36 |
*** diconico07 has joined #openstack-nova | 13:43 | |
*** derekh has quit IRC | 13:49 | |
*** derekh has joined #openstack-nova | 13:50 | |
*** jangutter_ has joined #openstack-nova | 14:02 | |
*** jangutter has quit IRC | 14:06 | |
*** arxcruz has joined #openstack-nova | 14:08 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Bump the lowest eventlet version to 0.26.1 https://review.opendev.org/761427 | 14:11 |
*** dtantsur has quit IRC | 14:16 | |
*** vishalmanchanda has quit IRC | 14:30 | |
*** dtantsur has joined #openstack-nova | 14:37 | |
*** dtantsur has quit IRC | 14:37 | |
*** dtantsur has joined #openstack-nova | 14:38 | |
*** jangutter_ is now known as jangutter | 14:48 | |
*** belmoreira has joined #openstack-nova | 14:56 | |
*** dtantsur has quit IRC | 14:56 | |
*** dtantsur has joined #openstack-nova | 14:56 | |
*** macz_ has joined #openstack-nova | 15:07 | |
*** macz_ has quit IRC | 15:12 | |
iurygregory | Hi nova folks, a friend of mine using openstack queens asked me if it's possible to update the config drive of an instance? | 15:12 |
bauzas | gibi: the next nova meeting is in 45 mins, right? | 15:12 |
bauzas | tz change | 15:12 |
*** otubo has joined #openstack-nova | 15:13 | |
gibi | bauzas: yes, the meeting is at 16:00 UTC which is 17:00 CET | 15:14 |
bauzas | cool cool | 15:14 |
bauzas | nicer for us :) | 15:14 |
gibi | :) | 15:15 |
sean-k-mooney | openstack meetings are always utc and never move | 15:18 |
sean-k-mooney | it's the others that do | 15:18 |
sean-k-mooney | fortunately DST will not be a thing in europe after 2021 | 15:19 |
dansmith | hopefully not on the west coast either, but it's not set yet | 15:20 |
sean-k-mooney | it was meant to happen this year but got delayed, so this was meant to be the last switch | 15:20 |
*** ociuhandu has quit IRC | 15:21 | |
sean-k-mooney | the current plan is that countries adopting permanent summer time will swap for the last time in the spring | 15:21 |
sean-k-mooney | and the rest will swap for the last time in the fall | 15:21 |
gibi | stephenfin, bauzas: spent some time thinking about automation to catch bugs like https://bugs.launchpad.net/nova/+bug/1902925 . Besides code review (which sometimes fails, like in this case), what we can do is extend the grenade testing. | 15:32 |
openstack | Launchpad bug 1902925 in OpenStack Compute (nova) "Upgrades to compute RPC API 5.12 are broken" [Critical,In progress] - Assigned to Sylvain Bauza (sylvain-bauza) | 15:32 |
gibi | As far as I understand it runs live migration between mixed computes | 15:32 |
bauzas | sean-k-mooney: I'm against summer time | 15:32 |
bauzas | gibi: you need then two compute services | 15:33 |
gibi | bauzas: we have multinode grenade | 15:33 |
bauzas | and a rolling upgrade scenario | 15:33 |
bauzas | because the rpc pins will automatically set the version to the oldest compute one | 15:33 |
bauzas | (if set to 'auto') | 15:33 |
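The "oldest compute wins" behaviour bauzas mentions can be pictured with the toy function below. It is a sketch under assumptions: `auto_pin` and the version lists are made up for illustration, and nova's actual lookup is not reproduced here. Versions are compared as integer tuples, because comparing the strings directly would sort "5.9" after "5.12".

```python
def _as_tuple(version):
    """Turn '5.12' into (5, 12) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def auto_pin(compute_versions):
    """Return the lowest RPC version among the running computes.

    Hypothetical helper: with an 'auto' pin, callers would send
    messages at this version until every compute is upgraded.
    """
    return min(compute_versions, key=_as_tuple)


# One compute not yet upgraded keeps everything pinned to its version.
mixed = auto_pin(["5.12", "5.11", "5.12"])
# Once all computes are upgraded, the pin moves up.
upgraded = auto_pin(["5.12", "5.12"])
# Numeric comparison matters: '5.9' is older than '5.12'.
numeric = auto_pin(["5.12", "5.9"])
```

This is why the bug only shows up during a rolling upgrade: as long as one older compute is running, upgraded computes receive calls at the older version.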
gibi | I think nova-grenade-multinode does what we need | 15:34 |
bauzas | and again, tbh, I wonder whether it's just a code review usage | 15:34 |
gibi | as per https://github.com/openstack/nova/blob/d25bc07d26212408211b64953af7ef6047ca3d9d/playbooks/legacy/nova-grenade-multinode/run.yaml#L47-L50 | 15:34 |
bauzas | dansmith: your thoughts on it ? tl;dr: automated upgrade testing vs. asking for functional tests that would verify an RPC version minor bump | 15:34 |
bauzas | gibi: if we run two computes, then okay, we don't need them to be on separate nodes but the other services | 15:35 |
bauzas | ie. aio+compute | 15:35 |
bauzas | which is what grenade-multinode is doing AFAIR | 15:35 |
bauzas | so you're right | 15:35 |
bauzas | gibi: but then we need to test all the RPC calls in tempest | 15:36 |
bauzas | good luck with this | 15:36 |
bauzas | I just feel the simplest is just to ask for functests | 15:36 |
bauzas | I wrote them yesterday night and it took me 20 mins | 15:36 |
*** dklyle has joined #openstack-nova | 15:37 | |
gibi | bauzas: never said that we should not ask for a func test. I'm saying that we tend to forget about it as the current bug shows | 15:37 |
dansmith | yeah, so we could always pin the version to .0, | 15:37 |
dansmith | but coverage in tempest will be hard, | 15:37 |
dansmith | plus tempest needs to be graceful as some api calls will fail expectedly if the version doesn't support the new feature | 15:37 |
bauzas | technically, we need to set the pin to the previous release version | 15:38 |
sean-k-mooney | bauzas: same i want to stick on utc in my case | 15:38 |
dansmith | I'd prefer some test that ensures we've hit all the versions for each call in unit/func or something | 15:38 |
bauzas | dansmith: that's my thoughts | 15:38 |
gibi | dansmith: I thought about that angle, as that would be a good thing in my eyes | 15:38 |
bauzas | checking it thru tempest is something I'd love, but I'm pragmativ | 15:38 |
bauzas | pragmatic | 15:39 |
dansmith | gibi: yeah | 15:39 |
bauzas | we honestly have the pattern to ask with my functest | 15:39 |
bauzas | it's just a simple request | 15:39 |
bauzas | and I guess (or I hope) none of the cores will miss this | 15:39 |
gibi | I have my doubt about my memory | 15:39 |
sean-k-mooney | testing rpc versions? trying to catch up on the converstation | 15:40 |
gibi | so I won't promise I will always remember | 15:40 |
gibi | sean-k-mooney: basically avoiding the bug https://bugs.launchpad.net/nova/+bug/1902925 | 15:40 |
openstack | Launchpad bug 1902925 in OpenStack Compute (nova) "Upgrades to compute RPC API 5.12 are broken" [Critical,In progress] - Assigned to Sylvain Bauza (sylvain-bauza) | 15:40 |
bauzas | gibi: well, RPC and DB upgrades are possibly the biggest changes we could review, right? | 15:40 |
gibi | right | 15:40 |
bauzas | I could understand this for a simple method | 15:40 |
bauzas | but for all the manager services, the risk is present | 15:41 |
bauzas | but either way, the meeting is in 20 mins | 15:41 |
bauzas | probably the best is to discuss it there | 15:41 |
sean-k-mooney | that had functional test for what it was worth | 15:41 |
bauzas | sean-k-mooney: the cyborg patches ? nope | 15:41 |
sean-k-mooney | my original one did https://review.opendev.org/#/c/715326/ | 15:42 |
sean-k-mooney | https://review.opendev.org/#/c/715326/29/nova/tests/functional/test_servers.py | 15:42 |
dansmith | gibi: details of db, rpc, and general upgrade issues in patches have always required lots of human review to get right.. in the early days when we went from not-upgradeable to where we are now.. we added lots of tests where we could, and developer traps like the required db migration tests, | 15:43 |
dansmith | gibi: but automating all the things is hard and the issues are complex | 15:43 |
bauzas | sean-k-mooney: you won't catch this error then | 15:43 |
dansmith | gibi: so I'm all for trying to catch more stuff, especially in a case like this where we just lacked such a test, but ... human review is not replaceable, obviously | 15:43 |
bauzas | sean-k-mooney: see my regression test, it does capture the bug https://review.opendev.org/#/c/761457/1/nova/tests/functional/regressions/test_bug_1902925.py | 15:43 |
bauzas | actually https://review.opendev.org/#/c/761457/3/nova/tests/functional/regressions/test_bug_1902925.py | 15:44 |
*** ociuhandu has joined #openstack-nova | 15:45 | |
bauzas | stephenfin: I don't get your -1 https://review.opendev.org/#/c/761458/3/releasenotes/notes/bug_1902925-351f563340a1e9a5.yaml@11 | 15:45 |
bauzas | stephenfin: the 'fixes' reno section is meant to show the fixed bugs | 15:45 |
bauzas | so that's normal we won't show this note until we merge the patch | 15:45 |
stephenfin | I'm saying that the docs job won't pass until you do what gibi suggested | 15:46 |
sean-k-mooney | bauzas: because it needs the version cap to trigger it | 15:46 |
stephenfin | you need to add a leading underscore to '.. bug 1902925:', i.e. '.. _bug 1902925:' | 15:47 |
openstack | bug 1902925 in OpenStack Compute (nova) "Upgrades to compute RPC API 5.12 are broken" [Critical,In progress] https://launchpad.net/bugs/1902925 - Assigned to Sylvain Bauza (sylvain-bauza) | 15:47 |
bauzas | stephenfin: ah that, no worries I'll fix it | 15:47 |
gibi | dansmith: I agree that we need human review. All I want is to aid that human review _if possible_ | 15:48 |
dansmith | gibi: for sure | 15:48 |
sean-k-mooney | ok i see what you have changed. hum ok | 15:49 |
bauzas | sean-k-mooney: I explained the issue in https://bugs.launchpad.net/nova/+bug/1902925/comments/3 | 15:51 |
openstack | Launchpad bug 1902925 in OpenStack Compute (nova) "Upgrades to compute RPC API 5.12 are broken" [Critical,In progress] - Assigned to Sylvain Bauza (sylvain-bauza) | 15:51 |
sean-k-mooney | ya i just wanted to read the reproducer and code fix | 15:51 |
sean-k-mooney | i guess we have not extended this api since we did the 5.0 rpc bump | 15:52 |
sean-k-mooney | none of the other fields are optional | 15:52 |
tobias-urdin | any good (and somewhat "supported" way) to extend the nova metadata API to include some custom paths? IIRC some ways of extending nova has been deprecated/removed over the years | 15:53 |
sean-k-mooney | yes | 15:53 |
sean-k-mooney | tobias-urdin: https://docs.openstack.org/nova/latest/admin/vendordata.html | 15:53 |
sean-k-mooney | that or i guess you could use middleware | 15:54 |
sean-k-mooney | but in general nova is not extensible in this way intentionally | 15:54 |
noonedeadpunk | sean-k-mooney: I think I might have figured out why isolated aggregates got instances from time to time. any reason not to pass rebuilds through scheduler? https://opendev.org/openstack/nova/src/branch/master/nova/scheduler/manager.py#L146 | 15:55 |
sean-k-mooney | noonedeadpunk: rebuilds cant change host | 15:55 |
sean-k-mooney | noonedeadpunk: they are not move operations | 15:55 |
sean-k-mooney | and they do go to the schduler if the image changes | 15:55 |
noonedeadpunk | uh.... I see | 15:55 |
noonedeadpunk | well, I continue get 1 instance per month or smth like that on the isolated aggregate | 15:56 |
sean-k-mooney | to validate that the current host is still ok with the new image | 15:56 |
noonedeadpunk | and have no clue how that might happen... | 15:56 |
sean-k-mooney | are you using the placement way of doing it | 15:57 |
sean-k-mooney | or the filter | 15:57 |
sean-k-mooney | with the placement way that should not happen, as the traits request should block it, but the filter requires all tenants to be mapped to an aggregate | 15:58 |
sean-k-mooney | or unmapped tenants can go to any host | 15:58 |
openstackgerrit | Sylvain Bauza proposed openstack/nova master: Fix the compute RPC 5.12 issue https://review.opendev.org/761458 | 15:58 |
bauzas | dansmith: gibi: stephenfin: last round, hopefully | 15:59 |
bauzas | and then I'll backport the changes | 15:59 |
gibi | thanks | 15:59 |
*** belmoreira has quit IRC | 16:09 | |
noonedeadpunk | sean-k-mooney: do placement traits, exactly like specified in https://docs.openstack.org/nova/latest/reference/isolate-aggregates.html which you gave me one day | 16:11 |
noonedeadpunk | out of code I see no reason why this might happen | 16:11 |
noonedeadpunk | but it does | 16:11 |
sean-k-mooney | ya im not sure either | 16:12 |
sean-k-mooney | unless you have multiple schedulers and one of them has a different config | 16:13 |
noonedeadpunk | and if previously it was only during resizes or smth like that I found just new VM created a week ago... | 16:13 |
sean-k-mooney | e.g. on one of them you dont have the prefilter enabled | 16:13 |
noonedeadpunk | well, I think I've checked that.... in terms of prefilter you mean scheduler.enable_isolated_aggregate_filtering ? | 16:14 |
sean-k-mooney | bauzas: so that is what was breaking the grenade jobs? good find | 16:15 |
bauzas | only for the rebuild case | 16:16 |
bauzas | so, maybe... | 16:16 |
bauzas | idk | 16:16 |
gibi | grenade does not run evacuate or rebuild tests as far as I know | 16:16 |
gibi | it runs live migration | 16:16 |
sean-k-mooney | it runs full tempest before and after | 16:16 |
sean-k-mooney | i think | 16:17 |
gibi | really? I only found a smoke result | 16:17 |
sean-k-mooney | im thinking that if the vm that landed on the unupgraded node was rebuilt it would fail | 16:17 |
gibi | + the live migration | 16:17 |
sean-k-mooney | maybe im wrong | 16:17 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/playbooks/legacy/nova-grenade-multinode/run.yaml#L40 | 16:20 |
sean-k-mooney | its running the compute api tests and scenario tests | 16:20 |
sean-k-mooney | oh just the smoke subset of those? | 16:21 |
dansmith | just smoke before, not sure about full after though | 16:21 |
sean-k-mooney | if its running rebuild after, then if it booted on the upgraded node we would get the type error | 16:22 |
sean-k-mooney | if it booted on the un-upgraded node it would have rebuilt fine | 16:22 |
sean-k-mooney | which would have made the test failure intermittent | 16:23 |
dansmith | I dunno why you say that, | 16:23 |
dansmith | the control plane would be upgraded, | 16:23 |
dansmith | oh you mean because the pin is set to auto and the presence of an old compute would keep it pinned I guess? | 16:23 |
sean-k-mooney | yes | 16:24 |
sean-k-mooney | it would be pinned but the old nova code would not expect the parameter and the new code would | 16:24 |
dansmith | that only works for U->V jobs, since V supported it, it'll be using the new version | 16:24 |
dansmith | you need to be looking at U->V grenade multinode jobs I'd expect right? | 16:24 |
dansmith | also, as bad as the gate has been lately, it wouldn't surprise me if people have just been rechecking past that occasional fail | 16:25 |
sean-k-mooney | well v->master would work since they would both use 5.12+ | 16:25 |
dansmith | that's my point | 16:25 |
sean-k-mooney | u->v would (possibly) be intermittent | 16:25 |
sean-k-mooney | so yes | 16:25 |
dansmith | right, very many fewer things running that configuration | 16:25 |
*** jangutter has quit IRC | 16:26 | |
sean-k-mooney | i have just been seeing some intermittent grenade job failures before the ptg so was wondering if this was the issue or if there are others | 16:26 |
*** jangutter has joined #openstack-nova | 16:26 | |
sean-k-mooney | most of the issues seem to be related to volumes, however, rather than rebuild | 16:27 |
dansmith | could be.. so many CI fails lately, I expect people are doing a lot of recheck grinding | 16:27 |
*** macz_ has joined #openstack-nova | 16:27 | |
gibi | this is a recent grenade multinode run from stable/victoria https://1cc2260295ba1f69c29d-8ad4cd99420b0d8b2b27089e00008c76.ssl.cf1.rackcdn.com/761424/1/check/nova-grenade-multinode/e3cf1bf/logs/index.html | 16:32 |
gibi | I see two test reports | 16:32 |
gibi | https://1cc2260295ba1f69c29d-8ad4cd99420b0d8b2b27089e00008c76.ssl.cf1.rackcdn.com/761424/1/check/nova-grenade-multinode/e3cf1bf/logs/old/testr_results.html | 16:32 |
gibi | and | 16:32 |
gibi | https://1cc2260295ba1f69c29d-8ad4cd99420b0d8b2b27089e00008c76.ssl.cf1.rackcdn.com/761424/1/check/nova-grenade-multinode/e3cf1bf/logs/testr_results.html | 16:32 |
gibi | is there a 3rd report somewhere in the tree? | 16:32 |
sean-k-mooney | nope | 16:33 |
sean-k-mooney | just those two | 16:33 |
sean-k-mooney | so we are not running rebuild in the grenade job | 16:33 |
sean-k-mooney | i tought we were but i guess not | 16:34 |
sean-k-mooney | the grenade failures i was seeing were likely something else so | 16:35 |
sean-k-mooney | it's been like 2 weeks so all that is left in my brain on the topic is "i have seen more grenade failures lately than i normally do" | 16:36 |
openstackgerrit | Sylvain Bauza proposed openstack/nova stable/victoria: Add a regression test for 5.12 compute API issue https://review.opendev.org/761638 | 16:48 |
openstackgerrit | Sylvain Bauza proposed openstack/nova stable/victoria: Fix the compute RPC 5.12 issue https://review.opendev.org/761639 | 16:48 |
bauzas | elod: stable changes are up there ^ | 16:50 |
bauzas | hopefully master changes will be merged tonight so we could move on tomorrow | 16:51 |
*** sapd1 has quit IRC | 16:51 | |
bauzas | and ideally release subsequently | 16:51 |
bauzas | (release stable/victoria) | 16:51 |
elod | bauzas: thx, looking :) | 16:51 |
bauzas | elod: don't | 16:52 |
bauzas | the master change isn't merged yet so I -2 it | 16:52 |
elod | don't worry I'll wait with the +2 until master is merged ;) | 16:52 |
elod | (if I don't find any mistake with the backport, ofc) | 16:53 |
elod | :] | 16:53 |
openstackgerrit | Merged openstack/nova master: Add a regression test for 5.12 compute API issue https://review.opendev.org/761457 | 17:00 |
bauzas | elod: heh ^ | 17:03 |
*** jamesden_ has joined #openstack-nova | 17:08 | |
*** JamesBen_ has joined #openstack-nova | 17:11 | |
elod | bauzas: ok, so the regression test part is ready and looks OK. +2'd | 17:11 |
bauzas | <3 | 17:11 |
elod | one more to go :) | 17:11 |
*** JamesBenson has quit IRC | 17:14 | |
elod | the backport of the fix also looks good to me and the fix is on the gate in master, so we just have to wait. | 17:16 |
elod | i'll prepare a release patch tomorrow for victoria if the fix gets merged | 17:17 |
*** k_mouza has quit IRC | 17:18 | |
*** k_mouza has joined #openstack-nova | 17:18 | |
*** rpittau is now known as rpittau|afk | 17:21 | |
*** k_mouza has quit IRC | 17:23 | |
*** gyee has joined #openstack-nova | 17:35 | |
*** hamalq has joined #openstack-nova | 17:38 | |
*** derekh has quit IRC | 18:00 | |
*** ociuhandu has quit IRC | 18:04 | |
*** ociuhandu has joined #openstack-nova | 18:12 | |
*** ociuhandu has quit IRC | 18:16 | |
*** haleyb has quit IRC | 18:18 | |
*** haleyb has joined #openstack-nova | 18:19 | |
*** andrewbonney has quit IRC | 18:29 | |
*** ralonsoh has quit IRC | 18:41 | |
*** k_mouza has joined #openstack-nova | 18:44 | |
*** k_mouza has quit IRC | 18:48 | |
*** dtantsur is now known as dtantsur|afk | 18:53 | |
*** lpetrut has quit IRC | 19:16 | |
*** _mlavalle_2 has quit IRC | 19:16 | |
*** lbragstad has quit IRC | 19:18 | |
*** tesseract has quit IRC | 19:31 | |
*** bbowen has quit IRC | 20:02 | |
*** rchurch has quit IRC | 20:22 | |
*** ociuhandu has joined #openstack-nova | 20:57 | |
*** ociuhandu has quit IRC | 20:58 | |
*** ociuhandu_ has joined #openstack-nova | 20:58 | |
*** bbowen has joined #openstack-nova | 21:13 | |
*** ociuhandu_ has quit IRC | 21:23 | |
*** ociuhandu has joined #openstack-nova | 21:25 | |
*** rcernin has joined #openstack-nova | 21:27 | |
*** ociuhandu has quit IRC | 21:35 | |
*** k_mouza has joined #openstack-nova | 21:45 | |
*** nweinber has quit IRC | 21:48 | |
*** k_mouza has quit IRC | 21:49 | |
*** rcernin has quit IRC | 21:53 | |
*** rcernin has joined #openstack-nova | 21:54 | |
openstackgerrit | Merged openstack/nova master: Fix the compute RPC 5.12 issue https://review.opendev.org/761458 | 21:55 |
*** slaweq has quit IRC | 22:06 | |
*** mlavalle has joined #openstack-nova | 22:08 | |
*** hamalq has quit IRC | 22:11 | |
*** jangutter_ has joined #openstack-nova | 22:23 | |
*** jangutter has quit IRC | 22:25 | |
*** jamesden_ has quit IRC | 22:33 | |
*** rcernin has quit IRC | 22:57 | |
*** rcernin has joined #openstack-nova | 23:01 | |
*** spatel has joined #openstack-nova | 23:20 | |
openstackgerrit | Merged openstack/nova stable/victoria: Add a regression test for 5.12 compute API issue https://review.opendev.org/761638 | 23:20 |
*** spatel has quit IRC | 23:25 | |
*** ociuhandu has joined #openstack-nova | 23:36 | |
*** ociuhandu has quit IRC | 23:40 | |
*** luksky has quit IRC | 23:46 | |
*** tosky has quit IRC | 23:55 |