melwitt | alex_xu: ack, thanks | 00:08 |
---|---|---|
*** Dinesh_Bhor has joined #openstack-nova | 00:34 | |
*** macza has joined #openstack-nova | 00:37 | |
*** macza has quit IRC | 00:42 | |
*** tetsuro has joined #openstack-nova | 00:43 | |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Ensure attachment cleanup on failure in driver.pre_live_migration https://review.openstack.org/587439 | 00:48 |
*** tbachman has joined #openstack-nova | 00:56 | |
*** moshele has joined #openstack-nova | 01:04 | |
*** slaweq has joined #openstack-nova | 01:11 | |
*** TuanDA has joined #openstack-nova | 01:14 | |
*** slaweq has quit IRC | 01:16 | |
*** takashin has joined #openstack-nova | 01:17 | |
*** mrsoul has joined #openstack-nova | 01:26 | |
*** Dinesh_Bhor has quit IRC | 01:27 | |
*** Dinesh_Bhor has joined #openstack-nova | 01:35 | |
*** openstackgerrit has quit IRC | 01:35 | |
*** dillaman has joined #openstack-nova | 01:52 | |
*** mhen has quit IRC | 01:52 | |
*** READ10 has joined #openstack-nova | 01:52 | |
*** jdillaman has quit IRC | 01:52 | |
*** moshele has quit IRC | 01:58 | |
*** Dinesh_Bhor has quit IRC | 02:02 | |
*** hongbin has joined #openstack-nova | 02:09 | |
*** annp has joined #openstack-nova | 02:26 | |
*** READ10 has quit IRC | 02:32 | |
*** openstackgerrit has joined #openstack-nova | 02:45 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in virt/test_block_device.py https://review.openstack.org/566153 | 02:45 |
*** macza has joined #openstack-nova | 02:55 | |
*** Dinesh_Bhor has joined #openstack-nova | 02:55 | |
*** macza has quit IRC | 02:59 | |
*** hshiina has joined #openstack-nova | 03:02 | |
*** hshiina has joined #openstack-nova | 03:03 | |
*** hshiina has quit IRC | 03:03 | |
*** hshiina has joined #openstack-nova | 03:04 | |
*** hshiina has quit IRC | 03:05 | |
*** hshiina has joined #openstack-nova | 03:06 | |
*** hshiina has quit IRC | 03:06 | |
*** hshiina has joined #openstack-nova | 03:08 | |
*** hshiina has quit IRC | 03:08 | |
*** hshiina has joined #openstack-nova | 03:09 | |
*** hshiina has quit IRC | 03:09 | |
*** hshiina has joined #openstack-nova | 03:11 | |
*** slaweq has joined #openstack-nova | 03:11 | |
*** slaweq has quit IRC | 03:16 | |
*** Dinesh_Bhor has quit IRC | 03:36 | |
openstackgerrit | Merged openstack/python-novaclient master: Add support for microversion 2.67: BDMv2 volume_type https://review.openstack.org/609743 | 03:37 |
*** hongbin has quit IRC | 03:48 | |
*** Bhujay has joined #openstack-nova | 03:51 | |
*** Bhujay has quit IRC | 03:52 | |
*** lpetrut has joined #openstack-nova | 03:52 | |
*** Bhujay has joined #openstack-nova | 03:52 | |
*** Bhujay has quit IRC | 04:06 | |
*** slaweq has joined #openstack-nova | 04:11 | |
*** macza has joined #openstack-nova | 04:12 | |
*** slaweq has quit IRC | 04:16 | |
*** Dinesh_Bhor has joined #openstack-nova | 04:30 | |
*** lpetrut has quit IRC | 04:34 | |
*** liuyulong has quit IRC | 04:58 | |
*** tetsuro has quit IRC | 05:04 | |
*** macza_ has joined #openstack-nova | 05:08 | |
*** slaweq has joined #openstack-nova | 05:11 | |
*** macza has quit IRC | 05:12 | |
*** macza_ has quit IRC | 05:12 | |
*** macza has joined #openstack-nova | 05:13 | |
*** slaweq has quit IRC | 05:15 | |
*** macza has quit IRC | 05:20 | |
*** macza has joined #openstack-nova | 05:20 | |
*** macza has quit IRC | 05:24 | |
*** ratailor has joined #openstack-nova | 05:30 | |
*** janki has joined #openstack-nova | 05:47 | |
*** dpawlik has joined #openstack-nova | 05:55 | |
*** dpawlik has quit IRC | 06:00 | |
*** Luzi has joined #openstack-nova | 06:01 | |
*** dpawlik has joined #openstack-nova | 06:11 | |
*** slaweq has joined #openstack-nova | 06:11 | |
*** dpawlik has quit IRC | 06:12 | |
*** dpawlik has joined #openstack-nova | 06:12 | |
*** Dinesh_Bhor has quit IRC | 06:16 | |
*** slaweq has quit IRC | 06:16 | |
*** adrianc has joined #openstack-nova | 06:28 | |
*** moshele has joined #openstack-nova | 06:30 | |
*** edisonxiang has joined #openstack-nova | 06:44 | |
*** tommylikehu has joined #openstack-nova | 06:54 | |
*** slaweq has joined #openstack-nova | 06:56 | |
*** mrjk has quit IRC | 07:00 | |
*** rcernin has quit IRC | 07:01 | |
*** mrjk has joined #openstack-nova | 07:02 | |
*** helenafm has joined #openstack-nova | 07:18 | |
*** jpena|off is now known as jpena | 07:20 | |
*** lpetrut has joined #openstack-nova | 07:20 | |
*** lpetrut has quit IRC | 07:25 | |
*** markvoelker has quit IRC | 07:29 | |
*** markvoelker has joined #openstack-nova | 07:29 | |
*** ralonsoh has joined #openstack-nova | 07:29 | |
*** jangutter has quit IRC | 07:30 | |
*** jangutter has joined #openstack-nova | 07:30 | |
*** ttsiouts has joined #openstack-nova | 07:30 | |
*** Dinesh_Bhor has joined #openstack-nova | 07:33 | |
*** mvkr has quit IRC | 07:34 | |
*** markvoelker has quit IRC | 07:34 | |
bauzas | good morning nova | 07:37 |
*** edisonxiang has quit IRC | 07:39 | |
openstackgerrit | Jan Gutter proposed openstack/os-vif master: Extend host_info to cover port profiles https://review.openstack.org/610636 | 07:40 |
*** icey has quit IRC | 07:52 | |
*** icey has joined #openstack-nova | 07:52 | |
dpawlik | morning | 08:01 |
dpawlik | quick question: why does nova service-list show me ID instead of UUID? | 08:01 |
*** dtantsur|afk is now known as dtantsur | 08:03 | |
*** moshele has quit IRC | 08:05 | |
*** gnuoy has quit IRC | 08:10 | |
*** k_mouza has joined #openstack-nova | 08:12 | |
openstackgerrit | Merged openstack/nova stable/rocky: Handle missing marker during online data migration https://review.openstack.org/608572 | 08:13 |
*** mvkr has joined #openstack-nova | 08:14 | |
*** tetsuro has joined #openstack-nova | 08:21 | |
*** yikun has quit IRC | 08:22 | |
*** hshiina has quit IRC | 08:23 | |
*** fghaas has joined #openstack-nova | 08:27 | |
fghaas | kashyap: taking the liberty to follow up on https://review.openstack.org/#/c/609788 which you asked me to take a stab on. If you could give that a read to make sure that I didn't chuck in anything stupid, I'd be grateful. Thanks! | 08:29 |
*** markvoelker has joined #openstack-nova | 08:30 | |
kashyap | fghaas: Hey | 08:30 |
kashyap | fghaas: I did see the email, and even partly reviewed it. But was buried in preparing a conference talk | 08:31 |
kashyap | (Also incidentally related to CPU models) | 08:31 |
*** priteau has joined #openstack-nova | 08:31 | |
kashyap | fghaas: I'll definitely look at it by EOD, I have it open. Sorry for the delay | 08:31 |
fghaas | I'm so surprised that your conf talk would be about *that*, of all things. ;) | 08:31 |
kashyap | Heh | 08:31 |
kashyap | Will you be in Edinburgh? | 08:32 |
kashyap | (Open Source Summit & KVM Forum) | 08:32 |
fghaas | No I won't. They waitlisted my talk, but it didn't get in. | 08:33 |
fghaas | But I will be in Berlin assuming you're headed there. | 08:33 |
dpawlik | nvm about my question. I didn't see that there are nova and nova_legacy service types | 08:34 |
kashyap | fghaas: Yeah, will be there | 08:34 |
fghaas | Perfect. I still owe you dinner or at least a drink for your help with nesting and migration. | 08:34 |
*** mvkr has quit IRC | 08:35 | |
*** gnuoy has joined #openstack-nova | 08:41 | |
*** takashin has left #openstack-nova | 08:42 | |
kashyap | fghaas: No, don't be crazy | 08:47 |
kashyap | :-) | 08:47 |
kashyap | fghaas: Did a quick review, posted a couple of comments. Looks largely good. | 08:47 |
* kashyap crawls back into LaTeX cave | 08:47 | |
*** mvkr has joined #openstack-nova | 08:48 | |
kashyap | (And yes, a drink in Berlin does sound good.) | 08:48 |
fghaas | oh, beamer. That's something my little brain is incapable of grasping. :) | 08:50 |
kashyap | fghaas: I hate myself for the amount of time I spend on it. But I love its typography | 08:51 |
kashyap | And the TikZ diagramming is so clean, nothing comes close to it. | 08:51 |
kashyap | But I'm just not super fast with it. Maybe in a few years | 08:51 |
*** derekh has joined #openstack-nova | 08:52 | |
*** cdent has joined #openstack-nova | 08:59 | |
fghaas | kashyap: reveal.js plus draw.io is my drug of choice. | 09:02 |
fghaas | Replied to your comments. Sorry for putting on my difficult tech writer hat. :) | 09:03 |
*** markvoelker has quit IRC | 09:03 | |
kashyap | The day I get fed up w/ TeX+TikZ, I'll consider the alternatives :-) | 09:04 |
kashyap | And nit-picky tech writer hat is good, I appreciate it. | 09:05 |
kashyap | fghaas: Aah, some of the text was already pre-existing. I should've realized it. Will respond in a few | 09:06 |
fghaas | No rush at all. If we can get this sorted within the week, I'm happy. | 09:06 |
*** panda|off has quit IRC | 09:15 | |
*** panda has joined #openstack-nova | 09:18 | |
*** helenafm has quit IRC | 09:25 | |
openstackgerrit | Jan Gutter proposed openstack/os-vif master: Do not call linux_net.delete_net_dev on Windows https://review.openstack.org/610916 | 09:30 |
jangutter | Is anyone familiar enough with Hyper-V functionality to check for a better way of doing ^ ? | 09:31 |
openstackgerrit | Jan Gutter proposed openstack/os-vif master: Extend host_info to cover port profiles https://review.openstack.org/610636 | 09:33 |
*** tommylikehu has quit IRC | 09:33 | |
*** k_mouza has quit IRC | 09:36 | |
*** k_mouza has joined #openstack-nova | 09:37 | |
*** mvkr has quit IRC | 09:40 | |
*** mvkr has joined #openstack-nova | 09:41 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Modify PciDevice.uuid generation code https://review.openstack.org/530487 | 09:44 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add an online migration for PciDevice.uuid https://review.openstack.org/530905 | 09:44 |
*** ShilpaSD has joined #openstack-nova | 09:45 | |
ShilpaSD | mriedem: Hi | 09:47 |
ShilpaSD | mriedem: Please guide me on how to reproduce the cold migrate issue mentioned at https://bugs.launchpad.net/nova/+bug/1784020 in section 2. b) | 09:48 |
openstack | Launchpad bug 1784020 in OpenStack Compute (nova) "Shared storage providers are not supported and will break things if used" [Medium,Fix released] | 09:48 |
*** imacdonn has quit IRC | 09:54 | |
*** imacdonn has joined #openstack-nova | 09:55 | |
*** mrch has joined #openstack-nova | 09:56 | |
openstackgerrit | Florian Haas proposed openstack/nova master: Explain cpu_model_extra_flags and nested guest support https://review.openstack.org/609788 | 09:56 |
openstackgerrit | Florian Haas proposed openstack/nova stable/queens: Explain cpu_model_extra_flags and nested guest support https://review.openstack.org/609790 | 09:57 |
*** k_mouza has quit IRC | 09:58 | |
openstackgerrit | Florian Haas proposed openstack/nova stable/rocky: Explain cpu_model_extra_flags and nested guest support https://review.openstack.org/609789 | 09:59 |
*** TuanDA has quit IRC | 10:00 | |
*** markvoelker has joined #openstack-nova | 10:00 | |
openstackgerrit | Jan Gutter proposed openstack/os-vif master: Do not call linux_net.delete_net_dev on Windows https://review.openstack.org/610916 | 10:01 |
openstackgerrit | Florian Haas proposed openstack/nova stable/queens: Explain cpu_model_extra_flags and nested guest support https://review.openstack.org/609790 | 10:02 |
openstackgerrit | Jan Gutter proposed openstack/os-vif master: Extend host_info to cover port profiles https://review.openstack.org/610636 | 10:02 |
*** fghaas has quit IRC | 10:02 | |
*** tetsuro has quit IRC | 10:03 | |
*** erlon has joined #openstack-nova | 10:07 | |
stephenfin | gibi: Are you allowed to review these? https://review.openstack.org/#/c/482629/ | 10:07 |
stephenfin | This one too https://review.openstack.org/#/c/580345/ | 10:08 |
*** ttsiouts has quit IRC | 10:08 | |
* stephenfin has had http://burndown.peermore.com/nova-notification/ open for weeks now and would really like to close it :) | 10:08 | |
*** tetsuro has joined #openstack-nova | 10:14 | |
*** tetsuro has quit IRC | 10:16 | |
*** Dinesh_Bhor has quit IRC | 10:20 | |
*** k_mouza has joined #openstack-nova | 10:25 | |
*** ttsiouts has joined #openstack-nova | 10:29 | |
openstackgerrit | Merged openstack/nova stable/rocky: hyperv: Cleans up live migration Planned VM https://review.openstack.org/602698 | 10:29 |
openstackgerrit | Merged openstack/nova master: doc: update metadata service doc https://review.openstack.org/602593 | 10:30 |
*** markvoelker has quit IRC | 10:34 | |
*** moshele has joined #openstack-nova | 10:34 | |
gibi | stephenfin: what do you mean by allowed? :) | 10:34 |
* gibi opening the links | 10:34 | |
stephenfin | gibi: I wasn't sure if you'd authored any of them but it seems you haven't so all good | 10:35 |
gibi | stephenfin: yeah, I did not write those so I'm going to finish them off this afternoon | 10:35 |
* gibi was lazy in the past weeks about notification patches | 10:35 | |
*** helenafm has joined #openstack-nova | 10:39 | |
*** tbachman has quit IRC | 10:40 | |
*** macza has joined #openstack-nova | 10:43 | |
*** fghaas has joined #openstack-nova | 10:46 | |
*** adrianc has quit IRC | 10:46 | |
*** adrianc has joined #openstack-nova | 10:46 | |
*** macza has quit IRC | 10:47 | |
*** brinzhang has joined #openstack-nova | 10:47 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Remove duplicate legacy-tempest-dsvm-multinode-full job https://review.openstack.org/610931 | 10:50 |
jangutter | ralonsoh: do you know anyone knowledgeable on hyper-v networking? | 10:51 |
*** Dinesh_Bhor has joined #openstack-nova | 10:54 | |
ralonsoh | jangutter: no, sorry | 10:55 |
openstackgerrit | Merged openstack/nova stable/rocky: Replace usage of get_legacy_facade() with get_engine() https://review.openstack.org/608574 | 11:00 |
*** ttsiouts has quit IRC | 11:02 | |
*** ttsiouts has joined #openstack-nova | 11:03 | |
*** macza has joined #openstack-nova | 11:04 | |
*** Dinesh_Bhor has quit IRC | 11:06 | |
*** ttsiouts has quit IRC | 11:07 | |
*** dave-mccowan has joined #openstack-nova | 11:09 | |
*** macza has quit IRC | 11:09 | |
*** ttsiouts has joined #openstack-nova | 11:17 | |
*** READ10 has joined #openstack-nova | 11:25 | |
*** jpena is now known as jpena|lunch | 11:29 | |
*** k_mouza_ has joined #openstack-nova | 11:31 | |
*** markvoelker has joined #openstack-nova | 11:31 | |
*** ttsiouts has quit IRC | 11:31 | |
*** ttsiouts has joined #openstack-nova | 11:32 | |
gibi | stephenfin: I've sent the volume notification patch to the gate but I have concerns about the compute_task one | 11:32 |
*** k_mouza has quit IRC | 11:34 | |
*** ttsiouts has quit IRC | 11:34 | |
*** ttsiouts has joined #openstack-nova | 11:35 | |
*** ttsiouts has quit IRC | 11:37 | |
*** brinzhang has quit IRC | 11:38 | |
*** ttsiouts has joined #openstack-nova | 11:38 | |
*** k_mouza_ has quit IRC | 11:43 | |
*** k_mouza has joined #openstack-nova | 11:45 | |
*** k_mouza has quit IRC | 11:46 | |
*** ttsiouts has quit IRC | 11:46 | |
openstackgerrit | Jan Gutter proposed openstack/os-vif master: Do not call linux_net.delete_net_dev on Windows https://review.openstack.org/610916 | 11:47 |
*** k_mouza has joined #openstack-nova | 11:47 | |
*** ttsiouts has joined #openstack-nova | 11:47 | |
*** ttsiouts has quit IRC | 11:47 | |
*** jistr is now known as jistr|afk | 11:47 | |
*** ttsiouts has joined #openstack-nova | 11:47 | |
openstackgerrit | Jan Gutter proposed openstack/os-vif master: Extend host_info to cover port profiles https://review.openstack.org/610636 | 11:50 |
*** ttsiouts has quit IRC | 11:50 | |
*** ttsiouts has joined #openstack-nova | 11:51 | |
*** READ10 has quit IRC | 11:51 | |
*** ttsiouts has quit IRC | 11:55 | |
*** fghaas has quit IRC | 11:57 | |
*** ratailor has quit IRC | 12:00 | |
*** ttsiouts has joined #openstack-nova | 12:00 | |
*** markvoelker has quit IRC | 12:04 | |
*** tbachman has joined #openstack-nova | 12:04 | |
*** amorin has joined #openstack-nova | 12:08 | |
amorin | hey all | 12:08 |
amorin | I am facing an issue on my openstack deployment when I try to live migrate an instance | 12:08 |
amorin | if the original image is deleted on glance | 12:09 |
amorin | live-migration fails | 12:09 |
*** tbachman has quit IRC | 12:09 | |
amorin | I am using openstack newton | 12:09 |
amorin | do you know if this bug is already declared somewhere ? | 12:09 |
amorin | Eventually fixed? | 12:09 |
*** tbachman has joined #openstack-nova | 12:10 | |
*** tbachman has quit IRC | 12:22 | |
*** k_mouza has quit IRC | 12:22 | |
*** ttsiouts has quit IRC | 12:32 | |
*** priteau has quit IRC | 12:39 | |
*** k_mouza has joined #openstack-nova | 12:40 | |
*** priteau has joined #openstack-nova | 12:40 | |
*** k_mouza_ has joined #openstack-nova | 12:41 | |
*** tbachman has joined #openstack-nova | 12:42 | |
*** k_mouza has quit IRC | 12:45 | |
*** macza has joined #openstack-nova | 12:53 | |
*** macza has quit IRC | 12:58 | |
bauzas | amorin: do you have some stacktrace to share ? | 13:09 |
*** efried has joined #openstack-nova | 13:10 | |
amorin | well, it's not a stacktrace but just an image not found | 13:10 |
amorin | I will show you | 13:10 |
amorin | I have that on source host: | 13:10 |
amorin | [instance: e4c16231-5c3e-49d3-b985-c963bfa52437] Live Migration failure: internal error: info migration reply was missing return status | 13:10 |
bauzas | amorin: we had a bug like this https://bugs.launchpad.net/nova/+bug/1270825 | 13:11 |
openstack | Launchpad bug 1270825 in OpenStack Compute (nova) "Live block migration fails for instances whose glance images have been deleted" [High,Fix released] - Assigned to melanie witt (melwitt) | 13:11 |
*** tbachman has quit IRC | 13:11 | |
amorin | sounds like my issue ! | 13:11 |
bauzas | we also had https://bugs.launchpad.net/nova/+bug/1546778 | 13:12 |
openstack | Launchpad bug 1546778 in OpenStack Compute (nova) liberty "libvirt: resize with deleted backing image fails" [Medium,Fix released] - Assigned to Matthew Booth (mbooth-9) | 13:12 |
bauzas | but that's for resize | 13:12 |
amorin | first bug looks good, but I am running openstack newton... | 13:12 |
amorin | seems that the bug is supposed to be fixed since juno | 13:13 |
*** mchlumsky has joined #openstack-nova | 13:13 | |
bauzas | yup, very old bug, hence the needed stacktrace | 13:13 |
*** dpawlik has quit IRC | 13:13 | |
bauzas | we need to understand more why it fails | 13:13 |
*** tbachman has joined #openstack-nova | 13:15 | |
amorin | I'll try to restart nova compute in debug and find a trace | 13:17 |
amorin | on both source and dest host | 13:17 |
amorin | I will come back to you asap | 13:17 |
*** dpawlik has joined #openstack-nova | 13:18 | |
*** jistr|afk is now known as jistr | 13:20 | |
bauzas | ok, please highlight me when you reply so I can see it | 13:20 |
bauzas | amorin: like this :) | 13:20 |
amorin | yup | 13:20 |
openstackgerrit | Adam Spiers proposed openstack/nova-specs master: Add spec for libvirt driver launching AMD SEV-encrypted instances https://review.openstack.org/609779 | 13:22 |
*** mriedem has joined #openstack-nova | 13:23 | |
*** k_mouza has joined #openstack-nova | 13:24 | |
*** tbachman has quit IRC | 13:27 | |
*** k_mouza_ has quit IRC | 13:27 | |
*** dpawlik has quit IRC | 13:30 | |
*** tbachman has joined #openstack-nova | 13:34 | |
*** awaugama has joined #openstack-nova | 13:37 | |
*** ociuhandu has joined #openstack-nova | 13:37 | |
*** jpena|lunch is now known as jpena | 13:37 | |
*** ociuhandu has quit IRC | 13:38 | |
*** dpawlik has joined #openstack-nova | 13:40 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/queens: Handle missing marker during online data migration https://review.openstack.org/610974 | 13:42 |
mnaser | asking here because i think nova would probably be a project that does this but | 13:45 |
mnaser | is it possible to have multiple rootwrap daemons? | 13:45 |
mnaser | i'm running into problems with neutron vpnagent doing a lot of commands that take a long time which block everything else from running (in rootwrap daemon) | 13:45 |
*** eharney has joined #openstack-nova | 13:45 | |
dansmith | multiple in a load-balance situation? I kinda doubt it | 13:46 |
mnaser | even as in dedicated situation | 13:46 |
dansmith | I think that's one of the many performance limitations of rootwrap | 13:46 |
mnaser | it also made me wonder if privsep has the same issues | 13:46 |
mnaser | which i think it might not? | 13:46 |
dansmith | I think privsep is the same | 13:46 |
mnaser | because i frequently see privsep spawn processes for a specific module or so | 13:47 |
dansmith | ISTR the cinder people being concerned about that | 13:47 |
mnaser | i much rather have vpnaas (in this case) fight for resources between itself rather than with things that are important and need to happen in a few seconds | 13:47 |
mnaser | i guess the advantage this would give would be the ability to replace a shell out by a code/library | 13:49 |
smcginnis | There is a performance issue with privsep right now that it serializes anything it runs. | 13:52 |
smcginnis | So only one "privileged" thing can happen at a time. | 13:53 |
smcginnis | But there's a patch up to fix that. | 13:53 |
mnaser | smcginnis: thats the case with rootwrap daemon too, no? | 13:53 |
mnaser | at least, that's what the behavior im seeing anyways | 13:53 |
smcginnis | mnaser: I didn't think so. That just calls out to run commands, so I thought it didn't have the same issue. | 13:53 |
mnaser | well, rootwrap yes, it just calls out to run commands, but rootwrap daemon seems to do the whole serialize thing | 13:54 |
mnaser | (i think) | 13:54 |
smcginnis | Here's the privsep patch if anyone is interested - https://review.openstack.org/#/c/593556/ | 13:54 |
smcginnis | It must not be quite as bad. There was push back on moving fully to privsep because there was a noticeable performance impact in doing so. | 13:55 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Transform scheduler.select_destinations notification https://review.openstack.org/508506 | 13:55 |
dansmith | smcginnis: to be clear, he's talking about rootwrap *daemon* | 13:57 |
*** janki has quit IRC | 13:58 | |
smcginnis | Yeah | 14:00 |
*** Luzi has quit IRC | 14:00 | |
mnaser | i wonder if i can spawn an independent/second rootwrap daemon | 14:01 |
*** READ10 has joined #openstack-nova | 14:04 | |
*** mlavalle has joined #openstack-nova | 14:07 | |
*** jcosmao has joined #openstack-nova | 14:08 | |
*** mrch has quit IRC | 14:10 | |
*** dpawlik has quit IRC | 14:11 | |
bauzas | mnaser: I guess the problem is how nova.rootwrap would know which daemon to pick | 14:19 |
*** munimeha1 has joined #openstack-nova | 14:19 | |
bauzas | mnaser: oh wait, you can spawn multiple daemons, each per service, nope ? | 14:21 |
mnaser | bauzas: i think so, thats what im attempting | 14:21 |
mnaser | just launch another client.. | 14:22 |
openstackgerrit | Merged openstack/nova master: Use tempest-pg-full https://review.openstack.org/609954 | 14:22 |
bauzas | anyway, taxi time | 14:22 |
*** k_mouza has quit IRC | 14:24 | |
*** moshele has quit IRC | 14:29 | |
*** eharney has quit IRC | 14:33 | |
*** k_mouza has joined #openstack-nova | 14:40 | |
*** dpawlik has joined #openstack-nova | 14:46 | |
*** eharney has joined #openstack-nova | 14:48 | |
*** dpawlik has quit IRC | 14:50 | |
bauzas | mnaser: interesting to read https://specs.openstack.org/openstack/oslo-specs/specs/juno/rootwrap-daemon-mode.html#client-api | 14:57 |
*** ccamacho has quit IRC | 14:57 | |
*** ccamacho has joined #openstack-nova | 15:00 | |
*** READ10 has quit IRC | 15:01 | |
bauzas | mnaser: more interesting https://github.com/openstack/nova/blob/master/nova/utils.py#L126 | 15:05 |
bauzas | mnaser: we allow one client per rootwrap config | 15:05 |
bauzas | if two configs, two clients | 15:06 |
bauzas | and since clients lazily load daemons if needed... | 15:06 |
bauzas | and I guess https://github.com/openstack/nova/blob/master/nova/utils.py#L123 is your PITA | 15:07 |
mnaser | bauzas: I’m investigating on how to pull this out and seeing what breaks terribly with multiple clients | 15:09 |
bauzas | mnaser: could you test something ? what if you have two distinct services running different config files, with each of them differing by the rootwrap_config option value | 15:13 |
bauzas | mnaser: in this case, I guess we would automatically create two clients and two daemons | 15:14 |
openstackgerrit | Sundar Nadathur proposed openstack/nova-specs master: Nova Cyborg interaction specification. https://review.openstack.org/603955 | 15:27 |
kashyap | mriedem: Hi, I saw a ping fly by last night on 'hpet' and libvirt. I wonder if it's resolved | 15:29 |
* kashyap has been heads-down preparing for a conf; so been a bit less active here | 15:30 | |
mriedem | https://review.openstack.org/#/c/607989/ | 15:34 |
mriedem | lots of chatter yesterday about whether or not libvirt exposed hpet capability as a clock source for the host caps | 15:35 |
mriedem | it doesn't look like it does | 15:35 |
openstackgerrit | Jan Gutter proposed openstack/os-vif master: Do not call linux_net.delete_net_dev on Windows https://review.openstack.org/610916 | 15:36 |
openstackgerrit | Jan Gutter proposed openstack/os-vif master: Fix random test_unplug_ovs failures https://review.openstack.org/611017 | 15:36 |
openstackgerrit | Ivaylo Mitev proposed openstack/nova master: VMware: OVA and StrOpt images as VM templates https://review.openstack.org/609736 | 15:36 |
*** macza has joined #openstack-nova | 15:39 | |
*** hamzy has quit IRC | 15:42 | |
* kashyap goes to read the review | 15:44 | |
*** macza has quit IRC | 15:44 | |
*** helenafm has quit IRC | 15:46 | |
mriedem | gerritbot must be dead | 15:50 |
mriedem | imacdonn: +W on https://review.openstack.org/#/c/608091/ | 15:50 |
imacdonn | mriedem: OK, thanks .... I need to get a new rev in to fix a typo in the release note | 15:51 |
mriedem | i fixed it | 15:51 |
imacdonn | mriedem: ok, cool, thanks .. just got email from review too | 15:51 |
*** macza has joined #openstack-nova | 15:57 | |
*** macza has quit IRC | 15:58 | |
*** macza has joined #openstack-nova | 16:00 | |
kashyap | mriedem: So, from a quick chat w/ the QEMU & libvirt folks -- | 16:00 |
kashyap | They say: "I'd would not do that" (configuring 'hpet') | 16:01 |
kashyap | As it's super expensive compared to other timer sources | 16:01 |
kashyap | And even worse for virtual machines | 16:01 |
kashyap | "HPET access involves a context switch to QEMU userspace, where as TSC is handled by KVM natively" | 16:01 |
kashyap | mriedem: But to your original question -- no I don't see it in libvirt's host capabilities either. | 16:02 |
kashyap | Probably because "no one has asked for it before" | 16:04 |
* kashyap goes to comment on the review, about the value of enabling HPET | 16:04 | |
kashyap | mriedem: Responded on the review with details. | 16:12 |
*** macza has quit IRC | 16:13 | |
*** macza has joined #openstack-nova | 16:13 | |
efried | hah, so after all of that, we may wind up not doing this thing at all? | 16:14 |
cdent | rad | 16:15 |
mriedem | well it was explicitly disabled in libvirt guests originally for a reason | 16:15 |
mriedem | b/c apparently at least for windows images it can skew the clock in the guest | 16:16 |
mriedem | kashyap: thanks for investigating | 16:16 |
mriedem | artom: +2 on your live migration cleanup thing https://review.openstack.org/#/c/609517/ | 16:16 |
artom | mriedem, thanks for the thorough reviewing :) | 16:16 |
artom | dansmith, feel like hitting up ^^ ? | 16:17 |
openstackgerrit | Rodolfo Alonso Hernandez proposed openstack/os-vif master: Add native implementation OVSDB API https://review.openstack.org/482226 | 16:19 |
kashyap | mriedem: Yeah, also RHEL disables it completely even now | 16:19 |
*** tssurya has joined #openstack-nova | 16:31 | |
dansmith | artom: man that's a lot of derping | 16:35 |
artom | dansmith, herp | 16:35 |
*** moshele has joined #openstack-nova | 16:35 | |
kashyap | Hey folks, a random question -- has anyone come across upstream bugs asking for CPU hotplug in Nova? | 16:36 |
* kashyap goes to do a look up | 16:36 | |
kashyap | Okay, I see a few blueprints, old and new | 16:37 |
mnaser | before i start diving | 16:41 |
mnaser | really old environment: juno-era, upgraded all the way up to rocky (no ffus) .. i'm seeing exceptions in placement once i hit rocky (around _create_incomplete_consumers_for_provider) | 16:42 |
mnaser | with DBDuplicateEntry exceptions for unique consumer uuid | 16:42 |
mnaser | i'm guessing that it's trying to create incomplete consumers but they're there, or something. | 16:43 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/rocky: Handle volume API failure in _post_live_migration https://review.openstack.org/611083 | 16:43 |
mriedem | mnaser: traceback in a paste? | 16:44 |
*** adrianc has quit IRC | 16:44 | |
tssurya | dansmith: had a question about the cell templating stuff, | 16:44 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/queens: Handle volume API failure in _post_live_migration https://review.openstack.org/611084 | 16:44 |
dansmith | tssurya: yah? | 16:44 |
mnaser | mriedem: http://paste.openstack.org/show/732260/ | 16:45 |
tssurya | shouldn't we consider the cell0's transport_url here : https://github.com/openstack/nova/blob/a53e46a75936b55c93face840764a67f2186cb11/nova/objects/cell_mapping.py#L144 ? | 16:45 |
mnaser | OH also fun little thing i found out about today | 16:45 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Fail to live migration if instance has a NUMA topology https://review.openstack.org/611088 | 16:45 |
stephenfin | artom: Thoughts? https://review.openstack.org/611088 | 16:45 |
mnaser | Q=>R upgrades require you to run api_db sync first then db sync after (i dont think this is documented) | 16:45 |
tssurya | right now running db sync without local_cell parameter gives out errors | 16:45 |
*** jpena is now known as jpena|off | 16:45 | |
mnaser | because cell disabled field is missing from api database so the db sync fails | 16:45 |
stephenfin | artom: I'd personally like to backport that as far as we can go. I'm kind of sick of explaining how broken this is to people | 16:45 |
mnaser | api_db sync first adds that field, which then lets db sync do it after | 16:46 |
mriedem | mnaser: i think it's ordered that way in the upgrade docs | 16:46 |
melwitt | | 16:46 |
mnaser | really, let me double check | 16:46 |
dansmith | tssurya: not sure what you mean.. pastebin an error? | 16:46 |
mriedem | https://docs.openstack.org/nova/latest/user/upgrade.html#rolling-upgrade-process | 16:46 |
tssurya | mnaser, mriedem: yea someone ran into the same issue and we changed the order | 16:46 |
mriedem | "Using the newly installed nova code, run the DB sync. (nova-manage api_db sync; nova-manage db sync). These schema change operations should have minimal or no effect on performance, and should not cause any operations to fail." | 16:46 |
tssurya | dansmith: ok | 16:46 |
artom | stephenfin, I don't know the full history, but I feel like it's opening a can of worms | 16:46 |
mnaser | mriedem: serves me right for looking at the queens docs thinking it hasn't changed because it "looks" the same | 16:47 |
artom | stephenfin, also, I could imagine a scenario where an operator really pinky swears the destination host is fine, and wants to live migrate regardless | 16:47 |
mnaser | you're right, the order was swapped in rocky, my bad | 16:47 |
artom | stephenfin, so I'm not sure I'm comfortable with such a heavy handed approach | 16:47 |
mnaser | but anyways, back to that gigantic traceback | 16:47 |
mriedem | mnaser: i think grenade was doing it the right way before that docs change, | 16:47 |
mriedem | our docs were just old | 16:47 |
tssurya | dansmith: https://pastebin.com/7cQKv0fz | 16:47 |
artom | stephenfin, totally get where you're coming from though :) | 16:47 |
stephenfin | artom: Fair point. Wanna stick your thoughts in that review? | 16:47 |
* stephenfin has done that and is now off home | 16:47 | |
dansmith | tssurya: oh because cell0's transport_url column can be NULL ? | 16:48 |
artom | stephenfin, yep, will do | 16:48 |
*** panda is now known as panda|off | 16:48 | |
stephenfin | ta | 16:48 |
tssurya | yea | 16:48 |
dansmith | tssurya: add "and val" here: https://github.com/openstack/nova/blob/a53e46a75936b55c93face840764a67f2186cb11/nova/objects/cell_mapping.py#L162 | 16:48 |
dansmith | tssurya: you gonna cook up a patch or do you want me to? | 16:48 |
dansmith | tssurya: I wonder why/how we're not hitting that in the gate? | 16:49 |
dansmith | do we set it to something bogus? | 16:49 |
tssurya | dansmith: would be nice if you do it.. | 16:49 |
dansmith | tssurya: sure | 16:49 |
tssurya | we were just doing the upgrade checks for rocky and saw this | 16:49 |
mriedem | mnaser: so this is being triggered when listing resource providers, what is doing that? | 16:49 |
mnaser | mriedem: i'm assuming nova-scheduler? | 16:50 |
mnaser | https://github.com/openstack/nova/blob/377921103121bc62a3f7fce60c63e30815406851/nova/api/openstack/placement/objects/resource_provider.py#L1923-L1965 | 16:50 |
mnaser | this is interesting | 16:50 |
dansmith | tssurya: hmm, actually, it's not nullable on the object, so I'm not sure how it'd be doing the right thing if you have NULL in the db | 16:50 |
tssurya | dansmith: but we can have NULL on the conf file | 16:51 |
tssurya | technically there is no requirement to produce a transport_url for db sync right ? | 16:51 |
tssurya | s/to produce/to supply | 16:52 |
dansmith | tssurya: ohh, I see, I thought you were saying it was NULL in the database | 16:52 |
*** k_mouza_ has joined #openstack-nova | 16:52 | |
mnaser | [placement] incomplete_consumer_project_id and incomplete_consumer_user_id are a thing, i guess | 16:52 |
tssurya | no not in the database. but maybe we hit it here ? https://github.com/openstack/nova/blob/a53e46a75936b55c93face840764a67f2186cb11/nova/objects/cell_mapping.py#L150 | 16:52 |
mriedem | mnaser: oh i guess adding/removing host aggregates in rocky would do it b/c we have to find the provider by name which we use GET /resource_providers?name=foo for that | 16:52 |
mriedem | and when building the "provider tree" | 16:52 |
mriedem | when reporting inventory | 16:53 |
mriedem | from the compute | 16:53 |
mnaser | mriedem: the 500 is coming on /resource_providers/foo/allocations | 16:53 |
dansmith | tssurya: so there is already a check for None-ness, but I guess if the value in the db isn't a template we'll fail to exit | 16:53 |
mnaser | and what it seems like almost any time its requesting allocations | 16:54 |
mriedem | mnaser: yeah that happens in the resource tracker on the compute | 16:54 |
tssurya | dansmith: ah yea was just wondering why that check didn't catch the Noneness | 16:54 |
mriedem | _remove_deleted_instances_allocations method | 16:54 |
mnaser | yep, i see compute ips sending in that request | 16:54 |
mriedem | so on startup of the compute, it's going to list allocations for the given compute node provider | 16:54 |
mriedem | and try to create consumers table records for that provider and any allocations against it | 16:55 |
mriedem | i'm not sure how we race to hit _create_incomplete_consumers_for_provider though | 16:55 |
mriedem | because that should be idempotent | 16:55 |
mriedem | and jaypipes isn't around | 16:55 |
*** k_mouza has quit IRC | 16:55 | |
dansmith | tssurya: I'm shocked we haven't seen this in the regular tests | 16:55 |
tssurya | dansmith: yea we should have seen this somewhere | 16:56 |
*** k_mouza_ has quit IRC | 16:56 | |
tssurya | not sure if people didn't hit this when moving to rocky ? | 16:56 |
dansmith | well, I'm not sure why devstack doesn't hit it | 16:57 |
dansmith | I guess because we always have those defined in config | 16:57 |
tssurya | probably yea, but its easily reproducible | 16:57 |
dansmith | yep | 16:58 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/pike: Handle volume API failure in _post_live_migration https://review.openstack.org/611093 | 16:58 |
tssurya | will you also backport this please ? we might need this in queens | 16:58 |
mriedem | our base test case uses the rpc fixture https://github.com/openstack/nova/blob/377921103121bc62a3f7fce60c63e30815406851/nova/test.py#L238 | 16:58 |
mriedem | so that's probably why we'd never hit it? | 16:58 |
mriedem | tssurya: the template stuff wasn't in queens | 16:58 |
mriedem | https://github.com/openstack/nova/blob/396156eb13521a0e7af4488a8cd4693aa65a0da2/nova/tests/fixtures.py#L728 | 16:59 |
mriedem | all of our tests at least configure this: transport_url = 'fake:/' | 16:59 |
tssurya | mriedem: oh yea sorry rocky then | 16:59 |
mriedem | tssurya: have you reported a bug? | 16:59 |
dansmith | mriedem: I got one already | 17:00 |
dansmith | coming | 17:00 |
tssurya | mriedem: no | 17:00 |
openstackgerrit | Dan Smith proposed openstack/nova master: Fix formatting non-templated cell URLs with no config https://review.openstack.org/611094 | 17:00 |
tssurya | dansmith: ah thanks :) | 17:00 |
dansmith | mriedem: tssurya ^ | 17:00 |
mriedem | mnaser: well i'm not sure how it's happening, but clearly we could race if two requests are listing resource providers at the same time, | 17:00 |
mriedem | mnaser: so likely just bug report it and we can add a try/except for the duplicate entry error | 17:00 |
dansmith | mriedem: I meant devstack-based tests, but it's because we always have a transport_url set I think.. | 17:00 |
mnaser | mriedem: but i mean this is constantly happening and i think i cant provision any more vms on the cloud with a hostnotfound type of thing | 17:00 |
dansmith | and yeah, this'll be a rocky backport | 17:01 |
*** moshele has quit IRC | 17:01 | |
mriedem | mnaser: so i wonder if https://github.com/openstack/nova/blob/377921103121bc62a3f7fce60c63e30815406851/nova/api/openstack/placement/objects/resource_provider.py#L1940 is always True? | 17:01 |
mnaser | mriedem: thats what im trying to decipher | 17:02 |
openstackgerrit | Dan Smith proposed openstack/nova master: Fix formatting non-templated cell URLs with no config https://review.openstack.org/611094 | 17:02 |
*** dtantsur is now known as dtantsur|afk | 17:03 | |
mnaser | mriedem: i think this is a cloud with some brokenness that is being exposed | 17:04 |
mnaser | right now, there are records in allocations for that specific uuid under resource_provider_id=4 | 17:04 |
mnaser | but in that traceback, it tries to add allocations for resource_provider_id=5 | 17:04 |
mnaser | sorry, no, i lied | 17:05 |
mnaser | it actually tries to add for 4 | 17:05 |
mnaser | let me see if there is anything in consumers | 17:05 |
*** hamzy has joined #openstack-nova | 17:05 | |
tssurya | dansmith: thanks a lot | 17:06 |
mnaser | i think this sql server is f'd | 17:07 |
mnaser | select * from consumers where uuid='fbee657b-6a60-4525-b7c8-b070643404ec'; returns nothing | 17:07 |
mriedem | there aren't consumer records until rocky | 17:07 |
mriedem | and the data migration function that is blowing up is trying to populate that table from existing allocations records | 17:08 |
dansmith | tssurya: np | 17:08 |
mnaser | mriedem: right, but the traceback seems to say: Duplicate entry 'fbee657b-6a60-4525-b7c8-b070643404ec' for key 'uniq_consumers0uuid' | 17:08 |
mnaser | yet -- select * from consumers where uuid='fbee657b-6a60-4525-b7c8-b070643404ec'; -- returns nothing | 17:09 |
mriedem | hmm it's doing an insert from select, | 17:09 |
mriedem | so the select results probably have duplicates | 17:09 |
mriedem | and those aren't being trimmed | 17:09 |
mriedem | and i bet the test for this only had 1 allocation against 1 provider | 17:09 |
mriedem | or something like that | 17:10 |
mnaser | lets test that out | 17:10 |
*** munimeha1 has quit IRC | 17:10 | |
mriedem | would be nice to see what the select query results are | 17:10 |
mriedem | the sql-fu in here is hard for me to grok | 17:10 |
mnaser | mriedem: you're right | 17:11 |
mnaser | 9 rows returned from that | 17:11 |
mriedem | what's the select query? | 17:11 |
mnaser | mriedem: http://paste.openstack.org/show/732263/ | 17:11 |
mnaser | stole this from the traceback | 17:12 |
mnaser | it was right next to the error | 17:12 |
*** whoami-rajat has quit IRC | 17:13 | |
*** mvkr has quit IRC | 17:13 | |
mnaser | mriedem: yup.. i see 9 records but really 3 unique ones | 17:14 |
mriedem | ok, so i bet 3 instances with allocations against a single provider, and each instance has 3 resource class allocations (VCPU, MEMORY_MB and DISK_GB) | 17:15 |
mriedem | and we're not collapsing those 3 allocations for the same consumer into a single consumer entry | 17:15 |
mriedem | let me see if i can dig up what is supposed to be testing this | 17:16 |
mnaser | mriedem: thats exactly the case | 17:17 |
mriedem | \o/ | 17:17 |
mnaser | i can confirm same resource provider each, with 3 resource classes | 17:17 |
mnaser | how come the others didnt break when migrating | 17:17 |
mnaser | i mean this isn't exactly an outlier | 17:17 |
mriedem | don't know | 17:17 |
mriedem | https://review.openstack.org/#/c/565405/26/nova/tests/functional/api/openstack/placement/db/test_consumer.py is only testing with 3 unique allocations each with a single resource class | 17:18 |
mriedem | so that's why i guess tests didn't catch it | 17:18 |
*** k_mouza has joined #openstack-nova | 17:18 | |
mnaser | poop | 17:19 |
mnaser | well | 17:19 |
mnaser | i guess i gotta find a fix | 17:19 |
* mnaser doesnt wanna db hack it | 17:19 | |
mriedem | i'm having a hard f'ing time understanding these test | 17:19 |
mriedem | *tests | 17:19 |
mnaser | yeah :\ | 17:19 |
mnaser | and the whole logic too | 17:19 |
mriedem | well for the select query, i'd think we need to group the allocations records results by consumer_id | 17:20 |
mriedem | really need a recreate in a test to see how to fix this | 17:21 |
mnaser | mriedem: or maybe just even a select distinct? | 17:21 |
mnaser | but yes, i agree | 17:21 |
mriedem | yeah true, | 17:22 |
mriedem | again, wish jay was here | 17:22 |
mriedem | also, this is a placement bug so maybe i can just kick you over to that channel and let those guys fix it :P | 17:22 |
mnaser | lolll | 17:23 |
mnaser | i mean you're not wrong | 17:23 |
*** k_mouza has quit IRC | 17:23 | |
mnaser | i'll take this to #openstack-placement | 17:23 |
*** panda|off has quit IRC | 17:24 | |
mnaser | mriedem: on a nova note im thinking maybe live migrate those machines and cheat | 17:24 |
mnaser | lol | 17:24 |
*** panda has joined #openstack-nova | 17:27 | |
*** ralonsoh has quit IRC | 17:28 | |
openstackgerrit | melanie witt proposed openstack/nova master: Bump os-brick version to 2.6.1 https://review.openstack.org/611109 | 17:35 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add recreate test for bug 1798163 https://review.openstack.org/611113 | 17:49 |
openstack | bug 1798163 in OpenStack Compute (nova) "Placement incomplete consumers online migration fails" [Undecided,New] https://launchpad.net/bugs/1798163 | 17:49 |
mriedem | mnaser: ^ fugly but works | 17:49 |
mnaser | mriedem: i have a fix | 17:50 |
mnaser | do you want me to squash into yours or rather two patch? | 17:50 |
openstackgerrit | Mohammed Naser proposed openstack/nova master: Use unique consumer_id when doing online data migration https://review.openstack.org/611115 | 17:55 |
mnaser | oops missed an uncomment | 17:57 |
mnaser | testing functional tests again locally | 17:57 |
melwitt | mriedem: should we wait for the tempest tests to merge before marking https://blueprints.launchpad.net/nova/+spec/boot-instance-specific-storage-backend as complete? | 17:59 |
mnaser | bleh | 18:00 |
mnaser | another thing broke | 18:00 |
mnaser | ok yay | 18:05 |
mnaser | i was missing something | 18:05 |
openstackgerrit | Mohammed Naser proposed openstack/nova master: Use unique consumer_id when doing online data migration https://review.openstack.org/611115 | 18:06 |
mriedem | mnaser: i think i might clean mine up to be a simple new test, and then you can stack on top | 18:09 |
mriedem | i was rushing b/c my pizza was getting cold | 18:10 |
mnaser | mriedem: feel free to checkout mine locally if you want | 18:10 |
mnaser | mriedem: very valid reason tbh | 18:10 |
mriedem | melwitt: no, i was going to mark it today but forgot | 18:10 |
mnaser | i was rushing because i have a broken cloud :-) | 18:10 |
melwitt | ok, I can mark it | 18:10 |
mnaser | but i'm comfortable pushing that one liner now | 18:10 |
mriedem | i'm glad i deleted my vexxhost vm the other day then | 18:10 |
mriedem | :) | 18:10 |
mnaser | its not our public cloud | 18:11 |
mnaser | i wouldnt trust those nova people and the code they ship | 18:11 |
mnaser | :p | 18:11 |
mriedem | good plan | 18:11 |
tssurya | mriedem: question about https://review.openstack.org/#/c/571535/ . How are we able to set the compute_node.uuid which is a read-only attribute ? Am I missing something here ? | 18:13 |
mriedem | please hold dear caller | 18:14 |
tssurya | ack :) | 18:14 |
*** mvkr has joined #openstack-nova | 18:14 | |
mriedem | we create compute node records with uuids in tests all the time | 18:14 |
tssurya | true.. I am somehow hitting https://pastebin.com/kqyu66wB | 18:15 |
tssurya | will look closer | 18:16 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add recreate test for bug 1798163 https://review.openstack.org/611113 | 18:16 |
openstack | bug 1798163 in OpenStack Compute (nova) "Placement incomplete consumers online migration fails" [Undecided,In progress] https://launchpad.net/bugs/1798163 - Assigned to Mohammed Naser (mnaser) | 18:16 |
mnaser | mriedem: woo, fix worked here | 18:17 |
mnaser | confirmed with the LOG.info message showed up as expected | 18:17 |
mnaser | with 3 consumer records for rp #4 as the one that was broken which we were looking at | 18:17 |
mriedem | rebasing | 18:18 |
mnaser | lol | 18:19 |
mnaser | i'm getting a KeyError now | 18:19 |
*** gyee has joined #openstack-nova | 18:19 | |
mnaser | (in placement) | 18:19 |
mnaser | http://paste.openstack.org/show/732269/ this time | 18:19 |
mnaser | https://github.com/openstack/nova/blob/master/nova/api/openstack/placement/objects/resource_provider.py#L3522 | 18:20 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Use unique consumer_id when doing online data migration https://review.openstack.org/611115 | 18:20 |
mriedem | tssurya: hmm, maybe that's hitting after an upgrade? | 18:21 |
mriedem | but, | 18:21 |
mriedem | if that were the case, ironic's grenade job should be broken | 18:21 |
mriedem | is the value changing? | 18:22 |
tssurya | so at the moment the only info I have is that we cherry-picked this to queens | 18:22 |
tssurya | maybe something is wrong because of that | 18:23 |
*** hamzy has quit IRC | 18:23 | |
mriedem | tssurya: ok it's called from here in the RT https://review.openstack.org/#/c/571535/2/nova/compute/resource_tracker.py@617 | 18:24 |
mriedem | my guess is the compute node record already existed with a random uuid, and on restart of the compute service with the new code, it's trying to update the uuid in the compute node record using the ironic node uuid | 18:24 |
mriedem | i don't know why ironic's grenade job wouldn't fail for the same reason, but it seems like an obvious oversight in that patch | 18:24 |
tssurya | mriedem: yea | 18:24 |
tssurya | that's exactly what's happening | 18:25 |
tssurya | so this is for the existing nodes.. | 18:25 |
mriedem | \o/ | 18:25 |
mriedem | mnaser: as for https://github.com/openstack/nova/blob/stable/rocky/nova/api/openstack/placement/objects/resource_provider.py#L3486 | 18:25 |
mriedem | and the KeyError | 18:25 |
mriedem | i don't understand any of that code | 18:26 |
mriedem | that's all efried, tetsuro and jaypipes | 18:26 |
mriedem | https://review.openstack.org/#/c/559480/ | 18:27 |
* mriedem celebrates all the things being on fire at once | 18:28 | |
tssurya | heh | 18:28 |
mriedem | tssurya: report a bug for the ironic node uuid thing i guess | 18:28 |
mriedem | jroll: do you know anything about the ironic grenade job? | 18:28 |
tssurya | mriedem: yea I will | 18:28 |
mriedem | does it restart n-cpu across releases? | 18:28 |
mnaser | mriedem: i think this is a weird env-related corner case where those havent done much reporting to placement | 18:29 |
mnaser | going to delete those stale rps | 18:30 |
mnaser | mriedem: every time i use osc-placement i have to say thank you | 18:32 |
mnaser | its such a life saver | 18:32 |
mriedem | don't thank me, thank avolkov | 18:34 |
*** bjolo has joined #openstack-nova | 18:34 | |
mriedem | and rpodolykia | 18:34 |
mriedem | i know i butchered that irc | 18:34 |
mriedem | tssurya: yeah so you can set a readonly field while it's never been set | 18:35 |
mriedem | once it's set though, it's stuck | 18:35 |
tssurya | mriedem: ah right, thanks | 18:35 |
tssurya | I am filing a bug now | 18:35 |
mriedem | cdent: efried: https://review.openstack.org/#/q/topic:bug/1798163+(status:open+OR+status:merged) | 18:35 |
mriedem | dansmith: melwitt: ^ | 18:36 |
mriedem | starting on the placement repo cherry picks | 18:39 |
melwitt | 👀 | 18:39 |
tssurya | mriedem: https://bugs.launchpad.net/nova/+bug/1798172 | 18:39 |
openstack | Launchpad bug 1798172 in OpenStack Compute (nova) "Ironic driver tries to update the compute_node's UUID which of course fails in case of existing compute_nodes" [Undecided,New] | 18:39 |
mriedem | i'll assume that lego block is a pile-o-poo | 18:39 |
mriedem | "of course" | 18:40 |
mriedem | nice | 18:40 |
mriedem | :( | 18:40 |
melwitt | it's eyes, looking at the linked patches | 18:40 |
melwitt | but pile-o-poo could have been good | 18:40 |
dansmith | mriedem: so this has to be broken for anyone right? | 18:41 |
mriedem | dansmith: which thing? | 18:41 |
mriedem | placement? yes. | 18:42 |
dansmith | yeah | 18:42 |
mriedem | i guess mnaser is first to rocky ever | 18:42 |
mnaser | world first | 18:42 |
mnaser | but now | 18:42 |
mnaser | we have MORE | 18:42 |
mnaser | Exception during message handling: RPCVersionCapError: Requested message version, 5.0 is incompatible. It needs to be equal in major version and less than or equal in minor version as the specified version cap 4.17. | 18:42 |
mriedem | honestly i'm not sure how it wouldn't have been a problem caught in grenade | 18:42 |
dansmith | looks straightforward | 18:42 |
mnaser | nova-conductor when scheduling new instances | 18:42 |
mriedem | b/c in grenade we have existing instances with allocations | 18:42 |
melwitt | mnaser: need to get a windshield sticker for your car WORLD FIRST TO ROCKY | 18:42 |
mnaser | more like a military medal for the stuff i have to go through y'all | 18:43 |
mnaser | lol | 18:43 |
melwitt | :***( | 18:43 |
mnaser | all my packages are up to date, control plane is all rocky with queens computes | 18:43 |
melwitt | (those are tears) | 18:43 |
mnaser | its ok, i do it for the people | 18:43 |
mnaser | openstack upgrades are easy, i promise, i already helped fix most of it | 18:43 |
mnaser | where are the versions listed again? in the rpc api file? | 18:45 |
mriedem | dansmith: so on upgrade to rocky we would have listed allocations for a provider here https://github.com/openstack/nova/blob/237bfcfd82fc28a955574b588fbce1d2392c9e45/nova/compute/resource_tracker.py#L1298 which should have hit the unique constraint | 18:46 |
mriedem | so idk | 18:46 |
dansmith | mriedem: yeah | 18:46 |
mriedem | i mean in a grenade run | 18:46 |
dansmith | mnaser: that rpc error looks like you have something old | 18:46 |
dansmith | like an old conductor still running maybe? | 18:46 |
mriedem | maybe we didn't hit it in grenade because the queens allocations already have project_id/user_id created? | 18:46 |
mriedem | s/created/set/ | 18:46 |
mriedem | or.... because we ran the online data migrations? | 18:47 |
mnaser | dansmith: oh shit, i missed a compute. | 18:47 |
mnaser | in pike=>queens | 18:47 |
melwitt | mriedem: well, normally allocations are required to be created with project/user and that's what creates a consumer. this bug is about the online data migration for allocations with missing consumers, prior to the microversion where we required project/user, right? | 18:47 |
dansmith | mnaser: mm, I dunno, if that's coming from a conductor node I think it's probably an old conductor, but I'd have to see more about where exactly | 18:47 |
melwitt | so grenade probably doesn't cover that, I wouldn't think | 18:48 |
dansmith | mnaser: or are you saying the auto stuff is calculating a 4.x when everything has moved past 5.0 because of a very old compute? | 18:48 |
dansmith | melwitt: but you don't create consumers directly and before some version, no consumers were created for you | 18:48 |
mnaser | dansmith: there is a compute that is active running at version 22 (service table), rest of services are 30 (for queens computes) and 35 (control plane) | 18:49 |
dansmith | melwitt: so grenade should hit this at some point, unless we run online migrations and fix them up before we run or whatever | 18:49 |
mnaser | so i guess the auto stuff is calculating based on the fact that the oldest compute (which is active) | 18:49 |
dansmith | mnaser: ah okay yeah | 18:49 |
mnaser | so | 18:49 |
mnaser | working as intended | 18:49 |
mnaser | i think the upgrade check would have probably warned me if i used it oops | 18:49 |
melwitt | dansmith: true, but I'm pretty sure the online migration to create missing consumers was created later on, i.e. not in the same release where we started creating consumers with allocations. so maybe that's how it missed it. by the time the online data migration existed, grenade was no longer testing the old way that didn't create consumers | 18:50 |
mriedem | melwitt: different code path | 18:50 |
mriedem | https://review.openstack.org/#/c/611115/3/nova/api/openstack/placement/objects/consumer.py@29 | 18:50 |
dansmith | mnaser: yeah, can't really blame us for this one :) | 18:51 |
mnaser | dansmith: ill take that one :P | 18:51 |
mnaser | but yeah, i think the issue is upgrading across releases | 18:51 |
mriedem | the upgrade check CLI doesn't look to see if your minimum compute version is > N-1 | 18:51 |
mriedem | fwiw | 18:51 |
* mnaser takes out pitchforks again | 18:52 | |
melwitt | so the microversion that started creating consumers was 1.8, pike https://docs.openstack.org/nova/latest/user/placement.html#require-placement-project-id-user-id-in-put-allocations | 18:52 |
mriedem | no one has requested it check for that | 18:52 |
*** tssurya has quit IRC | 18:52 | |
melwitt | now when was the online data migration added... | 18:52 |
mnaser | fwiw | 18:52 |
mnaser | this cloud has existed since juno as far as i know | 18:52 |
mnaser | it's seen some shit | 18:52 |
melwitt | create_incomplete_consumers was added in rocky | 18:52 |
melwitt | so allocations without consumers would be from before pike | 18:53 |
mriedem | hmm, i do seem to recall online data migrations for some placement stuff not working | 18:53 |
mriedem | b/c we were hitting the wrong db config | 18:53 |
mriedem | making it think nothing needed to be migrated | 18:54 |
melwitt | so any grenade that covered create_incomplete_consumers would be testing queens => rocky and never see any consumerless allocations | 18:54 |
mriedem | i suppose we were using at least 1.8 when creating allocations in queens | 18:55 |
mriedem | b/c of dansmith's migration allocation stuff | 18:55 |
melwitt | I thought we started using 1.8 in pike, that's when it was added | 18:55 |
mriedem | yeah i guess https://review.openstack.org/#/c/469634/ | 18:56 |
mriedem | ok i guess that solves the grenade mystery | 18:58 |
mriedem | geez when is someone going to add an FFU job that runs from ocata-em to master?! | 18:58 |
mnaser | issues like this is why ffu upgrades terrify me | 19:01 |
mnaser | lol | 19:01 |
mnaser | "when did this break? here's 2 years worth of code to go through!" | 19:01 |
dansmith | mnaser: it's way easier than the alternative, IMHO | 19:01 |
dansmith | of not knowing if the data set has been transformed since juno or not | 19:01 |
mnaser | dansmith: i'll agree on that statement | 19:02 |
mriedem | well, i was right about one thing | 19:03 |
mriedem | http://logs.openstack.org/00/607600/1/check/ironic-grenade-dsvm/4d493b1/logs/screen-n-cpu.txt.gz#_Oct_03_18_33_59_072341 | 19:03 |
melwitt | hm, I just realized, we're going to need to dupe these patches and use the same change-id to propose them to placement as well | 19:04 |
mriedem | yes | 19:04 |
mriedem | that's what we've been doing | 19:04 |
melwitt | ok | 19:04 |
mnaser | forward porting | 19:04 |
mnaser | is that what we call it | 19:04 |
melwitt | on the second patch, it looks like there's at least one additional place we need to add the group_by, right? | 19:05 |
mriedem | yes | 19:05 |
melwitt | and should correspondingly test it too. the recreate test patch is already approved though | 19:05 |
mriedem | not for long | 19:05 |
melwitt | k | 19:05 |
efried | mriedem: That KeyError. Is that part of the existing bugs you've been talking about, or has it not yet been investigated? | 19:05 |
mnaser | the keyerror is not related | 19:06 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add recreate test for bug 1798163 https://review.openstack.org/611113 | 19:06 |
openstack | bug 1798163 in OpenStack Compute (nova) "Placement incomplete consumers online migration fails" [Critical,In progress] https://launchpad.net/bugs/1798163 - Assigned to Mohammed Naser (mnaser) | 19:06 |
mnaser | it was just some weird leftovers | 19:06 |
mriedem | mnaser: i'll rev my functional test patch and yours on top | 19:06 |
* mriedem makes coffee | 19:06 | |
efried | Is it a bug that needs to be fixed, or was it a user error? | 19:06 |
mnaser | the keyerror? i dunno, but i dont think it should have been an issue because said user doesnt touch placement | 19:07 |
efried | I wouldn't have thought it should be possible no matter what abuse you lavish on the placement db. | 19:07 |
*** dave-mccowan has quit IRC | 19:07 | |
mnaser | fwiw the resource provider had nothing allocated | 19:07 |
mnaser | no usage that is | 19:07 |
*** spatel has joined #openstack-nova | 19:07 | |
*** dave-mccowan has joined #openstack-nova | 19:08 | |
efried | mnaser: Any sharing providers in this mess? | 19:08 |
mnaser | efried: sorry, not sure what you mean by that | 19:08 |
efried | Providers with the MISC_SHARES_VIA_AGGREGATE trait | 19:08 |
mnaser | efried: i am not sure honestly, i didn't dig in that much | 19:10 |
*** jding1_ has joined #openstack-nova | 19:12 | |
mriedem | then no | 19:12 |
mriedem | b/c you'd have to create them yourself | 19:12 |
*** jding1_ has quit IRC | 19:12 | |
mnaser | yeah besides nova | 19:13 |
mnaser | no api interaction | 19:13 |
*** jackding has quit IRC | 19:15 | |
efried | If you see a repro, lmk. Otherwise I'm going to pretend it didn't happen. | 19:15 |
*** tbachman has quit IRC | 19:15 | |
*** jackding has joined #openstack-nova | 19:17 | |
mnaser | efried: i can get you a stacktrace if you want, but i dont think id be able to reproduce it given i deleted stuff | 19:17 |
efried | mnaser: The stack trace won't tell me much. Logs up to that point might help a bit. | 19:18 |
efried | especially if they've got our fun new debug messages | 19:19 |
openstackgerrit | Matthew Edmonds proposed openstack/nova master: Use tempfile for powervm config drive https://review.openstack.org/610174 | 19:20 |
edmondsw | efried ^ this should address the fd open issue | 19:20 |
mriedem | fudge, | 19:23 |
mriedem | this unique constraint error is in 3 f'ing places | 19:23 |
melwitt | I wondered if there were more. and I had thought they'd call through the same method to create missing consumers but I guess all of the queries are different | 19:24 |
mriedem | well maybe not | 19:24 |
mriedem | https://review.openstack.org/#/c/611115/3/nova/api/openstack/placement/objects/resource_provider.py@1973 | 19:24 |
mriedem | that's not doing the insert-from-select | 19:25 |
mriedem | like the others | 19:25 |
mnaser | um | 19:26 |
mnaser | in rocky we moved to console auth tokens stored in db, right? | 19:26 |
melwitt | yes, in addition to nova-consoleauth until this lands https://review.openstack.org/610673 | 19:27 |
mnaser | melwitt: what service creates the auth tokens? | 19:28 |
melwitt | mnaser: nova-compute creates them for the database, nova-consoleauth creates them for nova-consoleauth | 19:28 |
mnaser | so if your nova-compute is not on rocky | 19:29 |
mnaser | ..does that mean no console? | 19:29 |
melwitt | then you get nova-consoleauth tokens | 19:29 |
mnaser | ok i see | 19:29 |
melwitt | no, you get console | 19:29 |
* cdent sighs and cries about 1798163 | 19:29 | |
mnaser | so just an extra indirection right now | 19:29 |
mnaser | till nova-compute creates to db directly in the future | 19:30 |
melwitt | nova-compute creates directly to db in rocky. just obviously your older computes will not and those instances will be supported by nova-consoleauth | 19:30 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add recreate test for bug 1798163 https://review.openstack.org/611113 | 19:31 |
openstack | bug 1798163 in OpenStack Compute (nova) "Placement incomplete consumers online migration fails" [Critical,In progress] https://launchpad.net/bugs/1798163 - Assigned to Mohammed Naser (mnaser) | 19:31 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Use unique consumer_id when doing online data migration https://review.openstack.org/611115 | 19:31 |
mriedem | mnaser: melwitt: dansmith: cdent: efried: ^ should be good now, covers both cases | 19:31 |
melwitt | once all your computes are on rocky, then you wouldn't need nova-consoleauth once this backport lands https://review.openstack.org/610673 | 19:31 |
mnaser | melwitt: im seeing consoleauth get a token, but the traceback that says token validation failed is resulted from a method that does db.console_auth_token_get | 19:32 |
mnaser | ugh | 19:33 |
mnaser | [workarounds] enable_consoleauth=True | 19:33 |
melwitt | mnaser: and you have a mix of rocky computes and older than rocky computes? in that case, you'll need to set [workarounds]enable_consoleauth = True on your console proxy host | 19:33 |
mnaser | melwitt: that was it, thank you | 19:36 |
mnaser | part of me wants to document all this in some "heads up" way, but also i worry "look how bad it is" messaging :\ | 19:37 |
cdent | thanks mriedem, is the expectation on that stuff that since that code is "remove in stein" or "called from online migrations" that we can remove it in openstack/placement instead of using those changes (I've not caught up fully on the irc log) | 19:37 |
melwitt | mnaser: it defaults to False, but for a rolling upgrade you would need it to be True. yet more information I missed in the upgrade release notes :( | 19:37 |
mriedem | cdent: i don't expect it will be removed in stein, | 19:37 |
mriedem | we have to have a blocker migration first | 19:37 |
mriedem | but haven't thought through it all yet | 19:37 |
mnaser | melwitt: np, want me to push up a bug or something or you'll write it down? | 19:38 |
mriedem | openstack/placement doesn't have a "placement-manage db online_data_migrations" yet | 19:38 |
mriedem | or does it? | 19:38 |
mnaser | maybe good to add to the upgrade doc | 19:38 |
mriedem | cdent: ^ | 19:38 |
cdent | it doesn't have _any_ placement-manage yet | 19:38 |
melwitt | mnaser: alternatively, could switch to defaulting to True and then let operators turn it off and decommission nova-consoleauth intentionally once they've rolled everything to rocky | 19:38 |
cdent | that's part of that message I sent earlier today | 19:38 |
cdent | mriedem: but my thinking was: we don't need to do any online db migrations, yet, either | 19:39 |
cdent | that is: Isn't the database in the correct state when someone gets to using openstack/placement? | 19:39 |
mnaser | melwitt: that feels like a better user experience | 19:40 |
mnaser | a lot of users probably will do rolling upgrades | 19:40 |
mriedem | cdent: nope | 19:41 |
mnaser | and you can keep it =True for rocky only anyways and remove it after | 19:41 |
mriedem | cdent: not if you're upgrading from <pike to stein | 19:41 |
mriedem | like mnaser is doing with going from juno to rocky | 19:41 |
dansmith | melwitt: workarounds are supposed to default to off | 19:41 |
melwitt | mnaser: you can create a bug, that would be most visible I think | 19:41 |
dansmith | melwitt: so really we should have landed it where false meant what we wanted | 19:41 |
cdent | mriedem: I'm confused on how that's supposed to work: don't you stop on the rocky _code_ when doing a FFU? | 19:42 |
dansmith | changing it again is kindof the suck too, IMHO | 19:42 |
melwitt | dansmith: ok :( I see | 19:42 |
mnaser | melwitt: dansmith i guess i'm pretty busy these times but i | 19:42 |
mnaser | i'll put up a bug and leave it for the team to decide whats best :> | 19:43 |
mriedem | cdent: you mean upgrade to rocky where the create_incomplete_consumers online migration runs as part of the rocky nova-manage db online_data_migrations, and then extract placement while upgrading to stein? | 19:43 |
mriedem | and assume create_incomplete_consumers is done already? | 19:43 |
mriedem | that might happen | 19:43 |
mriedem | it's just nice to have a blocker migration to prevent you from upgrading if you didn't do the homework | 19:44 |
mriedem | we don't always have those though b/c sometimes they span multiple DBs | 19:44 |
cdent | the term "blocker migration" has never been sufficiently defined for me | 19:44 |
mriedem | we've used nova-status upgrade check for that though | 19:44 |
mriedem | as in db sync fails | 19:44 |
mriedem | db sync in N fails b/c you didn't complete the online migrations in N-1 | 19:44 |
melwitt | mnaser: thanks. at the very least, we can add more info to the upgrade reno, I think. but it sounds like I messed up the [workarounds] option too much to fix | 19:45 |
mnaser | melwitt: nah, it's fine, i think the messaging needs to be more clear as in like | 19:45 |
mnaser | yo shit will be broken if you havent completed the upgrade | 19:45 |
mnaser | because to me i saw some stuff related to it but it didnt really feel like "that was my issue" | 19:45 |
dansmith | melwitt: just MHO of course. We've deviated from my initial proposal of workarounds in the past, but I do think that adding another possible "how it is if you didn't change it" case at this point is probably less helpful | 19:46 |
melwitt | well, I was thinking "if you're doing a rolling upgrade, set [workarounds]enable_consoleauth = True | 19:46 |
melwitt | mnaser: ^ rather than, yo shit will be broken | 19:46 |
mnaser | yeah i'd be in favour of enabling it given that it cant really do much | 19:46 |
mriedem | cdent: i couldn't do a block migration for this request spec thing, so added it to nova-status upgrade check https://review.openstack.org/#/c/581813/ | 19:47 |
mriedem | *blocker migration | 19:47 |
mnaser | melwitt: much better delivered. | 19:47 |
mnaser | :p | 19:47 |
mnaser | melwitt: and my sh.... rolling upgrades were affected by https://bugs.launchpad.net/nova/+bug/1798188 | 19:47 |
openstack | Launchpad bug 1798188 in OpenStack Compute (nova) "VNC stops working in rolling upgrade by default" [Undecided,New] | 19:47 |
melwitt | lol, no I mean it _won't_ be broken if you set it | 19:47 |
mnaser | oh | 19:47 |
mnaser | ah my brain has potato'd | 19:47 |
cdent | mriedem perhaps it would be good/ideal if we can instead of a suite of N blocker migrations we have a sanity check of some kind in a placement status check, which effectively does the same kind of "Your database isn't ready" thing. | 19:48 |
mnaser | i did ocata => pike => queens => rocky in 2 days | 19:48 |
mriedem | cdent: that's an optoin | 19:48 |
mriedem | *option | 19:48 |
melwitt | dansmith: I didn't understand the "how it is if you didn't change it" are you saying you think changing the default would be less helpful than adding more words to the upgrade reno? | 19:48 |
mnaser | this one was by far the toughest but as expected i guess | 19:48 |
mriedem | for dropping create_incomplete_consumers | 19:48 |
dansmith | melwitt: I'm saying if you change it now, then people reading the renos will see "This was deprecated. Oopps, undeprecated set this workaround, Oops Oops, nevermind, it's set by default now" | 19:49 |
dansmith | melwitt: and it just seems like we're piling on the confusion if we keep making that a moving target | 19:49 |
dansmith | melwitt: if anything, add something to nova-status and backport it to help make sure people are warned to pay attention to this | 19:49 |
dansmith | and get that released before people have a chance to stumble over this | 19:49 |
melwitt | dansmith: I see, yeah. that's true, if we change the default we have to change all of the words related to how the workaround works, that would be confusing if someone's seen it before | 19:50 |
mnaser | (but also how many people went through this document already given the issue i ran into today :p) | 19:51 |
dansmith | melwitt: and I don't think we get to alter the older renos, if I'm not mistaken, but even still it's out there so if someone is looking at X.1 docs and then they're similar but different in X.2... | 19:51 |
melwitt | argh, yeah. | 19:52 |
dansmith | mnaser: I'm talking about in a year when most people are deploying rocky and trying to figure out what the story is now | 19:52 |
mnaser | dansmith: makes sense | 19:52 |
*** icey has quit IRC | 19:52 | |
melwitt | I have cursed nova-consoleauth :( | 19:53 |
melwitt | ok, so add something to nova-status. I hope everyone uses nova-status | 19:53 |
*** hamzy has joined #openstack-nova | 19:54 | |
dansmith | of course not everyone does.. OSA does I think, and hopefully all the buzz around making this a generic thing will mean in a year people are looking at it | 19:55 |
dansmith | I thought you were also suggesting clarifying words in renos to help understand | 19:55 |
dansmith | I was just saying flipping the default behavior now and trying to document _that_ is the confusing part | 19:55 |
melwitt | yeah but IIUC that doesn't help someone upgrading to rocky if I can't backport those words | 19:55 |
melwitt | oh | 19:56 |
dansmith | you can backport the words, | 19:56 |
dansmith | I just don't think you should change the behavior and backport more words explaining how it's changed for the third time | 19:56 |
melwitt | got it, ok | 19:56 |
mriedem | what would the nova-status upgrade check look for? that [workarounds]/enable_consoleauth is False and return a warning? | 19:57 |
*** angiewang has joined #openstack-nova | 19:57 | |
dansmith | yeah, and maybe check the services table or current tokens to see if you even use that stuff | 19:58 |
dansmith | if you don't use console, then you don't need to warn, | 19:58 |
dansmith | but if you do and you're rolling, pretty much should have that set right? | 19:58 |
melwitt | yeah | 19:58 |
*** angiewang has quit IRC | 19:58 | |
mriedem | you can also tell if there are no console auth entries in the db right? | 19:58 |
dansmith | I said that | 19:58 |
mriedem | i said it with an accent | 19:59 |
dansmith | fancy | 19:59 |
mriedem | dansmith: btw, https://review.openstack.org/#/c/611094/ needs an assertion on it | 19:59 |
melwitt | there wouldn't be, before rocky though. they'd be in the nova-consoleauth service | 19:59 |
mriedem | melwitt: well, that's the point right? | 19:59 |
mriedem | if you're using nova-consoleauth in queens, and upgrading to rocky, you want the workaround enabled | 19:59 |
dansmith | mriedem: ack, I have to run off for a bit but will hit that when I get back | 20:00 |
mriedem | and nova-consoleauth would show up in the services table in....one of the dbs | 20:00 |
melwitt | yeah, I mean, if you are checking a queens deployment for whether they use consoles at all, you'd have to check the nova-consoleauth service | 20:00 |
mriedem | we don't want to make an rpc call from the status check | 20:00 |
mriedem | but we could check the services table to see if it's been started | 20:00 |
melwitt | oh, you're thinking if they don't use consoles they won't run the service at all. that makes sense too | 20:00 |
melwitt | yeah | 20:00 |
mriedem | i just don't know which db that'd be in | 20:00 |
mriedem | api? | 20:00 |
mriedem | no, | 20:01 |
mriedem | wrong schema | 20:01 |
mriedem | i guess just iterate the cell dbs | 20:01 |
mriedem | if you find a non-deleted nova-consoleauth service record in that db, but no console auth tokens in the db, and workarounds is false, then fail | 20:01 |
melwitt | yeah, or warn like dansmith said. only matters if you're rolling | 20:02 |
melwitt | i.e. it will only mess you up if you're rolling | 20:03 |
mriedem | i left a comment on the bug with the status upgrade check idea | 20:04 |
melwitt | thanks | 20:05 |
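The check logic sketched in the discussion above (iterate the cell DBs; a non-deleted nova-consoleauth service record with no console auth tokens in the DB while the workaround is off should be flagged) could look roughly like this. This is only an illustration under stated assumptions: `CellSnapshot`, `Result` and `check_consoleauth` are hypothetical names, not nova-status code, and the database queries are replaced by pre-computed booleans. It returns a warning rather than a failure, following the point that this only matters during a rolling upgrade.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Result(Enum):
    SUCCESS = "success"
    WARNING = "warning"


@dataclass
class CellSnapshot:
    """Hypothetical summary of one cell database (not a nova object)."""
    name: str
    # True if the services table has a non-deleted nova-consoleauth record.
    has_live_consoleauth_service: bool
    # True if the cell DB already holds console auth token rows.
    has_console_auth_tokens: bool


def check_consoleauth(cells: Iterable[CellSnapshot],
                      enable_consoleauth: bool) -> Result:
    """Warn when a deployment appears to rely on nova-consoleauth
    but the [workarounds]/enable_consoleauth option is not set."""
    if enable_consoleauth:
        # Workaround already on: nothing to warn about.
        return Result.SUCCESS
    for cell in cells:
        if cell.has_live_consoleauth_service and not cell.has_console_auth_tokens:
            # Consoles were served by nova-consoleauth and no tokens have
            # moved to the database yet: likely mid rolling upgrade.
            return Result.WARNING
    return Result.SUCCESS
```

No RPC call is made in this sketch, in line with the comment that the status check should only look at the database.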
*** angiewang has joined #openstack-nova | 20:05 | |
openstackgerrit | Matthew Edmonds proposed openstack/nova master: Use tempfile for powervm config drive https://review.openstack.org/610174 | 20:05 |
*** rmart04 has joined #openstack-nova | 20:05 | |
*** rmart04 has quit IRC | 20:06 | |
*** angiewang has left #openstack-nova | 20:08 | |
mriedem | dansmith: np i got it, it was 1 line | 20:11 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix formatting non-templated cell URLs with no config https://review.openstack.org/611094 | 20:11 |
mriedem | easy fix for another core ^ | 20:12 |
melwitt | +W | 20:17 |
*** cdent has quit IRC | 20:24 | |
*** macza has quit IRC | 20:25 | |
*** macza_ has joined #openstack-nova | 20:25 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Ignore uuid if already set in ComputeNode.update_from_virt_driver https://review.openstack.org/611162 | 20:27 |
mriedem | efried: jroll: ^ | 20:27 |
*** macza_ has quit IRC | 20:28 | |
*** macza has joined #openstack-nova | 20:29 | |
openstackgerrit | melanie witt proposed openstack/nova master: Bump os-brick version to 2.6.1 https://review.openstack.org/611109 | 20:30 |
efried | I don't get it. | 20:30 |
*** macza has quit IRC | 20:31 | |
*** macza has joined #openstack-nova | 20:32 | |
openstackgerrit | Sundar Nadathur proposed openstack/nova-specs master: Nova Cyborg interaction specification. https://review.openstack.org/603955 | 20:32 |
*** slaweq has quit IRC | 20:33 | |
spatel | Folks! i have 64G compute node | 20:33 |
dansmith | spatel: it's not nice to brag | 20:34 |
dansmith | mriedem: thanks home skillet | 20:34 |
spatel | Should i go with 1G hugepage or 2M | 20:34 |
spatel | dansmith: i was going to write question but hit enter middle of i | 20:34 |
spatel | dansmith: i was going to write question but hit enter middle of it | 20:34 |
dansmith | spatel: I know, I'm just joking :P | 20:34 |
spatel | :) | 20:35 |
spatel | what do you recommend if that is the case | 20:35 |
spatel | Problem is if i launch application then it will be hard to adjust those value | 20:36 |
spatel | currently i have "hugepagesz=2M hugepages=27000 transparent_hugepage=never" | 20:36 |
dansmith | spatel: you probably want cfriesen | 20:36 |
spatel | cfriesen: ^^ | 20:37 |
*** moshele has joined #openstack-nova | 20:37 | |
spatel | He may be not around | 20:38 |
mriedem | efried: you were on the original regression patch of mine so figured you'd have context | 20:38 |
mriedem | this https://review.openstack.org/#/c/571535/ | 20:38 |
efried | mriedem: Yeah, I think I get it now. | 20:38 |
efried | See if my review comment makes sense. | 20:39 |
mriedem | efried: yup | 20:40 |
efried | mriedem: ...and another update | 20:40 |
cfriesen | spatel: in our testing 2M gave a noticeable benefit. 1G gave some additional benefit but only for specific testcases | 20:40 |
spatel | There you go!! thank you | 20:40 |
mriedem | efried: yup | 20:41 |
efried | cool | 20:41 |
cfriesen | spatel: are you using dedicated CPUs? | 20:41 |
spatel | Do you think 27000 is good number on 64G compute node? | 20:41 |
spatel | yes I am pinning CPU | 20:41 |
spatel | i am going to run riak cluster application on this compute node | 20:41 |
spatel | riak love memory | 20:42 |
cfriesen | you don't need to allocate hugepages at boot. you can allocate them at runtime | 20:42 |
spatel | i heard sometime it cause issue during runtime | 20:42 |
cfriesen | spatel: If you allocate them early during startup the memory hasn't gotten fragmented yet | 20:42 |
spatel | hmm! | 20:43 |
spatel | i will keep that in mind then.. | 20:43 |
cfriesen | it's a bit of a tradeoff, since any memory you reserve for hugepages can't be allocated to small-page instances. Also, you need to keep some 4K memory around for the host itself. | 20:43 |
spatel | but you have to reboot your flavor also right after adjust hugepage | 20:43 |
*** hamzy has quit IRC | 20:44 | |
cfriesen | not sure what you mean by "reboot your flavor" :) | 20:44 |
spatel | I kept 8G memory for host that is why i pick 27000 pages | 20:44 |
spatel | i meant reboot your VM | 20:44 |
cfriesen | spatel: that should be fine as long as most of your guests are using hugepages | 20:44 |
spatel | cfriesen: thanks! in that case i will go with 27000 | 20:45 |
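As a rough sanity check on the numbers above, the following sketch works out how much memory 27000 2M pages consume and how many pages would fit after an 8G host reserve. It assumes the node really exposes a full 64 GiB (usable RAM is usually slightly less), and the helper names are made up for illustration.

```python
MIB_PER_GIB = 1024


def hugepage_budget(total_gib: int, host_reserve_gib: int, page_mib: int) -> int:
    """Number of hugepages of size page_mib that fit after the host reserve."""
    usable_mib = (total_gib - host_reserve_gib) * MIB_PER_GIB
    return usable_mib // page_mib


def hugepage_footprint_gib(pages: int, page_mib: int) -> float:
    """Memory consumed by the given number of hugepages, in GiB."""
    return pages * page_mib / MIB_PER_GIB


# 2M pages that fit in 64G with 8G kept for the host.
print(hugepage_budget(64, 8, 2))
# Memory taken by the 27000 x 2M pages from the kernel command line.
print(hugepage_footprint_gib(27000, 2))
# Same budget expressed in 1G pages.
print(hugepage_budget(64, 8, 1024))
```

Under these assumptions 27000 pages sits a little below the 8G-reserve ceiling, so the host keeps somewhat more than 8G of 4K memory.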
cfriesen | spatel: changing a flavor and then rebooting your vm won't do anything. the flavor information was cached in the VM at creation time. | 20:46 |
cfriesen | in the instance object, rather | 20:47 |
spatel | oh!! | 20:47 |
spatel | cool | 20:47 |
*** erlon has quit IRC | 20:48 | |
*** awaugama has quit IRC | 20:51 | |
*** READ10 has joined #openstack-nova | 20:55 | |
mriedem | i still don't understand how this forbidden aggregate placement API thing is going to be used via nova for blazar which is use case in this spec https://review.openstack.org/#/c/603352/ | 20:56 |
efried | mriedem: You kind of had to be in the room, unfortunately. | 20:56 |
mriedem | it seems we're gung ho about adding a placement api without any details on how to ues it | 20:56 |
mriedem | *use | 20:56 |
mriedem | well, the place to document that is in the spec for those not in the room right? | 20:56 |
mriedem | if the answer is, "we're going to fork nova and make pre-request filters an extension point" then say that | 20:56 |
efried | Yes, I agree. If it's not clear to someone who wasn't in the room, it needs a rewrite. | 20:57 |
*** priteau has quit IRC | 20:57 | |
mriedem | i particularly want Kevin_Zheng on board with this b/c he had a use case for the dedicated host stuff as well | 20:58 |
mriedem | but he needs to speak up on the spec review too | 20:58 |
*** priteau has joined #openstack-nova | 21:01 | |
*** mriedem is now known as mriedem_away | 21:02 | |
mriedem_away | time for parent/teacher conferences | 21:02 |
*** eharney has quit IRC | 21:03 | |
*** munimeha1 has joined #openstack-nova | 21:05 | |
efried | mnaser, dansmith: Would DISTINCT have done the same thing? And possibly be more efficient? | 21:18 |
efried | sorry, I'm talking about https://review.openstack.org/#/c/611115/3 | 21:18 |
mnaser | efried: i dunno, not an sql expert, i didnt try it and it seemed like the.. easier way | 21:19 |
*** mchlumsky has quit IRC | 21:25 | |
*** moshele has quit IRC | 21:35 | |
*** lbragstad is now known as lbragstad-503 | 21:42 | |
openstackgerrit | Merged openstack/nova master: Transform volume.usage notification https://review.openstack.org/580345 | 21:42 |
*** munimeha1 has quit IRC | 21:45 | |
*** slaweq has joined #openstack-nova | 21:53 | |
*** mriedem_away has quit IRC | 21:55 | |
*** smcginnis is now known as smcginnis_vaca | 21:55 | |
*** spatel has quit IRC | 21:56 | |
*** priteau has quit IRC | 22:03 | |
*** tbachman has joined #openstack-nova | 22:04 | |
*** slaweq has quit IRC | 22:09 | |
*** tbachman has quit IRC | 22:10 | |
*** slaweq has joined #openstack-nova | 22:11 | |
*** tbachman has joined #openstack-nova | 22:14 | |
*** slaweq has quit IRC | 22:44 | |
*** rcernin has joined #openstack-nova | 22:49 | |
*** macza has quit IRC | 23:01 | |
*** slaweq has joined #openstack-nova | 23:11 | |
*** dave-mccowan has quit IRC | 23:24 | |
*** hamzy has joined #openstack-nova | 23:24 | |
*** mlavalle has quit IRC | 23:31 | |
*** takashin has joined #openstack-nova | 23:41 | |
*** slaweq has quit IRC | 23:44 | |
*** k_mouza has joined #openstack-nova | 23:53 | |
*** erlon has joined #openstack-nova | 23:55 | |
*** k_mouza has quit IRC | 23:57 |