*** betherly has joined #openstack-nova | 00:03 | |
*** s10 has quit IRC | 00:06 | |
*** s10 has joined #openstack-nova | 00:06 | |
*** s10 has quit IRC | 00:07 | |
*** s10 has joined #openstack-nova | 00:07 | |
*** betherly has quit IRC | 00:08 | |
*** s10 has quit IRC | 00:08 | |
*** s10 has joined #openstack-nova | 00:08 | |
*** s10 has quit IRC | 00:08 | |
*** s10 has joined #openstack-nova | 00:09 | |
*** s10 has quit IRC | 00:09 | |
*** s10 has joined #openstack-nova | 00:10 | |
*** s10 has quit IRC | 00:10 | |
*** s10 has joined #openstack-nova | 00:10 | |
*** s10 has quit IRC | 00:11 | |
*** s10 has joined #openstack-nova | 00:11 | |
*** s10 has quit IRC | 00:12 | |
*** takashin has joined #openstack-nova | 00:12 | |
*** s10 has joined #openstack-nova | 00:12 | |
*** s10 has quit IRC | 00:12 | |
*** s10 has joined #openstack-nova | 00:13 | |
*** s10 has quit IRC | 00:13 | |
*** s10 has joined #openstack-nova | 00:13 | |
*** s10 has quit IRC | 00:14 | |
*** s10 has joined #openstack-nova | 00:14 | |
*** s10 has quit IRC | 00:15 | |
*** s10 has joined #openstack-nova | 00:15 | |
*** s10 has quit IRC | 00:15 | |
*** brinzhang has joined #openstack-nova | 00:15 | |
*** s10 has joined #openstack-nova | 00:16 | |
*** s10 has quit IRC | 00:16 | |
*** s10 has joined #openstack-nova | 00:17 | |
*** s10 has quit IRC | 00:17 | |
*** s10 has joined #openstack-nova | 00:17 | |
*** s10 has quit IRC | 00:18 | |
*** s10 has joined #openstack-nova | 00:18 | |
*** s10 has quit IRC | 00:18 | |
*** READ10 has joined #openstack-nova | 00:19 | |
*** s10 has joined #openstack-nova | 00:19 | |
*** s10 has quit IRC | 00:19 | |
*** s10 has joined #openstack-nova | 00:20 | |
*** s10 has quit IRC | 00:20 | |
*** s10 has joined #openstack-nova | 00:20 | |
*** s10 has quit IRC | 00:21 | |
*** s10 has joined #openstack-nova | 00:21 | |
*** s10 has quit IRC | 00:22 | |
*** s10 has joined #openstack-nova | 00:22 | |
*** s10 has quit IRC | 00:22 | |
*** s10 has joined #openstack-nova | 00:23 | |
*** s10 has quit IRC | 00:23 | |
*** betherly has joined #openstack-nova | 00:23 | |
*** s10 has joined #openstack-nova | 00:23 | |
*** s10 has quit IRC | 00:24 | |
*** s10 has joined #openstack-nova | 00:24 | |
*** s10 has quit IRC | 00:25 | |
*** s10 has joined #openstack-nova | 00:25 | |
*** s10 has quit IRC | 00:25 | |
*** s10 has joined #openstack-nova | 00:26 | |
*** s10 has quit IRC | 00:26 | |
*** s10 has joined #openstack-nova | 00:27 | |
*** s10 has quit IRC | 00:27 | |
*** s10 has joined #openstack-nova | 00:27 | |
*** s10 has quit IRC | 00:28 | |
*** s10 has joined #openstack-nova | 00:28 | |
*** betherly has quit IRC | 00:28 | |
*** s10 has quit IRC | 00:29 | |
*** s10 has joined #openstack-nova | 00:29 | |
*** s10 has quit IRC | 00:29 | |
*** s10 has joined #openstack-nova | 00:30 | |
*** s10 has quit IRC | 00:30 | |
*** s10 has joined #openstack-nova | 00:30 | |
*** s10 has quit IRC | 00:31 | |
*** s10 has joined #openstack-nova | 00:31 | |
*** s10 has quit IRC | 00:32 | |
*** s10 has joined #openstack-nova | 00:32 | |
*** s10 has quit IRC | 00:32 | |
*** s10 has joined #openstack-nova | 00:33 | |
*** s10 has quit IRC | 00:33 | |
*** s10 has joined #openstack-nova | 00:34 | |
*** s10 has quit IRC | 00:34 | |
*** s10 has joined #openstack-nova | 00:34 | |
*** s10 has quit IRC | 00:35 | |
*** s10 has joined #openstack-nova | 00:35 | |
*** s10 has quit IRC | 00:36 | |
*** s10 has joined #openstack-nova | 00:36 | |
*** s10 has quit IRC | 00:36 | |
*** s10 has joined #openstack-nova | 00:37 | |
*** s10 has quit IRC | 00:37 | |
*** s10 has joined #openstack-nova | 00:37 | |
*** s10 has quit IRC | 00:38 | |
*** s10 has joined #openstack-nova | 00:38 | |
*** s10 has quit IRC | 00:39 | |
*** s10 has joined #openstack-nova | 00:39 | |
*** s10 has quit IRC | 00:39 | |
*** s10 has joined #openstack-nova | 00:40 | |
*** s10 has quit IRC | 00:40 | |
*** s10 has joined #openstack-nova | 00:41 | |
*** s10 has quit IRC | 00:41 | |
*** s10 has joined #openstack-nova | 00:41 | |
*** s10 has quit IRC | 00:42 | |
*** s10 has joined #openstack-nova | 00:42 | |
*** s10 has quit IRC | 00:42 | |
*** s10 has joined #openstack-nova | 00:43 | |
*** s10 has quit IRC | 00:43 | |
*** s10 has joined #openstack-nova | 00:44 | |
*** s10 has quit IRC | 00:44 | |
*** s10 has joined #openstack-nova | 00:44 | |
*** s10 has quit IRC | 00:45 | |
*** s10 has joined #openstack-nova | 00:45 | |
*** s10 has quit IRC | 00:46 | |
*** s10 has joined #openstack-nova | 00:46 | |
*** s10 has quit IRC | 00:46 | |
*** s10 has joined #openstack-nova | 00:47 | |
*** s10 has quit IRC | 00:47 | |
*** s10 has joined #openstack-nova | 00:47 | |
*** s10 has quit IRC | 00:48 | |
*** s10 has joined #openstack-nova | 00:48 | |
*** s10 has quit IRC | 00:49 | |
*** markvoelker has joined #openstack-nova | 00:58 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add restrictions on updated_at when getting migrations https://review.openstack.org/607798 | 00:58 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add restrictions on updated_at when getting instance action records https://review.openstack.org/607801 | 01:05 |
*** erlon__ has quit IRC | 01:13 | |
*** tommylikehu has joined #openstack-nova | 01:17 | |
*** beekneemech has joined #openstack-nova | 01:17 | |
*** bnemec has quit IRC | 01:19 | |
*** imacdonn has quit IRC | 01:22 | |
*** imacdonn has joined #openstack-nova | 01:22 | |
*** mrsoul has quit IRC | 01:25 | |
*** markvoelker has quit IRC | 01:32 | |
*** tetsuro has joined #openstack-nova | 01:37 | |
*** Dinesh_Bhor has joined #openstack-nova | 01:44 | |
*** pooja_jadhav has quit IRC | 01:45 | |
*** mhen has quit IRC | 01:47 | |
*** dklyle has joined #openstack-nova | 01:50 | |
*** mhen has joined #openstack-nova | 01:50 | |
*** david-lyle has joined #openstack-nova | 01:53 | |
*** pooja_jadhav has joined #openstack-nova | 01:53 | |
*** dklyle has quit IRC | 01:56 | |
*** dklyle has joined #openstack-nova | 01:56 | |
*** david-lyle has quit IRC | 01:59 | |
*** hongbin has joined #openstack-nova | 02:03 | |
openstackgerrit | Merged openstack/python-novaclient master: Recommend against using --force for evacuate/live migration https://review.openstack.org/611436 | 02:04 |
*** lei-zh has joined #openstack-nova | 02:06 | |
*** bnemec has joined #openstack-nova | 02:13 | |
*** TuanDA has joined #openstack-nova | 02:14 | |
*** beekneemech has quit IRC | 02:15 | |
*** beekneemech_ has joined #openstack-nova | 02:15 | |
*** bnemec has quit IRC | 02:18 | |
*** dklyle has quit IRC | 02:20 | |
*** betherly has joined #openstack-nova | 02:26 | |
*** bnemec has joined #openstack-nova | 02:26 | |
*** beekneemech_ has quit IRC | 02:29 | |
*** betherly has quit IRC | 02:30 | |
*** lei-zh has quit IRC | 02:31 | |
*** Dinesh_Bhor has quit IRC | 02:31 | |
*** lei-zh has joined #openstack-nova | 02:31 | |
*** beekneemech has joined #openstack-nova | 02:31 | |
*** bnemec has quit IRC | 02:34 | |
*** Dinesh_Bhor has joined #openstack-nova | 02:35 | |
*** bnemec has joined #openstack-nova | 02:48 | |
*** beekneemech has quit IRC | 02:51 | |
*** betherly has joined #openstack-nova | 02:56 | |
*** tetsuro has quit IRC | 02:59 | |
*** betherly has quit IRC | 03:00 | |
*** tetsuro has joined #openstack-nova | 03:02 | |
openstackgerrit | Merged openstack/nova stable/rocky: Ignore uuid if already set in ComputeNode.update_from_virt_driver https://review.openstack.org/611337 | 03:04 |
*** READ10 has quit IRC | 03:12 | |
*** lei-zh has quit IRC | 03:32 | |
openstackgerrit | Merged openstack/nova stable/rocky: Use unique consumer_id when doing online data migration https://review.openstack.org/611315 | 03:34 |
*** hongbin has quit IRC | 03:46 | |
*** betherly has joined #openstack-nova | 03:47 | |
*** betherly has quit IRC | 03:52 | |
*** tetsuro has quit IRC | 04:03 | |
*** tetsuro has joined #openstack-nova | 04:05 | |
*** Dinesh_Bhor has quit IRC | 04:08 | |
*** beekneemech has joined #openstack-nova | 04:10 | |
*** bnemec has quit IRC | 04:13 | |
*** tommylikehu has quit IRC | 04:16 | |
*** jaosorior has joined #openstack-nova | 04:24 | |
*** tetsuro has quit IRC | 04:33 | |
*** bnemec has joined #openstack-nova | 04:45 | |
*** beekneemech has quit IRC | 04:48 | |
*** bnemec has quit IRC | 04:51 | |
*** bnemec has joined #openstack-nova | 04:51 | |
*** beekneemech has joined #openstack-nova | 05:03 | |
*** bnemec has quit IRC | 05:06 | |
*** andreaf has quit IRC | 05:10 | |
*** andreaf has joined #openstack-nova | 05:15 | |
*** Dinesh_Bhor has joined #openstack-nova | 05:17 | |
*** s10 has joined #openstack-nova | 05:18 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Fix best_match() deprecation warning https://review.openstack.org/611204 | 05:35 |
*** Luzi has joined #openstack-nova | 05:45 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add restrictions on updated_at when getting migrations https://review.openstack.org/607798 | 05:46 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add restrictions on updated_at when getting instance action records https://review.openstack.org/607801 | 05:48 |
*** TuanDA has quit IRC | 05:54 | |
*** pcaruana has joined #openstack-nova | 06:14 | |
*** s10 has quit IRC | 06:19 | |
*** lei-zh has joined #openstack-nova | 06:20 | |
*** jding1_ has joined #openstack-nova | 06:24 | |
*** jackding has quit IRC | 06:27 | |
*** jding1__ has joined #openstack-nova | 06:32 | |
*** jaosorior has quit IRC | 06:32 | |
*** renxiaof has joined #openstack-nova | 06:34 | |
*** jding1_ has quit IRC | 06:34 | |
*** jaosorior has joined #openstack-nova | 06:34 | |
*** Dinesh_Bhor has quit IRC | 06:36 | |
jaosorior | Could I get a review for this https://review.openstack.org/#/c/609591/ ? | 06:36 |
*** jding1_ has joined #openstack-nova | 06:37 | |
*** jackding has joined #openstack-nova | 06:40 | |
*** jding1_ has quit IRC | 06:41 | |
*** jding1__ has quit IRC | 06:41 | |
*** ccamacho has joined #openstack-nova | 06:45 | |
*** sahid has joined #openstack-nova | 06:57 | |
*** sahid has quit IRC | 06:57 | |
*** sahid has joined #openstack-nova | 06:57 | |
bauzas | Good morning Nova | 07:09 |
*** pcaruana has quit IRC | 07:09 | |
*** stakeda has joined #openstack-nova | 07:17 | |
*** rcernin has quit IRC | 07:17 | |
openstackgerrit | Zhenyu Zheng proposed openstack/nova-specs master: Detach and attach boot volumes - Stein https://review.openstack.org/600628 | 07:17 |
*** Dinesh_Bhor has joined #openstack-nova | 07:26 | |
*** pcaruana has joined #openstack-nova | 07:28 | |
*** ralonsoh has joined #openstack-nova | 07:29 | |
*** TuanDA has joined #openstack-nova | 07:29 | |
*** pcaruana is now known as pcaruana|elisa| | 07:30 | |
*** renxiaof has quit IRC | 07:31 | |
*** slaweq has quit IRC | 07:32 | |
*** helenafm has joined #openstack-nova | 07:32 | |
*** slaweq has joined #openstack-nova | 07:33 | |
*** tssurya has joined #openstack-nova | 07:49 | |
*** takashin has left #openstack-nova | 08:00 | |
*** Dinesh_Bhor has quit IRC | 08:02 | |
openstackgerrit | Merged openstack/nova master: Migrate nova v2.0 legacy job to zuulv3 https://review.openstack.org/610403 | 08:19 |
*** derekh has joined #openstack-nova | 08:26 | |
*** tetsuro has joined #openstack-nova | 08:27 | |
*** s10 has joined #openstack-nova | 08:34 | |
*** erlon has joined #openstack-nova | 08:39 | |
*** vabada has quit IRC | 08:42 | |
*** vabada has joined #openstack-nova | 08:43 | |
*** erlon has quit IRC | 08:58 | |
*** lei-zh has quit IRC | 08:58 | |
*** helenafm has quit IRC | 09:01 | |
*** Dinesh_Bhor has joined #openstack-nova | 09:06 | |
*** ttsiouts has joined #openstack-nova | 09:12 | |
*** zigo has joined #openstack-nova | 09:14 | |
*** s10 has quit IRC | 09:15 | |
*** helenafm has joined #openstack-nova | 09:21 | |
*** maciejjozefczyk has quit IRC | 09:30 | |
*** maciejjozefczyk has joined #openstack-nova | 09:31 | |
*** erlon has joined #openstack-nova | 09:47 | |
*** s10 has joined #openstack-nova | 09:50 | |
*** brinzhang has quit IRC | 09:53 | |
*** TuanDA has quit IRC | 10:05 | |
openstackgerrit | Tetsuro Nakamura proposed openstack/nova-specs master: Spec: Support filtering by forbidden aggregate https://review.openstack.org/603352 | 10:13 |
*** sambetts|afk is now known as sambetts | 10:14 | |
*** stakeda has quit IRC | 10:16 | |
*** tetsuro has quit IRC | 10:20 | |
*** Dinesh_Bhor has quit IRC | 10:22 | |
*** helenafm has quit IRC | 10:23 | |
openstackgerrit | huanhongda proposed openstack/nova master: AZ operations: check host has no instances https://review.openstack.org/611833 | 10:30 |
*** tetsuro has joined #openstack-nova | 10:31 | |
*** tbachman has quit IRC | 10:33 | |
openstackgerrit | huanhongda proposed openstack/nova master: AZ operations: check host has no instances https://review.openstack.org/611833 | 10:33 |
*** cdent has joined #openstack-nova | 10:35 | |
*** helenafm has joined #openstack-nova | 10:36 | |
*** tbachman has joined #openstack-nova | 10:37 | |
*** dtantsur|afk is now known as dtantsur | 10:43 | |
*** tbachman has quit IRC | 10:46 | |
*** phillu has joined #openstack-nova | 10:51 | |
*** dave-mccowan has joined #openstack-nova | 10:53 | |
*** tetsuro has quit IRC | 10:58 | |
*** skatsaounis has quit IRC | 11:25 | |
*** skatsaounis has joined #openstack-nova | 11:26 | |
*** phillu has quit IRC | 11:35 | |
*** phillu has joined #openstack-nova | 11:36 | |
openstackgerrit | Radoslav Gerganov proposed openstack/nova master: Preserve compute stats used by the scheduler https://review.openstack.org/611852 | 11:39 |
*** phillu has quit IRC | 11:41 | |
*** panda is now known as panda|lunch | 11:53 | |
*** jdillaman1 has joined #openstack-nova | 11:57 | |
*** dillaman has quit IRC | 11:59 | |
*** tbachman has joined #openstack-nova | 12:05 | |
*** tbachman has quit IRC | 12:10 | |
*** tbachman has joined #openstack-nova | 12:15 | |
*** jiaopengju has quit IRC | 12:16 | |
*** jiaopengju has joined #openstack-nova | 12:16 | |
*** mriedem has joined #openstack-nova | 12:19 | |
*** cdent has quit IRC | 12:24 | |
*** jangutter has quit IRC | 12:30 | |
*** dpawlik has quit IRC | 12:30 | |
*** dpawlik has joined #openstack-nova | 12:30 | |
*** jangutter has joined #openstack-nova | 12:32 | |
*** eharney has joined #openstack-nova | 12:33 | |
*** priteau has joined #openstack-nova | 12:34 | |
*** dpawlik has quit IRC | 12:35 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Document each libvirt.sysinfo_serial choice https://review.openstack.org/611426 | 12:39 |
mriedem | so uh, do we want to do this stable-only ironic inventory workaround thing in rocky? https://review.openstack.org/#/c/609043/ | 12:43 |
mriedem | and queens and pike | 12:43 |
mriedem | tl;dr once you've migrated all of your ironic instances to resource classes, you don't want to report vcpu/ram/disk inventory anymore on those nodes | 12:44 |
*** dansmith is now known as SteelyDan | 12:45 | |
SteelyDan | I'd say so | 12:48 |
mriedem | i wasn't sure if the option should be deprecated immediately? seems kind of weird, but it will just be gone when you get to stein. | 12:53 |
mriedem | not sure how much it matters | 12:53 |
SteelyDan | me either | 12:56 |
mriedem | bauzas: were you withholding a +W on https://review.openstack.org/#/c/610088/ for some reason? | 12:56 |
SteelyDan | mriedem: does this ring any bells? http://logs.openstack.org/58/591658/12/check/tempest-full-py3/5224550/controller/logs/screen-n-api.txt.gz?level=TRACE#_Oct_18_20_49_20_086138 | 12:56 |
mriedem | even though you were +2? | 12:56 |
*** stephenfin is now known as finucannot | 12:57 | |
SteelyDan | there's a ton of unrelated red in that log, so ignore the rest, but the FK error there doesn't seem related to the patch | 12:57 |
SteelyDan | and there are also rabbit connection failures later in the log | 12:57 |
*** slaweq has quit IRC | 12:57 | |
mriedem | hmm, no, also seems weird that we'd get cell0 FK errors for a resize operation.... | 12:58 |
mriedem | which shouldn't have anything to do with cell0 | 12:58 |
SteelyDan | it's an action, | 12:58 |
mriedem | unless it's trying to record an action in the api, | 12:58 |
SteelyDan | but yeah that clearly looks like we're talking to the wrong db | 12:58 |
mriedem | and defaulting to the cell0 db connection in nova.conf, | 12:58 |
mriedem | but when we look up the instance we should target the context to cell1 | 12:59 |
SteelyDan | oh I bet I know | 12:59 |
mriedem | so, it looks like it's probably shitting b/c it's trying to create an action in cell0 for an instance in cell1 | 12:59 |
SteelyDan | this changes it so we're not permanently targeting the context.. so the resize goes on to be untargeted | 12:59 |
SteelyDan | didn't think of that until you made the connection to cell0 | 13:00 |
SteelyDan | that's also probably why we can't connect to fake:/// or whatever rabbit | 13:00 |
mriedem | uh huh | 13:00 |
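The failure discussed above (an action record written to cell0 for an instance that lives in cell1) can be illustrated with a toy model. This is a hedged sketch, not nova code: `Context`, `target_cell`, `record_action`, and the `CELL_DBS` dict are hypothetical stand-ins for the request context and the per-cell databases. It only shows why a context that is targeted temporarily falls back to the default database once the targeting block exits.

```python
from contextlib import contextmanager

# Hypothetical stand-ins for per-cell databases: cell name -> instance uuids.
CELL_DBS = {"cell0": set(), "cell1": {"inst-1"}}
DEFAULT_CELL = "cell0"  # assumption: the default connection points at cell0


class Context:
    """Toy request context; `cell` is None while untargeted."""

    def __init__(self):
        self.cell = None


@contextmanager
def target_cell(ctxt, cell):
    """Temporarily point the context at one cell, then restore it."""
    previous = ctxt.cell
    ctxt.cell = cell
    try:
        yield ctxt
    finally:
        ctxt.cell = previous


def record_action(ctxt, instance_uuid):
    """Write an action row; fails (like a foreign-key error) if the instance
    does not exist in the database the context resolves to."""
    cell = ctxt.cell or DEFAULT_CELL
    if instance_uuid not in CELL_DBS[cell]:
        raise LookupError(f"{instance_uuid} not found in {cell}")
    return cell


ctxt = Context()
with target_cell(ctxt, "cell1"):
    inside = record_action(ctxt, "inst-1")  # resolves to cell1

try:
    record_action(ctxt, "inst-1")  # untargeted again: falls back to cell0
    outside_error = None
except LookupError as exc:
    outside_error = str(exc)
```

In this model the lookup and the later operation must share the same targeting scope; once the `with` block ends, any follow-up write lands in the default database.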
openstackgerrit | Dan Smith proposed openstack/nova master: Return a minimal construct for nova show when a cell is down https://review.openstack.org/591658 | 13:02 |
openstackgerrit | Dan Smith proposed openstack/nova master: Return a minimal construct for nova service-list when a cell is down https://review.openstack.org/584829 | 13:02 |
*** slaweq has joined #openstack-nova | 13:07 | |
*** derekh has quit IRC | 13:11 | |
*** derekh has joined #openstack-nova | 13:11 | |
*** mchlumsky has joined #openstack-nova | 13:14 | |
bauzas | mriedem: just a miss I guess | 13:16 |
bauzas | fixed | 13:16 |
*** jmlowe has joined #openstack-nova | 13:17 | |
mriedem | danke | 13:18 |
bauzas | bitte | 13:18 |
* bauzas now has 1800XP for Duolingo in German | 13:18 | |
*** lbragstad is now known as elbragstad | 13:23 | |
*** slaweq has quit IRC | 13:27 | |
openstackgerrit | Daniel Abad proposed openstack/nova master: Fix ironic client ironic_url deprecation warning https://review.openstack.org/611872 | 13:30 |
*** dpawlik has joined #openstack-nova | 13:37 | |
*** dpawlik has quit IRC | 13:38 | |
*** dpawlik has joined #openstack-nova | 13:38 | |
*** awaugama has joined #openstack-nova | 13:43 | |
*** slaweq has joined #openstack-nova | 13:49 | |
*** munimeha1 has joined #openstack-nova | 13:53 | |
*** rpittau has quit IRC | 13:56 | |
melwitt | 14:01 | |
*** sidx64 has joined #openstack-nova | 14:05 | |
*** mriedem has quit IRC | 14:06 | |
*** Luzi has quit IRC | 14:07 | |
*** mriedem has joined #openstack-nova | 14:09 | |
*** s10 has quit IRC | 14:15 | |
openstackgerrit | Stephen Finucane proposed openstack/osc-placement master: Enforce key-value'ness for 'allocation candidate list --resource' https://review.openstack.org/611883 | 14:18 |
openstackgerrit | Stephen Finucane proposed openstack/osc-placement master: tox: Hide deprecation warnings from stdlib https://review.openstack.org/611884 | 14:18 |
*** s10 has joined #openstack-nova | 14:21 | |
*** k_mouza has joined #openstack-nova | 14:24 | |
*** efried is now known as efried_pto | 14:26 | |
*** mlavalle has joined #openstack-nova | 14:28 | |
*** jistr is now known as jistr|call | 14:29 | |
*** k_mouza has quit IRC | 14:30 | |
*** k_mouza has joined #openstack-nova | 14:31 | |
*** jangutter has quit IRC | 14:33 | |
*** jangutter has joined #openstack-nova | 14:33 | |
*** slaweq has quit IRC | 14:40 | |
*** maciejjozefczyk has quit IRC | 14:45 | |
*** sidx64 has quit IRC | 14:46 | |
*** jistr|call is now known as jistr | 14:47 | |
*** spatel has joined #openstack-nova | 14:48 | |
*** panda|lunch is now known as panda | 14:49 | |
spatel | I got this error when I rebooted one of my instances; any idea what this is? http://paste.openstack.org/show/732501/ | 14:49 |
*** ttsiouts has quit IRC | 14:50 | |
*** pcaruana|elisa| has quit IRC | 14:56 | |
*** pcaruana|elisa| has joined #openstack-nova | 14:57 | |
*** s10 has quit IRC | 15:01 | |
*** ttsiouts has joined #openstack-nova | 15:01 | |
*** helenafm has quit IRC | 15:05 | |
*** pcaruana|elisa| has quit IRC | 15:08 | |
*** pcaruana has joined #openstack-nova | 15:08 | |
*** tssurya has quit IRC | 15:11 | |
*** ttsiouts has quit IRC | 15:12 | |
*** k_mouza has quit IRC | 15:12 | |
*** mriedem is now known as mriedem_afk | 15:27 | |
cfriesen | spatel: looks like corrupt filesystem | 15:27 |
cfriesen | or at least corrupt something on disk | 15:28 |
*** pcaruana has quit IRC | 15:31 | |
mriedem_afk | i've triaged https://bugs.launchpad.net/nova/+bug/1798805 but not sure if we would ever do anything about it | 15:31 |
openstack | Launchpad bug 1798805 in OpenStack Compute (nova) "Nova scheduler schedules VMs on nodes where nova-compute is down" [Wishlist,Triaged] | 15:31 |
finucannot | bauzas: If I delete a compute node's RP, it'll be recreated, right? | 15:33 |
*** spatel has quit IRC | 15:33 | |
SteelyDan | mriedem_afk: I think that means disable in the "break its needs" sort of sense | 15:34 |
*** dpawlik has quit IRC | 15:34 | |
*** dpawlik has joined #openstack-nova | 15:34 | |
SteelyDan | I'm strongly in favor of keeping disable purely for scheduler reasons, but I think it's saying if the compute is actually down, we shouldn't take action on instances | 15:34 |
SteelyDan | which is probably reasonable | 15:34 |
*** dpawlik has quit IRC | 15:34 | |
*** k_mouza has joined #openstack-nova | 15:34 | |
SteelyDan | or make our super awesome bug-free state sync periodic power the instance on when the compute comes back | 15:35 |
*** dpawlik has joined #openstack-nova | 15:35 | |
imacdonn | I kinda sorta wish there was a "don't schedule anything new" status, which is different from "don't try to interact with me at all" | 15:36 |
*** dpawlik has quit IRC | 15:36 | |
SteelyDan | imacdonn: that's what disable means | 15:36 |
*** dpawlik has joined #openstack-nova | 15:36 | |
SteelyDan | it's the only thing it means | 15:36 |
imacdonn | the former, you mean | 15:36 |
SteelyDan | it means don't schedule anything there | 15:36 |
imacdonn | right | 15:37 |
*** ttsiouts has joined #openstack-nova | 15:37 | |
SteelyDan | you're saying you wish there was a "kneecap this compute" | 15:37 |
SteelyDan | API? | 15:37 |
imacdonn | IMO, it should really mean "don't talk to that node at all right now", and there should be some other way to tell the scheduler not to put anything NEW there | 15:37 |
finucannot | mriedem_afk, awaugama: More investigation needed but I think there's something wrong with those foo_allocation_ratio options. I configured them on the compute node, waited ages aaand...nada. Deleting the RP fixed things | 15:38 |
imacdonn | (or maybe the opposite, so keep the existing meaning of "disable") ... but I think there are use-cases for each | 15:38 |
SteelyDan | we're not changing the existing meaning for disable | 15:38 |
SteelyDan | we could add another state, but it just means for every single operation we do another db hit to check the state of the host | 15:38 |
finucannot | mriedem_afk, awaugama: At least, assuming my understanding of how that's _supposed_ to work is correct. It is Friday so maybe it's not | 15:39 |
imacdonn | "hard_disabled" ? :) | 15:39 |
SteelyDan | no. | 15:39 |
*** jangutter has quit IRC | 15:40 | |
*** dpawlik has quit IRC | 15:40 | |
imacdonn | possible use-case: node is being taken down for hardware maintenance. operator has shut down the instances on it. We don't want to allow the user to start their instances back up until the maintenance is completed | 15:42 |
imacdonn | (we do this all the time, in a private cloud context) | 15:42 |
*** spatel has joined #openstack-nova | 15:44 | |
spatel | cfriesen: I have other VMs running on same compute node they are fine.. very strange issue | 15:44 |
SteelyDan | imacdonn: yeah I'm sure everyone does.. I get the use case, it's unfortunate to need to check the status on every call, but I get it | 15:45 |
*** dpawlik has joined #openstack-nova | 15:52 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: api-ref: 'os-hypervisors' doesn't reflect overcommit ratio https://review.openstack.org/611604 | 15:53 |
*** ccamacho has quit IRC | 15:56 | |
*** dpawlik has quit IRC | 15:56 | |
*** jmlowe has quit IRC | 16:01 | |
cfriesen | imacdonn: so lock the instances? | 16:07 |
*** beekneemech has quit IRC | 16:08 | |
*** bnemec has joined #openstack-nova | 16:08 | |
cfriesen | imacdonn: or stop the nova-compute process? | 16:09 |
imacdonn | cfriesen: I guess, but the owner can unlock it? | 16:09 |
cfriesen | imacdonn: you could always make lock/unlock admin-only. | 16:09 |
imacdonn | cfriesen: I guess stopping nova-compute may cause the bug that mriedem_afk was reviewing ... although there are some open questions in the bug | 16:09 |
*** mgariepy has quit IRC | 16:10 | |
*** mgariepy has joined #openstack-nova | 16:15 | |
cfriesen | I'm not sure it's really a bug...more like a feature to more gracefully handle cases where things get "stuck" in what was supposed to be a transitional state. | 16:16 |
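The distinction drawn in this thread (a disabled host only affects scheduling, while a host whose compute service is down arguably should not have actions taken on its instances) can be sketched as two separate checks. This is an illustrative toy, not nova's implementation: `ComputeService`, `can_schedule_to`, `can_act_on_instances`, and the 60-second threshold are assumptions made for the example.

```python
from dataclasses import dataclass

SERVICE_DOWN_TIME = 60  # seconds; assumed staleness threshold for this sketch


@dataclass
class ComputeService:
    host: str
    disabled: bool         # operator-set flag
    last_heartbeat: float  # seconds since epoch


def is_up(service, now):
    """A service counts as up if it reported in recently enough."""
    return (now - service.last_heartbeat) <= SERVICE_DOWN_TIME


def can_schedule_to(service, now):
    """New instances only land on hosts that are both enabled and up."""
    return not service.disabled and is_up(service, now)


def can_act_on_instances(service, now):
    """Hypothetical guard: instances on a disabled host stay manageable,
    but not those on a host whose compute service is down."""
    return is_up(service, now)


now = 1000.0
maintenance = ComputeService("node-a", disabled=True, last_heartbeat=now - 5)
crashed = ComputeService("node-b", disabled=False, last_heartbeat=now - 300)
```

Under these assumptions `node-a` rejects new instances but still accepts operations on existing ones, which is why the maintenance use case raised above would need either a separate state or instance locking.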
*** tbachman has quit IRC | 16:21 | |
*** openstackgerrit has quit IRC | 16:24 | |
*** ttsiouts has quit IRC | 16:30 | |
*** sambetts is now known as sambetts|afk | 16:32 | |
*** k_mouza_ has joined #openstack-nova | 16:39 | |
*** sahid has quit IRC | 16:42 | |
*** bnemec is now known as beekneemech | 16:43 | |
*** k_mouza has quit IRC | 16:43 | |
*** k_mouza_ has quit IRC | 16:44 | |
*** dtantsur is now known as dtantsur|afk | 16:44 | |
*** openstackgerrit has joined #openstack-nova | 16:52 | |
openstackgerrit | Merged openstack/nova master: Fix deprecated base64.decodestring warning https://review.openstack.org/610401 | 16:52 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Fail to live migration if instance has a NUMA topology https://review.openstack.org/611088 | 17:00 |
*** derekh has quit IRC | 17:01 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Fail to live migration if instance has a NUMA topology https://review.openstack.org/611088 | 17:05 |
*** mdbooth has quit IRC | 17:06 | |
*** gyee has joined #openstack-nova | 17:12 | |
*** k_mouza has joined #openstack-nova | 17:16 | |
*** cdent has joined #openstack-nova | 17:20 | |
*** k_mouza has quit IRC | 17:21 | |
awaugama | finucannot, was at lunch. so you're able to reproduce that setting the value on the compute node doesn't change anything in placement? | 17:23 |
sean-k-mooney | awaugama: which setting is that? | 17:29 |
awaugama | cpu_allocation_ratio I believe | 17:30 |
awaugama | yeah | 17:30 |
sean-k-mooney | awaugama: there is a spec currently for changing how we address this for stein | 17:30 |
awaugama | I'm not aware of one | 17:31 |
sean-k-mooney | awaugama: there are 2 but i believe we have settled on one, i'll see if i can find it | 17:31 |
awaugama | cool | 17:31 |
sean-k-mooney | awaugama: this is one of the specs https://review.openstack.org/#/c/552105/ | 17:32 |
sean-k-mooney | i think that is the most recent one | 17:33 |
sean-k-mooney | this is the other one https://review.openstack.org/#/c/544683/ but i'm not sure if it will still be required | 17:35 |
awaugama | checking | 17:35 |
awaugama | yeah that looks like what finucannot was talking about being messed up | 17:36 |
sean-k-mooney | well that depends on what you were expecting | 17:36 |
sean-k-mooney | awaugama: what was your initial issue | 17:36 |
awaugama | setting cpu_allocation_ratio to 1 in nova.conf changed the value on the compute node (even on compute node startup it showed it was changed) but placement still has it set to 16 | 17:37 |
sean-k-mooney | awaugama: right i believe currently it will only use the value if it is first creating the provider | 17:37 |
sean-k-mooney | if the provider exists it will not update it i think... | 17:38 |
awaugama | that seems counter-intuitive | 17:38 |
sean-k-mooney | awaugama: this will be changed going forward so that the cpu_allocation_ratio config option also sets the placement value, and cpu_initial_allocation_ratio specifies what it should be for newly created resource providers | 17:39 |
sean-k-mooney | awaugama: the current behavior i believe is intended to allow you to manage the allocation ratios via the api instead of the config | 17:39 |
awaugama | interesting. finucannot and mriedem_afk ^ | 17:40 |
awaugama | i need to step away for a few minutes, brb | 17:40 |
sean-k-mooney | awaugama: finucannot is likely not around anymore. | 17:41 |
sean-k-mooney | awaugama: he is usually enjoying his weekend by now :) | 17:42 |
awaugama | sean-k-mooney, yeah, I figure he'll get the scrollback at some point | 17:50 |
sean-k-mooney | awaugama: the initial allocation ratios spec is likely the one that is most relevant to you https://review.openstack.org/#/c/552105/ | 17:57 |
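The behaviour described in this thread (the configured ratio is applied only when the resource provider is first created, so a later config change leaves the stored value at 16 until the provider is deleted and recreated) can be modelled in a few lines. This is a sketch of the behaviour as sean-k-mooney describes it, which the log itself marks as uncertain; `providers`, `report_inventory`, and `capacity` are hypothetical names, and the host and vCPU count are made up. The capacity formula used is (total - reserved) * allocation_ratio.

```python
DEFAULT_RATIO = 16.0  # the value awaugama reports still seeing in placement

providers = {}  # hypothetical in-memory stand-in for stored inventories


def report_inventory(host, vcpus, configured_ratio):
    """Create the provider's VCPU inventory on first report only.

    Later reports leave the stored ratio alone, mirroring the behaviour
    described in the discussion (an assumption of this sketch).
    """
    if host not in providers:
        providers[host] = {
            "total": vcpus,
            "reserved": 0,
            "allocation_ratio": configured_ratio,
        }
    return providers[host]


def capacity(inv):
    """Schedulable capacity: (total - reserved) * allocation_ratio."""
    return int((inv["total"] - inv["reserved"]) * inv["allocation_ratio"])


report_inventory("compute-1", vcpus=8, configured_ratio=DEFAULT_RATIO)
report_inventory("compute-1", vcpus=8, configured_ratio=1.0)  # change ignored
stale = capacity(providers["compute-1"])

del providers["compute-1"]  # deleting the provider, as finucannot did
report_inventory("compute-1", vcpus=8, configured_ratio=1.0)
fresh = capacity(providers["compute-1"])
```

With 8 vCPUs the stale provider still advertises 128 schedulable vCPUs, while the recreated one advertises 8, which matches the symptom and the workaround reported earlier in the log.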
*** ralonsoh has quit IRC | 18:03 | |
spatel | sean-k-mooney: hey! how are you doing :) | 18:04 |
*** jmlowe has joined #openstack-nova | 18:04 | |
sean-k-mooney | spatel: quite well thank you. how are you :) | 18:04 |
spatel | after a long time i am seeing you online, or maybe i was not paying attention | 18:04 |
sean-k-mooney | i was on company training most of this week | 18:05 |
spatel | I am great!! and my openstack also going great | 18:05 |
sean-k-mooney | so ya i was not online much. that is good to hear. | 18:05 |
spatel | I have added 80 SR-IOV compute nodes and put them in production :) | 18:06 |
sean-k-mooney | wow that was fast. are you happy with the feature set/performance you have achieved? | 18:06 |
spatel | Yup!! i am not seeing any performance impact for my application so far | 18:06 |
*** medberry has joined #openstack-nova | 18:07 | |
sean-k-mooney | you decided to go with vnic_type direct with a flavor with hugepages and pinned cpus in the end? | 18:07 |
spatel | I have created two AZ groups as we discussed last time, tor-1 and tor-2, and am spreading the application across the two AZs | 18:07 |
spatel | Yes vnic_type=direct / hugepages / cpu pinning / numa=2 | 18:08 |
spatel | whatever setting you suggested last time | 18:08 |
sean-k-mooney | cool that should be a good starting point for any VNF deployed on openstack | 18:09 |
spatel | I am happy with those setting :) | 18:09 |
sean-k-mooney | and for the tor config did you split the control and dataplane across the tors or keep them on the same tor | 18:09 |
spatel | That i haven't tested yet because of deployment urgency! | 18:11 |
spatel | i need to test that in LAB first | 18:11 |
spatel | that is in my to-do list | 18:11 |
sean-k-mooney | well the original config you proposed should work but i was not certain the failure mode was optimal. that said you intended to add a dedicated management switch at a later point so that will likely be the best long-term solution in any case | 18:12 |
spatel | as soon as i get time i want to run tests on dpdk (SR-IOV is still painful even if it's better) | 18:12 |
*** gibi is now known as gibi_off | 18:12 | |
*** medberry has quit IRC | 18:14 | |
spatel | sean-k-mooney: yes i have plan in future to isolate mgmt traffic using extra nic | 18:14 |
spatel | sean-k-mooney: Quick question, how do you guys monitor instance stats? like CPU/memory from KVM | 18:16 |
spatel | currently i have snmp agent running on compute node and also inside instance but that won't give you 100% result right? i want to monitor hypervisor level stats | 18:18 |
spatel | KVM level monitoring in short | 18:18 |
sean-k-mooney | am you have several options | 18:19 |
sean-k-mooney | in general i would recommend collectd for host level performance monitoring | 18:20 |
sean-k-mooney | if you have deployed the openstack ceilometer project you can also configure it to monitor the libvirt instances | 18:20 |
sean-k-mooney | i believe the prometheus project has good monitoring capabilities for the openstack services but it does not have good monitoring capabilities for the host or the vms as far as i am aware | 18:21 |
spatel | I didn't configure ceilometer because i wasn't sure i need that component because its private cloud and we don't need billing | 18:22 |
sean-k-mooney | spatel: that is a common misconception, ceilometer is for telemetry not billing | 18:22 |
sean-k-mooney | that said it's not necessarily the best solution for telemetry either | 18:23 |
spatel | i was worried it will eat my resources | 18:24 |
sean-k-mooney | if you decide to use collectd you could look into https://github.com/openstack/collectd-openstack-plugins to publish metrics to ceilometer or gnocchi but collectd can also advertise the stats it monitors over snmp or other protocols | 18:24 |
sean-k-mooney | spatel: yes that is a concern with ceilometer. it does not scale well which is why i generally don't recommend it as my first choice | 18:24 |
spatel | Agreed ++ | 18:25 |
spatel | i had same concern so i didn't bother to install | 18:25 |
sean-k-mooney | ya so in general i would check out collectd and see if it meets your needs | 18:26 |
spatel | I am going to try collectd sure! | 18:26 |
sean-k-mooney | my former team enhanced the collectd libvirt plugin to allow reporting stats for vms using the uuid field instead of the id field | 18:26 |
sean-k-mooney | the uuid field is set to the nova instance uuid | 18:27 |
sean-k-mooney | so it's easy to do a 1:1 mapping between the stats and the workload | 18:27 |
spatel | hmm! interesting i think i have to look into collectd now | 18:27 |
spatel | also i was thinking push collectd data to influx + grafana so i have good dashboard | 18:28 |
sean-k-mooney | you can also have collectd use its network plugin to send the stats to influxdb and then use grafana to visualise the results | 18:28 |
sean-k-mooney | jinx | 18:28 |
sean-k-mooney | :) that is a good solution | 18:28 |
spatel | yup! | 18:28 |
sean-k-mooney | are you familiar with OPNFV? | 18:29 |
spatel | not much | 18:29 |
sean-k-mooney | that's ok, it's rather vast in scope like openstack | 18:29 |
sean-k-mooney | spatel: i just wanted to highlight the barometer project https://wiki.opnfv.org/display/fastpath/Barometer+Home | 18:30 |
spatel | reading.. | 18:31 |
sean-k-mooney | spatel: they have been working on integrating collectd, grafana and other tools for openstack monitoring for telco/NFV usecases | 18:32 |
sean-k-mooney | https://www.youtube.com/watch?v=-82UShFiBBM | 18:32 |
spatel | sean-k-mooney: very interesting... also i had question about SR-IOV VF nic stats, it doesn't tell you how much data flowing through your VF | 18:32 |
sean-k-mooney | spatel: that depends on the nic | 18:32 |
spatel | I have Qlogic so i doubt it has that feature | 18:33 |
sean-k-mooney | so most nics don't have enough hardware counters to gather stats on all the VFs from the host side | 18:33 |
sean-k-mooney | spatel: if you are able to run collectd in the guest however it can monitor the kernel stats and give you that info | 18:34 |
spatel | Yes! that is what i am doing currently! guest-based snmp monitoring | 18:34 |
sean-k-mooney | spatel: one other tool that you might also find useful in this space is skydive. http://skydive.network/ | 18:34 |
spatel | hmm! looking cool | 18:35 |
spatel | we are using observium and Cisco DCNM | 18:35 |
spatel | hmm! they have openstack support too | 18:36 |
sean-k-mooney | skydive was developed within the last 2 years specifically for cloud and containerised environments but it began to mature just after i deployed my last cloud so i have not used it myself | 18:37 |
*** tbachman has joined #openstack-nova | 18:37 | |
spatel | Can i use it for Cisco switches and routers? | 18:38 |
spatel | looks like they use an agent so i doubt it | 18:38 |
sean-k-mooney | i believe so. it uses multiple protocols to do its monitoring | 18:38 |
sean-k-mooney | if the cisco switch supports sflow i believe it would work | 18:40 |
spatel | We have all Cisco nexus switches and they do sflow :) i am going to explore it now | 18:40 |
spatel | Are you guys using collectd to monitor your compute nodes? | 18:41 |
mriedem_afk | awaugama: as far as i know if you set the cpu_allocation_ratio in nova.conf on the nova-compute service, it should mirror that to the new OR existing resource provider in placement | 18:41 |
*** mriedem_afk is now known as mriedem | 18:41 | |
mriedem | if it's not mirroring those updates i think that's a bug | 18:41 |
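For context on why a stale ratio matters: placement derives a provider's usable capacity for a resource class as (total - reserved) * allocation_ratio. The sketch below only illustrates that arithmetic; the function name, the integer truncation and the numbers are assumptions made for the example, not nova or placement code.

```python
def capacity(total, reserved, allocation_ratio):
    """Usable capacity derived from one inventory record.

    Assumption for this sketch: the result is truncated to an integer.
    """
    return int((total - reserved) * allocation_ratio)


# Hypothetical 16-core compute node, before and after lowering
# cpu_allocation_ratio from 16.0 to 4.0 in nova.conf.
vcpu_before = capacity(total=16, reserved=0, allocation_ratio=16.0)
vcpu_after = capacity(total=16, reserved=0, allocation_ratio=4.0)
```

If the new ratio never reaches the existing resource provider, the scheduler keeps packing against the old, larger capacity.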
mriedem | SteelyDan: on that bug, tbc, i think they got confused about "scheduling" | 18:42 |
awaugama | mriedem: that's what I thought would happen. sean-k-mooney you think it won't update existing ones? | 18:42 |
mriedem | it's not scheduling anything, they just power on an instance on a stopped compute and the rpc cast is sent to the void | 18:42 |
sean-k-mooney | awaugama: im not sure. | 18:43 |
mriedem | cfriesen: yes the bug is saying they power on the instance, the api changes the task_state, but b/c the compute is down, we never finish the task and it's "stuck" | 18:43 |
sean-k-mooney | awaugama: it will do one of two things: either it will always overwrite it with the config value or it will only use the config value on creation of the rp | 18:43 |
mriedem | fixing that would mean needing to check the compute service status for every instance action... | 18:43 |
sean-k-mooney | awaugama: both cases are "wrong" depending on who you ask, hence the spec for initial allocation ratios https://review.openstack.org/#/c/552105/ | 18:44 |
mriedem | on every periodic, we update the resource class inventory allocation ratio based on config that we send to placement https://github.com/openstack/nova/blob/5b815eec4c5fc8c19863aa38b1d1920705b17bfa/nova/compute/resource_tracker.py#L108 | 18:44 |
mriedem | https://github.com/openstack/nova/blob/5b815eec4c5fc8c19863aa38b1d1920705b17bfa/nova/compute/resource_tracker.py#L952 | 18:45 |
mriedem | the question is if prov_tree.update_inventory(nodename, inv_data) has a bug thinking that nothing changed | 18:45 |
mriedem | b/c it's a cache | 18:45 |
mriedem | and the reportclient itself has a cache of the provider tree | 18:45 |
mriedem | https://github.com/openstack/nova/blob/5b815eec4c5fc8c19863aa38b1d1920705b17bfa/nova/scheduler/client/report.py#L1501 | 18:46 |
mriedem | "The specified ProviderTree is compared against the local cache. Any changes are flushed back to the placement service. " | 18:46 |
awaugama | makes sense | 18:47 |
mriedem | the bug is probably here https://github.com/openstack/nova/blob/5b815eec4c5fc8c19863aa38b1d1920705b17bfa/nova/scheduler/client/report.py#L1575-L1576 | 18:48 |
mriedem | if we have a single compute node resource provider and that doesn't change, both of those sets will be empty | 18:48 |
cfriesen | mriedem: as I said in the bug, that won't actually fix it, just make the race window smaller. | 18:48 |
mriedem | and the for loops below won't flush any changes to placement | 18:48 |
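The failure mode being hypothesised here can be sketched in a few lines. This is an illustration of the idea only, not the report client's actual code: the function names and the dict layout are invented for the example. If the sync only diffs provider UUIDs, an inventory-only change on an existing provider produces two empty sets and nothing is flushed.

```python
def providers_to_sync(cached, new):
    """Diff two {uuid: inventory} mappings by provider UUID only."""
    cached_uuids = set(cached)
    new_uuids = set(new)
    to_create = new_uuids - cached_uuids
    to_delete = cached_uuids - new_uuids
    return to_create, to_delete


def providers_with_changed_inventory(cached, new):
    """Also compare inventory for providers present on both sides."""
    return {
        uuid for uuid in set(cached) & set(new)
        if cached[uuid] != new[uuid]
    }


# Hypothetical single compute node whose allocation ratio changed.
cached = {"cn1": {"VCPU": {"total": 16, "allocation_ratio": 16.0}}}
new = {"cn1": {"VCPU": {"total": 16, "allocation_ratio": 4.0}}}

to_create, to_delete = providers_to_sync(cached, new)
changed = providers_with_changed_inventory(cached, new)
```

With one unchanged provider UUID, both UUID sets come back empty, while the content comparison still flags the provider as changed.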
mriedem | cfriesen: sure | 18:48 |
mriedem | b/c of the service group api | 18:48 |
mriedem | unless you force down the service | 18:49 |
mriedem | awaugama: in queens we weren't using that provider tree stuff | 18:49 |
mriedem | https://github.com/openstack/nova/blob/stable/queens/nova/compute/resource_tracker.py#L883 | 18:49 |
mriedem | https://github.com/openstack/nova/blob/stable/queens/nova/scheduler/client/report.py#L1112 | 18:50 |
mriedem | https://github.com/openstack/nova/blob/stable/queens/nova/scheduler/client/report.py#L850 | 18:50 |
cdent | because it is friday and I don't feel like filtering, can I just say: god I hate caches, why do we do caches? | 18:50 |
mriedem | idk | 18:50 |
mriedem | to avoid calling the placement API 500 times per periodic? | 18:50 |
mriedem | i just know caches are very tricky | 18:51 |
mriedem | brittle | 18:51 |
sean-k-mooney | cdent: so the processor has another way to mess with your view of a sequentially consistent execution of your program | 18:51 |
* cdent gives both mriedem and sean-k-mooney an unpleasant kiss | 18:51 | |
awaugama | mriedem: so basically it's a stale value and placement is never refreshing to pick up the new conf setting? | 18:51 |
awaugama | or the rp isn't refreshing? | 18:52 |
sean-k-mooney | cdent: hehe since you are here can i get your input on https://review.openstack.org/#/c/610034/ | 18:52 |
mriedem | https://github.com/openstack/nova/blob/stable/queens/nova/compute/provider_tree.py#L124 | 18:52 |
sean-k-mooney | cdent: actually perhaps the placement channel would be better. | 18:52 |
mriedem | awaugama: i think the scheduler report client in master/rocky is thinking nothing is changing | 18:53 |
cdent | yeah, join me over there because I think that may be fixed | 18:53 |
mriedem | i just had doritos, you probably don't want to kiss me | 18:53 |
mriedem | at least not open mouth | 18:53 |
awaugama | this channel gets weird on Fridays | 18:53 |
mriedem | awaugama: i think on master/rocky the problem is we're not getting here https://github.com/openstack/nova/blob/5b815eec4c5fc8c19863aa38b1d1920705b17bfa/nova/scheduler/client/report.py#L1646 | 18:54 |
mriedem | awaugama: it should be pretty easy to recreate this | 18:54 |
awaugama | mriedem: I think finucannot was able to reproduce on his system | 18:55 |
awaugama | think he was just using devstack | 18:55 |
mriedem | that or https://github.com/openstack/nova/blob/5b815eec4c5fc8c19863aa38b1d1920705b17bfa/nova/scheduler/client/report.py#L1112 is short circuiting | 18:55 |
mriedem | at one point i had a debug patch for a bunch of this b/c it's really hard to know wtf is going on without any logs | 18:56 |
awaugama | mriedem, I had to redeploy my system for another feature test, I can see about reproducing next week | 18:56 |
mriedem | i'll see if i can dredge that up | 18:56 |
awaugama | with logs | 18:56 |
mriedem | awaugama: https://review.openstack.org/#/c/597560/ | 18:57 |
awaugama | cool, I'll make a note of that patch and see if I can get it applied | 18:59 |
mriedem | the stuff in here is probably still useful https://review.openstack.org/#/c/597560/6/nova/scheduler/client/report.py | 18:59 |
mriedem | the rest was for debugging a specific thing that is now fixed | 18:59 |
mriedem | maybe i'll restore and rev that to clean it up | 18:59 |
mriedem | on top of https://review.openstack.org/#/c/597553/ | 19:00 |
*** erlon has quit IRC | 19:07 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Log the operation when updating generation in ProviderTree https://review.openstack.org/597553 | 19:19 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add debug logs for when provider inventory changes https://review.openstack.org/597560 | 19:19 |
mriedem | melwitt: https://bugs.launchpad.net/nova/+bug/1798787 | 19:29 |
openstack | Launchpad bug 1798787 in OpenStack Compute (nova) "Installation help documentation is incorrect - verify & nova-consoleauth" [Medium,Triaged] | 19:29 |
mriedem | the install guide tells you to verify nova-consoleauth is running but we don't tell you to install/start it | 19:30 |
mriedem | b/c that part was removed in rocky | 19:30 |
melwitt | gah | 19:31 |
cdent | Is there a bug that is associated with provider tree cache problems discussed above? | 19:32 |
mriedem | cdent: not that i'm aware of | 19:32 |
mriedem | i think awaugama hit it in QE | 19:32 |
awaugama | yeah verifying vcpu weighter feature | 19:33 |
cdent | awaugama: are you making a bug? If so I want to be sure to follow along | 19:33 |
awaugama | cdent, I will next week. I need to reinstall my system (did another feature test in the meantime) so I'll need to recollect logs | 19:33 |
mriedem | is this a tripleo system that takes 3 days? | 19:34 |
SteelyDan | heh | 19:34 |
* SteelyDan thinks 3 days sounds optimistic | 19:34 | |
cdent | great, thanks awaugama | 19:34 |
awaugama | mriedem: I can probably get it repro'd by EOD Tuesday | 19:34 |
mriedem | btw, speaking of tripleo | 19:34 |
awaugama | but yeah | 19:34 |
mriedem | who from the red hat nova cabal can add nova-status upgrade check to tripleo? | 19:35 |
mriedem | owalsh: ^? | 19:35 |
SteelyDan | or mschuppert | 19:35 |
awaugama | yeah one of those two is the nova deployment guy | 19:36 |
awaugama | their specialty is based on the need | 19:36 |
openstackgerrit | Elod Illes proposed openstack/nova master: Transform scheduler.select_destinations notification https://review.openstack.org/508506 | 19:38 |
sean-k-mooney | awaugama: and the fact that everyone else avoids using tripleo to deploy our dev envs if we can | 19:38 |
awaugama | fair enough | 19:41 |
mriedem | SteelyDan: oooo guess what just rotated in https://www.youtube.com/watch?v=jJ9Xk-VoGqo | 19:41 |
SteelyDan | nice | 19:41 |
SteelyDan | I enjoyed when this rotated in for me this morning: https://www.youtube.com/watch?v=KCdKBHdPz30 | 19:42 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/rocky: Add regression test for bug 1797580 https://review.openstack.org/611938 | 19:43 |
openstack | bug 1797580 in OpenStack Compute (nova) "NoValidHost during live migration after cold migrating to a specified host" [High,In progress] https://launchpad.net/bugs/1797580 - Assigned to Matt Riedemann (mriedem) | 19:43 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/rocky: Don't persist RequestSpec.requested_destination https://review.openstack.org/611939 | 19:43 |
mriedem | i know i'd never do it without the fez on | 19:44 |
openstackgerrit | Dan Smith proposed openstack/nova master: Modify get_by_cell_and_project() to get_not_deleted_by_cell_and_projects() https://review.openstack.org/607663 | 19:44 |
openstackgerrit | Dan Smith proposed openstack/nova master: Minimal construct plumbing for nova list when a cell is down https://review.openstack.org/567785 | 19:44 |
openstackgerrit | Dan Smith proposed openstack/nova master: Refactor scatter-gather utility to return exception objects https://review.openstack.org/607934 | 19:44 |
openstackgerrit | Dan Smith proposed openstack/nova master: Return a minimal construct for nova show when a cell is down https://review.openstack.org/591658 | 19:44 |
openstackgerrit | Dan Smith proposed openstack/nova master: Return a minimal construct for nova service-list when a cell is down https://review.openstack.org/584829 | 19:44 |
SteelyDan | heck no | 19:44 |
melwitt | never heard of either of those. I only know the major steely dan hits | 19:44 |
mriedem | kid charlemagne is a major hit | 19:45 |
mriedem | all the major dudes know that | 19:45 |
SteelyDan | well, | 19:45 |
SteelyDan | she probably means like reelin' in the years and rikki | 19:45 |
melwitt | major dudes?? | 19:45 |
SteelyDan | and dirty work | 19:45 |
melwitt | yeah and like NO STATIC AT ALLLL | 19:45 |
mriedem | https://www.youtube.com/watch?v=kND8TRZap8Y | 19:45 |
SteelyDan | like floyd, the major hits are not very representative of the larger work | 19:45 |
melwitt | and hey nineteen | 19:45 |
melwitt | and deacon blues | 19:46 |
melwitt | lol, of course it was a reference to another song I don't know | 19:46 |
mriedem | when the sax solo from FM comes on in the car, i always yell at the girls to quiet down and crank it way up | 19:47 |
mriedem | obnoxiously so | 19:47 |
SteelyDan | heh | 19:47 |
melwitt | as one does | 19:47 |
sean-k-mooney | mriedem: oh i actually have heard the last link you posted before | 19:47 |
openstackgerrit | Merged openstack/nova master: Add regression test for bug 1797580 https://review.openstack.org/610088 | 19:55 |
openstack | bug 1797580 in OpenStack Compute (nova) rocky "NoValidHost during live migration after cold migrating to a specified host" [High,In progress] https://launchpad.net/bugs/1797580 - Assigned to Matt Riedemann (mriedem) | 19:55 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/queens: Add regression test for bug 1797580 https://review.openstack.org/611944 | 19:56 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/queens: Don't persist RequestSpec.requested_destination https://review.openstack.org/611945 | 19:56 |
mriedem | SteelyDan: remember this? https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L4310 | 19:57 |
melwitt | wah wah | 19:57 |
*** dave-mccowan has quit IRC | 19:57 | |
mriedem | it's pretty safe to assume that all cold/live migrations are migration-based allocations now right? | 19:57 |
mriedem | b/c that was added to compute in queens | 19:58 |
mriedem | i'm wondering if we can start rolling those compat shims back, including https://github.com/openstack/nova/blob/master/nova/conductor/tasks/migrate.py#L102 | 19:58 |
mriedem | and just assume compute is new enough to always send a migration record | 19:58 |
mriedem | and do that hot swap action | 19:58 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Use oslo_db.sqlalchemy.test_fixtures https://review.openstack.org/609352 | 19:59 |
mriedem | i think that's also safe because of https://github.com/openstack/nova/blob/master/nova/compute/rpcapi.py#L716 | 19:59 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (3) https://review.openstack.org/574104 | 19:59 |
mriedem | we're unconditionally sending the migration record | 19:59 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (4) https://review.openstack.org/574106 | 19:59 |
SteelyDan | mriedem: yeah | 19:59 |
SteelyDan | mriedem: I leave those TODOs for others | 19:59 |
mriedem | tl;dr i would like to rip that code out before landing gibi's https://review.openstack.org/#/c/606050/ which breaks resize to same host allocations if we don't have the source allocations on the migration record | 19:59 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (5) https://review.openstack.org/574110 | 20:00 |
mriedem | ok cool | 20:00 |
mriedem | the other thing with that was caching scheduler which is also queued up for death | 20:00 |
* SteelyDan mriedem: bee tee dubs, did you ever look at this? https://review.openstack.org/#/c/611665/ | 20:00 | |
melwitt | that would be a funny TODO. "TODO(dan): Other people remove this in Rocky" | 20:00 |
mriedem | SteelyDan: i have not yet | 20:00 |
SteelyDan | melwitt: next time. I promise. | 20:00 |
melwitt | awesome | 20:00 |
SteelyDan | whoa, errant /me in there, sorry | 20:01 |
*** awaugama is now known as awaugama_away | 20:02 | |
cfriesen | ephemeral disks are supposed to last the life of the instance, right? | 20:03 |
openstackgerrit | Merged openstack/nova master: Don't persist RequestSpec.requested_destination https://review.openstack.org/610098 | 20:03 |
sean-k-mooney | cfriesen: yes and no | 20:04 |
sean-k-mooney | cfriesen: i don't know if they are meant to be copied if we move the guest or not | 20:05 |
sean-k-mooney | cfriesen: they are definitely meant to exist for the life of an instance on a single host. | 20:05 |
sean-k-mooney | cfriesen: for resize it is totally undefined what happens. for live/cold migrate i think we copy them but i haven't checked; that is just my intuition | 20:06 |
sean-k-mooney | cfriesen: similarly for shelve/unshelve, not sure we persist them. it might be driver specific | 20:07 |
cfriesen | sean-k-mooney: hmm...I think this wording in the compute API has changed since I looked at it: "Ephemeral disks may be written over on server state changes. So should only be used as a scratch space for applications that are aware of its limitations." | 20:07 |
sean-k-mooney | cfriesen: ya the root disk will be preserved obviously but for additional ephemeral disks i think it's totally up to the driver on which state changes they are preserved and when they are recreated | 20:09 |
SteelyDan | sean-k-mooney: I don't think any of those things are right | 20:09 |
SteelyDan | I don't think it's well defined at all, but the original assumption was that you couldn't even rely on them across start/stop cycles on your instance | 20:09 |
cfriesen | on the other hand, nova/doc/source/user/flavors.rst says "Ephemeral disks offer machine local disk storage linked to the lifecycle of a | 20:10 |
cfriesen | VM instance. When a VM is terminated, all data on the ephemeral disk is lost." | 20:10 |
sean-k-mooney | SteelyDan: from an api perspective i would totally agree. | 20:10 |
sean-k-mooney | cfriesen: yes but terminated is not stopped | 20:11 |
cfriesen | okay...I had been thinking that they were supposed to be preserved over everything except termination. | 20:11 |
cfriesen | but it sounds like that was incorrect | 20:11 |
sean-k-mooney | SteelyDan: i think the libvirt driver preserves the ephemeral disk in more cases than it's required to | 20:11 |
SteelyDan | I'm quite sure they're not included in any snapshot, so not for shelve | 20:11 |
SteelyDan | sean-k-mooney: yes | 20:11 |
melwitt | you mean ephemeral in the flavor, not ephemeral like a normal local disk of any instance without ephemeral in the flavor | 20:11 |
cfriesen | melwitt: yes | 20:12 |
melwitt | gotcha | 20:12 |
*** cdent has quit IRC | 20:12 | |
*** spatel has quit IRC | 20:12 | |
cfriesen | okay. that simplifies my life, though not the end user's. :) | 20:12 |
sean-k-mooney | cfriesen: so as far as i can tell this is what determines what disks are migrated https://github.com/openstack/nova/blob/e2a39bb30f716c78af30d61efb3fb7526f9bdf6d/nova/virt/libvirt/driver.py#L7208-L7247 | 20:32 |
cfriesen | sean-k-mooney: for live migration specifically | 20:33 |
sean-k-mooney | cfriesen: so for libvirt i think that will include the ephemeral disk on live migration | 20:33 |
sean-k-mooney | yes | 20:34 |
sean-k-mooney | for rebuild/resize/cold migration i think we dont copy them | 20:34 |
cfriesen | the case I was looking into was a resize, followed by a resize-revert. | 20:34 |
sean-k-mooney | i can take a look. it makes sense that we copy them on live migrate as that does not affect the lifetime of the instance as long as it succeeds | 20:35 |
*** owalsh has quit IRC | 20:36 | |
*** dklyle has joined #openstack-nova | 20:37 | |
sean-k-mooney | cfriesen: so for resize we copy the image here https://github.com/openstack/nova/blob/e2a39bb30f716c78af30d61efb3fb7526f9bdf6d/nova/virt/libvirt/driver.py#L8309-L8343 but i need to check if the ephemeral disks are in self._get_instance_disk_info() | 20:38 |
cfriesen | sean-k-mooney: don't waste time on my account, if the compute API says it can change, that's good enough for me. | 20:40 |
sean-k-mooney | cfriesen: it looks like its looping over all the disks too https://github.com/openstack/nova/blob/e2a39bb30f716c78af30d61efb3fb7526f9bdf6d/nova/virt/libvirt/driver.py#L7986 | 20:40 |
sean-k-mooney | the code is very similar to the live migration code; they should probably be refactored together, but it is also different enough that i'm not sure it does exactly the same thing | 20:42 |
sean-k-mooney | cfriesen: ya from an api perspective it can. libvirt appears to keep them for cold and live migrate. i would assume they are not kept for rebuild, shelve and evacuate. i also think libvirt may be expanding the existing ephemeral disk in resize | 20:46 |
sean-k-mooney | cfriesen: the libvirt driver is specifically checking that we are not resizing the ephemeral disk down https://github.com/openstack/nova/blob/e2a39bb30f716c78af30d61efb3fb7526f9bdf6d/nova/virt/libvirt/driver.py#L8268-L8275 implying that it's not recreating them but may be expanding them on resize | 20:47 |
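A rough sketch of the kind of guard described in that last message, for readers who do not want to open the link. Everything here is an assumption made for illustration: the exception class, the function name, the booted_from_volume exemption and the dict-based flavors are hypothetical stand-ins, not the driver's real objects.

```python
class ResizeError(Exception):
    """Hypothetical stand-in for the driver's resize failure."""


def check_no_disk_resize_down(old_flavor, new_flavor, booted_from_volume=False):
    """Refuse a resize that would shrink the root or ephemeral disk.

    Assumption for this sketch: a volume-backed root disk is exempt from
    the root check because the flavor's root_gb does not size it.
    """
    root_down = new_flavor["root_gb"] < old_flavor["root_gb"]
    ephemeral_down = new_flavor["ephemeral_gb"] < old_flavor["ephemeral_gb"]
    if (root_down and not booted_from_volume) or ephemeral_down:
        raise ResizeError("Unable to resize disk down.")


small = {"root_gb": 20, "ephemeral_gb": 10}
large = {"root_gb": 40, "ephemeral_gb": 20}

check_no_disk_resize_down(small, large)  # growing is allowed, no exception
```

A check like this only makes sense if the existing ephemeral disk is carried over and possibly grown, which is the inference drawn in the message above.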
*** dklyle has quit IRC | 20:50 | |
*** munimeha1 has quit IRC | 20:51 | |
*** beekneemech has quit IRC | 20:53 | |
*** boden has joined #openstack-nova | 20:58 | |
boden | anyone heard about a "oslo_db.exception.DBNonExistentTable" cropping up across projects? Appears to have cropped up around 10/10 http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22oslo_db.exception.DBNonExistentTable%5C%22 and includes neutron, nova and others | 20:58 |
*** owalsh has joined #openstack-nova | 21:10 | |
mriedem | boden: those are unit/functional tests, | 21:15 |
mriedem | and 10 days is as far back as logstash goes | 21:15 |
mriedem | so 10/10 is probably not really when it started | 21:15 |
mriedem | that's just how far back we have logs | 21:16 |
mriedem | it's showing patches in pike | 21:16 |
mriedem | and also shows up in successful job runs, so most likely unrelated to anything that's failing | 21:16 |
boden | mriedem perhaps that's the case for nova (I haven't dug there), but it doesnt appear to be the case for all others; there are valid failures | 21:17 |
sean-k-mooney | mriedem: boden: is this related to why we have to loop for nova-manage, out of interest? | 21:21 |
sean-k-mooney | e.g. why we did https://review.openstack.org/#/c/608091/ | 21:21 |
sean-k-mooney | actually, looking at the logs, no, this is unrelated | 21:24 |
boden | yeah my bad... I assumed nova was failing without digging... we have been getting this error on some other projects randomly; appears to be memory/resource related | 21:35 |
*** mriedem has quit IRC | 21:40 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Drop legacy cold migrate allocation compat code https://review.openstack.org/611970 | 21:50 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Drop legacy cold migrate allocation compat code https://review.openstack.org/611970 | 21:52 |
*** boden has quit IRC | 21:52 | |
*** sean-k-mooney has quit IRC | 22:03 | |
*** dklyle has joined #openstack-nova | 22:03 | |
*** priteau has quit IRC | 22:12 | |
openstackgerrit | melanie witt proposed openstack/nova master: libvirt: set device address tag only if setting disk unit https://review.openstack.org/611974 | 22:18 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Drop legacy live migrate allocation compat code https://review.openstack.org/611975 | 22:22 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Drop legacy live migrate allocation compat code https://review.openstack.org/611975 | 22:23 |
*** tbachman has quit IRC | 23:07 | |
*** bnemec has joined #openstack-nova | 23:25 | |
*** medberry has joined #openstack-nova | 23:26 | |
*** tbachman has joined #openstack-nova | 23:44 | |
*** gyee has quit IRC | 23:50 | |
*** dklyle has quit IRC | 23:52 |