*** liverpooler has quit IRC | 00:18 | |
*** sdague has quit IRC | 00:18 | |
mriedem | nailed it http://logs.openstack.org/67/529867/1/check/tempest-full/23d2919/controller/logs/screen-n-sch.txt.gz#_Dec_22_18_14_13_693048 | 00:23 |
mriedem | jaypipes: could you have guessed the stats / num_instances / host_state.instances stuff could have all gotten screwy? | 00:23 |
jaypipes | mriedem: not surprised. | 00:26 |
mriedem | heh "Reported number of instances (0) does not match the tracked number of instances (3)." | 00:31 |
mriedem | not even close | 00:31 |
*** awestin1 has joined #openstack-nova | 00:31 | |
*** jobewan has quit IRC | 00:32 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/pike: doc: Add configuration index page https://review.openstack.org/531042 | 00:38 |
*** claudiub has quit IRC | 00:41 | |
*** tetsuro_ has joined #openstack-nova | 00:48 | |
openstackgerrit | Takashi NATSUME proposed openstack/python-novaclient master: Microversion 2.59 - List/Show all server migration types https://review.openstack.org/430839 | 00:54 |
*** ihrachys has joined #openstack-nova | 00:54 | |
mriedem | will need someone more familiar with ironic to triage this https://bugs.launchpad.net/nova/+bug/1730834 | 01:02 |
openstack | Launchpad bug 1730834 in OpenStack Compute (nova) "Ironic compute node doesn't take over nodes with instance when the owner compute node is down" [Undecided,New] | 01:02 |
mriedem | i don't know what "take over" means here | 01:02 |
jroll | mriedem: that's expected behavior | 01:03 |
jroll | 'take over' meaning have another compute service manage the instance, because the hash ring thing | 01:03 |
jroll | we should make that better so it isn't expected behavior, but low priority I guess | 01:04 |
mriedem | oh | 01:04 |
mriedem | well then | 01:04 |
* jroll triages | 01:05 | |
jroll | oh, I can't set importance ¯\_(ツ)_/¯ | 01:07 |
mriedem | join the bug team | 01:15 |
mriedem | should be able to then | 01:15 |
*** Swami has quit IRC | 01:17 | |
jroll | idk, you might make me do stuff | 01:17 |
*** purplerbot has quit IRC | 01:18 | |
*** bjhuangr has joined #openstack-nova | 01:25 | |
bjhuangr | mriedem, hi, do you have a chance to review https://review.openstack.org/#/c/523387/ ? Thanks in advance . | 01:26 |
openstackgerrit | Tetsuro Nakamura proposed openstack/nova master: [libvirt] Add _get_XXXpin_cpuset() https://review.openstack.org/527631 | 01:33 |
openstackgerrit | Tetsuro Nakamura proposed openstack/nova master: Add NumaTopology support for libvirt/qemu driver https://review.openstack.org/530451 | 01:33 |
openstackgerrit | Tetsuro Nakamura proposed openstack/nova master: disable cpu pinning with libvirt/qemu driver https://review.openstack.org/531049 | 01:33 |
*** Dinesh_Bhor has joined #openstack-nova | 01:45 | |
*** smatzek has joined #openstack-nova | 01:53 | |
*** threestrands has joined #openstack-nova | 01:55 | |
*** smatzek has quit IRC | 01:57 | |
*** kaisers has quit IRC | 01:58 | |
*** liuyulong has joined #openstack-nova | 02:02 | |
*** tetsuro_ has quit IRC | 02:11 | |
*** tetsuro_ has joined #openstack-nova | 02:12 | |
*** smcginnis has quit IRC | 02:13 | |
*** zhurong has joined #openstack-nova | 02:13 | |
mriedem | kashyap: could use some help with this if you get a chance https://review.openstack.org/#/c/267587/75/nova/virt/libvirt/guest.py - trying to get multiattach working which used to be ok before qemu 2.10 but now we hit issues with a write lock when attaching the volume to the 2nd guest, and i thought we could pass the force flag to the attach device call to libvirt but i got this error: | 02:20 |
mriedem | libvirtError: unsupported flags (0x4) in function qemuDomainAttachDeviceLiveAndConfig | 02:21 |
Kevin_Zheng | seems Nova is broken after add uuid to BDM | 02:25 |
Kevin_Zheng | due to https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L497 | 02:25 |
openstackgerrit | Lance Bragstad proposed openstack/nova master: Simplify logic in get_enforcer https://review.openstack.org/531008 | 02:25 |
Kevin_Zheng | the field could not be added to cell1 db | 02:25 |
mriedem | Kevin_Zheng: that nova-manage code is really old | 02:27 |
mriedem | if the bdm uuid change broke that, we should have seen it in CI | 02:27 |
mriedem | since devstack runs this | 02:27 |
mriedem | the cell1 sync happens here https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L513 | 02:27 |
mriedem | after running the schema migrations for cell0 | 02:27 |
Kevin_Zheng | Hmm, but my db didn't get updated | 02:28 |
mriedem | is your nova.conf correct? | 02:28 |
Kevin_Zheng | I will check, but I have been using this for some time, it should be correct. | 02:30 |
*** ljjjustin has joined #openstack-nova | 02:31 | |
*** Dinesh_Bhor has quit IRC | 02:32 | |
Kevin_Zheng | Hmm... in config file for api service, the database connection should be cell0 db, correct? | 02:32 |
*** Dinesh_Bhor has joined #openstack-nova | 02:34 | |
mriedem | that's what we have in http://logs.openstack.org/58/526258/3/check/tempest-full/d133d1f/controller/logs/etc/nova/ | 02:35 |
mriedem | but devstack will sync using the cell1 conf too | 02:36 |
mriedem | which has the cell1 database in the [database] section | 02:36 |
mriedem | https://github.com/openstack-dev/devstack/blob/master/lib/nova#L707 | 02:36 |
mriedem | this runs it for cell0 https://github.com/openstack-dev/devstack/blob/master/lib/nova#L711 | 02:37 |
Kevin_Zheng | ah, I see | 02:37 |
mriedem | this was never implemented to hit all cells https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L479 | 02:37 |
mriedem | i had a patch for it but it must be abandoned | 02:37 |
mriedem | https://review.openstack.org/#/c/420973/ | 02:38 |
Kevin_Zheng | yeah, | 02:39 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix up formatting for deprecate-api-extensions-policies release note https://review.openstack.org/531061 | 02:39 |
Kevin_Zheng | Thanks, problem solved | 02:39 |
Kevin_Zheng | did we mention this in any docs? | 02:39 |
mriedem | it should be in the install guide | 02:41 |
mriedem | oh, well, https://docs.openstack.org/nova/latest/install/controller-install-ubuntu.html#install-and-configure-components | 02:42 |
melwitt | mriedem: re: that multiattach thing, did you see this bug? https://bugzilla.redhat.com/show_bug.cgi?id=1378242 based on that it looks like there needs to be share-rw=on property set under <shareable/> in order for it to allow the concurrent access | 02:42 |
openstack | bugzilla.redhat.com bug 1378242 in libvirt "QEMU image file locking (libvirt)" [Unspecified,On_qa] - Assigned to pkrempa | 02:42 |
mriedem | Kevin_Zheng: that install guide works because it's configuring nova.conf to set the [database] to the cell1 db | 02:42 |
mriedem | Kevin_Zheng: that install guide was written before the superconductor mode stuff that dansmith did in devstack | 02:42 |
mriedem | where the controllers are pointed at cell0 | 02:43 |
mriedem | Kevin_Zheng: we also have https://docs.openstack.org/nova/latest/user/cells.html#setup-of-cells-v2 | 02:43 |
mriedem | it's not very clear, but there is a note in there too | 02:44 |
mriedem | "At this point, the API database can now find the cell database, and further commands will attempt to look inside. If this is a completely fresh database (such as if you’re adding a cell, or if this is a new deployment), then you will need to run nova-manage db sync on it to initialize the schema." | 02:44 |
mriedem | Kevin_Zheng: might be a good FAQs entry https://docs.openstack.org/nova/latest/user/cells.html#faqs | 02:44 |
mriedem | melwitt: nope never seen that | 02:44 |
Kevin_Zheng | yeah, | 02:44 |
Kevin_Zheng | I will add it later | 02:45 |
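The fix discussed above comes down to running the schema sync once per database, each time with a config file whose [database] section points at that database. A minimal sketch of that loop follows. It only builds the command lines and does not run them. The two config paths are assumptions modelled on the devstack layout mentioned in the chat, and `db_sync_commands` is a hypothetical helper, not part of nova.

```python
from typing import List


def db_sync_commands(config_files: List[str]) -> List[List[str]]:
    """Build one `nova-manage db sync` command per config file.

    Each config file is assumed to carry a [database] section that
    points at a different database (for example cell0 or cell1).
    """
    return [
        ["nova-manage", "--config-file", path, "db", "sync"]
        for path in config_files
    ]


# Hypothetical paths: one config pointed at cell0, one at cell1.
commands = db_sync_commands(
    ["/etc/nova/nova.conf", "/etc/nova/nova_cell1.conf"]
)
for cmd in commands:
    print(" ".join(cmd))
```

Building the commands as lists keeps them ready to hand to `subprocess.run` without shell quoting issues.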
mriedem | melwitt: not mentioned in the domain xml docs at all https://libvirt.org/formatdomain.html | 02:45 |
mriedem | :( | 02:45 |
mriedem | and by the looks of when this was 'fixed' i'm guessing we'd need super modern versions of qemu to use this | 02:47 |
*** namnh has joined #openstack-nova | 02:47 | |
melwitt | yeah, I was just thinking the same | 02:47 |
mriedem | well, so much for multiattach in queens | 02:48 |
melwitt | this all looks fairly recent, which would explain the absence in the docs | 02:48 |
mriedem | wah wah | 02:48 |
mriedem | yeah | 02:48 |
mriedem | commit 860a3c4bea1d24773d8a495f213d5de3ac48a462 Author: Peter Krempa <pkrempa@redhat.com> Date: Wed Nov 15 15:02:58 2017 +0100 | 02:48 |
mriedem | we could have conditional logic around the version of qemu being used, but that sucks, | 02:50 |
mriedem | and i don't know if like, qemu < 2.10 works, and then is broken until qemu >= x | 02:50 |
mriedem | so you'd have a middle ground where things just don't work | 02:50 |
melwitt | what do you mean, like if this was a regression there could be a window in the middle where things don't work? | 02:53 |
mriedem | yeah | 02:53 |
mriedem | we didn't have this problem around ~newton when i wrote the original tempest test for multiattach | 02:53 |
mriedem | you could do the 2 attachments fine, it's just that nova didn't orchestrate the detach properly | 02:53 |
melwitt | oh :( | 02:53 |
mriedem | now we're in a case with newer qemu where the 2nd attach fails | 02:53 |
openstackgerrit | Chen Hanxiao proposed openstack/nova master: log test: use fixtures.StandardLogging in setUp https://review.openstack.org/531065 | 02:54 |
mriedem | so apparently fixed in libvirt-3.9.0-3.el7 | 02:54 |
mriedem | we are testing against 3.6.0 | 02:54 |
mriedem | https://bugzilla.redhat.com/show_bug.cgi?id=1378242#c13 | 02:55 |
openstack | bugzilla.redhat.com bug 1378242 in libvirt "QEMU image file locking (libvirt)" [Unspecified,On_qa] - Assigned to pkrempa | 02:55 |
melwitt | and this is the qemu bug that the libvirt bug was cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1378241 | 02:55 |
openstack | bugzilla.redhat.com bug 1378241 in qemu-kvm-rhev "QEMU image file locking" [Unspecified,Verified] - Assigned to famz | 02:55 |
mriedem | same thing we're hitting | 02:55 |
melwitt | that says qemu-kvm-rhev though. | 02:56 |
mriedem | yeah so that caused the bug we had with 2.10 where we were hitting the same issue with qemu-img info | 02:57 |
mriedem | and had to start using the force flag for that | 02:57 |
mriedem | https://bugs.launchpad.net/nova/+bug/1718295 | 02:57 |
openstack | Launchpad bug 1718295 in OpenStack Compute (nova) "Live migration fails with qemu-img >= 2.10: "Failed to get shared "write" lock\nIs another process using the image?"" [High,Fix released] - Assigned to Sean Dague (sdague) | 02:57 |
mriedem | so now our support matrix is likely something like, | 02:57 |
mriedem | 1. do it the old way if qemu<2.10, else | 02:57 |
mriedem | 2. do it the new way if qemu>=2.10 AND libvirt>=3.9.0 | 02:58 |
mriedem | else | 02:58 |
mriedem | 3. shart your pants | 02:58 |
mriedem | and only 3 is testable in our current CI env :( | 02:58 |
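The three-way support matrix above can be sketched as a plain version comparison. This is an illustration only, not the logic in any nova patch. The function name and the mode labels are hypothetical, and the libvirt threshold is the 3.9.0 figure quoted in the list, which was itself still in question at this point in the conversation, so it is left as a parameter.

```python
from typing import Tuple

Version = Tuple[int, int, int]

# Thresholds quoted in the chat; treat both as assumptions.
QEMU_IMAGE_LOCKING: Version = (2, 10, 0)
LIBVIRT_LOCKING_FIX: Version = (3, 9, 0)


def multiattach_mode(
    qemu: Version,
    libvirt: Version,
    libvirt_min: Version = LIBVIRT_LOCKING_FIX,
) -> str:
    """Classify a host into one of the three cases from the list."""
    if qemu < QEMU_IMAGE_LOCKING:
        # Case 1: older qemu without image locking, the old way works.
        return "legacy"
    if libvirt >= libvirt_min:
        # Case 2: qemu locks images and libvirt carries the fix.
        return "new"
    # Case 3: qemu locks images but libvirt is too old to cope.
    return "unsupported"


# The CI environment described in the chat: newer qemu, libvirt 3.6.0.
print(multiattach_mode((2, 10, 0), (3, 6, 0)))
```

Tuples compare element by element in Python, so no version-parsing library is needed for this kind of check.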
melwitt | sigh, so this has some history | 02:58 |
clarkb | didnt nova implement its own lock wrapper for that? | 03:01 |
mriedem | for what? | 03:01 |
openstackgerrit | Eli Qiao proposed openstack/nova master: Api-guide: Add Block Device Mapping https://review.openstack.org/522084 | 03:01 |
*** yikun has joined #openstack-nova | 03:03 | |
clarkb | mriedem: for the lack of working locks | 03:03 |
clarkb | there was a while wrapper script thing iirc | 03:03 |
bjhuangr | mriedem: hi Matt, do you have a chance to review https://review.openstack.org/#/c/523387/ ? Thanks in advance | 03:04 |
mriedem | bjhuangr: not right now sorry | 03:05 |
mriedem | clarkb: sounds like something different | 03:05 |
mriedem | melwitt: looking at https://github.com/libvirt/libvirt/commit/28907b0043fbf71085a798372ab9c816ba043b93 it actually looks like that went into libvirt 3.10 | 03:06 |
mriedem | and it's unclear to me if we'd actually have to put something in the disk config xml or if libvirt just handles that for us | 03:06 |
*** Dinesh_Bhor has quit IRC | 03:07 | |
melwitt | oh, hm. I wasn't sure about that either | 03:09 |
melwitt | looking at this, it looks like libvirt handles it. but not sure | 03:10 |
mriedem | yeah that's what i'm wondering | 03:10 |
mriedem | based on https://bugzilla.redhat.com/show_bug.cgi?id=1378242#c14 | 03:10 |
openstack | bugzilla.redhat.com bug 1378242 in libvirt "QEMU image file locking (libvirt)" [Unspecified,On_qa] - Assigned to pkrempa | 03:10 |
mriedem | i think peter is saying, you need qemu 2.10 with the new lock stuff and the fixes mentioned in comment 10 to libvirt | 03:11 |
mriedem | so if qemu>=2.10, libvirt must be >=3.10 (i think) | 03:11 |
mriedem | says it was fixed in libvirt-3.9.0-3.el7 but maybe that's an rpm package version with the patch backported? | 03:11 |
mriedem | because https://github.com/libvirt/libvirt/commit/28907b0043fbf71085a798372ab9c816ba043b93 says 3.10 | 03:12 |
melwitt | that's how I interpret it too | 03:12 |
mriedem | suck | 03:12 |
mriedem | i wonder if i could swindle jamespage and the gang to backport that to libvirt 3.6.0 :) | 03:12 |
melwitt | yeah, I'm honestly surprised this is all so recent | 03:13 |
melwitt | with all of the multiattach talk that's been happening for years, I never thought there could be a blocker at the qemu and libvirt level | 03:13 |
mriedem | sweet revenge | 03:14 |
melwitt | and the "fixed-in-version 3.9" with the commit in 3.10 doesn't make sense to me either | 03:15 |
mriedem | did you notice the guy complaining in comment 14 is using oracle RAC? | 03:15 |
mriedem | which is like the #1 use case for multiattach and why oracle wants it in nova so badly | 03:15 |
mriedem | comment 13 i should say | 03:15 |
mriedem | stvnoyes: fyi as this is all going to impact multiattach support in nova in queens ^ | 03:15 |
melwitt | what a mess | 03:16 |
mriedem | stvnoyes: if you're looking for something to do, we need some testing for the multiattach patches using qemu 2.10 and libvirt 3.1.0 | 03:16 |
mriedem | *3.10 | 03:16 |
mriedem | oh ha | 03:19 |
mriedem | lyarwood: https://bugzilla.redhat.com/show_bug.cgi?id=1415250 | 03:19 |
openstack | bugzilla.redhat.com bug 1415250 in openstack-nova "QEMU image file locking (RHOS)" [High,Post] - Assigned to lyarwood | 03:19 |
mriedem | melwitt: ^ | 03:19 |
melwitt | oh hello friends | 03:20 |
mriedem | left a comment because i like to comment in bz's | 03:21 |
mriedem | feels like i'm part of the team | 03:21 |
mriedem | ooo cc'ed to markmc@redhat.com | 03:21 |
mriedem | and eglynn@redhat.com, dansmith@redhat.com, | 03:22 |
mriedem | heh | 03:22 |
melwitt | :) | 03:23 |
*** ejat has quit IRC | 03:40 | |
*** stvnoyes has quit IRC | 03:41 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: [libvirt] Allow multiple volume attachments https://review.openstack.org/267587 | 03:43 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: [api] Allow multi-attach in compute api https://review.openstack.org/271047 | 03:43 |
*** thingee has joined #openstack-nova | 03:43 | |
*** thingee has left #openstack-nova | 03:43 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: [libvirt] Allow multiple volume attachments https://review.openstack.org/267587 | 03:44 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: [api] Allow multi-attach in compute api https://review.openstack.org/271047 | 03:44 |
*** abhishekk has joined #openstack-nova | 03:47 | |
*** mriedem has quit IRC | 03:48 | |
*** lbragstad has quit IRC | 04:11 | |
*** s1061123 has quit IRC | 04:28 | |
*** gyee has quit IRC | 04:29 | |
*** armax has quit IRC | 04:32 | |
*** armax has joined #openstack-nova | 04:33 | |
*** armax has quit IRC | 04:33 | |
*** armax has joined #openstack-nova | 04:33 | |
*** armax has quit IRC | 04:34 | |
*** armax has joined #openstack-nova | 04:34 | |
*** armax has quit IRC | 04:34 | |
*** armax has joined #openstack-nova | 04:35 | |
*** armax has quit IRC | 04:35 | |
*** udesale has joined #openstack-nova | 04:37 | |
*** takashin has quit IRC | 04:38 | |
*** takashin has joined #openstack-nova | 04:45 | |
*** nicolasbock has quit IRC | 04:53 | |
*** takashin has quit IRC | 04:53 | |
*** s1061123 has joined #openstack-nova | 05:00 | |
*** takashin has joined #openstack-nova | 05:00 | |
*** takashin has quit IRC | 05:12 | |
*** takashin has joined #openstack-nova | 05:17 | |
takashin | 05:39 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Adds view builders for keypairs controller https://review.openstack.org/347289 | 05:39 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Adds view builders for keypairs controller https://review.openstack.org/347289 | 05:41 |
*** janki has joined #openstack-nova | 05:43 | |
*** markmcclain has quit IRC | 05:51 | |
*** markmcclain has joined #openstack-nova | 05:54 | |
*** takashin has quit IRC | 06:21 | |
*** hongbin has joined #openstack-nova | 06:22 | |
*** hongbin has quit IRC | 06:22 | |
*** 07EAAP3JW has joined #openstack-nova | 06:42 | |
*** 07EAAP3JW has quit IRC | 06:43 | |
*** hongbin has joined #openstack-nova | 06:43 | |
*** zhurong has quit IRC | 06:47 | |
*** zhurong has joined #openstack-nova | 06:52 | |
*** ratailor has joined #openstack-nova | 06:54 | |
*** abhishekk has quit IRC | 06:57 | |
*** jaosorior has quit IRC | 07:01 | |
*** takashin has joined #openstack-nova | 07:01 | |
openstackgerrit | Zhenyu Zheng proposed openstack/nova master: Use neutron port_list when filtering instance by ip https://review.openstack.org/525505 | 07:03 |
*** Brin has joined #openstack-nova | 07:05 | |
*** jaosorior has joined #openstack-nova | 07:11 | |
*** jaosorior has quit IRC | 07:13 | |
*** jaosorior has joined #openstack-nova | 07:14 | |
*** jaosorior has quit IRC | 07:14 | |
*** jaosorior has joined #openstack-nova | 07:14 | |
*** abhishekk has joined #openstack-nova | 07:15 | |
*** takashin has quit IRC | 07:18 | |
*** takashin has joined #openstack-nova | 07:19 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/nova master: Imported Translations from Zanata https://review.openstack.org/524795 | 07:20 |
*** sbezverk has quit IRC | 07:21 | |
*** threestrands has quit IRC | 07:25 | |
*** takashin has quit IRC | 07:25 | |
*** takashin has joined #openstack-nova | 07:25 | |
*** annp has joined #openstack-nova | 07:27 | |
*** pcaruana has joined #openstack-nova | 07:32 | |
*** elod has joined #openstack-nova | 07:32 | |
*** bjhuangr has quit IRC | 07:34 | |
takashin | git log -1 | 07:42 |
*** jaypipes has quit IRC | 07:53 | |
*** markvoelker has quit IRC | 07:55 | |
*** takashin has left #openstack-nova | 08:00 | |
*** ljjjustin has quit IRC | 08:02 | |
*** rcernin has quit IRC | 08:04 | |
openstackgerrit | Yikun Jiang (Kero) proposed openstack/nova master: Add cross cell sort support for get_migrations https://review.openstack.org/517273 | 08:14 |
openstackgerrit | Yikun Jiang (Kero) proposed openstack/nova master: Add pagination and Changes-since filter support for os-migrations. https://review.openstack.org/330406 | 08:14 |
hrw | morning | 08:25 |
*** liusheng has quit IRC | 08:30 | |
*** MikeG451 has quit IRC | 08:47 | |
*** dtantsur|afk is now known as dtantsur | 08:49 | |
*** lucas-afk is now known as lucasagomes | 08:49 | |
*** cdent has joined #openstack-nova | 08:50 | |
*** hongbin has quit IRC | 09:03 | |
*** tetsuro_ has quit IRC | 09:09 | |
*** tetsuro_ has joined #openstack-nova | 09:09 | |
*** tetsuro_ has left #openstack-nova | 09:09 | |
*** derekh has joined #openstack-nova | 09:24 | |
*** fragatina has quit IRC | 09:49 | |
openstackgerrit | Yikun Jiang (Kero) proposed openstack/nova master: [WIP] Fix 500 in test_resize_server_negative_invalid_state https://review.openstack.org/531117 | 09:51 |
openstackgerrit | Ildiko Vancsa proposed openstack/nova master: Use volume shared_targets to lock during attach/detach https://review.openstack.org/529695 | 09:55 |
openstackgerrit | Ildiko Vancsa proposed openstack/nova master: [libvirt] Allow multiple volume attachments https://review.openstack.org/267587 | 09:55 |
openstackgerrit | Ildiko Vancsa proposed openstack/nova master: [api] Allow multi-attach in compute api https://review.openstack.org/271047 | 09:55 |
*** markvoelker has joined #openstack-nova | 09:56 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: zuul: Move legacy jobs to project https://review.openstack.org/514309 | 09:59 |
openstackgerrit | Yikun Jiang (Kero) proposed openstack/nova master: [WIP] Fix 500 in test_resize_server_negative_invalid_state https://review.openstack.org/531117 | 10:00 |
*** Brin has quit IRC | 10:01 | |
openstackgerrit | Yikun Jiang (Kero) proposed openstack/nova master: Add index(instance_uuid, updated_at) on instance_actions table https://review.openstack.org/530429 | 10:01 |
openstackgerrit | Ildiko Vancsa proposed openstack/nova master: [api] Allow multi-attach in compute api https://review.openstack.org/271047 | 10:06 |
*** annp has quit IRC | 10:12 | |
*** toabctl has quit IRC | 10:16 | |
mdbooth | Is today finally going to be the day when https://review.openstack.org/#/c/529037/ doesn't fail spuriously in the gate? | 10:21 |
mdbooth | This is the fourth attempt, though, each attempt taking a full working day. | 10:21 |
mdbooth | It's not looking good for the plucky patch. | 10:21 |
*** toabctl has joined #openstack-nova | 10:23 | |
*** namnh has quit IRC | 10:27 | |
*** markvoelker has quit IRC | 10:30 | |
kashyap | ildikov: Hi, do you have some CI Gate log of this -- https://bugzilla.redhat.com/show_bug.cgi?id=1415250#c13 | 10:32 |
openstack | bugzilla.redhat.com bug 1415250 in openstack-nova "QEMU image file locking (RHOS)" [High,Post] - Assigned to lyarwood | 10:32 |
kashyap | ildikov: So I could fish through the debug logs to find relevant bits | 10:32 |
kashyap | ildikov: Actually, disregard me, it's here: https://review.openstack.org/#/c/267587/ | 10:33 |
ildikov | kashyap: I got the logs from Matt with the following comment yesterday: "<mriedem> i can't link to these n-cpu logs because of how infra has changed where the logs live now, but that's the issue" | 10:34 |
*** abhishekk has quit IRC | 10:34 | |
kashyap | Ah, I didn't even realize openstack-infra has changed the log location | 10:34 |
ildikov | kashyap: no worries, Matt added a few extra bits to that patch yesterday | 10:35 |
*** ArchiFleKs has joined #openstack-nova | 10:35 | |
kashyap | Yeah, just reading the review through | 10:35 |
ildikov | we haven't tested multi-attach for a while and this got us as a surprise | 10:35 |
ildikov | I only fixed other easy bits in the chain hoping to get a few small dependencies out of the way | 10:36 |
openstackgerrit | Marcin Juszkiewicz proposed openstack/nova master: libvirt: use 'host-passthrough' as default on AArch64 https://review.openstack.org/530965 | 10:36 |
hrw | stephenfin: reported bug, added Closes-bug into commit message. | 10:36 |
hrw | stephenfin: have to check at tests. Will take a while as I do not know nova code | 10:37 |
ildikov | kashyap: thanks for looking into it | 10:37 |
kashyap | ildikov: No worries; Matt already did the sleuthing last night (I was away early yesterday). W.r.t versioning of the 'shareable' lock | 10:38 |
ArchiFleKs | Hi, this may sound like a trivial question but do you have more information about how nova and neutron dns work together? I'm using neutron as a dns server, which works fine, I can have instances with .openstacklocal domain or a custom domain, but on the instance itself, the hostname always gets set to .novalocal anyway, is that a behavior that will change ultimately? If I set dhcp_domain to an | 10:39 |
ArchiFleKs | empty string in nova, will it solve this ? | 10:39 |
ildikov | kashyap: yeah, I gave up around midnight my time too... :/ :) | 10:39 |
kashyap | Midnight...Sounds a bit too late :-) | 10:39 |
* lyarwood reads up | 10:40 | |
kashyap | lyarwood: No action needed; you were prompted because you were the current bug assignee | 10:40 |
lyarwood | hmmm I don't see what the qemu-img change has to do with this | 10:40 |
ildikov | kashyap: it's just such a bummer as I remember chatting with Daniel Berrange back at the time about the 'shareable' flag and the support for that is in for ages and kinda thought that the virt layer is a done deal and we "only" need to figure out Cinder and Nova itself... | 10:41 |
kashyap | lyarwood: It's due to the versioning in Gate; just have to have some conditional logic in Nova | 10:41 |
kashyap | ildikov: BTW, some of the image locking stuff was a bit more recent (as recent as last month -- https://bugzilla.redhat.com/show_bug.cgi?id=1378242#c18) | 10:42 |
openstack | bugzilla.redhat.com bug 1378242 in libvirt "QEMU image file locking (libvirt)" [Unspecified,On_qa] - Assigned to pkrempa | 10:42 |
kashyap | But yeah, the 'shareable' flag has been around for ages | 10:42 |
ildikov | kashyap: yeah, I saw that one this morning | 10:42 |
lyarwood | kashyap: I'm not sure I agree with Matt's point in https://review.openstack.org/#/c/267587/75/nova/virt/libvirt/guest.py re the versions | 10:43 |
lyarwood | kashyap: https://github.com/libvirt/libvirt/commit/28907b0043fbf71085a798372ab9c816ba043b93 doesn't suggest we need to do anything different here, libvirt and QEMU should work out when to provide the share-rw=on flag | 10:44 |
* kashyap clicks | 10:44 | |
kashyap | lyarwood: I think it's too late for that | 10:46 |
lyarwood | kashyap: for what? | 10:47 |
openstackgerrit | Marcin Juszkiewicz proposed openstack/nova master: libvirt: use 'host-passthrough' as default on AArch64 https://review.openstack.org/530965 | 10:47 |
kashyap | For libvirt / QEMU to do the "right thing" | 10:47 |
kashyap | lyarwood: At the moment, Nova simply needs the conditional logic, going by the errors Matt and ildikov are reporting | 10:47 |
lyarwood | I'm obviously missing something here because what you're saying makes no sense | 10:48 |
lyarwood | the unsupported flag error is a docs bug | 10:48 |
hrw | now the worst part - reading nova/libvirt tests to find where it needs to be changed for https://review.openstack.org/530965 patch | 10:48 |
kashyap | Did you also look at this change: https://review.openstack.org/#/c/505673/ | 10:48 |
kashyap | hrw: Heya, just saw your change, note that 'host-passthrough' means, migration will completely break *unless* all the CPUs involved are all *identical* | 10:49 |
lyarwood | kashyap: that's totally unrelated to this isn't it? | 10:49 |
kashyap | hrw: I'll add some comments | 10:49 |
kashyap | lyarwood: No, it is related. | 10:49 |
hrw | kashyap: thanks | 10:49 |
lyarwood | kashyap: that just allows us to run qemu-img info against a live instance | 10:50 |
hrw | kashyap: I am aware of that issue. | 10:50 |
hrw | kashyap: getting that fixed layer after layer. | 10:51 |
*** markmcclain has quit IRC | 10:51 | |
kashyap | lyarwood: That's true; the two aspects here are - file locking from the standalone tool, 'qemu-img', and the locking semantics when running qemu-system-x86_64 | 10:51 |
kashyap | Give me a few minutes before I comment more | 10:51 |
lyarwood | ildikov: are we adding <shareable/> to the disk XML of multi-attached disks? | 10:52 |
hrw | kashyap: Kolla deploys Nova now with cpu_mode set. But not everyone uses Kolla-ansible to deploy. Then libvirt needs to be changed probably. I hope to meet some virt-arm people face-to-face later this month and want to discuss that | 10:52 |
ildikov | lyarwood: theoretically yes, proposed code change is here: https://review.openstack.org/#/c/267587/78/nova/virt/libvirt/volume/volume.py | 10:53 |
lyarwood | ildikov: thanks | 10:53 |
lyarwood | ildikov: hmmm shouldn't that be data.get('multiattach', False) ? | 10:54 |
lyarwood | good old connection_info, confusing everyone yet again | 10:54 |
kashyap | hrw: What does the 'Kolla' tool set it to? | 10:55 |
hrw | kashyap: host-passthrough | 10:55 |
kashyap | hrw: Hmm, that's wrong... | 10:55 |
* kashyap bbiab | 10:55 | |
ildikov | lyarwood: nope, it's not coming from Cinder, I add it here: https://review.openstack.org/#/c/267587/78/nova/virt/block_device.py | 10:56 |
ildikov | lyarwood: I mean it's in the volume details and I chose connection_info to pass it with the other stuff | 10:56 |
lyarwood | ildikov: ah, I assumed the backend was writing this in, sorry | 10:56 |
ildikov | can add it to the data part of it, we actually wanted to remove the nested dict structure of lovely connection_info, but seemed like a too huge headache | 10:57 |
ildikov | don't get me started on it... :) | 10:57 |
ildikov | lyarwood: no worries, would be nice, but it's a less direct interaction in the sense, that you can only create a multi-attach volume if the back end supports it, but then the info is stored in the volume object | 10:58 |
*** markmcclain has joined #openstack-nova | 10:59 | |
lyarwood | ildikov: yeah true | 11:06 |
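The exchange above is about where the multiattach flag lives in connection_info and how it turns into a <shareable/> element in the disk XML. A rough sketch of that idea follows, assuming the flag sits at the top level of connection_info as ildikov describes. `build_disk_xml`, the `device_path` key, and the target device name are illustrative assumptions. This is not nova's actual config-building code.

```python
import xml.etree.ElementTree as ET


def build_disk_xml(connection_info: dict, target_dev: str = "vdb") -> str:
    """Build a minimal libvirt <disk> element for a block device.

    Adds <shareable/> only when the top-level 'multiattach' key is true.
    """
    disk = ET.Element("disk", {"type": "block", "device": "disk"})
    # Assumed layout: the host device path sits under the nested 'data' dict.
    ET.SubElement(disk, "source", {"dev": connection_info["data"]["device_path"]})
    ET.SubElement(disk, "target", {"dev": target_dev, "bus": "virtio"})
    # Top-level lookup, not data.get(...), per the discussion above.
    if connection_info.get("multiattach", False):
        ET.SubElement(disk, "shareable")
    return ET.tostring(disk, encoding="unicode")


shared = build_disk_xml(
    {"multiattach": True, "data": {"device_path": "/dev/sdb"}}
)
plain = build_disk_xml({"data": {"device_path": "/dev/sdb"}})
print(shared)
print(plain)
```

Using `.get('multiattach', False)` keeps volumes without the key on the existing non-shared path.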
*** Guest71543 has quit IRC | 11:07 | |
*** yasemin_ has joined #openstack-nova | 11:09 | |
yasemin_ | hi, i want to add to hyper-v compute node to ocata openstack system, i installed it, but it gives an error "Filter ImagePropertiesFilter returned 0 hosts" ? Do you have any idea? Could you help me ? | 11:10 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: console: introduce framework for RFB authentication https://review.openstack.org/345397 | 11:10 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: console: introduce the VeNCrypt RFB authentication scheme https://review.openstack.org/345398 | 11:10 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: console: Provide an RFB security proxy implementation https://review.openstack.org/345399 | 11:10 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: doc: Document TLS security setup for noVNC proxy https://review.openstack.org/500544 | 11:10 |
openstackgerrit | Yikun Jiang (Kero) proposed openstack/nova master: test https://review.openstack.org/529519 | 11:13 |
openstackgerrit | Yikun Jiang (Kero) proposed openstack/nova master: Add index(updated_at) on migrations table. https://review.openstack.org/531132 | 11:13 |
hrw | kashyap: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1673467 | 11:14 |
openstack | Launchpad bug 1673467 in OpenStack nova-compute charm "[ocata] unsupported configuration: CPU mode 'host-model' for aarch64 kvm domain on aarch64 host is not supported by hypervisor" [High,Fix released] - Assigned to James Page (james-page) | 11:14 |
hrw | 11:14 (228s) linaro@cb-r1-m1-c1n1:~$ uname -m;virsh domcapabilities --emulatorbin /usr/bin/qemu-system-aarch64 | grep host-model | 11:14 |
hrw | aarch64 | 11:14 |
hrw | <mode name='host-model' supported='no'/> | 11:14 |
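The grep above picks one line out of the `virsh domcapabilities` output. The same check can be done by parsing the XML, as in the sketch below. The sample document is a hand-trimmed stand-in for real output and the `supported_cpu_modes` helper is hypothetical. Only the host-model line is taken from the paste; the other mode entries are assumed for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Trimmed, illustrative stand-in for `virsh domcapabilities` output.
SAMPLE_DOMCAPS = """
<domainCapabilities>
  <arch>aarch64</arch>
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='no'/>
    <mode name='custom' supported='yes'/>
  </cpu>
</domainCapabilities>
"""


def supported_cpu_modes(domcaps_xml: str) -> Dict[str, bool]:
    """Map each CPU mode name to whether the hypervisor supports it."""
    root = ET.fromstring(domcaps_xml)
    return {
        mode.get("name"): mode.get("supported") == "yes"
        for mode in root.findall("./cpu/mode")
    }


modes = supported_cpu_modes(SAMPLE_DOMCAPS)
print(modes)
```

A driver could consult such a mapping before picking a default cpu_mode instead of hard-coding one per architecture.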
openstackgerrit | Stephen Finucane proposed openstack/nova master: console: introduce framework for RFB authentication https://review.openstack.org/345397 | 11:14 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: console: introduce the VeNCrypt RFB authentication scheme https://review.openstack.org/345398 | 11:14 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: console: Provide an RFB security proxy implementation https://review.openstack.org/345399 | 11:14 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: doc: Document TLS security setup for noVNC proxy https://review.openstack.org/500544 | 11:14 |
openstackgerrit | Yikun Jiang (Kero) proposed openstack/nova master: Add index(updated_at) on migrations table. https://review.openstack.org/531132 | 11:16 |
*** nicolasbock has joined #openstack-nova | 11:17 | |
kashyap | hrw: Thanks for the link | 11:22 |
*** markvoelker has joined #openstack-nova | 11:27 | |
-openstackstatus- NOTICE: zuul seems to have gotten stuck and will probably need a restart, please be patient | 11:27 | |
*** openstackstatus has quit IRC | 11:28 | |
*** openstack has quit IRC | 11:28 | |
*** openstack has joined #openstack-nova | 13:08 | |
*** ChanServ sets mode: +o openstack | 13:08 | |
*** openstackstatus has joined #openstack-nova | 13:10 | |
*** ChanServ sets mode: +v openstackstatus | 13:10 | |
yasemin_ | hi, i want to add a hyper-v compute node to an ocata openstack system. i installed it, but it gives an error: "Filter ImagePropertiesFilter returned 0 hosts". Do you have any idea? Could you help me? | 13:18 |
*** liuyulong has quit IRC | 13:23 | |
*** lucas-hungry is now known as lucasagomes | 13:32 | |
alex_xu | cdent: the patch LGTM, just a small thing https://review.openstack.org/#/c/513526/17/nova/api/openstack/placement/schemas/allocation_candidate.py, but I won't block on it | 13:35 |
jaypipes | yasemin_: please see /topic. better to ask your question on the openstack@ mailing list with the [nova][hyper-v] subject markers. | 13:38 |
hrw | kashyap: custom/cortex-a53 fails :( | 13:39 |
yasemin_ | <jaypipes> okay, thank you | 13:39 |
kashyap | hrw: Oops | 13:40 |
kashyap | (Okay; let's move this to OFTC, Virt channel.) | 13:42 |
cdent | alex_xu: I borrowed that get schema from nova's positive_integer in parameter_types, which has the same redundancies (presumably to protect against the query string parsing library changing from underneath it) | 13:42 |
*** tbachman has joined #openstack-nova | 13:43 | |
*** tbachman_ has joined #openstack-nova | 13:46 | |
*** tbachman has quit IRC | 13:48 | |
*** tbachman_ is now known as tbachman | 13:48 | |
*** dtantsur is now known as dtantsur|brb | 13:49 | |
*** efried_ has joined #openstack-nova | 13:54 | |
*** efried_ is now known as efried | 13:56 | |
alex_xu | cdent: the positive_integer is used for the request body, not just the query string | 14:07 |
efried | jaypipes: I was planning to answer mriedem's call for more details on nrp (http://lists.openstack.org/pipermail/openstack-dev/2018-January/125953.html). Unless you were already doing that? | 14:07 |
jaypipes | efried: no. please go ahead. | 14:08 |
jaypipes | efried: in the middle of trying to get the aggregate affinity series straightened up. should take me another few hours. | 14:08 |
efried | jaypipes: Roger. FYI, the main thing I'd be throwing you under the bus for is accommodating nrps in GET /allocation_candidates | 14:09 |
efried | (throwing under bus - was that the right analogy? mebbe not) | 14:09 |
jaypipes | efried: that's cool. | 14:10 |
*** amodi has quit IRC | 14:10 | |
stephenfin | 😃 | 14:14 |
stephenfin | Whoops | 14:14 |
*** dtantsur|brb is now known as dtantsur | 14:22 | |
*** dansmith has quit IRC | 14:25 | |
*** smatzek has quit IRC | 14:25 | |
*** sapcc-bot1 has quit IRC | 14:26 | |
*** dgonzalez_5 has quit IRC | 14:27 | |
*** mkoderer_ has quit IRC | 14:27 | |
openstackgerrit | Claudiu Belu proposed openstack/nova master: hyperv: Cleans up live migration Planned VM https://review.openstack.org/478943 | 14:27 |
*** dgonzalez_3 has quit IRC | 14:28 | |
*** hongbin_ has joined #openstack-nova | 14:29 | |
*** mriedem has joined #openstack-nova | 14:29 | |
*** hongbin_ has quit IRC | 14:29 | |
*** hongbin has joined #openstack-nova | 14:30 | |
openstackgerrit | Lajos Katona proposed openstack/nova master: Deduplicate aggregate notification samples https://review.openstack.org/531162 | 14:30 |
*** jobewan has joined #openstack-nova | 14:33 | |
*** esberglu has joined #openstack-nova | 14:35 | |
openstackgerrit | Marcin Juszkiewicz proposed openstack/nova master: libvirt: use 'host-passthrough' as default on AArch64 https://review.openstack.org/530965 | 14:37 |
*** efried is now known as efried_tmp | 14:38 | |
*** efried has joined #openstack-nova | 14:38 | |
*** efried_tmp has quit IRC | 14:39 | |
*** burt has joined #openstack-nova | 14:39 | |
*** lbragstad has joined #openstack-nova | 14:43 | |
*** yamamoto has quit IRC | 14:43 | |
-openstackstatus- NOTICE: zuul has been restarted, all queues have been reset. please recheck your patches when appropriate | 14:47 | |
*** eharney has joined #openstack-nova | 14:54 | |
openstackgerrit | Matthew Booth proposed openstack/nova master: Fix fake libvirt XML generation for disks https://review.openstack.org/531165 | 14:55 |
*** esberglu has quit IRC | 14:57 | |
mriedem | kashyap: lyarwood: thanks for looking at https://review.openstack.org/#/c/267587/ - i think we're on the same page, | 14:58 |
mriedem | libvirt 3.10 has a fix for a regression introduced in qemu 2.10 | 14:58 |
kashyap | mriedem: I noticed your ping too last evening | 14:58 |
kashyap | I saw your sleuthing on the review | 14:58 |
mriedem | so i added the version check on startup such that multiattach isn't supported unless qemu<2.10 or libvirt>=3.10 | 14:58 |
kashyap | Yep | 14:58 |
mriedem | which sucks because we don't have those versions in our CI env | 14:58 |
*** esberglu has joined #openstack-nova | 14:59 | |
kashyap | (Maybe we should finally get that utopic gate job that will run on newer versions to allow us to test at least) | 14:59 |
mriedem | we are using the pike UCA which is about as new as it gets for ubuntu | 14:59 |
*** gouthamr has joined #openstack-nova | 15:00 | |
mriedem | jamespage: coreycb: what do you think are the chances of getting these libvirt patches backported from 3.10 to 3.6 in the pike UCA? https://bugzilla.redhat.com/show_bug.cgi?id=1378242#c10 | 15:00 |
openstack | bugzilla.redhat.com bug 1378242 in libvirt "QEMU image file locking (libvirt)" [Unspecified,On_qa] - Assigned to pkrempa | 15:00 |
kashyap | Yep, I recall; mriedem -- I wonder if you've ever seen this (for DevStack, though) -- https://review.openstack.org/#/c/108714/10 | 15:00 |
mriedem | alternatively, i'm going to have to probably look at using a special job which doesn't use the pike UCA so we just use the xenial package versions which would have qemu<2.10 | 15:01 |
*** yamamoto has joined #openstack-nova | 15:01 | |
mriedem | kashyap: so i thought markus_z and tonyb already had a job that does something like this | 15:01 |
*** smatzek has joined #openstack-nova | 15:01 | |
kashyap | mriedem: I vaguely recall talking to them too, at Barcelona, but it didn't go anywhere | 15:02 |
* kashyap looks at the list archives | 15:02 | |
*** smatzek has quit IRC | 15:02 | |
mriedem | i have the job name handy, sec | 15:02 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/newton: Raise MarkerNotFound if BuildRequestList.get_by_filters doesn't find marker https://review.openstack.org/530982 | 15:02 |
*** smatzek has joined #openstack-nova | 15:02 | |
kashyap | http://lists.openstack.org/pipermail/openstack-dev/2016-October/105552.html | 15:03 |
mriedem | kashyap: legacy-tempest-dsvm-nova-libvirt-kvm-apr | 15:03 |
kashyap | Thanks; let me look up the job definition | 15:04 |
mriedem | http://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/legacy/tempest-dsvm-nova-libvirt-kvm-apr/run.yaml#n2 | 15:04 |
kashyap | Heh, you're faster than Google | 15:04 |
mriedem | codesearch.openstack.org | 15:04 |
jamespage | mriedem: looking | 15:04 |
kashyap | Oh, didn't know of that thing | 15:05 |
kashyap | Why is it called 'legacy'? | 15:05 |
mriedem | because it existed before zuulv3 | 15:06 |
coreycb | jamespage: mriedem: do those patches enable sharable disks with qemu 2.10? 2.10 is what we have for pike/queens atm. | 15:06 |
mriedem | coreycb: yes | 15:06 |
mriedem | coreycb: and yeah, qemu 2.10 is precisely the problem | 15:06 |
kashyap | Okido, makes sense. | 15:06 |
mriedem | kashyap: i don't know if that apr job works, but i can propose that we run it on devstack experimental queue so i can see | 15:07 |
kashyap | mriedem: Yeah, I'm first looking at what's in there - git://git.openstack.org/openstack/devstack-plugin-additional-pkg-repos | 15:07 |
kashyap | Last updated 01-Apr-2016. | 15:08 |
kashyap | So at least it needs updates to the file: devstack/lib/libvirt | 15:09 |
kashyap | (To reflect correct versions.) | 15:09 |
hrw | uf. one patch, one bug. and then 3 other bugs for the same. ouch | 15:10 |
kashyap | hrw: What are the three other bugs? It's the same one we were talking about earlier, right | 15:10 |
mriedem | kashyap: oh hmm https://git.openstack.org/cgit/openstack/devstack-plugin-additional-pkg-repos/tree/devstack/lib/libvirt#n21 | 15:10 |
kashyap | hrw: I'm even surprised that OpenStack even works on AArch64 | 15:10 |
mriedem | pointing at liberty still yeah... | 15:10 |
kashyap | Yeah, the last Git commit should have given it away | 15:11 |
mriedem | kashyap: and i guess these versions for libvirt and qemu https://git.openstack.org/cgit/openstack/devstack-plugin-additional-pkg-repos/tree/devstack/lib/libvirt#n24 | 15:11 |
hrw | kashyap: https://bugs.launchpad.net/nova/+bug/1741230 was opened by me for my patch. Then found https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1673467 with some extra info. Then reported bug at upstream libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=1531076 and got pointed to discussion in https://bugzilla.redhat.com/show_bug.cgi?id=1430987, where again good info is provided. | 15:11 |
openstack | Launchpad bug 1741230 in OpenStack Compute (nova) "libvirt: use 'host-passthrough' as default on AArch64" [Undecided,In progress] - Assigned to Marcin Juszkiewicz (hrw) | 15:11 |
openstack | Launchpad bug 1673467 in OpenStack nova-compute charm "[ocata] unsupported configuration: CPU mode 'host-model' for aarch64 kvm domain on aarch64 host is not supported by hypervisor" [High,Fix released] - Assigned to James Page (james-page) | 15:11 |
openstack | bugzilla.redhat.com bug 1531076 in libvirt "Support 'host-model' on aarch64" [Unspecified,New] - Assigned to libvirt-maint | 15:11 |
openstack | bugzilla.redhat.com bug 1430987 in libvirt "No cpu model and feature in capabilities" [High,Assigned] - Assigned to abologna | 15:11 |
hrw | kashyap: we use OpenStack on AArch64 since Liberty ;D | 15:11 |
kashyap | hrw: Yeah, at this point, getting traction on the upstream Bugzilla you filed is your best bet. | 15:12 |
mriedem | looking at https://packages.ubuntu.com/search?suite=all&section=all&arch=any&keywords=libvirt-bin&searchon=names | 15:12 |
hrw | kashyap: first edition of Linaro Developer Cloud was Liberty based. and then ~200 users/projects/companies used it. | 15:12 |
mriedem | ubuntu doesn't have a libvirt 3.10 yet | 15:12 |
kashyap | Even the UCA? | 15:13 |
hrw | kashyap: when Mitaka got released we migrated as it gave us UEFI support which simplified booting A LOT. | 15:13 |
mriedem | i assume Pike UCA is the latest, and that has libvirt 3.6 | 15:13 |
mriedem | same as bionic | 15:13 |
hrw | kashyap: then we migrated to Newton. | 15:13 |
hrw | kashyap: there are some testing deployments of Pike but we target Queens | 15:13 |
mriedem | same issue for qemu https://packages.ubuntu.com/search?suite=all&section=all&arch=any&keywords=qemu-kvm&searchon=names | 15:13 |
mriedem | 2.10 is the latest in bionic | 15:13 |
hrw | mriedem: there is Queens UCA | 15:13 |
mriedem | and pike UCA | 15:13 |
mriedem | hrw: orly | 15:14 |
hrw | iirc | 15:14 |
hrw | http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/xenial-updates/queens/ | 15:14 |
hrw | mriedem: but still libvirt 3.6.0 ;( | 15:14 |
kashyap | hrw: Okay. But meanwhile, my Aarch64 Mustang is remote, and I haven't accessed it in a while. If you get sometime, can you power your machine on (the one you talked about earlier in the day) & try? | 15:15 |
hrw | kashyap: I use xgene1 based HPe Moonshots | 15:15 |
hrw | kashyap: remotely as we have a bunch of those at Linaro lab | 15:15 |
mriedem | yup still 3.6 http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/xenial-updates/queens/main/binary-amd64/Packages | 15:15 |
mriedem | so that doesn't help | 15:15 |
hrw | mriedem: make a PPA and grab libvirt from buster? | 15:16 |
kashyap | hrw: Are there any real users using OpenStack on AArch64, besides test / devel at Linaro? | 15:16 |
hrw | mriedem: that's what I did for Debian stable at Linaro | 15:16 |
hrw | kashyap: yes, there are | 15:16 |
hrw | kashyap: can not list names | 15:16 |
openstackgerrit | Eric Berglund proposed openstack/nova master: WIP: PowerVM Driver: vSCSI https://review.openstack.org/526094 | 15:17 |
kashyap | No prob | 15:17 |
hrw | kashyap: we provide VM instances as build slaves for several CI systems for example | 15:17 |
hrw | kashyap: also provided set of VMs as porter boxes for several projects | 15:17 |
kashyap | Nod | 15:17 |
mriedem | alternatively, i can push a devstack DNM patch which doesn't use the Pike UCA and we'll get qemu 2.5 | 15:17 |
kashyap | mriedem: You mean, just to exercise the relevant code for tests? | 15:18 |
mriedem | yes | 15:18 |
mriedem | even if jamespage / coreycb were able to backport those patches from 3.10 to 3.6 in the pike UCA, we'd have to put a workaround option in nova to bypass the version check and just run the multiattach code | 15:18 |
hrw | kashyap: the plan is to migrate all our cloud setups from our Newton to clean Queens. from set of venvs to containers | 15:18 |
mriedem | i.e. 'i've patched my packages, so don't care about the versions you think are required' | 15:19 |
hrw | mriedem: can not libvirt report that multiattach feature in domcapabilities or sth? | 15:19 |
kashyap | Yeah, I saw your version check code in that patch series | 15:20 |
hrw | mriedem: so instead "if libvirt >= x.y" you can use "if libvirt.capabilities.contains('multiattach')" | 15:20 |
mriedem | hrw: maybe? https://github.com/libvirt/libvirt/commit/860a3c4bea1d24773d8a495f213d5de3ac48a462 | 15:20 |
hrw | version check suxx when features are not enabled on all archs | 15:20 |
mriedem | is that how this would work with ^ and "disk-share-rw"? | 15:20 |
hrw | kashyap: so, can I get +2 from you on patch? ;D | 15:21 |
kashyap | hrw: I can't +2 | 15:21 |
kashyap | hrw: But I'd like someone else's opinion, too, like mriedem | 15:22 |
kashyap | Or mdrabe | 15:22 |
hrw | kashyap: or w8... I was supposed to do something with tests... no idea how to do that part | 15:22 |
kashyap | Err, mdbooth I mean | 15:22 |
mriedem | kashyap: can you tell me if https://github.com/libvirt/libvirt/commit/860a3c4bea1d24773d8a495f213d5de3ac48a462 is used to expose a hypervisor capability? | 15:22 |
mdbooth | kashyap: ? | 15:23 |
kashyap | hrw: I have one more comment on the change | 15:23 |
kashyap | Almost about to hit send | 15:23 |
mdbooth | Which change? | 15:23 |
kashyap | mriedem: 1 sec, let me look | 15:23 |
mriedem | like in https://github.com/openstack/nova/blob/master/nova/virt/libvirt/host.py#L615 | 15:23 |
hrw | mdbooth: https://review.openstack.org/#/c/530965/ | 15:23 |
kashyap | mdbooth: https://review.openstack.org/#/c/530965/4 | 15:23 |
kashyap | mdbooth: It's arch-specific, but migration-related, too. I noted the concern there | 15:23 |
mdbooth | kashyap: I don't have a useful opinion on that unfortunately without doing my own doc diving. | 15:24 |
*** smatzek has quit IRC | 15:24 | |
kashyap | mdbooth: Okido, I have enough context there. Disregard | 15:24 |
hrw | mdbooth: imho https://bugzilla.redhat.com/show_bug.cgi?id=1430987 is best part of info on subject | 15:25 |
openstack | bugzilla.redhat.com bug 1430987 in libvirt "No cpu model and feature in capabilities" [High,Assigned] - Assigned to abologna | 15:25 |
kashyap | mriedem: Still checking | 15:25 |
hrw | kashyap: thx | 15:25 |
*** smatzek has joined #openstack-nova | 15:26 | |
*** smatzek_ has joined #openstack-nova | 15:27 | |
*** smatzek_ has quit IRC | 15:28 | |
*** smatzek_ has joined #openstack-nova | 15:28 | |
kashyap | mriedem: Hmm, I just built the newest libvirt-python bindings, and don't see it in any of the capabilities; let me ask one of the libvirt folks | 15:29 |
kashyap | mriedem: So no -- it isn't exposed via any capabilities currently (like `virsh (dom)capabilities`) | 15:30 |
kashyap | mriedem: So Peter (who wrote that commit) says: | 15:30 |
kashyap | 16:29 < pkrempa> kashyap: it is not exposed currently, since it's supposed to be transparent for the users | 15:31 |
kashyap | 16:30 < pkrempa> and if it's not transparent I'd suggest to complain to qemu | 15:31 |
*** smatzek has quit IRC | 15:31 | |
mriedem | kashyap: ok so we're stuck with version checks | 15:31 |
mriedem | thanks for asking | 15:31 |
mriedem | we could, as noted, add a workaround config option to bypass the version checks if you've patched your packages | 15:31 |
kashyap | mriedem: But that'd be asking the user to be too awake, and alert and aware :P | 15:32 |
kashyap | More seriously, yeah - workaround config sounds good | 15:33 |
mriedem | it wouldn't be the end user, it'd be the deployer | 15:33 |
mriedem | but yes | 15:33 |
mriedem | depends on how much people want their multiattach | 15:33 |
kashyap | You mean in the [workarounds] section, right. Like that live snapshots thing (that now we removed, IIRC) | 15:33 |
*** smatzek_ has quit IRC | 15:33 | |
mriedem | yes | 15:34 |
openstackgerrit | Marcin Juszkiewicz proposed openstack/nova master: libvirt: use 'host-passthrough' as default on AArch64 https://review.openstack.org/530965 | 15:34 |
mriedem | a "this isn't tested upstream so it's not supported thing, but it's a backdoor if you need it" | 15:34 |
kashyap | Heh, yeah | 15:34 |
mdbooth | mriedem: lyarwood I also commented on the multi attach patch. There must be a better place to stash multiattach than connection_info. That's essentially adding pain to something we're effectively deprecating. | 15:35 |
kashyap | hrw: So, on the above patch, thanks for addressing that. Besides that, I don't have anything else | 15:35 |
hrw | kashyap: thanks for help | 15:35 |
coreycb | mriedem: will get back to you hopefully shortly. i'm checking with our libvirt maintainer. | 15:35 |
mdbooth | Incidentally, I think a 'can multiattach' flag on the BDM object would be appropriate. | 15:36 |
ildikov | mdbooth: what are we deprecating? | 15:36 |
*** smatzek has joined #openstack-nova | 15:36 | |
mdbooth | ildikov: We're trying to get rid of the requirement for Nova to know anything about connection_info | 15:36 |
kashyap | hrw: Oh, hang on -- same update needs to be done in the rel note. | 15:37 |
mdbooth | Ideally we can just fetch it from cinder whenever we need it. | 15:37 |
kashyap | hrw: As that's more public facing; sorry, should've caught it earlier | 15:37 |
mriedem | mdbooth: with 2 weeks to feature freeze, i think that's a future improvement | 15:37 |
hrw | kashyap: will do | 15:37 |
mriedem | mdbooth: because i don't want to drag this out for queens with a schema and object migration and all that | 15:37 |
mdbooth | ildikov: We definitely don't want to be stashing more stuff in a foreign opaque dict, and then relying on it. | 15:37 |
mriedem | mdbooth: hell, we could just add a "multiattach" boolean flag to driver.attach_volume | 15:37 |
mdbooth | mriedem: Yeah, I wondered about that, but I think it needs to be persistent. | 15:38 |
mriedem | why? | 15:38 |
mriedem | we get the multiattach value from the volume | 15:38 |
mriedem | i don't really want to persist cinder state in nova's db | 15:38 |
mdbooth | mriedem: Admittedly I didn't stare at it for a long time, but it's used in volume_driver.get_config() | 15:38 |
*** smatzek has quit IRC | 15:38 | |
mriedem | mdbooth: yeah i think that's called from driver.attach_volume | 15:38 |
mdbooth | I'm pretty sure we can call that outside of the context of attach() | 15:38 |
mriedem | so we can pass a boolean arg down | 15:38 |
hrw | kashyap: I just copy comment to release notes basically in next patch | 15:39 |
mriedem | mdbooth: hmm, for live migrate maybe | 15:39 |
mriedem | yeah | 15:39 |
kashyap | hrw: Yep | 15:39 |
mdbooth | mriedem: However, if that's not true (I didn't check), that would be great | 15:39 |
ildikov | mdbooth: mriedem: I'm open to suggestions it's just the first idea I had two years ago... | 15:39 |
mdbooth | Anyway, just my 2c, and I totally get that pragmatism might be required here. | 15:40 |
mdbooth | But it raised a flag for me. We'll need to unwind it eventually. | 15:40 |
hrw | also commit message got rewritten | 15:41 |
openstackgerrit | Marcin Juszkiewicz proposed openstack/nova master: libvirt: use 'host-passthrough' as default on AArch64 https://review.openstack.org/530965 | 15:41 |
mriedem | yeah i think passing a boolean through attach_volume to get_config is easy, it's live migrate that i'm worried about | 15:41 |
ildikov | mdbooth: not the first time it came up, but we didn't manage to have this as a top priority problem as of yet to find a better way :( | 15:41 |
mdbooth | ildikov: At the very least we'll need to be able to find it in order to unwind it. | 15:42 |
* mdbooth wonders where the best place to put a warning about it would be. | 15:42 | |
kashyap | mriedem: On the previous point about capabilities, libvirt upstream says, if I file a bug they could add it - as it shouldn't be too difficult | 15:43 |
kashyap | mriedem: That'd be cleaner for us (Nova), wouldn't it? | 15:43 |
mdbooth | ildikov: Incidentally, does service_uuid indicate a shared pool of volumes? | 15:43 |
mdbooth | ildikov: e.g. multiple volumes on the same NFS mount? | 15:44 |
ildikov | mdbooth: no, that's independent | 15:44 |
mdbooth | ildikov: Where can I read what it means? | 15:44 |
mriedem | kashyap: long-term that would be cleaner yes, | 15:44 |
mriedem | but not something that is going to help me in queens | 15:45 |
kashyap | Right. /me imagines: If you add the version check, and then the capability comes along later, no one will remember to swap that, until prompted by something | 15:45 |
mriedem | kashyap: feel free to file a bug if you want :) | 15:46 |
kashyap | ildikov: Can you point to the latest URL of the multi-attach specification, please? | 15:46 |
openstackgerrit | Matthew Booth proposed openstack/nova master: Pass DriverBlockDevice to driver.attach_volume https://review.openstack.org/528363 | 15:46 |
openstackgerrit | Matthew Booth proposed openstack/nova master: Use real block_device_info data in libvirt tests https://review.openstack.org/527916 | 15:46 |
openstackgerrit | Matthew Booth proposed openstack/nova master: Fix libvirt volume tests passing invalid disk_info https://review.openstack.org/529328 | 15:46 |
openstackgerrit | Matthew Booth proposed openstack/nova master: Pass disk_info dict to libvirt_info https://review.openstack.org/529329 | 15:46 |
openstackgerrit | Matthew Booth proposed openstack/nova master: Expose volume host type and path independent of libvirt config https://review.openstack.org/530786 | 15:46 |
openstackgerrit | Matthew Booth proposed openstack/nova master: Don't generate fake disk_info in swap_volume https://review.openstack.org/530787 | 15:46 |
openstackgerrit | Matthew Booth proposed openstack/nova master: Local disk serial numbers for the libvirt driver https://review.openstack.org/529380 | 15:46 |
openstackgerrit | Matthew Booth proposed openstack/nova master: Remove redundant swap_volume tests https://review.openstack.org/531179 | 15:46 |
*** eharney has quit IRC | 15:46 | |
mriedem | kashyap: https://specs.openstack.org/openstack/nova-specs/specs/queens/approved/cinder-volume-multi-attach.html | 15:46 |
kashyap | mriedem: Will do | 15:46 |
kashyap | Thanks | 15:46 |
openstackgerrit | Andreas Karis proposed openstack/nova master: Add debug output for selected page size https://review.openstack.org/530662 | 15:47 |
coreycb | mriedem: this is being tracked in https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1716028 | 15:48 |
openstack | Launchpad bug 1716028 in libvirt (Ubuntu) "qemu 2.10 locks images with no feature flag" [Medium,Triaged] | 15:48 |
coreycb | mriedem: cpaelzer says that after bionic is done he'll take a look at a potential SRU, although he currently has the SRU to artful(pike) as a low priority. | 15:48 |
mriedem | coreycb: ok thanks | 15:49 |
kashyap | It sucks hard that one cannot access the URLs (https://review.openstack.org/#/c/267587/78/nova/virt/libvirt/driver.py) without Gerrit account | 15:51 |
*** smatzek has joined #openstack-nova | 15:51 | |
kashyap | Can't pass in-progress patch URLs to people who aren't Gerrit users. Surely there must be a way | 15:51 |
kashyap | Without bothering people to ask to make an account (much like mailing lists). | 15:51 |
mriedem | kashyap: i'm not signed in and i can view https://review.openstack.org/#/c/267587/78/nova/virt/libvirt/driver.py | 15:52 |
mriedem | you just can't comment or vote | 15:52 |
kashyap | mriedem: Err, sorry. The person was complaining about typing in a comment. | 15:52 |
kashyap | mriedem: Anyway, the quick point that Peter wanted to add was: " sharing disk image is possible even with current qemu/libvirt if the image is 'raw' and <shareable/> is used" | 15:54 |
mriedem | we must be using qcow2 images | 15:55 |
*** smatzek has quit IRC | 15:55 | |
*** armax has joined #openstack-nova | 15:56 | |
mriedem | another thing i can try | 15:56 |
ildikov | mdbooth: the service_uuid field was added here: https://review.openstack.org/#/c/519025/ | 15:57 |
*** smatzek has joined #openstack-nova | 15:57 | |
mdbooth | ildikov: Just found it. I think it's orthogonal to multi-attach, tbh. | 15:57 |
mriedem | kashyap: although it's a bit confusing, | 15:57 |
mriedem | we dump the disk config before trying to attach the device | 15:57 |
mriedem | and it says type="raw" | 15:58 |
mriedem | http://paste.openstack.org/show/638081/ | 15:58 |
mdbooth | ildikov: Still a good idea. I could also remove my NFS locking stuff in Nova if we had that, although my NFS locking is finer grained but significantly more complex. | 15:58 |
*** imacdonn has quit IRC | 15:58 | |
ildikov | mdbooth: it was added along with a shared_targets field so we can use a lock in case the target exported by the back end is shared among volumes/attachments | 15:59 |
mdbooth | ildikov: I get it. It's a good idea, I just don't see the relationship to multi-attach. | 15:59 |
kashyap | mriedem: (Aside - we both wrote almost same comment 4 mins apart) | 15:59 |
ildikov | mdbooth: it is supposed to help to solve our detach problems | 15:59 |
kashyap | mriedem: Looking at your paste | 15:59 |
mdbooth | We hit this with or without multi-attach. | 16:00 |
ildikov | mdbooth: as if the target is shared and gets removed with the first attachment then the remaining attachments are screwed | 16:00 |
mdbooth | ildikov: Right, but you can do that without multi-attach. | 16:00 |
*** smatzek_ has joined #openstack-nova | 16:00 | |
ildikov | mdbooth: and the Nova patches using it are dependencies to multi-attach as it might be a problem otherwise as well | 16:00 |
mdbooth | Multi-attach doesn't even make the problem particularly worse. | 16:00 |
*** hemna_ has joined #openstack-nova | 16:01 | |
*** udesale has joined #openstack-nova | 16:01 | |
*** jafeha has quit IRC | 16:01 | |
ildikov | mdbooth: well, I got it in referring to multi-attach, but I guess the path doesn't matter once you got where you wanted... :) | 16:01 |
*** smatzek has quit IRC | 16:01 | |
efried | jaypipes mriedem cdent alex_xu Draft: http://paste.openstack.org/show/638080/ -- anything missing/incorrect/silly? | 16:01 |
* kashyap bbiab; need a few to bike home. | 16:02 | |
cdent | efried: will look in a mo, thanks for doing that | 16:02 |
mdbooth | ildikov: Hehe, I hear you :) | 16:02 |
mriedem | mdbooth: is it just me or is libvirt.images_type, use_cow_images and force_raw_images set of options totally confusing? | 16:04 |
stephenfin | alex_xu: Done (https://review.openstack.org/#/c/530284/) | 16:04 |
*** smatzek_ has quit IRC | 16:05 | |
*** smatzek has joined #openstack-nova | 16:06 | |
*** smatzek has quit IRC | 16:07 | |
*** smatzek has joined #openstack-nova | 16:08 | |
*** tbachman has quit IRC | 16:09 | |
*** udesale has quit IRC | 16:14 | |
*** edmondsw has joined #openstack-nova | 16:14 | |
*** smatzek has quit IRC | 16:15 | |
*** smatzek_ has joined #openstack-nova | 16:15 | |
*** smatzek_ has quit IRC | 16:16 | |
efried | mriedem Log processing & coloring is handled in the openstack-infra/os-loganalyze project (but I think you knew that). Is there something in particular you're having trouble finding in there? | 16:18 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: conf: Deprecate 'network_manager' https://review.openstack.org/530923 | 16:19 |
mriedem | efried: what in infra actually calls os-loganalyze to format the logs | 16:19 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: conf: Use new-style choice values https://review.openstack.org/530924 | 16:20 |
efried | mriedem Ah - it's an apache plugin thingy. The files aren't actually modified - they get twiddled on the fly when you do your http request. | 16:20 |
efried | mriedem So you just need to muck with the filters to make sure they're being run on the files you're interested in. | 16:21 |
efried | mriedem If you have access to a log server, you can play by fiddling with the os-loganalyze source in place, restarting the apache server, and then reloading your browser. | 16:22 |
*** smatzek has joined #openstack-nova | 16:22 | |
*** eharney has joined #openstack-nova | 16:24 | |
*** nicolasbock has quit IRC | 16:24 | |
*** nicolasbock has joined #openstack-nova | 16:25 | |
*** smatzek has quit IRC | 16:27 | |
mriedem | i don't | 16:27 |
*** smatzek has joined #openstack-nova | 16:30 | |
*** smatzek_ has joined #openstack-nova | 16:34 | |
*** smatzek has quit IRC | 16:35 | |
*** dtantsur is now known as dtantsur|afk | 16:41 | |
cdent | efried: information seems accurate at the detail level, but feels like it needs some kind of executive summary or something, a kind of "here's what we're trying to accomplish" which is then followed by the "and this is how it is being done" (which is what you've already got) | 16:44 |
efried | cdent Okay. I felt like it was already getting kinda long, but... tough :) | 16:45 |
*** esberglu has quit IRC | 16:46 | |
cdent | I think length is useful in this case because there's been not enough in the way of spec, so this kind of stands in for that | 16:46 |
*** esberglu has joined #openstack-nova | 16:48 | |
*** cdent has quit IRC | 16:49 | |
*** gcb has quit IRC | 16:49 | |
*** ChanServ sets mode: -r | 16:49 | |
*** syjulian has joined #openstack-nova | 16:50 | |
*** MikeG451 has joined #openstack-nova | 16:50 | |
*** gcb has joined #openstack-nova | 16:50 | |
clarkb | mriedem: the test framework for it runs without apache | 16:50 |
*** tbachman has joined #openstack-nova | 16:52 | |
*** pcaruana has quit IRC | 16:52 | |
openstackgerrit | Eric Berglund proposed openstack/nova master: WIP: PowerVM Driver: vSCSI https://review.openstack.org/526094 | 16:52 |
*** cdent has joined #openstack-nova | 16:52 | |
*** smatzek_ has quit IRC | 16:53 | |
hrw | https://marcin.juszkiewicz.com.pl/2018/01/04/today-i-was-fighting-with-nova-no-idea-who-won/ | 16:53 |
hrw | kashyap: ^^ | 16:53 |
kashyap | hrw: On a call, will read :-) | 16:54 |
hrw | kashyap: thx ;) | 16:54 |
kashyap | Damn, I already clicked | 16:54 |
*** sdague has joined #openstack-nova | 16:56 | |
*** smatzek_ has joined #openstack-nova | 16:56 | |
*** danpawlik has quit IRC | 16:56 | |
*** danpawlik has joined #openstack-nova | 16:57 | |
*** sheel has joined #openstack-nova | 16:57 | |
*** jafeha has joined #openstack-nova | 16:57 | |
hrw | kashyap: :D | 16:58 |
kashyap | hrw: It looks fine; there's an extra tab in there | 16:58 |
kashyap | "It" == the review | 16:59 |
rybridges | Hey guys, can anyone confirm for sure whether or not ephemeral GB is part of a VM's snapshot? | 16:59 |
hrw | kashyap: can you mark it in comment? I am unable to find it locally | 17:00 |
kashyap | hrw: Done. | 17:00 |
kashyap | It's extra tab | 17:00 |
kashyap | In the rel note file | 17:00 |
*** smatzek_ has quit IRC | 17:00 | |
hrw | right | 17:00 |
hrw | 4 spaces to be exact ;d | 17:01 |
hrw | that's why I did not find it | 17:01 |
openstackgerrit | Marcin Juszkiewicz proposed openstack/nova master: libvirt: use 'host-passthrough' as default on AArch64 https://review.openstack.org/530965 | 17:01 |
hrw | done | 17:01 |
* hrw off | 17:01 | |
efried | cdent (jaypipes) Howzat: http://paste.openstack.org/raw/638137/ | 17:03 |
*** esberglu has quit IRC | 17:03 | |
*** esberglu has joined #openstack-nova | 17:03 | |
*** gyee has joined #openstack-nova | 17:17 | |
*** smatzek has joined #openstack-nova | 17:18 | |
*** eharney has quit IRC | 17:20 | |
mnaser | is stable/pike ci broken? | 17:21 |
mriedem | yes | 17:21 |
*** smatzek has quit IRC | 17:21 | |
*** smatzek has joined #openstack-nova | 17:21 | |
mriedem | https://review.openstack.org/#/c/531058/ | 17:21 |
lyarwood | also https://review.openstack.org/#/c/531046/ | 17:21 |
*** eharney has joined #openstack-nova | 17:22 | |
mnaser | ok, i guess i'll apply the patch locally till it lands in stable/pike (https://review.openstack.org/#/c/529384/) | 17:23 |
edleafe | mriedem: so for the migration bug: would the fix be to just log that there were no orig_alloc found, and not raise the exception? | 17:24 |
mriedem | edleafe: well, there are really 2 fixes, | 17:24 |
edleafe | Or should I also check if the migration is the orig_alloc (as in retries) | 17:24 |
mriedem | 1. If using the CachingScheduler, there won't be allocations and we need to just log something and ignore it | 17:25 |
mnaser | also if any stable cores for nova are around, this is pretty useful - https://review.openstack.org/#/c/529385/ | 17:25 |
mriedem | 2. If we're rescheduling (we should know this via filter_properties 'retry' entry), then we need to modify how we swap allocations to only change the allocation for the instance record, and leave the migration allocation on the source node untouched | 17:25 |
mriedem | edleafe: so i'm thinking 2 separate patches | 17:26 |
mriedem | edleafe: i could probably wip up a simple regression test for the caching scheduler one and fix in the same patch, which could go below yours | 17:27 |
mriedem | since resize is just busted with caching scheduler regardless of reschedule | 17:27 |
*** moshele has joined #openstack-nova | 17:27 | |
edleafe | mriedem: on #2, not following. The source allocs will be the migration, and the dest allocs won't be done because we haven't picked a target host yet | 17:27 |
mriedem | edleafe: on #2 the problem is when we reschedule right? | 17:28 |
edleafe | yeah | 17:28 |
edleafe | it's checking that the instance is allocated against the source | 17:28 |
mriedem | at the point of the first reschedule, the source node allocs are on the migration record and the failed first chosen host allocs are on the instance | 17:28 |
*** yikun has quit IRC | 17:28 | |
mriedem | right, i'm saying, | 17:28 |
edleafe | on a retry, the allocs on the source will be the migration | 17:28 |
mriedem | we need to modify the logic that assumes the source node allocs are on the instance, | 17:28 |
mriedem | if we know we're doing a reschedule, | 17:28 |
*** yikun has joined #openstack-nova | 17:29 | |
mriedem | and we can determine that based on the filter_properties 'retry' entry | 17:29 |
mriedem | so on a retry, we don't do anything with the source, | 17:29 |
mriedem | we just update the dest | 17:29 |
mriedem | update the instance allocation from failed dest 1 to next dest 2 | 17:29 |
edleafe | mriedem: what did you mean then by "only change the allocation for the instance record"? | 17:29 |
mriedem | on a reschedule, we move the instance allocation from the failed dest host to the next alternative host | 17:30 |
mriedem | and don't touch the migration allocation on the source host | 17:30 |
edleafe | the instance won't be allocated after a fail. Those should be rolled back, no? | 17:30 |
edleafe | no, the next host isn't selected until after this part of the code is run | 17:30 |
edleafe | This is happening in _preallocate_migration | 17:31 |
mriedem | where do we rollback allocations for the instance on a fail? | 17:32 |
edleafe | When we eventually select a host (either through select_destinations (now) or alternate host (when that last patch merges)), that's when we allocate the instance to the target | 17:32 |
edleafe | mriedem: in compute. Let me look | 17:32 |
mriedem | delete_allocation_for_failed_resize in prep_resize? | 17:33 |
mriedem | oh i see, the call to _revert_allocation | 17:33 |
mriedem | https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L4095 | 17:34 |
edleafe | damn, you're faster than I am :) | 17:34 |
mriedem | so if the compute reverts the allocations before casting back to conductor to reschedule, | 17:35 |
mriedem | why is conductor failing on the reschedule | 17:35 |
mriedem | ? | 17:35 |
edleafe | Because it is confirming that the instance has allocations on the source | 17:35 |
edleafe | But at that point on a retry, it's the migration that has allocations on the source | 17:35 |
mriedem | yeah, which should be the case if we got here https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L4000 | 17:35 |
mriedem | not if we did ^ | 17:36 |
*** mchiappero has quit IRC | 17:36 | |
*** moshele has quit IRC | 17:36 | |
mriedem | is that failing? | 17:36 |
edleafe | dunno - all I know is that the instance didn't have allocations on the source. Let me check that out. | 17:37 |
*** mchiappero has joined #openstack-nova | 17:39 | |
mriedem | i've created https://bugs.launchpad.net/nova/+bug/1741307 to deal with the caching scheduler + resize stuff | 17:43 |
openstack | Launchpad bug 1741307 in OpenStack Compute (nova) "Resize always fails when using the CachingScheduler" [High,Triaged] - Assigned to Matt Riedemann (mriedem) | 17:43 |
mriedem | kashyap: fyi that i updated the devstack patch to test multiattach to set CONF.libvirt.images_type=raw and CONF.use_cow_images=False to see if that makes a difference | 17:45 |
*** lucasagomes is now known as lucas-afk | 17:46 | |
openstackgerrit | Merged openstack/nova master: Revert "Modify _poll_shelved_instances periodic task call _shelve_offload_instance()" https://review.openstack.org/530284 | 17:48 |
*** armax has quit IRC | 17:50 | |
*** armax has joined #openstack-nova | 17:50 | |
*** damien_r has joined #openstack-nova | 17:50 | |
*** Apoorva has joined #openstack-nova | 17:59 | |
*** david-lyle has quit IRC | 18:00 | |
*** david-lyle has joined #openstack-nova | 18:01 | |
*** derekh has quit IRC | 18:03 | |
mriedem | wow, if i'm reading his results correctly, Kevin_Zheng got a 90% improvement in listing instances with details using an ip filter after we proxy to neutron first | 18:19 |
mriedem | sdague: ^ | 18:19 |
mriedem | 2000 instances in nova and 2K ports in neutron | 18:19 |
mriedem | told him to send the details to the dev list | 18:21 |
openstackgerrit | Marcin Juszkiewicz proposed openstack/nova master: libvirt: use 'host-passthrough' as default on AArch64 https://review.openstack.org/530965 | 18:22 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add regression test for resizing failing when using CachingScheduler https://review.openstack.org/531211 | 18:27 |
mriedem | edleafe: ^ functional regression test for resize + caching scheduler failure | 18:27 |
edleafe | mriedem: thx | 18:29 |
*** yamamoto has quit IRC | 18:29 | |
cdent | efried: sorry for delay, better ✔ | 18:29 |
efried | Thanks cdent. jaypipes, you want to look before I send it? http://paste.openstack.org/raw/638137/ | 18:30 |
*** yamamoto has joined #openstack-nova | 18:33 | |
efried | going, going, gone. | 18:36 |
*** yamamoto has quit IRC | 18:37 | |
cdent | shut your filthy mouth efried | 18:47 |
cdent | (re notifications) | 18:47 |
efried | Hehehehe, thought you'd like that. | 18:47 |
*** efried is now known as efried_nomnom | 18:48 | |
*** stvnoyes has joined #openstack-nova | 18:53 | |
openstackgerrit | Merged openstack/nova master: Add support for getting volume details with a specified microversion https://review.openstack.org/529656 | 19:02 |
*** sheel has quit IRC | 19:06 | |
*** tbachman has quit IRC | 19:14 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Handle no allocations during migrate https://review.openstack.org/531220 | 19:20 |
*** yamamoto has joined #openstack-nova | 19:35 | |
openstackgerrit | Merged openstack/nova master: DriverBlockDevice: make subclasses inherit _proxy_as_attr https://review.openstack.org/524167 | 19:36 |
openstackgerrit | Merged openstack/nova master: Expose BDM uuid to drivers https://review.openstack.org/529037 | 19:36 |
*** smatzek has quit IRC | 19:39 | |
*** armax has quit IRC | 19:40 | |
mriedem | mdbooth: i was looking at where _get_volume_config is used and it's used in quite a few places in the libvirt driver, like live migration, swap volume, and generally anywhere we build a guest xml, so also for resize/cold migrate, | 19:40 |
mriedem | and in a lot of those places, we don't have a volume to know if it's multiattach or not to set the conf.shareable flag, | 19:40 |
mriedem | so despite it not being great, stashing the multiattach value in the connection_info which gets saved in the bdm is about the best we can do right now | 19:41 |
mriedem | because all of these places in the driver that call _get_volume_config are using the connection_info off the bdm | 19:41 |
*** yamamoto has quit IRC | 19:42 | |
*** tbachman has joined #openstack-nova | 19:49 | |
sdague | mriedem: nice | 19:53 |
sdague | mriedem: I'd totally believe it, we do extra dumb stuff in there | 19:53 |
mriedem | substring match is a bit slower, as expected, but still much better than doing it in nova | 19:54 |
sdague | mriedem: yep | 19:54 |
mriedem | 4000ms w/o the patch, 400ms with the patch and no substring, about 900ms with the patch and substring match | 19:54 |
*** eharney has quit IRC | 20:01 | |
*** efried_nomnom is now known as efried | 20:03 | |
mriedem | cdent: edleafe: it's your favorite game show, "IS IT 400 OR 409?!" | 20:03 |
* cdent puts hand over button | 20:03 | |
mriedem | so, say i'm trying to attach a multiattach volume to an instance and the compute is too old to support multiattach, | 20:04 |
mriedem | is that a 400, or a 409? knowing that at some point the compute might be upgraded to support multiattach. | 20:04 |
cdent | the state of the compute is out of sync with the desired state, but you made a valid request (had it been in the right state), so I'd go with 409 | 20:06 |
mriedem | artom: remember how we punted for tagged attach on a shelved offloaded instance because we couldn't tell if the host supported it? | 20:06 |
mriedem | cdent: ok that's what i was thinking too | 20:06 |
cdent | \o/ | 20:06 |
mriedem | artom: but we don't make that same distinction in the API for tagged bdms with boot from volume | 20:07 |
mriedem | can you remember why? | 20:07 |
mriedem | artom: my guess was "you could reschedule and attempt to hit a host that does support device tagging" but we raise BuildAbortException so we don't even reschedule in that case | 20:08 |
*** markmcclain has quit IRC | 20:10 | |
*** markmcclain has joined #openstack-nova | 20:11 | |
openstackgerrit | melanie witt proposed openstack/nova master: Add access_url_base to console_auth_tokens table https://review.openstack.org/334614 | 20:13 |
openstackgerrit | melanie witt proposed openstack/nova master: Optionalize instance_uuid in console_auth_token_get_valid() https://review.openstack.org/481700 | 20:13 |
*** nicolasbock has quit IRC | 20:13 | |
openstackgerrit | melanie witt proposed openstack/nova master: Add ConsoleAuthToken object https://review.openstack.org/320063 | 20:13 |
openstackgerrit | melanie witt proposed openstack/nova master: Add periodic task to clean expired console tokens https://review.openstack.org/325381 | 20:13 |
openstackgerrit | melanie witt proposed openstack/nova master: Use ConsoleAuthToken object to generate authorizations https://review.openstack.org/325414 | 20:13 |
openstackgerrit | melanie witt proposed openstack/nova master: Convert websocketproxy to use db for token validation https://review.openstack.org/333990 | 20:13 |
* edleafe reads back | 20:15 | |
edleafe | yeah, the request isn't malformed; the system can't support it | 20:15 |
artom | mriedem, yeah, I think your guess is correct | 20:16 |
cdent | support it _right now_ | 20:16 |
mriedem | artom: ok, so that's not really any different from punting when trying to attach a volume to a shelved offloaded instance with tags | 20:16 |
mriedem | artom: i'm only asking b/c we're going to have a similar situation with multiattach volumes | 20:16 |
mriedem | trying to decide if we should support attaching a multiattach volume to a shelved offloaded instance - same with bfv, either way it might fail once we get to the compute and you're dead | 20:17 |
mriedem | at some point in the future, which no one will probably ever work on, we could do a scheduler filter for this stuff to make it pick a host which supports the thing you need | 20:18 |
artom | With the full 100% knowledge of every bit of context about multiattach, I'd vote to be consistent and refuse it for shelved offloaded | 20:18 |
mgagne | if running mitaka, is legacy v2 API still used or is it dead code? | 20:18 |
artom | Because we have no idea what host it'll end up on | 20:18 |
artom | When it's unshelved | 20:18 |
mriedem | artom: the same is true for boot from volume | 20:18 |
artom | mriedem, eh, how so? | 20:19 |
mriedem | but we allow tagged bdms with bfv | 20:19 |
mriedem | when you create a server, or unshelve a server, they both go through the scheduler to pick a host, | 20:19 |
mriedem | you have an equal chance in either scenario of picking a host that doesn't support the capability | 20:19 |
mriedem | so the fact we allow tagged bdms with bfv but not shelved offloaded instances is inconsistent | 20:20 |
artom | mriedem, how is bfv different from normal volume tagging? IIRC the api just checked service level, to see if the cloud was fully upgraded to support it | 20:20 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Attach and detach encryptors during swap_volume https://review.openstack.org/531233 | 20:20 |
mriedem | artom: ok yeah that's the one difference then, is that we check in the api when creating a server if the computes are new enough to handle tagged bdms | 20:21 |
mriedem | we could easily have done the same for shelved offloaded attach | 20:21 |
artom | mriedem, so there's this 2 x 2 matrix of stuff | 20:22 |
artom | we have tagged boot / tagged attach | 20:22 |
mriedem | that's basically what i was saying we'd do in those two cases for multiattach too - check the min compute service version and fail if computes aren't upgraded yet | 20:22 |
artom | And compute manager supports it / virt driver supports it | 20:22 |
artom | For tagged boot, we used the service level check to see if compute manager supports it | 20:22 |
artom | For tagged boot/virt driver... I forget how we handled it | 20:23 |
mriedem | we just fail in the compute | 20:23 |
mriedem | and raise BuildAbortException | 20:23 |
artom | Ah, right, hopefully rescheduling to a virt driver that supports it | 20:23 |
mriedem | no, we don't reschedule | 20:23 |
mriedem | BuildAbortException means abort, don't reschedule | 20:24 |
artom | Ah, ok | 20:24 |
artom | Yeah, that's not awesome, but no other way of doing it I guess | 20:24 |
mriedem | BuildRescheduledException means kick edleafe in the head a few times | 20:24 |
artom | And then for tagged attach... | 20:24 |
artom | I think they were all RPC calls down to the compute | 20:25 |
artom | So virt driver support was straightforward | 20:25 |
artom | Except for shelved offloaded, which was a cast | 20:25 |
*** chyka has joined #openstack-nova | 20:25 | |
artom | (Obviously) | 20:25 |
mriedem | yeah that one is easy | 20:25 |
mriedem | unless the server is shelved offloaded | 20:25 |
artom | So that one we decided to just fail in the API | 20:25 |
mriedem | in the case of shelved offloaded, we don't do anything with the compute b/c there is no compute | 20:25 |
artom | Instead of having the unshelve fail like, 3 years later | 20:25 |
mriedem | yeah, i'm saying, i think at a minimum we could do the same for shelved offloaded as we do for tagged boot which is check the service version | 20:26 |
mriedem | we don't reschedule on an unshelve failure, but we don't reschedule on a tagged boot failure either | 20:26 |
mriedem | so it's basically the same | 20:26 |
*** cdent has quit IRC | 20:26 | |
artom | One is definitely more immediate than the other | 20:26 |
artom | Though I guess if we relay the exception properly, it's not a massive deal | 20:27 |
edleafe | mriedem: so it seems that the bug I found was the result of a race | 20:27 |
edleafe | The call to retry on https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L4087 runs the retry before the 'finally:' clause right below it gets hit. | 20:27 |
*** jackie-truong has joined #openstack-nova | 20:27 | |
edleafe | If I add the call to self._revert_allocation() before L4087, the test I wrote passes. | 20:27 |
artom | Like, if nova show unshelved-instance has 'you unshelved on a virt that doesn't support tagging' somewhere in there (but more cloudy) it can work | 20:27 |
mriedem | edleafe: ah just like the thing we hit in the other patch | 20:28 |
mriedem | need to cleanup the allocation before casting to conductor | 20:28 |
mriedem | artom: we'd record a fault | 20:28 |
mriedem | that's about it | 20:28 |
edleafe | mriedem: do you have a fix for the caching scheduler? Or should I do that? | 20:29 |
artom | mriedem, is that obvious to the user? | 20:29 |
mriedem | edleafe: https://review.openstack.org/#/c/531220/ | 20:29 |
mdbooth | mriedem: My patch series removes it from swap_volume, btw, but not surprised by the others. | 20:29 |
mriedem | artom: not really, the fault's traceback is only available to admins | 20:29 |
mriedem | mdbooth: swap_volume is the one case where i don't think this matters because we don't rely on the conf.shareable attribute | 20:29 |
artom | mriedem, so given this lack of feedback to the user, it's probably better to fail fast in the API, no? | 20:29 |
mriedem | artom: maybe | 20:30 |
mriedem | i can see the argument for being consistent | 20:30 |
mdbooth | mriedem: Ok, so how can we flag this as a second connection_info wart? | 20:30 |
mriedem | mdbooth: 2nd as in device_path is #1? | 20:30 |
artom | mriedem, what's the issue we see with multiattach though? | 20:30 |
mdbooth | I mean there's this and also device_path | 20:31 |
edleafe | mriedem: ah, I just saw the test patch | 20:31 |
mdbooth | mriedem: Right, yeah. What you said :) | 20:31 |
artom | I haven't followed closely, so it may or may not be worth it to dump all the context on me for whatever my opinion is worth ;) | 20:31 |
* mdbooth just got in from a bike ride... | 20:31 | |
mriedem | artom: nothing - i'm just weighing options since we don't have the api plumbed in for all of the multiattach stuff | 20:31 |
mriedem | so i'm thinking through how to handle this, | 20:31 |
mriedem | because the backend capability checking is the same as with tagged attach | 20:31 |
mriedem | and i need to get out of my head sometimes | 20:32 |
mriedem | mdbooth: i'm not sure how to flag this | 20:32 |
artom | Are we talking about attaching to a shelved offloaded instance? And we don't know whether the eventual compute would support it? | 20:32 |
mriedem | artom: correct | 20:32 |
mdbooth | mriedem: I don't think there's a good central focal point for connection_info cruft. Which is another problem, tbh. | 20:33 |
jackie-truong | sdague: The nova-queens-blueprint-status etherpad mentioned that you needed Johns Hopkins to sync up with you on the certificate validation feature | 20:33 |
artom | mriedem, so exactly the same problem? 1. check that compute manager is new enough 2. check that virt driver supports it | 20:34 |
artom | Or is multi-attach virt-agnostic? | 20:34 |
jackie-truong | sdague: We created an etherpad (https://etherpad.openstack.org/p/queens-nova-certificate-validation) to walk through usage and testing. Let me know if you need more information or have any questions. | 20:35 |
mriedem | artom: it's the same problem | 20:35 |
mriedem | in queens, assuming we ship this code, only the libvirt driver will support multiattach | 20:36 |
artom | mriedem, one thing we talked about was scheduling with compute driver capabilities taken into account, probably through placement/resource providers | 20:37 |
artom | I don't think that work is ready yet, though | 20:37 |
artom | Then again, by the time multiattach lands, there might be talks of replacing placement with a new quantum-powered scheduler | 20:37 |
artom | ;) | 20:37 |
mriedem | sure, the ambiguity goes away if we had the CapabilitiesFilter aware of this and handling it | 20:39 |
mriedem | "request says it wants multiattach, find a host that supports multiattach" | 20:39 |
mriedem | done | 20:39 |
mriedem | anywho, i'll work on what i know we need to support for now, and get testing going, and then bikeshed on the rest | 20:40 |
openstackgerrit | Jay Pipes proposed openstack/nova master: allow compute nodes to be associated with host agg https://review.openstack.org/526753 | 20:42 |
openstackgerrit | Jay Pipes proposed openstack/nova master: Remove server group sched filter support caching https://review.openstack.org/529200 | 20:42 |
openstackgerrit | Jay Pipes proposed openstack/nova master: WIP Support aggregate affinity filters https://review.openstack.org/529201 | 20:42 |
openstackgerrit | Jay Pipes proposed openstack/nova master: get instance group's aggregate associations https://review.openstack.org/531243 | 20:43 |
*** READ10 has joined #openstack-nova | 20:43 | |
mgagne | answering myself, legacy v2 code is still used in mitaka: https://docs.openstack.org/nova/latest/reference/stable-api.html | 20:47 |
*** takashin has joined #openstack-nova | 20:48 | |
*** takashin has left #openstack-nova | 20:49 | |
*** smatzek has joined #openstack-nova | 20:50 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Use volume shared_targets to lock during attach/detach https://review.openstack.org/529695 | 20:52 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: [libvirt] Allow multiple volume attachments https://review.openstack.org/267587 | 20:52 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: WIP: [api] Allow multi-attach in compute api https://review.openstack.org/271047 | 20:52 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: WIP: Pass multiattach flag to reserve_block_device_name https://review.openstack.org/531244 | 20:52 |
mriedem | ildikov: here is the reserve_block_device_name change ^ | 20:52 |
ildikov | mriedem: cool, thanks! | 20:52 |
*** takashin has joined #openstack-nova | 20:52 | |
*** edmondsw has quit IRC | 20:54 | |
*** eharney has joined #openstack-nova | 20:56 | |
mriedem | nova meeting in 1 minute | 20:58 |
*** lyan has joined #openstack-nova | 21:00 | |
*** lyan has quit IRC | 21:02 | |
*** smatzek has quit IRC | 21:05 | |
*** smatzek has joined #openstack-nova | 21:06 | |
*** lyan has joined #openstack-nova | 21:06 | |
*** smatzek has quit IRC | 21:10 | |
*** fragatina has joined #openstack-nova | 21:11 | |
*** edmondsw has joined #openstack-nova | 21:19 | |
openstackgerrit | Merged openstack/os-traits master: Add NIC Switchdev feature https://review.openstack.org/508817 | 21:20 |
*** smcginnis has joined #openstack-nova | 21:27 | |
*** fragatina has quit IRC | 21:27 | |
*** moshele has joined #openstack-nova | 21:33 | |
*** threestrands has joined #openstack-nova | 21:35 | |
*** ab2434_ has joined #openstack-nova | 21:36 | |
jaypipes | ab2434_: ok, so a couple questions for you... | 21:39 |
ab2434_ | sure | 21:39 |
jaypipes | ab2434_: question #1: if a VM consumes a VF, why should the VF's PF information be made available to it? | 21:39 |
ab2434_ | its being used for Active & Available Inventory | 21:40 |
jaypipes | ab2434_: and? | 21:40 |
ab2434_ | mainly thats it | 21:41 |
jaypipes | ab2434_: what purpose does having the PF's PCI address serve? | 21:41 |
ab2434_ | mainly for inventory | 21:42 |
ab2434_ | in this case | 21:42 |
*** sbezverk has joined #openstack-nova | 21:43 | |
*** chyka has quit IRC | 21:43 | |
*** chyka has joined #openstack-nova | 21:44 | |
jaypipes | ab2434_: I still don't see what purpose having the PF's PCI address serves. I mean, I can kind of see an inventory management system taking inventory of all hardware on compute hosts, but that's not what the Neutron port binding profile is for. Why doesn't AAI just, you know... inventory the systems itself? | 21:45 |
ab2434_ | one other thing is SDN-F application can configure switch port interfaces based on the mapping | 21:45 |
jaypipes | ab2434_: that particular VNF should have the *PF* assigned to the VM, then, so that it can inventory the VFs on the PF itself (and control them accordingly). no? | 21:46 |
ab2434_ | true. but there is a corresponding switch port that needs to be configured | 21:47 |
jaypipes | ab2434_: and you need the PF's PCI address in order to figure out which switch port the PF is associated with? | 21:47 |
jaypipes | ab2434_: why is the switch port or tag decorating the Neutron port binding? | 21:48 |
jaypipes | why *isn't*... | 21:48 |
jaypipes | sorry | 21:48 |
ab2434_ | i don't think that's what's happening today | 21:48 |
ab2434_ | as a side note there could be multiple switches | 21:49 |
jaypipes | ab2434_: but why should Nova twist and turn to satisfy the needs of one particular VNF? :) | 21:49 |
jaypipes | ab2434_: especially when said VNF isn't *really* a VM but instead just hardware masquerading as software ;) | 21:50 |
jaypipes | but I digress... | 21:50 |
ab2434_ | well. it boils down to the mapping, having nova provide a way to map the ports | 21:50 |
ab2434_ | for the sdn-f to configure the correct switch /port for the vm | 21:50 |
mriedem | do vif tags not work here? | 21:52 |
jaypipes | ab2434_: my point is this: if the Neutron port binding can be decorated with the switch group or port tag, then that information can be passed down to Nova (in the instance PCI request) and used to identify the physical function that should be selected for the VM. what is being proposed here is the opposite of that design. the proposal here is to essentially inventory the topology and hardware for the entire deployment ahead of time and | 21:52 |
jaypipes | pre-schedule/place VMs that consume specific PCI devices on specific hardware all at once. | 21:52 |
jaypipes | it's the opposite of cloud... the opposite of on-demand. | 21:52 |
ab2434_ | ok | 21:52 |
*** esberglu has quit IRC | 21:54 | |
jaypipes | ab2434_: the world that we (Nova/OpenStack, whatever) are trying to get to is a world where the VNFs/applications *describe to Nova the resources and traits that they need* and Nova goes and finds an appropriate place for that workload to land and devices to consume. The world that NFV is trying to force on Nova is the opposite of that: a world where a VNF says to Nova "hey, put me on node X and PF Y and use VF foo. Oh, and also pin me to | 21:55 |
jaypipes | CPUs 1-12, 18-24 and NUMA cell 0." | 21:55 |
ab2434_ | yes i see your point | 21:56 |
jaypipes | ab2434_: it's this incongruence of worldviews that is at the root of the issue I think. | 21:56 |
jaypipes | ab2434_: and yes, I understand I work for Verizon (used to be AT&T) and that certain groups at Verizon think that OpenStack should just get on with the business of being an NFVI and nothing more ;) | 21:56 |
*** smatzek has joined #openstack-nova | 21:57 | |
tonyb | kashyap, mriedem: we had the beginnings of one but it bitrotted. We could look at reviving it after the PTG. | 21:58 |
mriedem | tonyb: i don't think that job is useful anymore now that we're using the pike UCA by default in devstack | 21:59 |
tonyb | mriedem: Yeah the UCA stuff was always s'posed to be a POC the real plan was to have a tandem repo where we built tagged snapshots and use that | 22:00 |
*** moshele has quit IRC | 22:02 | |
tonyb | mriedem: At times there is a reasonable Fedora$current image, if ianw_pto has that working we can use the std. virt repo for Fedora also (again that was part of the plan) | 22:03 |
tonyb | mriedem, kashyap: I'm really happy to help revive that work. QA, Neutron and infra all want something like that; I just can't really be the driver | 22:03 |
*** jose-phillips has quit IRC | 22:05 | |
jaypipes | efried: still around? | 22:05 |
*** esberglu has joined #openstack-nova | 22:06 | |
jaypipes | efried: so... this will fail a functional test: https://review.openstack.org/#/c/531243/ and I'm not entirely sure why. perhaps if you're around later you could pull that patch and have a looksie? it looks like the instance creation ain't actually working. | 22:06 |
mriedem | mdbooth: so, long-term we should probably store the multiattach value on the bdm record... | 22:07 |
mriedem | the more i think about it | 22:07 |
mriedem | just like a tag | 22:07 |
jaypipes | efried: in any case, meh, will hit it later and tomorrow but if you have any time, could use your eyeballs. | 22:07 |
jaypipes | efried: it will fail the assertion here: https://review.openstack.org/#/c/531243/1/nova/tests/functional/db/test_instance_group.py on line 351 | 22:07 |
mriedem | mdbooth: and that always tells us, the volume representing this bdm was attached and supported multiattach at that time, so treat it like that until it's detached and the bdm is deleted | 22:07 |
*** jackie-truong has quit IRC | 22:08 | |
*** jose-phillips has joined #openstack-nova | 22:08 | |
*** markmcclain has quit IRC | 22:10 | |
*** markmcclain has joined #openstack-nova | 22:11 | |
*** rcernin has joined #openstack-nova | 22:12 | |
*** smatzek has quit IRC | 22:15 | |
*** markmcclain has quit IRC | 22:19 | |
*** markmcclain has joined #openstack-nova | 22:20 | |
*** smatzek has joined #openstack-nova | 22:21 | |
efried | jaypipes Sorry, I'm back now. Catching up... | 22:22 |
*** jose-phillips has quit IRC | 22:22 | |
*** smatzek has quit IRC | 22:22 | |
*** smatzek has joined #openstack-nova | 22:23 | |
efried | jaypipes wtf is an instance group? | 22:23 |
efried | jaypipes And are these host aggregates (as opposed to placement aggregates)? | 22:25 |
hemna_ | mriedem, hey man, I added an update to bug https://bugs.launchpad.net/nova/+bug/1452641 | 22:26 |
openstack | Launchpad bug 1452641 in OpenStack Compute (nova) "Static Ceph mon IP addresses in connection_info can prevent VM startup" [Medium,Confirmed] | 22:26 |
hemna_ | mriedem, thought you might want to take a look and see what you thought. | 22:26 |
*** smatzek has quit IRC | 22:27 | |
*** jistr has quit IRC | 22:29 | |
*** jose-phillips has joined #openstack-nova | 22:30 | |
mriedem | yikes, super rbd specific code in there | 22:30 |
mriedem | hemna_: at the ptg in denver we said we'd just always refresh the connection_info when we needed it http://lists.openstack.org/pipermail/openstack-dev/2017-September/122170.html | 22:31 |
hemna_ | yah, that was just a customer's hacked patch to get it to work | 22:32 |
lyarwood | that's imagebackend rbd btw | 22:32 |
lyarwood | not connection_info volume rbd | 22:32 |
mriedem | sure. the forced refresh_connection_info=True thing would arguably be much simpler, if it actually works | 22:32 |
hemna_ | my customer tried to live migrate and it didn't get the updated info, hence his hack | 22:32 |
lyarwood | hemna_: vms/3b97914e-3f9b-410a-b3d9-6c1a83244136_disk isn't a volume however right? | 22:33 |
jaypipes | efried: an instance group is an abomination otherwise called a server group. never mind, though... I'm gonna sleep on it and tackle manana | 22:33 |
mriedem | hemna_: https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L5957 | 22:33 |
hemna_ | I'm not sure, this is a dump direct from the customer's env. | 22:34 |
mriedem | hemna_: there is a refresh_conn_info kwarg to _get_instance_block_device_info | 22:34 |
efried | jaypipes Okay. I'll play with it a bit if I get some time here. | 22:34 |
mriedem | so the idea from the ptg was just always pass refresh_conn_info=True there when we were going to do something like this | 22:34 |
hemna_ | mriedem, is that an option at nova cmdln time ? | 22:34 |
mriedem | no | 22:34 |
*** ihrachys has quit IRC | 22:34 | |
mriedem | it was a way to do this w/o any api changes | 22:35 |
*** ihrachys has joined #openstack-nova | 22:35 | |
lyarwood | mriedem: the issue isn't with the rbd volume, but the imagebackend rbd images | 22:35 |
mriedem | i don't know what that means | 22:35 |
mriedem | i thought the issue was stale rbd information in the connection_info, which nova gets from cinder | 22:35 |
lyarwood | mriedem: volumes/volume-6d04520d-0029-499c-af81-516a7ba37a54 is the volume | 22:35 |
mriedem | and uses to populate the disk config | 22:35 |
lyarwood | mriedem: not for ephemeral rbd images, it's all hard coded from the local nova.conf iirc | 22:36 |
mriedem | lyarwood: yeah i'm not talking about ephemeral, | 22:36 |
mriedem | just volumes | 22:36 |
mriedem | this https://github.com/openstack/nova/blob/master/nova/virt/libvirt/volume/net.py#L56 | 22:37 |
lyarwood | mriedem: yeah that appears to be updated in the LM flow | 22:37 |
lyarwood | mriedem: <source protocol='rbd' name='volumes/volume-6d04520d-0029-499c-af81-516a7ba37a54'> <-- this one is changed, new ips | 22:37 |
mriedem | right, because the live migration flow calls _get_volume_config | 22:37 |
mriedem | using the bdm.connection_info | 22:37 |
mriedem | which is stale | 22:37 |
hemna_ | yup | 22:38 |
lyarwood | yup | 22:38 |
mriedem | what we said in denver, | 22:38 |
mriedem | was when we start live migration, and get the bdms, we call cinder to refresh the connection info | 22:38 |
mriedem | by passing refresh_conn_info=True to https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1683 | 22:38 |
*** Apoorva has quit IRC | 22:38 | |
mriedem | that will make a new os-initialize_connection call to cinder and return the latest connection_info | 22:38 |
* lyarwood nods | 22:39 | |
mriedem | then we save that on the bdm.connection_info | 22:39 |
*** Apoorva has joined #openstack-nova | 22:39 | |
mriedem | for reasons, i've just never written that patch, | 22:39 |
mriedem | but i also have concerns about the new attach flow making that no longer work | 22:39 |
hemna_ | :( | 22:39 |
mriedem | with the new flow, nova calls cinder to get the attachment record which has the connection_info stored in the cinder db, | 22:39 |
mriedem | so i don't think it actually creates a new connection (export?) so we wouldn't refresh | 22:39 |
hemna_ | which could be stale... | 22:40 |
mriedem | this https://github.com/openstack/nova/blob/master/nova/virt/block_device.py#L557 | 22:40 |
hemna_ | in this case we just moved the stale info from nova to cinder, and cinder is stale? | 22:40 |
mriedem | hemna_: right, if cinder doesn't update that, and just returns what's in the db, it would be stale - and we'd have the same problem as using the stale nova db info | 22:40 |
mriedem | yes | 22:40 |
mriedem | exactly | 22:40 |
hemna_ | bleh | 22:40 |
*** burt has quit IRC | 22:41 | |
mriedem | yeah idk, nova could still call os-initialize_connection if we wanted to, | 22:42 |
mriedem | i don't know if that would update the attachment record in cinder or not | 22:42 |
hemna_ | so I thought the purpose of storing the conn info in cinder's db is for force delete time when nova doesn't know anything | 22:43 |
mriedem | otherwise we'd likely need like a refresh=True query parameter to GET /attachments/{id} | 22:43 |
mriedem | hemna_: that's part of it yeah | 22:43 |
hemna_ | in which case, can't cinder always refresh that if nova asks for it? | 22:43 |
hemna_ | I dunno | 22:43 |
mriedem | but how does cinder know that we're asking for a refresh? | 22:43 |
mriedem | or just a simple read-only get | 22:43 |
hemna_ | true | 22:44 |
mriedem | this is where we got talking about API changes to force a refresh | 22:44 |
*** nicolasbock has joined #openstack-nova | 22:44 | |
hemna_ | I guess this could affect other cinder backends too | 22:44 |
hemna_ | not just ceph | 22:44 |
hemna_ | multipath IPs can change in between attaching the same volume, if the cinder backend gets changed | 22:46 |
hemna_ | ok so for now, I'll just tell my customer to bounce their VMs. :( | 22:46 |
mriedem | like a retype? | 22:46 |
*** armax has joined #openstack-nova | 22:47 | |
hemna_ | well, more like a storage array loses one of its interfaces | 22:47 |
mriedem | or you mean the backend backend | 22:47 |
hemna_ | and the IP changes | 22:47 |
mriedem | ok | 22:47 |
hemna_ | or new interfaces are added | 22:47 |
hemna_ | it's kinda the same thing as a new ceph monitor IP | 22:47 |
lyarwood | hemna_: FWIW restarting the instance is the only way to update the mon IPs for the ephemerial rbd images. | 22:48 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: [placement] Add sending global request ID in put (1) https://review.openstack.org/531258 | 22:48 |
lyarwood | ephemeral* | 22:48 |
hemna_ | lyarwood, yah. they were trying to avoid that, as there could be tons of VMs to bounce | 22:48 |
mriedem | hmm, actually it looks like for live migration we do refresh the connection_info https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L5882 | 22:49 |
mriedem | in pre-live migration on the dest host | 22:49 |
lyarwood | right, the issue hemna_ reported in the bug isn't with rbd volumes :) | 22:49 |
mriedem | oh | 22:50 |
mriedem | now i get it | 22:50 |
*** jose-phillips has quit IRC | 22:51 | |
mriedem | well, i'm reminded once again that i'd like to just give up and buy a farm in austria and just live out my days there | 22:51 |
mriedem | because this whole software business isn't worth it | 22:51 |
hemna_ | lolz | 22:51 |
hemna_ | you aren't the only one.....I've been in a youtube black hole, watching dudes build log cabins... | 22:52 |
mriedem | that's the manliest thing i've heard all day | 22:52 |
*** jose-phillips has joined #openstack-nova | 22:53 | |
*** damien_r has quit IRC | 22:54 | |
*** lyan has quit IRC | 22:56 | |
*** edmondsw has quit IRC | 22:56 | |
*** edmondsw has joined #openstack-nova | 22:57 | |
hemna_ | lyarwood, should I file a separate bug then to cover the rbd image update during LM ? | 22:58 |
lyarwood | hemna_: yeah I think so, it's a different codepath etc | 23:00 |
hemna_ | ok will do | 23:00 |
lyarwood | hemna_: cool thanks, I'll take a look tomorrow | 23:00 |
*** edmondsw has quit IRC | 23:01 | |
*** smatzek has joined #openstack-nova | 23:04 | |
*** mvk has joined #openstack-nova | 23:05 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Track associated sharing RPs in report client https://review.openstack.org/526539 | 23:06 |
openstackgerrit | Eric Fried proposed openstack/nova master: Raise on API errors getting aggregates/traits https://review.openstack.org/526540 | 23:06 |
openstackgerrit | Eric Fried proposed openstack/nova master: ProviderTree.populate_from_iterable https://review.openstack.org/520756 | 23:06 |
openstackgerrit | Eric Fried proposed openstack/nova master: Track tree-associated providers in report client https://review.openstack.org/526541 | 23:06 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Scheduler[Report]Client.get_provider_tree https://review.openstack.org/521098 | 23:06 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: ComputeDriver.update_provider_tree() https://review.openstack.org/521187 | 23:06 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Use update_provider_tree from resource tracker https://review.openstack.org/520246 | 23:06 |
openstackgerrit | Eric Fried proposed openstack/nova master: Fix nits in update_provider_tree series https://review.openstack.org/531260 | 23:06 |
efried | jaypipes ^ -- and response to review comments on the way... | 23:07 |
jaypipes | coolio. | 23:07 |
hemna_ | lyarwood, https://bugs.launchpad.net/nova/+bug/1741364 | 23:08 |
openstack | Launchpad bug 1741364 in OpenStack Compute (nova) "ceph ephemeral info not updated during live migrate" [Undecided,New] | 23:08 |
*** dave-mccowan has joined #openstack-nova | 23:08 | |
efried | jaypipes See response on https://review.openstack.org/#/c/526539/ -- not sure if I've answered the right question there. | 23:11 |
*** ab2434_ has quit IRC | 23:14 | |
*** flwang has quit IRC | 23:14 | |
*** stvnoyes has quit IRC | 23:18 | |
*** hongbin has quit IRC | 23:18 | |
*** smatzek has quit IRC | 23:19 | |
*** smatzek has joined #openstack-nova | 23:19 | |
*** jistr has joined #openstack-nova | 23:19 | |
*** smatzek has quit IRC | 23:21 | |
*** smatzek has joined #openstack-nova | 23:21 | |
*** flwang has joined #openstack-nova | 23:24 | |
mnaser | does nova not set the instance status to ERROR if cinder volume detach fails? | 23:26 |
*** smatzek has quit IRC | 23:26 | |
mnaser | I have some cinder volumes which show attached servers, but the server is deleted | 23:26 |
mriedem | mnaser: correct | 23:31 |
mriedem | normal volume detach? or volume detach during instance delete? | 23:31 |
mnaser | mriedem: volume detach during instance delete | 23:31 |
*** dave-mccowan has quit IRC | 23:31 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: [libvirt] Allow multiple volume attachments https://review.openstack.org/267587 | 23:32 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Pass multiattach flag to reserve_block_device_name https://review.openstack.org/531244 | 23:32 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: WIP: [api] Allow multi-attach in compute api https://review.openstack.org/271047 | 23:32 |
mriedem | mnaser: yeah that happens here https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2391 | 23:32 |
mnaser | (boot from volume instance) | 23:32 |
mriedem | if it fails we log something but keep going | 23:32 |
mriedem | because we've already destroyed the guest on the hypervisor | 23:32 |
mriedem | this is where we try to delete the volume if delete_on_termination=True https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2440 | 23:33 |
mnaser | hmmm | 23:33 |
mnaser | because now i end up with volumes with attachments (for whatever reason) and there doesn't seem to be a straight forward way of clearing that out | 23:34 |
mriedem | force detach in cinder? | 23:34 |
mnaser | i see force delete but not force detach | 23:35 |
mnaser | (force delete is ok for me though) | 23:35 |
mriedem | that might be what i was thinking of | 23:36 |
mriedem | smcginnis: knows | 23:36 |
openstackgerrit | Andreas Karis proposed openstack/nova master: Add debug output for selected page size https://review.openstack.org/530662 | 23:36 |
mnaser | Delete for volume 973d5a3d-07f3-4d05-b62c-8dfe746d298c failed: Invalid volume: Volume must not be migrating, attached, belong to a group, have snapshots or be disassociated from snapshots after volume transfer. (HTTP 400) (Request-ID: req-ecfad96c-8125-47e4-8e25-9a9e892139aa) | 23:36 |
mnaser | i guess this is in cinder-land | 23:37 |
mnaser | (cant force delete an attached volume) | 23:37 |
mriedem | mnaser: https://developer.openstack.org/api-ref/block-storage/v2/#force-detach-volume | 23:37 |
mordred | wow. y'all are having all the fun | 23:38 |
*** Eran_Kuris has quit IRC | 23:38 | |
mriedem | mordred: this channel is a barrel of laughs all day every day | 23:38 |
mordred | mriedem: that's what I tell people | 23:38 |
mordred | "looking for a barrel of laughs? go check out #openstack-nova!", I tell them | 23:39 |
*** mtreinish has quit IRC | 23:39 | |
mnaser | mriedem: interesting, i could probably even normal detach, but i wonder if the cinder cli client has it | 23:40 |
mnaser | hopefully scaling up our cinder api endpoints should mean that we don't see this again | 23:41 |
mriedem | i don't see a force detach command in cinderclient | 23:41 |
*** mtreinish has joined #openstack-nova | 23:42 | |
SamYaple | i dont think there is one... | 23:46 |
SamYaple | inconsistencies between nova and cinder ive always had to go to the db to solve | 23:46 |
mriedem | there is an api | 23:47 |
mriedem | so curl should work | 23:47 |
mriedem | should also be easy enough to add a cli for this in cinderclient | 23:47 |
SamYaple | indeed | 23:47 |
*** Eran_Kuris has joined #openstack-nova | 23:51 | |
*** jistr has quit IRC | 23:54 | |
*** lyan has joined #openstack-nova | 23:54 | |
*** tetsuro has joined #openstack-nova | 23:54 | |
*** jistr has joined #openstack-nova | 23:55 | |
*** liangy has joined #openstack-nova | 23:56 | |
*** lyan has quit IRC | 23:58 | |
*** nicolasbock has quit IRC | 23:58 |
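
Editor's sketch, not part of the channel log: around 23:47 mriedem notes that a force-detach API exists even though cinderclient has no command for it, so a raw HTTP call should work. The Python below builds (but does not send) such a request against the v2 `os-force_detach` volume action linked at 23:37. The endpoint host, project ID, and token are placeholders, the helper name `build_force_detach_request` is hypothetical, and whether an empty action body is accepted may depend on the Cinder release, so treat this as an illustration under those assumptions rather than a verified procedure. The action is typically restricted to admins by policy.

```python
import json
import urllib.request


def build_force_detach_request(endpoint, project_id, volume_id, token,
                               attachment_id=None):
    """Build (but do not send) a force-detach action request for one volume.

    Hypothetical helper: the URL layout follows the v2 block-storage
    api-ref entry for "force detach volume"; all arguments are supplied
    by the caller and nothing here is discovered automatically.
    """
    url = "{}/v2/{}/volumes/{}/action".format(
        endpoint.rstrip("/"), project_id, volume_id)
    action = {}
    if attachment_id is not None:
        # Optional: limit the detach to a single attachment record.
        action["attachment_id"] = attachment_id
    body = json.dumps({"os-force_detach": action}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json",
                 "X-Auth-Token": token})


if __name__ == "__main__":
    # Placeholder endpoint, project and token; the volume ID is the one
    # mnaser pasted at 23:36.
    req = build_force_detach_request(
        "http://cinder.example.com:8776", "PROJECT_ID",
        "973d5a3d-07f3-4d05-b62c-8dfe746d298c", "TOKEN")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL and body before touching a production Cinder endpoint.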