*** brinzhang has quit IRC | 00:01 | |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Add emulated TPM support to Nova https://review.opendev.org/631363 | 00:12 |
openstackgerrit | Eric Fried proposed openstack/nova master: Add support for resize and cold migration of emulated TPM files https://review.opendev.org/639934 | 00:12 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: vTPM request_filter https://review.opendev.org/678325 | 00:12 |
*** ociuhandu has joined #openstack-nova | 00:13 | |
*** ociuhandu has quit IRC | 00:18 | |
*** TxGirlGeek has quit IRC | 00:24 | |
*** mlavalle has quit IRC | 00:32 | |
*** macz has quit IRC | 00:33 | |
*** brinzhang has joined #openstack-nova | 00:53 | |
*** brinzhang_ has quit IRC | 00:57 | |
*** Liang__ has joined #openstack-nova | 01:01 | |
*** brinzhang_ has joined #openstack-nova | 01:02 | |
*** brinzhang has quit IRC | 01:05 | |
*** brinzhang has joined #openstack-nova | 01:08 | |
*** dswebb has quit IRC | 01:09 | |
*** brinzhang has quit IRC | 01:10 | |
*** brinzhang has joined #openstack-nova | 01:10 | |
*** ociuhandu has joined #openstack-nova | 01:16 | |
melwitt | gmann: johnthetubaguy has updated the spec for your comments, if you could take another look when you get a chance https://review.opendev.org/602201 | 01:21 |
*** ociuhandu has quit IRC | 01:24 | |
*** ociuhandu has joined #openstack-nova | 01:36 | |
*** gyee has quit IRC | 01:41 | |
*** ociuhandu has quit IRC | 01:46 | |
*** ileixe has quit IRC | 01:57 | |
*** ileixe has joined #openstack-nova | 01:59 | |
*** ileixe has quit IRC | 02:00 | |
*** ileixe has joined #openstack-nova | 02:00 | |
*** macz has joined #openstack-nova | 02:01 | |
*** macz has quit IRC | 02:06 | |
*** takashin has joined #openstack-nova | 02:08 | |
*** TxGirlGeek has joined #openstack-nova | 02:29 | |
*** brault has quit IRC | 02:32 | |
*** brault has joined #openstack-nova | 02:32 | |
*** mkrai has joined #openstack-nova | 02:51 | |
*** abaindur has joined #openstack-nova | 02:58 | |
*** brinzhang has quit IRC | 02:59 | |
*** brinzhang_ has quit IRC | 03:00 | |
*** ccamacho has quit IRC | 03:13 | |
*** ociuhandu has joined #openstack-nova | 03:17 | |
*** ociuhandu has quit IRC | 03:22 | |
*** tbachman has quit IRC | 03:25 | |
*** zhanglong has joined #openstack-nova | 03:31 | |
*** bhagyashris has joined #openstack-nova | 03:42 | |
*** udesale has joined #openstack-nova | 03:42 | |
*** liuyulong has quit IRC | 03:46 | |
*** ileixe has quit IRC | 03:46 | |
*** ileixe has joined #openstack-nova | 03:50 | |
*** awalende has joined #openstack-nova | 03:54 | |
*** zhubx has quit IRC | 03:59 | |
*** awalende has quit IRC | 03:59 | |
*** boxiang has joined #openstack-nova | 03:59 | |
*** dave-mccowan has quit IRC | 04:02 | |
*** dave-mccowan has joined #openstack-nova | 04:12 | |
*** yaawang has quit IRC | 04:15 | |
*** yaawang has joined #openstack-nova | 04:16 | |
*** tkajinam has quit IRC | 04:26 | |
*** tkajinam has joined #openstack-nova | 04:33 | |
*** TxGirlGeek has quit IRC | 04:36 | |
*** ociuhandu has joined #openstack-nova | 04:43 | |
*** bhagyashris has quit IRC | 04:44 | |
*** ociuhandu has quit IRC | 04:48 | |
*** takashin has left #openstack-nova | 04:51 | |
*** dave-mccowan has quit IRC | 05:05 | |
openstackgerrit | ya.wang proposed openstack/nova-specs master: Add "live migration without performance impact" spec. https://review.opendev.org/693655 | 05:06 |
*** tkajinam_ has joined #openstack-nova | 05:08 | |
*** tkajinam has quit IRC | 05:11 | |
*** bhagyashris has joined #openstack-nova | 05:18 | |
*** ociuhandu has joined #openstack-nova | 05:31 | |
*** tkajinam_ has quit IRC | 05:34 | |
*** ociuhandu has quit IRC | 05:35 | |
*** tkajinam has joined #openstack-nova | 05:36 | |
*** tkajinam has quit IRC | 06:03 | |
*** tkajinam has joined #openstack-nova | 06:05 | |
*** abaindur has quit IRC | 06:16 | |
*** tkajinam has quit IRC | 06:30 | |
*** tkajinam has joined #openstack-nova | 06:31 | |
*** tkajinam has quit IRC | 06:31 | |
*** tbachman has joined #openstack-nova | 06:34 | |
*** dtantsur|afk is now known as dtantsur | 06:34 | |
*** tbachman_ has joined #openstack-nova | 06:35 | |
*** tbachman has quit IRC | 06:39 | |
*** tbachman_ is now known as tbachman | 06:39 | |
*** Luzi has joined #openstack-nova | 06:40 | |
*** slaweq has quit IRC | 06:45 | |
*** sridharg has joined #openstack-nova | 06:46 | |
*** tbachman has quit IRC | 06:53 | |
*** zhanglong has quit IRC | 07:00 | |
*** do3meli has joined #openstack-nova | 07:01 | |
*** tkajinam has joined #openstack-nova | 07:01 | |
*** jangutter has joined #openstack-nova | 07:02 | |
*** dpawlik has joined #openstack-nova | 07:05 | |
*** chenhaw has joined #openstack-nova | 07:07 | |
*** dpawlik has quit IRC | 07:10 | |
*** dklyle has quit IRC | 07:19 | |
*** dklyle has joined #openstack-nova | 07:19 | |
*** dpawlik has joined #openstack-nova | 07:20 | |
*** igordc has quit IRC | 07:25 | |
*** do3meli has left #openstack-nova | 07:27 | |
*** bhagyashris has quit IRC | 07:28 | |
*** ociuhandu has joined #openstack-nova | 07:35 | |
*** ociuhandu has quit IRC | 07:41 | |
*** damien_r has joined #openstack-nova | 07:52 | |
*** mmethot has quit IRC | 07:56 | |
*** trident has quit IRC | 07:57 | |
*** tesseract has joined #openstack-nova | 08:00 | |
*** maciejjozefczyk has joined #openstack-nova | 08:02 | |
*** trident has joined #openstack-nova | 08:06 | |
*** bhagyashris has joined #openstack-nova | 08:15 | |
*** awalende has joined #openstack-nova | 08:16 | |
*** rpittau|afk is now known as rpittau | 08:17 | |
*** ivve has joined #openstack-nova | 08:19 | |
*** yaawang has quit IRC | 08:24 | |
*** yaawang has joined #openstack-nova | 08:24 | |
*** ralonsoh has joined #openstack-nova | 08:29 | |
*** dpawlik has quit IRC | 08:32 | |
*** tkajinam has quit IRC | 08:35 | |
*** xek_ has joined #openstack-nova | 08:38 | |
*** links has joined #openstack-nova | 08:40 | |
*** yan0s has joined #openstack-nova | 08:40 | |
*** zhanglong has joined #openstack-nova | 08:44 | |
*** dpawlik has joined #openstack-nova | 08:47 | |
*** dswebb has joined #openstack-nova | 08:49 | |
*** dswebb has left #openstack-nova | 08:49 | |
*** slaweq has joined #openstack-nova | 08:52 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Wire up a force disconnect_volume flag https://review.opendev.org/584849 | 08:53 |
*** damien_r has quit IRC | 08:54 | |
*** ociuhandu has joined #openstack-nova | 09:03 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/train: Use admin neutron client to query ports for binding https://review.opendev.org/694013 | 09:06 |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/train: Use admin neutron client to gather port resource requests https://review.opendev.org/694015 | 09:07 |
*** ociuhandu has quit IRC | 09:09 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/train: Use admin neutron client to gather port resource requests https://review.opendev.org/694015 | 09:10 |
*** ociuhandu has joined #openstack-nova | 09:10 | |
*** ccamacho has joined #openstack-nova | 09:12 | |
*** jistr has quit IRC | 09:19 | |
*** jistr has joined #openstack-nova | 09:20 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/train: Use admin neutron client to see if instance has qos ports https://review.opendev.org/694018 | 09:21 |
*** ociuhandu has quit IRC | 09:25 | |
*** ociuhandu has joined #openstack-nova | 09:26 | |
stephenfin | sean-k-mooney: Approved https://review.opendev.org/#/c/683174 FYI | 09:27 |
*** ociuhandu has quit IRC | 09:29 | |
*** jistr has quit IRC | 09:29 | |
*** ociuhandu has joined #openstack-nova | 09:30 | |
*** tssurya has joined #openstack-nova | 09:30 | |
*** ociuhandu has quit IRC | 09:31 | |
*** ociuhandu has joined #openstack-nova | 09:31 | |
*** ccamacho has quit IRC | 09:32 | |
*** jistr has joined #openstack-nova | 09:36 | |
*** ccamacho has joined #openstack-nova | 09:36 | |
openstackgerrit | Merged openstack/nova-specs master: Add spec for VM-scoped SR-IOV NUMA affinity https://review.opendev.org/683174 | 09:37 |
*** ociuhandu has quit IRC | 09:37 | |
*** ociuhandu has joined #openstack-nova | 09:37 | |
*** ileixe has quit IRC | 09:47 | |
*** ileixe has joined #openstack-nova | 09:48 | |
*** zhanglong has quit IRC | 09:50 | |
*** Liang__ has quit IRC | 09:52 | |
*** priteau has joined #openstack-nova | 10:05 | |
*** shilpasd has joined #openstack-nova | 10:05 | |
*** gibi is now known as gibi_off | 10:11 | |
* gibi_off is on two days of internal conference | 10:11 | |
*** ociuhandu has quit IRC | 10:13 | |
*** ociuhandu has joined #openstack-nova | 10:14 | |
*** ileixe has quit IRC | 10:16 | |
*** ociuhandu has quit IRC | 10:19 | |
*** zbr has quit IRC | 10:21 | |
*** brinzhang has joined #openstack-nova | 10:24 | |
openstackgerrit | Lee Yarwood proposed openstack/os-traits master: WIP Add COMPUTE_RESCUE_STABLE_DEVICES and COMPUTE_RESCUE_BFV traits https://review.opendev.org/694033 | 10:25 |
*** brinzhang_ has joined #openstack-nova | 10:29 | |
*** brinzhang has quit IRC | 10:31 | |
*** brinzhang has joined #openstack-nova | 10:42 | |
*** chenhaw has quit IRC | 10:44 | |
*** brinzhang_ has quit IRC | 10:45 | |
*** zbr has joined #openstack-nova | 10:48 | |
*** ralonsoh has quit IRC | 11:07 | |
*** ralonsoh has joined #openstack-nova | 11:09 | |
*** ociuhandu has joined #openstack-nova | 11:14 | |
*** udesale has quit IRC | 11:15 | |
*** ociuhandu has quit IRC | 11:18 | |
*** purplerbot has quit IRC | 11:23 | |
*** brinzhang_ has joined #openstack-nova | 11:29 | |
*** brinzhang has quit IRC | 11:32 | |
*** ociuhandu has joined #openstack-nova | 11:36 | |
*** bhagyashris has quit IRC | 11:59 | |
sean-k-mooney | stephenfin: when you wrote the original alias based numa affinity feature did you add docs? i dont see it referenced here https://docs.openstack.org/nova/train/admin/pci-passthrough.html | 12:01 |
sean-k-mooney | there is https://docs.openstack.org/nova/train/configuration/config.html#pci.alias | 12:02 |
sean-k-mooney | which mentions the numa_policy field but does not explain it | 12:02 |
*** kaisers has quit IRC | 12:02 | |
*** henriqueof1 has quit IRC | 12:04 | |
*** kaisers has joined #openstack-nova | 12:05 | |
*** dpawlik has quit IRC | 12:06 | |
*** dpawlik has joined #openstack-nova | 12:12 | |
*** dpawlik has quit IRC | 12:14 | |
*** purplerbot has joined #openstack-nova | 12:14 | |
*** dpawlik has joined #openstack-nova | 12:17 | |
*** dpawlik has quit IRC | 12:17 | |
*** priteau has quit IRC | 12:19 | |
*** mkrai has quit IRC | 12:21 | |
*** mkrai has joined #openstack-nova | 12:21 | |
*** priteau has joined #openstack-nova | 12:22 | |
*** dpawlik has joined #openstack-nova | 12:22 | |
*** brinzhang has joined #openstack-nova | 12:25 | |
*** dpawlik has quit IRC | 12:26 | |
*** dpawlik has joined #openstack-nova | 12:27 | |
*** brinzhang_ has quit IRC | 12:28 | |
*** ociuhandu has quit IRC | 12:35 | |
*** ociuhandu has joined #openstack-nova | 12:35 | |
*** ociuhandu has quit IRC | 12:36 | |
*** ociuhandu has joined #openstack-nova | 12:37 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/train: Use admin neutron client to see if instance has qos ports https://review.opendev.org/694018 | 12:45 |
gibi_off | elod: ^^ additional diff was needed to make the backport work properly due to feature merged in ussuri | 12:47 |
*** shilpasd has quit IRC | 12:48 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Mask the token used to allow access to consoles https://review.opendev.org/220622 | 12:50 |
*** brinzhang_ has joined #openstack-nova | 12:53 | |
*** damien_r has joined #openstack-nova | 12:53 | |
*** rcernin has quit IRC | 12:54 | |
openstackgerrit | Lee Yarwood proposed openstack/nova-specs master: Virtual instance rescue with stable disk devices https://review.opendev.org/693849 | 12:55 |
openstackgerrit | Lee Yarwood proposed openstack/nova-specs master: Boot from volume instance rescue https://review.opendev.org/694063 | 12:55 |
*** brinzhang has quit IRC | 12:55 | |
*** dtantsur is now known as dtantsur|afk | 12:57 | |
elod | gibi_off: thanks, i'm not there yet, but will look into it :) | 13:01 |
stephenfin | sean-k-mooney: Think I just documented it in the config option? | 13:02 |
stephenfin | though we should really have it documented in doc/source/admin/pci-passthrough.rst | 13:02 |
stephenfin | I can do that now | 13:02 |
stephenfin | good way to get used to having working internet again | 13:03 |
*** tbachman has joined #openstack-nova | 13:03 | |
* stephenfin has no idea how alex_xu et al manage :O | 13:04 | |
*** tbachman has quit IRC | 13:08 | |
*** ociuhandu has quit IRC | 13:08 | |
*** ociuhandu has joined #openstack-nova | 13:09 | |
*** brinzhang has joined #openstack-nova | 13:09 | |
*** brinzhang has quit IRC | 13:10 | |
*** brinzhang_ has quit IRC | 13:11 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Remove functional test specific nova code https://review.opendev.org/683609 | 13:12 |
*** tbachman has joined #openstack-nova | 13:14 | |
*** ociuhandu has quit IRC | 13:14 | |
*** dpawlik has quit IRC | 13:24 | |
*** mkrai has quit IRC | 13:28 | |
*** mkrai_ has joined #openstack-nova | 13:28 | |
*** tbachman has quit IRC | 13:30 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Remove functional test specific nova code https://review.opendev.org/683609 | 13:31 |
efried | stephenfin: alex_xu already has to deal with Intel's firewall, the gfwoc is nbd. | 13:40 |
efried | I work around it by using a personal computer for real work. | 13:41 |
bauzas | I wonder whether we should propose some Chinese IRC server | 13:41 |
efried | bauzas: isn't that called WeChat? | 13:41 |
bauzas | efried: sure, but WeChat doesn't support IRC | 13:41 |
efried | oh, you mean a server for actual IRC, got it. | 13:41 |
bauzas | having the same client would be nice | 13:41 |
efried | I'm not sure the technology is the problem; it's the connectivity. | 13:42 |
bauzas | like, I could use this Chinese server plus the Freenode one | 13:42 |
efried | isn't the problem that the bits can't get in & out? | 13:42 |
bauzas | of course, but I'm pretty sure an IRC server would be supported by the chinese government :) | 13:42 |
bauzas | I mean, when running in a chinese cloud ;) | 13:42 |
bauzas | alex_xu: FWIW, I'll still continue to use WeChat | 13:43 |
*** nweinber has joined #openstack-nova | 13:47 | |
*** mdbooth has joined #openstack-nova | 13:52 | |
*** mriedem has joined #openstack-nova | 13:53 | |
kashyap | efried: Hi, when you get a moment, is a Blueprint required for this: https://bugs.launchpad.net/nova/+bug/1852437 (Allow ability to disable individual CPU features via `cpu_model_extra_flags`) | 13:54 |
openstack | Launchpad bug 1852437 in OpenStack Compute (nova) "Allow ability to disable individual CPU features via `cpu_model_extra_flags`" [Undecided,New] | 13:54 |
*** dpawlik has joined #openstack-nova | 13:54 | |
*** davee_ has joined #openstack-nova | 13:55 | |
efried | kashyap: at a glance, a blueprint seems entirely necessary. A spec, not as sure. Perhaps put it on tomorrow's meeting agenda for discussion? | 13:55 |
kashyap | efried: Spec / BP seems like an overkill, IMHO. But sure, can discuss tomm | 13:56 |
*** tbachman has joined #openstack-nova | 13:56 | |
efried | we're talking about enhancing syntax of a conf option in what seems like a nontrivial way. | 13:56 |
kashyap | It essentially aims to implement what I stated at the end of this commit in parentheses: https://opendev.org/openstack/nova/commit/cc27a2007f314 | 13:56 |
*** eharney has joined #openstack-nova | 13:56 | |
kashyap | efried: Hmm, phrased that way... | 13:56 |
kashyap | efried: But what do you think of it? Is there a better way you can think of than the +/- notion? | 13:57 |
*** mmethot has joined #openstack-nova | 13:57 | |
efried | kashyap: no, that seems reasonable, just seems like a thing that ought to have a "design" somewhere written down that we can agree on. Just having it in a RFE bug might be sufficient, but really that's what blueprints are for. And it's the kind of thing I would not expect to backport. | 13:59 |
*** tbachman has quit IRC | 14:00 | |
*** tbachman has joined #openstack-nova | 14:00 | |
kashyap | efried: Sure, can file a simple BluePrint | 14:01 |
efried | thanks kashyap. | 14:01 |
kashyap | I first wanted to file it, but went the bug route, thinking it is a "simple idea". :D | 14:02 |
*** tbachman has quit IRC | 14:09 | |
kashyap | Done: https://blueprints.launchpad.net/nova/+spec/allow-disabling-cpu-flags | 14:09 |
*** dviroel has joined #openstack-nova | 14:13 | |
*** amodi has quit IRC | 14:14 | |
*** mkrai_ has quit IRC | 14:18 | |
kashyap | mriedem: Maybe we can just close this, as we'll be tracking it in the Blueprint? - https://bugs.launchpad.net/nova/+bug/1852437 | 14:19 |
openstack | Launchpad bug 1852437 in OpenStack Compute (nova) "Allow ability to disable individual CPU features via `cpu_model_extra_flags`" [Wishlist,New] | 14:19 |
* kashyap marked it as "Invalid" / "Wishlist" | 14:20 | |
mriedem | wfm | 14:20 |
*** links has quit IRC | 14:30 | |
*** tbachman has joined #openstack-nova | 14:36 | |
efried | mriedem: what's the easiest/best way to "discover" a compute node's UUID (the one that'll match the placement root RP)? | 14:43 |
efried | (I'm trying to middleman here, not actually sure if we know anything about the node beforehand) | 14:44 |
*** ociuhandu has joined #openstack-nova | 14:44 | |
*** dpawlik has quit IRC | 14:48 | |
mriedem | you mean to get the ComputeNode object? | 14:48 |
mriedem | for non-ironic nodes the compute node uuid is randomly generated when the record is created the first time | 14:49 |
mriedem | to look up the computenode record, you want to use the host/nodename | 14:49 |
mriedem | if you're on compute and it's not ironic, you can just use CONF.host | 14:49 |
mriedem | host == node for non-ironic | 14:49 |
mriedem | otherwise you get the nodenames from the driver | 14:49 |
mriedem | see ComputeManager.update_available_resource | 14:50 |
mriedem | or you're trying to do something outside of nova to try to expose a vulnerability? | 14:50 |
bauzas | efried: yeah, you need to know the (host, node) tuple | 14:51 |
bauzas | mriedem: and no, AFAICR, for some virt drivers, node != host | 14:51 |
mriedem | bauzas: yeah, ironic | 14:52 |
mriedem | i said that | 14:52 |
bauzas | mriedem: not only for ironic | 14:52 |
mriedem | which ones? | 14:52 |
mriedem | if you're thinking vcenter, you're thinking of kilo era | 14:52 |
bauzas | I don't remember, lemme look | 14:52 |
*** Luzi has quit IRC | 14:52 | |
*** tesseract has quit IRC | 14:52 | |
efried | I think I can work with this, thanks. | 14:53 |
bauzas | I do wonder for HyperV | 14:53 |
mriedem | hyperv has only 1 node per host | 14:53 |
mriedem | http://cloudbase-ci.com/nova/693937/1/windows/logs/n-h2-693937-1/nova-compute.log.gz | 14:54 |
mriedem | 2019-11-13 05:46:51.769 5012 103574784 GreenThread-1 INFO nova.compute.resource_tracker [req-320b7c3d-8917-4c50-b48e-18dd405c7877 - - - - -] Compute node record created for n-h2-693937-1:n-h2-693937-1 with uuid: 77e9e8d8-bba6-4aba-81f9-5e9f69ca1db9 | 14:54 |
*** boxiang has quit IRC | 14:54 | |
*** tssurya has quit IRC | 14:54 | |
*** boxiang has joined #openstack-nova | 14:55 | |
mriedem | ironic is the only weirdo | 14:55 |
*** tesseract has joined #openstack-nova | 14:55 | |
bauzas | anyway, looks like you're right | 14:56 |
bauzas | https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L9189 | 14:57 |
*** liuyulong has joined #openstack-nova | 14:58 | |
*** lpetrut has joined #openstack-nova | 14:58 | |
*** usr2033 has joined #openstack-nova | 14:58 | |
usr2033 | Hi, can i pxe boot nova/kvm instances? | 14:59 |
mriedem | note that https://docs.openstack.org/nova/latest/admin/configuration/hypervisors.html doesn't have a subpage for ironic as a compute driver in nova - that's probably a decent sized gap given all of the edge cases we could describe with ironic in nova as a compute driver | 14:59 |
openstackgerrit | Merged openstack/python-novaclient master: Add minor version [21] to the test_versions https://review.opendev.org/688599 | 14:59 |
mriedem | not to mention scaling issues when using the ironic driver to manage lots of nodes from a single compute service, re the ML threads on RT perf issues | 15:00 |
mriedem | usr2033: not natively no | 15:02 |
mriedem | see https://serverfault.com/questions/469479/does-nova-support-pxe-boot | 15:02 |
mriedem | usr2033: you may be interested in https://openstack-virtual-baremetal.readthedocs.io/en/latest/index.html | 15:03 |
*** ociuhandu has quit IRC | 15:04 | |
efried | I may be choppy today, going to be trying to work remotely (like, more remotely than usual). | 15:06 |
*** ociuhandu has joined #openstack-nova | 15:06 | |
* efried travels... | 15:06 | |
*** efried has quit IRC | 15:06 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'os-consoles' API https://review.opendev.org/687907 | 15:09 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'nova-console' service, 'os-consoles' API https://review.opendev.org/687908 | 15:09 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'nova-xvpvncproxy' https://review.opendev.org/687909 | 15:09 |
*** ociuhandu has quit IRC | 15:11 | |
*** liuyulong has quit IRC | 15:11 | |
openstackgerrit | Merged openstack/nova master: Remove TODOs around claim_resources_on_destination https://review.opendev.org/693635 | 15:15 |
mriedem | bauzas: https://bugs.launchpad.net/nova/+bug/1852446 | 15:15 |
openstack | Launchpad bug 1852446 in OpenStack Compute (nova) "Hypervisors in nova - no subpage details for ironic" [Undecided,New] | 15:15 |
bauzas | ack, good point | 15:16 |
* bauzas needs to disappear for 45 mins | 15:16 | |
stephenfin | mriedem: It's not urgent, but if you can rebase https://review.opendev.org/#/c/693425/ today I'm happy to push it through | 15:18 |
mriedem | ack, let me rebase the entire cross-cell series first quick so i can destroy the gate | 15:19 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Make API always RPC cast to conductor for resize/migrate https://review.opendev.org/693937 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Filter duplicates from compute API get_migrations_sorted() https://review.opendev.org/636224 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Start functional testing for cross-cell resize https://review.opendev.org/636253 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Handle target host cross-cell cold migration in conductor https://review.opendev.org/642591 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Validate image/create during cross-cell resize functional testing https://review.opendev.org/642592 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add zones wrinkle to TestMultiCellMigrate https://review.opendev.org/643450 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add negative test for cross-cell finish_resize failing https://review.opendev.org/643451 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add negative test for prep_snapshot_based_resize_at_source failing https://review.opendev.org/669013 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add confirm_snapshot_based_resize_at_source compute method https://review.opendev.org/637058 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add ConfirmResizeTask https://review.opendev.org/637070 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add confirm_snapshot_based_resize conductor RPC method https://review.opendev.org/637075 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Confirm cross-cell resize from the API https://review.opendev.org/637316 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add revert_snapshot_based_resize_at_dest compute method https://review.opendev.org/637630 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Deal with cross-cell resize in _remove_deleted_instances_allocations https://review.opendev.org/639453 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add finish_revert_snapshot_based_resize_at_source compute method https://review.opendev.org/637647 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: WIP: Add RevertResizeTask https://review.opendev.org/638046 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add revert_snapshot_based_resize conductor RPC method https://review.opendev.org/638047 | 15:21 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Revert cross-cell resize from the API https://review.opendev.org/638048 | 15:21 |
mriedem | dansmith: gibi_off: https://review.opendev.org/#/c/693937/ is fixed now | 15:22 |
mriedem | py2 vs py3 weirdness | 15:22 |
*** mkrai_ has joined #openstack-nova | 15:26 | |
openstackgerrit | Merged openstack/nova master: Remove now invalid TODO from ComputeManager._confirm_resize https://review.opendev.org/693427 | 15:27 |
openstackgerrit | Merged openstack/nova master: Use ListOfUUIDField from oslo.versionedobjects https://review.opendev.org/693258 | 15:27 |
openstackgerrit | Merged openstack/nova master: Add known limitation about resize not resizing ephemeral disks https://review.opendev.org/691915 | 15:27 |
openstackgerrit | Merged openstack/nova master: api-ref: re-work resize action post-conditions https://review.opendev.org/691918 | 15:27 |
openstackgerrit | Merged openstack/nova master: Provide a better error when _verify_response hits a TypeError https://review.opendev.org/693042 | 15:27 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Use named kwargs in compute.API.resize https://review.opendev.org/693425 | 15:30 |
mriedem | stephenfin: done ^ | 15:30 |
*** awalende has quit IRC | 15:33 | |
*** awalende has joined #openstack-nova | 15:33 | |
openstackgerrit | Elod Illes proposed openstack/nova stable/pike: cleanup evacuated instances not on hypervisor https://review.opendev.org/687912 | 15:36 |
mdbooth | lyarwood: Re https://review.opendev.org/#/c/694033/ What did you think about combining these 2 traits? | 15:36 |
mdbooth | IIUC their window of usefulness is limited to the period between stable disks and BFV rescue landing, right? | 15:37 |
mdbooth | COMPUTE_RESCUE_BFV implies COMPUTE_RESCUE_STABLE_DEVICES | 15:37 |
*** awalende has quit IRC | 15:38 | |
lyarwood | mdbooth: for the libvirt driver implementation yeah | 15:39 |
* mdbooth suggests: 1) nobody else is going to implement this 2) if they did, there's no reason they couldn't have the same semantics. | 15:40 | |
sean-k-mooney | i think having two traits is fine | 15:40 |
sean-k-mooney | glad to see you put them under compute :) | 15:41 |
mdbooth | lyarwood: As I said in the spec, though, if this discussion gets in the way of getting this done, it's not worth it. | 15:41 |
* mdbooth is +1 on this with or without 2 traits. | 15:41 | |
mdbooth | But I think 1 trait would be better. | 15:41 |
*** JamesBenson has joined #openstack-nova | 15:41 | |
sean-k-mooney | its only better if we support rescue for BFV in the same step | 15:42 |
sean-k-mooney | or release | 15:42 |
mdbooth | sean-k-mooney: Right, but the change from stable disks to rescue BFV is trivial. | 15:42 |
mdbooth | IIRC it's just removing a check in the api which prevents it. | 15:42 |
sean-k-mooney | sure although one point. you cant assume that just using a usb device will mean it wont reorder the disk in all cases | 15:43 |
sean-k-mooney | its going to be true 99% of the time | 15:43 |
mdbooth | sean-k-mooney: That's unrelated. | 15:43 |
mdbooth | (True, but unrelated) | 15:44 |
sean-k-mooney | well its part of the premise of the stable_device rescue spec | 15:44 |
sean-k-mooney | ya | 15:44 |
*** TxGirlGeek has joined #openstack-nova | 15:44 | |
lyarwood | it's not going to reorder the physical layout, that's all we can guarantee | 15:44 |
sean-k-mooney | just said i would mention it since you can use hw_disk_bus to usb already https://github.com/openstack/glance/blob/master/etc/metadefs/compute-libvirt-image.json#L39 | 15:44 |
lyarwood | everything else within the guestOS is out of our control | 15:44 |
mdbooth | Right. It's definitely way better than what we do now in all cases. | 15:45 |
mdbooth | lyarwood: Don't suppose you still have a link to the old patches kicking about, do you? | 15:45 |
*** jangutter has quit IRC | 15:45 | |
lyarwood | yeah I'm working through a rebase now | 15:45 |
lyarwood | mdbooth: https://review.opendev.org/#/q/topic:bp/virt-rescue-stable-disk-devices | 15:45 |
lyarwood | mdbooth: hope to have it posted later this evening | 15:46 |
mdbooth | lyarwood: Cool. Did we discuss switching it on unconditionally in a new microversion, btw? | 15:47 |
mdbooth | IIRC there's a new microversion involved anyway. | 15:47 |
lyarwood | mdbooth: that's what I'm suggesting in the follow up spec at the moment | 15:48 |
lyarwood | mdbooth: well, with the trait | 15:48 |
lyarwood | mdbooth: so it's not unconditional | 15:48 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: api-ref: re-work migrate action post-conditions https://review.opendev.org/694103 | 15:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Fix review link. https://review.opendev.org/689612 | 15:54 |
openstackgerrit | Elod Illes proposed openstack/nova stable/pike: Add functional test for resize crash compute restart revert https://review.opendev.org/687913 | 15:59 |
openstackgerrit | Merged openstack/nova master: ItemsMatcher: mock call list arg in any order https://review.opendev.org/689487 | 16:04 |
artom | sean-k-mooney, I know you're respinning https://review.opendev.org/#/c/674072/4 soon, but there are a couple more typos I found | 16:04 |
mriedem | melwitt: dansmith: this might have come up before and i'm just not remembering, but an instance that fails initial scheduling and is buried in cell0 doesn't have a 'create' instance action because we apparently don't create that in cell0 | 16:04 |
dansmith | mriedem: not sure it has come up before | 16:05 |
dansmith | mriedem: did it have one prior to the bury-in-cell0 behavior? | 16:05 |
melwitt | I don't recall talking about this before either | 16:05 |
*** ccamacho has quit IRC | 16:06 | |
mriedem | looking back at mitaka, the api would create the 'create' action in the 'nova' cell db https://github.com/openstack/nova/blob/mitaka-eol/nova/compute/api.py#L1180 before casting off to conductor | 16:07 |
*** lpetrut has quit IRC | 16:08 | |
mriedem | i don't see anything in conductor which would "complete" that action with a fail even if scheduling failed https://github.com/openstack/nova/blob/mitaka-eol/nova/conductor/manager.py#L374 | 16:08 |
mriedem | but at least the action would exist | 16:08 |
dansmith | ack | 16:09 |
dansmith | well, makes sense | 16:09 |
openstackgerrit | Elod Illes proposed openstack/nova stable/pike: Prevent init_host test to interfere with other tests https://review.opendev.org/687916 | 16:11 |
*** mlavalle has joined #openstack-nova | 16:12 | |
mriedem | makes sense that we wouldn't create the instance action in cell0? | 16:13 |
sean-k-mooney | artom: cool i ran a spell check on it since i figured out how to do that with emacs | 16:13 |
sean-k-mooney | artom: im just running the tests currently | 16:14 |
artom | sean-k-mooney, it was just capitalization and "use case" into 2 words | 16:14 |
openstackgerrit | Merged openstack/nova master: Fix ItemMatcher to avoid false positives https://review.opendev.org/689690 | 16:15 |
mriedem | sean-k-mooney: are you planning on writing functional tests for https://review.opendev.org/#/c/674072/ ? | 16:15 |
dansmith | mriedem: no makes sense that we missed doing that when we created bury-in-cell0, and makes sense that we should fix that | 16:17 |
mriedem | dansmith: ah ok | 16:17 |
mriedem | i'll open a bug in a bit | 16:17 |
*** gyee has joined #openstack-nova | 16:18 | |
*** JamesBen_ has joined #openstack-nova | 16:18 | |
*** ociuhandu has joined #openstack-nova | 16:20 | |
sean-k-mooney | am i can if you would like them | 16:20 |
*** JamesBenson has quit IRC | 16:21 | |
mriedem | anything involving image meta / flavor extra specs + pci + affinity + scheduling + compute likely means unit tests aren't sufficient, yeah | 16:21 |
mriedem | ? | 16:21 |
mriedem | maybe that's just me | 16:21 |
openstackgerrit | Alexandre arents proposed openstack/nova master: Abort live-migration during instance_init https://review.opendev.org/678016 | 16:21 |
sean-k-mooney | mriedem: the code change is just reusing the existing support for numa policies | 16:23 |
sean-k-mooney | so im just passing that policy via the flavor instead of the alias | 16:23 |
sean-k-mooney | that said we may not have existing functional test for that so i will look | 16:24 |
*** TxGirlGeek has quit IRC | 16:24 | |
*** ivve has quit IRC | 16:26 | |
sean-k-mooney | mriedem: i was kind of assuming the existing functional test for the numa policies would be sufficient but i will check. | 16:27 |
mriedem | i never assume anything relating to numa test coverage is sufficient for new code dealing with numa | 16:30 |
sean-k-mooney | fair | 16:30 |
mriedem | but i'm a cranky old troll | 16:30 |
*** ociuhandu has quit IRC | 16:30 | |
*** ociuhandu has joined #openstack-nova | 16:31 | |
*** efried has joined #openstack-nova | 16:36 | |
openstackgerrit | Elod Illes proposed openstack/nova stable/pike: Functional reproduce for bug 1833581 https://review.opendev.org/687917 | 16:37 |
openstack | bug 1833581 in OpenStack Compute (nova) train "instance stuck in BUILD state if nova-compute is restarted" [Low,Fix committed] https://launchpad.net/bugs/1833581 - Assigned to Balazs Gibizer (balazs-gibizer) | 16:37 |
mriedem | melwitt: can you hit https://review.opendev.org/#/c/693554/ to keep amodi's doc bug fix backports moving? | 16:38 |
*** ociuhandu has quit IRC | 16:44 | |
melwitt | mriedem: yup, thanks | 16:45 |
stephenfin | mriedem: You're not that old | 16:45 |
stephenfin | 🙃 | 16:45 |
sean-k-mooney | artom: looks like i have already fixed most of your comments locally im going to fix the last few then ill push it up. | 16:47 |
sean-k-mooney | stephenfin: do you recall if you wrote functional tests when you implemented the original numa policies. if so ill extend them if not i guess ill write tests for both. | 16:48 |
stephenfin | I very much doubt it. I've only started writing those for the last two cycles or so | 16:49 |
artom | sean-k-mooney, I know I added https://review.opendev.org/#/c/682941/ recently | 16:49 |
sean-k-mooney | ok | 16:50 |
* bauzas is back from town hall | 16:50 | |
sean-k-mooney | artom: ya i remember | 16:50 |
sean-k-mooney | so those cover the alias based policies i think | 16:50 |
sean-k-mooney | you added the missing one | 16:50 |
sean-k-mooney | so i was more or less relying on those but it should be easy to extend those to use the policy from the flavor | 16:51 |
artom | sean-k-mooney, well yeah, because we only have the alias based policies for now :) | 16:51 |
*** jaosorior has joined #openstack-nova | 16:51 | |
sean-k-mooney | ok ill push the version i have now after i address your comments and ill start working on the func test after | 16:52 |
*** TxGirlGeek has joined #openstack-nova | 16:52 | |
*** yan0s has quit IRC | 16:53 | |
openstackgerrit | sean mooney proposed openstack/nova master: support pci numa affinity policies in flavor and image https://review.opendev.org/674072 | 16:54 |
*** TxGirlGeek has quit IRC | 16:54 | |
*** nweinber_ has joined #openstack-nova | 16:56 | |
*** mgariepy has quit IRC | 16:56 | |
*** nweinber has quit IRC | 16:58 | |
*** TxGirlGeek has joined #openstack-nova | 17:00 | |
*** ociuhandu has joined #openstack-nova | 17:01 | |
*** igordc has joined #openstack-nova | 17:02 | |
*** rpittau is now known as rpittau|afk | 17:08 | |
*** mkrai_ has quit IRC | 17:08 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Consolidate [image_cache] conf options https://review.opendev.org/690723 | 17:09 |
openstackgerrit | Eric Fried proposed openstack/nova master: Add image caching to the support matrix https://review.opendev.org/690748 | 17:09 |
efried | mriedem, stephenfin: Rebased and nit-fixed ^ | 17:09 |
*** TxGirlGeek has quit IRC | 17:10 | |
*** jaosorior has quit IRC | 17:12 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: "SUSPENDED" description changed in server_concepts guide and API REF https://review.opendev.org/663590 | 17:12 |
openstackgerrit | Elod Illes proposed openstack/nova stable/pike: Error out interrupted builds https://review.opendev.org/687918 | 17:13 |
artom | stephenfin, oh, friendly reminder that the NUMA LM func test stack is ready for another look: https://review.opendev.org/#/c/687404/ | 17:22 |
artom | (I know it's not so much a learning curve as a learning brick wall) | 17:22 |
stephenfin | artom: Sure thing, but that's too much work for 5:20pm when jetlagged. I'll grab it in the morning :) | 17:24 |
artom | stephenfin, ack, thank you! | 17:24 |
openstackgerrit | Eric Fried proposed openstack/nova master: Remove functional test specific nova code https://review.opendev.org/683609 | 17:25 |
bauzas | mriedem: dumb question, but what does the user see from the API when we error out some instance because of NoValidHost? | 17:31 |
bauzas | by a nova show I mean | 17:31 |
bauzas | mriedem: was about to say +1 to your ML thread, but was wondering the API result for the same | 17:32 |
*** igordc has quit IRC | 17:32 | |
* bauzas looks at https://docs.openstack.org/api-ref/compute/?expanded=show-server-details-detail#list-servers-detailed | 17:33 | |
mriedem | bauzas: the non-admin user? they see this: | 17:33 |
mriedem | $ openstack server show build-fail1 -f value -c fault {u'message': u'No valid host was found. ', u'code': 500, u'created': u'2019-11-13T15:57:13Z'} | 17:33 |
mriedem | the fault is only shown if the server status is ERROR or DELETED | 17:33 |
bauzas | mriedem: so the instance action event show should do the same | 17:33 |
mriedem | tbc, i wasn't proposing that the action event exception type is only shown for an ERROR or DELETED status server | 17:34 |
mriedem | since as i said in the thread, you can fail a resize and the server status doesn't go to ERROR | 17:35 |
*** efried has quit IRC | 17:35 | |
*** igordc has joined #openstack-nova | 17:35 | |
bauzas | mriedem: yeah, I understood this | 17:35 |
stephenfin | efried: Think you could hold your nose on https://review.opendev.org/#/c/684345/16/nova/network/neutronv2/api.py to keep this moving, given I'm removing it again shortly after? | 17:36 |
*** nweinber_ has quit IRC | 17:36 | |
bauzas | mriedem: that's why I think we should *also* do it | 17:36 |
stephenfin | drat, just missed him | 17:36 |
bauzas | and yeah, of course, when a user calls some ops, we can ask 'do a nova show' | 17:36 |
bauzas | but sometimes we need to look at the instance actions (like for resize) and that's why I'd love your spec :) | 17:36 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Add TODOs for remaining nova-network functional tests https://review.opendev.org/684345 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'os-security-group-default-rules' REST API https://review.opendev.org/686807 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove unused '*_default_rules' security group DB APIs https://review.opendev.org/686808 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove (most) '/os-networks' REST APIs https://review.opendev.org/686809 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove '/os-tenant-networks' REST API https://review.opendev.org/686810 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove 'USE_NEUTRON' from functional tests https://review.opendev.org/686811 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove 'networks' quota https://review.opendev.org/686812 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove nova-manage network, floating commands https://review.opendev.org/686813 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove associate, disassociate network APIs https://review.opendev.org/686814 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove 'nova-dhcpbridge' binary https://review.opendev.org/686815 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove 'nova-network' binary https://review.opendev.org/686816 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: docs: Blast most references to nova-network https://review.opendev.org/686817 | 17:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: WIP https://review.opendev.org/686818 | 17:53 |
*** damien_r has quit IRC | 17:56 | |
*** maciejjozefczyk has quit IRC | 17:58 | |
*** JamesBen_ has quit IRC | 18:01 | |
*** JamesBenson has joined #openstack-nova | 18:01 | |
*** ociuhandu has quit IRC | 18:17 | |
*** ociuhandu has joined #openstack-nova | 18:18 | |
*** sridharg has quit IRC | 18:20 | |
*** ociuhandu has quit IRC | 18:23 | |
*** slaweq has quit IRC | 18:25 | |
*** nweinber_ has joined #openstack-nova | 18:28 | |
*** priteau has quit IRC | 18:32 | |
*** tesseract has quit IRC | 18:35 | |
*** jkulik has quit IRC | 18:40 | |
*** gbarros has quit IRC | 18:49 | |
*** JamesBen_ has joined #openstack-nova | 19:02 | |
*** JamesBenson has quit IRC | 19:04 | |
*** ralonsoh has quit IRC | 19:25 | |
*** ociuhandu has joined #openstack-nova | 19:50 | |
*** ociuhandu has quit IRC | 19:56 | |
*** ociuhandu has joined #openstack-nova | 19:59 | |
*** slaweq has joined #openstack-nova | 20:01 | |
*** efried has joined #openstack-nova | 20:02 | |
*** JamesBen_ has quit IRC | 20:02 | |
*** JamesBenson has joined #openstack-nova | 20:03 | |
*** ociuhandu has quit IRC | 20:06 | |
openstackgerrit | Dan Smith proposed openstack/nova-specs master: Virtual instance rescue with stable disk devices https://review.opendev.org/693849 | 20:08 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Create instance action when burying in cell0 https://review.opendev.org/694165 | 20:11 |
*** abaindur has joined #openstack-nova | 20:14 | |
*** ociuhandu has joined #openstack-nova | 20:16 | |
*** dklyle has quit IRC | 20:20 | |
*** dklyle has joined #openstack-nova | 20:23 | |
*** ociuhandu has quit IRC | 20:25 | |
*** jaosorior has joined #openstack-nova | 20:26 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Create instance action when burying in cell0 https://review.opendev.org/694165 | 20:27 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: doc: add troubleshooting guide for cleaning up orphaned allocations https://review.opendev.org/691427 | 20:31 |
*** abaindur has quit IRC | 20:40 | |
*** abaindur has joined #openstack-nova | 20:40 | |
*** abaindur has quit IRC | 20:41 | |
*** abaindur has joined #openstack-nova | 20:42 | |
*** CeeMac has quit IRC | 20:44 | |
*** abaindur has quit IRC | 20:45 | |
*** ociuhandu has joined #openstack-nova | 21:00 | |
*** ociuhandu has quit IRC | 21:06 | |
*** priteau has joined #openstack-nova | 21:17 | |
*** priteau has quit IRC | 21:18 | |
*** priteau has joined #openstack-nova | 21:20 | |
*** priteau has quit IRC | 21:25 | |
*** nweinber_ has quit IRC | 21:33 | |
*** gshippey has quit IRC | 21:33 | |
openstackgerrit | François Palin proposed openstack/nova master: Add retry to cinder api calls related to volume detach https://review.opendev.org/669674 | 21:41 |
openstackgerrit | Merged openstack/nova master: Rename Claims resources to compute_node https://review.opendev.org/679470 | 21:44 |
openstackgerrit | Merged openstack/nova master: Clear instance.launched_on when build fails https://review.opendev.org/683725 | 21:45 |
*** takashin has joined #openstack-nova | 21:47 | |
melwitt | so... after helping a colleague unwedge a failed resize for a customer, I've learned that we intentionally don't roll back port bindings to the source when finish_resize fails https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L5274 | 21:49 |
melwitt | and the reasoning is because the assumption is "everything is fine" on the dest and that only the virt guest needs to be created | 21:49 |
melwitt | but, when finish_resize fails because update of volume attachments fails (error had to do with "duplicate connectors") it doesn't seem like the dest should really be considered an ok place for the instance to be | 21:50 |
*** dviroel has quit IRC | 21:52 | |
*** eharney has quit IRC | 21:54 | |
melwitt | I wonder if fixing this might be as simple as moving the volume attach update before the network setup | 21:57 |
melwitt | on the dest | 21:57 |
mriedem | i feel like i recently added that comment but maybe not | 21:58 |
mriedem | we have a lot of "cleanup networking and cleanup volumes" in live and cold migrate/resize where none of it is really atomic | 21:58 |
mriedem | i made a note of that in the revert code for cross-cell resize here https://review.opendev.org/#/c/637647/51/nova/compute/manager.py@4690 | 21:59 |
mriedem | i'm not sure what that duplicate connectors error is | 22:00 |
mriedem | ah that comment is relatively new https://review.opendev.org/#/c/635349/ | 22:00 |
melwitt | yeah we're going to be investigating the root cause of the duplicate connectors thing, never seen that before | 22:01 |
melwitt | error came back from cinder | 22:01 |
mriedem | what was the volume type? | 22:01 |
melwitt | I dunno yet. we only focused on recovery for now, it's late at night for those who were fixing the wedged resize | 22:02 |
mriedem | so migrate_instance_finish updated the port bindings to point at the dest host and then _update_volume_attachments blew up right? | 22:03 |
melwitt | we (they) moved the port bindings back to the source manually and then we hard rebooted the instance to get it going for the customer again. they had only wanted to resize to a bigger flavor | 22:03 |
melwitt | yes | 22:03 |
mriedem | still, if you get to finish_resize the instance.host already points at the dest | 22:04 |
mriedem | so they would have had to do more than change port bindings | 22:04 |
mriedem | to get the guest to reboot on the source | 22:04 |
mriedem | resize_instance on the source updates the instance.host/node before casting to finish_resize on the dest | 22:04 |
melwitt | and it started up fine with the larger flavor once the port binding was put back on the source. and prior to that yes I had them change the host/node back to the source | 22:04 |
melwitt | yeah sorry | 22:04 |
melwitt | they'll have to fix the allocations too but the host is full so they're gonna move other stuff off and then fix once there's room | 22:05 |
*** slaweq has quit IRC | 22:06 | |
mriedem | so even if we swapped migrate_instance_finish and _update_volume_attachments for this very specific fail case, the operator still has manual stuff they have to do, like updating the instance to point back at the source host | 22:06 |
mriedem | and if you do swap those, you likely should revert the instance host/node so hard rebooting on the source host works without manual db surgery | 22:07 |
melwitt | yeah, fair, but doing that is far easier than the port binding update. that was pretty involved and I personally didn't know how to do it. involved updating the virtual_interfaces table too I think | 22:07 |
mriedem | hell, even if finish_migration i'm not sure why we wouldn't roll everything back to the source host, but we've already updated port bindings and volume attachments by then which is likely why we say "you're on the dest now" | 22:07 |
mriedem | the virtual_interfaces table shouldn't have anything to do with it | 22:08 |
mriedem | unless they created new ports | 22:08 |
melwitt | ok, maybe they just looked for a uuid or something. I wasn't really understanding | 22:09 |
melwitt | looks like the volume type is "null" which would be the default | 22:10 |
melwitt | and I don't remember what the default is | 22:10 |
mriedem | well, if we changed the resize/cold migrate flow to use the multi-port binding api like we do for live migration since rocky, things would be a bit simpler to clean up, because you'd have 2 port binding resources in neutron rather than one, one is active and one is inactive | 22:10 |
mriedem | but that's not a backportable change | 22:11 |
mriedem | and trying to rollback automatically from everything that could go wrong during finish_resize is likely to be pretty hairy | 22:11 |
melwitt | yeah, if port bindings AND volume attach succeed then saying you're on the dest now is fine | 22:11 |
melwitt | but if volumes fails, I don't see how dest could be ok | 22:11 |
melwitt | I was thinking of swapping the network setup and volume update only, in _finish_resize | 22:12 |
melwitt | so that if volume update fails, set the host/node back to the source like you said and bail. leave the port bindings alone | 22:13 |
mriedem | i don't think that gets you out of the woods, | 22:14 |
mriedem | because resize_instance on the source, before casting to finish_resize on the dest, deletes the old volume attachment with the source host connector and creates a new 'empty' volume attachment that gets updated on the dest with the dest host connector | 22:14 |
mriedem | see _terminate_volume_connections | 22:14 |
melwitt | O.o | 22:14 |
melwitt | dammit | 22:14 |
mriedem | so even if you go back to the source, the volume attachment has to be updated with the source host connector to re-connect the volumes on the source host | 22:15 |
melwitt | gah, this is over my head. I let them know they might have more to do to fix that VM | 22:16 |
mriedem | i want to say that at some point booth had a patch which would automatically call revert_resize (dest) from finish_resize (dest) if finish_resize failed | 22:16 |
mriedem | but i very much doubt that revert_resize is idempotent | 22:17 |
melwitt | that sounds familiar | 22:17 |
mriedem | i.e. i would not be surprised if just calling revert_resize from finish_resize fails in some weird way because it has some implicit preconditions that aren't setup when it's called to rollback | 22:17 |
melwitt | yeah | 22:18 |
mriedem | now if this all used task flow tasks with built in rollbacks...then we'd be cooking! | 22:18 |
mriedem | i kid, but that's why i used granular conductor tasks for the cross-cell stuff so we can rollback at certain points in the flow | 22:19 |
melwitt | task flow did come to mind | 22:20 |
mriedem | i was hoping efried had it as a keyword and he'd just appear | 22:20 |
efried | *poof* | 22:20 |
efried | you should ask johnsom, shiny new taskflow core. Been a while since I touched it. | 22:21 |
efried | though it's pretty simple when used simply. | 22:21 |
efried | and you're already doing most of the work, defining `execute` and `rollback` methods in your classes. Make those inherit from Task and do a little plumbing on the engine side, and presto. | 22:22 |
johnsom | Hi, happy to chat about taskflow, but I'm in a meeting right now. Ping me in 30 if you need something | 22:22 |
efried | jroll: There's generally a master switch to enable a feature like swtpm. I'm just defining that. I was starting to make it boolean, but it occurred to me that someone might care to say "enable vtpm {1.2 and/or 2.0}". Do you? | 22:22 |
jroll | efried: I'm not sure if we care | 22:23 |
jroll | that seems reasonable, though, 2.0 isn't backward compat | 22:23 |
jroll | (from an application perspective) | 22:23 |
efried | yeah, rn there's support for both or neither, conf opt notwithstanding. | 22:23 |
efried | so really it would just allow you to disable one or the other. | 22:24 |
melwitt | mriedem: how do we find the source host connector? is it in our db or somewhere else? | 22:24 |
efried | jroll: Easier for me if you don't care. Since you didn't immediately, I'll code it up bool, and we can discuss it in review if necessary. | 22:24 |
jroll | efried: in the 'both' case, how is it decided which version is presented to a vm? | 22:24 |
efried | jroll: flavor | 22:25 |
jroll | ah | 22:25 |
efried | you are required to ask for one version or the other | 22:25 |
jroll | yeah, +1 for discuss in review | 22:25 |
efried | because, as you say, they're not compat. | 22:25 |
mriedem | melwitt: the host connector is retrieved from the driver | 22:28 |
melwitt | argh.. ok, I was just seeing that in the code | 22:29 |
mriedem | cinder does stash the host connector in the volume attachment record when we update the attachment, but as noted in resize_instance we blow away the source host attachment and create a new empty attachment for the dest host to update | 22:29 |
mriedem | having said that, we (nova) do stash the last host connector used in the bdm.connection_info i think | 22:30 |
melwitt | :'( | 22:30 |
melwitt | cray cray | 22:30 |
mriedem | that might only be when using initialize_connection for legacy attachments though | 22:30 |
mriedem | i don't think we do that pack rat stash for new style volume attachments because cinder stores that information per volume attachment | 22:32 |
*** rcernin has joined #openstack-nova | 22:32 | |
mriedem | whereas before with initialize_connection we worked with a single attachment record and duplicated it in our bdm.connection_info | 22:32 |
melwitt | looks like we're ok, this is using ceph and somehow it's connected to the correct host (source) | 22:33 |
mriedem | in this very specific scenario | 22:33 |
melwitt | and it's boot from volume | 22:33 |
melwitt | yeah | 22:33 |
*** JamesBen_ has joined #openstack-nova | 22:33 | |
mriedem | i think another idea that has come up before when talking about resize fail recovery is allowing the admin to reset the status on the server so they could then revert the resize | 22:33 |
mriedem | or allow the revert resize api to work with instances in ERROR status | 22:34 |
melwitt | yup. that is the first thing I told them to try and there was no way to get the api to allow it | 22:34 |
mriedem | the instance was in ERROR state in this case right? | 22:34 |
*** JamesBen_ has quit IRC | 22:34 | |
mriedem | so if the api was changed to allow reverting a resize on an instance in ERROR status that could be potentially one way | 22:35 |
melwitt | it was originally yeah | 22:35 |
mriedem | and we can detect that the instance was being resized because of the old_vm_state and old/new flavor stuff stashed on the instance that isn't cleaned up until confirm/revert | 22:35 |
mriedem | meaning, the api could puke if you tried reverting a resize on an ERROR instance that wasn't actually being resized | 22:35 |
*** JamesBenson has quit IRC | 22:36 | |
mriedem | i want to say dansmith was in that discussion when it happened and wasn't crazy about the idea - maybe around the time of the proposal from booth to auto-revert on failed resize | 22:37 |
mriedem | or maybe dan preferred just reverting from ERROR instead, i can't remember | 22:37 |
melwitt | hmm ok | 22:38 |
mriedem | https://review.opendev.org/#/c/462521/ was booth's patch btw | 22:38 |
melwitt | ah ok | 22:40 |
melwitt | tbh I think reverting from ERROR would be nice as just something | 22:41 |
melwitt | it's painful knowing all the code is there to do the thing but you can't get to it because of a vm_state block and then have to do a bunch of gnarly manual stuff | 22:42 |
*** abaindur has joined #openstack-nova | 22:46 | |
mriedem | yeah it's definitely a kick in the ass to fix up a failed resize because of volumes, networking, allocations in placement, and some day cyborg devices, | 22:46 |
mriedem | plus the quota stuff involved to even move the server back | 22:46 |
mriedem | like you said, the source host was full when they tried moving it back | 22:47 |
mriedem | i left a note in https://review.opendev.org/#/c/462521/12/releasenotes/notes/resize-auto-revert-6e1648828aba16b2.yaml@5 if it helps start a new discussion about changing the API | 22:47 |
melwitt | yeah thanks. and thanks for talking through this, I'm gonna refresh on what was going on in that review and then I was thinking start a ML thread about the API angle based on your comment | 22:49 |
mriedem | i believe the tl;dr on why that patch to auto-revert was not great was dan's comment in the reno | 22:51 |
mriedem | external tooling could get screwed up | 22:51 |
melwitt | given the behavioral/api change called out on that patch, I'm not sure whether there's a decent way to roll back automatically. if there's not, the api change would be a big help imho | 22:51 |
mriedem | just allowing reverting an ERROR'ed resize from the api though is pretty straight-forward and opt-into-y | 22:51 |
melwitt | yeah | 22:51 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Update keypairs in saving an instance object https://review.opendev.org/683043 | 22:52 |
*** jaosorior has quit IRC | 22:55 | |
dansmith | mriedem: yeah, not sure I remember revert-from-ERROR specifically, but definitely not in favor of auto rollback | 23:01 |
*** tkajinam has joined #openstack-nova | 23:06 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Update keypairs in saving an instance object https://review.opendev.org/683043 | 23:09 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Update keypairs in saving an instance object https://review.opendev.org/683043 | 23:13 |
openstackgerrit | Merged openstack/nova master: cond: rename 'recreate' var to 'evacuate' https://review.opendev.org/692900 | 23:22 |
*** xek_ has quit IRC | 23:24 | |
*** awalende has joined #openstack-nova | 23:25 | |
*** igordc has quit IRC | 23:26 | |
*** mlavalle has quit IRC | 23:26 | |
openstackgerrit | Merged openstack/nova master: Remove PlacementAPIConnectFailure handling from AggregateAPI https://review.opendev.org/660852 | 23:27 |
*** awalende has quit IRC | 23:29 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Do not reschedule on ExternalNetworkAttachForbidden https://review.opendev.org/694179 | 23:30 |
*** ociuhandu has joined #openstack-nova | 23:31 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Do not reschedule on ExternalNetworkAttachForbidden https://review.opendev.org/694179 | 23:34 |
*** ociuhandu has quit IRC | 23:35 | |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Add emulated TPM support to Nova https://review.opendev.org/631363 | 23:36 |
openstackgerrit | Eric Fried proposed openstack/nova master: Add support for resize and cold migration of emulated TPM files https://review.opendev.org/639934 | 23:36 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: vTPM request_filter https://review.opendev.org/678325 | 23:36 |
*** efried has quit IRC | 23:39 | |
openstackgerrit | Merged openstack/nova master: Remove dead HostAPI.service_delete code https://review.opendev.org/693422 | 23:43 |
openstackgerrit | Merged openstack/nova master: Add support matrix for Delete (Abort) on-going live migration https://review.opendev.org/625781 | 23:43 |
openstackgerrit | Merged openstack/nova master: Implement update_provider_tree for mocked driver in test_resource_tracker https://review.opendev.org/693431 | 23:44 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: api-ref: mark device response param as optional for list/show vol attachments https://review.opendev.org/690383 | 23:45 |
*** brault has quit IRC | 23:59 |
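
Editor's sketch, not part of the channel log: the 22:18 to 22:22 exchange about tasks with built-in rollbacks (classes defining `execute` and `rollback`, run by an engine that undoes completed steps on failure) can be illustrated roughly as below. This is plain standard-library Python, not the taskflow API and not nova code. `Task`, `run_flow`, `UpdatePortBindings`, `UpdateVolumeAttachments` and the `ctx` dictionary are all hypothetical names invented for the illustration, and the failure is simulated.

```python
class Task:
    """Hypothetical base class: one step of a flow, with an undo."""

    def execute(self, ctx):
        raise NotImplementedError

    def rollback(self, ctx):
        """Undo whatever execute() changed. Default: nothing to undo."""


def run_flow(tasks, ctx):
    """Run tasks in order; on failure, roll back completed tasks in reverse.

    Only tasks whose execute() finished are rolled back here. The task
    that raised is assumed to have left nothing behind (a simplification).
    """
    completed = []
    try:
        for task in tasks:
            task.execute(ctx)
            completed.append(task)
    except Exception:
        for task in reversed(completed):
            task.rollback(ctx)
        raise


class UpdatePortBindings(Task):
    """Stand-in for pointing port bindings at the dest host."""

    def execute(self, ctx):
        ctx["port_host"] = ctx["dest"]

    def rollback(self, ctx):
        ctx["port_host"] = ctx["source"]


class UpdateVolumeAttachments(Task):
    """Stand-in for the volume attachment update; can be told to fail."""

    def __init__(self, fail=False):
        self.fail = fail

    def execute(self, ctx):
        if self.fail:
            raise RuntimeError("simulated volume attachment failure")
        ctx["volume_host"] = ctx["dest"]

    def rollback(self, ctx):
        ctx["volume_host"] = ctx["source"]


def demo():
    """Simulate the failure order from the chat: ports succeed, volumes fail."""
    ctx = {"source": "host-a", "dest": "host-b",
           "port_host": "host-a", "volume_host": "host-a"}
    try:
        run_flow([UpdatePortBindings(), UpdateVolumeAttachments(fail=True)], ctx)
    except RuntimeError:
        pass
    return ctx


if __name__ == "__main__":
    print(demo())
```

In this toy flow the port binding step is undone when the volume step fails, so both values end up back on the source host. Real rollbacks would also have to cope with non-idempotent steps and external state, which is the concern raised in the discussion about calling revert_resize from finish_resize.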