*** jistr has quit IRC | 00:07 | |
*** jistr has joined #openstack-nova | 00:08 | |
*** ivve has quit IRC | 00:16 | |
*** dpawlik has joined #openstack-nova | 00:23 | |
*** dpawlik has quit IRC | 00:28 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Tie requester_id to RequestGroup suffix https://review.opendev.org/696946 | 00:34 |
---|---|---|
openstackgerrit | Eric Fried proposed openstack/nova master: refactor: RequestGroup knows when it's empty https://review.opendev.org/696991 | 00:34 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Use provider mappings from Placement (mostly) https://review.opendev.org/696992 | 00:34 |
*** sorrison has joined #openstack-nova | 01:02 | |
*** tetsuro has joined #openstack-nova | 01:15 | |
*** ociuhandu has joined #openstack-nova | 01:31 | |
*** ociuhandu has quit IRC | 01:35 | |
*** sorrison has quit IRC | 01:38 | |
*** sorrison has joined #openstack-nova | 01:38 | |
*** sorrison has quit IRC | 01:40 | |
*** gyee has quit IRC | 01:48 | |
*** sorrison has joined #openstack-nova | 01:55 | |
*** sorrison has quit IRC | 01:58 | |
*** awalende has joined #openstack-nova | 02:00 | |
*** yikun has joined #openstack-nova | 02:01 | |
*** sorrison has joined #openstack-nova | 02:02 | |
*** awalende has quit IRC | 02:04 | |
*** sorrison has quit IRC | 02:15 | |
*** sorrison has joined #openstack-nova | 02:16 | |
*** davee_ has quit IRC | 02:16 | |
*** davee_ has joined #openstack-nova | 02:17 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova master: Make evacuation respects anti-affinity rule https://review.opendev.org/649963 | 02:17 |
*** dpawlik has joined #openstack-nova | 02:24 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova master: Fix live migration break group policy simultaneously https://review.opendev.org/651969 | 02:25 |
*** dpawlik has quit IRC | 02:28 | |
*** yaawang has quit IRC | 02:36 | |
*** yaawang has joined #openstack-nova | 02:36 | |
*** macz has quit IRC | 02:37 | |
*** chenhaw has joined #openstack-nova | 02:45 | |
openstackgerrit | Merged openstack/nova master: Add new default rules and mapping in policy base class https://review.opendev.org/645452 | 02:53 |
*** mkrai has joined #openstack-nova | 03:27 | |
*** Liang__ has joined #openstack-nova | 03:38 | |
*** jbernard has quit IRC | 03:45 | |
*** macz has joined #openstack-nova | 03:46 | |
*** jbernard has joined #openstack-nova | 03:46 | |
*** macz has quit IRC | 03:51 | |
*** tetsuro has quit IRC | 03:58 | |
*** jangutter has joined #openstack-nova | 03:59 | |
*** yaawang has quit IRC | 04:01 | |
*** yaawang has joined #openstack-nova | 04:02 | |
*** jangutter has quit IRC | 04:03 | |
*** tetsuro has joined #openstack-nova | 04:04 | |
*** sorrison has quit IRC | 04:04 | |
*** tetsuro has quit IRC | 04:05 | |
*** bhagyashris has joined #openstack-nova | 04:09 | |
*** sorrison has joined #openstack-nova | 04:10 | |
*** udesale has joined #openstack-nova | 04:13 | |
*** sorrison has quit IRC | 04:15 | |
*** tetsuro has joined #openstack-nova | 04:16 | |
*** tetsuro has quit IRC | 04:19 | |
*** tetsuro_ has joined #openstack-nova | 04:19 | |
*** tetsuro_ has quit IRC | 04:23 | |
*** tetsuro has joined #openstack-nova | 04:23 | |
*** dpawlik has joined #openstack-nova | 04:25 | |
*** dpawlik has quit IRC | 04:29 | |
*** igordc has quit IRC | 04:34 | |
*** tetsuro has quit IRC | 04:36 | |
*** sorrison has joined #openstack-nova | 04:45 | |
*** jangutter has joined #openstack-nova | 04:46 | |
*** jangutter has quit IRC | 04:51 | |
*** sorrison has quit IRC | 04:52 | |
*** sorrison has joined #openstack-nova | 04:57 | |
*** tkajinam has quit IRC | 05:02 | |
*** udesale has quit IRC | 05:15 | |
*** udesale has joined #openstack-nova | 05:16 | |
*** tkajinam has joined #openstack-nova | 05:33 | |
*** mkrai has quit IRC | 05:36 | |
*** boxiang has joined #openstack-nova | 05:36 | |
*** boxiang has quit IRC | 05:38 | |
*** boxiang has joined #openstack-nova | 05:38 | |
*** tetsuro has joined #openstack-nova | 05:39 | |
*** udesale has quit IRC | 05:42 | |
*** boxiang_ has joined #openstack-nova | 05:43 | |
*** tetsuro has quit IRC | 05:44 | |
*** boxiang has quit IRC | 05:46 | |
*** udesale has joined #openstack-nova | 05:51 | |
*** links has joined #openstack-nova | 05:52 | |
*** mkrai has joined #openstack-nova | 05:54 | |
*** bhagyashris has quit IRC | 05:57 | |
*** tetsuro has joined #openstack-nova | 05:57 | |
*** bhagyashris has joined #openstack-nova | 05:58 | |
*** ircuser-1 has joined #openstack-nova | 06:19 | |
*** jkulik has joined #openstack-nova | 06:19 | |
*** zhanglong has joined #openstack-nova | 06:22 | |
*** sapd1_x has joined #openstack-nova | 06:23 | |
*** dpawlik has joined #openstack-nova | 06:26 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/nova master: Imported Translations from Zanata https://review.opendev.org/694717 | 06:27 |
*** mkrai has quit IRC | 06:30 | |
*** mkrai has joined #openstack-nova | 06:30 | |
*** dpawlik has quit IRC | 06:30 | |
*** mkrai has quit IRC | 06:35 | |
*** mkrai_ has joined #openstack-nova | 06:36 | |
*** ccamacho has quit IRC | 06:43 | |
*** jangutter has joined #openstack-nova | 06:47 | |
*** yaawang has quit IRC | 06:47 | |
*** yaawang has joined #openstack-nova | 06:48 | |
*** mkrai_ has quit IRC | 06:49 | |
*** jangutter has quit IRC | 06:52 | |
*** dpawlik has joined #openstack-nova | 07:00 | |
*** brault has joined #openstack-nova | 07:05 | |
*** tetsuro_ has joined #openstack-nova | 07:06 | |
*** tetsuro has quit IRC | 07:08 | |
*** sorrison has quit IRC | 07:14 | |
*** tkajinam_ has joined #openstack-nova | 07:18 | |
*** tkajinam_ has quit IRC | 07:19 | |
*** tkajinam_ has joined #openstack-nova | 07:20 | |
*** tkajinam has quit IRC | 07:21 | |
*** sapd1_x has quit IRC | 07:23 | |
*** mkrai_ has joined #openstack-nova | 07:24 | |
*** sorrison has joined #openstack-nova | 07:25 | |
*** slaweq has joined #openstack-nova | 07:45 | |
*** yaawang has quit IRC | 07:46 | |
*** yaawang has joined #openstack-nova | 07:48 | |
*** trident has quit IRC | 07:51 | |
*** johanssone has quit IRC | 07:51 | |
*** trident has joined #openstack-nova | 07:51 | |
*** johanssone has joined #openstack-nova | 07:52 | |
*** sorrison has quit IRC | 07:53 | |
*** sorrison has joined #openstack-nova | 07:54 | |
*** maciejjozefczyk has joined #openstack-nova | 07:56 | |
*** damien_r has joined #openstack-nova | 08:00 | |
*** gibi has joined #openstack-nova | 08:03 | |
*** jangutter has joined #openstack-nova | 08:04 | |
*** macz has joined #openstack-nova | 08:12 | |
*** awalende has joined #openstack-nova | 08:14 | |
*** tesseract has joined #openstack-nova | 08:16 | |
*** macz has quit IRC | 08:16 | |
*** yedongcan has joined #openstack-nova | 08:21 | |
*** Roamer` has joined #openstack-nova | 08:26 | |
*** sorrison has quit IRC | 08:26 | |
*** ccamacho has joined #openstack-nova | 08:27 | |
*** sorrison has joined #openstack-nova | 08:27 | |
*** tosky has joined #openstack-nova | 08:29 | |
*** boxiang has joined #openstack-nova | 08:31 | |
*** boxiang has quit IRC | 08:32 | |
*** boxiang has joined #openstack-nova | 08:33 | |
*** ralonsoh has joined #openstack-nova | 08:33 | |
*** boxiang_ has quit IRC | 08:34 | |
*** sorrison has quit IRC | 08:42 | |
*** rpittau|afk is now known as rpittau | 08:43 | |
*** sorrison has joined #openstack-nova | 08:43 | |
*** udesale has quit IRC | 08:43 | |
*** udesale has joined #openstack-nova | 08:44 | |
*** sorrison has quit IRC | 08:48 | |
*** ivve has joined #openstack-nova | 08:51 | |
*** mkrai_ has quit IRC | 09:00 | |
*** mkrai has joined #openstack-nova | 09:00 | |
openstackgerrit | Akira KAMIO proposed openstack/nova master: VMware: disk_io_limits settings are not reflected when resize https://review.opendev.org/680296 | 09:02 |
*** martinkennelly has joined #openstack-nova | 09:03 | |
*** tkajinam_ has quit IRC | 09:20 | |
*** sorrison has joined #openstack-nova | 09:23 | |
*** gshippey has joined #openstack-nova | 09:25 | |
*** sorrison has quit IRC | 09:28 | |
*** dpawlik has quit IRC | 09:34 | |
*** dasp has quit IRC | 09:42 | |
*** dasp has joined #openstack-nova | 09:43 | |
*** yaawang has quit IRC | 09:45 | |
*** yaawang has joined #openstack-nova | 09:46 | |
*** abaindur has quit IRC | 09:47 | |
*** mkrai has quit IRC | 09:54 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Extend NeutronFixture to handle multiple bindings https://review.opendev.org/696246 | 09:55 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Do not mock setup net and migrate inst in NeutronFixture https://review.opendev.org/696247 | 09:56 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Move _get_request_group_mapping() to RequestSpec https://review.opendev.org/696541 | 09:58 |
*** sorrison has joined #openstack-nova | 09:58 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Move _update_pci_request_spec_with_allocated_interface_name https://review.opendev.org/696574 | 09:59 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Support live migration with qos ports https://review.opendev.org/695905 | 09:59 |
*** dpawlik has joined #openstack-nova | 10:02 | |
*** sorrison has quit IRC | 10:04 | |
*** huaqiang has joined #openstack-nova | 10:05 | |
*** Liang__ has quit IRC | 10:05 | |
*** rcernin has quit IRC | 10:06 | |
*** dpawlik has quit IRC | 10:07 | |
openstackgerrit | Merged openstack/nova master: add [libvirt]/max_queues config option https://review.opendev.org/695118 | 10:08 |
*** pcaruana has joined #openstack-nova | 10:08 | |
*** dtantsur|afk is now known as dtantsur | 10:19 | |
*** chenhaw has quit IRC | 10:27 | |
*** dpawlik has joined #openstack-nova | 10:29 | |
*** lpetrut has joined #openstack-nova | 10:34 | |
*** derekh has joined #openstack-nova | 10:35 | |
*** sorrison has joined #openstack-nova | 10:40 | |
*** udesale has quit IRC | 10:44 | |
*** dpawlik has quit IRC | 10:46 | |
*** sorrison has quit IRC | 10:48 | |
*** sorrison has joined #openstack-nova | 10:49 | |
*** sorrison has quit IRC | 10:53 | |
*** sorrison has joined #openstack-nova | 10:54 | |
*** udesale has joined #openstack-nova | 10:58 | |
*** sorrison has quit IRC | 10:59 | |
*** zhanglong has quit IRC | 10:59 | |
*** sorrison has joined #openstack-nova | 11:00 | |
*** ociuhandu has joined #openstack-nova | 11:02 | |
*** dpawlik has joined #openstack-nova | 11:02 | |
*** sorrison has quit IRC | 11:05 | |
*** dpawlik has quit IRC | 11:06 | |
*** sorrison has joined #openstack-nova | 11:09 | |
*** dpawlik has joined #openstack-nova | 11:10 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Extend NeutronFixture to allow live migration with ports https://review.opendev.org/696245 | 11:10 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Make the binding:profile handling consistent in NeutronFixture https://review.opendev.org/696526 | 11:11 |
*** ociuhandu has quit IRC | 11:12 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Extend NeutronFixture to handle multiple bindings https://review.opendev.org/696246 | 11:12 |
*** sorrison has quit IRC | 11:13 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Do not mock setup net and migrate inst in NeutronFixture https://review.opendev.org/696247 | 11:15 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Move _get_request_group_mapping() to RequestSpec https://review.opendev.org/696541 | 11:17 |
*** rpittau is now known as rpittau|bbl | 11:18 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Extend NeutronFixture to handle multiple bindings https://review.opendev.org/696246 | 11:19 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Do not mock setup net and migrate inst in NeutronFixture https://review.opendev.org/696247 | 11:19 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Move _get_request_group_mapping() to RequestSpec https://review.opendev.org/696541 | 11:19 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Move _update_pci_request_spec_with_allocated_interface_name https://review.opendev.org/696574 | 11:24 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Support live migration with qos ports https://review.opendev.org/695905 | 11:25 |
*** sorrison has joined #openstack-nova | 11:26 | |
*** dpawlik has quit IRC | 11:32 | |
*** sorrison has quit IRC | 11:34 | |
*** dpawlik has joined #openstack-nova | 11:36 | |
*** awalende_ has joined #openstack-nova | 11:37 | |
*** sorrison has joined #openstack-nova | 11:38 | |
*** bhagyashris has quit IRC | 11:40 | |
*** trident has quit IRC | 11:40 | |
*** awalende has quit IRC | 11:41 | |
*** sorrison has quit IRC | 11:42 | |
*** trident has joined #openstack-nova | 11:43 | |
*** sorrison has joined #openstack-nova | 11:44 | |
*** tbachman has quit IRC | 11:48 | |
*** sorrison has quit IRC | 11:49 | |
*** ociuhandu has joined #openstack-nova | 11:50 | |
*** sorrison has joined #openstack-nova | 11:52 | |
*** lpetrut has quit IRC | 11:53 | |
*** zbr_ has quit IRC | 11:55 | |
*** sorrison has quit IRC | 11:56 | |
*** zbr has joined #openstack-nova | 11:57 | |
*** boxiang has quit IRC | 11:59 | |
*** sorrison has joined #openstack-nova | 12:00 | |
*** boxiang has joined #openstack-nova | 12:00 | |
*** sorrison has quit IRC | 12:04 | |
*** mkrai has joined #openstack-nova | 12:10 | |
*** sorrison has joined #openstack-nova | 12:13 | |
*** mkrai has quit IRC | 12:14 | |
*** sorrison has quit IRC | 12:18 | |
openstackgerrit | sean mooney proposed openstack/nova master: support pci numa affinity policies in flavor and image https://review.opendev.org/674072 | 12:24 |
*** sorrison has joined #openstack-nova | 12:24 | |
*** ociuhandu has quit IRC | 12:26 | |
openstackgerrit | Huachang Wang proposed openstack/nova-specs master: Use PCPU and VCPU in one instance https://review.opendev.org/668656 | 12:27 |
*** ociuhandu has joined #openstack-nova | 12:28 | |
*** sorrison has quit IRC | 12:28 | |
*** yedongcan has left #openstack-nova | 12:32 | |
*** macz has joined #openstack-nova | 12:33 | |
*** sorrison has joined #openstack-nova | 12:37 | |
*** macz has quit IRC | 12:38 | |
*** sorrison has quit IRC | 12:42 | |
*** udesale has quit IRC | 12:46 | |
*** udesale has joined #openstack-nova | 12:47 | |
*** shilpasd has joined #openstack-nova | 12:49 | |
*** sorrison has joined #openstack-nova | 12:54 | |
*** sorrison has quit IRC | 12:58 | |
*** tbachman has joined #openstack-nova | 13:04 | |
*** dasp has quit IRC | 13:12 | |
*** dasp has joined #openstack-nova | 13:12 | |
*** mgariepy has quit IRC | 13:12 | |
*** sorrison has joined #openstack-nova | 13:13 | |
*** mgariepy has joined #openstack-nova | 13:15 | |
*** sorrison has quit IRC | 13:23 | |
*** ociuhandu has quit IRC | 13:26 | |
*** ociuhandu has joined #openstack-nova | 13:27 | |
*** ociuhandu has quit IRC | 13:33 | |
*** sorrison has joined #openstack-nova | 13:35 | |
*** lpetrut has joined #openstack-nova | 13:40 | |
*** rpittau|bbl is now known as rpittau | 13:46 | |
*** sorrison has quit IRC | 13:46 | |
*** nweinber has joined #openstack-nova | 13:48 | |
*** ygk_12345 has joined #openstack-nova | 13:50 | |
*** sorrison has joined #openstack-nova | 13:53 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Move _update_pci_request_spec_with_allocated_interface_name https://review.opendev.org/696574 | 13:55 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Support live migration with qos ports https://review.opendev.org/695905 | 13:55 |
*** sorrison has quit IRC | 13:58 | |
*** eharney has quit IRC | 13:58 | |
*** ygk_12345 has quit IRC | 13:59 | |
*** dave-mccowan has joined #openstack-nova | 14:02 | |
*** mlavalle has joined #openstack-nova | 14:04 | |
*** ygk_12345 has joined #openstack-nova | 14:05 | |
*** tbachman has quit IRC | 14:05 | |
*** tbachman has joined #openstack-nova | 14:06 | |
*** sorrison has joined #openstack-nova | 14:08 | |
*** sorrison has quit IRC | 14:12 | |
*** mriedem has joined #openstack-nova | 14:12 | |
ygk_12345 | hi all | 14:13 |
ygk_12345 | i am seeing broken pipe errors in the spice nova console logs | 14:13 |
*** tinwood_ is now known as tinwood | 14:13 | |
ygk_12345 | the vm console is not taking the keyboard input properly and it is gibberish | 14:13 |
ygk_12345 | can anyone hepl me please ? | 14:14 |
ygk_12345 | *help | 14:14 |
*** shilpasd has quit IRC | 14:14 | |
*** liuyulong has joined #openstack-nova | 14:14 | |
ygk_12345 | any expert here in the spice console service ? | 14:14 |
*** shilpasd has joined #openstack-nova | 14:17 | |
*** bhagyashris has joined #openstack-nova | 14:23 | |
*** sorrison has joined #openstack-nova | 14:23 | |
ygk_12345 | can someone look into this please | 14:23 |
ygk_12345 | https://bugs.launchpad.net/nova/+bug/1854950 | 14:23 |
openstack | Launchpad bug 1854950 in OpenStack Compute (nova) "VM console not clear" [Undecided,New] | 14:23 |
*** trident has quit IRC | 14:24 | |
rouk | im no expert but i dont know how a broken pipe would be the issue, broken pipe is usually client going away. | 14:25 |
*** trident has joined #openstack-nova | 14:25 | |
ygk_12345 | the user input is breaking into lines and not clear what they are typing into the console | 14:26 |
*** tbachman_ has joined #openstack-nova | 14:26 | |
*** jhesketh has quit IRC | 14:27 | |
*** tbachman has quit IRC | 14:27 | |
*** tbachman_ is now known as tbachman | 14:27 | |
*** jhesketh has joined #openstack-nova | 14:28 | |
mriedem | do we have any maintainers for the vmware driver anymore? if so, would be good for them to check https://review.opendev.org/#/c/696503/ which removes nova-net support from the driver. | 14:28 |
*** bhagyashris has quit IRC | 14:29 | |
*** sorrison has quit IRC | 14:30 | |
*** pcaruana has quit IRC | 14:33 | |
ygk_12345 | those who want to see the issue, I am pasting it here | 14:34 |
ygk_12345 | Uploaded file: https://uploads.kiwiirc.com/files/ee624c90d6211419f9c7f626d11e6aad/Screenshot%20from%202019-12-03%2020-03-58.png | 14:35 |
*** tbachman has quit IRC | 14:37 | |
*** amodi has quit IRC | 14:37 | |
*** tbachman has joined #openstack-nova | 14:37 | |
*** dave-mccowan has quit IRC | 14:42 | |
*** mdbooth has quit IRC | 14:43 | |
*** sorrison has joined #openstack-nova | 14:44 | |
*** mdbooth has joined #openstack-nova | 14:44 | |
ygk_12345 | can anyone check my issue please ? | 14:45 |
ygk_12345 | https://bugs.launchpad.net/nova/+bug/1854950 | 14:45 |
openstack | Launchpad bug 1854950 in OpenStack Compute (nova) "VM spice console not clear" [Undecided,New] | 14:45 |
*** sorrison has quit IRC | 14:48 | |
mriedem | stephenfin: question in https://review.opendev.org/#/c/696505/ - the xenapi nova-net removal change; i think there is a firewall driver we can also remove but maybe handle it in a follow up | 14:50 |
*** artom has quit IRC | 14:50 | |
stephenfin | mriedem: You mean these firewall drivers? https://review.opendev.org/#/c/696514/ | 14:50 |
mriedem | you cheeky monkey | 14:51 |
* mriedem assumes he used that properly | 14:51 | |
sean-k-mooney | yes you did which is not something i associate with the us | 14:51 |
mriedem | b/c it's not | 14:52 |
sean-k-mooney | thats more of a british/irish thing that grandparents say to little children when they got away with something | 14:52 |
ygk_12345 | is anyone here familiar with the nova spice console ? | 14:53 |
sean-k-mooney | i think we all have deployed it at different times. are you having a specific issue | 14:54 |
ygk_12345 | sean-k-mooney this one https://bugs.launchpad.net/nova/+bug/1854950 | 14:54 |
openstack | Launchpad bug 1854950 in OpenStack Compute (nova) "VM spice console not clear" [Undecided,New] | 14:54 |
sean-k-mooney | that is strange i dont think i have ever seen https://launchpadlibrarian.net/454102031/Screenshot%20from%202019-12-03%2020-03-58.png | 14:56 |
sean-k-mooney | stephenfin: do you recall if we still use websockify with the spice console | 14:57 |
ygk_12345 | when a use presses even an ENTER key it is splitting into lines and dots | 14:57 |
stephenfin | we do | 14:57 |
ygk_12345 | *user | 14:57 |
*** mkrai has joined #openstack-nova | 14:57 | |
*** igordc has joined #openstack-nova | 14:57 | |
*** ociuhandu has joined #openstack-nova | 14:58 | |
sean-k-mooney | ygk_12345: it looks like the data is being corrupted and the broken pipes would lead me to believe this is why the console is corrupted | 14:58 |
sean-k-mooney | im wondering if this could be a websockify issue | 14:58 |
sean-k-mooney | we had this bug back in august https://bugs.launchpad.net/nova/+bug/1840788 | 15:00 |
openstack | Launchpad bug 1840788 in OpenStack Compute (nova) "websockify-0.9.0 breaks tempest tests" [Undecided,In progress] - Assigned to melanie witt (melwitt) | 15:00 |
sean-k-mooney | ygk_12345: what version of websockify do you have installed? | 15:00 |
ygk_12345 | sean-k-mooney how to check it ? | 15:00 |
sean-k-mooney | how did you install | 15:00 |
ygk_12345 | sean-k-mooney openstack ansible rocky 18.1.9 branch | 15:01 |
sean-k-mooney | ok did you do the package install or the source install | 15:01 |
*** mlavalle has quit IRC | 15:01 | |
ygk_12345 | sean-k-mooney how do I determine that ? I just ran the playbooks | 15:02 |
sean-k-mooney | mnaser: what is the default install mode for openstack ansible in rocky? | 15:02 |
ygk_12345 | followed the deployment guide as usual | 15:02 |
*** martinkennelly has quit IRC | 15:02 | |
mnaser | sean-k-mooney: default is source inside containers | 15:02 |
sean-k-mooney | mnaser: so to check websockify version ygk_12345 would have to ssh into the lxc container then do a pip freeze in the virtual env? | 15:03 |
*** pcaruana has joined #openstack-nova | 15:04 | |
ygk_12345 | mnaser exact command please | 15:04 |
*** sorrison has joined #openstack-nova | 15:04 | |
mnaser | right they can hop into the lxc container (using lxc-attach too) and /openstack/venvs/nova-$version/bin/pip freeze | 15:04 |
ygk_12345 | mnaser ok | 15:05 |
sean-k-mooney | ygk_12345: that may not be the error but if you are using 0.9.0 then its possible. resolving the broken pipes will likely fix the issue but first you need to figure out why that happens | 15:06 |
sean-k-mooney | that is outside the scope of nova | 15:06 |
ygk_12345 | sean-k-mooney websockify==0.8.0 | 15:07 |
sean-k-mooney | ok so that should hopefully be ok | 15:07 |
ygk_12345 | sean-k-mooney how to proceed now ? | 15:08 |
*** eharney has joined #openstack-nova | 15:08 | |
sean-k-mooney | you need to determine what is causing the broken pipes | 15:08 |
ygk_12345 | any clues ? | 15:09 |
*** sorrison has quit IRC | 15:09 | |
sean-k-mooney | other than looking at the websockify logs and journalctl not really but perhaps someone else has an idea | 15:10 |
*** udesale has quit IRC | 15:10 | |
*** udesale has joined #openstack-nova | 15:10 | |
mriedem | stephenfin: i jumped a bit but there are some nits in https://review.opendev.org/#/c/696511/4 if you want to FUP or if you end up needing to rev the series | 15:13 |
*** sorrison has joined #openstack-nova | 15:14 | |
*** bhagyashris has joined #openstack-nova | 15:14 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: WIP: log when loading security group driver https://review.opendev.org/652783 | 15:14 |
stephenfin | coolness | 15:16 |
ygk_12345 | sean-k-mooney we have another similar setup, there also I observe broken pipe errors but the console is functioning fine there | 15:17 |
*** igordc has quit IRC | 15:17 | |
ygk_12345 | sean-k-mooney do u think the problem is with network issues ? | 15:17 |
*** bhagyashris has quit IRC | 15:19 | |
sean-k-mooney | dansmith: by the way after our conversation with sundar yesterday i decided to look at the option of emulating pci devices in the kernel again. we may be able to use the PCI Endpoint Framework https://www.kernel.org/doc/html/latest/PCI/endpoint/index.html to create devices we could use for pci passthrough testing and cyborg testing | 15:19 |
dansmith | sean-k-mooney: sweet | 15:19 |
sean-k-mooney | i need to play around with it as the test driver is not compiled in to the ubuntu kernel modules extra so ill need to see how to compile it but ill let you know how it goes | 15:20 |
dansmith | cool | 15:20 |
sean-k-mooney | it should allow us to create pci devices by creating folders within /sys | 15:20 |
*** sorrison has quit IRC | 15:20 | |
sean-k-mooney | https://www.kernel.org/doc/html/latest/PCI/endpoint/pci-test-howto.html#creating-pci-epf-test-device | 15:21 |
sean-k-mooney | if it works we should be able to set the vendor id and product id to something like 1234:42 then assert that a pci device with that vendor and product id is available and passed to the guest | 15:22 |
*** munimeha1 has joined #openstack-nova | 15:22 | |
*** sorrison has joined #openstack-nova | 15:23 | |
*** igordc has joined #openstack-nova | 15:25 | |
stephenfin | mriedem: Dumb question, but the 'device_id' in a neutron port response will always be the instance's UUID, not the ID, right? | 15:25 |
*** ygk_12345 has quit IRC | 15:25 | |
*** bhagyashris has joined #openstack-nova | 15:25 | |
*** awalende_ has quit IRC | 15:25 | |
*** awalende has joined #openstack-nova | 15:26 | |
sean-k-mooney | stephenfin: yes it will never be the db short id | 15:28 |
*** sorrison has quit IRC | 15:28 | |
stephenfin | ta | 15:28 |
*** igordc has quit IRC | 15:30 | |
*** awalende has quit IRC | 15:30 | |
dansmith | stephenfin: if it were the db id we wouldn't be able to distinguish one instance over another across cells | 15:30 |
mriedem | we also don't expose the server primary key id out of the rest api | 15:31 |
sean-k-mooney | implying that if it was the primary key id, neutron would not be able to use it in api queries | 15:31 |
sean-k-mooney | we did expose the hypervisor primary key although i think that was by mistake. | 15:32 |
mriedem | it wasn't by mistake for hypervisors, | 15:32 |
mriedem | they originally didn't have uuids | 15:32 |
mriedem | same with services and lots of other things | 15:32 |
sean-k-mooney | ah ok | 15:32 |
*** bhagyashris has quit IRC | 15:35 | |
*** tesseract has quit IRC | 15:37 | |
*** sorrison has joined #openstack-nova | 15:40 | |
*** ociuhandu has quit IRC | 15:42 | |
*** ociuhandu has joined #openstack-nova | 15:43 | |
*** artom has joined #openstack-nova | 15:46 | |
*** tesseract has joined #openstack-nova | 15:48 | |
*** ociuhandu has quit IRC | 15:48 | |
*** ociuhandu has joined #openstack-nova | 15:49 | |
*** sorrison has quit IRC | 15:50 | |
*** jmlowe has quit IRC | 15:52 | |
*** aloga has joined #openstack-nova | 15:55 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Cache security group driver https://review.opendev.org/697122 | 15:57 |
*** sorrison has joined #openstack-nova | 15:57 | |
*** sorrison has quit IRC | 16:02 | |
*** mkrai has quit IRC | 16:02 | |
*** jmlowe has joined #openstack-nova | 16:09 | |
*** sorrison has joined #openstack-nova | 16:09 | |
*** sorrison has quit IRC | 16:14 | |
*** ivve has quit IRC | 16:15 | |
*** udesale has quit IRC | 16:17 | |
*** gyee has joined #openstack-nova | 16:19 | |
*** dave-mccowan has joined #openstack-nova | 16:19 | |
*** jamesdenton has quit IRC | 16:25 | |
dansmith | anybody want to +W this so Sundar can just rebase on master? https://review.opendev.org/#/c/695985/ | 16:26 |
dansmith | s/rebase/rebase the cyborg stuff/ | 16:26 |
mriedem | gibi: i replied to your questions in https://review.opendev.org/#/c/637058/, thanks. i'll stack a change on top of the series to see what replacing that setup_networks_on_host hack would look like if we just implement port binding deleting in cleanup_instance_network_on_host | 16:27 |
*** dpawlik has quit IRC | 16:27 | |
mriedem | dansmith: looking | 16:28 |
*** sorrison has joined #openstack-nova | 16:29 | |
gibi | mriedem: ack, thanks | 16:30 |
*** sorrison has quit IRC | 16:37 | |
*** lpetrut has quit IRC | 16:38 | |
*** damien_r has quit IRC | 16:40 | |
*** ociuhandu has quit IRC | 16:41 | |
*** sorrison has joined #openstack-nova | 16:44 | |
openstackgerrit | Thierry Carrez proposed openstack/nova master: Remove unused rootwrap filters https://review.opendev.org/697134 | 16:46 |
*** sorrison has quit IRC | 16:48 | |
*** ccamacho has quit IRC | 16:48 | |
*** hamzy has quit IRC | 16:53 | |
*** sorrison has joined #openstack-nova | 16:54 | |
*** sorrison has quit IRC | 17:02 | |
*** tesseract has quit IRC | 17:03 | |
*** jlvillal has joined #openstack-nova | 17:09 | |
*** sorrison has joined #openstack-nova | 17:10 | |
*** rpittau is now known as rpittau|afk | 17:12 | |
*** jlvillal has quit IRC | 17:14 | |
*** jlvillal has joined #openstack-nova | 17:14 | |
*** sorrison has quit IRC | 17:15 | |
*** sorrison has joined #openstack-nova | 17:21 | |
*** mlavalle has joined #openstack-nova | 17:21 | |
*** sorrison has quit IRC | 17:28 | |
*** sorrison has joined #openstack-nova | 17:30 | |
*** dtantsur is now known as dtantsur|afk | 17:31 | |
*** sorrison has quit IRC | 17:36 | |
dansmith | eandersson: what perf improvements in nova specifically have shifted your scale focus away from nova? | 17:36 |
*** dpawlik has joined #openstack-nova | 17:37 | |
*** sorrison has joined #openstack-nova | 17:37 | |
efried | dansmith: Would you please have another look at the vTPM spec. I'd like to get your +2 before you bugger off til 2020. https://review.opendev.org/#/c/686804/ | 17:40 |
*** links has quit IRC | 17:40 | |
dansmith | efried: I've skimmed it.. tbh, I'm really like -0.9 on it, so if you want me to review it, it's probably going to be not helpful to your effort | 17:40 |
dansmith | I'm not sure I see the benefit outweighing the nightmare of users trying to predict the behavior | 17:41 |
*** dpawlik has quit IRC | 17:41 | |
*** sorrison has quit IRC | 17:41 | |
dansmith | I was thinking it was required for secure boot, but talking with folks I realize it's not, so ... it's just hard to justify I think | 17:41 |
*** jhesketh has quit IRC | 17:42 | |
*** jhesketh has joined #openstack-nova | 17:44 | |
sean-k-mooney | for what its worth the out of tree hyperv driver has supported vtpm for 3-4 years already | 17:45 |
sean-k-mooney | its part of their shielded vms thing but i have no idea how they solved any of the issues with move operations | 17:45 |
efried | did they? | 17:45 |
sean-k-mooney | yep ill get the link. did i not send this to you already | 17:46 |
dansmith | sean-k-mooney: given how hyperv moves things I expect it's tracked by the hypervisor and so it just works, but I could be wrong | 17:46 |
sean-k-mooney | https://github.com/openstack/compute-hyperv/commit/f37ce8b6bb0eb88a367239698ba7c3df3b64db38 | 17:46 |
dansmith | the question is more about how things like snapshot works | 17:47 |
sean-k-mooney | they have an os_vtpm image property | 17:47 |
sean-k-mooney | they must have patches against nova because the ImageMetaProps object does not accept os_vtpm or os_shielded_vm | 17:50 |
*** sorrison has joined #openstack-nova | 17:52 | |
dansmith | looks like maybe they're doing some mangling of instance.hostname to reference specific keys or something? | 17:53 |
dansmith | and they're illegally stashing their own stuff in instance.metadata | 17:53 |
dansmith | so all manner of hacks in that implementation | 17:53 |
sean-k-mooney | yep | 17:53 |
sean-k-mooney | and none of this is supported in the in tree one | 17:53 |
*** sorrison has quit IRC | 17:56 | |
sean-k-mooney | im looking at their snapshot function now https://github.com/openstack/compute-hyperv/blob/cb203978f262f31790592b2a0692fc2acaaef33d/compute_hyperv/nova/snapshotops.py#L66 but i think its just of the image so they dont snapshot the tpm? | 17:56 |
dansmith | I dunno, but since they've already broken many rules of user interaction, looking further into how they handle it doesn't seem like it would help guide us | 17:57 |
sean-k-mooney | it probably won't but it looks like they just ignored it | 17:57 |
sean-k-mooney | so i guess their stance was the tpm state would not be restored if you restored from a snapshot | 17:58 |
sean-k-mooney | the same way non root disks are not restored | 17:58 |
dansmith | or there is magic in the hostname mangling stuff so that if you recreate an instance with the right magic hostname, it will re-get the tpm stored in the hypervisor? | 17:59 |
dansmith | given that they basically implement features by bounty in that out of tree driver, I expect they implemented just the part of the solution that would suffice for the one customer asking for it | 18:00 |
sean-k-mooney | i don't think they are mangling the path in the metadata but yes, i would guess the customer did not ask for snapshotting so they did not add it | 18:01 |
efried | To me, the only severe issue is evacuate. The rest should be easy to understand with proper documentation. | 18:02 |
sean-k-mooney | i think they were using the vtpm solely for disk encryption | 18:02 |
efried | "vTPM is tied to your instance. Don't expect to be able to snapshot and clone it." | 18:03 |
*** ociuhandu has joined #openstack-nova | 18:04 | |
efried | "You can back up and restore as long as you use 'backup'." | 18:04 |
sean-k-mooney | ? you use backup | 18:04 |
*** ociuhandu has quit IRC | 18:05 | |
sean-k-mooney | you mean as long as you rebuild from a snapshot and don't launch a new instance, right? | 18:05 |
efried | No. I mean as long as you use rebuild from an image created via the createBackup server action | 18:05 |
sean-k-mooney | right but does that not just create a snapshot? | 18:06 |
efried | 'snapshot' isn't a server action. There's createImage and createBackup. Both of them snapshot under the covers. | 18:06 |
sean-k-mooney | oh right | 18:06 |
sean-k-mooney | but createBackup uploads it as a snapshot to glance | 18:06 |
efried | point is, the image you get from createImage can be (non-awkwardly) used to clone your VM. Whereas createBackup is more tailored to rebuild. | 18:07 |
sean-k-mooney | does createImage upload as a snapshot? i thought it flattened the image | 18:07 |
sean-k-mooney | ok | 18:07 |
*** ociuhandu has joined #openstack-nova | 18:07 | |
efried | rebuild is your same instance. Otherwise you're creating a new instance. If you're creating a new instance, it will get a new (or no) vTPM. | 18:07 |
efried | so the only quirk is that in order for rebuild to restore your vTPM, you have to use an image from createBackup. | 18:08 |
efried | which makes sense | 18:08 |
efried | if you use a random image from createImage, why would you expect your vTPM from your original instance to be restored? | 18:08 |
efried | conversely, if you're creating a brand new instance (from *any* image), why would you expect anything other than a fresh vTPM? | 18:09 |
sean-k-mooney | ya so i said previously that the vtpm should be tied to the lifetime of the instance so i think that makes sense | 18:09 |
efried | we've got an answer for all the move operations and shelve/unshelve. | 18:10 |
sean-k-mooney | we are explicitly saying the use case of "i create a vm snapshot and spawn 100 more" is expressly out of scope (in that they won't all get the original vm's vtpm) | 18:10 |
*** sorrison has joined #openstack-nova | 18:11 | |
efried | I wouldn't phrase it as "out of scope". It's got clean, predictable behavior: you don't get the original VM's vTPM. | 18:11 |
sean-k-mooney | yes that is what i intended | 18:11 |
efried | which, for anyone using a vTPM, ought to make sense. "don't give my secure stuff to another instance" | 18:11 |
sean-k-mooney | copying the vtpm in that case is not supported intentionally | 18:11 |
*** ociuhandu has quit IRC | 18:12 | |
efried | yeah, I'm just getting nitpicky about the language. "out of scope" and "not supported" imply "don't work". | 18:12 |
sean-k-mooney | so i would be ok with that set of constraints but i'm not really the person you have to convince | 18:12 |
eandersson | dansmith so we jumped from mitaka to rocky | 18:12 |
eandersson | So it's difficult to say exactly what part improved performance the most | 18:13 |
eandersson | but we believe placement played a big role here | 18:13 |
dansmith | eandersson: well, is it scheduling performance, instance listing performance, build time, etc? | 18:13 |
eandersson | scheduling is a lot better, at least ~50% | 18:13 |
dansmith | sweet, so that's likely placement yeah | 18:13 |
eandersson | Also scaling computes | 18:13 |
dansmith | eandersson: what does "scaling computes" mean? | 18:14 |
eandersson | We initially limited ourselves to 1000 computes per region | 18:14 |
eandersson | in Mitaka | 18:14 |
eandersson | but now believe we can reach a much higher number than that | 18:14 |
dansmith | eandersson: is that via cells or just in general? because I don't think we've done anything other than get *more* chatty to the db/mq since mitaka :) | 18:14 |
eandersson | And even if we do hit limitations we can now use cells | 18:14 |
*** jmlowe has quit IRC | 18:15 | |
eandersson | scheduler used to be really heavy on rmq | 18:15 |
eandersson | in mitaka | 18:15 |
*** igordc has joined #openstack-nova | 18:15 | |
sean-k-mooney | eandersson: do you run a separate rmq instance per openstack service? | 18:15 |
*** jmlowe has joined #openstack-nova | 18:15 | |
eandersson | We do not | 18:16 |
*** sorrison has quit IRC | 18:16 | |
eandersson | but even at 1k computes we are barely putting any stress on rmq at the moment | 18:16 |
eandersson | Of course we are not using ceilometer | 18:16 |
sean-k-mooney | i'm surprised you are getting to 1000 nodes on one cluster with neutron and nova sharing it | 18:16 |
dansmith | eandersson: so you're thinking that lowered load on rabbit from the scheduler lets you have more computes? | 18:16 |
sean-k-mooney | ya not using ceilometer helps | 18:16 |
eandersson | dansmith I think it helps | 18:17 |
dansmith | ack | 18:17 |
*** ociuhandu has joined #openstack-nova | 18:18 | |
eandersson | We were getting a lot of slow api calls in mitaka and they are all very consistent now | 18:18 |
*** maciejjozefczyk has quit IRC | 18:18 | |
eandersson | The only problems we have with nova now is getting our super custom scheduling logic to scale | 18:19 |
dansmith | that's what I'm interested in specifically, | 18:19 |
dansmith | but without knowing which calls those were I can't really attribute them to anything | 18:19 |
dansmith | (that == slow api calls) | 18:19 |
eandersson | Yea - unfortunately a lot of the research and testing we did was back in ~2016 | 18:20 |
eandersson | We didn't do a great job tracking individual improvements when going from Mitaka to Rocky | 18:21 |
sean-k-mooney | eandersson: do you have a list of constraints you need to schedule on that are not supported cleanly upstream that could be shared? perhaps we could accommodate some of your custom logic | 18:21 |
eandersson | We do flavor stacking and what... we internally call "perfect fit". | 18:22 |
sean-k-mooney | using the type affinity filter | 18:22 |
mriedem | you also have a variant of the old flavor affinity filter yeah? | 18:23 |
sean-k-mooney | so that each host only has one flavor | 18:23 |
eandersson | So we actually want to stack flavors on a compute | 18:23 |
eandersson | because these are game servers | 18:23 |
eandersson | So each game server takes up one numa | 18:23 |
eandersson | but we still want to be able to schedule other micro services on top | 18:24 |
*** jmlowe has quit IRC | 18:24 | |
*** eharney has quit IRC | 18:24 | |
eandersson | Since we don't want to have to divide the fleet | 18:24 |
sean-k-mooney | so you want to pack the large flavor and then fit the micro services where they can | 18:24 |
eandersson | Yep | 18:24 |
sean-k-mooney | that's not really unreasonable to be fair | 18:25 |
sean-k-mooney | the main issue i guess you face right now is fragmentation | 18:25 |
eandersson | Yea - if we go with the out of the box implementation | 18:26 |
sean-k-mooney | e.g. a small instance spawns, preventing a large instance | 18:26 |
eandersson | yep | 18:26 |
dansmith | but also...scheduler filters are the one place I think we *should* be pluggable, and so unless other people want *exactly* the same weird scheduling thing, it makes sense for them to do this on their own, IMHO | 18:26 |
eandersson | Yep - I hate that I have to patch nova for this | 18:26 |
eandersson | I mean we have a super custom way of deploying, plus probably 20 custom nova patches at least | 18:26 |
sean-k-mooney | this is not the first time i have heard this request however | 18:27 |
*** awalende has joined #openstack-nova | 18:27 | |
*** tbachman has quit IRC | 18:27 | |
eandersson | So it's not a big deal, but would be a lot easier for us to manage it if it was pluggable | 18:27 |
dansmith | scheduler filters *are* pluggable | 18:27 |
sean-k-mooney | eandersson: as are the weighers | 18:27 |
dansmith | presumably you're dependent on other changes? | 18:27 |
eandersson | They are? | 18:27 |
sean-k-mooney | yep | 18:27 |
sean-k-mooney | but it's non-obvious how to do it | 18:28 |
*** sorrison has joined #openstack-nova | 18:28 | |
dansmith | it's very obvious | 18:28 |
dansmith | it may not be _documented_ :) | 18:28 |
eandersson | btw we also have weights for upgrading computes (e.g. a compute that needs an OS upgrade would be moved into an aggregate to reduce the chances of it getting scheduled to) | 18:28 |
dansmith | https://docs.openstack.org/nova/latest/user/filter-scheduler.html#writing-your-own-filter | 18:28 |
sean-k-mooney | eandersson: here is an example https://opendev.org/x/nfv-filters | 18:29 |
*** derekh has quit IRC | 18:29 | |
eandersson | Ah yea I see | 18:30 |
*** awalende has quit IRC | 18:31 | |
sean-k-mooney | you just do filter_scheduler.available_filters=nova.scheduler.filters.all_filters,nfv_filters.nova.scheduler.filters.aggregate_instance_type_filter | 18:31 |
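The out-of-tree filter setup described above can be sketched roughly as follows. This is a minimal illustration, not eandersson's actual "perfect fit" logic: the `Flavor`, `RequestSpec` and `HostState` dataclasses are stand-ins for the real nova objects so the block runs without nova installed, and the filter name and `LARGE_VCPUS` threshold are hypothetical. In a real deployment the class would subclass `nova.scheduler.filters.BaseHostFilter` and its module path would be appended to `available_filters` as in the line above.

```python
from dataclasses import dataclass


@dataclass
class Flavor:
    # Stand-in for nova.objects.Flavor; only vcpus is modelled.
    vcpus: int


@dataclass
class RequestSpec:
    # Stand-in for nova.objects.RequestSpec; only the flavor is modelled.
    flavor: Flavor


@dataclass
class HostState:
    # Stand-in for the scheduler's per-host state object.
    vcpus_total: int
    vcpus_used: int


class ReserveForLargeFlavorFilter:
    """Hypothetical filter that keeps headroom for one large flavor per host.

    It exposes the same host_passes(host_state, spec_obj) hook that nova
    scheduler filters implement.
    """

    # Assumed size of a "large" flavor, chosen only for this example.
    LARGE_VCPUS = 16

    def host_passes(self, host_state, spec_obj):
        free = host_state.vcpus_total - host_state.vcpus_used
        requested = spec_obj.flavor.vcpus
        if requested >= self.LARGE_VCPUS:
            # Large flavors only need to fit on the host.
            return free >= requested
        # Small flavors must still leave room for one large instance,
        # which limits the fragmentation problem discussed above.
        return free - requested >= self.LARGE_VCPUS
```

With 20 free vCPUs, this sketch accepts a 16-vCPU or 4-vCPU request but rejects an 8-vCPU one, because the latter would leave too little room for a later large instance.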
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Kill it https://review.opendev.org/696518 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Rename 'nova.network.neutronv2' -> 'nova.network' https://review.opendev.org/696745 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Rename 'nova.network.security_group.neutron_driver' -> 'nova.network.security_group' https://review.opendev.org/696746 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove unnecessary 'neutronv2' prefixes https://review.opendev.org/696776 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove unused exceptions https://review.opendev.org/697149 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove db methods for ProviderMethod https://review.opendev.org/697150 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove unused 'stub_out_db_network_api' https://review.opendev.org/697151 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove remaining nova-network quotas https://review.opendev.org/697152 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove use of legacy 'FloatingIP' object https://review.opendev.org/697153 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove use of legacy 'Network' object https://review.opendev.org/697154 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove use of legacy 'SecurityGroup' object https://review.opendev.org/697155 | 18:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: nova-net: Remove unused nova-network objects https://review.opendev.org/697156 | 18:31 |
sean-k-mooney | you can similarly have out-of-tree weighers that you load in the same way | 18:32 |
*** sorrison has quit IRC | 18:32 | |
sean-k-mooney | dansmith: by the way this is actually a lot simpler than i remembered. i was thinking of out-of-tree virt drivers, which require you to reopen the nova namespace and do other things to make them work | 18:34 |
sean-k-mooney | dansmith: but we don't really want that to be pluggable in the same way so it makes sense that it is more work | 18:34 |
dansmith | yup | 18:34 |
*** tbachman has joined #openstack-nova | 18:35 | |
*** abaindur has joined #openstack-nova | 18:37 | |
*** ociuhandu has quit IRC | 18:39 | |
*** ociuhandu has joined #openstack-nova | 18:40 | |
*** sorrison has joined #openstack-nova | 18:40 | |
*** jmlowe has joined #openstack-nova | 18:40 | |
*** abaindur has quit IRC | 18:42 | |
*** sorrison has quit IRC | 18:44 | |
*** ociuhandu has quit IRC | 18:44 | |
artom | Do we not reset the old flavor back on the instance if we fail a resize? Fail as in, something goes wrong during _prep_resize or resize_instance | 18:49 |
*** sorrison has joined #openstack-nova | 18:49 | |
efried | gosh, I would hope we do | 18:49 |
artom | I can't find it - maybe I'm looking in the wrong place... | 18:50 |
artom | I mean, we must | 18:50 |
sean-k-mooney | in pre_resize i don't think we have saved the instance yet | 18:50 |
artom | Oh, is that how? | 18:51 |
sean-k-mooney | i have not looked at that in a while however. have we saved it since old_flavor was defined | 18:51 |
artom | We just don't persist until it's final | 18:51 |
sean-k-mooney | well we would persist it before we go to resize verify | 18:51 |
*** abaindur has joined #openstack-nova | 18:51 | |
*** abaindur has quit IRC | 18:52 | |
sean-k-mooney | i would just check where it is saved first | 18:52 |
*** abaindur has joined #openstack-nova | 18:52 | |
artom | That's what I'm looking for... | 18:53 |
dansmith | not just save, but save() after instance.flavor is set, unrelated to instance.old_flavor | 18:53 |
*** sorrison has quit IRC | 18:54 | |
sean-k-mooney | i think this is where we revert it if we get to revert resize https://github.com/openstack/nova/blob/757fc03b78d542e7262343b65eacea02ce11dd04/nova/objects/instance.py#L1021-L1035 | 18:55 |
sean-k-mooney | but im not sure that is required if we fail early | 18:55 |
*** damien_r has joined #openstack-nova | 18:57 | |
sean-k-mooney | this is where we update the instance i think https://github.com/openstack/nova/blob/757fc03b78d542e7262343b65eacea02ce11dd04/nova/compute/manager.py#L5260-L5292 | 18:58 |
sean-k-mooney | so assuming we call _prep_resize before _finish_resize if you fail in _prep_resize im not sure you need to do anything | 18:59 |
*** hamzy has joined #openstack-nova | 18:59 | |
*** gmann is now known as gmann_afk | 19:00 | |
*** jamesdenton has joined #openstack-nova | 19:00 | |
sean-k-mooney | artom: is that what you were looking for? | 19:00 |
*** ralonsoh has quit IRC | 19:00 | |
artom | sean-k-mooney, I'm looking for where the request spec is reverted back to the old flavor | 19:02 |
artom | I said instance, didn't I? | 19:03 |
artom | I meant request spec | 19:03 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: WIP: Implement cleanup_instance_network_on_host for neutron API https://review.opendev.org/697162 | 19:03 |
dansmith | artom: compute can't reset the reqspec, so if it happens late enough, that won't happen | 19:07 |
dansmith | er, s/happens/fails/ | 19:07 |
dansmith | I think the only downside to that is if we do something that looks at the reqspec and not the instance, not sure what that would be | 19:08 |
artom | Sorry, was on a call | 19:17 |
* artom reads scrollback | 19:17 | |
artom | Context is https://review.opendev.org/#/c/662522/13 | 19:18 |
artom | So I guess scheduling looks at request spec | 19:19 |
*** awalende has joined #openstack-nova | 19:20 | |
eandersson | btw major annoyance at the moment is when instances get stuck in building | 19:21 |
eandersson | often you don't know which compute they are stuck building against | 19:21 |
*** awalende has quit IRC | 19:24 | |
mriedem | check the launched_on field on the instance - that wasn't cleaned up until recently | 19:24 |
mriedem | if instances are getting stuck in BUILD status somewhere when we failed, report a bug | 19:24 |
mriedem | like you did with bug 1837955 | 19:25 |
openstack | bug 1837955 in OpenStack Compute (nova) stein "MaxRetriesExceeded sometime fails with messaging exception" [Medium,Fix committed] https://launchpad.net/bugs/1837955 - Assigned to Matt Riedemann (mriedem) | 19:25 |
mriedem | efried: dustinc: what baremetal api version is used when we get a node's details now? https://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py#L226 | 19:26 |
mriedem | since we're not using the client_wrapper there i can't tell if it's 1.46 https://github.com/openstack/nova/blob/master/nova/virt/ironic/client_wrapper.py#L35 or something else | 19:27 |
efried | negotiated by sdk, no? | 19:27 |
dansmith | artom: so you mean a failed resize followed by a migration would use the new flavor as part of scheduling yeah? | 19:27 |
mriedem | based on the fields requested? | 19:27 |
efried | no, based on what's available | 19:28 |
efried | I think sdk does version discovery and uses the latest they mutually understand. mordred? | 19:28 |
mordred | uh | 19:28 |
*** ivve has joined #openstack-nova | 19:28 | |
mordred | reading | 19:28 |
dustinc | I don't know, I never had to deal with it so I _assume_ it is the most recent common version | 19:28 |
mriedem | is there a way to tell if the sdk will use 1.50+ to get the node.owner field? | 19:29 |
mriedem | ah yes | 19:29 |
mriedem | _max_microversion = '1.52' | 19:29 |
mordred | yes - that's right. it does discovery and gets the latest that sdk and the remote both understand | 19:29 |
mordred | so yeah - should be 1.50+ - assuming the ironic supports that | 19:30 |
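The negotiation mordred describes (use the latest microversion that both the SDK and the remote service understand) can be sketched as below. This is a simplified illustration and not openstacksdk's actual code: the `negotiate` and `parse` helpers are hypothetical names, and the server version ranges in the usage note are made up. Only the client maximum of '1.52' and the 1.50 requirement for node.owner come from the discussion above.

```python
def parse(version):
    """Turn a microversion string such as '1.52' into a comparable tuple."""
    major, minor = version.split('.')
    return int(major), int(minor)


def negotiate(client_max, server_min, server_max):
    """Return the highest microversion both sides support, or None.

    Hypothetical helper: the client is assumed to support everything up
    to client_max, the server everything in [server_min, server_max].
    """
    if parse(client_max) < parse(server_min):
        # The client is too old for this server; nothing in common.
        return None
    chosen = min(parse(client_max), parse(server_max))
    return '%d.%d' % chosen


def supports_node_owner(version):
    """node.owner needs baremetal API 1.50 or later, per the chat above."""
    return version is not None and parse(version) >= (1, 50)
```

Under these assumptions, a client capped at '1.52' talking to a server that tops out at '1.46' would settle on '1.46' and therefore would not see node.owner, while a server offering up to '1.58' would yield '1.52'.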
mriedem | ah but the sdk node object doesn't have an explicit owner property so i'd have to add that | 19:30 |
mordred | yah. dtantsur|afk ^^ | 19:30 |
efried | mriedem: right -- the sdk is kind of supposed to abstract away "microversion" and instead let you think about "feature" -- in this case node.owner. | 19:30 |
mriedem | sure. i didn't know it hard-coded all of the properties though. | 19:31 |
efried | so if you want $feature at $microversion, sdk has to be plumbed to a) use $microversion, and b) expose $feature. Then nova has to require the release of sdk that does that, etc. | 19:31 |
efried | oh. Yeah :) | 19:31 |
mriedem | makes my little wip hackaroo i was going to do this afternoon not so simple, but anyway | 19:32 |
mriedem | thanks | 19:32 |
artom | dansmith, I think so | 19:33 |
dustinc | https://docs.openstack.org/openstacksdk/latest/user/microversions.html | 19:33 |
mordred | mriedem: one day we'll get to the point where people will add support for a thing to sdk when they add it to a service and life will be magical | 19:33 |
artom | (Sorry, got ADHD'ed away to another thing) | 19:33 |
mordred | we are not at that day today | 19:33 |
dansmith | artom: might be valid to refresh the reqspec from the instance before we go into an operation like that | 19:33 |
mriedem | mordred: and osc et al :) | 19:34 |
*** martinkennelly has joined #openstack-nova | 19:34 | |
mriedem | efried: note the ..note:: rendering at the bottom of https://docs.openstack.org/openstacksdk/latest/user/microversions.html - tickles your rst ocd?! | 19:34 |
efried | aaaaaagh! | 19:35 |
artom | Hey, so I'm backporting https://review.opendev.org/#/c/619953/10/nova/conf/libvirt.py@719 internally | 19:36 |
artom | And wondering about changing the default to 'unique' | 19:36 |
efried | mordred: mriedem: https://review.opendev.org/697168 | 19:36 |
*** eharney has joined #openstack-nova | 19:36 | |
efried | ...and, you suck. | 19:36 |
artom | Can't imagine Windows would appreciate its sysinfo_serial changing under it | 19:37 |
*** dpawlik has joined #openstack-nova | 19:37 | |
artom | Am I just being paranoid about this? mriedem, dansmith, ^^ you guys are the approvers do you recall talking about this at all? | 19:37 |
mordred | efried: +A | 19:38 |
eandersson | mriedem I think there are at least a handful of bugs causing it for us | 19:40 |
eandersson | difficult to get a handle on the causes | 19:40 |
mriedem | artom: backporting features huh | 19:41 |
artom | mriedem, only backportable ones | 19:41 |
mriedem | are you asking about the serial changing on an existing guest? | 19:41 |
artom | mriedem, yeah | 19:41 |
mriedem | so change the default to 'auto' like it was? | 19:42 |
*** dpawlik has quit IRC | 19:42 | |
artom | mriedem, well, not sure it makes sense to go back at this point | 19:42 |
mriedem | there was quite a bit of discussion around that blueprint, including being able to override the host config with image meta or flavors but that was dropped | 19:42 |
eandersson | btw launched_on is always null | 19:42 |
artom | But... there's an impact, right? The default becomes 'unique', instance is hard-rebooted, boom, new serial | 19:42 |
artom | Linux is probably OK, but Windows... | 19:42 |
artom | So lyarwood just updated our proposed backport to not change the default | 19:43 |
artom | Do... do we need to do the same upstream? | 19:43 |
mriedem | there is an upgrade release note. i'd have to dig into all of the comments on that patch to determine if we changed the default later or not for good reason | 19:43 |
*** nweinber has quit IRC | 19:44 | |
mriedem | that was released in stein | 19:44 |
mriedem | changing the default from 2 releases back would be weird | 19:44 |
artom | Right, so it's already out there | 19:44 |
artom | I know | 19:44 |
*** nweinber has joined #openstack-nova | 19:44 | |
*** nweinber has quit IRC | 19:44 | |
*** nweinber has joined #openstack-nova | 19:45 | |
efried | mriedem: see gibi's comment at the bottom of https://review.opendev.org/#/c/696992/ -- is it possible the bw test was removed when you refactored to get rid of tempest-slow? | 19:45 |
eandersson | I think this is a new stuck-in-BUILD bug because I can't even find the instance in placement | 19:46 |
mriedem | efried: it should be run in the nova-next job | 19:48 |
efried | okay, will look for it there | 19:49 |
mriedem | i don't see it, checking something | 19:49 |
artom | I guess there's less impact when it's in a new release | 19:49 |
mriedem | artom: well we don't backport features upstream for a reason so sure | 19:49 |
artom | But in an existing release we definitely can't change the defaults | 19:50 |
mriedem | hey, how you want to break your enterprise users on queens is up to you :) | 19:50 |
artom | In a way that we can fix for loads of $$ | 19:50 |
mriedem | job security | 19:50 |
artom | 'zactly | 19:50 |
openstackgerrit | Eric Fried proposed openstack/nova master: Use Placement 1.34 (string suffixes & mappings) https://review.opendev.org/696418 | 19:52 |
openstackgerrit | Eric Fried proposed openstack/nova master: refactor: RequestGroup.is_empty() and .strip_zeros() https://review.opendev.org/696991 | 19:52 |
openstackgerrit | Eric Fried proposed openstack/nova master: Tie requester_id to RequestGroup suffix https://review.opendev.org/696946 | 19:52 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Use provider mappings from Placement (mostly) https://review.opendev.org/696992 | 19:52 |
mriedem | efried: so nova-next should run tempest api compute and scenario tests: | 19:52 |
mriedem | tempest_test_regex: ^tempest\.(scenario|api\.compute) | 19:52 |
mriedem | except these scenario tests: | 19:53 |
mriedem | tempest_black_regex: ^tempest.scenario.test_network | 19:53 |
mriedem | but that shouldn't hit on tempest/scenario/test_minbw_allocation_placement | 19:53 |
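mriedem's reading of the two nova-next regexes can be checked with a few lines of Python. This is only a sketch that assumes plain regex-search semantics for both the selection regex and the black regex; the test IDs other than the minbw one are illustrative examples, and `is_selected` is a hypothetical helper.

```python
import re

# The two regexes quoted above from the nova-next job definition.
TEST_REGEX = re.compile(r'^tempest\.(scenario|api\.compute)')
BLACK_REGEX = re.compile(r'^tempest.scenario.test_network')


def is_selected(test_id):
    """Return True if a test ID is selected and not blacklisted."""
    if not TEST_REGEX.search(test_id):
        return False
    return BLACK_REGEX.search(test_id) is None


# The minbw test matches the selection regex and not the black regex,
# so the regexes alone do not explain why it would not run.
print(is_selected(
    'tempest.scenario.test_minbw_allocation_placement'
    '.MinBwAllocationPlacementTest'))
```

The same helper reports that an ID starting with `tempest.scenario.test_network` is excluded, which is the only exclusion the black regex is meant to apply.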
mriedem | it does get run in tempest-slow in train though yeah https://zuul.opendev.org/t/openstack/build/ddb1fb60455d4f7681a9a377aaef63ab/log/job-output.txt#73583 | 19:54 |
eandersson | I am wondering if we are hitting some race condition | 19:55 |
eandersson | Because we create 2 VMs every 10 minutes and never get VMs stuck in BUILDING. | 19:55 |
eandersson | but our customers hit this very often with aggressive terraform deployments | 19:55 |
mriedem | efried: so we went from this in tempest-slow: | 19:55 |
mriedem | slow-serial run-test: commands[1] | tempest run --serial --regex '\[.*\bslow\b.*\]' --concurrency=2 --black-regex= | 19:55 |
mriedem | to this in nova-next: | 19:55 |
mriedem | all run-test: commands[1] | tempest run --regex '^tempest\.(scenario|api\.compute)' --concurrency=4 '--black-regex=^tempest.scenario.test_network' | 19:56 |
mriedem | i'm not sure why test_minbw_allocation_placement would be filtered out | 19:56 |
mriedem | oh i see | 19:56 |
mriedem | https://zuul.opendev.org/t/openstack/build/66f29bf5f12449059e82d24db5aff47a/log/job-output.txt#79328 | 19:56 |
mriedem | {2} setUpClass (tempest.scenario.test_minbw_allocation_placement.MinBwAllocationPlacementTest) ... SKIPPED: Skipped as no physnet is available in config for placement based QoS allocation. | 19:57 |
efried | o | 19:57 |
efried | "in config" like the devstack config? | 19:57 |
mriedem | tempest config | 19:58 |
mriedem | https://github.com/openstack/tempest/blob/3eb3c29e979fd3f13c205d62119748952d63054a/tempest/scenario/test_minbw_allocation_placement.py#L72 | 19:58 |
mriedem | https://zuul.opendev.org/t/openstack/build/ddb1fb60455d4f7681a9a377aaef63ab/log/controller/logs/tempest_conf.txt.gz#81 | 19:58 |
mriedem | the tempest-slow job has that, the nova-next job does not | 19:59 |
mriedem | https://github.com/openstack/tempest/blob/3eb3c29e979fd3f13c205d62119748952d63054a/.zuul.yaml#L290 | 19:59 |
mriedem | so we need that in the nova-next job | 20:00 |
efried | neat | 20:00 |
mriedem | report a bug and i can fix that up in a bit or push your own change | 20:00 |
efried | and probably the stuff above it too. | 20:01 |
mriedem | yeah i suppose | 20:01 |
mriedem | https://github.com/openstack/tempest/commit/c87a06b3c29427dc8f2513047c804e0410b4b99c | 20:01 |
mriedem | whatever was added in there | 20:01 |
mriedem | actually you're in luck, | 20:02 |
eandersson | I created a bug, will add more info if/when I find it. https://bugs.launchpad.net/nova/+bug/1854992 | 20:02 |
openstack | Launchpad bug 1854992 in OpenStack Compute (nova) "Frequent instances stuck in BUILD with no apparent failure" [Undecided,New] | 20:02 |
mriedem | the nova-next job already sets that shit up b/c the post-test hook runs heal_allocations on an instance with a port that has bw | 20:02 |
mriedem | you just need tempest.conf updated | 20:02 |
*** ociuhandu has joined #openstack-nova | 20:03 | |
eandersson | My best guess at the moment is RabbitMQ related issues (e.g. we have hit bugs in RabbitMQ where bindings exists, but are broken) | 20:04 |
*** nweinber has quit IRC | 20:05 | |
openstackgerrit | Lee Yarwood proposed openstack/nova-specs master: Boot from volume instance rescue https://review.opendev.org/694063 | 20:07 |
efried | mriedem: https://bugs.launchpad.net/nova/+bug/1854993 | 20:07 |
openstack | Launchpad bug 1854993 in OpenStack Compute (nova) "QoS bandwidth tempest test no longer running" [Undecided,New] | 20:07 |
*** ociuhandu has quit IRC | 20:08 | |
*** nweinber has joined #openstack-nova | 20:09 | |
efried | fixing... | 20:09 |
openstackgerrit | Merged openstack/nova master: Add a way to exit early from a wait_for_instance_event() https://review.opendev.org/695985 | 20:09 |
artom | dansmith, answering my own earlier question, looks like we actually save the request_spec in the conductor only if the resize/migration succeeded | 20:10 |
openstackgerrit | Merged openstack/nova master: docs: Change order of PCI configuration steps https://review.opendev.org/694521 | 20:10 |
openstackgerrit | Merged openstack/nova master: docs: Clarify configuration steps for PF devices https://review.opendev.org/694522 | 20:10 |
openstackgerrit | Merged openstack/nova master: Suppress policy deprecated warnings in tests https://review.opendev.org/676670 | 20:10 |
artom | Except... that doesn't work, because the conductor then casts to the computes to do the work | 20:10 |
artom | So if something fails in prep_resize or resize_instance, we'll never know | 20:10 |
openstackgerrit | Eric Fried proposed openstack/nova master: Add QoS tempest config so bw tests run https://review.opendev.org/697180 | 20:11 |
efried | mriedem: ^ | 20:11 |
efried | gibi: ^ | 20:11 |
mriedem | efried: hammered you | 20:15 |
mriedem | https://youtu.be/otCpCn0l4Wo?t=110 | 20:16 |
mriedem | break it down! | 20:16 |
dansmith | artom: right, that's why I said if it happens too late | 20:17 |
*** tbachman has quit IRC | 20:18 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Add QoS tempest config so bw tests run https://review.opendev.org/697180 | 20:19 |
artom | dansmith, right, just caught up with you | 20:19 |
artom | Mind you | 20:20 |
* artom is confuzzled | 20:21 | |
mriedem | doing a thing in the api, failing in compute, and being out of whack isn't a new problem | 20:22 |
openstackgerrit | Eric Fried proposed openstack/nova master: Use Placement 1.34 (string suffixes & mappings) https://review.opendev.org/696418 | 20:22 |
openstackgerrit | Eric Fried proposed openstack/nova master: refactor: RequestGroup.is_empty() and .strip_zeros() https://review.opendev.org/696991 | 20:22 |
openstackgerrit | Eric Fried proposed openstack/nova master: Tie requester_id to RequestGroup suffix https://review.opendev.org/696946 | 20:22 |
mriedem | s/api/controller/ | 20:22 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Use provider mappings from Placement (mostly) https://review.opendev.org/696992 | 20:22 |
artom | mriedem, so do we have a "best practice" kind of thing to handle it? | 20:22 |
artom | Or it's all case by case? | 20:22 |
mriedem | you're worried that we change the request spec to use the new flavor in the api/conductor, cast to compute to do the resize, it fails and the instance is using the old flavor and the request spec is using the new flavor, right? | 20:23 |
mriedem | and then when cold migrating that server, the request spec is incorrectly using the new flavor that the instance isn't actually using | 20:23 |
artom | mriedem, yes to both | 20:24 |
artom | And really it's gibi that's worried - he brought it up on the review | 20:24 |
mriedem | there isn't really a best practice for that. the compute can't/shouldn't get and update the request spec. there is no periodic in the controller services that is healing the request spec for failed resizes. | 20:25 |
*** gmann_afk is now known as gmann | 20:25 | |
mriedem | note that in revert_resize in the API, we also update the request spec to match the source host and then cast off to do the things in compute | 20:25 |
mriedem | if those things fail, the request spec could be out of sync again | 20:25 |
mriedem | as time has gone on we've persisted less and less of the original crap we used to store in the request spec | 20:26 |
mriedem | so it was just a blob to pass things from the api to the scheduler | 20:26 |
mriedem | the flavor is tricky though since it is useful for things like down-cell api responses where we don't have the instance record | 20:26 |
mriedem | tl;dr is there a best practice for avoiding split brain? heal things periodically? use etcd? :) | 20:27 |
artom | (so what's in the api_db for instances? just a cell mapping?) | 20:27 |
mriedem | cell/instance mappings and request spec | 20:27 |
dansmith | mriedem: earlier I said it's probably legit to refresh the reqspec from the instance's flavor before we go into a scheduling operation for an existing instance | 20:27 |
mriedem | server group members | 20:27 |
dansmith | that would solve the problem I think | 20:28 |
dansmith | leaving only cell down showing some outdated instance info guesses, which wouldn't be the end of the world | 20:28 |
artom | dansmith, that's kinda where I'm leaning - if we take the definition of request spec to be what mriedem said "pass scheduling-related things from API to scheduler" | 20:28 |
dansmith | artom: yup | 20:28 |
artom | Then yeah, "load it up from latest instance" makes sense | 20:28 |
dansmith | there's another much more edge case where that makes sense too, | 20:29 |
artom | Although that makes one wonder "what's the point of request_spec in the first place, if everything in it is in instance anyways" | 20:29 |
dansmith | which is you restore either database from a non-coordinated backup after a disaster or failed upgrade | 20:29 |
dansmith | artom: you mean reqspec.flavor? | 20:29 |
dansmith | artom: before an instance exists, that's where we hold the flavor, | 20:30 |
artom | dansmith, yeah, and request_spec.numa_topology | 20:30 |
mriedem | artom: it's not all instance | 20:30 |
mriedem | scheduler hints, forced hosts/nodes, etc | 20:30 |
dansmith | artom: and if the cell is down, then we use the stuff in reqspec as a cache | 20:30 |
mriedem | requested destinations | 20:30 |
artom | I see | 20:30 |
dansmith | mriedem: right, hence why I asked, I assume he meant the duplicative parts of reqspec, which is just flavor, AFAIK | 20:30 |
artom | dansmith, numa_topology | 20:30 |
mriedem | it's an objectified and persisted version of the old filter_properties stuff, and some of it getting persisted always has caused a lot of problems | 20:31 |
dansmith | artom: I thought the instance numa topo ended up getting fleshed out by the virt driver, | 20:31 |
dansmith | artom: where reqspec's copy was kinda the planned topo, which would still be valid to keep I think | 20:31 |
artom | dansmith, so there's 2 things there: 1 is the instance numa topology, which can be got from flavor and image meta | 20:31 |
mriedem | there are a few things, numa topo, flavor, image, az, pci requests | 20:31 |
artom | 2 is how it fits on a host - *that* part is virt driver | 20:31 |
artom | 1 is in request spec | 20:31 |
dansmith | mriedem: yeah, fair | 20:32 |
artom | Unfortunately, they're muddled in the same object :( | 20:32 |
dansmith | well, regardless, | 20:32 |
dansmith | some duplication gives us a bit of a backup in the case of cell downage as I said | 20:32 |
artom | Fair enough | 20:33 |
artom | OK, gives me stuff to think about | 20:33 |
dansmith | mriedem: doesn't pci_requests in the instance get fleshed out more than in the reqspec too? | 20:33 |
artom | I'll also need to think about backportability | 20:34 |
mriedem | maybe, don't know off the top of my head | 20:35 |
artom | Because while https://bugzilla.redhat.com/show_bug.cgi?id=1715240 is Newton is at last EOL for HR | 20:35 |
openstack | bugzilla.redhat.com bug 1715240 in openstack-nova "Resize ignores mem_page_size in new flavor" [High,On_dev] - Assigned to alifshit | 20:35 |
artom | *RH | 20:35 |
mriedem | i want to say that's Instance.pci_devices which is the allocated devices on the node | 20:35 |
artom | We'll probably want it in Queens as well | 20:35 |
artom | Need to bounce early to pick up 1/2 of my kids | 20:36 |
artom | Thank you gentlemen | 20:36 |
dansmith | mriedem: I thought we updated some physnet stuff in instance_pci_requests at least, but anyway, doesn't matter | 20:36 |
mriedem | what happened to the other half of the children? | 20:36 |
artom | mriedem, I have the other half of my marriage for that :D | 20:36 |
dansmith | I assume that when you get divorced and you have an even number of children, you just split those like the bank and other assets/liabilities right? | 20:37 |
artom | No, the woman takes everything | 20:38 |
dansmith | oh, okay | 20:38 |
artom | And you're just left with alcoholism and depression | 20:38 |
artom | (And those are not the names of the kids, btw) | 20:38 |
dansmith | assuming those aren't the names of half your children.. gotcha | 20:38 |
dansmith | haha | 20:38 |
artom | :D | 20:39 |
*** artom has quit IRC | 20:40 | |
*** vesper11 has quit IRC | 20:49 | |
*** vesper11 has joined #openstack-nova | 20:51 | |
*** sorrison has joined #openstack-nova | 20:59 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: ironic: report a custom trait for the node owner https://review.opendev.org/697184 | 21:01 |
*** sorrison has quit IRC | 21:04 | |
*** abaindur has quit IRC | 21:08 | |
*** francoisp has joined #openstack-nova | 21:12 | |
eandersson | > Unable to submit allocation for instance x (409 {"errors": [{"status": 409, "request_id": "reqz", "code": "placement.undefined_code", "detail": "There was a conflict when trying to complete your request.\n\n Unable to allocate inventory: Unable to create allocation for 'VCPU' on resource provider y'. The requested amount would exceed the | 21:16 |
eandersson | capacity. ", "title": "Conflict"}]}) | 21:16 |
eandersson | This was the cause of the latest instance stuck in BUILD | 21:16 |
mriedem | those instances should all be buried in cell0 in conductor | 21:17 |
mriedem | scheduler should raise a NoValidHost | 21:17 |
mnaser | does setting `hw:mem_page_size` = `any` enable memory tracking on the host inside nova? reading the code suggests that it does | 21:18 |
mriedem | eandersson: NoValidHost should be handled by conductor here https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L1418 | 21:18 |
mnaser | i'm running into an issue where doing numa node pinning without `hw:mem_page_size` leads to nova packing too many instances into numa node 0, which then results in `oom-killer` | 21:18 |
eandersson | I see one more log line on that req-id in the scheduler | 21:19 |
eandersson | > Computed NUMA topology CPU pinning: usable pCPUs: [[18, 38]], vCPUs mapping: [(0, 18), (1, 38)] | 21:19 |
mriedem | eandersson: and all of those build requests should result in creating instances in ERROR status in cell0 https://github.com/openstack/nova/blob/1c2b7d8f01814adfd6d28b97013a40cca51dfbdf/nova/conductor/manager.py#L1348 | 21:19 |
mriedem | check your conductor logs | 21:19 |
eandersson | conductor logs are empty | 21:19 |
sean-k-mooney | mnaser: just got back. hw:mem_page_size=any was intended to allow the image to choose if hugepage or small pages should be used | 21:19 |
sean-k-mooney | mnaser: i believe it should enable the numa tracking but it's the one policy that is least used | 21:20 |
mnaser | sean-k-mooney: but what if i don't want either and i just want nova to be aware of memory per numa node? | 21:20 |
mnaser | cause that involves rebooting the machine and what not to enable those | 21:20 |
eandersson | and when I say empty, there are literally no log lines over the last 12 hours. | 21:20 |
mnaser | sean-k-mooney: according to virsh capabilities, the cell only has pages with size='4' | 21:21 |
sean-k-mooney | well you can't change an existing image to any without a resize | 21:21 |
mnaser | right but for newly booted instances flavor change | 21:21 |
*** tbachman has joined #openstack-nova | 21:21 | |
mnaser | update the flavor extra_specs and newly booted instances will do the right thing(tm) | 21:21 |
sean-k-mooney | right so hw:mem_page_size=small would use 4k pages | 21:21 |
eandersson | btw placement failed 6 times with the same error before it pretty much just stopped trying | 21:22 |
sean-k-mooney | any should also use 4k pages but would allow the image to request 2mb | 21:22 |
mnaser | sean-k-mooney: ok cool, i see, so 'any' should technically make things work and make nova start tracking memory | 21:22 |
eandersson | Its probably RabbitMQ issues again, but we tested all queues and binding and none are failing. | 21:22 |
eandersson | Like we have seen in the past. | 21:22 |
sean-k-mooney | i need to triple check it but if hw:mem_page_size is defined it is meant to enable the numa aware tracking | 21:23 |
sean-k-mooney | no matter what value you set it to | 21:23 |
mnaser | sean-k-mooney: ok ill check and report back, but my notes seem to add up to yours :> | 21:23 |
eandersson | The most frustrating part is that there are no logs, and nothing in the database, so troubleshooting these usually requires looking into the database. | 21:23 |
eandersson | *nothing from the api | 21:24 |
mnaser | eandersson: did you have any rabbitmq outage/issue? | 21:24 |
sean-k-mooney | the reason im a little less confident with hw:mem_page_size=any is i know we dont have tempest testing for it | 21:24 |
sean-k-mooney | where as we did for large and small | 21:24 |
eandersson | mnaser in the past yea and we found bad bindings after that | 21:24 |
mnaser | in this case the only reason i care about mem page size is to get nova to track memory so maybe ill try small after all | 21:24 |
eandersson | we found a way to diagnose that | 21:24 |
mnaser | eandersson: in my experience anytime rabbitmq suffers .. anything .. you end up in that weird state and you have to restart the cloud | 21:25 |
eandersson | by pushing a fake message to the compute queues | 21:25 |
mnaser | ah gotcha | 21:25 |
eandersson | of course it might be something new with rmq | 21:25 |
eandersson | I know there is a patch to add mandatory to nova | 21:25 |
eandersson | the mandatory flag when publishing messages | 21:25 |
eandersson | Since if a message is lost you are stuck in building forever | 21:26 |
eandersson | with no error | 21:26 |
mnaser | my favorite :) | 21:27 |
*** sorrison has joined #openstack-nova | 21:27 | |
eandersson | Gonna wipe the rmq db and see if it goes away | 21:33 |
eandersson | It's just odd that our synth tests are not hitting this | 21:34 |
*** martinkennelly has quit IRC | 21:34 | |
eandersson | We create 300 VMs per day and none end up in this state | 21:34 |
*** dpawlik has joined #openstack-nova | 21:38 | |
*** dpawlik has quit IRC | 21:43 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: WIP: Add node owner pre-filter https://review.opendev.org/697187 | 21:49 |
mriedem | dansmith: this is probably a blast from the past for you https://review.opendev.org/#/c/697122/ | 21:54 |
mriedem | mitaka era | 21:54 |
*** awalende has joined #openstack-nova | 22:00 | |
*** nweinber has quit IRC | 22:01 | |
*** awalende has quit IRC | 22:04 | |
*** pcaruana has quit IRC | 22:06 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: WIP: Implement cleanup_instance_network_on_host for neutron API https://review.opendev.org/697162 | 22:09 |
*** mriedem is now known as mriedem_away | 22:11 | |
dansmith | mriedem_away: I don't remember it, but I totes believe I did something awesome that someone else broke | 22:11 |
*** slaweq has quit IRC | 22:15 | |
*** mriedem_away has quit IRC | 22:22 | |
*** munimeha1 has quit IRC | 22:43 | |
*** tkajinam has joined #openstack-nova | 22:56 | |
*** rcernin has joined #openstack-nova | 22:57 | |
sean-k-mooney | eandersson: the patch for the mandatory flag i think is more or less stalled, i have not seen anything happen on that front in a while | 23:00 |
eandersson | Awh that is too bad | 23:01 |
sean-k-mooney | it looks like all the oslo.messaging fixes are done https://review.opendev.org/#/q/status:merged+project:openstack/oslo.messaging+branch:master+topic:bp/transport-options | 23:02 |
sean-k-mooney | but i dont think anyone has gotten around to using that in nova yet | 23:02 |
eandersson | Yea - and would probably not be easy to backport. | 23:02 |
eandersson | Or even possible. | 23:02 |
sean-k-mooney | well we need to do two things. first start using it in master and second we need to figure out what the correct action is to take if we get an exception indicating it could not be delivered | 23:03 |
eandersson | I think I am going to add a log to nova just before it sends that notification to the computes. | 23:03 |
sean-k-mooney | for example if we could not deliver a message to a compute queue should we disable that compute node so we don't try to schedule to it again? | 23:05 |
sean-k-mooney | should we just retry? | 23:05 |
sean-k-mooney | i actually have no idea what the best path forward would be in that case but we kind of need to figure that out before we can fix the issue | 23:06 |
*** hamzy has quit IRC | 23:06 | |
sean-k-mooney | we now have the feature that allows us to address it however | 23:06 |
sean-k-mooney | melwitt: did you ever have time to talk to the oslo folks on how we could use this https://blueprints.launchpad.net/oslo.messaging/+spec/transport-options and specifically the mandatory flag so we can address the nova aspect of https://bugs.launchpad.net/oslo.messaging/+bug/1661510 | 23:11 |
openstack | Launchpad bug 1661510 in oslo.messaging "topic_send may loss messages if the queue not exists" [Medium,In progress] - Assigned to Gabriele Santomaggio (gsantomaggio) | 23:11 |
sean-k-mooney | i know i haven't but it looks like all the patches are in place on the oslo side | 23:12 |
*** tbachman has quit IRC | 23:16 | |
*** slaweq has joined #openstack-nova | 23:25 | |
melwitt | sean-k-mooney: I haven't talked to them bc like you said, their side is done. it's up to us (me) to figure out how to use it on the nova side and I haven't gotten a chance to dig in to it yet | 23:28 |
melwitt | I don't yet know where/how to pass it in the nova code | 23:28 |
sean-k-mooney | ya i just asked on the bug if they could provide an example | 23:29 |
*** ociuhandu has joined #openstack-nova | 23:30 | |
sean-k-mooney | i know it's something we should be setting when we do a topic send which is i guess a call to a compute node primarily | 23:30 |
*** slaweq has quit IRC | 23:31 | |
sean-k-mooney | but i dont know what we should do if we are not able to deliver it to the message queue | 23:31 |
*** ociuhandu has quit IRC | 23:36 | |
*** dpawlik has joined #openstack-nova | 23:39 | |
*** abaindur has joined #openstack-nova | 23:40 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add new default roles in os-services API policies https://review.opendev.org/648480 | 23:40 |
*** dpawlik has quit IRC | 23:43 | |
gmann | johnthetubaguy: this should be ready now. I was checking why project-admin and legacy admin can access the new system reader rule. which was correct because of old defaults(admin_api) are deprecated and still work. This make sure any old token will keep working for changed defaults also. | 23:44 |
*** abaindur has quit IRC | 23:45 | |
gmann | johnthetubaguy: once you are ok with this then it will complete the first set of change - https://review.opendev.org/648480 | 23:45 |
*** ivve has quit IRC | 23:47 | |
*** tbachman has joined #openstack-nova | 23:48 | |
openstackgerrit | sean mooney proposed openstack/nova master: Block rebuild when NUMA topology changed https://review.opendev.org/687957 | 23:51 |
openstackgerrit | sean mooney proposed openstack/nova master: Disable NUMATopologyFilter on rebuild https://review.opendev.org/689861 | 23:51 |
*** abaindur has joined #openstack-nova | 23:52 | |
*** abaindur has joined #openstack-nova | 23:53 | |
efried | https://bugs.launchpad.net/neutron/+bug/1855015 | 23:56 |
openstack | Launchpad bug 1855015 in OpenStack Compute (nova) "Intermittent fails since 11/23 with "Multiple possible networks found, use a Network ID to be more specific."" [Undecided,New] | 23:56 |
sean-k-mooney | has that reappeared | 23:56 |
sean-k-mooney | that is a tempest bug and/or we have not configured tempest correctly | 23:57 |
eandersson | If we can’t deliver to the queue I think we should just set to error or move to the next available compute | 23:58 |
sean-k-mooney | eandersson: set what to error? | 23:58 |
eandersson | The instance | 23:58 |
sean-k-mooney | that would not be correct in all cases | 23:59 |
sean-k-mooney | if the call was to say the instance diagnostics endpoint | 23:59 |
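
Editor's sketch of the open question at the end of this log: what should happen when a message sent with the mandatory flag cannot be delivered to a compute queue. This is not nova or oslo.messaging code. `Action`, `UndeliveredMessage`, `STATE_CHANGING_OPS` and `handle_undeliverable` are hypothetical names, and the operation names in the set are only illustrative. The policy just encodes eandersson's suggestion (set the instance to error or move to the next available compute) together with sean-k-mooney's objection that this would be wrong for calls like instance diagnostics.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Action(Enum):
    RESCHEDULE = "reschedule"            # try the next candidate compute host
    SET_INSTANCE_ERROR = "set_error"     # give up and put the instance in ERROR
    FAIL_REQUEST_ONLY = "fail_request"   # report the failure, leave the instance alone


# Hypothetical list of operations that change instance state (assumption,
# not taken from nova). Anything else is treated as a read-only call.
STATE_CHANGING_OPS = {"build_and_run_instance", "resize", "live_migrate"}


@dataclass
class UndeliveredMessage:
    operation: str                                  # RPC method that could not be delivered
    remaining_hosts: List[str] = field(default_factory=list)  # other candidate computes


def handle_undeliverable(msg: UndeliveredMessage) -> Action:
    """Pick a reaction to an undeliverable message (illustrative policy only)."""
    if msg.operation not in STATE_CHANGING_OPS:
        # e.g. a diagnostics call: erroring the instance would be incorrect
        return Action.FAIL_REQUEST_ONLY
    if msg.remaining_hosts:
        return Action.RESCHEDULE
    return Action.SET_INSTANCE_ERROR
```

Under these assumptions, a lost build request with alternates left is rescheduled, one with no alternates puts the instance in ERROR instead of leaving it stuck in BUILD, and a lost diagnostics call only fails that request. Whether the unreachable compute should also be disabled is left open, as it was in the discussion.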