Monday, 2018-07-23

openstackgerritfupingxie proposed openstack/nova master: Delete allocations when it is re-allocated  https://review.openstack.org/58289900:46
*** edmondsw has joined #openstack-placement01:11
*** edmondsw has quit IRC01:15
openstackgerritZhenyu Zheng proposed openstack/nova master: Func test for improper cn local DISK_GB reporting  https://review.openstack.org/58364601:34
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in libvirt/test_driver.py (5)  https://review.openstack.org/57084201:43
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in libvirt/test_driver.py (6)  https://review.openstack.org/57133001:43
*** lei-zh has joined #openstack-placement01:58
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in libvirt/test_driver.py (7)  https://review.openstack.org/57199202:07
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in libvirt/test_driver.py (8)  https://review.openstack.org/57199302:09
openstackgerrityanpuqing proposed openstack/nova master: Rename auth_uri to www_authenticate_uri  https://review.openstack.org/57682002:19
*** tetsuro has joined #openstack-placement02:25
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (3)  https://review.openstack.org/57410402:36
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (4)  https://review.openstack.org/57410602:36
openstackgerritMerged openstack/nova master: Add regression test for bug 1781710  https://review.openstack.org/58333902:38
openstackbug 1781710 in OpenStack Compute (nova) "ServersOnMultiNodesTest.test_create_server_with_scheduler_hint_group_anti_affinity failing with "Servers are on the same host"" [High,In progress] https://launchpad.net/bugs/1781710 - Assigned to Matt Riedemann (mriedem)02:38
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (5)  https://review.openstack.org/57411002:53
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (6)  https://review.openstack.org/57411302:53
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (7)  https://review.openstack.org/57497402:54
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (8)  https://review.openstack.org/57531102:54
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (9)  https://review.openstack.org/57558102:54
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (10)  https://review.openstack.org/57601702:54
*** edmondsw has joined #openstack-placement02:59
*** edmondsw has quit IRC03:04
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (11)  https://review.openstack.org/57601803:09
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (12)  https://review.openstack.org/57601903:09
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (13)  https://review.openstack.org/57602003:09
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (14)  https://review.openstack.org/57602703:10
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (15)  https://review.openstack.org/57603103:10
openstackgerritZhenyu Zheng proposed openstack/nova master: Report 0 root_gb in resource tracker if instance is bfv.  https://review.openstack.org/58420403:25
*** lei-zh has quit IRC04:06
*** edmondsw has joined #openstack-placement04:47
*** edmondsw has quit IRC04:52
openstackgerritMerged openstack/nova master: Merge server create for multiple-create extension  https://review.openstack.org/58001704:54
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (16)  https://review.openstack.org/57629904:55
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (17)  https://review.openstack.org/57634404:55
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (18)  https://review.openstack.org/57667304:55
openstackgerritGhanshyam Mann proposed openstack/nova master: Merge extended availability zone response into server controller  https://review.openstack.org/50285905:12
openstackgerritGhanshyam Mann proposed openstack/nova master: Merge config drive extension response into server controller  https://review.openstack.org/58422305:14
openstackgerritGhanshyam Mann proposed openstack/nova master: Merge config drive extension response into server controller  https://review.openstack.org/58422305:22
openstackgerritGhanshyam Mann proposed openstack/nova master: Merge extended server attributes extension response  https://review.openstack.org/58459005:23
*** lei-zh has joined #openstack-placement05:29
openstackgerritGhanshyam Mann proposed openstack/nova master: Merge extended server attributes extension response  https://review.openstack.org/58459005:42
openstackgerritjichenjc proposed openstack/nova master: add zvm into support matrix  https://review.openstack.org/53272006:14
openstackgerritjichenjc proposed openstack/nova master: Add zvm admin intro and hypervisor information  https://review.openstack.org/53312506:14
openstackgerritjichenjc proposed openstack/nova master: Add zvm CI information  https://review.openstack.org/53351206:14
openstackgerritLei Zhang proposed openstack/nova master: Add method to get cpu traits  https://review.openstack.org/56031706:24
openstackgerritZhenyu Zheng proposed openstack/nova master: Report 0 root_gb in resource tracker if instance is bfv.  https://review.openstack.org/58420406:25
openstackgerritLei Zhang proposed openstack/nova master: Add method to get cpu traits  https://review.openstack.org/56031706:29
*** edmondsw has joined #openstack-placement06:35
*** edmondsw has quit IRC06:39
openstackgerritjichenjc proposed openstack/nova master: add zvm into support matrix  https://review.openstack.org/53272006:51
openstackgerritjichenjc proposed openstack/nova master: Add zvm admin intro and hypervisor information  https://review.openstack.org/53312506:51
openstackgerritjichenjc proposed openstack/nova master: Add zvm CI information  https://review.openstack.org/53351206:51
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (19)  https://review.openstack.org/57667607:00
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (20)  https://review.openstack.org/57668907:00
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (21)  https://review.openstack.org/57670907:00
*** tssurya has joined #openstack-placement07:14
openstackgerritjichenjc proposed openstack/nova master: Add zvm admin intro and hypervisor information  https://review.openstack.org/53312507:24
openstackgerritjichenjc proposed openstack/nova master: Add zvm CI information  https://review.openstack.org/53351207:24
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (22)  https://review.openstack.org/57671207:29
openstackgerrithuanhongda proposed openstack/nova master: WIP: hypervisor-stats shows wrong disk usages with shared storage  https://review.openstack.org/14987807:32
openstackgerritGhanshyam Mann proposed openstack/nova master: Merge keypair extension response into server view builder  https://review.openstack.org/58474807:40
*** avolkov has joined #openstack-placement07:57
*** avolkov has quit IRC07:59
*** takashin has left #openstack-placement08:03
*** edmondsw has joined #openstack-placement08:23
*** edmondsw has quit IRC08:28
openstackgerritBoxiang Zhu proposed openstack/nova stable/pike: Fix "instance snap min disk size err after resize instance"  https://review.openstack.org/58477008:40
*** e0ne has joined #openstack-placement08:43
*** cdent has joined #openstack-placement09:39
*** lei-zh has quit IRC09:50
*** finucannot has quit IRC09:53
*** stephenfin has joined #openstack-placement09:55
*** edmondsw has joined #openstack-placement10:11
*** edmondsw has quit IRC10:16
*** bauzas has joined #openstack-placement10:20
openstackgerritChris Dent proposed openstack/nova master: [placement] Add /reshaper handler for POST  https://review.openstack.org/57692710:20
*** avolkov has joined #openstack-placement11:22
openstackgerritAndrey Volkov proposed openstack/nova master: Docs: Add Placement to Nova system architecture  https://review.openstack.org/58433811:22
openstackgerritLee Yarwood proposed openstack/nova master: libvirt: Remove reference to transient domain when detaching devices  https://review.openstack.org/58443311:35
openstackgerritAndrey Volkov proposed openstack/nova master: Docs: Add Placement to Nova system architecture  https://review.openstack.org/58433811:54
openstackgerritSurya Seetharaman proposed openstack/nova master: Add queued_for_delete field to InstanceMapping object  https://review.openstack.org/56679511:54
openstackgerritSurya Seetharaman proposed openstack/nova master: Online migration tool for populating queued-for-delete  https://review.openstack.org/58253611:54
openstackgerritSurya Seetharaman proposed openstack/nova master: Update queued-for-delete from the ComputeAPI during deletion/restoration  https://review.openstack.org/56681311:54
openstackgerritSurya Seetharaman proposed openstack/nova master: Return a minimal construct for nova service-list when a cell is down  https://review.openstack.org/58482911:54
*** tetsuro has quit IRC12:04
*** tetsuro has joined #openstack-placement12:09
*** tetsuro has quit IRC12:10
*** edmondsw has joined #openstack-placement12:13
openstackgerritLee Yarwood proposed openstack/nova master: libvirt: Wire up a force disconnect_volume flag  https://review.openstack.org/58484912:20
openstackgerritLee Yarwood proposed openstack/nova master: WIP libvirt: Forcibly disconnect volumes during post_live_migration  https://review.openstack.org/58485012:20
*** leakypipes is now known as jaypipes12:29
openstackgerrithuanhongda proposed openstack/nova master: hypervisor-stats shows wrong disk usages with shared storage  https://review.openstack.org/14987812:32
*** mriedem has joined #openstack-placement12:53
*** lei-zh has joined #openstack-placement13:03
cdentgibi, efried, jaypipes : found a buglet in the database side of reshaper13:12
cdentit currently assumes you'll never be clearing inventory fully13:13
cdentbrb13:13
openstackgerritBalazs Gibizer proposed openstack/nova master: Transform missing delete notifications  https://review.openstack.org/41029713:13
efriedcdent: Clearing all inventories for a particular provider?13:16
*** lei-zh has quit IRC13:16
*** lei-zh has joined #openstack-placement13:17
cdentefried: yeah13:18
cdentend up with an IndexError13:18
efriedouch.13:19
cdentline 4087 of objects/resource_provider.py13:19
cdentbecause it is assuming there will be a new_inv_list for the rp13:19
efriedCool.13:19
efriedBy which I mean, good find.13:19
cdentthis was revealed because gibi asked me to write another test about new consumers, and I decided oh, hey, let's move all the inventory back to the originals13:20
cdentand boom13:20
gibicdent: nice catch13:20
cdentis the transformer code already in the gate or can/should we stop it13:21
* cdent makes a bug, for history13:22
gibicdent: it is on the gate13:22
gibicdent: but we can pull it13:22
cdentdoesn't really matter13:22
cdentwe can just fix it in a parent of my thing13:23
cdentsince there are no callers until my stuff merges13:23
gibicdent: OK, let's fix it in a followup13:23
*** takashin has joined #openstack-placement13:27
cdentgibi, efried, jaypipes: https://bugs.launchpad.net/nova/+bug/178313013:30
openstackLaunchpad bug 1783130 in OpenStack Compute (nova) "placement reshaper doesn't clearing all inventories for a resource provider" [Medium,Confirmed]13:30
cdentefried: I think you had some notions about this when writing that version of that code. What do you think of my last paragraph in the bug summary?13:31
efriedcdent: Yeah, I remember noticing this and deliberately not using rp_uuid to retrieve the rp obj because we already had it in the inventory obj.13:35
efriedexcept we don't in this case.13:35
efriedIt wouldn't hurt to build & maintain a cache of rp_uuid: rp obj from the first loop, including `except IndexError` => do a lookup.13:36
*** belmoreira has joined #openstack-placement13:38
efriedcdent: Are you already working a fix?13:38
cdentefried: no sir13:40
cdentI went to check on sarah and her foot13:40
*** tetsuro has joined #openstack-placement13:42
efriedI'll take it13:42
cdentthanks13:42
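[Editor's sketch] The cache efried describes above could look roughly like the following. This is a minimal illustration, not the code in objects/resource_provider.py: `Provider`, `Inventory`, `resolve_providers` and `lookup_provider` are hypothetical names invented here, and the only behaviour taken from the discussion is "take the provider from the first inventory object, and on IndexError fall back to a lookup by rp_uuid".

```python
from collections import namedtuple

# Hypothetical stand-ins for the real placement objects.
Provider = namedtuple("Provider", ["uuid", "name"])
Inventory = namedtuple("Inventory", ["resource_provider", "resource_class", "total"])


def resolve_providers(inventories_by_rp, lookup_provider):
    """Build a cache of rp_uuid -> provider object.

    inventories_by_rp: dict of rp_uuid -> list of Inventory (may be empty).
    lookup_provider: callable(rp_uuid) -> Provider, used only as a fallback.
    """
    rp_cache = {}
    for rp_uuid, new_inv_list in inventories_by_rp.items():
        try:
            # Cheap path: the provider is already attached to the inventory.
            rp_cache[rp_uuid] = new_inv_list[0].resource_provider
        except IndexError:
            # All inventory is being cleared, so there is nothing to index;
            # fetch the provider by uuid instead.
            rp_cache[rp_uuid] = lookup_provider(rp_uuid)
    return rp_cache
```

Later loops can then read the provider from `rp_cache` instead of indexing into a list that may be empty.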
openstackgerritBalazs Gibizer proposed openstack/nova master: Send soft_delete from context manager  https://review.openstack.org/47645913:43
jaypipesefried: if you could address it, that would be appreciated. I have to go to the doctor to address a lacerated cornea. :(13:45
efriedjaypipes: Golf swing go awry? Or a pug-related incident?13:45
jaypipesefried: I can barely look at the computer screen for more than a few minutes at a time.13:45
cdentyow, that sounds horrible jaypipes. I hope they can fix it quick.13:46
jaypipesefried: neither. I've had a sinus infection for 2 weeks now. no idea how this tear on my cornea occurred sometime in the last day.13:46
jaypipescdent: me too, considering I need to leave for Ohio tomorrow.13:46
efriedScheduler meeting in ten minutes in #openstack-meeting-alt13:50
openstackgerritDan Smith proposed openstack/nova master: Online data migration for queued_for_delete flag  https://review.openstack.org/58450413:52
openstackgerritBoxiang Zhu proposed openstack/nova stable/pike: Fix "instance snap min disk size err after resize instance"  https://review.openstack.org/58477014:06
openstackgerritChris Dent proposed openstack/nova master: [placement] Add /reshaper handler for POST  https://review.openstack.org/57692714:24
*** purplerbot has quit IRC14:41
*** purplerbot has joined #openstack-placement14:41
*** lei-zh has quit IRC14:42
*** tetsuro has quit IRC14:48
openstackgerritKashyap Chamarthy proposed openstack/nova master: libvirt: Remove usage of migrateToURI{2} APIs  https://review.openstack.org/56725814:56
gibiefried, cdent, jaypipes: the PUT /allocations/{consumer_uuid} API doc says that it can return 409 for concurrent inventory update: https://developer.openstack.org/api-ref/placement/#update-allocations15:04
*** takashin has left #openstack-placement15:04
gibiefried, cdent, jaypipes: but we are not providing any provider generation information in the request body (just consumer generation)15:05
gibiefried, cdent, jaypipes: so how can it be an inventory conflict?15:05
* jaypipes heading to eye doctor now... back later15:06
gibijaypipes: take care15:06
efriedgibi: We come into the handler, grab the rp objects to check usage, then another thread comes in and updates the inventory, then we proceed and do our allocation, then when we're done with our allocation, we go to increment the rp generation and bounce there.15:07
gibiefried: but PUT /allocations/{consumer_uuid} bounces RP generation not just consumer generation?15:08
efriedright15:08
efriedit's kind of a statement of, "Capacity was fine when we started working on this, but then something happened asynchronously that might make that not be true anymore, so we need to fail."15:09
gibibut from the client's perspective this should not be different from a simple not-enough-inventory-available error case15:11
gibiI mean the client does not need to differentiate between no inventory before the call or no inventory at the end of the call, both are just no inventory15:12
*** belmoreira has quit IRC15:12
gibiI don't feel I can explain it clearly15:13
* cdent catches up15:13
gibigetting a 409 due to consumer_generation mismatch is a thing that the client of the placement should understand. Also the client should understand if there is no inventory available for the allocation request. But the client should not see a resource provider generation conflict during PUT /allocations/{consumer_uuid}15:14
gibias the client doesn't provided a resource provider generation in the request15:15
gibis/provided/provide/15:15
openstackgerritMatt Riedemann proposed openstack/nova master: Report 0 root_gb in resource tracker if instance is bfv.  https://review.openstack.org/58420415:15
openstackgerritMatt Riedemann proposed openstack/nova master: Heal RequestSpec.is_bfv for legacy instances during moves  https://review.openstack.org/58371515:15
openstackgerritMatt Riedemann proposed openstack/nova master: Fix wonky reqspec handling in conductor.unshelve_instance  https://review.openstack.org/58373915:15
openstackgerritMatt Riedemann proposed openstack/nova master: Add shelve/unshelve wrinkle to volume-backed disk func test  https://review.openstack.org/58493115:15
cdentgibi: that's one of the reasons we have this:15:15
cdenthttps://bugs.launchpad.net/nova/+bug/171993315:15
openstackLaunchpad bug 1719933 in OpenStack Compute (nova) "placement server needs to retry allocations, server-side" [Medium,Triaged]15:15
cdentthe rp generation conflicts that happen are not relevant to the client-side as they are only used to represent the database encountering a situation where it cannot ensure integrity of the compare and swap operation15:16
gibicdent: thanks that TODO makes sense15:17
cdentunder high allocation load (against the same resource provider) that can happen and we really ought to retry more often server side before we let the client know15:17
gibicdent: today we have a client side retry for that15:17
gibicdent: I agree that it should be a server side retry15:17
gibicdent: and now we will have two types of retry on the client side15:18
gibicdent: 1) blind retry in case of RP conflict15:18
cdentbut it is still the case that even if we retried ten times server side, we could still never succeed, so for the docs to be complete they need to report that the rp generation can be the source of conflict15:18
efriedgibi: We're not rechecking capacity when that 409 happens. We don't *know* whether we're out of capacity; we just know something changed.15:18
* cdent nods15:18
gibicdent: 2) potentially smarter retry for consumer_generation conflict15:18
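[Editor's sketch] The compare-and-swap on a provider generation, and the bounded server-side retry cdent is asking for, can be illustrated with a toy in-memory model. Nothing here is placement code: `ProviderRow`, `GenerationConflict`, `allocate` and the `before_write` hook are hypothetical names used only for illustration. Note that each retry re-reads usage and re-checks capacity, and that the retry budget can still run out, which is why the conflict remains a documented response.

```python
import threading


class GenerationConflict(Exception):
    """The stored generation no longer matches the one we read."""


class ProviderRow:
    """Hypothetical in-memory stand-in for a resource provider row."""

    def __init__(self, total):
        self.total = total
        self.used = 0
        self.generation = 0
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self.used, self.generation

    def compare_and_swap(self, new_used, expected_generation):
        # Write only if nobody bumped the generation since our read.
        with self._lock:
            if self.generation != expected_generation:
                raise GenerationConflict()
            self.used = new_used
            self.generation += 1


def allocate(row, amount, max_attempts=3, before_write=None):
    """Try to claim `amount`, retrying a bounded number of times on conflict."""
    for _ in range(max_attempts):
        used, generation = row.read()
        if used + amount > row.total:
            raise ValueError("not enough capacity")
        if before_write is not None:
            before_write(row)  # hook that simulates a concurrent writer
        try:
            row.compare_and_swap(used + amount, generation)
            return
        except GenerationConflict:
            continue  # something changed: re-read and re-check capacity
    # Retries exhausted; the caller still has to see the conflict.
    raise GenerationConflict()
```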
efriedcdent: sanity check, placement db can now be configured via [api_database] or [placement_database], right?15:19
cdentefried: correct. if the latter is not set the former will be used15:19
efriednod15:19
efriedthx15:19
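[Editor's sketch] The fallback cdent confirms here (use [placement_database] if set, otherwise [api_database]) can be shown with the standard-library configparser. This is only an illustration of the lookup order; nova itself reads its configuration through oslo.config, and `placement_db_connection` is a hypothetical helper. The `connection` option name is an assumption of this sketch.

```python
import configparser


def placement_db_connection(conf):
    """Return the placement DB URL, falling back to the API DB URL."""
    # fallback=None covers both a missing section and a missing option.
    connection = conf.get("placement_database", "connection", fallback=None)
    if connection:
        return connection
    return conf.get("api_database", "connection")
```

With only [api_database] configured the helper returns that section's URL; once [placement_database] has a `connection` value, that value wins.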
*** yikun has quit IRC15:19
gibiefried, cdent: thanks, I think I got it. I will try to differentiate between the two conflicts on the client side somehow15:19
efriedgibi: If there's no error code, there oughtta be.15:20
efriedgibi: Pretty sure the ConcurrentUpdate will have an error code associated with it. Unless we didn't merge that patch yet - cdent, you got that at your fingertips?15:20
efriedgibi: In any case, differentiating should be possible with error codes, which is their whole purpose.15:20
cdentefried: you mean this one: https://review.openstack.org/#/c/581771/15:20
cdent(that was supposed to be a question)15:21
efriedcdent: Yup, its predecessor probably hit the path gibi is talking about, but that one is good too.  gibi, feel like pushing ^ that one?15:21
gibiI will review that patch right away15:21
gibias I don't want to build more on top of this code https://github.com/openstack/nova/blob/f1ec4ebe64179a0d2f8efbc236be58f8e0ff577a/nova/scheduler/client/report.py#L166515:22
efriedgibi: Oh, if you want to resolve that TODO, that would be awesome.15:23
*** yikun has joined #openstack-placement15:23
efriedgibi: we implemented error codes specifically to be able to de-uglify code like that.15:23
gibicdent: do I see correctly that we only have a single error code CONCURRENT_UPDATE for every type of resources.15:25
gibi?15:25
* cdent looks at efried 15:25
cdentfor the time being, yes, because jay didn't think we needed different kinds15:25
efriedyes15:25
cdentbut during that discussion there were ways to change it later if needed15:25
gibithen I cannot differentiate between consumer_generation conflict and resource provider conflict15:25
cdentyeah, if you want to do that, we need to add the aforementioned new codes15:26
cdentefried: do you remember which review that was?15:26
efriedcdent: I don't. Didn't we decide we would need a new microversion, or something, if we wanted to do that?15:26
cdentthe idea was that if we change the code on a well known (but "error") response, then yes, a new microversion would be ideal15:27
cdentthat is: if the code is used in flow control15:27
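[Editor's sketch] Using an error code for flow control on the client side might look like the following. The JSON shape (an `errors` list whose entries carry a `code`) and the string `placement.concurrent_update` are assumptions of this sketch rather than something quoted in the chat, and `error_codes` / `is_concurrent_update` are hypothetical helpers. As gibi notes above, a single concurrent-update code cannot tell a consumer generation conflict from a resource provider generation conflict; that would need the new codes being discussed.

```python
import json

# Assumed code string for the single concurrent-update error code.
CONCURRENT_UPDATE = "placement.concurrent_update"


def error_codes(response_body):
    """Extract the 'code' of each error from a JSON error document."""
    try:
        document = json.loads(response_body)
    except ValueError:
        return []
    if not isinstance(document, dict):
        return []
    return [
        err.get("code")
        for err in document.get("errors", [])
        if isinstance(err, dict)
    ]


def is_concurrent_update(status, response_body):
    """True if a 409 response carries the concurrent-update error code."""
    return status == 409 and CONCURRENT_UPDATE in error_codes(response_body)
```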
gibicdent, efried: before jumping into adding the new error codes, could you check https://review.openstack.org/#/c/583667/2/nova/scheduler/client/report.py@1614 to see if I really need to differentiate between rp conflict (do blind retry) and consumer conflict (raise from reportclient and let the caller handle)?15:28
* cdent looks15:28
gibiL1631 now implements the blind retry for RP conflict only15:29
efriedIt looks like blind retry ought to work for consumer gen conflict too, because you're checking for existing allocs and loading them up. That payload will include the latest consumer gen.15:30
cdentefried has a bit more state on how we want to deal with conflicts in the report client. But I can say that the blind retry for rp conflict is okay and desirable.15:31
cdentyeah since you're adding the GET it's probably okay15:31
gibicdent: I agree that rp conflict can be handled with blind retry15:31
cdentalthough it seems weird to me that all we do in response to a consumer gen is try again15:31
cdentshouldn't we check more of the world?15:31
efriedright, we need to consider whether we would still want to replace the allocations wholesale.15:32
gibiefried: blind retry for consumer conflict would work but would defeat the idea of consumer generation15:32
cdentyeah what gibi just said15:32
efriedLike, do we need to revalidate whatever assumptions led us to construct the alloc_request param?15:32
efriedso we're all saying the same thing.15:32
cdentbrb15:32
efriedgibi: I think we need to look further up the call stack to see what's going into the building of the alloc_request param.15:33
gibiefried: most of the callers of claim_resources expect a totally new consumer to be created, like in case of boot or migrate15:33
efriedgibi: So under what circumstances could we possibly enter the condition on L1608?15:33
gibiefried: only in case of force move operations15:34
efriedI think the "doubled up" algorithm is obsolete since we implemented migration UUIDs.15:34
efriedisn't it?15:34
efriedi.e. it should be "impossible" to have existing allocs when we get here?15:34
efriedunless e.g. two migrations are happening at the same time? But isn't that synchronized at the conductor level or something?15:35
gibiefried: this is still a thing nova.scheduler.utils.claim_resources_on_destination15:35
gibiefried: which is called from nova.conductor.manager.ComputeTaskManager#_allocate_for_evacuate_dest_host15:36
gibiefried: and the comment in https://github.com/openstack/nova/blob/bcbc1f9aeddb060513768489450c429bf53e1e46/nova/conductor/manager.py#L838 points to forced evacuation15:37
gibiefried: I might be mistaken15:42
gibiefried: let me dig a bit more15:43
gibiefried: the https://github.com/openstack/nova/blob/0807b9408045db8cd674c84def8b7e1b27171377/nova/tests/functional/test_servers.py#L2312 test case hits that edge case in claim_resources15:49
efriedgibi: not having read the test yet - is it a real thing or contrived via the test code?15:50
*** lei-zh has joined #openstack-placement15:50
gibiefried: I think it is a real thing, nothing relevant is mocked in that functional test and the call stack coming from the compute/manager rebuild_instance I would expect15:51
gibis/coming from/starting at.15:52
gibiefried: couple of functional test is affected by that code path and both is related to evacuation15:53
gibis/both/all/15:54
efriedmriedem: Reviewed https://review.openstack.org/#/c/575912/15:55
efriedIt really ought to be trivial to swap out aggregate_add_host.15:55
efriedIs it that you're busy with other stuff?15:55
gibiefried: it seems that with evacuation both the source and the dest host allocations are held by the instance16:00
efriedgibi: Okay, we don't use the "migration UUID" for that case?16:02
gibiit doesn't seem so16:02
gibiwe do create the Migration object in the api16:02
gibibut16:02
efriedisn't that problematic for quotas, or something?16:02
gibihttps://github.com/openstack/nova/blob/f1ec4ebe64179a0d2f8efbc236be58f8e0ff577a/nova/compute/api.py#L4502-L450416:03
mriedemefried: i'm busy and we just can't really afford to push things out too far this week because of the gate16:04
efriedight16:04
mriedemso things that can be done in a follow up should be done in a follow up16:05
gibimriedem: do you know why evacuate does not use the migration uuid as the consumer of the source host allocation?16:05
mriedemgibi: because during evac the source host is down and when the source compute comes back up (if it comes back up) we'll delete it's allocations16:05
mriedemso leaving the allocations for the instance against the dead/down source compute shouldn't be a problem16:05
mriedemdansmith: ^ keep me honest16:06
mriedem*we'll delete its allocations for evacuated instances16:06
mriedemwhich reminds me, lyarwood has a related bug fix here https://review.openstack.org/#/c/562284/16:06
gibimriedem: it is not a resource allocation problem, but more like the only case left where we double-up allocations for the same consumer during claim_resources()16:06
mriedemoh16:07
dansmithyup, the migration uuid holding allocations is for the source anyway, and since we're effectively never going back to the source we don't need to keep track of them back there16:07
dansmith(or shouldn't)16:07
dansmithbut yeah we shouldn't be doing the doubling if that's true16:07
mriedemgibi: it just wasn't done for evac b/c we didn't need to and it'd be a non-trivial amount of work to handle it that way probably16:07
*** e0ne has quit IRC16:08
gibimriedem, dansmith: thanks, now I have to figure out which is more work: keep the code for the double-up in claim_resources() or do the migration_uuid thing for evac16:09
dansmithgibi: I don't think the latter is a thing16:09
gibiwhere keeping the double-up code really means making it work with consumer_generations16:09
dansmithand I don't think the former was intentional right?16:09
dansmithI think just replace the allocations with the new ones and move on16:09
mriedemi have to imagine it's not that simple,16:10
mriedemwhat happens if the source compute comes back up during the evac?16:10
mriedemand/or if the evac on the dest host fails16:10
dansmithit's going to delete the instance anyway16:10
mriedemnot if the evac failed16:10
dansmithI thought I put the nail in that already?16:11
mriedemb/c of stuff like this https://review.openstack.org/#/c/562284/16:11
gibiyeah, I guess evac failed then source compute recovered is a valid case16:11
dansmithas I've said before, I don't think we should ever let the instance just restart on the evac source16:11
dansmithbecause of shit like this16:11
dansmithI think once you've evac'd we should continue with doing it, even if the first one fails16:11
mriedemyou'd have to rebuild the instance on the source host to recover it right?16:11
mriedemso maybe this is a ptg topic too because this isn't something we should monkey with in rocky imo16:12
dansmithin which case?16:12
dansmithsounds good to me16:12
efriedmriedem: Mocking bugs on https://review.openstack.org/#/c/583715/ - I'll fast-approve if you want to fix 'em real quick.16:13
dansmithhaving dealt with our instance ha people trying to automate this process, I think we need to stick to a very obvious workflow16:13
gibimriedem, dansmith : OK, I'm adding it to the ptg pad...16:13
mriedemi'm saying, what happens if we remove the allocations for the source compute during evac scheduling but the evac fails - then the source compute comes back up, what do we do?16:13
mriedemwe have to think through that before we just make changes here16:13
dansmithI think we should make the source never start the instance again, but yeah, let's talk about it at ptg16:13
mriedemefried: i'll fix them quick16:13
* dansmith has to jump on a call anyway16:14
mriedemadding a cache to the end of that series atm16:14
mriedemefried: ^ for the is_bfv flag in the RT16:14
gibimriedem, dansmith: thanks16:14
*** lei-zh has quit IRC16:15
gibicdent, efried: as adding separate error codes would need a new microversion and we are days before FF, I either solve the retry issue without error codes or we punt the 1.28 bump to Stein16:23
efried"I told you so"16:23
openstackgerritMatt Riedemann proposed openstack/nova master: Heal RequestSpec.is_bfv for legacy instances during moves  https://review.openstack.org/58371516:24
openstackgerritMatt Riedemann proposed openstack/nova master: Fix wonky reqspec handling in conductor.unshelve_instance  https://review.openstack.org/58373916:24
openstackgerritMatt Riedemann proposed openstack/nova master: Add shelve/unshelve wrinkle to volume-backed disk func test  https://review.openstack.org/58493116:24
openstackgerritMatt Riedemann proposed openstack/nova master: Cache is_bfv check in ResourceTracker  https://review.openstack.org/58496216:24
jaypipescdent, efried: well, the good news is that it's not a lacerated cornea. bad news is that it's a bacterial corneal ulcer.16:25
efriedjaypipes: eyedrops and augmentin, you'll be fine in 36 hours.16:25
jaypipesefried: waiting on the prescription to get filled now...16:26
* gibi leaves for today16:31
efriedcdent: Sanity check me here...16:46
efriedhttps://review.openstack.org/#/c/582383/5/nova/api/openstack/placement/objects/resource_provider.py@411816:47
efriedI didn't have to "keep track" of anything. I can identify "providers for which all inventories are being deleted" right here by asking if new_inv_list is empty.16:48
efriedthat ain't right.16:48
efriedthe comment should say resource *classes*, not resource *providers*.16:49
efriedactually, the comment is totally bogus.16:50
efriedit should say "keep track of resource providers whose inventories are actually changing". Which... ought to be all of them, or you wouldn't have passed it in.16:50
efriedI'm going to delete the comment.16:50
jaypipesefried: hold up.16:54
* efried holds up16:55
*** tssurya has quit IRC16:55
openstackgerritMerged openstack/nova master: perform reshaper operations in single transaction  https://review.openstack.org/58238316:57
openstackgerritMerged openstack/nova master: Refactor _heal_instances_in_cell  https://review.openstack.org/57789616:57
cdentefried: I'm back, but pausing for jaypipes16:58
jaypipesefried: so, looking closer at this, what about putting just this right before line 4098:17:01
cdentefried: you do want to keep track here17:01
jaypipesif new_inv_list:17:01
jaypipescdent, efried: that way you never set_inventory() on line 4098 for the providers you are going to remove all inventory for?17:02
cdentyou need to blow out sooner, around 4086, no?17:02
cdentthe problem isn't really with the logic in the code17:03
cdentit's simply that you can't index None17:03
cdentso when new_inv_list is none in the first block, skip it17:03
cdentand when new_inv_list is None in the last block, use a different resource provider17:04
efriedI've solved the IndexError.17:04
efriedI was just in the neighborhood and trying to figure out that other thing.17:04
efriedbut you're right; if the new_inv_list is empty, I can `continue` in the top loop.17:04
cdentright, and your comment is wrong, so the neighborhood looks different from what you were thinking, right?17:04
openstackgerritElod Illes proposed openstack/nova stable/queens: Call generate_image_url only for legacy notification  https://review.openstack.org/58496917:05
cdentwe can't "skip the rest" as the comment says17:05
efriedright17:06
efriedThe only optimization in the bottom would be if they passed in an inventory unchanged from its existing state, which would be silly of them, so no reason to optimize for it.17:07
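The guard discussed above (skip providers whose new inventory list is empty before anything tries to index into it) can be sketched roughly as follows. This is a minimal illustration, not the nova code under review: `split_providers`, the dict-of-lists input shape, and the `resource_class` key are all assumptions made for the example.

```python
def split_providers(inventories):
    """Separate providers with new inventory from providers being emptied.

    `inventories` is a hypothetical mapping of provider UUID -> list of
    inventory dicts. An empty (or None) list means "remove all inventory".
    """
    to_update = {}
    to_clear = []
    for rp_uuid, new_inv_list in inventories.items():
        if not new_inv_list:
            # Nothing to index here, so record the provider and move on
            # instead of touching new_inv_list[0].
            to_clear.append(rp_uuid)
            continue
        # Safe to index: the list is known to be non-empty at this point.
        to_update[rp_uuid] = [inv["resource_class"] for inv in new_inv_list]
    return to_update, to_clear


updates, clears = split_providers({
    "rp-a": [{"resource_class": "VCPU"}, {"resource_class": "MEMORY_MB"}],
    "rp-b": [],
    "rp-c": None,
})
```

With this shape, the later loop only ever sees non-empty lists, and the emptied providers are handled separately by UUID.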
cdent17:08
jaypipesack17:12
jaypipesthat works for me.17:12
jaypipesefried: lol, patch just merged.17:13
jaypipesefried: only after what, like 7 days in the gate?17:13
efriedjaypipes: which?17:13
efriedThe reshaper db? gibi just approved it this morning.17:14
jaypipesefried: oh? I thought it had been approved for a lot longer than that.17:14
jaypipeswas there a new revision that crept in or something?17:14
efriedNo. I +2ed it late last week, but gibi had only +1ed until he saw the consensus that we should merge it rather than waiting.17:15
jaypipesah, ack17:15
efriedcdent: After some creative rebasing, I'm failing only that bottom test, for a new reason: Inventory for 'VCPU' on resource provider 'fb48e253-50cb-4f98-b2fc-2b770a844d0b' in use.17:19
efriedProbably just forgot to remove that allocation in the test case, haven't looked yet.17:19
efriedyeah, ALT_RP_UUID still has allocations from a couple consumers.17:21
*** e0ne has joined #openstack-placement17:22
cdentefried: yeah, I didn't have a good way to actually test the test, so was winging it17:22
*** e0ne has quit IRC17:22
cdentefried: If you want to leave it, I can clean things up later (or tomorrow morning). Or feel free if you're feeling energetic.17:23
efriedcdent: Almost there.17:24
efriedI can't push this without yanking just-fix-it out of the gate. I'll do that and reapprove it. Only lose a couple hours of gate time.17:27
* cdent shrugs17:27
efriedoh, never mind17:27
efriedit's almost merged.17:27
cdentI'll check back in a couple hours17:27
efriedI'll just wait.17:27
efriedAnd work on the top of the series in the meantime.17:27
cdentgotta do dinner etc17:28
efriedenjoy.17:28
openstackgerritMatt Riedemann proposed openstack/nova master: Use consumer generation in _heal_allocations_for_instance  https://review.openstack.org/57790518:10
mriedemefried: ^ was a clean local rebase but gerrit apparently doesn't think it was a rebase18:10
mriedemnot sure why18:11
efried...18:11
mriedemi'm just looking for re-approval18:11
efriedBet it was this: https://review.openstack.org/#/c/577905/11/nova/scheduler/client/report.py@5618:12
efriedmriedem: done.18:13
mriedemthanks18:14
efried13 placement microversions from pike to queens. 13 from queens to rocky, assuming we don't do more, which seems reasonably likely.18:25
*** e0ne has joined #openstack-placement18:26
openstackgerritMerged openstack/nova master: Rename auth_uri to www_authenticate_uri  https://review.openstack.org/57682018:27
openstackgerritsean mooney proposed openstack/nova master: fix disk_bus handeling  https://review.openstack.org/58499918:43
openstackgerritMerged openstack/nova master: Func test for improper cn local DISK_GB reporting  https://review.openstack.org/58364618:55
openstackgerritMerged openstack/nova master: [placement] disallow additional fields in allocations  https://review.openstack.org/58390718:56
*** cdent has quit IRC19:04
efriedcdent's fix flushed out a bug in mriedem's test case19:31
mriedemunpossible19:32
efriedmriedem: You're right, it was my bug.19:34
efriedmriedem: But lets me resolve a TODO of yours.19:34
openstackgerritMatt Riedemann proposed openstack/nova master: Add method to get cpu traits  https://review.openstack.org/56031719:53
openstackgerritMatt Riedemann proposed openstack/nova master: FakeLibvirtFixture: mock get_fs_info  https://review.openstack.org/57920119:53
openstackgerritMatt Riedemann proposed openstack/nova master: Blacklist greenlet 0.4.14  https://review.openstack.org/58501619:53
*** edmondsw has quit IRC20:09
*** e0ne has quit IRC20:50
*** edmondsw has joined #openstack-placement20:56
openstackgerritJim Rollenhagen proposed openstack/nova master: ironic: add instance_uuid before any other spawn activity  https://review.openstack.org/56372220:59
openstackgerritEric Fried proposed openstack/nova master: [placement] Add /reshaper handler for POST  https://review.openstack.org/57692721:03
openstackgerritEric Fried proposed openstack/nova master: Make get_allocations_for_resource_provider sane  https://review.openstack.org/58459821:03
openstackgerritEric Fried proposed openstack/nova master: Report client: Real get_allocs_for_consumer  https://review.openstack.org/58459921:03
openstackgerritEric Fried proposed openstack/nova master: Report client: get_allocations_for_provider_tree  https://review.openstack.org/58464821:03
openstackgerritEric Fried proposed openstack/nova master: reshaper: Look up provider if not in inventories  https://review.openstack.org/58503321:03
openstackgerritEric Fried proposed openstack/nova master: Report client: _reshape helper, placement min bump  https://review.openstack.org/58503421:03
efriedcdent, jaypipes, mriedem, dansmith: Some material to get ahead on, review-wise. Client side isn't finished, but these pieces ^ are ready.21:04
efriedgibi, bauzas: ^21:04
*** e0ne has joined #openstack-placement21:19
*** e0ne has quit IRC21:21
jaypipesefried: rock on, thx21:27
* jaypipes goes back to applying drops to weeping eye21:27
*** avolkov has quit IRC21:42
openstackgerritDan Smith proposed openstack/nova master: Online data migration for queued_for_delete flag  https://review.openstack.org/58450421:43
efriedahcrap, I think I put 'em in the wrong order.21:44
efriedwtf did I do?21:45
efriedI think I need to comment out the failing tests in the microversion patch and then re-enable them in the bug fix.21:45
openstackgerritDan Smith proposed openstack/nova master: Online data migration for queued_for_delete flag  https://review.openstack.org/58450421:48
openstackgerritEric Fried proposed openstack/nova master: [placement] Add /reshaper handler for POST  https://review.openstack.org/57692721:50
openstackgerritEric Fried proposed openstack/nova master: reshaper: Look up provider if not in inventories  https://review.openstack.org/58503321:50
openstackgerritEric Fried proposed openstack/nova master: Make get_allocations_for_resource_provider sane  https://review.openstack.org/58459821:50
openstackgerritEric Fried proposed openstack/nova master: Report client: Real get_allocs_for_consumer  https://review.openstack.org/58459921:50
openstackgerritEric Fried proposed openstack/nova master: Report client: get_allocations_for_provider_tree  https://review.openstack.org/58464821:50
openstackgerritEric Fried proposed openstack/nova master: Report client: _reshape helper, placement min bump  https://review.openstack.org/58503421:50
efriedfixed ^21:50
efriedsorry if someone was in mid-review.21:51
*** edmondsw has quit IRC22:04
openstackgerritMatt Riedemann proposed openstack/nova master: Update queued-for-delete from the ComputeAPI during deletion/restoration  https://review.openstack.org/56681322:05
openstackgerritMerged openstack/nova master: Add VIFMigrateData object for live migration  https://review.openstack.org/51542322:10
efriedI could use a GET /resource_providers?uuid=in:[list of uuids] at this point.22:11
openstackgerritMatt Riedemann proposed openstack/nova master: doc: link to CERN summit video about upgrading from cells v1 to v2  https://review.openstack.org/58504422:16
mriedemefried: like the force_hosts thing22:21
mriedemor i thought we talked about that being an option for that22:21
mriedemwell, GET /allocation_candidates22:22
efriedmriedem: That was GET /allocation_candidates?uuid yeah22:22
efriedmriedem: Because "we" decided not to return a payload including updated generations from POST /reshaper, "we" now need to re-GET all the providers before we proceed with update_from_provider_tree.22:22
efriedAnd absent GET /rps?uuid=in:[list], we gotta do 'em one at a time.22:22
openstackgerritMatt Riedemann proposed openstack/nova master: doc: link to AZ talk from the Rocky summit  https://review.openstack.org/58504522:25
mriedemi don't really remember any specific discussion about that,22:26
mriedembut the microversion is still there, so we could also add  GET /rps?uuid=in:[list] if we wanted22:26
efriedmriedem: In the same microversion as POST /reshaper? That seems like an abuse of microversionage.22:28
mriedemthis is because the other operations in update_from_provider_tree are going to need the updated generations from POST /reshaper yeah?22:28
efriedcorrect.22:28
efriedInventory updates will be skipped (because already done) but agg & trait updates need the new gens.22:28
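The "one at a time" refresh described above might look something like the sketch below. It assumes a caller-supplied `get_provider` callable that stands in for a single `GET /resource_providers/{uuid}` request and returns the decoded body with a `generation` field; both `refresh_generations` and `get_provider` are hypothetical names, not part of the report client.

```python
def refresh_generations(uuids, get_provider):
    """Re-fetch each provider individually and collect its generation.

    `get_provider` is a stand-in for one GET per provider; a bulk
    uuid=in:[...] query would replace this loop if it existed.
    """
    generations = {}
    for rp_uuid in uuids:
        body = get_provider(rp_uuid)
        generations[rp_uuid] = body["generation"]
    return generations


# Fake backend for illustration only: pretend the reshape bumped each
# provider's generation on the server side.
_fake_server = {
    "rp-a": {"uuid": "rp-a", "generation": 5},
    "rp-b": {"uuid": "rp-b", "generation": 2},
}
calls = []


def fake_get(rp_uuid):
    calls.append(rp_uuid)
    return _fake_server[rp_uuid]


new_gens = refresh_generations(["rp-a", "rp-b"], fake_get)
```

The cost is one round trip per provider, which is the overhead a list-valued query parameter would avoid.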
mriedemi know of at least one compute microversion that touched 2 things that were loosely related22:28
mriedemhttps://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id4622:29
mriedemwhy did "we" decide not to return generations in the response from POST /reshaper?22:29
mriedemthe request body requires generations right?22:30
mriedemfor POST /reshaper22:30
efriedyeah.22:31
efriedIt wasn't a good reason.22:32
efriedI don't remember what it was.22:32
efriedor whether it was in review or IRC or what.22:32
efriedI remember conceding the point under protest, but not as much protest as usual, because we needed to get moving and it was going to be extra work.22:32
*** mriedem has quit IRC22:39
openstackgerritEric Fried proposed openstack/nova master: WIP: Compute: Handle reshaped provider trees  https://review.openstack.org/57623623:06
openstackgerritEric Fried proposed openstack/nova master: WIP: Report client: update_from_provider_tree w/reshape  https://review.openstack.org/58504923:06
efriedmriedem, cdent, jaypipes, gibi, bauzas, dansmith: The series is complete now. Top two patches need test, but code side is ready for perusal ^23:07
*** tetsuro has joined #openstack-placement23:19
*** mriedem has joined #openstack-placement23:21
*** edmondsw has joined #openstack-placement23:38
*** edmondsw has quit IRC23:43
openstackgerritMatt Riedemann proposed openstack/nova master: Enhance doc to guide user to use nova user  https://review.openstack.org/58311523:48
*** mriedem has quit IRC23:50
*** takashin has joined #openstack-placement23:52
