Friday, 2018-10-05

*** openstackgerrit has quit IRC04:52
*** openstackgerrit has joined #openstack-placement04:57
*** e0ne has joined #openstack-placement05:34
*** e0ne has quit IRC05:50
*** e0ne has joined #openstack-placement05:53
*** e0ne has quit IRC06:12
*** gibi is now known as giblet06:42
*** helenafm has joined #openstack-placement07:18
*** bauzas is now known as PapaOurs07:39
gibletefried, jaypipes: thank you for the reviews and approvals. :) Regarding the rest of the use-nested-allocation-candidates bp I put the patches in the runway queue but did not put them directly in the empty runway slot as I need one or two workdays to focus on some slide making tasks. Still I will try to respond to the reviews you made07:41
gibletjaypipes, efried: as https://review.openstack.org/#/c/585672 merged, the happy path works so we already unblocked reshape + vgpu for PapaOurs07:42
PapaOurs\o/07:43
gibletjaypipes, efried: the rest of the open patches are test coverage and not-so-happy-path bugfixing07:43
PapaOursgiblet: I'll rebase my reshaper change then07:43
gibletPapaOurs: nothing better on Friday morning than a nice clean rebase with a cup of good coffee07:45
PapaOursI already have the coffee :)07:45
*** e0ne has joined #openstack-placement07:49
*** ttsiouts has joined #openstack-placement07:59
*** tssurya has joined #openstack-placement07:59
PapaOursgiblet: can I ask you a dummy question ?07:59
*** takashin has left #openstack-placement08:00
*** ttsiouts has quit IRC08:02
*** ttsiouts has joined #openstack-placement08:03
gibletPapaOurs: sure go ahead08:05
giblet(there is no such thing that dummy question)08:05
PapaOursgiblet: when we call a-c for getting a list of potential candidates08:05
PapaOursgiblet: then we PUT allocations08:06
PapaOursgiblet: I guess the allocations are then fine because even if we have different RPs, it works08:06
*** ttsiouts has quit IRC08:07
gibletPapaOurs: what do you mean by different RPs? the RPs we are allocating from are in the allocation candidate response08:07
PapaOursok so we're cool08:07
PapaOursgiblet: I just wonder which specific RPs we pass down to the filters08:07
gibletPapaOurs: the filter scheduler gets both the allocation requests (from the candidates) and the provider summaries in https://github.com/openstack/nova/blob/122702a5a8aaf3231f0ee416ba9bc6935547f0ab/nova/scheduler/filter_scheduler.py#L48-L5008:11
PapaOursgiblet: sorry, I'm just discussing on a separate internal issue downstream as of now, so I'm a bit on and off08:12
PapaOursgiblet: lemme clarify08:12
PapaOursgiblet: previously (pre-nested world), the allocation candidate was a single RP UUID right?08:12
PapaOursgiblet: so when it was chosen, we were just looking at the corresponding host08:13
PapaOurslemme find the code08:13
gibletPapaOurs: if we ignore the sharing RP case which was not really supported in nova then yes08:13
gibletPapaOurs: yeah, the filters are still filtering hosts as far as I know08:13
PapaOursgiblet: I'm talking of https://github.com/openstack/nova/blob/122702a5a8aaf3231f0ee416ba9bc6935547f0ab/nova/scheduler/filter_scheduler.py#L14208:14
*** ttsiouts has joined #openstack-placement08:15
PapaOursgiblet: yeah, so https://github.com/openstack/nova/blob/122702a5a8aaf3231f0ee416ba9bc6935547f0ab/nova/scheduler/filter_scheduler.py#L481 is concerning me08:17
gibletPapaOurs: yeah, host states are gathered by the RP uuids in the a_c response08:17
gibletPapaOurs: yeah, that assumption does not hold for nested RP trees any more08:17
PapaOursgiblet: because we assume that a RP UUID from the list of provider_summaries is necessarily a compute node08:17
PapaOursgiblet: which is incorrect in terms of nested RPs with VGPU resource classes on a nested inventory08:18
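The shape under discussion can be sketched with made-up data: a GET /allocation_candidates response for a nested tree where the VGPU inventory lives on a child RP, so the allocation request names a provider UUID that is not a compute node. All UUIDs and the exact layout here are illustrative, not real API output:

```python
# Hypothetical GET /allocation_candidates response for a nested provider
# tree: the compute node is the root RP, the VGPU inventory sits on a
# child RP. UUIDs are made up for illustration.
CANDIDATES = {
    "allocation_requests": [
        {
            "allocations": {
                "compute-rp-uuid": {"resources": {"VCPU": 1, "MEMORY_MB": 512}},
                "vgpu-child-rp-uuid": {"resources": {"VGPU": 1}},
            }
        }
    ],
    "provider_summaries": {
        "compute-rp-uuid": {"resources": {"VCPU": {"capacity": 8, "used": 0}}},
        "vgpu-child-rp-uuid": {"resources": {"VGPU": {"capacity": 2, "used": 0}}},
    },
}

# The scheduler's old assumption: every RP UUID appearing in an
# allocation request is a compute node. With nested trees the child RP
# UUID appears too, so the assumption breaks.
rp_uuids = set()
for req in CANDIDATES["allocation_requests"]:
    rp_uuids.update(req["allocations"])

print(sorted(rp_uuids))  # both the root and the child RP show up
```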
gibletPapaOurs: let me play with the functional env to see why it does not fail hard when we have a non-compute RP in the a_c response08:18
PapaOursgiblet: I think that since we only check classic resources that are not on a child RP, we're fine08:18
PapaOursI mean, I'm fine with us having +2d the main change, but that's not fully functional yet per my gut08:19
gibletPapaOurs: but I have tests that have CUSTOM_MAGIC on a nested RP08:19
gibletPapaOurs: and that seems to work08:19
PapaOursgiblet: that's strange then08:19
PapaOursbecause when I look https://developer.openstack.org/api-ref/placement/#list-allocation-candidates08:19
gibletPapaOurs: yeah, need to trace those tests a bit08:19
PapaOursI can see a provider summary which isn't potentially a root RP08:20
PapaOursgiblet: any idea where I could comment this in the series ?08:20
gibletPapaOurs: you can comment on the test that it should fail based on the code you saw08:21
gibletPapaOurs: here is the FakeDriver reporting nested tree https://review.openstack.org/#/c/604084/7/nova/virt/fake.py@68608:21
gibletPapaOurs: and here is the test setting up a flavor with resource request from the child RP08:22
gibletPapaOurs: https://review.openstack.org/#/c/604084/7/nova/tests/functional/test_servers.py@486908:22
gibletPapaOurs: basically the whole ServerMovingTestsWithNestedResourceRequests test class should fail on scheduling based on the assumption you found in the code above08:23
PapaOursgiblet: yeah, even a classic boot request08:23
gibletPapaOurs: exactly08:23
gibletgiblet: so I will go and run those tests with some extra tracing in the scheduler08:23
PapaOursgiblet: you know what ? I feel I understand why it fails08:28
PapaOursgiblet: well, technically it doesn't fail08:28
PapaOursgiblet: it's just that we don't find the corresponding compute node UUID08:28
gibletPapaOurs: do you mean that when we gather the computes by uuid and we have non-compute uuids in the list then the SQL still returns the matching computes without complaining? https://github.com/openstack/nova/blob/235d03ca95aa656957ac13e29ebb3b7515cdba8a/nova/objects/compute_node.py#L44608:31
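The behaviour giblet suspects here — an IN-clause lookup returning only the matching rows and silently ignoring unknown uuids instead of erroring — is easy to demonstrate with a toy table (sqlite3 as a stand-in; this is not the real nova schema or query):

```python
import sqlite3

# Toy stand-in for the compute_nodes table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compute_nodes (uuid TEXT PRIMARY KEY, host TEXT)")
conn.execute("INSERT INTO compute_nodes VALUES ('compute-rp-uuid', 'host1')")

# Ask for both a compute RP uuid and a child RP uuid that has no
# compute_nodes row: the query returns the matching row and silently
# ignores the unknown uuid -- no error is raised.
uuids = ["compute-rp-uuid", "vgpu-child-rp-uuid"]
placeholders = ",".join("?" * len(uuids))
rows = conn.execute(
    f"SELECT uuid, host FROM compute_nodes WHERE uuid IN ({placeholders})",
    uuids,
).fetchall()
print(rows)  # [('compute-rp-uuid', 'host1')]
```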
PapaOursgiblet: I need to look at the exact code we have for gathering the host states08:32
PapaOursgiblet: but I imagine a crazy problem08:32
PapaOursgiblet: with aggregates08:32
PapaOursgiblet: if two providers can fit the request but are on two distinct hosts, we should in theory check those two hosts by the filters08:33
PapaOursgiblet: don't take it the hard way, I just have put a -2 on the change08:34
PapaOursgiblet: it's just for making sure we correctly understand the implications08:34
gibletPapaOurs: no hard feelings. I also want to understand why it works08:34
PapaOursgiblet: I'll remove it if we consider this concern as a non-issue08:34
gibletPapaOurs: but if two RPs (on two distinct hosts) can fit the requests then the scheduler filter will have two distinct host states that it feeds to the filters08:36
gibletPapaOurs: except if the two RPs are not compute RPS08:36
gibletPapaOurs: but child RPs08:36
PapaOursgiblet: what if one of the two RPs is a child RP ?08:36
PapaOursyeah08:36
gibletPapaOurs: BUT in practice in nova every candidate will contain at least the compute RP and now maybe additional child RPs08:37
PapaOursgiblet: can we easily get the tree from the scheduler perspective ?08:37
gibletPapaOurs: we cannot have a candidate that only contains a single child RP08:37
PapaOursgiblet: what if I'm only asking for resources:VGPU=1 ?08:37
PapaOursand I stupidely don't care about CPU, DISK and RAM ?08:38
gibletPapaOurs: your flavor will have vcpu and memory_mb in it08:38
PapaOursor even worse, what if I have a NUMA topology with CPU and RAM on a child RP, and DISK on a shared RP ?08:38
gibletPapaOurs: yeah NUMA will complicate this08:38
PapaOursgiblet: yeah, you're right, my example is bad08:38
PapaOursgiblet: so, my take on that is08:38
gibletPapaOurs: there was a concept mentioned a couple of times in placement "anchoring RP"08:39
gibletPapaOurs: I feel it is related08:39
PapaOurs1/ I just think we silently only return compute nodes, which is okay for now08:39
PapaOurs2/ but that could become a problem once we model NUMA08:39
PapaOursif you have a resource that's on the root RP, you're safe08:40
PapaOursbut if you only ask for allocations that are on child RPs, then you could have 0 potential hosts to filter08:40
gibletPapaOurs: yeah I agree with your points08:41
PapaOursI just need to doublecheck 1/08:42
gibletPapaOurs: I have to dig a bit but I think the problem was discussed before from placement perspective08:42
*** cdent has joined #openstack-placement08:45
*** tetsuro_ has quit IRC08:45
PapaOursgiblet: confirmed that 1/ is harmless08:45
giblet PapaOurs: yeah I've just run the test and observed that even if we provide not just compute uuids for the get_host_states_by_uuids() call it still returns the host states for the computes and ignores the rest of the uuids (the child RPs)08:49
PapaOursgiblet: I just added a comment08:49
PapaOursgiblet: https://github.com/openstack/nova/blob/c22b53c2481bac518a6b32cdee7b7df23d91251e/nova/objects/compute_node.py#L444-L445 isn't very conservative08:50
gibletPapaOurs: I have an idea of how to prove 2/: I think I can put together some functional test where only the child RP has resources08:50
PapaOursgiblet: yeah08:50
PapaOursgiblet: https://review.openstack.org/#/c/604084/7/nova/tests/functional/test_servers.py@486908:51
PapaOursI need to disappear, I have four internal bugs to triage :(08:51
gibletPapaOurs: thanks for the nice discussion. Happy triaging! (I will work on the reproduction of 2/)08:52
PapaOurshappy, happy, I'm not sure08:54
*** efried has quit IRC09:17
*** efried1 has joined #openstack-placement09:17
*** efried1 is now known as efried09:19
gibletPapaOurs: I can reproduce the assumed fault of 2/ but I think the situation is a bit better than we thought09:22
PapaOursok cool09:23
gibletPapaOurs: so the instance requests resources as normal, but every resource is provided by a child RP and the root RP has no inventory.09:23
PapaOursyou mean the response we get from a-c ?09:23
gibletPapaOurs: in this case the GET a_c response contains allocations only for the child RP BUT the provider_summaries part contains the root RP and the child RP as well09:23
PapaOursso we still have the provider_summaries having the root RP in it?09:24
PapaOursoh ok cool then09:24
gibletyes09:24
giblethowever the scheduling still fails :)09:24
PapaOurshah09:24
PapaOurswhy ?09:24
gibletstill looking...09:24
gibletit ends up with NoValidHosts09:24
PapaOursok09:24
gibletI will push the reproduction patch after lunch09:25
gibletPapaOurs: so the filters are finding the right hoststate but when the filter scheduler tries to translate that back to an allocation candidate it does not find the candidate here https://github.com/openstack/nova/blob/122702a5a8aaf3231f0ee416ba9bc6935547f0ab/nova/scheduler/filter_scheduler.py#L21609:31
gibletPapaOurs: alloc_reqs_by_rp_uuid is built from the candidates and therefore does not contain the root RP, just the child RP09:31
gibletPapaOurs: but the selected host state only contains the root RP09:32
cdentugh09:32
gibletPapaOurs: this causes that no candidate is found09:32
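A minimal reconstruction of the failure described above (variable names are illustrative stand-ins, not the actual filter_scheduler code): the lookup table is keyed by the RP uuids appearing in the allocation requests, while the selected host state only knows the root RP, so the reverse lookup misses and scheduling ends in NoValidHost:

```python
# Stand-in for alloc_reqs_by_rp_uuid: keyed by the RP uuids that appear
# in the allocation requests. When every requested resource comes from
# a child RP, only the child uuid is a key.
alloc_reqs_by_rp_uuid = {
    "vgpu-child-rp-uuid": [
        {"allocations": {"vgpu-child-rp-uuid": {"resources": {"VGPU": 1}}}}
    ],
}

# The selected host state identifies itself by the root (compute node)
# RP uuid only.
selected_host_rp_uuid = "compute-rp-uuid"

# Translating the chosen host back to a candidate misses: no entry is
# found, which surfaces as NoValidHost.
candidate = alloc_reqs_by_rp_uuid.get(selected_host_rp_uuid)
print(candidate)  # None
```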
gibletcdent: while it is a problem it is only a problem when the instance does not request resources from the root RP. And placement is still good as it returns the root RP in the provider summary. So only nova needs to adapt to this when NUMA RPs are introduced09:34
* cdent nods09:35
cdentI was mostly "ugh"ing about the many pieces needing to interact making it hard to get it right and simple09:35
*** stephenfin is now known as finucannot09:35
gibletcdent: it is just placement and the nova scheduler ;)09:37
cdenthost states v resource providers v allocation requests v provider summaries v root rp v child rp v post request filters v pre request filters v placement filtering v ....09:38
gibletcdent: if you look at it that way... :)09:38
cdentit's a tangled web we have woven09:38
cdentwhen it works well it is great09:38
PapaOursgiblet: I see09:40
PapaOursgiblet: thanks for the update09:40
giblethost state is nova only and it is really just about serving the scheduler filters. I think the allocation request, allocation candidate, provider summaries triplet is the confusing part and of course the translation between them09:41
gibletahh and alternates :)09:41
PapaOursgiblet: cdent: I feel that while we say it's only needed for NUMA, we somehow need to fix it sooner than later09:41
PapaOursgiblet: cdent: because that's just fortunate it works09:41
PapaOursand I don't want to hold NUMA because of this09:42
gibletPapaOurs: yeah, at least we have to remove the bad assumptions from the code09:42
giblet~ assumptions that don't hold any more09:42
PapaOursgiblet: what we could do is try to find a way to say "which is my root RP" ?09:42
PapaOursI feel it's probably a nova-only fix09:43
gibletPapaOurs: yeah I think the solution could be a scheduler-only piece that replaces alloc_reqs_by_rp_uuid with a better lookup table09:43
cdentwould it make any sense to "give" provider summaries to their corresponding host states?09:43
PapaOursI need to go to the gym09:43
PapaOursI'll be back later in 2 hours-ish09:43
* cdent needs to go the gym for his entire openstack career09:43
gibletPapaOurs: I'm also running for lunch so see you later09:44
PapaOursgiblet: I leave my procedural -2 until we clarify a good solution path09:44
PapaOurs++09:44
gibletPapaOurs: OK, sure09:44
* giblet needs to insert the swimming pool visits back to his calendar09:45
gibletcdent: attaching summaries to host states could be a way too09:45
cdentgiblet: if you get a chance to look at this stack from efried, that would merge some nice cleanups: https://review.openstack.org/#/c/602701/10:00
cdentand get many of the outstanding commits10:01
cdentyou might also have some ideas on the best way to deal with https://review.openstack.org/#/c/601866/10:01
*** ttsiouts has quit IRC10:22
gibletcdent: do we have nova-placement integration tests in place to see if such a refactoring https://review.openstack.org/#/c/602701/ is really just a refactoring?10:40
cdentthe functional tests will cover all that, and if they don't they aren't good enough :). We need to rely as much as possible on the API fitting its contract and nova using that API, not random integrations. However, I've started the process for integration tests in this stack: https://review.openstack.org/#/c/601614/ but they have quite a few unresolved depends-on10:42
cdentgiblet: did you see: https://anticdent.org/gabbi-in-the-gate.html10:42
gibletcdent: it is queued up for reading10:43
cdentgreat, thanks10:43
cdentthere's also this: https://review.openstack.org/#/c/601412/10:43
cdentall of that is hung up on: https://review.openstack.org/#/c/600162/ and https://review.openstack.org/#/c/604454/10:44
cdentAnd the whole thing keys off this hack which needs to be turned into a real thing: https://review.openstack.org/#/c/600161/10:46
cdentbiab10:46
gibletcdent: nice, I queued up the patches for review10:47
*** tetsuro has quit IRC10:48
*** s10 has joined #openstack-placement11:06
*** helenafm has quit IRC11:21
*** ttsiouts has joined #openstack-placement11:40
*** e0ne has quit IRC11:54
*** helenafm has joined #openstack-placement12:12
*** jaypipes is now known as leakypipes12:12
*** e0ne has joined #openstack-placement12:29
*** dims_ has quit IRC12:58
*** mriedem has joined #openstack-placement12:59
PapaOursgiblet: I'm back but I'm looking at some internal issue13:10
gibletPapaOurs: no worries, I'm working on a fix for the issue you found13:10
gibletPapaOurs: first I thought it would be easy so I dived in, but the get_alternate_hosts codepath gives me a headache13:11
PapaOursargh13:11
PapaOursok13:11
PapaOursthanks for the fix13:11
*** efried is now known as fried_rice13:14
openstackgerritJay Pipes proposed openstack/os-traits master: Add COMPUTE_TIME_HPET trait  https://review.openstack.org/60825813:18
fried_ricePapaOurs, giblet, cdent: Seems like there are two things we need wrt filters:13:23
fried_rice1) They need to be multi-provider allocation request-aware to do their filtering operations13:23
fried_rice2) They need to know how to find the root provider (i.e. host) from a multi-provider allocation request so they know where to send the boot request13:23
gibletfried_rice: I'm not sure our filter needs to know about anything other than the hoststate13:24
gibletfried_rice: for the 2) I agree13:24
gibletfried_rice: and there provider_summaries helps13:24
fried_riceAt least 2) is a simple thing: provider_summaries[random_provider_from_allocation_request.root_provider_uuid]13:24
gibletfried_rice: I'm cooking up a solution right now for 2)13:25
gibletfried_rice: exactly13:25
fried_ricegiblet: If we only care to filter on the host, then we can do 2) first and 1) is unchanged.13:25
gibletfried_rice: give me a couple of hours and I will have specific code to talk about :)13:25
gibletfried_rice: yeah, current code explicitly ignores 1)13:26
gibletfried_rice: so I only want to solve 2) first as that is unavoidable13:26
gibletfried_rice: then if there is a need then we need to talk about 1) separately13:26
fried_rice++13:27
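fried_rice's one-liner for 2) could be expanded into a small helper, assuming provider_summaries expose a root_provider_uuid field (they do in recent placement microversions) and ignoring the sharing-provider case, as the discussion above does — a sketch, not the eventual nova code:

```python
def root_provider_uuid(allocation_request, provider_summaries):
    """Find the root (compute node) RP for one allocation request.

    Any provider in the request works as a starting point: ignoring
    sharing providers, all providers in a single candidate belong to
    the same tree, so they share a root.
    """
    any_rp = next(iter(allocation_request["allocations"]))
    return provider_summaries[any_rp]["root_provider_uuid"]


# Toy data: VGPU comes from a child RP; the summaries record each
# provider's root. UUIDs are illustrative.
summaries = {
    "vgpu-child-rp-uuid": {"root_provider_uuid": "compute-rp-uuid"},
    "compute-rp-uuid": {"root_provider_uuid": "compute-rp-uuid"},
}
request = {"allocations": {"vgpu-child-rp-uuid": {"resources": {"VGPU": 1}}}}

print(root_provider_uuid(request, summaries))  # compute-rp-uuid
```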
fried_riceWhich patch is -2'd for this currently?13:27
gibletfried_rice: https://review.openstack.org/#/c/60408413:28
gibletfried_rice: which is just test coverage that PapaOurs and I thought should fail13:28
fried_riceCool.13:29
gibletthen we understood why it doesn't fail right now but it will as soon as there is an allocation candidate that only contains child RPs13:29
PapaOursgiblet: honestly, my -2 is for making sure people see the concern13:30
PapaOurscall it a flag :)13:30
PapaOursfried_rice: for 1/ we checked and that's not a problem, see my comments in the -2'd change13:31
fried_riceack13:31
gibletPapaOurs: I think I will ask you to remove that _after_ I am able to provide tests that fail due to what you discovered and we see some way forward about them13:31
gibletPapaOurs: 1/ could be a separate need. I mean if a scheduler filter wants to select not just between hosts but between alternative candidates targeting the same host13:32
PapaOursgiblet: not sure I understand your point13:32
gibletPapaOurs: e.g. there are two NUMA nodes on the same host and both can fulfill the request13:32
gibletPapaOurs: then we will have two candidates but a single host13:32
PapaOursgiblet: well, filters don't know about placement at all13:32
gibletPapaOurs: but NUMA filter knows about NUMA nodes13:33
PapaOursif we have a single host to check for filters, that's cool13:33
gibletPapaOurs: so NUMA filter might want to select between NUMA nodes13:33
PapaOursgiblet: sure we said to not change that :)13:33
gibletPapaOurs: sure I don't want to change that13:33
gibletPapaOurs: this is why I said it can be a future need13:33
PapaOursgiblet: but NUMA nodes will be checked by placement so we could deprecate the filter13:35
PapaOursgiblet: I mean, the filter will continue to do the same13:35
fried_riceI wouldn't bet on NUMA filtering being done by placement any time soon.13:35
gibletPapaOurs: deprecating NUMA filter would be really nice :)13:35
PapaOursgiblet: but once some verifications are done by placement, we'll only leave some specific things13:35
fried_riceWe are likely to model NUMA in placement but do the filtering post-/a_c for at least a release or two.13:36
PapaOursyup13:38
PapaOursthat's my plan13:38
PapaOursdo the topology in placement13:38
PapaOursbut just keep the filters for a few cycles13:39
PapaOursfilter*13:39
*** ttsiouts has quit IRC13:39
*** s10_ has joined #openstack-placement13:42
*** s10 has quit IRC13:42
gibletohh sh*t, nova assumes that a running server will always have an allocation on the compute RP to detect that such allocation needs to be moved to the migration_uuid during move13:53
gibletso in my test when a server only allocates from the NUMA child, the move operations fail to move the source allocation to the migration_uuid13:54
gibletthis will be fun13:54
gibletstill, technically, in today's nova code base this assumption holds, as today even with nested enabled we cannot boot a server without allocating something from the compute RP13:55
gibletso nothing is broken today, it just does not support the case when CPU and MEMORY are moved to the NUMA RP13:55
giblet... and I thought that it is a happy Friday ...13:57
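The broken assumption giblet just described can be sketched with hypothetical data (this is not the real move-allocation code): nova looks up the server's allocation by the compute node RP uuid, so a server that only allocates from a NUMA child RP has nothing found and nothing gets moved:

```python
# Hypothetical allocations of a running server that only consumes
# resources from a NUMA child RP -- nothing on the compute (root) RP.
server_allocations = {
    "numa0-child-rp-uuid": {"resources": {"VCPU": 4, "MEMORY_MB": 4096}},
}
compute_rp_uuid = "compute-rp-uuid"

# The move path looks for an allocation against the compute RP to
# transfer it to the migration_uuid. Here the lookup finds nothing,
# so the source allocation silently stays in place.
to_move = server_allocations.get(compute_rp_uuid)
print(to_move)  # None
```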
*** ttsiouts has joined #openstack-placement14:00
cdentI still think that we should have vcpu on the host, and some other thing on the numa node14:01
cdentso that there is always something on the host14:01
cdentmriedem: a) aren't you off, b) this stack of depends on is so much fun14:03
mriedemyes and yes14:04
mriedemextracting a required thing wasn't going to be simple...14:04
cdentit's actually simpler than I feared, in some aspects14:09
cdentit's the interrelationship between the various bits of qa/infra that is bewildering, but it fails in reasonable ways14:11
mriedemwell one must sacrifice a few virgin souls to the qa/infra gods to get things to work but yes14:16
fried_ricewow, virgin soul must be even harder, what with reincarnation and all14:19
*** dansmith is now known as SteelyDan14:20
*** ttsiouts has quit IRC14:31
*** ttsiouts has joined #openstack-placement14:31
*** ttsiouts has quit IRC14:36
*** e0ne has quit IRC14:38
openstackgerritJay Pipes proposed openstack/os-traits master: Add COMPUTE_TIME_HPET trait  https://review.openstack.org/60825814:48
PapaOurscdent: sorry, was in a meeting so I missed your point when you said you feel VCPUs should be a root RP RC14:53
PapaOurscdent: but I proposed this in my spec as an alternative and got some solid disagreement on that being a thing14:53
PapaOursgiblet: I'm back after 2 hours of meeting and internal stuff14:54
PapaOursgiblet: I feel we all know a bit more about the problem we're trying to solve, I'm about to un-214:54
PapaOursno real need to hold this change since it doesn't regress14:54
gibletPapaOurs: fine. I'm about to push the patch14:54
cdentPapaOurs: yeah, I think I remember that. I'm mostly being nostalgic for "easy targetting"14:55
gibletPapaOurs: I think I managed to solve the issue 2, but we can debate if my solution is a nice one14:55
*** helenafm has quit IRC14:58
PapaOurscool :)15:01
fried_ricegiblet, PapaOurs: Oh, I was thinking we should put the new test case into that patch. But can reissue +2 if we decide otherwise.15:11
PapaOursfried_rice: that's a good point, I'm not sure we really need to hold based on this15:12
PapaOursfried_rice: that can be a follow-up15:12
fried_ricey'all just let me know what we decide.15:12
* giblet still working on the functional tests15:16
*** ttsiouts has joined #openstack-placement15:21
*** ttsiouts has quit IRC15:29
*** ttsiouts has joined #openstack-placement15:30
*** ttsiouts has quit IRC15:34
gibletPapaOurs, fried_rice, cdent: here is my first stab at fixing the children-only allocation issue found by PapaOurs https://review.openstack.org/#/c/608298/15:43
gibletPapaOurs: be aware that it is ugly at places :)15:43
PapaOursgiblet: ack, thanks15:43
fried_riceack15:43
PapaOursgiblet: that said, I'll call it a day soon15:43
gibletPapaOurs: no worries, I'm calling it a day, and the week, right now :)15:44
giblethave a nice weekend folks!15:44
fried_riceSee ya giblet. Great work this week15:44
PapaOursenjoy15:44
*** e0ne has joined #openstack-placement15:50
*** e0ne has quit IRC15:55
cdentsomething about the breadth of that change doesn't seem right, but I'm rather out of time (for a week) to review it properly16:03
*** PapaOurs is now known as bauzas16:08
*** mriedem has quit IRC16:09
*** helenafm has joined #openstack-placement16:12
*** s10_ has quit IRC16:12
cdentI think I'm done. See you all in a bit more than a week.16:30
* cdent waves16:30
*** cdent has left #openstack-placement16:30
*** dims has joined #openstack-placement16:46
*** helenafm has quit IRC16:49
*** sean-k-mooney has quit IRC17:16
*** sean-k-mooney has joined #openstack-placement17:24
*** tssurya has quit IRC17:44
*** e0ne has joined #openstack-placement19:38
*** e0ne has quit IRC19:43
*** e0ne has joined #openstack-placement19:55
*** e0ne has quit IRC19:56
*** e0ne has joined #openstack-placement20:44
*** e0ne has quit IRC20:49

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!