Thursday, 2019-06-27

*** yedongcan has joined #openstack-nova  00:03
*** openstack has joined #openstack-nova  13:12
*** ChanServ sets mode: +o openstack  13:12
<gibi> mriedem: the previous patches just refactored the heal allocation code path  13:13
<gibi> to make it easy to add the port healing  13:14
*** mmethot has quit IRC  13:15
<gibi> mriedem: that's the basic structure of the final patch  13:15
<gibi> mriedem: during _heal_allocations_for_instance we first check whether the instance has ports that need healing  13:18
<gibi> this _get_port_allocations_to_heal is the majority of the added code there  13:18
<gibi> as it detects whether there is any port that needs healing and generates the allocation fragment for that port  13:19
<mriedem> ok i'm assuming just checking if the port has a resource request (if hitting neutron directly) or checking the info cache for a bound port with the allocation in the binding profile?  13:19
<gibi> we are hitting neutron directly  13:19
<mriedem> hmm, that's kind of expensive isn't it?  13:20
<mriedem> why can't we use the info cache to tell if the vif has a port allocation?  13:20
<mriedem> or is that some kind of chicken-and-egg issue? or we don't trust the cache?  13:20
<gibi> we need to hit neutron to see if there is a resource request on the port  13:20
<gibi> that is not part of the vif cache  13:21
<mriedem> sure, but the binding profile in the vif cache has a thing that tells us if it had a resource request, right?  13:21
<gibi> mriedem: right now the binding profile in the cache can only tell us whether there is an allocation for the port  13:21
<openstackgerrit> Eric Fried proposed openstack/nova-specs master: WIP: Physical TPM passthrough  https://review.opendev.org/667926  13:21
<gibi> but if there is no allocation then we don't know whether such an allocation is needed  13:21
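[Editor's note] The check gibi describes can be sketched roughly as follows. This is a hypothetical illustration, not nova's actual code: the field names follow the neutron port API, but the helper name and port dicts are made up. A port needs healing when neutron reports a resource request for it while its binding profile carries no 'allocation' key yet.

```python
def port_needs_healing(port):
    """Return True if the port requests resources but has no allocation."""
    resource_request = port.get('resource_request') or {}
    binding_profile = port.get('binding:profile') or {}
    # Needs healing: neutron wants resources, but placement was never told.
    return bool(resource_request) and 'allocation' not in binding_profile

# A QoS port attached before Stein: resource request, no allocation yet.
stale_port = {
    'id': 'uuid-1',
    'resource_request': {'resources': {'NET_BW_EGR_KILOBIT_PER_SEC': 1000}},
    'binding:profile': {},
}
# A port already healed: the allocation is recorded in the binding profile.
healed_port = {
    'id': 'uuid-2',
    'resource_request': {'resources': {'NET_BW_EGR_KILOBIT_PER_SEC': 1000}},
    'binding:profile': {'allocation': 'rp-uuid'},
}
```

As the discussion below notes, only neutron knows the resource request, which is why the real code queries neutron rather than the info cache.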
<mriedem> so let's say the binding profile in the vif cache doesn't have that allocation marker, but we find a port with a resource request - we're going to heal the allocation by putting allocations to placement for that instance, right?  13:22
<mriedem> are we also going to update the cache to indicate it's now got an allocation?  13:23
<gibi> mriedem: right, and also updating neutron with the rp uuid  13:23
<gibi> mriedem: hm, we are not updating the info cache  13:23
<gibi> :/  13:23
<mriedem> yeah, so i'm wondering about the scenario where our cache is busted, we go tell placement there is an allocation, but then what happens when we tear down that port/server in the compute service? are we basing that on the cache? b/c if so, we'll leak the allocation, correct?  13:24
<openstackgerrit> Eric Fried proposed openstack/nova master: WIP: Physical TPM passthrough  https://review.opendev.org/667928  13:24
<mriedem> similarly, are we only creating the port allocation in heal_allocations if the port is bound?  13:25
<gibi> mriedem: we cannot leak an allocation during delete, as delete deletes everything for the consumer  13:25
<mriedem> gibi: what about when we detach the port though?  13:25
<mriedem> but don't delete the server  13:25
<gibi> I hope that code path works based on the allocation key in the binding profile, but let me check  13:26
<mriedem> https://github.com/openstack/nova/blob/7b769ad403751268c60b095f722437cbed692071/nova/network/neutronv2/api.py#L1672  13:26
<mriedem> looks like it does...  13:27
<gibi> mriedem: on the other hand, is there any way to trigger an info_cache refresh, or do we need to do it manually?  13:27
<mriedem> there is a periodic task in the compute service that will forcefully rebuild the cache from info in neutron,  13:28
<mriedem> but i don't think that code is doing anything with the resource request / allocation information yet  13:28
<mriedem> might have been something i brought up when reviewing the series in stein  13:28
<mriedem> this is the call from the compute task https://github.com/openstack/nova/blob/7b769ad403751268c60b095f722437cbed692071/nova/compute/manager.py#L7445  13:29
<mriedem> this is the neutronv2 api method that builds the vif object in the cache https://github.com/openstack/nova/blob/7b769ad403751268c60b095f722437cbed692071/nova/network/neutronv2/api.py#L2842  13:30
<mriedem> we copy the binding:profile directly off the port https://github.com/openstack/nova/blob/7b769ad403751268c60b095f722437cbed692071/nova/network/neutronv2/api.py#L2887  13:30
*** rcernin has quit IRC  13:31
<mriedem> maybe that's enough? the 'allocation' is in the binding:profile, yeah? and that maps to the resource provider uuid on which the allocation belongs  13:31
*** jistr is now known as jistr|call  13:31
<gibi> copying binding:profile is enough as it has the 'allocation' key  13:31
<gibi> and that points to the RP  13:32
<gibi> we allocate from  13:32
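[Editor's note] In other words, the 'allocation' entry in the cached binding:profile maps the port to the resource provider its bandwidth was allocated from, which is what the detach path relies on. A simplified illustration (the dict shapes and helper name are assumed; the real teardown code subtracts the port's resources rather than dropping a whole provider entry):

```python
def rp_for_port(vif_profile):
    """Return the resource provider uuid a port's bandwidth came from."""
    return (vif_profile or {}).get('allocation')

# Placement keys an instance's allocations by resource provider uuid.
allocations = {
    'compute-rp': {'resources': {'VCPU': 2, 'MEMORY_MB': 2048}},
    'bandwidth-rp': {'resources': {'NET_BW_EGR_KILOBIT_PER_SEC': 1000}},
}
profile = {'allocation': 'bandwidth-rp'}

# On detach, the fragment for the port's provider can be removed
# without touching the rest of the instance's allocations.
rp = rp_for_port(profile)
if rp in allocations:
    del allocations[rp]
```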
<gibi> mriedem: this is where the heal port allocation makes sure that the neutron binding:profile is updated https://review.opendev.org/#/c/637955/28/nova/cmd/manage.py@1857  13:32
<openstackgerrit> Merged openstack/nova stable/stein: libvirt: Rework 'EBUSY' (SIGKILL) error handling code path  https://review.opendev.org/667389  13:33
<mriedem> ok, so going back to my original question, cache vs neutron directly, i guess heal_allocations is going to the source of truth first (neutron) and working based on that, which is more fool-proof if possibly less performant  13:34
<mriedem> if we healed based on the cache, and the cache was stale or missing information somehow, then we would fail to heal allocations for the ports in that busted cache  13:34
<gibi> mriedem: there is no way to figure out what to heal if we don't hit neutron. the info cache is only good for seeing whether there is an allocation for a port, but only neutron knows if the port needs an allocation  13:35
<mriedem> with the way you have it, heal_allocations would heal the allocations and the port binding profile, and the _heal_instance_info_cache task in the compute service running that instance would heal the cache  13:35
<mriedem> sure, i realize that,  13:35
<mriedem> i was thinking of using the cache to pre-filter the list of ports we're asking neutron about  13:35
<mriedem> i.e. if we're iterating over 50 instances that's 50 calls to neutron  13:36
<gibi> mriedem: you mean if we see the allocation key in the cache then we assume that such a port doesn't need to be healed?  13:36
<mriedem> i do like that you added an option to skip this though  13:36
<mriedem> gibi: something like that  13:37
<mriedem> if we had stored some flag in the vif cache originally that indicated the port had resource requests (not necessarily the resource request itself), heal_allocations could have just checked that flag in the cache and then called neutron for that port  13:37
<mriedem> and if ifs and buts were candy and nuts we'd all have a very fine christmas  13:38
<gibi> mriedem: these vifs were created _before_ there was any logic in nova about bandwidth resources. So I don't see how to have a flag in nova about it  13:38
<mriedem> oh right, this is also for healing the case of ports attached with qos before stein...  13:39
<mriedem> i forgot about that  13:39
<gibi> this is exactly the case where the port was attached without an allocation  13:39
<mriedem> anyway, i'm trying to optimize something that we just want to work on the first go, so i can drop it  13:39
<gibi> because neutron had support for bandwidth way earlier  13:39
*** takashin has joined #openstack-nova  13:40
<mriedem> the other good thing is that heal_allocations now has the option to heal a single instance, so if an operator is trying to fix one specific instance (b/c of some user ticket or something) then they can just target that one rather than all instances to heal 1  13:40
*** rajinir has joined #openstack-nova  13:41
<gibi> yeah, I remember I had to adapt to that improvement at some point  13:41
<gibi> one extra complexity to note: as we need to update both placement and neutron, the update cannot be made atomic  13:42
<mriedem> and i'm assuming the earlier patches to refactor the code didn't really require changes to the functional tests that validate the interface (unlike you'd need with unit tests)  13:43
*** munimeha1 has joined #openstack-nova  13:43
*** dave-mccowan has quit IRC  13:43
<mriedem> i wrote all of the heal_allocations stuff with functional tests b/c i didn't trust unit tests for testing the interface  13:43
<gibi> mriedem: yes, no functional test was harmed in the refactoring process  13:43
<mriedem> if we fail to update one (placement or neutron) we could roll back the other...  13:43
<mriedem> heh, "no functional test was harmed in the making of this patch"  13:44
<mriedem> ok, this has been a useful discussion before i dive in  13:44
<mriedem> thanks  13:44
<gibi> mriedem: yes, rollback is one option. I went for just printing what failed and how to update neutron manually  13:45
<gibi> mriedem: if you think rollback is better then I have to temporarily store the original allocation of the instance before the heal, and put that back if the neutron update fails  13:46
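[Editor's note] The rollback variant gibi describes could look something like this sketch. The client objects and method names here are invented for illustration; as gibi says, the actual patch instead prints what failed and how to update neutron manually.

```python
import copy

def heal_with_rollback(placement, neutron, consumer_uuid,
                       new_allocations, port_id, rp_uuid):
    """Heal placement first; undo it if the neutron update fails."""
    # Snapshot the consumer's current allocations so they can be restored.
    original = copy.deepcopy(placement.get_allocations(consumer_uuid))
    placement.put_allocations(consumer_uuid, new_allocations)
    try:
        # Record the resource provider on the port's binding profile.
        neutron.update_binding_profile(port_id, {'allocation': rp_uuid})
    except Exception:
        # Neutron update failed: restore the original allocations so the
        # two services do not end up disagreeing.
        placement.put_allocations(consumer_uuid, original)
        raise
```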
*** pcaruana has quit IRC13:46
*** tbachman has quit IRC13:46
*** pcaruana has joined #openstack-nova13:47
*** bbowen has joined #openstack-nova13:48
<openstackgerrit> Ghanshyam Mann proposed openstack/nova master: Multiple API cleanup changes  https://review.opendev.org/666889  13:50
<mriedem> we could also retry the port updates with a backoff loop, like 3 retries or something, in case it's a temporary network issue  13:51
<mriedem> could be follow-ups though  13:51
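[Editor's note] A retry loop of the kind mriedem suggests might look like this sketch (assumed names, not nova's code): retry the update a few times with an increasing delay, in case the failure is transient.

```python
import time

def update_with_retries(update_fn, attempts=3, base_delay=1.0):
    """Call update_fn, retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return update_fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            # Back off: base_delay, 2*base_delay, 4*base_delay, ...
            time.sleep(base_delay * (2 ** attempt))
```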
<mriedem> i'll keep it in mind when i review and leave a comment  13:51
<gibi> mriedem: ack  13:54
*** eharney has quit IRC13:58
*** mkrai_ has joined #openstack-nova14:00
<mriedem> nova meeting happening  14:02
*** Luzi has quit IRC14:02
*** tbachman has joined #openstack-nova14:04
*** _alastor_ has joined #openstack-nova14:06
*** mkrai_ has quit IRC14:06
<mriedem> gibi: also in the back of my mind i've been thinking of adding something to the nova-next job post_test_hook, like creating a server, deleting its allocations in placement and then running heal_allocations, just to make sure we have integration test coverage as well, but it's lower priority  14:10
<gibi> mriedem: I did similar manual testing for my patch in devstack and discovered bugs. So I agree that it would be useful  14:11
*** eharney has joined #openstack-nova14:11
*** lpetrut has quit IRC14:13
*** mlavalle has joined #openstack-nova14:15
*** ricolin_ has joined #openstack-nova14:20
<jrosser> guilhermesp: this lot https://review.opendev.org/#/q/topic:fix-octavia+(status:open+OR+status:merged)  14:20
<jrosser> guilhermesp: there are some jobs running with some depends-on, but it's difficult to see what's going on  14:20
<jrosser> oops -ECHAN  14:21
* guilhermesp looking jrosser  14:23
*** dpawlik has quit IRC14:27
*** mmethot has joined #openstack-nova14:28
*** jistr|call is now known as jistr14:31
*** mmethot has quit IRC14:33
*** mmethot has joined #openstack-nova14:42
<openstackgerrit> Surya Seetharaman proposed openstack/nova stable/stein: Grab fresh power state info from the driver  https://review.opendev.org/667948  14:50
*** NostawRm has quit IRC14:52
*** xek__ has quit IRC14:53
<mriedem> http://status.openstack.org/reviews/#nova sure is fun for abandon fodder  14:54
<openstackgerrit> Vrushali Kamde proposed openstack/nova master: Support filtering of hosts by forbidden aggregates  https://review.opendev.org/667952  14:55
*** xek has joined #openstack-nova14:55
*** ratailor has joined #openstack-nova14:56
*** xek_ has joined #openstack-nova14:58
<efried> abandon away  14:58
*** ivve has quit IRC15:00
*** takashin has left #openstack-nova15:01
*** xek has quit IRC15:01
*** dpawlik has joined #openstack-nova15:01
*** tbachman has quit IRC15:02
*** dpawlik has quit IRC15:07
<openstackgerrit> Surya Seetharaman proposed openstack/nova stable/rocky: Grab fresh power state info from the driver  https://review.opendev.org/667955  15:09
*** cfriesen has joined #openstack-nova15:09
*** hamzy has quit IRC15:13
*** pcaruana has quit IRC15:14
*** dpawlik has joined #openstack-nova15:17
*** dpawlik has quit IRC15:22
<openstackgerrit> Matt Riedemann proposed openstack/nova master: RT: replace _instance_in_resize_state with _is_trackable_migration  https://review.opendev.org/560467  15:24
<mriedem> efried: another rebase on that one ^  15:25
*** udesale has joined #openstack-nova  15:25
<mriedem> dansmith: can you come back on this instance.hidden patch https://review.opendev.org/#/c/631123/ ?  15:25
<dansmith> <insert hidden pun here>  15:26
<mriedem> dansmith.hidden = True  15:26
*** _alastor_ has quit IRC  15:27
<efried> mriedem: I can't even see where the conflict was on that one.  15:27
*** _alastor_ has joined #openstack-nova  15:27
*** ohwhyosa has quit IRC  15:28
<efried> oh, never mind  15:28
<dansmith> mriedem: is the hidden check in the api really required? aren't you filtering out hidden instances from the list query?  15:30
<mriedem> dansmith: i'd have to look again to confirm, but i believe there is a window where both are not hidden  15:31
<mriedem> while swapping over  15:31
*** NostawRm has joined #openstack-nova  15:31
<mriedem> which,  15:31
<dansmith> mriedem: in that case, the check for hidden-ness doesn't help, right?  15:31
<mriedem> arguably the api will still filter - it might pick the "wrong" one but...  15:32
<mriedem> the comment in there mentions something about that, right? updated_at and such  15:32
<dansmith> right, but the check of the hidden field would be pointless in that case  15:32
*** tbachman has joined #openstack-nova  15:32
<dansmith> the comment is talking about the case where they're both not hidden  15:32
<dansmith> I'm talking about the case where one is.. what's the point of checking it if instance_list doesn't return them?  15:33
*** bbowen has quit IRC  15:34
*** luksky has quit IRC  15:34
<mriedem> ok, so this is the point where we could have 2 copies where hidden=False https://review.opendev.org/#/c/635646/32/nova/conductor/tasks/cross_cell_migrate.py@593 so the DB API would return both from each cell while listing  15:34
<mriedem> now that the DB API isn't returning hidden=True by default...  15:34
<dansmith> ...and so checking for instance.hidden in compute/api does what?  15:34
<mriedem> yeah, in this case, b/c the db api won't return the hidden one, "or instance.hidden" will always be false  15:35
<dansmith> right  15:36
<mriedem> the note above still applies, but the logical or doesn't  15:36
<mriedem> so are you ok with removing the or condition and leaving the comment?  15:36
<dansmith> yep, just said that in the review  15:37
<mriedem> yup, thanks  15:37
*** hamzy has joined #openstack-nova15:40
*** factor has quit IRC15:41
*** hongbin has joined #openstack-nova15:41
*** factor has joined #openstack-nova15:41
<openstackgerrit> melanie witt proposed openstack/nova master: Require at least cryptography>=2.7  https://review.opendev.org/667765  15:48
*** icarusfactor has joined #openstack-nova15:49
*** factor has quit IRC15:50
*** ttsiouts has quit IRC15:51
*** itssurya has quit IRC15:52
*** ttsiouts has joined #openstack-nova15:52
*** ccamacho has quit IRC15:55
*** ttsiouts has quit IRC15:56
*** ratailor has quit IRC15:57
*** artom|gmtplus3 has quit IRC15:59
*** wwriverrat has joined #openstack-nova16:00
*** wwriverrat has quit IRC16:00
*** damien_r has quit IRC16:01
*** wwriverrat has joined #openstack-nova16:01
*** jangutter has quit IRC16:02
*** jangutter has joined #openstack-nova16:02
<mriedem> dansmith: i'm going to drop https://review.opendev.org/#/c/631123/34/nova/tests/unit/compute/test_compute_api.py then, since it's not really valid given the db api wouldn't return a hidden=True instance  16:09
<dansmith> yeah  16:09
*** igordc has joined #openstack-nova16:11
*** wwriverrat has left #openstack-nova16:13
<openstackgerrit> Miguel Ángel Herranz Trillo proposed openstack/nova master: Add support for 'initenv' elements  https://review.opendev.org/667975  16:13
<openstackgerrit> Miguel Ángel Herranz Trillo proposed openstack/nova master: Add support for cloud-init on LXC instances  https://review.opendev.org/667976  16:13
<mriedem> looks like someone is trying to make libvirt+lxc work again  16:15
<openstackgerrit> Miguel Ángel Herranz Trillo proposed openstack/nova master: Add support for cloud-init on LXC instances  https://review.opendev.org/667976  16:16
*** xek__ has joined #openstack-nova16:20
*** bbowen has joined #openstack-nova16:21
<openstackgerrit> Merged openstack/nova-specs master: support virtual persistent memory  https://review.opendev.org/601596  16:21
*** xek_ has quit IRC16:23
*** mdbooth_ has joined #openstack-nova16:29
<efried> mriedem: At some point recently you wrote a functional test that used a weigher to prefer host1 so that the assertion that we landed on host2 was provably valid every time (instead of just by chance). Can you put your finger on that easily?  16:30
*** luksky has joined #openstack-nova  16:30
*** rdopiera has quit IRC  16:31
<mriedem> efried: look for HostNameWeigher in nova/tests/functional  16:31
<mriedem> there are several examples  16:31
<efried> thanks  16:31
<mriedem> ima push this big ass series b/c i've been rebasing it locally for weeks and want to flush it  16:31
*** mdbooth has quit IRC16:32
*** panda has quit IRC16:32
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add InstanceAction/Event create() method  https://review.opendev.org/614036  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add Instance.hidden field  https://review.opendev.org/631123  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add TargetDBSetupTask  https://review.opendev.org/627892  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add CrossCellMigrationTask  https://review.opendev.org/631581  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Execute TargetDBSetupTask  https://review.opendev.org/633853  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add prep_snapshot_based_resize_at_dest compute method  https://review.opendev.org/633293  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add PrepResizeAtDestTask  https://review.opendev.org/627890  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add prep_snapshot_based_resize_at_source compute method  https://review.opendev.org/634832  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add nova.compute.utils.delete_image  https://review.opendev.org/637605  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add PrepResizeAtSourceTask  https://review.opendev.org/627891  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Refactor ComputeManager.remove_volume_connection  https://review.opendev.org/642183  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add power_on kwarg to ComputeDriver.spawn() method  https://review.opendev.org/642590  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add finish_snapshot_based_resize_at_dest compute method  https://review.opendev.org/635080  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add FinishResizeAtDestTask  https://review.opendev.org/635646  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add Destination.allow_cross_cell_move field  https://review.opendev.org/614035  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Execute CrossCellMigrationTask from MigrationTask  https://review.opendev.org/635668  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Plumb allow_cross_cell_resize into compute API resize()  https://review.opendev.org/635684  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Filter duplicates from compute API get_migrations_sorted()  https://review.opendev.org/636224  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add cross-cell resize policy rule and enable in API  https://review.opendev.org/638269  16:33
<openstackgerrit> Matt Riedemann proposed openstack/nova master: WIP: Enable cross-cell resize in the nova-multi-cell job  https://review.opendev.org/656656  16:33
*** mdbooth_ has quit IRC16:36
*** panda has joined #openstack-nova16:36
*** whoami-rajat has quit IRC16:44
*** davidsha has quit IRC16:47
*** udesale has quit IRC16:54
*** icarusfactor has quit IRC16:59
<efried> Anyone know artom's status?  17:00
<efried> seemed like he was traveling recently  17:00
<mnaser> efried: i think i remember seeing his nick with some timezone appended to it  17:02
*** xek_ has joined #openstack-nova  17:02
<efried> yeah, I recall that too, GMT+3 or something. But I wasn't sure where he was or whether he was going to be afk for some amount of time  17:03
<tbachman> p!spy artom  17:03
<tbachman> oops  17:03
<tbachman> lol  17:03
<efried> worst spy ever  17:04
<tbachman> lol  17:04
<edleafe> efried: https://leafe.com/timeline/%23openstack-nova/2019-06-27T08:57:05  17:04
<tbachman> purplerbot: log at http://p.anticdent.org/logs/artom  17:04
<dansmith> efried: he's on GMT+3  17:04
<efried> oh, so he's around, just out "early" (relative to me). Cool.  17:04
<dansmith> yar  17:05
*** xek__ has quit IRC  17:05
*** whoami-rajat has joined #openstack-nova17:06
*** hamzy has quit IRC17:07
*** ricolin_ has quit IRC17:12
*** ricolin has joined #openstack-nova17:12
*** hamzy has joined #openstack-nova17:12
*** martinkennelly has quit IRC17:13
*** maciejjozefczyk has quit IRC17:17
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add integration testing for heal_allocations  https://review.opendev.org/667994  17:26
*** kaisers has quit IRC17:27
*** kaisers has joined #openstack-nova17:31
*** ralonsoh has quit IRC17:35
<openstackgerrit> Matt Riedemann proposed openstack/nova master: Add integration testing for heal_allocations  https://review.opendev.org/667994  17:35
*** _erlon_ has quit IRC17:42
*** xek_ has quit IRC17:43
*** ociuhandu has quit IRC17:48
<openstackgerrit> Merged openstack/nova master: Remove global state from the FakeDriver  https://review.opendev.org/656709  17:50
<openstackgerrit> Merged openstack/nova master: Enhance service restart in functional env  https://review.opendev.org/512552  17:50
*** ociuhandu has joined #openstack-nova17:53
*** pcaruana has joined #openstack-nova17:54
*** ociuhandu has quit IRC17:58
*** mvkr has quit IRC18:03
*** ricolin has quit IRC18:09
*** bbowen_ has joined #openstack-nova18:15
*** bbowen has quit IRC18:18
*** hamzy has quit IRC18:24
*** hamzy has joined #openstack-nova18:25
*** hamzy has quit IRC18:30
<openstackgerrit> Merged openstack/nova master: libvirt: flatten rbd images when unshelving an instance  https://review.opendev.org/457886  18:43
*** bbowen__ has joined #openstack-nova19:08
*** bbowen_ has quit IRC19:11
*** raub has joined #openstack-nova19:28
*** bbowen__ has quit IRC19:32
*** tbachman has quit IRC19:34
<openstackgerrit> Merged openstack/nova master: reorder conditions in _heal_allocations_for_instance  https://review.opendev.org/655458  19:47
*** whoami-rajat has quit IRC19:54
*** shilpasd has quit IRC20:04
*** ivve has joined #openstack-nova20:12
*** eharney has quit IRC20:55
*** pcaruana has quit IRC21:05
*** mmethot has quit IRC21:11
*** tesseract has quit IRC21:19
*** rcernin has joined #openstack-nova21:24
<openstackgerrit> Merged openstack/nova master: Prepare _heal_allocations_for_instance for nested allocations  https://review.opendev.org/637954  21:33
<mriedem> efried: https://review.opendev.org/#/c/637955/28 tag teamed  21:36
<openstackgerrit> Merged openstack/nova master: pull out put_allocation call from _heal_*  https://review.opendev.org/655459  21:36
<mriedem> which do you want to be? https://toomanyposts.files.wordpress.com/2011/11/hartfound.jpg  21:36
<efried> dude, as long as I get to wear pink tights, who cares??  21:37
<mriedem> ok, i'm bret then  21:38
<sean-k-mooney> has the follow-up spec for numa with pmem been submitted, or is that for U?  21:39
<efried> I haven't seen such a thing  21:40
<mriedem> that seems to be getting the cart way before the horse  21:40
<sean-k-mooney> the current spec says numa will be addressed in a follow-up spec  21:41
<sean-k-mooney> mriedem: not really  21:41
<sean-k-mooney> i think it makes sense to proceed with the current spec that just merged  21:41
<sean-k-mooney> but i'm concerned the proposal is problematic  21:41
<sean-k-mooney> specifically, i'm really not ok with the driver generating a numa object without going through the numa topology filter, unless we ensure we don't restrict the guest ram and cpus to a numa node  21:43
<sean-k-mooney> if we generate a virtual numa node and do no affinity to a host numa node for any resources it's probably fine  21:43
*** mriedem is now known as mriedem_afk  21:45
*** bbowen__ has joined #openstack-nova21:53
*** hongbin has quit IRC22:01
*** mvkr has joined #openstack-nova22:01
<openstackgerrit> Eric Fried proposed openstack/nova master: Un-safe_connect and publicize get_providers_in_tree  https://review.opendev.org/668062  22:04
<openstackgerrit> sean mooney proposed openstack/nova-specs master: Libvirt: add vPMU spec for train  https://review.opendev.org/651269  22:08
*** tbachman has joined #openstack-nova22:20
*** munimeha1 has quit IRC22:23
*** slaweq has quit IRC22:23
*** igordc has quit IRC22:36
*** mlavalle has quit IRC22:40
*** mlavalle has joined #openstack-nova22:41
*** mlavalle has quit IRC22:41
*** luksky has quit IRC23:00
*** spatel has joined #openstack-nova23:13
*** spatel has quit IRC23:14
*** sean-k-mooney has quit IRC23:19
*** sean-k-mooney has joined #openstack-nova23:20
<alex_xu> johnthetubaguy mriedem_afk sean-k-mooney efried: thanks for all the reviews, I will continue to look at the comments  23:27
*** tonyb has quit IRC23:32
*** tonyb has joined #openstack-nova23:32
*** mmethot has joined #openstack-nova23:33
*** slaweq has joined #openstack-nova23:42
*** slaweq has quit IRC23:46
*** efried has quit IRC23:47
*** efried has joined #openstack-nova23:48

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!