*** edmondsw_ has joined #openstack-placement | 00:52 | |
*** edmondsw has quit IRC | 00:55 | |
*** tetsuro has joined #openstack-placement | 01:27 | |
*** edmondsw_ has quit IRC | 01:29 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (10) https://review.openstack.org/576017 | 01:47 |
---|---|---|
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (11) https://review.openstack.org/576018 | 02:00 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (12) https://review.openstack.org/576019 | 02:09 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (13) https://review.openstack.org/576020 | 02:20 |
openstackgerrit | Dinesh Bhor proposed openstack/nova master: PCPU: Define numa dedicated CPU resource class https://review.openstack.org/561770 | 02:21 |
openstackgerrit | Dinesh Bhor proposed openstack/nova master: PCPU: Define numa dedicated CPU resource class https://review.openstack.org/561770 | 02:24 |
openstackgerrit | Dinesh Bhor proposed openstack/nova master: PCPU: Add respective conf options https://review.openstack.org/561771 | 02:24 |
openstackgerrit | Dinesh Bhor proposed openstack/nova master: PCPU: Add respective conf options https://review.openstack.org/561771 | 02:27 |
openstackgerrit | Dinesh Bhor proposed openstack/nova master: NUMACell, InstanceNUMACell: Adopt 'PCPU' changes https://review.openstack.org/576021 | 02:27 |
*** edmondsw has joined #openstack-placement | 02:44 | |
*** edmondsw has quit IRC | 02:49 | |
*** e0ne has joined #openstack-placement | 04:11 | |
*** e0ne has quit IRC | 04:13 | |
*** edmondsw has joined #openstack-placement | 04:32 | |
*** edmondsw has quit IRC | 04:37 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (14) https://review.openstack.org/576027 | 04:38 |
openstackgerrit | Dinesh Bhor proposed openstack/os-traits master: Adds HW_CPU_HYPERTHREADING standard trait https://review.openstack.org/576030 | 04:51 |
*** e0ne has joined #openstack-placement | 05:19 | |
*** e0ne has quit IRC | 05:22 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (15) https://review.openstack.org/576031 | 05:30 |
openstackgerrit | garyk proposed openstack/nova master: Prevent compute manager freeze when greenpool is full https://review.openstack.org/575034 | 05:48 |
*** bhagyashris has quit IRC | 05:55 | |
*** edmondsw has joined #openstack-placement | 06:21 | |
openstackgerrit | Dinesh Bhor proposed openstack/nova master: NUMACell, InstanceNUMACell: Adopt 'PCPU' changes https://review.openstack.org/576021 | 06:22 |
*** edmondsw has quit IRC | 06:26 | |
*** rubasov has joined #openstack-placement | 06:58 | |
*** belmoreira has joined #openstack-placement | 07:12 | |
*** giblet is now known as gibi | 07:44 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: api-ref: Fix parameters about trusted certificate IDs https://review.openstack.org/576046 | 07:50 |
*** ttsiouts has joined #openstack-placement | 07:53 | |
*** e0ne has joined #openstack-placement | 08:03 | |
*** e0ne has quit IRC | 08:07 | |
*** tetsuro has quit IRC | 08:07 | |
*** tetsuro has joined #openstack-placement | 08:08 | |
*** edmondsw has joined #openstack-placement | 08:09 | |
*** edmondsw has quit IRC | 08:14 | |
*** sususuryashines has joined #openstack-placement | 08:24 | |
openstackgerrit | garyk proposed openstack/nova master: Resource tracker: improve resource tracker periodic task https://review.openstack.org/576052 | 08:30 |
*** takashin has left #openstack-placement | 08:34 | |
openstackgerrit | karim proposed openstack/nova master: Update scheduler to use image-traits https://review.openstack.org/576054 | 08:41 |
*** e0ne has joined #openstack-placement | 08:46 | |
openstackgerrit | Merged openstack/nova stable/queens: Fix the file name of development-environment.rst https://review.openstack.org/574175 | 08:47 |
*** finucannot is now known as stephenfin | 08:47 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: placement: Make API history doc more consistent https://review.openstack.org/477478 | 09:02 |
*** PapaOurs is now known as bauzas | 09:34 | |
*** tetsuro has quit IRC | 09:42 | |
*** rubasov has quit IRC | 09:55 | |
openstackgerrit | Andrey Volkov proposed openstack/nova master: Update nova network info when doing rebuild for evacuate operation https://review.openstack.org/382853 | 10:40 |
*** alex_xu has quit IRC | 10:49 | |
*** alex_xu has joined #openstack-placement | 10:49 | |
openstackgerrit | jichenjc proposed openstack/nova master: z/VM Driver: Initial change set of z/VM driver https://review.openstack.org/523387 | 10:50 |
openstackgerrit | jichenjc proposed openstack/nova master: z/VM Driver: Spawn and destroy function of z/VM driver https://review.openstack.org/527658 | 10:50 |
openstackgerrit | jichenjc proposed openstack/nova master: z/VM Driver: add snapshot function https://review.openstack.org/534240 | 10:51 |
openstackgerrit | jichenjc proposed openstack/nova master: z/VM Driver: add power actions https://review.openstack.org/543340 | 10:51 |
openstackgerrit | jichenjc proposed openstack/nova master: z/VM Driver: add get console output https://review.openstack.org/543344 | 10:51 |
openstackgerrit | Vladyslav Drok proposed openstack/nova master: ironic: Report resources as reserved when needed https://review.openstack.org/517921 | 10:52 |
openstackgerrit | garyk proposed openstack/nova master: Resource tracker: remove costly copy https://review.openstack.org/576099 | 11:00 |
*** cdent has joined #openstack-placement | 11:04 | |
*** rubasov has joined #openstack-placement | 11:57 | |
*** edleafe- has joined #openstack-placement | 12:07 | |
*** edmondsw has joined #openstack-placement | 12:09 | |
*** edleafe has quit IRC | 12:09 | |
*** edleafe- is now known as edleafe | 12:09 | |
openstackgerrit | Rajesh Tailor proposed openstack/nova master: Fix case-sensitivity for metadata keys https://review.openstack.org/504885 | 12:14 |
*** takashin has joined #openstack-placement | 12:39 | |
openstackgerrit | Merged openstack/nova master: Add trusted certs to feature support matrix docs https://review.openstack.org/574890 | 12:44 |
*** jaypipes has joined #openstack-placement | 13:03 | |
*** mriedem has joined #openstack-placement | 13:05 | |
jaypipes | hmm, old jaypipes is back on etherpads... hrmph. | 13:06 |
efried | Do you have one signed in on your phone or something? | 13:07 |
cdent | it's a haxor for sur3 | 13:08 |
efried | Do you have any colleagues (or techy relatives) who like to make fun of your age? | 13:09 |
jaypipes | efried, cdent: nope, no browsers open on any phone or laptop... it's a mystery. | 13:17 |
jaypipes | efried: heh, well, I'm sure I do have those friends, colleagues, yes. but it's definitely neither of you since you're both either of the same or greater age as me :P | 13:18 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Revert "Re-using the code of os brick cinder" https://review.openstack.org/576136 | 13:23 |
*** avolkov has joined #openstack-placement | 13:26 | |
*** ttsiouts has quit IRC | 13:26 | |
*** sususuryashines has quit IRC | 13:27 | |
mriedem | i need one more core to finish out the granular policy series https://review.openstack.org/#/c/571201/ | 13:28 |
*** nicolasbock has joined #openstack-placement | 13:29 | |
*** ttsiouts has joined #openstack-placement | 13:30 | |
jaypipes | mriedem: done. | 13:31 |
mriedem | jaypipes: thanks | 13:32 |
jaypipes | mriedem: no, thank *you*. | 13:32 |
*** sususuryashines has joined #openstack-placement | 13:32 | |
openstackgerrit | Julia Kreger proposed openstack/nova master: ironic: bugfix: ensure a host is set for volume connectors https://review.openstack.org/571982 | 13:33 |
*** belmorei_ has joined #openstack-placement | 13:33 | |
*** belmoreira has quit IRC | 13:34 | |
jaypipes | sususuryashines: long weekend? ;) | 13:35 |
sususuryashines | jaypipes: hehe | 13:36 |
*** sususuryashines is now known as tssurya | 13:36 | |
*** superdan is now known as dansmith | 13:38 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: conf: Deprecate 'network_manager' https://review.openstack.org/530923 | 13:40 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Simplify instance name generation https://review.openstack.org/516573 | 13:47 |
*** tetsuro has joined #openstack-placement | 13:51 | |
*** nicolasbock has quit IRC | 14:11 | |
jaypipes | vdrok: cool, thanks for answering my queries on the report resources as reserved patch. all those suggestions can be done in a followup patch. no need to update the existing one. | 14:51 |
vdrok | jaypipes: sure, thanks Jay :) | 14:52 |
efried | mriedem: When will you be around to continue this discussion? | 15:00 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix regression when listing build_requests with marker and ip filter https://review.openstack.org/576161 | 15:00 |
*** takashin has left #openstack-placement | 15:01 | |
mriedem | efried: if it's just more circles on the same thing i don't really want to continue discussing it | 15:01 |
efried | mriedem: We just need to decide which way to implement it. | 15:02 |
mriedem | to restate, i think we should: | 15:02 |
mriedem | 1. check_migrations=True on startup; if MigrationNeeded, get allocs and run the migration (on startup) and if fails, kill the service | 15:03 |
*** tetsuro has quit IRC | 15:03 | |
mriedem | 2. if check_migrations=False during RT.update, the virt driver assumes new/current model, not backward compat for failed startup migration | 15:03 |
mriedem | and the rocky-specific vgpu on root provider check goes away in stein | 15:03 |
efried | "assumes new/current model" is not meaningful | 15:04 |
mriedem | the virt driver *in rocky* assumes vgpu is on child provider | 15:04 |
mriedem | if check_migrations=False | 15:04 |
efried | It is receiving a model from RT, via the provider_tree arg. That model either matches what it thinks it should be doing, or it doesn't. | 15:04 |
efried | If `mismatch`, I would rather it have one behavior (raise MigrationNeeded), rather than behavior conditioned on the check_migrations flag | 15:05 |
mriedem | without seeing what the actual code in upt is going to look like in the virt driver for this, i can't really say - i tried to weigh in on the debate from the spec at the start of the meeting, but apparently not in a helpful way | 15:07 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: metadata: Add '[metadata] domain_name' option https://review.openstack.org/480616 | 15:07 |
mriedem | i don't know how the virt driver is doing model comparison on the provider tree | 15:08 |
cdent | From watching, the difference of opinion seems to be in what's considered possible. It sounds like matt is saying that a provider tree mismatch is not something that should be unexpected. It only happens when we know a thing: like vgpu moving from compute node to nested. Is that right? | 15:09 |
*** nicolasbock has joined #openstack-placement | 15:09 | |
efried | mriedem: The way I'm seeing it, it looks like this: | 15:10 |
efried | def update_provider_tree(self, ptree, allocs): | 15:10 |
efried |     if self._reshape_needed(ptree) and allocs is None: | 15:10 |
efried |         raise MigrationNeeded | 15:10 |
efried | or | 15:10 |
efried | def update_provider_tree(self, ptree, allocs, check_migrations=None): | 15:10 |
efried |     if self._reshape_needed(ptree) and allocs is None: | 15:10 |
efried |         raise MigrationNeeded | 15:10 |
efried | i.e. check_migrations is redundant. | 15:11 |
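A minimal runnable sketch of the first variant efried pastes above, in which the driver raises whenever the tree needs reshaping and no allocations were supplied. The toy classes, the `cn1_gpu0` child name, and the "VGPU on the root provider" check are assumptions made for illustration; they are not nova's actual ProviderTree or virt driver code.

```python
class MigrationNeeded(Exception):
    """Stand-in for the exception named in the discussion."""


class ToyProviderTree:
    """Toy model of a provider tree: provider name -> {resource class: total}."""

    def __init__(self, root, inventories):
        self.root = root
        self.inventories = inventories

    def inventory_for(self, provider):
        return self.inventories.get(provider, {})


class ToyDriver:
    def _reshape_needed(self, ptree):
        # Assumed check: old-style VGPU inventory still sits on the root.
        return "VGPU" in ptree.inventory_for(ptree.root)

    def update_provider_tree(self, ptree, allocs=None):
        if not self._reshape_needed(ptree):
            return ptree
        if allocs is None:
            # One behavior regardless of startup vs. periodic; the caller
            # decides what to do with the exception.
            raise MigrationNeeded()
        # Allocations were supplied, so move inventory and allocations
        # from the root provider to a (hypothetical) child provider.
        child = ptree.root + "_gpu0"
        total = ptree.inventories[ptree.root].pop("VGPU")
        ptree.inventories.setdefault(child, {})["VGPU"] = total
        for consumer_allocs in allocs.values():
            root_alloc = consumer_allocs.get(ptree.root, {})
            if "VGPU" in root_alloc:
                moved = root_alloc.pop("VGPU")
                consumer_allocs.setdefault(child, {})["VGPU"] = moved
        return ptree
```

In this sketch no `check_migrations` argument is needed: passing `allocs` is the signal that the caller is prepared for a reshape.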
efried | cdent: I'm agreeing with all of that. I'm also saying there will be a code path for when the unexpected happens. And we would do well to specify how we want that code path to be, even if we think we'll never hit it. | 15:12 |
efried | because otherwise we just have an undefined there. | 15:12 |
*** belmorei_ has quit IRC | 15:12 | |
cdent | segfault | 15:13 |
efried | basically. Except we don't get one of those. Things would move on... somehow. And that somehow could be different from one virt driver to the next, from one migration to the next, etc. | 15:14 |
cdent | the only reason we don't get one of those is because we say so. we can shut down the compute if we want to. | 15:15 |
mriedem | i'm saying we don't need a _reshape_needed method unless check_migrations=True | 15:15 |
*** belmoreira has joined #openstack-placement | 15:15 | |
mriedem | _reshape_needed in rocky will have the 'is the vgpu inventory on the root provider' check | 15:15 |
mriedem | and we remove that in stein so _reshape_needed will be a noop | 15:16 |
efried | mriedem: Right, the part where that gets stuck is where update_provider_tree checks and updates the inventories on the child providers of VGPUs. | 15:24 |
efried | mriedem: If we didn't do _reshape_needed, and those child providers... aren't there - then what? We have to blow up. | 15:24 |
mriedem | what does that check look like? virt driver has vgpus on the host, and assumes there is a vgpu child in the provider tree (or adds it if it's not there)? | 15:25 |
mriedem | isn't the virt driver setting the vgpu inventory on the child provider in the tree? | 15:25 |
efried | mriedem: Right, that check would basically be: if VGPU in ptree.inventory_for(root_rp) | 15:25 |
mriedem | but we don't need that check on every upt call | 15:26 |
efried | mriedem: And if so, we either need to move it (if allocs is not None) or blow up. | 15:26 |
efried | mriedem: Oh, yes we do. | 15:26 |
mriedem | why? | 15:26 |
efried | That's the whole purpose of upt: | 15:26 |
efried | to make sure the ptree from placement winds up looking like we think it should. | 15:26 |
efried | Before this reshape business, that could consist of creating/removing providers, adding/removing inventories, changing traits and aggregates, etc. | 15:27 |
mriedem | the virt driver is moving the vgpu inventory to a child in the tree on startup, so why does it need to check that it did that thing every periodic? | 15:27 |
efried | mriedem: By that same logic, there's no reason to run upt on periodic in the first place. | 15:27 |
*** bhagyashris has joined #openstack-placement | 15:28 | |
efried | mriedem: I guess it's the same question as: why do we run get_inventory on every periodic today? | 15:29 |
efried | or get_available_resource | 15:29 |
mriedem | in case the reserved amount is dynamically changed out of band? | 15:30 |
efried | Cool. If cards are hot-plugged or unplugged. | 15:30 |
efried | If NUMA nodes are taken offline or something. | 15:31 |
mriedem | i assumed the driver would be putting the vgpu inventory in the child provider of the tree on each periodic | 15:31 |
mriedem | is the fear that we could end up with a driver reporting vgpu inventory on both the root and a child? | 15:31 |
efried | mriedem: If the vgpu inventory is already on the child provider at the start of the periodic, the ptree doesn't change, and the RT's next call (to update_*from*_provider_tree) is a no-op. | 15:32 |
efried | mriedem: Yes, that's a thing that could conceivably happen if we don't close this gap. | 15:32 |
efried | I guess | 15:32 |
mriedem | how could it conceivably happen though? | 15:32 |
mriedem | if the migrator is atomic | 15:33 |
mriedem | and we kill the service on migration failures on startup | 15:33 |
efried | The migrator doesn't get to run in this scenario. | 15:33 |
mriedem | i realize | 15:33 |
mriedem | but if the migrator is run on startup, and atomic, and fails and we kill the service, i don't see how we can end up with a scenario where the provider tree has vgpu on both the root and a child provider | 15:33 |
mriedem | and i don't think we need to worry about that | 15:33 |
efried | Dude. | 15:33 |
mriedem | if it happened, we f'ed up and have a bug to fix | 15:34 |
efried | I agree "it should never happen". | 15:34 |
efried | So what should the behavior be if we f'ed up? | 15:34 |
efried | I'm trying to nail that behavior down to a specific condition that we will recognize and say, "oh, we f'ed up, specifically *this* way". | 15:34 |
efried | As opposed to kinda letting it slide around loose and have that code path be different for every virt driver, for every migration. | 15:35 |
efried | and, bonus, doing that results in *simpler* code. | 15:35 |
mriedem | i assume we wouldn't fail, we'd be reporting double the vgpu inventory in the same tree, and the scheduler could pick either of those providers to host a workload, which might or might not fail once we get to the driver.spawn()? | 15:35 |
efried | That would be one way it could go pear-shaped. | 15:35 |
efried | Another way would be that the virt driver tries to move the inventory, and update_from_provider_tree bounces because it tries to delete inventory (the one on the compute node RP) that has allocations on it. | 15:36 |
efried | Another way would be that we try to do ^ but wind up with the races that POST /migrator was designed to prevent. | 15:36 |
efried | Again, corner case, should be vanishingly rare. But when closing that gap and defining that behavior is actually *easier* than not closing it, I don't understand why we wouldn't do that. | 15:37 |
*** ttsiouts has quit IRC | 15:39 | |
dansmith | just to be clear, | 15:39 |
dansmith | at the moment we have numa nodes such that we need to pivot gpus onto those, | 15:40 |
dansmith | we also have to pivot memory and cpus right? | 15:40 |
efried | Yes, I would think so. | 15:40 |
dansmith | we're not going to have numa nodes with nothing but gpu inventory | 15:40 |
dansmith | right, so this makes is a giant fundamental problem for everyone, not just people with gpus | 15:40 |
dansmith | makes *it* | 15:40 |
dansmith | I just want to make sure that's clear, | 15:40 |
efried | I also think we need to put NICs into NUMA nodes (stephenfin ^) | 15:40 |
dansmith | as I think we keep focusing on gpus, which is fine, we just need to make sure we're not painting this as a niche case only | 15:41 |
efried | or maybe it was vswitches. | 15:41 |
stephenfin | SR-IOV devices anyway | 15:41 |
efried | dansmith: Agree, been using gpus as an example, but trying to make sure everyone's aware that it's only an example. | 15:41 |
dansmith | ack | 15:41 |
mriedem | so i think the point is, | 15:41 |
mriedem | the simple 'reshape_needed' on every periodic might not be as simple as the vgpu case | 15:41 |
mriedem | and could cost us in performance overhead when we should have already migrated | 15:42 |
mriedem | on startup | 15:42 |
mriedem | is that right? | 15:42 |
dansmith | tbh, I'm completely exhausted with this topic and haven't been following all the recent discussion about this, but let me summarize my feelings if I may... | 15:42 |
dansmith | I think that we should migrate from non-nested to nested on startup (however that happens mechanically, which I think everyone is fine with), and maybe we need to check on each periodic (don't care), but if we determine that we need a big atomic pivot at runtime, we should fail hard and fast. All runtime virt and compute code can and should assume we're operating in nested mode all the time | 15:44 |
dansmith | no compatibility for "well we didn't get migrated, so continue to work as if we're flat" compat code anywhere | 15:44 |
efried | mriedem: The virt driver needs to know what shape it thinks it's dealing with no matter what - i.e. needs to do whatever platform-specific internal query to get up with the state of the world. That part exists every period, no performance change there. | 15:46 |
efried | The other piece is a provider_tree that has already been collected/populated by the RT. Also no performance delta there. | 15:46 |
efried | So the thing we're doing "every time" is comparing one python object (the ptree from RT) to another python object (info gleaned from platform). This should be negligible in all reasonable cases. | 15:46 |
mriedem | for the "but if we determine that we need a big atomic pivot at runtime, we should fail hard and fast." - eric's question then was, if we raise MigrationNeeded and we're in the RT, what do we do, log and continue or... | 15:46 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: conf: Undeprecate the 'dhcp_domain' option https://review.openstack.org/480616 | 15:46 |
dansmith | mriedem: log and fail hard with lots of red in the logs, IMHO | 15:47 |
mriedem | you can't kill the service from the RT | 15:47 |
mriedem | you could disable it | 15:47 |
dansmith | maybe even self-disable as if we lost libvirt | 15:47 |
mriedem | but... | 15:47 |
dansmith | no, definitely not kill the service | 15:47 |
mriedem | that's what i'm saying | 15:47 |
dansmith | because systemd will just restart it over and over | 15:47 |
dansmith | kill the service on startup if we fail only, but not at runtime if we encounter an issue | 15:47 |
*** ttsiouts has joined #openstack-placement | 15:48 | |
mriedem | so it sounds like go with eric's thing then (which is what we were talking about above, but might not be what's current in the spec) | 15:48 |
efried | mriedem: Oh, yeah, the spec has to change for sure. Though this is closer to what's there than what we started with. | 15:48 |
mriedem | if RT.update gets MigrationNeeded, log an error and continue, like if the virt driver's get_available_resources raised TypeError or something unexpected | 15:48 |
mriedem | i don't think we need to bake in auto-disable logic right now since this shouldn't even happen | 15:49 |
efried | Sounds like a plan. We can always throw in auto-disable later if we feel it's warranted. | 15:49 |
dansmith | as long as we don't trigger another pivot, that's fine | 15:49 |
dansmith | but log lots of red at least | 15:49 |
efried | dansmith: Roger that. It'll be "virt driver thought we needed a migration in the middle of steady state, we done f'ed up" | 15:50 |
mriedem | you'd trace right here https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L7363 | 15:50 |
dansmith | efried: yes | 15:50 |
efried | mriedem: yes | 15:50 |
efried | mriedem: well | 15:51 |
* mriedem loads gun | 15:51 | |
* dansmith chuckles | 15:51 | |
efried | We could do that, and it would just traceback the MigrationNeeded, which would probably be enough information to go by. | 15:51 |
mriedem | yes, it shouldn't happen | 15:51 |
mriedem | if it does, things are busted | 15:51 |
dansmith | don't trace, log what happened | 15:51 |
efried | But deciding whether to reraise or quit was going to have to happen... somewhere else (in the RT?) based on the `startup` flag. | 15:52 |
dansmith | explain it in the logs to the op | 15:52 |
mriedem | efried: you said with this you don't need the startup flag | 15:52 |
mriedem | oh nvm i see | 15:52 |
efried | mriedem: The *RT* needs the startup flag... yeah. | 15:52 |
openstackgerrit | Vladyslav Drok proposed openstack/nova master: ironic: Report resources as reserved when needed https://review.openstack.org/517921 | 15:52 |
mriedem | i guess you have to handle MigrationNeeded and reraise from _update_available_resource_for_node, or pass the startup flag down | 15:52 |
efried | mriedem: Cause we're still in RT code when we're doing the reshape-on-startup thing, right? Did you point to that code in the meeting? | 15:53 |
mriedem | from update_available_resource | 15:53 |
mriedem | https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L7367 is the method with the startup flag | 15:53 |
mriedem | called from https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1161 | 15:53 |
*** e0ne has quit IRC | 15:54 | |
mriedem | so i guess pass the startup flag to _update_available_resource_for_node and if rt raises MigrationNeeded, check the startup flag to see if you need to log an error and fail or get allocs and try to migrate | 15:54 |
mriedem | anyway, ^ isn't really detail that needs to be in the spec | 15:56 |
mriedem | 'fail on startup if not migrated, else log an error on periodic' | 15:56 |
efried | I think the scenarios are roughly: | 16:00 |
efried | 1) On startup, I didn't pass allocs, I don't get MigrationNeeded <= happy path, no reshaping. If something goes wrong in update_from_provider_tree, blow up n-cpu. | 16:00 |
efried | 2) On startup, I didn't pass allocs, I do get MigrationNeeded <= happy path, reshape needed, get allocs and call again. If something goes wrong on the retry or in update_from_provider_tree, blow up n-cpu. | 16:00 |
efried | 3) On periodic (never pass allocs) I don't get MigrationNeeded <= happy path, and what we have going today. If something goes wrong in update_from_provider_tree, log and continue. | 16:00 |
efried | 4) On periodic I *do* get MigrationNeeded <= log and continue. | 16:00 |
efried | Not sure about #1; maybe I don't blow up, because that's what we have today, and it's self-healing. | 16:00 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Skip ServerShowV263Test.test_show_update_rebuild_list_server for cellsv1 https://review.openstack.org/576194 | 16:01 |
efried | so yeah, the only time we kill n-cpu is #2 | 16:01 |
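The MigrationNeeded branches of the four scenarios can be sketched from the caller's side roughly as follows. This is only an illustration under stated assumptions: `update_resources`, `get_allocs`, and `FakeDriver` are hypothetical names, and failures inside update_from_provider_tree are not modeled.

```python
import logging

LOG = logging.getLogger(__name__)


class MigrationNeeded(Exception):
    """Stand-in for the exception the virt driver raises."""


def update_resources(driver, ptree, get_allocs, startup):
    """Return True if the tree was updated, False if the periodic gave up."""
    try:
        # Scenarios 1 and 3: no allocations passed, no reshape needed.
        driver.update_provider_tree(ptree, None)
        return True
    except MigrationNeeded:
        if not startup:
            # Scenario 4: should never happen in steady state; explain it
            # to the operator and carry on.
            LOG.error("Virt driver asked for a reshape outside of startup; "
                      "provider tree left unchanged.")
            return False
    # Scenario 2: on startup, fetch allocations and call again. Any exception
    # from the retry propagates, which in this sketch stands for killing
    # the service.
    driver.update_provider_tree(ptree, get_allocs())
    return True


class FakeDriver:
    """Hypothetical driver that needs a reshape until allocations are passed."""

    def __init__(self, needs_reshape):
        self.needs_reshape = needs_reshape
        self.calls = []

    def update_provider_tree(self, ptree, allocs):
        self.calls.append(allocs)
        if self.needs_reshape and allocs is None:
            raise MigrationNeeded()
        self.needs_reshape = False
```

With this shape, only the caller needs the startup flag; the driver behaves the same way in every scenario.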
*** nicolasbock has quit IRC | 16:08 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Skip ServerShowV263Test.test_show_update_rebuild_list_server for cellsv1 https://review.openstack.org/576194 | 16:12 |
*** rubasov has quit IRC | 16:15 | |
*** nicolasbock has joined #openstack-placement | 16:20 | |
*** ttsiouts has quit IRC | 16:22 | |
*** ttsiouts has joined #openstack-placement | 16:22 | |
*** nicolasbock has quit IRC | 16:24 | |
*** nicolasbock has joined #openstack-placement | 16:24 | |
*** ttsiouts has quit IRC | 16:27 | |
openstackgerrit | Vladyslav Drok proposed openstack/nova master: ironic: Report resources as reserved when needed https://review.openstack.org/517921 | 16:31 |
efried | mriedem, jaypipes: Would you please check my responses on https://review.openstack.org/#/c/556669/ and let me know if you disagree, before I respin? | 16:32 |
jaypipes | efried: yep, on it. | 16:32 |
efried | thx | 16:32 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix regression when listing build_requests with marker and ip filter https://review.openstack.org/576161 | 16:33 |
openstackgerrit | Elod Illes proposed openstack/nova master: Reject interface attach with QoS aware port https://review.openstack.org/570078 | 16:33 |
mriedem | done | 16:34 |
jaypipes | efried: done | 16:39 |
efried | thanks y'all | 16:43 |
openstackgerrit | Eric Fried proposed openstack/nova master: Handle agg generation conflict in report client https://review.openstack.org/556669 | 17:03 |
efried | jaypipes, mriedem: Done ^ | 17:03 |
openstackgerrit | Eric Fried proposed openstack/nova master: Nix unused raise_if_custom_resource_class_pre_v1_1 https://review.openstack.org/575847 | 17:07 |
efried | jaypipes, mriedem, dansmith, cdent: btw, I'd kinda like to stop using the word "migrate" for this thing. Can we call it "reshape" across the board? We've got enough confusion around the various kinds of "migration" that we don't need to add another "migration" into the mix. | 17:15 |
efried | POST /reshaper; exceptions.ReshapeNeeded; etc. | 17:16 |
dansmith | reshape is fine with me as the thing that gets done against placement, | 17:16 |
dansmith | but from a nova project lifecycle thing, the thing that is happening is a data migration, IMHO | 17:16 |
efried | ugh, you're right of course, it is a "data migration". | 17:17 |
efried | It's just that someone at some point is going to end up with a bloody forehead-and-brick-wall from trying to figure out why update_provider_tree is asking resource tracker to migrate instances, or whatever. | 17:17 |
efried | ...and then that's not happening. | 17:18 |
jaypipes | POST /data_migrations? | 17:18 |
efried | POST /migrate_inventory_and_allocations | 17:19 |
dansmith | POST /reshape seems good to me, fwiw | 17:21 |
efried | POST /reshaper (donning my cdent hat, cause it should be a noun, yah?) | 17:22 |
*** nicolasbock has quit IRC | 17:25 | |
efried | I'd like the REST term to match the exception, though. So either POST /reshaper + ReshapeNeeded (which seems reasonable in light of how update_provider_tree is using it) or POST /migrator and MigrationNeeded. | 17:25 |
cdent | /reshaper is more generic, so perhaps better long term, and still accurate for short term | 17:26 |
efried | ack | 17:27 |
* cdent departs before getting sucked in any more | 17:28 | |
cdent | bon chance | 17:28 |
*** cdent has quit IRC | 17:28 | |
openstackgerrit | Eric Fried proposed openstack/nova-specs master: Spec: Handling Reshaped Provider Trees https://review.openstack.org/572583 | 17:33 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: DNM: Use claim context during live migration https://review.openstack.org/576222 | 17:48 |
*** tssurya has quit IRC | 17:49 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/queens: Add policy rule to block image-backed servers with 0 root disk flavor https://review.openstack.org/563692 | 17:52 |
*** e0ne has joined #openstack-placement | 17:53 | |
*** gjayavelu has joined #openstack-placement | 18:01 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/pike: Add policy rule to block image-backed servers with 0 root disk flavor https://review.openstack.org/563700 | 18:01 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/ocata: Add policy rule to block image-backed servers with 0 root disk flavor https://review.openstack.org/563719 | 18:17 |
*** jroll has quit IRC | 18:24 | |
*** jroll has joined #openstack-placement | 18:24 | |
openstackgerrit | Zack Cornelius proposed openstack/nova master: Implement file backed memory for instances in libvirt https://review.openstack.org/567876 | 18:39 |
*** avolkov has quit IRC | 18:56 | |
openstackgerrit | Giridhar Jayavelu proposed openstack/nova master: Avoid redundant compute node update https://review.openstack.org/576235 | 19:04 |
*** tssurya has joined #openstack-placement | 19:07 | |
openstackgerrit | Eric Fried proposed openstack/nova-specs master: Spec: Handling Reshaped Provider Trees https://review.openstack.org/572583 | 19:11 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Compute: Handle reshaped provider trees https://review.openstack.org/576236 | 19:15 |
efried | dansmith, mriedem, cdent, jaypipes, bauzas, gibi, alex_xu, edmondsw, edleafe ^ and --^ | 19:16 |
mriedem | efried: few things in here https://review.openstack.org/#/c/556669/ | 19:29 |
*** edmondsw has quit IRC | 19:39 | |
*** ttsiouts has joined #openstack-placement | 19:44 | |
efried | mriedem: Other than the horrible zuul failures? Starting to think those might be legitimate... | 19:48 |
mriedem | efried: also a few comments in the shrub pruner spec https://review.openstack.org/#/c/572583/ | 19:49 |
efried | mriedem: ack, thanks. | 19:49 |
mriedem | as for your test failures http://logs.openstack.org/69/556669/4/check/tempest-full/c2bd2e9/controller/logs/screen-n-cond-cell1.txt.gz?level=TRACE | 19:50 |
openstackgerrit | Eric Fried proposed openstack/nova master: Nix unused raise_if_custom_resource_class_pre_v1_1 https://review.openstack.org/575847 | 19:52 |
mriedem | efried: looking at logstash, it's something in just your change | 19:54 |
mriedem | http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Failed%20to%20synchronize%20the%20placement%20service%20with%20resource%20provider%20information%20supplied%20by%20the%20compute%20host.%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d | 19:54 |
efried | mriedem: Yeah, one of the other failures is giving a conflict on setting host aggregates. | 19:55 |
efried | which certainly smells like my patch. | 19:55 |
*** e0ne has quit IRC | 20:02 | |
efried | mriedem: This would almost certainly be solved by a retry... | 20:06 |
efried | ...but I wonder if it'd be appropriate to make the test a bit more resilient. | 20:07 |
*** ttsiouts has quit IRC | 20:11 | |
*** e0ne has joined #openstack-placement | 20:12 | |
*** ttsiouts has joined #openstack-placement | 20:12 | |
mriedem | efried: are you sure that _refresh_associations isn't using a stale generation or something? | 20:14 |
efried | mriedem: It shouldn't be. But looking at that made me realize I missed something in this patch | 20:14 |
efried | mriedem: This patch was supposed to be using 1.19 to get the generation handling for aggregate APIs, but I missed _get_provider_aggregates. | 20:15 |
efried | Which also winds up with me needing to fix _get_provider_traits, which is almost in unrelated-to-this-patch territory. | 20:15 |
efried | Going to have to reword the commit message slightly. | 20:15 |
mriedem | yeah i think i just saw the same thing inline | 20:16 |
efried | mriedem: What I don't get is how the generation could have changed between _get_provider_by_name and set_aggregates_for_provider. | 20:17 |
openstackgerrit | Merged openstack/nova master: Skip ServerShowV263Test.test_show_update_rebuild_list_server for cellsv1 https://review.openstack.org/576194 | 20:18 |
*** e0ne has quit IRC | 20:18 | |
efried | ya know, it's probably just that the cache is out of date. | 20:21 |
efried | which is probably because we're missing a cache invalidate somewhere. | 20:21 |
efried | oh, except this is actually on the host aggs code path, so that can't be it. | 20:21 |
openstackgerrit | Dan Smith proposed openstack/nova master: Fix MigrateData object tests for compat routines https://review.openstack.org/576256 | 20:28 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: DNM: Use claim context during live migration https://review.openstack.org/576222 | 20:36 |
mriedem | i went "sell shit on craigslist" crazy yesterday and now the hordes are non-stop; fyi in case anyone doesn't hear from me in a couple of days because someone from craigslist murdered me over a jogging stroller | 20:39 |
mriedem | jesus i would not drive 90 minutes round trip for a stroller | 20:42 |
jroll | those things are expensive yo | 20:43 |
mriedem | not surprisingly, no one wants to buy up my old ink jet printer! | 20:45 |
efried | I bought a truck on craigslist yesterday. Got scrooood. | 20:49 |
efried | tranny gave out on the drive home. Like 5 miles. | 20:49 |
jroll | oof | 20:50 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix regression when listing build_requests with marker and ip filter https://review.openstack.org/576161 | 20:51 |
openstackgerrit | Eric Fried proposed openstack/nova master: Tighten up ReportClient use of generation https://review.openstack.org/556669 | 20:56 |
efried | Let's see how that ^ flies. mriedem, jaypipes: ^ is the artist formerly known as aggregate generations. Let's see if zuul likes it better now. | 20:57 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Clarify placement DB schema migration https://review.openstack.org/576265 | 20:57 |
openstackgerrit | Eric Fried proposed openstack/nova-specs master: Spec: Handling Reshaped Provider Trees https://review.openstack.org/572583 | 21:10 |
openstackgerrit | Eric Fried proposed openstack/nova-specs master: Spec: Handling Reshaped Provider Trees https://review.openstack.org/572583 | 21:13 |
* efried corrects gerrit UI multiple edit fail ^ | 21:13 | |
openstackgerrit | Chris Dent proposed openstack/nova master: Isolate placement database config https://review.openstack.org/541435 | 21:16 |
openstackgerrit | Chris Dent proposed openstack/nova master: Ensure that os-traits sync is attempted only at start of process https://review.openstack.org/553857 | 21:16 |
openstackgerrit | Dan Smith proposed openstack/nova stable/queens: Add amd-ssbd and amd-no-ssb CPU flags https://review.openstack.org/576270 | 21:19 |
*** tssurya has quit IRC | 21:28 | |
*** gjayavelu has quit IRC | 22:08 | |
*** mriedem has quit IRC | 22:16 | |
*** tssurya has joined #openstack-placement | 22:17 | |
*** mriedem has joined #openstack-placement | 22:19 | |
*** tssurya has quit IRC | 22:22 | |
*** gjayavelu has joined #openstack-placement | 22:32 | |
*** ttsiouts has quit IRC | 22:33 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Wait for network-vif-plugged before starting live migration https://review.openstack.org/558001 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add check if neutron "binding-extended" extension is available https://review.openstack.org/523548 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add "bind_ports_to_host" neutron API method https://review.openstack.org/523604 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add "delete_port_binding" network API method https://review.openstack.org/552170 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add "activate_port_binding" neutron API method https://review.openstack.org/555947 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Delete port bindings in setup_networks_on_host if teardown=True https://review.openstack.org/556333 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Implement migrate_instance_start method for neutron https://review.openstack.org/556334 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add VIFMigrateData object for live migration https://review.openstack.org/515423 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add VIFMigrateData.get_dest_vif https://review.openstack.org/566931 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: libvirt: factor out pre_live_migration plug_vifs call https://review.openstack.org/566932 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: libvirt: use dest host port bindings during pre_live_migration https://review.openstack.org/566933 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: libvirt: use dest host vif migrate details for live migration https://review.openstack.org/551370 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Annotate flows and handle PortBindingDeletionFailed in ComputeManager https://review.openstack.org/551371 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Port binding based on events during live migration https://review.openstack.org/434870 | 22:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: conductor: use port binding extended API in during live migrate https://review.openstack.org/522537 | 22:53 |
*** gjayavelu has quit IRC | 23:05 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/pike: Fixed auto-convergence option name in doc https://review.openstack.org/576282 | 23:32 |
*** takashin has joined #openstack-placement | 23:45 |
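
Note on the 20:06 to 20:17 exchange between efried and mriedem: the failure under discussion is a generation conflict, where a write is rejected because the generation the caller sent no longer matches the one stored by the service. The "solved by a retry" idea can be sketched roughly as below. This is only an illustrative sketch, not the code from the patch under review: `FakeProviderStore`, `GenerationConflict`, and `set_aggregates_with_retry` are hypothetical names invented for this example, and the in-memory store merely stands in for a service that guards writes with a generation.

```python
class GenerationConflict(Exception):
    """Raised when the caller's generation does not match the stored one."""


class FakeProviderStore:
    """Hypothetical in-memory stand-in for a generation-guarded service."""

    def __init__(self):
        self.generation = 0
        self.aggregates = set()

    def get(self):
        # Return the current generation together with a copy of the data.
        return self.generation, set(self.aggregates)

    def put_aggregates(self, aggregates, generation):
        # Reject the write if the caller worked from a stale generation.
        if generation != self.generation:
            raise GenerationConflict(
                "expected generation %d, got %d"
                % (self.generation, generation))
        self.aggregates = set(aggregates)
        self.generation += 1
        return self.generation


def set_aggregates_with_retry(store, aggregates, cached_generation,
                              max_attempts=3):
    """Try the write; on conflict, refresh the generation and try again."""
    generation = cached_generation
    for _attempt in range(max_attempts):
        try:
            return store.put_aggregates(aggregates, generation)
        except GenerationConflict:
            # The cached generation was stale, so re-read it before retrying.
            generation, _current = store.get()
    raise GenerationConflict("gave up after %d attempts" % max_attempts)
```

A blind retry like this overwrites whatever the concurrent writer stored, so it is only appropriate when the caller's view of the aggregates is authoritative. Otherwise the refreshed data would need to be merged before the second attempt.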