Thursday, 2018-09-27

01:25 <alex_xu> gmann: sorry, I wasn't back at that time yesterday, I had a late meeting at lunchtime yesterday
01:27 <gmann> alex_xu: no prob.
01:28 <gmann> cfriesen: I didn't completely get the query about the microversion need.  are you saying to return 400 if image properties and flavor extra-specs conflict in a rebuild/resize/create request?
01:52 <openstackgerrit> zhaodan7597 proposed openstack/nova master: Unable to delete volume when a vmware instance bfv is failed.  https://review.openstack.org/571112
02:03 <cfriesen> gmann: we're planning on adding extra validation for flavor extra-specs and image properties for operations that could change either or both of them.  if the combination of flavor extra-specs and image properties doesn't make sense, we will return an error (400 presumably) back to the user.
02:04 <cfriesen> gmann: currently this request would be accepted but would fail later on down on the compute node, but that's an RPC cast and so the user wouldn't get an error message.
02:08 <gmann> cfriesen: ok, what all APIs? i am wondering if that is taken care of by additionalProperties or not.
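
A hedged sketch of the kind of API-side check cfriesen describes - rejecting an incompatible flavor/image combination with a 400 up front instead of failing after the RPC cast to the compute node. The helper and the single rule shown are illustrative assumptions, not nova's actual validation code:

    # Hypothetical sketch: validate flavor extra-specs against image
    # properties in the API layer so the user gets a 400 immediately.
    from webob import exc

    def _check_flavor_image_compat(flavor, image_meta):
        # Illustrative conflict: flavor and image disagree on CPU policy;
        # a real validator would cover many more properties.
        flavor_policy = flavor.extra_specs.get('hw:cpu_policy')
        image_policy = image_meta.get('properties', {}).get('hw_cpu_policy')
        if flavor_policy and image_policy and flavor_policy != image_policy:
            raise exc.HTTPBadRequest(
                explanation='flavor extra specs conflict with image properties')
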
03:07 <openstackgerrit> Brin Zhang proposed openstack/nova master: Add support volume_type in compute api  https://review.openstack.org/605573
03:16 <openstackgerrit> Brin Zhang proposed openstack/nova master: Add support volume_type in compute api  https://review.openstack.org/605573
03:20 <openstackgerrit> zhaodan7597 proposed openstack/nova master: Unable to delete volume when a vmware instance bfv is failed.  https://review.openstack.org/571112
03:21 <openstackgerrit> zhaodan7597 proposed openstack/nova master: Unable to delete volume when a vmware instance bfv is failed.  https://review.openstack.org/571112
03:26 <brinzhang> Kevin_Zheng: Take a look at this patch https://review.openstack.org/#/c/605573/2/nova/compute/api.py
03:27 <brinzhang> Kevin_Zheng: the VOLUME_TYPE_MIN_COMPUTE_VERSION = 52 variable is not needed; replace it with CINDER_V3_VOLUME_TYPE_MIN_COMPUTE_VERSION = 35
03:28 <brinzhang> to check that the volume type is supported by the cinder min version.
03:29 <Kevin_Zheng> you should rearrange your patchsets, it now seems very hard to follow
03:33 <Kevin_Zheng> replied in your new patch
03:40 <brinzhang> Yeah, updating
04:11 <openstackgerrit> Merged openstack/nova master: Revert "Make host_aggregate_map dictionary case-insensitive"  https://review.openstack.org/604898
04:12 <openstackgerrit> Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (3)  https://review.openstack.org/574104
04:12 <openstackgerrit> Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (4)  https://review.openstack.org/574106
04:13 <openstackgerrit> Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (5)  https://review.openstack.org/574110
05:42 <openstackgerrit> fupingxie proposed openstack/nova master: Don't recreate inst_base on source when using rbd backend in resize  https://review.openstack.org/605590
06:05 <openstackgerrit> Takashi NATSUME proposed openstack/nova master: Remove mox in libvirt/test_driver.py (7)  https://review.openstack.org/571992
06:05 <openstackgerrit> Takashi NATSUME proposed openstack/nova master: Remove mox in libvirt/test_driver.py (8)  https://review.openstack.org/571993
06:13 <openstackgerrit> Tao Li proposed openstack/nova master: Rollback instance vm_state to original where instance claims failed  https://review.openstack.org/592252
06:35 <openstackgerrit> Lee Yarwood proposed openstack/nova stable/rocky: placement: Always reset conf.CONF when starting the wsgi app  https://review.openstack.org/604694
06:45 <openstackgerrit> Bryan Song proposed openstack/nova master: Creation image for volume-backend instance should use volume size in image property 'min_disk'  https://review.openstack.org/605596
07:03 <openstackgerrit> Tao Li proposed openstack/nova master: Don't persist retry information into database  https://review.openstack.org/605011
07:20 <openstackgerrit> Merged openstack/nova master: consumer gen: move_allocations  https://review.openstack.org/591810
07:24 <openstackgerrit> huanhongda proposed openstack/nova master: Allow to attach/detach port when vm_state is soft-delete  https://review.openstack.org/605602
08:00 <openstackgerrit> huanhongda proposed openstack/nova master: Allow to attach/detach port when vm_state is soft-delete  https://review.openstack.org/605602
08:21 <kashyap> gibi: Morning, want to put this through: https://review.openstack.org/#/c/605060/
08:21 <kashyap> gibi: We got confirmation from all the relevant distros
08:21 <gibi> kashyap: good morning. looking
08:21 <kashyap> Thank you!
08:25 <gibi> kashyap: there are a couple of FIXMEs in https://wiki.openstack.org/wiki/LibvirtDistroSupportMatrix regarding minimum libvirt and qemu versions
08:26 <gibi> kashyap: I guess if we get the relevant info from the distros then we can fill those out now
08:26 <kashyap> gibi: Yep, I just sent two reminders, to Iain from Oracle and Colleen from SUSE, to fill in the FIXMEs there
08:27 <kashyap> gibi: I added the FIXMEs there :-)
08:27 <gibi> kashyap: cool :)
08:27 <gibi> kashyap: +2
08:28 <kashyap> Sweet, thank you!
08:28 <gibi> kashyap: thank you for picking this work up
08:29 <kashyap> gibi: No worries; I did that last cycle, and once or twice before too.  Thought I'd "remove the bandage quickly" this time too :-)
08:41 <stephenfin> kashyap: I assume you're going to follow that up with a patch to bump the current minimums?
08:41 <kashyap> stephenfin: Yeah, indeed.
08:41 <kashyap> stephenfin: Want to ACK the above? Already got one from gibi
08:42 <stephenfin> I can. Just reading through the notes on the Wiki first
08:42 <kashyap> Sure
08:42 <kashyap> stephenfin: In short: Oracle Linux already has the relevant versions we bumped to, and SLES will have them (Colleen confirmed on the review).
08:43 <kashyap> And the rest of the distributions already have them.
08:43 <kashyap> That's the thread on the list: lists.openstack.org/pipermail/openstack-operators/2018-September/015929.html
08:43 <kashyap> Clickable: http://lists.openstack.org/pipermail/openstack-operators/2018-September/015929.html
08:53 <openstackgerrit> Radoslav Gerganov proposed openstack/nova master: VMware: Live migration of instances  https://review.openstack.org/270116
09:17 <openstackgerrit> Takashi NATSUME proposed openstack/nova master: Add API ref guideline for body text  https://review.openstack.org/605628
09:34 <openstackgerrit> Balazs Gibizer proposed openstack/nova master: Follow up for Ib6f95c22ffd3ea235b60db4da32094d49c2efa2a  https://review.openstack.org/604743
10:00 <openstackgerrit> Chen proposed openstack/nova master: remove commented-out code  https://review.openstack.org/605635
10:18 <openstackgerrit> Brin Zhang proposed openstack/nova master: Specifies the storage backend to boot instance  https://review.openstack.org/579360
10:35 <openstackgerrit> Matthew Booth proposed openstack/nova master: Don't delete disks on shared storage during evacuate  https://review.openstack.org/578846
10:35 <openstackgerrit> tianhui proposed openstack/nova master: Update doc: launch-instance-from-volume  https://review.openstack.org/605640
11:15 <openstackgerrit> Merged openstack/nova master: Pick next minimum libvirt / QEMU versions for "T" release  https://review.openstack.org/605060
11:27 <openstackgerrit> Balazs Gibizer proposed openstack/nova master: Followup for Iba230201803ef3d33bccaaf83eb10453eea43f20  https://review.openstack.org/605653
11:57 <openstackgerrit> Balazs Gibizer proposed openstack/nova master: Follow up for Iba230201803ef3d33bccaaf83eb10453eea43f20  https://review.openstack.org/605653
11:57 <openstackgerrit> Balazs Gibizer proposed openstack/nova master: Follow up for Ie991d4b53e9bb5e7ec26da99219178ab7695abf6  https://review.openstack.org/605658
12:01 <gibi> jaypipes: I've left an answer to your question in https://review.openstack.org/#/c/591811/
12:13 <mdbooth> Hmm, I just discovered that initialisation by defaultdict isn't threadsafe
12:13 <cdent> mdbooth: that statement has an mdbooth number of 7
12:14 <mdbooth> It means that if you've got: foo = defaultdict(threading.Lock)
12:14 <mdbooth> and 2 threads concurrently do: foo['thing']
12:14 <mdbooth> They might get different threading.Lock objects
12:15 * mdbooth had always assumed that the library authors weren't actually evil
12:16 <cdent> there's some interesting but potentially old discussion at https://stackoverflow.com/questions/17682484/is-collections-defaultdict-thread-safe
12:17 <mdbooth> cdent: Yep, that's where I read it :)
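
A minimal demonstration of the race mdbooth describes, plus the setdefault-based workaround from the linked discussion (CPython-specific behavior):

    # defaultdict.__missing__ calls the factory and returns the object it
    # just built, so two threads that both miss on the same key can each
    # end up holding a *different* threading.Lock.
    import threading
    from collections import defaultdict

    locks = defaultdict(threading.Lock)

    def racy_get(key):
        return locks[key]  # may return a Lock that never got stored

    # dict.setdefault is atomic in CPython, so every caller gets the same
    # stored Lock; a losing thread's spare Lock is simply discarded.
    safe_locks = {}

    def safe_get(key):
        return safe_locks.setdefault(key, threading.Lock())
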
12:43 <openstackgerrit> huanhongda proposed openstack/nova master: Allow to attach/detach port when vm_state is soft-delete  https://review.openstack.org/605602
12:48 <openstackgerrit> Elod Illes proposed openstack/nova master: Reject interface attach with QoS aware port  https://review.openstack.org/570078
12:50 <openstackgerrit> Matt Riedemann proposed openstack/nova master: Null out instance.availability_zone on shelve offload  https://review.openstack.org/599087
12:50 <mriedem> johnthetubaguy: alex_xu: ^ was just a rebase due to a new notification test that needed a sample update
13:11 <gibi> mriedem: is the skip_filters flag of the scheduler you mention in https://github.com/openstack/nova/blob/8c3d02ac3d890f414ce4e05c41d44dca3b385424/nova/conductor/tasks/live_migrate.py#L103-L108 a reality or just a dream?
13:12 <gibi> mriedem: I'm trying to solve the nested force migration issues in https://review.openstack.org/#/c/604084/2/nova/tests/functional/test_servers.py@4897 and I think that skip_filters flag is the solution
13:17 <mriedem> gibi: i'm not working on it if that's what you mean
13:18 <mriedem> but i think the idea is still ok - pass a flag to the scheduler to not run filtered hosts, just do the claim on the forced host
13:18 <mriedem> treats the scheduler more like a library
13:18 <mriedem> the point of that todo was more about DRYing up the code
13:18 <gibi> mriedem: would scheduler still call GET allocation_candidates?
13:19 <mriedem> if it needs those to make the claim, then i guess? but that kind of goes against what forcing a host is all about,
13:19 <mriedem> which is, claims be damned this is where i want the thing to go
13:19 <mriedem> but having said that,
13:19 <mriedem> we already broke that contract with https://github.com/openstack/nova/blob/8c3d02ac3d890f414ce4e05c41d44dca3b385424/nova/conductor/tasks/live_migrate.py#L103-L108
13:20 <mriedem> because we have to keep allocations straight in placement
13:20 <mriedem> or that's what we said to justify that change at the time
13:20 <mriedem> honestly force with bypassing the scheduler was always a terrible idea
13:20 <gibi> mriedem: when we force a host we cannot blindly copy the source allocation to the dest host in the nested case
13:21 <gibi> mriedem: as we only know the dest host rp_uuid but not the nested rps
13:21 <mriedem> so we need a new set of allocation candidates
13:21 <gibi> mriedem: one way to solve that is to call GET a_c but limit the search to the given dest host
13:21 <mriedem> that seems reasonable
13:21 <mriedem> essentially,
13:22 <mriedem> in the case of nested RP allocations + force, we'd just go through the scheduler with the requested host but kind of ignore the force flag
13:22 <mriedem> i.e. you can live migrate today by specifying a specific host w/o forcing it and we'll validate that requested host with the scheduler
13:22 <mriedem> it sounds like you want to do the same
13:22 <mriedem> s/want/need/
13:23 <gibi> mriedem: that seems to be the way forward. So now I go and dig into that code path
13:23 <gibi> mriedem: thanks
13:24 <mriedem> in the case of an unforced live migration to a specific host, we just set the RequestSpec.requested_destination to the requested host/node and send that to the scheduler
13:24 <mriedem> it sounds like you'd just need some logic up-front to determine, is this an instance that has allocations on nested RPs
13:24 <mriedem> and if so, ignore the force flag
13:25 <gibi> mriedem: yeah, forced live migration will be less forced as it can return NoValidHost after this change
13:25 <mriedem> https://github.com/openstack/nova/blob/8c3d02ac3d890f414ce4e05c41d44dca3b385424/nova/compute/api.py#L4374
13:25 <mriedem> gibi: and i think that's ok - forced live migration could always fail even if you bypassed the scheduler b/c the conductor task still does some prechecks on the forced host
13:25 <gibi> mriedem: which means if every instance ends up on nested RPs then forced live migration will be equal to non-forced live migration with a provided host
13:26 <mriedem> https://github.com/openstack/nova/blob/8c3d02ac3d890f414ce4e05c41d44dca3b385424/nova/conductor/tasks/live_migrate.py#L160
13:26 <mriedem> like, self._check_host_is_up(self.destination) is the ComputeFilter
13:26 <mriedem> self._check_destination_has_enough_memory() is the RamFilter
13:26 <mriedem> etc
13:26 <mriedem> i'd really like to just get rid of that _check_requested_destination method
13:27 <gibi> mriedem: OK, this seems to be a way forward to eventually get rid of the force live migration altogether
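
A rough sketch of the conductor-side logic being agreed on here - ignore the force flag when the instance has nested allocations and let the scheduler validate the requested destination instead. The helper names and the "more than one provider means nested" shortcut are assumptions, not nova's actual live-migrate task code:

    # Hypothetical sketch, not the real LiveMigrationTask code.
    def _has_nested_allocations(context, reportclient, instance):
        allocs = reportclient.get_allocations_for_consumer(
            context, instance.uuid)
        # Allocations are keyed by resource provider uuid; more than one
        # provider implies nested (or sharing) providers are involved.
        return len(allocs or {}) > 1

    def _maybe_ignore_force(context, reportclient, instance, request_spec,
                            destination, force):
        if force and _has_nested_allocations(context, reportclient, instance):
            # Run the request through the scheduler, limited to the requested
            # destination, so GET /allocation_candidates can produce the
            # nested allocations; this can now raise NoValidHost.
            request_spec.requested_destination = destination
            force = False
        return force
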
jaypipesmriedem: "recheck slow node" <-- worst 90s band name EVAH.13:27
mriedemjaypipes: it's definitely not as good as butthole surfers13:27
jaypipesindeed.13:27
jaypipestrue fact: my high school band in the early 90s was called "Slow Children at Play".13:28
mriedemhighly offensive13:28
jaypipesembarrassing, I know.13:28
mriedemhttps://www.youtube.com/watch?v=SkjJLQUhxks13:29
13:29 <mriedem> now i know what needs to be playing this morning
13:32 <mriedem> bauzas: depending on where you check if you should ignore the force flag, it could get messy,
13:32 <mriedem> https://github.com/openstack/nova/blob/8c3d02ac3d890f414ce4e05c41d44dca3b385424/nova/compute/api.py#L4359
13:32 <mriedem> because the api determines what gets passed to conductor
13:32 <mriedem> it's all very tightly coupled
13:33 <mriedem> i.e. if force is True, the requested host parameter isn't passed to conductor, but the request spec is set with the requested_destination
13:33 <mriedem> but if force is False, the request spec is untouched and the requested host is passed to conductor
13:33 <mriedem> and conductor has to know what that means
13:34 <gibi> mriedem: I see
13:34 <mriedem> i assume the logic to determine if the instance has allocations against nested RPs should probably live in conductor
13:34 <mriedem> so we don't block the API response
13:34 <mriedem> although having said that....
13:34 <mriedem> before some microversion, live migration is a synchronous rpc call from api-conductor-scheduler until we pick a host and cast to it
13:35 <mriedem> https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id31
13:35 <mriedem> meaning it's pretty easy to timeout the API while picking hosts before 2.34
13:35 <mriedem> cold migrate is the same way - it's all synchronous until we cast to the chosen compute
13:36 <mriedem> :(
13:36 <mriedem> anywho, i'd recommend doing your nested RP calculations in conductor rather than api
13:37 <gibi> mriedem: ack
13:37 <mriedem> maybe you don't need to look at placement? maybe you can just glean if it's got nested rps by looking at the flavor?
13:37 <mriedem> i guess you care about allocations for ports
13:38 <gibi> mriedem: even if the flavor has granular groups I don't know if the host the instance is on is already reshaped to nested or not
13:38 <mriedem> right yeah
13:39 <gibi> mriedem: can we simply ignore the force flag for every instance, not just for the nested ones? I mean the end user will not know if his instance is nested or not
13:40 <mriedem> it's not really about the owner of the instance (the user), force is for the admin
13:40 <gibi> mriedem: true, I missed that
13:41 <mriedem> if we're really going down a path of blatantly ignoring the force parameter, we should probably consider just deprecating it in the api
13:41 <mriedem> like i said, we kind of already ignore it today for ram/disk/vcpu
13:41 <mriedem> plus some other sanity checks that conductor does
13:41 <mriedem> caching scheduler is the only thing that would still truly force today since it doesn't create allocations
13:41 <mriedem> but we can maybe remove the caching scheduler now
13:42 <mriedem> mgagne: i've been meaning to follow up with you about the caching scheduler removal...
13:43 <gibi> mriedem: does this ignoring behavior need to be guarded with a new api microversion? I hope not, as maintaining the old behavior is not easily possible for nested
13:44 <mriedem> gibi: no i don't think so,
13:44 <mriedem> we're already ignoring force for some filters like i said
13:45 <mriedem> a microversion would be more of a signaling mechanism,
13:45 <mriedem> plus things like nova CLI that do version discovery and pass the latest microversion by default would simply opt into the new microversion where force is never passed
13:46 <mriedem> iow, force is legacy from when we just had claims in the compute and overcommitting was "simpler"
13:46 <gibi> mriedem: so there would be a microversion to remove the force flag from the API, but ignoring the force flag would be retroactive in every version
13:46 <bauzas> mriedem: I'm in and out due to some customer issue, but I'll reply to you later on
13:46 <mriedem> well, i don't know if we should blatantly ignore it if we can help it
13:47 <mriedem> but i'm not sure how we determine if we should ignore it or not for nested RPs
13:47 <gibi> mriedem: we cannot easily help it in case of nested, and sooner or later (after NUMA) every instance will be nested
13:47 <mriedem> b/c like you said, the instance might have allocations on nested RPs on the source host but maybe not the forced dest?
13:47 <mriedem> well, every instance with numa allocations
13:47 <mriedem> not all instances are those kinds though
13:48 <gibi> mriedem: if VCPU resources are moved to the NUMA RP then even a simple instance will become nested
13:53 <mriedem> are we talking about doing that?
13:53 <mriedem> i guess for PCPU/VCPU modeling?
13:53 <mriedem> if so, shiiiiit that reshape is going to be big in a public cloud
13:53 <gibi> mriedem: honestly I don't know what will be the numa model in placement
13:55 <gibi> mriedem: https://review.openstack.org/#/c/555081/18/specs/stein/approved/cpu-resources.rst@261
13:55 <gibi> mriedem: in this spec VCPU is under NUMA
14:00 <mriedem> gibi: ok, left a comment / question in the upgrade impact section,
14:00 <mriedem> we also said at the ptg that that spec wouldn't be a priority for stein
14:00 <gibi> mriedem: correct
14:01 <mriedem> just getting vgpu reshaping done is going to be a big hurdle
14:01 <gibi> mriedem: yeah so vgpu and bandwidth-using instances will be the first nested instances. So during this time it makes sense to keep a separate code path for them
14:02 <gibi> mriedem: I'll try to implement what you described: checking the nestedness in conductor and ignoring the force flag for nested
14:24 <openstackgerrit> Christoph Manns proposed openstack/nova master: Fix stacktraces with redis caching backend  https://review.openstack.org/605748
14:24 <openstackgerrit> Artom Lifshitz proposed openstack/nova-specs master: Re-propose numa-aware-live-migration spec  https://review.openstack.org/599587
14:34 <openstackgerrit> Mark Goddard proposed openstack/nova master: Don't emit warning when ironic properties are zero  https://review.openstack.org/605754
14:36 <mnaser> filtering happens after placement, correct?
14:37 <mnaser> is there no warning message that says "i couldn't find anything"?
14:37 <mnaser> i guess i can just rely on conductor's "Setting instance to ERROR state."
14:37 <bauzas> mriedem: I'm still working on the vgpu reshape patch, and yes, it's a big hurdle :(
14:38 <bauzas> gibi: FWIW, I need to split https://review.openstack.org/#/c/552924/ in two, one targeted for Stein with no NUMA affinity
14:38 <openstackgerrit> Mark Goddard proposed openstack/nova master: Don't emit warning when ironic properties are zero  https://review.openstack.org/605754
14:38 <bauzas> mnaser: you're right, filters are called after we found an allocation candidate
14:38 <gibi> bauzas: ack
14:38 <openstackgerrit> Merged openstack/nova master: consumer gen: more tests for delete allocation cases  https://review.openstack.org/591811
14:39 <mnaser> thanks bauzas, im relying on this logstash query to monitor those: tags:nova AND message:"Setting instance to ERROR state." AND message:NoValidHost_Remote
14:39 <bauzas> mnaser: but you should at least still have the filtering logs
14:40 <mnaser> yeah the filter logs are there if 0 computes match at the end, but i'm curious if there's any warning if no allocation candidates come in the first place
14:40 <bauzas> hah, yeah, but we provide an INFO log saying "heh, 0 hosts found"
14:40 <mnaser> so i guess if it gets 0 allocation candidates, it'll just go through filters and end up with 0 anyways
14:40 <bauzas> I also think we tell in the logs whether we found no candidates after placement
14:40 <mnaser> let me verify
14:40 <bauzas> mnaser: sec, checking the gate
14:40 <bauzas> ok
14:40 <mriedem> i don't think we do,
14:41 <mriedem> we just pass an empty list to ComputeNodeList.get_by_uuids() which returns an empty list and passes that down to the filter scheduler driver, which then raises NoValidHost
14:41 <mriedem> oh i guess we log something at debug,
14:41 <mriedem> but that won't be indexed by logstash
14:42 <mnaser> yeah we dont do that, too much data
14:42 <mriedem> https://github.com/openstack/nova/blob/master/nova/scheduler/manager.py#L150
14:42 <bauzas> mnaser: yup, we do http://logs.openstack.org/72/585672/7/check/tempest-full-py3/f1d0b34/controller/logs/screen-n-sch.txt.gz#_Sep_26_19_02_54_285546
14:42 <mriedem> mnaser: do you have a failure log to check?
14:43 <mriedem> placement should be logging some stuff now too
14:43 <mnaser> mriedem: i might if it hasnt rotated out
14:43 <bauzas> mriedem: see, we put an info log on how many hosts we got from placement ^
14:43 <mriedem> as to which "filters" in placement resulted in 0 allocation candidates
14:43 <mriedem> Sep 26 19:02:54.285546 ubuntu-xenial-rax-ord-0002338068 nova-scheduler[18720]: DEBUG nova.filters [None req-8a6074ae-e62f-4ac7-a525-4c411c130c39 tempest-AutoAllocateNetworkTest-2112321998 tempest-AutoAllocateNetworkTest-2112321998] Starting with 1 host(s) {{(pid=20232) get_filtered_objects /opt/stack/nova/nova/filters.py:70}}
14:43 <bauzas> oh shit, that's DEBUG
14:43 <mnaser> mriedem: that is a LOG.debug() so a normal deployment wont see it
14:43 <mnaser> yeah
14:43 <mriedem> right
14:43 <bauzas> my bad
14:43 <mriedem> GAWD!
14:44 <mnaser> i think it's useful to get that warning because a lot of times when placement isnt returning anything
14:44 <bauzas> we info out when we have the filtering results
14:44 <mnaser> i would get really confused
14:44 <mriedem> logging https://github.com/openstack/nova/blob/master/nova/scheduler/manager.py#L150 at INFO might be ok
14:44 <mnaser> yeah but we don't even pass things down to the filters
14:44 <bauzas> WTF
14:44 <mnaser> if we get 0 allocation candidates
14:44 <mnaser> if i understand what mriedem linked above
14:44 <bauzas> that's correct
14:44 <bauzas> we just said "meh, that's bad"
14:45 <bauzas> my point was, if we end up with 0 hosts from filtering, some INFO log is done
14:45 <bauzas> so, having the pre-filtering result be INFO seems consistent and valid to me
14:45 <mnaser> yeah that scenario is taken care of i agree
14:45 <bauzas> lemme dig into the code
14:45 <mnaser> i'd even go as far as to say that's a warning
14:45 <mnaser> https://github.com/openstack/nova/blob/master/nova/scheduler/manager.py#L150 -- just switch that to warning?
14:46 <bauzas> but I'm pretty sure we say it's INFO (and no ERROR or warning, because a capacity problem isn't a scheduling problem)
14:46 * mnaser is tempted to make his patch as spammy as possible
14:46 <mnaser> "change debug level for more info"
14:46 <bauzas> mnaser: I'd advocate for INFO
14:46 <bauzas> no WARN
14:46 <mnaser> it would be consistent with the other stuff
14:46 <bauzas> lemme find the existing log we raise post-filtering
14:46 <mriedem> bauzas: unrelated, but is it just me or do we persist RequestSpec.requested_destination?
14:46 <mriedem> and probably shouldn't...
14:47 <mnaser> bauzas: its info, i have an entry here
14:47 <bauzas> mriedem: wait, wait wait
14:47 <mnaser> bauzas: 2018-09-27 12:37:00.467 394218 INFO nova.filters [<snip>] Filter ComputeFilter returned 0 hosts
14:47 <mriedem> mnaser: i'd say info
14:47 <bauzas> mriedem: probably yet another PEBKAC then
14:47 <mriedem> it's not a warning if someone is trying to resize to a flavor that won't fit anywhere
14:47 <bauzas> (for the persisted field)
14:47 <bauzas> mriedem: zactly
14:47 <bauzas> (16:46:11) bauzas: but I'm pretty sure we say it's INFO (and no ERROR or warning, because a capacity problem isn't a scheduling problem)
14:48 <bauzas> gosh, already 4:46pm here :(
14:48 <mriedem> so if we do persist the request spec requested_destination, i'm just not sure how it doesn't cause problems
14:49 <mriedem> maybe we just get lucky and don't call request_spec.save() on the dirty request spec?
14:50 <mnaser> would we be able to backport that log level change? it's kinda useful.  if we can i guess i'll file a bug?
14:50 <mriedem> mnaser: sure
14:51 <bauzas> mriedem: https://github.com/openstack/nova/blob/master/nova/objects/request_spec.py#L29
14:51 <mriedem> bauzas: that doesn't really tell me anything
14:51 <bauzas> and shit, I threw my day at some internal bug and now I'm done, I have to go into a meeting
14:51 * bauzas has a very productive day
14:52 <mriedem> mnaser: this is your justification https://github.com/openstack/nova/blob/c6218428e9b29a2c52808ec7d27b4b21aadc0299/nova/filters.py#L130
14:52 <mriedem> b/c if we got allocation candidates, but the filters rejected all of them, we log something at INFO
14:53 <mriedem> http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Filtering%20removed%20all%20hosts%20for%20the%20request%20with%5C%22%20AND%20tags%3A%5C%22screen-n-sch.txt%5C%22&from=7d
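
The fix mnaser pushes a few minutes later (https://review.openstack.org/605765) boils down to bumping that scheduler-manager log from debug to info; a paraphrased sketch of the patch site, not the exact diff:

    # nova/scheduler/manager.py (sketch): surface the "no allocation
    # candidates" case at INFO so deployments that don't index debug
    # logs can still see why NoValidHost was raised.
    if not alloc_reqs:
        LOG.info("Got no allocation candidates from the Placement API. "
                 "This may be a temporary occurrence as compute nodes "
                 "start up and begin reporting inventory to Placement.")
        raise exception.NoValidHost(reason="")
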
14:54 <bauzas> mriedem: I guess I have to doublecheck this spaghetti code
14:55 <bauzas> mriedem: but since requested_destination is only set on a live migration or an evacuation, I just wonder whether we .save() this
14:55 <mriedem> it's also set on resize
14:55 <mriedem> b/c you can pass a host on resize since queens
14:56 <mriedem> and on resize we persist the request spec with the new flavor before casting to compute
14:56 <mriedem> i'm pretty sure i raised this with takashi when he was writing that
14:56 <bauzas> ah
14:56 <mriedem> ah this is how he dealt with that https://github.com/openstack/nova/blob/master/nova/compute/api.py#L3505
14:56 <bauzas> good point
14:57 <bauzas> anyway, I need to jump on a call
14:57 <mriedem> but....that's likely not good enough if you resize to a specific host, and then live migrate without specifying a host...
14:57 <bauzas> mriedem: live migrate has the same logic IIRC
14:58 <bauzas> we null out the field
14:58 <mriedem> i don't see that happening
14:58 <mriedem> for live migrate
14:58 <bauzas> oh shit no you're right
14:58 <bauzas> bug bug bug
14:58 <openstackgerrit> Mohammed Naser proposed openstack/nova master: Use INFO for logging no allocation candidates  https://review.openstack.org/605765
14:58 <mnaser> mriedem: bauzas ^
14:59 <mnaser> took me longer to come up with a decent commit message jeez
14:59 <melwitt> .
15:00 <mriedem> bauzas: i'll give myself a todo to write a regression test for this
15:00 <bauzas> mriedem: ack
15:01 <mriedem> mnaser: +2
15:01 <openstackgerrit> Christoph Manns proposed openstack/nova master: Fix stacktraces with redis caching backend  https://review.openstack.org/605748
15:01 <bauzas> mnaser: +Wipped
15:02 <bauzas> mnaser: please make a cherry-pick for rocky
15:06 <openstackgerrit> Mohammed Naser proposed openstack/nova stable/rocky: Use INFO for logging no allocation candidates  https://review.openstack.org/605771
15:07 <mnaser> bauzas: done
15:26 <mnaser> is AggregateRamFilter still relevant and working? i remember there was an aggregate filter that had a long ml discussion about how it wasnt really working?
15:28 <mriedem> mnaser: ask jaypipes re https://review.openstack.org/#/c/544683/ and https://review.openstack.org/#/c/552105/
15:28 * mnaser reads
15:38 <melwitt> mnaser: this is the situation http://lists.openstack.org/pipermail/openstack-dev/2018-January/126283.html and it's still the case now. those two specs ^ are what's needed to restore the ability to set allocation ratios per aggregate
15:41 <openstackgerrit> Balazs Gibizer proposed openstack/nova master: Ignore forcing of live migration for nested instance  https://review.openstack.org/605785
15:41 <gibi> mriedem: my first stab at ignoring the force flag ^^
15:42 <mgagne> @mriedem: what's up with caching scheduler?
15:42 <mriedem> mgagne: are we ok to remove it now?
15:42 <mriedem> in stein
15:43 <mnaser> melwitt: i see, i think i might be looking at the wrong filter then
15:43 <mriedem> i.e. is heal_allocations sufficient for you right now to get upgraded to a FilterScheduler world
15:43 <mgagne> mriedem: I think we figured out it was ok after you wrote the allocation healing tool.
15:43 <mgagne> ++
15:43 <mriedem> mgagne: ok
15:43 <mriedem> thanks
15:43 <mnaser> maybe it was a weigher that had the flip of a switch
15:43 <mnaser> -1 or 1 to pack vs distribute
15:43 <mriedem> host_subset_size?
15:43 <mnaser> yeah it is gr
15:44 <mnaser> https://github.com/openstack/nova/blob/master/nova/scheduler/weights/ram.py i guess no way of having that per host or per aggregate or anything
15:44 <mgagne> mnaser: we have a private implementation of the RAMWeigher per aggregate
15:44 <melwitt> mgagne: while you're here, don't forget to re-propose https://review.openstack.org/312626 for stein. I was +2 on the implementation but we were at feature freeze at the time
15:45 <mgagne> melwitt: thanks for the follow up
15:45 <mnaser> mgagne: that'd be nice to have upstream i guess, assuming that's a possibility
15:46 <mgagne> mnaser: sure, unfortunately, I'm not sure it's gonna play well with placement API :D
15:46 <mgagne> or maybe it's not related?
15:46 <mnaser> weighers are after placement
15:46 <openstackgerrit> Balazs Gibizer proposed openstack/nova master: Ignore forcing of live migration for nested instance  https://review.openstack.org/605785
15:46 <mnaser> so it doesnt matter
15:46 <mgagne> mnaser: awesome
15:46 <mnaser> its just "which machine do i prefer"
15:46 <mnaser> but im not a nova dev, so thats as far as i understand it
15:46 <mnaser> weighers run after placement allocations AND filters have run
15:46 <mgagne> so you just need a high host_subset_size for the weigher to get some hosts to choose from
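
For context, roughly what a per-aggregate RAM weigher looks like against nova's weigher interface; a hedged sketch in the spirit of mgagne's private implementation - the 'ram_weight_multiplier' aggregate metadata key and folding the multiplier into the raw weight are assumptions/simplifications, not upstream code:

    from nova.scheduler import weights

    class AggregateRAMWeigher(weights.BaseHostWeigher):
        def _weigh_object(self, host_state, weight_properties):
            # Positive multiplier spreads (prefer more free RAM),
            # negative packs; upstream RAMWeigher uses one global
            # CONF.filter_scheduler.ram_weight_multiplier instead.
            multiplier = 1.0
            for aggregate in host_state.aggregates:
                value = aggregate.metadata.get('ram_weight_multiplier')
                if value is not None:
                    multiplier = float(value)
                    break
            return multiplier * host_state.free_ram_mb
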
15:47 <mnaser> yeah we have that bumped up. the silly thing is right now by default nova spreads vms which is ok, but with large vms it becomes problematic
15:47 <mnaser> because we have enough capacity for them, just not in aggregate, if that makes sense
15:47 <mgagne> hehe, I know =)
15:48 <bauzas> mriedem: you're my rebuild specialist, so lemme bug you with a silly question
15:48 <mnaser> but we made some really bad decisions with our first flavors, so running with a -1 ram multiplier for * .. bad things happen
15:48 <bauzas> mriedem: as of today, do we rebuild by calling the scheduler or have we stopped doing that? /me is confused by the number of bugs we had about it
15:49 <bauzas> my brain sucks
15:50 <openstackgerrit> Elod Illes proposed openstack/nova master: Reject networks with QoS policy  https://review.openstack.org/570079
15:51 <mgagne> mnaser: based on Newton (sorry): https://gist.github.com/mgagne/142e20e32049abd0cdf5d2da7e048608
15:51 <melwitt> mnaser: hm, supposed to pack by default, I thought
15:52 <mnaser> by default the weighers are set to positive values so they distribute
15:52 <melwitt> ok. in the past the scheduler used to pack by default, so I'm not sure when/how that changed
15:53 <mgagne> doesn't make much sense to pack since if you have, let's say, the openstack infra team spawning 120 VMs at the same time, you will overload the same hosts with requests.
15:53 <melwitt> not since claims in the scheduler, but in the past yeah
15:53 <mgagne> true
15:54 <mgagne> but still, maybe you don't want 6 images being downloaded for the first time on the same host at the same time. or other similarly expensive operations.
15:54 <mnaser> mgagne: thanks for that, ill have a look
15:55 <melwitt> generally speaking, I think pack is the more desired behavior for efficient usage of compute hosts. the only reason people increased subset size, as I understand it, was to avoid the racing of parallel requests trying to claim the same nodes with the old way of claiming
15:55 <mnaser> yeah that was what we had to do for a while
15:55 <mriedem> bauzas: we call the scheduler if the server is being rebuilt with a new image
15:55 <mriedem> b/c we need to validate the new image for the host that the instance is on
15:56 <bauzas> ok, I didn't remember all the conditionals
15:56 <melwitt> yeah, if it's a first-time-ever image. but when I worked at yahoo we used to warm the cache for images on the compute hosts before letting users at it
15:56 <bauzas> mriedem: thanks
15:56 <mnaser> with ceph it's not even an issue because of cow
15:56 <melwitt> true. we didn't use ceph
15:57 <mriedem> mnaser: you're asking about having the RamWeigher applied to an aggregate?
15:57 <bauzas> mriedem: that's from Queens, right?
15:57 <bauzas> I remember the CVE
15:57 <mriedem> bauzas: i think so, but it was backported so ...
15:57 <bauzas> mriedem: okay
15:58 <mriedem> mnaser: reminds me of Kevin_Zheng's spec https://review.openstack.org/#/c/599308/
15:58 <mriedem> trying to make the weight configuration not global
16:00 <mgagne> mriedem: yes and I commented that I had a similar solution per aggregate, not per flavor. code posted above in a gist =)
16:03 <mriedem> mgagne: ok i remember reading your comment but totally missed the part about having the weight configuration per aggregate
16:03 <mriedem> so Kevin_Zheng's spec is maybe way too extreme on the granular side, being per-flavor,
16:04 <mriedem> but global weight configs is also pretty extreme,
16:04 <mriedem> it seems per-aggregate weight configuration would be a nice compromise
16:05 <bauzas> mriedem: mgagne: I think there was a consensus on that approach, even at the PTG
16:05 <bauzas> putting it on a flavor, I nacked, but I'm okay with it per aggregate
16:05 <bauzas> (in the spec, I meant)
16:06 <mriedem> yeah i totally didn't connect the dots on what the alternative was (weights per aggregate)
16:06 <mriedem> i must have been thinking about just pinning flavors to aggregates or something, idk
16:07 <openstackgerrit> Matthew Booth proposed openstack/nova master: Fix a race evacuating instances in an anti-affinity group  https://review.openstack.org/605436
16:07 <bauzas> you can mix both indeed
16:07 <bauzas> if that helps your case
16:07 <bauzas> stick flavors to aggregates, the latter having specific weight policies
16:07 <bauzas> that would fit Kevin_Zheng's concern
16:08 <bauzas> anyway, I need to disappear for a meetup, \o
16:14 <nicolasbock> mriedem: Hi. I had asked you about "lost" servers a while back, i.e. servers that were migrated but nova's database was not updated. You had mentioned that resource provider allocation show will tell me where placement thinks the server is running.
16:14 <cfriesen> gmann: sorry, I didn't notice your question earlier.  the validation of flavor extra-specs and image properties would be done on instance creation, instance resize, and instance rebuild.
16:14 <nicolasbock> Unfortunately, in our deployment none of the hypervisors shows anything using this command
16:15 <nicolasbock> Is there an issue with placement? Or are we missing some configuration? Sorry if I sound confused, but I am ;)
16:19 <openstackgerrit> Chen proposed openstack/nova master: remove commented-out code  https://review.openstack.org/605635
16:21 <melwitt> nicolasbock: for that command, you need to pass the instance uuid ("the consumer"). did you pass that or something else? https://docs.openstack.org/osc-placement/latest/cli/index.html#resource-provider-allocation-show
16:22 <nicolasbock> melwitt: I ran 'resource provider list' first and took the UUIDs as arguments for 'resource provider allocation show'
16:23 <melwitt> nicolasbock: ok, those would be the compute host uuids, which is not what you need to pass. you need to pass the uuid of the instance/server that is "lost"
16:24 <melwitt> and then it will show you information about that instance's allocations and where they are, i.e. which resource provider aka which compute host
16:25 <nicolasbock> Oh sorry, I totally misunderstood the command :(
16:25 <nicolasbock> It's working much better now
16:25 <nicolasbock> Thanks!
16:25 <melwitt> no worries, I had thought the same thing the first time I learned about the command
16:43 <cfriesen> mdbooth: I think I found a flaw in your fail-fast algorithm for https://review.openstack.org/605436
16:46 <cfriesen> We're proposing kind of a "big hammer" fix for a missing marker during online data migration. (https://review.openstack.org/#/c/605164/)   Does anyone have a more elegant solution?
16:56 <mriedem> melwitt: nicolasbock: https://docs.openstack.org/osc-placement/latest/cli/index.html#cmdoption-openstack-resource-provider-allocation-show-arg-uuid describes the uuid but we could rename that metavar to be consumer_uuid so it's more obvious from the beginning
16:57 <mriedem> consumers aren't a top-level resource in placement so that's probably why it's confusing
16:57 <mriedem> unlike openstack resource provider show https://docs.openstack.org/osc-placement/latest/cli/index.html#resource-provider-show
16:57 <melwitt> yeah. it does say "consumer". I think the confusion comes from the fact that it's in the resource provider command family
16:58 <melwitt> the first time I read it, I thought the documentation was a mistake. but edleafe confirmed that it is indeed supposed to be the "consumer" uuid
16:58 <mriedem> probably should have been "openstack resource allocation list <consumer_uuid>"
16:59 <mriedem> could still add that and deprecate the old command
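
For reference, the osc-placement command in question maps onto placement's GET /allocations/{consumer_uuid} API, where the consumer is the instance, not the compute node; a minimal sketch (endpoint, token, and microversion are assumed values):

    import requests

    PLACEMENT = "http://placement.example.com/placement"  # assumed endpoint
    TOKEN = "..."  # assumed keystone token

    def show_allocations(consumer_uuid):
        resp = requests.get(
            "%s/allocations/%s" % (PLACEMENT, consumer_uuid),
            headers={"X-Auth-Token": TOKEN,
                     "OpenStack-API-Version": "placement 1.17"})
        resp.raise_for_status()
        # The response is keyed by resource provider (compute node) uuid,
        # i.e. where placement thinks the server is running.
        return resp.json()["allocations"]
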
17:02 <openstackgerrit> Matt Riedemann proposed openstack/nova master: WIP: Cross-cell resize  https://review.openstack.org/603930
17:15 <openstackgerrit> Matt Riedemann proposed openstack/nova master: Fix stacktraces with redis caching backend  https://review.openstack.org/605748
17:22 * mnaser takes deep breath
17:22 <mnaser> openstackclient doesnt let you do a live migration unless you specify a host?
17:23 <mnaser> https://github.com/openstack/python-openstackclient/blob/c0567806916995698e94734d2b2c422a4bf5a1db/openstackclient/compute/v2/server.py#L1333-L1337
17:23 <nicolasbock> Thanks mriedem and melwitt . Yes, the wording could be clearer, but then again I probably could have read the help a little more carefully :)
17:23 <nicolasbock> I think that's true mnaser
17:24 * melwitt runs away from mnaser
17:24 <mnaser> nova can do live migrations without specifying a host
17:24 <mnaser> the novaclient lets you do it
17:24 <mnaser> and i think there have been voices saying "forcing a host in migrations is a bad idea™"
17:25 <nicolasbock> But in 'nova live-migration' you also need to specify a host
17:25 <nicolasbock> Same as in 'openstack server migrate'
17:25 <cfriesen> mnaser: yeah, it's messed up
17:26 <nicolasbock> sorry, 'openstack server migrate --live'
17:26 <mnaser> nova live-migration does not require a host
17:26 <cfriesen> mnaser: migration and live migration in OSC need help
17:26 <mnaser> it is optional
17:26 <melwitt> yeah, there are unfortunate discrepancies between novaclient and openstackclient. we talked about it a bit at the PTG, L721 in https://etherpad.openstack.org/p/nova-ptg-stein
17:26 <mnaser> i guess this is a lot harder than expected because we'd break "api"
17:26 <mnaser> im not sure what'd be the ideal solution
17:27 <melwitt> I was thinking we could just "fix" the openstackclient side to be able to do the same stuff as novaclient. just someone has to do it
17:27 <mnaser> hmm im thinking add a positional argument [host]
17:27 <melwitt> would have to talk to dtroyer about it more
17:27 <mnaser> and make osc ignore the parameter provided to --live
17:28 <mnaser> at least to let you do clean live migrations
17:28 <cfriesen> mnaser: while you're in there you could also fix up all the *other* live-migration related stuff that OSC doesn't handle.
17:29 <cfriesen> automatic detection of block/shared, for example
17:29 <melwitt> oh yep, that's one that our customers have hit many times
17:29 <artom> Careful about that though, they're moving to openstacksdk, so it might be more intelligent to just help with the move
17:29 <artom> (Assuming openstacksdk doesn't have the same problems)
17:30 <artom> (Can I just say that I feel like the *clients, then openstackclient, then openstacksdk is a case of https://xkcd.com/927/ )
17:30 <mnaser> yep
17:30 <melwitt> moving to openstacksdk is going to change the CLI arguments? I haven't read in detail about what that will involve
17:31 <mnaser> openstacksdk has some 'intelligence'
17:31 <mnaser> inherited from shade
17:31 <mnaser> so it does a lot of figuring out of the right thing to do
17:31 <mnaser> or the ideal set of defaults so to speak
17:31 <melwitt> oh good, it can fix everything automatically then
17:32 <mnaser> where a live migration python call in python-novaclient assumes nothing and sends rest, openstacksdk tries to work around all the weird things we have and deliver a reasonable end result, "a successful live migration"
17:32 <mnaser> https://review.openstack.org/#/c/589012/
17:32 <mnaser> this is good though
17:33 <melwitt> oh, nice patch
17:33 <mnaser> it was in the etherpad, so it gets through that step at least
17:33 <melwitt> ah, I'm blind
17:35 <cfriesen> the thing I don't really like about openstackclient is that the help text isn't sensitive to API version
17:35 <artom> Very few things are :/
17:35 <mnaser> im trying to force myself to use it
17:35 <mnaser> so that people can use it
17:37 <mnaser> btw
17:37 <mnaser> shall we update the topic?
17:38 <melwitt> we should, but dansmith is out until next week. I'm not sure who else can do it
17:38 <mnaser> i can
17:39 *** ChanServ changes topic to "This channel is for Nova development. For support of Nova deployments, please use #openstack."
17:39 <mnaser> let me know if you want to switch it to something else or whatever :>
17:45 <melwitt> oh, heh. ok, IIRC we had the current release schedule in it https://wiki.openstack.org/wiki/Nova/Stein_Release_Schedule and included the current runways, which would be use-nested-allocation-candidates right now
18:00 <mnaser> melwitt: wanna give me something to copy pasta into the topic?
18:00 <melwitt> yeah, lemme see. I can't remember what order it was in
18:01 <mriedem> i've talked with dtroyer about the osc live migration support, i think he basically wants to just re-write the command on a major version
18:02 <mriedem> mnaser: if you do specify a host with osc's live migration command, make sure you are using the microversion that doesn't bypass the scheduler
18:02 <mriedem> https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id27
18:02 <mnaser> mriedem: i switched to using python-novaclient because i explicitly want the scheduler to decide for me
18:03 <mriedem> tbc, the live migration API explicitly requires that the host param is sent, it's not optional, but the value in the REST API can be None (it's dumb)
18:03 <mriedem> but ^ isn't possible in osc b/c you can't specify None on the command line
18:03 <mriedem> you would need to make --host optional
18:04 <mriedem> er, --live <hostname>
18:04 <mriedem> https://docs.openstack.org/python-openstackclient/latest/cli/command-objects/server.html#server-migrate
18:04 <mriedem> like, --live ''
18:04 <mnaser> mriedem: hence https://review.openstack.org/#/c/589012/
18:04 <mriedem> yes, there is another one as well
18:04 <mnaser> but yeah i dunno
18:04 <mnaser> microversions for osc api?
18:04 <mnaser> heck yeah
18:05 <mriedem> https://review.openstack.org/#/c/460059/
18:05 <mnaser> (i'm kidding please don't end me)
18:05 <mriedem> you can pass microversions through for osc
18:05 <mriedem> openstack --os-compute-api-version 2.30 migrate --live
18:05 <mriedem> or set an env var
18:05 <mnaser> yeah no but i meant like
18:05 <mriedem> osc doesn't do version negotiation to default to the latest like nova cli
18:05 <mnaser> the actual openstack client shell api or whatever
18:06 <mnaser> but i guess we call those just *releases* of the client
18:06 <mriedem> https://review.openstack.org/#/c/460059/7//COMMIT_MSG@15
18:06 <mriedem> has my suggestion in it
18:08 <mriedem> you also can't bfv with osc
18:08 <mriedem> unless you use an existing volume
18:08 <mriedem> those are two pretty big gaps in functionality
18:08 <mnaser> now on another note, i think live migrations are scheduled when they are received, not when they start, right?
18:08 <mriedem> correct,
18:08 <mriedem> we have to pick the dest host
18:09 <mriedem> so we can set up things like port bindings and volume attachments
18:09 <mnaser> makes our hypervisor evacuations a bit more annoying in that we can kinda do one at a time
18:09 <mriedem> can only do?
18:09 <mnaser> well if we do host-evacuate-live on 3 nodes at once
18:09 <mnaser> its possible that they schedule to each other
18:10 <mnaser> (sometimes we actually do host-evacuate-live only for the purpose of having instances go to the right place after we make scheduler changes)
18:10 <mriedem> if you are evacuating the hosts, you could disable the compute service
18:10 <mriedem> not evacuate like the evacuate API, i mean "get the vms off this host"
18:10 <mnaser> yeah when we're shutting things down that's what we go for, but when shuffling things around
18:11 <mnaser> you kinda just want instances to move into where they are supposed to go
18:11 <mnaser> anyways
18:11 <mnaser> very minor thing
mriedem"you kinda just want instances to move into where they are supposed to go"18:11
mriedemha18:11
mriedemof course!18:11
mriedemsilly scheduler18:11
mriedemwell as a tc big wig,18:12
mriedemyou can influence the goal setting for T18:12
mnaserusually after we make scheduling tweaks18:12
mnaserwe just do a rolling live migration18:12
mriedemso watcher but without watcher18:12
18:13 <mnaser> just a one-time watcher
18:13 <mnaser> i think watcher is super interesting but i think it depends on too many things
18:14 <mriedem> well as a tc big wig,
18:15 <mriedem> you can influence adoption of a new top-level project: watcher-lite
18:15 <mriedem> watcher zero
18:15 <mriedem> all of the flavor, none of the guilt
18:17 <mnaser> lols
18:21 <melwitt> mnaser: Current runways: use-nested-allocation-candidates -- This channel is for Nova development. For support of Nova deployments, please use #openstack.
18:22 *** ChanServ changes topic to "Current runways: use-nested-allocation-candidates -- This channel is for Nova development. For support of Nova deployments, please use #openstack."
18:22 <mnaser> melwitt: voila, i've made myself useful for today
18:23 <melwitt> \o/
18:23 <openstackgerrit> Matt Riedemann proposed openstack/nova stable/rocky: nova-manage - fix online_data_migrations counts  https://review.openstack.org/605828
mriedemimacdonn: +W on https://review.openstack.org/#/c/605329/ and i found an example of a migration that has total > 0 with completed == 018:26
imacdonnmriedem: ack. I was just about to ask you if you intentionally didn't +W with your +2  :)18:26
mriedemit was intentional,18:26
mriedembecause i was going to backport to stable and get a grenade run where i knew we actually had things to migrate18:27
mriedembut i found one in stein too18:27
imacdonngot it18:27
mriedemhttp://logs.openstack.org/29/605329/2/check/neutron-grenade/2200365/logs/grenade.sh.txt.gz#_2018-09-27_11_17_32_53618:27
mriedem2 rows matched query migrate_instances_add_request_spec, 0 migrated18:27
mriedem^ is with your change18:27
mriedem|          populate_queued_for_delete         |      2       |     2     |18:27
mriedemhttp://logs.openstack.org/88/605488/1/check/neutron-grenade/d64e316/logs/grenade.sh.txt.gz#_2018-09-27_01_15_47_182 is without18:27
mriedem|          populate_queued_for_delete         |      0       |     0     |18:27
imacdonn\o/18:27
mriedemhttps://github.com/openstack/nova/blob/e658f41d686e4533640b101622f2342348c0316d/nova/objects/request_spec.py#L707 is the example where total can be >0 but we don't actually migrate anything18:28
mriedemso that with the explanation here https://github.com/openstack/nova/blob/e658f41d686e4533640b101622f2342348c0316d/nova/cmd/manage.py#L374 is confusing18:29
imacdonnthat may be a bug18:29
mriedemit does say, "If found is nonzero and done is zero, some records are                           # not migratable, but all migrations that can complete have                           # finished."18:29
mriedem"not migrateable" should really be, "don't require migration"18:29
imacdonnper Dan's description, count_all should never be greater than max_count18:29
mriedema lot of the migrations return found==done because the query to find the $found number is filtering on things that need to be migrated18:30
mriedeme.g. select * bdms where uuid is None;18:30
mriedemfound == done ^18:30
mriedembut that's not the same with the request spec migration18:30
mriedemsince we have to hit 2 different dbs18:30
*** jamesdenton has joined #openstack-nova18:31
imacdonnpersonally I think the batch mechanism is a bit broken, at least as it's described in the comments18:31
mriedemas i mentioned on your change, the 'Total Needed' column is misleading18:31
mriedemhttp://logs.openstack.org/29/605329/2/check/neutron-grenade/2200365/logs/grenade.sh.txt.gz#_2018-09-27_11_17_32_53618:31
imacdonnbut, as you said, we need Dan for that conversation18:31
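
To make the found/done semantics being debated concrete: each online data migration is a callable taking (context, max_count) and returning a (found, done) pair, and the runner loops until a batch migrates nothing. A simplified sketch of that contract, not nova's actual implementation:

    def run_migration(migration_fn, ctxt, batch_size=50):
        """Drive one online data migration to completion in batches.

        migration_fn(ctxt, max_count) returns (found, done): 'found' is
        how many candidate rows the query matched, 'done' is how many
        rows this batch actually migrated.
        """
        total_found = total_done = 0
        while True:
            found, done = migration_fn(ctxt, batch_size)
            total_found += found
            total_done += done
            if done == 0:
                # Nothing migrated this pass. If found > 0 here, the
                # query matched rows that did not actually need (or
                # could not take) migration -- the confusing
                # "Total Needed" > "Completed" case in the grenade logs.
                break
        return total_found, total_done
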
mnaserso has anyone ever thought about what happens when we hit instance-ffffffff18:41
imacdonnfsshhh ... that'll never happen18:43
melwittin case anyone is wondering about the failing ceph job, I'm trying out a fix here https://review.openstack.org/60583318:44
* mordred waves to melwitt and mriedem and mnaser having seen conversation in the past about sdk and osc related things18:47
*** jistr has quit IRC18:47
mnasermordred: it's a bit of a difficult position, but ideally figuring out the best way to deal with cold/live migration and reworking it.. (openstack server migrate)18:48
mordredyah. as you know, the sdk code for that is ... fun :)18:48
mnasermainly my issue was osc forces you to specify a host when it's optional18:48
mordredfwiw - mriedem is right - osc doesn't currently do version negotiation. once we start migrating it to sdk though, it'll pick up that ability18:48
* mnaser looks forward to that but knows that's quite the effort18:49
*** jistr has joined #openstack-nova18:49
mriedemmnaser: easy: make --live just an option with no value, add --host (optional, takes a value), and add --cold18:49
mnasermriedem: but the not breaking scripts part i guess18:49
mriedemor, let --live take a value for compat but proxy it to --host if --host isn't used18:50
mriedemi'm not sure how you could specify --live w/o a host though if --live can take a host18:50
mriedemgd CLIs18:50
mriedemempty string?18:50
mriedempretty janky18:50
mriedemopenstack server migrate --live-but-with-no-host-seriously my_server18:51
mordredmriedem: ++18:51
mordredthat's totally the right answer18:51
mriedemdo i win something?18:51
mordredyou win this bucket of parts I just found18:52
*** tbachman has quit IRC18:54
openstackgerritMatt Riedemann proposed openstack/nova master: Add more documentation for online_data_migrations CLI  https://review.openstack.org/60583618:54
mriedemimacdonn: efried: ^ does this make life better?18:54
imacdonnonly slightly, IMO18:55
mriedem:(18:56
imacdonnif those two rows don't need migration, then they shouldn't be included in something named "Total Needed"......... ?18:57
openstackgerritMatt Riedemann proposed openstack/nova stable/queens: nova-manage - fix online_data_migrations counts  https://review.openstack.org/60583918:57
mriedemlike i said, total needed is a bad title,18:57
imacdonnif "Total Needed" means "Total Rows that exist that may or may not need it", we should see a lot less zeroes18:57
mriedemi'm not sure if renaming that to Total Found breaks any kind of compat,18:57
mriedemTotal Candidates18:58
mriedemsomething like that18:58
mriedemno one should be parsing the output of this command for column headers and such anyway18:58
efriedhah18:58
imacdonnI suspect that the existing migration methods may already interpret it inconsistently, but I haven't analysed it to confirm18:58
efriedyeah, to me, seeing needed=2/completed=0 feels like it should be an error18:59
openstackgerritMatt Riedemann proposed openstack/nova stable/pike: nova-manage - fix online_data_migrations counts  https://review.openstack.org/60584019:00
efriedI mean, this is better than it was before, because there's at least *some* attempt to explain wtf is going on.19:01
efriedIs this dansmith's bailiwick btw? Something he would want to review?19:01
*** tbachman has joined #openstack-nova19:05
mriedemi assume he would yes19:05
openstackgerritMatt Riedemann proposed openstack/nova stable/ocata: nova-manage - fix online_data_migrations counts  https://review.openstack.org/60584219:06
*** jistr has quit IRC19:08
*** jistr has joined #openstack-nova19:08
*** itlinux has quit IRC19:09
mordredmriedem, cfriesen: I just left a suggestion on the osc live migration patch about a way to make --live take an optional argument19:11
cfriesenmordred: sweet, I think that's probably the best way to handle backwards compatibility19:12
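
The gerrit comment itself isn't visible in the log, but the usual way to give an argparse flag an optional value -- presumably the shape of mordred's suggestion, since osc is built on cliff, which wraps argparse -- is nargs='?' with a const sentinel. A standalone sketch that also shows the ordering ambiguity mriedem raises above:

    import argparse

    parser = argparse.ArgumentParser(prog='server migrate')
    parser.add_argument(
        '--live',
        metavar='<host>',
        nargs='?',
        const='',       # value used when --live is given with no host
        default=None,   # value used when --live is absent entirely
        help='Live migrate the server, optionally to the given host')
    parser.add_argument('server')

    # "--live <host>" keeps the old host-taking form working.
    args = parser.parse_args(['--live', 'dest-host', 'my_server'])
    assert args.live == 'dest-host' and args.server == 'my_server'

    # Bare "--live" has to come after the positional, otherwise argparse
    # swallows the server name as the host value -- the very ambiguity
    # discussed above.
    args = parser.parse_args(['my_server', '--live'])
    assert args.live == '' and args.server == 'my_server'
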
mriedemthere are two, but i found it19:13
*** tbachman has quit IRC19:14
mriedemi like that idea yeah19:14
mriedemnote you can also cold migrate and specify a target host now...18:14
mriedemso i'm not sure how that would play with this too18:14
imacdonnI always thought that was weird .. that you can't do that (but can for live)19:15
imacdonn(always => since icehouse days, at least)19:15
cfriesenimacdonn: artificial OSC limitation19:16
cfriesenimacdonn: the compute API lets you specify a host since 2.5619:16
imacdonnhmm19:16
cfriesen(which is admittedly fairly new)19:17
imacdonnright .. I was just checking ;)19:17
imacdonnso "the struggle was real" when I last looked19:18
artomI'm guessing splitting live and cold migration into different subcommands is no longer an option at this point, right?19:18
artomSince they're, you know, fundamentally different operations?19:18
cfriesenartom: to the end user, they're very similar.19:19
artomcfriesen, you mean besides the fact that your workload goes down?19:19
artom;)19:19
artomAnd that live migration is admin-only (by default)?19:20
cfriesenartom: cold is too, isn't it?19:21
artomcfriesen, doh, you're right19:22
imacdonnto a typical sysadmin (I'm thinking private cloud), "migrate" means "VM is running on node A, and I want it to be running <somewhere else>" .. if we can do it without shutting the VM OS down, that'll make my life better19:22
artomWait no, I was looking at the wrong bit of api-ref19:22
imacdonnfrom that perspective, they're basically the same thing19:22
artom... aaand no, still admin-only19:22
*** jistr has quit IRC19:23
artomI dunno, there's just a whole bunch of things that cold migration can do that live migration can't19:23
artomAnd anyways, I was being rhetorical, we're obviously not going to limit osc migrate to cold migration and add a new osc live-migrate at this point.19:24
*** jistr has joined #openstack-nova19:26
mriedemcold migrate is admin only, resize is non-admin19:27
mriedemw/ cold migrate under the covers19:27
artomKinky.19:27
mriedemonly on anniversaries dude19:28
*** jistr has quit IRC19:28
mriedemresize and cold migrate have been married awhile19:28
artomYeah, I'm surprised there's still anything going on under the covers.19:28
imacdonnbetter than cold feet19:29
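
For reference, the policy defaults behind that exchange: in nova's default policy of this era, cold and live migration are admin-only while resize is owner-or-admin. In policy.json terms (names taken from nova's policy definitions; override values are deployment-specific):

    {
        "os_compute_api:os-migrate-server:migrate": "rule:admin_api",
        "os_compute_api:os-migrate-server:migrate_live": "rule:admin_api",
        "os_compute_api:servers:resize": "rule:admin_or_owner"
    }
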
*** jistr has joined #openstack-nova19:29
*** jistr has quit IRC19:34
*** jistr has joined #openstack-nova19:37
*** hamzy has joined #openstack-nova19:38
*** jistr has quit IRC19:39
*** hamzy has quit IRC19:44
*** jistr has joined #openstack-nova19:49
*** tbachman has joined #openstack-nova19:50
*** pcaruana has quit IRC19:54
*** Nel1x has joined #openstack-nova20:04
mriedemmelwitt: i think the vmware live migration change can go into a runway slot20:07
mriedemrgerganov updated it to get ci passing,20:07
mriedemi've done another pass, still -1 but it's closer20:07
melwittok, cool. I'll add it then. missed the +1 vote from the vmware CI earlier today after I rechecked it20:08
*** hamzy has joined #openstack-nova20:09
*** med_ has joined #openstack-nova20:10
*** hamzy has quit IRC20:20
*** hamzy has joined #openstack-nova20:20
*** hamzy has quit IRC20:25
*** hamzy has joined #openstack-nova20:28
*** macza has quit IRC20:31
*** macza has joined #openstack-nova20:32
*** artom has quit IRC20:33
*** hamzy has quit IRC20:34
*** hamzy has joined #openstack-nova20:34
*** dpawlik has joined #openstack-nova20:42
*** hamzy has quit IRC20:43
*** dpawlik has quit IRC20:46
melwittnova meeting in 11 minutes20:49
melwitt10 minutes20:50
mriedemmelwitt: want to hit this backport? https://review.openstack.org/#/c/605260/ - would be good to get those reverts in stable merged/released since they regressed blazar20:52
*** takashin has joined #openstack-nova20:52
mriedemthe fix to replace the original is stacked on top of it on master but the gate is....not cooperating20:52
melwittcan do. any idea why lee removed his vote?20:53
mriedemno idea20:57
mriedemclassic lee20:57
*** awaugama has quit IRC21:01
*** erlon has quit IRC21:04
*** jamesdenton has quit IRC21:15
*** mchlumsky has quit IRC21:37
*** mriedem has quit IRC21:55
*** burt has quit IRC21:59
*** mriedem has joined #openstack-nova22:07
*** scarab_ has joined #openstack-nova22:09
*** scarab_ has quit IRC22:11
*** mvkr has joined #openstack-nova22:13
*** rcernin has joined #openstack-nova22:29
*** takashin has left #openstack-nova22:31
*** macza has quit IRC22:38
*** macza has joined #openstack-nova22:38
*** dpawlik has joined #openstack-nova22:42
cfriesenis there an equivalent of CONF.reserved_huge_pages but for regular memory?  (per numa node though)22:45
cfriesenI can see CONF.reserved_host_memory_mb but that's not specifically 4k pages and is per compute node, not numa node.22:46
*** dpawlik has quit IRC22:47
openstackgerritMatt Riedemann proposed openstack/nova stable/pike: Update RequestSpec.flavor on resize_revert  https://review.openstack.org/60587922:47
*** macza has quit IRC22:51
*** macza_ has joined #openstack-nova22:51
openstackgerritMatt Riedemann proposed openstack/nova stable/ocata: Update RequestSpec.flavor on resize_revert  https://review.openstack.org/60588023:01
mriedemstable branch core review please https://review.openstack.org/#/c/600113/23:03
cfriesenmriedem: what would you think of something like CONF.reserved_huge_pages but for regular memory?  (ie, to reserve specific amounts of 4K memory on each host numa node)23:08
cfriesenor can CONF.reserved_huge_pages be used for 4K pages as well, even though the name implies otherwise?23:10
mriedemcfriesen: i don't think you realize that i don't know anything about that nova/virt/hardware stuff23:11
cfriesenheh23:11
cfriesenI'm off to go dig through code23:11
mriedemso not reserved_host_ram or whatever we have?23:11
mriedemthat goes on the compute node?23:11
cfriesenyeah, "reserved_host_memory_mb" is per compute node23:12
mriedemreserved_host_memory_mb23:12
mriedemand you want something to reserve ram per numa node?23:12
*** artom has joined #openstack-nova23:12
openstackgerritMatt Riedemann proposed openstack/nova stable/ocata: Fix instance evacuation with PCI devices  https://review.openstack.org/60588123:13
mriedemmelwitt: fyi i'm trying to flush through my stable queens and pike changes which also apply to ocata, and then gonna probably send a thing to the ML to wrassle a stable branch review sprint for next week23:13
mriedemto flush all stable branches so we can get ocata released and tagged for EM23:13
mriedemit took an entire week to just get stuff merged for the last round of stable releases23:14
mriedemb/c of the gate23:14
cfriesenmriedem: after looking at the code, I think I could use  CONF.reserved_huge_pages to reserve 4K memory per numa node, even though it's not actually huge pages.  now to actually try it out23:16
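
If that pans out, the nova.conf syntax would presumably mirror the documented hugepage form, with sizes in KiB and counts in pages. Whether size:4 is actually accepted for small pages is exactly what is being tested here, so treat this as an unverified sketch:

    [DEFAULT]
    # Reserve 256 MiB of 4K pages on NUMA node 0 and 1 GiB of 2M pages
    # on node 1 (size is the page size in KiB, count is in pages).
    reserved_huge_pages = node:0,size:4,count:65536
    reserved_huge_pages = node:1,size:2048,count:512
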
melwittmriedem: sounds good23:17
*** macza_ has quit IRC23:18
openstackgerritMatt Riedemann proposed openstack/nova stable/ocata: Update nova network info when doing rebuild for evacuate operation  https://review.openstack.org/60588223:20
*** dpawlik has joined #openstack-nova23:20
*** dpawlik has quit IRC23:25
openstackgerritMatt Riedemann proposed openstack/nova stable/ocata: unquiesce instance after quiesce failure  https://review.openstack.org/60588423:25
*** mriedem has quit IRC23:26
*** Swami has quit IRC23:45
*** erlon has joined #openstack-nova23:57
