Friday, 2018-02-16

tonybmriedem: any chance you can take a quick look at: https://review.openstack.org/#/c/543348 ?00:03
*** salv-orlando has quit IRC00:06
*** salv-orlando has joined #openstack-nova00:07
*** claudiub has quit IRC00:09
openstackgerritNicolas Bock proposed openstack/nova stable/pike: Fix SUSE Install Guide: Placement port  https://review.openstack.org/54516700:09
melwittmnaser, mriedem: I'm going through your comments on https://review.openstack.org/#/c/340614/18/nova/compute/api.py@2055 and I'm not seeing what the 'available' volume scenario is ...00:11
*** salv-orlando has quit IRC00:12
melwittthat is, even back in liberty, if the instance failed on the host, _cleanup_volumes in the compute manager would have called volume_api.delete, so maybe that's why the detach would fail in the compute api local delete ... but that doesn't add up with having a volume left in 'available' state00:13
*** AlexeyAbashkin has joined #openstack-nova00:13
mnasermelwitt: in that case you are right there wouldn’t be any possible scenario where you would end up in an available volume if it failed in compute00:13
melwittso I'm not really sure what case the new except block is handling00:14
melwitt(I didn't add that part but back when this was going on in liberty, an operator we were working with added that saying they ran into it when they were testing the patch)00:14
*** mlavalle has quit IRC00:15
melwittI'm trying to figure out whether to nix it from this patch or if there's some scenario I'm missing where we need to handle it like this00:15
mnasermelwitt: I just got home so I’m on mobile but can you boot from volume with an existing volume that’s also delete on terminate?00:16
mnaserThat would be a scenario where you’d want us to delete it, assuming scheduling failed and we want to follow through and delete the volume as requested00:16
mnaserThough that might be super confusing to the user and unexpected00:16
*** AlexeyAbashkin has quit IRC00:18
cfriesenmnaser: in that case wouldn't the instance sit around in an ERROR state with the volume still attached?00:20
cfriesenie the volume should stay attached until the instance is deleted00:20
melwittmnaser: cfriesen it was that way until very recently https://review.openstack.org/#/c/528385/00:21
mnaserWill the volume ever end up in an attached state?  I think it would be in attaching because it never reached the compute node to finish the attach00:23
*** hiro-kobayashi has joined #openstack-nova00:27
openstackgerritMerged openstack/nova stable/queens: Refine waiting for vif plug events during _hard_reboot  https://review.openstack.org/54273800:27
openstackgerritMerged openstack/nova stable/queens: Don't JSON encode instance_info.traits for ironic  https://review.openstack.org/54503700:27
openstackgerritMerged openstack/nova stable/queens: Use correct arguments in task inits  https://review.openstack.org/54410900:27
*** chyka has quit IRC00:27
openstackgerritMerged openstack/nova stable/queens: Update UPPER_CONSTRAINTS_FILE for stable/queens  https://review.openstack.org/54265700:28
melwittmnaser: yeah, you're right, in the scenario you're describing, BFV with already existing volume + delete_on_termination, if scheduling fails, the volume would never have been attached00:28
melwittso it would be available, so detach would fail, so we would want to delete the volume.00:28
melwitter, detach might just do a no-op if available, but I'm not sure on that00:29
mnasermelwitt: wouldn’t it actually be in “attaching” state because the API layer reserves the volume00:31
melwittmnaser: oh, yeah good point. it will be whatever the reserve did00:32
mnasermelwitt: yeah so it would have an attachment or reservation depending on flow, same issue we’re trying to resolve here00:32
mnasermelwitt: I do think the extra try except added in this patch is useless if detach is noop00:34
mnaserIf detach is noop, nothing will happen. If it fails because the volume doesn’t exist, then we won’t try to delete it anyways00:34
melwittyeah. that's where I'm at too00:34
mnaserNow put multiattach in that equation lol..00:35
melwittbut at least I think now I understand why it was added long ago, if detach didn't used to do a no-op if not attached00:35
mnaserYeah it’s a bit of old code00:36
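[editor's note: a minimal sketch of the local-delete cleanup being debated above, assuming a simplified, duck-typed volume API (vol_api); this is illustrative and not the exact code in the patch under review. The point mnaser makes is that if detach is a no-op for a never-attached volume, a separate "available volume" except branch never fires:]

    def cleanup_volumes_on_local_delete(vol_api, context, bdms):
        """Illustrative only: roughly the flow discussed above, not nova's code."""
        for bdm in bdms:
            if not bdm.volume_id:
                continue
            try:
                # No-op if the volume was never attached (e.g. the build
                # failed before reaching the compute host), so an extra
                # handler for 'available' volumes around this adds nothing.
                vol_api.detach(context, bdm.volume_id)
            except vol_api.NotFound:  # assumed attribute of the hypothetical vol_api
                # Volume already gone; nothing left to detach or delete.
                continue
            if bdm.delete_on_termination:
                vol_api.delete(context, bdm.volume_id)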
*** gyee has quit IRC00:41
*** Dinesh_Bhor has joined #openstack-nova00:42
*** Dinesh_Bhor has quit IRC00:42
*** andreas_s has joined #openstack-nova00:45
*** chyka has joined #openstack-nova00:47
*** andreas_s has quit IRC00:49
*** tbachman has joined #openstack-nova00:50
*** hiro-kobayashi_ has joined #openstack-nova00:51
*** chyka has quit IRC00:52
*** hamzy has joined #openstack-nova01:02
*** felipemonteiro_ has quit IRC01:03
*** salv-orlando has joined #openstack-nova01:08
*** salv-orlando has quit IRC01:12
*** AlexeyAbashkin has joined #openstack-nova01:13
*** AlexeyAbashkin has quit IRC01:18
*** jobewan has joined #openstack-nova01:42
*** jobewan has quit IRC01:49
openstackgerritMerged openstack/nova stable/queens: Cleanup the manage-volumes admin doc  https://review.openstack.org/54514101:51
openstackgerritMerged openstack/nova stable/queens: Add admin guide doc on volume multiattach support  https://review.openstack.org/54514201:51
*** jobewan has joined #openstack-nova01:52
*** jobewan has quit IRC01:58
*** Tom-Tom has quit IRC01:58
*** Tom-Tom has joined #openstack-nova01:58
*** yamahata has quit IRC02:04
*** acormier has joined #openstack-nova02:04
*** salv-orlando has joined #openstack-nova02:08
*** hiro-kobayashi_ has quit IRC02:10
*** salv-orlando has quit IRC02:13
*** hongbin has joined #openstack-nova02:18
*** tidwellr has quit IRC02:26
*** mriedem1 has joined #openstack-nova02:29
*** mriedem has quit IRC02:29
mriedem1cfriesen: the method on the instance is just a helper method02:29
*** mriedem1 is now known as mriedem02:29
mriedemmelwitt: mnaser: did you sort out the available volume thing?02:30
mriedemtonyb: done02:30
tonybmriedem: Thanks.02:32
*** takashin has left #openstack-nova02:32
mriedemimacdonn: if you can, leave a comment on the patch that fixed the issue for you to support its greatness02:34
mriedemimacdonn: nvm i see you already did, thanks02:34
mriedemlyarwood: can you hit the +2ed stable/queens backports in your morning? https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/queens - then we'll cut RC202:35
mriedemtonyb: or i guess you could do that ^02:35
tonybmriedem: Oh I can?  I didn't think I could between rc1 and release02:36
* tonyb checks02:36
mriedemsmcginnis was able to, so i assume you can02:36
tonybmriedem: Oh look at that ...02:36
* tonyb was intentionally ignoring queens as opposed to pike and ocata where it was just an accident ;p02:37
mnasermriedem: i think so?02:41
tonybmriedem: done02:45
*** slaweq has joined #openstack-nova02:46
*** slaweq has quit IRC02:50
mriedemtonyb: nice thanks02:50
tonybmriedem: If you create the tag request with a -W I can update it with the final SHA once they merge ... or we can wait 'til next week02:51
*** jogo has quit IRC02:51
openstackgerritMerged openstack/osc-placement master: tox.ini settings for global constraints are out of date  https://review.openstack.org/54334802:54
mriedemi shall do so02:55
tonybmriedem: cool02:56
*** vladikr has quit IRC02:57
*** vladikr has joined #openstack-nova02:58
*** jogo has joined #openstack-nova02:58
tonybmriedem, stephenfin, melwitt: I seem to recall that we said for backports of docs only changes we'd only require 1 stable-core to approve.  Did I make that up?02:59
*** dave-mccowan has joined #openstack-nova03:01
mriedemidk03:02
*** claudiub has joined #openstack-nova03:02
mriedemif it's a docs bug fix then i think that's probably ok03:02
mriedemif it's another stable core that backported it then i'm also ok with fast approve on those03:02
mriedemalright gotta go03:06
*** mriedem has quit IRC03:07
*** salv-orlando has joined #openstack-nova03:09
*** dave-mcc_ has joined #openstack-nova03:14
*** salv-orlando has quit IRC03:14
*** dave-mccowan has quit IRC03:15
*** sree has joined #openstack-nova03:19
*** claudiub has quit IRC03:22
*** harlowja_ has quit IRC03:23
*** vivsoni has joined #openstack-nova03:24
*** bhujay has joined #openstack-nova03:26
*** bhujay has quit IRC03:37
*** bhujay has joined #openstack-nova03:37
*** abhishekk has joined #openstack-nova03:39
*** dave-mcc_ has quit IRC03:41
*** acormier has quit IRC03:43
*** jaianshu has joined #openstack-nova03:46
*** sridharg has joined #openstack-nova03:54
*** bhujay has quit IRC03:55
*** bhujay has joined #openstack-nova03:56
*** bhujay has quit IRC04:01
*** yamamoto has joined #openstack-nova04:05
*** mdnadeem has joined #openstack-nova04:08
*** udesale has joined #openstack-nova04:09
*** bhujay has joined #openstack-nova04:12
*** andreas_s has joined #openstack-nova04:13
*** andreas_s has quit IRC04:17
*** salv-orlando has joined #openstack-nova04:23
*** psachin has joined #openstack-nova04:25
*** bhujay has quit IRC04:28
*** salv-orlando has quit IRC04:29
*** bhujay has joined #openstack-nova04:29
*** cfriesen has quit IRC04:29
*** sree has quit IRC04:34
*** sree has joined #openstack-nova04:34
*** janki|afk has joined #openstack-nova04:35
*** lpetrut has joined #openstack-nova04:36
*** hongbin has quit IRC04:45
*** bhujay has quit IRC04:45
*** harlowja has joined #openstack-nova04:46
*** slaweq has joined #openstack-nova04:46
*** slaweq has quit IRC04:51
*** yamamoto_ has joined #openstack-nova04:51
openstackgerritMerged openstack/nova stable/queens: Bindep does not catch missing libpcre3-dev on Ubuntu  https://review.openstack.org/54410804:52
openstackgerritMerged openstack/nova stable/queens: doc: fix the link for the evacuate cli  https://review.openstack.org/54350704:52
openstackgerritMerged openstack/nova stable/queens: Add regression test for BFV+IsolatedHostsFilter failure  https://review.openstack.org/54359404:52
*** janki|afk has quit IRC04:54
*** janki has joined #openstack-nova04:54
*** yamamoto has quit IRC04:55
*** ameeda has quit IRC04:57
*** harlowja has quit IRC04:59
openstackgerritMerged openstack/nova stable/queens: Handle volume-backed instances in IsolatedHostsFilter  https://review.openstack.org/54359505:00
openstackgerritMerged openstack/nova stable/queens: Fix docs for IsolatedHostsFilter  https://review.openstack.org/54359605:00
*** rmcall has joined #openstack-nova05:02
*** ansiwen has quit IRC05:13
*** mdbooth has quit IRC05:13
*** lpetrut has quit IRC05:13
openstackgerritMerged openstack/nova stable/queens: Make bdms querying in multi-cell use scatter-gather and ignore down cell  https://review.openstack.org/54348905:15
openstackgerritMerged openstack/nova stable/queens: VGPU: Modify the example of vgpu white_list set  https://review.openstack.org/54288205:16
*** priteau has joined #openstack-nova05:19
*** lpetrut has joined #openstack-nova05:19
*** harlowja has joined #openstack-nova05:19
*** priteau has quit IRC05:23
*** inara has quit IRC05:23
*** salv-orlando has joined #openstack-nova05:25
*** harlowja has quit IRC05:27
*** salv-orlando has quit IRC05:29
*** inara has joined #openstack-nova05:31
*** lpetrut has quit IRC05:35
*** claudiub has joined #openstack-nova05:40
*** hoonetorg has quit IRC05:40
*** acormier has joined #openstack-nova05:43
*** acormier has quit IRC05:48
*** hoonetorg has joined #openstack-nova05:53
*** Tom-Tom has quit IRC06:10
*** bhujay has joined #openstack-nova06:11
openstackgerritmelanie witt proposed openstack/nova master: Clean up ports and volumes when deleting ERROR instance  https://review.openstack.org/34061406:13
openstackgerritmelanie witt proposed openstack/nova master: Add functional recreate test of deleting a BFV server pre-scheduling  https://review.openstack.org/54512306:13
openstackgerritmelanie witt proposed openstack/nova master: Detach volumes when deleting a BFV server pre-scheduling  https://review.openstack.org/54513206:13
*** threestrands has quit IRC06:20
*** salv-orlando has joined #openstack-nova06:25
*** openstackstatus has quit IRC06:27
*** openstack has joined #openstack-nova06:29
*** ChanServ sets mode: +o openstack06:29
*** Tom-Tom has joined #openstack-nova06:29
*** salv-orlando has quit IRC06:30
*** salv-orlando has joined #openstack-nova06:32
*** jafeha__ is now known as jafeha06:36
*** yamahata has joined #openstack-nova06:37
openstackgerritmelanie witt proposed openstack/nova master: Add periodic task to clean expired console tokens  https://review.openstack.org/32538106:37
openstackgerritmelanie witt proposed openstack/nova master: Use ConsoleAuthToken object to generate authorizations  https://review.openstack.org/32541406:37
openstackgerritmelanie witt proposed openstack/nova master: Convert websocketproxy to use db for token validation  https://review.openstack.org/33399006:37
*** andreas_s has joined #openstack-nova06:57
*** andreas_s has quit IRC07:01
*** lajoskatona has joined #openstack-nova07:19
*** threestrands has joined #openstack-nova07:22
*** threestrands has quit IRC07:22
*** threestrands has joined #openstack-nova07:23
*** threestrands has quit IRC07:23
*** threestrands has joined #openstack-nova07:23
*** AlexeyAbashkin has joined #openstack-nova07:26
*** threestrands has quit IRC07:34
*** stelucz_ has joined #openstack-nova07:40
*** yamamoto_ has quit IRC07:40
*** rcernin has quit IRC07:41
*** yamamoto has joined #openstack-nova07:42
*** yamamoto has quit IRC07:42
*** yamamoto has joined #openstack-nova07:43
stelucz_Hello, is there a nova client command to delete a resource_provider from the database? After running nova service-delete <id>, the record for the compute node still exists in the resource_providers table, so reprovisioning the compute node ends up with the message:  Another thread already created a resource provider with the UUID 7b01cc27-c101-4c05-aaed-5958ef1270a1. Grabbing that record from the placement API.07:43
*** lpetrut has joined #openstack-nova07:44
*** slaweq has joined #openstack-nova07:44
*** dtantsur|afk is now known as dtantsur07:46
*** yamamoto has quit IRC07:48
*** sree has quit IRC07:50
openstackgerritOpenStack Proposal Bot proposed openstack/nova master: Imported Translations from Zanata  https://review.openstack.org/54156107:57
*** vladikr has quit IRC07:57
*** vladikr has joined #openstack-nova07:58
*** jafeha has quit IRC08:01
openstackgerritHiroaki Kobayashi proposed openstack/osc-placement master: Use jsonutils of oslo_serialization  https://review.openstack.org/54523108:01
*** jafeha has joined #openstack-nova08:01
*** damien_r has joined #openstack-nova08:07
*** andreas_s has joined #openstack-nova08:10
*** alexchadin has joined #openstack-nova08:13
*** trinaths has joined #openstack-nova08:23
*** alexchadin has quit IRC08:24
*** alexchadin has joined #openstack-nova08:25
*** tesseract has joined #openstack-nova08:27
*** sree has joined #openstack-nova08:31
*** hiro-kobayashi has quit IRC08:31
*** salv-orlando has quit IRC08:32
*** amoralej|off is now known as amoralej08:32
*** salv-orlando has joined #openstack-nova08:32
*** salv-orlando has quit IRC08:37
*** gibi has joined #openstack-nova08:40
*** gibi has quit IRC08:41
*** gibi has joined #openstack-nova08:41
*** jpena|off is now known as jpena08:41
*** gibi is now known as giblet08:42
*** yamamoto has joined #openstack-nova08:44
*** salv-orlando has joined #openstack-nova08:45
*** david-lyle has quit IRC08:46
openstackgerritTetsuro Nakamura proposed openstack/nova master: [libvirt] Add _get_numa_memnode()  https://review.openstack.org/52990608:47
openstackgerritTetsuro Nakamura proposed openstack/nova master: [libvirt] Add _get_XXXpin_cpuset()  https://review.openstack.org/52763108:47
openstackgerritTetsuro Nakamura proposed openstack/nova master: Add NumaTopology support for libvirt/qemu driver  https://review.openstack.org/53045108:47
openstackgerritTetsuro Nakamura proposed openstack/nova master: disable cpu pinning with libvirt/qemu driver  https://review.openstack.org/53104908:47
*** slaweq_ has joined #openstack-nova08:48
*** yamamoto has quit IRC08:50
*** ralonsoh has joined #openstack-nova08:51
*** slaweq_ has quit IRC08:53
*** tssurya has joined #openstack-nova08:55
*** yangyapeng has joined #openstack-nova08:57
*** yamamoto has joined #openstack-nova08:59
*** salv-orlando has quit IRC09:04
*** salv-orlando has joined #openstack-nova09:05
hrwhow to mock nova.conf option in tests?09:05
*** mgoddard_ has joined #openstack-nova09:05
*** salv-orlando has quit IRC09:09
*** pcaruana has joined #openstack-nova09:11
*** belmoreira has joined #openstack-nova09:11
*** rmcall has quit IRC09:14
*** yamamoto has quit IRC09:15
*** rmcall has joined #openstack-nova09:15
*** ociuhandu has joined #openstack-nova09:18
*** yamahata has quit IRC09:21
*** rmcall has quit IRC09:23
*** ociuhandu has quit IRC09:23
*** priteau has joined #openstack-nova09:26
*** derekh has joined #openstack-nova09:29
*** bauzas is now known as bauwser09:30
bauwsergood Friday everyone09:30
*** acormier has joined #openstack-nova09:34
*** stephenfin is now known as finucannot09:35
openstackgerritTetsuro Nakamura proposed openstack/nova master: trivial: omit condition evaluations  https://review.openstack.org/54524809:36
gmann_hrw: you can by set_override - https://github.com/openstack/nova/blob/bfae5f28a4c3b39f7978d5f3015c1b32be81215d/nova/tests/functional/api_sample_tests/test_hide_server_addresses.py#L3009:41
hrwgmann_: thx09:41
hrw    oslo_config.cfg.NoSuchOptError: no such option num_of_pcie_slots in group [libvirt]09:44
hrwnow just have to find where tests fake whole nova.conf ;d09:44
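[editor's note: a minimal sketch of the set_override approach gmann_ links to, as it is normally spelled inside a nova test case via self.flags(); it assumes the new [libvirt] option has already been registered in nova's config modules (e.g. nova/conf/libvirt.py), otherwise oslo.config raises exactly the NoSuchOptError hrw pastes above. The class name, option name and value here are illustrative:]

    from oslo_config import cfg
    from nova import test

    CONF = cfg.CONF

    class PCIePortsTestCase(test.NoDBTestCase):
        def test_num_of_pcie_slots_override(self):
            # self.flags() wraps CONF.set_override() and is cleaned up
            # automatically at the end of the test; it only works once the
            # option is registered (otherwise: NoSuchOptError, as above).
            self.flags(num_of_pcie_slots=28, group='libvirt')
            self.assertEqual(28, CONF.libvirt.num_of_pcie_slots)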
*** links has joined #openstack-nova09:47
*** links has quit IRC09:49
*** acormier has quit IRC09:51
*** chyka has joined #openstack-nova09:54
*** priteau has quit IRC09:56
*** priteau has joined #openstack-nova09:57
*** links has joined #openstack-nova09:58
*** abhishekk has quit IRC09:58
*** chyka has quit IRC09:59
*** alexchadin has quit IRC10:01
*** priteau has quit IRC10:03
*** giblet has quit IRC10:03
openstackgerritMarcin Juszkiewicz proposed openstack/nova master: Allow to configure amount of PCIe ports in aarch64 instance  https://review.openstack.org/54503410:07
*** udesale_ has joined #openstack-nova10:07
hrwthis one adds test and new config option10:08
*** udesale has quit IRC10:10
stelucz_Hello, is there a nova client command to delete a resource_provider from the database? After running nova service-delete <id>, the record for the compute node still exists in the resource_providers table, so reprovisioning the compute node ends up with the message:  Another thread already created a resource provider with the UUID 7b01cc27-c101-4c05-aaed-5958ef1270a1. Grabbing that record from the placement API.10:11
*** links has quit IRC10:12
*** yamamoto has joined #openstack-nova10:15
*** udesale__ has joined #openstack-nova10:16
*** openstackgerrit has quit IRC10:18
*** ccamacho has joined #openstack-nova10:19
*** udesale_ has quit IRC10:19
*** yamamoto has quit IRC10:20
*** priteau has joined #openstack-nova10:23
*** jafeha__ has joined #openstack-nova10:25
*** jafeha has quit IRC10:27
*** dosaboy has quit IRC10:33
*** stakeda has quit IRC10:33
*** dosaboy has joined #openstack-nova10:38
*** trinaths has quit IRC10:38
*** sambetts|afk is now known as sambetts10:41
*** sree has quit IRC10:42
*** sree has joined #openstack-nova10:43
*** cdent has joined #openstack-nova10:46
*** sree has quit IRC10:47
* cdent blinks10:50
*** yamamoto has joined #openstack-nova10:51
*** alexchadin has joined #openstack-nova10:54
*** lucas-afk is now known as lucasagomes11:03
*** giblet has joined #openstack-nova11:05
*** alexchadin has quit IRC11:10
*** openstackgerrit has joined #openstack-nova11:15
openstackgerritMerged openstack/nova master: api-ref: provide more detail on what a provider aggregate is  https://review.openstack.org/53903311:15
*** alexchadin has joined #openstack-nova11:22
*** chyka has joined #openstack-nova11:25
*** belmoreira has quit IRC11:28
*** udesale_ has joined #openstack-nova11:29
*** chyka has quit IRC11:29
*** udesale__ has quit IRC11:32
*** udesale_ has quit IRC11:34
*** AlexeyAbashkin has quit IRC11:37
*** sdague has joined #openstack-nova11:39
*** AlexeyAbashkin has joined #openstack-nova11:40
*** alexchadin has quit IRC11:41
*** alexchadin has joined #openstack-nova11:42
openstackgerritMarcin Juszkiewicz proposed openstack/nova master: Allow to configure amount of PCIe ports in aarch64 instance  https://review.openstack.org/54503411:45
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Move __init__ function  https://review.openstack.org/53822311:48
openstackgerritStephen Finucane proposed openstack/nova master: tox: Add mypy target  https://review.openstack.org/53822111:48
openstackgerritStephen Finucane proposed openstack/nova master: tox: Store list of converted files  https://review.openstack.org/53822211:48
openstackgerritStephen Finucane proposed openstack/nova master: mypy: Add type annotations to 'nova.pci'  https://review.openstack.org/53822411:48
openstackgerritStephen Finucane proposed openstack/nova master: zuul: Add 'mypy' job  https://review.openstack.org/53916811:48
*** jaianshu has quit IRC11:49
*** acormier has joined #openstack-nova11:52
*** yamamoto has quit IRC11:56
*** acormier has quit IRC11:56
*** yamamoto has joined #openstack-nova11:59
*** links has joined #openstack-nova12:03
*** cdent has quit IRC12:08
*** yamamoto has quit IRC12:08
*** salv-orlando has joined #openstack-nova12:10
*** yamamoto has joined #openstack-nova12:11
*** psachin has quit IRC12:14
*** bhujay has quit IRC12:15
*** chyka has joined #openstack-nova12:24
*** psachin has joined #openstack-nova12:25
*** larsks has joined #openstack-nova12:25
*** elmaciej has joined #openstack-nova12:26
larsksHey nova folk: there are some libvirt guests on our pike compute nodes that have an empty <nova:owner> attribute in the libvirt xml (which is causing ceilometer to blow up).  Is an empty owner a known problem? Or is that expected behavior in some circumstances?12:26
hrwlarsks: maybe owner account got removed from system?12:27
larskshrw: the owner is the associated project, right?  The project still exists.12:28
*** elmaciej_ has joined #openstack-nova12:28
larsksAnd would nova update the xml on a running guest like that in any case?12:28
hrw<nova:owner><nova:user /><nova:project /></nova:owner>12:29
*** chyka has quit IRC12:29
hrwlarsks: sorry, no idea. I just hack nova12:29
hrwand use bigger and bigger knives each time12:30
larsksFair enough :).12:30
*** elmaciej has quit IRC12:32
*** janki has quit IRC12:35
*** cdent has joined #openstack-nova12:38
*** udesale has joined #openstack-nova12:41
*** yamamoto has quit IRC12:41
*** belmoreira has joined #openstack-nova12:43
*** sree has joined #openstack-nova12:43
*** jpena is now known as jpena|lunch12:48
cdentfinucannot: has mypy found some juicy bugs in nova yet?12:49
*** slaweq_ has joined #openstack-nova12:50
*** slaweq_ has quit IRC12:54
*** sree has quit IRC12:55
*** janki has joined #openstack-nova13:06
*** sree has joined #openstack-nova13:06
*** dave-mccowan has joined #openstack-nova13:08
*** nicolasbock has joined #openstack-nova13:10
*** nicolasbock has quit IRC13:10
*** sree has quit IRC13:14
*** r-daneel has joined #openstack-nova13:14
*** yamamoto has joined #openstack-nova13:15
*** janki has quit IRC13:25
*** hshiina has quit IRC13:29
*** jaypipes is now known as leakypipes13:31
*** dave-mccowan has quit IRC13:31
*** vladikr has quit IRC13:31
hrwanyone know how to tell whether a guest instance is i440fx or q35?13:32
*** leakypipes has quit IRC13:33
hrwfound. guest.os_mach_type13:35
finucannotcdent: Nothing exciting yet but it will as it expands. I started with 'nova.pci' simply because comprehending that stuff is currently a nightmare and types help13:37
* cdent nods at finucannot 13:37
cdentWill be cool to see how things play out13:38
*** psachin has quit IRC13:42
*** eharney has quit IRC13:43
*** awaugama has joined #openstack-nova13:49
*** jpena|lunch is now known as jpena13:51
*** acormier has joined #openstack-nova13:52
*** liverpooler has joined #openstack-nova13:55
*** mlavalle has joined #openstack-nova13:56
*** acormier has quit IRC13:56
*** pchavva has joined #openstack-nova13:59
*** cdent has quit IRC14:00
*** ralonsoh_ has joined #openstack-nova14:00
openstackgerritMarcin Juszkiewicz proposed openstack/nova master: Allow to configure amount of PCIe ports in aarch64 instance  https://review.openstack.org/54503414:01
hrwone step closer14:01
hrwshit. have to fix commit message now14:01
openstackgerritMarcin Juszkiewicz proposed openstack/nova master: Allow to configure amount of PCIe ports  https://review.openstack.org/54503414:02
*** mriedem has joined #openstack-nova14:03
*** ralonsoh has quit IRC14:03
openstackgerritEric Young proposed openstack/nova master: Enable native mode for ScaleIO volumes  https://review.openstack.org/54530414:06
*** dave-mccowan has joined #openstack-nova14:07
hrwone simple thing, few projects involved to find out what is going on. nova, libvirt, qemu, uefi :(14:07
*** edleafe is now known as figleaf14:08
*** acormier has joined #openstack-nova14:08
*** udesale has quit IRC14:08
*** udesale_ has joined #openstack-nova14:08
*** sree has joined #openstack-nova14:10
*** acormier has quit IRC14:11
*** acormier has joined #openstack-nova14:12
*** sree has quit IRC14:15
*** acormier has quit IRC14:16
*** andreas_s has quit IRC14:24
*** rmcall has joined #openstack-nova14:24
*** andreas_s has joined #openstack-nova14:25
*** mdnadeem has quit IRC14:25
*** eharney has joined #openstack-nova14:26
*** jaypipes has joined #openstack-nova14:28
*** sree has joined #openstack-nova14:28
*** jaypipes is now known as leakypipes14:28
*** acormier has joined #openstack-nova14:29
*** stelucz_ has quit IRC14:29
mnaserthis simple clean up has been a lot messier than expected :(14:30
*** felipemonteiro_ has joined #openstack-nova14:30
*** lucasagomes is now known as lucas-hungry14:31
mnasermelwitt: for some reason that example you have listed has multiple attachments so that’s probably why it still is marked as reserved?   mriedem maybe can confirm this?14:31
*** elmaciej_ has quit IRC14:32
*** elmaciej has joined #openstack-nova14:33
*** felipemonteiro__ has joined #openstack-nova14:34
*** acormier has quit IRC14:34
*** efried is now known as fried_rice14:34
*** sree has quit IRC14:35
*** andreas_s has quit IRC14:35
finucannotbauwser: Could you knock this one through? https://review.openstack.org/#/c/538223/14:36
fried_ricestelucz: I believe osc has the command you're looking for.  But it doesn't sound like you need to delete it in that scenario.  The message you're seeing is not an error.14:37
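[editor's note: the osc command fried_rice is referring to comes from the osc-placement plugin; a hedged example, assuming the plugin is installed and usable against this deployment's placement API. As noted above, it shouldn't actually be needed here, since the "Another thread already created..." message is informational, not an error:]

    openstack resource provider list
    openstack resource provider delete 7b01cc27-c101-4c05-aaed-5958ef1270a1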
*** alexchadin has quit IRC14:37
*** felipemonteiro_ has quit IRC14:37
*** elmaciej has quit IRC14:39
*** elmaciej has joined #openstack-nova14:40
*** sree has joined #openstack-nova14:42
*** amoralej is now known as amoralej|lunch14:43
*** dansmith is now known as superdan14:43
*** elmaciej_ has joined #openstack-nova14:43
*** elmaciej has quit IRC14:46
*** jistr is now known as jistr|mtg14:52
*** sree has quit IRC14:52
*** andreas_s has joined #openstack-nova14:54
melwittmnaser: multiple attachments? no14:54
lyarwoodsean-k-mooney: https://bugs.launchpad.net/os-vif/+bug/1749972 - finucannot pointed me in your direction regarding this odd `brctl setageing $bridge 0` behaviour I've just stumbled across with the latest 16.04 kernel, would you mind taking a look?14:55
openstackLaunchpad bug 1749972 in os-vif "`brctl setageing $bridge 0` fails on Ubuntu 16.04 4.4.0-21-generic" [Undecided,New]14:55
*** salv-orl_ has joined #openstack-nova14:55
mnasermelwitt: in your curl example, I saw multiple attachments returned14:55
mnaserI count 414:55
mnaser(To 4 différence instances)14:55
mnaserDifferent*14:55
openstackgerritEric Fried proposed openstack/nova master: api-ref: Further clarify placement aggregates  https://review.openstack.org/54535614:56
fried_ricefigleaf: ^^^14:56
melwittmnaser: oh, weird. I didn't notice that14:56
fried_ricemriedem, superdan: bauwser also ^14:56
mnasermelwitt: I can only guess that maybe you tried to run this before applying patch so a few got accumulated? Or maybe it’s a multiattach volume?14:57
melwittI had been reusing the same volume as I messed up a bunch of tests. I'll try again with a new one14:57
bauwserfried_rice: +214:58
fried_ricethx14:58
bauwserwe can iterate long about docs :)14:58
*** r-daneel has quit IRC14:58
*** salv-orlando has quit IRC14:58
bauwserbut I'm fine14:58
*** andreas_s has quit IRC14:58
*** mdbooth has joined #openstack-nova14:58
*** ansiwen has joined #openstack-nova14:58
*** pooja_jadhav has quit IRC15:00
fried_riceAs long as each iteration is an improvement over the previous, agree that we need not make every patch perfect.15:00
melwittmnaser: indeed you're right, it works fine with a fresh volume with only one attachment. thanks for pointing that out15:04
*** andreas_s has joined #openstack-nova15:04
*** sree has joined #openstack-nova15:04
*** amodi has joined #openstack-nova15:04
melwittI'll respin it to remove that code comment I wrote that was wrong15:04
mnasermelwitt: awesome!15:04
*** felipemonteiro__ has quit IRC15:06
*** felipemonteiro_ has joined #openstack-nova15:06
*** andreas_s has quit IRC15:08
*** acormier has joined #openstack-nova15:09
*** acormier has quit IRC15:13
*** lbragstad has quit IRC15:14
*** jistr|mtg is now known as jistr15:15
*** lbragstad has joined #openstack-nova15:15
melwittcorrection, with a fresh volume it actually has no attachments, if I make it fail to build on compute. it ends up 'reserved' with no attachments, then the detach makes it 'available' then the delete deletes the volume15:16
openstackgerritmelanie witt proposed openstack/nova master: Clean up ports and volumes when deleting ERROR instance  https://review.openstack.org/34061415:17
openstackgerritmelanie witt proposed openstack/nova master: Add functional recreate test of deleting a BFV server pre-scheduling  https://review.openstack.org/54512315:17
openstackgerritmelanie witt proposed openstack/nova master: Detach volumes when deleting a BFV server pre-scheduling  https://review.openstack.org/54513215:17
*** vladikr has joined #openstack-nova15:18
tobascoi have an ongoing issue right now, nova metadata api returns error code 300 to neutron-metadata-agent, so cloud-init fails; neutron metadata gives error 500 to cloud-init15:18
tobascomitaka, anybody know what would cause a 300 error code? i'm gonna dump the request and try manually now15:19
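[editor's note: a hedged sketch of "trying it manually" from inside an affected guest, assuming the guest can reach the metadata address; the 300 is returned by the nova metadata API to neutron-metadata-agent, so the nova-api-metadata and neutron-metadata-agent logs are the other place to look:]

    # From inside the guest:
    curl -v http://169.254.169.254/openstack/latest/meta_data.json
    curl -v http://169.254.169.254/2009-04-04/meta-data/instance-id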
*** kholkina has quit IRC15:25
*** kholkina has joined #openstack-nova15:25
*** amoralej|lunch is now known as amoralej15:25
*** felipemonteiro__ has joined #openstack-nova15:26
*** vladikr has quit IRC15:27
*** vladikr has joined #openstack-nova15:27
*** lucas-hungry is now known as lucasagomes15:28
*** acormier_ has joined #openstack-nova15:29
*** kholkina has quit IRC15:29
*** felipemonteiro_ has quit IRC15:29
*** acormie__ has joined #openstack-nova15:30
*** acormier_ has quit IRC15:30
*** cfriesen has joined #openstack-nova15:31
mriedemit sucks that we have to provide a token to list the versions for the compute api15:31
*** lpetrut has quit IRC15:31
*** acormie__ has quit IRC15:35
*** acormier has joined #openstack-nova15:36
*** andreas_s has joined #openstack-nova15:36
*** cdent has joined #openstack-nova15:38
fried_ricemriedem: We fixed that for placement.15:39
fried_riceWe should fix it for nova too15:39
*** r-daneel has joined #openstack-nova15:40
fried_ricemriedem: https://review.openstack.org/#/c/522002/15:40
*** sridharg has quit IRC15:40
fried_riceBecause the root URI is supposed to be auth-less.  mordred will tell you so.15:40
*** acormier_ has joined #openstack-nova15:40
mriedemyeah i remember15:40
superdanmordred will tell you lots of things15:41
fried_riceIt just makes sense too though.15:41
fried_ricesuperdan: Are you saying mordred tends to the garrulous?15:41
* superdan googles15:41
superdanoh I dunno about that15:42
*** slaweq has quit IRC15:42
superdanhe's just mordredulous15:42
fried_ricemordredoquacious15:42
*** slaweq has joined #openstack-nova15:42
*** acormier has quit IRC15:43
mriedemhuh, this is kind of fun15:44
mriedemstack@queens:~$ openstack server list15:44
mriedem+--------------------------------------+---------+--------+----------+-------+--------+15:44
mriedem| ID                                   | Name    | Status | Networks | Image | Flavor |15:44
mriedem+--------------------------------------+---------+--------+----------+-------+--------+15:44
mriedem| fd20384d-c0e5-40ab-b9f5-c3fae406379b | server1 | ERROR  |          |       |        |15:44
mriedem+--------------------------------------+---------+--------+----------+-------+--------+15:44
mriedemthis server failed in the api i think15:44
mriedemon create15:44
mriedemit's volume-backed so that's why no image, but no flavor?15:44
*** david-lyle has joined #openstack-nova15:44
*** slaweq has quit IRC15:45
mriedemmelwitt: mnaser: heh, as a user, i just hit the bug we're trying to fix,15:45
*** slaweq has joined #openstack-nova15:45
mriedemvolume-backed server create failed, i delete it, then went to do it again but the volume is reserved15:46
mriedemso now i have to switch to admin to force detach it15:46
melwittthat bug so hot right now15:46
*** swamireddy has quit IRC15:47
*** READ10 has joined #openstack-nova15:48
openstackgerritMarcin Juszkiewicz proposed openstack/nova master: Allow to configure amount of PCIe ports  https://review.openstack.org/54503415:49
*** slaweq has quit IRC15:49
*** Tom-Tom_ has joined #openstack-nova15:50
cdentI think a t-shirt with "that bug so hot right now" would work pretty well. It rings.15:52
*** Tom-Tom has quit IRC15:53
mordredsuperdan, mriedem, fried_rice: yes please to not having auth on discovery urls15:54
fried_ricecdent: Picture of a cockroach with a smug grin and a pompadour15:55
*** r-daneel has quit IRC15:55
ingymordred: hey o/15:56
cdentfried_rice: that will do nicely15:56
* mordred waves to ingy15:57
*** pcaruana has quit IRC15:57
finucannotFYI, I'm gone for the next week so if anyone pings me don't expect me to answer. See everyone at the PTG!16:00
ingycdent is OnIt™16:00
*** salv-orl_ has quit IRC16:00
cdentingy: always16:00
*** salv-orlando has joined #openstack-nova16:00
openstackgerritStephen Finucane proposed openstack/nova master: Rename '_numa_get_constraints_XXX' functions  https://review.openstack.org/38507216:01
openstackgerritStephen Finucane proposed openstack/nova master: Standardize '_get_XXX_constraints' functions  https://review.openstack.org/38507116:01
*** felipemonteiro__ has quit IRC16:02
*** bnemec is now known as beekneemech16:03
*** r-daneel has joined #openstack-nova16:03
*** mrjk_ has joined #openstack-nova16:03
mriedemam i making some obvious mistake here?16:05
mriedemcurl -d '{"os-force_detach": {}}' -H "accept: application/json" -H "x-auth-token: $token" http://199.204.45.19/volume/v3/e9d773beeef2435eb59f7c6eeaf685a9/volumes/126c8d4b-c582-484a-8c09-fe901a7dc17f/action16:05
mriedem{"badRequest": {"message": "There is no such action: None", "code": 400}}16:05
*** salv-orlando has quit IRC16:05
mriedemhttps://developer.openstack.org/api-ref/block-storage/v3/#force-detach-a-volume16:05
*** slaweq has joined #openstack-nova16:06
melwittdid you include a request body?16:06
mriedemyeah, -d16:06
*** yamahata has joined #openstack-nova16:07
melwittoh, I'm blind16:07
*** acormier_ has quit IRC16:08
*** acormier has joined #openstack-nova16:09
*** andreas_s has quit IRC16:09
mriedemaha16:09
mriedemFeb 16 16:08:49 queens devstack@c-api.service[1549]: DEBUG cinder.api.openstack.wsgi [None req-c7279a60-f7ba-4a11-98f2-8fa2b2ec281d demo demo] Unrecognized Content-Type provided in request {{(pid=1723) get_body /opt/stack/cinder/cinder/api/openstack/wsgi.py:724}}16:09
mriedemexcellent UX16:09
*** finucannot is now known as stephenfin16:10
cdenta != b16:10
*** mlavalle has quit IRC16:10
*** mlavalle has joined #openstack-nova16:10
*** slaweq has quit IRC16:11
mriedemyeah my fault16:13
mriedemcinder api can figure out if i'm missing the content-type header though and let me know16:14
mriedemrather than just 'no such action, f u'16:15
*** david-lyle has quit IRC16:15
smcginnisif user == mriedem: return "f u"16:15
mriedemwhy i aughta16:15
* mriedem pushes a patch16:15
melwittlol16:15
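[editor's note: the 400 above comes from the missing Content-Type header (see the c-api debug line "Unrecognized Content-Type provided in request"); the same request with that header added:]

    curl -d '{"os-force_detach": {}}' \
         -H "content-type: application/json" \
         -H "accept: application/json" \
         -H "x-auth-token: $token" \
         http://199.204.45.19/volume/v3/e9d773beeef2435eb59f7c6eeaf685a9/volumes/126c8d4b-c582-484a-8c09-fe901a7dc17f/action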
*** hongbin has joined #openstack-nova16:16
*** slunkad_ has joined #openstack-nova16:17
*** jafeha has joined #openstack-nova16:18
*** jafeha__ has quit IRC16:19
*** david-lyle has joined #openstack-nova16:22
mrjk_Hi, I hit this problem: https://bugs.launchpad.net/nova/+bug/1579213. The comments are pretty explicit as well. So I tried changing the filter order and lowering scheduler_driver_task_period to 30 (was 60), but nothing worked.16:23
openstackLaunchpad bug 1579213 in OpenStack Compute (nova) "ComputeFilter fails because compute node has not been heard from in a while" [Undecided,Invalid]16:23
*** sree has quit IRC16:23
mriedemmrjk_: are you using the caching scheduler?16:23
*** markvoelker has quit IRC16:23
mrjk_Now, I only have the service_down_time>60 option, but I don't like it as it will impact all of my services16:24
mrjk_mriedem, lemme check16:24
*** damien_r has quit IRC16:24
mriedemif you're not using the caching_scheduler, scheduler_driver_task_period is not used16:24
mrjk_mriedem, no caching_scheduler in place (I'm running liberty)16:25
mriedemthen scheduler_driver_task_period isn't used16:26
mriedemyou're sure that you don't have scheduler_driver set in nova.conf?16:26
*** sree has joined #openstack-nova16:26
mrjk_Got this: scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler16:27
*** andreas_s has joined #openstack-nova16:27
*** markvoelker_ has joined #openstack-nova16:27
mriedemthen the compute filter is probably failing because you have a down compute service16:28
mriedemnova service-list will show you the compute service that is down16:28
openstackgerritMerged openstack/nova master: trivial: Move __init__ function  https://review.openstack.org/53822316:28
mrjk_Actually it fails when I try to load a lots of VM (from 30)16:29
mrjk_I have some nodes down, because there always are some, but that shouldn't matter. The scheduling process may definitely take up to 1 minute as well16:29
mriedemmrjk_: ok, well, that could be lots of things potentially so you're going to have to dig through some logs; and you're on a long EOL release16:29
mrjk_yep, I know ... The log said the host heartbeat is too old16:30
mrjk_The thing I don't get is how this does work. Does it do like a snapshot of it's current available hosts, and then process it for all instances ?16:31
*** sree has quit IRC16:31
cfriesenmrjk_: basically, yes16:31
*** AlexeyAbashkin has quit IRC16:33
mrjk_So I've no choice but to increase service_down_time :/16:33
cfriesenmrjk_: is the service actually down, or does it just look down due to missed updates from the compute node due to load?16:34
*** slaweq has joined #openstack-nova16:34
mrjk_Services are up, I only have a few nodes down (3/150). In the log I see the scheduler querying each compute's status, and then after 1 minute it starts to say all computes seem dead because of no heartbeat16:36
openstackgerritMatt Riedemann proposed openstack/nova master: Add a nova-caching-scheduler job to the experimental queue  https://review.openstack.org/53926016:37
mrjk_And I looked into the code: the is_up check is made at query time, it does not seem to query any cached service state (my 2cts)16:37
*** udesale_ has quit IRC16:37
*** imacdonn has quit IRC16:37
*** imacdonn has joined #openstack-nova16:38
mriedemare you running conductor and scheduler on the same host? do you only have 1 conductor? maybe you need more conductor workers.16:38
mriedemsounds like a scaling problem16:38
*** slaweq has quit IRC16:39
mriedemanyway, debugging liberty deployment scaling issues isn't really the focus for this channel, you can try #openstack or #openstack-operators maybe16:39
*** Dave has quit IRC16:39
cfriesenmriedem: if you were scheduling a whole bunch of instances, such that by the time you get to the last one the cached "last checkin" time on a service was more than "service_down_time" ago, wouldn't that cause this sort of thing?16:39
mriedempossible, idk, i don't create a bunch of instances in a single request16:40
mriedemi know doing so has all sorts of issues16:40
*** Dave has joined #openstack-nova16:40
cfriesenmrjk_: is there a reason why you are creating so many in one request?16:40
mriedemlike if you create 1000 instances in a single request, we don't limit that, and we can cause the rpc call from conductor to scheduler to timeout and retry it, thus increasing the load and failure16:40
mriedemhttps://review.openstack.org/#/c/510235/16:41
cfriesenmrjk_: as a general rule, I'd suggest keeping the number of servers in a single boot request small enough that the scheduling time is safely below "service_down_time"16:44
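[editor's note: the two nova.conf options being traded off above; the values are illustrative only. service_down_time is nova-wide (hence mrjk's concern about impacting all services), and report_interval is the heartbeat period it is compared against:]

    [DEFAULT]
    # seconds without a heartbeat before a service is treated as down;
    # this is what the servicegroup is_up check used by ComputeFilter uses
    service_down_time = 120
    # how often each nova service reports in; keep well below service_down_time
    report_interval = 10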
*** felipemonteiro has joined #openstack-nova16:45
*** andreas_s has quit IRC16:46
openstackgerritMatt Riedemann proposed openstack/nova master: Provide a hint when performing a volume action can't find the method  https://review.openstack.org/54538216:48
*** acormier has quit IRC16:52
*** itlinux has joined #openstack-nova16:55
mriedemmelwitt: i asked laura last night if she knew what "get on the horn" meant and she had no idea what i was talking about16:56
superdanwat16:58
melwitthah, *solidarity*16:59
*** vladikr has quit IRC17:00
*** r-daneel has quit IRC17:02
gibletjust a quick heads up, I will be mostly unavailable during next week. see you in Dublin!17:04
cfriesenmelwitt: given the new behaviour in https://review.openstack.org/#/c/528385/ it seems the new expectation if you fail to build is that you need to create a new cinder volume and create a new instance.  previously you could try doing a "rebuild" operation after fixing whatever the problem was.17:04
*** slaweq has joined #openstack-nova17:04
mriedemcfriesen: what is the new behavior here? that the volume isn't left stuck in a non-available status?17:06
melwittcfriesen: it would only be a new volume if it was set to delete_on_termination. otherwise, it's just detached. and yes, create new instance17:06
cfriesenmriedem: about the  instance.info_cache.network_info vs self.network_api.get_instance_nw_info(context, instance) you said the method on the instance is a helper method...but instance.info_cache isn't a method, it's an object.17:07
melwitttbh, I wasn't thinking of anyone using rebuild to fix a failed build17:07
*** elmaciej_ has quit IRC17:07
*** chyka has joined #openstack-nova17:07
mriedemmelwitt: not necessarily a new volume17:08
cfriesenmriedem: I think previously you would have an instance in ERROR state with an attached volume.  I think you could have done a "rebuild" on it from the error state.17:08
mriedemmelwitt: i can create a volume in cinder, and use it for bfv, and specify delete_on_termination17:08
mriedemi'm not sure why you'd do that though17:08
*** sambetts is now known as sambetts|afk17:08
mriedemcfriesen: i thought you were asking about instance.get_network_info()17:08
mriedemnot self.network_api.get_instance_nw_info(context, instance)17:08
mriedemself.network_api.get_instance_nw_info(context, instance) will rebuild the info cache17:09
melwittmriedem: if delete_on_termination was set and it failed the build and you had to delete the instance and start again, you would have to create a new volume, right?17:09
mriedeminstance.get_network_info() is the same as instance.info_cache.network_info17:09
mriedemmelwitt: yes17:09
melwittthat's what I was saying to cfriesen, you would only have to create a new volume if you had used delete_on_termination. else the volume would be only detached and you could use it again with a new instance17:10
smcginnisYou might want a bfv deleted on instance delete if you are using it just to manage your storage capacity separate from your local n-compute local storage.17:10
*** felipemonteiro has quit IRC17:10
cfriesenmriedem: how do I know when I can call instance.get_network_info()?  Specifically I'm looking at nova.compute.rpcapi.pre_live_migration()....I want to adjust the RPC timeout based on the number of network ports.17:11
mriedemsmcginnis: so not the root disk17:11
mriedemapplication data17:11
mriedemi just figure people that create a volume directly in cinder and use it to bfv care about re-using the volume,17:11
mriedemand people that let nova create the volume for you, don't care17:11
smcginnisEh, less useful maybe just for application data, but still can be used in that way.17:11
mriedemcfriesen: you can call it at any time...?17:12
smcginnismriedem: I would think that is usually the case that they do care about that data though if they create in cinder first.17:12
mriedemsmcginnis: yeah17:12
mriedemcfriesen: presumably we rebuild the nw info cache before starting live migration anyway17:12
mriedemcfriesen: yes we do17:13
mriedemnetwork_info = self.network_api.get_instance_nw_info(context, instance)17:13
mriedemin pre_live_migration in the ComputeManager17:13
cfriesenmriedem: I'm confused (clearly).  in the existing code, in some places where they want network_info they look at instance.info_cache.network_info, and in other places they call17:13
mriedemthat will refresh the nw info cache17:13
cfriesenself.network_api.get_instance_nw_info(17:13
cfriesen                    context, instance)17:13
cfriesenwhoops, bad paste, sorry17:13
mriedemso at that point, the instance.info_cache.network_info is up to date17:13
mriedemso if you're going to use it to count ports to adjust rpc call timeout, you should be good17:13
*** derekh has quit IRC17:14
cfriesenno, I need to have the ports in the rpc code that is *calling* pre_live_migration on the dest node.17:15
cfriesenso updating in pre_live_migration doesn't help17:15
mriedemlet me dig up the live migration call chart17:15
mriedemof which i still owe fried_rice a shiny nickel17:15
fried_rice\o/17:16
cfriesenI think live_migration() on the source node calls pre_live_migration() on the dest17:16
cfriesenor rather _do_live_migration() I guess17:16
mriedemf me if i can ever find anything in our docs17:17
mriedemhttps://docs.openstack.org/nova/latest/reference/live-migration.html17:17
mriedemcfriesen: yeah you're right17:17
openstackgerritMatthew Edmonds proposed openstack/nova-specs master: PowerVM Virt Integration (Rocky)  https://review.openstack.org/54511117:18
mriedemand _do_live_migration doesn't refresh the instance nw info cache before calling pre_live_migration17:18
cfriesenokay, so I probably need to call self.network_api.get_instance_nw_info(context, instance) ?17:18
*** links has quit IRC17:18
mriedemcfriesen: so from _do_live_migration if you call instance.get_network_info() then you're just getting whatever is current for the instance in the db17:18
mriedemif you really want to refresh to get the latest17:19
cfriesenwhat would cause that to be stale?  if someone was in the middle of attaching an interface or something?17:19
mriedembut we have the network events from neutron if that changes, and the heal_instance_info_cache periodic17:19
mriedemumm17:19
mriedemyeah,17:20
mriedemsince we don't set a task_state for attaching/detaching an interface, you can live migrate the server at the same time17:20
mriedemif the instance.task_state was not None (attaching/detaching), you couldn't do a live migration17:20
mriedemthta's something i've always wondered about,17:20
mriedemwhy we don't change task_state while attaching/detaching interfaces and volumes17:21
*** itlinux has quit IRC17:22
cfriesenmight be a fun stress test. ping-pong live migrations while attaching/detaching interfaces/volumes17:22
*** swamireddy has joined #openstack-nova17:24
*** READ10 has quit IRC17:24
mriedemfun like a hernia17:24
*** itlinux has joined #openstack-nova17:25
mriedemcfriesen: so you're probably better off just starting small and getting the current info cache in _do_live_migration and basing the rpc timeout on that17:26
*** acormier has joined #openstack-nova17:27
*** belmoreira has quit IRC17:29
*** acormier has quit IRC17:29
*** acormier has joined #openstack-nova17:29
*** acormier has quit IRC17:30
*** acormier has joined #openstack-nova17:30
cfriesenmriedem: agreed, it should be good enough.   Is this idea of varying the RPC timeout based on number of attached interfaces something that you think would be generally useful?  We're seeing it take ~1.5 sec per interface in pre_live_migration() to update the ports, but I'm not sure if that's typical or due to our neutron changes.17:36
superdanI don't really like the idea of doing that17:36
cfriesenand in nova those calls to neutron are serialized, so with a large number of interfaces it's possible to eat up a good chunk of the RPC timeout just doing the port updates17:37
superdanI'd rather see something in olso.messaging that heartbeats running calls so we have a soft and hard timeout range17:37
mriedemi seem to remember having similar conversations for other rpc calls that can take a long time, but can't remember details,17:37
mriedemor if they were just "should we add a specific rpc timeout config option for this *one* really bad operation?"17:37
superdanor work on not chaining rpc calls together17:37
mriedemcfriesen: and this is with https://review.openstack.org/#/c/465787/ applied right?17:38
*** giblet is now known as gibi_off17:38
cfriesenwe internally already hack the pre_live_migration timeout for block-live-migration to allow time to download the image from glance.17:38
mriedemcfriesen: do you happen to have any rough numbers on how much ^ helps with live migration with >1 ports?17:38
mriedemare you using the image cache?17:39
cfriesenmriedem: yes, it's with that applied.   let me see if I can dig something up.17:39
cfriesenmriedem: I think so, but that still means it could hit the first instance that wants that image.17:39
*** ttsiouts has quit IRC17:40
mriedemon the dest host17:41
mriedemthe image would be cached on the source host but that doesn't help you17:41
mriedemyou guys don't use ceph?17:41
cfriesenmriedem: up to the end user17:41
mriedemok; figured with all of the live migration you seem to do, you'd push ceph17:41
cfriesenmriedem: we support compute nodes with ceph, with local qcow2, and with local thin LVM17:41
cfriesenmriedem: some installs are really small, like 2 all-in-one nodes17:42
*** READ10 has joined #openstack-nova17:47
cfriesenmriedem: looking at my notes, that patch cut the neutron load by a factor of 3 and reduced lock contention in nova.  We also had an oslo.lockutils change to introduce fair locks--they merged it but then it seemed to cause mysterious issues in the CI for a couple of projects so they reverted it.17:48
mriedemhow many ports on that instance?17:49
mriedemin that test?17:49
cfriesenactually, wait, that factor of 3 reduction was for removing redundant calls for the same instance17:50
cfriesen16 ports on the instance (our max)17:51
*** vladikr has joined #openstack-nova17:54
*** harlowja has joined #openstack-nova17:55
*** lbragstad has quit IRC17:55
*** lbragstad has joined #openstack-nova17:56
cfriesenmriedem: found some numbers.  as of april last year each call to get_instance_nw_info() cost roughly 200ms plus about 125ms per port.  And without that patch there are O(N) network-changed events in a live-migration (where N is the number of interfaces), so the overall cost is O(N^2)17:57
cfriesenwith that patch, it drops to O(N)17:57
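[editor's note: a sketch of the scaling cfriesen is proposing (and superdan is pushing back on), using the rough figures quoted above; the function and parameter names are hypothetical and nothing like this exists in nova:]

    def pre_live_migration_rpc_timeout(num_ports,
                                       base_timeout=60,
                                       per_port_overhead=1.5):
        """Illustrative only: grow the RPC timeout with the port count.

        base_timeout stands in for the normal rpc_response_timeout and
        1.5 s/port is the pre_live_migration() port-update cost quoted above.
        """
        return int(base_timeout + per_port_overhead * num_ports)

    # e.g. in _do_live_migration, using the cached network info:
    # timeout = pre_live_migration_rpc_timeout(len(instance.get_network_info()))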
*** hemna_ has quit IRC17:57
*** mgoddard_ has quit IRC17:58
mriedemcool17:58
*** mriedem is now known as mriedem_lunch17:58
mriedem_lunchi'll feast on those results at lunch17:58
*** hemna_ has joined #openstack-nova18:00
*** gyee has joined #openstack-nova18:00
* cdent notes the date of an O-notation sighting18:02
*** r-daneel has joined #openstack-nova18:05
*** Swami has joined #openstack-nova18:06
*** jpena is now known as jpena|off18:09
*** hemna_ has quit IRC18:09
*** salv-orlando has joined #openstack-nova18:10
*** hemna_ has joined #openstack-nova18:11
*** yamahata has quit IRC18:11
*** david-lyle has quit IRC18:12
*** brault has quit IRC18:17
*** hemna_ has quit IRC18:20
*** AlexeyAbashkin has joined #openstack-nova18:24
*** hemna_ has joined #openstack-nova18:25
*** ralonsoh_ has quit IRC18:26
*** tesseract has quit IRC18:26
*** AlexeyAbashkin has quit IRC18:28
*** dtantsur is now known as dtantsur|afk18:29
openstackgerritKen'ichi Ohmichi proposed openstack/nova master: Trivial: Update help of enabled_filters  https://review.openstack.org/54543118:30
openstackgerritKen'ichi Ohmichi proposed openstack/nova master: Trivial: Update help of enabled_filters  https://review.openstack.org/54543118:31
*** r-daneel has quit IRC18:37
*** r-daneel has joined #openstack-nova18:38
*** jamesdenton has quit IRC18:39
*** psachin has joined #openstack-nova18:41
*** harlowja has quit IRC18:44
*** oomichi has joined #openstack-nova18:46
*** harlowja has joined #openstack-nova18:48
*** weshay is now known as weshay|bbiab18:50
*** r-daneel has quit IRC18:54
*** lpetrut has joined #openstack-nova18:55
*** lajoskatona has quit IRC18:55
*** mgoddard_ has joined #openstack-nova18:56
*** cdent has quit IRC18:58
*** r-daneel has joined #openstack-nova18:58
*** yamahata has joined #openstack-nova19:04
mrjk_mriedem, ok, thank you for your hints19:06
mnaserif nova creates neutron ports when it boots an instance, and the port is manually detached with nova interface-detach, the port is deleted (as per preserve_on_delete=False).. anyone else feel this behaviour is bleh?19:08
mnaserif i detach a port manually, it means i want to use it19:08
*** harlowja has quit IRC19:10
*** hemna_ has quit IRC19:11
*** lucasagomes is now known as lucas-afk19:12
*** lpetrut has quit IRC19:12
*** david-lyle has joined #openstack-nova19:13
imacdonnhmm, interesting .... it could be argued that if you want to make your ports available for use with other instances, you should pre-create them separately19:15
*** hemna_ has joined #openstack-nova19:15
*** gyee has quit IRC19:16
imacdonnassuming that, perhaps you shouldn't be allowed to detach a port that was created along with the instance (thinking out loud)19:18
*** fullmetaljackiet has joined #openstack-nova19:18
*** lpetrut has joined #openstack-nova19:20
*** sambetts|afk has quit IRC19:22
*** mgoddard_ has quit IRC19:23
*** sambetts_ has joined #openstack-nova19:24
*** pchavva1 has joined #openstack-nova19:27
TheJuliajroll: do you, or anyone else, remember if there was any outcome from the duplicate placement issues we encountered yesterday with ironic's multinode grenade job on queens->master ? It passed without issues once, revised the patch slightly and hit the same issue again19:29
jrollTheJulia: I didn't see anything, fried_rice said he might look into it19:29
jrollI probably won't get bandwidth to investigate today19:30
jrollall I got was that yes, we likely triggered a rebalance of ironic n-cpu things19:30
mrjk_cfriesen, I'm reading back your comments, "I'd suggest keeping the number of servers in a single boot request small enough" => Google does not answer this question :)19:36
mrjk_That means I don't have control over this issue? (I was looking for a max cap on allowed instances per request)19:37
*** mrjk_ has quit IRC19:38
*** mrjk has joined #openstack-nova19:39
*** pchavva1 has quit IRC19:42
*** pchavva has quit IRC19:42
*** mriedem_lunch is now known as mriedem19:44
imacdonnmriedem: I have another "pool volume handling" situation to run by you ... wondering if it'd be covered by any existing work, or if I should create a new bug for it19:46
mriedemmrjk: there is no limit, besides the user quota, on --max-count instances in a single server create request19:46
openstackgerritMerged openstack/nova master: Only log during pop retry phase  https://review.openstack.org/54165519:46
mriedemmrjk: so if you allow users to have a quota of instances to be 30, and they can create up to 30 instances in a single request, then you have to account for that in your deployment19:47
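[editor's note: a hedged example of the kind of single request being discussed, using OSC's --min/--max count flags; the flavor, image and network names are placeholders:]

    openstack server create --flavor m1.small --image cirros-0.3.5 \
        --nic net-id=private --min 1 --max 30 batch-test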
mriedemimacdonn: pool volume handling?19:47
imacdonnmriedem: SA reported that a volume was attached to an instance that didn't exist .... from log-trawling, I discovered that the volume was attached to an instance, and the instance was terminated while the cinder backend was down - cinder-volume threw a VolumeBackendAPIException, but nova-compute ignored it, and proceeded to delete the instance anyway - https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L240419:49
*** weshay|bbiab is now known as weshay19:50
*** harlowja has joined #openstack-nova19:50
*** amoralej is now known as amoralej|off19:53
mriedemimacdonn: and then19:53
imacdonnmriedem: and then we have a volume that's attached to an instance that doesn't exist19:53
mriedemyup19:53
mriedemyou'll have to force-detach the volume19:53
imacdonnnot only that, but it left behind broken SCSI/multipath devices on the compute node19:54
imacdonnshouldn't the instance termination fail in this case ?19:54
mriedemthis is the way it's always worked, i'm not exactly sure why, besides nova just wants to delete the instance19:55
cfriesenmriedem: I can see an argument for not deleting the instance if it means leaving neutron/cinder in a confused state19:56
mriedemwell,19:56
mriedemi think nova ignores it,19:56
mriedembecause at this point, nova has already destroyed the guest19:56
mriedemhttps://github.com/openstack/nova/blob/master/nova/compute/manager.py#L235419:56
mriedemsimilarly, if we can't unbind the port, we log an exception but don't reraise it https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L56319:58
mrjkmriedem, ok so I guess I can't really solve this issue then, but I'm still annoyed because users report errors now. Increasing service_down_time has a wider impact than just the scheduler, right?19:58
cfriesenanyone else find it disconcerting that _shutdown_instance() is closer to "destroy/delete" than just "shutdown"?19:59
imacdonnhmm that log message for the port really should be LOG.warn(), not debug()19:59
mriedemmrjk: yes service_down_time is used per service,19:59
mriedemmrjk: if you have your control services running on different hosts, then you could have a separate config value for them19:59
mriedemmrjk: if you are going to allow your users to have a high enough quota to create lots of instances in a single request, and scheduler/conductor is taking too long, then you need to scale something out20:00
mrjkhmm, it's a bit hackish, but I will definitely consider this20:00
mriedemmaybe conductor20:00
imacdonnalso, a port not existing is different from a "something went badly wrong" exception, IMO20:00
mriedemimacdonn: it's not logging an exception on port not found20:01
mrjkI'll continue to investigate conductor, to see if I find more stuff20:01
mriedemit doesn't care about the port not found b/c it's trying to unbind it20:01
mrjkThx for your help20:01
mriedemif the actual port update fails with a 400 or 500 or something, it logs an exception trace20:01
mriedembut keeps going20:01
mriedemlike i said, the guest is gone by this point20:01
fried_ricejroll, TheJulia: My investigation stalled at "Yup, you tried to create a provider with the same name but a different UUID."20:01
imacdonnoh, right, yeah20:01
mriedemso if volume/port cleanup fails,20:01
mriedemthe guest in the hypervisor is already gone, and you have manual cleanup to do in cinder/neutron20:02
mriedemif there are other better historical reasons for this, i'm hoping maybe leakypipes or superdan can chime in20:02
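A simplified sketch of the log-and-continue pattern being described here; this is an illustrative helper, not the actual compute manager code, and assumes nova-style volume/network API objects:

    import logging

    LOG = logging.getLogger(__name__)

    def cleanup_after_guest_destroyed(volume_api, network_api, context,
                                      instance, bdms):
        # The guest is already gone at this point, so failures talking to
        # cinder/neutron are logged and left for manual cleanup rather than
        # re-raised, which would strand the instance in ERROR forever.
        for bdm in bdms:
            try:
                volume_api.terminate_connection(context, bdm.volume_id,
                                                connector={})
                volume_api.detach(context, bdm.volume_id, instance.uuid)
            except Exception:
                LOG.exception('Ignoring error detaching volume %s; it may '
                              'need manual cleanup in cinder', bdm.volume_id)
        try:
            network_api.deallocate_for_instance(context, instance)
        except Exception:
            LOG.exception('Ignoring error unbinding ports; they may need '
                          'manual cleanup in neutron')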
*** r-daneel_ has joined #openstack-nova20:02
*** ccamacho has quit IRC20:02
smcginnisAnd if the cinder or neutron failure was due to that volume or port being deleted externally, you wouldn't want your broken instances stuck.20:03
*** salv-orlando has quit IRC20:03
*** salv-orlando has joined #openstack-nova20:04
imacdonnI'm actually more concerned about the left-behind SCSI/multipath devices on the compute node ... we actually monitor for that, because it's caused us a lot of pain .. it may be less painful with Gorka's work on os-brick, but it's still messy to leave that stuff laying around20:04
openstackgerritMerged openstack/nova master: api-ref: Further clarify placement aggregates  https://review.openstack.org/54535620:04
openstackgerritMerged openstack/nova master: Fix and update compute schedulers config guide  https://review.openstack.org/54401020:04
openstackgerritMerged openstack/nova master: Fix warn api_class is deprecated, use backend  https://review.openstack.org/54383020:05
superdanlyarwood is probably your man for understanding the multipath residue20:05
*** r-daneel has quit IRC20:05
*** r-daneel_ is now known as r-daneel20:05
superdanI definitely don't have much context on that20:05
imacdonnif nova's going to take the ostrich approach to cinder failures, perhaps it should at least clean up the os-brick stuff .... I suppose I should try to confirm that it doesn't do that with the latest code .. the case I diagnosed was on Ocata20:05
mriedemimacdonn: anything to do with os-brick,20:06
mriedemwould be when driver.destroy is called to delete the guest20:06
mriedemwhich presumably didn't fail20:07
mriedemright here https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L105520:07
*** salv-orlando has quit IRC20:08
imacdonnhmmm20:08
imacdonnsomething failed, because the devices were left behind20:08
mriedemthe thing you pointed out earlier, was when nova detaches the volume on the cinder side20:08
mriedemnot the compute host20:08
mriedemwell, calls terminate_connection to remove the export20:09
mriedemdetach_volume in cinder is just updating the volume status to 'available'20:09
mriedemmultipath shenanigans would be in os-brick when the libvirt driver calls disconnect_volume20:09
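A compressed sketch of that ordering, with illustrative names rather than the actual libvirt driver code: the host-side device cleanup lives in os-brick, while the cinder calls only touch the backend export and the volume's status.

    def teardown_volume_on_delete(brick_connector, volume_api, context,
                                  connection_info, volume_id, instance):
        # 1. driver.destroy() has already torn down the guest by now.
        # 2. os-brick removes the local SCSI/multipath devices; leftover
        #    device residue on the compute node means this step failed or
        #    was skipped (or a later iSCSI rescan rediscovered the LUNs).
        brick_connector.disconnect_volume(connection_info['data'], None)
        # 3. cinder removes the export/target on the storage backend.
        volume_api.terminate_connection(context, volume_id, connector={})
        # 4. this is only a status flip back to 'available' in cinder's DB,
        #    not a host-side operation.
        volume_api.detach(context, volume_id, instance.uuid)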
mriedembtw, i don't know what an SA is20:10
mriedemexcept super america20:10
mriedemgas station20:10
imacdonnheh .. sys admin20:10
imacdonnseems like I'm going to have to try to reproduce this .... when I got to the case I diagnosed, the broken devices had already been cleaned up manually ... I only have non-debug logs to go on20:13
*** eharney has quit IRC20:13
imacdonnactually... just had a thought ... it may be that the devices DID get unplumbed, but then a subsequent instance creation caused an iSCSI rescan, and they were rediscovered ..  if that's the case, Gorka's os-brick work (in Pike) should solve that20:15
mriedemimacdonn: if os-brick failed we should have a warning from https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L106420:15
*** lpetrut has quit IRC20:16
*** fullmetaljackiet has quit IRC20:16
*** lpetrut has joined #openstack-nova20:17
imacdonn("Gorka's work" referring to https://review.openstack.org/#/c/445943/)20:18
imacdonnor whatever that turned into .. that doesn't look like the right review20:19
imacdonnhttps://review.openstack.org/#/c/433104/20:20
imacdonngiven that, it's not so bad .... I still think it's kinda not great to leave cinder believing that the volumes are attached to instances that don't exist, but I think I can live with it20:22
*** BlackDex has quit IRC20:23
*** salv-orlando has joined #openstack-nova20:34
*** READ10 has quit IRC20:36
*** BlackDex has joined #openstack-nova20:41
*** andreas_s has joined #openstack-nova20:43
*** andreas_s has quit IRC20:47
TheJuliafried_rice: jroll: In that case, it almost sounds like the list of nodes being serviced by that individual compute node is out of sync... and such a constraint almost sounds like we can't have multiple nova-compute processes running at all 8(20:48
fried_riceTheJulia: As long as your compute node RPs are actually the same, you should be fine.  Why aren't they being created with the same name & uuid?20:50
fried_riceTheJulia: In resource tracker _update, the compute_node param has a UUID and a name.  The report client is using both of them to _ensure_resource_provider.20:53
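A rough sketch of where the conflict shows up, assuming a requests-style authenticated session against the placement API (simplified; the real report client also deals with generations and caching):

    def ensure_resource_provider(session, placement_url, uuid, name):
        # Create the provider with the compute node's uuid and
        # hypervisor_hostname, as described above.
        resp = session.post(placement_url + '/resource_providers',
                            json={'uuid': uuid, 'name': name})
        if resp.status_code == 409:
            # Placement already has a provider with this name under a
            # *different* uuid -- i.e. the ComputeNode record was recreated
            # with a fresh uuid while the old provider is still registered.
            raise RuntimeError('provider name %s is already taken by '
                               'another uuid' % name)
        resp.raise_for_status()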
TheJuliaso, I guess then I have a data model question due to a lack of understanding on my part. Is the uuid we are talking about ironic's baremetal node uuid? Because I thought, based on comments yesterday, that it was getting posted as the name of the entry and the uuid was the actual nova-compute node's?20:53
fried_riceTheJulia: This is where I'm also pretty confused.  But it seems to me like, whichever is the case, its uuid/name should be immutable.20:54
TheJuliaugh20:54
fried_riceEspecially since the name is a UUID (which is even more confusing)20:54
TheJuliaif it is the compute node uuid.... then.... we have lost functionality20:55
fried_riceWe're creating the RP in placement with compute_node.uuid, compute_node.hypervisor_hostname20:55
*** markvoelker_ has quit IRC20:55
TheJuliaso they can never fail between compute nodes.... heh20:56
fried_riceUnder what circumstances could I get a compute_node with the same hypervisor_hostname but a *different* uuid?20:56
*** markvoelker has joined #openstack-nova20:56
TheJuliaif a compute process went down too long and the hash ring recalculated20:56
TheJuliathe other compute process would take over for the other nodes20:56
TheJuliaor at least, that is what we were doing as i understand it20:57
*** liverpooler has quit IRC20:57
fried_riceCool cool.  But why doesn't the second compute process perceive the nodes as having the same UUID as the first compute process did?20:57
TheJuliaand now I'm confused :)20:58
* TheJulia switches gears and cracks open the nova code21:00
fried_riceTheJulia: resource_tracker.py in the 900s21:00
fried_ricesorry, 500s21:00
TheJuliathanks21:00
fried_riceIn particular, I thought I4253cffca3dbf558c875eed7e77711a31e9e3406 was supposed to be attacking this very problem.21:01
*** eharney has joined #openstack-nova21:05
*** sambetts_ has quit IRC21:06
fried_riceTheJulia: I think I get it.  Is the hypervisor_hostname assigned based on compute_node.host?21:06
TheJuliathat is what I think, it is just I've never dug through this portion of nova, so I'm not grasping it very well21:06
TheJuliaalso *squirrel* *blink* *blink*21:07
fried_ricemriedem: Executive summary on the difference between nodename and hypervisor_hostname?21:07
jrollin ironic-land, nodename == ironic node uuid, hypervisor_hostname == nova-compute hostname (rabbit queue)21:08
fried_riceAha21:09
fried_riceI think therein lies the boggle.21:09
fried_riceBecause in non-ironic-land, I *think* they are the same.21:09
fried_riceAnd we're using the wrong one.21:09
jrollthe patch you mentioned was meant to handle this, but I think only for the compute_nodes table, not resource providers21:09
*** r-daneel has quit IRC21:10
jroll(emphasis on I think)21:10
fried_riceL86921:11
jrollnice.21:11
mriedemfor non-ironic, compute_nodes.host and compute_nodes.hypervisor_hostname are the same21:12
* fried_rice hacks21:12
jrollwait, 869 can't be wrong, we would have never had any resources21:13
jrollthat would have blown up all over the place21:13
jrolldid I have this backwards?21:13
TheJuliahmmmmmmmm21:13
jrollfried_rice: I'm sorry, I lied. hypervisor_hostname is the ironic node uuid21:15
jrollcompute_node.host is the nova-compute hostname21:16
jrollyour use of nodename threw me off, that is equivalent in (most? all?) places to hypervisor_hostname21:16
fried_riceSo then we go back to the original question: How does compute_node.uuid *change* when compute_node.hypervisor_hostname is the same?21:17
jrollcompute_node.uuid does not21:18
jrollbut resource provider uuid does21:18
jrollafaict21:18
* jroll goes back to the logs21:18
fried_riceNo, because the error is happening via _ensure_resource_provider(compute_node.uuid, compute_node.hypervisor_hostname)21:18
fried_riceand we're running into a conflict on the latter.21:18
jrollis resource provider UUID always the same as compute node uuid?21:19
fried_riceFor compute node resource providers, in Queens, yes.21:19
fried_riceUhm.21:19
fried_riceYes.21:19
fried_riceWas gonna say get_inventory might be able to muck with it, but no.21:20
fried_riceIs this because nova is creating the ComputeNode entries afresh, with an autogenerated UUID?21:20
jrollseems like it would be, yes. but that's what https://review.openstack.org/#/c/508555/ should have fixed21:20
fried_riceExacitically21:20
fried_riceL518-9 in fact21:21
jrollright21:21
*** tssurya has quit IRC21:22
*** oomichi has quit IRC21:22
jrollwe demonstrate that code is running: http://logs.openstack.org/50/544750/10/check/ironic-grenade-dsvm-multinode-multitenant/d7a1ee7/logs/subnode-2/screen-n-cpu.txt.gz#_Feb_16_17_21_04_61378621:23
* fried_rice hacks more...21:23
fried_riceI'm sorta guessing you can't change the UUID of a ComputeNode in the db.  mriedem?21:24
jrollboth compute nodes appear to be attempting to create an RP for that node (9de4d7b4-51c9-4088-99b4-cd648332504e), at different times of course21:24
fried_riceRight; the first one succeeds but the second one barfs21:24
fried_riceright?21:24
jrollum21:24
jrollwell, both fail21:24
mriedemfried_rice: changing the uuid would be kinda bad21:24
jrollbut my um: how do we feel about adding a cn.save() here: https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py?utf8=%E2%9C%93#L52521:24
* jroll checks if that's done elsewhere21:25
fried_ricejroll: I was going to suggest that.21:25
fried_ricebut also not knowing if it's The Right Thing.21:25
fried_riceI was also going to suggest setting the UUID before save()ing.21:25
fried_riceBut mriedem won't come to my birthday party if I do that.21:25
jrollnah, the UUID should be fine, we're updating an existing CN21:25
fried_riceBut I thought that was the whole problem.21:26
fried_ricebtw, assuming _resource_change returns True, that .save() should be getting done at L859.21:26
* jroll unwinds his brain21:26
jrollyeah, I'm not sure it will, the schedulable resources should be the same as before, only the host attribute is changing21:27
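A minimal sketch of the save() idea being discussed (a hypothetical simplification; the real _update()/_resource_change() path is more involved):

    def take_over_node(cn, my_host):
        # When the hash ring moves an ironic node to this compute service,
        # reuse the existing ComputeNode -- and persist the host change --
        # so its uuid (and therefore its placement resource provider) stays
        # the same instead of being recreated with a fresh uuid.
        if cn.host != my_host:
            cn.host = my_host
            # Without an explicit save(), _resource_change() may see no
            # inventory difference and skip persisting the new host.
            cn.save()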
jrollok, I get it now, we did get a new UUID, wtf21:30
fried_ricew, indeed, tf21:30
jrollsorry, I got confuzzled21:30
jrollyou know...21:31
jrolldoes a compute_node record get deleted at some point if the compute service disappears?21:32
jrolland if so, do we clean up the resource providers?21:32
fried_riceI could answer that second question if I knew the answer to the first.  But I don't.  mriedem?  (Or mriedem if you'd care to delegate, who's around who knows this stuff?)21:33
jrollso um21:33
jrollthis is pike->queens, afaik21:33
jrollcan we please land https://review.openstack.org/#/c/527423/ .21:33
openstackgerritEric Fried proposed openstack/nova master: WIP: Make sure rebalance saves the compute node  https://review.openstack.org/54546421:33
fried_ricejroll: Here's a quickie to make sure that save() is happening ^21:34
fried_ricejroll: But where the error happens, we're clearly running that code.21:34
jroller, the one I'm looking at is queens->master21:34
jrollyeah21:34
fried_riceSo whereas I agree we should land that backport, that's not gonna be your fix here.21:35
mriedemjroll: it doesn't21:35
jrollsorry, we've been having issues with this job since before queens was cut, so I've been getting confused21:35
mriedemcompute_nodes hang out until manually removed21:35
jrolldamn21:35
mriedemthere are some bugs that tssurya opened for removing compute nodes and providers21:35
jrollI was hoping compute nodes did get deleted but not RPs21:36
jrollthat would explain things21:36
mriedemhttps://bugs.launchpad.net/nova/+bug/174973421:36
openstackLaunchpad bug 1749734 in OpenStack Compute (nova) "Purge the compute_node records, resource provider records and host_mappings when doing force delete of the host" [Medium,Confirmed] - Assigned to Surya Seetharaman (tssurya)21:36
jrollthanks21:36
fried_ricejroll: I can get a little more aggressive with the hacking.21:37
jrollfried_rice: feel free, I'm just trying to wrap my head around some of this21:38
jrollthis seems wrong. http://logs.openstack.org/50/544750/10/check/ironic-grenade-dsvm-multinode-multitenant/d7a1ee7/logs/screen-n-cpu.txt.gz#_Feb_16_17_06_33_59231321:38
jrolloh, there we are http://logs.openstack.org/50/544750/10/check/ironic-grenade-dsvm-multinode-multitenant/d7a1ee7/logs/screen-n-cpu.txt.gz#_Feb_16_17_04_22_81402021:39
jrollbut we can't talk to placement, so the RP stays21:39
jrollbecause Feb 16 17:04:22.591129 ubuntu-xenial-rax-ord-0002580076 nova-compute[28778]: DEBUG nova.virt.ironic.driver [None req-969cdb75-026b-4cac-ba09-3f3be962a09d service nova] Returning 0 available node(s) {{(pid=28778) get_available_nodes /opt/stack/old/nova/nova/virt/ironic/driver.py:757}}21:39
jrollbecause ironic is down21:40
jrollgot dang.21:40
* jroll stabs everything21:40
jrollI remember this code landing to fix something else21:40
fried_ricejroll: Is there any chance that this ironic node being rebalanced has allocations?21:40
jrollfried_rice: yes, we create an instance before the upgrade AFAIK21:41
fried_riceSo we have to do more than just delete the old RP.  We have to move his allocations to the new one.  This ain't gonna work.21:41
jrollthis crap is burning us: https://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py#L607-L61621:41
jrollright, we shouldn't be deleting the compute node or the RP21:41
jrollso here's what's going on, in short:21:42
fried_riceThe answer is quite simply to make sure the compute node doesn't change its freakin uuid.21:42
jrolln-cpu has a bunch of ironic nodes it's managing21:42
jrollironic goes down for upgrade21:42
jrolln-cpu does a RT update21:42
jrolln-cpu can't reach ironic21:42
jrolln-cpu thinks all the ironic nodes are gone, for good, as if ironic returned I have no nodes21:43
jrolln-cpu deletes the compute_node records21:43
jrollironic comes back21:43
jrolln-cpu does an RT update, sees nodes, creates compute_node records21:43
jrollmeanwhile, n-cpu couldn't delete the resource providers from placement, and so it tries to create new ones and *boom*21:44
* jroll burns this entire driver to the ground21:44
fried_ricen-cpu deletes the compute_node records?21:45
jrollbtw, s/ironic goes down for upgrade/keystone goes down for upgrade/, which is why neither ironic nor placement can be reached21:45
fried_riceI thought we decided that wasn't happening.21:45
jrollyes21:45
jrollhttp://logs.openstack.org/50/544750/10/check/ironic-grenade-dsvm-multinode-multitenant/d7a1ee7/logs/screen-n-cpu.txt.gz#_Feb_16_17_04_22_81402021:45
jrollbecause: http://logs.openstack.org/50/544750/10/check/ironic-grenade-dsvm-multinode-multitenant/d7a1ee7/logs/screen-n-cpu.txt.gz#_Feb_16_17_04_22_59112921:45
jrollbecause: https://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py#L607-L61621:45
jroll(maybe it is apache that is down, don't know, don't care, things are down and n-cpu can't deal)21:46
jrollfor the curious, the commit message and bug here explain why we're returning an empty list of nodes there: https://review.openstack.org/#/c/487925/21:47
jrollfried_rice: that all make sense?21:48
jrollTheJulia: ^ fyi, I think I nailed it down.21:48
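A condensed, illustrative sketch of the behaviour jroll is describing (not the actual ironic driver code):

    def get_available_nodes(ironic_client):
        try:
            nodes = ironic_client.node.list()
        except Exception:
            # "Lying to the RT": if ironic (or keystone in front of it) is
            # unreachable, returning an empty list makes the resource
            # tracker believe every node is gone, so it tears down its
            # ComputeNode records; when ironic comes back they are
            # recreated with new uuids and placement reports a name
            # conflict.  The follow-up patch (545479) re-raises here
            # instead.
            return []
        return [n.uuid for n in nodes]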
*** r-daneel has joined #openstack-nova21:48
* TheJulia hands jroll accelerant to assist with the burning the driver to the ground21:50
jrollthanks, my gas can is nearly empty21:50
jrollturns out lying to the resource tracker is wrong, who'da thought21:53
* jroll thinks we should just kill that with fire, but then n-cpu can't start without ironic up, need to work that out too21:53
*** sree has joined #openstack-nova21:54
* TheJulia hands jroll magnesium for good measure21:54
jrollso the only way n-cpu will blow up by ironic not being reachable is because of this: https://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py#L52421:57
jrollwhich will happen on the first RT run21:57
jrollor not even? wtf21:57
jrolloh, there used to be a _refresh_cache() there21:58
jrollbut even before 487925 we would return an empty list21:58
*** sree has quit IRC21:59
jrollah jeez https://github.com/openstack/nova/commit/cce06a1e9855d9eed3f7c653200853f23466d79121:59
* jroll thinks we can kill all this, test what happens when starting n-cpu without ironic available, fix that stuff without lying to RT, go from there22:00
*** lpetrut has quit IRC22:00
*** edmondsw has quit IRC22:01
jrollhm, 5pm friday22:01
* jroll chugs a coffee and hacks22:01
fried_ricejroll: Need anything from me?22:01
*** edmondsw has joined #openstack-nova22:01
jrollfried_rice: whiskey may be needed22:01
jroll:)22:01
fried_riceCan https://review.openstack.org/545464 be abandoned?22:01
jrollyes, believe so22:02
fried_ricejroll: The fact that you're a time zone ahead of me indicates I have no way of getting you a bottle in time to save you.22:02
TheJuliajroll: I will buy you whiskey in Dublin22:02
fried_riceYeah, that ^22:02
jrollheh22:02
TheJuliaAnd next time I'm through your part of the country, I'll make a point of bringing really good whiskey on my RV22:02
jroll:o <322:03
*** psachin has quit IRC22:04
*** edmondsw has quit IRC22:06
*** fullmetaljackiet has joined #openstack-nova22:09
*** amodi has quit IRC22:10
*** awaugama has quit IRC22:13
*** slaweq has quit IRC22:15
mrjkAbout nova.conf, something is not clear. Let's say I have conductor, api and other services. Most of the settings are in the DEFAULT section, but is it possible to override some default parameters for a specific service ? Let's say I want to change the debug mode for only one service, and not the others ...22:16
mriedemdebug is global22:17
mriedemas long as you have all of your controller services running on the same host, they are going to share config from the [DEFAULT] section22:17
mrjkHow could I find out this info by myself ?22:17
mriedemyou could split configs and create an /etc/nova/nova-api.conf which has config specific to your API service22:17
mriedemand remove debug from the base /etc/nova/nova.conf22:17
mriedemthen run the service with both config files22:18
mrjkOk, this would be the way to go. I wasn't sure, I believed there was some kind of defaulting/overriding of values22:18
mriedemnova-api --config-file /etc/nova/nova.conf --config-file /etc/nova/nova-api.conf22:18
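For example, the override file could be as small as this (illustrative values); options in config files passed later on the command line win over earlier ones, so only the API-specific settings need to live there:

    # /etc/nova/nova-api.conf -- only what should differ for the API service;
    # everything else still comes from /etc/nova/nova.conf
    [DEFAULT]
    debug = True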
*** mchlumsky_ has quit IRC22:18
*** jafeha__ has joined #openstack-nova22:18
*** jafeha has quit IRC22:21
*** Guest48782 has quit IRC22:24
openstackgerritMatthew Treinish proposed openstack/nova master: Remove single quotes from posargs on stestr run commands  https://review.openstack.org/54547622:26
mtreinishmelwitt: ^^^ but let's test this and make sure I'm not just seeing things...22:26
melwittk, I'll try it22:27
*** acormier_ has joined #openstack-nova22:28
*** acormier has quit IRC22:32
*** zzzeek has quit IRC22:38
*** kuzko has joined #openstack-nova22:40
*** priteau has quit IRC22:41
*** zzzeek has joined #openstack-nova22:42
openstackgerritMatt Riedemann proposed openstack/nova master: Fix error handling in compute API for multiattach errors  https://review.openstack.org/54547822:42
mriedemwell i wish i would have found this before we cut RC2 ^ because that's an annoying UX problem22:42
openstackgerritJim Rollenhagen proposed openstack/nova master: ironic: stop lying to the RT when ironic is down  https://review.openstack.org/54547922:46
jrollTheJulia: fried_rice: ^ that fixes it, but will crash at startup if ironic is down22:46
* jroll is going to walk his dog before sunlight runs out22:47
fried_ricejroll: Maybe we *should* crash at startup if ironic is down.22:47
jrollfried_rice: yeah, I kind of agree, kind of don't, regardless crashing when ironic is down was a huge pain in CI in the past that I don't want to live again22:48
jrollI also feel like I want to be able to start my computes whenever and have them do stuff when ironic comes back22:48
jrollthough I don't believe in upgrading nova and ironic at the same time (or even the same maintenance window), other people do and this makes their life easier22:49
mriedemnova-compute doesn't start if we can't connect to libvirt22:49
mriedemi think the same for powervm?22:49
mriedemnot sure about hyperv/xen/vmware22:49
* fried_rice recalls doing something for this in powervm...22:49
* fried_rice looks...22:50
mriedemnova-compute shouldn't come up,22:50
mriedembecause then the service will say it's up, and be around for scheduling,22:50
mriedemand will just not work if the scheduler picks it and the hypervisor is gone22:50
jrollmriedem: yeah, but libvirt isn't some external service, that just means you've configured your hypervisor wrong22:50
mriedemvcenter is an external service22:51
jrolland at least in ironic's case, there won't be any resources to schedule to, until it can connect to ironic22:51
jrollor I guess there will, sigh22:51
fried_riceIn powervm, it looks like we'll hold up init_host for a while if we can't talk to the hypervisor, and then we'll ultimately blow up.22:51
mriedemi thought someone's dog was going to get walked?22:51
mriedemi can hear him whining from here22:51
jrollgood point22:51
fried_riceBut I've got a nice TODO there to make it work like I73a34eb6e0ca32d03e54d12a5e066b2ed4f19a61 which will actually disable the compute service (but not crash it) in that case.22:52
jrollbbiab22:52
mriedemmelwitt: did you figure this out? https://review.openstack.org/#/c/340614/18/nova/compute/api.py@202922:53
mriedemyou had >1 attachment, right?22:53
mriedemand that's why the volume status wasn't changing to 'available'?22:53
melwittmriedem: I had multiple attachments because I was having trouble getting the code path to hit in devstack. so I tried the scenario multiple times with the same volume by reset-state on it22:55
melwittand didn't notice it was building up attachments22:55
melwittso once I started from a clean slate, new volume and did the scenario, it worked as expected. the volume actually had no attachments in the fresh volume case. it was 'reserved' with no attachments22:56
melwittthen the attachment_delete changed it from 'reserved' -> 'available', then the volume_api.delete deleted the volume properly22:56
melwittso I think all is well22:57
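A rough sketch of the delete path being verified (a hypothetical helper; the real compute API code carries much more error handling):

    def local_cleanup_bdm_volume(volume_api, context, bdm, instance):
        if bdm.attachment_id:
            # New-style attachments: a never-attached BFV volume sits in
            # 'reserved' with an attachment record, and deleting the
            # attachment flips it back to 'available'.
            volume_api.attachment_delete(context, bdm.attachment_id)
        else:
            volume_api.terminate_connection(context, bdm.volume_id,
                                            connector={})
            volume_api.detach(context, bdm.volume_id, instance.uuid)
        if bdm.delete_on_termination:
            volume_api.delete(context, bdm.volume_id)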
mriedemthat was tied to the thing on L2063 too?22:58
mriedemif detach fails, you definitely can't delete22:58
mriedemi just updated the comments after you realized it was a test env issue23:00
mriedemand yes volume attachments can build up if not managed23:00
mriedemthat's why i was using force detach earlier,23:00
mriedembecause you can do reset-state on the volume, but that doesn't remove the old attachments23:00
mriedemreally need a CLI for force detach in cinder23:01
melwittyeah23:03
melwittyeah, I know that if the detach fails you definitely can't delete. I was just trying to work out whether we should add a new try-except there to try the delete even if the detach fails23:04
melwittbut I think that was legacy from when detach could fail with "nothing to detach"23:04
melwittso I took it out23:04
*** burt has quit IRC23:07
TheJuliajroll: fwiw, we put some restarts in place for the issues that required restarting nova-compute after service restarts. We should just be able to make it a default thing... I think.... we'll likely want to verify that it works across multinode grenade jobs for ironic since they are different23:09
jrollTheJulia: so what you're saying is allowing a crash at startup shouldn't be an issue for CI?23:10
TheJuliaafaik it should not be23:10
TheJuliawe have a default restart if it is not multinode if I'm remembering correctly23:11
*** acormier_ has quit IRC23:11
TheJuliawhich likely is why...23:11
TheJuliaugh23:11
* TheJulia puts laptop down, and goes and has a drink23:11
*** acormier has joined #openstack-nova23:11
jrollhrm23:11
jrollagree, this sounds like a monday thing23:12
*** weshay is now known as weshay_PTO23:14
TheJuliajroll: yes ++23:14
mriedemmelwitt: ok +2 on https://review.openstack.org/#/c/340614/ now23:14
mriedemmelwitt: now we need to rope in superdan to peruse the series23:15
mriedemat 3:15pm on a friday23:15
mriedemmelwitt: did you take a look at https://review.openstack.org/#/c/545123/ and the one after it?23:15
*** acormier has quit IRC23:17
*** figleaf is now known as edleafe23:22
melwitt\o/ hallelujah23:37
melwittmriedem: not yet, it's next on my list23:38
melwittgood, the patches are small. yess23:41
openstackgerritEric Fried proposed openstack/nova master: New-style _set_inventory_for_provider  https://review.openstack.org/53764823:43
openstackgerritEric Fried proposed openstack/nova master: SchedulerReportClient.update_from_provider_tree  https://review.openstack.org/53382123:43
openstackgerritEric Fried proposed openstack/nova master: Use update_provider_tree from resource tracker  https://review.openstack.org/52024623:43
openstackgerritEric Fried proposed openstack/nova master: Fix nits in update_provider_tree series  https://review.openstack.org/53126023:43
openstackgerritEric Fried proposed openstack/nova master: Move refresh time from report client to prov tree  https://review.openstack.org/53551723:43
openstackgerritEric Fried proposed openstack/nova master: Make generation optional in ProviderTree  https://review.openstack.org/53932423:43
openstackgerritEric Fried proposed openstack/nova master: WIP: Add nested resources to server moving tests  https://review.openstack.org/52772823:43
fried_riceBecause you know I'm all about rebase, 'bout rebase...23:43
*** hemna_ has quit IRC23:50
*** hongbin has quit IRC23:51
fried_riceedleafe:  Making sure I'm not seeing things - do we not have GET /resource_providers?with_traits=... ?23:54
*** sree has joined #openstack-nova23:55
*** chyka has quit IRC23:58
*** chyka has joined #openstack-nova23:58
*** sree has quit IRC23:59
