Friday, 2019-04-19

*** igordc has quit IRC00:04
*** jangutter has joined #openstack-nova00:06
*** samueldmq has quit IRC00:08
*** jangutter has quit IRC00:11
*** bbowen has quit IRC00:18
*** bbowen has joined #openstack-nova00:18
*** HW-Peter has joined #openstack-nova00:25
*** HW-Peter has quit IRC00:27
*** HW-Peter has joined #openstack-nova00:27
*** HW-Peter has quit IRC00:30
*** HW-Peter has joined #openstack-nova00:31
*** gyee has quit IRC00:31
*** HW_Peter has joined #openstack-nova00:37
*** markvoelker has quit IRC00:37
*** mlavalle has quit IRC00:47
openstackgerritMerged openstack/nova stable/queens: Update instance.availability_zone on revertResize  https://review.openstack.org/64841500:47
*** mriedem has quit IRC00:59
openstackgerritMatt Riedemann proposed openstack/nova master: Fix ProviderUsageBaseTestCase._run_periodics for multi-cell  https://review.openstack.org/64117901:00
openstackgerritMatt Riedemann proposed openstack/nova master: Improve CinderFixtureNewAttachFlow  https://review.openstack.org/63938201:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add functional recreate test for bug 1818914  https://review.openstack.org/64152101:00
openstackbug 1818914 in OpenStack Compute (nova) "Hypervisor resource usage on source still shows old flavor usage after resize confirm until update_available_resource periodic runs" [Low,In progress] https://launchpad.net/bugs/1818914 - Assigned to Matt Riedemann (mriedem)01:00
openstackgerritMatt Riedemann proposed openstack/nova master: Remove unused context parameter from RT._get_instance_type  https://review.openstack.org/64179201:00
openstackgerritMatt Riedemann proposed openstack/nova master: Update usage in RT.drop_move_claim during confirm resize  https://review.openstack.org/64180601:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add Migration.cross_cell_move and get_by_uuid  https://review.openstack.org/61401201:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add InstanceAction/Event create() method  https://review.openstack.org/61403601:00
openstackgerritMatt Riedemann proposed openstack/nova master: DNM: Add instance hard delete  https://review.openstack.org/65098401:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add Instance.hidden field  https://review.openstack.org/63112301:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add TargetDBSetupTask  https://review.openstack.org/62789201:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add CrossCellMigrationTask  https://review.openstack.org/63158101:00
openstackgerritMatt Riedemann proposed openstack/nova master: Execute TargetDBSetupTask  https://review.openstack.org/63385301:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add can_connect_volume() compute driver method  https://review.openstack.org/62131301:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add prep_snapshot_based_resize_at_dest compute method  https://review.openstack.org/63329301:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add PrepResizeAtDestTask  https://review.openstack.org/62789001:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add prep_snapshot_based_resize_at_source compute method  https://review.openstack.org/63483201:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add nova.compute.utils.delete_image  https://review.openstack.org/63760501:00
openstackgerritMatt Riedemann proposed openstack/nova master: Add PrepResizeAtSourceTask  https://review.openstack.org/62789101:00
openstackgerritMatt Riedemann proposed openstack/nova master: WIP: Add RevertResizeTask  https://review.openstack.org/63804601:01
openstackgerritMatt Riedemann proposed openstack/nova master: Add revert_snapshot_based_resize conductor RPC method  https://review.openstack.org/63804701:01
openstackgerritMatt Riedemann proposed openstack/nova master: Revert cross-cell resize from the API  https://review.openstack.org/63804801:01
openstackgerritMatt Riedemann proposed openstack/nova master: Confirm cross-cell resize while deleting a server  https://review.openstack.org/63826801:01
openstackgerritMatt Riedemann proposed openstack/nova master: Add archive_deleted_rows wrinkle to cross-cell functional test  https://review.openstack.org/65165001:01
openstackgerritMatt Riedemann proposed openstack/nova master: Add CrossCellWeigher  https://review.openstack.org/61435301:01
openstackgerritMatt Riedemann proposed openstack/nova master: Add cross-cell resize policy rule and enable in API  https://review.openstack.org/63826901:01
*** takashin has joined #openstack-nova01:08
*** nicolasbock has quit IRC01:09
*** ricolin has joined #openstack-nova01:09
*** brinzhang has joined #openstack-nova01:34
*** bbowen has quit IRC01:42
*** bbowen has joined #openstack-nova01:42
openstackgerritmelanie witt proposed openstack/nova master: Fix SynchronousThreadPoolExecutorFixture mock spec  https://review.openstack.org/65017101:55
openstackgerritmelanie witt proposed openstack/nova master: Use futurist.GreenThreadPoolExecutor in scatter_gather_cells  https://review.openstack.org/65017201:55
openstackgerritmelanie witt proposed openstack/nova master: Revert "Fix target_cell usage for scatter_gather_cells"  https://review.openstack.org/65389401:55
*** jangutter has joined #openstack-nova02:07
*** jangutter has quit IRC02:12
*** tbachman has quit IRC02:21
*** mdbooth_ has joined #openstack-nova02:22
*** mdbooth has quit IRC02:23
*** dannins has joined #openstack-nova02:26
*** lbragstad has quit IRC02:37
*** tbachman has joined #openstack-nova02:59
*** irclogbot_2 has quit IRC03:05
*** edmondsw has quit IRC03:08
*** irclogbot_3 has joined #openstack-nova03:09
*** edleafe has quit IRC03:10
*** markvoelker has joined #openstack-nova03:33
*** d34dh0r53 has joined #openstack-nova03:36
*** brinzhang has quit IRC03:41
eanderssonefried, found something odd with the ironic code04:02
eanderssonhttp://logs.openstack.org/39/653839/2/check/ironic-tempest-ipa-wholedisk-bios-agent_ipmitool-tinyipa/63a832a/controller/logs/screen-n-cpu.txt.gz#_Apr_19_03_21_25_24377704:02
eanderssonhttps://review.openstack.org/#/c/653839/2/nova/virt/ironic/client_wrapper.py04:02
eanderssonIf you provide a region_name, the get_endpoint does not seem to work04:02
eanderssonIt looks to be related to min_version, max_version. If I remove those options it works as intended.04:03
*** brinzhang has joined #openstack-nova04:05
*** imacdonn has quit IRC04:08
*** jangutter has joined #openstack-nova04:08
*** imacdonn has joined #openstack-nova04:08
*** jangutter has quit IRC04:12
*** bhagyashris has joined #openstack-nova05:02
*** lpetrut has joined #openstack-nova05:12
*** tetsuro has joined #openstack-nova05:41
*** yedongcan has joined #openstack-nova05:49
*** jangutter has joined #openstack-nova06:09
*** jangutter has quit IRC06:13
*** whoami-rajat has joined #openstack-nova06:15
*** plestang has joined #openstack-nova06:26
*** markvoelker has quit IRC06:27
*** tetsuro has quit IRC06:28
*** lpetrut has quit IRC06:34
*** tetsuro has joined #openstack-nova06:35
*** rpittau|afk is now known as rpittau06:52
openstackgerritya.wang proposed openstack/nova-specs master: Expose auto converge and post copy  https://review.openstack.org/65168106:56
*** tetsuro_ has joined #openstack-nova07:08
*** tetsuro has quit IRC07:11
*** pcaruana has joined #openstack-nova07:12
*** tesseract has joined #openstack-nova07:18
*** markvoelker has joined #openstack-nova07:27
*** lpetrut has joined #openstack-nova07:34
lpetrutHi. Is nova already using nested resource providers for NUMA topologies? looking at the libvirt driver 'update_provider_tree' method it seems not but I may be wrong07:38
*** liuyulong has quit IRC07:39
*** tetsuro_ has quit IRC07:48
*** tetsuro has joined #openstack-nova07:50
*** luksky has joined #openstack-nova07:52
*** slaweq has quit IRC07:53
*** helenafm has joined #openstack-nova07:53
*** tetsuro_ has joined #openstack-nova08:00
kashyapcfriesen: I think you didn't read the Secure Boot spec in full.  I did mention in the Work Items that we will re-use existing flavor/image properties.08:00
*** markvoelker has quit IRC08:01
*** takashin has left #openstack-nova08:01
*** tetsuro has quit IRC08:04
*** jangutter has joined #openstack-nova08:10
*** jangutter has quit IRC08:14
*** tkajinam has quit IRC08:19
*** whoami-rajat has quit IRC08:24
*** tetsuro has joined #openstack-nova08:37
*** tetsuro_ has quit IRC08:39
*** whoami-rajat has joined #openstack-nova08:55
*** davidsha has joined #openstack-nova08:57
kashyapcfriesen: Anyway, will respond to your questions on the change :-)08:57
*** tetsuro has quit IRC09:02
*** luksky has quit IRC09:08
*** tetsuro has joined #openstack-nova09:12
*** luksky has joined #openstack-nova09:22
*** tetsuro has quit IRC09:23
openstackgerritBrin Zhang proposed openstack/nova-specs master: Specifying az when restore shelved server  https://review.openstack.org/62468909:38
*** brinzhang has quit IRC09:43
openstackgerritya.wang proposed openstack/nova-specs master: Expose auto converge and post copy  https://review.openstack.org/65168109:47
openstackgerritBoxiang Zhu proposed openstack/nova master: Add host and hypervisor_hostname flag to create server  https://review.openstack.org/64552009:52
*** markvoelker has joined #openstack-nova09:58
*** bhagyashris has quit IRC10:09
*** pcaruana has quit IRC10:17
*** markvoelker has quit IRC10:31
*** nicolasbock has joined #openstack-nova10:45
*** luksky has quit IRC10:51
*** yan0s has joined #openstack-nova11:03
*** bbowen has quit IRC11:03
*** bbowen has joined #openstack-nova11:04
*** yedongcan has left #openstack-nova11:04
*** pvradu has joined #openstack-nova11:07
*** markvoelker has joined #openstack-nova11:29
*** pvradu has quit IRC11:30
*** lpetrut has quit IRC11:40
*** jmlowe has joined #openstack-nova11:41
yan0shello I'm having a problem with VNC in the web gui11:49
yan0sI can see it using the admin user11:49
yan0sbut not with "Member" role users (also using SAML integrated login)11:50
yan0s error in nova-novncproxy.log is : code 400, message Bad request syntax ('\x88\x8fI¿cØJW7¹;Ø\x06¬iÜ\x0f·:Ú\x07')11:50
*** edmondsw has joined #openstack-nova11:50
*** kaliya has joined #openstack-nova11:51
*** kaliya has quit IRC11:52
*** kaliya has joined #openstack-nova11:53
*** alex_xu has quit IRC11:56
*** markvoelker has quit IRC12:02
*** whoami-rajat has quit IRC12:05
*** jangutter has joined #openstack-nova12:08
*** jangutter has quit IRC12:12
*** HW-Peter has quit IRC12:17
*** dikonoor has joined #openstack-nova12:22
*** luksky has joined #openstack-nova12:33
*** ricolin has quit IRC12:41
*** lbragstad has joined #openstack-nova12:48
*** mriedem has joined #openstack-nova12:56
*** markvoelker has joined #openstack-nova12:58
*** jmlowe has quit IRC13:03
mriedemseems like something exploded in infra overnight and now everything has to be rechecked13:05
yan0sActually, novnc breaks only when I launch the VM with a cloud-config file13:06
*** dims has quit IRC13:21
*** efried is now known as fried_rice13:25
fried_riceeandersson: I can't say I'm particularly surprised tbh.13:25
fried_ricedo you have a way of seeing whether it broke recently?13:25
fried_ricemriedem: TestRPC.test_create_transport in py36 - is that the thing you were seeing yesterday?13:29
*** edleafe has joined #openstack-nova13:29
mriedemyes13:29
mriedemsee ML13:29
*** whoami-rajat has joined #openstack-nova13:31
*** markvoelker has quit IRC13:31
fried_ricethanks mriedem. Recheckable or 100%?13:33
*** jmlowe has joined #openstack-nova13:36
*** bbowen has quit IRC13:37
mriedemi've been rechecking13:42
mriedemwith 12 hour turnaround on ci times plus there is a gerrit maintenance scheduled today...13:43
mriedemlooks like i should just quit early13:43
fried_ricemriedem: When I run this locally, I get 'oslo_config.cfg.NoSuchGroupError: no such group [oslo_messaging_notifications]'13:47
fried_riceTrying to find where that group is being registered...13:47
*** dims has joined #openstack-nova13:48
mriedemvia oslo.messaging opts13:48
mriedemi think in nova.rpc13:48
mnasermriedem: also, I had to turn down ovh-bhs1 (150 VMs) because it had networking issues13:49
mnaserand now I'm seeing a lot of RETRY_LIMIT failing jobs, which means that another provider is having issues (I suspect the other ovh region from what I was seeing)13:49
mriedemah this https://github.com/openstack/nova/blob/master/nova/config.py#L5013:49
mriedemmnaser: yeah i saw a lot of infra failures like that13:50
mnaserso we're pretty short on infra :\13:50
mnaserbut I 100% relate to the "let's fix our gates that are on fire instead of making code pretty" comment13:50
mriedemmnaser: has there been something else going on all week because it's taken half a day to get a result on a change at times13:50
mnasermriedem: afaik ovh has been having issues where a whole job will run and then fail to collect logs, restarting that job13:51
mnaserso in our case a 2-3h deploy job would run 3 times in a row and still fail13:51
mriedemah13:51
mnaserand even then it'd re-queue after the fail so takes ages to get a node assigned again13:52
mriedemyeah i saw the POST_FAILURES which is generally failing to collect and publish logs13:52
mnaserbecause there's so many other things that are trying to also retry13:52
mriedemfried_rice: or it might be this that registers those options https://github.com/openstack/nova/blob/master/nova/config.py#L6113:53
mnaserand today you're going to have an automated patch change all the references in your code to point to opendev and so13:53
mnaserenjoy that in advance and hopefully that doesn't break your world too13:53
* mriedem checks to see if there are any decent movies playing13:54
mriedemwhich core can i pay off to look at these 2 simple changes? https://review.openstack.org/#/q/topic:bug/1823781+(status:open+OR+status:merged)13:58
fried_riceI'll look. But your credit score is shit with me. "The nickel is in my backpack," suuuure.14:00
fried_ricetonyb[m]: "Now would be a good time to start brainstorming Forum topics while some of the14:00
fried_ricePTG discussions are fresh. Just a couple months until the Summit and Forum in14:00
fried_riceBerlin."14:00
fried_riceYou were just seeing if we were paying attention.14:00
mriedemfried_rice: it's still in there, i just can't ever remember to give it to you in person14:01
mriedemmy offer to mail it still stands14:01
fried_riceThen it'll be, "the nickel's in the mail." I've heard that one before too.14:02
fried_riceI get that same NoSuchGroupError in stein and rocky. So I'm guessing there's a bug in the test setup where the test env doesn't run through the code path that registers the options if I'm just running those two tests.14:04
mriedemhttps://photos.app.goo.gl/giiB1e8RD8WztkaQ814:05
mriedemi've set a reminder14:05
*** nicolasbock has quit IRC14:06
mriedemfried_rice: yeah i get the same: $ tox -r -e py36 -- nova.tests.unit.test_rpc --until-failure14:06
*** awalende has joined #openstack-nova14:07
fried_riceso that's a red herring for our gate failure :(14:07
*** jangutter has joined #openstack-nova14:09
fried_riceIf you actually pay me off, I won't be able to keep giving you crap about it. We need a different plan.14:12
*** jangutter has quit IRC14:14
mriedemwhat i'd like to know is what is calling oslo.config when the death spiral starts14:14
*** nicolasbock has joined #openstack-nova14:18
*** dims has quit IRC14:20
mriedemwell i fixed that latent traceback at least :)14:23
fried_riceRunning all of py36 locally I got these:14:23
fried_rice{2} nova.tests.unit.pci.test_utils.PciDeviceMatchTestCase.test_spec_extra_key [] ... inprogress14:23
fried_rice{7} nova.tests.unit.objects.test_instance_group.TestInstanceGroupObject.test_get_by_hint [] ... inprogress14:23
fried_riceoh, by the way, that ^ was on rocky14:24
*** mlavalle has joined #openstack-nova14:24
fried_ricewhat have we backported lately?14:25
mriedemshrug, lots of stuff14:26
mriedemwhat is getting an option with a namespace? https://github.com/openstack/oslo.config/blob/6.8.1/oslo_config/cfg.py#L261414:27
mriedemwtf is a config option namespace?14:27
mriedemstephenfin: ^?14:27
mriedemhttp://logs.openstack.org/45/649345/7/check/openstack-tox-py36/ba15c17/job-output.txt.gz#_2019-04-18_18_10_53_42395214:27
mriedemthat's where the stack overflow starts14:27
fried_riceand why am I suddenly getting14:28
fried_riceError: pg_config executable not found.14:28
fried_ricewhen rebuilding my venv??14:28
mriedemhttps://github.com/openstack/oslo.config/blob/6.8.1/oslo_config/cfg.py#L218314:28
mriedemmight be due to newer psycopg2?14:29
mriedemare you on 18.04?14:29
mriedembionic i mean14:29
*** vishakha has joined #openstack-nova14:30
fried_riceyes. Have been for three months. Apparently I suddenly needed libpq-dev14:30
fried_ricefour months14:31
fried_ricetime flies14:31
fried_riceOn master I just got14:31
fried_rice{6} nova.tests.unit.objects.test_instance.TestRemoteInstanceObject.test_create_with_extras [] ... inprogress14:31
fried_rice'>' not supported between instances of 'NoneType' and 'datetime.datetime'14:31
fried_riceso it's not RPC I guess14:31
mriedemheh i can't recreate oslo_config.cfg.NoSuchGroupError: no such group [oslo_messaging_notifications] now14:32
yan0sSo, novnc produces the error I mentioned earlier when both a key pair and a cloud-config (setting user password) are set14:32
mriedemfried_rice: that's what i saw yesterday when looking at this14:33
mriedemit's not in the logs every time the stack overflow happens, but i know the cells 1 removal stuff touched some of that code14:33
*** dims has joined #openstack-nova14:35
*** dr_gogeta86 has quit IRC14:41
*** awalende_ has joined #openstack-nova14:44
*** helenafm has quit IRC14:47
*** awalende has quit IRC14:47
*** awalende_ has quit IRC14:51
*** dims has quit IRC14:53
openstackgerritMatt Riedemann proposed openstack/nova master: Only set oslo_messaging_notifications.driver if using RPCFixture  https://review.openstack.org/65395414:55
*** dims has joined #openstack-nova14:56
*** nicolasbock has quit IRC14:59
*** dims has quit IRC15:01
*** lpetrut has joined #openstack-nova15:04
-openstackstatus- NOTICE: Gerrit is offline for several hours starting at 15:00 UTC to perform the opendev migration; see http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005011.html15:04
*** ChanServ changes topic to "Gerrit is offline for several hours starting at 15:00 UTC to perform the opendev migration; see http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005011.html"15:04
*** nicolasbock has joined #openstack-nova15:05
*** gyee has joined #openstack-nova15:06
*** yan0s has quit IRC15:07
*** dims has joined #openstack-nova15:07
*** davidsha has quit IRC15:09
*** plestang has quit IRC15:34
*** dims has quit IRC15:35
*** tjgresha has joined #openstack-nova15:40
kashyapHmm, https://review.openstack.org/ is offline15:43
*** boxiang has joined #openstack-nova15:43
dansmithkashyap: see topic15:44
*** dims has joined #openstack-nova15:44
*** mujahidali has joined #openstack-nova15:44
*** boxiang has quit IRC15:45
kashyapdansmith: Yeah, noticed it.  _Just_ when I was hitting send on a comment I saw it go down15:45
*** boxiang has joined #openstack-nova15:45
kashyapdansmith: But, I trained my muscle memory to copy it into an editor; before pasting it :-)15:46
kashyapdansmith: Shouldn't you be offline, too?15:46
dansmithno :)15:46
dansmiththis day has no special meaning for me15:46
kashyapHehe, nod.15:47
edleafeIt *is* the 244th anniversary of the start of the American Revolution. Surely *that* has meaning for you?15:48
*** sapd1_x has joined #openstack-nova15:56
mriedemweeeee https://bugs.launchpad.net/nova/+bug/182553716:00
openstackLaunchpad bug 1825537 in OpenStack Compute (nova) "finish_resize failures incorrectly revert allocations" [Medium,Triaged]16:00
mriedemmnaser: eandersson: ^ will add to the placement allocations are wrong woes16:01
mriedemi've got a recreate test locally which i'll push up once gerrit is back16:01
*** jangutter has joined #openstack-nova16:10
mriedemdansmith: interested in your opinion on the options i've laid out in https://bugs.launchpad.net/nova/+bug/1825537 for fixing it - comment 216:11
openstackLaunchpad bug 1825537 in OpenStack Compute (nova) "finish_resize failures incorrectly revert allocations" [Medium,Triaged]16:11
dansmithmriedem: this is getting really old16:11
dansmithcan't you do something positive to help the project?16:12
mriedemwould you like auto-formatted code style?16:12
mriedemit will bring the new contribs16:12
dansmithyes please!16:12
dansmithmriedem: I'm highly allergic to changing the host/node assignment timing16:13
dansmithnot because what we do today is good or right, but just because I'm afraid of breaking tons of other things that encode those assumptions16:14
*** jangutter has quit IRC16:14
mriedemdansmith: i.e. "I think this is probably not really an option because finish_resize has  never done this on failure and we don't really know what state the  instance is in"16:14
dansmithsure16:15
mriedemi don't think that's really a good option either, i just listed it for completeness16:15
*** tesseract has quit IRC16:16
mriedeman auto-revert is appealing but i very much doubt the revert code is graceful enough to handle it, b/c of baked in assumptions about the state of the world when a revert is started16:16
mriedeme.g. "cleaning up networking...wtf there is no networking on this host KABOOM!!!"16:17
* dansmith nods16:19
fried_rice mriedem: Did you already do the e-r thing for bug 1825435?16:22
openstackbug 1825435 in OpenStack Compute (nova) "TestRPC unit tests intermittently fail with "'>' not supported between instances of 'NoneType' and 'datetime.datetime'" - maybe due to "Fatal Python error: Cannot recover from stack overflow."" [High,Confirmed] https://launchpad.net/bugs/182543516:22
mriedemyeah16:22
fried_ricek16:22
mriedemhttp://status.openstack.org/elastic-recheck/#182543516:23
*** wwriverrat has joined #openstack-nova16:26
mriedemdansmith: ok option 1 fix is quick and easy and already done locally16:28
mriedemit's essentially just doing what confirm_resize does16:28
dansmithcool16:28
*** sapd1_x has quit IRC16:31
*** igordc has joined #openstack-nova16:32
*** mgoddard has quit IRC16:34
*** mgoddard has joined #openstack-nova16:35
*** lpetrut has quit IRC16:41
*** rpittau is now known as rpittau|afk16:42
*** mujahidali has quit IRC16:43
*** mgoddard has quit IRC16:44
*** mgoddard has joined #openstack-nova16:45
*** igordc has quit IRC16:46
*** gyee has quit IRC16:48
*** gyee has joined #openstack-nova16:49
*** mriedem is now known as mriedem_lunch16:57
*** kaliya has quit IRC17:26
*** vishakha has quit IRC17:27
*** igordc has joined #openstack-nova17:28
*** jmlowe has quit IRC17:33
eanderssonSweet mriedem_lunch18:00
eanderssonfried_rice, We have this in Rocky as well. I don't think it ever worked.18:01
eanderssonhttps://github.com/openstack-dev/devstack/blob/master/lib/nova_plugins/hypervisor-ironic#L4418:01
eanderssonWe don't add a region_name, so that path was never really tested.18:02
*** jangutter has joined #openstack-nova18:11
*** mriedem_lunch is now known as mriedem18:15
*** jangutter has quit IRC18:15
*** jmlowe has joined #openstack-nova18:19
*** dims has quit IRC18:26
*** jaypipes_ has joined #openstack-nova18:28
*** dims has joined #openstack-nova18:29
*** jaypipes has quit IRC18:30
*** edmondsw has quit IRC18:33
*** edmondsw_ has joined #openstack-nova18:35
*** bryan_stephenson has joined #openstack-nova18:36
fried_riceeandersson: Okay. It's nearly impossible to test with all the permutations of ksa opts, times all the permutations of service catalog and endpoint setups. Inevitable that we missed things.18:43
eanderssonYep18:43
eanderssonFor sure.18:43
*** wwriverrat has quit IRC18:43
eanderssonI opened a PR to have it added.18:43
fried_riceeandersson: IIUC this is only reproducible when there's a real ironicclient listening on the other end?18:44
eanderssonThe problem is really that the real ironicclient hides this.18:44
fried_riceright18:44
fried_riceoh18:44
fried_ricewait, you mean the *real* real ironicclient?18:44
eanderssonSince when you pass on a None value, the ironicclient tries to figure it out.18:45
eanderssonYep18:45
fried_riceso... where is it a problem?18:45
eanderssonWell the problem is that ironicclient is smart, but not smart enough18:46
eanderssonwhen you pass on a None endpoint it just gets the first endpoint in the catalog18:46
eanderssonand does not take region into account18:47
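The behaviour eandersson describes can be illustrated with a small, self-contained sketch. This is a simplified, hypothetical model of a service catalog, not ironicclient's actual code; the catalog entries, URLs, and function names are all assumptions made up for illustration.

```python
# Simplified, hypothetical model of a service catalog (not ironicclient code).
CATALOG = [
    {"service_type": "baremetal", "region": "regionTwo",
     "url": "http://ironic.two.example/"},
    {"service_type": "baremetal", "region": "regionOne",
     "url": "http://ironic.one.example/"},
]


def first_endpoint(catalog, service_type):
    """Return the first URL for the service type, ignoring region."""
    for entry in catalog:
        if entry["service_type"] == service_type:
            return entry["url"]
    return None


def endpoint_for_region(catalog, service_type, region_name):
    """Return the URL for the service type in the given region, or None."""
    for entry in catalog:
        if (entry["service_type"] == service_type
                and entry["region"] == region_name):
            return entry["url"]
    return None
```

With more than one region in the catalog, the region-blind lookup can return an endpoint from a region the operator did not configure, which is the risk raised here.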
*** dims has quit IRC18:48
eanderssonThere have been multiple bugs with this code path, so it's been difficult to know exactly where things are failing18:48
eanderssone.g. https://github.com/openstack/python-ironicclient/commit/466be3b6568b643605d826e5aa26d9a344cc74ae18:49
fried_riceI don't actually see where we're taking region into account at all.18:50
eanderssonYea - that was my first assumption.18:51
fried_riceoh, there it is.18:51
*** dims has joined #openstack-nova18:51
eanderssonBut it actually looks like it works.18:51
eanderssonI assumed that > get_endpoint(region_name=bla)18:51
fried_riceeandersson: I'm still not understanding exactly what environment and setup it takes to make it not work, and also how it fails.18:51
eanderssonhad to look like this18:51
eanderssonfried_rice, if you add region_name, get_endpoint(..) in nova will always return None18:52
fried_riceno, by the time we hit get_endpoint(), region_name has already been consumed from the conf. That happens on load_adapter_from_conf_options.18:52
eanderssonhttp://logs.openstack.org/39/653839/2/check/ironic-tempest-ipa-wholedisk-bios-agent_ipmitool-tinyipa/63a832a/controller/logs/screen-n-cpu.txt.gz#_Apr_19_03_21_25_24377718:52
eanderssonI just added if not endpoint into the Nova code here and ran the test with region_name=regionOne set18:53
fried_riceAnd your service catalog has your ironic endpoint in regionOne?18:53
eanderssonYep18:57
eanderssonOnce review.openstack is up again I'll link you the changes I did for testing18:57
fried_riceOkay. I don't have a way to do ironic stuff locally. Is there a way to spin up the service with no nodes?18:59
eanderssonThey have a noop driver you can probably use.19:01
*** bryan_stephenson has quit IRC19:04
eanderssonI was just doing something like this for testing http://paste.openstack.org/show/uANO7kLnU4NcTJrB2ywE/19:07
eanderssonand ironic can be set up with a super basic config to provide api only19:08
eanderssonSince you don't actually need to create nodes to test this19:09
eanderssonhttp://logs.openstack.org/39/653839/2/check/ironic-tempest-ipa-wholedisk-bios-agent_ipmitool-tinyipa/63a832a/controller/logs/etc/ironic/ironic_conf.txt.gz19:09
fried_riceeandersson: But you're not actually even getting to ironicclient by the time you hit the problem.19:15
eanderssonSorry, that was for a different issue.19:15
fried_riceI think you're loading the ksa auth, session, and adapter from conf...19:16
eanderssonJust an example.19:16
fried_riceand then asking the adapter to get_endpoint()...19:16
fried_ricewhich is going to the service catalog looking for a matching endpoint19:16
fried_riceEither it doesn't find it there, or it finds it and then goes and queries that endpoint - the ironic API itself - to do the discovery.19:16
fried_riceOne of those two steps is failing to yield results.19:16
*** bbowen has joined #openstack-nova19:17
fried_riceeandersson: You have an env where you're able to reproduce this, or just in zuul?19:20
eanderssonI do have an env yea19:21
eanderssonI did it locally first, but as the way I have it set up is very specific to our env I wanted to confirm in zuul as well.19:21
fried_riceokay, can you throw some debug logs into...19:21
eanderssonSure19:21
fried_ricehttps://github.com/openstack/nova/blob/master/nova/virt/ironic/client_wrapper.py#L122-L13319:22
fried_riceSpecifically, after L122 to see if that's reached; vs at L127 to see if get_endpoint raised an exception - should be one or the other - and then after L130 to see what get_client returns.19:24
fried_riceeandersson: ^19:24
eanderssonSo that is what I added into the Zuul logs19:24
eanderssonksa_adap.get_endpoint() returns None19:24
eanderssonironic.client.get_client(...) returns the endpoint19:25
fried_ricethat's... pretty weird.19:25
eanderssonIf I remove lines 120-121, get_endpoint() returns the endpoint19:25
fried_riceWhat does the version document at your ironic endpoint say?19:25
fried_ricelike if you just curl the root endpoint19:26
*** jaypipes_ is now known as jaypipes19:26
eanderssonhttp://paste.openstack.org/show/eXVi99OZVrjJ6V9C496P/19:31
*** igordc has quit IRC19:36
*** bbowen has quit IRC19:38
fried_riceeandersson: Oh. Ahem. That min_version/max_version thing is just wrong. No wonder it's borked.19:40
fried_riceThose are supposed to be major version numbers, not microversions.19:40
fried_riceyeah, those should just be removed entirely.19:40
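Why a microversion used as min_version can make discovery come back empty can be sketched as a plain tuple comparison. This is a minimal model under stated assumptions, not keystoneauth's implementation: the value of IRONIC_API_VERSION, the LATEST sentinel, and the major version the endpoint advertises are all assumed for illustration.

```python
# Hypothetical stand-ins; the concrete values are assumptions, not from the log.
IRONIC_API_VERSION = (1, 38)  # a microversion, assumed for illustration
LATEST = float("inf")         # stand-in for a "latest" sentinel


def version_between(min_version, max_version, candidate):
    """Return True if candidate falls within [min_version, max_version]."""
    return min_version <= candidate <= max_version


# Major version assumed to be advertised by the endpoint's version document.
discovered = (1, 0)

# Microversion as the lower bound: the advertised major version is excluded.
microversion_range_matches = version_between(
    IRONIC_API_VERSION, (1, LATEST), discovered)

# Major-version lower bound: the advertised version is accepted.
major_range_matches = version_between((1, 0), (1, LATEST), discovered)
```

Under these assumptions the first check fails and the second succeeds, because the advertised major version sorts below the microversion used as the lower bound.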
*** ttsiouts has joined #openstack-nova19:40
fried_riceye gods, who *wrote* this crap??19:40
*** whoami-rajat has quit IRC19:40
fried_riceeandersson: Do you have a bug for this?19:41
*** hongbin has joined #openstack-nova19:42
*** slaweq has joined #openstack-nova19:48
fried_ricemriedem: I thought I could get rid of, "There is no script for 63 version" by rebuilding venv, wth?19:56
mriedemis that an api db version?19:58
*** ttsiouts_ has joined #openstack-nova19:58
mriedemdo you have stale pycs or something?19:59
eanderssonno bug yet fried_rice20:00
fried_riceeandersson: Okay, we're going to need one20:00
fried_riceeandersson: Because this fix will want to be backported20:00
fried_ricemriedem: Must be.20:00
fried_ricemriedem: seems to have cleared up now20:00
*** ttsiouts has quit IRC20:00
eanderssonI'll create one in a bit20:01
fried_ricethanks20:01
fried_ricemriedem: I was on the second of those patches, 'Soft delete virtual_interfaces when instance is destroyed' -- test passes when fix is reverted.20:02
mriedemok i'll look in a bit20:03
*** ralonsoh has joined #openstack-nova20:08
*** jangutter has joined #openstack-nova20:12
fried_riceeandersson: So this https://github.com/openstack-dev/devstack/blob/master/lib/nova_plugins/hypervisor-ironic#L44 isn't setting api_endpoint. Which means it still ought to be going through the code path that does get_ksa_adapter with the bogus min_version and max_version kwargs. Which should still be returning None. Which should still be attempting to call get_client with endpoint=None. So I don't get how adding region_name to20:12
fried_ricebecause region_name isn't in the kwargs sent to get_client20:13
fried_riceget_endpoint with min/max but no region in the conf somehow magically works, but adding region_name in the conf makes it not work until you remove the min/max version?? /me confused.20:15
*** jangutter has quit IRC20:16
*** ralonsoh has quit IRC20:26
eanderssonYea - not sure why it works without the region_name20:28
mriedemfried_rice: test_delete_virtual_interfaces_on_instance_destroy fails for me locally if i remove the fix20:30
*** slaweq has quit IRC20:37
eanderssonI wrote a bug report but... launchpad error'd out :'(20:42
fried_riceeandersson: Yeah, I won't be able to post the fix until all the things come back anyway.20:45
fried_riceeandersson: One other data point that would be interesting:20:45
fried_riceThe failure setup, i.e. with region_name and min_version and max_version, but change min_version from IRONIC_API_VERSION to the tuple (1, 0)20:46
eanderssonYea that seems to work20:49
fried_riceeandersson: Okay, good, theory confirmed. Soon as the dark, dark night is over, we can have this fix pushed.20:50
fried_ricethough it's still a mystery why it doesn't break until you specify region_name. mordred, you around or chasing bunnies?20:51
mordredchasing finishing this opendev rollout20:51
fried_riceoh, you're involved in that? I'm sorry.20:52
fried_ricePlease do carry on.20:52
mordredsorry - I saw the note from eandersson earlier - just haven't gotten to it.20:52
mordredwe're having all the fun20:52
fried_riceyeah, I don't want to distract you from that for sure.20:53
eanderssonbtw fried_rice would be nice if we could error out if endpoint is None20:53
*** hongbin has quit IRC20:53
fried_riceeandersson: Well, if you look at the comment there, I deliberately didn't do that20:53
eanderssonbecause at the moment if this gets fixed, and nova is misconfigured it will look like it's working20:53
*** hongbin has joined #openstack-nova20:53
eanderssonYea - saw your comment on that20:53
eanderssonAt the very least I believe we should log it20:53
fried_riceI was basically paranoid about changing the nature of legacy broken behavior.20:54
fried_riceYeah, I can get behind a log for sure.20:54
eanderssonbecause it can lead to some bad stuff20:54
eanderssonor maybe we can pass on region to the ironic client20:54
eanderssonbecause we are essentially telling the ironic client, hey if I fail, just make the best judgment you can.... but I will not give you all the info you need20:55
fried_riceeandersson: Well, I would rather rip out ironicclient altogether and use the sdk.20:55
eanderssonI do like that idea :p20:55
fried_riceeandersson: https://blueprints.launchpad.net/nova/+spec/openstacksdk-in-nova20:56
fried_rice(I wasn't being entirely hypothetical)20:56
eanderssonYep - already commented on the very initial implementation :P20:56
fried_ricecool. I would have known that if I could have referenced the gerrit part of my extended brain.20:56
eanderssonhttps://github.com/openstack/python-ironicclient/blob/master/ironicclient/client.py#L11121:00
eanderssonI still think we should pass on region_name to the client if we keep the current behavior with a possible endpoint = None21:00
eanderssonThat way at the least the ironicclient can make an educated decision21:01
fried_riceThat can be done. I think I would prefer it to be a separate change though. It's clearly a different bug from the bogus min_version thing.21:02
eanderssonFor sure21:02
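(Editor's sketch of the two ideas discussed above: log when endpoint discovery returns None instead of raising, and pass region_name on to the client. The helper name build_client_kwargs and the kwargs layout are hypothetical; this is not nova's client_wrapper code or ironicclient's API, only a minimal illustration under those assumptions.)

```python
import logging

LOG = logging.getLogger(__name__)


def build_client_kwargs(endpoint, region_name=None):
    """Hypothetical helper; names and kwargs layout are assumptions."""
    kwargs = {"endpoint": endpoint}
    if endpoint is None:
        # Log instead of raising, so the legacy fallback behavior is
        # preserved but a misconfiguration no longer passes silently.
        LOG.warning(
            "Could not discover the ironic endpoint; leaving discovery "
            "to the client (region_name=%s).", region_name)
    if region_name is not None:
        # Hand the region along so the client can make an educated choice.
        kwargs["region_name"] = region_name
    return kwargs
```

(With endpoint=None and a region set, the returned dict still carries region_name, so the fallback path has the information it needs.)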
fried_riceugh, this is such a mess21:03
fried_riceironicclient is doing all this preliminary stuff that's almost (but not quite, and incorrectly) duplicating a subset of the logic in ksa itself.21:04
eanderssonYea - this took me hours to figure out21:05
eanderssonBecause there were actually 3 separate bugs21:05
fried_riceand for that matter, nova's client_wrapper is *also* doing more of that21:05
fried_ricemainly for backward compatibility21:05
eanderssonhttps://github.com/openstack/python-ironicclient/commit/466be3b6568b643605d826e5aa26d9a344cc74ae21:07
eanderssonThis bug added A LOT of confusion for me21:08
fried_ricemriedem: Confirmed the test fails when the fix is removed. Something bizarre happening with cached .pyc files maybe, would explain a couple of the weird things that have befallen me today. So I'll be +A on that guy when the world comes back up.21:08
fried_riceeandersson: Oh yeah, that one had a knock-on effect resulting in several other bugs.21:08
melwittmriedem: re: tripleo, I think it supports neither deploy nor upgrade with extracted placement based on the last I heard from lyarwood and EmelienM. this isn't going to work right now but the relevant patches are https://review.openstack.org/#/c/630644/ and https://review.openstack.org/#/q/topic:tripleo-placement-extraction+status:open and https://review.openstack.org/#/q/topic:tripleo-placement-upgrade-from-nova-placement+status:open21:09
fried_riceeandersson: like https://github.com/openstack/python-ironicclient/commit/ae1743d2c194c690c4d4629e51e860b5f5b8425221:09
melwittmriedem: I'm not 100% clear on whether deploy is supported in any capacity or if it's just that it can't deploy it by default yet21:09
mriedemmelwitt: ok21:12
melwittit's being worked on and supposed to ramp up soon. there's an additional person who's going to help who has taken on the task recently21:16
*** IvensZambrano has joined #openstack-nova21:19
*** IvensZambrano is now known as snevi21:19
imacdonnwho knows stuff about eventlet monkey-patching and WSGI? Re. the problem I reported on the ML, with nova-API not maintaining the heartbeat on AMQP connections when running under uWSGI, I've discovered that the problem goes away when I remove the eventlet monkey-patching21:24
fried_riceimacdonn: isn't mdbooth_ our resident monkey-patching expert, and cdent our resident wsgi expert? I kinda think you might have poor luck raising the former, and I know the latter is on vacation.21:28
melwittimacdonn: mdbooth_ landed a patch to change eventlet monkey-patching around in train, that you could try to see if it helps if you were up for an experiment, but I can't link it to you right now bc gerrit21:30
melwittI guess I could find it in github21:31
imacdonnmelwitt, Are you sure that didn't make Stein? i.e. this: https://github.com/openstack/nova/commit/3c5e2b0e9fac985294a949852bb8c83d4ed77e0421:31
imacdonnproblem started before that, and continues with that, though21:31
melwittimacdonn: I'm sure. it's proposed for backport to stein but we wanted to let it bake longer and see if anyone else was having problems first. bc only RHOSP reported the issue21:32
imacdonnmelwitt: hmm. Not finding anything beyond the above (which was definitely about "moving it around")21:33
imacdonnmelwitt: maybe not merged to master yet?21:34
melwittimacdonn: no, that's the one, it's just it's not in stein21:34
fried_ricedefinitely merged in master21:34
melwittit's in train21:34
melwittI was saying you could try applying that patch to see if it helps21:35
imacdonnhmm, that's odd .. I seem to have that change ... which suggests either the RDO Stein packages are not actually Stein, or I'm doing something stupid21:36
*** ianw_pto is now known as ianw21:37
imacdonn# rpm -qf nova/monkey_patch.py21:37
imacdonnpython2-nova-19.0.0-1.el7.noarch21:37
melwittohhh21:38
melwittI bet we (redhat) backported it downstream only. but I wasn't sure we do that for RDO21:39
fried_riceWow. That seems kinda crazy.21:39
melwittI guess they did in this case at least bc otherwise it was totally broken for us21:39
fried_riceBut I guess this was a special case because RH was busted.21:39
fried_riceyeah21:39
fried_ricebut imacdonn you said the problem existed before this commit as well? Or is that erroneous in light of revelations above?21:40
imacdonnso, in any case, my problem is present both with and without that change21:40
fried_riceokay.21:40
melwittok, was worth a shot21:41
melwittI have to run now, I'm off today because it's Spring Holiday™. see yall next week21:41
imacdonnk, have fun ;)21:41
fried_ricePesach Tov.21:44
fried_riceI'm going to hit the metaphorical road as well. eandersson, I have your fix ready; please add me to that bug once you have it.21:51
*** fried_rice is now known as efried21:51
imacdonnhave a good w/e, efried21:52
efriedo/21:52
*** mriedem has quit IRC21:53
eanderssonLet me try to create it again21:55
eanderssonefried, https://bugs.launchpad.net/nova/+bug/182558322:02
openstackLaunchpad bug 1825583 in OpenStack Compute (nova) "Region name isn't respected when configuring ironic" [Undecided,New]22:02
eanderssonI wish I would have saved the original one I wrote.22:02
*** jangutter has joined #openstack-nova22:12
*** jangutter has quit IRC22:17
eanderssonI tried to keep the description searchable. So I didn't actually mention the min/max version stuff.22:18
*** alex_xu has joined #openstack-nova22:27
alex_xugerrit is down?22:27
*** snevi has quit IRC22:43
mnaseralex_xu: see topic22:47
alex_xumnaser: thanks22:48
mordredwe're having a fun day22:51
*** lbragstad has quit IRC23:11
*** ttsiouts_ has quit IRC23:22
*** ttsiouts has joined #openstack-nova23:37
*** luksky has quit IRC23:39
*** hongbin has quit IRC23:41
*** ttsiouts has quit IRC23:41
*** nicolasbock has quit IRC23:45
