Friday, 2020-03-13

*** CeeMac has quit IRC00:10
*** tosky has quit IRC00:18
openstackgerritSundar Nadathur proposed openstack/nova master: ksa auth conf and client for Cyborg access  https://review.opendev.org/63124200:32
openstackgerritSundar Nadathur proposed openstack/nova master: Add Cyborg device profile groups to request spec.  https://review.opendev.org/63124300:32
openstackgerritSundar Nadathur proposed openstack/nova master: Create and bind Cyborg ARQs.  https://review.opendev.org/63124400:32
openstackgerritSundar Nadathur proposed openstack/nova master: Pass accelerator requests to each virt driver from compute manager.  https://review.opendev.org/69858100:32
openstackgerritSundar Nadathur proposed openstack/nova master: Compose accelerator PCI devices into domain XML in libvirt driver.  https://review.opendev.org/63124500:32
openstackgerritSundar Nadathur proposed openstack/nova master: Delete ARQs for an instance when the instance is deleted.  https://review.opendev.org/67373500:32
openstackgerritSundar Nadathur proposed openstack/nova master: Enable hard/soft reboot with accelerators.  https://review.opendev.org/69794000:32
openstackgerritSundar Nadathur proposed openstack/nova master: Enable start/stop of instances with accelerators.  https://review.opendev.org/69955300:32
openstackgerritSundar Nadathur proposed openstack/nova master: Enable and use COMPUTE_ACCELERATORS trait.  https://review.opendev.org/69955400:32
openstackgerritSundar Nadathur proposed openstack/nova master: Bump compute rpcapi version and reduce Cyborg calls.  https://review.opendev.org/70422700:32
openstackgerritSundar Nadathur proposed openstack/nova master: Block unsupported instance operations with accelerators.  https://review.opendev.org/67472600:32
openstackgerritSundar Nadathur proposed openstack/nova master: Add cyborg tempest job.  https://review.opendev.org/67099900:32
*** tetsuro has joined #openstack-nova00:36
*** vishalmanchanda has joined #openstack-nova00:43
*** mlavalle has quit IRC00:46
openstackgerritMerged openstack/nova stable/stein: Fix os-keypairs pagination links  https://review.opendev.org/71189600:53
*** tbachman_ has joined #openstack-nova00:56
*** tbachman has quit IRC00:56
*** tbachman_ is now known as tbachman00:56
*** tetsuro_ has joined #openstack-nova01:01
*** tetsuro has quit IRC01:05
*** gyee has quit IRC01:05
*** CeeMac has joined #openstack-nova01:05
*** zhanglong has joined #openstack-nova01:10
*** tbachman_ has joined #openstack-nova01:13
*** tbachman has quit IRC01:14
*** tbachman_ is now known as tbachman01:14
brinzhanggmann:hi01:25
brinzhanggmann: I combined the os-instance-actions policy patch, and that has previous issue in cmd/test_policy https://review.opendev.org/#/c/706470/01:27
brinzhanggmann: can you have a fast check ?01:27
*** spatel has joined #openstack-nova01:28
*** dave-mccowan has joined #openstack-nova01:32
*** spatel has quit IRC01:32
*** mkrai has joined #openstack-nova02:37
*** ociuhandu has joined #openstack-nova02:43
*** psachin has joined #openstack-nova02:43
*** ociuhandu has quit IRC02:48
*** spatel has joined #openstack-nova03:02
*** spatel has quit IRC03:13
*** CeeMac has quit IRC03:15
*** mkrai has quit IRC03:17
*** mkrai has joined #openstack-nova03:18
*** psachin has quit IRC03:30
*** dave-mccowan has quit IRC03:42
*** tetsuro_ has quit IRC03:54
*** zhanglong has quit IRC03:55
*** tetsuro has joined #openstack-nova04:00
openstackgerritMerged openstack/nova master: Catch exception when use invalid architecture of image  https://review.opendev.org/71136304:09
*** udesale has joined #openstack-nova04:40
openstackgerritBrin Zhang proposed openstack/python-novaclient master: Microversion 2.83: Add volume-patch CLI  https://review.opendev.org/71265104:52
*** mkrai has quit IRC05:02
*** bnemec has quit IRC05:19
*** bnemec has joined #openstack-nova05:28
openstackgerritBrin Zhang proposed openstack/nova master: Add new default roles in os-instance-actions policies  https://review.opendev.org/70647005:33
*** evrardjp has quit IRC05:35
*** evrardjp has joined #openstack-nova05:36
*** links has joined #openstack-nova05:36
*** mkrai has joined #openstack-nova05:53
*** dpawlik has quit IRC06:37
*** dpawlik has joined #openstack-nova06:55
*** dpawlik has quit IRC07:10
*** dpawlik has joined #openstack-nova07:16
*** rpittau|afk is now known as rpittau07:24
*** factor has joined #openstack-nova07:50
*** slaweq has joined #openstack-nova07:56
*** damien_r has joined #openstack-nova08:00
*** ociuhandu has joined #openstack-nova08:02
*** ociuhandu has quit IRC08:07
*** maciejjozefczyk has joined #openstack-nova08:08
openstackgerritBrin Zhang proposed openstack/nova master: Add SYSTEM_READER role to servers actions API  https://review.opendev.org/70617908:11
*** tesseract has joined #openstack-nova08:12
*** amoralej|off is now known as amoralej08:17
*** ociuhandu has joined #openstack-nova08:20
*** ociuhandu has quit IRC08:24
*** rcernin has quit IRC08:27
*** tkajinam has quit IRC08:31
*** ociuhandu has joined #openstack-nova08:40
*** tosky has joined #openstack-nova08:46
*** ociuhandu has quit IRC08:49
*** ociuhandu has joined #openstack-nova08:50
*** xek has joined #openstack-nova08:50
kashyapmelwitt: stephenfin: Just caught up with that "lift the SCSI unit restriction" libvirt issue: I'm pretty damn sure it's fixed upstream _and_ for CentOS / RHEL08:58
*** ralonsoh has joined #openstack-nova08:59
kashyapOkay, I see that stephenfin has pointed out the RHBZ08:59
kashyapmelwitt: stephenfin: It looks like Ubuntu Virt folks need to backport that patch series09:00
*** ociuhandu has quit IRC09:02
*** links has quit IRC09:04
*** ociuhandu has joined #openstack-nova09:14
*** ociuhandu has quit IRC09:15
*** ociuhandu has joined #openstack-nova09:16
*** ociuhandu has quit IRC09:25
*** ociuhandu has joined #openstack-nova09:26
*** derekh has joined #openstack-nova09:33
*** martinkennelly has joined #openstack-nova09:35
*** jangutter has joined #openstack-nova09:45
*** lpetrut has joined #openstack-nova09:53
*** ociuhandu has quit IRC09:53
*** jangutter has quit IRC09:57
*** jangutter has joined #openstack-nova09:58
*** tbachman has quit IRC10:04
*** lpetrut has quit IRC10:05
*** lpetrut has joined #openstack-nova10:09
*** ociuhandu has joined #openstack-nova10:10
*** lpetrut has quit IRC10:10
*** lpetrut has joined #openstack-nova10:12
*** ociuhandu has quit IRC10:12
*** ociuhandu has joined #openstack-nova10:12
*** tetsuro has quit IRC10:19
*** ociuhandu has quit IRC10:22
*** ociuhandu has joined #openstack-nova10:24
*** ociuhandu has quit IRC10:28
*** mkrai has quit IRC10:36
*** mkrai has joined #openstack-nova10:37
*** mkrai has quit IRC10:44
*** ociuhandu has joined #openstack-nova11:00
*** mvkr has quit IRC11:08
*** udesale_ has joined #openstack-nova11:09
*** mvkr has joined #openstack-nova11:10
*** udesale has quit IRC11:12
*** nicolasbock has joined #openstack-nova11:25
*** rpittau is now known as rpittau|bbl11:28
*** jangutter has quit IRC11:29
*** hoonetorg has quit IRC11:47
*** ociuhandu has quit IRC11:48
*** hoonetorg has joined #openstack-nova12:00
*** nicolasbock has quit IRC12:07
*** nicolasbock has joined #openstack-nova12:07
*** ociuhandu has joined #openstack-nova12:09
*** mkrai has joined #openstack-nova12:12
*** udesale_ is now known as udesale12:16
*** jangutter has joined #openstack-nova12:23
* gibi is on and off during the day due to downstream stuff12:24
*** grandchild has joined #openstack-nova12:36
*** grandchild has quit IRC12:44
mnaserhas anyone seen this in nova's ci?12:48
mnaserlibvirt.libvirtError: Requested operation is not valid: format of backing image '/var/lib/nova/instances/_base/791111176e5cb97db82b0a71a670431f65838a05' of image '/var/lib/nova/instances/ff811290-bbb0-4d30-b1e7-ee9aec4913ee/disk' was not specified in the image metadata (See https://libvirt.org/kbase/backing_chains.html for troubleshooting)12:48
mnaserOSA's jobs are failing with that and i wonder if it's because the libvirt version with bionic is different... maybe12:48
mnaseri found this - https://bugzilla.redhat.com/show_bug.cgi?id=179814812:50
openstackbugzilla.redhat.com bug 1798148 in libvirt "Regression: Requested operation is not valid: format of backing image ... was not specified in the image metadata" [Unspecified,On_qa] - Assigned to pkrempa12:50
mnaserok i see https://bugs.launchpad.net/nova/+bug/186402012:50
openstackLaunchpad bug 1864020 in OpenStack Compute (nova) "libvirt.libvirtError: Requested operation is not valid: format of backing image %s of image %s was not specified in the image metadata (See https://libvirt.org/kbase/backing_chains.html for troubleshooting)" [Undecided,In progress] - Assigned to Lee Yarwood (lyarwood)12:50
*** vishalmanchanda has quit IRC13:03
lyarwoodmnaser: yeah that should be resolved now13:03
*** rpittau|bbl is now known as rpittau13:04
mnaserlyarwood: cool, we are bumping our versions which should help with this in OSA.  thanks for looking into it!13:05
lyarwoodmnaser: np, which jobs hit this in OSA btw? You need a very recent version of libvirt to hit this.13:06
* lyarwood only found this with a new fedora virt-preview job13:07
*** jangutter has quit IRC13:09
sean-k-mooneygibi: efried_gone https://review.opendev.org/#/c/676522/44/nova/compute/resource_tracker.py@1738 i think that addresses the duplicate resource provider definitions in different yaml files13:10
sean-k-mooneyalso is efried_gone gone permanently now or will eric be back before he finishes his current role13:10
gibisean-k-mooney: i will look at the provider series next week. thanks for taking it over13:17
sean-k-mooneygibi: no rush bar m3, i still need to address the final patch and then look at the testing13:17
sean-k-mooneyi think i covered the main functional changes already however13:18
gibisean-k-mooney: ack13:18
gibim3 is in 4 weeks13:19
sean-k-mooneyyep so no rush13:19
gibish*t i have to fix up the qos series too til13:19
gibianyhow I expect a rush of reviews at m3 as usual13:20
efried_gonesean-k-mooney: I'm gone gone. Spending maybe a few minutes a day paying attention to OpenStack stuff.13:22
sean-k-mooneycool good to know. ill miss having you around13:22
*** lbragstad__ has quit IRC13:29
*** lbragstad has joined #openstack-nova13:31
*** mriedem has joined #openstack-nova13:36
*** nweinber has joined #openstack-nova13:36
*** rcernin has joined #openstack-nova13:41
*** jangutter has joined #openstack-nova13:41
*** mlavalle has joined #openstack-nova13:44
kashyapefried_gone: Was good knowing you; see you on the Other Side(tm).13:48
kashyapAnd thanks for all the outstanding (as in, excellent, not "remaining") work! ;-)13:48
kashyapmnaser: Hey, yes...13:49
kashyapmnaser: That issue is known to be due to a libvirt regression, and lyarwood got a patch merged to solve it in Nova -- which is the right thing to do _anyway_13:49
kashyapmnaser: See this one:13:50
stephenfinlyarwood, smcginnis: Think you folks could hit this backport from elod today? It fixes issues with failing tests seen in other backports :) https://review.opendev.org/#/c/712751/13:50
kashyapmnaser: https://review.opendev.org/#/c/708745/ ("libvirt: Provide the backing file format when creating qcow2 disks")13:50
kashyapmnaser: It's merged in master, Train backport in-progress.13:50
stephenfin(like this one https://review.opendev.org/#/c/711670/)13:50
kashyapOh, lyarwood already answered it; silly me.  I should read the scrollback in full before spamming the channel.13:51
*** amoralej is now known as amoralej|lunch13:55
*** mkrai has quit IRC13:56
*** lucidguy has quit IRC13:57
*** ociuhandu has quit IRC13:59
*** factor has quit IRC14:00
*** factor has joined #openstack-nova14:00
lyarwoodkashyap: looking14:02
lyarwoodkashyap: ah this one14:03
brinzhangstephenfin: https://review.opendev.org/#/c/712651/ could you please check this novaclient patch? I added a PATCH ``volume-patch`` CLI and debugged it, but could not find what is wrong14:03
kashyaplyarwood: Yeah, sorry for the noise.14:03
*** factor has quit IRC14:03
*** factor has joined #openstack-nova14:04
lyarwoodstephenfin: ack I'll take a swing later today14:05
*** ociuhandu has joined #openstack-nova14:07
*** ociuhandu has quit IRC14:07
*** ociuhandu has joined #openstack-nova14:07
*** TxGirlGeek has joined #openstack-nova14:09
*** bnemec is now known as beekneemech14:10
stephenfinbrinzhang: You need to add a 'patch_servers_1234_os_volume_attachments_Work' method to 'FakeSessionClient' in 'novaclient/tests/unit/v2/fakes.py'14:10
stephenfinwith a response mocking what you'd see from nova-api14:11
brinzhangstephenfin: cool, thanks. I will try that; I was missing that response in 'FakeSessionClient'14:13
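[Editor's note] stephenfin's suggestion can be sketched as follows. This is a simplified, self-contained stand-in for the fake client, not the real class in 'novaclient/tests/unit/v2/fakes.py'. The method-name dispatch is reduced to its essentials, and the header value and the response body are hypothetical placeholders, since the log does not show what nova-api would return for the proposed PATCH call.

```python
# Placeholder headers; the value is made up for illustration.
FAKE_RESPONSE_HEADERS = {"x-openstack-request-id": "req-1234"}


class FakeSessionClient:
    """Simplified stand-in for the fake HTTP client used in unit tests."""

    def request(self, url, method, **kwargs):
        # Turn "PATCH /servers/1234/os-volume_attachments/Work" into the
        # handler name "patch_servers_1234_os_volume_attachments_Work".
        munged = url.split("?", 1)[0].strip("/")
        for char in "/.-":
            munged = munged.replace(char, "_")
        callback = "%s_%s" % (method.lower(), munged)
        if not hasattr(self, callback):
            raise AssertionError(
                "Called unknown API method: %s %s, expected fakes method "
                "name: %s" % (method, url, callback))
        return getattr(self, callback)(**kwargs)

    def patch_servers_1234_os_volume_attachments_Work(self, **kw):
        # Hypothetical body: mock whatever nova-api would really return.
        body = {"volumeAttachment": {"volumeId": "Work",
                                     "serverId": "1234"}}
        return (200, FAKE_RESPONSE_HEADERS, body)


client = FakeSessionClient()
status, headers, body = client.request(
    "/servers/1234/os-volume_attachments/Work", "PATCH",
    body={"volumeAttachment": {"delete_on_termination": True}})
print(status, body)
```

Without the handler method, a request for that URL raises an AssertionError that names the missing method, which is the kind of failure a unit test hits when the fake response has not been added yet.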
*** factor has quit IRC14:14
*** factor has joined #openstack-nova14:14
*** factor has quit IRC14:15
sean-k-mooneystephenfin: i can't cross-link to source code from nova docs, right?14:17
brinzhangstephenfin: thanks, I will complete its unit tests, and then update ^^14:17
stephenfinsean-k-mooney: Only if the code is autodoc'd somewhere, and I think only the notifier stuff falls in that bracket14:17
stephenfinI _think_ you should still be able to use e.g. :py:meth:`nova.foo.bar` but it won't resolve to anything14:18
sean-k-mooneystephenfin: ok in that case i think ill have to link to opendev/github14:18
sean-k-mooneystephenfin: context is https://review.opendev.org/#/c/693460/18/doc/source/admin/managing-resource-providers.rst,unified@6614:18
sean-k-mooneyjust trying to figure out how i would go about linking to the schema file for the provider.yaml validation14:19
sean-k-mooneyi think an external link is the only way to do it so ill figure out what the opendev path would be14:19
stephenfinyou could just '.. include' it14:19
sean-k-mooneyoh i didnt know you could do that am ill try14:20
*** ociuhandu has quit IRC14:23
brinzhangjohnthetubaguy: I resolved the os-instance-actions policy issue, and made the GET API policies granular, pls review again. https://review.opendev.org/#/c/706470/14:24
*** ociuhandu has joined #openstack-nova14:29
*** TxGirlGeek has quit IRC14:31
*** grandchild has joined #openstack-nova14:33
*** Liang__ has quit IRC14:43
*** ociuhandu has quit IRC14:49
dansmithbrinzhang: assume you saw this right? https://review.opendev.org/#/c/712697/14:55
*** amoralej|lunch is now known as amoralej14:55
brinzhangdansmith: yeah, I have seen it, but I do not really understand why14:55
brinzhangdansmith: so I have not updated my patch :(14:56
dansmithbrinzhang: understand why what?14:56
openstackgerritLee Yarwood proposed openstack/nova master: libvirt: Use virDomainBlockCopy to swap volumes with Libvirt >= 5.10.0  https://review.opendev.org/69683414:58
dansmithbrinzhang: your patch needs to be rebased on top of that, but otherwise there isn't much you need to do I think14:58
brinzhangdansmith: no, I am still confused by that change ...14:58
brinzhangI don't seem to grasp the main reason for doing this.14:59
*** mlavalle has quit IRC14:59
brinzhangI think that's why I am confused14:59
*** mlavalle has joined #openstack-nova15:00
dansmithbrinzhang: the main reason for doing the "format_message()" instead of just "str(e)" ?15:00
sean-k-mooneybrinzhang: to not leak sensitive information to end users when there are errors15:00
brinzhangsean-k-mooney: yeah, I know, but can you give me an example?15:01
dansmithyes,15:01
brinzhangI think I need a sample to understand this well15:01
dansmithwe had a ceph exception which was something like "Failed to connect to 12.34.56.78"15:01
kashyaplyarwood: I just noticed I didn't hit send on my new comment; done.  (See the regression mentioned there.)15:01
brinzhangs/example/sample15:01
dansmithwhich gets exposed out of the API and users get to see sensitive details like the internal ceph ip address15:01
dansmithbrinzhang: I think there was even a case where a credential got leaked.. mriedem might remember the bug(s) to point to15:02
brinzhangdansmith: now we can get the exception only, and cannot get "Failed to connect to 12.34.56.78", right?15:03
dansmithbrinzhang: yeah so if the exception is not known (i.e. not inherited from NovaException) we just get the exception *name*, so FailedToConnectToCeph or ConnectionFailure or something like that15:04
brinzhangIn other words, we cannot get the detailed message with the sensitive information15:04
dansmithbrinzhang: we can *log* it in the compute log for the admin, but we don't want to expose it to the user15:04
dansmithright15:04
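[Editor's note] The behaviour dansmith describes can be sketched in Python. This is a minimal illustration under stated assumptions, not nova's code: the NovaException class below is a simplified stand-in for nova's base exception, InstanceNotFound is reduced to a message template, and user_visible_message is a hypothetical helper name.

```python
import logging

LOG = logging.getLogger(__name__)


class NovaException(Exception):
    """Simplified stand-in for nova's base exception class."""
    msg_fmt = "An unknown exception occurred."

    def __init__(self, message=None, **kwargs):
        if message is None:
            message = self.msg_fmt % kwargs
        self.message = message
        super().__init__(message)

    def format_message(self):
        # The message is built from a template written by developers,
        # so it is considered safe to show to API users.
        return self.message


class InstanceNotFound(NovaException):
    msg_fmt = "Instance %(instance_id)s could not be found."


def user_visible_message(exc):
    """Hypothetical helper: text that is safe to return from the API."""
    if isinstance(exc, NovaException):
        return exc.format_message()
    # Unknown exception: keep the details in the service log for the
    # admin and expose only the exception class name to the user.
    LOG.error("Unexpected error: %s", exc)
    return exc.__class__.__name__


print(user_visible_message(InstanceNotFound(instance_id="abc")))
print(user_visible_message(
    ConnectionError("Failed to connect to 12.34.56.78")))
```

With str(e) the second call would hand the internal IP address to the user; with the approach above the user only sees "ConnectionError", while the full text is still logged for the operator.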
brinzhangdansmith: ok, let me taste your changes again15:05
dansmithlol15:05
dansmithbrinzhang: um, global pandemic going on right now.. probably best not to lick anything15:06
lyarwoodkashyap: ack, we aren't going to hit that FWIW15:06
lyarwoodkashyap: we don't use the shallow copy flag15:06
brinzhangdansmith: haha15:06
dansmithbrinzhang: :)15:06
lyarwoodkashyap: but thanks for raising that15:06
brinzhangdansmith: Thanks to a guy named Guo WeiPeng, I have been in quarantine for another 15 days at home15:07
kashyaplyarwood: Yeah, was reading the details; just wanted to think through if there are any other places we need to bear in mind.15:07
dansmithbrinzhang: well, then definitely don't lick *my* patch :)15:07
brinzhangdansmith: so I have too much time to talk with you15:07
dansmithbrinzhang: heh, well, I work from home all the time, so I'm pretty much on quarantine normally :)15:08
brinzhangdansmith: you are happiness, it's my dream :)15:08
kashyapWauw, Dan _is_ Happiness; that's something.15:09
dansmithheh15:09
openstackgerritBrin Zhang proposed openstack/nova master: Store instance action event exc_val fault details  https://review.opendev.org/69442815:13
dansmithbrinzhang: we might want to make sure we have some functional tests that raise both nova and non-nova exceptions in such a way that we can examine them from the API15:14
brinzhangdansmith: I saw you add non-nova test exception in https://review.opendev.org/#/c/712697/1/nova/tests/unit/objects/test_objects.py@100315:17
brinzhangYou mean, I should add some nova and non-nova exception for os-instance-actions API?15:17
mriedemdansmith: brinzhang: i can look up the cve, but we were exposing credentials to the rbd backing a compute host via instance faults15:17
dansmithbrinzhang: yeah, but in a functional test I think15:17
mriedemthat's why i referred to the nova.compute.utils code that handles faults15:17
dansmithmriedem: ack yeah15:18
mriedemyou could build on the functional test i wrote for ^15:18
mriedemdansmith: brinzhang: https://review.opendev.org/#/c/674821/15:19
*** ociuhandu has joined #openstack-nova15:19
*** eharney has quit IRC15:19
melwittkashyap: fyi I added you to this review about adding aarch64 cpu model15:20
melwitthttps://review.opendev.org/70949415:20
kashyapmelwitt: Hiya15:20
kashyapmelwitt: Will look; thanks for the heads-up15:20
*** ociuhandu has quit IRC15:20
*** ociuhandu has joined #openstack-nova15:20
melwittcool thanks15:21
brinzhangmriedem: dansmith: Looks like I need to add a functional test file for instance_actions, such as nova/tests/functional/test_instance_action.py15:22
mriedem*shrug* there are lots of existing functional tests that make assertions using instance actions,15:22
mriedemi'm not sure there is a module dedicated to instance actions outside of the api samples15:22
dansmithyeah, just another case in one of those, or just add to test_server_faults, IMHO.. it's mostly the same thing15:23
*** gyee has joined #openstack-nova15:23
mriedemumm https://github.com/openstack/nova/blob/master/nova/tests/functional/test_instance_actions.py15:23
mriedemor just use the existing module :)15:23
dansmithheh yeah15:24
brinzhangmriedem: yeah, I saw another api sample tests in nova\tests\functional\api_sample_tests\test_instance_actions.py15:24
dansmith*gasp*15:24
dansmithbackslashes!15:24
mriedembrinzhang: api samples are generally not really for this type of testing15:25
mriedemthey are more about happy path positive test scenarios with minimal fixture15:25
brinzhangmriedem: I know, I will use your paste linke15:25
brinzhangs/linke/link15:25
brinzhangdansmith: I copied from windows, so it's backslashes!15:26
dansmithbrinzhang: hence the gasp! :)15:26
brinzhangdansmith: Sometimes I cannot open github quickly, and sometimes I get a 400; I do not know why ..15:27
*** ociuhandu_ has joined #openstack-nova15:28
brinzhangdansmith: You are a humorous technology *bull*. I think I missed a lot of interesting things :)15:28
dansmithbrinzhang: I'm just joking around, don't take me seriously :)15:29
mriedembrinzhang: the great firewall :)15:29
mriedemi'm assuming you're tunneled in through a windows VM for development and being able to be on IRC15:29
mriedemlike all of my old huawei coworkers in china15:30
brinzhangdansmith: I like this style, It makes me free.15:30
* mriedem goes back to what he's actually supposed to be working on15:31
brinzhangmriedem: yes, the great firewall, we bought shadowsocks, but it does not work now ..15:31
kashyapLOL, interesting choice of words to describe dansmith: "humorous technology *bull*" :D15:31
kashyapHe _is_, though.  But still, the vividness of the metaphor.15:31
*** ociuhandu has quit IRC15:32
*** ccamacho has quit IRC15:32
*** ociuhandu_ has quit IRC15:33
brinzhangkashyap: Forgive my native language is chinese15:33
brinzhangkashyap: :)15:33
kashyapbrinzhang: No-no, I was just saying in jest (joking).  Your English is fine15:33
*** _mlavalle_1 has joined #openstack-nova15:33
*** grandchild has quit IRC15:34
brinzhangkashyap: thanks, a little..15:34
*** mlavalle has quit IRC15:36
brinzhangmriedem: I used windows OS and installed IRC(HexChat) in windows, but my VM boot from our company's server(NODE), it's an OpenStack cloud deployed by kolla, just connect by xshell :)15:38
kashyapmelwitt: I didn't notice it earlier, but confusingly the method is returning 'mode', while the actual content is 'model': https://review.opendev.org/#/c/709494/3/nova/virt/libvirt/utils.py15:40
kashyapmelwitt: I'll add some words in the change; didn't notice it before.15:41
melwittkashyap: yeah, me neither. though that's not the fault of that patch, right?15:41
melwittor are you saying the patch is returning the wrong thing15:42
kashyapmelwitt: You're right - not the fault of the patch; it's existing15:42
kashyapmelwitt: We can change that later, but the core idea (on the basis to set up a CI) is good15:42
kashyapmelwitt: I just asked the upstream QEMU AArch64 folks about the model chosen in the patch ('cortex-a57').  I'll report back on the patch.15:43
sean-k-mooneybrinzhang: i used a windows laptop basically as a thin client with cygwin and ssh to linux server most of my time working upstream15:43
sean-k-mooneybrinzhang: i only started running linux on my main workstation/laptop when i left intel as we were required to use windows on the IT-provisioned laptop15:44
melwittkashyap: ok. yeah, I wasn't sure about the order and placement of the aarch64 checks. and maybe could use some code comments15:44
sean-k-mooneyalso i like kolla. it's easy to debug and use15:44
kashyapmelwitt: Yes, definitely +1 on the code comments.15:45
brinzhangsean-k-mooney: we are same, I think you like to use linux15:47
sean-k-mooneyi do but i dont dislike windows either.15:48
*** lpetrut has quit IRC15:48
brinzhangsean-k-mooney: yeah, what suits you is the best :)15:51
brinzhangit's getting too late for me, I will go. thanks dansmith, mriedem, kashyap, sean-k-mooney (good morning) ^^15:52
*** _mlavalle_1 has quit IRC16:03
*** mlavalle has joined #openstack-nova16:03
kashyapmelwitt: Ah, only noticed your comments _after_ I've hit send on mine.  Along with your questions, I have added a few more.16:04
melwittkashyap: cool, better to have more comments to show if there's any agreement or if I'm only asking dumb questions :P16:06
kashyapmelwitt: No, just reading your questions; you make perfectly valid points there.16:06
kashyapYou are not a mind reader to know the intention; so asking for code comments is the only reasonable thing :D16:06
melwittthis stuff is greek to me. I'd rather have some explanations there for those that venture to the code in the future, looking to refactor or whatever16:08
kashyapmelwitt: Okay, got some more input from QEMU maintainer (Peter Maydell) - he has special interest in AArch64 - I'll add it in the change.  He recommends a bunch of things.16:08
melwittawesome!16:08
kashyapmelwitt: I know the mechanics of how QEMU handles things; but the innards of AArch64 and the usage is Greek for me too.  (I find Greek aesthetically pleasing, though. I have a couple of books with Greek on left, the English translation on right. :D)16:09
melwitt:)16:09
*** damien_r has quit IRC16:12
*** lpetrut has joined #openstack-nova16:14
melwittkashyap: is this libvirt kvm aarch64 (which is mentioned on feature support matrix)? or is it libvirt qemu aarch64 (not on feature support matrix yet)? https://docs.openstack.org/nova/train/user/support-matrix.html16:19
kashyapmelwitt: Yeah, there was two things:16:19
kashyaps/was/are/16:19
kashyap(1) TCG (the emulated bits), or what is also referred to as "QEMU"-only; and (2) KVM (with hardware acceleration) in context of AArch64. I guess we want to 'care' about both16:21
kashyapLuckily, Peter from QEMU informs that for _both_ TCG/KVM, we can just use one model: 'max'16:22
kashyap(Comments on why upcoming...)16:22
melwittkashyap: ok, so the proposed patches aren't for one in particular only16:22
kashyapmelwitt: Although for CI, perhaps the dev just cares about QEMU (TCG).   Just like how Nova x86 CI runs on TCG guests (because no nested, to get KVM).16:23
kashyapNot sure if I'm helping or confusing :D16:23
melwittuh ... helping a little :) but it's not you, it's me16:23
melwittI'm just trying to write something on the lp bug for the patch above the one we've been looking at16:23
sean-k-mooneykashyap: what is the max model used for?16:24
melwittthe bug is unable to attach volume to instance with config drive on arm64 and according to the feature support matrix the support is "unknown" so I was thinking to set the bug Low based on that https://docs.openstack.org/nova/latest/user/support-matrix.html#operation_attach_volume_driver_libvirt_kvm_aarch6416:24
kashyapmelwitt: I see.  I'll collect thoughts and write it in the change, and we can take it from there.16:25
kashyapsean-k-mooney: The 'max' model will apparently give you the moving-target of "all the stuff we [QEMU] can currently emulate".16:25
kashyapmelwitt: Yeah, 'low' for now is fine.16:26
sean-k-mooneykashyap: so totally non portable16:26
sean-k-mooneykashyap: where is max used16:27
sean-k-mooneye.g. is it a cpu model or a scsi controller or what16:27
kashyapsean-k-mooney: Well, the recommendation from the AArch64 experts is to use 'max' for _both_ TCG and KVM:16:27
kashyap... "unless you really specifically want an always-the-same-thing even in newer QEMU versions fixed target".16:27
sean-k-mooneyim trying to figure out the context16:27
kashyapPlease read the scrollback with Mel.16:27
kashyapIt's in context of https://review.opendev.org/#/c/709494/16:28
sean-k-mooneyya i was trying to and could not figure it out16:28
sean-k-mooneyso this has changed from the old advice to use the VIRT cpu model16:28
melwittkashyap: yeah so I think what should likely happen is that once they have the CI set up and running we will update the feature support matrix with all of the things that are working in the CI?16:28
kashyapsean-k-mooney: No, no you're mixing up CPU model and machine type for AArch6416:31
kashyapsean-k-mooney: 'virt' is still the recommended machine type for AArch64.16:31
kashyapmelwitt: Yeap.16:31
sean-k-mooneyah yes i am16:31
melwittcool16:31
*** openstackgerrit has quit IRC16:31
sean-k-mooneythis still will cause issue for live migration if we use max16:31
sean-k-mooneyso i dont know if that is a good default16:32
kashyapmelwitt: I _think_ first want to care about TCG, because for KVM, you'd need AArch64 hardware in the CI16:32
kashyapsean-k-mooney: I've talked to the AArch64 maintainer, I'm posting the recommendations in the patch.16:32
sean-k-mooneykashyap: we have AArch64 hardware in ci16:32
melwittkashyap: ack16:32
sean-k-mooneykashyap: thats the whole point lenario donated some16:32
kashyapsean-k-mooney: Good, then.  We've got the recommendations for that, too.16:32
melwittkashyap: so do you think TCG would be a separate column in the matrix?16:32
kashyapLinaro, I take it.  Yeah16:32
kashyapmelwitt: Yeah, I'd say so.16:33
sean-k-mooneyah yes16:33
melwittok. thanks for explaining all this16:33
kashyap(We need to clearly distinguish both cases.)16:33
kashyapNo worries, I need to refresh this every few months myself :D16:33
melwitt:)16:33
sean-k-mooneykashyap: we proably should be using the cpu_mode config option to define this16:33
*** nweinber has quit IRC16:34
sean-k-mooneye.g. tie it in to host-passthrough and host-model somehow16:34
stephenfinsean-k-mooney: how strongly do you feel about https://review.opendev.org/#/c/468203 ?16:35
stephenfinspecifically jaypipes arguments there16:35
kashyapsean-k-mooney: One thing at a time :-)16:35
sean-k-mooneyim also not sure how i feel about this being a bug, it feels more like a specless blueprint but im not going to really object too strongly to it being a bug16:36
stephenfinI ask because I'm trying to decide how to say "these cores should be dedicated" in a mixed instance16:37
sean-k-mooneykashyap: well for now they could just use cpu_mode=custom and cpu_model=max or cpu_model=min right16:37
stephenfinCurrently I'm going with 'hw:cpu_dedicated_mask', which is a CPU list16:37
sean-k-mooneystephenfin: am ill take a look now16:37
sean-k-mooneystephenfin: ya i would be fine with that16:37
sean-k-mooneystephenfin: whats the other option16:37
stephenfinthere are a few16:38
stephenfinin this scenario that I'm following, you'll be able to use hw:cpu_dedicated_mask *or* hw:cpu_realtime_mask16:38
sean-k-mooneystephenfin: no you would use both16:38
stephenfinI see no reason to say these cores are shared, these are dedicated but non-realtime, and these are dedicated and realtime16:39
sean-k-mooneywell optionally16:39
sean-k-mooneye.g. not an exclusive or16:39
sean-k-mooneyi dont see a reason to block it16:39
stephenfinwhy? What real-world user is going to use all three types of core in an instance?16:39
stephenfinBecause it's less complicated16:39
sean-k-mooneyits more complicated16:39
sean-k-mooneythe validation logic to prevent all 3 is extra logic we dont need if we allow it16:40
stephenfinA|B is easier to grok than A.issubset(B)16:40
stephenfinand the it makes my XML generation easier16:40
stephenfins/the //16:41
sean-k-mooneystephenfin: you are over loading hw:cpu_realtime_mask16:41
sean-k-mooneyits behavior would change based on the hw:cpu_policy16:41
sean-k-mooneyso that gets harder to reason about16:41
stephenfinnope, it stays the same: these are cores that are real-time16:41
sean-k-mooneyif we allow both it does not16:41
stephenfinwhat changes is what happens to the other cores16:41
stephenfinand that's purely based on hw:cpu_policy16:41
sean-k-mooneyno it changes from these are realtime to these are realtime and dedicated and the rest float16:42
kashyapsean-k-mooney: Not entirely; there's also a quirk of making sure to specify the interrupt controller (as the default is less featureful) -- `-machine gic-version=max`16:42
kashyapmelwitt: For later, added my notes in the change.16:42
stephenfinthey're always realtime and dedicated16:42
stephenfinyou can't have realtime floating cores16:42
melwittkashyap: thanks16:42
sean-k-mooneywell libvirt allows you to but that is a separate thing16:43
stephenfin...in nova16:43
kashyapYep16:43
sean-k-mooneystephenfin: my preference would be to allow both to be set16:43
stephenfinI could allow it, but I don't want to force it16:43
sean-k-mooneyand always require hw:cpu_dedicated_mask for mixed16:43
stephenfinand because I don't want to force it, I'd rather say there's only one way to do this16:43
sean-k-mooneywhich one do you not want to force16:44
*** nweinber has joined #openstack-nova16:44
sean-k-mooneythe realtime mask or the dedicated mask16:44
stephenfinhaving to set both hw:cpu_dedicated_mask and hw:cpu_realtime_mask16:44
sean-k-mooneyright so i would make the cpu_realtime_mask optional16:44
stephenfinif you want a real-time instance with the non-realtime cores floating16:44
sean-k-mooneyand always require the cpu_dedicated_mask16:44
stephenfinbut then you have a difference of behavior elsewhere16:45
sean-k-mooneyand make it so that if you don't set cpu_realtime_mask but do set hw:cpu_realtime=true then we use the hw:cpu_dedicated_mask16:45
stephenfinnow you have to use 'hw:cpu_realtime_mask' when using dedicated16:45
stephenfinbut use 'hw:cpu_dedicated_mask' when using mixed16:45
stephenfinanyway, we'll invariably debate this in the patch so back to my original question16:46
sean-k-mooneyi think it's a much simpler rule to say: if you want realtime always use realtime_mask, and if you want mixed always use the dedicated_mask16:46
sean-k-mooneyif you want both, use both, and require that the realtime mask be a subset of the dedicated mask16:47
stephenfinif we're doing this, do we want to fix that annoying thing where hw:cpu_realtime_mask has to be preceded by a caret?16:47
stephenfinbecause I don't want to force that on 'hw:cpu_dedicated_mask'16:47
stephenfinand it would be nice for them to behave similarly16:47
sean-k-mooneystephenfin: ya i would not mind doing that16:48
sean-k-mooneystephenfin: we said the dedicated mask should follow the rules for the config option16:48
sean-k-mooneynot the rules for the realtime mask16:48
stephenfinthe only reason I see to not do that is jaypipes wanted us to kill 'hw:cpu_realtime_mask' in favor of 'hw:cpu_realtime_set'16:48
stephenfinbut tbh, I don't think it's worth the effort16:48
stephenfinwe haven't deprecated flavor extra specs before. I don't even want to get into that16:48
sean-k-mooneyi would be ok with that i guess but ya, that ^16:49
stephenfinditto for image metadata props, for that matter16:49
stephenfinokay, sweet16:49
stephenfinI can revive those patches so16:49
stephenfinwhoo, rebase fun!16:49
sean-k-mooneystephenfin: cool. i prefer each option to do one thing and one thing only. but do what you think is best16:50
sean-k-mooneyits a preference not a blocker for me16:50
sean-k-mooneyand since we can't do cross extra spec validation in your validation proposal i also prefer that from the validation point of view16:51
sean-k-mooneylet me know when you want me to review and/or play around with it16:51
sean-k-mooneyi'm going to drop soon just an fyi16:51
stephenfinwill do16:51
stephenfinah yeah, it'll be next week anyway16:51
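[Editor's sketch, not part of the log] The rule sean-k-mooney proposes above (the realtime mask, when given, must be a subset of the dedicated mask) can be illustrated roughly as follows. This is a minimal sketch, not nova's implementation: `parse_cpu_mask` and `validate_masks` are hypothetical helper names, and the mask syntax is assumed to follow the config-option style mentioned in the discussion (comma-separated indices and ranges, with `^` excluding indices from those explicitly listed).

```python
from typing import Optional, Set, Tuple


def parse_cpu_mask(spec: str) -> Set[int]:
    """Parse a mask such as "0-3,^2" into a set of vCPU indices.

    Hypothetical helper: assumes config-option style syntax, where "^"
    removes indices from the explicitly listed ones.
    """
    include: Set[int] = set()
    exclude: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        target = include
        if part.startswith("^"):
            target = exclude
            part = part[1:]
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError("invalid range: %s" % part)
            target.update(range(start, end + 1))
        else:
            target.add(int(part))
    return include - exclude


def validate_masks(
    dedicated_mask: str, realtime_mask: Optional[str] = None
) -> Tuple[Set[int], Set[int]]:
    """Return (dedicated, realtime) sets, enforcing realtime <= dedicated."""
    dedicated = parse_cpu_mask(dedicated_mask)
    realtime = parse_cpu_mask(realtime_mask) if realtime_mask else set()
    if not realtime.issubset(dedicated):
        raise ValueError("realtime cores must also be dedicated cores")
    return dedicated, realtime
```

For example, `validate_masks("0-3", "2-3")` would accept the request, while `validate_masks("0-1", "1-2")` would raise `ValueError` because core 2 is realtime but not dedicated. stephenfin's alternative (a single mask whose meaning for the remaining cores depends on hw:cpu_policy) would avoid the subset check altogether.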
*** rpittau is now known as rpittau|afk16:58
*** maciejjozefczyk has quit IRC17:03
*** tesseract has quit IRC17:07
*** openstackgerrit has joined #openstack-nova17:16
openstackgerritElod Illes proposed openstack/nova stable/rocky: Enhance service restart in functional env  https://review.opendev.org/71303317:16
openstackgerritLee Yarwood proposed openstack/nova master: WIP nova-live-migration: Wait for n-cpu services to come up after configuring Ceph  https://review.opendev.org/71303517:19
openstackgerritLee Yarwood proposed openstack/nova stable/pike: WIP nova-live-migration: Wait for n-cpu services to come up after configuring Ceph  https://review.opendev.org/71303617:21
*** maciejjozefczyk has joined #openstack-nova17:21
*** lbragstad has quit IRC17:23
*** lbragstad has joined #openstack-nova17:26
openstackgerritMerged openstack/nova stable/rocky: Remove global state from the FakeDriver  https://review.opendev.org/71275117:29
melwittlyarwood: yay, thanks for looking at that. is the same thing happening with the intermittent failures on master?17:29
melwitt*intermittent nova-live-migration failures17:29
lyarwoodmelwitt: I don't recall seeing this on master but let me have a quick look in logstash17:30
*** dpawlik has quit IRC17:30
melwittlyarwood: ok, was just wondering if you knew off the top of your head. I've seen occasional failures of nova-live-migration on master and just curious if it's the same thing. I hope so, cause that would mean your fix would fix that too17:31
*** iurygregory has quit IRC17:32
*** evrardjp has quit IRC17:35
*** evrardjp has joined #openstack-nova17:36
lyarwoodmelwitt: yeah, I can't see anything on master but my elasticsearch foo is awful17:37
melwittok np17:37
lyarwoodmelwitt: FWIW your https://review.opendev.org/#/c/712226/ change failed because of http://status.openstack.org/elastic-recheck/#181378917:39
lyarwoodmelwitt: I've not had time to look into that but I have been seeing that across master17:39
melwittlyarwood: ah thanks. I've been looking at http://status.openstack.org/elastic-recheck/#1844929 again yesterday and today17:40
melwittso far, no dice17:41
lyarwoodmelwitt: kk, these all appear rather tricky17:42
melwittyeah :(17:43
*** hamzy has joined #openstack-nova17:45
*** udesale_ has joined #openstack-nova17:52
*** eharney has joined #openstack-nova17:52
*** udesale has quit IRC17:54
*** jangutter_ has joined #openstack-nova17:54
*** jangutter has quit IRC17:58
*** derekh has quit IRC18:01
*** lseki has joined #openstack-nova18:11
*** maciejjozefczyk_ has joined #openstack-nova18:16
*** maciejjozefczyk has quit IRC18:16
melwittzzzeek: could you pls sanity check me on this -- this logging is done after getting a response back from the database server right? it's not a client side logging before making the query https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/engine/result.py#L157918:28
zzzeekmelwitt: row logging is after we've executed the statement and we've received rows back from the DBAPI cursor, that line logs the row itself18:28
zzzeekthere's no "row" that we would have before invoking a statement18:29
melwittthanks zzzeek++18:29
melwitthaha yeah. makes sense18:29
*** TxGirlGeek has joined #openstack-nova18:31
melwittzzzeek: I'm investigating a gate bug (and we enabled connection_debug=100) and noticing that in the failure cases, we get no rows logged for our query for compute nodes. the gate environment on the nodes where it fails is known to be running on nodes with restricted disk iops. do you have any idea what could be happening or how we could tune to handle this environment better?18:32
melwittlike, in what scenarios could we make a query to the db server and then get no response within 60 seconds?18:33
zzzeekmelwitt: well eventlet can do that, if you're using eventlet in some case18:33
zzzeekmelwitt: another is, the query has a huge cartesian product and is taking too long to ORDER BY18:33
melwittand if there's any rule of thumb way to deal with that, assuming that the disks cannot be made faster or allow more iops18:33
melwittwe are using eventlet to manage the 60 second client side timeout waiting for the result18:34
zzzeekthe eventlet issue is that if some other greenlets are hogging the CPU then the greenlet in question might not be able to get a result back, but this has never been observed at the scale of 60 seconds18:35
zzzeekif the "restricted disk iops" is making MySQL chug to a halt, that would be a thing18:35
zzzeekis that both read and write ops?18:36
*** ralonsoh has quit IRC18:36
*** lbragstad_ has joined #openstack-nova18:37
melwitthm, not sure. I'm referencing this ML thread http://lists.openstack.org/pipermail/openstack-discuss/2019-November/010505.html18:37
*** udesale_ has quit IRC18:39
zzzeekso...the issue is, MySQL is actually being observed to be overworked in this case and we want to tune it for reduced disk IO ?18:39
*** lbragstad has quit IRC18:39
melwittI was wondering that as a possibility, yes18:40
melwittI'm playing around with some innodb_ settings in devstack to experiment, but mostly don't know what to try18:40
melwittI wondered if tuning for that might enable it to work under the reduced disk IO condition18:41
melwittzzzeek: I have to run for a bit, but if you think of any good tunables I could try in my.conf, pls give me a shout18:53
zzzeeksure...there are some write flags but read is tricky18:53
melwitt*my.cnf18:54
*** tbachman has joined #openstack-nova19:04
*** lpetrut has quit IRC19:05
*** amoralej is now known as amoralej|off19:06
*** ociuhandu has joined #openstack-nova19:32
*** brinzhang_ has joined #openstack-nova19:38
*** jangutter_ has quit IRC19:39
*** brinzhang has quit IRC19:42
*** mgariepy has quit IRC19:53
openstackgerritLee Yarwood proposed openstack/nova stable/pike: WIP nova-live-migration: Wait for n-cpu services to come up after configuring Ceph  https://review.opendev.org/71303620:13
*** ociuhandu has quit IRC20:23
sean-k-mooneymelwitt: how big is the db20:27
sean-k-mooneyit would be a bit of a hack but it might be possible to mount its data on tmpfs or otherwise use some form of filesystem or block level caching to mask the io limitations20:28
*** tbachman has quit IRC20:33
melwittsean-k-mooney: how can I tell how big is the db? I have been using this page to get an idea of things to try https://dev.mysql.com/doc/refman/5.6/en/optimizing-innodb-diskio.html20:34
sean-k-mooneyi was just thinking of using du20:34
melwittoh, lemme see, there's df output on the job runs20:36
*** tbachman has joined #openstack-nova20:36
*** TxGirlGeek has quit IRC20:36
melwitthttps://zuul.opendev.org/t/openstack/build/8c91fd21815148d9894ac2bf60893a9e/log/logs/df.txt20:37
sean-k-mooneyya that unfortunately is not going to show us20:37
sean-k-mooneyi'll check my local devstack20:37
sean-k-mooneyso i'm seeing about 180mb20:38
sean-k-mooneyin /var/lib/mysql20:38
sean-k-mooneyso we could just try mounting that on tmpfs20:38
sean-k-mooneyi think we would have enough ram to do that20:39
melwittoh, thanks20:39
sean-k-mooneyrunning a db on tmpfs is normally really dumb but it's a ci so ...20:40
sean-k-mooneyit's not like we care if the data goes away after a reboot20:40
melwittheh20:40
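[Editor's sketch, not part of the log] The size check sean-k-mooney describes (measuring /var/lib/mysql before deciding whether it could live on tmpfs) might look roughly like this. It is only a sketch: `directory_size_bytes` and `fits_in_ram_budget` are hypothetical names, the RAM budget is an assumed parameter, and the total is the sum of apparent file sizes, which can differ from the block usage that `du` reports.

```python
import os


def directory_size_bytes(path: str) -> int:
    """Sum the apparent sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so a link's target is not counted.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def fits_in_ram_budget(path: str, budget_bytes: int) -> bool:
    """Return True if the directory's data fits within the given budget."""
    return directory_size_bytes(path) <= budget_bytes
```

With the roughly 180 MB reported above, a call such as `fits_in_ram_budget("/var/lib/mysql", 512 * 1024 ** 2)` would be expected to pass, though the 512 MiB budget is an arbitrary illustration and the database will grow during a test run.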
*** nweinber has quit IRC20:43
*** ociuhandu has joined #openstack-nova20:55
*** maciejjozefczyk_ has quit IRC20:59
*** slaweq has quit IRC21:04
*** ociuhandu has quit IRC21:06
*** ociuhandu has joined #openstack-nova21:07
*** ociuhandu has quit IRC21:16
openstackgerritLee Yarwood proposed openstack/nova master: nova-live-migration: Wait for n-cpu services to come up after configuring Ceph  https://review.opendev.org/71303521:40
lyarwoodmelwitt / sean-k-mooney ; ^ if you're around, thoughts on that would be appreciated. Appears to fix the issues we've been seeing on stable/pike.21:41
melwittcool, will look. thanks for digging into that21:42
lyarwoodthanks :)21:47
* lyarwood heads offline for the weekend \o21:47
*** tbachman has quit IRC22:01
*** ociuhandu has joined #openstack-nova22:07
*** ociuhandu has quit IRC22:20
sean-k-mooneylyarwood: oh is that why the migration job was failing. it was not waiting for ceph after it reconfigured and restarted n-cpu22:23
*** mlavalle has quit IRC22:31
*** mriedem has left #openstack-nova22:35
*** rcernin has quit IRC22:58
*** rcernin has joined #openstack-nova22:59
*** sean-k-mooney has quit IRC23:02
*** gyee has quit IRC23:22
*** bbowen_ has quit IRC23:29
*** martinkennelly has quit IRC23:31
*** xek has quit IRC23:38
*** bbowen has joined #openstack-nova23:39
*** sean-k-mooney has joined #openstack-nova23:47
*** lbragstad_ has quit IRC23:49
