Friday, 2021-10-08

opendevreviewMerged openstack/nova master: tests: Walk database migrations in correct order  https://review.opendev.org/c/openstack/nova/+/81029100:30
opendevreviewMerged openstack/nova master: tests: Address some nits with database migration series  https://review.opendev.org/c/openstack/nova/+/81085600:31
opendevreviewMerged openstack/nova master: tests: Silence noise from database tests  https://review.opendev.org/c/openstack/nova/+/81085700:31
opendevreviewmelanie witt proposed openstack/nova master: DNM Run against unmerged oslo.limit changes  https://review.opendev.org/c/openstack/nova/+/81223600:51
opendevreviewmelanie witt proposed openstack/nova master: Add logic to enforce local api and db limits  https://review.opendev.org/c/openstack/nova/+/71213900:51
opendevreviewmelanie witt proposed openstack/nova master: Enforce api and db limits  https://review.opendev.org/c/openstack/nova/+/71214200:51
opendevreviewmelanie witt proposed openstack/nova master: Update quota_class APIs for db and api limits  https://review.opendev.org/c/openstack/nova/+/71214300:51
opendevreviewmelanie witt proposed openstack/nova master: Update limit APIs  https://review.opendev.org/c/openstack/nova/+/71270700:51
opendevreviewmelanie witt proposed openstack/nova master: Update quota sets APIs  https://review.opendev.org/c/openstack/nova/+/71274900:51
opendevreviewmelanie witt proposed openstack/nova master: Tell oslo.limit how to count nova resources  https://review.opendev.org/c/openstack/nova/+/71330100:51
opendevreviewmelanie witt proposed openstack/nova master: Enforce resource limits using oslo.limit  https://review.opendev.org/c/openstack/nova/+/61518000:51
opendevreviewmelanie witt proposed openstack/nova master: Add legacy limits and usage to placement unified limits  https://review.opendev.org/c/openstack/nova/+/71349800:51
opendevreviewmelanie witt proposed openstack/nova master: Update quota apis with keystone limits and usage  https://review.opendev.org/c/openstack/nova/+/71349900:51
opendevreviewmelanie witt proposed openstack/nova master: Add reno for unified limits  https://review.opendev.org/c/openstack/nova/+/71527100:51
opendevreviewmelanie witt proposed openstack/nova master: WIP Enable unified limits in the nova-next job  https://review.opendev.org/c/openstack/nova/+/78996300:51
gmannmelwitt: its green now https://review.opendev.org/c/openstack/devstack/+/812092/401:22
melwittgmann: \o/01:22
opendevreviewMerged openstack/nova master: Reproduce bug 1945310  https://review.opendev.org/c/openstack/nova/+/81139401:25
opendevreviewMerged openstack/nova master: Update min supported service version for Yoga  https://review.opendev.org/c/openstack/nova/+/80993202:36
opendevreviewpengzhang proposed openstack/nova master: the vif_name error leads virsh domifstat cant work  https://review.opendev.org/c/openstack/nova/+/81311903:25
opendevreviewpengzhang proposed openstack/nova master: the vif_name error leads virsh domifstat cant work  https://review.opendev.org/c/openstack/nova/+/81311905:46
opendevreviewnorman shen proposed openstack/nova master: Recreate mdev devices according to placement  https://review.opendev.org/c/openstack/nova/+/81022007:33
bauzasgood (late) morning Nova08:12
lyarwood\o08:15
gibio/08:17
opendevreviewDmitriy Rabotyagov proposed openstack/nova master: Ensure MAC addresses characters are in the same case  https://review.opendev.org/c/openstack/nova/+/81194709:16
opendevreviewElod Illes proposed openstack/nova stable/train: Reject open redirection in the console proxy  https://review.opendev.org/c/openstack/nova/+/79180709:38
opendevreviewElod Illes proposed openstack/nova stable/train: address open redirect with 3 forward slashes  https://review.opendev.org/c/openstack/nova/+/80662909:38
gibisean-k-mooney: what is the reason of not waiting for network-vif-plugged events during hard reboot? I hard reboot does unplug / plug but does not wait for plugged event. Compared that to boot where we do plug and then wait for plugged event before we let the VM to run10:20
gibiin normal ovs case I see that during hard reboot neutron eventually sends the unplugged and plugged events to us we just not waiting for it10:20
sean-k-mooneyyou are correct it will in most cases10:24
sean-k-mooneytechnically linux bridge might miss the removal and not send it10:24
sean-k-mooneyfor ovn and odl it wont10:24
sean-k-mooneystarnard ml2/ovs it will10:24
sean-k-mooneyso we dont wait because we need to know which ml2 driver bound the port and the behaivor of that driver10:25
sean-k-mooneygibi: i likely will be looking at reqorkign how we do this in yoga its one of the things i want to talk about at the ptg10:25
sean-k-mooneyin train neutron started teilling us what driver bound the port10:26
gibiOK I see, thanks10:26
sean-k-mooneyso we can actully no start using that info10:26
sean-k-mooneywe just need to add it to our vif objects and use it where it makes sense10:26
sean-k-mooneye.g. wait for event on hard rebot if its ml2/ovs for example10:26
sean-k-mooneybut not if its odl10:26
gibiI see10:26
sean-k-mooneygibi: the short answer why we  dont wait is we dont rebind the ports on hard reboot so we only get event for backend that send plug time event and since beforfe train we could not tell if ovs port were plug time or bind time we could not wait safly10:28
sean-k-mooneygibi: are you seeing a issue wiht not waiting currently?10:28
sean-k-mooneyor was this just courisity10:28
gibiI have a downstream case (on pike) where the nova unplug / replug behavior at openstack server start causes problems in ODL10:29
gibibut you said that odl won't even send us plugtime events10:29
gibiso waiting won't be the solution10:30
gibiwondering if for backends not supporting plug time events we need to rebind10:30
sean-k-mooneyah ml2/odl used to send events at bind time yes not at plug time but i think later version might have adress that10:30
gibiohh, good info, then I have to figure out the odl version too10:30
sean-k-mooneygive me to second and ill see if i can find that in the odl neutron driver10:31
sean-k-mooney basically to support bind time event they had to open a websock that odl could call back to to send event to neutron which neuton could then send to nova10:32
sean-k-mooneythat was not part of the orginal ml2 driver10:32
gibiyou mean for plug time event10:32
sean-k-mooneyso in the legacy driver they just hardcoded the port as active as soon as the port was bound https://opendev.org/openstack/networking-odl/src/branch/master/networking_odl/ml2/legacy_port_binding.py#L58-L6110:33
sean-k-mooneywhich resulted in network-vif-plugged beign sent at bind time only10:33
sean-k-mooneythen latere they added a way to get port status updates form odl https://opendev.org/openstack/networking-odl/commit/3432e0f875f8243a2c1febf2012bd0f58863c56b10:34
sean-k-mooneyhttps://opendev.org/openstack/networking-odl/src/branch/master/networking_odl/ml2/port_status_update.py#L91-L9510:34
sean-k-mooneyand then they would correctly send plug time events10:34
sean-k-mooneygibi: but pike might predate that10:35
gibicool, I can check the dates then10:35
gibisean-k-mooney: thanks a lot10:35
gibiso the solution is fresh enough networking-odl to have plug time event, and the in yoga think about waiting for plug time event at reboot as well10:36
sean-k-mooneyno worries assuming we can confirm this works today we could add odl also to the list of "send plug events" or whatever and then start waiting10:36
sean-k-mooneyyep10:36
gibidoes neutron have odl jobs in the upstream ci?10:36
sean-k-mooneyso we dont really need to wait for the ptg to discuss this i just have not had time to work on a spec10:36
sean-k-mooneygood question im not sure10:37
gibiok I will check that too10:37
sean-k-mooneyi suspect not10:37
sean-k-mooneynetworking-odl should have jobs however10:37
gibiahh good point that is a separate repo with it own jobs10:37
sean-k-mooneyhttps://opendev.org/openstack/networking-odl/src/branch/master/.zuul.d/project.yaml#L51-L5210:38
sean-k-mooneyit also has multi node versions10:38
gibicool then we can have test coverage upstream about the change10:38
sean-k-mooneyam on thing i have been talking to artom about is possible adding some other network toplogies to nova's gate as a periodic10:39
sean-k-mooneyso right now we have ovs with iptables in nova-next, we have ovn for most things  and linux bridge for some file changes10:39
sean-k-mooneyit might be nice to add a weekly multi node odl and ml2/ovs job10:40
sean-k-mooneyweekly multi-node job that is to tesst migration and resize ectra10:40
gibiI would expect that some of this coverage is already running on the neutron side10:41
gibibut if not then I agree we can add it10:41
sean-k-mooneythat will tell them if we break things but only if the project is activly beign developed10:41
sean-k-mooneyit looks like networking-odl for example only gets comiits every month or so10:42
gibiyeah, that is a good point too10:42
gibithen we need the periodic to trigger testing even if networking-odl is not changed10:43
sean-k-mooneyyep just to make sure we did not break something. but really weekly is more then enought i think10:43
sean-k-mooneywe can review it like the placment jobs10:43
gibiyepp, agree10:44
sean-k-mooneygibi: i would also like to add dpdk at some point but i have not been maintaining networking-ovs-dpdk so i will need to add devstack support for dpdk or fix it before thats even posible at this poing10:45
sean-k-mooney*point10:45
gibiI support that idea too. Downstream we use dpdk a lot, so would be nice to get coverage for it upstream. 10:46
sean-k-mooneyfortunetly on the dpdk front its support by distros now so its now just a matter of turning it on and seting a few config values but still need to sit down and do that10:46
gibilet me know if I can help10:46
sean-k-mooneysure will do. when i get around to workign on the devstack patches ill let you know10:47
opendevreviewsean mooney proposed openstack/nova master: Ensure MAC addresses characters are in the same case  https://review.opendev.org/c/openstack/nova/+/81194711:09
sean-k-mooneynoonedeadpunk: hopefully you dont mind me updateing ^11:14
sean-k-mooneygibi: can you take a look at ^ when you have time11:15
gibiack I will try11:16
* gibi gathered a big backlog of task this week, craxy11:16
noonedeadpunksure I'm not!11:19
opendevreviewLee Yarwood proposed openstack/nova-specs master: WIP libvirt: Allow Manila shares to be directly attached to instances  https://review.opendev.org/c/openstack/nova-specs/+/81318012:20
*** lbragstad_ is now known as lbragstad13:51
lyarwoodstupid Friday question but can anyone think of an API where we've changed the return code in a microversion? I want to fix the POST os-volume_attachments method to return 202 but I can't think of the best way to do this 14:37
lyarwoodah nvm I see just duplicate the actual method and decorate it14:39
bauzaslyarwood: correct, I was about to point it to you15:00
bauzas(sorry for the late ping, kids taxi)15:00
bauzaslyarwood: do you want to discuss on your spec ?15:00
stephenfinlyarwood: There are quite a few APIs that suffer from that issue. Are you just going to do the one or multiple?15:00
bauzaslyarwood: fwiw, I'm leaving in 25 mins for the weekend15:04
lyarwoodbauzas: sorry was just getting a drink, happy to chat now if you had any questions etc15:06
bauzasI briefly looked at your spec15:06
lyarwoodstephenfin: just os-volume_attachments for now but I'll likely start to address others once I've worked this out15:07
lyarwoodI'm not sure if people would prefer a single microversion for everything etc15:07
dansmithso, I'm pretty sure that we originally said we wouldn't do microversions for things like this15:08
bauzasif 500, yes15:08
lyarwoodthat would make my life easier15:08
dansmithbecause nobody is going to opt-in to a new return code, or opt out (back?) to the old one15:08
sean-k-mooneydont we requrie a microversion if you are chaning the reponcoe code to one that is not currently used15:08
bauzasI mean, 500 to something clearer, not a microversion15:08
dansmithif you're changing other things about the api, then doing the return code as part of it makes sense15:08
dansmithbut not just a "2.98: change a bunch of 200s to 202s"15:08
bauzassec, giving you the API rules15:08
sean-k-mooneywell 200 to 202 is blocking to async15:09
lyarwoodI wonder why we have TODOs everywhere for this then15:09
dansmithsean-k-mooney: yeah, I'm not saying do it without a microversion, I'm saying don't do it unless there's something else changing15:09
lyarwoodoh sorry I see15:09
sean-k-mooneyhttps://developer.mozilla.org/en-US/docs/Web/HTTP/Status/20215:09
lyarwoodwell tbh it's never going to get done in that case15:09
sean-k-mooneylyarwood: you want to move this to be an async api yes?15:10
dansmithif we15:10
bauzashttps://specs.openstack.org/openstack/api-wg/guidelines/microversion_specification.html for the API general rules15:10
lyarwoodsean-k-mooney: it already is15:10
sean-k-mooneyoh then 200 is wrong ya 15:10
bauzashttps://docs.openstack.org/nova/latest/contributor/microversions.html#when-do-i-need-a-new-microversion for nova specifics15:10
lyarwoodyes15:10
dansmithare moving it from sync to async, then *of course* it makes sense to change, but just a microversion for changing the return code but not the behavior is kinda silly15:11
sean-k-mooneyright15:11
sean-k-mooneyim more ok with not haveing a microversion15:11
sean-k-mooneyif we are not changing behavior15:11
sean-k-mooneyjust the return code15:11
dansmithI'm not saying we should do that15:11
bauzastechnically, we agreed on asking for a microversion https://docs.openstack.org/nova/latest/_images/graphviz-a4c9682faf139450eed50fdcc7dd06df5677c197.png15:11
bauzasunless it was changing from HTTP50015:12
dansmithI'm saying returning 200 for an async api that has always been async is unfortunate, but not that big of a deal15:12
sean-k-mooneydansmith: you are say until we modify  tese again dont change it15:12
dansmithsean-k-mooney: yes15:12
dansmithnobody is going to opt-in to "no new behavior but a more accurate return value"15:12
sean-k-mooneyi think that is also reasonable becasue of ^15:12
bauzasbut we mention something :15:12
bauzas" 15:12
bauzasThe exception to not needing a microversion when returning a previously unspecified error code is the 400, 403, 404 and 415 cases. This is considered OK to return even if previously unspecified in the code since it’s implied given keystone authentication can fail with a 403 and API validation can fail with a 400 for invalid json request body. Request to url/resource that does not exist always fails with 404. Invalid content typ15:12
bauzases are handled before API methods are called which results in a 415.15:12
bauzas"15:12
lyarwoodright but at least it's fixed later when they opt-in to some other change15:12
dansmithbauzas: that has nothing to do with this case15:13
bauzasnothing tells us to not ask for a microversion if changing from 20015:13
bauzasdansmith: sure, I was just looking at what we wrote :)15:13
bauzasto see whether we said "cool if 200 -> 202 and async"15:13
sean-k-mooneybauzas: under that policy we sould have to do a microversion but there is no untily in changing the repssonce doe and making it opt in15:13
dansmithlyarwood: right but all of those kinds of death-by-1000-cuts things impact people, which is why we didn't do a massive cleanup early on15:13
dansmithlyarwood: right now if you know that api returns 200 and you check for 200, then returning 202 could break client code even though absolutely nothing has changed15:14
dansmithit'd be bad client code, granted, but.. it's pain for pretty much no gain15:14
bauzasthat's the point ^15:14
lyarwooddansmith: and how does that change anything by including it with other changes later on?15:14
lyarwooddansmith: bad clients will still break 15:14
sean-k-mooneylyarwood: in the ohter api change later the clinet needed to be updated anyway15:15
bauzasif clients expect a 200 on a call, we can't leave them wondering now why they get something else15:15
sean-k-mooneye.g. there was already work for them to do15:15
dansmithlyarwood: well, right, I'm saying it's not worth changing like ever, unless some other behavior changes that would require a behavioral change anyway15:15
dansmithit's a developer purity thing, like many of our TODOs everywhere, but unless it impacts proper client behavior, it's more likely to break things than fix them, IMHO15:16
lyarwoodat least this way we could change all of them in one go and fix the same issue in the clients in one go15:16
lyarwoodinstead you're suggesting different people fix them one by one at different times15:16
dansmithwhen we started out do the v3 API, it was all this purity stuff, and we literally unrolled it all when we realized we were just making pain for people15:16
dansmithlyarwood: again, I'm not15:16
lyarwoodokay it's Friday I'm likely missing something but given people don't want this I'm going to stop and move on15:18
sean-k-mooneylyarwood: when we have done this clean up in the past we have paired it with other changes with the logic being a client tha want to use that new feature via a microversion need to be updated anyway so they can adapt to both change at the same time.15:18
dansmithlet me ask this way: what client behavior is going to be wrong if we return 200 that will be right if we return 202?15:18
dansmithif there's something there, then it's worth changing15:18
dansmithsean-k-mooney: right15:18
lyarwoodtbh with 200 to 202 it's not our clients that I'm worried about 15:19
lyarwoodit's other callers outside of them and their understanding of the API15:19
dansmithlyarwood: right, but ... what's the behavior or understanding that will be wrong?15:20
lyarwoodthe CSI plugin in k8s for OpenStack for example has been misusing it for a while15:20
sean-k-mooneyyour assumeing they are reading the reponce code but not the docs that say this is an async api15:20
lyarwooddansmith:  assuming it's sync and not polling15:20
dansmithmeaning, generally 202 means "if you query this thing right after the PUT/POST then you will not see it updated so don't expect it to be" .. is it that? or is it "this operation takes a while?"15:20
dansmithlyarwood: but is the actual API operations sync and just not the attachment?15:21
lyarwoodwe return after we cast to the compute to do the actual attachment15:22
sean-k-mooneydansmith: i think k8s has been assuming that the attachment was completed when it got the 20015:22
lyarwoodso it takes a while, but the CSI driver would start polling Cinder instead of Nova and that has caused various issues15:22
lyarwoodbecause they assumed the attachment in Nova was done at that point15:22
dansmithI'm not sure that15:23
dansmithis reasonable behavior to assume from either 200 or 202 though15:23
dansmithyou're giving it attachment information, which it is storing, and that triggers some behavior right?15:23
dansmithlike, IIRC, 200 is for "I've stored your object, here's the ID you can use to get it again" and 202 is for "I will store that later, but I can't necessarily give you its identifier yet" right?15:24
sean-k-mooneyfor a 202 they cannot assuem its complete and should expect to poll nova for the compleation15:24
lyarwooddansmith: yeah but my understanding with 200 was that processing was complete with the return15:25
sean-k-mooneyfor 200 generally you would assume it blocks until its completed but its valid to retrun before that15:25
lyarwooddansmith: and as sean-k-mooney said, 202 requires additional processing15:25
dansmithI think that's true of the REST object and not necessarily the actions implied from the thing you're storing15:25
sean-k-mooneylyarwood: has the k8s csi plugin be fixed to pool nova by the way15:26
dansmithbut again, as sean-k-mooney said, this is really a result of them not reading the docs that say it's async and implying which part of the operation is synchronous from the 20015:26
sean-k-mooneyregardeless of if we cahnge the api15:26
lyarwoodsean-k-mooney: https://github.com/kubernetes/cloud-provider-openstack/issues/1645 not yet15:26
sean-k-mooneyok but they are aware of it and it will get eventually fixed15:27
lyarwoodyup sure, just outlining my motivation to fix  this15:27
* lyarwood looks at the Nova bug backlog and laughs 15:27
lyarwood;)15:27
sean-k-mooney:)15:27
sean-k-mooneyi assume we have a customer hitting this so that normally seed up fixing things15:28
dansmithto be clear, making it 202 won't fix this, we're just assuming the human that wrote the code would have assumed they should do something different if the code was 202 right?15:28
lyarwoodyup that's the assumption, it wouldn't fix anything in CSI15:28
lyarwoodI got triggered by the TODO today and wanted to take a swing a killing it15:29
sean-k-mooneyif we wnated to fix CSI issue in nova we would have to block15:29
dansmithso, I mean.. we're like changing something that could easily break other clients that did read the docs, based on the assumption that someone who didn't would have done the right thing if it was different15:29
sean-k-mooneywich i dont think we want to do15:29
dansmithpretty meh :)15:29
bauzasok, sorry guys, I need to bail out from now15:30
lyarwoodbauzas: \o if you want to chat on Monday ping me an email, I should have time to chat once I'm on London15:31
lyarwoodin*15:31
bauzaslyarwood: don't worry, I'll use the power of community brainstorm to move on with your spec ;)15:32
dansmithI think we've all put on a few pandemic pounds, but being "on london" would require a substantial increase in waist size, I think15:32
* lyarwood looks at 16 month old child15:32
lyarwooddansmith: it has been a rough 16 months, it wouldn't take much more 15:33
dansmithheh15:33
opendevreviewJulia Kreger proposed openstack/nova master: Ignore plug_vifs on the ironic driver  https://review.opendev.org/c/openstack/nova/+/81326322:23
*** tbachman is now known as Guest226423:35

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!