Wednesday, 2023-08-30

opendevreviewmelanie witt proposed openstack/nova master: nova-manage: Add 'limits migrate_to_unified_limits'  https://review.opendev.org/c/openstack/nova/+/89164600:01
opendevreviewmelanie witt proposed openstack/nova master: nova-manage: Add 'limits migrate_to_unified_limits'  https://review.opendev.org/c/openstack/nova/+/89164601:26
opendevreviewmelanie witt proposed openstack/nova master: Add documentation for unified limits  https://review.opendev.org/c/openstack/nova/+/89312704:33
melwittbauzas: I looked at the blueprint status etherpad after reading the nova meeting notes, just fyi it's no longer WIP https://review.opendev.org/q/topic:bp%252Funified-limits-nova-tool-and-docs04:42
opendevreviewmelanie witt proposed openstack/nova master: Add documentation for unified limits  https://review.opendev.org/c/openstack/nova/+/89312706:44
bauzasmelwitt: ack, will then look today :)06:50
opendevreviewAmit Uniyal proposed openstack/nova master: Reproducer for dangling bdms  https://review.opendev.org/c/openstack/nova/+/88994707:51
opendevreviewAmit Uniyal proposed openstack/nova master: Delete dangling bdms  https://review.opendev.org/c/openstack/nova/+/88228407:51
auniyalbauzas, I have updated patches, can you please have another look08:43
bauzasauniyal: sure08:45
auniyalthanks08:45
auniyalthere are very less zuul jobs ran for reproducer patch -https://review.opendev.org/c/openstack/nova/+/889947/6?tab=change-view-tab-header-zuul-results-summary08:46
auniyallike no nova-next08:46
auniyalbauzas, is it because the only file which added/modified is in nova/tests directory ?08:47
bauzasauniyal: indeed08:55
bauzashmmm, looks like we have a venv issue08:55
bauzashttps://zuul.opendev.org/t/openstack/build/b915c9886c2a420fbc686b2ca4ccece008:56
bauzasmost of the jobs are failing when creating the venv due to a NoneType exception08:56
bauzasAttributeError I mean08:56
bauzasnvm, this looks a transient issue https://zuul.opendev.org/t/openstack/builds?job_name=openstack-tox-py39&project=openstack%2Fnova&skip=009:01
opendevreviewmelanie witt proposed openstack/nova master: Add documentation for unified limits  https://review.opendev.org/c/openstack/nova/+/89312709:21
bauzasauniyal: I'm eventually +2 on the repro patch but please read my comment for understanding the difference between annotations and defaulting arguments, and why you should *never* set default args with mutables https://review.opendev.org/c/openstack/nova/+/889947/comment/34e10b03_a5d53174/09:40
auniyalbauzas, ack thanks09:51
auniyalI knew about setting list() insignature  but missed, thanks for the comments09:52
auniyalcan we +W  this https://review.opendev.org/c/openstack/nova/+/881457 09:53
auniyalbauzas, regading this https://review.opendev.org/c/openstack/nova/+/882284/47/nova/compute/manager.py#424609:55
auniyals on bdm.destroy(), bdm object wont get updated correct ?09:55
auniyalfrom DB bdm will will deleted09:56
auniyalbut locally bdm object, will stay as same09:56
bauzasauniyal: not if you play with mutables09:59
bauzasbdms is a list, right?10:00
auniyalyes10:00
bauzasauniyal: https://paste.opendev.org/show/bBaeGvpA8Bx5sSF3wKgZ/10:04
bauzasin my example, tamper_bdms is altering a mutable object, which is a list10:05
bauzasthat's semantically identical to your _delete_dangling_bdms() method10:05
bauzas(provided you alter it)10:05
auniyalbauzas, I might be wrong but this is what I tested  - https://paste.opendev.org/show/bnMHd0uJuFYaSWSuP3Gu/10:19
auniyalthe first obj was bdms, then I destroyed one of bdm from it, there is no change in bdms, then later when I retrieved new bdms list, it did not have deleted bdm10:20
bauzasauniyal: because you haven't deleted the local copy 10:20
auniyalyes, 10:21
bauzasbdm.destroy() doesn't remove the object from the list10:21
bauzasyou somehow need to either pop it from the list, or just delete it by hand 10:21
auniyalyes, so the same I was trying to say, so instead of deleted their, we ca have fresh copy10:21
auniyalwill it affect performance10:22
bauzasauniyal: sure, I don't disagree, but it's just doing an extra RPC roundtrip for some calculation we already made10:22
bauzasauniyal: if you compare your new bdm list after the cleanup with the old one, you should in theory see that only the destroyed BDMs are left10:23
bauzasauniyal: well, it's optimization10:23
bauzasyou know what? I'll switch my vote to +1 and I'll leave others chime in10:23
auniyalif it may affect, after destroy I can remove bdm from local copy and return same one.10:28
auniyalI think only one line change10:28
bauzasyou don't need to return anything, bdms is a mutable10:39
bauzastake my pastebin as a reference 10:39
auniyalyes, just pass bdm, update the list after bdm destroy10:39
auniyalbauzas, TypeError: 'BlockDeviceMappingList' object does not support item deletion, neither  remove 11:14
auniyalit has these - https://paste.opendev.org/show/bnDCZUcY00oiKOYIreMs/11:14
auniyalits inherited from ObjectListBase which is again from ovoo_base.ObjectListBase 11:16
auniyalthis is the oslo one, is there a way to del bdms obj with oslo ?11:17
*** d34dh0r5- is now known as d34dh0r5312:14
opendevreviewAmit Uniyal proposed openstack/nova master: Delete dangling bdms  https://review.opendev.org/c/openstack/nova/+/88228412:23
auniyalbauzas, I have updated the patch can you please have another look12:32
opendevreviewAmit Uniyal proposed openstack/nova master: Delete dangling bdms  https://review.opendev.org/c/openstack/nova/+/88228412:46
bauzasauniyal: sec, finding a quick alternative13:43
bauzassorry, I needed to leave for a sec, back14:04
bauzasdansmith: I'm just discussing with auniyal14:20
dansmithabout removing items from the BDM list? I think he resolved it no?14:20
bauzasdansmith: I wonder why oslo.versionedobjects doesn't support directly deleting a list attribute14:20
bauzasis it by design or just a miss ?14:21
dansmithbut I definitely agree with you that fetching again to refresh is not good, and I thought I made that point already, but that might have been *another* patch14:21
bauzaswe have many builtins but not this one14:21
dansmithbauzas: I think it works by index, didn't he say?14:21
bauzasdansmith: not exactly, see my comment : https://review.opendev.org/c/openstack/nova/+/88228414:21
bauzasshit14:21
bauzaswrong url14:22
bauzasdansmith: https://review.opendev.org/c/openstack/nova/+/882284/comment/8a1291a9_e0291cdc/14:22
bauzasI just simplified the loop14:22
sean-k-mooneygenerally its preferable to not use del by the way14:22
bauzasbut, the fact is, we can access the index, but we can't directly del it14:22
sean-k-mooneyi know it works but it raises if its not present14:23
bauzasso we need to access the somehow hidden objects internal field of the CoercedList objet14:23
bauzassean-k-mooney: I don't disagree on the general case, but in this case, I want to prevent an extra RPC roundtrip for something easy to cleanup14:23
sean-k-mooneythe bdms is a list of objects filed right14:24
sean-k-mooneyhttps://github.com/openstack/oslo.versionedobjects/blob/master/oslo_versionedobjects/fields.py#L1179-L118314:24
dansmithbauzas: your suggestion is to modify the list while iterating right? that's not good.14:24
bauzasdansmith: well, yeah you're right, it's a bad pattern14:24
sean-k-mooneywe should not need to fetch again14:24
sean-k-mooneyjust recreate the bdms object wihtout the relevent bdms14:24
bauzasokay, so I'll remove this comment and just ask auniyal to access the index the other wazy14:25
dansmiththe current approach that makes a list to remove and then does bdms.objects.remove() for each seems fine to kme14:26
sean-k-mooneycant we effectivly jsut do a list comprehaent and create a new bdms object by iterating over the old one and filter out the things we dont want14:26
dansmithyep, same same14:26
bauzasdansmith: yeah, I'll remove my comment and just leave one comment which is a nit : we can iterate the bdms by 'for bdm in bdms' without requiring to iterate like he wrote with 'for bdm in bdms.objects'14:27
bauzasthere are many ways to do it14:27
bauzasthis is just a class python filtering over a list14:27
bauzasclassic*14:27
sean-k-mooneyhonestly i woudl prefer if this funciton did not modify the passed in bdms object in place14:28
sean-k-mooneyand instead just constucted a new one and returned that14:28
dansmithwell, that's not very good either,14:29
dansmithbecause it would need to re-associate the context14:29
bauzasyeah14:29
sean-k-mooneysorry trying to parse what tha tmeans14:29
dansmithI don't remember if obj_clone() keeps the context, so if it does, then you could do that, but .. the point here is that it is _intending_ to modify the list14:29
dansmithand save it14:29
bauzasthe workflow is :14:30
bauzas1/ call RPC to get the list14:30
bauzas2/ sanitize the list and eventually remove stale ones14:30
sean-k-mooneyright but we rarely use the fact that python supprot pass by pointer14:30
bauzas3/ do the rest of bdm checking that was existing14:30
sean-k-mooneyso the fact we are modifying an imput arge at all is kind of odd based on our normal code style14:30
bauzassean-k-mooney: well, I actually like python because of its mutables14:30
bauzasbut this is just a code pattern14:31
dansmithI don't think it's a rule or even convention that we don't modify the input arguments in these RPC calls14:31
bauzasI just wanted originally to prevent an unnecessary extra RPC roundtrip14:31
bauzasdansmith: well, this is after the RPC call14:31
sean-k-mooneyits not a rule no i just find it tend to make the code less intuitive to read14:31
dansmithbauzas: yes and that's good14:31
sean-k-mooneyas you cant tell form the call site if it modfies the object or not14:32
dansmithsean-k-mooney: if we're going to modify it and save it, then modifying the copy we got makes more sense to me anyway, even though nobody else has a pointer to this actual instance other than us :)14:32
bauzasn-cpu gets the list, then n-cpu reduces the list14:33
sean-k-mooneyi guess either works i just like the pattere of x = do_stuff(x)14:33
bauzasthen n-cpu can call as many saves as it wants14:33
sean-k-mooneyhostly either works14:33
sean-k-mooneythe only time i really care is if a funciton modifed inputs and returns something14:33
sean-k-mooneyi really dont like that partern but if it does one or the other its generally ok14:34
bauzassean-k-mooney: yeah and that's because your brain is set on functional programming while I'm more imperative :p14:34
sean-k-mooneythis only modifies the input14:34
sean-k-mooneybauzas: not really i just have been bitten bug bugs where inputs were modifed in nova and we were not aware of it14:35
sean-k-mooneyim thinkign of the requst spec in the schudler specificaly 14:35
bauzasyeah, I do understand14:36
bauzashere, it's a filtering case14:36
bauzasbut I'm fine with both approaches, we can ask amit to return the sanitized list14:36
bauzasif you don't like the implicit approach of tampering a mutable14:37
bauzas(which I do understand is somehow risky)14:37
sean-k-mooneyits not that its risky just less common in our code base14:37
sean-k-mooneythe end result14:37
sean-k-mooneywe want in ether case14:38
sean-k-mooneyis a single rpc14:38
bauzasI just don't want us to push this feature under the FeatureFreeze bus because of a code pattern argument :)14:38
sean-k-mooneyso aslong as we get ther its fine14:38
bauzaswhich is the case in the current latest PS14:38
bauzasso, +214:38
sean-k-mooneyack14:38
sean-k-mooneyelodilles: by the way the os-vif ci has been pretty green https://zuul.openstack.org/builds?project=openstack%2Fos-vif&skip=0 bu twe also dont see many executions there either14:56
sean-k-mooneysounds like you are concerredn about hte os-vif-ovs-iptables job sepcifically which has had a few failures https://zuul.openstack.org/builds?job_name=os-vif-ovs-iptables&project=openstack%2Fos-vif14:57
elodillessean-k-mooney: yepp, os-vif-ovs-iptables failed 2 times on the generated 'ci health checking' patch, but i saw that it passed recently15:07
elodillesi'm not really concerned, but wanted to see that the DNM patch is passing so that i can abandon it o:)15:08
sean-k-mooneyelodilles: i think this failrues were mainly related to volumes16:02
sean-k-mooneyso ya we will see what happens iwht the dnm16:03
sean-k-mooneyhonestly we likely sould disable the volume tests in os-vif16:03
sean-k-mooneyits also possible that those test are not configured for verification16:04
sean-k-mooneyits not disabled expleictly however so i dont thin kthis is the issue16:05
*** ralonsoh is now known as ralonsoh_ooo16:20
bauzasfolks, I'm about to EOB, but I'd appreciate if some cores around may look in particular at https://review.opendev.org/c/openstack/nova/+/873221 , https://review.opendev.org/q/topic:dangling-volumes+project:openstack/nova+is:open and https://review.opendev.org/q/topic:ironic-shards+project:openstack/nova+is:open which both have my +2 (mostly, or just asking for nits)16:54
bauzasas a reminder, FF is tomorrow and I'd very appreciate if we could somehow give a go to those series 16:54
opendevreviewAmit Uniyal proposed openstack/nova master: Delete dangling bdms  https://review.opendev.org/c/openstack/nova/+/88228416:59
auniyalbauzas, dansmith, sean-k-mooney can you please have another look dangling patches17:01
dansmithbauzas: there are so many typos in that bottom ironic sharing patch including doc text and renos, I think it's worth a respin.. what do you think?17:07
sean-k-mooneyauniyal: im -1 on the reproducer https://review.opendev.org/c/openstack/nova/+/889947/6/nova/tests/functional/compute/test_attached_volumes.py18:01
sean-k-mooneybauzas: dansmith  im confilcited on what to do with the dangeleing volume seriese18:08
dansmithI gotta run in a sec, but I don't think it's anything approaching critical18:09
dansmithand it doesn't yet address concerns about the cinder volume being detached and reattached outside nova, AFAIK18:09
sean-k-mooneyright so i didnt think we wanted to suport that18:09
sean-k-mooneythis would actully enabel detach to work18:09
sean-k-mooneyalbeit with a hard reboot18:10
sean-k-mooneyadn needing a service user token18:10
dansmithI mean it doesn't catch that case, but should18:10
sean-k-mooneyso the bit im conflicted by is Verify if volume exists and cinder has an attachment ID else delete its BDMS data from nova DB and vice versa.18:11
sean-k-mooneythe "and cinder has an attachment ID" part18:12
dansmithwhat about it?18:13
sean-k-mooneyi was not expecting use to delete the BDM if the volume existiend but the atachment was gone18:13
dansmithwhy not? that means it's been detached underneath us18:13
sean-k-mooneywhihc we dont allow so we shoudl repair it?18:14
dansmith(/me will go poof in a few minutes without warning)18:14
sean-k-mooneyor error if you attached it to somethign else18:14
dansmithwe should heal ourselves and not require admin intervention18:14
dansmithwhich is what is needed right now18:14
sean-k-mooneyin the case the volume is deleted i agree18:15
sean-k-mooneyim unconfrotabel with just the case wehre the attachment is deleted18:15
sean-k-mooneyperhaps that is valid18:15
dansmithwhy? from our perspective it's the same18:15
dansmiththe state of the volume has been mucked with undereath us, so bail18:16
dansmithdon't try to hook it back up or recover in any other way than just avoiding our own problems in the future18:16
sean-k-mooneyso i woudl error in the attachment delete case personslly and say no you srewed with thsi you need ot manuall fix it18:16
dansmiththat's exactly what we're trying to avoid18:16
dansmithbecause the user can't manually fix it18:17
dansmithjust like when we go to delete something and it's already deleted, we say "okay fair enough" and move on18:17
dansmiththese are distributed systems.. nova's DB could have been restored from ten minutes ago, etc18:18
sean-k-mooneywell the admin can with the attachemtn refesh command. i was debating if we shoudl do the refresh in the case the voluem is aviabale but i get you dont want that complexity18:18
sean-k-mooneyi guess the fact the attachment api needs service user18:19
sean-k-mooneytokens now18:19
sean-k-mooneyfixes most of the concerns i have18:19
sean-k-mooneyas long as we are still very clear that if you detach a volume form cidner or a prot form neutorn your isntance is tainted18:19
sean-k-mooneyand all we are doing is minimal efffort to boot again 18:20
sean-k-mooneyi guess that is ok18:20
sean-k-mooneyill look at this again tomorooww but without the service user chagne i dont this woudl have been the correct approch to take.18:25
dansmithsean-k-mooney: right the point in this is very specifically to *not* require the admin to intervene when the only thing they can reasonably do is run a command to do the same thing, which is eliminate the BDM from nova's side18:35
dansmithwe wouldn't want the admin to "fix" it (IMHO)18:35
dansmithand especially with the cinder policy change, the only real thing that could cause this workflow-wise would be a database restore from checkpoint such that the two sides are out of sync, or an admin operation on the cinder side that just then needs to be mirrored with admin operations on the nova side18:36
dansmithit's not to _enable_ any sort of bad behavior by the user, it's to avoid the user being stuck and unable to get their instance back18:36
dansmithfor self-service reasons, I'd much prefer my volume to just become detached and me still be able to work on my instance without waiting in an admin's queue18:36
opendevreviewMerged openstack/nova master: Add a new NumInstancesWeigher  https://review.opendev.org/c/openstack/nova/+/88623218:46
opendevreviewmelanie witt proposed openstack/nova master: nova-manage: Add 'limits migrate_to_unified_limits'  https://review.opendev.org/c/openstack/nova/+/89164619:23
opendevreviewmelanie witt proposed openstack/nova master: Add documentation for unified limits  https://review.opendev.org/c/openstack/nova/+/89312719:23
opendevreviewmelanie witt proposed openstack/nova master: Add documentation for unified limits  https://review.opendev.org/c/openstack/nova/+/89312719:34
sean-k-mooneydansmith: ok i can bye that. i kind of feel like in addtion to this we show have a external event form cidner saying its detach or something like that to make it properly automatic but again with the service token change that shoudl not be reuqired either.19:36
dansmithto be clear, I do not think we should _ever_ tolerate detach via cinder, thus no need for an event19:36
dansmiththis is a corruption recovery mechanism and nothing else19:36
sean-k-mooneyso ill remove my -1 on the repoducer. i still have not looked at the actual final patch however19:36
sean-k-mooneyack19:37

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!