Tuesday, 2022-07-19

*** hemna7 is now known as hemna00:53
*** tkajinam is now known as Guest543507:14
*** Guest5435 is now known as tkajinam07:36
bauzasmorning Nova07:48
gibio/08:13
opendevreviewMerged openstack/nova master: Adds link in releasenotes for hw machine type bug  https://review.opendev.org/c/openstack/nova/+/84953208:48
stephenfinmorning, I need another +2 on dtantsur's patch here https://review.opendev.org/c/openstack/nova/+/84988108:51
stephenfinto hopefully finally unblock jsonschema 4.x08:51
gibisean-k-mooney[m]: done09:00
gibisorry09:00
gibistephenfin: done09:00
stephenfingibi: thanks09:14
opendevreviewMerged openstack/nova master: Add a proper schema version to network_data.json  https://review.opendev.org/c/openstack/nova/+/84988109:22
bauzasUggla: lemme know when you have time to respin https://review.opendev.org/c/openstack/nova/+/84589709:49
Ugglabauzas, yep I will look at it beginning of the afternoon. ok for you ?09:50
bauzasUggla: cool, I already uploaded my series using the 2.92 microversion so I'll only fix some simple merge conflicts https://review.opendev.org/c/openstack/nova/+/84913309:51
bauzassean-k-mooney: I found a way to have a regression test for https://bugs.launchpad.net/nova/+bug/195165609:51
bauzasI'll upload it09:51
sean-k-mooneybauzas: that's great09:51
sean-k-mooneyi did not have time to try and create one myself but i hoped it would be possible without too much work09:52
sean-k-mooneybauzas: with that regression did you also observe the breakage of the reuse of freed mdevs09:52
sean-k-mooneymy guess is the list of available free mdevs was empty09:52
bauzasyes09:53
sean-k-mooneyso we always tried to allocate09:53
bauzasso the inventory is getting less09:53
sean-k-mooneywell that's good that we have a reproducer now09:53
sean-k-mooneyit should aid in fixing and backporting09:53
gibisean-k-mooney, bauzas: can we quickly discuss the force down requirement in https://review.opendev.org/c/openstack/nova/+/848886 ?09:56
gibisean-k-mooney: could you elaborate on the data corruption risk. does it depend on the task state?09:57
sean-k-mooneywell my original concern was we were in the middle of doing an operation09:59
sean-k-mooneyit could be a snapshot for example09:59
sean-k-mooneyand depending on the backend we do strange things with say nfs09:59
sean-k-mooneyso im not sure how safe it is to always ignore it09:59
sean-k-mooneylike for nfs cinder volume snapshots we create delta disks and update some paths in the xml and on the cinder side10:00
sean-k-mooneyif we evacuate in the middle of that i dont know what the state will be10:00
sean-k-mooneysame for ceph i guess10:01
sean-k-mooneyin this case its just powering off10:01
sean-k-mooneywhich should be fine because we allow evac with active vms10:01
gibiohh so the case is when the task was actually started by the compute service, then the compute service died. leaving a half uploaded snapshot or leaving a not fully updated guest xml behind10:02
sean-k-mooneyya or if we were shelving etc. 10:02
sean-k-mooneybasically if we are in the middle of an operation im not sure what will happen10:02
sean-k-mooneymaybe its fine to evacuate10:02
sean-k-mooneybut we dont allow that today10:03
gibiso if that is the case then simply asking the admin to fence + force down before evac is not enough, the admin manually needs to clean up / repair things10:03
sean-k-mooneyya i guess that is true10:03
gibiright now force down only requires fencing10:03
sean-k-mooneyyes10:03
gibibut going forward it will require a manual check by the admin10:03
gibiand we need to be able to describe what to check10:03
sean-k-mooneyso i guess it really does not help in that respect10:04
sean-k-mooney(force down)10:04
sean-k-mooneyi guess we are starting from this is blocked and you have to do reset-state10:04
sean-k-mooneyto evacuate10:04
gibiyeah, it is convenient to push the responsibility to the admin by asking to force down, but we have to be able to tell the admin what to do before force down10:04
sean-k-mooneyif the node is down say psu exploded10:05
gibireset-state and force down is pretty similar in this regard, we ask the admin to do something and take over the burden of keeping the system consistent10:05
sean-k-mooneyi dont think there is much if anything the operator can do to clean up the state10:06
gibiyeah10:06
sean-k-mooneyat least not without looking behind cinder's back at the storage 10:06
sean-k-mooneyso maybe im overthinking it10:06
sean-k-mooneybut that is why i was suggesting force down10:06
gibiwe can suggest force down but then we have to change the definition of force down, as simply fencing the host is not enough any more10:08
sean-k-mooneyya i think im coming around to your way of thinking and not overloading force-down10:09
sean-k-mooneywe perhaps should instead just document that if the instance task state is not None10:09
sean-k-mooneythe operator may need to do additional cleanup of the instance10:09
sean-k-mooneye.g. remove a partial snapshot10:10
gibiyepp10:10
gibiwe can keep the reset-state requirement as today if that helps against accidental evac10:10
sean-k-mooneywell i guess that is the choice we have to make. if we think the timeout is sufficient as it has been in the past10:11
sean-k-mooneythen we can proceed with the change and just drop the force down requirement and add some extra docs10:11
sean-k-mooneyotherwise yes we can leave it as it is today with reset state10:12
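The documentation rule floated above (task state not None means the operator may have extra cleanup to do) can be sketched as a small pre-evacuation check. This is an illustration only, not nova code: the helper name `evacuation_warnings` and the plain-string `task_state` argument are hypothetical.

```python
from typing import Optional


def evacuation_warnings(task_state: Optional[str]) -> list[str]:
    """Return operator-facing warnings for an instance about to be evacuated.

    Hypothetical helper: it does not talk to nova, it only encodes the rule
    "task_state is not None => additional manual cleanup may be needed".
    """
    if task_state is None:
        # No operation was in flight when the compute service went down.
        return []
    return [
        f"instance was in task_state '{task_state}' when the host failed; "
        "check for leftovers such as a partial snapshot before evacuating"
    ]


print(evacuation_warnings(None))
print(evacuation_warnings("image_snapshot"))
```

What exactly needs checking for each task state is the open question from the discussion; the sketch only shows where such guidance would hook in.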
bauzasOK, I need to go lunching, but I think I see the problem with the mdev names11:13
bauzasit creates an exception when we run the periodic RT method for updating11:13
sean-k-mooneyack that is what i was expecting would happen11:14
sean-k-mooneyeither an exception or the list would be empty11:14
sean-k-mooneyin either case resulting in placement getting out of sync11:14
sean-k-mooneyand preventing reuse11:14
opendevreviewMerged openstack/nova master: libvirt: Ignore LibvirtConfigObject kwargs  https://review.opendev.org/c/openstack/nova/+/83064411:27
opendevreviewMerged openstack/nova master: libvirt: Remove unnecessary TODO  https://review.opendev.org/c/openstack/nova/+/83064511:47
sean-k-mooneystephenfin: i added https://review.opendev.org/c/openstack/nova-specs/+/849488 to open discussion to ask for the spec freeze exception11:54
sean-k-mooneystephenfin: so lets defer the +w to sylvain so they can either -2 it if we reject the exception or +w if we accept assuming they agree with the spec content11:55
stephenfinsounds good12:06
sean-k-mooneystephenfin: remind me you originally wanted to truncate the displayname to set the hostname right12:08
sean-k-mooneyrather than normalise12:08
sean-k-mooneyim strongly considering if we should have done neither and added support for fqdns when talking to neutron by doing the truncation there12:10
stephenfinoh, I've no idea /o\ I'd have to go check the spec/patches12:10
sean-k-mooneyya not really important now i guess12:11
sean-k-mooneycontext is  https://review.opendev.org/c/openstack/nova-specs/+/849765 and whether or not we should do this in nova12:11
opendevreviewsean mooney proposed openstack/nova-specs master: Revert "Configurable instance domains"  https://review.opendev.org/c/openstack/nova-specs/+/85004812:14
sean-k-mooneydansmith: ^ ill wait for the discussion in the nova team meeting to determine if i should repropose the spec under a spec freeze exception or if we defer to AA12:17
sean-k-mooneyin which case we dont need to rush to reopen the review and we can wait for artom to return from pto12:17
opendevreviewsean mooney proposed openstack/nova-specs master: Revert "Configurable instance domains"  https://review.opendev.org/c/openstack/nova-specs/+/85004812:32
sean-k-mooney^ less typos and better commit12:32
opendevreviewStephen Finucane proposed openstack/nova master: Use unittest.mock instead of third party mock  https://review.opendev.org/c/openstack/nova/+/71467612:46
opendevreviewStephen Finucane proposed openstack/nova master: Remove the PowerVM driver  https://review.opendev.org/c/openstack/nova/+/85034612:46
opendevreviewribaudr proposed openstack/nova master: Allow unshelve to a specific host (REST API part)  https://review.opendev.org/c/openstack/nova/+/84589713:28
*** dasm|off is now known as dasm|ruck13:31
opendevreviewMerged openstack/nova-specs master: Revert "Configurable instance domains"  https://review.opendev.org/c/openstack/nova-specs/+/85004813:46
*** haleyb_ is now known as haleyb14:02
opendevreviewsean mooney proposed openstack/nova-specs master: Revert "Revert "Configurable instance domains""  https://review.opendev.org/c/openstack/nova-specs/+/85035214:11
sean-k-mooneyok i have made the instance.domain -> instance.dns_domain change and tried to call out the open issues  in ^14:12
bauzasI'm under deep water but we'll have our weekly meeting15:23
bauzas37 mins from now here15:23
bauzasI'll prepare the agenda15:23
UgglaWent out for a bike ride to get my car from the garage. It is really hot today.15:27
admin1hi .. i have cpu_allocation_ratio is at 4.0, but it refuses to go above the physical threads .. nova scheduler reporting:  There was a conflict when trying to complete your request. Unable to allocate inventory: Unable to create allocation for 'VCPU' on resource provider 'UUID '. The requested amount would exceed the capacity.15:33
admin1if i want to deploy something below the actual vcpus, it works .. but it does not go above the physical threads 15:33
sean-k-mooneyadmin1: you cannot have a single allocation that exceeds the number of actual cpus15:34
admin1sorry .. what does that mean :) 15:34
sean-k-mooneywe do not allow vms to oversubscribe against themselves15:34
sean-k-mooneyso 1 vm can never have more vcpu than the host has15:34
sean-k-mooneythe allocation ratio is not related to that15:35
admin1i have 40 cpus ( physical threads) .. and  ratio is 4.0  .. but i can only have 2 instance of 16 vcpu there 15:35
sean-k-mooneyhum you should be able to boot more15:35
sean-k-mooneyso the total on the inventory is 4015:36
admin1yes 15:36
sean-k-mooneyand used is 3215:36
admin1right 15:36
sean-k-mooneywhat version of mariadb are you using15:36
sean-k-mooneyyou might be hitting a mariadb bug that was reported on the mailing list last week15:36
admin1Server version: 10.6.5-MariaDB-1:10.6.5+maria~focal-log mariadb.org binary distribution15:36
sean-k-mooneywe are possibly seeing the same bug downstream15:36
sean-k-mooneyadmin1: yep that version is apparently broken and its fixed in 10.6.8 i believe15:37
*** akekane_ is now known as abhishekk15:37
sean-k-mooneyadmin1: see this thread https://lists.openstack.org/pipermail/openstack-discuss/2022-July/029536.html15:37
admin1sean-k-mooney thanks  for the direction 15:38
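To make the two limits in the exchange above concrete, here is a minimal sketch of how an inventory check of this kind is usually described for placement: capacity is (total - reserved) * allocation_ratio, and separately a single allocation may not exceed max_unit. The `Inventory` class is hypothetical, and treating max_unit as equal to the host's thread count is an assumption taken from the "1 vm can never have more vcpu than the host has" remark.

```python
from dataclasses import dataclass


@dataclass
class Inventory:
    """Hypothetical model of one resource class on one resource provider."""
    total: int
    allocation_ratio: float
    reserved: int = 0
    used: int = 0
    max_unit: int = 0  # largest single allocation; assumed equal to total here

    @property
    def capacity(self) -> int:
        # Oversubscription applies to the provider as a whole.
        return int((self.total - self.reserved) * self.allocation_ratio)

    def fits(self, requested: int) -> bool:
        # A single allocation can never exceed max_unit, whatever the ratio.
        if requested > self.max_unit:
            return False
        return self.used + requested <= self.capacity


# Numbers from the conversation: 40 threads, ratio 4.0, 32 VCPU already used.
vcpu = Inventory(total=40, allocation_ratio=4.0, used=32, max_unit=40)
print(vcpu.capacity)   # 160
print(vcpu.fits(16))   # True: a third 16-vcpu instance should fit
print(vcpu.fits(48))   # False: one instance larger than the host
```

Under these assumptions a third 16-vcpu instance is well within capacity, which matches the suspicion that the rejection came from the MariaDB bug and not from the allocation ratio.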
bauzas#startmeeting nova16:00
opendevmeetMeeting started Tue Jul 19 16:00:21 2022 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.16:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:00
opendevmeetThe meeting name has been set to 'nova'16:00
bauzashello everyone16:00
bauzas#link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting16:00
* bauzas is currently trampled by internal issues with vgpus but I'll chair this meeting16:00
bauzaswho's around ?16:01
gibio/16:01
elodilleso/16:01
bahnwaertero/16:01
dansmitho/16:01
Ugglao/16:02
bauzasok, we can start16:02
bauzas#topic Bugs (stuck/critical) 16:02
bauzas#info No Critical bug16:03
bauzas#link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 10 new untriaged bugs (-1 since the last meeting)16:03
bauzasUggla: thanks for helping on this16:03
bauzasany bug to point out ?16:03
bauzasmmmm, crickets 16:04
bauzas #link https://storyboard.openstack.org/#!/project/openstack/placement 27 open stories (+0 since the last meeting) in Storyboard for Placement 16:04
bauzas #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster16:05
bauzas#info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster16:05
bauzas#link https://storyboard.openstack.org/#!/project/openstack/placement 27 open stories (+0 since the last meeting) in Storyboard for Placement 16:05
bauzassean-k-mooney: do you have some time for looking at bugs this week ?16:05
bauzasmmm, I'll discuss with sean-k-mooney later to know if he can16:07
bauzasif not, I'll be the owner for this week16:08
bauzas#topic Gate status 16:08
bauzas#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:08
sean-k-mooneyi can make it16:08
sean-k-mooneyi probably dont but i will16:08
bauzassean-k-mooney: thanks, very much16:08
bauzassean-k-mooney: again, this is not a priority16:08
bauzasdo it as you can16:08
bauzasthe number of open bugs is pretty low at this time, which is cool, kudos to the team16:09
sean-k-mooneyits fine ill keep an eye on them16:09
bauzass/open/new16:09
bauzas#link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly Placement periodic job status 16:09
bauzasa new check :16:09
bauzas#link https://zuul.openstack.org/builds?job_name=tempest-integrated-compute-centos-9-stream&project=openstack%2Fnova&pipeline=periodic-weekly&skip=0 Centos 9 Stream periodic job status16:09
bauzassean-k-mooney: didn't have time to see whether you proposed a tempest patch for removing c9s from check and gate ?16:10
dansmithisn't there some way to see previous periodic runs too?16:10
sean-k-mooneybauzas: yes i did16:10
dansmithso we can see that it's passed N times in a row?16:10
sean-k-mooneydansmith: yes16:10
dansmithor has this only run once?16:10
sean-k-mooneydansmith: i can see if i can get that16:10
bauzasdansmith: https://zuul.openstack.org/builds?job_name=tempest-integrated-compute-centos-9-stream&project=openstack%2Fnova&skip=016:11
dansmithsean-k-mooney: not important now, just wanted to make sure we don't need to build a thing16:11
sean-k-mooneyi think you can filter the build by pipeline and project16:11
sean-k-mooneyya basically ^16:11
bauzasdansmith: as you see, for the moment, we're still testing c9s in check and gate due to a tempest template16:11
dansmithyeah but that only shows one periodic run, is that because it has only run once yet?16:11
bauzasdansmith: correct16:11
dansmithgotcha16:11
bauzaswe could also check experiment if we want16:12
sean-k-mooneyhttps://review.opendev.org/c/openstack/tempest/+/85024216:12
bauzasta16:12
sean-k-mooneythat is the tempest template update patch ^16:12
dansmithack16:12
bauzas#link https://review.opendev.org/c/openstack/tempest/+/850242 removing Centos 9 stream jobs from default Tempest template16:12
bauzas#link https://zuul.opendev.org/t/openstack/builds?job_name=nova-emulation&pipeline=periodic-weekly&skip=0 Emulation periodic job runs16:13
bauzas#info Please look at the gate failures and file a bug report with the gate-failure tag.16:13
bauzas#info STOP DOING BLIND RECHECKS aka. 'recheck' https://docs.openstack.org/project-team-guide/testing.html#how-to-handle-test-failures16:13
bauzasthat's it 16:13
bauzasmoving on16:13
bauzas#topic Release Planning 16:13
bauzas#link https://releases.openstack.org/zed/schedule.html16:13
bauzas#info Zed-2 was last week16:13
bauzas#info Specs are no longer accepted16:14
bauzas#link https://blueprints.launchpad.net/nova/zed current blueprints open for the zed timeframe16:14
bauzas13 bps16:14
bauzasless than in Yoga16:14
bauzaswe could have one more16:14
bauzasbut this will be discussed in the open discussion 16:15
bauzasoh, I forgot to prepare an etherpad, stupid me16:15
bauzasgive me a second, I'd like to ask our API microversion proposers to ask for a specific microversion16:16
bauzasinstead of all of them rushing into 2.9216:16
bauzas2.9116:16
bauzasthere it is16:17
bauzas#link https://etherpad.opendev.org/p/nova-zed-microversions-plan Etherpad for a microversion use query16:18
bauzasfwiw, I left 2.91 for https://review.opendev.org/c/openstack/nova/+/84589716:18
bauzasand https://review.opendev.org/c/openstack/nova/+/849133 is now ready for 2.9216:18
bauzasthe latter will require a very small merge conflict resolution due to the api docs16:19
bauzasbut as you see, nothing was preventing me to use the 2.92 microversion even if 2.91 wasn't merged16:19
bauzasthat's why I propose our different change owners to ask for a microversion16:19
gibiyou will have a bit more than a single conflict on the api doc but yes hopefully the conflict will be small16:20
bauzasI'll send an email tomorrow explaining about this16:20
bauzasgibi: I tested it16:20
bauzasgibi: 4 files were conflicting16:20
gibiI expect a conflict on https://review.opendev.org/c/openstack/nova/+/849133/5/doc/api_samples/versions/v21-version-get-resp.json16:20
bauzasyup, this one and the root v216:21
bauzasplus the rest api microversions list doc16:21
bauzasand I can't remember the last one16:21
gibiand on the api_version_request.py16:21
bauzasbut easy conflicts16:21
bauzascould be harder to resolve if two changes are touching the same API resource16:21
bauzasbut let's not overcomplicate this16:22
Ugglafyi I have just reserved 2.93 for virtiofs16:22
bauzasUggla: then prepare your API change to use the 2.93 microversion, this should work like mine16:22
gibibauzas: sure16:22
bauzastbc, we reserve the right to flip the microversions16:23
Ugglaneed to review because I bet on 2.92.16:23
bauzasdepending on the state of the review16:23
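As a rough illustration of the reservation idea, the sketch below keeps a table of reserved microversions and shows how a client would ask the compute API for one of them. The `OpenStack-API-Version: compute X.Y` header is the standard way to request a microversion; the helper names and the `2.100` entry are made up for the example, and the reservation table only repeats what was said in the meeting.

```python
def parse_microversion(value: str) -> tuple[int, int]:
    """Split 'major.minor' into integers so versions compare numerically."""
    major, minor = value.split(".")
    return int(major), int(minor)


def compute_api_headers(microversion: str) -> dict[str, str]:
    """Build the request header asking the compute API for one microversion."""
    parse_microversion(microversion)  # raises ValueError on malformed input
    return {"OpenStack-API-Version": f"compute {microversion}"}


# Reservations mentioned in the meeting; "2.100" is an invented entry that
# shows why the minor part must be compared as a number, not as a string.
reservations = {
    "2.100": "hypothetical future change",
    "2.93": "virtiofs",
    "2.91": "https://review.opendev.org/c/openstack/nova/+/845897",
    "2.92": "https://review.opendev.org/c/openstack/nova/+/849133",
}

for version in sorted(reservations, key=parse_microversion):
    print(version, "->", reservations[version])

print(compute_api_headers("2.92"))
```

Because the table is just data, flipping two reservations means swapping two entries, which is why holding a number before the previous one merges costs little beyond a small rebase.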
bauzaswe had runways before16:24
bauzastake it as an informal runway for asking reviews16:24
bauzaseither way, I'll send an email explaining the rules16:24
bauzasI don't want an overcomplicated process16:25
bauzasthis is just a way to prevent people doing frequent rebases 16:25
bauzasbut the lower the microversion you ask for, the sooner you need to be reviewed, hence be present16:26
bauzasI just don't want us to wait for new revisions that could stall other changes16:26
bauzasso I'll just say that we're free to drop some change from the microversion number16:27
bauzashope you folks don't disagree with this stupid plan16:27
gibiwith the amount of folks pushing for a microversion right now I don't see trouble16:28
Ugglasounds good16:28
bauzaswell, I see 5 different patches 16:28
bauzasat least16:28
bauzasanyway, moving on16:29
gibipersonally I would not bet on current+3 or higher to be ordered 16:29
bauzas#action bauzas to clarify the game rules of the etherpad in a later email tomorrow16:29
gibibut having a c+1 and c+2 ordered makes sense16:29
bauzasgibi: that's a reasonable point16:29
bauzasI could remove 2.95 and newer16:29
bauzasmoving on 16:29
bauzas#topic Review priorities 16:29
gibiand also if you are not ready for review then please don't allocate a microversion :)16:29
bauzasgibi: that's the game rule16:30
gibicoolio16:30
bauzasand if you are on vacations for 4 weeks, don't ask for the next microversion16:30
bauzasor ask the next one, provided your patch can be reviewed before you leave16:30
bauzas:)16:30
gibi:)16:30
bahnwaerter:)16:31
* bauzas of course won't take 4 weeks off16:31
bauzasonly 3.516:31
bauzas#topic Review priorities 16:31
bauzas#link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement+OR+project:openstack/os-traits+OR+project:openstack/os-resource-classes+OR+project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/osc-placement)+label:Review-Priority%252B116:32
bauzashuzzah16:32
bauzas#link https://review.opendev.org/c/openstack/project-config/+/837595 is merged16:32
bauzas#link https://docs.openstack.org/nova/latest/contributor/process.html#what-the-review-priority-label-in-gerrit-are-use-for explains now the new gerrit flag and how to use it16:32
gibiwould be interesting to see how many +1 will appear on this list from now16:33
dansmithI friggin hate the RP label btw16:33
bauzasat least two from sean-k-mooney :)16:33
gibibauzas: btw you need to update your query to show both +1 and +216:34
dansmithI keep RP+2ing patches and people ask a week later why I didn't CR+2 them :/16:34
bauzasdansmith: point them to the doc16:34
sean-k-mooneythey were probably from before the update16:34
bauzasgibi: yeah I need to modify it16:34
dansmithbauzas: no, I mean *I* do the wrong thing because I go looking for the +2 button to click and choose the wrong one16:34
sean-k-mooneyyes they were both from before the change merged16:35
dansmithI wish RP could be a different scale like -A +B +C16:35
sean-k-mooneybut please do look at them16:35
sean-k-mooneydansmith: it can be16:35
gibidansmith: valid point, can we change it from +2 to +B?16:35
bauzasdansmith: ah I get your point16:35
dansmithif we can choose anything,16:35
bauzasit's confusing indeed16:35
sean-k-mooneyi am pretty sure it does not have to be a number16:35
dansmithcould we make it -NotYet, +Prio, +HighPrio ?16:35
bauzasI don't know, we need to look at gerrit acls16:35
dansmithor -Low, +Med, +High16:36
sean-k-mooneyi can check but this is just a custom label16:36
dansmithdon't do it just for me, but just relating my frustration with it on the glance side16:36
bauzasdansmith: man, we got this https://review.opendev.org/c/openstack/project-config/+/837595 open for a while, you know your very good comment would have been more than appreciated then ? :D16:36
bauzasanyway, it took us 6 months to get it 16:36
dansmithbauzas: sorry, but it has taken actual experience to realize it's annoying16:36
bauzasI'm pretty sure we can take one month more to find if we can change the acls and to get it merged :p16:37
sean-k-mooneybauzas: well thats just because we didnt agree on what it should be 16:37
sean-k-mooneyif we can set a custom value 16:37
bauzassean-k-mooney: not exactly16:37
sean-k-mooneyand we want to, we can get it updated quickly16:37
dansmithI will help push for reviews on a change if you want to do it16:37
bauzassean-k-mooney: it took us nearly a cycle to agree and then nearly a midcycle to get it merged16:37
dansmithagain, not trying to mess anything up, just conveying my experience16:37
bauzasdansmith: your comment is legit16:37
sean-k-mooneybauzas: thats just because i had asked them to wait until i went back to them16:37
bauzasand I don't want contributors to mess this up16:37
gibiI would go for +P (contributor review promise) +CP (core review promise)16:38
bauzassomeone thinking he would +1 a patch and instead saying "yay, I'm committed on reviewing it soon"16:38
dansmithis that what you meant by +1 +2? :)16:38
dansmithif so, the labels would be much more useful16:38
bauzasyup16:38
sean-k-mooneyhttps://gerrit-review.googlesource.com/Documentation/config-labels.html#label_value16:38
bauzaswe'll figure it out16:38
sean-k-mooneyso it might need to be an int16:38
sean-k-mooneythe name can be anything we want16:39
sean-k-mooneywe can probably move on and confirm outside the meeting16:40
dansmith++16:40
gibiack16:40
bauzas++16:40
bauzaswe have someone having added https://review.opendev.org/c/openstack/nova-specs/+/816542 to the agenda16:41
bauzasSpec for modifiable user_data was accepted / merged, but implementations are still pending final review / merge 16:41
bauzasso I guess he's raising our attention to :16:41
bauzas#link https://review.opendev.org/c/openstack/nova/+/816157 server implementation16:42
bauzas #link https://review.opendev.org/c/openstack/python-novaclient/+/816158 novaclient16:42
bauzas#link https://review.opendev.org/c/openstack/python-openstackclient/+/847792 openstackclient16:42
sean-k-mooneyyep i skimmed that16:42
bauzas#link https://review.opendev.org/c/openstack/python-novaclient/+/816158 novaclient16:42
bauzasthis is one of the API changes we have16:42
bauzas2.94 ?16:42
sean-k-mooneyim not sure if its ready to merge but its an api change yes16:43
bauzasI'll add it to the etherpad16:43
bauzasmoving on16:44
bauzas#topic Stable Branches 16:44
bauzaselodilles: are you around ?16:44
elodillesyes16:44
elodilles#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:44
elodilles#info stable/train is blocked, fix exists but hasn't merged yet due to intermittent failures + now nova-grenade-multinode & nova-live-migration started to fail @ devstack 'create' phase16:44
elodillesso train is now 'more' broken16:45
elodillesi could not reproduce the devstack issue locally yet16:46
bauzaslooks like the French SNCF rail 16:46
elodilles:)16:46
elodillesanyway, i'll try to look into the issue, but any hint is appreciated16:47
elodilles(i've added some details to the nova-stable-branch-ci, but haven't created a bug yet)16:48
bauzaselodilles: honestly, I'm under the water as I speak16:48
elodillesbauzas: ok, no problem, it's just a heads up for everyone who is interested in train branch :)16:49
bauzas*some* may be interested16:49
elodillesand that's it about stable branches from me i think16:50
* gibi more and more feels we don't have the bandwidth to maintain stable/train16:50
bauzasunfortunately, let's move on, then16:52
bauzas#topic Open discussion 16:52
bauzas(sean) https://review.opendev.org/c/openstack/nova-specs/+/849488 spec freeze exception for spice compression16:52
bauzasso yeah I wrote we could discuss this now16:52
bauzasto see whether we punt it for Zed or we accept it16:52
bauzasanyone having opinions about it ?16:53
bauzastbh, I'm meh to it16:53
bauzastrying to honestly balance the risks vs. the benefits16:53
sean-k-mooneyrisk should be small since this is not user facing16:53
yoctozeptoo/16:53
sean-k-mooneythere is no api impact to this16:54
bauzasthis is configurable, right?16:54
yoctozeptoright16:54
sean-k-mooneyvia host level config options only16:54
yoctozeptoand defaults to previous default16:54
bauzasyeah, so basically a regression wouldn't be a big deal16:54
sean-k-mooneywe might want to default to unset16:54
bauzaschanging the options and that's it16:54
sean-k-mooneybut we could defer that to the implementation16:54
bauzasyeah16:54
sean-k-mooneyto keep it entirely off by default16:54
bauzasif that's purely additive and host-config based only, doesn't sound a big deal16:55
yoctozeptoi.e., "do not touch this part of libvirt's xml" by default?16:55
bauzascorrect16:55
yoctozeptomakes sense16:55
bauzasno upgrade impact16:55
sean-k-mooneyright we currently dont generate the elements16:55
sean-k-mooneyso we could continue to do that by default16:55
yoctozeptoagreed16:55
gibiI'm OK to grant the exception for this.16:55
bahnwaertersean-k-mooney: Yeah, I could change that. It makes more sense to only set the libvirt entries if they are specified in a nova.conf16:56
bauzasyoctozepto: do you have open changes against it ?16:56
bauzasoh, that's bahnwaerter's question then16:56
sean-k-mooneythere is a nova change and nova-specs change open16:56
yoctozepto++16:56
sean-k-mooneyso if we grant the exception we can update the spec before we merge it16:56
bauzasok, so there is already a poc16:56
sean-k-mooneyyes16:57
bauzasall the planets are aligned then16:57
bahnwaerterbauzas: Yeah, I was invited to this discussion today ;)16:57
bauzaslet me take my baton then...16:57
bauzas#agreed https://review.opendev.org/c/openstack/nova-specs/+/849488/ granted as a spec deadline exception, sounds reasonable provided there is no upgrade impact and the change being purely self-contained and additive16:58
bauzascores, I'd appreciate if you could review it ASAP16:58
bauzas(the spec, tbc)16:58
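The "only emit the libvirt entries that are set in nova.conf" behaviour agreed above could look roughly like the sketch below. It is not the proposed nova patch: the option names in `_SPICE_OPTIONS` and the plain mapping standing in for the config object are hypothetical. The child elements (`image`, `jpeg`, `zlib`, `streaming`) are the ones libvirt documents under a spice `graphics` element.

```python
import xml.etree.ElementTree as ET
from typing import Mapping, Optional

# Hypothetical option name -> (libvirt child element, attribute name).
_SPICE_OPTIONS = {
    "image_compression": ("image", "compression"),
    "jpeg_compression": ("jpeg", "compression"),
    "zlib_compression": ("zlib", "compression"),
    "streaming_mode": ("streaming", "mode"),
}


def spice_graphics_xml(conf: Mapping[str, Optional[str]]) -> str:
    """Render a spice <graphics> element, adding children only for set options."""
    graphics = ET.Element("graphics", {"type": "spice"})
    for option, (tag, attr) in _SPICE_OPTIONS.items():
        value = conf.get(option)
        if value is None:
            # Unset option: leave this part of the domain XML untouched.
            continue
        ET.SubElement(graphics, tag, {attr: value})
    return ET.tostring(graphics, encoding="unicode")


print(spice_graphics_xml({}))
print(spice_graphics_xml({"image_compression": "auto_glz"}))
```

With every option unset the output carries no compression children at all, so existing deployments keep generating the same XML as before, which is the "no upgrade impact" property the exception was granted on.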
bauzasthat's it I guess for today16:59
sean-k-mooneyill drop +2 given the pending change to the config behavior16:59
sean-k-mooneynot quite16:59
bauzassean-k-mooney: about the spec itself16:59
bauzas (sean) there seem to be considerable outstanding questions with regard to Configurable instance domains16:59
sean-k-mooneyoh ya so that's it for that topic16:59
sean-k-mooneyyep so just want to make sure we discussed ^17:00
bauzashttps://review.opendev.org/c/openstack/nova-specs/+/850048 revert was merged17:00
bauzasdo we want to grant an exception for it ?17:00
dansmithare we on to the domains thing?17:00
bauzasyup17:00
gibiI do apologize for pushing the spec through within such a short timeframe last week17:00
bauzasdns domain thing17:00
dansmithyeah I still (heartily) question the approach in general17:00
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova-specs/+/850352 is the revert of the revert with some issues addressed17:00
sean-k-mooneyyes17:00
bauzasfwiw, we're overtime17:01
bauzasso we'll need to end the meeting17:01
dansmithI know it will/would be more work to do this via integration with neutron, but it seems like it would be a lot better to do so17:01
bauzasbut I'd appreciate if people could continue the convo17:01
sean-k-mooneybauzas: well we can extend 17:01
sean-k-mooneyits in the nova channel now17:01
sean-k-mooneybut either way17:01
bauzasyeah17:01
bauzasthis is just we try to stick with one hour17:02
bauzasanyway17:02
sean-k-mooneydansmith: so do you think we should take a step back and spend more time looking at this 17:02
dansmithpersonally I do, yeah17:02
bauzasme too17:02
bauzasI feel we require a proper brainstorming about it17:02
sean-k-mooneythen that's fine we can defer to AA17:02
dansmithI'd like to understand more of what we can and can't do with help from neutron17:02
sean-k-mooneyand not rush this17:02
dansmithcodifying this in our API is just a hack, IMHO17:02
dansmithsounds good to me17:02
bauzasyup, sounds we need a bit of a design time17:03
sean-k-mooneyim partly worried that we made decisions in this space in the past that tie our hands but we may want to reevaluate those17:03
sean-k-mooneyso i think we should spend some time between now and ptg evaluating this again17:03
sean-k-mooneyincluding the previous decision17:03
bauzassean-k-mooney: tbh, one week ago, we were still reviewing some metadata API change IIRC17:03
sean-k-mooneybauzas: yes which we knew would not work for quite a while17:04
bauzaswhich, by reading the superseding spec, I understand why this approach is no longer possible17:04
sean-k-mooneywe can do the metadata change too17:04
sean-k-mooneybut it wont help the usecase17:04
sean-k-mooneyits just providing more info to the domain17:04
sean-k-mooney... vm17:05
sean-k-mooneywhich it will ignore17:05
bauzasbut yeah, sounds to me that domains are some information given by the network, not the user17:05
* sean-k-mooney hates i typoed domain instead of vm17:05
sean-k-mooneybauzas: i would disagree with that17:05
sean-k-mooneybut it really depend on the config17:05
sean-k-mooneyin general the domain is carried by the port/floating ip and a default domain can be added to the network17:05
sean-k-mooneyanyway we are over time17:06
dansmithI think bauzas meant the network infra,17:06
dansmithwhich is what I meant when I said it17:06
bauzaswell, unless I'm wrong, DNS is L5 17:06
sean-k-mooneyright17:06
dansmithnot necessarily "the network object in neutron, distinct from the port object"17:06
sean-k-mooneywhat is great is this is not integrated with routed networks properly either17:06
bauzasyeah, not the "neutron network"17:06
dansmithand I totally think it does come from network infra, at least in terms of plumbing17:06
bauzasI meant 'the network infrastructure"17:07
sean-k-mooneydansmith: its meant to be self service17:07
dansmithif you override it in your guest OS, that's fine, but we don't need to be involved in that level, IMHO17:07
sean-k-mooneyat least with designate the model is bring your own domain17:07
bauzasthis is some data Nova doesn't have to deal with17:07
dansmithsean-k-mooney: totally, but we should integrate with the services providing the network infra, even if you took your own domain to them17:07
sean-k-mooneyyou point your domain mx records to designate and then manage it as an enduser independent of the cloud admin17:07
dansmithyep, understand17:07
dansmithI'm not saying openstack shouldn't handle this, I'm saying I think nova is probably not the right place to set this per-instance17:08
bauzasin theory, you could even have your DNS servers totally uncorrelated from OpenStack services17:08
sean-k-mooneyyes.17:08
bauzasbut Nova shouldn't be managing it17:08
sean-k-mooneyanyway I'm not going to ask for a spec freeze exception for this17:08
bauzasthis could be some "metadata" information17:08
bauzasyeah and we need to end the meeting17:09
sean-k-mooneyI'm not sure I agree with "nova should not be managing this" but I understand why you have that viewpoint17:09
sean-k-mooneyso yes lets end the meeting17:09
bauzaslet's end the meeting for now17:09
bauzasand we'll continue17:09
bauzasthanks all17:10
bauzas#endmeeting17:10
opendevmeetMeeting ended Tue Jul 19 17:10:03 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)17:10
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2022/nova.2022-07-19-16.00.html17:10
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2022/nova.2022-07-19-16.00.txt17:10
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2022/nova.2022-07-19-16.00.log.html17:10
bauzassean-k-mooney: I meant this can be an instance metadata information17:10
bauzasbut I don't want it to be "primer" as officially defined in our instance API17:10
bauzastechnically, you can pass your domain information thru userdata too17:11
sean-k-mooneyperhaps we have alternatives that don't need encoding it in the instance api17:11
sean-k-mooneyyou can 17:11
bauzasif you want to have your domains managed by OpenStack services, then this is Designate17:12
sean-k-mooneyjust a note that while these neutron apis have existed for a long time they are relatively new to backends like ovn17:12
bauzasif you don't want them, then you can use the entryknobs we currently have17:12
sean-k-mooneybauzas: well right now this is a one way path17:12
sean-k-mooneyfrom nova to neutron to designate17:12
dansmithsean-k-mooney: right, but if the operator has chosen a network backend that doesn't support this, then I think it's reasonable to say that it's not supported to manage this for the user on those deployments17:12
sean-k-mooneyno info ever flows the other way 17:13
dansmithif they do, then you can, and if a popular backend doesn't have support for this, but should, then ... work should be done :)17:13
dansmithespecially for something as important as OVN17:13
sean-k-mooneyit only got this in yoga17:13
sean-k-mooneywith some support added in backport I think17:13
sean-k-mooneyat least the per-port dns info only got added in yoga17:14
bauzasI have to quit by now17:14
johnsomJust a note, the guest VM FQDN used for the hostname in the kernel is not related to the FQDN(s) configured for the host in DNS/BIND/Designate. Very different things17:14
sean-k-mooneyinternal dns support is in progress for zed17:14
sean-k-mooneyjohnsom: well by definition the hostname used by the kernel should not be an FQDN17:14
johnsomNot true17:14
sean-k-mooneyjohnsom: that's what started this mess17:14
johnsomBut, we have had that discussion.17:15
sean-k-mooneyfrom a nova perspective we never supported that17:15
bauzasyup17:15
bauzashostnames are host names17:15
dansmithjohnsom: we're talking about providing some way to say that the former should be set from the latter according to some rule, like "take the first one" or "this one is my primary"17:15
johnsomYeah, I'm not talking for Nova. Just the kernel UTC17:15
bauzashostnames aren't FQDNs in Nova17:15
dansmithinstead of nova taking a hostname that we set in the guest17:15
sean-k-mooneyjohnsom: from a kernel UTC perspective the validity of an FQDN as a hostname depends on what distro you ask17:16
* bauzas drops by now17:16
johnsomsean-k-mooney It's defined and managed in the kernel, so as long as it's running a Linux kernel it will be the same.17:17
sean-k-mooneyjohnsom: what is allowable and what is recommended are two different things17:17
* johnsom notes, sorry, typo, it's the UTS namespace. I have timezones on the brain today17:17
sean-k-mooneyhttps://www.freedesktop.org/software/systemd/man/hostname.html 17:17
sean-k-mooney""" The hostname should be composed of up to 64 7-bit ASCII lower-case alphanumeric characters or hyphens forming a valid DNS domain name. It is recommended that this name contains only a single label, i.e. without any dots. """17:18
johnsomOnce again, what systemd does/thinks is not what the kernel does.17:18
sean-k-mooneycorrect17:19
sean-k-mooneybut that is the same recommendation that nova has for how to construct hostnames17:19
johnsomAs I have mentioned before, even RHEL satellite expects a FQDN.17:19
sean-k-mooneyyep and we recommend against using fqdns in our downstream product17:19
sean-k-mooneythat recommendation was ignored and that's fine17:19
sean-k-mooneybut strictly speaking nova originally did not intend to support having two compute nodes with the same hostname but different fqdns17:20
johnsomReally, I think this is getting over thought a bit. It seems like we should just pass the FQDN in the metadata and let cloud-init figure out what settings need to go where. That way people don't have to change the hostname of the guest to sign up the instance with satellite, etc. 17:21
sean-k-mooneydansmith: johnsom  for what it's worth I think alternative 3 in my latest revision might be the best path forward17:21
sean-k-mooneyjohnsom: that is kind of my option 317:22
sean-k-mooneyjohnsom: https://review.opendev.org/c/openstack/nova-specs/+/850352/1/specs/zed/approved/configurable-instance-domains.rst#9617:22
johnsomI am just now reading through that.17:22
dansmithI don't understand that opinion *at all* :)17:22
sean-k-mooneyso we would truncate and set hostname to everything up to the first . and fqdn to the full thing17:22
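A minimal sketch of the split described in the message above, assuming the input is whatever name the user supplied. The function name `split_fqdn` is hypothetical and is not taken from nova or cloud-init; it only illustrates "hostname is everything up to the first dot, fqdn is the full thing".

```python
from typing import Tuple


def split_fqdn(name: str) -> Tuple[str, str]:
    """Return (hostname, fqdn) for a user-supplied name.

    hostname is the first label (text before the first dot);
    fqdn is the name exactly as given. Hypothetical helper.
    """
    hostname, _sep, _rest = name.partition(".")
    return hostname, name


if __name__ == "__main__":
    # A dotted name is truncated for hostname and kept whole for fqdn.
    print(split_fqdn("meta-test.example.org"))
    # A single label yields the same value for both fields.
    print(split_fqdn("meta-test"))
```

With a single-label name both fields are identical, which matches the `set-hostname` file shown later in the log where `fqdn` and `hostname` are both `meta-test`.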
sean-k-mooneydansmith: did you look at https://github.com/canonical/cloud-init/blob/91fd72c3f5b416b7815314eebea0b82ccd7e3f73/cloudinit/config/cc_set_hostname.py#L25-L32=17:23
johnsomhttps://cloudinit.readthedocs.io/en/latest/topics/modules.html#set-hostname17:24
johnsomThat too is a good reference17:24
sean-k-mooneyi think? its the same but html version17:24
dansmithcloud-init does that from the user's own instance metadata right?17:24
sean-k-mooneyya it is17:24
sean-k-mooneyhum17:25
sean-k-mooneyyou're thinking we can just set the fqdn on the instance metadata17:25
sean-k-mooneyand it will show up17:25
sean-k-mooneyand prefer_fqdn_over_hostname17:25
sean-k-mooneythat would be worth a try17:25
dansmithI dunno, I'm asking.. if so that seems like a much better deal,17:25
dansmithbasically the contract is between the user and cloud-init, with nova uninvolved17:25
sean-k-mooneywell it's reading those values from the instance metadata17:26
sean-k-mooneyim just not sure if the urls line up17:26
sean-k-mooneybut i can try that now17:26
sean-k-mooneyim not sure if the servers generic metadata is under a subkey17:27
dansmithright but is "instance metadata" the actual user's metadata blob, or the "metadata blob that nova generates, of which a sub-dict is the user's" ?17:27
sean-k-mooneyits not part of user data17:28
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/api/metadata/base.py#L161=17:28
sean-k-mooneyim just going to boot a vm and see what it looks like17:29
dansmithnot user data, user metadata17:29
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/api/metadata/base.py#L318=17:30
sean-k-mooneyit looks like it's in a subkey called meta17:30
dansmithright, metadata['meta'] = { .. the stuff I passed to the API as "metadata" .. } 17:32
dansmithcorrect?17:32
sean-k-mooneyya i think so17:32
dansmithand that's not where cloud-init is looking, correct?17:33
sean-k-mooneynot sure about that last bit i think its looking for FQDN as a sibling of meta17:33
sean-k-mooneybut not sure17:33
dansmithoh okay good17:33
sean-k-mooneyhostname is at the same level17:33
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/api/metadata/base.py#L349=17:34
sean-k-mooneyvm is booting now we will know shortly17:35
sean-k-mooneyubuntu@meta-test:~$ sudo cat /var/lib/cloud/data/set-hostname 17:40
sean-k-mooney{17:40
sean-k-mooney "fqdn": "meta-test",17:40
sean-k-mooney "hostname": "meta-test"17:40
sean-k-mooney}17:40
sean-k-mooneywhat's the special address again? I'll see what's in the api with curl17:40
sean-k-mooney169.254.169.254 perhaps17:41
sean-k-mooneyodd, I'm not seeing it where I expect to17:42
sean-k-mooneyhttps://termbin.com/n7bk17:44
sean-k-mooneyso its set on the instance 17:44
sean-k-mooneybut I don't see fqdn anywhere17:44
sean-k-mooneyisn't server metadata meant to be discoverable via the metadata api endpoint17:45
sean-k-mooneyim expecting to see it somewhere under curl 169.254.169.254/latest/meta-data17:45
*** dasm|ruck is now known as dasm|off17:48
sean-k-mooneyso that would be a no17:52
sean-k-mooneyunless I'm missing something I'm not seeing where we expose the userdata or metadata at that url17:53
dansmithsean-k-mooney: https://github.com/openstack/nova/blob/c53ec4e48884235566962bc934cbf292ad5b67b8/nova/api/metadata/base.py#L317-L31817:54
dansmithlaunch_metadata comes from utils.instance_meta() which should be the user's k=v metadata they provided17:55
sean-k-mooney yes and we register that function as a path handler here https://github.com/openstack/nova/blob/c53ec4e48884235566962bc934cbf292ad5b67b8/nova/api/metadata/base.py#L221=17:56
sean-k-mooneyso with "meta_data.json" as the route17:57
sean-k-mooneyill check that17:57
sean-k-mooneyso the instance metadata is not part of the ec2 info https://github.com/openstack/nova/blob/c53ec4e48884235566962bc934cbf292ad5b67b8/nova/api/metadata/base.py#L235-L303=18:00
sean-k-mooneyim not really sure what to say to be honest18:04
sean-k-mooneyok its there18:08
sean-k-mooneynot that i can find via curl but18:09
sean-k-mooneycloud-init query -a sees it18:09
sean-k-mooneyits in the meta section18:09
sean-k-mooneyhttps://paste.opendev.org/show/bfDEteTetI0rInj8NfBx/18:09
sean-k-mooneyline 57-6018:11
sean-k-mooneyoh its at   http://169.254.169.254/openstack/2012-08-10/meta_data.json18:15
sean-k-mooneycurl   http://169.254.169.254 does not list the openstack directory18:16
sean-k-mooneyso that makes sense without openstack im looking at the ec2 part18:16
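A hedged sketch of what the lookup above amounts to: the document served at `/openstack/<version>/meta_data.json` carries `hostname` at the top level and the user's key/value metadata under the `meta` subkey. The sample document and the helper name `user_metadata` are assumptions for illustration; a real response contains more keys than shown, and the `fqdn` entry here stands in for a key the user set themselves.

```python
import json

# Hypothetical, trimmed-down meta_data.json body. Only the two keys
# discussed in the log are shown; the values are illustrative.
SAMPLE_META_DATA = """
{
  "hostname": "meta-test",
  "meta": {"fqdn": "meta-test.example.org"}
}
"""


def user_metadata(meta_data_json: str) -> dict:
    """Return the user-supplied key/value metadata (the 'meta' subkey)."""
    doc = json.loads(meta_data_json)
    # Instances booted without user metadata may have no 'meta' key.
    return doc.get("meta", {})


if __name__ == "__main__":
    doc = json.loads(SAMPLE_META_DATA)
    print("top-level hostname:", doc["hostname"])
    print("user metadata:", user_metadata(SAMPLE_META_DATA))
```

The point of the sketch is the nesting: a user-set `fqdn` lives one level below `hostname`, so a consumer reading only top-level keys would not see it.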
sean-k-mooneydansmith: https://termbin.com/cxty that is the metadata we generated for that instance18:21
dansmiththat's what I would expect, yeah,18:21
dansmithso does cloud-init look at that meta['fqdn'] properly?18:21
sean-k-mooneyno18:22
sean-k-mooneyit could but it does not appear to18:22
sean-k-mooneyso the openstack data source could be updated18:22
dansmithcould or should?18:22
sean-k-mooneyto make keys under meta have precedence18:22
sean-k-mooneycould, I'm not sure if it should or not18:22
sean-k-mooneyor well we could implement that too: allow keys to be "un-namespaced", i.e. not embedded in the meta key18:23
sean-k-mooneyassuming the hostname field there has precedence over the ec2 version18:24
dansmithwe really *really* should avoid nova making any contract about the keys in user metadata18:24
sean-k-mooneywhich im not sure it does18:24
dansmithif it's something cloud-init looks at, then fine, but we should not be touching that stuff18:24
sean-k-mooneyack18:24
dansmithwell, still, it seems to me that hostname should be the short part and we should get domain from some primary network affiliation18:25
sean-k-mooneywell I believe they put fqdns in it sometimes in the examples18:25
dansmithso remind me again why the user can't just use an fqdn in the hostname field and we just look the other way?18:27
sean-k-mooneyit broke designate18:27
sean-k-mooneyor neutron18:28
sean-k-mooneybecause we passed that directly to them18:28
sean-k-mooneyif the hostname had a numeric tld18:28
sean-k-mooneyso ubuntu-20.0418:28
sean-k-mooneywould break neutron18:28
sean-k-mooneybecause we set the dns_name to that18:28
sean-k-mooneydansmith: https://cloudinit.readthedocs.io/en/latest/topics/instancedata.html#example-output was the example I was referring to18:32
dansmiththey do show hostname having an FQDN, even though it's a . internal one18:33
dansmithbut it also shows the public real hostnames being more associated with interfaces, which seems more right-er to me18:34
sean-k-mooneyyep "public_hostname": "ec2-3-89-187-177.compute-1.amazonaws.com",18:34
sean-k-mooneyif you look at v1 the hostname was truncated18:34
sean-k-mooneythen the top level "local_hostname": "ip-172-31-81-43", is still truncated18:35
sean-k-mooneybut hostname is now "hostname": "ip-172-31-81-43.ec2.internal",18:36
sean-k-mooneyand they have per network interface info18:36
sean-k-mooneydansmith: the other spec which we abandoned18:36
sean-k-mooneywas adding the per networking interface info18:36
sean-k-mooneysince that would allow us to pass all the info from neutron18:36
sean-k-mooneyand just get out of the way18:37
sean-k-mooneybut apparently cloud-init won't use that18:37
sean-k-mooneythis is all based on aws by the way18:37
sean-k-mooneythe example18:37
sean-k-mooneyand they have obviously changed their mind over time18:38
johnsomdansmith I am 100% on board with allowing fqdn in the hostname field and just pass it to cloud-init to deal with. That would solve the problem a customer was having. If there is an issue with neutron, that should be easily fixable.18:38
dansmithjohnsom: agree18:38
sean-k-mooneyjohnsom: that customer issue is why we are talking about this18:38
johnsomYeah, I guessed as much18:39
sean-k-mooneywe had the option of doing that in wallaby and agreed to add the display name sanitization18:39
sean-k-mooneywe also had a mailing list thread on this topic18:39
sean-k-mooneyso we could just allow fqdns again and strip or pass the domain when talking to neutron18:40
sean-k-mooneyour concern with that is we have shipped the normalisation for 3 releases now18:40
dansmithno18:40
sean-k-mooneyso we don't know who we will break18:40
dansmithwe should not parse the hostname and split out domains18:40
sean-k-mooneydansmith: so neutron should?18:41
dansmithwe should take that string, pass it to cloud-init and/or neutron, and fix whatever the problem is on the neutron side that didn't like it sometimes18:41
dansmithsean-k-mooney: neutron is the networking service18:41
johnsomdansmith +118:41
sean-k-mooneyright but we are currently setting the dns_name field in their api18:41
sean-k-mooneyto an fqdn when it's defined to take a hostname18:41
sean-k-mooneyhttps://github.com/openstack/nova/blob/50fdbc752a9ca9c31488140ef2997ed59d861a41/releasenotes/notes/instance-hostname-used-to-populate-ports-dns-name-08341ec73dc076c0.yaml18:42
dansmithI really think the right thing is for us to take a hostname and get our domain affiliation from neutron, but if we really really need to be able to take an FQDN via nova, we should be as hands-off about it as possible18:42
sean-k-mooneywell right now the hostname field can only be a hostname if passed directly18:43
johnsomIt is common case that the FQDN in the guest does not match the FQDN on the port in neutron. One is an internal view, the other external.18:43
sean-k-mooneyif it's not passed we generate it from the display name18:43
sean-k-mooneyjohnsom: sure but the customer in question needs it to be resolvable18:44
sean-k-mooneyso whatever gets set needs to actually resolve in neutron's dns18:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Rename [pci]passthrough_whitelist to device_spec  https://review.opendev.org/c/openstack/nova/+/84383418:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Rename exception.PciConfigInvalidWhitelist to PciConfigInvalidSpec  https://review.opendev.org/c/openstack/nova/+/84386118:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Rename whitelist in tests  https://review.opendev.org/c/openstack/nova/+/84386218:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Basics for PCI Placement reporting  https://review.opendev.org/c/openstack/nova/+/84618718:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Extend device_spec with resource_class and traits  https://review.opendev.org/c/openstack/nova/+/84621818:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Reject PCI dependent device config  https://review.opendev.org/c/openstack/nova/+/84643518:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Reject mixed VF rc and trait config  https://review.opendev.org/c/openstack/nova/+/84643618:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Ignore PCI devs with physical_network tag  https://review.opendev.org/c/openstack/nova/+/84621918:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Reject devname based device_spec config  https://review.opendev.org/c/openstack/nova/+/84646618:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Support [pci]device_spec reconfiguration  https://review.opendev.org/c/openstack/nova/+/84647018:44
opendevreviewBalazs Gibizer proposed openstack/nova master: Stop if tracking is disable after it was enabled before  https://review.opendev.org/c/openstack/nova/+/84700918:44
dansmithjohnsom: I think that's a broken way to think about it in a managed environment, which is why I think you should choose one port to be your primary interface and we get our domain affiliation from that18:44
sean-k-mooneydansmith: perhaps but in general ports and networks don't have domains18:45
sean-k-mooneythey do in the customer's case18:45
sean-k-mooneythey define them on the network18:45
dansmithbut I understand there's practicalities about where we are at the moment, so I'd rather just pass the hostname to the other services and let them handle what to do if it looks like an fqdn18:45
sean-k-mooneyand they are expecting the domain to propagate down to all vms on that network18:45
dansmithsean-k-mooney: again, I'm not talking about absolutes about how neutron works today, I'm saying what I think about the way it *should* work18:46
sean-k-mooneydansmith: to make that work both neutron and designate would need to be able to handle that18:46
dansmithsean-k-mooney: I understand18:47
johnsomdansmith Think about the case where the domain is cloud generated based on the project ID, etc.  The in-guest name doesn't necessarily need to be resolvable from outside the guest. It can be, but it's not necessary.18:47
sean-k-mooneyjohnsom: in their particular case it needs to be but in general it may not18:48
sean-k-mooneyhttps://github.com/openstack/nova/blob/93a65f06df67ce39d65827692150c78013c7f6d5/nova/network/neutron.py#L1737=18:48
sean-k-mooneythis is where we set the dns_name on the port by the way18:48
sean-k-mooneyif that just truncated the hostname neutron and designate would work18:48
sean-k-mooneywe are sanitising the hostname to work around that bad request from neutron today18:49
johnsomYep, just split on the first label18:49
sean-k-mooneyso we could revert the normalisation we do right now when setting instance.hostname and truncate there, and all use cases that worked before wallaby would work again. however the downside of that is if your vm is called ubuntu-20.04, in neutron it will have ubuntu-20 set as the domain name18:52
dansmithjohnsom: I know, in the most generic cloud case it would also be something other than your own domain. It just seems like we've got all this overly-complex plumbing of networks now, so we should be able to actually manage network things via the network :)18:52
sean-k-mooneyand to make that worse it breaks multi create in that case18:52
dansmithsean-k-mooney: I really don't think we should be normalizing or truncating or splitting the hostname in nova18:52
dansmithperhaps checking that it's a valid hostname (*maybe*)18:52
dansmithbut let the other services deal with it18:52
sean-k-mooneydansmith: well we have been since before essex18:53
sean-k-mooneythere was existing code that removed unicode18:53
dansmithsean-k-mooney: and look at where we're at :)18:53
sean-k-mooneyand some other special characters18:53
sean-k-mooneyso just saying if we revert this it will still exist18:53
sean-k-mooneydansmith: I'm not disagreeing that it's undesirable18:54
* johnsom grumbles that unicode is valid in the kernel hostname UTS18:54
dansmithagain, checking for sanity is not such a big deal, but I'd think we'd want to reject the instance boot, not just sanitize-and-go18:54
sean-k-mooneydansmith: that was also something we discussed18:54
sean-k-mooneypsi was unhappy with that proposal18:54
sean-k-mooneybut I think that was our first response18:54
sean-k-mooney"this is invalid sorry neutron told us so"18:55
sean-k-mooneyI don't recall all the details but with queens and without designate vms with numeric TLDs booted 18:56
sean-k-mooneyand with train and designate they did not18:56
sean-k-mooneynova has been setting the dns_name from instance.hostname since mitaka 18:57
dansmithwe have friends that work on designate right? :)18:57
sean-k-mooneyso that was either a change in neutron or caused by adding designate18:58
johnsomgrin18:58
johnsomYou do....18:58
sean-k-mooneyplural?18:58
dansmithsean-k-mooney: johnsom is worth at least two18:58
sean-k-mooney:)18:58
johnsomlol18:58
dansmithjohnsom: that was a comment of your worth, not your waistline, btw ;P18:58
sean-k-mooneyI just thought designate was one of the more understaffed projects18:58
sean-k-mooneyI think this validation change was in neutron to be honest18:59
sean-k-mooneybetween train and queens18:59
johnsomThere are two full time RH folks, and a couple more cores active.18:59
sean-k-mooneyoh ok glad that has improved18:59
johnsomBut, as dansmith said, the Designate buck stops with me at the moment, so if we need to fix something on the designate side, assign the bug to me.19:00
sean-k-mooneyhttps://github.com/openstack/neutron-lib/blob/f01b2e9025d33aeff3bf22ea2568bda036878819/neutron_lib/api/validators/dns.py#L59-L92=19:02
sean-k-mooneyso that apparently is what does the validation in neutron19:02
sean-k-mooneywell it starts here https://github.com/openstack/neutron-lib/blob/f01b2e9025d33aeff3bf22ea2568bda036878819/neutron_lib/api/validators/dns.py#L112=19:03
dansmithI just had to: https://imgur.com/a/spREDAg19:03
johnsomlol19:04
johnsomAt least it's a $10019:04
dansmithinflation, yo19:04
sean-k-mooneyhttps://github.com/openstack/neutron-lib/blob/f01b2e9025d33aeff3bf22ea2568bda036878819/neutron_lib/api/validators/dns.py#L50-L52=19:05
sean-k-mooneythat is what was rejecting the numeric tlds19:05
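To make the failure mode concrete, here is a small sketch of a numeric-TLD check. It is not the neutron-lib implementation linked above; the function name `has_numeric_tld` and the choice to only consider dotted names are assumptions made for illustration.

```python
def has_numeric_tld(name: str) -> bool:
    """Return True if a dotted name ends in an all-digit label.

    Hypothetical illustration only: single-label names are treated
    as having no TLD, and a trailing root dot is ignored.
    """
    labels = name.rstrip(".").split(".")
    return len(labels) > 1 and labels[-1].isdigit()


if __name__ == "__main__":
    # "ubuntu-20.04" parses as label "ubuntu-20" plus TLD "04".
    for candidate in ("ubuntu-20.04", "ubuntu-20", "vm1.example.org"):
        print(candidate, has_numeric_tld(candidate))
```

Under this check a display name such as `ubuntu-20.04`, passed through unchanged as `dns_name`, is flagged, while `ubuntu-20` and `vm1.example.org` are not.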
sean-k-mooneyand that has been in place since pike19:06
sean-k-mooneyso the psi issue was caused by turning on the dns extension19:06
sean-k-mooneyit was there in mitaka too, which is when we started setting that field https://github.com/openstack/neutron/blob/4d8685da8050df79d9193f91cab572cfc6d67a47/neutron/extensions/dns.py#L130-L133=19:09
sean-k-mooneydansmith: so we could go back to not normalising or do what we do downstream19:10
dansmithsounds like we need to collab with the neutron buck-stop19:10
sean-k-mooneydownstream if the tld is numeric we normalise to remove .19:11
sean-k-mooneybut otherwise we allow the fqdn19:11
sean-k-mooneyin hostname19:11
sean-k-mooneyso downstream its targeted to just making that one edgecase pass since the change was never backported upstream19:12
sean-k-mooneyupstream from wallaby on we replace all '.' with _ or -, I can't recall19:12
johnsomThe TLD rules are pretty simple, I think it is perfectly acceptable to error to the user.  There are two length limitations and the basic regex neutron has. I assume the neutron raise comes too late for nova to communicate that to the user?19:16
sean-k-mooneyyes it happens on the compute node when we are binding the ports19:16
dansmithnot too late to communicate, just too late to reject the request19:17
dansmithwe have lots of reasons why the instance goes into error state based on lies you told us earlier19:17
sean-k-mooneywithout designate this would also work19:17
sean-k-mooneybecause nova wont try to set the field19:18
sean-k-mooneysince the extension is not enabled19:18
dansmithso if dns is enabled and you gave us something invalid, then failing to wire up would be a fine reason to error the instance19:18
sean-k-mooneyok because that is what we used to do19:18
johnsomYep19:18
sean-k-mooneybefore it was reported as a bug and "fixed"19:18
sean-k-mooneyin wallaby19:18
dansmith"hacked"19:19
dansmith"swept into the future debt dustbin to screw someone else later"19:19
dansmithbut yeah :)19:19
sean-k-mooneywell we filed an rfe to add hostname as a separate top level parameter and did a lot of other work19:19
sean-k-mooneybut ya I'm not happy with the situation we are in currently19:20
sean-k-mooneygoing back to that old behavior will break some users but fix others19:20
sean-k-mooneydepending on if designate is available or not19:20
sean-k-mooneywhich yes is technically detectable via the neutron api19:21
sean-k-mooneyby checking the extensions as that is one of the few that is only reported when enabled I believe19:21
sean-k-mooneyneutron has a habit of reporting all extensions even if they are not enabled, making it impossible to determine that19:21
johnsomSo, what I am hearing is a proposal: hostname field, remove the FQDN restriction, hand it off to cloud-init single label or FQDN, no hacking on the string (i.e. no . -> -). Pass the string through to neutron. If the domain doesn't match or is rejected, ERROR the instance with "invalid hostname" in the error field.19:22
johnsomJust trying to summarize for clarity19:22
dansmiththat's what I'm saying yeah19:23
sean-k-mooneythat would regress a fixed bug19:23
dansmiththere might be some opinions about whether or not we need to hide that behind a microversion or not I guess19:23
sean-k-mooneyand break people on upgrade19:23
johnsomThat works for me and would solve the customer issue19:23
sean-k-mooneyincluding breaking psi19:23
sean-k-mooneybut if we have a way to help them fix all invalid hostnames we could19:23
dansmithsean-k-mooney: it doesn't break them if the neutron thing is fixed right?19:24
johnsomWhy would it break on upgrade? you are going from more restrictive to less19:24
sean-k-mooneyit won't break existing vms19:24
sean-k-mooneythat we have already normalised19:24
johnsomRight19:24
sean-k-mooneybut it will break anyone that started depending on that19:24
sean-k-mooneyso customers with existing heat templates19:24
sean-k-mooneywould find it breaks 19:25
dansmithdepending on what specifically? the mangled hostname?19:25
sean-k-mooneyyes19:25
johnsomThey would start getting ERROR instances if the hostname is bogus instead of having the hostname switched around on them. Which seems like the right answer to me. APIs that magically change the data input  to something else are ... unpleasant19:26
sean-k-mooneyjohnsom: I'm pretty sure you reviewed this in the past by the way, if not apologies, but the mangling was discussed at least on the mailing list19:27
sean-k-mooneyand it was chosen to go with that approach since we already did it for unicode and we were following the rfc for mangling rules19:28
sean-k-mooneyand we explicitly asked operators if they were depending on the fqdns in the hostname at the time19:28
dansmithjohnsom: agree, and unless we let people opt into the old behavior with a microversion, we're already changing that behavior underneath them19:28
sean-k-mooneythat is an option19:29
dansmithno,19:29
dansmithI mean with the previous change19:29
sean-k-mooneydisable the mangling in a new microversion19:29
dansmithgoing from more mangling to less mangling is less disruptive, I'm sure19:29
sean-k-mooneyit will go from 200 to 40019:29
dansmithno, because we won't know until too late, right?19:30
dansmithbut as I said above, there's a discussion to be had on the microversion requirement19:30
sean-k-mooneyactually right, it will not change the response and will go from active to error19:30
dansmith...right19:30
sean-k-mooneyhttps://github.com/openstack/nova/commit/9046f0fff4be424eda25401a3f9b8752964de775 that was the change we did 2 years ago to address https://bugs.launchpad.net/nova/+bug/158197719:34
sean-k-mooneyhttps://lists.openstack.org/pipermail/openstack-discuss/2020-November/019113.html was the mailing list thread 19:35
dansmithsean-k-mooney: that's because we set hostname from display name if hostname isn't set specifically right?19:37
sean-k-mooneyyes before xena that was the only way to set hostname19:38
sean-k-mooneyit was an internal attribute on the instance19:38
dansmithah19:38
sean-k-mooneywe added an api to set that in response to the issues raised with ^19:38
dansmithokay so, this is even less of a problem I think.. if they specify the hostname, we can just take it as-is and if not, then we keep the existing display->filter->hostname behavior right?19:38
sean-k-mooneyhttps://specs.openstack.org/openstack/nova-specs/specs/xena/implemented/configurable-instance-hostnames.html19:39
sean-k-mooneywell we do not allow FQDNs via that api19:39
sean-k-mooneye.g. if you pass --hostname it must not be an fqdn19:39
dansmithack, so that requires a microversion then19:39
sean-k-mooneyto allow it to be an fqdn yes19:40
sean-k-mooneyif we want to allow that19:40
dansmithall sounds fine to me, and isn't going to break anyone19:40
sean-k-mooneywell that is what the --domain spec was trying to do19:40
johnsomdansmith Yeah, that was what  I was thinking. The display name is always going to be a mess. But the hostname field is pretty clear it can't be crazy and should be good for pass through19:40
sean-k-mooneyadd a domain field to not continue to overload hostname19:40
sean-k-mooneysince hostname is used for dns_name and that can't be an fqdn today19:40
dansmithjohnsom: right and since we have the displayname->hostname(ifunset) part, we can keep the mangling if you don't otherwise set it to something... but if you do, it better be right19:41
sean-k-mooneywell ok it kind of can19:41
dansmith"it" being hostname in my statement above19:41
sean-k-mooneydansmith: so the proposal is to just relax instance.hostname if you pass it explicitly19:41
sean-k-mooneythat might need a db migration19:41
sean-k-mooneyI would have to check the column size19:42
johnsomNope, I looked, the field is fine in the DB19:42
dansmithsean-k-mooney: sounds like that's all that needs to happen?19:42
johnsomIt's 255 already19:42
sean-k-mooneyoh ok19:42
sean-k-mooneywe are limiting it to 6319:42
sean-k-mooneyin the api19:42
sean-k-mooneyso we would just need to drop the extra validation on hostname when it's passed explicitly19:42
sean-k-mooneyand live with the fact it can be an fqdn19:43
dansmithfor the new microversion yeah19:43
sean-k-mooneyand document that it will be passed as is to neutron 19:43
dansmithI don't love it, but I like it a lot better than us taking multiple things and constructing FQDNs19:43
sean-k-mooneyya we discussed having --fqdn when --hostname was added19:43
sean-k-mooneybut we did not want to support fqdns19:43
sean-k-mooneybut i guess that is a less invasive change19:44
dansmithokay now hold up,19:44
dansmithwhat about the multi-create case?19:44
sean-k-mooneywe will suffix the fqdn19:44
dansmiththat would require the parsing of the hostname in nova to insert the index19:44
sean-k-mooneywell or that19:44
sean-k-mooneyor make the suffix a prefix19:45
dansmithsuffixing the fqdn won't yield anything valid, so no point in doing that19:45
dansmithyeah 1-$hostname would work I think19:45
sean-k-mooneywe could change that in the microversion19:45
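The prefix idea can be sketched as below. This is only an illustration of the "1-$hostname" suggestion, not nova's multi-create code; the function name, the 1-based index, and leaving a single instance unprefixed are all assumptions.

```python
from typing import List


def multi_create_hostnames(hostname: str, count: int) -> List[str]:
    """Build per-instance names by prefixing an index to the given name.

    Prefixing avoids parsing the string, so it behaves the same for a
    single label and for an FQDN. Hypothetical helper.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        # Assumption: a single instance keeps the name as given.
        return [hostname]
    return [f"{index}-{hostname}" for index in range(1, count + 1)]


if __name__ == "__main__":
    print(multi_create_hostnames("web.example.org", 3))
```

Appending the index instead would turn `web.example.org` into names like `web.example.org-1`, which changes the last label rather than the host part; that is why suffixing an FQDN was dismissed above.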
sean-k-mooneyI assume we can't just decide to not support multi create19:46
dansmithwell, I was going to say,19:46
sean-k-mooneyit would make life easier in many ways :)19:46
dansmithI wonder how useful hostname really is in the multi case19:46
sean-k-mooneyso in artom's spec he handled this by saying we would continue to suffix the hostname part as we do today19:46
sean-k-mooneybut the domain value would be appended and the same for all instances19:47
sean-k-mooneywhich when you have it as two parts makes sense19:47
sean-k-mooneybut if we want to avoid parsing then ya prefix or declare not supported19:47
dansmithor just set them all to what they gave us19:48
sean-k-mooneyor that19:48
sean-k-mooneyok its getting late here 19:48
sean-k-mooneyif we want to explore this this cycle i can update the spec19:48
dansmith...and I'm sick of obsessing over hostnames :)19:48
sean-k-mooneyand formally ask for a spec freeze on the mailing list19:49
dansmithI can't imagine we're going to get to agreement on all this in time19:49
sean-k-mooneythats fine i suspect the same19:49
sean-k-mooneyim also kind of burnt out on this but i can  prepare a draft or at least link to this in the spec19:50
