Tuesday, 2024-02-27

noonedeadpunknot sure what ceph_config08:15
noonedeadpunkmornings08:18
noonedeadpunksorry, had to take a day off yesterday08:18
noonedeadpunkjrosser: no idea what ceph_config is doing though? I guess injecting conf into ceph mons directly?08:19
noonedeadpunkaka what cephadm does?08:19
noonedeadpunkthat would kinda make sense then I guess...08:19
jrosserI think so yes08:28
jrosserthough as far as I can see only from the [global] section perhaps08:28
jrossernoonedeadpunk: also what is this?! https://github.com/NVIDIA/open-gpu-kernel-modules/commit/476bd34534a9389eedff73464d3f2fa5912f09ae09:11
noonedeadpunko_O09:12
noonedeadpunkso this shouldn't be from application hub anymore and finally part of opensource drivers? 09:13
noonedeadpunkas actually, all nvidia drivers, except ones for vgpu, were opensourced for a while now09:14
noonedeadpunkso if you'd use mig or pci-passthrough without vgpu - they could be used.09:14
noonedeadpunkso that's kinda sweet. 09:14
jrosseridk where README.vgpu is though09:15
noonedeadpunkhm... they've also updated application hub? as my account doesn't work anymore.....09:16
noonedeadpunkit suggests creating an account now o_)09:17
jrosserthere is also this https://github.com/NVIDIA/vgpu-device-manager09:17
jrosseroh well nvidia have some strange definition of SSO where you actually have multiple accounts and they keep merging/changing the auth backend09:18
jrosserits like multiple-single-sign-on09:18
noonedeadpunkbut my email that used to work now suggests creating a new account... anyway09:19
noonedeadpunkvgpu-device-manager is also super interesting actually09:19
jrosserwhilst it says k8s thats not really what the readme describes09:19
jrosserlooks totally usable on normal hosts too09:19
noonedeadpunkyup, it is. I guess it still does echo to /sys though. but in a way more usable fashion kinda09:20
noonedeadpunkI wonder if they do have packaging for it....09:21
jrosserwe are just doing planning for removing vCS licences, which is a total mess09:21
noonedeadpunkYeah... 09:22
jrosserso if the open driver can do similar then that is very very interesting09:22
noonedeadpunkbtw, do you know where they've moved doc on slicing vgpu for enterprise ai?09:22
noonedeadpunkas they've removed vCS from vGPU page completely09:22
noonedeadpunk(which is fair)09:22
noonedeadpunkbut didn't add this enterprise ai 09:23
noonedeadpunk(which is not)09:23
noonedeadpunkMeaning this: https://docs.nvidia.com/grid/16.0/grid-vgpu-user-guide/index.html#virtual-gpu-types-grid-reference09:24
jrosserlike this? https://docs.nvidia.com/ai-enterprise/latest/user-guide/index.html#supported-gpus-grid-vgpu09:25
jrosserthey even call it grid :)09:26
jrosserso - what do you think about this https://review.opendev.org/c/openstack/openstack-ansible/+/91022009:28
jrosser^ easy fix09:28
jrosseror difficult fix is to make slurp jobs handle branch names that will randomly change from stable/* to unmaintained/*09:28
jrosserbecause currently we are totally broken on 2023.109:29
ThiagoCMCSo, a bit about this Ceph situation. I am unable to use `ceph-ansible` `stable-8.0` in an isolated (Ceph-only) lab. I swear it worked days ago, not anymore.   :-/09:57
jrosserThiagoCMC: they are continuously changing the code on that branch, i am not surprised at all10:02
jrosserlike i said yesterday i think that you will be able to deploy Reef using stable-7.010:02
jrosseryou just need to override the variable that defines the version10:02
ThiagoCMCCool, I haven't tried that. Thanks for reminding me! I'll give it a try.10:03
jrosserif it doesnt work, then the problem is likely to be small compared to stable-8.0 imho10:03
ThiagoCMCTrue lol10:04
jrosserdamiandabrowski: andrew is away this week so if you are able to look at reviews it would be helpful12:04
damiandabrowskiokay! I'll have a look during the evening12:08
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_nova stable/2023.1: Evaluate my_ip address once  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/90869912:19
jrossernoonedeadpunk: at some point it would be good to look at the magnum stuff again12:48
jrosseri am really unsure about what to do with tempest and the resources creation as eventually it seems to always end up "refactor even more old stuff" and it's kind of never ending12:48
jrosserfor example, i could make a new tidier tempest role somewhere in plugins collection or something, but that feels like taking on yet another large project12:50
noonedeadpunk#startmeeting openstack_ansible_meeting15:01
opendevmeetMeeting started Tue Feb 27 15:01:02 2024 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.15:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:01
opendevmeetThe meeting name has been set to 'openstack_ansible_meeting'15:01
noonedeadpunk#topic rollcall15:01
noonedeadpunko/15:01
damiandabrowskihi!15:01
jrossero/ hello15:01
noonedeadpunk#topic office hours15:03
noonedeadpunkso, it feels it's really high time for new point releases15:03
noonedeadpunkthough I saw some "blockers" which would be nice to handle first15:03
noonedeadpunkseems https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/909868 was quite important, for instance15:04
jrosser2023.1 is totally blocked i think15:04
noonedeadpunkyep, by Yoga upgrade15:05
noonedeadpunkSo we need to land Yoga upgrade disablement first: https://review.opendev.org/c/openstack/openstack-ansible/+/91022015:05
jrosseri looked at how to handle stable|unmaintained but that was just /o\ complicated15:05
noonedeadpunkYeah, I also failed to get us access to unmaintained.15:06
noonedeadpunkAnd frankly - this branch removal/adding is quite confusing...15:06
NeilHanlono/ sorry i'm late15:09
noonedeadpunkI also didn't check either the docs for ops repo or the octavia and ovn scenario in AIO15:10
jrosseri need some direction on the magnum patches15:11
jrosserwell not so much magnum, but the fixing * else that seems to be also involved :(15:11
jrosserspecifically tempest resource creation, it's just gigantic mess now15:11
noonedeadpunkyup15:12
noonedeadpunkI know...15:12
jrosseri think that i can make time this week to just strip everything to do with resource creation out of os_tempest15:13
jrosserand port it to openstack_resources15:13
jrosserbut we should decide if that is a good idea or not15:13
noonedeadpunkthat is very good question15:14
noonedeadpunkas problematic part - that plenty of logic and weirdness lies in tempest role itself15:14
jrosseri am wondering if that is just historical accumulation15:14
noonedeadpunkand I guess end-goal of all that would be to just skip tempest, but do have some resources?15:14
jrosseryes thats right15:15
noonedeadpunkAnd basically only public network is needed iirc15:15
jrosserbut you can't do that just now without making the logic in tempest role even more complicated15:15
jrosserultimately there is actually not much needed in tempest.conf15:16
jrosserflavor / image id * 2, network id15:16
jrossermaybe one more15:16
jrosserso i was thinking to make it possible to pass in name -> os_tempest looks up the id15:17
jrosseror pass the id directly15:17
jrosserand move all the creation stuff out of the role completely15:17
jrosseras even if we use openstack_resources that doesnt return the id really to re-use later15:18
noonedeadpunkbut why you still try to install it at all instead of just disabling it as a whole and including openstack_resources just here https://review.opendev.org/c/openstack/openstack-ansible-ops/+/906363/14/mcapi_vexxhost/playbooks/install_and_test.yml#14 ?15:18
noonedeadpunkyeah, output of openstack_resources result is actually a good topic on its own15:18
noonedeadpunkand if that should be covered15:18
noonedeadpunkmaybe registering results or output to some local facts might be useful...15:19
jrosserwell maybe you are right and i was trying too hard to make a general solution15:19
noonedeadpunkI mean - doing general solution is perfect scenario15:19
noonedeadpunkBut given amount of overhead...15:20
noonedeadpunkMaybe it should not be a blocker and we just need to iterate over things15:20
jrosseryes tbh this is a better way to look at it15:20
noonedeadpunkI still think we should do smth with tempest.15:20
jrosserseems everyone is busy++ so need to take a tractable path15:21
noonedeadpunkbut this should not really block capi from my perspective. Or at least if there's a way to unblock - better do that15:21
noonedeadpunkYes, until end of March I'm really just /o\15:21
noonedeadpunkSo is damiandabrowski15:21
noonedeadpunkI do hope to be able to catch-up though once thing we're working on is done.15:22
noonedeadpunkAlso, I guess it's time to start populating PTG etherpad....15:24
noonedeadpunkLet it be the link15:24
noonedeadpunk#link https://etherpad.opendev.org/p/osa-dalmatian-ptg15:24
NeilHanlon🥳15:24
noonedeadpunkand I'm adding ceph-ansible right away.15:25
NeilHanlonyes.15:25
noonedeadpunkWill populate it with leftovers from Caracal PTG as well15:26
noonedeadpunkbut also - we probably should pick up a timeframe for the PTG15:26
noonedeadpunkWe can do "as usual" Tuesday - 14 - 17 UTC?15:27
noonedeadpunkor 15 - 1815:28
noonedeadpunkor should I make some kind of poll to vote on it?15:28
jrosserwhat actual date is this?15:29
noonedeadpunkgood question15:30
noonedeadpunkApril 915:30
NeilHanlon April 8-12, 202415:30
NeilHanlonyep, so the 9th15:30
NeilHanloni'm flexible, but will be traveling to Texas for a conference on 4/1115:31
jrosserhmm that is during school holidays for me so 50/50 at best for the whole week15:33
noonedeadpunkouch15:33
noonedeadpunkthat's definitely a bad timing for PTG then...15:33
noonedeadpunkbut eventually, looking at scope for Caracal, it slightly feels that not much will be delivered out of it15:34
noonedeadpunklike - incus for sure won't be done15:34
jrossertbh i think this is a large job15:34
noonedeadpunkyeah...15:34
jrosserand requires some pretty good thinking, as it is an opportunity to modernise things rather than just drop-in replacement15:35
noonedeadpunkI close to never used LXD at scale, so hard to judge what best practice would be15:35
jrosseri think that personally i can only commit to smaller things than that for maybe the next cycle or two15:36
noonedeadpunkBut also I guess it should be not drop-in but indeed smth modern which can be done as an option to old legacy15:36
jrossermy hunch is that we can collapse many many ansible tasks into native things in LXD/incus15:36
NeilHanlonI think incus is reasonable for next cycle, fwiw (on the Fedora/EL side)15:37
noonedeadpunkwell, will see about time/prios for that15:44
noonedeadpunkas that totally would be very-very appealing to have and quite a logical evolution of what we have today15:44
noonedeadpunkwith LXC15:44
jrosserare there any bugs to look at?15:44
ThiagoCMCI have experience with LXD, I am currently running part of my OSA (Compute, Network, and OSDs) on top of LXD Containers. I want to help!15:44
jrosseri had a report from hamburgler3 yesterday which i have just put into launchpad15:45
noonedeadpunkwell, I mean, we have also an etherpad from bug triage day that needs to be looked at15:45
noonedeadpunk#link https://bugs.launchpad.net/openstack-ansible/+bug/205517815:46
noonedeadpunkok, I had very similar lately15:46
noonedeadpunkI didn't get to the point of finding out wtf is going on15:46
noonedeadpunkeventually, /var/lib/haproxy/dev/log is a "chroot"15:47
noonedeadpunkAnd actually... not being idempotent might be the root cause15:48
noonedeadpunkso that is potentially good catch15:48
jrossermy thoughts were why we needed to do any of this15:48
jrosseras i would expect the distro packages to do the necessary stuff when haproxy is installed15:49
noonedeadpunkwell... there's your note there....15:49
jrosserwell indeed, but it has been a while and that might no longer be true15:49
noonedeadpunkYep, we had this exact issue being reproduced, so I for sure can look there with some priority15:50
jrossereven needing to make the bind mount surprises me, as haproxy does this chroot thing as part of its own functionality15:50
jrosserbut i kind of feel i miss something important here15:51
noonedeadpunkyep, true, I did just rmdir and it was created with proper permissions on restart15:51
noonedeadpunkand well, after systemd-journald restart as well15:51
noonedeadpunkbut again - that was all on ubuntu15:52
noonedeadpunkworth trying dropping all that for sure15:53
jrossermaybe it is as simple as boot a centos / ubuntu vm and check that haproxy can log to the journal out of the box15:53
jrosserif so we can delete all of this15:53
noonedeadpunk++15:54
noonedeadpunkbtw, we've also tested and slightly adapted andrew's patch to keystone: https://review.opendev.org/c/openstack/keystone/+/91033715:55
noonedeadpunkso if you can check if it works for you still - would be great :)15:55
noonedeadpunkbut yes15:55
jrosseroh i did see that yes15:57
jrosserwe can look at that maybe next week15:57
noonedeadpunk#endmeeting15:59
opendevmeetMeeting ended Tue Feb 27 15:59:08 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:59
opendevmeetMinutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-02-27-15.01.html15:59
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-02-27-15.01.txt15:59
opendevmeetLog:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-02-27-15.01.log.html15:59
noonedeadpunkah, snap15:59
noonedeadpunkI fully forgot about 1 big thing....15:59
noonedeadpunkas part of ovn-bgp-agent, deployment of FRRouting is needed15:59
noonedeadpunkWith that I'm intending to move ansible-role-frrouting from vexxhost namespace under osa governance16:00
NeilHanlon👍 makes sense16:00
jrossersure, looks like nice new capability16:01
noonedeadpunkAnd I'm practising there with Molecule right now. Hopefully this will trigger me to add coverage to other "standalone" roles in a good way16:01
jrosseri also did some zuul error cleanup16:02
noonedeadpunkah, yes, thanks a lot for that!16:02
noonedeadpunkI guess Zed is slightly broken now from what I saw...16:02
jrosserthese i think are maybe only useful if someone is amenable to force-merge them on older branches16:02
noonedeadpunkactually ovn-bgp-agent has soooo many ways to be deployed/configured....16:06
noonedeadpunkand then quite some changes in logic might be needed. Like one way requires a standalone local ovn cluster on top16:07
noonedeadpunkanother just absent connectivity of provider network to ovs (don't add port to bridge)16:08
jrosserNeilHanlon: if you are around could you take a look at https://review.opendev.org/c/openstack/openstack-ansible/+/91022016:23
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-ops master: Add hook playbook install and test magnum capi driver  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/90636316:36
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver  https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/90519916:37
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Bump ansible version to 2.15.9  https://review.opendev.org/c/openstack/openstack-ansible/+/90561916:43
spatelFolks, do you know how to not allow end user to release/disassociate floating IP in horizon or commandline? 17:02
spatelcurrently my customer has permission to remove or disassociate floating IP that is painful sometime. I want to stop this behavior 17:03
noonedeadpunkpolicy?17:03
spatelreading policy file and found - delete_floatingips_tags17:07
noonedeadpunkok, just in case - this bind mount is needed for centos at least...18:36
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-haproxy_server master: Use correct permissions for haproxy log mount  https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/91038418:46
noonedeadpunkactually this seems fixing it ^18:46
noonedeadpunkhamburgler3: pinging you so you could check as well :)18:46
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Update upstream SHAs  https://review.opendev.org/c/openstack/openstack-ansible/+/91038618:52
hamburglernoonedeadpunk: haproxy fix looks good :)19:01
noonedeadpunkawesome19:01
spatelFolks, Did you try to use NFS based cinder-volume ?20:28
spatelDo I need to mount NFS directory to all the controller + compute nodes ?20:29
opendevreviewMerged openstack/openstack-ansible-plugins master: Do not log contents of installed keypairs by default  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/90883823:07
opendevreviewMerged openstack/openstack-ansible-os_tempest master: Switch default external network name to 'physnet1'  https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/90876823:26
opendevreviewMerged openstack/ansible-role-uwsgi stable/2023.2: Remove undefined bionic linters job  https://review.opendev.org/c/openstack/ansible-role-uwsgi/+/91019123:41
