Tuesday, 2023-01-17

moha7Hi06:24
noonedeadpunko/08:52
jrossermorning09:06
* noonedeadpunk trying to check https://bugs.launchpad.net/openstack-ansible/+bug/200289709:12
noonedeadpunkI'm not sure if it was discussed yesterday though09:12
jrosserno it wasnt09:14
jrosseri wonder what version that was09:14
jrosseralso very interesting ML thread about vgpu09:15
noonedeadpunkthat basically stopped me from proposing bump for Z as sounds quite critical09:15
noonedeadpunkoh yes09:15
noonedeadpunkI was o_O that you can have different vgpu types on one card09:15
jrosserit is very interesting to see possibility that cyborg handles MIG09:15
noonedeadpunkAs that was not possible like 6 month ago09:15
jrosseryeah, thats mig + sriov right though?09:15
jrosseras they list the pcie addresses for each type09:16
noonedeadpunkWell I don't have mig on mine...09:16
noonedeadpunkBut vGPU is quite the same09:16
noonedeadpunkWe still list PCI09:16
jrosseri did look quickly yesterday at cyborg and its probably not too bad to make a role09:16
noonedeadpunkThe problem is, that nvidia driver says you can't create any more mdev on any other type once 1 vgpu was created of specific one09:17
jrosserunfortunately no time at all in my team to investigate or test it :(09:17
noonedeadpunkWell, I had a look at cyborg year ago or so and they've struggled to have proper vgpu support back then. And overall I missed the point why to use it as usecases are mostly covered with nova solely as Sean has written09:17
jrosserthough having said all this the biggest requirement i get from users right now is "whole A100/80G"09:18
noonedeadpunkWell. We have request for whole T409:18
noonedeadpunkStill.09:18
noonedeadpunkSo if saying to have A100 - we likely can split in 309:19
noonedeadpunkBut we have A1009:19
jrosseranyway - to bug 200289709:19
jrosserit doesnt say what release does it?09:19
noonedeadpunkNope, it does not. I just assumed it's Z09:20
jrosseroh well09:20
jrosser`type_drivers = geneve,vlan,flat`09:20
jrosser`tenant_network_types = vxlan,flat,vlan`09:20
jrosserthat looks suspect09:20
noonedeadpunkhuh, yes09:21
jrosseri think i have a recent AIO, lets compare09:21
noonedeadpunkTo be frank - Ive never checked horizon on Z for OVN even in AIO09:21
jrosserok well in my AIO i have `type_drivers` matching `tenant_network_types`09:22
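The mismatch flagged above (a tenant network type that is not among the enabled type drivers) can be checked mechanically. A minimal sketch, assuming the two options sit in an `[ml2]` section as in a typical `ml2_conf.ini`; the helper name `unsupported_tenant_types` is hypothetical and not part of neutron or openstack-ansible:

```python
import configparser


def unsupported_tenant_types(ml2_conf_text):
    """Return tenant_network_types entries that are missing from type_drivers."""
    parser = configparser.ConfigParser()
    parser.read_string(ml2_conf_text)
    ml2 = parser["ml2"]
    # Split the comma-separated option values and drop empty items.
    drivers = {d.strip() for d in ml2.get("type_drivers", "").split(",") if d.strip()}
    tenant = [t.strip() for t in ml2.get("tenant_network_types", "").split(",") if t.strip()]
    return [t for t in tenant if t not in drivers]


# The two values quoted from the bug report.
sample = """
[ml2]
type_drivers = geneve,vlan,flat
tenant_network_types = vxlan,flat,vlan
"""
print(unsupported_tenant_types(sample))
```

With the values from the bug report only `vxlan` is reported, which matches the line that "looks suspect".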
jrosserwhere do we define `neutron_provider_networks`09:26
noonedeadpunkI think that's the issue https://opendev.org/openstack/openstack-ansible-os_horizon/src/branch/master/templates/horizon_local_settings.py.j2#L34309:26
noonedeadpunkAnd that doesn't look configurable or overridable. Well. We have some sort of override, but I guess they will need to define whole OPENSTACK_NEUTRON_NETWORK09:27
noonedeadpunkneutron_provider_networks are now only in neutron role defaults I believe09:28
jrosseroh well hold on09:28
jrosserregardless of horizon, neutron server is down09:29
noonedeadpunkwell, I assume it's down specifically because of having vxlan there 09:29
noonedeadpunkIt's less of concern now, as I think bug itself quite valid09:30
noonedeadpunkEven if neutron was configured properly09:30
jrosseryeah, we just need to be clear that theres two things going on09:31
jrosseralso ovn does support vxlan09:34
noonedeadpunkit does?09:35
jrosseryes as it has to interoperate with physical TOR switches09:35
noonedeadpunkum... I'm confused then09:36
jrosserhttps://bugzilla.redhat.com/show_bug.cgi?id=188170409:36
noonedeadpunkas everyone I talked about migration from OVS were marking migration to geneve as main pain point09:36
jrosserso we need to add `horizon_supported_neutron_provider_types` to os_horizon role?09:38
noonedeadpunkyeah, smth like that09:38
jrosserthere is also the special value '*' which enables them all09:38
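For reference, the Horizon option being discussed is the `supported_provider_types` key of the `OPENSTACK_NEUTRON_NETWORK` dictionary in `local_settings.py`. A minimal sketch of the two forms mentioned here, assuming a deployment that only wants the types from the bug report's `type_drivers` line; the names `restricted` and `permissive` are purely illustrative and belong to neither Horizon nor the os_horizon role:

```python
# Illustrative fragments in the style of Horizon's local_settings.py.
# Only 'supported_provider_types' is shown; other keys are omitted.

# Explicit list: only these provider network types are offered.
restricted = {
    "supported_provider_types": ["geneve", "vlan", "flat"],
}

# The special value '*' enables every provider type Horizon knows about.
permissive = {
    "supported_provider_types": ["*"],
}

# In a real settings file one of these would be assigned to the setting itself.
OPENSTACK_NEUTRON_NETWORK = restricted
```

A role variable such as the proposed `horizon_supported_neutron_provider_types` would presumably just template the list on the right-hand side.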
noonedeadpunkNow I'm really confused why we made all that complexity with vxlan/geneve09:39
jrosseri'm not sure09:40
jrossergeneve adds flow specific information to the headers rather than just source/destination09:40
jrosserbut then none of that seems to be surfaced/leveraged from a user POV so i also am kind of confused by it09:41
jrosseron another topic, do you have any idea how to do this? https://paste.opendev.org/show/b8oNDxVrtGZ3a0ieYr7T/09:41
noonedeadpunkI was quite sure that vxlan simply not implemented09:42
jrosseri try to implement something to disassemble a url, then rebuild it according to some spec supplied in a string09:42
jrosserbut the answer is always the spec string itself rather than being evaluated as a template09:43
noonedeadpunkUm, you've forgotten {{ }}?09:43
jrosserwell no, because i want to supply the spec as a var09:44
jrosserto make it changeable09:44
noonedeadpunkah09:44
jrosserits totally fine if i add the {{ }} :)09:44
noonedeadpunkand format doesn't work for you for some reason?09:49
noonedeadpunkhttps://jinja.palletsprojects.com/en/3.1.x/templates/#jinja-filters.format09:49
noonedeadpunkI'm not sure if you can have there named arguments though09:50
jrosserhmm let me try that09:54
noonedeadpunkjrosser: https://paste.opendev.org/show/br8DLxuBlhWteNLO2iMd/09:58
jrossernoonedeadpunk: thats excellent10:06
moha7The 3rd step deployed in 7 hours!10:20
moha7Issue: metadata injection into the instances did not work well in this newly deployed env. Instances got IP and could ping their router hands, but the hostname was not set on it! Ctrl+F for 169.254.169.254 in http://ix.io/4ltx to see the error.10:21
moha7jrosser:   ERROR ovsdbapp.backend.ovs_idl.connection [-] non-zero flags not allowed in calls to send() on <class 'eventlet.green.ssl.GreenSSLSocket'>: ValueError: non-zero flags not allowed in calls to send() on <class 'eventlet.green.ssl.GreenSSLSocket'> 2023-01-17 10:34:26.475 2290 ERROR ovsdbapp.backend.ovs_idl.connection Traceback (most recent call last)10:28
moha7Solution: I copied SSLs  (these items: https://paste.opendev.org/show/bXxUevJGtWu023mfZmNE/) from ` /etc/neutron/plugins/ml2/ml2_conf.ini` and inserted to ` /etc/neutron/neutron_ovn_metadata_agent.ini`. Now the instance gets metadata after some retrying on checking http://169.254.169.254/2009-04-04/instance-id10:28
jrossermoha7: if i remember this is what you experienced a couple of weeks ago?10:28
moha7jamesdenton: I created two types of external networks: Flat and VLAN; And then I created two routers each in one of these external networks. For the router with a hand in the flat-type external network, everything works well and I can ping the gateway of the router from the outside of OpenStack (Thanks for your hints). But the gateway of the other router that is in the VLAN-type external network is not accessible from 10:31
moha7outside!10:31
admin1moha7, you can also use configdrive  10:31
admin1so if metadata does not work, configdrive supplies those details 10:32
moha7(Note: OpenStack has been installed within a VM hosted on the ProxMox. By tcpdump on the ProxMox I see packets of OpenStack instances in the flat-type network, but nothing for the other one. That means there's something with OpenStack itself.)10:32
admin1especially useful when you have workload that does not connect to outside world 10:32
moha7jrosser: I think so, I have ad lots of issues during these days.10:33
moha7had*10:33
noonedeadpunkhm, there's no options for ssl in neutron docs https://docs.openstack.org/networking-ovn/latest/configuration/networking_ovn_metadata_agent.html10:35
noonedeadpunkoh, well, that page is super old10:36
jrosseri was asking last time this happened if it was a service restart10:36
jrosserwhich would be indistinguishable from changing config and restarting a service10:36
moha7admin1: you mean like this: https://serverfault.com/a/1114711/96807110:37
admin1yes10:37
moha7`configdrive` as a support?10:37
admin1but you can pass this as nova override 10:37
admin1during deployment10:38
jrossernoonedeadpunk: thoughts on this as an approach? https://paste.opendev.org/show/bQRTtLCc4WgWWhBp0nep/10:40
jrosseridea is to let you reconstruct the collection URL to point at a mirror10:40
jrosserfor example `export OSA_COLLECTIONS_GIT_LOCATION="https://mirror.example.com/{hostname}{path}"`10:43
noonedeadpunkwell. then it would make sense to add auth as well...10:43
moha7jrosser: The issue happens during creating the instance and also if you restart the instance, it takes around 30 minutes to pass and comes up.10:44
jrosseri think netloc would add username/password from original URL10:44
noonedeadpunkah, might be10:44
jrosserah no it wont10:44
jrosserneed to add username/password as well i think10:45
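The idea of taking a URL apart and rebuilding it from a user-supplied spec string can be sketched in plain Python. This is only an illustration of the approach, not the code in the paste or the eventual patch; the function name `rewrite_url` and the set of field names are assumptions, chosen to match the `{hostname}{path}` example above and to include the username/password fields mentioned as still missing:

```python
from urllib.parse import urlsplit


def rewrite_url(url, spec):
    """Rebuild url according to spec, a str.format template with named fields."""
    parts = urlsplit(url)
    fields = {
        "scheme": parts.scheme,
        "hostname": parts.hostname or "",
        "path": parts.path,
        # Credentials are exposed separately so a spec can keep or drop them.
        "username": parts.username or "",
        "password": parts.password or "",
    }
    return spec.format(**fields)


# Spec taken from the example in the discussion; the source URL is illustrative.
spec = "https://mirror.example.com/{hostname}{path}"
print(rewrite_url("https://opendev.org/openstack/ansible-config_template", spec))
```

Because the spec is an ordinary format string it can be supplied through an environment variable without being evaluated as a Jinja template.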
jrossermoha7: i think that you need to debug this with someone like jamesdenton who has great expertise with neutron10:45
moha7+110:46
noonedeadpunkwell, except names of variables I think it's quite good and flexible10:48
noonedeadpunkas I'd use user_collections_git_location_schema or smth instead10:49
jrossersure10:49
jrosserit was really hard to work out how to do this10:49
jrosseras there are no collections installed and nowhere to write vars files10:49
noonedeadpunkWell, it took quite some time to realize that as well10:49
noonedeadpunks/realize/read/10:49
jrosseri am also not sure if the URL from user-collection-requirements should be rewritten10:49
jrossermy feeling is that they should be left alone as that is user input10:50
noonedeadpunkyup, agree here10:50
jrosserbut interested in your thoughts on that too10:50
jrosserright - so the code needs adjusting to only modify non-user URLs10:50
noonedeadpunkas it might be annoying if user provided urls got overwritten/corrupted in an unexpected way10:51
jrosseryes10:51
noonedeadpunkbut, I'm not sure what can go wrong with such default template. Except, if you change it for general you might not want it to be applied for user provided ones10:52
noonedeadpunkSo some separation is still needed regardless10:52
*** dviroel|out is now known as dviroel11:44
amaraoCan we bump `ansible-role-requirements.yml` for stable/zed? I'm waiting for circular dependencies fix...11:50
noonedeadpunkamarao: well, you can override role if required in /etc/openstack_deploy/user-role-requirements.yml11:53
noonedeadpunkI'm checking another bug right now that seems to affect Zed 11:54
amaraoI know, I've tested fix this way. Now I'm waiting for it to get to stable.11:54
noonedeadpunkand during bumps we update all roles right before creating new tag11:54
noonedeadpunkso it will be part of 26.1.011:55
amaraoack, thanks.11:59
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_horizon master: Allow to override supported_provider_types  https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/87080112:54
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_horizon master: Allow to override supported_provider_types  https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/87080112:55
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: [doc] Add LXB scenario documentation  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87080513:28
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/zed: Add Glance tempest plugin repo to testing SHA pins list  https://review.opendev.org/c/openstack/openstack-ansible/+/87077713:46
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/zed: Add Glance tempest plugin repo to testing SHA pins list  https://review.opendev.org/c/openstack/openstack-ansible/+/87077713:46
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Add Glance tempest plugin repo to testing SHA pins list  https://review.opendev.org/c/openstack/openstack-ansible/+/87077813:46
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Add container_ip option for metal hosts  https://review.opendev.org/c/openstack/openstack-ansible/+/87011314:02
mgariepyjamesdenton, i'm looking at your mnaiov2 code. 14:03
mgariepyusing existing images would be nice addition14:04
mgariepysame for flavor14:06
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible stable/yoga: Bump OpenStack-Ansible Yoga  https://review.opendev.org/c/openstack/openstack-ansible/+/87081014:13
opendevreviewchandan kumar proposed openstack/openstack-ansible-os_tempest master: Add support for whitebox-neutron-tempest-plugin  https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/87081214:16
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: [doc] Add LXB scenario documentation  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87080514:17
jamesdentonthanks mgagne 14:21
jamesdentonsorry, mgariepy 14:21
jamesdentonhi moha7 - we might be missing metadata config for ssl, i will look at that today14:23
jamesdentonand you're saying that your flat external provider network works, but a vlan-tagged provider network doesn't?14:23
noonedeadpunkfwiw I tried to look at ovn metadata code and I haven't seen good half of options we have in it's config ;(14:24
noonedeadpunkAnd indeed we might have some race condition on service restart as jrosser mentioned14:24
jrosseris this stuff we run via uwsgi?14:25
noonedeadpunkum, no, I think that uwsgi for neutron-api don't work if ovn is used14:25
noonedeadpunkso we default uwsgi to disable for ovn14:25
jrosserah ok because we still have the bug there where only one of the many neutron.conf... is passed to the uwsgi service14:26
noonedeadpunkhttps://opendev.org/openstack/openstack-ansible-os_neutron/src/branch/master/defaults/main.yml#L18914:26
noonedeadpunkum, haven't someone pushed a fix for that?14:26
jrosseri don't think so14:26
noonedeadpunkIt was genericswith?14:26
jrosseryes thats where we found it14:26
jamesdentonjrosser i had a fix for that, i thought14:26
jrosserafaik we just moved that config to neutron.conf :(14:27
jamesdentonmight need to hit eavesdrop14:27
jrosservia override14:27
jrosseri did do some looking at uwsgi how to pass multiple config files14:27
noonedeadpunkyeah, jamesdenton totally found the fix14:27
noonedeadpunkor well, the way how to handle through uwsgi14:28
noonedeadpunkor maybe it was you...14:28
jrosserperhaps :)14:28
noonedeadpunkhttps://bugs.launchpad.net/openstack-ansible/+bug/198740514:28
noonedeadpunkSo there's solution I guess, but no patch14:28
jamesdentonwhoops :D14:29
jrosserright so multiple use of --config-file14:29
jrosserlooks like one of those things where we need to distinguish between string and list14:29
jrosserand iterate if it's a list14:30
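The string-versus-list handling described here can be sketched as follows. This is a minimal illustration, assuming the value may arrive either as a single path or as a list of paths; the helper name `config_file_args` is hypothetical, and how the resulting options reach the uwsgi service is left out:

```python
def config_file_args(config_files):
    """Expand one path or a list of paths into repeated --config-file options."""
    # A bare string is treated as a one-element list so both cases share a loop.
    if isinstance(config_files, str):
        config_files = [config_files]
    args = []
    for path in config_files:
        args.extend(["--config-file", path])
    return args


print(config_file_args("/etc/neutron/neutron.conf"))
print(config_file_args([
    "/etc/neutron/neutron.conf",
    "/etc/neutron/plugins/ml2/ml2_conf.ini",
]))
```

Checking for `str` first matters because a string is itself iterable and would otherwise be expanded character by character.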
mgariepyjamesdenton, have you seen this project ? https://github.com/ComputeCanada/magic_castle14:33
jamesdentoni have not14:34
jamesdentonbut i love duplicating efforts :D14:34
mgariepyit's not the same thing :D14:35
mgariepyit's more an hpc training ground 14:36
mgariepybut the use of TF is quite nice there.14:36
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Add facility to rewrite source URLs for ansible collections during bootstrap  https://review.opendev.org/c/openstack/openstack-ansible/+/87082014:36
jrossernoonedeadpunk: ^ this is pretty much for your isolated deploy - i don't need this on my deploy hosts14:37
jamesdentonit is helpful, i'm pretty new to terraform14:37
mgariepyho. also. booting from volume would be nice too :d14:37
noonedeadpunkjrosser: much appreciated! I was just gonna check out these patches for bump script14:38
jrossercould do with a sanity check on that last one, i've tried it in an AIO14:38
jamesdentonyes, i thought about that. i'm implementing small ceph VMs with small OSDs 14:39
jrosserneed to look next at ansible-role-requirements and what to do with that14:39
mgariepyshould i create issue on your project?14:39
noonedeadpunkwell... I assume I'm going to have another problem with OSA_COLLECTIONS_GIT_SCHEMA. As I assume user.rc (https://opendev.org/openstack/openstack-ansible/src/commit/e315e2e32738aed2b0a59ef8099280dee698607b/scripts/openstack-ansible.sh#L58) is not going to work during bootstrap14:44
noonedeadpunkor it will...14:44
noonedeadpunknah, we run directly /opt/ansible-runtime/bin/ansible-playbook14:45
noonedeadpunkwell, I will take care of that anyway14:46
jamesdentonmgariepy yes, that would be helpful. thank you.14:57
noonedeadpunk#startmeeting openstack_ansible_meeting15:00
opendevmeetMeeting started Tue Jan 17 15:00:18 2023 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'openstack_ansible_meeting'15:00
noonedeadpunk#topic rollcall15:00
noonedeadpunko/15:00
jrosserhello15:00
damiandabrowskihi!15:00
jamesdentono/, sortof15:00
noonedeadpunk#topic office hours15:02
mgariepyhello o/15:02
noonedeadpunkI was about to do bump for Z, but then https://bugs.launchpad.net/openstack-ansible/+bug/2002897 raised15:03
noonedeadpunkFix for it has been already pushed, so given some reviews on https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/870801 it should be fast after all15:04
noonedeadpunkThen for OVN driver for Octavia, we're still missing fixing SHA for the plugin. jamesdenton do you want to push the change? I can do it if needed15:05
jamesdentonif you can that would be great, i'm tied up a bit15:05
admin1i am trying to recreate a setup for upgrade test .. its on focal and tag 24.4.2 .. having trouble with rabbitmq-server install  rabbitmq-server : Depends: erlang-base (< 1:25.0) but 1:25.2-1 is to be installed or erlang-base-hipe (< 1:25.0) but it is not going to be installed or  esl-erlang (< 1:25.0) but it is not installable15:05
noonedeadpunk#link https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/86846215:05
NeilHanlono/15:06
admin1oh .. we are in a meeting .. 15:06
admin1\o15:06
noonedeadpunkadmin1: https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/86810715:06
admin1thank you noonedeadpunk15:07
noonedeadpunkjamesdenton: ok, will do. 15:07
jamesdentonit will also help me understand what you mean about the sha :)15:07
noonedeadpunkbtw, I'm going to join you NeilHanlon on FOSDEM :)15:07
mgariepycool15:08
NeilHanlon:gasp: 15:08
admin1i have 50/50 chances of coming in fosdem 15:08
noonedeadpunkjamesdenton: I mean adding octavia_ovn_octavia_provider_git_install_branch to https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/defaults/repo_packages/openstack_services.yml15:09
noonedeadpunkAnd yes, we also need to decide a scope we want for AA15:09
jamesdentonahhh yes yes, thanks15:10
noonedeadpunkI've checked etherpad lately, and we already merged most of things15:11
noonedeadpunk#link https://etherpad.opendev.org/p/osa-antelope-ptg15:11
jamesdentonwoot15:11
noonedeadpunkWe have huge topic about internal TLS at very least15:12
noonedeadpunkAlso finalizing skyline is smth we can totally do15:13
noonedeadpunkoh, and we also do have huge question about modular libvirt15:14
noonedeadpunkAnd things regarding CI/functional testing, but that's not blocking releasing at least15:15
noonedeadpunkAs of functional changes I'd try to minimize them as Zed was quite stressful15:15
noonedeadpunkWe always saying about releasing earlier, but never managed to do so :(15:16
noonedeadpunkIt's mostly me to blame though, I guess15:16
jrosseris internal TLS realistic for AA?15:16
noonedeadpunkdamiandabrowski: ?15:16
jrosserlike huge and we have not really started15:16
jamesdentonnoonedeadpunk you're a beast, thank you15:17
jrosseri do wonder how much we need to step back and just let the OVN stuff settle15:17
damiandabrowskijrosser: yes totally15:17
damiandabrowskiand i to have started15:17
damiandabrowskido*15:17
jamesdentonmaybe a big docs push for AA?15:17
noonedeadpunkyes, this is good ^15:18
noonedeadpunkLet me start new etherpad maybe?15:18
jamesdentonsure15:19
damiandabrowskiright now i'm testing if it's worth to configure haproxy services separately in os_ roles rather then preconfigure all of them with haproxy_server15:19
damiandabrowskitoday i deployed all base openstack services like this, but i need to improve few things before I push a change15:20
noonedeadpunk#link https://etherpad.opendev.org/p/osa-bobcat-ptg15:21
noonedeadpunkdecided to start PTG etherpad for tracking these things15:21
noonedeadpunkyes, totally jrosser I want also to calm things down with OVN a bit, and maybe do docs and bug fixes mostly15:23
jrosserit is frustrating not to be able to resolve moha7 troubles too15:23
jrosserits not possible to determine if it is environment trouble or actual bugs15:23
noonedeadpunkbut as internal tls was started quite a while ago and presumably damiandabrowski will have some time for it, and given there won't be anything really breaking I'd say we can see how this will go15:24
noonedeadpunkyeah, true. But I'd say with OVN it's in general hard to tell/debug some things. We also had some step back in using journals for logging15:25
noonedeadpunkAs example - ovn and gluster that do just local logs15:25
jamesdentonjrosser i think a small combination, but likely due to lack of clarity in the docs15:26
jamesdentoni will do a deploy now and test the SSL metadata issue mentioned15:26
jrosserit is also hard translating bugs/trouble in multinode onto an AIO15:26
noonedeadpunkI'd say what worth testing is somehow test ordering issues when config is changed15:27
jamesdentonagreed. Hoping to solve that with this MNAIOv2 thing i'm working on, but it's dependent on deploying VMs on OpenStack vs KVM.15:27
jrosseror at the end of a fresh deploy, i think thats a really good question15:27
jrosserif just letting the playbooks run to success is it working/broken15:27
noonedeadpunkI assume it's working, otherwise tempest won't succeed...15:28
noonedeadpunkHm, does octavia leverage metadata in any way?15:28
*** dviroel is now known as dviroel|lunch15:28
noonedeadpunkas for cirros it's not really required...15:28
jamesdentoni don't know - the agent is http based i think15:29
jamesdentonno ssh key required afaik15:29
jrosserthere is possibility of a debug key - not sure how that gets in15:29
jamesdentonmetadata, but i don't think we test that15:29
noonedeadpunkI assume through metadata, but that could be config drive as well15:29
noonedeadpunkyeah, so that's good to test...15:30
NeilHanlonjamesdenton: random question, any OSV resources you recommend for deep dives/learning?15:33
jamesdentonOVS? Hmm, not offhand15:38
jamesdentonhappy to look15:38
mgariepyi think the best way to learn is to have issue that you want to fix .. :D15:38
NeilHanlonhehe.. yeah, that's been my learning path so far but I was looking for a more unified/top down approach to the architecture, I guess. sometimes it leaves me scratching my head15:45
NeilHanlonand yeah jamesdenton: OVS.. oops :) 15:45
NeilHanlonthis helped.. a little :P https://www.kernel.org/doc/html/latest/networking/openvswitch.html15:46
jamesdentoni have sustained surface level knowledge, then i go down a rabbit hole when necessary and promptly forget15:46
mgariepyi want to add some doc on how to find specific stuff in the ovn/ovs db 15:46
NeilHanlonwell at least it sounds like it's not just me 😂15:47
jamesdentonthere's just too much to know!15:47
NeilHanlongood company, anyways :) 15:47
mgariepyhttps://blog.russellbryant.net/2016/11/11/ovn-logical-flows-and-ovn-trace/15:48
jamesdenton:)15:48
jamesdenton6 years ago... is a lifetime :D15:48
mgariepywell the base is kinda the same anyway ;p15:48
mgariepyand also doc is always outdated ;p haha15:48
NeilHanlonproblem? buy a <company> license and our experts will fix it for you!15:49
jamesdentonjuju what?15:49
mgariepyyeah.. pay us and we will figure it out to fix it..15:49
mgariepythat's how it works. 15:49
NeilHanlonanyways, sorry for the diversion lol15:52
johnsomOctavia defaults to using config drive to get the initial configuration data on an amphora boot. It can use the metadata service, but that was unstable back in the day we developed this (since fixed), so we stuck to config drive.15:54
noonedeadpunkthanks johnsom, that explains why we don't see failure in our tests if we have broken metadata :)15:55
jamesdentonnice15:59
opendevreviewJames Denton proposed openstack/openstack-ansible master: Add Octavia OVN Provider repo requirements  https://review.opendev.org/c/openstack/openstack-ansible/+/87083416:00
noonedeadpunk#endmeeting 16:02
opendevmeetMeeting ended Tue Jan 17 16:02:35 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:02
opendevmeetMinutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-01-17-15.00.html16:02
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-01-17-15.00.txt16:02
opendevmeetLog:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-01-17-15.00.log.html16:02
NeilHanlonty noonedeadpunk !16:07
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Add facility to rewrite source URLs for ansible collections during bootstrap  https://review.opendev.org/c/openstack/openstack-ansible/+/87082016:12
jrosserhmm do we ever take tempest plugins from local zuul repos - i don't see them in required-projects16:17
jrosseri think we only deal with playbooks/defaults/repo_packages/openstack_services.yml in the past16:19
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Add tempest and tempest plugins to required jobs for source deploys  https://review.opendev.org/c/openstack/openstack-ansible/+/87083916:26
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Allow git servers for openstack services and tempest to be overridden  https://review.opendev.org/c/openstack/openstack-ansible/+/86974816:28
noonedeadpunkjrosser: I had patch for that16:31
jrosseroh cool16:31
noonedeadpunkhttps://review.opendev.org/c/openstack/openstack-ansible/+/83428916:32
noonedeadpunkas you might see I made some mistake somewhere16:32
*** dviroel|lunch is now known as dviroel16:35
jrosseroh and i forgot about the ubuntu distro needing tempest source16:39
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Add tempest and tempest plugins to required jobs for source deploys  https://review.opendev.org/c/openstack/openstack-ansible/+/87083916:42
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Allow git servers for openstack services and tempest to be overridden  https://review.opendev.org/c/openstack/openstack-ansible/+/86974816:42
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-plugins master: Add variable to control no_log in service_setup role  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/86960416:45
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-plugins master: Add variable to control no_log in mq_setup role  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/86960216:46
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-plugins master: Add variable to control no_log in service_setup role  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/86960416:46
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Fix usage of _oslodb_setup_nolog  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/87084216:49
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Add variable to control no_log in service_setup role  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/86960416:50
noonedeadpunkWell, I tried to leverage loop_control labels as contrary to no_log but without much result16:51
noonedeadpunkI had hopes that it's somehow possible to do register and some logic for no_log depending task is failing or not, but register happens after no_log being evaluated16:52
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-plugins master: Fix no_log variable templating in db_setup role.  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/87084316:53
jrossernoonedeadpunk: ^ i completely expected that to behave like a when: .... i.e the "{{ }}" not required16:54
noonedeadpunkyeah, I was under same impression before tested16:54
jrosseryeah i just tested locally and it totally doesnt work or warn or error16:54
jrosserwhich is unfortunate16:55
jrossereven weirder it's not just evaluating the string to be 'True' and still not logging16:55
jrosserit just logs regardless16:56
noonedeadpunkoh, ok.  I for some reason decided it's not logging regardless....16:56
noonedeadpunkbut I added extra stuff there, that wasn't working either, so maybe that's why16:57
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Add facility to rewrite source URLs for ansible collections during bootstrap  https://review.opendev.org/c/openstack/openstack-ansible/+/87082017:38
mgariepyfor the ovn-meta it seems like it needs restart.17:39
mgariepyi'll have lunch and look at it a bit more.17:42
jamesdentonwell, i just did an AIO and the service is pulling in /etc/neutron/plugins/ml2/ml2_conf.ini, which has the SSL bits17:47
noonedeadpunkjrosser: I think we should place openstack_services_opendev_server/openstack_services_github_server elsewhere, like group_vars or smth like that17:48
mgariepyyes but the ovn-meta service is tracebacking.17:50
noonedeadpunkas even in bump script it's quite messy now. As you need to read yaml file, then get variables from it, then render original non-parsed "string", and then parse rendered one as yaml again17:50
noonedeadpunkor well, hardcode17:50
mgariepyi think it's racing because of the ordering.17:51
noonedeadpunkAnd not sure there's any need in splitting openstack_services_opendev_server from openstack_testing_opendev_server17:52
noonedeadpunkor well, then we're missing github services for gnocchi and consoles17:52
jamesdentonhmm, no trace here.17:52
mgariepymy aio was tracing :/17:53
mgariepydid you run a vm ?17:53
jamesdentonbut no working metadata, either :D17:53
jamesdentonyes17:53
jamesdentondid master, not zed. wasn't thinking about it17:54
jrosserjamesdenton: i had a dig around in my AIO too17:55
jrosserthe only thing i could see that looked suspicious was https://paste.opendev.org/show/bi8cKjYlKNS8TyoT4Dq4/17:55
mgariepyi'm on zed.17:55
jrossersearch that ^ for "died"17:55
jamesdentonahh yes, died. i see that too17:56
jamesdentonso, i see traffic from instance->metadata namespace, but [S] then [R]18:02
jamesdentoni think it may be another permissions issue, one sec18:04
jrosseri was trying to check that i could see the ovn haproxy listening socket in the metadata namespace18:07
jrosserthats not obvious how to do18:07
jamesdentondo you see this in your logs? Proxy listener started.18:09
jamesdentonJan 17 18:08:49 aio1 haproxy-metadata-proxy-142e1e29-b98b-481c-beba-a73d99ea54b0[242356]: Proxy listener started.18:09
jrosserwhich log do you check there?18:09
jamesdentonthe neutron-ovn-metadata-agent journal18:10
mgariepyhttps://paste.openstack.org/show/bqbgN34FD1P5NHjjZdRB/18:10
jamesdentontcp                LISTEN               0                    1024                               169.254.169.254:80                                    0.0.0.0:*                  users:(("haproxy",pi18:10
jrosserno, i don't think i do18:11
jrosserwas that just netstat on the host? or in the ns?18:11
mgariepyin the ns probably18:11
mgariepylike in my paste18:11
jamesdentonok, i think a couple of things: 1. the permissions on /var/lib/neutron/ovn-metadata-proxy/ need to be neutron:neutron, and the haproxy pid needs to be killed, ovnmeta namespace deleted, then ovn metadata agent restarted18:12
jrosserright, so i don't see the socket open there18:12
jamesdentonhttps://paste.opendev.org/show/bhshXaNHzOGIfSpp4ASL/18:13
jrossertheres also a `metadata_proxy` socket which is root:root18:13
jamesdentonyeah, but i think that's because: https://github.com/openstack/openstack-ansible-os_neutron/blob/master/vars/main.yml#L496-L49718:14
jamesdentonand if we change the dir perms to neutron:neutron maybe that root can be avoided18:14
jrossermetadata agent journal is full of privsep stuff so hopefully none of this needs to be root18:15
jrossermore suspicious stuff https://paste.opendev.org/show/bDy27zbLeBJTPH16zp43/18:17
jrossertwo metadata agents?18:18
mgariepyhmm probably need to go.. :/18:20
mgariepyhttps://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c9b/852243/2/check/openstack-ansible-deploy-aio_lxc-ubuntu-jammy/c9b06a3/logs/host/neutron-metadata-agent.service.journal-12-39-53.log.txt18:20
mgariepyhttps://github.com/openstack/openstack-ansible-os_neutron/blob/master/vars/main.yml#L288 probably need to filter for ovn.18:25
jrosseralso the dhcp agent?18:31
jrosseri also have to head off18:31
mgariepyi was more speaking of the service.. lol not me..18:32
mgariepyhaha18:32
opendevreviewMarc Gariépy proposed openstack/openstack-ansible-os_neutron master: Disable dhcp-agent and metadata-agent for OVN  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87085418:45
mgariepyjamesdenton, do you patch the perms issue ?18:46
mgariepyor you want me to push it ?18:46
prometheanfireanyone use the ovn driver for octavia in OSA?18:50
mgariepyprometheanfire, https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/86846218:52
jamesdentoni haven't, but feel free. would be curious to see how CI does18:52
prometheanfireya, looks like a reasonable patch18:54
opendevreviewMarc Gariépy proposed openstack/openstack-ansible-os_neutron master: Fix user for neutron-ovn-metadata-agent  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87085718:55
jamesdentoni do think it will default to neutron if not called out, but this is a good test18:56
jamesdentonthank you18:56
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Allow git servers for openstack services and tempest to be overridden  https://review.opendev.org/c/openstack/openstack-ansible/+/86974818:58
opendevreviewMarc Gariépy proposed openstack/openstack-ansible-os_neutron master: Fix user for neutron-ovn-metadata-agent  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/87085718:58
noonedeadpunkI have a list of comments on that patch, more cosmetic ones rather than functional19:09
noonedeadpunkTo octavia/ovn I mean19:09
mgariepyfor the ownership of the folder do we need to try to fix it in zed for deployment ?19:10
jrossermgariepy: i think 870854 only disables for new deployments, rather than existing ones19:10
jrosserbecause of https://github.com/openstack/openstack-ansible-os_neutron/blob/d4cbd2d7adcf4bba95a885c87d2c6e4dc9c1b012/vars/main.yml#L340-L34119:11
noonedeadpunkjust in case I've patched bump script to work with 86974819:11
opendevreviewMerged openstack/openstack-ansible-os_horizon master: Allow to override supported_provider_types  https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/87080119:11
mgariepyso we add a release note then ?19:12
mgariepy:/19:12
mgariepyfor zed i mean.19:13
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_horizon stable/zed: Allow to override supported_provider_types  https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/87077919:14
noonedeadpunkwhat is that and why we have it then... https://github.com/openstack/openstack-ansible-os_neutron/blob/d4cbd2d7adcf4bba95a885c87d2c6e4dc9c1b012/vars/main.yml#L488-L49319:17
moha7Sorry for bumping the group; I'm resending my question as I was disconnected for hours.19:18
moha7jamesdenton: Hi james; To give you a feedback on deploying both flat and vlan networks:19:19
noonedeadpunkto be frank I don't have good solution, except quite complicated one19:19
moha7In the new deployed env, I created two types of external networks: Flat and VLAN; And then I created two routers each in one of these external networks; For the router with a hand in the flat-type external network, everything works well and I can ping the gateway of the router from the outside of OpenStack. But the gateway of the other router that is in the VLAN-type external network is not accessible from outside!19:20
moha7Note: OpenStack has been installed within a VM hosted on the ProxMox. By tcpdump-ing from the Prox, I see packets of OpenStack instances in the flat-type network, but nothing for the other one19:21
noonedeadpunkAs we could set `enabled: false masked:true` based on the ovn/non-ovn but then services will be created, but left disabled19:21
noonedeadpunkWhile we don't really want to create them for new deployments at all19:21
mgariepywhere does the packet for the vlan one end up ? 19:21
noonedeadpunkOr create some "upgrade/transitional" task that will check for services and disable/mask dhcp/metadata ones19:22
moha7`ip netns exec <ovn-metadata-somestring> ping upstream gateway` leads to nothing19:22
noonedeadpunkI kind of like idea of such task tbh. But I wonder if for that we would need to gather systemd_service facts which are huge19:22
moha7(`ip netns` on the computes)19:22
noonedeadpunkrelease note... meh, dunno19:23
mgariepylol19:23
noonedeadpunkI will try to take a look tomorrow how complicated it would be19:23
jrossermoha7: do you know if your proxmox supports vlan tagged traffic on the VM interfaces?19:25
mgariepyif you tcpdump on the interface do you see something in the compute vm ?19:25
prometheanfireiirc a bit over a decade ago it supported passing through trunks19:26
mgariepyok thanks noonedeadpunk 19:27
moha7mgariepy: subnet: 172.17.238.0/24; upstream router gateway: 172.17.238.254 that is set on the TOR router, the hop after ProxMox; The IP that the OpenStack router has taken randomly is 172.17.238.176; the other hand is an internal network connected to instances. I can ping 172.17.238.176 within instances, but not 172.17.238.254; from the outside, 172.17.238.176 is not pingable either.19:27
mgariepyok19:27
mgariepyin the computes that is hosting the router. if you tcp dump the network interface do you see something passing ?19:28
moha7let me check19:30
moha7jrosser: I'm not sure. going to search to see if there are similar issues with Prox19:30
moha7Should the "Segmentation ID" be the same as VLAN ID in the PrX and TOR router? <-- https://ibb.co/SBgP4pk19:40
jrosseryes it should19:46
jrosserbut as far as i can see you will need to make config in proxmox so that whatever your vm connects to is vlan aware19:46
jrosserand permit that vlan19:46
moha7I think Trunking has a role here, doesn't it? https://docs.openstack.org/neutron/zed/admin/config-trunking.html19:48
jrosserno19:48
moha7considering te Prox, I'm looking in some forums posts19:48
moha7the*19:49
jrosserunless you can see arp or other chatter from your physical router interface on the inside of your VM, not much point moving forward19:50
jrosserthe neutron trunking docs is about creating trunk interfaces to connect to instances19:50
*** dviroel_ is now known as dviroel|afk19:57
jamesdentonhi moha7 19:59
moha7Hi19:59
moha7By the config you gave me, now there's no error in neutron logs and the flat network is working well.20:00
moha7I had also an issue with the metadata as below:20:01
jamesdentonlet me review your notes20:01
moha7> metadata injection into the instances does not work well. Instances gets IP and can ping their router hands, but the hostname does not set on it! Ctrl+F for 169.254.169.254 in http://ix.io/4ltx to see the error.20:02
jamesdentonyes, we have a patch we are testing right now for that20:03
moha7but solved as I described in above comments20:03
jamesdentonok20:05
jamesdentonso, in this lab, the infra and computes are all VMs?20:05
moha7yes20:05
jamesdentonfor ESX, if you were doing trunking, the port group would be VLAN 4095 (which translates to trunk, essentially)20:05
jamesdentonand then you can tag traffic in your VM and if the switchport was configured to allow it, it will20:05
moha7All VMs with these interfaces: http://ix.io/4lwO20:06
jamesdentonyes, and enp6s21 is used for your OVS bridge, right?20:06
jamesdentonon the proxmox side, that enp6s21 interface would be tied to some network or port definition, i would think?20:07
moha7The free one, enp6s21, is set for the provider networks (both flat and vlan) in the openstack_user_config.yml20:07
jamesdentonyes, and untagged traffic is working OK and tagged is not, so i suspect something needs to be tweaked in proxmox to support tagged20:08
jamesdentonon the proxmox side, are there bridge configurations?20:08
jamesdentonvmbr0, vmbrX, etc?20:09
moha7i proxmox, the interface is on separate vlan, defined as vmbr0, with the vlan ID 367720:09
moha7in*20:09
moha7Yes, they are all bridge20:09
jamesdentoncan you show the config?20:10
jrosserare there any bridge-vlan-aware and bridge-vids set?20:10
jamesdenton^^20:10
jamesdentonif you are using eth0.3677 in vmbr0, then flat will work and vlan won't, essentially20:11
jamesdentonbut the config will help see what that looks like20:11
moha7https://ibb.co/wYY5pTh --> vmbr4 is bunch of subnets all on one vlan, set on those other interfaces, enp6s19-2020:12
jamesdentonany chance you can show us what the CLI is showing? Is it in /etc/network/interfaces maybe?20:12
moha7Sure; w8 plz20:13
jamesdentonsure brb20:15
moha7O_o20:33
moha7A mistake by my colleague <- http://ix.io/4lwZ who has defined all subnets under vmbr0 the same!20:34
moha7vmbr4*20:34
moha7jamesdenton: Then, it's proxmox network misconfiguration :'(20:35
jamesdentonlooking20:36
jamesdentonso, i would expect vmbr0 to contain ens3f1 (no tag), and have "bridge-vlan-aware yes" set20:37
jamesdentonthen, it ought to pass vlan tags (i think, anyway)20:37
jamesdentonyour flat network will continue to work *IF* 3677 is the native vlan on the switchport20:38
jamesdentonbut in this case, i expect it is not20:38
jrossersomehow this feels more complicated than it needs to be20:39
jrosserare there really multiple interfaces in the vm hooked down to the same bridge?20:40
moha7Oops, but vmbr0 has been correct. vmbr0 <--> enp6s2120:40
jamesdentonyes, but the interface connected to that bridge, ens3f1.3677, is a vlan-tagged interface (by the underlying host)20:41
jamesdentonso Neutron is adding a VLAN tag on top of that20:41
moha7Umm, seems there's no `bridge-vlan-aware yes` there!20:41
jamesdentonens3f1.3677 being tagged already is sort of transparent to neutron20:41
jamesdentoni think you will need to remove ens3f1.3677 and make it ens3f1, and possibly add bridge-vlan-aware20:41
jamesdentonwith a flat network, neutron is not tagging. As neutron passes untagged traffic thru enp6s21, it hits vmbr0, where a VLAN 3677 is applied on the way out ens3f120:42
jamesdentonif you want neutron tagging to work, neutron needs to be able to tag on enp6s21, and it hit vmbr0->ens3f1 (intact and not tagged again)20:43
moha7`3677` is native vlan on TOR switch. Indeed, 3677 is created by the network team introduced to us.20:43
jrossermoha7: is that how it has to be?20:44
jamesdentonare you sure it's the native vlan?20:44
jamesdentoneither way, if you only have vlan 3677 then you will only be able to support a flat provider network OR a single VLAN network (Segment ID 3677). But if the switchport is configured as an access port in 3677, then you're limited to flat.20:45
jrosserit kind of has to be a trunk with ens3f1.3677 existing20:46
moha7If I understand 'native' in the right way, 3677 is VLAN on the switch that is controlled by the network guys; Actually we don't have more information how they are handling it; Maybe I need to talk to that team.20:46
jamesdentonens3f1 looks like a trunk carrying 3677,3674 based on your config anyway20:46
moha7Yes, it's trunk as I remember some arguments with network administrator.20:47
jamesdentonok20:47
jamesdentoni speak in Cisco/Arista config, so apologies if 'access' port doesn't make sense20:48
jamesdentonbottom line is, with this configuration in place, you're likely stuck with a flat provider network20:48
moha7let me review your comments to see how I should edit that file; there's lots of info now here (:20:49
jamesdentonwhich is fine -- tenant traffic would traverse geneve tunnel. What isn't clear is which interface that rides on20:49
jrosser^ though if you can change the proxmox host config, you can do vlan too?20:49
jrossermoha7: i think maybe taking some time to think about what your physical deployment might look like, away from the proxmox stuff would help20:50
jrosserif you already won the battle to get a trunk port with vlans from your network team, that is good20:51
moha7+120:51
jrosserthings seem to be getting confused with the virtualisation setup, which will all go away with physical hosts20:51
jamesdenton+220:51
jrosserthen see how you would map what you actually want from the real deployment back to proxmox to make a realistic lab20:52
jrosseryou are probably in a much better position to do that now having spent some time with the deployment20:52
moha7jrosser: but I also should deliver the deployment task as it's taking 3 sprints long; For the physical network topic, I really need to speak with the whole team, plus joining network administrators too.20:53
moha7Now, I think my task is going to be moved to the done basket.20:53
jrosserimho getting a grip on the network requirements is one of the largest things to wrap the brain around for an openstack deployment20:54
jamesdentonmgariepy As neutron, neutron-ovn-metadata agent can't open socket:  Exception: Could not retrieve schema from unix:/var/run/openvswitch/db.sock20:54
mgariepyha20:59
mgariepysad20:59
jamesdentonturns out i mentioned that here, whoops: https://github.com/openstack/openstack-ansible-os_neutron/commit/d4cbd2d7adcf4bba95a885c87d2c6e4dc9c1b01221:00
jrossermaybe stupid question but do we run ovs-vswitchd with too much privilege too?21:01
jamesdentonyes?21:02
jamesdentonbut we install from package, right?21:02
jamesdentonmeaning, i don't think we go out of our way there21:03
jrosserman page suggests there is some priv dropping possible with --user user:group21:04
mgariepyon rhel it runs as openvswitch21:04
mgariepyubuntu ship as root.21:05
jamesdentonwell, maybe we can drop to neutron:neutron and see what breaks?21:05
mgariepyall the ssl stuff will break21:05
jamesdentonmeh21:05
jamesdentonsounds like a mgariepy problem21:06
mgariepylol21:06
mgariepyhttps://github.com/search?q=repo%3Aopenstack/openstack-ansible-os_neutron%20neutron_ovn_system_user_name&type=code21:07
mgariepywhat a mess.21:07
mgariepylol21:07
jamesdentonfreedom to choose the tool used to stab yourself in the eye21:08
jamesdentonLinux™21:08
mgariepyi wonder how a migration from root to neutron or openvswitch would go.21:08
jamesdentonopenvswitch user doesn't exist in Ubuntu, FWIW21:09
mgariepyyeah i know21:10
jrosserchanging the cert ownership to root:neutron shouldnt hurt should it?21:19
jrosserthats probably how it should be anyway, rw by root, r by neutron21:19
jamesdentonhttps://developers.redhat.com/blog/2018/03/23/non-root-open-vswitch-rhel21:27
opendevreviewMerged openstack/openstack-ansible-os_horizon stable/zed: Allow to override supported_provider_types  https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/87077923:52
