Friday, 2021-07-23

opendevreviewSatish Patel proposed openstack/openstack-ansible master: Set FQDN hostname for aio  https://review.opendev.org/c/openstack/openstack-ansible/+/80193204:25
opendevreviewAndrew Bonney proposed openstack/openstack-ansible-os_keystone stable/victoria: Improvements to federation packaging  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/80194007:28
*** rpittau|afk is now known as rpittau07:35
noonedeadpunkmornings08:47
noonedeadpunkandrewbonney: if you're around vote on https://review.opendev.org/c/openstack/openstack-ansible-os_panko/+/799806 would be great :)08:48
andrewbonneyI'll take a look shortly08:48
opendevreviewMerged openstack/openstack-ansible-os_panko master: Deprecate os-panko role  https://review.opendev.org/c/openstack/openstack-ansible-os_panko/+/79980608:52
*** mgoddard- is now known as mgoddard12:18
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-haproxy_server master: Fix service removal condition  https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/80191012:25
admin1hi guys .. i am trying to use (untagged network ) for the first time .. do I have to use flat or can i use vlan but not use segmentation id ? 12:57
admin1and is the bridge still on br-vlan .. just without a tag ? 12:57
spateli think without tag would be flat network 13:01
admin1do i need to create br-flat? 13:02
spateli believe you just need to map your flat network with phynet interface ( either physical nic or br nic)13:04
spatelsomething like this physical_interface_mappings = provider:eth113:05
spatelflat_networks = provider13:06
spatelThis is good example - https://docs.openstack.org/newton/install-guide-rdo/launch-instance-networks-provider.html13:06
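A minimal sketch of the two settings spatel quotes above, written as Python that renders the ini fragments. The section names ([ml2_type_flat] and [linux_bridge]) follow the linked install guide for the Linux bridge agent; the label `provider` and interface `eth1` come from the messages above, and the helper name `render_flat_config` is hypothetical.

```python
import configparser
import io


def render_flat_config(label: str, interface: str) -> str:
    """Render the flat-network settings quoted above as ini text.

    Hypothetical helper: in a real deployment these two sections live in
    separate files (ml2_conf.ini and linuxbridge_agent.ini).
    """
    cfg = configparser.ConfigParser()
    # ML2 side: declare which physical network labels may carry flat networks.
    cfg["ml2_type_flat"] = {"flat_networks": label}
    # Agent side: map that label to an existing interface on the host.
    cfg["linux_bridge"] = {
        "physical_interface_mappings": f"{label}:{interface}"
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_flat_config("provider", "eth1"))
```

The point of the mapping is that the flat network is tied to an interface by label, so no segmentation id is involved.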
spatelnoonedeadpunk jrosser are you there?13:06
spatelwant to talk about - https://review.opendev.org/c/openstack/openstack-ansible/+/80193213:07
noonedeadpunkI'm around13:09
noonedeadpunkdamn, ceph-ansible still makes trouble with depth....13:11
noonedeadpunkadmin1: eventually flat should be different interface from br-vlan13:13
noonedeadpunkIt can be tagged vlan itself. But it must be some existing interface other than br-vlan preferably with consistent naming13:14
noonedeadpunkjrosser: do you think instead of depth, we should use --shallow-since or just --single-branch ?13:15
noonedeadpunkso with bump we can update some shallow_date and use it for shallow-since ?13:18
spatelsorry i was in meeting.. 13:24
spatelnoonedeadpunk i found OVN is failing because of hostname issue specially aio build13:24
spatelwe are just setting hostname but not FQDN 13:25
spatellooking for why my patch failed with metal build13:25
noonedeadpunkIt failed for upgrade and I know why....13:26
noonedeadpunkspatel: but what to do for ppl who don't have fqdn? 13:26
noonedeadpunkor they've not set it? It's not really obvious for me, why I can't set hostname I want13:27
spatelwhy don't you have fqdn in production? 13:27
spatelthat would be bad config right?13:27
spatelwhen i build aio in lab OVN set hostname aio1.openstack.local but hostname command return aio1  (this is what nova-compute also use) 13:29
spatelnoonedeadpunk check this out - https://bugs.launchpad.net/openstack-ansible/+bug/192297813:30
spatelbottom line whatever hostname this command return we need same in OVN openvswitch  - openstack compute service list13:31
spatelwhen i build my ovn in production i didn't see this issue, so look like aio1 related issue 13:31
noonedeadpunkopenstack compute service list does return hostname not fqdn 13:32
noonedeadpunkhypervisors do return fqdn13:32
spatelin my production its fqdn compute01.foo.com 13:32
noonedeadpunkok, let me clarify it - it takes from python socket.gethostname()13:33
noonedeadpunkwhich eventually depends on the order of records in hosts file13:33
noonedeadpunkso what I have in there is13:34
noonedeadpunk| 190 | nova-compute   | cc-compute12-dx1                      | nova     | enabled | up    | 2021-07-23T13:34:29.000000 |13:35
noonedeadpunkwhich kind of means that ovn won't work for me?13:35
spatelIn my production, i don't add any compute hostname in /etc/hosts file and DNS. my nova compute pick hostname command and my hostname return compute01.foo.com 13:35
noonedeadpunkwhich is kind of weird don't you think?13:35
spateltell me one thing.. what hostname command return on your cc-compute12-dx1   ?13:36
noonedeadpunkhttps://paste.opendev.org/show/807683/13:37
noonedeadpunkand aio kind of the same?13:37
spatelthis is strange.. because in my production i set hostname compute01.foo.com in /etc/hostname file 13:37
spatelso in my case hostname command return FQDN13:37
spatelits personal choice to set FQDN or not.. 13:38
spatelThis is going to be problem who deploying ovn ( if they won't follow rules related hostname then it will be issue which we are having right now)13:39
spatelmay be we can put some more check in ansible to find out correct hostname (short or fqdn and set according) 13:40
noonedeadpunkhttps://paste.opendev.org/show/807684/13:41
noonedeadpunkbtw, only hostname -A should return fqdn afaik13:42
noonedeadpunkor hostname --fqdn13:43
noonedeadpunkbut just hostname is fine to be non-fqdn13:43
spatelwithout this patch https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/793009/3/tasks/providers/setup_ovs_ovn.yml#b1713:43
spateljames was using command: "ovs-vsctl set open_vswitch . external-ids:hostname='{{ ansible_facts['hostname'] }}'"13:43
spatelansible_facts['hostname'] always return short hostname 13:43
noonedeadpunkSo. ovn hostname should match nova one, right?13:44
noonedeadpunkOr it should be just fqdn and doesn't really matter if it's matching nova or not?13:44
spatelwhen i deploy OSA in my production and my compute node hostname command return compute01.foo.com but ansible tell ovs to use compute01 and that break my environment (we need ovs and nova compute hostname similar)13:45
noonedeadpunkyeah I see, that's valid bug13:45
noonedeadpunkNot sure what is valid solution for it though :)13:45
spatelwe need to come up with smart way to handle this dual problem where some people set fqdn and some people set short hostname for compute nodes13:46
noonedeadpunkAs hostname thing is really weird thing and depends on the record in /etc/hosts13:46
noonedeadpunkyeah, totally agree13:46
spatelwhat if OSA set hostname on compute node and make it short even someone set fqdn ?13:46
noonedeadpunkwe will break things during upgrade13:48
spatelfor experiment i changed hostname of my compute node and restart nova service which re-register new compute name with newer name and ovn got confused :) 13:48
noonedeadpunkAs all computes and net nodes and cinder services and etc would be renamed13:48
jrosserhostname returning short name is correct imho13:48
noonedeadpunkIt will break masakari and corosync cluster most likely as well13:48
jrosserit is deployed problem to have proper host / dns setup, not OSA13:49
jrosser*deployer13:49
noonedeadpunk`You can check the FQDN using hostname --fqdn`13:49
spatelovn has only issue with compute hostname, it doesn't care about other hostname.. 13:49
noonedeadpunkman hostname13:49
noonedeadpunk`If a machine has multiple network interfaces/addresses or is used in a mobile environment, then it may either have multiple  FQDNs/domain  names  or  none  at  all.`13:50
spatelagreed its not our problem to fix how people use hostname in their datacenter 13:50
spatelwe just need to educate or document about this issue to make sure your hostname is short otherwise it will create issue 13:50
noonedeadpunkBut yes, I agree that it's not osa business how ppl set hostnames13:51
noonedeadpunkbut we must deal with these choices...13:51
spatelso lets delete this patch and just document when someone go deploy OSA in production 13:51
spatelhttps://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/79300913:51
jrosserthis is putting just fqdn in /etc/hosts and not a third parameter for hostname?13:52
spatelI think james created this because i opened bug issue when i was deploying in production and hit this same issue related hostname conflict13:52
noonedeadpunkI think as workaround we can retrieve actual hostname with socket.gethostname13:52
noonedeadpunkas that's what used by nova and will give valid result anyway13:53
spateljrosser does nova compute use /etc/hosts file to read hostname? 13:53
noonedeadpunkspatel what `{{ ansible_facts['nodename'] }}` would return for you?13:54
spatelcompute01  short name13:54
noonedeadpunkso nodename and hostname are the same?13:55
noonedeadpunkdoh (13:55
spatelansible localhost -m ansible.builtin.setup 13:55
spateldoesn't ansible read that from facts 13:55
spatelrun this command ansible localhost -m ansible.builtin.setup and see what is hostname facts13:55
noonedeadpunkI just thought that nodename would return real hostname, and not trimmed one13:56
spateland nodename is always FQDN 13:57
noonedeadpunkwait13:57
noonedeadpunknodename is not fqdn13:57
noonedeadpunk`ansible_facts['nodename']": "cc-compute12-dx1"`13:57
spatelin my case "ansible_nodename": "os-osa.v1v0x.net", 13:58
spatelI am talking about this returning all these facts- ansible localhost -m ansible.builtin.setup13:59
noonedeadpunkthen we can just replace there hostname with nodename ?:)13:59
noonedeadpunkhttps://paste.opendev.org/show/807685/13:59
noonedeadpunkfirst is nodename, second is hostname and third is fqdn14:00
jrosserwhy don’t we make the var looked up in the facts itself a variable14:00
jrosseransible_facts[neutron_ovs_hostname_fact]14:01
jrosserand default it to ‘hostname’14:01
jrosserthen whatever strangeness exists, a deployment can pick the fact that works for them14:02
noonedeadpunkWhy not default to nodename? As it seems like nodename == hostname bash command?14:02
noonedeadpunkOr I'm wrong?14:02
jrosserwell, default as appropriate:)14:02
noonedeadpunkbut yeah, I agree that we can just add neutron_ovs_hostname_fact14:03
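A small sketch of jrosser's idea, assuming the facts are available as a plain dict. The parameter `fact_name` plays the role of the proposed neutron_ovs_hostname_fact variable; the function name and the sample fact values are hypothetical and only illustrate the lookup, not how the role would implement it.

```python
def ovs_hostname_command(facts: dict, fact_name: str = "hostname") -> list:
    """Build the ovs-vsctl argv using a deployer-selected fact.

    Hypothetical illustration: `fact_name` stands in for the proposed
    neutron_ovs_hostname_fact variable and defaults to 'hostname'.
    """
    if fact_name not in facts:
        raise KeyError(f"fact {fact_name!r} not gathered on this host")
    return [
        "ovs-vsctl", "set", "open_vswitch", ".",
        f"external-ids:hostname={facts[fact_name]}",
    ]


if __name__ == "__main__":
    # Illustrative facts for a host whose nodename is an FQDN.
    sample = {"hostname": "os-osa", "nodename": "os-osa.v1v0x.net"}
    print(ovs_hostname_command(sample))
    print(ovs_hostname_command(sample, fact_name="nodename"))
```

Keeping the fact name configurable means a deployment where hostname and nodename differ can pick whichever one matches what nova-compute registers.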
noonedeadpunkwe need to sort out depth issue once and forever as well... https://zuul.opendev.org/t/openstack/build/4b4487db6123489dbb483dbdcc8134c3/log/job-output.txt#373614:04
spatelLook at here these three should be similar when it comes to ovn  - https://paste.opendev.org/show/807686/14:06
noonedeadpunkyeah, ok, so let's indeed add a variable which would default to nodename14:06
spatelare you saying use nodename here like this command: "ovs-vsctl set open_vswitch . external-ids:hostname='{{ ansible_facts['nodename'] }}'"14:08
spatelor other work around we can set host = compute01 in /etc/nova.conf (that way no matter what is your OS hostname it will use same name) 14:10
spatelnoonedeadpunk if you know how to fix this nodename vs hostname could you cut the patch :) so i can verify my lab14:11
*** rpittau is now known as rpittau|afk14:13
noonedeadpunkok, I got the issue... It's kind of the same one that present for libvirt vs nova way of reporting hostname....14:14
noonedeadpunkSo I think the only solution here is try to set hostname to be persistent for OVN14:22
noonedeadpunkand during server restart ovn retrieves hostname on its own and not using what has been previously set for it.14:22
noonedeadpunkI wonder what arguments can be passed to the OVN_CTL_OPTS14:23
spateltechnically its OVS switch hostname to report to ovn_northd14:23
noonedeadpunkSo I guess that OVS_EXTRA can provide hostname. But I see it being used only in network-scripts14:24
spateldid you reboot server to verify what hostname ovs switch picking up 14:24
noonedeadpunkI don't have ovn lab so....14:24
noonedeadpunkbut it's what is written in bug:)14:25
spateli think once you set hostname in ovs switch it will stay same in ovsdb database not going to change after reboot14:25
noonedeadpunkSo ansible was setting everything correctly, but after reboot nova-compute service differs with ovn14:25
spatelI think james got confused there 14:25
noonedeadpunk`Pre-reboot external_ids : {hostname=compute3`, `Post-reboot external_ids : {hostname=compute3.openstack.local`14:26
noonedeadpunkso that patch is not going to help at all....14:26
noonedeadpunkwell, setting hostname doesn't matter that way as well...14:27
spatellet me give you example, if i set compute node hostname to hostnamectl set-hostname foo.bar.com and restart nova-compute then openstack report foo.bar.com in openstack compute service list and that is what ovn/ovs switch doesn't like 14:27
noonedeadpunkso task now find the way how to make ovs respect provided hostname.14:27
noonedeadpunkI think the one way would be to add override to systemd service14:27
noonedeadpunkand set some ENV or add hook after service start to adjust hostname for ovs14:28
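A rough sketch of the "hook after service start" idea, written as Python that renders a systemd drop-in. This is untested and rests on assumptions: `ExecStartPost=` is a standard systemd directive, but the binary path `/usr/bin/ovs-vsctl`, the helper name, and the idea that re-applying the value after start is enough are all hypothetical. The unit the drop-in belongs to differs between distributions and is deliberately left out.

```python
import re


def render_ovs_hostname_dropin(hostname: str,
                               ovs_vsctl: str = "/usr/bin/ovs-vsctl") -> str:
    """Render a systemd drop-in that re-applies the OVS hostname after start.

    Hypothetical helper; the ovs-vsctl path is an assumption and the
    approach has not been verified against a rebooted host.
    """
    # Refuse anything that is not a plain hostname, so the value cannot
    # smuggle extra arguments into the unit file.
    if not re.fullmatch(r"[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?", hostname):
        raise ValueError(f"unexpected hostname: {hostname!r}")
    return (
        "[Service]\n"
        f"ExecStartPost={ovs_vsctl} set open_vswitch . "
        f"external-ids:hostname={hostname}\n"
    )


if __name__ == "__main__":
    print(render_ovs_hostname_dropin("compute01"))
```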
noonedeadpunkUm. We're not talking about hostname change do we?14:28
noonedeadpunkSo current problem is the same as nova has with libvirt14:28
noonedeadpunkSo ovs, like libvirt, is written in C. So in C with gethostname you will get fqdn, while in Python gethostname gets hostname14:29
spateli am just giving idea that ovn only care about what nova hostname show up in openstack, so question is how to make sure nova-compute hostname and ovn/ovn hostname match 14:29
noonedeadpunkEventually this results in different hostnames being reported for openstack hypervisor list and openstack compute service list14:30
noonedeadpunk(it's kind of fact I see atm)14:30
noonedeadpunkWith OVS it's really the same14:30
spatelif we tell /etc/nova.conf  host = compute01 and same time tell ovs hostname = compute01 then problem can be solved 14:30
noonedeadpunkexcept it breaks deployments14:30
noonedeadpunkhow to tell ovs?14:30
noonedeadpunkI dunno how to tell ovs, and that's the current problem14:31
spatelcommand: "ovs-vsctl set open_vswitch . external-ids:hostname='compute01'"14:31
noonedeadpunkyes, but it's reset on reboot14:31
spatelno it won't 14:31
noonedeadpunkand after reboot (or service restart even) OVS rolls back hostname to fqdn14:31
spatelit will set on ovsdb database 14:31
noonedeadpunkand does not respect what we have set with that command14:32
spateli can test in few min.. if you won't i don't think it will get change 14:32
noonedeadpunkwell, ok, have you read bug report then?:)14:32
spatelwant*14:32
spatelyes i did read14:32
noonedeadpunkEventually this bug report is specifically about that 14:32
noonedeadpunkas once host is rebooted - ovs change hostname14:32
spatelhow about i can verify right now 14:33
noonedeadpunksure, o rush14:33
noonedeadpunk*no14:33
spatelwhat test you want to run?14:33
spatelchange hostname of OS and reboot ?14:33
noonedeadpunkoh... you would need to have different output for hostname and hostname --fqdn first )14:33
noonedeadpunkso basically AIO way of things14:34
spatelhttps://paste.opendev.org/show/807688/14:34
noonedeadpunkyou won't reproduce issue 14:35
noonedeadpunkso. hostname == os-compute-1 and hostname --fqdn == os-compute-1.v1v0x.net. Then `ovs-vsctl set open_vswitch . external-ids:hostname='os-compute-1'`14:36
noonedeadpunkafter reboot you will get external-ids:hostname='os-compute-1.v1v0x.net'14:36
noonedeadpunkI think we actually have 2 issues now )14:36
spatelhmm14:37
noonedeadpunk1st is that we always set external-ids:hostname='os-compute-1' ? And the second is that OVS does not preserve hostname14:37
spatelhmm 14:40
spatelwe don't have direct proof ovs doesn't preserve hostname 14:41
noonedeadpunkit's claimed in bug report at least...14:41
spatelif that is the case then we can add command in ovs systemd to set hostname 14:41
noonedeadpunkI'd need to spawn sandbox to prove that14:41
noonedeadpunkyeah, or maybe it respects some ENV?14:41
noonedeadpunklike OVS_EXTRA ?14:42
spatelyes we can add something in systemd to make it work 14:42
spatellet me run some experiments and see.. i think we both on same page :)14:42
spatellets me go back to my lab and play with it14:43
spatelOVS by default read hostname --fqdn 14:43
noonedeadpunkYeah I guess we are now! Thanks for checking that in!14:43
noonedeadpunkI think it's more about how C reads hostname - it's always fqdn, yes14:44
noonedeadpunkbut nova and other services use python way, which would be different anyway14:44
spatelif somehow we can also tell nova to use that then things can get aligned14:44
noonedeadpunkand cinder and neutron and etc etc etc14:45
spatelovn doesn't care about cinder and neutron hostname 14:45
noonedeadpunkSo nova has actually overcome that with libvirt14:45
spatelovn only care about what compute nodes reporting 14:45
noonedeadpunkYes, but nova does kind of?14:45
spatelYes only compute not api services etc14:45
noonedeadpunkso then we would need to change all services to align ovs?14:45
spateli don't know if all other service care about hostname or not but i have noticed only issue with nova-compute hostname which run openvswitch service and that is where name should be match 14:47
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: [DNM] Add Vault role support  https://review.opendev.org/c/openstack/openstack-ansible/+/80078714:47
spatelnova-compute report hostname to ovn_southd and that is how neutron knows where to bind port, if that hostname doesn't match then neutron failed bind14:48
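A minimal sketch of the check implied here, assuming the comparison is an exact string match between the host name nova-compute registers and the value stored in OVS. `ovs-vsctl get open_vswitch . external-ids:hostname` is the read counterpart of the `set` command quoted earlier; the function names are hypothetical.

```python
import subprocess


def read_ovs_hostname() -> str:
    """Read external-ids:hostname from the local OVS database.

    Needs ovs-vsctl on the host; only called in the __main__ block below.
    """
    out = subprocess.run(
        ["ovs-vsctl", "get", "open_vswitch", ".", "external-ids:hostname"],
        check=True, capture_output=True, text=True,
    ).stdout
    # ovs-vsctl may print string values wrapped in double quotes.
    return out.strip().strip('"')


def hostnames_match(nova_host: str, ovs_hostname: str) -> bool:
    """Return True when both sides report the same name (exact match)."""
    return nova_host.strip() == ovs_hostname.strip().strip('"')


if __name__ == "__main__":
    import socket
    # socket.gethostname() is what noonedeadpunk says nova uses.
    print(hostnames_match(socket.gethostname(), read_ovs_hostname()))
```

With the values from the bug report, `compute3` against `compute3.openstack.local` is a mismatch, which is the post-reboot state being discussed.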
noonedeadpunkYeah, but I mean I'd rather make ovs match rest of openstack than the other way14:48
noonedeadpunkit's really same issue kind of as hypervisor names are reported with libvirt, but nova knows how to work with that properly https://paste.opendev.org/show/807689/14:49
spatelThis is it... that is the problem.. 14:49
spatelmultiple hostname representation 14:50
noonedeadpunkIt's not a problem until you use ovn:)14:50
spatelI believe this is the issue with AIO only... i haven't seen this issue in production 14:50
noonedeadpunkI have all my prods set that way (except really several)14:50
spatelI think for now we can remove this patch and that will pass all build related aio - https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/79300914:51
noonedeadpunkSo it's really about if you follow RFC or just go convenient way...14:51
spateland we can make doc for production so people will be aware 14:51
noonedeadpunkI think we still need to set nodename? Or ansible_facts['hostname'] gives correct result for you?14:52
spatelansible_facts['hostname'] giving me short name 14:53
spatelif we keep ansible_facts['hostname'] then aio will pass all OVN patches 14:53
noonedeadpunkso I guess 2 things: 1. replace hostname with nodename 2. make ovs respect hostname that was set (most likely with systemd override)14:53
noonedeadpunkdoes this make sense?14:54
spateldo you want me to try this with nodename.  -> external-ids:hostname='{{ ansible_facts['nodename'] }}'14:54
spateland re-deploy aio14:54
noonedeadpunkyeah14:54
spatellet me deploy aio with nodename and see if it pass or fail14:55
noonedeadpunkI think it would be fine both for AIO and for scenario where hostname == hostname --fqdn14:55
spatelyes..14:55
spatellets me spin up my lab again and verify 14:55
opendevreviewDmitriy Rabotyagov proposed openstack/ansible-role-vault master: Initial commit to Vault role  https://review.opendev.org/c/openstack/ansible-role-vault/+/80079214:55
anskiyansible_facts['hostname'] would always return short name: https://github.com/ansible/ansible/blob/stable-2.11/lib/ansible/module_utils/facts/system/platform.py#L52-L5414:57
noonedeadpunkgreat link!15:01
noonedeadpunkyes, we have to use nodename15:01
spateli am going to try and will post result here... to see how it goes 15:04
anskiyplatform and socket are both standard python modules; And according to straces from these: https://paste.opendev.org/show/807690/ it looks like socket.getfqdn() is, like, real networking stuff opposed to platform.node(), which looks like just parsing output from uname syscall15:06
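A short sketch of the three views of the host name discussed above, using only the standard library. The mapping of `hostname` to `platform.node().split('.')[0]`, `nodename` to `platform.node()` and `fqdn` to `socket.getfqdn()` reflects my reading of the Ansible source anskiy linked and should be treated as an assumption; the function name and the optional arguments are hypothetical and exist only to make the logic testable.

```python
import platform
import socket


def hostname_views(node=None, fqdn=None) -> dict:
    """Return the host name as the different lookups would report it.

    `node` and `fqdn` default to the live values; passing them in lets
    the derivation be checked without touching the local machine.
    """
    node = platform.node() if node is None else node
    fqdn = socket.getfqdn() if fqdn is None else fqdn
    return {
        "nodename": node,                # untrimmed node name (uname)
        "hostname": node.split(".")[0],  # always the short form
        "fqdn": fqdn,                    # resolver-based, depends on hosts/DNS
    }


if __name__ == "__main__":
    for key, value in hostname_views().items():
        print(f"{key}: {value}")
    # What noonedeadpunk says nova uses:
    print(f"socket.gethostname(): {socket.gethostname()}")
```

Because `hostname` is trimmed at the first dot, it can never equal an FQDN node name, which is why the discussion settles on `nodename`.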
opendevreviewMerged openstack/openstack-ansible-os_keystone master: Fix oidc scope misspelling in newer releases  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/80160415:10
opendevreviewMerged openstack/openstack-ansible-os_adjutant master: Run handlers only against present services  https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/80118815:13
opendevreviewMerged openstack/openstack-ansible-os_octavia master: Fix self-signed certs distribution  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/80150515:16
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia stable/wallaby: Fix self-signed certs distribution  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/80188915:37
opendevreviewDmitriy Rabotyagov proposed openstack/ansible-role-vault master: Initial commit to Vault role  https://review.opendev.org/c/openstack/ansible-role-vault/+/80079215:49
opendevreviewDmitriy Rabotyagov proposed openstack/ansible-role-vault master: Initial commit to Vault role  https://review.opendev.org/c/openstack/ansible-role-vault/+/80079215:52
spatelnoonedeadpunk around ?20:01
admin1my flat network adding creates a bridge .. but that bridge does not connect to any physical network 20:37
admin1i added the network like this:  https://pastebin.com/wjkJqwqZ20:38
admin1so br-vlan has eth1 which is where the public ips are accessible  ( directly connected to the switch ) 20:38
admin1this is a linuxbridge setup 20:39
admin1does not explain flat .. https://docs.openstack.org/openstack-ansible-os_neutron/latest/app-openvswitch.html#configuring-bridges-linux-bridge 20:40
admin1i have been using only br-vlan for the last 5-6 years of using osa .. .first time doing flat 20:41
jrosseradmin1: every time you build an AIO it has a flat provider network https://github.com/openstack/openstack-ansible/blob/master/etc/openstack_deploy/openstack_user_config.yml.aio.j2#L16121:05
jrosserthe “configuring bridges” stuff really is about infrastructure, not so much about neutron21:07
jrosserlook at how eth12 works in the AIO21:09
jrosserit is for the deployer to arrange the flat interface (in this case eth12) to be connected to the right place21:09
jrosserfor the AIO the bootstrap scripts wire eth12 to br-vlan with a veth, but that’s totally just for CI convenience21:10
jrosserso set it up how you need it21:11
admin1i have eth1 which has access to the public IPs .. and i can put any bridge on top of it 21:29
admin1isn't that how it works ? 21:29
admin1i tried br-flat and just mentioning eth1.. but  it also did not work 21:29
admin1so i am not sure how flat actually works 21:30
admin1i am going to build one aio now and check 21:30
admin1thank you jrosser.. i will check it out 21:45
jrosseradmin1: neutron wants an interface, too much worrying about bridges here - it’s not that complicated21:58
admin1jrosser -- here is the old one and new one  https://pastebin.com/EJLU4bdz -- both did not work22:07
admin1br-vlan is on top of eth122:07
opendevreviewSatish Patel proposed openstack/openstack-ansible-os_neutron master: Set ovn hostname using nodename facts  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80213422:41
opendevreviewSatish Patel proposed openstack/openstack-ansible-os_neutron master: Change OVN metadata protocol to https  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80213522:54
opendevreviewSatish Patel proposed openstack/openstack-ansible-os_neutron master: Set ovn hostname using nodename facts  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/80213422:57
spatelhope these patches solved all OVN tempest issue 22:58
