art | hi all | 05:50 |
---|---|---|
art | we have installed OSA (Zed) + OVN on 3 controllers and 2 compute nodes (called compute01 and compute03) | 05:50 |
art | ovn-northd and neutron-server are installed in containers (on the infra hosts), and the OVN gateways have been set up on the compute nodes. | 05:50 |
art | https://www.irccloud.com/pastebin/lSSoji45/user%20variable | 05:51 |
art | Also, this is our config of the VLAN (br-provider) and tenant (br-vxlan) networks: | 05:52 |
art | https://www.irccloud.com/pastebin/qyTylsZm/openstack_user_config | 05:52 |
art | On our compute nodes, before deploying OSA, we had created br-provider (using OVS in netplan). Then we let OSA install the compute nodes. | 05:52 |
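(For reference, a minimal netplan sketch of creating br-provider as an OVS bridge outside OSA; the interface name and file path are hypothetical and may differ from the pasted config.)

```yaml
# /etc/netplan/60-ovs-provider.yaml -- hypothetical sketch
network:
  version: 2
  ethernets:
    eno2: {}                # physical uplink for the provider network
  bridges:
    br-provider:
      interfaces: [eno2]
      openvswitch: {}       # render this bridge via OVS rather than a Linux bridge
```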
art | In this scenario, sometimes some of the instances had no access to the external network | 05:52 |
art | Then I added another compute node (called compute02). The only difference between compute02 and the other computes is that I did not create br-provider manually, and let OSA create it. | 05:53 |
art | Currently, the routers of the projects whose first chassis priority is compute02 have access to the external network, but the others do not. | 05:53 |
art | For example: in the following screenshot, chassis_name “75c7ce80-…” is our compute02, and it has access to the external net. | 05:53 |
art | As the default configuration of OVN seems to follow the “Centralized Floating IP” architecture, I tried stopping the ovn-controllers on compute01 and compute03, expecting the router on compute02 to route the traffic of the projects whose first chassis priority is not compute02, such as this one (the first priority of this router is the chassis ‘0d3c2a0a-…’, which is our compute01): | 05:54 |
art | It seems I can not send screenshots here! :P | 05:55 |
art | anyway, stopping the ovn-controllers on all three compute nodes did not have any effect on the instances' access to the external network. It seems the OVN gateway cannot route the traffic to its second- and third-priority gateway chassis. | 05:55 |
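(The gateway chassis priorities being discussed can be inspected from any host with access to the OVN databases; a sketch, with a hypothetical router port name:)

```sh
# List all registered chassis and their port bindings
ovn-sbctl show

# Print the gateway chassis of a router's external port, sorted by priority
ovn-nbctl lrp-get-gateway-chassis lrp-<external-port-uuid>
```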
art | Not sure if I am making a mistake, or something is wrong with our config. | 05:58 |
art | Do you have any idea what’s wrong over here? | 05:58 |
jrosser | does your use of *controller_hosts and *compute_hosts in user_variables actually do what you expect? | 05:59 |
jrosser | what I mean is, where are those defined, and how do you think ansible loads the values? | 06:00 |
art | we have defined them in openstack_user_config: | 06:14 |
art | https://www.irccloud.com/pastebin/RDfVaDm7/ | 06:15 |
jrosser | ok, so openstack_user_config is the input to the OSA dynamic inventory which creates any required ansible groups and so on | 06:33 |
art | exactly | 06:34 |
jrosser | user_variables sets ansible vars that you use to override things in group_vars / host_vars or ansible role defaults | 06:34 |
jrosser | so defining network_northd-hosts in user_variables won’t define an ansible host group, nor is *compute_hosts an ansible variable that you can refer to outside openstack_user_config | 06:36 |
jrosser | I think it looks like you are trying to define host groups in user_variables, which won’t work | 06:38 |
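(For context: the `*compute_hosts` syntax is a plain YAML alias, resolved by the YAML loader within the one file that defines the matching anchor, never by Ansible. A minimal sketch of how such anchors are typically used in openstack_user_config.yml; host names and IPs are hypothetical:)

```yaml
compute_hosts: &compute_hosts
  compute01:
    ip: 172.29.236.11
  compute03:
    ip: 172.29.236.13

# The alias makes the OVN gateway group identical to compute_hosts
network-gateway_hosts: *compute_hosts
```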
art | oh, I'm sorry, I wrote it incorrectly here! The ovn groups have been defined in openstack_user_config, not user_variables. I just made a mistake in writing them here | 06:39 |
art | the config related to ovn in user_variables is just these lines: | 06:39 |
art | https://www.irccloud.com/pastebin/aQmtD267/ | 06:40 |
jrosser | anyway I think you need ovn-controller running on all chassis regardless of whether they are gateways or not | 06:41 |
art | nothing else related to OVN is defined in user_variables. All other configs have been defined in openstack_user_config, exactly as you mentioned | 06:41 |
art | it means I need to have the ovn-controller daemon on all of my compute nodes, which I currently do | 06:42 |
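(A quick sketch for verifying that every compute has registered as a chassis, run wherever ovn-sbctl can reach the southbound DB:)

```sh
# Each compute should appear with the hostname neutron expects
ovn-sbctl --columns=name,hostname list Chassis

# And on each compute node itself
systemctl status ovn-controller
```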
art | Is there anything else that should be considered? | 06:43 |
jrosser | sadly I have very limited experience with OVN | 06:43 |
jrosser | if your compute02 works reliably then this maybe points to some difference in how br-provider was created between netplan and letting OSA do it | 06:44 |
art | that's ok. Thank you very much for taking the time to help me with this issue. really appreciate that | 06:45 |
art | yeah, it seems so. I'll try changing all the other computes to this config for br-provider and let you know the result :) | 06:45 |
art | Thanks again | 06:45 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible stable/2023.1: haproxy: fix health checks for serialconsole in http mode https://review.opendev.org/c/openstack/openstack-ansible/+/891452 | 07:05 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Remove deprecated variables https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/891567 | 08:03 |
jrosser | arxcruz: ^ looking at codesearch this might be a breaking change, but I am assuming that the design error mentioned in the commit message here has been addressed https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/884172 | 08:05 |
noonedeadpunk | btw https://review.opendev.org/q/topic:osa%252Fopenstack_resources is another breaking thing | 08:21 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-haproxy_server master: Add HTTP/2 support for frontends/backends https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/891572 | 08:37 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-haproxy_server master: Add HTTP/2 support for frontends/backends https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/891572 | 09:21 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Enable HTTP/2 for TLS-covered frontends https://review.opendev.org/c/openstack/openstack-ansible/+/891575 | 09:31 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Update haproxy healthcheck options https://review.opendev.org/c/openstack/openstack-ansible/+/887285 | 09:36 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Rename includelist/excludelist file path vars https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/891578 | 10:00 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Allow include/exclude lists to be defined in many variables https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/891579 | 10:00 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant stable/zed: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891447 | 11:05 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant stable/2023.1: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891279 | 11:06 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant master: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/888985 | 11:06 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant stable/2023.1: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891279 | 11:06 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant stable/zed: Install mysqlclient devel package https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891447 | 11:07 |
jrosser | i don't have much luck at all with tempest basic server ops on a baremetal AIO :/ | 11:38 |
jrosser | lots of neutron failed port binding | 11:38 |
noonedeadpunk | I've just spawned one. | 11:48 |
noonedeadpunk | Even with Octavia | 11:48 |
jrosser | the tempest vars patches seem to be working | 11:50 |
jrosser | but i backed out to only basic server ops as it was all very breaking | 11:50 |
jrosser | i see this https://paste.opendev.org/show/bWJPQOl7ddixcg6TXrKp/ | 11:50 |
jrosser | but then because of OVN I'm a bit lost about what to look at next | 11:51 |
noonedeadpunk | I actually do recall seeing that in CI, but it was solved somehow... | 11:52 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Ensure test exclusion file is removed when there are no exclusions https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/891586 | 11:52 |
jrosser | like there's no neutron agent any more to look in the logs of | 11:52 |
noonedeadpunk | Usually neutron-server and then ovn-northd are good starting points | 11:53 |
noonedeadpunk | Ah. I think it was a hostname mismatch when I saw it in CI | 11:53 |
noonedeadpunk | the one defined for ovsdb did not match what neutron expected | 11:53 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-os_neutron/commit/74b0884fc232aa96f601b4c24c3e36f3fba4f1bb | 11:54 |
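(A sketch of how such a mismatch can be checked; external_ids:hostname is the key ovn-controller uses, when set, to name the chassis:)

```sh
# On the compute node: the hostname the chassis registers with
ovs-vsctl get Open_vSwitch . external_ids:hostname
hostname -f

# Compare against the hosts neutron knows about
openstack network agent list
```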
jrosser | weirdly, I can manually boot a cirros and apparently attach it to one of the leftover tempest networks | 11:54 |
noonedeadpunk | Actually, we likely can revert this now... | 11:54 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Revert "Workaround ovs bug that resets hostname with add command" https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/891453 | 11:57 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Do not override tempest_plugins for AIO scenarios https://review.opendev.org/c/openstack/openstack-ansible/+/891587 | 11:57 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Do not override tempest_plugins for AIO scenarios https://review.opendev.org/c/openstack/openstack-ansible/+/891587 | 11:58 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Revert "Workaround ovs bug that resets hostname with add command" https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/891454 | 11:59 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Revert "Workaround ovs bug that resets hostname with add command" https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/891453 | 12:00 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Revert "Workaround ovs bug that resets hostname with add command" https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/891453 | 12:00 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Adopt for usage openstack_resources role https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/889879 | 13:26 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_gnocchi master: Use proper galera port in configuration https://review.opendev.org/c/openstack/openstack-ansible-os_gnocchi/+/890100 | 13:33 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_aodh master: Use proper galera port in configuration https://review.opendev.org/c/openstack/openstack-ansible-os_aodh/+/890093 | 13:34 |
noonedeadpunk | to unblock adjutant role, we'd need to start merging from Zed: https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891447 | 13:36 |
noonedeadpunk | otherwise upgrade jobs are not happy | 13:36 |
spatel | Folks! Do you have any suggestion of good JBOD card for Dell server? I am looking to use for Ceph. | 13:46 |
jrosser | LSI Logic/Broadcom HBAs have been OK for us | 13:51 |
jrosser | spatel: ^ you mean like SAS adaptor to connect a JBOD chassis? | 13:51 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_adjutant master: Stop reffering _member_ role https://review.opendev.org/c/openstack/openstack-ansible-os_adjutant/+/891462 | 14:01 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Add openstack_resources role skeleton https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/878794 | 14:17 |
spatel | jrosser, what do you mean by JBOD chassis? | 14:22 |
jrosser | spatel: you've not been very specific - i assumed you meant you had some disk chassis to add to a dell server | 14:23 |
spatel | I have a Dell server with a RAID controller that doesn't have a JBOD mode. I have two options: configure each drive as a separate RAID0 and expose it to the OS, or buy a new card which has a JBOD mode. | 14:23 |
spatel | I have 4x 6TB disks on each Dell server for OSDs | 14:24 |
jrosser | the firmware on some dell controllers can be reflashed to IT mode | 14:25 |
jrosser | on my r730 i can configure the drives as "non raid" and they just get passed through to the OS | 14:26 |
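(A hedged sketch for confirming from the OS that "non raid" drives really are passed through rather than wrapped in single-disk virtual disks; device names are examples:)

```sh
# TRAN should show sata/sas for passed-through drives
lsblk -o NAME,SIZE,TYPE,TRAN

# Should report the drive's own vendor/model, not a RAID LUN
smartctl -i /dev/sda
```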
spatel | Oh!! I see | 14:32 |
spatel | Do you have dedicated mon nodes, or are they running on the openstack controllers? | 14:33 |
spatel | I have 5 ceph nodes and am thinking of running the controller node on the OSD nodes :( | 14:33 |
spatel | mon is very lightweight and only requires processing when an OSD or any node is down | 14:34 |
noonedeadpunk | usually having mon/mgr on osd is not great | 14:34 |
noonedeadpunk | as then the upgrade process is quite cumbersome. | 14:35 |
noonedeadpunk | or well, maybe with cephadm it's not a concern anymore | 14:35 |
jrosser | i have dedicated mon nodes in that they are in LXD on some other hosts sharing with other stuff | 14:36 |
noonedeadpunk | yeah, you totally should have some kind of isolation between mon and osd services if you decide to co-locate them on the same hardware | 14:36 |
jrosser | but they are in a totally separate failure domain from the OSD | 14:36 |
spatel | In cephadm they are kind of isolated anyway in containers | 14:40 |
noonedeadpunk | also rgw/mds are not _that_ lightweight in case you've ever considered using them | 14:41 |
spatel | I also have a question related to the OS: which one is better for ceph, CentOS vs Ubuntu? | 14:41 |
* jrosser has waaay more hardware for rgw than mons | 14:41 |
jrosser | like 10x | 14:41 |
spatel | I heard podman is the best container system for Ceph, so I chose the CentOS distro | 14:42 |
noonedeadpunk | Well, I guess it would highly depend on the use case and traffic served | 14:42 |
noonedeadpunk | but cephadm will deploy centos containers anyway? | 14:42 |
noonedeadpunk | there's no selection from what I heard? | 14:43 |
noonedeadpunk | and dockerhub contains pretty outdated images, so it's also pretty much only quay.io that should be used | 14:44 |
spatel | what do you mean by depending on the use case or traffic? | 14:51 |
spatel | containers are just for the control plane, right? | 14:51 |
noonedeadpunk | I think pretty much everything runs in containers? Including OSDs? | 14:52 |
jrosser | that was probably to do with my very large number of rgw | 14:52 |
noonedeadpunk | not 100% sure though | 14:52 |
spatel | I love CentOS but am just worried that IBM will mess with it again in the future :) | 14:54 |
noonedeadpunk | but there're no other options with cephadm? | 14:54 |
noonedeadpunk | yes, you can run containers pretty much anywhere, but the containers themselves will have centos anyway? | 14:55 |
noonedeadpunk | just check their manifests: https://quay.io/repository/ceph/ceph/manifest/sha256:11e9bcdd06cfd4f6b0fef3da5c2a3f0f87d7a48c043080d6f5d02f7010ad72eb | 14:58 |
spatel | Yes! Ubuntu requires a dedicated repository to run podman | 14:58 |
noonedeadpunk | 17.2.6 running on CentOS 8 Stream | 14:59 |
spatel | CentOS does it natively | 14:59 |
jrosser | no wonder there are nervous people tbh | 14:59 |
spatel | I don't trust IBM tbh :) | 15:00 |
noonedeadpunk | or well: https://github.com/ceph/ceph-container/tree/main/ceph-releases/quincy | 15:00 |
spatel | They are business people and they can change policy with the stroke of a pen | 15:00 |
noonedeadpunk | this is the repo these containers are built from - rhel/ibm/centos | 15:01 |
noonedeadpunk | and not only can they - they have already done that lately | 15:02 |
spatel | Let's stick with Ubuntu then... to hell with CentOS | 15:02 |
noonedeadpunk | but as I said - you still have CentOS inside images :) | 15:03 |
noonedeadpunk | or better say - inside containers | 15:03 |
noonedeadpunk | so as long as you're running cephadm it doesn't really matter what will be on host | 15:05 |
spatel | noonedeadpunk No... I have deployed cephadm and it's running ubuntu-based containers | 15:06 |
spatel | https://paste.opendev.org/show/bOZDh0p3sTj62acovqxh/ | 15:06 |
noonedeadpunk | but uname will show the host kernel | 15:06 |
noonedeadpunk | as containers are not virtualizing that | 15:07 |
jrosser | try some dnf command :) | 15:07 |
noonedeadpunk | what's `cat /etc/*release*`? | 15:07 |
spatel | CentOS Stream release 8 | 15:08 |
opendevreview | Merged openstack/openstack-ansible master: Update haproxy healthcheck options https://review.opendev.org/c/openstack/openstack-ansible/+/887285 | 15:08 |
spatel | wtf... | 15:08 |
jrosser | spatel: this is how containers work, right? | 15:09 |
jrosser | there's some $you-shouldnt-have-to-care rootfs / runtime and it uses the host kernel | 15:10 |
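(A small sketch illustrating this on a cephadm host, assuming a standard install: the rootfs inside the container is CentOS, while the kernel is the host's:)

```sh
# Inside a ceph container: the image's distro (CentOS Stream)
sudo cephadm shell -- cat /etc/os-release

# On the host and inside the container alike: the same host kernel
uname -r
```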
spatel | +1 | 15:10 |
jrosser | same is true for LXC/LXD | 15:10 |
jrosser | it's completely trivial to run another OS in LXC/LXD on ubuntu or whatever | 15:11 |
jrosser | it's just an OSA thing where we take the choice to tightly couple the OS in the container to the host | 15:11 |
noonedeadpunk | Well, we technically could de-couple that as well... But then we'd need to pull in container images from linuxcontainers.org rather than building them | 15:12 |
spatel | currently we have ceph + cinder but in the future we want to set up NFS storage | 16:09 |
spatel | cinder with NFS requires mounting the NFS volume on all compute nodes, correct? | 16:10 |
noonedeadpunk | yup | 16:16 |
noonedeadpunk | which ends up in a need to reboot all computes once NFS becomes unreachable | 16:17 |
spatel | holy crap! | 16:18 |
spatel | why not a lazy unmount? | 16:18 |
spatel | Does cinder create an LVM volume on top of the NFS share and give it to VMs? | 16:19 |
spatel | Trying to understand how an NFS share plugs into VMs :) | 16:20 |
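(For context: the cinder NFS driver uses no LVM at all; it mounts the share and stores each volume as a plain file, raw or qcow2, which qemu on the compute node opens directly once the same share is mounted there. A minimal cinder.conf sketch with a hypothetical backend name and share-list path:)

```ini
[nfs-backend]
volume_driver = cinder.volume.drivers.nfs.NfsDriver
volume_backend_name = nfs
# one "host:/export" entry per line in this file
nfs_shares_config = /etc/cinder/nfs_shares
nfs_mount_point_base = /var/lib/cinder/mnt
```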
jonher | hi, I was running a deployment yesterday and finished it up today after hitting an issue with the ceph-rgw-keystone-setup playbook. I'm pretty sure this has worked in previous deployments on the same git branch/version when running the 'setup-openstack' playbook: https://github.com/openstack/openstack-ansible/blob/xena-em/playbooks/ceph-rgw-keystone-setup.yml#L17 | 16:31 |
jonher | here hosts would evaluate to "localhost" and execute there; then, as the vars_files load, "openstack_service_setup_host" will become the first utility container as defined in defaults/source_install.yml | 16:31 |
jonher | what I ended up doing was setting "openstack_service_setup_host" in user_variables.yml to the first utility container, just like source_install.yml does. Otherwise it would attempt to use /openstack/utility-<version>/bin/python on the osa-deployer host to run the task. | 16:31 |
jonher | But this seems like a bug? And why has it previously worked? Is the variable somehow available from a previous task when running setup-openstack.yml in one pass or similar? | 16:31 |
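(A sketch of the workaround jonher describes, pinning the setup host in user_variables.yml the same way defaults/source_install.yml does; the group name follows the usual OSA inventory and should be treated as an assumption:)

```yaml
# user_variables.yml -- hypothetical workaround sketch
openstack_service_setup_host: "{{ groups['utility_all'][0] }}"
```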
noonedeadpunk | jonher: huh, interesting. But the playbook is expected to load defaults/source_install.yml according to the code? Though now I'm not so sure whether ansible evaluates vars differently, i.e. including vars only after resolving hosts | 20:07 |
noonedeadpunk | as that would result in the behaviour you've mentioned | 20:07 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.1: Bump keystone version to unblock upgrade jobs https://review.opendev.org/c/openstack/openstack-ansible/+/891633 | 20:11 |
noonedeadpunk | though I wonder how that worked in CI, as I believe we run this code in the ceph job, and it's LXC-based, so localhost should not be a valid target... | 20:12 |
*** bjoernt_ is now known as bjoernt | 21:41 |