opendevreview | Michal Arbet proposed openstack/kolla master: Fix permissions for ironic metrics https://review.opendev.org/c/openstack/kolla/+/940514 | 06:16 |
---|---|---|
opendevreview | Michal Arbet proposed openstack/kolla master: Fix permissions for ironic metrics https://review.opendev.org/c/openstack/kolla/+/940514 | 06:24 |
opendevreview | Michal Arbet proposed openstack/kolla master: Install pycadf from pypi package https://review.opendev.org/c/openstack/kolla/+/940605 | 06:31 |
opendevreview | Michal Arbet proposed openstack/kolla master: Fix permissions for ironic metrics https://review.opendev.org/c/openstack/kolla/+/940514 | 06:31 |
kevko | frickler: replied to your comments | 06:44 |
opendevreview | Roman Krcek proposed openstack/kolla-ansible master: Merge of container_facts modules https://review.opendev.org/c/openstack/kolla-ansible/+/912460 | 08:47 |
opendevreview | Roman Krcek proposed openstack/kolla-ansible master: Add action for getting container names list https://review.opendev.org/c/openstack/kolla-ansible/+/924389 | 08:47 |
opendevreview | Roman Krcek proposed openstack/kolla-ansible master: Add container engine migration scenario https://review.opendev.org/c/openstack/kolla-ansible/+/836941 | 08:47 |
opendevreview | Grzegorz Koper proposed openstack/kolla-ansible stable/2024.2: Fix Grafana datasource update https://review.opendev.org/c/openstack/kolla-ansible/+/940856 | 08:52 |
opendevreview | Grzegorz Koper proposed openstack/kolla-ansible stable/2024.1: Fix Grafana datasource update https://review.opendev.org/c/openstack/kolla-ansible/+/940857 | 08:52 |
opendevreview | Grzegorz Koper proposed openstack/kolla-ansible stable/2023.2: Fix Grafana datasource update https://review.opendev.org/c/openstack/kolla-ansible/+/940858 | 08:52 |
opendevreview | Pierre Riteau proposed openstack/kayobe stable/2024.2: Support forcing time synchronisation https://review.opendev.org/c/openstack/kayobe/+/940869 | 10:05 |
opendevreview | Matt Crees proposed openstack/kayobe master: Drop kolla-tags and kolla-limit https://review.opendev.org/c/openstack/kayobe/+/935669 | 11:06 |
kevko | frickler: can u please review comments so we can continue reviewing/merging rabbitmq queue manager ? | 11:52 |
opendevreview | Verification of a change to openstack/kayobe master failed: Replace pause with chronyc waitsync in ntp sync https://review.opendev.org/c/openstack/kayobe/+/940646 | 12:05 |
opendevreview | Merged openstack/kolla-ansible master: Fix Grafana datasource update https://review.opendev.org/c/openstack/kolla-ansible/+/940133 | 12:06 |
andreykurilin | Hi folks! Can someone suggest the direction to dig? One of CI jobs failed with `NoneType: None` message on `kolla-ansible reconfigure -i /etc/kolla/inventory -vvv` step. | 12:11 |
opendevreview | Grzegorz Koper proposed openstack/kolla-ansible stable/2024.2: Fix Grafana datasource update https://review.opendev.org/c/openstack/kolla-ansible/+/940856 | 12:36 |
opendevreview | Grzegorz Koper proposed openstack/kolla-ansible stable/2024.1: Fix Grafana datasource update https://review.opendev.org/c/openstack/kolla-ansible/+/940857 | 12:36 |
opendevreview | Grzegorz Koper proposed openstack/kolla-ansible stable/2023.2: Fix Grafana datasource update https://review.opendev.org/c/openstack/kolla-ansible/+/940858 | 12:37 |
opendevreview | Michal Arbet proposed openstack/kolla master: Fix permissions for ironic metrics https://review.opendev.org/c/openstack/kolla/+/940514 | 12:42 |
opendevreview | Michal Arbet proposed openstack/kolla master: Install pycadf from pypi package https://review.opendev.org/c/openstack/kolla/+/940605 | 12:43 |
opendevreview | Michal Arbet proposed openstack/kolla master: Fix permissions for ironic metrics https://review.opendev.org/c/openstack/kolla/+/940514 | 12:43 |
opendevreview | Jakub Darmach proposed openstack/kolla stable/2024.1: Add support for Ubuntu 24.04 LTS https://review.opendev.org/c/openstack/kolla/+/932386 | 14:35 |
dcapone2004 | when upgrading rabbitmq between 2023.1 and 2023.2, the docs reference om_enable_rabbitmq_quorum_queues and om_enable_rabbitmq_high_availability ... neither of these options are in my current globals.yml file on 2023.1.... is this correct? | 15:11 |
dcapone2004 | I am also not seeing that variable in etc_examples folder of the 2023.2 release, so I am trying to follow the docs but I am unsure where these variables are set and how to safely upgrade rabbitmq for a transition to 2023.2 | 15:22 |
opendevreview | Andrey Kurilin proposed openstack/kolla-ansible master: Makes Grafana database ssl options configurable https://review.opendev.org/c/openstack/kolla-ansible/+/940886 | 15:23 |
priteau | dcapone2004: The quorum queue option was added as a backport after release: https://review.opendev.org/c/openstack/kolla-ansible/+/902387 | 15:25 |
priteau | You can probably diff the changes to bring your own globals.yml up to date. | 15:26 |
dcapone2004 | priteau: I assume I need that variable in my globals.yml to safely upgrade? | 15:26 |
priteau | There is a precheck that will complain if you are not using either om_enable_rabbitmq_quorum_queues or om_enable_rabbitmq_high_availability | 15:27 |
priteau | You need to pick one | 15:27 |
dcapone2004 | it is a bit confusing as the "upgrade" page of the 2023.2 docs do not make mention of separately upgrading rabbitmq unless you are performing a SLURP update, but if you did into the rabbitmq configuration page, it has a separate specific procedure for upgrading rabbitmq, so I am a little confused if the rabbitmq separate upgrade is needed | 15:28 |
dcapone2004 | dig* | 15:28 |
kevko | andreykurilin: link ? | 15:36 |
andreykurilin | kevko: https://zuul.opendev.org/t/openstack/build/1616a7134fe84526a50fdf185329255c from https://review.opendev.org/c/openstack/kolla-ansible/+/940825 | 15:37 |
dcapone2004 | priteau: also, fyi it appears that the inclusion of the variable itself into the globals.yml was left off the merge and updates as I reviewed the last version of globals.yml in the KA repo and it also does not include the variables, so diff I do not think would help here | 15:37 |
dcapone2004 | it actually isn't included in any of the globals.yml files up to and including the current master repo .... so does that variable get defined in globals.yaml or elsewhere? | 15:41 |
kevko | andreykurilin: non-related i think | 15:42 |
kevko | andreykurilin: mariadb didn't create a cluster | 15:43 |
kevko | andreykurilin: 2025-02-05 19:46:13 2 [ERROR] WSREP: ./gcs/src/gcs.cpp:s_join():990: Sending JOIN failed: -103 (Connection was closed). | 15:43 |
kevko | 2025-02-05 19:46:13 2 [Warning] WSREP: Failed to JOIN the cluster after SST gcs_join(01e127e1-e3f9-11ef-9365-b799a7089f5d:21) failed | 15:43 |
kevko | andreykurilin: https://ee4b9eb34e1fd5d099ea-4540e2fcfebb146227dc28f41e3112d0.ssl.cf5.rackcdn.com/940825/1/check/kolla-ansible-ubuntu-mariadb/1616a71/primary/logs/kolla/mariadb/mariadb.txt | 15:43 |
kevko | andreykurilin: so monitor user was not created ..and playbook failed | 15:43 |
kevko | andreykurilin: just recheck | 15:43 |
bbezak | frickler: got issue with that change - https://review.opendev.org/c/openstack/kolla-ansible/+/910503. not sure why it can't workflow - as related changed got merged - however somehow on the different patchset | 15:44 |
dcapone2004 | ok, in digging further, I think I understand....on 2023.1 by default transient queues are enabled.... ONLY IF you want to switch to durable queues, do the extra steps need to be followed | 15:48 |
dcapone2004 | however, if upgrading to 2023.2, to maintain transient queues, I need to add om_enable_rabbitmq_quorum_queues : false and om_enable_rabbitmq_high_availability: true to the globals.yml file before upgrading | 15:49 |
dcapone2004 | am I also correct that switching to durable queues requires downtime??? The procedure indicates a need to stop services that rely on rabbitmq to make the change which afaik includes neutron and also afaik if neutron containers are stopped, network traffic will stop | 15:50 |
andreykurilin | kevko: thank you for looking! | 15:53 |
priteau | dcapone2004: that's correct, it requires a full stop of all services using RMQ | 16:00 |
dcapone2004 | priteau: thanks....am I correct that I can safely upgrade without downtime, by simply using KA upgrade IF I set om_enable_rabbitmq_quorum_queues : false and om_enable_rabbitmq_high_availability: true in globals.yml with no other special steps? | 16:02 |
frickler | bbezak: depends-on isn't merged https://review.opendev.org/c/openstack/kolla-ansible/+/899615 | 16:05 |
bbezak | Right. I didn't notice in this long commit msg. Thank you | 16:06 |
-opendevstatus- NOTICE: nominations for the OpenStack PTL and TC positions are now open, for details see https://governance.openstack.org/election/ | 16:08 | |
opendevreview | Massimiliano Favaro-Bedford proposed openstack/kayobe master: Add variable to allow image builds of different platforms. https://review.opendev.org/c/openstack/kayobe/+/940894 | 16:13 |
kevko | dcapone2004: well, we can't stop services ..we will migrate it on the fly | 16:14 |
kevko | dcapone2004: but we need to patch oslo.messaging i think ...and still not done | 16:14 |
priteau | dcapone2004: If you are not already using ha queues, I think you will need full stop? | 16:15 |
kevko | bbezak: https://review.opendev.org/c/openstack/kolla-ansible/+/940496 ? | 16:15 |
kevko | can u please ? | 16:15 |
dcapone2004 | kevko: the docs as of now say a full stop is needed to go from transient to durable queues....I think your comment is saying that you intend to change this requirement in the future, but it isn't completed yet | 16:16 |
kevko | dcapone2004: well, switch to quorum queues can be fatal ...we have +/- 150 computes and thousands of k8s running on top .... if there is some service down ...it's really fatal for us ... | 16:17 |
kevko | dcapone2004: so, we will probably patch oslo.messaging and provide migration path somehow ...from a code it can be done when I've checked firstly ... | 16:18 |
kevko | dcapone2004: but yeah, official opendev or kolla way don't exist ...just stop ..reset ..restart yeagh | 16:18 |
dcapone2004 | kevko: so safest way to upgrade from 2023.1 to 2023.2 and beyond for the time being is staying on transient queues | 16:19 |
dcapone2004 | kevko: which I can do like any other upgrade in the past, clone 2023.2 repo, merge passwords / globals.yml variables | 16:19 |
dcapone2004 | kevko: add om_enable_rabbitmq_quorum_queues : false and om_enable_rabbitmq_high_availability: true to the globals.yml file | 16:20 |
kevko | dcapone2004: correct | 16:20 |
dcapone2004 | kevko: run kolla-ansible upgrade... | 16:20 |
kevko | dcapone2004: exactly | 16:20 |
kevko | dcapone2004: you can run it until rabbitmq 4.0 will be here in kolla | 16:21 |
dcapone2004 | kevko: awesome, thank you very much....all the rabbitmq noise was stressing me out lol | 16:21 |
kevko | dcapone2004: I like this noise; after all, if everything worked as it should, many of us wouldn't have a job , right ? :D | 16:22 |
dcapone2004 | kevko: valid point | 16:27 |
dcapone2004 | kevko: one last question, does this mean that openstack and KA is going to remain on RMQ 3.13 for the foreseeable future until that issue is fixed as the docs mentioned support for transient queues was dropped in RMQ 4.0? | 16:56 |
dcapone2004 | kevko: I am essentially trying to make sure I do not upgrade to an openstack release that requires the transition to quorum queues and want to know where my eol may be | 16:57 |
opendevreview | Jakub Darmach proposed openstack/kolla stable/2024.1: Add support for Ubuntu 24.04 LTS https://review.opendev.org/c/openstack/kolla/+/932386 | 16:58 |
priteau | dcapone2004: the move to rmq 4.0 is planned for 2025.1 (Epoxy) | 17:01 |
dcapone2004 | ok, so do not upgrade to 2025.1 without having a solution to switch to quorum queues | 17:01 |
dcapone2004 | got it | 17:01 |
dcapone2004 | ok.....last thing I am noticing merging the configs is that the default ceph keyrings seem to have dropped "ceph." from the start of them, my existing keyring files are all ceph.client.<service>.keyring ... do the playbooks now automatically prepend "ceph." to the value specified in the globals.yml or do I need to ensure my globals.yml include the ceph. | 17:38 |
dcapone2004 | trying to look through the playbooks to check, but hoping I could get a good answer here | 17:41 |
priteau | dcapone2004: I believe it is transparent. See https://review.opendev.org/c/openstack/kolla-ansible/+/877413 | 17:44 |
priteau | In particular, see cinder_ceph_backends in ansible/roles/cinder/defaults/main.yml and how its "cluster" key is used in e.g. https://review.opendev.org/c/openstack/kolla-ansible/+/877413/17/ansible/roles/cinder/tasks/external_ceph.yml | 17:45 |
dcapone2004 | priteau: thx for the references, saved me some digging | 17:46 |
opendevreview | Pierre Riteau proposed openstack/kayobe master: Replace pause with chronyc waitsync in ntp sync https://review.opendev.org/c/openstack/kayobe/+/940646 | 17:57 |
dcapone2004 | is the gerrit system the best place to make the suggestion that the rabbitmq queue variables be added into the globals.yml example file and commented for the impacts? | 18:02 |
priteau | Sure, you can submit a patch on Gerrit, core reviewers will give their opinion | 18:16 |
priteau | https://docs.openstack.org/kolla-ansible/latest/contributor/contributing.html | 18:17 |
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline momentarily while we upgrade for a new jeepyb feature and switch our database container image source repository | 18:52 | |
shermanm | hey, I had a (possibly silly) question about dns resolution when using ovs / neutron-dhcp-agent | 20:18 |
shermanm | it seems like dnsmasq defaults to trying to act as a dns resolver from each dhcp instance, but if a subnet has no router attached, then dnsmasq can't reach any upstream resolvers, not even one on the host outside its network namespace | 20:18 |
shermanm | is there a sane way to provide access to external dns resolution for subnets without a router? | 20:18 |
shermanm | the context being that I'm trying to get internal dns resolution to work, and I can't seem to get dnsmasq to handle that case if it can't also reach an upstream resolver | 20:22 |
frickler | shermanm: you can tell dnsmasq to use the host networking, see https://docs.openstack.org/neutron/latest/admin/config-dns-res.html#case-2b-queries-are-forwarded-to-dns-resolver-s-configured-on-the-host | 20:42 |
shermanm | frickler: so, we tried that, in which case it attempts to use the contents of /etc/resolv.conf (removes --no-resolv flag) | 20:43 |
shermanm | but, resolvers on the host, even e.g. 127.0.0.53, aren't accessible from the network namespace where the dnsmasq instance is running | 20:43 |
shermanm | unless a router is attached to the relevant subnet | 20:44 |
shermanm | this is only an issue for us because, in the case that you create 2 subnets with default settings, only one with a router, and then attach an instance to both of them, dhcp will advertise a dns server for both | 21:09 |
shermanm | and dns for the non-routable subnet will take 10s before rejecting any query, causing lots of "hangs" on the guest os querying it | 21:09 |
frickler | shermanm: ah. you can do "openstack subnet set --dns-nameserver 0.0.0.0" in order to make dhcp not send a dns server | 21:31 |
shermanm | yep! it's not an issue for our more expert users, but very non-intuitive for more novice ones. tbh if we could just set a default nameserver for subnets to use, that would fix the user experience | 21:33 |
frickler | shermanm: yes, I would have liked to switch the meaning of 0.0.0.0 and --no-dns-nameserver, but that wasn't possible in order to stay backwards compatible, so it had to be this kind of twisted operation | 21:36 |
frickler | how would having a default nameserver help in the isolated subnet case? | 21:37 |
shermanm | it would help because we could then disable resolution in dnsmasq entirely, without breaking the case where a subnet is routable, but didn't have a custom nameserver set via --dns-nameserver | 21:39 |
shermanm | routable subnets just use the default, and non-routable ones fail cleanly, because you get "no route available" to the upstream resolver, instead of a timeout before failing | 21:39 |
frickler | shermanm: I see, it might make sense to add such an option to neutron, do you want to propose an RFE? but also, mostly everyone seems to be going the OVN path and that messes up DNS even more badly | 21:44 |
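
The RabbitMQ precheck described at 15:27 ("You need to pick one") can be sketched roughly as follows. This is a hypothetical illustration, not the actual kolla-ansible precheck: only the two variable names come from the discussion, while the function name, the "exactly one enabled" rule, and the treatment of a missing variable as false are assumptions made for the example.

```python
# Hypothetical sketch: NOT the actual kolla-ansible precheck.
# Only the two variable names below come from the discussion.

QUORUM = "om_enable_rabbitmq_quorum_queues"
HA = "om_enable_rabbitmq_high_availability"


def check_rabbitmq_queue_flags(globals_vars):
    """Return a list of problems; an empty list means the check passed.

    Assumption: a missing variable counts as false here, which may not
    match the defaults shipped with a given kolla-ansible release.
    """
    quorum = bool(globals_vars.get(QUORUM, False))
    ha = bool(globals_vars.get(HA, False))

    problems = []
    if quorum and ha:
        problems.append(f"{QUORUM} and {HA} are both enabled; pick one")
    elif not quorum and not ha:
        problems.append(f"neither {QUORUM} nor {HA} is enabled; pick one")
    return problems


# The values dcapone2004 planned to add to globals.yml before upgrading.
planned_globals = {QUORUM: False, HA: True}
print(check_rabbitmq_queue_flags(planned_globals))  # prints []
```

With the values confirmed at 16:20 the sketch reports no problems, while an empty globals.yml (neither variable set) would be flagged under the stated assumptions.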