Tuesday, 2024-03-12

[02:47] <opendevreview> Merged openstack/kolla-ansible stable/2023.1: Add precheck for RabbitMQ quorum queues  https://review.opendev.org/c/openstack/kolla-ansible/+/909967
[06:49] <mnasiadka> kevko, frickler: Would be good to get new RMQ version in: https://review.opendev.org/c/openstack/kolla/+/911093 and https://review.opendev.org/c/openstack/kolla-ansible/+/911094 ;-)
[06:54] <opendevreview> Michal Nasiadka proposed openstack/kolla master: WIP: Switch to Ubuntu 24.04  https://review.opendev.org/c/openstack/kolla/+/907589
[06:55] <opendevreview> Michal Nasiadka proposed openstack/kolla master: WIP: Add support for rpm to repos.yaml  https://review.opendev.org/c/openstack/kolla/+/909879
[08:09] <opendevreview> Rafal Lewandowski proposed openstack/kayobe master: Add Redfish rules to Ironic and Bifrost introspection  https://review.opendev.org/c/openstack/kayobe/+/902772
[08:56] <SvenKieske> o/
[08:57] <SvenKieske> kevko: I also like prechecks against my own stupidity :)
[09:03] <kevko> mnasiadka: done
[09:07] <opendevreview> Verification of a change to openstack/kolla-ansible master failed: rabbitmq: Add 3.12 feature flags (for upgrade to 3.13)  https://review.opendev.org/c/openstack/kolla-ansible/+/911094
[09:09] <frickler> kevko: mnasiadka: ^^ pulled the brake on that one, not sure if we need to stop the kolla change, too (and the dependency might need to be the other way round?)
[09:11] <opendevreview> Michal Arbet proposed openstack/kolla-ansible master: Fix images pull in ovs-dpdk role  https://review.opendev.org/c/openstack/kolla-ansible/+/899626
[09:17] <kevko> frickler: hmmm, how yeah, good catch ...do we have only two multinodes jobs ?
[09:18] <frickler> kevko: no, we also have cephadm jobs, still waiting on results for those
[09:22] <kevko> frickler: Oops, sorry, that's my mistake. I didn't even notice that the CI was running. I saw it was verified +1, and I also remember that I did that rebase yesterday, and it was okay.
[09:22] <kevko> frickler: thanks
[09:52] <mnasiadka> frickler: have no clue if that would error or not, if yes - then only probably on slurp jobs
[09:53] <mnasiadka> and those two feature flags are in standard set, but I don't know if those are required
[09:53] <mnasiadka> frickler: https://rabbitmq-website.pages.dev/docs/feature-flags#core-feature-flags - no, these two new are not required
[10:30] <opendevreview> Michal Nasiadka proposed openstack/kolla master: WIP: Add support for rpm to repos.yaml  https://review.opendev.org/c/openstack/kolla/+/909879
[10:40] <opendevreview> Martin Hiner proposed openstack/kolla-ansible master: Fix incorrect condition in kolla_container_facts  https://review.opendev.org/c/openstack/kolla-ansible/+/912521
[10:43] <opendevreview> Verification of a change to openstack/kolla master failed: Bump rabbitmq to 3.13  https://review.opendev.org/c/openstack/kolla/+/911093
[10:55] <frickler> mnasiadka: rmq stopped but not starting again on upgrade ^^ I think I've seen a similar report earlier
[10:56] <mnasiadka> that's interesting
[10:56] <mnasiadka> recreate_or_restart_container should probably do some post-check
[11:00] <mnasiadka> especially that we use systemd now
[11:02] <opendevreview> Matúš Jenča proposed openstack/kolla-ansible master: Implement TLS for Redis  https://review.opendev.org/c/openstack/kolla-ansible/+/909188
[11:15] <kevko> frickler: we were upgrading rabbitmq 3 weeks ago and rabbitmq didn't start
[11:15] <kevko> frickler: we needed to force start
[11:21] <kevko> 2024-02-23 21:56:32.297 [info] <0.274.0> Waiting for Mnesia tables for 60000 ms, 9 retries left
[11:22] <kevko> 2024-02-23 21:57:32.298 [warning] <0.274.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@controller0,rabbit@controller1,rabbit@controller2],[rabbit_user_permission,rabbit_semi_durable_route,rabbit_topic_trie_edge,rabbit_queue]}
[11:23] <kevko> we needed to force boot the node ....
[11:25] <kevko> my theory is that we are using pause_minority ...and clouds with huge load as we have can fail rabbitmq upgrade phase because of this ... i think we should switch to autoheal during the upgrade and after upgrade is done ..restart again to set pause_minority back ....
[11:26] <mnasiadka> well, in CI RMQ is never stopped, but maybe we just need handling for waiting for container to be really stopped
[11:26] <mnasiadka> (and started)
[11:29] <kevko> mnasiadka: during upgrade ?
[11:29] <mnasiadka> yes - see https://85e97ec41df2e8196f68-ad9677b8f3b079990c4951ac9cbbd797.ssl.cf1.rackcdn.com/911093/4/gate/kolla-ansible-debian-upgrade/2f81bbb/primary/logs/ansible/reconfigure-rabbitmq
[11:29] <mnasiadka> wondering if it shows up again
[11:29] <mnasiadka> (did a recheck)
[11:30] <mnasiadka> and it's only Debian for now
[11:34] <kevko> mnasiadka: this is not issue i've seen i think ...
[11:34] <kevko> mnasiadka: and this is only one node rabbit
[11:34] <mnasiadka> yup
[11:42] <kevko> mnasiadka: what i've seen is this
[11:42] <kevko> mnasiadka: https://paste.openstack.org/show/bgZwGqWjAtzgjOxICj3H/
[11:43] <kevko> mnasiadka: upgrade was from xena -> yoga .. and if I am correct ..during the upgrade ..two nodes are stopped and then restarted ....and that's the thing .pause minority will break start of the node which lost two neighbours
[11:44] <kevko> mnasiadka: i haven't investigated master yet
[11:44] <kevko> because there is a different code
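
The arithmetic behind kevko's pause_minority theory can be sketched in a few lines of Python. This is only an illustration of the rule as the RabbitMQ documentation describes it (a node pauses when it can see half of the cluster or fewer, counting itself); the function name and its parameters are hypothetical and are not part of RabbitMQ or kolla-ansible. It does not confirm that this rule caused the failed start reported above.

```python
def pauses_under_pause_minority(cluster_size: int, reachable_nodes: int) -> bool:
    """Return True if a node would pause itself under pause_minority.

    reachable_nodes counts the node itself plus every peer it can still see.
    Hypothetical helper for illustration only.
    """
    if not 1 <= reachable_nodes <= cluster_size:
        raise ValueError("reachable_nodes must be between 1 and cluster_size")
    # Minority means half of the cluster or fewer, so an even split pauses too.
    return 2 * reachable_nodes <= cluster_size


# Three-node cluster, as in the xena -> yoga upgrade report.
for reachable in (3, 2, 1):
    print(reachable, pauses_under_pause_minority(3, reachable))
```

With three nodes, a node that has lost both neighbours sees only itself and therefore counts as a minority, which matches the scenario kevko describes.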
[12:04] <kevko> anybody to approve https://review.opendev.org/c/openstack/kolla-ansible/+/899626 ? trivial and bug fixing
[12:10] <mnasiadka> I'm still amazed that anybody uses it and it works
[12:10] <kevko> mnasiadka: haha, yeah ... I was asked to push it little bit as we have customer who really using it :)
[12:11] <kevko> mnasiadka: and I've already verified onsite  that it's working ...
[12:11] <kevko> mnasiadka: thanks
[12:11] <mnasiadka> still don't understand why it's a separate role - and we don't even publish those images I think
[12:11] <mnasiadka> ah, we do for debian/ubuntu
[12:15] <kevko> mnasiadka: if it is in kolla  repo ..it's buildable - so it's supported :)
[12:39] <opendevreview> Verification of a change to openstack/kolla master failed: Bump rabbitmq to 3.13  https://review.opendev.org/c/openstack/kolla/+/911093
[12:57] <mnasiadka> and failed again, nice
[12:58] <mnasiadka> what is with debian :)
[12:59] <mnasiadka> hmm, weird, it's some old notification
[13:08] <frickler> mnasiadka: the notification is repeated when the arm64 result comes in since that doesn't change the V-2 state
[13:09] <mnasiadka> frickler: a bit misleading, but whatever
[13:09] <mnasiadka> well, the bot got tired of my complaints
[13:14] <opendevreview> Merged openstack/kolla-ansible stable/2023.1: Rework quorum queues precheck  https://review.opendev.org/c/openstack/kolla-ansible/+/909968
[13:43] <opendevreview> Verification of a change to openstack/kolla-ansible master failed: Fix images pull in ovs-dpdk role  https://review.opendev.org/c/openstack/kolla-ansible/+/899626
[13:45] <opendevreview> Michal Nasiadka proposed openstack/kolla master: WIP: Add support for rpm to repos.yaml  https://review.opendev.org/c/openstack/kolla/+/909879
[13:51] <mnasiadka> frickler: I think after tooz pin the cephadm multinode jobs are so stable that I'm tempted to change them to voting and gate on them ;-)
[13:52] <SvenKieske> that would actually be very nice if it worked.
[13:55] <frickler> mnasiadka: yes, let's do that
[13:59] <opendevreview> Michal Nasiadka proposed openstack/kolla-ansible master: CI: Change cephadm jobs to voting and add them to gating  https://review.opendev.org/c/openstack/kolla-ansible/+/912585
[13:59] <mnasiadka> let's see
[14:03] <SvenKieske> kevko: (anybody interested in rmq really) maybe have a look at these new rmq tuning parameters: https://review.opendev.org/c/openstack/kolla-ansible/+/900528
[14:05] <SvenKieske> looking at the multinode jobs: did I miss it or do we not have any multinode jobs on debian?
[14:13] <SvenKieske> https://review.opendev.org/c/openstack/kolla-ansible/+/899626?tab=change-view-tab-header-zuul-results-summary fails also on debian upgrade in gate pipeline :(
[14:15] <SvenKieske> the same rmq error as above
[14:16] <mnasiadka> SvenKieske: not the same
[14:16] <mnasiadka> in the rmq 3.13 case it was not existing container
[14:16] <mnasiadka> here wait is timing out
[14:16] <mnasiadka> wonder why only on Debian
[14:17] <SvenKieske> ah right, sorry, don't know how I mixed that up
[14:17] <mnasiadka> and it's deploy phase
[14:18] <SvenKieske> yeah, it's right after restart of rmq container, about which you talked somewhere up there^^
[14:18] <SvenKieske> [11:55] <frickler> mnasiadka: rmq stopped but not starting again on upgrade ^^ I think I've seen a similar report earlier <- this one
[14:20] <SvenKieske> ah, but that was multinode ipv6 ubuntu iiuc
[14:22] <SvenKieske> but different error, yes. "container is not running" vs "Waiting for pid..."
[14:25] <mnasiadka> SvenKieske: standard timeout is 10 seconds, from the log it seems rabbitmq came up in something like 15 seconds
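
The mismatch mnasiadka describes, a 10 second wait against a service that needs roughly 15 seconds to come up, can be reproduced with a small polling loop. This is a minimal sketch and not kolla-ansible's actual wait logic: `wait_until`, `FakeClock` and `simulate` are hypothetical names, the 15 second startup time is the rough figure from the chat, and a simulated clock is used so the example runs instantly.

```python
import time
from typing import Callable


def wait_until(
    is_ready: Callable[[], bool],
    timeout: float,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_ready() until it returns True or timeout seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


class FakeClock:
    """Simulated clock (hypothetical helper) so no real time passes."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


def simulate(timeout: float, startup_seconds: float = 15.0) -> bool:
    """Return True if a service that is ready after startup_seconds is seen in time."""
    fake = FakeClock()
    return wait_until(
        lambda: fake.now >= startup_seconds,
        timeout=timeout,
        clock=fake.time,
        sleep=fake.sleep,
    )


print(simulate(timeout=10))  # False: the wait gives up before the service is ready
print(simulate(timeout=60))  # True: the longer wait covers the slow start
```

A longer timeout costs nothing when the service starts quickly, because the loop returns as soon as the readiness check passes.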
[14:29] <Lockesmith> I'm generating certificates using certbot. I have figured out what I need to piece together to get internal and external traffic working with tls, but I can't figure out exactly what I need to get backend tls working. Can I even use a certbot cert for that?
[14:30] <opendevreview> Michal Nasiadka proposed openstack/kolla-ansible master: rabbitmq: bump wait timeout to 60 seconds  https://review.opendev.org/c/openstack/kolla-ansible/+/912586
[14:46] <SvenKieske> Lockesmith: did you read https://docs.openstack.org/kolla-ansible/latest/admin/tls.html#back-end-tls-configuration ? is there anything missing there not answering your question?
[14:48] <opendevreview> Michal Nasiadka proposed openstack/kolla-ansible master: rabbitmq: bump wait timeout to 60 seconds  https://review.opendev.org/c/openstack/kolla-ansible/+/912586
[15:17] <Lockesmith> SvenKieske: I did. And everything looks good up until applying to keystone at which point the action `os_keystone_service` fails with a 503.
[15:17] <Lockesmith> It fails at `TASK [service-ks-register : keystone | Creating services]`
[15:18] <SvenKieske> which openstack version do you run and do you happen to build your own containers or do you consume a stable git branch, or do you install packages from pypi?
[15:20] <mhiner> Hello, can you please review this small fix: https://review.opendev.org/c/openstack/kolla-ansible/+/912521
[15:58] <opendevreview> Merged openstack/kolla master: Bump rabbitmq to 3.13  https://review.opendev.org/c/openstack/kolla/+/911093
[15:58] <opendevreview> Merged openstack/kolla-ansible stable/2023.1: RabbitMQ: correct docs on Quorum Queue migrations  https://review.opendev.org/c/openstack/kolla-ansible/+/909969
[16:14] <Lockesmith> SvenKieske: I install it from pypi and use the master branch
[16:20] <SvenKieske> Lockesmith: can you provide the full trace of that task failure? e.g. via http://paste.openstack.org ?
[16:21] <opendevreview> Matt Crees proposed openstack/kolla-ansible master: CI: Only migrate RMQ queues during SLURP  https://review.opendev.org/c/openstack/kolla-ansible/+/909971
[16:44] <Lockesmith> SvenKieske: absolutely! https://paste.opendev.org/show/bZAZUOcTN2RnOjKegeOk/
[16:47] <SvenKieske> Lockesmith: well the error seems to be that your provided auth_url is either incorrect, or if it is correct, which I guess it is, there's a problem with https://openstack.coldforge.xyz:5000 (which is the default keystone auth port). it returns a HTTP 503 service unavailable
[16:48] <SvenKieske> possibly the keystone logs have more information what has gone wrong there :)
[16:51] <SvenKieske> so this error seems, at first sight, unrelated to your actual task, but it's crucial that authentication is working. you might also want to check your keystone backend, if there is anything externally configured, like LDAP, active directory, etc.
[16:52] <Lockesmith> Sorry, my brain is all over the place. I'll grab those too for you.
[17:53] <Lockesmith> SvenKieske: I'll paste the logs if you'd like still, but I cleared them and ran the reconfigure again and there's not an error in them nor the docker logs.
[17:56] <SvenKieske> Lockesmith: if it works now it was most likely a spurious network error, might be worth to track it down yourself, but I doubt it's a bug in the software, rather something in your setup/hardware. :)
[17:57] <Lockesmith> SvenKieske: It's not working, I just can't find any error in the logs outside of the one I sent earlier.
[17:57] <Lockesmith> Though I'm not really sure which servers to check other than keystone and haproxy
[17:57] <Lockesmith> services*
[18:20] <Lockesmith> Do I need to copy the letsencrypt ca into my containers?
