jrosser_ | hmm glance really is not happy with these capi jobs https://zuul.opendev.org/t/openstack/build/61ed8fbc30db4a64a4a70b3b26266e27/log/job-output.txt#27356 | 07:26 |
---|---|---|
jrosser_ | write error https://zuul.opendev.org/t/openstack/build/61ed8fbc30db4a64a4a70b3b26266e27/log/logs/openstack/aio1-glance-container-7f359023/glance-api.service.journal-23-38-40.log.txt#2052 | 07:28 |
noonedeadpunk | but they are for previous one? | 07:30 |
noonedeadpunk | that's quite weird | 07:30 |
jrosser_ | its doing it a lot | 07:51 |
jrosser_ | and it's weird, as all i did was make a patch to have several jobs running with one variable different between them | 07:51 |
jrosser_ | but yesterday merging the main patch to os_magnum just want through with no problems | 07:51 |
jrosser_ | *went | 07:51 |
noonedeadpunk | it really did | 07:52 |
jrosser_ | i think i still have an aio here so i can try one of those failing versions | 07:52 |
noonedeadpunk | but also https://zuul.opendev.org/t/openstack/build/4f26b36e71f945afa131013bf7792ca6 just passed as well | 07:52 |
jrosser_ | but somehow feels like CI node issue | 07:52 |
noonedeadpunk | 2 times in a row? | 07:53 |
jrosser_ | one obvious thing is that my job launches a bunch of these in parallel | 07:53 |
jrosser_ | but yesterdays patch only runs one | 07:53 |
jrosser_ | this disk looks huge so it doesn't seem out of space https://zuul.opendev.org/t/openstack/build/61ed8fbc30db4a64a4a70b3b26266e27/log/logs/openstack/instance-info/host_system_info_22-25-57.log.txt#6525 | 07:54 |
jrosser_ | looks like 160G root disk on these | 07:55 |
noonedeadpunk | oh, btw, we were transferred https://opendev.org/openstack/ansible-role-frrouting | 08:00 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Reflect frrouting role new place https://review.opendev.org/c/openstack/openstack-ansible/+/916719 | 08:02 |
noonedeadpunk | there were a couple of open patches there | 08:03 |
noonedeadpunk | https://review.opendev.org/c/openstack/ansible-role-frrouting/+/910373 | 08:03 |
noonedeadpunk | and, it has molecule (was even operational) | 08:03 |
noonedeadpunk | so might be good place to start experiments on how to make "consistent" functional jobs for roles | 08:04 |
jrosser_ | why does that need 2 nodes | 08:24 |
jrosser_ | with the docker driver for molecule doesn't it just make two containers? | 08:24 |
noonedeadpunk | it does | 08:25 |
noonedeadpunk | it was historically without molecule | 08:25 |
noonedeadpunk | just a native 2-node job | 08:26 |
noonedeadpunk | I guess this can be dropped now actually | 08:26 |
jrosser_ | so interestingly glance upload of one of the other k8s images just worked first time in my AIO | 08:28 |
jrosser_ | interesting discussion on the ML about branchless SDK | 09:07 |
jrosser_ | i wonder how much better/worse the 0.99/1.x transition of SDK vs. the ansible collection would have been in that case | 09:08 |
noonedeadpunk | huh? I don't see this one at all | 10:00 |
noonedeadpunk | ah, branchless detection thread | 10:00 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_skyline master: Add ssl configuration to DB connection string https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/916754 | 10:09 |
noonedeadpunk | doh, seems we have circular dependency for skyline between https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/916754/ and https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/912370 | 11:36 |
jrosser_ | hmm i guess we should just squash those together | 12:57 |
noonedeadpunk | yeah | 13:07 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_skyline master: Add EL distro support and ssl configuration for DB connection https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/912370 | 13:08 |
jrosser_ | interesting https://opendev.org/vexxhost/openstack-operator/src/branch/master/openstack_operator/templates/operator/uwsgidefaultconfig.yml.j2#L30 | 13:30 |
jrosser_ | relatedly https://stackoverflow.com/questions/36156887/uwsgi-raises-oserror-write-error-during-large-request | 13:31 |
opendevreview | Jonathan Rosser proposed openstack/ansible-role-uwsgi master: Work around OSError during large transfers https://review.opendev.org/c/openstack/ansible-role-uwsgi/+/916790 | 13:35 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: Test all supported versions of k8s workload cluster with magnum-cluster-api https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916649 | 13:38 |
noonedeadpunk | huh, interesting indeed | 14:00 |
jrosser_ | the SO article is really quite like what i am seeing for glance | 14:06 |
jrosser_ | but suuuuuper unhelpfully there is no further information past OSError | 14:07 |
noonedeadpunk | well, we still avoid uwsgi for glance with ceph. though now we're doing a setup with glance and uwsgi, but it's using swift as backend... | 14:08 |
jrosser_ | excellent docs https://uwsgi-docs.readthedocs.io/en/latest/Options.html#ignore-write-errors | 14:09 |
noonedeadpunk | hard to argue about annoying part | 14:10 |
noonedeadpunk | not sure if uploaded images are consistent.... | 14:10 |
noonedeadpunk | maybe they are... | 14:10 |
noonedeadpunk | this all really is very confusing | 14:11 |
jrosser_ | right | 14:11 |
jrosser_ | feels like just randomly stabbing at things with no understanding :( | 14:11 |
jrosser_ | i cant reproduce it either | 14:12 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Update global pins for 2024.1 https://review.opendev.org/c/openstack/openstack-ansible/+/916792 | 14:21 |
noonedeadpunk | no idea if that will pass ^ | 14:22 |
jrosser_ | i did look quickly at updating openstack_hosts for caracal UCA | 14:23 |
jrosser_ | but that seemed to rely on openstack_distrib_code_name | 14:24 |
jrosser_ | was not sure if we wanted to update that yet? | 14:24 |
noonedeadpunk | #startmeeting openstack_ansible_meeting | 15:01 |
opendevmeet | Meeting started Tue Apr 23 15:01:03 2024 UTC and is due to finish in 60 minutes. The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:01 |
opendevmeet | The meeting name has been set to 'openstack_ansible_meeting' | 15:01 |
noonedeadpunk | #topic rollcall | 15:01 |
noonedeadpunk | o/ | 15:01 |
jrosser_ | o/ hello | 15:01 |
noonedeadpunk | so, I failed quite a bit to send out PTG results :( | 15:02 |
noonedeadpunk | though there were not many things | 15:02 |
noonedeadpunk | mainly moved things from the past one to the new cycle | 15:03 |
noonedeadpunk | #topic office hours | 15:06 |
noonedeadpunk | so, there was a ML thread about cluster-api deployment with osa: https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/27R2UOTCIHNAAGFFCG36KWHDPFLD3TZ4/ | 15:08 |
noonedeadpunk | I was mainly guessing there as never checked it/played enough myself | 15:08 |
noonedeadpunk | no idea if folks gave up or got it working though | 15:08 |
NeilHanlon | hiya folks | 15:09 |
noonedeadpunk | o/ | 15:11 |
noonedeadpunk | I'm actually planning to keep pushing patches this week for quorum queues improvements | 15:12 |
noonedeadpunk | as this needs to be done across all services - would be good to know a health state overall | 15:12 |
noonedeadpunk | until it's not _too_ late | 15:12 |
jrosser_ | i think that the cluster api guy from the ML showed up in the cluster-api slack channel and i helped out there | 15:12 |
noonedeadpunk | ah | 15:12 |
noonedeadpunk | ok, good to know there's some slack channel :D | 15:13 |
jrosser_ | that was the push i needed to get it fixed up last weekend | 15:13 |
noonedeadpunk | Also, if anybody missed it, I've opened the Answers section in launchpad, where a question about OVN docs landed almost instantly: https://answers.launchpad.net/openstack-ansible/+question/709484 | 15:14 |
jrosser_ | this is kind of a big problem | 15:18 |
noonedeadpunk | that I've opened it?:) | 15:22 |
noonedeadpunk | or that we're bad at documentation? | 15:22 |
noonedeadpunk | also as an update - frrouting role has been moved under our governance. it's needed for ovn-bgp-agent implementation | 15:26 |
jrosser_ | i think the trouble is that OVN is so hard | 15:26 |
NeilHanlon | nice :D | 15:26 |
NeilHanlon | I really need to look into OVN | 15:26 |
noonedeadpunk | Actually, I don't find it _that_ hard today | 15:26 |
NeilHanlon | sorry i'm distracted today... covering for a few people | 15:27 |
noonedeadpunk | like it's mainly just very different | 15:27 |
jrosser_ | and simultaneously the docs do not show a newcomer what all the moving parts are | 15:27 |
jrosser_ | and how to translate what you want into working config | 15:27 |
NeilHanlon | i think the problem is that OVN is itself a huge beast, and so a newcomer to OpenStack **and** OVN will be like "WTF!" | 15:27 |
jrosser_ | right | 15:27 |
noonedeadpunk | yeah, and I think we lack examples. Or better to say - they're hidden in os_neutron docs | 15:27 |
NeilHanlon | also OpenStack Networking in general is pretty counter to what a "Network Engineer" would expect | 15:28 |
noonedeadpunk | yeah, I think you might be onto something NeilHanlon | 15:28 |
NeilHanlon | like I still have to reprogram my brain to think about what OpenStack does with provider/external nets :) | 15:28 |
mgariepy | it was always left to the deployer for the network configuration | 15:30 |
mgariepy | stuff can work quite differently with ovn tho. | 15:30 |
jrosser_ | i think it is pretty difficult on top to understand what to put in openstack_user_config | 15:38 |
noonedeadpunk | yeah | 15:38 |
jrosser_ | there are too many weirdly named fields | 15:38 |
mgariepy | there are many ways to do stuff also. | 15:39 |
jrosser_ | so primarily this is a documentation issue then | 15:39 |
noonedeadpunk | yeah, maybe we should somehow promote neutron_provider_networks which is way more obvious.... | 15:40 |
noonedeadpunk | but then when I tried to - it was even more confusing on what to add where | 15:41 |
NeilHanlon | surely the AI can do it for us | 15:46 |
* NeilHanlon ducks | 15:46 | |
jrosser_ | this is also not a case of making some quick fix to the docs for some error | 15:47 |
jrosser_ | it needs time/concentration to make a coherent set of stuff for how things actually are with OVN | 15:47 |
NeilHanlon | yeah | 15:47 |
* NeilHanlon will ask in his circles if anyone might be interested in some OSA+OVN documentation | 15:48 | |
NeilHanlon | work* | 15:48 |
noonedeadpunk | well, I partially did some | 15:50 |
noonedeadpunk | but apparently we have more places to fix | 15:50 |
noonedeadpunk | like deployment guide at least | 15:50 |
jrosser_ | maybe we have to start from what got put in the launchpad answers section | 15:51 |
noonedeadpunk | yeah | 15:53 |
noonedeadpunk | that sounds like a good start | 15:54 |
noonedeadpunk | #endmeeting | 15:58 |
opendevmeet | Meeting ended Tue Apr 23 15:58:00 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:58 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-23-15.01.html | 15:58 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-23-15.01.txt | 15:58 |
opendevmeet | Log: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-23-15.01.log.html | 15:58 |
opendevreview | Merged openstack/ansible-role-uwsgi master: Add Debian 12 distro setup variable https://review.opendev.org/c/openstack/ansible-role-uwsgi/+/915080 | 16:02 |
jrosser_ | haproxy thinks glance API is down during image upload https://zuul.opendev.org/t/openstack/build/f7e8ae3093e84d438cebd22eda845ce0/log/logs/host/haproxy.service.journal-15-42-20.log.txt#6873-6903 | 16:02 |
jrosser_ | does uploading an image prevent it responding to a healthcheck i wonder | 16:03 |
noonedeadpunk | ah | 16:07 |
noonedeadpunk | yes, I think we have too less workers for glance | 16:07 |
noonedeadpunk | *too few | 16:07 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/templates/user_variables.aio.yml.j2#L98-L101 | 16:08 |
noonedeadpunk | so really no available workers to respond when upload is in progress | 16:08 |
noonedeadpunk | I've actually seen the same in AIO when I tried to upload through horizon | 16:08 |
jrosser_ | we probably get away with that for tempest/cirros as the image is tiny | 16:09 |
noonedeadpunk | yeah, likely | 16:09 |
jrosser_ | and perhaps my storage here is quick so i have not seen it | 16:10 |
noonedeadpunk | not sure how it passes with coreos though or amphora... | 16:10 |
jrosser_ | which of those vars actually makes a difference | 16:10 |
noonedeadpunk | I think glance_wsgi_threads ? | 16:11 |
noonedeadpunk | not sure... | 16:11 |
noonedeadpunk | yeah, wsgi_threads is for uwsgi | 16:11 |
noonedeadpunk | and glance_api_threads for non-uwsgi | 16:12 |
jrosser_ | ah ok | 16:12 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_skyline master: Do not define a random password for each run https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/912332 | 16:12 |
opendevreview | James Denton proposed openstack/openstack-ansible-os_skyline master: Support large uploads via Skyline https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/914149 | 16:13 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_skyline master: Install skyline-console through yarn https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/914405 | 16:13 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Increase number of threads to 2 for glance in AIO https://review.opendev.org/c/openstack/openstack-ansible/+/916810 | 16:14 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: Test all supported versions of k8s workload cluster with magnum-cluster-api https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916649 | 16:15 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Add service policies defenition https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916812 | 16:31 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Add service policies defenition https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916812 | 16:32 |
jrosser_ | noonedeadpunk: did i show you this? https://bugs.launchpad.net/oslo.messaging/+bug/2031512 | 16:34 |
noonedeadpunk | jrosser_: I think I saw it, except for Andrew's latest comments | 16:55 |
noonedeadpunk | so last update I saw was from March I guess | 16:56 |
jrosser_ | yeah so we did more work and also got an answer from the rabbitmq people | 16:56 |
jrosser_ | that the un-named reply queues basically will never be OK | 16:56 |
jrosser_ | and changes to oslo.messaging now make this actually visible where before it was not | 16:57 |
opendevreview | Stuart Grace proposed openstack/openstack-ansible-ops master: Clarifications to mcapi_vexxhost README https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916817 | 16:58 |
noonedeadpunk | So I guess question is - what changes you mean?:) | 16:58 |
jrosser_ | so we plan to switch to HA reply queues (unless there is something terrible we overlooked) - andrew submitted some patches | 16:59 |
noonedeadpunk | As I think latest change for 2024.1 was to make transient queues replicated | 16:59 |
jrosser_ | a bug fix i think https://github.com/openstack/oslo.messaging/commit/b4b49248bcfcb169f96ab2d47b5d207b1354ffa8 | 16:59 |
noonedeadpunk | aha | 16:59 |
jrosser_ | before that, there was enough time for the queue to be deleted in rabbitmq | 16:59 |
jrosser_ | but now it is quicker to try to reconnect / re-create the queue and it's full-on race condition | 17:00 |
noonedeadpunk | https://github.com/openstack/oslo.messaging/commit/989dbb8aad8be68a9c63e2e6a4d445cc445c051c | 17:00 |
noonedeadpunk | and basically I wanted to enable that as default in 2024.1 | 17:00 |
jrosser_ | right yes - so reason i bring this up is that we have on the backlog here a task to make a helper script to do the quorum queue migration | 17:01 |
noonedeadpunk | Well. I think migration is kinda "handled" by changing name of vhost | 17:01 |
noonedeadpunk | (currently) | 17:01 |
noonedeadpunk | And I was unable to find anything more efficient | 17:02 |
jrosser_ | kind of, yes, but i think that the downtime might be significant | 17:02 |
jrosser_ | ^ if you just do it as part of a regular upgrade | 17:02 |
noonedeadpunk | you do that service by service anyway. But yes, compute/neutron might struggle | 17:02 |
noonedeadpunk | unless you're running ovn | 17:02 |
jrosser_ | indeed - so we were going to look at the upgrade stuff a bit for this | 17:03 |
noonedeadpunk | but you still pretty much need to empty out all queues/release all connections to switch | 17:03 |
noonedeadpunk | as once "old" client connects to host - it creates classic queues and then you can't convert | 17:03 |
noonedeadpunk | *to vhost | 17:03 |
noonedeadpunk | so you pretty much need to stop every client connecting to vhost before doing anything | 17:04 |
noonedeadpunk | which is pretty much like just swapping the vhost name, and then downtime on operations until the playbooks are finished... | 17:05 |
noonedeadpunk | but yeah | 17:05 |
noonedeadpunk | would be great to improve that | 17:05 |
noonedeadpunk | potentially, we might have some kind of tag, but we'd need to both re-configure service and oslo part | 17:05 |
noonedeadpunk | just to skip them during upgrade and execute right afterwards, so config change/restart happened really fast | 17:06 |
jrosser_ | yeah so i think this is what we will look at | 17:08 |
jrosser_ | making sure the playbooks/tags are all set up properly to do some minimal change just for the message queues | 17:08 |
noonedeadpunk | and eventually - <service>-config tag went completely out of control.... | 17:08 |
noonedeadpunk | it's doing soooooooo many things today | 17:09 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Add variable to globally control notifications enablement and disable RPC https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916820 | 17:13 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Add variable to globally control notifications enablement and disable RPC https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916820 | 17:20 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Implement variables to address oslo.messaging improvements https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916821 | 17:20 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Implement variables to address oslo.messaging improvements https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/916821 | 17:20 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_heat master: Add service policies defenition https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/916826 | 17:50 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_heat master: Add variable to globally control notifications enablement https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/916827 | 17:53 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_heat master: Implement variables to address oslo.messaging improvements https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/916828 | 17:56 |
opendevreview | Christian Mattsson proposed openstack/openstack-ansible-os_neutron master: Add debian package libstrongswan-standard-plugins https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/916832 | 18:18 |
opendevreview | Stuart Grace proposed openstack/openstack-ansible-ops master: Clarifications to mcapi_vexxhost README https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916817 | 18:32 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: Clarifications to mcapi_vexxhost README https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916817 | 18:34 |
noonedeadpunk | well, clusterapi doc seems to be rendered "well" enough as well: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_afd/916653/3/check/openstack-tox-docs/afd849d/docs/mcapi.html | 18:36 |
jrosser_ | yeah thats good | 18:41 |
jrosser_ | the images need some "retain original aspect ratio" thing, that was also a bit wonky on the elk one | 18:42 |
jrosser_ | and i can also look at including the actual config files out of the repo instead of having duplication | 18:42 |
noonedeadpunk | I had to drop ratio, as sphinx can't make it for SVG | 18:44 |
noonedeadpunk | it can for PNG though | 18:44 |
jrosser_ | looks like the glance threads thing was the cause for upload failures | 18:44 |
jrosser_ | the capi one is png i think | 18:46 |
noonedeadpunk | yeah | 18:46 |
noonedeadpunk | but I thought I left scale there if it was there... | 18:50 |
noonedeadpunk | ok, it was `:scale: 100 %` :D | 18:50 |
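
The health-check starvation discussed between 16:02 and 16:14 (no free worker to answer haproxy while a large image upload is in progress) can be illustrated with a small model. This is only a sketch of the suspected mechanism, not glance or uWSGI code: a plain Python thread pool stands in for the API's worker threads, and the names `health_check_answered`, `upload_image` and `health_check` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def health_check_answered(worker_threads: int, wait: float = 0.2) -> bool:
    """Return True if a probe is served while an upload is still in flight."""
    upload_started = threading.Event()
    upload_may_finish = threading.Event()

    def upload_image() -> str:
        # Stands in for a long-running image upload that holds one thread.
        upload_started.set()
        upload_may_finish.wait()
        return "uploaded"

    def health_check() -> str:
        # Stands in for the load balancer's periodic probe.
        return "ok"

    with ThreadPoolExecutor(max_workers=worker_threads) as pool:
        upload = pool.submit(upload_image)
        upload_started.wait()  # make sure the upload occupies a thread first
        probe = pool.submit(health_check)
        try:
            probe.result(timeout=wait)
            answered = True
        except FutureTimeout:
            # The probe is still queued behind the upload.
            answered = False
        finally:
            # Release the upload so the pool can shut down cleanly.
            upload_may_finish.set()
        upload.result()
    return answered


if __name__ == "__main__":
    for threads in (1, 2):
        print(threads, health_check_answered(threads))
```

With one thread the probe stays queued until the upload completes, which matches a backend being marked down mid-upload; with two threads the probe is answered straight away. That is the intuition behind raising the glance thread count to 2 in the AIO, under the assumption that the real service behaves like this simplified pool.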