Thursday, 2022-03-24

noonedeadpunkA pity that fghaas is not around here, as I bet he uses k8s with octavia and magnum09:33
BraceSo, we seem to have got to the bottom of our broken cluster, the l3 agent on c2 isn't working.  So we've disabled all the neutron services on that controller and are able to bring up some instances now.09:46
BraceWhich will give us a bit of time to try and find out what's actually wrong with c2.09:46
jrosserzigo: do i remember right that you had some insights into uwsgi / chunked transfer settings?10:19
zigojrosser: I do: the swift team claims Swift works with uwsgi, but that's bullshit, it's broken in very subtle ways.10:20
zigoI reverted all of swift over uwsgi.10:20
jrosserhow about for glance?10:20
zigojrosser: Glance is said to be fine starting with Xena.10:21
zigojrosser: FYI, here's the config I use in Debian for swift: https://review.opendev.org/c/openstack/swift/+/82119210:21
zigoI'd love upstream to adopt it, and start gating with it.10:21
zigoUntil then, I'll stick with Eventlet.10:21
zigoWe've been insisting for more than a year already, and we keep getting issues.10:22
zigoThe last one was empty uploads, even though swift says it's ok ... :/10:22
jrossermossblaser: is this any help? ^^10:22
zigoNote the:10:24
zigoroute-run = chunked:10:24
zigoand:10:24
zigoroute = .* addheader:Date: ${httptime[]}10:24
zigoin my proposed patch. While these options are making Swift pass all refstack tests, they are forcing "Transfer-Encoding: chunked" which is probably not what one wants.10:24
zigoThough we haven't found another way to get things *approximately* working.10:24
zigoThe other thing, is that the Swift object server is *COMPLETELY* broken over uwsgi, because the exchanges between proxy <=> object servers aren't even HTTP compliant.10:25
mossblaserjrosser: I'm afraid I'm not familiar enough with either glance or uwsgi to know off hand if this is the same issue as I've been seeing. Though assuming we are on Xena(?) the issue does seem to persist...10:25
zigoAll this is really a shame, because uwsgi provides a 2x performance improvement ...10:25
zigojrosser: What issue are you seeing?10:26
jrossermossblaser: can you paste something at paste.openstack.org from what we see with glance?10:27
mossblaseran intermittent failure during image upload from cinder to glance which looks very much like this bug: https://bugs.launchpad.net/glance/+bug/1916482 -- logs from our observed issue: https://paste.opendev.org/show/bLq9YXaH6ZdsBj57iWkL/10:30
noonedeadpunkfwiw I caught that recently as well, but in my case we mistakenly used different chunk sizes for cinder and glance10:34
noonedeadpunkby default cinder sets chunk size to 4, and glance to 810:34
noonedeadpunkIf you accidentally forget to configure that, you will have issues creating images from volumes for sure10:34
mossblasernoonedeadpunk: I presume that would lead to persistent failures, rather than intermittent? (In our case image creation succeeds the majority of the time)10:35
noonedeadpunkit's not persistent, no. but it depends on luck and volume size10:36
noonedeadpunkbigger ones almost always fail, smaller mostly work10:39
noonedeadpunkalso for nova with local drives you might want to try out https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/828897 in case they have physical connectivity to ceph10:41
mossblaser(a quick check and it seems that the block size is left as the default for cinder and glance in our setup which looking at the docs I hope means they're the same! Thanks for the suggestion!)10:42
noonedeadpunkdefault means they are not the same :)10:48
noonedeadpunkdefault for glance is 8: https://docs.openstack.org/glance/latest/configuration/glance_api.html#glance.store.rbd.store.rbd_store_chunk_size10:48
mossblaseruh-ohh! -- I must have looked at the docs for an older version10:49
noonedeadpunkfor cinder-volume rbd_store_chunk_size is 4 https://docs.openstack.org/cinder/latest/configuration/block-storage/samples/cinder.conf.html10:49
noonedeadpunkit was always like that)10:49
mossblaserevidently I need to start drinking coffee!10:50
mossblaserthat is unfortunate10:50
noonedeadpunkIf you check https://docs.openstack.org/openstack-ansible/latest/user/ceph/full-deploy.html#user-variables you will find that we define `rbd_store_chunk_size: 8` there10:50
noonedeadpunkfor this exact reason10:51
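[Editor's sketch] The mismatch described above can be checked mechanically. The snippet below is a minimal sketch, not part of OpenStack-Ansible: it reads `rbd_store_chunk_size` from two INI strings and falls back to the defaults quoted in the discussion (4 for cinder, 8 for glance). The helper name `chunk_size`, the cinder backend section name `rbd`, and the sample config contents are assumptions made for illustration.

```python
import configparser

# Defaults as quoted in the discussion above; treat them as assumptions
# and confirm against the docs for your release.
CINDER_DEFAULT_CHUNK = 4
GLANCE_DEFAULT_CHUNK = 8


def chunk_size(conf_text, section, default):
    """Return rbd_store_chunk_size from an INI string, or the given default."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getint(section, "rbd_store_chunk_size", fallback=default)


# Hypothetical configs in which neither service sets the option explicitly.
cinder_conf = "[rbd]\nvolume_driver = cinder.volume.drivers.rbd.RBDDriver\n"
glance_conf = "[glance_store]\nstores = rbd\n"

cinder = chunk_size(cinder_conf, "rbd", CINDER_DEFAULT_CHUNK)
glance = chunk_size(glance_conf, "glance_store", GLANCE_DEFAULT_CHUNK)
mismatch = cinder != glance
print(f"cinder={cinder} glance={glance} mismatch={mismatch}")
```

In a real deployment the two strings would come from the rendered cinder.conf and glance-api.conf, and the cinder section would be whatever the RBD backend is named there.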
noonedeadpunkjrosser: question - do you think we should add `galera_monitoring_user_password` to user_secrets?10:52
noonedeadpunkor there're cases when you don't want to have it covered with any password?10:53
noonedeadpunkor well, maybe it's question to andrewbonney :)10:53
andrewbonneyI don't think I have a reason for it to be password-less10:58
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Add galera monitoring user to secrets  https://review.opendev.org/c/openstack/openstack-ansible/+/83503811:03
noonedeadpunkNeilHanlon: hey! around?11:56
noonedeadpunkpinging you as rocky expert:) We see that our patch fails now on rocky, as we assume that /etc/ssh/sshd_config.d exists and is used by ssh11:57
noonedeadpunkthings go smooth for CentOS 8, but fail for Rocky.11:57
noonedeadpunkSo was wondering, if you know anything about that difference11:57
noonedeadpunkhttps://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/827100 as example for logs11:57
jrossernoonedeadpunk: oh i think there was special handling for that on centos12:21
jrossernoonedeadpunk: argh yes https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/825113/16/roles/ssh_keypairs/tasks/standalone/install_ssh_ca.yml#5212:22
jrosserso i think we maybe dont test rocky for plugins?12:23
noonedeadpunkwe don't indeed12:24
* jrosser feels unit tests discussion coming up again :)12:24
* noonedeadpunk doesn't stay in the same place more than 2-3 weeks in a row so life is a full mess, so can't focus on a thing for some time now...12:25
jrosseroh of course, i'm not complaining :)12:26
spatelany idea about this error - https://paste.opendev.org/show/b8OmbVOc1e6b1CypkNns/12:59
spatelusing this doc to create octavia ingress controller 12:59
spatelI would appreciate if anyone has any google example yaml to create octavia ingress controller for my k8s (because nothing working for me :()13:00
noonedeadpunkspatel: try finding way to reach fghaas - he likely can help you if in good mood 13:10
noonedeadpunkbut I'm not 100% sure if he runs octavia from k8s or just uses heat and magnum for that...13:12
jrosserspatel: people here who do k8s on openstack use the nginx ingress and then octavia in TCP LB across however many backends needed13:33
jrosserthat's turned out simplest as you can have cert management (LE in this case) handled in the k8s side, not octavia13:33
spatelhmm! i thought everyone default using octavia?13:35
spatelIf nginx is way to go and easy as hell then sure i would go with that way in production 13:39
spatelI thought it was tightly coupled with octavia and we don't need to do anything, just request an LB and it will be available without extra steps like this doc is saying - https://superuser.openstack.org/articles/guide-octavia-ingress-controller-for-kubernetes/13:40
mossblaserzigo: jrosser noonedeadpunk: so I tried out setting the block size and this did not seem to fix the problem but switching haproxy into tcp mode does appear to -- perhaps glance isn't set up right in Xena after all.15:03
noonedeadpunkmossblaser: another suggestion - don't use uwsgi for glance15:04
zigomossblaser: Glance does work over uwsgi on *any* release, it's only broken when using Swift as a backend in some specific cases.15:04
noonedeadpunkzigo: and except you need interoperable import being used?15:04
zigonoonedeadpunk: I'm kind of tired to read this all the time, and would very much prefer if upstream was working on fixes.15:05
zigo:/15:05
zigo(not blaming anyone on this channel: don't take it personally)15:05
zigoSame thing with Swift.15:05
jrossernoonedeadpunk: seems that volume-to-image suffers as well as interoperable import15:07
noonedeadpunkSo while I understand why tcp would work, I'm not really convinced it's root cause tbh15:07
zigojrosser: The problem is always Transfer-Encoding: chunked related indeed...15:07
noonedeadpunkor well, proper way to fix15:07
jrossermossblaser: did you try any of the uwsgi config things?15:08
noonedeadpunkjrosser: well, for me changing chunk size just worked tbh to fix volume-to-image15:09
jrosserhmm15:09
noonedeadpunkbut I guess in this case mossblaser is trying to upload image from nova ephemeral that's on local drive?15:10
zigoDo you have "wsgi-manage-chunked-input = true" ?15:10
zigoWhat version of uwsgi is that btw?15:10
zigo>= 2.0.19 ?15:10
noonedeadpunkwe have that by default zigo https://opendev.org/openstack/ansible-role-uwsgi/src/branch/master/templates/uwsgi.ini.j2#L3315:10
zigoLower won't have the option...15:10
mossblaserjrosser: I did not yet (since this appears it may need more than a simple config change in OSA)15:11
noonedeadpunkoh, wait, you mentioned other option....15:11
mossblasernoonedeadpunk: I was uploading an image from a nova volume which lives in CEPH into Glance (also using CEPH for storage), nothing local involved15:11
noonedeadpunkoh, ok...15:12
zigoAlso activate the transformation_chunked  plugin !15:12
jrosser*cinder volume15:12
zigoplugins = python3,transformation_chunked15:12
noonedeadpunkas admin1 was referencing same issue just 2 days ago, but was uploading from local15:12
jrossermossblaser: you can hack this stuff into the uwsgi config by hand in the test lab15:12
jrosserthen we can work on a patch if it fixes things15:12
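[Editor's sketch] For anyone hand-editing the uwsgi config in a test lab, a small check like the one below can confirm that the settings mentioned above are actually present. It is a sketch under stated assumptions: the function name `check_uwsgi` and the sample INI are made up for illustration, the minimum version 2.0.19 is the one zigo quotes, and Python's `configparser` only approximates uwsgi's own INI parsing (repeated keys such as multiple `route` lines collapse to the last one).

```python
import configparser

MIN_VERSION = (2, 0, 19)  # per the discussion: lower versions lack the option
REQUIRED_PLUGIN = "transformation_chunked"


def parse_version(text):
    """Turn '2.0.20' into (2, 0, 20) for tuple comparison."""
    return tuple(int(part) for part in text.split("."))


def check_uwsgi(ini_text, version):
    """Return a list of problems found; an empty list means all checks pass."""
    # strict=False tolerates repeated keys; only '=' is treated as a delimiter
    # so values such as 'chunked:' are kept intact.
    parser = configparser.ConfigParser(
        strict=False, interpolation=None, delimiters=("=",)
    )
    parser.read_string(ini_text)
    problems = []
    if parse_version(version) < MIN_VERSION:
        problems.append("uwsgi older than 2.0.19")
    managed = parser.get("uwsgi", "wsgi-manage-chunked-input", fallback="false")
    if managed.strip().lower() != "true":
        problems.append("wsgi-manage-chunked-input is not true")
    plugins = parser.get("uwsgi", "plugins", fallback="")
    if REQUIRED_PLUGIN not in [p.strip() for p in plugins.split(",")]:
        problems.append("transformation_chunked plugin not loaded")
    return problems


# Hypothetical hand-edited config combining the options quoted in the channel.
sample_ini = """
[uwsgi]
plugins = python3,transformation_chunked
wsgi-manage-chunked-input = true
route-run = chunked:
"""

print(check_uwsgi(sample_ini, "2.0.20"))
```

Whether these options fix the glance upload failures is exactly what the test is meant to find out; the check only verifies that the config says what was intended.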
mossblasersorry cinder, of course (it has been a long day!)15:14
mossblaserI shall have a play re: uwsgi15:14
noonedeadpunkyes, that would be interesting15:15
noonedeadpunkand we  should deploy uWSGI==2.0.2015:16
zigoIMO, it's kind of silly that you guys are just reimplementing everything that's already done in packages...15:19
zigoThat's twice the work for no valid reason.15:19
noonedeadpunkexcept to be sure that you can install any specific version anytime you want?15:20
noonedeadpunkwithout need to mirror repos?15:20
zigoAgain, that's a packaging concern, to make sure all versions are fit together.15:21
zigocan15:21
noonedeadpunkI think it depends on what is meant under fitting15:22
zigoI don't ! :)15:22
zigoI don't think it depends on anything.15:22
zigoThat's distro's work, end of the story.15:22
noonedeadpunkum, and what if regressions in code exist? It's no secret that weird backports take place close to each release. And what should you do as a cloud operator to revert things back, when only the latest package version is stored in repos?15:24
noonedeadpunkAs I don't understand how I should ensure state of my cloud with packages, when deploying next week I just have new version of software without any options15:25
zigoWrong package: fix the package.15:25
zigoNot wrong package -> use some weirdo overrides.15:25
noonedeadpunkMy point was leading to ensuring exact same software being deployed not depending on time when it is deployed :)15:26
mgariepyor the os.15:26
zigoThat's because you see the OS as working against you, instead of trying to modify it to do what you want.15:27
zigoIf you want a specific snapshot of the OS so you don't get the latest point release... make such snapshot and be done with it! :)15:28
noonedeadpunkWell yes, I do agree here that it's likely point of perception being present:)15:28
zigoThere's all the tooling you want for that.15:28
zigoAs being the person behind all the Debian package since OpenStack exists, I'm probably completely biased ... :)15:28
noonedeadpunksorry, what should I do with that snapshot then?:)15:28
noonedeadpunkdeploy it in other region?15:29
noonedeadpunkI can imagine using it for CI testing...15:29
noonedeadpunkbut not sure I see how it can be re-used anywhere else except that host15:29
zigoMy way of doing things is to simply trust the package manager to do what's right, and provide only bugfixes with no regressions.15:29
zigoSo I wouldn't do a snapshot, it's only you who claimed you don't want things to be fixed ... :)15:30
noonedeadpunkAnd I admit it makes sense for some usecases:)15:30
admin1my issue with glance was the haproxy was set to mode http and it was hitting some byte limit .. which was solved after i changed mode in haproxy for glance from http => tcp 15:31
zigoadmin1: Byte limits? Can you be more specific?15:31
noonedeadpunkit was exact same issue and reference to https://bugs.launchpad.net/glance/+bug/1916482 15:32
noonedeadpunkand that's the bug created https://bugs.launchpad.net/openstack-ansible/+bug/196598615:33
zigoOh, that's a long standing issue in Glance, which is why everyone with some experience never chooses Ceph as a backend for it.15:33
noonedeadpunkthat's why I thought that mossblaser issue is same one15:33
zigoSad but true, Glance over RBD simply sux...15:33
zigoThough it's IIRC not related to haproxy.15:34
noonedeadpunkjrosser: sorry I didn't fully get your comment on https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/831550 - did you mean we should just place temp dir inside /tmp and get done with it?15:45
jrosserit was #tmp not /tmp ?15:46
noonedeadpunk`galera_tmp_dir: /var/lib/mysql/#tmp`15:46
noonedeadpunkand galera_ignore_db_dirs is relative to datadir15:46
noonedeadpunkwe can set `galera_tmp_dir: /tmp` actually15:46
jrosseroh15:47
jrosserbecause galera_tmp_dir: /var/lib/mysql/#tmp15:47
noonedeadpunkbut I wasn't sure if it's good since /var/lib/mysql can be separate mount point...15:47
jrosseri honestly thought the # was a typo :)15:47
jrosseris that a convention for mysql things15:47
noonedeadpunkah, no, it was intended)) added # as otherwise ppl won't be able to create database with name `tmp`15:48
noonedeadpunkand I think that `#tmp` highly unlikely to be created :D15:48
noonedeadpunkbut actually yes15:48
noonedeadpunkif directory is not set, maria tends to create something like /var/lib/mysql/#mysql50#tmp.stLr46FBlt15:49
noonedeadpunkeasy solution would be if `ignore_db_dirs` was supporting regexp, but it doesn't15:49
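[Editor's sketch] The limitation mentioned here can be illustrated with a toy model. The snippet below is not MariaDB code: it assumes, as stated in the discussion, that `ignore_db_dirs` only matches directory names verbatim, and it uses the example names from the channel. The function names and the glob pattern are hypothetical; the pattern shows what a wildcard-capable option would catch, which the real option does not support.

```python
import fnmatch


def visible_as_databases(datadir_entries, ignore_db_dirs):
    """Model exact-name matching: anything not listed verbatim stays visible."""
    ignored = set(ignore_db_dirs)
    return [name for name in datadir_entries if name not in ignored]


def ignored_by_glob(datadir_entries, pattern):
    """Hypothetical: what a wildcard pattern would catch if it were supported."""
    return [n for n in datadir_entries if fnmatch.fnmatchcase(n, pattern)]


# Directory names taken from the discussion; the randomly suffixed one is what
# gets created when no fixed tmp dir is configured.
entries = ["mysql", "nova", "#tmp", "#mysql50#tmp.stLr46FBlt"]

leftover = visible_as_databases(entries, ["#tmp"])
glob_hits = ignored_by_glob(entries, "#mysql50#tmp.*")
print(leftover)
print(glob_hits)
```

A fixed name such as `#tmp` can be listed once, while a randomly suffixed name cannot be predicted ahead of time, which is the reason given above for pinning `galera_tmp_dir`.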
noonedeadpunkI even saw CI failures for upgrade jobs because of that15:55
noonedeadpunkand caught it in another region in production during upgrade15:55
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Add mysql directory for logging  https://review.opendev.org/c/openstack/openstack-ansible/+/83509116:07
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-galera_server master: Update MariDB version to 10.6.7  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/83325916:08
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-galera_server master: Update MariaDB version to 10.6.7  https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/83325916:08
noonedeadpunkso that was original error like I saw in production https://zuul.opendev.org/t/openstack/build/fe6fd9e0341c4d4b80530cbe5e091cc3/log/logs/openstack/aio1_galera_container-4bf4bdaa/mariadb.service.journal-12-06-26.log.txt#46016:09
noonedeadpunkto be fair, I'm not sure if that's fixed with patch as another common weird error raised even with it....16:10
noonedeadpunkmaybe just point that to /tmp indeed....16:11
admin1zigo, how it got solved in haproxy then ? 16:22
admin1i mean i was able to create snapshots from local as well as remote after that 16:22
noonedeadpunkI need really to reproduce that to play with it. As chunked plugin for uwsgi sounds promising17:32
spatelI am running this command - openstack-ansible setup-openstack.yml --tags common-mq --limit '!nova_compute'19:34
spatelgot this error - https://paste.opendev.org/show/bzT7JMrwONnSza328XoO/19:34
spateljrosser ^19:38
spatelRelated to this play https://opendev.org/openstack/ansible-role-uwsgi/src/branch/master/tasks/main.yml#L1619:41
spatelI have changed include_vars: "{{ item }}" to include_vars: "{{ lookup('first_found', params) }}"19:52
spatelstill same error, I am running 24.0.0 tag19:56
*** dviroel is now known as dviroel|pto20:45
opendevreviewNeil Hanlon proposed openstack/openstack-ansible-plugins master: Update ssh_keypairs role to fix module for Rocky Linux 8  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/83515221:55
NeilHanlonnoonedeadpunk / jrosser - i think that should do the trick21:55
jrosserNeilHanlon: one small issue but otherwise looks ok22:12
NeilHanlonjrosser: thank you.. I should look for WARNINGs :) 22:57
