Wednesday, 2022-03-16

00:05 *** dviroel|ruck is now known as dviroel|ruck|Afk
07:50 *** arxcruz|off is now known as arxcruz
08:17 <jrosser> morning
08:29 <noonedeadpunk> o/
11:16 *** dviroel|ruck|Afk is now known as dviroel|ruck
11:17 <MrClayPole_> Morning, I'm still migrating from OSA Train (20.2.6) to OSA Victoria (22.4.1). At https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/stable/victoria/tasks/nova_compute.yml#L16 I'm getting the error "Error when evaluating variable in dynamic parent include path: drivers/{{ nova_virt_type }}/nova_compute_{{ nova_virt_type }}.yml. When using static imports, the parent dynamic include cannot utilize host facts or variables from inventory". I've confirmed that nova_virt_type is set to "kvm", so is it that using this type of variable is not supported in this context?
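For context, the error above comes from a dynamic include nested under a static import. A minimal sketch of the pattern, paraphrased from the linked role files (surrounding task details trimmed):

```yaml
# tasks/main.yml -- static import, resolved at parse time
- import_tasks: nova_compute.yml

# tasks/nova_compute.yml -- dynamic include whose path is templated from
# an inventory variable; with a statically imported parent, Ansible
# cannot resolve nova_virt_type from inventory at parse time
- include_tasks: "drivers/{{ nova_virt_type }}/nova_compute_{{ nova_virt_type }}.yml"
```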
12:19 <noonedeadpunk> hm, that's interesting
12:19 <noonedeadpunk> MrClayPole_: do you have an override of nova_virt_type somewhere?
12:19 <MrClayPole_> I've grep'ed my /etc/openstack-deploy/*.yml and got no hits
12:20 <noonedeadpunk> and you don't have /etc/openstack-deploy/host_vars or /etc/openstack-deploy/group_vars?
12:20 <MrClayPole_> checking ....
12:22 <MrClayPole_> inventory/group_vars/kvm-compute_hosts.yml "nova_virt_type: kvm"
12:24 <MrClayPole_> noonedeadpunk: should I try commenting that out and re-running?
12:27 <MrClayPole_> I don't have /etc/openstack-deploy/host_vars/, and the /etc/openstack-deploy/group_vars/ files don't have that variable set
12:29 <noonedeadpunk> well, I guess it's placed there for some reason, right?
12:29 <noonedeadpunk> and I'd say it's a valid use case...
12:30 <noonedeadpunk> jrosser: that's actually an interesting issue, I think ^
12:31 <MrClayPole_> Unfortunately I didn't build this environment. Let me check with the person that did, to see why "nova_virt_type: kvm" is set in /opt/openstack-ansible/inventory/group_vars/kvm-compute_hosts.yml
12:31 <noonedeadpunk> MrClayPole_: are you also sure that the roles are bootstrapped?
12:32 <noonedeadpunk> oh, wait
12:32 <MrClayPole_> The bootstrap showed no errors at the end
12:32 <noonedeadpunk> can you replace that with include_tasks? https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/stable/victoria/tasks/main.yml#L226
12:35 <MrClayPole_> when you say replace, do you mean in "kvm-compute_hosts.yml"?
12:36 <noonedeadpunk> no, I mean in the line I posted :)
12:36 <noonedeadpunk> so `include_tasks: nova_compute.yml`
12:36 <MrClayPole_> Sorry, I'm being slow on the uptake here ... I'm not sure where you want me to put it
12:37 <noonedeadpunk> https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/stable/victoria/tasks/main.yml#L226
12:38 <MrClayPole_> so just to be clear, change L226 from "task_import: nova_compute.yml" to "include_tasks: nova_compute.yml"
12:39 <noonedeadpunk> wait, why is it `task_import`?
12:40 <noonedeadpunk> as it should currently be `import_tasks`
12:40 <MrClayPole_> That's what's showing on L226 at the link
12:40 <noonedeadpunk> Do we see different content via the link? :) As I see "import_tasks: nova_compute.yml" there
12:41 <MrClayPole_> I see import_tasks at the link you sent, but I thought you said "so `include_tasks: nova_compute.yml`"
12:44 <MrClayPole_> it was my typo before, it does say "import_task"
12:44 <MrClayPole_> it was my typo before, it does say "import_tasks"
12:51 <noonedeadpunk> yes, so it should be include_tasks, not import_tasks :)
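The suggested change, as a sketch against tasks/main.yml in the Victoria os_nova role (as the log shows below, this alone did not clear the error in MrClayPole_'s environment):

```yaml
# before, at the line linked above (static, resolved at parse time):
- import_tasks: nova_compute.yml

# after (dynamic, resolved at runtime): include_tasks defers evaluation,
# so inventory variables such as nova_virt_type should be available by
# then; any when/tags attached to the original task are omitted here
- include_tasks: nova_compute.yml
```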
12:55 <MrClayPole_> noonedeadpunk: Thanks, I seem to be struggling this morning, need to get more sleep tonight. I've made the changes and am running the os_nova playbook again.
12:55 <noonedeadpunk> Let me know about the result
12:56 <noonedeadpunk> I believe this is a pretty valid bug
13:00 <jamesdenton> I confirm the bug. Ran into it with a victoria AIO yesterday
13:00 <jamesdenton> had to hardset nova_virt_type
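jamesdenton's workaround pins the variable so the templated include path no longer depends on an inventory lookup at parse time. A sketch, assuming it is placed in the deployment's user_variables.yml (the log does not say where it was set; the user_* files are passed to ansible-playbook as extra vars, which are available when the play is parsed):

```yaml
# /etc/openstack_deploy/user_variables.yml
# hard-set the virt type instead of relying on inventory group_vars
nova_virt_type: kvm
```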
13:16 <opendevreview> Merged openstack/openstack-ansible-os_neutron master: Add configuration option for heartbeat_in_pthread  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/833237
13:17 <opendevreview> Andrew Bonney proposed openstack/openstack-ansible-os_cinder stable/xena: Add configuration option for heartbeat_in_pthread  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/833863
13:18 <opendevreview> Andrew Bonney proposed openstack/openstack-ansible-os_cinder stable/wallaby: Add configuration option for heartbeat_in_pthread  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/833864
13:18 <opendevreview> Andrew Bonney proposed openstack/openstack-ansible-os_nova stable/xena: Add configuration option for heartbeat_in_pthread  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/833865
13:18 <opendevreview> Andrew Bonney proposed openstack/openstack-ansible-os_nova stable/wallaby: Add configuration option for heartbeat_in_pthread  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/833866
13:18 <opendevreview> Andrew Bonney proposed openstack/openstack-ansible-os_neutron stable/xena: Add configuration option for heartbeat_in_pthread  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/833867
13:18 <opendevreview> Andrew Bonney proposed openstack/openstack-ansible-os_neutron stable/wallaby: Add configuration option for heartbeat_in_pthread  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/833868
13:30 <MrClayPole_> noonedeadpunk: Just popped away for some lunch. Back now; it failed with the same error.
13:31 <noonedeadpunk> hm
13:32 <noonedeadpunk> I need to spawn an aio then to check for possible ways to solve that
13:33 <MrClayPole_> I'll back out that patch we just made and wait to hear from you. This is just a test environment on my side, so while I need to get it done it's not critical.
13:33 <MrClayPole_> I'm happy to test once you have a patch :)
13:35 <MrClayPole_> Thanks for the info jamesdenton. I'll hold off on the hardset unless I get pressure from my side to get it done.
14:40 <anskiy> noonedeadpunk: hey! You mentioned yesterday some mariadb connection drops. Currently I'm observing strange timeouts on X with 10.6.5. Looks like it happens when the connection times out on the server side and the client tries to reconnect. Is it somehow related to your problem?
14:47 <noonedeadpunk> anskiy: um... yes, but in fact it's different, I believe
14:49 <noonedeadpunk> After further investigation I believe that the connection is dropped by haproxy for some reason; the client acknowledges that the connection is dropped and logs "Server has gone away during query", but MariaDB thinks the connection is still active and drops it only on timeout
14:50 <noonedeadpunk> As a result, if a connection with an update statement is killed that way, the table remains locked until the timeout releases it
14:51 <noonedeadpunk> The only way we've worked around that for now was pointing to a specific mariadb container with port forwarding and disabling the haproxy frontend/backend.
14:51 <noonedeadpunk> And not sure what goes wrong with haproxy
14:52 <noonedeadpunk> at the same time several of our regions are running the same OSA version without a single issue
14:55 <anskiy> is it 10.6.x? I'm feeling a bit lazy about investigating my issue further and am thinking about just downgrading to 10.5.12 (which worked flawlessly).
14:56 <noonedeadpunk> yup, 10.6.5
14:56 <noonedeadpunk> and just so you know, you can't downgrade
14:56 <noonedeadpunk> and the problem is not in mariadb either
14:57 <noonedeadpunk> or well, at least from what we see...
14:57 <noonedeadpunk> during the 10.5 -> 10.6 upgrade the mysql system tables are adjusted heavily
14:57 <noonedeadpunk> and there really are migrations that could be breaking
14:58 <noonedeadpunk> but well...
14:58 <noonedeadpunk> you can try :)
14:58 <anskiy> well, maybe I'm hitting some other bug... Thanks for the info, was going to check the downgrade in Vagrant first anyway
14:59 <noonedeadpunk> it sounds suuuuper related to what we see, just so you know
15:04 <noonedeadpunk> anskiy: were you performing some kind of W->X upgrade?
15:04 <noonedeadpunk> Or did you just upgrade mariadb?
15:13 <anskiy> noonedeadpunk: on 23.1.2 I went from 10.5.6 to 10.5.12 (which was still deadlocking), then, prior to upgrading to 23.2.0, I upgraded to 10.5.13, which was fine. Then I upgraded to X and it was working okay (probably for a week or so)
15:15 <anskiy> there is no actual workload for now, I'm just occasionally poking at it. But the deadlocking was so bad it could've happened on a normal playbook run, when ansible was checking/registering services in Keystone
15:15 <anskiy> honestly, it could be anything else, but remembering previous problems with mariadb...
15:16 <noonedeadpunk> actually with keystone - could it be that apache is hitting the connection limit? For test instances it's set to a pretty low value
15:17 <noonedeadpunk> as we never saw any lock with keystone. It is always something that writes intensively to the db, like octavia while checking health or neutron while updating ports
15:17 <noonedeadpunk> or nova, of course, updating the list of instances per compute
15:20 <anskiy> I can't remember now, but I'm pretty sure it was mariadb's fault. I still have those links to mariadb's Jira in user_variables SCM history :)
15:20 <opendevreview> Merged openstack/openstack-ansible-os_keystone master: Drop distributed_lock parameter  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/831786
15:21 <anskiy> with identical symptoms
15:22 <noonedeadpunk> hm
15:22 <opendevreview> Merged openstack/openstack-ansible-os_gnocchi master: Add availability to define gnocchi_incoming_driver  https://review.opendev.org/c/openstack/openstack-ansible-os_gnocchi/+/822905
15:22 <noonedeadpunk> well, let me know if the downgrade to 10.5.13 will just work...
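If the downgrade does get tested, the version pin would normally go through the galera_server role's version variables. A hedged sketch, assuming the galera_major_version / galera_minor_version defaults used by openstack-ansible-galera_server (check the defaults for the branch in use, and note the warning above that 10.6 reworks the system tables, so a straight downgrade may not be safe outside a throwaway environment):

```yaml
# /etc/openstack_deploy/user_variables.yml -- test environments only;
# 10.6 -> 10.5 is not a supported downgrade path
galera_major_version: "10.5"
galera_minor_version: "13"
```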
15:25 <jrosser> how many separate mariadb issues do we have
15:26 <jrosser> I'm not sure if the deadlock discussion here is the 10.5.6 fails-to-start problem
15:36 <anskiy> noonedeadpunk: well, at first look, the admin's password is no longer accepted now :)
15:37 <noonedeadpunk> well, I have one pretty weird thing that I don't think is even related to mariadb, as when we exclude haproxy from the connection chain things just work
15:38 <noonedeadpunk> but I can hardly imagine wtf could be wrong with haproxy
15:38 <jrosser> we had a network switch flap here yesterday which completely upset haproxy wrt galera in a very unexpected way
15:38 <jrosser> andrewbonney: was going to look into that a bit, as haproxy said the backend was down when it clearly wasn't
15:39 <jrosser> so there is certainly something suspect with the healthcheck
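For reference, the galera frontend/backend that haproxy runs in OSA is defined in the project's haproxy group vars, and the healthcheck under suspicion is an HTTP probe of the clustercheck script on a side port rather than a check against mysql itself. A rough sketch of the shape of that definition (key names paraphrased from the openstack-ansible group vars; exact fields and values vary by branch):

```yaml
# approximate shape of OSA's galera service definition for haproxy;
# the health check hits clustercheck on port 9200, not mysql on 3306
haproxy_galera_service:
  haproxy_service_name: galera
  haproxy_backend_nodes: "{{ [groups['galera_all'][0]] | default([]) }}"
  haproxy_port: 3306
  haproxy_balance_type: tcp
  haproxy_check_port: 9200
  haproxy_balance_alg: source
```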
15:39 <noonedeadpunk> we don't have backend flapping :(
15:40 <anskiy> jrosser: yeah, it's not the same. What's that "fails-to-start" problem? Was it the one during installation on ubuntu/debian?
15:41 <opendevreview> Merged openstack/openstack-ansible-os_magnum master: Remove legacy policy.json cleanup handler  https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/827444
15:41 <jrosser> yes, where there was an internal "deadlock" inside mariadb that meant the service never started properly
15:42 <jrosser> there is an issue on the mariadb jira about that
15:42 <jrosser> iirc that's why we did 10.5.6 -> 10.5.12 even on a stable branch, where we'd never normally touch the version outside major version upgrades
16:02 *** dviroel|ruck is now known as dviroel|ruck|lunch
16:17 *** frenzy_friday is now known as frenzyfriday
16:21 <opendevreview> Merged openstack/openstack-ansible master: Connect openstack_pki_regen_ca variable to pki role  https://review.opendev.org/c/openstack/openstack-ansible/+/831242
16:40 <opendevreview> Jonathan Rosser proposed openstack/openstack-ansible stable/xena: Connect openstack_pki_regen_ca variable to pki role  https://review.opendev.org/c/openstack/openstack-ansible/+/834017
16:55 <opendevreview> Merged openstack/openstack-ansible master: Replace use of deprecated ANSIBLE_CALLBACK_WHITELIST  https://review.opendev.org/c/openstack/openstack-ansible/+/829002
17:15 *** dviroel|ruck|lunch is now known as dviroel|ruck
17:18 *** arxcruz is now known as arxcruz|off
17:34 <opendevreview> Merged openstack/openstack-ansible master: Set minimum and maximum microversions for manila api  https://review.opendev.org/c/openstack/openstack-ansible/+/827560
17:34 <opendevreview> Merged openstack/openstack-ansible stable/xena: Add test of used SHAs  https://review.opendev.org/c/openstack/openstack-ansible/+/831031
20:03 *** dviroel|ruck is now known as dviroel|ruck|afk
