Friday, 2023-09-01

opendevreviewMichal Nasiadka proposed openstack/kolla stable/yoga: Pin iptables to 1.8.4 in Centos Stream 8  https://review.opendev.org/c/openstack/kolla/+/89335906:37
opendevreviewBartosz Bezak proposed openstack/kolla stable/yoga: Pin iptables to 1.8.4 in Centos Stream 8  https://review.opendev.org/c/openstack/kolla/+/89342307:22
opendevreviewBartosz Bezak proposed openstack/kolla stable/yoga: Pin iptables to 1.8.4 in Centos Stream 8  https://review.opendev.org/c/openstack/kolla/+/89342307:24
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: ovn: Improve clustering  https://review.opendev.org/c/openstack/kolla-ansible/+/86892907:37
opendevreviewMerged openstack/kayobe stable/xena: Remove upgrade jobs following Wallaby EOL  https://review.opendev.org/c/openstack/kayobe/+/89343507:39
opendevreviewMaksim Malchuk proposed openstack/kolla stable/yoga: Add server-status handler to Rocky/Centos Apache conf  https://review.opendev.org/c/openstack/kolla/+/89324208:19
SvenKieskeouch @ that iptables regression, at least they added an upstream test to detect that in the future :)08:32
fricklerevidence #371 in the "centos stream is unusable" case08:35
mnasiadkathat's not really a regression, that's the RH way of building packages: they backport the patches they like instead of taking the latest git version from the application repository - it was fixed long ago in 1.8.5, but they decided not to include the patch when they bumped iptables to 1.8.5 in c8s08:35
opendevreviewPierre Riteau proposed openstack/kayobe stable/xena: Speed up calls to Bifrost  https://review.opendev.org/c/openstack/kayobe/+/89320408:35
bbezakfrickler: totally true08:40
SvenKieskefrickler: I needed to suppress the urge to write that :D08:40
SvenKiesketo be fair: redhat has at least _some_ QA, but I feel most of it is benefiting fedora these days :D (I won't complain)08:41
TK_Hello Guys, I have a quick support request, I have 4 compute nodes but when I migrate an instance from one compute to another I get the error below 08:43
TK_https://paste.openstack.org/show/bWws544zUWbMR6AJBvw4/08:43
fricklerwhatever they do, it pretty obviously is no longer a stable distro, which is what we usually require for our CI platforms08:43
fricklerTK_: didn't you already create a bug report for that issue?08:44
SvenKieskeTK_ what do you not understand about your error? the CPUs you are migrating between aren't compatible (I'm just rephrasing the error message here). They need to be.08:44
frickleralso not everyone may identify as "Guys", just saying08:44
fricklerSvenKieske: I think there may have been a bug in nova about this somehow08:45
SvenKieskediff the "lscpu" output of each node and that should tell you what is wrong in most cases. there are edge cases where that might not be enough though.08:45
TK_@Frickler .. No harm intended ... My bad 08:45
SvenKieskefrickler: yes, nova tried to be clever and used its own CPU comparison feature, but that got patched out.. 1 or 2 releases ago?08:46
SvenKieskethis: https://review.opendev.org/c/openstack/nova/+/838926 was the original problem08:46
TK_So I guess I will have to just wait for a patch 08:48
SvenKieskeuh oh, they try to be clever again and reintroduce that? https://review.opendev.org/c/openstack/nova/+/76233008:48
SvenKieskeTK_ That is already merged, your error most likely is not related to that bug08:48
SvenKieskebut you need to investigate to check if it is or is not related :)08:48
fricklerthis looks similar and isn't fixed yet afaict https://bugs.launchpad.net/nova/+bug/202303508:48
SvenKieskeI honestly don't know why nova tries to reinvent the wheel and always slaps its own cpu compare function on top of libvirt, and then has to fix the ensuing mess08:49
SvenKieskealso nova logs are constantly lying and referring to upstream libvirt cpu maps when in reality nova uses its own comparison function which fails.08:51
fricklerthis revert is also still blocked, need to ping ppl again https://review.opendev.org/c/openstack/nova/+/87196808:52
SvenKieskemhm, what _is_ the state of the nova patches here? There is e.g. https://review.opendev.org/c/openstack/nova/+/76233008:53
SvenKieskenot merged, I notice I'm even CC'ed to that08:54
SvenKieskethere's also https://review.opendev.org/c/openstack/nova/+/869587 also not merged08:55
SvenKieskeand also: https://review.opendev.org/c/openstack/nova/+/838552; also not merged08:55
SvenKieskeseems like a real mess currently08:56
SvenKieskeseems I don't know the current state of nova stuff. I originally had the hopes this was fixed by the first mentioned fix https://review.opendev.org/c/openstack/nova/+/83892608:56
TK_Let me try a few things... I will update in the findings 08:59
SvenKieskealso funny this wasn't catched during integration tests; I'm fairly certain nova _does_ run some live migration tests.09:00
SvenKieskecaught*09:00
opendevreviewMaksim Malchuk proposed openstack/kolla-ansible master: Add forgotten releasenote  https://review.opendev.org/c/openstack/kolla-ansible/+/89348209:05
SvenKieskeah nice, libvirt has a new API: https://review.opendev.org/c/openstack/nova/+/869950 interesting didn't read about this stuff this year09:05
bbezakwe've also added this workaround to disable nova comparison and rely on libvirt comparison - https://review.opendev.org/c/openstack/kolla-ansible/+/88802809:09
SvenKieskeah right, always weird to be reminded of commits I did review, didn't remember that one.09:10
SvenKieskeso above mentioned bug report confirms reverting 468b03e0ee4a917ae26106f6e57081bcd9e7a65b from stable/2023.1 fixed the issue for one user09:11
SvenKieskebut I still haven't fully grasped why the new, supposedly better, api leads to worse results :)09:11
SvenKieskeany, probably a topic for #openstack-nova09:12
SvenKieskeanyway*09:12
* SvenKieske can't type today..09:12
opendevreviewMaksim Malchuk proposed openstack/kolla-ansible master: Add forgotten release note for 886747  https://review.opendev.org/c/openstack/kolla-ansible/+/89348209:14
opendevreviewMatt Crees proposed openstack/kolla master: Document KOLLA_UPGRADE_CHECK environment variable  https://review.opendev.org/c/openstack/kolla/+/89348409:22
opendevreviewMaksim Malchuk proposed openstack/kolla-ansible stable/2023.1: Use better default bind address for ironic-tftp  https://review.opendev.org/c/openstack/kolla-ansible/+/89338509:26
opendevreviewMatt Crees proposed openstack/kolla master: Document KOLLA_UPGRADE_CHECK environment variable  https://review.opendev.org/c/openstack/kolla/+/89348409:28
opendevreviewMaksim Malchuk proposed openstack/kolla-ansible stable/zed: Use better default bind address for ironic-tftp  https://review.opendev.org/c/openstack/kolla-ansible/+/89338609:33
opendevreviewMaksim Malchuk proposed openstack/kolla-ansible stable/yoga: Use better default bind address for ironic-tftp  https://review.opendev.org/c/openstack/kolla-ansible/+/89342109:36
opendevreviewMaksim Malchuk proposed openstack/kolla-ansible stable/xena: Use better default bind address for ironic-tftp  https://review.opendev.org/c/openstack/kolla-ansible/+/89342209:37
TK_What is weird is the CPUs are exactly the same 09:51
SvenKieskeTK_ can you post the "lscpu" output of both hosts on paste.openstack.org ?09:52
SvenKieskein my experience they are most of the time not the same, but the difference can be really subtle, like a single missing register, especially intel is notorious for disabling/enabling cpu stuff depending on how much money you throw at them..09:54
TK_https://paste.openstack.org/show/bnqkvU8xapOY6lZGsIpe/09:54
SvenKieskediff --side-by-side --suppress-common-lines cpu3 cpu110:00
SvenKieskecpu1 has these flags that cpu3 hasn't:10:00
SvenKieskemhm, these diffs are butchered..there are weird line breaks10:02
hrwhwp_* flags are missing in one and present in the second10:02
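Editor's note: the manual side-by-side diff above was hard to read because of line breaks in the pasted output. Comparing the flag sets directly avoids that problem. The following is a minimal sketch, assuming each host's "lscpu" output has been saved as text and that the flags sit on a single "Flags:" line; the function names and the sample data are hypothetical and not taken from the pastes.

```python
from __future__ import annotations


def cpu_flags(lscpu_output: str) -> set[str]:
    """Return the CPU flags listed on the 'Flags:' line of lscpu output."""
    for line in lscpu_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "flags":
            return set(value.split())
    return set()  # no 'Flags:' line found


def flag_diff(first: str, second: str) -> tuple[set[str], set[str]]:
    """Return (flags only on the first host, flags only on the second host)."""
    a, b = cpu_flags(first), cpu_flags(second)
    return a - b, b - a


if __name__ == "__main__":
    # Hypothetical, heavily shortened lscpu outputs for two hosts.
    host_a = "Architecture: x86_64\nFlags: fpu vme hwp hwp_notify\n"
    host_b = "Architecture: x86_64\nFlags: fpu vme\n"
    only_a, only_b = flag_diff(host_a, host_b)
    print("only on first host: ", sorted(only_a))
    print("only on second host:", sorted(only_b))
```

Because the comparison works on sets, the order of flags and the wrapping of the paste do not matter, as long as the whole "Flags:" line ends up on one line of the saved file.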
hrwTK_: you are running different OS/kernel on them, right?10:02
TK_I am running Ubuntu 20 on both 10:02
SvenKieskethat famous patch: https://patchwork.kernel.org/project/linux-pm/patch/1442944296-11737-1-git-send-email-kristen@linux.intel.com/10:03
hrwlscpu output differs suggesting different OS versions10:03
fricklerTK_: which kernel versions? also there is no Ubuntu 20, you likely run Ubuntu 20.04?10:04
TK_VERSION="20.04.2 LTS (Focal Fossa)"  VERSION="20.04.6 LTS (Focal Fossa)"10:05
TK_They are different 10:05
fricklerthat's not the kernel version, "uname --kernel-version" shows it10:07
TK_Should that be a problem ?10:07
bbezakmaybe microcode versions differ ?10:07
TK_ #176-Ubuntu SMP Mon Aug 14 12:04:20 UTC 2023  #165-Ubuntu SMP Tue Apr 18 08:53:12 UTC 202310:08
TK_Those are the outputs 10:08
fricklerso different kernels, as hrw suggested10:08
fricklerif the newer one has the patch SvenKieske mentioned, that's your issue10:08
SvenKieskeI'm almost certain there was a bug report for that, but I can't find it currently..10:10
TK_Would you recommend installing 20.04.2  on the new servers instead of 20.04.610:10
SvenKieskeregarding the live migration part10:10
SvenKieskeI'd generally recommend keeping these versions in sync on all your hypervisors: base distro version, kernel version (down to the patch level), libvirt, qemu, docker10:11
TK_ok10:11
SvenKieskeyou of course need to deviate from this during upgrades10:11
SvenKieskeah I forgot an important one: cpu microcode!10:11
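Editor's note: the "keep everything in sync" advice can be checked mechanically once the versions have been collected from each hypervisor. The following is a minimal sketch, assuming an inventory dict gathered by some other means (for example by hand or with a configuration management tool); the host names, the component keys, and the assignment of versions to hosts are hypothetical, and only the version strings come from the conversation.

```python
from __future__ import annotations

from collections import defaultdict


def find_drift(inventory: dict[str, dict[str, str]]) -> dict[str, dict[str, list[str]]]:
    """Map each component whose version differs across hosts to {version: [hosts]}."""
    by_component: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for host, versions in inventory.items():
        for component, version in versions.items():
            by_component[component][version].append(host)
    # Keep only components that show more than one distinct version.
    return {
        component: {version: sorted(hosts) for version, hosts in seen.items()}
        for component, seen in by_component.items()
        if len(seen) > 1
    }


if __name__ == "__main__":
    # Hypothetical inventory; which host runs which version is an assumption.
    inventory = {
        "compute1": {"distro": "20.04.6", "kernel": "#176-Ubuntu", "libvirt": "same"},
        "compute3": {"distro": "20.04.2", "kernel": "#165-Ubuntu", "libvirt": "same"},
    }
    for component, versions in find_drift(inventory).items():
        print(component, versions)
```

A component that is missing from one host's entry is not reported by this sketch; it only flags components for which at least two different version strings were recorded.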
TK_Do I need to upgrade them ?10:14
opendevreviewMerged openstack/kolla stable/yoga: Pin iptables to 1.8.4 in Centos Stream 8  https://review.opendev.org/c/openstack/kolla/+/89342310:22
hrwTK_: upgrade 22.04 to current state on all machines?10:23
hrwTK_: and then keep all systems in sync?10:23
hrwTK_: I could recommend using 22.04.latest on new systems and migrate old ones to same version10:25
kevkoSvenKieske: did you read my comment regarding your +1 ? :D 11:08
SvenKieskekevko: not just yet, had lunch break and now the next meeting is starting; will have a look later11:57
TK_I deployed  wallaby using kolla-ansible and as per the documentation, It requires Ubuntu 20.0412:16
TK_https://docs.openstack.org/kolla-ansible/wallaby/user/support-matrix12:17
SvenKieskethe long cooking designate change would be happy about some (core) reviews: https://review.opendev.org/c/openstack/kolla-ansible/+/878270 :)13:22
opendevreviewBartosz Bezak proposed openstack/kolla-ansible stable/zed: Added precheck for OpenSearch migration  https://review.opendev.org/c/openstack/kolla-ansible/+/89345313:39
SvenKieskekevko: I replied with a long rant about CI :)13:53
SvenKieskemnasiadka: frickler: might be worthwhile reading for you too, if you like reading rants about CI and stuff: https://review.opendev.org/c/openstack/kolla-ansible/+/799229/comment/ab898aa0_5ea90b38/ if not, better skip it :D13:54
SvenKieskekevko: besides my rant: good catches, especially the relnotes, I hate large changesets because I always tend to miss something in them. I really don't trust the changesets if they are large. I also don't trust the reviewers, because no person can stay alert for the amount of time needed to really review such large changes.13:56
SvenKieskeso you review those in batches and then have to check that you didn't miss anything in between, and that you have all the connections between different code parts present in your mind. very difficult imho. changesets > 200 lines are evil.13:57
SvenKieskethat number was arbitrarily chosen, but you get the point :)13:58
opendevreviewBartosz Bezak proposed openstack/kolla-ansible stable/zed: Added precheck for OpenSearch migration  https://review.opendev.org/c/openstack/kolla-ansible/+/89345313:58
SvenKieskeregarding my CI rant: a quick win imho would be if zuul only posted the results of failed jobs, in good ol' unix tradition (no output=everything is fine). nobody looks at successful jobs anyway, right?14:00
SvenKieskekevko: I replied to my own reply with an example which might explain why I'm lost in our current CI system. maybe someone can explain it to me, so I can do a better job at checking jobs :)14:09
fricklerSvenKieske: there are lots of reasons to look at successful jobs, too14:10
SvenKieskeah okay, lol. is there a list somewhere? how do we get anything done this way? I mean you surely don't check manually the CI output of every job on every changeset?14:12
SvenKieskeso there needs to be some kind of heuristic. if the answer is "you get experience and a feeling over time to know which jobs to check" maybe we can extract this useful knowledge from your heads so I don't have to make errors for years until I have the same knowledge?14:13
SvenKieskeand maybe write that down somewhere, for all the other people too. looking at that change, two other people also gave +1, so obviously they didn't check the zuul jobs carefully enough either.14:14
SvenKieskeone of those was maksim who I would not consider inexperienced :) (don't want to call you out maksim, I just think we need to improve this stuff)14:15
SvenKieskeI'm also fine with an incomplete jobs which to check, or the reverse, a list of jobs never to check, but the current status quo for me is basically: every job _might_ be important to check, no matter the job result, which is just ridiculous. can we please make it at least a little easier to contribute?14:19
SvenKieskeincomplete list*14:19
SvenKieskefrickler: this testset only has an ara sqlite dump and not the actual openstack-tests present, how to debug this? https://zuul.opendev.org/t/openstack/build/49c367a8d6f04f27b166edb3f7061360/logs15:31
SvenKieskenvm I'm blind15:31
opendevreviewMerged openstack/kayobe master: Use merge_configs and merge_yaml to generate Kolla custom config  https://review.opendev.org/c/openstack/kayobe/+/78274918:12
opendevreviewMerged openstack/kayobe master: Add cached plugin  https://review.opendev.org/c/openstack/kayobe/+/80306418:12
opendevreviewMerged openstack/kayobe master: Kayobe environment dependencies  https://review.opendev.org/c/openstack/kayobe/+/80286518:12
