opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: Add Rocky Linux 10 support to rocky-container element https://review.opendev.org/c/openstack/diskimage-builder/+/952548 | 04:44 |
opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: Add Rocky Linux 10 support to rocky-container element https://review.opendev.org/c/openstack/diskimage-builder/+/952548 | 04:48 |
opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: Add Rocky Linux 10 support to rocky-container element https://review.opendev.org/c/openstack/diskimage-builder/+/952548 | 04:49 |
opendevreview | Michal Nasiadka proposed opendev/zuul-providers master: Add Ubuntu Bionic and Focal image definitions https://review.opendev.org/c/opendev/zuul-providers/+/953268 | 04:54 |
mnasiadka | clarkb: ^^ - I assume there's no need for arm64 versions | 04:54 |
tonyb | mnasiadka: That's correct | 04:55 |
tonyb | (that we only need x86_64(amd64) images for the old distros) | 04:56 |
opendevreview | Michal Nasiadka proposed opendev/zuul-providers master: Add Ubuntu Bionic and Focal image definitions https://review.opendev.org/c/opendev/zuul-providers/+/953268 | 04:56 |
mnasiadka | tonyb: great - if you can single-approve merge ^^ - I can get the actual build working before clarkb and others wake up ;-) | 04:58 |
opendevreview | Merged opendev/zuul-providers master: Add Ubuntu Bionic and Focal image definitions https://review.opendev.org/c/opendev/zuul-providers/+/953268 | 05:00 |
tonyb | mnasiadka: ^^ done :) | 05:00 |
mnasiadka | yay | 05:01 |
opendevreview | Michal Nasiadka proposed opendev/zuul-providers master: Add Ubuntu bionic/focal builds, labels and provider config https://review.opendev.org/c/opendev/zuul-providers/+/953269 | 05:14 |
opendevreview | Michal Nasiadka proposed opendev/zuul-providers master: Add Ubuntu bionic/focal builds, labels and provider config https://review.opendev.org/c/opendev/zuul-providers/+/953269 | 05:15 |
opendevreview | Michal Nasiadka proposed opendev/zuul-providers master: Add Ubuntu bionic/focal builds, labels and provider config https://review.opendev.org/c/opendev/zuul-providers/+/953269 | 05:28 |
opendevreview | Merged openstack/diskimage-builder master: Add new openstack/devstack based functional testing https://review.opendev.org/c/openstack/diskimage-builder/+/949942 | 05:45 |
opendevreview | Merged openstack/diskimage-builder master: Disable nodepool testing https://review.opendev.org/c/openstack/diskimage-builder/+/953246 | 05:45 |
opendevreview | Merged openstack/diskimage-builder master: Add support for CentOS Stream 10 https://review.opendev.org/c/openstack/diskimage-builder/+/934045 | 05:45 |
opendevreview | Michal Nasiadka proposed opendev/zuul-providers master: Add Ubuntu bionic/focal builds, labels and provider config https://review.opendev.org/c/opendev/zuul-providers/+/953269 | 05:49 |
opendevreview | Michal Nasiadka proposed opendev/zuul-providers master: Add Ubuntu bionic/focal builds, labels and provider config https://review.opendev.org/c/opendev/zuul-providers/+/953269 | 05:50 |
opendevreview | Michal Nasiadka proposed opendev/zuul-providers master: Add Ubuntu bionic/focal builds, labels and provider config https://review.opendev.org/c/opendev/zuul-providers/+/953269 | 05:56 |
opendevreview | Michal Nasiadka proposed opendev/zuul-providers master: Add Ubuntu bionic/focal builds, labels and provider config https://review.opendev.org/c/opendev/zuul-providers/+/953269 | 06:01 |
opendevreview | Michal Nasiadka proposed opendev/zuul-providers master: Add Ubuntu bionic/focal builds, labels and provider config https://review.opendev.org/c/opendev/zuul-providers/+/953269 | 06:17 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Ensure files are created with 0600 perms https://review.opendev.org/c/opendev/glean/+/953276 | 06:34 |
opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: Add Rocky Linux 10 support to rocky-container element https://review.opendev.org/c/openstack/diskimage-builder/+/952548 | 06:34 |
frickler | infra-root: looks like gitea might be a bit flakey, see e.g. the failures on https://review.opendev.org/c/opendev/zuul-providers/+/953269 (note they are for old images, not the added ones) https://zuul.opendev.org/t/opendev/build/632f2848f8cb4ccd9961852b598b5e40 | 12:03 |
fungi | frickler: i'm still waking up, but consistent with the gitea servers getting bombarded by crawlers again? | 12:13 |
frickler | that might be possible, didn't check the servers yet | 12:15 |
opendevreview | Daniel Bengtsson proposed opendev/irc-meetings master: Drop the oslo team meeting. https://review.opendev.org/c/opendev/irc-meetings/+/953305 | 12:52 |
opendevreview | Merged opendev/irc-meetings master: Drop the oslo team meeting. https://review.opendev.org/c/opendev/irc-meetings/+/953305 | 13:17 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Ensure files are created with 0600 perms https://review.opendev.org/c/opendev/glean/+/953276 | 13:23 |
opendevreview | Dmitriy Rabotyagov proposed openstack/diskimage-builder master: Add support for building Fedora 40 https://review.opendev.org/c/openstack/diskimage-builder/+/922109 | 13:47 |
frickler | maybe we're also DoSing ourselves if we run 15 image builds in parallel that all clone the whole opendev git tree? might be an option for further improvement to do that only once per buildset and have each real image copy from there? | 13:52 |
Clark[m] | They should start with the existing image cache so not a full clone but an update of the existing data to current data | 13:55 |
Clark[m] | It may still be a dos though | 13:56 |
mnasiadka | clarkb: Added the bionic and focal images here https://review.opendev.org/c/opendev/zuul-providers/+/953269 - had to fix a case where zuul.image_formats is not populated (probably because the image does not exist yet in zuul) | 14:01 |
corvus | mnasiadka: oops, that's an oversight on my part, sorry! | 14:19 |
mnasiadka | corvus: no worries, I traced back the patch in Zuul to understand what is happening there - using default filter did the trick for now (unless you want it to be different in future, but I guess that's a followup) | 14:19 |
corvus | that seems like a good fix for now... | 14:20 |
mnasiadka | great, so needs another +2 now to make unmaintainers happy ;-) | 14:20 |
corvus | i'll need to think a bit about whether zuul can come up with a value for that in a speculative state... it... may not be possible.... so that might be the permanent fix. | 14:20 |
mnasiadka | corvus: you would probably need to have a definition per provider on accepted image formats to limit those formats to less than the usual three we default to | 14:21 |
corvus | lgtm; i didn't +w in case Clark wants to look | 14:22 |
mnasiadka | sure, let's wait | 14:22 |
mnasiadka | was that the only reason for insta-revert yesterday? | 14:22 |
corvus | mnasiadka: zuul does know the image formats for the providers normally -- it's just that it isn't going to speculatively attach a new image to providers, so we wouldn't know which formats it needs until after the change to add the image to providers merges | 14:24 |
mnasiadka | ah right, so then default makes sense | 14:24 |
corvus | we could try to do something to special case that during the config validation... but honestly, i think we ought to be able to run those jobs even when not attached to providers, so i'm more and more thinking that just having a default makes sense. | 14:24 |
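For illustration, the workaround discussed above boils down to Jinja's `default` filter: fall back to a fixed format list whenever Zuul has not (yet) attached the image to any provider. The fallback list in this sketch is an assumption, not necessarily what the merged change uses.

```python
# Toy demonstration of the "default" filter approach, using the jinja2 library
# directly. The fallback format list is a hypothetical example.
import jinja2

template = jinja2.Environment().from_string(
    "{{ zuul.image_formats | default(['qcow2', 'raw', 'vhd']) }}")

# Image already attached to a provider: Zuul supplies the formats.
print(template.render(zuul={'image_formats': ['raw']}))  # ['raw']
# Brand new image not yet attached anywhere: the default kicks in.
print(template.render(zuul={}))  # ['qcow2', 'raw', 'vhd']
```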
frickler | mnasiadka: iiuc https://review.opendev.org/c/zuul/zuul/+/953244 was the reason | 14:25 |
corvus | mnasiadka: no, the insta-revert was because a cloud blew up and we didn't handle that in the request assignment loop, so the launcher spun. simple bug/fix. | 14:25 |
corvus | yeah that one | 14:25 |
mnasiadka | ah, nice | 14:26 |
corvus | i plan to try again today once i get some stuff out of the way | 14:26 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Ensure files are created with 0600 perms https://review.opendev.org/c/opendev/glean/+/953276 | 14:29 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Stop adding uuid= in keyfiles https://review.opendev.org/c/opendev/glean/+/953320 | 14:29 |
opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: Add Rocky Linux 10 support to rocky-container element https://review.opendev.org/c/openstack/diskimage-builder/+/952548 | 14:29 |
mnasiadka | ok, it's weird that NM in a RHEL clone has different issues than the one in CentOS Stream 10, but whatever ;-) | 14:30 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Ensure files are created with 0600 perms https://review.opendev.org/c/opendev/glean/+/953276 | 14:35 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Stop adding uuid= in keyfiles https://review.opendev.org/c/opendev/glean/+/953320 | 14:35 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Stop adding uuid= in keyfiles https://review.opendev.org/c/opendev/glean/+/953320 | 14:42 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Stop adding uuid= in keyfiles https://review.opendev.org/c/opendev/glean/+/953320 | 14:43 |
opendevreview | Clark Boylan proposed opendev/system-config master: Update etherpad to v2.3.1 https://review.opendev.org/c/opendev/system-config/+/953328 | 15:17 |
fungi | popping out to run lunch errands, back in an hour | 15:20 |
frickler | corvus: was there a change in zuul/nodepool (likely deployed this weekend), that would make private_ipv4 point to the FIP instead of the internal tenant IP? cf. https://c4d77360c4f137c70770-625a0eb0440aa527fbdb216e8991f5a6.ssl.cf1.rackcdn.com/openstack/2b5b978a4bde4afa913491107676f422/zuul-info/inventory.yaml | 15:21 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM force etherpad failure to hold node https://review.opendev.org/c/opendev/system-config/+/840972 | 15:25 |
frickler | compare to an earlier run https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_900/openstack/9002bca23bf34476bdcaa7ee2b79c981/zuul-info/inventory.yaml | 15:26 |
clarkb | doing a nodepool list --detail shows that the private ip == public ip for all nodes in raxflex sjc3 right now | 15:27 |
clarkb | I wonder if this is an openstack sdk update that got pulled in by other changes | 15:28 |
clarkb | there was a new release on june 3 | 15:28 |
frickler | that's another option, yes. do we pull in only released sdk? | 15:28 |
clarkb | yes nodepool only uses released sdk. However, june 3 is long enough ago that I think we would have seen this problem sooner. I know I updated nodepool here: https://review.opendev.org/c/zuul/nodepool/+/952587 which merged on the 13th | 15:29 |
corvus | may 7, june 13, june 23 are the update dates for the nodepool latest image | 15:30 |
frickler | nothing obvious in the sdk git log either. checking nodepool next | 15:30 |
frickler | or maybe raxflex changed something in their API? nodepool doesn't have any obvious changes either | 15:31 |
clarkb | yes that seems possible as well | 15:32 |
frickler | do we have an easy way to find a zuul-launcher run on raxflex? just to compare the inventory there | 15:32 |
clarkb | DFW3 does not exhibit this issue | 15:33 |
clarkb | it is our only other floating ip cloud in nodepool. To me that does seem to point at something region specific and not nodepool or openstacksdk specific | 15:33 |
corvus | is this all 3 old rax regions, or only rax-dfw? | 15:34 |
frickler | ok, ok, that would support the API version theory | 15:35 |
frickler | dfw3 is the second raxflex region I think? | 15:35 |
clarkb | yes sjc3 and dfw3 are both in raxflex. They are the only nodepool providers using floating IPs. rax classic has public and private ips on separate interfaces but both are directly attached to the nodes | 15:35 |
clarkb | let me see what nodepool list --detail reports for rax classic | 15:36 |
corvus | what's the behavior for raxflex-sjc3? | 15:36 |
corvus | the two most recent nodepool builds (from june 13 and june 23) both used openstacksdk-4.6.0-py3-none-any.whl | 15:36 |
clarkb | corvus: raxflex-sjc3 has public ip == private ip using the public ip address. raxflex-dfw3, rax-dfw, rax-iad, rax-ord all have different public and private ips on each node | 15:37 |
corvus | oh ok, so raxflex-sjc3 is the only place we see this | 15:38 |
clarkb | corvus: the problem is that since raxflex-sjc3 uses floating ips jobs are running assuming they can bind sockets to the private ip safely (but not the public ip). But this assumption breaks when we don't have nodepool properly setting the private ip | 15:38 |
clarkb | yes raxflex-sjc3 seems to be unique here | 15:38 |
clarkb | I think if raxflex-dfw3 exhibited this behavior I would be more skeptical of nodepool or openstacksdk themselves. Given their behavior differs I wonder if there is some configuration difference (either in our clouds.yaml or in the cloud resources themselves) that causes this change | 15:39 |
frickler | hmm, doing "openstack server list" on sjc3 + dfw3 show both public+private IPs as expected | 15:40 |
clarkb | frickler: ya and server show against instances in both shows entries that look similar to my eye | 15:40 |
frickler | so the API side seems fine | 15:40 |
clarkb | we probably need to see how openstacksdk distinguishes and work from there | 15:41 |
frickler | I'm wary of doing deeper nodepool debugging right now in case the issue goes away with z-l | 15:41 |
corvus | nodepool may be involved here -- nodepool does its own server listing for efficiency | 15:42 |
corvus | let me see if i can do a manual version of that on bridge or something... | 15:42 |
clarkb | corvus: nodepool seems to do a listing then when a list entry matches it calls sdk's expand_server_interfaces() function. That function then does `server['private_v4'] = get_server_private_ip(server, cloud) or ''` | 15:46 |
clarkb | get_server_private_ip() has a docstring explaining its method of determining the private ip. I'm guessing that process is what fails for us | 15:47 |
clarkb | I've started looking at the mechanics of replacing zookeeper cluster nodes and I have realized that while nodepool will automatically detect changes to the zookeeper server list and update its zk client, it doesn't appear that zuul does so. Zuul's zk client tooling does have the same resetHosts() method that is found in nodepool but it doesn't seem that anything calls it. Do we think | 16:02 |
clarkb | adding support for that in zuul makes sense before replacing zk servers or should I just plan to do a full zuul restart each time a zk host is added/removed to the cluster? | 16:02 |
corvus | clarkb: what about a restart to add 3 new servers, then a restart to remove the old one. it shouldn't matter that dns doesn't resolve. | 16:03 |
corvus | s/old one/old ones/ | 16:03 |
clarkb | corvus: and then manage the actual cluster in a rolling fashion behind the scenes to avoid split brain potential? | 16:04 |
corvus | ya | 16:04 |
frickler | oh, the working example I posted earlier was from dfw3 because I was focusing only on raxflex. so possibly sjc3 has been broken for longer, just did not hit the issue often enough to be noticed | 16:04 |
clarkb | that should work. One complication for that is we get the list of servers from ansible inventory so I'll need to add all of them to our inventory. But I think that is manageable with the emergency file list to avoid the cluster growing/shrinking inappropriately | 16:05 |
corvus | could you just grow to 6 then shrink to 3? | 16:05 |
clarkb | I think technically you can. The main risk there is if during that period we split brain into 3 and 3 there won't be a correct winner. However, maybe we just grow to 6 then immediately shutdown one of the old ones so that we're at 5 | 16:06 |
clarkb | then we can restart zuul so that it knows to connect to the new ones, then shutdown the remaining 2 old servers, then restart zuul again so that it stops trying to use the old servers | 16:07 |
rubencabrera[m] | Hi! This is my first contribution and I don't know if I'm missing anything to get it merged: https://review.opendev.org/c/openstack/diskimage-builder/+/952875 | 16:07 |
rubencabrera[m] | Zuul gave a +1 but both `Verified` and `Workflow` are `UNSATISFIED`. Does this need any action on my side? | 16:07 |
frickler | the only delta I see between the regions is that for a "network list", PUBLICNET comes before opendevzuul-network1 in SJC3 and the other way round in DFW3. since neither is named "public" or "private", this might affect the magic that mordred wove into the sdk | 16:08 |
corvus | https://paste.opendev.org/show/bClgsWkAoGHM6F7YaSyI/ | 16:09 |
clarkb | rubencabrera[m]: when a reviewer approves the change that will satisfy the workflow vote requirement. That triggers CI to test and ensure that the change continues to work against the last tip of master and if so CI +2's on the verify label which satisfies that requirement. Then the CI system will automatically submit the change | 16:09 |
frickler | rubencabrera[m]: this change is just waiting for a second review, no action needed on your side | 16:09 |
corvus | frickler: clarkb i ran that paste on bridge, once for each region | 16:09 |
corvus | for sjc3 i got output like: 66.70.103.103 10.0.17.62 66.70.103.103 | 16:10 |
corvus | for dfw3 i got: 174.143.59.80 10.0.19.227 174.143.59.80 | 16:10 |
corvus | that looks correct to me | 16:10 |
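For reference, the kind of per-region check corvus pasted can be reproduced with a few lines of openstacksdk; the real script is in the paste link above, so treat this as an illustrative sketch only. The cloud and region names are assumptions standing in for whatever the clouds.yaml entries are called, and the field names follow the get_server_private_ip()/expand_server_interfaces() behavior quoted earlier.

```python
# Rough sketch of the region comparison, not the actual pasted script.
import openstack

for region in ('SJC3', 'DFW3'):  # region names assumed for illustration
    conn = openstack.connect(cloud='raxflex', region_name=region)
    for server in conn.list_servers():
        # list_servers() runs the same address-expansion logic nodepool relies
        # on, so these fields should mirror `nodepool list --detail` output.
        print(region, server['name'], server['interface_ip'],
              server['private_v4'], server['public_v4'])
```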
rubencabrera[m] | Thank you very much for the kind explanation! | 16:10 |
clarkb | corvus: agreed that is what I would expect in a working situation | 16:11 |
clarkb | corvus: I wonder if there is a difference in our clouds.yaml files? | 16:11 |
frickler | corvus: yes, still we see something else than 10.0.16/20 in private_ipv4 in the inventory | 16:12 |
clarkb | you know, there were those issues with api response time in sjc3 recently-ish | 16:12 |
corvus | yeah, and nodepool list is still showing the bad behavior, so new nodes are getting that... | 16:12 |
clarkb | I wonder if nodepool's openstacksdk instance for the cloud cached bad/incomplete data as a result of that | 16:12 |
corvus | i'll try running the test script on nl05 to eliminate/confirm clouds.yaml | 16:12 |
corvus | if that doesn't get us anywhere -- maybe clarkb is suggesting we turn it off and on again? :) | 16:13 |
clarkb | corvus: ya I think that is what I'd do next unfortunately | 16:14 |
corvus | running the test script in the container on nl05 for sjc3 produces normal looking data | 16:15 |
corvus | i think we're at turn it off and on again | 16:15 |
corvus | i can do that if other folks are ready for that step | 16:16 |
clarkb | I'm ready if you are | 16:17 |
corvus | restarting now | 16:17 |
corvus | behavior changed | 16:19 |
clarkb | wow ok | 16:19 |
corvus | | 0041230215 | raxflex-sjc3 | ubuntu-noble | 13768ec8-acbd-4d7e-8b77-acbcdbe9b759 | 66.70.103.127 | | ready | 00:00:00:34 | locked | main | | 10.0.17.160 | az1 | 16:19 |
clarkb | corvus: the heuristics in get_server_private_ip() must rely on cached data then | 16:20 |
clarkb | which is probably good 99.9% of the time :) | 16:20 |
corvus | that 10. ip is the private ip | 16:20 |
corvus | yeah | 16:20 |
frickler | ok, I just found a working example from z-l https://zuul.opendev.org/t/zuul/build/1a5d69e6895d4dc892d9f4bfb09096af/log/zuul-info/inventory.yaml#22 | 16:20 |
corvus | presumably zl didn't happen to have the bad cached data; i would guess it would have been affected if it did | 16:21 |
clarkb | corvus: ya it's likely a timing thing from the June 23 nodepool update since we auto restart nodepool launchers | 16:22 |
clarkb | and June 23 is when there were api issues in that region | 16:22 |
frickler | that would match the starting date mentioned in https://bugs.launchpad.net/bugs/2115338 , I'll update that bug now | 16:22 |
clarkb | re zk I think I've convinced myself it should be safe to grow to 6 and then immediately reduce to 5. Then restart services. Then reduce to 3, then restart services again to avoid unnecessary config. The process will be: add 3 new zk's to inventory and have them deploy; immediately after that is done, put zk06 in the emergency file and shut down zk on that server. Then do a complete zuul | 16:24 |
clarkb | restart. Then remove zk04-06 from inventory and shutdown their zk instances. Then do another complete zuul restart. | 16:24 |
clarkb | infra-root ^ let me know if there are any concerns with that approach. I can get things started by booting new servers and updating dns before we have to commit to the actual replacement strategy | 16:25 |
clarkb | corvus: and I guess let me know how you think we should coordinate that around other zuul updates that are in flight | 16:25 |
clarkb | frickler: corvus sdk's _find_interesting_networks() has `if self._network_list_stamp: return` in it. I think that is the caching check | 16:27 |
clarkb | interestingly this means if you add new networks later I think you may be in a similar situation? | 16:27 |
clarkb | for zk, I think the major risk is if we simultaneously turned up 3 new zookeepers because then they would have equivalent quorum power to the existing 3? | 16:30 |
clarkb | but our playbook does one zookeeper at a time so I think we'll add one at a time and quorum of 3 then 4 then 5 will beat 1 | 16:31 |
clarkb | hrm except we may need to restart each of the old servers to pick up the new server membership | 16:38 |
corvus | i think it would be too much work to coordinate zuul upgrades with zk; let's keep them orthogonal | 16:40 |
clarkb | ya each zk server currently has a list of all servers in the cluster in its static config. Our playbooks don't seem to do anything with dynamic config (they may predate that feature existing?). I think this means we would need to restart the existing servers for them to be aware of the new servers. And since we go in increasing order by zk number that wouldn't happen until all three zk | 16:42 |
clarkb | servers are up | 16:42 |
clarkb | *until all three new zk servers are up | 16:42 |
clarkb | and it doesn't look like we auto restart on config changes either so it would need to be manual or update the ansible. I could add each new zk one at a time ensuring that it becomes part of the quorum successfully before adding the next I guess | 16:43 |
fungi | wow, i miss a lot by going to lunch! | 16:43 |
clarkb | yes I think we should consider adding one at a time. This way we can ensure each node is added properly by restarting all non-leader nodes then the leader last and checking that we have a new leader and quorum. Then add another server. | 16:49 |
clarkb | rinse and repeat. Once we get all new servers into the cluster we can shut down 06 and then work on zuul restarts | 16:49 |
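A minimal sketch of the per-step check described above: after each restart, confirm the ensemble still reports exactly one leader before moving on. Hostnames are placeholders, and the `srvr` four-letter command must be permitted by the server's command whitelist (it is in the default whitelist).

```python
# Hypothetical quorum/leader check between rotation steps; hosts are examples.
from kazoo.client import KazooClient

HOSTS = ['zk01.opendev.org', 'zk04.opendev.org', 'zk05.opendev.org', 'zk06.opendev.org']

modes = {}
for host in HOSTS:
    client = KazooClient(hosts=f'{host}:2181')
    client.start(timeout=10)
    # "srvr" output includes a "Mode: leader|follower" line for the member.
    for line in client.command(b'srvr').splitlines():
        if line.startswith('Mode:'):
            modes[host] = line.split()[1]
    client.stop()

print(modes)
assert list(modes.values()).count('leader') == 1, 'expected exactly one leader'
```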
mordred | looks like I missed some fun here with sdk and everyone's favorite logic to figure out what the ip address is | 16:51 |
mordred | and yeah - there's definitely caching in there because networks usually don't change nearly as frequently as servers do, and the number of API calls needed to build up the metadata is pathological | 16:51 |
mordred | probably wouldn't be a bad idea to add in a staleness/expiration though | 16:52 |
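Not the sdk's actual code, just a sketch of the staleness/expiration idea mordred mentions, layered on top of the `_network_list_stamp`-style flag described above. The TTL value is arbitrary.

```python
import time

NETWORK_CACHE_TTL = 3600  # seconds; arbitrary value for illustration

class NetworkCache:
    """Cache a network listing, but re-fetch it once it gets old enough."""

    def __init__(self):
        self._networks = None
        self._stamp = None  # time of the last successful refresh

    def get(self, list_networks):
        now = time.time()
        if self._stamp is None or (now - self._stamp) > NETWORK_CACHE_TTL:
            # Re-list networks rather than trusting data that may have been
            # cached during an API outage (the failure mode seen above).
            self._networks = list_networks()
            self._stamp = now
        return self._networks
```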
fungi | if i'd known in advance i would have popped some popcorn | 16:52 |
fungi | well, i guess there's still time... | 16:52 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add config option to disable zookeeper process management https://review.opendev.org/c/opendev/system-config/+/953337 | 17:02 |
clarkb | I'm not convinced ^ is actually helpful yet. But I think it may be. Basically this would allow me to add all three new zk's to inventory so that our config files update without restarting any of the zookeeper processes to pick up the change. Then I can manually start each new zk server, restarting the non-leaders and then the leader in turn, to build up the new quorum relatively quickly | 17:03 |
clarkb | and then when that is done we remove a non leader zk from the 04-06 set from the inventory and repeat | 17:03 |
clarkb | I think what I'm beginning to realize is this process is a bit complicated which is made worse by zuul needing restarts to pick up the changes on the zuul side and also our relatively slow deployment process for inventory updates (which is thankfully much faster than it once was) | 17:05 |
clarkb | hrm except when I start zk01 and restart zk04-06 they will all have zk02-03 in their configuration as well | 17:07 |
opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: Add Rocky Linux 10 support to rocky-container element https://review.opendev.org/c/openstack/diskimage-builder/+/952548 | 17:07 |
clarkb | which gets us back to the 3 vs 3 problem | 17:07 |
clarkb | so ya I think we really want to do these one at a time to keep as many things on the quorum side as possible while we restart services | 17:08 |
clarkb | I wonder if I can find the old etherpad from when I did this last time | 17:09 |
fungi | you could do pattern search queries in the backing db | 17:10 |
clarkb | zookeeper is the backing db? this is why I want to be careful | 17:11 |
fungi | i meant etherpad's db | 17:11 |
fungi | searching for the pad from where you last did this | 17:11 |
clarkb | ah | 17:12 |
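A hypothetical version of fungi's db-search suggestion, assuming Etherpad's default ueberDB MySQL layout (a key/value table named `store`, with pads keyed `pad:<name>`); the credentials and search term are placeholders, and clarkb ends up finding the pad a different way just below.

```python
# Hypothetical pad search against Etherpad's backing MySQL db; schema assumed.
import pymysql

conn = pymysql.connect(host='localhost', user='etherpad',
                       password='secret', database='etherpad')
with conn.cursor() as cur:
    cur.execute(
        "SELECT `key` FROM store "
        "WHERE `key` LIKE 'pad:%%' AND `key` NOT LIKE 'pad:%%:%%' "
        "AND `value` LIKE %s",
        ('%zookeeper%',))
    for (key,) in cur.fetchall():
        print(key)  # e.g. pad:opendev-zookeeper-upgrade-2021
conn.close()
```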
clarkb | https://etherpad.opendev.org/p/opendev-zookeeper-upgrade-2021 | 17:14 |
clarkb | I found it looking at the created timestamp of zk04 and checking irc logs on that date | 17:14 |
fungi | also an option i've used in the past | 17:14 |
clarkb | ok I'm glad I dug that up. I like it because it is something I've done before so it is somewhat proven, but it also seems that I did a manual zuul config update so that only one zuul restart was needed | 17:17 |
clarkb | basically the process used before was to replace the first two servers one at a time so that we're always running with a quorum of three. Then with 2 new servers and one old server we update the zuul config out of band to contain only the three new servers (our final config state) and restart all of zuul. Then we replace the last server | 17:18 |
clarkb | I need to take a break but I'll copy that over into a new document and start filling in details for this round | 17:19 |
clarkb | corvus: both service-zookeeper.yaml and service-zuul.yaml include a noopy debug task against the zookeeper group with comments indicating that this is done to ensure that group host vars are populated for later use by configuration files listing the servers. service-nodepool.yaml does not do this. Do you know if service-nodepool.yaml is figuring it out through some other mechanism or | 17:34 |
clarkb | maybe this is no longer necessary? | 17:34 |
clarkb | oh one difference is nodepool does the zookeeper group lookup within ansible itself while the other two do it within jinja templates. Maybe that explains it | 17:36 |
corvus | i've read the commit 446b24c52f696e7c0912072ed486c9ba507228c4 and i still don't understand it | 17:37 |
corvus | oh i might understand it | 17:38 |
clarkb | you know I wonder if we were relying on hostnames with ansible inventory in the past and not hardcoded ips back then | 17:39 |
clarkb | so we'd have to connect to the host and gather facts to populate the ip address information. But now we hardcode that into the inventory so maybe its always available? | 17:39 |
corvus | i think he was saying that disabled hosts won't end up in the hostvars, so he's running a task on all the hosts | 17:39 |
clarkb | ohhhh | 17:40 |
clarkb | also reading my old etherpad the very end of the etherpad has a list of considerations that I need to followup on which may impact whether or not we use zk01, zk02, zk03 again | 17:40 |
corvus | server.{{ host | regex_replace('^zk(\d+)\.open.*\.org$', '\1') | int }}={{ (hostvars[host].public_v4) }}:2888:3888 | 17:40 |
clarkb | specifically whether or not we can add a lower numbered id to the existing cluster and whether or not a new zk01 would somehow conflict with the ancient zk01 | 17:41 |
corvus | that's from zoo.cfg and looks like it's still using hostvars | 17:41 |
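For illustration, the template line quoted above derives the numeric server id from the hostname and pairs it with that host's public IPv4; a quick Python sketch of the same transformation (the IP addresses are made up):

```python
# Mirror of the regex_replace in the zoo.cfg template above; IPs are examples.
import re

example_hosts = {'zk01.opendev.org': '203.0.113.11', 'zk04.opendev.org': '203.0.113.14'}

for host, public_v4 in example_hosts.items():
    server_id = int(re.sub(r'^zk(\d+)\.open.*\.org$', r'\1', host))
    print(f'server.{server_id}={public_v4}:2888:3888')
# server.1=203.0.113.11:2888:3888
# server.4=203.0.113.14:2888:3888
```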
clarkb | corvus: playbooks/roles/nodepool-base/tasks/main.yaml also uses hostvars though | 17:42 |
corvus | yeah... so maybe with our change to inventory that's not necessary? | 17:42 |
clarkb | I think either nodepool is buggy or that extra step is no longer necessary | 17:42 |
clarkb | "You do not want an existing id value to join without having knowledge of transactions previously ACKd by the id." and "A new server with a lower id than has previously been in the cluster may refuse to join because ZK considers id ordinality. Higher values can join clusters of lower values." are the etherpad considerations I'm now worried about | 17:43 |
clarkb | corvus: I suspect we'll be ok ignoring that difference in setup behavior between the playbooks for now | 17:43 |
clarkb | corvus: it is easy enough to manually edit the nodepool config if it comes to that | 17:43 |
corvus | clarkb: there's something fishy about option b... i'm pretty sure you can blow away the data for one of the nodes and it can recover from the others | 17:51 |
corvus | like.... without that ability... the system would not be very robust... pretty much any outage would cause it to fail | 17:52 |
corvus | so i'm interested in where "You do not want an existing id value to join without having knowledge of transactions previously ACKd by the id." comes from | 17:52 |
clarkb | corvus: ya option A is what we ultimately used last time. I'm currently trying to understand that consideration now | 17:53 |
clarkb | corvus: I found https://serverfault.com/questions/760320/zookeeper-faulty-cluster-member-replacement which isn't super helpful but does seem to point in that direction | 17:54 |
corvus | tbh, i'd also like to know where "A new server with a lower id than has previously been in the cluster may refuse to join because ZK considers id ordinality. Higher values can join clusters of lower values." comes from too | 17:54 |
corvus | that SO question also just has an unsourced assertion | 17:56 |
corvus | i just created 3 zk servers, set a timestamp value at /test using a connection to zk01, deleted zk01, started zk01 again with no data, reconnected to zk01, got the value (it was correct) and set another one. | 18:04 |
corvus | i think that suggests that "delete the entire data directory and start from scratch" is a viable replacement or recovery strategy for zk servers. (which has been my understanding for a while) | 18:05 |
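A rough re-creation of the experiment corvus describes, using the kazoo client; the hostnames are placeholders for a disposable three-node test ensemble, and the wipe/restart of the first member happens out of band.

```python
import time
from kazoo.client import KazooClient

# Write a value while all three test members are healthy.
zk = KazooClient(hosts='zk-test1:2181,zk-test2:2181,zk-test3:2181')
zk.start()
zk.create('/test', str(time.time()).encode(), makepath=True)

# ... wipe zk-test1's data directory and restart it out of band ...

# Reconnect only to the rebuilt member and confirm it synced from its peers.
zk1 = KazooClient(hosts='zk-test1:2181')
zk1.start()
value, _stat = zk1.get('/test')
print('recovered value:', value.decode())
zk1.set('/test', str(time.time()).encode())  # and it still accepts writes

zk1.stop()
zk.stop()
```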
clarkb | reading up on zab it seems that if a server knows of a higher zxid than the leader this can cause problems. However with data loss you'd not know of any zxids so the leader continues as is | 18:06 |
corvus | yes, that makes sense | 18:07 |
clarkb | and the server id value is used for zxid collisions. With the higher id winning | 18:07 |
clarkb | so if two servers are aware of the same zxid then the one with the larger myid value would become the leader | 18:08 |
clarkb | for "You do not want an existing id value to join without having knowledge of transactions previously ACKd by the id." I wonder if the issue is that other members may "know" that a server that went away and came back had a certain zxid level ackd and if it doesn't internally know about that it could become a leader and create problems? | 18:09 |
clarkb | but thats a bit of a stretch I'm still trying to understand the impact of the server id on operations within the larger cluster | 18:10 |
corvus | i just removed zk01, added zk04 (empty disk), performed a transaction, removed zk04, added zk01 (empty disk) and it worked fine | 18:14 |
corvus | so i have doubts about both of those assertions | 18:14 |
corvus | the quorum members talk to each other and sync missing transactions | 18:15 |
clarkb | ack | 18:16 |
clarkb | I found "Never list more than one joiner as participant in the initial configuration" in https://zookeeper.apache.org/doc/r3.8.0/zookeeperReconfig.html so I think we do want to add one server at a time at least | 18:17 |
clarkb | you can add more than one at a time but it requires additional coordination listing the extra servers as observers rather than participants | 18:17 |
clarkb | which confirms my concern about creating a split brain on startup | 18:18 |
corvus | that seems fair | 18:18 |
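The ZooKeeper reconfig document being quoted covers dynamic reconfiguration; opendev manages a static server list and restarts instead, but for comparison a one-joiner-at-a-time dynamic add could look roughly like the sketch below. The server spec, hostnames, and auth details are assumptions, and reconfig must be enabled on the ensemble.

```python
# Hypothetical dynamic-reconfig sketch (ZK 3.5+ with reconfigEnabled=true);
# not what opendev's ansible does -- we keep static configs and restart.
from kazoo.client import KazooClient

zk = KazooClient(hosts='zk04.opendev.org:2181')
zk.start()
# reconfig requires appropriate (super)user auth in a real deployment.
zk.add_auth('digest', 'super:examplepassword')

# Add exactly one new participant, per the "never list more than one joiner
# as participant" guidance quoted above.
zk.reconfig(
    joining='server.1=zk01.opendev.org:2888:3888;2181',
    leaving=None,
    new_members=None)
zk.stop()
```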
clarkb | server ids must also be unique which seems like a given | 18:20 |
clarkb | but ya I'm having a hard time finding any indication that we can't use a lower numbered id or that we can't go back to using an id from several years ago | 18:21 |
clarkb | corvus: ya so far I have only been able to determine that myid must be unique for each ensemble member and that the values must be between 1 to 255. Then the value influences leader election for tie breaking | 18:33 |
corvus | ++ | 18:33 |
clarkb | so I think I can use option A to upgrade again and just add 01 then 02 then 03 similar to what was done before and things should work | 18:34 |
clarkb | I just have to add each new node to the cluster one at a time | 18:34 |
corvus | sgtm | 18:35 |
clarkb | and then option A only needs one zuul restart | 18:36 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Ensure files are created with 0600 perms https://review.opendev.org/c/opendev/glean/+/953276 | 18:38 |
clarkb | corvus: something like this: https://etherpad.opendev.org/p/opendev-zookeeper-upgrade-2025 for the process | 18:51 |
clarkb | there was a naive part of me that thought I'd be mostly done by the end of today and now I feel like I'm just starting heh | 18:53 |
clarkb | corvus: that doc indicates using the zuul_restart.yaml playbook after 2 of 3 nodes are rotated in to pick up the future state of all three new servers in zuul config. I think that is the non graceful restart. That is still probably the best way to do this right? | 18:59 |
clarkb | I guess I'll start booting new servers after lunch and can update that document with even more details and update the ansible playbook to update zuul under the hood | 19:01 |
corvus | the graceful will take 24h... non-graceful is probably fine unless there's some kind of critical release job running... but even then they should be idempotent, so probably ok | 19:02 |
clarkb | ack | 19:02 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Ensure files are created with 0600 perms https://review.opendev.org/c/opendev/glean/+/953276 | 20:04 |
opendevreview | Michal Nasiadka proposed opendev/glean master: Stop adding uuid= in keyfiles https://review.opendev.org/c/opendev/glean/+/953320 | 20:05 |
clarkb | arg I forgot to add the --image flag to launch node and my new zk01 is booting on jammy. I'll cleanly delete it and clear out the zk01 ansible facts after it's done, then try again | 20:12 |
clarkb | "Exception: Node has less than 2 CPUs" <- the check I added to catch the noble issue tripped. Glad I added that now | 20:43 |
fungi | nice! glad it's actually working | 20:44 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Add DNS records for three new zk servers https://review.opendev.org/c/opendev/zone-opendev.org/+/953353 | 20:55 |
opendevreview | Clark Boylan proposed opendev/system-config master: Small playbook to set zuul zk hosts config https://review.opendev.org/c/opendev/system-config/+/788342 | 20:58 |
opendevreview | Clark Boylan proposed opendev/system-config master: Replace zk06 with zk01 https://review.opendev.org/c/opendev/system-config/+/951164 | 21:02 |
clarkb | I think it is safe to land 953353 whenever. 788342 doesn't need to be merged. I've just pushed an update to it for record keeping purposes and clarity. 951164 should probably be landed first thing on $day so that we can try and get through the entire cluster in one day. If reviewers are happy with these changes and the process outlined in | 21:03 |
clarkb | https://etherpad.opendev.org/p/opendev-zookeeper-upgrade-2025 and tomorrow doesn't have any early fires I can probably start then. otherwise I'll wait for a quiet morning where I think I can get through the whole set in one big push | 21:03 |
corvus | 953353+3 | 21:04 |
corvus | i think that the zuul launchers have not been able to keep up with image uploads due to all the recent errors and restarts. things are looking better with the current code, but i think i'd like to see them fully catch up before we drown the logs in node launches | 21:05 |
opendevreview | Merged opendev/zone-opendev.org master: Add DNS records for three new zk servers https://review.opendev.org/c/opendev/zone-opendev.org/+/953353 | 21:05 |
corvus | so i think i'm going to let that run for a bit before i look into re-applying the openstack move | 21:06 |
clarkb | wfm | 21:06 |
opendevreview | Clark Boylan proposed opendev/system-config master: Update etherpad to v2.3.2 https://review.opendev.org/c/opendev/system-config/+/953328 | 22:40 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM force etherpad failure to hold node https://review.opendev.org/c/opendev/system-config/+/840972 | 22:40 |
clarkb | etherpad 2.3.1 failed in our testing but then a 2.3.2 showed up. Here's hoping that fixes the problem | 22:40 |