Monday, 2021-11-08

@jim:acmegating.comopendev is still running master with 2 schedulers.  so far it's still looking good and i'm inclined to leave it running in that configuration into the workday tomorrow.01:07
-@gerrit:opendev.org- Felix Edel proposed:07:37
- [zuul/zuul] 814996: Make the ConfigLoader work independently of the Scheduler https://review.opendev.org/c/zuul/zuul/+/814996
- [zuul/zuul] 816361: Load system config and tenant layouts in zuul-web https://review.opendev.org/c/zuul/zuul/+/816361
- [zuul/zuul] 816362: Implement job freezing API in zuul-web https://review.opendev.org/c/zuul/zuul/+/816362
- [zuul/zuul] 816514: Implement managenet events directly in zuul-web https://review.opendev.org/c/zuul/zuul/+/816514
- [zuul/zuul] 816783: Implement autohold endpoints directly in zuul-web https://review.opendev.org/c/zuul/zuul/+/816783
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 816972: Use election for dispatching timer events https://review.opendev.org/c/zuul/zuul/+/816972 09:39
-@gerrit:opendev.org- Simon Westphahl proposed:09:40
- [zuul/zuul] 815787: Refresh pipelines in tests when settled https://review.opendev.org/c/zuul/zuul/+/815787
- [zuul/zuul] 815278: DNM: execute tests with two schedulers https://review.opendev.org/c/zuul/zuul/+/815278
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 816993: Remove pf-c-content CSS class from status page https://review.opendev.org/c/zuul/zuul/+/816993 11:53
@felixedel:matrix.orgcorvus: ^ A small UI improvement for the headings on the status page.11:54
@felixedel:matrix.orgcorvus mhu The stack around https://review.opendev.org/c/zuul/zuul/+/814711/1 also includes some small UI fixes for the buildset result page. Would be great if somebody could have a look at those :)11:58
-@gerrit:opendev.org- Simon Westphahl proposed on behalf of Felix Edel: [zuul/zuul] 816807: Split up registerScheduler() and onLoad() methods https://review.opendev.org/c/zuul/zuul/+/816807 12:41
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 816993: Remove pf-c-content CSS class from status page https://review.opendev.org/c/zuul/zuul/+/816993 13:29
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 768115: Web UI: allow a privileged user to request autohold https://review.opendev.org/c/zuul/zuul/+/768115 13:30
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 768199: Web UI: add Autoholds, Autohold page https://review.opendev.org/c/zuul/zuul/+/768199 13:31
-@gerrit:opendev.org- Simon Westphahl proposed on behalf of Felix Edel:13:42
- [zuul/zuul] 816807: Split up registerScheduler() and onLoad() methods https://review.opendev.org/c/zuul/zuul/+/816807
- [zuul/zuul] 814996: Make the ConfigLoader work independently of the Scheduler https://review.opendev.org/c/zuul/zuul/+/814996
- [zuul/zuul] 816361: Load system config and tenant layouts in zuul-web https://review.opendev.org/c/zuul/zuul/+/816361
- [zuul/zuul] 816362: Implement job freezing API in zuul-web https://review.opendev.org/c/zuul/zuul/+/816362
- [zuul/zuul] 816514: Implement managenet events directly in zuul-web https://review.opendev.org/c/zuul/zuul/+/816514
- [zuul/zuul] 816783: Implement autohold endpoints directly in zuul-web https://review.opendev.org/c/zuul/zuul/+/816783
-@gerrit:opendev.org- Simon Westphahl proposed:13:42
- [zuul/zuul] 817003: Store pipeline status for zuul-web in Zookeeper https://review.opendev.org/c/zuul/zuul/+/817003
- [zuul/zuul] 817004: Use pipeline status from Zookeeper in zuul-web https://review.opendev.org/c/zuul/zuul/+/817004
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 810699: Web UI: Show pipeline types as icons https://review.opendev.org/c/zuul/zuul/+/810699 13:57
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 781858: web UI: allow a privileged user to promote a change https://review.opendev.org/c/zuul/zuul/+/781858 14:01
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 802559: Web UI: Add "Create Autohold Request" form, improve API error messages https://review.opendev.org/c/zuul/zuul/+/802559 14:30
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] 769943: Example Docker compose: keycloak integration https://review.opendev.org/c/zuul/zuul/+/769943 14:30
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 816208: WIP Use AuthProvider https://review.opendev.org/c/zuul/zuul/+/816208 14:31
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 817035: Fix race condition when updating node requests https://review.opendev.org/c/zuul/zuul/+/817035 15:12
@shrews:matrix.orgIs it possible to control the commit message format for a GitHub project gated by Zuul? The format seems to include: PR title, 1st commit message, PR description, reviewers. It would be nice if we could trim that down to NOT include the PR description or reviewers.15:22
@fungicide:matrix.orgwhen you say "the commit message format" do you mean on the merge commit created by zuul merging the change to the public branch in github?15:46
@shrews:matrix.orgfungi: yes15:46
@fungicide:matrix.orgthanks, to be clear i don't know the answer to your question, just making sure i understood what you were asking15:47
@clarkb:matrix.orgI think GitHub does the merging not zuul. Would be a GitHub setting if it exists16:07
@fungicide:matrix.orgthough maybe another argument in favor of having zuul push merge commits16:19
@clarkb:matrix.orgAre the bigger zuul status queue card entries expected?16:46
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 817056: Refresh pipelines before checking for leaked node requests https://review.opendev.org/c/zuul/zuul/+/817056 17:04
@jim:acmegating.comClark: you can fix the bigger card size with a +3 on https://review.opendev.org/816094 17:04
@jim:acmegating.comClark, fungi, tobiash, swest, felixedel: i believe https://review.opendev.org/817056 is causing some jobs in opendev's zuul to wait indefinitely for nodes17:05
@jim:acmegating.comthe case i examined had zuul logging that it deleted a leaked node request, but it wasn't leaked; i think that's why17:06
@jim:acmegating.comi'd like to merge that and then try a zero-downtime restart in opendev to roll out that fix17:07
@clarkb:matrix.org+2 17:09
@clarkb:matrix.orgI've also approved the card sizing in the web ui change after verifying it looks more how I expect in the site preview17:09
@tobias.henkel:matrix.orgYay, zero downtime restarts17:11
@jim:acmegating.comit's going to be fun, no matter what happens! :)17:13
@westphahl:matrix.orgcorvus: we found another case that can lead to jobs waiting indefinitely. Should be fixed by 817035 17:25
@jim:acmegating.comswest: thanks!  Clark, tobiash: ^ want to get that one in too?17:27
@clarkb:matrix.orgyup looking17:28
@clarkb:matrix.orgdo we want to recheck it? not sure how worried I should be about the failed unittest job17:28
@jim:acmegating.comi took a quick look and seriously doubt it's related.  i think just going into gate is ok.17:29
@clarkb:matrix.orgDoes the docstring on the method there need to be updated? it says the lock attribute will be updated which I guess is true for the local system but not zk?17:29
@clarkb:matrix.orgoh wait updateNodeRequest is updating it from zk into memory17:30
@goneri:matrix.orgIs there a way to manually ask nodepool to start a nodeset on a given provider? I'm adding some new images and I would like to validate that all the combinations are working.17:33
@jim:acmegating.comGonéri: you can give them a special label and request that in your test job17:34
@goneri:matrix.orgThis is what I do, but it's also a lot of work.17:36
@tristanc_:matrix.orgGonéri: you can also set a min-ready attribute to 1 17:36
@goneri:matrix.orgoh, good idea. I will try that.17:37
@goneri:matrix.orgActually, with min-ready, I still need one label by provider.17:39
@clarkb:matrix.orgWhat I tend to do these days is tell nodepool to build and upload images then I can manually boot them easily17:49
@clarkb:matrix.orgWhen I do similar stuff for OpenDev you can assume reasonably consistent performance across all images so what you need to check is that all images boot and then you can pick a single image to do some representative testing which reduces the overhead17:50
@goneri:matrix.orgWe've got a lot of inconsistencies (boot-from-volume, different flavors, AWS AMI ID, etc) between our providers and we often discover the problems weeks later.17:57
@clarkb:matrix.orgah I see new image across all providers rather than new provider with a bunch of images18:00
@clarkb:matrix.orgFedora 34 was recently problematic for us in that way as they broke their kernels for VMs.18:01
@clarkb:matrix.orgthen fixed f35 first and left f34 broken for an extended period18:02
@clarkb:matrix.orgcorvus: I think I'm seeing the status page flap. What is interesting to me is that we seem to have done a gate reset in zuul gate but I'm not sure there was ever a fourth change ahead?18:19
@clarkb:matrix.orgis it possible that we're doing inappropriate resets due to job retries?18:19
@jim:acmegating.comyeah, i'm looking at the scheduler logs, and i think the schedulers are actually in disagreement on that18:21
@jim:acmegating.comthey're actually canceling jobs and restarting18:21
@jim:acmegating.comi think the bug is with the "RETRY" job state18:28
@jim:acmegating.comhasAnyJobFailed considers a 'retry' to be a failure18:28
@clarkb:matrix.orgah and it should only do so if the retry limit has been reached18:28
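A minimal sketch of the behavior being discussed at this point, assuming hypothetical `Build` records with an `attempts` counter and a `max_attempts` limit (illustrative names, not Zuul's actual model or code). The naive check treats every `RETRY` result as a failure, while the second only does so once the retry limit has been reached:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Build:
    # Hypothetical stand-in for a build record; not Zuul's class.
    job_name: str
    result: Optional[str] = None  # None means the build is still running
    attempts: int = 1
    max_attempts: int = 3  # assumed per-job retry limit


def has_any_job_failed_naive(builds: List[Build]) -> bool:
    # Any finished result other than SUCCESS counts as a failure,
    # which includes RETRY.
    return any(b.result not in (None, "SUCCESS") for b in builds)


def has_any_job_failed(builds: List[Build]) -> bool:
    # RETRY only counts as a failure once the retry limit is exhausted.
    for b in builds:
        if b.result in (None, "SUCCESS"):
            continue
        if b.result == "RETRY" and b.attempts < b.max_attempts:
            continue
        return True
    return False


builds = [Build("unit", "SUCCESS"), Build("functional", "RETRY", attempts=1)]
print(has_any_job_failed_naive(builds))
print(has_any_job_failed(builds))
```

With the naive check, a single retried job is enough to make the queue item look failed, which is the kind of signal that could trigger a gate reset.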
@clarkb:matrix.orgFWIW this seems to be affecting other tenants' gate queues in the opendev zuul. I suspect the retries were due to the zk issues so not an ongoing problem?18:29
@jim:acmegating.comyeah, i think this is another thing that can be fixed live with dequeue/enqueue18:30
@jim:acmegating.comClark: feel free to do that for openstack, but please leave the one in zuul/gate for a little longer while i continue to look18:30
@clarkb:matrix.orgcorvus: can do. Just dequeue enqueue the changes with retry state jobs?18:30
@jim:acmegating.comyep18:31
@clarkb:matrix.orgThis is not completing very quickly. Just noting that if it isn't expected18:33
@clarkb:matrix.orgbut it does eventually finish18:33
@jim:acmegating.comyes it's very slow18:35
@jim:acmegating.comthe 2 schedulers have 2 different znode versions of the builds18:37
@jim:acmegating.comand that's because they have different buildset znode versions18:40
@jim:acmegating.comand different item versions18:43
@jim:acmegating.comokay, checking over time, they're both changing; so i suspect it's less that the state is out of sync, and more that they're just fighting each other18:46
@clarkb:matrix.orgI'm realizing that most of the openstack check queue (and probably other queues) need to be refreshed too18:47
@clarkb:matrix.orgnoticed it in gate because of the resets being easy to spot but it happens in the other queues too preventing check jobs from completing for a single change18:47
@clarkb:matrix.orgalso this is a bit mind bending to try and make sense of. There is one change that shows up in one scheduler but not the other. Neither appears to have reported it to gerrit18:49
@clarkb:matrix.orgoh wait maybe they are at the end of the status page for one and not the other. More trippy :)18:50
@clarkb:matrix.orgok I'm going to dequeue and reenqueue 54 check entries and hope that helps me make more sense of things18:54
@clarkb:matrix.orgthey are all tenant openstack so shouldn't affect the zuul stuff you asked me to leave alone18:55
@jim:acmegating.comokay, i think i see the issue; we don't check if the build path has changed when we deserialize builds; so we're refreshing existing objects from 2 different paths19:00
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 817067: Verify build path before refreshing from ZK https://review.opendev.org/c/zuul/zuul/+/817067 19:05
@jim:acmegating.comClark, fungi, tobiash, swest: ^ i think that's the issue we're seeing with retry builds19:05
@jim:acmegating.comi think it makes sense we would only see that with retry builds (where we create new builds), and also that we would only see that with a second scheduler19:06
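The idea of verifying the path before refreshing can be sketched roughly as follows. Everything here (`FakeZK`, `Build`, `deserialize_build`) is a hypothetical stand-in written for illustration; it is not the code from change 817067. An in-memory object is only refreshed when it still points at the same path; a retry that created a new build at a new path gets a new object instead:

```python
class FakeZK:
    # Hypothetical in-memory stand-in for a ZooKeeper client.
    def __init__(self):
        self.data = {}

    def get(self, path):
        return dict(self.data[path])


class Build:
    # Hypothetical build object that mirrors one znode.
    def __init__(self, path):
        self.path = path
        self.result = None

    def refresh(self, zk):
        self.result = zk.get(self.path)["result"]


def deserialize_build(zk, existing, job_name, build_path):
    old = existing.get(job_name)
    if old is not None and old.path == build_path:
        # Same znode as before: safe to refresh the existing object.
        old.refresh(zk)
        return old
    # The path changed (for example a retry created a new build),
    # so build a fresh object rather than reusing the old one.
    new = Build(build_path)
    new.refresh(zk)
    existing[job_name] = new
    return new


zk = FakeZK()
zk.data["/builds/1"] = {"result": "RETRY"}
zk.data["/builds/2"] = {"result": None}
builds = {}
first = deserialize_build(zk, builds, "unit", "/builds/1")
second = deserialize_build(zk, builds, "unit", "/builds/2")
print(first is second, first.result, second.result)
```

Without the path comparison, two schedulers that disagree about which path is current would keep overwriting the same in-memory object from different znodes.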
@jim:acmegating.comClark: i may be able to monkeypatch that in using the repl19:07
@spamaps:spamaps.ems.hostSo I have definitely found infinite recursion in the gce provider config code19:07
@spamaps:spamaps.ems.hostI have a very simple reproducer that is a very normal config.19:07
@clarkb:matrix.orgcorvus: I'm observing new behavior from the periodic pipeline showing many neutron 000000 entries after I enqueued a single one19:07
@spamaps:spamaps.ems.hostWorking through how to solve it now, but wondering why others haven't hit it.19:07
@jim:acmegating.comspamaps: you are probably the second person to use that driver19:08
@clarkb:matrix.orgcorvus:  if you can take a look at cleaning up the periodic queue because I'm confused that would be good (but I know you are looking at fixing the underlying issue)19:08
@jim:acmegating.comClark: we know periodic enqueues are weird with 0000 refs; can we just dequeue them and ignore that for now?19:09
@clarkb:matrix.orgcorvus: yes. How do I dequeue the 00000 ref? use --ref 00000 ?19:09
@jim:acmegating.comi think that should work19:10
@clarkb:matrix.orgok I'll try it19:10
@spamaps:spamaps.ems.host> <@jim:acmegating.com> spamaps: you are probably the second person to use that driver19:10
Well then there's my answer. :)
@spamaps:spamaps.ems.hostThere's already an attempt to avoid the recursion19:12
@spamaps:spamaps.ems.hostI think it may have been too narrow.19:12
@spamaps:spamaps.ems.hostUltimately what happens is, a pool has a provider has a pool has a provider...19:13
@spamaps:spamaps.ems.hostah so simple19:14
@spamaps:spamaps.ems.hostignore_equality should have ['provider'] but is []19:14
@clarkb:matrix.orgcorvus: "Exception: Unable to find shared change queue for openstack/neutron:0000000000000000000000000000000000000000" fwiw. I also tried the seven digit 0000000 string19:15
@spamaps:spamaps.ems.hostYay, that did it.19:18
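A minimal sketch of the recursion described above, assuming a base class that compares instances attribute by attribute and skips any attribute listed in `ignore_equality`. Only the `ignore_equality` name comes from the discussion; the classes and their layout are hypothetical and are not nodepool's code. A pool that points back at its provider makes the comparison loop until the fix excludes `provider`:

```python
class ConfigValue:
    # Attribute names that __eq__ should skip (hypothetical base class).
    ignore_equality = []

    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return False
        for key, value in self.__dict__.items():
            if key in self.ignore_equality:
                continue
            if value != getattr(other, key, None):
                return False
        return True


class Provider(ConfigValue):
    def __init__(self, name):
        self.name = name
        self.pools = {}


class Pool(ConfigValue):
    def __init__(self, name, provider):
        self.name = name
        self.provider = provider  # back-reference creates a cycle


class FixedPool(Pool):
    # Skipping the back-reference breaks the cycle.
    ignore_equality = ["provider"]


def build(pool_cls):
    provider = Provider("gce")
    provider.pools["main"] = pool_cls("main", provider)
    return provider


try:
    build(Pool) == build(Pool)
except RecursionError:
    print("unfixed comparison recursed")

print(build(FixedPool) == build(FixedPool))
```

The comparison only recurses when two distinct but equivalent object graphs are compared, which matches the observation that the first config load works and the failure shows up on a later comparison.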
@spamaps:spamaps.ems.hostDamn, do I have to get Spotify to sign the openstack CLA?19:19
@jim:acmegating.comspamaps: no cla required for zuul19:19
@spamaps:spamaps.ems.host\o/19:19
@jim:acmegating.combecause of exactly that19:20
@jim:acmegating.comClark: i have monkeypatched the fix; that might abate the need for re-enqueing retry stuff19:21
@jim:acmegating.comClark: it looks like the situation in the zuul gate queue is resolved19:21
@clarkb:matrix.orgcorvus: ok I can ^C my script and then only enqueue what was dequeued?19:22
@jim:acmegating.comClark: yeah i think so19:22
@clarkb:matrix.orgalright doing that now19:22
@jim:acmegating.comClark: then if you could +3 https://review.opendev.org/817067 that'd be great since we're running that in prod :)19:22
@clarkb:matrix.orgcorvus: do you have time to look at dequeuing the 000000 periodic changes? I'm not sure what I should try next there19:23
@jim:acmegating.comClark: can do19:24
@clarkb:matrix.orgthanks19:24
-@gerrit:opendev.org- Clint Byrum proposed: [zuul/nodepool] 817070: Fix infinite recursion in GCE provider https://review.opendev.org/c/zuul/nodepool/+/817070 19:37
@spamaps:spamaps.ems.hostFun.. now I get to figure out how to make my instances accessible by executor/nodepool when this is happening: 19:51
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 817056: Refresh pipelines before checking for leaked node requests https://review.opendev.org/c/zuul/zuul/+/817056 20:12
-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl: [zuul/zuul] 817035: Fix race condition when updating node requests https://review.opendev.org/c/zuul/zuul/+/817035 20:30
@clarkb:matrix.orgLooks like GCE default firewall rules allow ingress port 22 which implies your rules are not default?20:36
@spamaps:spamaps.ems.hostCorrect! I am a tiny fish in a very big Spotify GCE ocean. ;)20:52
@spamaps:spamaps.ems.hostAnd for whatever reason my network-local firewall rules don't seem to be letting me do what I want. ;)20:53
@spamaps:spamaps.ems.hostBut more pressing is trying to ship a patched nodepool.20:53
@jim:acmegating.comspamaps: you said you have a test case; can you attach that to 817070?20:54
@spamaps:spamaps.ems.host> <@jim:acmegating.com> spamaps: you said you have a test case; can you attach that to 817070?20:55
Ah yeah it looks like that didn't make it as I rebased out my flailing, let me add it back in. :)
@jim:acmegating.comsweet thx20:55
@spamaps:spamaps.ems.hostThe one I have is nearly identical to the fixture in the gce unit tests.20:58
@spamaps:spamaps.ems.hostSo I suspect this code isn't exercised.20:58
@spamaps:spamaps.ems.hostIf anything my example is smaller than the one in the unit tests.20:59
@jim:acmegating.comwas it config validation that failed?21:00
@spamaps:spamaps.ems.hostIt was failing when it tried to determine if the config had changed.21:00
@spamaps:spamaps.ems.host```launcher_1      | 2021-11-08 18:53:53,062 ERROR nodepool.NodePool: Exception in main loop:21:01
launcher_1 | 2021-11-08 18:53:53,062 ERROR nodepool.NodePool: Traceback (most recent call last):
launcher_1 | 2021-11-08 18:53:53,062 ERROR nodepool.NodePool: File "/usr/local/lib/python3.9/site-packages/nodepool/launcher.py", line 1095, in run
launcher_1 | 2021-11-08 18:53:53,062 ERROR nodepool.NodePool: self.updateConfig()
launcher_1 | 2021-11-08 18:53:53,062 ERROR nodepool.NodePool: File "/usr/local/lib/python3.9/site-packages/nodepool/launcher.py", line 957, in updateConfig
launcher_1 | 2021-11-08 18:53:53,062 ERROR nodepool.NodePool: provider_manager.ProviderManager.reconfigure(self.config, config,
launcher_1 | 2021-11-08 18:53:53,062 ERROR nodepool.NodePool: File "/usr/local/lib/python3.9/site-packages/nodepool/provider_manager.py", line 52, in reconfigure
launcher_1 | 2021-11-08 18:53:53,062 ERROR nodepool.NodePool: if oldmanager and p != oldmanager.provider:```
@jim:acmegating.comoh, then it's entirely possible that the other user is also hitting this bug, since the config never changes :)21:01
@jim:acmegating.comspamaps: if the existing unit tests aren't exercising that, maybe a config validation test would21:01
@spamaps:spamaps.ems.hostIt may not have even prevented things from working. Are you sure it's not just streaming in the logs right now? ;)21:01
@jim:acmegating.comspamaps: i feel there's a >90% chance that's exactly what's happening :)21:02
@spamaps:spamaps.ems.hostIt doesn't happen on an actual change, it just happens in the loop whenever we call updateConfig()21:03
@spamaps:spamaps.ems.hostand it gets eaten by the except line, so yeah, just logged and ignored.21:04
@spamaps:spamaps.ems.hostThe first config works.21:04
@spamaps:spamaps.ems.hostBecause other is None.21:04
@spamaps:spamaps.ems.host> <@jim:acmegating.com> spamaps: if the existing unit tests aren't exercising that, maybe a config validation test would21:04
I wonder if it would make sense to just make sure updateConfig works twice on every fixture config.
@jim:acmegating.comyeah21:05
@spamaps:spamaps.ems.hostActually we don't have to use updateConfig...21:06
@spamaps:spamaps.ems.hostbut just do an equality check to oneself.21:06
@spamaps:spamaps.ems.hostor to the same one loaded twice.21:06
@spamaps:spamaps.ems.hostthe latter would probably make more sense.21:06
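The "load the same config twice and compare" idea could look roughly like this. The helper name `loads_equal_twice` is hypothetical, and `json.loads` merely stands in for whatever loader a real test would call (nodepool's own loader is not shown here):

```python
import json
from typing import Any, Callable


def loads_equal_twice(load: Callable[[str], Any], source: str) -> bool:
    # Load the same source twice and check that the two results compare
    # equal. A RecursionError during comparison is reported as a failure
    # instead of being allowed to propagate.
    first = load(source)
    second = load(source)
    try:
        return bool(first == second)
    except RecursionError:
        return False


# Illustrative fixture only; not a real nodepool config.
fixture = '{"providers": [{"name": "gce", "pools": [{"name": "main"}]}]}'
print(loads_equal_twice(json.loads, fixture))
```

Comparing two separately loaded copies, rather than one object with itself, avoids identity shortcuts that could hide a broken equality method.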
@spamaps:spamaps.ems.hostI think I have a tiny tweak to the test_driver_gce that catches this bug.21:09
@spamaps:spamaps.ems.hostgah.. still can't run the unit tests native on Mac OS.21:16
@spamaps:spamaps.ems.hostHrm, it's happening on Linux in docker too21:37
@spamaps:spamaps.ems.host```ImportError: cannot import name 'OP_NO_TICKET' from 'urllib3.util.ssl_' (/src/.tox/py39/lib/python3.9/site-packages/urllib3/util/ssl_.py)```21:37
@spamaps:spamaps.ems.hostAnybody know how to get around this?21:37
@clarkb:matrix.orgwe do test nodepool with focal + python3.9 and whatever openssl is on that platform21:38
@spamaps:spamaps.ems.hostI'm literally just mounting nodepool source into the nodepool container from Docker.io and trying to run tox on it.21:38
@clarkb:matrix.orgtox will install stuff to a new venv. The images on docker don't have the -dev(el) packages installed21:39
@clarkb:matrix.orgI wonder if it is just failing to link openssl properly? Though it's weird since that stack should all have wheels now21:39
@spamaps:spamaps.ems.hostlibssl-dev is installed21:40
@clarkb:matrix.orghuh that is unexpected21:41
@spamaps:spamaps.ems.hostNo I mean before I tox'd21:41
@clarkb:matrix.orgya I know, the container builds are supposed to do all the -dev(el) linking in a throwaway image then wheels are copied out from that and installed on the final image without the -dev(el) packages present to keep image sizes down21:41
@clarkb:matrix.orgwe probably have a wrong bindep rule, but also none of that explains the error21:42
@spamaps:spamaps.ems.hostPerhaps I'm not doing some step :-P21:45
@clarkb:matrix.orgspamaps: if I pull zuul/nodepool-launcher and run python and import ssl then >>> ssl.OP_NO_TICKET says <Options.OP_NO_TICKET: 16384>21:48
@spamaps:spamaps.ems.hostClark: I just docker built from the Dockerfile in nodepool's repo and it didn't work there either.21:49
@spamaps:spamaps.ems.hostI didn't bindep.. forgot that existed.. so I'm trying that21:50
@spamaps:spamaps.ems.hostAnd I'm guessing no wheels are ever being used21:52
@spamaps:spamaps.ems.hostbecause every tox -epy39 takes 5-10 minutes.21:52
@spamaps:spamaps.ems.hostjust for the installdeps21:52
@spamaps:spamaps.ems.host```--- import errors ---21:53
Failed to import test module: nodepool.tests.unit.test_driver_aws
Traceback (most recent call last):
File "/usr/local/lib/python3.9/unittest/loader.py", line 436, in _find_test_path
module = self._get_module_from_name(name)
File "/usr/local/lib/python3.9/unittest/loader.py", line 377, in _get_module_from_name
__import__(name)
File "/src/nodepool/tests/unit/test_driver_aws.py", line 24, in <module>
import boto3
File "/src/.tox/py39/lib/python3.9/site-packages/boto3/__init__.py", line 16, in <module>
from boto3.session import Session
File "/src/.tox/py39/lib/python3.9/site-packages/boto3/session.py", line 17, in <module>
import botocore.session
File "/src/.tox/py39/lib/python3.9/site-packages/botocore/session.py", line 29, in <module>
import botocore.credentials
File "/src/.tox/py39/lib/python3.9/site-packages/botocore/credentials.py", line 34, in <module>
from botocore.config import Config
File "/src/.tox/py39/lib/python3.9/site-packages/botocore/config.py", line 16, in <module>
from botocore.endpoint import DEFAULT_TIMEOUT, MAX_POOL_CONNECTIONS
File "/src/.tox/py39/lib/python3.9/site-packages/botocore/endpoint.py", line 22, in <module>
from botocore.awsrequest import create_request_object
File "/src/.tox/py39/lib/python3.9/site-packages/botocore/awsrequest.py", line 24, in <module>
import botocore.utils
File "/src/.tox/py39/lib/python3.9/site-packages/botocore/utils.py", line 32, in <module>
import botocore.httpsession
File "/src/.tox/py39/lib/python3.9/site-packages/botocore/httpsession.py", line 10, in <module>
from urllib3.util.ssl_ import (
ImportError: cannot import name 'OP_NO_TICKET' from 'urllib3.util.ssl_' (/src/.tox/py39/lib/python3.9/site-packages/urllib3/util/ssl_.py)
================================================================================
The above traceback was encountered during test discovery which imports all the found test modules in the specified test_path.
ERROR: InvocationError for command /src/.tox/py39/bin/stestr --test-path ./nodepool/tests/unit run --no-subunit-trace (exited with code 100)
________________________________________________________________________ summary _________________________________________________________________________
ERROR: py39: commands failed
root@74296c2b01e2:/src# .tox/py39/bin/python3
Python 3.9.6 (default, Aug 17 2021, 02:29:16)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
>>> ssl.OP_NO_TICKET
<Options.OP_NO_TICKET: 16384>
>>>```
@clarkb:matrix.orgthe docker image build should use bindep and install all the runtime deps for you. but it is a multistage build with a sacrificial image used just for building all the deps into wheels21:53
@spamaps:spamaps.ems.host🤷21:53
@spamaps:spamaps.ems.hostNote that this is the urllib3 vendored ssl or something like that?21:54
@clarkb:matrix.orgit's https://github.com/urllib3/urllib3/blob/main/src/urllib3/util/ssl_.py#L98-L148 21:54
@spamaps:spamaps.ems.host```>>> import urllib3.util.ssl_21:54
>>> urllib3.util.ssl_
<module 'urllib3.util.ssl_' from '/src/.tox/py39/lib/python3.9/site-packages/urllib3/util/ssl_.py'>
>>> urllib3.util.ssl_.OP_NO_TICKET
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'urllib3.util.ssl_' has no attribute 'OP_NO_TICKET'
>>>```
@clarkb:matrix.orgoh it's importing OP_NO_TICKET from urllib321:55
@clarkb:matrix.organd urllib3 is importing it from ssl?21:55
@spamaps:spamaps.ems.hostno idea21:56
@spamaps:spamaps.ems.hostI didn't miss this part of python21:56
@spamaps:spamaps.ems.hostand "this part" being "boto"21:56
@clarkb:matrix.orgI suspect that urllib3 is too old for boto or similar21:56
@clarkb:matrix.orgYup https://github.com/urllib3/urllib3/blob/1.25.11/src/urllib3/util/ssl_.py is the version on the image and has no OP_NO_TICKET 21:57
@clarkb:matrix.orgspamaps: try to remove the urllib3 cap in requirements.txt?21:57
@clarkb:matrix.orgboto must've just recently released because we updated some nodepool stuff last week iirc21:58
@clarkb:matrix.orghttps://pypi.org/project/botocore/#history literally 2 hours ago21:58
@spamaps:spamaps.ems.hostYeah why is nodepool capped I wonder.21:59
@spamaps:spamaps.ems.host```nodepool 4.3.1.dev11 requires urllib3<1.26,>=1.25.4, but you have urllib3 1.26.7 which is incompatible.```21:59
@spamaps:spamaps.ems.hostYeah I just manually upgraded it and got that.21:59
@clarkb:matrix.orgthere are comments about it. It was because python requests couldn't work with 1.26 for a time. I think that is no longer an issue so we should bump the min requirement up to whatever botocore requires /me tries to figure that out now22:00
@clarkb:matrix.orghttps://github.com/boto/botocore/blob/1.23.0/setup.cfg#L8 is inaccurate22:00
@spamaps:spamaps.ems.hostOne wonders why that wouldn't be breaking the gate.22:01
@clarkb:matrix.orgit needs >=1.26.0 22:01
@clarkb:matrix.orgspamaps: because the change happened 2 hours ago when botocore released 1.23.0 22:01
@clarkb:matrix.orgyou're the first to notice in that time span22:01
@spamaps:spamaps.ems.hostWait, botocore has dependency problems? That doesn't sound right.. ;)22:01
@spamaps:spamaps.ems.hostSo we could also pin botocore back22:01
@spamaps:spamaps.ems.hostUntil they fix their stuff too22:02
@clarkb:matrix.orgI think bumping urllib3 up is fine. I'll push that up unless you want to22:02
@clarkb:matrix.orgor ya I guess we pin boto and file a bug against them22:02
@clarkb:matrix.orghttps://github.com/boto/botocore/issues/2562 already exists as an upstream bug22:03
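To see why the two constraints mentioned above cannot be met at once, here is a simplified sketch. It compares plain dotted numeric versions as tuples, which is an assumption made for brevity; real installers follow the full Python version-specifier rules. The helper names are hypothetical, and the bounds are the ones quoted in the conversation (nodepool's `urllib3<1.26,>=1.25.4` cap and the `>=1.26.0` that botocore 1.23.0 is said to need):

```python
def parse(version):
    # Simplified: only handles versions like "1.26.7".
    return tuple(int(part) for part in version.split("."))


def satisfies(version, minimum=None, below=None):
    # True when minimum <= version < below (either bound may be omitted).
    v = parse(version)
    if minimum is not None and v < parse(minimum):
        return False
    if below is not None and v >= parse(below):
        return False
    return True


for candidate in ("1.25.11", "1.26.7"):
    ok_nodepool = satisfies(candidate, minimum="1.25.4", below="1.26")
    ok_botocore = satisfies(candidate, minimum="1.26.0")
    print(candidate, ok_nodepool, ok_botocore)
```

Neither candidate passes both checks, so either the cap has to be raised or botocore has to be pinned to an older release, which are the two options weighed in the conversation.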
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/nodepool] 817114: Fix boto deps https://review.opendev.org/c/zuul/nodepool/+/817114 22:07
@clarkb:matrix.orgI expect that integration testing will check ^ generally works against a devstackcloud with openstacksdk.22:08
@spamaps:spamaps.ems.hostAnd now I'm stuck because I don't have a zk22:23
@spamaps:spamaps.ems.hostAm I crazy to think it shouldn't be this hard just to run one unit test?22:23
@clarkb:matrix.orgthere is a tools/test-setup-docker.sh22:23
@clarkb:matrix.orgI don't think it's crazy, but I also think you're subverting the docker image to do something it was never intended for :P22:24
@clarkb:matrix.orgthe docker image is meant to be deployed to production hence the intention of removing -devel package after wheels are built for example22:24
@spamaps:spamaps.ems.hostNo I don't mean that I just mean like, I am on a mac, that is a very common thing.. :-P22:25
@clarkb:matrix.orgya but none of us are able to test or reproduce that. I appreciate people might have macs but none of us do and having mac ci resources isn't very pragmatic due to licensing22:25
@clarkb:matrix.orgyou essentially have to create a farm of mac minis22:25
@spamaps:spamaps.ems.hostI don't mind dockering to run tests22:26
@spamaps:spamaps.ems.hostbut this doesn't help me do that.22:26
@jim:acmegating.comtest-setup-docker.sh will run zk for tests22:27
@spamaps:spamaps.ems.hostRight so then what? I have to also docker run the tests.22:28
@spamaps:spamaps.ems.hostI'm not complaining, I just need a flow22:29
@spamaps:spamaps.ems.hostI can't seem to find one22:29
@jim:acmegating.comtest-setup-docker.sh will set up the dependencies.  then on linux, i run the tests using tox.  if you need to run them in docker for some reason, then i guess that's what you'd do.22:30
@spamaps:spamaps.ems.hostIt is probably what I have to do.. we'll see in a minute.. now that I have boto pinned it might work.22:30
@spamaps:spamaps.ems.hostAh ok the SSL error had me assuming it wouldn't run native on the mac22:34
@spamaps:spamaps.ems.hostSeems like it might work now22:34
@jim:acmegating.comspamaps: just fyi, there are no guarantees that testing natively on mac for zuul or nodepool will work.  people do it, but at least in zuul, there are workarounds required.  i believe there are some tests which cannot run (i don't know if they are auto-skipped or not).22:37
@clarkb:matrix.orggear is still not osx happy. nodepool doesn't rely on it anymore at least22:37
@clarkb:matrix.orgI suspect that is what zuul needs working around22:37
@jim:acmegating.comthere's other stuff22:37
@jim:acmegating.com(i've seen notes in tests, i just don't remember off the top of my head what they are; not trying to be obtuse)22:40
@spamaps:spamaps.ems.hostOk I got tests to work but my idea to just add a quick assertion doesn't work.22:46
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 817122: Shard config errors https://review.opendev.org/c/zuul/zuul/+/817122 23:05
@jim:acmegating.comClark, tobiash, swest, fungi: ^ that should take care of the next thing we're observing in production on opendev23:05
@clarkb:matrix.orgLooking and thanks23:05
@jim:acmegating.comi just kicked off a local test run for that (but i smoke tested a few first before pushing it up)23:06
@clarkb:matrix.orgcorvus: couple of questions in there23:15
@clarkb:matrix.orgthe deserialization of attributes like _path is something I generally wonder about as we seem to do it via the passed arg rather than the data in zk typically23:16
@clarkb:matrix.orgI wonder if it might be easier conceptually to just always store that in zk though it will use more disk to do that23:16
@jim:acmegating.comClark: replied23:17
@jim:acmegating.comone test error locally; will fix both things23:17
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 817122: Shard config errors https://review.opendev.org/c/zuul/zuul/+/817122 23:20
@jim:acmegating.comClark: ^ there we go23:20
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 817067: Verify build path before refreshing from ZK https://review.opendev.org/c/zuul/zuul/+/817067 23:21
@clarkb:matrix.org+2 thanks23:21
@jim:acmegating.comClark: do you think we should get further review on that, or go ahead and approve it to get it into opendev prod?23:22
@clarkb:matrix.orgselfishly I'd like to get it into prod. I think this is a pattern we've gone through a few times converting unsharded to sharded objects. Maybe we should just go for it?23:22
@jim:acmegating.comyeah, i think this one is worth going for it and just making sure to ping tobiash and swest for retro-review23:23
@jim:acmegating.comtristanC: thanks! :)23:26
@tristanc_:matrix.orgcorvus: i'm trying to follow the on-going effort, but I find that new zk based system to be quite tricky to internalize, so I'm sorry I'm not too helpful on those reviews.23:28
@clarkb:matrix.orgtristanC: It definitely requires a lot of effort to work through. I've generally had to block off an entire half a day when doing reviews of each stack so that I can read through them carefully and ensure I'm understanding it well enough23:30
@clarkb:matrix.orgI suspect it will get easier as things are in less of an in between space23:30
@spamaps:spamaps.ems.hostwoo I got a test written and it even does some decorating23:32
-@gerrit:opendev.org- Clint Byrum proposed: [zuul/nodepool] 817070: Fix infinite recursion in GCE provider https://review.opendev.org/c/zuul/nodepool/+/81707023:33
@spamaps:spamaps.ems.host^^ Rebased on top of the boto bump and adds a unittest that reproduces the infinite recursion problem it is fixing.23:34
@spamaps:spamaps.ems.hostDoes gerrit not link to the running zuul jobs anymore?23:37
@spamaps:spamaps.ems.hostDid it ever? I have been away for a while. ;)23:38
@clarkb:matrix.orgIt did for like a day then we stopped doing it because it caused zuul api for the status to have a sad. Since then the zuul status apis have been improved and we could probably give it another go but gerrit has also completely rewritten its UI so we'd have to start over on the gerrit side23:40
@spamaps:spamaps.ems.hostI just mean a link..23:42
@spamaps:spamaps.ems.hostnot the actual status.23:42
@clarkb:matrix.orgya it never did that except for a day and it got reverted23:44
@spamaps:spamaps.ems.hostAh.23:44
@spamaps:spamaps.ems.hostI seem to recall it used to post a comment.23:45
@spamaps:spamaps.ems.hostI know my GH-based Zuuls all posted a comment at the beginning with a templated link to the status page.23:45
@spamaps:spamaps.ems.hostand in GH the checks API serves this purpose.23:45
@clarkb:matrix.orgThere was a time when it posted a comment saying it had started jobs (I think it still does for the gate?) but I don't know that that ever included a link to running zuul jobs (because zuul only recently added the ability to do that)23:46
@spamaps:spamaps.ems.hostYeah I faked it with search links. :)23:46
@spamaps:spamaps.ems.hostwhich were poorly documented23:46
@clarkb:matrix.organd ya gerrit keeps deprecating and reinventing the system similar to github's checks api23:47
@clarkb:matrix.orgso we don't bother to use it yet in opendev. I'm hoping the version they've built for 3.4 will stick around long enough that maybe we can target that and make use of it once we upgrade to that version23:47
@spamaps:spamaps.ems.hostAnyway now to get back to actually trying to ship a patched nodepool into my little PoC setup.23:47
@spamaps:spamaps.ems.hostLooks like I can just git review -d my change and build a local docker image. Yay.23:48
@jim:acmegating.comtristanC: yeah, it's a lot.  maybe once we're done, we should do a video meeting to go over the design/concepts for zuul-maint (and use that to figure out what we should put in developer docs).23:50
@jim:acmegating.comat a high level, it's mostly sticking to the spec, but we invented some ideas along the way to solve problems as they came up of course (like sharded zk objects)23:51
@tristanc_:matrix.orgcorvus: that sounds like a good idea. I'm not sure what is your definition of "done" here, but I would be happy to learn more about (and help document) the implementation details :)23:56
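
For readers following the "sharded zk objects" thread above (817122, and the 23:16 question about `_path`), here is a minimal sketch of the general idea, not Zuul's actual implementation. It assumes only that ZooKeeper's default znode size limit is roughly 1 MiB, so a large payload has to be split across several child znodes and reassembled on read. `FakeStore`, `write_sharded`, `read_sharded`, and the example path are all hypothetical names made up for illustration; the store is an in-memory stand-in, not a real ZooKeeper client.

```python
class FakeStore:
    """Hypothetical in-memory stand-in for a ZooKeeper client."""

    def __init__(self):
        self.nodes = {}  # znode path -> bytes

    def set(self, path, data):
        self.nodes[path] = data

    def get(self, path):
        return self.nodes[path]

    def children(self, path):
        # Return the names of direct children of `path`, sorted.
        prefix = path.rstrip("/") + "/"
        names = []
        for node_path in self.nodes:
            if node_path.startswith(prefix):
                rest = node_path[len(prefix):]
                if "/" not in rest:
                    names.append(rest)
        return sorted(names)


def write_sharded(store, path, data, shard_size=1024 * 1024 - 1024):
    """Split `data` into chunks and store each as a child of `path`.

    The default shard size is an assumption chosen to stay under
    ZooKeeper's default ~1 MiB znode limit. Returns the shard count.
    """
    count = 0
    for offset in range(0, len(data), shard_size):
        # Zero-padded names keep lexical order equal to numeric order.
        shard_path = "%s/%010d" % (path, count)
        store.set(shard_path, data[offset:offset + shard_size])
        count += 1
    return count


def read_sharded(store, path):
    """Reassemble the payload by concatenating shards in name order."""
    return b"".join(
        store.get("%s/%s" % (path, name)) for name in store.children(path)
    )
```

Note that in this sketch the path is supplied by the caller on both write and read rather than stored inside the payload, which is the trade-off raised at 23:16: storing it in the data would make each object self-describing at the cost of extra space.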
