Wednesday, 2021-08-18

04:30 *** ysandeep|out is now known as ysandeep
05:02 *** ykarel|away is now known as ykarel
06:42 *** iurygregory_ is now known as iurygregory
07:22 *** rpittau|afk is now known as rpittau
07:34 *** jpena|off is now known as jpena
08:16 <opendevreview> Jiri Podivin proposed openstack/openstack-zuul-jobs master: DNM  https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/804962
08:20 *** mgoddard- is now known as mgoddard
08:26 *** ysandeep is now known as ysandeep|lunch
08:35 <gryf> hi. I'm just wondering: I have a job with two nodes (controller and compute1), defined like this: https://opendev.org/openstack/kuryr-kubernetes/src/branch/master/.zuul.d/multinode.yaml#L15-L63 and I'd like to pass along something which gets generated while stacking devstack. is that possible?
08:36 <gryf> I'm looking at https://zuul-ci.org/docs/zuul/reference/jobs.html#return-values but cannot figure out how to pass something from the controller to compute1.
08:37 <gryf> in particular, how would data.foo = bar exposed to zuul by the controller get passed on to compute1?
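A minimal sketch of one way to do what gryf is after, assuming a playbook in the job's repo (the file name, fact name, and source path are illustrative): within a single multinode job all nodes share one Ansible inventory, so a fact set on the controller can be read from compute1 via hostvars, whereas zuul_return data is only exposed to follow-on child jobs, not to other nodes of the same job.

    # playbooks/share-devstack-value.yaml (hypothetical)
    - hosts: controller
      tasks:
        # Read a value that devstack generated while stacking.
        - name: Slurp a value generated during stacking
          ansible.builtin.slurp:
            src: /opt/stack/devstack/.generated-value
          register: generated

        # Facts persist on the host for later plays in the same job.
        - name: Store it as a fact on the controller
          ansible.builtin.set_fact:
            devstack_generated: "{{ generated.content | b64decode }}"

    - hosts: compute1
      tasks:
        # Any node in the job inventory can read another node's facts.
        - name: Use the controller's value on compute1
          ansible.builtin.debug:
            msg: "{{ hostvars['controller']['devstack_generated'] }}"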
08:39 <opendevreview> chandan kumar proposed openstack/openstack-zuul-jobs master: DNM  https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/804962
08:40 <opendevreview> chandan kumar proposed openstack/openstack-zuul-jobs master: DNM  https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/804962
08:58 <opendevreview> Sorin Sbârnea proposed openstack/openstack-zuul-jobs master: tox: help py36 jobs use utf8 encoding  https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/804890
09:00 <opendevreview> chzhang8 proposed openstack/project-config master: bring tricircle under x namespaces  https://review.opendev.org/c/openstack/project-config/+/804969
09:04 <opendevreview> chzhang8 proposed openstack/project-config master: bring tricircle under x namespaces  https://review.opendev.org/c/openstack/project-config/+/804970
09:04 *** ykarel is now known as ykarel|lunch
09:06 <opendevreview> chandan kumar proposed openstack/openstack-zuul-jobs master: DNM  https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/804962
09:20 <opendevreview> chzhang8 proposed openstack/project-config master: bring tricircle under x namespaces  https://review.opendev.org/c/openstack/project-config/+/804972
09:47 *** ysandeep|lunch is now known as ysandeep
09:51 <opendevreview> chzhang8 proposed openstack/project-config master: bring tricircle under x namespaces  https://review.opendev.org/c/openstack/project-config/+/804977
10:37 *** odyssey4me is now known as Guest4722
10:58 *** ykarel|lunch is now known as ykarel
11:34 *** jpena is now known as jpena|lunch
11:36 *** rlandy is now known as rlandyrover
11:36 *** rlandyrover is now known as rlandy|rover
12:25 *** sshnaidm|pto is now known as sshnaidm
12:32 *** jpena|lunch is now known as jpena
12:39 <slaweq> frickler: hi
12:39 <slaweq> frickler: can you maybe help me once again with a zuul-related issue (I think it's zuul-related)?
12:39 <slaweq> some time ago we had a problem in tobiko where zuul was reporting "Unable to freeze job graph: Job devstack-tobiko is abstract and may not be directly run"
12:40 <slaweq> we fixed that with patch https://review.opendev.org/c/x/devstack-plugin-tobiko/+/804356
12:40 <slaweq> but now it has happened again for us, in patch https://review.opendev.org/c/x/tobiko/+/804881
12:40 <slaweq> and looking at https://zuul.openstack.org/job/devstack-tobiko we see there are 2 devstack-tobiko jobs defined there
12:40 <slaweq> and one of them is abstract
12:41 <slaweq> but in fact we don't have 2 definitions of that job, only one, which for sure isn't abstract
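For context, the error slaweq quotes is what zuul raises when the job that ends up in the graph is marked abstract. A minimal sketch of the pattern involved (not the actual tobiko configuration; file and child job names are made up):

    # zuul.d/base-jobs.yaml (illustrative)
    - job:
        name: devstack-tobiko
        parent: devstack
        abstract: true        # may only be inherited from, never run directly
        description: Base for tobiko devstack jobs.

    # zuul.d/project-jobs.yaml
    - job:
        name: devstack-tobiko-py3   # hypothetical concrete child
        parent: devstack-tobiko

Zuul treats same-named definitions as variants of one job, so a stale abstract variant lingering beside the real definition can break the job graph even though the repo itself only defines the job once.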
12:41 <slaweq> frickler: can you take a look and help me understand that problem? thx in advance
13:00 <frickler> slaweq: sorry, no time right now, maybe some other infra-root can take a look in a bit
13:00 <slaweq> frickler: sure, np
13:01 <slaweq> infra-root folks, can someone help me with ^^? thx in advance
13:08 <fungi> slaweq: so this started sometime after 14:55 utc yesterday, when the previous patchset was pushed?
13:09 <fungi> and by 07:16 utc today, when the next revision was pushed
13:09 <fungi> i guess it affects all open changes for x/tobiko?
13:10 <slaweq> fungi: it started happening again today, or at least that's when we noticed it
13:11 <slaweq> fungi: and it is now affecting all tobiko patches
13:11 <fungi> slaweq: i do wonder if there's any relationship with the remaining config errors for x/devstack-plugin-tobiko: https://zuul.opendev.org/t/openstack/config-errors
13:13 <slaweq> fungi: it's related in the sense that it reports issues in jobs defined in zuul.d/jobs.yaml
13:13 <slaweq> and that file doesn't exist anymore
13:13 <slaweq> the abstract "version" of the devstack-tobiko job was also defined in that same file
13:13 <slaweq> but now it's in a different file
13:17 <fungi> i see, and that was adjusted by https://review.opendev.org/801436 which merged on july 21
13:18 <slaweq> fungi: yes
13:20 <fungi> yeah, something weird is going on there, zuul should no longer be complaining about a file which doesn't exist
13:23 <slaweq> fungi: we just tried reverting the patch which renamed the file, https://review.opendev.org/c/x/devstack-plugin-tobiko/+/805018/ and with that the jobs seem to run: https://review.opendev.org/c/x/tobiko/+/805019/
13:23 <slaweq> so it seems that zuul still has the old files somewhere - maybe in some cache?
13:25 <fungi> slaweq: yes, that's what i'm inquiring about. there was some recent work on configuration caches, and we restarted zuul between those known working and broken runs
13:43 <fungi> yeah, so the old file, before it was renamed, did define that job as abstract, so if zuul is still caching that old file it would explain both the problem you noticed and the stale config errors
14:08 <zbr> fungi: clarkb: re the py36-pip issue, i updated https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/804890 with the correct fix, and i even managed to get a helping change merged into tox itself: https://github.com/tox-dev/tox/pull/2162
14:10 <zbr> for the moment, on tripleo we are putting `LC_ALL={env:LC_ALL:en_US.UTF-8}` in tox.ini as a workaround. at least that one respects other values if already present on the nodes.
14:18 <fungi> to clarkb's point yesterday though, if we work around it in the job definition instead of in the project's tox.ini file, then people may be confused when running tox locally breaks in ways that running it under zuul does not
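The tripleo workaround zbr quotes would look roughly like this in a project's tox.ini (a sketch; tox's {env:KEY:default} substitution falls back to the default only when the variable is unset on the node):

    [testenv]
    setenv =
        # Default to a UTF-8 locale, but respect whatever the node already set.
        LC_ALL = {env:LC_ALL:en_US.UTF-8}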
14:42 *** ykarel is now known as ykarel|away
15:14 <clarkb> fwiw I think updating tox.ini is the correct fix, not a workaround (as long as pip doesn't fix it themselves)
15:38 *** ysandeep is now known as ysandeep|away
15:42 *** jpena is now known as jpena|off
16:06 *** rpittau is now known as rpittau|afk
16:30 <zbr> the tox patch adding LC_ALL to the default passenv was just released. this should help a little
16:31 <zbr> i doubt pip will fix it; it appears to be deep in python, filesystem-related apparently. ansible contains some test files with unicode filenames, and that is what is breaking it.
16:32 <zbr> most projects do not have files with unicode names inside their wheels
16:33 <fungi> yeah, having tox itself do that seems like a better solution, since if users run into it with local testing the solution is simply to upgrade tox
16:59 <zbr> imho we still need the patch for the openstack-tox-py36 job if we do not want to force all repo owners to add the hack themselves.
17:00 <clarkb> zbr: right, we are saying repos should be forced to do that, so there isn't confusion when people run tox locally
17:08 <fungi> or tell people to use a newer tox which infers a locale when none is set
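For projects pinned to an older tox, explicitly passing the locale through achieves much the same as the newly released default (a sketch, not the actual job configuration):

    [testenv]
    passenv =
        # Let the host's locale reach the test environment instead of
        # tox filtering it out.
        LC_ALL
        LANG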
17:20 <zbr> if i remember correctly, microsoft already set that locale on their github images, as i did not encounter the same bug there even though i run lots of py36 jobs installing ansible
17:21 <fungi> some distros do set a default system-wide locale in their images
17:21 <fungi> also, some connection methods may override those envvars
17:22 <fungi> for example, openssh tries to substitute your client-side locale vars in place of whatever might be set server-side
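The OpenSSH behaviour fungi mentions comes from shipped defaults on many distros (Debian and Ubuntu, for instance; these lines are illustrative, not taken from the log):

    # client side, /etc/ssh/ssh_config: send the local locale variables
    SendEnv LANG LC_*

    # server side, /etc/ssh/sshd_config: accept them into the session
    AcceptEnv LANG LC_*

With both in place, a login session inherits the client's LC_ALL and friends rather than whatever locale the server configured.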
17:34 <fungi> slaweq: so the good news is i think corvus has identified the cause; it does indeed seem to be a stale cache problem of sorts (old files not getting correctly removed from the cache, then being read back in from the cache on the next startup)
17:35 <fungi> also, it looks like 804356, which you thought fixed a problem in your config, did not: there was no actual problem in your current config, you were seeing a ghost of the old config, but merging that change caused zuul to use a correct view of that project's configuration until its next scheduler restart
17:36 <fungi> which explains why the problem seemed to have spontaneously returned
17:38 <fungi> whoami-rajat: circling back around to your stale config issue from yesterday, we suspect it was different; it looks like zuul did not see, or had trouble processing, the merge event for your config change, which resulted in its cache continuing with the previous content
17:39 <fungi> the error we saw in the debug log may be related to network instability between the zuul scheduler and the gerrit server (they're in different cloud providers now and communicate with each other across the open internet)
17:39 <fungi> clarkb has proposed https://review.opendev.org/804929 to hopefully make that communication a bit more robust
17:40 <whoami-rajat> fungi, ack, thanks for the update, will recheck after this change merges
17:42 <fungi> whoami-rajat: yeah, your issue may have been solved by yesterday's scheduler restart, since it added a change to perform a reconfiguration on startup looking for stale configuration state (though that clearly doesn't catch the deleted files left behind in the cache)
17:43 <fungi> also, corvus has proposed https://review.opendev.org/804304 which would give our administrators the ability to clear the entire state cache
17:45 <slaweq> fungi: corvus: thx a lot for the help with that issue
17:48 <fungi> if all goes well we may restart on a fixed version later today or tomorrow, now that the cause is better understood
17:50 <clarkb> zbr: fwiw we run lots of py36 jobs on bionic too and don't hit the problem
17:50 <clarkb> zbr: zuul in particular comes to mind, and it does install ansible
21:00 *** timburke_ is now known as timburke
