Friday, 2022-09-23

-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 858726: Fix CORS and endpoint in AWS log upload https://review.opendev.org/c/zuul/zuul-jobs/+/858726 00:15
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 858726: Fix CORS and endpoint in AWS log upload https://review.opendev.org/c/zuul/zuul-jobs/+/858726 00:30
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 859013: Azure: handle missing Zone https://review.opendev.org/c/zuul/nodepool/+/859013 03:46
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 855096: Tracing: implement span save/restore https://review.opendev.org/c/zuul/zuul/+/855096 06:46
-@gerrit:opendev.org- Albin Vass proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/nodepool] 807806: Add "slots" to static node driver https://review.opendev.org/c/zuul/nodepool/+/807806 07:08
-@gerrit:opendev.org- Per Wiklund proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/nodepool] 807806: Add "slots" to static node driver https://review.opendev.org/c/zuul/nodepool/+/807806 07:59
-@gerrit:opendev.org- Simon Westphahl proposed: 10:09
- [zuul/zuul] 859066: Link span of queue item to trigger event span https://review.opendev.org/c/zuul/zuul/+/859066
- [zuul/zuul] 859067: Trace received Github events https://review.opendev.org/c/zuul/zuul/+/859067
-@gerrit:opendev.org- Simon Westphahl proposed: 10:17
- [zuul/zuul] 859066: Link span of queue item to trigger event span https://review.opendev.org/c/zuul/zuul/+/859066
- [zuul/zuul] 859067: Trace received Github events https://review.opendev.org/c/zuul/zuul/+/859067
@caiquemello:matrix.org > <@clarkb:matrix.org> I would check that your executor is accepting jobs. It is possible it may have tripped a governor and stopped running jobs 11:52
Thank you Clark, I'm gonna take a look
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 859013: Azure: handle missing Zone https://review.opendev.org/c/zuul/nodepool/+/859013 14:14
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 858726: Fix CORS and endpoint in AWS log upload https://review.opendev.org/c/zuul/zuul-jobs/+/858726 15:12
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 857796: Remove support for Ansible 2 https://review.opendev.org/c/zuul/zuul/+/857796 17:32
@clarkb:matrix.org zuulians https://review.opendev.org/c/zuul/zuul-jobs/+/858961 seems to be happy in testing. I'm happy to help keep an eye on that today if we land it today. But I'm also happy to wait a bit if we prefer (I think it will affect most zuul jobs out there) 17:58
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/nodepool] 858741: Demote "Starting/Finished cleanup" log entries to debug https://review.opendev.org/c/zuul/nodepool/+/858741 18:48
-@gerrit:opendev.org- lotorev vitaly proposed wip: [zuul/project-config] 859151: Update link to zuul gating docs https://review.opendev.org/c/zuul/project-config/+/859151 21:10
-@gerrit:opendev.org- lotorev vitaly proposed wip: [zuul/project-config] 859151: Update link to zuul gating docs https://review.opendev.org/c/zuul/project-config/+/859151 21:10
-@gerrit:opendev.org- lotorev vitaly marked as active: [zuul/project-config] 859151: Update link to zuul gating docs https://review.opendev.org/c/zuul/project-config/+/859151 21:11
-@gerrit:opendev.org- lotorev vitaly proposed wip: [zuul/zuul] 859152: Update link to zuul gating docs in reference pipeline https://review.opendev.org/c/zuul/zuul/+/859152 21:13
-@gerrit:opendev.org- lotorev vitaly proposed wip: [zuul/zuul] 859152: Update link to zuul gating docs in reference pipeline https://review.opendev.org/c/zuul/zuul/+/859152 21:16
-@gerrit:opendev.org- lotorev vitaly marked as active: [zuul/zuul] 859152: Update link to zuul gating docs in reference pipeline https://review.opendev.org/c/zuul/zuul/+/859152 21:16
@jim:acmegating.com i've seen the nodepool-functional-container-openstack-release job fail twice with a timeout, and the underlying cause is an openstack error: [instance: 770f1397-0550-4edc-8815-caa7f720b6d7] Failed to allocate network(s): nova.exception.ExternalNetworkAttachForbidden: It is not allowed to create an interface on external network db590dd4-5493-4ffc-a658-13bad70e96d7 22:40
@jim:acmegating.com https://zuul.opendev.org/t/zuul/build/74fab5fdddee4c3d88e71e40ad6795a7 is one failure 22:40
@jim:acmegating.com i do not know how to debug that, so if someone who is more knowledgeable about openstack would like to look into that, that would be great 22:41
@clarkb:matrix.org My first thought is "wow I thought that was explicitly allowed and we do that on the inmotion cloud to avoid wasting IPs for router interfaces" 22:41
@jim:acmegating.com (it's the two most recent builds that failed like that; i'm not sure if we're at 100% failure rate on that currently after a recent openstack change, or we just got lucky) 22:41
@jim:acmegating.com Clark: maybe some default openstack/devstack policy change? 22:42
@jim:acmegating.com (i'm just aping words i've heard before, i don't really know if that's a thing) 22:42
@clarkb:matrix.org The message comes from nova somewhere. Let me see if the nova git logs indicate anything useful 22:43
@jim:acmegating.com build history: https://zuul.opendev.org/t/zuul/builds?job_name=nodepool-functional-container-openstack-release&project=zuul/nodepool 22:44
@jim:acmegating.com so if it's a change, it's very recent 22:45
@clarkb:matrix.org OpenStack is doing a bunch of release stuff so it is possible 22:45
@jim:acmegating.com "Test Nodepool containers and OpenStack, with released projects" is the job description 22:46
@jim:acmegating.com so i guess a dependency release in the last 12 hours could do it? 22:46
@clarkb:matrix.org ya 22:47
@clarkb:matrix.org https://opendev.org/openstack/nova/src/branch/master/nova/network/neutron.py#L603-L613 I think it may be policy related based on that 22:47
@clarkb:matrix.org (that exception's string matches the error we got) 22:48
@clarkb:matrix.org Nova itself doesn't seem to set a policy outside of its tests and that string doesn't show up in devstack 22:49
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 859170: Make nodepool-functional-container-openstack-release non-voting https://review.opendev.org/c/zuul/nodepool/+/859170 22:51
@jim:acmegating.com Clark: yeah, i'm wondering how devstack is working but we aren't 22:52
@clarkb:matrix.org Also that code hasn't appeared to have changed recently in nova which makes me think it must be the policy config that is different 22:53
@clarkb:matrix.org hrm it actually looks like devstack is installing master based on the logs 22:56
@clarkb:matrix.org I think the -release in that job is for dib and glean not openstack + devstack 22:56
@jim:acmegating.com huh. maybe we can update the description when this is through 22:57
@jim:acmegating.com hrm, given that it takes 1.5 hours to time out that job, maybe we should remove it or give it a 45m timeout for now? 23:04
@jim:acmegating.com the median successful time is 36m. some successes are longer, but i think 45m would still give us data about whether the job is fixed externally without hindering ongoing nodepool maintenance 23:05
@clarkb:matrix.org that seems reasonable. Though devstack alone takes about 20 minutes iirc 23:06
@clarkb:matrix.org https://review.opendev.org/c/openstack/nova/+/849209 is suspicious but seems to have merged a month ago 23:06
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 859170: Make nodepool-functional-container-openstack-release non-voting https://review.opendev.org/c/zuul/nodepool/+/859170 23:06
@clarkb:matrix.org wow there are even comments that nova shouldn't do this check .... https://review.opendev.org/c/openstack/nova/+/849209/6/nova/policies/servers.py line 315 23:07
@clarkb:matrix.org corvus: what log file did that "it is not allowed" message come out of? 23:08
@clarkb:matrix.org I'm trying to find it in the example job you linked and having trouble 23:08
@jim:acmegating.com syslog 23:08
@jim:acmegating.com something similar was also reported to nodepool via sdk in the launcher log 23:08
@clarkb:matrix.org ya the launcher log is much more terse 23:09
@clarkb:matrix.org corvus: I think I've confirmed that both a successful and a failed job ran with the same version of nova installed: https://zuul.opendev.org/t/zuul/build/9097b115a25c4d2d96295cb1e7d301f8/log/job-output.txt#8968 vs https://zuul.opendev.org/t/zuul/build/74fab5fdddee4c3d88e71e40ad6795a7/log/job-output.txt#8858 23:21
@clarkb:matrix.org first is success, second is failure 23:21
@jim:acmegating.com so this might be intermittent? or the cause may not be nova? 23:22
@clarkb:matrix.org ya 23:22
@clarkb:matrix.org I pinged gmann in the nova channel about it since gmann seems to be pushing the rbac stuff along and this seems very related 23:31
@clarkb:matrix.org corvus: gmann explains that a project admin is an admin so the update to the rbac policy should be equivalent. On further digging the successful jobs have the same rbac failure and tracebacks 23:49
@clarkb:matrix.org But the failed jobs also have 'nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed' 23:49
@clarkb:matrix.org I think that this may be the fatal error: https://zuul.opendev.org/t/zuul/build/74fab5fdddee4c3d88e71e40ad6795a7/log/syslog?severity=0#84665-84692 23:51
@clarkb:matrix.org and that appears to be waiting for 5 minutes for libvirt to make a virtual interface and it isn't happening in that amount of time 23:51
@clarkb:matrix.org corvus: and I think we'd need to collect libvirt logs to see where libvirt is getting stuck 23:53
@jim:acmegating.com so 2 new things to consider: libvirt versions, and maybe opendev cloud providers? 23:54
@clarkb:matrix.org yes 23:54
-@gerrit:opendev.org- Zuul merged on behalf of Dr. Jens Harbott: [zuul/zuul] 834671: Handle reviews by anonymous github users https://review.opendev.org/c/zuul/zuul/+/834671 23:56
@jim:acmegating.com 2 failures on iweb and ovh 23:56
@jim:acmegating.com most recent success in iweb 23:56
@jim:acmegating.com so no smoking gun there 23:56
