*** ysandeep is now known as ysandeep|holiday | 06:35 | |
chandankumar | marios: Good morning | 06:36 |
---|---|---|
chandankumar | marios: happy monday o/ | 06:36 |
chandankumar | we have node_failures on vexxhost cloud https://bugs.launchpad.net/tripleo/+bug/1986502 , so we can have fun and relax | 06:37 |
marios | o/ chandankumar | 06:38 |
marios | looking | 06:38 |
marios | chandankumar: buildsets don't have it but they are all from 2/3 days ago (master there https://review.rdoproject.org/zuul/buildset/953eacd0246648ada324268a4c21a8e6 wallaby there https://review.rdoproject.org/zuul/buildset/41e058e884fc4e35bfe4bd19d8471411 and train https://review.rdoproject.org/zuul/buildset/0a685aa16ac04f0eb43c04c925bfd36b) | 06:41 |
marios | chandankumar: so does it only hit component line? or another question why nothing ran on the weekend (per the latest buildsets i just linked) | 06:42 |
marios | chandankumar: ok thats why https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-9-wallaby-promote-promoted-components-to-tripleo-ci-testing | 06:43 |
marios | so promote to tripleo-ci-testing is hitting the node failure | 06:43 |
chandankumar | marios: thank you :-) I have not seen the buildset results, now we have more data, | 06:46 |
marios | chandankumar: added to bug with comment & screenshot | 06:47 |
marios | chandankumar: basically they are all blocked on promote to tripleo-ci-testing hitting the node failure | 06:47 |
chandankumar | yes, correct, thank you for commenting on the bug. | 06:52 |
chandankumar | waiting for dpawlik4 to come back online and take a look at the issue. | 06:52 |
*** chandankumar is now known as chkumar|ruck | 07:37 | |
marios | chkumar|ruck: from earlier that one eg master https://review.rdoproject.org/zuul/buildset/953eacd0246648ada324268a4c21a8e6 from saturday 13 has like 4/5 RETRY | 09:53 |
marios | chkumar|ruck: (per the chat just now) | 09:53 |
marios | so still seeing that (if you want to update the status on lp/trello) | 09:53 |
marios | so for c8 train/wallaby you mean IBM cloud resources exhausted? | 09:54 |
marios | chkumar|ruck: ^ ? | 09:54 |
marios | or heat stacks ? | 09:54 |
chkumar|ruck | marios: yes | 09:54 |
chkumar|ruck | marios: at max, 3 ovb jobs can run now | 09:54 |
marios | chkumar|ruck: k can we do something about it now? i am guessing we are trying to get moar | 09:54 |
marios | ah | 09:54 |
marios | so | 09:54 |
marios | then why are we moving lines over | 09:54 |
marios | i mean we should wait until we have enough to run them ? | 09:54 |
marios | i think you probably only moved some of the train jobs if i recall the review ... | 09:55 |
chkumar|ruck | if more jobs start running and there are no resources then it leads to no valid host found, | 09:55 |
marios | like some ovb jobs | 09:55 |
marios | but still ... we should not move lines over or any ovb jobs at least until we get that increased | 09:55 |
chkumar|ruck | marios: regarding IBM cloud resource issue, Nicholas will come back from PTO next week and he can take a look | 09:56 |
marios | chkumar|ruck: k but agree on ovb/lines cannot move yet? | 09:56 |
marios | chkumar|ruck: :) | 09:56 |
marios | maybe we should move some back if we are seeing this out of resources already | 09:56 |
marios | until next week | 09:56 |
chkumar|ruck | marios: based on these https://review.rdoproject.org/zuul/builds?result=RETRY&skip=0, 5 runs consisting of 2 client component and 3 cs8 wallaby component | 09:59 |
chkumar|ruck | have hit retry issue | 10:00 |
chkumar|ruck | rest of the lines are good | 10:00 |
chkumar|ruck | so I think we are good on IBM cloud | 10:00 |
chkumar|ruck | we are not going to move any further lines there till we sort out zuul console and resource issue. | 10:01 |
marios | chkumar|ruck: taking ages to load that build result | 10:01 |
chkumar|ruck | https://review.rdoproject.org/zuul/builds?result=RETRY&skip=0 this one? | 10:02 |
marios | yeah | 10:02 |
marios | ah finally | 10:02 |
marios | chkumar|ruck: so actually that has 2 different things there ... the lines running on ibmcloud are retry because of resource/ovb stack limit eg https://logserver.rdoproject.org/22/44522/4/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby/b0d3bf7/job-output.txt | 10:03 |
marios | chkumar|ruck: but the ones on vex are for the dns issue "Could not resolve host: mirror.regionone.vexxhost-nodepool-tripleo.rdoproject.org" eg https://review.rdoproject.org/zuul/build/3f38777c28ec42c9bbc50192b7ed2a3d | 10:04 |
chkumar|ruck | yes correct | 10:04 |
marios | chkumar|ruck: i think if we consistently hit resource issue on ibm cloud then we should consider reverting some of the jobs there until it is sorted | 10:04 |
marios | chkumar|ruck: cos a week is a long time for something that can take 15 mins to post and merge | 10:05 |
marios | chkumar|ruck: but i leave it to your judgement since you're watching those today | 10:05 |
marios | chkumar|ruck: mainly as mentioned earlier, ping if you need something esp. context from last week | 10:05 |
chkumar|ruck | marios: I will monitor for today, if it still happens, I will move cs8 wallaby component fs01 jobs to vexxhost | 10:05 |
chkumar|ruck | marios: I think I need help on this one https://trello.com/c/WlMm5XYB/2671-cixlp1985981tripleociproa-standalone-job-deploying-ceph-failing-with-error-container-init-binary-not-found-on-the-host-stat-usr | 10:06 |
chkumar|ruck | sorry | 10:06 |
chkumar|ruck | not this one | 10:06 |
chkumar|ruck | marios: https://trello.com/c/QtOwkWvu/2663-cixlp1983817tripleociproa-periodic-integration-all-branches-retry-could-not-resolve-host-mirrorregiononevexxhost-nodepool-triple | 10:06 |
chkumar|ruck | you can update the status | 10:06 |
chkumar|ruck | on the bug and card | 10:06 |
marios | chkumar|ruck: to be clear, you mean i should update this card https://trello.com/c/QtOwkWvu/2663-cixlp1983817tripleociproa-periodic-integration-all-branches-retry-could-not-resolve-host-mirrorregiononevexxhost-nodepool-triple correct? | 10:07 |
chkumar|ruck | marios: yes | 10:07 |
marios | chkumar|ruck: k will do in a bit np | 10:07 |
chkumar|ruck | marios: thank you :-) | 10:07 |
marios | np chkumar|ruck just ping for anything | 10:09 |
marios | you should get some company in a bit with rafael | 10:09 |
chkumar|ruck | yup :-) | 10:10 |
marios | just dropped from review time ping me if you want to talk reviews | 11:18 |
reviewbot | Do you want me to add your patch to the Review list? Please type something like add to review list <your_patch> so that I can understand. Thanks. | 11:18 |
marios | brb | 11:19 |
*** dviroel|out is now known as dviroel | 11:38 | |
dviroel | o/ | 11:38 |
marios | sup brazil bro dviroel o/ | 11:48 |
dviroel | hey there marios | 11:51 |
chkumar|ruck | dviroel: o/ | 12:00 |
dviroel | hey mr chkumar o/ | 12:11 |
chkumar|ruck | marios: so zuul scheduler ran out of space because the "ssl certificate expired" message for rdo vexxhost was written over and over | 12:44 |
chkumar|ruck | tristan has filed a vexxhost ticket to update the certificate | 12:45 |
chkumar|ruck | it might fix the node_failure issue | 12:45 |
chkumar|ruck | details here: https://trello.com/c/aeU6CJS4/2672-cixlp1986502tripleociproa-multiple-nodefailure-on-all-jobs-running-on-rdoprojectorg-tenant | 12:45 |
chkumar|ruck | dviroel: fyi we are hitting node_failures on rdoproject tenant. more details here: https://trello.com/c/aeU6CJS4/2672-cixlp1986502tripleociproa-multiple-nodefailure-on-all-jobs-running-on-rdoprojectorg-tenant | 12:48 |
dviroel | chkumar|ruck: thanks for sharing | 12:49 |
marios | chkumar|ruck: thanks for update | 12:54 |
marios | chkumar|ruck: let me know if can do something | 12:54 |
*** Guest5 is now known as rcastillo | 13:06 | |
*** rcastillo is now known as rcastillo|rover | 13:06 | |
rcastillo|rover | o/ | 13:06 |
chkumar|ruck | rcastillo|rover: o/ | 13:07 |
chkumar|ruck | rcastillo|rover: let me know when ready, we can sync | 13:07 |
rcastillo|rover | chkumar|ruck: we can do it now if you want | 13:07 |
chkumar|ruck | rcastillo|rover: grabbing gmeet then | 13:08 |
chkumar|ruck | rcastillo|rover: https://meet.google.com/osp-eybt-yig?authuser=0&hl=en | 13:09 |
chkumar|ruck | rcastillo|rover: https://hackmd.io/9b8XBCJYSDKf6QDDD9c2OQ?both | 13:10 |
chkumar|ruck | rcastillo|rover: all cix cards updated. I will be around during cix | 13:26 |
chkumar|ruck | rcastillo|rover: cs8 wallaby will promote today | 13:26 |
chkumar|ruck | marios: rcastillo|rover dviroel as per tristan, nodes are now properly spawning in rdo's zuul | 13:41 |
chkumar|ruck | feel free to recheck your patches | 13:41 |
dviroel | ++ | 13:43 |
dviroel | chkumar|ruck: hey, do you know if we can get some public ips on those ibm servers? | 13:43 |
chkumar|ruck | dviroel: as per apevec, Nope. But I will check again | 13:44 |
chkumar|ruck | dviroel: any specific requirement for public ips | 13:44 |
chkumar|ruck | i think apevec is not around today | 13:44 |
marios | chkumar|ruck: thanks just did a recheck | 13:44 |
marios | chkumar|ruck: it was hitting my testproject too | 13:45 |
chkumar|ruck | marios: yes, everywhere on rdo zuul | 13:45 |
marios | https://review.rdoproject.org/zuul/build/27353fb170754fe7a9984e98e26b48e6 | 13:45 |
marios | chkumar|ruck: k thanks lets see | 13:45 |
dviroel | chkumar|ruck: if we want to continue testing prow, prow will need to reach those instances via public ip | 13:45 |
chkumar|ruck | dviroel: ok got it. | 13:46 |
chkumar|ruck | I will check with him and let you know. :-) | 13:46 |
dviroel | thanks chkumar|ruck, we can check with Prow team if there is another way too, later this week | 13:50 |
dviroel | chkumar|ruck: you have nodepool running in the same server right? | 13:51 |
chkumar|ruck | dviroel: yes | 13:57 |
chkumar|ruck | dviroel: nodepool launcher, zuul executor and afs mirror are in the same cloud having private ips | 13:59 |
marios | scrum time folks | 14:01 |
frenzyfriday | short scrum \o/ | 14:13 |
marios | frenzyfriday: short and sweet. lets bring this to retrospective (we often deep dive into each topic but that is not the purpose of scrum) | 14:13 |
marios | there are advantages/disadvantages for both ways of course but if we are talking about actual 'scrum' it should be short/update/blockers | 14:13 |
marios | well i guess we try to do it with the more scrum like thursday call but we often deep dive there too | 14:14 |
Guest167 | marios++ | 14:15 |
*** Guest167 is now known as dasm | 14:15 | |
dasm | marios++ | 14:16 |
* dasm is in srbac meeting atm | 14:16 | |
marios | o/ dasm | 14:26 |
dasm | marios: \o | 14:26 |
dasm | short scrum >> long meeting :) | 14:26 |
marios | thanks chkumar|ruck++ | 14:52 |
chkumar|ruck | yw :-) | 14:56 |
dviroel | chkumar|ruck: we need catatonit workaround on wallaby too | 14:57 |
chkumar|ruck | dviroel: once tp finishes, will add there also and merge that | 15:02 |
dviroel | ack | 15:03 |
marios | am off in a min, chkumar|ruck rcastillo|rover need something? o/ | 15:41 |
rcastillo|rover | marios: I'm good, thanks. Have a good evening o/ | 15:46 |
chkumar|ruck | marios: dviroel https://review.opendev.org/c/openstack/tripleo-quickstart/+/853142 please +w it for fixing catatonit issue | 15:46 |
chkumar|ruck | tp results: https://review.opendev.org/c/openstack/tripleo-quickstart/+/853142/3#message-77ccf10093b74a112d9b9405fca3431a8731d242 | 15:47 |
chkumar|ruck | it will unblock master and wallaby promotion | 15:47 |
marios | checking chkumar|ruck | 15:48 |
dviroel | chkumar|ruck: voted | 15:49 |
dviroel | tks | 15:49 |
* dviroel lunch | 15:49 | |
*** dviroel is now known as dviroel|lunch | 15:49 | |
marios | chkumar|ruck: ready for workflow when zuul votes you or rcastillo|rover can do it laster | 15:50 |
marios | later | 15:50 |
chkumar|ruck | marios: yes ready for +w | 15:50 |
chkumar|ruck | dviroel|lunch: please +w https://review.opendev.org/c/openstack/tripleo-quickstart/+/853142 once zuul +1 | 15:51 |
chkumar|ruck | marios: have a nice evening and thank you for all the help today :-) | 15:51 |
*** chkumar|ruck is now known as chandankumar | 15:51 | |
marios | o/ | 15:54 |
*** marios is now known as marios|out | 15:54 | |
*** dviroel|lunch is now known as dviroel | 17:01 | |
dviroel | rcastillo|rover: https://zuul.opendev.org/t/openstack/status#853142 - workaround in gate | 19:23 |
*** dviroel is now known as dviroel|brb | 19:31 | |
*** dviroel|brb is now known as dviroel | 22:08 | |
* dviroel out | 22:12 | |
*** dviroel is now known as dviroel|out | 22:12 | |
dasm | o/ dviroel|out | 22:13 |
* dasm => offline | 22:28 | |
*** dasm is now known as dasm|off | 22:28 |