Monday, 2022-08-15

*** ysandeep is now known as ysandeep|holiday [06:35]
<chandankumar> marios: Good morning [06:36]
<chandankumar> marios: happy monday o/ [06:36]
<chandankumar> we have node_failures on vexxhost cloud https://bugs.launchpad.net/tripleo/+bug/1986502 , so we can have fun and relax [06:37]
<marios> o/ chandankumar [06:38]
<marios> looking [06:38]
<marios> chandankumar: buildsets don't have it but they are all from 2/3 days ago (master there https://review.rdoproject.org/zuul/buildset/953eacd0246648ada324268a4c21a8e6 wallaby there https://review.rdoproject.org/zuul/buildset/41e058e884fc4e35bfe4bd19d8471411 and train https://review.rdoproject.org/zuul/buildset/0a685aa16ac04f0eb43c04c925bfd36b) [06:41]
<marios> chandankumar: so does it only hit component line? or another question why nothing ran on the weekend (per the latest buildsets i just linked) [06:42]
<marios> chandankumar: ok that's why https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-9-wallaby-promote-promoted-components-to-tripleo-ci-testing [06:43]
<marios> so promote to tripleo-ci-testing is hitting the node failure [06:43]
<chandankumar> marios: thank you :-) I have not seen the buildset results, now we have more data [06:46]
<marios> chandankumar: added to bug with comment & screenshot [06:47]
<marios> chandankumar: basically they are all blocked on promote to citesting hitting the node failure [06:47]
<chandankumar> yes, correct, thank you for commenting on the bug. [06:52]
<chandankumar> waiting for dpawlik4 come back online and take a look at the issue. [06:52]
*** chandankumar is now known as chkumar|ruck [07:37]
<marios> chkumar|ruck: from earlier that one eg master https://review.rdoproject.org/zuul/buildset/953eacd0246648ada324268a4c21a8e6 from saturday 13 has like 4/5 RETRY [09:53]
<marios> chkumar|ruck: (per the chat just now) [09:53]
<marios> so still seeing that (if you want to update the status on lp/trello) [09:53]
<marios> so for c8 train/wallaby you mean IBM cloud resources exhausted? [09:54]
<marios> chkumar|ruck: ^ ? [09:54]
<marios> or heat stacks ? [09:54]
<chkumar|ruck> marios: yes [09:54]
<chkumar|ruck> marios: at max, 3 ovb jobs can run now [09:54]
<marios> chkumar|ruck: k can we do something about it now? i am guessing we are trying to get moar [09:54]
<marios> ah [09:54]
<marios> so [09:54]
<marios> then why are we moving lines over [09:54]
<marios> i mean we should wait until we have enough to run them ? [09:54]
<marios> i think you probably only moved some of the train jobs if i recall the review ... [09:55]
<chkumar|ruck> if more jobs start running and there are no resources then it leads to no valid host found [09:55]
<marios> like some ovb jobs [09:55]
<marios> but still ... we should not move lines over or any ovb jobs at least until we get that increased [09:55]
<chkumar|ruck> marios: regarding IBM cloud resource issue, Nicholas will come back from PTO next week and he can take a look [09:56]
<marios> chkumar|ruck: k but agree on ovb/lines cannot move yet? [09:56]
<marios> chkumar|ruck: :) [09:56]
<marios> maybe we should move some back if we are seeing this out of resources already [09:56]
<marios> until next week [09:56]
<chkumar|ruck> marios: based on these https://review.rdoproject.org/zuul/builds?result=RETRY&skip=0 , 5 runs consisting of 2 client component and 3 cs8 wallaby component [09:59]
<chkumar|ruck> have hit retry issue [10:00]
<chkumar|ruck> rest of the lines are good [10:00]
<chkumar|ruck> so I think we are good on IBM cloud [10:00]
<chkumar|ruck> we are not going to move any further lines there till we sort out zuul console and resource issue. [10:01]
<marios> chkumar|ruck: taking ages to load that build result [10:01]
<chkumar|ruck> https://review.rdoproject.org/zuul/builds?result=RETRY&skip=0 this one? [10:02]
<marios> yeah [10:02]
<marios> ah finally [10:02]
<marios> chkumar|ruck: so actually that has 2 different things there ... the lines running on ibmcloud are retry because of resource/ovb stack limit eg https://logserver.rdoproject.org/22/44522/4/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby/b0d3bf7/job-output.txt [10:03]
<marios> chkumar|ruck: but the ones on vex are for the dns issue "Could not resolve host: mirror.regionone.vexxhost-nodepool-tripleo.rdoproject.org" eg https://review.rdoproject.org/zuul/build/3f38777c28ec42c9bbc50192b7ed2a3d [10:04]
<chkumar|ruck> yes correct [10:04]
<marios> chkumar|ruck: i think if we consistently hit resource issue on ibm cloud then we should consider reverting some of the jobs there until it is sorted [10:04]
<marios> chkumar|ruck: cos a week is a long time for something that can take 15 mins to post and merge [10:05]
<marios> chkumar|ruck: but i leave it to your judgement since you're watching those today [10:05]
<marios> chkumar|ruck: mainly as mentioned earlier, ping if you need something esp. context from last week [10:05]
<chkumar|ruck> marios: I will monitor for today, if it still comes, I will move cs8 wallaby component fs01 jobs to vexxhost [10:05]
<chkumar|ruck> marios: I think I need help on this one https://trello.com/c/WlMm5XYB/2671-cixlp1985981tripleociproa-standalone-job-deploying-ceph-failing-with-error-container-init-binary-not-found-on-the-host-stat-usr [10:06]
<chkumar|ruck> sorry [10:06]
<chkumar|ruck> not this one [10:06]
<chkumar|ruck> marios: https://trello.com/c/QtOwkWvu/2663-cixlp1983817tripleociproa-periodic-integration-all-branches-retry-could-not-resolve-host-mirrorregiononevexxhost-nodepool-triple [10:06]
<chkumar|ruck> you can update the status [10:06]
<chkumar|ruck> on the bug and card [10:06]
<marios> chkumar|ruck: to be clear, you mean i should update this card https://trello.com/c/QtOwkWvu/2663-cixlp1983817tripleociproa-periodic-integration-all-branches-retry-could-not-resolve-host-mirrorregiononevexxhost-nodepool-triple correct? [10:07]
<chkumar|ruck> marios: yes [10:07]
<marios> chkumar|ruck: k will do in a bit np [10:07]
<chkumar|ruck> marios: thank you :-) [10:07]
<marios> np chkumar|ruck just ping for anything [10:09]
<marios> you should get some company in a bit with rafael [10:09]
<chkumar|ruck> yup :-) [10:10]
<marios> just dropped from review time ping me if you want to talk reviews [11:18]
<reviewbot> Do you want me to add your patch to the Review list? Please type something like add to review list <your_patch> so that I can understand. Thanks. [11:18]
<marios> brb [11:19]
*** dviroel|out is now known as dviroel [11:38]
<dviroel> o/ [11:38]
<marios> sup brazil bro dviroel o/ [11:48]
<dviroel> hey there marios [11:51]
<chkumar|ruck> dviroel: o/ [12:00]
<dviroel> hey mr chkumar o/ [12:11]
<chkumar|ruck> marios: so zuul scheduler ran out of space due to the message "ssl certificate expired" for rdo vexxhost being written multiple times [12:44]
<chkumar|ruck> tristan has filed a vexxhost ticket to update the certificate [12:45]
<chkumar|ruck> it might fix the node_failure issue [12:45]
<chkumar|ruck> details here: https://trello.com/c/aeU6CJS4/2672-cixlp1986502tripleociproa-multiple-nodefailure-on-all-jobs-running-on-rdoprojectorg-tenant [12:45]
<chkumar|ruck> dviroel: fyi we are hitting node_failures on rdoproject tenant. more details here: https://trello.com/c/aeU6CJS4/2672-cixlp1986502tripleociproa-multiple-nodefailure-on-all-jobs-running-on-rdoprojectorg-tenant [12:48]
<dviroel> chkumar|ruck: thanks for sharing [12:49]
<marios> chkumar|ruck: thanks for update [12:54]
<marios> chkumar|ruck: let me know if can do something [12:54]
*** Guest5 is now known as rcastillo [13:06]
*** rcastillo is now known as rcastillo|rover [13:06]
<rcastillo|rover> o/ [13:06]
<chkumar|ruck> rcastillo|rover: o/ [13:07]
<chkumar|ruck> rcastillo|rover: let me know when ready we can sync [13:07]
<rcastillo|rover> chkumar|ruck: we can do it now if you want [13:07]
<chkumar|ruck> rcastillo|rover: grabbing gmeet then [13:08]
<chkumar|ruck> rcastillo|rover: https://meet.google.com/osp-eybt-yig?authuser=0&hl=en [13:09]
<chkumar|ruck> rcastillo|rover: https://hackmd.io/9b8XBCJYSDKf6QDDD9c2OQ?both [13:10]
<chkumar|ruck> rcastillo|rover: all cix cards updated. I will be around during cix [13:26]
<chkumar|ruck> rcastillo|rover: cs8 wallaby will promote today [13:26]
<chkumar|ruck> marios: rcastillo|rover dviroel as per tristan, nodes are now properly spawning in rdo's zuul [13:41]
<chkumar|ruck> feel free to recheck your patches [13:41]
<dviroel> ++ [13:43]
<dviroel> chkumar|ruck: hey, do you know if we can get some public ips on those ibm servers? [13:43]
<chkumar|ruck> dviroel: as per apevec, Nope. But I will check again [13:44]
<chkumar|ruck> dviroel: any specific requirement for public ips [13:44]
<chkumar|ruck> i think apevec is not around today [13:44]
<marios> chkumar|ruck: thanks just did a recheck [13:44]
<marios> chkumar|ruck: it was hitting my testproject too [13:45]
<chkumar|ruck> marios: yes, everywhere on rdo zuul [13:45]
<marios> https://review.rdoproject.org/zuul/build/27353fb170754fe7a9984e98e26b48e6 [13:45]
<marios> chkumar|ruck: k thanks lets see [13:45]
<dviroel> chkumar|ruck: if we want to continue testing prow, prow will need to reach those instances via public ip [13:45]
<chkumar|ruck> dviroel: ok got it. [13:46]
<chkumar|ruck> I will check with him and let you know. :-) [13:46]
<dviroel> thanks chkumar|ruck, we can check with Prow team if there is another way too, later this week [13:50]
<dviroel> chkumar|ruck: you have nodepool running in the same server right? [13:51]
<chkumar|ruck> dviroel: yes [13:57]
<chkumar|ruck> dviroel: nodepool launcher, zuul executor and afs mirror are in the same cloud having private ips [13:59]
<marios> scrum time folks [14:01]
<frenzyfriday> short scrum \o/ [14:13]
<marios> frenzyfriday: short and sweet. lets bring this to retrospective (we often deep dive into each topic but that is not the purpose of scrum) [14:13]
<marios> there are advantages/disadvantages for both ways of course but if we are talking about actual 'scrum' it should be short/update/blockers [14:13]
<marios> well i guess we try to do it with the more scrum like thursday call but we often deep dive there too [14:14]
<Guest167> marios++ [14:15]
*** Guest167 is now known as dasm [14:15]
<dasm> marios++ [14:16]
* dasm is in srbac meeting atm [14:16]
<marios> o/ dasm [14:26]
<dasm> marios: \o [14:26]
<dasm> short scrum >> long meeting :) [14:26]
<marios> thanks chkumar|ruck++ [14:52]
<chkumar|ruck> yw :-) [14:56]
<dviroel> chkumar|ruck: we need catatonit workaround on wallaby too [14:57]
<chkumar|ruck> dviroel: once tp finishes, will add there also and merge that [15:02]
<dviroel> ack [15:03]
<marios> am off in a min chkumar|ruck need something rcastillo|rover o/ [15:41]
<rcastillo|rover> marios: I'm good, thanks. Have a good evening o/ [15:46]
<chkumar|ruck> marios: dviroel https://review.opendev.org/c/openstack/tripleo-quickstart/+/853142 please +w it for fixing catatonit issue [15:46]
<chkumar|ruck> tp results: https://review.opendev.org/c/openstack/tripleo-quickstart/+/853142/3#message-77ccf10093b74a112d9b9405fca3431a8731d242 [15:47]
<chkumar|ruck> it will unblock master and wallaby promotion [15:47]
<marios> checking chkumar|ruck [15:48]
<dviroel> chkumar|ruck: voted [15:49]
<dviroel> tks [15:49]
* dviroel lunch [15:49]
*** dviroel is now known as dviroel|lunch [15:49]
<marios> chkumar|ruck: ready for workflow when zuul votes you or rcastillo|rover can do it laster [15:50]
<marios> later [15:50]
<chkumar|ruck> marios: yes ready for +w [15:50]
<chkumar|ruck> dviroel|lunch: please +w https://review.opendev.org/c/openstack/tripleo-quickstart/+/853142 once zuul +1 [15:51]
<chkumar|ruck> marios: have a nice evening and thank you for all the help today :-) [15:51]
*** chkumar|ruck is now known as chandankumar [15:51]
<marios> o/ [15:54]
*** marios is now known as marios|out [15:54]
*** dviroel|lunch is now known as dviroel [17:01]
<dviroel> rcastillo|rover: https://zuul.opendev.org/t/openstack/status#853142 - workaround in gate [19:23]
*** dviroel is now known as dviroel|brb [19:31]
*** dviroel|brb is now known as dviroel [22:08]
* dviroel out [22:12]
*** dviroel is now known as dviroel|out [22:12]
<dasm> o/ dviroel|out [22:13]
* dasm => offline [22:28]
*** dasm is now known as dasm|off [22:28]
