Tuesday, 2021-11-09

chandankumarykarel|away: \o. can you take a look at this one https://review.opendev.org/c/openstack/tripleo-quickstart/+/816648 and last comment https://review.opendev.org/c/openstack/tripleo-quickstart/+/816648/3#message-f6c6215b62b2ee15379c37e17fdfce437701a3db04:53
ykarelchandankumar, yes just voted04:54
ykarellooking at error now04:54
chandankumarykarel: new comment04:54
ykarellikely the content renewed, need to check04:55
ykarelalso not sure how far you are away from running these jobs in upstream04:55
ykarelbut before that it would be good to have these repos mirrored in upstream to avoid further issues04:56
chandankumarykarel: we are too close to run those jobs upstream04:56
ykarelchandankumar, ok before enabling get mirrors ready else gate jobs will be affected when contacting external mirrors04:57
ykarelwrt error seems some issue in repos, likely sync is going or04:58
chandankumarykarel: can you take care of syncing these repo on rdo vexxhost cloud? It might help us on periodic side05:01
chandankumarto use nodepool mirrors05:01
chandankumarI will propose a patch for upstream one05:01
ykarelchandankumar, i think patch needed upstream only05:02
ykareland that should take care for all providers05:02
ykarelhttps://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/mirror-update/files/centos-mirror-update is the place05:03
ykarelnot sure what else to check, but ^ is the starting place05:03
chandankumarmirror.dal10.us.leaseweb.net/centos does not have cs9 content05:07
ykarelyes 9-stream moved to different mirror05:07
ykarelhttps://lists.centos.org/pipermail/centos-devel/2021-November/077417.html have some info05:09
ykarelhttps://mirrors.centos.org/metalink?repo=centos-baseos-9-stream&arch=x86_64&protocol=https,http can see current mirrors05:10
chandankumarI will go with rackspace05:12
ykareli think need to check with infra too with what region to choose05:13
ykarelas earlier mirrors are from TX05:13
ykarelchandankumar, can you rerun those jobs, currently mirror looks correct05:18
ykarellikely this is happening as modules are being removed05:19
chandankumarykarel: still not synced https://logserver.rdoproject.org/26/34926/9/check/periodic-tripleo-ci-centos-9-containers-multinode-master/5606dbe/logs/subnode-1/home/zuul-worker/repo_setup.log.txt.gz05:46
chandankumarykarel: I will fall back to previous marios patch and use compose currently05:46
chandankumarand switch to mirror contents once mirrors are synced05:47
ykarelchandankumar, seeing same issue in RDO jobs too, /me looking05:47
chandankumarykarel: https://review.opendev.org/c/openstack/tripleo-quickstart/+/816648 let's get this in and then will propose a separate patch to switch to mirrors once system-config patch gets in06:01
chandankumarmarios: sorry, when you were not around, I updated and played with your patch https://review.opendev.org/c/openstack/tripleo-quickstart/+/816648 and back to previous version, please get this merged, thank you :-) Please go over when free07:01
chandankumarysandeep: https://review.opendev.org/c/openstack/tripleo-quickstart/+/816648 please have a look when free, thanks :-)07:01
marioschandankumar: ack 07:05
ysandeepchandankumar, voted, we can +w once zuul reports back, and revert/modify once mirror content is available.07:05
mariosdviroel|out: o/ sanity check on the |combine otherwise lgtm https://code.engineering.redhat.com/gerrit/c/openstack/rrcockpit/+/288406/1#message-78c10ed00046405983c72b133e874131f31d1ad0 07:18
soniya29|ruckchandankumar, akahat|rover, I have updated the skiplist patch for removal of queens - https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/81490307:39
chandankumarmarios: you can take it from here https://review.rdoproject.org/r/c/testproject/+/36267/25#message-93e9207bc27400d4ee4822117ea12e8ad463702c08:57
chandankumarI will deal with containers multinode job08:57
chandankumarthat is also broken08:57
chandankumarwith different issue08:57
marioschandankumar: ack 09:00
*** ysandeep|lunch is now known as ysandeep09:06
chandankumarysandeep: can you take a look at promote component jobs for cs9? it seems tripleo component is quite old in tripleo-ci-testing09:12
chandankumarit might fix marios issue if we get all hashes there09:13
chandankumarI am filing a bug09:13
ysandeepchandankumar, ack, what's the issue.. components not promoting? 09:14
chandankumarhttps://review.opendev.org/c/openstack/tripleo-heat-templates/+/816727 is not there in tripleo-ci-testing09:16
marioschandankumar: but why do we need a new bug for it wont that be confusing? you mean something new? or for the missing patch? 09:18
chandankumarmarios: a new bug for container multinode https://bugs.launchpad.net/tripleo/+bug/195027909:19
marioschandankumar: ah k 09:19
chandankumarmarios: for ovb, fix is not in the repo that we need to figure out why it is not there09:19
chandankumarysandeep: https://trunk.rdoproject.org/centos9-master/component/tripleo/?C=M;O=D , there is no change in dir after 01st nov09:28
chandankumarykarel|lunch: can you check cs9 dlrn builder09:28
ysandeepchandankumar, yeah looks like we are not getting new content09:29
chandankumarnothing got built after 01st nov09:29
ysandeepchandankumar, currently we have nothing in component criteria, patch to add standalone jobs in criteria is not merged yet: https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/36584 , builds should promote once available09:30
chandankumarysandeep: yes, yes, 09:31
marioschandankumar: ysandeep: nice but we have chicken/egg situation cos of https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-standalone-tripleo-master ?09:32
mariosi.e. the job is still red so it wont promote 09:32
ysandeepmarios, ^^ ahh, this should resolve once your workaround patch merges?09:34
mariosysandeep: which one :)09:34
ysandeepfailing with same issue: https://logserver.rdoproject.org/openstack-component-tripleo/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-standalone-tripleo-master/0d304fa/logs/undercloud/home/zuul-worker/install_packages.sh.log.txt.gz 09:34
chandankumarmarios: dlrn is stuck from 01st nov and https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/36584 has not merged, standalone is not in criteria09:34
mariosysandeep: ack the module_hotfixes one 09:34
chandankumarso our fix is not yet built by dlrn09:34
chandankumarso we are kind of stuck09:35
marioschandankumar: we need the module_hotfixes so the standalone job can pass ... 09:35
mariosthen we can prepare a sacrifice to zuul09:35
ysandeepstandalone jobs are not blocking anything yet... they are not in component criteria..09:35
mariosso why didn't it promote then ysandeep chandankumar  i mean no criteria means it should just promote no? 09:37
ysandeepmarios, looks like dlrn builder is stuck09:37
mariosi see 09:37
chandankumarakahat|rover: https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/81697509:53
*** ykarel|lunch is now known as ykarel09:54
ykarelchandankumar, looking09:54
chandankumarykarel: I have also pinged dpawlik to take a look into that09:55
ykarelchandankumar, ohkk, i just checked process is stuck, so need to clear it09:56
marios\o/ thanks ykarel chandankumar++09:56
* marios food brb09:56
chandankumarsoniya29|ruck: arxcruz https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/815654/2#message-b340aa4a57b6edcdf88fe17aaa84ea0c76b40677 please have a look when free, thanks :-)10:16
soniya29|ruckchandankumar, so we have netstat implemented already10:44
chandankumarsoniya29|ruck: yes10:44
soniya29|ruckchandankumar: okay, i will abandon my patch then :)10:44
chandankumarsoniya29|ruck:  nono10:45
chandankumarsoniya29|ruck: we can call the role there in tempest.yaml playbook10:49
chandankumarvia that we can trigger netstat before/after running tempest10:49
soniya29|ruckchandankumar, okay, i will update the patch accordingly10:54
chandankumarsshnaidm: please have a look at this bug https://bugs.launchpad.net/tripleo/+bug/1950279 when free, thanks!11:01
dviroelthanks marios , answered your comment11:15
mariosdviroel: thanks will check in bit 11:16
rlandy|ruckysandeep|afk: chandankumar: hey meet.google.com/vzm-nrah-qqf  - and we can deal with the sprint board11:16
sshnaidmchandankumar, interesting.. will look11:17
chandankumarit is defined in containers.conf sshnaidm 11:17
chandankumarfind out the config11:17
sshnaidmchandankumar, seems like ubi9 doesn't exist yet..11:19
jpodivinmarios: hi, I've rebased and updated the feature set deprecation change so it's now more consistent with the state of the docs.11:56
mariosjpodivin: thanks will check again 11:57
rlandy|ruckysandeep|afk: pls vote on https://review.opendev.org/c/openstack/tripleo-quickstart/+/816858 as the UA11:59
soniya29|ruckchandankumar, some jobs on rhos-17 line are failing with same error as described in https://bugs.launchpad.net/tripleo/+bug/192149312:00
soniya29|ruckbut nova-compute logs are different, so do we need separate bug?12:01
rlandy|ruckysandeep|afk: and this one ... https://code.engineering.redhat.com/gerrit/c/openstack/rrcockpit/+/28840612:04
rlandy|ruckbefore we merge this12:04
*** ysandeep|afk is now known as ysandeep12:05
ysandeeprlandy|ruck, want to meet about sprint board?12:05
rlandy|ruckysandeep: we took care of the board12:06
rlandy|ruckpls take a look at the reviews above12:06
ysandeepgreat! ack12:06
rlandy|ruckakahat|rover: soniya29|ruck: let's ruck/rover sync12:10
rlandy|ruckakahat|rover: https://jenkins-cloudsig-ci.apps.ocp.ci.centos.org/view/phase-1-pipelines/job/rdo_trunk-promote-master-centos8-current-tripleo/12:24
akahat|roverchandankumar, re: https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/816975, these tests are not whitelisted here: https://github.com/openstack/tripleo-quickstart/blob/master/config/general_config/featureset010.yml#L8612:26
akahat|roverchandankumar, still we are expecting it to run?12:26
ysandeepdviroel: thank you for rrcockpit fix patch , I have left a comment on https://code.engineering.redhat.com/gerrit/c/openstack/rrcockpit/+/288406/1#message-e0f19339642235cd6255c30d7cf895a41ee64598 with a soft -1 to get more context.12:40
dviroelysandeep: will answer your comment in a minute, thanks sandeep12:40
dviroelysandeep: answered, thanks sandeep, left you a question there too. lol12:56
dviroelchandankumar: saw your reply. great, let's move things to won't fix, if they aren't fixed yet12:59
rlandy|ruckakahat|rover: https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-integration-main13:00
rlandy|ruckbunch of ovb failures there13:00
dviroelchandankumar: my manila bug supervisor feelings tell me that we should do this :P13:00
akahat|roverrlandy|ruck, ack13:00
rlandy|ruckplease create a revert of the patch ykarel pointed to https://review.opendev.org/c/openstack/tripleo-ansible/+/81483213:00
rlandy|ruckand test that13:00
rlandy|ruckakahat|rover: with check and periodic13:01
chandankumarsoniya29|ruck: got the issue https://bugs.launchpad.net/tripleo/+bug/1950279/comments/2 and proposing a fix for pause container13:02
soniya29|ruckchandankumar, ack13:04
chandankumarsorry it was for sshnaidm ^^13:05
sshnaidmchandankumar, thanks!13:06
ysandeepdviroel, thanks will reply13:09
chandankumarysandeep: not sure it will break downstream https://review.opendev.org/c/openstack/tripleo-ansible/+/817202 when backported to wallaby someday13:14
chandankumarplease keep an eye13:14
ysandeepchandankumar, thanks!13:15
chandankumardviroel: sure13:15
dviroelchandankumar: another thing13:16
dviroelchandankumar: wrt https://review.opendev.org/c/openstack/tripleo-repos/+/816211 - a next move would be to allow a mirror update on repos content, to avoid 'sed' commands afterwards13:16
dviroelif that can be useful13:16
dviroelit should, at least looking at all release files13:17
chandankumarwe use sed to replace mirrors with nodepool mirror13:17
chandankumarwill take a look tomorrow13:17
dviroelchandankumar:  so in a next change I can add that, in a single operation, users can provide a new '--mirror-url' to replace the one that is placed in the repo file13:18
dviroelchandankumar: the change is big enough, so i can add in a next one13:18
chandankumardviroel: yes, that sounds good13:18
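The mirror replacement discussed above (today done with sed after the repo file is written) could be sketched roughly as follows. This is only an illustration, not the tripleo-repos implementation: the function name `replace_mirror` and the example URLs are hypothetical, and the sketch assumes the repo file uses plain `baseurl=` lines.

```python
def replace_mirror(repo_text: str, old_mirror: str, new_mirror: str) -> str:
    """Rewrite baseurl lines that point at old_mirror to point at new_mirror.

    Hypothetical helper for illustration only; lines other than baseurl
    (name, enabled, gpgcheck, ...) are passed through untouched.
    """
    old = old_mirror.rstrip("/")
    new = new_mirror.rstrip("/")
    out = []
    for line in repo_text.splitlines():
        key, sep, value = line.partition("=")
        value = value.strip()
        if sep and key.strip() == "baseurl" and value.startswith(old):
            rest = value[len(old):]
            # Only rewrite when old_mirror matches a whole URL prefix,
            # so "http://mirror.example.org" does not match "...organic".
            if rest == "" or rest.startswith("/"):
                line = f"baseurl={new}{rest}"
        out.append(line)
    return "\n".join(out)


# Example with made-up hosts: swap an external mirror for a local one.
example = (
    "[baseos]\n"
    "name=BaseOS\n"
    "baseurl=http://mirror.example.org/centos/9-stream/BaseOS\n"
    "enabled=1"
)
rewritten = replace_mirror(
    example, "http://mirror.example.org/", "http://nodepool.example.net/"
)
```

Doing the substitution at generation time in one place is the design point raised in the conversation: it avoids a second pass with sed over every release file.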
ysandeeparxcruz, sshnaidm, rlandy, marios, ysandeep, bhagyashris, svyas, soniya29, pojadhav, akahat, chandankumar, frenzy_friday, anbanerj, dviroel community mtg in 2 mins13:28
ysandeepplease add your agenda https://hackmd.io/MMg4WDbYSqOQUhU2Kj8zNg?both if any13:29
rlandy|ruckakahat|rover: looking at https://bugs.launchpad.net/bugs/195032313:34
rlandy|ruckrelease target is yoga-113:34
rlandy|ruckthis is a promotion blocker13:34
akahat|roverokay.. adding tags13:35
rlandy|ruckakahat|rover: pls add closes-bug: # 1950323 on the revert13:35
akahat|roverrlandy|ruck, ack13:35
rlandy|ruckand testproject that13:35
rlandy|ruckakahat|rover: hey - do you have a revert yet? otherwise will add one14:17
akahat|roverrlandy|ruck, no. just checking how to fix it.. 14:17
ysandeeprlandy|ruck, dhill proposed a fix: https://review.opendev.org/c/openstack/tripleo-ansible/+/817220 14:18
ysandeepwe can testproject if ^^ works 14:18
rlandy|ruckysandeep: thanks - I see that now14:18
* rlandy|ruck testprojects14:19
rlandy|rucksoniya29|ruck: hey - need help with the cix?14:24
soniya29|ruckrlandy|ruck, nope..all okay14:26
*** ysandeep is now known as ysandeep|dinner14:27
soniya29|ruck|dinnerrlandy|ruck, https://bugzilla.redhat.com/show_bug.cgi?id=202153614:56
rlandy|rucksoniya29|ruck|dinner: thanks15:02
rlandy|rucksoniya29|ruck|dinner: you need to send an email to #rhos-dev with the bug details15:20
chandankumaritem=[''] | error={"ansible_loop_var": "gateway_ip", "changed": false, "cmd": ["ping", "-w", "10", "-c", "1", "[]"], "delta": "0:00:00.004946", "end": "2021-11-09 09:56:56.878234", "gateway_ip": [""], "msg": "non-zero return code", "rc": 2, "start": "2021-11-09 09:56:56.873288", "stderr": "ping: []: Name or service not known",15:23
chandankumar"stderr_lines": ["ping: []: Name or service not known"], "stdout": "", "stdout_lines": []}15:23
chandankumarIs this coming on cs8 also?15:23
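In the pasted error the loop item is `['']` and ping ends up being called with `[]` as its target. A minimal sketch of guarding against that, assuming the gateway list can contain empty strings or nested empty lists; the helper name `ping_commands` is hypothetical and this is not the fix proposed in tripleo-ansible:

```python
def ping_commands(gateways, timeout=10, count=1):
    """Build ping argv lists, skipping empty or blank gateway entries.

    Hypothetical helper for illustration; it mirrors the argument order
    seen in the pasted error ("ping -w 10 -c 1 <target>").
    """
    cmds = []
    for gw in gateways:
        # Accept both flat entries ("192.0.2.1") and nested lists ([""]).
        candidates = gw if isinstance(gw, (list, tuple)) else [gw]
        for ip in candidates:
            if isinstance(ip, str) and ip.strip():
                cmds.append(
                    ["ping", "-w", str(timeout), "-c", str(count), ip.strip()]
                )
    return cmds
```

With this filter an input such as `[""]` produces no ping command at all, instead of a ping against a literal `[]` that fails name resolution.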
ysandeepdviroel: replied to your query on https://code.engineering.redhat.com/gerrit/c/openstack/rrcockpit/+/288406/1#message-8e5753d42801c9704dcb2b0796d26c129318424c , For some reason i don't have +w permission on this repo.. We can request rlandy|ruck to merge this.15:23
marioschandankumar: i have something different in latest fs1 run - issue with introspection https://logserver.rdoproject.org/67/36267/25/check/tripleo-stream9-development-centos-9-ovb-3ctlr_1comp-featureset001-master/1f9a296/logs/undercloud/home/zuul-worker/overcloud_introspect.log.txt.gz   - i wonder if we need to wait for dlrn builder still (might still be getting stale stuff after this morning restart of the 15:29
rlandy|ruckysandeep: dviroel: we're still off on the downstream cockpit component tracking data15:29
rlandy|ruckpromoted-components/2021-11-05 06:39 -  15:29
rlandy|ruck7 days agocomputepromoted-components4 days ago15:30
rlandy|ruckoh wait nvm15:30
rlandy|ruckjust updated15:30
rlandy|ruckpromoted 3 hours ago15:30
chandankumarmarios: yes, let's wait for tomorrow15:31
ysandeeprlandy|ruck, could you please merge  https://code.engineering.redhat.com/gerrit/c/openstack/rrcockpit/+/288406/ , I am okay with this change.. I don't have +w/submit rights there15:32
rlandy|ruckysandeep: ack15:32
rlandy|ruckysandeep: we have a 1:1 meeting on friday  - let's get your permission set then15:32
* ysandeep checking calendar, i don't recall invite for 1:1 15:33
ysandeeprlandy|ruck, I don't see anything in my calendar for 1:1 on Friday, I will send an invite for 1:115:35
rlandy|ruckysandeep: pls check now15:36
rlandy|ruckyou should see the invite15:36
rlandy|ruckakahat|rover: will miss your 1-1 tomorrow for the prod chain meet up15:36
rlandy|ruckwill catch you on friday as well15:36
ysandeeprlandy|ruck, it conflicts with my df mtg, possible to prepone by half an hour?15:37
rlandy|ruckysandeep: moving it15:37
ysandeepthanks ++15:38
ysandeepfyi.. 17 integration lines some jobs failed on tempest, I am rerunning them now: https://code.engineering.redhat.com/gerrit/c/testproject/+/190672 , In case issue persists i will debug tomorrow in my morning.15:39
rlandy|ruckysandeep: soniya29|ruck|dinner already bugged that15:40
rlandy|ruckwe are working on it15:40
rlandy|ruckysandeep: ^^15:41
rlandy|rucksce 001 004 and 01015:41
rlandy|ruckysandeep: 16.2 was fixed yesterday afternoon15:41
rlandy|ruckso that should be ok now15:41
ysandeepyeah looks in much better shape, I am keeping an eye on current run for promotion.15:42
ysandeepsoniya29|ruck|dinner, thanks for reporting that!15:42
rlandy|ruckfor 17 001 004 and 010 have been failing for a couple days15:42
rlandy|ruckso we have a real issue there15:43
rlandy|ruckmasked by some infra fun we had until now15:43
rlandy|ruckasking soniya29|ruck|dinner to CIX that15:43
rlandy|ruckwill look into it more after meetings15:43
chandankumarmarios: Do we have a bug open for image sanity check issue?15:44
marioschandankumar: yeah https://bugs.launchpad.net/tripleo/+bug/1949765 15:45
marioschandankumar: still not working i am posting something in a sec 15:45
marioschandankumar: have you seen something related?15:45
chandankumarmarios: https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-9-buildimage-overcloud-full-master&project=openstack/tripleo-ci15:45
chandankumarmay be need a bug for that15:45
ysandeeprlandy|ruck, looks like nova problem: https://sf.hosted.upshift.rdu2.redhat.com/logs/95/200295/198/check/periodic-tripleo-ci-rhel-8-scenario004-standalone-rhos-17/a9d48dc/logs/undercloud/var/log/extra/errors.txt 15:45
marioschandankumar: oh you mean that image_sanity ... that it fails... so not the bug i pointed to then 15:46
* ysandeep adding details to bugzilla15:46
marioschandankumar: i don't think we have one actually 15:46
chandankumarmarios: can you please open up?15:47
marioschandankumar: sure but will do tomorrow15:47
marioschandankumar: in the middle of something now15:47
chandankumarmarios: yes, sure 15:47
ysandeeprlandy|ruck, soniya29|ruck|dinner fyi.. updated component of that bug.. should go to nova instead of rhos-release 15:53
ysandeepsoniya29|ruck|dinner, i have sent the CIX mail 16:16
* ysandeep|out out, see you tomorrow o/16:21
akahat|roverrlandy|ruck, no worries we can catch later. 16:23
dviroelrlandy|ruck: seems to continue to work:16:31
dviroel`Nov  9 15:35 conf_ruck_rover.yaml`16:33
rlandy|ruckdviroel++ nice16:33
dviroelinside telegraf container16:33
dviroellets wait a second run of cron task16:33
soniya29|ruck|dinnerysandeep|out, thanks16:45
soniya29|ruckrlandy|ruck, anything you want me to look at tomorrow morning?16:47
rlandy|rucksoniya29|ruck: pls see updated bug ysandeep|out updated https://bugzilla.redhat.com/show_bug.cgi?id=202153616:47
soniya29|ruckrlandy|ruck, i have seen the updates16:48
rlandy|rucksoniya29|ruck: note private comments16:48
rlandy|rucksoniya29|ruck: also .. https://jenkins-cloudsig-ci.apps.ocp.ci.centos.org/view/phase-1-pipelines/16:48
rlandy|ruckmaster failure16:48
rlandy|ruckand train line if that does not promote by tomorrow16:49
rlandy|ruckrerunning there16:49
soniya29|ruckrlandy|ruck: okay16:52
rlandy|rucksome idea here16:52
dviroelis this the revert that you need https://review.opendev.org/c/openstack/tripleo-ansible/+/817251 ?16:56
dviroelrlandy|ruck: ^16:56
rlandy|ruckdviroel: ack - there is also a fix patch flying around16:57
rlandy|ruckso one of those two should help16:57
dviroelyeah, looking atm16:57
rlandy|ruck David Hill proposed openstack/tripleo-ansible master: Add the MTU size to ping to catch network issues  https://review.opendev.org/c/openstack/tripleo-ansible/+/8172416:58
rlandy|ruckdviroel: ^^16:58
rlandy|rucknot sure which one will land first16:58
rlandy|ruckbut the patch keeps changing16:59
soniya29|ruckrlandy|ruck: leaving for the day16:59
rlandy|ruckwaiting for that patch to settle16:59
rlandy|rucksoniya29|ruck: have a good night16:59
soniya29|ruckrlandy|ruck, :)16:59
rlandy|rucklunch - brb17:07
* akahat|rover leaving for the day!!17:15
rlandy|ruckugh - image mount still failing18:28
rlandy|ruckakahat|rover: just fyi - program doc is updated for tomorrow18:59
rlandy|ruck2021-11-09 19:54:27.672519 | 00f3f7e9-175e-01fc-ab4e-000000000013 |      FATAL | Expand roles | localhost | error={"msg": "The task includes an option with an undefined variable. The error was: 'default_image' is undefined\n\nThe error appears to be in '/usr/share/ansible/tripleo-playbooks/cli-overcloud-node-provision.yaml': line 92, column 7, but may\nbe elsewhere in the file depending on the exact syntax20:07
rlandy|ruckproblem.\n\nThe offending line appears to be:\n\n\n    - name: Expand roles\n      ^ here\n"}20:07
rlandy|ruckakahat|rover: ^^ fyi20:07
rlandy|ruckfailure on master in jenkins20:07
rlandy|ruckcreating bug20:07
*** dviroel is now known as dviroel|out20:51
rlandy|rucktrain is promoting21:12

