Tuesday, 2023-01-24

*** ysandeep|out is now known as ysandeep05:05
ysandeepo/ good morning 05:06
ysandeepreviewbot, please add in reviewlist: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/87149305:06
reviewbotI have added your review to the Review list05:06
ysandeepakahat, hi o/ You once wrote a doc - how to deploy standalone using ansible.. do you still have that handy?06:05
marioso/ 06:09
ysandeephey marios o/06:16
marioso/06:25
*** amoralej|off is now known as amoralej07:13
Tengu|roverhello there07:21
*** ysandeep is now known as ysandeep|lunch07:39
akahatysandeep, o/ https://amolkahat.github.io/deploy-standalone-tripleo-using-tripleo-operator-ansible.html 07:39
Tengu|roveroh, there were promotions over the night. gooood07:46
*** ysandeep|lunch is now known as ysandeep08:10
ysandeepakahat, thanks!08:10
Tengu|roverok, what's missing in wallaby for a promotion....08:22
mariosbhagyashris|ruck: o/ 08:31
*** jpena|off is now known as jpena08:35
*** ysandeep is now known as ysandeep|afk08:55
*** ysandeep|afk is now known as ysandeep09:07
ysandeeprlandy|out, marios Tengu|rover :) huff.. finally found out what's the issue with the c8 train internal job: the "nameserver 127.0.0.1" entry in /etc/resolv.conf is causing it. Took me a while to figure that out because we were correcting the dns entry before the job finished, so when I logged in on the reproducer node the dns entry was correct and I couldn't reproduce the issue. 09:10
ysandeeprlandy_, marios Tengu|rover https://privatebin.corp.redhat.com/?655ab6f811d57f58#ECzq7cxaopuRvi1w9CLWeR1jJ14Fpts24j9PnfBg6RJ609:11
Tengu|roverysandeep: linked to networkmanager?09:11
ysandeepTengu|rover, https://privatebin.corp.redhat.com/?655ab6f811d57f58#ECzq7cxaopuRvi1w9CLWeR1jJ14Fpts24j9PnfBg6RJ6 , with "nameserver 127.0.0.1" as first entry in resolv.conf -> node was not able to resolve internal links 09:12
Tengu|roverysandeep: care to check the content of /etc/NetworkManager/NetworkManager.conf and ensure there's a "dns=none" in the [main] section?09:13
Tengu|roverysandeep: I can work with you on that matter, if needed - I nudged things in upstream infra repo for that already.09:13
ysandeepTengu|rover, http://pastebin.test.redhat.com/108921409:14
ysandeepdns=none not present09:14
Tengu|roverok - so it's missing the setting :)09:14
mariosthnks for digging ysandeep - cant quickly find where that comes from (127) just looked a bit with codesearch 09:14
Tengu|rover-> need to inifile it, and reload the NetworkManager.service systemd unit09:15
mariosprobably default and not something we set09:15
ysandeepyes, looks like coming from default c8 node.. 09:15
ysandeepTengu|rover, trying your suggestion locally09:16
mariosysandeep: this isnt ipa job but noting that the ipa job has explicit task to remove that there https://opendev.org/openstack/tripleo-quickstart-extras/src/commit/c13eb508987b853c93bff024c54402ee605aef09/roles/ipa-multinode/tasks/ipaserver-undercloud-setup.yml#L156 09:16
Tengu|rovermarios: sounds sooo wrong actually.09:16
Tengu|roverthat should be done via the correct networkmanager setting.09:17
Tengu|roverelse, it will override the /etc/resolv.conf at any time.09:17
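[Editor's sketch, not part of the log] The "inifile it" step discussed above amounts to making sure a key is set inside the [main] section of /etc/NetworkManager/NetworkManager.conf. The following is a minimal Python sketch of that idea under stated assumptions: the helper name ensure_ini_option and the sample file content are hypothetical, and the function only transforms text. Writing the file and reloading NetworkManager.service, which the discussion also calls for, are not covered here.

```python
def ensure_ini_option(text: str, section: str, key: str, value: str) -> str:
    """Return `text` with `key=value` set inside `[section]`.

    Hypothetical helper for illustration: it keeps comments and other
    options untouched, and adds the section if it is absent.
    """
    out = []
    in_section = False      # currently inside the target section
    found_section = False   # target section seen at least once
    done = False            # key already written
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving the target section without having seen the key: add it.
            if in_section and not done:
                out.append(f"{key}={value}")
                done = True
            in_section = stripped[1:-1] == section
            found_section = found_section or in_section
        elif in_section and not stripped.startswith("#") and "=" in stripped:
            if stripped.split("=", 1)[0].strip() == key:
                # Replace the existing value (and drop duplicates).
                if not done:
                    out.append(f"{key}={value}")
                    done = True
                continue
        out.append(line)
    if in_section and not done:
        out.append(f"{key}={value}")
    if not found_section:
        out.extend([f"[{section}]", f"{key}={value}"])
    return "\n".join(out) + "\n"


# Sample content is an assumption, not the file from the affected node.
sample = "[main]\nplugins=ifcfg-rh\n\n[logging]\nlevel=INFO\n"
print(ensure_ini_option(sample, "main", "dns", "none"))
```

The same helper could be called with "rc-manager" and "unmanaged" for the variant mentioned later in the discussion. Running it twice on its own output leaves the text unchanged, which is the property a config-management task would want.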
Tengu|rovermarios: worth pushing a change request against that task file in order to ensure NM is correctly configured?09:19
Tengu|rovermarios, ysandeep https://opendev.org/openstack/tripleo-quickstart-extras/src/branch/master/playbooks/baremetal-full-freeipa.yml#L85-L9709:20
Tengu|roverysandeep: you want the same -^^09:20
ysandeepTengu|rover, didn't help, could you please check if I missed something: http://pastebin.test.redhat.com/1089216 09:22
Tengu|roverysandeep: hmm. So, you want to: ensure NM is properly configured, and reload it. and then only you'll be able to publish the /etc/resolv.conf09:23
Tengu|roverysandeep: pretty sure the resolv.conf is edited prior to NM being configured/reloaded, and that the resolv.conf is edited in an "append" way. Since I don't know what you're actually running, I can't say for sure. Do you happen to have a playbook and related things?09:24
Tengu|roverysandeep: we can even jump in a meet if you want.09:25
* pojadhav afk for ~1hr 09:27
ysandeepTengu|rover, sure.. lets meet on gmeet09:27
ysandeepmeet.google.com/rix-wdxh-vuk09:27
Tengu|roverwallaby on cs9 will promote shortly!09:29
Tengu|rovermarios: -^09:29
mariosnice09:29
Tengu|roverand we should get a clean resolution for that resolver issue pointed by ysandeep.09:36
Tengu|rovermarios: ah, quick question related to the dashboard: the second square,  named "RDO promotion", shows a large gap - last promotions being months ago, while last builds are usually far closer to our current time. Is this normal? iirc you told me something about it, but I don't remember.09:48
Tengu|rover(talking about http://dashboard-ci.tripleo.org/d/mhV51gdVk/upstream-and-rdo-promotions?orgId=1 - sorry)09:48
mariosno longer need to worry about current-tripleo-rdo so can ignore09:52
mariosTengu|rover: ^ 09:52
Tengu|rovermarios: ok! is this something to check whenever a new release is cut or something?09:52
Tengu|roveror is it really just a dead topic09:53
mariosno used to be part of the prod chain but not for a while now (used to go current-tripleo then current-tripleo-rdo )09:53
Tengu|roverok09:53
Tengu|roverwoot! promoted!10:04
Tengu|roverall stable + master are fresh from either yesterday or today!10:04
Tengu|rover(though train doesn't have new content for some days, now, so... it's "old", though up-to-date)10:05
Tengu|roverlovely10:05
marios:)10:05
mariosstop.saying.that.10:05
marioso_O10:05
marios:D10:05
Tengu|rover:]10:05
Tengu|roversorry for being enthusiastic about that silly thing called promotion ;)10:06
mariosi am joking of course like don't jinx it ;)10:06
Tengu|roveroh, well, it will eventually blow anyway, jinxed or not ;)10:06
mariosforgive him oh zuul, he knows not what he says!10:07
ysandeepTengu|rover, marios: Alternative solution - we are using dib to build that c8 image, in c8 we use unbound local resolver in upstream as well.. we were missing correct forwarders in unbound - configuring correct downstream nameservers as forwarders solved the issue as well: http://pastebin.test.redhat.com/108922210:07
Tengu|roverysandeep: soooo. it's a bit more complicated, but yeah10:08
mariosysandeep: thanks what was the primary (network manager?)10:08
Tengu|roverysandeep: in case you want to actually use the unbound service, you'll still need to ensure rc-manager=unmanaged is present in the NM config + service reload.10:08
Tengu|roverysandeep: else, NM may override the whole file in a way you don't want.10:09
mariosysandeep: really my question is is there a reason we want to do the alternative did the primary fix not work/other blocker to do it? 10:09
Tengu|roverysandeep: so, "whatever", but you want to ensure NM isn't editing the /etc/resolv.conf under any circumstances.10:09
ysandeepAs this is c8, I think we should keep using unbound with correct forwarders to be in sync with upstream jobs.10:09
Tengu|rovermarios: now that ysandeep said it, I remember my whole work on the upstream infra was to actually ensure we were using unbound without any of the NM interference.... 10:10
Tengu|roverysandeep: yeah, that's probably the best approach. so pushing the correct forwarder, while ensuring NM doesn't touch the /etc/resolv.conf - this means reloading both services10:11
Tengu|roverbut that's for the better.10:11
ysandeepmarios, iirc.. for c8 - unbound is the default local resolver and that's why we have "nameserver 127.0.0.1" entry in the first place10:11
Tengu|roverand... yeah, this explains why it can't resolve, actually. If the unbound was down, it would fall back on the second or third nameserver set in the config.10:12
Tengu|roverbut since unbound is running, and answers, it will throw some NXDOMAIN which is a valid answer, thus... crash10:12
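[Editor's sketch, not part of the log] The reasoning above is that a stub resolver only moves on to the next "nameserver" entry when the current one fails to answer, while an NXDOMAIN reply counts as a final answer. The Python model below is a deliberately simplified illustration of that behaviour, not the real resolver code: the exception classes, the fake nameserver functions, the hostname gerrit.internal.example and the 192.0.2.10 address are all assumptions made up for the example.

```python
class NoResponse(Exception):
    """The nameserver did not answer (service down or timed out)."""


class NXDomain(Exception):
    """The nameserver answered: the name does not exist."""


def resolve(name, nameservers):
    """Try each nameserver in resolv.conf order.

    Only a missing answer triggers a fallback; NXDomain propagates
    immediately because it is a valid, final reply.
    """
    for query in nameservers:
        try:
            return query(name)
        except NoResponse:
            continue
    raise NoResponse(f"no nameserver answered for {name}")


# Hypothetical internal record, for illustration only.
INTERNAL_RECORDS = {"gerrit.internal.example": "192.0.2.10"}


def unbound_with_wrong_forwarders(name):
    # Local unbound is up, but its forwarders do not know internal names.
    raise NXDomain(name)


def stopped_unbound(name):
    # Nothing is listening on 127.0.0.1.
    raise NoResponse("127.0.0.1")


def internal_nameserver(name):
    try:
        return INTERNAL_RECORDS[name]
    except KeyError:
        raise NXDomain(name) from None


host = "gerrit.internal.example"

# Unbound down: the resolver falls back to the second entry and succeeds.
print(resolve(host, [stopped_unbound, internal_nameserver]))

# Unbound up with wrong forwarders: NXDOMAIN ends the lookup.
try:
    resolve(host, [unbound_with_wrong_forwarders, internal_nameserver])
except NXDomain as exc:
    print(f"NXDOMAIN for {exc}; second nameserver never consulted")
```

This is why fixing the forwarders (or removing the 127.0.0.1 entry) resolves the failure, while merely having a working internal nameserver listed second does not.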
Tengu|roverysandeep++ for the digging!10:12
Tengu|roverand dumb me for not remembering that very same topic for the upstream - it was during the end of last year.10:13
mariosysandeep: Tengu|rover: thanks i10:14
ysandeepTengu|rover, so I think these 3 things 1) Configure correct forwarders in unbound for downstream case + unbound reload 2)  rc-manager=unmanaged is present in the NM config 3) NM service reload10:17
* ysandeep checking in which pre we can include above ^^10:18
Tengu|roverysandeep: yep, that sounds like the right plan10:18
ysandeepTengu|rover, marios thanks!10:18
Tengu|roverysandeep: lemme know when you have reviews up10:28
reviewbotDo you want me to add your patch to the Review list? Please type something like add to review list <your_patch> so that I can understand. Thanks.10:28
Tengu|roverhmm we may have an issue with master. There are 2 issues with Tempest, one for periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-master and the other for periodic-tripleo-ci-centos-9-ovb-1ctlr_2comp-featureset020-master. Not really sure what to look for. One is related to network settings, the other one to cinder (snapshot). It seems to loop on those 2.11:05
Tengu|roverread timeout. Maybe infra is overloaded?11:05
Tengu|rover#lunch11:07
*** rlandy|out is now known as rlandy11:09
rlandyysandeep: ok to w+ https://review.opendev.org/c/openstack/tripleo-heat-templates/+/871493?11:13
rlandymarios: bhagyashris|ruck: hi  not sure if you saw ping slack11:13
ysandeeprlandy, yes11:13
rlandydone11:14
ysandeepthanks11:14
rlandyfrenzy_friday: is there a way to test https://code.engineering.redhat.com/gerrit/c/openstack/rrcockpit/+/438919 - if not, pls comment11:14
rlandyand then we can merge and try it out11:15
rlandyif you are around to watch it and revert11:15
frenzy_fridayrlandy, commented. Yep, I think we can merge and I'll revert if stuff breaks11:16
rlandyfrenzy_friday: ok - I am going to workflow - pls keep your eyes on the board and let me know before your EoD if we are ok 11:17
frenzy_fridayrlandy, cool, thanks11:17
*** dviroel|out is now known as dviroel11:18
rlandydone11:18
rlandybhagyashris|ruck: ^^ fyi if you are watching the downstream dashboard11:19
*** ysandeep is now known as ysandeep|afk11:22
dpawlikmarios: o/11:24
dpawlikhttps://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-containers-multinode-master/b6b75d6/job-output.txt11:24
dpawlikI check  logs on quay11:24
dpawlikthat contains that two ip address: 38.102.83.94 and 38.102.83.3911:25
dpawlikand there is nothing related to those ips11:26
dpawlikmarios: could you recheck some job to get new logs?11:28
mariosdpawlik: the issue is resolved now so we won't get the issue reproduced any more with recheck ... but you can check newer run there for example (2 top results are from last periodic runs/green)11:35
mariosrlandy: which one there were a couple 13:13 < rlandy> marios: bhagyashris|ruck: hi  not sure if you saw ping slack11:35
mariosrlandy: replied in slack... :)11:36
mariosdpawlik: there https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-containers-multinode-master&skip=0 11:37
rlandymarios: bhagyashris|ruck: can we meet for 5 re: 17.1?11:38
bhagyashris|ruckrlandy, sure11:38
mariosrlandy: k 11:38
rlandyhttps://meet.google.com/uvk-qdgj-uro?pli=1&authuser=011:39
*** ysandeep|afk is now known as ysandeep12:11
ysandeepTengu|rover, https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/438957 12:26
Tengu|roverysandeep: checking12:27
ysandeepI have added a condition to add 'rc-manager=unmanaged' only for the c8 case, to limit the breakage scope of this patch, as this is a config patch and we can't speculatively test it12:28
Tengu|roverysandeep: reviewed. It's... well. I'd just configure NM in any cases, especially seeing the other tasks present in this file.12:34
ysandeepTengu|rover, ++ thanks for the review, I will update/comment after mtgs.. 12:54
ysandeeparxcruz++ great demo \o/12:56
arxcruz\o/12:56
*** amoralej is now known as amoralej|off13:11
*** amoralej|off is now known as amoralej|lunch13:11
Tengu|rovercan anyone help me debug a couple of tempest issues? :)13:13
rlandyysandeep: thanks for debugging the train ovb failures13:16
ysandeeprlandy, :) happy to help, I will fix the review comment from Tengu|rover and send it back.13:16
ysandeepTengu|rover, tempest failures - are those consistent? just a heads-up that sometimes infra acts up and we see random tempest failures.13:19
Tengu|roverysandeep: I think they are consistent, yes. the same 2 jobs are constantly failing as far as I can see.13:19
Tengu|roverhttp://dashboard-ci.tripleo.org/d/mhV51gdVk/upstream-and-rdo-promotions?orgId=1&viewPanel=14 13:20
ysandeepdashboard is very slow here.. what is the job name?13:21
Tengu|roverhttps://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-master&pipeline=openstack-periodic-integration-main&skip=013:21
Tengu|roverhttps://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-ovb-1ctlr_2comp-featureset020-master&pipeline=openstack-periodic-integration-main&skip=013:21
Tengu|roverhmm.13:21
Tengu|roverit shows something else here than the grafana13:22
Tengu|roverfun..... grafana shows more runs o_O13:22
Tengu|rovermaybe I should just rekick both jobs?13:24
Tengu|roverhmm. yeah... let's see.13:25
ysandeepTengu|rover, failing tests are different on both jobs.. and as per build history they failed on other issues earlier13:27
ysandeepso yeah lets recheck 13:27
Tengu|rovernudged.13:28
Tengu|roverif we can get a master promotion today, that would be 2 in a row :).13:28
ysandeepTengu|rover, what kind of templating you have in mind for https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/438957/1/playbooks/tripleo-rdo-base/configure-nameserver.yaml#22 13:30
ysandeep* we first find any IP address, replace with internal nameservers?13:31
Tengu|roverysandeep: I don't know the actual file format expected by unbound, but... basically, something that ensures file consistency, be it an actual ansible.builtin.template, or an ansible.builtin.copy with content: | full file content13:31
Tengu|roverat least, something that ensures consistency even if the nameservers are changed at some point.13:31
ysandeepforwarding.conf doesn't have many entries - https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario007-standalone-train/c47a58f/logs/undercloud/etc/unbound/forwarding.conf , so copy/template sounds good.13:33
Tengu|roverwhat I thought. yeah.13:35
Tengu|roverso that we override the whole file, and we're sure we don't have any weird content in there.13:35
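[Editor's sketch, not part of the log] The idea above is to regenerate forwarding.conf as a whole from a list of nameservers instead of patching addresses in place, so the file stays consistent when the nameservers change. The Python sketch below renders such a file using unbound's forward-zone syntax and keeps only IPv4 addresses, in line with the "pure v4 resolvers first" suggestion that follows. The function name render_forwarding_conf is hypothetical, the addresses are documentation-range placeholders rather than real internal resolvers, and the real file may carry additional settings.

```python
import ipaddress


def render_forwarding_conf(forwarders):
    """Render a complete forwarding.conf body for unbound.

    Hypothetical helper: every call produces the full file content, so
    stale or unexpected entries cannot survive an update.
    """
    # Keep IPv4 forwarders only; invalid addresses raise ValueError here.
    ipv4 = [addr for addr in forwarders
            if ipaddress.ip_address(addr).version == 4]
    if not ipv4:
        raise ValueError("at least one IPv4 forwarder is required")
    lines = ["forward-zone:", '  name: "."']
    lines += [f"  forward-addr: {addr}" for addr in ipv4]
    return "\n".join(lines) + "\n"


# Placeholder addresses (RFC 5737 / RFC 3849 documentation ranges).
print(render_forwarding_conf(["192.0.2.53", "2001:db8::53", "192.0.2.54"]))
```

In an Ansible context the returned string would correspond to the content of a copy or template task, followed by an unbound reload.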
Tengu|roverysandeep: maybe beware of the IPv6 resolvers?13:35
Tengu|rovernot sure if Red Hat provides any... ?13:35
ysandeepmaybe we should just use ipv4 resolvers for the downstream case13:36
Tengu|roverah. hmm. VPN seems to provide an IPv6, meaning there should be some v6 resolver.13:36
Tengu|roverysandeep: yeah, we can start with pure v4 resolvers first.13:36
Tengu|roverand if need be, we may dig further into that.13:36
ysandeepTengu|rover, ack for pure ipv4 resolver first.. left a link for you in internal channel about dns info I found13:40
*** ysandeep is now known as ysandeep|afk13:43
*** dasm|off is now known as dasm13:58
dasmo/13:58
*** amoralej|lunch is now known as amoralej14:05
bhagyashris|ruckrlandy, fs001 passed rhos17.1 https://code.engineering.redhat.com/gerrit/c/testproject/+/438532/6#message-a46a1bebeb0376240d5a1bd2ff86480acd79407414:14
rlandyTengu|rover: sorry - still need help?14:15
Tengu|roverrlandy: I think I'm good for now14:15
rlandyk14:15
Tengu|roverjust need to get back to the 9.2 testproject you created14:15
* Tengu|rover all over the place14:16
bhagyashris|ruckwe can merge skip patch14:16
rlandyTengu|rover: yeah - wanted to give that some focus in my afternoon14:21
rlandyand get those reviews in14:21
rlandyyou are busy with rr14:21
Tengu|roverrlandy: apparently we need to nudge the qcow2 image link for 9.214:21
rlandyTengu|rover: ok - let's touch base before you are EoD14:22
Tengu|roverrlandy: nudged qcow2 in both patches (tripleo-environment + tripleo-ci-internal-jobs)14:26
Tengu|roverwe therefore should be able to re-kick your testproject.14:26
* bhagyashris|ruck leaving for the day 14:48
pojadhavCommunity Call in 5 mins : arxcruz, rlandy, marios, ysandeep, bhagyashris|ruck , svyas, soniya29, pojadhav, akahat, chandankumar, frenzy_friday, anbanerj,  dviroel, dasm, Tengu, jgilaber14:55
pojadhavhttps://meet.google.com/igc-nxwj-gws?authuser=014:55
Tengu|roveralready in :)14:56
pojadhavhttps://hackmd.io/iraYQWGBT4qPCKH0VNG31A#2023-01-24-Community-Call14:56
pojadhavfolks, please add agenda if any14:56
*** dviroel is now known as dviroel|lunch15:19
* pojadhav afk15:22
*** ysandeep|afk is now known as ysandeep|out15:24
* ysandeep|out out, see everyone tomorrow o/15:24
dasmysandeep|out: o/15:25
*** dviroel|lunch is now known as dviroel16:30
*** dviroel is now known as dviroel|doc_appt16:43
*** amoralej is now known as amoralej|off16:53
Tengu|roverOK Folks - going offline. See you tomorrow!16:57
*** jpena is now known as jpena|off17:21
dasmTengu|rover: o/17:25
*** dviroel|doc_appt is now known as dviroel19:13
frenzy_fridayrlandy, dasm looks like the cockpit patch didnt mess anything up : http://tripleo-cockpit.lab4.eng.bos.redhat.com/d/MmX0tFSVk/osp-component-ci?orgId=1 - We still have data21:37
frenzy_fridayWe dont need to revert it I think21:38
rlandyfrenzy_friday++ nice - thanks for checking21:38
rlandyfrenzy_friday: it's late for you ... tomorrow pls see slack - requesting your review on 9.2 patches on review list21:38
dasmfrenzy_friday++ thanks for checking that.21:39
dasmfrenzy_friday: i would be surprised if it would affect other views. But you never know with software :D21:39
*** dviroel is now known as dviroel|out22:44
*** rlandy is now known as rlandy|out23:01
* dasm => offline23:09
*** dasm is now known as dasm|off23:09
