Friday, 2022-08-26

dviroel|rovermaster promoted, reverting00:08
dviroel|rovercurrent-tripleo/2022-08-26 00:0200:08
rlandynice01:05
rlandytrain is promoting01:05
*** rlandy is now known as rlandy|out01:33
*** pojadhav|out is now known as pojadhav01:35
*** dviroel|rover is now known as dviroel|out01:37
*** ysandeep|out is now known as ysandeep02:21
ysandeepgood morning oooci o/02:25
ysandeeprlandy|out, regarding sc01 slowness card.. its not only sc01 - all the rdo jobs are slow03:28
ysandeep https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-standalone-wallaby/bbc1dd7/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz03:28
ysandeeptook ~30 mins: 2022-08-25 18:51:02.273791 | fa163e1b-009f-a236-db4b-000000000889 |     TIMING | tripleo_firewall : Manage firewall rules | standalone | 0:29:08.429153 | 1684.59s03:28
ysandeepSimply running: iptables -t filter -L INPUT - takes 40 seconds03:28
ysandeepiptables -t filter -L INPUT -n returns quite quickly03:29
ysandeepI am debugging with takashi03:29
ysandeephttps://www.fir3net.com/UNIX/Linux/iptables-l-output-displays-slowly.html03:29
ysandeephttps://serverfault.com/questions/791911/centos-extremely-slow-dns-lookup suggests we remove 127.0.0.1 from resolv.conf entry and that resolved the issue03:55
ysandeeplooks like there was a historical reason they added 127.0.0.1 : https://opendev.org/openstack/tripleo-ci/commit/5d612318c95b9b3ff78e66e91d5c225274cb8b0904:06
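Editor's note: the slowness ysandeep traces here comes from `iptables -L` doing a reverse DNS lookup per rule (`-n` skips them), and each lookup stalls when resolv.conf points at 127.0.0.1 with no local resolver listening. The fix he lands on can be sanity-checked mechanically; a minimal sketch, where the helper name and sample content are illustrative and not from tripleo-ci:

```python
# Flag loopback nameserver entries in resolv.conf content.
# Without a local resolver (e.g. unbound) listening on 127.0.0.1,
# every reverse lookup that `iptables -L` attempts must time out first.

def loopback_nameservers(resolv_conf: str) -> list:
    """Return nameserver addresses that point back at the host itself."""
    hits = []
    for line in resolv_conf.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "nameserver" and parts[1].startswith("127."):
            hits.append(parts[1])
    return hits

sample = """\
search openstacklocal
nameserver 127.0.0.1
nameserver 10.0.0.250
"""
print(loopback_nameservers(sample))  # ['127.0.0.1']
```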
*** soniya29|ruck is now known as soniya2904:56
ysandeepfinally, we know what the issue is: the unbound service is not present in RDO images06:04
ysandeepthat's the difference with Upstream.. causing slowness in dns resolution06:05
ysandeeplet me check with infra and save ~40 min on each of our job runs06:05
ysandeepfyi.. in case someone wants the details https://bugs.launchpad.net/tripleo/+bug/1983718/comments/1006:06
ysandeepupdated: https://bugs.launchpad.net/tripleo/+bug/1983718 and https://trello.com/c/R6SuOv6E/2661-cixlp1983718tripleociproa-periodic-master-scen1-standalone-fails-timeout-manage-firewall-rules06:25
*** pojadhav is now known as pojadhav|ruck06:47
*** jm1|ruck is now known as jm1|rover06:48
jm1happy friday #oooq ysandeep pojadhav|ruck :)06:48
pojadhav|ruckjm1, happy friday you too :)06:48
ysandeepjm1, good morning o/ 06:49
frenzyfridayhey, does anyone know this error: "SELinux relabeling of /etc/pki is not allowed"? I am getting this while bringing the telegraf container up in downstream. I have already set selinux on the host to permissive07:33
frenzyfridayI have tried adding privileged: true to the docker compose07:36
jm1frenzyfriday: so the line this is coming from is https://github.com/rdo-infra/ci-config/blob/63b70523433c31df47eb5cddef19a7a3e9c96a2d/ci-scripts/infra-setup/roles/rrcockpit/files/docker-compose.yml#L8707:44
jm1frenzyfriday: question is, why do we have this volume in the first place?07:44
frenzyfridayyep, it says we are not supposed to mount /etc and some other dirs07:45
jm1frenzyfriday: i guess the only reason is that we want to pass the rh internal ca cert to the container?07:45
frenzyfridayWe need this to get the certs to connect to downstream07:45
frenzyfridayI was wondering, if mounting does not work, probably we have to copy the certs into the container itself in the dockerfile07:45
frenzyfridayBut how did it work for the c7 internal cockpit vm?07:46
frenzyfridayBut in this case ^ whenever the certs change , we will have to rebuild the container07:46
jm1frenzyfriday: c7 had docker which behaves differently from podman in c8/9 in some parts07:46
chandankumarfrenzyfriday: regarding selinux relabelling issue https://github.com/containers/podman/issues/2379#issuecomment-466770671 might help 07:48
jm1frenzyfriday: why not download the ca cert in infra-setup playbook (only for incockpit) to both /etc/pki and another place and then mount it into the container?07:48
frenzyfridayyeah, that might work! /me tries07:49
jm1chandankumar, frenzyfriday: from security side this is not a good idea https://bugzilla.redhat.com/show_bug.cgi?id=1594485#c407:49
jm1chandankumar, frenzyfriday: we are running containers as root hence we could face exactly the issue described in bz07:50
*** jpena|off is now known as jpena07:52
jm1frenzyfriday: we have to add the "ca download step" (to /etc/pki on incockpit) as a task to the ansible playbook/role anyway. so copying it to a safe location as well shouldnt be too hard07:53
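Editor's note: the "ca download step" jm1 describes, fetching the internal CA once during infra-setup to both /etc/pki and a mount-safe directory, could look roughly like the task below. The URL and the second destination path are placeholders, not real internal locations:

```yaml
# Hypothetical sketch of the "ca download step" for the infra-setup
# playbook: fetch the internal CA cert to the host trust store and to a
# separate directory that is safe to bind-mount into containers.
- name: Fetch internal CA cert to trust store and mountable dir
  ansible.builtin.get_url:
    url: "https://ca.example.internal/ca.crt"   # placeholder URL
    dest: "{{ item }}"
    mode: "0644"
  loop:
    - /etc/pki/ca-trust/source/anchors/internal-ca.crt
    - /opt/rrcockpit/certs/internal-ca.crt      # mount this one, not /etc/pki
```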
ysandeepfrenzyfriday, jm1 try removing :z from https://github.com/rdo-infra/ci-config/blame/63b70523433c31df47eb5cddef19a7a3e9c96a2d/ci-scripts/infra-setup/roles/rrcockpit/files/docker-compose.yml#L87 07:54
ysandeepi think I added that because it was needed for C707:54
jm1ysandeep: and it probably still is required because we access https pages which are signed with rh internal ca07:55
ysandeepkeep the mount just remove :z , I mean revert this patch: https://github.com/rdo-infra/ci-config/commit/ef60d16d9a1ff86ec2087e82f1d26a1b80af77c807:56
ysandeepnot 100% sure that will fix the issue but worth a try.07:57
jm1ysandeep, frenzyfriday: very certainly the rh ca is still required because ruck rover script needs it to access internal pages07:57
jm1ysandeep, frenzyfriday: good point about removing :z, worth a try07:59
jm1ysandeep,frenzyfriday: although it would be nice to use selinux in enforcing mode instead of permissive mode08:00
ysandeep+1, remove :z and turn selinux in enforcing mode 08:02
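Editor's note: the revert ysandeep proposes keeps the bind mount but drops the `:z` relabel flag. A sketch of the resulting compose entry; the service name and the added `:ro` read-only flag are assumptions, not taken from the actual docker-compose.yml:

```yaml
# Sketch of the proposed change: keep the /etc/pki bind mount but drop
# the ":z" SELinux relabel flag, which podman refuses to apply to /etc/pki.
services:
  telegraf:
    volumes:
      - /etc/pki:/etc/pki:ro   # was "- /etc/pki:/etc/pki:z"
```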
*** ysandeep is now known as ysandeep|away08:18
frenzyfridaycontainers are up \o/08:29
frenzyfridayysandeep|away, jm1, chandankumar thank you guys!08:29
* frenzyfriday now tweaking nginx 08:29
jm1frenzyfriday: awesome!08:29
chandankumaryw :-)08:32
chandankumarjm1: thank you for the bz link. :-)08:33
*** pojadhav is now known as pojadhav|ruck08:57
*** pojadhav is now known as pojadhav|ruck09:03
jm1pojadhav|ruck, rlandy|out: rr notes are up to date now. they are now also more or less in sync with cix cards, ordering of known bugs matches "Prodchain Blocked" lane in cix trello board.09:16
pojadhav|ruckjm1, I have started looking at upstream components 09:17
jm1pojadhav|ruck, rlandy|out: does not look too bad: c8 train compute is 6 days out, c9 wallaby security is 5 days out, the rest is less than 4 days out09:17
jm1* less or equal09:18
soniya29pojadhav|ruck, jm1, let me know when can we have sync-up?09:19
jm1soniya29, pojadhav|ruck: do we want to wait for rlandy|out? i guess she will be here soon09:20
pojadhav|ruckyeah.. we can wait 1 more hr09:21
pojadhav|ruckthen lets sync 09:21
soniya29pojadhav|ruck, jm1, okay09:21
*** pojadhav|ruck is now known as pojadhav|lunch09:22
jm1pojadhav: i will start rerunning jobs for upstream09:22
* pojadhav|lunch will be back in half hr09:22
pojadhav|lunchjm1, sure09:22
pojadhav|lunchjm1, based on cix card update I am rerunning validation master job here: https://review.rdoproject.org/r/c/testproject/+/4089709:23
jm1pojadhav|lunch: ack09:24
frenzyfridayincockpit: http://10.0.109.28/ finally09:41
frenzyfridayThe scripts need some change. I will create a patch09:42
jm1frenzyfriday: cool! where do you want to put internal tenant_vars?09:56
frenzyfridaythere is a base repo downstream, lemme check10:05
*** pojadhav- is now known as pojadhav|ruck10:23
jm1pojadhav|ruck, rlandy|out: went through all failing rdo jobs and annotated failing component jobs in rr notes10:29
* jm1 lunch10:30
*** rlandy|out is now known as rlandy10:30
rlandyjm1: thanks10:30
rlandylet's sync when dviroel|out is in10:30
rlandypojadhav|ruck: ^^10:30
rlandyis there a new rr hackmd?10:30
rlandyhttps://hackmd.io/94uNoMlnQgegrgy1iXV1kQ - ok great10:30
rlandyjm1: frenzyfriday: pojadhav|ruck: cockpit upstream is broken10:33
rlandyout of date10:33
rlandyhttp://dashboard-ci.tripleo.org/d/HkOLImOMk/upstream-and-rdo-promotions?orgId=110:33
rlandypromotions are more recent than that10:33
rlandypojadhav|ruck: hello - you around?10:34
rlandychandankumar: hi10:34
chandankumarrlandy: hello10:34
rlandyarrived safe?10:34
chandankumaryes10:34
chandankumarthank you :-)10:34
pojadhav|ruckrlandy, yes around10:35
rlandypojadhav|ruck: how are you and jm1 dividing the rr work?10:36
rlandyare you working downstream?10:36
pojadhav|ruckyes I started looking at d/stream, but I didn't find any major blocker, only rerunning jobs which are having single failures. now started looking at upstream components.10:37
rlandypojadhav|ruck: we will need to meet about the TC stuff when dasm gets in10:37
pojadhav|ruckrlandy, yes10:37
rlandypojadhav|ruck: pls put notes and last promote dates on the rr notes like jm1 has done10:37
rlandydownstream is in good shape10:37
pojadhav|ruckrlandy, yep I will add10:37
rlandyfrenzyfriday: hi10:40
rlandypls ping when around10:40
rlandylooks like upstream cockpit is not getting latest data10:40
frenzyfridayrlandy, hi10:41
frenzyfridayupstream also? :'(10:41
rlandyfrenzyfriday: hi - can you meet for a few?10:41
* frenzyfriday checks10:41
frenzyfridayyep sure10:41
rlandyfrenzyfriday: https://meet.google.com/tug-xmbn-rag?pli=1&authuser=010:41
rlandybhagyashris: hi - will need to meet with you next10:41
rlandypls ping when you are around10:42
rlandyjm1: pls nick10:47
rlandyjm1: pojadhav|ruck: promotions are in better shape than that - the cockpit is not getting latest data - spoke with frenzyfriday - pls check dlrn for your data10:47
pojadhav|ruckack10:48
*** ysandeep|away is now known as ysandeep10:53
rlandybhagyashris: thanks for https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/426246 - merging10:56
rlandywe will need a zuul restart to get the line to show10:56
rlandybhagyashris: can you add 17.1 on rhel8 to the promoter?11:00
chandankumarysandeep+++++++++++++++++++++11:04
ysandeepchandankumar++ thanks for suggestions on c9 dib image :D11:07
ysandeepreviewbot: please add in review list: https://review.opendev.org/c/openstack/tripleo-ci/+/85475111:07
reviewbotI could not add the review to Review List11:07
rlandyysandeep: chandankumar: ok - will merge11:07
bhagyashrisrlandy, hey i am around...11:14
rlandybhagyashris: hi ... review time now11:15
rlandypls join that - will chat with you afterwards11:15
bhagyashrissure11:15
bhagyashrisjoining...11:15
bhagyashrisfolks review time...11:17
bhagyashrisrlandy, https://review.opendev.org/c/openstack/tripleo-heat-templates/+/852446/11:21
*** ysandeep is now known as ysandeep|afk11:23
bhagyashrisrlandy, https://hackmd.io/kqJ-XcUeQ_24bjQ7pjTVJQ11:23
*** dviroel|out is now known as dviroel11:23
rlandyakahat: hey - https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-9-multinode-mixed-os&pipeline=openstack-periodic-integration-stable1-cs8&skip=011:27
rlandyfailing11:27
rlandycan you look into that11:27
rlandyysandeep|afk: dviroel: you should be unblocked now on resources11:30
rlandydviroel: good morning11:30
rlandyjm1: pojadhav|ruck: soniya: dviroel:pls ping when you are all around so we can rr sync11:30
ysandeep|afkrlandy++ wohoo awesome \o/ 11:31
ysandeep|afkthanks!11:31
dviroelrlandy: nice o/11:32
rlandydviroel: fyi - cockpit is not updating11:32
rlandyso last nights promos are not showing11:32
frenzyfridaythere is a problem with the podman networking in the incockpit. the containers are up but they cannot communicate with each other11:32
rlandywhich is why jm1 thinks things are worse than they are11:33
rlandydviroel: did we promote w c9 security component?11:33
dviroelyeah, i see a comment about that above11:33
dviroelyes, it is promoted11:33
rlandydviroel++11:33
rlandyso what's out is w c8 integ and w c911:33
rlandyok - we can deal with those when jm1 and pojadhav|ruck sync happens11:33
dviroelhttps://trunk.rdoproject.org/centos9-wallaby/component/security/11:33
dviroelrlandy: is this dns fix for vexx affecting everything11:34
dviroel?11:34
pojadhav|ruckrlandy, i am available for the sync, will wait for others11:35
ysandeep|afkfrenzyfriday, podman networking -- hmm are we starting containers as root?11:36
*** ysandeep|afk is now known as ysandeep11:36
frenzyfridayysandeep, yep11:36
ysandeepI think I know what the issue is and a possible workaround, fetching..11:37
frenzyfridayysandeep, even if I do docker exec it keeps throwing me out11:37
jm1rlandy: nick what?11:38
jm1|roverrlandy: i am here11:38
jm1rlandy: me too11:38
rlandyok- think we are all here11:39
ysandeepfrenzyfriday, read this: https://bugzilla.redhat.com/show_bug.cgi?id=209184011:39
rlandydviroel: jm1|rover: pojadhav|ruck: soniya: https://meet.google.com/ocx-ttaj-zbe?pli=1&authuser=011:39
ysandeepfrenzyfriday, run this as root - ip link | grep mtu 11:40
ysandeepand see if podman bridge have higher mtu than eth0?11:40
ysandeepfrenzyfriday, let me know if you want to discuss over a gmeet?11:42
frenzyfridayysandeep, yep, cni-podman1: 1500, eth0 145011:43
frenzyfridayysandeep, yep sure: https://meet.google.com/sug-xzxz-xvn11:44
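Editor's note: with the podman bridge at MTU 1500 and eth0 at 1450, packets leaving containers can exceed the uplink MTU and be dropped, which matches BZ 2091840 linked above. A common workaround is to pin the CNI bridge MTU to the uplink value in the network's conflist (usually under /etc/cni/net.d/); the file name, version, and surrounding keys below are illustrative:

```json
{
  "cniVersion": "0.4.0",
  "name": "podman1",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni-podman1",
      "mtu": 1450,
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "ranges": [[{ "subnet": "10.88.1.0/24" }]]
      }
    }
  ]
}
```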
akahatrlandy, looking..11:56
chandankumarI saw some chatter around downstream promoter , Does it sorted out?11:58
chandankumar*it got11:58
*** ysandeep is now known as ysandeep|afk12:05
frenzyfridaychandankumar, nope. Containers are up - but now they cannot communicate between each other12:06
rlandyjm1: we missed our 1-112:15
jm1rlandy: want to do it in 5mins?12:15
rlandyscheduling that for next week when you are off rr12:15
rlandycan't have other meetings today12:15
jm1rlandy: ok sure12:17
jm1dviroel, soniya: thanks for your rr notes, that really helped us getting started :)12:19
jm1frenzyfriday: need help with upstream cockpit?12:20
dviroeljm1: pojadhav|ruck: testing cloudops fix here https://review.rdoproject.org/zuul/status#4465212:21
pojadhav|ruckdviroel, ack12:21
jm1pojadhav|ruck: will sync our rr notes against cix board again and then update promotion dates in rr notes12:24
pojadhav|ruckjm1, sure12:24
pojadhav|ruckd/stream one already update based on dlrn data12:24
pojadhav|ruckupdated*12:24
pojadhav|ruckpromotion dates12:24
jm1pojadhav|ruck: ack, thanks!12:25
frenzyfridayjm1, hey, do you know if there is an ansible role for podman-compose like https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/infra-setup/roles/rrcockpit/tasks/start_services.yml#L16 ?12:27
frenzyfridayI can see containers.podman.podman_container but it does not accept compose files12:27
jm1frenzyfriday: i dont think so, this does not show anything https://github.com/containers/ansible-podman-collections/tree/master/plugins/modules12:28
jm1frenzyfriday: podman has kube files. but iirc one can use docker compose with podman. are you running in issues with that?12:29
jm1frenzyfriday: we can debug together if you want12:29
frenzyfridayyes, if we have podman installed and try to set up the containers with dockerfile + docker-compose, then the containers cannot communicate with each other. But if I use podman-compose up it works fine. Now our playbook uses the docker-compose module and passes the compose file to it. We need to do it through podman-compose 12:30
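Editor's note: since the containers.podman collection has no compose-file module, one stopgap for start_services.yml is to shell out to podman-compose from the playbook. A hypothetical sketch; the working directory is a placeholder:

```yaml
# Hypothetical stopgap: invoke podman-compose directly instead of the
# docker-compose Ansible module, which only talks to docker.
- name: Bring up rrcockpit services with podman-compose
  ansible.builtin.command:
    cmd: podman-compose -f docker-compose.yml up -d
    chdir: /opt/rrcockpit          # placeholder path
  register: compose_up
  changed_when: compose_up.rc == 0
```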
frenzyfridayjm1, you are RRing this week. I'll try to set up the incockpit manually for now (the compose part) and we can check after your RR12:31
jm1frenzyfriday: whatever you prefer ^^12:31
frenzyfridayI am also off next week12:31
jm1frenzyfriday: clean solution would be switching to kube files on podman12:32
jm1frenzyfriday: what about upstream cockpit?12:32
jm1frenzyfriday: do you need help there? is it working again?12:32
frenzyfridayjm1, no, did not get to the upstream yet12:33
jm1frenzyfriday: ack12:33
frenzyfridayjm1, ysandeep|afk for the upstream cockpit I see an error in the telegraf container logs: https://paste.opendev.org/show/bHgnf4CZDrzgkJart8bD/12:36
frenzyfridaymaybe thats why we arent getting new content?12:36
frenzyfridayError in plugin: metric parse error: expected tag at 1:127: "zuul-queue-status,url=https://softwarefactory-project.io/zuul/api/tenant/rdoproject.org/status,pipeline=openstack-check,queue=,job=tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-wallaby,review=852845,patch_set=1 result=\"None\",enqueue_time=1661511140489,enqueued_time=103.6614,result_code=-1"12:37
frenzyfriday2022-08-26T12:36:05Z D! [outputs.influxdb] Wrote batch of 1 metrics in 4.045052ms12:37
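Editor's note: the parse error points at the empty `queue=` tag; InfluxDB line protocol rejects tags with empty values, so telegraf drops the measurement. A hypothetical sanitizer sketch, naive in that it assumes no escaped spaces in tag values (which holds for the line above); the real fix would belong wherever ruck_rover.py builds the metric:

```python
# Drop tags with empty values from an InfluxDB line-protocol string.
# Format: measurement,tag1=v1,tag2=v2 field1=x,field2=y [timestamp]
# An empty tag value like "queue=" makes the parser fail with "expected tag".

def drop_empty_tags(line: str) -> str:
    head, _, rest = line.partition(" ")   # tags come before the first space
    measurement, *tags = head.split(",")
    kept = [t for t in tags if t.partition("=")[2] != ""]
    return ",".join([measurement, *kept]) + " " + rest

bad = ('zuul-queue-status,pipeline=openstack-check,queue=,'
      'review=852845 result="None"')
print(drop_empty_tags(bad))
# zuul-queue-status,pipeline=openstack-check,review=852845 result="None"
```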
ysandeep|afkdasm|off, ^^ 12:37
ysandeep|afkfrenzyfriday, you are seeing that error continously or it was only once?12:40
frenzyfridayysandeep|afk, repeating12:44
frenzyfridayysandeep|afk, https://paste.opendev.org/show/bfphcqOi78l7eOWqC2fK/12:44
* pojadhav|ruck brb12:44
ysandeep|afkfrenzyfriday, probably related to some recent changes, I will let dasm|off take a look otherwise I can take a look on my Monday morning.12:47
frenzyfridayysandeep|afk, cool, thanks!12:47
frenzyfridayrlandy, ^ related to upstream cockpit12:47
rlandyfrenzyfriday: probably last few merged changes on rr tool12:48
rlandysoniya also noticed some data being off lately on rr tool12:49
rlandydasm|off: ^^ when you are in, pls look at these errors12:49
rlandypojadhav|ruck: hi - can you start looking at the node provision failure on check?12:51
pojadhav|ruckrlandy, sure12:53
jm1frenzyfriday, ysandeep|afk, dasm|off: docker containers such as telegraf_py3 are not automatically updated when the ruck_rover.py script is changed. for example, on upstream cockpit the telegraf container is 3 months old, but the latest change to ruck_rover.py is 4 days old12:53
jm1rlandy: ^12:53
ysandeep|afksoniya, could you please elaborate what issues you are seeing on rr tool?12:55
jm1frenzyfriday, ysandeep|afk, dasm|off, rlandy: docker-compose up tries to be smart and recreates containers when docker-compose.yml has been altered, but docker-compose will NOT watch for updates of Dockerfile(s) or files added in those Dockerfile(s)12:56
ysandeep|afkjm1, :( we need a way to rebuild container for changes in files.12:57
rlandyyeah - we need to rethink this12:57
rlandyalso - how many changes are going in rr tool12:57
rlandyif they are all needed12:57
jm1ysandeep|afk, rlandy: yeah. how about starting to sync our ansible code in infra-setup with our existing infra? ;)12:57
soniyaysandeep|afk, rlandy, sorry i missed your pings12:58
soniyaysandeep|afk, the status shown by the rr tool was not quite correct, like for a pipeline the rr tool shows status=RED but the pipeline didn't have that many blockers as such12:59
ysandeep|afksoniya, by chance do you still have the output/if you are still seeing somewhere?13:00
soniyaysandeep|afk, nope, i noticed this while filling the program call doc13:02
ysandeep|afkack, if even one job is failing, whether it's in criteria or not, then red is correct13:02
jm1soniya: what do you mean with pipeline did not have so many blockers? are you check pipeline with cockpit?13:02
ysandeep|afkso that ruck/rover will check the failing job13:02
* ysandeep|afk goes in a mtg13:02
soniyajm1: we are discussing about rr tool13:03
rlandypojadhav|ruck: 1-1??13:05
jm1soniya: yeah and in that context i am wondering what you took as a comparison for rr tool to find out that rr tool is broken. i am asking that because maybe we have issues somewhere else as well13:05
soniyaysandeep|afk, jm1, so IMHO the status shown in the rr tool should also consider the latest promotion that happened and not just failing criteria jobs, because in our case we got a promotion 2 days back and the status shown was 'RED'. just to mention, we didn't skip any jobs as far as i remember13:09
soniya<ysandeep|afk> so that ruck/rover will check the failing job - for this we already have rr tool showing the testproject ready for them, that should be enough, right?13:10
frenzyfridayhey folks, the CRE team wanted to know if we have rocky linux running any of the tripleo ci jobs?13:11
frenzyfridayafuscoar, ^13:11
*** rcastillo|rover is now known as rcastillo13:15
rcastilloo/13:19
rcastillodon't know why my bouncer wants me to be rover :(13:19
soniya<ysandeep|afk> soniya, by chance do you still have the output/if you are still seeing somewhere? - may be new ruck/rovers can confirm this13:25
soniyaif time permits then13:25
bhagyashrisrlandy, add jobs in criteria https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/426268 and added 17.1 on rhel8 promotion https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/4466313:29
ysandeep|afkrlandy, mtg time13:31
rlandybhagyashris: thanks - will look after meeting13:31
jm1soniya: maybe rr tool is right and our cockpit is wrong? we having issues with cockpit atm13:42
soniyajm1, ack13:51
jm1frenzyfriday: time for a chat?13:52
frenzyfridayjm1, yep sure13:53
pojadhav|ruckrlandy, dasm|off still not in yet :(14:00
rlandypojadhav|ruck: let's wait for him14:02
rlandypojadhav|ruck: ping me when you are eod14:02
pojadhav|ruckrlandy, i was about to eod by 1:30 pm UTC, now its 2 pm UTC :D14:04
*** dasm|off is now known as dasm14:04
dasmo/14:04
pojadhav|ruckomg14:04
pojadhav|ruckdasm here :D14:04
pojadhav|ruckI am waiting for you only dasm 14:05
dasmpojadhav|ruck: where? i joined a call, but no one is here14:05
pojadhav|ruckdasm joining14:05
dasmack14:05
*** njohnston_ is now known as njohnston14:05
pojadhav|ruckrlandy, TC stuff ?14:06
dasm12:53 <jm1> frenzyfriday, ysandeep|afk, dasm|off: docker containers such as telegraf_py3 are not automatically updated when ruck_rover.py script is changed.14:08
dasmjm1: hmm. i was told it's autobuilt. i don't remember who said that.14:09
dasmthat's a bad thing.14:09
rlandyjoining14:10
dasm12:37:30    frenzyfriday | 2022-08-26T12:36:05Z D! [outputs.influxdb] Wrote batch of 1 metrics in 4.045052ms 14:19
dasmi'm gonna check that, but if we're not deploying new code, we might not have newer changes14:19
dasmysandeep|afk: ^14:19
jm1dasm: i am on upstream cockpit right now, trying to rebuild telegraf14:20
dasmjm1: ack14:21
ysandeep|afkdasm, true14:21
*** ysandeep|afk is now known as ysandeep|out14:22
frenzyfridayfolk, lemme know if http://tripleo-cockpit.lab4.eng.bos.redhat.com/?orgId=1 is working for you14:24
frenzyfridayI have changed the centos9 image to 8 for telegraf. thanks jm1 for the suggestion! The script will need a lot of work which will take time14:25
dviroelfrenzyfriday: works for me, i see 17.1 rhel 9 there, no data yet but, at least, there is board there now14:26
afuscoarfrenzyfriday: in my case the downstream dashboard I've created is there http://tripleo-cockpit.lab4.eng.bos.redhat.com/d/tbUsg0Z4k/downstream-data?orgId=1 14:28
afuscoarJust no information to populate it. I guess bc of the telegraf problem14:28
frenzyfridayafuscoar, hm.. lemme check what happened to the data14:29
*** jpena is now known as jpena|off14:29
afuscoarIdk if the script is located in the right path 14:31
chandankumarsee ya people14:36
chandankumarhappy weekend :-)14:37
rlandyfrenzyfriday: http://tripleo-cockpit.lab4.eng.bos.redhat.com/d/bSwsg0WVz/rhel9-rhos17-1-full-component-pipeline - yep14:37
rlandyno data yet on the downstream side14:37
afuscoarOh, it's also happening in other dashboards14:37
afuscoarhappy weekend chandankumar14:37
rlandyfrenzyfriday++++++++++++++14:38
rlandyjm1++++++++++++++++++14:38
rlandythank you both14:38
rlandydasm: when you are done with UA stuff, can you look into the upstream cockpit error with frenzyfriday?14:39
rlandyhasn't collected new data14:39
dasmrlandy: it takes few minutes to gather data, but i'll check that one14:39
dviroeljm1: I added you to the cloudops space, you might receive tons of notifications, but you can disable them14:40
dviroeljm1: we are discussing cloudops issue there since yesterday14:40
jm1dviroel: ack ok thanks!14:41
dasmfrenzyfriday: i logged into lab4 instance and manually ran ruck_rover.py --influx. I see results. They should be landing in the cockpit soon14:43
jm1dasm: frenzyfriday has fixed downstream (manually)14:43
frenzyfridaydasm, ack, yeah the logs say it is still pulling data14:43
* frenzyfriday lunch14:43
jm1rlandy, pojadhav|ruck: upstream cockpit is currently refreshing but might still be broken. you can use downstream cockpit for the moment, frenzyfriday has fixed it14:44
rlandyjm1: frenzyfriday: thank you14:44
rlandyjm1: putting in patch to chase wallaby c914:45
jm1rlandy: thank you! put my latest update to rr notes. 14:45
jm1rlandy, dasm, ysandeep|out: upstream cockpit has several issues which we fixed manually for now, e.g. mtu was wrong which prevented container rebuilds for months. (this actually should have caused issues since the beginning of that vm on vexxhost). anyway, we will fix infra code next week. i am eod for today14:48
rlandyjm1: ok - have wallaby c8 and c9 in rerun14:48
dasmjm1: o/ have a good weekend mate14:48
rlandyjm1: asked dasm to add a new epic14:48
rlandyto track needed cockpit changes14:48
dasmit's gonna be EPIC!14:48
rlandyto take up next sprint14:48
rlandyor this one  - if anyone has time14:49
dasmEPIC sprint for EPIC epic with EPIC task :)14:49
rlandyjm1: dasm: pls log these tasks there14:49
dasm(sorry, i'm in a weekend mode atm)14:49
rlandyso we capture what needs to be done14:49
rlandyjm1: have a good weekend14:49
jm1rlandy, dasm: ack. dasm please add me to 'Watchers' on jira cards (in case you are creating new ones for infra)14:51
dasmjm1: ack. cc pojadhav|ruck 14:51
* jm1 have a nice weekend #oooq14:53
pojadhav|ruckrlandy, leaving for the day 14:54
rlandyfrenzyfriday: huge thank you for dealing with all this14:54
rlandyfrenzyfriday+++++++++++++++++++++++++++14:54
pojadhav|ruckdid most of the JIRA stuff with dasm 14:54
rlandypojadhav|ruck: dasm: thank you both!!!14:55
rlandyTeam is rocking out!!!!14:55
* pojadhav|ruck out14:55
*** pojadhav|ruck is now known as pojadhav|out14:55
pojadhav|outsee you all on monday !!14:55
pojadhav|outbbye14:55
rlandyhave a great weekend14:56
*** dviroel is now known as dviroel|lunch15:00
Tenguchandankumar: heya! I think there's an issue with the way molecule are launched in tripleo-ansible - it seems to limit to the "default" scenario... how can I make sure it's running *all* ?15:02
rlandyTengu: you missed chandankumar - is it urgent?15:13
Tengurlandy: not really - just making tripleo-ansible molecule tests less reliable.15:13
TenguI'll check next week.15:13
Tengubut basically, this explains why molecule tests are all green for my tripleo_httpd_vhost role, while it's NOT working :).15:14
Tengu"woops".15:14
rlandyyep - too good to be true type thing15:15
Tenguexactly.15:15
TenguI was testing my role "in real situation" and it crashes with a template thingy that should have crashed zuul.15:16
Tenguit didn't.15:16
Tenguso I hunted down the molecule report and, wow, only one test was launched :). the "default" scenario. «woopsss»15:16
rlandy:)15:16
rlandyok - so tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001 works15:17
rlandybut c9 doesn't provision nodes .... fhmmm15:17
rlandylast success 08/2415:18
rlandycharming15:18
Tengurlandy: I may have a working change for the tripleo-ansible molecule thingy.15:43
Tengutesting on my env.15:43
Tenguit's "meh", but it allows to run all the scenarios.15:44
*** frenzyfriday is now known as frenzyfriday|pto15:48
*** dviroel|lunch is now known as dviroel16:10
rlandydasm: hi 16:22
rlandywhen was the last working centos image you could use for bmc?16:22
rlandydate of that image?16:22
dasmrlandy: 20220606.0 next one which fails is 20220621.116:23
rlandyok it's not that16:24
dasm> CentOS-Stream-GenericCloud-9-20220606.0. Following releases, none of them starts.16:24
rlandyyeah - I think we need to promote master16:27
rlandycurrent-tripleo hash is from 08/2216:28
rlandywe need 08/25 or later16:28
dviroelrlandy: on 08/25 we already have cloudops bug?16:41
dviroelyes right, it was yesterday16:41
dviroel:P16:41
*** rcastillo|rover is now known as rcastillo16:55
dviroelrcastillo: do you miss rover times?16:55
dviroel:)16:55
rcastilloevery day I miss it16:57
rlandyrcastillo: we can always put you back on :)17:16
rcastilloit'd be unfair not to let others in on the fun :)17:16
rlandyyou're so considerate :)17:16
dasmrcastillo: i believe others don't mind to allow you to have all fun ;)17:20
dasm*all the fun17:20
rcastilloyou're just saying that to be nice17:23
dviroelrlandy: that tempest test failure on wallaby c8 seems to be a real issue?18:12
dviroelnot first time?18:12
rlandydviroel: got one test failing on latest run18:12
rlandynot repeated18:12
rlandyrunning again18:12
rlandyto compare18:12
rlandydifferent hash18:13
dviroeli think I saw that failing yesterday, not sure which hash18:13
rlandydviroel: wrt node prov failure - I really think we need to promote a more recent version of master18:13
rlandydviroel: that was a different hash - and a different test18:13
dviroelok18:13
rlandyfollowing both hashes18:13
rlandydviroel: for node prov - you promoted hash from 08/22 I think18:14
dviroelrlandy: yeah, but we are blocked due to scenario00118:14
rlandywe need 08/2518:14
rlandycorrect18:14
rlandyso once that is cleared18:14
dviroelcloudops is out already18:14
dviroeltheir fix didn't work18:15
dviroelthey use Spaces to chat, I can add you if you want18:15
dviroelif you want notifications on your gmail18:15
dviroel:)18:16
rlandynow looking at wallaby c918:36
rlandyugh  - I give up - skipping fs001 to promote wallaby c819:58
rlandydviroel: ^^ https://logserver.rdoproject.org/54/36254/156/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby/e4928dd/logs/undercloud/var/log/tempest/stestr_results.html.gz  - best result I can get19:59
rlandyskipping and promoting this hash19:59
rlandyugh - tempest you win, I give up19:59
dviroelrlandy: you have my vote on this20:03
rlandydviroel: will need  - patch in progress -one sec20:04
rlandydviroel: ok - https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44664 - here20:06
rlandydviroel: thank you20:08
rlandyopenstack-periodic-integration-main - quite good20:10
rlandywaiting on sc0120:11
rlandydviroel: pls check vexx flavors are there20:11
rlandyticket says done20:11
dviroelthey missed one, but it is ok for now20:17
dviroelwill unblock our work20:17
dviroelthe missing flavor may or may not be needed20:17
dviroelthe one with extra memory20:17
dviroelrlandy: added a new section on podified doc, about cluster provision20:18
rlandychecking which one was missed20:18
* dviroel brb - walk20:26
rlandydviroel: found which one is missing 20:28
rlandyresponded on ticket20:28
rlandydviroel: dasm: rcastillo: one of you guys, pls check me on this before I merge: https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/4466320:31
rlandyfrom bhagyashris 20:31
rcastillolooking20:31
rlandyty20:32
rcastillolooks good20:35
rcastillodo depends-on work on rdo -> downstream gerrit?20:35
dasmrlandy: lgtm20:39
dasmbut, there is one minor thing. i'm double checking20:40
dasmok, it's good. 20:42
rlandythanks20:52
* dviroel back21:00
* dasm => offline21:06
dasmsee you Monday!21:06
*** dasm is now known as dasm|off21:07
rlandybye dasm21:11
rlandydviroel++ thanks for onboarding help21:25
rlandystill trying to promo c821:42
rlandypatch keeps failing on a diff test21:42
dviroeldocker rate limiting21:43
rlandymumble21:43
rlandyI pay $7 a month for us to have a paid account21:43
dviroelhaha21:43
dviroelwe just need to add container login on these jobs then21:44
dviroelwill solve that issue21:44
dviroelI still need to move rdo to container-login role21:44
dviroeli have a note about that, but it is good to create tasks21:45
rlandyotherwise will try on sunday21:45
rlandywhen less people are active21:45
dviroelare we missing wallaby c8 and c9?21:46
dviroelhow is wallaby c9?21:46
rlandydon't ask21:51
rlandykvm job now passed 21:51
rlandyas did tempest21:51
rlandychecking ovb21:51
rlandynew hash just started21:52
rlandyold one was out 64, 39, 121:52
rlandy3521:52
rlandyhttps://review.rdoproject.org/r/c/testproject/+/4388321:52
rlandywas my rerun21:52
rlandy2022-08-26 15:31:24.107091 | primary | TASK [print content of 'resolv.conf' after modifications] **********************21:53
rlandy2022-08-26 15:31:24.107327 | primary | Friday 26 August 2022  15:31:24 -0400 (0:00:01.672)       0:15:50.294 *********21:53
rlandy2022-08-26 15:31:24.133797 | primary | ok: [undercloud] => {21:53
rlandy2022-08-26 15:31:24.133846 | primary |     "msg": "Content of resolv.conf: # Generated by NetworkManager\nsearch openstacklocal novalocal\nsearch ooo.test\nnameserver 10.0.0.250\n# NOTE: the libc resolver may not support more than 3 nameservers.\n# The nameservers listed below may not be recognized."21:53
rlandyha21:53
rlandythat may be the change made yesterday21:54
rlandydiff overcloud deploy errors in multiple runs21:57
rlandyon 06421:57
rlandywe're getting nowhere now22:16
dviroelhum22:17
dviroelyeah, 064 and 039 where like that in master too22:18
dviroeldid you tried another cloud? ibm,  internal?22:18
* dviroel it is happening, I see 3 control plane and 3 worker vms22:19
dviroel7 instances if you add bootstrap22:19
dviroelrlandy: i think that these jobs are testing you 22:23
dviroelrlandy: failing again22:23
* dviroel brb22:31
*** dviroel is now known as dviroel|afk22:31
*** dviroel|afk is now known as dviroel22:58
* dviroel almost there with ocp cluster23:00
dviroelhave a great weekend team?23:00
dviroels/?/!23:01
rlandybye all23:01
dviroelo/23:01
rlandyhave a great weekend23:01
*** dviroel is now known as dviroel|out23:01

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!