Thursday, 2019-09-26

00:44 *** weshay has quit IRC
01:03 *** dsneddon has quit IRC
01:12 *** weshay has joined #oooq
02:03 *** weshay has quit IRC
02:06 *** rfolco has quit IRC
02:10 *** apetrich has quit IRC
02:12 *** weshay has joined #oooq
02:18 *** aakarsh has joined #oooq
02:19 *** brault has quit IRC
02:38 *** weshay has quit IRC
02:39 *** ykarel has joined #oooq
02:42 *** aakarsh has quit IRC
02:52 *** aakarsh has joined #oooq
03:04 *** aakarsh has quit IRC
03:44 *** Goneri has joined #oooq
04:12 *** udesale has joined #oooq
04:21 *** rlandy|bbl is now known as rlandy
04:34 *** ratailor has joined #oooq
04:42 *** ykarel has quit IRC
05:04 *** ykarel has joined #oooq
05:11 *** marios has joined #oooq
05:13 *** soniya29 has joined #oooq
05:24 *** Goneri has quit IRC
05:27 *** Goneri has joined #oooq
05:37 *** akahat has joined #oooq
05:52 *** jfrancoa has joined #oooq
05:56 *** jfrancoa has quit IRC
06:03 <ykarel> dtantsur|afk, arxcruz|rover 1 ovb job passed with ironicclient -3.1.0 https://review.rdoproject.org/r/#/c/22551/
06:03 <ykarel> and i am taking it in train with https://review.rdoproject.org/r/#/c/22588/ before it's updated in u-c
06:06 *** saneax has joined #oooq
06:13 *** jfrancoa has joined #oooq
06:17 *** Goneri has quit IRC
06:29 *** skramaja has joined #oooq
06:30 *** jaosorior has quit IRC
06:31 *** holser has joined #oooq
06:40 *** akahat has quit IRC
06:41 *** tosky has joined #oooq
07:01 *** udesale has quit IRC
07:02 *** udesale has joined #oooq
07:05 *** jaosorior has joined #oooq
07:05 *** tesseract has joined #oooq
07:21 <soniya29> marios, Hello
07:21 *** bogdando has joined #oooq
07:25 <marios> o/ soniya29
07:26 *** pierrepr1netti is now known as pierreprinetti
07:26 *** jpena|off is now known as jpena
07:27 *** ccamacho has joined #oooq
07:28 *** akahat has joined #oooq
07:28 <soniya29> marios, I am trying to install cockpit locally; when I run './development_script.sh -s' it consumes a lot of time. Is that expected behaviour or unusual?
07:28 *** brault has joined #oooq
07:30 <marios> soniya29: o/ hi, i think the first time it pulls containers etc so it takes a while if i recall
07:30 <marios> soniya29: i updated the readme with some info, did you check there?
07:30 * marios finds
07:31 <marios> https://github.com/rdo-infra/ci-config/tree/master/ci-scripts/infra-setup/roles/rrcockpit soniya29
07:31 <marios> "Note it may take a while to load especially on first run. You will see that the containers are being pulled before starting them like:"
07:31 <marios> soniya29: like is it hanging, or is stuff happening and it just takes a long time?
07:33 *** brault has quit IRC
07:33 <soniya29> marios, It is continuously pulling containers, https://paste.fedoraproject.org/paste/~QAmFD4o3mm6EebowKA2Mg
07:35 <marios> soniya29: try http://172.18.0.6:3000 in your browser
07:36 <soniya29> marios, It opened the dashboard in the browser
07:37 <marios> soniya29: so it's running :)
07:37 <marios> soniya29: read that, it explains a bit: https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/infra-setup/roles/rrcockpit/README.md
07:38 <soniya29> marios, thanks :), I thought something went wrong with my installation.
07:38 <marios> soniya29: np :D i went through the exact same pain last time i tried to work there, which is why i updated the readme
07:39 <marios> soniya29: maybe you can contribute by improving it further once you're done setting up, if you think of something missing or find some new issue not documented etc
07:43 <soniya29> marios, at the second step, the path to development_script.sh isn't mentioned. I was wondering where exactly the script is located.
07:44 *** udesale has quit IRC
07:44 <marios> soniya29: did you clone ci-config?
07:44 *** ykarel is now known as ykarel|lunch
07:44 <marios> soniya29: it's there: https://github.com/rdo-infra/ci-config/tree/master/ci-scripts/infra-setup/roles/rrcockpit/files
07:44 *** udesale has joined #oooq
07:48 <soniya29> marios, yeah I cloned it. But the development script is in ci-scripts/infra-setup/roles/rrcockpit/files, and a new folk trying to install may be at some other location, let's say ci-scripts/infra-setup. They have to 'cd' to ci-scripts/infra-setup/roles/rrcockpit/files and then run './development_script.sh'
07:48 *** soniya29 is now known as soniya29|lunch
07:50 *** ksambor has quit IRC
07:52 *** tesseract has quit IRC
07:52 *** ksambor has joined #oooq
07:53 *** tesseract has joined #oooq
08:02 *** sshnaidm|afk is now known as sshnaidm
08:12 *** amoralej|off is now known as amoralej
08:13 *** soniya29|lunch has quit IRC
08:27 *** apetrich has joined #oooq
08:37 *** soniya29 has joined #oooq
08:37 *** derekh has joined #oooq
08:41 *** ykarel|lunch is now known as ykarel
08:45 <marios> panda: around? molecule test -s container-push fails locally on some weird error about the python interpreter http://paste.openstack.org/raw/779420/ seen that one?
08:45 <marios> trying rdo box in a sec
08:49 *** chem has joined #oooq
08:51 *** soniya29 has quit IRC
08:53 <arxcruz|rover> ykarel: cool, thanks for testing
08:53 <arxcruz|rover> u da man
08:54 *** soniya29 has joined #oooq
08:54 <marios> panda: ah i see you're hardcoding it
09:02 <panda> marios: not anymore
09:02 <panda> marios: I added a variable there to test it locally
09:02 *** brault has joined #oooq
09:02 <panda> marios: ansible_python_interpreter: ${MOLECULE_INTERPRETER:-/home/zuul/test-python/bin/python}
09:03 <panda> marios: so it can be tested both locally or in zuul
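[The `${MOLECULE_INTERPRETER:-...}` default panda quotes above is plain shell parameter expansion; a minimal sketch of how it behaves (the `/tmp/venv` path is made up for illustration):]

```shell
# If MOLECULE_INTERPRETER is unset or empty, the zuul venv interpreter is used.
unset MOLECULE_INTERPRETER
echo "${MOLECULE_INTERPRETER:-/home/zuul/test-python/bin/python}"
# -> /home/zuul/test-python/bin/python

# Exporting it before running molecule points ansible at a local venv instead.
export MOLECULE_INTERPRETER=/tmp/venv/bin/python
echo "${MOLECULE_INTERPRETER:-/home/zuul/test-python/bin/python}"
# -> /tmp/venv/bin/python
```

[So for local runs the fix amounts to exporting MOLECULE_INTERPRETER to the venv python before invoking molecule, as panda notes again further down.]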
09:05 <panda> marios: new PS coming soon, with some other tests. The integration test is now at the point where container-push is invoked, so it's time to merge something ...
09:10 *** dtantsur|afk is now known as dtantsur
09:10 <marios> panda: ack... i just removed it for my local test (I want to iterate on manifest include today)
09:11 <marios> panda: but i have an issue with generating the container list (see comment https://review.rdoproject.org/r/#/c/22445/17/ci-scripts/container-push/roles/containers-promote/tasks/main.yaml@53)
09:11 <marios> panda: i will just iterate in the zuul job if i have to, but since it's molecule the whole point is meant to be local
09:12 <marios> panda: also needed pip install selinux
09:13 *** udesale has quit IRC
09:17 *** udesale has joined #oooq
09:17 <zbr|ruck> marios: pip install selinux helps only inside virtualenvs, when you already have libselinux.
09:17 *** jaosorior has quit IRC
09:18 <marios> zbr|ruck: yes it was in a venv
09:18 <zbr|ruck> a warning regarding using molecule/ansible under centos: you must use only the official python version of the platform, the other one(s) *will not work*.
09:19 <marios> zbr|ruck: so just like i have to manually install molecule/docker etc i have to do that too
09:20 <zbr|ruck> i do have playbooks for most stuff
09:21 <zbr|ruck> i am still worried that we really don't have real docker on centos8
09:22 <zbr|ruck> which is a problem for using molecule. so many reasons for not using centos 7 or 8 as a development platform.
09:24 <zbr|ruck> i have much better testing coverage with fedora/macos/ubuntu/debian where I have docker, all python versions without selinux issues. hopefully we will address these at some point.
09:28 <ykarel> zbr|ruck, arxcruz|rover debugging the queens issue with https://review.rdoproject.org/r/#/c/22551/
09:28 <arxcruz|rover> ykarel: ack
09:29 <ykarel> arxcruz|rover, it's not getting reproduced, so need to get a node held and debug
09:29 <ykarel> i will let you know once it fails and the node gets held
09:29 <arxcruz|rover> ykarel: does remove_ovb_after_job hold the node?
09:30 <arxcruz|rover> oh, it doesn't remove the node from nodepool, got it. last time i had this problem, only the main node was held, the others were deleted
09:30 <arxcruz|rover> good to know :)
09:30 <ykarel> arxcruz|rover, nope, ^^ param will hold overcloud nodes; i requested jpena on #rdo to hold the nodepool node
09:30 <ykarel> yes /me trying that remove param first time :)
09:58 <panda> marios: new PS uploaded, it passes all my local tests.
09:59 *** tesseract has quit IRC
10:00 *** tesseract has joined #oooq
10:02 <marios> panda: k will try it in a bit, trying to change the curl to use manifest inspect first
10:48 <marios> panda: it still failed for me on the same things locally
10:48 <marios> panda: added comment
10:53 <panda> marios: you're probably missing export MOLECULE_INTERPRETER=
10:53 <panda> marios: pointing to the python in your local venv
10:54 <panda> maybe this should be more explicit, not sure where to put this warning though
10:54 <marios> panda: sure, but is that expected, i.e. 'just something you have to do for local testing'? maybe we add a readme about that
10:54 <marios> panda: that one is not a big problem but the containers list one is
10:55 <marios> panda: but clearly it works in the zuul job so... not sure why the difference
10:55 *** brault has quit IRC
10:55 <panda> marios: checking it now. Can you paste the whole log? Or the log with some context around?
10:56 <marios> panda: http://paste.openstack.org/raw/779429/
11:02 <chandankumar> soniya29: https://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-standalone-full-tempest-master/08421f4/logs/stestr_results.html
11:03 <panda> marios: looks like in your env the "mock registries" task fails in some way to push the base images. I'd rerun with molecule converge -s container-push -- -vv to increase verbosity and see what is not run in that task
11:04 *** amoralej is now known as amoralej|lunch
11:05 *** jaosorior has joined #oooq
11:06 <chandankumar> arxcruz|rover: hello
11:06 <chandankumar> arxcruz|rover: regarding parsing failed tests from fs021, I got a nice idea
11:06 <chandankumar> since we have a .html file in each job
11:06 <chandankumar> just convert it to json and get the result
11:06 <chandankumar> what do you say?
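[chandankumar's idea above can be sketched in a few lines of shell. The table markup below is an assumption about what a stestr_results.html row roughly looks like, not the real report format:]

```shell
# Fake a fragment of a stestr_results.html-style report (assumed markup).
cat > /tmp/stestr_results.html <<'EOF'
<tr class="failClass"><td>tempest.api.compute.servers.test_list.TestList.test_a</td><td>FAIL</td></tr>
<tr class="passClass"><td>tempest.api.network.test_networks.TestNet.test_b</td><td>OK</td></tr>
EOF

# Keep only failing rows, strip the markup, emit one JSON object per test.
grep 'failClass' /tmp/stestr_results.html \
  | sed -e 's|</td>.*||' -e 's|.*<td>||' \
  | while read -r test_id; do
      printf '{"test": "%s", "status": "FAIL"}\n' "$test_id"
    done
# -> {"test": "tempest.api.compute.servers.test_list.TestList.test_a", "status": "FAIL"}
```

[A real parser would want a proper HTML library rather than grep/sed, which is roughly the "mess" arxcruz|rover warns about below.]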
11:09 *** jpena is now known as jpena|lunch
11:14 *** udesale has quit IRC
11:20 *** derekh has quit IRC
11:27 <arxcruz|rover> chandankumar: hmmmmm will you be able to parse it? did you see the mess that html is? :D
11:28 <arxcruz|rover> I mean, of course, all I need is the info in the cockpit, how you're going to do it is another story :P
11:28 <arxcruz|rover> if parsing the html is better... fine
11:29 <arxcruz|rover> using subunit2csv works pretty fine btw
11:29 <arxcruz|rover> ykarel: just asking, because you da man, but do you need help on the queens issue? I saw the job finished
11:30 <ykarel> arxcruz|rover, yes it reproduced,
11:30 <ykarel> and i am not able to ssh to it
11:32 <ykarel> arxcruz|rover, i am thinking of enabling a password in the image, uploading and redeploying, and then accessing with the password
11:32 <ykarel> any other idea if u can recollect to shorten the process?
11:33 <ykarel> if you would like to access it, i can add your keys
11:38 <chandankumar> arxcruz|rover: subunit2csv is broken
11:39 <chandankumar> I need to spend some time on fixing it
11:43 <marios> panda: can you please update https://review.rdoproject.org/r/#/c/22445/19/ci-scripts/infra-setup/roles/promoter/molecule/container-push/playbook.yml
11:44 <marios> panda: we need all the containers you push in the test registry to also be tagged with {{ full_hash }}_x86_64 like the rdo registry
11:44 <panda> marios: ack
11:44 <marios> panda: 2019-09-26 11:32:58.692075 | rdo-centos-7 |       msg: 'Error pulling image localhost:5000/tripleomaster/centos-binary-base:82a1e08b25e42f1e8ea99d7c9b6957bb06423a34_dae2ef8_x86_64 - 404 Client Error: Not Found ("manifest for localhost:5000/tripleomaster/centos-binary-base:82a1e08b25e42f1e8ea99d7c9b6957bb06423a34_dae2ef8_x86_64 not found: manifest unknown: manifest unknown")'
11:44 <panda> marios: do you want ppc too?
11:44 <marios> http://logs.rdoproject.org/02/22002/59/check/molecule-delegated/2d21292/job-output.txt.gz
11:44 <marios> panda: thanks
11:45 <marios> panda: yes, ideally not all of them though, so we can test the logic in manifest push
11:45 <marios> panda: like leave 1 or 2 out for ppc
11:45 <marios> panda: but for x86 we expect all
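[For reference, the tag scheme marios is asking for, built from the hash in the 404 above. This is a naming sketch only; the registry host, namespace, and container name are taken from that error message:]

```shell
# full_hash is the promoted <commit>_<distro> hash pair from the error above.
full_hash=82a1e08b25e42f1e8ea99d7c9b6957bb06423a34_dae2ef8
arch=x86_64
container=centos-binary-base
echo "localhost:5000/tripleomaster/${container}:${full_hash}_${arch}"
# -> localhost:5000/tripleomaster/centos-binary-base:82a1e08b25e42f1e8ea99d7c9b6957bb06423a34_dae2ef8_x86_64
```

[The 404 happens because the test registry only has the bare `full_hash` tag, so pulls of the `_x86_64`-suffixed name find no manifest.]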
11:47 <arxcruz|rover> ykarel: where are you not able to ssh?
11:48 <ykarel> arxcruz|rover, overcloud nodes, same issue as in the bug
11:51 <arxcruz|rover> ykarel: you mean the ssh issue, not the get_endpoint issue, right?
11:52 <ykarel> arxcruz|rover, yes right, the queens ssh issue; get_endpoint is in master
11:52 <arxcruz|rover> sorry, i misunderstood :)
11:52 <marios> panda: just saw there are only 2 there, so make one with ppc, one without
11:52 <marios> panda: btw if you're busy with something else let me know, i'll update it
11:54 *** ratailor has quit IRC
11:57 *** weshay has joined #oooq
11:58 <panda> marios: updating it now
11:59 <ykarel> arxcruz|rover, np, i am trying an alternative to virt-customize to inject the root password: https://github.com/openstack/tripleo-heat-templates/commit/724ba3a32f20349ed20093758a48ca1297a0534e#diff-768a7913933c61a5853abd6ed2dd1400
11:59 <weshay> arxcruz|rover, any luck w/ a reproducer on fs001 queens?
11:59 <marios> panda: thx
11:59 <marios> panda: would *love* to get a run on the manifest push before scrum
12:01 *** brault has joined #oooq
12:03 *** rfolco has joined #oooq
12:04 *** chandankumar has quit IRC
12:12 <panda> marios: sorry, adding the other tags broke the cleanup and I had to run some tests and change a few things
12:13 <panda> marios: new PS uploaded
12:13 <weshay> arxcruz|rover, ping me when you have a sec.. I have an update on queens
12:13 <marios> panda: ack, i got it running on the rdo vm via molecule test, so not sure what is different locally... i added the tags and got an error
12:13 <marios> panda: thanks, checking now
12:13 <arxcruz|rover> weshay: ping
12:14 *** chandankumar has joined #oooq
12:14 <zbr|ruck> weshay: i think that I managed to get the ball rolling regarding missing libselinux, i see progress in my inbox.
12:14 <weshay> arxcruz|rover, hey morning.. can you please check a few other branches of tripleo and ovb jobs.. see if we're getting ips
12:14 <weshay> arxcruz|rover, http://logs.rdoproject.org/11/684411/5/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/54e01f5/logs/overcloud-controller-0/var/log/extra/network.txt.gz and https://bugs.launchpad.net/tripleo/+bug/1845166
12:14 <openstack> Launchpad bug 1845166 in tripleo "[queens] [Periodic][check] OVB jobs failing in OC deploy when trying to ssh to nodes" [Critical,Triaged]
12:14 <weshay> arxcruz|rover, /me gets coffee
12:16 <marios> panda: thanks for the update again, but i only see tag, where are you pushing now?
12:16 <marios> panda: https://review.rdoproject.org/r/#/c/22445/21/ci-scripts/infra-setup/roles/promoter/molecule/container-push/playbook.yml
12:17 <panda> marios: line 77, instead of pushing one by one, I push everything that I tagged
12:17 <marios> panda: k thx, trying
12:18 <panda> marios: ah you're right
12:18 <weshay> zbr|ruck, help me understand python2-libselinux and centos-8
12:18 <panda> marios: updating again
12:19 <marios> panda: k
12:19 <weshay> zbr|ruck, or are you working on centos-7
12:19 *** derekh has joined #oooq
12:19 <zbr|ruck> weshay: yeah, in fact it is a double issue: c7 missing one of them and c8 missing the other one.
12:20 <zbr|ruck> not sure which one will be fixed first. we will see.
12:20 <panda> marios: done, feel free to take over so I'm not slowing you down
12:20 <zbr|ruck> shortly: keep using fedora
12:20 <weshay> zbr|ruck, arxcruz|rover you guys up for a 10min chat?
12:20 *** jpena|lunch is now known as jpena
12:20 <arxcruz|rover> weshay: sure boss
12:21 <zbr|ruck> link?
12:21 <weshay> https://meet.google.com/vts-bkoe-itj
12:21 <marios> panda: thanks, checking
12:22 <sshnaidm> bogdando, please don't: https://review.opendev.org/#/c/683826/
12:22 <sshnaidm> bogdando, it doesn't work
12:22 <bogdando> undone
12:23 *** amoralej|lunch is now known as amoralej
12:24 <weshay> zbr|ruck, libselinux-python-2.5-14.1.el7.x86_64
12:24 *** derekh has quit IRC
12:25 *** derekh has joined #oooq
12:29 *** beagles is now known as beagles-mtg
12:35 <weshay> zbr|ruck, arxcruz|rover http://logs.rdoproject.org/32/618832/7/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch/6c2fe25/logs/undercloud/
12:35 <weshay> last known good queens
12:35 <ykarel> arxcruz|rover, hi
12:35 <arxcruz|rover> ykarel: hallo
12:36 <weshay> http://logs.rdoproject.org/32/618832/7/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch/6c2fe25/logs/undercloud/var/log/extra/rpm-list.txt.gz
12:36 <ykarel> arxcruz|rover, can you get me console access for an overcloud node?
12:36 <weshay> openssh-7.4p1-21.el7.x86_64
12:36 <arxcruz|rover> ykarel: sorry?
12:36 <weshay> ykarel, do you have a queens fs001 recreate?
12:36 <ykarel> weshay, yes
12:36 <ykarel> arxcruz|rover, openstack console url show b9401f73-0553-4eaf-8efe-50b128f31227
12:37 <ykarel> ^^ server id of one of the overcloud nodes
12:37 *** rlandy has joined #oooq
12:37 <jfrancoa> sshnaidm: hey, can you give me a hand with the reproducer? I'm trying to run it but can't get rid of this error: http://pastebin.test.redhat.com/800955 (in this execution I tried with -e cloud_networks=['private_nw'] because with the default ['private'] it also failed)
12:37 <ykarel> weshay, ssh not accessible, so i deployed with a password
12:38 <arxcruz|rover> ykarel: i'm not following you, sorry...
12:38 <ykarel> arxcruz|rover, can u run the above command with openstack-nodepool tenant credentials?
12:38 <ykarel> that will give the console url
12:38 <arxcruz|rover> ykarel: just a sec
12:39 <rfolco> panda, how do we include a teardown playbook to run in the molecule destroy step?
12:40 <ykarel> arxcruz|rover, ack
12:40 <panda> rfolco: add a destroy.yml in your molecule/scenario/ dir
12:40 <arxcruz|rover> ykarel: just a sec, i don't have the credentials easy here with me and i'm in a meeting with weshay now
12:41 <ykarel> arxcruz|rover, ack, maybe panda can help with this: openstack console url show b9401f73-0553-4eaf-8efe-50b128f31227
12:41 <rfolco> panda, there is no converge.yml, how does molecule know playbook.yml is the converge then?
12:41 <ykarel> panda, can u please help to get it for us ^^
12:42 <ykarel> from the openstack-nodepool tenant
12:42 <panda> rfolco: magic
12:43 <panda> ykarel: you want the result here or in private?
12:44 <ykarel> panda, pm will work
12:46 <weshay> ykarel, can we get on your queens job?
12:46 <ykarel> weshay, yes
12:46 <ykarel> share your pub key
12:46 <arxcruz|rover> github.com/arxcruz.keys
12:46 <weshay> ykarel, https://github.com/weshayutin.keys
12:47 <weshay> zbr|ruck, https://github.com/sshnaidm/jcomparison
12:47 <ykarel> weshay, arxcruz|rover ssh zuul@38.145.33.225
12:47 <sshnaidm> weshay, moving it today to the ci-config repo
12:48 <weshay> sshnaidm, cool thanks
12:48 <sshnaidm> weshay, when back from ptos, we'll set up a new repo for it
12:50 <ykarel> panda, Thanks
12:51 *** ccamacho has quit IRC
12:52 <weshay> ykarel, /me is suspicious of the network
12:52 <weshay> trying ssh w/ verbose and it's hanging
12:53 <ykarel> weshay, hmm, after some time it returns authentication failed
12:53 *** jtomasek_ has quit IRC
12:56 *** jtomasek has joined #oooq
12:59 <weshay> https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/queens/ci/environments/network/multiple-nics/nic-configs/controller.yaml
12:59 <rfolco> pinglist: marios, sshnaidm, weshay, panda, rlandy, arxcruz, rfolco, chandankumar, zbr, kopecmartin
12:59 <rfolco> scrum in 1 min
12:59 <weshay> mtu at 1350
12:59 <matbu|hfpto> Hey folks, quick question: can oooq be used on a RHEL 8 distro?
12:59 <matbu|hfpto> oops
12:59 *** matbu|hfpto is now known as matbu
12:59 <rfolco> scrum here https://meet.google.com/oiv-geho-mai?authuser=1
13:00 <sshnaidm> matbu, quickstart.sh? not sure anybody tried
13:00 <ykarel> weshay, hmm, this setting should have been there for long, so why did the issue only start a few days ago?
13:00 <matbu> sshnaidm: oki, i just tried but it failed :D
13:00 <sshnaidm> matbu, so now we know :)
13:01 <matbu> sshnaidm: hehe yep. I was just asking because I might have missed something
13:01 <matbu> sshnaidm: but i can try to make it work
13:01 <ykarel> weshay, token expired, can u get one generated again: openstack console url show b9401f73-0553-4eaf-8efe-50b128f31227
13:01 <matbu> sounds mostly a matter of checking/installing packages
13:01 <weshay> arxcruz|rover, https://opendev.org/openstack/tripleo-ci/src/branch/master/toci-quickstart/config/testenv/ovb.yml#L46
13:01 <chandankumar> rlandy: I fixed the emit release tests
13:02 <chandankumar> it has taken too much time
13:02 <rlandy> chandankumar: ok - will review after the meeting
13:04 <weshay> panda, you have a vnc console to an overcloud
13:04 <weshay> ?
13:05 * weshay trying to connect to it
13:06 <panda> weshay: PM
13:16 *** udesale has joined #oooq
13:29 *** chem has quit IRC
13:31 <amoralej> ykarel, arxcruz|rover weshay it may be a problem with mtu
13:31 <amoralej> dalvarez is investigating
13:31 <amoralej> with ovs
13:31 <amoralej> wrt the problem in queens
13:32 <ykarel> ack /me watching, but why did it start happening recently?
13:32 <ykarel> that would be interesting to know
13:32 <ykarel> may be centos7.7 and some dep from queens
13:32 <arxcruz|rover> ykarel: and why only on queens?
13:33 <arxcruz|rover> mtu, what they are, where they live, what they eat, that and much more in the next chapters
13:33 <ykarel> queens has different packages than the others
13:33 <amoralej> queens has ovs 2.9
13:33 <ykarel> so could be, but we will know soon what is causing it
13:33 <amoralej> may be related
13:33 <ykarel> yup
13:33 *** ccamacho has joined #oooq
13:43 *** soniya29 has quit IRC
13:52 <panda> marios: you wanted to sync?
13:53 <rfolco> panda, should molecule remove the molecule container on the destroy step?
13:53 <marios> panda: yes, can you do it in like 10 mins, quick break?
13:53 <marios> panda: rfolco rlandy do we want to sync tomorrow @ 'scrum' time?
13:53 <marios> panda: rfolco rlandy i'll send an invite in a sec
13:53 <rfolco> yes marios
13:53 <rlandy> marios: ack
13:53 <rlandy> thanks
13:53 <panda> marios: you decide, I'm your humble slave
13:53 <marios> panda: :D thanks, please let's start in 7 mins, will pm you in a sec for the room
13:54 *** akahat has quit IRC
13:54 <panda> rfolco: if you override the default destroy with your destroy.yml then you have to remove the container too
13:54 <rlandy> marios: ^^ save that above line from panda
13:54 <marios> rlandy: gonna frame it
13:54 <rfolco> panda, it wasn't removing before my destroy.yml either
13:56 <amoralej> arx ykarel weshay we are thinking of updating ovs to 2.11 on the undercloud in that environment, to see if the problem is centos7.7+ovs2.9+ovb
13:56 <amoralej> is that fine?
13:56 <rfolco> marios, call me for your party, I promise I'll behave well.
13:56 <amoralej> or do you want to check something before?
13:56 <ykarel> amoralej, okk for me
13:56 <ykarel> we can revert again
13:56 <panda> rfolco: how do you call molecule?
13:57 <rfolco> panda, I am running this locally, with molecule converge, to be quicker for development
13:57 <amoralej> revert the ovs update...
13:57 <amoralej> dunno
13:57 <amoralej> may work
13:57 <amoralej> :)
13:57 <rfolco> panda, molecule destroy, molecule converge
13:57 <ykarel> :D
13:57 <amoralej> arxcruz|rover, ^
13:57 <rfolco> instead of molecule test or tox -e molecule
13:57 <rfolco> panda, ^
13:58 <panda> rfolco: and molecule destroy does not destroy?
13:58 <panda> rfolco: any errors?
13:58 <rfolco> no
13:58 <rfolco> no
13:58 <panda> rfolco: you're lying
13:58 <rfolco> :'-(
13:58 <panda> rfolco: how do you know it's not destroyed?
13:58 <rfolco> sudo docker ps -a ?
13:59 <marios> ack rfolco
13:59 *** Vorrtex has joined #oooq
14:00 <weshay> amoralej, k.. can we see that in an rdoinfo job?
14:00 <weshay> I don't think ovb is there
14:01 <weshay> amoralej, we did see a 7.7 ovb queens job pass.. at least once or twice
14:01 <amoralej> mmm
14:01 <weshay> but it's still a good lead
14:01 <amoralej> not sure if we have ovb in experimental
14:01 <amoralej> for rdoinfo tbh
14:01 <amoralej> need to check
14:01 <amoralej> i'm joining a mtg now
14:04 <ykarel> weshay, a queens job passed with 7.7
14:05 <ykarel> the difference there was it was running the kernel from centos7.6
14:05 <ykarel> but yes, the failing one has the kernel from 7.7
14:08 <ykarel> with pass i meant it moved on till tempest
14:08 <ykarel> didn't fail at ssh
14:10 <weshay> sshnaidm, marios https://review.opendev.org/685090
14:10 <ykarel> reference: https://logs.rdoproject.org/10/587310/9/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/2f67ee3/logs/undercloud/etc/redhat-release.txt.g
14:11 <rlandy> chandankumar: I think https://review.rdoproject.org/r/#/c/22549/ is uncontroversial and can probably go when your patches are in. fixing rdo-jobs patch
14:14 <rlandy> https://review.rdoproject.org/r/#/c/22550/3/zuul.d/standalone-jobs.yaml
14:15 <rlandy> ^^ needs review pls
14:17 <marios> panda: did it drop you? i just dropped
14:19 <panda> marios: you crashed my laptop
14:21 <marios> panda: ack
14:21 <weshay> arxcruz|rover, you on the console of ykarel's job?
14:26 <rlandy> chandankumar: weshay: check jobs are removed, removed my -2 - https://review.rdoproject.org/r/#/q/topic:train-jobs+(status:open+OR+status:merged) ready for review
14:26 <weshay> sshnaidm, where is ovb-setup.yml logged?
14:27 <sshnaidm> weshay, logged..?
14:28 *** saneax has quit IRC
14:30 <arxcruz|rover> weshay: yes
14:32 <weshay> arxcruz|rover, want to debug together?
14:33 <weshay> arxcruz|rover, https://meet.google.com/tid-cmdj-jry
14:37 *** beagles-mtg is now known as beagles
14:37 *** dalvarez has joined #oooq
14:37 <dalvarez> o/
14:38 <weshay> arxcruz|rover, ^
14:38 <weshay> jump on
14:38 <amoralej> dalvarez, ykarel, arxcruz|rover so what do you think is better to troubleshoot,
14:38 <amoralej> update ovs in the UC or reboot?
14:38 <weshay> reboot?
14:38 <amoralej> the uc
14:39 <amoralej> i think someone proposed that?
14:39 <dalvarez> arxcruz|rover: was it you the one writing on tmux? :)
14:39 <weshay> you want to reboot the undercloud?
14:39 <ykarel> amoralej, i would prefer whichever gives us the option to revert back to the original
14:39 <weshay> https://meet.google.com/tid-cmdj-jry?authuser=1
14:39 <ykarel> dalvarez, it was me
14:39 <dalvarez> oh
14:39 <dalvarez> ykarel: we can revert back the ovs thing
14:39 <ykarel> dalvarez, then we can start with the ovs update
14:39 <dalvarez> ykarel: so you said that we have the same environment, same ovs, same kernel working as well?
14:39 <amoralej> ok, let me update ovs
14:39 <ykarel> dalvarez, yes
14:40 <amoralej> ykarel, also with br-ctlplane
14:40 <amoralej> right?
14:40 <amoralej> same ovs topology?
14:40 <dalvarez> ykarel: amoralej ^ i was saying that maybe it's got to do with the ovs/kernel on the hypervisor where those instances are running
14:40 <ykarel> amoralej, yes same
14:40 <weshay> amoralej, ykarel https://review.opendev.org/#/c/618832/
14:40 <ykarel> amoralej, ssh zuul@38.145.34.241
14:40 <weshay> tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch SUCCESS in 2h 30m 37s
14:40 <dalvarez> ykarel: amoralej i disabled segmentation hw offloading and still saw packets larger than the mtu for some reason
14:40 * ykarel looks
14:41 <weshay> dalvarez, should be at 1350 right?
14:41 <dalvarez> weshay: yeah, but still saw packets of 1600 or so
14:41 <weshay> neutron.conf:130:global_physnet_mtu=1350
14:41 <dalvarez> the mss negotiation seemed ok
14:42 <dalvarez> tcp wise
14:42 <amoralej> so you mean a kernel-in-uc vs kernel-in-hypervisor issue?
14:42 <ykarel> weshay, in that job, "kernel": "3.10.0-957.27.2.el7.x86_64",
14:42 <ykarel> the old kernel from centos 7.6
14:42 <ykarel> so it's a different case
14:42 <weshay> ah poop
14:43 <weshay> ykarel, man.. oh so much for pretest
14:43 <dalvarez> 13:30:12.690879 Out fa:16:3e:e9:e9:d8 ethertype IPv4 (0x0800), length 1564: (tos 0x0, ttl 64, id 10771, offset 0, flags [DF], proto TCP (6
14:43 <dalvarez> weshay: ^
14:43 <dalvarez> even though the mss looked ok in the handshake
14:43 <dalvarez> anyways i turned off the segment hw offload and it looks better, but still bigger packets
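[A quick way to probe the kind of MTU problem being discussed here is a don't-fragment ping sized to the tenant MTU. A sketch; the target address is a placeholder:]

```shell
# With global_physnet_mtu=1350, the largest ICMP echo payload that fits in one
# unfragmented IPv4 packet is 1350 - 20 (IP header) - 8 (ICMP header) = 1322.
mtu=1350
payload=$((mtu - 20 - 8))
echo "max unfragmented payload: ${payload}"
# '-M do' (iputils ping) sets the DF bit so oversized packets fail loudly:
#   ping -M do -s ${payload} <overcloud-node-ip>        # should succeed
#   ping -M do -s $((payload + 1)) <overcloud-node-ip>  # should fail at mtu 1350
```

[The 1564-byte frames in the tcpdump above are well past that limit, which fits the "larger than the mtu" observation.]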
14:44 <dalvarez> ykarel: so the one that works is using the older kernel?
14:44 <amoralej> the difference may be that the failing jobs were created with the new kernel from the beginning
14:44 <ykarel> dalvarez, so before the reboot, old kernel + ovs 2.9 + ssh to overcloud was working
14:45 <amoralej> while ykarel's one was initially set up with the old one and then rebooted
14:45 <ykarel> amoralej, yes exactly, that's the difference
14:45 <ykarel> yes ^^ true
14:45 <amoralej> old kernel may imply different datapaths stored in the ovs even after a reboot with the new kernel?
14:45 <amoralej> dalvarez, ^
14:46 <dalvarez> amoralej: no
14:46 <dalvarez> if a new kernel is booted, the datapath from that kernel is used
14:46 <dalvarez> unless we load the modules explicitly
14:46 <dalvarez> which we should not
14:47 <amoralej> i'm inclined to update ovs
14:47 <amoralej> tbh
14:47 <dalvarez> hmm
14:47 <dalvarez> wait
14:47 <dalvarez> ykarel: do you have access to that vm that is working? can you "lsmod | grep openvswitch"?
14:47 <ykarel> dalvarez, yes
14:47 <ykarel> [zuul@undercloud ~]$ lsmod | grep openvswitch
14:47 <ykarel> openvswitch           114838  5
14:47 <ykarel> nf_nat_ipv6            14131  1 openvswitch
14:47 <ykarel> nf_defrag_ipv6         35104  2 openvswitch,nf_conntrack_ipv6
14:47 <ykarel> nf_nat_ipv4            14115  2 openvswitch,iptable_nat
14:47 <ykarel> nf_nat                 26583  5 nf_nat_redirect,openvswitch,nf_nat_ipv4,nf_nat_ipv6,nf_nat_masquerade_ipv4
14:47 <ykarel> nf_conntrack          139224  9 ip_vs,openvswitch,nf_nat,nf_nat_ipv4,nf_nat_ipv6,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_conntrack_ipv6
14:47 <ykarel> libcrc32c              12644  4 ip_vs,openvswitch,nf_nat,nf_conntrack
14:47 <dalvarez> ok
14:48 <dalvarez> and using 3.10.0-1062.1.1.el7.x86_64?
14:48 *** aakarsh has joined #oooq
14:49 <ykarel> dalvarez, yes
14:49 <ykarel> [zuul@undercloud ~]$ uname -r
14:49 <ykarel> 3.10.0-1062.1.1.el7.x86_64
14:49 <dalvarez> (undercloud) [zuul@undercloud ~]$ modinfo openvswitch
14:49 <dalvarez> filename:       /lib/modules/3.10.0-1062.1.1.el7.x86_64/kernel/net/openvswitch/openvswitch.ko.xz
14:49 <dalvarez> ykarel: ^ this module right?
amoralejykarel, you remember if we can run ovb in rdoinfo gate?14:49
weshayykarel, that's what the test job used14:50
weshay        "BOOT_IMAGE": "/boot/vmlinuz-3.10.0-1062.1.1.el7.x86_64",14:50
ykarelamoralej, i think i have tried it earlier14:50
amoraleji remember we tried14:50
amoralejbut don't remember if it worked14:50
ykarelamoralej, and there were some some issues iirc, but could be fixed14:50
amoralejto test it with 2.11 in a DNM14:51
ykarelweshay, yes14:51
dalvarezamoralej: i'd check also rdocloud (ie. maybe it fails when they nodes are on the same (or certain) hypervisors)14:51
dalvarezs/they/the14:51
weshayykarel, http://logs.rdoproject.org/32/618832/7/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch/6c2fe25/logs/overcloud-controller-0/var/log/extra/dump_variables_vars.json.txt.gz14:51
amoralejdalvarez, it fails always14:51
weshayykarel, same kernel w/ a passing job14:51
dalvarezamoralej: always? weshay just linked this ^ passing14:52
dalvarez:?14:52
amoralejmm14:52
dalvarezand also it works when the kernel is upgraded14:52
amoralejalways in periodic :)14:52
dalvarezhah is periodic somehow pinning certain rdocloud nodes?14:52
amoralejand yatin reproducer is in a different tenant14:53
dalvarezlooks like mtu issue definitely14:53
amoralejmay be related to tenant configuration?14:53
*** aakarsh has quit IRC14:53
dalvarezi can even ping with 2000B because the router namespace is fragmenting14:53
ykarelweshay, that is for overcloud, undercloud is 7.6 kernel14:53
* ykarel rechecks14:54
weshayykarel, openstack-neutron-openvswitch-12.1.1-0.20190910010613.28f3e37.el7.noarch14:54
weshaysame ovs package on the passing job14:55
weshayykarel, ya.. the nodepool node may be old /me checks that job's undercloud14:55
ykarelweshay, yes14:56
weshaycorrect.. kernel on the nodepool node.. is         "BOOT_IMAGE": "/boot/vmlinuz-3.10.0-957.27.2.el7.x86_64",14:56
amoralejhttp://logs.rdoproject.org/32/618832/7/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch/6c2fe25/logs/undercloud/var/log/journal.txt.gz14:56
amoralejold kernel14:56
weshayya14:56
weshayI don't think we can reboot the nodepool node into a new kernel at run time14:57
weshaythat's just how it is :(14:57
weshayykarel, amoralej perhaps in the future we need to build a test nodepool node w/ cr?14:58
ykarelyes that should help detect these issues14:58
amoralejweshay, much better we'll have centos streams14:58
weshayif this really ends up being the kernel14:58
amoralejok15:00
amoralejso,,, my proposal15:00
amoralej1. reboot as is15:00
amoralej2. we may try to downgrade kernel15:00
amoralejand check what happens15:00
amoralejor15:00
amoralej3. update ovs15:00
dalvarezamoralej: cant we just update ovs first? :)15:00
amoralejok15:00
weshaythat would be interesting.. to downgrade the kernel on the undercloud that yatin has held15:01
dalvarezbut that'd just update the userspace15:01
amoralejlet's try15:01
amoralejdowngrade kernel vs upgrade ovs :)15:01
amoralejwe can generate other reproducer quite easyily if needed15:01
ykarelbut downgrading kernel would also need reboot i think15:03
amoraleji'll update ovs, if needed let's rebuild15:03
amoralejyes15:03
amoralejit does15:03
amoralejthat's why i proposed to reboot first15:03
amoralejto check rebooting itself does not change15:03
ykarelyup that would give some lead to our theory15:04
rlandychandankumar: I added a separate task for Train check jobs in RDO (marked it blocked) and moved the two cards for periodic train to 'ready for review'15:05
rlandyamoralej: when you have time, pls review https://review.rdoproject.org/r/#/c/22582/ - took a shot at adding the weirdo jobs for train promotions in ci.centos.15:09
*** skramaja has quit IRC15:15
amoralejweshay, can we get another token for the console?15:15
amoraleji think it expired15:15
pandaI'm going to merge https://review.rdoproject.org/r/2253815:17
weshaydalvarez, I think the overclouds are gone15:20
dalvarezweshay: yeah15:20
dalvarezlooks like it :S15:20
weshayamoralej, or ykarel  will have to spin up another env and hold15:20
ykarelovercloud gone?15:21
ykarelhow15:21
chandankumarrlandy: ack15:21
ykarelmaybe the cleanup script deleted it, the one which deletes overclouds older than 5 hours, right?15:23
ykarelor someone manually deleted it15:23
weshayykarel, ya.. the reaper15:24
weshayscript15:24
ykarel:(15:24
weshaycould go turn that off15:24
ykareliirc there was an option added to skip some nodes, but can't recall15:24
weshayrlandy, which infra server is the cleanup script running on these days?15:25
weshayhubbot?15:25
sshnaidmweshay, te-broker15:26
weshaysshnaidm, there is no instance in infra w/ that name anymore15:27
sshnaidmweshay, 38.145.33.16615:29
weshaythat's odd15:29
weshayoh.. that's in openstack-infra15:30
weshaysorry15:30
weshayopenstack-nodepool rather15:30
weshayya15:30
weshay:)15:30
sshnaidmweshay, it's weshay and whayutin users there :)15:30
weshaysshnaidm, rlandy so we don't have a way to easily exclude certain stacks / servers15:33
sshnaidmweshay, no15:33
weshaywe're debugging a queens kernel issue15:33
weshayis adding an exception list an ok idea?15:33
rlandyweshay: no15:39
rlandyadding an exception list is ok - if we have the resources to exclude some15:40
weshaysshnaidm, ok by you?15:41
weshayI would assume exceptions are submitted and removed via gerrit15:41
sshnaidmweshay, yeah, fine by me15:42
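The reaper behaviour being discussed, delete overcloud stacks older than 5 hours unless they are on a gerrit-managed exception list, could be filtered along these lines. This is a hypothetical sketch of the idea only; the function name, the `(name, created_at)` tuple shape, and the constant are all assumptions, not the real cleanup script:

```python
from datetime import datetime, timedelta

# Stacks older than this are eligible for deletion (per the log above).
MAX_AGE = timedelta(hours=5)

def stacks_to_reap(stacks, exceptions, now):
    """Return the names of stacks to delete.

    stacks:     iterable of (name, created_at) tuples
    exceptions: set of stack names to skip (e.g. held debug environments)
    now:        current time, passed in to keep the function testable
    """
    return [name for name, created in stacks
            if name not in exceptions and now - created > MAX_AGE]
```

Keeping the exception list in a reviewed file (as weshay suggests, submitted and removed via gerrit) means held environments like yatin's reproducer survive the reaper without turning it off entirely.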
pandazbr|ruck: arxcruz|rover merging a change that could impact the promoter server, I'll keep an eye for the next hour, but if it's acting weird you can revert the patch.15:44
rlandypanda: which patch is merging?15:46
rlandyhttps://review.rdoproject.org/r/#/c/22538/?15:46
pandarlandy: yes15:47
pandarlandy: I touch the production code only in two points15:47
pandarlandy: I add a raise to output any error at the end of the promotion, and install ansible 2.8 in the virtualenv15:47
pandaoh well three points15:48
rlandy2019-09-26 15:38:36.677398 | TASK [promoter : check if dlrn api actually finished a promotion process]15:48
rlandy2019-09-26 15:38:37.726440 | rdo-centos-7 | ERROR15:48
pandaI also added a daemon_reload if the dlrn-promoter service changes15:48
rlandy^^ failing15:48
rlandynon-voting - guess that's ok15:48
panda.... I knew it ... I make the promoter skip the last log line if an error occurs15:49
pandait never actually writes "FINISHED" if it fails with an error15:49
pandawhich makes sense in some way15:49
pandabut without the raise, I could not see why it was failing before15:49
pandathe catchall except was masking everything15:50
*** tosky has quit IRC15:52
*** dtantsur is now known as dtantsur|afk15:54
rlandypanda: k  think I am done with train stuff now15:57
rlandyback to promoter test15:57
pandarlandy: ok, want to +W  https://review.rdoproject.org/r/22538  or I'll just selfishly sel merge ?16:01
chandankumarrlandy: did we send the upstream train job patch? if not then I will do it tomorrow with -W16:01
rlandypanda: I'll w+ it16:02
pandarlandy: thanks16:02
rlandyor at least review +2 it16:02
rlandygive you a hint of legitimacy16:02
rlandychandankumar: ack - thanks16:02
chandankumarweshay: rlandy I have 3 public holidays between 2-9 in India16:03
rlandypanda: ok - +2'ed - now you can self merge with confidence16:03
rlandychandankumar: np - I have a bunch of holidays coming up16:04
rlandyout monday tues this coming week16:04
rlandywed the next week16:04
rlandyand then mon and tues the two weeks following that16:04
chandankumarafter that I will be attending pycon India from 11th Oct16:05
rlandychandankumar: let's try get this complete as much as we can tomorrow16:05
chandankumarrlandy: yes16:05
rlandyso someone can pick it up in our absence16:05
chandankumaryup16:09
rlandyhmmm ... no W+ yet?16:11
mariospanda: should be green now at https://review.rdoproject.org/r/#/c/22002/ but maybe issue with ppc for the test registries will continue tomorrow (delegated job is good now waiting for zuul to report it)16:11
mariosttyl16:11
*** ykarel is now known as ykarel|afk16:12
*** jpena is now known as jpena|off16:12
rlandypanda: question about keeping a test config file ...16:13
*** marios has quit IRC16:13
rlandybetter to add the test parameters to the current staging config or keep a separate file16:13
rlandynoting some duplication16:13
pandarlandy: you can keep a file and the ovverides separated, then merge the two in the code ...16:14
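panda's suggestion, keep the base staging config and the test overrides in separate files and merge them in code, avoids duplicating the shared parameters. A sketch of the merge step (the keys shown are illustrative, not the promoter's actual config schema):

```python
# Merge a base config with a smaller override file; keys present in
# `overrides` win. Neither input dict is mutated.
def merge_config(base: dict, overrides: dict) -> dict:
    merged = dict(base)
    merged.update(overrides)
    return merged

base = {"dlrn_api_host": "localhost:8080", "dry_run": False}  # hypothetical keys
test_overrides = {"dry_run": True}
print(merge_config(base, test_overrides))
# {'dlrn_api_host': 'localhost:8080', 'dry_run': True}
```

For nested configs a recursive merge would be needed, but for flat key/value settings this shallow update is usually enough.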
*** tesseract has quit IRC16:19
*** kopecmartin is now known as kopecmartin|off16:19
*** bogdando has quit IRC16:19
*** ykarel|afk has quit IRC16:50
*** udesale has quit IRC16:51
*** aakarsh has joined #oooq16:57
*** derekh has quit IRC17:01
*** Goneri has joined #oooq17:02
*** holser has quit IRC17:03
*** ykarel has joined #oooq17:05
*** aakarsh|2 has joined #oooq17:06
*** aakarsh has quit IRC17:08
*** weshay is now known as weshay_passport17:13
*** EmilienM is now known as MniLieEm17:15
*** MniLieEm is now known as EmilienM17:15
*** aakarsh|2 has quit IRC17:16
rlandypanda: how do I reference the dlrn host?17:29
rlandyfrom the staging enV?17:30
rlandyhttp://logs.rdoproject.org/48/22348/41/check/tripleo-ci-promotion-staging/61b95f9/job-output.txt.gz17:30
*** brault has quit IRC17:32
arxcruz|roverrlandy: weshay_passport panda sshnaidm https://review.opendev.org/#/c/683350/ please, +w this to fix rocky, the fs020 pass with it17:33
*** aakarsh has joined #oooq17:37
*** Goneri has quit IRC17:44
*** matbu has quit IRC17:45
*** matbu has joined #oooq17:46
pandarlandy: other than localhost:8080 or 127.0.0.1:8080 ?17:49
rlandypanda: told me no host with localhost:808017:49
rlandyI didn't try  127.0.0.1:8080 yet17:51
rlandyI added the http: to see if that helps17:51
pandarlandy: I have bandwidth problems right now, so I did not look at that log yet17:52
rlandypanda: no worries - will debug it17:52
rlandyoh - I think it worked now17:53
rlandy2019-09-26 17:52:50.186286 | rdo-centos-7 | Expected commit hash: 17234e9ab9dfab4cf5600f67f1d24db5064f1025 has not been promoted to.triple-ci-staging-promoted17:53
rlandy2019-09-26 17:52:50.186349 | rdo-centos-7 | Traceback (most recent call last):17:53
*** brault has joined #oooq17:53
rlandypromoter errored17:53
rlandythat's right17:53
*** brault has quit IRC17:54
rlandypanda: ^^ I think it's working - it failed because promotion did not succeed17:58
pandarlandy: at least we have the negative test18:02
rlandypanda: baby steps18:02
rlandybetter than before - no promotion - all green - this is fine18:04
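The check that failed above ("Expected commit hash ... has not been promoted") is the staging test asking the DLRN API whether a given commit reached a promotion target. A hedged sketch of that kind of check, assuming the promotions endpoint returns a JSON list of dicts carrying `commit_hash` and `promote_name` keys (the field names are assumptions about the DLRN API, not taken from the log):

```python
def promotion_succeeded(promotions, commit_hash, target_name):
    """True if `commit_hash` was promoted to `target_name`.

    promotions: parsed JSON from a DLRN promotions query, assumed to be a
    list of dicts with 'commit_hash' and 'promote_name' keys.
    """
    return any(p["commit_hash"] == commit_hash and p["promote_name"] == target_name
               for p in promotions)

promos = [{"commit_hash": "abc123", "promote_name": "tripleo-ci-staging-promoted"}]
print(promotion_succeeded(promos, "abc123", "tripleo-ci-staging-promoted"))  # True
print(promotion_succeeded(promos, "17234e9", "tripleo-ci-staging-promoted"))  # False
```

The second call is the negative case rlandy hit: the expected hash is absent from the promotions list, the check raises/fails, and the job goes red instead of the old all-green false positive.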
*** amoralej is now known as amoralej|off18:06
*** ykarel is now known as ykarel|away18:14
*** weshay_passport is now known as weshay18:45
*** Vorrtex has quit IRC18:52
*** dsneddon has joined #oooq18:53
*** Goneri has joined #oooq18:57
*** sshnaidm is now known as sshnaidm|off19:11
*** weshay has quit IRC19:18
*** Vorrtex has joined #oooq19:21
*** Vorrtex has quit IRC19:21
*** weshay has joined #oooq19:25
weshayugh..19:26
weshayso silly19:26
*** Vorrtex has joined #oooq19:27
rlandyno passport?19:29
rlandyweshay: ^^?19:30
weshaynot yet19:30
rlandyugh - wow19:31
weshayI feel crazy right now19:33
rlandymaybe you're not meant to go19:33
weshaywoot.. finally getting a laptop19:37
weshayrlandy,  check it out https://docs.google.com/document/d/14f6zANeJeBTIu6xCYwq9p3vG4D9r0ixskZiPFIEmfVs/edit#heading=h.howov9rq5h3k19:43
rlandywell, well19:43
rlandyyour influence has expanded19:45
weshayrlandy, that's not influence.. that is EmilienM trolling :)19:45
EmilienMlol19:46
*** dtrainor has quit IRC19:49
*** dtrainor has joined #oooq19:56
*** Goneri has quit IRC20:04
*** holser has joined #oooq20:10
*** ykarel|away has quit IRC20:13
*** Goneri has joined #oooq20:20
*** holser has quit IRC20:22
*** Vorrtex has quit IRC20:40
weshayrlandy, https://review.opendev.org/68517720:44
weshayrlandy, ping to chat for a sec if it's not too late20:44
rlandysure20:44
rlandyweshay: where do we even use that?20:46
weshayyou're going to laugh :)20:46
*** brault has joined #oooq20:46
rlandyI could use a laugh - where?20:47
weshayhttps://meet.google.com/cnm-zyfp-owb?authuser=120:47
*** brault has quit IRC20:50
rlandyjoining20:51
rlandyweshay: can you hear me?20:51
rlandyweshay: let me join again20:52
*** ccamacho has quit IRC20:53
weshayrlandy, http://logs.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens/9f915d9/logs/reproducer-quickstart-deprecated/20:57
*** aakarsh has quit IRC21:15
weshayrlandy,  also https://review.opendev.org/#/c/685090/21:25
rlandyok21:26
rlandyI would have put localhost back21:26
rlandybut this is likely easier21:26
rlandyweshay: your install still going?21:27
rlandyafter the skip?21:27
weshayyes21:27
weshayundercloud is installing21:27
rlandywow - it really is the return of the dead21:27
weshayrlandy, also need to find where we pull the centos qcow2 and update that21:27
*** jfrancoa has quit IRC21:29
rlandyhttps://github.com/openstack/tripleo-quickstart/blob/master/config/release/tripleo-ci/CentOS-7/promotion-testing-hash-master.yml#L13021:29
rlandyweshay: ^^ that one?21:29
rlandyI added that for bm21:29
rlandyhttps://review.opendev.org/#/c/579154/ - must have needed that for some reason21:36
rlandythat creates the file we pass for nodes21:36
weshayrlandy,  it's defined in create-zuul-based-reproducer/templates/launcher-playbook.yaml.j221:36
*** matbu has quit IRC21:37
rlandyhttps://review.opendev.org/#/c/636406/12/roles/prepare-node/tasks/main.yaml21:37
rlandy^^ I think we need that21:38
rlandywhich will give us the vars zuul passes21:38
weshayrlandy, images are building :)21:40
weshayrlandy,  there is life21:40
rlandyhttps://review.opendev.org/#/c/613414/1/roles/nodepool-setup/tasks/main.yml21:42
rlandy^^ that was where the v3 reproducer work started21:42
rlandyhttps://tree.taiga.io/project/tripleo-ci-board/task/28421:43
rlandyweshay: here we go ... https://review.opendev.org/#/c/614638/4/playbooks/tripleo-ci/run-v3.yaml21:46
rlandyhttps://tree.taiga.io/project/tripleo-ci-board/task/27121:47
rlandylist of vars exported21:47
rlandysome of it merged21:47
*** Goneri has quit IRC21:48
weshayheh..21:50
weshayseed planted21:50
rlandylet's see how far we got if we rebased21:54
rlandyif we bring this back, I really will laugh21:54
rlandybut it might actually be easier to keep this tested than to get others to run the zuul-based one21:55
*** brault has joined #oooq21:58
*** brault has quit IRC22:13
rlandyweshay: need to step away for a bit but I have traced some old reviews/tests - will show what we still have22:23
rlandyafter I rebase22:23
*** rlandy is now known as rlandy|biab22:24
weshayk22:25
*** aakarsh has joined #oooq22:42
*** brault has joined #oooq22:55
*** brault has quit IRC23:00
*** brault has joined #oooq23:16
*** brault has quit IRC23:20
*** dsneddon has quit IRC23:52

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!