Tuesday, 2018-09-04

*** hamzy has quit IRC00:01
*** hamzy has joined #oooq00:14
*** apetrich has quit IRC01:28
*** skramaja has joined #oooq03:21
*** ykarel has joined #oooq03:24
*** jrist has joined #oooq03:31
*** ykarel has quit IRC03:37
*** skramaja_ has joined #oooq03:49
*** ykarel has joined #oooq03:51
*** skramaja has quit IRC03:52
*** ykarel has quit IRC04:08
*** udesale has joined #oooq04:12
*** agopi has quit IRC04:35
*** ykarel has joined #oooq04:37
*** quiquell has joined #oooq05:57
quiquellsshnaidm: Good morning05:57
*** jfrancoa has joined #oooq06:02
*** jfrancoa has quit IRC06:06
*** jfrancoa has joined #oooq06:08
*** jfrancoa has quit IRC06:09
*** agopi has joined #oooq06:10
*** jfrancoa has joined #oooq06:17
*** kopecmartin has joined #oooq06:33
*** apetrich has joined #oooq06:59
*** quiquell has quit IRC07:24
*** quiquell has joined #oooq07:25
*** dtantsur|afk is now known as dtantsur07:26
ssbarneamorning everyone!07:26
quiquellssbarnea: o/07:27
*** ccamacho has joined #oooq07:30
*** holser_ has joined #oooq07:33
ssbarneaquiquell: https://review.openstack.org/#/c/596428/ should be an easy review for the morning.07:33
quiquellssbarnea: ack07:34
ssbarneaalso this one is waiting for a +W https://review.openstack.org/#/c/571176/07:35
ssbarneaa promotion-blocker fix is also ready to get merged https://review.openstack.org/#/c/599088/ -- ignore the unrelated rdo failure, I think it should get the workflow too.07:39
*** ykarel is now known as ykarel|lunch07:40
*** apetrich has quit IRC07:41
*** tosky has joined #oooq07:41
sshnaidmquiquell, hello07:42
quiquellsshnaidm: I checked out the timeout refactoring review, and I have some concerns, don't know if they make sense07:43
quiquellsshnaidm: Do you have a minute ? is regarding this https://review.openstack.org/#/c/589068/07:44
sshnaidmquiquell, sure07:44
sshnaidmquiquell, which periodic stuff do you mean?07:46
quiquellsshnaidm: The stuff that is not going to run if zuul's timeout kicks in is07:49
quiquellhttps://github.com/openstack-infra/tripleo-ci/blob/master/playbooks/tripleo-ci/templates/toci_quickstart.sh.j2#L18307:49
quiquellThis range https://github.com/openstack-infra/tripleo-ci/blob/master/playbooks/tripleo-ci/templates/toci_quickstart.sh.j2#L179-L19207:50
quiquellIf we want to remove timeout we have to do this in the post07:50
quiquellAnd it depends on $exit_value07:50
quiquellHave already talked about this with panda|rover07:50
quiquellThat's why I didn't remove it07:51
quiquell(I really did want to)07:51
sshnaidmquiquell, lines 179-181 should be in logs collect script, lines 189-190 as well07:53
sshnaidmquiquell, I don't think this periodic stuff (lines 183-187) is used anywhere, iirc it's just for our debug.. but need to check07:54
quiquellsshnaidm: ack, going to move them to the collect script07:55
quiquellsshnaidm: And the periodic maybe we can remove it, and go back to it when we port periodic jobs to zuulv307:55
quiquellsshnaidm: I mean new workflow07:55
sshnaidmquiquell, currently I don't find where we use JOB_EXIT_VALUE at all, maybe it could be even removed08:00
sshnaidmquiquell, (and if it breaks, we'll know where it's used)08:00
quiquellsshnaidm: I think it's related to DLRN info08:01
quiquellsshnaidm: To publish job's result at DLRN...08:01
quiquellsshnaidm: maybe zuulv3 give us the run phase result08:01
sshnaidmquiquell, we use "$SUCCESS" for that https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/playbooks/legacy-tripleo-ci-periodic-base/post.yaml#L1108:01
quiquellsshnaidm: Yep I don't find it either, let's just remove it.08:03
quiquellsshnaidm: Maybe we can also remove this https://github.com/openstack-infra/tripleo-ci/blob/master/playbooks/tripleo-ci/templates/toci_quickstart.sh.j2#L167-L172 ?08:04
quiquellWhen we move OVB to the new workflow collect logs will be executed at post08:05
quiquellwdy ?08:05
sshnaidmquiquell, yeah, but let's firstly finish with it and then we'll move OVB to post..08:08
sshnaidmquiquell, I wouldn't make all in one patch..08:08
quiquellsshnaidm: Ok, will cleanup the periodic thing, and move the dns cache stuff08:09
sshnaidmquiquell, hmm.. actually seems like ovb does it in post already08:10
quiquellsshnaidm: So we were doing it twice ?08:10
sshnaidmquiquell, need to check, let's leave it for now08:11
quiquellsshnaidm: Ahh we were moving the script so it doesn't execute at post08:11
quiquellsshnaidm: We are not using this for OVB so let's clean this, and fix collect logs at post for OVB08:11
quiquellwhen we move OVB to the new workflow08:11
quiquellso we move forward08:12
quiquellwhat do you think ?08:12
sshnaidmquiquell, agree08:12
sshnaidmmarios|rover, wrt https://review.openstack.org/#/c/599201/ - it doesn't change any way we test oooq and extras, AT ALL08:13
quiquellok08:13
*** apetrich has joined #oooq08:13
mariossshnaidm: but it could affect the reproducer you generate right? ssbarnea correct me if i'm wrong08:13
mariossshnaidm: no it doesn't change how we test wrt jobs/since we don't run reproducer in any job08:14
mariossshnaidm: but we do generate it08:14
sshnaidmmarios|rover, we clone changes to oooq and extras here: https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/nodepool-setup/tasks/clone-ci-repos.yml#L2408:14
sshnaidmmarios, we generate it in create_reproducer role, how is it related??08:15
mariossshnaidm: so you are saying this https://review.openstack.org/#/c/582963/ is completely unnecessary08:15
mariossshnaidm: i.e. what was the problem ssbarnea was solving with that08:15
sshnaidmfolks, please take a look at reproducer code more closely, just go step by step to learn how it works08:15
mariossshnaidm: if we are already applying the changes08:15
mariossshnaidm: i am trying to learn here too08:16
sshnaidmmarios, it didn't solve anything, because nothing was to solve08:16
mariossshnaidm: ok, then why don't you write as much on the revert instead of 'we shouldn't do it because its bad imo' which is what it currently sounds like08:16
mariossshnaidm: so you're saying that the changes to oooq/extras are already included , even if they are to reproducer generation , and so https://review.openstack.org/#/c/582963/ is completely redundant and we should revert it with https://review.openstack.org/#/c/599201/08:17
sshnaidmmarios, it may be only useful when your patch in extras change the reproducer script itself and you want to test them on live08:18
sshnaidmmarios, in this case it's possible just to add one line to reproducer after cloning extras08:19
mariossshnaidm: k, am gonna copy/paste this stuff into the review ^^^ as extra info, and lets discuss on the community call this afternoon if you're around?08:19
marioswith ssbarnea and anyone else interested08:19
mariossshnaidm: thanks for the info08:19
*** chem has quit IRC08:19
sshnaidmmarios, the worst case that currently it broke reproducer, so it's pretty urgent08:20
mariossshnaidm:  well thats the other thing i wrote there08:20
mariossshnaidm: i.e. unless its a bug08:20
mariossshnaidm: so file the bug i think since it is urgent?08:20
ssbarneaif you read comments from https://review.openstack.org/#/c/582963/ it would be easy to understand why it was needed in the first place. wes also confirmed that before doing it the user would have to manually cherry pick zuul changes in local repo to be able to run reproducer in the same conditions as the job.08:21
mariossshnaidm: it would increase reviews at least08:21
sshnaidmmarios, I also want to reproduce jobs in some time08:21
mariossshnaidm: so yeah if it broke reproducer lets revert it and ssbarnea can fix it before remerging08:21
mariossshnaidm: but imo we need a bug for that08:21
*** chem has joined #oooq08:21
ssbarneamarios: the problem is that I don't see what is broken.08:21
sshnaidmmarios, breaking reproducer for me it's kind of critical, ssbarnea can resubmit his original patch and we'll discuss it on one of team meetings08:22
mariosssbarnea: well that is what the bug will explain08:22
mariossshnaidm: agree but lets at least point to how it broke so we can merge the revert today (please bug with pointer/trace?)08:22
ssbarneamy impression is that we have different expectations regarding the "subject" of the reproducer. I do see it as a "build reproducer" and sshnaidm is seeing it as a "job runner". (the difference between build and job being crucial, as build is an instance of a job, one with specific parameters (CRs being the most important ones)). If you just want to run a "job" you have some expectations, if you want to run the same build, you do have other08:27
ssbarneaexpectations. Anyway, having a bug would be great as we can post all of this on it and have traceability.08:27
sshnaidmssbarnea, you didn't understand me and you don't understand how reproducer works08:30
quiquellsshnaidm: Do this https://review.openstack.org/#/c/599088/1 depends on this https://review.openstack.org/#/c/595527/?08:35
quiquelldamn... I mean ssbarnea08:35
quiquellssbarnea: Do this https://review.openstack.org/#/c/599088/1 depends on this https://review.openstack.org/#/c/595527/?08:36
ssbarneai don't think so08:40
*** d0ugal has quit IRC08:43
*** ykarel|lunch is now known as ykarel08:45
sshnaidmssbarnea, marios do you know that OVB jobs are broken?08:55
mariossshnaidm: no i did not know that is it from this morning?09:03
*** d0ugal has joined #oooq09:03
mariosquiquell: i just updated https://review.openstack.org/595527 ykarel please when you get a chance09:03
mariosquiquell: if you agree lets try and merge it today09:04
mariosquiquell: the queens and ocata check jobs have been timing out for this since before i left!09:04
sshnaidmmarios, https://logs.rdoproject.org/67/599267/2/openstack-check/legacy-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-master/a1884f2/logs/undercloud/home/zuul/repo_setup.log.txt.gz09:04
marios!!09:04
openstackmarios: Error: "!" is not a valid command.09:04
mariosthanks openstack09:04
ykarelmarios, why master/rocky ? i guess config-download is used there all places09:05
mariossshnaidm: actually yeah i saw that earlier this morning09:05
mariosi mean 2018-09-03 09:17:43.173965 | primary | TASK [repo-setup : Setup repos on live host] ***********************************09:05
marios2018-09-03 09:17:43.192326 | primary | Monday 03 September 2018  09:17:43 +0000 (0:00:00.080)       0:00:12.555 ******09:05
sshnaidmmarios, do you have a running host now on rdocloud?09:05
marios2018-09-03 09:17:44.621980 | primary | fatal: [subnode-2]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": true}09:05
mariossshnaidm: ack we need a bug for that going to file it09:05
mariossshnaidm: i do not haven't checked my env i was going to wipe them if they are around anyway09:06
sshnaidmmarios, yeah, it's because we introduced a new repo, but it doesn't seems that we have a proxy for it in rdo cloud09:06
mariossshnaidm: i was poking earlier because i want to merge https://review.openstack.org/#/c/59552709:06
mariossshnaidm: and it had failed like the one you are pointing to (I assume 2018-09-03 09:17:44 | /home/zuul/repo_setup.sh: line 212: NODEPOOL_CBS_CENTOS_PROXY: unbound variable )09:06
mariosis what you mean? ^09:06
sshnaidmmarios, yes09:06
ssbarneasshnaidm: not all, but this broke them https://review.openstack.org/#/c/598255/109:07
mariosssbarnea: do we have bug already?09:07
mariosssbarnea: about to file one09:07
ssbarneano, we do not, just found it.09:07
sshnaidmssbarnea, it's code of reproducer at all... please, look carefully09:07
mariosykarel: ah ack ok... i can't vouch without checking that we don't have any featuresets that disable config download there09:07
mariosykarel: but sure if that is 'apparently' the default i can remove from those please comment thanks for your time :)09:08
sshnaidmssbarnea, you need to separate code of reproducer and code of job, really)09:08
mariosssbarnea: ok09:08
sshnaidmmarios, no, this patch is not related09:09
sshnaidmmarios, so we need a bug..09:10
mariossshnaidm: ssbarnea ack filing it now sec09:10
ssbarneaalready filled09:10
ssbarneahttps://bugs.launchpad.net/tripleo/+bug/179060309:11
openstackLaunchpad bug 1790603 in tripleo "repo_setup.sh: line 212: NODEPOOL_CBS_CENTOS_PROXY: unbound variable" [Undecided,Triaged]09:11
mariosykarel: actually you're right we really shouldn't need that for master/rocky since they will be using the role which includes the agents09:13
mariosykarel: as per the bug09:13
mariosykarel: i mean https://bugs.launchpad.net/tripleo/+bug/1785067 comment #209:13
openstackLaunchpad bug 1785067 in tripleo "without bootstrap-subnodes queens/pike/ocata deployment fails for missing os-collect-config" [High,In progress] - Assigned to Marios Andreou (marios-b)09:13
ssbarneasomething is weird because i see no place where NODEPOOL_CBS_CENTOS_PROXY is used without a fallback for the case where it is undefined, still the build has the unbound error.09:14
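The "unbound variable" failure and the fallback mentioned above can be reproduced in isolation. This is a minimal sketch, not the actual repo_setup.sh logic: EXAMPLE_PROXY and the fallback URL are hypothetical names chosen for illustration. It assumes bash, where `set -u` (here passed as `-u`) makes a bare reference to an unset variable fatal, while the `${VAR:-default}` form stays safe.

```shell
#!/usr/bin/env bash
# Sketch only: EXAMPLE_PROXY and the fallback URL are hypothetical.
unset EXAMPLE_PROXY

# Under -u (nounset), a bare reference to an unset variable aborts the shell.
# A child shell is used so that this script itself keeps running.
if ! bash -uc 'echo "proxy=$EXAMPLE_PROXY"' 2>/dev/null; then
    echo "bare reference failed under nounset"
fi

# The :- form substitutes a fallback value and is safe under -u.
bash -uc 'echo "proxy=${EXAMPLE_PROXY:-http://fallback.invalid}"'
```

If every reference in a script uses the fallback form and the error still appears, one plausible place to look is a file that the script sources before those references are reached.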
mariosquiquell: going to update https://review.openstack.org/#/c/595527/09:14
ykarelmarios, cool as we are seeing only in older releases so good to change there only09:14
mariosykarel: yes09:14
mariosykarel: doing09:14
ykarelmarios, ack09:14
quiquellmarios: It's needed for master and rocky ?09:14
mariosquiquell: no it is not09:14
quiquellThen we have to remove them from promotion-testing-hash-09:15
ykarelmarios, can also skip newton, we don't have promotion job running for those09:15
quiquellocata/pike/queens are the ones affected at promotion-testing-hash09:15
mariosquiquell: yeah we fixed pike but not the promotion hash09:16
sshnaidmssbarnea, don't you really see that https://review.openstack.org/#/c/598255/  changes reproducer code?09:16
mariosquiquell: I mean in the first commit09:16
ssbarneaI found it.09:16
quiquellmarios: Yep, let's fix now the periodic too here09:16
mariosquiquell: ykarel so gonna keep all but master rocky agree?09:17
ykarelmarios, can remove newton changes as well, and keep ocata/pike/queens09:17
mariosykarel: right i was gonna say no promotion job you said before09:18
mariosykarel: ok09:18
quiquellmarios: Also ignore newton09:19
mariosquiquell: yes09:19
mariospromotion09:19
quiquellmarios: As ykarel stated it doesn't have a periodic job09:19
mariosquiquell: ykarel thank you see v 4 @ https://review.openstack.org/59552709:20
mariosquiquell: ykarel whenever you next have some time thanks09:20
quiquellmarios: Do you know if this https://review.openstack.org/#/c/599088 depends on your change ?09:21
ykarelmarios, +109:21
mariosquiquell: it shouldn't matter09:22
ssbarneaok, so what happened is that  /etc/ci/mirror_info.sh is outdated, not having that inside it.09:22
mariosquiquell: i mean if featureset is using config download then mine is redundant at best09:22
mariosquiquell: but if it isn't then it needs mine09:22
quiquellmarios: Then this patch depends on yours, for those featuresets that are not going to use config-download09:22
quiquellmarios: That's right ?09:23
quiquellmarios: +109:23
mariosquiquell: well yeah , i mean assuming they are hitting the bug like Failed to start os-collect-config.service: Unit not found. on subnode09:23
mariosquiquell: like if there is no os-collect-config the heat stack update will be stuck and time out09:24
quiquellmarios: All job not using config-download and use ceph are going to hit it ?09:24
quiquellDamn, is review.o.o down ?09:24
quiquellmarios, ssbarnea: I have updated the review with your comments https://review.openstack.org/#/c/588475/09:29
quiquellmarios, ssbarnea, sshnaidm: ^09:29
mariosquiquell: ack added to next reviews list :)09:30
ykarelmarios, i think u need to check other release files(trunk) ones09:30
ykarelthose might also be affected, those are generally used by end users09:30
quiquellykarel: What's the best way to know what releases file matter ?09:30
mariosykarel: ack but i would rather merge this first since it is really killing us on the stats09:30
marios!09:30
marios:)09:30
ykarelmarios, yes agree09:31
quiquellagree09:31
quiquellLet's first fix CI then the real user :-)09:31
mariosykarel: seriously lets fix the ocata/queens09:31
mariosquiquell: exactly!09:31
marios:)09:31
ykarelwas just said so we don't let end users suffer, as those files are used by end users09:31
ykarelnext PS is ok09:31
ykarelnext review09:31
ssbarneaI need some help on the mirror_info.sh bug as I have no idea who is supposed to create/update this file.09:33
*** dsneddon has quit IRC09:34
quiquellssbarnea: Don't know, ask at #tripleo channel, it has a wider audience09:35
ssbarneaquiquell: sure, tx.09:35
quiquellssbarnea: #tripleo channel during ruck/rovering is very important09:35
quiquellssbarnea: also #rdo09:35
quiquellssbarnea: Also what I usually do is a git blame over related code and ask guys in the logs list09:38
quiquellssbarnea: Individualized messages have more effect09:38
quiquell This tripleo thing is so massive...09:38
quiquellmarios: All the timed out jobs at "noop" are fixed with your stuff ?09:41
mariosquiquell: not sure man, at least, they would hit this thing if they don't have os-collect-config installed. if there is then some other issue i don't know09:42
marioswe cant say09:42
quiquellpufff a lot of noop is failing a time out... fu... timeout :-/09:43
*** jaosorior has joined #oooq09:47
*** holser_ has quit IRC09:48
quiquellmarios, sshnaidm: "Add another level of parent jobs for zuul v3"  https://review.openstack.org/#/c/593063 is good at CI too, let's +2 it09:50
sshnaidmquiquell, done09:51
*** holser_ has joined #oooq09:52
sshnaidmquiquell, let's wait for rlandy to confirm it's what she wants, although I  don't see any issues with it09:52
quiquellsshnaidm: sure, just helping moving stuff forward09:52
quiquellsshnaidm: Also "Add a script to compare reviews" https://review.openstack.org/#/c/588475/ is good now09:53
quiquellPuff timed out a lot of jobs... in the bash to ansible commit arggg09:54
*** panda|rover has quit IRC09:54
*** saneax has joined #oooq09:56
*** panda has joined #oooq09:59
*** panda has quit IRC10:04
quiquellrfolco: Had to drop early yesterday so I pushed half-baked stuff to the bash to ansible review, sorry man :-/10:29
*** dsneddon has joined #oooq10:30
rfolcoquiquell, its ok, thanks for helping.10:30
rfolcoquiquell, what the hell is the 'tail' part on https://review.openstack.org/#/c/589068/ ?10:31
rfolco:)10:31
rfolcoyou made ci sad again10:31
quiquellrfolco: If we remove timeout, there is stuff that it's not executed and have to be move to post10:31
quiquellrfolco: that's the "tail" code10:31
quiquellrfolco: CI broken there ?10:32
quiquelldamn10:32
quiquellwas looking at other stuff10:32
rfolcoquiquell, I don't understand the relation of timeout and that piece of code...10:32
rfolcoquiquell, why removing timeout would prevent that to run ?10:33
quiquellrfolco: zuul kill the script10:33
quiquellrfolco: That's why we have an "artificial" timeout smaller than zuul's one10:34
quiquellrfolco: so we kill before zuul kill us and run stuff after it10:34
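The "artificial timeout" idea described above can be sketched as follows. This is an illustration under stated assumptions, not the toci_quickstart.sh code: the function name, the durations, and the echoed post step are hypothetical. It assumes GNU coreutils `timeout`, which exits with status 124 when the limit is reached.

```shell
#!/usr/bin/env bash
# Sketch only: the function name, limits and messages are hypothetical.

# Run a command under an inner time limit that is shorter than the outer
# (scheduler) limit, then always run the follow-up step, and finally
# return the command's own exit status.
run_with_inner_timeout() {
    local limit="$1"; shift
    local exit_value=0
    # GNU coreutils `timeout` exits with 124 when the limit is hit.
    timeout "$limit" "$@" || exit_value=$?
    # Follow-up work (log collection in a real job) runs either way.
    echo "post step, exit_value=${exit_value}"
    return "$exit_value"
}

run_with_inner_timeout 1 sleep 5 || echo "job timed out or failed: $?"
```

Moving the follow-up work into a separate post phase, as discussed in the channel, would remove the need for the inner limit, since the follow-up would no longer live in the script that the outer timeout kills.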
rfolcoquiquell, ok, so that code is part of the run phase10:34
quiquellrfolco: but moving stuff to post, will allow us to continue removing this "artificial" timeout10:34
quiquellrfolco: was part of run, but have move it to collect logs script that run at post10:35
quiquellrfolco: the periodic part is not needed neither the ovb part10:35
rfolcowe can cleanup after10:36
rfolcosmall chunks or we struggle to merge code like the "move bash to ansible"10:36
rfolcobecause it's too big10:37
rfolcoit was, now its smaller10:37
quiquelljust moved and clean10:37
quiquellso we cover the artificial post run10:37
rfolcocool10:38
* sshnaidm is listening to song "Release it!"10:47
sshnaidmlooking for songs "Fix it!" and "Deploy it!"10:47
quiquellrfolco: Puff don't understand the issue here http://logs.openstack.org/68/589068/16/check/tripleo-ci-centos-7-scenario008-multinode-oooq-container/efcbf04/job-output.txt.gz#_2018-09-04_08_32_23_98188510:48
quiquellrfolco: Maybe the new lines that have piping and all are breaking the EOF ?10:49
rfolcosshnaidm, Review it! :)10:49
quiquellrfolco: there is a "while read i" on them10:49
sshnaidmrfolco++10:49
quiquellrfolco++10:49
quiquellhubbot--10:50
sshnaidmhubbot seems not surviving last outage..10:51
rfolcorip hubbot10:51
sshnaidmhe was a hero10:52
quiquellhubbotxx10:53
quiquellrfolco: fixed, heredoc have to ignore special chars11:00
quiquellrfolco: "END" works fine11:00
rfolcoquiquell, where is the special char?11:01
quiquell rfolco "<" "|"11:03
quiquellrfolco: It tries to do the redirection or piping11:03
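The heredoc fix mentioned above relies on a general bash rule: when the here-document delimiter is unquoted, the body undergoes parameter expansion and command substitution; when the delimiter is quoted, the body is taken literally. A minimal sketch, with a hypothetical variable and text rather than the content of the actual review:

```shell
#!/usr/bin/env bash
# Sketch only: the variable and the generated text are hypothetical.
name="expanded"

# Unquoted delimiter: $name and $(...) are expanded while the body is read.
render_unquoted() {
    cat <<END
value: $name | $(echo substituted)
END
}

# Quoted delimiter: the body is kept literally, so it can be written out
# as a script fragment and only evaluated when that fragment runs later.
render_quoted() {
    cat <<"END"
value: $name | $(echo substituted)
END
}

render_unquoted
render_quoted
```

With the quoted delimiter, any pipes or redirections inside a `$(...)` in the body are not run at generation time, because the substitution itself never happens.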
quiquellbtw I see some "jenkins" stuff in the new workflow execution, I suppose is some legacy from zuul jobs11:05
quiquellOk create_collect_logs is working now11:11
quiquellmarios, ykarel: The config download can be the cause for recent timeouts ?11:12
ykarelquiquell, link for timeouts, which release?11:12
quiquellykarel: master11:13
mariosquiquell: where quiquell there are a few different issues afaics some unknown ...i'm currently trying to dig into https://bugs.launchpad.net/tripleo/+bug/1790199 fs035: openstack overcloud deploy returns error code without any signs or error11:13
openstackLaunchpad bug 1790199 in tripleo "fs035: openstack overcloud deploy returns error code without any signs or error" [Critical,Triaged]11:13
mariosquiquell: which is definitely different to the os-collect-config thing my review addressed11:13
mariosquiquell: ah thought the one i'm looking at isnt timeout11:14
mariosbut anyway quiquell do you have pointer or bug?11:14
mariosquiquell: also we can/will find out once alex revert merges?11:14
quiquellmarios: Just some executions of sprint review http://logs.openstack.org/48/589448/30/check/tripleo-ci-centos-7-scenario001-multinode-oooq-container/48b0a69/11:15
mariosquiquell: hm is that one timing out on the post deploy stuff? looks like deploy was clean exit 011:17
marioshttp://logs.openstack.org/48/589448/30/check/tripleo-ci-centos-7-scenario001-multinode-oooq-container/48b0a69/logs/undercloud/home/zuul/overcloud_deploy_post.log.txt.gz11:17
mariosquiquell: so anyway not related to my review. might be fixed by the config download revert ?11:18
quiquellmarios: timed out at tempest11:18
quiquellhttp://logs.openstack.org/48/589448/30/check/tripleo-ci-centos-7-scenario001-multinode-oooq-container/48b0a69/job-output.txt.gz#_2018-09-04_09_21_13_94838911:18
quiquellbut the problem could be in the time it takes to deploy uc or oc11:18
quiquellAnd I don't know if the time it takes to deploy them depends on config-download11:19
quiquell(Don't know how config-download works)11:19
quiquellLooks like we are activating it now11:19
ykarelhmm this may be a generic timeout that we are facing in ci from some time11:19
quiquellSo maybe that's the cause of problems11:19
quiquellykarel: Can it be related to config-download ?11:19
sshnaidmquiquell, we use config-download from rocky11:20
sshnaidmquiquell, it's a default config last release11:20
ykarelquiquell, don't think so, as it's working from quite some time11:20
ykarelyes correct11:20
quiquellOk11:20
sshnaidmI found strange failure here: http://logs.openstack.org/90/599290/1/gate/tripleo-ci-centos-7-containers-multinode/ace0ab5/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz#_2018-09-04_10_36_5211:20
sshnaidmabout 1 hour passed and "string indices must be integers"  error only11:21
sshnaidmmaybe mistral related..11:21
*** udesale has quit IRC11:24
jaosoriorhas anybody seen this issue http://paste.openstack.org/show/729395/ ?11:33
*** panda has joined #oooq11:34
ssbarneasshnaidm: now I finally see the problem, cherry picking fails in your example. that is clearly a breaking bug. i checked and downloading works (git review -d 123). Now the cherry picking fails because of empty commit not because of conflict, so should be fixable.11:35
sshnaidmssbarnea, the problem is not with specific cherry-pick, but with the fact that you try to cherrypick it at all11:35
sshnaidmssbarnea, I wrote exactly what's the problem in bullet points. Don't you like bullet points?11:36
sshnaidmssbarnea, you still don't understand that you mix unmixable things - jobs patches and reproducers code11:37
ssbarneayeah, l like them. still, I know for sure that we had a case where the tested CR was not used, and cherry picking sorted it.11:37
sshnaidmssbarnea, was not used - where?11:38
ykareljaosorior, seems missing repo, which release u trying11:39
jaosoriorykarel: queens11:39
ykareljaosorior, mm i can see python2-netaddr: https://trunk.rdoproject.org/centos7-queens/deps/latest/noarch/11:39
sshnaidmjaosorior, maybe it's python2-netad11:40
quiquellrfolco: Don't know what to do with {{ workspace }}, tripleo-ci doesn't feel like a good place for generated stuff11:40
jaosoriorsshnaidm: ...so, where do I try that?11:40
ykareljaosorior, sshnaidm python-netaddr is provided: https://trunk.rdoproject.org/centos7-queens/deps/latest/noarch/python2-netaddr-0.7.19-5.el7.noarch.rpm11:40
ykarelrpm -qp --provides https://trunk.rdoproject.org/centos7-queens/deps/latest/noarch/python2-netaddr-0.7.19-5.el7.noarch.rpm11:40
sshnaidmykarel, I think this error is from virthost, not from job11:40
sshnaidmjaosorior, try to install it on your virthost11:41
ykarelsshnaidm, yup good to check if repo is enabled11:41
sshnaidmjaosorior, to see if it fixes11:41
sshnaidmjaosorior, if so, need to fix our deps install11:41
jaosoriorsshnaidm: on the virthost I already tried running "yum install python-netaddr" and it said it was already installed.11:42
jaosoriorPackage python-netaddr-0.7.5-9.el7.noarch already installed and latest version11:42
ssbarneasshnaidm: I am not sure (as logs were removed even if is only 6weeks old), but i remember that using reproducer from a job like https://review.openstack.org/#/c/581779/ was not running with the change, was using tripleo master, .... so not using the change itself.11:43
ssbarneawes knows for sure as he confirmed the same issue11:43
sshnaidmjaosorior, and on your host when you run quickstart?11:44
sshnaidmssbarnea, your patch had nothing to do with that anyway11:44
sshnaidmssbarnea, I suggest to revert this to get reproducer working, if you think it's still the issue - please submit a bug with a patch and we'll discuss it11:45
ssbarneain fact he told me to hack reproducer to cherry pick my change and it worked. few days later I ended up creating the cherry picking change in order to force it to use the code from CRs.11:45
jaosoriorsshnaidm: it's python2-netaddr or python3-netaddr. I have them installed; but I guess the code in quickstart should change then11:45
rfolcoquiquell, we can consolidate a new workspace in a separate patch11:47
quiquellrfolco: ok11:48
sshnaidmssbarnea, let's talk about it when you have a submitted bug with logs and failures, patch etc11:50
sshnaidmssbarnea, I can't discuss something unknown11:50
sshnaidmjaosorior, you're right, fixing..11:52
jaosoriorsshnaidm: ended up deleting that statement to move forward11:52
ykarelsshnaidm, but that python- should not have caused that issue, as python- is provided by that package11:52
sshnaidmykarel, I don't know which distro jaosorior uses, maybe in his repos there is no python-11:53
sshnaidmjaosorior, can you run "yum list python*-netaddr"?11:53
ykarelsshnaidm, yes good to confirm that,11:54
jaosoriorsshnaidm: fedora 2811:54
ykareljaosorior, okk now i got it11:54
ykarelthere is a bug in fedora11:54
ykarelit has issue when multiple repos are enabled with source rpms11:54
jaosoriorInstalled Packages11:54
jaosoriorpython2-netaddr.noarch                                                                                    0.7.19-7.fc28                                                                                     @fedora11:55
jaosoriorpython3-netaddr.noarch                                                                                    0.7.19-7.fc28                                                                                     @fedora11:55
ykareljaosorior, fetching the fedora bug11:55
sshnaidmhmm.. me too with f28, but didn't see this error11:55
ykarelsshnaidm, only happens with multiple repos with same package11:55
sshnaidmykarel, ok11:56
ykareljaosorior, if u disable one repo, u will not see that11:56
ykarellet me find the bug11:56
ykarelsshnaidm, jaosorior https://bugzilla.redhat.com/show_bug.cgi?id=151932511:57
openstackbugzilla.redhat.com bug 1519325 in dnf "dnf fails to install a package by a "provides" name" [Unspecified,Closed: rawhide] - Assigned to rpm-software-management11:57
ykareland changing the package name to python2-netaddr will workaround that11:57
sshnaidmactually we can use "python*-netaddr" and then it will install both python2 and python312:04
sshnaidmnot sure if it's desirable..12:05
*** ajo has joined #oooq12:06
jaosoriorSo... now I don't encounter the python-netaddr issue. But now I have this problem http://paste.openstack.org/show/729399/12:16
jaosorioris it trying to use rhel images?12:17
*** trown|outtypewww is now known as trown12:17
ykarelnope, possibly all repos are disabled in previous steps12:24
sshnaidmjaosorior, yeah.. it tries to install libguestfs-tools-c and doesn't find it12:25
*** rlandy has joined #oooq12:30
quiquellrlandy: o/12:32
rlandyquiquell: hey - welcome back!12:32
quiquellrlandy: thanks12:33
quiquellrlandy: The review "Add another level of parent jobs for zuul v3" looks good12:34
rlandyquiquell: thanks - there are still underlying reviews though we need to get in12:34
quiquellrlandy: https://review.openstack.org/#/c/593063 the playbook executions are exactly the same12:34
*** ykarel is now known as ykarel|away12:34
quiquellrlandy: What's missing to merge it ?12:34
*** ccamacho has quit IRC12:34
quiquellrlandy: Stuff about infra ?12:35
rlandyquiquell: the reparent stuff is a huge mess12:37
* rlandy gets12:37
rlandyquiquell: https://bugs.launchpad.net/tripleo/+bug/178929412:39
openstackLaunchpad bug 1789294 in tripleo "RDO Cloud jobs move to zuulv3 native is blocked by legacy dependencies" [Undecided,Triaged] - Assigned to Ronelle Landy (rlandy)12:39
*** jrist has quit IRC12:39
rlandyquiquell: and we need to get this through zuul12:39
rlandyquiquell: need to touch base with panda about that as well12:40
*** jrist has joined #oooq12:40
quiquellrlandy: Without that we cannot merge the new base jobs ?12:40
rlandywe'll never see them work12:40
quiquellBoth at RDO and openstack infra ?12:41
quiquellbut CI is working in the review, is RDO the problem ?12:41
*** ykarel|away has quit IRC12:42
quiquellrfolco: https://review.openstack.org/#/c/596422 is huge12:44
quiquellhufff12:44
rlandyone sec - phone12:45
rlandyquiquell: some of the failures are legit and some are blocking what we could see12:51
rlandyquiquell: the problem is we keep uncovering more and more to do here12:51
rlandyquiquell: I am looking now at changing the deploy heat template12:51
quiquellrlandy: It affects all the jobs with base tripleo-ci-base ?12:51
quiquellrlandy: both upstream and RDO ?12:52
rlandyquiquell: just RDO jobs - we can't reconfigure RDO until we get rid of legacy12:52
rlandythat is what https://review.openstack.org/#/c/596422 is all about12:52
rlandyit's killing us12:52
quiquellbut RDO is still using the old workflow, we can still merge new workflow improvements while we make it RDO compatible12:53
rlandyhttps://review.openstack.org/#/c/599076/12:54
* rlandy gets error12:55
sshnaidmrlandy, agree with quiquell - why not to merge things for upstream CI for now, while working separately for OVB12:55
sshnaidmrlandy, it's independent and we don't break OVB with those changes12:55
sshnaidmrlandy, at least we'll finish one part..12:55
quiquellAlso is good to hub those stuff already in place to reduce rebasing12:56
quiquells/hub/have/12:56
quiquellSo work for OVB is less painful12:56
rlandysshnaidm: quiquell: my concern is delaying https://review.openstack.org/#/c/596422 more12:57
rlandypanda has moved and renamed stuff deleted in https://review.openstack.org/#/c/59306312:58
rlandyis panda around today?12:58
rlandyanyone seen him?12:58
sshnaidmrlandy, I think he's on PTO this week12:58
quiquellrlandy: Merging https://review.openstack.org/#/c/593063 is not going to delay the other imho12:58
quiquellrlandy: It's just a good restructuring of the new workflow already in place12:59
quiquellsure I am missing something12:59
sshnaidmrlandy, this patch - I don't know what stage it's in, but the multinode work seems to be close to merging12:59
sshnaidmrlandy, I'm seeing it for the first time..13:00
rlandysshnaidm: quiquell: let's bj - so we hash this out quickly13:00
sshnaidmrlandy, sure13:01
quiquellrlandy: Agree, my PTO has emptied my brain13:01
sshnaidmquiquell, lucky one13:01
*** skramaja_ has quit IRC13:01
rlandyhttps://bluejeans.com/u/rlandy/13:01
rfolcorlandy, o/ may I join you guys?13:03
rlandysure13:03
quiquellrlandy: thanks again, that's a titanic work13:17
rlandyquiquell: it's more like the titanic hitting the iceberg part13:18
quiquellmoving the iceberg before hitting it13:18
quiquellrfolco, rlandy: doing a cherry pick of ronelle's changes over "Add another level..." is near trivial so we are good13:22
*** vinaykns has joined #oooq13:22
quiquellrfolco: But it's not the case for the one removing bash variables13:23
rlandyquiquell: ok - great, go ahead13:23
rfolcoquiquell, bash vars patch should not be hard, we can resolve conflicts13:24
rlandyrfolco: do you know what is in /etc/nodepool/sub_nodes_private? http://logs.openstack.org/63/593063/11/check/tripleo-ci-centos-7-containers-multinode/387b2e0/logs/undercloud/etc/nodepool/ shows empty archive13:24
rfolcorlandy, yes, wget it and gzip -d... you'll see ip(s) for the subnode(s)13:25
quiquellhumm cherry-picking also over "Move toci BASH variables to ansible" is not that bad13:26
rlandyrfolco: fails to open13:26
rlandyrfolco: I need to duplicate this line https://github.com/openstack/tripleo-heat-templates/blob/master/ci/common/net-config-multinode-os-net-config.yaml#L14513:26
rlandyhere13:26
rlandyhttps://review.openstack.org/#/c/599076/1/roles/overcloud-prep-network/templates/overcloud-prep-network.sh.j213:26
rfolco$ cat sub_nodes_private13:26
rfolco199.204.45.23213:26
rlandyyou see the empty subnode_index?13:27
quiquellmarios: Can you help +2 this  https://review.openstack.org/#/c/593063 ? rlandy is ok with it13:27
rlandyoh marios is back as well13:28
rlandyyay13:28
rfolcorlandy, subnode_index is empty ? how ?13:28
marios|roverquiquell: i would rather review that properly and since it looks like it is more than 2 mins thing i'll cycle back to it in a bit13:29
marios|roverrlandy: o/13:29
rlandyrfolco: empty where? https://review.openstack.org/#/c/599076/1/roles/overcloud-prep-network/templates/overcloud-prep-network.sh.j2?13:29
marios|rover\o/13:29
rlandyjust because I didn't know what to put there13:29
quiquellmarios|rover: sure, thanks13:29
rlandyrfolco: can you bj with me for a sec?13:32
rlandyjust to sort out nodes issue13:32
rfolcorlandy, give me one sec to test your subnodes line locally here13:32
rlandyrfolco: sure - afaict - it would be empty13:32
rfolcorlandy, yes returning empty, I am trying to execute it in parts... confused on what should be the expected output .... 1 <ip> ?13:34
rfolcolets bj13:34
rfolcorlandy, knocking your door13:35
rfolcoknock knock13:35
rlandyrfolco: joining13:36
rfolcothe door is open, I am getting in13:36
quiquellcherry-picking ronelle's over timeout removal is semi-trivial too13:36
quiquellso we are good13:36
rfolcorlandy, my connection is killing me13:37
rfolcooh god13:37
rlandywant to rejoin?13:38
rfolcorlandy, if you change it to...13:46
rfolcogrep -n $(cat sub_nodes_private) sub_nodes_private13:46
rfolco1:199.204.45.23213:46
rfolcoindex:ip13:46
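A minimal sketch of the index:ip lookup discussed above, assuming sub_nodes_private holds one IP per line as in the single-IP example. The helper names index_subnodes and subnode_index and the throwaway file are hypothetical. This is not what the tripleo-heat-templates line does, only one way to get the same index:ip output when the file lists more than one sub-node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "index:ip" for every non-empty line of a
# nodepool-style file that lists one sub-node IP per line.
# grep -n prefixes each matching line with its 1-based line number; the
# pattern '.' matches any non-empty line, so this still works when the file
# holds several IPs (passing $(cat file) as the pattern would not).
index_subnodes() {
    grep -n . "$1"
}

# Hypothetical helper: print only the index of one given IP.
# -F treats the IP as a fixed string (dots are not wildcards),
# -x requires the whole line to match.
subnode_index() {
    grep -nFx -- "$1" "$2" | cut -d: -f1
}

# Usage example with a throwaway file (assumed content, one IP per line)
tmp=$(mktemp)
printf '199.204.45.232\n' > "$tmp"
index_subnodes "$tmp"
subnode_index 199.204.45.232 "$tmp"
rm -f "$tmp"
```

Matching with -F and -x avoids treating the dots in an address as regex wildcards, so 10.0.0.1 cannot accidentally match 10.0.0.11 or 10a0b0c1.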
*** ccamacho has joined #oooq13:46
rlandyrfolco: yep - but I am just trying to reproduce what is there13:47
rlandynot change it13:47
rfolcorlandy, who said this is right ? https://github.com/openstack/tripleo-heat-templates/blob/master/ci/common/net-config-multinode-os-net-config.yaml#L13813:48
rfolcoit does not make any sense to me13:48
rlandyrfolco: me neither13:48
rlandybut I want to move the code, not change it13:48
rlandyat least at first13:48
rfolcoso.. primary_node_private == node_private != sub_nodes_private13:50
rfolcostill don't have qe for my cards :(13:56
rfolcohttps://trello.com/c/WvujKLzl/915-override-featureset-config and https://trello.com/c/ZL2yzSK2/916-define-extra-playbooks-in-job-definition13:56
rfolcoany volunteer? ^13:57
quiquellrfolco: I can get some13:57
rfolcothese cards are small patches, and already have test patches to verify13:58
rfolcoso would be a 15 min each I believe13:58
rfolcoquiquell, you rock13:58
quiquellhave already done some reviewing13:58
quiquellrfolco: My brain is still elsewhere13:59
*** udesale has joined #oooq13:59
quiquellok taking override featureset config13:59
rfolcoquiquell, thanks14:00
rfolcosshnaidm, are you overloaded or can be a qe for a small card ? I saw your review already there... please let me know if you can qe https://trello.com/c/ZL2yzSK2/916-define-extra-playbooks-in-job-definition14:01
rfolcowill probably need a rebase when we merge reparenting patch14:01
sshnaidmrfolco, exactly wanted to ask about that, I still wonder why we need this14:01
rfolcosshnaidm, even after my beautiful and so awesome commit ? :) kidding...14:02
*** Guest58453 has quit IRC14:03
*** stack has quit IRC14:04
sshnaidmrfolco, wanna to bluej?14:04
rfolcosshnaidm, I agree its not everyday we'll have an extra playbook... and this needs to exist in tqe or gated by extra-requirements.txt... but that happened to browbeat. So how do we add an extra playbook without changing the playbooks logic (manually including it)?14:04
rfolcosshnaidm, ok, your room?14:04
sshnaidmrfolco, ya14:04
rfolcosshnaidm, rlandy if you get 5 min, please join us14:05
quiquelljfrancoa: the container_registry_file review, isn't it better to have it in the release files, since they have different names per release?14:05
rlandyk14:05
sshnaidmrfolco, rlandy https://bluejeans.com/u/sshnaidm/14:05
*** dsneddon has quit IRC14:06
*** dsneddon has joined #oooq14:06
jfrancoaquiquell: would make sense, but it's not something related to the release itself...if we do it for the container_registry_file, why not doing it too for config_download_args and other variables which change conditionally on the release?14:10
quiquelljfrancoa: Ok, so it's part of the nature of the feature sets ok, kind of blurry.14:12
jfrancoaquiquell: it's a tripleo-upgrade specific parameter that's why I don't think it's a good practice to add it in the release file14:13
quiquelljfrancoa: ack14:13
quiquelljfrancoa: Do you have to change also http://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/tree/playbooks/multinode-overcloud-update.yml ?14:15
quiquellor this is obsolete ?14:15
jfrancoaquiquell: oh, good point. Can you please add a comment in the patch?14:15
jfrancoaquiquell: thanks for the catch14:16
quiquelljfrancoa: sure14:16
quiquelljfrancoa: Happy to give you my -1 :-P14:17
jfrancoaquiquell: I accept it with love :-D14:17
*** quiquell has quit IRC14:21
*** myoung has joined #oooq14:25
rascasoftmyoung, hey hi! Are you able to share with me sometime to define the status of jobs/logs in the pipelines? I can't figure it out by myself14:27
myoungrascasoft: i can, but i'm booked this morning, TC call at 1114:28
rascasoftmyoung, yeah, me too14:28
myoungrascasoft: i have not been online since friday, so i'm a little out of touch, was holiday here yesterday14:28
rascasoftmyoung, I know, but it should be fairly simple14:28
sshnaidmrlandy, well, tried to rebase now and it has conflicts, especially in toci_gate_test.sh.j2. But resolving is not so hard..14:38
rlandysshnaidm: ok - as you see fit14:39
sshnaidmrlandy, will try to post a patch that resolves it.. but I don't understand some changes in your patch14:39
rlandycommunity meeting? off?14:39
rlandymyoung?14:40
sshnaidmrlandy, https://review.openstack.org/#/c/596422/32/playbooks/tripleo-ci/templates/toci_gate_test.sh.j214:40
*** dtrainor has joined #oooq14:42
myoungrlandy: #tripleo meeting still going on, I'm going to have to bail for TC meeting at 11, and I didn't announce this week...with holiday and such wasn't sure.  We can have it if there's stuff to talk about.  joining.14:43
marios|roverrlandy: same as above but tripleo is winding down joining now sorry didn't realise you started14:53
rlandymarios|rover: community meeting is off - we are just chatting there re: reparent stuff14:53
marios|roverrlandy: ack i know ssbarnea and sshnaidm had some beef to discuss14:54
marios|rover;)14:54
marios|rover(too i mean)14:54
marios|roverfighting bluejeans14:54
ssbarneamarios|rover: ... I would prefer to keep the beef for a steak and do some work instead.14:57
ssbarneasshnaidm: regarding the "beef", I am ready to give up, not because I was convinced that we shouldn't cherry pick but because you did a better job: you raised a bug, I didn't when I made the original CRs, which makes it harder to track the reasons and history. Also, I've seen the cherry pick failure and I agree that it is a serious inconvenience.15:03
rlandysshnaidm: marios|rover: https://github.com/openstack/tripleo-heat-templates/blob/master/ci/common/net-config-multinode-os-net-config.yaml15:03
ssbarneamarios|rover: regarding gate checking, I am a bit overwhelmed, the best result so far was to see 1/6 passing.15:07
ssbarneahttps://review.openstack.org/#/q/topic:gate-check+status:open15:07
ssbarneamarios|rover: for https://review.openstack.org/#/c/599358/ do you think it makes sense to raise a bug, or would that be enough? I am inclined to say that it does not need a new bug.15:10
marios|roverssbarnea: do you have time to join?15:14
marios|roverhttps://bluejeans.com/705085945515:14
ssbarneasure15:15
marios|roverrlandy: which bug were you looking at / were we discussing before, please?15:18
marios|roverrlandy: i missed the bug number15:18
rlandymarios|rover: https://bugs.launchpad.net/tripleo/+bug/178929415:19
openstackLaunchpad bug 1789294 in tripleo "RDO Cloud jobs move to zuulv3 native is blocked by legacy dependencies" [High,Triaged] - Assigned to Ronelle Landy (rlandy)15:19
marios|roverrlandy: thanks15:19
rlandymarios|rover; thanks for your help there15:19
marios|roverrlandy: didn't do nuffin :)15:20
myoungssbarnea, marios|rover, is it known/understood already that rdo2 slaves (rdo-manager-64) all seem to be down/offline?15:25
marios|rovermyoung: rdo was down earlier i believe15:27
sshnaidmrlandy, https://review.openstack.org/#/c/599206/15:27
marios|roversshnaidm: right? did you say rdo cloud was down at some point15:27
* marios|rover checks mail15:27
marios|rovermyoung: we are still on the community call btw15:28
marios|roverbut finishing now15:28
sshnaidmrlandy, https://review.openstack.org/#/c/599460/15:28
myoungmarios|rover: didn't realize, I'm on TC global call15:28
sshnaidmmarios|rover, ^^15:28
sshnaidmmarios|rover, mm, not aware of that, but jobs were failing15:30
sshnaidmmarios|rover, now they should be ok15:30
marios|roverack sshnaidm thanks myoung sorry not down just some jobs had issues15:31
myounghttps://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/computer15:33
marios|roverrlandy: myoung fyi since i'm almost out waiting for this https://review.openstack.org/#/c/595527/ should hopefully fix timeout on ocata/queens jobs15:34
ssbarnearlandy: added as https://bugs.launchpad.net/tripleo/+bug/1790667 and linked to both reviews, we should collaborate there to find a better solution for repro cherry picking.15:35
openstackLaunchpad bug 1790667 in tripleo "reproducer-quickstart.sh fails to cherry pick changes made to tripleo-quickstart(-extras)" [Medium,Confirmed] - Assigned to Sorin Sbarnea (ssbarnea)15:35
sshnaidmrlandy, so, please take a look at rebased patch https://review.openstack.org/#/c/599628/ in your time. Possibly it'll require some additional work..15:37
*** sshnaidm is now known as sshnaidm|afk15:37
* myoung heads to punishment...err... s/punishment/dentist15:38
*** myoung is now known as myoung|bbl15:38
*** saneax has quit IRC15:45
*** ccamacho has quit IRC15:53
rlandysshnaidm|afk: ack - will do15:57
rlandyssbarnea: thanks15:58
rlandyI see the bug15:59
*** holser_ has quit IRC16:15
*** jfrancoa has quit IRC16:31
*** kopecmartin has quit IRC16:36
*** trown is now known as trown|lunch16:39
*** dsneddon has quit IRC16:50
*** udesale has quit IRC16:51
ssbarneasshnaidm|afk: https://review.rdoproject.org/r/#/c/15914/ when you have a few minutes, please.17:02
*** ykarel has joined #oooq17:03
ssbarneamarios|rover: https://bugs.launchpad.net/tripleo/+bug/1790685 with bugfix linked to it.17:11
openstackLaunchpad bug 1790685 in tripleo "creating host via 'add_host': hostname=ip_address" [High,In progress] - Assigned to Sorin Sbarnea (ssbarnea)17:11
*** vinaykns has quit IRC17:26
*** dsneddon has joined #oooq17:34
*** dsneddon has quit IRC17:51
*** trown|lunch is now known as trown18:03
*** ykarel is now known as ykarel|away18:13
*** dtantsur is now known as dtantsur|afk18:21
*** dsneddon has joined #oooq18:22
*** myoung|bbl is now known as myoung19:46
*** holser_ has joined #oooq20:26
*** holser_ has quit IRC20:49
*** trown is now known as trown|outtypewww21:03
*** matbu has joined #oooq21:20
*** matbu has quit IRC21:25
*** matbu has joined #oooq21:35
*** rlandy is now known as rlandy|bbl22:45
*** tosky has quit IRC23:48
