*** tosky has quit IRC | 00:39 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-scenario001-multinode-oooq-container, tripleo-ci-fedora-28-standalone, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario004-multinode-oooq-container, tripleo-ci-centos-7-containers-multinode, tripleo- (3 more messages) | 00:56 |
---|---|---|
*** agopi|brb has quit IRC | 01:09 | |
*** agopi|brb has joined #oooq | 01:10 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-scenario001-multinode-oooq-container, tripleo-ci-fedora-28-standalone, tripleo-ci-centos-7-scenario007-multinode-oooq-container, tripleo-ci-centos-7-scenario004-multinode-oooq-container, tripleo-ci-centos-7-containers-multinode, tripleo- (3 more messages) | 02:56 |
*** udesale has joined #oooq | 03:45 | |
*** sshnaidm is now known as sshnaidm|afk | 04:07 | |
*** ykarel has joined #oooq | 04:23 | |
*** chkumar has joined #oooq | 04:34 | |
*** ykarel has quit IRC | 04:42 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-scenario001-multinode-oooq-container, tripleo-ci-fedora-28-standalone, tripleo-ci-centos-7-scenario007-multinode-oooq- (3 more messages) | 04:56 |
*** ykarel has joined #oooq | 05:02 | |
*** chkumar is now known as chkumar|ruck | 05:27 | |
*** chkumar has joined #oooq | 05:44 | |
*** chkumar|ruck has quit IRC | 05:47 | |
*** chkumar has quit IRC | 05:48 | |
*** ratailor has joined #oooq | 05:59 | |
*** chkumar246 has joined #oooq | 06:03 | |
*** apetrich has joined #oooq | 06:06 | |
*** chkumar246 is now known as chkumar|ruck | 06:44 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-scenario001-multinode-oooq-container, tripleo-ci-fedora-28-standalone, tripleo-ci-centos-7-scenario007-multinode-oooq- (3 more messages) | 06:56 |
*** jbadiapa has joined #oooq | 06:58 | |
*** jfrancoa has joined #oooq | 07:00 | |
*** quiquell|off is now known as quiquell | 07:11 | |
*** skramaja has joined #oooq | 07:25 | |
quiquell | marios: I don't want to workflow my stuff here https://review.openstack.org/#/c/619468/ | 07:26 |
*** ykarel is now known as ykarel|lunch | 07:26 | |
quiquell | I think it's bad practice | 07:27 |
quiquell | sshnaidm|afk: Can you workflow this https://review.openstack.org/#/c/619468/ ? | 07:29 |
marios | quiquell: checking | 07:29 |
marios | quiquell: oh | 07:30 |
marios | quiquell: yeah but its early still | 07:30 |
marios | sure we can get a vote before lunch :) ? | 07:30 |
marios | quiquell: i already voted there | 07:30 |
quiquell | marios: Yep let's wait for Sagi to workflow this, I think he is the next early one | 07:30 |
quiquell | with power | 07:30 |
marios | yeah IMO avoid the +2a or even +2 your own patch unless emergeycny | 07:31 |
marios | emergency | 07:31 |
quiquell | this is not emergency is just feature | 07:31 |
quiquell | 'just' | 07:31 |
marios | right and not controversial so we should be able to merge it today if we offer enough blood sacrifice for zuul | 07:31 |
*** jtomasek has joined #oooq | 07:47 | |
*** ccamacho has quit IRC | 08:01 | |
*** ccamacho has joined #oooq | 08:01 | |
*** gkadam has joined #oooq | 08:07 | |
*** saneax has joined #oooq | 08:10 | |
*** ykarel|lunch is now known as ykarel | 08:14 | |
*** amoralej|off is now known as amoralej | 08:31 | |
*** chem has joined #oooq | 08:43 | |
*** tosky has joined #oooq | 08:46 | |
arxcruz | sshnaidm|afk: chkumar|ruck please take a look at https://review.rdoproject.org/r/#/c/17437/ | 08:53 |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-scenario010-multinode-oooq-container, tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-scenario001-multinode-oooq-container, tripleo-ci-fedora-28-standalone, tripleo-ci-centos-7-scenario007-multinode-oooq- (3 more messages) | 08:57 |
chkumar|ruck | arxcruz, hello | 09:01 |
chkumar|ruck | arxcruz, https://review.rdoproject.org/r/#/c/17437/ please send a dummy patch like this https://review.rdoproject.org/r/17339 and call this new job there so that we can see it in action | 09:02 |
arxcruz | ack | 09:03 |
*** holser_ has joined #oooq | 09:04 | |
chkumar|ruck | arxcruz, what about running cinder/keystone/neutron/horizon tempest plugins tests also? then it has full coverage | 09:05 |
*** jschluet has joined #oooq | 09:19 | |
chkumar|ruck | ssbarnea|bkp2, quiquell Is someone looking into f28 standalone post_failure issue http://logs.openstack.org/92/619492/2/check/tripleo-ci-fedora-28-standalone/0f91b7f/job-output.txt.gz ? | 09:23 |
quiquell | chkumar|ruck: not yet, could be related to python3 | 09:24 |
*** udesale has quit IRC | 09:25 | |
*** jschlueter has quit IRC | 09:25 | |
*** kopecmartin|off is now known as kopecmartin | 09:25 | |
*** udesale has joined #oooq | 09:26 | |
*** chkumar|ruck has quit IRC | 09:29 | |
*** chkumar246 has joined #oooq | 09:31 | |
*** chkumar246 is now known as chkumar|ruck | 09:31 | |
*** jaosorior has joined #oooq | 09:38 | |
*** derekh has joined #oooq | 09:38 | |
quiquell | ssbarnea|bkp2: this is good now https://review.openstack.org/#/c/619518/ | 09:54 |
quiquell | marios: ^ documentation on pre-commit | 09:54 |
marios | quiquell: ack. | 09:56 |
*** bogdando has joined #oooq | 10:02 | |
*** sshnaidm|afk is now known as sshnaidm | 10:07 | |
quiquell | sshnaidm: good morning, added a tox -e zuul to start zuul; it also checks the API to wait until it really starts | 10:08 |
quiquell | sshnaidm: Also can you workflow this https://review.openstack.org/#/c/619468/ for standalone scenarios ? | 10:09 |
sshnaidm | quiquell, do you want to override docker/podman? | 10:10 |
sshnaidm | quiquell, why not to do it in featureset? | 10:10 |
quiquell | sshnaidm: we need to use docker at scenarios | 10:11 |
quiquell | sshnaidm: but we don't want to replicate featureset052 | 10:11 |
sshnaidm | quiquell, why not replicate it? | 10:12 |
sshnaidm | quiquell, I'm just afraid we start overuse this overriding, it was supposed to be only for tempest tests | 10:12 |
quiquell | sshnaidm: to reduce redundancy, but panda|rover says it's not a good idea to reuse fs052 | 10:12 |
quiquell | sshnaidm: yep, panda|rover says so, maybe we don't want | 10:12 |
sshnaidm | quiquell, well, if it's only replicating one featureset I think it's fine.. | 10:13 |
quiquell | sshnaidm: well, we are supposed to use podman in the future, so this will be deleted | 10:13 |
quiquell | sshnaidm: so maybe we can create another featureset for scenarios ? | 10:13 |
quiquell | marios: ^ new featureset for scenarios | 10:14 |
sshnaidm | quiquell, we did it for containerized scenarios, from 004 to 016 for example | 10:15 |
quiquell | sshnaidm: ack, then we do the same, looks like it was a pain in the past to reuse featuresets; you didn't know what you were executing | 10:15 |
quiquell | sshnaidm: we are also overriding the environment file for standalone, maybe we don't want that either | 10:16 |
quiquell | sshnaidm: and we just use featureset_override for tempest stuff | 10:16 |
quiquell | sshnaidm: we can comment on the scrum and we decide | 10:16 |
quiquell | panda|rover: ^ | 10:16 |
sshnaidm | quiquell, yeah, I think it's good point to discuss it in scrum today | 10:16 |
quiquell | sshnaidm: ok let's not workflow any of this yet | 10:17 |
chkumar|ruck | sshnaidm, Hello | 10:23 |
chkumar|ruck | sshnaidm, regarding this bug https://bugs.launchpad.net/tripleo/+bug/1805094 | 10:23 |
openstack | Launchpad bug 1805094 in tripleo "[master][rocky] No kolla logs getting collected in periodic-tripleo-centos-7-master-containers-build " [Critical,Triaged] - Assigned to chandan kumar (chkumar246) | 10:23 |
chkumar|ruck | sshnaidm, https://review.rdoproject.org/r/#/c/17204/1/playbooks/tripleo-ci-periodic-base/containers-build.yaml is not working | 10:24 |
chkumar|ruck | sshnaidm, can we do something here, always copy the logs whether the job failed or passed | 10:25 |
chkumar|ruck | ? | 10:25 |
chkumar|ruck | let me put a patch | 10:25 |
marios | quiquell: why no i prefer to use overrides | 10:27 |
marios | quiquell: why do we want 10 new fs | 10:27 |
marios | to change 2 things | 10:27 |
*** skramaja_ has joined #oooq | 10:29 | |
quiquell | marios: Looks like it was like that before | 10:30 |
quiquell | marios: At the end we are tripleo CI and tripleo quickstart | 10:30 |
quiquell | marios: If someone want to exercise the job without ci, featureset is the place | 10:30 |
*** skramaja has quit IRC | 10:30 | |
quiquell | marios: I mean exercise not the job the scenario | 10:30 |
quiquell | marios: I mean running quickstart for standalone scenario001 for example | 10:30 |
panda|rover | featuresets were implemented for a variety of reasons, and they demonstrated to be an excellent way to protect us from CI misuse. They are the ultimate source of truth to understand what a job is doing, a fixed set of features in a combination that we support | 10:31 |
quiquell | marios: You just pass @featuresetfoobar.yaml to quickstart and that's it | 10:31 |
quiquell | panda|rover: sshnaidm agree and I think I agree too | 10:31 |
quiquell | panda|rover: maybe we can fix redundancy with another mechanism | 10:31 |
quiquell | panda|rover: we are not workflowing overrides until scrum meeting | 10:32 |
panda|rover | one of the main reasons was when we translated tripleo.sh jobs into quickstart, I spent 2 months looking at the logic in the bash script to understand exactly what the HA job was doing | 10:32 |
marios | 12:31 < panda|rover> featuresets were implemented for a variety of reasons, and they demonstrated to be an excellent way to protect us from CI misuse. They are the ultimate source of truth to understand what a job is doing, a fixed set of features in a combination that we support | 10:32 |
marios | i don't see how that conflicts with what is proposed ^ | 10:32 |
quiquell | marios: with override you need to know that part of the stuff is in the job | 10:33 |
marios | the featureset for the job scenario-standalon1 is 52, plus these two overrides | 10:33 |
panda|rover | with this 1 featureset maps to 4 jobs | 10:33 |
quiquell | marios: and also how the override works | 10:33 |
quiquell | marios: folks have to be able to run standalone with scenario001 without a clue on CI | 10:33 |
marios | quiquell: yeah i accept the ci/quickstart distinction i.e. using it outside of ci. i guess ceph is one example | 10:33 |
panda|rover | you have to look at the zuul configuration and the playbook code to understand what you are running on your job | 10:34 |
marios | cos gfidente was asking me about kicking those jobs in ceph-ansible | 10:34 |
quiquell | marios: For me at least, that is what makes panda|rover's point clear | 10:34 |
marios | panda|rover: that is the same currently | 10:34 |
marios | panda|rover: you have to look at a featureset 052 and then in tht there is an environment file multinode-containers whatever its called | 10:34 |
quiquell | panda|rover: still maybe it's not bad to have a mechanism to reduce redundancy within tripleo-quickstart | 10:34 |
marios | panda|rover: i mean even without featureset override | 10:34 |
marios | 12:34 < panda|rover> you have to look at the zuul configuration and the playbook code to understand what you are running on your job | 10:35 |
marios | this still holds ^ | 10:35 |
marios | panda|rover: and i don't see what the environment file has to do with featureset override. we still have to specify that env file. you're just saying we need a new featureset for that | 10:36 |
panda|rover | as far as I know the multinode env files are for the environment in which we run the featureset, they don't alter the set of features we want to test | 10:36 |
marios | panda|rover: right they specify services, again same in both cases | 10:36 |
panda|rover | the environment for standalone is an argument to openstack overcloud deploy, right? | 10:37 |
marios | panda|rover: ie. right now https://github.com/openstack/tripleo-heat-templates/blob/master/ci/environments/scenario001-multinode-containers.yaml | 10:37 |
marios | this is used ^ | 10:37 |
marios | to specify services | 10:37 |
panda|rover | yes, these are arguments to pass to the overcloud deploy | 10:38 |
marios | in our job, we are defining another env file to be used which also defines just services e.g. https://review.openstack.org/619504 for scenario 4 | 10:38 |
*** skramaja_ is now known as skramaja | 10:38 | |
marios | well, scrum in like 2.3 hours so we can shout at each other on camera in a bit :) | 10:39 |
panda|rover | give me your address, I'll send you a Howler | 10:40 |
panda|rover | these types of environments are values for variables in different featureset files, what we did until now was that fsx, has scenario: scenarioY so featureset and scenarios are tied | 10:43 |
panda|rover | with zuul we have less need for the featuresets | 10:44 |
panda|rover | and we can discuss removing or altering them | 10:44 |
panda|rover | but the idea of having a single file that contains all the configurations for a specific job should still remain, that's what we realized was something to protect in the past | 10:45 |
panda|rover | you look at a single file and understand at a glance and without looking at anything else, what are the switches activated in a particular job | 10:45 |
panda|rover | also as a warning for the developers: "you can test whatever combination you want, but know that in CI we are currently testing ONLY this set of combinations, if you do something different and have trouble we won't be able to provide as much help" | 10:46 |
marios | panda|rover: but you can't do that currently is my point | 10:46 |
marios | all of this holds | 10:46 |
marios | panda|rover: currently you need to check the featureset and then also https://github.com/openstack/tripleo-heat-templates/blob/master/ci/environments/scenario001-multinode-containers.yaml for example | 10:47 |
marios | that remains the same just different file... | 10:47 |
marios | ? | 10:47 |
panda|rover | marios: maybe I misunderstood your proposal, what I understand right now, is that fs052 can map to four standalone scenarios | 10:47 |
marios | panda|rover: right. possibly all of them | 10:47 |
marios | panda|rover: so its like 10 featuresets | 10:47 |
marios | panda|rover: where we override the same thing just 2 lines | 10:47 |
marios | even for 1 / 4 with ceph we are using the same fs52 | 10:48 |
marios | ie. we didn't need some new thing to warrant a new fs | 10:48 |
panda|rover | we know for the stat of the design that we needed to replicate all the featuresets even for a single different value, because it maps to a different job | 10:49 |
panda|rover | start* | 10:49 |
panda|rover | if we are not doing this, then there is something that takes precedence over the featuresets | 10:49 |
panda|rover | and that's what we didn't want | 10:49 |
panda|rover | featureset has the last word | 10:49 |
sshnaidm | marios, panda|rover, quiquell one of questions is - do we need to run podman or docker on queens/rocky/stein jobs? | 10:49 |
panda|rover | sshnaidm: I don't think so, podman is only for RHEL8 and so master | 10:50 |
quiquell | sshnaidm: scenarios are not prepared for podman, queens, rocky or stein | 10:50 |
panda|rover | I don't think they're back porting podman to RHEL7 | 10:50 |
sshnaidm | panda|rover, so it will be supported from stein, right? | 10:51 |
panda|rover | sshnaidm: I think so | 10:51 |
sshnaidm | then we'll need to run both docker and podman jobs, docker for previous branches | 10:51 |
sshnaidm | and this is better to do with featuresets, not with overrides.. | 10:51 |
sshnaidm | like we did with non-containerized and containerized jobs | 10:52 |
quiquell | sshnaidm: so -podman -docker jobs | 10:53 |
panda|rover | to me the feature we want to test is "container", docker or podman is an implementation, and should be set depending on release, but we always had difficulty in separating the switches in featuresets from their configurations | 10:53 |
quiquell | panda|rover, sshnaidm: so if we have fs per scenario docker has to be podman from stein and beyond | 10:53 |
chkumar|ruck | panda|rover, sshnaidm from stein we are switching to podman and there is a var for the same undercloud_container_cli which we have used in validate-tempest role | 10:53 |
quiquell | that's it ? | 10:53 |
sshnaidm | panda|rover, you can also say that feature is scenario and container is implementation | 10:54 |
chkumar|ruck | if this var is not there it assumes we have used docker | 10:54 |
panda|rover | sshnaidm: yeah, because when we move forward, what previously was a feature to test, it's now the default, and starts to make less sense to have the same default | 10:55 |
sshnaidm | just think about configuring periodic and patch jobs for next branches, we'll need to put "override" in all job definitions there to run podman | 10:55 |
panda|rover | uh, that did make more sense in my mind | 10:55 |
panda|rover | yeah agree, that's also one of the reasons why in the featureset distribution etherpad I ask to not leave holes in the featureset numbers | 10:56 |
panda|rover | we need to leave the featureset for the older jobs alone | 10:56 |
panda|rover | because they represent older jobs | 10:56 |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/604298, master: tripleo-ci-centos-7-scenario002-multinode-oooq-container, tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-scenario009-multinode-oooq, tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates, tripleo- (2 more messages) | 10:57 |
panda|rover | marios: beside all this, I'm really so sorry to not have caught this at the start of the sprint | 10:57 |
panda|rover | marios: now it feels like you're almost there and we start making complaints | 10:58 |
*** udesale has quit IRC | 10:58 | |
panda|rover | it sucks | 10:59 |
quiquell | panda|rover: let's just make it right, even if it's not very sprint friendly | 11:00 |
panda|rover | it's not even team member friendly, the sprint is not the important thing, the thing is the satisfaction to have something completed at the end of a short period | 11:00 |
panda|rover | that's why I always push to plan US to be completed on a single sprint, mentally you close tabs from the browser in your brain and free memory | 11:01 |
panda|rover | this is all that matters for the "sprint" not the sprint itself. | 11:02 |
quiquell | marios: reproduce the linting issue | 11:18 |
quiquell | marios: do you want a tmate session to try to fix it ? | 11:22 |
quiquell | marios: got it | 11:25 |
quiquell | marios: problem is that we are running ansible-lint at venv so we don't see system wide packages | 11:25 |
quiquell | :-/ | 11:25 |
quiquell | Or this is what I think | 11:25 |
quiquell | we have to use ansible-lint from RPM not from pip | 11:27 |
quiquell | humm not working either | 11:31 |
marios | panda|rover: lets talk more on scrum if the consensus is to get new fs we can do it its one more easy review (copy/paste) per job | 11:37 |
marios | quiquell: sure, sec | 11:37 |
panda|rover | chkumar|ruck: how's going ? | 11:40 |
*** holser_ is now known as holser|lunch | 11:41 | |
*** rfolco has joined #oooq | 11:54 | |
chkumar|ruck | panda|rover, on friday, we have promotion for master and rocky | 11:55 |
chkumar|ruck | panda|rover, today container build master failed due to container selinux issue | 11:55 |
chkumar|ruck | panda|rover, I am waiting for next run and rest is ok | 11:55 |
chkumar|ruck | panda|rover, oh pike got promoted also | 11:57 |
chkumar|ruck | panda|rover, please have a look at this bug https://bugs.launchpad.net/tripleo/+bug/1805102/ | 11:57 |
openstack | Launchpad bug 1805102 in tripleo "[master][rocky][fs02] ERROR! Unexpected Exception, this is probably a bug: No module named tripleo_common in upload job while converting image" [Critical,Triaged] | 11:57 |
chkumar|ruck | panda|rover, currently I am dealing with this one https://bugs.launchpad.net/tripleo/+bug/1805094 | 11:57 |
openstack | Launchpad bug 1805094 in tripleo "[master][rocky] No kolla logs getting collected in periodic-tripleo-centos-7-master-containers-build " [Critical,Triaged] - Assigned to chandan kumar (chkumar246) | 11:57 |
chkumar|ruck | panda|rover, on rocky side fs01/02 blocked on this https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-rocky-upload/9d18325/logs/undercloud/var/log/containers/nova/nova-compute.log.txt.gz?level=ERROR#_2018-11-25_13_54_39_585 | 12:02 |
panda|rover | promotion bonanza! | 12:02 |
chkumar|ruck | panda|rover, I have no idea what to do with this? | 12:02 |
chkumar|ruck | panda|rover, vexhost fs01 rocky/master is passing in check queue | 12:02 |
*** hubbot1 has quit IRC | 12:02 | |
panda|rover | chkumar|ruck: do we have a bug for that ? I don't think there's much we can do other than show it to some nova guy | 12:04 |
chkumar|ruck | panda|rover, https://bugs.launchpad.net/tripleo/+bug/1801587 | 12:04 |
openstack | Launchpad bug 1801587 in tripleo "[master/Rocky]Fs035 job fails in promotion becasue of heat stack timeout" [Critical,Triaged] | 12:04 |
*** hubbot1 has joined #oooq | 12:05 | |
panda|rover | mmmhh ... | 12:06 |
*** ratailor has quit IRC | 12:10 | |
quiquell | marios: ansible-pacemaker is an RDO thing, so we build it, I think the patch is just wrong | 12:15 |
quiquell | marios: as this is not default for ansible | 12:15 |
quiquell | marios: going to change the distgit | 12:16 |
quiquell | marios: https://github.com/rdo-packages/ansible-pacemaker-distgit/blob/rpm-master/ansible-pacemaker.spec#L48 | 12:16 |
quiquell | marios: defaults are https://docs.ansible.com/ansible/2.7/dev_guide/developing_locally.html | 12:18 |
chkumar|ruck | panda|rover, did we get a chance to look at rdo phase 1 master jobs? | 12:21 |
panda|rover | chkumar|ruck: from friday evening to now ? | 12:22 |
panda|rover | chkumar|ruck: looks like it worked .. | 12:23 |
panda|rover | chkumar|ruck: https://ci.centos.org/view/rdo/view/promotion-pipeline/job/rdo_trunk-promote-master-current-tripleo/ | 12:23 |
chkumar|ruck | panda|rover, yes | 12:23 |
chkumar|ruck | panda|rover, I think we need to remove this job https://ci.centos.org/view/rdo/view/promotion-pipeline/job/rdo_trunk-promote-ocata-current-tripleo/ | 12:24 |
chkumar|ruck | as we are removing ocata jobs already | 12:24 |
*** apetrich has quit IRC | 12:29 | |
*** apetrich has joined #oooq | 12:41 | |
marios | quiquell lgtm nice on the distgit but still not fully clear on why now | 12:46 |
quiquell | marios: is clear now, looks like before standalone we run ansible using mistral | 12:46 |
quiquell | marios: and at mistral there is the module_path option that was set | 12:46 |
marios | quiquell: ah right so standalone is running ansible directoy | 12:46 |
marios | directly | 12:46 |
quiquell | marios: standalone just run directly ansible-playbook from python-tripleoclient | 12:46 |
marios | bypass mistral | 12:46 |
marios | this is kinda 'violation' in the sense that everything else is using mistral | 12:47 |
marios | but ok makes more sense | 12:47 |
quiquell | marios: yep, so we change distgit, which I think is not going to break anything | 12:47 |
quiquell | marios: or we change python-tripleoclient (used also by mistral :-/) | 12:47 |
quiquell | marios: --module-path prepend stuff so is not going to override | 12:47 |
ykarel | panda|rover, can u check my comment https://review.openstack.org/#/c/618669/3..7/playbooks/tripleo-ci/post.yaml and confirm the issue | 12:54 |
*** holser|lunch is now known as holser_ | 12:56 | |
ykarel | panda|rover, i mean if run.yml fails before running collect logs, logs will be missed, i think for that u need to rename the script to something else (just after it's successfully run in ovb) so it's not run again | 12:56 |
panda|rover | people, still need review on https://review.openstack.org/618669 https://review.openstack.org/607288 https://review.openstack.org/617617 | 12:57 |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/604298, master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/567224 | 12:57 |
*** rlandy has joined #oooq | 12:57 | |
panda|rover | ykarel: it's complicated | 12:58 |
panda|rover | ykarel: we would really need to split the logs collection in two parts | 12:58 |
panda|rover | ykarel: one to collect only overcloud nodes, the other to collect the rest | 12:58 |
panda|rover | ykarel: I understand what you're saying, if the run times out we can run in post so we can at least have undercloud logs | 12:59 |
*** dtantsur|afk is now known as dtantsur|mtg | 12:59 | |
panda|rover | we could try to check if we have the artifacts collect logs should produce | 13:02 |
rfolco | connection is bad, let me check my wifi points | 13:03 |
chkumar|ruck | panda|rover, please have a look at #rhos-ops internal | 13:03 |
*** weshay_pto is now known as weshay | 13:05 | |
marios | quiquell: and rfolco please can one each qe my scen1/4 | 13:13 |
marios | ? | 13:13 |
quiquell | marios: Better panda|rover or sshnaidm so we follow what they have in mind regarding that | 13:14 |
marios | quiquell: ack. i added myself https://tree.taiga.io/project/tripleo-ci-board/us/337?milestone=206481 please re-assign if you find someone to sell it to | 13:15 |
quiquell | marios: ack | 13:16 |
weshay | panda|rover, chkumar|ruck thanks for holding things down | 13:18 |
panda|rover | weshay: most of the time, chkumar|ruck already did everything before I could even wake up. | 13:19 |
weshay | I see the board says there was a promotion, however the hash looks old to me | 13:19 |
weshay | panda|rover, quiquell 11/19 https://trunk.rdoproject.org/centos7-master/current-tripleo/delorean.repo | 13:19 |
weshay | er.. sorry quiquell meant chkumar|ruck | 13:20 |
panda|rover | it was too good to be true ... | 13:21 |
* chkumar|ruck is confused | 13:21 | |
panda|rover | checking the promotion server | 13:23 |
*** trown|outtypewww is now known as trown | 13:27 | |
panda|rover | cloud image and container images are updated properly | 13:30 |
marios | rlandy: * repro/zuul3: failed like http://pastebin.test.redhat.com/672514 indeed @ toci_gate_test (as far as run-v3) but continue tomorrow. | 13:31 |
chkumar|ruck | panda|rover, each of the logs http://38.145.34.55/master.log-20181122 at the end does print success? on successful promotion | 13:32 |
weshay | panda|rover, w/ 3ed8ac0e93367a02ad53d9fa93467057724b6621_fd8eb74b | 13:32 |
chkumar|ruck | weshay, I have not looked at the promotion server | 13:32 |
weshay | panda|rover, I don't see that hash here https://trunk.rdoproject.org/centos7-master/report.html | 13:33 |
*** gouthamr has quit IRC | 13:34 | |
panda|rover | these are the promoted images https://images.rdoproject.org/master/rdo_trunk/618d3ab83cd319e03fac86c1d6de510ef4a5134b_be9e0d5c/ | 13:35 |
panda|rover | this 618d3ab83cd319e03fac86c1d6de510ef4a5134b_be9e0d5c is the promotion hash | 13:35 |
panda|rover | I don't see any exception in the logs, so I have to assume the call to DLRN api to promote this hash succeded | 13:35 |
panda|rover | but there was no change in the repo | 13:35 |
weshay | panda|rover, maybe the infra guys did something, because we should not have seen a promote in http://rhos-release.virt.bos.redhat.com:3030/rhosp if the dlrn hash did not update | 13:39 |
weshay | panda|rover, oh ya.. | 13:40 |
*** gouthamr has joined #oooq | 13:40 | |
weshay | http://rhos-release.virt.bos.redhat.com:3030/rhosp | 13:41 |
weshay | panda|rover, check it out.. master moved back to yellow | 13:41 |
weshay | so I think something changed | 13:41 |
weshay | it was green 2days last night iirc | 13:41 |
panda|rover | weshay: asking in prodinfra channel | 13:42 |
chkumar|ruck | weshay, panda|rover is it something problem with master only? as rocky looks good from this dashboard http://rhos-release.virt.bos.redhat.com:3030/rhosp | 13:42 |
panda|rover | chkumar|ruck: to be really sure we have to check if the hash is the same in containers, images, and repo | 13:43 |
chkumar|ruck | adding a todo, will script it | 13:43 |
ykarel | weshay, because master became consistent today | 13:44 |
*** hubbot1 has quit IRC | 13:49 | |
ykarel | panda|rover, ack for the robust plan of collecting overcloud, undercloud logs separately, for the current situation, changing the script name would not help? | 13:50 |
*** dmellado has quit IRC | 13:51 | |
*** hubbot1 has joined #oooq | 13:51 | |
panda|rover | ykarel: I don't fully understand the idea of different names, but it would be too implicit | 13:52 |
ykarel | panda|rover, i meant post is looking for a file collect_logs.sh to run, if the file not exist(we rename after successful run in run.yml) don't run | 13:53 |
*** dmellado has joined #oooq | 13:53 | |
*** agopi|brb has quit IRC | 13:54 | |
ykarel | so in run.yml success case(collect_logs.sh is run and renamed to collect_logs_ovb.sh), and fail case(collect_logs.sh is not executed in run.yml and then executed in post to collect undercloud logs) | 13:55 |
bogdando | quiquell: hi, PTAL https://review.openstack.org/#/q/topic:base-container-reduction+(status:open+OR+status:merged) | 14:06 |
bogdando | you were asking if we can remove puppet things from the base layers | 14:06 |
chkumar|ruck | panda|rover, weshay I am heading home now, see ya tomorrow :-) | 14:14 |
*** chkumar|ruck has quit IRC | 14:14 | |
quiquell | bogdando: ack give me a sec | 14:16 |
*** agopi|brb has joined #oooq | 14:17 | |
*** agopi|brb is now known as agopi | 14:17 | |
*** zul has joined #oooq | 14:25 | |
*** skramaja has quit IRC | 14:38 | |
*** udesale has joined #oooq | 14:48 | |
*** gfidente has joined #oooq | 14:56 | |
weshay | marios, I can meet quickly | 14:56 |
weshay | marios, scratch that | 14:57 |
weshay | marios, let's chat tomorrow | 14:57 |
*** quiquell is now known as quiquell|off | 14:58 | |
marios | weshay: sure np | 14:58 |
marios | weshay: or im around for at least another hour | 14:58 |
marios | whatever just ping me | 14:58 |
panda|rover | weshay: promotion is legit, chandan repeated the tests on a hash that was created on the 19th, and it was promoted Friday, that's why it seems older than the promotion. So Monday's hash was promoted on Friday | 15:01 |
marios | rlandy: want to talk ? | 15:05 |
rlandy | yep - give me 5 mins to submit review | 15:05 |
marios | ack np whenever you're ready | 15:05 |
panda|rover | ykarel: sorry was in meeting | 15:06 |
ykarel | panda|rover, ack | 15:07 |
ykarel | panda|rover, anything to discuss? i have to leave now | 15:07 |
panda|rover | ykarel: we can discuss it tomorrow, to me the rename is an hack | 15:08 |
ykarel | panda|rover, ack | 15:08 |
ykarel | yup agree it's a hack | 15:08 |
panda|rover | ykarel: let's see if I can come up with the same idea but more explicit implementation | 15:08 |
panda|rover | and conditionals | 15:08 |
ykarel | panda|rover, ack | 15:09 |
ykarel | panda|rover, and for the master promotion issue, the master repo was not consistent for the last couple of days, so the periodic run and chandan's explicit run tested the same hash: https://trunk-primary.rdoproject.org/api-centos-master-uc/api/civotes_detail.html?commit_hash=618d3ab83cd319e03fac86c1d6de510ef4a5134b&distro_hash=be9e0d5ccc3bd5a194c7b77587223e48b8469219&offset=0 | 15:09 |
ykarel | master repo became consistent today | 15:10 |
*** ykarel is now known as ykarel|away | 15:10 | |
ykarel|away | with the merge of https://review.rdoproject.org/r/#/c/17410/ | 15:10 |
* ykarel|away out | 15:11 | |
panda|rover | yep | 15:12 |
panda|rover | ok | 15:12 |
agopi | ping panda|rover | 15:14 |
*** chkumar|away has joined #oooq | 15:14 | |
panda|rover | agopi: pong | 15:16 |
*** ykarel|away has quit IRC | 15:16 | |
agopi | hello panda|rover, https://review.rdoproject.org/zuul/builds?job_name=tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053 hasn't triggered for openstack/browbeat in days and it has been failing for others anyways. I don't see any change to https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/zuul.d/tripleo.yaml#L222 as well. any idea what I should be doing? | 15:17 |
panda|rover | agopi: you had some job triggered even after we disabled third party OVB jobs globally ? | 15:22 |
agopi | oh okay I hadn't known that the jobs were disabled globally. and yes looks like it was triggered for tht, oooq and toci | 15:23 |
panda|rover | agopi: ok, so because of the instability of the jobs, we decided to disable the triggers everywhere, you can always run ovb job by explicitly commenting check-rdo | 15:24 |
panda|rover | but until we have the jobs stable again, you may see some false negatives | 15:25 |
agopi | oh sweet thanks for letting me know panda|rover, that helps. That explains why the job has builds recently. | 15:26 |
agopi | panda|rover++ | 15:27 |
hubbot1 | agopi: panda|rover's karma is now 3 | 15:27 |
weshay | panda|rover, can we chat about ruck/rover and next sprint briefly? | 15:33 |
panda|rover | weshay: ok | 15:41 |
*** ykarel|away has joined #oooq | 15:41 | |
weshay | k cool in my blue | 15:41 |
*** gfidente has quit IRC | 15:42 | |
weshay | panda|rover, https://hub.docker.com/r/tripleomaster/centos-binary-keystone/tags/ | 15:44 |
*** saneax has quit IRC | 15:47 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/604298, master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/567224 | 15:51 |
*** ykarel|away has quit IRC | 15:55 | |
*** udesale has quit IRC | 16:10 | |
*** chkumar|away has quit IRC | 16:11 | |
*** rlandy_ has joined #oooq | 16:25 | |
*** gkadam has quit IRC | 16:27 | |
*** rlandy has quit IRC | 16:27 | |
*** dsneddon has joined #oooq | 16:41 | |
*** rlandy_ is now known as rlandy | 16:49 | |
rlandy | sshnaidm: what rechck will trigger rdo ovb jobs now? | 16:50 |
rlandy | recheck | 16:50 |
*** ykarel|away has joined #oooq | 16:53 | |
*** quiquell|off has quit IRC | 16:59 | |
*** kopecmartin is now known as kopecmartin|off | 17:00 | |
*** bogdando has quit IRC | 17:11 | |
*** ykarel|away has quit IRC | 17:15 | |
sshnaidm | rlandy, "check-rdo" | 17:28 |
*** agopi is now known as agopi|food | 17:29 | |
rlandy | panda|rover: how would we pass a drln_hash_tag_newest now? | 17:37 |
*** jfrancoa has quit IRC | 17:37 | |
rlandy | with the v3 workflow | 17:37 |
rlandy | EXTRA_VARS is no longer included | 17:38 |
panda|rover | via featureset override maybe ? :) | 17:40 |
weshay | arxcruz, rfolco fyi.. https://tree.taiga.io/project/tripleo-ci-board/task/304 https://tree.taiga.io/project/tripleo-ci-board/issue/323 | 17:50 |
weshay | are both close enough that we can have one task | 17:50 |
rfolco | weshay, ok will close 304 and point to 323 which has already something on it | 17:51 |
rfolco | thanks for clarifying | 17:51 |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/604298, master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/567224 | 17:51 |
*** derekh has quit IRC | 17:52 | |
rfolco | actually 304 is closed | 17:53 |
rfolco | so will comment on 323 | 17:53 |
*** apetrich has quit IRC | 17:57 | |
weshay | panda|rover, rlandy sshnaidm https://review.rdoproject.org/r/#/c/17437/ | 17:58 |
*** apetrich has joined #oooq | 17:58 | |
weshay | arxcruz++ | 17:59 |
hubbot1 | weshay: arxcruz's karma is now 9 | 17:59 |
arxcruz | this is confusing, so, folco already has the job in place, just need to add the feature_override, right? | 17:59 |
weshay | arxcruz, work it w/ rfolco :) | 18:02 |
weshay | I'll just be looking at the end result :) | 18:02 |
arxcruz | lol | 18:02 |
arxcruz | at least it's in Portuguese | 18:02 |
rfolco | arxcruz, talk to me | 18:03 |
rfolco | o/ | 18:03 |
*** agopi|food is now known as agopi | 18:10 | |
sshnaidm | ssbarnea|bkp2, do you set -1 just randomly? :) like https://review.openstack.org/#/c/614633/ and https://review.openstack.org/#/c/565215/ | 18:18 |
sshnaidm | ssbarnea|bkp2, or it's a bot | 18:19 |
weshay | arxcruz, kopecmartin|off needs review https://review.openstack.org/#/c/509728/34/roles/validate-tempest/templates/cleanup-network.sh.j2 | 18:28 |
arxcruz | wow... | 18:30 |
arxcruz | checking | 18:30 |
*** holser_ has quit IRC | 18:38 | |
*** amoralej is now known as amoralej|off | 18:41 | |
*** sshnaidm is now known as sshnaidm|afk | 18:48 | |
weshay | panda|rover, did someone change the keys on the promotion server? | 18:58 |
weshay | I don't have access atm | 18:58 |
*** brault has quit IRC | 19:01 | |
*** brault has joined #oooq | 19:04 | |
sshnaidm|afk | weshay, you can use your user | 19:46 |
sshnaidm|afk | weshay, we blocked centos user from login via ssh | 19:46 |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/604298, master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/567224 | 19:51 |
weshay | sshnaidm|afk, ah k | 20:28 |
weshay | rlandy, need any help w/ anything? | 20:44 |
rlandy | weshay: there are a few design decisions we need to make | 20:45 |
rlandy | weshay: I plan to bring them to the community meeting tomorrow | 20:45 |
weshay | k.. rlandy would you like reviews or discussion? | 20:45 |
weshay | ah k | 20:45 |
rlandy | weshay: there are two review sets to look at ... | 20:46 |
weshay | rlandy, I'll review your stuff now | 20:46 |
rlandy | https://review.openstack.org/#/c/616993 | 20:46 |
rlandy | and | 20:46 |
rlandy | https://review.openstack.org/#/c/618654 | 20:46 |
rlandy | weshay: ^^ those two and all the patches below them | 20:46 |
rlandy | weshay: per marios's suggestion, I am moving the t-q-e patches to a new repo | 20:47 |
rlandy | and not editing nofepool-setup | 20:47 |
rlandy | nodepool-setup | 20:47 |
weshay | hrm.. k | 20:47 |
rlandy | that way we can merge some of these changes without the huge mess | 20:47 |
rlandy | and the old reproducer will stay working | 20:48 |
weshay | a new heredoc | 20:48 |
rlandy | however the main work is those two patches | 20:48 |
weshay | in commit message of https://review.openstack.org/#/c/616993 | 20:48 |
weshay | not sure what you mean | 20:48 |
rlandy | that message is just so we get the reproducer out quickly | 20:48 |
rlandy | I can remove that depends-on | 20:49 |
rlandy | it just edits out the running section | 20:49 |
rlandy | so that we get results quickly | 20:49 |
rlandy | I don't plan to merge that whole set of reviews | 20:49 |
rlandy | at least not until we go through the design discussion tomorrow | 20:49 |
rlandy | there is a diff now that we run ansible playbooks | 20:50 |
rlandy | not straight shell scripts | 20:50 |
rlandy | zuul is slow today though | 20:50 |
weshay | k | 20:52 |
rlandy | weshay: you can start by looking at nodepool setup work | 20:52 |
* weshay rearranges some mtgs | 20:52 | |
weshay | rlandy, you have a DNM test review I can follow and try? | 20:52 |
rlandy | weshay: ack ... you can try this ... | 20:53 |
rlandy | you can get the reproducer and inventory from any job run with https://review.openstack.org/#/c/616993 | 20:53 |
rlandy | then patch the reproducer git fetch https://git.openstack.org/openstack/tripleo-quickstart-extras refs/changes/93/616993/25 && git checkout FETCH_HEAD | 20:54 |
rlandy | just after t-q and t-q-e are clones | 20:54 |
rlandy | just after t-q and t-q-e are cloned | 20:54 |
rlandy | and run | 20:55 |
rlandy | you should get as far as running toci-gate_test | 20:55 |
rlandy | to run-toci-gate-test | 20:55 |
rlandy | you will need patch https://review.openstack.org/#/c/618654 | 20:55 |
weshay | off | 20:56 |
weshay | oof | 20:56 |
rlandy | which you can apply and run again | 20:56 |
weshay | maybe we need an etherpad again | 20:56 |
* weshay tries | 20:56 | |
rlandy | weshay: ok | 20:56 |
weshay | http://logs.openstack.org/93/616993/25/check/tripleo-ci-centos-7-scenario001-multinode-oooq-container/1dfceb8/logs/reproducer-quickstart/ | 20:58 |
rlandy | rfolco: if you have time ^^ | 20:58 |
rlandy | weshay: yep - you need the reproducer script and the inventory | 20:59 |
rlandy | that is all | 20:59 |
rfolco | rlandy, reproduce it on ovb by getting the script and inventory files ? | 20:59 |
rlandy | rfolco: we would need an ovb job run | 21:00 |
rfolco | rlandy, so where should this be tested ? | 21:01 |
rfolco | rlandy, libvirt ? | 21:01 |
rlandy | rfolco: you can use rdocloud with multinode/singlenode/standalone | 21:01 |
rlandy | or libvirt | 21:01 |
rlandy | or ovb | 21:01 |
rlandy | I will kick an ovb run | 21:01 |
rlandy | it's just not kicked by default | 21:01 |
rfolco | rlandy, standalone would help ? | 21:02 |
rlandy | rfolco: sure ... | 21:03 |
rfolco | rlandy, ok will test standalone reproducer on rdo cloud | 21:03 |
rfolco | thanks rlandy | 21:03 |
rlandy | you will need this in the reproducer file after your t-q-e is cloned ... | 21:03 |
rlandy | cd tripleo-quickstart-extras | 21:03 |
rlandy | git fetch https://git.openstack.org/openstack/tripleo-quickstart-extras refs/changes/93/616993/25 && git checkout FETCH_HEAD | 21:03 |
rlandy | cd .. | 21:03 |
rlandy | sed -i "s#git+https://git.openstack.org/openstack/tripleo-quickstart-extras#file:///$WORKSPACE/tripleo-quickstart-extras#1" $WORKSPACE/tripleo-quickstart/quickstart-extras-requirements.txt | 21:03 |
rlandy | rfolco: weshay: that is not the latest work but it should get you to run the pres and see what's happening | 21:04 |
rlandy | rfolco: weshay: you would need the latest patch plus https://review.openstack.org/#/c/618654/ to finish and run toci-gate-test | 21:06 |
rlandy | rfolco: weshay: I will put together an etherpad for the design decisions and testing flow | 21:06 |
weshay | k | 21:06 |
rfolco | rlandy, please... I am still confused on how to gather the Frankenstein bits | 21:07 |
rlandy | rfolco: weshay: if you want to bj - I can walk you through it quickly | 21:07 |
rlandy | may save you some time in testing | 21:08 |
rfolco | rlandy, I have 12 min before ubering my son from english class | 21:08 |
weshay | let me know if you guys join | 21:09 |
weshay | if not.. I'll keep poking | 21:09 |
rlandy | rfolco: k - let's run through this quickly - your bj? | 21:09 |
rfolco | sure | 21:09 |
rlandy | weshay: ^^ if you want to join | 21:09 |
*** apetrich has quit IRC | 21:22 | |
*** agopi is now known as agopi|brb | 21:28 | |
*** agopi|brb has quit IRC | 21:28 | |
*** jtomasek has quit IRC | 21:39 | |
*** apetrich has joined #oooq | 21:42 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/604298, master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/567224 | 21:51 |
*** vinaykns has joined #oooq | 23:08 | |
*** rlandy has quit IRC | 23:30 | |
*** tosky has quit IRC | 23:32 | |
*** tosky has joined #oooq | 23:32 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/604298, master: tripleo-ci-fedora-28-standalone @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/567224 | 23:51 |
*** vinaykns has quit IRC | 23:51 |
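
The reproducer patching steps rlandy pasted at 21:03 can be collected into two small shell functions. This is only a sketch: the function names and the workspace argument are hypothetical, while the git URL, the ref (patch set 25 of 616993) and the sed expression are the ones quoted in the log. It assumes GNU sed and that both tripleo-quickstart and tripleo-quickstart-extras are already cloned under the workspace directory.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the workspace argument are illustrative,
# not part of the reproducer script itself.

# Check out the change under review inside the cloned t-q-e repo.
# URL and ref are the ones quoted in the log (616993, patch set 25).
checkout_extras_change() {
    local workspace="$1"
    (
        cd "$workspace/tripleo-quickstart-extras" &&
        git fetch https://git.openstack.org/openstack/tripleo-quickstart-extras \
            refs/changes/93/616993/25 &&
        git checkout FETCH_HEAD
    )
}

# Rewrite the requirements file so quickstart installs t-q-e from the
# local checkout instead of from git (GNU sed in-place edit, first match
# on each line only, as in the command from the log).
use_local_extras() {
    local workspace="$1"
    sed -i "s#git+https://git.openstack.org/openstack/tripleo-quickstart-extras#file:///$workspace/tripleo-quickstart-extras#1" \
        "$workspace/tripleo-quickstart/quickstart-extras-requirements.txt"
}
```

In the reproducer script these would be called as `checkout_extras_change "$WORKSPACE"` followed by `use_local_extras "$WORKSPACE"`, right after both repos are cloned. As rlandy notes, this gets as far as running toci-gate-test; going further needs https://review.openstack.org/#/c/618654 as well.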