*** rlandy has quit IRC | 00:00 | |
*** yolanda_ has joined #oooq | 00:06 | |
*** yolanda__ has quit IRC | 00:09 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 00:51 |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 02:51 |
*** skramaja has joined #oooq | 02:59 | |
*** jaganathan has joined #oooq | 03:00 | |
*** yolanda__ has joined #oooq | 03:15 | |
*** yolanda_ has quit IRC | 03:18 | |
*** yolanda__ has quit IRC | 03:23 | |
*** yolanda__ has joined #oooq | 03:29 | |
*** yolanda_ has joined #oooq | 03:34 | |
*** noama has joined #oooq | 03:34 | |
*** yolanda__ has quit IRC | 03:36 | |
*** yolanda__ has joined #oooq | 03:44 | |
*** yolanda_ has quit IRC | 03:45 | |
*** agopi has quit IRC | 03:49 | |
*** agopi has joined #oooq | 03:49 | |
*** ratailor has joined #oooq | 04:02 | |
*** yolanda_ has joined #oooq | 04:03 | |
*** yolanda__ has quit IRC | 04:06 | |
*** yolanda has joined #oooq | 04:08 | |
*** yolanda_ has quit IRC | 04:09 | |
*** yolanda_ has joined #oooq | 04:10 | |
*** agopi has quit IRC | 04:13 | |
*** yolanda has quit IRC | 04:13 | |
*** hamzy has quit IRC | 04:17 | |
*** hamzy has joined #oooq | 04:18 | |
*** hamzy has quit IRC | 04:22 | |
*** hamzy has joined #oooq | 04:36 | |
*** skramaja_ has joined #oooq | 04:50 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 04:51 |
*** skramaja has quit IRC | 04:51 | |
*** ccamacho has quit IRC | 05:06 | |
*** jaganathan_ has joined #oooq | 05:09 | |
*** jaganathan has quit IRC | 05:13 | |
*** ykarel has joined #oooq | 05:20 | |
*** ratailor_ has joined #oooq | 05:23 | |
*** ratailor has quit IRC | 05:27 | |
*** waleedm has joined #oooq | 05:30 | |
*** skramaja has joined #oooq | 05:34 | |
*** skramaja_ has quit IRC | 05:34 | |
*** quiquell has joined #oooq | 05:35 | |
*** matbu has joined #oooq | 05:35 | |
*** udesale has joined #oooq | 05:44 | |
*** bogdando has joined #oooq | 05:49 | |
quiquell | arxcruz: https://bugs.launchpad.net/tripleo/+bug/1778637 | 06:03 |
openstack | Launchpad bug 1778637 in tripleo "mistral_tempest_tests.tests.api.v2.test_actions.ActionTestsV2: MismatchError: [u'aodh.alarm_create'," [Critical,Triaged] - Assigned to Quique Llorente (quiquell) | 06:03 |
ykarel | so it happened again | 06:04 |
quiquell | ykarel: You recognize it ? | 06:07 |
ykarel | quiquell, we have seen this earlier as well | 06:07 |
ykarel | is fs017 | 06:07 |
quiquell | ykarel: We don't see it anymore now, because we have a lot of timeouts so tempest doesn't get executed | 06:08 |
quiquell | (I think) | 06:08 |
ykarel | quiquell, timeout ? fs020? | 06:08 |
*** ccamacho has joined #oooq | 06:09 | |
quiquell | ykarel: Multiple scenarios, give me a min to analyze | 06:09 |
ykarel | quiquell, ack | 06:09 |
*** saneax has joined #oooq | 06:12 | |
*** yolanda__ has joined #oooq | 06:21 | |
*** yolanda_ has quit IRC | 06:24 | |
*** pgadiya has joined #oooq | 06:30 | |
*** pgadiya has quit IRC | 06:30 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 06:51 |
*** gkadam has joined #oooq | 06:58 | |
*** noam has joined #oooq | 07:01 | |
*** noama has quit IRC | 07:04 | |
*** tesseract has joined #oooq | 07:14 | |
*** quiquell is now known as quique|rover|afk | 07:16 | |
*** zoli is now known as zoli|wfh | 07:16 | |
*** zoli|wfh is now known as zoli | 07:16 | |
*** pgadiya has joined #oooq | 07:22 | |
*** pgadiya has quit IRC | 07:22 | |
*** kopecmartin has joined #oooq | 07:23 | |
*** florianf has joined #oooq | 07:24 | |
*** florianf has quit IRC | 07:25 | |
*** amoralej|off is now known as amoralej | 07:28 | |
*** tosky has joined #oooq | 07:41 | |
*** quique|rover|afk is now known as quiquell|rover | 07:42 | |
*** noam__ has joined #oooq | 07:45 | |
*** noam has quit IRC | 07:46 | |
*** holser_ has joined #oooq | 08:01 | |
arxcruz | quiquell|rover: if you haven't yet, i can take a look | 08:10 |
quiquell|rover | arxcruz: Go for it, I am looking at a RDO issue | 08:11 |
quiquell|rover | arxcruz: I also have another one with openstack returning 500 at tempest | 08:11 |
quiquell|rover | arxcruz: But that feels less tempest related | 08:11 |
ajo | gi folks | 08:13 |
ajo | hi | 08:13 |
quiquell|rover | Hello ajo | 08:13 |
ajo | anybody experienced failure to boot/ping the undercloud | 08:13 |
ajo | ? | 08:13 |
ajo | quiquell|rover: On one of my servers it's fine, but on another (completely new from scratch) the undercloud doesn't boot | 08:13 |
*** noam__ has quit IRC | 08:13 | |
ajo | I'm starting virt-manager via VNC to check | 08:13 |
quiquell|rover | ajo: No issues here regarding this | 08:14 |
arxcruz | quiquell|rover: this is in check, but i'll create an env | 08:14 |
ajo | quiquell|rover: ack thanks | 08:14 |
quiquell|rover | arxcruz: It's a gate job | 08:15 |
quiquell|rover | arxcruz: from https://review.openstack.org/#/c/573142/ | 08:15 |
arxcruz | quiquell|rover: yup | 08:16 |
*** saneax has quit IRC | 08:20 | |
*** saneax has joined #oooq | 08:29 | |
*** jaosorior has quit IRC | 08:39 | |
*** gkadam_ has joined #oooq | 08:44 | |
*** gkadam has quit IRC | 08:45 | |
*** gkadam_ has quit IRC | 08:45 | |
*** gkadam_ has joined #oooq | 08:45 | |
*** gkadam__ has joined #oooq | 08:48 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 08:51 |
*** gkadam_ has quit IRC | 08:51 | |
*** brault has quit IRC | 08:57 | |
*** brault has joined #oooq | 09:13 | |
*** dtantsur|afk is now known as dtantsur | 09:18 | |
*** ykarel_ has joined #oooq | 09:20 | |
*** ykarel_ is now known as ykarel|away | 09:22 | |
*** ykarel has quit IRC | 09:23 | |
*** ratailor__ has joined #oooq | 09:24 | |
*** ykarel|away has quit IRC | 09:25 | |
*** ratailor_ has quit IRC | 09:26 | |
*** yolanda_ has joined #oooq | 09:50 | |
*** yolanda__ has quit IRC | 09:53 | |
*** zoli is now known as zoli|lunch | 10:14 | |
*** jaganathan_ has quit IRC | 10:19 | |
*** jaganathan_ has joined #oooq | 10:19 | |
*** jaosorior has joined #oooq | 10:29 | |
*** sshnaidm|afk is now known as sshnaidm | 10:30 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 10:51 |
*** udesale has quit IRC | 11:33 | |
arxcruz | quiquell|rover: i wasn't able to reproduce the failure from http://logs.openstack.org/42/573142/4/gate/tripleo-ci-centos-7-scenario003-multinode-oooq-container/9973f47/logs/tempest.html.gz | 11:34 |
arxcruz | it's passing on my env | 11:34 |
quiquell|rover | arxcruz: Ok will recheck | 11:35 |
*** atoth has joined #oooq | 11:38 | |
*** zoli|lunch is now known as zoli|wfh | 11:38 | |
*** zoli|wfh is now known as zoli | 11:38 | |
*** anande has joined #oooq | 11:45 | |
*** amoralej is now known as amoralej|lunch | 11:59 | |
*** ratailor__ has quit IRC | 12:21 | |
*** trown|outtypewww is now known as trown | 12:24 | |
*** rlandy has joined #oooq | 12:30 | |
rlandy | marios: hello - ping re: hardware box | 12:34 |
marios | rlandy: o/ | 12:34 |
rlandy | marios: still having trouble logging in? | 12:34 |
* rlandy will check it | 12:34 | |
marios | rlandy: should i try again (I could get into ssh but not the drac console). trying on rdo cloud just now so not urgent | 12:34 |
marios | rlandy: thank you | 12:35 |
rlandy | marios: root/calvin | 12:35 |
rlandy | my mistake | 12:35 |
marios | rlandy: np trying | 12:35 |
marios | rlandy: thanks works | 12:36 |
rlandy | marios: go to virtual console - launch virtual console | 12:36 |
rlandy | say yes/ok to all options | 12:37 |
marios | rlandy: seem to be missing some plugin (chrome) | 12:37 |
rlandy | you can power down/up or reboot | 12:37 |
rlandy | yep - so you will need to modify your browser to accept this window | 12:37 |
weshay|ruck | quiquell|rover, https://review.openstack.org/#/c/576990/ | 12:37 |
marios | rlandy: ack thanks (doing/found some relevant info) | 12:38 |
rlandy | marios: you will need iced tea web to open it | 12:38 |
weshay|ruck | quiquell|rover, https://review.openstack.org/#/c/577809/ | 12:39 |
marios | rlandy: ack | 12:39 |
*** skramaja has quit IRC | 12:40 | |
rfolco_ | marios, I am getting OOM on undercloud_reinstall... does this help ? https://review.openstack.org/578023 | 12:50 |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 12:51 |
marios | rfolco_: well it just adds meminfo into the logs (panda gave this to me as homework ;) | 12:55 |
marios | rfolco_: don't think it will solve your oom but it might give you more info about what is happening | 12:55 |
rfolco_ | marios, ack thanks | 12:56 |
panda | marios: do you find it helpful ? | 12:58 |
panda | quiquell|rover: any recent issues caused by mariadb crashes that you know of ? | 12:59 |
quiquell|rover | panda: Nope | 13:01 |
quiquell|rover | panda: Did you find something ? | 13:01 |
quiquell|rover | panda: Humm I have some internal errors at tempest | 13:01 |
quiquell|rover | panda: Could be caused by mariadb failing | 13:01 |
quiquell|rover | panda: https://bugs.launchpad.net/tripleo/+bug/1778655 | 13:02 |
openstack | Launchpad bug 1778655 in tripleo "An unexpected error prevented the server from fulfilling your request. (HTTP 500) (Request-ID: req-d89e9af9-9dff-4115-acb5-5246113ec8c6)" [Critical,Triaged] - Assigned to Quique Llorente (quiquell) | 13:02 |
panda | quiquell|rover: I have failures in my test job, | 13:02 |
panda | quiquell|rover: mariadb logs look clean in that job. doesn't seem to be the same | 13:04 |
quiquell|rover | panda: give it to me | 13:05 |
panda | quiquell|rover: give you what ? | 13:05 |
quiquell|rover | panda: the logs | 13:06 |
panda | quiquell|rover: nah, stay on the bugs that matter, the team has to solve this, since it's a new functionality | 13:06 |
quiquell|rover | panda: Ok, let me know if you need help | 13:07 |
*** tcw has quit IRC | 13:07 | |
*** tcw1 has joined #oooq | 13:07 | |
*** quiquell|rover is now known as quique|rover|lch | 13:08 | |
panda | rfolco_: oh, we have the same problem then | 13:10 |
panda | rfolco_: upstream gets oom-killer too | 13:10 |
panda | rfolco_: are you testing upstream ? | 13:11 |
panda | rfolco_: but we're getting this during overcloud deploy | 13:11 |
panda | rfolco_: the oom-killer kills mariadb here | 13:11 |
marios | panda: yes thank you :) | 13:15 |
panda | rfolco_: Jun 25 15:27:52 centos-7-rax-dfw-0000326756 kernel: Out of memory: Kill process 17196 (mysqld) score 43 or sacrifice child | 13:16 |
panda | Jun 25 15:27:52 centos-7-rax-dfw-0000326756 kernel: Killed process 17196 (mysqld) total-vm:4803964kB, anon-rss:352740kB, file-rss:0kB, shmem-rss:0kB | 13:16 |
*** anande has quit IRC | 13:16 | |
panda | it takes almost 5G of RAM | 13:16 |
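The two kernel lines pasted above can be read mechanically. What follows is a minimal sketch, assuming the "Killed process" line format shown in the paste; the name `parse_oom_kill` is hypothetical and is not part of any TripleO or CI tooling.

```python
import re

# Pattern for the kernel "Killed process" line as pasted above (assumed format).
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_kill(line):
    """Return pid, process name, and memory figures in GiB, or None if no match."""
    m = KILLED_RE.search(line)
    if m is None:
        return None
    kib_per_gib = 1024 * 1024
    return {
        "pid": int(m.group("pid")),
        "name": m.group("name"),
        "total_vm_gib": int(m.group("total_vm")) / kib_per_gib,
        "anon_rss_gib": int(m.group("anon_rss")) / kib_per_gib,
    }


# The line below is copied from the paste in this log.
line = (
    "Jun 25 15:27:52 centos-7-rax-dfw-0000326756 kernel: Killed process 17196 "
    "(mysqld) total-vm:4803964kB, anon-rss:352740kB, file-rss:0kB, shmem-rss:0kB"
)
info = parse_oom_kill(line)
print(info["name"], round(info["total_vm_gib"], 2), round(info["anon_rss_gib"], 2))
```

Note that `total-vm` is the virtual size of the process, while `anon-rss` is what was actually resident, so the "almost 5G" figure refers to virtual memory rather than RAM in use.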
rfolco_ | panda, I get the OOM during undercloud_reinstall... since this is undercloud job only | 13:17 |
rfolco_ | undercloud deploy works well | 13:17 |
panda | rfolco_: I'm testing the all in one job, undercloud install works well, there is no reinstall, then overcloud deploy fails | 13:18 |
panda | how is mariadb memory usage related to the reparenting ? | 13:19 |
panda | rfolco_: the node flavor is exactly the same | 13:19 |
panda | rfolco_: as the legacy node | 13:20 |
rfolco_ | panda, that's what I am trying to understand too | 13:21 |
rlandy | myoung: quique|rover|lch: myoung: this is what I think is wrong with rhos-13 gates: https://paste.fedoraproject.org/paste/d7rP3stBA9BWkH4FVYUQZA | 13:22 |
*** dtantsur is now known as dtantsur|brb | 13:22 | |
rlandy | weshay|ruck: ^^ | 13:22 |
rlandy | when you get to overcloud deploy, "no available hosts" | 13:23 |
rlandy | which is correct | 13:23 |
rlandy | we have no flavor with a big enough disk | 13:23 |
rlandy | requirements for disk is 40 - you can't deploy on a flavor with 40 - has to be at least 41 | 13:24 |
rlandy | compare with RDO Cloud in the same paste | 13:24 |
rlandy | https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/tq-gate-rhos-13-ci-rhos-ovb-featureset001 is set up to switch the gates to fs001 | 13:25 |
* rlandy puts in a request to add flavors with big enough disk | 13:26 | |
rlandy | myoung: weshay|ruck: quique|rover|lch: afaict, we *do* have the correct image passed ... | 13:29 |
rlandy | [stack@undercloud-86094 ~]$ cat /etc/redhat-release | 13:29 |
rlandy | Red Hat Enterprise Linux Server release 7.5 (Maipo) | 13:29 |
rlandy | so the image overrides may be a bug but it's not blocking gates | 13:29 |
*** quique|rover|lch is now known as quiquell|rover | 13:29 | |
quiquell|rover | rlandy: Hi | 13:30 |
rlandy | quiquell|rover: hi - got to join another meeting now - will add you to the request for bugger flavors | 13:30 |
rlandy | bigger | 13:30 |
quiquell|rover | rlandy: ack | 13:31 |
*** udesale has joined #oooq | 13:31 | |
quiquell|rover | rlandy: Thanks | 13:31 |
*** agopi has joined #oooq | 13:34 | |
*** agopi has quit IRC | 13:34 | |
rlandy | chandankumar: looking to join the office hours on #rdo - nothing going on there - am I at wrong place/wrong time? | 13:35 |
rlandy | You are kindly invited to the meeting: | 13:35 |
rlandy | RDO Office Hours on 2018-06-26 from 13:30:00 to 14:30:00 UTC | 13:35 |
rlandy | arxcruz: ^^?? | 13:37 |
*** sanjay__u has quit IRC | 13:38 | |
rfolco_ | panda, vm.swappiness = 60... it was 30 in the legacy parent. This is even safer for OOM... https://mariadb.com/kb/en/library/configuring-swappiness/... still looking | 13:42 |
*** agopi has joined #oooq | 13:43 | |
panda | rfolco_: where did the parent set the swappiness ? | 13:45 |
rlandy | quiquell|rover: hi - ok - there was no meeting, I'm back - more questions on rhos-13? you agree with my flavor assessment? | 13:46 |
rfolco_ | panda, I think it comes with the default one (60)... not sure where it was set/changed in the legacy one. | 13:46 |
rfolco_ | I just checked extras | 13:47 |
rfolco_ | http://logs.openstack.org/56/576956/7/check/tripleo-ci-centos-7-undercloud-oooq/96eb01f/logs/undercloud/var/log/extra/sysctl.txt.gz | 13:47 |
panda | rfolco_: we don't have any idea how much memory mariadb uses normally | 13:50 |
panda | rfolco_: but 5G is definitely too much | 13:50 |
*** jaganathan_ has quit IRC | 13:51 | |
weshay|ruck | arxcruz, kopecmartin you guys making progress on doc? | 13:52 |
panda | rfolco_: what does the oom-killer kill in your case ? | 13:52 |
trown | I just dont get how that would be different based on not running devstack | 13:52 |
panda | trown: indeed | 13:52 |
kopecmartin | weshay|ruck, we've been working on that the whole day, but whether there's progress, well, good question | 13:53 |
weshay|ruck | heh.. k :) | 13:53 |
rfolco_ | os-refresh-config, panda http://logs.openstack.org/56/576956/7/check/tripleo-ci-centos-7-undercloud-oooq/96eb01f/logs/undercloud/home/zuul/undercloud_reinstall.log.txt.gz#_2018-06-22_18_14_35 | 13:54 |
panda | rfolco_: oh, you don't get the oom-killer, | 13:54 |
rfolco_ | cannot allocate mem | 13:54 |
panda | rfolco_: you're just getting oom from the process | 13:54 |
panda | rfolco_: so you're not really sure what is taking all that memory | 13:55 |
rfolco_ | dstat says yes | 13:55 |
panda | rfolco_: we can assume it's mariadb | 13:55 |
rfolco_ | at that exact point I see mem graph at the peak | 13:55 |
rfolco_ | consuming all RAM available | 13:55 |
panda | rfolco_: does dstat say who's the glutton ? | 13:57 |
rfolco_ | glutton... new word | 13:57 |
*** amoralej|lunch is now known as amoralej | 13:58 | |
rfolco_ | panda, will look again | 13:58 |
*** jaosorior has quit IRC | 13:59 | |
panda | trown: how does mariadb behave in libvirt reproduction ? I don't have a live deployment atm | 14:00 |
panda | mariadb logs are scarce :( | 14:01 |
myoung | weshay|ruck, quiquell|rover: prepared basic # of days status for #tripleo meeting, if there's realtime / extra notes that make sense to convey for weekly please add https://etherpad.openstack.org/p/tripleo-ci-squad-meeting @ L50 | 14:01 |
trown | panda: restarting my env now almost to undercloud install | 14:03 |
*** rfolco_ is now known as rfolco | 14:03 | |
quiquell|rover | myoung: ack | 14:03 |
myoung | quiquell|rover: (optional) - I just used the notes from yesterday's scrum | 14:05 |
quiquell|rover | myoung: Maybe you can add that we have timeout issues in the gates | 14:07 |
quiquell|rover | myoung: as an addendum to the "using ara in more depth" item | 14:07 |
myoung | quiquell|rover: aye, i have "using ara in more depth to diagnose timeout issues" already at L55 | 14:08 |
quiquell|rover | myoung: ack | 14:09 |
rfolco | panda, swap is missing | 14:16 |
panda | rfolco: yeah, we don't generally use swap | 14:16 |
rfolco | panda, devstack-gate has a role for it --> https://github.com/openstack-infra/devstack-gate/blob/101e0fbbc5e8851c53b4d09672bd26cee0099201/playbooks/roles/fix_disk_layout/tasks/main.yaml | 14:16 |
panda | rfolco: and even with swap, mariadb would probably continue to eat memory until even the swap is filled | 14:17 |
panda | mmhh | 14:17 |
panda | mmmmmhhh | 14:17 |
panda | so this is masking a mariadb memory problem ? | 14:19 |
weshay|ruck | panda, https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/infra-setup/roles/te-broker/tasks/main.yml#L116 | 14:32 |
*** quiquell|rover is now known as quiquell|off | 14:34 | |
*** ccamacho has quit IRC | 14:35 | |
*** ccamacho has joined #oooq | 14:35 | |
marios | panda: http://logs.openstack.org/56/576956/7/check/tripleo-ci-centos-7-undercloud-oooq/96eb01f/logs/undercloud/var/log/host_info.txt.gz | 14:37 |
weshay|ruck | marios, http://logs.openstack.org/56/576956/7/check/tripleo-ci-centos-7-undercloud-oooq/96eb01f/logs/undercloud/var/log/extra/dstat.html.gz | 14:38 |
marios | weshay|ruck: thanks | 14:39 |
panda | marios: it's clearly mysqld | 14:39 |
trown | panda: on libvirt it is using 3.7G overcloud deploy is just starting though | 14:45 |
*** dtantsur|brb is now known as dtantsur | 14:45 | |
marios | bandini: are you aware of any recent changes in mariadb that could be causing the memory spike we are investigating | 14:46 |
trown | panda: that is one difference on libvirt though.. we are using 16G RAM VMS | 14:46 |
panda | trown: yep, we need to add the swap, I see the latest requirements for undercloud is 16G, upstream flavor is only 8G | 14:49 |
bandini | marios: not that I know of, no. (but I have not followed stuff recently, am currently chasing this rabbitmqeddon) | 14:51 |
marios | bandini: ack thanks just checking | 14:51 |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 14:51 |
panda | trown: rfolco weshay|ruck there is a standard role in zuul repo to configure swap, we should use that | 14:52 |
*** saneax has quit IRC | 14:52 | |
rfolco | panda, hmmm will test it thanks | 14:53 |
marios | panda: rfolco there is even a env file in tht we can just include btw for swap | 14:53 |
marios | panda: rfolco 2 in fact (file vs partition) | 14:53 |
marios | https://github.com/openstack/tripleo-heat-templates/blob/master/environments/enable-swap.yaml | 14:54 |
rfolco | panda, this ? openstack-zuul-jobs/roles/configure-swap/tasks/main.yaml | 14:54 |
marios | https://github.com/openstack/tripleo-heat-templates/blob/master/environments/enable-swap-partition.yaml | 14:54 |
panda | rfolco: yes | 14:54 |
trown | marios: I think it is an issue in the undercloud though | 14:56 |
marios | trown: ah right | 14:57 |
trown | panda: weird that mysqld is using an extra G of memory in CI though | 15:02 |
panda | trown: I'm looking beter at the top command, the memory is pretty much the same. 4.7G virtual, 300M resident. I was reading resident wrong before, it's not 3G | 15:03 |
panda | s/better// | 15:03 |
marios | rfolco: how do you manage to have such a dark background ? :) is it night time there? | 15:06 |
marios | rfolco: it looks cool | 15:07 |
marios | rfolco: i want it too :D | 15:07 |
marios | rfolco: do you literally have a blackboard behind you ?:) | 15:07 |
marios | panda: get yo swap on | 15:08 |
rfolco | marios, window is 2% open in front of me, I have a usb flash with light plugged :) | 15:08 |
marios | rfolco: ack | 15:11 |
panda | marios: rfolco updated PS | 15:11 |
weshay|ruck | panda, https://review.openstack.org/#/c/576904/ | 15:11 |
weshay|ruck | rlandy, trown ^ | 15:11 |
panda | trown: uploaded patchset 20 adding the swap role from infra | 15:15 |
*** ccamacho has quit IRC | 15:15 | |
trown | panda: cool, looking in my libvirt env heat is actually using double the memory of mysqld in terms of RSS | 15:16 |
trown | panda: it is split over 4 workers, but each uses ~250M | 15:16 |
trown | panda: still odd why we would need more memory just because we dont use devstack-gate... that still doesnt make sense | 15:18 |
*** waleedm has quit IRC | 15:18 | |
rlandy | quiquell|off: weshay|ruck: looking at the stack create failure on your review ... checked tenant - there are a bunch of delete_failed stacks there | 15:18 |
panda | trown: the nodes have only 8G of RAM, and the legacy playbook was adding swap, we were missing that in our new setup | 15:20 |
weshay|ruck | rlandy, https://review.rdoproject.org/r/14482 | 15:20 |
weshay|ruck | rlandy, :) | 15:21 |
weshay|ruck | I'm cleaning up now | 15:21 |
trown | panda: ah that makes sense then | 15:21 |
weshay|ruck | thanks | 15:21 |
panda | trown: the minimum requirement is 12-16G for the undercloud, so we need the swap | 15:21 |
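The reasoning above (8G nodes, a 12-16G undercloud requirement, swap making up the difference) can be sketched as a simple check. This is only an illustration, assuming `/proc/meminfo`-style input; `parse_meminfo` and `has_enough_memory` are hypothetical names, and the sample values are made up rather than taken from a CI node.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values."""
    values = {}
    for raw in text.splitlines():
        key, sep, rest = raw.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def has_enough_memory(meminfo, required_gib):
    """True if RAM plus swap reaches required_gib (the threshold is an assumption)."""
    total_kib = meminfo.get("MemTotal", 0) + meminfo.get("SwapTotal", 0)
    return total_kib >= required_gib * 1024 * 1024


# Illustrative values only: an 8G node without swap, then with 8G of swap added.
sample_no_swap = "MemTotal:        8388608 kB\nSwapTotal:             0 kB\n"
sample_with_swap = "MemTotal:        8388608 kB\nSwapTotal:       8388608 kB\n"

print(has_enough_memory(parse_meminfo(sample_no_swap), 12))
print(has_enough_memory(parse_meminfo(sample_with_swap), 12))
```

Counting swap toward the requirement only says the node will not run out of addressable memory; it says nothing about how slow the deploy becomes once it starts swapping.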
weshay|ruck | check it out :) http://logs.openstack.org/60/577960/2/check/tripleo-ci-centos-7-undercloud-containers/7f49be7/logs/ara_oooq_root/ | 15:24 |
panda | ugh, job has been in queue for 20 minutes | 15:30 |
panda | no fast feedback today. | 15:31 |
weshay|ruck | https://review.openstack.org/#/c/577960/ reviews please | 15:33 |
panda | swap role is running | 15:40 |
panda | but it's taking 4 extra minutes for something that is maybe not relevant anymore | 15:41 |
panda | 2018-06-26 15:37:06.292986 | TASK [configure-swap : Copy old /opt] | 15:41 |
panda | 2018-06-26 15:40:55.218132 | primary | ok: Runtime: 0:03:48.364601 | 15:42 |
panda | we need to wipe /opt and ensure we are not using anything there first if we don't want it to take too much | 15:44 |
*** trown has quit IRC | 15:44 | |
*** trown|brb has joined #oooq | 15:46 | |
*** bogdando has quit IRC | 15:46 | |
*** ykarel|away has joined #oooq | 15:51 | |
kopecmartin | weshay|ruck, arxcruz I've pushed an update in docs about containerized tempest https://review.openstack.org/#/c/565161/ | 15:53 |
weshay|ruck | k.. thanks kopecmartin | 15:54 |
kopecmartin | weshay|ruck, the bug can't be verified sooner than the doc is merged, can it? | 15:56 |
weshay|ruck | kopecmartin, as long as we make sure it merges.. anyone checking the bug would probably not know | 15:57 |
weshay|ruck | :) | 15:57 |
*** hamzy has quit IRC | 16:00 | |
*** ykarel|away is now known as ykarel | 16:03 | |
*** udesale has quit IRC | 16:09 | |
*** trown|brb is now known as trown | 16:12 | |
ajo | weshay|ruck: https://review.openstack.org/578142 | 16:13 |
ajo | if you can eyeball that ;) it'd be great | 16:13 |
ajo | weshay|ruck: my undercloud was hanging on boot because of that | 16:13 |
ajo | dont ask me why.. | 16:13 |
* weshay|ruck looks | 16:13 | |
arxcruz | rlandy: hey, i did not have that on my calendar :( | 16:14 |
rlandy | arxcruz: it was an email from chandan - maybe a mistake | 16:16 |
rlandy | arxcruz: anyways, thanks for the tempest tour last week - I looked over the repos and reviews you pointed at | 16:16 |
arxcruz | rlandy: cool :) | 16:17 |
rlandy | arxcruz: are there any simple/not urgent tasks I can start working on - in my spare time :)? | 16:17 |
arxcruz | rlandy: i need to check, i'll get back to you :) | 16:17 |
rlandy | arxcruz: thanks - whenever you find some. pls ping me. no rush | 16:18 |
rlandy | I just need to get my feet wet | 16:18 |
weshay|ruck | marios, you still on? I have a question | 16:18 |
*** tesseract has quit IRC | 16:40 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 16:51 |
panda | trown: we are 8 minutes into overcloud deploy, previous attempts failed in 5 minutes. Seems we are in a better spot at least | 16:54 |
trown | panda: nice | 16:54 |
*** panda is now known as panda|off | 17:01 | |
weshay|ruck | kopecmartin, arxcruz remarks in the review | 17:04 |
kopecmartin | weshay|ruck, thanks, i also found issues, so fixing them too | 17:04 |
weshay|ruck | thanks | 17:05 |
weshay|ruck | rlandy, fyi. tenant is clean | 17:05 |
*** zoli is now known as zoli|gone | 17:09 | |
*** zoli|gone is now known as zoli | 17:09 | |
weshay|ruck | rlandy, panda this is odd.. the tebroker log is not getting updated | 17:14 |
weshay|ruck | 17:12:35 +(/opt/stack/new/tripleo-ci/toci_gate_test.sh:285): ./testenv-client -b 192.168.103.254:4730 -t 17400 --envsize 4 --ucinstance 83906ba4-83eb-4a91-931b-62c2affb3402 --net-iso multi-nic -- ./toci_quickstart.sh | 17:14 |
weshay|ruck | 17:12:35 +(/opt/stack/new/tripleo-ci/toci_gate_test.sh:279): sleep 1200 | 17:14 |
weshay|ruck | -rw-r--r--. 1 root root 0 Jun 26 14:20 testenv-worker.log | 17:14 |
weshay|ruck | maybe it takes a minute | 17:14 |
*** dtantsur is now known as dtantsur|afk | 17:15 | |
rlandy | you pushed a change to te-broker? | 17:18 |
rlandy | it picks up changes once a day unless you manually change it | 17:19 |
weshay|ruck | rlandy, I didn't push any changes | 17:21 |
weshay|ruck | rlandy, got nothing going on here.. must be related to the rdo migration, but not sure | 17:22 |
weshay|ruck | maybe the tenant id changed | 17:22 |
rlandy | checking tenant | 17:24 |
weshay|ruck | systemctl -a | grep te_workers | 17:25 |
weshay|ruck | ● te_workers.service loaded failed failed TE Workers | 17:25 |
weshay|ruck | restarted | 17:26 |
weshay|ruck | and we're off | 17:26 |
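The `systemctl -a | grep te_workers` check above can also be done programmatically when several units need watching. A minimal sketch, assuming the column order UNIT, LOAD, ACTIVE, SUB, DESCRIPTION as in the pasted line; `failed_units` is a hypothetical helper, not something that exists on the te-broker host.

```python
def failed_units(systemctl_output):
    """Return unit names whose ACTIVE column reads 'failed'."""
    units = []
    for raw in systemctl_output.splitlines():
        # Drop the status bullet systemctl prints in front of failed units.
        fields = raw.replace("●", " ").split()
        # Assumed column order: UNIT LOAD ACTIVE SUB DESCRIPTION...
        if len(fields) >= 4 and fields[2] == "failed":
            units.append(fields[0])
    return units


# First line is copied from the paste in this log; the second is illustrative.
sample = (
    "● te_workers.service loaded failed failed TE Workers\n"
    "sshd.service loaded active running OpenSSH server daemon\n"
)
print(failed_units(sample))
```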
weshay|ruck | rlandy, http://38.145.34.41/testenv-worker.log | 17:27 |
rlandy | envs 2 and 3 are create_ in_progress | 17:28 |
rlandy | and then nothing | 17:29 |
weshay|ruck | appears to be working https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-pike/1073/console | 17:29 |
weshay|ruck | ya.. now 3 in create complete | 17:29 |
rlandy | 2018-06-26 17:27:36,155 - testenv-worker-11 - INFO - Getting new job... | 17:29 |
rlandy | 2018-06-26 17:27:36,910 - testenv-worker-1 - ERROR - + ENVNUM=1 | 17:29 |
rlandy | are there jobs to consume those envs? | 17:30 |
rlandy | there is something wrong here | 17:31 |
*** holser_ has quit IRC | 17:39 | |
*** ykarel has quit IRC | 17:44 | |
*** kopecmartin has quit IRC | 17:46 | |
*** hamzy has joined #oooq | 17:50 | |
*** amoralej is now known as amoralej|off | 17:56 | |
*** trown has quit IRC | 18:05 | |
*** jaosorior has joined #oooq | 18:13 | |
weshay|ruck | rlandy, https://github.com/openstack/browbeat/blob/master/ci-scripts/tripleo/microbrow-perfci.sh#L90-L103 | 18:16 |
*** trown has joined #oooq | 18:23 | |
*** dmellado has quit IRC | 18:32 | |
*** hubbot has quit IRC | 18:33 | |
*** atoth has quit IRC | 18:53 | |
rook | https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/user/agopi/my-views/view/Browbeat_view/job/browbeat-quickstart-queens-baremetal-mixed/25/console | 18:55 |
rook | weshay|ruck: ^ | 18:55 |
rook | We see the same error referenced in the launchpad agopi shared. | 18:55 |
rook | 18:48:04 SyntaxError: '<' operator not allowed in environment markers | 18:55 |
weshay|ruck | hrm | 19:00 |
weshay|ruck | k | 19:00 |
*** yolanda__ has joined #oooq | 19:04 | |
*** yolanda_ has quit IRC | 19:07 | |
*** yolanda_ has joined #oooq | 19:09 | |
*** sshnaidm has quit IRC | 19:09 | |
*** yolanda__ has quit IRC | 19:12 | |
*** yolanda__ has joined #oooq | 19:15 | |
*** yolanda_ has quit IRC | 19:18 | |
*** holser_ has joined #oooq | 19:20 | |
*** holser_ has quit IRC | 19:20 | |
weshay|ruck | rlandy, re: https://review.openstack.org/#/c/576904 | 19:32 |
weshay|ruck | If other roles need those vars, the safest thing to do is to leave them in extras I think | 19:32 |
rlandy | weshay|ruck: ok - I just thought I'd question it | 19:33 |
weshay|ruck | not sure if there is a right / wrong here.. I do see why they could be in container prep too | 19:33 |
rlandy | weshay|ruck: fine _ I made my little point | 19:33 |
weshay|ruck | heh | 19:33 |
weshay|ruck | I could put a patch on top | 19:33 |
weshay|ruck | quickstart.sh is broken though | 19:33 |
rlandy | weshay|ruck: no if quickstart is broken, push it through | 19:35 |
rlandy | much bigger deal | 19:35 |
weshay|ruck | 24hr queue :) | 19:35 |
rlandy | oh no - again?? | 19:37 |
rlandy | weshay|ruck: https://github.com/openstack/tripleo-quickstart-extras/blob/master/config/environments/rdocloud.yml#L15 | 19:37 |
rlandy | going to change this to m1.xlarge | 19:37 |
rlandy | if CI is going with xlarge now | 19:37 |
rlandy | they will have to match | 19:38 |
weshay|ruck | rlandy, CI? which ci? | 19:38 |
rlandy | weshay|ruck: we just discussed te-broker is using xlarge, no? | 19:38 |
rlandy | for the undercloud | 19:38 |
weshay|ruck | ya.. | 19:39 |
rlandy | https://github.com/openstack/tripleo-quickstart-extras/blob/master/config/environments/rdocloud.yml#L15 is for the reproducer | 19:39 |
weshay|ruck | brb | 19:39 |
*** agopi is now known as agopi|brb | 19:40 | |
*** agopi|brb has quit IRC | 19:45 | |
weshay|ruck | back | 19:48 |
*** sshnaidm has joined #oooq | 19:50 | |
rlandy | on second thoughts ... | 19:50 |
rlandy | https://github.com/openstack-infra/tripleo-ci/blob/master/scripts/prepare-ovb-cloud.sh#L13 | 19:50 |
rlandy | I am not sure it is xlarge | 19:51 |
rlandy | will have to check it when rdocloud returns to us | 19:51 |
rlandy | hmmm ... today was not a good day to try to sell moving to rdocloud | 19:53 |
*** waleedm has joined #oooq | 19:58 | |
weshay|ruck | lolz | 20:02 |
weshay|ruck | no doubt | 20:02 |
weshay|ruck | I think rdo-cloud was scared of Joe | 20:02 |
rlandy | shame | 20:04 |
*** noama has joined #oooq | 20:18 | |
weshay|ruck | rlandy, when you have a sec https://docs.google.com/document/d/12XqodWjRUHd-AskAJJ543JtGLK6ZRzhxw_YlOqyhWdc/edit | 20:26 |
weshay|ruck | please just give it a quick glance before I share w/ Joe and the crew | 20:26 |
weshay|ruck | rlandy, can you add me to that ticket please too :) | 20:28 |
weshay|ruck | myoung, rlandy anyone know what happened to the pike branch here? https://code.engineering.redhat.com/gerrit/gitweb?p=tripleo-environments.git | 20:33 |
weshay|ruck | err sorry | 20:33 |
weshay|ruck | wrong git repo :) https://code.engineering.redhat.com/gerrit/gitweb?p=tripleo-quickstart.git | 20:34 |
* myoung looks | 20:34 | |
weshay|ruck | rook, https://bugs.launchpad.net/python-cliff/+bug/1597846 | 20:37 |
openstack | Launchpad bug 1597846 in cliff "Cannot install cliff 2.1.0 in Python 3.x" [Undecided,Invalid] | 20:37 |
rlandy | weshay|ruck: I think you are on the ticket I forwarded | 20:44 |
weshay|ruck | don't see it in my list | 20:45 |
weshay|ruck | searching for it by id does not bring it up | 20:45 |
rlandy | weshay|ruck: browbeat notes look good | 20:47 |
weshay|ruck | k. thanks | 20:47 |
rlandy | few minor comments | 20:47 |
*** hamzy has quit IRC | 20:47 | |
rlandy | weshay|ruck: you should have access to the ci-rhos flavors ticket now ... https://redhat.service-now.com/pnt?id=ticket&table=x_redha_pnt_devops_table&sys_id=deede177db321f00a9e306e2ca961993 | 20:48 |
rlandy | checking the reconfig networking ticket | 20:49 |
weshay|ruck | thanks | 20:50 |
weshay|ruck | see it | 20:50 |
*** hubbot has joined #oooq | 20:51 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 20:52 |
*** waleedm has quit IRC | 21:00 | |
*** trown is now known as trown|outtypewww | 21:01 | |
rlandy | weshay|ruck: https://redhat.service-now.com/pnt/?id=ticket&table=x_redha_pnt_devops_table&sys_id=6bd1c3f3db7e5f003306abc5ca96190d | 21:02 |
myoung | weshay|ruck: w.r.t pike repo, what do you mean? looks like it is in sync with upstream etc.. | 21:02 |
rlandy | you should see both now | 21:02 |
myoung | weshay|ruck: is there an issue with stable/pike branch? | 21:02 |
weshay|ruck | myoung, no.. just the pip cache | 21:03 |
weshay|ruck | in jenkins | 21:03 |
weshay|ruck | do you by chance recall how to disable that? | 21:03 |
* myoung looks | 21:03 | |
myoung | oh yes! | 21:03 |
myoung | sec | 21:03 |
* myoung fetches details | 21:03 | |
rlandy | we ff'ed it to enable baremetal | 21:03 |
weshay|ruck | rlandy, ya.. the branch is not the issue | 21:03 |
weshay|ruck | we're hitting https://bugs.launchpad.net/python-cliff/+bug/1597846 in jenkins only | 21:04 |
openstack | Launchpad bug 1597846 in cliff "Cannot install cliff 2.1.0 in Python 3.x" [Undecided,Invalid] | 21:04 |
myoung | weshay|ruck: was this: https://bugs.launchpad.net/tripleo/+bug/1772460 | 21:05 |
openstack | Launchpad bug 1772460 in tripleo "rdo2: BM jobs failing b/c concurrent pip installs are failing due to sharing pip cache" [Critical,Fix released] - Assigned to Matt Young (halcyondude) | 21:05 |
myoung | XDG_CACHE_HOME=$HOME/.cache/$EXECUTOR_NUMBER | 21:05 |
weshay|ruck | k.. saw that in the config | 21:05 |
weshay|ruck | however it needs to be cleared | 21:05 |
myoung | so in jenkins workspace dirs (from which we run ansible) the pip cache should be located in ~/.cache/{ordinal} - so multiple jobs on same executor don't overlap | 21:06 |
weshay|ruck | hrm | 21:06 |
weshay|ruck | I removed the workspace on the slave | 21:06 |
weshay|ruck | well rm -Rf workspace/* | 21:06 |
myoung | the issue i hit was timing, when both were running pip installs and accessing a shared pip cache concurrently, was getting issues | 21:06 |
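Editor's sketch of the per-executor cache setting myoung describes above, so that concurrent jobs on one slave do not share a pip cache. `EXECUTOR_NUMBER` is the variable Jenkins sets per executor; the helper name `executor_cache_dir` and the fallback value `0` are assumptions added for illustration and are not taken from the actual job config.

```shell
#!/usr/bin/env bash
# Sketch only: derive a separate cache directory per Jenkins executor.
# executor_cache_dir is a hypothetical helper, not part of the real job config.

executor_cache_dir() {
    # Print the cache path for the executor number passed as $1.
    printf '%s/.cache/%s\n' "$HOME" "$1"
}

# Fall back to 0 when run outside Jenkins (assumption, for local testing).
XDG_CACHE_HOME="$(executor_cache_dir "${EXECUTOR_NUMBER:-0}")"
export XDG_CACHE_HOME

# On Linux pip keeps its cache under $XDG_CACHE_HOME/pip, so each executor
# now resolves to its own pip cache directory.
echo "pip cache for this executor: $XDG_CACHE_HOME/pip"
```

A stale cache can then be cleared per executor by removing that one directory rather than the whole workspace.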
weshay|ruck | pip install -r requirements works for me locally in python27 and 3 | 21:06 |
weshay|ruck | ya.. that is unrelated | 21:07 |
myoung | aye | 21:07 |
myoung | have link? can look | 21:07 |
* myoung looks in bug | 21:07 | |
weshay|ruck | https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/periodic-pike-rdo_trunk-featureset020-1ctlr_1comp_64gb/65/console | 21:07 |
myoung | ahh 00:00:00.001 Started by upstream project "sbtest-tripleo-quickstart-pike-rdo_trunk" build number 40 | 21:07 |
myoung | weshay|ruck: ok so now these are running on py3? | 21:09 |
* myoung chuckles as he sees "Installed /Users/cleverdevil/.virtualenvs/wham/build/cliff" ... cleverdevil, wham, and cliff on the same line. omen? | 21:09 | |
weshay|ruck | I think it's phucked on the pip mirror | 21:12 |
myoung | weshay|ruck: yeah bug ref's setuptools > 17.1, but we're pulling 39.2... | 21:12 |
*** yolanda_ has joined #oooq | 21:12 | |
myoung | weshay|ruck: can we (for now) freeze/fix python-cliff back a minor rev? | 21:13 |
myoung | or is something requiring the new one? | 21:13 |
* myoung looks | 21:13 | |
myoung | ahh wait... | 21:14 |
myoung | 00:01:21.954 Uninstalling setuptools-12.0.5: | 21:14 |
myoung | 00:01:22.027 Successfully uninstalled setuptools-12.0.5 | 21:14 |
myoung | and from LP | 21:14 |
myoung | This is a serious issue and the current workaround is to pre-install `setuptools>=17.1` BEFORE running pip install. | 21:14 |
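Editor's sketch of the workaround quoted from the Launchpad bug: make sure `setuptools>=17.1` is present before the main `pip install`. The helper names (`version_ge`, `needs_setuptools_upgrade`, `preinstall_then_install`) are hypothetical, and the version comparison assumes GNU `sort -V` is available on the slave.

```shell
#!/usr/bin/env bash
# Sketch only: all function names here are illustrative assumptions.

# version_ge A B: succeeds when version A is greater than or equal to B.
# Relies on GNU sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# needs_setuptools_upgrade VERSION: succeeds when VERSION is older than 17.1,
# the minimum named in the bug report.
needs_setuptools_upgrade() {
    ! version_ge "$1" "17.1"
}

# preinstall_then_install FILE: install setuptools first, then the
# requirements file. Defined here but not invoked by this snippet.
preinstall_then_install() {
    pip install 'setuptools>=17.1' && pip install -r "$1"
}
```

With the versions seen in the console log, `needs_setuptools_upgrade 12.0.5` succeeds (upgrade needed) while `needs_setuptools_upgrade 39.2.0` fails (already new enough).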
*** yolanda__ has quit IRC | 21:14 | |
weshay|ruck | f.. slave is on f22 | 21:16 |
weshay|ruck | myoung, rlandy ya.. f... f22 is the reason | 21:19 |
myoung | weshay|ruck: can hope to 22 --> 24 --> 26 pretty quickly | 21:20 |
myoung | hop* | 21:20 |
* myoung looks at which slave this is...thought we upgraded these already | 21:20 | |
rlandy | cool | 21:20 |
rlandy | we upgraded some | 21:21 |
myoung | ya rdo-manager-slave_rdo-ci-fx2-01-s6 is f22 | 21:22 |
myoung | checking the others | 21:23 |
myoung | rlandy, weshay|ruck, looks like fx2-01-s2 and fx2-01-s3 are also fedora 22 | 21:26 |
myoung | updated descriptions in jenkins with a *f22* prefix | 21:26 |
myoung | also fx2-01-s1 | 21:27 |
*** noama has quit IRC | 21:28 | |
myoung | weshay|ruck, rlandy, https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/computer/rdo-manager-slave_rdo-ci-fx2-01-s5 is at f26 and is also now back in the rdo-manager-64 pool. we can take out the f22's to a diff pool until upgraded to get these jobs rolling now...if ok i'll do that now | 21:29 |
*** myoung is now known as myoung|off | 21:56 | |
*** yolanda__ has joined #oooq | 22:07 | |
*** agopi has joined #oooq | 22:07 | |
*** yolanda_ has quit IRC | 22:10 | |
*** yolanda_ has joined #oooq | 22:21 | |
*** yolanda__ has quit IRC | 22:24 | |
*** dsneddon has quit IRC | 22:39 | |
*** dsneddon has joined #oooq | 22:41 | |
*** rlandy has quit IRC | 22:49 | |
*** dsneddon has quit IRC | 22:50 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 22:52 |
*** tosky has quit IRC | 23:08 | |
*** dsneddon has joined #oooq | 23:22 |