Monday, 2018-11-05

*** rkukura has joined #openstack-infra00:01
*** longkb has joined #openstack-infra00:10
*** slaweq has joined #openstack-infra00:13
*** slaweq has quit IRC00:45
*** slaweq has joined #openstack-infra01:11
*** armax has joined #openstack-infra01:21
*** ssbarnea has quit IRC01:33
*** yamamoto has joined #openstack-infra01:40
*** slaweq has quit IRC01:44
hogepodgeIf a node goes down in a gate job, is there a way to recover any diagnostics on it if you know things like the hostname and job it was associated with?01:52
clarkbhogepodge: we can determine the uuid and follow up with the cloud provider, but it's largely on them I think, and depending on the cloud they may be more or less willing to do that01:56
clarkbhogepodge: other things to consider: if using nested virt that can crash the VM in some cases, and if modifying the network like interfaces or firewalls that can make the node go away too01:58
hogepodgeclarkb: I'm looking at one of the five node jobs. The workflow sets up a kubernetes cluster, one control plane node and four worker nodes. All of the nodes come up fine, and some pods are even deployed to it. Then the node disappears and I can't find any errors in the logs suggesting what happened.02:03
hogepodge(when I say "the node" I mean "a node")02:03
clarkbhogepodge: what cloud was it in?02:03
clarkba d region02:03
hogepodgerax-ord02:04
*** hongbin has joined #openstack-infra02:05
hogepodgeIt took me a while to figure out how to read ARA reports.02:05
clarkbhogepodge: if we collect info to go back into logs we can see if cloudnull is able to help us debug rax side02:05
clarkbhostname, ip addresses, timestamps02:06
hogepodgeplaybooks are shown in reverse chronological order, but tasks in chronological02:06
hogepodgehostname ubuntu-xenial-rax-ord-000032130202:06
clarkblink to the logs would probably help too, just to find any other info later if necessary02:07
hogepodgeactually, here's all the hostinfo http://logs.openstack.org/57/608057/7/check/openstack-helm-infra-five-ubuntu/daa7967/zuul-info/host-info.node-4.yaml02:07
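For reference, the zuul-info files collected with every job already contain the identifiers clarkb asked for (hostname, addresses, timestamps). A rough sketch of pulling them out of a host-info fact dump like the one linked above; the fact names are the usual Ansible ones and are assumptions here, not something confirmed in this log:

```bash
# Grab the facts file a job collected for the suspect node and pick out
# the identifying bits a cloud provider usually asks for.
curl -s http://logs.openstack.org/57/608057/7/check/openstack-helm-infra-five-ubuntu/daa7967/zuul-info/host-info.node-4.yaml \
  | grep -E -A2 'ansible_hostname|ansible_product_uuid|ansible_default_ipv4'
```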
hogepodgehave to run, I'll check back in tomorrow or later02:09
clarkbya, I can't debug tonight and chances are we need someone cloud side anyway02:09
clarkbbut we should also try to rule out the test breaking networking on the node too02:10
clarkbfirewalls/sshd/interfaces02:10
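A minimal sketch of the kind of before/after snapshot a job could log to rule that out; commands assume an Ubuntu test node with systemd and iptables:

```bash
# Capture the things that most commonly cut a test node off from the executor.
sudo iptables -L -n -v            # firewall rules (ip6tables -L -n for v6)
ip addr show                      # interfaces and addresses still present?
ip route show                     # default route intact?
systemctl status ssh --no-pager   # sshd still running? (unit is "sshd" on RPM distros)
```

Diffing that output from early in the job against the last successful collection narrows down whether the test itself broke connectivity.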
*** slaweq has joined #openstack-infra02:11
*** rkukura has quit IRC02:15
hogepodgeI don’t think it does. Recheck passed.02:38
*** slaweq has quit IRC02:45
*** yamamoto has quit IRC02:49
*** mrsoul has quit IRC02:50
fungiianw: there's an alternate armci.yaml in my homedir on the bridge with working initial temporary passwords for the new cloud. i haven't reset and recorded them in the usual places yet but if you want to play around with that feel free02:54
ianwfungi: oh, cool, was just replying that the credentials don't work :)03:03
*** slaweq has joined #openstack-infra03:13
*** bhavikdbavishi has joined #openstack-infra03:21
*** annp has joined #openstack-infra03:40
*** bhavikdbavishi has quit IRC03:42
*** slaweq has quit IRC03:44
*** bhavikdbavishi has joined #openstack-infra03:57
*** mino_ has joined #openstack-infra03:58
*** dave-mccowan has quit IRC04:00
*** ramishra has joined #openstack-infra04:09
*** slaweq has joined #openstack-infra04:16
*** yamamoto has joined #openstack-infra04:24
*** truongnh has joined #openstack-infra04:27
*** udesale has joined #openstack-infra04:31
*** ykarel has joined #openstack-infra04:40
*** janki has joined #openstack-infra04:47
*** slaweq has quit IRC04:48
*** hongbin has quit IRC04:50
*** slaweq has joined #openstack-infra05:16
*** pcaruana has joined #openstack-infra05:23
*** bhavikdbavishi has quit IRC05:27
*** pcaruana has quit IRC05:32
*** slaweq has quit IRC05:44
*** zul has quit IRC05:49
*** truongnh has quit IRC05:55
*** truongnh has joined #openstack-infra06:04
*** slaweq has joined #openstack-infra06:11
*** e0ne has joined #openstack-infra06:17
*** rkukura has joined #openstack-infra06:32
*** quiquell has joined #openstack-infra06:35
*** gfidente has joined #openstack-infra06:36
*** e0ne has quit IRC06:36
*** Dobroslaw has joined #openstack-infra06:37
*** slaweq has quit IRC06:39
*** chkumar|off is now known as chandankumar06:41
*** rkukura has quit IRC06:42
*** felipemonteiro has joined #openstack-infra06:49
*** sshnaidm|off is now known as sshnaidm|rover07:07
*** jbadiapa has joined #openstack-infra07:11
*** slaweq has joined #openstack-infra07:11
*** dpawlik has joined #openstack-infra07:12
*** kjackal has joined #openstack-infra07:13
*** jaosorior has joined #openstack-infra07:13
*** slaweq has quit IRC07:16
*** dpawlik has quit IRC07:16
*** jaosorior has quit IRC07:24
*** jaosorior has joined #openstack-infra07:27
*** dpawlik has joined #openstack-infra07:42
*** ccamacho has joined #openstack-infra07:44
*** kopecmartin|off is now known as kopecmartin07:45
*** dpawlik has quit IRC07:47
*** truongnh has quit IRC07:52
*** felipemonteiro has quit IRC08:00
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Try to reproduce hanging paused job  https://review.openstack.org/61549308:05
*** pcaruana has joined #openstack-infra08:06
*** jtomasek has joined #openstack-infra08:12
*** ralonsoh has joined #openstack-infra08:14
*** dpawlik has joined #openstack-infra08:17
amorinhey all08:26
*** florianf|afk is now known as florianf08:30
*** jtomasek has quit IRC08:36
*** jtomasek has joined #openstack-infra08:44
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Update static.o.o publishing  https://review.openstack.org/61550108:46
*** bauwser is now known as bauzas08:56
*** jpena|off is now known as jpena08:56
*** jpich has joined #openstack-infra08:57
*** lucasagomes has joined #openstack-infra08:57
openstackgerritTobias Henkel proposed openstack-infra/zuul master: WIP: Try to reproduce hanging paused job  https://review.openstack.org/61549308:58
*** ccamacho has quit IRC09:04
*** e0ne has joined #openstack-infra09:10
*** rossella_s has joined #openstack-infra09:12
*** ccamacho has joined #openstack-infra09:12
*** pcaruana has quit IRC09:31
*** pcaruana has joined #openstack-infra09:32
*** adriancz has joined #openstack-infra09:39
*** noama has joined #openstack-infra09:40
*** ykarel is now known as ykarel|lunch09:41
*** slaweq has joined #openstack-infra09:41
AJaegerfungi, could you review this ossa change, please? https://review.openstack.org/61549809:43
*** ramishra_ has joined #openstack-infra09:49
*** ramishra has quit IRC09:49
*** derekh has joined #openstack-infra09:49
*** shrasool has joined #openstack-infra09:52
*** trown has quit IRC09:55
*** trown has joined #openstack-infra09:58
openstackgerritNoam Angel proposed openstack/diskimage-builder master: move selinux-permissive configure to pre-install phase  https://review.openstack.org/61551910:00
openstackgerritNoam Angel proposed openstack/diskimage-builder master: move selinux-permissive configure to pre-install phase  https://review.openstack.org/61551910:00
*** longkb has quit IRC10:01
openstackgerritNoam Angel proposed openstack/diskimage-builder master: move selinux-permissive configure to pre-install phase  https://review.openstack.org/61551910:03
*** ykarel|lunch is now known as ykarel10:05
*** panda has joined #openstack-infra10:05
*** electrofelix has joined #openstack-infra10:06
*** ccamacho has quit IRC10:15
iceythe openstack bot doesn't seem to be handling the meeting bits this morning :-/10:16
*** ccamacho has joined #openstack-infra10:17
*** shardy has joined #openstack-infra10:21
*** dtantsur|afk is now known as dtantsur\10:35
*** dtantsur\ is now known as dtantsur10:35
*** mino_ has quit IRC10:35
*** d0ugal has quit IRC10:36
*** maciejjozefczyk has quit IRC10:50
fricklericey: in which channel did you see issues?10:54
*** maciejjozefczyk has joined #openstack-infra10:54
*** d0ugal has joined #openstack-infra10:55
*** ssbarnea has joined #openstack-infra11:04
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Resume paused job with skipped children  https://review.openstack.org/61549311:06
iceyfrickler: #openstack-meeting-411:15
*** rfolco|off has joined #openstack-infra11:22
fricklericey: hmm, it did seem to work fine when I just tested it, sadly I wasn't online in that channel earlier, so not sure what went wrong there. maybe some other infra-root can take a look later11:22
*** rfolco|off is now known as rfolco|ruck11:24
iceyNo worries, meeting finished anyways:-)11:24
*** tpsilva has joined #openstack-infra11:31
*** dpawlik has quit IRC11:38
fricklericey: oh, I just checked the logs, you had a space in front of your #startmeeting command, which is why the bot ignored it. you don't see it in the html log, but if you check the text version you can see it http://eavesdrop.openstack.org/irclogs/%23openstack-meeting-4/%23openstack-meeting-4.2018-11-05.log11:46
*** beekneemech has quit IRC11:53
iceyHeh that explains it, thanks!11:57
*** bnemec has joined #openstack-infra11:57
*** dave-mccowan has joined #openstack-infra12:04
*** janki has quit IRC12:05
*** rh-jelabarre has joined #openstack-infra12:07
*** pbourke has quit IRC12:10
*** pbourke has joined #openstack-infra12:11
*** udesale has quit IRC12:14
*** roman_g has joined #openstack-infra12:15
*** dpawlik has joined #openstack-infra12:16
*** dpawlik has quit IRC12:18
*** dpawlik has joined #openstack-infra12:18
*** jpena is now known as jpena|lunch12:20
*** dpawlik has quit IRC12:23
*** dpawlik has joined #openstack-infra12:26
*** dpawlik has quit IRC12:27
*** dpawlik has joined #openstack-infra12:28
*** dpawlik_ has joined #openstack-infra12:30
*** dpawlik has quit IRC12:31
*** dpawlik_ has quit IRC12:32
*** dpawlik has joined #openstack-infra12:32
*** jroll has quit IRC12:32
*** jroll has joined #openstack-infra12:34
openstackgerritMerged openstack-infra/zuul master: Fix unreachable nodes detection  https://review.openstack.org/60282912:34
*** dpawlik_ has joined #openstack-infra12:35
openstackgerritMerged openstack-infra/zuul master: Also retry the job if a post job failed with unreachable  https://review.openstack.org/60283012:36
*** dpawlik has quit IRC12:37
*** ansmith has quit IRC12:37
*** e0ne_ has joined #openstack-infra12:37
*** e0ne has quit IRC12:40
*** dpawlik_ has quit IRC12:42
*** dpawlik has joined #openstack-infra12:42
*** boden has joined #openstack-infra12:43
openstackgerritMerged openstack-infra/zuul-jobs master: Add role to install kubernetes  https://review.openstack.org/60582312:48
*** rlandy has joined #openstack-infra12:49
*** dtantsur is now known as dtantsur|brb12:51
*** zul has joined #openstack-infra13:15
*** AJaeger has quit IRC13:22
*** jpena|lunch is now known as jpena13:25
*** jtomasek has quit IRC13:27
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Implement an OpenShift resource provider  https://review.openstack.org/57066713:28
*** jtomasek has joined #openstack-infra13:28
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Implement an OpenShift Pod provider  https://review.openstack.org/59033513:28
smcginnisEtherpad appears to be having issues again.13:29
*** haleyb has joined #openstack-infra13:30
*** jistr is now known as jistr|call13:32
*** panda is now known as panda|bbl13:36
*** AJaeger has joined #openstack-infra13:37
*** jcoufal has joined #openstack-infra13:38
*** yamamoto has quit IRC13:38
*** yamamoto has joined #openstack-infra13:38
arxcruzgmann: around?13:44
gmannarxcruz: kind of but i need to go away soon if nothing urgent.13:46
arxcruzgmann: just wondering, I'm noticing on tripleo that tempest scenarios are not running in parallel13:46
arxcruzapi tests are running in parallel, but scenarios start to use only one worker13:47
gmannarxcruz: which job (tox env) do you use?13:47
arxcruzgmann: wondering if you know if that's a known issue, or if i need to change any configuration under stestr, or in tempest itself13:47
arxcruzgmann: we use tempest run directly, not from tox13:47
*** jamesmcarthur has quit IRC13:47
arxcruzgmann: i'm aware of the serial scenario run in tox13:48
*** jamesmcarthur has joined #openstack-infra13:48
gmannarxcruz: as in the integrated gate (tempest-full job), scenario tests are run in serial due to an ssh timeout issue we faced13:48
gmannarxcruz: ok13:48
gmannarxcruz: in that case, it should run in parallel as long as you do not explicitly tell it to run them in serial13:49
arxcruzgmann: https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-master/480e81e/logs/stackviz/#/testrepository.subunit/timeline13:49
arxcruzan example13:49
openstackgerritMerged openstack-infra/zuul master: web: uses queues uid to preserve state on change  https://review.openstack.org/61493313:50
arxcruzgmann: https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-master/480e81e/logs/undercloud/home/zuul/tempest_container.sh.txt.gz13:51
dhellmannAJaeger : I realized over the weekend that we have a small problem with the release job for heat. I know generally how to fix it, but could use some advice on details.13:53
dhellmannThe issue is that we're changing the dist name on master in order to be able to publish to pypi13:53
dhellmannThat's not something we're going to want to backport, though13:53
dhellmannSo we need to continue to use the old tarball-only release job on the other branches13:54
*** felipemonteiro has joined #openstack-infra13:54
dhellmannI can set up the job to use a regex based on the tags to know when to run or not run13:54
gmannarxcruz: it is not marked serial and concurrency is 3, so it should be parallel13:54
arxcruzgmann: exactly, but you see, the api tests run in parallel, but when it starts to run scenarios, it runs them in serial13:55
dhellmannbut the project-template has dependencies defined between some of the other jobs, so that the announce job only runs if we actually upload a release, and I don't know the best way to reproduce that (use the project-template and then separately set the matching regexes, or bring the whole set of jobs in as custom for heat)13:55
*** ansmith has joined #openstack-infra13:57
gmannarxcruz: not sure how you observe them as serial ?13:57
AJaegerdhellmann: use the individual jobs for heat, not the template. That's the only way out ...13:58
arxcruzgmann: stackviz shows the scenario running only on one worker13:58
arxcruzgmann: https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-master/480e81e/logs/stackviz/#/testrepository.subunit/timeline13:58
dhellmannAJaeger : ok. I'll work on a patch for that later today13:59
*** janki has joined #openstack-infra13:59
AJaegerdhellmann: and then let's check what to do with the dependencies - might be tricky ...14:00
arxcruzgmann: is tempest smart enough to, let's say, run these scenarios in serial if our quota is set to 1 vm?14:00
dhellmannAJaeger : I thought maybe if I added both job templates, I could then list the 2 release jobs with the matching regexes and zuul would figure out what I meant. :-)14:00
arxcruzlike, we don't have quota, so let's wait for one scenario to tear down everything before running the next14:00
*** jamesmcarthur has quit IRC14:01
AJaegerdhellmann: we need to ask corvus for that - I suggest you push a change out and ask him.14:01
*** ginopc has joined #openstack-infra14:02
dhellmannAJaeger ++14:02
gmannarxcruz: i can see them in parallel. for example - tempest.scenario.test_volume_boot_pattern and tempest.scenario.test_network_advanced_server_ops at the same time @4.3614:02
arxcruzgmann: but why the last tests are running only on one single worker ?14:03
arxcruzwhile the other 3 workers are doing nothing14:03
gmannarxcruz: at the end it is only scenarios, as the api tests are done, and those remaining scenarios get a worker after that14:03
arxcruzgmann: so, at the end, we have 3 workers doing nothing, and one worker running all scenarios14:03
arxcruzall the remaining scenarios14:03
*** kgiusti has joined #openstack-infra14:04
arxcruzgmann: here is more visible that http://logs.openstack.org/33/615133/2/check/tripleo-ci-centos-7-standalone/9ac46cc/logs/stackviz/#/testrepository.subunit/timeline14:04
*** bobh has joined #openstack-infra14:05
gmannarxcruz: RE: quota things. no, Tempest does not wait on quota. if there is not enough quota then it will error if any test creates more than the allowed quota14:05
gmannarxcruz: one thing: class-level tests are all serial. so you can see all tests in the tempest.scenario.test_network_basic_ops class will always run in serial.14:07
gmannarxcruz: but i can see TestVolumeBootPattern tests did not get started on worker 1 or 314:07
*** jistr|call is now known as jistr14:08
gmannarxcruz: and i think you will see different behaviour depending on CPU usage on the machine. in the last link you pasted, only these 2 tests are running serially and the rest all in parallel, and that depends on CPU allocation to workers etc. Tempest does not have any control over those things14:10
*** mriedem has joined #openstack-infra14:11
gmannarxcruz: from the Tempest side, tests are queued for parallel run at class level and depending on CPU availability they get executed.14:12
arxcruzgmann: is there a way to change that ?14:13
arxcruzi mean, can class-level tests run in parallel?14:13
arxcruzgmann: do you know where the scheduler for this cpu availability is?14:13
gmannarxcruz: the only thing we can do to run faster is to increase the number of workers.14:13
mordredEmilienM: I've started looking at podman for infra things - and one of the first things that jumps out at me is that the config file for configuring registries (and mirrors) is different than from docker... in the tripleo jobs, when you're doing podman stuff - are you guys writing the per-region mirror info into /etc/containers/registries.conf ?14:13
gmannarxcruz: depends on your machine you run tests14:14
gmannarxcruz: need to leave office otherwise i am going to miss my last train. will try to catch you from home or tomorrow.14:14
EmilienMmordred: so I have bad news for you14:14
arxcruzgmann: ok, thanks!14:14
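For context, the knobs being discussed are tempest run's concurrency flags; a hedged sketch (the regex and worker count are made up for illustration):

```bash
# Spread API and scenario tests across 4 workers; whether they actually
# overlap near the end of the run still depends on class-level locking
# and on how many tests remain in the queue.
tempest run --regex '(tempest\.api|tempest\.scenario)' --concurrency 4

# The serial scenario behaviour gmann mentions for the integrated gate is opt-in:
tempest run --regex 'tempest\.scenario' --serial
```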
mordredEmilienM: oh no! not bad news!14:15
*** panda|bbl is now known as panda14:15
EmilienMmordred: there is no support for mirrors / proxies in podman. In fact the code that pulls images is in https://github.com/containers/image14:15
*** ginopc has quit IRC14:15
EmilienMmordred: but AFAIK there is nothing in podman that consumes it14:15
EmilienMmordred: the only thing that we were able to do is to use insecure registries from the undercloud when deploying14:16
EmilienMmordred: with [registries.insecure]14:16
mordredEmilienM: but you have to explicitly configure registries - so couldn't you just put in the mirror url instead of docker.io in the registries list?14:17
EmilienMmordred: https://github.com/containers/image/issues/52914:17
AJaegerEmilienM: are you sure? Let me double check with a colleague...14:17
EmilienMmordred: I believe you can do that yes14:18
*** quiquell is now known as quiquell|lunch14:18
EmilienMthere is no equivalent of registry-mirrors in podman14:18
mordredEmilienM: sweet. I'm going to poke at that for a little bit - on test nodes, I would NEVER want them to fallback to docker.io and to always only ever talk to the mirror ... I'll let you know if I get it working14:18
*** ginopc has joined #openstack-infra14:19
EmilienMmordred: question: on which OS/version are you running tests?14:19
*** dtantsur|brb is now known as dtantsur14:19
EmilienMmordred: if centos7, you want to pull podman from https://buildlogs.centos.org/centos/7/virt/x86_64/container/14:19
EmilienMto get the latest version14:19
AJaegermordred, EmilienM, before you try alternatives, let's wait for my question - I know a colleague implemented something for cri-o and expect this works on podman - but let's see whether my colleague is around...14:19
EmilienMAJaeger: no problem14:20
EmilienMin fact we might be able to use [registries.search] if it's a full mirror14:20
AJaegerEmilienM: quick answer: you're right ;(14:20
mordredEmilienM: yes - that's what I was thinking- it is a full mirror14:21
mordredEmilienM: I'm actually working with the libpod team on testing out their new shiny ubuntu bionic ppa14:21
EmilienMmordred: I'll give it a try today in tripleo-ci14:21
*** felipemonteiro has quit IRC14:21
EmilienMI don't know why I didn't think about it before14:21
EmilienMmordred: nice, so join us on #podman :D we have a nice collaboration between tripleo and them14:22
mordredEmilienM: cool.  I'm also going to make an install-podman role in zuul-jobs, similar to the install-docker role - that will install podman and set it up to talk to mirrors14:22
EmilienMoh nice14:22
ykarelmordred, can u check https://review.openstack.org/#/c/615543/, py27 was running with python314:23
mordredykarel: yup - I believe I've +2'd it already - Shrews - you wanna +A it?14:24
Shrewsmordred: no? b/c test failure?14:24
ykarelmordred, ok there is some patch already14:24
*** fried_rice is now known as efried14:25
ykarelpy27 test actually failing after this14:25
mordredShrews: oh. well, bah on test failures14:25
mordredykarel: awesome! do you want to fix those failures in that patch? if not, I can look in to it in a little while14:25
mordredand looks like getting it updated would in fact be important :)14:25
ykarelmordred, u can take it over14:26
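Side note on the "py27 was running with python3" symptom: the usual guard is pinning the interpreter in the tox environment. A generic sketch, not necessarily what the pending patch does:

```ini
# tox.ini (hypothetical): pin the interpreter so the py27 env cannot
# silently fall back to whatever "python" or an inherited basepython
# resolves to.
[testenv:py27]
basepython = python2.7
```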
ykarelmordred, actually detected in package build: https://logs.rdoproject.org/23/17223/1/check/legacy-rdoinfo-DLRN-check/ed2bc9a/buildset/centos-rpm-master/repos/99/47/9947219c4ee8da7b5d8478f0904fbb6f48f9a457_dev/mock.log14:27
ykarelthere we have multiple failures apart from CI, thread.error: can't start new thread14:28
*** jamesmcarthur has joined #openstack-infra14:28
mordredykarel: the thread error is going to be an issue in tests related to the task manager work - and _should_ be better now- but also there are a few more patches coming that should make that much better14:30
mordredykarel: and awesome- I will do - thanks for pointing it out!14:30
ykarelmordred, Thanks will keep an eye there14:30
ykarelit's issue with 2.19 release, may be next release would be better14:31
*** shrasool has quit IRC14:33
mordredykarel: I hope so - we've got a patch in flight to keystoneauth which will actually make us stop creating the additional threads in the first place - should all be sorted reasonably soon14:36
ykarelmordred, ack Thanks14:36
* mordred afks for just a bit14:37
*** SteelyDan is now known as dansmith14:37
openstackgerritFabien Boucher proposed openstack-infra/zuul master: WIP - Pagure driver  https://review.openstack.org/60440414:39
openstackgerritJean-Philippe Evrard proposed openstack-infra/project-config master: Add notifications to openstack-helm  https://review.openstack.org/61557214:40
*** felipemonteiro has joined #openstack-infra14:42
*** felipemonteiro has quit IRC14:45
*** ccamacho has quit IRC14:50
*** ccamacho has joined #openstack-infra14:52
*** rh-jelabarre has quit IRC14:52
*** sthussey has joined #openstack-infra14:53
openstackgerritMerged openstack-infra/nodepool master: Implement a Kubernetes driver  https://review.openstack.org/53555714:54
openstackgerritMerged openstack-infra/nodepool master: Add tox functional testing for drivers  https://review.openstack.org/60951514:55
*** quiquell|lunch is now known as quiquell14:59
*** janki has quit IRC15:01
*** munimeha1 has joined #openstack-infra15:02
odyssey4meHi folks - it seems like the github mirrors aren't quite up to date. Can someone take a peek?15:03
odyssey4meAs an example, https://git.openstack.org/cgit/openstack/openstack-ansible-os_cinder/commit/?id=02fa53d9de7c984e84710520f966be56e12e988c is not present in the github mirror.15:03
*** jistr is now known as jistr|call15:04
*** jistr|call is now known as jistr15:10
fungiagreed, https://github.com/openstack/cinder/commit/02fa53d9de7c984e84710520f966be56e12e988c seems to return a 40415:21
fungiand gh claims the last merge cinder commit in master is 3 days old15:22
*** d34dh0r53 has quit IRC15:22
*** cloudnull has quit IRC15:22
*** eglute has quit IRC15:22
*** cloudnull has joined #openstack-infra15:23
*** d34dh0r53 has joined #openstack-infra15:23
*** eglute has joined #openstack-infra15:23
fungiit does indeed look like there are a bunch of pushes to git@github.com:$someproject waiting as far back as Nov-03 20:59 (utc)15:26
corvusfungi: cf1a92a7              Nov-03 20:58      [6f2926c3] push git@github.com:openstack/keystone.git15:27
corvusyeah15:27
corvuslooks like that one is "running" but stuck, and the others are behind it15:27
fungioh, yep15:27
corvusi bet if we kill it, things will resume15:27
corvusi'll go ahead and kill it and the other stuck tasks15:29
fungi1bc9f3356d3076334a4b9b36283f463e0a65fa8b seems to be the last commit to replicate to gh for openstack/keystone15:30
corvusthere's another keystone push right after, so it'll probably get updated anyway15:30
fungiso i _think_ 733b37f24d874193b965528bacf1fd56ccffbc79 is what was hung replicating?15:31
fungi(maybe)15:31
fungii don't see anything weird about that change ( https://review.openstack.org/615400 ) so it was probably just something going sideways with the connection15:31
corvusi killed the task; keystone looks a bit more up to date now15:32
EmilienMmordred, AJaeger : so if you configure registries.search with the mirror (without http://) and then add it to registries.insecure as well (without http://) and then run "podman pull --tls-verify=false myimage", it works15:32
EmilienMmordred, AJaeger : it would be nice to provide https (secured) mirrors to avoid this workaround though15:32
EmilienMbut it's good to know we can use mirrors in /etc/containers/registries.conf15:33
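A sketch of the /etc/containers/registries.conf EmilienM describes; the mirror hostname and port are placeholders, and the [registries.insecure] entry is only needed while the mirror is plain http (hence the --tls-verify=false pull above):

```toml
# /etc/containers/registries.conf (hypothetical mirror name)
[registries.search]
registries = ['mirror.regionone.example.openstack.org:8082']

[registries.insecure]
registries = ['mirror.regionone.example.openstack.org:8082']
```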
fungitask count is slowly falling15:33
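For the record, clearing a wedged replication push like this is done over Gerrit's admin SSH interface, roughly along these lines (the task id is the one corvus quoted above; exact flags may vary by Gerrit version):

```bash
# List pending/running background tasks - stuck replication pushes show up here.
ssh -p 29418 review.openstack.org gerrit show-queue -w

# Cancel the wedged task by id; the queued pushes behind it then resume.
ssh -p 29418 review.openstack.org kill cf1a92a7
```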
corvusEmilienM: there's a spec for using letsencrypt in infra; i imagine if we can find someone to work on that, we could probably have tls mirrors15:33
EmilienMcorvus: nice15:33
EmilienMcorvus: I might know someone15:34
corvusEmilienM: https://review.openstack.org/58728315:34
EmilienMyeah15:34
* Tengu hides15:36
EmilienMI'll put links here but spredzy (Yanis) wrote https://github.com/Spredzy/ansible-role-lecm and https://github.com/Spredzy/lecm15:37
EmilienMwhich I'm sure can help15:37
TenguI started some fancy work for LE integration on my own, last year - it was focused on the public endpoints though, and took care of detecting where the VIP is, and sync the certificate between HA controllers.15:38
Tengunever had time to finish, unfortunately.15:39
*** sdoran has left #openstack-infra15:39
Tenguhttps://github.com/cjeanneret/openstack-certodia15:41
clarkbEmilienM: mordred to be clear our proxies are not a mirror they are caching proxies15:42
clarkbso yes it is a full "mirror" of dockerhub15:42
Tenguclarkb: TLS MitM then?15:43
clarkbTengu: ?15:43
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Turn on the future parser for eavesdrop.o.o  https://review.openstack.org/59004815:43
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Turn on future parser for lists.katacontainers.io  https://review.openstack.org/60238015:43
clarkbTengu: I was following up to the earlier question about whether or not our "mirrors" were "full mirrors"15:44
clarkbour mirrors are actually just caches for the actual thing so: yes15:44
*** shrasool has joined #openstack-infra15:44
Tenguclarkb: dockerhub serves all its content via TLS (https) - in order to have a caching proxy it must decrypt the content. so man-in-the-middle (just a side question, not really that important)15:44
clarkbTengu: ah right, we actually only do http to the proxy then it does https to the backend15:45
clarkbso not really mitm'ing15:45
clarkb(at least we aren't pretending to have secured a connection to the backend)15:45
AJaegerfungi, could you review this ossa change, please? https://review.openstack.org/61549815:46
Tenguclarkb: :) ok! thanks for the precision.15:46
clarkbre tls mirrors in general. Keep in mind that for apt you have to install extra packages to speak tls15:47
clarkbso we shouldn't blindly put that in place across the board15:48
clarkb(though this is no longer true on bionic iirc)15:48
*** quiquell is now known as quiquell|off15:49
Tenguhmmm, latest version of apt doesn't need any additional package.15:49
clarkbTengu: ya I think for bionic and debian testing/unstable it's fine. But xenial for sure needs the apt tls package and unsure about debian stable15:50
Tengulatest debian stable already has the new apt version, and there's no "apt-https-transport" anymore (or whatever the name was back then)15:50
clarkbah ok so its just xenial then15:50
Tenguyeah, I think so15:50
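In other words, only xenial-era apt still needs the extra transport package before an https mirror URL will work:

```bash
# Needed on Ubuntu xenial (apt < 1.5); newer releases ship https support
# in apt itself, so the package is only a transitional dummy there.
sudo apt-get install -y apt-transport-https
```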
fungiyeah, but also apt repositories are rarely served via https anyway because it brings basically nothing useful to do so and it's hard to coordinate certs across a distributed volunteer-run mirror network15:50
Tenguanyway, I didn't touch any (stable) debian for a year now ^^'15:50
clarkbfungi: right, I mostly don't want someone to add tls to the mirrors and break apt15:50
fungiwe could just avoid redirecting http to https15:51
Tengu+115:51
clarkbTengu: re mitm too, since we use our own hostnames we should be able to mitm without any extra work as well. It's not like a squid pretending to be dockerhub, it's apache reverse proxying to dockerhub under its own name15:56
clarkb*any extra work on top of setting up tls on the proxy15:56
Tenguok :)15:56
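Very roughly, the kind of vhost clarkb is describing: an Apache reverse proxy that listens on plain http locally and talks https to Docker Hub, with disk caching. The hostname, port, and paths below are placeholders, not the actual infra configuration:

```apache
# Hypothetical caching reverse proxy (needs mod_proxy_http, mod_ssl, mod_cache_disk).
<VirtualHost *:8082>
    ServerName mirror.regionone.example.openstack.org
    SSLProxyEngine on
    ProxyPass        / https://registry-1.docker.io/
    ProxyPassReverse / https://registry-1.docker.io/
    CacheEnable disk /
    CacheRoot /var/cache/apache2/proxy
</VirtualHost>
```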
jungleboyjI am afraid that I have another corrupted etherpad:  https://etherpad.openstack.org/p/cinder-stein-meeting-agendas16:02
*** dklyle has joined #openstack-infra16:02
jungleboyjCould someone run the magic command to try to recover it.16:02
clarkbhogepodge: a6cb65b1-5897-45b4-86c8-d79ff74dc327 is the uuid of http://logs.openstack.org/57/608057/7/check/openstack-helm-infra-five-ubuntu/daa7967/zuul-info/host-info.node-4.yaml, the host you identified in rax-ord as having trouble16:03
clarkbcloudnull: ^ any chance you have a few minutes today to take a look at that or suggest further debugging? tldr is rax-ord instance lost networking16:04
*** dpawlik has quit IRC16:05
clarkbjungleboyj: that does make me wonder if someone doing cinder things has a client that causes that16:05
*** dpawlik_ has joined #openstack-infra16:05
clarkbjungleboyj: I'll take a look shortly16:05
*** luizbag has joined #openstack-infra16:05
jungleboyjclarkb:  Thank you so much.  We were using it when things were unhappy on the server last week.16:06
clarkbah that could be related too16:06
*** pcaruana has quit IRC16:06
AJaegerconfig-core, could you review https://review.openstack.org/615174 and https://review.openstack.org/615572 , please?16:07
*** ccamacho has quit IRC16:08
*** dpawlik_ has quit IRC16:12
*** dpawlik has joined #openstack-infra16:12
clarkbjungleboyj: http://paste.openstack.org/show/734152/16:13
jungleboyjclarkb  You rock!  Thank you!16:14
jungleboyjclarkb:  Take it the original page is lost?16:15
clarkbjungleboyj: yes, I don't think we know of a way to edit the database to recover those. The fix would likely be editing the json in the tables to make it valid16:16
*** maciejjozefczyk has quit IRC16:16
*** ramishra_ has quit IRC16:16
clarkbdtroyer: ildikov for https://review.openstack.org/#/c/615174/2 do we need to coordinate that to update groups quickly so that starlingx work doesn't grind to a halt?16:16
*** imacdonn has quit IRC16:16
openstackgerritsebastian marcet proposed openstack-infra/openstackid master: Migration to PHP 7.x  https://review.openstack.org/61193616:17
*** imacdonn has joined #openstack-infra16:17
clarkbdtroyer: ildikov: a volunteer to be the seed user on those groups would probably work? or we can add the old group to the new groups and you all can sort it out from there?16:17
*** dpawlik has quit IRC16:17
fungiclarkb: one workaround is that you can move the corrupt pad (there's an api call like move or rename) and then the original content can be pasted into a new pad at the old name after that so urls remain valid16:17
ildikovclarkb: I have an initial list for people, I can setup the groups if you add me16:17
clarkbildikov: ok I'll approve it now then16:18
ildikovclarkb: thanks!16:18
openstackgerritMerged openstack-infra/project-config master: Add notifications to openstack-helm  https://review.openstack.org/61557216:19
AJaegermrhillsman: could you review governance-sigs open changes, please? https://review.openstack.org/#/q/project:openstack/governance-sigs+is:open16:25
clarkb#status log Added stephenfin and ssbarnea to git-review-core in Gerrit. Both have agreed to focus on bug fixes, stability, and improved testing. Or as corvus put it "to be really clear about that, i think any change which requires us to alter our contributor docs should have a nearly impossible hill to climb for acceptance".16:28
openstackstatusclarkb: finished logging16:28
*** dpawlik has joined #openstack-infra16:28
clarkbstephenfin: ssbarnea ^ fyi and thank you!16:28
fungiyes, huge thanks!!!16:28
openstackgerritMerged openstack-infra/project-config master: Add StarlingX core groups  https://review.openstack.org/61517416:30
stephenfinclarkb: Spot on. Cheers :)16:31
stephenfincan't promise I'll do anything until December (summit and PTO) but after that, fix ALL the bugs16:31
corvusi think doing nothing and fixing bugs are both great things to do16:32
*** dpawlik has quit IRC16:32
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: Proposed spec: tenant-scoped admin web API  https://review.openstack.org/56232116:34
*** e0ne_ has quit IRC16:36
*** kopecmartin is now known as kopecmartin|off16:36
ildikovclarkb: how long till the new groups appear in Gerrit?16:39
clarkbildikov: puppet is running every ~45 minutes right now so roughly about that long16:39
ildikovclarkb: cool, tnx16:40
odyssey4mewould it be possible for me to gain access to a held node which is running a test? I'm not able to replicate an issue outside of CI which is causing https://review.openstack.org/#/c/615258/ to fail for the one set of tests on suse/centos16:40
clarkbodyssey4me: have a preference for which job we hold the instance for?16:42
odyssey4meclarkb I guess openstack-ansible-deploy-aio_metal-opensuse-423 given that's what I tried to replicate locally.16:42
clarkbodyssey4me: against openstack-ansible project?16:43
odyssey4meclarkb yeah, ideally a test running against that patch16:43
odyssey4me(it resolves some issues I picked up in local testing)16:43
fungiodyssey4me: thanks for the heads up on the replication backlog. gerrit caught up a little while ago and https://github.com/openstack/openstack-ansible-os_cinder/commit/02fa53d9de7c984e84710520f966be56e12e988c seems to have replicated now16:44
clarkbodyssey4me: next failure of that job on change 615258 should be held16:44
odyssey4mefungi ah, great thanks - while we all know it's a mirror and not necessarily in sync, sometimes people still use it16:44
odyssey4meclarkb so I just run a recheck and it'll hold that node?16:45
clarkbodyssey4me: yup it should16:45
odyssey4mehow do I then actually access it?16:45
clarkbodyssey4me: an infra-root will need to add your ssh key to the held instance16:45
odyssey4meclarkb ok, thanks - I'll wait for the job to run again, then ping for the key to get added - is there a time limit on the hold? I ask because it's pretty much the end of my day and I'd prefer to do the investigation in the morning16:47
clarkbodyssey4me: I did not set a time limit16:47
odyssey4meok thanks, so I just need to ping when I'm done again16:47
fungiwe also usually record enough breadcrumbs to remember who to ask to make sure they're done before we delete it16:48
*** openstackgerrit has quit IRC16:48
clarkbyes "odyssey4me debugging failures" says the comment16:48
odyssey4megreat, tyvm!16:49
*** sshnaidm|rover is now known as sshnaidm|afk16:53
*** ginopc has quit IRC16:55
*** ginopc has joined #openstack-infra16:58
*** trown is now known as trown|lunch17:00
*** gyee has joined #openstack-infra17:02
*** openstackgerrit has joined #openstack-infra17:03
openstackgerritJames E. Blair proposed openstack-infra/system-config master: adns: Set zone directory permissions  https://review.openstack.org/61560717:03
openstackgerritFabien Boucher proposed openstack-infra/zuul master: WIP - Pagure driver  https://review.openstack.org/60440417:05
*** shrasool has quit IRC17:07
clarkbif anyone else has thoughts on using the infra onboarding session in berlin for user onboarding please put them on https://etherpad.openstack.org/p/openstack-infra-berlin-onboarding I'm going to circulate that etherpad on the various dev mailing lists shortly17:10
ildikovclarkb: the groups for StarlingX appeared on Gerrit now. Who has rights to add people to it by default?17:15
clarkbildikov: right now only gerrit admins. I'll add you then you can add everyone else17:15
ildikovclarkb: cool, thank you17:15
clarkbildikov: done, I think I got all the groups too, but let me or infra-root know if any were missed17:17
ildikovclarkb: roger; thanks!17:18
corvusclarkb, fungi, mordred: https://review.openstack.org/615607 was the only oopsie from the opendev bootstrapping.  i fixed that manually, and manually opened the firewall, and the new servers are serving data now.  i sent jimmy an email asking him to set up glue records and dnssec (you are cc'd).  once that's set up, we should be gtg.17:18
*** rkukura has joined #openstack-infra17:19
* clarkb reviews the fix17:19
corvus(er, obviously, once the glue records are in place, i'll update the firewall config in ansible to do what i did manually)17:19
corvusalso, i manually ran the snippet in 615607, so that ansible is tested17:19
*** ginopc has quit IRC17:20
*** ykarel has quit IRC17:23
*** ginopc has joined #openstack-infra17:26
*** jpich has quit IRC17:27
ildikovclarkb: this one is missing: https://review.openstack.org/#/admin/groups/1966,members17:28
clarkbildikov: fixed17:29
ildikovtnx!17:29
*** calebb has joined #openstack-infra17:32
*** ginopc has quit IRC17:41
*** yamamoto has quit IRC17:47
*** jpena is now known as jpena|off17:48
*** rkukura has quit IRC17:49
*** dave-mccowan has quit IRC17:49
fungithanks corvus!17:50
*** ginopc has joined #openstack-infra17:54
*** e0ne has joined #openstack-infra17:57
AJaegerprometheanfire, tristanC , fungi , tonyb, could you review this ossa docs change, please? https://review.openstack.org/#/c/615498/17:58
AJaegerthanks, mrhillsman !18:00
*** derekh has quit IRC18:00
prometheanfirek18:01
ssbarneaout of curiosity, do we have an irc bot that can post rss feeds? or if you know one that can easily be setup?18:01
fungiAJaeger: thanks!!!18:01
ssbarneaor even better, a free SaaS solution that does this.18:01
prometheanfiredidn't you hear? rss is dead :(18:01
mrhillsmanAJaeger: you’re welcome, not sure why i did not get notifications, maybe i just missed them18:02
ssbarneai guess rss is like irc, waiting for the singularity.... in order to die.18:02
AJaegerthanks, fungi and prometheanfire !18:02
prometheanfireyarp18:02
fungissbarnea: statusbot posts to wiki.openstack.org and twitter... updating an rss xml blob might not be hard to do with that codebase as a starting point. the main trick is in where/how you go about publishing it18:03
AJaegerfungi, prometheanfire, keep in mind that ossa won't work with python3 - at least not without additional changes...18:03
fungiAJaeger: seems like something we need to fix. thanks for the heads up!18:03
AJaegerfungi: yes, that needs eventual fixing...18:04
fungilooks like patchset #3 failures should point us in a starting direction at least18:05
melwittdoes anyone know if there's a way to search for bug keywords limited to a particular project in storyboard? I don't see how in the docs18:05
melwittI'm trying to see if there's already a bug open for the thing I'm considering opening a bug about18:06
AJaegerfungi: just add python3 as basepython and see it fail - and then we need to update requirements as well (sphinx is < 1.3)18:06
*** luizbag has left #openstack-infra18:07
fungimelwitt: have you tried adding the project and keyword to the search field?18:08
fungiwhen i search for bindep (and pick the openstack-infra/bindep project from the drop-down) and then add alpine as a keyword i get the one bindep story (and associated task) for alpine linux support in bindep. if i remove the openstack-infra/bindep project from the search i get several stories for different projects18:10
melwittit doesn't seem to allow both to be specified, either project or text keyword?18:10
melwittok, I'll try it again18:11
fungiyou can add both. is it not letting you?18:11
fungibasically it gave me https://storyboard.openstack.org/#!/search?q=alpine&project_id=811&title=alpine as the query when i entered what i described above18:11
melwittmaybe a UI challenge but when I wrote "floating" next to the python-openstackclient box, it erased the python-openstackclient box and searched everything18:12
melwittgotcha. I'll play around with it more18:12
fungihuh, that's definitely not intentional :/18:12
fungii wonder if it gets ornery if you don't wait for the little "processing" spinny on the right side to stop and go back to a magnifying glass18:14
fungihttps://storyboard.openstack.org/#!/search?q=floating&project_id=975&title=floating is what i get for what it sounds like you're looking for18:14
melwittah, yes, that is what I wanted18:15
fungithe api is a little slow to get back to the webclient during typeahead searching, could probably stand to profile those queries and see where it's spending time18:16
melwittoh, I see what I did wrong. when I selected python-openstackclient from the drop down, I selected the "text:python-openstackclient" instead of the project version of it18:16
*** Swami has joined #openstack-infra18:17
fungii find the icons a little opaque, so can see where that would be easy to do. the hover tooltip helps but i wonder if we shouldn't spell out what's in the tooltip into the selection menu18:17
melwittso when I added "floating" it took away the text:python-openstackclient. if I select the project:python-openstackclient it works18:17
AJaegerfungi: I have ossa converted to python3! Now cleaning up...18:17
*** yamamoto has joined #openstack-infra18:18
melwittyeah, I've got it now. thank you18:18
fungimelwitt: yeah, right now search terms are exclusively anded, so having more than one for a specific category makes little sense. there's been discussion about how to switch out the search query parser for a more full-featured language18:18
melwittfungi: yeah, makes sense. I just didn't notice there were "different" python-openstackclient query types in the box (and defaults to text if you don't wait for the box selector)18:19
melwittnow I know what to do :)18:19
*** ralonsoh has quit IRC18:20
fungimelwitt: well, if we spell it out with project: and text: in front of terms in the search bar that might make it more obvious than just the icons18:25
*** rkukura has joined #openstack-infra18:27
melwittyeah, that would make it similar to how to search in gerrit or logstash.o.o18:28
*** yamamoto has quit IRC18:29
*** trown|lunch is now known as trown18:36
AJaegerprometheanfire, fungi, https://review.openstack.org/615626 ports ossa to python318:43
*** electrofelix has quit IRC18:43
prometheanfireAJaeger: very  nice18:43
fungiAJaeger: wow, thanks!!!18:44
AJaegerconfig-core, https://review.openstack.org/615501 changes remaining publish jobs to "tox -e docs" - please review.18:44
prometheanfireAJaeger: watching the review (just in case)18:44
AJaegerprometheanfire, fungi, I included the openstackdocstheme into that as well. If you want to use the "Report a bug " link of the theme, tell me where to report bugs - launchpad project or storyboard project. Those are bugs against OSSA itself...18:45
fungiprometheanfire: yeah, we can check the draft rendering once zuul links it18:45
AJaegerIt looked good locally - but yes, let's wait until the build is done. Will tell you...18:45
fungiAJaeger: it's sort of up in the air since we (in theory) use both lp and sb though in practice we haven't had a vulnerability to oversee for any post-sb-migration project yet18:46
AJaegerfungi, I did the initial ossa change for 615501...18:46
*** diablo_rojo has joined #openstack-infra18:46
AJaegerfungi: it's bugs against ossa repo itself, so typo in the descriptions etc.18:47
AJaegerfungi: so, if you click on the "bug" on the page, where should it open a report? I disabled the bug icon for ossa now since the docs did not give any place.18:47
AJaegerfungi: so, it's not a bug against nova etc - but a bug against ossa documents18:48
AJaegerbut if you have nothing for that - we can leave it disabled18:48
fungiright, we've never really had bug reports for that repo itself as we've used bug/task tracking to identify when the vmt needs to take some action on a reported vulnerability18:48
fungibut i suppose we could stick with the lp link for now since it's more in use by the vmt still18:48
openstackgerritMerged openstack-infra/system-config master: adns: Set zone directory permissions  https://review.openstack.org/61560718:49
*** noama has quit IRC18:50
clarkbinfra-root config-core https://review.openstack.org/615628 is my first draft at the project update for berlin18:57
clarkbplease review it for accuracy and also that I didn't miss anything super important18:57
AJaegerclarkb: couple of minor suggestions19:05
clarkbAJaeger: thanks!19:07
*** xek_ has joined #openstack-infra19:09
*** xek has quit IRC19:12
clarkbAJaeger: fixes pushed19:13
*** rockyg has joined #openstack-infra19:14
AJaegerclarkb: I just had one more suggestion - did you include that one?19:14
clarkbAJaeger: I did not, will do19:14
AJaegerclarkb: your push and my addition crossed ;)19:15
AJaegerclarkb: otherwise LGTM19:15
openstackgerritMerged openstack-infra/zuul master: Small script to scrape Zuul job node usage  https://review.openstack.org/61367419:16
AJaegerfungi, prometheanfire, http://logs.openstack.org/26/615626/1/check/openstack-tox-docs/5dae997/html/19:17
AJaegerfungi, prometheanfire, I pushed a small update for ossa - now all ready to review19:19
*** tpsilva has quit IRC19:22
fungithanks again, AJaeger!19:23
*** dtantsur is now known as dtantsur|afk19:24
*** zul has quit IRC19:24
AJaegeryou're welcome, fungi. Note that I also pushed https://review.openstack.org/615629 to remove anchor - it's retired.19:25
AJaegerfungi, if you have some review time, I would appreciate review of https://review.openstack.org/615501 to move docs publishing for more sites to "tox -e docs", please19:26
fungiAJaeger: https://review.openstack.org/615501 is complaining about a configuration error which seems to have crept in over the past 10 hours19:30
*** timothyb89 has quit IRC19:31
*** roman_g has quit IRC19:36
AJaegerfungi: it is an error in the change - I wonder why Zuul did not report it initially ;(19:36
AJaegercorvus: any ideas? ^19:37
* AJaeger will update19:37
openstackgerritColleen Murphy proposed openstack-infra/puppet-pip master: [debug] Fix openstack_pip provider for pip 18  https://review.openstack.org/60602119:39
AJaegerfungi, corvus, those two errors Zuul reports on 615501 were already wrong in the initial submission. Why did Zuul not complain initially in the check pipeline - but complains now during gate?19:40
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Update static.o.o publishing  https://review.openstack.org/61550119:41
AJaegerfungi, updated to fix failure - thanks ^19:41
AJaegerneeds another change...19:42
fungiAJaeger: yeah, i find that strange if it wasn't a regression between when check tests ran and when i approved19:45
AJaegerfungi, corvus, I guess I know why it did not test before - because of a change to a trusted project with depends-on19:45
fungiohhh19:45
fungiand the dependency hadn't merged yet?19:46
fungiand now it has19:46
AJaegerit has merged now - but wasn't merged earlier19:46
fungiyeah, if we'd rechecked it likely should have reported once that merged19:46
AJaegerso, now I got directly a -119:46
AJaegerfungi: yep19:46
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Update static.o.o publishing  https://review.openstack.org/61550119:49
*** shardy has quit IRC19:52
*** shardy has joined #openstack-infra19:52
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Remove publish-static  https://review.openstack.org/61563719:55
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Remove publish-static  https://review.openstack.org/61563719:59
fungiAJaeger: do you have any suggestions for how we can deal with the sidebar on http://logs.openstack.org/26/615626/2/check/openstack-tox-docs/bfba776/html/ under openstackdocstheme? seems it ends up with all the security advisories linked there before the toc20:03
AJaegerfungi, I think I can disable it - let me check...20:04
fungithe previous theme didn't include that section in the sidebar20:05
fungibut i'll be honest, i'm not sure what causes that "openstack-security-advisories" section to appear there either20:07
AJaegerfungi, it's the title of the page...20:07
*** boden has quit IRC20:08
*** kjackal has quit IRC20:08
*** kjackal has joined #openstack-infra20:09
AJaegerfungi: updated - the toc is gone20:12
AJaegerclarkb, fungi, could you review  https://review.openstack.org/615501 again, please? This now passes20:13
AJaegerI also needed to push https://review.openstack.org/615637 as followup20:14
*** maciejjozefczyk has joined #openstack-infra20:14
fungiAJaeger: the toc wasn't itself a problem, it's just that under the previous theme the toc didn't include all the advisory documents: https://security.openstack.org/20:15
fungiso i was wondering if it was possible to get openstackdocstheme to behave the same way20:17
AJaegerfungi, that toc is there as well - we had both a global and a local toc - and the local one got confused by the generated stuff. The version I just pushed should be what you expect20:18
fungioh! thanks, so it's the global toc which was the issue?20:18
fungibut the local toc will still be included?20:18
AJaegerfungi, check http://logs.openstack.org/26/615626/3/check/openstack-tox-docs/0981cf0/html/20:18
dhellmannis it possible to configure a release or tag pipeline job on one repo so that it only runs if the tag matches a pattern? I see options for setting branches on jobs but not tags.20:19
fungiAJaeger: oh, yep perfect--thanks again!20:19
AJaegeryes, exactly20:19
dhellmannit looks like the tag pattern applies to the pipeline?20:19
fungidhellmann: we have a pattern which determines which pipeline a given tag ref is to be enqueued into, but that's at the changeish level not the job level. i wonder whether branch matcher expressions will work on a tag, but don't know the answer20:20
*** maciejjozefczyk has quit IRC20:21
*** rlandy is now known as rlandy|brb20:21
dhellmannfungi : ok. I realized this weekend that we can't use the same job to release stable versions of heat that we will for master because the dist name doesn't match what we own on pypi20:21
dhellmannso I need a way to run different release jobs on older stable branches of heat20:21
dhellmannunless we decide to backport the dist name change, but that seems like it would be against our stable policy20:22
fungidhellmann: if i (or someone) gets a chance to write up documentation and release notes for 578557 it would also solve that challenge i think?20:22
dhellmannyeah, that would also help20:22
clarkbinfra-root I'm going to reboot the mirror in packethost. John thought that they had resovled many of the problems in packethost. I think we check if the mriror is still running tomorrow (give it ~24 hours) then if it is running set max-servers to say 10 and go from there20:23
*** jtomasek has quit IRC20:23
fungiwhen i asked in #zuul last week, tobiash said he's already been running with that feature on bmw's zuul20:23
dhellmannfungi : is that patch just missing a release note?20:23
fungidhellmann: release note and documentation of the behavior, yes20:23
dhellmannok20:23
clarkbactually no because the mirror doesn't seem to exist at all anymore20:24
clarkbinteresting20:24
clarkbI guess we'll have to build a new mirror there20:24
fungiclarkb: oof, accidentally deleted i guess?20:24
fungi(or the whole deployment was wiped maybe)20:24
clarkbfungi: well they had talked about database improvements. I wonder if that included starting with a fresh db20:25
dmsimardclarkb: I'm not sure if it's related but fwiw packet.com is down20:25
openstackgerritMerged openstack-infra/project-config master: Update static.o.o publishing  https://review.openstack.org/61550120:25
clarkbdmsimard: shouldn't be, the openstack control plane is independent of the packet  host dashboard. We just run on their baremetal instances (and I can talk to the api its just listing no hosts)20:26
dmsimardack20:26
clarkbhttp://logs.openstack.org/33/615633/1/check/nova-live-migration/c3ac608/job-output.txt#_2018-11-05_19_54_44_188300 adds confusion to the networking issue in rax-ord20:30
clarkblooks like we get the wrong host key for one git push then its fine on a subsequent one20:31
clarkbwhich probably does lend some weight to the idea that it's two hosts fighting over the same IP20:31
clarkbanyone know if we can have ansible log additional ssh remote data?20:36
clarkbthings like sshd version, the anticipated host key vs the one received, ip address (just to double check), etc?20:37
clarkboh neat, fact gathering has some of the expected data20:37
*** e0ne has quit IRC20:45
clarkbok I've confirmed that the host key as reported by facts is different after sha256 fingerprinting it with ssh-keygen. I've also checked the ip address matches up with the one in our inventory20:47
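Roughly what that check looks like: the ed25519 key the node presented is in the gathered facts (ansible_ssh_host_key_ed25519_public), so it can be fingerprinted the same way the ssh error message does and compared with the recorded entry. The fact name and file paths here are assumptions for illustration:

```bash
# Fingerprint the key the node presented (value taken from the facts).
echo "ssh-ed25519 AAAA...value-from-facts..." > /tmp/presented.pub
ssh-keygen -l -E sha256 -f /tmp/presented.pub

# Compare against the key we recorded for that IP at launch time.
ssh-keygen -F 104.130.222.138 -f ~/.ssh/known_hosts
```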
fungiwas this the same ip address as the previous problem report, by any chance?20:48
clarkbthis particular IP address shows up in ~12 failures due to this20:48
clarkbfungi: ya I think so. 104.130.222.13820:48
clarkbso ya I wonder if rax lost track of that particular IP in ord20:49
*** rlandy|brb is now known as rlandy20:49
* clarkb double checks against hogepodge's instance20:49
clarkbhogepodge's ip addr is different20:49
fungiahh20:49
clarkb104.130.216.8520:49
clarkbcould be a small number of leaked/lost IPs?20:49
fungitrying to communicate this to fanatical support without the assistance of cloudnull is likely to be challenging20:50
clarkbya...20:50
cloudnull^ present20:50
mordredyay it's cloudnull!20:50
cloudnullo/20:50
fungiand he appears in a puff of awesome20:50
clarkbcloudnull: oh hey. So we think that maybe there are duplicate IPs in rax (ord in particular but haven't checked other regions)20:50
* cloudnull first day back from much required holiday 20:50
cloudnulloh thats all bad20:51
mordredcloudnull: welcome back!20:51
cloudnullthanks!20:51
clarkbcloudnull: http://logs.openstack.org/33/615633/1/check/nova-live-migration/c3ac608/job-output.txt#_2018-11-05_19_54_44_188300 is the symptom (notice that we push to secondary before and after that error successfully)20:51
fungiyeah, we've seen other cloud providers "lose track" of virtual machines from time to time and exhibit this exact behavior20:51
clarkbcloudnull: and if I look up the error message with that IP address I find ~12 cases where this particular IP address has exhibited this in the last week20:51
fungiand then you end up with arp overwrites in the routers giving you a toss-up as to whether you end up communicating with your vm or the ghost20:52
cloudnulldo we have a rax ticket to go wave around ?20:52
clarkbcloudnull: not yet, I've only really just sat down to dig into what data we do have20:52
clarkbI can open one if that will help20:52
cloudnullok.20:52
*** gfidente is now known as gfidente|afk20:52
cloudnullif you have a moment. i will ping some internal folks while im here20:53
clarkbya I'll be around. Just tell me what I should do next :)20:53
clarkbor do you mean file the ticket if I have a moment?20:53
* clarkb digs up uuid for this instance20:53
clarkbde6e6777-f4bf-4fb6-a6ee-ffc1cc1ee2cb is the instance uuid for that particular case20:54
fungiit's one of those sorts of issues where if we just go through normal ticket reporting it's going to take first tier support forever to determine that we're not crazy and escalate it to someone who can check whether there are lost instances squatting those ip addresses20:54
cloudnullclarkb yea, if you could file a ticket it'd be great just so I can go wave it around at people to make them fix it faster.20:54
clarkbcloudnull: ok I'll work on that now20:54
* cloudnull is already causing a ruckuss in their chat channels 20:55
fungiat least in the previous cases we've seen, the nova api is just going to confirm to the operator that there's nothing there20:55
fungiend up needing to track it through the network gear to a particular host and then use virsh or something to find the running vm20:56
cloudnullSo "104.130.222.138" is the troublesome IP20:58
clarkbcloudnull: ya, one of them at least. Sorry, working to file the ticket, but logstash may have more data for us20:58
clarkbcloudnull: 104.130.216.85 is one that hogepodge identified yesterday20:58
hogepodgeThank you cloudnull and clarkb !20:59
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Merger: automatically add new hosts to the known_hosts file  https://review.openstack.org/60845321:03
clarkbcloudnull: #181105-ord-000088921:04
ianwfungi: did you see my comments on the insecure: true stuff?21:07
*** shrasool has joined #openstack-infra21:08
ianwfungi: btw that had me so confused i had to git log & code read openstacksdk :) https://review.openstack.org/#/c/615512/21:08
clarkbcloudnull: `message:"ED25519 host key for 104.130.222.138 has changed and you have requested strict checking." AND (tags:"console.html" OR tags:"job-output.txt")` is a logstash query (http://logstash.openstack.org) that will show you instances for that particular IP address21:08
clarkbcloudnull: you can search back 10 days currently21:09
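
The same query can also be run against an Elasticsearch backend directly with the standard query_string DSL; a rough sketch follows. The endpoint URL and index pattern are placeholders rather than real service addresses, and in practice the frontend at http://logstash.openstack.org is the supported interface.

    import requests

    query = {
        "query": {
            "query_string": {
                "query": (
                    'message:"ED25519 host key for 104.130.222.138 has changed '
                    'and you have requested strict checking." '
                    'AND (tags:"console.html" OR tags:"job-output.txt")'
                )
            }
        },
        "size": 50,
    }

    resp = requests.post(
        "http://elasticsearch.example.org:9200/logstash-*/_search",  # placeholder endpoint
        json=query,
    )
    hits = resp.json()["hits"]["hits"]
    print(len(hits), "matching log lines")
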
openstackgerritColleen Murphy proposed openstack-infra/puppet-pip master: [debug] Fix openstack_pip provider for pip 18  https://review.openstack.org/60602121:12
fungiianw: yep, did you see my reply comment on the change?21:12
*** jamesmcarthur has quit IRC21:13
*** maciejjozefczyk has joined #openstack-infra21:16
clarkbcloudnull: just let us know if there is any other info we should dig up. I'll be around all day (though I need lunch right this moment)21:16
*** eernst has joined #openstack-infra21:16
cloudnullfrom the folks in pub cloud "so there is totally a rogue VM for that ip" -- "i'm cleaning it up now"21:17
fungiheh21:17
fungiwe suspect there are other affected addresses as well21:17
ianwfungi: ahhh, sorry yes now reloaded :)  great, i thought it would be something like that but ran out of time to look, thanks21:18
clarkbya we'd need to go digging through logstash to find them though21:18
*** eernst_ has joined #openstack-infra21:18
clarkbfungi: any chance you are in a spot to do that now using some variant of my query above?21:18
fungii have a few minutes to try, yes. i'll see if i come up with anything21:18
fungii suppose i should limit it to the last 24 hours in case some have already been cleaned up21:20
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Turn on the future parser for eavesdrop.o.o  https://review.openstack.org/59004821:21
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Turn on future parser for lists.katacontainers.io  https://review.openstack.org/60238021:21
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Turn on the future parser for lists.openstack.org  https://review.openstack.org/61565621:21
*** maciejjozefczyk has quit IRC21:21
*** eernst has quit IRC21:21
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Don't do live streaming in loops  https://review.openstack.org/61565721:21
fungiNOT message:"104.130.222.138" AND message:"has changed and you have requested strict checking." AND (tags:"console.html" OR tags:"job-output.txt")21:22
fungithat gets me a few hits in the past day21:22
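
Continuing the sketch above, the offending addresses in this broader query (the one that excludes the already-known IP) could be pulled out of the returned messages and counted rather than read off by hand. The sample messages below use documentation addresses as placeholders.

    import re
    from collections import Counter

    messages = [
        "ED25519 host key for 192.0.2.10 has changed and you have requested strict checking.",
        "ED25519 host key for 192.0.2.10 has changed and you have requested strict checking.",
        "ED25519 host key for 198.51.100.7 has changed and you have requested strict checking.",
    ]

    ip_re = re.compile(r"host key for (\d{1,3}(?:\.\d{1,3}){3}) has changed")
    addresses = Counter()
    for msg in messages:
        m = ip_re.search(msg)
        if m:
            addresses[m.group(1)] += 1

    for ip, count in addresses.most_common():
        print(ip, count)
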
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Turn on the future parser for wiki-dev.openstack.org  https://review.openstack.org/61565821:22
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Turn on the future parser for wiki.openstack.org  https://review.openstack.org/61565921:22
cloudnullthe rogue VM has been dealt with21:22
*** eernst_ has quit IRC21:23
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Turn on the future parser for logstash.openstack.org  https://review.openstack.org/61566021:24
fungiclarkb: cloudnull: looks like that query turns up similar collisions for 162.242.218.218 and 104.130.217.169 in rackspace21:25
fungialso 158.69.64.67 which is in ovh, not rackspace21:25
fungihogepodge: i'm guessing the fact that a 5-node job is 5x as likely to hit a contended ip address makes this show up a lot more for openstack-helm testing21:27
coreycbAJaeger: hi, can you comment on this where I've posed a question to @ajaeger? Thanks in advance. https://review.openstack.org/#/c/610708/5/goals/stein/python3-first.rst@4521:27
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Turn on the future parser for subunit workers  https://review.openstack.org/61566121:29
cloudnullfungi folks are dealing with those IPs now too21:30
openstackgerritDouglas Mendizábal proposed openstack-infra/irc-meetings master: Change Barbican meeting time for DST ending  https://review.openstack.org/61566221:30
fungicloudnull: awesome, thanks!!!21:30
fungicloudnull: i only queried back a day in case older addresses in that situation were already dealt with by the operators, but we should likely keep an eye out for more hits of a similar nature in our logs21:31
fungiclarkb: ^21:31
*** jamesmcarthur has joined #openstack-infra21:31
*** kaiokmo has quit IRC21:35
hogepodgefungi: that's why I always buy five lottery tickets. ;-)21:36
clarkbcloudnull: thank you!21:39
*** shrasool has quit IRC21:39
*** markmcclain has quit IRC21:42
*** adam_g has quit IRC21:43
*** dhellmann has quit IRC21:44
hogepodgeclarkb: cloudnull: so if it's something we're seeing in multiple public clouds, it sounds like it might be an upstream bug21:45
clarkbhogepodge: it could be21:45
*** dhellman_ has joined #openstack-infra21:49
*** dhellman_ is now known as dhellmann21:49
openstackgerritColleen Murphy proposed openstack-infra/system-config master: Turn on the future parser for elasticsearch.openstack.org  https://review.openstack.org/61566521:53
*** ansmith has quit IRC21:54
*** agopi|off has joined #openstack-infra21:58
*** e0ne has joined #openstack-infra22:02
openstackgerritMerged openstack-infra/irc-meetings master: Change Barbican meeting time for DST ending  https://review.openstack.org/61566222:04
clarkbhttp://status.openstack.org/elastic-recheck/#1384373 is the e-r bug to follow on the ip reuse thing22:05
clarkbwe should see that number fall in theory22:05
*** AJaeger_ has joined #openstack-infra22:05
clarkbcloudnull: ^ fyi if you want to follow along using our data tracking22:05
*** e0ne has quit IRC22:05
*** AJaeger has quit IRC22:07
*** gfidente|afk has quit IRC22:08
*** trown is now known as trown|outtypewww22:08
fungiexcluding those other three addresses and extending the query to 2 days gets no additional hits22:08
fungi7 days gets me some others22:09
fungi104.130.207.161 and 104.130.216.201 in rackspace22:10
fungi213.32.77.33 in ovh22:10
funginone of those were seen in the past 48 hours though22:11
fungialso 213.32.73.193 in ovh22:12
*** bobh has quit IRC22:12
fungi172.99.69.23 in rackspace22:13
fungi149.202.161.227 in ovh22:13
fungithat's all of the others for the past week22:14
fungicloudnull: so you might see if they also know about (or have already cleaned up) 104.130.207.161, 104.130.216.201 and 172.99.69.2322:14
fungiianw: can you maybe give amorin a heads up later in your day about 158.69.64.67, 213.32.77.33, 213.32.73.193 and 149.202.161.227?22:15
*** adam_g has joined #openstack-infra22:17
fungiand with that, i need to disappear for dinner. back in a while22:18
*** dhellmann_ has joined #openstack-infra22:18
*** dhellmann has quit IRC22:18
*** bobh has joined #openstack-infra22:19
clarkbfungi: thank you for digging those up22:20
*** dhellmann_ is now known as dhellmann22:20
*** rockyg has quit IRC22:23
*** bobh has quit IRC22:24
*** bobh has joined #openstack-infra22:25
ianwfungi: will do22:26
*** felipemonteiro has joined #openstack-infra22:28
*** bobh has quit IRC22:29
*** munimeha1 has quit IRC22:42
clarkbianw: http://logs.openstack.org/07/612307/6/gate/tripleo-ci-centos-7-undercloud-containers/edecfaf/job-output.txt.gz#_2018-11-02_12_57_33_229120 name resolution errors on centos-7 now too?22:42
clarkboh actually, hrm. That failed in bhs1, where the openstack apis think we have ipv6 but the instances don't know about it (not in metadata or config drive). I wonder if we misconfigure unbound there?22:43
clarkbnope, the ara report shows we don't assume the host has ipv6 there and we only set ipv4 resolvers22:44
ianwhmmm22:46
*** jcoufal has quit IRC22:46
clarkbit is interesting that this seems to affect red hat distros more significantly than suse or debian or ubuntu22:47
* clarkb asks logstash if this is still true22:47
ianwa fix for that behaviour of preferring ipv6 over ipv4 even when there is no ipv6 got pushed into all fedora unbound packages22:47
ianwi was thinking of reverting our fix22:47
clarkbin the case of ovh there is ipv6 but it must be statically configured from data retrieved through the nova/neutron apis. You can't see it from the instance metadata directly22:48
clarkbso those are ipv4 only clouds currently with glean22:48
clarkband the failure happens after we should've written an ipv4 only config for unbound22:49
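
For reference, an IPv4-only unbound forwarding config of the kind described above has roughly the following shape. The OpenDNS forwarder addresses and output path are illustrative assumptions; the real file is written by the job's unbound setup, not by this snippet.

    # Render an unbound forward-zone stanza that only lists IPv4 resolvers.
    IPV4_RESOLVERS = ["208.67.222.222", "208.67.220.220"]  # OpenDNS (assumed forwarders)

    def render_forward_zone(resolvers):
        lines = ["forward-zone:", '  name: "."']
        lines += ["  forward-addr: %s" % addr for addr in resolvers]
        return "\n".join(lines) + "\n"

    print(render_forward_zone(IPV4_RESOLVERS))
    # A real deployment would write this to something like
    # /etc/unbound/forwarding.conf and reload unbound (path assumed).
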
clarkbok, it's not centos-only, says logstash22:49
clarkbxenial and bionic show up a bit too22:49
clarkbit happens in gra1 the majority of the time, with bhs1 also accounting for a significant portion (then a long tail)22:50
clarkbmaybe we are having problems getting to opendns from ovh?22:50
clarkbinfra-root ^ what do we think about replacing opendns with cloudflare dns (1.1.1.1)22:51
ianwyeah, i mean it could also be transient ... maybe we should put a pause and loop in there?22:51
clarkbianw: ya we could also try a few times to see22:51
ianwall things being equal, maybe we should start with that and keep the resolvers fixed for now, and if we still see timeouts after even a couple of loops, well, there are bigger issues?22:52
clarkbsounds good. I mention cloudflare because they have massive distribution and scale (so, like google, they shouldn't have many outages)22:53
clarkbwhereas opendns has been acquired and who knows anymore22:53
ianwok, i'll look at adding a loop as a first thing22:53
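
A sketch of that "pause and loop" idea in Python: retry resolution of a couple of names a few times before giving up on the node's DNS. The names, attempt count, and delay are placeholders rather than whatever values the eventual role change settles on.

    import socket
    import time

    NAMES = ["git.openstack.org", "github.com"]  # illustrative targets

    def wait_for_dns(names, attempts=5, delay=5):
        """Raise socket.gaierror only if a name still fails after all attempts."""
        for name in names:
            for attempt in range(1, attempts + 1):
                try:
                    socket.getaddrinfo(name, 443)
                    break
                except socket.gaierror:
                    if attempt == attempts:
                        raise
                    time.sleep(delay)

    wait_for_dns(NAMES)

As noted a little further down, a retry can succeed simply because unbound round-robins the next request to the other forwarder, so a pass here does not rule out one bad upstream.
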
ianwit was one year ago today that we were watching the horse race at the sydney summit ... time flies22:55
*** agopi|off is now known as agopi22:55
clarkbI think my horse came in dead last22:55
clarkbI know how to pick them22:56
clarkbianw: iirc unbound won't retry a failed lookup against a different forwarder, but it will round robin the next request22:58
clarkbianw: we should be careful that retries don't just succeed on the second attempt because the other dns server was used, only for lookups to fail later in the jobs. (that would be a good indication we should change providers though)22:58
clarkbhttps://system.opendns.com/ indicates opendns should've been fine though23:00
ianwhrm, true; if we could grab some of the unbound log file it would be good too23:00
ianwdmsimard: this feels like a case for attachments or artifacts or something like that, which i think ara can display?23:02
*** florianf has quit IRC23:05
*** felipemonteiro has quit IRC23:07
*** kjackal has quit IRC23:08
*** lbragstad has quit IRC23:09
*** lbragstad has joined #openstack-infra23:10
*** sthussey has quit IRC23:11
clarkbianw: the upside is that those jobs are failing in pre so will be retried23:20
clarkbprior to zuulv3 I expect many of those nodes would've been recycled by nodepool ready script checks23:20
*** mriedem has quit IRC23:20
*** xek__ has joined #openstack-infra23:21
*** xek_ has quit IRC23:24
clarkbianw: another idea: on top of collecting unbound logs maybe we can track the failed names and backend servers? we might learn that github.com with its 30 second ttl fails a lot more than git.o.o with its hour-long ttl (or the opposite)23:28
*** adriancz has quit IRC23:28
clarkbthere are potentially things we can do with our dns records to alleviate some of the pain there23:28
ianwhrm you mean retry with a different server?23:32
ianwi mean, a different target DNS name23:33
clarkbmore, that we can change ttls and potentially dns hosting23:33
clarkbif we find our dns is particularly unhappy for some reason23:33
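
A rough sketch of the "track the failed names" idea from above: tally resolution failures in a collected unbound log by queried name. The log path and the SERVFAIL line format are assumptions; what unbound actually logs depends on its verbosity and logging settings.

    import re
    from collections import Counter

    line_re = re.compile(r"SERVFAIL.*?<(?P<name>[^ >]+)", re.IGNORECASE)
    failures = Counter()

    with open("unbound.log") as f:  # hypothetical collected log file
        for line in f:
            m = line_re.search(line)
            if m:
                failures[m.group("name").rstrip(".")] += 1

    for name, count in failures.most_common(10):
        print(count, name)
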
clarkbinfra-root: re packethost, there appear to be a bunch of floating IPs in use (and possibly leaked). I think john's test nodepool may have done that? I am unsure, but that is making me think we don't want to boot a new mirror just yet23:36
clarkblooks like most of the /25 is allocated23:36
clarkbbut not attached23:36
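
A sketch of how the allocated-but-unattached addresses could be enumerated with openstacksdk: list the project's floating IPs and report the ones that are not attached to anything. The clouds.yaml entry name "packethost" is an assumption, as is the shade-style normalized data model in which each floating IP exposes an attached flag.

    import openstack

    conn = openstack.connect(cloud="packethost")  # hypothetical clouds.yaml entry name

    for fip in conn.list_floating_ips():
        if not fip.attached:  # allocated to the project but not attached to a port
            print(fip.floating_ip_address, fip.id)
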
*** jamesmcarthur has quit IRC23:40
*** jamesmcarthur has joined #openstack-infra23:41
*** jamesmcarthur has quit IRC23:43
*** kgiusti has left #openstack-infra23:43
