Tuesday, 2018-10-23

*** fuentess has quit IRC00:00
clarkbmordred: last thing to check is that sdk is new enough on bridge and nodepool hosts00:00
mordredclarkb: what fo?00:00
mordredfor?00:00
clarkbmordred: the addition of the string interpolation?00:00
clarkbso that oscc loads it up properly? that is part of sdk now right?00:01
mordredoh golly - string interp is *super* old00:01
clarkboh huh00:01
tonybianw: Oh rats :( but also \o/00:01
mordredconoha has been doing that with their regions since the tokyo summit00:01
mordredclarkb: the only issue with it was that I accepted a patch using % style when .format() is what we use00:01
*** sthussey has quit IRC00:04
*** rcernin has joined #openstack-infra00:11
*** jamesmcarthur has joined #openstack-infra00:11
*** rcernin_ has quit IRC00:12
*** rcernin has quit IRC00:13
*** rcernin has joined #openstack-infra00:14
*** jamesmcarthur has quit IRC00:16
openstackgerritMerged openstack-infra/system-config master: Fix zk cluster members listing  https://review.openstack.org/61253500:19
*** tobiash has quit IRC00:21
*** ssbarnea has joined #openstack-infra00:22
*** tobiash has joined #openstack-infra00:23
*** jamesmcarthur has joined #openstack-infra00:30
*** longkb has joined #openstack-infra00:37
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Consume rate limiting task manager from openstacksdk  https://review.openstack.org/61216900:46
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Remove task manager  https://review.openstack.org/61217000:46
clarkbok I've removed zk* from the emergency file. Will check that we still have a proper cluster after puppet runs then call it a day00:47
*** gouthamr has joined #openstack-infra00:47
clarkbThinking about doing the switchover: maybe put builders on new zk late wednesday my time, then move zuul and the launchers on friday my time sometime?00:48
clarkbI've got a dentist appointment wednesday but otherwise I can be quite flexible this week I think00:48
dmsimardI love that any job I'm troubleshooting has ara enabled.. no matter the project00:50
dmsimard:D00:50
*** dmellado has joined #openstack-infra00:51
dmsimardeven when developing ara, ara helps me troubleshoot the ara integration jobs00:53
dmsimardbut then I need to generate a nested ara report, it gets confusing00:53
openstackgerritDavid Moreau Simard proposed openstack-infra/system-config master: Add support for enabling the ARA callback plugin in install-ansible  https://review.openstack.org/61122800:56
openstackgerritDavid Moreau Simard proposed openstack-infra/system-config master: Add playbook for deploying the ARA web application  https://review.openstack.org/61123200:57
corvusclarkb: i'm around all this week (on east-coast time, believe it or not since i'm talking to you now) except friday i'm afk00:58
mordredcorvus: wow. I *knew* you were east coast, but have also been calculating west coast timezone when deciding whether or not you're likely to respond00:58
clarkbcorvus: in that case maybe we switch builders tomorrow then do the launchers and zuul on thursday?01:00
clarkbmostly I want ~36 hours for image builds to happen01:00
*** stevebaker has joined #openstack-infra01:06
*** xinliang has joined #openstack-infra01:10
Shrewscorvus: what's happening on the east coast?01:17
dmsimardit's cold01:17
clarkbATO?01:19
*** smarcet has joined #openstack-infra01:20
Shrewsclarkb: if you want to cutover builders tomorrow, i can help monitor then01:21
clarkbShrews: does after the infra meeting work for you? 1pm pacific or 4pm eastern?01:21
*** mrsoul has quit IRC01:21
Shrewsclarkb: wfm01:21
Shrewswe should know pretty quickly if there are issues01:22
clarkbcool. Puppet ran on new cluster and it is still a cluster01:22
clarkbso I think we are ready to switch the builders over whenever. I'm going to call it a day now and see you all tomorrow01:22
*** imacdonn has quit IRC01:23
*** imacdonn has joined #openstack-infra01:23
*** markvoelker has joined #openstack-infra01:25
*** rlandy has quit IRC01:27
*** hongbin has joined #openstack-infra01:27
*** carl_cai has quit IRC01:35
*** hongbin has quit IRC01:36
*** hongbin has joined #openstack-infra01:37
*** jamesmcarthur has quit IRC01:41
*** hongbin_ has joined #openstack-infra01:41
*** hongbin has quit IRC01:43
*** bhavikdbavishi has joined #openstack-infra01:43
*** bhavikdbavishi has quit IRC01:50
*** jamesmcarthur has joined #openstack-infra02:00
*** hongbin has joined #openstack-infra02:05
*** jamesmcarthur has quit IRC02:05
*** hongbin_ has quit IRC02:07
*** jamesmcarthur has joined #openstack-infra02:18
*** bhavikdbavishi has joined #openstack-infra02:23
*** jamesmcarthur has quit IRC02:34
*** graphene has joined #openstack-infra02:38
*** jamesmcarthur has joined #openstack-infra02:39
*** bhavikdbavishi has quit IRC02:40
openstackgerritsebastian marcet proposed openstack-infra/openstackid master: Migration to PHP 7.x  https://review.openstack.org/61193602:51
*** psachin has joined #openstack-infra02:56
*** dave-mccowan has quit IRC02:57
*** jamesmcarthur has quit IRC02:59
*** munimeha1 has quit IRC03:01
*** smarcet has quit IRC03:09
*** ramishra has joined #openstack-infra03:12
*** graphene has quit IRC03:23
*** graphene has joined #openstack-infra03:25
*** graphene has quit IRC03:25
*** graphene has joined #openstack-infra03:27
*** graphene has quit IRC03:33
*** graphene has joined #openstack-infra03:35
*** bhavikdbavishi has joined #openstack-infra03:35
*** hongbin has quit IRC03:52
*** graphene has quit IRC03:53
*** graphene has joined #openstack-infra03:54
*** udesale has joined #openstack-infra03:58
*** mrhillsman is now known as openlab04:05
*** openlab is now known as mrhillsman04:06
*** smarcet has joined #openstack-infra04:09
*** roman_g has quit IRC04:10
*** yamamoto has quit IRC04:17
*** yamamoto has joined #openstack-infra04:17
*** jamesmcarthur has joined #openstack-infra04:18
*** jamesmcarthur has quit IRC04:22
openstackgerritzhulingjie proposed openstack/ansible-role-cloud-launcher master: use include_tasks instead of include  https://review.openstack.org/61257004:40
*** smarcet has quit IRC04:45
*** janki has joined #openstack-infra04:59
*** quiquell|off is now known as quiquell05:40
*** kjackal has joined #openstack-infra05:40
*** spsurya has joined #openstack-infra05:48
*** maciejjozefczyk has quit IRC06:09
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: New Repo: OpenStack-Helm Docs  https://review.openstack.org/61189306:15
AJaegerconfig-core, a couple of new repos reviews are up: https://review.openstack.org/612419 https://review.openstack.org/602783 https://review.openstack.org/609531 https://review.openstack.org/611892 https://review.openstack.org/#/c/611893/06:18
*** ccamacho has joined #openstack-infra06:31
*** kjackal has quit IRC06:35
*** ramishra_ has joined #openstack-infra06:40
*** ramishra has quit IRC06:43
*** slaweq has joined #openstack-infra06:43
*** bhavikdbavishi1 has joined #openstack-infra06:53
*** bhavikdbavishi has quit IRC06:55
*** bhavikdbavishi1 is now known as bhavikdbavishi06:55
*** pcaruana has joined #openstack-infra06:56
*** jaosorior has quit IRC07:02
*** graphene has quit IRC07:04
*** armax has quit IRC07:04
*** felipemonteiro has joined #openstack-infra07:04
*** jaosorior has joined #openstack-infra07:05
*** graphene has joined #openstack-infra07:05
*** maciejjozefczyk has joined #openstack-infra07:05
amorinhey guys07:06
amorindo you know if those mountains on graphs07:06
*** rcernin has quit IRC07:06
amorinhttp://grafana.openstack.org/d/BhcSH5Iiz/nodepool-ovh?orgId=1&var-region=ovh-bhs1&var-region=ovh-gra1&from=1540230724666&to=154026214284807:06
amorinfor gra107:06
amorinis it normal behavior?07:07
*** graphene has quit IRC07:07
*** rpittau has quit IRC07:07
*** rpittau has joined #openstack-infra07:07
*** shardy has joined #openstack-infra07:19
*** felipemonteiro has quit IRC07:22
*** gfidente has joined #openstack-infra07:24
cgoncalvesianw, hey. we're seeing another issue with centos DIB: http://logs.openstack.org/79/604479/13/check/octavia-v2-dsvm-scenario-centos-7/615561f/controller/logs/devstacklog.txt.gz#_2018-10-23_01_16_32_21107:27
cgoncalvesianw, shouldn't it be using the epel mirror from openstack ci?07:27
cgoncalveshttp://logs.openstack.org/79/604479/13/check/octavia-v2-dsvm-scenario-centos-7/615561f/controller/logs/devstacklog.txt.gz#_2018-10-23_01_13_20_24107:28
ianwamorin: hey ... it is not normal :)07:33
*** ramishra_ is now known as ramishra07:33
ianwamorin: it seems launches on gra1 are basically always failing, i discussed with clarkb this morning turning it off07:34
*** jamesmcarthur has joined #openstack-infra07:34
ianwamorin: i can jump in to get some id's if it will help07:34
ianwcgoncalves: hrmm, let me see07:35
*** aojea has joined #openstack-infra07:37
ianwcgoncalves: can you make dib run by default with "-x" in this test?  also, we have a --logfile argument now, which might be useful to save the output into a separate log07:37
*** jamesmcarthur has quit IRC07:38
*** sshnaidm|afk is now known as sshnaidm|pto07:41
cgoncalvesianw, oh, that would be super useful for debugging purposes indeed!07:52
ianwcgoncalves: ok, as step 1 i'll try out -> https://review.openstack.org/612622 :)07:52
cgoncalvesianw, do we have the centos cloud image mirrored somewhere?07:52
ianwcgoncalves: no, we don't mirror those07:53
ianwit might be something for our reverse proxy07:53
*** lastmikoi has joined #openstack-infra07:54
*** xek has joined #openstack-infra07:55
quiquellGood morning, we are using the fedora28 nodeset and it looks like dnf.conf excludes the python virtualenv package07:56
quiquellhttp://logs.openstack.org/90/612290/8/check/tripleo-ci-fedora-28-standalone/cd876e8/logs/undercloud/etc/dnf/dnf.conf.txt.gz07:56
ianwquiquell: yes, this is by design07:56
quiquellianw: Do you know if this is the default or maybe we are changing something07:56
quiquellianw: I have tested it also with Fedora-Cloud-Base-28-1.1.x86_64 and it is different07:57
quiquellianw: This is fedora's design or openstack infra stuff ?07:57
*** tosky has joined #openstack-infra07:57
cgoncalvesianw, that patch was freaky quick! thanks!07:58
quiquellianw: centos was working fine07:58
ianwquiquell: no, this is an infra thing; you can read all about it @ https://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/pip-and-virtualenv/install.d/pip-and-virtualenv-source-install/04-install-pip#n4707:58
quiquellianw: the exclusion is not needed there ?07:58
ianwquiquell: umm, i would have said it was excluded there too, off the top of my head07:58
ianwquiquell: the problem, iirc, was that if a new setuptools/pip package appears, but it is still less than the upstream versions we've installed, it could get itself into a big mess08:00
quiquellianw: At centos we install the python-virtualenv package http://logs.openstack.org/90/612290/6/check/tripleo-ci-centos-7-standalone/4e58975/job-output.txt.gz#_2018-10-22_13_59_07_72700908:00
*** xek_ has joined #openstack-infra08:01
ianwquiquell: is it possible that yum v dnf just ignores the held package?08:01
quiquellianw: Don't know I am just approaching this08:03
*** xek has quit IRC08:03
ianwquiquell: given that more recent fedora's i think do a much better job at separating user-installed v packaged tools, the whole thing might be able to be reworked08:04
ianwi'd volunteer to review that, but i don't know about work on it :)  i'm not making any excuses that that pip-and-virtualenv element is pretty messy08:04
ianwit's just grown around practicalities, but things do tend to change08:05
quiquellianw: Nah don't worry is totally ok, we are just hacking around to have our new fedora28 job working08:05
quiquellianw: So what's the correct way to use virtualenv at fedora28 from nodesets ?08:06
ianwquiquell: i would say "virtualenv -p python3"08:08
quiquellianw: so pip and virtualenv for python3 is already installed ?08:09
*** jpich has joined #openstack-infra08:10
* quiquell has been too lazy to check it08:10
ianwquiquell: yes, that's what all the fussing is in about in pip-and-virtualenv package :)08:11
quiquellianw: Yep, ok thanks08:12
quiquellianw: So we are supposed to have it at centos too ?08:12
ianwcgoncalves: hrm so it really seems like it runs h05-rpm-epel-release http://logs.openstack.org/79/604479/13/check/octavia-v2-dsvm-scenario-centos-7/615561f/controller/logs/devstacklog.txt.gz#_2018-10-23_01_13_19_91308:12
ianwquiquell: yep, all platforms should have a working, latest version of "pip" and "virtualenv"08:13
quiquellianw: ack thanks so much08:13
ianwnp, i hope it helps more than hinders :)08:15
quiquellianw: sure it will do the job08:15
cgoncalvesianw, shouldn't it? it is a dep of pip-and-virtualenv08:16
*** shardy has quit IRC08:17
ianwcgoncalves: yeah it should ... without "-x" it doesn't show exactly what it did ... but in theory it should have re-written the epel repo to the mirror ...08:18
*** ginopc has quit IRC08:20
*** ginopc has joined #openstack-infra08:21
amorinianw: the weird thing is that it seems to be working at the beginning of hours08:21
*** eernst has joined #openstack-infra08:21
*** dtantsur|afk is now known as dtantsur08:22
amorinhttp://grafana.openstack.org/d/BhcSH5Iiz/nodepool-ovh?orgId=1&var-region=ovh-bhs1&var-region=ovh-gra1&from=1540245245361&to=154024871913208:22
amorinwhy does it stop after minute 3508:22
ianwamorin: i'm running the port cleaning script every 20 minutes, and it clears out some 600+ ports08:22
*** e0ne has joined #openstack-infra08:23
ianwi wonder if there's a background of very high frequency of failure, but a few get through when the ports are cleared back to zero?08:24
ianwhrm, not sure that makes sense, because nodepool is asynchronous to the cleaning script.  it's not like it pauses and waits for all free ports08:24
amorinnodepool is supposed to spawn all instances, right?08:25
ianwamorin: umm, yes, nothing else is creating vm's if that's what you mean08:26
amorinok08:26
amorinI mean the available line, should be near the max line08:26
amorinall the time, right?08:27
*** eernst has quit IRC08:29
*** eernst has joined #openstack-infra08:29
*** derekh has joined #openstack-infra08:29
*** electrofelix has joined #openstack-infra08:29
ianwamorin: well, it shouldn't look like that :)08:29
ianwamorin: let me see what's in logs ...08:29
amorinok08:30
ianwHttpException: 403: Client Error for url: https://compute.gra1.cloud.ovh.net/v2/dcaab5e32b234d56b626f72581e3644c/servers, {"forbidden": {"message": "The number of defined ports: 636 is over the limit: 600", "code": 403}}08:30
ianwalso occasionally08:31
ianwHttpException: 403: Client Error for url: https://compute.gra1.cloud.ovh.net/v2/dcaab5e32b234d56b626f72581e3644c/servers, {"forbidden": {"message": "Maximum number of ports exceeded", "code": 403}}08:31
amorinok08:32
amorinso the curve looks like that mostly because of port leaking08:32
*** gouthamr has quit IRC08:32
amorinthat prevents nodepool from spawning new instances08:32
ianwhrm, but every 20 minutes we run and clear out the ports08:32
ianwso the only way we get 600+ ports in 20 minutes is if the vm boot is failing *and* leaving ports behind08:33
ianwthen nodepool is just looping making non-working vm's, if that makes sense08:33
amorinyup08:33
ianwif the nodes come up and are running something, no way we go through that many in that short time08:34
amorinwe found something on GRA1 that we are currently fixing08:34
amorinI will let you know08:34
ianwwe really need to look at nodepool's logs to better correlate the openstacksdk errors with the vm's it's trying to boot in the logs ... it's all so jumbled up08:34
*** dmellado has quit IRC08:34
*** stevebaker has quit IRC08:35
ianwamorin: hrm, "port list" shows 600+ ports, i wonder if the clearing isn't working?08:37
ianwhttp://paste.openstack.org/show/732692/ is a sample of the ports it cleared on the last run08:38
ianwamorin: let me stop the region and clear all the ports and see where we are08:39
*** ifat_afek has joined #openstack-infra08:40
ianwalright, the ports are being removed now08:43
ianwport list | grep DOWN | wc -l08:43
ianw61408:43
quiquellianw: So just to be sure, the openstack-infra zuul centos nodesets already have setuptools/pip/virtualenv too ?08:44
*** priteau has joined #openstack-infra08:45
ianwquiquell: yes, it should do08:45
ianwamorin: so i think all we're seeing is "openstack.exceptions.ResourceTimeout: Timeout waiting for the server to come up" ... i.e. our side in the sdk is timing out on the boot with no response.  i can get you some id's if it helps08:46
ianw2c0a4629-b588-47ce-89a6-e4e094d7e846 might be a recent one08:48
amorinchecking08:49
*** shardy has joined #openstack-infra08:49
amorinon my side: I see some neutron timeout errors08:50
ianwc479bd47-e31c-4c36-88c9-1655bd8e3b9f maybe another one with a weird error08:52
ianwopenstack.cloud.exc.OpenStackCloudCreateException: Error creating server: c479bd47-e31c-4c36-88c9-1655bd8e3b9f08:52
ianwthat's it :/08:52
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Add epel element to centos7 testing  https://review.openstack.org/61263608:56
ianwcgoncalves: from dib's side, i thought i put this in for testing but must have forgot ... ^ let's see if dib gate shows any issues08:56
cgoncalvesianw, ack :)08:57
*** carl_cai has joined #openstack-infra08:59
*** eernst has quit IRC09:00
*** kjackal has joined #openstack-infra09:08
*** adriant has quit IRC09:09
*** adriant has joined #openstack-infra09:10
*** gouthamr has joined #openstack-infra09:11
*** adriant has quit IRC09:13
*** adriant has joined #openstack-infra09:14
*** adriant has quit IRC09:16
openstackgerritMerged openstack/diskimage-builder master: Remove redundant sources change/update  https://review.openstack.org/56373909:16
openstackgerritMerged openstack/diskimage-builder master: Add a post-root.d phase  https://review.openstack.org/61180609:16
ianwamorin: ok, something really weird is going on, i just removed 600+ ports and they're back...09:18
*** dmellado has joined #openstack-infra09:18
ianw200 or so, anyway09:19
ianwoh, maybe puppet got to it, the max-servers was set again09:20
*** jpena|off is now known as jpena09:22
*** tosky has quit IRC09:24
*** tosky has joined #openstack-infra09:24
*** ifat_afek has quit IRC09:33
*** stevebaker has joined #openstack-infra09:36
ianwamorin: ok, back to no leaked ports.  i can try starting a server or two, if you like09:42
*** jamesmcarthur has joined #openstack-infra09:43
ianw#status log nl04 in emergency with ovh-gra1 set to 0 for now09:43
openstackstatusianw: finished logging09:43
*** jamesmcarthur has quit IRC09:47
*** quiquell is now known as quiquell|brb09:50
*** udesale has quit IRC09:51
*** udesale has joined #openstack-infra09:52
*** jpich has quit IRC09:53
*** jpich has joined #openstack-infra09:54
*** dhill_ has quit IRC09:55
*** rossella_s has quit IRC09:57
*** shardy has quit IRC09:57
*** shardy has joined #openstack-infra09:58
*** ifat_afek has joined #openstack-infra10:02
*** xek_ has quit IRC10:03
*** xek has joined #openstack-infra10:07
*** eernst has joined #openstack-infra10:07
*** quiquell|brb is now known as quiquell10:08
openstackgerritMerged openstack-infra/zuul-jobs master: Remove the "emit-ara-html" role  https://review.openstack.org/61038110:14
*** ssbarnea_ has joined #openstack-infra10:16
e0nehi. could anybody please help me understand why the depends-on flag doesn't work for https://review.openstack.org/#/c/612652/?10:21
*** ianychoi has quit IRC10:22
*** ianychoi has joined #openstack-infra10:25
*** ifat_afek has quit IRC10:25
*** eernst has quit IRC10:30
*** apetrich has quit IRC10:38
*** pbourke has quit IRC10:47
*** pbourke has joined #openstack-infra10:48
*** apetrich has joined #openstack-infra10:54
priteauTopic on #openstack-meeting-alt is stuck to "Documentation (Meeting topic: trove)" when no other meeting is active, could it be reset?10:55
*** yamamoto has quit IRC11:03
*** yamamoto has joined #openstack-infra11:04
*** yamamoto has quit IRC11:08
*** udesale has quit IRC11:14
*** tosky has quit IRC11:18
*** tosky has joined #openstack-infra11:18
*** longkb has quit IRC11:19
*** yamamoto has joined #openstack-infra11:21
cmurphye0ne: it looks to me like it's working, i can see them linked in http://zuul.openstack.org/status11:27
*** panda is now known as panda|lunch11:27
*** jpena is now known as jpena|lunch11:33
e0necmurphy: thanks. there was an extra space in the commit message :(11:35
*** janki has quit IRC11:41
*** eharney has joined #openstack-infra11:45
*** markvoelker has quit IRC11:45
*** ldnunes has joined #openstack-infra11:51
*** ansmith has quit IRC11:52
*** dhill_ has joined #openstack-infra11:59
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Fix meeting ID for Cyborg  https://review.openstack.org/61267612:01
*** hwoarang has quit IRC12:04
*** hwoarang has joined #openstack-infra12:06
*** rh-jelabarre has joined #openstack-infra12:07
*** dave-mccowan has joined #openstack-infra12:07
*** janki has joined #openstack-infra12:14
*** bhavikdbavishi has quit IRC12:22
*** auristor has quit IRC12:24
*** janki has quit IRC12:25
*** auristor has joined #openstack-infra12:25
*** janki has joined #openstack-infra12:25
*** janki has quit IRC12:27
*** rlandy has joined #openstack-infra12:29
*** tobberydberg has quit IRC12:30
*** udesale has joined #openstack-infra12:31
*** adriancz has joined #openstack-infra12:32
*** janki has joined #openstack-infra12:34
*** markvoelker has joined #openstack-infra12:36
quiquellianw: I have a question regarding the openstackclient RPM, is this the place ?12:36
*** jcoufal has joined #openstack-infra12:37
*** markvoelker has quit IRC12:37
*** jchhatbar has joined #openstack-infra12:38
*** jamesdenton has joined #openstack-infra12:39
*** janki has quit IRC12:40
*** jpena|lunch is now known as jpena12:40
dtroyerquiquell: there may be someone here who would know about that, however we do not produce distro packaging at the project level directly, that is done in separate projects or downstream by distros.12:41
quiquelldtroyer: ack, thanks12:41
*** kgiusti has joined #openstack-infra12:47
*** trown|outtypewww is now known as trown12:47
fungiianw: quiquell: does `python3 -m venv` not work on fedora? then it doesn't matter what virtualenv package you've got installed since you wouldn't be using it anyway12:49
quiquellfungi: You have to use the ones already installed at fedora28 in zuul nodesets12:50
*** jchhatbar is now known as janki12:50
fungiquiquell: no, i mean, why call virtualenv at all? python3 has a venv module built-in12:51
fungias long as the distro hasn't stripped it from the python3 stdlib12:52
quiquellfungi: We are going step by step :-)12:52
quiquellfungi: Sure we will use the module at the end, but we have a long road in front of us12:53
fungipriteau: i've reset the default topic in #openstack-meeting-alt and #openstack-meeting-4 now (both were stale)12:53
priteauThank you fungi13:00
*** yamamoto has quit IRC13:02
fungithat tends to happen if the meetbot gets caught on the other side of a netsplit from chanserv in the middle of a meeting, or if it gets restarted during one (it's not stateful through restarts, which are needed to pick up configuration changes, though we try to be careful about merging those when meetings are underway)13:02
*** ansmith has joined #openstack-infra13:04
*** bnemec has joined #openstack-infra13:08
*** psachin has quit IRC13:10
openstackgerritMerged openstack-infra/system-config master: Update clouds.yaml for citycloud with new auth info  https://review.openstack.org/61253813:10
ssbarneafungi: venv is not virtualenv, and as long as we support py27 we have reasons to stick with virtualenv. later we can swap but we don't really want to use two different v-env tools at the same time.13:11
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove OpenStack-Chef meeting  https://review.openstack.org/61269113:11
ssbarneafungi: I have question related to pypi mirrors which can be broken, as seen in http://logs.openstack.org/91/610491/9/gate/tripleo-ci-centos-7-containers-multinode/f54eb7e/job-output.txt.gz#_2018-10-23_03_33_30_244926 where it fails to find "pbr".13:11
ssbarneahow do we configure mirrors on our jobs? do we use the --extra-index-url  for a fallback mirror or not really?13:12
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove Daisycloud meeting  https://review.openstack.org/61269213:13
*** quiquell is now known as quiquell|lunch13:13
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove Glare meeting  https://review.openstack.org/61269313:14
*** e0ne has quit IRC13:19
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove ironic-bfv and ironic-ui meetings  https://review.openstack.org/61269513:19
*** sthussey has joined #openstack-infra13:19
fungissbarnea: i was talking about what they're running on fedora, specifically, where it's python3-only. also the venv module has worked fine as a virtualenv stand-in for me so far. what does tripleo do with virtualenv that venv doesn't support?13:20
*** markmcd has quit IRC13:21
fungi(on python3 i mean)13:21
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove diskimage-builder meeting  https://review.openstack.org/61269613:21
ssbarneafungi: enough to make me afraid of using it;) mainly today i was working to fix the hack that injects the libselinux bindings into virtualenv.13:22
ssbarneafungi: don't get me wrong, i am not against venv. it is just that I already have too much diversity to deal with.13:23
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove JJB meeting  https://review.openstack.org/61269713:24
fungissbarnea: the error you linked looks like probably a network issue within ovh bhs1. if connectivity between two machines there is failing, i have little hope for connectivity across the internet to pypi13:25
ssbarneafungi: it can be a glitch, i would prefer to see that it attempted to get it from two sources before failing. don't you agree?13:26
fungissbarnea: it's possible we could try that now that we're simply proxying instead of building a mirror (for that matter, the timeout might have been the proxy failing to reach pypi). back when we built full pypi mirrors instead it made more sense to not have pip try to reach additional indices13:28
ssbarneafungi: can you point me to where this could be implemented?13:29
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove various unused Neutron meetings  https://review.openstack.org/61269813:29
fungiwhat i'm worried about is that pip won't use it for a fallback, but will actually try to hit all the indices every time (because they could have differing package versions) so we'd effectively be telling our ci jobs to start hitting pypi every time pip is invoked13:30
*** bobh has joined #openstack-infra13:31
fungiadding tons of additional calls across the internet we aren't making now, and putting a lot of additional load on pypi's cdn13:32
fungigranted they probably don't care, but pushing these through a local cache is polite to our donor clouds as well13:32
*** smarcet has joined #openstack-infra13:33
fungissbarnea: how often are you encountering these failures, that increasing inefficiency of all jobs in our ci system is a reasonable workaround?13:33
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove training guide/labs team meetings  https://review.openstack.org/61269913:34
*** mriedem has joined #openstack-infra13:34
fungiand what are the chances that if the job node has trouble reaching the mirror host in that same cloud, or the mirror is in turn having trouble reaching the pypi cdn from that region, that the job node will still have no problem reaching pypi from there?13:34
*** roman_g has joined #openstack-infra13:34
*** yamamoto has joined #openstack-infra13:35
ssbarneafungi: i see two occurrences in the last 7 days, i think both were on gate jobs, which caused serious delay. What is the inefficiency introduced by the fallback?13:35
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove Solum team meeting  https://review.openstack.org/61270013:36
fungissbarnea: so two out of tens of thousands?13:37
fungissbarnea: inefficiency introduced by the fallback is that it's not a fallback. pip will hit pypi.org every time it's invoked if pypi.org is one of the listed indices. pip doesn't use them as "fallbacks" but additional indices it thinks it should check, so it checks all of them every time13:38
*** markmcd has joined #openstack-infra13:38
fungissbarnea: as to where we perform the configuration, it's in this task: https://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/configure-mirrors/tasks/mirror.yaml13:38
fungitook me a moment to find13:38
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove Puppet-OpenStack team meeting  https://review.openstack.org/61270113:38
*** agopi has quit IRC13:39
fungissbarnea: we set index-url to the "mirror" proxy in the local region so that pip won't try to hit the pypi.org index, and then we use extra-index-url to add our mirror of prebuilt wheels13:40
fungihttps://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/configure-mirrors/templates/etc/pip.conf.j213:40
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove OpenStackClient team meeting  https://review.openstack.org/61270213:41
*** boden has joined #openstack-infra13:42
ssbarneafungi: ohh, this means that we cannot really do it because pip supports only two URLs, and we are already using both.13:42
*** roman_g has quit IRC13:43
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Remove OSops team meeting  https://review.openstack.org/61270313:43
*** roman_g has joined #openstack-infra13:45
ssbarneafungi i am wondering why it does not attempt any retry for that timeout, I see that pip does have some ability to retry based on https://github.com/pypa/pip/issues/584413:45
*** kiennt26 has joined #openstack-infra13:50
openstackgerritMerged openstack-infra/zuul master: web: Increase height and padding of zuul-job-result  https://review.openstack.org/61098013:51
ssbarneafungi: "Couldn't find index page for" makes me believe it did receive a 404 from the mirror, which would indicate a serious issue with the mirror. a no response is much better than a 404 for a package. strange I do remember seeing the same kind of error from pypi CDN few months back, randomly breaking some of my travis builds.13:51
openstackgerritMerged openstack-infra/zuul master: encrypt_secret: support OpenSSL 1.1.1  https://review.openstack.org/61141413:54
openstackgerritNicola Peditto proposed openstack-infra/project-config master: Added template 'publish-to-pypi-python3' to Iotronic projects.  https://review.openstack.org/61270513:55
*** kiennt26 has quit IRC13:56
*** edmondsw has joined #openstack-infra14:00
fungissbarnea: pip supports multiple urls, but it queries them all as separate indices in case they include different things14:01
fungissbarnea: if the apache proxy on the mirror received an error when it tried to get that index from the pypi cdn, then it could have resulted in the behavior observed14:02
*** panda|lunch is now known as panda14:04
ssbarneafungi: ok, thanks. for the moment case closed, if we see it recurring often enough we can reopen it and think about alternatives.14:04
fungiyeah, i think most of the failures we see of that nature are actually the pypi cdn failing to return what we want, so going straight to the cdn from the test nodes and bypassing the proxy isn't likely to yield much of an improvement, i don't expect14:06
*** ramishra has quit IRC14:06
fungipypi (like most resources queried over the internet) is not reliable at the scale we tend to operate at in our ci jobs14:07
fungiputting a caching proxy between the test nodes and pypi helps absorb some of that, but we're still going to see more failures of that nature than when we built our own copies of pypi14:08
fungithat simply became unworkable once tensorflow and some other machine learning projects started inserting gigabytes a day of ml data snapshots into pypi14:08
*** quiquell|lunch is now known as quiquell14:10
fungiclarkb: something seems to have started up around 0800z today that's chewing up a noticeable amount of system cpu and increasing load average. the delay makes me think it's not related to the xenial upgrade but i'm not immediately seeing what's causing it14:15
*** fresta_ is now known as fresta14:16
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Ignore removed provider in _cleanupLeakedInstances  https://review.openstack.org/60867014:18
fungiwe seem to have picked up a little bit of swap activity since the upgrade which we weren't seeing before (according to cacti). wonder if we didn't have a swap device or simply ran on an image preconfigured with swappiness overridden to 014:18
funginot enough to account for the system cpu utilization though14:18
fungilooks like it could be either apache or nodejs (or maybe both?) waiting on something14:20
fungibut regardless, pads have gotten really slow to load14:20
*** janki has quit IRC14:21
fungidisk of the etherpad-mysql-5.6 trove instance is getting close to full (17.7/20gb). i'll see about increasing that14:23
*** rpioso|afk is now known as rpioso14:26
*** smarcet has quit IRC14:29
*** e0ne has joined #openstack-infra14:29
*** quiquell is now known as quiquell|off14:29
*** aojeagarcia has joined #openstack-infra14:35
*** aojea has quit IRC14:35
*** felipemonteiro has joined #openstack-infra14:38
clarkbfungi: top seems to show apache being the bigger consumer of cpu time than anything else?14:47
fungiit varies14:47
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Cleanup node requests that are declined by all current providers  https://review.openstack.org/61091514:48
clarkbpossible it's a meltdown mitigation cost? that for some reason just wasn't present in trusty? (though we checked all were PTIing at the time)14:48
clarkbthat would explain a rise in system cpu cost at least as it has to shuffle things around anytime you syscall and back14:49
clarkbfungi: related to the zuul executors we might try using the hwe kernel on that node to see if it helps?14:49
fungidoesn't explain though why it just started 7 hours ago14:52
clarkbtrue14:52
fungiunattended-upgrades fired at 0700z on the etherpad server14:53
fungiopenssh, man-db, systemd, ureadahead, ufw14:54
fungii wonder if something is reindexing?14:54
fungi#status log doubled size of disk for etherpad-mysql-5.6 trove instance from 20gb to 40gb (contains 17.7gb data)14:55
openstackstatusfungi: finished logging14:56
fungihrm, it's also using all its ram14:56
clarkbit being the db?14:57
fungiyeah14:57
*** ccamacho has quit IRC14:57
fungithe trove instance is going to restart here in a bit14:58
fungii'm doubling its memory allocation14:58
clarkbrelated to the db, it occurred to me if we switched from gz to xz for the compression on backup artifacts we would probably save quite a bit of disk space14:58
clarkbfungi: ++14:58
lbragstadthat would explain the 503s from etherpad.o.o :)14:58
clarkbbut I need to test that hypothesis before pushing it out14:58
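The gz-versus-xz hypothesis above can be checked on a sample before changing any backup job. A minimal sketch in Python, assuming the backup artifact can be read into memory as bytes; `compare_sizes` is a hypothetical helper name and not part of any existing backup tooling:

```python
import gzip
import lzma


def compare_sizes(data: bytes) -> dict:
    """Compress the same bytes with gzip and xz and report sizes in bytes."""
    return {
        "raw": len(data),
        "gz": len(gzip.compress(data)),
        # FORMAT_XZ produces the same container the xz command line tool writes.
        "xz": len(lzma.compress(data, format=lzma.FORMAT_XZ)),
    }
```

Running it over a real dump would show whether xz actually saves space for that data; xz usually costs more CPU time, so the trade-off is worth measuring as well.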
clarkbfungi: re something indexing I don't see any processes on the app server itself that look like that14:59
clarkbI do see some unnecessary things like a battery manager :/14:59
clarkbfungi: if the db size up doesn't fix things, maybe we install dstat?15:00
clarkbthat should give us much better data over time?15:01
fungisgtm15:01
fungi#status log doubled memory allocation for etherpad-mysql-5.6 trove instance from 2gb to 4gb (contains indicated ~2gb active use)15:02
openstackstatusfungi: finished logging15:02
evrardjpthanks fungi for handling that event :)15:02
openstackgerritMerged openstack-infra/zuul master: Exclude .keep files from .gitignore  https://review.openstack.org/61199015:03
openstackgerritMerged openstack-infra/zuul master: Add a sanity check for all refs returned by Gerrit  https://review.openstack.org/59901115:03
openstackgerritMerged openstack-infra/zuul master: Reload tenant in case of new project branches  https://review.openstack.org/60008815:03
fungilooks like i may need to restart etherpad-lite service too15:03
clarkbya not sure if it will reconnect on its own15:03
fungiit crashed, looks like15:04
fungirunning again now15:04
evrardjpthanks!15:04
clarkbhttps://etherpad.openstack.org/p/clarkb-test looks happy15:04
*** hamzy has quit IRC15:04
clarkbyup thank you fungi15:05
fungiseems pretty snappy, but at this point hard to know whether it was the nodejs restart or the trove ram/disk increases15:05
*** cfriesen has joined #openstack-infra15:11
cfriesencan anyone tell me why https://review.openstack.org/#/c/611498 didn't merge yesterday?15:11
cfriesenit's got +W and +1 from zuul, but no +215:12
*** jamesmcarthur has joined #openstack-infra15:12
fungiDepends-On: I2861839532049bf0c8a2bf89311c4c56186fc0fb15:12
cfriesenthat's merged15:13
fungimerged at 20:26z15:13
fungiyour change was approved before the depends-on merged15:13
mordredyeah. what fungi said15:13
clarkbbut also 6e34371af089cc71c5e54c6921a644cf4391d77a is the parent commit which is not the current patchset of the parent change15:13
fungioh, yep, i think that's the actual problem15:13
fungibecause the depends-on is to another change in the same repo (which is kind of odd but whatever) so should have shared a change queue15:14
cfriesenthat's not actually my commit, some of our guys didn't know about the implicit depends in the same repo15:14
fungiso it's that the parent in gerrit got another patchset and 611498 wasn't rebased to reparent it15:14
cfriesenso sounds like I need to rebase this?15:15
cfriesenon what actually merged15:15
clarkbfwiw the little orange dot there tries to tell you this, it's just really bad ui from gerrit on actually making that clear15:17
*** ccamacho has joined #openstack-infra15:17
*** ccamacho has quit IRC15:17
cfriesenah, I was wondering about the orange dot.  just hadn't looked it up yet.15:17
cfriesenthanks for the help15:17
fungiyeah, https://review.openstack.org/611498 has a parent of 611494,1 but 611494,2 is what ended up merging. so gerrit considers it not mergeable because its parent will never exist in the branch15:17
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Cleanup node requests that are declined by all current providers  https://review.openstack.org/61091515:17
fungicfriesen: a rebase should solve it15:18
cfriesengreat, thanks15:18
dtroyerI'm seeing a new failure in publish-stx-specs at cd7827ba0e3fb3dd2ff5fc77e0bc4c7ba81f4969, looks like we may have exceeded our quota?15:22
*** fuentess has joined #openstack-infra15:22
dtroyershoot, that's not a link…  http://logs.openstack.org/cd/cd7827ba0e3fb3dd2ff5fc77e0bc4c7ba81f4969/post/publish-stx-specs/9232619/ara-report/15:24
dtroyerthat's a link...15:24
openstackgerritMerged openstack-infra/zuul master: Use merger to get list of files for pull-request  https://review.openstack.org/60328715:25
openstackgerritMerged openstack-infra/zuul master: Add support for authentication/STARTTLS to SMTP  https://review.openstack.org/60383315:25
openstackgerritMerged openstack-infra/zuul master: encrypt_secret: Allow file scheme for public key  https://review.openstack.org/58142915:25
*** armax has joined #openstack-infra15:25
clarkbdtroyer: yes, we'll need to bump the quota. Would probably be good to understand what is using the disk space so quickly15:26
dtroyerclarkb: right… we did just add a handful of documents in the last maybe week… I haven't gone through looking at usage yet15:26
fungilarge binary objects maybe?15:27
dtroyerso q, if we find something large that gets removed, it's removed from afs too?15:27
dtroyerfungi: that's what I'm thinking15:27
dtroyeror images or ppt15:27
fungiyes, afs publication is basically rsync --delete15:27
dtroyerit's stuff from the wiki and I suspect someone who wanted to upload ppts now finally did15:28
dtroyerfungi: good, I was hoping for that :)15:28
dtroyerwhy did I ever doubt? :)15:28
*** smarcet has joined #openstack-infra15:28
fungiwe put a "root marker" file at the root of each tree handled by a particular job so that the publisher knows not to descend into any child dir with one of those in it15:31
fungiand so only cleans up its own files and not those for which another job is responsible15:31
clarkbdtroyer: if the content is something we want/need then we don't need to delete it. Mostly want to double check there isn't unwanted or unexpected disk consumption15:31
fungi(at least in theory)15:31
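The root-marker behaviour described above can be sketched roughly as follows. This is only an illustration in Python of an rsync --delete style cleanup that skips child trees owned by another job; the marker filename `.root-marker` and the function `stale_files` are assumptions made for the example, not the names the real publisher uses:

```python
import os


def stale_files(root, published, marker=".root-marker"):
    """List files under root that this job should delete.

    `published` is the set of relative paths the job just produced.
    Child directories holding their own marker belong to another job
    and are not descended into.
    """
    stale = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never enters another job's tree.
        dirnames[:] = [
            d for d in dirnames
            if not os.path.exists(os.path.join(dirpath, d, marker))
        ]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            if name != marker and rel not in published:
                stale.append(rel)
    return sorted(stale)
```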
*** dtantsur is now known as dtantsur|brb15:32
*** kopecmartin is now known as kopecmartin|off15:33
fungii don't think it's specifically specs at fault, just browsing around /afs/openstack.org/project/starlingx.io/www/specs/ a bit15:34
fungichecking other parts of the tree now15:34
*** ianychoi_ has joined #openstack-infra15:36
fungidu -sh /afs/openstack.org/project/starlingx.io/www says 98M15:36
fungiis the quota only 100mb?15:36
clarkbfungi: ya because zuul is using like 5/100mb on its volume15:37
fungik15:37
clarkbso we used zuul as a starting point15:37
*** cfriesen has quit IRC15:37
*** hamzy has joined #openstack-infra15:38
*** e0ne has quit IRC15:39
*** ianychoi has quit IRC15:40
*** gfidente is now known as gfidente|afk15:46
fungibut yeah, just skimming around i think a bunch of it is merely boilerplate for the dozens of stx projects publishing their docs under there now15:47
fungicopies of jquery, icons, et cetera15:48
fungizuul only has a handful of repos compared to stx15:48
*** agopi has joined #openstack-infra15:48
clarkbfungi: are you in a spot to bump the quota there or should I spin up my aklog shell to do it?15:53
clarkbmaybe go to 1GB?15:53
*** smarcet has quit IRC15:56
fungii can take care of it in just a moment15:59
clarkbthanks!15:59
* dtroyer catches up16:02
dtroyerthanks guys… yeah, we have 30+ separate sphinx builds writing into that tree… we are thinking about doing some sort of meta-build to cut some of that duplication out and to allow things like internal Sphinx references between more of those… longer term16:03
*** smarcet has joined #openstack-infra16:03
*** ginopc has quit IRC16:05
*** udesale has quit IRC16:06
openstackgerritMerged openstack-infra/zuul master: web: add config-errors notifications drawer  https://review.openstack.org/59714716:07
*** felipemonteiro has quit IRC16:09
*** gyee has joined #openstack-infra16:16
fungiclarkb: dtroyer: i've done `fs setquota -path /afs/openstack.org/project/starlingx.io -max 1000000`16:17
fungiand now `fs listquota -path /afs/openstack.org/project/starlingx.io` says 100054 of 1000000 used (10%)16:17
*** smarcet has quit IRC16:17
fungi#status log increased quota for project.starlingx volume from 100mb to 1gb16:18
openstackstatusfungi: finished logging16:18
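For reference, the quota figures above are in kilobyte blocks, so the percentage reported by `fs listquota` can be reproduced with simple arithmetic. A small Python sketch; `quota_percent` is a hypothetical helper, and the old quota of 100000 KB is an assumption based on the "100mb" mentioned in the discussion:

```python
def quota_percent(used_kb: int, quota_kb: int) -> int:
    """Whole-number percentage of an AFS volume quota in use."""
    return (100 * used_kb) // quota_kb


# Usage after the increase, from the listquota output quoted above.
after = quota_percent(100054, 1000000)
# The same usage against the assumed previous 100000 KB quota.
before = quota_percent(100054, 100000)
```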
dtroyerthanks fungi… should I push up another review to re-run that job?16:18
fungiif that's easy16:21
fungiotherwise i can reenqueue the last ref into the post pipeline16:22
dtroyerif you could do that I'd appreciate it, I'm about to get busy for a few hours…16:22
*** agopi has quit IRC16:24
*** smarcet has joined #openstack-infra16:27
fungisure (i mean, i'm busy too, but it's a fairly simple command)16:27
fungidtroyer: i've just now done `sudo zuul enqueue-ref --tenant=openstack --trigger=gerrit --pipeline=post --project=openstack/stx-specs --ref=refs/heads/master --newrev=cd7827ba0e3fb3dd2ff5fc77e0bc4c7ba81f4969`16:28
fungi(based on the ref for the failed job run you linked, assuming that it was the most recent)16:29
*** gfidente|afk has quit IRC16:39
*** armstrong has joined #openstack-infra16:46
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: DNM: testing zookeeper oddities  https://review.openstack.org/61275016:46
*** jamesmcarthur has quit IRC16:46
*** smarcet has quit IRC16:46
*** carl_cai has quit IRC16:47
*** dtruong has quit IRC16:56
*** derekh has quit IRC17:00
*** smarcet has joined #openstack-infra17:01
openstackgerritDavid Moreau Simard proposed openstack-infra/system-config master: Add support for enabling the ARA callback plugin in install-ansible  https://review.openstack.org/61122817:03
openstackgerritDavid Moreau Simard proposed openstack-infra/system-config master: Add playbook for deploying the ARA web application  https://review.openstack.org/61123217:03
*** gothicmindfood has quit IRC17:05
*** bobh has quit IRC17:08
*** e0ne has joined #openstack-infra17:08
*** bobh has joined #openstack-infra17:09
*** gothicmindfood has joined #openstack-infra17:10
*** hamzy has quit IRC17:12
*** hamzy has joined #openstack-infra17:12
*** dtantsur|brb is now known as dtantsur17:13
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: DNM: testing zookeeper oddities  https://review.openstack.org/61275017:13
*** bobh has quit IRC17:14
clarkbfungi: dtroyer http://logs.openstack.org/cd/cd7827ba0e3fb3dd2ff5fc77e0bc4c7ba81f4969/post/publish-stx-specs/d864000/ara-report/ looks happy fwiw17:16
*** lbragstad is now known as lbragstad_f00d17:17
*** aojeagarcia has quit IRC17:18
fungigreat!17:18
*** dtantsur is now known as dtantsur|afk17:29
openstackgerritJames E. Blair proposed openstack-infra/nodepool master: WIP: Run dstat and generate graphs in unit tests  https://review.openstack.org/61276517:30
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: DNM: testing zookeeper oddities  https://review.openstack.org/61275017:31
openstackgerritClark Boylan proposed openstack-infra/zuul master: Run zookeeper datadir on tmpfs during testing  https://review.openstack.org/61276617:33
*** jpena is now known as jpena|off17:34
*** agopi has joined #openstack-infra17:34
*** agopi has quit IRC17:39
*** tpsilva has joined #openstack-infra17:40
corvusi've created adns1; moving on to ns1 now17:41
*** trown is now known as trown|lunch17:42
*** e0ne has quit IRC17:45
*** lbragstad_f00d is now known as lbragstad17:45
*** felipemonteiro has joined #openstack-infra17:46
*** smarcet has quit IRC17:46
Shrewsclarkb: still on for builder zk rehoming in 2 hrs?17:47
clarkbShrews: ya let me remove my -1 wip17:47
clarkbhttps://review.openstack.org/#/c/612441/1 is the change17:48
Shrewsgot it17:48
Shrewsclarkb: i'm almost wondering if we should do a total shutdown of the builder processes before merging that17:53
Shrewswe've never switched to a *different* cluster on the fly17:53
clarkbShrews: or merge it with nb01-03 in the emergency file so that we can coordinate when it applies (and shutdown the builders as part of that)17:53
Shrewsclarkb: i think that would be the best plan17:54
clarkbI'll go add them to that list now17:54
Shrews++17:54
clarkball three builders listed now in the emergency file17:55
* Shrews needs to step away for a bit before the meeting. biab17:57
*** hamzy has quit IRC17:59
clarkbI too need to get a few things done away from the computer before the meeting happens17:59
*** hamzy has joined #openstack-infra18:00
*** gary_perkins has quit IRC18:00
*** kjackal has quit IRC18:02
*** agopi has joined #openstack-infra18:06
*** bobh has joined #openstack-infra18:07
*** jamesmcarthur has joined #openstack-infra18:07
*** jamesmcarthur has quit IRC18:08
*** jamesmcarthur has joined #openstack-infra18:09
*** bobh has quit IRC18:12
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Add opendev nameservers (2/2)  https://review.openstack.org/61006618:14
*** smarcet has joined #openstack-infra18:14
*** apetrich has quit IRC18:16
*** apetrich has joined #openstack-infra18:17
*** jpich has quit IRC18:20
corvusremote:   https://review.openstack.org/612770 Add initial zone info18:21
openstackgerritJames E. Blair proposed openstack-infra/project-config master: Gerritbot: add zone-opendev.org to -infra  https://review.openstack.org/61277118:22
corvusclarkb, fungi: would you please review those 3 changes asap?18:23
fungicorvus: has anyone talked to mnaser yet to get reverse dns updated on ns2?18:23
fungi(and yes, already looking at them)18:23
corvusfungi: not yet, that's next on my list18:23
*** hamzy has quit IRC18:23
fungiokay, cool just noticed it was still at a default generated ptr18:23
*** hamzy has joined #openstack-infra18:24
corvusfungi: and i believe we decided that we wanted manual dig queries to work before we asked jimmy to set up the glue records, so i'll do that after we work through the initial bootstrap18:24
*** smarcet has quit IRC18:24
clarkbcorvus: I think we need to edit acls or groups for 61277018:24
clarkbI only have +-118:24
*** kjackal has joined #openstack-infra18:25
clarkbshould I add infra-root as a group to the opendev zone file core group?18:26
fungilooks like we simply need to update the group membership18:26
clarkbya18:27
fungiyeah, that's what i was about to ask as well, though pretty sure the answer is yes18:27
clarkbI went ahead and added infra core18:27
clarkbcan be changed later if necessary18:27
fungii see that, thanks!18:27
*** bobh has joined #openstack-infra18:28
fungiwe're missing reverse dns on the v6 address of adns1?18:29
corvuswell that's weird18:29
fungiahh, nope, wrong address18:30
fungii was looking at reverse dns for the v6 address of ns218:30
corvuswhew18:30
fungiso nothing to see here, move along ;)18:30
*** e0ne has joined #openstack-infra18:31
*** e0ne has quit IRC18:31
*** felipemonteiro has quit IRC18:34
*** irclogbot_2 has joined #openstack-infra18:35
*** chandankumar is now known as chkumar|off18:37
fungithe anomalous system cpu load on the etherpad server doesn't seem to have returned after the nodejs restart and trove resizing18:44
fungistill quite snappy18:44
*** panda has quit IRC18:45
*** panda has joined #openstack-infra18:45
clarkbfungi: yup it seems to be happy18:46
openstackgerritAdam Coldrick proposed openstack-infra/storyboard-webclient master: Handle project names in the new Story modal controller  https://review.openstack.org/61277818:47
*** gary_perkins has joined #openstack-infra18:51
*** cfriesen has joined #openstack-infra18:52
melwittcan anyone remind me which irc bot is needed in channel to make elastic-recheck comments on gerrit work?18:53
dtroyerfungi: thank you, it does look happy on the web side too18:54
cfriesenmelwitt: recheckwatchbot ?18:55
cfriesenor maybe that's just how it identifies itself in logs18:55
clarkbmelwitt: cfriesen you don't need the irc bot to be in channel to comment on the gerrit changes18:55
clarkbmelwitt: cfriesen it should comment on all gerrit changes it can identify as failing for a particular reason within the timeout window we give it. Then on top of that it can also comment on irc if you configure it for that18:56
melwittok. I can't recall what the issue was last time it stopped commenting. I had thought it had to do with an irc bot18:56
melwittI'm not 100% sure it stopped commenting but was trying to see how I can check if any of the necessary conditions have not been met18:57
clarkbShrews: maybe you want to review/approve https://review.openstack.org/#/c/612441/ now so that it is ready for us after the meeting? the emergency file is set now18:57
*** bobh has quit IRC18:57
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Merger: automatically add new hosts to the known_hosts file  https://review.openstack.org/60845318:57
melwittI was trying to determine whether or not it had stopped commenting18:57
clarkbmelwitt: https://review.openstack.org/#/q/owner:%22Elastic+Recheck+(8871)%22 is the gerrit account that should comment. I forget how to search by commented by18:57
clarkbyou should be able to query gerrit for changes that that account has commented on though18:58
melwittright, I can search "comment:elastic" and find things18:58
clarkbhttps://review.openstack.org/#/q/commentby:%22Elastic+Recheck+(8871)%22 is the query18:59
melwittah, thanks18:59
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Merger: automatically add new hosts to the known_hosts file  https://review.openstack.org/60845318:59
clarkbinfra meeting starting in a minute over in #openstack-meeting18:59
ssbarneayep, didn't have time to even check the agenda19:00
Shrewsclarkb: done19:00
*** agopi has quit IRC19:03
ianwfungi: (re your comment ages ago), yeah, iirc the problem was stuff does "virtualenv -p python3", so that was what we wanted to work.  but i don't know how built-in venv is either, try it on bionic :)  there you need python3-venv19:04
fungiright19:04
ianwfungi: i get very lost trying to understand what the future of something that does what "virtualenv" is19:04
* fungi shakes fist at debian python maintainers19:04
ssbarneaafaik venv is part of python3 and is a replacement of virtualenv, the only catch is that there is no venv for py27, but there is a working virtualenv for py3.19:07
ssbarneawhich made venv for me a no go from start, enough problems with the old virtualenv to combine them with the new ones.19:07
ianw... and it's not *quite* a part of base python3 on all distros ... heh, see, it's that simple :)19:08
*** smarcet has joined #openstack-infra19:08
fungithere's been some frequent talk of backporting the venv module to upstream 2.7.x but sounds like it's nontrivial due to some differences in module relocation support for python3 i think19:08
fungibut yeah, distros stripping out bits of the python stdlib and sticking them in packages which don't get installed with the interpreter is annoying19:09
clarkb++19:10
ianwthen you've got pipenv going on too.  and that has a gif screencast of a terminal session on its github page, which makes it better too19:10
ssbarneaianw: probably with a youtube channel, instagram account too.19:13
ianw:)19:14
fungitechnical support via snapchat19:15
*** e0ne has joined #openstack-infra19:15
dmsimardno twitter ?19:16
*** smarcet has quit IRC19:18
fungitwitter is so 201819:21
*** jamesmcarthur has quit IRC19:22
*** pcaruana has quit IRC19:23
openstackgerritMerged openstack-infra/project-config master: Switch nodepool builders to zk cluster  https://review.openstack.org/61244119:23
*** jamesmcarthur has joined #openstack-infra19:23
*** jamesmcarthur has quit IRC19:27
*** david-lyle has joined #openstack-infra19:27
*** dklyle has quit IRC19:28
*** jamesmcarthur has joined #openstack-infra19:28
*** jamesmcarthur has quit IRC19:30
*** jamesmcarthur has joined #openstack-infra19:32
*** bobh has joined #openstack-infra19:32
*** trown|lunch is now known as trown19:36
ssbarneaanything urgent on the agenda? ;)19:37
*** bobh has quit IRC19:38
fungion the infra meeting agenda? not that i'm aware19:40
*** armstrong has quit IRC19:46
*** agopi has joined #openstack-infra19:50
*** lbragstad has quit IRC19:50
*** lbragstad has joined #openstack-infra19:53
*** david-lyle is now known as dklyle19:53
amorinhey guys19:55
dmsimardinfra-root: I'm back if we want to chat bridge.o.o19:55
fungiafter infra meeting maybe19:55
fungihi amorin! any news?19:56
amorinianw: check the nodepool status on gra1, sounds in better shape19:56
amorinfungi: yes19:56
amorinhttp://grafana.openstack.org/d/BhcSH5Iiz/nodepool-ovh?orgId=1&from=now-3h&to=now19:56
amorinwe fixed an issue on gra119:56
amorinI think it will help  a lot19:56
fungiooh, that's... looking better as of ~1815utc?19:57
fungithanks!!!19:57
amorinwe did not apply on bhs1 yet19:57
amorinwe will tomorrow19:57
ianwamorin: excellent!  did clarkb turn it back on?19:57
*** jamesmcarthur has quit IRC19:58
clarkbI haven't touched it19:58
clarkbunless I messed up an emergency file edit19:58
clarkbno nl04 is still in that file19:59
ianwhrm, no it's still in there ... but nobody turned it back to 79?19:59
ianwoh for heavens sake19:59
ianwopensatck.org19:59
clarkboh ha19:59
fungithose people are splitters19:59
fungibuncha poseurs20:00
clarkbShrews: I can run the kick.sh for nb0*.openstack.org if you want to shut down the running builders and let me know when you are happy they are off?20:00
ianw#status log nb04.opensatck.org removed from emergency20:00
openstackstatusianw: finished logging20:00
Shrewsclarkb: on it20:00
clarkbShrews: the kick.sh should edit the config and restart the service20:00
clarkbI'll go talk to the release team in the interim20:00
dmsimardianw: replied to your comment on https://review.openstack.org/#/c/611228/ sorry to have missed it20:01
ianwamorin: well all's well that ends well :)  it still seems that we're cleaning up a lot of leaked ports, but it does seem we're not running out of ports before the next cleanup run.  is that a known problem?20:01
*** jamesmcarthur has joined #openstack-infra20:02
openstackgerritMerged openstack-infra/project-config master: Gerritbot: add zone-opendev.org to -infra  https://review.openstack.org/61277120:03
Shrewshrm, i thought builder shutdown would kill any running dib processes20:03
clarkbShrews: I think if dib is blocking on a long running process like the image conversion step it may not process signals?20:04
clarkbI want to say it does go away after a while?20:04
Shrewsclarkb: should be safe to just kill right?20:04
clarkbShrews: ya, it might leak some stuff but we are already counting on cleaning all that up anyway20:04
ianwit might leave behind mounts, probably best to do a reboot20:05
*** bobh has joined #openstack-infra20:05
Shrewsianw: a server reboot?20:05
aspiershow good are the results from inline replies to mail notifications from Gerrit reviews?20:05
clarkbwe can also wait for dib to finish if it won't be too long20:05
Shrewsclarkb: well i already killed it20:05
clarkbShrews: heh ok :)20:05
aspiersi.e. if I inline reply to a Gerrit mail notification, will it create a mess in the review?20:06
clarkbaspiers: I don't think the version of gerrit we run supports it at all20:06
aspiersI found https://gerrit-review.googlesource.com/Documentation/intro-user.html#reply-by-email but it doesn't go into detail20:06
clarkbaspiers: so no, it should just noop20:06
*** openstackgerrit has quit IRC20:06
aspiersoh OK, thanks20:06
clarkbaspiers: the gerrit mailing list makes it seem like that particular feature is still a work in progress20:06
clarkbit mostly works except for the weird corner cases people have run into20:06
Shrewsclarkb: ianw: should we reboot (nb02)?20:06
*** jamesmcarthur has quit IRC20:06
aspiersgotcha20:06
clarkbShrews: let me look20:07
clarkbya there are a bunch of weird mounts. Shrews do we want to disable the nodepool builder service first so that it doesn't start up again and start building before we start it?20:07
clarkbsudo systemctl disable nodepool-builder (I think)20:08
clarkbthen puppet will enable it, update the config, and start it20:08
Shrewsclarkb: yeah20:08
clarkbwe should probably do the same on the other two builders too so that they are all running a consistent kernel and anything else pending a reboot20:09
*** jamesmcarthur has joined #openstack-infra20:09
Shrewsclarkb: nb01 appears to have those weird mounts too, and i didn't kill anything there20:09
*** bobh has quit IRC20:09
clarkbShrews: I wonder if dib crashes leak those too20:09
Shrews*shrug*20:10
ianwyeah, it's possible that failed builds can sometimes leak mounts, depending on how they failed20:10
clarkbin any case I think the reboots should be done on nb01 nb02 and nb03 for update consistency with kernels and the like20:10
clarkband disable nodepool-builder first20:10
Shrews*nod*20:10
ianwconsidering it's been up for 277 days, maybe a few leaks isn't that bad :) (i welcome any reviews that help the cleanup path)20:11
Shrewsclarkb: nodepool-builder disabled on all 320:11
Shrewsclarkb: issuing reboots now20:11
clarkbnb01 looks good, no nodepool-builder running and much cleaner mounts20:13
dmsimardso, re: bridge.o.o... my understanding is that it needs to be rebuilt from scratch anyway because it's not sized properly ? is there a plan on how we might scale bridge.o.o beyond a single machine or do we want to stay on a single server ?20:13
Shrewsclarkb: yep. all 3 back, no builder running20:14
clarkbnb02 too20:14
clarkbShrews: when I kick.sh should I start with nb01.o.o and make sure it all looks good before doing the other two?20:14
clarkbShrews: also I'm ready to do ^ if you are20:14
Shrewsclarkb: yeah20:14
clarkbok running kick against nb01 now20:15
clarkbit takes a couple minutes to figure out the inventory and groups20:16
clarkbShrews: puppet is done on nb0120:18
clarkblooks like it didn't actually enable and start the builder20:18
Shrewsnope20:18
clarkbI wonder if we have a bug with the sysv init script compat layer20:19
*** smarcet has joined #openstack-infra20:19
corvusdmsimard: for the foreseeable future, a single machine20:19
clarkbShrews: considering we manually disabled the service I'm good to manually enable and start the service if you are20:19
clarkbShrews: then we can rerun puppet to make sure everything is happy20:19
Shrewsclarkb: go ahead. i'm monitoring the log now...20:19
clarkbok20:19
clarkbit is running20:20
clarkbseems to be running dib implying its happy with the zk?20:20
clarkbI see the connection from nb01 on zk0120:21
Shrewsyeah20:21
clarkbShrews: I'll rerun kick.sh against nb01 to make sure the puppet is happy with my manual service enablement20:21
Shrews2018-10-23 20:20:16,645 INFO kazoo.client: Connecting to 2001:4800:7817:103:be76:4eff:fe04:e359:218120:21
Shrews2018-10-23 20:20:16,677 INFO kazoo.client: Zookeeper connection established, state: CONNECTED20:22
clarkb(want it to steady state as a noop)20:22
clarkbit nooped. Ready for me to run it against nb02 and nb03?20:23
*** kgiusti has left #openstack-infra20:23
clarkbShrews: ^ I'll wait for your go ahead20:24
Shrewsgo20:24
clarkbsudo ./kick.sh nb02.openstack.org:nb03.openstack.org is running20:25
clarkbthere is probably a better way to glob that20:25
clarkbnb03 is done. I will manually start the service there now20:27
clarkber nb02, 03 is still running20:27
clarkbI see nb02 on zk01 now20:28
Shrewsyep, it connected, creating a new image20:28
clarkbstarting nb03 builder now20:29
Shrewsoops20:29
clarkband I see nb03 on zk01 now20:29
clarkboh?20:29
Shrews2018-10-23 20:29:37,954 INFO kazoo.client: Connecting to 2001:4800:7815:102:be76:4eff:fe02:f134:218120:30
Shrews2018-10-23 20:29:37,959 WARNING kazoo.client: Connection dropped: socket connection error: Network is unreachable20:30
Shrews2018-10-23 20:29:37,962 INFO kazoo.client: Connecting to 2001:4800:7817:103:be76:4eff:fe04:e359:218120:30
Shrews2018-10-23 20:29:37,963 WARNING kazoo.client: Connection dropped: socket connection error: Network is unreachable20:30
Shrews2018-10-23 20:29:37,964 INFO kazoo.client: Connecting to 23.253.236.126:218120:30
Shrewsnb03 could only connect to that last one20:30
clarkbShrews: is that on 03? I wonder if ipv6 doesn't work in that cloud20:30
clarkbianw: ^ any idea if that is expected to work?20:30
Shrewsyeah, only the ipv4 worked20:30
clarkbShrews: I think its ok for now if that builder falls back to ipv420:30
clarkbunfortunately we are finding that happens in a few spots because ipv6 isn't quite working as expected :(20:31
Shrewsnb03 is connected though and building a new image20:31
clarkbnb03 only has a scope:Link address for ipv620:31
clarkbI think ipv6 not working there is expected. Possibly we need to have kazoo check if it can ipv6 before ipv6ing?20:32
Shrewsclarkb: i'm not sure why it chose ipv4 for the last address though20:32
corvusthose hosts are zk03/zk01/zk01 -- so it tried both the v6 and v4 for zk01.20:32
Shrewsoh, heh20:32
clarkbhttps://review.openstack.org/#/c/611920/ is the change I wrote for gear to do similar on the binding side20:32
clarkbbut you can use the AI_ADDRCONFIG flag when connecting too aiui20:32
Shrewswell that's controlled in the kazoo layer20:33
clarkbAI_ADDRCONFIG says give me addresses that are valid for configured AF_INET types on this host20:33
clarkbya probably needs to be fixed in kazoo20:33
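[Editor's sketch, not part of the log] The AI_ADDRCONFIG idea above can be illustrated with Python's standard `socket.getaddrinfo`. This is a minimal sketch under stated assumptions, not kazoo's actual connection code: the helper name `resolve_usable` is hypothetical, and whether a host that has only a link-local IPv6 address (as nb03 does) counts as "IPv6 configured" depends on the platform's resolver, so the flag may not by itself avoid the failed attempts seen above.

```python
import socket


def resolve_usable(host, port):
    """Resolve host:port, asking the resolver to skip address families
    that have no configured address on this machine (AI_ADDRCONFIG).

    resolve_usable is a hypothetical helper name used for illustration.
    """
    infos = socket.getaddrinfo(
        host,
        port,
        family=socket.AF_UNSPEC,      # accept either IPv4 or IPv6
        type=socket.SOCK_STREAM,      # TCP, as a ZooKeeper client would use
        flags=socket.AI_ADDRCONFIG,   # only families configured locally
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # keep just the family and the socket address.
    return [(family, sockaddr) for family, _t, _p, _c, sockaddr in infos]


if __name__ == "__main__":
    try:
        for family, sockaddr in resolve_usable("localhost", 2181):
            print(family.name, sockaddr)
    except socket.gaierror as exc:
        print("resolution failed:", exc)
```

A client would then try to connect only to the returned addresses, instead of attempting an IPv6 address on a host that cannot route IPv6.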
clarkbin any case things are working well from what I see so far. I guess we watch it and make sure that images get uploaded into the clouds then prepare for thursday morning zuul+launcher switch?20:34
*** gfidente has joined #openstack-infra20:34
ianwclarkb: i don't think 03 will have ipv620:34
clarkbcorvus: Shrews does a noonish eastern time work for you on thursday to do cutover?20:34
clarkbianw: ya I don't see it there via ifconfig so likely a kazoo bug20:34
Shrewsclarkb: yes, i should be available then20:35
*** ssbarnea_ has quit IRC20:36
clarkbcorvus: I figure we should do a full restart of zuul too and mark the sha1 so that we can cut a zuul release in the near future too20:36
*** ansmith has quit IRC20:37
*** hamzy has quit IRC20:40
*** hamzy has joined #openstack-infra20:40
*** xek has quit IRC20:41
corvusclarkb, Shrews: before 11 and after 2 eastern work better for me; but you don't *need* me, so if you want to do that i can catch up later.20:42
fungialso i had to bail on the conference so am at home and can help at any of those times20:43
*** agopi has quit IRC20:44
clarkbcorvus: ok, mostly want to do it early enough that the release team can do things later in the day (sounds like they may be behind on release stuff so us getting done earlier is better than later). I could possibly do 7am Pacific but I may not be very useful :)20:44
clarkblet's pencil in 7am pacific if that works for shrews and fungi20:45
clarkbthen I will pretend it is a fishing trip and wake up early20:45
fungifishing for server upgrades20:45
*** openstackgerrit has joined #openstack-infra20:46
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] Add epel element to centos7 testing  https://review.openstack.org/61263620:46
corvusoh, wow, i guess i could have gone to ato.  oh well.20:46
clarkbthe changes to do the switchover should all be pushed. The only work outstanding is coordinating the shutdown,  update, start and restore20:46
fungicorvus: oh, were you in nc and didn't realize it was ato week?20:46
corvusfungi: yep!  i'm good at calendars and stuff.20:47
* fungi just assumed you were in this tz for the conference20:47
*** carl_cai has joined #openstack-infra20:48
clarkbI kinda want to do the terrible thing of stopping zk01 now to see what happens :)20:49
clarkbmaybe we do that after all the images are built tomorrow20:49
corvusclarkb: be a chaos monkey.  we'll be fine.  :)20:50
clarkbin this particular moment builders continue to look happy so I am going to step away from the computer for "lunch"20:51
corvus(it should only set the image build process back a little)20:51
clarkbI'll keep an eye on the build process and maybe be a chaos monkey too. Also review the storyboard attachments spec when I get back20:51
clarkbfwiw the zk nodes are in cacti too so we'll have good data as we turn stuff up I hope20:51
clarkb`echo stat | nc localhost 2181` is the incantation to see zk server stats and leader/follower info20:52
clarkbif anyone is wondering20:52
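[Editor's sketch, not part of the log] The `echo stat | nc localhost 2181` incantation above can be scripted in Python to pull out the leader/follower role. This is a rough sketch under stated assumptions: the function names `zk_stat` and `zk_mode` are hypothetical, it assumes the server accepts the `stat` four-letter command on the client port, and the sample reply in the demo is illustrative text rather than output captured from these hosts.

```python
import socket


def zk_stat(host="localhost", port=2181, timeout=5.0):
    """Send the 'stat' four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stat")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", "replace")


def zk_mode(stat_output):
    """Return the value of the 'Mode:' line (e.g. leader or follower),
    or None if the reply has no such line."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


if __name__ == "__main__":
    # Illustrative reply text only; a live check would use zk_mode(zk_stat()).
    sample = "Zookeeper version: (elided)\nMode: follower\nNode count: 4\n"
    print(zk_mode(sample))
```

Running `zk_mode(zk_stat(host))` against each cluster member in turn would show which one is currently the leader.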
Shrewsoh, if we restart the launchers, we'll probably need to consider ianw's change (https://review.openstack.org/605898) that modifies stats labels (assuming it merges)20:53
Shrewsi'm suspecting that will break our graphs20:53
clarkbShrews: I think we can clean that up as a followon20:53
clarkbwon't impact functionality, just reporting20:53
openstackgerritMerged openstack-infra/zuul master: Run zookeeper datadir on tmpfs during testing  https://review.openstack.org/61276620:54
ianwShrews: yeah, i planned to update graphs with those stats :)20:55
*** bobh has joined #openstack-infra20:56
*** bobh has quit IRC20:59
*** bobh_ has joined #openstack-infra20:59
amorinianw: if api is stable enough, i'll dig into port leak tomorrow21:00
*** bobh_ has quit IRC21:00
ianwamorin: ++ thanks21:01
*** bobh has joined #openstack-infra21:01
*** e0ne has quit IRC21:04
*** bobh has quit IRC21:06
*** trown is now known as trown|outtypewww21:08
*** priteau has quit IRC21:09
clarkbtwo images have been built and are being uploaded. Continues to look happy21:09
*** felipemonteiro has joined #openstack-infra21:13
*** xek has joined #openstack-infra21:14
*** gfidente has quit IRC21:14
*** ldnunes has quit IRC21:15
*** xek has quit IRC21:17
*** dklyle has quit IRC21:19
*** dklyle has joined #openstack-infra21:20
*** jamesmcarthur has quit IRC21:29
openstackgerritKendall Nelson proposed openstack-infra/storyboard-webclient master: Show Email Addresses when Searching  https://review.openstack.org/58971321:29
clarkbdiablo_rojo: I've reviewed the spec21:34
diablo_rojoclarkb, awesome! Thank you.21:34
diablo_rojoI'll give it another day or two and have updates up after that.21:34
*** boden has quit IRC21:37
*** spsurya has quit IRC21:38
*** slaweq has quit IRC21:39
*** rtjure has quit IRC21:39
*** agopi has joined #openstack-infra21:48
openstackgerritClark Boylan proposed openstack-infra/nodepool master: Run test zookeeper on top of tmpfs  https://review.openstack.org/61281621:49
*** felipemonteiro has quit IRC21:52
*** felipemonteiro has joined #openstack-infra21:54
*** slaweq has joined #openstack-infra22:05
*** felipemonteiro has quit IRC22:12
*** eharney has quit IRC22:13
*** smarcet has quit IRC22:18
openstackgerritAndrey Volkov proposed openstack-infra/project-config master: New Airship project - Utils  https://review.openstack.org/61282022:20
*** kjackal has quit IRC22:23
*** kjackal_v2 has joined #openstack-infra22:23
*** rcernin has joined #openstack-infra22:24
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] Add epel element to centos7 testing  https://review.openstack.org/61263622:25
*** bnemec has quit IRC22:25
*** rh-jelabarre has quit IRC22:26
*** mriedem has quit IRC22:26
ianwcgoncalves: ^ hrm "mirrorlist" v "metalink" in the repo ... i'm not sure why i saw it working in the octavia gate22:27
*** ansmith has joined #openstack-infra22:29
*** slaweq has quit IRC22:38
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/zuul master: Add the process environment to zuul.conf parser  https://review.openstack.org/61282422:47
*** kjackal_v2 has quit IRC22:48
*** carl_cai has quit IRC22:54
*** diablo_rojo has quit IRC22:57
*** hamzy has quit IRC22:59
*** threestrands has joined #openstack-infra23:02
*** tosky has quit IRC23:02
*** adriant has joined #openstack-infra23:08
openstackgerritClark Boylan proposed openstack-infra/nodepool master: Do not merge  https://review.openstack.org/61282823:08
*** diablo_rojo has joined #openstack-infra23:08
*** tpsilva has quit IRC23:11
*** slaweq has joined #openstack-infra23:11
*** rlandy is now known as rlandy|bbl23:17
*** xarses_ has quit IRC23:32
*** xarses_ has joined #openstack-infra23:33
*** sthussey has quit IRC23:38
*** roman_g has quit IRC23:43
*** slaweq has quit IRC23:45
*** roman_g has joined #openstack-infra23:45
*** gyee has quit IRC23:46
*** smarcet has joined #openstack-infra23:58
