Wednesday, 2016-06-29

*** sacharya_ has quit IRC00:00
alextricity25cloudnull still around?00:00
cloudnullthetrav: that issue looks fine.00:01
cloudnullsomething to dig into for sure.00:01
cloudnullalextricity25: sure, whats up?00:01
thetravyou know what though00:01
*** markvoelker has quit IRC00:01
thetravI realised I hadn't done an apt-get dist-upgrade -y00:01
alextricity25cloudnull: Have you tried building a multi-node from master lately?00:01
thetravI saw https://bugs.launchpad.net/openstack-ansible/+bug/1595323 which made me think of it00:01
openstackLaunchpad bug 1595323 in openstack-ansible "doc: kernel version requirements for mitaka install" [Undecided,Confirmed] - Assigned to Matt Dorn (madorn)00:01
thetravwent from 3.13.0-40-generic to 3.13.0-91-generic00:02
*** markvoelker has joined #openstack-ansible00:02
cloudnullthetrav: maybe a kernel bug fix in there that makes that happier?00:02
*** markvoelker has quit IRC00:02
thetravso far it appears to be holding steady at 16 handles00:02
thetravyeah00:02
cloudnullinteresting .00:02
*** markvoelker has joined #openstack-ansible00:02
cloudnullalextricity25: no not recently.00:02
thetravI mean, dist-upgrade does more than just the kernel00:02
thetravthere was a whole bunch of stuff00:02
cloudnullI did do the osic upgrade not long ago00:02
cloudnulland that was liberty00:03
cloudnullthetrav: for sure.00:03
cloudnullI guess it could've been a whole host of fixes00:03
alextricity25cloudnull: It looks like the synchronize ansible module is broken00:03
cloudnullwhich may have been something in python itself00:03
thetravso, I guess it's my bad.  If I had to make a suggestion I'd recommend a dist-upgrade as part of the bootstrap script for the deployment node00:03
cloudnullthetrav:  we can do that00:03
alextricity25cloudnull: https://gist.github.com/alextricity25/20c7045737324f4cc991864fd8ba1f6500:04
cloudnullalextricity25: what version of ansible?00:04
alextricity25cloudnull: v2.100:04
thetravoop, no, spoke too soon00:04
thetravsame error, same spot00:04
thetravahh00:04
thetravI was running my lsof watch as ubuntu00:04
thetravcouldn't see what root was up to00:04
thetravso ignore what I said.  dist-upgrade fixes nothing00:05
cloudnullalextricity25: hum.00:05
alextricity25cloudnull: Related? https://github.com/ansible/ansible/issues/1540500:05
cloudnullill spin an env in a few and see how it goes.00:05
cloudnullalextricity25: maybe related.00:06
cloudnullima go eat and spin an env and see what happens.00:06
alextricity25cloudnull: let me know what you get. Thanks buddy00:06
cloudnullthetrav: ill look into the FS issue too.00:07
thetrav\o/00:07
thetravmostly I just hope you can reproduce it00:07
thetravwell, no, that's not true, mostly I hope you can fix it ;)00:07
*** adrian_otto1 has quit IRC00:08
*** adrian_otto has joined #openstack-ansible00:09
*** mummer has quit IRC00:10
thetravmy biggest suspect is still opening sub-processes with pipes and not closing stdin stdout or stderr00:13
*** thorst has joined #openstack-ansible00:13
thetravlooking for ways to monkey patch the built ins00:13
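The suspicion above (sub-processes opened with pipes whose stdin, stdout or stderr are never closed) can be illustrated with a minimal Python 3 sketch. This is not Ansible's code; `run_and_release` is a hypothetical helper that only shows one way to make sure the pipe handles are released once the child exits.

```python
import subprocess
import sys


def run_and_release(cmd):
    """Run cmd, capture its output, and close every pipe that was opened.

    Hypothetical helper for illustration; it is not taken from Ansible.
    """
    # Using Popen as a context manager closes the pipe file objects and
    # waits for the child when the block exits, even if an error occurs.
    with subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,   # the child never reads from our stdin
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    ) as proc:
        out, err = proc.communicate()
    return proc.returncode, out, err


rc, out, err = run_and_release([sys.executable, "-c", "print('ok')"])
```

If a pattern like this is skipped and the `Popen` objects are kept alive, each call can leave descriptors open, which matches the kind of slow climb being described.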
*** thorst has quit IRC00:21
openstackgerritDarren Chan proposed openstack/openstack-ansible: [docs] Revise overview chapter in OSA install guide  https://review.openstack.org/33196600:28
*** michaelgugino has quit IRC00:38
*** jthorne_ has joined #openstack-ansible00:40
*** jthorne has quit IRC00:40
*** adrian_otto has quit IRC00:42
*** jthorne_ has quit IRC00:45
*** asettle has joined #openstack-ansible00:45
*** asettle has quit IRC00:50
*** thorst has joined #openstack-ansible00:59
*** wadeholler has joined #openstack-ansible01:01
*** appprod0 has quit IRC01:02
*** sacharya has joined #openstack-ansible01:11
*** openstack has joined #openstack-ansible01:25
*** ManojK has quit IRC01:43
*** thorst has quit IRC01:43
*** thorst has joined #openstack-ansible01:44
*** daneyon has quit IRC01:50
*** thorst has quit IRC01:52
thetravso is anyone actually using the mitaka version of openstack-ansible?02:03
mcardenI have spun up a few Mitaka AIOs on cloud VMs.02:11
*** sacharya_ has joined #openstack-ansible02:26
*** appprod0 has joined #openstack-ansible02:27
*** sacharya has quit IRC02:28
*** raddaoui has joined #openstack-ansible02:32
thetravmcarden All In One?  using the openstack-ansible scripts?02:34
thetravI thought it was supposed to be all HA n stuff02:34
*** woodard has quit IRC02:37
*** woodard has joined #openstack-ansible02:39
*** woodard_ has joined #openstack-ansible02:41
mcardenthetrav: Yep. All In One via the scripts02:41
*** woodard has quit IRC02:42
thetravso you just have one host in the openstack_user_config.yml file?02:42
mcardenLots of hosts - mostly containers.02:43
thetravoh02:44
*** wadeholler has quit IRC02:44
thetravI thought the playbooks created containers?02:44
thetravor are you nesting them?02:44
*** wadeholler has joined #openstack-ansible02:44
thetravI don't suppose you'd let me have a peek at your openstack_user_config.yml would you?02:44
thetravI've been trying to deal with this file descriptor leak and not getting anywhere.  Wondering if I've set things up incorrectly02:45
mcardenSure. Let me get one.02:45
mcardenthetrav: http://paste.openstack.org/show/523879/02:47
*** woodard_ has quit IRC02:50
*** thorst has joined #openstack-ansible02:50
thetravcheers02:50
thetravlooks quite similar to the one I've got02:52
thetravmine doesn't have those affinity in them02:52
thetravalso mine has 3 hosts for each block where yours has only aio1 (except compute)02:53
thetravI wonder if I put mine back down to one host if it'll do better02:54
thetravalso, does affinity: something: 1 mean only one instance?02:54
thetravso for example you have a galera cluster with only a single galera instance?02:54
mcardenIIRC affinity 1 means two instances.02:55
thetravok... starting at zero or something?02:56
mcardenThere's doc about it somewhere...02:56
*** chandanc_ has joined #openstack-ansible02:57
*** thorst has quit IRC02:57
mcardenLooks like I'm wrong: http://docs.openstack.org/developer/openstack-ansible/install-guide/configure-initial.html#affinity02:58
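A rough way to reason about the affinity question, assuming (as the linked guide appears to say) that the value is a per-host container count and that a host with no affinity entry gets one container of each type. The function name and the sample host dictionaries are made up for illustration and are not part of openstack-ansible.

```python
def containers_of_type(hosts, container_type):
    """Count containers of one type across hosts.

    hosts maps a host name to its config dict, which may carry an
    'affinity' mapping. Assumption: a missing entry counts as 1.
    """
    return sum(
        cfg.get("affinity", {}).get(container_type, 1)
        for cfg in hosts.values()
    )


# Hypothetical layouts: one AIO host with affinity 1, and three plain hosts.
aio = {"aio1": {"ip": "172.29.236.100", "affinity": {"galera_container": 1}}}
three_hosts = {"infra1": {}, "infra2": {}, "infra3": {}}

single = containers_of_type(aio, "galera_container")
cluster = containers_of_type(three_hosts, "galera_container")
```

Under that assumption, `affinity: galera_container: 1` on a single host gives a one-member galera "cluster", while three hosts without any affinity give three members.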
cloudnullthetrav: so far I've not been able to recreate the issue.03:06
thetravcloudnull would it help if I posted my config?03:07
cloudnullI have built an env using 14 nodes + stable/mitaka03:07
cloudnullmaybe ?03:07
*** jamielennox is now known as jamielennox|away03:07
cloudnulldo you have the openstack_user_config.yml file handy ?03:07
*** jamielennox|away is now known as jamielennox03:07
thetravhttp://cdn.pasteraw.com/2457wjmllptyc6teg9ek4y0ftm0m5dt03:07
thetravyou are invoking the setup-hosts.yml ?03:08
thetravso what I'm trying now, is splitting setup-hosts.yml into its component includes03:09
thetravif I run one include at a time that ensures all file handles are released03:09
thetravI'm in work around territory03:09
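The workaround described here (running each include of setup-hosts.yml as its own ansible-playbook process so handles are released between runs) could be scripted roughly as below. The playbook names in `HOST_PLAYBOOKS` are an assumption about what setup-hosts.yml includes and should be checked against the actual checkout; the flags mirror the command line quoted elsewhere in this log.

```python
import subprocess

# Assumed contents of setup-hosts.yml; verify against your own tree.
HOST_PLAYBOOKS = [
    "openstack-hosts-setup.yml",
    "security-hardening.yml",
    "lxc-hosts-setup.yml",
    "lxc-containers-create.yml",
]


def build_command(playbook, forks=1):
    """Build the ansible-playbook argument list for one playbook."""
    return [
        "ansible-playbook", "-vvv",
        "-e", "@/etc/openstack_deploy/user_secrets.yml",
        "-e", "@/etc/openstack_deploy/user_variables.yml",
        "--forks={}".format(forks),
        playbook,
    ]


def run_in_sequence(playbooks, runner=subprocess.call):
    """Run each playbook in a fresh process and stop at the first failure.

    Returns (failed_playbook, return_code), or (None, 0) if all succeed.
    """
    for playbook in playbooks:
        rc = runner(build_command(playbook))
        if rc != 0:
            return playbook, rc
    return None, 0
```

Calling `run_in_sequence(HOST_PLAYBOOKS)` from the playbooks directory would start one process per playbook; the `runner` parameter exists only so the loop can be exercised without Ansible installed.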
*** sacharya_ has quit IRC03:20
*** sacharya has joined #openstack-ansible03:20
*** weezS has joined #openstack-ansible03:36
*** adrian_otto has joined #openstack-ansible03:42
cloudnullthetrav: The config looks just fine.03:42
cloudnullhow are the commands being invoked?03:43
cloudnullis it ansible automation invoking openstack-ansible ?03:44
mhaydenaww mkrish left03:44
mhaydenand i finally had time to talk ipv603:44
* mhayden is filled with sads03:45
cloudnullmkrish should be back in the AM i'd imagine.03:45
cloudnullthetrav: i just cant get it to explode.03:45
cloudnull:|03:45
thetravnohup /usr/local/bin/ansible-playbook -vvv -e @/etc/openstack_deploy/user_secrets.yml -e @/etc/openstack_deploy/user_variables.yml --forks=1 setup-hosts.yml &03:46
thetravthe initial failure used the openstack-ansible script, however I wanted more verbose output and fewer forks03:46
thetravso my cidr's are /24 instead of /22 (don't think it matters)03:47
thetravI have multiple hosts in all the infrastructure bits and there's no affinity03:47
thetravthat's about all I can think of03:48
thetravthe deploy_host is a small ubuntu node on openstack from the cloud_image03:48
thetravthe targets are ubuntu as installed by MaaS03:48
mhaydencloudnull: being on PDT is weird03:49
*** sacharya has quit IRC03:49
thetravif you watch the output of `lsof | grep '^ansible-p' | grep /dev/null | wc -l` while the build executes do you notice the number climbing?03:49
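For anyone who wants the same count from a script, the lsof pipeline above can be approximated in Python. This sketch only filters text that lsof has already produced; the sample lines are invented to show the shape of the input, and the real lsof columns may differ. As noted earlier in the log, lsof needs to run as root to see root-owned processes.

```python
def count_dev_null_handles(lsof_output, command_prefix="ansible-p"):
    """Count lsof lines for a command that reference /dev/null.

    Mirrors: lsof | grep '^ansible-p' | grep /dev/null | wc -l
    """
    return sum(
        1
        for line in lsof_output.splitlines()
        if line.startswith(command_prefix) and "/dev/null" in line
    )


# Invented sample, only to illustrate the filtering.
SAMPLE = """\
ansible-p 1234 root    0r   CHR    1,3      0t0   1029 /dev/null
ansible-p 1234 root    5w   CHR    1,3      0t0   1029 /dev/null
ansible-p 1234 root    6u  IPv4  99999      0t0    TCP host:ssh
sshd      4321 root    0u   CHR    1,3      0t0   1029 /dev/null
"""

open_handles = count_dev_null_handles(SAMPLE)
```

Sampling this number every few seconds during a run would show whether it stays flat or keeps climbing.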
*** sacharya has joined #openstack-ansible03:49
thetravI assume you're using ubuntu 14.04?03:49
cloudnullmhayden: have you seen a FD issue with the sec role by chance?03:50
mhaydenfile descriptor?03:50
cloudnullmhayden: BTW I love being on the Left coast :)03:51
mhaydenfloppy disk?03:51
cloudnulli miss my SF some times03:51
mhaydenit's chilly03:51
cloudnullthetrav: is having a file descriptor issue03:51
thetravI thought it was summer up there?  You guys should visit Melbourne if you want cold03:51
mhaydenit's like 53F in the mornings here03:52
mhayden11.7 C03:52
thetravyeah, we're getting down to 41F03:52
thetrav5C03:52
*** chandanc_ has quit IRC03:53
* thetrav waits for a Canadian to show him up03:53
*** bryan_att has quit IRC03:53
cloudnullThe coldest winter I ever saw was the summer I spent in San Francisco. -- Mark Twain03:53
*** chandanc_ has joined #openstack-ansible03:53
*** thorst has joined #openstack-ansible03:55
cloudnullthetrav:  so i've got two deploys going right now and 1, running master, has 0 open FDs and the other, running mitaka, has 17.03:57
cloudnullim poking it though03:58
thetravI tend to see a ton opened up in the security_hardening phase03:59
thetravany chance I can see your user_config and user_variables?03:59
thetravmy user_variables is all commented out except glance_default_store: file03:59
thetravbecause apparently if you have the file but no variables in it you get an exception04:00
*** albertcard has quit IRC04:00
*** thorst has quit IRC04:03
cloudnullthetrav:  sure . let me put that together.04:03
cloudnullthetrav: http://cdn.pasteraw.com/vuqdp84inzncp6xkji07luvyn3c3k604:06
cloudnullthats the collection of "cat openstack_user_config.yml conf.d/*.yml"04:07
openstackgerritMichael Carden proposed openstack/openstack-ansible: conditionally include the scsi_dh kernel module  https://review.openstack.org/33520204:13
*** Drago1 has joined #openstack-ansible04:13
*** Drago1 has joined #openstack-ansible04:14
*** zerda2 has joined #openstack-ansible04:15
*** Drago1 has quit IRC04:19
*** markvoelker has quit IRC04:31
openstackgerritKevin Carter (cloudnull) proposed openstack/openstack-ansible-lxc_hosts: Update the version of LXC installed to the latest stable  https://review.openstack.org/33530104:40
cloudnullthetrav: still beating on it, i've just not been able to make it die in a fire quite yet.04:41
thetravsorry, got pulled away, am just looking over the user config now04:41
cloudnullno worries.04:42
thetrav22 mask is bigger than 24 right?04:42
cloudnullyes04:42
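The /22 versus /24 comparison is easy to check with Python's standard ipaddress module. The 172.29.236.0 base address below is only an example value chosen for illustration, not taken from either config in this conversation.

```python
import ipaddress

# Example networks; only the prefix lengths matter for the comparison.
net22 = ipaddress.ip_network("172.29.236.0/22")
net24 = ipaddress.ip_network("172.29.236.0/24")

size22 = net22.num_addresses  # 2 ** (32 - 22)
size24 = net24.num_addresses  # 2 ** (32 - 24)
```

A shorter prefix leaves more host bits, so the /22 network holds four times as many addresses as the /24.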
thetravyour used ip ranges are bigger too04:42
thetravthe file looks like output from something rather than the thing you get when you follow the online instructions04:42
thetravall the notes not to write to this file04:43
cloudnullyes my env was built using https://github.com/cloudnull/osa-multi-node-aio04:43
cloudnullthe build scripts create those files.04:44
cloudnullthe end result is a 14 node deployment if you change nothing and just run it04:45
cloudnullwhere the file starts "---" is a new file.04:46
cloudnullI just break out the file into multiple ones instead of having them all in one big one.04:46
thetravok, so this is my equivalent: http://cdn.pasteraw.com/6c9jeuirth2vnq383itb1ds2g2zruqc04:49
thetravafter that I ssh into deploy_host and run the ansible-playbook setup-hosts.yml thinger04:50
*** adrian_otto has quit IRC04:50
*** adrian_otto has joined #openstack-ansible04:54
cloudnullthetrav: so this looks like the issue you're seeing https://github.com/ansible/ansible/issues/1518204:55
cloudnullwhich was an issue invoked through the api04:55
cloudnullbut in the end its a FD issue similar to what youre seeing.04:56
thetravfirst scan yes, looks very similar04:56
thetravthe /dev/null thing too04:56
cloudnullthere was never any follow up on it.04:56
thetravunfortunately it's a bit challenging to parse his issue04:57
thetravis that code a library or something?04:57
thetravoh04:57
cloudnullyea, its using the Ansible internals instead of the cli clients.04:57
thetravit's programmatically invoking ansible04:57
cloudnulljust for shits and grins, would you mind trying to run ansible installed from git using the stable1.9  branch ?04:58
cloudnullthere have been quite a few bug fixes that have gone in that were never part of a tag04:58
cloudnullmaybe helps?04:58
*** pcaruana has quit IRC04:58
* cloudnull still grasping at straws04:58
thetravso I've tried using the most recent 2.1 ansible04:58
thetravsame result04:59
cloudnullok04:59
cloudnullthen no.04:59
cloudnullare you running the command in screen or tmux ?04:59
thetravtmux05:00
thetravwell05:00
thetravssh + nohup05:00
*** sacharya has quit IRC05:00
cloudnullusing additional logging when the shell was invoked?05:00
thetravnot sure if that counts as what05:00
thetravI have -vvv switched on05:00
thetravnohup /usr/local/bin/ansible-playbook -vvv -e @/etc/openstack_deploy/user_secrets.yml -e @/etc/openstack_deploy/user_variables.yml --forks=1 setup-hosts.yml &05:00
thetravthat is how I invoke it05:00
*** thorst has joined #openstack-ansible05:01
cloudnullinteresting. so i think its the fact that its nohup.05:05
cloudnulli can make it happen like so05:06
cloudnullhttps://snag.gy/LuZa1f.jpg05:06
cloudnullw/out nohup, even backgrounding it, nope.05:06
thetrav?05:06
thetravso nohup may be causing it to fail?05:06
*** M00nr41n has quit IRC05:06
thetravwell that's surprising05:07
cloudnullthat is05:07
cloudnullwhen you had nohup in the command i never even thought to try that.05:07
*** thorst has quit IRC05:08
mcardenI'd have thought that using tmux would take away any need for nohup.05:08
thetravtmux = ?05:09
mcardenSorry, I thought you confirmed earlier using tmux05:09
thetravyeah I may have incorrectly assumed05:10
thetravok, right05:10
thetravno05:10
thetravtmux is a specific program05:10
thetravnot some fancy way of saying terminal emulation05:10
mcardenSorry. So I always use tmux for ssh to long running things.05:10
mcardenyep. apt-get install tmux05:11
thetravso tmux is pretty much a fancy version of screen?05:11
cloudnullyup05:11
mcardenYep05:11
thetravok, cool05:11
thetravare you able to do your thing using tmux and not get the fd issue?05:11
cloudnullgood read https://gist.github.com/MohamedAlaa/296105805:11
thetravcause that'd take away my need for nohup05:11
cloudnullI run my default shell in tmux05:11
cloudnullwhen I login to my servers, im in tmux05:11
thetravgonna infer the 'yes' from that response :D05:12
cloudnullyes. :)05:12
cloudnullyou could use screen too05:12
cloudnullif you're more familiar with that05:12
mcarden...but tmux is cooler. :)05:12
thetravI have never learned to use screen properly05:12
thetravI don't even know how to scroll back in it05:12
thetravthat's why I use nohup and tail -f05:13
thetravdoes tmux continue to operate when I disconnect?05:13
thetravsimilar to screen?05:13
cloudnullthis is a better cheatsheet https://tmuxcheatsheet.com/05:13
cloudnullyes05:13
*** adrian_otto has quit IRC05:13
thetravthanks05:14
mcardenIf you get disconnected, just ssh back in and 'tmux attach -t session-name' and you'll be where you were05:14
thetravrad05:16
thetravok, so I also just discovered that the deploy_host can't route to the container network ;P05:16
*** adrian_otto has joined #openstack-ansible05:16
thetravso I'm gonna need the guy who controls the palo alto to update the routes for me05:16
thetravonce that's set up I'll get back to you if I get success05:17
thetravreally hoping it works out.  Tired of supporting my own ansible playbooks05:18
thetravalso trove05:18
thetravmmmm troooove05:18
cloudnullso just to confirm, when I nohup ansible commands w/ lots of hosts it explodes quickly. https://asciinema.org/a/62w2uj7tfnqm1zy9tg6uyww5505:21
cloudnullIDK if the screen cast will be nice05:21
cloudnullbut that was my test case to see it die in a fire05:22
cloudnullthetrav: have you worked on trove before?05:24
cloudnullwe don't have a trove role.05:24
* cloudnull makes sure of that05:24
*** adrian_otto has quit IRC05:24
cloudnullbut it'd be nice to make it go05:24
*** adrian_otto has joined #openstack-ansible05:24
mcardenNice demo cloudnull.  I guess the take-away is "So don't do that"05:25
cloudnullYea. im not sure how to make ansible + nohup = happy, but i think the fix is "dont do that" :)05:26
thetravno, trove is new to me05:27
thetravI've bounced off it a couple of times05:27
thetravso it's not like, 100% new05:27
thetravbut I haven't got a working install of the service, nor have I built any of my own db images05:28
thetravI noticed in mitaka it got added to the apt-get repo and install docs however05:28
thetravso that gives me hope that I can make it happen this time05:28
cloudnullwell if its something you're interested in working on it'd be great to get a role together for it.05:29
cloudnulland from what I just read you're the expert =)05:29
thetravheheh05:29
thetravif I can make a contribution I will do so05:29
cloudnullcool . so im going to bed.05:29
cloudnullgood chat though05:30
thetravcheers, sleep well05:30
cloudnullthetrav: if you get a moment to have a look at the launchpad issue raised and comment/close it I'd appreciate it.05:30
cloudnullmcarden: cheers brother ttyl05:30
cloudnullnight all.05:30
mcardencya cloudnull05:31
*** markvoelker has joined #openstack-ansible05:32
mcardenthetrav: If you do end up interested in a trove role, here's the 'getting going' guide for role development: http://docs.openstack.org/developer/openstack-ansible/developer-docs/additional-roles.html#role-development-maturity05:36
*** markvoelker has quit IRC05:37
*** adrian_otto has quit IRC05:44
*** McMurlock1 has joined #openstack-ansible05:45
*** McMurlock1 has quit IRC05:49
*** javeriak has joined #openstack-ansible06:02
*** appprod0 has quit IRC06:03
*** thorst has joined #openstack-ansible06:05
*** chhavi has joined #openstack-ansible06:07
chhaviHi all, am facing issue while accessing the VM using the VNC console06:08
*** M00nr41n has joined #openstack-ansible06:08
*** karimb has joined #openstack-ansible06:10
chhaviit's not accepting any keyboard input, does openstack-ansible block any ports?06:11
*** thorst has quit IRC06:12
*** pcaruana has joined #openstack-ansible06:16
*** pcaruana is now known as pcaruana|afk|06:19
*** M00nr41n has quit IRC06:22
*** M00nr41n has joined #openstack-ansible06:23
*** deadnull has quit IRC06:28
*** markvoelker has joined #openstack-ansible06:33
*** M00nr41n has quit IRC06:33
*** M00nr41n has joined #openstack-ansible06:34
*** karimb has quit IRC06:36
*** markvoelker has quit IRC06:37
*** bootsha has joined #openstack-ansible06:38
*** weezS has quit IRC06:39
*** deadnull has joined #openstack-ansible06:46
*** pcaruana|afk| is now known as pcaruana06:49
*** raddaoui has quit IRC06:57
*** appprod0 has joined #openstack-ansible07:00
*** appprod0 has quit IRC07:05
*** jiteka has joined #openstack-ansible07:08
*** thorst has joined #openstack-ansible07:10
*** chhavi has quit IRC07:12
*** tlbr has quit IRC07:17
*** thorst has quit IRC07:17
*** tlbr has joined #openstack-ansible07:19
*** chhavi has joined #openstack-ansible07:20
*** karimb has joined #openstack-ansible07:26
*** bootsha has quit IRC07:28
*** jhyang has joined #openstack-ansible07:30
*** bootsha has joined #openstack-ansible07:31
*** jhyang is now known as derekjhyang07:32
*** markvoelker has joined #openstack-ansible07:34
*** markvoelker has quit IRC07:38
*** vnogin_ has joined #openstack-ansible07:46
*** vnogin has quit IRC07:46
vnogin_good morning07:47
evrardjpmorning everyone07:58
*** tlbr has quit IRC08:00
javeriakmorning evrardjp08:01
javeriakim having some ansible ssh problems, it just wont reach the targets, while a direct ssh to that IP works, any idea what could be wrong08:02
*** karimb has quit IRC08:02
javeriaktried debug mode as well, all it says at the end is FAILED => SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh08:02
*** tlbr has joined #openstack-ansible08:03
*** neilus has joined #openstack-ansible08:04
*** tlbr has quit IRC08:04
*** tlbr has joined #openstack-ansible08:07
*** bootsha has quit IRC08:12
*** thorst has joined #openstack-ansible08:15
*** thetrav has quit IRC08:17
ionijaveriak, make sure that the host you are running ansible, can also connect on the network assigned for containers08:19
*** thorst has quit IRC08:23
javeriakioni yes ssh works otherwise08:25
*** admin0 has joined #openstack-ansible08:26
*** chandanc_ has quit IRC08:27
*** admin0 has quit IRC08:29
*** electrofelix|afk is now known as electrofelix08:31
*** karimb has joined #openstack-ansible08:31
*** bootsha has joined #openstack-ansible08:31
*** markvoelker has joined #openstack-ansible08:34
*** chandanc_ has joined #openstack-ansible08:35
kamszwhat is the proper way to restart the containers on infra?08:36
kamszlxc-system-manage containers-restart?08:36
*** markvoelker has quit IRC08:38
*** asettle has joined #openstack-ansible08:46
evrardjpjaveriak: interesting09:01
evrardjpjaveriak: which version of ansible are you using, which connection plugin?09:01
evrardjpdid you try to have the triple v ?09:01
*** appprod0 has joined #openstack-ansible09:01
evrardjpI mean something like "openstack-ansible playbook.yml -vvv"09:02
javeriakevrardjp yes tried with vvv, the last message is that FAIL SSH error09:02
javeriakansible should be old, whatever was pinned with OSA kilo09:02
evrardjpkamsz: go to your deploy nodes, and do a good ansible -m shell -a reboot <the container group you want>09:02
evrardjppay attention to what you want, some (like galera and rabbit) don't like that09:03
*** bootsha has quit IRC09:03
evrardjpjaveriak: more and more interesting if it's not a network issue09:04
kamszevrardjp: yeah, i've noticed that galera doesn't like it :p09:04
evrardjpcould you show what's in the -vvv ?09:04
evrardjpkamsz: there is doc for that in the operations guide on the docs09:04
evrardjpnot only the error I mean09:04
evrardjpjaveriak: ^09:04
kamszevrardjp: for cluster recovery? yeah, i've followed it and got galera up and running09:05
*** appprod0 has quit IRC09:06
javeriakevrardjp :   the start has the ansible trace and then later i tried running the ssh command directly http://paste.ubuntu.com/18086570/09:10
javeriakso looks like it might not be ansible09:10
*** berendt has quit IRC09:12
*** bootsha has joined #openstack-ansible09:15
evrardjpdefinitely ssh09:16
evrardjpand a task issue09:17
evrardjpwhat's your ansible.cfg ?09:20
javeriakevrardjp yes, but i cant quite make sense of this trace, it has no apparent error, a simple "ssh <IP>" works, just not with these options09:20
*** thorst has joined #openstack-ansible09:20
javeriakthe ansible.cfg is default https://github.com/openstack/openstack-ansible/blob/kilo/playbooks/ansible.cfg09:22
evrardjpthe command asked should be fine because it's ping module09:23
evrardjpnothing fancy in the call itself09:23
evrardjpthe key exchange look fine, and the authentication seems fine09:23
evrardjpbut then you start your command, and don't get the info until you arrive on ControlPersist timeout09:24
evrardjpcould you try this ?09:25
evrardjp /bin/sh -c LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python09:25
evrardjpdirectly on the connected node09:25
evrardjp(infra1 here)09:25
evrardjpand see if it's fast or not09:25
evrardjphint: according to your trace, it should09:25
evrardjpalso you have to ssh 10.100.1.209:26
evrardjpfor consistency09:26
*** thorst has quit IRC09:28
javeriakevrardjp ive lost access to the system for now, so im just looking for leads, will give that a try in the morning09:29
evrardjpyou could have an intermittent network issue09:30
evrardjpdon't hesitate to try different connection plugins and different ssh connection configurations09:30
javeriakso i should ssh into 10.100.1.2 first and then run /bin/sh -c LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python ?09:30
evrardjpyes, this is like the first basic test09:31
javeriaklike paramiko? i've never tried that though09:31
javeriakso what would we be testing with that command?09:31
evrardjpit should be instantaneous to connect and instant to return to your host with the command09:31
*** admin0 has joined #openstack-ansible09:31
evrardjpjust to make sure the ping module could work and answer directly09:32
evrardjpif you have an answer directly, it means it's not the cause of your timeout09:33
evrardjpI strongly suspect this will not be the case09:33
admin0automagically: here ?09:36
javeriakevrardjp  im not quite following sorry; by the ping module do you mean the ansible ping? we know that doesnt work09:36
evrardjpjaveriak: I'll try to explain better09:36
evrardjpby manually sshing and typing the command, you manually do the same things the ping module does09:37
evrardjphowever, you'll be using an interactive session09:37
evrardjpyou can have more feedback to it09:37
evrardjpwith it09:38
evrardjpso by doing this, you can first see timings09:38
evrardjp(understand how long does it take to connect and to execute this command)09:38
evrardjpif the command executes instantly it's fine, it's the working behavior09:38
evrardjpif not, you have python issues09:39
evrardjpif the ssh connection takes long, you may have network issues, etc09:39
evrardjp(maybe check the ssh server for UseDNS and things like this)09:39
evrardjpevery ssh setting you've defined as applying for the nodes you're connecting to (in your ~/.ssh/config) will apply to your interactive session, so you'll maybe discover things there09:40
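The timing check described above can be sketched in Python 3: measure how long a command takes and compare the local interpreter start-up with the same thing run over ssh. `time_command` is a hypothetical helper; the ssh variant in the comment reuses the address and interpreter path mentioned in this conversation and is left commented out because it needs that host to be reachable.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd with output discarded; return (return_code, seconds_taken)."""
    start = time.perf_counter()
    rc = subprocess.call(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return rc, time.perf_counter() - start


# Baseline: how long a bare Python start-up takes on this machine.
rc, elapsed = time_command([sys.executable, "-c", "pass"])

# Remote variant (assumes key-based ssh access to the target):
# time_command(["ssh", "10.100.1.2", "/usr/bin/python", "-c", "pass"])
```

If the remote variant is slow while the baseline is fast, the delay is in the connection (network, DNS lookups, ssh settings) rather than in Python on the target.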
javeriakevrardjp ah yes i get it now; but once i've ssh'd in, wouldnt i be in that machine directly, unless i try executing that command through ssh09:40
*** bootsha has quit IRC09:45
*** vnogin_ has quit IRC09:50
*** karimb has quit IRC09:52
chhavidoes the default openstack-ansible novnc configuration block non-SSL connections while accessing the VNC console?09:53
*** vnogin has joined #openstack-ansible09:55
*** berendt has joined #openstack-ansible09:59
*** admin0 has quit IRC09:59
*** karimb has joined #openstack-ansible10:01
openstackgerritJirayut Nimsaeng proposed openstack/openstack-ansible-os_horizon: Make horizon to use policy as the same as other projects  https://review.openstack.org/33039710:04
winggundamthevrardjp: https://review.openstack.org/#/c/333770/ I tried to recheck. Is it works?10:05
*** Andrew_jedi has joined #openstack-ansible10:07
evrardjpjaveriak: you'd be in that machine, but you'll already "feel" how the machine is behaving10:07
evrardjplike you get connection drops after 60 seconds etc10:08
evrardjpit's just for you to better understand10:08
Andrew_jediHello folks, Do we have any scripts in OSA to recover rabbitmq cluster from a network partition?10:08
evrardjpwinggundamth: I'll check10:08
winggundamthevrardjp: thanks10:08
evrardjpglance issue10:10
evrardjpI'll do one more recheck, but we should maybe focus on the code now10:11
evrardjpI'll mark it in my starred list, and I'll review it as soon as I can10:11
winggundamthevrardjp: which code? mine or gate?10:11
evrardjpyours10:11
winggundamthevrardjp: okay10:11
evrardjpmaking sure that's what we really want10:11
evrardjpcompared to the blueprint that passed long ago etc10:12
*** javeriak has quit IRC10:12
*** chhavi has quit IRC10:15
*** javeriak has joined #openstack-ansible10:17
javeriakevrardjp okay.. and worst case, if all works and is also responsive enough directly, any idea what else i could try...10:18
evrardjpI'm pretty sure you have net drops10:18
evrardjpbut yes, move to paramiko as I said, or extend the control master10:19
evrardjpare you sure there is no weird net stuff happening on this host10:19
evrardjp?10:19
evrardjpdid you check the system logs ?10:19
evrardjpjust in case10:19
evrardjp:p10:19
evrardjpwinggundamth: don't hesitate to ask others to review too10:20
javeriaki was hoping it wouldnt come to that :P10:20
javeriakthe auth logs are oddly empty10:20
*** johnmilton has quit IRC10:20
evrardjplastlog is empty too ?10:20
javeriakbtw could this be relevant: https://github.com/ansible/ansible/issues/1340110:20
javeriakim on ansible 1.9 though10:20
javeriakdidnt check lastlog10:21
evrardjpit's not really relevant here10:22
evrardjphowever, like I said earlier, adapting the ssh config and ansible.cfg ssh config could be useful10:23
evrardjplike not using pipelining etc10:23
evrardjpbut I strongly doubt this is the cause of your issues10:23
evrardjpmost of the time it's network or host issue10:24
*** thorst has joined #openstack-ansible10:25
javeriakyea probably10:28
javeriakwell lets see, thanks for your help10:28
*** thorst has quit IRC10:33
*** bootsha has joined #openstack-ansible10:35
*** markvoelker has joined #openstack-ansible10:36
*** bootsha has quit IRC10:38
*** bootsha has joined #openstack-ansible10:39
*** markvoelker has quit IRC10:40
evrardjpanytime10:46
*** chhavi has joined #openstack-ansible10:47
*** bootsha has quit IRC10:51
*** bootsha has joined #openstack-ansible10:53
*** javeriak has quit IRC10:56
*** neilus1 has joined #openstack-ansible10:59
*** johnmilton has joined #openstack-ansible11:01
*** neilus2 has joined #openstack-ansible11:01
*** chandanc_ has quit IRC11:02
*** appprod0 has joined #openstack-ansible11:02
*** neilus has quit IRC11:03
*** neilus1 has quit IRC11:03
*** smatzek has joined #openstack-ansible11:05
*** appprod0 has quit IRC11:07
*** asettle has quit IRC11:26
*** asettle has joined #openstack-ansible11:29
*** bootsha has quit IRC11:29
*** v1k0d3n has joined #openstack-ansible11:29
*** bootsha has joined #openstack-ansible11:30
*** bootsha has quit IRC11:35
*** javeriak has joined #openstack-ansible11:36
*** markvoelker has joined #openstack-ansible11:37
*** McMurlock1 has joined #openstack-ansible11:40
*** markvoelker has quit IRC11:41
*** thorst has joined #openstack-ansible11:42
*** deverter has joined #openstack-ansible11:43
*** weshay has joined #openstack-ansible11:46
Andrew_jediFolks, any idea about this error11:49
Andrew_jedi(item={'src': u'/etc/rabbitmq/rabbitmq.pem', 'name': 'rabbitmq_ssl_cert', 'file_mode': '0640'}) => {"attempts": 5, "err": "Memcache key not found", "failed": true, "item": {"file_mode": "0640", "name": "rabbitmq_ssl_cert", "src": "/etc/rabbitmq/rabbitmq.pem"}, "rc": 1}11:49
*** GMAzrael has joined #openstack-ansible11:57
*** prometheanfire has quit IRC12:03
*** prometheanfire has joined #openstack-ansible12:03
*** neilus has joined #openstack-ansible12:07
*** neilus has quit IRC12:08
*** markvoelker has joined #openstack-ansible12:08
*** karimb has quit IRC12:08
*** neilus has joined #openstack-ansible12:09
*** neilus2 has quit IRC12:10
*** admin0 has joined #openstack-ansible12:12
*** bootsha has joined #openstack-ansible12:14
*** neilus has quit IRC12:16
Andrew_jedicloudnull odyssey4me ^^12:17
Andrew_jediI am running a Kilo setup.12:18
*** psilvad has joined #openstack-ansible12:21
mgariepygood morning everyone12:25
*** psilvad has quit IRC12:29
*** automagically has quit IRC12:29
*** automagically has joined #openstack-ansible12:30
*** neilus has joined #openstack-ansible12:32
*** v1k0d3n has quit IRC12:35
*** woodard has joined #openstack-ansible12:36
*** aernhart has joined #openstack-ansible12:40
*** ManojK has joined #openstack-ansible12:45
*** GMAzrael has quit IRC12:47
*** psilvad has joined #openstack-ansible12:48
*** karimb has joined #openstack-ansible12:48
*** ManojK has quit IRC12:55
*** zerda2 has quit IRC12:58
*** TxGirlGeek has joined #openstack-ansible13:00
*** appprod0 has joined #openstack-ansible13:03
*** messy has joined #openstack-ansible13:04
*** appprod0 has quit IRC13:08
*** ManojK has joined #openstack-ansible13:10
*** deverter has quit IRC13:10
*** M00nr41n has quit IRC13:11
alextricity25Andrew_jedi: could you give more detail? What task were you running? Could you possibly send a paste of the entire task when it failed?13:13
*** javeriak has quit IRC13:14
automagicallymorning all13:15
Andrew_jedialextricity25: I was trying to reinstall rabbitmq cluster. http://paste.openstack.org/show/524038/13:16
*** sdake has joined #openstack-ansible13:17
alextricity25Andrew_jedi: At first glance it looks to me like the rabbitmq_ssl_cert key is no longer in the memcache server. It probably expired from there and was removed. Let me see if I can find a way to refresh what's stored in memcache13:19
Andrew_jedialextricity25: Thanks, i have been searching for it for sometime. Couldn't find anything :/13:19
*** gregfaust has joined #openstack-ansible13:21
*** sdake_ has joined #openstack-ansible13:22
*** sdake has quit IRC13:22
alextricity25Andrew_jedi:  Do you think it might be this task that's failing? https://github.com/openstack/openstack-ansible/blob/kilo/playbooks/roles/rabbitmq_server/tasks/rabbitmq_ssl_key_store.yml#L16-L3113:22
alextricity25Andrew_jedi:  or maybe this one? https://github.com/openstack/openstack-ansible/blob/kilo/playbooks/roles/rabbitmq_server/tasks/rabbitmq_ssl_key_distribute.yml#L16-L3213:24
Andrew_jedialextricity25: Possibly yes, i am trying to implement this fix for the issue https://github.com/openstack/openstack-ansible-rabbitmq_server/blob/c9773b9d9c85dbec0422839829a9dedbd07991d0/tasks/rabbitmq_ssl_key_distribute.yml13:24
Andrew_jedialextricity25: Looks like it worked ...13:24
alextricity25Andrew_jedi: awesome!13:25
Andrew_jediAndrew_jedi: New error :/,13:26
Andrew_jedialextricity25: New error,13:26
Andrew_jedifailed: [controller3_rabbit_mq_container-48ecc3b2] => {"cmd": "/usr/sbin/rabbitmqctl -q -n rabbit add_user openstack a414b123025ff04c84fd", "failed": true, "rc": 2}13:26
Andrew_jedistderr: Error: user_already_exists: openstack13:26
*** sdake has joined #openstack-ansible13:26
*** sdake_ has quit IRC13:27
alextricity25Andrew_jedi: what task is that?13:28
Andrew_jedialextricity25: TASK: [rabbitmq_server | Ensure rabbitmq user]13:28
*** KLevenstein has joined #openstack-ansible13:29
Andrew_jedialextricity25: fixed13:30
alextricity25Andrew_jedi: That's strange...the rabbitmq ansible module should skip it13:30
cloudnullmorning13:30
alextricity25Andrew_jedi: oh good13:30
alextricity25good morning cloudnull13:30
Andrew_jedicloudnull: good morning, saw your pics on twitter, nice place +113:31
*** deverter has joined #openstack-ansible13:31
*** deverter has quit IRC13:34
*** deverter has joined #openstack-ansible13:34
*** karimb has quit IRC13:36
*** ManojK has quit IRC13:38
*** TxGirlGeek has quit IRC13:38
*** karimb has joined #openstack-ansible13:38
*** TxGirlGeek has joined #openstack-ansible13:38
*** TxGirlGeek has quit IRC13:40
*** ManojK has joined #openstack-ansible13:40
*** TxGirlGeek has joined #openstack-ansible13:40
automagicallyAny cores available to review https://review.openstack.org/334506 ?13:52
andymccrlgtm13:53
automagicallythx andymccr13:54
automagicallyCan you +2. I just see a +w andymccr13:54
automagicallyIf you are so inclined13:54
*** raddaoui has joined #openstack-ansible13:54
*** jiteka has quit IRC13:54
*** ametts has joined #openstack-ansible13:56
andymccrautomagically: sure, it should still gate/merge either way though.13:57
*** jthorne has joined #openstack-ansible13:57
automagicallyAh, I thought it needed 2 +2 and a +w13:57
automagicallyThanks again13:57
*** michaelgugino has joined #openstack-ansible13:58
*** ajo_ has joined #openstack-ansible14:00
*** v1k0d3n has joined #openstack-ansible14:02
*** ajo_ has quit IRC14:03
*** bootsha has quit IRC14:03
*** ajo_ has joined #openstack-ansible14:04
*** jayc has joined #openstack-ansible14:04
*** asettle has quit IRC14:05
*** ajo_ has quit IRC14:07
*** ajo_ has joined #openstack-ansible14:07
cloudnullAndrew_jedi: sorry was afk a min. thanks it was fun to be away :)14:08
cloudnullautomagically:  looking now14:08
automagicallycloudnull: don’t bother, its on its way14:09
automagicallyAppreciate it tho14:09
* cloudnull neverminding14:09
*** ajo_ has quit IRC14:09
*** TxGirlGeek has quit IRC14:11
*** TxGirlGeek has joined #openstack-ansible14:11
*** TxGirlGeek has quit IRC14:14
*** TxGirlGeek has joined #openstack-ansible14:14
openstackgerritKevin Carter (cloudnull) proposed openstack/openstack-ansible-openstack_hosts: Updated the hostname generation  https://review.openstack.org/32350414:15
*** sdake has quit IRC14:16
*** jmckind has joined #openstack-ansible14:16
*** sdake has joined #openstack-ansible14:17
*** sdake has quit IRC14:17
*** klamath has joined #openstack-ansible14:19
*** sdake has joined #openstack-ansible14:19
*** TxGirlGeek has quit IRC14:20
openstackgerritMatt Dorn proposed openstack/openstack-ansible-openstack_hosts: Add linux-image-extra-virtual to host packages  https://review.openstack.org/33552514:21
*** sc68cal has quit IRC14:23
*** spotz_zzz is now known as spotz14:24
*** sc68cal has joined #openstack-ansible14:25
*** jorge_munoz has joined #openstack-ansible14:32
*** jiteka has joined #openstack-ansible14:40
*** Mudpuppy has joined #openstack-ansible14:41
*** TxGirlGeek has joined #openstack-ansible14:45
*** karimb has quit IRC14:46
*** kstev has joined #openstack-ansible14:47
*** jorge_munoz_ has joined #openstack-ansible14:48
*** jorge_munoz has quit IRC14:48
*** jorge_munoz_ is now known as jorge_munoz14:48
*** eil397 has joined #openstack-ansible14:48
Andrew_jedicloudnull: Do we have any script in osa to deal with network partitions, like this, https://gist.github.com/niedbalski/aceba280b0365bdff46f#file-partition-recover-rabbitmq-py14:52
* cloudnull looking14:53
cloudnullAndrew_jedi:  no nothing in tree that im aware of14:54
cloudnullthough that looks useful .14:54
Andrew_jediok, thanks!14:54
automagicallyContribute to https://github.com/openstack/openstack-ansible-ops Andrew_jedi14:54
cloudnull+114:54
*** berendt has quit IRC14:56
*** karimb has joined #openstack-ansible14:58
evrardjpAndrew_jedi: we still have some kind of mention of it in a template IIRC15:01
evrardjpthe rabbitmq.config.j215:01
evrardjpdon't know what you need or what you are talking about, just remembering that partitions were something I saw15:01
evrardjpbut then I guess it's probably the first steps to ops15:02
*** jiteka has quit IRC15:03
*** appprod0 has joined #openstack-ansible15:04
*** chhavi has quit IRC15:06
*** neilus has quit IRC15:07
*** cloader89 has joined #openstack-ansible15:07
*** cloader89 has quit IRC15:08
*** appprod0 has quit IRC15:08
*** aernhart has quit IRC15:08
*** alan__ has joined #openstack-ansible15:08
Andrew_jedievrardjp: I had power failure yesterday, and as a result i have to deal with the network partition. Rabbitmq cluster got screwed. Still trying to fix it.15:08
*** cloader89 has joined #openstack-ansible15:09
*** sacharya has joined #openstack-ansible15:09
*** weezS has joined #openstack-ansible15:10
openstackgerritKevin Carter (cloudnull) proposed openstack/openstack-ansible-openstack_hosts: Updated the hostname generation  https://review.openstack.org/32350415:10
*** daneyon has joined #openstack-ansible15:11
evrardjpat some point it's easier to rebuild your cluster from scratch and rerun the os playbooks15:12
evrardjp:D15:12
*** eil397 has quit IRC15:12
evrardjpI mean it's just rabbitmq queues15:12
cloudnullAndrew_jedi: can you stop then start the cluster to reset the partitioning  ?15:12
evrardjpI guess he already tried that15:13
Andrew_jedicloudnull: Tried that, in fact restarted the entire setup, and then finally rebuilt the rabbitmq cluster, but still facing network partition :/15:13
cloudnull:'(15:14
evrardjpthat's weird15:14
evrardjprebuilt from new containers or from existing ones ?15:14
*** TxGirlGeek has quit IRC15:14
Andrew_jedicloudnull: And to make the matter worse, this is a production setup15:14
evrardjpoh15:14
*** TxGirlGeek has joined #openstack-ansible15:14
cloudnullAndrew_jedi: kilo ?15:15
Andrew_jedievrardjp: existing ones, stop-destroy-and recreate15:15
evrardjpso complete process for the rabbit recreate15:15
openstackgerritNolan Brubaker proposed openstack/openstack-ansible: Use in-tree env.d files, provide override support  https://review.openstack.org/33259515:15
Andrew_jedicloudnull: yep, scheduled for upgrade to Liberty in August.15:15
cloudnullhave you tried setting https://github.com/openstack/openstack-ansible/blob/kilo/playbooks/roles/rabbitmq_server/defaults/main.yml#L47 ?15:15
cloudnullto autoheal ?15:15
*** sacharya_ has joined #openstack-ansible15:16
Andrew_jedicloudnull: nope, let me try that.15:16
Andrew_jedievrardjp: Yep!15:16
*** catintheroof has joined #openstack-ansible15:16
*** pcaruana has quit IRC15:17
cloudnullsetting "rabbitmq_cluster_partition_handling: autoheal" in user_variables.yml and rerunning `openstack-ansible rabbitmq-install.yml --tags rabbitmq-config` should drop the needed config and restart the app.15:17
evrardjpthat variable is indeed used in the file I told earlier15:17
evrardjpyou can directly give it in CLI15:17
evrardjpopenstack-ansible rabbitmq-install.yml -e rabbitmq_cluster_partition_handling=autoheal15:18
evrardjpnot sure about what it does though, I'm no rabbit expert15:18
*** sacharya has quit IRC15:18
cloudnullAndrew_jedi: in reading https://www.rabbitmq.com/partitions.html it looks like autoheal is the way to go when dealing with network issues. It's the most aggressive way of recovering from partitioning; however, it should do the trick.15:21
Andrew_jedicloudnull: thanks, fingers crossed.15:21
cloudnull**residual network issues caused by a major outage.15:21
* cloudnull grabbing coffee, back in a min15:22
*** eil397 has joined #openstack-ansible15:26
*** chandanc_ has joined #openstack-ansible15:27
evrardjpThis kind of hands on experience deserves docs IMP15:29
evrardjpIMO*15:29
*** v1k0d3n has quit IRC15:29
*** asettle has joined #openstack-ansible15:29
*** eil397 has quit IRC15:30
Andrew_jedicloudnull evrardjp : No cigars, http://paste.openstack.org/show/524089/15:32
*** pcaruana has joined #openstack-ansible15:32
*** eil397 has joined #openstack-ansible15:33
cloudnullis infra3 the only partitioned node?15:33
cloudnullwhat does: rabbitmqctl cluster_status show?15:34
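For readers following along, the check being asked for here can be sketched in a few lines of Python. This is only an illustration: the sample output below is modelled on the Erlang-term format that older rabbitmqctl releases print, the node names are hypothetical, and `has_partitions` is a made-up helper, not anything shipped with OSA or RabbitMQ.

```python
import re

# Illustrative output only (assumption): Erlang-term style as printed by
# older rabbitmqctl releases. Node names are hypothetical.
HEALTHY = """\
Cluster status of node rabbit@node1 ...
[{nodes,[{disc,[rabbit@node1,rabbit@node2]}]},
 {running_nodes,[rabbit@node2,rabbit@node1]},
 {partitions,[]}]
"""

PARTITIONED = """\
Cluster status of node rabbit@node1 ...
[{nodes,[{disc,[rabbit@node1,rabbit@node2]}]},
 {running_nodes,[rabbit@node1]},
 {partitions,[{rabbit@node1,[rabbit@node2]}]}]
"""


def has_partitions(status_text):
    """Return True if the partitions list in the status text is non-empty."""
    # Capture whatever sits between "{partitions,[" and the first "]}".
    match = re.search(r"\{partitions,\s*\[(.*?)\]\}", status_text, re.DOTALL)
    return bool(match and match.group(1).strip())


if __name__ == "__main__":
    print("healthy sample partitioned?", has_partitions(HEALTHY))
    print("partitioned sample partitioned?", has_partitions(PARTITIONED))
```

The helper only answers yes or no; it does not try to parse which nodes are on which side of the split.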
Andrew_jedicloudnull: only controller115:35
Andrew_jedicloudnull: pasting the output now15:35
*** michaelgugino has quit IRC15:36
Andrew_jedicloudnull: http://paste.openstack.org/show/524091/15:38
*** ManojK has quit IRC15:38
evrardjprabbitmqctl status on node 1?15:39
cloudnullidk if you've tried this, but can you log in to the misbehaving node "controller1_rabbit_mq_container-2d645e7f" and run `rabbitmqctl stop_app; rabbitmqctl reset; rabbitmqctl join_cluster rabbit@controller2_rabbit_mq_container-8b157b02; rabbitmqctl start_app;`15:40
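The rejoin sequence suggested above can be written down as an ordered list of steps. The sketch below is a minimal illustration, assuming it runs on the misbehaving node itself; `rejoin_commands` and `rejoin` are hypothetical helper names, and by default nothing is executed, the commands are only printed.

```python
def rejoin_commands(peer_node):
    """Ordered rabbitmqctl steps to reset this node and join it to peer_node."""
    return [
        ["rabbitmqctl", "stop_app"],
        ["rabbitmqctl", "reset"],
        ["rabbitmqctl", "join_cluster", peer_node],
        ["rabbitmqctl", "start_app"],
    ]


def rejoin(peer_node, run=None):
    """Run each step in order; with run=None this is a dry run that prints.

    Pass run=subprocess.check_call to execute for real. check_call raises on
    a non-zero exit, so a failed join_cluster is not followed by start_app.
    """
    for cmd in rejoin_commands(peer_node):
        if run is None:
            print(" ".join(cmd))
        else:
            run(cmd)


if __name__ == "__main__":
    # Peer name taken from the chat above; adjust for your own cluster.
    rejoin("rabbit@controller2_rabbit_mq_container-8b157b02")
```

One difference from the one-liner in the chat: commands chained with `;` all run even if an earlier one fails, whereas this version stops at the first failing step when given a raising runner.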
*** ManojK has joined #openstack-ansible15:40
evrardjpstatus will show the app running15:40
evrardjpplease do that before :D15:40
evrardjpfor pasting purposes15:40
evrardjpIIRC15:41
Andrew_jedicloudnull: Yes, i did.  But it will fail to join the cluster.15:41
cloudnullsame reason ?15:41
Andrew_jedievrardjp: http://paste.openstack.org/show/524092/, No rabbit app active15:41
Andrew_jedicloudnull: let me show you15:41
*** mummer has joined #openstack-ansible15:42
evrardjprm -rf /var/lib/rabbitmq/mnesia15:43
cloudnullif that continues to fail you could try destroying the node and rebuilding it to see if it'll rejoin. `openstack-ansible lxc-containers-destroy.yml lxc-containers-create.yml --limit controller1_rabbit_mq_container-2d645e7f; openstack-ansible rabbitmq-install.yml`15:43
Andrew_jedicloudnull: http://paste.openstack.org/show/524094/15:45
Andrew_jedievrardjp: i thought rabbitmqctl reset will delete the mnesia15:45
*** Drago has joined #openstack-ansible15:46
*** phalmos has joined #openstack-ansible15:46
cloudnullAndrew_jedi: can you ping the nodes in the cluster using the hostname from the misbehaving one?15:47
Andrew_jedicloudnull: Yes15:47
evrardjphe has a fundamental mnesia problem I think, the process isn't listed in its erlang vm15:48
evrardjpthat's why I tried to trash the folder15:48
eil397good morning everyone15:48
mrhillsmanyo15:48
cloudnullyou try to reset it using the erl commands15:48
evrardjpgood morning15:48
mrhillsmanhttps://bugs.launchpad.net/openstack-ansible/+bug/159741015:48
openstackLaunchpad bug 1597410 in openstack-ansible "manual upgrade: memcached flush fails" [Undecided,New]15:48
cloudnullmorning eil39715:48
mrhillsmanthis is related to upgrade again cloudnull evrardjp15:48
Andrew_jedicloudnull: http://paste.openstack.org/show/524095/15:48
cloudnullyo mrhillsman15:48
mrhillsmani wish i knew how to contribute :(15:49
mrhillsmani put the change i made to make it work in the description15:49
Andrew_jedievrardjp: trashing it15:49
Andrew_jedievrardjp: Done15:49
cloudnullAndrew_jedi: try: erl -sname "rabbit@controller1_rabbit_mq_container-2d645e7f" -mnesia dir15:50
evrardjprestart rabbit and check status15:50
cloudnullthat will drop you into a shell15:50
cloudnullif it can connect15:50
evrardjpyes that's even better15:50
evrardjpand then check mnesia info15:50
evrardjpgood idea cloudnull15:50
cloudnullthen: mnesia:delete_schema(['rabbit@controller1_rabbit_mq_container-2d645e7f'])15:50
*** adrian_otto has joined #openstack-ansible15:50
eil397mrhillsman: you have issue with sending commit on review ?15:51
cloudnulland try to stop_app, join_cluster, start_app15:51
evrardjpthat should be done by removing the folder completely15:51
cloudnull++15:51
cloudnullthats very true15:51
cloudnullrm is the hammer way :)15:51
evrardjpI tried the bazooka approach15:51
evrardjpyes15:51
evrardjp:D15:51
evrardjpthat's rabbit15:51
evrardjpwho cares15:51
evrardjpeventually consistent, right ?15:52
evrardjpanyway, rabbitmqctl status should give you at least the pid of the mnesia process15:52
cloudnullmrhillsman: for that to fail like so it would mean the entire cluster was unreachable ?15:52
*** alan__ has quit IRC15:52
cloudnullwas memcached running ?15:53
cloudnullon any of the nodes?15:53
mrhillsmanthe issue is when hostname has -l in it15:53
mrhillsmanthe regex checks memcached.conf for -l15:53
evrardjpcloudnull: I think it's a problem with parsing the config of memcached15:53
mrhillsmanfirst line has hostname15:53
*** alan__ has joined #openstack-ansible15:53
mrhillsmanso it returns Ansible - { print $2 }15:53
evrardjpecho 'flush_all' | nc $(awk '/\\-l/ {print $2}' /etc/memcached.conf15:53
evrardjpnc: port number invalid: 172.29.238.13415:53
mrhillsmanright15:53
evrardjpthat's ... no luck ?15:54
mrhillsmanfirst line is about ansible managing the file for the hostname (melv7301-rpcops-lab)15:54
evrardjp:p15:54
*** alan__ has quit IRC15:54
mrhillsmanso that -l(ab) is the issue15:54
evrardjp-ab15:54
evrardjpthat's the solution15:54
evrardjprename your host :p15:54
mrhillsmanhehe15:54
cloudnullmrhillsman: http://cdn.pasteraw.com/efukadmfhg9uwt3hhwst75cmzlky6qw15:54
*** alan__ has joined #openstack-ansible15:54
mrhillsmanright15:54
mrhillsmani added that in the description as fix15:55
evrardjplisten should be an ip anyway IMO15:55
mrhillsmanit is15:55
mrhillsmannc ip port15:55
mrhillsmanbut you get nc Ansible ip port15:55
mrhillsmanthe ^ should be used anyway since it makes it more specific15:56
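To make the failure above concrete, here is a small Python sketch that mimics `awk '/pattern/ {print $2}'` against a made-up memcached.conf. The file contents are an assumption (only the hostname, the word "Ansible" and the IP come from the chat), and `second_fields` is a hypothetical helper written for this illustration.

```python
import re

# Hypothetical memcached.conf: the header comment names a host containing "-l".
SAMPLE_CONF = """\
# Ansible managed: file modified on host melv7301-rpcops-lab
-m 1024
-p 11211
-l 172.29.238.134
"""


def second_fields(pattern, text):
    """Mimic awk '/pattern/ {print $2}': field 2 of every matching line."""
    out = []
    for line in text.splitlines():
        if re.search(pattern, line):
            fields = line.split()
            out.append(fields[1] if len(fields) > 1 else "")
    return out


# Unanchored: the "-l" inside "rpcops-lab" also matches the comment line.
unanchored = second_fields(r"-l", SAMPLE_CONF)
# Anchored: only lines that start with the -l option match.
anchored = second_fields(r"^-l", SAMPLE_CONF)

if __name__ == "__main__":
    print("unanchored:", unanchored)
    print("anchored:  ", anchored)
```

With the unanchored pattern the substitution hands nc an extra leading word, so the IP lands in the port position, which lines up with the "port number invalid" error pasted earlier. Anchoring the pattern leaves only the listen address.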
evrardjpYou're right15:56
evrardjpit's starting by this15:56
Andrew_jedicloudnull: Is this right, "erl -sname "rabbit@controller1_rabbit_mq_container-2d645e7f" -mnesia /var/lib/rabbitmq/mnesia/"15:57
openstackgerritTravis Truman (automagically) proposed openstack/openstack-ansible: Define glance_default_store in group_vars  https://review.openstack.org/33557115:58
*** KLevenstein has quit IRC15:58
cloudnullAndrew_jedi: no. just dir at the end, If i remember right15:58
*** alan__ has quit IRC15:58
*** phalmos has quit IRC15:59
*** KLevenstein has joined #openstack-ansible15:59
*** alan__ has joined #openstack-ansible15:59
Andrew_jedicloudnull: ack!15:59
Andrew_jedithanks15:59
Andrew_jedicloudnull: http://paste.openstack.org/show/524097/16:00
cloudnullAndrew_jedi: so rabbit is running16:00
Andrew_jedicloudnull: this was from inside the rabbitmq container on controller116:00
cloudnulland connected to the db16:00
openstackgerritTravis Truman (automagically) proposed openstack/openstack-ansible: Define glance_default_store in group_vars  https://review.openstack.org/33557116:01
*** TheIntern has joined #openstack-ansible16:01
cloudnullevrardjp: do you see the issue mrhillsman is seeing? -- when I run "echo 'flush_all' | nc $(awk '/\-l/ {print $2}' /etc/memcached.conf) $(awk '/\-p/ {print $2}' /etc/memcached.conf)" it works16:02
mrhillsmancloudnull16:02
evrardjpI understand why he wants to add ^16:02
mrhillsmanthe first line in your memcached.conf16:02
mrhillsmandoes it have a -l anywhere?16:02
mrhillsmanif not, it will work16:02
cloudnullmy bad.16:02
mrhillsmanif it does, as does mine because of my hostname, it fails16:02
cloudnulli see it now.16:02
mrhillsmancool16:03
mrhillsmanother than that, manual upgrade with the rabbit changes succeeded16:04
cloudnullcool16:04
cloudnullevrardjp: do you have a PR in the works?16:04
*** karimb has quit IRC16:05
Andrew_jedicloudnull: Aha, i got the shell16:05
cloudnullawesome16:05
evrardjpI'm writing it right now16:05
openstackgerritJean-Philippe Evrard proposed openstack/openstack-ansible: Fix memcached flush if -l is in hostname  https://review.openstack.org/33557416:06
evrardjpquick thing, so it could need more love16:06
*** jmckind_ has joined #openstack-ansible16:07
evrardjpmrhillsman: from which version are you upgrading from/to16:07
*** jmckind has quit IRC16:10
cloudnullAndrew_jedi: were you able to nuke the DB and get the node to reconnect?16:11
*** phalmos has joined #openstack-ansible16:14
*** admin0 has quit IRC16:15
*** phalmos has quit IRC16:16
cloudnullevrardjp: https://review.openstack.org/#/c/335574 -- reviewed16:16
cloudnullanchor works but needs to be moved.16:16
Andrew_jedicloudnull: http://paste.openstack.org/show/524100/16:17
evrardjpthanks dslexia16:17
evrardjpdyslexia16:17
*** neilus has joined #openstack-ansible16:17
Andrew_jedistill same issue16:17
*** neilus has quit IRC16:17
openstackgerritJean-Philippe Evrard proposed openstack/openstack-ansible: Fix memcached flush if -l is in hostname  https://review.openstack.org/33557416:17
evrardjpI didn't test this ^16:17
evrardjpmrhillsman: could you test it ?16:18
cloudnullAndrew_jedi: try rebuilding that container?16:18
cloudnullif that continues to fail you could try destroying the node and rebuilding it to see if it'll rejoin. `openstack-ansible lxc-containers-destroy.yml lxc-containers-create.yml --limit controller1_rabbit_mq_container-2d645e7f; openstack-ansible rabbitmq-install.yml`16:18
*** neilus has joined #openstack-ansible16:19
mrhillsmanwas in a meeting16:19
mrhillsmanlooking16:19
*** appprod0 has joined #openstack-ansible16:20
openstackgerritJean-Philippe Evrard proposed openstack/openstack-ansible: Fix memcached flush if -l is in hostname  https://review.openstack.org/33557416:20
*** david-lyle has joined #openstack-ansible16:20
Andrew_jedicloudnull: Roger that.16:20
mrhillsmanyeah16:20
mrhillsmancloudnull16:20
cloudnullwhen you rerun the rabbitmq-install.yml play you can elect a different cluster node16:20
mrhillsmanevrardjp commented16:20
evrardjpAndrew_jedi: don't forget to limit on the destroy :p16:21
*** david-lyle has quit IRC16:21
cloudnullIE: openstack-ansible rabbitmq-install.yml -e "rabbitmq_primary_cluster_node=controller3_rabbit_mq_container-48ecc3b2"16:21
*** phalmos has joined #openstack-ansible16:21
*** david-lyle has joined #openstack-ansible16:21
evrardjpmrhillsman: I don't see your comment :/16:22
*** david-lyle has quit IRC16:22
*** david-lyle has joined #openstack-ansible16:23
mrhillsmanwhy you add the -F,... part?16:24
*** berendt has joined #openstack-ansible16:24
*** david-lyle has quit IRC16:25
*** david-lyle has joined #openstack-ansible16:25
evrardjpmrhillsman: that's how master is behaving16:27
evrardjpI barely added ^16:27
cloudnullAndrew_jedi: I just caused my cluster to get into a partitioned state and then ran: openstack-ansible lxc-containers-destroy.yml lxc-containers-create.yml --limit infra1_rabbit_mq_container-d3c5f2d5 && openstack-ansible rabbitmq-install.yml -e "rabbitmq_primary_cluster_node=infra3-rabbit-mq-container-ff06bfc8" ## Note my hostnames are different than yours ## and it seemed to recover nicely.16:28
*** phalmos has quit IRC16:28
cloudnullidk if that will help your situation but its worth a shot16:28
*** cloader89 has quit IRC16:30
cloudnullyou shouldnt need the cluster node part but its worth noting16:30
Andrew_jedicloudnull: Thanks, I appreciate this. I am trying the same thing now16:30
cloudnull** cluster node part == setting rabbitmq_primary_cluster_node16:30
*** sulo has joined #openstack-ansible16:30
cloudnullafk lunching Andrew_jedi best of luck , let us know how it goes.16:32
*** jorge_munoz has quit IRC16:33
openstackgerritTravis Truman (automagically) proposed openstack/openstack-ansible: Add conditional for overlay network settings  https://review.openstack.org/33557916:33
*** asettle has quit IRC16:33
automagicallycloudnull, hope you don’t mind that cherry-pick of your change to master ^16:34
evrardjpgood change btw16:34
*** berendt has quit IRC16:35
evrardjpwill star it, and be back on it later16:35
*** pcaruana has quit IRC16:36
*** TxGirlGeek has quit IRC16:38
*** neilus has quit IRC16:44
*** Andrew_jedi has quit IRC16:48
*** karimb has joined #openstack-ansible16:48
*** karimb has quit IRC16:48
*** Andrew_jedi has joined #openstack-ansible16:48
openstackgerritTravis Truman (automagically) proposed openstack/openstack-ansible: Add conditional for overlay network settings  https://review.openstack.org/33557916:48
*** KLevenstein has quit IRC16:53
*** KLevenstein has joined #openstack-ansible16:54
*** TxGirlGeek has joined #openstack-ansible16:55
evrardjpsee you tomorrow everyone!16:56
automagicallylater evrardjp16:56
eil397have a good one16:57
*** krotscheck is now known as krotscheck_vaca17:01
*** krotscheck_vaca is now known as krot_vaca_jul1917:01
spotzbye evrardjp17:03
*** javeriak has joined #openstack-ansible17:03
*** sdake_ has joined #openstack-ansible17:08
*** sdake has quit IRC17:10
*** weezS has quit IRC17:11
*** asettle has joined #openstack-ansible17:12
Andrew_jedibye evrardjp17:16
*** eil397 has quit IRC17:16
*** TheIntern has quit IRC17:16
*** TheIntern has joined #openstack-ansible17:17
*** PrestonBannister has joined #openstack-ansible17:28
*** eil397 has joined #openstack-ansible17:29
*** javeriak_ has joined #openstack-ansible17:33
*** javeriak has quit IRC17:34
*** TheIntern has quit IRC17:37
*** ManojK has quit IRC17:40
*** ManojK has joined #openstack-ansible17:40
*** TheIntern has joined #openstack-ansible17:40
*** electrofelix has quit IRC17:41
*** TheIntern has quit IRC17:42
*** McMurlock1 has quit IRC17:46
openstackgerritNolan Brubaker proposed openstack/openstack-ansible: Use in-tree env.d files, provide override support  https://review.openstack.org/33259517:48
*** chandanc_ has quit IRC17:48
*** admin0 has joined #openstack-ansible17:55
*** admin0 has quit IRC17:55
*** sdake_ has quit IRC17:55
cloudnullAndrew_jedi: whats the word?17:55
Andrew_jedicloudnull: Part of the problem is hardware. Waiting for that to get fixed. Faulty cable.17:56
Andrew_jedicloudnull: I will update you within an hour.17:56
cloudnullah. that makes networking angry17:56
cloudnull:P17:56
Andrew_jedicloudnull: Lol ;)17:57
cloudnullany cores around that might want to give this a shove https://review.openstack.org/#/c/323504/17:57
*** permalac has quit IRC17:57
*** albertcard has joined #openstack-ansible18:00
jmccrorycloudnull: got it18:02
cloudnulljmccrory: tyvm18:02
openstackgerritMerged openstack/openstack-ansible-openstack_hosts: Updated the hostname generation  https://review.openstack.org/32350418:03
*** TxGirlGeek has quit IRC18:05
openstackgerritAnton Khaldin proposed openstack/openstack-ansible-galera_client: Add ignore_errors to fix minor bug with fallback source for apt-key.  https://review.openstack.org/33523318:07
*** TxGirlGeek has joined #openstack-ansible18:17
*** berendt has joined #openstack-ansible18:17
openstackgerritAnton Khaldin proposed openstack/openstack-ansible-galera_client: Add ignore_errors to fix minor bug with fallback source for apt-key.  https://review.openstack.org/33523318:18
*** jorge_munoz has joined #openstack-ansible18:20
*** mrhillsman has quit IRC18:21
*** TxGirlGeek has quit IRC18:23
*** johnmilton has quit IRC18:25
*** cloader89 has joined #openstack-ansible18:26
*** mrhillsman has joined #openstack-ansible18:26
alextricity25cloudnull: odyssey4me: This is something worth looking into, as it will block anyone building a multi-node with master: https://bugs.launchpad.net/openstack-ansible/+bug/159747518:36
openstackLaunchpad bug 1597475 in openstack-ansible "swift_rings_distribute.yml synchronize task broken on multi-node" [Undecided,New]18:36
cloudnullalextricity25: i was looking into that last night.18:37
alextricity25cloudnull: did you get it too?18:37
cloudnullno18:37
*** daneyon has left #openstack-ansible18:37
alextricity25....well then...18:37
*** asettle has quit IRC18:37
cloudnulli was using the multi-node-aio env18:38
alextricity25Are you sure you were building with master? Ansible version 2.1.0?18:38
cloudnulland ive not been able to replicate it18:38
openstackgerritMerged openstack/openstack-ansible-openstack_hosts: Added the ip_vs kernel module to all openstack hosts  https://review.openstack.org/33470118:38
cloudnullalextricity25: yes im on e6d2f771b8d2b9fd9578396d276398ed1bdaafa218:39
cloudnulli have another env being kicked right now18:39
cloudnullso more soon , but thus far I cant recreat that18:39
cloudnull*recreate18:40
javeriak_hey guys, i have an issue with ansible ssh not working, direct ssh works, but not through ansible; here are my traces: http://paste.ubuntu.com/18086570/18:41
cloudnulljaveriak_: so running: ssh 10.100.1.2 works?18:48
*** TxGirlGeek has joined #openstack-ansible18:48
javeriak_cloudnull yes18:48
cloudnulldo you by chance have a lot of keys loaded in your ssh-agent  ?18:49
cloudnullssh-add -L18:49
javeriak_btw where can i find the actual code for the ansible core modules on my deploy? for example i want to see what the ansible ping module does18:49
javeriak_cloudnull nope, only the deploy node key18:50
cloudnulljaveriak_: /opt/ansible-runtime18:50
javeriak_ i have the installed ansible version under there, /opt/ansible_v1.9.3-1/ ?18:52
cloudnulljaveriak_:  is this master?18:52
javeriak_its kilo18:52
cloudnullah.18:52
javeriak_10.1.1118:52
*** TxGirlGeek has quit IRC18:53
cloudnullthat should be in /usr/local/lib/python2.7/dist-packages/ansible18:54
*** admin0 has joined #openstack-ansible18:54
*** ManojK has quit IRC18:54
errrif I am using developer mode on horizon where does the gitrepo I specify get checked out to?18:55
javeriak_cloudnull found it, thanks18:55
errroh it only grabs the egg anyway, so never mind.18:56
javeriak_so back to the issue, i dont know whats wrong with the connection, the debug trace just times out18:56
cloudnullerrr: its cloning the git repo directly and then installing it using the local clone as a constraint18:56
errrcloudnull: so the whole repo is cloned then? where does it put it?18:57
cloudnullerrr: pip puts it in /tmp/build I believe18:58
javeriak_cloudnull btw where does it place this ping module on the target node somewhere too?18:58
cloudnullthe task builds the constraint file here /opt/developer-pip-constraints.txt18:58
cloudnullthen the regular install process happens using the local constraints18:59
errrcloudnull: ah so its gone after its built. I was just wanting to check the sha in it vs what this other box we are deving on has18:59
*** TxGirlGeek has joined #openstack-ansible19:01
cloudnulli do believe its gone post build, and the default is set to use the master branch when dev mode is enabled19:01
cloudnullso it may be hard to track19:02
cloudnullyou can activate the venv19:02
cloudnulland see what the installed version is19:02
*** ScarZy has quit IRC19:02
*** vnogin has quit IRC19:03
cloudnulljaveriak_:  idk.19:04
cloudnulli believe the module is copied over at runtime.19:04
javeriak_cloudnull yes but it doesnt persist19:05
cloudnullno i dont believe so19:05
javeriak_i mean i cant find it on the target, so i suppose i can just modify the master one and use that19:05
cloudnullif you suspect the ping module to be misbehaving you can try a shell command.19:06
cloudnullansible infra1 -m shell -a 'echo hi'19:06
javeriak_i think the module is fine, but im trying to understand what its doing, because the ssh debug trace only shows it running /usr/bin/python and then nothing, boom output19:07
*** Andrew_jedi has quit IRC19:10
*** woodard has quit IRC19:12
*** woodard has joined #openstack-ansible19:13
*** woodard has quit IRC19:13
*** woodard has joined #openstack-ansible19:14
*** sdake has joined #openstack-ansible19:14
*** vnogin has joined #openstack-ansible19:15
*** ManojK has joined #openstack-ansible19:15
*** ScarZy has joined #openstack-ansible19:15
*** TM1 has quit IRC19:27
*** Andrew_jedi has joined #openstack-ansible19:27
*** javeriak_ has quit IRC19:32
*** javeriak has joined #openstack-ansible19:34
*** asettle has joined #openstack-ansible19:38
*** asettle has quit IRC19:42
*** catintheroof has quit IRC19:43
eil397can someone review oneline bug fix ? https://review.openstack.org/#/c/335233/19:49
cloudnulleil397: looking now19:50
cloudnull++19:50
eil397cloudnull: thanks19:50
cloudnullthank you for putting it together :)19:51
alextricity25cloudnull: i have a repo-build question for you19:52
cloudnullsure19:52
alextricity25How does repo-build determine what wheels to build?19:52
eil397cloudnull: it was David Wilde who found and described it. I just that opportunity to send my first commit to osa. hope I will be able to add value.19:52
eil397s\just that\just used that\g19:53
alextricity25cloudnull: I imagine that the repo-build play does some sort of logic around requirements.txt19:53
openstackgerritMatt Dorn proposed openstack/openstack-ansible-openstack_hosts: Add linux-image-extra-virtual to host packages  https://review.openstack.org/33565019:55
alextricity25cloudnull: If I wanted to tell the repo-build server to not build wheels for a specific role, how would I do that?19:56
openstackgerritTravis Truman (automagically) proposed openstack/openstack-ansible-os_nova: Remove tags from functional testing playbooks  https://review.openstack.org/33565119:57
openstackgerritMerged openstack/openstack-ansible-galera_client: Add ignore_errors to fix minor bug with fallback source for apt-key.  https://review.openstack.org/33523319:58
*** Guest20454 is now known as mgagne19:58
*** mgagne has joined #openstack-ansible19:58
*** asettle has joined #openstack-ansible19:59
*** appprod0 has quit IRC20:05
*** TxGirlGeek has quit IRC20:07
*** TxGirlGeek has joined #openstack-ansible20:07
openstackgerritKevin Carter (cloudnull) proposed openstack/openstack-ansible-repo_build: Updated repo-build to store package sources  https://review.openstack.org/33411020:09
cloudnullalextricity25: which role ?20:09
alextricity25cloudnull: I want the repo-build playbook to skip building wheels for an RPC-O role, beaver.20:10
*** weezS has joined #openstack-ansible20:10
alextricity25cloudnull: I think i figured it out though...it looks for ansible variables postfixed with "pip_packages"?20:10
alextricity25or any variant of BUILD_IN_PIP_PACKAGE_VARS?20:11
alextricity25s/BUILD/BUILT/20:11
cloudnullalextricity25:  you can change pkg_locations to not include the beaver role.20:11
cloudnullrather the location of the beaver role.20:11
cloudnullif you just dont want wheels built for that role you can modify the vars too.20:12
*** TxGirlGeek has quit IRC20:12
*** TxGirlGeek has joined #openstack-ansible20:12
alextricity25cloudnull:  I just deleted the beaver role from the code tree :P20:12
alextricity25and all its variables20:12
alextricity25ha20:13
alextricity25cloudnull: Where is this pkg_locations variable you speak of?20:13
cloudnullalextricity25: https://github.com/openstack/openstack-ansible/blob/master/playbooks/repo-build.yml#L2420:14
cloudnullvalues found here by default https://github.com/openstack/openstack-ansible/blob/master/playbooks/repo-build.yml#L44-L4720:14
*** TxGirlGeek has quit IRC20:14
cloudnullRPC-O may be storing roles in one of those locations or overriding the default.20:14
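Note on the exchange above: the idea discussed (wheels are built for whatever shows up in variables whose names end in "pip_packages", found under the directories listed in pkg_locations) can be sketched roughly as follows. This is a minimal illustration, not the actual repo-build lookup code: the function `collect_pip_packages`, the `skip_roles` parameter, and the sample role data are all hypothetical names made up for this sketch.

```python
# Hypothetical sketch only: this is NOT the repo-build lookup plugin.
# It illustrates "collect every variable whose name ends in pip_packages,
# unless the role that defines it has been excluded".

PIP_VAR_SUFFIX = "pip_packages"  # suffix mentioned in the conversation


def collect_pip_packages(role_vars, skip_roles=()):
    """Return the sorted, de-duplicated packages from *_pip_packages vars.

    role_vars  -- mapping of role name -> dict of that role's variables
    skip_roles -- role names whose variables should be ignored entirely
    """
    packages = set()
    for role, variables in role_vars.items():
        if role in skip_roles:
            continue  # same effect as dropping the role's location
        for name, value in variables.items():
            if name.endswith(PIP_VAR_SUFFIX) and isinstance(value, list):
                packages.update(value)
    return sorted(packages)


# Illustrative data; real roles define many more variables.
ROLE_VARS = {
    "beaver": {"beaver_pip_packages": ["beaver"]},
    "os_nova": {
        "nova_pip_packages": ["nova", "python-novaclient"],
        "nova_service_name": "nova",  # ignored: wrong suffix
    },
}

if __name__ == "__main__":
    print(collect_pip_packages(ROLE_VARS, skip_roles={"beaver"}))
```

Under these assumptions, skipping a role and deleting its "pip_packages" variables are equivalent ways of keeping its packages out of the build list, which matches the two options suggested in the channel.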
*** TxGirlGeek has joined #openstack-ansible20:15
mhaydencloudnull: nice find on the nohup20:16
cloudnullthat was an odd one.20:16
mhaydeni wonder why that happens20:16
*** Drago has left #openstack-ansible20:17
cloudnullit looks like stdout is just left open for the entire run20:17
*** asettle has quit IRC20:19
*** appprod0 has joined #openstack-ansible20:20
*** mkrish004c has joined #openstack-ansible20:22
*** Andrew_jedi has quit IRC20:25
*** asettle has joined #openstack-ansible20:26
*** Andrew_jedi has joined #openstack-ansible20:29
Andrew_jedicloudnull: Still not fixed. It may be a hardware issue. We first saw the network partition when we introduced bonding on this setup.20:30
*** alan__ has quit IRC20:34
cloudnullAndrew_jedi: maybe switch configs ?20:36
Andrew_jedicloudnull: Could you pls spare 2 mins and have a look at this, http://paste.openstack.org/show/524128/20:40
Andrew_jediThis is the network config we introduced when we implemented bonding20:41
Andrew_jedibond0 replaced eth0 and bond1 replaced eth1 basically20:42
Andrew_jediIt could be a switch issue but we are not even using vlans on this setup.20:43
*** TxGirlGeek has quit IRC20:45
*** asettle has quit IRC20:51
*** woodard_ has joined #openstack-ansible20:56
*** asettle has joined #openstack-ansible20:57
*** woodard has quit IRC21:00
cloudnullAndrew_jedi: looking21:00
*** Mudpuppy has quit IRC21:01
*** javeriak has quit IRC21:03
*** woodard has joined #openstack-ansible21:08
*** psilvad has quit IRC21:09
*** javeriak has joined #openstack-ansible21:10
cloudnullAndrew_jedi: i dont see anything wrong with the interface file.21:10
cloudnullthat should work. But if you're seeing issues with the bond you might want to try disabling a channel to see if the connection stabilizes21:11
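Note on the suggestion above: before disabling one leg of a bond, it can help to see which slave interface looks unhealthy. The Linux bonding driver reports per-slave state in /proc/net/bonding/<bond>. The sketch below parses text in that format; the helper names (`parse_bond_status`, `suspect_slaves`) and the sample text, including the interface names eth0 and eth2, are assumptions for illustration and are not taken from the paste discussed in the channel.

```python
# Minimal sketch, assuming text in the /proc/net/bonding/<bond> format.
# The sample below is illustrative, not output from the setup discussed.

SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
MII Status: up

Slave Interface: eth0
MII Status: up
Link Failure Count: 0

Slave Interface: eth2
MII Status: up
Link Failure Count: 7
"""


def parse_bond_status(text):
    """Map each slave interface to its MII status and link failure count."""
    slaves = {}
    current = None  # stays None while reading the bond-level header
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
            slaves[current] = {}
        elif current is not None and key == "MII Status":
            slaves[current]["mii_status"] = value
        elif current is not None and key == "Link Failure Count":
            slaves[current]["link_failures"] = int(value)
    return slaves


def suspect_slaves(slaves):
    """Return slaves that are down or have recorded any link failures."""
    return sorted(
        name
        for name, info in slaves.items()
        if info.get("mii_status") != "up" or info.get("link_failures", 0) > 0
    )


if __name__ == "__main__":
    print(suspect_slaves(parse_bond_status(SAMPLE)))
```

On a real host the input would come from reading /proc/net/bonding/bond0 (or bond1). A slave with a climbing failure count is a reasonable first candidate to take down when testing whether the connection stabilizes.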
*** TxGirlGeek has joined #openstack-ansible21:11
*** asettle has quit IRC21:11
*** woodard_ has quit IRC21:12
*** woodard has quit IRC21:13
*** ManojK has quit IRC21:13
openstackgerritKevin Carter (cloudnull) proposed openstack/openstack-ansible-repo_build: Updated repo-build to store package sources  https://review.openstack.org/33411021:13
*** ManojK has joined #openstack-ansible21:15
*** pester has joined #openstack-ansible21:15
*** TxGirlGeek has quit IRC21:16
*** fxpester has quit IRC21:17
Andrew_jedicloudnull: thx! looking into this.21:20
*** thorst has quit IRC21:21
*** kstev has quit IRC21:23
*** smatzek has quit IRC21:27
*** mkrish004c has quit IRC21:36
*** PrestonBannister has quit IRC21:38
*** spotz is now known as spotz_zzz21:39
*** thorst has joined #openstack-ansible21:46
*** Andrew_jedi has quit IRC21:46
*** ManojK has quit IRC21:48
*** thorst has quit IRC21:50
*** woodard has joined #openstack-ansible21:55
*** messy has quit IRC21:57
*** adrian_otto has quit IRC21:57
*** jmckind_ has quit IRC21:58
*** TxGirlGeek has joined #openstack-ansible22:00
mrdaMorning all22:03
*** TxGirlGeek has quit IRC22:06
*** thorst has joined #openstack-ansible22:07
admin0morning mrda (0:10 AM here)22:10
*** berendt has quit IRC22:10
*** TxGirlGeek has joined #openstack-ansible22:11
*** asettle has joined #openstack-ansible22:12
mrda:)22:15
*** ametts has quit IRC22:16
*** asettle has quit IRC22:17
eil397mrning mrda22:18
*** TxGirlGeek has quit IRC22:21
mrdao/22:23
*** adrian_otto has joined #openstack-ansible22:25
*** aernhart has joined #openstack-ansible22:33
*** cloader89 has quit IRC22:37
*** admin0 has left #openstack-ansible22:40
*** admin0 has quit IRC22:40
*** sdake_ has joined #openstack-ansible22:42
*** sdake has quit IRC22:45
*** KLevenstein has quit IRC22:46
*** sdake_ has quit IRC22:49
*** thorst has quit IRC22:58
*** weshay has quit IRC22:58
*** thorst has joined #openstack-ansible22:59
*** sdake has joined #openstack-ansible23:01
*** sdake has quit IRC23:04
*** thorst has quit IRC23:07
*** woodard has quit IRC23:12
*** jamielennox is now known as jamielennox|away23:13
*** deverter has quit IRC23:14
*** asettle has joined #openstack-ansible23:28
*** asettle has quit IRC23:33
*** daneyon has joined #openstack-ansible23:38
*** daneyon_ has joined #openstack-ansible23:39
*** daneyon has quit IRC23:43
*** jamielennox|away is now known as jamielennox23:51
*** sacharya_ has quit IRC23:56
*** mummer has quit IRC23:58
*** eil397 has left #openstack-ansible23:59
