Friday, 2015-07-17

*** Mudpuppy has joined #openstack-ansible00:00
*** openstack has joined #openstack-ansible00:05
*** Mudpuppy has joined #openstack-ansible00:05
*** TheIntern has quit IRC00:07
*** galstrom_zzz is now known as galstrom00:13
*** sdake has joined #openstack-ansible00:30
*** shaleh has quit IRC00:36
*** sigmavirus24 is now known as sigmavirus24_awa00:37
*** galstrom is now known as galstrom_zzz00:41
*** galstrom_zzz is now known as galstrom00:41
openstackgerritMiguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Identity Provider Configuration  https://review.openstack.org/19425900:44
*** Mudpuppy_ has joined #openstack-ansible00:53
*** galstrom is now known as galstrom_zzz00:55
*** sacharya has joined #openstack-ansible00:55
*** Mudpuppy has quit IRC00:56
*** sacharya1 has joined #openstack-ansible00:59
*** sacharya has quit IRC00:59
jwitkowhen I run the setup-infrastructure.yml playbook I am getting "TypeError: argument of type 'NoneType' is not iterable" when gathering facts01:17
jwitkothis did not happen on the foundation playbook01:17
jwitkoanyone have any idea whats wrong ?01:17
*** sacharya1 has quit IRC01:19
*** javeriak has quit IRC01:52
cloudnulljwitko: id imagine that its an issue in inventory / user config. does it happen on first run ? or explode on some task ?01:55
jwitkofirst run02:00
jwitkocloudnull, looks like its trying to execute on containers02:01
cloudnullwhat play are you running ?02:03
jwitkoopenstack-ansible setup-infrastructure.yml02:04
jwitkoit fails on the Install memcached play,  under gathering facts02:04
cloudnullif you execute the memcached-install.yml play directly does it fail the same way ?02:05
*** Mudpuppy_ is now known as Mudpuppy02:06
*** alop has quit IRC02:07
*** alop has joined #openstack-ansible02:31
*** galstrom_zzz is now known as galstrom02:37
*** galstrom is now known as galstrom_zzz02:38
*** galstrom_zzz is now known as galstrom02:47
jwitkocloudnull, i'll try that.02:52
jwitkoyes, it does02:55
jwitkocloudnull, http://pastebin.com/pbnJzkWH02:56
jwitkoi don't know much about containers02:56
*** galstrom is now known as galstrom_zzz02:56
jwitkobut it looks like this is trying to reach out to some containers via ssh. If those are the hostnames its creating dynamically then its not surprising it can't SSH to those?02:57
cloudnulldid you by chance use vault on /etc/openstack_deploy/openstack_user_config.yml ?02:57
jwitkoyes02:58
cloudnulldecrypt that file and try again .02:58
jwitkoerr no02:58
jwitkosorry i did not,  only on user_secrets.yml02:58
cloudnullok, nevermind then02:58
cloudnullnext, how was ansible installed?02:58
cloudnullusing the scripts ?02:59
jwitkoaye02:59
cloudnullif you run lxc-ls -f02:59
cloudnulldo all your containers have ip addresses ?02:59
cloudnullim guessing because i've not seen that issue before03:00
jwitkoyes they do03:00
jwitkohttp://pastebin.com/wnAuy1xa03:01
jwitkoif i try to SSH to the memcached_container from the host itself i can get to a password prompt03:03
jwitkobut what I'm not understanding here is,  isn't ansible attempting to SSH straight to the container name from my work-station ?03:04
jwitkohow would the work station have any idea how to resolve these hostnames?03:04
Sam-I-Amdns03:07
*** sacharya has joined #openstack-ansible03:15
*** sdake has quit IRC03:23
*** galstrom_zzz is now known as galstrom03:24
cloudnulljwitko:  it looks like the containers were created without the other interfaces03:29
cloudnullhttp://cdn.pasteraw.com/316w5iw1f6rbmuc45a7prbrbgr9ev9k03:29
cloudnull^ that is a working set of containers.03:30
jwitkooh, the vxlan stuff?03:30
cloudnullmgmt is the interface that is missing03:30
cloudnullthe 10.x interfaces are from lxc03:31
cloudnullmy 172 addresses are what i use for management03:31
cloudnulland then I have others for vxlan / vlan / flat03:31
jwitkook weird i wonder why they would be missing03:31
jwitkoi have my management network specified03:31
cloudnullan issue in the config from before that is otherwise corrected now?03:33
cloudnullid guess your containers inventory has all of the ip addresses set to None03:33
*** sdake has joined #openstack-ansible03:33
jwitkosorry, where is the containers inventory?03:35
cloudnull# /etc/openstack_deploy/openstack_inventory.json03:35
jwitkoyou're right03:36
jwitkothey're set to null03:36
jwitko            "os-ctrl1_memcached_container-a8401c9b": {03:36
jwitko                "ansible_ssh_host": null,03:36
jwitko                "component": "memcached",03:36
jwitko                "container_address": null,03:36
jwitkocloudnull, any idea why its all populated with null?03:42
jwitkohm... is provider networks a sub-item of global_overrides?03:49
*** tlian has quit IRC03:49
*** galstrom is now known as galstrom_zzz03:55
cloudnullsorry being split brained03:56
cloudnullits late : )03:56
cloudnulljwitko:  check this http://cdn.pasteraw.com/2zw3eoupb535eqbs27gg130jv2oya0r03:56
cloudnullif you run # /opt/os-ansible-deployment/scripts/inventory-manage.py -f /etc/openstack_deploy/openstack_inventory.json -l03:56
cloudnullid imagine that your inventory output looks something similar.03:56
cloudnullif thats the case then to get this back on the right track you just need to run the following little python code to reset the container addresses. http://paste.openstack.org/show/382642/03:58
cloudnullthen rerun # openstack-ansible lxc-container-create.yml03:58
cloudnullwhich will re-assign ip addresses to the various containers.03:59
cloudnulljwitko:  yes its a sub-element.04:01
cloudnullhere is a complete config from one of my dev labs http://paste.openstack.org/show/382643/04:01
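For reference, provider_networks sits one level under global_overrides, with each entry nested under a network key. A minimal sketch of the shape (addresses and values here are illustrative, not taken from jwitko's config or cloudnull's paste):

    global_overrides:
      internal_lb_vip_address: 172.29.236.10   # illustrative
      external_lb_vip_address: 192.168.1.100   # illustrative
      management_bridge: "br-mgmt"
      provider_networks:
        - network:
            container_bridge: "br-mgmt"
            container_interface: "eth1"
            type: "raw"
            ip_from_q: "container"
            group_binds:
              - all_containers
              - hosts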
*** KLevenstein has joined #openstack-ansible04:01
*** galstrom_zzz is now known as galstrom04:08
jwitkocloudnull, yea so the issue was my indentation of the entire provider networks block04:11
jwitkoso I fixed the indentation and re-ran the $ openstack-ansible setup-hosts.yml04:11
jwitkothis has failed however on the lxc_container start ups04:11
jwitkoNOTIFIED: [lxc_container_create | Start Container] ****************************04:12
cloudnullso once that play is done you should have some network bits04:12
jwitkoso its expected the failures of the lxc container starts?04:15
*** misc has quit IRC04:17
cloudnullno04:19
jwitkoi see that it got the networking bits correct this time around04:19
jwitkoafter i fixed the indentations04:19
cloudnullok04:19
cloudnullgreat04:19
cloudnullall the containers started up ?04:19
jwitkono they are all failing to start04:20
jwitkofailed: [os-ctrl2_nova_cert_container-a9a3f4e2 -> os-ctrl2] => {"error": "Failed to start container [ os-ctrl2_nova_cert_container-a9a3f4e2 ]", "failed": true, "lxc_container": {"init_pid": -1, "interfaces": [], "ips": [], "state": "stopped"}, "rc": 1}04:20
jwitkomsg: The container [ %s ] failed to start. Check to lxc is available and that the container is in a functional state.04:20
jwitkoall containers are reporting that error04:22
jwitkoits still going04:22
cloudnullif you do lxc-ls -f04:24
cloudnullall are "stopped"04:25
*** misc has joined #openstack-ansible04:25
cloudnullif so it may jus tbe easier to nuke the containers you have on disk and build new ones.04:26
cloudnullopenstack-ansible lxc-container-destroy.yml04:26
cloudnullwill remove all the containers, working or not.04:26
cloudnullthen rerun openstack-ansible lxc-container-create.yml04:26
*** alop has quit IRC04:27
cloudnullyou could fix up the configs, which im guessing are auto-generated, however its likely faster to nuke and rebuild.04:27
jwitkook cool04:27
jwitkoshould i run lxc-container-create.yml before or after the "openstack-ansible setup-hosts.yml" ?04:28
cloudnullit wont harm your old inventory so it should just carry on.04:28
cloudnullsetup-hosts.yml is a meta play that calls lxc-container-create.yml04:28
jwitkooh ok04:28
cloudnullyou can see the order of the calls by catting the file.04:29
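Since setup-hosts.yml is a meta play, it contains nothing but includes, so reading the file shows the call order directly. A rough sketch of the pattern, assuming include names along these lines (the exact file names may differ in the real playbook):

    ---
    # A meta play: only includes, so the execution order is the file order.
    - include: openstack-hosts-setup.yml
    - include: lxc-hosts-setup.yml
    - include: lxc-containers-create.yml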
jwitkoyea i'm going to check it out after this errors04:29
cloudnullim mostly afk for the rest of the night .04:29
jwitkothanks so much for your help04:29
jwitkohave a good night04:29
cloudnulllet us know how it goes04:30
cloudnullanytime.04:30
cloudnullhappy to help04:30
*** britthou_ has joined #openstack-ansible04:38
*** KLevenstein has quit IRC04:39
*** galstrom is now known as galstrom_zzz04:40
*** britthouser has quit IRC04:40
*** galstrom_zzz is now known as galstrom04:42
prometheanfirecloudnull: you happen to know of any cause to a bridge dropping all traffic?04:42
prometheanfirecloudnull: since you are up04:42
Sam-I-Amprometheanfire: it went from bridge to bitbucket?04:44
prometheanfireSam-I-Am: seems like it04:45
prometheanfireSam-I-Am: if you want we are talking about it in #gentoo-virtualization04:45
prometheanfireit's something I've not heard of or reproduced04:45
Sam-I-Ambut someone else has figured out how to break a bridge?04:46
jwitkodamn... i did the lxc-container-destroy and rebuild04:50
jwitkoand it still fails the same way04:50
prometheanfireSam-I-Am: I'm thinking it has to do with rip/stp04:51
prometheanfireSam-I-Am: he's checking04:51
*** sdake has quit IRC04:56
*** galstrom is now known as galstrom_zzz05:03
*** Mudpuppy has quit IRC05:09
*** mancdaz has quit IRC05:23
*** mancdaz has joined #openstack-ansible05:24
*** sacharya has quit IRC05:29
*** markvoelker has joined #openstack-ansible05:41
*** markvoelker_ has joined #openstack-ansible05:44
*** markvoelker has quit IRC05:45
*** annashen has joined #openstack-ansible05:51
*** ig0r_ has joined #openstack-ansible06:08
*** ig0r_ has quit IRC06:45
*** ig0r_ has joined #openstack-ansible06:51
openstackgerritMerged stackforge/os-ansible-deployment: Remove {{ from "with_items" and "when" statements  https://review.openstack.org/20258107:11
openstackgerritMerged stackforge/os-ansible-deployment: Fix haproxy service config when ssl is enabled  https://review.openstack.org/20248507:40
openstackgerritJesse Pretorius proposed stackforge/os-ansible-deployment: Fix haproxy service config when ssl is enabled  https://review.openstack.org/20291107:48
*** annashen has quit IRC08:05
mancdazgit-harry ping08:09
git-harrypong08:09
mancdazgood morrow!08:11
mancdazI was just looking at your review for the rabbitmq install stuff08:11
git-harryyeah, I know it's broken08:12
mancdazI'm probably being a bit dumb, but isn't it going to stop all the rabbits every time the playbooks are run?08:12
git-harryoh08:12
git-harryyes, I need to add a tag so you can skip that08:12
mancdaza tag?08:13
mancdazdoesn't it need to determine automatically if they need stopping, or not08:13
git-harrythis is why we had the discussion yesterday about assumptions.08:14
git-harryIf I can assume that this is only ever run during a maintenance it doesn't matter if they all get shutdown08:14
mancdazwell it's part of the 'install' play08:15
git-harryI'm not following08:15
mancdazwell, we were talking about when a person does a full juno > kilo upgrade, it would usually be in a maintenance window08:20
git-harryI wasn't :P08:21
mancdazok08:21
mancdaztruthfully, I'd expect that any time the playbooks are run fully it would be in a maintenance window. But I couldn't guarantee that08:22
mancdazand also remember, we were speaking rpc specific practices08:22
mancdazthis is osad, so other deployers might have different expectations08:22
git-harryI know, but no one has reviewed it yet to express dissatisfaction at this method08:23
mancdazha, I can add my comments there then. Was just double checking here that I wasn't misunderstanding the patch08:23
*** rcarrillocruz has quit IRC08:35
*** markvoelker_ has quit IRC08:42
openstackgerritgit-harry proposed stackforge/os-ansible-deployment: Serialise rabbitmq playbook to allow upgrades  https://review.openstack.org/20268108:44
*** markvoelker has joined #openstack-ansible08:57
*** markvoelker has quit IRC09:02
openstackgerritJesse Pretorius proposed stackforge/os-ansible-deployment: Keystone Federation Service Provider Configuration  https://review.openstack.org/19439509:07
*** markvoelker has joined #openstack-ansible09:12
*** markvoelker has quit IRC09:17
*** markvoelker has joined #openstack-ansible09:26
*** markvoelker has quit IRC09:31
*** markvoelker has joined #openstack-ansible09:41
*** markvoelker has quit IRC09:45
openstackgerritgit-harry proposed stackforge/os-ansible-deployment: Serialise rabbitmq playbook to allow upgrades  https://review.openstack.org/20268109:52
*** markvoelker has joined #openstack-ansible09:55
*** markvoelker has quit IRC10:00
*** markvoelker has joined #openstack-ansible10:07
*** markvoelker has quit IRC10:12
*** markvoelker has joined #openstack-ansible10:22
*** openstackgerrit has quit IRC10:31
*** openstackgerrit has joined #openstack-ansible10:32
*** markvoelker has quit IRC10:32
*** markvoelker has joined #openstack-ansible10:37
*** sdake has joined #openstack-ansible10:39
*** markvoelker has quit IRC10:42
*** markvoelker has joined #openstack-ansible10:51
*** markvoelker has quit IRC10:56
openstackgerritJesse Pretorius proposed stackforge/os-ansible-deployment: Fix Horizon SSL certificate management and distribution  https://review.openstack.org/20297710:57
openstackgerritMerged stackforge/os-ansible-deployment: Fix repo section in example config file  https://review.openstack.org/20237711:00
*** markvoelker has joined #openstack-ansible11:06
*** markvoelker has quit IRC11:11
openstackgerritJesse Pretorius proposed stackforge/os-ansible-deployment: Set haproxy install to use latest packages  https://review.openstack.org/20298111:15
*** markvoelker has joined #openstack-ansible11:19
*** markvoelker has quit IRC11:24
*** sdake has quit IRC11:32
*** markvoelker has joined #openstack-ansible11:32
*** markvoelker has quit IRC11:44
*** mmasaki has left #openstack-ansible11:51
*** markvoelker has joined #openstack-ansible11:55
*** markvoelker has quit IRC12:00
*** markvoelker has joined #openstack-ansible12:09
*** markvoelker has quit IRC12:13
*** markvoelker has joined #openstack-ansible12:16
openstackgerritJesse Pretorius proposed stackforge/os-ansible-deployment: Fix Horizon SSL certificate management and distribution  https://review.openstack.org/20297712:20
*** markvoelker has quit IRC12:21
*** markvoelker has joined #openstack-ansible12:24
*** markvoelker has quit IRC12:32
*** sdake has joined #openstack-ansible12:33
*** markvoelker has joined #openstack-ansible12:39
*** markvoelker has quit IRC12:43
*** markvoelker has joined #openstack-ansible12:53
*** tlian has joined #openstack-ansible12:57
*** markvoelker has quit IRC12:58
mgariepyhey, how long should it take to start lxc container on a infra hosts?13:01
mgariepyis there some order that it respect ? or should it start them all at once on boot ?13:01
openstackgerritJesse Pretorius proposed stackforge/os-ansible-deployment: Keystone SSL cert/key distribution and configuration  https://review.openstack.org/19447413:04
*** markvoelker has joined #openstack-ansible13:05
odyssey4memgariepy if you start them all at once it'll take longer for them to get to a ready state, but they'll normalise and should be fine13:05
odyssey4meit's probably best to do a bit of a delay between each of them and start them in some sort of sensible order based on dependencies13:06
odyssey4mebut if you start them all at once it'll probably work - you may just need to monitor and health check them a bit later13:06
mgariepywhat's the default behavior ?13:06
*** markvoelker_ has joined #openstack-ansible13:07
odyssey4meI could be wrong, but the default at this stage is all at once.13:07
mgariepybecause i rebooted a controller and it's been an hour and still one is not started yet.13:07
odyssey4memgariepy then you have a problem and had best start checking logs - they should've been up within 5 mins13:08
mancdazmgariepy I've seen that before13:08
mancdazthough I can't remember exactly what it was :(13:08
mgariepyi have a kind of old server to test.13:08
mgariepybut still. shouldn't take an hour to boot haha13:08
mancdazmgariepy the server is still rebooting?13:09
mancdazor the containers haven't started after the reboot?13:09
mgariepyit's longer to boot than to install haha :)13:09
mgariepythey are starting one by one13:09
*** markvoelker has quit IRC13:09
mgariepyi'll reboot to see if it's the same13:27
*** markvoelker_ has quit IRC13:34
*** ccrouch has joined #openstack-ansible13:36
*** Mudpuppy has joined #openstack-ansible13:41
*** Guest9887 has quit IRC13:48
*** markvoelker has joined #openstack-ansible13:49
*** blewis has joined #openstack-ansible13:49
*** blewis is now known as Guest6246513:49
*** Mudpuppy has quit IRC13:50
*** markvoelker has quit IRC13:54
*** markvoelker has joined #openstack-ansible13:59
*** spotz_zzz is now known as spotz14:00
*** Mudpuppy has joined #openstack-ansible14:01
*** Mudpuppy has quit IRC14:01
*** Mudpuppy has joined #openstack-ansible14:02
*** sigmavirus24_awa is now known as sigmavirus2414:06
*** markvoelker has quit IRC14:07
cloudnullmorning14:07
*** markvoelker has joined #openstack-ansible14:14
*** markvoelker has quit IRC14:18
*** galstrom_zzz is now known as galstrom14:20
jwitkogood morning cloudnull!  :)14:21
cloudnullhows it jwitko?14:21
jwitkoah i went to bed shortly after you last night14:21
jwitkoi ran the lxc-containers-destroy, and it destroyed everything14:22
cloudnullyea, i was tired .14:22
jwitkothen i ran the create, and it worked out well once again until starting containers14:22
jwitkoso in the end it failed in the same spot14:22
cloudnullhum...14:22
cloudnulldo all of the bridges exist on the host that are specified in the provider netwoks section ?14:23
*** yaya has joined #openstack-ansible14:23
cloudnullare all of the containers in a "stopped" state ?14:23
cloudnullif so, can you run # lxc-start -n $container_name14:23
jwitkoso I have three bridges that only exist on compute hosts14:23
cloudnullit should kick out some details on why the container will not start14:23
jwitkothey are connected to my SAN14:24
jwitkoso on 2/3 hosts all containers are STOPPED14:24
jwitkoon one host, the memcached container is running with a single IP14:24
cloudnullso your hosts should have all of the networks that are specified in the container_bridge key and are also bound to the groups. IE https://github.com/stackforge/os-ansible-deployment/blob/master/etc/openstack_deploy/openstack_user_config.yml.aio#L2014:25
cloudnullso your os-infra host should have br-mgmt and br-storage if you used the default names.14:26
jwitkolet me make a paste bin14:27
*** yaya has quit IRC14:27
cloudnullyour compute hosts should have br-mgmt, br-storage, br-vlan and br-vxlan14:27
cloudnulland your network hosts should have  br-vlan and br-vxlan14:27
cloudnullagain if you had used all of the default names.14:27
jwitkohttp://pastebin.com/XW3DNEE914:28
jwitkoso basically14:28
*** markvoelker has joined #openstack-ansible14:28
jwitkobr-nfs1, br-iscsi1, br-iscsi2  --- these only exist on compute hosts14:29
jwitkothey are for the storage network14:29
jwitkoi'm not using br-vxlan because I don't plan to have tenant networks14:29
jwitkocloudnull, so I'm guessing my setup is not a functional one14:31
*** markvoelker has quit IRC14:33
*** yaya has joined #openstack-ansible14:35
cloudnullso the config had some syntax issues due to white space. http://paste.openstack.org/show/383824/14:36
cloudnullthis is what it was reading before http://paste.openstack.org/show/383837/14:37
cloudnulland this is what is should read once fixed up http://paste.openstack.org/show/383839/14:38
cloudnullnotice the difference in group_binds being a list14:38
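The whitespace problem is worth seeing concretely: with the wrong indentation, group_binds stops being a YAML list. A hedged before/after sketch of the difference cloudnull is pointing at (bridge and group names illustrative):

    # Broken: group_binds parses as a plain string, not a list
    - network:
        container_bridge: "br-mgmt"
        group_binds: all_containers

    # Fixed: group_binds is a proper list, one host group per entry
    - network:
        container_bridge: "br-mgmt"
        group_binds:
          - all_containers
          - hosts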
cloudnullpalendae, Mudpuppy, sigmavirus24 we really should produce a config scheme validator to help out with these types of issues.14:39
cloudnullsigmavirus24: :)14:39
Mudpuppy:)14:39
palendaecloudnull: I talked to sigmavirus24 about it some yesterday...the thing I had in mind was verifying that the required keys are present, and maybe a syntax check14:39
sigmavirus24cloudnull: yeah yeah yeah14:40
sigmavirus24cloudnull: palendae and I are talking about that for the hackathon14:40
palendaeThough if we have a schema validator, we'll now need to have the config in 3 places - the schema, example files, and docs14:40
*** TheIntern has joined #openstack-ansible14:40
sigmavirus24palendae: if we can generate the docs from the schema that'd be greeeat14:40
cloudnullmaybe we could use https://github.com/sigmavirus24/schema-validator14:40
sigmavirus24also maybe example files14:40
palendaeOr vice-versa14:40
sigmavirus24palendae: nah14:40
MudpuppyIn this case it was valid yaml, but the white space messed stuff up, so pass one should check typos, and pass two syntax14:40
sigmavirus24cloudnull: what is that magic project you speak of14:40
sigmavirus24=P14:41
cloudnulli know, i know, NIH and all14:41
sigmavirus24the example schema and such are probably woefully out of date14:41
sigmavirus24=/14:41
palendaeworks for juno!14:41
cloudnullsigmavirus24: likely14:41
cloudnullthis is something that we should invest time in14:41
cloudnullIMO14:41
palendaeYeah14:42
cloudnullnow just let me find that extra time....14:42
palendaecloudnull: I was thinking about writing a spec14:42
palendaeBut yeah14:42
cloudnulltypie typie14:42
sigmavirus24cloudnull: I'd totally work on this today14:42
sigmavirus24but those days between sprints we were promised disappeared =P14:42
palendaeAlso rpc-openstack grew a yaml-mergerating script that I think would be generally useful to anyone extending openstack-ansible14:42
cloudnullthese are not the days you were looking for14:42
sigmavirus24cloudnull: yep14:42
cloudnullpalendae:  as long as its called the extenderator im game14:43
palendaeI'm for the name changing if it gets into upstream14:43
palendaeextenderatoring selfie-stick14:43
* cloudnull will +2 for more erators14:44
jwitkocloudnull, ok i believe i've fixed the syntax.  not sure how it got like that14:44
cloudnulljwitko:  it happens ,14:44
jwitkois there any issue with my bridges?  and having storage bridges that should only be on storage hosts?14:44
cloudnulleverything else looks fine14:44
jwitkoalright I'll run the destroy again14:45
jwitkoand then the setup-hosts14:45
cloudnulltry running lxc-container-create again14:45
jwitkowouldn't that formatting issues create some problems with the config generation ?14:46
cloudnullyea, and being that you have nothing deployed presently, its probably best to nuke it.14:47
cloudnullid also remove /etc/openstack_deploy/openstack_inventory.json14:47
cloudnullonce you do the container destroy14:47
cloudnullwhich will be like starting inventory fresh14:47
cloudnullCores https://review.openstack.org/#/c/202268/ please review and do the needfuls, if you think its a good change that is. I've done 9 tests, number 10 is in progress and so far I've had 100% success.14:50
cloudnullonly 2 of the tests were done with the successerator reenabled.14:50
*** markvoelker has joined #openstack-ansible14:50
cloudnullitll be 3 if the last one is good too. but in all it looks like the change is helping general performance and ssh instabilities (even without the successerator).14:52
jwitkocloudnull, sorry can you explain what you mean by nuke it?   just run the lxc-destroy and delete the json ?   Then am I running the setup-hosts.yml or the lxc-create yml ?14:54
cloudnullrun lxc-container-destroy.yml14:54
cloudnullrm the json14:54
cloudnulland then rerun lxc-container-create14:54
jwitkoah ok14:54
cloudnullthe json is rerendered on every run, but the inventory generator preserves data within it for items already in inventory. so removing it is like starting from a clean state.14:55
jwitkocool, thank you i'll try that now14:55
odyssey4mecloudnull lol, a nice sneak attack there with MAX_RETRIES14:55
cloudnullbut destroy your containers first .14:55
cloudnullodyssey4me: i added it after the meeting yesterday14:56
odyssey4mecloudnull that perhaps should be a separate patch, don't you think?14:56
cloudnullif you think it best.14:56
*** markvoelker has quit IRC14:57
*** markvoelker_ has joined #openstack-ansible14:57
*** markvoelker_ has quit IRC14:57
odyssey4methat way if we need to revert the other one, or revert this one - they don't interfere with each other14:57
*** markvoelker has joined #openstack-ansible14:57
cloudnullok one sec14:57
odyssey4meno need for a bug on the successerator one14:57
odyssey4melol, you have no bug on that one anyway :p14:58
cloudnullnope14:58
openstackgerritKevin Carter proposed stackforge/os-ansible-deployment: Container create/system tuning  https://review.openstack.org/20226814:59
openstackgerritKevin Carter proposed stackforge/os-ansible-deployment: Adds retries  https://review.openstack.org/20306315:00
*** jaypipes has joined #openstack-ansible15:01
cloudnulldone15:01
odyssey4meand done15:05
*** sdake has quit IRC15:13
cloudnullin related news please vote http://lists.openstack.org/pipermail/openstack-dev/2015-July/069857.html15:15
cloudnullpalendae:  https://review.openstack.org/#/c/202821 is a fix for an upgrade issue that bjorne was seeing15:17
palendaecloudnull: Ok, thanks. I'll look at it in a moment15:18
palendaeWhen I find what exchange has done with my openstack-dev mails15:18
palendaecloudnull: Another thing I kinda wanted to spec out - bash linting on run-upgrades.sh15:20
palendaeBut for valid linting on that, we'd need to do yaml and python, too15:20
*** alextricity has joined #openstack-ansible15:21
cloudnull+115:21
jwitkocloudnull,  failure  :(15:21
cloudnullsame thing ?15:21
jwitkoall containers reporting same error:15:21
jwitkofailed: [os-ctrl1_utility_container-062f5209 -> os-ctrl1] => {"error": "Failed to start container [ os-ctrl1_utility_container-062f5209 ]", "failed": true, "lxc_container": {"init_pid": -1, "interfaces": [], "ips": [], "state": "stopped"}, "rc": 1}15:21
jwitkomsg: The container [ %s ] failed to start. Check to lxc is available and that the container is in a functional state.15:21
cloudnullcan you do # lxc-start -n os-ctrl1_utility_container-062f520915:22
cloudnullwhat error is it reporting ?15:22
jwitkohttp://paste.openstack.org/show/383961/15:26
*** jaypipes is now known as blockedpipes15:27
*** annashen has joined #openstack-ansible15:28
cloudnulllooks like a network problem15:28
*** sdake has joined #openstack-ansible15:28
cloudnulljwitko:  what bridge is vethT9STR7 connected to ?15:29
cloudnullcurious, do you have multiple eth0s ?15:29
jwitkoshouldn't?15:29
* cloudnull going back to look at your config15:29
*** daneyon has joined #openstack-ansible15:31
cloudnullthe container_interface is the name of the interface within the container.15:31
cloudnulland is created by the lxc-container-create play.15:31
jwitkoso I can't find that interface15:32
jwitkolet me know if you need me to paste any updated files15:33
cloudnull# /var/lib/lxc/os-ctrl1_utility_container-062f5209/config15:33
*** weezS has joined #openstack-ansible15:33
*** annashen has quit IRC15:35
jwitkohttp://paste.openstack.org/show/383963/15:36
cloudnull in your config change line 3015:37
cloudnull# /container_interface: "eth0"/container_interface: "eth1"15:37
cloudnullthe base lxc system is using the default lxc network interface15:38
cloudnullwhich is eth015:38
cloudnulland your managemant network interface is being instructed to run on eth015:38
cloudnullso they're conflicting.15:38
jwitkooh ok15:39
cloudnullyou may have the same issue on line 76 too15:39
jwitkoso this eth1 will live only inside the containers?15:39
cloudnullyes15:39
cloudnullline 76 to eth215:39
cloudnullline 85 to eth315:39
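The fix in context: eth0 inside each container is taken by the default lxc network, so every provider network needs a distinct container-side interface starting at eth1. A sketch using the bridge names from jwitko's config (which bridge pairs with which interface here is illustrative, and other keys are omitted):

    provider_networks:
      - network:
          container_bridge: "br-mgmt"
          container_interface: "eth1"   # was eth0, which clashed with the lxc default NIC
      - network:
          container_bridge: "br-nfs1"
          container_interface: "eth2"
      - network:
          container_bridge: "br-iscsi1"
          container_interface: "eth3"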
cloudnulli shouldve caught that before sorry15:39
jwitkono worries!15:40
jwitkoit is me who should be apologizing lol15:40
cloudnullnah its us , we need moar better docs15:40
jwitkoso just to confirm15:40
cloudnullSam-I-Am:  ^ typie typie15:40
jwitkothose changes are in openstack_user_config.yml15:41
cloudnullyes15:41
jwitkothen i will destroy containers, remove json file, and create containers15:41
cloudnullyou should be able to make the change and rerun the lxc-container-create15:41
jwitkoSam-I-Am is a king among men  :)15:41
jwitkoi bow to his docs15:41
cloudnulllol15:42
jwitkocloudnull yay it made it past the error  :)15:51
jwitkoi have to go to the dentist, but I'm sure i'll be in here again today poking you more15:51
jwitkothanks again for the help15:51
cloudnullanytime15:52
*** CheKoLyN has joined #openstack-ansible15:52
cloudnullhave fun at the dentist15:52
cloudnull;)15:52
Sam-I-Amlol dentists15:54
Sam-I-Ambetter than or worse than neutron :)15:54
MudpuppyNightmare: Your dentist is a neutron contributor15:56
palendaeSounds right15:57
openstackgerritJesse Pretorius proposed stackforge/os-ansible-deployment: Keystone Federation Service Provider Configuration  https://review.openstack.org/19439516:02
odyssey4memiguelgrinberg ^16:03
*** vdo has quit IRC16:03
*** eglute has quit IRC16:04
*** eglute has joined #openstack-ansible16:05
*** yaya has quit IRC16:06
*** weezS has quit IRC16:06
*** sacharya has joined #openstack-ansible16:06
*** yaya has joined #openstack-ansible16:09
*** alop has joined #openstack-ansible16:12
*** yaya has quit IRC16:13
*** alop_ has joined #openstack-ansible16:15
*** alop has quit IRC16:16
*** alop_ is now known as alop16:16
*** annashen has joined #openstack-ansible16:16
*** TheIntern has quit IRC16:22
*** eglute has quit IRC16:23
*** eglute has joined #openstack-ansible16:24
openstackgerritMerged stackforge/os-ansible-deployment: Adding missing role for maas user creation  https://review.openstack.org/19963916:32
*** eglute has quit IRC16:33
*** eglute has joined #openstack-ansible16:34
*** annashen has quit IRC16:36
*** annashen has joined #openstack-ansible16:40
*** sigmavirus24 has quit IRC16:52
jwitkocloudnull16:53
jwitkoSo i have returned and it appears while some of the containers progressed past that step, not all did16:53
jwitkothe network name is still eth0 it appears16:54
*** sigmavirus24 has joined #openstack-ansible16:54
jwitkothe error from starting the container is a little different16:55
jwitkohttp://paste.openstack.org/show/384175/16:56
*** sigmavirus24 has quit IRC16:57
cloudnulljwitko: try to run # ansible hosts -m shell -a '/usr/local/bin/lxc-system-manage veth-cleanup'16:58
*** alop has quit IRC16:58
cloudnullfrom your deployment box16:58
*** sigmavirus24 has joined #openstack-ansible17:00
jwitkocloudnull, ok ran it.  the boxes that failed had a ton of "cannot find device" output17:01
jwitkofor the veth*17:01
jwitkoshould i just run the lxc-containers-create.yml in its entirety again ?17:01
*** alop has joined #openstack-ansible17:02
cloudnullyea give it a go17:02
jwitkoso is that just some left over meta data?17:02
jwitkoand the lxc has built in scripts to clean it up?17:02
cloudnullit is left over bits from earlier builds17:03
openstackgerritgit-harry proposed stackforge/os-ansible-deployment: Serialise rabbitmq playbook to allow upgrades  https://review.openstack.org/20268117:03
cloudnullwe wrote that tool to do system clean up and such because it was a problem we've run into in the past17:03
jwitkodamn17:09
jwitkosome containers still failed to start17:09
cloudnullwhat are the new errors ?17:09
jwitkook17:09
jwitkoso this one is looking for a bridge that is only available on compute servers17:09
jwitkoits the bridge that handles NFS storage]17:10
jwitkoso in my provider networks i have 3 SAN networks listed17:10
jwitkonfs, iscsi1, iscsi217:10
jwitkothese are only meant to be applied to cinder/glance hosts17:10
jwitkooh, i think i see the problem17:12
odyssey4mecloudnull it would seem that this needs an update to a later sha: https://review.openstack.org/19912617:15
odyssey4meit blew out on neutron's db sync - duplicate serial17:15
cloudnullyea master is not happy today17:16
cloudnullor rather the last few days.17:16
cloudnullupstream master that is17:16
*** alextricity has quit IRC17:17
*** annashen has quit IRC17:17
odyssey4meperhaps best to leave it at the current sha for a bit then - we have enough gating shenanigans17:21
cloudnullagreed.17:21
cloudnull100%17:22
*** daneyon has quit IRC17:23
*** harlowja has quit IRC17:26
*** harlowja has joined #openstack-ansible17:26
openstackgerritMerged stackforge/os-ansible-deployment: Adjust key distribution mechanism for Swift  https://review.openstack.org/19999217:26
*** annashen has joined #openstack-ansible17:27
jwitkohey cloudnull, do you know how many IPs each server takes up with containers?17:35
*** eglute has quit IRC17:35
*** eglute has joined #openstack-ansible17:36
cloudnullit depends on the infra, but it can be a bunch. generally on *infra hosts you're looking at somewhere around 32 containers per host17:41
sigmavirus24cloudnull: have we pinned down which upstream project is unhappy?17:52
cloudnullneutron so far17:52
*** metral is now known as metral_zzz17:53
sigmavirus24Can we upgrade everything but neutron?17:53
* sigmavirus24 is just curious17:53
sigmavirus24Not necessary17:53
cloudnulli've not looked since i rev'd the commit a few days ago17:54
* sigmavirus24 is curious17:55
openstackgerritMiguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Identity Provider Configuration  https://review.openstack.org/19425918:04
openstackgerritMiguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Identity Provider Configuration  https://review.openstack.org/19425918:05
openstackgerritMiguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Identity Provider Configuration  https://review.openstack.org/19425918:08
*** TheIntern has joined #openstack-ansible18:19
odyssey4mesigmavirus24 you've been seeing galera failures in builds too, right?18:29
sigmavirus24odyssey4me: yep18:29
odyssey4mehas it only been with affinity: 1, or with larger affinity?18:29
sigmavirus24default affinity18:30
sigmavirus24I never think to set the affinity to 118:30
sigmavirus24hasn't happened for the last couple builds though18:30
odyssey4mewas it on re-runs of builds?18:30
odyssey4meand are you using scripts/run-playbooks?18:30
sigmavirus24no I've gotten into the habit of doing `nova rebuild <server> <image> --poll` then setting things up from scratch18:31
sigmavirus24the reasons the keystone v3 patch kept failing was because I was working off an existing build18:31
sigmavirus24so I wasn't seeing all the problems that the gate was18:31
jwitkocloudnull, finally got the setup-hosts.yml playbook to complete without error  :)18:32
mgariepyodyssey4me, mancdaz : the containers start very slowly because of a bug in lxc-autostart.18:32
*** markvoelker has quit IRC18:32
mgariepyodyssey4me, mancdaz  http://paste.ubuntu.com/11893981/18:33
odyssey4mesigmavirus24 interesting - I had set affinity: 1 and was getting issues over and over again, so trying now with the standard setting18:33
odyssey4meThis may be why though: https://review.openstack.org/20005418:33
odyssey4menot a bad review/patch, but it does mean that previously we never had restarts - and now we do18:34
sigmavirus24odyssey4me: blame hughsaunders then =P18:34
odyssey4mesigmavirus24 that's git-harry :p18:34
sigmavirus24well git-harry didn't approve it18:34
sigmavirus24=P18:34
* sigmavirus24 is only kidding18:34
jwitkocloudnull, during the verification however i am experiencing issues18:35
odyssey4meit's likely that during the restarts the cluster is getting all mixed up - perhaps we need some sort of serial restart, instead of simultaneous restarts18:35
jwitkothe galera container doesn't seem to recognize the mysql command18:35
jwitkoin the process list i don't see mysql running either18:35
jwitko(mariadb)18:35
odyssey4mesigmavirus24 ah, that patch makes it run in serial :/18:35
sigmavirus24odyssey4me: oh okay18:37
sigmavirus24odyssey4me: similar patch with redis is up for review18:37
palendaeredis or rabbitmq?18:38
palendaehttps://review.openstack.org/#/c/202681/4 is the rabbitmq one18:38
palendaeNot necessarily exactly the same, but making sure that we restart rabbit cluster members in the correct order after upgrades18:39
sigmavirus24sorry18:39
* sigmavirus24 's brain is fried18:39
palendaeSimilar approach, at least18:40
odyssey4meyup, I'm a fan of the approach18:43
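The approach being endorsed: a play with serial: 1 completes on one node before moving to the next, so a clustered service is never restarted everywhere at once. A minimal hedged sketch of the pattern (group, service, and task names are illustrative, not the actual patch under review):

    - name: Rolling restart of galera cluster members
      hosts: galera_all
      serial: 1                 # run the whole play on one host at a time
      max_fail_percentage: 0    # stop immediately if any node fails
      tasks:
        - name: Restart mariadb on this node only
          service:
            name: mysql
            state: restarted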
*** TheIntern has quit IRC18:59
*** TheIntern has joined #openstack-ansible19:02
jwitkohas anyone ever had an issue where a container didn't seem to install properly?19:03
palendaejwitko: Can you elaborate?19:04
jwitkoi'm attempting to validate the setup-hosts.yml run as the guide instructs, and while attached to my galera container there is no mariadb process running and no mysql binary to use to attach to a DB19:04
palendaeI'm seeing containers failing to start right now (with a juno install) because cgmanager claims there are 100 cgroups with that name19:04
jwitkomine is the kilo19:04
palendaeAh19:04
palendaeI don't think I've seen that behavior before19:05
odyssey4mejwitko you haven't yet installed anything - only the containers are created19:07
jwitkothanks odyssey4me, cloudnull just helped me realize that19:08
odyssey4meyou've only done the setup-hosts play, right, and haven't yet run the setup-infrastructure play19:08
palendaeD'oh, yeah19:08
palendaeYep, need to do the next step19:08
odyssey4me:) palendae needs rest, as do we all :)19:08
mgariepyodyssey4me, mancdaz  fix will be pushed in lxc 1.1.3 next week.19:09
palendaemgariepy: Cool - I was just noticing the slow autostart, too. Thanks for that update!19:10
cloudnullmgariepy: which fix was that ?19:11
mgariepyhttps://github.com/lxc/lxc/compare/7faa223603a8...76e484a7093c19:11
cloudnullah. the auto start bits19:11
mgariepylxc-autostart -L -g onboot was duplicating some entries..13:11
mgariepydon't you guys need to reboot your servers ? some times ? haha19:14
palendaemgariepy: !19:15
palendaemgariepy: So would that have caused some issues where a container had 100 cgroups?19:15
mgariepybooting a node with 21 containers, it slept 15 seconds 231 times before the last one started.19:16
palendaeOk, so not the same thing19:16
git-harrysigmavirus24: odyssey4me you approve it, you buy it.19:29
odyssey4mecloudnull sigmavirus24 miguelgrinberg so the issue I've been fighting seems to relate to horizon's keystoneclient being redirected to the internal endpoint every time, even though I've configured horizon to use the public endpoint19:30
odyssey4mefor some reason keystone seems to tell the client to use the internal endpoint (http) instead of the public one (https)19:31
miguelgrinbergwhat part of the auth flow is this?19:31
miguelgrinbergare we past all the redirects at this point?19:32
*** sdake has quit IRC19:32
sigmavirus24odyssey4me: interesting19:32
odyssey4memiguelgrinberg yes, we're already past shibboleth19:32
odyssey4meso apache is no longer involved - it is now just horizon's auth backend19:33
miguelgrinbergthe final redirect takes us into a keystone endpoint, and then keystone needs to redirect back to horizon. Is that last one between keystone and horizon that is having this problem?19:34
odyssey4memiguelgrinberg join #openstack-keystone for the running analysis - here's a log to peek at: http://paste.openstack.org/show/7LIzjZ09I8bRoVuOl7as/19:35
*** galstrom is now known as galstrom_zzz19:36
*** wmlynch has quit IRC19:41
odyssey4meso I can fudge this by making keystone's apache change any json/xml content that comes through the public endpoint... but that's less desirable than having this work right19:43
*** TheIntern has quit IRC19:43
*** ig0r_ has quit IRC19:47
*** sacharya has quit IRC19:56
*** galstrom_zzz is now known as galstrom20:03
*** TheIntern has joined #openstack-ansible20:14
odyssey4memiguelgrinberg sigmavirus24 silly me, I should have set public_endpoint in keystone.conf to tell it how to present itself to clients :p20:33
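keystone's public_endpoint option lives in the [DEFAULT] section of keystone.conf and controls the base URL keystone advertises to clients in its responses. A hedged sketch of setting it with a plain Ansible task (the path and URL are illustrative; the actual patch may drive this from a template variable instead):

    - name: Tell keystone how to present itself to clients
      ini_file:
        dest: /etc/keystone/keystone.conf
        section: DEFAULT
        option: public_endpoint
        value: "https://test1.pigeonbrawl.net:5000/"
      # keystone (running under apache here) needs a restart to pick this up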
cloudnullcan a core push the button on this https://review.openstack.org/#/c/203063/20:34
miguelgrinbergodyssey4me: oh, that was it?20:34
odyssey4memiguelgrinberg yep - now one more issue20:34
odyssey4meI find that on the first auth I get redirected to keystone's public endpoint instead of back to horizon - need to figure that one out20:34
miguelgrinbergodyssey4me: isn't keystone supposed to take the redirect and then send you to horizon? I think that's how it is supposed to work20:36
odyssey4memiguelgrinberg yep, and that happens on the second auth - but not the first20:37
miguelgrinbergwhat do you mean by "first auth"?20:38
miguelgrinbergodyssey4me: I'm using the etherpad as a reference for the auth flow, lines 229-234.20:39
odyssey4meaccess horizon, choose to use the idp, login to the idp, the redirect from the idp goes back to keystone's endpoint20:39
odyssey4mein the same session, go back to the login page and retry - this time I go through to the summary page20:40
odyssey4metry it for yourself: https://test1.pigeonbrawl.net20:40
miguelgrinbergokay, so you mean multiple login attempts20:40
miguelgrinbergodyssey4me: and the problem is the JSON blob that appears on the browser?20:41
odyssey4memiguelgrinberg that's the keystone endpoint simply responding to a normal GET20:41
odyssey4mebut yes, that's it20:41
odyssey4meif you go back in the same browser session, re-auth, then it'll go through to the horizon summary page20:42
miguelgrinbergodd, it doesn't for me, I keep getting the JSON page20:42
odyssey4memiguelgrinberg every time?20:43
miguelgrinbergit alternates between that JSON page and an error "unable to retrieve authorized projects"20:44
miguelgrinbergare you sure this redirect isn't set incorrectly in your testshib config?20:45
odyssey4memiguelgrinberg oh - you're still seeing that error? that's odd20:46
odyssey4memiguelgrinberg do you have strict cookies or something set?20:47
miguelgrinberglet me try another browser, I'm using chrome with default settings now20:47
miguelgrinbergwell, safari worked the first time20:48
miguelgrinbergand the second time I get the JSON page20:48
odyssey4meso this inconsistency is what I need to figure out20:49
odyssey4methe redirecting isn't right every time20:49
miguelgrinbergodyssey4me: clearly there are a few redirects in sequence during the auth flow, but it appears this page comes directly from a redirect from testshib20:52
openstackgerritKevin Carter proposed stackforge/os-ansible-deployment: Change to ensure container networks are up  https://review.openstack.org/20282120:52
openstackgerritKevin Carter proposed stackforge/os-ansible-deployment: Change to ensure container networks are up  https://review.openstack.org/20282120:53
odyssey4memiguelgrinberg yes, but I think it uses either something from the metadata or from the original page referring it20:54
miguelgrinbergodyssey4me: did you have to enter the root keystone URL in testshib?20:54
odyssey4meI saw this issue without SSL, and without SSL I'll be able to sniff the data between them so maybe that's the way to get that sorted20:54
odyssey4memiguelgrinberg nope20:55
miguelgrinbergI wonder where it is coming from then20:55
odyssey4meonly the metadata, so this: https://test1.pigeonbrawl.net:5000/Shibboleth.sso/Metadata20:55
odyssey4meI think it's a referrer20:55
miguelgrinbergthis sounds painful, but maybe we need to use wireshark to figure out exactly the traffic20:56
odyssey4memiguelgrinberg yeah - suffice to say that this can wait until monday - the big hurdle has been resolved20:57
miguelgrinbergodyssey4me: yes, sure. I submitted all the IdP fixes earlier today, still waiting for the gate, but I think it'll go through20:59
*** galstrom is now known as galstrom_zzz21:00
miguelgrinbergodyssey4me: if you add my key to your SP host I can take a look at the config, maybe a set of fresh eyes will help21:03
odyssey4memiguelgrinberg I'm looking at marekd's comment here: https://review.openstack.org/#/c/194395/29/playbooks/roles/os_keystone/templates/keystone-httpd.conf.j2,cm21:04
openstackgerritKevin Carter proposed stackforge/os-ansible-deployment: Change to ensure container networks are up  https://review.openstack.org/20282121:05
miguelgrinbergodyssey4me: maybe he means the saml2 portion should be *, to not make it hardcoded to saml221:05
miguelgrinbergodyssey4me: yeah, see how they do it in the docs: http://docs.openstack.org/developer/keystone/federation/shibboleth.html21:06
odyssey4memiguelgrinberg the WSGIScriptAliasMatch line is different21:07
odyssey4mewhich makes sense - it tells keystone to handle any protocol21:07
odyssey4mebut shibboleth should only handle saml, because it doesn't do other protocols21:07
miguelgrinbergah sorry, looked at the wrong line, yes21:07
odyssey4meyeah, I don't think that needs to be different - will have to chat to him to find out what he means21:08
odyssey4mehe's unfortunately just jumped onto a plane21:09
miguelgrinbergmaybe he confused it with the alias line, like I did21:09
odyssey4memiguelgrinberg I see some tweaks which we can add in the adfs patch sets which make it logout properly, etc - but for now I think once you've added the key distribution we're good21:12
odyssey4mewe're good for non-ssl anyway21:12
odyssey4meI can do the ssl-related stuff in a subsequent patch21:13
miguelgrinbergsounds good21:14
miguelgrinbergI'll have the shib cert distribution soon, working on that now21:14
odyssey4meI think I should probably kill or shutdown these servers now - I've happily been sharing a lot of info about them :p21:15
odyssey4memiguelgrinberg both the IDP and SP patches need a DocImpact tag, and we should probably put more details in the commit message about what the patch does, any upgrade impact, etc21:26
miguelgrinbergokay, I can attempt that once I have the cert distribution21:27
odyssey4memiguelgrinberg thanks, see the commit message in https://review.openstack.org/202977 for an idea of what I think is good practice21:28
miguelgrinbergokay, will do21:28
odyssey4meI think it makes it easier for reviewers to get the gist of what's going on.21:28
odyssey4methanks, have a great weekend! :)21:28
openstackgerritKevin Carter proposed stackforge/os-ansible-deployment: Change to ensure container networks are up  https://review.openstack.org/20282121:29
miguelgrinbergyou too! :)21:29
*** blockedpipes has quit IRC21:29
cloudnulllater odyssey4me!21:31
cloudnullhave a  good one21:31
*** spotz is now known as spotz_zzz21:44
marekdmiguelgrinberg: odyssey4me just responded - my 1st comment was wrong.21:48
miguelgrinbergmarekd: odyssey4me is gone for the day I think, what was this about?21:50
marekdmiguelgrinberg: the thing you discussed about protocol '*'21:51
miguelgrinbergmarekd: ah okay, good, we guessed that was the case21:52
odyssey4memarekd :) thanks for getting back to us, and thank you for your help earlier!21:53
*** Mudpuppy has quit IRC21:54
openstackgerritMerged stackforge/os-ansible-deployment: Adds retries  https://review.openstack.org/20306321:59
marekdodyssey4me: i will be in the eu tz again from monday.22:01
odyssey4memarekd awesome - where are you now?22:01
*** CheKoLyN has quit IRC22:02
openstackgerritKevin Carter proposed stackforge/os-ansible-deployment: Adds retries  https://review.openstack.org/20325722:03
marekdodyssey4me: Boston Int Airport22:04
marekdwaiting for my flight22:04
marekdkilling time on IRC.22:05
marekd:P22:05
openstackgerritKevin Carter proposed stackforge/os-ansible-deployment: Fix general upgrade issues for Juno > Kilo  https://review.openstack.org/20282122:06
odyssey4memarekd have a good flight - time for me to enjoy a glass of wine and spend some time with my wife :)22:08
odyssey4menight all, for good this time :p22:08
odyssey4mecloudnull have a great weekend!22:08
cloudnullwill do  and you too!22:08
marekdodyssey4me: sure thing!22:09
openstackgerritMiguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Service Provider Configuration  https://review.openstack.org/19439522:22
openstackgerritMiguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Identity Provider Configuration  https://review.openstack.org/19425922:30
*** annashen has quit IRC23:06
*** annashen has joined #openstack-ansible23:08
*** metral_zzz is now known as metral23:32
openstackgerritMerged stackforge/os-ansible-deployment: Container create/system tuning  https://review.openstack.org/20226823:51
*** alop has quit IRC23:57
