Wednesday, 2017-09-27

*** tongl has joined #openstack-lbaas00:49
*** mugsie has quit IRC00:51
*** fnaval has quit IRC00:56
*** yamamoto has quit IRC01:03
*** yamamoto_ has joined #openstack-lbaas01:03
*** leyal has quit IRC01:31
*** leyal has joined #openstack-lbaas01:33
*** bzhao has joined #openstack-lbaas01:39
*** bbzhao has joined #openstack-lbaas01:48
*** leitan has joined #openstack-lbaas02:06
openstackgerritLingxian Kong proposed openstack/octavia-tempest-plugin master: [WIP] Create scenario tests for listeners  https://review.openstack.org/49231102:14
*** SumitNaiksatam has joined #openstack-lbaas02:18
*** aojea has joined #openstack-lbaas03:17
*** aojea has quit IRC03:22
openstackgerritLingxian Kong proposed openstack/octavia-tempest-plugin master: [WIP] Create scenario tests for listeners  https://review.openstack.org/49231103:28
*** Yipei has joined #openstack-lbaas03:46
*** ianychoi_ has joined #openstack-lbaas04:20
*** ianychoi has quit IRC04:29
*** m-greene_ has quit IRC04:32
*** m-greene_ has joined #openstack-lbaas04:35
*** sanfern has joined #openstack-lbaas04:38
*** belharar has joined #openstack-lbaas04:40
*** armax has joined #openstack-lbaas04:42
*** Alex_Staf has joined #openstack-lbaas04:54
*** leitan has quit IRC05:11
openstackgerritRajat Sharma proposed openstack/octavia master: Replace 'manager' with 'os_primary' and 'os_adm' with 'os_admin'  https://review.openstack.org/47839905:22
*** aojea has joined #openstack-lbaas05:42
*** gcheresh_ has joined #openstack-lbaas05:44
*** ltomasbo has quit IRC05:49
openstackgerritPradeep Kumar Singh proposed openstack/octavia master: Add flavor, flavor_profile table and their APIs  https://review.openstack.org/48649906:12
*** armax has quit IRC06:19
*** Alex_Staf has quit IRC06:20
*** ltomasbo has joined #openstack-lbaas06:21
openstackgerritBar RH proposed openstack/octavia master: [WIP] Assign bind_host ip address per amphora  https://review.openstack.org/50515806:27
*** eezhova has joined #openstack-lbaas06:36
*** rcernin has joined #openstack-lbaas06:39
*** armax has joined #openstack-lbaas06:41
*** yamamoto_ has quit IRC06:41
*** yamamoto has joined #openstack-lbaas06:45
*** armax has quit IRC06:50
*** yamamoto_ has joined #openstack-lbaas07:04
*** yamamoto has quit IRC07:04
*** yamamoto_ has quit IRC07:10
*** tongl has quit IRC07:12
*** yamamoto has joined #openstack-lbaas07:18
*** eezhova has quit IRC07:19
*** tesseract has joined #openstack-lbaas07:21
*** Alex_Staf has joined #openstack-lbaas07:41
*** eezhova has joined #openstack-lbaas07:42
*** eezhova_ has joined #openstack-lbaas07:45
*** eezhova has quit IRC07:47
*** Yipei has left #openstack-lbaas08:40
*** chlong has quit IRC08:49
openstackgerritPradeep Kumar Singh proposed openstack/octavia master: Add flavor, flavor_profile table and their APIs  https://review.openstack.org/48649908:51
openstackgerritBar RH proposed openstack/octavia master: [WIP] Assign bind_host ip address per amphora  https://review.openstack.org/50515809:00
*** yamamoto has quit IRC09:19
*** salmankhan has joined #openstack-lbaas09:23
*** numans has quit IRC09:25
*** numans has joined #openstack-lbaas09:28
openstackgerritPradeep Kumar Singh proposed openstack/octavia master: Add flavor, flavor_profile table and their APIs  https://review.openstack.org/48649909:30
openstackgerritBar RH proposed openstack/octavia master: [WIP] Assign bind_host ip address per amphora  https://review.openstack.org/50515809:55
*** yamamoto has joined #openstack-lbaas10:20
*** eezhova__ has joined #openstack-lbaas10:28
*** eezhova__ has quit IRC10:28
*** salmankhan has quit IRC10:29
*** eezhova_ has quit IRC10:31
*** atoth has quit IRC10:35
openstackgerritLingxian Kong proposed openstack/octavia-tempest-plugin master: Create scenario tests for listeners  https://review.openstack.org/49231110:38
*** salmankhan has joined #openstack-lbaas10:39
*** apuimedo_ has joined #openstack-lbaas10:43
*** apuimedo has quit IRC10:45
*** apuimedo_ is now known as apuimedo10:45
*** sanfern has quit IRC10:54
*** eezhova has joined #openstack-lbaas11:20
*** strigazi has quit IRC11:23
*** strigazi has joined #openstack-lbaas11:24
*** pcaruana has joined #openstack-lbaas11:27
*** atoth has joined #openstack-lbaas11:29
*** sanfern has joined #openstack-lbaas12:32
nmagnezio/12:44
openstackgerritOpenStack Proposal Bot proposed openstack/neutron-lbaas master: Updated from global requirements  https://review.openstack.org/50663812:48
openstackgerritOpenStack Proposal Bot proposed openstack/neutron-lbaas-dashboard master: Updated from global requirements  https://review.openstack.org/50466012:48
*** leitan has joined #openstack-lbaas12:56
*** belharar has quit IRC12:57
*** belharar has joined #openstack-lbaas12:58
*** chlong has joined #openstack-lbaas13:29
*** chlong has quit IRC13:31
*** belharar has quit IRC13:34
*** Alex_Staf has quit IRC13:36
*** rtjure has quit IRC13:58
*** sanfern has quit IRC13:59
*** sanfern has joined #openstack-lbaas14:00
*** rtjure has joined #openstack-lbaas14:03
*** belharar has joined #openstack-lbaas14:08
*** yamamoto has quit IRC14:10
*** ipsecguy_ has joined #openstack-lbaas14:13
*** ipsecguy has quit IRC14:14
*** yamamoto has joined #openstack-lbaas14:15
*** yamamoto has quit IRC14:20
*** tongl has joined #openstack-lbaas14:27
johnsomo/14:28
*** tongl has quit IRC14:30
openstackgerritMerged openstack/neutron-lbaas-dashboard master: Updated from global requirements  https://review.openstack.org/50466014:34
*** dayou has quit IRC15:01
*** longkb_ has joined #openstack-lbaas15:01
*** bbzhao has quit IRC15:03
*** bbzhao has joined #openstack-lbaas15:03
*** yamamoto has joined #openstack-lbaas15:10
*** gcheresh_ has quit IRC15:11
*** chlong has joined #openstack-lbaas15:17
*** eezhova has quit IRC15:18
xgerman_o/ - not sure if I am able to make the meeting but feel free to summon me if needed ;-)15:33
johnsomOk15:35
*** tongl has joined #openstack-lbaas15:37
*** rcernin has quit IRC15:43
openstackgerritBar RH proposed openstack/octavia master: [WIP] Assign bind_host ip address per amphora  https://review.openstack.org/50515815:49
johnsomHmmm, I think I just reproduced the "404" issue locally.  I have a nova vm, but the interface isn't in the VM15:54
nmagnezijohnsom, in case I won't make it to the meeting, I voted in the poll (the best available option for me is to revert back to the old timing)15:58
nmagnezijohnsom, oh, and hi :)15:58
johnsomOk, great, I was going to ping you to vote15:59
nmagnezijohnsom, btw as for my plugin.sh (and more) patch, for some reason when I stack with this patch it fails to spawn vms with: {u'message': u"Host 'rdocloud-devstack2' is not mapped to any cell", u'code': 400, u'created': u'2017-09-27T11:31:42Z'}16:01
nmagneziso not sure if that's related, going to use a clean setup to restack16:01
johnsomOk, yeah, that sounds like a nova setup issue, probably unrelated16:02
nmagnezijohnsom, btw in the story I asked you a question about that DVR comment you asked for16:02
nmagneziyup. i think the same.16:02
johnsomDo you have a link to the story?16:02
*** atoth has quit IRC16:03
nmagneziyes, sec16:03
johnsomThanks, trying to dig into why I have an instance with a missing network interface.  Nova/neutron say it's there, but linux in the vm doesn't see it.16:04
nmagnezijohnsom, https://storyboard.openstack.org/#!/story/2001183#comment-1746116:04
*** gcheresh_ has joined #openstack-lbaas16:07
johnsomnmagnezi Thanks, commented16:08
nmagnezijohnsom, thanks! was that issue resolved later? (I want to specify this in the comment)16:09
johnsomIt was fixed in the Pike release16:10
johnsomPrior to Pike it has always been broken when using DVR16:10
nmagneziack. thanks!16:13
nmagnezijohnsom, btw fixed even the nova commands, so I want those bonus points.16:13
johnsomCool!16:14
johnsom500+16:14
*** gcheresh_ has quit IRC16:22
*** tesseract has quit IRC16:24
*** sshank has joined #openstack-lbaas16:39
*** SumitNaiksatam has quit IRC16:40
*** JudeC has joined #openstack-lbaas16:53
*** yamamoto has quit IRC16:55
*** dayou has joined #openstack-lbaas16:57
*** longstaff has joined #openstack-lbaas16:58
*** gans has joined #openstack-lbaas16:59
*** rm_mobile has joined #openstack-lbaas16:59
*** yamamoto has joined #openstack-lbaas16:59
*** eezhova has joined #openstack-lbaas17:01
*** longstaff has quit IRC17:07
*** longstaff has joined #openstack-lbaas17:10
*** pcaruana has quit IRC17:14
*** yamamoto has quit IRC17:31
*** yamamoto has joined #openstack-lbaas17:31
*** gans has quit IRC17:32
*** rm_mobile has quit IRC17:49
*** sshank has quit IRC17:54
*** jniesz has joined #openstack-lbaas18:00
johnsomjniesz Ok with next week or want to chat here?18:02
rm_workwait, i just read the flavor spec and it said flavors were immutable and only set at create time :P18:02
rm_workare we changing that?18:03
johnsomYeah, that is the current stance18:03
johnsomI think the topic was a discussion about revisiting that.18:03
jnieszyes because the question is how to move from one flavor to another18:04
johnsomI think it's "possible", but I would like to see it working first...  grin18:04
jnieszi agree that an lb created under a flavor is immutable18:04
jnieszbut should be able to failover (deprovision / reprovision) lb to new flavor18:04
*** longstaff has quit IRC18:06
jnieszfor example if we update glance image of a flavor18:06
jnieszcreate a new flavor with a new glance image18:06
jnieszand want to move all lb's under the old flavor over to that new flavor18:06
johnsomIt gets pretty strange if the provider is different across the flavors.  I mean, I think it is possible, but definitely something I would want to tackle in the future when we have providers/drivers working18:06
johnsomWell, today, glance images are tagged, so by updating the tag to point to the new image and then using the failover API you can accomplish that without changing the flavor18:08
jnieszcorrect.  Depending how we implement glance images in flavor that might be different18:09
jnieszif we have different glance images for different flavors18:09
jnieszflavor might point to glance image id18:09
jnieszor need to support multiple tags for different images18:10
johnsomWe have deprecated pointing to image IDs I think...18:10
jnieszright now we just look for single tag18:10
johnsomI think we only support tags18:10
johnsomRight, you could set up a tag per flavor if you want to manage it that way.18:11
johnsomhttps://github.com/openstack/octavia/blob/master/octavia/common/config.py#L29018:11
jnieszyes, so amp_image_tag would have to move into flavor meta_info18:12
jnieszfrom the config18:12
johnsomYes, that is definitely something that needs to happen18:12
johnsomI think there are a few config settings that need to move to flavor, like topology, tag, nova flavor, etc.  Mostly we parked things in config because we didn't have flavors.18:13
jnieszyea, nova flavor is another useful one.18:14
jnieszmigrate from one flavor to another (vertical scaling)18:14
jnieszit would be good to failover to make that happen similar to the way glance is handled with tags18:14
johnsomIt's another interesting one.  Nova team is starting to work on hot-plug vcpu/ram18:15
*** pcaruana has joined #openstack-lbaas18:15
*** salmankhan has quit IRC18:15
openstackgerritBar RH proposed openstack/octavia master: [WIP] Assign bind_host ip address per amphora  https://review.openstack.org/50515818:19
xgerman_yeah, we can move all that to flavor ;-) We still support image-ids but they are a bit weird when using18:19
rm_workjohnsom: harlowja says moving to jobboard requires us dropping oslo.messaging18:20
xgerman_Probably need to extend our failover mechanism and one flavor->another might be a user op…18:20
harlowjaor at least parts of it...18:20
harlowjathe models just aren't the same (in that there is no messaging, lol)18:20
xgerman_mmh, can we keep our queue between API and worker?18:20
johnsomhmmm, confused a bit by that18:21
xgerman_+118:21
johnsomWe use it in two places:18:21
johnsom1. API process to worker (prior to starting a task flow)18:21
rm_workxgerman_: specifically the queue between API and worker is what needs to stop using it18:21
xgerman_mmh18:21
johnsom2. Sending stats/status over to neutron (outside task flow)18:21
rm_workyeah that second one could continue18:21
rm_workexcept, neutron-lbaas is on fire18:22
johnsomSince API->CW is before we even launch an engine, how would it conflict?  In reality I think the CW gets a lot smaller and the JB workers take on more of the stuff18:23
harlowjahttps://openstack.nimeyo.com/83061/openstack-dev-oslo-mistral-saga-process-than-where-from-here also for homework/reading18:24
harlowjaquiz tomorrow18:25
rm_workjohnsom: because jobboard requires ack-after-work and oslo does ack-before-work apparently18:25
johnsomDeal18:25
*** belharar has quit IRC18:25
xgerman_so we need to run our own queue? Or does jobboard do that for us? Confused?18:26
harlowjaso it prob would be useful for me to tell u what a jobboard (at least the zookeeper one) is18:26
xgerman_guess @johnsom will write a summary paper for us and present :-)18:26
johnsomrm_work Still don't see how that is a problem.  It is a feature of this solution IMO18:26
harlowjadepending on peoples zookeeper knowledge18:26
harlowjai need 20 minutes class time to do said description18:26
xgerman_I know the keeper —18:26
harlowjak18:27
harlowjahttps://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#ch_zkDataModel (for those who don't know)18:27
harlowjawatches, znodes, ephemeral nodes are important18:27
johnsomHere comes the fire hose....18:27
harlowjaha18:27
harlowjalet me know when to fully firehose18:28
harlowjajohnsom ack-before-work means u can lose messages if worker processing messages crashes right after acking18:29
harlowjaso that's bad, especially in projects (mistral, octavia) that aren't retrying and ...18:29
harlowja(retrying from the sender side)18:29
harlowja(crash or software upgrade, or network blip or other)18:30
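The difference between ack-before-work and ack-after-work that harlowja describes can be illustrated with a toy in-memory queue. This is only a sketch: `ToyQueue`, `consume`, and `surviving_messages` are hypothetical names, and nothing here is the oslo.messaging API. It assumes a broker that redelivers any message that was delivered but never acknowledged.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a message queue; not oslo.messaging."""

    def __init__(self, messages):
        self.pending = deque(messages)  # not yet delivered
        self.unacked = {}               # delivered, waiting for an ack
        self._next_tag = 0

    def get(self):
        msg = self.pending.popleft()
        self._next_tag += 1
        self.unacked[self._next_tag] = msg
        return self._next_tag, msg

    def ack(self, tag):
        del self.unacked[tag]

    def consumer_died(self):
        # Assumed broker behaviour: redeliver whatever was never acked.
        for tag in sorted(self.unacked):
            self.pending.append(self.unacked[tag])
        self.unacked.clear()


def consume(queue, handler, ack_first):
    """Take one message and handle it, acking before or after the work."""
    tag, msg = queue.get()
    try:
        if ack_first:
            queue.ack(tag)
            handler(msg)
        else:
            handler(msg)
            queue.ack(tag)
    except RuntimeError:
        # Stands in for the worker process crashing mid-task.
        queue.consumer_died()


def crashing_handler(msg):
    raise RuntimeError("worker crashed while handling " + msg)


def surviving_messages(ack_first):
    """Return what is still on the queue after a worker crash."""
    queue = ToyQueue(["create-lb"])
    consume(queue, crashing_handler, ack_first)
    return list(queue.pending)
```

With `ack_first=True` the message is gone after the crash, which is the loss window discussed above; with `ack_first=False` it is back on the queue for another worker.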
johnsomRight, this is our current situation.18:31
xgerman_yep, but can’t we make the queue “durable”18:31
johnsomI am envisioning.... (cue dreamland music) Our CW gets message off queue, fires up TF engine stuffs, ACKs queue as message has been handed off to a more durable system.18:32
rm_workthat is what i thought too, but apparently not how it works :/18:32
*** sshank has joined #openstack-lbaas18:32
*** sshank has quit IRC18:33
*** sshank has joined #openstack-lbaas18:33
harlowjanot with oslo.messaging, lol18:34
xgerman_fix it!18:34
harlowjanot likely18:34
*** sshank has quit IRC18:35
harlowjaoslo.messaging doesn't really expose the acking to users18:35
harlowjaon purpose18:35
johnsomSo even Our CW gets message off queue, auto ACKs, fires up TF engine is better than we have today and a pretty small failure window18:35
xgerman_+118:35
harlowjathat's already happening u use oslo.messaging18:36
rm_workwelcome to my conversation with harlowja as of like 15 minutes ago18:36
xgerman_nice — so we fire off taskboard earlier?18:36
johnsomSo you are proposing to scrap the queue, launch the TF engine straight from the API provider driver18:36
xgerman_monoliths unite!18:36
harlowjadifferent kind of queue (if u can call it that)18:36
rm_workI think we ... post a job to a board18:36
harlowjaa taskflow job 'queue' is a directory in zookeeper18:37
johnsomThe problem we have is after the message is off the queue and we start the TF is when the "hard stuff" happens18:37
rm_workinstead of put a call on a queue18:37
rm_workit's basically the same kind of thing18:37
harlowjaso when u post to a job board, it creates entry in that directory18:37
harlowjaworkers that are waiting for work are 'watching' that directory for entries being created18:37
xgerman_ok, zookeeper is the queue18:37
xgerman_got it… easy18:37
rm_workso yeah we'd just replace the places where we post to oslo.messaging, and instead post the job to zookeeper in a slightly different format18:37
harlowjathen a worker (one of them) gets the 'entry' (via atomic blah blah)18:37
harlowjathen worker starts processing workflow described in job18:38
rm_workand the workers look at zookeeper for jobs18:38
rm_workinstead of reading from oslo.messaging18:38
harlowjaif worker dies (at any point) it releases lock (ephemeral ...node)18:38
xgerman_yeah, then they lock and if they die it unlocks bla, bla18:38
harlowjathen another worker can see this has happened and try to take that job over18:38
harlowjablah blah18:38
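The post / claim / release-on-death cycle described above can be modelled in a few lines. This is a toy in-memory sketch, not the taskflow jobboard API and not ZooKeeper: `ToyJobBoard` and its methods are hypothetical, and the `worker_died` call stands in for the ephemeral lock node disappearing when a worker's session ends.

```python
class ToyJobBoard:
    """In-memory model of the claim/release idea; not the taskflow API."""

    def __init__(self):
        self.jobs = {}    # job name -> details
        self.owners = {}  # job name -> worker currently holding the lock

    def post(self, name, details):
        self.jobs[name] = details

    def claim(self, worker):
        # Hand the first unclaimed job to this worker, if there is one.
        for name in self.jobs:
            if name not in self.owners:
                self.owners[name] = worker
                return name
        return None

    def worker_died(self, worker):
        # Models the ephemeral lock vanishing along with the worker.
        for name in [n for n, w in self.owners.items() if w == worker]:
            del self.owners[name]

    def consume(self, name, worker):
        # Only the current owner may remove a finished job.
        if self.owners.get(name) != worker:
            raise ValueError("job not owned by " + worker)
        del self.owners[name]
        del self.jobs[name]


def demo():
    board = ToyJobBoard()
    board.post("create-lb-1", {"flow": "create_load_balancer"})
    first = board.claim("worker-a")
    nothing = board.claim("worker-b")  # already claimed by worker-a
    board.worker_died("worker-a")      # lock released on death
    second = board.claim("worker-b")   # worker-b takes the job over
    board.consume(second, "worker-b")
    return first, nothing, second, len(board.jobs)
```

The job stays on the board until a worker finishes it, so a crash between claim and completion does not lose the request.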
johnsomWhere in this does the TF engine get started?18:38
harlowjapost-job-claim18:38
harlowjathe interesting bit is that if u have TF engines persisting task state to somewhere18:38
harlowjathat on worker death the next worker can try to figure out where the last worker finished18:39
johnsomYeah, that is the part we actually need18:39
harlowjaand pick it up there18:39
xgerman_now we are talking ;-)18:39
harlowjaof course, some projects don't give a shit about the persistence part18:39
johnsomsub-flow durability18:39
harlowjaand just restart the whole damn thing (and just use the auto-worker transfer stuff)18:39
harlowjadepends on if u can restart the subflows or if its just easier to start the whole thing over18:39
harlowja(and then have tasks themselves check things and do nothing...)18:40
xgerman_could work for us but would be wasteful (unaccounted resources…)18:40
johnsomYeah, restart the whole thing is bad in most cases.  We have this pesky VIP IP/port18:40
harlowjai'd expect a task could check something before doing work no?18:40
harlowjalike check if VIP/IP/port already made, then do nothing18:40
xgerman_probably18:40
harlowjaif not already made, do something...18:40
harlowjaand repeat18:40
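The "check first, then do nothing" idea for a restarted flow could look like the sketch below. It assumes the existing ports can be looked up cheaply by load balancer id, which is exactly the part johnsom doubts; `ensure_vip_port`, `existing_ports`, and `create_port` are hypothetical names, not Octavia or Neutron calls.

```python
def ensure_vip_port(existing_ports, lb_id, create_port):
    """Return (port, created) for lb_id, creating the port only if missing.

    existing_ports is a dict standing in for a lookup against the
    networking service; create_port is whatever actually makes the port.
    """
    port = existing_ports.get(lb_id)
    if port is not None:
        # A previous run of the flow already did this step.
        return port, False
    port = create_port(lb_id)
    existing_ports[lb_id] = port
    return port, True
```

Running the task twice for the same load balancer then calls `create_port` only once, so restarting the whole flow does not leak a second VIP port.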
xgerman_but it gets more tough with VMs, etc.18:40
harlowjasure18:41
harlowjaanyway, that's the idea18:41
harlowjahose done18:41
harlowjalol18:41
*** gcheresh_ has joined #openstack-lbaas18:41
johnsomYou have such faith in the OpenStack API capabilities....  Odds are high we would walk every port in the system....  grin18:41
harlowjai haven't (but could) transfer the same concepts to etcd18:41
harlowjai just haven't18:41
harlowjathere is a limited redis driver for jobboard that is sorta similar (but not so good)18:41
harlowjasince redis doesn't support the same concepts natively18:41
harlowjajohnsom ha18:42
harlowjadon't do dumb things :-P18:42
xgerman_why can’t we use a graphDB - I heard they are all the rage now18:42
harlowjaha18:42
johnsomI mean if we are going to restart the whole flow, isn't there something lighter weight than zookeeper, etc?18:42
xgerman_like a durable queue?18:42
xgerman_after all the keeper is not officially part of OpenStack whereas etcd is18:43
harlowjadefine light-weight18:43
harlowjalol18:43
xgerman_so if we can avoid the keeper would be goodness18:43
johnsomYeah, relying on more external parts makes me ill18:43
harlowjameh, u decide18:43
harlowjau can hack all of this with db|rabbit|something else18:43
harlowjabut i'd rather not18:43
harlowjawith some polling threads and shit18:44
*** sshank has joined #openstack-lbaas18:44
harlowjaenjoy that, ha18:44
harlowjabut ya, the jobboard stuff was before etcd got approved (3 months ago?) so ya...18:44
harlowjais what it is ,ha18:44
xgerman_so we should at least use etcd since that’s now officially part of the kit whereas zookeeper isn't18:44
harlowjago for it18:44
harlowjai could prob do it, but it might be a useful thing for someone here18:44
harlowjathen u'll know wtf jobboards are better, haha18:44
xgerman_I guess we have our work cut out or wait for the K8 Octavia18:44
harlowjaat least now u know the concepts18:45
xgerman_yeah, might also be a non-issue, e.g. terraform checks if the LB appeared and if not errors out — so if we lose the message the user will know and can run again18:46
harlowjasounds like shifting work to user18:46
xgerman_yep, it’s shitty18:46
harlowjaie, the user is your retry decorator, lol18:46
harlowjauser-powered-decorators18:46
johnsomWe could just tell rm_work to never interrupt a controller18:47
xgerman_:-)18:47
johnsomGet a bunker, UPS, generator, vault door18:47
xgerman_http://www.zerohedge.com/sites/default/files/images/user5/imageroot/2017/04/15/north-korea-missiles_0.png18:48
xgerman_just look that you build it outside those circles18:48
johnsomGeez, three pages in the first doc alone...18:49
johnsomI feel like harlowja gave us free candy (taskflow) and then .....18:50
*** sshank has quit IRC18:56
*** JudeC has quit IRC18:59
*** pcaruana has quit IRC19:00
*** gcheresh_ has quit IRC19:05
nmagnezijohnsom, o/19:11
nmagnezijohnsom, a question about https://review.openstack.org/#/c/505884/719:12
johnsomo/19:12
nmagnezijohnsom, why do we have openstackclient in both places?19:12
johnsomYeah, I had that question too.  Let me see if I can find the comments about this.19:13
nmagnezijohnsom, so in the patch it looks like it bumps it in test req, but strangely enough i don't see it in master: https://github.com/openstack/python-octaviaclient/blob/master/test-requirements.txt19:13
johnsomIf I remember right it has to do with some integrated tests19:13
nmagnezijohnsom, ack. just tried to make sense of it for myself :)19:13
johnsomOh, interesting point19:14
johnsomhmmm19:14
johnsomhttps://review.openstack.org/#/c/487565/3/test-requirements.txt19:15
johnsomSo, we did remove it19:15
johnsomThe bot must be confused19:15
johnsomOh, we removed it after stable/pike was cut19:15
johnsomThat bot patch is against stable/pike19:16
nmagnezijohnsom, oh, i missed that19:16
johnsomYeah, we can backport that if it is a problem for your packaging19:16
nmagnezijohnsom, i should have waited with my vote.. :P19:17
*** sanfern has quit IRC19:17
johnsomNot too late to fix it19:17
nmagnezijohnsom, indeed19:17
nmagnezijohnsom, as for packaging, I didn't package the client yet (I plan to do so in a few weeks)19:19
nmagnezibut since pike was already released i think we should just leave it there19:19
johnsomYep19:19
nmagneziif it ain't broken.. :)19:19
johnsomYep19:20
openstackgerritNir Magnezi proposed openstack/octavia master: Fix a Python3 issue with start_stop_listener  https://review.openstack.org/48091919:20
nmagnezirm_work, i need some advice with this one ^. A little bird (with PTL wings) whispered to me you know how to tackle those py3 issues :)19:22
*** eezhova has quit IRC19:23
rm_workyeah those are always fun19:23
rm_workoften you can run it through / test against six.text_type19:23
rm_worki THINK in this case that might be safe19:24
rm_worksix.text_type("a") == six.text_type(b"a")19:25
*** sanfern has joined #openstack-lbaas19:25
rm_workah yeah doesn't work directly in py319:26
rm_workbut you can test19:27
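Why the `six.text_type` comparison fails on Python 3 can be seen without `six` at all: there `six.text_type` is simply `str`, and calling `str()` on bytes produces the repr rather than decoding. A minimal sketch, assuming Python 3 and UTF-8 data; `to_text` is a hypothetical helper and is not taken from the patch under review.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding bytes instead of calling str() on them."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# On Python 3, str() of a bytes object gives its repr, not its contents.
repr_of_bytes = str(b"a")  # "b'a'"

# Decoding first makes the two sides comparable.
same = to_text(b"a") == to_text("a")  # True
```

Comparing decoded values, or comparing bytes to bytes, avoids the mismatch either way.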
harlowjajohnsom hahaha, free candy19:31
harlowjai can only provide so much of the candy, the rest is up to u guys19:33
harlowjai had a hard enough time just trying to get etcd|zookeeper into openstack as an accepted thing19:34
harlowja(thankfully it now is)19:34
harlowja^ that blew my mind honestly (that it took that long)19:34
harlowjaespecially for a cloud distributed system...19:34
harlowjalol19:34
* harlowja slightly burnt out by that crap 19:36
rm_workjohnsom: do you remember what log level we get if we don't have debug=true?19:38
harlowjai think also not super-happy with how people like used oslo.messaging and i think they didn't quite know what it really is doing (acking before work...)19:38
johnsomYeah, I am getting worn down by things breaking out from under us...  Thus the worry of adding more19:38
harlowjaanyways, rant over19:38
johnsomrm_work INFO I think19:39
rm_workyeah I'm hoping that's true19:39
johnsomThat is what I see in one of my devstack VMs19:43
johnsomOk, back to nova/neutron fun with the case of the missing network interface19:43
rm_workganbatte19:52
openstackgerritAdam Harwell proposed openstack/octavia master: WIP: Floating IP Network Driver (spans L3s)  https://review.openstack.org/43561220:03
*** sshank has joined #openstack-lbaas20:05
*** dayou has quit IRC20:07
*** yamamoto has quit IRC20:08
*** eezhova has joined #openstack-lbaas20:14
*** sshank has quit IRC20:24
*** sshank has joined #openstack-lbaas20:24
*** yamamoto has joined #openstack-lbaas20:27
*** yamamoto has quit IRC20:31
*** ltomasbo has quit IRC20:34
*** ltomasbo has joined #openstack-lbaas20:37
johnsomSo, our 404 issue....20:39
johnsomAppears to be an issue inside the amp20:39
johnsomIf I force a PCI bus re-enumeration the interface pops up20:40
rm_workhmmmmmmmmm20:40
rm_workcuriosity: i wonder if we switched the gates to centos amps if it'd show up :P20:41
rm_workmight be an ubuntu thing20:41
johnsomYeah.  I collected a ton of nova/neutron logs then figured I would start poking things.20:41
rm_workyou said you managed to repro, but20:41
rm_work*reliably*? or just once randomly20:41
johnsomOnce ever20:42
rm_workT_T20:42
johnsomBut we saw it in the gates a bunch in Pike20:42
rm_workyes, i remember20:42
rm_workis it a race?20:42
rm_worklike sure you did a re-enum and it popped up20:42
rm_workbut maybe because nova didn't set it up in the time it was supposed to?20:42
johnsomIf so it's in the bowels of the linux kernel hot-plug systems....20:43
johnsomOh, no, I had looked at the device list just before20:43
johnsomhttps://www.irccloud.com/pastebin/5DeVTzTG/20:43
*** aojea has quit IRC20:44
johnsomI ran it in the netns just to make sure there wasn't some strange netns magic going on (though PCI should not be masked by that)20:44
johnsomI mean, forcing a rescan if we don't find the interface we expect is not harmful.20:45
openstackgerritAdam Harwell proposed openstack/octavia master: WIP: Floating IP Network Driver (spans L3s)  https://review.openstack.org/43561220:45
*** aojea has joined #openstack-lbaas20:49
rm_workjohnsom: so, basically a quick workaround20:59
rm_workmaybe that's fine :/20:59
rm_workmy standards have become a lot lower as someone trying to actually *get all this working reliably*21:00
openstackgerritNir Magnezi proposed openstack/octavia master: Update devstack plugin and examples  https://review.openstack.org/50363821:01
*** Alex_Staf has joined #openstack-lbaas21:02
johnsomIt's actually a pretty common practice when hot plugging things into linux hosts.  I just thought we were long beyond needing it.21:02
rm_workaah there was another admin endpoint i forgot to talk about in the meeting T_T21:02
rm_workmaintenance mode21:02
rm_worka couple of things:21:03
rm_work1) I think i want to store the AZ/host that comes back from the nova polling21:04
rm_work2) make a new table for storing currently active maintenances (either an AZ or a Host)21:04
rm_work3) Make an endpoint to create/read/delete to that table21:05
rm_work4) have some logic that can failover amps off those AZ/hosts when a maintenance is set21:05
rm_workjniesz: ^^ possibly relevant for you too21:06
rm_workfor #1 i mean, in the amphora table21:07
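Steps 1, 2 and 4 above can be sketched as a simple selection over stored records. Everything here is hypothetical: the `az` and `host` fields on the amphora record and the `kind`/`value` shape of a maintenance row are assumptions for illustration, not an existing Octavia schema, and the sketch only selects candidates rather than triggering any failover.

```python
def amps_under_maintenance(amphorae, maintenances):
    """Return ids of amphorae whose AZ or host has an active maintenance.

    amphorae: list of dicts with "id", "az" and "host" keys (assumed fields).
    maintenances: list of dicts with "kind" ("az" or "host") and "value".
    """
    azs = {m["value"] for m in maintenances if m["kind"] == "az"}
    hosts = {m["value"] for m in maintenances if m["kind"] == "host"}
    return [a["id"] for a in amphorae if a["az"] in azs or a["host"] in hosts]
```

What to do with the selected amphorae (fail over, live migrate, or only stop scheduling new ones) is a separate policy decision.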
xgerman_mmh, I can see that we switch off an AZ for maintenance but actively failing stuff over - that should be left to the operator to run some script21:08
rm_workhmmm21:08
xgerman_yeah, not every maintenance might need a failover - maybe just not schedule new amps for some time21:09
johnsomYeah, it seems like maintenance would stop health monitoring and block failovers21:10
johnsomIf I remember right, you are looking for something to evacuate an AZ???21:11
xgerman_looks like it and I think that should be done outside Octavia21:11
xgerman_we just give you the building blocks21:11
johnsomYeah, some folks might want to live migrate too21:12
xgerman_+121:12
xgerman_also rm_work should write a spec — this is getting beyond what we can handle in irc ;-)21:15
*** eezhova has quit IRC21:15
johnsomYeah, this one probably should have a spec21:15
*** ltomasbo has quit IRC21:16
*** leitan has quit IRC21:16
xgerman_I haven’t looked at our minutes but if we do a spec we might do it for all the proposed ones… so we have a proper record21:16
xgerman_aka an admin-api spec21:17
*** ltomasbo has joined #openstack-lbaas21:18
rm_workyeah prolly21:18
johnsomhttps://www.irccloud.com/pastebin/CykDBtZM/21:19
xgerman_root is not what it used to be?21:19
rm_workopen('/sys/bus/pci/rescan', 'w')21:20
rm_workplz try21:20
johnsomAh, I'm a dork and forgot the 'w'21:20
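The rescan-if-missing workaround discussed here could be sketched as follows. Writing `1` to `/sys/bus/pci/rescan` is the kernel's sysfs trigger for a PCI bus rescan and needs root; the rest is an assumption for illustration (`rescan_if_missing` is a hypothetical name, and this is not the code from the patch proposed below).

```python
import os


def rescan_if_missing(interface,
                      net_dir="/sys/class/net",
                      rescan_path="/sys/bus/pci/rescan"):
    """Trigger a PCI bus rescan if the interface is not visible.

    Returns True if a rescan was requested, False if the interface
    was already present. The paths are parameters so the function can
    be exercised against a scratch directory.
    """
    if os.path.exists(os.path.join(net_dir, interface)):
        return False
    # The file must be opened for writing; opening it read-only fails.
    with open(rescan_path, "w") as rescan_file:
        rescan_file.write("1")
    return True
```

A caller would normally re-check for the interface after the rescan, since the device may take a moment to appear.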
openstackgerritMichael Johnson proposed openstack/octavia master: Force PCI bus rescan if interface is not found  https://review.openstack.org/50798621:26
rm_workso now our gates will be impervious to failure? :P21:27
*** yamamoto has joined #openstack-lbaas21:28
tonglDo we still track neutron-lbaas v2 bugs? I created a listener with a default pool, and also added a healthmonitor for this pool. However, when I tried to create a healthmonitor for the 2nd redirect_pool, it reports exception "TypeError: unhashable type: 'dict'".21:29
tonglDid anyone see this before in LBaaS v2?21:29
tonglError log: http://paste.openstack.org/show/622092/21:30
* rm_work doesn't use neutron-lbaas21:31
tonglWe still have to develop our driver to support neutron-lbaas :(21:33
johnsomtongl neutron-lbaas bugs go here: https://storyboard.openstack.org/#!/project/90621:33
johnsomI will say, developers working on neutron-lbaas are few21:33
tonglthanks21:34
*** yamamoto has quit IRC21:34
johnsomIs it the odd newline in "admin_state_up"?21:37
*** gcheresh_ has joined #openstack-lbaas21:39
*** gcheresh_ has quit IRC21:43
*** sshank has quit IRC21:47
tonglAnother quick question:  in Octavia, do we allow deleting a pool when l7policy redirect_to_pool is still referencing it?21:48
johnsomI think the answer is no, but I have not tested it21:50
tonglthx21:51
*** yamamoto has joined #openstack-lbaas21:57
*** yamamoto has quit IRC21:57
*** aojea has quit IRC21:58
*** kbyrne has quit IRC22:00
*** kbyrne has joined #openstack-lbaas22:03
*** jniesz has quit IRC22:04
rm_workblegh, amps explode when i use single-create, not when i use normal create22:22
rm_workbecause of the active-standby keepalived22:22
johnsomUgh22:22
rm_worksomething with the initiation of it happening very early22:22
rm_worklogs look like this:22:22
rm_workhttp://paste.openstack.org/show/622094/22:24
rm_workthat's one amp22:24
rm_workhttp://paste.openstack.org/show/622095/22:26
rm_workthat's the other22:26
rm_workso what seems to happen ...22:26
rm_workhttp://paste.openstack.org/show/622096/22:29
rm_workhmmm22:29
rm_workactually i am unsure if it's keepalived related22:29
*** sshank has joined #openstack-lbaas22:29
rm_workit actually fails over because of no health updates?22:29
rm_workbut it does so before we consider the create "done"22:30
rm_workso it errors the amps because it's still PENDING_CREATE22:30
rm_worknot sure why it's not sending health messages22:30
johnsomUmmm, it can't be told to failover from health messages until after it starts sending them.22:32
rm_workyeah i mean it seems like it STOPS sending them22:32
rm_workor something22:32
rm_workit IS sending them now22:33
rm_workblegh22:33
rm_workI am still unsure what I think about being unable to failover while LB is pending state22:34
johnsomShould not happen, HM doesn't own the amp22:35
*** sshank has quit IRC22:37
rm_workaahh22:37
rm_work2017-09-27 15:37:28.083 7766 WARNING octavia.controller.healthmanager.update_db [-] Amphora 9912b285-4aee-4216-bc81-3e036db246cd health message reports 1 listeners when 0 expected22:37
rm_workthe messages come in but are ignored I think22:37
rm_workbecause the listener count doesn't line up22:37
johnsomcorrect, that is normal on startup22:38
rm_workok but it stays that way for a WHILE22:38
rm_worki think because it's still in the process of setting up the single-create stuff so it hasn't persisted that yet?22:39
rm_workbut it has sent it to the amps/22:39
rm_work?22:39
johnsomEven so, it can't fail over due to HM because it should have never written an amphora_health record to the DB until the first listener comes up22:40
rm_workyeah the first listener comes up on the amp22:40
rm_workerr22:41
rm_workhmm so at SOME POINT the numbers had to match22:41
johnsomI think we had a bug about that, that we really should be doing health of the amp before the listener, but you know, chicken/egg22:41
johnsomThe last attempt at that caused it to always failover22:42
johnsomha22:42
rm_worki have an idea...22:42
rm_workif len(listeners) == expected_listener_count:22:42
rm_work^^ replace that with22:42
rm_workif len(listeners) == expected_listener_count and 'PENDING' not in lb.provisioning_status:22:43
rm_workerr sorry bad logic22:43
rm_workone sec22:43
rm_workif len(listeners) == expected_listener_count or 'PENDING' in lb.provisioning_status:22:43
johnsomspares pool22:43
rm_workerm22:43
rm_workhold on we're USING lb.id here22:44
rm_workah i see22:44
rm_workone sec22:44
*** sshank has joined #openstack-lbaas22:45
rm_workhttp://paste.openstack.org/show/622097/22:46
rm_workadded lines 1, 6/7, edited 1322:46
*** leitan has joined #openstack-lbaas22:48
rm_workjohnsom: ^^22:50
johnsomWhat about an LB created but no listener?22:51
johnsomI don't remember when we start sending22:51
johnsomFrankly I'm still not sure why we are doing this at all...22:51
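
[Editor's sketch] The condition rm_work settles on at 22:43 (accept a health message when the listener count matches, or when the load balancer is still in a PENDING state) can be illustrated roughly as below. This is a minimal sketch, not the code in octavia.controller.healthmanager.update_db or in the paste: the function name should_record_health and its parameters are hypothetical, and the None handling is an assumption drawn from johnsom's "spares pool" remark, since a spare amphora has no load balancer to inspect.

```python
from typing import Optional


def should_record_health(reported_listeners: int,
                         expected_listeners: int,
                         lb_provisioning_status: Optional[str]) -> bool:
    """Decide whether a health message should refresh the health record.

    Hypothetical helper. lb_provisioning_status is None for an amphora
    that is not attached to any load balancer (assumed spares-pool case).
    """
    # A spare amphora has no load balancer, so there is no status to check.
    lb_pending = (lb_provisioning_status is not None
                  and 'PENDING' in lb_provisioning_status)
    # Accept on a matching count, or while the LB is still being set up.
    return reported_listeners == expected_listeners or lb_pending


if __name__ == '__main__':
    # The case from the 22:37 warning: 1 listener reported, 0 expected.
    print(should_record_health(1, 0, 'PENDING_CREATE'))
    print(should_record_health(1, 0, 'ACTIVE'))
```

With this shape, the mismatched message during PENDING_CREATE is accepted while the same mismatch on an ACTIVE load balancer is still ignored. johnsom's open questions (an LB with no listener, and whether this is needed at all) are not addressed by the sketch.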
rm_workoh dear god, somehow i got an orphaned VM pair22:55
rm_worki don't know how to track them down...22:55
*** yamamoto has joined #openstack-lbaas22:58
*** rtjure has quit IRC23:01
*** bcafarel has quit IRC23:03
rm_workk got it23:03
*** yamamoto has quit IRC23:05
rm_workyeah umm shit23:06
rm_workso when i do single-create with active-standby, it dies before it ever finishes creating23:06
rm_work(the amps go to error)23:06
johnsomnova error?23:06
rm_workand then when i try cleanup (delete the LB, which is ACTIVE) it seems to be leaving the VMs behind but deleting the amp record O_o23:07
rm_workno, the amps failover but it's in PENDING_CREATE so they don't finish failover and go to ERROR23:07
rm_workmaybe we shouldn't put amps in ERROR if we try a failover but don't do it because of LB state23:07
rm_workO_o23:08
*** tongl has quit IRC23:09
openstackgerritAdam Harwell proposed openstack/octavia master: WIP: Floating IP Network Driver (spans L3s)  https://review.openstack.org/43561223:11
rm_workok that fix seems to fix my startup issue23:13
rm_workthe PENDING state is just too long for them and so it gets out an initial heartbeat somehow but then is blocked for too long T_T23:14
johnsomI would be figuring out how that heartbeat is coming out23:14
*** bcafarel has joined #openstack-lbaas23:16
rm_workhmm23:18
rm_workyeah the timing on this is crazy complex23:18
rm_workjohnsom: ok here's a question for you23:35
rm_workLB is happy and ACTIVE23:35
rm_workan amp dies23:35
rm_workwe notice, trigger failover, LB goes to PENDING_UPDATE23:35
rm_workwe finish the failover, mark it ACTIVE again23:35
rm_workright?23:35
openstackgerritMichael Johnson proposed openstack/octavia master: Force PCI bus rescan if interface is not found  https://review.openstack.org/50798623:35
johnsomYes, it also has a health lock23:36
rm_worknow what happens if both of the amps behind an ACTIVE_STANDBY LB die within a few seconds?23:36
rm_workone amp dies, we PENDING_UPDATE the LB23:36
rm_worksecond amp ... failover fails23:36
rm_worknow think about once we have N-Active23:37
rm_workI *really* don't think we can afford to lock the LB on failover <_<23:37
johnsomBut, it should be fine right, the second failed amp will stay failed, first one will be built, go back ACTIVE, HM will notice the second one is borked and start failover on it.23:39
johnsomRight?23:39
rm_worki don't think that works23:39
rm_worki need to figure out why23:40
rm_workbut IME they stay ERROR forever23:40
johnsomAbout a year ago it did, I tried deleting both and got it to work23:40
rm_worki think possibly the revert doesn't properly unset the busy flag23:40
rm_worki'll look again next time it happens23:40
rm_workunfortunately my breakpoints keep getting messed up, so tempest finishes cleanup before i can investigate fully23:41
johnsomThat could be the case actually23:41
rm_workbut i just went through and found a ton of amps in the health table with busy-flag set23:41
rm_workso there is obviously a case where it doesn't unset properly somewhere23:42
johnsomYeah, because that happens outside the flow23:42
johnsomIt happens in the HM during "stale" check (still hate that term)23:42
rm_workMarkAmphoraHealthBusy23:43
rm_workthat doesn't have a revert action23:43
johnsomso, failover error on revert should unset it23:43
johnsomhttps://github.com/openstack/octavia/blob/master/octavia/db/repositories.py#L107923:43
rm_workerrr23:43
rm_workAmphoraToErrorOnRevertTask doesn't23:44
rm_worktrying to find a task that does23:44
johnsomYeah, that is what I am saying, it probably should unset busy23:44
rm_workah k23:44
rm_workyou think in AmphoraToErrorOnRevertTask or as a revert on the MarkAmphoraHealthBusy task?23:44
johnsomMarkAmphoraHealthBusy is on the new amp I think23:44
rm_workerr23:44
rm_workhmmm23:45
rm_workdatabase_tasks.MarkAmphoraHealthBusy(23:45
rm_work                rebind={constants.AMPHORA: constants.FAILED_AMPHORA},23:45
johnsomOh, no, it's there for the non-failed failover case23:45
rm_worklooks like it's on the old one23:45
rm_workalthough I guess doing MarkAmphoraHealthBusy in the flow at all is redundant23:46
rm_workbecause in order to RUN this flow on an amp, we've already set it, right?23:46
rm_workso it's just... a noop23:46
johnsomWell, either way, your case with PENDING will fail before it gets to MarkAmphoraHealthBusy23:46
johnsomWell, if it isn't actually a failure, but failover API it needs to be set23:46
rm_workhmmm23:47
johnsomfailure case, it's a dup, failover case it's needed23:47
rm_workok23:47
rm_workso maybe the revert should be there regardless23:47
rm_workand then ALSO somewhere higher?23:47
*** sshank has quit IRC23:47
johnsomYeah, I think AmphoraToErrorOnRevertTask should clear the busy flag23:47
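
[Editor's sketch] The idea johnsom lands on here, having the error-on-revert task also clear the busy flag so a failed failover does not leave the amphora permanently skipped by the health manager, might look roughly like this. It is a simplified stand-in and not Octavia's implementation: it does not use taskflow, and InMemoryHealthRepo, set_busy, and the class shape are hypothetical names invented for illustration.

```python
class InMemoryHealthRepo:
    """Hypothetical stand-in for the amphora_health table."""

    def __init__(self):
        self.busy = {}

    def set_busy(self, amphora_id, value):
        self.busy[amphora_id] = value


class AmphoraToErrorOnRevert:
    """Simplified task: does nothing on execute, cleans up on revert."""

    def __init__(self, health_repo, amphora_status):
        self.health_repo = health_repo
        self.amphora_status = amphora_status

    def execute(self, amphora_id):
        # Nothing to do on the forward path.
        pass

    def revert(self, amphora_id):
        # Mark the amphora as failed, as in the discussion above.
        self.amphora_status[amphora_id] = 'ERROR'
        # Assumed addition: release the busy flag so the health manager
        # can pick this amphora up again on a later stale check.
        self.health_repo.set_busy(amphora_id, False)


if __name__ == '__main__':
    repo = InMemoryHealthRepo()
    repo.set_busy('amp-1', True)   # set before the failover flow starts
    statuses = {'amp-1': 'ALLOCATED'}
    AmphoraToErrorOnRevert(repo, statuses).revert('amp-1')
    print(statuses['amp-1'], repo.busy['amp-1'])
```

Whether the flag should be cleared in this task, in a revert on MarkAmphoraHealthBusy, or in both is exactly what the two are weighing above; the sketch only shows the first option.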
*** sshank has joined #openstack-lbaas23:52
*** sshank has quit IRC23:53
openstackgerritAdam Harwell proposed openstack/octavia master: WIP: Floating IP Network Driver (spans L3s)  https://review.openstack.org/43561223:57
