Wednesday, 2017-08-16

*** catintheroof has joined #openstack-lbaas00:23
*** aojea has quit IRC00:49
*** ssmith has quit IRC01:09
openstackgerrithuangshan proposed openstack/octavia master: Fix DB update reverts for listener provisioning_status  https://review.openstack.org/49340101:42
*** catintheroof has quit IRC01:47
*** gongysh has joined #openstack-lbaas02:07
*** gongysh has quit IRC02:47
*** catintheroof has joined #openstack-lbaas02:50
*** gongysh has joined #openstack-lbaas02:53
*** catintheroof has quit IRC03:03
*** links has joined #openstack-lbaas03:47
*** gcheresh has joined #openstack-lbaas04:40
*** gcheresh has quit IRC05:02
*** reedip has quit IRC05:23
*** reedip has joined #openstack-lbaas05:37
*** pcaruana has joined #openstack-lbaas05:45
*** gcheresh has joined #openstack-lbaas06:36
*** slaweq has joined #openstack-lbaas06:40
*** rcernin has joined #openstack-lbaas06:57
isantospHi, how could I integrate in my puppet environment octavia-pike?07:07
*** rcernin has quit IRC07:16
*** gcheresh has quit IRC07:17
*** rcernin has joined #openstack-lbaas07:19
*** gcheresh has joined #openstack-lbaas07:22
*** Alex_Staf has joined #openstack-lbaas07:37
isantospI mean, the delorean repo08:04
*** yamamoto has joined #openstack-lbaas08:33
*** slaweq has quit IRC08:33
*** aojea has joined #openstack-lbaas08:37
*** gcheresh has quit IRC08:37
*** aojea has quit IRC08:42
openstackgerritNir Magnezi proposed openstack/neutron-lbaas master: Add status in VMware driver  https://review.openstack.org/32364508:44
*** yamamoto has quit IRC08:49
*** gcheresh has joined #openstack-lbaas08:52
*** gcheresh has quit IRC08:53
*** aojea has joined #openstack-lbaas08:57
*** aojea has quit IRC09:01
*** belharar has joined #openstack-lbaas09:20
*** slaweq has joined #openstack-lbaas10:19
*** atoth has quit IRC10:33
*** slaweq has quit IRC10:34
*** slaweq has joined #openstack-lbaas10:36
*** belharar has quit IRC11:18
*** belharar has joined #openstack-lbaas11:18
*** rm_mobile has joined #openstack-lbaas11:29
*** atoth has joined #openstack-lbaas11:29
rm_mobileI should be "physically present" at the meeting today, but whether I'll be fully conscious is doubtful11:30
*** slaweq has quit IRC11:43
*** slaweq has joined #openstack-lbaas11:43
*** slaweq has quit IRC11:44
*** slaweq has joined #openstack-lbaas11:45
*** slaweq has quit IRC12:05
*** slaweq has joined #openstack-lbaas12:05
*** slaweq has quit IRC12:10
*** slaweq has joined #openstack-lbaas12:18
*** catintheroof has joined #openstack-lbaas12:49
*** atoth has quit IRC12:50
*** gongysh has quit IRC13:00
*** gongysh has joined #openstack-lbaas13:00
*** gongysh has quit IRC13:00
*** belharar_ has joined #openstack-lbaas13:02
*** belharar has quit IRC13:02
*** atoth has joined #openstack-lbaas13:07
*** belharar_ has quit IRC13:19
*** belharar_ has joined #openstack-lbaas13:19
*** chlong_ has joined #openstack-lbaas13:31
*** aojea has joined #openstack-lbaas13:32
*** ssmith has joined #openstack-lbaas13:41
*** cpusmith has joined #openstack-lbaas13:42
*** ssmith has quit IRC13:46
*** cpusmith_ has joined #openstack-lbaas13:59
*** chlong_ has quit IRC13:59
*** cpusmith has quit IRC14:03
*** atoth has quit IRC14:12
*** cpusmith_ has quit IRC14:14
*** cpusmith_ has joined #openstack-lbaas14:19
*** cpusmith has joined #openstack-lbaas14:20
*** cpusmith_ has quit IRC14:23
*** atoth has joined #openstack-lbaas14:25
*** slaweq has quit IRC14:30
*** links has quit IRC14:32
*** fnaval has joined #openstack-lbaas14:32
*** belharar_ has quit IRC14:39
*** Alex_Staf has quit IRC14:45
*** belharar has joined #openstack-lbaas15:05
*** rcernin has quit IRC15:07
isantospIs there any keystone configuration to be added in neutron.conf to work with octavia? (neutron-newton) (octavia-pike)15:22
johnsomisantosp No, I don't think so.  The only thing I can think of is RBAC policy work if you want to run octavia under a unique user as opposed to an admin user.15:24
isantosp@johnsom Im seeing this error: ERROR: neutronclient.shell Driver error: Unable to establish connection to http://127.0.0.1:5000/v2.0/tokens: HTTPConnectionPool(host='127.0.0.1', port=5000): Max retries exceeded with url: /v2.0/tokens (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x8bb8750>: Failed to establish a new connection: [Errno 111]15:27
isantospECONNREFUSED',))15:27
isantospany idea about what can it be?15:28
johnsomYeah, I know what that is.  It's not octavia specific, but it is a problem.  Just a second, let me dig up some pointers for you15:28
mnaserisantosp are you running aio?15:29
mnaserlooks like neutron cannot authenticate to keystone15:29
johnsomisantosp First thing, do an "openstack endpoint list" and find the keystone entry and its URL.15:30
johnsomMy devstack is http://172.20.55.13/identity as an example.  Note, the format no longer has a port number as keystone has changed the way it advertises its endpoint for the uwsgi community goal.15:31
johnsomThen, in neutron.conf, find the [keystone_authtoken] and the "auth_url" setting and make it point to the endpoint you found above (or at least look like it).15:32
johnsomYou should do the same for all of the "auth_url" settings in all of the neutron configuration files.15:32
johnsomneutron-lbaas.conf if you have it as well15:33
*** links has joined #openstack-lbaas15:33
isantosp@johnsom aham, ok, I'll try, thank you :)15:39
isantospmnaser no, we are not running aio15:39
mnaserisantosp ^ what johnsom said but also a little hack15:43
mnaserrestart neutron with debug=True15:43
mnasergrep logs for 127.0.0.115:43
mnaseryou'll find some keystone setting pointing to the wrong place15:43
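A minimal sketch of the same check done against the config files instead of the logs, assuming the files are plain INI text. The function name, the sample file contents, and the choice to look only at the `auth_url` key are illustrative assumptions, not something from the channel.

```python
def find_loopback_auth_urls(conf_text, keys=("auth_url",)):
    """Return (section, key, value) tuples whose value points at loopback."""
    hits = []
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and INI comments.
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key in keys and ("127.0.0.1" in value or "localhost" in value):
            hits.append((section, key, value))
    return hits


# Hypothetical neutron.conf excerpt, for illustration only.
SAMPLE = """\
[DEFAULT]
debug = True

[keystone_authtoken]
auth_url = http://127.0.0.1:5000/v2.0

[nova]
auth_url = http://172.20.55.13/identity
"""

for section, key, value in find_loopback_auth_urls(SAMPLE):
    print(f"[{section}] {key} = {value}")
```

Running it over each neutron configuration file in turn (including neutron-lbaas.conf if present) would list every section that still needs its `auth_url` corrected.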
isantospaham nice, I was debugging it only in a loadbalancer creation :) thanks15:48
*** links has quit IRC15:49
*** pcaruana has quit IRC15:57
*** aojea has quit IRC16:13
*** aojea has joined #openstack-lbaas16:25
*** rcernin has joined #openstack-lbaas16:49
*** JudeC has joined #openstack-lbaas16:59
johnsomOctavia meeting starting soon on #openstack-meeting16:59
*** leyal has quit IRC17:00
*** strigazi_OFF is now known as strigazi17:00
*** leyal has joined #openstack-lbaas17:00
*** sshank has joined #openstack-lbaas17:06
*** aojea has quit IRC17:19
nmagnezijohnsom, that was a short one17:22
nmagnezi(sorry to be late)17:22
johnsomYeah, I kind of expected it to be.  RC weeks are kind of a break for most folks17:23
johnsomThe logs are here if you would like: https://wiki.openstack.org/wiki/Octavia/Meeting_Minutes17:23
nmagnezijohnsom, not for me.. just a bad hour for me to make it to the meeting.. but I'm trying :)17:24
johnsomYeah, maybe we can re-evaluate in Queens17:24
nmagnezi+117:25
johnsomThere is talk of letting the teams use the project IRC channels for the meetings, so that may give us more flexibility17:25
nmagnezithat could actually be a good idea. will also help everyone here to follow up on topics in case they missed the actual meeting (assuming not everyone opens the meeting minutes)17:28
*** aojea has joined #openstack-lbaas17:29
*** strigazi is now known as strigazi_OFF17:55
*** tomtomtom has joined #openstack-lbaas17:59
tomtomtomrm_work, johnsom, the ip addresses are acting as expected now thank you.  I am now running into this issue (http://imgur.com/a/xcPyT) do you know if this is coming from the web server or the load balancer when I hit it in the web browser?18:00
johnsomHmm, 50418:01
johnsomIs it a large transfer or just a normal size GET?18:02
tomtomtomlarge transfer18:03
rm_worktomtomtom: can you SSH into the amphora, activate the netns, and make sure you can curl the members from there?18:03
rm_workah18:03
rm_workyou might need to set timeouts18:03
tomtomtomyeah, I set the health monitor timeout to 30018:03
johnsomYeah, ok.  It's likely you are hitting one of the default timeouts18:03
tomtomtomjohnsom I can curl the page from the namespace18:03
rm_workerr18:04
rm_worknot the healthmonitor timeout18:04
rm_workthe listener timeout18:04
johnsomYeah, that doesn't matter here18:04
johnsomIf you want to double check, you can look at the haproxy log in the amphora, but I'm pretty sure it's the timeouts.18:04
rm_workwait yeah where is that...18:05
rm_worki thought somewhere we specified the max connection time18:05
rm_workbut i can't find it18:06
johnsomSo, we have not exposed this via the API yet.  It's a wish list item.  But you can adjust them in your jinja template.18:06
tomtomtomis there a reference for the jinja template?18:07
tomtomtomlike where it is?18:07
johnsomThese are the timeout options:18:07
johnsomhttps://www.irccloud.com/pastebin/ojW3SE7z/18:07
johnsomDocs for those are here: http://cbonte.github.io/haproxy-dconv/1.6/configuration.html18:08
tomtomtomok thanks18:08
johnsomjinja template is here: https://github.com/openstack/octavia/blob/master/octavia/common/jinja/haproxy/templates/base.j2#L3218:08
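For orientation, a rough sketch of what adjusting the timeouts amounts to. The pastebin contents are not reproduced here; `timeout connect`, `timeout client` and `timeout server` are standard haproxy directives, but which of them the Octavia template actually sets, and the values below, are assumptions to check against base.j2. The helper name is hypothetical.

```python
def render_timeouts(connect_ms, client_ms, server_ms):
    """Render a haproxy 'defaults' fragment; bare numbers are milliseconds."""
    return "\n".join([
        "defaults",
        f"    timeout connect {connect_ms}",
        f"    timeout client {client_ms}",
        f"    timeout server {server_ms}",
    ])


# Illustrative values only: a long client/server timeout for slow transfers.
print(render_timeouts(5000, 300000, 300000))
```

As far as the haproxy documentation describes it, a 504 is what haproxy returns when the server-side timeout expires before the backend responds, which is why the listener-level timeouts matter here and the health monitor timeout does not.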
*** pcaruana has joined #openstack-lbaas18:09
*** JudeC has quit IRC18:15
openstackgerritAdam Harwell proposed openstack/octavia master: Add flag to disable SSHD on the amphora image  https://review.openstack.org/49233218:18
rm_workjohnsom:  could https://review.openstack.org/#/c/493252/9/octavia/controller/healthmanager/update_db.py use a context-manager?18:28
*** belharar has quit IRC18:30
johnsomFor the db session?  maybe?  I'm always squeamish with context managers and exception handling cleanup, etc.18:30
rm_workhmm18:30
rm_workjohnsom: can I merge https://review.openstack.org/#/c/493328/1 safely?18:32
rm_workor no18:32
rm_worklike, what am I allowed to merge18:32
johnsomTechnically just the two patches we are considering for RC218:33
rm_workk18:33
johnsomThat one is really more of a feature with the version string at startup, etc.18:33
rm_work252 and 64618:33
johnsomYes18:33
johnsomThen, once we are good with RC2, we can decide when to cut Pike and open things up for Queens.  I think we can do it pretty soon actually as I don't think we are going to get translations for octavia, just the dashboard18:34
rm_workk there's 64618:34
*** sshank has quit IRC18:37
rm_workok and 25218:39
rm_workverified it all looked like what i have18:39
johnsomGroovy, when those merge I will put in the backport for stable/pike, we can merge those and I will put up the RC2 request18:40
rm_workk18:41
johnsomSpeaking of translations, an update came in last night: https://review.openstack.org/#/c/49414718:50
rm_workwoo, korean18:51
johnsomMakes me miss the really good Korean restaurant we had in town...18:52
johnsomNow it's a dumb chain sandwich shop.18:52
rm_worki always wonder about translations now, since one of my old projects once merged translations in Russian that turned out to be advertisements for some spam site18:52
rm_workbut no one realized because we didn't speak russian <_<18:52
rm_workuntil some russian speaking users came to our IRC and were like "... wut"18:52
johnsomHahaha, I think the i18n folks have a pretty good system18:53
rm_workthere's some review I assume? that actually relies on another user being able to read it?18:53
rm_workbecause the person who did that to us actually took the time to really properly format everything18:54
rm_workit all fit like it was supposed to <_<18:54
johnsomThey use zanata: https://translate.openstack.org/?dswid=308518:54
johnsomI mean, it is still on us as cores to validate, but yeah, google translate only goes so far...18:54
rm_worki guess at least now with google translate we can see if it's OBVIOUSLY spam18:55
*** rm_mobile has quit IRC18:59
*** aojea has quit IRC19:15
openstackgerritMerged openstack/octavia master: Fix a bad revert method and add hacking check  https://review.openstack.org/49364619:40
openstackgerritMerged openstack/octavia master: Fix health monitor DB locking.  https://review.openstack.org/49325219:51
*** rcernin has quit IRC19:55
*** sshank has joined #openstack-lbaas20:00
*** aojea has joined #openstack-lbaas20:05
*** aojea has quit IRC20:10
*** sshank has quit IRC20:11
*** sshank has joined #openstack-lbaas20:14
openstackgerritAdam Harwell proposed openstack/octavia master: WIP: Floating IP Network Driver (spans L3s)  https://review.openstack.org/43561220:17
*** sshank has quit IRC20:18
*** sshank has joined #openstack-lbaas20:18
rm_workjohnsom: we don't actually... check anything about the health messages, other than verify the HMAC, do we? we just take whatever comes in and pass it to both the health_update and stats_update functions20:28
rm_workit seems20:28
rm_workthere's no intelligent routing of health messages20:28
johnsomUmm, not sure.  I haven't dug into much of that20:28
johnsomintelligent routing, no20:29
johnsomIt is intended to be dumb20:29
rm_workjohnsom: for reference, i'm adding another type of message, essentially an "emergency ping" to trigger the active-standby VIP migration, since we don't have BGP20:29
johnsomhealth messages should randomly pick a HM to send to from the list20:29
rm_workso the master keepalived can run an alert script and it'll broadcast an emergency message and the HMs can pick it up and trigger a flow for the VIP move20:30
rm_worki mean on the Healthmanager side20:30
rm_workwhen it receives a UDP message20:30
rm_workthere's no intelligent handling20:30
rm_workit's just like "ok this message has health and stats, run both types of update functions on it"20:30
johnsomAh, ok, so creating a new message type or such?20:30
rm_workyes20:30
johnsomYeah, that is cool.  We, wait for it, might want to have loadable drivers for the message handling....20:31
rm_worklololol20:31
johnsomWe have talked about using a new message type for the "I need my config back" too20:31
rm_workso yeah it looks like this system was set up to handle drivers20:31
rm_workby someone who didn't know how it works (carlos)20:31
rm_workand so it's like20:31
johnsomLike just replace the whole darn thing or actually load more than one?20:31
rm_workkinda architected in the right way, but everything is hardcoded anyway20:32
rm_workit's more like20:32
rm_workinstead of stevedore loading anything20:32
rm_workit's just20:32
*** aojea has joined #openstack-lbaas20:32
rm_workthe main "driver" is included and loaded into the class property20:32
rm_workso I could probably disentangle it and get it working as a driver system20:33
rm_workumm, one sec i'll link20:33
johnsomCool, like a named dispatch or something?20:33
johnsomhttps://docs.openstack.org/stevedore/latest/reference/index.html#namedispatchextensionmanager20:33
rm_worki mean20:34
rm_workthe comment... lol20:34
rm_workhttps://github.com/openstack/octavia/blob/master/octavia/cmd/health_manager.py#L4120:34
rm_work(that was put in by xgerman_ later I think, not at the original writing time)20:34
xgerman_me?20:35
rm_workhmm though not sure20:35
rm_workxgerman_: well, maybe it was generically some German guy20:35
rm_workwho wanted to be anonymous20:35
johnsomYeah, I think that came in with the code...20:35
xgerman_nah, that was me20:35
xgerman_the idea was to have multiple drivers handling stuff20:36
rm_workwell20:36
rm_work"came in with the code"20:36
rm_workthat patch took almost a year to merge20:36
rm_workand had like 6 authors20:36
rm_workwhen I talk about "original code" i mean draft1 by carlos20:36
rm_worka ton of the code that merged in that patch was "tacked on"20:36
johnsomAh, gotcha20:36
rm_workbefore it even made it to the repo20:36
johnsomWell, doesn't matter, let's knock out a wall and add a drive through20:36
xgerman_?20:37
rm_worklolol20:37
johnsomxgerman_ https://github.com/openstack/octavia/blob/master/octavia/cmd/health_manager.py#L4120:37
rm_worki don't fully understand that metaphor but what i am doing now is more like20:37
rm_workconstruct a covered patio outside the house with PVC and plastic sheeting, and call it an extra room20:38
xgerman_I think I caught up — we want to have a new message type to force failover/etc20:39
rm_work<N> new message types, I think20:39
xgerman_instead of just disabling health messages we want to send said message20:39
xgerman_to get it done faster?20:39
rm_workright now I am just trying to make sure I don't BREAK the existing messages20:40
rm_workby sending one that doesn't include like... [listeners]20:40
rm_workxgerman_: in my case it's because keepalived needs to failover the vip from one amp to the other in the pair20:40
johnsomI was just thinking we want to hand out tasty health messages to which ever driver wants them....20:40
xgerman_yes, that was my plan - have a list of drivers and they can do whatever they want20:41
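The "list of drivers" idea can be sketched without stevedore as a plain registry keyed by message type. Everything here is hypothetical: the class name, the `type` field, and the default of treating untyped messages as ordinary health messages are assumptions for illustration, not how the health manager works today.

```python
from collections import defaultdict


class MessageDispatcher:
    """Hand each incoming message to every handler registered for its type."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, msg_type, handler):
        self._handlers[msg_type].append(handler)

    def dispatch(self, message):
        # Assumption: messages without a 'type' are ordinary health messages.
        msg_type = message.get("type", "health")
        return [handler(message) for handler in self._handlers.get(msg_type, [])]


dispatcher = MessageDispatcher()
dispatcher.register("health", lambda msg: ("update_health", msg["id"]))
dispatcher.register("health", lambda msg: ("update_stats", msg["id"]))
dispatcher.register("master", lambda msg: ("move_vip", msg["id"]))

print(dispatcher.dispatch({"id": "amp-a"}))
print(dispatcher.dispatch({"id": "amp-b", "type": "master"}))
```

With this shape, a message of an unknown type is simply ignored, so adding a new type cannot break the existing health and stats handlers.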
*** pcaruana has quit IRC20:41
rm_workxgerman_: and it can't use BGP, it needs to call into neutron20:41
rm_workanyway yeah, I am about to ...20:41
rm_workumm20:41
rm_workmake a patch20:41
rm_workone sec20:41
xgerman_yes, BUT we already use the absence of a message to trigger failover20:41
xgerman_just throwing that out there20:41
johnsomHe is talking about act/stdby failover which doesn't involve the controller today20:42
johnsomHe needs to tell something outside (neutron?) that hey I just made this amp the master.20:42
johnsomWhich, personally, seems too slow to me, but, hey, it's a driver....20:42
xgerman_yes, understood, but it should also happen if the amphora goes dead20:43
xgerman_I am just saying he might not need a new message but new behavior20:43
johnsomIn theory, it would happen before the controller knows the amp went dead.20:43
xgerman_yes, but in his case20:43
xgerman_20:43
xgerman_my thing is the current messaging format is rich enough to convey that haproxy is down so all he needs is new behavior on the controller20:45
xgerman_e.g. you can set LB to unhealthy20:45
rm_workxgerman_: if we wait until the HM knows the amp is dead20:46
rm_workthere's no difference between ACTIVE_STANDBY and SINGLE + spares20:46
rm_workthe benefit of the pair is that keepalived is monitoring things FAST20:46
xgerman_no, the current format lets you say your LB is dead20:46
rm_workhow20:46
rm_workabsence of messages -> missed_messages * message_time == long20:47
johnsomWell, you could push an artificial heartbeat with the "down" concept.  It's just a bit odd.  Really it should be the new master that sends rm_work's new message to say, "hey, I'm master now" because the dead amp isn't going to send another heartbeat20:47
rm_workit's about 30s until the HMs detect an amp is down, from when it goes down20:47
rm_workyes, what johnsom said20:48
xgerman_oh, ok, but then we should put in each message master=true in case his “special” message gets lost - after all it’s UDP20:49
*** vegarl has joined #openstack-lbaas20:49
johnsomThat is a good idea actually20:49
rm_workyeah20:49
rm_workthat's not bad20:49
rm_worki am adding a "broadcast" method for this though, in addition to the normal round-robin that health daemon does20:50
rm_workso the idea is this "emergency" message is sent everywhere20:50
rm_workand they immediately all race to lock, which is fine20:50
rm_workbut better that than send it once to one place and it is missed20:50
johnsomThe only downside I see is you have to check the old state from the DB.20:50
rm_workhmmm20:50
xgerman_yep, that’s true20:50
rm_workmy plan was, try to do a health busy-lock20:51
rm_workand then do the FLIP failover, and then unlock20:51
johnsomYeah, the blast-o-gram option would work, just tie up some threads trying to get a lock.20:51
rm_workI think that's fine20:51
rm_workbut20:52
rm_workfeel free to tell me it's not and we can brainstorm or something20:52
johnsomAs long as whatever you are calling handles extra set calls20:52
rm_workin my case it's neutron's flip associate20:52
rm_workwhich ...20:52
rm_workyou can call with whatever destination20:52
rm_workeven if it's the same one20:52
johnsomIn case hm1 gets the lock, makes the change, releases lock, hm2 was still waiting for lock20:52
rm_workif it's the same it should noop20:52
rm_workbut that is a good point20:52
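The hm1/hm2 race johnsom describes is harmless as long as the operation behind the lock is idempotent. A minimal sketch, assuming an in-process lock stands in for the health manager's lock and a hypothetical `point_at` stands in for the neutron FLIP associate call:

```python
import threading


class FlipState:
    """Stand-in for a floating IP whose target can be re-pointed."""

    def __init__(self, target):
        self.target = target
        self.changes = 0
        self._lock = threading.Lock()

    def point_at(self, new_target):
        """Re-point the FLIP; a repeated call with the same target is a no-op."""
        with self._lock:
            if self.target == new_target:
                return False
            self.target = new_target
            self.changes += 1
            return True


flip = FlipState("amp-a")
# Several health managers all react to the same emergency message.
workers = [threading.Thread(target=flip.point_at, args=("amp-b",)) for _ in range(5)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(flip.target, flip.changes)
```

Whichever worker wins the lock makes the change; the late arrivals find the target already set and do nothing.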
xgerman_yeah, not really sure if broadcast buys us that much but…20:53
*** atoth has quit IRC20:54
xgerman_since we send a message pretty often…20:54
rm_workwell, keepalived will only run this script once20:54
johnsomYeah, don't use an actual network broadcast... You would be cycling unicast20:54
rm_workto do an "emergency" message20:54
rm_workafter that, I think just setting master=true in the normal message is fine20:54
rm_workrofl no20:54
rm_workone sec20:54
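"Cycling unicast" here means sending the same datagram to each health manager in turn rather than using a network broadcast address. A small sketch, with loopback sockets standing in for the health managers; the function name and payload are made up for illustration.

```python
import socket


def send_to_all(payload, endpoints):
    """Send one UDP datagram to every (host, port) endpoint, one at a time."""
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for endpoint in endpoints:
            sock.sendto(payload, endpoint)
            sent += 1
    return sent


# Two local listeners play the role of health managers.
listeners = []
for _ in range(2):
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.settimeout(2.0)
    listeners.append(listener)

endpoints = [listener.getsockname() for listener in listeners]
sent_count = send_to_all(b"i-am-master", endpoints)
received = [listener.recvfrom(1024)[0] for listener in listeners]
for listener in listeners:
    listener.close()
```

UDP gives no delivery guarantee, which is the reason for sending to every health manager rather than to a single randomly chosen one.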
openstackgerritAdam Harwell proposed openstack/octavia master: WIP: amphora emergency udp message extension  https://review.openstack.org/49432320:57
rm_worksuper crazy early draft ^^20:57
rm_workjust showing the framework i'm setting up now20:57
rm_workhaven't gotten very far20:58
xgerman_it irks me philosophically that we now send commands from amphora to control plane - I would rather us report status (e.g. “I am master”) to decouple the reaction from the message20:59
johnsomrm_work FYI, I would heavily test this.  I have experienced some "interesting" behaviors with keepalived, so I would make sure it's triggering this when/how you expect.20:59
johnsomYeah, we need to make sure other amps can't failover others21:00
rm_workright21:00
rm_workhmm21:00
rm_workyeah somehow i was thinking these messages were verifiable21:01
rm_workbut i guess they aren't?21:01
rm_workoh21:01
rm_worki can verify source-ip?21:01
rm_workmaybe21:01
rm_workwe get it back, just throw it away21:01
johnsomI think we salt the hmac with the amp ID which is "private" so, that should be good, but we should double check they actually did that.....21:01
rm_worklol yes21:01
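A sketch of what "salting the HMAC with the amp ID" could look like, so that one amphora cannot forge a message for another. This is an assumption about the idea being discussed, not a description of Octavia's actual wire format; the key derivation, the JSON body, and all names are hypothetical.

```python
import hashlib
import hmac
import json

DIGEST_SIZE = hashlib.sha256().digest_size


def amp_key(shared_secret, amp_id):
    """Derive a per-amphora key from the shared secret and the amphora ID."""
    return hmac.new(shared_secret, amp_id.encode("utf-8"), hashlib.sha256).digest()


def sign(payload, key):
    """Serialize the payload and append an HMAC-SHA256 digest."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return body + hmac.new(key, body, hashlib.sha256).digest()


def verify(packet, key):
    """Return the decoded payload, or None if the digest does not match."""
    if len(packet) <= DIGEST_SIZE:
        return None
    body, digest = packet[:-DIGEST_SIZE], packet[-DIGEST_SIZE:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(digest, expected):
        return None
    return json.loads(body.decode("utf-8"))


secret = b"hypothetical-shared-secret"
key_a = amp_key(secret, "amp-a")
key_b = amp_key(secret, "amp-b")
packet = sign({"id": "amp-a", "master": True}, key_a)

print(verify(packet, key_a))
print(verify(packet, key_b))
```

In this sketch the receiver would derive the key from the amphora ID it expects for the message, so a packet signed by amp-b but claiming to be amp-a fails verification.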
rm_worki mean, the only action here is21:02
rm_work"failover needs to happen for this amp_id"21:02
rm_workand we look up which amp it is and where the failover needs to go21:02
rm_worki'm not going to be like21:02
*** gcheresh has joined #openstack-lbaas21:02
xgerman_which is the same as saying “I am master”21:02
rm_workright21:02
rm_workyes that is correct21:02
rm_workbut it needs to be sent out-of-band21:02
xgerman_yes21:02
rm_worknot "wait for the next health message"21:02
johnsomYeah, I lean heavily towards "I'm master" vs. "go do something"21:02
rm_workright21:02
rm_worki mean that's what it is21:03
*** aojea has quit IRC21:03
rm_worktype: failover21:03
rm_workjust means "hey i'm master now"21:03
rm_workbut i guess i could have it just compile the same message21:03
rm_workone sec21:03
johnsomYeah, I really don't like failover as it's overloaded already with the 12 layers of failover we have21:03
xgerman_+121:03
rm_workhow do I tell if the amphora thinks it is a master or backup21:04
rm_workdo we send that in some config?21:04
xgerman_I like the “I am master message” even if keepalived can change VIP so we can update DB/status21:04
xgerman_it’s in config21:05
rm_worklooking for it21:05
rm_worknot seeing it21:06
johnsomWell, technically it doesn't matter which is which.  However, when we build, the "MASTER" has a slightly different priority config just to make the non-preempt work right.21:06
johnsomAt any given time, currently, we don't know which is which.21:06
rm_workyeah ok so21:06
rm_workhow would an amp know it needs to set the "master" flag to True or False21:06
rm_workexcept specifically from one of these events21:07
johnsomI thought you said you were going to use the script hook21:07
rm_workyes, but if it can't tell that, it can't be something we also include in the next heartbeat messages21:07
xgerman_I think we should make that a permanent status21:07
johnsomAh, yeah, well, it doesn't....21:07
xgerman_keepalived knows, it outputs it in the logs21:07
johnsomI guess you could write it out to the FS21:07
rm_workwas trying to appease xgerman_ because it wasn't a bad idea21:08
rm_workbut i think maybe it's not possible?21:08
rm_workkeepalived config has it I guess21:08
johnsomYeah, but not reliably21:08
rm_workyeah but then that's a state we have to track21:08
rm_workugh21:08
rm_workno21:08
rm_worki think i'm back to "just send it this once21:08
rm_work"21:08
johnsomYou can't trust the log for the state21:08
rm_workyeah21:08
rm_workcan't really trust anything for the state21:08
rm_workhonestly21:08
johnsomYou can send a signal to the process and have it dump a state file, but you don't want to do that every heartbeat21:09
rm_workso basically.... back to just doing a one-off message21:09
rm_workbut i think i will copy the message format and just add the field21:09
xgerman_+121:10
rm_workI am a little worried about SEQ tho21:10
rm_workbecause it's always going to be 1?21:10
rm_workbecause this is not a daemon, it's a start-once script21:10
rm_workso there's normal health messages with a valid SEQ21:10
rm_workand then all the sudden one comes in with "1"21:10
rm_workwhich would just look like an out-of-sequence message21:10
johnsomI asked for that, but I don't know if they implemented it.  It's to stop replay issues.21:10
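The SEQ concern can be made concrete with a small replay filter. Whether the receiving side actually enforces sequence numbers is left open in the discussion; this sketch only shows what would happen if it did, with a hypothetical class that remembers the highest sequence number seen per amphora.

```python
class ReplayFilter:
    """Accept a message only if its sequence number is newer than the last seen."""

    def __init__(self):
        self._last_seen = {}

    def accept(self, amp_id, seq):
        if seq <= self._last_seen.get(amp_id, 0):
            return False  # replayed or out-of-sequence message
        self._last_seen[amp_id] = seq
        return True


replay_filter = ReplayFilter()
# Normal heartbeats from the long-running health daemon.
heartbeats = [replay_filter.accept("amp-a", seq) for seq in (41, 42, 43)]
# One-off message from a freshly started script, which always begins at 1.
one_off = replay_filter.accept("amp-a", 1)

print(heartbeats, one_off)
```

Under that assumption the one-off message would be dropped, which is the argument for having the long-running health daemon emit it (triggered by a signal from the keepalived script) so that it carries the current sequence number.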
rm_workyeah which... i guess makes us vulnerable to those again21:11
xgerman_not if we keep book21:11
johnsomAssuming they implemented it21:11
rm_workbut yeah I'm not sure where it's read21:11
*** gcheresh has quit IRC21:11
rm_workxgerman_: the normal health sender daemon keeps the seq21:11
rm_workthis isn't that, it's a one-off run by keepalived21:12
rm_workso it will always have new state21:12
rm_workI guess I could implement it as like...21:12
xgerman_if you know that AMP A is master and some message comes telling you that - you can discard it21:12
rm_workthe health daemon process emits the emergency message on a signal -- and keepalived script just sends that signal to the process21:12
johnsomSo, I have to step back and ask, why can't normal VRRP moving the private VIP address just work like it does upstream?21:13
rm_workxgerman_: yes but if i format the message exactly the same as the other health messages, except with the "master" field to signal it needs to take over, then the receiving end may just discard it for being out-of-sequence21:14
rm_workjohnsom: no BGP?21:14
johnsomI mean, it's old tech that should just work.....21:14
rm_workBGP?21:14
rm_worksure21:14
johnsomWe don't have BGP upstream21:14
rm_workerr21:14
rm_worklook at what the failover script does21:14
rm_workit is doing BGP21:14
johnsomYour failover script?21:14
rm_workno21:14
rm_worki mean21:14
rm_workupstream octavia-keepalived.conf21:15
rm_workit just does a garp21:15
johnsomWe currently have zero BGP in octavia today21:15
rm_workwhich relies on BGP working21:15
johnsomNo, garp is not related to BGP at all21:15
rm_workerr21:15
rm_workok what is the garp related to21:16
rm_workwhatever it is21:16
rm_workthat is the thing we don't support21:16
rm_workwe do static routing21:16
rm_workARPing is not how our network does routing configuration21:17
johnsomARP has nothing to  do with routing either.21:17
rm_workok well21:18
rm_workwhat happens is21:18
rm_worksomething sends an ARP21:18
johnsomIt's a map between an IP and a MAC for layer 221:18
rm_workand nothing happens21:18
rm_workbecause the switches don't listen for it21:18
rm_workthey have static routes21:18
johnsomI have static routes, devstack has static routes21:18
rm_workI can send 1000 garps21:18
rm_workand it will not affect where packets go21:18
rm_workah maybe it's "FLIPs are L3"21:19
johnsomWhat you must have is a static ARP table21:20
rm_workisn't that "static routes"?21:20
johnsomif garps and the ip migration doesn't work21:20
johnsomno21:20
johnsomARPs play a role in networks with no router at all21:20
*** catintheroof has quit IRC21:21
xgerman_ARP is MAC->IP; Routing is how to get to that IP21:22
*** aojea has joined #openstack-lbaas21:22
openstackgerritMerged openstack/octavia master: Add flag to disable SSHD on the amphora image  https://review.openstack.org/49233221:22
johnsomWhen two hosts are on the same ethernet segment, configured with the same IP subnet, and host A with IP 10.1.1.5 wants to send a packet to host B with IP 10.1.1.10, host A arps a "who has IP 10.1.1.10?" and host B responds with "MAC x:x:x:x:x..."21:23
rm_workah21:23
rm_workso yeah21:23
rm_workFLIPs are L321:23
rm_worknot L221:23
johnsomA GARP just tells all of the hosts on the ethernet segment, hey, update your arp table, IP 10.1.1.5 is now at MAC x:x:x:x:x...21:23
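A toy model of the two exchanges johnsom describes, purely for illustration: each host keeps its own ARP table, a "who has" request is answered by the owner of the IP, and a gratuitous ARP makes the other hosts on the segment overwrite their cached entry. The class names, MAC addresses, and the simplification that a GARP only updates existing entries are assumptions of this sketch.

```python
class Host:
    def __init__(self, name, mac, ips):
        self.name = name
        self.mac = mac
        self.ips = set(ips)
        self.arp_table = {}  # IP -> MAC, kept per host


class Segment:
    """A single layer 2 segment; no router is involved anywhere."""

    def __init__(self):
        self.hosts = []

    def attach(self, host):
        self.hosts.append(host)

    def resolve(self, asker, target_ip):
        """'Who has target_ip?' The owner answers and the asker caches the MAC."""
        for host in self.hosts:
            if host is not asker and target_ip in host.ips:
                asker.arp_table[target_ip] = host.mac
                return host.mac
        return None

    def gratuitous_arp(self, sender, ip):
        """Unsolicited announcement: hosts that cached ip now map it to sender."""
        for host in self.hosts:
            if host is not sender and ip in host.arp_table:
                host.arp_table[ip] = sender.mac


VIP = "10.1.1.10"
client = Host("client", "02:00:00:00:00:05", ["10.1.1.5"])
amp_a = Host("amp-a", "02:00:00:00:00:0a", [VIP])
amp_b = Host("amp-b", "02:00:00:00:00:0b", [])
segment = Segment()
for host in (client, amp_a, amp_b):
    segment.attach(host)

before = segment.resolve(client, VIP)
# Failover: the VIP moves to amp-b, which announces it with a GARP.
amp_a.ips.discard(VIP)
amp_b.ips.add(VIP)
segment.gratuitous_arp(amp_b, VIP)
after = client.arp_table[VIP]

print(before, after)
```

The point for the discussion is that all of this state lives on hosts sharing one segment, so it only helps when the device forwarding traffic to the VIP is itself on that segment and honors the update.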
rm_workso can't redirect a FLIP with garp21:23
rm_workright?21:24
johnsomFLIP is a NAT right, IP 192.1.1.1 gets mapped to 10.1.1.5.  So, your FLIP/NAT device has an interface on 10.1.1.0/8 subnet right?  That interface should use ARP and honor GARP to change the MAC that owns the IP21:25
*** cpusmith has quit IRC21:25
johnsomIf it doesn't, it means your FLIP device is doing something really strange and building some sort of static ARP table, which is a bad practice (part of why DVR has issues).21:26
rm_worksec21:26
rm_worki have a better idea21:27
*** JudeC has joined #openstack-lbaas21:27
rm_workJudeC: please explain the way in which our networking is dump21:27
rm_work*dumb21:27
JudeCClarify our. you mean GoDaddy?21:28
rm_workwhy can't we change where a FLIP points via a gARP21:28
rm_workyes21:28
johnsomI mean one other way to fix this is write some code that sits on the VIP network and listens for the garps and does your FLIP magic....21:28
xgerman_it has to support ARP otherwise you wouldn't reach anything21:29
rm_workone sec finding the irc logs for JudeC to catch up21:30
johnsomWell, in normal networks, but if the sender already "knows" the MAC it could just stamp it in21:30
johnsomBad practice, but would work21:30
rm_workhttp://eavesdrop.openstack.org/irclogs/%23openstack-lbaas/latest.log.html#t2017-08-16T21:13:4321:31
JudeCWell we are using static routes to route traffic from our leaf switches down to the VM.21:31
johnsomYeah, we aren't talking about the L3 routing layer, we are talking L221:31
JudeCI believe we are layer 3 all the way down to the VM in our network, we don't route to an access switch.21:34
JudeCIt's been a while but doesn't the GARP/ARP need to have an AS in the way to update its MAC table?21:36
johnsomNo21:36
johnsomARP/GARP is layer 221:37
johnsomNo routing involved at all21:37
rm_workso what receives the ARP?21:38
rm_worka switch?21:38
rm_workand our switches have fully static routes, right?21:39
rm_work-> ARPs do nothing21:39
rm_workfrom how I understand it21:39
johnsom(172.21.21.3) at dc:9f:db:80:c3:3121:39
johnsomThat is an arp table entry21:39
rm_workok but21:39
rm_workwhere does that live21:39
JudeCyeah, our switches don't use a layer 2 switch at all.21:39
johnsomOn the hosts21:39
rm_workit doesn't live on the wire21:40
rm_workwhat hosts21:40
rm_workevery host in the network?21:40
johnsomSo, go into one of your amps, do an "arp -a" what do you have?21:40
rm_worka big list of IPs21:40
rm_workbut like21:40
rm_workthat's on this host21:40
rm_workmy laptop doesn't get the ARP from my openstack VM21:40
rm_workso it has to go to ... somewhere, right? the switch?21:41
rm_worksomewhere that has an ARP table21:41
johnsomAgain, two hosts, a cross over cable (ok, dating myself) between them.21:42
JudeCI am checking on something really quick as to not appear like a dumb dumb one sec :P21:42
rm_worki get that johnsom21:42
johnsomhost A will show Host B's IP and MAC in its ARP table, Host B will show Host A's IP and MAC21:42
rm_workbut that's not how the internet works21:42
rm_workso the switches hold the ARP tables21:43
rm_workright?21:43
johnsomNo21:43
rm_workwhat does?21:43
johnsomAnything that needs to send an IP packet.  The kernel of the instance in most cases21:43
rm_workerrr21:44
rm_workok but let's say the two machines are:21:44
rm_workmy laptop at home21:44
johnsomA switch just has a list of MAC to physical port mappings, no need for an ARP table21:44
rm_workan openstack VM21:44
johnsomSame subnet?21:44
rm_workno21:44
johnsomThey will not see each other in the arp table21:44
rm_workok21:44
rm_workso21:44
rm_workthe same is true for us21:44
*** yamamoto has joined #openstack-lbaas21:44
johnsomYou would have to have a router21:45
johnsomno21:45
rm_workmost VMs are not in the same subnet as other VMs21:45
*** yamamoto has quit IRC21:45
johnsomOur amps are, that is how we can move the same IP back and forth21:45
rm_workright but ours aren't and can't21:45
rm_workthat's the whole point21:45
rm_workwe can't move an IP back and forth21:45
rm_workbecause they just can't be connected to the same subnets21:45
rm_workso we use FLIPs21:45
rm_workwhich are L321:45
johnsomIt's a NAT21:46
*** aojea has quit IRC21:46
johnsomSo, yeah, your issue is different subnets.  When you plug the neutron port for the VIP on the amp, you pick a random subnet?21:47
rm_workyes21:47
johnsomoye21:48
rm_worknova picks it21:48
rm_workit's up to whatever subnet is available on the host it schedules the VM to21:48
rm_workwe don't get to pick subnets, only networks21:48
rm_workthus FLIPs21:48
JudeCYeah so basically we use static routes that say 192.168.1.14/32 routes to 10.0.0.124 where 10.0.0.124 is the fixed IP address of your host.21:49
JudeC192.* would be your float (network)21:50
*** sshank has quit IRC21:50
johnsomBasically every instance gets a subnet of /3221:50
JudeCas far as the static route is concerned yes.21:50
*** yamamoto has joined #openstack-lbaas21:51
JudeCso it literally looks like in the switch config:21:51
JudeCip route 192.168.1.14/32 10.0.0.12421:51
JudeCfor every single VM21:51
JudeCand its all managed via neutron which makes a call out to an external API that actually writes the routes to the L3 switches.21:52
JudeCits a mess21:52
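[Editor's sketch] The per-VM /32 static routes JudeC describes can be modeled as a longest-prefix-match table. This is only an illustration, assuming the switch behaves like a plain route table; the second route and the function names `next_hop` and `move_flip` are hypothetical, and only the 192.168.1.14/32 -> 10.0.0.124 entry comes from the log.

```python
import ipaddress

# Route table modeled on "ip route 192.168.1.14/32 10.0.0.124" from the log.
# The second entry is invented purely for illustration.
STATIC_ROUTES = {
    ipaddress.ip_network("192.168.1.14/32"): ipaddress.ip_address("10.0.0.124"),
    ipaddress.ip_network("192.168.1.15/32"): ipaddress.ip_address("10.0.0.125"),
}


def next_hop(dest, routes):
    """Return the next hop for dest using longest-prefix match, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in routes if addr in net]
    if not matches:
        return None
    # The most specific (longest) prefix wins, as on a real router.
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[best]


def move_flip(flip, new_fixed_ip, routes):
    """Repoint a floating IP's /32 route at a different fixed IP (hypothetical helper)."""
    routes[ipaddress.ip_network(f"{flip}/32")] = ipaddress.ip_address(new_fixed_ip)
```

Because the float is reached through an L3 route rather than an ARP entry, a failover in this model means rewriting the route (`move_flip`), which is why a gratuitous ARP from the new master changes nothing here.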
*** apuimedo has quit IRC21:53
johnsomYeah, so customers can't use broadcast or multicast with their instances...  Well, this is neither here nor there.21:53
*** apuimedo has joined #openstack-lbaas21:54
JudeCwhoa whoa whoa21:54
JudeCI just had an idea21:54
johnsomBasically our act/stdby won't work with that kind of networking.  It's designed for typical subnets for the amp VIP networks.  What you are trying to do is more like the act/act21:54
JudeCThe static route API that neutron uses can write routes that have a larger network, as long as that network is contiguous I think a GARP should work...21:55
rm_worki don't think we can guarantee it's contiguous21:56
JudeCwait nvm it needs to be the nexthop too.... askfjlskd21:56
rm_workbut yes, our active-standby isn't21:56
rm_workthe same21:56
*** aojea has joined #openstack-lbaas21:56
rm_workwe don't use one IP with AAP21:56
rm_workwe use a FLIP21:56
rm_workand we need to tell neutron to move it21:56
rm_workwhich, again, is why i am doing this whole thing21:56
*** yamamoto has quit IRC21:58
johnsomYeah, this is just going to be slow.21:58
rm_work~7s22:00
*** sshank has joined #openstack-lbaas22:00
johnsomYeah, you are going to have to use the notify script, send it up in a heartbeat, have something that tells neutron to update your NAT.  The other problem for you is keepalived runs inside the netns, so isolated from amp-agent/lb-mgmt-net22:01
johnsomIt handles tenant traffic, so it's isolated off22:02
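[Editor's sketch] The notify-script idea johnsom mentions could start from something like the following. It assumes keepalived's usual notify calling convention (type, name, target state as the first three arguments); the message layout and the function name `build_state_message` are hypothetical, and how the message would leave the netns and reach the heartbeat is deliberately left open.

```python
import time

# States this sketch accepts; anything else is rejected.
VALID_STATES = {"MASTER", "BACKUP", "FAULT"}


def build_state_message(argv):
    """Turn notify-script arguments into a dict that a heartbeat could carry.

    argv is expected to be [type, name, state, ...], e.g.
    ["INSTANCE", "VI_1", "MASTER"]. Extra arguments are ignored.
    """
    if len(argv) < 3:
        raise ValueError("expected at least: type, name, state")
    kind, name, state = argv[0], argv[1], argv[2].upper()
    if state not in VALID_STATES:
        raise ValueError(f"unexpected state: {state!r}")
    return {
        "kind": kind,
        "name": name,
        "state": state,
        "seen_at": time.time(),  # local timestamp of the transition
    }
```

Something on the controller side would still have to act on a "MASTER" message by telling neutron to move the FLIP, which is the slow part discussed above.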
rm_workerrr22:02
rm_workcrap22:02
rm_workok so22:02
rm_workyeah, signal i think22:02
rm_workthat actually solves some problems22:02
xgerman_I guess ACTIVE-ACTIVE won’t be great in your env either22:05
xgerman_OVS based does ARP and the other one needs routers/BGP22:05
rm_workjohnsom: oh well actually lol it doesn't matter22:10
rm_workremember we don't have tenant networks22:10
rm_workall our traffic is technically on the same network22:10
rm_workIE, mgmt-on-vip-net22:10
rm_workXD22:10
johnsomI think you are getting into the realm of local driver here....22:13
johnsomcustom amp image22:14
johnsomMaybe22:15
rm_worki mean yeah we have some patches to the amp image22:15
rm_workgetting the netmask right on the VIP is one22:15
rm_worki WAS originally adding this directly to my FLIP driver patch22:16
rm_workbut i thought maybe we'd want some part of the framework upstream...22:16
rm_workfor sending custom messages22:16
rm_workso should i just abandon this and fold it back into the FLIP thing?22:18
rm_workbut yeah i think i'm going to implement this as a signal22:18
xgerman_I still see value in us reporting who is master/backup — after all we have a column in the DB22:33
xgerman_ but if the fidelity sucks…22:33
*** fnaval has quit IRC22:35
*** aojea has quit IRC22:39
*** aojea has joined #openstack-lbaas22:40
johnsomYeah, it gets a bit strange too, so would be good to have a solid use case.  I.e. we get a "MASTER" from amp 1 we have to go find amp 2 and change its status too.22:45
johnsomI think we *could* do it with the notify scripts, but it would need testing and then the database stuffs on the controller side22:45
johnsomI guess it wouldn't be too bad, we could just use a negative where clause in the DB.22:46
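[Editor's sketch] One way to read the "negative where clause" idea: when one amp reports MASTER, mark it, then flip every other amp on the same load balancer with a `!=` filter, so the controller never has to look up the peer. The table and column names below are a simplified, hypothetical schema run against sqlite3, not the actual Octavia models.

```python
import sqlite3


def record_master(conn, lb_id, master_amp_id):
    """Mark one amphora MASTER and all its peers on the same LB BACKUP."""
    with conn:  # both updates commit together, or roll back together
        conn.execute(
            "UPDATE amphora SET role = 'MASTER' "
            "WHERE load_balancer_id = ? AND id = ?",
            (lb_id, master_amp_id),
        )
        # The "negative where clause": everything on this LB that is NOT
        # the reporting amphora becomes BACKUP.
        conn.execute(
            "UPDATE amphora SET role = 'BACKUP' "
            "WHERE load_balancer_id = ? AND id != ?",
            (lb_id, master_amp_id),
        )
```

Amphorae belonging to other load balancers are untouched because both statements filter on `load_balancer_id`.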
*** fnaval has joined #openstack-lbaas22:46
*** fnaval has quit IRC22:47
*** fnaval has joined #openstack-lbaas22:47
xgerman_yeah, not urgent but for the back of our heads22:50
*** ssmith has joined #openstack-lbaas22:52
*** ssmith has quit IRC23:03
*** ssmith has joined #openstack-lbaas23:04
*** sshank has quit IRC23:10
rm_worki guess for now i'll implement this as part of FLIP instead of standalone :/23:16
rm_workthat's fine23:16
rm_workif I don't have to account for more general cases, it's a lot simpler/faster23:17
*** ssmith has quit IRC23:19
rm_workjohnsom: yeah looks like we do nothing with SEQ23:19
*** ssmith has joined #openstack-lbaas23:19
rm_work¯\_(ツ)_/¯23:20
rm_work^^ this emoji describes my general mental state recently with regards to ... a lot of this23:20
rm_workjohnsom: you can remove your -2 on https://review.openstack.org/#/c/488885/ now right?23:29
johnsomNo23:29
johnsomWe are still in feature freeze23:29
rm_work? thought we were merging stuff23:29
johnsomJust the two for RC223:30
rm_workmy SSHD thing merged, so I figured we were good and just went and +A'd a couple of the other ones that were ready23:30
rm_workI guess I should stop those23:30
johnsomWell, let them go, we will just hope RC2 is it and those don't entangle23:31
rm_workwhatev already removed my +A but gate is in-progress so we'll see what happens23:31
johnsomYeah, they will likely merge23:32
*** aojea has quit IRC23:40
