Thursday, 2019-05-23

00:00 <ut2k3> Ok, thank you so much
00:01 <ut2k3> I'm going to sleep too!
00:01 <ut2k3> Thank you, and sorry for asking so much :/
00:04 *** ut2k3 has quit IRC
00:14 *** luksky has quit IRC
01:26 <sapd1> Do you know what the use cases are for multiple VIPs on a load balancer? Could you guys tell me?
01:36 *** ricolin has joined #openstack-lbaas
01:36 *** goldyfruit has quit IRC
02:06 *** goldyfruit has joined #openstack-lbaas
02:15 <rm_work> sapd1: so let's say you want both IPv4 and IPv6
02:15 <rm_work> :)
02:15 <rm_work> that requires one additional VIP
02:15 <rm_work> that is the best example
02:15 <rm_work> another would be if you want to be able to access it on a private AND a public network
02:16 <rm_work> depending on how routing works (maybe you have some nodes that have zero egress from their private subnet)
02:16 <rm_work> (assuming you can have a private and a public subnet on the same network, I guess, since it's limited to a single network with this design)
02:17 <rm_work> sapd1: if you have feedback about your own use cases for multi-VIP, I'd love to hear it, as I want to make the design fit as many cases as reasonably possible
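
A sketch of the dual-stack use case rm_work describes, as it might look from the CLI once the multi-VIP design lands; the --additional-vip flag is an assumption based on the design under discussion here, not a released option:

    # create an LB with an IPv4 VIP, then request a second (IPv6) VIP
    # on another subnet of the same network
    openstack loadbalancer create --name dual-stack-lb \
        --vip-subnet-id <ipv4-subnet-id> \
        --additional-vip subnet-id=<ipv6-subnet-id>
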
02:24 <sapd1> rm_work, thank you for your answer. I will try it after you finish your work :D
02:26 *** irclogbot_3 has quit IRC
02:27 <openstackgerrit> sapd proposed openstack/octavia master: Support create amphora instance from volume based.  https://review.opendev.org/570505
02:28 <rm_work> yeah, this is eating up all my time right now, but as a bonus, I DID get my devstack back up and running
02:28 <rm_work> so maybe I can test your thing
02:29 <rm_work> if there are instructions on how to set it up... let me know
02:29 *** irclogbot_2 has joined #openstack-lbaas
02:30 <openstackgerrit> sapd proposed openstack/octavia master: Support create amphora instance from volume based.  https://review.opendev.org/570505
02:31 <sapd1> rm_work, ah, about the cinder volume patch, right?
02:32 <rm_work> yes
02:34 <sapd1> :D Actually, that patch was done by many contributors :)
02:34 <rm_work> yeah, but since you are testing it, you know how to enable it, use it, and verify it in devstack?
02:34 <rm_work> right?
02:34 <rm_work> I don't know what I'd be looking at
02:36 <sapd1> ah, you have to configure volume_driver in the [controller_worker] section (volume_driver = volume_cinder_driver) and the authentication info in the [cinder] section.
02:37 <sapd1> for devstack, set this variable (OCTAVIA_VOLUME_DRIVER) in local.conf
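
A minimal sketch of the configuration sapd1 describes, using only the section and option names given above; the devstack variable is quoted from the chat, and the [cinder] auth option names are not specified here:

    # octavia.conf
    [controller_worker]
    volume_driver = volume_cinder_driver

    [cinder]
    # plus the authentication settings sapd1 mentions
    # (option names are not given in the chat)

    # devstack local.conf
    OCTAVIA_VOLUME_DRIVER=volume_cinder_driver
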
04:06 *** goldyfruit has quit IRC
04:19 <johnsom> We have a tempest gate for it.
04:20 * johnsom passes through wondering why the Fedora screen saver isn't working....
04:22 *** gcheresh_ has joined #openstack-lbaas
04:35 *** luksky has joined #openstack-lbaas
04:41 *** luksky has quit IRC
04:42 *** luksky has joined #openstack-lbaas
04:56 *** luksky has quit IRC
05:09 <openstackgerrit> sapd proposed openstack/octavia master: Support create amphora instance from volume based.  https://review.opendev.org/570505
05:11 *** yboaron_ has quit IRC
06:11 <cgoncalves> johnsom, better that and the terminal crashing than what happened to my partner's Apple laptop: the system updater kept crashing at boot, with no way to stop it from retrying at every boot
06:11 <cgoncalves> second time it's happened in the past 3 months
06:23 <openstackgerrit> sapd proposed openstack/octavia master: Support create amphora instance from volume based.  https://review.opendev.org/570505
06:38 *** gthiemonge has joined #openstack-lbaas
06:41 *** tesseract has joined #openstack-lbaas
06:43 *** yboaron_ has joined #openstack-lbaas
06:48 *** yboaron_ has quit IRC
06:49 *** yboaron_ has joined #openstack-lbaas
07:00 *** rcernin has quit IRC
07:03 *** ccamposr has joined #openstack-lbaas
07:07 *** pcaruana has joined #openstack-lbaas
07:19 *** pnull has quit IRC
07:20 *** rpittau|afk is now known as rpittau
07:42 *** ivve has joined #openstack-lbaas
08:04 *** trident has quit IRC
08:05 *** trident has joined #openstack-lbaas
08:31 <openstackgerrit> Adit Sarfaty proposed openstack/neutron-lbaas stable/stein: Support URL query params in healthmonitor url_path  https://review.opendev.org/660930
09:15 *** ricolin has quit IRC
09:29 *** luksky has joined #openstack-lbaas
09:38 *** ccamposr has quit IRC
09:38 *** ccamposr has joined #openstack-lbaas
09:38 *** ccamposr__ has joined #openstack-lbaas
09:38 *** ccamposr__ has quit IRC
10:32 *** pnull has joined #openstack-lbaas
10:32 <openstackgerrit> Ann Taraday proposed openstack/octavia master: [Jobboard] Importable flow functions  https://review.opendev.org/659538
10:37 *** sapd1_x has joined #openstack-lbaas
11:06 *** pnull has quit IRC
11:21 <openstackgerrit> Merged openstack/octavia master: Document health monitor UDP-CONNECT type  https://review.opendev.org/660364
11:34 <openstackgerrit> Merged openstack/octavia-dashboard master: Fix devstack plugin python3 support  https://review.opendev.org/660813
12:07 *** ccamposr has quit IRC
12:09 *** boden has joined #openstack-lbaas
12:33 *** luksky has quit IRC
13:05 *** ccamposr has joined #openstack-lbaas
13:16 *** goldyfruit has joined #openstack-lbaas
13:25 *** luksky has joined #openstack-lbaas
13:49 *** ricolin has joined #openstack-lbaas
14:10 *** ccamposr__ has joined #openstack-lbaas
14:10 *** pcaruana has quit IRC
14:12 *** ccamposr has quit IRC
14:21 *** lemko has joined #openstack-lbaas
14:29 *** pcaruana has joined #openstack-lbaas
14:34 *** gthiemonge has quit IRC
14:34 *** gthiemonge has joined #openstack-lbaas
14:52 *** Vorrtex has joined #openstack-lbaas
15:06 *** ivve has quit IRC
15:11 *** yboaron_ has quit IRC
15:12 *** luksky has quit IRC
15:39 *** gcheresh_ has quit IRC
15:51 *** rpittau is now known as rpittau|afk
15:54 <openstackgerrit> Merged openstack/octavia master: Fix tox for functional py36 and py37  https://review.opendev.org/660227
15:54 <openstackgerrit> Merged openstack/octavia master: Correct OVN driver feature matrix  https://review.opendev.org/660202
16:09 *** pcaruana has quit IRC
16:23 *** ivve has joined #openstack-lbaas
16:30 *** gthiemonge has quit IRC
16:30 *** gthiemonge has joined #openstack-lbaas
16:54 *** sapd1_x has quit IRC
16:55 *** ricolin has quit IRC
17:00 *** pcaruana has joined #openstack-lbaas
17:03 *** goldyfruit has quit IRC
17:14 <johnsom> cores - It would be nice to get this backport merged so we can cut a dashboard release: https://review.opendev.org/#/c/660769/
17:36 *** goldyfruit has joined #openstack-lbaas
17:51 *** lemko has quit IRC
17:53 *** gcheresh_ has joined #openstack-lbaas
18:20 *** luksky has joined #openstack-lbaas
18:23 *** gcheresh_ has quit IRC
18:38 <openstackgerrit> Michael Johnson proposed openstack/octavia master: Convert listener flows to use provider models  https://review.opendev.org/660236
19:05 *** tesseract has quit IRC
19:43 <openstackgerrit> Merged openstack/octavia-dashboard stable/stein: Fix 403 issue when creating load balancers  https://review.opendev.org/660769
19:56 *** gcheresh_ has joined #openstack-lbaas
20:08 *** rouk has joined #openstack-lbaas
20:09 <rouk> when an amphora fails to provision a listener due to OOM and it takes down the entire LB (which is not good), what's the recommended way to get it running again, now that it's immutable and can't be scaled back down?
20:12 <rouk> since it can't be failed over because it's immutable
20:26 <johnsom> rouk The retries/repair process will time out and the controller working on the object will release the load balancer back to either ACTIVE or ERROR.
20:27 <rouk> so I'm in ACTIVE/ERROR status right now, with all listeners sitting at provisioning errors (because one listener OOM'd)
20:28 <rouk> what's step 1 to get the LB to not be dead? the amphorae are live.
20:29 <rouk> but... all the listeners are in ONLINE operating status, but ERROR provisioning status
20:29 <rouk> things are all over the place after an OOM
20:33 <johnsom> If it is in ERROR you can use the failover API to have the controller rebuild the LB. ERROR is not an immutable state for failover.
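
The failover call johnsom refers to, sketched with the standard Octavia client (the ID is a placeholder):

    # ask the controller to rebuild the load balancer
    openstack loadbalancer failover <lb-id>
    # equivalently, against the Octavia v2 API:
    #   PUT /v2.0/lbaas/loadbalancers/{lb-id}/failover
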
20:34 <rouk> well, it is in my version; we spoke about this in the past.
20:34 <johnsom> You can also delete one of the failed listeners
20:34 <johnsom> What version are you running?
20:34 <rouk> best way to get the version you're looking for?
20:35 *** gcheresh_ has quit IRC
20:35 <johnsom> pip list, or from the logs
20:35 <johnsom> i.e.: May 16 10:33:38 devstack octavia-worker[31892]: INFO octavia.common.config [-] /usr/local/bin/octavia-worker version 4.1.0.dev50
20:37 <rouk> 3.0.2
20:39 <johnsom> Ah, ok, yes, there was a bug fixed in 3.1.0 for the failover problem.
20:40 <johnsom> You should be able to delete one of the failed listeners. Have you tried that?
20:42 <rouk> it was immutable too; I had to delete the listener in the DB, then set the status of the LB to ACTIVE, then fail it over
20:42 <rouk> statuses are still broken, but at least it's functional again, I guess.
20:43 <johnsom> Are you using a bionic-based amphora image? Or did you update haproxy to a newer version?
20:45 <johnsom> I'm curious how it ran out of memory.
20:45 <rouk> 2GB, 16 listeners
20:45 <rouk> don't have the LB flavors code yet; one size fits all does not fit all
20:46 <johnsom> We know that newer haproxy versions allocate a lot more memory than the older versions do. It's on our list of things to fix.
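
One mitigation worth noting here, assuming the memory is going to per-connection buffers: a listener's connection_limit maps to haproxy's maxconn, so capping it bounds each haproxy process's footprint. A sketch with the standard client; the limit value is illustrative:

    # cap concurrent connections per listener to bound haproxy memory use
    openstack loadbalancer listener set --connection-limit 5000 <listener-id>
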
20:47 *** pcaruana has quit IRC
20:48 <rm_work> so... even though I have a Depends-On for octavia-lib, this is what happens during tox testing:
20:48 <rm_work> Collecting octavia-lib===1.1.1 (from -c /home/zuul/src/opendev.org/openstack/requirements/upper-constraints.txt (line 131))
20:48 <rm_work>   Downloading http://mirror.regionone.limestone.openstack.org/pypifiles/packages/b2/5f/fd0da2ce699b8bf83570f6df6d006f68baf4e437337004bc6f0fb9865947/octavia_lib-1.1.1-py2.py3-none-any.whl
20:48 <rm_work> which kinda makes sense; the devstack inclusions don't affect tox.ini or requirements.txt
20:48 <rm_work> so how do we fix this?
20:49 <rm_work> do we change tox.ini to do some checks of devstack variables and change what it installs?
20:50 <rm_work> this is going to be a problem in the future for any patch to the octavia main repo that relies on a change to octavia-lib
20:50 *** Vorrtex has quit IRC
20:51 <johnsom> rm_work: I think I have an answer, I just need to make a sandwich.
20:51 <johnsom> Give me 5-10
20:51 <rm_work> kk, np, I'll do other stuff; back in 20
20:51 <rm_work> will give you plenty of time, sandwiches are not to be rushed
20:52 <rm_work> (for real, sandwich making is serious business IMO)
20:57 <johnsom> One of those days where I've been working since before 6 and really didn't get much to eat until now.
20:58 <johnsom> So, I'm guessing the tempest gates work, but the unit/functional jobs don't. Correct?
21:05 <rm_work> well, right now nothing works
21:05 <rm_work> so it's hard to say
21:05 <johnsom> Which patch?
21:06 <rm_work> :D
21:06 <rm_work> multi-VIP
21:07 <johnsom> Yeah, so the scenario job is failing because your DB migration is messed up.
21:07 <rm_work> right
21:07 <johnsom> http://logs.openstack.org/39/660239/9/check/octavia-v2-dsvm-scenario/9e19e9e/job-output.txt.gz#_2019-05-22_23_36_48_267958
21:07 <rm_work> but I think octavia-lib is installed fine there from source
21:07 <johnsom> As for the others: https://github.com/openstack/openstack-zuul-jobs/blob/master/zuul.d/jobs.yaml#L758
21:07 <rm_work> so yes, the tox jobs are the real problem
21:07 <johnsom> We probably need to set these up
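
For context: Zuul's tox jobs install a dependency from source instead of PyPI when the dependency's repo is listed in the job's required-projects (the tox-siblings mechanism), which is what "setting these up" would mean here. A hedged sketch of a project-local override; the job names and layout are assumptions:

    # .zuul.yaml (sketch)
    - job:
        name: octavia-tox-functional
        parent: openstack-tox-functional
        required-projects:
          - openstack/octavia-lib
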
21:10 <openstackgerrit> Adam Harwell proposed openstack/octavia master: WIP: Allow multiple VIPs per LB  https://review.opendev.org/660239
21:10 <rm_work> there's the fix for the head issue
21:10 <rm_work> working on how I'm going to get the stuff to plug properly now...
21:11 <rm_work> though I have to run and get a thing notarized before the notary place near me closes today
21:37 *** boden has quit IRC
21:37 <colin-> having trouble tracking this down in the housekeeping controller; any guidance on it?
21:38 <colin-> octavia-housekeeping[25577]: WARNING urllib3.connectionpool [-] Connection pool is full, discarding connection
21:38 <johnsom> oslo.db?
21:39 <johnsom> Ah, urllib3, probably not
21:40 <colin-> around the time I saw it, new amps were having certificate data written to them
21:40 <johnsom> The only thing I can think of in housekeeping that would use urllib3 is the cert rotation
21:40 <colin-> wasn't much else going on as far as I could tell
21:40 *** ut2k3 has joined #openstack-lbaas
21:41 <ut2k3> Hi johnsom, how are you? I am still unlucky with my reachability issue :/
21:41 <ut2k3> I've adjusted things so that I am able to SSH into the amphora via the Octavia LXC container
21:42 <johnsom> colin- Were they new amps or old amps getting new certs?
21:42 <johnsom> ut2k3 Hi. Cool, let's jump into one of the amps, then "ip netns exec amphora-haproxy bash" to get a shell inside the network namespace.
21:43 <johnsom> Then from there let's do an "ifconfig"
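
The amphora-side steps from this exchange, collected as a sketch (run as root on the amphora; amphora-haproxy is the standard namespace name):

    ip netns exec amphora-haproxy bash   # shell inside the LB network namespace
    ifconfig                             # expect eth1, plus eth1:0 carrying the VIP
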
21:44 <ut2k3> https://imgshare.io/image/screenshot-2019-05-23-234312.rVQlS that's the amphora I am jumping on
21:45 <ut2k3> http://paste.openstack.org/show/752007/
21:45 <rm_work> ut2k3: not a fan of vehicles? definitely prefer ut2k4 myself :D
21:45 <johnsom> You should see eth1 and eth1:0, where eth1:0 will have your VIP IP
21:46 <johnsom> Ah, ok, different image. That is fine, both IPs are there
21:46 <johnsom> There is no eth2?
21:46 <ut2k3> http://paste.openstack.org/show/752008/
21:48 <johnsom> If there is no eth2, are the members on the same subnet as 10.123.x.x?
21:48 <ut2k3> Yep, all in the same
21:49 <ut2k3> eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
21:49 <ut2k3>         inet 10.123.0.19  netmask 255.255.0.0  broadcast 10.123.255.255
21:49 <johnsom> Next we want to "tcpdump -nli eth1", then from another window, try connecting to the VIP / port as you would expect it to work
21:49 <ut2k3> that's one eth0 from a pool member
21:49 <johnsom> Ok, cool, you are running a one-armed LB, that is fine.
21:53 <ut2k3> That's from a host > curl: (7) Failed to connect to 10.123.0.9 port 6443: No route to host
21:53 <ut2k3> tcpdump -nli eth1 is running on the amphora
21:54 <ut2k3> Tried to connect from a k8s minion which was able to connect to it before the problems started
21:54 <johnsom> You should see output in the tcpdump if the packet is getting to the amphora
21:56 <ut2k3> `21:55:38.811404 ARP, Request who-has 10.123.0.9 tell 10.123.0.19, length 28`
21:56 <ut2k3> 10.123.0.19 is the k8s node I am currently making requests from, with: `watch "curl -v https://10.123.0.9:6443"`
21:58 <johnsom> You should see something like this:
21:58 <johnsom> 21:58:10.867009 IP 172.24.4.1.47586 > 10.0.0.44.6443: Flags [S], seq 234743365, win 29200, options [mss 1460,sackOK,TS val 2802105723 ecr 0,nop,wscale 7], length 0
21:59 <johnsom> That is from a tcpdump I did with an LB that has a listener on 6443 like you have.
21:59 <johnsom> There are a lot more packets in that connection, but that is the first one.
21:59 <ut2k3> That's the LB we are playing with: http://paste.openstack.org/show/752009/
21:59 *** ivve has quit IRC
22:00 <johnsom> Yeah, if you don't see the packet in tcpdump, it's not making it to the amphora instance at all. As I said yesterday, I'm pretty sure the amphora is healthy. Let's try to prove that real quick.
22:01 <ut2k3> http://paste.openstack.org/show/752010/
22:01 <johnsom> Inside the amphora, in the netns, do an "ifconfig lo up", then curl the VIP from the amphora netns
22:02 <ut2k3> In the `amphora-haproxy` netns, yes?
22:02 <johnsom> yes
22:04 <ut2k3> Yep, `curl -v https://10.123.0.9:6443` is working there.
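
The local health check from this exchange, as a compact sketch (the VIP and port are the ones from this session; run inside the amphora):

    ip netns exec amphora-haproxy bash
    ifconfig lo up                    # loopback starts down inside the netns
    curl -v https://10.123.0.9:6443   # a local hit on the VIP proves haproxy answers
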
22:05 <johnsom> Yeah, ok, so the amphora and load balancer are healthy. The problem is somewhere in nova/neutron such that the requests are not making it to the amphora.
22:06 <johnsom> from your controller, do an "openstack network agent list" and check that all of the neutron agents are "UP" with ":-)" next to them.
22:06 <johnsom> It could be that a neutron agent is down and the traffic isn't flowing to the instance correctly
22:08 <ut2k3> http://paste.openstack.org/show/752012/ (the amphora is currently running on the virt0 host, along with the 4 k8s nodes)
22:08 <johnsom> da45c5d6-a68a-4853-a04a-f12ca413a07e | L3 agent           | virt0                  | de-kar-1b         | XXX   | UP    | neutron-l3-agent
22:09 <johnsom> That sick l3-agent might be the problem
22:09 <johnsom> But there are a lot of them unhealthy.... It could be any one of them
22:09 <ut2k3> Ok, let me check/restart them
22:10 <johnsom> It's the neutron-linuxbridge-agent and neutron-l3-agent that matter for this scenario. The others aren't needed for this test
22:16 <ut2k3> http://paste.openstack.org/show/752013/
22:17 <ut2k3> I've cleaned up the things, as well as restarted neutron-linuxbridge on virt0
22:17 <ut2k3> That's our current setup > https://docs.openstack.org/security-guide/_images/1aa-network-domains-diagram.png
22:17 <johnsom> colin- Here is a hunch based on a quick google: https://github.com/openstack/octavia/blob/master/octavia/amphorae/drivers/haproxy/rest_api_driver.py#L400 and https://github.com/kennethreitz/requests/blob/master/requests/adapters.py#L92
22:17 <ut2k3> The old agents listed there seemed to be "zombies".
22:18 <johnsom> Hmm, ok, no L3 agent on virt0 now... I don't know if that is ok or not.
22:18 <johnsom> Does the FIP work now?
22:21 <ut2k3> Hmm, according to https://docs.openstack.org/security-guide/_images/1aa-network-domains-diagram.png it should be ok
22:22 <johnsom> colin- Maybe we are exceeding the 10 connections default, though that seems *odd*
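
Background on that hunch: python-requests defaults its HTTPAdapter pool size to 10 connections per host, and urllib3 logs exactly this "Connection pool is full, discarding connection" warning when a session opens more simultaneous connections than the pool can cache. A sketch of the behavior and the usual remedy; the mount point and pool sizes are illustrative, not Octavia's actual settings:

    import requests
    from requests.adapters import HTTPAdapter

    session = requests.Session()
    # requests' DEFAULT_POOLSIZE is 10; exceeding it triggers the warning above
    session.mount("https://", HTTPAdapter(pool_connections=10, pool_maxsize=50))
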
22:24 <ut2k3> Still no luck after restarting all of neutron on the two network nodes as well as on the compute node
22:25 <johnsom> Hmmm, then I'm not sure why traffic isn't getting to that VM instance....
22:26 <ut2k3> For example, on virt0 I can reach the VM without problems via the floating IP, as well as private IP to private IP
22:26 <johnsom> So on the host you can hit the VIP?
22:29 <ut2k3> Sorry for not being precise => the k8s nodes, which are in the same subnet, can reach each other via their private IPs.
22:30 <ut2k3> That's from a k8s node to the VIP: http://paste.openstack.org/show/752015/
22:30 <johnsom> Yeah, you shouldn't be able to ping the VIP, that is normal
22:31 <johnsom> This is troubling: (10.123.0.9) at <incomplete> on eth0
22:31 <colin-> johnsom> colin- Were they new amps or old amps getting new certs?
22:31 <colin-> new amps
22:31 <ut2k3> http://paste.openstack.org/show/752016/
22:32 <johnsom> colin- Hmmm, not sure why housekeeping would be connecting to them...
22:32 <colin-> yeah, good point
22:32 <colin-> wish I could recreate it reliably; don't want to just idle with debug on
22:32 <colin-> will look a bit more tomorrow
22:32 <johnsom> Ok, let me know
22:32 <colin-> will do, thx
22:34 <johnsom> ut2k3 You are getting into neutron debugging that I'm not an expert in. I would have expected that detaching and re-attaching the FIP would have cleared things up if the network node was messed up.
22:34 <johnsom> ut2k3 I wonder what happens if you try to attach another FIP?
22:34 <ut2k3> Sure, can do that :)
22:34 <ut2k3> sec
22:38 *** goldyfruit has quit IRC
22:40 <ut2k3> Nope, doesn't help
22:42 <johnsom> I don't know what is wrong. It's outside the VM though. The failovers have rebuilt those, a few times now. We know the code inside the VM is working. We know packets don't arrive on the interface inside the VM.
22:43 <johnsom> We confirmed the port was in the security group, right? It should be, but just double checking.
22:50 <ut2k3> NO SG  => octavia-lb-acae625f-01ff-4bfc-9b74-df0df3f59be6 [10.123.0.9 fa:16:3e:8b:e4:e4 Octavia] DOWN
22:50 <ut2k3> Has SG => octavia-lb-vrrp-e66c2f18-8b87-42c6-9b13-fa4485c79c86 [10.123.0.24 fa:16:3e:be:03:ea] UP
22:50 <johnsom> Yeah, that SG has the port open in it, right?
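
A quick way to verify what johnsom is asking, sketched with standard OpenStack client commands (the port name is from this session; the SG ID is a placeholder):

    # which security groups are on the amphora's VRRP port?
    openstack port show octavia-lb-vrrp-e66c2f18-8b87-42c6-9b13-fa4485c79c86 \
        -c security_group_ids
    # does that SG allow the listener port?
    openstack security group rule list <sg-id>   # look for ingress tcp/6443
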
22:53 *** goldyfruit has joined #openstack-lbaas
22:56 <ut2k3> I think I found something. I think the wrong SG is attached to it.
22:57 <ut2k3> Could it be that I got something wrong when we rebuilt the amphora?
22:57 <ut2k3> You may remember I have 3 LBs, and it seems the SGs got mixed up..
23:04 <ut2k3> http://paste.openstack.org/show/752018/
23:05 <ut2k3> But the listener of that LB is => Listener: cluster-production-k8s-de-kar--4fo2nxngz7a6-api_lb-7erqpobqhuh3-listener-dwnqi4thczuz TCP 6443
23:06 *** sapd1_x has joined #openstack-lbaas
23:11 <johnsom> ut2k3: ok, so we got the ports out of order when we recreated the amp records
23:13 <ut2k3> Seems like that. Could that also explain the missing ARP entry here? Question is: can I swap the port via database updates and then do a proper failover again?
23:16 <ut2k3> From the SGs that are attached to these ports, I can spot which port belongs to which LB
23:17 <johnsom> Sure, that should work
23:18 *** rcernin has joined #openstack-lbaas
23:21 <ut2k3> `UPDATE amphora SET load_balancer_id = 'CORRECT_LB_ID' WHERE vrrp_ip = 'PORT_IP'`
23:22 <ut2k3> Then I do the failover. Do you think that would work?
23:22 *** goldyfruit has quit IRC
23:27 <johnsom> Umm
23:28 <johnsom> Yeah, likely that will work
23:38 <ut2k3> Hmm, nope
23:41 <ut2k3> So I'm going to go to bed. I will follow that lead, update the amphora vrrp_port_id manually, and then do the failover
23:41 <ut2k3> Thanks, I will let you know if that helped
23:41 *** ut2k3 has quit IRC
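
A sketch of the manual repair ut2k3 signs off intending to try, using only the column names mentioned in the chat (vrrp_port_id, vrrp_ip); the exact Octavia schema and values are assumptions, and hand-editing the database is a last resort:

    -- re-point the amphora record at the port whose SG matches its LB
    UPDATE amphora
       SET vrrp_port_id = '<correct-port-uuid>'
     WHERE vrrp_ip = '<port-ip>';
    -- then run "openstack loadbalancer failover <lb-id>" so the controller
    -- rebuilds the amphora with the corrected wiring
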
23:42 *** goldyfruit has joined #openstack-lbaas
23:51 *** trident has quit IRC
23:53 *** trident has joined #openstack-lbaas
