Friday, 2021-09-03

mephmanxAll, I am having an issue installing cloudfoundry in my openstack env.  I get errors related to Octavia/lbaas.  I put the question out to a few places and got the following response:11:35
mephmanxI don’t think the Octavia endpoints are the same as the old Neutron lbaas ones were.   :path          => "/v2.0/lbaas/pools" ''OpenStack API NotFound Expected(200) <=> Actual(404 Not Found)   The API docs for Octavia seem to indicate that the path would be /v2/lbaas (rather than /v2.0)   I’ve no idea how to fix the issue, but I think that explains why the error is happening.11:35
mephmanxIs there a way I could quickly identify if this was the case?  If it turns out to be, I think I have 3 options: 1) find and fix the code in whatever is using that url to communicate with openstack 2) fix openstack neutron itself to use the correct url 3) put a service in the middle (like HAProxy or something) that could do url rewriting.11:37
mephmanxCould I get some advice?11:37
gthiemongemephmanx: /v2.0/lbaas/pools should work, this is an alias to /v2/lbaas/pools11:43
mephmanxhmm...  on my cloudfoundry install, I am receiving this error11:44
mephmanx01:29:55.852077+0000', 'admin', 'update', 'deployment', 'cf', 'CPI error ''Bosh::Clouds::CloudError'' with message ''OpenStack API NotFound Expected(200) <=> Actual(404 Not Found) excon.error.response   :body          => "{\"NeutronError\": {\"type\": \"HTTPNotFound\", \"message\": \"The resource could not be found.\", \"detail\": \"\"}}"   :cookies       => [   ]   :headers       => {     "content-length"            => "111:44
mephmanx    "content-type"              => "application/json"     "date"                      => "Thu, 02 Sep 2021 01:29:32 GMT"     "strict-transport-security" => "max-age=31536000;"     "x-openstack-request-id"    => "req-fbfa7aa1-1f13-46c6-8385-5f0fde847b08"   }   :host          => "openstack-external.lyonsgroup.family"   :local_address => "10.0.1.6"   :local_port    => 45952   :path          => "/v2.0/lbaas/pools"   :port     11:44
mephmanxif I do a curl against https://openstack-external.lyonsgroup.family:9696/v2.0/lbaas on any of the servers in question, the endpoint works so I don't believe it's network.11:45
mephmanxI made everything unrestricted so that all the bosh VMs had access so I don't believe it's security...11:46
mephmanxI could forward along anything that could help if you have a moment....  At least if you could point me to where to look I'd be grateful.  11:47
gthiemongeI believe the url is correct, but perhaps the endpoint is not the right one, 9696 is the neutron port, so you're probably using the neutron endpoint here11:48
mephmanxyeah, the error looks like it tries to do something with the loadbalancers against the neutron port:11:49
mephmanx  :host          => "openstack-external.lyonsgroup.family"   :local_address => "10.0.1.6"   :local_port    => 45952   :path          => "/v2.0/lbaas/pools"   :port          => 9696   :reason_phrase => "Not Found"   :remote_ip     => "174.54.141.197"   :status        => 404   :status_line   => "HTTP/1.1 404 Not Found\r\n"11:49
mephmanxThe 9696 neutron service is active and open...you could hit it as well from your machine using the link I posted.11:51
gthiemongemephmanx: I'm not familiar with cloudfoundry, but to me, it looks like it doesn't support octavia11:52
mephmanxI heard from them that their reference design was octavia....11:54
mephmanxI asked on their slack channel about other issues I have had getting to this point.  Mostly either things I did wrong or not very good documentation...11:55
gthiemongemephmanx: there was a neutron-lbaas proxy plugin that is now unsupported, it forwarded requests from neutron-lbaas to octavia, maybe their reference design uses it12:05
mephmanxis that gone now?  How could I install that if it is?  I have a vanilla Wallaby cloud deployed using kolla that I am working with.12:07
gthiemongeit was deprecated in Queens12:09
mephmanxis it still useable?  Could I deploy it in my stack?12:12
gthiemongeit was removed in stein, I don't believe it would work in a Wallaby env12:22
mephmanxSo the only way to use cloudfoundry then would be to fix the cloudfoundry code or maybe deploy as kubernetes?12:25
mephmanxAnyone else also have this feeling?  cloudfoundry does not support octavia natively, it supports only up to Queens (due to lbaas-proxy)?12:34
opendevreviewGregory Thiemonge proposed openstack/octavia stable/wallaby: Add generic network interface management in the amphora  https://review.opendev.org/c/openstack/octavia/+/80731012:57
mephmanxI think I see the issue in fog-openstack if it is as easy as just using the /v2 endpoint instead of /v2.0 as the /v2.0 goes to the neutron-lbaas proxy (which doesn't exist anymore).  I can point that change out (or make it) if someone else could test it out or even approve the change.  It looks like the fog library is nearly abandoned as it doesn't appear to have had a commit in over a year.13:07
mephmanxI would like to be able to get this working in a week or so....not wait months or longer...13:08
mephmanx_sorry, I dropped connection.  Anyone have any other thoughts on fog-openstack, cloudfoundry support for openstack versions after queens, neutron lbaas / octavia, etc?13:39
gthiemongenope, sorry, perhaps the people who are based in the US will join us soon and will have more insights on this13:41
johnsommephmanx_: you need to fix the endpoint cloud foundry is using for the load balancer service (i.e. not the old port way) or map the /lbaas path to point to the octavia api endpoints13:41
johnsomPeople have done the mapping/proxy using apache in the past13:43
mephmanx_ok, so the requests are basically the same, it's just the uri that is different?13:45
mephmanx_I can manage that...do you have any links that discuss that?  I have worked with apache before.  I have PFSense / HAProxy that could help with it...13:46
mephmanxsorry, connection dropped again.  Was there confirmation that the traffic is the same, it's just the uri that is wrong if I was to put a proxy in front of neutron?13:55
mephmanxAlso, could you repost any links that describe this solution?  If some were posted, I didn't get them.13:55
johnsommephmanx: https://wiki.openstack.org/wiki/Neutron/LBaaS/Deprecation14:00
mephmanxI saw that page, thanks.  Are there any blogs or walkthroughs on how others set up a proxy for the lbaas issue?14:02
johnsomBack at the deprecation time there was a test job using the apache method. If you dig back in the neutron lbaas test jobs you may be able to find it.14:02
johnsomI am on mobile so can’t dig for it right now14:02
johnsomThe API is compatible, it is just how the api is reached that changed14:03
mephmanxok, so just to confirm how I see it; if I can proxy requests made to /v2/lbaas/* to /v2.0/lbaas/*, that should do it?14:14
johnsomNo, the v2 v2.0 is aliased, it is the endpoint where octavia is listening. Could be a different port number or IP depending on how your cloud is configured. Check “openstack endpoint list”14:29
mephmanxhere is what I have:  What would I be rewriting or proxying?14:30
mephmanx+----------------------------------+---------+--------------+-----------------+---------+-----------+-------------------------------------------------------------------------+ | ID                               | Region  | Service Name | Service Type    | Enabled | Interface | URL                                                                     | +----------------------------------+---------+--------------+----------14:31
mephmanxhmm...  let me pastebin that.14:31
mephmanxhttps://pastebin.com/QVrLWPks14:31
johnsomSo, this is your target endpoint in that cloud: https://openstack-external.lyonsgroup.family:987614:36
mephmanxhere is the error I see during cloudfoundry install.  Looks like it is trying to use 9696.  https://pastebin.com/zPuRetJz14:37
mephmanxwait, I think I see what you are saying.  I need to send the lbaas requests that are going to 9696 to 9876, right?14:37
johnsomRight, it is using the neutron port 9696 instead of the octavia port 987614:37
mephmanxAh, ok.  I got it then.  Thank you!  Let me see what I can put together.14:37
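The fix agreed on above amounts to a path-based routing rule: requests whose path targets lbaas and arrive at the neutron port 9696 should be sent to the octavia port 9876 instead, and everything else stays where it is. A minimal Python sketch of that decision follows. The ports and host name come from the log; the function and constant names are hypothetical, and this only illustrates the rule, it is not the HAProxy configuration that was actually deployed.

```python
from urllib.parse import urlsplit, urlunsplit

# Ports taken from the conversation; the names are illustrative only.
NEUTRON_PORT = 9696
OCTAVIA_PORT = 9876


def reroute_lbaas(url: str) -> str:
    """Return url with the neutron port swapped for the octavia port
    when the path targets lbaas; otherwise return url unchanged."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # /v2.0/lbaas/... and /v2/lbaas/... are aliases, so only the
    # "lbaas" path segment matters, not the version prefix.
    if "lbaas" not in segments or parts.port != NEUTRON_PORT:
        return url
    netloc = f"{parts.hostname}:{OCTAVIA_PORT}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

The path itself is left untouched, which matches the point made earlier in the log that the API is compatible and only the endpoint changed.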
mephmanxI was able to get the redirect working but I am now seeing 504's back from octavia on POSTs to /v2.0/lbaas/pools/<poolid>/members15:30
johnsommephmanx Hmmm, you might need to increase the connection/data timeouts in your proxy.15:48
mephmanxok.  Could there be any other resources that would need this sort of rewrite?   I ran a bunch of prep scripts via terraform to prep the env for cloudfoundry...the scripts created a bunch of stuff but one thing they created was the loadbalancer.  Would I need to delete and recreate or possibly recreate the entire env due to this?  Could something be in a bad state from not having this access and now the db is messed up?15:51
johnsomNo, the database will be consistent, but the resource may not be fully set up as terraform expects. Considering it was adding members, I would just check that all of the service instances you would expect are set up correctly on the load balancer pool.15:53
johnsomI'm now in the office, I can see if I can dig and find the old setup that was used for testing if you would like.15:53
johnsomIt might have the timeouts that were used.15:53
mephmanxif you could, I would greatly appreciate it.  It looks like I was using the haproxy default of 30 seconds...15:54
johnsomI would hope that terraform would have logged any resources it wanted to set up, but was unable to complete. But I have also not used cloudfoundry15:54
johnsomYeah, give me a few minutes. I need to jump in the way-way-back machine. grin15:54
mephmanxI see log entries like this as well:  Pool cannot be created or modified because the Load Balancer is in an immutable state15:56
mephmanxthat looks like it happens if LB is in ERROR or one of the pending states but mine is in ACTIVE and it looks like the script even added a member...15:57
johnsomThat is normal. If an update is in-progress on the resource, to keep consistency, we send back an HTTP status code that says it should retry the request.15:57
mephmanxok, thanks15:57
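The "immutable state" responses described above are meant to be retried by the client. A minimal Python sketch of such a retry loop is given here. It assumes the retry signal is HTTP 409 Conflict, which the log does not state, and `send` is a hypothetical callable standing in for whatever actually issues the request.

```python
import time
from typing import Callable, Tuple

# Assumption: 409 Conflict is the "try again" status; the log does not name it.
RETRY_STATUS = 409


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a status other than RETRY_STATUS
    or the attempts are used up; return the last (status, body)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    status, body = send()
    for _ in range(attempts - 1):
        if status != RETRY_STATUS:
            break
        # Give the in-progress update time to finish before retrying.
        sleep(delay_seconds)
        status, body = send()
    return status, body
```

The `sleep` parameter is injectable only so the loop can be exercised without real delays.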
mephmanxI set it to 600 seconds but still got errors....  https://pastebin.com/7p40kfYf15:58
johnsomHmm, adding members should only take a few seconds really. It depends on the compute environment, is it nested virtualization (running VMs in VMs) or not.15:59
mephmanxit is nested..16:00
johnsomOk, then it's important to have KVM enabled as otherwise nova can take up to 16 minutes to fully boot a VM.16:00
mephmanxthe 504s look like they are also happening on /v2.0/lbaas/pools/<pool id>16:00
mephmanxKVM is enabled16:00
johnsomOk, good! Yeah, 504 is purely a timeout at the proxy layer, it's not coming from Octavia API.16:01
mephmanxit doesn't look like it waited the 600 seconds...the 504 came back pretty quick still...  I'm wondering if it's another timeout or something else missing...16:01
johnsomWhat are you using for the proxy? Apache, haproxy?16:01
mephmanxhaproxy.  I added a rule that says "if the string lbaas is in the path, send to octavia instead of neutron"16:02
mephmanxIt actually looks like all the octavia apis are sending 504...must be setup then...16:02
johnsomRight, perfect. So there are 2-3 timeouts that probably need to be set in haproxy for this path.16:03
mephmanxno, wait...some of the gets are returning...16:03
mephmanxI updated the timeouts on the neutron frontend...maybe octavia as well?  I'll check backend timeouts...16:03
mephmanxfound a few more timeouts to mess with...16:05
johnsomtimeout client <>16:07
johnsomtimeout server <>16:07
johnsomtimeout tunnel <>16:07
johnsomI would just set those all to like 5 minutes16:07
mephmanxI updated them all and it's still processing...good sign...16:08
johnsom+116:08
mephmanxstill going so it might be working!  I'm worried if the calls are staying open and will hit the 600 second mark...16:14
mephmanxI do see new vms in horizon though...16:14
johnsomIt really should not, the Octavia API calls return pretty quickly, the longest time spent is waiting for nova/neutron to plug network ports, etc.16:15
mephmanxjohnsom I think it worked!  Or, at least the deployment got further.  This is the second issue you helped me through...I should send you doughnuts or something!16:37
mephmanxdeployment success!!!  Thank you!16:57
mephmanxhttps://pastebin.com/8cNDiY9i16:58
johnsomCool, glad I could help.17:01
