Friday, 2018-08-03

00:07 <bbbbzhao_> cgoncalves: Maybe you didn't run the health monitor to refresh the operating_status? ;-)
00:07 <bbbbzhao_> cgoncalves: Sorry, I need to run to the office, it's late ;-). I will reply when I arrive.
00:08 <bbbbzhao_> johnsom: Does that mean I need to post a new revision for patch 2?
00:10 <johnsom> I haven't posted any comment that requires a revision to patch 2. We may add a patch to the end of the chain, but I am not committed to another patch 2 yet.
00:23 <bbbbzhao_> johnsom: Oh, yeah. Sorry. I took the -1 as a prompt to look into the issue. Thanks
00:26 <johnsom> bbbbzhao_ Yes, no problem, I just wanted to call attention to my question there, as reordering flows is higher risk.
00:49 *** longkb has joined #openstack-lbaas
01:31 *** abaindur has quit IRC
01:44 *** hongbin has joined #openstack-lbaas
01:59 <bzhao__> cgoncalves: Hi, for your question, did you test on CentOS? get_udp_listeners just searches for the UDP-specific named config file; is_udp_listener_running checks that the keepalived process holding that file is running, by searching /proc for its pid. Maybe keepalived is not running.
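For readers following along, the /proc-based liveness check bzhao__ describes can be sketched like this. This is a minimal illustration, not Octavia's actual agent code; the function name and pidfile path argument are hypothetical (the real agent derives the pidfile location from the listener id):

```python
import os


def is_listener_running(pid_path):
    """Sketch of the agent-side check: read the pid recorded in the
    keepalived pidfile and consider the UDP listener running only if
    that pid still exists under /proc. (pid_path is a stand-in name.)
    """
    try:
        with open(pid_path) as f:
            pid = int(f.read().strip())
    except (IOError, OSError, ValueError):
        # Missing or unreadable pidfile, or garbage contents: not running.
        return False
    # On Linux, /proc/<pid> exists exactly when the process is alive.
    return os.path.exists(os.path.join('/proc', str(pid)))
```

This matches the debugging advice in the chat: if the pidfile exists but /proc/&lt;pid&gt; does not, keepalived died, and /var/log/messages on the amp is the next place to look.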
02:00 *** ramishra has joined #openstack-lbaas
02:01 *** yamamoto has joined #openstack-lbaas
02:04 <bzhao__> johnsom: I will provide full logs for the flow-reorder concern in part 2. Today I will start on the highest-priority UDP storyboard bug, and try my best to finish the rest this weekend. Thank you again for all the help, and thanks to our Octavia team. ;-)
02:05 <bzhao__> cgoncalves: So your amp is in a troubled status; the health monitor will remove the amp and rebuild it.
02:05 <bzhao__> cgoncalves: As it cannot get the expected listener to update into the DB.
02:06 <bzhao__> cgoncalves: By "it" in the sentence above I mean the health monitor process.
02:08 *** ramishra has quit IRC
02:12 *** yamamoto has quit IRC
02:18 *** yamamoto has joined #openstack-lbaas
02:23 *** yamamoto has quit IRC
02:35 <bzhao__> cgoncalves: sorry, s/health monitor/health manager/
02:39 <johnsom> Thank you. Volunteering at the county fair tonight, so I can't work on it tonight, but I will work on it again tomorrow.
03:03 *** yamamoto has joined #openstack-lbaas
03:08 <bzhao__> johnsom: Thanks. ;-) Racing against the clock.
03:09 <bzhao__> johnsom: I have prepared the log; I think I'll show it here. Hope it doesn't flood other folks' screens.
03:10 <bbbbzhao_> https://www.irccloud.com/pastebin/tgt6zgFO/This%20is%20operation%20steps.
03:11 *** yamamoto has quit IRC
03:20 <bbbbzhao_> http://paste.openstack.org/show/727198/ - this is a piece of the health manager logs.
03:26 <bbbbzhao_> I collected this without moving the order. The agent side raises a 500 for start listener: https://www.irccloud.com/pastebin/GuMt9ECo/This%20is%20the%20error%20in%20log
03:26 <bbbbzhao_> johnsom: I'll mark it here and ping you so it isn't missed. Thanks. ;-) Have a good rest.
03:37 *** yamamoto has joined #openstack-lbaas
03:41 *** yamamoto has quit IRC
03:43 *** yamamoto has joined #openstack-lbaas
03:52 *** hongbin has quit IRC
04:02 *** yamamoto has quit IRC
04:03 *** ramishra has joined #openstack-lbaas
04:03 *** yamamoto has joined #openstack-lbaas
04:14 *** yamamoto has quit IRC
04:28 *** yamamoto has joined #openstack-lbaas
04:38 *** yamamoto has quit IRC
04:43 *** yamamoto has joined #openstack-lbaas
04:47 *** yamamoto has quit IRC
04:56 *** yamamoto has joined #openstack-lbaas
04:58 *** yamamoto has quit IRC
06:03 *** yamamoto has joined #openstack-lbaas
06:07 *** yamamoto_ has joined #openstack-lbaas
06:09 *** yamamoto has quit IRC
06:10 *** yamamoto_ has quit IRC
06:54 *** rcernin has quit IRC
07:02 *** annp has quit IRC
07:03 *** longkb has quit IRC
07:17 *** longkb has joined #openstack-lbaas
07:31 *** longkb has quit IRC
07:35 *** longkb has joined #openstack-lbaas
07:56 *** ktibi has joined #openstack-lbaas
08:19 <cgoncalves> bzhao__, yes, CentOS. Right, the health manager is failing over the amp because the expected listeners don't match.
08:19 <cgoncalves> I will continue looking at it today.
08:48 *** salmankhan has joined #openstack-lbaas
08:49 <openstackgerrit> ZhaoBo proposed openstack/octavia master: Followup patch for UDP support  https://review.openstack.org/587690
08:51 <bzhao__> cgoncalves: Thanks. My env just got wiped. Can the process be found under /proc/PID on CentOS? If yes, we need to go inside and check /var/log/messages for why it could not be set up. If not, it's another difference between the OSes.
08:55 <cgoncalves> bzhao__, I'm restacking with the latest patches. Yeah, I wanted to check that yesterday but it was 2 AM for me :)
08:55 <cgoncalves> I'll check and keep you posted.
08:58 <bzhao__> cgoncalves: You work so hard. ;-) Take a good rest.
08:58 <cgoncalves> thanks :)
09:58 *** obre is now known as obre_
09:58 *** obre_ is now known as obre
10:04 *** obre has quit IRC
10:53 <openstackgerrit> ZhaoBo proposed openstack/octavia master: [UDP] Fix failed member always in DRAIN status  https://review.openstack.org/588511
11:53 *** amuller has joined #openstack-lbaas
12:38 *** longkb has quit IRC
13:04 *** ktibi has quit IRC
13:22 *** ramishra has quit IRC
14:06 *** ktibi has joined #openstack-lbaas
14:12 <cgoncalves> I no longer get an unexpected # of listeners. The latest PS should have fixed it.
14:13 <cgoncalves> although the pool operating_status stays OFFLINE
14:13 <cgoncalves> member provisioning status is ACTIVE
14:16 <cgoncalves> ah... netcat isn't installed on the CentOS amp :/
14:18 *** rpittau has quit IRC
14:33 *** hongbin has joined #openstack-lbaas
14:41 *** erjacobs has joined #openstack-lbaas
14:41 <cgoncalves> interesting. The amphora-haproxy netns isn't created after a VM reboot, and the amp isn't failed over, since eth0 (lb-mgmt) is up and reports health msgs.
14:54 <cgoncalves> https://storyboard.openstack.org/#!/story/2003306
15:47 <openstackgerrit> German Eichberger proposed openstack/octavia master: Allows failover if port is not deallocated by nova  https://review.openstack.org/585864
15:57 *** erjacobs has quit IRC
16:02 -openstackstatus- NOTICE: The infra team is renaming projects in Gerrit. There will be a short ~10 minute Gerrit downtime in a few minutes as a result.
16:21 <johnsom> cgoncalves Hmm, so I think there is a systemd service for setting up the netns; I wonder if that is failing.
16:27 <xgerman_> would that run for UDP?
16:32 <johnsom> Well, it doesn't *depend* on any other systemd service, but the question is where in the code it is getting written out.
16:34 <johnsom> Yeah, it is missing on the pure UDP path.
16:49 *** openstackgerrit has quit IRC
16:52 <cgoncalves> johnsom, can you reproduce it on Ubuntu?
16:52 <johnsom> I haven't tried, but I can clearly see in the UDP path where this is missing.
16:53 <johnsom> cgoncalves https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/agent/api_server/listener.py#L190
16:53 <johnsom> That section is not in the pure UDP path.
16:53 <johnsom> And it should be.
16:53 <johnsom> Though a refactor would be nice too, but....
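The shape of the fix being discussed is a boot-time systemd unit that (re)creates the amphora-haproxy network namespace, so it survives a VM reboot even when only a UDP listener exists. A rough illustration below; the unit contents and helper name are hypothetical, not the unit Octavia actually writes:

```python
def render_netns_unit(netns_name='amphora-haproxy'):
    """Return the text of a hypothetical oneshot systemd unit that creates
    the given network namespace at boot. Illustrative only: the real agent
    writes its own unit file on the amphora; this shows the general idea.
    """
    return """[Unit]
Description=Create the {ns} network namespace
Before=network.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/ip netns add {ns}

[Install]
WantedBy=multi-user.target
""".format(ns=netns_name)
```

A oneshot service with RemainAfterExit=yes runs once at boot and stays "active", which is the usual systemd pattern for setup tasks like namespace creation.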
17:22 *** dmellado has quit IRC
17:27 <johnsom> cgoncalves Can you confirm the LB with the amp you rebooted only had a UDP listener? Is there something more you want me to test here on xenial?
17:28 <cgoncalves> johnsom, only one UDP listener
17:28 <cgoncalves> johnsom, netcat being installed is a new requirement, correct?
17:28 <johnsom> cgoncalves Yeah, ok, so it's that missing code. I will fix it in my patch today.
17:28 -openstackstatus- NOTICE: Project renames and review.openstack.org downtime are complete without any major issue.
17:29 <johnsom> I am stacking last night's code, going to investigate that flow re-order, then try to finish my review of #2.
17:29 <cgoncalves> thanks \o/
17:39 *** openstackgerrit has joined #openstack-lbaas
17:39 <openstackgerrit> Carlos Goncalves proposed openstack/octavia-tempest-plugin master: WIP: Gate on CentOS 7 and check on Ubuntu Bionic  https://review.openstack.org/587414
17:44 <openstackgerrit> Carlos Goncalves proposed openstack/octavia-tempest-plugin master: WIP: Gate on CentOS 7 and check on Ubuntu Bionic  https://review.openstack.org/587414
17:45 <cgoncalves> johnsom, we'll need to revert your Ib8677d2b85e352b19abf5fd0b79c1b8653819301
17:45 <cgoncalves> "Job octavia-v2-dsvm-scenario-ubuntu.bionic in openstack/octavia-tempest-plugin is not permitted to shadow job octavia-v2-dsvm-scenario-ubuntu.bionic in openstack/octavia"
17:45 <johnsom> Say what?
17:46 <cgoncalves> or I can fix that in https://review.openstack.org/#/c/587442/
17:46 *** salmankhan has quit IRC
17:48 <johnsom> cgoncalves Where are you seeing that? Ib8677d2b85e352b19abf5fd0b79c1b8653819301 is correct; however, there should be no definition for that in octavia/octavia.
17:49 <KeithMnemonic> johnsom can you point me to the location of the code that populates octavia.conf or amphora-agent.conf in the amphora images? i.e. when changing a value on the octavia-worker node, how does it get sent to the amphora when it is freshly booted?
17:50 <johnsom> amphora-agent.conf is created at amp boot time and loaded via config drive. It does not get updated after boot (if I remember right). We have plans to enable that, but it's not implemented yet.
17:51 <johnsom> KeithMnemonic Hi Keith BTW.
17:51 <johnsom> The template is here: https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/agent/templates/amphora_agent_conf.template
17:52 <johnsom> It gets rendered here: https://github.com/openstack/octavia/blob/master/octavia/controller/worker/tasks/compute_tasks.py#L77
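The rendering step johnsom links to boils down to filling a template with per-amphora values on the controller and shipping the result to the new VM via config drive. A toy stand-in of that step; the template text and function name here are invented (the real code renders the Jinja2 amphora_agent_conf.template linked above), and plain str.format is used so the sketch runs without Jinja2 installed:

```python
# Hypothetical miniature of the controller-side rendering: the real template
# is amphora_agent_conf.template and is rendered with Jinja2, then handed to
# nova as config-drive user data at amp boot.
AGENT_CONF_TEMPLATE = """[DEFAULT]
debug = {debug}

[amphora_agent]
amphora_id = {amphora_id}
"""


def render_agent_conf(amphora_id, debug=False):
    """Fill the template with this amphora's values."""
    return AGENT_CONF_TEMPLATE.format(amphora_id=amphora_id, debug=debug)
```

Because the file rides in on the config drive at boot, changing a value on the worker afterwards has no effect on already-running amphorae, which is exactly the point johnsom makes above.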
17:53 <johnsom> cgoncalves Ah, this needs to go away... https://github.com/openstack/octavia/blob/master/zuul.d/jobs.yaml#L54
17:53 <johnsom> Sigh, too many things going on at once....
17:53 <johnsom> cgoncalves yes, nuking it here would be great: https://review.openstack.org/#/c/587442/
17:54 <cgoncalves> right. that's what I'm gonna do once I get to fixing octavia-v2-dsvm-scenario-centos.7
17:55 <johnsom> Thanks
17:57 <KeithMnemonic> johnsom Hello back, sorry to be so abrupt ;-) thanks, config drive is what I was looking for.
17:57 <KeithMnemonic> is that the same for octavia.conf?
17:57 <johnsom> octavia.conf is only on the controllers
17:58 <johnsom> And is configured/installed by the operator or packager
17:58 <KeithMnemonic> ok, so maybe this guy read your note wrong: http://eavesdrop.openstack.org/irclogs/%23openstack-lbaas/%23openstack-lbaas.2016-05-26.log.html
17:58 <KeithMnemonic> johnsom: Just a second, I will send you the settings I think you need to increase
17:58 <KeithMnemonic> johnsom: kevo These two in octavia.conf
17:58 <KeithMnemonic> johnsom: # rest_request_conn_timeout = 10
17:58 <KeithMnemonic> johnsom: # rest_request_read_timeout = 60
17:58 <KeithMnemonic> kevo: johnsom, I'll try that out and I'll let you know. Thanks
17:58 <KeithMnemonic> kevo: thanks johnsom your suggestion worked.
17:58 <johnsom> # ls /etc/octavia/
17:58 <johnsom> amphora-agent.conf  certs
17:58 <johnsom> root@amphora-6bf09249-2ba9-4ce3-9572-83e61dcf5e21:/usr/lib/systemd/system#
17:59 <KeithMnemonic> I see an octavia.conf but it is all commented out
17:59 <KeithMnemonic> do you recall that conversation?
18:00 <johnsom> Right, those settings are only valid in octavia.conf (they are controller settings). When they are commented out, they are using the default values, which is the number in the comment.
18:00 <openstackgerrit> Carlos Goncalves proposed openstack/octavia master: Gate on octavia-dsvm-base based jobs and housekeeping  https://review.openstack.org/587442
18:00 <KeithMnemonic> ok, the guy who pinged me thought they should end up on the amphorae
18:00 <johnsom> So, this: # rest_request_conn_timeout = 10
18:01 <johnsom> Means rest_request_conn_timeout is using the coded default, which is 10 by default.
18:01 <johnsom> No, that is the timeout for the controller talking to the amphora-agent. It doesn't get set or used in the amp.
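The "commented out means default" behavior can be seen with a few lines of code. A simplified sketch: the option names and their defaults (10 and 60) come from the chat; the [haproxy_amphora] section name is an assumption, and Octavia really uses oslo.config rather than bare configparser:

```python
import configparser

# A commented-out "# rest_request_conn_timeout = 10" line is invisible to the
# parser, so the lookup falls back to the coded default, which is the very
# number shown in the comment.
DEFAULTS = {
    'rest_request_conn_timeout': 10,
    'rest_request_read_timeout': 60,
}


def get_timeout(conf_text, option):
    """Return the configured value, or the coded default if the option is
    absent or commented out. (Section name assumed for illustration.)"""
    cp = configparser.ConfigParser()
    cp.read_string(conf_text)
    try:
        return cp.getint('haproxy_amphora', option)
    except (configparser.NoSectionError, configparser.NoOptionError):
        return DEFAULTS[option]
```

So uncommenting the line and changing the number is what actually overrides the controller's behavior; leaving it commented changes nothing.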
18:01 <KeithMnemonic> yes, he misunderstood
18:02 <johnsom> Two years ago??? ha
18:02 <KeithMnemonic> his issue is the VIP is not plugging fast enough, and he saw that old thread and thought it was the same issue
18:02 <johnsom> Lucky if I remember what people asked me last week
18:02 <johnsom> Hmm, so nova isn't booting the VM fast enough?
18:02 <openstackgerrit> Carlos Goncalves proposed openstack/octavia master: Gate on octavia-dsvm-base based jobs and housekeeping  https://review.openstack.org/587442
18:03 <johnsom> KeithMnemonic That setting is connection_max_retries and connection_retry_interval
18:04 <johnsom> But the default for that is like 25 minutes. If he can't boot an amp in that, he might as well go home.....
18:04 <johnsom> Could be that his lb-mgmt-net isn't working
18:04 <johnsom> Normal time for that action is less than 30 seconds
18:05 <johnsom> We have it at 25 minutes for VirtualBox users and super-slow gate test hosts
18:05 <openstackgerrit> Carlos Goncalves proposed openstack/octavia master: Gate on octavia-dsvm-base based jobs and housekeeping  https://review.openstack.org/587442
18:05 * cgoncalves needs more coffee...
18:05 <johnsom> Production deploys usually drop that down, but like I said, it's normal to just boot in less than 30 seconds
18:06 <johnsom> KeithMnemonic I would check that the lb-mgmt-net is even working. Most likely that is the problem.
18:06 <KeithMnemonic> yeah, for sure. I can check the plumbing.
18:07 <KeithMnemonic> the log showed     Error code 400:
18:07 <KeithMnemonic> JSON Response:
18:07 <KeithMnemonic>   {
18:07 <KeithMnemonic>     'message': 'Invalid VIP',
18:07 <KeithMnemonic>   }
18:07 <KeithMnemonic>         I found a match for this error:
18:07 <KeithMnemonic> and that led him to that old thread listed above
18:09 <KeithMnemonic> 2018-07-03 11:44:44.977 1711 INFO werkzeug [-] 10.207.206.13 - - [03/Jul/2018 11:44:44] "POST /0.5/plug/vip/10.207.221.55 HTTP/1.1" 404 –
18:09 <colin-> when you guys create an LB and the provisioning state goes to ERROR, what's the first place you look for additional info? I'm not super familiar with the API at the moment, and my instinct is to just check the worker's process output, but I'm wondering if there's better info than that available
18:16 <johnsom> KeithMnemonic 400 means user error. Like the VIP address doesn't match the subnet/network specified, or something like that.
18:17 <cgoncalves> colin-, that is how I do it, too :/
18:17 <johnsom> colin- Yes, provisioning_status ERROR means the controller ran into a problem, all of the retries/workarounds have failed, and the controller needs to stop and ask for operator intervention. Like if nova goes down in the middle of booting an amphora, or neutron fails to create a port and the retries time out. The first stop is going to be the controller logs.
18:18 <cgoncalves> today I added "Debugging - End-user friendly ERROR messages" as a proposed topic for the PTG
18:18 <johnsom> In that state the end user has the option of escalating to the operator, or deleting the object in ERROR and trying again.
18:19 *** openstackgerrit has quit IRC
18:19 <johnsom> cgoncalves bring the popcorn. This is a fun topic. Many operators want to hide the true reason for the failures from the users... SLA contracts and such....
18:19 <colin-> understood, thanks
18:20 <johnsom> They don't like octavia objects to say "Nova compute has been down for 8 hours, unable to create load balancer", which is exactly what it would say for a recent issue I saw... lol
18:20 <cgoncalves> johnsom, I know. I was thinking something generic yet a bit more useful, like pointing fingers at the compute or network service.
18:32 <johnsom> cgoncalves It is a good topic to discuss though, so please feel free to add it to the etherpad
18:33 <johnsom> https://etherpad.openstack.org/p/octavia-stein-ptg
18:33 <cgoncalves> I already did :)
18:33 <johnsom> I see that. Awesome
18:34 *** openstackgerrit has joined #openstack-lbaas
18:34 <openstackgerrit> German Eichberger proposed openstack/octavia master: Allows failover if port is not deallocated by nova  https://review.openstack.org/585864
18:36 <openstackgerrit> Carlos Goncalves proposed openstack/octavia-tempest-plugin master: WIP: Gate on CentOS 7 and check on Ubuntu Bionic  https://review.openstack.org/587414
19:04 *** ktibi has quit IRC
19:16 *** abaindur has joined #openstack-lbaas
19:19 *** salmankhan has joined #openstack-lbaas
19:24 *** salmankhan has quit IRC
19:28 *** abaindur has quit IRC
19:52 *** amuller has quit IRC
19:56 *** dmellado has joined #openstack-lbaas
20:09 <rm_work> jiteka: ask questions here :P
20:15 <jiteka> I'm too shy
20:15 <jiteka> :D
20:15 <jiteka> actually I already bothered johnsom a couple of times this week
20:15 <jiteka> it was your turn
20:16 <xgerman_> lol
20:17 <rm_work> lol
20:21 *** harlowja has joined #openstack-lbaas
20:27 <johnsom> jiteka No need to be shy, we are all friendly here
20:29 <jiteka> I know johnsom, I was joking :)
20:37 <johnsom> Funny how we all have the same advice... grin
20:47 <johnsom> Nice, our upgrade tags have shown up: https://pypi.org/project/octavia/
20:48 <cgoncalves> I don't get why our CentOS job so often fails to build the amp image with "Cannot retrieve metalink for repository: epel/x86_64. Please verify its path and try again"
20:48 <cgoncalves> http://logs.openstack.org/14/587414/5/check/octavia-v2-dsvm-scenario-centos.7/18de25c/job-output.txt.gz#_2018-08-03_20_28_39_672377
20:50 <cgoncalves> this is a successful run: http://logs.openstack.org/55/587255/1/check/octavia-v1-dsvm-scenario-kvm-centos.7/f145430/logs/devstacklog.txt.gz#_2018-07-31_00_16_20_969
20:50 <johnsom> looking
20:50 <cgoncalves> even though there was that metadata 404
20:51 <johnsom> Well, even on that one, "updateinfo.xml.bz2: [Errno 14] HTTP Error 404 - Not Found" doesn't seem good
20:52 <johnsom> So, I could be lazy and say the epel mirrors are trash, but I'm not, so give me a few minutes to look at some things
20:52 <cgoncalves> johnsom, I asked on #openstack-infra but no luck. perhaps you could work your magic :D
20:53 <johnsom> lol
20:53 <johnsom> Maybe they don't get into the RedHat parties either
20:53 <cgoncalves> lol
20:55 <johnsom> Hmm, looks like that is occurring in the base elements from DIB too and not one of ours? I can't imagine you got an include of iscsi from me
20:56 <cgoncalves> diskimage-builder/diskimage_builder/elements/base/install.d/00-baseline-environment
20:56 <cgoncalves> install-packages -m base iscsi_package
21:01 <xgerman_> with the next PTG around the corner, we are better on that bus ;-)
21:02 <johnsom> cgoncalves My initial guess is this element step is not working: 01-set-centos-mirror
21:02 <johnsom> But still looking
21:03 <cgoncalves> johnsom, I doubt that, because that's for CentOS repos. epel is managed separately
21:03 <johnsom> That it is running out to the interwebs and not using the OpenStack infra mirrors. But it could be that epel isn't mirrored as well
21:03 <johnsom> Well, that *might* be the issue.... See what I am saying
21:04 <johnsom> cgoncalves Yep, ok, got it
21:04 <johnsom> http://logs.openstack.org/14/587414/5/check/octavia-v2-dsvm-scenario-centos.7/18de25c/job-output.txt.gz#_2018-08-03_20_28_38_556548
21:05 <johnsom> This is the execution log of the DIB phase. The # prefixes are the ordering
21:05 <johnsom> Oh, wait, never mind, looking in the wrong place.
21:05 <johnsom> 05-rpm-epel-release This one might not be working....
21:05 <johnsom> lol
21:06 <johnsom> https://github.com/openstack/diskimage-builder/tree/master/diskimage_builder/elements/epel
21:06 <cgoncalves> from what I've seen, it is working. the epel-release package gets installed
21:07 <cgoncalves> mirror is not overwritten, which is expected
21:08 <cgoncalves> it's really intermittent. see http://logs.openstack.org/06/586906/2/check/octavia-v1-dsvm-scenario-kvm-centos.7/b60cf4b/logs/devstacklog.txt.gz#_2018-08-01_12_12_02_645. no errors or warnings whatsoever
21:11 <johnsom> Yeah, it looks like the "cache data" for fastestmirror is getting trashed?
21:13 <johnsom> Hmm, so your failed job ran at OVH, the success ran at inap. Is there a correlation in the failed jobs?
21:13 <johnsom> You can look in zuul-info/zuul-info.controller.txt for the provider
21:14 <johnsom> It could be one provider has a problem with the mirror
21:14 <cgoncalves> I'd say very likely
21:15 <johnsom> This still makes me wonder:
21:15 <johnsom> http://logs.openstack.org/06/586906/2/check/octavia-v1-dsvm-scenario-kvm-centos.7/b60cf4b/logs/devstacklog.txt.gz#_2018-08-01_12_12_02_898
21:15 <johnsom> So the others seem to be mirrors at the provider; this one (a success) went out to the internet
21:15 <johnsom> osuosl.org (which happens to be here in town)
21:15 <johnsom> Go beavs
21:17 <johnsom> mnaser Quick question, are you aware of any recent issues with the epel mirror at OVH?
21:17 <mnaser> johnsom: not that I know of
21:17 <johnsom> Ok, no smoking gun here yet, but thought I would give a quick ping to check
21:18 <johnsom> http://logs.openstack.org/14/587414/5/check/octavia-v2-dsvm-scenario-centos.7/18de25c/job-output.txt.gz#_2018-08-03_20_28_39_672377 for those playing along
21:21 <cgoncalves> http://logs.openstack.org/56/584856/3/check/octavia-v1-dsvm-scenario-kvm-centos.7/1766905/zuul-info/inventory.yaml okay with rax
21:21 <johnsom> Yeah, inap and rax seem to pass.
21:21 <johnsom> cgoncalves Is this CentOS 7?
21:22 <cgoncalves> johnsom, yes. controller and amp
21:22 <cgoncalves> http://logs.openstack.org/14/587414/5/check/octavia-v2-dsvm-scenario-centos.7/b7ce6d4/zuul-info/inventory.yaml this is inap and failed
21:22 <johnsom> Ok, so likely not a local mirror issue
21:24 <johnsom> cgoncalves these are running on different nodepool images. Your failed job is on a centos image, where the pass is on ubuntu
21:25 <cgoncalves> hmm ok, looking now at nodeset level...
21:25 <cgoncalves> why can't you just pull your strings with infra people. would be much easier xD
21:26 <johnsom> Must save silver bullets for things specific and clear.....
21:27 <johnsom> So the path I was going down was looking at the ca-certificates files, as that plays a role here. That is when I noticed the base image is different
21:28 <johnsom> Yeah, all four samples I have line up, so it looks like you must build centos images on ubuntu hosts... lol
21:28 <xgerman_> yep, that's how it was designed
21:29 <xgerman_> cgoncalves: did you think you could build centos on centos?
21:29 <johnsom> I would look at clock skew on the centos hosts (i.e. is it getting a good time so the SSL can negotiate?), packages like yum and ca-certificates, etc.
21:32 <johnsom> xgerman_ I love the wording of this MicroFocus proxy vote letter (from HP/HPE stock you might have had): "To approve the disposal by the Company of the SUSE business segment ..."
21:36 <johnsom> cgoncalves I need to get back to UDP stuff. Noodle on that a bit. If you are still stuck, ping your colleague ianw in #openstack-dib
21:36 <cgoncalves> johnsom, ok. thank you for your time!
21:37 <johnsom> If you are still stuck next week, ping me again on it
21:38 <xgerman_> Yeah, only kept the HP printer corp.; the rest seemed too risky :-)
21:38 *** pcaruana has quit IRC
21:38 <johnsom> Yeah, I must have like one share floating around somewhere
21:39 <jiteka> Could someone confirm which parameter I need to change to increase the timeout on amphora build
21:39 <johnsom> Wow, really? 25 minutes isn't enough? I think that is the default
21:39 <jiteka> I don't have enough time to troubleshoot why the controller can't reach my VM
21:39 <jiteka> octavia-worker[18322]: WARNING octavia.amphorae.drivers.haproxy.rest_api_driver [-] Could not connect to instance. Retrying.: ConnectionError: HTTPSConnectionPool(host='10.79.80.30', port=9443): Max retries exceeded with url: /0.5/plug/vip/10.63.69.0 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fc56b4edbd0>: Failed to establish a new connection: [Errno 111] Connection refused',))
21:40 <jiteka> the VM goes away in something like 1 or 2 min
21:41 <johnsom> jiteka connection_max_retries (default 300) and connection_retry_interval (default 5) are the two settings there while we wait for nova to boot the instance.
21:41 <johnsom> Though this also could mean your lb-mgmt-net is not working. Booting up a cirros there and checking that it is reachable / got an IP can help there
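The "25 minutes" johnsom keeps quoting is just the product of those two defaults, as stated in the chat:

```python
# Back-of-the-envelope for the default amphora boot timeout: the worker
# retries the connection connection_max_retries times, sleeping
# connection_retry_interval seconds between attempts.
connection_max_retries = 300     # default, per the chat
connection_retry_interval = 5    # seconds, default, per the chat

total_wait = connection_max_retries * connection_retry_interval
print(total_wait, 'seconds =', total_wait / 60.0, 'minutes')  # 1500 seconds = 25.0 minutes
```

Production deployments typically lower these, since a healthy amp normally answers in under 30 seconds.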
21:42 <johnsom> Hmmm, refused; could be a firewall, or some kind of customized image that is broken.
21:43 <jiteka> what's the difference between:
21:43 <jiteka> - ConnectTimeoutError
21:43 <jiteka> - NewConnectionError
21:43 <johnsom> Normally we handle the security groups, so that should not be an issue.
21:43 <jiteka> I noticed that it throws a few timeouts before throwing the error
21:48 <johnsom> Hmm, neither of those are from our code. They are things we are catching. Let me search. Are they both in that same warning message?
21:49 <johnsom> jiteka They are both urllib3 exceptions: http://urllib3.readthedocs.io/en/latest/reference/index.html#module-urllib3.exceptions
21:50 <johnsom> NewConnectionError appears to mean something is actively rejecting the connection, where ConnectTimeoutError is no response at all
21:51 <johnsom> You would see ComputeWaitTimeoutException if octavia actually gives up trying to connect
21:53 <KeithMnemonic> johnsom thanks again, have a great weekend (need to make up for my earlier abruptness)
21:53 <johnsom> Or a pure "TimeOutException" with an error log entry "Connection retries (currently set to %(max_retries)s) exhausted.  The amphora is unavailable."
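The distinction johnsom describes (actively rejected vs. no answer at all) exists at the OS level, below urllib3. A sketch with a bare socket; the `probe` helper is hypothetical, and urllib3 merely wraps the same underlying errors (ECONNREFUSED surfaces as NewConnectionError, a dead connect as ConnectTimeoutError):

```python
import errno
import socket


def probe(host, port, timeout=1.0):
    """Classify a TCP connect attempt: 'open', 'refused' (something is up
    and rejecting us, e.g. a firewall REJECT or no listener on the port),
    or 'timeout' (nothing answered at all, e.g. an unreachable network).
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return 'open'
    except socket.timeout:
        return 'timeout'        # no response at all -> ConnectTimeoutError analogue
    except OSError as e:
        if e.errno == errno.ECONNREFUSED:
            return 'refused'    # actively rejected -> NewConnectionError analogue
        raise
    finally:
        s.close()
```

In jiteka's log, "[Errno 111] Connection refused" means the amp's IP was reachable but nothing was listening on 9443 yet (the agent had not started, or the VM was already gone), which fits the few timeouts followed by refusals he saw.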
21:53 <johnsom> KeithMnemonic o/
21:53 *** KeithMnemonic has quit IRC
21:57 <rm_work> hmmm I feel like I JUST saw someone else posting about that centos build error somewhere else...
21:59 <johnsom> rm_work BTW, I hacked up a stack with your bbq client patch (fed=False) and it doesn't appear to solve the problem.
21:59 <rm_work> hmmmmm
22:00 <johnsom> At least when using Octavia
22:00 <johnsom> I didn't try the CLI test
22:00 <rm_work> it still tries to hit the href passed in?
22:00 <johnsom> It hits the public URL in keystone. I don't know if it's getting that from keystone, the hardcoded one in the config file, or the href
22:01 <rm_work> hmmmm
22:01 <rm_work> yeah, can you just put a print statement in to show what URL it's passing in
22:01 <johnsom> Sadly I don't have direct access to that stack, so my debugging is limited there.
22:01 <rm_work> in the secrets class (right after I do the if-statement to generate the new URL)
22:01 <rm_work> ah
22:01 <rm_work> not devstack?
22:02 <johnsom> No, it's an actual cloud that someone else controls
22:02 <johnsom> Thus why the internal and public URLs are different and we found this issue.
22:03 <rm_work> yeah but
22:03 <johnsom> Maybe next week I can set up that CLI test on devstack again.
22:03 <rm_work> hmmm
22:03 <rm_work> do you know it was done correctly then?
22:03 <johnsom> I'm just arms deep in an amphora-agent refactor for UDP
22:03 <rm_work> like, the patch was actually installed in the right place
22:03 <johnsom> Yeah, I watched and instructed as they installed it. 85% confident
22:03 <rm_work> hmmmmmmmm
22:04 <rm_work> I can add in some debugging if you can have them try it again
22:04 <rm_work> then we could at least verify it is running that code
22:04 <rm_work> there are some changes I wanted to make anyway
22:04 <johnsom> Yeah, maybe next week. I can also just set up a devstack, change the internal URL to something broken, and run my test steps in the story
22:04 <rm_work> yes
22:04 <johnsom> I just don't have the VMs for that right now
22:04 <rm_work> I would like to see that :P
22:04 <rm_work> ah yeah
22:05 <rm_work> cloud + bit.do/devstack? :P
22:05 <rm_work> I always just used RAX VMs
22:05 <johnsom> I don't, for the reasons you are aware of....
22:05 <johnsom> lol
22:05 <rm_work> I mean
22:05 <rm_work> it's better than NO VMs
22:05 <rm_work> usually
22:06 <rm_work> but, let me see if I can spin up a stack
22:06 <rm_work> I haven't tried in a while
22:07 <rm_work> do you stack on Bionic now?
22:07 <rm_work> I should prolly just make a clean thing
22:07 <johnsom> No, I haven't switched yet
22:08 <rm_work> hmmm
22:08 <johnsom> I was nervous about the major networking changes, but it seems the compatibility stuff works well enough that our amps run.
22:08 <rm_work> but it should be safe, right?
22:08 <rm_work> hmm k
22:11 <openstackgerrit> Adam Harwell proposed openstack/octavia master: Add usage admin resource  https://review.openstack.org/557548
22:19 *** hongbin has quit IRC
22:35 <openstackgerrit> Carlos Goncalves proposed openstack/octavia-tempest-plugin master: WIP: Gate on CentOS 7 and check on Ubuntu Bionic  https://review.openstack.org/587414
23:13 <cgoncalves> got it! https://review.openstack.org/#/c/588676/
23:19 <johnsom> Nice, so it was a ca-certificates issue. Probably just not installed soon enough.
23:22 <cgoncalves> not installed at all
23:22 <cgoncalves> epel is installed on the host but is http://
23:22 <johnsom> Yeah, it is eventually; it was in the rpm list
23:22 <cgoncalves> all repos I've come across are http:// in fact
23:24 <johnsom> http://logs.openstack.org/14/587414/5/check/octavia-v2-dsvm-scenario-centos.7/18de25c/controller/logs/rpm-qa.txt.gz
23:24 <johnsom> I checked that, but it must be coming in too late
23:24 <cgoncalves> you, sir, always amuse me with such useful logs
23:25 <cgoncalves> so, the mystery of octavia-tempest-plugin is still unsolved. the perms seem to be right, though
23:25 * johnsom spends way too much time looking at logs for people.....
23:25 <cgoncalves> guilty xD
23:26 <johnsom> Hmmm, have a link for the plugin issue?
23:26 <johnsom> Where you are dumping the perms?
23:26 <cgoncalves> post-run. 2 secs
23:26 <johnsom> My little patch for the netns service is getting bigger. I found that some of the unit tests are actually functional tests and need to be moved....
23:31 <cgoncalves> johnsom, http://logs.openstack.org/14/587414/5/check/octavia-v2-dsvm-scenario-centos.7/18de25c/job-output.txt.gz#_2018-08-03_20_28_52_632600
23:34 <cgoncalves> run with ca-certificates fixed: http://logs.openstack.org/14/587414/6/check/octavia-v2-dsvm-scenario-centos.7/cf0e780/job-output.txt.gz#_2018-08-03_23_24_59_017149
23:34 <johnsom> So right at the top of that, only zuul can access that path. Where is the original failure?
23:34 <johnsom> Ah, ok
23:46 <rm_work> woo, my devstack worked
23:47 <rm_work> ok, time to test this patch thing
23:47 <rm_work> so you recommended... setting the config for "internal" in octavia.conf for barbican
23:47 <rm_work> and then setting internal to something invalid
23:47 <johnsom> Yeah
23:47 <rm_work> and seeing if it still succeeds?
23:47 <rm_work> ... prolly I'll just do tons of debug logging
23:47 <johnsom> certificates section endpoint_type
23:48 <johnsom> Yeah
23:48 <johnsom> Or just use my CLI test in the story
23:48 <rm_work> oh, right
23:48 <rm_work> well
23:48 <rm_work> the CLI doesn't have a flag...
23:48 *** harlowja has quit IRC
23:48 <rm_work> I'd have to default it to False
23:49 <rm_work> which I can do, so :)
23:50 <johnsom> cgoncalves Try overriding the path to the tempest plugin to be /opt/stack/octavia-tempest-plugin
23:50 <johnsom> I set it to /home/zuul/.... here:
23:51 <johnsom>     vars:
23:51 <johnsom>       devstack_localrc:
23:51 <johnsom>         TEMPEST_PLUGINS: "'{{ ansible_user_dir }}/src/git.openstack.org/openstack/octavia-tempest-plugin'"
23:51 <rm_work> oh ummm
23:51 <rm_work> johnsom: any chance they're still using the old-style Containers?
23:51 <rm_work> instead of one PKCS12 secret?
23:51 <johnsom> No, it is pkcs12
23:51 <rm_work> because... I didn't do it for Containers yet in that patch...
23:51 <rm_work> hmm ok
23:51 <rm_work> really thought I had it there for a sec :P
23:53 <johnsom> cgoncalves Yeah, my money is on the /home/zuul directory permissions being different on that centos nodepool instance. /opt/stack/octavia-tempest-plugin should fix you right up. Probably should just do that in the parent job for all of them
23:56 <cgoncalves> 2018-08-03 23:25:28.290794 | controller | /home/zuul:
23:56 <cgoncalves> 2018-08-03 23:25:28.290953 | controller | total 52
23:56 <cgoncalves> 2018-08-03 23:25:28.291080 | controller | drwx------. 7 zuul zuul 4096 Aug  3 22:43 .
23:57 <johnsom> Yeah, I think the user at that point has switched to "stack" via devstack
23:57 <cgoncalves> isn't the ansible user zuul? if so, it should have no problem reading /home/zuul
23:57 <rm_work> hmmmm johnsom this seems to be working in my client, one sec
23:57 <cgoncalves> ah, right
23:59 <openstackgerrit> Carlos Goncalves proposed openstack/octavia-tempest-plugin master: WIP: Gate on CentOS 7 and check on Ubuntu Bionic  https://review.openstack.org/587414
23:59 <cgoncalves> thanks johnsom
23:59 <cgoncalves> off I go
23:59 <johnsom> o/

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!