*** armax has quit IRC | 00:01 | |
*** rcernin has joined #openstack-lbaas | 00:05 | |
*** rcernin has quit IRC | 00:06 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add a requirements.txt check job https://review.opendev.org/751925 | 00:18 |
---|---|---|
johnsom | If it's working, that will fail. | 00:19 |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add a requirements.txt check job https://review.opendev.org/751925 | 00:23 |
johnsom | Fix that so both tests run even if one fails. | 00:23 |
*** armax has joined #openstack-lbaas | 00:25 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add a requirements.txt check job https://review.opendev.org/751925 | 00:29 |
*** lemko has joined #openstack-lbaas | 00:39 | |
*** lemko1 has quit IRC | 00:42 | |
*** armax has quit IRC | 00:59 | |
openstackgerrit | Adam Harwell proposed openstack/octavia master: Use routed network filter if it exists https://review.opendev.org/706153 | 01:04 |
*** ianychoi_ has joined #openstack-lbaas | 01:46 | |
*** ianychoi has quit IRC | 01:48 | |
openstackgerrit | zhangchun proposed openstack/octavia-tempest-plugin master: Remove install unnecessary packages https://review.opendev.org/751949 | 02:12 |
openstackgerrit | zhangchun proposed openstack/octavia-dashboard master: Remove install unnecessary packages https://review.opendev.org/751953 | 02:16 |
*** psachin has joined #openstack-lbaas | 03:00 | |
*** vishalmanchanda has joined #openstack-lbaas | 04:33 | |
*** gcheresh has joined #openstack-lbaas | 04:50 | |
*** AlexStaf has quit IRC | 05:10 | |
*** gcheresh has quit IRC | 05:13 | |
*** zzzeek has quit IRC | 05:19 | |
*** zzzeek has joined #openstack-lbaas | 05:22 | |
cgoncalves | haleyb, allowed cidrs were not tested against the OVN provider driver because the driver does not support it. support was added only to amphora. if you're asking whether it was tested against OVN ML2, I have not tested that either, but it should just work, as the amphora driver is just calling Neutron APIs, so the ML2 plugin shouldn't matter | 06:02 |
*** ramishra_ has quit IRC | 06:32 | |
*** ramishra has joined #openstack-lbaas | 06:44 | |
*** zzzeek has quit IRC | 06:45 | |
*** gcheresh has joined #openstack-lbaas | 06:46 | |
*** zzzeek has joined #openstack-lbaas | 06:47 | |
*** eandersson has quit IRC | 06:58 | |
*** eandersson has joined #openstack-lbaas | 06:59 | |
*** zzzeek has quit IRC | 07:01 | |
*** eandersson has quit IRC | 07:01 | |
*** eandersson has joined #openstack-lbaas | 07:02 | |
*** zzzeek has joined #openstack-lbaas | 07:03 | |
*** ataraday_ has joined #openstack-lbaas | 07:29 | |
*** ccamposr__ has joined #openstack-lbaas | 07:33 | |
*** ccamposr has quit IRC | 07:36 | |
openstackgerrit | Ann Taraday proposed openstack/octavia master: Remove unnecessary joinedload https://review.opendev.org/740994 | 07:46 |
*** AlexStaf has joined #openstack-lbaas | 09:33 | |
*** lxkong_ has joined #openstack-lbaas | 09:43 | |
*** tkajinam has quit IRC | 09:49 | |
*** lxkong has quit IRC | 09:56 | |
*** lxkong_ is now known as lxkong | 09:56 | |
*** AlexStaf has quit IRC | 10:23 | |
*** zzzeek has quit IRC | 11:24 | |
*** zzzeek has joined #openstack-lbaas | 11:24 | |
openstackgerrit | Arkady Shtempler proposed openstack/octavia-tempest-plugin master: Adding failover test. Send HTTP traffic while MASTER Amphore is rebooted. BACKUP Amphore should serve the traffic. https://review.opendev.org/751617 | 11:37 |
*** lxkong has quit IRC | 12:21 | |
*** lxkong has joined #openstack-lbaas | 12:26 | |
*** njohnston has joined #openstack-lbaas | 12:43 | |
*** njohnston_ has joined #openstack-lbaas | 12:51 | |
*** njohnston has quit IRC | 12:51 | |
openstackgerrit | Ann Taraday proposed openstack/octavia master: Add experimental amphorav2 jobs https://review.opendev.org/737993 | 13:08 |
*** njohnston_ has quit IRC | 13:22 | |
*** ataraday_ has quit IRC | 13:23 | |
*** njohnston has joined #openstack-lbaas | 13:31 | |
*** ccamposr has joined #openstack-lbaas | 13:38 | |
*** TrevorV has joined #openstack-lbaas | 13:41 | |
*** ccamposr__ has quit IRC | 13:42 | |
dulek | cgoncalves, johnsom: Hi there! We're recently seeing maaany failures in our gate and I've just correlated that with OOM killing Amps and our containers. | 13:42 |
dulek | My bet is that it's the Kuryr apps that are leaking, but I figured it won't hurt to ask whether anything changed in that regard on the Octavia side? | 13:43 |
cgoncalves | dulek, nothing comes to mind | 13:43 |
dulek | cgoncalves: Okay, thanks! | 13:45 |
*** ccamposr__ has joined #openstack-lbaas | 13:49 | |
*** ccamposr has quit IRC | 13:53 | |
*** nicolasbock_ has joined #openstack-lbaas | 13:57 | |
*** andrein_ has joined #openstack-lbaas | 13:57 | |
*** dougwig_ has joined #openstack-lbaas | 13:57 | |
*** headphoneJames_ has joined #openstack-lbaas | 13:57 | |
*** gmann_ has joined #openstack-lbaas | 13:57 | |
*** cgoncalves has quit IRC | 13:58 | |
johnsom | dulek: I haven’t seen anything either. The only thing that comes to mind is the recent e-mail chain on the discuss list about keystone middleware leaking memcached connections via neutron API. | 13:59 |
*** nicolasbock has quit IRC | 14:00 | |
*** dougwig has quit IRC | 14:00 | |
*** andrein has quit IRC | 14:00 | |
*** logan- has quit IRC | 14:00 | |
*** headphoneJames has quit IRC | 14:00 | |
*** gmann has quit IRC | 14:00 | |
*** dougwig_ is now known as dougwig | 14:00 | |
*** logan_ has joined #openstack-lbaas | 14:00 | |
*** headphoneJames_ is now known as headphoneJames | 14:00 | |
*** gmann_ is now known as gmann | 14:00 | |
*** nicolasbock_ is now known as nicolasbock | 14:00 | |
*** andrein_ is now known as andrein | 14:00 | |
*** logan_ is now known as logan- | 14:00 | |
dulek | That would mean our tests were using just the right amount of memory until Friday and that little leak made it overflow the limits. A bit unlikely, I guess. ;) | 14:01 |
*** jamesdenton has quit IRC | 14:03 | |
*** jamesdenton has joined #openstack-lbaas | 14:05 | |
*** cgoncalves has joined #openstack-lbaas | 14:24 | |
haleyb | cgoncalves: ack, for now i'm just having it fail as not implemented, will have to look at amphora driver to see what it does | 14:26 |
*** armax has joined #openstack-lbaas | 14:41 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add a requirements.txt check job https://review.opendev.org/751925 | 15:32 |
dulek | cgoncalves, johnsom: Uhm, probably a stupid one… So we lived with the assumption that we need barbican to run Octavia, yet I don't see it on your gates. Can I just remove it from ours? | 15:36 |
cgoncalves | dulek, we do have a scenario -barbican job | 15:37 |
johnsom | dulek You only need barbican if you are doing TLS offload. | 15:37 |
johnsom | We have a special job that runs with barbican to test TLS offload | 15:37 |
dulek | johnsom: Hm, we're creating a listener and pool with HTTPS protocol - is that TLS offload already? | 15:39 |
johnsom | No, that would be HTTPS pass through. It's the TERMINATED_HTTPS protocol. (terminology we had to inherit from neutron-lbaas) | 15:40 |
dulek | Alright, then I'm trying to remove it. Thanks! | 15:40 |
*** spatel has joined #openstack-lbaas | 15:41 | |
*** ianychoi__ has joined #openstack-lbaas | 15:41 | |
johnsom | Oh man, the pypi cache/CDN issue is still happening.... | 15:43 |
*** ianychoi_ has quit IRC | 15:45 | |
dulek | True, it's the third stacked fatal issue on Kuryr CI too. :/ | 15:45 |
johnsom | dulek Do you have a link to a job that ran out of memory? I can look through some logs too if you would like | 15:50 |
*** gcheresh has quit IRC | 15:52 | |
dulek | johnsom: Sure, this is the simplest one: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e65/751263/3/check/kuryr-kubernetes-tempest/e65494e/controller/logs/ | 15:57 |
dulek | https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e65/751263/3/check/kuryr-kubernetes-tempest/e65494e/controller/logs/syslog.txt - you can find oom instances here. | 15:58 |
johnsom | Ok, cool, I will have a browse | 15:58 |
dulek | johnsom: But I tend to bet that it's Kuryr's fault, as runs with lower-constraints.txt used to build Kuryr containers seem to work just fine. And the requirements are the only difference here. | 16:00 |
dulek | I'd have a lot more data points, but the PyPI issue interfered. | 16:00 |
-openstackstatus- NOTICE: Our PyPI caching proxies are serving stale package indexes for some packages. We think because PyPI's CDN is serving stale package indexes. We are sorting out how we can either fix or workaround that. In the meantime updating requirements is likely the wrong option. | 16:10 | |
*** psachin has quit IRC | 16:30 | |
*** ccamposr__ has quit IRC | 16:41 | |
*** ccamposr has joined #openstack-lbaas | 16:47 | |
*** gcheresh has joined #openstack-lbaas | 17:23 | |
*** TMM has quit IRC | 17:42 | |
*** TMM has joined #openstack-lbaas | 17:43 | |
*** vishalmanchanda has quit IRC | 18:13 | |
*** zzzeek has quit IRC | 18:39 | |
*** zzzeek has joined #openstack-lbaas | 18:41 | |
*** gcheresh has quit IRC | 18:48 | |
*** zzzeek has quit IRC | 18:51 | |
*** zzzeek has joined #openstack-lbaas | 18:53 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia-lib master: Add alpn_protocols to the pool data model https://review.opendev.org/752094 | 18:56 |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Add ALPN support for TLS-enabled pools https://review.opendev.org/752095 | 18:56 |
openstackgerrit | Carlos Goncalves proposed openstack/python-octaviaclient master: Add ALPN support for pools https://review.opendev.org/752096 | 18:57 |
*** zzzeek has quit IRC | 19:00 | |
*** zzzeek has joined #openstack-lbaas | 19:03 | |
*** zzzeek has quit IRC | 19:10 | |
*** gcheresh has joined #openstack-lbaas | 19:11 | |
*** zzzeek has joined #openstack-lbaas | 19:12 | |
johnsom | dulek So, looking through the logs I see that the test is running out of memory. At one point the OOM killer starts killing the amphora, which causes a failover process, which consumes more RAM during that process. It looked to me like etcd and mysql were high consumers. I may dig a little more as the logs are slightly inconsistent. | 19:36 |
johnsom | dulek One thing I noticed is the job is running with the default thread settings, which will scale to the core count. For your tests you don't need that much, so I have proposed a patch to limit the threads used in your test. That should save you some RAM. Certainly not the root cause, but maybe a help. | 19:37 |
*** spatel has quit IRC | 19:49 | |
*** gcheresh has quit IRC | 19:54 | |
*** TrevorV has quit IRC | 19:57 | |
rm_work | johnsom / cgoncalves: https://review.opendev.org/#/c/751111/ Carlos' comments here would actually be a CHANGE right? a good one, but not even what was done before? | 20:23 |
johnsom | Yeah, that is a big change. Currently provisioning_status does not have a DEGRADED state to my knowledge. Only operating status does. | 20:24 |
johnsom | https://docs.openstack.org/api-ref/load-balancer/v2/index.html#provisioning-status-codes | 20:25 |
johnsom | rm_work I commented on that. How are those two patches going? Are things better? | 21:13 |
johnsom | It seems like we have some time to discuss this patch as nothing is going to pass the gates until this CDN issue is resolved | 21:14 |
rm_work | having issues getting them deployed | 21:48 |
rm_work | due to ... CI stuff | 21:48 |
rm_work | but they seemed to fix issues in QE | 21:48 |
*** rcernin has joined #openstack-lbaas | 22:25 | |
*** servagem has quit IRC | 22:42 | |
*** tkajinam has joined #openstack-lbaas | 22:51 | |
johnsom | Whelp, we (infra folks) found what appears to be the issue in a full 12tb disk on a pypa mirror. Now it's just cleaning up the negative caches, etc. | 23:23 |
sorrison | johnsom: Just wondering if you have a smart trick to fix an LB stuck in error due to https://storyboard.openstack.org/#!/story/2008099 | 23:52 |
sorrison | I was wondering if I can fudge the DB somehow to get it to failover | 23:52 |
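
A small sketch of the barbican rule johnsom gives at 15:37-15:40: per that exchange, a listener with the HTTPS protocol is plain pass-through, and only TERMINATED_HTTPS is TLS offload. The helper name `needs_barbican` is hypothetical, and the check assumes TLS offload is the only reason a deployment needs barbican, which is all this conversation states.

```python
# Protocol name taken from the chat: TERMINATED_HTTPS is TLS offload,
# while HTTPS is pass-through and is not terminated on the load balancer.
TLS_OFFLOAD_PROTOCOL = "TERMINATED_HTTPS"


def needs_barbican(listener_protocols):
    """Return True if any listener protocol implies TLS offload.

    Hypothetical helper for illustration only. It assumes TLS offload is
    the only feature in the test suite that needs a secret store.
    """
    return any(
        protocol.upper() == TLS_OFFLOAD_PROTOCOL
        for protocol in listener_protocols
    )
```

For the Kuryr case described above (listener and pool with HTTPS), `needs_barbican(["HTTPS"])` is false, which matches dulek's decision to try removing barbican from the gate.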
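
For browsing a syslog like the one dulek links at 15:58, a rough sketch of how the OOM kills could be tallied per process name. It assumes the kernel's usual "Killed process <pid> (<name>)" wording appears in the log; the function name `count_oom_kills` is made up for this example, and the pattern may need adjusting for a given kernel version.

```python
import re
from collections import Counter

# Assumption: the kernel reports each OOM kill with a line containing
# "Killed process <pid> (<name>)". Adjust the pattern if the log differs.
KILLED_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def count_oom_kills(lines):
    """Count OOM kills per process name across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = KILLED_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

Passing an open file object for a downloaded syslog.txt to `count_oom_kills` gives a `Counter`, and `most_common()` on it shows which processes were killed most often. It says nothing about which process was actually leaking, only which ones the kernel chose to kill.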