Friday, 2018-12-14

*** fnaval has quit IRC		00:10
*** PagliaccisCloud has quit IRC		00:21
rm_work	cgoncalves: what happens when you then REMOVE that listener? does it remove the existing rule for that port and break the peering? :P	00:46
rm_work	was traveling today so I've been out, but what's happening	00:48
rm_work	xgerman: did you make it to kubecon?	00:49
*** Swami has quit IRC		00:49
johnsom	Also, I can reproduce this barbican ACL issue (queens). so fun times tracking that down	00:50
rm_work	hmmmmmmmm	00:54
johnsom	Yeah, I grant octavia the ACL right, but octavia gets an RBAC error. Only if the user creating the secret is in a different project.	00:56
rm_work	hmmmmmmm	00:56
rm_work	let me spin up a queens env, do you have the steps written out somewhere? (story?)	00:57
johnsom	No, but that could happen super quick	00:57
johnsom	Hmm, nevermind, I think I screwed up reproducing it	01:02
rm_work	k	01:03
*** yamamoto has quit IRC		01:03
johnsom	I need to switch the octavia account and try this too.	01:03
rm_work	ok well, devstack is spinning	01:07
lxkong	johnsom, rm_work, could you please take a look at this https://storyboard.openstack.org/#!/story/2004602? Is it a bug or not?	01:09
rm_work	hmmm	01:09
rm_work	i mean, that SOUNDS like a thing that could be a bug	01:10
rm_work	I guess qos-policy needs to be a capability check?	01:10
rm_work	similar to some of the other stuff we check for in the network driver	01:10
rm_work	and then we can set a flag for whether to try or not	01:10
rm_work	the question would be, what to do in the case it's unsupported -- do we let users set them and just ignore it? or do we do something like what happens if an operator disables TLS-Term in config?	01:11
rm_work	right now it's technically doing what I'd expect -- trying to do the requested provisioning, failing, and rolling back -> ERROR	01:12
rm_work	so that's "correct" kinda	01:12
johnsom	Yeah, create should roll back the API request when the VIP port create fails.	01:14
johnsom	Probably doesn't give the best error though.	01:14
johnsom	Update probably needs a check	01:14
rm_work	the user doesn't really have ANY insight into what happened	01:15
rm_work	which ... didn't we discuss fixing that, at the PTG?	01:15
rm_work	it came up again yesterday internally	01:15
rm_work	"how does the user get any visibility to what broke" -> "they don't"	01:16
johnsom	Sure they do, the API will return a 400 and a fault string	01:16
rm_work	not on an update	01:16
rm_work	i mean when something like THAT breaks	01:16
johnsom	Right, update should probably check the neutron capability	01:16
rm_work	and create won't either right now	01:16
rm_work	because it's on the async side	01:16
rm_work	isn't it?	01:16
johnsom	No it is not	01:17
rm_work	oh, maybe it's on the API side if it's the VIP port	01:17
johnsom	Yeah, we had to create that up front to give them the IP back in the response	01:17
rm_work	i always forget we moved that to be sync	01:17
rm_work	we're so nice to our users, lol	01:17
lxkong	johnsom: but use just specify a None for that param, even if neutron doesn't support it, shouldn't that be ignored?	01:17
rm_work	nova don't care	01:17
lxkong	s/use/user	01:17
rm_work	"you'll get an IP when you get one"	01:17
rm_work	lol	01:17
rm_work	hmmm yeah, i kinda skimmed over the None part	01:18
rm_work	that is prolly because of how the WSME stuff loads? None should technically be "no change" so it shouldn't even go through any logic for it	01:18
rm_work	but ...	01:18
johnsom	I'm confused on the None thing?	01:19
rm_work	either the WSME stuff is handling None wrong, or we're doing something we shouldn't on updates (which I don't think is true, because I ran on Liberty Neutron and updates were fine)	01:19
johnsom	Yeah, in the API None/blank is "UnSetType" or something similar in the WSME objects the API gets	01:19
lxkong	rm_work: 'None' means remove any policy on the port, right?	01:19
rm_work	only if there WAS one	01:20
johnsom	Oh, missed the story link	01:20
johnsom	looking	01:20
rm_work	i mean, yes, it is different from Unset	01:20
rm_work	but we SHOULD throw it away	01:20
rm_work	if it's a noop	01:20
rm_work	... oh, unless we're being lazy/dumb, which is highly probable	01:20
rm_work	let me glance at that path	01:20
lxkong	what if neutron supports it? then we should not treat it as unsettype	01:20
johnsom	We will pass "null" in as None through WSME. Is the client converting that "None" to null?	01:21
johnsom	Yeah, the unset type gets dropped in our model handling. Unsets don't go into the update list	01:21
lxkong	`Set QoS policy ID for VIP port. Unset with 'None'.`	01:21
lxkong	None is meaningful value	01:22
johnsom	This is in the client?	01:22
lxkong	yeah	01:22
rm_work	we do the validate for it only if there's a change.... looking at what we do after that	01:22
johnsom	That is a fail	01:22
lxkong	from CLI helper	01:22
johnsom	the CLI has an "unset" command for that	01:22
johnsom	Probably didn't get implemented though	01:22
rm_work	yeah but it MUST be passing through something	01:23
rm_work	because it's UUIDType in WSME	01:23
johnsom	Probably null	01:23
rm_work	which means for it to accept the request and get to provisioning, it has to either be a UUID or a null	01:23
johnsom	lxkong For your original question, yes, that is very much a bug	01:23
rm_work	right, so a null would be correct for "unsetting"	01:23
rm_work	right?	01:23
johnsom	Yeah	01:23
rm_work	so i think the issue is past this	01:24
rm_work	i think because it's not "wsme-unset"	01:24
rm_work	we DO pass it through in the model that goes to the handler	01:24
rm_work	at which point it'd try to do an update without actually checking to see if that'd be a noop	01:24
rm_work	which would call neutron with a json-body that's invalid for that version	01:25
johnsom	Right, null would get through and become None in the code, post validation	01:25
rm_work	so even though there wasn't a vip-qos	01:25
rm_work	it'll try to remove it in neutron	01:25
rm_work	which will fail	01:25
rm_work	(because that version of neutron doesn't support it)	01:26
rm_work	yep	01:27
rm_work	just confirmed that for myself	01:28
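The distinction discussed above (a field absent from the request versus an explicit null) can be sketched in a few lines. This is a minimal illustration only, not Octavia's or WSME's actual code: the UNSET sentinel and to_update_dict helper are hypothetical names, and vip_qos_policy_id is assumed as the field name.

```python
class _Unset:
    """Hypothetical sentinel standing in for WSME's unset type:
    it marks a field that was absent from the request body."""

    def __repr__(self):
        return "Unset"


UNSET = _Unset()


def to_update_dict(request_fields):
    """Drop absent (unset) fields, but keep explicit nulls (None).

    An explicit None survives, so it still reaches the handler as a
    requested change even when there was nothing to remove.
    """
    return {k: v for k, v in request_fields.items() if v is not UNSET}


# 'name' was not sent at all; the QoS policy was sent as JSON null.
body = {"name": UNSET, "vip_qos_policy_id": None}
update = to_update_dict(body)
print(update)  # {'vip_qos_policy_id': None}
```

Because the None stays in the update dict, later code cannot tell "remove the policy" apart from "nothing to do" without also looking at the current value.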
lxkong	johnsom, rm_work, could you please help to fix so I can help backport to Queens and Rocky, we also need to do a quick release.	01:28
rm_work	https://github.com/openstack/octavia/blob/master/octavia/controller/worker/tasks/network_tasks.py#L542-L548	01:28
rm_work	so it'll try because it IS in the update-dict	01:28
rm_work	well the quick thing to do here would be to just short-circuit	01:28
rm_work	if it's a real noop	01:29
rm_work	which I believe we can check here	01:29
rm_work	but the real solution I think is to actually detect this feature availability, and avoid the operation in general if it is unavailable	01:29
rm_work	oh, maybe not tho? because NOW we store the data to the DB in the API layer, which means by the time we're here we can't tell if it's a noop or not	01:31
rm_work	umm, could short-circuit it in the API layer	01:31
rm_work	but that's kinda hacky	01:31
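A rough sketch of the API-layer short-circuit floated above, assuming the comparison happens before the new value is written to the DB. The function name qos_update_needed and the vip_qos_policy_id key are assumptions for illustration, not the fix that was actually merged.

```python
def qos_update_needed(current_policy_id, update_dict):
    """Return True only if the request would really change the VIP QoS policy.

    current_policy_id: value stored for the load balancer right now
                       (None means no policy is attached).
    update_dict:       fields present in the update request; a missing
                       key means the caller did not touch the field.
    """
    if "vip_qos_policy_id" not in update_dict:
        return False  # field not in the request at all
    # Explicit null on a load balancer with no policy is a no-op,
    # so there is no reason to call neutron for it.
    return update_dict["vip_qos_policy_id"] != current_policy_id


# No policy attached, user sends null: nothing to do.
print(qos_update_needed(None, {"vip_qos_policy_id": None}))  # False
# Policy attached, user sends null: a real removal.
print(qos_update_needed("policy-1", {"vip_qos_policy_id": None}))  # True
```

This only avoids the no-op call. Detecting whether neutron supports QoS at all, as suggested in the discussion, would still be a separate check.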
rm_work	lxkong: how urgent is this really? can you communicate to users just not to try to set that?	01:31
rm_work	why would they even try, if they were not able to create a LB with a qos-policy to begin with?	01:32
rm_work	I feel like this isn't super critical, so we should be able to take the time to fix it correctly	01:32
lxkong	rm_work: that's not so urgent, because i can just restore the error status back to active, and it's not harmful for the users.	01:32
rm_work	yeah, ummm, find out what users are doing that, and tell them to stop it	01:33
rm_work	and ask why they thought that made any sense to begin with <_<	01:33
lxkong	that happened because some users are trying out our LBaaS and specifying some params that they thought were safe to set	01:33
lxkong	i've already told them, and we do need a solution	01:34
lxkong	rm_work: thanks for all your analysis	01:34
johnsom	Yeah, we should never 500 out	01:34
rm_work	I posted my comments on the story	01:36
rm_work	johnsom: that wouldn't be a 500	01:36
rm_work	it'd be a 202	01:36
rm_work	err, 200	01:36
rm_work	and then it'd fail async	01:36
rm_work	well, yeah on a create it might 500? not sure exactly	01:36
johnsom	Yeah, I realized that after I typed it	01:36
rm_work	but this story is specifically about update	01:36
johnsom	Too many things going on	01:36
rm_work	yeah lol	01:37
rm_work	took me a while to even get my brain to focus on what the issue was, but yeah, got it	01:37
johnsom	I'm adding a task for the "unset" issue in the client.	01:38
johnsom	We need to fix a bunch of those	01:38
johnsom	I don't think any of our update commands have "unset"	01:39
rm_work	yeah, well, that's true	01:39
rm_work	but not REALLY the issue	01:39
rm_work	I suppose we should do it anyway tho	01:39
johnsom	Right, but ...	01:39
rm_work	is there a way to mark a bug-story as "triaged" or "confirmed" or something in launchpad? :/	01:39
johnsom	In launchpad yes, in our friend storyboard, no	01:40
rm_work	errr	01:40
rm_work	oops lol yeah	01:40
johnsom	Other than adding tags to stories, but that is pointless because you can't search for not having a tag in SB either	01:40
rm_work	lol	01:40
rm_work	storyboard != bug tracker, I guess	01:41
johnsom	Ok, off to dinner. I have the release patch ready, just waiting for the last patch to finish merging. Will push that later tonight	01:41
rm_work	o/	01:41
rm_work	ping me for a +2 if you need one	01:42
johnsom	Evidently we are "using it wrong"	01:42
johnsom	sigh	01:42
rm_work	lol	01:42
johnsom	No, our stuff is all +2/+w, this will be a release team +2. So, no worries	01:42
rm_work	kk	01:42
*** phuoc_ has joined #openstack-lbaas		01:50
*** phuoc has quit IRC		01:53
openstackgerrit	Merged openstack/octavia stable/pike: Stop Logging Amphora Cert https://review.openstack.org/625066	02:15
*** sapd1_ has joined #openstack-lbaas		02:27
*** sapd1 has quit IRC		02:29
*** yamamoto has joined #openstack-lbaas		03:02
openstackgerrit	Merged openstack/octavia stable/queens: Bring up secondary IPs on member networks https://review.openstack.org/624804	03:05
*** hongbin has joined #openstack-lbaas		03:14
*** hongbin has quit IRC		03:15
*** hongbin has joined #openstack-lbaas		03:16
johnsom	stable branch release patch: https://review.openstack.org/625144	03:28
*** yamamoto has quit IRC		03:39
rm_work	cool	03:47
rm_work	wait we're still cutting pike releases? lol	03:47
rm_work	we haven't been backporting anything past queens tho	03:47
rm_work	oh, just that one	03:47
*** ramishra has joined #openstack-lbaas		03:49
*** hongbin has quit IRC		04:34
*** PagliaccisCloud has joined #openstack-lbaas		04:44
*** yamamoto has joined #openstack-lbaas		04:55
*** yamamoto has quit IRC		05:40
*** yamamoto has joined #openstack-lbaas		06:20
*** JudeCross has joined #openstack-lbaas		06:25
*** ccamposr has joined #openstack-lbaas		06:34
*** JudeCross has quit IRC		06:49
*** rcernin has quit IRC		07:03
*** PagliaccisCloud has quit IRC		07:12
*** pcaruana has joined #openstack-lbaas		07:12
*** JudeCross has joined #openstack-lbaas		07:15
*** yamamoto has quit IRC		07:48
*** yamamoto has joined #openstack-lbaas		07:48
*** yamamoto has quit IRC		07:57
*** rpittau has joined #openstack-lbaas		08:06
*** reedipb has joined #openstack-lbaas		08:33
reedipb	johnsom : there?	08:33
*** yamamoto has joined #openstack-lbaas		08:35
*** velizarx has joined #openstack-lbaas		08:35
openstackgerrit	OpenStack Proposal Bot proposed openstack/octavia-dashboard master: Imported Translations from Zanata https://review.openstack.org/625183	08:55
*** JudeCross has quit IRC		09:01
*** Emine has joined #openstack-lbaas		09:05
*** reedipb has quit IRC		09:15
rm_work	what channel is the storyboard team in again? >_>	09:28
rm_work	need to ask them about bug flags	09:28
rm_work	for priority we could use tags and add like "high" or "low" or whatever	09:29
rm_work	but ...	09:29
rm_work	ah, maybe if you could search for "lack of a flag" that'd do it	09:29
*** yamamoto has quit IRC		09:37
cgoncalves	rm_work, fields like such (priority) have been requested to the Storyboard team awhile ago IIRC. the workaround is indeed using tags, which I dislike	09:47
*** salmankhan has joined #openstack-lbaas		10:34
*** PagliaccisCloud has joined #openstack-lbaas		10:59
*** rpittau is now known as rpittau|lunch		11:10
*** yamamoto has joined #openstack-lbaas		11:15
*** salmankhan has quit IRC		12:06
*** salmankhan has joined #openstack-lbaas		12:09
*** rpittau|lunch is now known as rpittau		12:13
*** pcaruana has quit IRC		12:21
*** pcaruana has joined #openstack-lbaas		12:22
*** pcaruana is now known as pcaruana|intw|		12:25
*** yamamoto has quit IRC		12:28
*** yamamoto has joined #openstack-lbaas		12:30
*** yamamoto has quit IRC		12:30
*** yamamoto has joined #openstack-lbaas		13:05
*** yamamoto has quit IRC		13:18
*** yamamoto has joined #openstack-lbaas		13:18
*** velizarx has quit IRC		13:20
*** velizarx has joined #openstack-lbaas		14:00
*** pcaruana|intw| has quit IRC		14:05
*** Emine has quit IRC		14:16
*** pcaruana has joined #openstack-lbaas		14:31
*** ccamposr has quit IRC		14:49
*** ivve has joined #openstack-lbaas		15:05
*** yangjianfeng has joined #openstack-lbaas		15:10
*** yangjianfeng has quit IRC		15:26
*** ivve has quit IRC		15:34
*** velizarx has quit IRC		15:47
*** ccamposr has joined #openstack-lbaas		15:50
*** PagliaccisCloud has quit IRC		16:15
*** pcaruana has quit IRC		16:20
*** ivve has joined #openstack-lbaas		16:24
johnsom	FYI, I have also posted patches to bump OSA to get the taskflow logging patch	16:53
*** salmankhan has quit IRC		16:56
*** salmankhan has joined #openstack-lbaas		16:57
*** ccamposr has quit IRC		17:15
*** PagliaccisCloud has joined #openstack-lbaas		17:15
johnsom	reedipb Looking for me?	17:17
*** rpittau has quit IRC		17:22
*** Emine has joined #openstack-lbaas		17:36
*** PagliaccisCloud has quit IRC		17:43
*** Swami has joined #openstack-lbaas		18:26
cgoncalves	I clarified Reedip's questions off channel. ovn devstack plugin is doing weird things for enabling its provider driver in octavia, they have been advised on how best to do it	18:54
johnsom	Ok, cool. Thanks!	18:54
cgoncalves	and second, he was looking for if it is possible and how to enable multiple provider drivers	18:54
johnsom	Yes, we support that	18:55
cgoncalves	right	18:55
cgoncalves	he was setting enabled_provider_drivers option in python dict format :)	18:55
cgoncalves	should be "enabled_provider_drivers: amphora:'some description', ovn:'another description'"	18:56
cgoncalves	"Documentation for Octavia's OVN Driver" -- https://review.openstack.org/#/c/624937/	18:57
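To illustrate the "name:description" list format mentioned above (as opposed to a Python dict literal), here is a small stand-alone parser. It is only a sketch of the value's shape: parse_provider_drivers is a hypothetical helper, not the parser Octavia or oslo.config uses.

```python
def parse_provider_drivers(value):
    """Split 'name:description, name:description' into a dict.

    Hypothetical helper for illustration; it shows the expected shape
    of the option value, not how the real config loader works.
    """
    drivers = {}
    for item in value.split(","):
        # Split on the first colon only, so descriptions may contain colons.
        name, _, description = item.strip().partition(":")
        drivers[name.strip()] = description.strip()
    return drivers


enabled = parse_provider_drivers(
    "amphora:some description, ovn:another description")
print(enabled)
# {'amphora': 'some description', 'ovn': 'another description'}
```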
*** salmankhan has quit IRC		18:57
*** abaindur has joined #openstack-lbaas		19:06
*** Emine has quit IRC		19:26
*** abaindur has quit IRC		19:54
*** abaindur has joined #openstack-lbaas		20:14
*** abaindur has quit IRC		20:15
*** abaindur has joined #openstack-lbaas		20:15
openstackgerrit	German Eichberger proposed openstack/octavia master: Amphora logging https://review.openstack.org/624835	20:54
xgerman	(hopefully good to go)	20:54
xgerman	next week haproxy log format to include project-id	20:55
openstackgerrit	Michael Johnson proposed openstack/octavia master: Updates Octavia to support octavia-lib https://review.openstack.org/613709	21:34
*** badloop has joined #openstack-lbaas		21:50
badloop	we are having issues with ha-proxy working intermittently for our loadbalancers (communication just stops and starts randomly with no clear reason why). where would be the best place to look for troubleshooting that	21:51
badloop	it is pretty much a vanilla install of ocata	21:51
*** colby_ has joined #openstack-lbaas		22:09
colby_	Hey Guys,	22:09
colby_	I just upgraded to queens and we had octavia working under pike. Suddenly the worker and health monitor are unable to connect. We get the following:SSLError: ("bad handshake: Error([('SSL routines', 'ssl3_read_bytes', 'tlsv1 alert unknown ca')],)",)	22:10
rm_work	badloop: my INITIAL guess, with no real evidence or view into your particular situation, would be that healthchecks are periodically failing for some reason and taking nodes offline	22:10
colby_	did the configs change for the certs that Im not aware of?	22:11
rm_work	and by healthchecks, i mean the haproxy ones, not amphora health	22:11
rm_work	colby_: i don't think so? are you sure the files didn't get swapped out or something by the upgrade process by accident?	22:11
colby_	no our SSL certs are stored outside the octavia config directory	22:12
johnsom	How did you upgrade? Did the process recreate the certificates on the controllers?	22:13
colby_	no I created all the certs manually for pike. Just pointing to them in the config	22:14
johnsom	I don’t think anything changed with the amp cert system.	22:18
rm_work	johnsom: err, is redhat really supposed to write the vip-interface-file twice? https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/agent/api_server/osutils.py#L393-L409	22:19
johnsom	Are you using SELinux or AppArmor that is denying access to the certs?	22:20
rm_work	(in the case of keepalived)	22:20
rm_work	and why does keepalived on Ubuntu not handle the vip in the same way as it does in redhat?	22:20
rm_work	the ubuntu version of that function is missing all of the stuff past the first write (where it says "Keepalived will handle this"	22:21
rm_work	)	22:21
colby_	johnsom: no both are not enabled.	22:21
johnsom	Nir would be best to answer that, but I know RH requires more files than Ubuntu	22:22
rm_work	oh, or is it that keepalived handles the same things on both, but Ubuntu's networking handles the VIP better in the non-keepalived case (and redhat does not)	22:22
rm_work	yeah ok... nmagnezi maybe if you're around	22:22
rm_work	ah it's a different path too, ok, yeah, an extra alias file	22:22
johnsom	Ubuntu we write an int file in single mode, and not in act/stdby	22:23
*** ivve has quit IRC		22:29
johnsom	colby_: that is in the worker log right?	22:31
colby_	both the worker and health manager	22:31
johnsom	Yeah, somehow the certs are not aligned. I would double check the config via the startup debug logs and then compare the certs to what you get from openssl on the amp agent port.	22:34
rm_work	for ipv6, the "netmask" is the prefixlength?	23:14
rm_work	(of the network)	23:14
rm_work	yeah ok read some docs, i get it	23:18
rm_work	but does ipv6 have a concept of host_routes?	23:18
rm_work	johnsom: ^^	23:18
rm_work	i ... feel like it wouldn't need them?	23:19
johnsom	Yes, but the term “host route” is a neutron-ism	23:19
johnsom	The whole netmask vs prefix thing had a problem. Someone posted a patch for that. We were not using it consistently.	23:20
rm_work	right i am fixing that patch now	23:22
johnsom	Awesome	23:22
rm_work	almost done	23:22
rm_work	patiently waiting on local tests... >_>	23:23
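On the netmask versus prefix length question raised above: Python's standard ipaddress module exposes both forms for IPv4 and IPv6, which makes the relationship easy to check. This is a generic illustration with documentation-range addresses, not the code from the patch being fixed.

```python
import ipaddress

# IPv6 networks are normally described by prefix length alone.
net6 = ipaddress.ip_network("2001:db8::/64")
print(net6.prefixlen)     # 64
print(str(net6.netmask))  # ffff:ffff:ffff:ffff::

# IPv4 has the same two views of one value.
net4 = ipaddress.ip_network("192.0.2.0/24")
print(net4.prefixlen)     # 24
print(str(net4.netmask))  # 255.255.255.0
```

The netmask is just the prefix length written out as an address, so code that stores one form can always derive the other.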
*** PagliaccisCloud has joined #openstack-lbaas		23:29
*** yamamoto has quit IRC		23:39
*** abaindur has quit IRC		23:53
