Thursday, 2019-04-11

*** kmadac has quit IRC02:05
*** kmadac has joined #openstack-kuryr02:07
*** rh-jelabarre has quit IRC02:37
*** hongbin has joined #openstack-kuryr03:03
*** gcheresh has joined #openstack-kuryr03:45
*** gcheresh has quit IRC03:51
*** hongbin has quit IRC04:05
*** gcheresh has joined #openstack-kuryr05:06
*** dulek has quit IRC05:56
*** ccamposr has joined #openstack-kuryr06:03
*** dulek has joined #openstack-kuryr06:05
openstackgerritDanil Golov proposed openstack/kuryr-kubernetes master: Update sriov neutron ports with pci info  https://review.openstack.org/642704 06:05
openstackgerritDanil Golov proposed openstack/kuryr-kubernetes master: Support sriovdp arbitrary resource names  https://review.openstack.org/642491 06:05
openstackgerritDanil Golov proposed openstack/kuryr-kubernetes master: Add PodResources service client  https://review.openstack.org/651580 06:05
openstackgerritDanil Golov proposed openstack/kuryr-kubernetes master: [WIP] DPDK in baremetal containers using SR-IOV  https://review.openstack.org/651581 06:05
*** maysams has joined #openstack-kuryr07:13
*** maysams has quit IRC07:40
*** maysams has joined #openstack-kuryr07:44
*** alisanhaji has joined #openstack-kuryr08:04
*** brault has joined #openstack-kuryr08:28
alisanhajiHi dulek, so do you have an idea about how to make kuryr-cni work with a containerized kubelet?08:39
dulekalisanhaji: Can you show me the definition of that kubelet pod/container? Like what host volumes it has?08:40
alisanhajidulek: the config.json of the kubelet runc container says that /proc /dev /sys /etc/kubernetes /var/lib are mounted08:40
dulekalisanhaji: Hm, so no /var/run? How does it talk with Docker then?08:41
dulekalisanhaji: Wait, it's runc, so possibly there's a socket in /var/lib?08:45
dulekIt must be like that.08:45
alisanhajiHere is the complete definition of kubelet container: http://paste.openstack.org/show/749160/ 08:45
dulekalisanhaji: Okay, there's /run, probably it's where the socket we should talk to is…08:47
dulekalisanhaji: Can you check if it has Python binary?08:47
alisanhajiIndeed there is the docker.sock in /run but no docker binary08:48
alisanhajino python binary either08:50
dulekalisanhaji: Okay, so there are two issues we would need to fix to make this work. One is that we hardcode /var/run/docker.sock - we would need to make it configurable or fetch it from somewhere. I guess kubelet does the same, so that one is easy.08:51
dulekalisanhaji: The more complicated, but not hopeless thing would be to switch `docker exec` to some cool `curl` call to that socket…08:51
dulekOh, third issue, " | python -c "\${finder}" docker\`" - this pipe won't work.08:52
dulekWhich, kinda sucks, because I don't want to write JSON processing in bash…08:52
alisanhajiI see what you want to do, but I am not sure how much time it would cost... :-(08:55
dulekalisanhaji: 1. is trivial, 2. ranges from easy to hard :P, 3. might be easy if Docker API offers some filtering.08:57
dulekalisanhaji: Oh crap, but is there even curl in that container?08:57
alisanhajiJust checked, and it does not...09:00
dulek:)09:00
dulekOkay, it's actually good.09:00
dulekalisanhaji: Because at this point it will be easier to just rewrite kuryr-cni in golang and inject a binary like a proper CNI plugin should do.09:01
dulekSince we dropped support for no kuryr-daemon, it actually isn't that much work.09:01
dulekalisanhaji: It's hard for me to prioritize this work at the moment, but we could definitely make it a goal for the cycle.09:02
alisanhajidulek: that would be great! No worries I can get back to working on kuryr with magnum when it's ready09:03
dulekltomasbo, dmellado: What do you think? ^ Talking about rewriting the part that calls kuryr-daemon in golang and injecting a binary as the CNI plugin instead of a hacky bash script.09:03
dulekalisanhaji: I suspect that kubelet container has host networking?09:04
alisanhajiyes indeed09:05
dulekalisanhaji: Okay, so that should work. :)09:06
dulekalisanhaji: I'll propose a blueprint later on.09:06
alisanhajidulek: Awesome, I will keep tracking the blueprints, thanks for the help :-D09:09
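Editor's sketch of issues 1 and 3 from the discussion above (configurable socket path, and letting the Docker API filter instead of piping JSON through `python -c`). It is written in Python only for illustration; the chat's actual plan is a Go binary. The Docker Engine API does accept a JSON-encoded `filters` query parameter on `GET /containers/json`, but the helper names, the default timeout, and the label used in the usage note are assumptions, not taken from kuryr-cni.

```python
import http.client
import json
import socket
from urllib.parse import quote


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=10):
        # The host name is only used for the Host header; no TCP is involved.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def container_list_path(filters):
    """Build the request path for GET /containers/json with server-side filters."""
    encoded = quote(json.dumps(filters, separators=(",", ":")))
    return "/containers/json?filters=" + encoded


def find_containers(socket_path, filters):
    """Ask the Docker daemon behind socket_path for containers matching filters."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", container_list_path(filters))
        resp = conn.getresponse()
        body = resp.read()
        if resp.status != 200:
            raise RuntimeError("Docker API returned HTTP %d" % resp.status)
        return json.loads(body)
    finally:
        conn.close()
```

A call could look like `find_containers("/run/docker.sock", {"label": ["io.kubernetes.container.name=kuryr-cni"]})`, where both the socket path and the label value are assumed values that would have to come from configuration.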
*** danil has joined #openstack-kuryr09:10
dmelladohey dulek, well, I wouldn't mind but I'd love checking a blueprint on that09:47
dmelladomaysams: pong09:47
* dmellado just landed...09:47
maysamshey dmellado09:47
maysamsdmellado: ohh I hope you have had a pleasant flight :)09:48
dmelladoyep! my flight got delayed a bit but I'll survive xD09:49
maysamsdmellado: XD09:50
dmelladoso what's up? :D09:50
dmelladoI was reading the backlog regarding the golang stuff, but I might be missing something more around09:50
maysamsdmellado: I was gonna ask you if you could rebase your patch https://review.openstack.org/#/c/645139/ when you have some time, because my dependent ps needs a rebase09:51
dmelladomaysams: sure thing, I'll tackle that so I don't block you ;)09:52
maysamsdmellado: thank you ;)09:52
openstackgerritDaniel Mellado proposed openstack/kuryr-kubernetes master: Add ipBlock support to NP  https://review.openstack.org/645139 09:55
dmelladomaysams: there you go:P09:55
maysamsdmellado: Thanks!!09:57
dulekdmellado, ltomasbo, alisanhaji: https://blueprints.launchpad.net/kuryr-kubernetes/+spec/golang-kuryr-cni 10:46
* dmellado reading10:47
*** maysams is now known as maysams|afk11:14
*** danil has quit IRC11:17
dulekdmellado, ltomasbo, maysams|afk: What do you think about this folks: bug/1824332 ?11:18
dulekHm, bot isn't linking? bug 1824332 11:18
openstackbug 1824332 in kuryr-kubernetes "Loadbalancer doesn't get recreated when deleted" [Undecided,New] https://launchpad.net/bugs/1824332 11:19
dulekHere you go. ^11:19
dmelladodulek: let's see11:19
dulekThis was reported by livelace yesterday. I reproduced it very easily.11:19
dmelladoso if we have a service and we delete the lb11:19
dmelladothen boom?11:19
dulekdmellado: No, Kuryr seems fine until it notices some change, e.g. scale-up.11:20
dmelladohmmm I see11:24
dmelladodulek: well, as you mention it shouldn't happen, but I'd prefer to honor that and fix it11:25
dmelladoI'll try reproducing it as well11:25
dulekdmellado: If you have setup hanging around it's easy. ;)11:26
dmelladoyep, doing it just now xD11:26
dulekI'd really like to get ltomasbo's and maysams's opinions as they worked with LB code more closely. Because maybe it's crazy difficult. :P11:26
dulekdmellado: Ah, you can recover from the crash loop by deleting the svc.11:26
dmelladodulek: lol, thanks for the hint xD11:27
*** danil has joined #openstack-kuryr11:37
*** rh-jelabarre has joined #openstack-kuryr12:05
*** maysams|afk is now known as maysams12:55
*** danil has quit IRC12:58
alisanhajidulek: thanks for the blueprint :)13:11
*** pcaruana has quit IRC13:20
*** celebdor has quit IRC13:30
*** pcaruana has joined #openstack-kuryr13:42
ltomasbodulek, what whas the question?13:48
ltomasbo*was13:48
*** gkadam has joined #openstack-kuryr14:02
*** gkadam has quit IRC14:03
maysamsdulek: I managed to reproduce the bug on mentioned14:13
maysamss/on/you14:13
maysamshttps://launchpad.net/bugs/1824332 14:14
openstackLaunchpad bug 1824332 in kuryr-kubernetes "Loadbalancer doesn't get recreated when deleted" [Undecided,New]14:14
maysamsdulek: this occurs because the annotation is not being removed when the lbaas is deleted14:15
maysamsdulek: I believe this is not difficult to fix14:15
maysamsdulek: and I agree with you that the lbaas should get recreated14:19
dulekltomasbo: Your opinion about the bug mentioned above. ^14:20
dulekmaysams: We don't watch on loadbalancers on Octavia API, but you're right that probably deleting the annotations (only the State, I guess) when we notice that LB is gone should fix this.14:23
dulekActually some OpenStack API poller thread wouldn't be too hard to implement. Or even better - thread that would inject "on_present" events into the queue.14:25
maysamsdulek, "on_present" events for what resources of what api?14:35
dulekmaysams: Hm, probably Endpoints on K8s API?14:36
maysamsdulek, I think we could remove the annotation when we get a not_found or recreate the lbaas14:36
dulekmaysams: Well, with LBaaS go all the other resources, so this isn't something we can easily do in one call, I think.14:37
maysamsdulek, hmmm true14:37
dulekmaysams: But yes, on "not found" we can remove LBaaSState and this will trigger an on_present and Kuryr will do its work by creating a new one.14:38
dulekmaysams: And this is fine, I was just looking for a way to achieve recreation even if Service or Endpoint events are not triggered.14:39
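Editor's sketch of the "thread that would inject on_present events into the queue" idea raised above. This is a minimal illustration, not Kuryr code: the class name, the `list_objects` callable, the event dictionary shape, and the assumption that a handler treats a `MODIFIED` event as "present" are all hypothetical.

```python
import queue
import threading


class PeriodicResync(threading.Thread):
    """Periodically re-enqueue every known object so handlers re-check it."""

    def __init__(self, list_objects, events, interval):
        super().__init__(daemon=True)
        self._list_objects = list_objects  # callable returning current objects
        self._events = events              # queue consumed by the event handlers
        self._interval = interval          # seconds between resync rounds
        self._stop_event = threading.Event()

    def resync_once(self):
        """Enqueue one synthetic event per object and return how many were sent."""
        count = 0
        for obj in self._list_objects():
            self._events.put({"type": "MODIFIED", "object": obj})
            count += 1
        return count

    def run(self):
        # Event.wait() returns False on timeout, so this loops until stop().
        while not self._stop_event.wait(self._interval):
            self.resync_once()

    def stop(self):
        self._stop_event.set()
```

The trade-off is the one mentioned in the chat: a resync recovers from missing external resources even when no Service or Endpoints event arrives, at the cost of extra periodic work for every object.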
ltomasbodulek, regarding that bug14:40
ltomasboI believe the loadbalancer is recreated if it gets deleted, right?14:40
maysamsdulek, I sEE14:40
maysamssee*14:40
ltomasbothe code calls create, and if it is there, then it tries to find it and load its information14:41
dulekltomasbo: No, I tried doing `openstack loadbalancer delete --cascade <id>`, then scaled the env and Kuryr got into crash loop.14:41
dulek(due to healthchecks of course)14:41
dulekltomasbo: In this case it wasn't looking for LB, I think, because it only needed to add members on scale-up14:42
ltomasboumm, ahh, ok14:42
dulekOh, it's _wait_for_provisioning that's failing…14:42
ltomasbodulek, I think it just tries to recreate it when the kuryr-controller is restarted14:43
ltomasboand I guess the annotation is not matching14:43
dulekltomasbo: Restarting kuryr-controller wasn't triggering a recreation on my env.14:43
dulekMaybe we just broke that somehow.14:44
ltomasbodulek, probably because the annotations were already there...14:44
ltomasboanyway, not sure if going and killing openstack resources manually should be supported either...14:45
ltomasboit is a nice thing to have, but...14:45
dulekltomasbo: livelace hit that when switching to Octavia from LBaaS v2.14:45
ltomasbonot always possible14:45
dulekltomasbo: Well, it's definitely the K8s way of doing things - I check the state, if it doesn't match the reality I fix the reality.14:46
openstackgerritLuis Tomas Bolivar proposed openstack/kuryr-kubernetes master: Ensure port_range_min is optional  https://review.openstack.org/651805 14:47
*** gcheresh has quit IRC14:48
ltomasbodulek, yes14:49
maysamsltomasbo: Is this the bug genadi mentioned today? ^14:50
ltomasbodulek, but for instance, in kubernetes, if you delete the machine object14:50
ltomasboa new one is created14:50
ltomasbobut if you remove the openstack VM, but leave the machine object there, nothing will happen14:50
ltomasboas k8s is only looking for k8s resources14:50
ltomasbomaysams, yes14:51
maysamsltomasbo, okay14:51
ltomasbomaysams, though I think I read it from dulek and dmellado14:51
ltomasbosorry I was at a meeting and was only connecting at times14:51
dulekmaysams, ltomasbo: Yes, it was found by dmellado and gcheresh.14:51
dulekltomasbo: Hm, so… VM is created from Machine objects by Operator?14:52
ltomasbodulek, great! and yes, it must have been due to my previous patch removing that default14:52
maysamsltomasbo, that's fine14:52
dulekltomasbo: If so, it'll get recreated - operators apart from watching from changes also periodically trigger "reconciliation" even without events.14:52
duleks/from/for14:53
ltomasbodulek, ahh, ok, maybe I didn't wait enough14:53
ltomasbobut in that case, we would need to implement something similar14:53
ltomasbokuryr-reconciliation14:53
ltomasboI agree we should move to use more and more CRDs to have the information about the openstack resources, and trigger the needed actions in case they are missing14:54
dulekltomasbo: Ha, yes - both LBaaS structs should be only one Kuryr CRD that has spec and status. :)14:54
dulekltomasbo: Okay, but I think you convinced me that this isn't really top priority.14:56
duleklivelace issue was somehow special - he switched from LBaaS v2 to Octavia.14:56
dulekI mean we can fix the bug, so on Service scale it'll work normally and recreate everything.14:58
dulekBut to do proper "reconciliation" is harder and not top thing we need.14:58
ltomasbodulek, to me, it sounds like a great thing to have, but I would prioritize other things14:59
dulekltomasbo: Fair enough, let's keep the bug hanging and we'll get back to it.15:00
ltomasboperhaps we can start moving to CRDs when we can, and new features can be 'forced' to have it there from the beginning15:00
ltomasboanyway, let me check the code... not sure if the code should handle lbaas creation in that case already15:01
ltomasbodulek, seems if the lbaas_state annotations are there, we don't retrigger ensure_loadbalancer: https://github.com/openstack/kuryr-kubernetes/blob/master/kuryr_kubernetes/controller/handlers/lbaas.py#L562-L575 15:04
ltomasboperhaps we need an extra find_loadbalancer to ensure that on updates... but that will slow down things for scaling/exposing actions15:05
dulekltomasbo: Hm, that does make sense.15:05
dulekltomasbo: We can always catch the 404 exception on member create and make sure that will delete the State annotation.15:06
dulekltomasbo: And this will retrigger whole thing.15:06
dulekCan anything else trigger change in LB? I mean - can Service be modified?15:06
ltomasbosounds like a nice workaround15:07
ltomasboyes, we can expose different ports15:07
ltomasboadd an extra port exposed15:07
ltomasbochange the target...15:07
dulekltomasbo: Credits for the workaround idea go to maysams. ;)15:07
ltomasbo:)15:07
dulekltomasbo: Hm, so we should probably check failure modes for those as well and add deleting annotation there accordingly.15:08
maysams:-)15:08
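Editor's sketch of the workaround agreed above: catch the "not found" error when adding a member and drop the state annotation so that the next event rebuilds the load balancer. Everything here is a stand-in for illustration: the `NotFound` exception, the `create_member` callable, the in-memory `annotations` dict, and the annotation key are hypothetical and do not reflect the actual Kuryr handler or driver API.

```python
class NotFound(Exception):
    """Stand-in for the 404 error an OpenStack client would raise."""


# Placeholder key; the real annotation name is not given in the log.
STATE_ANNOTATION = "example.org/lbaas-state"


def ensure_member(create_member, annotations, member):
    """Try to add a member; on 404 forget the stored LB state.

    Returns True when the member was created, False when the load balancer
    was gone and the state annotation was cleared to force a full rebuild.
    """
    try:
        create_member(member)
        return True
    except NotFound:
        # The load balancer vanished behind our back: dropping the state
        # means the next "present" event starts from scratch.
        annotations.pop(STATE_ANNOTATION, None)
        return False
```

As noted in the chat, member creation is only one failure mode; port or target changes on a Service would need the same handling.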
*** celebdor has joined #openstack-kuryr15:38
*** pcaruana has quit IRC15:42
*** premsankar has joined #openstack-kuryr16:01
*** ccamposr has quit IRC16:03
*** mrostecki has quit IRC16:09
aperevalovdulek, hi! Are you here?16:10
dulekaperevalov: Yup, what's up?16:10
aperevalovoh, I just want to ask, did you face an issue in HA, when one master goes down and CNI on a minion can't reconnect to the VIP, and the initial k8s connection was made through the VIP? Or did you just cover the haproxy use case?16:12
aperevalovnow I tried through haproxy, the same result: Connection broken.16:14
aperevalovBut after cni restart (w/o reconfiguring url) it's ok. So I assume our kuryr cni doesn't handle Connection broken correctly.16:14
aperevalovof course daemon was used ;)16:14
*** mrostecki has joined #openstack-kuryr16:18
aperevalovI can show you a call stack16:19
dulekaperevalov: Hm, so daemon hasn't reconnected?16:23
aperevalovyes, it didn't16:24
dulekaperevalov: Traceback would be useful here. It's master or stable/something?16:24
aperevalovhttps://etherpad.openstack.org/p/cni-ha-reconnect-problem 16:25
dulekOkay, so this was fixed a long time ago, it should recover from this.16:26
dulekIs it some old Kuryr?16:26
aperevalovprobably yes, this build wasn't prepared by me. And I didn't yet check on master. Thanks! I'll recheck on master.16:27
dulekaperevalov: Okay, thanks!16:35
*** aperevalov has quit IRC16:58
*** pcaruana has joined #openstack-kuryr17:05
*** maysams has quit IRC17:18
*** alisanhaji has quit IRC18:00
*** gmann is now known as gmann_afk18:15
*** pcaruana has quit IRC19:02
*** gmann_afk is now known as gmann19:27
*** gcheresh has joined #openstack-kuryr20:35
*** gcheresh has quit IRC20:48
*** premsankar has quit IRC23:11
*** kmadac has quit IRC23:21
*** irclogbot_1 has quit IRC23:24
*** kmadac has joined #openstack-kuryr23:27
