*** nhlfr has joined #openstack-kuryr | 00:01 | |
*** gouthamr has quit IRC | 00:14 | |
*** gouthamr has joined #openstack-kuryr | 00:20 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/kuryr master: Updated from global requirements https://review.openstack.org/516936 | 00:24 |
---|---|---|
openstackgerrit | OpenStack Proposal Bot proposed openstack/kuryr-libnetwork master: Updated from global requirements https://review.openstack.org/516937 | 00:25 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/kuryr-kubernetes master: Updated from global requirements https://review.openstack.org/509781 | 00:25 |
*** salv-orlando has joined #openstack-kuryr | 00:32 | |
*** salv-orlando has quit IRC | 00:36 | |
*** caowei has joined #openstack-kuryr | 01:25 | |
*** gouthamr has quit IRC | 01:30 | |
*** salv-orlando has joined #openstack-kuryr | 01:32 | |
*** kiennt26 has joined #openstack-kuryr | 01:35 | |
*** salv-orlando has quit IRC | 01:38 | |
*** gouthamr has joined #openstack-kuryr | 02:19 | |
*** salv-orlando has joined #openstack-kuryr | 02:33 | |
*** salv-orlando has quit IRC | 02:38 | |
*** openstack has joined #openstack-kuryr | 02:43 | |
*** ChanServ sets mode: +o openstack | 02:43 | |
*** gouthamr has quit IRC | 02:50 | |
*** gouthamr has joined #openstack-kuryr | 03:26 | |
*** reedip has joined #openstack-kuryr | 03:49 | |
*** kiennt26 has quit IRC | 04:24 | |
*** gouthamr has quit IRC | 04:35 | |
*** salv-orlando has joined #openstack-kuryr | 04:35 | |
openstackgerrit | Vu Cong Tuan proposed openstack/fuxi master: Do not use “-y” for package install https://review.openstack.org/518219 | 04:36 |
*** gouthamr has joined #openstack-kuryr | 04:40 | |
*** salv-orlando has quit IRC | 04:40 | |
*** yamamoto has joined #openstack-kuryr | 04:47 | |
*** caowei has quit IRC | 05:03 | |
*** janki has joined #openstack-kuryr | 05:17 | |
*** caowei has joined #openstack-kuryr | 05:32 | |
*** salv-orlando has joined #openstack-kuryr | 05:36 | |
*** salv-orlando has quit IRC | 05:40 | |
openstackgerrit | XieYingYun proposed openstack/kuryr master: Optimize the link address in docs https://review.openstack.org/518233 | 06:02 |
*** yboaron has joined #openstack-kuryr | 06:12 | |
*** yboaron has quit IRC | 06:19 | |
*** yboaron has joined #openstack-kuryr | 06:23 | |
*** salv-orlando has joined #openstack-kuryr | 06:35 | |
openstackgerrit | Merged openstack/kuryr-kubernetes master: Add icmp sg rules to k8s project https://review.openstack.org/515357 | 06:48 |
*** gouthamr has quit IRC | 06:54 | |
gsagie | salv-orlando: indeed it is.. | 07:11 |
*** gsagie has quit IRC | 07:23 | |
*** pcaruana has joined #openstack-kuryr | 08:06 | |
*** yboaron has quit IRC | 08:12 | |
*** salv-orlando has quit IRC | 08:45 | |
*** salv-orlando has joined #openstack-kuryr | 08:46 | |
*** salv-orlando has quit IRC | 08:51 | |
*** yamamoto has quit IRC | 09:02 | |
irenab | dulek, will check asap | 09:02 |
*** yamamoto has joined #openstack-kuryr | 09:05 | |
*** yboaron has joined #openstack-kuryr | 09:06 | |
openstackgerrit | Merged openstack/kuryr-kubernetes master: Remove 99-loopback.conf https://review.openstack.org/510571 | 09:07 |
*** garyloug has joined #openstack-kuryr | 09:07 | |
*** garyloug_ has joined #openstack-kuryr | 09:08 | |
*** garyloug has quit IRC | 09:08 | |
*** yamamoto has quit IRC | 09:09 | |
openstackgerrit | Michał Dulko proposed openstack/kuryr-kubernetes master: CNI Daemon documentation https://review.openstack.org/509380 | 09:12 |
openstackgerrit | Michał Dulko proposed openstack/kuryr-kubernetes master: DNM: Trying to run gate tests with CNI Daemon https://review.openstack.org/509765 | 09:15 |
*** salv-orlando has joined #openstack-kuryr | 09:23 | |
ltomasbo | dulek, http://paste.openstack.org/show/625684/ | 09:25 |
ltomasbo | I'm testing your cni-split containerized and got that (just in case it helps) | 09:25 |
dulek | ltomasbo: All of the pods get that state? You should look in CNI container docs (it's running the daemon now). | 09:26 |
dulek | s/docs/logs | 09:26 |
ltomasbo | dulek, not all of the pods | 09:26 |
ltomasbo | when I scale and create a bunch of them at the same time | 09:27 |
ltomasbo | some of them get it | 09:27 |
ltomasbo | most of the time it works... | 09:27 |
dulek | ltomasbo: Okay, so I'd expect you'll see EEXIST in the CNI daemon logs. | 09:27 |
dulek | ltomasbo: I should get that into a try-except, irenab asked me to do that in another patch. | 09:28 |
dulek | I guess I should do the patch now. | 09:28 |
ltomasbo | ok, seems I cannot access the logs... | 09:30 |
ltomasbo | dulek, it is not like this: $ kubectl -n kube-system logs kuryr-cni-ds-v91jm | 09:31 |
ltomasbo | ? | 09:31 |
dulek | Yup, that should be it. | 09:32 |
dulek | Empty? | 09:32 |
ltomasbo | yep, nothing there | 09:33 |
dulek | Hm, strange. That would mean daemon is not running. But it is, as you're getting some of your pods up and running. | 09:34 |
ltomasbo | and it annoys me a bit to see that many kuryr-daemon server workers... even when I delete the pods... | 09:34 |
dulek | ltomasbo: sudo systemctl status devstack@kuryr-daemon | 09:35 |
dulek | Maybe it had started in a process? | 09:35 |
ltomasbo | there is no kuryr-daemon... | 09:36 |
ltomasbo | no systemd I mean | 09:36 |
ltomasbo | kubectl get -n kube-system pod | 09:36 |
ltomasbo | NAME READY STATUS RESTARTS AGE | 09:36 |
ltomasbo | kuryr-cni-ds-v91jm 1/1 Running 0 15m | 09:36 |
ltomasbo | kuryr-controller-3218936350-d2kth 1/1 Running 0 38m | 09:36 |
dulek | And you're testing the latest version of the patch? | 09:36 |
ltomasbo | umm, let me see | 09:37 |
ltomasbo | there is only 1 patch set, right? | 09:37 |
ltomasbo | https://review.openstack.org/#/c/518024/ | 09:37 |
dulek | Okay, I wasn't sure if I haven't sent two, sorry. | 09:39 |
dulek | You have option setting containerized set to True and enable_service kuryr-daemon, right? | 09:39 |
ltomasbo | yep | 09:40 |
ltomasbo | and I see it is running containerized | 09:40 |
ltomasbo | and the kuryr-daemon processes | 09:40 |
dulek | Okay, so I'll try to run that myself and see what happens. | 09:40 |
ltomasbo | ps -eF | grep daem | grep kury | 09:41 |
ltomasbo | root 6351 6326 0 66268 50856 4 09:21 ? 00:00:00 kuryr-daemon: master process [/usr/bin/kuryr-daemon --config-file /etc/kuryr/kuryr.conf] | 09:41 |
ltomasbo | root 6364 6351 0 105183 48484 6 09:21 ? 00:00:00 kuryr-daemon: watcher worker(0) | 09:41 |
ltomasbo | root 6367 6351 0 1233788 98204 4 09:21 ? 00:00:08 kuryr-daemon: server worker(0) | 09:41 |
*** yamamoto has joined #openstack-kuryr | 09:46 | |
*** gouthamr has joined #openstack-kuryr | 09:52 | |
dulek | ltomasbo: http://paste.openstack.org/show/625694/ | 10:04 |
dulek | kss is my alias to kubectl -n kube-system | 10:04 |
ltomasbo | umm | 10:05 |
ltomasbo | so it works for you... | 10:05 |
ltomasbo | I did kill the cni container, to see if that was recreated | 10:05 |
ltomasbo | and the leftover processes were deleted | 10:05 |
ltomasbo | dulek, can you try that and check if you can access the logs of the new cni container? | 10:06 |
dulek | Sure thing. | 10:06 |
dulek | ltomasbo: Processes are gone now…, daemon started… aaand I have the logs. | 10:07 |
ltomasbo | umm | 10:07 |
ltomasbo | ok, I'll re-stack | 10:07 |
ltomasbo | only modification I did is to use lbaasv2 instead of octavia | 10:08 |
dulek | ltomasbo: I'm running without Octavia as well. | 10:09 |
ltomasbo | dulek, umm, centos? | 10:09 |
dulek | ltomasbo: Yup, I'll send you my local.conf | 10:09 |
dulek | http://paste.openstack.org/show/625700/ | 10:11 |
ltomasbo | dulek, this is mine: http://paste.openstack.org/show/625699/ | 10:11 |
dulek | But please note that I also got the errors you mentioned originally; one pod failed. | 10:11 |
ltomasbo | umm | 10:12 |
ltomasbo | ok | 10:12 |
dulek | ltomasbo: Hm, it looks pretty much the same as mine. | 10:12 |
ltomasbo | seems the only difference is you enable debug | 10:12 |
ltomasbo | and I force ovs-firewall | 10:12 |
dulek | I'll add that try-except. | 10:12 |
ltomasbo | other than that, pretty much the same | 10:13 |
ltomasbo | ok, thanks! | 10:13 |
dulek | Yup, but you should still see INFO logs even without debug. | 10:13 |
ltomasbo | I was eager to test this! but no hurry! | 10:13 |
*** caowei has quit IRC | 10:13 | |
ltomasbo | dulek, yep, I know, it should have the info logs | 10:13 |
ltomasbo | I'm restacking | 10:13 |
ltomasbo | let's see if now it works... | 10:13 |
*** jchhatbar has joined #openstack-kuryr | 10:26 | |
*** janki has quit IRC | 10:26 | |
*** alraddarla has joined #openstack-kuryr | 10:26 | |
*** yamamoto has quit IRC | 10:27 | |
*** salv-orlando has quit IRC | 10:27 | |
*** yamamoto has joined #openstack-kuryr | 10:27 | |
*** yamamoto has quit IRC | 10:33 | |
*** janki has joined #openstack-kuryr | 10:34 | |
*** jchhatbar has quit IRC | 10:35 | |
irenab | dulek, I verified the CNI split patch, you got my +2 | 10:41 |
dulek | irenab: Thanks! | 10:42 |
irenab | Not directly related, but triggered by this patch. I am a bit concerned about number of various options how to run the kuryr. I think we should converge on limited number and have proper documentation of what are the options | 10:44 |
dulek | irenab: I agree and I guess this is the outcome of the fact we're still experimenting a bit. | 10:47 |
irenab | dulek, totally agree. I think we just should keep the awareness and deprecate as we go the previous not used flavours | 10:48 |
dulek | irenab: Yup. Currently we have 2x2 matrix - containerized or not and with daemon and without. | 10:49 |
dulek | irenab: I believe first to get deprecated is running without daemon. | 10:49 |
dulek | Then containerization is really a deployment thing, so I don't think we'll be able to deprecate running kuryr services on bare metal. | 10:49 |
irenab | +1 | 10:49 |
*** yamamoto has joined #openstack-kuryr | 11:12 | |
ltomasbo | I agree on that too! | 11:12 |
ltomasbo | dulek, I remove the VM and create a new one | 11:13 |
ltomasbo | and I'm getting the same | 11:13 |
ltomasbo | no access to the log with kubectl | 11:13 |
ltomasbo | but I can see them with docker logs CNI_ID | 11:13 |
ltomasbo | strange | 11:13 |
ltomasbo | dulek, same for the controller container | 11:14 |
dulek | Hm… I'd say it's your env's fault? | 11:14 |
ltomasbo | umm, could be, but I remove everything (even the VM) and started from scratch | 11:15 |
dulek | ltomasbo: Oh, so definitely not your fault… | 11:15 |
dulek | ltomasbo: BTW I think I've understood the EEXIST/KeyError issue, I'm testing the patch currently. | 11:15 |
ltomasbo | dulek, I agree it is probably not related to your patch either | 11:15 |
ltomasbo | dulek, you are fast!!!! | 11:16 |
dulek | I guess I've had enough coffee today, I remember spending hours on that with apuimedo. :P | 11:16 |
dulek | (not coffee, the bug) | 11:16 |
dulek | ltomasbo: So controller pod has no logs as well? | 11:17 |
ltomasbo | well, it has, but it is not accessible through kubectl, only from docker | 11:17 |
dulek | ltomasbo: How about checking the kube-apiserver logs and how it responds to kubectl logs? | 11:18 |
dulek | ltomasbo: Oh, even better - just spawn any pod and see if it has logs. :) | 11:19 |
ltomasbo | ok | 11:20 |
ltomasbo | dulek, same problem... | 11:22 |
ltomasbo | kubectl logs demo-2293951457-bmxbm | 11:22 |
ltomasbo | ^C | 11:22 |
ltomasbo | [stack@gerrit-518024vm-0 devstack]$ docker logs 9b340201edea | 11:22 |
ltomasbo | ::ffff:10.0.0.70 - - [07/Nov/2017 11:21:51] "GET / HTTP/1.1" 200 - | 11:22 |
dulek | So I guess it's caused by something on your VM. | 11:26 |
*** salv-orlando has joined #openstack-kuryr | 11:27 | |
ltomasbo | dulek, umm, probably | 11:27 |
*** yamamoto has quit IRC | 11:31 | |
*** salv-orlando has quit IRC | 11:32 | |
*** yamamoto has joined #openstack-kuryr | 11:43 | |
*** yamamoto has quit IRC | 11:58 | |
irenab | dulek, do you have a min? | 12:00 |
dulek | irenab: min? | 12:01 |
irenab | minute | 12:01 |
irenab | question regarding http://logs.openstack.org/80/509380/13/check/build-openstack-sphinx-docs/d8f7286/html/devref/kuryr_kubernetes_design.html#communication | 12:02 |
*** janonymous has joined #openstack-kuryr | 12:02 | |
irenab | in the diagram it looks that there is a watch started upon daemon start and then there is an AddNetwork request to deal with specific Pod request. from the diagram it is not clear what is in the scope of the watch event handling | 12:05 |
irenab | I am not sure it can be properly clarified in the diagram, but worth to add in the description | 12:06 |
dulek | irenab: Hm, that's right, I would need a third time scale probably. | 12:06 |
dulek | irenab: Basically now with daemon it's a single watcher that looks for pods on host it's running on. | 12:06 |
dulek | irenab: Maybe it should even be pictured as another entity? | 12:06 |
irenab | watch? | 12:07 |
dulek | irenab: You're asking about watcher? | 12:08 |
irenab | What do you mean by 'pictured as another entity'? | 12:08 |
dulek | As another vertical timeline. | 12:09 |
* dulek forgot the names of UML diagram elements. | 12:09 | |
irenab | what do you mean by 'it'? | 12:09 |
irenab | the question is not regarding UML :-) | 12:09 |
dulek | Oh, sorry. The Watcher, yes. | 12:09 |
irenab | I think the vertical is 'time line' | 12:10 |
dulek | irenab: So right now Daemon is 2 processes - one is serving requests and second one is Watcher process that monitors k8s API for incoming VIF annotation. | 12:10 |
dulek | Probably it'll be best to split Daemon into two time lines then. I'll update the patch. | 12:11 |
irenab | What is missing in the diagram is split of responsibilities between these two | 12:11 |
irenab | can be either clarified in the description or maybe by adding Watcher as another Actor in the diagram | 12:12 |
irenab | dulek, this can be great. thanks! | 12:12 |
dulek | Okay, I'll add another actor and description. | 12:12 |
*** janki has quit IRC | 12:22 | |
*** salv-orlando has joined #openstack-kuryr | 12:29 | |
*** salv-orlando has quit IRC | 12:31 | |
*** salv-orl_ has joined #openstack-kuryr | 12:31 | |
*** yamamoto has joined #openstack-kuryr | 12:34 | |
*** salv-orl_ has quit IRC | 12:40 | |
openstackgerrit | Michał Dulko proposed openstack/kuryr-kubernetes master: CNI Daemon documentation https://review.openstack.org/509380 | 12:47 |
*** gouthamr has quit IRC | 12:50 | |
*** pcaruana has quit IRC | 13:22 | |
*** garyloug_ has quit IRC | 13:34 | |
openstackgerrit | Danil Golov proposed openstack/kuryr-kubernetes master: Add VIF-Handler And Drivers Design approach https://review.openstack.org/513715 | 13:53 |
*** yamamoto has quit IRC | 13:56 | |
*** yamamoto has joined #openstack-kuryr | 14:10 | |
*** yamamoto has quit IRC | 14:15 | |
*** janonymous has quit IRC | 14:21 | |
ltomasbo | ping dulek | 14:39 |
dulek | ltomasbo: pong. I'm preparing late lunch, so I may answer slowly. | 14:43 |
ltomasbo | dulek, no hurry! go to have lunch! | 14:43 |
ltomasbo | dulek, it was just a question about the containerized deployment, where should I look (inside the container) for the logs (kuryr-kubernetes.log)? | 14:43 |
dulek | ltomasbo: I think those go to stdout, so docker manages them. | 14:44 |
dulek | ltomasbo: If we're talking about controller logs. | 14:44 |
ltomasbo | yep, controller logs | 14:45 |
ltomasbo | I want to create a readiness check based on the controller log info | 14:45 |
dulek | ltomasbo: Aaah. That might be a bit difficult. | 14:45 |
ltomasbo | that what I'm seeing... | 14:45 |
ltomasbo | anyway, go to have lunch! we can discuss it later! | 14:46 |
dulek | ltomasbo: Why don't you use your port pool file socket to ask the service directly? | 14:46 |
*** pcaruana has joined #openstack-kuryr | 14:47 | |
ltomasbo | mainly because we agreed on that being just a devstack kind of tool, and we need to improve that if we are going to use it for more robust deployments | 14:48 |
ltomasbo | but, unless we figure it out a different way, it may be the time to improve that... and use it | 14:49 |
dulek | ltomasbo: Okay, then it may be good to think of something else… | 15:00 |
dulek | ltomasbo: What log message you wanted to look for? | 15:01 |
ltomasbo | yep, I have an idea about how to fix it | 15:01 |
ltomasbo | I'm looking at what point in time the ports are loaded into the pools | 15:01 |
dulek | ltomasbo: Ah, that's readiness probe. | 15:02 |
ltomasbo | so, I will create a file once that is done, and use that as readiness check | 15:02 |
dulek | ltomasbo: I wanted to write exactly that. :D | 15:02 |
ltomasbo | :D | 15:03 |
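The file-based readiness idea discussed above can be sketched as follows. This is a minimal illustration, not the kuryr-kubernetes implementation: `mark_ready`, `is_ready`, and the marker file name are hypothetical, and the point at which the controller would call `mark_ready` (once the ports are loaded into the pools) is assumed from the conversation.

```python
import os
import tempfile


def mark_ready(marker_path):
    """Create an empty marker file once initialization has finished.

    Hypothetical helper: the service would call this after its startup
    work (here, loading ports into the pools) is complete.
    """
    with open(marker_path, "w"):
        pass


def is_ready(marker_path):
    """Report readiness by checking whether the marker file exists."""
    return os.path.exists(marker_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "pools_populated")  # hypothetical name
        print(is_ready(marker))  # not ready before the marker is written
        mark_ready(marker)
        print(is_ready(marker))  # ready afterwards
```

A readiness probe that runs a command inside the container could then test for the same file, which avoids parsing controller logs.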
*** garyloug_ has joined #openstack-kuryr | 15:09 | |
*** yamamoto has joined #openstack-kuryr | 15:13 | |
*** jdavis has joined #openstack-kuryr | 15:25 | |
*** yamamoto has quit IRC | 15:26 | |
*** yamamoto has joined #openstack-kuryr | 15:55 | |
*** janki has joined #openstack-kuryr | 15:56 | |
*** yamamoto has quit IRC | 16:06 | |
*** salv-orlando has joined #openstack-kuryr | 16:20 | |
dulek | ltomasbo: Want to do a sanity check for my idea of KeyError? | 16:34 |
dulek | ltomasbo: I've spent another bunch of hours trying to debug that and again I have an idea. | 16:34 |
dulek | ltomasbo: So line https://github.com/openstack/kuryr-kubernetes/blob/master/kuryr_kubernetes/cni/binding/bridge.py#L39 sometimes explodes with KeyError. | 16:35 |
ltomasbo | tell me | 16:35 |
ltomasbo | umm | 16:35 |
ltomasbo | and do you know why? | 16:35 |
*** yamamoto has joined #openstack-kuryr | 16:35 | |
ltomasbo | not yet into the namespace? | 16:36 |
dulek | ltomasbo: My hypothesis is that movement of h_iface to host netns done in line 37 sometimes isn't noticed by IPDB soon enough. | 16:36 |
dulek | Yup. IPDB checks for changes periodically. It might happen that the object hasn't yet caught up with changes. | 16:36 |
dulek | At least I think this makes sense. :P | 16:36 |
*** yboaron has quit IRC | 16:37 | |
ltomasbo | dulek, sounds plausible... | 16:37 |
ltomasbo | so, what do you propose? | 16:39 |
ltomasbo | re-try? | 16:39 |
dulek | ltomasbo: Yeah, so first step would be https://review.openstack.org/#/c/517406/ - to remove IPDB object caching and prevent process leaks. | 16:40 |
dulek | ltomasbo: Oh, this will solve the problem completely. :D | 16:40 |
ltomasbo | xD | 16:41 |
dulek | ltomasbo: Just here: https://review.openstack.org/#/c/517406/2/kuryr_kubernetes/cni/binding/bridge.py I'd need to split the with that's there. | 16:41 |
*** yamamoto has quit IRC | 16:41 | |
dulek | ltomasbo: That way I can be sure that IPDB will get created after the changes got committed. | 16:41 |
ltomasbo | you mean the with at L37? | 16:42 |
ltomasbo | unindent it? | 16:42 |
dulek | ltomasbo: Line 26, h_ipdb isn't used until line 37. | 16:43 |
dulek | ltomasbo: And after line 37 c_ipdb isn't used. | 16:43 |
ltomasbo | so, in the current version (left side) | 16:43 |
ltomasbo | will it work just moving L27 to L38? | 16:43 |
dulek | I can split that to do a c_ipdb context manager first and then h_ipdb. Completely separated. | 16:43 |
ltomasbo | or here https://github.com/openstack/kuryr-kubernetes/blob/master/kuryr_kubernetes/cni/binding/bridge.py#L39, just move L27 to right before L39? | 16:44 |
dulek | ltomasbo: That would work only if caching isn't enabled. | 16:45 |
ltomasbo | sure | 16:45 |
dulek | ltomasbo: Otherwise in kuryr-daemon situation the host IPDB will be cached. | 16:45 |
dulek | And I hate this IPDB class. Kuryr code assumes too much about it. I wonder if we shouldn't move to something that is doing full reload before each operation. | 16:45 |
dulek | Okay, I need to leave for some time, but I'll probably find some time today to fix this stuff up. | 16:46 |
ltomasbo | great! thanks! | 16:46 |
ltomasbo | I'll give it another try tomorrow then! | 16:46 |
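The hypothesis above (a host-side view opened before the interface is moved may not yet contain it) and the proposed fix (finish the container-side context manager first, then open the host-side one) can be illustrated with a toy model. Everything here is a hypothetical stand-in: `FakeKernel` and `SnapshotView` are not the pyroute2 API, and the view never refreshes at all, which models the worst case of a view that has not caught up yet.

```python
class FakeKernel:
    """Toy stand-in for kernel state: interface names per namespace."""

    def __init__(self):
        self.namespaces = {"host": set(), "container": {"h_iface"}}


class SnapshotView:
    """View of one namespace, frozen at the moment it is opened."""

    def __init__(self, kernel, netns):
        self._ifaces = set(kernel.namespaces[netns])

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False

    def __getitem__(self, name):
        if name not in self._ifaces:
            raise KeyError(name)
        return name


def move(kernel, name, src, dst):
    """Move an interface between namespaces in the toy kernel."""
    kernel.namespaces[src].discard(name)
    kernel.namespaces[dst].add(name)


def connect_nested(kernel):
    # Both views are opened up front, so the host view is already stale
    # by the time the interface arrives in the host namespace.
    with SnapshotView(kernel, "host") as h_view, \
            SnapshotView(kernel, "container") as c_view:
        move(kernel, c_view["h_iface"], "container", "host")
        return h_view["h_iface"]  # raises KeyError in this model


def connect_split(kernel):
    # The container-side work is completed first; the host view is only
    # opened afterwards, so it sees the moved interface.
    with SnapshotView(kernel, "container") as c_view:
        move(kernel, c_view["h_iface"], "container", "host")
    with SnapshotView(kernel, "host") as h_view:
        return h_view["h_iface"]
```

In this model `connect_nested` fails with `KeyError` every time, whereas the real failure was described as intermittent; the sketch only shows why ordering the two context managers removes the race.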
*** janki has quit IRC | 17:30 | |
*** janki has joined #openstack-kuryr | 17:31 | |
*** janki has quit IRC | 17:32 | |
*** jdavis has quit IRC | 17:35 | |
*** aojea has joined #openstack-kuryr | 19:15 | |
*** aojea has quit IRC | 19:34 | |
*** aojea has joined #openstack-kuryr | 19:35 | |
*** aojea has quit IRC | 19:39 | |
dmellado | heh, being in Sydney is reading a digest of the channel xD | 19:58 |
*** aojea has joined #openstack-kuryr | 20:08 | |
*** aojea has quit IRC | 21:33 | |
*** aojea has joined #openstack-kuryr | 21:47 | |
*** livelace-link has quit IRC | 21:52 | |
openstackgerrit | Michał Dulko proposed openstack/kuryr-kubernetes master: Fix kubelet retries issues https://review.openstack.org/518404 | 22:06 |
openstackgerrit | Michał Dulko proposed openstack/kuryr-kubernetes master: Prevent pyroute2.IPDB threads leaking https://review.openstack.org/517406 | 22:13 |
openstackgerrit | Michał Dulko proposed openstack/kuryr-kubernetes master: CNI split - introducing CNI daemon https://review.openstack.org/515186 | 22:13 |
openstackgerrit | Michał Dulko proposed openstack/kuryr-kubernetes master: Support kuryr-daemon when running containerized https://review.openstack.org/518024 | 22:13 |
*** gouthamr has joined #openstack-kuryr | 22:16 | |
openstackgerrit | Michał Dulko proposed openstack/kuryr-kubernetes master: DNM: Trying to run gate tests with CNI Daemon https://review.openstack.org/509765 | 22:18 |
*** aojea has quit IRC | 22:52 | |
*** salv-orlando has quit IRC | 22:59 | |
*** salv-orlando has joined #openstack-kuryr | 23:00 | |
*** salv-orlando has quit IRC | 23:04 |