Wednesday, 2017-11-08

*** garyloug_ has quit IRC 00:01
*** salv-orlando has joined #openstack-kuryr 00:02
*** salv-orlando has quit IRC 00:08
*** gouthamr has quit IRC 00:48
*** salv-orlando has joined #openstack-kuryr 01:03
*** salv-orlando has quit IRC 01:07
*** gouthamr has joined #openstack-kuryr 01:50
*** caowei has joined #openstack-kuryr 01:57
*** salv-orlando has joined #openstack-kuryr 02:04
*** salv-orlando has quit IRC 02:09
*** yamamoto has joined #openstack-kuryr 02:20
*** atoth has quit IRC 02:31
*** kiennt26 has joined #openstack-kuryr 02:33
*** gouthamr has quit IRC 02:40
*** gouthamr has joined #openstack-kuryr 02:45
*** salv-orlando has joined #openstack-kuryr 03:04
*** salv-orlando has quit IRC 03:10
*** gouthamr has quit IRC 03:34
*** gouthamr has joined #openstack-kuryr 03:39
*** aojea has joined #openstack-kuryr 03:53
*** aojea has quit IRC 03:57
*** gouthamr has quit IRC 03:59
*** salv-orlando has joined #openstack-kuryr 04:05
*** salv-orlando has quit IRC 04:10
*** yamamoto has quit IRC 04:38
*** yamamoto has joined #openstack-kuryr 04:41
*** caowei has quit IRC 04:47
*** caowei has joined #openstack-kuryr 05:01
*** salv-orlando has joined #openstack-kuryr 05:06
*** salv-orlando has quit IRC 05:10
*** janki has joined #openstack-kuryr 05:21
*** dims has quit IRC 06:05
*** pc_m has quit IRC 06:06
*** dims has joined #openstack-kuryr 06:12
*** salv-orlando has joined #openstack-kuryr 06:13
*** pc_m has joined #openstack-kuryr 06:14
*** janki has quit IRC 06:26
*** janki has joined #openstack-kuryr 06:27
*** janki has quit IRC 06:28
*** janki has joined #openstack-kuryr 06:28
*** aojea has joined #openstack-kuryr 06:58
*** yboaron has joined #openstack-kuryr 07:40
*** aojea has quit IRC 08:02
openstackgerrit Danil Golov proposed openstack/kuryr-kubernetes master: Add VIF-Handler And Drivers Design approach  https://review.openstack.org/513715 08:10
*** salv-orlando has quit IRC 08:18
*** salv-orlando has joined #openstack-kuryr 08:19
*** salv-orlando has quit IRC 08:23
openstackgerrit Danil Golov proposed openstack/kuryr-kubernetes master: Add VIF-Handler And Drivers Design approach  https://review.openstack.org/513715 08:25
*** aojea has joined #openstack-kuryr 08:31
ltomasbo hi irenab, don't forget about: https://review.openstack.org/#/c/510157/ :D 08:35
*** salv-orlando has joined #openstack-kuryr 08:58
ltomasbo hi dulek! I just tested your patch (containerized+daemon) with the other fixes 09:03
ltomasbo and now it works like a charm! 09:03
dulek ltomasbo: Yay! 09:04
ltomasbo no processes left behind, and all containers are starting without any problems! 09:04
dulek ltomasbo: I'm glad to hear that. 09:05
ltomasbo ohh, I was too fast... 09:05
ltomasbo I tried to create 30 pods now at once 09:06
ltomasbo and I get: http://paste.openstack.org/show/625803/ 09:06
dulek ltomasbo: Hm, I did 40 without problems yesterday. What happened? 09:06
ltomasbo dulek, ^^ 09:06
ltomasbo ok, but at the end, they become active anyway 09:07
ltomasbo so, it killed the sandbox and recreated it 09:07
dulek ltomasbo: Do: `for ip in $(kubectl get pods -o wide | tr -s " " | cut -f 6 -d " " | grep 10); do ping -c 1 $ip; done` to see if all of them are pingable 09:07
dulek (command should just end immediately) 09:08
dulek (ah, grep 10 works if your pod network starts with 10 :P) 09:08
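The one-liner above only works when pod IPs start with 10, as dulek notes. A minimal Python sketch of the same check that finds the IP column by its header instead; the function names `pod_ips` and `unreachable` are hypothetical, and it assumes the `kubectl get pods -o wide` output has an `IP` header column and that `ping -c 1` is available on the host:

```python
import ipaddress
import subprocess


def pod_ips(kubectl_output):
    """Return the valid addresses found in the IP column of the given text."""
    lines = [line for line in kubectl_output.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    if "IP" not in header:
        return []
    ip_index = header.index("IP")
    ips = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= ip_index:
            continue
        try:
            # Skip placeholders shown for pods that have no address yet.
            ipaddress.ip_address(fields[ip_index])
        except ValueError:
            continue
        ips.append(fields[ip_index])
    return ips


def unreachable(ips):
    """Ping each address once and return the ones that did not answer."""
    failed = []
    for ip in ips:
        result = subprocess.run(
            ["ping", "-c", "1", ip],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failed.append(ip)
    return failed
```

Feeding `pod_ips` the captured output of `kubectl get pods -o wide` and passing the result to `unreachable` lists the pods without connectivity, whatever prefix the pod network uses.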
dulek I hit an exception yesterday, but the retry done by kubelet helped. 09:08
ltomasbo umm 09:09
ltomasbo not all are pingable 09:09
dulek ltomasbo: http://paste.openstack.org/show/625804/ - you can look for that in daemon logs. 09:09
dulek Eh, so there must be another mistake. I'll try to create a bunch of pods a few times, maybe I'll get it. 09:10
ltomasbo actually, the ones that are not working are the ones with that CNI error I pasted 09:11
dulek ltomasbo: Any chance to get the cni-daemon logs? 09:12
ltomasbo sure 09:12
dulek ltomasbo: It would be best if you could send me the logs along with the output of "openstack port list" 09:12
dulek That's how I debugged yesterday's problems. 09:12
ltomasbo dulek, http://paste.openstack.org/show/625805/ 09:13
dulek ltomasbo: I guess "fa:16:3e:ad:50:69" matches the address of a non-active port in openstack port list? 09:14
ltomasbo dulek, do you prefer access to the env or the logs? 09:14
ltomasbo let me check that 09:15
ltomasbo yep, they are down 09:15
dulek ltomasbo: Okay, so I think I'm getting the same on my env. 09:16
dulek Okay, I'll look into it. At least I know how to debug this stuff. :) 09:16
ltomasbo xD 09:17
ltomasbo that helps! 09:17
ltomasbo any initial idea? 09:17
ltomasbo some kind of race getting some info from the namespaces? 09:17
*** garyloug has joined #openstack-kuryr 09:30
*** garyloug has quit IRC 09:34
openstackgerrit Luis Tomas Bolivar proposed openstack/kuryr-kubernetes master: Add readiness probe to kuryr-controller pod  https://review.openstack.org/518502 09:42
*** irenab has quit IRC 09:43
*** leyal has quit IRC 09:43
*** jerms has quit IRC 09:43
*** nhlfr has quit IRC 09:46
*** enick_354 has quit IRC 09:47
*** garyloug has joined #openstack-kuryr 09:48
*** irenab has joined #openstack-kuryr 09:49
*** leyal has joined #openstack-kuryr 09:49
*** jerms has joined #openstack-kuryr 09:49
*** deepika has joined #openstack-kuryr 09:58
*** deepika is now known as Guest49950 09:58
dulek ltomasbo: Oh, sorry, I've dived too deep into the code. For now I know this is an exception raised during exception handling. :D 09:58
dulek ltomasbo: I'm modifying pyroute2 to print the initial exception. 09:59
*** dougbtv has joined #openstack-kuryr 09:59
*** kiennt26 has quit IRC 10:00
*** dougbtv_ has quit IRC 10:00
*** dougbtv_ has joined #openstack-kuryr 10:01
ltomasbo dulek, no worries! I was just curious! 10:02
ltomasbo dulek, I added initial support for the readiness probe when containerized and ports pool enabled:  https://review.openstack.org/518502 10:02
ltomasbo dulek, I'll adapt the openshift-ansible too and it seems to be working: http://paste.openstack.org/show/625810/ 10:03
irenab ltomasbo, reviewed 10:04
ltomasbo irenab, thanks! I'll take a look 10:04
dulek ltomasbo: I've added myself as reviewer. I'm slacking off with reviews, but fighting those transient problems is super time consuming. 10:04
*** dougbtv has quit IRC 10:04
ltomasbo dulek, I can imagine... those are difficult to find though usually fixes are just a couple of lines... 10:05
*** nhlfr has joined #openstack-kuryr 10:06
*** caowei has quit IRC 10:10
dulek ltomasbo: Aww, looks like it's a timeout issue - kernel works slower than pyroute2.ipdb expects. 10:30
dulek ltomasbo: I'm checking that now. 10:30
ltomasbo umm 10:32
dulek ltomasbo: I'm still getting tracebacks but surprisingly this time all the pods are pingable… Awww. 10:38
ltomasbo with the ports in down status?? 10:38
ltomasbo dulek, ^^ 10:38
dulek ltomasbo: No, they're ACTIVE. 10:38
dulek ltomasbo: Can you explain what's responsible for turning ports from DOWN to ACTIVE? 10:39
dulek l3 agent? 10:39
ltomasbo yep 10:40
ltomasbo I think the agent triggers a notification when the flows are there 10:40
ltomasbo and the server updates the status in the DB 10:40
dulek ltomasbo: So it sees that the port gets connected by kuryr and reports that? 10:40
dulek Ok. 10:40
openstackgerrit Luis Tomas Bolivar proposed openstack/kuryr-kubernetes master: Avoid neutron calls at recovering precreated ports  https://review.openstack.org/510157 10:42
dulek ltomasbo: I fear that there may be another race condition. Because why do ports sometimes get into DOWN state when kubelet retries? 10:44
dulek ltomasbo: On failure kubelet always tries to do DEL first. 10:44
dulek ltomasbo: I wonder if there isn't a race condition between DEL and ADD. VIF stays the same between retries. 10:45
dulek ltomasbo: If ADD somehow gets processed faster, and then delayed DEL comes - pod ends up without network. 10:45
dulek Seems a bit bold, but I think it's possible. 10:46
ltomasbo dulek, so, you think it fails and the delete comes after the next retry has taken place? 10:46
dulek ltomasbo: IMO kubelet should wait for DEL to complete before sending another ADD. 10:47
dulek ltomasbo: So I'm not sure here… 10:47
ltomasbo dulek, it should, but maybe not. Didn't you discuss that case with apuimedo at some point? 10:48
ltomasbo was due to a related problem? 10:48
dulek ltomasbo: No, I thought of this race condition just yesterday. 10:49
dulek ltomasbo: Now I've got 12/30 pods not connected… 10:49
ltomasbo dulek, that is pretty much what I was getting 10:50
*** aojea has quit IRC 10:50
ltomasbo like a few containers not getting to Running that fast, and then not having connectivity 10:51
dulek ltomasbo: Oh, okay, I've committed the timeout increase to the wrong branch! Let me recheck that… 10:56
dulek (too less coffee today) 10:56
ltomasbo xD 10:56
dulek too little? 10:56
dulek :P 10:56
ltomasbo solution: more coffee!! jejejeje 10:57
*** yamamoto has quit IRC 11:03
*** yboaron_ has joined #openstack-kuryr 11:04
*** yboaron has quit IRC 11:07
*** aojea has joined #openstack-kuryr 11:12
*** yboaron_ has quit IRC 11:23
*** Guest96098 has joined #openstack-kuryr 11:36
*** yamamoto has joined #openstack-kuryr 11:37
dulek ltomasbo: I've increased VIF annotation timeout and this kernel timeout and it looks promising. 11:39
dulek ltomasbo: I'll scale the deployment a few more times. Fingers crossed. 11:40
*** yboaron_ has joined #openstack-kuryr 11:43
ltomasbo dulek, great! 11:43
ltomasbo dulek, fingers crossed! let me know the outcome! 11:44
*** Guest96098 has quit IRC 11:45
dulek ltomasbo: Eh, still the same tracebacks, one pod hasn't got networking… 11:47
ltomasbo umm 11:47
dulek ltomasbo: Waaaait… 11:48
dulek No, it has normal IPv4, I wondered if it wasn't an IPv6 fault. 11:48
ltomasbo dulek, ohh, but the port is active? 11:49
dulek No. 11:49
ltomasbo and it has ipv4 connectivity?? 11:49
dulek No, no, it's a normal port, just must have been disconnected. Maybe I can track how it was processed through logs. 11:50
ltomasbo but if the port is down, the agent should remove the flows, and it should not have connectivity, even if the pod is up and running and connected to the br-int, right? 11:51
dulek No, no, it has *no* connectivity. 11:52
dulek Sorry for confusing you. 11:52
ltomasbo ahh, ok 11:52
dulek Yeah, last operation on the vif was "Successfully unplugged vif" 11:52
dulek Now why… 11:52
dulek Ah, because that's how the requests came. Now why the strange order… 11:53
ltomasbo so, the add (retry) was performed before the delete (of the failed previous attempt) 11:55
dulek I think so. Which is rather strange. 11:55
dulek It was ADD, DEL, ADD, DEL for this pod. 11:56
ltomasbo umm, then the order is good 11:56
ltomasbo the bad thing is that there is an extra del, right? 11:56
dulek Yup. 11:56
ltomasbo umm, why 2 dels.... 11:57
dulek How about - ADD errors out, DEL is issued, DEL finishes with failure, so ADD is issued, but failed DEL is retried. 11:57
dulek That would make sense. 11:57
dulek And that would make our VIF allocation algorithm not compatible with what kubelet is doing. 11:57
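The ADD, DEL, ADD, DEL sequence discussed above can be modelled as a toy replay. This is only a sketch of the hypothesis, not kuryr-kubernetes code: `replay` and `replay_guarded` are hypothetical names, and the guard assumes each kubelet retry arrives with a distinct sandbox identifier, which the log does not confirm.

```python
def replay(commands):
    """Apply CNI commands in arrival order; return True if the VIF ends up plugged."""
    plugged = False
    for command in commands:
        if command == "ADD":
            plugged = True
        elif command == "DEL":
            plugged = False
    return plugged


def replay_guarded(requests):
    """Replay (command, sandbox) pairs, ignoring a DEL from a stale sandbox.

    The sandbox identifier is an assumption made for illustration only.
    """
    owner = None  # sandbox that currently owns the plugged VIF
    for command, sandbox in requests:
        if command == "ADD":
            owner = sandbox
        elif command == "DEL" and sandbox == owner:
            owner = None
    return owner is not None
```

With the order seen in the logs, `replay(["ADD", "DEL", "ADD", "DEL"])` leaves the pod unplugged, while `replay_guarded([("ADD", "a"), ("DEL", "a"), ("ADD", "b"), ("DEL", "a")])` keeps the second attempt plugged because the late DEL belongs to the earlier sandbox.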
ltomasbo umm 11:58
ltomasbo that could be 11:58
ltomasbo but not sure why it happens with many containers 11:58
ltomasbo why del finishes with failure? 11:58
dulek Timeouts I guess? But good question, I'll track that down… 11:59
ltomasbo yep, timeouts could be a good explanation 12:01
ltomasbo but I wonder why this was not happening before 12:02
ltomasbo dulek, due to having the ipdb cache? 12:02
dulek ltomasbo: We've had a single process for each request before. They are parallelizing much better in Python. 12:03
dulek Actually if I've eliminated the orphaned threads issue, I should be able to spawn more processes. I'll try that later. 12:04
*** aojea has quit IRC 12:05
dulek Okay, I'll need to add more logs to track requests correctly. 12:08
dulek ltomasbo: Uhm, and by the way the first DEL seems to return success, so I'm puzzled why a second one is issued. 12:08
ltomasbo umm 12:12
ltomasbo there should be a second del if the second add fails, right? 12:14
dulek ltomasbo: But I see the second ADD complete correctly in kubelet logs. 12:15
ltomasbo ummm 12:15
dulek ltomasbo: CNI Daemon returned "202 ACCEPTED", followed by correct info about IP. 12:15
dulek Then I have: 12:15
ltomasbo how is that possible then 12:15
dulek NetworkPlugin cni failed on the status hook for pod "test-3202410914-qlcbc_default": Unexpected command output nsenter: cannot open : No such file or directory 12:15
ltomasbo could it be the message is lost? 12:15
dulek And then I see DEL being issued. 12:16
dulek Now this nsenter error is something I've seen a lot of times in the logs. 12:16
*** atoth has joined #openstack-kuryr 12:17
*** aojea has joined #openstack-kuryr 12:28
*** salv-orlando has quit IRC 13:18
*** salv-orlando has joined #openstack-kuryr 13:19
*** salv-orlando has quit IRC 13:23
irenab dulek, I wonder what level of testing can be done for the https://review.openstack.org/#/c/517406/3 13:24
irenab I wonder if some level of unit testing can be added 13:24
dulek irenab: I think we're missing unit tests for the whole 'cni' module. 13:31
irenab you are correct 13:31
dulek irenab: I've added some in CNI daemon patch, but I've omitted cni.binding because, well… I haven't understood what's going on in those functions. 13:32
irenab :-), I thunk now you are the expert 13:32
irenab think 13:32
dulek So I can add them in a separate patch or include it in the bugfix, whatever you like. 13:33
openstackgerrit OpenStack Proposal Bot proposed openstack/kuryr master: Updated from global requirements  https://review.openstack.org/516936 13:35
openstackgerrit OpenStack Proposal Bot proposed openstack/kuryr-libnetwork master: Updated from global requirements  https://review.openstack.org/516937 13:35
openstackgerrit OpenStack Proposal Bot proposed openstack/kuryr-kubernetes master: Updated from global requirements  https://review.openstack.org/509781 13:35
irenab better to have it as part of the patch, at least some basic tests 13:37
dulek irenab: Okay, I'll work on that. 13:40
*** aojea has quit IRC 13:45
*** aojea has joined #openstack-kuryr 13:45
openstackgerrit Luis Tomas Bolivar proposed openstack/kuryr-kubernetes master: Add readiness probe to kuryr-controller pod  https://review.openstack.org/518502 13:54
*** yamamoto has quit IRC 13:55
*** yamamoto has joined #openstack-kuryr 13:56
*** yamamoto has quit IRC 14:01
*** aojea has quit IRC 14:05
openstackgerrit Merged openstack/kuryr-kubernetes master: Avoid neutron calls at recovering precreated ports  https://review.openstack.org/510157 14:06
*** aojea has joined #openstack-kuryr 14:06
dulek ltomasbo: Ghhaaaw! I should turn off port_debug to test at scale, shouldn't I? 14:11
ltomasbo yep 14:12
ltomasbo dulek, that will skip neutron calls (changing names) 14:12
ltomasbo but that will affect mostly when using pools 14:12
ltomasbo dulek, otherwise it may not skip that many calls, possibly none 14:13
dulek ltomasbo: Ah, right… So that's why I'm seeing kuryr-kubernetes annotate pods with VIFs really slowly? 14:15
*** aojea has quit IRC 14:17
ltomasbo xD 14:18
ltomasbo could be 14:19
ltomasbo dulek, there will be a lot of calls to neutron to create the ports, polling to know when the port becomes active, change the name and device-id, ... 14:19
*** yamamoto has joined #openstack-kuryr 14:26
*** yamamoto has quit IRC 14:31
*** salv-orlando has joined #openstack-kuryr 14:31
huats_ Hi guys 14:35
huats_ I am quite surprised by the documentation 14:35
huats_ I have followed the kuryr-libnetwork installation doc and at the end I don't have /usr/local/libexec/kuryr/ created 14:36
huats_ thus I don't have the various scripts for bindings 14:36
huats_ (like the ovs in my case) 14:36
huats_ any idea ? 14:36
huats_ ok 14:42
huats_ I was missing kuryr installation 14:42
*** aojea has joined #openstack-kuryr 14:43
*** premsankar has joined #openstack-kuryr 15:05
*** aojea has quit IRC 15:07
*** yboaron_ has quit IRC 15:08
*** janki has quit IRC 15:10
*** aojea has joined #openstack-kuryr 15:11
*** yamamoto has joined #openstack-kuryr 15:12
openstackgerrit OpenStack Proposal Bot proposed openstack/kuryr master: Updated from global requirements  https://review.openstack.org/516936 15:15
openstackgerrit OpenStack Proposal Bot proposed openstack/kuryr-libnetwork master: Updated from global requirements  https://review.openstack.org/516937 15:15
openstackgerrit OpenStack Proposal Bot proposed openstack/kuryr-kubernetes master: Updated from global requirements  https://review.openstack.org/509781 15:15
*** yamamoto has quit IRC 15:16
*** aojea has quit IRC 15:22
*** aojea has joined #openstack-kuryr 15:24
*** aojea has quit IRC 15:34
*** aojea has joined #openstack-kuryr 15:41
*** salv-orlando has quit IRC 15:42
*** salv-orlando has joined #openstack-kuryr 15:43
*** aojea has quit IRC 15:45
*** salv-orlando has quit IRC 15:47
*** aojea has joined #openstack-kuryr 15:51
*** yamamoto has joined #openstack-kuryr 15:54
*** yamamoto has quit IRC 16:01
*** yamamoto has joined #openstack-kuryr 16:03
*** aojea has quit IRC 16:04
*** aojea has joined #openstack-kuryr 16:07
dulek Rather lame question but… how do I run a single unit test? 16:14
*** yamamoto has quit IRC 16:16
ltomasbo dulek, enable the virtualenv 16:19
ltomasbo source .tox/py27/bin/activate 16:19
ltomasbo and then run this: 16:19
dulek Oh, no `tox -e py27 -- --regex path.to.module`? 16:19
*** aojea has quit IRC 16:19
ltomasbo python -m testtools.run [test module path] 16:20
ltomasbo that is the way I do it, pretty sure there are other ways... 16:20
dulek ltomasbo: Thanks, that's much better than nothing. :) 16:22
ltomasbo xD 16:22
*** yamamoto has joined #openstack-kuryr 16:26
*** yamamoto has quit IRC 16:31
*** salv-orlando has joined #openstack-kuryr 16:43
*** openstackgerrit has quit IRC 16:48
*** salv-orlando has quit IRC 16:50
*** salv-orlando has joined #openstack-kuryr 17:27
*** yamamoto has joined #openstack-kuryr 17:35
*** garyloug_ has joined #openstack-kuryr 17:40
*** garyloug_ has quit IRC 17:43
*** yamamoto has quit IRC 17:43
*** garyloug has quit IRC 17:43
*** aojea has joined #openstack-kuryr 17:55
*** aojea has quit IRC 18:21
*** salv-orlando has quit IRC 18:26
*** salv-orlando has joined #openstack-kuryr 18:34
*** aojea has joined #openstack-kuryr 18:37
*** aojea has quit IRC 18:54
*** aojea has joined #openstack-kuryr 19:00
*** aojea has quit IRC 19:18
*** openstackgerrit has joined #openstack-kuryr 19:19
openstackgerrit Michał Dulko proposed openstack/kuryr-kubernetes master: Prevent pyroute2.IPDB threads leaking  https://review.openstack.org/517406 19:19
openstackgerrit Michał Dulko proposed openstack/kuryr-kubernetes master: CNI split - introducing CNI daemon  https://review.openstack.org/515186 19:19
openstackgerrit Michał Dulko proposed openstack/kuryr-kubernetes master: Support kuryr-daemon when running containerized  https://review.openstack.org/518024 19:19
dulek irenab: It wasn't too easy, but I've added unit tests for cni.binding modules. 19:22
openstackgerrit Michał Dulko proposed openstack/kuryr-kubernetes master: DNM: Trying to run gate tests with CNI Daemon  https://review.openstack.org/509765 19:30
*** aojea has joined #openstack-kuryr 19:31
*** aojea has quit IRC 19:41
*** aojea has joined #openstack-kuryr 19:41
*** aojea has quit IRC 19:53
*** aojea has joined #openstack-kuryr 20:04
*** atoth has quit IRC 20:11
*** vikasc has quit IRC 20:28
*** aojea has quit IRC 20:44
*** dougbtv__ has joined #openstack-kuryr 22:08
*** aojea has joined #openstack-kuryr 22:44
*** aojea has quit IRC 22:49
openstackgerrit OpenStack Proposal Bot proposed openstack/kuryr master: Updated from global requirements  https://review.openstack.org/516936 23:55
openstackgerrit OpenStack Proposal Bot proposed openstack/kuryr-libnetwork master: Updated from global requirements  https://review.openstack.org/516937 23:55
openstackgerrit OpenStack Proposal Bot proposed openstack/kuryr-kubernetes master: Updated from global requirements  https://review.openstack.org/509781 23:55
