Tuesday, 2018-10-02

*** slaweq has joined #openstack-meeting-500:11
*** slaweq has quit IRC00:15
*** yamahata has quit IRC01:39
*** iyamahat has quit IRC01:39
*** roman_g has quit IRC02:17
*** skazi has quit IRC04:53
*** lemko has joined #openstack-meeting-505:49
*** slaweq has joined #openstack-meeting-506:11
*** slaweq has quit IRC06:16
*** dims has quit IRC06:38
*** dims has joined #openstack-meeting-506:44
*** dims has quit IRC06:48
*** dims has joined #openstack-meeting-506:51
*** slaweq has joined #openstack-meeting-507:06
*** slaweq has quit IRC07:39
*** slaweq has joined #openstack-meeting-507:49
*** iyamahat has joined #openstack-meeting-507:51
*** derekh has joined #openstack-meeting-508:37
*** roman_g has joined #openstack-meeting-508:39
*** slaweq has quit IRC09:19
*** slaweq has joined #openstack-meeting-509:21
*** markvoelker has joined #openstack-meeting-512:34
*** hongbin has joined #openstack-meeting-513:57
*** amotoki_ is now known as amotoki14:04
portdirecto/15:01
lamto/15:01
portdirect#startmeeting openstack-helm15:01
openstackMeeting started Tue Oct  2 15:01:50 2018 UTC and is due to finish in 60 minutes.  The chair is portdirect. Information about MeetBot at http://wiki.debian.org/MeetBot.15:01
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:01
*** openstack changes topic to " (Meeting topic: openstack-helm)"15:01
openstackThe meeting name has been set to 'openstack_helm'15:01
portdirectlets give it a few mins for people to roll in15:02
portdirect#topic rollcall15:02
*** openstack changes topic to "rollcall (Meeting topic: openstack-helm)"15:02
portdirecto/15:02
srwilkerso/15:02
lamt\o15:02
roman_go/15:03
portdirectagenda for today: https://etherpad.openstack.org/p/openstack-helm-meeting-2018-10-02, will give until 5 past and then kick off15:03
*** gagehugo has joined #openstack-meeting-515:05
portdirectoh hai gagehugo15:05
gagehugoo/15:05
portdirectok - lets get going15:05
srwilkersa wild gagehugo appears15:05
portdirect#topic Libvirt restarts15:06
*** openstack changes topic to "Libvirt restarts (Meeting topic: openstack-helm)"15:06
portdirectso once again, we seem to have lost the ability to restart libvirt pods without stopping vms15:06
portdirectas far as i can make out, the pid reaper of k8s is now (since 1.9) clever enough to kill child processes of pods, even when running in host pid mode15:07
portdirectand uses cgroups to target the pids to reap15:07
srwilkersahh15:08
portdirecti think the solution to this is to get ourselves out of the k8s managed cgroups entirely15:08
portdirectand so have proposed the following: https://review.openstack.org/#/c/607072/2/libvirt/templates/bin/_libvirt.sh.tpl15:09
alanmeadowsI thought we used to run libvirt in hostIPC--I'm not sure if I am misremembering or we stopped--this seems to re-enable that15:09
portdirectwe never did - though i remember the same15:09
alanmeadowsshouldn't hostIPC effectively *be* the flag we need to tell k8s to stop mucking with this? Are we seeing behavior we don't expect?15:10
portdirecti re-enabled as part of this - as it makes sense to have this15:10
portdirectalanmeadows: its not - see the cgroups/pid reaper comment above15:10
alanmeadowsTo be sure, there *should* be a k8s flag to effectively disable the cgroup, repeating, and other "helpers" and I thought hPid, and hIPC were it15:10
alanmeadowss/repeating/reaping/15:11
portdirectthey no longer are, looking at the kubelet source, there's no way to disable this15:11
alanmeadowsthis feels like a k8s gap15:11
portdirectand for everyone but us - i think what it does is an improvement15:11
portdirectno disagreement there15:12
alanmeadowssure, just feel like there needs to be a "don't get smart" button15:12
portdirectrkt stage 1 fly would offer this15:12
alanmeadowslibvirt is just one of several use cases15:12
portdirectso - i think this to me suggests two things15:12
portdirect1) we need a fix to this NOW, is the above the right way to do this?15:13
portdirect2) lets use the fix we end up with, and get a bug opened with k8s to support "dumb" containers - just like the good ol' days15:13
portdirectthe thought behind what i'm doing above is that we essentially run libvirt as a transient unit on the host15:15
srwilkersthe approach above seems acceptable to me, unless im missing something15:15
portdirectso for pretty much the whole world - we get normal operation15:15
portdirectthe one thing being that we dont specify a name for the transient unit - so systemd assigns one15:16
portdirectthis allows the pod to be restarted15:16
portdirector even the chart to be removed, and qemu processes will be left running15:16
portdirectand then when the pod/chart comes back - libvirt will start up in a new scope, but manage the qemus left in the old one just fine15:17
portdirectseem sane?15:17
alanmeadowswe validated it can not only see them but can touch them?15:18
srwilkersi think so15:18
portdirectalanmeadows: yes15:18
portdirectthough i do still need to check when using the cgroupfs driver for docker/k8s15:19
portdirectthat this still works, and that also leads nicely into the next point15:19
alanmeadowswhat are the interactions of this and the recommendation to disable the hugetlb cgroup in the boot parameters15:19
alanmeadowsare both still required?15:19
portdirectno - this removes that requirement15:20
portdirectwe super need to gate this - once we have fixed this issue - I really want to get a lightweight gate in that just confirms that the libvirt chart can be deployed, start a vm, and then be removed and deployed again, with 0 impact on the running vm15:21
alanmeadowslast question15:21
portdirectthe end of this would probably be initiating a reboot15:21
portdirectI dont think openstack would be required for this gate15:21
srwilkersportdirect: yeah, was going to see if we could include that in the gate rework you're going to chat about later15:21
alanmeadowsif cgroup_disable=hugetlb is still leveraged, this doesn't care and operates fine?15:21
portdirectyes15:21
portdirectits why on l35 i get the cgroups to manually use/over-ride dynamically: https://review.openstack.org/#/c/607072/2/libvirt/templates/bin/_libvirt.sh.tpl15:22
portdirectwe ok here? to leave any further convo to review?15:24
srwilkersyeah, works for me15:24
portdirectok15:25
portdirect#topic Calico V315:25
*** openstack changes topic to "Calico V3 (Meeting topic: openstack-helm)"15:25
portdirectso i dont think anticw is here15:25
portdirectbut theres been a load of work done on updating our now long in the tooth calico chart15:25
portdirectadding v3 support15:26
portdirecthttps://review.openstack.org/#/c/607065/15:27
portdirectplease review away15:27
portdirectI'm super excited about this - as it offers a ray of hope for the future, that we can get out of the quagmire of iptables rules from the kube-proxy and move to ipvs15:27
portdirectbut baby steps...15:27
*** anticw has joined #openstack-meeting-515:28
portdirecthey anticw, we were just talking about you15:29
srwilkerscool.  will review this proper later today15:29
portdirectanything you'd like to point out re the calico v3 work?15:29
anticwit works15:30
anticwthere are some cosmetic changes done to try to stay aligned with upstream15:31
anticwnot all of those are required, but having them means a later upgrade should be easier15:31
portdirectsounds great anticw15:32
portdirectthx for your work on this15:33
anticwnp, the other cleanups people brought up i've put on a list and we can decide which of those are needed15:34
anticwas you pointed out some of them run counter to a uniform interface to other CNS15:34
anticwCNIs15:34
portdirectsure - from what i have seen the core is good and solid, and the only real discussion may be around some of the config entrypoints15:35
portdirectbut i think we can hash that out in review15:35
srwilkersworks for me15:37
portdirectok15:38
portdirect#topic MariaDB15:38
*** openstack changes topic to "MariaDB (Meeting topic: openstack-helm)"15:38
portdirectso I've got a wip up here: https://review.openstack.org/#/c/604556/ that i hope radically improves our galera clustering ability15:38
portdirective been testing it reasonably hard15:39
portdirectthe biggest gaps atm that i'm aware of is the need to handle locks on configmaps better so we get acid like use out of them15:39
portdirectand also get xtrabackup working again15:40
portdirectthankfully both of these are relatively simple, though the configmap mutex may require a bit of time15:40
portdirectwould be great to get people to run this through its paces, and report back any shortcomings15:41
srwilkerseven if it does, i think this is a step in a better direction.  i've been playing with some of the changes for a bit now, and im pretty happy with it thus far15:41
portdirectok - so the last thing from me this week:15:42
portdirect#topic Gate intervention15:42
*** openstack changes topic to "Gate intervention (Meeting topic: openstack-helm)"15:42
portdirectevardjp is planning on doing an extensive overhaul of the gates, and bringing some much needed sanity to them15:43
portdirectthough hes away this week - boo!15:43
portdirectthat said, theres an urgent need to get our gates in a slightly better state than they are today15:43
portdirectso after this meeting im planning on refactoring some of them to get us to a point where things can merge without one million retries15:44
srwilkersthat'd be great15:44
portdirectthe main method to do this will be cutting out duplicate tests - and also potentially adding an extra gate, so we can split the load15:45
srwilkersnot sure if it matters now, but do we want to consider moving some of the checks to experimental checks (where it makes sense), until we can get the larger overhaul started/completed?15:45
portdirectas most failures seem to be the nodepool vm's just being pushed harder than they can take15:45
portdirectsrwilkers: if by the end of day i've not made significant progress - i think that may be the short term bandage we need15:46
srwilkersportdirect: yeah. i was playing around with some of the osh-infra gates just to see how things performed when the logging and monitoring charts were split into separate jobs15:46
portdirectwhile on the subject of gates:15:47
portdirect#topic Armada gate15:48
*** openstack changes topic to "Armada gate (Meeting topic: openstack-helm)"15:48
portdirectsrwilkers: you're up15:48
srwilkersi've got a few changes pending for the armada gate in openstack-helm15:48
srwilkersthe first adds the Elasticsearch admin password to the nagios chart definition, as the current nagios chart supports querying elasticsearch for logged events15:49
srwilkersthe second adds radosgw to the lma manifest, along with the required overrides to take advantage of the s3 support for elasticsearch15:49
srwilkersthe third is more reactive, as it seems the rabbitmq helm tests fail sporadically in the armada gate.  that change proposes disabling them for the time being15:50
srwilkersand the fourth is the most important in my mind.  it's the introduction of an ocata armada gate.  and the question becomes:  do we sunset the newton armada gate?15:51
portdirectfor rabbitmq - we prob dont need to run as many as we do in the upstream gates15:51
srwilkersportdirect: probably not.  i can update that patchset to instead reduce us down to one rabbit deployment15:51
portdirect++15:52
portdirectwe got consensus at the ptg to sunset newton totally15:52
portdirectand move the default to ocata15:53
srwilkersthats why im leaning towards sunsetting the newton armada gate with the ocata armada patchset, along with avoiding adding another 5 node check to our runs15:53
portdirectsounds good - though I think the 1st step would be to make ocata images the defaults in charts15:54
lamtare we sunsetting newton for just the armada job or all the jobs?15:54
lamtI volunteer to do that15:54
srwilkerslamt: nice :)15:54
portdirectlamt: if you could that would be awesome15:54
lamtwill start - those newton images start to pain me anyway15:55
portdirectplease add a loci newton gate though15:55
lamtwill do15:55
srwilkersunrelated portdirect:  we can take my last point wrt the values spec offline, so we have time for open discussion15:55
portdirectok - sounds good15:55
srwilkerswe can handle that in the #openstack-helm channel15:55
portdirect#topic open discussion / review needed15:55
*** openstack changes topic to "open discussion / review needed (Meeting topic: openstack-helm)"15:55
srwilkerscrickets :)15:57
portdirectok - lets wrap up then15:57
portdirect#endmeeting15:57
*** openstack changes topic to "OpenStack Meetings || https://wiki.openstack.org/wiki/Meetings/"15:57
openstackMeeting ended Tue Oct  2 15:57:54 2018 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:57
openstackMinutes:        http://eavesdrop.openstack.org/meetings/openstack_helm/2018/openstack_helm.2018-10-02-15.01.html15:57
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/openstack_helm/2018/openstack_helm.2018-10-02-15.01.txt15:57
openstackLog:            http://eavesdrop.openstack.org/meetings/openstack_helm/2018/openstack_helm.2018-10-02-15.01.log.html15:57
*** gagehugo has left #openstack-meeting-516:00
*** skazi has joined #openstack-meeting-516:21
*** iyamahat has quit IRC17:01
*** derekh has quit IRC17:03
*** spiette has quit IRC17:26
*** iyamahat has joined #openstack-meeting-517:26
*** spiette has joined #openstack-meeting-517:29
*** spiette has quit IRC17:29
*** spiette has joined #openstack-meeting-517:38
*** lemko has quit IRC18:17
*** spiette has quit IRC19:18
*** spiette has joined #openstack-meeting-519:21
*** sgrasley__ has joined #openstack-meeting-523:00
*** hongbin has quit IRC23:02
*** sgrasley_ has quit IRC23:04
*** njohnston has quit IRC23:17
*** sgrasley_ has joined #openstack-meeting-523:46
*** sgrasley__ has quit IRC23:49
