*** slaweq has joined #openstack-meeting-5 | 00:11 | |
*** slaweq has quit IRC | 00:15 | |
*** yamahata has quit IRC | 01:39 | |
*** iyamahat has quit IRC | 01:39 | |
*** roman_g has quit IRC | 02:17 | |
*** skazi has quit IRC | 04:53 | |
*** lemko has joined #openstack-meeting-5 | 05:49 | |
*** slaweq has joined #openstack-meeting-5 | 06:11 | |
*** slaweq has quit IRC | 06:16 | |
*** dims has quit IRC | 06:38 | |
*** dims has joined #openstack-meeting-5 | 06:44 | |
*** dims has quit IRC | 06:48 | |
*** dims has joined #openstack-meeting-5 | 06:51 | |
*** slaweq has joined #openstack-meeting-5 | 07:06 | |
*** slaweq has quit IRC | 07:39 | |
*** slaweq has joined #openstack-meeting-5 | 07:49 | |
*** iyamahat has joined #openstack-meeting-5 | 07:51 | |
*** derekh has joined #openstack-meeting-5 | 08:37 | |
*** roman_g has joined #openstack-meeting-5 | 08:39 | |
*** slaweq has quit IRC | 09:19 | |
*** slaweq has joined #openstack-meeting-5 | 09:21 | |
*** markvoelker has joined #openstack-meeting-5 | 12:34 | |
*** hongbin has joined #openstack-meeting-5 | 13:57 | |
*** amotoki_ is now known as amotoki | 14:04 | |
portdirect | o/ | 15:01 |
---|---|---|
lamt | o/ | 15:01 |
portdirect | #startmeeting openstack-helm | 15:01 |
openstack | Meeting started Tue Oct 2 15:01:50 2018 UTC and is due to finish in 60 minutes. The chair is portdirect. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:01 |
*** openstack changes topic to " (Meeting topic: openstack-helm)" | 15:01 | |
openstack | The meeting name has been set to 'openstack_helm' | 15:01 |
portdirect | lets give it a few mins for people to roll in | 15:02 |
portdirect | #topic rollcall | 15:02 |
*** openstack changes topic to "rollcall (Meeting topic: openstack-helm)" | 15:02 | |
portdirect | o/ | 15:02 |
srwilkers | o/ | 15:02 |
lamt | \o | 15:02 |
roman_g | o/ | 15:03 |
portdirect | agenda for today: https://etherpad.openstack.org/p/openstack-helm-meeting-2018-10-02, will give until 5 past and then kick off | 15:03 |
*** gagehugo has joined #openstack-meeting-5 | 15:05 | |
portdirect | oh hai gagehugo | 15:05 |
gagehugo | o/ | 15:05 |
portdirect | ok - lets get going | 15:05 |
srwilkers | a wild gagehugo appears | 15:05 |
portdirect | #topic Libvirt restarts | 15:06 |
*** openstack changes topic to "Libvirt restarts (Meeting topic: openstack-helm)" | 15:06 | |
portdirect | so once again, we seem to have lost the ability to restart libvirt pods without stopping vms | 15:06 |
portdirect | as far as i can make out, the pid reaper of k8s is now (since 1.9) clever enough to kill child processes of pods, even when running in host pid mode | 15:07 |
portdirect | and uses cgroups to target the pids to reap | 15:07 |
srwilkers | ahh | 15:08 |
portdirect | i think the solution to this is to get ourselves out of the k8s managed cgroups entirely | 15:08 |
portdirect | and so have proposed the following: https://review.openstack.org/#/c/607072/2/libvirt/templates/bin/_libvirt.sh.tpl | 15:09 |
alanmeadows | I thought we used to run libvirt in hostIPC--I'm not sure if I am misremembering or we stopped--this seems to re-enable that | 15:09 |
portdirect | we never did - though i remember the same | 15:09 |
alanmeadows | shouldn't hostIPC effectively *be* the flag we need to tell k8s to stop mucking with this? Are we seeing behavior we don't expect? | 15:10 |
portdirect | i re-enabled as part of this - as it makes sense to have this | 15:10 |
portdirect | alanmeadows: its not - see the cgroups/pid reaper comment above | 15:10 |
alanmeadows | To be sure, there *should* be a k8s flag to effectively disable the cgroup, repeating, and other "helpers" and I thought hPid, and hIPC were it | 15:10 |
alanmeadows | s/repeating/reaping/ | 15:11 |
portdirect | they no longer are; looking at the kubelet source, there's no way to disable this | 15:11 |
alanmeadows | this feels like a k8s gap | 15:11 |
portdirect | and for everyone but us - i think what it does is an improvement | 15:11 |
portdirect | no disagreement there | 15:12 |
alanmeadows | sure, just feel like there needs to be a "don't get smart" button | 15:12 |
portdirect | rkt stage 1 fly would offer this | 15:12 |
alanmeadows | libvirt is just one of several use cases | 15:12 |
portdirect | so - i think this to me suggests two things | 15:12 |
portdirect | 1) we need a fix to this NOW, is the above the right way to do this? | 15:13 |
portdirect | 2) lets use the fix we end up with, and get a bug opened with k8s to support "dumb" containers - just like the good ol' days | 15:13 |
portdirect | the thought behind what i'm doing above is that we essentially run libvirt as a transient unit on the host | 15:15 |
srwilkers | the approach above seems acceptable to me, unless im missing something | 15:15 |
portdirect | so for pretty much the whole world - we get normal operation | 15:15 |
portdirect | the one thing being that we dont specify a name for the transient unit - so systemd assigns one | 15:16 |
portdirect | this allows the pod to be restarted | 15:16 |
portdirect | or even the chart to be removed, and qemu processes will be left running | 15:16 |
portdirect | and then when the pod/chart comes back - libvirt will start up in a new scope, but manage the qemus left in the old one just fine | 15:17 |
portdirect | seem sane? | 15:17 |
alanmeadows | we validated it can not only see them but can touch them? | 15:18 |
srwilkers | i think so | 15:18 |
portdirect | alanmeadows: yes | 15:18 |
portdirect | though i do still need to check when using the cgroupfs driver for docker/k8s | 15:19 |
portdirect | that this still works, and that also leads nicely into the next point | 15:19 |
alanmeadows | what are the interactions of this and the recommendation to disable the hugetlb cgroup in the boot parameters | 15:19 |
alanmeadows | are both still required? | 15:19 |
portdirect | no - this removes that requirement | 15:20 |
portdirect | we super need to gate this - once we have fixed this issue - I really want to get a lightweight gate in that just confirms that the libvirt chart can be deployed, start a vm, and then be removed and deployed again, with 0 impact on the running vm | 15:21 |
alanmeadows | last question | 15:21 |
portdirect | the end of this would probably be initiating a reboot | 15:21 |
portdirect | I dont think openstack would be required for this gate | 15:21 |
srwilkers | portdirect: yeah, was going to see if we could include that in the gate rework you're going to chat about later | 15:21 |
alanmeadows | if cgroup_disable=hugetlb is still leveraged, this doesn't care and operates fine? | 15:21 |
portdirect | yes | 15:21 |
portdirect | it's why on l35 i get the cgroups to manually use/override dynamically: https://review.openstack.org/#/c/607072/2/libvirt/templates/bin/_libvirt.sh.tpl | 15:22 |
portdirect | we ok here? to leave any further convo to review? | 15:24 |
srwilkers | yeah, works for me | 15:24 |
portdirect | ok | 15:25 |
portdirect | #topic Calico V3 | 15:25 |
*** openstack changes topic to "Calico V3 (Meeting topic: openstack-helm)" | 15:25 | |
portdirect | so i dont think anticw is here | 15:25 |
portdirect | but theres been a load of work done on updating our now long in the tooth calico chart | 15:25 |
portdirect | adding v3 support | 15:26 |
portdirect | https://review.openstack.org/#/c/607065/ | 15:27 |
portdirect | please review away | 15:27 |
portdirect | I'm super excited about this - as it offers a ray of hope for the future, that we can get out of the quagmire of iptables rules from the kube-proxy and move to ipvs | 15:27 |
portdirect | but baby steps... | 15:27 |
*** anticw has joined #openstack-meeting-5 | 15:28 | |
portdirect | hey anticw - we were just talking about you | 15:29 |
srwilkers | cool. will review this proper later today | 15:29 |
portdirect | anything you'd like to point out re the calico v3 work? | 15:29 |
anticw | it works | 15:30 |
anticw | there are some cosmetic changes done to try to stay aligned with upstream | 15:31 |
anticw | not all of those are required, but having them means a later upgrade should be easier | 15:31 |
portdirect | sounds great anticw | 15:32 |
portdirect | thx for your work on this | 15:33 |
anticw | np, the other cleanups people brought up i've put on a list and we can decide which of those are needed | 15:34 |
anticw | as you pointed out some of them run counter to a uniform interface to other CNS | 15:34 |
anticw | CNIs | 15:34 |
portdirect | sure - from what i have seen the core is good and solid, and the only real discussion may be around some of the config entrypoints | 15:35 |
portdirect | but i think we can hash that out in review | 15:35 |
srwilkers | works for me | 15:37 |
portdirect | ok | 15:38 |
portdirect | #topic MariaDB | 15:38 |
*** openstack changes topic to "MariaDB (Meeting topic: openstack-helm)" | 15:38 | |
portdirect | so I've got a wip up here: https://review.openstack.org/#/c/604556/ that i hope radically improves our galera clustering ability | 15:38 |
portdirect | ive been testing it reasonably hard | 15:39 |
portdirect | the biggest gaps atm that i'm aware of are the need to handle locks on configmaps better so we get ACID-like use out of them | 15:39 |
portdirect | and also get xtrabackup working again | 15:40 |
portdirect | thankfully both of these are relatively simple, though the configmap mutex may require a bit of time | 15:40 |
portdirect | would be great to get people to run this through its paces, and report back any shortcomings | 15:41 |
srwilkers | even if it does, i think this is a step in a better direction. i've been playing with some of the changes for a bit now, and im pretty happy with it thus far | 15:41 |
portdirect | ok - so the last thing from me this week: | 15:42 |
portdirect | #topic Gate intervention | 15:42 |
*** openstack changes topic to "Gate intervention (Meeting topic: openstack-helm)" | 15:42 | |
portdirect | evardjp is planning on doing an extensive overhaul of the gates, and bringing some much needed sanity to them | 15:43 |
portdirect | though he's away this week - boo! | 15:43 |
portdirect | that said, theres an urgent need to get our gates in a slightly better state than they are today | 15:43 |
portdirect | so after this meeting im planning on refactoring some of them to get us to a point where things can merge without one million retries | 15:44 |
srwilkers | that'd be great | 15:44 |
portdirect | the main method to do this will be to cut out duplicate tests - and also potentially adding an extra gate, so we can split the load | 15:45 |
srwilkers | not sure if it matters now, but do we want to consider moving some of the checks to experimental checks (where it makes sense), until we can get the larger overhaul started/completed? | 15:45 |
portdirect | as most failures seem to be the nodepool vm's just being pushed harder than they can take | 15:45 |
portdirect | srwilkers: if by the end of day i've not made significant progress - i think that may be the short term bandage we need | 15:46 |
srwilkers | portdirect: yeah. i was playing around with some of the osh-infra gates just to see how things performed when the logging and monitoring charts were split into separate jobs | 15:46 |
portdirect | while on the subject of gates: | 15:47 |
portdirect | #topic Armada gate | 15:48 |
*** openstack changes topic to "Armada gate (Meeting topic: openstack-helm)" | 15:48 | |
portdirect | srwilkers: you're up | 15:48 |
srwilkers | i've got a few changes pending for the armada gate in openstack-helm | 15:48 |
srwilkers | the first adds the Elasticsearch admin password to the nagios chart definition, as the current nagios chart supports querying elasticsearch for logged events | 15:49 |
srwilkers | the second adds radosgw to the lma manifest, along with the required overrides to take advantage of the s3 support for elasticsearch | 15:49 |
srwilkers | the third is more reactive, as it seems the rabbitmq helm tests fail sporadically in the armada gate. that change proposes disabling them for the time being | 15:50 |
srwilkers | and the fourth is the most important in my mind. it's the introduction of an ocata armada gate. and the question becomes: do we sunset the newton armada gate? | 15:51 |
portdirect | for rabbitmq - we prob dont need to run as many as we do in the upstream gates | 15:51 |
srwilkers | portdirect: probably not. i can update that patchset to instead reduce us down to one rabbit deployment | 15:51 |
portdirect | ++ | 15:52 |
portdirect | we got consensus at the ptg to sunset newton totally | 15:52 |
portdirect | and move the default to ocata | 15:53 |
srwilkers | thats why im leaning towards sunsetting the newton armada gate with the ocata armada patchset, along with avoiding adding another 5 node check to our runs | 15:53 |
portdirect | sounds good - though I think the 1st step would be to make ocata images the defaults in charts | 15:54 |
lamt | are we sunsetting newton for just the armada job or all the jobs? | 15:54 |
lamt | I volunteer to do that | 15:54 |
srwilkers | lamt: nice :) | 15:54 |
portdirect | lamt: if you could that would be awesome | 15:54 |
lamt | will start - those newton images start to pain me anyway | 15:55 |
portdirect | please add a loci newton gate though | 15:55 |
lamt | will do | 15:55 |
srwilkers | unrelated portdirect: we can take my last point wrt the values spec offline, so we have time for open discussion | 15:55 |
portdirect | ok - sounds good | 15:55 |
srwilkers | we can handle that in the #openstack-helm channel | 15:55 |
portdirect | #topic open discussion / review needed | 15:55 |
*** openstack changes topic to "open discussion / review needed (Meeting topic: openstack-helm)" | 15:55 | |
srwilkers | crickets :) | 15:57 |
portdirect | ok - lets wrap up then | 15:57 |
portdirect | #endmeeting | 15:57 |
*** openstack changes topic to "OpenStack Meetings || https://wiki.openstack.org/wiki/Meetings/" | 15:57 | |
openstack | Meeting ended Tue Oct 2 15:57:54 2018 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:57 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/openstack_helm/2018/openstack_helm.2018-10-02-15.01.html | 15:57 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/openstack_helm/2018/openstack_helm.2018-10-02-15.01.txt | 15:57 |
openstack | Log: http://eavesdrop.openstack.org/meetings/openstack_helm/2018/openstack_helm.2018-10-02-15.01.log.html | 15:57 |
*** gagehugo has left #openstack-meeting-5 | 16:00 | |
*** skazi has joined #openstack-meeting-5 | 16:21 | |
*** iyamahat has quit IRC | 17:01 | |
*** derekh has quit IRC | 17:03 | |
*** spiette has quit IRC | 17:26 | |
*** iyamahat has joined #openstack-meeting-5 | 17:26 | |
*** spiette has joined #openstack-meeting-5 | 17:29 | |
*** spiette has quit IRC | 17:29 | |
*** spiette has joined #openstack-meeting-5 | 17:38 | |
*** lemko has quit IRC | 18:17 | |
*** spiette has quit IRC | 19:18 | |
*** spiette has joined #openstack-meeting-5 | 19:21 | |
*** sgrasley__ has joined #openstack-meeting-5 | 23:00 | |
*** hongbin has quit IRC | 23:02 | |
*** sgrasley_ has quit IRC | 23:04 | |
*** njohnston has quit IRC | 23:17 | |
*** sgrasley_ has joined #openstack-meeting-5 | 23:46 | |
*** sgrasley__ has quit IRC | 23:49 |
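
The cgroup point raised under the "Libvirt restarts" topic (the kubelet targets pids to reap by cgroup, so the proposal moves libvirt out of the k8s managed cgroups) could be checked with something like the sketch below. This is only an illustration, not the code from the linked review: it parses text in the `/proc/<pid>/cgroup` format and reports whether any path looks like it sits under the kubelet's pod hierarchy. The `kubepods` marker, the function names, and both sample strings (including the `run-r0123.scope` unit name) are assumptions made for the example.

```python
from typing import Dict


def parse_cgroup(text: str) -> Dict[str, str]:
    """Map each controller list to its cgroup path.

    Expects the /proc/<pid>/cgroup format, one entry per line:
    hierarchy-ID:controller-list:cgroup-path
    """
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # maxsplit=2 keeps the path intact even if it contains ':'
        _hierarchy, controllers, path = line.split(":", 2)
        entries[controllers] = path
    return entries


def under_pod_cgroup(text: str, marker: str = "kubepods") -> bool:
    """Return True if any cgroup path contains the marker.

    'kubepods' is assumed here to identify the kubelet-managed hierarchy.
    """
    return any(marker in path for path in parse_cgroup(text).values())


# Hypothetical sample: a process still inside a pod's cgroups.
SAMPLE_POD = (
    "11:hugetlb:/kubepods/besteffort/pod1234/abcd\n"
    "4:pids:/kubepods/besteffort/pod1234/abcd\n"
)

# Hypothetical sample: a process moved into a transient systemd scope.
SAMPLE_SCOPE = (
    "11:hugetlb:/\n"
    "4:pids:/system.slice/run-r0123.scope\n"
)

if __name__ == "__main__":
    print(under_pod_cgroup(SAMPLE_POD))
    print(under_pod_cgroup(SAMPLE_SCOPE))
```

On a real host the input would come from reading `/proc/<pid>/cgroup` for a qemu pid; a gate like the one discussed above could assert that the result stays false across a libvirt pod restart.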