tbarron | bswartz: true. We lose the "statement of intent". | 00:01 |
---|---|---|
tbarron | bswartz: if you make the feature matrix go away though, there is no public statement of intent. | 00:01 |
tbarron | bswartz: maybe we just report what is, not what is claimed. | 00:02 |
bswartz | tbarron: well that gets back to the concern Red Hat has about knowing which skips are good skips and which skips are bad skips | 00:02 |
tbarron | bswartz: so to take a concrete example, glusterfs native claims create-from-snapshot support | 00:02 |
tbarron | bswartz: and it reportedly works, but fails in CI with timeouts | 00:02 |
bswartz | yes the simple thing to do is to not worry about it on the testing side | 00:03 |
tbarron | bswartz: so I have no idea if it really works at decent cloud scale or is just broken | 00:03 |
tbarron | bswartz: so I'm inclined to remove the claim that it works and get it passing | 00:03 |
tbarron | bswartz: well good-skips vs bad-skips really only makes sense if there *is* a claim to support a feature. | 00:04 |
bswartz | tbarron: there has to be an alarm bell somewhere that goes off if a capability that was previously reported just stops getting reported | 00:04 |
bswartz | if a software upgrade caused a capability to stop working then a customer would regard that as a regression | 00:05 |
tbarron | bswartz: atm I am leaning in the direction you were proposing, just report what we see and downstream we say we saw that netapp is certified for feature X with NFS but we didn't see it for CIFS or whatever | 00:05 |
tbarron | bswartz: on the alarm bell, how about a timeline of results? | 00:05 |
bswartz | yeah you need some comparison to past results | 00:05 |
tbarron | bswartz: it will be the driver maintainer's responsibility to track if there appears to be a regression in capabilities. | 00:06 |
bswartz | tbarron: that's probably insufficient for a company that's on the hook for support | 00:06 |
bswartz | if netapp dedupe suddenly stops working when you upgrade from newton to ocata, you're likely to get a call about that | 00:07 |
tbarron | bswartz: well, we would see that netapp for osp11 (ocata) isn't certified any more. | 00:07 |
bswartz | you want to know that at least everything that worked last time is still working | 00:07 |
tbarron | bswartz: might be some joint motivation resulting | 00:07 |
bswartz | tbarron: how would that work? would part of the certification be to compare the test results with prior runs? | 00:08 |
tbarron | bswartz: agree, ratchet with red flag makes sense | 00:08 |
tbarron | bswartz: I think that makes sense, that's not automated atm so far as I know but shouldn't be hard | 00:08 |
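A minimal sketch of the "ratchet with red flag" idea discussed above, assuming each CI run can be reduced to the set of capabilities whose tests actually passed. The function name and data shapes are illustrative, not existing manila tooling:

```python
def capability_regressions(previous_runs, current_run):
    """Flag capabilities that passed in some earlier run but not in this one.

    previous_runs: iterable of sets of capability names seen passing before.
    current_run:   set of capability names passing now.
    Returns a sorted list of capabilities that silently disappeared.
    """
    previously_seen = set().union(*previous_runs) if previous_runs else set()
    return sorted(previously_seen - set(current_run))
```

Anything this returns would be the "alarm bell": a capability that was previously reported as working and then stopped showing up in results.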
bswartz | okay so the proposal is: | 00:08 |
bswartz | 1) in tempest, just automatically skip tests if there's no capability to run them -- no config flag needed | 00:09 |
bswartz | 2) rely on looking at test results over time to detect regressions in the capabilities over time | 00:09 |
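A hedged sketch of proposal (1): skip tests automatically when the backend does not report the capability, with no config flag. This is illustrative Python, not real manila-tempest-plugin code; `get_backend_capabilities` and the capability names are assumed stand-ins for however a test would read the scheduler's reported pool stats.

```python
import functools
import unittest


def requires_capability(name):
    """Skip the decorated test unless the backend reports capability `name`.

    Illustrative only: `get_backend_capabilities` is an assumed hook,
    not an existing manila-tempest-plugin API.
    """
    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(self, *args, **kwargs):
            caps = self.get_backend_capabilities()
            if not caps.get(name, False):
                raise unittest.SkipTest("backend does not report %r" % name)
            return test_func(self, *args, **kwargs)
        return wrapper
    return decorator


class FakeShareTest(unittest.TestCase):
    """Stand-in test case; a real one would query the scheduler's pool stats."""

    BACKEND_CAPS = {"create_share_from_snapshot_support": True,
                    "revert_to_snapshot_support": False}

    def get_backend_capabilities(self):
        return self.BACKEND_CAPS

    @requires_capability("create_share_from_snapshot_support")
    def test_create_from_snapshot(self):
        pass  # would exercise the feature against the backend

    @requires_capability("revert_to_snapshot_support")
    def test_revert_to_snapshot(self):
        pass  # auto-skipped: capability not reported
```

Run under any unittest-compatible runner, the second test is recorded as a skip without anyone setting a flag in tempest.conf.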
tbarron | bswartz: I think this works for capability based tests and for "certifications" | 00:11 |
tbarron | bswartz: in gate we have another class of tests, backend vs frontend, migration (intensive, false negative likely), etc. | 00:11 |
tbarron | bswartz: this is somewhat complicated by intersection of driver-assist (really a capability) for features like migration. | 00:12 |
bswartz | for those that can't easily be detected automatically the fallback is flags in tempest.conf -- exactly what we have today | 00:13 |
tbarron | bswartz: insofar as these are really capabilities (driver assisted, etc.) probably we should drive them to the approach outlined earlier. | 00:13 |
tbarron | bswartz: and, as you just said, the remainder can be RUN* flags in tempest.conf | 00:13 |
tbarron | bswartz: or, as gouthamr and ganso suggested earlier, we can over time get rid of those as well | 00:14 |
bswartz | tbarron: eh, the term "capability" is an accurate way to describe what you're saying in English, but in Manila/Cinder it has a specific meaning which only relates to the scheduler | 00:14 |
tbarron | bswartz: by moving such tests under their own path, so that it's easy to control them by regexes | 00:14 |
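The path-plus-regex idea can be sketched as an stestr-style selection filter; the helper and the test ids below are made up for illustration, not taken from the real test tree:

```python
import re


def select_tests(test_ids, include, exclude=()):
    """Keep ids matching any `include` regex and no `exclude` regex,
    the way a testr/stestr-style regex selection would."""
    inc = [re.compile(p) for p in include]
    exc = [re.compile(p) for p in exclude]
    return [t for t in test_ids
            if any(p.search(t) for p in inc)
            and not any(p.search(t) for p in exc)]
```

With migration-style tests moved under their own path, excluding them is a single regex rather than a per-feature RUN_* flag.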
bswartz | the driver-assisted migration stuff really isn't a "capability" in the scheduler sense | 00:14 |
tbarron | bswartz: agreed. but perhaps a driver that can do it should advertise that by some mechanism that we can pick up. | 00:15 |
tbarron | bswartz: it could be a periodic announcement that gets cast to the scheduler even though the scheduler doesn't use it, dunno. | 00:15 |
bswartz | tbarron: I see no technical advantage to that approach over a flag in tempest.conf | 00:16 |
tbarron | bswartz: flag in tempest.conf can be set inconsistently with what the driver itself was coded to do. | 00:17 |
bswartz | bugs are bugs | 00:17 |
tbarron | bswartz: that's not a bug, that's human error | 00:17 |
bswartz | the actual code path that the driver uses to let the manager know it can do assisted migration could also have a bug in it | 00:17 |
tbarron | bswartz: it's a bug if my driver says it has the capability to revert to snapshot but in fact it fails. | 00:17 |
bswartz | both approaches can cause false negatives and false positives if a human screws up | 00:18 |
tbarron | bswartz: it's human error if someone claims to tempest that the driver has a capability but it doesn't | 00:18 |
bswartz | tbarron: that would be a true negative though | 00:19 |
bswartz | the tests would catch that particular human error | 00:19 |
tbarron | bswartz: agree, I'm mostly playing this out. There is this class of quasi-non-scheduler capabilities that will need to be handled, not sure of the best approach. | 00:19 |
tbarron | bswartz: probably best to start with the true scheduler-capabilities for auto-detect and treat everything else as a flag. | 00:19 |
bswartz | the things tests can't catch are if you implement something but don't tell tempest so it skips those tests, or if you implement a mechanism for tempest to know what to skip and that mechanism screws up somehow | 00:20 |
bswartz | in both cases tempest says green when something is wrong | 00:20 |
bswartz | tbarron: agreed | 00:21 |
gouthamr | related question: do manila-tempest-test options need a deprecation warning? | 00:21 |
bswartz | gouthamr: emphatically no | 00:22 |
tbarron | good | 00:22 |
bswartz | the only harm caused by surprise breakage of tempest options is that a few CI systems get screwed up | 00:22 |
bswartz | as long as we give CI maintainers a heads up I think we should be okay | 00:23 |
gouthamr | what if these options were being used in a non-CI use case? | 00:23 |
bswartz | what use case is that? | 00:23 |
tbarron | bswartz: gouthamr And cert systems or (as the tempest people would say) cloud admins who run tempest to validate their clouds, BUT | 00:23 |
gouthamr | ^ yes.. | 00:23 |
tbarron | let's be practical -- someday that might be a tough issue but for manila we can handle all those cases 1-1 today | 00:24 |
tbarron | in 5 years, maybe it will be a different story | 00:24 |
gouthamr | agree with that... so we shouldn't leave those unused options in the config file | 00:24 |
bswartz | gouthamr: I'm pretty sure you can't do that with manila today | 00:25 |
gouthamr | bswartz: true... we'll know exactly when we get around to manila certs :) | 00:25 |
* tbarron believes in pragmatism over principle - recognize the principle, recognize the need to scale, etc. but don't let it stop us from doing the right thing today | 00:25 | |
bswartz | maybe someday when the stability fairy sprinkles some dust over tempest-lib | 00:25 |
tbarron | ack | 00:25 |
tbarron | so i think ganso's patch is fine for now w/o any special todo, and we all know we want to move beyond it and what is there | 00:26 |
tbarron | today | 00:26 |
gouthamr | bswartz tbarron: https://review.openstack.org/#/c/427663 does what it's supposed to do.. let's begin the effort of cleaning up these options | 00:26 |
gouthamr | +1 | 00:27 |
bswartz | NetApp CI 7:56 PM | 00:28 |
bswartz | manila-cDOT-no-ss SUCCESS in 3h 06m 56s | 00:28 |
bswartz | manila-cDOT-ss SUCCESS in 3h 14m 41s | 00:28 |
bswartz | gouthamr: >3 hours? ;-( | 00:28 |
tbarron | stop putting "capability" options in tempest.conf, start looking at real capabilities, use RUN options for what is left over, and wither those away by paths and regexes | 00:28 |
gouthamr | bswartz: i think the CI showed signs of friday fatigue | 00:28 |
tbarron | gouthamr: does your 3rd party CI use lots of memory and time? | 00:29 |
bswartz | I used to make fun of tripleo ci for taking more than 2 hours | 00:29 |
bswartz | this is embarrassing! | 00:29 |
gouthamr | bswartz: both nodes for that came from one pizza box (your words).. we seem to have some network latency even hitting our own pip cache on the same network | 00:30 |
ganso | gouthamr, tbarron, bswartz: Thanks! | 00:30 |
tbarron | gouthamr: not a criticism, I'm just interested in the whole discussion - beyond manila - of how to get gate tempest jobs running more reliably given current dsvm constraints | 00:30 |
bswartz | oh devstack is the slow part not the tests themselves? | 00:30 |
gouthamr | bswartz: yes | 00:30 |
bswartz | gouthamr: then there's hope of fixing it | 00:30 |
gouthamr | tbarron: yes... devstack setup seems to have taken forever on those jobs | 00:30 |
tbarron | in gate we see a log msg (frequently) whenever devstack build takes more than 20min | 00:31 |
tbarron | gouthamr: how much RAM for your devstack nodes? (a separate question, not a netapp network/caching question) | 00:31 |
gouthamr | tbarron: 8gb vRAM, 2vCPU | 00:31 |
ganso | gouthamr: I replied to your question | 00:32 |
tbarron | gouthamr: k, you are playing "fair", for better or worse. | 00:32 |
gouthamr | tbarron: our new CI system is still a science project :D we're fixing things as we go.. it's been running reliably only for the last couple of weeks.. | 00:34 |
tbarron | gouthamr: but you don't need service VMs and don't demand much of neutron either, so probably have much less tendency towards bloat than our jobs in gate | 00:34 |
bswartz | tbarron: yes the netapp jobs should be blindingly fast | 00:36 |
gouthamr | tbarron: true.. but we're setting up the typical stack with some unnecessary components: cinder, glance, swift... | 00:36 |
bswartz | our main resource constraint isn't CPU or RAM but actual storage controllers, since a job consumes a whole cluster | 00:36 |
gouthamr | tbarron: should remove those | 00:36 |
bswartz | gouthamr: -1 | 00:36 |
bswartz | they're needed for scenario tests | 00:36 |
gouthamr | bswartz: really? i thought we only need nova | 00:36 |
bswartz | glance and neutron are | 00:37 |
bswartz | not cinder or swift | 00:37 |
gouthamr | oh yes.. and glance | 00:37 |
gouthamr | :) | 00:37 |
* bswartz dreams of a day when he can run nova without glance | 00:37 | |
gouthamr | in the beginning, there was nova | 00:37 |
gouthamr | heh, is that where "big tent" came from? | 00:38 |
bswartz | gouthamr: it comes from politics | 00:40 |
gouthamr | :P | 00:40 |
bswartz | gouthamr: a "big tent" political party tries to attract multiple groups of voters | 00:40 |
gouthamr | bswartz: i thought you were being condescending... that's nice etymology. | 00:42 |
bswartz | gouthamr: no it has a long history | 00:42 |
bswartz | traditionally in the USA the republican party has been viewed as the "big tent" party because it attracts different voter groups who have very little in common | 00:43 |
gouthamr | ah... thanks! sounds like a relatively well known term i was unaware of... | 00:44 |
tbarron | so your backend can integrate with neutron network namespaces, doesn't need nova to run a service VM, doesn't need glance for service VM image, doesn't need cinder for backing storage for filesystems. Nor does it need linux host capabilities to do exports. Your requirements on dsvm should be minimal. | 00:51 |
tbarron | Today you need nova/glance for compute instance clients. | 00:51 |
tbarron | for scenarios. | 00:51 |
tbarron | Maybe tomorrow containers on tenant defined networks with the same kind of topologies would give the same coverage. | 00:52 |
bswartz | tbarron: neutron is needed for scenario tests | 00:52 |
tbarron | neutron wouldn't go away. | 00:52 |
tbarron | that was where I'm going. | 00:53 |
bswartz | that's fine because I stopped hating neutron more than a year ago | 00:53 |
bswartz | <3 neutron | 00:53 |
gouthamr | haha | 00:53 |
tbarron | Thinking about the whole cinder + manila as SDS for kubernetes containers thing, I have been wondering: who is running the network? | 00:54 |
bswartz | tbarron: easy -- with cinder there is no network | 00:54 |
tbarron | bswartz: exactly | 00:55 |
bswartz | tbarron: and for manila you just use one of our many flat network plugins | 00:55 |
tbarron | but we have to figure it out, and that answer may be lame. | 00:55 |
tbarron | not sure. | 00:55 |
bswartz | people that run containers aren't doing it for security -- so it stands to reason that secure multitenant networks is not a requirement | 00:56 |
tbarron | hmmm. | 00:56 |
tbarron | No tenant-defined isolated networks? | 00:56 |
bswartz | tbarron: I'm pretty sure that "tenant" and "containers" don't go together | 00:57 |
tbarron | That need to be segmented and tunneled b/c the tenants aren't actually adjacent? | 00:57 |
bswartz | if you want containers and multitenancy you need another layer in the middle to give you the security that matters for that use case | 00:57 |
bswartz | (such as nova) | 00:58 |
tbarron | bswartz: I'm (honestly) missing something, why wouldn't "tenants" just want lighter weight, faster compute for micro-workloads and still want isolated tenant-defined networks for these | 00:58 |
bswartz | tbarron: they do, but I'm talking about something else | 00:58 |
tbarron | I can run containers in different network namespaces, don't need VMs | 00:58 |
bswartz | you use containers because of the packaging and efficiency they provide | 00:59 |
tbarron | If that's what is wanted. | 00:59 |
bswartz | you want multitenancy so you can use other people's resources | 00:59 |
bswartz | the only way to get both of those things is to run containers inside vms, where the VMs provide tenant isolation and the containers provide the rest of the goodness you're after | 00:59 |
tbarron | bswartz: that first statement is what I'm actually not sure of. | 01:00 |
bswartz | which statement? | 01:01 |
bswartz | this? | 01:01 |
tbarron | bswartz: if I use something like systemd-nspawn containers I can get (1) separate namespaces, (2) separate mount namespaces, (3) cgroup type limits, etc. | 01:01 |
bswartz | this? (08:58:48 PM) tbarron: I can run containers in different network namespaces, don't need VMs | 01:01 |
tbarron | ^^ ack | 01:01 |
bswartz | as long as you don't share the hardware with anyone you don't trust, sure | 01:01 |
tbarron | well "the only way to get both of these things is to run containers inside vms" | 01:01 |
bswartz | the problem comes when you have people who don't trust each other sharing hardware, like in an amazon cloud type environment | 01:02 |
tbarron | You have to trust the cloud admin anyways. | 01:02 |
bswartz | yes but the admin doesn't have to trust his customers | 01:02 |
tbarron | right, and there you see a deficiency for containers vs VMs? | 01:03 |
bswartz | running a container in an unvirtualized environment is extremely dangerous | 01:03 |
bswartz | because kernel exploits are relatively common and container security is like lolwut? | 01:03 |
* tbarron is motivated by thoughts of using containers or namespaces instead of service vms | 01:04 | |
tbarron | but that is admittedly a different case since service vms are under admin control | 01:04 |
bswartz | tbarron: if the admin controls the containers then there's no issue | 01:04 |
tbarron | bswartz: yup | 01:05 |
bswartz | the issue I'm raising is the one where you're running containers from other people you don't trust -- you must virtualize those | 01:05 |
tbarron | bswartz: OK, point taken. So for scenario tests, where we need compute consumers mounting our fileshares, we need VMs for the multi-tenancy use case. So we depend on nova and glance at minimum. | 01:06 |
bswartz | and neutron | 01:07 |
tbarron | Besides neutron as constant. | 01:07 |
bswartz | neutron isn't needed if you're testing LVM or ZFS, or netapp-singlesvm without doing scenario tests | 01:07 |
tbarron | Yeah, "constant" only for multi-svm/DHSS=True. | 01:08 |
bswartz | tbarron: even then, it's only because of the neutron network plugin -- in theory someone could write another network plugin that didn't require neutron and use it with the netapp-multisvm driver | 01:09 |
tbarron | bswartz: don't disagree, but that plugin must support tenant-defined networks in separate network namespaces and network segmentation. | 01:10 |
tbarron | bswartz: I guess it doesn't *have* to, just not sure it would be much use otherwise. | 01:10 |
bswartz | yeah now that I think about it that would be fairly nonsensical without some form of neutron emulation or neutron interaction | 01:11 |
bswartz | may as well use neutron in that case | 01:11 |
tbarron | bswartz: It could be 100% ipv6 with BGP/VPN segmentation | 01:11 |
tbarron | bswartz: but if that evolves for Openstack it will likely be called neutron | 01:12 |
tbarron | good thing you <3 neutron | 01:13 |
openstackgerrit | Merged openstack/manila master: Remove redundant revert-to-snapshot test option https://review.openstack.org/427663 | 11:25 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!