tbarron | bswartz: true. We lose the "statement of intent". | 00:01 |
---|---|---|
tbarron | bswartz: if you make the feature matrix go away though, there is no public statement of intent. | 00:01 |
tbarron | bswartz: maybe we just report what is, not what is claimed. | 00:02 |
bswartz | tbarron: well that gets back to the concern Red Hat has about knowing which skips are good skips and which skips are bad skips | 00:02 |
tbarron | bswartz: so to take a concrete example, glusterfs native claims create-from-snapshot support | 00:02 |
tbarron | bswartz: and it reportedly works, but fails in CI with timeouts | 00:02 |
bswartz | yes the simple thing to do is to not worry about it on the testing side | 00:03 |
tbarron | bswartz: so I have no idea if it really works at decent cloud scale or is just broken | 00:03 |
tbarron | bswartz: so I'm inclined to remove the claim that it works and get it passing | 00:03 |
tbarron | bswartz: well good-skips vs bad-skips really only makes sense if there *is* a claim to support a feature. | 00:04 |
bswartz | tbarron: there has to be an alarm bell somewhere that goes off if a capability that was previously reported just stops getting reported | 00:04 |
bswartz | if a software upgrade caused a capability to stop working then a customer would regard that as a regression | 00:05 |
tbarron | bswartz: atm I am leaning in the direction you were proposing, just report what we see and downstream we say we saw that netapp is certified for feature X with NFS but we didn't see it for CIFS or whatever | 00:05 |
tbarron | bswartz: on the alarm bell, how about a timeline of results? | 00:05 |
bswartz | yeah you need some comparison to past results | 00:05 |
tbarron | bswartz: it will be the driver maintainer's responsibility to track if there appears to be a regression in capabilities. | 00:06 |
bswartz | tbarron: that's probably insufficient for a company that's on the hook for support | 00:06 |
bswartz | if netapp dedupe suddenly stops working when you upgrade from newton to ocata, you're likely to get a call about that | 00:07 |
tbarron | bswartz: well, we would see that netapp for osp11 (ocata) isn't certified any more. | 00:07 |
bswartz | you want to know that at least everything that worked last time is still working | 00:07 |
tbarron | bswartz: might be some joint motivation resulting | 00:07 |
bswartz | tbarron: how would that work? would part of the certification be to compare the test results with prior runs? | 00:08 |
tbarron | bswartz: agree, ratchet with red flag makes sense | 00:08 |
tbarron | bswartz: I think that makes sense, that's not automated atm so far as I know but shouldn't be hard | 00:08 |
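A minimal sketch of the "ratchet with red flag" idea discussed above, assuming each CI run can be reduced to the set of capabilities whose tests actually passed. The function name and data shapes are illustrative, not existing manila tooling:

```python
def capability_regressions(previous_runs, current_run):
    """Flag capabilities that passed in some earlier run but not in this one.

    previous_runs: iterable of sets of capability names seen passing before.
    current_run:   set of capability names passing now.
    Returns a sorted list of capabilities that silently disappeared.
    """
    previously_seen = set().union(*previous_runs) if previous_runs else set()
    return sorted(previously_seen - set(current_run))
```

Anything this returns would be the "alarm bell": a capability that was previously reported as working and then stopped showing up in results.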
bswartz | okay so the proposal is: | 00:08 |
bswartz | 1) in tempest, just automatically skip tests if there's no capability to run them -- no config flag needed | 00:09 |
bswartz | 2) rely on looking at test results over time to detect regressions in the capabilities over time | 00:09 |
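A hedged sketch of proposal (1): skip tests automatically when the backend does not report the capability, with no config flag. This is illustrative Python, not real manila-tempest-plugin code; `get_backend_capabilities` and the capability names are assumed stand-ins for however a test would read the scheduler's reported pool stats.

```python
import functools
import unittest


def requires_capability(name):
    """Skip the decorated test unless the backend reports capability `name`.

    Illustrative only: `get_backend_capabilities` is an assumed hook,
    not an existing manila-tempest-plugin API.
    """
    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(self, *args, **kwargs):
            caps = self.get_backend_capabilities()
            if not caps.get(name, False):
                raise unittest.SkipTest("backend does not report %r" % name)
            return test_func(self, *args, **kwargs)
        return wrapper
    return decorator


class FakeShareTest(unittest.TestCase):
    """Stand-in test case; a real one would query the scheduler's pool stats."""

    BACKEND_CAPS = {"create_share_from_snapshot_support": True,
                    "revert_to_snapshot_support": False}

    def get_backend_capabilities(self):
        return self.BACKEND_CAPS

    @requires_capability("create_share_from_snapshot_support")
    def test_create_from_snapshot(self):
        pass  # would exercise the feature against the backend

    @requires_capability("revert_to_snapshot_support")
    def test_revert_to_snapshot(self):
        pass  # auto-skipped: capability not reported
```

Run under any unittest-compatible runner, the second test is recorded as a skip without anyone setting a flag in tempest.conf.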
tbarron | bswartz: I think this works for capability based tests and for "certifications" | 00:11 |
tbarron | bswartz: in gate we have another class of tests, backend vs frontend, migration (intensive, false negative likely), etc. | 00:11 |
tbarron | bswartz: this is somewhat complicated by intersection of driver-assist (really a capability) for features like migration. | 00:12 |
bswartz | for those that can't easily be detected automatically the fallback is flags in tempest.conf -- exactly what we have today | 00:13 |
tbarron | bswartz: insofar as these are really capabilities (driver assisted, etc.) probably we should drive them to the approach outlined earlier. | 00:13 |
tbarron | bswartz: and, as you just said, the remainder can be RUN* flags in tempest.conf | 00:13 |
tbarron | bswartz: or, as gouthamr and ganso suggested earlier, we can over time get rid of those as well | 00:14 |
bswartz | tbarron: eh, the term "capability" is an accurate way to describe what you're saying in English, but in Manila/Cinder it has a specific meaning which only relates to the scheduler | 00:14 |
tbarron | bswartz: by moving such tests under their own path, so that it's easy to control them by regexes | 00:14 |
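The path-plus-regex idea can be sketched as an stestr-style selection filter; the helper and the test ids below are made up for illustration, not taken from the real test tree:

```python
import re


def select_tests(test_ids, include, exclude=()):
    """Keep ids matching any `include` regex and no `exclude` regex,
    the way a testr/stestr-style regex selection would."""
    inc = [re.compile(p) for p in include]
    exc = [re.compile(p) for p in exclude]
    return [t for t in test_ids
            if any(p.search(t) for p in inc)
            and not any(p.search(t) for p in exc)]
```

With migration-style tests moved under their own path, excluding them is a single regex rather than a per-feature RUN_* flag.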
bswartz | the driver-assisted migration stuff really isn't a "capability" in the scheduler sense | 00:14 |
tbarron | bswartz: agreed. but perhaps a driver that can do it should advertise that by some mechanism that we can pick up. | 00:15 |
tbarron | bswartz: it could be a periodic announcement that gets cast to the scheduler even though the scheduler doesn't use it, dunno. | 00:15 |
bswartz | tbarron: I see no technical advantage to that approach over a flag in tempest.conf | 00:16 |
tbarron | bswartz: flag in tempest.conf can be set inconsistently with what the driver itself was coded to do. | 00:17 |
bswartz | bugs are bugs | 00:17 |
tbarron | bswartz: that's not a bug, that's human error | 00:17 |
bswartz | the actual code path that the driver uses to let the manager know it can do assisted migration could also have a bug in it | 00:17 |
tbarron | bswartz: it's a bug if my driver says it has the capability to revert to snapshot but in fact it fails. | 00:17 |
bswartz | both approaches can cause false negatives and false positives if a human screws up | 00:18 |
tbarron | bswartz: it's human error if someone claims to tempest that the driver has a capability but it doesn't | 00:18 |
bswartz | tbarron: that would be a true negative though | 00:19 |
bswartz | the tests would catch that particular human error | 00:19 |
tbarron | bswartz: agree, I'm mostly playing this out. There is this class of quasi-non-scheduler capabilities that will need to be handled, not sure of the best approach. | 00:19 |
tbarron | bswartz: probably best to start with the true scheduler-capabilities for auto-detect and treat everything else as a flag. | 00:19 |
bswartz | the things tests can't catch are if you implement something but don't tell tempest so it skips those tests, or if you implement a mechanism for tempest to know what to skip and that mechanism screws up somehow | 00:20 |
bswartz | in both cases tempest says green when something is wrong | 00:20 |
bswartz | tbarron: agreed | 00:21 |
gouthamr | related question: do manila-tempest-test options need a deprecation warning? | 00:21 |
bswartz | gouthamr: emphatically no | 00:22 |
tbarron | good | 00:22 |
bswartz | the only harm caused by surprise breakage of tempest options is that a few CI systems get screwed up | 00:22 |
bswartz | as long as we give CI maintainers a heads up I think we should be okay | 00:23 |
gouthamr | what if these options were being used in a non-CI use case? | 00:23 |
bswartz | what use case is that? | 00:23 |
tbarron | bswartz: gouthamr And cert systems or (as the tempest people would say) cloud admins who run tempest to validate their clouds, BUT | 00:23 |
gouthamr | ^ yes.. | 00:23 |
tbarron | let's be practical -- someday that might be a tough issue but for manila we can handle all those cases 1-1 today | 00:24 |
tbarron | in 5 years, maybe it will be a different story | 00:24 |
gouthamr | agree with that... so we shouldn't leave those unused options in the config file | 00:24 |
bswartz | gouthamr: I'm pretty sure you can't do that with manila today | 00:25 |
gouthamr | bswartz: true... we'll know exactly when we get around to manila certs :) | 00:25 |
* tbarron believes in pragmatism over principle - recognize the principle, recognize the need to scale, etc. but don't let it stop us from doing the right thing today | 00:25 | |
bswartz | maybe someday when the stability fairy sprinkles some dust over tempest-lib | 00:25 |
tbarron | ack | 00:25 |
tbarron | so i think ganso's patch is fine for now w/o any special todo, and we all know we want to move beyond it and what is there | 00:26 |
tbarron | today | 00:26 |
gouthamr | bswartz tbarron: https://review.openstack.org/#/c/427663 does what it's supposed to do.. let's begin the effort of cleaning up these options | 00:26 |
gouthamr | +1 | 00:27 |
bswartz | NetApp CI 7:56 PM | 00:28 |
bswartz | manila-cDOT-no-ss SUCCESS in 3h 06m 56s | 00:28 |
bswartz | manila-cDOT-ss SUCCESS in 3h 14m 41s | 00:28 |
bswartz | gouthamr: >3 hours? ;-( | 00:28 |
tbarron | stop putting "capability" options in tempest.conf, start looking at real capabilities, use RUN options for what is left over, and wither those away by paths and regexes | 00:28 |
gouthamr | bswartz: i think the CI showed signs of friday fatigue | 00:28 |
tbarron | gouthamr: does your 3rd party CI use lots of memory and time? | 00:29 |
bswartz | I used to make fun of tripleo ci for taking more than 2 hours | 00:29 |
bswartz | this is embarrassing! | 00:29 |
gouthamr | bswartz: both nodes for that came from one pizza box (your words).. we seem to have some network latency even hitting our own pip cache on the same network | 00:30 |
ganso | gouthamr, tbarron, bswartz: Thanks! | 00:30 |
tbarron | gouthamr: not a criticism, I'm just interested in the whole discussion - beyond manila - of how to get gate tempest jobs running more reliably given current dsvm constraints | 00:30 |
bswartz | oh devstack is the slow part not the tests themselves? | 00:30 |
gouthamr | bswartz: yes | 00:30 |
bswartz | gouthamr: then there's hope of fixing it | 00:30 |
gouthamr | tbarron: yes... devstack setup seems to have taken forever on those jobs | 00:30 |
tbarron | in gate we see a log msg (frequently) whenever devstack build takes more than 20min | 00:31 |
tbarron | gouthamr: how much RAM for your devstack nodes? (a separate question, not a netapp network/caching question) | 00:31 |
gouthamr | tbarron: 8gb vRAM, 2vCPU | 00:31 |
ganso | gouthamr: I replied to your question | 00:32 |
tbarron | gouthamr: k, you are playing "fair", for better or worse. | 00:32 |
gouthamr | tbarron: our new CI system is still a science project :D we're fixing things as we go.. it's been running reliably only for the last couple of weeks.. | 00:34 |
tbarron | gouthamr: but you don't need service VMs and don't demand much of neutron either, so probably have much less tendency towards bloat than our jobs in gate | 00:34 |
bswartz | tbarron: yes the netapp jobs should be blindingly fast | 00:36 |
gouthamr | tbarron: true.. but we're setting up the typical stack with some unnecessary components: cinder, glance, swift... | 00:36 |
bswartz | our main resource constraint isn't CPU or RAM but actual storage controllers, since a job consumes a whole cluster | 00:36 |
gouthamr | tbarron: should remove those | 00:36 |
bswartz | gouthamr: -1 | 00:36 |
bswartz | they're needed for scenario tests | 00:36 |
gouthamr | bswartz: really? i thought we only need nova | 00:36 |
bswartz | glance and neutron are | 00:37 |
bswartz | not cinder or swift | 00:37 |
gouthamr | oh yes.. and glance | 00:37 |
gouthamr | :) | 00:37 |
* bswartz dreams of a day when he can run nova without glance | 00:37 | |
gouthamr | in the beginning, there was nova | 00:37 |
gouthamr | heh, is that where "big tent" came from? | 00:38 |
bswartz | gouthamr: it comes from politics | 00:40 |
gouthamr | :P | 00:40 |
bswartz | gouthamr: a "big tent" political party tries to attract multiple groups of voters | 00:40 |
gouthamr | bswartz: i thought you were being condescending... that's nice etymology. | 00:42 |
bswartz | gouthamr: no it has a long history | 00:42 |
bswartz | traditionally in the USA the republican party has been viewed as the "big tent" party because it attracts different voter groups who have very little in common | 00:43 |
gouthamr | ah... thanks! sounds like a relatively well known term i was unaware of... | 00:44 |
tbarron | so your backend can integrate with neutron network namespaces, doesn't need nova to run a service VM, doesn't need glance for service VM image, doesn't need cinder for backing storage for filesystems. Nor does it need linux host capabilities to do exports. Your requirements on dsvm should be minimal. | 00:51 |
tbarron | Today you need nova/glance for compute instance clients. | 00:51 |
tbarron | for scenarios. | 00:51 |
tbarron | Maybe tomorrow containers on tenant defined networks with the same kind of topologies would give the same coverage. | 00:52 |
bswartz | tbarron: neutron is needed for scenario tests | 00:52 |
tbarron | neutron wouldn't go away. | 00:52 |
tbarron | that was where I'm going. | 00:53 |
bswartz | that's fine because I stopped hating neutron more than a year ago | 00:53 |
bswartz | <3 neutron | 00:53 |
gouthamr | haha | 00:53 |
tbarron | Thinking about the whole cinder + manila as SDS for kubernetes containers thing, I have been wondering: who is running the network? | 00:54 |
bswartz | tbarron: easy -- with cinder there is no network | 00:54 |
tbarron | bswartz: exactly | 00:55 |
bswartz | tbarron: and for manila you just use one of our many flat network plugins | 00:55 |
tbarron | but we have to figure it out, and that answer may be lame. | 00:55 |
tbarron | not sure. | 00:55 |
bswartz | people that run containers aren't doing it for security -- so it stands to reason that secure multitenant networks is not a requirement | 00:56 |
tbarron | hmmm. | 00:56 |
tbarron | No tenant-defined isolated networks? | 00:56 |
bswartz | tbarron: I'm pretty sure that "tenant" and "containers" don't go together | 00:57 |
tbarron | That need to be segmented and tunneled b/c the tenants aren't actually adjacent? | 00:57 |
bswartz | if you want containers and multitenancy you need another layer in the middle to give you the security that matters for that use case | 00:57 |
bswartz | (such as nova) | 00:58 |
tbarron | bswartz: I'm (honestly) missing something, why wouldn't "tenants" just want lighter weight, faster compute for micro-workloads and still want isolated tenant-defined networks for these | 00:58 |
bswartz | tbarron: they do, but I'm talking about something else | 00:58 |
tbarron | I can run containers in different network namespaces, don't need VMs | 00:58 |
bswartz | you use containers because of the packaging and efficiency they provide | 00:59 |
tbarron | If that's what is wanted. | 00:59 |
bswartz | you want multitenancy so you can use other people's resources | 00:59 |
bswartz | the only way to get both of those things is to run containers inside vms, where the VMs provide tenant isolation and the containers provide the rest of the goodness you're after | 00:59 |
tbarron | bswartz: that first statement is what I'm actually not sure of. | 01:00 |
bswartz | which statement? | 01:01 |
bswartz | this? | 01:01 |
tbarron | bswartz: if I use something like systemd-nspawn containers I can get (1) separate namespaces, (2) separate mount namespaces, (3) cgroup type limits, etc. | 01:01 |
bswartz | this? (08:58:48 PM) tbarron: I can run containers in different network namespaces, don't need VMs | 01:01 |
tbarron | ^^ ack | 01:01 |
bswartz | as long as you don't share the hardware with anyone you don't trust, sure | 01:01 |
tbarron | well "the only way to get both of these things is to run containers inside vms" | 01:01 |
bswartz | the problem comes when you have people who don't trust each other sharing hardware, like in an amazon cloud type environment | 01:02 |
tbarron | You have to trust the cloud admin anyways. | 01:02 |
bswartz | yes but the admin doesn't have to trust his customers | 01:02 |
tbarron | right, and there you see a deficiency for containers vs VMs? | 01:03 |
bswartz | running a container in an unvirtualized environment is extremely dangerous | 01:03 |
bswartz | because kernel exploits are relatively common and container security is like lolwut? | 01:03 |
* tbarron is motivated by thoughts of using containers or namespaces instead of service vms | 01:04 | |
tbarron | but that is admittedly a different case since service vms are under admin control | 01:04 |
bswartz | tbarron: if the admin controls the containers then there's no issue | 01:04 |
tbarron | bswartz: yup | 01:05 |
bswartz | the issue I'm raising is the one where you're running containers from other people you don't trust -- you must virtualize those | 01:05 |
tbarron | bswartz: OK, point taken. So for scenario tests, where we need compute consumers mounting our fileshares, we need VMs for the multi-tenancy use case. So we depend on nova and glance at minimum. | 01:06 |
bswartz | and neutron | 01:07 |
tbarron | Besides neutron as constant. | 01:07 |
bswartz | neutron isn't needed if you're testing LVM or ZFS, or netapp-singlesvm without doing scenario tests | 01:07 |
tbarron | Yeah, "constant" only for multi-svm/DHSS=True. | 01:08 |
bswartz | tbarron: even then, it's only because of the neutron network plugin -- in theory someone could write another network plugin that didn't require neutron and use it with the netapp-multisvm driver | 01:09 |
tbarron | bswartz: don't disagree, but that plugin must support tenant-defined networks in separate network namespaces and network segmentation. | 01:10 |
tbarron | bswartz: I guess it doesn't *have* to, just not sure it would be much use otherwise. | 01:10 |
bswartz | yeah now that I think about it that would be fairly nonsensical without some form of neutron emulation or neutron interaction | 01:11 |
bswartz | may as well use neutron in that case | 01:11 |
tbarron | bswartz: It could be 100% ipv6 with BGP/VPN segmentation | 01:11 |
tbarron | bswartz: but if that evolves for Openstack it will likely be called neutron | 01:12 |
tbarron | good thing you <3 neutron | 01:13 |
openstackgerrit | Merged openstack/manila master: Remove redundant revert-to-snapshot test option https://review.openstack.org/427663 | 11:25 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!