Monday, 2018-12-03

*** rcernin has quit IRC06:57
*** pcaruana has joined #openstack-sahara07:25
openstackgerritTobias Urdin proposed openstack/puppet-sahara master: Remove deprecated parameters  https://review.openstack.org/62080808:23
*** tosky has joined #openstack-sahara08:48
*** tellesnobrega_ is now known as tellesnobrega10:00
tellesnobregatosky, morning, I see that you have a solution for the vanilla issue10:01
tellesnobregadid that make the cluster start?10:01
toskyhi, yes, now the behavior using ubuntu and centos7 is the same10:01
toskyboth fail on EDP and scale -_-' but at least it's a start10:02
toskydid you hit the same issue?10:03
tellesnobregaI just woke up, will test it soon10:04
tellesnobregacreating all vanilla versions for ubuntu and centos710:33
tellesnobregaand I will test it out10:33
openstackgerritMerged openstack/sahara-image-elements master: firstboot: make rc-local start after cloud-init  https://review.openstack.org/62130210:50
openstackgerritLuigi Toscano proposed openstack/sahara-image-elements master: Plain Ubuntu image are still based on Xenial  https://review.openstack.org/62154010:59
openstackgerritMerged openstack/sahara stable/queens: Add DEBIAN_FRONTEND=noninteractive in front of apt-get install commands  https://review.openstack.org/62135111:20
openstackgerritMerged openstack/sahara stable/rocky: doc: restructure the image building documentation  https://review.openstack.org/62130611:20
toskytellesnobrega: do you think it's worth backporting (and adapting) the doc refactoring patch to queens as well? It's going to be supported for 8+ months, until August (and then in extended maintenance mode)11:47
tellesnobregatosky, it is a long time, so I would say yes11:49
toskyoki :)11:49
tellesnobregatosky, just created a worklist with apiv2 stuff https://storyboard.openstack.org/#!/worklist/53311:57
toskyI noticed the updates on storyboard12:43
tellesnobregatosky, so, I just started a centos7 vanilla 2.7.1 cluster12:52
tellesnobregawithout your fix12:54
tellesnobregaand the cluster is active12:54
*** dave-mccowan has joined #openstack-sahara12:57
toskytellesnobrega: yes, and it may happen, if the order of the service happens to be the lucky one13:07
tellesnobregahum, I see13:07
openstackgerritTobias Urdin proposed openstack/puppet-sahara master: Deprecate ZeroMQ  https://review.openstack.org/62156813:12
toskyadded a story to track the API v2 changes required for sahara-tests13:31
toskycan I tag it as high-priority?13:31
*** dave-mccowan has quit IRC13:32
tellesnobregayes please13:34
*** dave-mccowan has joined #openstack-sahara14:54
openstackgerritTelles Mota Vidal Nóbrega proposed openstack/sahara master: Fixing cluster scale  https://review.openstack.org/61619317:02
toskytellesnobrega: could this fix ^^ solve the issue that I noticed when scaling vanilla clusters using the template in sahara-tests?17:08
tellesnobregayes17:08
tellesnobregait does17:08
toskyoh!17:08
toskylet me try it then17:08
toskynow I get it :)17:08
tellesnobregathat is when I noticed this change was needed17:09
openstackgerritLuigi Toscano proposed openstack/sahara-image-elements stable/rocky: firstboot: make rc-local start after cloud-init  https://review.openstack.org/62164117:17
tellesnobregatosky,17:37
tellesnobregascaling works here17:37
tellesnobregaEDP still failing17:37
toskyI see mixed errors for EDP failures; on the node where I use radosgw, some are related to some swift URL being unreachable, others are related to weird errors on retrieving some blocks from HDFS17:43
toskyI need to check on the system which uses swift instead of radosgw, but it was failing too (maybe a subset of the  failures)17:43
tellesnobregaI see17:50
toskytellesnobrega: did you notice which job(s) failed specifically? In my last re-run on the current code rocky + the patch, the first run of EDP jobs saw only one KILLED job17:58
toskyand that's the Hive job17:58
toskynothing new17:58
toskyI'm waiting for the second run of the jobs, after the scaling operation17:58
toskyin the meantime I re-run the tests for vanilla 2.8.2/centos7 on the split plugin/devstack deployment17:59
toskylet's see17:59
toskyI will also run mapr afterwards (as mapr scenario test also includes scaling, and it's faster than ambari)17:59
tellesnobregaon centos7, vanilla 2.7.118:00
tellesnobregaAssertionError: Job with id=47344935-b912-43e3-9639-1e1449eee700, name=test-d92a1077, type=Pig has status FAILED18:00
tellesnobregaJob with id=5956755b-4276-4288-a798-854c3feb06d6, name=test-f786028b, type=MapReduce has status FAILED18:00
tellesnobregaJob with id=af766279-eac2-418f-849f-94341c7fff5e, name=test-da3316f7, type=MapReduce.Streaming has status FAILED18:00
tellesnobregaJob with id=440bb3b9-c0be-42b8-8b86-12818448bde7, name=test-f3e76d23, type=Java has status FAILED18:00
tellesnobregaJob with id=1c636e54-bc4c-44b6-9548-2ec67d756082, name=test-59888187, type=Hive has status FAILED18:00
toskythat's the split version?18:01
tellesnobregano, master18:01
toskyanyway, I'm also going to build a centos7/vanilla 2.7.1 image18:02
tellesnobregaok18:02
toskydo you see any special error for those jobs from the web console?18:02
tellesnobreganot right now, I will run again and see how that goes18:04
toskygetting closer18:05
toskyyou will need another rebase for sure :P18:05
tellesnobregatoo much stuff changing?18:05
toskyfew useful patches18:06
tellesnobreganice18:06
tellesnobregano worries on rebase, I think I got a good handle on it now18:06
GaasmannAbout the problem with telnetlib.Telnet for cdh, what would you think about a patch like that? http://paste.openstack.org/show/736588/18:51
Gaasmann2018-12-03 18:46:50.102 20 DEBUG sahara.plugins.cdh.client.http_client [req-d67860ab-0093-4c8f-8452-9b5ceec3941f 78e9b31dfce642fa9995a58d017458d1 a4b84a529e9e4eea8ff7bfbe51c48e32 - - -] [instance: none, cluster: 337be666-a115-46d4-9b73-3e704e14b0ec] Method: GET, URL: http://192.168.52.18:7180/api/v8/users/admin execute18:52
Gaasmann/var/lib/kolla/venv/local/lib/python2.7/site-packages/sahara/plugins/cdh/client/http_client.py:12418:52
Gaasmannthis timeout, maybe the same issue?18:52
toskythat timeout seems to be related, do you have a longer stacktrace?19:13
toskyabout the patch, it may not work in general19:13
toskyoh19:13
toskyuhm, maybe it can, you are exploiting the usage of ssh on one of the nodes19:14
toskylet me test it with my "normal" deployment19:15
Gaasmanntosky: the longer stacktrace http://paste.openstack.org/show/736591/19:40
GaasmannFor the patch, I use what seems to be used for the preparation/configuration of the cluster so I guess it uses the Remote/SshRemoteDriver classes19:43
Gaasmann(it makes the debug log a bit verbose though)19:45
toskyGaasmann: that stacktrace looks like another direct call to the API19:49
toskyI guess we need to wrap all API calls somehow19:49
* tosky bbl19:49
Gaasmannthat is the first API call I see during the cluster creation. I guess it's possible to ssh and run a curl command locally but it sounds like a quick and dirty fix19:54
toskyI'd say: please send that patch as it is (it worked for me on a "normal" cloudera deployment), and then let's see if it makes sense to extend it to support all cases where sahara uses the CDH API20:25
toskyI'd also suggest extending the scope of the story to address all the possible issues of the same kind; if we need to split the commits, we can use different tasks20:26
toskyGaasmann: you may want to edit the main content of the story, instead of just adding a comment :)20:48
Gaasmanngood idea :-)20:50
toskyGaasmann: I think that storyboard supports user editing of the comments, but the feature is disabled on the openstack instance21:59
openstackgerritLuigi Toscano proposed openstack/sahara master: DNM TESTONLY py3 test: remove i18n call to db exceptions  https://review.openstack.org/60068922:06
*** goldyfruit has joined #openstack-sahara22:14
*** rcernin has joined #openstack-sahara22:18
*** pcaruana has quit IRC22:18
openstackgerritMerged openstack/sahara-image-elements stable/rocky: firstboot: make rc-local start after cloud-init  https://review.openstack.org/62164122:41
openstackgerritLuigi Toscano proposed openstack/sahara-image-elements stable/queens: firstboot: make rc-local start after cloud-init  https://review.openstack.org/62172122:47
*** irclogbot_2 has joined #openstack-sahara23:09
