Monday, 2019-03-25

*** cheng1 has joined #airshipit01:07
*** cheng1 has quit IRC03:21
*** cheng1 has joined #airshipit03:58
*** juhak has quit IRC04:51
*** juhak has joined #airshipit04:52
openstackgerritTin Lam proposed openstack/airship-pegleg master: trivial: fix yapf/pep8 interaction failing on logical operator  https://review.openstack.org/61993605:07
openstackgerritSmruti Soumitra Khuntia proposed openstack/airship-shipyard master: User context tracing through logging  https://review.openstack.org/63387306:10
openstackgerritSmruti Soumitra Khuntia proposed openstack/airship-drydock master: End user logging for audit traceabilty  https://review.openstack.org/63811506:11
openstackgerritSmruti Soumitra Khuntia proposed openstack/airship-shipyard master: User context tracing through logging  https://review.openstack.org/63387306:31
openstackgerritSmruti Soumitra Khuntia proposed openstack/airship-in-a-bottle master: Document End user optional header  https://review.openstack.org/64299906:33
openstackgerritSmruti Soumitra Khuntia proposed openstack/airship-in-a-bottle master: Document End user optional header  https://review.openstack.org/64299906:33
*** cheng1 has quit IRC06:38
*** cheng1 has joined #airshipit06:38
*** cheng1_ has joined #airshipit07:23
*** cheng1 has quit IRC07:25
*** licanwei has joined #airshipit08:23
*** cheng1_ has quit IRC08:37
*** roman_g has joined #airshipit08:42
*** juhak has quit IRC08:54
*** juhak has joined #airshipit08:54
openstackgerritSmruti Soumitra Khuntia proposed openstack/airship-shipyard master: User context tracing through logging  https://review.openstack.org/63387309:44
openstackgerritSmruti Soumitra Khuntia proposed openstack/airship-shipyard master: User context tracing through logging  https://review.openstack.org/63387309:47
*** cheng1_ has joined #airshipit09:58
*** cheng1_ has quit IRC12:06
*** juhak has quit IRC12:54
*** juhak has joined #airshipit12:54
*** aaronsheffield has joined #airshipit12:57
*** irclogbot_3 has joined #airshipit13:27
*** altlogbot_0 has quit IRC13:31
*** altlogbot_0 has joined #airshipit13:31
*** irclogbot_3 has quit IRC13:38
*** irclogbot_2 has joined #airshipit13:38
*** kranthikirang has joined #airshipit13:41
*** michael-beaver has joined #airshipit13:48
evgenylkranthikirang: Thanks for submitting a patch! I've checked it, just a few small comments.14:01
evgenylkranthikirang: Yes, you will need to configure the vlans manually for genesis14:01
openstackgerritDimitrios Markou proposed openstack/airship-in-a-bottle master: Add bgp peering in virtual airship  https://review.openstack.org/64217114:10
openstackgerritAaron Sheffield proposed openstack/airship-deckhand master: Updating Docker Gate use of zuul.newrev  https://review.openstack.org/64582514:21
openstackgerritMerged openstack/airship-armada master: Support in Armada for locking Tiller  https://review.openstack.org/63248314:29
openstackgerritkranthi kiran guttikonda proposed openstack/airship-treasuremap master: Fix install 4.15.0-34-generic  https://review.openstack.org/64595114:42
kranthikirangevgenyl: I am seeing an exception between Armada and Tiller while running the ./genesis.sh script; the script keeps trying to deploy mariadb, rabbitmq and ingress, failing or timing out, and then trying the same thing over and over14:48
kranthikiranghttp://paste.openstack.org/show/748322/14:48
openstackgerritEvgeniy L proposed openstack/airship-in-a-bottle master: Add a comment to clarify ingress requirements for MariaDB  https://review.openstack.org/64749614:49
openstackgerritSmruti Soumitra Khuntia proposed openstack/airship-shipyard master: User context tracing through logging  https://review.openstack.org/63387314:50
mattmceuenkranthikirang: what do you see in the mariadb logs or pod descriptions?14:50
kranthikirangmattmceuen: When I checked on Friday I saw the pods were running, but it took a while for them to come up14:50
kranthikirangmattmceuen: I will try running genesis.sh again14:51
evgenylHi everyone, please help with reviews/merges for AIAB gate fix https://review.openstack.org/#/c/644634/14:52
mattmceuenkranthikirang:  sounds good.  It'll either be a case of "mariadb is taking a long time because the environment is a bit slow", in which case maybe you should increase timeouts;  or, it'll be "taking a long time because something is wrong" in which case the root cause will just need to be troubleshot14:53
kranthikirangmattmceuen: How can I increase the timeouts? I am using HP gen9 v4 servers14:53
mattmceuenAll the timeout / waiting type stuff lives in the armada charts for the different deployed components.  I think this is the one you'd need to tweak:  https://github.com/openstack/airship-treasuremap/blob/master/global/software/charts/ucp/core/mariadb.yaml#L5714:56
kranthikirangmattmceuen: since I have already generated the collected documents and the bundle, I guess I have to modify airship-treasuremap.yaml14:59
mattmceuenkranthikirang:  Since Airship is a declarative platform, the best thing to do is to modify the original source documents (or override at the type or site level), and then re-collect them fresh, and generate a new genesis.sh.  Modifying the collected documents should work fine, but then the deployed site won't match the declarative intent in your git repo15:03
kranthikirangmattmceuen: totally agree; I see that genesis creates the /etc/genesis/armada folder, where it keeps all the manifests as well; I guess updating the collected manifest alone will not resolve it;15:05
kranthikirangmattmceuen: Will update the original site documents as well15:05
mattmceuenAwesome - let me know how the timeout update goes15:06
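A minimal sketch of the timeout change discussed above, assuming the chart document keeps its timeout under `data.wait.timeout` (the usual layout of an `armada/Chart/v1` document; the linked mariadb.yaml is not reproduced here, so the key path should be checked against the real file). The `mariadb_chart` dict, its values, and the helper name `set_wait_timeout` are hypothetical; 600 is the value kranthikirang reports trying later in this log.

```python
import copy


def set_wait_timeout(chart_doc, seconds):
    """Return a copy of an Armada chart document with data.wait.timeout set.

    Assumes the timeout lives under data.wait.timeout; adjust the key path
    if the chart document is laid out differently.
    """
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    updated = copy.deepcopy(chart_doc)  # leave the caller's document untouched
    wait = updated.setdefault("data", {}).setdefault("wait", {})
    wait["timeout"] = seconds
    return updated


# Hypothetical, heavily trimmed stand-in for the real chart document.
mariadb_chart = {
    "schema": "armada/Chart/v1",
    "metadata": {"name": "ucp-mariadb"},
    "data": {"wait": {"timeout": 300, "labels": {"release_group": "airship-mariadb"}}},
}

patched = set_wait_timeout(mariadb_chart, 600)
print(patched["data"]["wait"]["timeout"])
```

In line with mattmceuen's advice, a change like this belongs in the source documents in the git repo (or in a type- or site-level override), followed by a fresh collect and a regenerated genesis.sh, rather than in the already collected output.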
openstackgerritLev Morgan proposed openstack/airship-pegleg master: Fix multiple I/O issues in cert generation  https://review.openstack.org/64367815:09
openstackgerritMerged openstack/airship-in-a-bottle master: Fix AIAB gate Heat test & MariaDB failures  https://review.openstack.org/64463415:10
openstackgerritAlexander Hughes proposed openstack/airship-pegleg master: PKI Cert generation and check updates  https://review.openstack.org/63941415:14
kranthikirangmattmceuen: re-running ./genesis.sh seems very costly; it restarts Docker, thus restarting everything that runs as a container; what is the flow you follow to update anything in the middle? Rebuild the collected documents and the bundle, then re-run ./genesis.sh?15:22
openstackgerritLev Morgan proposed openstack/airship-pegleg master: [DNM] Added cleartext option to passphrase generation  https://review.openstack.org/64501715:28
openstackgerritRick Bartra proposed openstack/airship-divingbell master: Update documentation based on change to using unprivileged containers  https://review.openstack.org/64751015:33
openstackgerritLev Morgan proposed openstack/airship-pegleg master: Added document wrapping command  https://review.openstack.org/64463715:42
openstackgerritAaron Sheffield proposed openstack/airship-deckhand master: [WIP] Updating Docker Gate use of zuul.newrev  https://review.openstack.org/64582516:09
openstackgerritMerged openstack/airship-pegleg master: Set salt when generating genesis bundle  https://review.openstack.org/64284816:16
openstackgerritMerged openstack/airship-pegleg master: trivial: fix yapf/pep8 interaction failing on logical operator  https://review.openstack.org/61993616:21
openstackgerritPRATEEK REDDY DODDA proposed openstack/airship-armada master: Implement Security Context for Armada  https://review.openstack.org/63920716:22
mattmceuenkranthikirang:  yep, if genesis needs to be re-run with updated manifests, then we run those steps again (via automation).  Once the genesis process has completed, though, you can generally push additional changes to the site via update_site or update_software APIs on Shipyard16:34
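A rough sketch of the post-genesis update flow mentioned above, assuming the Shipyard CLI's `create configdocs`, `commit configdocs`, and `create action` subcommands; the exact flags should be confirmed with `shipyard --help` for the deployed version. The collection name `site-design` and the directory path are hypothetical, and the helper only builds the command lines without running them.

```python
import shlex


def shipyard_update_commands(collection, directory, action="update_software"):
    """Build the Shipyard CLI invocations for pushing changed documents.

    Returns a list of argv lists; nothing is executed here.
    """
    if action not in ("update_site", "update_software"):
        raise ValueError("unsupported action: " + action)
    return [
        # Load the freshly collected documents, replacing the old collection.
        ["shipyard", "create", "configdocs", collection,
         "--directory=" + directory, "--replace"],
        # Validate and commit the loaded documents.
        ["shipyard", "commit", "configdocs"],
        # Kick off the chosen action against the committed documents.
        ["shipyard", "create", "action", action],
    ]


for argv in shipyard_update_commands("site-design", "/path/to/collected"):
    print(" ".join(shlex.quote(part) for part in argv))
```

Keeping the commands as argv lists makes it easy to review them first and hand them to `subprocess.run` afterwards.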
openstackgerritMerged openstack/airship-in-a-bottle master: Add a comment to clarify ingress requirements for MariaDB  https://review.openstack.org/64749616:44
openstackgerritPRATEEK REDDY DODDA proposed openstack/airship-divingbell master: Implement Security Context for Divingbell  https://review.openstack.org/64170617:16
openstackgerritEvgeniy L proposed openstack/airship-in-a-bottle master: [WIP][DNM] debug patch  https://review.openstack.org/64756718:06
openstackgerritDimitrios Markou proposed openstack/airship-in-a-bottle master: Add bgp peering in virtual airship  https://review.openstack.org/64217118:07
*** ukk1985 has joined #airshipit18:11
openstackgerritPRATEEK REDDY DODDA proposed openstack/airship-divingbell master: Implement Security Context for Divingbell  https://review.openstack.org/64170618:34
openstackgerritDimitrios Markou proposed openstack/airship-in-a-bottle master: Add bgp peering in virtual airship  https://review.openstack.org/64217118:35
openstackgerritPRATEEK REDDY DODDA proposed openstack/airship-divingbell master: Implement Security Context for Divingbell  https://review.openstack.org/64170618:45
openstackgerritPRATEEK REDDY DODDA proposed openstack/airship-armada master: Implement Security Context for Armada  https://review.openstack.org/63920719:04
*** rihbb has joined #airshipit19:07
rihbbHi, when I am using the update_software shipyard script to make updates to my current site, the ceph-rgw chart seems to fail ("Failed to apply manifest: Exception deploying charts: ['tenant-ceph-rgw']"). Is there some ceph-related cleanup that one needs to do before running update_software? Thanks!19:10
evgenylrihbb: I don't think that there is anything specific that needs to be run, I did quite a few updates, and have not seen problems with tenant-ceph-rgw, check around this message in armada-api logs, you may be able to find more details, and also check rgw pods in tenant-ceph namespace, maybe you will be able to identify some of them being stuck in init/error states.19:14
openstackgerritPRATEEK REDDY DODDA proposed openstack/airship-divingbell master: Implement Security Context for Divingbell  https://review.openstack.org/64170619:14
openstackgerritPRATEEK REDDY DODDA proposed openstack/airship-armada master: Implement Security Context for Armada  https://review.openstack.org/63920719:17
*** sthussey has joined #airshipit19:18
rihbbevgenyl: Hi, the armada logs show timeout-related errors - https://paste.ubuntu.com/p/XFjFBwYPKG/. There are no ceph-rgw pods in the tenant-ceph namespace; however, the ceph-rgw pods in the openstack namespace seem to be restarting with no error https://paste.ubuntu.com/p/MfVhTfhHXC/. All the other pods in the tenant-ceph namespace seem to be running & deployed properly. Any idea of what could have gone wrong?19:18
rihbbAll that I changed before running update_software are the versions of the openstack component images (I didn't touch any ceph-related config).19:20
evgenylrihbb: Can you show the output of `helm history airship-tenant-ceph-rgw`?19:24
evgenylrihbb: I suspect this is just some timeout related error which just requires re-apply, but let's first check a few things.19:24
rihbbREVISION        UPDATED                         STATUS  CHART           DESCRIPTION19:25
rihbb1               Mon Mar 25 09:41:21 2019        FAILED  ceph-rgw-0.1.0  Release "airship-tenant-ceph-rgw" failed: timed out waiti...19:25
openstackgerritRahul Khiyani proposed openstack/airship-drydock master: Drydock: Add pod/container security context  https://review.openstack.org/63919719:26
evgenylrihbb: And now `helm history -o yaml airship-tenant-ceph-rgw` to see a complete description of the problem.19:27
rihbbevgenyl: That also shows timeout condition:19:28
rihbbdescription: 'Release "airship-tenant-ceph-rgw" failed: timed out waiting for the19:28
rihbb    condition'19:28
rihbb  revision: 119:28
rihbb  status: FAILED19:28
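A small sketch for picking failed revisions out of the history shown above, assuming `helm history` also accepts `-o json` next to the `-o yaml` used here, and that the JSON entries carry the same `revision`, `status`, and `description` fields as the pasted YAML. The helper name `failed_revisions` and the sample data are illustrative only.

```python
import json


def failed_revisions(history_json):
    """Return (revision, description) pairs for every FAILED revision."""
    entries = json.loads(history_json)
    return [
        (entry["revision"], entry.get("description", ""))
        for entry in entries
        if entry.get("status") == "FAILED"
    ]


# Sample shaped like the output pasted above; a real run would feed in the
# text printed by `helm history -o json airship-tenant-ceph-rgw`.
sample = json.dumps([
    {
        "revision": 1,
        "status": "FAILED",
        "chart": "ceph-rgw-0.1.0",
        "description": 'Release "airship-tenant-ceph-rgw" failed: '
                       "timed out waiting for the condition",
    }
])

for revision, description in failed_revisions(sample):
    print(revision, description)
```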
evgenylrihbb: `kubectl get pods -o wide --all-namespaces | grep rgw`19:30
rihbbevgenyl: https://paste.ubuntu.com/p/m94RV47try/19:34
rihbblogs of the ceph-rgw pod in CrashLoopBackOff state: https://paste.ubuntu.com/p/MfVhTfhHXC/19:35
evgenylrihbb: Are these all the logs? Can you try to run `kubectl logs` with the `-f` flag and wait until the next round of crash happens?19:36
evgenylrihbb: Just to make sure that we catch an actual error that caused the crash.19:37
rihbbevgenyl: kubectl logs command exits with the same log ^ when the new restart happens; but kubectl describe shows Warning  Unhealthy  14m (x1872 over 9h)  kubelet, node-4 Readiness probe failed: Get http://10.97.232.89:8088/: dial tcp 10.97.232.89:8088: getsockopt: connection refused19:41
openstackgerritRahul Khiyani proposed openstack/airship-maas master: Maas: Add pod/container security context  https://review.openstack.org/63920019:51
evgenylrihbb: Hmm, so it fails on `ceph version 13.2.2` without any other messages?19:52
evgenyl^ I mean, on printing its version?19:53
rihbbevgenyl: Yes (2019-03-25 19:49:51.716 7fa3387988c0  0 ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic (stable), process radosgw, pid 17)19:54
evgenylrihbb: Can you try to force a recreate of e.g. the `ceph-rgw-679c47b9dd-klwm5` pod by deleting it using `kubectl delete pod -n openstack ceph-rgw-679c47b9dd-klwm5`, waiting until it gets recreated under a different name, and following the logs when it starts?19:59
openstackgerritRahul Khiyani proposed openstack/airship-drydock master: Drydock: Add pod/container security context  https://review.openstack.org/63919720:02
openstackgerritRahul Khiyani proposed openstack/airship-maas master: Maas: Add pod/container security context  https://review.openstack.org/63920020:03
rihbbevgenyl: This is what the logs look like once it gets created for the first time: https://paste.ubuntu.com/p/wpYY62QPMD/.20:04
evgenylrihbb: Has it crashed?20:05
rihbbevgenyl: yes20:05
evgenylrihbb: Can you now show the output of `kubectl describe ...`?20:05
openstackgerritAlexander Hughes proposed openstack/airship-pegleg master: PKI Cert generation and check updates  https://review.openstack.org/63941420:07
rihbbevgenyl: Sure, https://paste.ubuntu.com/p/H8bjtDbPrQ/20:09
openstackgerritDimitrios Markou proposed openstack/airship-in-a-bottle master: Add bgp peering in virtual airship  https://review.openstack.org/64217120:12
openstackgerritPRATEEK REDDY DODDA proposed openstack/airship-armada master: Implement Security Context for Armada  https://review.openstack.org/63920720:12
openstackgerritAlexander Hughes proposed openstack/airship-pegleg master: PKI Cert generation and check updates  https://review.openstack.org/63941420:26
evgenylrihbb: It's hard to tell what is going on. I don't think that a failed readiness probe causes crashes. I'm wondering if we can increase the logging level for rgw, but I'm surprised that it fails silently with no messages.20:27
evgenylrihbb: Can you provide a bit more detail on what changes you have applied to the manifests?20:27
openstackgerritStas Egorov proposed openstack/airship-in-a-bottle master: Fixed sudo env vars for apt  https://review.openstack.org/64759820:34
openstackgerritRahul Khiyani proposed openstack/airship-shipyard master: Shipyard and Airflow: Add pod/container security context  https://review.openstack.org/63919520:40
*** mfuller_ has quit IRC20:42
*** mcfuller has joined #airshipit20:42
openstackgerritRahul Khiyani proposed openstack/airship-shipyard master: Shipyard and Airflow: Add pod/container security context  https://review.openstack.org/63919520:43
mcfullerHello, I was curious about the absence of a global tempest chart in treasuremap. Is the preferred method for tempest testing to set the run_tempest flag as a value override for individual helm / armada charts?20:44
rihbbevgenyl: For this case, I had modified versions.yaml - https://paste.ubuntu.com/p/s3JTFBmwGB/20:51
*** ukk1985 has quit IRC20:52
evgenylrihbb: The only idea I have right now is to try to increase the logging level. You can do that using `kubectl edit configmap -n openstack ceph-rgw-etc`; you will need to add `debug_rgw = 10/5` to the `[global]` section, see http://docs.ceph.com/docs/mimic/rados/troubleshooting/log-and-debug/ for details.20:56
rihbbevgenyl: Thanks, will try that.21:00
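A minimal sketch of the config change suggested above, assuming the ceph.conf held in the `ceph-rgw-etc` configmap is plain INI-style text that Python's `configparser` can read. The sample config and the helper name `add_rgw_debug` are hypothetical. Note that `configparser` drops comments and lowercases keys when it rewrites the text, so this is for illustration rather than for editing a production config in place.

```python
import configparser
import io


def add_rgw_debug(conf_text, level="10/5"):
    """Return ceph.conf-style text with debug_rgw set in the [global] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        parser.add_section("global")
    parser.set("global", "debug_rgw", level)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical minimal config; the real configmap holds many more keys.
original = "[global]\nfsid = 00000000-0000-0000-0000-000000000000\n"
print(add_rgw_debug(original))
```

The resulting text would then be pasted into the configmap via `kubectl edit`, and the rgw pod restarted so it picks up the new level.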
rihbbevgenyl: The logs after increasing logging level (& before entering restart mode) look like this - https://paste.ubuntu.com/p/QCYZFwyq8N/.21:15
rihbbNo error message as such in the entire log.21:16
openstackgerritLev Morgan proposed openstack/airship-pegleg master: Added DeploymentData document generation  https://review.openstack.org/64761521:21
evgenylrihbb: This is very strange :) Can you try checking /var/log/syslog (on the node where rgw fails) and see if there is anything interesting related to rgw pod?21:23
openstackgerritScott Hussey proposed openstack/airship-in-a-bottle master: (multinode) Make disk layout flexible  https://review.openstack.org/63804021:30
openstackgerritScott Hussey proposed openstack/airship-in-a-bottle master: Network enhancements for gate-multinode  https://review.openstack.org/63483721:30
kranthikirangmattmceuen: I have increased the timeouts to 600 for mariadb and rabbit and observed the same failure; After inspecting the logs I see two reasons for the failure;21:34
kranthikirangmattmceuen: http://paste.openstack.org/show/748340/ - rabbitmq logs21:35
kranthikirangmattmceuen: http://paste.openstack.org/show/748341/ - mariadb logs21:36
kranthikirangmattmceuen: I also see the readinessProbe failing for both pods; they didn't become ready within 600 seconds, hence Armada reports failures. Can you help me find the root cause for these two? Also, how can I change the readinessProbe value for a chart? I have deployed rabbitmq directly using the openstack-helm-infra charts but never encountered these failures21:37
kranthikirangmattmceuen: probably my HP gen9 v4 isn't sufficient, but that's weird since it's a 56-CPU host with 256GB of memory21:39
openstackgerritScott Hussey proposed openstack/airship-in-a-bottle master: [WIP] Network enhancements for gate-multinode  https://review.openstack.org/63483721:48
*** rihbb has left #airshipit21:49
openstackgerritPRATEEK REDDY DODDA proposed openstack/airship-divingbell master: Implement Security Context for Divingbell  https://review.openstack.org/64170621:56
openstackgerritGeorg Kunz proposed openstack/airship-in-a-bottle master: [WIP] Configuration for testing DPDK in multi-node AIAB  https://review.openstack.org/63420721:58
*** michaelbeaver has joined #airshipit22:00
*** michael-beaver has quit IRC22:04
openstackgerritRahul Khiyani proposed openstack/airship-promenade master: Add pod/container security context  https://review.openstack.org/63918922:06
*** michaelbeaver has quit IRC22:06
openstackgerritRahul Khiyani proposed openstack/airship-shipyard master: Shipyard and Airflow: Add pod/container security context  https://review.openstack.org/63919522:18
openstackgerritAnthony Bellino proposed openstack/airship-divingbell master: [WIP] Initial Ansible Daemonset  https://review.openstack.org/64053922:38
*** kranthikirang has quit IRC22:39
openstackgerritRahul Khiyani proposed openstack/airship-maas master: Maas: Add pod/container security context  https://review.openstack.org/63920022:42
*** aaronsheffield has quit IRC22:56
*** sthussey has quit IRC23:57