*** rlandy|rover|bbl is now known as rlandy|rover | 00:36 | |
rlandy|rover | merged the revert of victoria | 00:38 |
---|---|---|
*** rlandy|rover is now known as rlandy|out | 00:39 | |
dasm|ruck|bbl | nice | 00:44 |
*** dasm|ruck|bbl is now known as dasm|ruck|off | 03:11 | |
*** marios is now known as marios|ruck | 05:08 | |
marios|ruck | morning | 05:08 |
chandankumar | marios|ruck: Good morning :-) | 05:15 |
*** jpena|off is now known as jpena | 07:34 | |
frenzy_friday | hey marios|ruck Good morning. https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-build-containers-centos-9-quay-master looks like quay master is passing again | 08:30 |
marios|ruck | frenzy_friday: o/ morning ok good so you merged some fix? | 08:30 |
frenzy_friday | nope, maybe something in quay was down ? | 08:30 |
marios|ruck | frenzy_friday: ah i see | 08:31 |
marios|ruck | well fantastic then | 08:31 |
marios|ruck | ;) | 08:31 |
marios|ruck | frenzy_friday: thanks | 08:31 |
frenzy_friday | that was quay trolling us | 08:31 |
marios|ruck | :D | 08:31 |
marios|ruck | coffee brb | 08:32 |
chandankumar | marios|ruck: need any help on ruck rover? | 10:12 |
marios|ruck | chandankumar: thanks ok for now chasing promotions mostly atm ... gates look ok but lets not focus on that too much ;) | 10:12 |
chandankumar | marios|ruck: ok | 10:13 |
marios|ruck | chandankumar: thanks will ping when i need sthing | 10:13 |
chandankumar | sure sure | 10:13 |
marios|ruck | dasm|ruck|off: rlandy|out: o/ question about how network promoted without fs1 did you remove from criteria? https://bugs.launchpad.net/tripleo/+bug/1970899/comments/10 3 | 10:14 |
rlandy|out | marios|ruck: ack | 10:19 |
rlandy|out | harold's note said we needed newer neutron | 10:19 |
rlandy|out | marios|ruck: dasm|ruck|off also added a patch to skip train fs01 and fs035 | 10:20 |
rlandy|out | not sure how I feel about that | 10:20 |
rlandy|out | we should investigate what died there | 10:20 |
rlandy|out | marios|ruck: is master any better with newer neutron? | 10:20 |
marios|ruck | rlandy|out: yeah i blocked it for now trying our luck with 2 hashes first ;) details there https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/42450/1#message-e243ae370c46aa1d9e1ddbcac379bd742eac87d3 | 10:21 |
chandankumar | dasm|ruck|off: hello, please remove -1 from https://review.rdoproject.org/r/c/config/+/42226 as https://github.com/rdo-infra/rdo-jobs/blob/master/zuul.d/ansible-role-container-registry.yaml#L16 is removed now | 10:21 |
marios|ruck | rlandy|out: well master integration line today is at least no longer hitting +bug/1970899/ (i wrote in comment #10) | 10:22 |
*** rlandy|out is now known as rlandy | 10:24 | |
rlandy | marios|ruck: quick chat re: invoice | 10:24 |
rlandy | https://meet.google.com/wzf-vowr-oxu?pli=1&authuser=0 | 10:24 |
rlandy | marios|ruck: ^^ | 10:25 |
marios|ruck | rlandy: sure sec (sorry had temp issue with connection router dropped for a minute) | 10:27 |
rlandy | chandankumar: arxcruz: can you guys join us on https://meet.google.com/wzf-vowr-oxu?pli=1&authuser=0 | 10:53 |
marios|ruck | arxcruz: https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-master/65b4123/logs/undercloud/var/log/tempest/stestr_results.html.gz | 10:57 |
chandankumar | passing https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-master/b0fc454/logs/overcloud-novacompute-0/var/log/extra/errors.txt.gz | 11:03 |
chandankumar | failure: https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-master/65b4123/logs/overcloud-novacompute-0/var/log/extra/errors.txt.gz | 11:03 |
chandankumar | arxcruz: https://opendev.org/openstack/tripleo-quickstart/src/branch/master/config/general_config/featureset001.yml#L147 | 11:07 |
chandankumar | https://opendev.org/openstack/tripleo-quickstart/src/branch/master/config/general_config/featureset035.yml#L217 | 11:07 |
chandankumar | rlandy: https://codesearch.opendev.org/?q=tempest_run_concurrency&i=nope&literal=nope&files=&excludeFiles=&repos= most places it is 4 | 11:09 |
rlandy | marios|ruck: https://review.rdoproject.org/r/c/testproject/+/36255 - ready for depends-on | 11:14 |
marios|ruck | rlandy: ack https://review.opendev.org/c/openstack/tripleo-quickstart/+/840283 | 11:18 |
*** dviroel|out is now known as dviroel | 11:18 | |
rlandy | thanks | 11:19 |
rlandy | marios|ruck: need help with anything else? | 11:26 |
marios|ruck | rlandy: not currently will let you know thanks main focus is train promotion currently | 11:27 |
rlandy | k | 11:27 |
rlandy | jm1: frenzy_friday, arxcruz, chandankumar, marios|ruck, rcastillo, dasm|ruck|off, dviroel: any topics for today's community call? | 11:28 |
rlandy | india has a public holiday today | 11:29 |
rlandy | I have a clashing meeting | 11:29 |
rlandy | which I may be able to skip some of | 11:29 |
marios|ruck | rlandy: ack we'll join in case there are any guests | 11:30 |
rlandy | marios|ruck; pls do | 11:30 |
rlandy | will join if I can | 11:30 |
marios|ruck | k np | 11:30 |
chandankumar | rlandy: dviroel marios|ruck https://review.opendev.org/c/openstack/tripleo-ansible/+/839319 please have a look when free, thanks! | 11:41 |
marios|ruck | chandankumar: k adding to reviews | 11:41 |
reviewbot | Do you want me to add your patch to the Review list? Please type something like add to review list <your_patch> so that I can understand. Thanks. | 11:41 |
chandankumar | I am still working on final change to close out cs9 tripleo-ansible work https://review.opendev.org/c/openstack/tripleo-ansible/+/839688/11 | 11:41 |
chandankumar | rlandy: marios|ruck leaving early today, see ya tomorrow | 11:42 |
marios|ruck | chandankumar: o/ have a good one mate | 11:43 |
dviroel | k | 11:43 |
rlandy | rcastillo: updated your molecule patch for stable/wallaby | 11:45 |
rlandy | freeze graph issue | 11:45 |
rlandy | rcastillo: testing nested-virt here: https://review.rdoproject.org/r/c/testproject/+/42434 | 11:48 |
rlandy | dasm|ruck|off: marios|ruck: rhos-17 on rhel-9 missing fs035 to promo - rerunning that now | 12:09 |
marios|ruck | thank you rlandy | 12:09 |
* marios|ruck coffee brb | 12:09 | |
*** dasm|ruck|off is now known as dasm|ruck | 12:17 | |
dasm|ruck | o/ | 12:17 |
jm1 | rlandy: prepared a intro for our zuul ci jobs, but can give it next week as well | 12:17 |
jm1 | rlandy: *ci jobs in ansible openstack collection | 12:17 |
jm1 | marios|ruck: ^ | 12:25 |
marios|ruck | o/ jm1 | 12:25 |
marios|ruck | sure sounds good - maybe both today and next week? | 12:26 |
jm1 | hey hey .) | 12:26 |
jm1 | that might bore half of our team ^^ | 12:26 |
jm1 | ..next week | 12:26 |
marios|ruck | jm1: also fine if you want to wait until next week :) | 12:26 |
jm1 | i am fine with both but if india is out today it might make sense to postpone it because in particular chandan asked about it ^^ | 12:27 |
marios|ruck | jm1: sounds like a plan then ;) | 12:27 |
marios|ruck | ship it! | 12:28 |
jm1 | marios|ruck: ack, thx :) | 12:28 |
rlandy | jm1: let's see who is there | 12:29 |
jm1 | rlandy: ack, will be there | 12:30 |
rlandy | chandankumar: when you are in tomorrow - let's discuss https://review.rdoproject.org/r/c/config/+/42442 | 12:53 |
rlandy | ordering there | 12:53 |
rlandy | I think that's the right way around | 12:53 |
rlandy | jm1: hey - going to send you a test email - can you just tell me when you receive it? | 12:56 |
rlandy | marios|ruck: ^^ checking | 12:56 |
marios|ruck | rlandy: :) thanks | 12:57 |
jm1 | rlandy: none yet ^^ | 12:58 |
marios|ruck | nothing reported there https://www.google.com/appsstatus/dashboard/ fwiw | 12:59 |
rlandy | jm1: frenzy_friday, arxcruz, chandankumar, marios|ruck, rcastillo, dasm|ruck|off, dviroel: community call notes hackmd ready for notes if I can't attend: | 13:05 |
rlandy | https://hackmd.io/MMg4WDbYSqOQUhU2Kj8zNg | 13:05 |
rlandy | 05/03 | 13:05 |
marios|ruck | thanks rlandy | 13:05 |
*** rlandy is now known as rlandy|mtg | 13:05 | |
marios|ruck | jm1: did you still not receive an email from rlandy|mtg ? | 13:06 |
dasm|ruck | marios|ruck: i responded to your review comments on https://review.rdoproject.org/r/q/topic:rr_refactor+is:open | 13:06 |
dasm|ruck | hmm.. rdoproject is 503-ing me | 13:06 |
marios|ruck | dasm|ruck: thanks will revisit | 13:07 |
marios|ruck | dasm|ruck: yes there is software factory upgrade today | 13:07 |
marios|ruck | totally forgot | 13:07 |
marios|ruck | :/ | 13:07 |
marios|ruck | must have just started dasm|ruck | 13:07 |
dasm|ruck | ah | 13:07 |
dasm|ruck | ack | 13:07 |
marios|ruck | i guess zuul will go at some point still there currently | 13:08 |
dasm|ruck | repos are unavailable atm | 13:08 |
jm1 | marios|ruck: nope | 13:10 |
marios|ruck | jm1: k thank you must be some wider email issue | 13:10 |
jm1 | marios|ruck, rlandy|mtg: a regular mail to my company mail address? | 13:11 |
marios|ruck | jm1: yeah work red hat email | 13:11 |
jm1 | marios|ruck, rlandy|mtg: last mail in my inbox is from 3h ago | 13:13 |
jm1 | marios|ruck, rlandy|mtg: which is not normal | 13:13 |
rlandy|mtg | jm1: marios|ruck; ack - seems like an issue or lag | 13:16 |
rlandy|mtg | thanks for checking | 13:16 |
rlandy|mtg | last email 6:19 | 13:17 |
rlandy|mtg | confirmed 3 hours ago | 13:17 |
marios|ruck | thanks jm1 seems same for me (~3 hour ago last email) | 13:24 |
jm1 | marios|ruck, rlandy|mtg: we just got an emergency alert email | 13:24 |
marios|ruck | jm1: ehm... where? i mean we don't have email :) | 13:25 |
jm1 | marios|ruck: oh, right. i got it to my personal email.. no idea why | 13:27 |
marios|ruck | jm1: ack is it isaac alert thing maybe | 13:27 |
marios|ruck | hope nothing serious happened if so... | 13:28 |
jm1 | marios|ruck, rlandy|mtg: send you the incident link as pm | 13:29 |
marios|ruck | thanks jm1 | 13:29 |
rlandy|mtg | thanks | 13:30 |
marios|ruck | dasm|ruck: rdo/gerrit back fyi | 13:43 |
dasm|ruck | ack | 13:44 |
dviroel | yep, service-now says mail outage - started at 9:19 | 13:47 |
rlandy|mtg | hey - sorry missed community call | 14:02 |
rlandy|mtg | anything happen?? | 14:02 |
*** rlandy|mtg is now known as rlandy | 14:02 | |
marios|ruck | rlandy|mtg: nah we dropped in 5 mins no topics | 14:02 |
rlandy | marios|ruck: arxcruz: https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset035-master/498eb76/logs/undercloud/var/log/tempest/stestr_results.html.gz :( | 14:19 |
rlandy | with reduced workers concurrency | 14:20 |
arxcruz | rlandy that's a different problem, either there is a firewall rule blocking the connection to the identity service (keystone) or the service did not start | 14:20 |
arxcruz | it's a bug | 14:21 |
marios|ruck | rlandy: i dont think it will help us much, because a lot of the issues are not tempest (a lot are but quite a few arent)...e.g. latest example from the master line https://review.rdoproject.org/r/c/testproject/+/42518/2#message-6725e09951c0779190e83eb4844ba3482eb9fe1f - only one of those is tempest | 14:21 |
arxcruz | for ipv6 | 14:21 |
rlandy | marios|ruck: re: arxcruz's comment above | 14:21 |
rlandy | can you confirm that for ipv6? | 14:21 |
marios|ruck | rlandy: well if we can get some of those consistently yeah lets file a bug but not really seeing that yet | 14:22 |
marios|ruck | it seems like different issues each time | 14:22 |
marios|ruck | e.g. fs35 in latest master fails on provision (https://review.rdoproject.org/r/c/testproject/+/42518/2#message-6725e09951c0779190e83eb4844ba3482eb9fe1f ) | 14:22 |
arxcruz | marios|ruck rlandy urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13000): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f3e9f60cd90>: Failed to establish a new connection: [Errno 111] Connection refused')) | 14:23 |
arxcruz | it's not being able to connect | 14:24 |
arxcruz | so, all tests fails | 14:24 |
arxcruz | marios|ruck rlandy the other tests marked as passed are actually skipped | 14:26 |
arxcruz | {0} setUpClass (tempest.api.compute.admin.test_floating_ips_bulk.FloatingIPsBulkAdminTestJSON) ... SKIPPED: nova-network is gone | 14:26 |
marios|ruck | thanks arxcruz | 14:26 |
rlandy | arxcruz: marios|ruck: ok - so that should be a master fs035 bug only | 14:27 |
rlandy | fs001 and other releases still have a chance | 14:27 |
dasm|ruck | brb | 14:27 |
arxcruz | rlandy marios|ruck seems to be amq server | 14:28 |
arxcruz | 2022-05-03 13:40:24.509 16 ERROR oslo.messaging._drivers.impl_rabbit [-] [ea0597e8-f3e2-49d4-8646-f633653e2527] AMQP server on overcloud-controller-1.internalapi.localdomain:5672 is unreachable: <RecoverableConnectionError: unknown error>. Trying again in 1 seconds.: amqp.exceptions.RecoverableConnectionError: <RecoverableConnectionError: unknown error> | 14:28 |
marios|ruck | arxcruz: rlandy: k i am not going to file a bug for that until we start seeing it consistently though things are not stable enough to call it right now | 14:28 |
rlandy | k | 14:28 |
marios|ruck | rlandy: i've seen fs35 fail on 3 different things today and not 2 of those yet | 14:28 |
arxcruz | rlandy marios|ruck yes, haproxy is down | 14:29 |
arxcruz | https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset035-master/498eb76/logs/overcloud-controller-0/var/log/containers/haproxy/haproxy.log.txt.gz | 14:29 |
rlandy | dasm|ruck: ^^ pls read back on all this | 14:29 |
rlandy | re: tempest investigation | 14:29 |
arxcruz | May 3 13:15:52 overcloud-controller-0 haproxy[7]: Server swift_proxy_server_be/overcloud-controller-2.storage.localdomain is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 3ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue. | 14:29 |
marios|ruck | feels like it could be more related to the nodes themselves as discussed earlier | 14:29 |
marios|ruck | thanks arxcruz :D | 14:29 |
marios|ruck | dasm|ruck: lets do a sync in half hour if you available? | 14:29 |
arxcruz | marios|ruck so, i can tell the issue is not related to tempest, it's actually related to the installation / network setup | 14:29 |
arxcruz | either there is a firewall rule blocking | 14:30 |
marios|ruck | arxcruz: ack | 14:30 |
arxcruz | or some service weren't installed properly | 14:30 |
arxcruz | or the network where the ovb vm's were deployed is not working properly | 14:30 |
dasm|ruck | back | 14:38 |
dasm|ruck | marios|ruck: sure | 14:38 |
rlandy | marios|ruck: so all the fs035 jobs could probably get bug'ed | 14:48 |
marios|ruck | rlandy: what do you mean | 14:50 |
marios|ruck | rlandy: going to sync in 10 with dasm if available? | 14:50 |
rlandy | all releases are failing | 14:50 |
rlandy | ack - will join then | 14:50 |
marios|ruck | rlandy: right but random things...e g. just saw one for train like that: https://logserver.rdoproject.org/62/42462/2/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-train/3dabaa0/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz | 14:51 |
marios|ruck | rlandy: FATAL | Check Keystone user assignment to roles status | undercloud | item=cinderv3 | error={"ansible_job_id": "965216880256.483604", "ansible_loop_var": | 14:51 |
marios|ruck | rlandy: so what can we file this is madness | 14:51 |
marios|ruck | i mean the sheer number of different things | 14:51 |
marios|ruck | i have seen today | 14:51 |
marios|ruck | log files are all starting to blur into one | 14:51 |
rlandy | I'd follow fs035 train | 14:52 |
marios|ruck | rlandy: that's why i think it may be some performance issue i mean 'we need bigger machines' but i have no evidence yet | 14:52 |
rlandy | it was stable at one point then stopped | 14:52 |
rlandy | marios|ruck: compare: | 14:52 |
rlandy | https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-train&result=success | 14:52 |
rlandy | vs | 14:52 |
rlandy | https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-train&result=success | 14:53 |
marios|ruck | rlandy: sure but we cant track/file something if each run is different error | 14:53 |
marios|ruck | i mean once we have 2 the same | 14:53 |
marios|ruck | i'll file it | 14:53 |
marios|ruck | :D | 14:53 |
marios|ruck | rlandy: those links are both train did you mean to point to something else? | 14:53 |
rlandy | somewhere on 03/26 we started to die consistently | 14:53 |
rlandy | no - | 14:53 |
rlandy | I am talking about two different status links on the same train job | 14:54 |
marios|ruck | 17:52 < rlandy> https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-train&result=success | 14:54 |
marios|ruck | 17:52 < rlandy> vs | 14:54 |
marios|ruck | 17:53 < rlandy> https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-train&result=success | 14:54 |
marios|ruck | rlandy: same links? | 14:54 |
rlandy | https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-train&result=failure | 14:54 |
rlandy | sorry - ^^ that | 14:54 |
marios|ruck | ah k looking | 14:54 |
marios|ruck | rlandy: right | 14:54 |
marios|ruck | rlandy: dasm|ruck: https://meet.google.com/dqv-rjoq-gmr | 14:59 |
dasm|ruck | k | 14:59 |
dasm|ruck | joining | 14:59 |
rlandy | joining | 15:00 |
marios|ruck | dasm|ruck: master https://review.rdoproject.org/r/c/testproject/+/42518 | 15:31 |
rlandy | marios|ruck: emails are back | 15:43 |
marios|ruck | rlandy: thanks, did you get the one i sent to concilium with the timesheet? (so i know not to re-send it tomorrow :)) | 15:44 |
marios|ruck | rlandy: i see the one you sent to me as well now (the one that was missing earlier) | 15:45 |
rlandy | marios|ruck: yes - I have that one | 15:46 |
marios|ruck | rlandy: thanks | 15:46 |
rlandy | marios|ruck: arxcruz: dasm|ruck: same tempest disaster with concurrency set to 2: https://logserver.rdoproject.org/91/42491/1/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/626ac5d/logs/undercloud/var/log/tempest/stestr_results.html.gz | 15:49 |
dasm|ruck | k | 15:49 |
marios|ruck | rlandy: ack | 15:49 |
* marios|ruck going to dream about tempest tonight | 15:49 | |
rlandy | tempest.lib.exceptions.IdentityError: Got identity error | 15:49 |
dasm|ruck | marios|ruck: it's going to be a bad dream? | 15:49 |
dasm|ruck | or even nightmare? | 15:49 |
rlandy | arxcruz: ^^ do we need the same key changes we had for octavia? | 15:49 |
marios|ruck | right dasm|ruck | 15:50 |
rlandy | train c8 | 15:50 |
rlandy | featureset001-train | 15:50 |
rlandy | so no ipv6 | 15:50 |
rlandy | tempest.lib.exceptions.IdentityError: Got identity error | 15:52 |
rlandy | https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby/2eef1a4/logs/undercloud/var/log/tempest/stestr_results.html.gz | 15:52 |
rlandy | same error in wallaby c8 | 15:52 |
rlandy | periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby | 15:52 |
rlandy | we may have legit c8 errors | 15:53 |
*** dviroel is now known as dviroel|lunch | 15:53 | |
rlandy | marios|ruck: dasm|ruck: ^^ c8 shows this | 15:54 |
marios|ruck | rlandy: k will check more tomorrow - may be there are some legit bugs in there, as well ;) | 15:55 |
rlandy | fs001 and fs035 | 15:55 |
rlandy | waiting to see if arxcruz is still around | 15:55 |
rlandy | if he can comment on that | 15:55 |
marios|ruck | ack | 15:56 |
* marios|ruck has to go | 15:57 | |
marios|ruck | will pickup tomorrow bai | 15:57 |
*** marios|ruck is now known as marios|out | 15:57 | |
arxcruz | rlandy checking | 16:00 |
rlandy | arxcruz: can we chat for a few? | 16:00 |
arxcruz | yes | 16:00 |
rlandy | arxcruz: https://meet.google.com/vkq-dxan-ftb?pli=1&authuser=0 | 16:00 |
rlandy | dasm|ruck: you can join if you like | 16:01 |
arxcruz | rlandy https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby/2eef1a4/logs/overcloud-controller-0/var/log/containers/haproxy/haproxy.log.txt.gz | 16:03 |
arxcruz | https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset035-master/498eb76/logs/undercloud/var/log/tempest/stestr_results.html.gz | 16:06 |
arxcruz | rlandy ^ | 16:06 |
rlandy | arxcruz: https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-wallaby/9967e52/logs/undercloud/var/log/tempest/stestr_results.html.gz | 16:09 |
arxcruz | dasm|ruck we start here: https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset035-master/498eb76/logs/undercloud/var/log/tempest/stestr_results.html.gz | 16:29 |
arxcruz | dasm|ruck then I came here https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset035-master/498eb76/logs/overcloud-controller-0/var/log/containers/haproxy/haproxy.log.txt.gz | 16:31 |
arxcruz | and found May 3 13:15:50 overcloud-controller-0 haproxy[7]: Server cinder_be/overcloud-controller-0.internalapi.localdomain is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue. | 16:31 |
arxcruz | this is for centos 9 | 16:31 |
arxcruz | for centos 8: | 16:31 |
dasm|ruck | rlandy: fyi, this is overall pass rate for last 1k tests: https://paste.opendev.org/show/bqMe8dzIzOyTowakeCzm/ | 16:32 |
arxcruz | dasm|ruck first we start here: | 16:32 |
dasm|ruck | now, i'm looking into when it started failing first time | 16:32 |
arxcruz | https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset035-master/498eb76/logs/undercloud/var/log/tempest/stestr_results.html.gz | 16:32 |
dasm|ruck | rlandy: in 15 mins i'll need to leave for about 1h. Doc's app. | 16:33 |
rlandy | dasm|ruck: ok - ping me when you leave with what you have and I'll log the bug | 16:33 |
arxcruz | dasm|ruck sorry, centos 8 we start here: | 16:33 |
arxcruz | https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-wallaby/9967e52/logs/undercloud/var/log/tempest/stestr_results.html.gz | 16:33 |
dasm|ruck | rlandy: k | 16:33 |
arxcruz | then i check here:https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-wallaby/9967e52/logs/overcloud-controller-0/var/log/containers/keystone/keystone.log.txt.gz | 16:35 |
arxcruz | and got this | 16:35 |
arxcruz | 2022-05-03 14:27:43.142 169 ERROR oslo_db.sqlalchemy.engines [req-d6926730-7334-4db4-9cca-c4760d5dbb71 - - - - -] Database connection was found disconnected; reconnecting: oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') | 16:35 |
* rlandy starting bug | 16:35 | |
arxcruz | then i check this: | 16:35 |
arxcruz | https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-wallaby/9967e52/logs/overcloud-controller-0/var/log/containers/haproxy/haproxy.log.txt.gz | 16:35 |
rlandy | arxcruz: will send to you for review | 16:35 |
arxcruz | and got this May 3 13:24:01 overcloud-controller-0 haproxy[13]: Backup Server mysql/overcloud-controller-0.internalapi.localdomain is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 2 backup servers left. Running on backup. 0 sessions active, 0 requeued, 0 remaining in queue. | 16:35 |
arxcruz | rlandy ok | 16:35 |
dasm|ruck | rlandy: fyi, i'm afk. i'll be back in ~1h | 16:45 |
rlandy | k - bug in progress | 16:45 |
rlandy | just looking for c9 logs | 16:45 |
dasm|ruck | i pulled up initial stats. i'll refine them after coming back | 16:45 |
dasm|ruck | it looks like fs002 is in a good shape | 16:46 |
dasm|ruck | s/good/better | 16:46 |
dasm|ruck | than others | 16:46 |
dasm|ruck | but overall, ovb is in a miserable shape | 16:46 |
rlandy | https://bugs.launchpad.net/tripleo/+bug/1971465 | 16:48 |
dasm|ruck | k | 16:48 |
rlandy | arxcruz: dasm|ruck: ^^ we have a start | 16:48 |
rlandy | editing bug to add stats | 16:48 |
dasm|ruck | rlandy: you want me to add notes to this bug or create a separate one to track general ovb failures? | 16:49 |
dasm|ruck | bbl | 16:49 |
rlandy | adding that info | 16:49 |
*** dasm|ruck is now known as dasm|ruck|bbl | 16:50 | |
arxcruz | rlandy can explain that is not a tempest issue, but the fail in the connectivity between the controllers / haproxy | 16:50 |
rlandy | arxcruz: yep - adding more info now | 16:50 |
rlandy | just wanted to keep description short enough to read | 16:50 |
rlandy | arxcruz: does https://bugs.launchpad.net/tripleo/+bug/1971465 cover it? | 16:52 |
arxcruz | rlandy yes | 16:53 |
rlandy | arxcruz: dasm|ruck|bbl: going to start by pinging the DF for some help in getting debug direction | 16:53 |
*** dviroel|lunch is now known as dviroel | 16:56 | |
arxcruz | ok | 16:58 |
*** jpena is now known as jpena|off | 17:00 | |
*** dasm|ruck|bbl is now known as dasm|ruck | 18:01 | |
* dasm|ruck is back | 18:01 | |
rlandy | dasm|ruck: pls see conversation on #tripleo | 18:06 |
rlandy | trying to add swap to overcloud nodes | 18:06 |
rlandy | dasm|ruck: bug logged at https://bugs.launchpad.net/tripleo/+bug/1971465 | 18:07 |
dasm|ruck | k | 18:11 |
dasm|ruck | rlandy: did you retrigger ovb jobs on https://review.opendev.org/c/openstack/tripleo-quickstart/+/840283 ? | 18:12 |
dasm|ruck | do you want me to do so? | 18:13 |
rlandy | dasm|ruck: already done | 18:17 |
dasm|ruck | k | 18:17 |
rlandy | waiting for rhos-17 on rhel-9 to complete as well | 18:17 |
dasm|ruck | ack | 18:19 |
rlandy | dasm|ruck: fs039 has been failing as well - maybe an install issue still - you can check into that | 18:20 |
dasm|ruck | yep | 18:20 |
* rlandy back to 16.2 base image work | 18:20 | |
dasm|ruck | > periodic-tripleo-ci-centos-9-ovb-1ctlr_1comp-featureset002-master started passing | 18:46 |
dasm|ruck | brb | 18:59 |
dasm|ruck | back | 19:06 |
rlandy | dviroel: dasm|ruck: pls vote/review so I don't just self merge https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/406630 | 20:04 |
rlandy | no idea if it's right or wrong before I merge it :( | 20:04 |
rlandy | config | 20:04 |
dasm|ruck | checking | 20:04 |
dviroel | looks correct I think | 20:06 |
dasm|ruck | rlandy: hmm.. to me it looks like this secret isn't even used | 20:07 |
dasm|ruck | https://sf.hosted.upshift.rdu2.redhat.com/codesearch/?q=registry_redhat_io&i=nope&files=&repos= | 20:07 |
dasm|ruck | but i might be wrong. | 20:07 |
rlandy | dasm|ruck: I am going to add the usage now | 20:07 |
dasm|ruck | it would mean, we can merge it, and it won't break anything | 20:07 |
dasm|ruck | ack | 20:07 |
rlandy | busy coding that change | 20:07 |
rlandy | ack | 20:07 |
rcastillo | lgtm too | 20:07 |
rlandy | it needs to be merged to reference it | 20:07 |
rlandy | thank you | 20:07 |
dasm|ruck | k | 20:08 |
dasm|ruck | dviroel: wanna pull the trigger? rcastillo and myself technically have this superpower too | 20:08 |
dviroel | done | 20:10 |
rlandy | thanks | 20:15 |
rlandy | adding secret usage in next review | 20:15 |
dasm|ruck | k | 20:16 |
rlandy | https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/406631 | 20:23 |
rlandy | ^^ usage | 20:23 |
rlandy | dasm|ruck: going to w+ the train patch | 20:57 |
rlandy | will revert | 20:58 |
dasm|ruck | ack | 20:59 |
dasm|ruck | yours is still running. currently on tempest, rlandy | 20:59 |
rlandy | yep I know | 20:59 |
rlandy | wallaby c9 fs001 and fs035 passed | 20:59 |
dasm|ruck | \o/ | 21:00 |
rlandy | not with swap | 21:00 |
rlandy | just generally | 21:00 |
dasm|ruck | oh | 21:00 |
dasm|ruck | i was checking ovb jobs. in last few hours they seem to be in better shape. | 21:01 |
dasm|ruck | they've started passing | 21:01 |
dasm|ruck | rlandy: cs8 train - disable fs001 & fs035 got merged. I'm gonna kickstart promotion. | 21:28 |
* dviroel out o/ | 21:30 | |
*** dviroel is now known as dviroel|out | 21:30 | |
dasm|ruck | renning | 21:30 |
dasm|ruck | *running | 21:30 |
dasm|ruck | dviroel|out: take care o/ | 21:30 |
rcastillo | dviroel|out: o/ | 21:30 |
dviroel|out | tks o/ | 21:30 |
rlandy | cool | 22:00 |
rlandy | timed out | 22:01 |
rlandy | did not solve the issue | 22:01 |
rlandy | dasm|ruck: the promotion runs on its own | 22:02 |
rlandy | you don't have to start anything | 22:02 |
rlandy | http://promoter.rdoproject.org/promoter_logs/container-push/ | 22:03 |
rlandy | you rekicked the line??? | 22:03 |
dasm|ruck | i did. a while ago | 22:08 |
dasm|ruck | did i break something, rlandy ? | 22:09 |
dasm|ruck | ugh. time out on periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-train | 22:20 |
dasm|ruck | bbl | 22:21 |
rlandy | dasm|ruck: no | 22:21 |
rlandy | but you restarted the like | 22:21 |
rlandy | line | 22:21 |
rlandy | that has nothing to do with the promotion | 22:21 |
rlandy | dasm|ruck: I can explain the difference | 22:22 |
rlandy | dasm|ruck: train promoted - revert test skip | 22:24 |
rlandy | reverted | 22:24 |
rlandy | wow - we have a long gate | 22:24 |
rlandy | https://logserver.rdoproject.org/55/36255/67/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-wallaby/e61020e/logs/undercloud/var/log/tempest/stestr_results.html.gz | 22:25 |
rlandy | only one test failed here :) | 22:25 |
rlandy | putting in a separate patch with swap and no concurrency change | 22:26 |
dasm|ruck | ack | 22:38 |
dasm|ruck | ok, i see cs8 train promoted. | 22:40 |
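
The haproxy triage arxcruz walks through in the log (open `haproxy.log` for a failed job and look for `is DOWN` lines) could be scripted roughly as follows. This is a minimal sketch, assuming the log lines follow the shape of the three haproxy lines quoted in the channel. The regex, the function names `summarize_down_events` and `backends_fully_down`, and the `SAMPLE` list are illustrative only and are not part of any TripleO or haproxy tooling.

```python
import re
from collections import defaultdict

# Assumption: DOWN events look like the three haproxy lines quoted in the channel.
DOWN_RE = re.compile(
    r"(?:Backup )?Server (?P<backend>[^/\s]+)/(?P<server>\S+) is DOWN, "
    r"reason: (?P<reason>[^,]+), .*?"
    r"(?P<active>\d+) active and (?P<backup>\d+) backup servers left"
)


def summarize_down_events(lines):
    """Group DOWN events by backend.

    Returns {backend: [(server, reason, active_left, backup_left), ...]}.
    Lines that do not match the assumed shape are ignored.
    """
    events = defaultdict(list)
    for line in lines:
        m = DOWN_RE.search(line)
        if m:
            events[m["backend"]].append(
                (m["server"], m["reason"], int(m["active"]), int(m["backup"]))
            )
    return dict(events)


def backends_fully_down(events):
    """Backends whose latest DOWN event reports no active and no backup servers."""
    return sorted(
        backend
        for backend, evs in events.items()
        if evs[-1][2] == 0 and evs[-1][3] == 0
    )


# The three haproxy lines pasted in the channel, copied as-is.
SAMPLE = [
    'May 3 13:15:52 overcloud-controller-0 haproxy[7]: Server swift_proxy_server_be/overcloud-controller-2.storage.localdomain is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 3ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.',
    'May 3 13:15:50 overcloud-controller-0 haproxy[7]: Server cinder_be/overcloud-controller-0.internalapi.localdomain is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.',
    'May 3 13:24:01 overcloud-controller-0 haproxy[13]: Backup Server mysql/overcloud-controller-0.internalapi.localdomain is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 2 backup servers left. Running on backup. 0 sessions active, 0 requeued, 0 remaining in queue.',
]

if __name__ == "__main__":
    events = summarize_down_events(SAMPLE)
    for backend, evs in events.items():
        for server, reason, active, backup in evs:
            print(f"{backend}: {server} DOWN ({reason}); {active} active, {backup} backup left")
    print("fully down:", backends_fully_down(events))
```

On the quoted lines this flags only `swift_proxy_server_be` as having no servers left, while `cinder_be` still has two active servers and `mysql` is running on backups. That fits arxcruz's reading that the failures come from connectivity between the controllers and haproxy rather than from tempest itself, though three lines are far too few to draw a conclusion on their own.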