*** yadnesh|away is now known as yadnesh | 04:33 | |
*** marios is now known as marios|ruck | 06:04 | |
marios|ruck | o/ | 06:04 |
---|---|---|
pojadhav | marios|ruck, hey good morning! | 06:32 |
pojadhav | can you please check when you are free: https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/46045 | 06:33 |
marios|ruck | o/ po | 06:33 |
marios|ruck | pojadhav: will do | 06:33 |
pojadhav | marios|ruck, thank you.. will talk more on this when you have look on that. | 06:33 |
jm1 | o/ | 08:08 |
*** amoralej|off is now known as amoralej | 08:11 | |
*** jpena|off is now known as jpena | 08:23 | |
bhagyashris | marios|ruck, hey job is passing https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/build/2f676356d17f4356987ba56df24162d6 | 08:32 |
bhagyashris | and for 8 and 9 it's taking tripleo-ci-testing hash | 08:32 |
marios|ruck | bhagyashris: yeah i know but it is still not reporting correctly and not using the ci-config depends on | 08:33 |
marios|ruck | bhagyashris: i need to dig some more today there | 08:33 |
marios|ruck | bhagyashris: it creates the hash_info.sh correctly though - we now have that file so at least it does use the tripleo-ci depends-on | 08:34 |
akahat | o/ | 08:34 |
bhagyashris | marios|ruck, yes | 08:34 |
rlandy | chandankumar: are the jira issues you forwarded spikes or tasks? | 08:44 |
*** ysandeep|out is now known as ysandeep | 08:54 | |
* bhagyashris lunch brb | 09:22 | |
*** dviroel|biab is now known as dviroel | 09:36 | |
*** dviroel is now known as dviroel|doc-appt | 09:52 | |
*** ysandeep is now known as ysandeep|out | 10:43 | |
*** dviroel|doc-appt is now known as dviroel | 11:24 | |
arxcruz | dpawlik, i think collect-logs job is broken https://e8c27e1b50f6503906ee-e4b92f5e1fb788c5da82bf1447ffa549.ssl.cf5.rackcdn.com/859401/4/check/tox-ansible-test-sanity/677bf58/job-output.txt | 11:34 |
rlandy | dviroel: hey - good morning | 11:35 |
rlandy | dviroel: attila will ping you re: hive | 11:36 |
dviroel | rlandy: hi, ok | 11:48 |
marios|ruck | bhagyashris: can you please help me sanity check https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/434855 and we can merge and try if you agree | 11:58 |
marios|ruck | bhagyashris: when you have time please ^^ | 11:58 |
bhagyashris | looking | 11:58 |
bhagyashris | marios|ruck, testing ... | 12:04 |
marios|ruck | bhagyashris: how | 12:04 |
marios|ruck | bhagyashris: its config repo so we have to merge it to test | 12:04 |
bhagyashris | marios|ruck, ohh i forgot that point | 12:04 |
marios|ruck | bhagyashris: cant think of sthing else. the file *is* correct! | 12:05 |
marios|ruck | bhagyashris: and this is what it runs: https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/roles/dlrn-report/tasks/dlrn-report-results.yml | 12:05 |
marios|ruck | bhagyashris: source {{ workspace }}/hash_info.sh | 12:05 |
marios|ruck | bhagyashris: but see logs | 12:05 |
marios|ruck | 2022-11-09 16:09:05.858659 | primary | + source /home/zuul/workspace/hash_info.sh | 12:05 |
marios|ruck | 2022-11-09 16:09:05.858668 | primary | ++ export DLRNAPI_URL=https://osp-trunk.hosted.upshift.rdu2.redhat.com/api-rhel9-osp17-1 | 12:05 |
marios|ruck | bhagyashris: but the file itself | 12:05 |
marios|ruck | contains *8* | 12:05 |
marios|ruck | bhagyashris: https://sf.hosted.upshift.rdu2.redhat.com/logs/44/434544/4/check/periodic-tripleo-ci-rhel-9-8-multinode-mixed-os-rhos-17.1/2f67635/logs/undercloud/home/zuul/workspace/hash_info.sh | 12:06 |
marios|ruck | bhagyashris: export DLRNAPI_URL="https://osp-trunk.hosted.upshift.rdu2.redhat.com/api-rhel8-osp17-1" | 12:06 |
marios|ruck | :/ | 12:06 |
marios|ruck | bhagyashris: see what i mean? ^^ | 12:06 |
bhagyashris | yeah got your point | 12:08 |
bhagyashris | it should take 8 | 12:08 |
bhagyashris | marios|ruck, ^ | 12:08 |
dpawlik | arxcruz: hey, is it for me? | 12:11 |
bhagyashris | marios|ruck, looks ok for me https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/434855 let's try... | 12:11 |
arxcruz | dpawlik, i think i found the issue, since the nodes were updated to the new ubuntu version, it doesn't have python 3.8 anymore, i'm doing some tests now | 12:11 |
dpawlik | arxcruz: ack | 12:12 |
dpawlik | arxcruz: still, would be easier to debug in the future when some tasks will be moved to run instead of pre-run | 12:12 |
marios|ruck | bhagyashris: k i set workflow | 12:16 |
marios|ruck | bhagyashris: please rerun the test once it merges lets see | 12:16 |
marios|ruck | bhagyashris: next step is to hold a node ;) | 12:17 |
bhagyashris | marios|ruck, yeah ;) | 12:19 |
marios|ruck | bhagyashris: maybe we can get a hold anyway? | 12:23 |
bhagyashris | marios|ruck, ok | 12:24 |
marios|ruck | bhagyashris: you have to add the force fail var before rerun for that please? | 12:25 |
marios|ruck | bhagyashris: then we can check tomorrow it would be helpful | 12:25 |
marios|ruck | bhagyashris: force_job_failure | 12:26 |
marios|ruck | force_job_failure: true | 12:26 |
bhagyashris | ok | 12:26 |
bhagyashris | marios|ruck, is your key added in the downstream | 12:27 |
marios|ruck | bhagyashris: should be but as long as yours is then we are ok you can add me later | 12:27 |
bhagyashris | yeah | 12:27 |
bhagyashris | we can do that as well | 12:28 |
bhagyashris | marios|ruck, looks like patch https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/434855 is stuck i dont see that in gate | 12:38 |
bhagyashris | i think will need to abandon and restore and then +2 and +W on it | 12:39 |
marios|ruck | bhagyashris: added +2 let me see | 12:39 |
marios|ruck | bhagyashris: in gate | 12:39 |
marios|ruck | waiting to run linters | 12:40 |
bhagyashris | ack | 12:40 |
amoralej | marios|ruck, wrt https://review.opendev.org/c/openstack/tripleo-quickstart/+/863813 would you mind if i remove the TODO in follow-up? | 12:41 |
amoralej | i need it to finish the Zed release | 12:41 |
amoralej | I've tested it in https://review.rdoproject.org/r/c/rdoinfo/+/45965 btw | 12:41 |
marios|ruck | amoralej: ack np added +2 thanks | 12:41 |
amoralej | rlandy, would you mind to review https://review.opendev.org/c/openstack/tripleo-quickstart/+/863813 when you have a chance? | 12:43 |
rlandy | amoralej: looks ok - done on review | 12:44 |
amoralej | may i get +W ? :) | 12:45 |
rlandy | done | 12:46 |
amoralej | thanks rlandy ! | 12:46 |
*** yadnesh is now known as yadnesh|away | 12:47 | |
*** amoralej is now known as amoralej|lunch | 12:53 | |
marios|ruck | scrum | 13:00 |
marios|ruck | anyone want to join us | 13:00 |
marios|ruck | soniya29: o/ scrum time | 13:00 |
* pojadhav stepping out.. will back in hour | 13:19 | |
* jm1 taking a break, will be back later | 13:22 | |
marios|ruck | rlandy: can you please have a look at alternative criteria & promoter | 13:26 |
marios|ruck | rlandy: adaac75f69ae93d6ae76ee320b90dc0a hash in http://promoter.rdoproject.org/promoter_logs/centos9_master_2022-11-10T08:20.log | 13:26 |
marios|ruck | rlandy: passing fs1 but missing fs1-internal | 13:26 |
marios|ruck | rlandy: and it is not promoting this hash as a result | 13:26 |
rlandy | will look in a few | 13:27 |
akahat | marios|ruck, rlandy i'm running failed job ^^ let's see: Change-Id: I2bf1f409ddc051c812d2c7a33df0bdfb988798df | 13:34 |
marios|ruck | akahat: you mean the internal job? | 13:35 |
akahat | marios|ruck, yeah. | 13:35 |
marios|ruck | akahat: k thanks as workaround for now | 13:35 |
rlandy | marios|ruck: checked the code - alt job should only be added if the actual job failed and it passed | 14:03 |
rlandy | fs035 was never added | 14:03 |
rlandy | wonder if there is an old config | 14:03 |
rlandy | we were added fs001-internal as the actual criteria | 14:03 |
rlandy | workaround | 14:04 |
rlandy | marios|ruck: ^^ remove the alternative criteria from criteria file | 14:04 |
marios|ruck | rlandy: ok thanks for checking, will have a look on promoter then in a minute (we have a new jenkins bug) | 14:04 |
rlandy | quicker merge than job rerun | 14:04 |
marios|ruck | rlandy: yeah and if the promoter is OK i'll post the criteria remove | 14:04 |
marios|ruck | rlandy: yeah +1 i was thinking the same | 14:05 |
rlandy | easiest way around | 14:05 |
dasm|off | o/ | 14:09 |
marios|ruck | rcastillo|rover: can you please post that and i'll merge it? | 14:09 |
*** dasm|off is now known as dasm | 14:09 | |
marios|ruck | rcastillo|rover: remove fs1-internal from alternative criteria for master | 14:09 |
rcastillo|rover | marios|ruck: sure | 14:09 |
marios|ruck | so we can promote adaac75f69ae93d6ae76ee320b90dc0a | 14:09 |
marios|ruck | thanks rcastillo|rover | 14:09 |
rcastillo|rover | marios|ruck: https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/46056 | 14:12 |
marios|ruck | rcastillo|rover: rlandy: criteria looks OK on the promoter itself | 14:13 |
marios|ruck | periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-master: | 14:13 |
marios|ruck | - periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-internal-master | 14:13 |
marios|ruck | and no other instances of that 001-internal in the criteria | 14:13 |
rlandy | marios|ruck: yeah - idk - we have had alt criteria fail before | 14:14 |
marios|ruck | rcastillo|rover: thanks will merge that | 14:14 |
rlandy | and not report missing | 14:14 |
rlandy | marios|ruck: the rr tool is ok | 14:14 |
rlandy | dasm helped with that code | 14:14 |
rlandy | maybe he can help there | 14:14 |
marios|ruck | rlandy: i think this is the least common case, where the primary criteria job passes fs1 and the -internal fs1 fails | 14:15 |
rlandy | fs035 is the same | 14:15 |
dasm | ? what's going on? | 14:15 |
rlandy | in the same hash | 14:15 |
rlandy | marios|ruck: ^^ why didn't fs035-internal report missing | 14:15 |
marios|ruck | weird | 14:15 |
dasm | marios|ruck, any help needed? | 14:17 |
rlandy | dasm: pls see ... | 14:17 |
rlandy | <marios|ruck> rlandy: adaac75f69ae93d6ae76ee320b90dc0a hash in http://promoter.rdoproject.org/promoter_logs/centos9_master_2022-11-10T08:20.log | 14:17 |
rlandy | <marios|ruck> rlandy: passing fs1 but missing fs1-internal | 14:17 |
marios|ruck | dasm: seems like a bug with alternative_criteria code/promoter... eg hash adaac75f69ae93d6ae76ee320b90dc0a at http://promoter.rdoproject.org/promoter_logs/centos9_master_2022-11-10T08:20.log says fs1-internal missing | 14:17 |
marios|ruck | dasm: but actually fs1 normal job passed yet it hold promotion on that -internal job ^^ | 14:18 |
dasm | hmm.. | 14:18 |
dasm | here is the logic: https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/45468/12/ci-scripts/dlrnapi_promoter/logic.py | 14:19 |
marios|ruck | dasm: thx will have a look there in a bit | 14:21 |
dasm | marios|ruck: i'm checking that too. right now i'm trying to understand if i can run promoter code with some test data | 14:22 |
marios|ruck | dasm: thx i am digging at a different issue now but will revisit that in a bit thanks | 14:29 |
dasm | np | 14:29 |
rlandy | dasm: sending you a test script | 14:30 |
dasm | rlandy: oh, that's gonna be useful | 14:30 |
marios|ruck | jpodivin: o/ can you please check https://bugzilla.redhat.com/show_bug.cgi?id=2141701 | 14:31 |
marios|ruck | jpodivin: is that the right component for validation squad? 'openstack-tripleo-validations'? | 14:31 |
rlandy | dasm: pls check email | 14:32 |
dasm | rlandy: got it, thanks | 14:33 |
marios|ruck | rlandy: rcastillo|rover: manually created that one https://trello.com/c/DcqS7k0P | 14:36 |
marios|ruck | fyi | 14:36 |
marios|ruck | jpodivin: must be something from last 3 days at least that is when this started happening | 14:38 |
rlandy | marios|ruck: why manual - is the script broken? | 14:38 |
marios|ruck | rlandy: bz and i didn't send the email to rhos-dev | 14:38 |
marios|ruck | rlandy: so just created | 14:38 |
marios|ruck | rlandy: script working fine as far as i know for upstream | 14:39 |
rlandy | ok | 14:39 |
dasm | rlandy: marios|ruck for this hash: adaac75f69ae93d6ae76ee320b90dc0a where missing job is "periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-internal-master" the code returns correct value of only fs020-master | 14:41 |
dasm | rlandy: based on your test script | 14:41 |
dasm | checking if we didn't miss anything | 14:41 |
dasm | marios|ruck: i tried ssh-ing to promoter-server but it doesn't allow me into. I'm wondering if the code running there is up to date. | 14:50 |
dasm | based on quick local tests, the logic of particular piece of code seems to be correcvc | 14:50 |
dasm | *correct | 14:50 |
*** amoralej|lunch is now known as amoralej | 14:51 | |
dasm | jm1: can we merge this? https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/46001 | 14:52 |
dasm | jm1: in current shape our infra code is dead in the water. we need to move it forward somewhere. i'm being blocked by your reviews | 14:52 |
dasm | jm1: what i'm working on right now is to make at least something working. your requests are overwhelming changes | 14:53 |
marios|ruck | dasm: i checked on promoter looked sane/correct criteria | 14:54 |
dasm | jm1: these https://review.rdoproject.org/r/q/topic:ansible-inventory won't be needed when i'm finally gonna merge these: https://review.rdoproject.org/r/q/topic:infra_updates | 14:54 |
dasm | jm1: but I can't without starting somewhere | 14:54 |
marios|ruck | rcastillo|rover: merging for now https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/46056 | 14:55 |
dasm | marios|ruck: logs show fs001-internal missing, but local code, with the same set of jobs do not show them. i've no idea what's going on | 14:55 |
rcastillo|rover | I'm looking at the code as well, does look correct to me too | 14:55 |
marios|ruck | rcastillo|rover: can you please revert once master promotes adaac75f69ae93d6ae76ee320b90dc0a | 14:55 |
rcastillo|rover | yeah | 14:55 |
marios|ruck | rcastillo|rover: or if master doesn't promote then still revert the problem is elsewhere then | 14:55 |
marios|ruck | :D | 14:55 |
dasm | marios|ruck: don't even joke like that :) | 14:56 |
marios|ruck | :) | 14:59 |
dasm | marios|ruck: can you ssh to promoter server and give it a try checking if there is latest code? | 14:59 |
dasm | marios|ruck: i can't, ssh do not let me in | 14:59 |
marios|ruck | dasm: http://pastebin.test.redhat.com/1080904 | 15:00 |
dasm | marios|ruck: how about docker image? is it regenerated? | 15:00 |
dasm | marios|ruck: or just add my key over there :) | 15:00 |
marios|ruck | dasm: why is your key not on there in the first place? | 15:01 |
dasm | ¯\_(ツ)_/¯ | 15:02 |
marios|ruck | dasm: k let me check | 15:03 |
pojadhav | marios|ruck, dasm : should we merge this ?? https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/46045 | 15:21 |
marios|ruck | pojadhav: if dasm and dviroel are ok to watch during their day and revert rcastillo|rover you ok with that? | 15:23 |
dasm | pojadhav: i believe so. test instance runs on 9.1.5, prod instance as well. I would imagine (hope so?) it to be compatible | 15:23 |
marios|ruck | potential disruption | 15:23 |
rcastillo|rover | I can keep an eye on it | 15:23 |
pojadhav | marios|ruck, agree | 15:24 |
pojadhav | dasm, marios|ruck : lets merge and monitor.. | 15:24 |
dviroel | pojadhav: marios|ruck: ack, we can revert by our EOD if needed | 15:24 |
dviroel | rcastillo|rover: dasm ^ | 15:24 |
pojadhav | dasm, i think we need to rebuild container to get latest changes right ? | 15:25 |
pojadhav | after merging patch | 15:25 |
jm1 | dasm: merging https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/46001 will break cockpit as commented in the patch | 15:25 |
pojadhav | on prod server or it will do its own ? | 15:25 |
dasm | pojadhav: it should autoreload itself | 15:25 |
pojadhav | dasm, ack | 15:26 |
pojadhav | dasm, marios|ruck rcastillo|rover dviroel : thanks for this discussion.. lets have +w to see results next few hours. | 15:28 |
marios|ruck | pojadhav: done | 15:29 |
pojadhav | thank you marios|ruck | 15:29 |
marios|ruck | rcastillo|rover: cool 16.2 should promote https://code.engineering.redhat.com/gerrit/c/testproject/+/434848/2#message-363dcfef31a480db43d27a388b91f9db878b29f4 | 15:54 |
*** dviroel is now known as dviroel|lunch | 15:54 | |
marios|ruck | rcastillo|rover: please keep eye on http://promoter.rdoproject.org/promoter_logs/centos9_master.log as discussed we expect promotion for adaac75f69ae93d6ae76ee320b90dc0a then you can revert that | 16:13 |
marios|ruck | rcastillo|rover: that i mean https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/46056 | 16:13 |
*** marios|ruck is now known as marios | 16:16 | |
marios | \o/ | 16:16 |
*** marios is now known as marios|out | 16:16 | |
*** marios|out is now known as marios | 16:16 | |
jm1 | marios: o/ | 16:20 |
marios | o/ jm1 | 16:20 |
marios | bhagyashris: nope :( still not right we'll have to dig tomorrow https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/stream/6b69623f31c84635a8b4cf779cdf85c2?logfile=console.log | 16:21 |
marios | bhagyashris: TASK [Wait for hash 574f09a56a390dc21fe4f5a905bdb790 to appear in hash_info.sh] OK -> then source /home/zuul/workspace/hash_info.sh export FULL_HASH=7f29dc67e4ecfc7368558b91f6cef4ac | 16:21 |
marios | bhagyashris: maybe there is another hash_info.sh we should have hold on that node right? we can check tomorrow | 16:22 |
*** marios is now known as marios|out | 16:23 | |
rcastillo|rover | upstream promotion dashboard broken for anyone else? | 16:49 |
rcastillo|rover | dasm ^ | 16:49 |
dasm | checking | 16:52 |
dasm | i see no data | 16:52 |
dasm | gimme few to check | 16:52 |
dasm | this one works tho, rcastillo|rover http://dashboard-ci.tripleo.org/d/JFeJ6Htnk/centos-9-upstream-and-rdo-promotions?orgId=1 | 16:53 |
dasm | so data is there | 16:53 |
rcastillo|rover | ah, right | 16:53 |
dasm | but like you said: upstream seems to be dead | 16:54 |
dasm | hmm.. no queries are issued by grafana dashboard | 16:56 |
*** dviroel|lunch is now known as dviroel | 16:59 | |
dasm | i see the difference, but i don't know what's causing that. staging env is using "influxdbPlugin.js" while our prod doesn't have that | 17:09 |
dasm | continuing investigation | 17:09 |
rcastillo|rover | is prod and staging using the same grafana image? | 17:09 |
dasm | i would assume so | 17:10 |
dasm | rcastillo|rover: actually - no. staging was setup 3 days ago, while prod grafana 7 weeks ago. | 17:11 |
dasm | we have at least one issue in our docker | 17:12 |
dasm | > fatal: [rrcockpit]: FAILED! => {"changed": false, "errors": [], "module_stderr": "", "module_stdout": "Step 1/9 : FROM telegraf:1.8.3\nTrying to pull repository docker.io/library/telegraf | 17:13 |
dasm | rcastillo|rover: i'm gonna see what i can do with that | 17:15 |
rcastillo|rover | weird that the c9 dash works though | 17:15 |
rcastillo|rover | dasm: alright let me know if I can help out | 17:15 |
dasm | i would assume update to upstream dashboard introduced additional changes | 17:17 |
dasm | and those changes aren't part of update, so maybe rebuild of container is needed? | 17:17 |
dasm | idk yet | 17:17 |
*** jpena is now known as jpena|off | 17:21 | |
*** dviroel_ is now known as dviroel | 18:02 | |
dasm | rcastillo|rover: dviroel jm1 reverting pojadhav's patch: https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/45878 It again broke grafana view. More investigation is required. | 18:23 |
dasm | It's confusing, because it changes data source | 18:23 |
dviroel | dasm: yep | 18:23 |
*** amoralej is now known as amoralej|off | 18:47 | |
dviroel | dasm: i will merge the revert, ok? or you still debugging? | 19:01 |
dasm | dviroel: please go ahead. i'm out of ideas what could go wrong. i'm gonna need to setup local instance to debug it | 19:03 |
dviroel | ok | 19:04 |
jm1 | dviroel, dasm: pojadhav asked for more time than last time. but i actually agree with you that time is probably not the reason | 19:04 |
jm1 | *more time for cockpit to fetch data | 19:05 |
dasm | jm1: cockpit fetches data, which can be seen on different views. for some reason data source is misconfigured. | 19:05 |
dasm | no idea why though | 19:05 |
jm1 | dasm: yeah, definitely something odd. couldnt pojadhav debug this live on our upstream cockpit tomorrow? we do not really have to merge and revert and merge and revert and ... | 19:07 |
jm1 | dasm: she has access, so we could just stop ansible-pull and let her try live | 19:07 |
dasm | jm1: that sounds like a good idea | 19:09 |
dviroel | jm1: dasm: in the end, makes no sense to wait for data, the data is there in the database. The issue is that change on datasource, which we don't know why it works on other envs | 19:10 |
dasm | dviroel: yes. strange thing is behavior of grafana staging != grafana prod | 19:11 |
dviroel | yep | 19:11 |
jm1 | dasm, dviroel: where is this staging grafana? | 19:11 |
jm1 | i mean where is it hosted? on one of our vms? | 19:12 |
dasm | jm1: http://10.0.111.235/d/54Nv7HN4z/upstream-and-rdo-promotions?orgId=1 | 19:12 |
jm1 | dasm: cannot log in. what kind of machine is this? | 19:13 |
jm1 | *no ssh login possible | 19:14 |
dasm | jm1: try centos | 19:14 |
jm1 | dasm: yeah no login possible | 19:16 |
jm1 | asks for password | 19:16 |
dasm | i see | 19:16 |
jm1 | maybe its frenzy_friday's staging vm | 19:17 |
dasm | jm1: try again | 19:17 |
dasm | yes, it was staged by frenzy_friday | 19:17 |
frenzy_friday | jm1, dasm yep that is the staging cockpit. Try ssh centos@10.0.111.235 It should work with your ssh keys | 19:19 |
dasm | frenzy_friday: i just added jm1's keys | 19:20 |
jm1 | dasm, frenzy_friday: thank you! | 19:20 |
jm1 | frenzy_friday: ah that one is not running ansible-pull | 19:21 |
jm1 | frenzy_friday: your personal playground or kind-of-production env? | 19:21 |
frenzy_friday | yeah, it is sort of a dev vm | 19:21 |
frenzy_friday | so that people from our team or cre team can test whatever they want there. It is not synced with the actual playbooks | 19:22 |
dasm | frenzy_friday: not anymore :D | 19:22 |
dasm | it's now staging env for the team ^^ | 19:22 |
frenzy_friday | oh, with ansible pull ? | 19:22 |
dasm | no. just saying everyone uses that now ;) | 19:22 |
dasm | it evolved from small, personal VM over to bigger, staging one | 19:23 |
frenzy_friday | oh yep. It is like a dev vm for the team. When I checked last time Pooja's victoria patch works fine there, but on the prod server we are missing data | 19:24 |
dasm | frenzy_friday: yes, it happened again. something is wrong with datasource | 19:24 |
dasm | not sure why | 19:25 |
dasm | frenzy_friday: even after tearing down and respinning docker compose, it's still wrong. | 19:25 |
dasm | frenzy_friday: is it possible that your dev env is different? | 19:25 |
frenzy_friday | dasm, yep! When I try to make changes to any of the dashboards it automatically changes the datasource to influxdb from telegraf (https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/46045/1/ci-scripts/infra-setup/roles/rrcockpit/files/grafana/upstream-and-rdo-promotions.dashboard.json#35) | 19:26 |
frenzy_friday | I have no idea why | 19:26 |
frenzy_friday | The dev env is deployed with ./development_server.sh (which uses docker-compose) | 19:27 |
dasm | hmm | 19:27 |
jm1 | frenzy_friday, dasm: maybe pojadhav's patch is incomplete? i see that a lot of other files have been changed, not only the promotion dashboard | 19:27 |
dasm | jm1: i don't think so | 19:27 |
jm1 | dasm: ssh centos@10.0.111.235 'cd /home/centos/ci-config && git diff --color=always' | 19:29 |
jm1 | dasm: or better ssh centos@10.0.111.235 'cd /home/centos/ci-config && git status' | 19:29 |
jm1 | dasm: let me try that on our upstream cockpit | 19:30 |
frenzy_friday | jm1, are you on a tmux on the upstream cockpit? | 19:32 |
jm1 | frenzy_friday: nope, why? | 19:32 |
frenzy_friday | ok, just wanted to join and see whats happening :D | 19:33 |
jm1 | frenzy_friday: are you playing with upstream cockpit? | 19:33 |
frenzy_friday | nope | 19:33 |
jm1 | frenzy_friday: then i will wait till tomorrow | 19:33 |
jm1 | frenzy_friday: ah ok | 19:33 |
jm1 | frenzy_friday: do you want me to use tmux session or were you about to check something? | 19:34 |
frenzy_friday | jm1, no you can go ahead. Lemme know if you find something. Also, there is one mismatch that I know of, between the dev and prod cockpits - on the dev vm nginx set up always needs a bit of tweaking (/me sends what I mean in a patch) | 19:36 |
jm1 | frenzy_friday: saw that, yeah | 19:37 |
frenzy_friday | https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/46063 (the nginx change we need in the dev vm) | 19:42 |
jm1 | dasm, frenzy_friday: will debug it tomorrow. i am moving in slow motion, its too late.. | 19:51 |
dviroel | dasm: rcastillo|rover: rdo dashboard didn't come back yet | 19:52 |
dasm | dviroel: hmm | 19:53 |
dasm | jm1: ack | 19:53 |
jm1 | dviroel: because i stopped ansible-pull ;) | 19:56 |
dviroel | jm1: :-) | 19:57 |
jm1 | dviroel: restarted ansible-pull, should be updated soon | 19:57 |
dviroel | nice, thanks | 19:57 |
jm1 | dviroel: restarted. will take some time to recover | 20:01 |
dviroel | it is back already | 20:02 |
dviroel | http://dashboard-ci.tripleo.org/d/HkOLImOMk/upstream-and-rdo-promotions?orgId=1 | 20:02 |
dasm | neat | 20:02 |
rcastillo|rover | nice | 20:02 |
jm1 | dasm, dviroel, rcastillo|rover: mtu on that bridge was 1500 instead of 1450, maybe it was related to that | 20:03 |
jm1 | *pojadhav's dashboard issues might be related to that | 20:03 |
jm1 | will debug tomorrow | 20:03 |
* jm1 out for today | 20:03 | |
dasm | i'm not sure about mtu, because other dashboards weren't affected | 20:03 |
dasm | jm1: take care | 20:03 |
jm1 | have a nice evening :) | 20:03 |
rcastillo|rover | jm1: o/ | 20:03 |
dviroel | can't see why mtu would affect, if now is greater than before, there is no problem | 20:10 |
* dviroel going afk | 20:12 | |
*** dviroel is now known as dviroel|afk | 20:12 | |
* dasm => offline | 22:28 | |
dasm | o/ | 22:28 |
*** dasm is now known as dasm|off | 22:28 |
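
Note on the alternative-criteria discussion (14:03 to 14:55): the rule rlandy describes, where an alternative job only counts when the actual job did not pass and the alternative did, can be sketched roughly as below. This is only an illustration under assumptions, not the promoter's `logic.py`: the function name `missing_jobs`, the shortened job names, and the direction of the mapping (required job to its acceptable alternatives) are all hypothetical.

```python
from typing import Dict, Iterable, List


def missing_jobs(
    required: Iterable[str],
    passed: Iterable[str],
    alternatives: Dict[str, List[str]],
) -> List[str]:
    """Return required jobs that neither passed nor have a passing alternative.

    Hypothetical helper: `alternatives` maps a required job name to the
    job names assumed to be acceptable in its place.
    """
    passed_set = set(passed)
    missing = []
    for job in required:
        if job in passed_set:
            # The actual job passed, so alternatives are not consulted.
            continue
        if any(alt in passed_set for alt in alternatives.get(job, [])):
            # The actual job did not pass, but an alternative did.
            continue
        missing.append(job)
    return sorted(missing)


if __name__ == "__main__":
    # Shortened, made-up job names loosely mirroring the case in the log.
    required = ["fs001-internal", "fs020"]
    passed = ["fs001"]
    alternatives = {"fs001-internal": ["fs001"]}
    print(missing_jobs(required, passed, alternatives))
```

With these made-up inputs only `fs020` is reported missing, which matches what dasm saw locally at 14:41; it says nothing about why the promoter log itself reported fs001-internal as missing.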