Tuesday, 2019-10-01

00:16 *** Vorrtex has quit IRC
00:24 *** weshay has joined #oooq
02:04 *** aakarsh has joined #oooq
02:12 *** ykarel|away has joined #oooq
03:23 *** raukadah is now known as chandankumar
03:31 *** dsneddon has quit IRC
03:46 *** dsneddon has joined #oooq
03:53 *** dsneddon has quit IRC
03:55 *** gkadam has joined #oooq
03:57 *** udesale has joined #oooq
04:08 *** gkadam has quit IRC
04:17 *** rfolco has quit IRC
04:26 *** surpatil has joined #oooq
04:27 *** ykarel|away has quit IRC
04:30 *** dsneddon has joined #oooq
04:40 *** weshay has quit IRC
04:48 *** ykarel|away has joined #oooq
04:51 *** ykarel|away is now known as ykarel
05:05 *** jaosorior has joined #oooq
05:07 *** ratailor has joined #oooq
05:13 *** jbadiapa has joined #oooq
05:57 <ykarel> panda, are there still some issues with promoter server? i see no promotion yet
05:59 *** jfrancoa has joined #oooq
06:12 *** surpatil has quit IRC
06:13 *** surpatil has joined #oooq
06:22 *** jbadiapa has quit IRC
06:30 *** jtomasek has joined #oooq
06:55 *** dsneddon has quit IRC
06:57 *** dsneddon has joined #oooq
07:03 *** dsneddon has quit IRC
07:08 *** zbr is now known as zbr|ruck
07:08 <zbr|ruck> ykarel: yeah.
07:14 *** bogdando has joined #oooq
07:15 <ykarel> zbr|ruck, and any fix for it?
07:16 *** tesseract has joined #oooq
07:16 *** tosky has joined #oooq
07:17 *** apetrich has joined #oooq
07:23 *** dtantsur|afk is now known as dtantsur
07:24 *** kopecmartin|off is now known as kopecmartin
07:27 *** jbadiapa has joined #oooq
07:27 *** dsneddon has joined #oooq
07:27 *** recheck has quit IRC
07:28 *** recheck has joined #oooq
07:29 *** soniya29 has joined #oooq
07:31 <zbr|ruck> i got a msg from wes last night, this is what I am looking at.
07:32 <ykarel> okk
07:34 <panda> I really don't understand
07:36 <arxcruz|ruck> ykarel: https://review.opendev.org/#/c/685709/ worked :)
07:36 *** dsneddon has quit IRC
07:38 <panda> zbr|ruck: found anything yet ?
07:40 <panda> zbr|ruck: I really can't find a reason for the different behaviour in the promoter for the docker login
07:41 <zbr|ruck> panda: f***, why are we not testing our code? http://38.145.34.55/centos7_master.log-20190930
07:41 <zbr|ruck> "Pull the images from rdoproject registry"
07:41 *** amoralej|off is now known as amoralej
07:42 <zbr|ruck> and i am not surprised, it has a sudo on it but nobody ever installed the requirements for the root user.
07:43 <panda> zbr|ruck: it always worked, and what do the requirements have to do with getting the wrong token ?
07:43 *** jpena|off is now known as jpena
07:44 <panda> jpena: there he is ! :)
07:45 <zbr|ruck> what wes reported is not what I see in the logs.
07:45  * jpena ducks
07:45 <zbr|ruck> the docker module was not installed, so obviously any ansible docker module would fail
07:45 <panda> zbr|ruck: I'd say even more, dlrnapi_promoter imports configparser (all lowercase)
07:45 <panda> zbr|ruck: which is the name in python3
07:45 <panda> jpena: quack ?
07:45 <panda> zbr|ruck: it doesn't fail
07:46 <panda> zbr|ruck: it works, but gives strange results. I tried also manually, as root, inside and outside the same venv the promotion uses, in ansible and with the cli
07:47 <panda> manually the login command works
07:47 <zbr|ruck> the log is full of "Failed to import docker or docker-py (Docker SDK for Python) - No module named requests.exceptions. Try `pip install docker` or `pip install docker-py` (Python 2.6)"
07:47 <zbr|ruck> it should have NONE
07:47 <zbr|ruck> this is not caused by bad-token/login whatever.
07:47 <panda> zbr|ruck: those were the older logs
07:47 <panda> zbr|ruck: when ansible was installed inside the virtualenv
07:48 <panda> zbr|ruck: we removed it, to use ansible from the package
07:48 <panda> zbr|ruck: look at the most recent logs
07:48 <panda> zbr|ruck: those are the result of the manual attempt
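The "Failed to import docker or docker-py" errors quoted above come from Ansible's docker_* modules, which need the Python Docker SDK importable by the exact interpreter the task runs under; a `sudo` run resolves against root's environment, not the invoking user's virtualenv. A minimal sketch of how to check that (pure stdlib, nothing promoter-specific assumed):

```python
import importlib.util
import sys

def has_docker_sdk() -> bool:
    # The "docker" package (Docker SDK for Python) must be importable by
    # *this* interpreter for ansible docker_* modules to work; under sudo
    # the interpreter resolves against root's site-packages, which may
    # lack packages installed only in a user's virtualenv.
    return importlib.util.find_spec("docker") is not None

# Show which interpreter is in use and whether the SDK is visible to it.
print(sys.executable, has_docker_sdk())
```

Running this both as the regular user and under `sudo` shows whether the two interpreters see different site-packages, which is the situation being debated here.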
07:48 *** ykarel is now known as ykarel|lunch
07:49 <zbr|ruck> panda: last msg is "Promoting the container images for" at ~3am and nothing at all after.
07:49 <panda> zbr|ruck: yes, we disabled the automatic script, and launched manually for the last 10 runs
07:50 <jpena> panda: so what is the current situation with the registry?
07:50 <panda> zbr|ruck: but the last 10 runs don't have dependency errors, the push fails with an authentication error.
07:51 <panda> jpena: we are not able to rule out any possibilities
07:51 <panda> jpena: manual logins work, but in the promotion process they get the bogus token
07:51 <panda> jpena: but even using older code and a completely reinstalled venv, it fails
07:51 <zbr|ruck> panda: can you send me a list of valid/safe arguments/vars to pass to container-push.yml so I can test?
07:53 <jpena> panda: are there any logs from the promoter I can check?
07:54 <panda> jpena: http://38.145.34.55/centos7_master.log-20190930  not sure if they'll be useful, most of them fail with "authentication error" but unless you access the promoter server, it's not clear what's happening
07:55 <panda> jpena: I emailed you the .docker/config.json that contains the token resulting from the docker login there.
07:56 <zbr|ruck> panda: can you please tell me which line to look at?
07:57 <panda> zbr|ruck: for what ?
07:58 <jpena> panda: I understand this is done using an ansible playbook, can you run it with -vvv and see if it provides any meaningful output?
07:58 <zbr|ruck> found the auth one. but here is the deal: "Pull the images from rdoproject registry" works,
07:59 <panda> zbr|ruck: the pull happens before the login
07:59 <panda> zbr|ruck: and the pulls are not authenticated yet
08:00 <jpena> panda: just to check. A couple weeks ago we had to update one of the tripleo tokens in the registry, because it was leaked by a job. Was that changed in the promoter server?
08:00 <panda> jpena: yes. We had promotions since then
08:00 <jpena> panda: oh, ack
08:01 <jpena> the registry authentication has not changed since then, we're planning to do it but we decided to wait a bit
08:01 <zbr|ruck> i had the impression that even for pulls we need auth... so how come the pulls work?
08:02 <panda> jpena: yep, that's what I was afraid to hear from you. I would have liked better an "oh yeah, let me revert this change I made yesterday"
08:02 <jpena> sorry :-/
08:02 *** apetrich has quit IRC
08:02 <panda> jpena: but it's probably a problem on our side
08:02 <panda> zbr|ruck: pulls don't need auth
08:02 <panda> zbr|ruck: yet
08:03 *** jbadiapa has quit IRC
08:04 *** dsneddon has joined #oooq
08:04 <arxcruz|ruck> panda: zbr|ruck getting late to the party, did you guys check if you have some dot directory with some dirty state on the promoter ?
08:04 *** jbadiapa has joined #oooq
08:04 <panda> arxcruz|ruck: like what ?
08:04 <arxcruz|ruck> .config/docker for example
08:05 <arxcruz|ruck> i don't know, maybe it has some cache that is messing with the promoter
08:05 <panda> arxcruz|ruck: we have a .docker/config.json, it has the cache with auth tokens, but it's replaced at every login
08:06 <panda> jpena: as an extreme measure, what would it take to disable authentication for just the time needed for us to promote, and then turn it back on ?
08:06 <arxcruz|ruck> are you sure?
08:06 <jpena> panda: for pushes?
08:06 *** apetrich has joined #oooq
08:07 <panda> jpena: yes
08:07 <jpena> panda: I don't think that is even possible
08:07 <panda> jpena: if we can't solve this in a reasonable time
08:07 <panda> jpena: wonderful, ok
08:09 <chandankumar> zbr|ruck: Hello
08:09 *** dsneddon has quit IRC
08:10 <chandankumar> zbr|ruck: in the RHEL-8 job we use ansible_python_interpreter in the job definition, after the ansible-2.8 update is it still needed?
08:13 <zbr|ruck> chandankumar: not needed with 2.8+, safe to remove.
08:13 *** surpatil has quit IRC
08:14 <zbr|ruck> panda: i wonder if subsequent docker logins do cancel previous ones
08:14 <zbr|ruck> but we should be able to reproduce the problem by calling this playbook, right?
08:14 <zbr|ruck> sad that the script does not dump all these vars
08:15 <jpena> panda: I see there were some promotions yesterday for queens and rhel8, is that correct?
08:15 <panda> zbr|ruck: dumping increases the risk of leaking
08:15 <panda> zbr|ruck: did we have a promotion for queens yesterday ?
08:16 *** jaosorior has quit IRC
08:16 <zbr|ruck> because the promoter runs on a specific machine, i would question why we even deal with credentials in scripts. it should just be configured on the server, permanently.
08:17 <zbr|ruck> like I have locally.
08:17 <panda> zbr|ruck: we don't deal with credentials in the scripts
08:17 <panda> zbr|ruck: the credentials are only on the server
08:18 <panda> zbr|ruck: the scripts just use them
08:19 <chandankumar> zbr|ruck: ok, thanks!
08:19 <panda> jpena: I see queens is at 20 days
08:20 <jpena> panda: yes. I was checking the logs and saw that there was one from yesterday without an auth error... But it probably didn't do a thing
08:20 <zbr|ruck> panda: really? username: "{{ dockerhub_username }}" does look like processing to me.
08:20 <panda> zbr|ruck: how else would you select which username to use ?
08:21 <zbr|ruck> panda: one username is not enough?
08:22 <zbr|ruck> https://etherpad.openstack.org/p/ssbarnea ?
08:23 <zbr|ruck> and guess what, ~/.docker/config.json explains why it does not work
08:23 *** yolanda has joined #oooq
08:24 <zbr|ruck> I bet "auth": "Og==" --- is not a valid login
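The suspicion about the token can be checked directly: Docker's `~/.docker/config.json` stores each `auth` value as base64 of `username:password`, so decoding it shows what the login actually recorded. A minimal stdlib sketch, using the token string quoted above:

```python
import base64

def decode_docker_auth(token: str) -> str:
    # Docker stores config.json "auth" entries as base64("username:password").
    return base64.b64decode(token).decode()

# The token from the promoter's config.json decodes to just ":",
# i.e. an empty username and an empty password -- not a valid login.
print(repr(decode_docker_auth("Og==")))
```

Decoding in this way makes it obvious whether the credentials ever reached the login step, which is exactly what is being argued here.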
08:24 <panda> zbr|ruck: did you read the email I sent yesterday ?
08:25 <zbr|ruck> nope, only wes's one.
08:25 <zbr|ruck> missed to scroll more
08:25 <panda> zbr|ruck: that token is recreated every time we launch the promotion
08:25 <jpena> panda: where is the code we're using for the promoter? I can check it and see if I find anything
08:27 <arxcruz|ruck> panda: zbr|ruck tmux ? we are all on the promoter :)
08:27 <panda> jpena: I would not want to inflict this amount of pain on you, the code is problematic, and we broke it while trying to make it better. There are lots of things that shouldn't work at all, but they did until yesterday.
08:28 <panda> arxcruz|ruck: there's already a tmux session open from me and wes yesterday
08:28 <zbr|ruck> quick workaround: let's remove the docker_login from the script and perform a login manually
08:29 <panda> zbr|ruck: let's try
08:29 <zbr|ruck> we can even put a cron in to keep the token refreshed.
08:29 <zbr|ruck> this should give us time to debug it
08:29 <panda> zbr|ruck: you're going to do it ?
08:29 <zbr|ruck> yep... on it
08:29 <panda> this could give time to just merge the new code we have for this thing also
08:32 *** derekh has joined #oooq
08:33 <zbr|ruck> panda: https://review.rdoproject.org/r/#/c/22743/
08:34 <panda> zbr|ruck: You'll have to do it directly on the server
08:34 *** pierrepr1netti is now known as pierreprinetti
08:34 <zbr|ruck> i know, this is what i am preparing to do now
08:34 <panda> zbr|ruck: we disabled automatic updates on the server
08:34 <panda> zbr|ruck: ok
08:34 <zbr|ruck> aha.
08:35 <panda> ?
08:35 *** pierreprinetti has quit IRC
08:35 *** chem has joined #oooq
08:37 *** surpatil has joined #oooq
08:37 *** dsneddon has joined #oooq
08:46 <zbr|ruck> panda: ok, cherry-picked the new code and performed a manual login, also created a backup cfg.
08:46 <panda> zbr|ruck: ok, can we start a manual promotion in the tmux ?
08:47 <panda> zbr|ruck: already in promoter_venv with user centos
08:47 <panda> zbr|ruck: tmux atta
08:50 <zbr|ruck> panda: obviously the centos user cannot login because it cannot connect to the docker socket.
08:50 <panda> zbr|ruck: login as root
08:50 <zbr|ruck> but root is logged in
08:50 <panda> zbr|ruck: and root login is all we need
08:54 *** ykarel|lunch is now known as ykarel
08:55 <panda> zbr|ruck: arxcruz|ruck where is the promoter running, did someone start it ?
08:56 <panda> zbr|ruck: arxcruz|ruck talk here ... bash is somewhat limited as a chat tool
08:56 <arxcruz|ruck> panda: zbr|ruck google hangouts ?
08:56 <panda> arxcruz|ruck: google meet ? I have no idea how to start a new meet though
08:56 <arxcruz|ruck> let me check how
08:56 <panda> arxcruz|ruck: zbr|ruck I'll start the promotion
08:57 <arxcruz|ruck> ack
08:57 <panda> zbr|ruck: promotion started
08:58 <zbr|ruck> this kind of script is supposed to run on tower/awx. it would have been much easier to track it there.
09:02 <zbr|ruck> load avg 3
09:03 <panda> what ?
09:03 <arxcruz|ruck> what [2]
09:04 <arxcruz|ruck> is it promoting? i'm seeing it stuck in promoting...
09:04 <panda> zbr|ruck: we could speak for ages about how this should be. This is what we have. Until we have tests, there's not much we can change. After we have tests, we'll have to make small incremental changes
09:04 <arxcruz|ruck> not sure if it should have logs
09:04 <panda> arxcruz|ruck: it's running the ansible part, we won't get any output until it finishes
09:04 <panda> (oh the joy)
09:06 <zbr|ruck> i proposed setting up an awx instance but the boss said it's not a prio now. tbh, we have bigger problems.
09:07 <ykarel> arxcruz|ruck, looking
09:07 <ykarel> but i still have doubts
09:08 <arxcruz|ruck> ykarel: come on, have faith
09:09 <panda> zbr|ruck: what would that change ?
09:10 <zbr|ruck> panda: i will show you when i have time (if ever), for a start, clear execution logs.
09:10 <ykarel> arxcruz|ruck, yes i have faith in that change and you as well
09:10 <ykarel> my doubt is more specific to the goal
09:11 <ykarel> so earlier we were using the latest centos image
09:11 <ykarel> but now with that patch we will stick to an image
09:11 <panda> zbr|ruck: putting this in awx will not automatically clear the logs
09:11 <ykarel> also with centos we get the advantage of a mirror, but not with rdo,
09:12 <ykarel> but that's secondary, the main issue is the first one
09:12 <panda> zbr|ruck: there's a lot of work to do anyway
09:13 <panda> zbr|ruck: frankly I'm not even sure we would need to choose a single language, instead of trying to integrate python, ansible and bash all together happily
09:13 <panda> I think the promotion is stuck
09:14 <panda> we have 43 images downloaded still :(
09:14 <panda> fix a problem, another pops out ..
09:17 <arxcruz|ruck> ykarel: we are still sticking to a specific image
09:18 <arxcruz|ruck> the 1901 if i recall correctly
09:18 <arxcruz|ruck> ykarel: but it's also only on ovb promotion jobs
09:18 <panda> yep, promoter stuck at pulling images
09:24 *** ratailor_ has joined #oooq
09:26 *** ratailor has quit IRC
09:29 *** rfolco has joined #oooq
09:29 *** jbadiapa has quit IRC
09:30 *** jbadiapa has joined #oooq
09:34 <panda> zbr|ruck: arxcruz|ruck the promotion is stuck and I have no idea what is happening.
09:35 <arxcruz|ruck> panda: is it possible to run the ansible code from the promotion manually ?
09:35 <arxcruz|ruck> can jpena show the logs of the rdo registry to see what's happening on the other side ?
09:35 <jpena> arxcruz|ruck: let me get them
09:36 <panda> arxcruz|ruck: I've never tried, but it should be possible
09:36 <panda> arxcruz|ruck: at that point we have to revert to completely manual promotion
09:40 <panda> and at this point it seems the best option.
09:41 <jpena> the last log line from the registry related to the promoter server I see is https://softwarefactory-project.io/paste/show/1605/
09:41 <jpena> from 10 minutes ago
09:42 <panda> jpena: is it a push ?
09:42 <panda> jpena: seems to be a push
09:43 <jpena> http.request.method=PUT
09:43 <panda> jpena: we push the current-tripleo tag to the rdo registry before pushing to docker.io
09:43 <panda> so even when the auth succeeds, something wrong happens while pushing
09:44 <arxcruz|ruck> panda: is it possible to switch the order ?
09:45 <arxcruz|ruck> so we can identify if it's an issue with the rdo registry or docker
09:45 <arxcruz|ruck> or does it try both and fail ?
09:45 *** akahat has joined #oooq
09:45 <panda> arxcruz|ruck: right now we are stuck at the rdo registry, the pushes there fail, we can try to remove the push, I don't remember why we were pushing tags there in the first place
09:46 <arxcruz|ruck> panda: let's try this
09:46 <panda> zbr|ruck: agreed ?
09:47 <panda> zbr|ruck: let's remove also the push of the tag to the rdo registry
09:47 <zbr|ruck> it is needed afaik
09:48 <panda> zbr|ruck: pushing tags to the rdo registry ? I don't remember why
09:48 <jpena> panda, looking at the registry console, I see it is properly tagging the images as current-tripleo
09:48 <jpena> https://console.registry.rdoproject.org/registry#/images/tripleomaster/centos-binary-nova-conductor:current-tripleo , for example
09:48 *** ratailor__ has joined #oooq
09:50 <panda> jpena: thanks, I'm afraid at this point it may be an incompatibility issue between docker-py and docker-ce
09:50 <arxcruz|ruck> jpena: panda is it possible to push from docker directly to the rdo registry ?
09:50 <arxcruz|ruck> like a proxy or something like that ?
09:50 <panda> mmhh
09:50 <zbr|ruck> no
09:50 <panda> arxcruz|ruck: I WISH!
09:50 <arxcruz|ruck> lol
09:50 <panda> I'm restarting the docker daemon
09:50 *** ratailor_ has quit IRC
09:51 <zbr|ruck> panda: better not
09:51 <arxcruz|ruck> panda: zbr|ruck so, shall we test only on docker.io ?
09:51 <panda> zbr|ruck: why not ? I'm out of ideas, it's possible the manual promotion will not work at this point
09:52 <zbr|ruck> panda: you are more likely to mess it up with partial pushes. i am now looking at the code. i am going to rewrite the ansible execution.
09:53 <panda> sure, rewrite the core logic on the production server in the heat of a bugfix with a pressing deadline, and without regression tests. What could go wrong
09:54 <arxcruz|ruck> panda: at least it's not friday :D
09:54  * arxcruz|ruck trying to be positive
09:55 <zbr|ruck> panda: today we are paying the price of coding on our knees... not sure who assumed that capturing an entire playbook execution with check_output is ok.
09:59 <zbr|ruck> there are two playbooks running in parallel
09:59 <panda> ?
10:01 <panda> ah wonderful
10:01 <panda> so we didn't coordinate correctly
10:01 <panda> zbr|ruck: the previous promotion did not finish
10:01 <panda> zbr|ruck: it was stuck in the middle
10:02 <zbr|ruck> let's kill the ansible-playbook processes first, one by one
10:03 <zbr|ruck> is there any reason why check_output was not used with PIPE?
10:03 <panda> with fire
10:06 <arxcruz|ruck> panda: https://github.com/ansible/ansible/issues/32868#issuecomment-344055870
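The check_output complaint above is about buffering: `subprocess.check_output` returns nothing until the child exits, so a hung `ansible-playbook` looks exactly like a stuck promotion. A hedged sketch of streaming the output instead (the `echo` command stands in for the real playbook invocation, which is not assumed here):

```python
import subprocess

def run_streaming(cmd):
    # Stream the child's combined stdout/stderr line by line instead of
    # buffering the whole run the way check_output does; a hang is then
    # visible at the exact line where output stops.
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = []
    for line in proc.stdout:
        print(line, end="")  # live log instead of a post-mortem dump
        lines.append(line)
    proc.wait()
    return proc.returncode, lines

# Stand-in for something like: run_streaming(["ansible-playbook", "container-push.yml"])
rc, out = run_streaming(["echo", "promotion step"])
```

With this pattern the promoter log would show progress in real time, which is what the ruck/rover were missing while waiting on the ansible part.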
10:07 <arxcruz|ruck> panda: zbr|ruck https://meet.google.com/gwb-ikpz-jvg
10:10 <arxcruz|ruck> panda: ^
10:11 <zbr|ruck> panda: please join.
10:13 <zbr|ruck> https://etherpad.openstack.org/p/ssbarnea
10:24 *** jaosorior has joined #oooq
10:26 <bogdando> folks, do we have one or two ovb jobs with pacemaker (with centos 7)?
10:27 <bogdando> I forgot which one...
10:27 *** udesale has quit IRC
10:28 *** udesale has joined #oooq
10:29 <rfolco> panda, quick question: how do you run the delegated tests locally ?
10:37 *** soniya29 has quit IRC
10:38 <panda> rfolco: molecule converge -s $scenario
10:38 <rfolco> panda, it returns an error...
10:38 <rfolco>           module_stderr: |-
10:38 <rfolco>             /bin/sh: /home/zuul/test-python/bin/python: Permission denied
10:39 <rfolco> panda, shouldn't delegated-pre.yml be the prepare stage ?
10:40 <rfolco> like prepare.yml
10:40 <panda> rfolco: ah yeah, you need to export MOLECULE_INTERPRETER=$path/to/your/python
10:45 <panda> rfolco: this is the trade-off to be able to run the same scenario both locally and in zuul
10:46 <rfolco> panda, how do you run the delegated-pre.yml locally ?
10:46 <rfolco> ansible-playbook ?
10:48 <panda> rfolco: yes
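panda's local recipe above (export `MOLECULE_INTERPRETER`, then `molecule converge -s $scenario`, with delegated-pre.yml run separately via `ansible-playbook`) can be sketched as follows; the molecule invocation itself is left commented out since it assumes molecule and the scenario exist locally:

```python
import os
import shutil

# Point molecule's delegated scenario at a local interpreter instead of the
# /home/zuul/... path used in CI (environment variable name from the chat).
os.environ["MOLECULE_INTERPRETER"] = shutil.which("python3") or "/usr/bin/python3"

# Then, per the workflow described above (names are from the chat, not verified):
#   subprocess.run(["molecule", "converge", "-s", scenario], check=True)
#   subprocess.run(["ansible-playbook", "delegated-pre.yml"], check=True)
print(os.environ["MOLECULE_INTERPRETER"])
```

The "Permission denied" on `/home/zuul/test-python/bin/python` above is exactly what happens when the CI interpreter path leaks into a local run without this override.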
10:53 *** jbadiapa has quit IRC
11:00 *** akahat has quit IRC
11:02 *** ykarel is now known as ykarel|meeting
11:05 *** udesale has quit IRC
11:06 *** udesale has joined #oooq
11:06 *** chem` has joined #oooq
11:08 *** chem has quit IRC
11:10 <rfolco> panda, what do you have on your localhost test ? centos ?
11:11 <rfolco> panda, I fix one issue, I get another... so I suspect I am doing it wrong
11:11 <panda> zbr|ruck: arxcruz|ruck the containers push passed
11:11 <panda> zbr|ruck: arxcruz|ruck we have containers promoted on docker.io
11:11 *** soniya29 has joined #oooq
11:11 <panda> two steps missing
11:12 <panda> rfolco: what's the error ?
11:12 <rfolco> panda, cannot find python2-dnf... running locally on fedora
11:17 *** udesale has quit IRC
11:20 <panda> rfolco: install python2-dnf
11:20 <panda> rfolco: I haven't tested on fedora
11:20 <panda> rfolco: only centos
11:20 *** ykarel|meeting is now known as ykarel
11:20 <rfolco> panda, this is my question... you have a centos local box ?
11:20 <panda> rfolco: a vm
11:21 <rfolco> panda, ok
11:23 <arxcruz|ruck> panda: so, the problem is on the rdo registry, right
11:23 <panda> rfolco: but the molecule-delegated-pre was created mainly for zuul, locally I already had all the packages needed
11:23 <panda> arxcruz|ruck: what makes you say that ?
11:23 <rfolco> panda, my idea was to rename it to prepare.yml
11:24 <rfolco> panda, so molecule would run it in zuul and also locally in the prepare stage
11:24 <panda> arxcruz|ruck: from what I see, the problem is that something breaks when the python script calls the ansible playbook
11:25 <panda> and it's something that even the integration test can't detect, since we are not logging in to the rdo registry
11:25 <panda> the only thing that we can't test with the integration test is the one that is breaking everything
11:26 <panda> I think I'll start a new career as a farmer in iceland.
11:33 <panda> arxcruz|ruck: zbr|ruck launching the promote-image.sh script manually now to promote the overcloud images
11:38 *** jpena is now known as jpena|lunch
11:41 *** weshay has joined #oooq
11:42 <weshay> panda, arxcruz|ruck zbr|ruck ping me when you guys are avail
11:45 <weshay> brb
11:46 <zbr|ruck> i am back
11:47 <arxcruz|ruck> weshay: i'm here
11:48 <zbr|ruck> i joined the promoter meeting
11:50 <weshay> k.. /me also on the promoter mtg
11:50 <weshay> https://meet.google.com/ito-fxdo-trb?authuser=1
11:50 <weshay> arxcruz|ruck, zbr|ruck have you heard from panda today?
11:51 <Tengu> weshay: [13:26] < panda> I think I'll start a new career as a farmer in iceland.
11:51 <zbr|ruck> sure, we were in a meeting earlier re: the promoter
11:51 <weshay> Tengu, heh.
11:51 <Tengu> guess he went there sooner than expected :)
11:51 <zbr|ruck> good idea, i liked iceland too.
11:51 <zbr|ruck> not so sure about farming, but lava-bread was good.
11:51 <weshay> I like iceland so much, I think I'll buy it
12:17 *** dsneddon has quit IRC
12:20 *** dsneddon has joined #oooq
12:22 *** jfrancoa has quit IRC
12:25 *** dsneddon has quit IRC
12:25 *** holser has joined #oooq
12:27 <chandankumar> weshay: zbr|ruck panda jpena|lunch https://review.rdoproject.org/r/#/c/22550/ please have a look, thanks!
12:37 *** jfrancoa has joined #oooq
12:38 *** jpena|lunch is now known as jpena
12:44 <arxcruz|ruck> zbr|ruck: weshay panda https://docs.ansible.com/ansible/latest/modules/docker_container_module.html
12:49 *** amoralej is now known as amoralej|lunch
12:56 *** Goneri has joined #oooq
12:58 *** jaosorior has quit IRC
13:00 *** dsneddon has joined #oooq
13:00 *** marios has joined #oooq
13:01 <mjturek> rfolco is the community call in half an hour??
13:01 <rfolco> mjturek, yes sir :)
13:01 <mjturek> \o/
13:02 <rfolco> mjturek, feel free to add your topic to the agenda https://etherpad.openstack.org/p/tripleo-ci-squad-meeting @L59
13:02 *** jaosorior has joined #oooq
13:06 *** jbadiapa has joined #oooq
13:08 <panda> rfolco: were you able to run delegated ?
13:09 <rfolco> panda, no, not even in a centos vm, it's been a nightmare
13:09 <panda> rfolco: ....
13:10 <rfolco> panda, also, the delegated job in zuul is reporting a false positive, ignoring syntax errors http://logs.rdoproject.org/37/22737/7/check/molecule-tripleo-common/8e1bc25/
13:10 <panda> let's sync after this, after the community call, after the tripleo meeting and the DFG all-hands and the monthly jim webcast, ok ?
13:10 <rfolco> panda, ok next year then
13:10 <rfolco> great timing
13:11 <panda> rfolco: wat ?
13:11 <rfolco> panda, meanwhile working in the dark waiting for the next ci job to report a result
13:11 <panda> rfolco: that job is passing ?
13:12 <rfolco> panda, there is a syntax error there... fixed in the next patchset
13:12 <panda> rfolco: you should not use destroy in delegated mode
13:13 <rfolco> panda, I don't. Who said that?
13:13 <panda> rfolco: that's why it's passing, it fails converge, then runs destroy
13:13 <panda> rfolco: ah it's not destroy
13:13 <rfolco> panda, a failed converge should fail
13:13 <panda> rfolco: it's another playbook
13:13 <rfolco> yeah yeah
13:13 <rfolco> let's sync asap
13:13 <rfolco> please
13:14 <panda> rfolco: I have a slot between midnight and 1 o'clock, the 25th of december 2021
13:15 <rfolco> panda, :(
13:16 *** jbadiapa has quit IRC
13:19 *** ratailor__ has quit IRC
13:20 <arxcruz|ruck> brb in 10-15 minutes
13:20 <panda> rfolco: 5 minutes sync, now
13:21 <panda> rfolco: in the community call link
13:21 <rfolco> panda, ok going
13:21 <panda> rfolco: so we are already prepared
13:23 <panda> rfolco: you there ? someone needs to let me in
13:23 <panda> let me iiiin
13:23 <panda> let meeee iiiiiiiin
13:23 *** soniya29 has quit IRC
13:23 *** Vorrtex has joined #oooq
13:30 *** aakarsh has quit IRC
13:30 *** Vorrtex has quit IRC
13:33 *** amoralej|lunch is now known as amoralej
13:34 <rfolco> weshay, you joining?
13:35 <zbr|ruck> validating docker login: https://etherpad.openstack.org/p/ssbarnea
13:35 <rfolco> mjturek, baha https://meet.google.com/bqx-xwht-wky
13:35 <rfolco> fyi we are using google meet for the community call
13:35 <mjturek> ahh thank you
13:35 <baha> Thank you! We've been sitting in the bluejeans room
13:35 <rfolco> :)
13:41 <panda> rfolco: the part that adds the user to the docker group is in molecule-delegated-pre, did it run fully on your system ? it is not called automatically
13:51 *** jaosorior has quit IRC
13:59 *** SurajPatil has joined #oooq
14:01 *** aakarsh has joined #oooq
14:02 *** surpatil has quit IRC
14:08 *** surpatil has joined #oooq
14:11 *** SurajPatil has quit IRC
14:12 <weshay> marios, are you avail for the scenario004 sync in a bit?
14:12 *** chem` has quit IRC
14:13 <marios> weshay: yeah that's what i'm here for
14:18 <weshay> panda, we have a boog
14:19 <weshay> panda, result.. = "dW51c2VkOg==" which translates to "unused"
14:20 <weshay> panda, maybe we mixed up the creds
14:20 <panda> weshay: nope, it translates to "unused:"
14:20 *** SurajPatil has joined #oooq
14:22 <amoralej> arxcruz|ruck, weshay check the last comments in https://bugs.launchpad.net/tripleo/+bug/1845166
14:22 <openstack> Launchpad bug 1845166 in tripleo "[queens] [Periodic][check] OVB jobs failing in OC deploy when trying to ssh to nodes" [Critical,Triaged]
14:22 <weshay> arxcruz|ruck, when ur back ping me
14:22 *** surpatil has quit IRC
14:22 <amoralej> i think we have the root cause
14:23 <amoralej> but i'm not sure about the proper fix
14:23 <amoralej> in fact it was not related to 7.7
14:23  * weshay reads
14:24 <weshay> ugh
14:24 <weshay> amoralej, can we turn auth off?
14:24 <weshay> for rhel containers
14:25 <amoralej> it was introduced for https://tree.taiga.io/project/tripleo-ci-board/task/1182
14:25 <amoralej> weshay, you mean rhel/osp?
14:26 <weshay> amoralej, there are no osp containers in the rdo-registry.. but there are rdo-on-rhel8 containers
14:26 <weshay> that's why all the docker-login and auth work was done
14:26 <amoralej> ah, got it now
14:26 <amoralej> i think it'd be better to fix the login
14:27 *** surpatil has joined #oooq
14:27 <weshay> we can remove docker login.. if we can remove the auth
14:27 <amoralej> than disabling auth
14:27 <amoralej> is auth required for all containers or only rhel containers?
14:27 <weshay> amoralej, where did your deploy fail w/ this root cause?
14:27 <amoralej> https://review.rdoproject.org/r/#/c/22551/
14:28 <amoralej> that's the reproducer
14:29 <arxcruz|ruck> weshay: ping
14:29 *** SurajPatil has quit IRC
14:30 <panda> weshay: fixed, the credentials were ok, the problem was indentation on the docker_login task, should work now
14:31 <weshay> panda, may I rerun?
14:31 <weshay> amoralej, and it only affects queens?
14:32 <panda> weshay: yes
14:33 <weshay> panda, ack.. running
14:33 <weshay> arxcruz|ruck, let's chat https://meet.google.com/dou-geaa-njq?authuser=1
14:34 <amoralej> weshay, only queens
14:34 <amoralej> my guess, after rocky, something is restoring the policy to ACCEPT later
14:34 <weshay> amoralej, ykarel oddly my undercloud keeps shutting down during a deployment
14:34 <amoralej> in rocky the masquerading configuration was reworked in https://github.com/openstack/puppet-tripleo/blob/master/manifests/masquerade_networks.pp
14:34 <amoralej> my guess is that something changed
14:34 <weshay> making it difficult for me to bring credible results back to you
14:35 <rfolco> panda, do I need to run this manually ?
14:35 <rfolco> # Export the path to the mounted docker socket so all docker client commands will use it
14:36 <weshay> amoralej, k.. give me to the end of the day to try and reproduce that
14:39 <panda> rfolco: no, that's for docker-on-docker while using the docker driver
14:40 <rfolco> panda, added zuul to the docker group, still getting permission denied.... I hate delegated mode
14:40 <rfolco> HATE
14:40 <panda> zuul ?
14:40 <panda> you're running as zuul ?
14:41 <rfolco>         - name: Add user to docker group
14:41 <rfolco>           become: true
14:41 <rfolco>           user:
14:41 <rfolco>             name: zuul
14:41 <rfolco>             groups: '{{ docker_group.stdout }}'
14:41 <rfolco>             append: true
14:41 <panda> rfolco: because that's supposed to run upstream, is your user zuul ?
14:42 *** Vorrtex has joined #oooq
14:46 <rfolco> panda, ok, w/ the ansible interpreter and docker group fixed, now I am back in business, thanks
14:48 <panda> rfolco: show me the money!
14:48 <rfolco> panda, mercenary
14:49 *** aakarsh has quit IRC
14:51 <ykarel> weshay, ack
14:51 <ykarel> let us know if u face some issue
14:51 <weshay> k.. thanks
14:52 <weshay> ykarel, amoralej although.. I suppose I won't hit the issue if docker login is not executed
14:53 <amoralej> yes
14:53 <ykarel> weshay, yes that's the reason i too didn't face the issue
14:53 <amoralej> you need to execute it
14:53 <amoralej> yeah, that ^
14:53 <weshay> which it wouldn't be in the way I'm trying to execute
14:54 <weshay> perhaps we can exclude docker login for queens
14:54 <amoralej> weshay, that may be a workaround
14:54 <weshay> panda, you still alive?
14:54 <ykarel> weshay, remember there is a plan to remove anonymous login in the rdo registry
14:55 <zbr|ruck> panda: do a visual on https://review.rdoproject.org/r/#/c/22743/3/ci-scripts/container-push/container-push.yml and let me know if I should test it manually
14:55 <weshay> ykarel, sounds like no one is ready for that yet :)
14:55 <ykarel> so you need to consider that as well while removing container login for queens
14:55 <amoralej> yes, a temporary workaround until auth is enabled everywhere
14:55 <weshay> afaik.. we're running login everywhere
14:55 <weshay> in rdo-cloud
14:56 <amoralej> yes, but i think anonymous is still open
14:56 <amoralej> not sure for how long
14:56 <ykarel> amoralej, yes it's still open until https://review.rdoproject.org/r/#/c/22623/ merges
14:56 <ykarel> as per the comment there the plan was to merge yesterday
14:57 <weshay> amoralej, ykarel ok.. so both arxcruz|ruck and I are running reproducers for queens.. w/o login
14:57 <weshay> will be interesting to see where it fails
14:57 <amoralej> it will not fail
14:57 <ykarel> it should not fail
14:57 <amoralej> ykarel already ran that
14:57 <weshay> heh
14:57 <weshay> ykarel, on the promoted hash.. or tripleo-ci-testing?
14:57 <panda> weshay: more or less
14:57 <ykarel> weshay, promoted hash
14:58 <weshay> ykarel, ya.. promoted hash should work outside of rdo-cloud
14:58 <weshay> agree
14:58 *** jbadiapa has joined #oooq
14:58 <weshay> panda, is it possible to disable the docker_login for jobs running queens?
14:58 <weshay> w/ a when:
14:58 <panda> commit hashish, distro hashish
14:59 <panda> weshay: yes, container push has the release information
14:59 <ykarel> registry_login_enabled: false
14:59 <panda> ah, the general docker login
14:59 <weshay> panda, ok.. switch context here.. not talking about the promotion.. just the jobs executing deployments
14:59 <weshay> panda, ya
15:00 <panda> weshay: if we pass the release information to all the jobs, then it should be possible, and if it's not passed we'll default to true
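The gating weshay and panda agree on here can be reduced to a tiny rule; a hypothetical sketch (function and value names are illustrative, not from the actual playbooks):

```python
# Hypothetical sketch of the discussed behaviour: skip docker login only
# when the job's release is known to be queens, and default to logging in
# whenever no release information is passed at all.
def login_enabled(release=None):
    if release is None:           # release info not passed -> default to true
        return True
    return release != "queens"    # queens jobs skip docker login

print(login_enabled(), login_enabled("queens"), login_enabled("master"))
```

This matches the `registry_login_enabled: false` / `when:` approach mentioned above, with the safe default on the enabled side.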
weshayarxcruz|ruck, https://review.opendev.org/#/c/685184/1/roles/create-reproducer-script/templates/reproducer-quickstart.sh.j215:01
*** jbadiapa has quit IRC15:05
*** aakarsh has joined #oooq15:13
*** amoralej is now known as amoralej|brb15:22
weshayarxcruz|ruck, is that running for you?15:24
arxcruz|ruckweshay: ya15:24
*** ykarel is now known as ykarel|afk15:26
*** amoralej|brb is now known as amoralej15:29
weshaypanda, container promote worked, image promote worked... dlrn_api failed w/15:31
weshayhttp://pastebin.test.redhat.com/80215415:31
* weshay going to do it by hand15:32
*** tesseract has quit IRC15:33
weshayarxcruz|ruck, zbr|ruck fyi ^15:33
zbr|ruckweshay: re validating registry login, https://review.rdoproject.org/r/#/c/22743/4/ci-scripts/container-push/container-push.yml is ready. code tested with your small script.15:37
zbr|ruckcan I cherry pick it to promoter?15:37
zbr|rucki know is ugly,... but it does validate.15:37
*** aakarsh has quit IRC15:39
weshayzbr|ruck, give me a few15:39
*** jpena is now known as jpena|brb15:42
zbr|ruckweshay: sure. on the funny side, read release notes from 3.0 on https://github.com/docker/docker-py/blob/4c71641544f8cacfd27fbd5b36c67fc6763675a2/docs/change-log.md#L36215:42
zbr|ruckutils.ping_registry and utils.ping have been removed.15:43
zbr|ruckmainly, there *was* a way to ping the registry....15:43
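[editor's note: docker-py 3.0 did remove utils.ping_registry, but a registry can still be checked directly over the Docker Registry HTTP API v2: a live registry answers GET /v2/ with 200, or 401 when it wants credentials. A sketch of that check, assuming nothing about the promoter's actual code:]

```python
# Hedged sketch: "ping" a container registry without docker-py, using the
# Registry HTTP API v2 base endpoint. 200 = open registry, 401 = alive but
# auth required; anything else (or no answer) = treat as down.
import urllib.error
import urllib.request


def registry_ping_url(registry):
    """Build the v2 API base URL used to check that a registry is alive."""
    return "https://{}/v2/".format(registry)


def is_registry_up(status):
    """Interpret the HTTP status from the v2 endpoint."""
    return status in (200, 401)


def ping_registry(registry, timeout=5):
    """Return True if the registry responds on the v2 endpoint."""
    try:
        with urllib.request.urlopen(registry_ping_url(registry),
                                    timeout=timeout) as resp:
            return is_registry_up(resp.status)
    except urllib.error.HTTPError as exc:
        # 401 raises HTTPError but still means the registry is alive.
        return is_registry_up(exc.code)
    except (urllib.error.URLError, OSError):
        return False
```

[this fails fast, like the noop-image push zbr|ruck describes below, but without building and pushing a test image.]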
weshayarxcruz|ruck, zbr|ruck just in case.. please watch the upstream check / gate jobs as new promotions can sometimes blow things up15:53
weshayzbr|ruck, /me looks15:53
weshayfyi.. master is fully promoted15:53
*** marios has quit IRC15:53
weshayzbr|ruck, where is noop coming from?15:56
weshayis it in the rdo reg?15:57
weshayhttps://console.registry.rdoproject.org/registry#/images/tripleomaster/noop15:57
zbr|ruckweshay: I am building it for test, does not come from anywhere. this task is building this mini image and pushing it.15:58
weshayk zbr|ruck I'm on the tmux running it.. it's not fast15:58
weshaymight be hanging15:59
zbr|rucki know, but it's much faster than the rest. it runs only once per promotion, not for every image.15:59
zbr|ruckbetter to fail fast than later with 100s images15:59
weshayya.. running for several minutes, still going15:59
zbr|ruck~10-15s not minutes.16:00
zbr|rucki could try to make it work with the famous "scratch" image which is much smaller but I need to create a linux executable, it does not even have bash on it.16:01
*** jfrancoa has quit IRC16:03
*** surpatil has quit IRC16:04
weshayzbr|ruck, ya.. not working16:10
weshayfrom my laptop or the promotion server16:10
*** surpatil has joined #oooq16:10
zbr|ruckweshay: what is the issue?16:10
weshayjust hanging16:10
zbr|ruckgive me few minutes, I am trying to make it work with a smaller image.16:11
zbr|ruckweshay: run ./test_bad.sh on promoter now16:13
*** SurajPatil has joined #oooq16:13
chandankumarpanda: jpena|brb: weshay arxcruz|ruck zbr|ruck can we merge this https://review.rdoproject.org/r/#/c/22623/ ?16:14
weshayzbr|ruck, that worked great16:14
weshayzbr|ruck, let's put up a review16:15
zbr|rucki am updating the review now...16:15
weshaywe're in no danger of promoting anything else soon16:15
weshaychandankumar, it's potentially blocking queens now16:15
weshaychandankumar, so let's hold16:15
weshayarxcruz|ruck, I'm in deploy16:16
*** surpatil has quit IRC16:16
chandankumarweshay: ok16:16
weshaythanks for asking16:16
*** jfrancoa has joined #oooq16:17
chandankumarweshay: tomorrow is a public holiday in India.16:18
arxcruz|ruckweshay: Prepare for the containerized deployment16:19
*** jpena|brb is now known as jpena16:19
arxcruz|ruckweshay: https://review.opendev.org/#/c/685709/ please check the comments and advice what to do16:21
zbr|ruckweshay: do we need any local ci-config changes on promoter? can I reset?16:22
*** weshay_ has joined #oooq16:25
*** weshay has quit IRC16:26
weshay_ssh is amazing16:26
*** SurajPatil has quit IRC16:27
arxcruz|ruckweshay_: https://review.opendev.org/#/c/685709/ please check the comments and advice what to do16:31
*** ykarel|afk is now known as ykarel|away16:36
*** dtantsur is now known as dtantsur|afk16:40
weshay_arxcruz|ruck, my job failed in heat16:42
weshay_arxcruz|ruck, you have any luck16:42
weshay_?16:42
arxcruz|ruckweshay_: deploying overcloud now16:42
weshay_k.. arxcruz|ruck what hash are you on?16:42
*** matbu has quit IRC16:43
*** ykarel|away has quit IRC16:43
weshay_arxcruz|ruck, /me tries a more recent hash16:43
arxcruz|ruckweshay_: checking16:43
arxcruz|ruckweshay_: dlrn_hash_tag=e2208117a31eea79c50330333501b6581f9faafe_15a9b67116:44
weshay_arxcruz|ruck, k.. good16:44
weshay_latest tripleo-ci-testing16:44
*** bogdando has quit IRC16:46
*** weshay_ is now known as weshay16:46
weshayarxcruz|ruck, if it gets late for you.. give me the ip of the node.. and I can watch it16:54
weshayarxcruz|ruck, put my github key on it16:54
arxcruz|ruckweshay: ack16:55
arxcruz|ruckweshay: what's your github username ?16:57
weshayweshayutin16:57
*** derekh has quit IRC17:01
*** jfrancoa has quit IRC17:05
*** aakarsh has joined #oooq17:07
*** amoralej is now known as amoralej|off17:11
*** jpena is now known as jpena|off17:16
*** ykarel|away has joined #oooq17:22
*** aakarsh has quit IRC17:33
ykarel|awayarxcruz|ruck, zbr|ruck is stein standalone upgrade failure already known?17:35
ykarel|awayi see last success from 18th september:- https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-standalone-upgrade-stein17:35
ykarel|awayand seems the only blocker for stein17:36
ykarel|awayas per the last run:- https://trunk-primary.rdoproject.org/api-centos-stein/api/civotes_detail.html?commit_hash=a361c5edd5734320f781e10358a602f4ba57fc04&distro_hash=647b08e47a72f7533142c09074f227628f08f9fa17:36
ykarel|awayfs035 failure is overcloud timeout, that fails randomly, but stein-upgrade one is real one17:37
weshayykarel|away, we're focused on queens atm.. oom's in rdo-cloud :)17:47
ykarel|awayweshay, ack17:48
ykarel|awaybut good to have a bug and some one from upgrade should be involved if not yet done17:49
weshayykarel|away, aye.. just don't kill urself brotha17:52
ykarel|awayme alright, have holiday tomorrow, so will rest17:53
*** Goneri has quit IRC17:59
*** Vorrtex has quit IRC18:03
*** Goneri has joined #oooq18:11
*** aakarsh has joined #oooq18:31
*** Goneri has quit IRC18:34
*** aakarsh has quit IRC18:36
*** matbu has joined #oooq18:38
rfolcozbr|ruck, around?18:42
rfolcopanda, you?18:43
arxcruz|ruckrfolco: it's almost 9pm here...18:43
arxcruz|ruckweshay: +w https://review.rdoproject.org/r/#/c/22743/ or wait ?18:44
*** aakarsh has joined #oooq18:44
rfolcook let me figure this out myself18:44
arxcruz|ruckokay, gotta watch macguyver18:44
arxcruz|ruck:D18:44
* weshay looks18:44
weshayarxcruz|ruck, zbr|ruck wait for the team to review on that18:44
weshayrfolco, something is wrong w/ the current promoter code w/ dlrnapi18:45
* weshay should find the code18:45
*** Goneri has joined #oooq18:48
rfolcoweshay, sftp issue and dlrnapi auth looks like18:48
weshaysftp worked18:49
weshaywell there was an error thrown18:49
weshaybut it got the job done18:49
weshayya.. auth issues18:49
weshayI don't see auth here https://github.com/rdo-infra/ci-config/blob/master//ci-scripts/dlrnapi_promoter/dlrnapi_promoter.py#L85-L11118:50
weshayah here https://github.com/rdo-infra/ci-config/blob/master//ci-scripts/dlrnapi_promoter/dlrnapi_promoter.py#L348-L35218:51
rfolcoexactly weshay18:52
rfolcoDLRNAPI_PASSWORD env variable is missing or empty18:52
rfolcothis is in the logs18:52
weshayrfolco, can you pdb     api_instance = dlrnapi_client.DefaultApi(api_client=api_client)18:53
weshayah k18:53
weshayyou got it18:53
rfolcochecking promoter env18:53
rfolcoweshay, exported dlrnapi_password env, watching logs18:58
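[editor's note: the failure above comes down to an unset DLRNAPI_PASSWORD. A sketch of the fail-fast guard weshay asks for later ("if we could um have a test that does that"); the variable name is from this discussion, the helper itself is illustrative, not the promoter's actual code:]

```python
# Hedged sketch: check required credentials before starting a promotion run,
# so the promoter fails immediately instead of failing mid-run on dlrnapi auth.
import os


def require_env(name):
    """Return the value of an env var, raising early if unset or empty."""
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(
            "{} env variable is missing or empty".format(name))
    return value


# Usage at promoter startup (hypothetical):
#     dlrnapi_password = require_env("DLRNAPI_PASSWORD")
```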
rfolcodoes anybody know how to escape {{ templates in ansible properly? imagename: {{namespace}}/{{name_prefix}}{{ item }}{{name_suffix}}:{{tag}}   >> only {{ item }} is a var19:03
pandarfolco: use {% raw %} {%endraw%}19:07
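[editor's note: applied to rfolco's line, panda's suggestion looks roughly like this. Only the parts wrapped in raw blocks survive templating literally, while {{ item }} is still expanded by Jinja2; a sketch, not the actual role code:]

```yaml
# Wrap the literal {{ }} segments in raw blocks; only {{ item }} is templated.
imagename: "{% raw %}{{namespace}}/{{name_prefix}}{% endraw %}{{ item }}{% raw %}{{name_suffix}}:{{tag}}{% endraw %}"
```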
pandarfolco: how's the promoter server ?19:07
rfolcopanda, I reloaded env var, watching in tmux logs19:08
*** aakarsh has quit IRC19:08
pandawhy is nothing suddenly working19:08
rfolcopanda, did not run again yet... is it stuck ?19:08
pandarfolco: I have no idea, I have to fight a hyperactive daughter right now ...19:09
rfolcopanda, may the force be with you19:09
rfolcopanda, better not19:09
rfolcopanda, may the patience be with you19:09
rfolcoweshay, did you stop the service ?19:21
weshayrfolco, ya.. we did yesterday19:21
weshayprobably safe to restart.. although we may want to wait for tomorrow morning.. nothing is in dire need of promotion now19:21
rfolcoweshay, ok, just wanted to check if you needed to test w/ the dlrnapi_password env var set19:22
weshayrfolco, so if we could um have a test that does that :)19:26
weshaythat would be super cool19:26
weshayarxcruz|ruck, I have two queens deployments at overcloud_deploy atm19:26
weshayone virt one ovb19:26
rfolcooh glory19:35
rfolcoimage prepare just worked19:35
rfolcois to glorify standing up, church! << arxcruz|ruck19:36
rfolcosorry will control my emotions now19:37
weshaywoot19:44
weshayqueens is fine :)19:44
weshayTASK [overcloud-deploy : echo deployment_status] *********************************************************19:44
weshayTuesday 01 October 2019  19:43:49 +0000 (0:00:00.120)       2:39:25.090 *******19:44
weshayok: [undercloud] => {19:44
weshay    "overcloud_deploy_result": "passed"19:44
weshay}19:44
weshayarxcruz|ruck, ^19:44
weshayarxcruz|ruck, zbr|ruck ok.. I have a passing ovb deployment fs001 (tempest fails to set up) and a passing libvirt deployment + passing tempest on baseurl=https://trunk.rdoproject.org/centos7-queens/e2/20/e2208117a31eea79c50330333501b6581f9faafe_15a9b67120:07
weshayI'm going to send queens through20:08
weshayarxcruz|ruck, zbr|ruck focus laser beams on rhel8 scen01/02 and fs001 ovb master20:08
*** kopecmartin is now known as kopecmartin|off20:11
*** ykarel|away has quit IRC20:17
*** ykarel|away has joined #oooq20:20
*** ykarel|away has quit IRC20:33
*** Goneri has quit IRC20:57
*** openstackstatus has quit IRC21:32
*** openstackstatus has joined #oooq21:34
*** ChanServ sets mode: +v openstackstatus21:34
*** holser has quit IRC21:51
*** Goneri has joined #oooq22:28
*** weshay has quit IRC22:29
*** aakarsh has joined #oooq22:35
*** Goneri has quit IRC22:48
*** weshay has joined #oooq22:52
*** weshay has quit IRC22:55
*** weshay has joined #oooq22:58
*** tosky has quit IRC23:02
*** weshay has quit IRC23:12
*** weshay has joined #oooq23:15
*** ChanServ sets mode: +o weshay23:19

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!