*** igordc has quit IRC | 00:11 | |
*** igordc has joined #zuul | 00:11 | |
ianw | Shrews: ahh, sorry i'll loop back on that | 00:17 |
---|---|---|
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: super hacky demo of logfile under the manifest https://review.opendev.org/676843 | 00:31 |
corvus | zuul-maint: high-priority reviews of "topic:fix-zuul-logs" will help folks inconvenienced by the switch to the build page and therefore they will like us more and maybe buy us beer | 00:34 |
*** igordc has quit IRC | 00:44 | |
*** igordc has joined #zuul | 00:45 | |
*** spsurya has joined #zuul | 01:14 | |
*** bhavikdbavishi has quit IRC | 01:38 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: highlight selected line https://review.opendev.org/676849 | 02:18 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: highlight selected line https://review.opendev.org/676849 | 03:18 |
*** bhavikdbavishi has joined #zuul | 03:19 | |
*** raukadah is now known as chkumar|ruck | 03:24 | |
*** igordc has quit IRC | 04:11 | |
*** bjackman_ has joined #zuul | 04:13 | |
*** bjackman_ has quit IRC | 04:30 | |
*** bjackman_ has joined #zuul | 04:37 | |
fungi | i can review changes until my flight starts boarding | 05:00 |
fungi | will take a look | 05:00 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Add a dib-cmd option for diskimages https://review.opendev.org/672196 | 06:03 |
*** yolanda has joined #zuul | 06:55 | |
*** saneax has joined #zuul | 07:48 | |
ofosos | Did the location of job-output.txt change recently? | 10:50 |
*** bhavikdbavishi has quit IRC | 11:02 | |
*** bhavikdbavishi has joined #zuul | 11:03 | |
*** bhavikdbavishi has quit IRC | 11:22 | |
*** mgoddard has quit IRC | 11:52 | |
*** mgoddard has joined #zuul | 11:59 | |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Rework a cache invalidation issue https://review.opendev.org/674425 | 12:28 |
*** rlandy has joined #zuul | 12:33 | |
*** rlandy is now known as rlandy|rover | 12:33 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: highlight selected line https://review.opendev.org/676849 | 12:35 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: do not scroll into view more than once https://review.opendev.org/676924 | 12:35 |
tristanC | corvus: mordred: i'm working on a new implementation for line selection, without using anchors | 12:37 |
mordred | neat | 12:37 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: logfile highlight selected line https://review.opendev.org/676849 | 12:57 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: logfile do not scroll into view more than once https://review.opendev.org/676924 | 12:57 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: logfile do not use anchor for line selection https://review.opendev.org/676928 | 12:57 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: logfile scroll into view a bit more https://review.opendev.org/676929 | 12:57 |
tristanC | https://review.opendev.org/676928 should give a better result | 12:57 |
*** bjackman_ has quit IRC | 12:58 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: logfile scroll into view a bit more https://review.opendev.org/676929 | 13:20 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: logfile do not use anchor for line selection https://review.opendev.org/676928 | 13:20 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: logfile support multi line selection through shift-click https://review.opendev.org/676937 | 13:20 |
tristanC | mordred: here is a multi line selector implementation (that supports only one range) | 13:21 |
mordred | tristanC: I'm fine with only one range :) | 13:21 |
tristanC | well, highlighting multiple ranges shouldn't be hard, but making a non-confusing ui for the selection is rather difficult :) | 13:26 |
openstackgerrit | David Shrewsbury proposed zuul/nodepool master: DNM: testing openshift job https://review.opendev.org/676943 | 13:34 |
SpamapS | simplicity > corner case coverage | 13:35 |
corvus | mordred, SpamapS: would you please reconsider your -1s on 676818? i realize that in some cases when you click the line, it still doesn't appear in the right place, however, it works better than the current situation in that it works on page load, and we can improve it later. | 13:39 |
corvus | normally i like to merge perfect code, but since we already merged imperfect code, i'd like to merge more imperfect code to make it less imperfect | 13:40 |
* Shrews almost spits out coffee | 13:40 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Remove sphinx_output_dir https://review.opendev.org/676945 | 13:41 |
AJaeger | corvus, good morning. promote of openstack-manuals works and we can remove sphinx_output_dir again ^ | 13:41 |
AJaeger | better remove it before anybody uses it ;) | 13:41 |
*** jeliu_ has joined #zuul | 13:44 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: manager: check if parent_layout exists before looking for errors https://review.opendev.org/676947 | 13:47 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: github: properly exist when failing to get a pull request https://review.opendev.org/676948 | 13:50 |
tristanC | corvus: we set up a new monitoring system yesterday to detect stack traces in zuul/nodepool logs, here are a couple of changes to prevent those, not sure they're correct though ^ | 13:52 |
tristanC | fwiw, here is the script we are now running: https://softwarefactory-project.io/cgit/software-factory/sf-ops/tree/scripts/monitor-traceback.py | 13:52 |
tristanC | Shrews: also, got this NotEmptyError in nodepool-launcher: http://paste.openstack.org/show/757967/ | 13:54 |
Shrews | tristanC: yeah, deleting znodes is racey because of how locks work | 13:56 |
Shrews | tristanC: basically, you can have a znode locked, then try to delete it. Because the lock is part of the znode itself, once the lock data is deleted, another thread can attempt to lock the znode before it disappears, causing new lock data to appear. | 13:57 |
tristanC | Shrews: then shouldn't this be caught and logged as an error instead of an exception? | 13:57 |
Shrews | (if that makes sense) | 13:57 |
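The race Shrews describes can be sketched in plain Python. This is a minimal illustration using a hypothetical in-memory stand-in for ZooKeeper: `FakeZK`, `NotEmptyError`, and `delete_node` are names invented here, not nodepool or kazoo APIs, and the log-and-continue handling is only one possible reading of tristanC's suggestion.

```python
class NotEmptyError(Exception):
    """Raised when deleting a node that still has children."""


class FakeZK:
    """Tiny in-memory stand-in for a znode tree (illustration only)."""

    def __init__(self):
        self.children = {}  # path -> set of child names

    def create(self, path):
        self.children.setdefault(path, set())
        parent, _, name = path.rpartition("/")
        if parent in self.children:
            self.children[parent].add(name)

    def delete(self, path):
        # Like a non-recursive znode delete: refuse if children remain.
        if self.children.get(path):
            raise NotEmptyError(path)
        self.children.pop(path, None)
        parent, _, name = path.rpartition("/")
        if parent in self.children:
            self.children[parent].discard(name)


def delete_node(zk, path, log):
    """Delete path; treat a lost race as a logged, non-fatal event."""
    try:
        zk.delete(path)
        return True
    except NotEmptyError:
        # Another client recreated lock data under the node before it vanished.
        log.append("%s was re-locked during delete; leaving it for later" % path)
        return False


zk = FakeZK()
zk.create("/nodes/1")
zk.create("/nodes/1/lock")  # a second client starts locking the node
messages = []
print(delete_node(zk, "/nodes/1", messages), messages)
```

The point of the sketch is only that the failure is expected under contention, so it can be reported as an ordinary message rather than a full traceback.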
Shrews | tristanC: maybe? | 13:58 |
corvus | tristanC: i can't figure out what 676924 does | 14:01 |
tristanC | corvus: not sure it's still useful, but it basically prevents the scroll from happening when browsing the file | 14:03 |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: zk: except NotEmptyError in deleteRawNode https://review.opendev.org/676953 | 14:03 |
corvus | tristanC: it isn't working for me | 14:03 |
corvus | i can't see any behavior change compared to the previous patch | 14:03 |
tristanC | corvus: it's because that patch still uses <a href="#">, next one removes the link | 14:04 |
tristanC | corvus: 676924 should be on top of 676928 to be effective | 14:04 |
corvus | tristanC: okay, that makes sense | 14:04 |
corvus | tristanC: one really minor note, if you click on a line visible when you're at the top, it will scroll. but this is still an improvement i think | 14:06 |
corvus | (and yeah, scrolling when you click an anchor is standard behavior, but i also think it makes things difficult when it's something like a line in a logfile versus a section header -- it's hard for your eyes to follow the jump, so i think not scrolling on line selection is the way to go :) | 14:06 |
tristanC | corvus: we can use a state variable to remember if the initial scroll into view already happened | 14:07 |
Shrews | tristanC: i left a comment on that nodepool change | 14:10 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: logfile remember if initial scroll was performed https://review.opendev.org/676954 | 14:11 |
tristanC | corvus: like so ^ | 14:11 |
Shrews | tristanC: also, nodepool-functional-openshift job seems consistently broken now for some reason | 14:14 |
corvus | tristanC: sweet, i have +2d the entire stack | 14:14 |
tristanC | Shrews: perhaps it's time to merge https://review.opendev.org/#/c/672785/ | 14:15 |
corvus | how did a gating job break? | 14:15 |
Shrews | tristanC: shouldn't we care that it is broken with an older version of openshift? | 14:16 |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: zk: except NotEmptyError in deleteRawNode https://review.opendev.org/676953 | 14:16 |
Shrews | https://zuul.opendev.org/t/zuul/build/49a3d363d94e4008b3bb8b6876f1a3bc/log/job-output.txt#706 | 14:17 |
Shrews | seems it can't recognize that the service is started | 14:17 |
tristanC | Shrews: corvus: the previous version may have been updated and now assumes the node has correct firewall configurations | 14:17 |
tristanC | Shrews: i can extract the firewall task from 672785, but i think we should bump to 3.11 now as it is the version that features the operator framework | 14:18 |
corvus | tristanC, Shrews: we also added a clear-firewall rule we could run before install-openshift if we want | 14:19 |
tristanC | corvus: it seems like we only need to authorize 172.16.0.0/12, which seems safer to do on a public network | 14:20 |
corvus | k | 14:20 |
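For reference, 172.16.0.0/12 covers 172.16.0.0 through 172.31.255.255. A small sketch using Python's standard `ipaddress` module shows which source addresses such a rule would admit; the helper name `is_allowed` is made up for this illustration and is not part of the zuul-jobs role.

```python
import ipaddress

# The private range tristanC proposes to authorize.
ALLOWED = ipaddress.ip_network("172.16.0.0/12")


def is_allowed(addr):
    """Return True if addr falls inside the authorized range."""
    return ipaddress.ip_address(addr) in ALLOWED


for candidate in ("172.17.0.1", "172.31.255.255", "172.32.0.1", "10.0.0.1"):
    print(candidate, is_allowed(candidate))
```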
corvus | anyway, if it looks like it's just the firewall that's the issue, that seems like a satisfactory explanation to me, and we can upgrade to fix it | 14:21 |
corvus | mordred, SpamapS: also, all your concerns on 676818 are addressed in tristanC's followups, which all have a +2 from me | 14:23 |
corvus | the shortest path to happiness is to +3 676818 and the rest of the fix-zuul-logs stack | 14:23 |
corvus | tristanC: can you take a look at https://review.opendev.org/676843 and let me know if you like the idea? | 14:24 |
corvus | tristanC: i thought of it when i was switching between different logfiles looking for an error | 14:24 |
mordred | corvus: yeah- I agree - I've +2'd the stack too - waiting to see the previews for the +A just to be sure | 14:25 |
tristanC | corvus: i'll give it a try, iiuc, it is fetching the file content each time a select occurs right? | 14:27 |
corvus | tristanC: yes, i agree we should make your suggested change to keep the file content around if we do this. mostly i'm wondering if we like the ui enough to continue working on this change | 14:28 |
corvus | (many things are wrong with the change, mostly i'm just wondering -- is putting the file content below the tree a good idea?) | 14:29 |
tristanC | corvus: i'm terrible at designing such things... but i would put it in the tree if there was a way to inline a panel without the tree indentation | 14:30 |
tristanC | so that, when you look at tox logs for example, you can open each numbered file in order and get a complete view of what tox did | 14:31 |
corvus | tristanC: i feel like tox is the exception though, most of these files are hundreds or thousands (or tens of thousands) of lines long | 14:32 |
openstackgerrit | Merged zuul/zuul-jobs master: install-openshift: bump version to 3.11.0 https://review.opendev.org/672785 | 14:32 |
corvus | so that might push the rest of the tree way down | 14:32 |
tristanC | corvus: we also limit the height of the inlined panel | 14:34 |
tristanC | can* also | 14:34 |
corvus | tristanC: if we put part of the tree above, and part below, that can make it difficult to switch between logs | 14:36 |
tristanC | here is an example of how it may look: https://softwarefactory-project.io/logs/65/15565/12/check/sf-ci-functional-allinone/45668f1/report.html (though the panel does not respect window heights) | 14:36 |
corvus | i guess you're saying that's a way to put all the logs on the page | 14:36 |
corvus | hrm, i'm not sure i like the double scrollbars | 14:38 |
corvus | to browse a file, you have to line up the interior box with the viewport, then you can scroll within it | 14:38 |
tristanC | corvus: then we shouldn't inline the log in the tree | 14:38 |
corvus | (i do agree that works really well for shorter files) | 14:39 |
tristanC | perhaps we could get a similar result by stacking the file clicked (sorted by clicking order) under the tree | 14:39 |
tristanC | also, we can count line breaks, and if there are not too many (e.g. fewer than ten), then we can still inline the content in the tree | 14:40 |
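The line-count heuristic tristanC suggests could look roughly like this. It is a sketch only: the function name `should_inline` and the default threshold of ten are taken from the "e.g." in the message above, not from any merged Zuul change.

```python
def should_inline(content, max_lines=10):
    """Return True when content has fewer than max_lines line breaks."""
    return content.count("\n") < max_lines


short_log = "py37: commands succeeded\ncongratulations :)\n"
long_log = "line\n" * 5000

print(should_inline(short_log))
print(should_inline(long_log))
```

Short files such as tox summaries would be shown inline in the tree, while long logs would fall back to a separate view.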
corvus | i asked efried in #openstack-infra for thoughts too -- he had some feedback there | 14:41 |
corvus | mordred: https://cb7d8e118e66e1fa69a1-f54717b5728ca51ec481953b9301c7c6.ssl.cf2.rackcdn.com/676954/1/check/zuul-build-dashboard/376fe86/npm/html/ is the preview for the end of the stack | 14:44 |
corvus | tristanC: followup question on 676947 | 14:53 |
tristanC | corvus: replied | 15:05 |
corvus | tristanC: had the scheduler been running for a while? and was that near a tenant reconfiguration event? | 15:07 |
tristanC | corvus: we restarted the scheduler at 2019-08-15 20:12:49 UTC; 18h ago | 15:08 |
tristanC | corvus: and there has been a full-reconfigure command at 2019-08-15 21:14:16,384, about 3 minutes before the exception | 15:09 |
mordred | tristanC, corvus: stack looks great! want me to push it in - or want to wait for efried? | 15:11 |
corvus | mordred: go for it | 15:12 |
tristanC | mordred: corvus: glad you like it | 15:12 |
mordred | k. stack is +A | 15:13 |
tristanC | pabelanger: tobiash corvus: regarding executor zones, iiuc this feature doesn't isolate merge jobs, right? e.g. does a zoned executor register merge gearman jobs and perform merge tasks, even for unrelated changes? | 15:16 |
tristanC | we are looking into spawning an executor on a slow network, but it seems like it will need access to all of our connections | 15:17 |
corvus | tristanC: correct | 15:17 |
tristanC | git* connections | 15:17 |
tristanC | corvus: then would it be possible to implement merge zone, or disable the registering of merge gearman jobs for zoned executor? | 15:17 |
corvus | tristanC: i think the second idea would make the most sense | 15:18 |
corvus | (also easier) | 15:18 |
dmsimard | I wrongly assumed that zuul executors only merged their own things :) | 15:19 |
corvus | the zones are really supposed to be about the executor communicating with the test nodes. so there really isn't a locality to tie them to changes or git sources. | 15:19 |
pabelanger | topic:distributed-executors is work tobiash has been doing for improvements but yah, merges still happen | 15:19 |
corvus | so given what you describe, i think just not doing more merges than necessary is the way to go | 15:20 |
corvus | dmsimard: there are lots of things mergers do other than just preparing repos for running jobs; it's a lot of work, so if the executors have spare cycles, they pitch in | 15:20 |
corvus | but that's predicated on the idea that they are fast. if they're off on a slow network, then they're not helping. so better to just not have them participate in that case. | 15:20 |
pabelanger | if we still had infracloud, that was something we likely wanted to implement there. It had slowish network too | 15:21 |
pabelanger | dmsimard: on disk repos also help, we did that for ansible/ansible just down on a lot of traffic pushing data to the node | 15:22 |
pabelanger | s/just/cut | 15:22 |
pabelanger | https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/mirror-workspace-git-repos | 15:23 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: executor: add merge_jobs options to disable gearman merge jobs https://review.opendev.org/676974 | 15:27 |
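The idea discussed above (a zoned executor on a slow network that skips merge work) can be sketched as a choice of which gearman functions a worker registers. The option name `merge_jobs` comes from the change title; the function names and the `functions_to_register` helper are placeholders for illustration and should not be read as the names or code Zuul actually uses.

```python
# Gearman function names below are placeholders for illustration,
# not necessarily the names Zuul registers.
EXECUTE_FUNCTION = "executor:execute"
MERGE_FUNCTIONS = ["merger:merge", "merger:cat"]


def functions_to_register(zone=None, merge_jobs=True):
    """Return the gearman functions a (hypothetical) executor would offer."""
    if zone is None:
        execute = EXECUTE_FUNCTION
    else:
        execute = "%s:%s" % (EXECUTE_FUNCTION, zone)
    functions = [execute]
    if merge_jobs:
        # Only fast, well-connected executors should pitch in on merges.
        functions.extend(MERGE_FUNCTIONS)
    return functions


print(functions_to_register())
print(functions_to_register(zone="slow-network", merge_jobs=False))
```

With merge work disabled, the executor would only pick up builds for its own zone, so it would not need to handle merges for unrelated changes.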
openstackgerrit | James E. Blair proposed zuul/zuul master: Collect more information from quickstart failures https://review.opendev.org/676976 | 15:31 |
corvus | the stack failed due to a quickstart error; ^ hopefully that will help us find the problem | 15:31 |
*** noorul has joined #zuul | 15:32 | |
corvus | i've reapproved it | 15:33 |
tristanC | pabelanger: would it be possible to set the job zone without a nodepool pool/label? Or does the zone feature implies running a launcher too? | 15:34 |
clarkb | tristanC: corvus ^ comments on your two most recent changes there | 15:36 |
tristanC | because the second issue is that this zone would use https://wiki.centos.org/QaWiki/CI/Duffy, which doesn't have a nodepool driver, but we have an ansible role to create the instance | 15:36 |
pabelanger | tristanC: yes, currently depends on having nodepool | 15:37 |
clarkb | ya as mentioned before the zoning has to do with accessibility and nodepool has to get things like ssh host keys | 15:37 |
pabelanger | yah, was setup to solve FIP issue really | 15:38 |
tristanC | i guess we can write a duffy driver, but that would be more difficult, and we would have to run another service there (with zookeeper lack of authentication, that's a bit of a problem...) | 15:38 |
clarkb | since the assumption is that you can't talk to the nodes unless in that zone then nodepool and executor are both going to live there | 15:38 |
clarkb | zk has authentication... | 15:38 |
tristanC | clarkb: well, it has authentication, but iiuc it doesn't enforce it | 15:38 |
clarkb | iirc shrews tested it and it works fine | 15:38 |
pabelanger | did we add zone info to instance data? or just into zk directky | 15:39 |
pabelanger | directly* | 15:39 |
clarkb | tristanC: you can connect without auth but you can't access data behind auth without authing | 15:39 |
clarkb | basically trees (and subtrees) end up protected by acls (and authentication) and until you access that data it doesn't care about your auth state. This means you can start the tcp connection but you can't read or write the data unless auth'd | 15:40 |
tristanC | clarkb: can we put an acl on / or /zookeeper ? | 15:40 |
clarkb | tristanC: yes I believe that is exactly what shrews had tested | 15:40 |
tristanC | then i stand corrected, didn't realize it was possible to lock down the zk tree acl | 15:41 |
Shrews | well, i didn't test an acl on /zookeeper, just that the kazoo auth code worked | 15:42 |
clarkb | Shrews: you did test it on subtrees though iirc | 15:42 |
Shrews | i can't recall. i want to say "yes" | 15:42 |
corvus | clarkb: q on 676976 | 15:43 |
clarkb | corvus: responded. Hopefully that makes more sense | 15:44 |
clarkb | it is early and I am probably not making sense :) | 15:44 |
corvus | clarkb: nope that makes perfect sense. i just didn't see that i had left an extra line in there :) | 15:45 |
pabelanger | we've also thought about the idea of a zuul-executor on pem for partner for testing this, but haven't decided to do that yet or not. Idea being, to reduce the size of zuul a partner would run | 15:45 |
Shrews | clarkb: tristanC: fwiw, i think this was my test script: http://paste.openstack.org/show/757970/ | 15:45 |
tristanC | Shrews: clarkb: iirc, the concern was that malicious access could stress the cluster by creating on /, or maybe manipulate /nodepool | 15:46 |
tristanC | and iirc it was considered a serious issue if anonymous access was possible to a zk cluster | 15:47 |
clarkb | tristanC: with an acl on / no one should be able to access or create in that tree | 15:47 |
clarkb | it is possible that you could cause problems creating too many tcp sessions | 15:47 |
clarkb | but that is true of any server you can talk to | 15:47 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Collect more information from quickstart failures https://review.opendev.org/676976 | 15:48 |
Shrews | tristanC: i think that script addresses that concern. the last line troubles me as i do not recall why that didn't work, or why it wasn't a concern | 15:49 |
Shrews | or if it was a concern, why i forgot about it | 15:49 |
Shrews | maybe my comment is outdated | 15:50 |
ofosos | noorul: ready? | 15:50 |
Shrews | i guess subtrees are not addressed in that sample code | 15:51 |
Shrews | tristanC: oh, i misread your comment. yeah, i don't think there's a way to prevent anonymous connections | 15:51 |
*** noorul has quit IRC | 15:51 | |
Shrews | i'm not aware of one, at least | 15:52 |
clarkb | looks like zookeeper merged sasl auth support ~23 days ago | 15:56 |
clarkb | next release will manage it at the connection level looks like | 15:56 |
tristanC | Shrews: clarkb: alright, thank you very much for the feedback. Then it seems like running a nodepool-launcher in that network is the right thing to do. | 16:02 |
*** noorul has joined #zuul | 16:08 | |
noorul | ofosos: hi | 16:08 |
ofosos | noorul: hi! all good? | 16:14 |
noorul | ofosos: hi | 16:16 |
ofosos | noorul: hi | 16:17 |
noorul | ofosos: I could bring up zuul using docker-compose | 16:17 |
ofosos | cool | 16:17 |
noorul | ofosos: Issue was that it was waiting for gerrit to start | 16:17 |
openstackgerrit | Merged zuul/zuul master: JS: account for header when scrolling to line https://review.opendev.org/676818 | 16:17 |
noorul | What is the next step? | 16:17 |
ofosos | So, did you already have a look at the docs? | 16:17 |
noorul | which doc? | 16:18 |
ofosos | We need a "Zuul" user in bitbucket, then we need to configure the driver with that user. | 16:18 |
noorul | I did that | 16:18 |
noorul | https://zuul-ci.org/docs/zuul/admin/quick-start.html | 16:18 |
noorul | It has example for gerrit | 16:18 |
ofosos | https://021706778c3f130fc2cd-793c55b6fd3f2546ffa515e5af6fce40.ssl.cf2.rackcdn.com/674425/5/check/zuul-tox-docs/429bfef/docs/ | 16:19 |
noorul | Is there an example for Bitbucket zuul-config | 16:19 |
ofosos | https://021706778c3f130fc2cd-793c55b6fd3f2546ffa515e5af6fce40.ssl.cf2.rackcdn.com/674425/5/check/zuul-tox-docs/429bfef/docs/admin/drivers/bitbucket.html this one | 16:20 |
ofosos | This contains the info you need for the bitbucket connection | 16:20 |
noorul | So as first step I need to have a zuul-config repo in bitbucket | 16:22 |
ofosos | You can create a connection like this for bitbucket, after that you can decide if you want to have zuul-config in gerrit or in bitbucket. For simplicity, let's keep it in Gerrit (we have everything in Bitbucket). | 16:23 |
noorul | I don't have gerrit | 16:23 |
noorul | I created a zuul-config repo | 16:23 |
ofosos | Not necessarily, the connection needs to be configured before we can pull from zuul-config. We also need a main.yaml with the tenant configuration. | 16:23 |
noorul | How is that done? | 16:24 |
noorul | Is there a complete step by step documentation ? | 16:24 |
ofosos | Ok, then in your /etc/zuul on the scheduler you should find two files. One is the config with the connections and one is the tenant config (usually main.yaml). | 16:24 |
ofosos | Remove the gerrit connection and add the bitbucket config with the credentials from the zuul user. | 16:25 |
noorul | Done! | 16:25 |
ofosos | Do you already have a project in Bitbucket? | 16:25 |
ofosos | What's it named? | 16:25 |
noorul | Yes demo project | 16:25 |
ofosos | ok, go to the tenant config and configure a config project, it should be `demo/zuul-config` | 16:26 |
ofosos | `demo/zuul-config` that's the path | 16:26 |
ofosos | https://zuul-ci.org/docs/zuul/admin/tenants.html | 16:27 |
ofosos | That's the link to the tenant config. | 16:27 |
noorul | source should be bitbucket right? | 16:27 |
ofosos | After that, start/restart the scheduler. | 16:27 |
ofosos | Yes, if that is the name of the source you configured. | 16:27 |
ofosos | AFK for 7 minutes | 16:28 |
noorul | - tenant: | 16:28 |
noorul | name: demo | 16:28 |
noorul | source: | 16:28 |
noorul | bitbucket: | 16:28 |
noorul | config-projects: | 16:28 |
noorul | - zuul-config | 16:28 |
noorul | 16:28 | |
noorul | Is that fine? | 16:28 |
*** yolanda has quit IRC | 16:30 | |
corvus | noorul, ofosos: if you want, you can use etherpad to sketch out config files like that. see this url: https://etherpad.openstack.org/p/rWL36RmF6W | 16:30 |
corvus | and yeah, that looks right to me | 16:31 |
*** hwangbo has joined #zuul | 16:34 | |
openstackgerrit | Merged zuul/zuul master: JS: Break log viewer out of the panel https://review.opendev.org/676827 | 16:36 |
noorul | ofosos: Added zuul.conf there. Can you review? | 16:36 |
*** yolanda has joined #zuul | 16:37 | |
noorul | Are the opendev and mysql connections required? | 16:37 |
ofosos | No, use demo/zuul-config instead of plain zuul-config | 16:37 |
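The tenant snippet noorul pasted above lost its indentation in the log. As a sketch, the small Python helper below renders the same structure with nesting restored and with ofosos's correction (`demo/zuul-config`) applied. The helper name `render_tenant` is invented for this illustration, and the connection name `bitbucket` is assumed to match whatever name was given to the connection in zuul.conf.

```python
def render_tenant(name, source, config_projects):
    """Render a minimal Zuul tenant entry as YAML text (illustration only)."""
    lines = [
        "- tenant:",
        "    name: %s" % name,
        "    source:",
        "      %s:" % source,
        "        config-projects:",
    ]
    lines += ["          - %s" % project for project in config_projects]
    return "\n".join(lines) + "\n"


print(render_tenant("demo", "bitbucket", ["demo/zuul-config"]))
```

The printed text is what would go into the tenant config file (usually main.yaml) on the scheduler.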
noorul | I meant /etc/zuul/zuul.conf | 16:38 |
noorul | Not the pipeline configuration | 16:38 |
noorul | ofosos: Can you take a look at https://etherpad.openstack.org/p/rWL36RmF6W ? | 16:39 |
noorul | First one is main.yaml | 16:39 |
noorul | second one is zuul.conf | 16:39 |
ofosos | I corrected some stuff | 16:43 |
noorul | What is server for? | 16:43 |
ofosos | The driver has no sshkey option, please provide that in /root/.ssh/id_* on the executor. | 16:43 |
ofosos | Server is the API endpoint | 16:43 |
noorul | Then it should be http right? | 16:43 |
ofosos | You don't need server | 16:44 |
ofosos | Base URL will suffice for the API and cloneurl will suffice for GIT access | 16:44 |
noorul | You corrected that to add ssh://.. | 16:44 |
noorul | ok, I see that it is removed now | 16:44 |
corvus | (we should add sshkey support to the driver) | 16:45 |
ofosos | corvus: I think we should add the ability of zuul to upload its own key to bitbucket :) | 16:45 |
ofosos | I.e. attach the access key to a repo. | 16:45 |
ofosos | I think that's possible API wise. If noorul joins our party, it should be swiftly done ;) | 16:46 |
ofosos | noorul: I think with the config you're now ready to spin up the processes. | 16:47 |
noorul | ofosos: started | 16:47 |
*** rlandy|rover is now known as rlandy|rover|brb | 16:47 | |
corvus | ofosos: yeah, better bootstrapping sounds good -- but different connections will may different keys, so we can't rely on ~/.ssh/id_rsa being there for a particular connection -- so the other drivers allow you to specify a key | 16:47 |
corvus | s/will may/may have/ | 16:47 |
corvus | anyway, just a note for us to come back to later; i'll leave it as a review comment | 16:48 |
corvus | (don't want to derail the bootstrapping party) | 16:48 |
noorul | The driver has no sshkey option, please provide that in | 16:48 |
noorul | /root/.ssh/id_* on the executor. | 16:48 |
noorul | ofosos: Can you explain that a bit? | 16:48 |
ofosos | It uses the default ssh identity of the user the executor runs as. | 16:49 |
noorul | ofosos: Are you saying that I should have zuul users private key at /root/.ssh/ folder? | 16:50 |
ofosos | Yes, I am :) | 16:50 |
ofosos | But in the first step we don't need it right away. First we have to check if the scheduler connects to the Bitbucket correctly. After that we'll grant access to the key in the zuul-config repo and restart once more. | 16:51 |
ofosos | Also, the zuul-config repo will need "write" permissions to be granted to the zuul user. | 16:52 |
noorul | I see | 16:52 |
noorul | What is the next step? | 16:52 |
ofosos | There are three things you have to do for every repo: put it in the tenant config, grant access to the access key in the repo and add the zuul user to the repo with "write" level permissions. | 16:53 |
ofosos | If you've done that, start up the scheduler and look at the output. | 16:53 |
openstackgerrit | Merged zuul/zuul master: JS: add line numbers to log file https://review.opendev.org/676830 | 16:53 |
noorul | ofosos: Can you help to add one repo example in etherpad? | 16:54 |
AJaeger | corvus: want to take https://review.opendev.org/676945 (remove sphinx_output_dir from zuul-jobs) before it gets used? I don't need it anymore... | 16:55 |
AJaeger | Or do we need to deprecate that properly? | 16:55 |
ofosos | noorul: I added one | 16:56 |
noorul | ok | 16:56 |
ofosos | You still have to create the repo | 16:56 |
ofosos | But,... you will have to push the sample code from the zuul-config repo that is included with gerrit into zuul-config. That'll make testing easier | 16:56 |
noorul | I created | 16:56 |
ofosos | What do you get from the scheduler logs? | 16:57 |
ofosos | The executor should check out two repos: zuul-config and test | 16:57 |
ofosos | And the scheduler should be running a loop with 60s delay and looking into those repos. | 16:57 |
noorul | I am getting some other error related to alembic | 16:59 |
noorul | I forgot the openstack pastie service | 16:59 |
noorul | I could paste the error there | 16:59 |
*** panda has quit IRC | 17:00 | |
noorul | I am getting http://paste.openstack.org/show/757976/ | 17:00 |
*** mattw4 has joined #zuul | 17:01 | |
noorul | Maybe I should remove mysql driver? | 17:02 |
noorul | from the config? | 17:02 |
*** panda has joined #zuul | 17:02 | |
noorul | Did anyone see that error before? | 17:04 |
noorul | I am using docker-compose | 17:04 |
*** igordc has joined #zuul | 17:04 | |
corvus | noorul: my guess is that you have a database leftover from running docker-compose on the current code but you're now running it on older code | 17:05 |
noorul | I see | 17:05 |
corvus | noorul: you may need to stop all the containers, delete the volume used by the mariadb container, then restart | 17:06 |
*** mgoddard has quit IRC | 17:07 | |
noorul | corvus: I am not sure where the mariadb volume is located as I don't see any volumes entry in mariadb section | 17:08 |
*** igordc has quit IRC | 17:08 | |
*** mgoddard has joined #zuul | 17:11 | |
*** rlandy|rover|brb is now known as rlandy|rover | 17:11 | |
noorul | corvus: https://opendev.org/zuul/zuul/src/branch/master/doc/source/admin/examples/docker-compose.yaml#L34 | 17:12 |
openstackgerrit | Merged zuul/zuul master: web: logfile highlight selected line https://review.opendev.org/676849 | 17:13 |
ofosos | You can run without the mysql driver, but with reduced functionality (no historic build info). | 17:15 |
noorul | I would like to understand what is going on | 17:16 |
corvus | noorul: yeah, it's a volume that's specified in the container image itself, so it doesn't show up in docker-compose. you can see all the volumes with docker volume list | 17:16 |
openstackgerrit | Merged zuul/zuul-jobs master: Remove sphinx_output_dir https://review.opendev.org/676945 | 17:17 |
corvus | ah | 17:17 |
noorul | There are too many, not sure which one belongs to mariadb | 17:17 |
corvus | noorul: "docker inspect examples_mysql_1" should tell you which volume | 17:17 |
corvus | noorul: look under the "Mounts" section | 17:18 |
corvus | noorul: another easy way to fix this might be to just delete the mysql container and let docker-compose recreate it | 17:18 |
corvus | so just: docker rm examples_mysql_1 | 17:19 |
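To pick the right volume out of the "Mounts" section corvus mentions, the JSON printed by `docker inspect` can be filtered with a few lines of Python. This is a sketch: the helper name `volume_mounts` and the sample data (volume name and paths) are made up for illustration, and only the `Mounts`, `Type`, `Name`, and `Destination` keys of the inspect output are relied on.

```python
import json


def volume_mounts(inspect_json):
    """Return (volume name, destination) pairs from `docker inspect` output."""
    containers = json.loads(inspect_json)
    return [
        (mount.get("Name"), mount.get("Destination"))
        for container in containers
        for mount in container.get("Mounts", [])
        if mount.get("Type") == "volume"
    ]


# Hypothetical, trimmed-down output of `docker inspect examples_mysql_1`.
SAMPLE = json.dumps([
    {
        "Name": "/examples_mysql_1",
        "Mounts": [
            {
                "Type": "volume",
                "Name": "0123abcd",
                "Source": "/var/lib/docker/volumes/0123abcd/_data",
                "Destination": "/var/lib/mysql",
            },
            {
                "Type": "bind",
                "Source": "/tmp/conf",
                "Destination": "/etc/mysql/conf.d",
            },
        ],
    }
])

print(volume_mounts(SAMPLE))
```

In practice the real output of `docker inspect examples_mysql_1` would be passed in place of `SAMPLE`; the volume name it reports is the one to remove once the containers are stopped.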
*** mattw4 has quit IRC | 17:20 | |
*** mgoddard has quit IRC | 17:22 | |
*** mgoddard has joined #zuul | 17:23 | |
noorul | corvus: Thank you | 17:25 |
noorul | corvus: I pruned using docker system prune | 17:25 |
noorul | ofosos: http://paste.openstack.org/show/757978/ | 17:25 |
noorul | ofosos: That's the scheduler log now | 17:25 |
*** igordc has joined #zuul | 17:26 | |
corvus | ++ | 17:27 |
ofosos | The last three lines seem alien to me, but maybe they're benign | 17:27 |
ofosos | noorul: Have another look, the bitbucket driver takes a minute to do something. | 17:28 |
noorul | ofosos: There was a typo. Fixed it | 17:28 |
ofosos | noorul: can you also paste the executor log? | 17:28 |
noorul | http://paste.openstack.org/show/757979/ | 17:28 |
corvus | the keypair stuff happens on first boot | 17:29 |
ofosos | Give it a minute and post again. | 17:29 |
ofosos | Also, I would be interested in the executor log. | 17:29 |
noorul | executor log: http://paste.openstack.org/show/757980/ | 17:29 |
ofosos | But that now looks good, the driver has initialized. | 17:29 |
noorul | ok, then what next? | 17:30 |
ofosos | Wait, post the scheduler log again. The watcher thread should have run by now. | 17:31 |
ofosos | What's in the `zuul-config` repo? | 17:31 |
noorul | No change | 17:32 |
noorul | zuul-config has nothing | 17:32 |
noorul | It is empty now | 17:32 |
noorul | Shall I add the example pipeline from bitbucket doc? | 17:32 |
ofosos | Yes please | 17:33 |
ofosos | Just create it on master, we need that in the repo initially and then we need to do `zuul-scheduler full-reconfigure` to make it pick up the changes (no pipelines yet). | 17:33 |
ofosos | (Issue the command in the scheduler container) | 17:34 |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: super hacky demo of logfile under the manifest https://review.opendev.org/676843 | 17:37 |
noorul | ofosos: Did you mean not to add pipeline.yaml now? | 17:37 |
ofosos | Yes | 17:40 |
noorul | ok, I just pushed README.md | 17:40 |
noorul | to master branch | 17:41 |
noorul | Shall I run zuul-scheduler full-reconfigure | 17:41 |
ofosos | Try it | 17:41 |
ofosos | There may still lurk some error, since there was no car job in your log | 17:42 |
ofosos | Cat | 17:42 |
ofosos | Not car | 17:42 |
noorul | http://paste.openstack.org/ | 17:42 |
noorul | http://paste.openstack.org/show/758063/ | 17:43 |
ofosos | Can you post the executor logs? | 17:45 |
ofosos | It's now waiting for the cat job, so we need to check that that ran correctly | 17:46 |
*** igordc has quit IRC | 17:49 | |
noorul | Looks like ssh key issue http://paste.openstack.org/show/758134/ | 17:49 |
*** igordc has joined #zuul | 17:50 | |
*** mattw4 has joined #zuul | 17:50 | |
ofosos | The logs are a bit weird, because the bitbucket watcher is not running | 17:50 |
ofosos | Yes, please put the Zuul ssh key into the repository under repo config / access keys | 17:51 |
ofosos | That might do it. When I bork the ssh key (which happens fairly often), we usually run into this error. After I fix the ssh key it's ok. | 17:54 |
noorul | Actually executor is not having ssh private key under /root/.ssh | 17:55 |
noorul | ssh key mounting is not working | 17:58 |
*** armstrongs has joined #zuul | 18:11 | |
noorul | ofosos: I have new error | 18:12 |
noorul | ofosos: http://paste.openstack.org/show/758136/ | 18:12 |
noorul | corvus: ^^ | 18:13 |
noorul | corvus: Any idea? | 18:13 |
ofosos | Can you try using a name for the ssh server? | 18:15 |
pabelanger | tobiash: corvus: mordred: clarkb: I think http://paste.openstack.org/show/758138/ is going to be the next issue to solve with github, we are seeing a high rate of MERGER_FAILURE, from what looks like maybe the event getting to zuul faster than the git refs being updated on github side | 18:15 |
noorul | ofosos: where? | 18:16 |
ofosos | In the Zuul config, in the driver | 18:18 |
noorul | You mean instead of IP are you asking to give FQDN? | 18:19 |
ofosos | Yes | 18:19 |
pabelanger | https://dashboard.zuul.ansible.com/t/ansible/buildsets?result=MERGER_FAILURE is example | 18:20 |
pabelanger | but hard to say time frame | 18:20 |
*** armstrongs has quit IRC | 18:20 | |
noorul | ofosos: Are you using sshkey property? | 18:21 |
ofosos | No | 18:22 |
*** chkumar|ruck is now known as raukadah | 18:23 | |
clarkb | pabelanger: pull request came in, zuul gets event for pull, then tries to fetch the head of the pull ref and fails because it isn't there yet? | 18:24 |
openstackgerrit | Merged zuul/zuul master: web: logfile do not scroll into view more than once https://review.opendev.org/676924 | 18:24 |
ofosos | But we're using a named Bitbucket server. | 18:24 |
ofosos | I think the error indicates that paramiko will not put the ssh host id into its known hosts file. | 18:24 |
ofosos | What you can do is try to ssh to port 7999 from the executor and have the ssh command put it there. | 18:25 |
ofosos | No, I'm wrong. Please try sshing to this IP and port. I think read timeout indicates some other problem. | 18:26 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: gerrit: ensure patchset numbre is a number https://review.opendev.org/677007 | 18:29 |
ofosos | I think if the key was wrong we'd get a different error. Maybe there's something wrong with container networking. Can you ssh from the zuul executor container to the Bitbucket server? | 18:29 |
clarkb | pabelanger: I wonder if there are other events that signify the ref is available (like gerrit's replication completed events) | 18:29 |
noorul | ofosos: Cloning from ssh://10.29.12.160:7990/demo/zuul-config.git | 18:29 |
clarkb | pabelanger: otherwise we may just have to do a stronger backoff with our retries | 18:29 |
ofosos | Yes | 18:30 |
noorul | ofosos: Shouldn't this be ssh://git@10.29.12.160:7999/demo/zuul-config.git | 18:30 |
pabelanger | clarkb: yah, I think that is right, not there | 18:30 |
pabelanger | I need to check if other events are there | 18:30 |
pabelanger | clarkb: or we have another merger try | 18:31 |
ofosos | You're right. That's the error, enter ssh://git@10.... into the driver config. | 18:31 |
ofosos | And restart | 18:31 |
ofosos | Ok, I'm switching places. Responses will now be a bit more sluggish. | 18:31 |
openstackgerrit | Merged zuul/zuul master: web: logfile scroll into view a bit more https://review.opendev.org/676929 | 18:43 |
noorul | ofosos: now scheduler log says bad status line http://paste.openstack.org/show/758141/ | 18:47 |
noorul | Did anyone see ^^ this before ? | 18:51 |
openstackgerrit | Merged zuul/zuul master: web: logfile do not use anchor for line selection https://review.opendev.org/676928 | 19:00 |
noorul | ofosos: hi | 19:01 |
noorul | ofosos: leaving now | 19:02 |
noorul | ofosos: When can we meet again? | 19:02 |
noorul | ofosos: let me know | 19:02 |
*** noorul has quit IRC | 19:02 | |
SpamapS | corvus:ACK I'll peek now | 19:23 |
SpamapS | Ah I see that's already been handled. :) | 19:23 |
corvus | SpamapS: yep, tldr all your wishes should have been granted :) | 19:26 |
SpamapS | Indeed. Is that going to be in a 3.10.2 eventually? | 19:30 |
SpamapS | I was starting to upgrade to 3.10.1 last night but figured I might want to wait until fix-zuul-logs is .. well.. fixed. ;) | 19:31 |
jeliu_ | mordred: Hey Monty, are you familiar with setting up percona db cluster using percona operator? I'm trying to do it manually first on minikube (https://www.percona.com/doc/kubernetes-operator-for-pxc/minikube.html) and then incorporate it into the zuul-operator but I was having some issues because the cluster states were "running" but not "ready" (https://docs.google.com/document/d/1rveUpciVirBrK6hqeXiH61JVgZuMDpDw108uZrf6bt0/edit?usp=sharing) | 19:33 |
pabelanger | +1 3.10.2, I confirmed we also had deep link issue, but only after trying to reproduce the issue | 19:33 |
clarkb | jeliu_: there was stuff for that in our k8s gitea deployment | 19:39 |
clarkb | let me see if I can find links | 19:39 |
clarkb | jeliu_: https://opendev.org/opendev/system-config/src/branch/master/kubernetes/percona-xtradb-cluster | 19:40 |
corvus | clarkb: i don't think we used the operator, did we? | 19:40 |
corvus | jeliu_: you might want to paste that log into http://paste.openstack.org/ so folks can read it without logging in | 19:41 |
clarkb | oh no it may have all been ansible | 19:41 |
corvus | clarkb: yeah, i think maybe it was converted from helm charts or something? | 19:41 |
corvus | still, there could be a clue there in to what's going wrong, it's just not a 1:1 | 19:42 |
corvus | ie, you could use that as a reference for "here is one way to set up a pxc cluster, what is the pxc operator doing differently?" | 19:42 |
corvus | (pxc == percona xtradb cluster) | 19:42 |
clarkb | ya seems to just be a statefulset of 3 pxc containers | 19:43 |
clarkb | and ansible around that | 19:43 |
*** hwangbo has quit IRC | 20:06 | |
openstackgerrit | Merged zuul/zuul master: web: logfile support multi line selection through shift-click https://review.opendev.org/676937 | 20:10 |
jeliu_ | clarkb, corvus: thanks, and will the pxc-playbook run successfully on my local computer? | 20:10 |
SpamapS | random question: how's the Zuul Kubernetes Operator effort coming along? | 20:22 |
openstackgerrit | Merged zuul/zuul master: web: logfile remember if initial scroll was performed https://review.opendev.org/676954 | 20:24 |
corvus | SpamapS: jeliu_ is trying to get the pxc operator to work but it's failing; if you have a second to take a look at his error, that could help. | 20:32 |
corvus | jeliu_: it would really be great if you pasted your error into the public pastebin instead of google docs :) | 20:33 |
*** mattw4 has quit IRC | 20:34 | |
corvus | jeliu_: yeah, i think it should run with a local k8s | 20:35 |
*** mattw4 has joined #zuul | 20:35 | |
jeliu_ | Logs for trying to Install the Percona Operator and Create a Cluster Resource: http://paste.openstack.org/show/758146/ | 20:39 |
jeliu_ | ^not the prettiest logs to look at | 20:40 |
corvus | errors rarely are :) | 20:42 |
corvus | jeliu_: that looks great, thanks | 20:43 |
corvus | jeliu_: something just occurred to me -- it may also be helpful to get the logs for the cluster member that *did* start | 20:43 |
corvus | it still may have information which could indicate why the others didn't | 20:43 |
*** yolanda__ has joined #zuul | 20:45 | |
*** yolanda has quit IRC | 20:46 | |
*** rfolco has quit IRC | 20:48 | |
openstackgerrit | Merged zuul/zuul master: Collect more information from quickstart failures https://review.opendev.org/676976 | 20:50 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: extract pure functions from the TaskOutput component https://review.opendev.org/675460 | 20:50 |
*** mattw4 has quit IRC | 20:53 | |
*** mattw4 has joined #zuul | 20:53 | |
*** pcaruana has quit IRC | 20:57 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: console mark syntax error task as failed https://review.opendev.org/677030 | 21:00 |
*** igordc has quit IRC | 21:01 | |
*** igordc has joined #zuul | 21:01 | |
tristanC | corvus: please have a look at 677030, the console is showing syntax error as OK. I tried to add a test but i think the change is missing legitimate success result | 21:01 |
tristanC | corvus: it's from: https://softwarefactory-project.io/zuul/t/local/build/8b1f38432ba747379c3ecb8e5035d81a/console | 21:02 |
corvus | tristanC: mordred is working on the solution to that | 21:02 |
corvus | tristanC: https://review.opendev.org/676723 | 21:02 |
tristanC | alright, that's good to know | 21:03 |
corvus | turns out we're missing important info in the json file :) | 21:03 |
tristanC | corvus: would it be possible to land the js tests soon ( https://review.opendev.org/675460 ) | 21:03 |
corvus | tristanC: i'll look soon, running to a meeting now | 21:04 |
*** armstrongs has joined #zuul | 21:10 | |
armstrongs | hey im seeing a situation where i have pushed the latest version of a container to a docker registry and nodepool is scheduling an older version of the same tag. Does nodepool cache the images and how do i refresh this? | 21:12 |
clarkb | armstrongs: I want to say k8s and/or docker behave like git in this case? | 21:14 |
clarkb | local versions arent moved unless explicitly told to do so | 21:14 |
clarkb | https://github.com/kubernetes/kubernetes/issues/33664 | 21:15 |
armstrongs | ok so not nodepool thanks :) | 21:16 |
tristanC | armstrongs: the nodepool kubernetes provider uses "IfNotPresent" by default for the image-pull attribute | 21:27 |
*** mgoddard has quit IRC | 21:29 | |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: kubernetes: add missing image-pull documentation https://review.opendev.org/677036 | 21:29 |
*** mgoddard has joined #zuul | 21:32 | |
*** armstrongs has quit IRC | 21:36 | |
corvus | tristanC: i think we can merge 677030 as a workaround until mordred finishes the work on the callback -- but can you use that to look at a bunch of logfiles and make sure it still does the right thing? i know there are some test cases, but since it's changing the default return value, i'm not sure if they're enough coverage. | 21:40 |
*** rlandy|rover has quit IRC | 21:41 | |
*** spsurya has quit IRC | 22:23 | |
*** jeliu_ has quit IRC | 22:45 | |
*** igordc has quit IRC | 22:46 | |
clarkb | corvus: tristanC I left a question on 677030 | 23:28 |
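
The "stronger backoff with our retries" idea clarkb raises at 18:29 for the MERGER_FAILURE case could look roughly like the sketch below. This is not Zuul's merger code: `fetch_with_backoff`, its parameters, and the use of `LookupError` to signal "ref not there yet" are all hypothetical names chosen for illustration. The caller supplies the actual fetch as a callable.

```python
import time


def fetch_with_backoff(fetch, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure.

    fetch is a hypothetical callable that raises LookupError while the
    ref is not yet available on the remote side.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except LookupError:
            # Out of attempts: let the caller see the failure.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the delay schedule easy to inspect without waiting; whether another event (as with Gerrit's replication-completed events) could replace polling altogether is left open in the discussion.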