Thursday, 2021-09-16

-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] Show emoji to highlight failed jobs in build result in Github https://review.opendev.org/c/zuul/zuul/+/803547 05:36
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] Show emoji to highlight failed jobs in build result in Github https://review.opendev.org/c/zuul/zuul/+/803547 05:49
-@gerrit:opendev.org- Felix Edel proposed: 07:05
- [zuul/zuul] Don't use executor.builds when processing build result events https://review.opendev.org/c/zuul/zuul/+/808091
- [zuul/zuul] Fix race in test_data_return_child_from_retried_paused_job https://review.opendev.org/c/zuul/zuul/+/808918
- [zuul/zuul] Don't use executor.builds to find out if tests are settled https://review.opendev.org/c/zuul/zuul/+/808792
- [zuul/zuul] Remove the local builds list from the executor client https://review.opendev.org/c/zuul/zuul/+/809175
-@gerrit:opendev.org- Simon Westphahl proposed: 08:50
- [zuul/zuul] Implement ABC for caching changes in Zookeeper https://review.opendev.org/c/zuul/zuul/+/805835
- [zuul/zuul] Cache Gerrit refs in Zookeeper https://review.opendev.org/c/zuul/zuul/+/805837
- [zuul/zuul] Cache Github refs in Zookeeper https://review.opendev.org/c/zuul/zuul/+/805838
- [zuul/zuul] Cache Pagure refs in Zookeeper https://review.opendev.org/c/zuul/zuul/+/806556
- [zuul/zuul] Cache Gitlab refs in Zookeeper https://review.opendev.org/c/zuul/zuul/+/806557
- [zuul/zuul] Cache Git refs (driver) in Zookeeper https://review.opendev.org/c/zuul/zuul/+/806755
- [zuul/zuul] Periodically maintain connection caches https://review.opendev.org/c/zuul/zuul/+/806756
- [zuul/zuul] Clean up dangling cache data nodes more often https://review.opendev.org/c/zuul/zuul/+/807102
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] Web UI: make more filters selectable in build, buildset searches https://review.opendev.org/c/zuul/zuul/+/793159 11:03
-@gerrit:opendev.org- Tobias Henkel proposed: [zuul/nodepool] Check for images to upload single threaded https://review.opendev.org/c/zuul/nodepool/+/743790 11:13
-@gerrit:opendev.org- Tobias Henkel proposed: [zuul/nodepool] Check for images to upload single threaded https://review.opendev.org/c/zuul/nodepool/+/743790 11:14
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] Let zuul-web look up the live log streaming address from ZooKeeper https://review.opendev.org/c/zuul/zuul/+/809410 11:27
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] Delete build request by its path https://review.opendev.org/c/zuul/zuul/+/809413 12:58
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] wip: Make QueueItem a Zookeeper object https://review.opendev.org/c/zuul/zuul/+/809414 13:26
@westphahl:matrix.org ^ corvus first attempt at making the queue item a zk object. 13:28
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] wip: Make QueueItem a Zookeeper object https://review.opendev.org/c/zuul/zuul/+/809414 13:44
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] Let zuul-web look up the live log streaming address from ZooKeeper https://review.opendev.org/c/zuul/zuul/+/809410 13:45
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] wip: Make QueueItem a Zookeeper object https://review.opendev.org/c/zuul/zuul/+/809414 13:54
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul] UI: remove immutability-helper dependency https://review.opendev.org/c/zuul/zuul/+/809423 14:44
@mhuin:matrix.orgcorvus: fixed the (X) button in https://review.opendev.org/c/zuul/zuul/+/793159/ - the behavior should be closer to what you had in mind14:50
@clarkb:matrix.orgswest: can you check my comment on https://review.opendev.org/c/zuul/zuul/+/805835/ ?15:33
@clarkb:matrix.orgfungi: have those tox role changes landed yet?15:40
@clarkb:matrix.orgLooks like no. If anyone has a chance to review https://review.opendev.org/c/zuul/zuul-jobs/+/806613/3 and its children that would be appreciated. We have some users trying to make use of them16:11
@westphahl:matrix.orgClark: responded to your comment. the Kazoo docs seem to be outdated16:25
@jim:acmegating.comyeah, i noticed the same thing with the docs...16:26
@jim:acmegating.comi knew kazoo had implemented the create2 opcode, but was a little surprised to see it there.  but it's in the docstring for the method, so i think it's a legit intended feature.  maybe something's messed up with doc generation/publishing16:27
@jim:acmegating.com * i knew kazoo had implemented the create2 opcode, but was a little surprised not to see it there.  but it's in the docstring for the method, so i think it's a legit intended feature.  maybe something's messed up with doc generation/publishing16:27
@jim:acmegating.com Clark: fungi q on https://review.opendev.org/806621 16:47
@clarkb:matrix.org swest: +2 17:27
@clarkb:matrix.orgcorvus: looks like fungi will attempt to add better testing to that.17:27
@clarkb:matrix.org Unrelated to anything above: I've been doing openstack nova surgery and it occurred to me that nova's placement service tracks general resource usage. Very early thinking-out-loud thoughts: there is potential for using the placement service in nodepool to allocate resources more effectively, if we want to remove the free-for-all behavior where providers grab requests as quickly as possible and instead try to allocate more smartly 17:29
@clarkb:matrix.orgIn theory we wouldn't have to write a bunch of code to do that logic and instead could farm that out to placement. Though I'm not sure how generally independent placement is (eg could we use it without keystone)17:29
@jim:acmegating.com existing test wfm; +3 17:40
@jim:acmegating.comthe mysqldump of opendev's zuul sql db is 9.8G uncompressed18:00
@jim:acmegating.comtobiash: i'm happy to report that running the buildset time query on opendev's database against a nova job takes 750ms with a cold cache, or up to 1500ms with contention from other queries.  that's even slower than what you reported, so i think that's a good baseline to check for improvement.18:06
@tobias.henkel:matrix.orgcorvus: thanks for double checking, I'm glad that it's not just our db18:07
@clarkb:matrix.org Thinking out loud: maybe we keep the timedb local to each scheduler and over time they'll all have a mostly accurate enough dataset they can refer to locally? 18:08
@jim:acmegating.comoh, the query is terrible18:08
@jim:acmegating.comlet me at least do 5 seconds worth of query optimization before we start saying things like "sql databases are slow"18:08
@jim:acmegating.comcause they aren't :)18:08
@tobias.henkel:matrix.orgI wasn't planning to say this :D18:09
@jim:acmegating.com |  1 | SIMPLE      | zuul_buildset | NULL       | ref  | PRIMARY,project_pipeline_idx,project_change_idx | project_pipeline_idx     | 768     | const                       | 189010 |     1.00 | Using where; Using temporary; Using filesort | 18:09
@jim:acmegating.comthis is not a happy query ^ :)18:09
@tobias.henkel:matrix.orgfilesort sounds like a deal breaker for frequent queries18:10
@jim:acmegating.com(that's 189010 estimated rows returned from the first stage which involves a temporary table and filesort)18:10
@jim:acmegating.comyeah.  so i'm thinking we need a tenant index at least; tenant+project might be a good idea too18:11
@jim:acmegating.combut anyway, i'm setting up a local test environment so i can double check it on production-scale data; should have something after lunch18:12
@avass:vassast.org any reason why zuul's Dockerfile makes `/var/lib/zuul` a volume? That causes some headaches since we tried to build our own image to change the user id of the zuul user to make it run in openshift. 18:39
@avass:vassast.org https://review.opendev.org/plugins/gitiles/zuul/zuul/+/refs/heads/master/Dockerfile#63 18:39
@avass:vassast.orgwe ended up solving it by mounting volumes in `/var/lib/zuul` but docker volumes seem to cause issues quite often :/18:41
@clarkb:matrix.org avass: I think because zuul needs to keep persistent data and some semi persistent data 18:46
@clarkb:matrix.orgThe discussion above about the timing database will remove some but not all of that information. In particular you have the git repos which most users probably do not want to reclone every time they restart zuul services18:46
@fungicide:matrix.orgcorvus: yep, sorry, was in the middle of yardwork so had gertty handy but not element. i'm working on an update to the existing test now18:47
@avass:vassast.org Clark: it should still be possible to set up volumes without doing it in the dockerfile, right? The issue is that you can't make changes to a `VOLUME` directory later on. 18:48
@avass:vassast.org not a huge issue but it's a bit annoying 18:48
@clarkb:matrix.orgavass: I think the idea is by being explicit then users can't mess it up? they get a volume automatically if they don't supply a mount for that?18:48
@avass:vassast.orgClark: yeah but in our case we needed to change the uid of zuul and update the owner of that directory, which we then can't do if we extend the image18:49
@clarkb:matrix.orgya I'm not sure what the best approach is for managing uids in this case. It seems like a problem with containers in general. I seem to recall some container init scripts doing a chown of dirs for example18:50
@avass:vassast.orgwe also can't do that since we can't run as root or the uid of the zuul user :)18:50
-@gerrit:opendev.org- Jeremy Stanley proposed: [zuul/zuul-jobs] Explicit tox_extra_args in zuul-jobs-test-tox https://review.opendev.org/c/zuul/zuul-jobs/+/809456 19:01
@fungicide:matrix.orgcorvus: ^ is that what you had in mind?19:02
@fungicide:matrix.orginteresting, test_zuul_google_storage_upload is failing on python 3.7 because of what looks like protobuf's descriptor metaclass inheritance syntax being python-3 only (i think). is anyone else already looking into it?19:06
@fungicide:matrix.org er, failing on python 2.7 19:06
@fungicide:matrix.org https://zuul.opendev.org/t/zuul/build/47af369513b1408db101faa33f7796c3 19:07
@fungicide:matrix.orgnew protobuf release yesterday19:08
@fungicide:matrix.orgyep, that's it, changelog states "Drops support for 2.7 and 3.5."19:10
-@gerrit:opendev.org- Jeremy Stanley proposed: 19:18
- [zuul/zuul-jobs] Add tox_config_file rolevar to tox https://review.opendev.org/c/zuul/zuul-jobs/+/806613
- [zuul/zuul-jobs] Support verbose showconfig in tox siblings https://review.opendev.org/c/zuul/zuul-jobs/+/806621
- [zuul/zuul-jobs] Include tox_extra_args in tox siblings tasks https://review.opendev.org/c/zuul/zuul-jobs/+/806612
- [zuul/zuul-jobs] Explicit tox_extra_args in zuul-jobs-test-tox https://review.opendev.org/c/zuul/zuul-jobs/+/809456
- [zuul/zuul-jobs] Pin protobuf<3.18 for Python<3.6 https://review.opendev.org/c/zuul/zuul-jobs/+/809460
@fungicide:matrix.orgstacked onto the surgical protobuf pin19:18
@jim:acmegating.comah, the key here is to reverse sort by buildset id, not build id.  that gets us using where clauses throughout and no additional indexes needed20:36
@jim:acmegating.comhuh, thinking about that a bit more, i think we can apply that universally and get a performance benefit even on things like the builds page20:44
@foodster:matrix.orgHello..to get around the issue of not being able to rebase MRs using gitlab driver for ff merge I am planning to add a task in pipeline itself to rebase and merge instead of zuul performing the merge..is it a good idea to add that task in post pipeline with dependence on zuul_success?20:47
@jim:acmegating.com@foodster:matrix.org: honestly no.  for zuul to work in a gating environment, it needs to be in control of merging, otherwise its testing and operation isn't valid.  if you can't alter the gitlab configuration to work with zuul, or update the zuul gitlab driver to work with your workflow, then i don't think trying to work around that in jobs is the right way to go.20:54
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] Remove time database https://review.opendev.org/c/zuul/zuul/+/808841 21:20
@jim:acmegating.comClark tobiash mordred ^ that runs in 2ms on the opendev production dataset (openstack tenant) in my testing.21:22
@jim:acmegating.com"select 1;" runs in 1ms, fwiw.21:23
@clarkb:matrix.orgNice I'll rereview shortly21:27
@clarkb:matrix.orgcorvus:  you might want to remove your wip on the change21:28
@clarkb:matrix.orgcorvus: when sorting by buildset are we sorting by buildset uuid or an auto incrementing index column in the database?21:30
@jim:acmegating.comClark: auto increment21:30
@jim:acmegating.comso it's an approximation of most recent builds.  should produce the same set of build records, just possibly not strictly in build-id sorted order.21:31
@jim:acmegating.com(it probably will be in practice anyway, but it's not guaranteed, and we don't care; but that's ultimately why i decided not to do this for the queries we do for the build page)21:32
@clarkb:matrix.orgok just double checking because this is how it is defined:21:32
> id = sa.Column(sa.Integer, primary_key=True)
@clarkb:matrix.orgI guess maybe primary_key=True implies auto incrementing21:32
@clarkb:matrix.orgin that case we should get recent enough build data even if not strictly ordered. I would worry if we can have year old data mixed in but it should all be from the same day21:33
@jim:acmegating.comyeah, we'll get 10 builds from the most recent buildsets.  that should be the same as 10 most recent builds.  just that it's possible that those 10 won't be in sorted order relative to each other.  they will all be no less recent than all other builds though.21:35
@jim:acmegating.com (i think the only way they could end up sorted out of order is if one of them is a contingent build and started later after all the other builds in its buildset) 21:36
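[Editor's sketch] The ordering argument above can be tried out on a toy database. This is only an illustration and not the query from change 808841: the table names `buildset` and `build`, the columns, and the helper `recent_builds` are all hypothetical, and SQLite stands in for MySQL/PostgreSQL. It shows that sorting by the auto-incrementing buildset id returns the same set of recent builds as sorting by build id, though possibly in a different relative order when a contingent build starts late.

```python
import sqlite3


def recent_builds(conn, job_name, limit=10):
    """Return up to `limit` builds of `job_name`, newest buildsets first.

    Hypothetical helper: orders by the buildset's integer primary key
    instead of the build's own id.
    """
    query = """
        SELECT b.id, b.buildset_id, b.duration
        FROM build AS b
        JOIN buildset AS bs ON b.buildset_id = bs.id
        WHERE b.job_name = ?
        ORDER BY bs.id DESC
        LIMIT ?
    """
    return conn.execute(query, (job_name, limit)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE buildset (id INTEGER PRIMARY KEY, project TEXT);
    CREATE TABLE build (
        id INTEGER PRIMARY KEY,
        buildset_id INTEGER REFERENCES buildset(id),
        job_name TEXT,
        duration INTEGER
    );
""")
conn.executemany(
    "INSERT INTO buildset (id, project) VALUES (?, ?)",
    [(1, "nova"), (2, "nova")],
)
# Build 4 belongs to the older buildset 1 but started last, like a
# contingent job that waited on the other builds in its buildset.
conn.executemany(
    "INSERT INTO build (id, buildset_id, job_name, duration) VALUES (?, ?, ?, ?)",
    [(1, 1, "unit", 100), (2, 2, "unit", 110),
     (3, 2, "deploy", 300), (4, 1, "deploy", 320)],
)

rows = recent_builds(conn, "deploy", limit=2)
build_ids = [row[0] for row in rows]
print(build_ids)
```

Here the two `deploy` builds come back with the build from buildset 2 first, even though the build from buildset 1 has the higher build id; the set of rows is the same either way, which is all a duration estimate needs.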
@jim:acmegating.comi used pgloader to copy the opendev production data into a postgres db.  that's an approximation of the schema, but it returned data similarly quickly.  i tried importing it with the tables as they would actually be created and it oom'd.  but i'm reasonably confident it should be as fast on pgsql as mysql.22:24
@clarkb:matrix.orgswest: corvus left a couple of notes on https://review.opendev.org/c/zuul/zuul/+/806556/ based on some of the patterns and differences between the drivers I'm seeing emerge as I get through the stack23:35
@jim:acmegating.comClark: ++ agree on all23:43
