Thursday, 2019-04-11

*** threestrands has joined #zuul  [00:14]
<SpamapS> why on earth are the drivers doing this btw?  [00:29]
<SpamapS> Just occurred to me that this seems like a generic problem for all providers.  [00:29]
<clarkb> SpamapS: my guess is because figuring out usage is driver specific  [00:30]
<SpamapS> But not at the max-servers level  [00:30]
<clarkb> we could push that into the drivers and then have... ya, that  [00:30]
<SpamapS> That's entirely nodepool's problem.  [00:30]
<clarkb> prior to actually checking quota it was entirely in the scheduler iirc  [00:31]
<SpamapS> There's "how many are running in the cloud vs. what you want to launch" and then there's "how many are in nodepool's database vs. what you want to run?"  [00:31]
<clarkb> nodepool used its local db to calculate usage  [00:31]
<SpamapS> Yeah, seems like both should happen.  [00:31]
<clarkb> but now it uses a combo of both  [00:31]
<SpamapS> seems like nodepool should handle max-servers itself.  [00:31]
<clarkb> SpamapS: I think this is largely just an overlooked item in the split out of drivers from only having an openstack driver  [00:31]
<SpamapS> makes sense  [00:32]
*** pwhalen has quit IRC  [00:32]
*** tobiash has quit IRC  [00:32]
*** timburke has quit IRC  [00:32]
<clarkb> but ya, nodepool could ask the driver of a provider for its reported usage, grab usage from the db, and compare the two at a higher level  [00:33]
<clarkb> then drivers would only need to implement the provider-reported usage data  [00:33]
*** tobiash has joined #zuul  [00:33]
*** timburke has joined #zuul  [00:34]
*** mgoddard has quit IRC  [00:34]
<SpamapS> That's also good, but I need even less.  [00:35]
<SpamapS> Nodepool could look at the number of nodes a provider owns, and if it's >= max-servers, reject the request.  [00:35]
<SpamapS> No cloud request needed for that.  [00:35]
<clarkb> ya, I actually think the openstack driver does that shortcut  [00:35]
<SpamapS> I actually assumed it did that  [00:35]
<clarkb> but you'd still need the usage check for the general case  [00:36]
<SpamapS> and luckily I have a 30 instance limit.. or I might have eaten up my entire AWS budget yesterday. ;)  [00:36]
<clarkb> I have max-server headroom, but do I actually have quota?  [00:36]
<SpamapS> another broken assumption from the openstack driver that we accidentally carried into AWS is that this provider owns every instance it can see.  [00:37]
*** mgoddard has joined #zuul  [00:37]
<SpamapS> AWS definitely does not work like that.  [00:37]
<SpamapS> (well, it can, but only with a really carefully constructed IAM policy)  [00:37]
<SpamapS> so I'm adding a tag system so listNodes filters out nodes that don't belong to nodepool.  [00:38]
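A minimal sketch of the kind of tag filter being described, assuming boto3; the tag name `nodepool_node_id` is a placeholder, not the driver's real tagging scheme:

```python
# Hedged sketch only: list just the EC2 instances that carry a nodepool tag,
# so instances owned by other tooling in the same AWS account are ignored.
import boto3


def list_nodepool_instances(region_name='us-east-1'):
    ec2 = boto3.client('ec2', region_name=region_name)
    instances = []
    paginator = ec2.get_paginator('describe_instances')
    # "nodepool_node_id" is a hypothetical tag name for illustration.
    for page in paginator.paginate(
            Filters=[{'Name': 'tag-key', 'Values': ['nodepool_node_id']}]):
        for reservation in page['Reservations']:
            instances.extend(reservation['Instances'])
    return instances
```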
<clarkb> oh, it already does the thing I said via hasRemainingQuota  [00:38]
<clarkb> but the aws driver doesn't implement that  [00:38]
<SpamapS> That shortcut should be moved to a concrete function in the parent class that gets called before hasRemainingQuota.  [00:39]
<SpamapS> something like: if self.pool.max_servers is not None: ...  [00:39]
<SpamapS> anyway... for now.. just adding some filtering on tags and counting and it should work  [00:39]
<clarkb> and it doesn't shortcut like I thought  [00:40]
<clarkb> SpamapS: I think updating the base class hasRemainingQuota() to check the local db against max* would get you what you want  [00:41]
<clarkb> then drivers can selectively do more by completely overriding the base class method, or do both by calling super() and then doing more  [00:41]
<SpamapS> clarkb: max*? Are there cpu counts hiding in there too?  [00:41]
<clarkb> SpamapS: ya, ram, cpu, and instances  [00:41]
<clarkb> looks like  [00:41]
<clarkb> hrm, I suppose you'd really only be able to do instances that way  [00:42]
<clarkb> because you won't know ram or cpu usage  [00:42]
<clarkb> but still an improvement  [00:42]
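Roughly what that base-class check could look like; this is a sketch only, and the attribute names (`self.pool`, `self.zk`, `self.provider`) are assumptions rather than nodepool's actual handler API:

```python
# Sketch: enforce max-servers from the handler base class using only the
# local database, before any cloud-specific quota logic runs.
def hasRemainingQuota(self, ntypes):
    max_servers = self.pool.max_servers
    if max_servers is None:
        return True
    # Count nodes this provider already tracks locally; only the instance
    # count can be checked this way, not RAM or CPU usage.
    in_use = sum(1 for node in self.zk.nodeIterator()
                 if node.provider == self.provider.name)
    return in_use + len(ntypes) <= max_servers
```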
<SpamapS> ah ok, interesting.. AWS actually does have a price list API that has some of what the openstack flavors API has... so we could actually do this  [00:44]
<SpamapS> https://docs.aws.amazon.com/aws-cost-management/latest/APIReference/API_pricing_GetProducts.html  [00:44]
<SpamapS> In fact with that API, one could actually set up a run-rate as a limit  [00:47]
<SpamapS> but yeah, for now, I just want max-servers :-P  [00:47]
* SpamapS working through just querying EC2.  [00:47]
<openstackgerrit> Clark Boylan proposed openstack-infra/nodepool master: Add simple max-server sanity check to base handler class  https://review.openstack.org/651676  [01:01]
<clarkb> SpamapS: ^ I haven't tested that, but it may be as simple as that for your needs  [01:01]
*** jamesmcarthur has joined #zuul  [01:34]
*** jamesmcarthur has quit IRC  [01:48]
*** jamesmcarthur has joined #zuul  [01:49]
*** jamesmcarthur has quit IRC  [02:25]
*** bhavikdbavishi has joined #zuul  [02:42]
*** bhavikdbavishi has quit IRC  [03:05]
*** jamesmcarthur has joined #zuul  [03:25]
*** jamesmcarthur has quit IRC  [03:41]
*** bhavikdbavishi has joined #zuul  [03:44]
*** bhavikdbavishi1 has joined #zuul  [03:49]
*** bhavikdbavishi has quit IRC  [03:49]
*** bhavikdbavishi1 is now known as bhavikdbavishi  [03:49]
*** jamesmcarthur has joined #zuul  [04:01]
*** jamesmcarthur has quit IRC  [04:19]
*** bjackman_ has joined #zuul  [04:25]
<openstackgerrit> Tristan Cacqueray proposed openstack-infra/nodepool master: Add python-path option to node  https://review.openstack.org/637338  [04:44]
*** mhu has quit IRC  [04:45]
<openstackgerrit> Tristan Cacqueray proposed openstack-infra/zuul-jobs master: Add ansible-lint job  https://review.openstack.org/532083  [04:46]
*** quiquell|off is now known as quiquell|rover  [05:32]
*** gouthamr has quit IRC  [05:56]
*** gouthamr has joined #zuul  [06:00]
<openstackgerrit> Tristan Cacqueray proposed openstack-infra/zuul-jobs master: Add ansible-lint job  https://review.openstack.org/532083  [06:12]
*** quiquell|rover is now known as quique|rover|brb  [06:26]
*** gtema has joined #zuul  [06:57]
*** quique|rover|brb is now known as quiquell|rover  [07:06]
*** threestrands has quit IRC  [07:30]
*** jpena|off is now known as jpena  [07:36]
*** hashar has joined #zuul  [08:18]
*** yolanda_ has joined #zuul  [08:29]
*** mhu has joined #zuul  [09:26]
*** electrofelix has joined #zuul  [09:40]
*** bhavikdbavishi has quit IRC  [10:12]
*** yolanda_ has quit IRC  [10:40]
*** yolanda_ has joined #zuul  [10:52]
*** jpena is now known as jpena|lunch  [10:56]
<quiquell|rover> tristanC: do you know if a job that depends on a non-voting job will start even if the dependency fails?  [11:01]
<quiquell|rover> nhicher: ^ do you know?  [11:02]
<quiquell|rover> I am going to test that  [11:03]
*** bhavikdbavishi has joined #zuul  [11:08]
<AJaeger> I hope it does not ;)  [11:12]
<quiquell|rover> testing it, but our pipeline looks like  [11:20]
<quiquell|rover> AJaeger: openstack-periodic at https://softwarefactory-project.io/zuul/t/rdoproject.org/status  [11:20]
<quiquell|rover> AJaeger: the dependency is the buildah job and it's failing  [11:20]
<quiquell|rover> AJaeger: testing it here https://review.rdoproject.org/r/#/c/20143/  [11:21]
<quiquell|rover> AJaeger: nah, it is working as expected https://review.rdoproject.org/r/#/c/20143/  [11:31]
<AJaeger> good ;)  [11:34]
<quiquell|rover> AJaeger: then I don't know why our jobs are running  [11:35]
<quiquell|rover> :-/ that's worse  [11:35]
<quiquell|rover> AJaeger: do you have some brain cycles for me?  [11:35]
<quiquell|rover> AJaeger: this is the stuff I am talking about https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/zuul.d/tripleo.yaml#L45-L47  [11:40]
<AJaeger> sorry, not enough brain right now to dig into that ;(  [11:41]
*** rlandy has joined #zuul  [11:57]
*** rlandy is now known as rlandy|ruck  [11:58]
*** jamesmcarthur has joined #zuul  [12:13]
*** jpena|lunch is now known as jpena  [12:26]
<pabelanger> clarkb: mordred: tobiash: care to add https://review.openstack.org/163922 to your review queue? that should be a small improvement on zuul-merger with stacked commits  [12:29]
*** jamesmcarthur has quit IRC  [12:34]
*** jamesmcarthur has joined #zuul  [12:47]
*** quiquell|rover is now known as quiquell|lunch  [12:59]
*** bjackman_ has quit IRC  [13:04]
*** jamesmcarthur has quit IRC  [13:04]
<nhicher> quiquell|lunch: did you find the issue with your non-voting job?  [13:06]
<quiquell|lunch> nhicher: there was no issue, just a brain fart on my part  [13:07]
<quiquell|lunch> thanks anyways  [13:07]
<nhicher> quiquell|lunch: ok =)  [13:08]
*** webknjaz has joined #zuul  [13:09]
<webknjaz> Hello everyone!  [13:10]
* webknjaz wonders whether this is the right place to criticise the usage of CherryPy in Zuul...  [13:10]
*** bhavikdbavishi has quit IRC  [13:17]
<Shrews> webknjaz: perhaps you mean "discuss" rather than "criticise"? but yes, this is the correct channel, though the person that switched us to that is on vacation at the moment  [13:17]
*** pcaruana has quit IRC  [13:20]
<webknjaz> @Shrews: yeah... The thing is that I've been exposed to the source code, which made me a bit unhappy :)  [13:22]
<webknjaz> I'm a CherryPy maintainer (along with other things like aiohttp, ansible core).  [13:22]
<webknjaz> So I just wanted to point out how to do a few things the CherryPy way. I'm especially interested in GitHub Apps now.  [13:22]
<webknjaz> Here's a cleaner example of having event handlers, from one of my PoCs: https://github.com/webknjaz/ansiwatch-bot/blob/58246a8/ansiwatch_bot/apps/github_events.py#L136-L157  [13:22]
<webknjaz> Oh, and I'm currently developing a framework for writing GitHub Apps and Actions: https://tutorial.octomachinery.dev  [13:22]
<webknjaz> @Shrews: so I just wanted to say that if interested parties want some feedback, maybe you could tell them to ping me once they are back?  [13:23]
<Shrews> webknjaz: we'd LOVE feedback on how to do things better, or where we don't do things in the best way. anything you can share to help us improve is much appreciated. No need to wait for that particular dev to return.  [13:30]
<webknjaz> Is this channel logged? Would it be better to share it somewhere more discoverable?  [13:31]
<Shrews> webknjaz: yes, it is logged: http://eavesdrop.openstack.org/irclogs/%23zuul/  [13:32]
*** quiquell|lunch is now known as quiquell|rover  [13:37]
*** pcaruana has joined #zuul  [13:42]
<webknjaz> I think that some parts of http://git.zuul-ci.org/cgit/zuul/tree/zuul/web/__init__.py#n647 could use plugins like the ones here: https://github.com/cherrypy/cherrypy/blob/b648977/cherrypy/process/plugins.py#L484-L580  [13:51]
<webknjaz> This class also looks like it should be a plugin: http://git.zuul-ci.org/cgit/zuul/tree/zuul/driver/github/githubconnection.py#n101  [13:53]
<webknjaz> people don't seem to realize that the core of CherryPy is a pubsub bus which they can actually use as well  [13:54]
*** bjackman_ has joined #zuul  [13:56]
<webknjaz> https://www.irccloud.com/pastebin/1QzyW2Aj/  [13:59]
*** bjackman_ has quit IRC  [14:01]
<webknjaz> But I think that this is still the wrong approach. Architecturally, it's way nicer to customize the routing layer. CherryPy has pluggable and replaceable request dispatchers.  [14:02]
<webknjaz> For example, here's what I've done: https://github.com/webknjaz/ansiwatch-bot/blob/master/ansiwatch_bot/request_dispatcher.py  [14:02]
<webknjaz> This allows me to have per-event methods: https://github.com/webknjaz/ansiwatch-bot/blob/58246a8/ansiwatch_bot/apps/github_events.py#L136-L157  [14:02]
*** swest has quit IRC  [14:03]
<webknjaz> You can have a decorator to unpack all webhook payload keys as handler method arguments: https://github.com/webknjaz/ansiwatch-bot/blob/master/ansiwatch_bot/tools.py  [14:03]
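A sketch of that decorator idea (hypothetical code, not the linked ansiwatch-bot tool): pass the JSON webhook body's top-level keys to the handler as keyword arguments.

```python
# Hedged sketch of a payload-unpacking decorator for CherryPy handlers.
import functools

import cherrypy


def unpack_payload(handler):
    @functools.wraps(handler)
    def wrapper(self, *args, **kwargs):
        payload = cherrypy.request.json or {}  # populated by tools.json_in
        return handler(self, *args, **payload, **kwargs)
    return wrapper


class GithubEvents:
    @cherrypy.expose
    @cherrypy.tools.json_in()
    @unpack_payload
    def pull_request(self, action=None, number=None, repository=None, **rest):
        # 'action', 'number' and 'repository' come straight from the webhook body.
        return 'handled %s #%s' % (action, number)
```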
*** swest has joined #zuul  [14:04]
<webknjaz> Overall, working with GitHub can be offloaded to plugins: https://github.com/webknjaz/ansiwatch-bot/blob/master/ansiwatch_bot/plugins.py  [14:05]
<Shrews> webknjaz: i'm not sure how much sense it makes for us to use the cherrypy dispatchers. we have other parts of the app (non-cherrypy based) that we are communicating with via gearman, which is what you see there in the payload() method  [14:13]
<webknjaz> what do you mean? does gearman act as a proxy?  [14:13]
*** ianychoi has quit IRC  [14:13]
*** ianychoi has joined #zuul  [14:14]
<Shrews> we have several pieces/daemons to "zuul" that all communicate with each other (submitting jobs/requests/etc) using gearman  [14:14]
<mordred> yeah. it's a network bus / job worker system  [14:14]
<pabelanger> webknjaz: https://zuul-ci.org/docs/zuul/admin/components.html might help explain the different parts of zuul  [14:14]
<Shrews> pabelanger: ah yes, was just looking for that. thx  [14:15]
<pabelanger> np!  [14:15]
<mordred> so this http://git.zuul-ci.org/cgit/zuul/tree/zuul/driver/github/githubconnection.py#n1556 is the only bit running on the zuul-web server; the other bits are running in other processes, probably on other machines, that don't do anything with http requests  [14:15]
<webknjaz> so it's vice versa  [14:16]
<webknjaz> anyway, it still needs to have validation in a tool  [14:16]
<mordred> yeah - the web interaction is largely "receive payload, validate signature, put on gearman bus" - but yeah, I think doing the validation as a decorator like you mention could be potentially nice  [14:17]
<webknjaz> for interfacing with jobs it may be better to have a CherryPy plugin  [14:17]
<webknjaz> because otherwise implementation details of the RPC leak into the web handler layer  [14:18]
<mordred> like a cherrypy plugin that does the submit job? that's an interesting idea  [14:18]
<webknjaz> yep  [14:18]
<webknjaz> you'd use pub-sub to interface with it  [14:19]
<webknjaz> https://www.irccloud.com/pastebin/MlVJDNcG/  [14:20]
<mordred> nod. I could see that plugin being potentially nice - passing things onto the gearman bus is a common pattern  [14:20]
<webknjaz> or without that helper func it's: https://github.com/webknjaz/ansiwatch-bot/blob/master/ansiwatch_bot/apps/github_events.py#L93-L94  [14:21]
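A hedged sketch of what such a plugin could look like, using CherryPy's SimplePlugin/bus API; the channel name and the `submit_job` call are made-up placeholders, not Zuul's actual gearman client interface.

```python
# Sketch only: a bus plugin that owns the RPC client, so web handlers just
# publish on a channel instead of talking to gearman directly.
import cherrypy
from cherrypy.process import plugins


class RPCForwarderPlugin(plugins.SimplePlugin):
    def __init__(self, bus, rpc_client):
        super().__init__(bus)
        self.rpc = rpc_client  # hypothetical gearman-ish client object

    def start(self):
        self.bus.log('RPC forwarder starting')
        self.bus.subscribe('rpc-submit', self.submit)

    def stop(self):
        self.bus.unsubscribe('rpc-submit', self.submit)

    def submit(self, name, payload):
        # submit_job() is a placeholder for whatever the real client exposes.
        return self.rpc.submit_job(name, payload)


# Handlers would then call:
#   cherrypy.engine.publish('rpc-submit', 'github:payload', body)
# after the plugin is registered with:
#   RPCForwarderPlugin(cherrypy.engine, client).subscribe()
```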
<mordred> webknjaz: https://opendev.org/openstack-infra/zuul/src/branch/master/zuul/web/__init__.py#L263-L270 ... is an example of one of the simpler but common patterns of "turn this http request into a gearman call" - is there a better way to set the CSRF header more generally than just grabbing the response and setting the header directly like we're doing there?  [14:25]
<webknjaz> I believe you could use another tool  [14:26]
<webknjaz> http://docs.cherrypy.org/en/latest/extend.html#hook-point  [14:26]
<webknjaz> Try using the `before_finalize` hook point  [14:27]
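For example (a sketch using CherryPy's Tool API; the header name and value below are placeholders rather than Zuul's actual headers):

```python
# Sketch: register a custom tool on the 'before_finalize' hook point so the
# header is added for every handler that enables the tool, instead of each
# handler poking cherrypy.response directly.
import cherrypy


def _add_api_headers():
    # Placeholder header; substitute whatever zuul-web actually needs to set.
    cherrypy.response.headers['Access-Control-Allow-Origin'] = '*'


cherrypy.tools.api_headers = cherrypy.Tool('before_finalize', _add_api_headers)


class Api:
    @cherrypy.expose
    @cherrypy.tools.api_headers()
    def keys(self):
        return '{"keys": []}'
```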
<webknjaz> After looking at http://git.zuul-ci.org/cgit/zuul/tree/zuul/driver/github/githubconnection.py#n241 I think that you really need a custom dispatcher...  [14:28]
<webknjaz> maybe you could even use a WSGI app interface "on the other end" of gearman  [14:29]
*** quiquell|rover is now known as quiquell|off  [14:34]
*** hashar has quit IRC  [15:03]
*** bhavikdbavishi has joined #zuul  [15:13]
*** pcaruana has quit IRC  [15:42]
*** jamesmcarthur has joined #zuul  [15:49]
*** jamesmcarthur_ has joined #zuul  [15:50]
*** jamesmcarthur has quit IRC  [15:50]
<openstackgerrit> Matthieu Huin proposed openstack-infra/zuul master: [DNM] admin REST API: docker-compose PoC, frontend  https://review.openstack.org/643536  [15:55]
*** bhavikdbavishi has quit IRC  [16:01]
*** chandankumar is now known as raukadah  [16:01]
*** quiquell|off has quit IRC  [16:14]
*** ianychoi has quit IRC  [16:20]
<SpamapS> Shrews: you around? I have a question about NodeRequestHandler.decline_request ... it fails if all launchers have declined.. but.. don't we want to retry it again at some point?  [16:26]
<SpamapS> Like, if all the providers are just busy... we want that request to queue, right?  [16:26]
<SpamapS> But it seems like that will just lead to NODE_FAILURE  [16:27]
<clarkb> when they are busy they stop processing requests  [16:29]
<clarkb> so it shouldn't decline unless it actually failed up to the retry count or the provider does not have the label  [16:30]
<SpamapS> clarkb: oh? how do they know that?  [16:30]
<clarkb> SpamapS: the hasRemainingQuota check  [16:30]
<SpamapS> Mine hits hasProviderQuota first, and fails the request.  [16:30]
<SpamapS> So there's a window right as you get busy, where requests fail.  [16:31]
<clarkb> it is possible this is another openstack-specific behavior that should be at a higher level  [16:31]
<SpamapS> Or it's just a rare thing and you don't see it that often?  [16:32]
<SpamapS> Like, to make it super careful, you'd need to call hasRemainingQuota before accepting requests.  [16:32]
<SpamapS> And ideally that would reserve resources, so you don't accept two and then one gets failed.  [16:33]
*** zbr has quit IRC  [16:34]
<clarkb> yes, I think it gets the request, checks hasRemainingQuota, and if not, blocks until resources are freed  [16:34]
<clarkb> that serialization prevents sending back inappropriate declines  [16:34]
<SpamapS> Well, in the test suite what I have is one active request, a quota of 1, and when I send another request for 1, it's failed.  [16:39]
<SpamapS> I'll push up what I have now, and maybe you can spot the assumptions.  [16:39]
<openstackgerrit> Clint 'SpamapS' Byrum proposed openstack-infra/nodepool master: Implement max-servers for AWS driver  https://review.openstack.org/649474  [16:40]
*** zbr has joined #zuul  [16:40]
<SpamapS> clarkb: I would very much appreciate your eyes when you have some time. Thanks ^^  [16:40]
<clarkb> SpamapS: https://git.openstack.org/cgit/openstack-infra/nodepool/tree/nodepool/driver/__init__.py#n480  [16:41]
<clarkb> that is the code that pauses when hasRemainingQuota isn't true  [16:41]
<SpamapS> clarkb: yeah, in my test runs I never see that "Pausing"  [16:43]
<SpamapS> so there is at least a narrow window between accepting the request that puts you at quota, and pausing, that results in a failed request. I wonder, if I add a sleep, whether that will remain true  [16:46]
<SpamapS> In fact, the request gets declined, which is why we don't go to waitForNodeSet.  [16:49]
<SpamapS> Not sure how to get through to that next loop iteration that pauses things once we're at quota, actually. Seems like everything will bounce off as declined.  [16:51]
* Shrews catches up  [16:52]
<SpamapS> Seems to me that you shouldn't ever decline an over-quota request, you should just pause.  [16:52]
*** jamesmcarthur has joined #zuul  [16:52]
<Shrews> SpamapS: i suspect part of the problem may be that you are sharing resources across provider pools, and we don't really support that right now  [16:55]
*** jamesmcarthur_ has quit IRC  [16:56]
<Shrews> so your check for quota should never race with another pool thread trying to use that same quota pool  [16:56]
<Shrews> (if you follow that configuration rule, that is)  [16:56]
<Shrews> i realize that's probably not ideal with AWS in its current form  [16:56]
<SpamapS> This is a test  [16:56]
<SpamapS> 1 pool  [16:56]
<SpamapS> 1 provider  [16:56]
<SpamapS> Also no, the quota check I put in place is scoped to the pool.  [16:57]
<Shrews> hrm  [16:57]
<SpamapS> But in this case, the problem is the algorithm by which we pause. I believe it has a race condition in it, where you will decline a request if you are already exactly at quota.  [16:58]
<SpamapS> And that may not happen often in OpenStack because of the many quotas.  [16:58]
<SpamapS> Some min-ready comes along and unwedges things by slipping under the quotas.  [16:58]
<SpamapS> But with just the max-servers quota.. it's always going to be the case that we walk right up to the quota. And then there's no path I can see through the code to self.paused = True  [16:59]
<SpamapS> Every subsequent request to that pool will be declined until the quota is released.  [16:59]
<SpamapS> The comment at the top of _waitForNodeSet I think calls out the race at the driver level, but it's actually deeper.  [17:01]
*** gtema has quit IRC  [17:04]
<SpamapS> I'm actually having trouble figuring out how the pause in _waitForNodeSet is ever reached.  [17:04]
*** pcaruana has joined #zuul  [17:05]
<Shrews> SpamapS: if you have to launch a node, but you are out of quota, then you reach that pause.  [17:05]
<SpamapS> Shrews: not in the test case here: https://review.openstack.org/649474  [17:06]
<SpamapS> You can't even get to "need to launch a node" because you're already failing hasProviderQuota.  [17:06]
<SpamapS> Or maybe I misunderstood what hasProviderQuota is supposed to do.  [17:07]
<SpamapS> I wonder if OpenStack gets around this because of the estimatedNodepoolQuota .. maybe it passes hasProviderQuota in that next request instance.  [17:09]
<SpamapS> Yeah, the caching probably hides this.  [17:09]
<SpamapS> Keeping the window open just long enough to slip down to _waitForNodeSet.  [17:10]
*** jpena is now known as jpena|off  [17:10]
<Shrews> SpamapS: hasProviderQuota, iirc, is supposed to be "can this provider handle the nodes in the request, regardless of what is available now". hasRemainingQuota is the answer to "what is available now"  [17:12]
<SpamapS> Shrews: ok, so hasProviderQuota should *not* take running machines into account, but rather the total capacity of said provider?  [17:12]
<SpamapS> That does make sense.  [17:13]
<Shrews> SpamapS: i *think* so.... been a while  [17:13]
<Shrews> it was originally for things like "request wants 50 nodes, but provider only has 40 total"  [17:13]
<SpamapS> yeah, that would make sense  [17:14]
<SpamapS> If that's the case, I think we should make the comment in the abstract class more clear. I'll take that up.  [17:14]
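To paraphrase the split being described, an illustrative sketch simplified to an instance count (not nodepool's real implementations; `countNodes()` is a hypothetical helper):

```python
def hasProviderQuota(self, ntypes):
    # Could this provider *ever* satisfy the request, ignoring current usage?
    # e.g. a request for 50 nodes against a 40-node provider fails here.
    return len(ntypes) <= self.pool.max_servers


def hasRemainingQuota(self, ntypes):
    # Is there room *right now*, counting nodes already running?
    return self.countNodes() + len(ntypes) <= self.pool.max_servers
```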
<Shrews> SpamapS: but that morphed with tobiash's quota changes, so that's harder to see just from looking at it and trying to remember :)  [17:14]
<SpamapS> Yeah, I think in reading the openstack one I missed that there's an `estimatedNodepoolQuota` and `estimatedNodepoolQuotaUsed` ...  [17:14]
<clarkb> Shrews: semi-related to this is https://review.openstack.org/#/c/651676/  [17:14]
<SpamapS> I just saw them as the same call.  [17:14]
<Shrews> SpamapS: "Checks if a provider has enough quota to handle a list of nodes. This does not take our currently existing nodes into account."  [17:15]
<SpamapS> yep, that was it  [17:16]
<Shrews> SpamapS: i mean, that seems to say what i just said. what's unclear?  [17:16]
<SpamapS> http://paste.openstack.org/show/749204/ <-- this makes the test work the way I expected.  [17:17]
<SpamapS> Shrews: I think I missed the "does not" part.  [17:17]
<SpamapS> "psshhh... details."  [17:17]
<Shrews> SpamapS: happy to +2 any changes that make that more clearerer :)  [17:19]
* Shrews wonders if <blink> works in rst....  [17:19]
* SpamapS deserved that  [17:20]
<openstackgerrit> Clint 'SpamapS' Byrum proposed openstack-infra/nodepool master: Implement max-servers for AWS driver  https://review.openstack.org/649474  [17:21]
<SpamapS> durn pep8  [17:23]
<openstackgerrit> Clint 'SpamapS' Byrum proposed openstack-infra/nodepool master: Implement max-servers for AWS driver  https://review.openstack.org/649474  [17:23]
<SpamapS> Shrews: ^ ok, this I think actually implements it correctly. :)  [17:23]
<SpamapS> whoops, spotted a bug  [17:24]
<openstackgerrit> Clint 'SpamapS' Byrum proposed openstack-infra/nodepool master: Implement max-servers for AWS driver  https://review.openstack.org/649474  [17:24]
<Shrews> clarkb: what's the impetus for that change? Not saying we shouldn't do it, just curious what brought that about  [17:30]
<clarkb> Shrews: if we had that, then the aws driver would've worked as-is for SpamapS's quota issues, I think  [17:31]
<clarkb> Shrews: very few of our drivers implement hasRemainingQuota, so they all ignore the common max-servers directive  [17:31]
<clarkb> this should enforce that directive accurately as long as the "cloud" doesn't leak instances  [17:32]
*** bhavikdbavishi has joined #zuul  [17:32]
<SpamapS> clarkb: +++ I like yours. :)  [17:33]
*** jamesmcarthur has quit IRC  [17:35]
<Shrews> clarkb: my only concern with that, atm, is the method used to count current nodes. that may not be accurate since it includes all node states.  [17:38]
<clarkb> Shrews: I believe you need to include all node states, as errored/ready/in-use/deleting-but-not-yet-deleted nodes all consume quota  [17:39]
<Shrews> clarkb: INIT and ABORTED do not  [17:40]
<clarkb> do nodes go aborted only prior to asking $cloud to make them?  [17:40]
<Shrews> it's the latter one that concerns me the most. i know it is transient, but don't remember how long that sticks around (or how many may result from a single launch)  [17:40]
<clarkb> it should be easy enough to filter out by state if we can nail that down to states that never represent resources in a cloud  [17:42]
<clarkb> but I thought they all basically did. Maybe init doesn't  [17:42]
<Shrews> you'll need to at least filter those two states  [17:43]
<Shrews> and maybe FAILED  [17:44]
<clarkb> ya, aborted is only set when we hit a quota error? so in theory that shouldn't count against your quota  [17:45]
<clarkb> failed does count against your quota in openstack  [17:45]
<clarkb> I don't know if it will in all clouds, but being conservative there seems fine?  [17:45]
<Shrews> failed doesn't always count, but being conservative seems logical  [17:46]
<Shrews> failing to launch a node after X retries will result in a FAILED state  [17:46]
<Shrews> but so can launching a node and losing the ZK connection before we could record it  [17:47]
*** jamesmcarthur has joined #zuul  [17:47]
*** sshnaidm_ has joined #zuul  [17:52]
<openstackgerrit> Clark Boylan proposed openstack-infra/nodepool master: Add simple max-server sanity check to base handler class  https://review.openstack.org/651676  [17:53]
*** sshnaidm has quit IRC  [17:53]
<clarkb> Shrews: ^ does that look better? I added DELETED as well, since that should mean the node is completely gone from the cloud and is now just accounted for in our db  [17:53]
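The filtering being discussed, as a sketch (state names follow the conversation above; the real constants live elsewhere in nodepool and the final patch may differ):

```python
# Sketch: when counting nodes against max-servers, skip states that should
# never hold cloud resources, and conservatively keep FAILED in the count.
UNCOUNTED_STATES = {'init', 'aborted', 'deleted'}


def countProviderNodes(zk, provider_name):
    return sum(1 for node in zk.nodeIterator()
               if node.provider == provider_name
               and node.state not in UNCOUNTED_STATES)
```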
<manjeets> Hi guys, has anything changed for upstream untrusted projects?  [17:54]
*** jamesmcarthur has quit IRC  [17:54]
<manjeets> we have a job pipeline that works fine on ci-sandbox  [17:54]
<manjeets> but the same jobs are not getting triggered on the actual project  [17:55]
<manjeets> https://review.openstack.org/#/c/647960/2  [17:55]
<manjeets> look at the comment from Intel SriovTaas CI  [17:55]
<manjeets> but when I run the same job on ci-sandbox it gets triggered  [17:56]
*** jamesmcarthur has joined #zuul  [17:56]
<clarkb> manjeets: the ability to merge is repo-specific  [17:56]
<manjeets> clarkb, we don't have merge in the pipeline, actually  [17:56]
<manjeets> using zuul's docker-compose thing  [17:57]
<clarkb> zuul has to merge your commit against the target branch to test it  [17:57]
<clarkb> that is failing, according to the error message  [17:57]
<clarkb> I would check your merger logs  [17:57]
*** jamesmcarthur has quit IRC  [17:58]
<manjeets> clarkb, merge where? it should never merge to the repo anyway?  [17:58]
*** jamesmcarthur has joined #zuul  [17:59]
<clarkb> manjeets: one of the fundamental design choices of zuul is that it intends to test what your code would look like if it merged to the actual repo. This means before testing a change, zuul merges it locally against the target branch and tests the resulting commit. All of this is in zuul-managed git repos. If tests pass, then later zuul can ask gerrit to merge them into the canonical repo  [17:59]
<clarkb> this way developers don't have to manually rebase every time a new commit merges to get accurate test results. Zuul does that for you, and you know when you ask zuul to merge the code that it should work, to a high degree of confidence  [18:00]
<clarkb> the error message on that change indicates zuul failed to do this local merge. I would check the zuul merger process's logs  [18:00]
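Conceptually, the speculative merge clarkb describes amounts to something like the following sketch (not Zuul's actual merger code; `change_ref` would be a Gerrit ref such as refs/changes/NN/NNNNNN/P):

```python
# Sketch: merge the proposed change onto the tip of the target branch in a
# local clone, then run jobs against the merged result rather than the
# change in isolation.
import subprocess


def speculative_merge(repo_dir, target_branch, change_ref):
    def git(*args):
        subprocess.run(('git',) + args, cwd=repo_dir, check=True)

    git('fetch', 'origin', target_branch)
    git('checkout', '-B', 'speculative-merge', 'origin/' + target_branch)
    git('fetch', 'origin', change_ref)
    git('merge', '--no-edit', 'FETCH_HEAD')  # a failure here is the "merge failed" case
```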
<Shrews> clarkb: yah, good call on DELETED too  [18:01]
*** jamesmcarthur has quit IRC  [18:01]
*** jamesmcarthur has joined #zuul  [18:02]
*** sshnaidm_ is now known as sshnaidm|off  [18:03]
<openstackgerrit> Fabien Boucher proposed openstack-infra/zuul master: WIP - Pagure driver - https://pagure.io/pagure/  https://review.openstack.org/604404  [18:03]
*** electrofelix has quit IRC  [18:19]
*** jamesmcarthur has quit IRC  [18:26]
*** jamesmcarthur has joined #zuul  [18:28]
<pabelanger> tobiash: I cannot remember, but were you maybe discussing the ability to support all forms of merge that github supports? (eg: merge / squash / rebase)  [18:29]
<tobiash> pabelanger: yes, I was part of the discussion  [18:31]
<pabelanger> tobiash: do you happen to know when that was again, so I can find the irclogs?  [18:31]
<pabelanger> I'd like to refresh myself on that topic  [18:32]
<manjeets> clarkb, I get it, do you mean it merges the patchset to a repo cloned locally for testing?  [18:32]
<clarkb> manjeets: yes  [18:32]
<tobiash> pabelanger: hrm, no idea, could be months  [18:33]
<pabelanger> kk  [18:33]
*** hashar has joined #zuul  [18:57]
<SpamapS> do we not run any coverage reports for nodepool tests?  [18:59]
<clarkb> hrm, we did before the zuulv3 rewrite  [19:01]
<clarkb> I don't see it now  [19:01]
<Shrews> removed that long ago  [19:01]
<Shrews> https://review.openstack.org/#/c/608688/  [19:02]
*** pcaruana has quit IRC  [19:02]
<clarkb> hrm, fwiw I found it really useful when improving tests and debugging races  [19:05]
<clarkb> (I added it and the functional jobs way back when to tackle nodepool's frequent breakages)  [19:05]
<Shrews> you should still be able to run it on demand  [19:05]
<clarkb> ya, or run it locally. I think we ended up stabilizing to the point where it wasn't as useful as often, so not really objecting. Just pointing out that it can be valuable  [19:06]
<mordred> Shrews: zuul-preview seems to be running super slow  [19:06]
<clarkb> the functional tests go a long way for sanity checking  [19:06]
<mordred> Shrews: if you check out https://review.openstack.org/#/c/651219/ and click on inaugust-build-website - it'll just sit there spinning  [19:07]
<mordred> I started looking into it - but haven't gotten very far  [19:07]
<mordred> obviously it's not _urgent_ as it's not really a fully production thing yet - but thought I'd mention it  [19:07]
<Shrews> mordred: we didn't merge my proposed changes yet, did we?  [19:08]
<mordred> Shrews: no - I don't think so  [19:09]
<Shrews> not that it would improve anything, but wondering if i broke something  [19:09]
<Shrews> no, we didn't  [19:09]
<mordred> Shrews: although I did +2 them  [19:09]
<mordred> I think we're getting crawled  [19:11]
<mordred> docker logs --since 30m -f 3310c96209ef on the host shows a bunch of activity - none of it useful  [19:12]
<Shrews> mordred: cache overload?  [19:12]
<mordred> maybe? although I think this:  [19:12]
<mordred> [Thu Apr 11 19:09:20.617020 2019] [proxy_http:error] [pid 2916:tid 139831896164096] (70007)The timeout specified has expired: [client 174.143.130.226:55828] AH01102: error reading status line from remote server 174.143.130.226:80  [19:13]
<mordred> oh - nevermind - I thought it was timing out remotely  [19:13]
<Shrews> mordred: i don't even know where to access zuul-preview. it's a container then? what host?  [19:13]
<mordred> yeah - zuul-proxy.opendev.org  [19:14]
<Shrews> that does not resolve for me  [19:15]
*** bhavikdbavishi has quit IRC  [19:20]
<Shrews> mordred: zp makes requests to http://zuul.openstack.org/api/tenant which is not doing anything for me except displaying "Fetching info"  [19:21]
<Shrews> so maybe the issue is in zuul-web  [19:22]
<Shrews> something is hammering zuul-web for change 651910  [19:27]
<Shrews> mordred: is it normal to see so many of the same GET requests for a change? That doesn't seem right  [19:31]
<Shrews> e.g., GET /api/tenant/openstack/status/change/651910,1  [19:32]
<Shrews> happening multiple times per second  [19:33]
<pabelanger> that is a tripleoci patch  [19:33]
<Shrews> so is 651912, which pops up a lot too, but not as much as the other one  [19:34]
<pabelanger> I wonder if that is somebody running a zuul-runner-like tool  [19:34]
<pabelanger> I know tripleo is trying to do things like scrape the API to run zuul jobs locally  [19:35]
<pabelanger> rlandy|ruck: ^ by chance, are you aware of any tooling that would scrape the zuul.o.o api?  [19:35]
<Shrews> pabelanger: so maybe something of theirs is behaving badly  [19:35]
<pabelanger> Shrews: do you have an IP where the traffic is coming from?  [19:37]
<Shrews> no, only 127.0.0.1 (guessing the real ip isn't known?). the client signature is "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101 Firefox/66.0"  [19:38]
<Shrews> so maybe not an automated tool  [19:38]
<Shrews> i'm very curious why we don't have the incoming IP now. some cherrypy-ism?  [19:40]
<webknjaz> ?  [19:41]
*** weshay has joined #zuul  [19:41]
<webknjaz> if it's behind a reverse proxy you should use a proxy tool  [19:41]
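That is, something along the lines of this sketch (CherryPy's built-in proxy tool; whether zuul-web wires it up this way is an assumption):

```python
# Sketch: behind Apache, let CherryPy take the client address from the
# X-Forwarded-For header so logs show the real IP instead of 127.0.0.1.
import cherrypy

cherrypy.config.update({
    'tools.proxy.on': True,
    # 'tools.proxy.remote': 'X-Forwarded-For',  # this is already the default
})
```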
<rlandy|ruck> pabelanger: we looked into it but we never went that way  [19:41]
<rlandy|ruck> too complicated, error prone  [19:42]
<pabelanger> Oh  [19:42]
<pabelanger> there is a tripleo CI tool  [19:42]
<pabelanger> that generates reports in grafana  [19:42]
<pabelanger> for that, I think they scrape the api from zuul-web  [19:42]
<pabelanger> rlandy|ruck: do you happen to know where that is done?  [19:43]
<clarkb> yes, it is behind apache, so need to check the apache logs  [19:44]
<rlandy|ruck> zbr: ^^ maybe asking about your work  [19:45]
<rlandy|ruck> this was not reproducer-related  [19:45]
<Shrews> clarkb: indeed. thx  [19:46]
<rlandy|ruck> pabelanger: or this: http://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=1 ?  [19:46]
<pabelanger> rlandy|ruck: yes, that is it, thank you  [19:47]
<pabelanger> talking in #tripleo now  [19:47]
<Shrews> source IP is 2a02:8010:61a9:33::1dd7  [19:47]
<rlandy|ruck> pabelanger: k - sorry - thought you were after reproducer work  [19:47]
<pabelanger> Shrews: that isn't rdocloud, no ipv6, so that is good  [19:49]
<Shrews> we've been getting hit hard ever since that change was submitted, so maybe find the author?  [19:50]
<pabelanger> Shrews: yah, talking to him now  [19:51]
<clarkb> whois says broadband service in the uk  [19:51]
<pabelanger> tripleo is digging into it now  [19:54]
<pabelanger> Shrews: clarkb: just looking at apache logs, there might be a few tripleo scripts scraping the API, 38.145.34.111 is another and that is in rdo-cloud  [20:04]
<clarkb> any idea what tripleo is trying to learn?  [20:04]
<clarkb> (I wonder if this indicates some new api endpoints that we might want to add)  [20:05]
<pabelanger> clarkb: mostly monitoring the health of their jobs  [20:05]
<pabelanger> eg: http://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=1 is something they built  [20:05]
<Shrews> most likely indicates the need for rate limiting :)  [20:05]
<pabelanger> but weshay is best to ask  [20:05]
<clarkb> hrm, we have that data in graphite already?  [20:06]
<pabelanger> Shrews: ya, good idea  [20:06]
<pabelanger> IIRC, zbr also  [20:06]
<clarkb> eg we shouldn't need to scrape zuul's api and can pull that data from what should be cheaper, quicker sources like graphite  [20:06]
*** jamesmcarthur has quit IRC  [20:10]
<weshay> clarkb: ya.. monitoring all the things to keep tripleo upstream healthy  [20:11]
<clarkb> weshay: can we stop doing this https://review.openstack.org/#/c/567224/ ?  [20:11]
<clarkb> and set up a periodic pipeline instead?  [20:11]
<weshay> yup! it's on the to-do list  [20:12]
<mordred> Shrews: wow - I hop on a call for a bit and miss all the fun  [20:14]
<Shrews> mordred: how "convenient" :-P  [20:15]
<clarkb> weshay: thanks  [20:16]
*** jamesmcarthur has joined #zuul  [22:01]
*** jamesmcarthur has quit IRC  [22:14]
*** hashar has quit IRC  [22:29]
<openstackgerrit> Paul Belanger proposed openstack-infra/zuul-jobs master: Add tox-py37 job  https://review.openstack.org/651938  [22:30]
<pabelanger> clarkb: fungi: mordred: tobiash: ianw: noticed we didn't have a tox-py37 job ^  [22:30]
<pabelanger> doh  [22:30]
<pabelanger> see bug  [22:30]
<openstackgerrit> Paul Belanger proposed openstack-infra/zuul-jobs master: Add tox-py37 job  https://review.openstack.org/651938  [22:31]
<pabelanger> there  [22:31]
<pabelanger> also, spacex launch in 3 mins  [22:32]
<fungi> which pad this time?  [22:33]
<clarkb> https://www.youtube.com/watch?v=TXMGu2d8c8g is the official youtube stream  [22:33]
<fungi> ahh, from kennedy  [22:34]
<fungi> wish they'd schedule more for wallops now that they've gotten the pad back together and cleaned up after the explosion  [22:35]
<clarkb> I think they need the saturn 5 pad for this particular configuration  [22:36]
<fungi> the announcer is a little enthusiastic  [22:36]
<clarkb> or maybe it's the old shuttle pad? basically it's huge, so few options  [22:36]
<pabelanger> fungi: Yah, he is awesome  [22:36]
<fungi> rocket ballet  [22:41]
<clarkb> the size of large buildings  [22:42]
<clarkb> I love the kerbal-style graphics  [22:43]
<pabelanger> Yay  [22:45]
<pabelanger> 3 for 3  [22:45]
<fungi> payload orbital insertion and relanding all first stage boosters in 10 minutes flat  [22:46]
<clarkb> one better than last time  [22:46]
<pabelanger> indeed, so cool  [22:46]
<mordred> ++  [22:46]
<clarkb> pabelanger: the linting job doesn't like the py37 job addition (I haven't looked at why yet)  [22:46]
<pabelanger> yah, looking now  [22:46]
<fungi> my inner moon landing conspiracy theorist says the life feed cutting out for the third booster touchdown was strategic ;)  [22:47]
<fungi> er, live feed  [22:47]
<mordred> pabelanger: do we want to add it to zuul so that we run py37 tests too?  [22:47]
<clarkb> to follow up on the docker ipv6 stuff, I haven't heard anything back on either the issue or the PR yet  [22:47]
<clarkb> fungi: ha  [22:47]
<fungi> "no really, we landed it, i swear"  [22:47]
<mordred> fungi: yeah. I'm torn on whether to go with conspiracy or 'streaming video sucks'  [22:47]
<fungi> i'm pretty sure streaming video sucks  [22:47]
<mordred> yeah  [22:48]
<openstackgerrit> Paul Belanger proposed openstack-infra/zuul-jobs master: Add tox-py37 job  https://review.openstack.org/651938  [22:48]
<pabelanger> think that is our fix  [22:48]
<mordred> and I can imagine that broadcasting video WHILE a rocket lands on top of you might be even more suck  [22:48]
<fungi> on an unmanned drone, even  [22:48]
<mordred> pabelanger: oh - ignore me - my brain was somehow thinking you were adding this to the zuul repo :)  [22:48]
<pabelanger> mordred: can we pull python3.7 from bionic? I've been using fedora-29 myself  [22:48]
<mordred> pabelanger: dunno. I install python with pyenv myself  [22:49]
<pabelanger> but, we could if people want  [22:49]
* mordred is such a terrible distro-company employee  [22:49]
<clarkb> fungi: you can probably look outside with binoculars to check, right?  [22:49]
<SpamapS> bionic has 3.7.1 and may not see many updates since it's in universe.  [22:49]
<clarkb> pabelanger: check out the work coreyb has done for openstack py37 testing  [22:50]
<clarkb> but ya, it's basically install it from universe and go  [22:50]
<pabelanger> clarkb: cool  [22:50]
<clarkb> SpamapS: I get some sense that canonical/coreyb are interested in it. To be seen how up to date it gets, though  [22:51]
<fungi> SpamapS: though to be fair, it released with something like 3.7.0 beta 2  [22:51]
<fungi> so they have at least updated it once already  [22:51]
<clarkb> pabelanger: the job should also work on fedora and tumbleweed  [22:54]
<clarkb> but tumbleweed will eventually stop having a python3.7 (when it gets 3.8) and fedora will EOL in a few months  [22:54]
<pabelanger> clarkb: Yup, was mostly curious if we wanted to add another distro into the mix or stick with debuntu  [22:55]
*** rlandy|ruck is now known as rlandy|ruck|bbl  [23:04]
<openstackgerrit> Paul Belanger proposed openstack-infra/zuul-jobs master: Don't run zuul_debug_info_enabled under python2.6  https://review.openstack.org/650880  [23:12]
<pabelanger> dmsimard: ^ update  [23:12]
<pabelanger> clarkb: mordred: ^ might have thoughts too  [23:12]
<openstackgerrit> Merged openstack-infra/zuul-jobs master: Add tox-py37 job  https://review.openstack.org/651938  [23:13]
<SpamapS> if people are interested in it, 3.7 will stay up to date.  [23:13]
<clarkb> pabelanger: while ansible itself may support python2.6, I'm not sure zuul can (no testing)  [23:15]
*** paladox has quit IRC  [23:16]
*** paladox has joined #zuul  [23:16]
<clarkb> if we do that check, we should log a warning when we skip the info  [23:16]
<pabelanger> clarkb: yah, agree. This is mostly on the remote side of the node from nodepool, I have network images that are still python2.6 :(  [23:16]
<clarkb> so that it doesn't silently disappear  [23:16]
<pabelanger> that was part of my reason for adding a switch, to avoid hiding the magic  [23:17]
*** rlandy|ruck|bbl has quit IRC  [23:21]
*** tobiash has quit IRC  [23:21]
*** corvus has quit IRC  [23:21]
*** jlvillal has quit IRC  [23:21]
*** mgoddard has quit IRC  [23:24]
*** mgoddard has joined #zuul  [23:27]
*** rlandy|ruck|bbl has joined #zuul  [23:27]
*** tobiash has joined #zuul  [23:27]
*** corvus has joined #zuul  [23:27]
*** jlvillal has joined #zuul  [23:27]
<fungi> the alternative is to build a python 3.something in an alternate path in your images and tell ansible to use that for ansible things  [23:50]
