Wednesday, 2019-10-02

*** michael-beaver has quit IRC00:02
*** jamesmcarthur has quit IRC00:04
*** jamesmcarthur has joined #zuul00:10
*** igordc has quit IRC00:15
*** jamesmcarthur has quit IRC00:16
*** mattw4 has quit IRC00:17
*** jamesmcarthur has joined #zuul00:19
*** jamesmcarthur has quit IRC00:41
*** jamesmcarthur has joined #zuul00:42
*** jamesmcarthur has quit IRC01:11
*** jamesmcarthur has joined #zuul01:11
openstackgerritMerged zuul/zuul master: Set url scheme on HTTP Gerrit events  https://review.opendev.org/68605401:22
*** jamesmcarthur has quit IRC01:41
*** jamesmcarthur has joined #zuul01:42
*** jamesmcarthur has quit IRC01:47
*** spsurya has joined #zuul01:47
*** jamesmcarthur has joined #zuul01:59
*** jamesmcarthur has quit IRC02:03
*** jamesmcarthur has joined #zuul02:04
*** irclogbot_0 has quit IRC02:09
*** jamesmcarthur has quit IRC02:09
*** irclogbot_0 has joined #zuul02:13
*** jamesmcarthur has joined #zuul02:33
*** jamesmcarthur has quit IRC02:41
*** saneax has joined #zuul03:18
*** jamesmcarthur has joined #zuul03:37
*** saneax has quit IRC03:37
*** jamesmcarthur has quit IRC03:44
*** recheck_ has joined #zuul04:35
*** jangutter_ has joined #zuul04:36
*** jangutter has quit IRC04:38
*** recheck has quit IRC04:38
*** fdegir9 has joined #zuul04:40
*** fdegir has quit IRC04:40
*** tobiash has quit IRC04:40
*** mhu has quit IRC04:40
*** Miouge has quit IRC04:40
*** tobiash has joined #zuul04:41
*** Miouge has joined #zuul04:42
*** pcaruana has joined #zuul04:52
*** jamesmcarthur has joined #zuul05:05
*** jamesmcarthur has quit IRC05:10
*** bhavikdbavishi has quit IRC05:25
*** AJaeger has quit IRC05:57
*** AJaeger has joined #zuul06:02
*** jamesmcarthur has joined #zuul06:36
*** jamesmcarthur has quit IRC06:49
*** avass has joined #zuul06:52
*** hashar has joined #zuul07:10
*** badboy has joined #zuul07:12
*** tosky has joined #zuul07:17
*** jamesmcarthur has joined #zuul07:25
*** jamesmcarthur has quit IRC07:31
*** jamesmcarthur has joined #zuul07:39
*** jamesmcarthur has quit IRC07:44
*** jamesmcarthur has joined #zuul07:44
*** jpena|off is now known as jpena07:48
*** jamesmcarthur has quit IRC07:51
*** jamesmcarthur has joined #zuul07:56
*** mhu has joined #zuul08:01
*** jamesmcarthur has quit IRC08:09
*** jpena is now known as jpena|brb08:10
*** jpena|brb is now known as jpena08:32
*** bhavikdbavishi has joined #zuul08:34
*** panda has quit IRC08:48
*** panda has joined #zuul08:49
*** bolg has joined #zuul09:08
bolganybody working on mac free to review https://review.opendev.org/c/671674/5 ? Relatively small change = quick win :)09:09
openstackgerritMatthieu Huin proposed zuul/zuul master: Authorization rules: support YAML nested dictionaries  https://review.opendev.org/68479009:10
*** bolg has quit IRC09:35
badboywhy did my logs stop being compressed?09:40
badboyOOP [upload-logs : gzip console log and json output]09:40
badboylocalhost | skipping: Conditional result was False09:40
*** bhavikdbavishi has quit IRC09:41
openstackgerritMatthieu Huin proposed zuul/zuul master: authentication config: add optional token_expiry  https://review.opendev.org/64240809:48
*** bhavikdbavishi has joined #zuul09:49
AJaegerbadboy: http://lists.zuul-ci.org/pipermail/zuul-announce/2019-September/000053.html - change merged yesterday09:55
*** hashar has quit IRC09:55
badboyAJaeger: thank you!10:07
badboyAJaeger: is there a way to set timezone offset in the UI?10:14
*** badboy has quit IRC10:15
*** badboy has joined #zuul10:18
*** bolg has joined #zuul10:36
*** zbr|ruck is now known as zbr|lunch10:39
*** tosky_ has joined #zuul10:49
*** tosky has quit IRC10:52
*** tosky_ is now known as tosky10:58
*** tosky_ has joined #zuul11:02
*** tosky is now known as Guest1863911:02
*** tosky_ is now known as tosky11:02
*** Guest18639 has quit IRC11:04
*** openstackstatus has quit IRC11:09
*** bhavikdbavishi has quit IRC11:14
*** jpena is now known as jpena|lunch11:24
*** jamesmcarthur has joined #zuul11:36
*** jamesmcarthur has quit IRC11:43
*** bhavikdbavishi has joined #zuul11:47
*** badboy has quit IRC11:49
*** jamesmcarthur has joined #zuul11:51
*** bhavikdbavishi1 has joined #zuul11:53
*** bhavikdbavishi has quit IRC11:54
*** bhavikdbavishi1 is now known as bhavikdbavishi11:54
*** jamesmcarthur has quit IRC12:11
*** jamesmcarthur has joined #zuul12:11
*** ianychoi has quit IRC12:21
*** ianychoi has joined #zuul12:25
*** jpena|lunch is now known as jpena12:29
*** jamesmcarthur has quit IRC12:30
*** jamesmcarthur has joined #zuul12:30
*** rlandy has joined #zuul12:40
*** jamesmcarthur has quit IRC12:45
*** jamesmcarthur has joined #zuul12:53
*** jamesmcarthur has quit IRC12:54
*** jamesmcarthur has joined #zuul12:54
pabelangerShrews: I seem to have a stuck request: http://paste.openstack.org/show/780747/ but no more providers to try13:18
pabelangerI'm not sure how to debug13:18
pabelangerShrews: http://paste.openstack.org/show/780749/ seems to be history of node request, which was all from yesterday13:23
pabelangerso, looks like nodepool lost track of it somehow?13:23
fricklerpabelanger: I'm assuming the exception at 00:12:34,674 may tell more13:29
pabelangerfrickler: yah, that is cloud outage in nodepool.PoolWorker.vexxhost-sjc1-v2-highcpu-1, and our 3rd attempt to launch. So we then proceeded to remove request from there13:32
openstackgerritJan Kubovy proposed zuul/zuul master: Evaluate CODEOWNERS settings during canMerge check  https://review.opendev.org/64455713:33
*** zbr|lunch is now known as zbr|ruck13:48
*** michael-beaver has joined #zuul13:52
Shrewspabelanger: the only way I can see that happening is if there are still registered launchers that have not processed it14:05
pabelangerShrews: any suggestions how I can check that?14:06
*** panda has quit IRC14:06
Shrewspabelanger: comparing the "Declined By" list in http://paste.openstack.org/show/780747/ against running launchers? Or you can check the znodes in /nodepool/launchers with zk-shell or similar14:07
*** panda has joined #zuul14:08
Shrewspabelanger: if all registered launchers have declined the request, it will go to FAILED14:08
pabelangerShrews: yah, for declined by that is all my launcher regions. But let me check zookeeper, maybe something else there like old one14:09
pabelangerhard to tell, what it is waiting for14:09
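The pending-vs-FAILED condition Shrews describes can be sketched in a few lines (a simplification, not nodepool's actual code): a request only goes FAILED once every registered launcher has declined it, so a stale launcher registration keeps it pending indefinitely.

```python
def request_is_stuck(declined_by, registered_launchers):
    """A request only moves to FAILED once every *registered* launcher
    has declined it; a stale launcher registration (a leftover znode)
    keeps it pending forever.  Illustrative only."""
    return not set(registered_launchers) <= set(declined_by)

# All live launchers declined, but a stale registration lingers:
stuck = request_is_stuck(["sjc1", "iad"], ["sjc1", "iad", "old-sjc1"])
# With only live launchers registered, the request would go FAILED:
failed = not request_is_stuck(["sjc1", "iad"], ["sjc1", "iad"])
```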
Shrewspabelanger: what's the exception that frickler pointed out? you left that out of the paste14:16
Shrewsbolg: I thought mac used kqueue, not epoll/poll?14:17
Shrewsnone of that seems mac specific, unless they started supporting epoll at some point14:18
pabelangerhttp://paste.openstack.org/show/780757/14:19
pabelangersdkexception14:19
*** jamesmcarthur has quit IRC14:28
*** openstackstatus has joined #zuul14:29
*** ChanServ sets mode: +v openstackstatus14:29
clarkbShrews: https://www.freebsd.org/cgi/man.cgi?poll only epoll is linux specific14:35
clarkbkqueue is bsd specific. They both share poll and select14:35
clarkbits less about adding mac specific code and instead supporting osx with poll alongside better performing epoll for linux14:36
Shrewsclarkb: oh, i thought he was adding epoll to existing poll support, but it's reversed. Perhaps I should have looked at the code rather than just the commit.  :)14:37
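For reference, the epoll/kqueue/poll portability trade-off under discussion is the same one Python's stdlib selectors module resolves automatically; this is just an illustration, not the code under review.

```python
import selectors
import socket

# selectors.DefaultSelector picks the best mechanism the platform
# offers: EpollSelector on Linux, KqueueSelector on BSD/macOS, and
# PollSelector/SelectSelector as fallbacks elsewhere.
sel = selectors.DefaultSelector()

a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ)
b.send(b"ping")
ready = sel.select(timeout=1)
readable = ready[0][0].fileobj  # the socket that became readable
sel.unregister(a)
a.close()
b.close()
```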
openstackgerritJan Kubovy proposed zuul/zuul master: Evaluate CODEOWNERS settings during canMerge check  https://review.opendev.org/64455714:40
openstackgerritMerged zuul/zuul-website master: CSS fix for ul/li in FAQ  https://review.opendev.org/68600314:41
pabelangerShrews: Ah, it looks like there is an old provider in zk for some reason14:46
pabelangerbut, not in my nodepool.yaml files14:46
pabelangerhow best to remove it from zk?14:46
Shrewspabelanger: hrm. those znodes should be ephemeral, so that must mean it's still running somewhere14:46
clarkbpossible we dont remove providers unless we restart? just a hunch14:47
*** jamesmcarthur has joined #zuul14:48
pabelangerShrews: pretty sure they are not registered in launchers ATM, I don't see references for them in logs14:48
*** avass has quit IRC14:48
pabelangerI can stop / start launch to see if that cleans it up14:49
pabelangerit would be nice if nodepool info had a list flag or new command, to see what providers are configured and running14:50
pabelangerShrews: clarkb: okay, stop / start removed the info from zookeeper14:52
pabelangerso, sounds like we might have a bug when we remove a provider, not cleaning up properly14:52
pabelangerand that also removed my periodic jobs from zuul14:53
pabelangerso, that was the issue14:53
pabelangerended up with NODE_FAILURE on said job waiting14:53
pabelangerwhich, make sense14:53
Shrewspabelanger: you should be able to use https://opendev.org/zuul/nodepool/src/branch/master/nodepool/tests/unit/test_launcher.py#L1476 to help catch the problem you saw14:57
Shrewsi'm not sure what's lacking there14:57
*** jangutter_ has quit IRC15:00
pabelangerShrews: ++ I'll try to remember the order I did for the removal15:02
*** mattw4 has joined #zuul15:23
clarkbpabelanger: Shrews the issue is stop() on providers is a noop15:27
clarkbhowever the provider isn't really a long running thread, the manager is and we call into the provider from the manager which probably explains why stop is a noop15:28
clarkbbut the zk connection is managed by the provider not the manager so there is no closure of that connection15:28
clarkbsorry it is the handler with the state information about what is running not the provider manager15:32
clarkbbut the provider manager manages the zk connection15:32
clarkbI think we want to push the zk connection up into the handler as it is what actually interacts with zookeeper15:33
clarkband it can make deicions on when to close the zk connection unlike the provider manager15:33
clarkboh except node request handler is per request hrm15:36
clarkbin that case maybe we can simply have stop() in the provider close the zk connection and force it to simply error that request?15:36
clarkbthat likely won't help pabelanger with a single provider though15:36
clarkbI expect that will result in node failures when reloading configs15:36
pabelangerclarkb: for zuul.a.c, we have more than 1 provider. Which is nice! rdoproject is the place for single provider15:40
pabelangerbut also neat, you can see what the potential issue is15:40
*** tosky_ has joined #zuul15:41
*** tosky is now known as Guest8305815:41
*** tosky_ is now known as tosky15:41
*** Guest83058 has quit IRC15:43
clarkbreading even more it is actually the pool worker threads that store the state I think we need (knowing when a provider manager has processed all requests coming to it)15:46
clarkbWhat we want to do (roughly) is kill the zk connection once all NodeRequestHandlers associated with the current ProviderManager have completed15:47
Shrewspabelanger: clarkb: the zk connection is shared, so i think we probably need the PoolWorker stop() to deregister the launcher15:48
Shrewswe don't currently have a method in zk.py to do that, so that will need to be added as well15:50
clarkbShrews: ya this is complicated by the fact that the provider manager and the zk connection are shared with multiple pool workers (where we have the running request state)15:51
*** ianychoi has quit IRC15:51
clarkband ya maybe we can have the zk lib track users per connection. Then as we stop those users and go to zero we can close the tcp connection cleaning up the ephemeral nodes15:52
clarkboh but the zk connection is global for all providers right?15:52
clarkbI guess that is the real underlying reason we don't clean up the ephemeral nodes15:53
*** ianychoi has joined #zuul15:54
Shrewsright. we don't ever terminate the zk connection, except on shutdown. no need to track users, we just need to cleanup after thread stop15:55
Shrewswe just need a deregisterLauncher() call (does not yet exist) here: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/launcher.py#L37415:56
clarkbyou mean do explicit cleanup rather than wait for ephemeral cleanup. That would work too15:58
Shrewsyes. it is a bug that we depend *only* on ephemeral cleanup. that works when we shutdown nodepool, but not on dynamic changes15:58
Shrewsi've already got the code mostly written15:59
clarkbShrews: you should update the docstring on registerLauncher too. It does not automatically deregister when the launcher terminates (only when the zk connection goes away)16:01
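A minimal sketch of the approach being discussed (explicit cleanup on PoolWorker stop rather than relying on ephemeral-node expiry). The names deregister_launcher, FakeZooKeeper and LAUNCHER_ROOT are illustrative, not nodepool's actual API.

```python
LAUNCHER_ROOT = "/nodepool/launchers"

class FakeZooKeeper:
    """Stand-in for a kazoo client; only what the sketch needs."""
    def __init__(self):
        self.nodes = set()
    def create(self, path):
        self.nodes.add(path)
    def delete(self, path):
        self.nodes.discard(path)

def register_launcher(zk, launcher_id):
    zk.create(f"{LAUNCHER_ROOT}/{launcher_id}")

def deregister_launcher(zk, launcher_id):
    # Called on PoolWorker stop: ephemeral cleanup alone only happens
    # when the whole (shared) zk session ends at shutdown, not on a
    # dynamic config change that removes one provider.
    zk.delete(f"{LAUNCHER_ROOT}/{launcher_id}")
```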
*** jamesmcarthur has quit IRC16:06
openstackgerritDavid Shrewsbury proposed zuul/nodepool master: Deregister a launcher when removed from config  https://review.opendev.org/68619816:09
Shrewsclarkb: pabelanger: confirmed the new test changes fail before adding the call to deregisterLauncher() ^^^16:09
*** bolg has quit IRC16:09
*** pcaruana has quit IRC16:10
pabelangeryay! Thanks for quick fix, will look shortly16:11
clarkbShrews: one small nit, but +2 anyway16:11
*** pcaruana has joined #zuul16:11
Shrewsclarkb: i thought of that but decided against it. unnecessary since we don't modify any data16:12
Shrewswe just need the id16:12
clarkbShrews: I agree, but its nice to have symmetry16:14
clarkbyou register the same thing you deregister16:14
*** rlandy is now known as rlandy|brb16:22
*** fdegir9 is now known as fdegir16:23
openstackgerritClark Boylan proposed zuul/zuul-jobs master: Replace command with shell in persistent-firewall  https://review.opendev.org/68621216:30
clarkbcorvus: pabelanger fungi ^ thats another "lets try tweaking things" change around persistent-firewall to see if we can get the behavior to change16:30
fungiwatch it be that iptables-save doesn't work with the minimal environment provided by command tasks or something16:33
clarkbhttp://status.openstack.org/elastic-recheck/#1846093 even that shows some weird gaps16:34
clarkbits definitely not happening all the time and happens on all the platforms (distro and cloud) and across jobs16:35
clarkbwhich is why I expect it is an ansible bug16:35
*** rlandy|brb is now known as rlandy16:37
*** tosky_ has joined #zuul16:40
pabelangeryah, I think it is ansible bug16:41
pabelangerfor me, I ended up switching to mitogen, which resulted in fast runs and no more -1316:41
pabelangerbut, do want to figure out why it happens16:42
pabelangerclarkb: +2, approve when ready16:42
clarkbwhat does switching to mitogen look like? have to add package installs and set config flag?16:42
*** tosky has quit IRC16:43
pabelangeryah, pip install then ansible.cfg flag16:43
pabelangerso far, it just works16:43
pabelangereven tho totally unsupported by ansible core team16:43
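For anyone wanting to try the same thing, enabling mitogen is roughly the following ansible.cfg fragment; the plugin path is an assumption and depends on where pip installed ansible_mitogen on your system.

```ini
# ansible.cfg -- sketch; adjust strategy_plugins to your install path
[defaults]
strategy_plugins = /usr/local/lib/python3/site-packages/ansible_mitogen/plugins/strategy
strategy = mitogen_linear
```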
*** rfolco is now known as rfolco|dentist16:44
*** tosky_ is now known as tosky16:45
*** hashar has joined #zuul16:51
corvustristanC: i'm trying to rework the log manifest tree view so that the folder names are no longer links to the raw log storage directory index, but instead behave just like the ">" icon to the left and expand the tree.  i've spent quite some time trying to get the patternfly-react TreeView to behave in that way, but it seems constructed in just such a way that it's impossible to change the internal state of16:55
corvusthe TreeViewNode to set it to be expanded.16:55
corvustristanC: i've tried several approaches, but fundamentally i think the issue is that once the TreeViewNode instance is created, it only checks this.state.expanded to see if it should be expanded (it does not consult this.props.node -- so even though you can use that to set a default expanded state, once created, you can't change the prop to set the expanded state)16:57
corvustristanC: there is no reference to the TreeViewNode object that we can get to, so we can't call TreeViewNode.setState (which is what it does if you click on ">")16:58
corvustristanC: the only thing remaining i can think of is to find the DOM node, and work backward to the react component via __reactInternalInstance$...   which seems like a very bad way of doing it.  do you have any other ideas?16:59
corvuscode for easy reference: https://github.com/patternfly/patternfly-react/blob/master/packages/patternfly-3/patternfly-react/src/components/TreeView/TreeView.js https://github.com/patternfly/patternfly-react/blob/master/packages/patternfly-3/patternfly-react/src/components/TreeView/TreeViewNode.js17:00
corvusfungi: ^ fyi17:01
fungicorvus: thanks! i hadn't even gotten that far. playing around with what's in web/src/containers/build/Manifest.jsx wasn't yielding the results i anticipated17:03
fungiand i wasn't having much luck trying to crash-course my way through reactjs internals17:04
tristanCcorvus: have you tried https://reactjs.org/docs/refs-and-the-dom.html#creating-refs ?17:14
corvustristanC: yeah, i have a ref to the TreeView, but there's no ref to the TreeViewNode17:15
openstackgerritFabien Boucher proposed zuul/zuul master: WIP - Gitlab - Basic handling of merge_requests event  https://review.opendev.org/68599017:15
tristanCcorvus: even using the TreeView.nodes prop? ( https://github.com/patternfly/patternfly-react/blob/master/packages/patternfly-3/patternfly-react/src/components/TreeView/TreeView.js#L111 )17:16
clarkb"17:17
clarkb2019-10-02 17:11:10.027139 | ubuntu-bionic | 305 Use shell only when shell functionality is required" thank you ansible lint17:17
pabelangeryah, I stopped using it and switched to yamllint. Too opinionated now, IMO17:18
openstackgerritClark Boylan proposed zuul/zuul-jobs master: Replace command with shell in persistent-firewall  https://review.opendev.org/68621217:18
clarkbpabelanger: corvus fungi ^ now with the skip linting tag17:19
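The skip tag mentioned here is typically applied per task; a hypothetical example (the task shown is illustrative, not the actual persistent-firewall role):

```yaml
# skip_ansible_lint is the tag ansible-lint honors to suppress rules
# such as 305 ("Use shell only when shell functionality is required")
# on one specific task.
- name: Persist iptables rules
  shell: iptables-save > /etc/iptables/rules.v4
  tags:
    - skip_ansible_lint
```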
corvustristanC: yeah, TreeView.nodes is our array of dictionaries that we pass in; it uses that to create TreeViewNode objects, but it doesn't keep track of them, and they are the objects that have the expanded state: https://github.com/patternfly/patternfly-react/blob/master/packages/patternfly-3/patternfly-react/src/components/TreeView/TreeView.js#L89-L9117:19
corvusclarkb: when you have a second, can you re-review https://review.opendev.org/683958 and child?  i can work on deploying that once we have those images built17:21
clarkbcorvus: yup17:22
tristanCcorvus: arg i see, the map result is not discarded17:23
tristanCis somehow discarded*17:25
*** jpena is now known as jpena|off17:27
tristanCcorvus: there may be a way to trigger the render() procedure of the TreeView component, e.g. by changing a dummy state...17:27
openstackgerritMerged zuul/zuul-registry master: Initial implementation  https://review.opendev.org/68395817:28
corvustristanC: i can do that by changing the nodes prop that i pass to TreeView -- it runs, but the issue is that react caches the TreeViewNode and just updates its props.  once initialized, TreeViewNode doesn't consult its props anymore to determine whether it should be expanded, it only consults its state.17:32
corvustristanC: it's like they designed it perfectly to make it impossible to do this.17:32
*** pcaruana has quit IRC17:45
*** jamesmcarthur has joined #zuul17:51
openstackgerritMerged zuul/zuul-jobs master: Replace command with shell in persistent-firewall  https://review.opendev.org/68621217:54
*** igordc has joined #zuul18:03
openstackgerritTristan Cacqueray proposed zuul/zuul-registry master: Fix container image build  https://review.opendev.org/68580818:05
openstackgerritTristan Cacqueray proposed zuul/zuul-registry master: Add tox configuration and fixe flake8 errors  https://review.opendev.org/68623018:05
tristanCoops, 685808 got rebased by mistake18:06
clarkbweirdly the approval remained18:07
pabelangerthat is odd18:09
pabelangerit looks like approval and new pS was same time18:09
pabelangeroh, no sorry18:09
pabelangerthat was zuul enqueing18:09
clarkbConsidering the repo is new there isn't really anything to rebase on18:11
clarkbgerrit may have recognized that? not sure what changed if anything18:11
clarkbjust the timestamps in the commit and the committer18:11
*** mattw4 has quit IRC18:16
*** mattw4 has joined #zuul18:17
clarkbon the off chance that ansible explicitly exits with a -13 I grepped the source code for -13 and didn't find anything for return codes18:21
clarkblots of software versions and stuff though18:22
tristanCcorvus: would you mind trying strict mypy for zuul_registry?18:22
tristanCclarkb: could this be something fixed in recent release? it seems like some opendev executors are not using updated ansible version, e.g. 2.8.0 instead of 2.8.518:23
clarkbtristanC: correct pip doesn't update ansible due to how we specify our versions18:24
clarkbThat is something we could try, updating all the venvs to the latest point release of the respective versions18:24
openstackgerritMerged zuul/zuul-registry master: Fix container image build  https://review.opendev.org/68580818:33
*** bhavikdbavishi has quit IRC18:33
*** bhavikdbavishi has joined #zuul18:35
pabelangerclarkb: Shrews: +a on nodepool zk fix today, thanks for helping18:40
*** mattw4 has quit IRC18:50
corvustristanC: yes -- i'm not convinced about strict mypy yet, so i think zuul-registry is a great place to demonstrate.18:55
openstackgerritTristan Cacqueray proposed zuul/zuul-registry master: WIP: add type annotations  https://review.opendev.org/68624918:56
corvustristanC: er, in case that "yes" was unclear -- i am in favor of trying strict mypy with z-r.  :)18:56
tristanCcorvus: alright, here is how it can be done ^  I'll do the other modules later today or tomorrow18:56
pabelangerwith zuul-registry, would you run 1 per region (for nodepool) or top level, alongside zuul-executors. I am guessing the 2nd option19:02
*** smcginnis has joined #zuul19:04
smcginnisI think I've noticed a zuul bug.19:04
smcginnisThe "ref url" link on https://zuul.opendev.org/t/openstack/build/6010c9c727e64db983744315768ad203 does not appear to be formatted properly.19:04
smcginnisThe resulting link gets you redirected to a list of tenants instead of the review.19:05
smcginnisLooks like it may be appending the review URL to the build base URL.19:05
*** spsurya has quit IRC19:09
clarkbsmcginnis: yes it ended up being rendered as a relative url due to the lack of the scheme component in the url19:10
clarkbsmcginnis: patch to fix this merged yesterday and requires a zuul scheduler restart to take effect19:10
smcginnisCool, thanks.19:11
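The underlying mechanics of the bug: a URL with no scheme is resolved relative to the current page, which Python's urllib demonstrates (the URLs below are illustrative):

```python
from urllib.parse import urljoin

base = "https://zuul.opendev.org/t/openstack/build/abc123"

# With a scheme, the ref url is absolute and used as-is:
absolute = urljoin(base, "https://review.opendev.org/686054")
# Without a scheme, it is resolved relative to the build page, which
# is how a review url ends up appended to the dashboard's own path:
relative = urljoin(base, "review.opendev.org/686054")
```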
*** hashar has quit IRC19:18
openstackgerritMerged zuul/nodepool master: Deregister a launcher when removed from config  https://review.opendev.org/68619819:22
*** bhavikdbavishi has quit IRC19:43
*** brennen has left #zuul20:00
daniel2I started getting a weird error in nodepool launcher: launcher_1      | AttributeError: 'NoneType' object has no attribute 'vcpus'20:14
daniel2Is it a permissions issue?20:14
*** remi_ness has joined #zuul20:14
clarkbis there a traceback?20:15
daniel2clarkb: https://shafer.cc/paste/view/raw/9e904f9120:16
SpamapSHm, AWS just made some changes around vcpus in limits.20:16
SpamapSdaniel2: using AWS by any chance?20:16
daniel2No OpenStack.20:16
SpamapSkk n/m then :)20:17
daniel2Used 24 of 40020:17
clarkbits trying to get the flavor's vcpu count20:17
daniel2Just setup a compute node with dual 20 core processors, 40 cores and 80 threads, plus 512GB of RAM.20:17
clarkbbut the flavor is a nonetype20:17
clarkbdid the flavor go away maybe?20:18
clarkbon the cloud side?20:18
daniel2Nope, I'm looking right at it.20:19
clarkbmight also mean no matching flavor could be found?20:21
daniel2I restarted the container and it went away, I also dropped the min ram line cause the specific flavors are set20:21
clarkbya if min ram was resulting in no matches that might do it20:22
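A minimal reproduction of the failure mode clarkb suspects (not nodepool's actual code): a min-ram filter that matches no flavor returns None, and the later attribute access fails exactly as in the traceback above.

```python
class Flavor:
    def __init__(self, name, ram, vcpus):
        self.name, self.ram, self.vcpus = name, ram, vcpus

def find_flavor(flavors, min_ram):
    """Return the smallest flavor with at least min_ram MB, or None
    when nothing matches."""
    candidates = [f for f in flavors if f.ram >= min_ram]
    return min(candidates, key=lambda f: f.ram) if candidates else None

flavors = [Flavor("small", 2048, 1), Flavor("medium", 4096, 2)]
flavor = find_flavor(flavors, min_ram=8192)  # no flavor is big enough
# flavor is None here, so flavor.vcpus would raise:
# AttributeError: 'NoneType' object has no attribute 'vcpus'
```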
*** smcginnis has left #zuul20:23
*** jamesmcarthur has quit IRC20:23
daniel2How do you get nodepool to not flood openstack when it has trouble creating instances.20:23
daniel2I end up with like 20 instances in Error state that can't be deleted.20:24
*** mattw4 has joined #zuul20:24
clarkbusually we'll disable the provider and debug why it happens and fix it. But its purpose for existing is to request as many nodes as required as quickly as possible20:24
daniel2clarkb: It doesn't even seem to give enough time for the nodes to spin up.20:24
daniel2It creates them then deletes them after a minute20:25
clarkbthat implies the api is telling nodepool that it failed20:25
clarkbnodepool will wait patiently up to its timeouts if the cloud doesnt say anything is wrong20:25
corvusyeah, if the instance went into error state, that's openstack doing that, not nodepool20:26
daniel2https://shafer.cc/paste/view/raw/a54ae82e Not a very informational traceback20:28
pabelangerdaniel2: you could also try modifying https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[openstack].rate we had to do that with rdocloud for a while20:28
corvusi wish it didn't create instances in that case, but apparently that's the architecture.  nodepool will try to clean those up later on, if the cloud will let it.  but sometimes they get stuck and a cloud operator has to go delete them20:28
daniel2The only way to delete them is go into the database usually.20:28
daniel2I'm confused cause these images built just fine before.20:29
corvusdaniel2: agreed -- we actually put that message in sdk because what we got back from openstack was so uninformative.  :(20:29
clarkbthat can be due to no valid hypervisor being found20:29
clarkbif you do a server show on the instancesyou should get the error from nova20:29
clarkbif there is any info20:30
daniel2clarkb: what's weird is if I manually start an instance using the same parameters in Horizon, it works20:31
clarkbdoes a server show on the instances reveal anything?20:33
*** jamesmcarthur has joined #zuul20:33
daniel2Nope20:36
clarkbno field in there with a json blob and a "reason" ? I want to say that is the key name for the fault20:38
daniel2So you think it's trying to connect but can't because of the key?20:39
clarkbno20:39
clarkbnodepool will give you a proper error message for that20:39
clarkbnova is reporting a failure to boot the instance20:39
clarkband often nova will update the instance record with a reason later20:40
clarkb(but not at the point it reports back to nodepool)20:40
clarkbare you able to share the output of a server show?20:40
daniel2clarkb: it shows nothing because its stuck in deleting/error state20:41
daniel2Check your notice clarkb, it had a hostname I didn't want to broadcast publically.20:42
clarkbnova should still tell you the server uuid and name, flavor, image etc20:42
clarkbhrm ya doesn't have the json blob20:43
clarkbyou might need to check with the cloud logs20:43
clarkbif you have access to them grepping on the instance uuid should bring up useful info20:43
daniel2Yeah I have controller access20:44
clarkbiptables-save problems persist after updating the role to use shell. We can try updating ansible as tristanC suggests: https://review.opendev.org/#/c/686237 thanks for the review on that pabelanger20:47
daniel2clarkb: [instance: 1b08544b-0a37-45d5-b85d-4b4f20ab5259] Instance build timed out. Set to error state20:48
daniel2Thats what the compute node is giving me.20:49
clarkbok so it is hitting a nova timeout20:49
clarkboff the top of my head that can happen due to image conversions that nova will attempt to do20:49
clarkbwe saw that with infracloud way back when. Not sure if other things can cause that20:49
pabelangeryup!20:49
pabelangerremember20:49
pabelangeryou can also increase boot-timeout in provider config20:50
daniel2I'll try rebuilding the images.20:50
pabelangerdaniel2: what format are you uploading as?20:50
daniel2I need to provision a few more compute nodes too.20:50
daniel2QCOW220:51
clarkbpabelanger: in this case I don't think the provider config will change anything20:51
clarkbpabelanger: its nova that is timing out not nodepool20:51
pabelangeroh, right20:51
pabelangermissed that20:51
clarkb(otherwise we would get proper error in nodepool logs)20:51
pabelangeryah, I would check nova to see if they are forcing raw, then maybe switch nodepool-builder to that format20:52
daniel2Does raw need qemu-utils?20:52
pabelangerhttps://opendev.org/opendev/puppet-infracloud/commit/9dc122c3ddd852f4572ffff0c6d5ed6f84fa2a9620:52
pabelangerwas our infracloud fix20:52
daniel2I'm curious if I could go back to using the docker builder container20:52
pabelangeryah, we use qemu-img to convert20:53
daniel2https://shafer.cc/paste/view/raw/41aa331d20:53
daniel2Thats the image details20:53
clarkbpabelanger: daniel2 raw is actually the "native" image type for dib20:54
clarkbdib converts from raw to $otherformats20:54
*** remi_ness has quit IRC20:54
daniel2Oh, I never specified anything.20:54
clarkbthe default for nodepool should be whatever the cloud config specifies20:54
pabelangerisn't qcow2 the default?20:54
daniel2It uses qcow2 for me.20:55
clarkbqcow2 may be the cloud config default yes20:55
clarkbmy comment about raw was more aimed at the qemu-utils question20:55
clarkbyou need qemu-img for qcow220:55
pabelangerah, right. I thought it was the other way20:55
daniel2Do you need qemu-utils for raw as well?20:56
clarkbI don't think so20:56
clarkbsince there is no conversion to perform20:56
*** tosky_ has joined #zuul21:05
*** tosky has quit IRC21:06
*** tosky_ is now known as tosky21:08
openstackgerritJames E. Blair proposed zuul/zuul master: web: render log manifest consistently  https://review.opendev.org/68630721:38
corvusfungi, clarkb, tristanC: ^ this is the best improvement i can come up with short of either replacing or modifying the treeview to get the desired expansion behavior.  I think this is better than what we have now, but still not ideal.21:39
*** jamesmcarthur has quit IRC21:56
*** rfolco|dentist is now known as rfolco22:17
fungilooks like the zuul-build-dashboard result isn't capable of rendering specific build pages22:34
fungibut i get the gist of it from the commit message and diff, sounds like a fine compromise22:35
fungiunfortunate that react can't extend the expand/contract toggle to the object name22:36
fungioh! zuul-build-dashboard-multi-tenant seems to be able to: https://e88fada250d77d396ef5-a605dbf95134d478b50bec2dfa092555.ssl.cf1.rackcdn.com/686307/1/check/zuul-build-dashboard-multi-tenant/1469c57/npm/html/t/opendev.org/build/4be7070266c24a2da06087347012cde922:41
fungiahh, nevermind, we don't get a logs tab with that22:41
daniel2So I set the format to raw but it's still trying to do qcow222:41
fungioh, it was just that build. here's one that works: https://e88fada250d77d396ef5-a605dbf95134d478b50bec2dfa092555.ssl.cf1.rackcdn.com/686307/1/check/zuul-build-dashboard-multi-tenant/1469c57/npm/html/t/local/build/3e8d94730bab4649b1859c4d075002a6/logs22:42
fungidaniel2: nova is still trying to boot qcow2 or dib is still trying to build qcow2?22:44
daniel2dib is still trying to build qcow2 even when I specify raw in "formats:"22:45
openstackgerritTristan Cacqueray proposed zuul/zuul-registry master: Add type annotations  https://review.opendev.org/68624922:46
tristanCcorvus: here is the type annotation for each modules, though it's a bit ugly for swift and main because neither openstack or cherrypy is typed. I think we should ignore those modules for now, though i left the full annotation for reference in this second PS22:48
tristanCcorvus: and doing so, mypy figured out that some procedure may return None and thus we need to check for that before doing things like json.dumps22:49
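An illustrative example (not zuul-registry's actual code) of the class of bug tristanC mentions: mypy in strict mode rejects passing a possibly-None value onward unless it is checked first.

```python
from typing import Dict, Optional

def get_manifest(store: Dict[str, str], name: str) -> Optional[str]:
    # dict.get() may return None when the entry is missing
    return store.get(name)

def manifest_size(store: Dict[str, str], name: str) -> int:
    manifest = get_manifest(store, name)
    # Without this check, mypy --strict rejects len(manifest) with an
    # "incompatible type Optional[str]" error; at runtime an unchecked
    # None would be a TypeError instead.
    if manifest is None:
        return 0
    return len(manifest)
```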
clarkbaren't main and swift the entire code that will run in production?22:54
clarkbnot sure what the value is if we ignore those two22:54
fungidaniel2: here's how we're setting it on our builders for specific providers: https://opendev.org/opendev/system-config/src/branch/master/playbooks/templates/clouds/nodepool_builder_clouds.yaml.j2#L8122:55
fungiis that roughly what you did?22:55
tristanCcorvus: also, here is an emacs snippet to configure flycheck mypy: http://paste.openstack.org/show/780770/22:55
fungidaniel2: the daemon also may need a restart to pick up clouds.yaml changes22:55
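The two knobs being discussed, hedged as examples rather than a working config (cloud and image names are placeholders):

```yaml
# clouds.yaml -- tell the builder which format this cloud wants;
# nodepool-builder reads this and must be restarted to pick it up.
clouds:
  mycloud:
    image_format: raw

# nodepool.yaml -- or list the format(s) dib should produce directly.
diskimages:
  - name: ubuntu-bionic
    formats:
      - raw
```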
tristanCclarkb: mypy also found issues in the storage module actually.22:56
tristanCthen for openstack and cherrypy, we could either add types there too, or at least provide a stub for the function we use22:58
fungitristanC: any reason not to approve 686307?23:03
tristanCfungi: to leave a bit more time for others to review? the review was proposed less than an hour ago...23:05
fungifair, it's a behavior change after all23:05
fungia lot of opendev's zuul users are confused by the current behavior, but that doesn't mean all zuul users are23:06
daniel2What is the point of the nodepool-builder docker container if it doesn't have root? It can't build images without it.23:07
tristanCfungi: i didn't realize the current view caused confusion. the change seems to work as expected, i can +a now23:07
fungitristanC: i don't think it's urgent to approve, but we have seen a lot of users who don't realize there's a cooked log viewer for files in subdirectories because they keep following the raw link by clicking the directory name rather than using the > to expand the listing23:10
fungiideally there would have been a way to make clicking the directory name do the same as clicking the > symbol but that doesn't seem to be easily accomplished with react's treeview23:11
fungidaniel2: o23:12
fungier23:12
fungidaniel2: i'm not sure, but i think folks extend extra privs to the builder container in that case23:12
fungiat a minimum it needs to be able to chroot23:13
*** tosky has quit IRC23:16
openstackgerritTristan Cacqueray proposed zuul/zuul-registry master: Add type annotations  https://review.opendev.org/68624923:20
clarkbdaniel2: fungi I think that is the one image that isnt tested and as far as I know no one is using it23:24
fungiyeah, the quickstart exercise job just uses a static node i think?23:28
clarkbcorrect23:28
*** mattw4 has quit IRC23:28
*** panda is now known as panda|off23:39

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!