Wednesday, 2020-11-18

*** polls45 has quit IRC00:00
ianwhrm, arm64 nodepool testing sort of works; it gets to the end of devstack at least ,but it seems to be trying to run the x86 container00:01
ianwhttps://zuul.opendev.org/t/zuul/build/2a38b5ccc73846ce85d9cbbdb4ab5cca/log/docker/nodepool_nodepool-launcher_1.txt00:01
clarkbianw: do we build the launcher with arm64 too?00:01
ianwi think both, but the builder is the same00:02
ianwhttps://zuul.opendev.org/t/zuul/build/2a38b5ccc73846ce85d9cbbdb4ab5cca/log/docker/nodepool_nodepool-builder_1.txt00:02
ianwit must have pulled the wrong image from the intermediate registry?00:02
corvuspabelanger: probably so?  i think someone will just need to double check the release notes / git log and make sure there aren't any gotchas.  but i don't think it's in a holding pattern like zuul.00:04
*** rlandy|rover has quit IRC00:07
openstackgerritIan Wienand proposed zuul/nodepool master: [dnm] arm64 testing test  https://review.opendev.org/76277400:09
ianw^ my fault; the siblings image builds, that pull in dib/openstacksdk/etc from source, weren't doing multiarch builds00:09
ianwi feel like there's a decent chance this will work.  devstack support is held up on full tempest runs working, but i should be enough to build/upload/boot/basic networking00:10
ianwanyway, what i *really* want to validate is the containerfile builds00:12
ianwi have them working for ubuntu everything and fedora 3300:13
ianwon x86, i can't imagine they won't work on arm64 but never know00:13
clarkbprobably the biggest arm64 issue is if docker hub has arm64 images for those platforms00:14
ianwclarkb: not now, but do you have any interest on working on similar for suse?00:14
clarkbya I can probably help poke at that00:14
clarkbassuming there exists an opensuse leap arm64 docker image00:14
ianwfedora era rpm and zypper are probably the most problematic things running non-native00:15
ianwclarkb: we don't have arm64 suse atm?  so for that we'd really only need validate x8600:15
clarkboh I see00:15
clarkbya I can help with that00:15
clarkbthey do publsih an arm64 image though so that is also fine if we want to try that00:16
ianwyeah, if i can get gate build/boot testing for arm64, that would be awesome to validate it.  so far we just mostly hope things work there, with a modest degree of success :)00:17
openstackgerritMerged zuul/zuul master: Allow to reuse the code handling the branch cache  https://review.opendev.org/75672500:24
openstackgerritMerged zuul/zuul master: JS: Don't run tests in watch mode  https://review.opendev.org/75666200:24
*** Goneri has quit IRC00:36
*** vishalmanchanda has quit IRC00:41
*** hamalq has quit IRC00:41
*** zenkuro has quit IRC00:59
*** y2kenny has quit IRC02:38
openstackgerritJeremy Stanley proposed zuul/zuul-jobs master: validate-host: Options to require v4 and v6 routes  https://review.opendev.org/76306503:09
*** bhavikdbavishi has joined #zuul03:10
openstackgerritIan Wienand proposed zuul/nodepool master: [dnm] arm64 testing test  https://review.opendev.org/76277403:20
*** bhavikdbavishi1 has joined #zuul03:37
*** bhavikdbavishi has quit IRC03:38
*** bhavikdbavishi1 is now known as bhavikdbavishi03:38
*** bhavikdbavishi has quit IRC04:26
*** bhavikdbavishi has joined #zuul04:27
ianwthinking about it, it's probably a lot more practical to have the dib tests pull the current images, because really what we're interested in there is does the image boot, not so much testing nodepool changes04:41
openstackgerritIan Wienand proposed zuul/zuul master: web: make build/buildset table details more clickable  https://review.opendev.org/76258805:27
openstackgerritIan Wienand proposed zuul/zuul master: web: remove search icon from build/buildset page  https://review.opendev.org/76312605:27
*** evrardjp has quit IRC05:33
*** evrardjp has joined #zuul05:33
*** vishalmanchanda has joined #zuul06:00
*** bhavikdbavishi1 has joined #zuul06:15
*** bhavikdbavishi has quit IRC06:17
*** bhavikdbavishi1 is now known as bhavikdbavishi06:17
*** saneax has joined #zuul06:21
*** wuchunyang has joined #zuul06:28
*** rpittau|afk is now known as rpittau06:49
*** mach1na has joined #zuul07:29
*** zbr4 has quit IRC07:38
*** bhavikdbavishi has quit IRC07:43
*** slaweq has joined #zuul07:43
*** mach1na has quit IRC07:47
*** hashar has joined #zuul07:58
*** jcapitao has joined #zuul07:59
*** bhavikdbavishi has joined #zuul08:07
*** mach1na has joined #zuul08:12
*** zbr has joined #zuul08:25
*** iurygregory has quit IRC08:44
*** tosky has joined #zuul08:45
*** iurygregory has joined #zuul08:50
*** jpena|off is now known as jpena08:55
*** zenkuro has joined #zuul08:57
*** slaweq has quit IRC09:17
*** slaweq has joined #zuul09:20
*** nils has joined #zuul09:49
*** yolanda__ has joined #zuul09:50
*** yoctozepto has quit IRC09:50
*** yoctozepto has joined #zuul09:51
*** mach1na has quit IRC10:13
*** mach1na has joined #zuul10:14
*** bolg has joined #zuul10:25
*** zenkuro has quit IRC10:39
*** zenkuro has joined #zuul10:40
*** mach1na has quit IRC10:40
*** zenkuro has quit IRC10:45
*** zenkuro has joined #zuul10:46
*** wuchunyang has quit IRC10:55
*** saneax has quit IRC10:57
*** yolanda__ is now known as yolanda11:00
*** saneax has joined #zuul11:04
*** bhavikdbavishi has quit IRC11:11
*** wuchunyang has joined #zuul11:27
*** jcapitao is now known as jcapitao_lunch11:39
*** slaweq has quit IRC11:39
*** slaweq has joined #zuul11:45
*** mach1na has joined #zuul11:46
*** mach1na has quit IRC11:58
*** bhavikdbavishi has joined #zuul12:01
*** wuchunyang has quit IRC12:03
*** bhavikdbavishi1 has joined #zuul12:04
*** bhavikdbavishi has quit IRC12:05
*** bhavikdbavishi1 is now known as bhavikdbavishi12:05
*** zenkuro has quit IRC12:19
*** zenkuro has joined #zuul12:24
*** mach1na has joined #zuul12:28
*** jpena is now known as jpena|lunch12:29
*** mach1na has quit IRC12:33
openstackgerritTobias Henkel proposed zuul/zuul master: Support per branch change queues  https://review.opendev.org/71853112:34
openstackgerritTobias Henkel proposed zuul/zuul master: Move queue from pipeline to project  https://review.opendev.org/72018212:34
openstackgerritTobias Henkel proposed zuul/zuul master: Add optional support for circular dependencies  https://review.opendev.org/68535412:34
openstackgerritTobias Henkel proposed zuul/zuul master: Check cycle items are mergeable before reporting  https://review.opendev.org/74345012:34
openstackgerritTobias Henkel proposed zuul/zuul master: Make reporting asynchronous  https://review.opendev.org/69125312:34
*** rlandy has joined #zuul12:54
*** rlandy is now known as rlandy|rover12:55
*** mach1na has joined #zuul12:56
*** jfoufas1 has joined #zuul12:59
openstackgerritTobias Henkel proposed zuul/zuul master: Support per branch change queues  https://review.opendev.org/71853113:03
openstackgerritTobias Henkel proposed zuul/zuul master: Move queue from pipeline to project  https://review.opendev.org/72018213:03
openstackgerritTobias Henkel proposed zuul/zuul master: Add optional support for circular dependencies  https://review.opendev.org/68535413:03
openstackgerritTobias Henkel proposed zuul/zuul master: Check cycle items are mergeable before reporting  https://review.opendev.org/74345013:03
openstackgerritTobias Henkel proposed zuul/zuul master: Make reporting asynchronous  https://review.opendev.org/69125313:03
*** jcapitao_lunch is now known as jcapitao13:16
pabelangerzuul-maint: I'd appreciate reviews on https://review.opendev.org/763086/ to begin process of zuul-registry release to pypi. We haven't officially tagged a release yet13:21
pabelangercorvus: looking at release notes, it looks like we'd have to tag 3.14.013:24
pabelangerfor nodepool13:24
tristanCpabelanger: i guess most zuul-registry deployment are using the container image, is there an issue with it?13:25
pabelangertristanC: our control plane isn't setup for containers right now13:26
pabelangerso I would like to avoid pulling in that dependency in deployment playbooks13:26
pabelangerwe'll likely look to move to operator in the future13:26
tristanCpabelanger: the zuul-operator does integrate and deploy the zuul-registry13:27
pabelangercool, that should make migrating easier in the future13:27
*** bhavikdbavishi has quit IRC13:27
*** Goneri has joined #zuul13:29
*** jpena|lunch is now known as jpena13:30
avassIs there any known memory leaks in the scheduler? ours just got OOM killed after consuming 30Gb of ram13:30
tristanCavass: are you running 3.19.1 ?13:30
avassno latest containers13:30
avasslooks like it's been increasing steadily since the last couple of days13:37
openstackgerritTobias Henkel proposed zuul/zuul master: Don't query branch protection on pull request events  https://review.opendev.org/75357113:41
openstackgerritTristan Cacqueray proposed zuul/zuul-registry master: Add hypothesis based test for the storage module  https://review.opendev.org/68711813:59
openstackgerritTristan Cacqueray proposed zuul/zuul-registry master: Add hypothesis rule based state machine test and fix path issue  https://review.opendev.org/68715713:59
openstackgerritTristan Cacqueray proposed zuul/zuul-registry master: Add hypothesis rule based state machine test and fix path issue  https://review.opendev.org/68715714:02
openstackgerritTobias Henkel proposed zuul/nodepool master: WIP: allow disabling sticky AZs  https://review.opendev.org/54617514:10
openstackgerritTobias Henkel proposed zuul/nodepool master: WIP: allow disabling sticky AZs  https://review.opendev.org/54617514:12
tobiashavass: when was your latest update?14:17
tobiashwe had oom issues 1-2 months ago but I think the fixes for that all have merged already14:18
tobiashchecking again14:18
tobiashavass: the fixes were https://review.opendev.org/751170, https://review.opendev.org/751171 and https://review.opendev.org/75117214:20
tobiashall merged ~9 weeks ago14:20
openstackgerritMerged zuul/zuul-registry master: Begin publishing to pypi  https://review.opendev.org/76308614:26
avasstobiash: saturday :)14:29
tobiashavass: hrm, that was a quick oom14:30
avassyep I'm trying to figure out if we're doing something strange14:31
tobiashavass: are you able to get a list of upstream changes that went in between that and the version before?14:31
avasswe were running 3.19.1 before that so quit a lot of changes14:33
tobiashoh14:33
tobiashwe're currently based on 3.19.0-440-g61100409 and see no unusual memory behavior14:33
avasswe're on 3.19.1.dev44114:34
tobiashwhich is the exact commit?14:34
avassI guess that's: 7b8f780d14:35
avassor can I check that easily somehow?14:36
tobiashif all components are the same version, the status page reports it14:36
avassyep the dashboard shows: 3.19.1.dev441 7b8f780d14:37
avassand those should be synced14:37
tobiashthat's not in my zuul tree14:37
tobiashoh I guess that was a speculative commit in the gate14:38
avasssame14:38
avassoh..14:38
tobiashbut you updated to the latest image on saturday?14:39
avassI think we might have another restart + update when we were restarting during the week as well14:39
avassbut the oldest possible image should have been saturday14:39
tobiashthat would be probably f91445bb14:40
tobiashdo you have graphs where you can see when the memleak started?14:41
tobiashthe only scheduler touching change that merged since our last update was https://review.opendev.org/75672514:41
tobiashbut that doesn't really look suspicious at first glance14:42
tobiashor it might be use case driven14:42
tobiashour memleak problems were use case driven as well two months ago (first usage of dequeue rest api by users)14:43
avasshmm no, but from what I've seen it has steadily increasing14:43
avassand it seem to be increasing since the crash14:44
avasswe hadn't had time to set up any long term monitoring yet14:44
tobiashours increases pretty fast at first and then increases slower after some time14:44
avassthough it seem to be steady below 3gb at the moment14:45
avassthe only error we had was a gerrit project that had the wrong name in the tenants config14:47
tobiashwhen you're near oom you could do a repl session with further analysis by using objgraph e.g.14:51
tobiashavass: do you have "zookeeper connection" related log messages btw?14:52
avassnope14:52
tobiashwe saw that when the memory usage is high the garbage collector sometimes can take quite long which makes increasing the zk session timeouts needed14:53
avassor uh, yeah we got debug messages "Zookeeper connection: CONNECTED/SUSPENDED/LOST"14:53
tristanCcorvus: it seems like zuul registry storage is still affected by `..` or `/` in components fixed in https://review.opendev.org/#/c/687157/4/zuul_registry/storage.py14:53
tobiashavass: the LOST causes a mass deletion of all nodes14:54
avassit reconnected 200ms after that14:55
tobiashavass: you probably want to increase that: https://zuul-ci.org/docs/zuul/discussion/components.html#attr-zookeeper.session_timeout14:55
avassmight be a good idea :)14:55
tobiashwe had 40 for a long time but increased further during our memleak problems14:56
tobiashkeep in mind that the zk server also has a config option for a max session timeout that limits what zuul can choose14:56
avassright tickTime*maxSessionTimeout ?14:59
avassoh no that property is in ms15:00
*** Miouge has quit IRC15:03
avasstobiash: thanks!15:13
*** bhavikdbavishi has joined #zuul15:14
avassI guess because we're still pretty reliable on static nodes we haven't actually noticed those problems, or at least not directly.15:16
avasssince the static nodes doesn't get deleted15:16
*** bhavikdbavishi1 has joined #zuul15:16
avassbut my if they still get deleted in zk, nodepool will see them as available and cause extra load on the nodes15:17
*** bhavikdbavishi has quit IRC15:18
*** bhavikdbavishi1 is now known as bhavikdbavishi15:18
*** piotrowskim has joined #zuul15:21
corvustristanC: +2 with comment. that made me wonder: do you think a hypothesis state machine can handle a test of store_upload?15:28
*** zenkuro has quit IRC15:29
*** zenkuro has joined #zuul15:30
avassit's just an idea, but if I'm right about that maybe the scheduler also should abort jobs using the nodes when it deletes the nodes?15:39
avassunless I'm wrong and it already does that15:40
corvusavass: i don't follow; i don't think the scheduler deletes nodes at all?15:43
avassI could be misunderstanding what tobiash said, but what I understood is that if the scheduler loses connection to zk it will cause mass deletion of nodes15:45
fungiif the scheduler releases its lock on a node then nodepool may delete it (if it's the sort of thing nodepool expects to delete once freed)15:45
corvusavass: yes, but by nodepool15:45
avassah well, that's a bit harder to avoid then :(15:46
corvusavass: and the scheduler will abort the jobs because it knows that it lost the locks15:46
tobiashavass: more precise, if zuul looses its session in zk (aka no heartbeat arrived at ak during the session timeout) zuul looses all locks and thus nodepool treats all nodes as released and deletes them15:46
avassokay I think I follow15:46
avasswe've seen unexpected extra load on some nodes so it was just  a theory15:47
fungibut also as corvus points out, the builds which were in progress are going to be considered lost before nodepool starts deleting any of the nodes it had allocated for them anyway15:47
corvusassuming the scheduler actually resumes work and doesn't get oomkilled15:48
fungiyeah, that too15:48
tobiashwell I think the scheduler doesn't handle that at all, so the jobs themselves fail with unreachable once the nodes get destroyed which triggers a job retry15:48
tobiashthat gets more complicated with static nodes though since in that case the old job might continue to run in parallel with a new job15:49
avassthat would explain it15:51
corvusthat seems to be a serious omission :(15:52
tobiashyes, zuul should react on this event and abort all jobs that had any node15:52
tobiashwe so far just assume that this never happens15:52
tobiashand we should do anything that this ever happens actually since even if we handle that a abort-all-jobs is exceedingly costly15:53
tobiashcorvus: I think we should increase the default zk timeout15:54
tobiashespecially since we rely more on zk a higher session timeout makes this much more stable15:54
tobiashespecially on larger zuul's a full gc run can pause the zk heartbeat thread for longer than 10s15:55
tobiashwe run with 120s atm in production (which is a bit excessive but was needed during our oom problems)15:56
tobiashI think we should aim for 40s which is also the default max session timeout in zk server as I think I rememver15:56
avassit is15:56
avassat least with a ticktime of 2000ms15:57
avassby default it's 20*ticktime, but it's unclear if the maxSessionTimeout property is in ms or number of ticks15:57
openstackgerritTobias Henkel proposed zuul/zuul master: Increase default zookeeper session timeout  https://review.opendev.org/76320916:01
openstackgerritJeremy Stanley proposed zuul/zuul-jobs master: validate-host: Options to require v4 and v6 routes  https://review.opendev.org/76306516:02
*** jfoufas1 has quit IRC16:05
*** saneax has quit IRC16:17
*** mach1na has quit IRC16:32
*** bhavikdbavishi has quit IRC17:07
*** jcapitao has quit IRC17:20
pabelangerhas anyone by chance seen the following error before17:23
pabelangerError: error creating build container: (image name "centos:7" is a short name and no search registries are defined in /etc/containers/registries.conf)17:23
pabelangerthis is using podman and build-container-images17:24
pabelangerBefore I start to debug17:24
pabelangerhttps://4ecbd426a5524f19727d-f1df475a3bd31dc5211674e175785e2a.ssl.cf5.rackcdn.com/61/4ee325bfa52926cf585b71d1dcc401d02c55f99b/check/ansible-galaxy-build-container-image/e4daadf/job-output.html#l60217:24
pabelangerthe dockerfile in question is not using namespaces for image17:24
clarkbI think you need to specify a full path for the image17:25
clarkbpodman is saying I don't know whic registry you want me to get centos:7 from17:25
clarkbI think17:25
pabelangeryah, I think there is search registries by default however17:25
avassor you need to configure which registries to search in17:25
pabelangerI wonder if we nuke them some how17:25
avassmaybe use-intermediate-registry does17:26
avassbuildset-registry*17:27
clarkbI want to say the code reads those files and then adds to them and writes them back17:27
pabelanger[registries.search]17:27
pabelangerregistries = ['registry.fedoraproject.org', 'registry.access.redhat.com', 'registry.centos.org', 'docker.io']17:27
pabelangeris what I have local17:27
pabelangerwonder if ubuntu is different17:27
clarkband it vendors a toml lib in the module to make that possible17:28
pabelangerk17:28
avassyup17:29
corvuspabelanger: i highly recommend fully qualifying all names17:30
corvuspabelanger: i think that's going to be important in order to stop thinking of 'docker' as the default17:31
corvuspabelanger: we switched to fully-qualified in opendev and zuul17:32
pabelangercorvus: Oh, for sure and I agree17:32
pabelangertrying to work the issue on both sides right now17:32
corvus++17:33
pabelangermy goal is to by default have zuul-jobs podman ready for ansible17:33
avassit still annoys me every time someone calls containers 'dockers'17:34
pabelangeravass: containers are rpms17:36
pabelanger:)17:36
pabelangeroh interesting /etc/containers/registries.conf is empty by default17:40
pabelangerfrom upstream package17:40
pabelangerand we don't drop in registry.search17:41
pabelangerthat solves that problem17:41
*** jpena is now known as jpena|off18:05
*** hamalq has joined #zuul18:07
pabelangerhere is an odd question, which I likely know the answer too. Is there a way, with github connection, to have zuul apply depends-on PR to a different branch?  So, this is case, sandbox (master) depends-on galaxy (devel) but galaxy (master) is also a thing.  depends-on does not work in this case, since it is raised against the wrong branch.  The issue here, is commit are manually merged from devel into18:07
pabelangermaster in galaxy, and zuul may not be doing that on master.18:07
pabelangerif that all makes sense18:07
pabelangerI am thinking that override branch may work out18:08
pabelangerbut haven't tested yet18:08
clarkbaiui no beacuse it sets up everything per branch18:09
pabelangersorry, override-checkout18:09
clarkbfor example in openstack you can do stable/train depends on stable/stein to test an upgrade fix. Even if you override the branch the code will be applied to stein18:09
clarkbin cases like that it may be more appropriate for the job to do necessary merging18:10
pabelangerdarn18:10
pabelangerokay18:10
clarkbanother approach is to push the code you need to the correct branch then depends on that18:10
pabelangerclarkb: yah, that is what I am testing now18:11
pabelangerbut suspect people will slap me, because I raised it against 'master' vs devel18:11
clarkbwell with pre merge testing they can drop their devel into master workflow and just start working on master :P18:12
pabelangerindeed!18:13
pabelangerright now, trying to get 3pci working first18:13
pabelangerhaving them change workflows is a whole other topic18:13
openstackgerritTristan Cacqueray proposed zuul/zuul-registry master: Add hypothesis rule based state machine test and fix path issue  https://review.opendev.org/68715718:14
corvuspabelanger: galaxy@devel does have the code you want to test -- the issue is just that since you're proposing to sandbox@master the job is checking out master by default; in which case, yeah, override-checkout might be the way to go.  you could do that just on the sandbox job (ie, when running this job on the sandbox repo, check out galaxy@devel)18:14
clarkboh is this purely a zuul config issue and not the actual git checkout state?18:15
corvuspabelanger: some of those statements were supposed to have question marks, sorry :)  anyway read that whole thing as a "did i get this right? if so, i think this..." kind of statement18:15
clarkbya if that is the case then my assertion above wouldn't necessarily be the case. I read it as take the commit(s) from $branch and put them on another branch because that is how we want to test it18:16
corvusclarkb: yeah, and it's possible both are right and i'm just introducing a different way of looking at the prob18:16
pabelangercorvus: yes, you are spot on.18:16
corvusiow, "merge this devel PR into master" would be a solution to the prob, but "checkout devel" would also solve it18:17
pabelangerack, will test soon18:18
clarkbI think it depends on whether or not you're willing to test all the other stuff in devel18:20
clarkb(not sure)18:20
corvusyeah; there's also a lot of variations here: 2 jobs (one for devel, one for master); just doing this as a one-off (ie, dynamic zuul config change to add override-checkout to just the sandbox patch).  and if you go the other way (merge instead of checkout), maybe a role to merge devel into master at the start of the job.18:25
corvusmany tools in this box18:25
openstackgerritJeremy Stanley proposed zuul/zuul-jobs master: DNM: Exercise validate-host IP family assert  https://review.opendev.org/76324518:25
fungiclarkb: so like that ^18:25
corvusclarkb: you previously reviewed https://review.opendev.org/687157 and parent; would love your opinion when you have a min18:26
clarkbfungi: yup18:26
clarkbcorvus: ya I'll take a look in a bit18:27
clarkbI think we may be restarting gerrit soon to double check our recent config updates don't pose a problem then after that I can review18:27
-openstackstatus- NOTICE: The Gerrit service at review.opendev.org is being restarted quickly as a pre-upgrade sanity check, estimated downtime is less than 5 minutes.18:37
*** rpittau is now known as rpittau|afk18:39
clarkbcorvus: tristanC I'm having a hard time wrapping around how a Null upload is valid? re https://review.opendev.org/#/c/687118/518:54
clarkbwhat does that mean from a user standpoint? you've just created an image in the registry maybe with a tag but no real content?18:55
fungiclarkb: it seems like it only runs tasks from a role once even if included multiple times in the same play... do i need to create more plays?18:58
clarkbfungi: oh I think you may need to use include role? or ya just copy pasta the hosts: all roles: header for eahc one18:59
clarkbunless it collapses those too. But include_role should rerun it each time18:59
fungiusing separate plays makes it a little easier for me to document what each one's doing, since the console view doesn't seem to provide that degree of detail on its own19:01
tristanCclarkb: if this is not valid, then we could return an error, right now this code path cause a 500 undefined error19:02
*** openstackgerrit has quit IRC19:02
clarkbright I think that was my original concern. The old setup errors (which I expect is correct), but the new code succeeds in order to fit arbitrary testing inputs19:03
clarkbI don't think we should make the api match arbitrary testing inputs if the correct behavior is to fail19:03
corvusclarkb: agreed, i don't think i fully understood the previous comment/response but now i do.19:08
corvusclarkb, tristanC: i think upload_chunk should only be called after start_upload19:09
tristanCthe spec ( https://github.com/opencontainers/distribution-spec/blob/master/spec.md#push ) states that the location ` SHOULD match exactly the <location> obtained from the POST request `19:14
*** sugaar has quit IRC19:15
pabelangercorvus: could I ask for zuul-registry release? We've merged the release jobs19:19
tristanCi guess it's ok to interpret this as a `MUST match`, but then the failure code should be 404/400, not 50019:20
fungia possible ui regression has been raised in #opendev, apparently this logline is a multi-line entry in the raw version but only the first line is rendered in the html version: https://zuul.opendev.org/t/openstack/build/00ff397ca5374a0b9441036d43ee3416/log/controller/logs/screen-n-cpu.txt#737119:40
fungialso i notice something is causing you to have to scroll to the end of the log to get a horizontal scroll bar19:41
sean-k-mooneyya the long line is a libvirt xml we dump https://d3f8c4c85da258632f6c-407789a006c18491d0ad02a611fbe821.ssl.cf2.rackcdn.com/602432/25/check/tempest-integrated-compute/8105b43/controller/logs/screen-n-cpu.txt19:41
sean-k-mooneyits in the raw version but you cant link to ines in that19:42
sean-k-mooneyif you seach for 'End _get_guest_xml xml=<domain type="qemu">' you will see the full xml19:42
ianwas for the scrolling to the end -- i think it has been like that for quite a while; at least for me19:44
sean-k-mooneyi have noviced it for a few weeks but just assumed most people did not have the issue19:44
sean-k-mooneyits only an issue when you search and it auto scrolls19:44
sean-k-mooneypreviously you could use left and right arrow to scroll back19:45
ianwi restarted zuul-web probably on monday to pick-up the log-viewer change that adds the severity in pf4 buttons and felix's breadcrumbs at the top, but i don't believe that has modified the actual log rendering bits in any way19:46
sean-k-mooneythis one https://opendev.org/zuul/zuul/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a19:46
ianwthat's it19:48
sean-k-mooneyim not familar with this code but skiming it nothing obvious is jumping out at me19:49
clarkbcould be other changes since the last restart too19:49
*** vishalmanchanda has quit IRC19:49
sean-k-mooneythe previous  commti is "Enable ANSI rendering via react-ansi"19:50
fungii didn't think that one touched the log viewer, only summary and console tabs?19:51
sean-k-mooneyya it didnt19:51
sean-k-mooneybuild output and console19:51
sean-k-mooneythe horrizontal scoling was apparently fixed by https://opendev.org/zuul/zuul/commit/e1a7afa9530eb4c2187354f786c73c690768721219:52
*** melwitt has joined #zuul19:53
sean-k-mooneyianw: look like you previous fixed page up page down https://opendev.org/zuul/zuul/commit/8d0f440e1bc85b55a2d780769d8ee1df4ae6ce7e19:54
sean-k-mooneybut it might have regrssed again19:55
ianwpg-up/down is working for me at lesat19:55
sean-k-mooneyyep it is for me too19:56
sean-k-mooneyyep it is as is up and down19:56
sean-k-mooneybut not left and right19:56
sean-k-mooneyhome and end also go to the top and bottom which may or may not be intended19:56
ianwi think first thing we should find a sample site pre the log change and see if those lines are missing there.  i don't believe i changed the actual mechanism of the filtering, but you never know19:57
sean-k-mooneyshift home/end select the start/end fo the line as i would expect19:57
sean-k-mooneyill check my zuul install i havent updated it in a while19:57
ianwhttps://7c4c2c6decd3eafecbc8-7e70b99d63ad0f32ef21f8f8ea4f17ac.ssl.cf1.rackcdn.com/762682/1/check/zuul-build-dashboard-opendev/572ac58/npm/html/tenants seems to be the 14th19:59
sean-k-mooneyhttps://zuul.seanmooney.info/t/openstack/build/e9b8da4a13c340b28fd5aa81ca2eea0f/log/controller/logs/screen-n-cpu.txt#431820:00
sean-k-mooneythat is what it used to look like20:00
sean-k-mooneythat might be a little slow to load20:00
sean-k-mooneyi need to move the static files to a webserver20:00
ianwbummer, i agree @ https://7c4c2c6decd3eafecbc8-7e70b99d63ad0f32ef21f8f8ea4f17ac.ssl.cf1.rackcdn.com/762682/1/check/zuul-build-dashboard-opendev/572ac58/npm/html/t/openstack/build/00ff397ca5374a0b9441036d43ee3416/log/controller/logs/screen-n-cpu.txt20:02
ianwi looks like the lines that don't match aren't coming out20:02
sean-k-mooneywait are each of those like a table row20:04
sean-k-mooneyi mean i guess that works i just was not expecting it to be one big table20:04
ianwyes because that's how like the highlighting of ranges work20:05
sean-k-mooneyah ok20:05
sean-k-mooneyianw: so you said the lines that dont match arent coming out20:07
sean-k-mooneydid you mean to some regex20:07
ianwthat's a guess, i'm trying to remember exactly how it works20:07
sean-k-mooneyi wonder is it only logs that have xml that are affected20:07
sean-k-mooneye.g. as a way to prevent html injections or something20:08
ianwhttps://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/containers/logfile/LogFile.jsx#L21020:09
ianwi feel like it must be ~ there, with the line not having a severity maybe?20:09
sean-k-mooneyyep20:10
sean-k-mooneyit has its 020:10
sean-k-mooneythe ones with debug are 1 but the next lines are 020:10
ianwthat is set in https://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/actions/logfile.js20:10
sean-k-mooneyclass="zuul-log-sev-0"20:11
ianwhrm, but 0 >= 020:11
sean-k-mooney the issue is the subsequent lines dont inhit the sev20:12
sean-k-mooneyhttps://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/actions/logfile.js#L30-L3820:12
sean-k-mooneythose shoudl have had 1 form the initall line20:12
sean-k-mooneywith DEBUG20:12
sean-k-mooneybut its getting 0 becaue of https://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/actions/logfile.js#L4720:13
sean-k-mooney let sev = null im guessing20:13
ianwi don't think subsequent lines inherit20:13
ianwbut it should be showing lines with a severity of 020:14
sean-k-mooneywhy?20:14
sean-k-mooneyoh all is 020:15
ianwi'm just bringing up a dev environment20:16
sean-k-mooneyoh weird it jumps form line number 7371 to 745020:20
sean-k-mooneyso you can see the gaps20:20
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/00ff397ca5374a0b9441036d43ee3416/log/controller/logs/screen-n-cpu.txt?severity=0#737120:20
ianwif you remove the if statement @ https://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/containers/logfile/LogFile.jsx#L210 it all comes back20:21
sean-k-mooneyi dont acully understand " line.severity >= severity && ("20:23
sean-k-mooneywhat is that && testing20:23
sean-k-mooneyshoudl it be (line.severity >= severity) && (20:23
ianwlike in bash i'd say, like "command-that-might-fail && run-if-it-doesn't"20:23
sean-k-mooneyya but im wondering about precidence20:24
sean-k-mooneyis it anding first tehn compureing that20:24
sean-k-mooneye.g. is the operator binding like this (line.severity >= (severity && ()))20:25
clarkbgoogling js operator precedence says now20:25
clarkb*says no20:26
sean-k-mooneyya just tried it in a console20:26
sean-k-mooney1 >= 0 && (4) prints 420:26
sean-k-mooneyso it eveluates the condiotnthen does the thing on the right20:26
ianwi think it's that line.severity is null, or none, or something that javascript doesn't think is >= 020:28
sean-k-mooneywell line ={}; line.severity = null; line.severity >= 020:29
sean-k-mooneyprints true20:29
sean-k-mooneymy dinner i just cooked so got to go20:30
sean-k-mooneyits does seam like its something related to that however20:30
sean-k-mooneynot really sure why its not working20:31
ianwyeah, if i dump the severity before the line, it's blank20:31
ianw      className={`log-message zuul-log-sev-${20:31
ianw                            line.severity || 020:31
ianwthat's why in the css matcher it seems to do this i guess20:32
tristanCin javascript those expressions are all false: [null < 0, null > 0, null == 0] , but those are true: [null <= 0, null >= 0]20:33
sean-k-mooneyactully no20:33
sean-k-mooneynull <= 0 is true20:34
sean-k-mooneyas is null >= 020:34
sean-k-mooneynull <0 and null > 0 are both false20:35
sean-k-mooneyand null == 0 is false20:35
sean-k-mooneybut >= and <= are true20:35
sean-k-mooneytristanC: oh sorry that is what you said20:35
sean-k-mooneyya javescript is a very self consitent language ...20:36
tristanCyeah go figure, it seems like '==' behave differently20:36
tristanCwhen you use ">=", null is converted to the number 0, but when you use '==', null is converted to undefined which doesn't match anything20:37
ianwi have no idea if this is null, blank, or undefined at this point :/20:40
tristanCianw: in that situation, the common strategy is to add `|0` to your value, that would make it default to 0 if it is not defined20:40
tristanC[""|0 , null|0, undefined|0, 42|0] == [0, 0, 0, 42]20:41
ianwhrm, that seems to ... not work20:42
tristanCianw: what's the error?20:43
ianwa blank page with a bunch of 0's randomly on it :)20:43
tristanCat least they are not null :)20:44
ianwwhen i output {line.severity} it's just "blank", (debug lines are 1, etc)20:45
ianwso is it undefined, null, "" is i guess the question?20:45
fungidoes javascript have anything like python's type() function?20:48
tristanCfungi: typeof20:48
fungihopefully that would be a way to disambiguate them for debugging purposes20:49
*** slaweq has quit IRC20:58
ianwsorry jump-starting a car ... as you do21:06
ianwok, i'm a dolt21:07
fungii'm half expecting to need to jump mine sometime soon. it gets driven for a few minutes once every week or two lately21:07
ianwline.severity|0 does seem to work.  i had the filtering set to debug21:07
ianwfungi: yep, it's our second car i run kids around in that hasn't moved in forever since my wife is WFH21:07
sean-k-mooneyundefined >=0 is false21:07
ianwhowever, today she is WAFH21:08
sean-k-mooney"" >= 0 is true21:08
sean-k-mooneyso its proably undefiend21:08
sean-k-mooneywhy is undefined|0 >=0    121:08
sean-k-mooneyand not true or false21:08
sean-k-mooney...21:08
fungiideally those lines with no declared severity would inherit the severity of the last line which had one?21:09
fungi(or 0 if there was no prior line with a severity)21:09
sean-k-mooneyoh || not |21:09
sean-k-mooneyya21:10
sean-k-mooneycan you jsut set the default calue for line.severity to 1 explictly somewhere21:10
sean-k-mooneyor do line_severity = line.severity || 021:11
sean-k-mooneythen use line_severity instead of line.severity21:11
clarkbianw: fungi: we've used ~weekly takeout as an excuse to run the car for about 30 minutes a week.21:16
clarkbI suggested we sell the car but haven't managed to make that idea stick21:16
ianwif (!line.severity) line.severity = 021:19
ianwbefore works21:19
ianwi'm not quite sure how to encode that in the current magic21:19
clarkbianw: have a link to the code context?21:20
ianwhttps://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/containers/logfile/LogFile.jsx#L21021:20
clarkbianw: I think everything in the {} starting at line 205 first non whitespace char is js21:21
clarkbthe statement above does a variable assignment too21:22
clarkbat line 208 I think you can do your statement21:22
ianwor, the regex matcher should set the severity to 0 if it doesn't match21:23
ianwi think that actually works better21:24
sean-k-mooneyianw: i think on line 209 before the return you can jsut do "line.severity = line.severity || 0"21:25
*** openstackgerrit has joined #zuul21:28
openstackgerritIan Wienand proposed zuul/zuul master: web: return 0 as severity when line doesn't match  https://review.opendev.org/76326421:28
ianwi think that might be the best way; although variations on ^^ work too21:29
clarkb++ to addressing that as close to the source as possible21:29
ianwi think it makes sense for lines to have a value from 0->7, rather than null,1->721:30
ianwbut, you know, i'd not be surprised to find out it causes some other issue21:31
*** slaweq has joined #zuul21:31
ianwdoing school run, hopefully that returns sample site soon21:31
clarkbI went ahead and +2'd it21:31
corvuswhat's the latest on horizontal scroll there?21:37
openstackgerritSlawek Kaplonski proposed zuul/zuul-jobs master: [multi-node-bridge] Add script to configure connectivity  https://review.opendev.org/76265021:37
corvusie, did anyone find out why left/right keys stopped working?21:38
clarkbI have not21:39
openstackgerritMerged zuul/zuul-jobs master: validate-host: Options to require v4 and v6 routes  https://review.opendev.org/76306521:45
sean-k-mooneycorvus: no, that is annoying but getting the content rendering is the main issue i hit21:56
sean-k-mooneyleft/right would be nice once its visable21:57
corvusi've gone back a bit, and if that is a regression, it's not a recent one.21:57
corvusi also thought left/right worked, but it looks like it hasn't for a long time, if ever.  i agree it should work.21:57
sean-k-mooneyit used to work before we started using the new app21:57
sean-k-mooneybut that was like over a year ago21:57
clarkbya pf4 changed a lot of that navigation stuff21:59
corvusah found it.  yeah, pf4 broke it22:00
corvussean-k-mooney, clarkb, ianw, felixedel: i have confirmed that left/right keyboard input on log page works at this commit: https://opendev.org/zuul/zuul/commit/b4b5b9fd58dd5d5aeb9d32331e4882d7ef5faf0622:03
corvusso it looks like we missed that when we un-reverted22:03
ianwi have to admit i'm still not sure why "null >= 0" wasn't working22:06
sean-k-mooneyianw: i think that just a case of javasciipt is weird22:11
sean-k-mooneyit totally fills you with confidence that so much software is now written in it22:12
ianwmaybe it's somehow different between react and js?  because in a console "null >= 0" returns true22:14
clarkbjsx is just compiling to js and using a very thin shim iirc. But ya it could be22:15
*** slaweq has quit IRC22:16
ianwthe lines we are missing have a line.severity typeof() "undefined"22:17
sean-k-mooneyianw: ya so undefiend >=0 is false22:17
sean-k-mooneyeven in the console22:17
sean-k-mooneyso if they were undefined and not null then it would be false22:18
ianwbut why is it not "null"?  and why did me changing the default value to "0" from null then change this value to 0 instead of "undefined"?22:18
corvusi think the left/right event is broken by the logs being in a tab22:22
sean-k-mooneyits broken on even without that change22:23
sean-k-mooneywhen the logs were in a new page22:23
sean-k-mooneybut yes its definetly broken in the tab case22:24
corvusto be clear, this is what i am certain of:  left/right scrolling works at this commit (which reverts out the PF4 build page): https://opendev.org/zuul/zuul/commit/b4b5b9fd58dd5d5aeb9d32331e4882d7ef5faf0622:25
corvusthis makes me think that the pf4 build page broke it22:25
corvusi suspect the actual mechanism for the breakage is the tab container consuming the event22:25
*** piotrowskim has quit IRC22:27
*** johnsom has quit IRC22:27
sean-k-mooneyianw: you have seen https://www.youtube.com/watch?v=et8xNAc2ic8 before by the way right22:28
*** johnsom has joined #zuul22:29
*** piotrowskim has joined #zuul22:29
*** nils has quit IRC22:32
ianwok, i'm still trying to 100% make sure we understand why log lines are missing22:32
ianwto be clear, i'm dumping the json representation of each line, and the lines that are missing don't have a severity field22:33
ianwthis is strange to me because it seems every line should have a severity set by https://opendev.org/zuul/zuul/src/branch/master/web/src/actions/logfile.js#L6222:34
sean-k-mooneyianw: where is the prtotype for line defined22:34
sean-k-mooneythe logfile prototype is here which has a default https://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/containers/logfile/LogFile.jsx#L288-L30022:35
sean-k-mooneybut i assume there is a simielr protype for the line22:35
ianwlines are an array in logFileContent22:36
sean-k-mooneyso this is lines right https://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/containers/logfile/LogFile.jsx#L65-L6922:37
ianwi wonder if this is it somehow? https://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/pages/Build.jsx#L19622:38
ianwseverity={severity ? severity : undefined}22:38
sean-k-mooneyactully ya its an array https://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/containers/logfile/LogFile.jsx#L29022:38
fungithat would reset severity of 0 to undefined, right?22:39
*** hashar has quit IRC22:39
sean-k-mooneytest = 0; test = test ? test : undefined; test22:40
sean-k-mooneyprints undefined22:40
ianwthe *first* line has22:41
ianw{"text":"-- Logs begin at Tue 2020-11-10 03:39:48 UTC, end at Tue 2020-11-10 04:33:38 UTC. --","index":1,"severity":null}22:41
ianwthat suggests somehow the first line correctly gets a null severity, but then we don't do it correctly as we loop22:42
sean-k-mooney test_val = null; test_val = test_val ? test_val : undefined; test_val22:43
sean-k-mooneyalso print null22:43
sean-k-mooneysorry22:43
sean-k-mooneyundefined22:43
sean-k-mooneyso it will set test_val to undefined22:43
sean-k-mooneyafter the first iteration22:43
sean-k-mooneygiven null >= 0  is ture but undefined >= 0 is false22:45
sean-k-mooneyianw: so maybe https://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/pages/Build.jsx#L196 should be null and not undefined22:46
sean-k-mooneyit does seam however that you default of 0 are not taking effect22:47
ianwok, this is in the regex matching somehow22:49
ianwgod damn it, why did i look22:49
ianwhttps://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/actions/logfile.js#L4922:50
ianwm = SYSTEMD_LOGMATCH.exec(line)22:50
ianwshould return a "m" value taht can be looked up in severityMap22:51
ianwit's returning crap, that isn't in the map, which gets you a sev of "undefined"22:51
sean-k-mooney right we dont have a value of 0 ther22:51
sean-k-mooneythis is the map https://opendev.org/zuul/zuul/src/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a/web/src/actions/logfile.js#L30-L3822:51
ianwright, it's looking up22:52
ianw"Nov 10 04:15:04.681121 ubuntu-focal-rax-dfw-0021684256 nova-compute[88940]:         <value>fdc</value>,Nov 10 04:15:04.681121,.681121,., ubuntu-focal-rax-dfw-0021684256 nova-compute[88940]:         <value>fdc</value>,ubuntu-focal-rax-dfw-0021684256,,"22:52
ianwin the map22:52
ianwoh, no, hang on, it's lookup up m[7]22:53
sean-k-mooneyyes whatervef m[7] is22:53
sean-k-mooneyi think that is ""22:54
sean-k-mooneywhich wont maghe any of the keys22:54
ianwit's looking up "undefined"22:54
ianwso the match is returning, but it's not got a m[7]22:54
sean-k-mooney7 is the last ,,22:55
sean-k-mooneyin that22:55
sean-k-mooneye.g. nothing right22:55
sean-k-mooneyso instead of if(m)22:55
sean-k-mooneyyou should do if(m && m[7])22:55
sean-k-mooneywhich will leave sev=null22:56
sean-k-mooneyalthough really you want it set to the value from the previous line22:57
ianwi don't know if setting it to last line is always what you want22:59
sean-k-mooneywell it wont be for the first line so you could be consistent and just always set it to null/022:59
sean-k-mooneybut i htink in generall it would be23:00
sean-k-mooneybut i can think of case where it might not be correct23:00
sean-k-mooneyto just use the previous lines23:00
sean-k-mooneye.g. if its not in openstack log format23:00
ianwbut why does SYSTEMD_LOGMATCH match, when this line doesn't have a status23:04
ianwthat's the root cause23:04
ianw^(\w+\s+\d+\s+\d{2}:\d{2}:\d{2}((\.|\,)\d{3,6})?)( (\S+) \S+\[\d+\]\: ((DEBUG|INFO|WARNING|ERROR|TRACE|AUDIT|CRITICAL))?.*)23:05
ianwis the expanded regex23:05
clarkbianw: the ? after the levels means it is optional23:05
clarkblooking at journalctl output locally that seems to be the case23:06
clarkbalso some services seem to use E: for error23:06
ianwyeah; removing the '?' everything starts getting a null severity again23:07
ianwhas that changed in ... forever?23:08
clarkbI don't think so. Is it possible this issue has just been there?23:08
clarkb(I haven't checked git log)23:09
ianwhasn't changed since https://review.opendev.org/#/c/672839/ afaics23:10
ianwthat added it23:10
*** aluria has quit IRC23:11
ianwok, looking more closely at23:11
ianwhttps://opendev.org/zuul/zuul/commit/072bf45ff8a534aebdbf8a1987d4bc902dd15b7a23:11
ianw              {data.map((line) => (23:12
ianw                ((!severity || (line.severity >= severity)) &&23:12
ianwso; i think we have about 6 different ways to fix this now23:13
fungisounds like you should round out your morning with breakfast at milliways23:15
ianwfungi: i feel a bit more like the cow who offers himself to be eaten :)23:16
fungiindeed23:17
ianwi think the ? on the regex must be wrong.  i'm worried it cascades into more weirdness though23:17
*** aluria has joined #zuul23:18
openstackgerritIan Wienand proposed zuul/zuul master: web: remove optional match on systemd regex  https://review.opendev.org/76328123:24
corvusianw: you could look into osla to see if there's any history for that '?'23:26
corvusthat may be where it originated23:27
ianwcorvus: ummm ... what's osla?23:27
ianwoh, os-loganalyze23:29
ianwi have a tree of that somehere...23:29
ianwhttps://review.opendev.org/#/c/455831/ is maybe where it came from?23:30
ianw"We also have to realize that status might not be in a systemd line,23:31
ianwand that's ok, but still match it, given that otherwise we run the23:31
ianwrisk of falling into the syslog match."23:31
ianwthat does -- self.status = m.group('status') or "NONE"23:32
corvusmaybe we should do something similar?23:32
ianwyeah, though in this case we don't have the concern that the other match might catch it23:34
*** rlandy|rover has quit IRC23:35
*** holser has quit IRC23:35
corvusianw: true; we might add more later, but i agree that all we care about is the severity so it's moot23:46
ianw... and that is why we make an effort to write good commit messages :)23:47
ianwif sdague had of just stuck that in, we'd all still be wondering23:48

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!