Friday, 2021-10-15

07:44 <bauzas> good morning Nova
07:45 * bauzas just waves but has to leave for 10 mins ;)
08:46 <kashyap> gibi[m]: bauzas: Friday shameless plug: I recently summarized a KVM maintainer's talk for LWN on QEMU and software complexity.  I think the lessons are interesting for OpenStack too:
08:46 <kashyap> gibi[m]: bauzas: "A QEMU case study in grappling with software complexity" — https://lwn.net/SubscriberLink/872321/221e8d48eb609a38/
08:46 <bauzas> ++
08:47 <kashyap> If you're short on time, read the intro, "Sources of complexity", "Ways to fight back", and the short, one-para conclusion.
08:47 <kashyap> (Especially check out the idea of "incomplete transitions")
09:47 <mdbooth> My devstack is failing when it tries to run 'openstack --os-cloud devstack-system-admin registered limit create --service glance --default-limit 10000 --region RegionOne image_size_total' with 'Cloud devstack-system-admin was not found'.  This is a fresh install. I don't know what devstack-system-admin is or what creates it, so I'm at a loss for where to look. There's a comment from dansmith that this is a hack: https://github.com/openstack/devstack/blob/82facd6edf7cefac1ab68de4fe9054d7c4cb50db/lib/glance#L291-L294 . Has something invalidated the hack? Does anybody know where 'devstack-system-admin' is supposed to come from?
09:49 <mdbooth> To the best of my knowledge there is no clouds.yaml anywhere on this system. If there is, it was created by devstack and put somewhere I don't know to look for it.
09:49 <kashyap> mdbooth: See this commit in DevStack: 56905820 (Add devstack-system-admin for system scoped actions, 2019-01-08)
09:51 <mdbooth> 👀
09:51 <kashyap> Heh
09:51 <kashyap> Also see the code under the comment "#admin with a system-scoped token -> devstack-system" in devstack/functions-common
09:52 <kashyap> Although the commit message isn't particularly descriptive, and assumes "inside knowledge"
09:54 <mdbooth> Hmm, that appears to be updating a clouds.yaml file
09:55 <mdbooth> As I don't have a clouds.yaml file, I wonder if this is an ordering thing
09:55 <mdbooth> Did devstack create the glance limit before creating clouds.yaml?
09:57 <mdbooth> I'm going to remove GLANCE_LIMIT_IMAGE_SIZE_TOTAL from my local.conf and re-run glance, then look to see what it put in clouds.yaml
09:57 <mdbooth> s/glance/stack.sh/
09:58 <kashyap> I don't know about the Glance limit ... but there are a bunch of commits that might give a hint (git log --oneline | egrep -i 'glance.*limit')
10:06 <frickler> /etc/openstack/clouds.yaml is what devstack generates
10:07 <mdbooth> frickler: Yeah, that was my RTFS. It seems to be running this without having created it, though 🤔
10:07 <mdbooth> Although I just found GLANCE_ENABLE_QUOTAS. I wonder if I can sidestep this whole thing.
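
    Whether a given cloud name resolves can be checked outside devstack with
    openstacksdk; a minimal sketch, assuming the SDK is installed, with the
    cloud name taken from mdbooth's error above:

        # openstacksdk searches ./clouds.yaml, ~/.config/openstack/clouds.yaml
        # and /etc/openstack/clouds.yaml; the last is the one devstack writes.
        import openstack.config

        try:
            region = openstack.config.OpenStackConfig().get_one(
                cloud='devstack-system-admin')
            print("found cloud:", region.name)
        except Exception as exc:  # raised when no matching entry exists
            print("not found:", exc)
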
10:19 <frickler> mdbooth: are you using a reduced set of services? it might be a bug in the async dependencies
10:19 <mdbooth> Very possibly. I rewrote local.conf this morning to use ovn and I'm convinced I wasn't hitting this yesterday.
10:20 <frickler> mdbooth: if you can share your local.conf I can give it a spin
10:21 <mdbooth> Just re-provisioning. I'll have a fully hydrated one in a few minutes.
10:22 <mdbooth> frickler: Actually you can just get it here: https://github.com/shiftstack/cluster-api-provider-openstack/blob/devstack-on-openstack/hack/ci/cloud-init/default.yaml.tpl
10:23 <mdbooth> That's the bottom half of a cloud-init which runs devstack
10:25 <mdbooth> OPENSTACK_RELEASE is xena
10:27 <frickler> ok, having a meeting now, will try to run it afterwards
10:28 <mdbooth> frickler: Thanks. That version includes GLANCE_ENABLE_QUOTAS=False because I'm just testing that.
10:28 <mdbooth> But it previously used GLANCE_LIMIT_IMAGE_SIZE_TOTAL=10000 instead
11:02 <mdbooth> frickler: FWIW I've been in hacker mode on that config for a while (just look at the history!). I just disabled tempest and horizon, which had accidentally become enabled again, and it seems to have completed. It's still using GLANCE_ENABLE_QUOTAS=False.
11:03 <mdbooth> Which is to say, if there's a dependency issue I'll bet it relates to tempest or horizon, but I haven't proven that.
11:29 <frickler> mdbooth: o.k., at least I could reproduce your failure with GLANCE_LIMIT_IMAGE_SIZE_TOTAL being set
11:50 <frickler> mdbooth: nice one, this actually only fails consistently with DEVSTACK_PARALLEL=False
11:52 <frickler> with async, https://github.com/openstack/devstack/blob/82facd6edf7cefac1ab68de4fe9054d7c4cb50db/stack.sh#L1107 runs in the background and write_clouds_yaml in L1122 has a fair chance of being fast enough
11:52 <frickler> dansmith: ^^
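
    A toy illustration of the ordering frickler describes, using plain
    Python threads rather than devstack's bash async helpers (file paths
    are made up): the limit-registration step needs clouds.yaml, which the
    main script only writes later, so serial execution always loses while
    the backgrounded variant usually wins.

        import pathlib, tempfile, threading, time

        clouds = pathlib.Path(tempfile.mkdtemp()) / "clouds.yaml"

        def register_limits():
            time.sleep(0.05)  # stands in for the setup work before the limit call
            print("limit step sees clouds.yaml:", clouds.exists())

        # DEVSTACK_PARALLEL=True analogue: the consumer runs in the background
        # while the foreground goes on to write clouds.yaml, so the writer
        # usually (but not always) finishes first.
        t = threading.Thread(target=register_limits)
        t.start()
        clouds.write_text("clouds: {}\n")  # write_clouds_yaml analogue
        t.join()

        # DEVSTACK_PARALLEL=False analogue: calling register_limits() inline,
        # before the write above, would always see a missing file.
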
11:56 <mdbooth> frickler: Oh, wow! I only turned that on temporarily to rule it out as the potential cause of another issue!
12:27 <kashyap> mdbooth: TIL, the "tpl" extension
12:27 <mdbooth> kashyap: Not mine in this case, but I'm pretty sure I've used it before.
12:27 <kashyap> (From your link.  Probably it's just a convenient way to refer to that YAML file as a "template")
12:27 <kashyap> mdbooth: I see
12:56 <gibi> sean-k-mooney: about stopping the services: you are right, we are doing it already. That does not stop all the eventlets the service spawned. I also tried to iterate all the eventlets and call throw() on them to stop them, but that did not help either.
12:57 <gibi> kashyap: thanks for the links, I added it as weekend reading :)
12:57 <kashyap> gibi: No prob.  (It took 8 gruelling revisions. :D  But I always become a bit of a better person after writing for LWN)
12:57 <sean-k-mooney> gibi: ya, i have your review open on my other monitor. the more i read over it and look at it the more compelling it becomes.
12:58 <sean-k-mooney> gibi: it's a little non-obvious at first glance why we have to do this, but it's a nice solution when you dig into it
12:58 <gibi> kashyap: I follow LWN but I'm not a subscriber. I think it's prestigious to write there :)
12:59 <kashyap> gibi: I realize not everyone has a subscription; Red Hat has a group sub.  Hence I created a "subscriber link", as I posted it in a community channel.
12:59 <gibi> sean-k-mooney: it would be better to kill eventlets at the end of each testcase, but I did not find a way to do it
12:59 <gibi> kashyap: yeah I see and I thank you for it
12:59 <kashyap> No prob at all.  (And sorry for the plug.)
13:00 <kashyap> But the main idea of essential vs. accidental complexity comes from the famous 1986 paper called "No Silver Bullet" by Fred Brooks - https://en.wikipedia.org/wiki/No_Silver_Bullet
13:00 <sean-k-mooney> gibi: well, there might be a way to do it: if we modified the test setup so that each test used a separate greenpool, we could stop all eventlets in the pool and discard it at the end of the test
13:01 <kashyap> (So it was nice to see concrete examples of it in QEMU.)
13:01 <sean-k-mooney> to do that i think we would have to modify the nova service definition and possibly nova utils to use a non-default eventlet pool
13:02 <sean-k-mooney> but if we did that we could extend the kill function to terminate the pool
13:03 <frickler> mdbooth: I wanted to move the write_clouds_yaml earlier anyway in https://review.opendev.org/c/openstack/devstack/+/780417, I guess I can just do that step in its own patch to fix your issue
13:05 <gibi> kashyap: ohh yeah, essential and accidental complexity, I like those topics
13:05 <mdbooth> frickler: I'd appreciate it
13:05 <kashyap> gibi: Yeah; the idea goes back 2000 years!  (Aristotle++)
13:05 <gibi> ohh
13:05 <gibi> I did not know that
13:06 <kashyap> I linked to it in the intro too :)
13:06 <mdbooth> eventlet--
13:06 <kashyap> mdbooth: Heh, what a contrasting negative karma
13:06 <kashyap> (Sorry for your pain)
13:07 <mdbooth> I only have the scars now, and occasionally the nightmares.
13:07 <gibi> sean-k-mooney: if terminating the pool also just calls greenlet.throw(), then that would have the same problem as I had when I manually called that at the end of the test on each greenlet
13:08 <frickler> mdbooth: https://review.opendev.org/c/openstack/devstack/+/814142
13:08 <mdbooth> It blows my mind that at some point there was a meeting and somebody said: "You know what, let's just monkey patch everything and replace it all with our own stuff, what could go wrong?". And somebody else in that meeting agreed with them, and they started doing it.
13:09 <gibi> mdbooth: I assume it was a single person project. :)
13:09 <sean-k-mooney> mdbooth: well, the alternative was to continue to use Twisted, so...
13:10 <gibi> or use threading until you scale so big that the overhead of threads is too much
13:10 <gibi> or use something other than Python, without the GIL ;)
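
    For anyone who has not met it, the "monkey patch everything" above is
    literal: eventlet's patching is one real API call (a minimal demo,
    assuming eventlet is installed):

        import eventlet
        eventlet.monkey_patch()  # swaps socket, threading, time, select, os...

        import socket, threading
        # Both names now resolve to eventlet's cooperative green versions,
        # not the stdlib ones the rest of the process thinks it imported.
        print(socket.socket)
        print(threading.current_thread())
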
13:11 <sean-k-mooney> gibi: i was considering: could we stop all the services we spawned as greenthreads and then either call waitall to wait for them to finish, or loop over and call kill on all the running greenthreads
13:12 <sean-k-mooney> so stop services and call https://eventlet.net/doc/modules/greenpool.html#eventlet.greenpool.GreenPool.waitall or stop services and call https://eventlet.net/doc/modules/greenthread.html#eventlet.greenthread.kill on all greenthreads in the pool
13:12 <gibi> it is multiple eventlets per service, but yes, you are right. That should work. I did not call wait after throw, maybe that was the problem
13:12 <gibi> note that we don't just have greenthreads, we have naked greenlets as well somehow
13:13 <gibi> I did not trace where they are coming from
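
    A rough sketch of the teardown being discussed, using only the eventlet
    APIs sean-k-mooney links; the per-test pool and the fake service loop are
    assumptions for illustration, not nova's actual fixture code:

        import eventlet
        from eventlet import greenpool, greenthread

        pool = greenpool.GreenPool()  # hypothetical per-test pool, not the default

        def fake_service_loop():
            while True:               # stands in for a nova service worker loop
                eventlet.sleep(0.01)

        pool.spawn(fake_service_loop)
        pool.spawn(fake_service_loop)
        eventlet.sleep(0)             # yield once so both loops actually start

        # Teardown: kill everything the pool still tracks, then drain it.
        for gt in list(pool.coroutines_running):
            greenthread.kill(gt)      # raises GreenletExit inside the greenthread
        pool.waitall()                # returns once all greenthreads have exited
        print("pool drained:", pool.running() == 0)
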
13:15 <sean-k-mooney> i need to test something else today, but i still think your current patch is likely a viable solution in the short term and we could explore the greenpool approach in parallel/after
13:25 <bauzas> folks, looking at the nova PTG agenda we have atm
13:25 <bauzas> it looks to me like we don't have a lot of topics to discuss, so maybe we shouldn't have a schedule, ok?
13:26 <bauzas> I'll just prioritize some topics
13:30 <dansmith> frickler: ah, need to wait for all those accounts to finish before write_clouds_yaml I guess, huh?
13:30 <sean-k-mooney> we might want to keep one of the sessions free for unconference/follow-up discussions
13:30 <sean-k-mooney> bauzas: ^
13:31 <sean-k-mooney> bauzas: but ya, we could also just prioritise the list and see how far we get each day
13:37 <dansmith> frickler: er, no, I guess that's just writing static things out, so ... I'm not sure what the problem is (if any)
13:38 <gibi> bauzas: I suggest frontloading the important stuff and then just following the etherpad. If we run out of topics then we are done :)
13:38 <gibi> sean-k-mooney: yeah, I have to do other things today too, so I have no chance to try the greenpool approach
13:48 <bauzas> gibi: yeah, for example, I'll move melwitt's topic for unified limits up
13:54 <gibi> ack
14:05 <mdbooth> FYI: $ curl --compressed -H "X-Auth-Token: ${token}" -X GET https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13292/v2.1/images/$imageid/file | nbdcopy -- - [ qemu-nbd -f qcow2 capo-e2e-worker.qcow2 ]
14:06 <mdbooth> Ah, wrong channel. Maybe still interesting, though :)
14:08 <mdbooth> dansmith: The problem I was hitting was that we were trying to create the glance quotas before clouds.yaml had been created.
14:10 <mdbooth> And to be clear, I'm basically just cargo-culting this local.conf. I have very little idea what's actually going on.
14:14 <mdbooth> Speaking of which, anybody ever seen: "The unit files have no installation config (WantedBy=, RequiredBy=, Also=, Alias= settings in the [Install] section, and DefaultInstance= for template units). This means they are not meant to be enabled using systemctl."? This failure seems to be non-deterministic, and unfortunately only happens in CI so I can't debug :(
14:52 <dansmith> mdbooth: ah, maybe I should move that in glance, because I do it super early when the *glance* accounts are available, but I definitely need clouds.yaml
17:03 <opendevreview> Ade Lee proposed openstack/nova master: Add check job for FIPS  https://review.opendev.org/c/openstack/nova/+/790519
17:59 -opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline starting in 5 minutes, at 18:00 UTC, for scheduled project rename maintenance, which should last no more than an hour (but will likely be much shorter): http://lists.opendev.org/pipermail/service-announce/2021-October/000024.html
21:08 <opendevreview> melanie witt proposed openstack/nova master: DNM Run against unmerged oslo.limit changes  https://review.opendev.org/c/openstack/nova/+/812236
