Monday, 2021-05-24

*** k_mouza has joined #openstack-nova01:07
*** k_mouza has quit IRC01:12
*** k_mouza has joined #openstack-nova01:27
*** gmann_afk is now known as gmann01:30
*** k_mouza has quit IRC01:32
*** sapd1 has joined #openstack-nova01:54
*** sapd1_x has quit IRC01:56
*** sapd1 has quit IRC02:03
*** sapd1 has joined #openstack-nova02:03
*** psachin has joined #openstack-nova02:49
*** sapd1_x has joined #openstack-nova03:17
*** mkrai has joined #openstack-nova04:23
*** k_mouza has joined #openstack-nova04:43
*** k_mouza has quit IRC04:47
*** k_mouza has joined #openstack-nova04:55
*** ratailor has joined #openstack-nova04:58
*** k_mouza has quit IRC04:59
*** k_mouza has joined #openstack-nova05:03
*** k_mouza has quit IRC05:08
*** Alon_KS has quit IRC05:22
*** ralonsoh has joined #openstack-nova05:28
*** mkrai_ has joined #openstack-nova05:34
*** sapd1_x has quit IRC05:37
*** mkrai has quit IRC05:37
*** Alon_KS has joined #openstack-nova05:45
*** sapd1_x has joined #openstack-nova06:01
*** k_mouza has joined #openstack-nova06:16
*** k_mouza has quit IRC06:20
*** sapd1_x has quit IRC06:25
*** slaweq has joined #openstack-nova06:33
*** Alon_KS has quit IRC06:37
*** Alon_KS has joined #openstack-nova06:41
*** mkrai_ has quit IRC07:09
*** sapd1_x has joined #openstack-nova07:09
*** tosky has joined #openstack-nova07:20
*** andrewbonney has joined #openstack-nova07:31
*** slaweq has quit IRC07:32
*** k_mouza has joined #openstack-nova07:33
*** k_mouza has quit IRC07:34
*** k_mouza has joined #openstack-nova07:34
*** slaweq has joined #openstack-nova07:35
*** vishalmanchanda has joined #openstack-nova07:41
*** lucasagomes has joined #openstack-nova07:58
lyarwood\o morning08:01
*** derekh has joined #openstack-nova08:05
*** mkrai_ has joined #openstack-nova08:06
*** ociuhandu has joined #openstack-nova08:23
*** stephenfin has quit IRC08:27
lyarwoodstephenfin:  would you mind hitting these again if you're around today https://review.opendev.org/q/topic:%22bug%252F1928063%22+(status:open%20OR%20status:merged)08:28
*** Hazelesque has joined #openstack-nova08:47
*** Hazelesque is now known as Hazelesque_09:02
*** Hazelesque_ is now known as Hazelesque__09:02
*** Hazelesque__ is now known as Hazelesque09:02
kevinzlyarwood: stephenfin: morning! Could you help review this when convenient for you? https://review.opendev.org/c/openstack/nova/+/763928, it is about live migration support on Arm6409:03
* lyarwood clicks09:04
*** sapd1_y has joined #openstack-nova09:19
*** sapd1_x has quit IRC09:19
*** sapd1 has quit IRC09:19
*** ociuhandu has quit IRC09:34
*** ociuhandu has joined #openstack-nova09:35
*** sapd1 has joined #openstack-nova09:41
*** ociuhandu has quit IRC09:43
*** stephenfin has joined #openstack-nova09:49
*** jangutter_ has joined #openstack-nova09:50
*** jangutter has quit IRC09:53
*** owalsh has quit IRC10:05
*** sapd1 has quit IRC10:16
openstackgerritSylvain Bauza proposed openstack/nova-specs master: Add generic mdevs to Nova  https://review.opendev.org/c/openstack/nova-specs/+/79279610:21
*** owalsh has joined #openstack-nova10:27
*** sapd1_y has quit IRC10:29
*** sapd1 has joined #openstack-nova10:30
*** sapd1_x has joined #openstack-nova10:31
*** sapd1_x has quit IRC10:31
*** sapd1_x has joined #openstack-nova10:32
*** lpetrut has joined #openstack-nova10:43
*** pmannidi has joined #openstack-nova10:43
*** pmannidi has quit IRC10:43
*** ociuhandu has joined #openstack-nova10:47
*** ociuhandu has quit IRC11:00
*** ociuhandu has joined #openstack-nova11:03
*** ociuhandu has quit IRC11:15
*** sapd1 has quit IRC11:15
*** mkrai_ has quit IRC11:19
*** ociuhandu has joined #openstack-nova11:27
*** ociuhandu has quit IRC11:31
*** ociuhandu has joined #openstack-nova11:34
*** ociuhandu has quit IRC11:38
*** ociuhandu has joined #openstack-nova11:38
*** ociuhandu has quit IRC11:39
*** ociuhandu has joined #openstack-nova11:41
openstackgerritStephen Finucane proposed openstack/nova master: tests: Move libvirt-specific fixtures  https://review.opendev.org/c/openstack/nova/+/79096911:41
openstackgerritStephen Finucane proposed openstack/nova master: tests: Add os-brick fixture  https://review.opendev.org/c/openstack/nova/+/79097011:41
openstackgerritStephen Finucane proposed openstack/nova master: tests: Rename 'ImageBackendFixture' to 'LibvirtImageBackendFixture'  https://review.opendev.org/c/openstack/nova/+/79235311:41
openstackgerritStephen Finucane proposed openstack/nova master: Create a fixture around fake_notifier  https://review.opendev.org/c/openstack/nova/+/75844611:41
openstackgerritStephen Finucane proposed openstack/nova master: Use NotificationFixture for legacy notifications too  https://review.opendev.org/c/openstack/nova/+/75844811:41
openstackgerritStephen Finucane proposed openstack/nova master: Test the NotificationFixture  https://review.opendev.org/c/openstack/nova/+/75845011:41
openstackgerritStephen Finucane proposed openstack/nova master: Move fake_notifier impl under NotificationFixture  https://review.opendev.org/c/openstack/nova/+/75845111:41
openstackgerritStephen Finucane proposed openstack/nova master: rpc: Mark attributes as private  https://review.opendev.org/c/openstack/nova/+/79280311:41
*** sapd1 has joined #openstack-nova11:42
*** ociuhandu has quit IRC11:47
*** ociuhandu has joined #openstack-nova11:59
*** ociuhandu has quit IRC12:02
*** ociuhandu has joined #openstack-nova12:03
*** sapd1_x has quit IRC12:11
*** sapd1 has quit IRC12:12
*** links has joined #openstack-nova12:25
*** psachin has quit IRC12:28
openstackgerritMerged openstack/nova master: tests: Move libvirt-specific fixtures  https://review.opendev.org/c/openstack/nova/+/79096912:34
openstackgerritMerged openstack/nova master: tests: Add os-brick fixture  https://review.opendev.org/c/openstack/nova/+/79097012:35
sean-k-mooneystephenfin: are you rewriting all the test fixtures again :P12:36
openstackgerritMerged openstack/nova master: tests: Rename 'ImageBackendFixture' to 'LibvirtImageBackendFixture'  https://review.opendev.org/c/openstack/nova/+/79235312:36
lyarwoodI didn't like backporting func tests anyway12:36
lyarwood /s12:37
sean-k-mooney:)12:37
sean-k-mooneywe just need to get all our customers to deploy master at all times12:37
sean-k-mooneyproblem solved12:37
* lyarwood nods12:39
*** bbowen has joined #openstack-nova12:41
*** elod has quit IRC12:56
*** elod has joined #openstack-nova13:05
*** sapd1 has joined #openstack-nova13:11
*** ociuhandu has quit IRC13:40
lyarwoodRandom, as an admin I can't seem to list server events for a user defined instance?13:48
lyarwood$ openstack server event list srvtree-server1-rh5da6maeidh13:49
lyarwoodNo server with a name or ID of 'srvtree-server1-rh5da6maeidh' exists.13:49
*** ratailor has quit IRC13:49
stephenfin@lyarwood If you're in a different project then that won't work13:51
lyarwoodeven as the admin?13:51
lyarwoodokay weird13:51
lyarwoodI thought this worked previously13:51
stephenfinbecause the name-based search is done against the instances in the project13:51
stephenfinI doubt it. A couple of commands have an '--all-projects' option just for this13:52
* stephenfin double checks to make sure he didn't change anything recently, just in case13:52
lyarwoodah right I was likely using the UUID in the past13:53
stephenfinNope, I added missing options but it's otherwise unchanged, save for some docs, since it was added in 201713:53
stephenfinI'd say so13:54
*** ociuhandu has joined #openstack-nova13:54
lyarwoodYeah sorry, I've used UUIDs everywhere in the request-id presentation and that just threw me13:56
*** ociuhandu has quit IRC13:59
*** ociuhandu has joined #openstack-nova14:01
*** jangutter has joined #openstack-nova14:16
*** jangutter has quit IRC14:16
*** jangutter_ has quit IRC14:16
*** jangutter has joined #openstack-nova14:16
*** ociuhandu has quit IRC14:25
*** martinkennelly has joined #openstack-nova14:34
lyarwoodtest_volume_backed_live_migration keeps failing on master at the moment btw14:35
*** ociuhandu has joined #openstack-nova14:38
*** ociuhandu has quit IRC14:42
*** ociuhandu has joined #openstack-nova14:43
*** jangutter_ has joined #openstack-nova14:47
lyarwoodAh I see why, slow nodes and pre_live_migration is timing out14:49
*** lpetrut has quit IRC14:50
*** jangutter has quit IRC14:51
*** dklyle has joined #openstack-nova14:51
*** jangutter has joined #openstack-nova15:03
*** jangutter_ has quit IRC15:07
*** raildo has joined #openstack-nova15:32
*** jangutter_ has joined #openstack-nova15:33
melwittstephenfin: were you planning to work on trying to remove eventlet or would you mind if I took a try at it? the ptg discussion on that occurred earlier than I had come online that day15:35
stephenfinmelwitt: go for it. I am hoping to do something on it but it's a potential minefield that would benefit from many eyes15:36
*** jangutter has quit IRC15:37
melwittstephenfin: cool, thanks. I've gone quite down the rabbit hole related to eventlet on some downstream bugs, so I've some ideas now (fortunately or unfortunately)15:41
*** lucasagomes has quit IRC16:02
dansmithmelwitt: this is remove eventlet from api right?16:06
melwittdansmith: and potentially everything else too16:07
dansmithmelwitt: nothing else is threadsafe so I have a hard time imagining that being a thing16:09
*** jangutter has joined #openstack-nova16:09
melwitttl;dr is there's a bad interaction and failure mode between gevent/eventlet and pymysql and mysqlconnector wherein if a green thread is killed before a connection is cleaned up, it leaves it in an inconsistent state and the next attempt to use a connection blows up16:10
melwittI've spoken at length with zzzeek about this and my understanding is this can't be worked around or handled and that replacing our usage of eventlet with native threading or similar is the only way to avoid it16:11
dansmithis there a pointer to something to read about it?16:12
melwittyeah, sec16:12
*** ociuhandu has quit IRC16:12
*** jangutter_ has quit IRC16:13
dansmithapi and conductor going to native threads are doable I think without too much crazy,16:15
dansmithbut I think compute will be a nightmare, but it also doesn't use the DB driver at all, so it should be immune16:15
melwittdansmith: this comment contains the relevant references https://bugzilla.redhat.com/show_bug.cgi?id=1927994#c45 the rest of that bug has a lot of comments, most of which are private because I'm not sure they help bring any clarity. it's been a long discussion on there16:15
openstackmelwitt: Error: Error getting bugzilla.redhat.com bug #1927994: NotPermitted16:15
melwittoh, the entire bug looks to be private /facepalm16:16
melwitthttps://github.com/PyMySQL/PyMySQL/issues/23416:16
melwitthttps://github.com/sqlalchemy/sqlalchemy/issues/325816:16
melwitthttps://github.com/PyMySQL/PyMySQL/issues/26016:16
melwittthose are the references ^16:16
dansmiththanks, I can read it at least16:17
dansmithso the assertion is that all openstack projects will have to undertake this conversion?16:17
dansmiththat's a pretty big deal16:17
melwittfor whatever reason nova is the only one that seems affected by this, this error has not been found in any other service's logs in the same deployments that see it in nova often16:18
dansmithand only nova-api?16:19
melwittand I think most of the appearance of the error is from back when we had the eventlet-based wsgi server for nova-api. my guess is that since moving away from that, our usage is much reduced and reduces the chances of hitting this16:19
melwittno I have seen traces in nova-scheduler and nova-conductor as well16:19
dansmithokay, because nova-api already runs with a combination of native and green threads, so if it was just api, then that could be why,16:20
dansmithbut if scheduler and conductor see it as well, I'm not sure why no other service would be affected16:20
melwittyeah, I'm guessing it's because it becomes very rare when the only things using eventlet are the periodic tasks, timers/retries, and some bits of scatter/gather. the error shows up under high load and coexists with other various connection errors to the database16:22
melwittwell, maybe "very rare" is not a good way to put it, "more rare"16:23
melwittI have searched for some way that nova does something different than any other service wrt to database access and found nothing16:25
melwittthe only lead we have so far is that nova uses eventlet more than any other services do (afaik so far)16:26
sean-k-mooneymelwitt: i assume we have not had any downstream sqlalchemy updates in rhel 716:26
sean-k-mooneydoes this happen in modern openstack? that was for 13, so queens16:27
melwittsean-k-mooney: what do you mean, like version changes? I don't think so but zzzeek would have covered that16:27
sean-k-mooneyyep or perhaps a backport that could have caused it16:28
melwittsean-k-mooney: to your other question, I could find no bug reports for this for newer than 1316:28
melwittthat could either mean it went away or that it happens rarely enough that no one has bothered reporting it. not sure what to think16:29
dansmithmelwitt: but nothing changed about conductor and scheduler that would account for them not showing the issue in later versions16:30
dansmithmeaning api going from evenlet to real wsgi is a change that could affect this, but that doesn't impact the other services16:30
melwittzzzeek anyway strongly recommended we stop using eventlet for its known issues with the mysql connectors16:30
dansmithI've read the bug now, and I see that he has,16:31
dansmithbut it's not quite as simple as just turning it off16:31
melwittdansmith: yeah, I appreciate that. but it's hard to know if this is just not been reported or if it's really not there anymore16:31
sean-k-mooneyi wonder is this wsgi related16:31
dansmithso much of what goes on changes if you don't have those yield points patched in16:32
dansmithsean-k-mooney: melwitt says she has reports of it from scheduler and conductor, which is what really puzzles me16:32
melwittyeah, I know it's not simple as turning it off but afaict we could switch everything to native threads. I've started looking into it (just by having to explain to the customer where/when it's workaroundable and when it's not)16:32
sean-k-mooneyok i was wondering if it was related to the issue with enabling multiple tread in the wsgi process16:32
sean-k-mooneybut if its in the schduler and conductor its not that16:33
dansmithmelwitt: native threads mean a ton of code can race that can't race now16:33
sean-k-mooneydansmith: well i'm not sure about "can't race now" but i agree it would be more16:33
melwittdansmith: yeah, those logs are attached to the case and accessible on supportshell if you're interested in looking16:33
sean-k-mooneythe GIL will save us somewhat but not entirely16:34
dansmithsean-k-mooney: not all code, a ton of code.. there is a whole class of stuff that can't race because it's single threaded and can't overlap except at schedule points.. all of that stuff will suddenly be actually paralell16:34
dansmithsean-k-mooney: it will save individual accesses to data structures, but not multiple statements making changes16:34
melwittmaybe at the very least we could use futurist and make it configurable whether to use eventlet or native threading, and default to eventlet so as not to change existing behavior for those it works ok for16:34
sean-k-mooneyya thats true16:35
melwittand let people like these customers try out the native threading and let us know if it works well in a real deployment or not16:35
dansmithwell, any change would have to be gradual like that I think.. meaning a switch to flip that we keep around for a while16:35
dansmithotherwise we're going to flip the switch and not find out if we broke everyone for 18 months :)16:35
sean-k-mooneywe need "osapi_compute_workers" to be 1 though, right, and scale that via the process.16:36
melwittyeah, a good point16:36
dansmithsean-k-mooney: that's a different concern I think16:37
sean-k-mooneymaybe the reinit issues for that have been fixed? but we used to have issues with the multiple interpreters running in the same wsgi process because of how it reloaded16:37
dansmithnot sure we need _workers at the point where we're actually natively threaded16:37
melwittsean-k-mooney: osapi_compute_workers is actually the number of processes but the wsgi.default_pool_size defaults to 1000 and represents the number of green threads for the nova-api eventlet based wsgi server16:38
sean-k-mooneymelwitt: ah ok16:38
sean-k-mooneyits   https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.osapi_compute_workers16:39
sean-k-mooneyim still not sure it makes sense to use that when running under a wsgi service16:41
dansmithright,16:41
dansmiththat's unrelated I think16:41
dansmithwe'll never spawn our own worker processes when under uwsgi, AFAIK, we'll only spawn (green)threads16:41
melwittyeah I think with uwsgi or mod_wsgi the number of processes is configured by their respective configs16:43
melwittthe osapi_compute_workers is for other services or the old eventlet wsgi server we had provided back then https://github.com/openstack/nova/blob/stable/queens/nova/wsgi.py#L7516:43
sean-k-mooneyapparently its never used directly in the nova code16:44
melwitter sorry, osapi_compute_workers was only for nova-api. the other services have their own "workers" settings which map to the oslo.service workers16:44
dansmithright, it's for when we spawn our own master and sub processes and listen on the socket ourselves16:44
sean-k-mooneyhttps://codesearch.opendev.org/?q=osapi_compute_workers&i=nope&files=&excludeFiles=&repos=openstack/nova16:45
melwittsean-k-mooney: it was here https://github.com/openstack/nova/blob/stable/queens/nova/service.py#L36416:46
sean-k-mooneyim wondering if it's still used since it does not appear to be16:46
sean-k-mooneyanyway its probably unrelated to the db error16:48
melwittdansmith: I was thinking one of the reasons nova sees this more is because we use the eventlet executor for oslo.messaging and maybe other projects don't. that opens up a lot more chances to hit the error, I think16:48
sean-k-mooneymelwitt: isn't that the default executor?16:48
dansmithmelwitt: as opposed to what? synchronous waiting?16:48
melwittthey have a native threads executor16:48
dansmithmelwitt: but if eventlet is monkeypatching then they're the same I think16:49
dansmithI mean, effectively the same16:49
melwittiiuc with the eventlet one, any rpc call coming into a service is in a green thread, so if it collides with a periodic task or scatter/gather, that's a chance for it to happen16:49
melwittdansmith: yeah, that is true but if other projects use the native threads executor and don't monkey patch that might be why they don't see it. afaik nova is the only project that monkey patches16:50
dansmithmelwitt: but if python's own threading library gets patched, the "native" one will be spawning gtreen threads too16:50
dansmithmelwitt: really?16:50
sean-k-mooneyi dont think they do16:51
melwittdansmith: yeah, I know, the configurable thing would only monkey patch if configured for eventlet, right?16:51
melwittsean-k-mooney: you don't think they monkey patch?16:51
melwittor you think they do16:51
dansmithmelwitt: I don't parse the configurable question16:51
dansmithI'm not sure what the point of using eventlet without monkey patching is16:51
dansmithotherwise you're just fully synchronous, you just have "threads" that run to completion all the time, AFAIK16:52
melwittdansmith: sorry, I guess I'm confused. I thought you were pointing out that making it configurable in nova would result in still having things be green threads16:52
dansmithmelwitt: I'm saying that if you were to ask for native threading in oslo.messaging, but you were monkeypatching python's thread library, then you're going to get greenthreads from your "native" o.msg threading module16:53
melwittyeah, I'm not sure if anyone else uses eventlet on purpose or if it's only indirectly through oslo.service or oslo.messaging16:53
dansmithglance certainly does use it16:53
dansmiththey spawn background threads for import tasks16:53
melwittdansmith: ah, ok. yeah16:54
melwitthm, ok16:54
dansmithand they do monkeypatch16:54
dansmithbecause otherwise it would be kinda pointless16:54
dansmithcinder monkeypatches too16:54
dansmithand they call eventlet operations directly, in a looot of places16:55
dansmithlots of direct threadpool and eventlet.sleep() interaction16:55
melwittok, so when I began looking at this, I was assuming that nova does something different than anyone else and that's why we hit the error, but I could not find what that could be16:55
dansmithso I think they're intentionally using eventlet16:55
melwittmeaning, that we are not the only ones using eventlet, yet we're the only ones who hit this16:56
dansmithwell, that's what I'm trying to zero in on.. if we're the only ones.. why16:56
melwittso far, the only thing I can think of is we just having a lot more green threads flying around. either that, or there is something different about the way we do database interactions through sqla16:57
melwittI couldn't find anything when I looked, but obviously I could have missed something16:58
sean-k-mooneyis the reentrant call happening as part of the rollback?17:02
sean-k-mooneyi wonder if this could be related to how we handle exceptions with the scatter-gather implementation17:02
melwittsean-k-mooney: looks like if you don't choose an executor, it will detect whether you're monkey patched and if you are, it will use eventlet executor, else it will use native threading https://github.com/openstack/oslo.messaging/blob/5aa645b38b4c1cf08b00e687eb6c7c4b8a0211fc/oslo_messaging/_utils.py#L7017:03
melwittsean-k-mooney: yeah, something dies in the middle of the rollback and then the connection is left in a bad state and then when it's accessed again it raises that error17:04
*** andrewbonney has quit IRC17:05
melwittsean-k-mooney: that was one of the earliest theories but there are also bug reports for this same thing in OSP10 when we didn't have scatter gather17:05
sean-k-mooney well what i was wondering is don't we have slightly odd exception handling in the scatter gather where we return the exceptions instead of raising them17:05
sean-k-mooneyor am i imagining things17:05
melwittand I have looked at those traces too17:05
sean-k-mooneyoh ok17:05
sean-k-mooneynever mind then17:05
dansmithmelwitt: is it always during a scatter/gather?17:06
melwittdansmith: no, it happened in OSP10 too when we didn't have scatter/gather17:06
melwittand I have seen traces where it was raised from a "get quotas" call, from service_update I have seen a lot17:07
dansmithoh right, I read that17:07
lyarwoodhttps://bugs.launchpad.net/nova/+bug/1929446 - This is the issue I was highlighting earlier if anyone has time to help narrow this down a little.17:07
openstackLaunchpad bug 1929446 in OpenStack Compute (nova) "check_can_live_migrate_source taking > 60 seconds in CI" [Undecided,New]17:07
sean-k-mooneyi am not sure this is eventlet related17:07
sean-k-mooneyit might be17:07
sean-k-mooneybut the nova api was not always monkey patched17:08
melwittthere was a window when it wasn't17:08
sean-k-mooneyif you ran it under uwsgi before scatter gather was added it was not monkey patched17:08
melwittbut it was prior to uwsgi/mod_wsgi being a way to run nova-api17:08
sean-k-mooneythe command line nova-api always was17:08
melwittright17:08
melwitt*but it was monkey patched17:09
sean-k-mooneyno, we had a period of time when uwsgi was supported but we did not monkey patch17:09
melwittI know17:09
sean-k-mooneyalthough we may not have released that way downstream17:09
melwittI'm saying that prior to uwsgi it was always monkey patched17:09
sean-k-mooneyah yes it was17:09
sean-k-mooneyosp 10 is what, newton? i think we just used the nova-api command directly at that point17:10
melwittbut even if we stop monkey patching in nova-api, we will still see this in nova-scheduler and nova-conductor at least17:10
sean-k-mooneyis there a clear writeup of how the error happens17:11
melwittyeah. and in the sosreports I've looked at for 13 it's also the nova-api command in these cases, afaict from the ps output17:11
sean-k-mooneyi think ooo avoided using apache initially due to concerns about memory overhead17:11
melwittsean-k-mooney: yeah but not in the context of openstack. this is a private bug but here https://bugzilla.redhat.com/show_bug.cgi?id=1927994#c45 and the links are https://github.com/PyMySQL/PyMySQL/issues/234 https://github.com/sqlalchemy/sqlalchemy/issues/3258  https://github.com/PyMySQL/PyMySQL/issues/26017:11
openstackmelwitt: Error: Error getting bugzilla.redhat.com bug #1927994: NotPermitted17:11
sean-k-mooneyah ok i was looking at the nova bug report and trying to find the reproducer17:12
zzzeekhey just snooping a little bit, I think the main thing nova is doing that nobody else is, is using eventlet monkeypatching *with* mod_wsgi at the same time17:13
melwittwe haven't been able to reproduce it in openstack17:13
zzzeekso that's two frameworks with heavy and opposing opinions on concurrency getting together17:13
melwittzzzeek: the traces I've been looking at have all been not running under mod_wsgi and also occurred in services (scheduler and conductor) that are not using wsgi in any form17:13
zzzeekah17:14
zzzeekmelwitt: that's odd.   pymysql doesnt like if you use eventlet but as long as the scope of a connection is maintained in only one greenlet at a time, this kind of error shouldnt happen.  what can happen is if requsts are interrupted and not cleaned up correctly17:14
zzzeekor if cleanup code itself is not able to run correctly due to the monkeypatrching17:15
sean-k-mooneydon't we initialise the connection globally and share it between all green threads17:15
melwittno we don't17:15
melwittthat's not a "connection" it's a "transaction context manager" which is a factory if I'm remembering terminology zzzeek explained to me last time17:16
sean-k-mooneyah yes that is what i was thinking of17:16
sean-k-mooneythe object we recently wrapped in the run-once decorator17:17
melwittand he has confirmed that is the correct way to use it, the threads can share that factory and get new connections from it17:17
sean-k-mooneyso whenever we context switch, eventlet never guarantees we will resume on the same thread.17:17
sean-k-mooneybut normally we have only one17:18
melwittyeah the configure() of those objects17:18
sean-k-mooneycould this be related to the use of pthread for the heartbeat17:18
melwittwhich heartbeat? the service heartbeats are eventlet, that I saw17:18
sean-k-mooneythe only real pthreads i know of in nova are the oslo.messaging heartbeat and the libvirt one17:19
sean-k-mooneyalthough no that would not make sense for 10/1317:19
dansmithat one point we changed the ordering of our imports relative to the monkeypatching to "fix" something17:20
melwittzzzeek: yeah... dansmith pointed out that glance and cinder use eventlet and monkey patch, but yet we don't see this error from them17:20
dansmithand I think that got backported.. I wonder if that's relevant?17:20
sean-k-mooneymelwitt: i was refering to https://github.com/openstack/oslo.messaging/blob/5aa645b38b4c1cf08b00e687eb6c7c4b8a0211fc/oslo_messaging/_drivers/impl_rabbit.py#L90-L10017:20
* melwitt looks for link17:21
sean-k-mooneydansmith: mdbooth's change17:21
dansmithsean-k-mooney: right17:21
melwittI was just looking at that earlier17:21
dansmithsean-k-mooney: I wonder if that ended up with us getting a combination of real and green threads in a way that is problematic..17:21
sean-k-mooneyhttps://github.com/openstack/nova/commit/3c5e2b0e9fac985294a949852bb8c83d4ed77e04#diff-c2e5ad6353633e738ba126e0f11ea14ed3f6ea94554deec967586fd2dfcf060d17:22
sean-k-mooneythat one17:22
melwittyeah that's it17:22
melwittsean-k-mooney: ack thanks (pthread)17:22
*** ralonsoh has quit IRC17:23
sean-k-mooneydansmith: well in principle that should have moved the patching earlier, so less likely to get a mix17:23
sean-k-mooneybut you are suggesting without it we still could be17:23
sean-k-mooneyif we have not backported it17:23
dansmithwell,17:23
melwittyeah, we did not backport it17:23
dansmithI think in wsgi mode that will come in at the point at which we hit it due to importing that api module17:23
dansmithmelwitt: oh I thought we did.. maybe that's related to the sudden cessation of reports? :)17:24
melwittcould be, yeah17:24
dansmithmelwitt: did you say you didn't see it at all in later releases, or just ... less?17:24
*** coreycb has joined #openstack-nova17:24
sean-k-mooneythis was merged in train17:25
melwittdansmith: I could not find any mention of it past queens/13 when I bugzilla searched everything under nova and pymysql17:25
sean-k-mooneyso if we did not backport it we should see it up to 1517:25
dansmithmelwitt: that seems like it could be a strong contender for being related then17:26
sean-k-mooneydidn't we also change the mysql client at one point17:26
melwittdansmith: yeah, agree17:26
dansmithsean-k-mooney: long ago17:26
sean-k-mooneyi think pymysql is the new one right17:26
sean-k-mooneyit used to be mysql_python or something17:26
dansmithit is, but that change was like icehouse or something I think17:26
sean-k-mooneyya ok17:26
melwittI had been looking at it from the context of it also providing a way to disable monkey patching, but due to my lack of understanding of eventlet and mixing with native threads, it did not click for me to think it could have fixed things to monkey patch earlier17:27
melwittit makes sense when you say it now though..17:28
dansmitha combination of references to the un-patched library and the patched one could very much be relevant17:28
dansmithand that's what that change was aabout17:28
dansmithand it's also the argument against monkeypatching altogether of course :P17:28
sean-k-mooneyits ok, stephenfin will eventually get around to deleting all the eventlet code like all the other stuff he has deleted :)17:29
melwittyeah. that change was mostly non understandable by my brain17:29
sean-k-mooneybut ya we did this for urllib3 originally17:29
sean-k-mooneyand some other service i guess but it makes sense17:30
dansmithurllib3 being socket-oriented, along with pymysql ... :)17:30
sean-k-mooneywe did actully backport this https://review.opendev.org/c/openstack/nova/+/64731017:30
sean-k-mooneybut only to stien17:31
melwittah, ok, my bad17:31
melwittI had thought it landed in stein17:31
dansmithokay I was sure we did backport it some, but .. fair enough17:31
melwitt(originally)17:31
sean-k-mooneyi guess that stein is where the original bug was reported17:31
sean-k-mooneyand we just did not bring it back before that17:32
sean-k-mooneyhum https://bugs.launchpad.net/nova/+bug/180895117:33
openstackLaunchpad bug 1808951 in tripleo "python3 + Fedora + SSL + wsgi nova deployment, nova api returns RecursionError: maximum recursion depth exceeded while calling a Python object" [High,Incomplete]17:33
sean-k-mooneyoh i misread SSL as SQL17:33
sean-k-mooneyi was going to say it references SQL too17:33
melwittok, I think it would be interesting if I build them a test package with that change and see if they can try it out17:34
melwittthat would be good proof that it is/was the fix17:34
dansmithmelwitt: yeah if they're willing I think that'd be a good test17:34
melwittI'll get that done and give them the option17:35
sean-k-mooneymelwitt: lyarwood  so it looks like the new resolver is breaking lower-constraints on stable os-vif branches17:55
sean-k-mooneyhow is that addressed for stable branches17:55
sean-k-mooneydo we update to the oldest lib that works?17:55
sean-k-mooneyhacking seems to be what is breaking things17:55
sean-k-mooneyalthough there could be other issues17:56
sean-k-mooneyok on master stephenfin removed any non direct deps17:57
sean-k-mooneyhttps://github.com/openstack/os-vif/commit/44d8937148aac1a61f40e59d3271c45f9fe6aa0317:57
sean-k-mooneyill see if i can do something similar17:57
lyarwoodsean-k-mooney:  ack yeah or backport a modified version of that?17:58
sean-k-mooneywell that is what i was going to do, but with a lower min version to match the branch it's going to17:59
sean-k-mooneyim not sure if starting from that patch will be faster or not17:59
*** __ministry has joined #openstack-nova18:04
*** links has quit IRC18:07
*** __ministry has quit IRC18:18
openstackgerritsean mooney proposed openstack/os-vif stable/victoria: Resolve dependency issues  https://review.opendev.org/c/openstack/os-vif/+/79284018:30
*** sapd1_x has joined #openstack-nova18:34
*** sapd1 has quit IRC18:37
*** k_mouza has quit IRC18:42
*** k_mouza has joined #openstack-nova18:42
*** k_mouza has quit IRC18:47
sean-k-mooneyelod: lyarwood: am i allowed to squash two changes in a backport upstream18:56
sean-k-mooneybasically i have 2 choices: squash https://review.opendev.org/c/openstack/os-vif/+/716223 and the ussuri version of https://review.opendev.org/c/openstack/os-vif/+/79284018:57
sean-k-mooneyor i can incorporate ignoring W50418:57
sean-k-mooneyoh wait no that is not adding W50418:58
sean-k-mooneythat's not the patch i need to add18:59
sean-k-mooneyits https://github.com/openstack/os-vif/commit/d57a5f39edcb8ef3de09e80925c8fe628e5e0f3a18:59
sean-k-mooneybut since that raises the min version of hacking i can't really backport19:01
sean-k-mooneyok ill take a look at this again tomorrow19:02
lyarwood<sean-k-mooney "elod: lyarwood: am i allowed to "> Yes FWIW, if it fixes an otherwise unsolvable problem you can merge multiple.19:06
sean-k-mooneyi thought that i needed to backport the cleanup patch and merge it with the lower-constraints one, but that is not what introduced the w504 skip19:07
sean-k-mooneylyarwood: it was the patch that bumps hacking from 1.x to 3.0 for python 3 support19:07
sean-k-mooneyill see if i can figure out how to cap flake8 and pycodestyle to avoid the hacking bump19:08
sean-k-mooneybut if i can't we will have to decide if we are ok with the bump to hacking in the pep8 tox env19:09
sean-k-mooneygiven it won't impact any other test and won't be used at runtime19:09
*** gyee has joined #openstack-nova19:10
sean-k-mooneyif we are ok with going to hacking 3.0 ill look at squashing those two patches19:10
sean-k-mooneywell tomorrow im going to go get dinner now o/19:11
*** k_mouza has joined #openstack-nova19:23
*** k_mouza has quit IRC19:28
*** boxiang has quit IRC19:52
*** boxiang_ has joined #openstack-nova19:52
*** vishalmanchanda has quit IRC19:59
*** k_mouza has joined #openstack-nova20:02
*** k_mouza has quit IRC20:07
*** slaweq has quit IRC20:14
*** raildo has quit IRC21:06
*** pmannidi has joined #openstack-nova21:33
openstackgerritMerged openstack/nova master: image_meta: Provide image_ref as the id when fetching from instance  https://review.opendev.org/c/openstack/nova/+/79065922:01
*** k_mouza has joined #openstack-nova22:03
*** k_mouza has quit IRC22:04
*** k_mouza_ has joined #openstack-nova22:04
*** openstackgerrit has quit IRC22:05
*** k_mouza_ has quit IRC22:09
*** xek has quit IRC23:05
*** xek has joined #openstack-nova23:05
*** tosky has quit IRC23:18
*** openstack has joined #openstack-nova23:50
*** ChanServ sets mode: +o openstack23:50

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!