rpittau | good morning ironic! Happy Friday! o/ | 06:28 |
---|---|---|
rpittau | JayF: I'll test both changes locally, I like the fix more than the workaround :) | 06:30 |
rpittau | JayF: my VM was more happy with the removal of the test, but I think the fix works as well, we should keep an eye on postgres too as I'm seeing things like ironic.tests.unit.db.sqlalchemy.test_migrations.TestMigrationsPostgreSQL.test_walk_versions 907.528 | 07:39 |
rpittau | I will merge the workaround to speed up CI, then we can see with more runs if the fix does work | 07:40 |
opendevreview | Merged openstack/ironic master: Disabling test_upgrade_twice temporarily for CI fix https://review.opendev.org/c/openstack/ironic/+/885600 | 07:53 |
opendevreview | Jacob Anders proposed openstack/ironic master: [WIP] Prevent MissingAttribute error when supportedApplyTime missing https://review.opendev.org/c/openstack/ironic/+/885739 | 10:09 |
iurygregory | good morning Ironic | 11:29 |
iurygregory | wow great news \o/ | 11:36 |
iurygregory | https://review.opendev.org/c/openstack/ironic/+/885661 does sound interesting also | 11:38 |
TheJulia | good morning | 12:55 |
iurygregory | good morning TheJulia | 13:01 |
* TheJulia tries to wake up | 13:25 | |
opendevreview | Riccardo Pittau proposed openstack/ironic master: Add test timout to tox config https://review.opendev.org/c/openstack/ironic/+/885372 | 14:14 |
opendevreview | Riccardo Pittau proposed openstack/ironic master: Add test timeout to tox config https://review.opendev.org/c/openstack/ironic/+/885372 | 14:15 |
opendevreview | Riccardo Pittau proposed openstack/ironic master: Use tox env variables in coverage tests https://review.opendev.org/c/openstack/ironic/+/885507 | 14:15 |
rpittau | bye everyone, have a great weekend! o/ | 15:02 |
opendevreview | Riccardo Pittau proposed openstack/ironic master: Add test timeout to tox config https://review.opendev.org/c/openstack/ironic/+/885372 | 15:11 |
yde | hi, how do you manage to ensure that the OS you'll boot, when using launch instance in nova, without presetting ports on subnets, will chose the proper netif to boot on ? | 16:15 |
yde | for instance, i have a BM node with 4 ports, one with pxe enabled, nova+ironic+neutron create a port for the VM bound to the pxe one (at least it seems to be the default behavior) | 16:16 |
yde | but when the os boots, it choses another netif to do the dhcpclient, and fails to bring up network | 16:17 |
yde | do i have to use custom cloud init scripts ? or to preprare cloudimages dedicated for BM nodes ? or i've completly missed something ? | 16:18 |
yde | thanks a lot | 16:18 |
opendevreview | Jay Faulkner proposed openstack/ironic master: Ensure upgrade twice test uses two engines https://review.opendev.org/c/openstack/ironic/+/885661 | 17:57 |
JayF | So I find myself wondering | 18:05 |
JayF | we had that grenade problem where the migration was just sorta hanging in the middle | 18:05 |
JayF | which seems suspiciously similar to the issue we're seeing in unit tests | 18:05 |
JayF | just want to ask the question: is my fix just removing test coverage from something actually broken? | 18:05 |
JayF | the classic: is the test broken, or is what we are testing broken | 18:06 |
iurygregory | I'm asking myself why this is only happening now.. (don't think I saw this problem before) .-. | 18:07 |
TheJulia | I think your change is more realistic. One question to wonder is are we actually consuming sqlalchemy 2 right now or not | 18:07 |
JayF | that's mostly where I've landed, re: the change more closely reflecting realistic usage | 18:08 |
JayF | but I wanted to ask the question specifically in relation to those grenade failure scenarios | 18:08 |
clarkb | iurygregory: sqla 2.0 affected how locks are done | 18:10 |
clarkb | I would definitely check sqla 2.0 as the first potential culprit | 18:10 |
JayF | clarkb: fwiw, we already did extensive testing vs sqla 2.0 so it'd still be net-new | 18:10 |
JayF | hmmm | 18:10 |
JayF | 1.4.41 installed in my it's-not-reproducing tox environment | 18:11 |
clarkb | in zuul we had nested context managers doing migrations and the first one would hold a lock preventing the second from operating. That isn't the case here and wasn't the case in 1.x but we had to fix it by unnesting things | 18:11 |
* JayF upgrades | 18:11 | |
TheJulia | clarkb: we have had to do a lot of similar changes | 18:12 |
JayF | ...should deprecation warnings cause tests to fail? | 18:17 |
JayF | I'm under the understanding that they shouldn't, but I'm seeing thousands of fails under sqla 2.0.5 | 18:18 |
JayF | and most of it appears to be engine connect signature changing deprecation warnings | 18:18 |
JayF | either way, the migration tests ran under 2.0.5 and didn't freeze (without my fix or the test skip) | 18:21 |
JayF | so I don't think that's a likely culprit | 18:21 |
TheJulia | yde: there is a way, I just don't remember it off hand. In some of it is just hte luck of the draw | 18:21 |
TheJulia | JayF: presently, yes | 18:21 |
JayF | you know the toggle for that, or what doc I'd find it in? | 18:22 |
TheJulia | not off the top of my head | 18:23 |
TheJulia | https://github.com/openstack/ironic/blob/master/tox.ini#L14 is one | 18:24 |
JayF | oh, and this is the change we need in oslodb, yeah? | 18:24 |
TheJulia | but that can get dropped | 18:24 |
TheJulia | I believe there were per-project changes | 18:24 |
JayF | ack; I guess "ensure our unit tests run on sqla 2.0" is something we need to do | 18:24 |
TheJulia | Well, I think we did on 2.0.1 or something like that last | 18:26 |
JayF | yeah, it's going to be obnoxious, I think we're going to have to detect the signature of the method or version of sqla to get rid of the warning | 18:28 |
JayF | if we need it to work on 1.4.x and 2.x | 18:28 |
clarkb | JayF: a lot of stuff for 2.0 works in 1.4 too | 18:31 |
JayF | oh, that's a good point | 18:31 |
clarkb | the sqla transition was really well done imo and other libraries/tools should look at it as an example. You shouldn't need to do anyhting magical to make a codebase work with both I don't think | 18:32 |
TheJulia | 1.4.x was also a migration version to force a lot of the 2.0 breaking changes to be weeded out by the user in advance | 18:37 |
JayF | ack; I'll take a look at cleaning these up | 18:38 |
JayF | I literally think it might be a one line change just somewhere deep | 18:38 |
TheJulia | yeah | 18:38 |
TheJulia | most likely exactly the case | 18:38 |
clarkb | sqlalchemy is 1.4.41 in upper constraints which I would expect to apply to the unittests but haven't confirmed that | 18:40 |
JayF | https://94ab1cfb489723c43f41-e7ab479a8e7045ace97d02ce1ceb0b42.ssl.cf2.rackcdn.com/885661/3/check/openstack-tox-py39/ddda219/job-output.txt (╯°□°)╯︵ ┻━┻ | 18:41 |
clarkb | the job logs on those should record what they install | 18:41 |
JayF | clarkb: yes, that fits my observation, I had to force my local tox venv to run =>2.x | 18:41 |
JayF | TheJulia: rpittau: ^ my "fix" failed during postgresql migrations :( | 18:41 |
JayF | TheJulia: rpittau: I think we have a real bug, somewhere, and the one test getting commented/'fixed' by me just eliminated the easiest reproducer :| | 18:42 |
TheJulia | JayF: you just install it directly | 18:42 |
TheJulia | inside the venv | 18:43 |
TheJulia | or exclude constraints from tox.ini | 18:43 |
JayF | TheJulia: yeah, that's what I did. `source .tox/py310/bin/activate && pip install --upgrade SQLAlchemy` | 18:43 |
* TheJulia gives tox cookies to wake up | 18:47 | |
TheJulia | Uhh, tox, why are you looping hitting futex and getting ETIMEDOUT | 18:48 |
TheJulia | and now running | 18:48 |
TheJulia | sqlalchemy.exc.SADeprecationWarning: The argument signature for the "ConnectionEvents.engine_connect" event listener has changed as of version 2.0, and conversion for the old argument signature will be removed in a future release. The new signature is "def engine_connect(conn)" | 18:49 |
TheJulia | that should be easy | 18:49 |
JayF | yeah I'm going to knock it out | 18:50 |
JayF | especially if the new sig is in 1.4.x | 18:50 |
* JayF doesn't think he can take another day of his brain pointed to unit tests looking for a fix that's hiding | 18:50 | |
JayF | so low hanging fruit sqla migration fixes, yes please | 18:50 |
opendevreview | Julia Kreger proposed openstack/ironic master: Fix SQLAlchemy engine connection listener https://review.opendev.org/c/openstack/ironic/+/885797 | 18:50 |
JayF | or you can eat the low hanging fruit instead lol | 18:51 |
TheJulia | well, mgiht already done | 18:51 |
TheJulia | OM NOM NOM | 18:51 |
TheJulia | re-running tests locally | 18:51 |
* JayF waits for this to fix our tests /s | 18:54 | |
TheJulia | it wont | 18:55 |
TheJulia | and I got it wrong it seems | 18:55 |
TheJulia | oh.. wait | 18:55 |
TheJulia | engine.events.ConnectionEvents | 19:00 |
TheJulia | I'll have to pull down the sqlalchemy code and look at the docs | 19:19 |
TheJulia | I've changed it as needed but I'm still getting hit with the error and I don't quite understand why at present | 19:19 |
opendevreview | Julia Kreger proposed openstack/ironic master: Fix SQLAlchemy engine connection listener https://review.opendev.org/c/openstack/ironic/+/885797 | 19:19 |
yde | i have another issue now, which i dont know if its a normal situation or not. when booting instances from nova. i can ping their ip once active, but i cant ssh into them, just as if the ssh service wasnt started on the host. But, if i boot another instance, and flag the "use-config-drive", it works and i can ssh into the box, using my keypair. any idea ? | 20:07 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!