Wednesday, 2019-04-17

*** tetsuro has joined #openstack-placement00:02
melwittprometheanfire: what version of the os-traits library do you have?00:09
melwittthat trait was added in this commit https://github.com/openstack/os-traits/commit/56531c2a81e4938fd790d6fc05791ee745da5f0300:09
prometheanfire0.11.000:09
melwittok, that should have it00:10
prometheanfirehmm, from what I read nova-compute should be the first to add it?00:10
melwittI don't know enough about it to know why/how placement will reject a 'standard' trait PUT request00:10
prometheanfirethey both have 0.11.0 as well00:11
prometheanfirenor I00:11
melwittyeah, this is part of the compute capabilities as traits change I think00:11
prometheanfirehave some debug00:13
prometheanfire2019-04-16 19:12:26.390 25324 DEBUG placement.requestlog [req-d0831879-30a4-4a20-8bd0-d6465c962022 e589d5a63cf245f381869ee8cb7ca092 48ddb9bf27c342e8a9640fe4e526519f - default default] Starting request: 10.10.2.3 "PUT /traits/COMPUTE_NET_ATTACH_INTERFACE" __call__ /usr/lib64/python3.6/site-packages/placement/requestlog.py:3800:13
prometheanfire2019-04-16 19:12:26.396 25324 DEBUG placement.wsgi_wrapper [req-d0831879-30a4-4a20-8bd0-d6465c962022 e589d5a63cf245f381869ee8cb7ca092 48ddb9bf27c342e8a9640fe4e526519f - default default] Placement API returning an error response: The trait is invalid. A valid trait must be no longer than 255 characters, start with the prefix "CUSTOM_" and use following characters: "A"-"Z", "0"-"9" and00:13
prometheanfire"_" call_func /usr/lib64/python3.6/site-packages/placement/wsgi_wrapper.py:3100:13
prometheanfire2019-04-16 19:12:26.399 25324 INFO placement.requestlog [req-d0831879-30a4-4a20-8bd0-d6465c962022 e589d5a63cf245f381869ee8cb7ca092 48ddb9bf27c342e8a9640fe4e526519f - default default] 10.10.2.3 "PUT /traits/COMPUTE_NET_ATTACH_INTERFACE" status: 400 len: 402 microversion: 1.600:13
melwitthttps://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#compute-capabilities-as-traits00:13
melwittI guess you already linked that00:14
prometheanfireheh, yep00:14
melwittunfortunately the person who implemented that just logged off a little while ago, aspiers00:15
melwittaccording to that error message, the API is saying you can't PUT a trait that doesn't begin with CUSTOM_00:17
prometheanfireeh, it's not too big of a deal today, home testing00:17
prometheanfireyep00:17
melwittbut I don't understand that considering this is running in the gate00:17
* prometheanfire shrugs00:19
prometheanfireI'm patient :D00:19
melwitthaha ok00:21
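The rule quoted in the 400 response above (at most 255 characters, a "CUSTOM_" prefix, then only "A"-"Z", "0"-"9" and "_") can be sketched roughly as follows. This is a minimal illustration of the rule as worded in the error message, not placement's actual validation code; the names CUSTOM_TRAIT_RE, MAX_TRAIT_LENGTH and is_valid_custom_trait are made up for the example, and requiring at least one character after the prefix is an assumption.

```python
import re

# Assumption: this mirrors only the wording of the error message,
# not the schema that placement really uses.
CUSTOM_TRAIT_RE = re.compile(r"CUSTOM_[A-Z0-9_]+")
MAX_TRAIT_LENGTH = 255


def is_valid_custom_trait(name: str) -> bool:
    """Return True if name satisfies the rule quoted in the 400 response."""
    if len(name) > MAX_TRAIT_LENGTH:
        return False
    # fullmatch() requires the whole string to match, so no anchors are needed.
    return CUSTOM_TRAIT_RE.fullmatch(name) is not None
```

Under this rule a PUT of COMPUTE_NET_ATTACH_INTERFACE is rejected simply because it lacks the CUSTOM_ prefix, which matches the 400 in the log.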
*** mriedem has quit IRC00:38
*** tetsuro has quit IRC01:27
*** mriedem has joined #openstack-placement02:11
*** mriedem has quit IRC04:08
*** david-lyle has joined #openstack-placement04:13
*** dklyle has quit IRC04:16
*** e0ne has joined #openstack-placement04:35
*** e0ne has quit IRC04:42
*** e0ne has joined #openstack-placement04:44
*** e0ne has quit IRC05:16
openstackgerritTetsuro Nakamura proposed openstack/osc-placement master: Improve aggregate version check error messages with min_version  https://review.openstack.org/65210005:30
openstackgerritTetsuro Nakamura proposed openstack/osc-placement master: Expose version error message generically  https://review.openstack.org/65328505:31
*** e0ne has joined #openstack-placement05:40
*** e0ne has quit IRC05:59
*** belmoreira has joined #openstack-placement06:17
*** bhagyashris has joined #openstack-placement06:49
*** belmoreira has quit IRC07:09
*** belmoreira has joined #openstack-placement07:10
openstackgerritMerged openstack/osc-placement master: Improve aggregate version check error messages with min_version  https://review.openstack.org/65210007:37
*** tssurya has joined #openstack-placement07:43
*** e0ne has joined #openstack-placement07:51
*** ttsiouts has joined #openstack-placement08:06
*** ttsiouts has quit IRC08:17
*** ttsiouts has joined #openstack-placement08:17
*** ttsiouts has quit IRC08:22
*** ttsiouts has joined #openstack-placement08:28
*** e0ne has quit IRC08:38
*** e0ne has joined #openstack-placement08:42
*** e0ne has quit IRC08:43
*** e0ne has joined #openstack-placement08:43
*** e0ne has quit IRC08:58
*** e0ne has joined #openstack-placement08:59
*** e0ne has quit IRC09:02
*** e0ne has joined #openstack-placement09:09
*** ttsiouts has quit IRC09:17
*** ttsiouts has joined #openstack-placement09:18
*** ttsiouts_ has joined #openstack-placement09:19
*** ttsiouts has quit IRC09:19
*** e0ne has quit IRC09:37
*** e0ne has joined #openstack-placement09:40
*** e0ne has quit IRC09:46
*** bhagyashris has quit IRC09:49
*** e0ne has joined #openstack-placement09:53
*** e0ne has quit IRC10:27
*** e0ne has joined #openstack-placement10:31
*** ttsiouts_ has quit IRC10:37
*** ttsiouts has joined #openstack-placement10:38
*** ttsiouts has quit IRC10:42
*** ttsiouts has joined #openstack-placement11:08
*** ttsiouts has quit IRC11:20
*** ttsiouts has joined #openstack-placement11:21
*** ttsiouts has quit IRC11:25
*** ttsiouts has joined #openstack-placement11:27
*** cdent has joined #openstack-placement11:36
*** ttsiouts has quit IRC11:43
*** ttsiouts has joined #openstack-placement11:44
*** ttsiouts_ has joined #openstack-placement11:46
*** ttsiouts has quit IRC11:47
*** ttsiouts_ has quit IRC11:51
*** cdent has quit IRC11:59
*** e0ne has quit IRC12:00
*** cdent has joined #openstack-placement12:04
*** e0ne has joined #openstack-placement12:10
edleafeprometheanfire: What were you trying to do with that call? Standard traits are defined in os-traits. You can only create custom traits, hence the error message about the format for custom traits.12:22
*** e0ne has quit IRC12:34
*** mriedem has joined #openstack-placement12:57
*** e0ne has joined #openstack-placement12:58
sean-k-mooneycdent: efried just an fyi i played around with improving the syncing of gertty last night instead of sleeping13:19
sean-k-mooneyhttps://review.openstack.org/#/q/topic:better_concurrancy+(status:open+OR+status:merged)13:19
sean-k-mooneyi haven't done much testing (and gertty has no tests at all) but it seems to work13:20
cdentnice13:20
sean-k-mooneyretrofitting gertty with concurrent futures instead of the manual threading it uses all over the place is more work than i had hoped but that's a start at least.13:23
cdenti bet13:26
sean-k-mooneyit's built using a variant of the command pattern so each action/command is a task object that is submitted to a queue and then the sync thread has a while true that just dequeues the tasks and runs them13:28
sean-k-mooneythe queue is a manual implementation of basically a priority queue and the task objects are also acting as custom futures13:28
sean-k-mooneyso the code itself is actually reasonably well encapsulated but it's also really tightly coupled to its current execution model.13:30
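The design described above (task objects submitted to a priority queue, drained by a single thread running a "while true" loop) might look roughly like this. It is a generic sketch of the pattern, not gertty's code: TaskQueue and its methods are hypothetical names, and it uses the standard library's queue.PriorityQueue and concurrent.futures.Future where gertty reportedly has a hand-written queue and custom futures.

```python
import itertools
import queue
import threading
from concurrent.futures import Future

_STOP = object()  # sentinel that tells the worker loop to exit


class TaskQueue:
    """Hypothetical command queue: lower priority numbers run first."""

    def __init__(self):
        self._queue = queue.PriorityQueue()
        # The counter breaks ties so equal priorities stay FIFO and the
        # callables themselves are never compared.
        self._counter = itertools.count()

    def submit(self, priority, func, *args):
        """Queue func(*args) and return a Future for its result."""
        future = Future()
        self._queue.put((priority, next(self._counter), func, args, future))
        return future

    def stop(self):
        # Sorts after every real task, so queued work finishes first.
        self._queue.put((float("inf"), next(self._counter), _STOP, (), None))

    def run(self):
        """Worker loop: dequeue tasks and run them until the sentinel."""
        while True:
            _, _, func, args, future = self._queue.get()
            if func is _STOP:
                break
            try:
                future.set_result(func(*args))
            except Exception as exc:
                future.set_exception(exc)
```

A caller would start the loop with threading.Thread(target=tasks.run).start() and read results via future.result(); the tight coupling mentioned above is visible in how much of the behaviour depends on there being exactly one consumer thread.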
*** e0ne has quit IRC13:30
sean-k-mooneyanyway im going to use it locally for a bit and i might see if it helps things13:30
*** cdent has quit IRC13:36
*** cdent has joined #openstack-placement13:36
*** cdent has quit IRC13:40
tssuryacdent: thanks for opening https://storyboard.openstack.org/#!/story/2005473 I'll see if melwitt wants to work on it if not I'll take it up14:13
*** cdent has joined #openstack-placement14:28
*** david-lyle is now known as dklyle14:35
*** e0ne has joined #openstack-placement14:53
*** cdent has quit IRC15:11
*** belmoreira has quit IRC15:32
*** cdent has joined #openstack-placement15:54
*** dims has quit IRC16:07
cdentmriedem: have you had a chance to gaze upon https://review.openstack.org/#/c/651939/ (placementfixture in osc-placement) yet? I'm hoping you will have the chutzpah to dig in16:11
efriedoo, there's no way he'll resist a challenge to his yiddish intestinal fortitude16:12
cdentwhatever it takes16:13
mriedemoy vey16:16
mriedemlemme polish this nova bug fix turd and i'll dig into it16:16
cdentmriedem is a real mensch16:18
*** cdent has left #openstack-placement16:20
*** cdent has joined #openstack-placement16:20
*** david-lyle has joined #openstack-placement16:24
*** dklyle has quit IRC16:27
*** dims has joined #openstack-placement16:54
mriedemcdent: done16:57
cdentthanks16:57
cdentyeah, good comments mriedem, i knew it16:59
*** dims has quit IRC16:59
mriedemi'll reward myself with lunch and jelly beans17:00
*** e0ne has quit IRC17:01
*** dims has joined #openstack-placement17:01
openstackgerritChris Dent proposed openstack/osc-placement master: Use PlacementFixture in functional tests  https://review.openstack.org/65193917:50
*** david-lyle is now known as dklyle17:55
*** tssurya has quit IRC17:58
prometheanfireedleafe: I'm just starting nova-compute, it tries to populate that trait but can't (rejected by placement)18:13
edleafeprometheanfire: it shouldn't be trying to create standard traits, which by definition already exist18:13
prometheanfirethe trait it's trying to create is not populated in the nova_api or placement databases (old/new)18:14
edleafeBut it's in os-traits, right?18:14
openstackgerritChris Dent proposed openstack/osc-placement master: Use PlacementFixture in functional tests  https://review.openstack.org/65193918:18
prometheanfireedleafe: it's in 0.11.0 as a test18:26
prometheanfireos_traits/tests/test_os_traits.py:        self.assertIn(ot.COMPUTE_NET_ATTACH_INTERFACE, traits)18:26
prometheanfirebut that's the only line18:27
edleafeprometheanfire: looking at https://github.com/openstack/os-traits/commit/56531c2a81e4938fd790d6fc05791ee745da5f03#diff-f41fbcbcd9f737e2226d06ab0bb1903a, it seems that it was added in 0.12.018:54
prometheanfireedleafe: looks like that's needed in stein, which is 0.11.018:55
prometheanfirehttps://github.com/openstack/requirements/blob/stable/stein/upper-constraints.txt#L34318:55
prometheanfireso, backport and 0.11.1 release?18:56
prometheanfireedleafe: that says it's in 0.7.0+ so it's in 0.11.018:56
edleafeinteresting. huh, yeah, it shows up in https://github.com/openstack/os-traits/commit/8888c528776f27c632d34bec52de09991385bb5418:57
edleafewhich is from the 0.7.0 release18:58
* edleafe is confused18:58
edleafeAh, I guess I'm misreading the tag line for https://github.com/openstack/os-traits/commit/56531c2a81e4938fd790d6fc05791ee745da5f0318:59
edleafeit's in every release from 0.7.0 to 0.12.018:59
prometheanfireya19:00
cdentprometheanfire, edleafe when you figure out what's going on here, can you write it down somewhere or tell me about it tomorrow or something? I need to go, but I'm curious19:03
prometheanfireso, I'm guessing the problem isn't so much not adding COMPUTE_NET_ATTACH_INTERFACE, but that the api rejects all traits not starting with CUSTOM_?19:03
cdentit's supposed to do that19:04
cdentactually, no, not gonna get sucked in, please let me know what's up, later :)19:04
* cdent waves19:04
*** cdent has quit IRC19:04
prometheanfireso, the compute node tries to add COMPUTE_NET_ATTACH_INTERFACE (because it's not in the traits table), but is correctly rejected?19:04
prometheanfire:P19:04
edleafeThe problem is that it checks the reported traits against known traits. That list of known traits *should* include the NET_ATTACH one. It should only attempt to PUT it if it isn't in the known set19:05
edleafe"known traits" should include all of os-traits, plus any previously-defined custom traits19:05
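The check edleafe describes boils down to a set difference: anything the compute node reports that is not already known gets queued for a PUT. A minimal sketch, assuming plain sets of trait names; the function and parameter names are illustrative and not copied from nova's report client.

```python
def traits_to_create(reported, standard, existing_custom):
    """Return the reported traits the placement service does not know yet.

    "Known" is taken to mean every standard trait the service returns
    plus any previously created custom traits.
    """
    known = set(standard) | set(existing_custom)
    return set(reported) - known


# If the service fails to return a standard trait, that trait lands in
# the to-create set and the later PUT is rejected for lacking CUSTOM_.
missing = traits_to_create(
    reported={"COMPUTE_NET_ATTACH_INTERFACE", "CUSTOM_FOO"},
    standard={"HW_CPU_X86_SHA"},  # COMPUTE_NET_ATTACH_INTERFACE absent here
    existing_custom={"CUSTOM_FOO"},
)
```

So a rejected PUT for a standard trait points at the service's list of known traits being incomplete rather than at the PUT validation itself.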
prometheanfireright, so the question becomes why it's not in the traits table19:06
edleafeprometheanfire: yeah, looking at https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L968-L1003, it means that that trait didn't get returned, so it's in the traits_to_create set19:08
prometheanfireyep19:09
prometheanfireplacement db_sync doesn't add it (or didn't for me)19:09
edleafeIt's not something that I've seen before, so I'm wondering if you might have a wonky installation of os-traits19:12
mriedemdid you restart nova-compute with newer os-traits before restarting placement with newer traits?19:16
mriedemthat's probably not something we've really considered or handled very well in nova,19:16
mriedemwe just require a min version of os-traits and assume we're good, but that doesn't mean the placement we're talking to is using that same min version19:16
*** e0ne has joined #openstack-placement19:17
*** e0ne has quit IRC19:20
*** e0ne has joined #openstack-placement19:22
prometheanfiremriedem: no traits upgrade/install was done before nova19:41
prometheanfireand it's 0.11.0 all around for what's in use19:42
mriedemCOMPUTE_NET_ATTACH_INTERFACE is in 0.11.0 so i don't get it http://git.openstack.org/cgit/openstack/os-traits/tree/os_traits/compute/net.py?h=0.11.0#n1719:46
mriedemdid you check the traits table in the placement db?19:46
prometheanfireyes19:47
prometheanfireit's not in there19:47
prometheanfire(164 rows)19:49
mriedemwhat happens if you restart the placement-api service?19:49
mriedemgot any pyc damage or anything?19:50
prometheanfirewant me to remove all pyc/pyo files from os-traits/placement?19:51
prometheanfireas for restarting, everything seems to work19:51
mriedemthere should be a log message on restart where the service loads up the new traits19:52
mriedemsec19:52
mriedembtw is this placement-in-nova or extracted placement?19:52
prometheanfireextracted19:52
mriedemthis is what should get hit on startup https://github.com/openstack/placement/blob/stable/stein/placement/deploy.py#L9319:53
mriedemtrait.ensure_sync(ctx)19:53
mriedemdo you see this logged? https://github.com/openstack/placement/blob/stable/stein/placement/objects/trait.py#L30019:54
prometheanfirelooking19:54
prometheanfirenever19:54
prometheanfiregrep 'Synced traits from os_traits into API DB' /var/log/placement/placement.log19:54
prometheanfiredebug true, log file /var/log/placement/placement.log log dir /var/log/placement19:55
prometheanfirechecked uwsgi logs just in case too19:55
mriedemhmm, idk what's going on - we would have hit something obvious in grenade from rocky to stein with these changes i'd think19:58
mriedemwhat if you drop everything in the traits table and restart the placement service?19:58
mriedemdoes it show up then?19:58
prometheanfirecould try19:59
prometheanfiremriedem: should I truncate resource_provider_traits as well, since...20:02
prometheanfireERROR:  cannot truncate a table referenced in a foreign key constraint20:02
prometheanfireDETAIL:  Table "resource_provider_traits" references "traits".20:02
mriedemblag20:03
mriedem*blarg20:03
mriedemyou'd have to wipe your resource providers to do that...20:04
mriedemwhich would get regenerated from your computes, but...is this a test/dev env?20:04
prometheanfirekinda? it's my home env20:06
mriedemoh, heh20:06
mriedemwell maybe before that you want to add some debug logging to https://github.com/openstack/placement/blob/stable/stein/placement/objects/trait.py#L269 and see if you can spot something obvious20:07
mriedemlike dump the std_traits and db_traits20:08
prometheanfirek20:08
prometheanfirehttps://gist.github.com/prometheanfire/715b26b57fe5218b72eda0301c78416420:12
prometheanfirelooks like a lot of things need syncing20:13
prometheanfirethere's definitely stuff in batch_args20:13
mriedemok COMPUTE_NET_ATTACH_INTERFACE is in std_traits20:14
prometheanfireseeing if it's passed20:14
mriedemand need_sync20:14
prometheanfireand in the to be sync'd list/batch args20:14
prometheanfireWE DID NOT SYNC, for shame20:14
prometheanfireso it is passed, (excepted)20:15
mriedemhmm, wonder if you're hitting a duplicate20:16
mriedemdid you log that?20:16
prometheanfireit's not logged, just passed, writing a log now20:16
prometheanfireI wonder if we should be inserting one at a time?20:17
prometheanfiresince all I'll get is that the batch failed I think20:17
mriedemthe duplicate entry should say which one was causing the problem20:17
prometheanfireok20:17
prometheanfireDETAIL:  Key (id)=(24) already exists.20:19
prometheanfirewhich for me is...20:19
prometheanfire 2018-03-21 23:26:25.3102   |            |  24 | HW_CPU_X86_SHA20:19
mriedemthat one doesn't show up in your need_sync output20:20
mriedemor batch_args20:20
prometheanfirebecause it already exists20:20
mriedemright,20:20
prometheanfireplacement is trying to insert it with an id that's already in use20:20
mriedembut my point is we shouldn't be inserting that one and triggering the duplicate entry error20:20
prometheanfirethis time it was DETAIL:  Key (id)=(26) already exists.20:21
prometheanfirelet me get a log with everything (since ordering changes per run it looks like)20:21
mriedemi guess i'm not understanding since we're not trying to insert HW_CPU_X86_SHA per your logs20:21
prometheanfirehttps://gist.githubusercontent.com/prometheanfire/1fd41c3d8c0153a1b94e687477ab8223/raw/77d1206d112591a51c0079aa9aff7fa890b308a7/gistfile1.txt20:21
prometheanfireright, but something else was trying to be inserted using the same id, the id clashed, not the name20:22
mriedemwhat db are you using? mysql?20:22
prometheanfirebut the insert doesn't include the id, odd20:23
prometheanfirepsql20:23
mriedemof course20:23
prometheanfire:P20:23
prometheanfirehave to have at least one guinea pig20:23
mriedemzzzeek: have you ever heard of bulk insert autoincrement failing with postgresql?20:25
mriedemprometheanfire: i've got something you can try, sec20:30
prometheanfirehttps://gist.github.com/prometheanfire/ff494588ad06fc9c2f55c30524b2347820:30
prometheanfirethat's odd20:30
*** e0ne has quit IRC20:31
prometheanfireand that's not a name in the db either20:31
openstackgerritMatt Riedemann proposed openstack/placement master: WIP: help prometheanfire for crazy postgresql bulk insert  https://review.openstack.org/65355220:32
mriedemprometheanfire: try that ^20:32
prometheanfirelol20:33
prometheanfirek20:33
prometheanfirereqs meeting now20:34
prometheanfirefailed hard20:36
prometheanfire2019-04-17 15:35:15.376 6402 ERROR placement sqlalchemy.exc.InvalidRequestError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (psycopg2.IntegrityError) duplicate key value violates unique constraint "traits_pkey"20:36
mriedemok same as before20:37
mriedemwtf20:37
mriedemthe sqla docs say we could set return_defaults=True to insert them one at a time20:37
mriedemmaybe that's worth a try20:37
mriedemat the expense of performance20:37
mriedembut it also says, " If the rows to be inserted only refer to a single table, then there is no reason this flag should be set as the returned default information is not used."20:38
mriedemidk, need help from jaypipes and/or zzzeek here20:38
mriedemhttps://stackoverflow.com/questions/15834569/how-to-bulk-insert-only-new-rows-in-postresql doesn't help me20:39
prometheanfireI tried inserting just one and had the error20:39
prometheanfiresee https://gist.github.com/prometheanfire/ff494588ad06fc9c2f55c30524b2347820:39
mriedemseems like the autoincrement isn't being honored at all20:40
prometheanfirehere's the table schema https://gist.githubusercontent.com/prometheanfire/fa71a3c093dc5f1d25ca00fc60238a93/raw/e0ebbdce6b56ca728a9a1e433d47dfc335c29a58/gistfile1.txt20:42
melwittare these inserts with explicit id? googled and found this https://stackoverflow.com/questions/9108833/postgres-autoincrement-not-updated-on-explicit-id-inserts20:42
mriedemno20:42
mriedemjust bulk inserting records with the name and assuming the db will autogenerate/increment the id pkey20:43
melwitthm20:43
mriedemprometheanfire: what version of sqla do you have?20:43
prometheanfirewhatever is in upperconstraints20:44
prometheanfire1.2.1820:44
zzzeekmriedem: "autoincrement" in postgresql comes from a sequence object that is independent of the table itself20:45
zzzeekmriedem: so if the table has rows that were hand-inserted into it using ids that the sequence has not addressed yet, "autoincrement" inserts can fail unless the sequence is bumped manually20:45
prometheanfirethat's why then20:45
prometheanfiresince I did the manual table import (didn't see the migrate script for postgres)20:46
zzzeekprometheanfire: so you need to find the sequences in question and do ALTER SEQUENCE20:46
zzzeekto bump them up20:46
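The failure mode zzzeek describes can be modelled in a few lines: the sequence is a counter kept apart from the table, so rows imported with explicit ids leave it behind. The sketch below is an in-memory stand-in, not PostgreSQL itself; Sequence, Table, DuplicateKey and bump_sequence_sql are hypothetical names, and the sequence name traits_id_seq used with it is only the usual default for a serial "id" column on "traits", so it should be checked against the real schema.

```python
class DuplicateKey(Exception):
    """Stand-in for the unique-constraint violation on the primary key."""


class Sequence:
    """Counter kept apart from the table, like a PostgreSQL sequence."""

    def __init__(self):
        self.last_value = 0

    def nextval(self):
        self.last_value += 1
        return self.last_value

    def restart_with(self, value):
        # Models ALTER SEQUENCE ... RESTART WITH value:
        # the next nextval() call returns exactly value.
        self.last_value = value - 1


class Table:
    def __init__(self, sequence):
        self.rows = {}
        self.sequence = sequence

    def insert(self, name, id=None):
        # An explicit id bypasses the sequence, as a manual import does.
        if id is None:
            id = self.sequence.nextval()
        if id in self.rows:
            raise DuplicateKey(f"Key (id)=({id}) already exists.")
        self.rows[id] = name
        return id


def bump_sequence_sql(sequence_name, max_id):
    """Build the statement that moves a sequence past the highest used id."""
    return f"ALTER SEQUENCE {sequence_name} RESTART WITH {max_id + 1};"
```

In this model a failed insert still consumes a sequence value, which is also how real sequences behave and fits the reported id changing from 24 to 26 between runs. The statement from bump_sequence_sql would be run once per affected sequence, with max_id taken from SELECT max(id) on the corresponding table.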
prometheanfireok, will after reqs meeting is done20:47
mriedemprometheanfire: ah...20:47
prometheanfiremriedem: yep20:47
mriedemwe should maybe document a known issue in https://docs.openstack.org/placement/latest/upgrade/to-stein.html20:49
mriedemin case others miss the pg script - or maybe the pg script is missing something20:49
prometheanfireit may be missing something, as it imports the table and does not strip the ID20:50
prometheanfireat least when I read it20:51
mriedemhmm, we have a grenade postgresql job but i think it's in the experimental queue, maybe that's hitting this in stable/stein as well20:51
prometheanfiresame might happen in other tables with sequences (user, etc)20:52
prometheanfirelooks like that fixed it21:00
mriedemprometheanfire: probably a good idea to report a bug just so it's tracked if someone else hits it (windriver uses postgresql i think, as does huawei...)21:00
mriedemhttps://storyboard.openstack.org/#!/project/111321:00
prometheanfireI will21:00
openstackgerritMatt Riedemann proposed openstack/placement stable/stein: DNM: log trait sync duplicate entry failure  https://review.openstack.org/65358521:06
mriedemtesting grenade-postgresql in stein here https://review.openstack.org/#/c/653587/21:08
prometheanfirehttps://storyboard.openstack.org/#!/story/200547821:10
prometheanfirethanks21:10
*** takashin has joined #openstack-placement21:50
*** mriedem has quit IRC22:10
