*** jparrill has joined #ara | 06:35 | |
ara-slack | <thangola> Hi, I’m using ara 1.3, and have a new problem with web GUI. When I opened my ara home page, it showed 500 error. When I check log I see this: | 06:37 |
---|---|---|
ara-slack | <thangola> [stack trace not preserved in this log] | 06:37 |
ara-slack | <thangola> any idea? | 06:37 |
*** andymccr has quit IRC | 07:07 | |
*** andymccr has joined #ara | 07:09 | |
*** jparrill has quit IRC | 10:37 | |
ara-slack | <dmsimard> @thangola there's no such thing as ARA 1.3? :/ | 11:44 |
ara-slack | <dmsimard> From looking at the stack trace, it looks like there is a record for the playbook but the playbook run was interrupted before its file could be saved which is very much a race condition | 11:45 |
ara-slack | <dmsimard> I'll note the bug, we can better handle that. In the meantime, I would try to delete that playbook run. Are you able to do an 'ara playbook list' without crashing ? | 11:48 |
*** dougbtv_ has quit IRC | 12:20 | |
*** sshnaidm is now known as sshnaidm|afk | 13:23 | |
*** tbielawa has joined #ara | 13:29 | |
*** tbielawa is now known as tbielawa|mtg | 13:29 | |
*** jrist has quit IRC | 13:31 | |
*** jrist has joined #ara | 13:35 | |
*** sshnaidm|afk is now known as sshnaidm | 13:54 | |
*** jrist has quit IRC | 14:17 | |
*** resmo has joined #ara | 14:18 | |
ara-slack | <thangola> sorry, mine is 0.13.1 | 14:27 |
ara-slack | <thangola> no, I can run it properly: `ara playbook list` | 14:28 |
ara-slack | <thangola> could you tell me how to find the broken playbook to delete? | 14:30 |
ara-slack | <dmsimard> What database backend are you using ? | 14:35 |
ara-slack | <thangola> mysql | 14:36 |
ara-slack | <dmsimard> ok, are you a bit familiar with sql queries ? can you open a mysql prompt ? | 14:39 |
ara-slack | <thangola> sure, I can do anything, don’t worry | 14:41 |
-openstackstatus- NOTICE: The infra team is now taking Zuul v2 offline and bringing Zuul v3 online. Please see https://docs.openstack.org/infra/manual/zuulv3.html for more information, and ask us in #openstack-infra if you have any questions. | 14:41 | |
ara-slack | <dmsimard> ok, can you do a pastebin of select * from playbooks; ? | 14:42 |
ara-slack | <dmsimard> er, hang on, let me scope that a bit more | 14:42 |
ara-slack | <thangola> I have 610 playbooks now, fyi | 14:44 |
dmsimard | yeah, hang on | 14:44 |
dmsimard | getting a database spun up to get the right query | 14:44 |
ara-slack | <dmsimard> select id,path from playbooks; | 14:45 |
ara-slack | <dmsimard> that won't contain junk and show us what we need to see | 14:46 |
ara-slack | <dmsimard> in that output, you probably have a playbook without a path | 14:46 |
ara-slack | <thangola> all of them have path | 14:48 |
ara-slack | <dmsimard> That's so weird.. hang on let me have another look at that trace | 14:48 |
ara-slack | <thangola> sure | 14:49 |
ara-slack | <dmsimard> ok, there's another possibility, hang on | 14:49 |
ara-slack | <dmsimard> did you run any other playbooks after you started seeing the issue ? | 14:50 |
ara-slack | <thangola> yes, a lot, now I can visit the first page, but not the 2nd | 14:52 |
ara-slack | <dmsimard> oh, that can help narrow down which playbook is the faulty one | 14:52 |
ara-slack | <dmsimard> how many playbooks you have per page ? default is 10 I think | 14:53 |
ara-slack | <thangola> 10 | 14:53 |
ara-slack | <dmsimard> ok hang on.. | 14:54 |
ara-slack | <dmsimard> So this should get you the 10 playbooks on the first page: select id,path,time_end from playbooks order by time_start desc limit 10; | 14:56 |
ara-slack | <dmsimard> Now, the problem is that one of those 10 playbooks does not have a file in the 'files' table with 'is_playbook' set to true | 14:56 |
ara-slack | <thangola> there are 2 where time_end is NULL | 14:57 |
ara-slack | <dmsimard> ok, that's another hint | 14:57 |
ara-slack | <dmsimard> now, take the path from those two with a NULL time_end | 14:57 |
*** tbielawa|mtg is now known as tbielawa|donate | 14:58 | |
ara-slack | <thangola> `/home/XXXXX/ansible/common/playbooks/statistics-clickhouse.yml` | 14:58 |
ara-slack | <thangola> both are this | 14:58 |
ara-slack | <dmsimard> select id,playbook_id,path,is_playbook from files where playbook_id='playbookid' and path='path'; | 14:58 |
ara-slack | <dmsimard> substitute playbookid and path with the right values | 14:59 |
ara-slack | <thangola> and I can’t visit the page with that id: /reports/d84d5aca-a0be-4e68-beaa-4fb4de8704ec.html | 14:59 |
*** jrist has joined #ara | 14:59 | |
ara-slack | <thangola> I think that those are the error ones | 15:00 |
ara-slack | <dmsimard> ok can you do select id,playbook_id,path,is_playbook from files where playbook_id='d84d5aca-a0be-4e68-beaa-4fb4de8704ec' and path='<path>'; ? | 15:00 |
ara-slack | <thangola> ``` id: 3d6dd966-dd14-4584-90bd-8030cb8fabdd playbook_id: d84d5aca-a0be-4e68-beaa-4fb4de8704ec path: /home/XXX/ansible/common/playbooks/statistics-clickhouse.yml is_playbook: 0 ``` | 15:01 |
ara-slack | <dmsimard> ok, can you do an update to set is_playbook to 1 ? | 15:01 |
ara-slack | <thangola> sec | 15:02 |
ara-slack | <dmsimard> for that file id, 3d6dd966-dd14-4584-90bd-8030cb8fabdd | 15:02 |
*** tbielawa|donate has quit IRC | 15:02 | |
ara-slack | <thangola> update both | 15:04 |
ara-slack | <thangola> now it works | 15:04 |
ara-slack | <dmsimard> yay | 15:04 |
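The manual diagnosis and repair above can be sketched as a small script. This is a hedged illustration only: it assumes two tables, `playbooks` and `files`, with just the columns named in the conversation, uses Python's built-in `sqlite3` as a stand-in for the MySQL backend, and the sample rows and the helper names `find_broken_playbooks` and `repair_playbook` are hypothetical, not part of ARA.

```python
import sqlite3

# Hypothetical miniature of the two tables discussed above. Only the columns
# mentioned in the conversation are modelled, and sqlite3 stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE playbooks (id TEXT PRIMARY KEY, path TEXT, time_start TEXT, time_end TEXT);
CREATE TABLE files (id TEXT PRIMARY KEY, playbook_id TEXT, path TEXT, is_playbook INTEGER);
-- Made-up sample rows: one healthy playbook and one interrupted one.
INSERT INTO playbooks VALUES ('pb-ok', '/play/site.yml', '2000-01-01 10:00', '2000-01-01 10:05');
INSERT INTO playbooks VALUES ('pb-broken', '/play/stats.yml', '2000-01-01 11:00', NULL);
INSERT INTO files VALUES ('f-1', 'pb-ok', '/play/site.yml', 1);
INSERT INTO files VALUES ('f-2', 'pb-broken', '/play/stats.yml', 0);
""")


def find_broken_playbooks(conn):
    """Return ids of playbooks that have no file row flagged is_playbook = 1."""
    rows = conn.execute("""
        SELECT p.id FROM playbooks p
        WHERE NOT EXISTS (
            SELECT 1 FROM files f
            WHERE f.playbook_id = p.id AND f.is_playbook = 1)
        ORDER BY p.time_start DESC
    """).fetchall()
    return [row[0] for row in rows]


def repair_playbook(conn, playbook_id):
    """Flag the file whose path matches the playbook's own path."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute("""
            UPDATE files SET is_playbook = 1
            WHERE playbook_id = ?
              AND path = (SELECT path FROM playbooks WHERE id = ?)
        """, (playbook_id, playbook_id))
    return cur.rowcount


broken = find_broken_playbooks(conn)
repaired = [repair_playbook(conn, playbook_id) for playbook_id in broken]
remaining = find_broken_playbooks(conn)
```

Searching for playbooks that lack a flagged file, rather than for a NULL `time_end`, targets the condition that was identified as the cause of the error; a NULL `time_end` was only a hint in this case. Back up the database before running an update like this against real data.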
ara-slack | <dmsimard> yeah.. that's quite a nasty bug :S | 15:04 |
ara-slack | <thangola> so is it because of a sudden interruption? | 15:05 |
ara-slack | <dmsimard> yeah, in the order of events, the playbook is created first and then the file record is created and then the playbook file is updated with is_playbook = true | 15:05 |
ara-slack | <dmsimard> so the playbook was interrupted right after the playbook file was created but before it was updated | 15:06 |
ara-slack | <dmsimard> probably a matter of milliseconds/nanoseconds between the two operations :S | 15:06 |
ara-slack | <dmsimard> I'll fix it anyway | 15:06 |
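One possible way to close that window, shown here as a sketch and not necessarily the fix ARA adopted, is to make the three writes a single transaction so an interruption leaves either all of them or none. The schema is again reduced to the columns named in the conversation, `sqlite3` stands in for MySQL, and `record_playbook` and `Interrupted` are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE playbooks (id TEXT PRIMARY KEY, path TEXT);
CREATE TABLE files (id TEXT PRIMARY KEY, playbook_id TEXT, path TEXT, is_playbook INTEGER);
""")


class Interrupted(Exception):
    """Stands in for the playbook run being stopped part-way through."""


def record_playbook(conn, playbook_id, path, interrupt=False):
    # "with conn" commits if the block finishes and rolls back if it raises,
    # so the three writes become visible together or not at all.
    with conn:
        conn.execute("INSERT INTO playbooks VALUES (?, ?)", (playbook_id, path))
        conn.execute(
            "INSERT INTO files VALUES (?, ?, ?, 0)",
            ("file-" + playbook_id, playbook_id, path),
        )
        if interrupt:
            # Simulate the interruption between creating the file row
            # and flagging it as the playbook file.
            raise Interrupted()
        conn.execute(
            "UPDATE files SET is_playbook = 1 WHERE playbook_id = ? AND path = ?",
            (playbook_id, path),
        )


record_playbook(conn, "pb-1", "/play/site.yml")
try:
    record_playbook(conn, "pb-2", "/play/stats.yml", interrupt=True)
except Interrupted:
    pass

playbook_ids = [row[0] for row in conn.execute("SELECT id FROM playbooks ORDER BY id")]
flags = [row[0] for row in conn.execute("SELECT is_playbook FROM files ORDER BY id")]
```

With this arrangement the interrupted run leaves no playbook row behind at all, so the listing page never sees a playbook without its flagged file. The trade-off is that an interrupted run is not recorded until the transaction commits.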
ara-slack | <thangola> okay, at least I know how to handle it now | 15:06 |
ara-slack | <thangola> thank you very much | 15:06 |
ara-slack | <thangola> I wish I could buy you a beer | 15:07 |
ara-slack | <dmsimard> haha :slightly_smiling_face: | 15:09 |
ara-slack | <dmsimard> I filed a bug here https://storyboard.openstack.org/#!/story/2001217 -- in the upcoming ara 1.0 version we're doing things quite differently but we're still susceptible to that kind of race condition | 15:10 |
ara-slack | <dmsimard> for what it's worth, there are some good improvements between the version you are using (0.13.1) and the latest version (0.14.4) | 15:11 |
ara-slack | <thangola> sure | 15:12 |
ara-slack | <dmsimard> @thangola I'd suggest backing up your database first (never too safe) and then upgrading | 15:13 |
ara-slack | <dmsimard> I don't believe there are any SQL migrations between 0.13.1 and 0.14.4 | 15:13 |
ara-slack | <dmsimard> You can see all the release notes directly on the tags here: https://github.com/openstack/ara/releases | 15:14 |
ara-slack | <thangola> yep, I already plan to upgrade, just too busy/lazy | 15:17 |
ara-slack | <thangola> hehe | 15:17 |
ara-slack | <dmsimard> no worries, keep in mind that 1.0 won't provide sql migrations though, so you'll need to start over from scratch | 15:18 |
ara-slack | <thangola> oops, I’ll try on sandbox anyway | 15:19 |
*** jrist has quit IRC | 15:36 | |
*** jrist has joined #ara | 16:11 | |
*** tbielawa has joined #ara | 17:14 | |
*** sshnaidm is now known as sshnaidm|off | 18:21 | |
*** resmo has quit IRC | 19:04 | |
*** tbielawa is now known as tbielawa|g0n3 | 19:58 | |
*** weshay has quit IRC | 20:10 | |
*** weshay has joined #ara | 20:11 | |
*** cherna has joined #ara | 21:03 | |
cherna | Hi, has anyone seen this error https://gist.github.com/anonymous/120748a4f7245f7b31bf015eb1033fee | 21:05 |
cherna | Failure using method (v2_playbook_on_task_start) in callback plugin (<ansible.plugins.callback./opt/daddy/venv/lib/python2.7/site-packages/ara/plugins/callbacks/log_ara.CallbackModule object at 0x24f0d10>): This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (pymysql.err.OperationalError) | 21:05 |
dmsimard | cherna: hey, that's a fairly generic error that usually happens after another error | 21:06 |
dmsimard | looks like the previous error in this case was.. | 21:07 |
dmsimard | "MySQL server has gone away (error(32, 'Broken pipe'))" | 21:07 |
cherna | @dmsimard Ok but I see this error for those jobs which are running long | 21:34 |
cherna | so wanted to check whether there is any issue which results in this error... | 21:35 |
dmsimard | cherna: It's entirely possible you're hitting a bug but I haven't seen that kind of issue happen. Could you try and see if you are able to reproduce it with the default sqlite backend, out of curiosity? | 21:36 |
cherna | @dmsimard not sure whether I can do that... thanks for response, will update here if I come across this issue | 21:49 |