*** thorst has joined #openstack-powervm | 01:37 | |
*** thorst has quit IRC | 01:42 | |
*** thorst has joined #openstack-powervm | 02:38 | |
*** thorst has quit IRC | 02:57 | |
*** apearson has joined #openstack-powervm | 03:00 | |
*** dwayne__ has quit IRC | 03:19 | |
*** dwayne__ has joined #openstack-powervm | 03:22 | |
*** jwcroppe has quit IRC | 03:31 | |
*** jwcroppe has joined #openstack-powervm | 03:32 | |
*** apearson has quit IRC | 03:40 | |
*** apearson has joined #openstack-powervm | 03:40 | |
*** apearson has quit IRC | 03:47 | |
*** thorst has joined #openstack-powervm | 03:54 | |
*** thorst has quit IRC | 03:58 | |
*** thorst has joined #openstack-powervm | 04:54 | |
*** thorst has quit IRC | 04:59 | |
*** shyama has joined #openstack-powervm | 05:29 | |
shyama | thorst: efried please review https://review.openstack.org/#/c/432322/ | 05:37 |
*** thorst has joined #openstack-powervm | 05:55 | |
*** thorst has quit IRC | 05:59 | |
*** jwcroppe has quit IRC | 06:32 | |
*** jwcroppe has joined #openstack-powervm | 06:32 | |
*** thorst has joined #openstack-powervm | 06:56 | |
*** thorst has quit IRC | 07:00 | |
*** thorst has joined #openstack-powervm | 07:56 | |
*** openstackgerrit has quit IRC | 08:03 | |
*** thorst has quit IRC | 08:16 | |
*** thorst has joined #openstack-powervm | 09:13 | |
*** thorst has quit IRC | 09:17 | |
*** k0da has joined #openstack-powervm | 09:56 | |
*** chas has joined #openstack-powervm | 10:35 | |
*** shyama has quit IRC | 11:02 | |
*** shyama has joined #openstack-powervm | 11:20 | |
*** shyama has quit IRC | 11:25 | |
*** thorst has joined #openstack-powervm | 11:34 | |
*** thorst has quit IRC | 11:36 | |
*** thorst has joined #openstack-powervm | 12:00 | |
*** jpasqualetto has joined #openstack-powervm | 12:14 | |
*** jpasqualetto has quit IRC | 12:25 | |
*** edmondsw has joined #openstack-powervm | 12:32 | |
*** efried has quit IRC | 12:38 | |
*** mdrabe has joined #openstack-powervm | 12:39 | |
*** efried has joined #openstack-powervm | 12:48 | |
*** shyama has joined #openstack-powervm | 12:51 | |
*** jwcroppe has quit IRC | 12:53 | |
*** jwcroppe has joined #openstack-powervm | 12:54 | |
*** jwcroppe has quit IRC | 12:58 | |
*** jwcroppe has joined #openstack-powervm | 13:07 | |
*** esberglu has joined #openstack-powervm | 13:15 | |
*** jpasqualetto has joined #openstack-powervm | 13:20 | |
*** dwayne__ has quit IRC | 13:48 | |
*** mdrabe has quit IRC | 14:11 | |
*** mdrabe has joined #openstack-powervm | 14:17 | |
*** tjakobs has joined #openstack-powervm | 14:23 | |
*** smatzek has joined #openstack-powervm | 14:25 | |
*** jpasqualetto has quit IRC | 14:27 | |
*** jpasqualetto has joined #openstack-powervm | 14:28 | |
*** nbante has joined #openstack-powervm | 14:39 | |
*** dwayne__ has joined #openstack-powervm | 14:48 | |
*** shyama has quit IRC | 14:49 | |
esberglu | efried: I'm seeing some issues with power off in the in-tree CI runs | 14:53 |
efried | Tell me | 14:53 |
esberglu | It goes through and tries normal power off, that times out and it tries the VSP hard shutdown. Then that fails as well | 14:56 |
esberglu | "Partition must be running to shut down" | 14:56 |
esberglu | A couple things in the logs that concern me | 14:56 |
esberglu | When the first shutdown comes through the instance is in state open firmware, not active | 14:57 |
esberglu | Then also seeing a message "can't perform OS shutdown because RMC connection is not active" | 14:58 |
esberglu | I'm wondering if we are trying to shutdown an instance that is not ready yet? | 14:58 |
esberglu | http://184.172.12.213/74/385074/10/check/nova-in-tree-pvm/1427341/ | 14:59 |
esberglu | And not handling that scenario | 14:59 |
*** jwcroppe has quit IRC | 15:02 | |
esberglu | efried: I can point out timestamps in the logs if you would like | 15:03 |
efried | esberglu I got 'em. Looking now. | 15:03 |
esberglu | efried: This issue happens 2x in that run | 15:04 |
efried | Yuh | 15:04 |
*** jwcroppe has joined #openstack-powervm | 15:09 | |
*** jwcroppe has quit IRC | 15:09 | |
*** jwcroppe has joined #openstack-powervm | 15:10 | |
efried | esberglu Erm, what code is this using? | 15:10 |
esberglu | pypowervm? | 15:11 |
efried | Sorry, yeah. I'm seeing a log message that isn't in the source. | 15:11 |
esberglu | 1.1.0 | 15:13 |
esberglu | It pulls it from upper-constraints | 15:13 |
efried | esberglu Well, something ain't right. | 15:14 |
efried | Cause I can't find the source code that emits this message: | 15:14 |
efried | 2017-04-03 07:11:56.221 WARNING pypowervm.tasks.power [req-23017211-7e7f-4b89-889b-b8e6ed7ddbe5 tempest-ServerDiskConfigTestJSON-561105912 tempest-ServerDiskConfigTestJSON-561105912] Can not perform OS shutdown on Virtual Machine pvm3-tempest-Ser-49a39f18 because its RMC connection is not active. | 15:14 |
*** jwcroppe has quit IRC | 15:15 | |
esberglu | efried: exceptions.py | 15:17 |
esberglu | https://github.com/powervm/pypowervm/blob/1.1.0/pypowervm/exceptions.py#L139 | 15:17 |
efried | Got it, thanks. Not sure how I missed that on my first pass. | 15:19 |
esberglu | efried: Had me worried for a second there | 15:21 |
efried | Me too. Brain not yet in gear after weekend. | 15:21 |
efried | More honey-do projects in last two days than, like, last six months combined. | 15:21 |
efried | After a weekend of installing ceiling fixtures and ripping up carpet, code takes a bit of a gear shift to get back into. | 15:22 |
efried | esberglu thorst Okay, I think it's this 60s timeout that's screwing us again. | 15:25 |
efried | So we tried VSP normal, and it timed out - but probably actually succeeded. | 15:25 |
esberglu | And then the VSP hard fails because it's already shut down | 15:26 |
efried | Then we moved on to VSP hard, but by the time we issued that guy, the partition was actually already down, so it "failed" | 15:26 |
efried | yup. | 15:26 |
efried | So I'm thinking perhaps we want a tweak to the logic here. | 15:26 |
thorst | efried: so just add a check? | 15:26 |
thorst | to see if already dead. | 15:26 |
thorst | which I know is silly but... | 15:26 |
efried | thorst Well, not to check the state of the partition. | 15:26 |
efried | But a check for this error code. | 15:26 |
thorst | ahh | 15:27 |
thorst | fair | 15:27 |
efried | "[PVME01050901-0581] Partition must be running to shut down." | 15:27 |
efried | Check for that PVME code. | 15:27 |
efried | Weird thing is, it took three minutes for that hard shutdown to fail. | 15:27 |
thorst | heh, something to ask hsien | 15:28 |
thorst | that is weird.... | 15:28 |
efried | Could just be bogged system. | 15:28 |
thorst | shouldn't be that bogged. | 15:28 |
efried | Anyway, I think it's reasonable to check for that PVME code and "succeed" at that point. | 15:29 |
thorst | Raghu's scale tests have been hitting 1000 VMs | 15:29 |
efried | esberglu We don't have pvm-rest logs for these, do we? | 15:29 |
esberglu | Nope | 15:29 |
efried | I think we should add those. thorst adreznec Unless there's space considerations? | 15:29 |
efried | Cause without that, we don't have squat for @changh to look at. | 15:30 |
thorst | efried: they're just going to be flooded with other requests | 15:30 |
thorst | but yeah, I like that idea | 15:30 |
adreznec | efried: to clarify, you're thinking vsp soft, if fail, vsp hard and if we hit that PVME then at that point assume it succeeded in the interim | 15:30 |
efried | adreznec I'm saying if power-off hits that PVME at any point, it just returns success. | 15:30 |
efried | Not talking about changing the flow. | 15:30 |
adreznec | efried: thorst I think the only issue there was log scrubbing at the time | 15:30 |
adreznec | but we've kind of given up on there... | 15:31 |
adreznec | *that | 15:31 |
efried | adreznec You mean internal IPs and whatnot? | 15:31 |
adreznec | Space shouldn't really be an issue with the log retain period | 15:31 |
adreznec | I don't think... esberglu have you looked at how much space we're using lately? | 15:31 |
adreznec | efried: yes | 15:31 |
*** apearson has joined #openstack-powervm | 15:31 | |
esberglu | Logserver is currently 88% full. 82 of 100G used | 15:33 |
adreznec | hmm | 15:33 |
adreznec | So maybe we would need more storage for that | 15:33 |
*** thorst is now known as thorst_afk | 15:33 | |
esberglu | Either more storage or decrease the time until we delete | 15:33 |
thorst_afk | MOAR STORAGE | 15:34 |
thorst_afk | it's just SL :-) | 15:34 |
efried | Okay, so we already have logic to succeed the power-off if certain error codes are received; but this one seems to be new. | 15:34 |
*** jwcroppe has joined #openstack-powervm | 15:46 | |
*** jwcroppe has quit IRC | 15:47 | |
efried | thorst_afk esberglu adreznec: 5079 | 15:53 |
efried | (UT already covered by existing tests) | 15:53 |
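Editorial sketch of the power-off tweak discussed above: keep the existing normal-then-hard flow, but treat the "Partition must be running to shut down" error as success whenever it shows up. This is only an illustration under stated assumptions, not the pypowervm implementation. `JobFailed`, `ALREADY_OFF_CODES`, `is_already_off` and `power_off` are hypothetical names; the only detail taken from the log is the PVME code itself.

```python
class JobFailed(Exception):
    """Hypothetical stand-in for a failed or timed-out power-off job."""


# Error code quoted in the log. Seeing it means the partition is already down.
ALREADY_OFF_CODES = ("PVME01050901-0581",)


def is_already_off(exc):
    """Return True if the failure message carries an 'already off' code."""
    return any(code in str(exc) for code in ALREADY_OFF_CODES)


def power_off(attempts):
    """Run shutdown attempts in order until one succeeds.

    attempts is a list of (name, callable) pairs, for example
    [("vsp-normal", ...), ("vsp-hard", ...)]. The flow itself is unchanged;
    the only addition is that an 'already off' error counts as success.
    """
    if not attempts:
        raise ValueError("no power-off attempts supplied")
    last_exc = None
    for name, attempt in attempts:
        try:
            attempt()
            return name
        except JobFailed as exc:
            if is_already_off(exc):
                # An earlier attempt probably succeeded after it timed out.
                return "already-off"
            last_exc = exc
    raise last_exc
```

With this shape, a normal shutdown that times out followed by a hard shutdown that reports the PVME code ends as a successful power-off instead of an error, which matches the failure seen in the CI run.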
*** jwcroppe has joined #openstack-powervm | 15:54 | |
*** efried has quit IRC | 15:59 | |
*** k0da has quit IRC | 16:02 | |
*** apearson has quit IRC | 16:36 | |
*** jwcroppe_ has joined #openstack-powervm | 16:37 | |
*** jwcroppe has quit IRC | 16:40 | |
*** thorst_afk is now known as thorst | 16:42 | |
*** shyama has joined #openstack-powervm | 16:57 | |
*** esberglu_ has joined #openstack-powervm | 17:23 | |
*** tjakobs_ has joined #openstack-powervm | 17:23 | |
*** tjakobs has quit IRC | 17:24 | |
*** esberglu has quit IRC | 17:24 | |
*** shyama has quit IRC | 17:34 | |
esberglu_ | FYI: PowerVM CI will be down starting at 6 PM central time today. I am going to be upgrading the CI undercloud from newton to ocata | 17:37 |
esberglu_ | Timeframe until back up is ~4 hours provided that there are no issues with the upgrade | 17:37 |
thorst | esberglu_: ack | 17:37 |
*** jwcroppe_ has quit IRC | 17:41 | |
*** apearson has joined #openstack-powervm | 17:52 | |
*** chas has quit IRC | 18:04 | |
*** nbante has quit IRC | 18:08 | |
*** jwcroppe has joined #openstack-powervm | 18:11 | |
*** jpasqualetto has quit IRC | 18:14 | |
*** jpasqualetto has joined #openstack-powervm | 18:26 | |
*** shyama has joined #openstack-powervm | 18:37 | |
*** shyama has quit IRC | 18:49 | |
*** efried has joined #openstack-powervm | 18:52 | |
*** jpasqualetto has quit IRC | 19:01 | |
*** k0da has joined #openstack-powervm | 19:09 | |
*** jpasqualetto has joined #openstack-powervm | 19:16 | |
*** jpasqualetto has quit IRC | 19:24 | |
*** smatzek has quit IRC | 19:54 | |
*** apearson has quit IRC | 19:55 | |
thorst | efried: mind taking a look at https://review.openstack.org/#/c/432322/ when you get a chance? | 20:00 |
thorst | my +2 is stuck in submit | 20:01 |
*** apearson has joined #openstack-powervm | 20:04 | |
*** k0da has quit IRC | 20:11 | |
*** k0da has joined #openstack-powervm | 20:24 | |
efried | thorst Sorry, yeah. | 20:32 |
efried | mdrabe and I have been banging our heads against the download hang. | 20:32 |
efried | Pretty sure I'm going to have to retool pypowervm to use eventlet instead of concurrent.futures | 20:33 |
thorst | wtf... | 20:33 |
thorst | that sux | 20:33 |
efried | Well, it would suck less if it didn't mean we were totally broken without a new pypowervm. | 20:34 |
thorst | where are the threads in pypowervm now that we've gotten rid of that coordinated upload crap? | 20:34 |
thorst | I guess i can do a find quick | 20:34 |
thorst | efried: so just transaction.py? | 20:35 |
efried | thorst thread_utils.py is what's killing us right now. | 20:35 |
thorst | efried: this is because of the rest_api_pipe function? | 20:36 |
efried | Which is only used from tasks/storage.py for the upload business. | 20:37 |
thorst | efried: and this is because nova-powervm uses 'func' uploads | 20:38 |
efried | Well, no, it's worse than that. | 20:38 |
efried | It's also because pypowervm uses coordinated. | 20:38 |
efried | We have to change _both_ to get rid of threads. | 20:39 |
thorst | right right. I'm assuming you're using the develop branch that 'got rid of' coordinated | 20:39 |
efried | Because the API/FUNC upload in pypowervm uses that _rest_api_pipe, which _also_ uses futures. | 20:39 |
efried | thorst Yeah, the problem is that API upload _also_ uses futures for API upload when the UploadType is FUNC. | 20:40 |
thorst | efried: look at how we used to do it...sans 'FUNC' | 20:41 |
thorst | https://github.com/openstack/nova-powervm/blob/stable/mitaka/nova_powervm/virt/powervm/disk/localdisk.py | 20:41 |
efried | Yeah, I know, without FUNC we can get rid of threads. | 20:41 |
thorst | but I also know that we need to support FUNC. | 20:41 |
efried | thorst Only for backward compatibility. | 20:42 |
efried | I'm not convinced we need FUNC at all. | 20:42 |
thorst | we could make FUNC you know...write to a file and then upload/delete. Backwards compat wise. | 20:42 |
thorst | the only time I think we need FUNC is if we're going to transform the image ahead of time | 20:42 |
efried | thorst Sure, guaranteeing we have a temp file system with enough space. | 20:42 |
thorst | like what we get from glance may be a tar.gz....FUNC gives us an opportunity to do something to it beforehand. | 20:42 |
efried | How so? | 20:43 |
thorst | get the stream, do something to said stream, upload | 20:43 |
efried | That's the non-FUNC path. | 20:43 |
thorst | well, FUNC defers it until you need to upload | 20:43 |
thorst | ssp case may not actually need to pull from glance at all. | 20:43 |
efried | If we could work with a stream, we just get the chunks from the download function by not specifying a target | 20:43 |
thorst | but I suppose you could know that ahead of time anywho | 20:44 |
thorst | I'm good with deprecating func if we want TBH | 20:44 |
thorst | change nova-powervm to be the way it was...now that the speed is good...and then make FUNC write to a temp file (or to a pipe like it is now...openstack won't use it that way) | 20:44 |
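Editorial sketch of the "write to a file and then upload/delete" idea for keeping FUNC backward compatible without threads. It is a minimal illustration under stated assumptions, not pypowervm code: `upload_via_temp_file`, `write_func` and `upload_stream` are hypothetical names, and the caller is assumed to have a temp file system with enough space, as noted in the discussion.

```python
import os
import tempfile


def upload_via_temp_file(write_func, upload_stream, tmp_dir=None):
    """Run a FUNC-style writer into a temp file, then upload that file.

    write_func(file_obj) is assumed to write the whole image to file_obj.
    upload_stream(file_obj) is assumed to read file_obj and perform the upload.
    Both steps run sequentially in the calling thread, so no pipe or
    background worker is needed.
    """
    fd, path = tempfile.mkstemp(dir=tmp_dir)
    try:
        # Step 1: let the caller's function produce the image on disk.
        with os.fdopen(fd, "wb") as out:
            write_func(out)
        # Step 2: upload from the file as an ordinary readable stream.
        with open(path, "rb") as src:
            return upload_stream(src)
    finally:
        # Always delete the temp file, whether or not the upload succeeded.
        os.remove(path)
```

The trade-off is the one raised in the log: the pipe-based approach needs no scratch space but requires a concurrent reader and writer, while this version needs room for the full image on disk but stays single-threaded.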
efried | thorst We need to do some investigation to see how far back we're broken, though. | 20:44 |
thorst | true...we can also push this change back to stable/ocata... | 20:45 |
thorst | if needed of course | 20:45 |
efried | I am almost certain master nova-powervm is dead. | 20:45 |
efried | Ocata might be busted too. | 20:45 |
thorst | though esberglu is redeploying the undercloud tonight | 20:45 |
thorst | so I'm not convinced ocata is dead | 20:45 |
efried | I'm not convinced either. We need to check. | 20:45 |
thorst | should find out tonight | 20:46 |
thorst | either the undercloud busts or it doesn't | 20:46 |
efried | Whatever changed that caused this freeze changed like a week, week and a half ago. | 20:46 |
efried | No | 20:46 |
efried | The CI will not hit this. | 20:46 |
thorst | the undercloud CI won't hit this? | 20:46 |
thorst | remember, he's redeploying the whole thing...not just the workloads themselves. | 20:46 |
efried | Uhm, maybe actually. | 20:46 |
efried | Because API/FUNC is still busted. | 20:46 |
efried | We verified that (accidentally) | 20:47 |
efried | With master + in-tree SSP change set + changes to make it go API/FUNC. | 20:47 |
thorst | right... | 20:47 |
efried | Soooo... it's possible we'd be okay if we got rid of FUNC in pike nova-powervm and in-tree AND made pike requirements for both of those guys require pypowervm 1.1.1 that gets rid of coordinated. | 20:48 |
efried | Older openstack with newer pypowervm would be aaight - assuming ocata is still okay. | 20:48 |
thorst | right. | 20:48 |
efried | And vice versa wouldn't be possible. | 20:48 |
thorst | nice and clean | 20:48 |
efried | Yeah, right. Nice and clean. | 20:49 |
thorst | :-p | 20:49 |
thorst | software dependencies are ... fun | 20:49 |
efried | Okay, gonna get a water refill and start working on all of that. | 20:49 |
thorst | good luck | 20:49 |
thorst | I'll be heading out to diapers in ten... | 20:49 |
efried | Course, we may still be broken in that transaction.py business you pointed out. | 20:49 |
thorst | yep... | 20:49 |
thorst | but that's not doing i/o waits. | 20:50 |
efried | Mebbe so. | 20:50 |
efried | Wouldn't count on it. | 20:50 |
efried | Especially as we move to LIO... | 20:50 |
thorst | efried: LIO shouldn't matter there...just the stream to the glance is all that I/O waits | 20:51 |
thorst | the rest of it is rest calls (which is an I/O call, sure...but different) | 20:51 |
*** esberglu_ is now known as esberglu | 20:51 | |
*** esberglu has quit IRC | 20:56 | |
*** esberglu has joined #openstack-powervm | 20:56 | |
*** thorst has quit IRC | 20:59 | |
*** thorst has joined #openstack-powervm | 21:00 | |
*** esberglu has quit IRC | 21:01 | |
*** thorst has quit IRC | 21:04 | |
*** edmondsw has quit IRC | 21:12 | |
*** edmondsw has joined #openstack-powervm | 21:14 | |
*** edmondsw has quit IRC | 21:19 | |
*** thorst has joined #openstack-powervm | 21:26 | |
*** thorst has quit IRC | 21:30 | |
*** jwcroppe has quit IRC | 21:34 | |
*** jwcroppe has joined #openstack-powervm | 21:34 | |
*** esberglu has joined #openstack-powervm | 21:35 | |
*** jwcroppe has quit IRC | 21:38 | |
*** thorst has joined #openstack-powervm | 21:47 | |
efried | thorst So we can't actually get rid of all the FUNC code :( | 21:53 |
*** mdrabe has quit IRC | 22:13 | |
*** tjakobs_ has quit IRC | 22:19 | |
*** apearson has quit IRC | 22:23 | |
*** efried has quit IRC | 22:32 | |
*** edmondsw has joined #openstack-powervm | 22:45 | |
*** edmondsw has quit IRC | 22:49 | |
*** jwcroppe has joined #openstack-powervm | 23:01 | |
*** k0da has quit IRC | 23:18 |