*** jb__ has joined #openstack-ironic | 00:06 | |
jb__ | hello | 00:06 |
---|---|---|
jb__ | i am trying to provision more than 20 nodes. after 20, nodes do get provisioned but for some reason nova-compute's ironic client is getting a BadStatusLine error from ironic-api, which is causing that node's instance to go into error and get terminated (even though the node gets successfully rebooted) | 00:09 |
*** arif-ali has quit IRC | 00:10 | |
*** zhidong has joined #openstack-ironic | 00:11 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/ironic-python-agent: Updated from global requirements https://review.openstack.org/145885 | 00:12 |
*** arif-ali has joined #openstack-ironic | 00:16 | |
*** andreykurilin has quit IRC | 00:18 | |
NobodyCam | jb__: have a log we could look at... | 00:29 |
NobodyCam | JoshNang: on the cleaning spec does the new FSM affect the ability to boot long-running agents in cleaning? | 00:30 |
*** bandicot has joined #openstack-ironic | 00:31 | |
JoshNang | NobodyCam: i don't remember seeing any concerns when going through that spec | 00:31 |
JoshNang | the fsm spec i mean, i haven't gone through the fsm patch set | 00:32 |
NobodyCam | I think i'm linking power state and provision state in my head | 00:33 |
JoshNang | yeah, i think i recall someone mentioning in the spec that this was about provision states, not power | 00:35 |
NobodyCam | ya CLEANED -> AVAILABLE (just don't power off) ok got it now | 00:36 |
NobodyCam | :-p | 00:36 |
NobodyCam | after 10 hours the brain gets a little mushy | 00:36 |
JoshNang | no worries, it's a big, broad spec. hard to keep all the details straight | 00:37 |
*** Masahiro has joined #openstack-ironic | 00:40 | |
openstackgerrit | Adam Gandelman proposed openstack/ironic: Provided backward compat for enforcing admin policy https://review.openstack.org/145984 | 00:41 |
*** Masahiro has quit IRC | 00:44 | |
jb__ | NobodyCam: Here is the nova-compute.log | 00:45 |
jb__ | 2015-01-08 23:44:21.569 4848 TRACE nova.compute.manager [instance: a8c137cb-9962-4c61-858c-8828e1777a16] File "/usr/lib/python2.7/dist-packages/ironicclient/v1/node.py", line 134, in get_by_instance_uuid 2015-01-08 23:44:21.569 4848 TRACE nova.compute.manager [instance: a8c137cb-9962-4c61-858c-8828e1777a16] nodes = self._list(self._path(path), 'nodes') 2015-01-08 23:44:21.569 4848 TRACE nova.compute.manager [instance: a8 | 00:45 |
NobodyCam | any thing in the ironic-conductor / ironic-api logs? | 00:47 |
jb__ | NobodyCam: http://pastebin.com/4iSz3TGj | 00:47 |
jb__ | ironic-conductor does not have any errors. Neither does ironic-api | 00:48 |
jb__ | after 20 nodes, I have not been able to launch any more, and the periodic power sync for all the active launches is failing with BadStatusLine in nova-compute | 00:49 |
jb__ | Running Juno GA release | 00:49 |
NobodyCam | ironic node-list working? | 00:50 |
JayF | Did you restart a conductor during the deployment? | 00:52 |
*** naohirot has joined #openstack-ironic | 00:53 | |
naohirot | GM ironic! | 00:54 |
devananda | jb__: can you pastebin a larger section of that log file? that line isn't enough, and please don't paste a bunch here :) | 00:54 |
NobodyCam | morning naohirot | 00:54 |
naohirot | NobodyCam: Hi :) | 00:54 |
NobodyCam | :) | 00:55 |
*** Masahiro has joined #openstack-ironic | 00:59 | |
jb__ | NobodyCam: ironic node-list is working, but it is very slow | 01:03 |
jb__ | NobodyCam: I did NOT start the ironic conductor during deployment | 01:03 |
*** chuckC_ has joined #openstack-ironic | 01:04 | |
jb__ | NobodyCam: I am running ironic-api and ironic database on one server and (ironic-conductor and nova-compute) on another server | 01:04 |
*** ChuckC has quit IRC | 01:06 | |
*** chuckC__ has joined #openstack-ironic | 01:07 | |
*** chuckC_ has quit IRC | 01:09 | |
*** chuckC__ is now known as chuckC_ | 01:09 | |
openstackgerrit | Haomeng,Wang proposed openstack/ironic: raises exception if can not get uuid of root filesystem https://review.openstack.org/143919 | 01:11 |
*** Haomeng has joined #openstack-ironic | 01:11 | |
jb__ | More nova-compute log is pasted here ... http://pastebin.com/tpqbre29 | 01:12 |
*** bandicot has quit IRC | 01:15 | |
*** ijw has quit IRC | 01:17 | |
*** ijw has joined #openstack-ironic | 01:18 | |
devananda | jb__: and this happens only after there are >20 nodes? | 01:19 |
jb__ | devananda: yes.. last time it happened after 22 nodes.. cleaned up the whole env. brought up a clean new openstack env ...now it is happening after 20 | 01:22 |
jb__ | is there a way to number of requests that ironic-api can handle? | 01:23 |
jroll | we run it with many many more than that | 01:24 |
devananda | jb__: there is an optional limit=NN parameter which can be passed | 01:24 |
devananda | to ironic when getting a list of nodes | 01:24 |
jroll | jb__: what power driver are you using? | 01:25 |
devananda | however, that error log shows it is getting only a single record, not a list | 01:25 |
devananda | jb__: I need to run, but the next place I would look is in the ironic-api logs | 01:25 |
devananda | jb__: ironic-api seems to be returning something that httplib on your nova-compute host is not able to parse | 01:26 |
devananda | I thought it might be related to # of nodes, but after looking a bit, I don't think so | 01:26 |
jroll | oh, yeah, that's not just a normal non-200 | 01:26 |
jroll | jb__: do you have apache or haproxy or something in front of ironic-api? | 01:26 |
jroll | devananda: also thinking conductor could be tied up with slow bmcs | 01:26 |
JayF | https://review.openstack.org/#/c/145885/2 easy review, global reqs for IPA (oslo.config bump) and tempest still passes | 01:27 |
devananda | jroll: it's failing in this call from nova: node = _validate_instance_and_node(icli, instance) | 01:27 |
devananda | which shouldn't tickle any BMCs | 01:27 |
*** nosnos has joined #openstack-ironic | 01:27 | |
JayF | but if all conductor workers were tied up | 01:27 |
JayF | it would still be a potential issue, right? | 01:27 |
devananda | ah, true | 01:27 |
jroll | devananda: oh, true | 01:27 |
jroll | no, that call doesn't touch conductors | 01:27 |
devananda | if there were no RPC listeners available | 01:27 |
* JayF can't remember if we patched the power loop to use a separate set of workers or not | 01:27 | |
jroll | oh yes it does | 01:27 |
jroll | yeah | 01:27 |
jroll | I don't think so | 01:28 |
devananda | it should touch conductor, but not a BMC | 01:28 |
jroll | I don't think we patched the power loop, that is | 01:28 |
devananda | anyway, really gotta go | 01:28 |
jroll | right | 01:28 |
*** ijw has quit IRC | 01:29 | |
JoshNang | rloo: i threw a +2 on https://review.openstack.org/#/c/143193/ but didn't want to +A until you got a chance to see if your concerns were fixed (no rush!) | 01:30 |
rloo | JoshNang: if you guys are happy with it, I am fine too (probably). thx, I'm just finishing up something. will look in a few minutes. | 01:30 |
openstackgerrit | Ruby Loo proposed openstack/ironic: Improve testing of state transitions https://review.openstack.org/145929 | 01:32 |
jb__ | devananda: pxe_ipmi driver | 01:33 |
jb__ | jroll: No, I am not running anything other than the wsgi server that comes with ironic | 01:33 |
JayF | ipmitool or ipminative? | 01:33 |
jroll | would love to see how long the power loop takes in ironic-conductor | 01:33 |
jb__ | JayF: using ipmitool | 01:33 |
JayF | good choice | 01:34 |
*** chuckC_ has quit IRC | 01:42 | |
*** dprince has joined #openstack-ironic | 01:42 | |
*** Marga__ has quit IRC | 01:51 | |
rloo | JayF: wrt 143193, I don't like the error msg/handling in HardwareManagerMethodNotFound when method is None. | 01:51 |
rloo | JayF: is that something you care about? I can approve it regardless. | 01:52 |
rloo | JayF: you can fix it afterwards too if you want. | 01:52 |
jroll | rloo: nice catch | 01:52 |
jroll | I think the debug log might be fine | 01:53 |
jroll | but except(errors.IncompatibleHardwareMethodError) | 01:53 |
jroll | is sketchy | 01:53 |
rloo | JayF, jroll: it has two +2 already, so I added my comment but gave it a +1. i hate being the party pooper. | 01:54 |
rloo | JoshNang: ^^ you can decide if you want. I'm gone til Monday, so don't wait for me on this ;) | 01:56 |
jroll | rloo: oh, that | 01:59 |
jroll | I agree | 02:00 |
jroll | have a good weekend :) | 02:00 |
*** chenglch has joined #openstack-ironic | 02:03 | |
*** dprince has quit IRC | 02:08 | |
openstackgerrit | Haomeng,Wang proposed openstack/ironic: manager._check_deploy_timeouts should cover DEPLOYING state nodes https://review.openstack.org/145996 | 02:22 |
*** ijw has joined #openstack-ironic | 02:30 | |
*** pcaruana is now known as pcaruana|afk| | 02:34 | |
*** ijw has quit IRC | 02:35 | |
*** chuckC_ has joined #openstack-ironic | 02:40 | |
*** dlaube has quit IRC | 02:43 | |
*** ramineni has joined #openstack-ironic | 02:46 | |
*** chlong has joined #openstack-ironic | 02:49 | |
*** eghobo has quit IRC | 02:53 | |
*** jb__ has quit IRC | 02:55 | |
*** ChuckC has joined #openstack-ironic | 02:56 | |
*** chuckC_ has quit IRC | 02:58 | |
*** ijw has joined #openstack-ironic | 03:01 | |
*** ijw_ has joined #openstack-ironic | 03:04 | |
*** chuckC_ has joined #openstack-ironic | 03:05 | |
*** ijw has quit IRC | 03:07 | |
*** ijw_ has quit IRC | 03:10 | |
*** nosnos has quit IRC | 03:24 | |
*** naohirot has quit IRC | 03:27 | |
*** Marga_ has joined #openstack-ironic | 03:27 | |
*** Marga_ has quit IRC | 03:28 | |
*** Marga_ has joined #openstack-ironic | 03:28 | |
*** david-lyle has joined #openstack-ironic | 03:29 | |
*** harlowja is now known as harlowja_away | 03:38 | |
*** rwsu has quit IRC | 03:53 | |
*** nosnos has joined #openstack-ironic | 03:56 | |
*** rloo has quit IRC | 03:57 | |
*** lucas-dinner has quit IRC | 03:57 | |
*** ijw has joined #openstack-ironic | 04:03 | |
*** naohirot has joined #openstack-ironic | 04:08 | |
*** ijw has quit IRC | 04:09 | |
*** bradjones has quit IRC | 04:12 | |
*** david-lyle has quit IRC | 04:17 | |
*** rameshg87 has joined #openstack-ironic | 04:19 | |
*** jay-s-b has joined #openstack-ironic | 04:40 | |
*** ijw has joined #openstack-ironic | 05:03 | |
*** ijw has quit IRC | 05:09 | |
openstackgerrit | Naohiro Tamura proposed openstack/ironic: Update etc/ironic/ironic.conf.sample https://review.openstack.org/146016 | 05:11 |
*** nosnos has quit IRC | 05:21 | |
*** bandicot has joined #openstack-ironic | 05:21 | |
*** nosnos has joined #openstack-ironic | 05:21 | |
*** nosnos has quit IRC | 05:26 | |
*** eghobo has joined #openstack-ironic | 05:28 | |
openstackgerrit | Naohiro Tamura proposed openstack/ironic: Add iRMC Driver and its iRMC Power module https://review.openstack.org/144901 | 05:35 |
*** eghobo has quit IRC | 05:47 | |
*** pradipta_away is now known as pradipta | 05:47 | |
*** nosnos has joined #openstack-ironic | 05:53 | |
*** ijw has joined #openstack-ironic | 06:03 | |
*** ijw has quit IRC | 06:09 | |
*** killer_prince has quit IRC | 06:15 | |
*** killer_prince has joined #openstack-ironic | 06:29 | |
*** killer_prince is now known as lazy_prince | 06:30 | |
*** pcrews has quit IRC | 06:32 | |
*** lazy_prince has quit IRC | 06:42 | |
*** lazy_prince has joined #openstack-ironic | 06:42 | |
*** nosnos has quit IRC | 06:48 | |
*** nosnos has joined #openstack-ironic | 06:50 | |
*** dlpartain has joined #openstack-ironic | 07:00 | |
*** ijw has joined #openstack-ironic | 07:03 | |
*** ijw has quit IRC | 07:09 | |
*** bandicot has quit IRC | 07:10 | |
*** mrda is now known as mrda-lca | 07:13 | |
*** dlpartain has left #openstack-ironic | 07:23 | |
*** coolsvap|afk is now known as coolsvap | 07:49 | |
*** lazy_prince has quit IRC | 07:57 | |
*** ifarkas has joined #openstack-ironic | 08:02 | |
*** ijw has joined #openstack-ironic | 08:03 | |
*** killer_prince has joined #openstack-ironic | 08:04 | |
*** killer_prince is now known as lazy_prince | 08:04 | |
*** jcoufal has joined #openstack-ironic | 08:04 | |
*** Nisha has joined #openstack-ironic | 08:04 | |
*** lazy_prince has quit IRC | 08:06 | |
*** lazy_prince has joined #openstack-ironic | 08:06 | |
*** ijw has quit IRC | 08:09 | |
*** zhidong has quit IRC | 08:20 | |
*** ijw has joined #openstack-ironic | 08:31 | |
*** pensu has joined #openstack-ironic | 08:34 | |
*** ijw has quit IRC | 08:36 | |
rameshg87 | naohirot, hi | 08:37 |
*** Marga_ has quit IRC | 08:46 | |
*** ijw has joined #openstack-ironic | 09:03 | |
naohirot | rameshg87: Hi | 09:04 |
*** Masahiro has quit IRC | 09:04 | |
*** jistr has joined #openstack-ironic | 09:05 | |
*** Masahiro has joined #openstack-ironic | 09:05 | |
*** jiangfei has quit IRC | 09:07 | |
*** jiangfei has joined #openstack-ironic | 09:07 | |
*** ijw has quit IRC | 09:09 | |
jay-s-b | ironic-api slows down as more nodes are launched. Periodic power sync for some nodes fails because of the slow response from ironic-api. is there a setting in ironic.conf to take care of this? | 09:12 |
naohirot | jay-s-b: Hi, I don't have the answer. But I'm curious how many nodes it takes for it to get slow? | 09:17 |
jay-s-b | about 20 nodes...not too many | 09:17 |
naohirot | jay-s-b: just 20, not so many I think. | 09:18 |
naohirot | jay-s-b: and how slow? twice? | 09:19 |
*** derekh has joined #openstack-ironic | 09:22 | |
*** romcheg has joined #openstack-ironic | 09:27 | |
*** dtantsur|afk is now known as dtantsur | 09:33 | |
dtantsur | Morning Ironic | 09:33 |
naohirot | dtantsur: good morning | 09:35 |
jay-s-b | I have about 23 nodes registered (ironic node-list) and 20 are provisioned. When nova-compute starts the power_sync_cycle for all 23 nodes, it locks/unlocks each node and takes more than 3 mins in total to finish. Meanwhile, if ironic node-list is run from the command line, it sometimes stalls for more than 1 min before showing the result. In the compute log, some power syncs that try to get a node by... | 09:37 |
jay-s-b | ...instance_uuid result in BadStatusLine, even though the same curl command succeeds when run from the command line. | 09:37 |
dtantsur | oh this locking is definitely a problem we should be solving this cycle... | 09:39 |
*** Haomeng|2 has joined #openstack-ironic | 09:41 | |
jay-s-b | naohirot: With this response time, not able to provision any more nodes. Even deleting some of the nodes is failing. | 09:41 |
*** Haomeng has quit IRC | 09:42 | |
*** pensu has quit IRC | 09:43 | |
jay-s-b | dtantsur: i heard ironic has been tested with more than 50 nodes. My env is failing at around 20 nodes. Wondering if there is any configuration that i need to set correctly. | 09:44 |
dtantsur | jay-s-b, you may want to talk to JayF or jroll when they're available in ~ 5-6 hours. they have a pretty big installation at rackspace | 09:45 |
naohirot | jay-s-b: I see. I'm very interested in hearing such performance data. | 09:46 |
naohirot | rameshg87: I leave for dinner and come back about 1 hour later. | 09:51 |
jay-s-b | dtantsur: Is there a bug reported regarding the locking issue? | 09:58 |
dtantsur | jay-s-b, I don't remember one. we were just discussing it at the summit and IIRC rloo agreed to watch over this work | 09:58 |
*** MattMan has quit IRC | 10:01 | |
sambetts | They were removed from the repo and you now have to run "python setup.py compile_catalog" to generate them | 10:01 |
sambetts | -,- wrong channel sorry | 10:02 |
*** MattMan has joined #openstack-ironic | 10:02 | |
*** jiangfei has quit IRC | 10:03 | |
*** jiangfei has joined #openstack-ironic | 10:03 | |
*** ijw has joined #openstack-ironic | 10:03 | |
*** romcheg1 has joined #openstack-ironic | 10:06 | |
*** romcheg has quit IRC | 10:06 | |
*** ijw has quit IRC | 10:09 | |
*** Nisha_away has joined #openstack-ironic | 10:13 | |
*** Nisha has quit IRC | 10:13 | |
*** naohirot has quit IRC | 10:17 | |
*** Masahiro has quit IRC | 10:26 | |
*** Nisha_away has quit IRC | 10:27 | |
*** bauzas is now known as bauwzer | 10:37 | |
*** bauwzer is now known as bauwser | 10:37 | |
*** Marga_ has joined #openstack-ironic | 10:40 | |
*** andreykurilin has joined #openstack-ironic | 10:44 | |
*** gilliard is now known as gilllliard | 10:46 | |
*** rameshg87 has quit IRC | 10:51 | |
*** athomas has quit IRC | 10:55 | |
*** athomas has joined #openstack-ironic | 10:56 | |
*** ramineni has quit IRC | 10:58 | |
*** pelix has joined #openstack-ironic | 10:59 | |
*** chenglch has quit IRC | 10:59 | |
*** Marga_ has quit IRC | 11:00 | |
*** Marga_ has joined #openstack-ironic | 11:00 | |
*** ijw has joined #openstack-ironic | 11:03 | |
*** lucasagomes has joined #openstack-ironic | 11:05 | |
*** ijw has quit IRC | 11:09 | |
*** alexpilotti has joined #openstack-ironic | 11:14 | |
*** romcheg1 has quit IRC | 11:19 | |
*** naohirot has joined #openstack-ironic | 11:20 | |
*** pradipta is now known as pradipta_away | 11:21 | |
naohirot | rameshg87: I'm back, but it seems you are here | 11:24 |
*** Masahiro has joined #openstack-ironic | 11:27 | |
naohirot | dtantsur: Haomeng|2: thank you for the swift approval.:) | 11:30 |
openstackgerrit | Merged openstack/ironic: Update etc/ironic/ironic.conf.sample https://review.openstack.org/146016 | 11:30 |
dtantsur | np) | 11:30 |
naohirot | dtantsur: :) | 11:31 |
*** Masahiro has quit IRC | 11:31 | |
naohirot | rameshg87: s/you are here/you are not here/ | 11:40 |
Haomeng|2 | dtantsur: np | 11:48 |
*** andreykurilin has quit IRC | 11:49 | |
*** jcoufal_ has joined #openstack-ironic | 11:50 | |
*** jcoufal has quit IRC | 11:52 | |
*** nosnos has quit IRC | 11:54 | |
*** ijw has joined #openstack-ironic | 12:03 | |
*** EmilienM|afk is now known as EmilienM | 12:07 | |
*** ijw has quit IRC | 12:09 | |
openstackgerrit | Dmitry Tantsur proposed stackforge/ironic-discoverd: Implement get status endpoint https://review.openstack.org/146067 | 12:31 |
*** romcheg has joined #openstack-ironic | 12:32 | |
*** chlong has quit IRC | 12:32 | |
openstackgerrit | Dmitry Tantsur proposed stackforge/ironic-discoverd: Implement get status endpoint https://review.openstack.org/146067 | 12:36 |
dtantsur | ifarkas, hey, ^^^ I have some huge patch for you =^_^= thanks | 12:39 |
*** lazy_prince is now known as killer_prince | 12:39 | |
ifarkas | dtantsur, will review ;-) | 12:40 |
*** kbyrne has quit IRC | 12:49 | |
*** EmilienM is now known as EmilienM|afk | 12:52 | |
*** kbyrne has joined #openstack-ironic | 12:54 | |
*** Masahiro has joined #openstack-ironic | 12:59 | |
*** smoriya has quit IRC | 13:00 | |
*** ijw has joined #openstack-ironic | 13:03 | |
*** erwan_taf has joined #openstack-ironic | 13:04 | |
*** Masahiro has quit IRC | 13:04 | |
*** EmilienM|afk is now known as EmilienM | 13:09 | |
*** ijw has quit IRC | 13:09 | |
*** naohirot has quit IRC | 13:18 | |
*** dprince has joined #openstack-ironic | 13:21 | |
openstackgerrit | Ihar Hrachyshka proposed openstack/ironic-python-agent: Revert "Remove python 2.6 from tox.ini" https://review.openstack.org/146083 | 13:37 |
*** romcheg has quit IRC | 13:46 | |
*** romcheg has joined #openstack-ironic | 13:46 | |
*** pcaruana|afk| has quit IRC | 13:53 | |
jroll | jay-s-b: one thing you can do is set the power sync interval higher so it doesn't run as often | 13:54 |
jroll | jay-s-b: or scale up your conductor cluster | 13:55 |
jroll | jay-s-b: that loop shouldn't be taking 3 minutes for 23 healthy bmcs... there might be one bad one slowing everything down | 13:55 |
jroll | morning irnoic :) | 13:55 |
dtantsur | jroll, morning :) | 13:55 |
jroll | also, morning ironic | 13:55 |
jroll | hey dtantsur | 13:55 |
lucasagomes | jroll, morning | 13:56 |
jroll | morning lucasagomes :) | 13:58 |
dtantsur | jroll, could you join the discussion at https://review.openstack.org/#/c/135589/ ? (also added to the meeting agenda, but maybe we can solve it earlier) | 14:01 |
jroll | dtantsur: I'm +1 on it just from reading the commit message | 14:02 |
jroll | but yeah, I'll review that today | 14:02 |
* jroll adds to list | 14:02 | |
dtantsur | thanks | 14:03 |
*** ijw has joined #openstack-ironic | 14:03 | |
BadCub_ | Morning Ironic | 14:07 |
dtantsur | BadCub_, morning | 14:08 |
BadCub_ | Morning dtantsur | 14:09 |
*** ijw has quit IRC | 14:09 | |
*** coolsvap is now known as coolsvap|afk | 14:10 | |
*** ijw has joined #openstack-ironic | 14:14 | |
openstackgerrit | Dmitry Tantsur proposed stackforge/ironic-discoverd: DO NOT MERGE https://review.openstack.org/141786 | 14:14 |
*** jjohnson2 has joined #openstack-ironic | 14:16 | |
*** erwan_taf has quit IRC | 14:18 | |
*** coolsvap|afk is now known as coolsvap | 14:19 | |
*** ryanpetrello has joined #openstack-ironic | 14:20 | |
NobodyCam | morning Ironic | 14:25 |
jroll | morning NobodyCam | 14:26 |
NobodyCam | hey hey jroll :) | 14:27 |
jroll | happy friday! | 14:27 |
NobodyCam | oh ya .... TGIF | 14:27 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/ironic: Updated from global requirements https://review.openstack.org/145884 | 14:28 |
jroll | so this "nova driver is broken" thread | 14:30 |
jroll | are they just missing ExactRAMFilter? | 14:31 |
*** erwan_taf has joined #openstack-ironic | 14:34 | |
gilllliard | jroll: which thread? | 14:34 |
gilllliard | on the ML? | 14:34 |
jroll | gilllliard: yes | 14:34 |
gilllliard | I think they're not using that filter on purpose. | 14:35 |
jroll | >.> | 14:35 |
jroll | why? | 14:35 |
gilllliard | To deploy to heterogeneous hw sets? | 14:35 |
jroll | so use a flavor for each type | 14:35 |
jroll | as a user, if I ask for 512MB, I want 512MB | 14:35 |
jroll | idk. | 14:35 |
gilllliard | IDK Alex's use case, but we've hit similar using tripleo | 14:36 |
jroll | interesting | 14:37 |
gilllliard | the heat template gives you a single flavor param for your OvercloudComputeFlavor (for example) | 14:37 |
gilllliard | So if you have machines with differing amounts of ram, you set the flavor to have the smaller amount and turn off exact ram filter | 14:37 |
jroll | fun | 14:38 |
jroll | the other thing is, the ironic driver should be marking nodes as using all resources; there's probably a race there but yeah. | 14:38 |
jroll | maybe that's the real issue here | 14:39 |
gilllliard | There is a race, because nova only polls for available resources every 60s | 14:39 |
jroll | https://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py#L240-244 | 14:39 |
jroll | yeah | 14:39 |
jroll | ok | 14:39 |
gilllliard | So nova does (available_ram - flavor_ram) in its resource tracking, and it's up to 60s before ironic tells it any better | 14:40 |
jroll | yep, understood | 14:40 |
*** Haomeng has joined #openstack-ironic | 14:43 | |
*** Haomeng|2 has quit IRC | 14:44 | |
gilllliard | AIUI this is expected nova behaviour, and the best ironic can do is fail fast when nodes are double-scheduled during those 60s | 14:44 |
openstackgerrit | Matthew Gilliard proposed openstack/ironic: Adds get_glance_image_properties https://review.openstack.org/146099 | 14:45 |
*** Masahiro has joined #openstack-ironic | 14:48 | |
dtantsur | NobodyCam, hey! let us discuss your concerns around https://review.openstack.org/#/c/135605/ and node locking, I'm not sure what exactly concerns you | 14:49 |
dtantsur | with 3rdparty service driving Ironic you'll always have possibilities for races, because Ironic can't distinguish between discoverd and an insane operator :D | 14:50 |
*** mjturek has joined #openstack-ironic | 14:50 | |
*** killer_prince has quit IRC | 14:51 | |
*** Masahiro has quit IRC | 14:52 | |
*** pcaruana|afk| has joined #openstack-ironic | 14:52 | |
*** killer_prince has joined #openstack-ironic | 14:53 | |
*** killer_prince is now known as lazy_prince | 14:53 | |
NobodyCam | dtantsur: I was reading that yesterday and thinking our power check task takes a lock | 14:55 |
NobodyCam | you will release a lock, will you be blocking the power check task when a node is in discovery? | 14:56 |
NobodyCam | else I could see it re-acquiring the lock on the node | 14:56 |
NobodyCam | (granted only for the time needed to check power) | 14:57 |
dtantsur | NobodyCam, it does not matter if the power check releases the lock after a reasonable amount of time | 14:57 |
dtantsur | discoverd will just retry | 14:57 |
NobodyCam | ok so you're just concerned about anything that would keep a node locked. | 14:59 |
NobodyCam | not something like a brief power check | 15:00 |
dtantsur | yes | 15:00 |
NobodyCam | :) | 15:00 |
dtantsur | NobodyCam, because if I don't release a lock taken by inspect_hardware, there'll be a deadlock | 15:00 |
dtantsur | otherwise discoverd should be fine | 15:01 |
NobodyCam | ok I will update my comment | 15:01 |
NobodyCam | thank you | 15:01 |
*** achanda has joined #openstack-ironic | 15:02 | |
dtantsur | thanks! | 15:05 |
lucasagomes | NobodyCam, g'morning | 15:05 |
NobodyCam | dtantsur: re-reviewed :) | 15:06 |
NobodyCam | morning lucasagomes | 15:07 |
NobodyCam | :) | 15:07 |
dtantsur | oh cool! | 15:07 |
dtantsur | now I need someone to push The Button :) | 15:07 |
NobodyCam | dtantsur: I'm just waiting for a bit so folks can look it over. | 15:08 |
dtantsur | yep sure | 15:08 |
NobodyCam | :) | 15:08 |
*** humble_ has joined #openstack-ironic | 15:09 | |
NobodyCam | lucasagomes: got a second for a quick review of the discoverd spec? | 15:09 |
NobodyCam | 2 +2's and a rloo +1 :) | 15:09 |
*** chuckC_ has quit IRC | 15:10 | |
NobodyCam | is the gate still borked? | 15:11 |
lucasagomes | NobodyCam, I'm going to get some food, I can take a look as soon as I get back | 15:12 |
NobodyCam | :) thank you lucasagomes :) | 15:12 |
lucasagomes | brb | 15:12 |
NobodyCam | enjot the food | 15:12 |
*** lucasagomes is now known as lucas-hungry | 15:12 | |
lucas-hungry | thanks :D | 15:12 |
NobodyCam | enjoy even | 15:12 |
* NobodyCam goes for more coffee | 15:12 | |
*** TheJulia has quit IRC | 15:13 | |
*** erwan_taf has quit IRC | 15:18 | |
*** Marga_ has quit IRC | 15:19 | |
*** Marga_ has joined #openstack-ironic | 15:19 | |
*** erwan_taf has joined #openstack-ironic | 15:31 | |
* dtantsur brb | 15:31 | |
*** stendulker has joined #openstack-ironic | 15:32 | |
openstackgerrit | Matthew Gilliard proposed openstack/ironic: Adds get_glance_image_properties https://review.openstack.org/146099 | 15:34 |
*** achanda_ has joined #openstack-ironic | 15:41 | |
*** achanda has quit IRC | 15:41 | |
*** stendulker has quit IRC | 15:49 | |
*** bnemec is now known as beekneemech | 15:56 | |
openstackgerrit | Steven Dake proposed openstack/ironic-specs: Override boot options via glance property https://review.openstack.org/144235 | 15:56 |
*** achanda_ has quit IRC | 15:58 | |
*** bandicot has joined #openstack-ironic | 16:00 | |
*** Marga_ has quit IRC | 16:03 | |
*** bandicot has quit IRC | 16:04 | |
*** openstackgerrit has quit IRC | 16:05 | |
*** openstackgerrit has joined #openstack-ironic | 16:06 | |
openstackgerrit | Matthew Gilliard proposed openstack/ironic: Adds get_glance_image_properties https://review.openstack.org/146099 | 16:07 |
openstackgerrit | Jarrod Johnson proposed stackforge/pyghmi: Implement server side IPMI protocol (WIP) https://review.openstack.org/138109 | 16:11 |
NobodyCam | jjohnson2: woo hoo .. :) | 16:12 |
jjohnson2 | NobodyCam, well, almost | 16:12 |
jjohnson2 | NobodyCam, now it can navigate the login and parse commands | 16:12 |
jjohnson2 | just tweaking a few mistakes in sending day-to-day replies | 16:13 |
NobodyCam | nice.. I haven't looked at it yet.. still on other reviews | 16:13 |
*** lucas-hungry is now known as lucasagomes | 16:13 | |
*** coolsvap is now known as coolsvap|afk | 16:16 | |
jjohnson2 | NobodyCam, yeah, it's a bit... big | 16:18 |
lucasagomes | NobodyCam, spec lgtm! +a | 16:22 |
lucasagomes | dtantsur, ^ | 16:22 |
openstackgerrit | Merged openstack/ironic-specs: In-band hardware properties inspection via ironic-discoverd https://review.openstack.org/135605 | 16:24 |
openstackgerrit | Jarrod Johnson proposed stackforge/pyghmi: Implement server side IPMI protocol (WIP) https://review.openstack.org/138109 | 16:25 |
NobodyCam | woo hoo | 16:29 |
dtantsur | WOOOHOOOOOO | 16:29 |
NobodyCam | thank you lucasagomes | 16:29 |
NobodyCam | :) | 16:29 |
jay-s-b | jroll: for some reason sync_power_state_interval is stuck at 600 seconds (10 mins), though i have it set to 1800 seconds in the ironic.conf conductor section. i want to increase it to more than 20 mins because my provisioning can take up to 10 mins before a node becomes active. the nova-compute.log flag shows this value set at 600 seconds. But the ironic code defaults to 60 seconds and i have set it to 1800... | 16:33 |
jay-s-b | ...seconds and i have restarted nova-compute, ironic conductor and ironic api several times. I am not sure where this 600 seconds is coming from. | 16:33 |
*** Marga_ has joined #openstack-ironic | 16:33 | |
NobodyCam | oh joy. http://logs.openstack.org/89/145389/4/check-tripleo/check-tripleo-ironic-undercloud-precise-nonha/26b95cb/logs/seed_logs/nova-compute.txt.gz | 16:33 |
openstackgerrit | Merged stackforge/ironic-discoverd: Implement get status endpoint https://review.openstack.org/146067 | 16:34 |
jroll | jay-s-b: nova and ironic each have their own power sync task, there's different configs for them | 16:34 |
jroll | jay-s-b: I really think this is all due to slow bmcs | 16:34 |
jroll | *really* slow bmcs | 16:34 |
jroll | might even just be one | 16:34 |
*** andreykurilin has joined #openstack-ironic | 16:35 | |
NobodyCam | would be cool to collect data on how long each bmc takes to reply | 16:35 |
NobodyCam | to track down slow ones | 16:35 |
*** Masahiro has joined #openstack-ironic | 16:37 | |
jroll | *cough* there's a spec for that | 16:37 |
NobodyCam | lol | 16:37 |
jroll | https://review.openstack.org/#/c/137171/ | 16:38 |
*** athomas has quit IRC | 16:38 | |
lucasagomes | I feel like "there's an app for that" and "there's a spec for that" are now analogous | 16:39 |
*** Marga__ has joined #openstack-ironic | 16:39 | |
*** pcrews has joined #openstack-ironic | 16:40 | |
*** Masahiro has quit IRC | 16:41 | |
*** Marga_ has quit IRC | 16:42 | |
NobodyCam | lol | 16:43 |
*** ijw has quit IRC | 16:43 | |
*** rameshg87 has joined #openstack-ironic | 16:45 | |
rameshg87 | dtantsur, hi | 16:45 |
JayF | if someone wants to approve this: https://review.openstack.org/#/c/143193/22 I'm working on a followup patch for ruby's concern now | 16:45 |
* JayF just doesn't want to lose the cavalcade of +2s | 16:45 | |
jroll | :P | 16:46 |
dtantsur | rameshg87, o/ | 16:46 |
rameshg87 | dtantsur, had some comments/questions on inband hw inspection: https://review.openstack.org/#/c/135605/13 | 16:46 |
*** athomas has joined #openstack-ironic | 16:46 | |
rameshg87 | dtantsur, just realised it got merged | 16:47 |
rameshg87 | dtantsur, please check if they are of concern | 16:47 |
rameshg87 | dtantsur, added my comments on the same review | 16:47 |
dtantsur | sure, will have a look right now | 16:47 |
rameshg87 | dtantsur, mainly on releasing the lock, won't it let sync_power_state to do some operation on node ? | 16:48 |
rameshg87 | dtantsur, are we going to do something on that ? | 16:48 |
dtantsur | rameshg87, it will. I don't see anything bad in Ironic knowing real state of machine; discoverd also has retry logic in case it tries to do something during this lock sync | 16:48 |
*** TheJulia has joined #openstack-ironic | 16:49 | |
* dtantsur is writing answers on a spec | 16:49 | |
openstackgerrit | Jay Faulkner proposed openstack/ironic-python-agent: HardwareManagerMethodNotFound requires a method https://review.openstack.org/146133 | 16:49 |
rameshg87 | dtantsur, oh discoverd uses ironic to power on/off machine, right ? | 16:49 |
rameshg87 | dtantsur, i missed that point :D | 16:49 |
dtantsur | rameshg87, exactly | 16:49 |
rameshg87 | dtantsur, okay | 16:49 |
NobodyCam | brb | 16:50 |
rameshg87 | dtantsur, then another small thing was we might need to set the boot device to pxe on the node | 16:50 |
dtantsur | rameshg87, https://github.com/stackforge/ironic-discoverd/blob/master/ironic_discoverd/discover.py#L111 :) | 16:50 |
dtantsur | actually it's a good point and initially I forgot about it | 16:51 |
dtantsur | but right now it's already fixed | 16:51 |
rameshg87 | dtantsur, :) | 16:51 |
rameshg87 | dtantsur, okies | 16:51 |
rameshg87 | dtantsur, others were nits | 16:51 |
dtantsur | rameshg87, thank you for review anyway. please ping me if you have more questions | 16:51 |
rameshg87 | dtantsur, sure | 16:52 |
* dtantsur leaves comments on a spec for future reference | 16:52 | |
*** rwsu has joined #openstack-ironic | 16:53 | |
*** romcheg has quit IRC | 16:55 | |
NobodyCam | jroll: do you know if aweeks will be addressing devananda's comments on that? | 16:56 |
jroll | NobodyCam: I sure hope so? :P | 16:56 |
jroll | aweeks: ^ you has comments on your spec | 16:57 |
openstackgerrit | Jay Faulkner proposed openstack/ironic-python-agent: HardwareManagerMethodNotFound requires a method https://review.openstack.org/146133 | 16:59 |
jroll | NobodyCam: in general, "instrument all the things best we can" | 16:59 |
JayF | I actually think the best response in that spec to "what will we metric" is basically that | 17:00 |
openstackgerrit | Alexis Lee proposed openstack/ironic: Distinguish between prepare + deploy errors https://review.openstack.org/146135 | 17:00 |
jroll | yeah | 17:00 |
JayF | but the spec should only have to cover adding the framework for metricing stuff | 17:00 |
JayF | seems like putting the cart before the horse to decide what all metrics are useful (especially when the answer is likely to be anything we can think of and probably some we didn't) | 17:00 |
yjiang5 | dtantsur: hi | 17:02 |
jroll | I think "how do we metric things that are async or in more than one request" is a valid question | 17:02 |
dtantsur | yjiang5, hi | 17:02 |
*** Marga__ has quit IRC | 17:03 | |
*** EmilienM is now known as EmilienM|afk | 17:04 | |
*** jgrimm is now known as zz_jgrimm | 17:05 | |
yjiang5 | dtantsur: I'm reading the spec at https://review.openstack.org/#/c/133902/11/specs/kilo/bare-metal-trust-using-intel-txt.rst , just wondering if that can be achieved based on your discoverd effort as https://review.openstack.org/#/c/135605/? The baremetal trust basically checks the hardware's 'trust' property, which is a dynamic one. | 17:05 |
* dtantsur is looking | 17:07 | |
jroll | yjiang5: what would discoverd handle for this? | 17:08 |
dtantsur | yjiang5, what you're talking about is what we call "hardware capability" and it can definitely be discovered using discoverd or (possibly) OOB means. Did I get you right? | 17:08 |
openstackgerrit | Alexis Lee proposed openstack/ironic: Remove tautological condition https://review.openstack.org/146138 | 17:08 |
yjiang5 | jroll: that spec needs a gold image to discover if the node is trusted, and also needs to update the hardware property for it. To me, it's quite similar to discoverd | 17:10 |
NobodyCam | brb | 17:10 |
jroll | yjiang5: ah, I see | 17:10 |
jroll | yeah, discoverd could probably handle that, don't see why not, the question is if it should | 17:10 |
dtantsur | can't say for sure (Friday evening, you know...), but it should be possible | 17:11 |
yjiang5 | dtantsur: yes, exactly. So does discoverd cover hardware capability (is capability the same as property?). I suspect the 'trust' property can't be achieved through OOB. | 17:11 |
jroll | and is it a gold image for each node, or just one? | 17:11 |
yjiang5 | dtantsur: I'm on friday morning, need coffee too :) | 17:11 |
yjiang5 | jroll: I'd take it that just one gold image should be enough, the only requirement is that a trust attestation client is included. | 17:12 |
dtantsur | jroll, yjiang5, discoverd has plugins. so if you write a plugin for discoverd and your ramdisk, you can use it. I am not sure whether it's going to save you a lot of coding, of course | 17:12 |
yjiang5 | dtantsur, jroll, I'm not working on that spec, but I will talk to our colleagues about it. Thanks for your input, I will talk with them to see if we can achieve it using the discoverd service. | 17:13 |
jroll | cool | 17:13 |
*** beekneemech has quit IRC | 17:17 | |
*** bnemec has joined #openstack-ironic | 17:18 | |
*** achanda has joined #openstack-ironic | 17:22 | |
*** jistr_ has joined #openstack-ironic | 17:31 | |
*** ndipanov_ has joined #openstack-ironic | 17:31 | |
*** dtantsur_ has joined #openstack-ironic | 17:31 | |
*** lsmola_ has joined #openstack-ironic | 17:31 | |
*** jcoufal has joined #openstack-ironic | 17:32 | |
*** Haomeng has quit IRC | 17:33 | |
*** ryanpetrello has quit IRC | 17:33 | |
*** jcoufal_ has quit IRC | 17:33 | |
*** jcoufal_ has joined #openstack-ironic | 17:34 | |
*** dtantsur has quit IRC | 17:34 | |
openstackgerrit | Merged openstack/ironic-python-agent: Allow use of multiple simultaneous HW managers https://review.openstack.org/143193 | 17:34 |
devananda | morning, all | 17:34 |
*** zer0c00l has quit IRC | 17:34 | |
lucasagomes | devananda, good morning | 17:34 |
*** lsmola has quit IRC | 17:35 | |
*** Haomeng has joined #openstack-ironic | 17:35 | |
*** jistr has quit IRC | 17:35 | |
*** ndipanov has quit IRC | 17:35 | |
*** ndipanov_ has quit IRC | 17:37 | |
*** jcoufal has quit IRC | 17:37 | |
*** lsmola_ has quit IRC | 17:37 | |
*** dtantsur_ has quit IRC | 17:37 | |
*** jistr_ has quit IRC | 17:37 | |
openstackgerrit | Adam Gandelman proposed openstack/ironic: Provided backward compat for enforcing admin policy https://review.openstack.org/145984 | 17:39 |
devananda | anyone see the ML thread / bug about resource tracker? https://bugs.launchpad.net/nova/+bug/1402658 | 17:40 |
*** spandhe has joined #openstack-ironic | 17:40 | |
*** jcoufal_ has quit IRC | 17:42 | |
*** dtantsur has joined #openstack-ironic | 17:42 | |
NobodyCam | morning devananda :) | 17:43 |
NobodyCam | i quickly skimmed it but have not looked in to it at all | 17:43 |
jay-s-b | jroll: I updated sync_power_state_interval of nova-compute to 1800 seconds. That works, because nova-compute kicks off the power sync task for all the instances and that in turn seems to call power sync on ironic instances/nodes. This particular lock seems to have some issue (one of the reasons, as you mentioned earlier, could be a slow bmc, but the ironic-conductor log timestamps look good for bmc... | 17:44 |
jay-s-b | ...access for all the nodes registered). For now, with the new sync_power_state_interval, I have deferred the problem by 30 mins. Within those 30 mins, I am able to provision some number of nodes. However, if I try to provision a node during the next power sync kick off after 30 mins, provisioning is highly likely to fail. | 17:44 |
aweeks | NobodyCam: I was out sick yesterday, but I'm planning to update the spec per devananda's comments today | 17:44 |
*** andreykurilin has quit IRC | 17:44 | |
NobodyCam | aweeks: TY.. hope you're feeling better | 17:44 |
*** andreykurilin has joined #openstack-ironic | 17:45 | |
*** ryanpetrello has joined #openstack-ironic | 17:45 | |
aweeks | devananda: per your comments on https://review.openstack.org/#/c/137171/10/specs/kilo/add-pluggable-metrics-for-ironic-and-ipa.rst are you looking for more details on a few specific metrics that could be used? a code sample of how to instrument code? I don't think it's realistic to exhaustively list every metric that we could want to report, but if it's | 17:46 |
aweeks | just more detail on some examples, then I'm happy to add that. | 17:46 |
aweeks | NobodyCam: thanks, much better :) | 17:47 |
devananda | jay-s-b: if you wait until after the power sync completes, then what? | 17:47 |
devananda | jay-s-b: iow, is it a race during the power sync, or is the sync leaving something in an inconsistent/fragile state? | 17:47 |
*** vdrok has joined #openstack-ironic | 17:48 | |
jroll | jay-s-b: hmm, this is strange. I wonder if turning down the number of workers for the power sync could help too | 17:48 |
devananda | aweeks: more detail on some examples ++ | 17:48 |
devananda | aweeks: also, as a reviewer, how will I know when the spec is done? | 17:48 |
devananda | aweeks: unless there is a list of "here are the metrics which will be collected as part of this spec" | 17:48 |
JayF | I would argue when writing software you should never be done adding metrics | 17:49 |
JayF | because as you run it in production, over time, you learn new things to metric | 17:49 |
devananda | certainly | 17:49 |
devananda | I'm not saying we can't add more specs later | 17:49 |
aweeks | devananda: yeah, I understand. I see the set of metrics to be added per the spec as a relatively small set of example metrics, and over time we'll just keep adding | 17:49 |
JayF | I would consider the spec complete when a short list of very high impact metrics are added | 17:49 |
*** ndipanov_ has joined #openstack-ironic | 17:49 | |
devananda | but I can't judge a design, or the implementation thereof, without some reference to what is going to be implemented | 17:49 |
JayF | and that then there could be continued follow up patches (attached to a bug?) to metric parts of the code that aren't well metric'd | 17:49 |
devananda | JayF: fine. but those should be listed in the spec :) | 17:50 |
jroll | I would start with: every api method, every RPC dispatch | 17:50 |
jroll | maybe even every objects and/or db call | 17:50 |
devananda | erm | 17:50 |
JayF | and for IPA, with the code around writing and downloading the image | 17:50 |
devananda | so | 17:50 |
*** ndipanov_ has quit IRC | 17:50 | |
jay-s-b | devananda: It could be a race problem. When I run the same curl command that fails with an exception from the command line, it returns the correct result with the correct state | 17:51 |
devananda | let's pause a sec, jroll. what is the use-case for collecting these metrics? | 17:51 |
*** lsmola_ has joined #openstack-ironic | 17:51 | |
aweeks | devananda: when something goes wrong, and you wish you'd collected them :P | 17:51 |
jroll | devananda: identifying when your software is having problems | 17:51 |
jroll | also that | 17:51 |
*** derekh has quit IRC | 17:51 | |
JayF | The use case is troubleshooting software in production. Without detailed timers and counters on what it's really doing, most troubleshooting involves log spelunking and educated guesses | 17:51 |
jroll | devananda: for example: db latency goes to 2x. | 17:52 |
devananda | jroll: debugging in QA? benchmarking? tracking performance in production to predict upcoming scalability limits? | 17:52 |
devananda | have you guys looked at osprofiler? | 17:52 |
jroll | that will present as "general slowness" | 17:52 |
JayF | I'm honestly amazed we have to give a case for metrics in software | 17:52 |
JayF | I thought that was a basic tenet: emit statistics on everything | 17:52 |
jroll | devananda: tracking performance in production to identify potential problems and assist in tracking down existing problems | 17:52 |
devananda | great. have you looked at osprofiler? | 17:53 |
JayF | It's been incredibly rough running Ironic in production without having any metrics, especially after moving from a team that ran software that had a metric emitted for everything | 17:53 |
openstackgerrit | Dmitry Tantsur proposed stackforge/ironic-discoverd: Rework node cache clean up according to recent changes https://review.openstack.org/146148 | 17:53 |
*** ndipanov has joined #openstack-ironic | 17:54 | |
jroll | devananda: it was my understanding that osprofiler is meant for test/qa/benchmarking, not production | 17:55 |
devananda | jroll: AIUI, osprofiler is meant for production -- but not necessarily to be turned on on every call | 17:55 |
jroll | also, I'm not going to run ceilometer. | 17:55 |
jroll | right | 17:55 |
jroll | we want metrics on every call. | 17:55 |
JayF | Right now I can't tell you how, lets say, a network incident might have affected my control plane | 17:56 |
JayF | because I'm not getting metrics on every call | 17:56 |
devananda | JayF: I'm not in any way suggesting we don't have metrics | 17:57 |
jroll | ok, reading more, osprofiler is intended to be used on an ad-hoc basis | 17:57 |
devananda | JayF: so there's no need to explain why you want them | 17:57 |
jroll | it doesn't even sample | 17:57 |
jroll | you have to run a cli | 17:57 |
jroll | I think, maybe not | 17:57 |
devananda | jroll: I'm fairly sure it samples, and can be triggered by an HTTP header, or enabled on certain API servers and not others, for example | 17:58 |
devananda | and that header would get passed through context throughout the whole call stack | 17:58 |
devananda | and it would collect timing data on every method call, object, db, and rpc call in that execution | 17:58 |
jroll | yeah, I want that data for every execution | 17:59 |
JayF | ++ | 17:59 |
devananda | in which case, it would be collecting all the data you want -- without polluting the code with decorators on everything | 17:59 |
*** ndipanov has quit IRC | 17:59 | |
devananda | so then enable it on every service | 17:59 |
devananda | rather than in the request header | 17:59 |
devananda | and it's on globally | 17:59 |
jroll | according to the readme, you still have to instrument the functions | 17:59 |
jroll | https://github.com/stackforge/osprofiler#osprofiler-api-version-030 | 18:00 |
devananda | hm. I'm certain that boris-42 has said that you do NOT have to instrument individual functions | 18:00 |
JayF | He works on rally, not osprofiler iirc | 18:01 |
devananda | nope. he did osprofiler | 18:01 |
devananda | and rally | 18:01 |
aweeks | so, to get less philosophical, currently three ways to collect metrics that I'm going to detail in the updated spec: 1. directly calling the logger to emit a metric 2. A decorator for instrumenting functions 3. A context manager for instrumenting code blocks | 18:01 |
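The three collection styles aweeks lists could be sketched as below. This is an illustration only, assuming a hypothetical in-memory backend: `InMemoryMetricLogger`, `counter`, `timer`, `timed`, `time_block` and the metric names are invented here and are not taken from the spec or from any Ironic code.

```python
import contextlib
import functools
import time


class InMemoryMetricLogger:
    """Hypothetical metrics backend that just records what was emitted."""

    def __init__(self):
        self.emitted = []

    def counter(self, name, value=1):
        self.emitted.append(("counter", name, value))

    def timer(self, name, seconds):
        self.emitted.append(("timer", name, seconds))

    @contextlib.contextmanager
    def time_block(self, name):
        """Context manager that times the enclosed code block."""
        start = time.monotonic()
        try:
            yield
        finally:
            self.timer(name, time.monotonic() - start)

    def timed(self, name):
        """Decorator that times every call of the wrapped function."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                with self.time_block(name):
                    return func(*args, **kwargs)
            return wrapper
        return decorator


METRICS = InMemoryMetricLogger()

# 1. Directly calling the logger to emit a metric.
METRICS.counter("api.nodes.get")


# 2. A decorator for instrumenting a whole function.
@METRICS.timed("conductor.sync_power_state")
def sync_power_state():
    return "power on"


# 3. A context manager for instrumenting a code block.
def write_image():
    with METRICS.time_block("agent.write_image"):
        return sum(range(1000))
```

Each measurement is recorded on its own; nothing ties one timer to another, which is the difference from tracing that comes up later in the discussion.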
devananda | https://review.openstack.org/#/c/134839/ | 18:01 |
devananda | he's proposed it openstack-wide | 18:02 |
devananda | though there's a lot of work and discussion still to do | 18:02 |
devananda | many people (myself included) still have reservations and doubts about the implementation | 18:02 |
devananda | I'm not advocating that we use it -- but rather, trying to understand if/how it is different from this | 18:02 |
devananda | which I did not think would be used to instrument *every* call | 18:02 |
aweeks | the goals of osprofiler seem reminiscent of https://github.com/twitter/zipkin | 18:03 |
*** MattMan has left #openstack-ironic | 18:04 | |
russell_h | the amount of stuff os reinvents blows my mind | 18:04 |
jroll | devananda: see lines 227-228 | 18:04 |
jroll | Developers will have to make sure that they're adding profiling hooks for new | 18:04 |
jroll | APIs, DB calls, etc. | 18:04 |
JayF | russell_h: and sending the quantity of metrics I hope we send via rabbitmq seems like a recipe for horrible and hilarious failures | 18:04 |
jroll | so we still have to decorate all the things | 18:04 |
*** rameshg87 has quit IRC | 18:05 | |
dtantsur | have a great weekend, folks! | 18:10 |
*** dtantsur is now known as dtantsur|afk | 18:10 | |
jroll | JayF: how will you get your "RPC is slowing down" metrics :P | 18:12 |
*** achanda has quit IRC | 18:13 | |
jroll | devananda: so, do you disagree that we should do this? we could even have an osprofiler backend if that's what the people want. | 18:13 |
devananda | jroll: it sounds like this proposal is for exactly the same use-case as osprofiler | 18:14 |
devananda | jroll: amirite? | 18:14 |
jroll | devananda: yes, plus the ability to have it always on without killing performance | 18:15 |
jroll | and the ability to not use rabbit | 18:15 |
jroll | and the ability to not use ceilometer | 18:16 |
devananda | right | 18:16 |
devananda | so osprofiler now has a pluggable mongo backend, instead of ceilometer | 18:16 |
devananda | which doesn't make me particularly happy | 18:16 |
JayF | that's still not a metrics store that anyone outside of openstack uses | 18:16 |
devananda | right | 18:16 |
lucasagomes | jroll, JayF (off-topic): if you guys have some free time please take a look at the configdrive patches? Ironic: https://review.openstack.org/#/c/143510/ Nova: https://review.openstack.org/#/c/144792/ (base patch) https://review.openstack.org/#/c/145235/ (integration with Swift) | 18:16 |
devananda | and there are license issues with it | 18:16 |
JayF | I think having an osprofiler backend to what aweeks has proposed is a great idea | 18:17 |
jroll | lucasagomes: I'll add them to my list | 18:17 |
jroll | JayF++ wouldn't be hard to implement, either | 18:18 |
lucasagomes | cool, I tested it locally w/ and w/o Swift seems good, you may want to test it with IPA as well | 18:18 |
aweeks | JayF: jroll: to be fair, metrics ⊊ tracing | 18:19 |
JayF | that's not a != sign in my font fwiw | 18:20 |
JayF | lol | 18:20 |
aweeks | JayF: proper subset symbol is what I was going for | 18:20 |
JayF | aweeks: then it might be that. I have no idea what that is. | 18:20 |
NobodyCam | brb quick walkies | 18:21 |
devananda | aweeks: yes, subset symbol | 18:22 |
devananda | aweeks: and you are correct | 18:22 |
aweeks | what I'm trying to say is that trying to make a metrics library have a tracing library as a back end isn't the correct direction to do it | 18:22 |
devananda | tracing provides a lot more granular information | 18:23 |
devananda | if this proposal aims to meet the same use-case that osprofiler does (and it sounds like they do) | 18:23 |
aweeks | my proposal was specifically not tracing | 18:24 |
devananda | aweeks: your implementation is not tracing, but isn't the use case the same? | 18:24 |
aweeks | devananda: the use cases are a subset of the use cases for tracing | 18:25 |
aweeks | but I deliberately avoided tracing, because doing that properly is much larger problem | 18:25 |
openstackgerrit | Merged openstack/ironic: Updated from global requirements https://review.openstack.org/145884 | 18:25 |
aweeks | and requires a lot of coordination across projects | 18:25 |
*** Masahiro has joined #openstack-ironic | 18:25 | |
devananda | aweeks: so is it reasonable to do both at the same time? | 18:26 |
aweeks | trace and collect metrics simultaneously you mean? | 18:26 |
devananda | yes | 18:26 |
aweeks | I don't think that would be a problem | 18:27 |
devananda | ok | 18:27 |
*** erwan_taf has quit IRC | 18:27 | |
devananda | so to make that clear, this proposal should not be a subset of tracing, but a non-intersecting set | 18:27 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/ironic: Updated from global requirements https://review.openstack.org/146156 | 18:27 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/ironic-python-agent: Updated from global requirements https://review.openstack.org/145885 | 18:28 |
devananda | rather than expose timing and tracing of RPC calls, which it proposes now | 18:28 |
devananda | expose timing and counting of completed operations | 18:29 |
devananda | which is also proposed in here. basically, make the distinction clearer | 18:29 |
aweeks | kk | 18:29 |
jroll | I don't think this spec was proposing tracing RPC calls | 18:29 |
devananda | does that make sense // seem reasonable // still meet the needs you (and everyone else) has? | 18:29 |
devananda | L46: * Counting and timing of RPCs | 18:30 |
jroll | or tracing anything | 18:30 |
jroll | yes, timing | 18:30 |
jroll | not tracing | 18:30 |
jroll | time the dispatch | 18:30 |
jroll | time the method call | 18:30 |
aweeks | I think we will necessarily have some overlap with what tracing could provide | 18:30 |
jroll | (or maybe just the calls that the RPC method makes, e.g. time bmc pokes) | 18:30 |
*** Masahiro has quit IRC | 18:30 | |
aweeks | but the goal is to keep that small | 18:30 |
*** jcoufal has joined #openstack-ironic | 18:31 | |
devananda | great | 18:32 |
devananda | aweeks: perhaps I misunderstood, but somehow had the impression that this included both timing and tracing | 18:33 |
aweeks | devananda: no, not tracing--all of the measurements occur independently | 18:33 |
devananda | aweeks: so some more clarity around that, an example or two of the decorator / instrumentation, and a list of the initial calls that will be instrumented (to consider the BP as complete) | 18:33 |
aweeks | as in, there is no token that gets passed around in order to trace the order in which things are called | 18:34 |
aweeks | devananda: ok, cool, I'll get on that | 18:34 |
NobodyCam | devananda: not to distract you but have you seen rloo's comments on 145389 | 18:35 |
devananda | aweeks: right. your aim is "graph all the counters", not "figure out why $this call is slow" | 18:35 |
devananda | NobodyCam: not yet | 18:35 |
NobodyCam | :) | 18:36 |
aweeks | devananda: yes, if I understand you correctly | 18:36 |
devananda | NobodyCam: ah, got it. interesting. I was planning to remove that later, yes | 18:38 |
aweeks | I think some example code will clear things up, in any case | 18:39 |
devananda | aweeks: indeed. thanks | 18:40 |
Shrews | bnemec: https://review.openstack.org/146156 is the change you need for TripleO? | 18:48 |
bnemec | Shrews: That's the lib, although we're actually fixed already because a new oslo.utils released and gets picked up automatically because of the >=. | 18:49 |
Shrews | bnemec: ah, great. just checking | 18:50 |
bnemec | Shrews: Yep, thanks | 18:50 |
NobodyCam | brb | 18:51 |
*** bnemec is now known as beekneemech | 18:51 | |
*** harlowja_away is now known as harlowja | 18:52 | |
*** humble_ has quit IRC | 18:53 | |
-openstackstatus- NOTICE: paste.openstack.org is going offline for a database migration (duration: ~2 minutes) | 18:58 | |
*** alexpilotti has quit IRC | 19:03 | |
*** ijw has joined #openstack-ironic | 19:04 | |
*** ijw_ has joined #openstack-ironic | 19:06 | |
*** dlaube has joined #openstack-ironic | 19:08 | |
*** ijw has quit IRC | 19:09 | |
*** jcoufal has quit IRC | 19:12 | |
*** jcoufal has joined #openstack-ironic | 19:13 | |
*** Marga_ has joined #openstack-ironic | 19:16 | |
jjohnson2 | NobodyCam, evice Support : | 19:19 |
jjohnson2 | [root@odin ~]# ipmitool -I lanplus -U USERID -P password -H 127.0.0.1 mc info|grep Firm | 19:19 |
jjohnson2 | Firmware Revision : 1.00 | 19:19 |
jjohnson2 | my fake bmc is version 1.00 | 19:20 |
JayF | Easy review; some oslo bumps for IPA that pass tempest: https://review.openstack.org/#/c/145885/3 | 19:20 |
NobodyCam | jjohnson2: slick | 19:22 |
jroll | JayF: +A | 19:23 |
NobodyCam | devananda: see jjohnson2 comment ^^^ on ipmi listener thingy | 19:23 |
*** EmilienM|afk is now known as EmilienM | 19:24 | |
*** pelix has quit IRC | 19:34 | |
devananda | jjohnson2: nice! | 19:40 |
devananda | hm, where'd bnemec go? | 19:41 |
jjohnson2 | I'm almost done making the more readable layer and a sample | 19:41 |
openstackgerrit | Lucas Alvares Gomes proposed openstack/ironic: Add support for local boot (PoC) https://review.openstack.org/146189 | 19:47 |
NobodyCam | devananda: I think he is using the nick beekneemech: today (15:56 |-INFO > bnemec is now known as beekneemech) | 19:49 |
beekneemech | Yep, casual nick friday. ;-) | 19:50 |
lucasagomes | folks I will call it a day | 19:50 |
lucasagomes | have a good night, enjoy the weekend! | 19:50 |
NobodyCam | have a great weekend lucasagomes | 19:50 |
lucasagomes | NobodyCam, you too, enjoy it! | 19:50 |
* NobodyCam may sneak out a little early to start enjoying it) hehehee | 19:51 | |
*** lucasagomes is now known as lucas-dinner | 19:51 | |
*** dprince has quit IRC | 19:53 | |
openstackgerrit | Merged openstack/ironic-python-agent: Updated from global requirements https://review.openstack.org/145885 | 20:00 |
*** Marga_ has quit IRC | 20:05 | |
*** andreykurilin has quit IRC | 20:09 | |
*** andreykurilin has joined #openstack-ironic | 20:09 | |
*** Masahiro has joined #openstack-ironic | 20:14 | |
*** kbyrne has quit IRC | 20:15 | |
*** kbyrne has joined #openstack-ironic | 20:16 | |
*** Masahiro has quit IRC | 20:19 | |
*** ijw_ is now known as ijw | 20:22 | |
jjohnson2 | ok, I can turn it on | 20:25 |
jjohnson2 | and off | 20:26 |
openstackgerrit | Jarrod Johnson proposed stackforge/pyghmi: Implement server side IPMI protocol (WIP) https://review.openstack.org/138109 | 20:32 |
jjohnson2 | https://review.openstack.org/#/c/138109/12/bin/fakebmc | 20:33 |
jjohnson2 | NobodyCam, devananda: look like a decent interface to write whatever you want? | 20:34 |
openstackgerrit | John Trowbridge proposed stackforge/ironic-discoverd: Changes utils.get_keystone(token) to utils.is_admin(token) https://review.openstack.org/145657 | 20:35 |
*** ijw_ has joined #openstack-ironic | 20:38 | |
*** ijw__ has joined #openstack-ironic | 20:39 | |
openstackgerrit | Jarrod Johnson proposed stackforge/pyghmi: Implement server side IPMI protocol (WIP) https://review.openstack.org/138109 | 20:40 |
*** ijw has quit IRC | 20:42 | |
NobodyCam | jjohnson2: neat... so this is where we could add stuff like set next boot device? | 20:42 |
jjohnson2 | NobodyCam, so fakebmc implements the interface in ipmi.bmc | 20:42 |
*** ijw_ has quit IRC | 20:43 | |
jjohnson2 | ipmi.bmc is where I'd translate bytecodes to function/parameter names | 20:43 |
jjohnson2 | so fakebmc is how I use that to make a dummy bmc to track just power on/off | 20:43 |
NobodyCam | ahh yep I see now | 20:43 |
jjohnson2 | trying to be an example that someone could reference in writing whatever the real goal is | 20:44 |
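The split jjohnson2 describes (one layer translating raw command bytes into named calls, and a dummy BMC that only tracks power) might be pictured like this. It is a standalone sketch and does not use or reproduce pyghmi's actual interface: `FakeBmc` and `dispatch` are hypothetical, and the byte values are my reading of the IPMI chassis commands (NetFn 0x00, Get Chassis Status 0x01, Chassis Control 0x02 with data 0x00 for power down and 0x01 for power up).

```python
CHASSIS_NETFN = 0x00       # assumed IPMI chassis network function
GET_CHASSIS_STATUS = 0x01  # assumed command: query power state
CHASSIS_CONTROL = 0x02     # assumed command: change power state


class FakeBmc:
    """Toy BMC that only remembers whether the chassis is powered on."""

    def __init__(self):
        self.powered_on = False

    def get_power_state(self):
        return "on" if self.powered_on else "off"

    def power_on(self):
        # Real code (e.g. starting a VM) could go here.
        self.powered_on = True

    def power_off(self):
        # Real code (e.g. stopping a VM) could go here.
        self.powered_on = False


def dispatch(bmc, netfn, command, data=b""):
    """Translate a raw (netfn, command, data) request into a method call."""
    if netfn != CHASSIS_NETFN:
        raise NotImplementedError("only chassis requests are handled")
    if command == GET_CHASSIS_STATUS:
        return bmc.get_power_state()
    if command == CHASSIS_CONTROL and data:
        handlers = {0x00: bmc.power_off, 0x01: bmc.power_on}
        handler = handlers.get(data[0])
        if handler is not None:
            handler()
            return bmc.get_power_state()
    raise NotImplementedError("unsupported chassis request")
```

Supporting something like "set next boot device" would then mean adding one more method on the fake BMC and one more entry in the translation layer.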
NobodyCam | :) maybe a comment around the print commands making it clear that real code could go here | 20:45 |
NobodyCam | type stuff | 20:45 |
openstackgerrit | Jarrod Johnson proposed stackforge/pyghmi: Implement server side IPMI protocol (WIP) https://review.openstack.org/138109 | 20:47 |
openstackgerrit | John Trowbridge proposed stackforge/ironic-discoverd: Changes utils.get_keystone(token) to utils.is_admin(token) https://review.openstack.org/145657 | 20:48 |
NobodyCam | anyone up for a minor fsm review? https://review.openstack.org/#/c/145389 | 20:59 |
jroll | I'm going to review that entire tree this afternoon | 20:59 |
jroll | just need to find some braveness and a bit of caffeine | 20:59 |
NobodyCam | lol | 21:00 |
NobodyCam | ++ | 21:00 |
* jroll is planning on an upstream afternoon | 21:00 | |
Shrews | NobodyCam: i'm ok with that, but i was wondering if deva was going to pull out that bit of code ruby mentioned | 21:02 |
NobodyCam | Shrews: 18:38 | devananda > NobodyCam: ah, got it. interesting. I was planning to remove that later, yes | 21:03 |
NobodyCam | so yes but in another patch I believe | 21:04 |
Shrews | NobodyCam: ah! then i'm happy to approve that, unless jroll wants a chance to look at it first | 21:04 |
jroll | Shrews: if you want to, go ahead | 21:04 |
NobodyCam | :) | 21:04 |
Shrews | done! | 21:04 |
NobodyCam | w00 h00 | 21:05 |
NobodyCam | looks like the comments on 140868 can easily be addressed. I can do that once I return from brb... | 21:09 |
*** ryanpetrello_ has joined #openstack-ironic | 21:10 | |
*** ryanpetrello has quit IRC | 21:12 | |
*** ryanpetrello_ is now known as ryanpetrello | 21:12 | |
*** achanda has joined #openstack-ironic | 21:13 | |
*** lucas-dinner has quit IRC | 21:15 | |
NobodyCam | devananda: mind if I take a whack at 140868, or are you currently working on it? | 21:16 |
NobodyCam | brb | 21:16 |
openstackgerrit | Steven Dake proposed openstack/ironic-specs: Override boot options via glance property https://review.openstack.org/144235 | 21:18 |
*** achanda has quit IRC | 21:18 | |
*** ijw has joined #openstack-ironic | 21:22 | |
*** ijw_ has joined #openstack-ironic | 21:23 | |
*** ijw__ has quit IRC | 21:25 | |
*** ijw has quit IRC | 21:27 | |
*** krtaylor has quit IRC | 21:35 | |
*** mjturek has quit IRC | 21:39 | |
*** mjturek has joined #openstack-ironic | 21:39 | |
*** krtaylor has joined #openstack-ironic | 21:47 | |
*** andreykurilin has quit IRC | 21:51 | |
*** andreykurilin has joined #openstack-ironic | 21:51 | |
*** Marga_ has joined #openstack-ironic | 21:55 | |
*** Marga_ has quit IRC | 21:55 | |
*** Marga_ has joined #openstack-ironic | 21:56 | |
*** Marga_ has quit IRC | 21:57 | |
*** Marga_ has joined #openstack-ironic | 21:57 | |
openstackgerrit | Merged openstack/ironic: Minor changes to state model https://review.openstack.org/145389 | 21:58 |
* devananda is back | 22:01 | |
*** Marga_ has quit IRC | 22:03 | |
*** Masahiro has joined #openstack-ironic | 22:03 | |
*** Marga_ has joined #openstack-ironic | 22:04 | |
NobodyCam | wb devananda | 22:06 |
openstackgerrit | Jarrod Johnson proposed stackforge/pyghmi: Implement server side IPMI protocol (WIP) https://review.openstack.org/138109 | 22:07 |
jjohnson2 | NobodyCam, devananda the next wip includes boot device management to some extent, not working, but weekend time | 22:07 |
jjohnson2 | it can get a value, but setting isn't working | 22:08 |
*** Masahiro has quit IRC | 22:08 | |
NobodyCam | jjohnson2: I'll take a full look in a bit | 22:09 |
jjohnson2 | I won't look again til monday | 22:09 |
NobodyCam | awesome work. thank you .. and have a great weekend | 22:09 |
jjohnson2 | but I might remove WIP on monday | 22:09 |
jjohnson2 | WIP removal because: good enough for ironic | 22:09 |
NobodyCam | :) | 22:09 |
NobodyCam | will this rev pass pep8 tests lol | 22:10 |
NobodyCam | J/K | 22:10 |
NobodyCam | hehehe | 22:10 |
jjohnson2 | NobodyCam, this one did ;) | 22:10 |
NobodyCam | hehehe :) ++ | 22:10 |
jjohnson2 | I switched to an editor that nags me in the margin about any pep8 stuff | 22:11 |
jjohnson2 | but now I just stop looking at the margin colors... | 22:11 |
*** ijw_ is now known as ijw | 22:14 | |
NobodyCam | lol | 22:14 |
openstackgerrit | Josh Gachnang proposed openstack/ironic-specs: Implement Cleaning States https://review.openstack.org/102685 | 22:17 |
*** jjohnson2 has quit IRC | 22:18 | |
NobodyCam | brb | 22:22 |
*** ijw has quit IRC | 22:30 | |
*** ijw has joined #openstack-ironic | 22:31 | |
*** vdrok has quit IRC | 22:31 | |
*** ijw_ has joined #openstack-ironic | 22:33 | |
openstackgerrit | Chris Krelle proposed openstack/ironic: Enable async callbacks from task.process_event() https://review.openstack.org/140868 | 22:33 |
*** ijw has quit IRC | 22:36 | |
*** ijw_ has quit IRC | 22:41 | |
*** ijw has joined #openstack-ironic | 22:41 | |
*** ijw_ has joined #openstack-ironic | 22:48 | |
*** ijw__ has joined #openstack-ironic | 22:50 | |
*** ijw has quit IRC | 22:51 | |
*** ijw_ has quit IRC | 22:53 | |
*** ijw__ is now known as ijw | 22:55 | |
*** jcoufal has quit IRC | 23:00 | |
*** EmilienM is now known as EmilienM|afk | 23:01 | |
*** Marga_ has quit IRC | 23:03 | |
*** Marga_ has joined #openstack-ironic | 23:04 | |
*** Marga_ has quit IRC | 23:04 | |
*** Marga_ has joined #openstack-ironic | 23:04 | |
*** ijw_ has joined #openstack-ironic | 23:08 | |
*** ijw has quit IRC | 23:11 | |
*** ijw has joined #openstack-ironic | 23:12 | |
*** Marga_ has quit IRC | 23:12 | |
*** Marga_ has joined #openstack-ironic | 23:12 | |
openstackgerrit | Chris Krelle proposed openstack/ironic: Enable async callbacks from task.process_event() https://review.openstack.org/140868 | 23:13 |
*** Marga_ has quit IRC | 23:13 | |
*** Marga_ has joined #openstack-ironic | 23:13 | |
*** ijw_ has quit IRC | 23:15 | |
NobodyCam | oh why can't I get 140869 to rebase cleanly | 23:23 |
*** ChuckC_ has joined #openstack-ironic | 23:28 | |
*** ChuckC has quit IRC | 23:29 | |
*** jeblair is now known as corvus | 23:33 | |
*** eghobo has joined #openstack-ironic | 23:36 | |
*** ryanpetrello has quit IRC | 23:36 | |
*** ryanpetrello has joined #openstack-ironic | 23:37 | |
*** ChuckC_ has quit IRC | 23:39 | |
*** ryanpetrello has quit IRC | 23:42 | |
*** pcaruana|afk| has quit IRC | 23:46 | |
openstackgerrit | Chris Krelle proposed openstack/ironic: Convert check_deploy_timeout to use process_event https://review.openstack.org/140869 | 23:50 |
*** Masahiro has joined #openstack-ironic | 23:52 | |
*** spandhe has quit IRC | 23:56 | |
*** Masahiro has quit IRC | 23:56 | |
*** ChuckC_ has joined #openstack-ironic | 23:57 |