*** yuanying_ has joined #openstack-ironic | 00:01 | |
*** yuanyin__ has joined #openstack-ironic | 00:04 | |
*** yuanying has quit IRC | 00:04 | |
jroll | hiya Haomeng|2 :) | 00:05 |
---|---|---|
jroll | Haomeng|2: I fixed the link here for you :) | 00:06 |
jroll | https://review.openstack.org/#/c/128388/ | 00:06 |
*** yuanying_ has quit IRC | 00:06 | |
*** rainya has quit IRC | 00:11 | |
Haomeng|2 | jroll: thank you:) | 00:27 |
Haomeng|2 | jroll: +2 | 00:29 |
Haomeng|2 | jroll: :) | 00:29 |
*** hemna has quit IRC | 00:42 | |
*** praneshp has joined #openstack-ironic | 00:46 | |
*** ChuckC has quit IRC | 00:48 | |
*** ChuckC has joined #openstack-ironic | 00:54 | |
*** yuanyin__ has quit IRC | 01:00 | |
*** yuanying has joined #openstack-ironic | 01:03 | |
rloo | jroll: wrt 128388 -- the spec hasn't been approved so I -2'd it. | 01:03 |
*** rainya has joined #openstack-ironic | 01:04 | |
*** yuanying has quit IRC | 01:05 | |
*** yuanying_ has joined #openstack-ironic | 01:05 | |
*** praneshp has quit IRC | 01:21 | |
*** yongli has quit IRC | 01:32 | |
*** nosnos has joined #openstack-ironic | 01:32 | |
Haomeng|2 | rloo: ok:) | 01:35 |
rloo | Haomeng|2: :-) | 01:35 |
Haomeng|2 | rloo: :) | 01:35 |
rloo | Haomeng|2: I just saw that you changed your vote. I think it is probably fine to eg +2 to show that you think it is good. We just don't want to accidentally merge it, that's all. | 01:41 |
Haomeng|2 | rloo: np, I will vote +2 once the spec is approved:) | 01:41 |
Haomeng|2 | rloo: :) | 01:41 |
rloo | Haomeng|2: okay. | 01:42 |
Haomeng|2 | rloo: :) | 01:42 |
Haomeng|2 | rloo: I should follow the process, missed the spec:) | 01:43 |
rloo | Haomeng|2: so the process you took was fine. You reviewed it and gave a +2. That's OK. What we don't want to do is +1 Approve it. | 01:45 |
Haomeng|2 | rloo: ok:) | 01:45 |
Haomeng|2 | rloo: :) | 01:46 |
*** spandhe_ has quit IRC | 01:46 | |
*** marcoemorais has quit IRC | 01:46 | |
rloo | Haomeng|2: if jroll gets two +2's on that before the spec is approved, I should be able to remove the -2 and someone would just need to approve. (Assuming no rebase is needed.) | 01:47 |
Haomeng|2 | rloo: ha, ok, got it:) | 01:49 |
rloo | Haomeng|2: great :D | 01:49 |
Haomeng|2 | rloo: :) | 01:49 |
*** MattMan has quit IRC | 01:54 | |
*** MattMan has joined #openstack-ironic | 01:55 | |
*** rainya has quit IRC | 01:58 | |
*** rainya has joined #openstack-ironic | 01:59 | |
*** rainya has quit IRC | 02:09 | |
*** rainya has joined #openstack-ironic | 02:20 | |
*** r-daneel has quit IRC | 02:21 | |
*** rainya has quit IRC | 02:23 | |
*** spandhe has joined #openstack-ironic | 02:28 | |
*** rainya has joined #openstack-ironic | 02:32 | |
*** spandhe__ has joined #openstack-ironic | 02:33 | |
*** spandhe has quit IRC | 02:35 | |
*** spandhe__ is now known as spandhe | 02:35 | |
*** achanda_ has joined #openstack-ironic | 02:43 | |
*** achanda__ has joined #openstack-ironic | 02:43 | |
*** rainya has quit IRC | 02:46 | |
*** achanda has quit IRC | 02:46 | |
*** achanda_ has quit IRC | 02:47 | |
*** harlowja is now known as harlowja_away | 02:47 | |
*** rainya_ has joined #openstack-ironic | 02:49 | |
*** rainya has joined #openstack-ironic | 02:51 | |
*** rloo has quit IRC | 02:51 | |
*** rainya_ has quit IRC | 02:52 | |
*** vinbs has joined #openstack-ironic | 03:02 | |
*** ramineni has joined #openstack-ironic | 03:22 | |
*** jcoufal has joined #openstack-ironic | 03:27 | |
*** dlaube has quit IRC | 03:40 | |
*** Haomeng has joined #openstack-ironic | 03:45 | |
*** Haomeng|2 has quit IRC | 03:46 | |
*** ramineni has quit IRC | 03:53 | |
*** Poornima has joined #openstack-ironic | 04:31 | |
*** ramineni has joined #openstack-ironic | 04:40 | |
*** pcrews has quit IRC | 04:42 | |
*** jcoufal has quit IRC | 04:52 | |
openstackgerrit | Anusha Ramineni proposed a change to openstack/ironic: Update node-validate error messages https://review.openstack.org/128862 | 04:56 |
*** chenglch has joined #openstack-ironic | 04:59 | |
*** achanda has joined #openstack-ironic | 04:59 | |
*** achanda__ has quit IRC | 05:02 | |
*** lazy_prince has quit IRC | 05:11 | |
*** praneshp has joined #openstack-ironic | 05:21 | |
*** vinbs has quit IRC | 05:31 | |
*** teju has joined #openstack-ironic | 05:33 | |
*** vinbs has joined #openstack-ironic | 05:34 | |
openstackgerrit | Michael Davies proposed a change to openstack/ironic: Put a cap on our cyclomatic complexity https://review.openstack.org/129132 | 05:38 |
*** Nisha has joined #openstack-ironic | 05:42 | |
*** k4n0 has joined #openstack-ironic | 06:09 | |
*** pensu has joined #openstack-ironic | 06:12 | |
*** achanda has quit IRC | 06:15 | |
*** achanda has joined #openstack-ironic | 06:16 | |
*** achanda has quit IRC | 06:16 | |
*** praneshp_ has joined #openstack-ironic | 06:23 | |
*** praneshp has quit IRC | 06:27 | |
*** praneshp_ is now known as praneshp | 06:27 | |
openstackgerrit | Michael Davies proposed a change to openstack/ironic: Bring in Nova's pylint tox config https://review.openstack.org/118270 | 06:32 |
*** killer_prince has joined #openstack-ironic | 06:33 | |
*** killer_prince is now known as lazy_prince | 06:33 | |
*** andreykurilin_ has joined #openstack-ironic | 06:56 | |
*** spandhe has quit IRC | 07:17 | |
*** ifarkas has joined #openstack-ironic | 07:20 | |
*** andreykurilin_ has quit IRC | 07:32 | |
*** rameshg87 has joined #openstack-ironic | 07:35 | |
*** pradipta_away is now known as pradipta | 08:06 | |
*** jcoufal has joined #openstack-ironic | 08:24 | |
*** vinbs has quit IRC | 08:24 | |
*** jistr has joined #openstack-ironic | 08:24 | |
*** derekh has joined #openstack-ironic | 08:26 | |
*** lucasagomes has joined #openstack-ironic | 08:31 | |
*** wendar_ has joined #openstack-ironic | 08:38 | |
*** wendar has quit IRC | 08:40 | |
*** Isotopp has quit IRC | 08:43 | |
*** Isotopp has joined #openstack-ironic | 08:43 | |
*** dtantsur|afk is now known as dtantsur | 08:47 | |
dtantsur | Morning, TGIF! | 08:47 |
Haomeng | dtantsur: morning:) | 08:49 |
*** jcoufal has quit IRC | 08:53 | |
*** jcoufal has joined #openstack-ironic | 08:54 | |
openstackgerrit | Roman Dashevsky proposed a change to openstack/ironic: Add driver for supports Aten PDU's https://review.openstack.org/129174 | 09:05 |
*** athomas has joined #openstack-ironic | 09:13 | |
*** lazy_prince has quit IRC | 09:19 | |
*** praneshp has quit IRC | 09:19 | |
*** Nisha has quit IRC | 09:20 | |
*** pelix has joined #openstack-ironic | 09:44 | |
*** chenglch has quit IRC | 09:57 | |
*** athomas has quit IRC | 09:58 | |
*** killer_prince has joined #openstack-ironic | 10:06 | |
*** killer_prince is now known as lazy_prince | 10:07 | |
*** athomas has joined #openstack-ironic | 10:09 | |
*** rameshg87_ has joined #openstack-ironic | 10:10 | |
*** rameshg87 has quit IRC | 10:11 | |
*** yuanying_ has quit IRC | 10:18 | |
*** jcoufal has quit IRC | 10:18 | |
*** yuanying has joined #openstack-ironic | 10:19 | |
*** yuanying has quit IRC | 10:19 | |
*** yuanying has joined #openstack-ironic | 10:20 | |
*** yuanying has quit IRC | 10:24 | |
*** Haomeng|2 has joined #openstack-ironic | 10:43 | |
*** Haomeng has quit IRC | 10:45 | |
*** pensu has quit IRC | 10:45 | |
*** rameshg87 has joined #openstack-ironic | 10:52 | |
*** rameshg87 has quit IRC | 10:53 | |
*** rameshg87_ has quit IRC | 10:55 | |
ramineni | dtantsur: hi | 10:56 |
dtantsur | ramineni, o/ | 10:56 |
ramineni | dtantsur, regarding your comment on https://review.openstack.org/#/c/128862/2/ironic/drivers/modules/deploy_utils.py | 10:56 |
ramineni | dtantsur, if we add it to the error msg, the msg will become redundant | 10:57 |
*** jcoufal has joined #openstack-ironic | 10:57 | |
*** jcoufal has quit IRC | 10:57 | |
dtantsur | ramineni, it's not that bad, though you can simplify message in deploy_utils like "%(error_msg)s. Missing are: %(missing)s" | 10:57 |
*** jcoufal has joined #openstack-ironic | 10:57 | |
dtantsur | but anyway being redundant is much better than being unclear | 10:58 |
ramineni | dtantsur, all other functions which call check_for_missing_params have missing params in driver_info | 10:58 |
ramineni | dtantsur , ya it sounds better | 10:58 |
dtantsur | IMO it anyway does not justify hardcoding possible variants in deploy_utils(). we should be able to add new field there without analyzing the whole code base :) | 10:59 |
ramineni | dtantsur , you meant what Yuriy suggested? | 11:00 |
dtantsur | ramineni, I mean what you have now. With it you have to remember to update the 'if' statement when adding new possible items to instance_info | 11:00 |
ramineni | dtantsur, hmm, your suggestion to add it to the error msg and just add " Missing are:" in check_params sounds more effective. No need to change anything when a new field is added | 11:04 |
dtantsur | right | 11:05 |
ramineni | dtantsur , will modify accordingly , thanks :) | 11:05 |
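
The message format dtantsur suggests at 10:57 can be sketched as follows. This is a minimal illustration, not the code under review in 128862: the `MissingParameterValue` class here is a local stand-in, and the function signature and the `value is None` test are assumptions made for the example. The point is that the caller supplies the context in `error_msg`, so no field names are hardcoded in the helper.

```python
class MissingParameterValue(Exception):
    """Local stand-in for the exception a driver would raise (assumption)."""


def check_for_missing_params(info_dict, error_msg, param_prefix=""):
    """Raise if any value in info_dict is unset.

    The caller passes the context-specific error_msg; this helper only
    appends the list of missing keys, so new fields need no changes here.
    """
    missing = [param_prefix + key
               for key, value in sorted(info_dict.items())
               if value is None]
    if missing:
        raise MissingParameterValue(
            "%(error_msg)s. Missing are: %(missing)s"
            % {"error_msg": error_msg, "missing": missing})
```

A caller validating `instance_info` and another validating `driver_info` would each pass their own `error_msg`, which is what makes the `if` statement over known field names unnecessary.
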
*** ramineni has quit IRC | 11:05 | |
*** yuanying has joined #openstack-ironic | 11:08 | |
*** Haomeng has joined #openstack-ironic | 11:29 | |
Shrews | dtantsur: ugh. i want to slap the person that decided getting rid of long in py3 was a good idea | 11:29 |
*** Haomeng|2 has quit IRC | 11:30 | |
dtantsur | Shrews, well, unifying things under one int() type makes some sense for dynamic language... | 11:30 |
Shrews | dtantsur: but soooo many programs use long() already. seems they could have just made the two equivalent | 11:31 |
dtantsur | Shrews, IIRC even in Py2 you can use just int() for casting, it will produce long on demand | 11:35 |
Shrews | dtantsur: i think so | 11:36 |
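
dtantsur's point at 11:35 is that `int()` alone is enough for casting on both interpreters: Python 2 promotes the result to `long` when the value does not fit in a machine word, and Python 3 has a single arbitrary-precision `int`. A small sketch, assuming the value being converted is a hex digest (the hashing step and the `hash_to_int` name are illustrative, not taken from the patch being discussed):

```python
import hashlib


def hash_to_int(key):
    """Convert the MD5 hex digest of key to an integer using int() only."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    # int(..., 16) handles a 128-bit value without any explicit long() call.
    return int(digest, 16)


value = hash_to_int("node-1")
```
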
*** derekh has quit IRC | 11:52 | |
*** ifarkas has quit IRC | 11:52 | |
*** derekh has joined #openstack-ironic | 11:56 | |
*** teju has left #openstack-ironic | 12:05 | |
*** dprince has joined #openstack-ironic | 12:11 | |
*** Haomeng|2 has joined #openstack-ironic | 12:18 | |
*** Haomeng has quit IRC | 12:19 | |
*** pensu has joined #openstack-ironic | 12:20 | |
*** pensu has quit IRC | 12:25 | |
*** pradipta is now known as pradipta_away | 12:27 | |
*** Poornima has quit IRC | 12:29 | |
*** nosnos has quit IRC | 12:41 | |
*** nosnos has joined #openstack-ironic | 12:42 | |
*** pensu has joined #openstack-ironic | 12:43 | |
*** nosnos has quit IRC | 12:46 | |
*** jjohnson2 has joined #openstack-ironic | 12:54 | |
*** k4n0 has quit IRC | 12:55 | |
*** rameshg87 has joined #openstack-ironic | 12:55 | |
*** igordcard has joined #openstack-ironic | 13:05 | |
*** pensu has quit IRC | 13:17 | |
*** openstackgerrit has quit IRC | 13:19 | |
*** openstackgerrit has joined #openstack-ironic | 13:20 | |
MattMan | Just looking at ironic juno requirements, and I see pyghmi no longer a requirement, is this likely to remain the case ? | 13:24 |
lucasagomes | MattMan, yes, it's a third-party library | 13:26 |
lucasagomes | just like seamicro client, ilo client etc | 13:26 |
lucasagomes | it's a dependency for a specific driver, not for ironic in general | 13:27 |
MattMan | Great thanks, so really only an dependency of the driver you are using depends on it. | 13:27 |
MattMan | s/of/if/ | 13:28 |
*** rameshg87 has quit IRC | 13:32 | |
lucasagomes | yea | 13:35 |
*** r-daneel has joined #openstack-ironic | 13:37 | |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: TestAgentVendor to use the fake_agent driver https://review.openstack.org/129251 | 13:41 |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: Add a basic mechanism to route and validate vendor methods https://review.openstack.org/129261 | 13:59 |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: Add a basic mechanism to route and validate vendor methods https://review.openstack.org/129261 | 14:07 |
openstackgerrit | John Trowbridge proposed a change to openstack/python-ironicclient: Adds tty password entry for ironicclient https://review.openstack.org/129010 | 14:07 |
*** spandhe_ has joined #openstack-ironic | 14:14 | |
*** spandhe_ has quit IRC | 14:14 | |
lucasagomes | dtantsur, didn't get ur comment | 14:15 |
dtantsur | mmm? | 14:15 |
lucasagomes | "left comments on how I see it" | 14:15 |
lucasagomes | https://review.openstack.org/#/c/129261/2 | 14:16 |
dtantsur | we need to get rid of actual vendor_passthru methods in all or most drivers | 14:16 |
*** ndipanov is now known as ndipanoff | 14:16 | |
lucasagomes | oh my page wasn't updated | 14:16 |
lucasagomes | now I see ur comments | 14:16 |
dtantsur | lucasagomes, sorry, looks like you pushed one more revision in the meanwhile | 14:16 |
dtantsur | and I didn't notice it while leaving a review | 14:16 |
lucasagomes | dtantsur, it's alright... right so the base class would take the function from the routes | 14:19 |
dtantsur | yeah, call it, probably convert exceptions to ironic ones | 14:20 |
dtantsur | this is how it's implemented in all drivers actually | 14:20 |
lucasagomes | right, yeah there's some small diffs but I think I can make it generic | 14:21 |
lucasagomes | dtantsur, lemme try something | 14:21 |
dtantsur | cool | 14:22 |
lucasagomes | dtantsur, I'm also thinking about using the route dict map to tell whether that method should run async or sync | 14:23 |
dtantsur | interesting idea hmm... | 14:23 |
lucasagomes | so, it would be up to the dev to say how that method should run | 14:23 |
dtantsur | makes sense to me | 14:23 |
lucasagomes | yeah | 14:23 |
lucasagomes | and after that I want to add support for all HTTP methods, and not only use POST | 14:23 |
lucasagomes | I don't know whether I should make a spec for it tho | 14:23 |
lucasagomes | it won't be a super small change | 14:24 |
lucasagomes | ideas? | 14:25 |
dtantsur | lucasagomes, I usually prefer a spec, especially since we have a lightweight spec procedure | 14:30 |
*** rwsu has joined #openstack-ironic | 14:31 | |
lucasagomes | dtantsur, right, yeah I will finish this base mechanism | 14:32 |
lucasagomes | and try to come up with a spec for the work | 14:32 |
dtantsur | cool | 14:32 |
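
The idea discussed between 14:19 and 14:23, a base class that looks the method up in a route map and lets the developer declare whether it runs sync or async, might look roughly like this. It is a sketch under stated assumptions, not the code in review 129261: the class names, the `vendor_routes` layout, the `run_async` key, and the use of a plain thread for the async case are all hypothetical. Converting exceptions to ironic ones, which dtantsur also mentions, is left out.

```python
import threading


class BaseVendorInterface:
    """Hypothetical base class; not ironic's actual VendorInterface."""

    # method name -> {"func": attribute name, "run_async": bool}
    vendor_routes = {}

    def vendor_passthru(self, method, **kwargs):
        route = self.vendor_routes.get(method)
        if route is None:
            raise ValueError("Unsupported vendor method: %s" % method)
        func = getattr(self, route["func"])
        if route["run_async"]:
            # The driver author asked for async: run in the background
            # and hand the worker back to the caller.
            worker = threading.Thread(target=func, kwargs=kwargs)
            worker.start()
            return worker
        return func(**kwargs)


class ExampleVendor(BaseVendorInterface):
    """Illustrative driver that only declares routes and handlers."""

    vendor_routes = {
        "get_status": {"func": "_get_status", "run_async": False},
        "long_task": {"func": "_long_task", "run_async": True},
    }

    def __init__(self):
        self.finished = []

    def _get_status(self, node):
        return "%s: ok" % node

    def _long_task(self, node):
        self.finished.append(node)
```

With this shape, validation and logging live once in the base class, and each driver only contributes its route table.
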
dtantsur | brb | 14:34 |
*** pensu has joined #openstack-ironic | 14:43 | |
*** achanda has joined #openstack-ironic | 14:49 | |
*** achanda_ has joined #openstack-ironic | 14:50 | |
jroll | Haomeng|2: thanks | 14:52 |
jroll | morning ironic | 14:52 |
*** achanda has quit IRC | 14:54 | |
*** jjohnson2 has quit IRC | 14:54 | |
*** jjohnson2 has joined #openstack-ironic | 14:55 | |
Shrews | jroll: morning | 14:55 |
*** achanda_ has quit IRC | 14:55 | |
*** pcrews has joined #openstack-ironic | 14:57 | |
*** achanda has joined #openstack-ironic | 14:57 | |
NobodyCam | oh I slept in...TGIF... morning Ironic :) | 15:00 |
Shrews | NobodyCam: morning mr. sleepy | 15:02 |
NobodyCam | :-p gah | 15:02 |
jroll | morning NobodyCam :) | 15:03 |
jroll | hey I have a thing | 15:04 |
jroll | if mysql goes away | 15:04 |
jroll | the conductor will blow up at the next heartbeat and never heartbeat again | 15:04 |
jroll | does anyone else think it should keep trying? | 15:04 |
NobodyCam | morning jroll :) | 15:04 |
Shrews | jroll: i think you should run galera :-P | 15:05 |
Shrews | but, yeah, good question | 15:05 |
*** achanda has quit IRC | 15:05 | |
Shrews | retrying seems sensible | 15:05 |
jroll | Shrews: doesn't matter if the network blips :P | 15:05 |
jroll | ok | 15:06 |
* jroll makes a bug and a patch | 15:06 | |
Shrews | i mean, any reason NOT to retry? | 15:06 |
NobodyCam | jroll: i would say at least once to account for network blips, but I can also see where that would double the timeout values too | 15:06 |
jroll | I don't have one | 15:06 |
jroll | NobodyCam: I think which exception we catch for this should be thought through... but connection failed seems reasonable to retry forever | 15:07 |
jroll | looks like DBConnectionError | 15:08 |
*** dtantsur has quit IRC | 15:08 | |
Shrews | what are our connection attempt points? startup... idle timeouts (i assume, if we are using persistent connections)? | 15:09 |
* Shrews *hopes* we use persistent connections | 15:09 | |
jroll | Shrews: doesn't look like it | 15:10 |
jroll | oh, maybe | 15:10 |
openstackgerrit | Sam Betts proposed a change to openstack/ironic: Add logging to driver vendor_passthru functions https://review.openstack.org/129298 | 15:10 |
NobodyCam | jroll: if a conductor is decommissioned will it inform the agent to look for a new conductor (/me assumes long running agent) | 15:10 |
jroll | NobodyCam: yes, hash ring will deal with it, agent only talks to the api | 15:11 |
Shrews | jroll: oslo.db should handle persistence for us, i think/hope | 15:11 |
*** sambetts has joined #openstack-ironic | 15:12 | |
jroll | I would hope so too | 15:12 |
* NobodyCam needs to stop thinking agent is talking to the conductor.. :p | 15:12 | |
Shrews | jroll: there is already an option: #use_db_reconnect=false | 15:15 |
jroll | Shrews: what does that do? | 15:15 |
Shrews | # Enable the experimental use of database reconnect on | 15:16 |
Shrews | # connection lost. (boolean value) | 15:16 |
Shrews | jroll: it's in the sample config in the [database] section | 15:16 |
jroll | Shrews: hrm, that's great but I don't think that solves this problem | 15:16 |
Shrews | why ot? | 15:16 |
Shrews | not? | 15:16 |
jroll | well, I presume it doesn't retry forever | 15:17 |
Shrews | there are other parameters controlling that | 15:17 |
jroll | essentially if touch_conductor() raises an exception, the entire heartbeat thread dies | 15:17 |
jroll | I guess maybe I just don't trust that to work in all cases | 15:18 |
jroll | especially when it says experimental | 15:18 |
jroll | I don't think it's ok for default settings to allow the entire conductor cluster to die if the network goes away for $heartbeat_timeout | 15:18 |
jroll | it should recover | 15:18 |
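
The behaviour jroll argues for, a heartbeat loop that survives a lost database connection instead of dying on the first exception, can be sketched like this. It is an illustration only, not the patch in review 129301: `DBConnectionError` is defined locally as a stand-in for the exception jroll names, and `keepalive_loop`, its parameters, and the log text are hypothetical.

```python
import threading


class DBConnectionError(Exception):
    """Local stand-in for the DB layer's connection error (assumption)."""


def keepalive_loop(touch_conductor, stop_event, interval, log=print):
    """Call touch_conductor() every interval seconds until stop_event is set.

    A failed DB connection is logged and retried on the next tick, so the
    heartbeat thread keeps running through a network blip.
    """
    while not stop_event.is_set():
        try:
            touch_conductor()
        except DBConnectionError:
            log("Conductor could not connect to database while heartbeating.")
        # wait() returns early if stop_event is set during the sleep.
        stop_event.wait(interval)
```

Only the connection error is caught here; which other exceptions deserve a retry is the open question jroll raises at 15:07.
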
*** pradipta_away is now known as pradipta | 15:21 | |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: Add a basic mechanism to route and validate vendor methods https://review.openstack.org/129261 | 15:22 |
lucasagomes | jroll, ^ changes the agent | 15:22 |
lucasagomes | NobodyCam, Shrews jroll morning :) | 15:23 |
JayF | NobodyCam: conductor *does* talk to the agent | 15:23 |
JayF | NobodyCam: just agent talks to the api | 15:23 |
NobodyCam | morning lucasagomes | 15:23 |
jroll | lucasagomes: morning, cool :) | 15:23 |
NobodyCam | there ya go JayF confuzzing me blurry eyed state..lol...:).. and Good morning :) | 15:24 |
JayF | NobodyCam: yeah outgoing is similar to a bmc. Conductor makes all the calls. | 15:24 |
JayF | NobodyCam: Incoming has to be API, because that's the only HTTP-way to get messages to anything Ironic | 15:24 |
lucasagomes | JayF, morning too, sorry didn't see ya on the comments above | 15:25 |
NobodyCam | JayF: and its easy to load balance with run of the mill tech | 15:25 |
sambetts | lucasagomes: That patch kinda blows the one I just pushed out of the water https://review.openstack.org/#/c/129298/ | 15:25 |
NobodyCam | the api that is | 15:25 |
lucasagomes | sambetts, ouch :/ | 15:25 |
JayF | NobodyCam: exactly. Or you can use wsgiref and just overscale the API cluster *whistles innocently* | 15:26 |
lucasagomes | sambetts, hmm didn't know someone was working on it... maybe we should mix the patches? | 15:26 |
lucasagomes | sambetts, what I did was to get rid of all the custom vendor_passthru and driver_vendor_passthru methods on the drivers, and make it generic so we only need to add more logs on the base class and it will affect all drivers | 15:27 |
sambetts | lucasagomes: I just took a quick look, I only picked up that bug this morning, I didn't realise other work was going on to improve the vendor_passthru stuff, tbh your solution is cleaner | 15:29 |
*** igordcard has quit IRC | 15:29 | |
lucasagomes | sambetts, right, yeah I want to improve the vendor_passthru and driver_vendor_passthru not only to make it more consistent, but also to allow devs to tell whether the method should run sync or async | 15:30 |
lucasagomes | and support all HTTP methods instead of only POST | 15:31 |
lucasagomes | sambetts, I have to take a look closer at ur patch, and see if I can rebase mine on the top :/ | 15:31 |
lucasagomes | if possible | 15:31 |
*** igordcard has joined #openstack-ironic | 15:32 | |
lucasagomes | sambetts, and I haven't seen that bug about the logs :( sorry for stepping on ur toes | 15:33 |
openstackgerrit | David Shrewsbury proposed a change to openstack/ironic: Improve hash ring value conversion https://review.openstack.org/129031 | 15:33 |
sambetts | lucasagomes: thats ok, its not your fault, stuff like this is bound to happen with such large projects :-) I was hoping for an easy bug fix ;-) | 15:34 |
lucasagomes | sambetts, yeah :/ | 15:35 |
openstackgerrit | Jim Rollenhagen proposed a change to openstack/ironic: Continue heartbeating after DB connection failure https://review.openstack.org/129301 | 15:35 |
jroll | Shrews, NobodyCam ^ | 15:35 |
*** MattMan has left #openstack-ironic | 15:35 | |
* jroll waits for someone to tell him to wrap that exception | 15:35 | |
sambetts | lucasagomes: You might be able to use the ideas from mine to reduce some code duplication but I'm not usre | 15:36 |
sambetts | s/usre/sure | 15:36 |
openstackgerrit | Jim Rollenhagen proposed a change to openstack/ironic: Continue heartbeating after DB connection failure https://review.openstack.org/129301 | 15:36 |
lucasagomes | sambetts, I will take a look at it | 15:36 |
lucasagomes | thanks | 15:36 |
JayF | jroll: I'm thinking that bug should be tagged juno-backport-potential | 15:37 |
JayF | jroll: given how nasty it is | 15:37 |
sambetts | lucasagomes: if not I'm completely behind your patch :-) | 15:37 |
* lucasagomes brb in a call | 15:38 | |
jroll | JayF: I'll leave that to deva, juno final is cut or close | 15:38 |
* jroll brb | 15:38 | |
JayF | jroll: the -backport-potential is about getting it in the stable branch | 15:38 |
jroll | oh | 15:40 |
jroll | I mean | 15:40 |
jroll | I'll leave it to devananda :) | 15:40 |
jroll | or someone else | 15:40 |
JayF | devananda: https://bugs.launchpad.net/ironic/+bug/1382589?comments=all I think this is likely a good candidate for a juno bugfix backport | 15:40 |
*** alexpilotti has joined #openstack-ironic | 15:51 | |
alexpilotti | JayF: so would it make sense to propose a BP for an Intel AMT power driver? | 15:51 |
JayF | alexpilotti was asking in #cloud-init if we supported AMT | 15:51 |
alexpilotti | the big question I see, is who is going to do the CI testing | 15:52 |
JayF | alexpilotti: I'd think so; we use a two-step specs process: step 1: make a blueprint and fill out the first (3, iirc) sections; then once there's a consensus this is something people want in ironic, write out all the implementations | 15:52 |
JayF | alexpilotti: currently the only power driver with any CI is ipminative | 15:53 |
JayF | alexpilotti: because in Devstack we run a driver that basically controls libvirt using virsh like a bmc | 15:53 |
alexpilotti | JayF: k! | 15:53 |
NobodyCam | JayF: lol ssh is also tested in the gate :-p | 15:54 |
alexpilotti | JayF: ok, I’ll send a spec proposal on ironic-specs | 15:54 |
jroll | alexpilotti: it would have to be third-party CI, and realistically can only happen if someone can donate the resources | 15:54 |
* NobodyCam ducks | 15:54 | |
JayF | alexpilotti: drop me a link when you get it up and I'll help out with an initial review | 15:54 |
alexpilotti | jroll: we run the Hyper-V CI, we might see if we could dedicate some servers there | 15:54 |
JayF | alexpilotti: vPRO/AMT is what's in the Ubuntu orange box, right? | 15:54 |
alexpilotti | JayF: correct | 15:55 |
alexpilotti | JayF: it’s what the Intel NUC support | 15:55 |
* JayF salivates at the thought of an ironic cluster on his desk | 15:55 | |
alexpilotti | JayF: and we use tons of them at Cloudbase for dev and testing | 15:55 |
JayF | nice | 15:55 |
JayF | I bet if you donated them to Ironic devs | 15:55 |
NobodyCam | alexpilotti: got any spares around | 15:55 |
JayF | we'd test the crap out of your power drivers | 15:55 |
JayF | NobodyCam: gmta, haha | 15:55 |
alexpilotti | JayF: look at this: https://www.youtube.com/watch?v=J3lFTYvSrcQ | 15:56 |
*** andreykurilin_ has joined #openstack-ironic | 15:56 | |
alexpilotti | JayF: since the latest NUCs come w/o vPro/AMT, we ended up doing a power driver… with lego robotics :-) | 15:57 |
sambetts | alexpilotti: was that the thing that Canonical demoed at the Hong Kong summit? | 15:57 |
NobodyCam | alexpilotti: I love it | 15:57 |
alexpilotti | sambetts: yep, correct | 15:57 |
JayF | alexpilotti: that was what Shuttleworth demo'd with at Atlanta, right? | 15:57 |
alexpilotti | JayF: yep | 15:57 |
JayF | sambetts: it was Atlanta, not HK | 15:57 |
JayF | sambetts: I didn't Openstack during the HK summit and I saw it :P | 15:57 |
alexpilotti | Atlanta, sorry | 15:57 |
sambetts | Ah they all merge into one in my mind these days | 15:58 |
alexpilotti | we had it also in Atlanta on our booth | 15:58 |
JayF | We've joked before about having a log power driver | 15:58 |
JayF | "WARN: Operator, go power on the machine" | 15:58 |
alexpilotti | Mark saw it, liked it and asked to add it in the demo | 15:58 |
JayF | kinda like a robot but not quite the same | 15:58 |
JayF | haha | 15:58 |
alexpilotti | heh | 15:58 |
sambetts | haha :D | 15:58 |
alexpilotti | so maybe this time (Paris) we could use it to demo Ironic as well | 15:59 |
JayF | devananda: I'm commuting now; will likely be off the train before 10 but wanted to give you a heads up in case I run late | 15:59 |
sambetts | alexpilotti: that would be pretty awesome! I'll have to come and find your booth this time round | 16:00 |
alexpilotti | sambetts: sounds good! :-) | 16:00 |
*** dlaube has joined #openstack-ironic | 16:02 | |
JayF | devananda: scratch that; trains and busses were all too far away so I'll wait until after we meet to leave for the office. I can start whenever you can (even though it's earlier than we agreed) | 16:05 |
NobodyCam | JayF: that is no joke | 16:06 |
JayF | NobodyCam: really? We seriously want a log power driver? | 16:06 |
JayF | heh | 16:06 |
JayF | it would at least enable my vision of a cluster of rpis being deployed to by ironic | 16:07 |
NobodyCam | we have folk out there running it | 16:07 |
NobodyCam | for realz | 16:07 |
JayF | For testing only? Or for production? | 16:07 |
NobodyCam | I believe testing in a production env | 16:08 |
JayF | they need a cloudbase robot, apparently ;) | 16:08 |
*** derekh has quit IRC | 16:09 | |
NobodyCam | lol they have them... I think it's called (data center) remote hands...lol :p | 16:09 |
JayF | that's almost more like you need a nagios/nsca power driver | 16:10 |
JayF | throw an alert to reboot a server | 16:10 |
JayF | lol | 16:10 |
NobodyCam | huummmmm | 16:11 |
*** eghobo has joined #openstack-ironic | 16:16 | |
*** marcoemorais has joined #openstack-ironic | 16:17 | |
*** marcoemorais has quit IRC | 16:17 | |
*** marcoemorais has joined #openstack-ironic | 16:17 | |
lucasagomes | alexpilotti, +1 for demoing it in Paris :P | 16:23 |
alexpilotti | lucasagomes: k :-) | 16:23 |
lucasagomes | easy patch just waiting for another +2: https://review.openstack.org/#/c/129251/ | 16:24 |
jroll | lucasagomes: people want a spec for maint reason | 16:25 |
lucasagomes | jroll, oh :/ isn't a bug enough? | 16:25 |
JayF | I'd be +1 to filing a bug | 16:25 |
jroll | lucasagomes: approved that | 16:25 |
JayF | "As an operator I lose valuable data when last_error gets clobbered" | 16:26 |
jroll | lucasagomes: ask rloo or dtantsur https://review.openstack.org/#/c/128645/ | 16:26 |
NobodyCam | you're too fast jroll :-p | 16:26 |
lucasagomes | jroll, right | 16:26 |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: Add a basic mechanism to route and validate vendor methods https://review.openstack.org/129261 | 16:29 |
*** eghobo has quit IRC | 16:29 | |
devananda | JayF: ping | 16:31 |
lucasagomes | jroll, maybe because we added the API part it knows impact in multiple areas of the project, API and DB interfaces that's why they want a spec | 16:31 |
JayF | yo | 16:31 |
lucasagomes | it now* | 16:31 |
lucasagomes | devananda, morning | 16:32 |
NobodyCam | morning devananda | 16:32 |
jroll | lucasagomes: I guess... it's just a new endpoint :| | 16:32 |
devananda | JayF: we actually have a LogPower driver. It was proposed by HP a few months back | 16:33 |
JayF | devananda: NobodyCam informed me of that :) | 16:33 |
devananda | hehe | 16:33 |
devananda | lots of scrollback... not gonna get through it all | 16:33 |
NobodyCam | :-p | 16:33 |
devananda | JayF: you wanna chat? I have to leave in ~2hr for a flight | 16:33 |
JayF | absolutely | 16:34 |
sambetts | Anyone got any ideas why the pep8 CI is throwing issues that the tox -e pep8 isnt?? | 16:34 |
NobodyCam | sambetts: have you rebuilt your venv? | 16:34 |
lucasagomes | jroll, yeah, I'm divided haha gosh... | 16:35 |
sambetts | not the pep8 one, I'll try that | 16:35 |
devananda | jroll: fwiw, juno final has been cut already: https://launchpad.net/ironic/juno/2014.2 | 16:37 |
devananda | backport fixes happen on a schedule, unless it's a CVE, which this isn't | 16:37 |
lucasagomes | jroll, if u want I can start a quick spec about it, I think people won't object to the change itself so we could go with a small full spec for it. I think the API change is the big impact here, because once API changes like that land (new endpoints) we will have to support them | 16:37 |
lucasagomes | I mean support it for a long time and not break it | 16:38 |
devananda | *backported fixes get released on a schedule | 16:38 |
jroll | devananda: I'm not worried about backporting that, but it is a bit nasty | 16:38 |
devananda | nasty how? | 16:38 |
jroll | lucasagomes: if you want to, I could also get it later today | 16:38 |
lucasagomes | jroll, right... whatever is easier, cause it's late here already I was thinking about getting a break | 16:40 |
lucasagomes | jroll, but if u don't mind I can put something up monday morning | 16:40 |
jroll | it's friday, go have a beer for me :) | 16:41 |
jroll | if you don't see a spec by monday morning go for it | 16:41 |
jroll | I should be able to do it today though | 16:41 |
lucasagomes | jroll, deal | 16:41 |
lucasagomes | right yeah I'll call it a day here | 16:42 |
lucasagomes | have a great night everybody, enjoy the weekend | 16:42 |
jroll | alright, have a good one :) | 16:42 |
NobodyCam | have a great weekend lucasagomes | 16:42 |
lucasagomes | NobodyCam, jroll you guys too | 16:42 |
lucasagomes | see ya | 16:42 |
*** lucasagomes is now known as lucas-dinner | 16:42 | |
*** pelix has quit IRC | 16:43 | |
jjulien | does ironic support using qpid? | 16:46 |
jroll | jjulien: we use oslo.messaging for amqp, so I'd ask them | 16:46 |
jjulien | I'm getting a 'module has no attribute create_connection' error | 16:46 |
jjulien | jroll: ok, thanks | 16:47 |
jroll | :) | 16:47 |
*** ndipanoff has quit IRC | 16:50 | |
*** athomas has quit IRC | 16:51 | |
*** pensu has quit IRC | 16:52 | |
dlaube | g'morning | 16:54 |
NobodyCam | morning dlaube | 16:55 |
NobodyCam | brb....bbt time | 16:56 |
openstackgerrit | A change was merged to openstack/ironic: TestAgentVendor to use the fake_agent driver https://review.openstack.org/129251 | 16:57 |
jjulien | jroll: I'm trying to get ironic to work with icehouse, so using the stable/icehouse branch. The code is directly importing the qpid module, or whatever is specified in the config as rpc_backend. Then calling create_connection on it. This would seem to be a bug. https://github.com/openstack/ironic/blob/stable/icehouse/ironic/openstack/common/rpc/__init__.py#L264-L275 | 16:59 |
jjulien | jroll: given it's icehouse, i'm guessing it won't be fixed? or is there some other way I should be configuring to use qpid? | 16:59 |
jroll | jjulien: oh, yeah, I think we moved that in juno | 17:00 |
jroll | icehouse feels like years ago to me... I personally have no idea | 17:00 |
jroll | but someone else might know | 17:00 |
*** pensu has joined #openstack-ironic | 17:02 | |
*** viktors is now known as viktors|afk | 17:02 | |
*** spandhe has joined #openstack-ironic | 17:10 | |
openstackgerrit | Sam Betts proposed a change to openstack/ironic: Add logging to driver vendor_passthru functions https://review.openstack.org/129298 | 17:13 |
*** comstud is now known as bearhands | 17:23 | |
*** achanda has joined #openstack-ironic | 17:26 | |
*** pcrews has quit IRC | 17:27 | |
*** igordcard has quit IRC | 17:31 | |
*** mikedillion has joined #openstack-ironic | 17:33 | |
NobodyCam | brb | 17:38 |
*** spandhe has quit IRC | 17:42 | |
*** spandhe has joined #openstack-ironic | 17:47 | |
*** harlowja_away is now known as harlowja | 17:49 | |
*** eghobo has joined #openstack-ironic | 17:52 | |
*** achanda has quit IRC | 17:57 | |
*** sambetts has quit IRC | 17:59 | |
*** marcoemorais has quit IRC | 18:01 | |
*** marcoemorais has joined #openstack-ironic | 18:01 | |
* NobodyCam reposts -> krotscheck > Here’s something to put on your itinerary for paris: http://dangerousminds.net/comments/no_butts_its_a_christmas_tree | 18:04 | |
*** pradipta is now known as pradipta_away | 18:07 | |
*** achanda has joined #openstack-ironic | 18:10 | |
openstackgerrit | Chris Behrens proposed a change to openstack/ironic: Store image disk_format and container_format https://review.openstack.org/128463 | 18:12 |
*** marck has quit IRC | 18:22 | |
*** pcrews has joined #openstack-ironic | 18:25 | |
*** marcoemorais has quit IRC | 18:26 | |
*** marcoemorais has joined #openstack-ironic | 18:26 | |
*** Haomeng|2 has quit IRC | 18:26 | |
*** marcoemorais has quit IRC | 18:27 | |
*** andreykurilin_ has quit IRC | 18:28 | |
*** marcoemorais has joined #openstack-ironic | 18:28 | |
*** Haomeng|2 has joined #openstack-ironic | 18:29 | |
*** spandhe has quit IRC | 18:46 | |
*** pensu has quit IRC | 18:56 | |
*** spandhe has joined #openstack-ironic | 19:02 | |
*** BertieFulton has joined #openstack-ironic | 19:03 | |
*** pensu has joined #openstack-ironic | 19:12 | |
*** jcoufal has quit IRC | 19:29 | |
* NobodyCam is starting to think we should add an operational risk / impact section to our specs.. what are the chances that https://review.openstack.org/#/c/100842 /could/ brick a node | 19:29 | |
jroll | ugh | 19:30 |
jroll | we should not have a rest api to do that | 19:30 |
jroll | they're cattle, update them all at once | 19:30 |
JayF | Yeah I apparently should review that | 19:31 |
JayF | We should not have a rest api for doing that at all | 19:31 |
*** pcrews has quit IRC | 19:31 | |
lifeless | so perhaps some context is useful | 19:32 |
lifeless | HP has added out of band updates | 19:32 |
lifeless | which is on its backend a REST API :) | 19:32 |
JayF | I appreciate that and want Ironic to call that rest API | 19:33 |
JayF | what I don't ever want is a single API call to Ironic changing a single setting on a specific node | 19:33 |
JayF | because that's not cloudy; that's snowflake configuration | 19:33 |
lifeless | ah | 19:34 |
lifeless | so thats perhaps a bit black and white | 19:34 |
lifeless | here's a thing, I'm not sure how we should enable it | 19:34 |
lifeless | the thing is that some firmware configs work well with some operating system stacks and workloads | 19:34 |
lifeless | they get certified by vendors | 19:34 |
lifeless | so e.g. your 1000 node hadoop cluster 'must run firmware X' | 19:34 |
JayF | lifeless: devananda and I both talked about that and laid out a flow I'm going to reflect in a spec by, at the latest, COB Monday | 19:35 |
JayF | lifeless: we literally G+'d about it this morning :) | 19:35 |
lifeless | JayF: ok, cool. | 19:35 |
jroll | lifeless: firmware version/binaries or firmware configs? | 19:35 |
jroll | this spec sounds like the former | 19:35 |
lifeless | jroll: firmware version | 19:35 |
jroll | oh, huh | 19:35 |
jroll | ok | 19:35 |
JayF | If firmware version changes are desired at deploy time | 19:35 |
lifeless | jroll: the thing laid down in the flash | 19:35 |
JayF | they'd likely be handled by the hardware capabilities stuff deva and I were talking about | 19:35 |
jroll | anyhow, yeah, there's better ways to do this, I think | 19:35 |
lifeless | related to that then is how you pre-prod test a change to that | 19:36 |
lifeless | saying the entire fleet has to be one version is not sufficient | 19:36 |
lifeless | jroll: sure | 19:36 |
lifeless | I'm not defending that spec per se | 19:36 |
jroll | right :) | 19:36 |
lifeless | I'm trying to make sure the driving situation is clear to you guys | 19:36 |
jroll | indeed | 19:36 |
NobodyCam | thank you lifeless the back story helps (a lot) | 19:37 |
jroll | JayF's thing basically says 'I need capability x'; this might be 'weird hadoop cluster' | 19:37 |
JayF | jroll: well... kinda | 19:37 |
jroll | which triggers a firmware update to the correct version | 19:37 |
jroll | I'm simplifying | 19:37 |
JayF | so you'd define a flavor in nova as wanting certain capabilities | 19:37 |
lifeless | so things that interact here | 19:37 |
JayF | one of those capabilities could be firmware_version_314 | 19:37 |
JayF | (easy as pi, amirite) | 19:37 |
lifeless | some hardware is still update-locally-only | 19:37 |
lifeless | particularly existing fleets | 19:37 |
lifeless | new hardware can be asserted remotely | 19:37 |
lifeless | but I expect LCD stuff to be local-only indefinitely | 19:38 |
JayF | lifeless: Doing the stuff in-band is going to be difficult to do at deploy time | 19:38 |
lifeless | sorry, lowest common denominator | 19:38 |
lifeless | JayF: why? I have stuff doing it in-band today | 19:38 |
JayF | lifeless: doing it at decom time (think of decom as happening before any single tenant gets on the node at all) for in band stuff makes more sense | 19:38 |
JayF | lifeless: doing firmware flashes in-band at deploy time? | 19:38 |
lifeless | JayF: yup | 19:38 |
lifeless | JayF: see the ilo firmware element in tripleo-image-elements, for instance. | 19:39 |
lifeless | its not IPA but should be easy to adapt | 19:39 |
JayF | in-band changes at deploy time isn't something I had thought about | 19:39 |
JayF | but it's something that should fit into the model | 19:39 |
JayF | given you have a sufficiently intelligent agent to perform the flash on request before doing a deploy | 19:39 |
JayF | which IPA is :) | 19:39 |
lifeless | so basically I think there are a couple of broad user stories | 19:39 |
lifeless | a) give me latest always but don't change it under me | 19:40 |
lifeless | b) give me latest always and keep it up to date (remote-only updates) | 19:40 |
lifeless | c) give me a specific version thanks! | 19:40 |
lifeless | c)'s complexity probably emerges when we consider scheduling | 19:40 |
*** spandhe has quit IRC | 19:46 | |
JayF | lifeless: c is completely covered by my use case, a is covered by decom | 19:47 |
JayF | lifeless: b is actually the harder one | 19:47 |
JayF | lifeless: I'd potentially even say that I don't think Ironic should change *anything* on a node provisioned to a tenant | 19:48 |
JayF | lifeless: because that's not cloudy; an arguably better pattern is to spin a new box with the updated firmware, and let (a) handle the updating for you in that case | 19:48 |
*** yuanying has quit IRC | 19:56 | |
NobodyCam | JayF: could that be spin up new instance with latest then live migrate to it, ofc would have to do that with new release, which I'm not sure how we would know when that is | 19:56 |
NobodyCam | oh and support live migrate too | 19:57 |
JayF | NobodyCam: Yeah, I get how the idea in theory doesn't fit so well with reality because reality has data and ironic doesn't have live migrations | 19:57 |
JayF | NobodyCam: but I think it's fairly important that Ironic *leave a node alone* when it's been provisioned to a tenant | 19:57 |
lifeless | JayF: to a degree | 19:59 |
NobodyCam | JayF: I actually agree but I can see a use case where an operator would want to update a provisioned node, security hardware fixes and such | 19:59 |
lifeless | JayF: security fixes in firmware.... | 19:59 |
lifeless | tada | 19:59 |
JayF | I don't disagree it's a useful thing some people would want to do | 20:00 |
JayF | I disagree with the idea it belongs in Ironic / is Ironic's job | 20:00 |
*** marcoemorais has quit IRC | 20:01 | |
*** marcoemorais has joined #openstack-ironic | 20:02 | |
*** spandhe has joined #openstack-ironic | 20:03 | |
NobodyCam | humm we do support nova rebuild, we could check / update FW on rebuilds | 20:03 |
JayF | that's something that I could see making a lot more sense | 20:04 |
JayF | although I'm unsure how it would fit into the code | 20:04 |
JayF | let me get that spec up monday, and for some of these things we'll be talking about the same stuff | 20:05 |
JayF | lifecycle management of firmwares on deployed nodes is not something that'll be covered by that, at all, nor do I think we should support it in Ironic today. I am easily swayed by good arguments and a consensus of people telling me I'm wrong though :P | 20:05 |
*** dprince has quit IRC | 20:06 | |
NobodyCam | JayF: I agree ironic is not a lifecycle management of firmware, but I feel that we should support some way of upgrading FW in a provisioned instance | 20:08 |
*** pcrews has joined #openstack-ironic | 20:11 | |
*** pensu has quit IRC | 20:26 | |
NobodyCam | brb | 20:26 |
-openstackstatus- NOTICE: Gerrit will be offline from 2100-2130 for project renames | 20:33 | |
*** ChanServ changes topic to "Gerrit will be offline from 2100-2130 for project renames" | 20:33 | |
*** igordcard has joined #openstack-ironic | 20:43 | |
*** BLZbubba has joined #openstack-ironic | 20:45 | |
BLZbubba | will any of us be alive in the years 2100-2130? :P | 20:46 |
*** rwsu has quit IRC | 20:56 | |
*** jistr has quit IRC | 20:58 | |
-openstackstatus- NOTICE: Gerrit is offline from 2100-2130 for project renames | 21:02 | |
*** ChanServ changes topic to "Gerrit is offline from 2100-2130 for project renames" | 21:02 | |
*** achanda has quit IRC | 21:06 | |
*** BertieFulton has quit IRC | 21:09 | |
*** achanda has joined #openstack-ironic | 21:12 | |
*** ChanServ changes topic to "Bare Metal Provisioning | Status: http://bit.ly/ironic-whiteboard | Docs: http://docs.openstack.org/developer/ironic/ | Bugs: https://bugs.launchpad.net/ironic" | 21:25 | |
-openstackstatus- NOTICE: Gerrit is back online | 21:25 | |
* NobodyCam seemingly needs to go walkies. | 21:30 | |
SpamapS | question | 21:35 |
SpamapS | If I were to do something crazy like delete /tftpboot/* and then restart ironic conductor.. would it re-write those files? | 21:36 |
NobodyCam | did the hash ring change? | 21:37 |
SpamapS | no | 21:37 |
SpamapS | just rebuilding a tripleo undercloud | 21:37 |
NobodyCam | well let chech the p_tasks | 21:38 |
SpamapS | also is there anything on disk that, if lost, would lead to a stale lock or something else? | 21:39 |
NobodyCam | like chahed images and the /tftpboot dir | 21:40 |
SpamapS | take-2... english this time... ACTION | 21:40 |
SpamapS | oh cached | 21:40 |
NobodyCam | the the cached images | 21:40 |
NobodyCam | :-p | 21:40 |
SpamapS | looks like hostname changed | 21:41 |
NobodyCam | did ip? | 21:42 |
NobodyCam | change? | 21:42 |
SpamapS | NobodyCam: no, but the box's hostname changed so locks are stale | 21:44 |
*** jjohnson2 has quit IRC | 21:45 | |
NobodyCam | yep | 21:46 |
NobodyCam | SpamapS: I'm looking at this https://github.com/openstack/ironic/blob/master/ironic/conductor/manager.py#L889-L896 | 21:47 |
*** andreykurilin_ has joined #openstack-ironic | 21:51 | |
NobodyCam | SpamapS: are you looking to "get ironic to rebuild" after an oops (ie a hackish fix would work)? | 21:51 |
SpamapS | NobodyCam: I rebuilt the node ironic is running on | 21:51 |
SpamapS | NobodyCam: /tftpboot is wiped out | 21:52 |
NobodyCam | yes | 21:52 |
NobodyCam | there is a periodic task (linked above) that can do it | 21:53 |
NobodyCam | but I think this is stopping it : https://github.com/openstack/ironic/blob/master/ironic/conductor/manager.py#L929 | 21:53 |
SpamapS | NobodyCam: so that means "if it already is mine, do nothing" ? | 21:55 |
NobodyCam | thats how I am reading the continue | 21:55 |
NobodyCam | but if that field were to be a value != self.id then it would | 21:56 |
*** tatyana has joined #openstack-ironic | 22:00 | |
jroll | it should just work | 22:02 |
jroll | but if the lock is stuck, you're kinda screwed | 22:02 |
*** tatyana has quit IRC | 22:03 | |
*** openstackgerrit has quit IRC | 22:03 | |
SpamapS | jroll: in our case, hostname has now been changed back, but lock is still "stuck" | 22:04 |
*** spandhe has quit IRC | 22:04 | |
jroll | right, stuck lock is a huge pain point | 22:05 |
jroll | need to touch the db to fix it | 22:05 |
SpamapS | jroll: but why would it be stuck? | 22:05 |
SpamapS | jroll: hostname == hostname | 22:05 |
*** openstackgerrit has joined #openstack-ironic | 22:05 | |
jroll | because locks are managed in a context manager | 22:05 |
jroll | release() won't get called after a restart | 22:05 |
jroll | with task_manager.acquire(admin_context, node_id) as task: | 22:05 |
jroll | calls release() when it exits that block | 22:06 |
SpamapS | ah and there's no break lock call? | 22:06 |
jroll | right :( | 22:06 |
jroll | we've been talking about it | 22:06 |
jroll | haven't done anything yet | 22:06 |
jroll | because it can be dangerous | 22:06 |
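The stuck-lock behaviour described above can be illustrated with a minimal sketch. It assumes a reservation is simply a field on a node record; `NODES`, `acquire`, `NodeLocked`, and the host name `conductor-1` are hypothetical stand-ins for illustration, not Ironic's actual task_manager code.

```python
from contextlib import contextmanager

# Hypothetical in-memory stand-in for the nodes table: uuid -> reservation.
NODES = {"node-1": None}


class NodeLocked(Exception):
    """Raised when a node is already reserved by some conductor."""


@contextmanager
def acquire(node_uuid, host):
    """Reserve a node for `host`; release it when the with-block exits."""
    if NODES[node_uuid] is not None:
        raise NodeLocked(f"{node_uuid} is reserved by {NODES[node_uuid]}")
    NODES[node_uuid] = host
    try:
        yield node_uuid
    finally:
        # Only runs if the process lives long enough to leave the block.
        NODES[node_uuid] = None


# Normal path: the reservation is cleared on exit, even on an exception.
with acquire("node-1", "conductor-1"):
    held_during = NODES["node-1"]
released_after = NODES["node-1"]

# Simulated hard stop: enter the block but never exit it, as happens when
# the conductor is killed mid-task. The reservation is left behind.
stuck = acquire("node-1", "conductor-1")
stuck.__enter__()
left_behind = NODES["node-1"]
```

In this sketch the leftover value in `NODES` plays the role of the stale `reservation` column: nothing in the restarted process knows it should be cleared, so every later `acquire` on that node fails.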
SpamapS | in heat every time we start a new heat-engine, we generate a UUID for lock-owner-id | 22:06 |
SpamapS | and that UUID gets an RPC topic | 22:06 |
SpamapS | if you don't answer on that topic, your lock gets broken | 22:07 |
jroll | so mysql -e 'use ironic; update nodes set reservation=null where uuid = "blah";' it is | 22:07 |
jroll | hmm | 22:07 |
jroll | I mean, I'd rather use some sort of timeout thing | 22:07 |
jroll | maybe | 22:07 |
SpamapS | it is a timeout | 22:07 |
jroll | right | 22:07 |
SpamapS | you encounter a lock, you wait 30s or so for a reply to the rpc, or you break it. | 22:07 |
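A rough sketch of the scheme SpamapS describes for heat, under stated assumptions: `LockTable` and its methods are hypothetical, and `is_alive` stands in for the RPC ping with a timeout (the "30s or so" above). It is not heat's actual implementation.

```python
import uuid


class LockTable:
    """Hypothetical lock table mapping a resource to its owner's id."""

    def __init__(self):
        self._owners = {}

    def owner(self, resource):
        return self._owners.get(resource)

    def try_acquire(self, resource, owner_id, is_alive):
        """Take the lock, breaking it if the current owner does not answer.

        `is_alive(owner_id)` stands in for an RPC ping on the owner's topic
        that returns False when no reply arrives within the timeout.
        A real implementation would make the steal an atomic
        compare-and-swap in the database.
        """
        current = self._owners.get(resource)
        if current is None or current == owner_id:
            self._owners[resource] = owner_id
            return True
        if is_alive(current):
            return False  # Owner answered, so the lock is genuinely held.
        self._owners[resource] = owner_id  # Owner is gone: break the lock.
        return True


# Each engine process picks a fresh id at startup, so a restarted engine
# never looks like the owner of locks taken by its previous incarnation.
old_engine = str(uuid.uuid4())
new_engine = str(uuid.uuid4())

locks = LockTable()
locks.try_acquire("stack-1", old_engine, is_alive=lambda _: True)

# While the old engine still answers, the new one cannot take the lock.
blocked = locks.try_acquire("stack-1", new_engine, is_alive=lambda _: True)

# After the old engine dies it stops answering, and the lock is broken.
stolen = locks.try_acquire("stack-1", new_engine, is_alive=lambda _: False)
```

Keying ownership on a per-process UUID rather than a hostname is what sidesteps the "hostname == hostname" confusion raised earlier in the conversation.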
jroll | there's a spec up to use zookeeper for locking, which could be nice | 22:08 |
SpamapS | that would be ideal | 22:08 |
jroll | is that lock distributed somehow? | 22:08 |
jroll | all conductors would need to know about | 22:08 |
SpamapS | (and was what we wanted, but sdake went all 400lb. gorilla on us when we said we'd have to use java ;) | 22:08 |
jroll | it | 22:08 |
jroll | lol | 22:08 |
SpamapS | jroll: it's in the db | 22:08 |
jroll | ah | 22:09 |
jroll | ok | 22:09 |
SpamapS | jroll: might be good to just clear reservations for your own host. | 22:09 |
jroll | at startup? yeah | 22:09 |
NobodyCam | ya !!! | 22:10 |
jroll | though if two conductors have the same hostname, that would go downhill quickly | 22:10 |
NobodyCam | hostname+ip? | 22:10 |
SpamapS | jroll: you can use tooz and at least allow for an abstraction and not be ONLY zookeeper | 22:11 |
jroll | right, it'll be pluggable | 22:11 |
jroll | well, at the ironic level it will be pluggable | 22:12 |
SpamapS | jroll: wait so if I stop ironic, it doesn't release reservations? | 22:12 |
jroll | right. | 22:12 |
*** mikedillion has quit IRC | 22:12 | |
jroll | because... we'd have to wait for work to finish | 22:12 |
*** r-daneel has quit IRC | 22:12 | |
jroll | which could take a long time | 22:12 |
jroll | I want graceful shutdown as well, fwiw | 22:12 |
SpamapS | jroll: so... restarting ironic-conductor requires database fiddling? | 22:12 |
jroll | it may | 22:12 |
SpamapS | jroll: I'm finding it hard to believe this is intentional. | 22:13 |
NobodyCam | that seems wrong (though it may be true) | 22:13 |
jroll | SpamapS: I feel your pain, trust me | 22:13 |
jroll | I may have a thing | 22:14 |
jroll | sec | 22:14 |
jroll | boo, I thought we had a script for this | 22:15 |
jroll | JoshNang: ^ do we have something for resetting a stuck lock? | 22:15 |
NobodyCam | if you find it put patch for tool with it | 22:15 |
JoshNang | jroll: nope, because it required database fiddling | 22:16 |
jroll | boo | 22:16 |
SpamapS | Well ideally at startup, ironic-conductor would clear reservations with the same hostname. | 22:16 |
SpamapS | Because those operations are definitely _dead_ | 22:16 |
jroll | NobodyCam: it would just be something horrible like " update nodes set reservation=null where updated_at is older than 1 hour" or something | 22:16 |
SpamapS | since we're starting up.. again. | 22:16 |
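The startup cleanup suggested here might look roughly like the following. This is a sketch only: it uses an in-memory SQLite table with just the `uuid` and `reservation` columns mentioned in this conversation, the function name `clear_own_reservations` is hypothetical, and it assumes the reservation value is the conductor's hostname. As jroll notes, it would misbehave if two conductors shared a hostname.

```python
import sqlite3


def clear_own_reservations(conn, hostname):
    """Release every node reservation held under this conductor's hostname."""
    cur = conn.execute(
        "UPDATE nodes SET reservation = NULL WHERE reservation = ?",
        (hostname,),
    )
    conn.commit()
    return cur.rowcount  # Number of stale reservations that were cleared.


# Toy data: two nodes left reserved by a previous run of conductor-1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (uuid TEXT PRIMARY KEY, reservation TEXT)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [
        ("node-a", "conductor-1"),
        ("node-b", "conductor-2"),
        ("node-c", "conductor-1"),
        ("node-d", None),
    ],
)

# Run once at startup, before the conductor accepts any work.
cleared = clear_own_reservations(conn, "conductor-1")
remaining = dict(conn.execute("SELECT uuid, reservation FROM nodes"))
```

Reservations held by other hosts (`conductor-2` here) are left untouched, which is the point of filtering on the hostname instead of clearing the whole column.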
*** praneshp has joined #openstack-ironic | 22:16 | |
SpamapS | jroll: could also have a SIGTERM handler that _does_ wait for all reservations. SIGKILL is the thing we use when we want to accept inconsistent state. | 22:18 |
jroll | would love to have that | 22:18 |
jroll | SpamapS: sure, the problem is that it could take minutes, hours eventually, to wait for all operations to finish | 22:18 |
jroll | because eventually we'll be erasing disks etc | 22:18 |
SpamapS | jroll: would you SIGKILL your database if it took 30 minutes to flush all writes? | 22:18 |
jroll | I do not want my conductor cluster to take hours to rolling restart | 22:18 |
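One way to reconcile the two positions is a drain with a deadline: stop taking new work, wait for in-flight tasks up to a limit, and report whether anything was left holding a reservation. The sketch below is hypothetical (`TaskTracker` and its methods are illustrative names); registering `drain` behind a SIGTERM handler is left out.

```python
import threading


class TaskTracker:
    """Hypothetical counter of in-flight tasks that hold reservations."""

    def __init__(self):
        self._active = 0
        self._cond = threading.Condition()
        self.accepting = True

    def task_started(self):
        with self._cond:
            if not self.accepting:
                raise RuntimeError("shutting down, not accepting new tasks")
            self._active += 1

    def task_finished(self):
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def drain(self, timeout):
        """Refuse new work, then wait up to `timeout` seconds for tasks.

        Returns True if every task finished, False if the deadline passed
        and some reservations are still held.
        """
        with self._cond:
            self.accepting = False
            return self._cond.wait_for(lambda: self._active == 0, timeout)


tracker = TaskTracker()
tracker.task_started()

timed_out = not tracker.drain(timeout=0.05)  # One task is still running.
tracker.task_finished()
drained = tracker.drain(timeout=0.05)        # Nothing left to wait for.
```

When `drain` returns False the operator still ends up with stale reservations, so a deadline like this complements, rather than replaces, some form of lock cleanup.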
jroll | I wouldn't use that database :P | 22:19 |
SpamapS | You probably do -->> mysql ;) | 22:19 |
SpamapS | give it a really big transaction log and throw 50,000 transactions per second at it. :) | 22:19 |
jroll | I wouldn't use mysql for 50k TPS | 22:19 |
SpamapS | why not? | 22:20 |
SpamapS | it works great | 22:20 |
SpamapS | anyway OT | 22:20 |
jroll | right :P | 22:20 |
jroll | anyhow, would love code that does that | 22:20 |
*** alexpilotti has quit IRC | 22:20 | |
SpamapS | those reservations are inconsistent state. I think if there are bits that need to continue while the database-handling-thing gets poked, they might need to run as separate worker tasks that just report back their state when done | 22:20 |
SpamapS | this is why nova-conductor handles the DB, and nova-compute handles the insanity of virtualization abstraction | 22:21 |
jroll | yeah | 22:21 |
jroll | don't get me wrong, I agree that it's horrible | 22:21 |
SpamapS | Apache works this way too really | 22:21 |
SpamapS | jroll: well it's understood, so it's not THAT horrible. :) | 22:22 |
SpamapS | better than "dunno, good luck, YOLO" | 22:22 |
jroll | mmm. | 22:22 |
jroll | it's pretty horrible :) | 22:22 |
jroll | lol | 22:22 |
jroll | anyhow, here's how our team would like to fix it: https://review.openstack.org/#/q/owner:%22Kyle+Stevenson%22+status:open,n,z | 22:23 |
*** andreykurilin_ has quit IRC | 22:23 | |
SpamapS | jroll: is there a bug in LP for this yet? | 22:23 |
jroll | clearly work to do | 22:23 |
jroll | I have no idea | 22:23 |
jroll | maybe? | 22:23 |
SpamapS | we're working around it in our update code now.. want a reference next to the workaround | 22:23 |
*** praneshp has quit IRC | 22:23 | |
SpamapS | jroll: should pivot that from ZK direct, to tooz | 22:24 |
jroll | how are you working around it? | 22:24 |
jroll | kylestev: ^ | 22:24 |
jroll | I think someone else recommended that, too | 22:24 |
SpamapS | jroll: the way I described, before starting ironic, will clear all reservations for our host. | 22:24 |
jroll | and someone else else said tooz is horrible or something | 22:24 |
jroll | ah | 22:24 |
jroll | but tripleo will touch the db to do so? | 22:24 |
SpamapS | https://github.com/stackforge/tooz/tree/master/tooz/drivers | 22:24 |
SpamapS | note.. just a frontend for ZK | 22:25 |
jroll | oh, right, I remember this | 22:25 |
SpamapS | jroll: yeah. | 22:25 |
jroll | https://github.com/stackforge/tooz/tree/master/tooz/tests/drivers | 22:25 |
*** igordcard has quit IRC | 22:26 | |
SpamapS | jroll: looks like it needs some love | 22:28 |
jroll | indeed | 22:29 |
*** mikedillion has joined #openstack-ironic | 22:29 | |
kylestev | definitely | 22:29 |
SpamapS | jroll: https://bugs.launchpad.net/ironic/+bug/1382698 filed .. please clarify or correct anything I might have gotten wrong. Thanks. | 22:32 |
jroll | looks fine, thanks :) | 22:34 |
SpamapS | I think the other problem we're dealing with, which may or may not be expecting too much on our part, is that we are currently getting rid of /tftpboot because we reimage the box. | 22:36 |
SpamapS | conductor will only re-assert the contents of that if a node changes affinity | 22:37 |
SpamapS | Simple workaround there.. just backup /tftpboot before reimage. | 22:37 |
jroll | SpamapS: I don't think that should be a problem, other than ironic will need to re-cache images | 22:42 |
jroll | but I'm not 100% sure since I don't use that :) | 22:42 |
*** rwsu has joined #openstack-ironic | 22:52 | |
NobodyCam | oh this sounds like a cool seminar "Foundations of All-Optical Logic for Advanced Computing Systems" | 22:57 |
*** mikedillion has quit IRC | 22:58 | |
*** achanda has quit IRC | 23:21 | |
*** spandhe has joined #openstack-ironic | 23:30 | |
*** praneshp has joined #openstack-ironic | 23:42 | |
NobodyCam | SpamapS: fyi I was able to get ironic to rebuild the tftp dir | 23:44 |
NobodyCam | well most of it | 23:44 |
NobodyCam | it required database fiddling | 23:44 |
*** praneshp_ has joined #openstack-ironic | 23:48 | |
*** praneshp has quit IRC | 23:50 | |
*** praneshp_ is now known as praneshp | 23:50 | |
*** praneshp has quit IRC | 23:56 |