Tuesday, 2019-01-22

*** tetsuro has joined #openstack-placement  00:10
*** tetsuro_ has joined #openstack-placement  04:09
*** tetsuro has quit IRC  04:11
*** tetsuro has joined #openstack-placement  04:31
*** tetsuro_ has quit IRC  04:31
*** openstackgerrit has joined #openstack-placement  04:41
<openstackgerrit> Merged openstack/placement master: Add irrelevant files list to perfload job  https://review.openstack.org/624047  04:41
*** e0ne has joined #openstack-placement  05:52
*** e0ne has quit IRC  05:53
*** avolkov has joined #openstack-placement  06:03
*** takashin has left #openstack-placement  06:36
*** tssurya has joined #openstack-placement  07:41
*** helenafm has joined #openstack-placement  08:22
*** tssurya has quit IRC  09:25
*** tetsuro has quit IRC  09:36
<openstackgerrit> Tetsuro Nakamura proposed openstack/placement master: Configure database api in upgrade check  https://review.openstack.org/632365  09:51
*** cdent has joined #openstack-placement  10:12
<cdent> thanks for paying attention tetsuro  10:15
<cdent> gibi: if you're happy with https://review.openstack.org/#/c/632365/ can you kick it in? It's a bug fix to the status command that didn't get fully tested before merged  10:16
<gibi> cdent: done  10:20
<cdent> thanks  10:20
*** e0ne has joined #openstack-placement  10:24
<openstackgerrit> Tetsuro Nakamura proposed openstack/placement master: Configure database api in upgrade check  https://review.openstack.org/632365  10:32
*** tetsuro has joined #openstack-placement  10:34
<cdent> I think maybe we should just let tetsuro do everything, he's the only one paying sufficient attention to correctness :)  10:37
<tetsuro> cdent: anyway thanks for re-approving! :)  10:38
<cdent> thank you  10:39
<gibi> :)  10:46
*** ttsiouts has joined #openstack-placement  12:22
*** e0ne has quit IRC  12:25
*** e0ne has joined #openstack-placement  12:30
*** tssurya has joined #openstack-placement  12:38
*** tetsuro has quit IRC  12:53
*** mriedem has joined #openstack-placement  13:18
<openstackgerrit> Merged openstack/placement master: Configure database api in upgrade check  https://review.openstack.org/632365  14:01
*** e0ne has quit IRC  14:05
*** e0ne has joined #openstack-placement  14:08
*** mriedem has quit IRC  14:19
*** mriedem has joined #openstack-placement  14:21
*** ttsiouts has quit IRC  14:39
*** ttsiouts has joined #openstack-placement  14:39
*** ttsiouts has quit IRC  14:41
*** ttsiouts has joined #openstack-placement  14:41
*** efried_mlk is now known as efried  14:45
*** rubasov_ is now known as rubasov  14:50
*** efried1 has joined #openstack-placement  15:00
*** efried has quit IRC  15:01
*** efried1 is now known as efried  15:01
*** avolkov has quit IRC  15:34
*** openstackgerrit has quit IRC  15:51
*** efried has quit IRC  16:00
*** efried has joined #openstack-placement  16:09
*** dims has quit IRC  16:15
*** dims has joined #openstack-placement  16:20
*** e0ne has quit IRC  16:38
*** e0ne has joined #openstack-placement  16:39
*** mriedem is now known as mriedem_away  16:39
*** ttsiouts has quit IRC  16:55
*** ttsiouts has joined #openstack-placement  16:55
*** ttsiouts has quit IRC  17:00
*** efried has quit IRC  17:00
*** e0ne has quit IRC  17:02
*** helenafm has quit IRC  17:10
*** efried has joined #openstack-placement  17:49
*** avolkov has joined #openstack-placement  18:35
*** e0ne has joined #openstack-placement  19:07
*** gryf has joined #openstack-placement  19:29
*** tssurya has quit IRC  19:36
*** e0ne has quit IRC  19:42
*** mriedem_away is now known as mriedem  20:16
*** alanmeadows has joined #openstack-placement  20:37
<jaypipes> alanmeadows: whatup g-money?  20:37
<alanmeadows> Ahoy folks.  20:37
<alanmeadows> We had a change go out to a number of production sites that adjusted the hostname of the nova agents (the agent stops reporting as `host` and starts reporting as `host.fqdn`)  20:38
<jaypipes> alanmeadows: lemme guess... doubled-up resource provider records? :)  20:39
<jaypipes> alanmeadows: and a scheduler that suddenly thinks you've got a shitload of extra capacity?  20:39
<alanmeadows> yes, along those lines  20:39
*** dims has quit IRC  20:40
<alanmeadows> pci scheduling conflicts obviously, as the pci_devices table is populated with unallocated entries for these "new" nodes  20:40
<alanmeadows> but let me walk through what was tried quickly  20:40
<jaypipes> alanmeadows: and you're looking for a quick hotfix to get the data ungoofed?  20:40
<alanmeadows> and link that up to the placement question  20:40
<jaypipes> ack  20:40
<alanmeadows> we got the bright idea, given no resources have been created using the new compute_nodes entry,  20:41
<alanmeadows> that we would revive the old compute_nodes entry, update it to the new fqdn, and deactivate the new one  20:41
*** dims has joined #openstack-placement  20:42
<alanmeadows> dancing around the unique constraints  20:42
<alanmeadows> we then discovered a `uuid` in compute_nodes that clearly links the node to the placement tables  20:42
<alanmeadows> and finally on to the bit confusing us  20:43
<jaypipes> alanmeadows: yes, that's essentially what you'll need to do. the only issue is that you're going to need to first delete the placement resource_providers table records that refer to the new fqdns  20:43
<alanmeadows> well, this is what's weird  20:43
<alanmeadows> the resource_providers table has an entry for the new, target fqdn name, with the wrong uuid  20:44
<alanmeadows> we can fix that, sure  20:44
<alanmeadows> but what's odd is there is no entry for the old agent name like there was in compute_nodes  20:44
<alanmeadows> on top of that there are no allocations for the new entry generated in resource_providers  20:44
<jaypipes> alanmeadows: that is indeed weird.  20:45
<alanmeadows> much like the magic conversion nova will do for deactivating dupes in compute_nodes (set shortname to deleted=1, ...)  20:45
<jaypipes> alanmeadows: there cannot be allocations referring to the old provider UUIDs but no entries in the resource_providers table with those UUIDs.  20:45
<alanmeadows> to make a transition from short->long or long->short hostnames painless  20:45
<alanmeadows> it almost seems like some magic transition happened in the placement data, and all allocations lost in the process  20:46
<jaypipes> alanmeadows: so all allocations are gone?  20:46
<alanmeadows> all allocations for a node that has undergone this hostname transition of short->fqdn are gone  20:47
<jaypipes> yikes.  20:47
<alanmeadows> in a site where this happened to all nodes before it was noticed  20:47
<jaypipes> mriedem: ^^  20:47
<alanmeadows> the allocations table is an empty set  20:47
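
A minimal read-only sketch (not from the channel) of how one might confirm the state being described here, assuming the Ocata layout where the placement tables live in the nova_api database; the connection URL and hostname are placeholders:

    # Hypothetical inspection script; adjust the URL and hostname for your site.
    from sqlalchemy import create_engine, text

    engine = create_engine("mysql+pymysql://user:pass@dbhost/nova_api")  # placeholder
    short_name = "compute-01"  # placeholder: the pre-change hostname

    with engine.connect() as conn:
        # Providers registered under either the short name or the new FQDN.
        rows = conn.execute(
            text("SELECT id, uuid, name FROM resource_providers "
                 "WHERE name = :short OR name LIKE :fqdn"),
            {"short": short_name, "fqdn": short_name + ".%"}).fetchall()
        for rp in rows:
            # How many allocation rows still point at each provider.
            count = conn.execute(
                text("SELECT count(*) FROM allocations "
                     "WHERE resource_provider_id = :rp_id"),
                {"rp_id": rp.id}).scalar()
            print(rp.uuid, rp.name, "allocations:", count)
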
<jaypipes> alanmeadows: is this a flashing red lights situation?  20:49
<jaypipes> alanmeadows: i.e. production site with no immediate way of recovering  20:49
<alanmeadows> it has some people quite interested in the outcome and a resolution ;-)  20:49
<jaypipes> heh, ok.  20:50
<jaypipes> we really need to put some sort of barrier/prevention in place when/if we notice a my_hostname CONF change...  20:50
*** dims has quit IRC  20:50
<jaypipes> or whatever the CONF option is called that determines the nova-compute service name. can't ever remember it.  20:50
<jaypipes> my_ip?  20:51
<jaypipes> meh, whatever...  20:51
<jaypipes> alanmeadows: lemme have a think.  20:52
*** dims has joined #openstack-placement  20:52
<jaypipes> alanmeadows: if you reset the hostname for a service back to its original hostname and restart the nova-compute service, what happens?  20:52
<alanmeadows> oh that's definitely coming under strict control as a lesson learned  20:52
<alanmeadows> I did not try that scenario without the mucking  20:53
<mriedem> hostname changes will result in a new compute node record  20:53
<mriedem> which means a new resource provider with new uuid  20:53
<mriedem> compute nodes are unique per hostname/nodename (which for kvm is the same)  20:53
<alanmeadows> We attempt to roll forward with the fqdn transition but preserve mappings  20:54
<jaypipes> mriedem: right, but apparently something deletes all the instances/allocations on the old provider in the process...  20:54
<alanmeadows> looking at nodes that underwent the transition  20:54
<mriedem> probably the resource tracker  20:54
<mriedem> https://github.com/openstack/nova/blob/31956108e6e785407bdcc31dbc8ba99e6a28c96d/nova/compute/resource_tracker.py#L1244  20:54
<alanmeadows> they have the highest ID increment in resource_providers  20:54
<alanmeadows> as though something deleted their `short` version, created the `longName` version and cascaded the allocations  20:54
<mriedem> my guess would be either something in the RT or something on compute restart thinking an evacuation happened  20:56
<jaypipes> mriedem: right, but https://github.com/openstack/nova/blob/31956108e6e785407bdcc31dbc8ba99e6a28c96d/nova/compute/resource_tracker.py#L792-L794 should not delete instances from the *old* hostname, since self._compute_nodes[nodename] (where nodename == new FQDN) should yield no results for InstanceList.get_all_by_host_and_nodename(), right?  20:56
<mriedem> https://github.com/openstack/nova/blob/31956108e6e785407bdcc31dbc8ba99e6a28c96d/nova/compute/manager.py#L628  20:57
<jaypipes> mriedem: that's migrations, though, which again, shouldn't be returning anything in this case (of a hostname rename)  20:58
<jaypipes> or at least, I *think* that's the case. alanmeadows, what version of openstack are you using?  20:59
<alanmeadows> This is ocata  20:59
<alanmeadows> in this instance  20:59
<mriedem> the evac migrations robustification was added by dan because in the olden times a hostname change would make compute think an evac happened and delete your instances  20:59
<mriedem> but that was liberty i think  20:59
<mriedem> https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/robustify_evacuate.html  20:59
<jaypipes> yeah, ocata code is identical as now  21:00
<jaypipes> so I don't believe it's the evacuate code path that is the issue here.  21:00
<mriedem> so the allocations were deleted for the old provider in placement or just not there for the new provider?  21:00
<jaypipes> alanmeadows: UNLESS... your deployment tooling issued some sort of host-evacuate call in doing this rename of hostname FQDNs?  21:00
<alanmeadows> I'd be ok with them not being there for the new provider  21:01
<alanmeadows> I could deal with that  21:01
<alanmeadows> it's that they appear to be gone entirely  21:01
<alanmeadows> @jaypipes: definitely no, no nova calls, just a /etc/hosts ordering and `domain` resolv updates.  21:02
<mriedem> as a workaround you could run the heal_allocations CLI but that's not in ocata, so you'd have to backport it or run it from a container https://docs.openstack.org/nova/latest/cli/nova-manage.html#placement  21:02
<alanmeadows> didn't know about that  21:02
<alanmeadows> potential contender  21:03
<alanmeadows> one grounding question though  21:05
<alanmeadows> these agents that have undergone this name transition  21:06
<alanmeadows> they come up, and wish to report to the world their PciDevicePool counts are 100% available  21:07
<alanmeadows> when obviously resources have been assigned  21:07
<alanmeadows> they also overwrite any pci_devices entries for that node id that may have been allocated, resetting them to unallocated  21:07
<alanmeadows> and so at the end of the day, am I chasing the right thing with there being empty allocations records for these hosts  21:08
<jaypipes> alanmeadows: well, PCI devices are not handled by placement unfortunately (or... fortunately, for you at least)  21:09
<jaypipes> alanmeadows: so we need to separate out the placement DB's allocations table issues from the pci_devices table issues, because they are handled differently.  21:09
<alanmeadows> sure, the authority separation I get  21:10
<alanmeadows> I arrived at missing data in the allocations table but I started with  21:10
<alanmeadows> why these agents are reporting an incorrect state to the world  21:10
<jaypipes> alanmeadows: are there still entries in the pci_devices table that refer to the original compute nodes table records for the original hostname?  21:13
<alanmeadows> yes, until the agent starts up and whacks them  21:14
<alanmeadows> oh, re-read your question  21:14
<alanmeadows> yes, there were  21:14
<alanmeadows> but recall our brilliant idea about how to back out of this  21:14
<alanmeadows> and preserve mappings  21:14
<jaypipes> alanmeadows: ok, so at least *that* issue should be easy to resolve...  21:14
<jaypipes> alanmeadows: need to drop jules off ... back in about 20 mins  21:14
<alanmeadows> which was to revive the original compute_nodes entry  21:15
<alanmeadows> by updating its hypervisor_name  21:15
<alanmeadows> and when we do this  21:15
<alanmeadows> the correct pci_devices entries for that older node name (but now updated)  21:15
<alanmeadows> are clobbered  21:15
<alanmeadows> and all set back to available  21:16
<alanmeadows> we're just trying this approach out on one host  21:16
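
A similarly hedged sketch for the pci_devices side (placeholders again; in Ocata these rows live in the main nova database), to see whether the entries for a given compute node really were reset to available:

    # Hypothetical check; look up the compute_node_id for the host in question
    # from the compute_nodes table first.
    from sqlalchemy import create_engine, text

    engine = create_engine("mysql+pymysql://user:pass@dbhost/nova")  # placeholder
    compute_node_id = 42  # placeholder

    with engine.connect() as conn:
        # Count pci_devices rows per status for this compute node.
        rows = conn.execute(
            text("SELECT status, count(*) AS cnt FROM pci_devices "
                 "WHERE compute_node_id = :cn AND deleted = 0 "
                 "GROUP BY status"),
            {"cn": compute_node_id}).fetchall()
        for status, cnt in rows:
            print(status, cnt)
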
<alanmeadows> https://github.com/openstack/nova/blob/stable/ocata/nova/compute/manager.py#L6658-L6671  21:18
* alanmeadows blinks  21:18
*** efried has quit IRC  21:30
*** efried has joined #openstack-placement  21:30
<jaypipes> mriedem, alanmeadows: that looks to be it.  21:39
<alanmeadows> How much do we trust https://docs.openstack.org/nova/latest/cli/nova-manage.html#placement  21:40
<alanmeadows> This does look to be an answer for rebuilding this data  21:40
<alanmeadows> without having to go off and figure out how to cobble it  21:40
<mriedem> i wrote it  21:49
<mriedem> but that doesn't mean you have to trust it  21:50
<mriedem> run it with --max-count of 1 if you don't trust it  21:50
<mriedem> i thought about adding a --dry-run option but didn't have time  21:50
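
For reference, the cautious first run mriedem is describing would look something like the following (flag names per the linked nova-manage docs; verify against whatever version actually gets backported):

    nova-manage placement heal_allocations --max-count 1
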
<mriedem> mnaser also has a script to fix up allocations i think  21:51
<alanmeadows> whatever the nova elders believe is the best approach  21:53
<alanmeadows> ensuring I can still read docs, I should be fine with an RPC version of 1.28 against a rocky nova-manage to leverage heal_allocations  21:54
<alanmeadows> aka rocky nova-manage on ocata nova/placement for `heal_allocations` seems to be "ok"  21:55
<jaypipes> alanmeadows: honestly, I'm still trying to figure out if the code you link above is actually the thing that is deleting the resource provider and allocation records.  21:56
<jaypipes> mriedem: I mean, wouldn't self.host be equal to the new FQDN in this line? https://github.com/openstack/nova/blob/stable/ocata/nova/compute/manager.py#L6675 and therefore it would not find the old compute node record and call destroy() on it?  21:58
<mriedem> alanmeadows: heal_allocations shouldn't require anything over rpc  22:00
<mriedem> it's all db  22:00
<jaypipes> mriedem: ahhhhhhhhh  22:00
<jaypipes> mriedem: I think I understand now what happened...  22:01
<mriedem> well, was "Deleting orphan compute node" in the logs?  22:01
<jaypipes> alanmeadows: I bet you didn't change the nova.conf file's CONF.host option when you changed the hostname of the compute nodes, right?  22:02
<alanmeadows> mriedem: excellent question, working on an answer to that  22:02
<alanmeadows> jaypipes: we do not use `host` at this time, but clearly after this, we will drive it going forward  22:03
<alanmeadows> to avoid any shuffling without our consent  22:03
<alanmeadows> we let nova determine it  22:04
<alanmeadows> and of course, no one likes moving targets  22:04
<jaypipes> alanmeadows: and in doing so, there was a mismatch between the CONF.host value and what was returned by the virt driver's get_available_nodes() method (called from here: https://github.com/openstack/nova/blob/stable/ocata/nova/compute/manager.py#L6654). the issue is that get_available_nodename() doesn't use CONF.host. It uses the hypervisor's local hostname, which would be different (https://github.com/openstack/nova/blob/stable/ocata/nova/virt/libvirt/host.py#L681-L691)  22:06
<jaypipes> alanmeadows: and that's what caused the delete of orphaned compute nodes to run.  22:06
<jaypipes> mriedem: so, basically, nova-compute started up thinking it was the old hostname, libvirt told the compute manager it was the new hostname, and the compute manager deleted the compute node record referring to the old hostname.  22:07
<mriedem> yup, that's what the old evac issue was like  22:09
<mriedem> that code in the compute manager is really meant for ironic  22:09
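
A toy reconstruction (not nova code) of the mismatch jaypipes just summarized, using plain strings in place of compute node objects; the hostnames are made up:

    def find_orphan_nodes(db_hypervisor_hostnames, driver_reported_nodenames):
        """Return DB compute-node names the driver no longer reports."""
        reported = set(driver_reported_nodenames)
        return [name for name in db_hypervisor_hostnames if name not in reported]

    # Before the rename: the DB row and libvirt agree, so nothing is orphaned.
    assert find_orphan_nodes(["compute-01"], ["compute-01"]) == []

    # After the rename: the service (still keyed by the old hostname) loads the
    # old row, but libvirt now reports the FQDN, so the old record looks orphaned,
    # "Deleting orphan compute node" is logged, and the record is destroyed.
    assert find_orphan_nodes(["compute-01"], ["compute-01.example.com"]) == ["compute-01"]
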
<alanmeadows> luckily we don't allow orphan vms to be cleaned  22:09
<alanmeadows> or ... oops.  22:09
*** cdent has quit IRC  22:50
*** efried has quit IRC  22:53
<alanmeadows> looks like heal_allocations will require backporting  23:01
<mnaser> seems like hostname changing fun? :\  23:40
<mnaser> we just reboot servers on hostname changes now  23:40
<alanmeadows> since you popped in mnaser...  23:41
<alanmeadows> mriedem mentioned you had a script for fixing up allocations  23:42
<mnaser> i pasted it somewhere hmm  23:42
<mnaser> it was more meant for cleaning up in the sense of removing entries that should not be there  23:42
<alanmeadows> attempting to slam heal_allocations into ocata is proving... fun  23:42
<alanmeadows> so if you have something more simplistic about  23:42
<mnaser> but you could maybe rewrite it using the foundation to do more  23:42
<mnaser> let me find it  23:42
<mnaser> it was in a launchpad somewhere..  23:43
<mnaser> the world's worst site to search  23:43
<mnaser> alanmeadows: https://bugs.launchpad.net/nova/+bug/1793569  23:44
<openstack> Launchpad bug 1793569 in OpenStack Compute (nova) "Add placement audit commands" [Wishlist,Confirmed]  23:44
<mnaser> http://paste.openstack.org/show/734146/  23:44
<alanmeadows> this is much more hackable  23:45
<mnaser> so the idea is it hits the nova os-hypervisors api  23:45
<mnaser> and then kinda just does an audit comparing things back and forth  23:45
<alanmeadows> I'm not convinced the ocata placement api has everything heal_allocations wants (the report client definitely does not, but was fixing) - that rabbit hole feeling was creeping over me  23:46
<mnaser> if you can keep somewhat the same logic and add a way to make sure entries which are missing get added, it'll be even more useful  23:46
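
In the same spirit as the pasted script (a hedged sketch, not mnaser's paste): pull the hypervisor list from the nova API and the provider list from placement, then report anything that exists on only one side. Endpoints and the token are placeholders; real tooling would use keystoneauth sessions.

    import requests

    TOKEN = "REPLACE_ME"                     # placeholder keystone token
    NOVA = "http://nova-api:8774/v2.1"       # placeholder endpoint
    PLACEMENT = "http://placement-api:8778"  # placeholder endpoint
    HEADERS = {"X-Auth-Token": TOKEN}

    # Hypervisors known to nova and resource providers known to placement.
    hypervisors = requests.get(
        NOVA + "/os-hypervisors/detail", headers=HEADERS).json()["hypervisors"]
    providers = requests.get(
        PLACEMENT + "/resource_providers", headers=HEADERS).json()["resource_providers"]

    hv_names = {h["hypervisor_hostname"] for h in hypervisors}
    rp_names = {rp["name"]: rp["uuid"] for rp in providers}

    for name in sorted(hv_names - set(rp_names)):
        print("hypervisor %s has no resource provider" % name)
    for name in sorted(set(rp_names) - hv_names):
        print("resource provider %s (%s) has no matching hypervisor"
              % (name, rp_names[name]))
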
