opendevreview | melanie witt proposed openstack/nova master: Add stub unified limits driver https://review.opendev.org/c/openstack/nova/+/712137 | 00:47 |
---|---|---|
opendevreview | melanie witt proposed openstack/nova master: Assert quota related API behavior when noop https://review.opendev.org/c/openstack/nova/+/712140 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Make unified limits APIs return reserved of 0 https://review.opendev.org/c/openstack/nova/+/712141 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: DNM Run against unmerged oslo.limit changes https://review.opendev.org/c/openstack/nova/+/812236 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Add logic to enforce local api and db limits https://review.opendev.org/c/openstack/nova/+/712139 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Enforce api and db limits https://review.opendev.org/c/openstack/nova/+/712142 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Update quota_class APIs for db and api limits https://review.opendev.org/c/openstack/nova/+/712143 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Update limit APIs https://review.opendev.org/c/openstack/nova/+/712707 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Update quota sets APIs https://review.opendev.org/c/openstack/nova/+/712749 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Tell oslo.limit how to count nova resources https://review.opendev.org/c/openstack/nova/+/713301 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Enforce resource limits using oslo.limit https://review.opendev.org/c/openstack/nova/+/615180 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Add legacy limits and usage to placement unified limits https://review.opendev.org/c/openstack/nova/+/713498 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Update quota apis with keystone limits and usage https://review.opendev.org/c/openstack/nova/+/713499 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: Add reno for unified limits https://review.opendev.org/c/openstack/nova/+/715271 | 00:47 |
opendevreview | melanie witt proposed openstack/nova master: WIP Enable unified limits in the nova-next job https://review.opendev.org/c/openstack/nova/+/789963 | 00:47 |
*** tbachman_ is now known as tbachman | 03:02 | |
opendevreview | Balazs Gibizer proposed openstack/nova master: Reproduce bug 1952941 https://review.opendev.org/c/openstack/nova/+/820121 | 08:52 |
gibi | if somebody has good ideas on how to do a proper OVO data migration (including writing the upgraded data back to the DB) when an OVO is persisted in two different tables (instance_extra.numa_topology and request_spec.numa_topology) then let me know. See ^^ for reference | 09:07 |
gibi | I can create a solution that has two separate code paths, one for the instance_extra and one for the request_spec case, but that is i) ugly ii) not generic enough to handle the case when the object gets persisted in a 3rd table iii) creates a bad precedent for future OVO data migrations | 09:09 |
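
For context on the migration gibi is describing, here is a minimal, self-contained sketch of the two-code-path shape, using plain dicts in place of the real versioned objects; every helper and save function here is an illustrative stand-in, not a nova API.

```python
def _migrate_cell(cell):
    """Split the legacy 'cpuset' into 'cpuset'/'pcpuset' for one NUMA cell.

    Returns True if the cell was changed, i.e. it predates the Victoria
    on-disk format. (Simplified from the real migration logic.)
    """
    if 'pcpuset' in cell:
        return False  # already in the new format
    if cell.get('cpu_policy') == 'dedicated':
        # pinned instance: all guest cpus move to the dedicated set
        cell['pcpuset'], cell['cpuset'] = cell['cpuset'], set()
    else:
        cell['pcpuset'] = set()
    return True


def load_numa_topology(topo, save_back):
    """Migrate on load; the caller supplies the table-specific write-back."""
    changed = [_migrate_cell(cell) for cell in topo['cells']]
    if any(changed):
        save_back(topo)
    return topo

# Two call sites, two write-backs -- the duplication gibi dislikes:
#   load_numa_topology(blob, save_to_instance_extra)   # cell DB
#   load_numa_topology(blob, save_to_request_spec)     # API DB
```
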
*** efried1 is now known as efried | 09:48 | |
bauzas | morning | 09:51 |
gibi | bauzas: o/ good morning | 09:52 |
bauzas | gibi: we did some data migrations in the past | 09:52 |
bauzas | but if you need to update two objects, well, wow | 09:52 |
gibi | bauzas: yes, and the pcpuset one is broken :) | 09:53 |
gibi | bauzas: it is one OVO class that is persisted to two db tables | 09:53 |
* bauzas facepalms | 09:53 | |
gibi | InstanceNUMACell is part of instance_extra.numa_topology as well as request_spec.numa_topology | 09:53 |
bauzas | I guess we don't use the same ovo object for NUMATopology? | 09:53 |
gibi | same ovo class for both | 09:54 |
bauzas | lemme look at your bug above ^ | 09:54 |
bauzas | hah | 09:54 |
gibi | sure | 09:54 |
gibi | I'm hacking on a solution atm | 09:54 |
bauzas | okay, I think I understand the problem | 09:59 |
bauzas | we updated instance_extra | 10:02 |
bauzas | when we called the InstanceNumaTopology object | 10:02 |
bauzas | but if you don't hydrate this object this way, we don't update it | 10:03 |
gibi | yepp | 10:03 |
bauzas | I thought we had a db upgrade too before moving to Wallaby then | 10:03 |
bauzas | at least a nova-upgrade check | 10:03 |
bauzas | telling that some objects weren't updated | 10:04 |
gibi | I don't think so, as the code still has todos to remove the migration code once we are sure the objects are loaded at least once | 10:04 |
bauzas | at least that's what I'd do if I were writing an upgrade change | 10:04 |
gibi | but even if we had a blocking migration that would miss request spec too | 10:04 |
bauzas | gibi: because we persist it in the RequestSpec API DB? | 10:05 |
gibi | yepp | 10:05 |
bauzas | ok I see the problem | 10:05 |
bauzas | so in theory the cell db is upgraded | 10:05 |
gibi | I assume that if we forgot to migrate the request spec then we forgot to add a blocking migration for that too | 10:05 |
bauzas | but the api db continues to have old values | 10:05 |
gibi | yes | 10:05 |
gibi | or more precisely when an instance is loaded it has a proper value, but if a request spec is loaded it still has the old value | 10:06 |
bauzas | ok, so we need to use this migrate method for the requestspec object then | 10:06 |
bauzas | gibi: yup, understood | 10:06 |
gibi | yes, but we need to split the code as in case of request spec we need to persist the updated value to a different table | 10:06 |
gibi | hence my pain | 10:06 |
bauzas | hah | 10:06 |
gibi | it is ugly and non generic | 10:06 |
bauzas | got it | 10:07 |
bauzas | well, this should have been done in Victoria either way, right? | 10:07 |
bauzas | problem is, we assume that the InstanceNumaTopology object is only persisted in the instance_extra db table | 10:08 |
bauzas | right? | 10:08 |
gibi | right | 10:09 |
bauzas | if so, the object is broken | 10:09 |
bauzas | I mean the object persistency | 10:09 |
gibi | the problem is that the InstanceNUMATopology object does not know that it is used in two different contexts | 10:09 |
bauzas | we need to say we have to persist in two db tables | 10:09 |
bauzas | gibi: yup, hence my word 'broken' | 10:09 |
bauzas | in general we have object classes that are backed by a single db table | 10:10 |
gibi | I can add generic code to _obj_from_primitive that is called in both cases and does the data migration generically, but in that code I cannot decide which table to write the data back to | 10:10 |
bauzas | and we directly map the object fields with the db values | 10:10 |
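
A sketch of the generic-hook idea gibi mentions, and why it falls short: oslo.versionedobjects invokes `_obj_from_primitive()` during deserialization, so a migration placed there runs for every consumer, but the hook only sees the primitive and has no handle on the row it was read from. Illustrative code, not actual nova:

```python
from nova.objects import base


@base.NovaObjectRegistry.register
class InstanceNUMATopology(base.NovaObject):
    # fields elided for brevity

    def _split_cpuset_into_pcpuset(self):
        pass  # stand-in for the real field migration

    @classmethod
    def _obj_from_primitive(cls, context, objver, primitive):
        obj = super()._obj_from_primitive(context, objver, primitive)
        # the in-memory fix-up is easy to do generically here...
        obj._split_cpuset_into_pcpuset()
        # ...but there is no way from this hook to persist the upgraded
        # blob: the object cannot tell whether it came from instance_extra,
        # request_specs, or somewhere else entirely.
        return obj
```
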
bauzas | gibi: agreed with your problem, this is a pain to fix | 10:10 |
bauzas | the design itself has flaws | 10:10 |
bauzas | we somehow need to have ovo objects that know which db table they are related to | 10:11 |
bauzas | but honestly, I somehow feel those necessarily have to be two different objects | 10:11 |
bauzas | we can nest objects under others | 10:12 |
bauzas | but given RequestSpecs is at the API DB, we can't just hydrate the value from the cell DB values | 10:12 |
bauzas | or, the other way to look at it is that we only use RequestSpecs.InstanceNUMATopology as a non-persisted object | 10:13 |
bauzas | but we would need to scatter/gather the values from the cell DB before we hydrate such reqspec nested object | 10:13 |
bauzas | gibi: see? | 10:14 |
bauzas | one way to address this would be to gather the instancenumatopology object from the cell DB at the api level before we hydrate the requestspec | 10:14 |
gibi | I'm not even sure that for a single instance the InstanceNUMATopology in the instance and in the request spec are the same | 10:14 |
gibi | i.e. instance multicreate reuses a single request spec object afaik | 10:15 |
bauzas | this is fine | 10:15 |
bauzas | this is just for scheduling decisions | 10:15 |
bauzas | we don't need to persist this | 10:15 |
bauzas | that's why we always said we were fine with one single reqspec per boot call in case of multicreate | 10:16 |
bauzas | and in case of a move op, getting the generic reqspec is ok | 10:16 |
gibi | then I don't get it | 10:16 |
gibi | the failure happens during the scheduling of a migration of a pre Victoria instance | 10:16 |
gibi | in the NumaTopology filter | 10:17 |
bauzas | gibi: ok, so at the API, we need to gather the instance numa topology from the cell DB corresponding to where the instance is stored | 10:17 |
bauzas | and we will then hydrate the reqspec with the proper values of the instance | 10:18 |
gibi | no no, I get that. what I don't get is why the NumaTopology filter depends on this value | 10:18 |
gibi | what does the request_spec.numa_topology describe? | 10:19 |
gibi | is it what was requested? | 10:19 |
gibi | and the instance_extra.numa_topology represents what was actually selected | 10:19 |
bauzas | excellent question | 10:19 |
gibi | but then in multicreate there is one request but more than one actual selection | 10:20 |
gibi | and those selections can have different values, different pcpu ids | 10:20 |
gibi | anyhow I have to go back to hacking on a solution to fix an urgent downstream issue. The fix needs to be backportable so I will do the ugly code path duplication | 10:21 |
gibi | later on we can think about a proper fix that reimagines the ovo usage | 10:21 |
bauzas | good luck | 10:22 |
gibi | thanks | 10:22 |
gibi | you will see the result :D | 10:22 |
gibi | bauzas: how could this ever work? https://github.com/openstack/nova/blob/7670303aabe16d1d7c25e411d7bd413aee7fdcf3/nova/objects/request_spec.py#L727-L730 | 10:48 |
gibi | as far as I see db_spec is a dict | 10:48 |
gibi | db_spec.save() is a type error | 10:48 |
bauzas | gibi: no this isn't a dict | 10:52 |
gibi | is it something connected to a db model? | 10:53 |
bauzas | gibi: IIRC it returns the class object | 10:53 |
bauzas | oh wait, sec | 10:54 |
bauzas | it returns some SQLA object | 10:55 |
gibi | OK I think I see now, thanks | 10:56 |
gibi | the unit test simulates the db in a wrong way :/ | 10:57 |
bauzas | https://www.commitstrip.com/wp-content/uploads/2016/01/Strip-Voyage-dans-le-temps-650-finalenglish-2.jpg | 10:58 |
bauzas | gibi: I guess we need to pdb it | 10:58 |
bauzas | my object-persistency-fu is a bit rusty | 10:59 |
gibi | lol | 10:59 |
gibi | so the test mocks in a dict instead of a proper object from the db, hence my problem | 10:59 |
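
On the `db_spec.save()` question above: nova's API DB models mix in oslo.db's `ModelBase`, which provides dict-style `update()` and a `save()` that flushes through the session, so the real query result is not a dict even though it quacks like one. A simplified model, not nova's exact definition:

```python
from oslo_db.sqlalchemy import models
from sqlalchemy import Column, Integer, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class RequestSpecModel(models.ModelBase, Base):
    __tablename__ = 'request_specs'
    id = Column(Integer, primary_key=True)
    instance_uuid = Column(Text)
    spec = Column(Text)  # the serialized RequestSpec blob

# db_spec = session.query(RequestSpecModel).filter_by(...).first()
# db_spec.update({'spec': new_blob})  # dict-like update from ModelBase
# db_spec.save(session)               # flushes via the session
```

A dict mocked in by a unit test has no `save()`, which is exactly the mismatch gibi hit.
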
gibi | moving on | 11:00 |
gibi | stephenfin: do you remember if we have a blocking db migration for the cpuset -> pcpuset data migration? | 11:14 |
sean-k-mooney | yay another libosinfo bug that can break live migration this time. | 11:29 |
sean-k-mooney | gibi: i did not think we needed one | 11:29 |
gibi | sean-k-mooney: if we ever want to remove the data migration code from InstanceNUMATopology then we need a blocking migration to ensure that all the objects are loaded at least once | 11:30 |
sean-k-mooney | gibi right but i thought we were not going to drop that | 11:31 |
sean-k-mooney | i mean we can but we do it on load to avoid a data migration | 11:32 |
sean-k-mooney | gibi: back to your instance numa topology question | 11:32 |
sean-k-mooney | the one in the request spec is not the same as the one in the instance | 11:32 |
sean-k-mooney | the request spec version will not have actual cpus etc. assigned | 11:33 |
sean-k-mooney | it just has the constraints | 11:33 |
sean-k-mooney | e.g. 2 numa nodes with pinned cpus | 11:33 |
sean-k-mooney | the one in the instance actually will be fully populated | 11:33 |
sean-k-mooney | once it has landed on the compute node | 11:33 |
sean-k-mooney | and been assigned cores and saved back to the db | 11:33 |
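
To make the contrast concrete, here is illustrative hand-written data (field layout simplified from the real `InstanceNUMACell` primitives): the request_spec copy carries only the constraint, while the instance_extra copy records what the compute node actually assigned.

```python
# request_spec side: "give me 2 pinned guest cpus" -- a constraint only
request_spec_cell = {
    'pcpuset': {0, 1},            # pinned guest cpus requested
    'cpu_policy': 'dedicated',
    'cpu_pinning': None,          # no host cpus chosen yet
}

# instance_extra side: populated once the instance lands on a host
instance_extra_cell = {
    'pcpuset': {0, 1},
    'cpu_policy': 'dedicated',
    'cpu_pinning': {0: 4, 1: 5},  # guest cpu -> host cpu assignment
}
```
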
gibi | sean-k-mooney: yeah, so we cannot use the data in instance_extra.numa_topology to populate request_spec.numa_topology due to multicreate | 11:34 |
gibi | we need to independently handle request_spec.numa_topology and migrate it | 11:35 |
sean-k-mooney | well we should not be populating the request spec version and persisting it | 11:35 |
sean-k-mooney | gibi: we clone the request spec to avoid modifying it https://github.com/openstack/nova/blob/7670303aabe16d1d7c25e411d7bd413aee7fdcf3/nova/scheduler/filters/numa_topology_filter.py#L64-L71 | 11:38 |
sean-k-mooney | so that multi create works | 11:38 |
sean-k-mooney | stephen fixed that in https://github.com/openstack/nova/commit/bff2030ecea8a1d21e03c61a7ece02f40dc25c5d | 11:38 |
sean-k-mooney | https://bugs.launchpad.net/nova/+bug/1655979 | 11:39 |
gibi | sean-k-mooney: so there will be a request spec for each multicreated instance during scheduling but only the first request spec is persisted; the rest are just temporary | 11:39 |
gibi | or we call create() on the clones as well? | 11:39 |
gibi | ohh | 11:40 |
sean-k-mooney | no only one will be persisted | 11:40 |
sean-k-mooney | we make a copy and use that in the filter | 11:40 |
gibi | and that one has numa_topology? | 11:40 |
sean-k-mooney | the initial request spec has a partly populated numa_topology object | 11:41 |
sean-k-mooney | which is intentional | 11:41 |
sean-k-mooney | it's created by numa_get_constraints https://github.com/openstack/nova/blob/8d9785b965657d42f20e1ad7234f570077a387d7/nova/virt/hardware.py#L1858 | 11:42 |
sean-k-mooney | in the api | 11:42 |
sean-k-mooney | https://github.com/openstack/nova/blob/ff4b396abff80dea5a54dfe830a7db3a97a7360c/nova/compute/api.py#L1068 | 11:43 |
sean-k-mooney | as part of _validate_and_build_base_options | 11:43 |
sean-k-mooney | that is what is stored in the request_spec | 11:43 |
gibi | so that creates numa cell objects with cpuset values | 11:44 |
sean-k-mooney | no the cpuset values will not be populated | 11:45 |
gibi | https://github.com/openstack/nova/blob/8d9785b965657d42f20e1ad7234f570077a387d7/nova/virt/hardware.py#L1553 | 11:45 |
gibi | https://github.com/openstack/nova/blob/8d9785b965657d42f20e1ad7234f570077a387d7/nova/virt/hardware.py#L2041 | 11:45 |
gibi | based on this they will | 11:45 |
sean-k-mooney | well ok they will https://github.com/openstack/nova/blob/a027b45e46fb3a63166b8b86ef7f99b0b04bcec8/nova/virt/hardware.py#L1552-L1553 | 11:46 |
sean-k-mooney | from https://github.com/openstack/nova/blob/a027b45e46fb3a63166b8b86ef7f99b0b04bcec8/nova/virt/hardware.py#L2040-L2051 | 11:46 |
gibi | yepp | 11:46 |
sean-k-mooney | they wont be mapped to host cores at this point | 11:47 |
gibi | true | 11:47 |
sean-k-mooney | but we will know how many floating or pinned cores are required | 11:47 |
sean-k-mooney | gibi: what is your current issue by the way | 11:48 |
sean-k-mooney | i only read back part of the irc log | 11:48 |
gibi | OK. So we need to do the cpuset -> pcpuset data migration in request_spec.numa_topology that was missed originally in Victoria. And we cannot avoid that by populating request_spec.numa_topology from instance_extra.numa_topology | 11:48 |
gibi | sean-k-mooney: the cpuset -> pcpuset data migration was only done in instance_extra.numa_topology and not in request_spec.numa_topology | 11:49 |
gibi | so when a pre Victoria pinned instance is migrated after upgrade to Victoria the scheduler blows | 11:49 |
gibi | as it tries to read out pcpuset but it is not populated | 11:50 |
gibi | see https://review.opendev.org/c/openstack/nova/+/820121 | 11:50 |
gibi | so I'm working on a bugfix for it, but it will be ugly as the InstanceNUMATopology object is persisted in two tables but the OVO itself does not know about that fact | 11:51 |
gibi | so I cannot add a generic solution | 11:51 |
gibi | I have to keep the upgrading of the two table in separate paths | 11:51 |
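
The "blows up" gibi describes is characteristic of oslo.versionedobjects: reading a declared field that was never set falls through to `obj_load_attr()`, whose base implementation raises `NotImplementedError`. A standalone demonstration (not nova code; the field types are arbitrary):

```python
from oslo_versionedobjects import base, fields


@base.VersionedObjectRegistry.register
class DemoCell(base.VersionedObject):
    fields = {
        'cpuset': fields.ListOfIntegersField(),
        'pcpuset': fields.ListOfIntegersField(),  # added in a later version
    }


cell = DemoCell(cpuset=[0, 1])  # like an unmigrated pre-Victoria blob
print(cell.cpuset)              # [0, 1]
try:
    cell.pcpuset                # unset field, no obj_load_attr override
except NotImplementedError as e:
    print(e)                    # roughly: "Cannot load 'pcpuset' in the base class"
```
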
sean-k-mooney | gibi well we can just do the migration in the ovo no? | 11:52 |
sean-k-mooney | rather than in the db | 11:52 |
gibi | the ovo does not know which table to update | 11:52 |
gibi | the current migration persists the data after the data migration | 11:52 |
gibi | but only in instance_extra | 11:52 |
sean-k-mooney | well it only gets saved indirectly via either a request_spec.save or instance.save | 11:52 |
gibi | nope | 11:52 |
sean-k-mooney | where do we save the topology directly | 11:53 |
gibi | sec... | 11:53 |
gibi | https://github.com/openstack/nova/blob/7670303aabe16d1d7c25e411d7bd413aee7fdcf3/nova/objects/instance_numa.py#L203 | 11:53 |
gibi | here | 11:53 |
sean-k-mooney | right so that is in the data migration code | 11:54 |
gibi | so when an instance that needed a migration is loaded, the instance_extra is changed in the DB during the load | 11:54 |
sean-k-mooney | but normally we do not do that | 11:54 |
sean-k-mooney | ya only for old instances | 11:54 |
sean-k-mooney | we can do the same in the request spec | 11:54 |
sean-k-mooney | when it's loaded, if it has an old object version it can also just save it | 11:55 |
gibi | yes, that is what I'm doing but that is ugly | 11:55 |
gibi | i) ugly ii) not generic enough to handle the case when the object will be persisted in a 3rd table iii) creates a bad precedent for future OVO data migrations | 11:56 |
sean-k-mooney | https://github.com/openstack/nova/blob/7670303aabe16d1d7c25e411d7bd413aee7fdcf3/nova/objects/request_spec.py#L245-L262 | 11:56 |
gibi | I will do it as I have to provide a fix today downstream that is backportable, but I hate it | 11:56 |
sean-k-mooney | gibi: well you should not be updating multiple tables really | 11:56 |
gibi | sean-k-mooney: those are only called if the request spec is created from an instance object | 11:57 |
sean-k-mooney | you are either updating the instance copy or the request spec copy | 11:57 |
sean-k-mooney | they are not meant to have the same data in them | 11:57 |
gibi | yes, but from the OVO perspective it does not know which | 11:57 |
gibi | so I cannot have a generic _obj_from_primitive call in InstanceNUMATopology that does the migration as that call has no info where to persist the result | 11:58 |
sean-k-mooney | the InstanceNUMATopology object won't | 11:58 |
sean-k-mooney | but the request spec will know it's the request spec | 11:58 |
sean-k-mooney | you should be fixing this in the request spec object, not the InstanceNUMATopology | 11:58 |
gibi | so now the RequestSpec object has to know that there is an InstanceNUMATopology data migration that needs to be run, that is baaad encapsulation | 11:59 |
sean-k-mooney | not really | 11:59 |
gibi | the data format change should be encapsulated in the InstanceNUMATopology in my eyes | 11:59 |
sean-k-mooney | the instance numa topology object is a child object of the request spec | 11:59 |
sean-k-mooney | gibi: well it can be encapsulated there | 12:00 |
sean-k-mooney | but the request spec should call InstanceNUMATopology to parse the data and also to generate the updated serialised version | 12:00 |
sean-k-mooney | then the request spec should save that as part of itself | 12:01 |
gibi | but that is bad encapsulation | 12:01 |
gibi | the data migration should be automatic and opaque to the client of the object | 12:01 |
sean-k-mooney | gibi: how? it's just creating the InstanceNUMATopology from a json blob and then saving it as a json blob | 12:01 |
sean-k-mooney | gibi: it would be automatic if we bumped the RequestSpec ovo version when the ovo version of one of its contained objects is increased | 12:03 |
sean-k-mooney | but we don't | 12:03 |
sean-k-mooney | so the request spec just needs to detect that the contained ovo is an older version than the current one | 12:03 |
sean-k-mooney | then construct the new one using from_primitive and then call to_primitive to get the updated version and save it back | 12:04 |
sean-k-mooney | gibi: we should be doing the data migration in the from_primitive function | 12:04 |
sean-k-mooney | in InstanceNUMATopology | 12:04 |
gibi | sean-k-mooney: but then you don't know from the client side if re-persisting is needed or not | 12:04 |
sean-k-mooney | that will make it transparent to the client and keep the encapsulation | 12:04 |
sean-k-mooney | gibi: well you would if the version changed but i guess that is a catch 22 | 12:05 |
sean-k-mooney | we might need a way to ask the InstanceNUMATopology ovo if it needs migration | 12:05 |
sean-k-mooney | by passing it the primitive object from the db and having it return true/false | 12:06 |
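
One reading of that suggestion, as a hedged sketch: `needs_migration()` is an invented name, not an existing nova API, and the primitive layout is simplified. The format knowledge stays inside `InstanceNUMATopology`, while the owning object decides whether and where to re-persist:

```python
from nova.objects import base


class InstanceNUMATopology(base.NovaObject):
    # fields elided for brevity

    @classmethod
    def needs_migration(cls, primitive):
        """True if this serialized blob predates the cpuset/pcpuset split."""
        cells = primitive['nova_object.data']['cells']
        return any('pcpuset' not in c['nova_object.data'] for c in cells)

# owner side (illustrative), e.g. while loading a request_specs row:
#   topo_prim = spec_prim['nova_object.data']['numa_topology']
#   spec.numa_topology = InstanceNUMATopology.obj_from_primitive(topo_prim)
#   if InstanceNUMATopology.needs_migration(topo_prim):
#       spec.save()  # the RequestSpec knows its own table
```
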
stephenfin | gibi: I think you've figured this out already but no, no blocking DB migration yet. We could add one but it hasn't been done | 12:07 |
stephenfin | reading back through logs. That looks like a hairy bug | 12:07 |
gibi | stephenfin: thanks. I will not have time to add the blocking migration now, but maybe at some point in the future | 12:08 |
sean-k-mooney | is this only needed when we have mixed cpus | 12:11 |
gibi | nope | 12:11 |
gibi | what you need is a pre-Victoria pinned instance migrated post Victoria | 12:11 |
sean-k-mooney | ok so it's from the PCPU in placement changes | 12:11 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Migrate RequestSpec.numa_topology to use pcpuset https://review.opendev.org/c/openstack/nova/+/820153 | 12:12 |
gibi | sean-k-mooney, bauzas, stephenfin: ^^ that is my first stab to fix this | 12:12 |
sean-k-mooney | gibi: as a quick hack you can just update the numa topology object in the filter | 12:12 |
sean-k-mooney | that won't actually fix it, will it. well, it will work but it won't update the request spec version | 12:13 |
gibi | yeah that would be an option too, if I can assume that only the numa filter depends on this | 12:14 |
sean-k-mooney | ah you also do it there | 12:14 |
gibi | I do it now in the RequestSpec loading | 12:14 |
sean-k-mooney | yep just looking at it now | 12:15 |
sean-k-mooney | we use it in the api also | 12:15 |
sean-k-mooney | when validating rebuilds | 12:15 |
sean-k-mooney | although i'm not sure we actually depend on the cpu sets directly | 12:15 |
sean-k-mooney | i think we generate new objects from the old and new image | 12:15 |
sean-k-mooney | and assert they are the same | 12:16 |
stephenfin | gibi: that's how I'd have done it also | 12:16 |
sean-k-mooney | stephenfin: in the filter or gibi's patch? | 12:17 |
stephenfin | gibi's patch | 12:17 |
sean-k-mooney | ya i was assuming we would do it on load | 12:18 |
sean-k-mooney | to avoid the blocker migration | 12:18 |
stephenfin | doing it in the filter leaves us having to support that forever. As a temporary fix, sure, but it's not viable long-term | 12:18 |
sean-k-mooney | gibi im not sure we can assume we can remove this immediately after Yoga | 12:20 |
sean-k-mooney | we might want to keep it for a few releases in the event that people do an in-place upgrade | 12:20 |
stephenfin | sean-k-mooney: when do we ever remove things when we said we would | 12:20 |
gibi | :D | 12:20 |
sean-k-mooney | lol true | 12:20 |
gibi | sean-k-mooney: technically if we add a blocking migration then we can remove this in Yoga | 12:21 |
gibi | probably we wont add that | 12:21 |
sean-k-mooney | gibi: we could also do it as a nova manage command with a nova status check | 12:22 |
gibi | yepp | 12:22 |
bauzas | honestly, my take is that we shouldn't persist this object in the API DB | 12:22 |
gibi | if we trust the users to run the upgrade check | 12:22 |
sean-k-mooney | bauzas: no the current behavior is correct | 12:22 |
bauzas | if it's just there for the scheduler to use | 12:22 |
sean-k-mooney | bauzas: we need to persist it to persist the numa constraints | 12:22 |
sean-k-mooney | well technically we can generate it again | 12:23 |
bauzas | which service is persisting it? | 12:23 |
sean-k-mooney | if we have the embedded image and flavor | 12:23 |
sean-k-mooney | it captures the numa scheduling requirements | 12:23 |
bauzas | I need to taxi my daughter now | 12:23 |
bauzas | but let's discuss this after | 12:23 |
sean-k-mooney | if we don't persist it we need to regenerate it from the embedded flavor and image | 12:23 |
bauzas | or from the instance one when moving, no? | 12:25 |
bauzas | anyway, me moving | 12:25 |
sean-k-mooney | no, the instance object is fully populated. it might work but it would not work for initial scheduling | 12:26 |
sean-k-mooney | we would still need to store it in the request spec to not change the filter api | 12:26 |
bauzas | sean-k-mooney: that's something we already do | 12:44 |
bauzas | sean-k-mooney: we either populate from the image and/or the flavor first | 12:44 |
bauzas | but then for move ops, we get it from somewhere else | 12:44 |
sean-k-mooney | where? we currently populate it from the instance or when we build the request spec from its components | 12:47 |
sean-k-mooney | bauzas: gibi we could make it a property and construct it from the image and flavor if we needed to | 12:48 |
sean-k-mooney | and stop persisting it | 12:48 |
sean-k-mooney | it might slow down scheduling somewhat as numa_get_constraints is not super cheap but it's also not that expensive either | 12:51 |
* gibi is on a call | 12:52 | |
sean-k-mooney | it's a pure function of its inputs https://github.com/openstack/nova/blob/7670303aabe16d1d7c25e411d7bd413aee7fdcf3/nova/virt/hardware.py#L1858 but we would have to make sure to use the correct flavor on resize | 12:52 |
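
The shape of the derive-instead-of-persist idea (what gibi later pushes as a WIP patch below): `nova.virt.hardware.numa_get_constraints(flavor, image_meta)` is real and linked above; the property here is only an illustration of the derivation, since a real OVO could not simply shadow a declared field like this.

```python
from nova.objects import base
from nova.virt import hardware


class RequestSpec(base.NovaObject):
    # fields elided; assume self.flavor and self.image are populated

    @property
    def numa_topology(self):
        # a pure function of its inputs, so recomputing is safe; on resize
        # the caller must make sure self.flavor is already the *new* flavor
        return hardware.numa_get_constraints(self.flavor, self.image)
```
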
gibi | sean-k-mooney: you mean not persist the numa_topology in the RequestSpec at all? Just create it from the image/flavor every time we need it? | 13:48 |
sean-k-mooney | gibi: yep that is what bauzas is suggesting | 13:50 |
sean-k-mooney | which we could actually do in the case of the request spec | 13:50 |
bauzas | sorry, a bit busy on and off | 13:50 |
bauzas | but yeah I don't like persisting the same data in another DB | 13:51 |
sean-k-mooney | the numa constraint function is relatively cheap in comparison to the actual numa affinity process | 13:51 |
sean-k-mooney | bauzas: it's really two different things | 13:51 |
sean-k-mooney | we just use the same object | 13:51 |
sean-k-mooney | one is the numa affinity request object | 13:51 |
sean-k-mooney | and the other is the final affinity object which stores the assigned cpus and numa nodes | 13:51 |
sean-k-mooney | we just reused the same object for both | 13:52 |
gibi | OK. I got it thanks. | 13:52 |
sean-k-mooney | gibi: if we go with the generate-it-when-needed approach i would prefer not to backport that, mainly because i don't want to have to think 'do we have that or not' when looking at different downstream releases | 13:53 |
sean-k-mooney | but i'm not against it for master onward | 13:53 |
gibi | yeah I would not backport that either | 13:53 |
gibi | Let's see if I can get some time to add that to master | 13:54 |
gibi | but I will go with the current patch as a backportable thing | 13:54 |
bauzas | sean-k-mooney: I understand that's two different things | 13:54 |
bauzas | but using the same object is creating some concerns | 13:54 |
gibi | we need that to be backportable back to Victoria | 13:54 |
gibi | bauzas: totally agree, I think it is worth removing it from the second table | 13:55 |
bauzas | ++ | 13:55 |
sean-k-mooney | gibi: i reviewed that and +1'd it by the way; the only thing i would add is a release note but it looks correct to me | 13:55 |
sean-k-mooney | well actually | 13:55 |
sean-k-mooney | did you update the func test | 13:55 |
sean-k-mooney | i closed the review | 13:56 |
sean-k-mooney | ok you did not | 13:56 |
sean-k-mooney | so ya, release note and then you need to fix the reproducer func test | 13:57 |
gibi | sure I will add a reno | 13:57 |
sean-k-mooney | which passed... | 13:57 |
gibi | the reproducer was fixed but it is not a func test, it is a unit test | 13:57 |
sean-k-mooney | oh ok | 13:57 |
sean-k-mooney | ya it is | 13:57 |
sean-k-mooney | https://review.opendev.org/c/openstack/nova/+/820153/1/nova/tests/unit/objects/test_request_spec.py | 13:57 |
gibi | creating a pre victoria instance in the func test is far from trivial | 13:58 |
gibi | so I dropped that direction | 13:58 |
sean-k-mooney | sorry i normally expect those to be func tests but ya backporting the func test to victoria would be non-trivial | 13:59 |
sean-k-mooney | in this specific case i think a unit test is ok | 13:59 |
gibi | that's not what I meant. I mean creating a func test on master that simulates a pre-Victoria instance is hard | 13:59 |
gibi | I would need to dig into the DB anyhow to backlevel the data to pre-Victoria as our object layer will persist the master version | 14:00 |
*** whoami-rajat__ is now known as whoami-rajat | 14:00 | |
sean-k-mooney | yes so you would have to boot the vm | 14:00 |
sean-k-mooney | then update its request spec | 14:00 |
sean-k-mooney | then migrate it and assert it explodes | 14:00 |
sean-k-mooney | gibi: i assumed you meant because we will be missing several of the helper functions | 14:01 |
gibi | not that ^^ | 14:01 |
gibi | but the "then update its request spec" | 14:01 |
gibi | that would be heavy DB digging in a func test | 14:01 |
sean-k-mooney | gibi: ya you would effectively have to directly execute sql | 14:03 |
sean-k-mooney | which i think is overkill | 14:04 |
gibi | and in that sql manipulate a highly nested json dict :D | 14:04 |
sean-k-mooney | and this is all in a json blob in the db too so it's not exactly nice to update via sql either | 14:04 |
sean-k-mooney | exactly | 14:04 |
gibi | exactly :D | 14:04 |
sean-k-mooney | jinx :) | 14:04 |
sean-k-mooney | so ya it's understandable that you chose to go the unit test route | 14:04 |
gibi | I have a downstream env where they can reproduce the issue so I will have real test result hopefully in a day | 14:04 |
gibi | that env is a complicated one simulating an upgrade from Mitaka to Victoria :D | 14:05 |
gibi | so when I say pre-Victoria instance that is actually a Mitaka instance :D | 14:06 |
sean-k-mooney | i suspect the only reason we have not hit this downstream is our last release was based on train | 14:06 |
sean-k-mooney | and the next one which will be based on wallaby is not releasing until next year | 14:06 |
gibi | yepp, the bug is introduced in Victoria | 14:06 |
sean-k-mooney | that is after the initial cpu in placement work | 14:07 |
sean-k-mooney | that was in train | 14:07 |
sean-k-mooney | this is part of the mixed cpu feature i think | 14:07 |
sean-k-mooney | ya part of https://specs.openstack.org/openstack/nova-specs/specs/victoria/implemented/use-pcpu-vcpu-in-one-instance.html | 14:07 |
sean-k-mooney | https://specs.openstack.org/openstack/nova-specs/specs/victoria/implemented/use-pcpu-vcpu-in-one-instance.html#work-items | 14:08 |
sean-k-mooney | it added the pcpuset field | 14:08 |
sean-k-mooney | initially after the pcpu in placement change we just used cpuset since all cores were either pinned or not and we could tell that from the cpu_policy config | 14:10 |
gibi | yepp that is my understanding too | 14:10 |
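
To make the history recap concrete, an illustrative before/after of the serialized cell data for a pinned instance (hand-written, simplified from the real primitives):

```python
# pre-Victoria: one cpuset, with pinning implied by cpu_policy
train_era_cell = {
    'cpuset': [0, 1, 2, 3],
    'cpu_policy': 'dedicated',
}

# Victoria onward (the mixed-cpu spec): dedicated cpus split out
victoria_cell = {
    'cpuset': [],               # floating/shared guest cpus
    'pcpuset': [0, 1, 2, 3],    # pinned guest cpus
    'cpu_policy': 'dedicated',
}

# The on-load migration rewrote instance_extra blobs into the new shape,
# but the copy inside request_specs was missed -- hence the bug above.
```
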
opendevreview | Balazs Gibizer proposed openstack/nova master: Migrate RequestSpec.numa_topology to use pcpuset https://review.opendev.org/c/openstack/nova/+/820153 | 14:14 |
gibi | now with release notes :) | 14:14 |
sean-k-mooney | :) | 14:17 |
sean-k-mooney | i'll wait for ci to finish and i'll try and re-review later | 14:17 |
gibi | thanks | 14:19 |
*** akekane__ is now known as abhishekk | 14:58 | |
*** tosky_ is now known as tosky | 16:17 | |
*** priteau is now known as Guest7388 | 16:38 | |
*** priteau_ is now known as priteau | 16:38 | |
opendevreview | Balazs Gibizer proposed openstack/nova master: [WIP]Stop persisting RequestSpec.numa_topology https://review.opendev.org/c/openstack/nova/+/820215 | 17:43 |
*** artom__ is now known as artom | 19:36 | |
*** tbachman_ is now known as tbachman | 19:59 | |
*** tbachman_ is now known as tbachman | 20:10 | |
*** tbachman_ is now known as tbachman | 21:05 |