bauzas | gentle reminder that we have a couple of open patches that we need to merge before RC1 : https://etherpad.opendev.org/p/nova-epoxy-rc-potential | 10:47 |
bauzas | also, I'll be off today after noon CET (ie. in 15 mins) | 10:47 |
opendevreview | ribaudr proposed openstack/nova master: FUP improve and add integration tests for PCI SR-IOV servers https://review.opendev.org/c/openstack/nova/+/944106 | 12:11 |
opendevreview | ribaudr proposed openstack/nova master: FUP Add a warning to make non-explicit live migration request debugging easier https://review.opendev.org/c/openstack/nova/+/944133 | 12:11 |
opendevreview | ribaudr proposed openstack/nova master: FUP Update pci-passthrough and virtual-gpu documentation https://review.opendev.org/c/openstack/nova/+/944153 | 12:11 |
opendevreview | Merged openstack/nova master: doc: mark the maximum microversion for 2025.1 Epoxy https://review.opendev.org/c/openstack/nova/+/943948 | 12:32 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Reproduce bug/2098496 https://review.opendev.org/c/openstack/nova/+/941673 | 13:21 |
opendevreview | Dan Smith proposed openstack/nova-specs master: Add one-time-use-devices spec https://review.opendev.org/c/openstack/nova-specs/+/943486 | 13:37 |
dansmith | bauzas: typo fix, if you could ^ | 13:37 |
dansmith | melwitt: I still think your +1 on that is useful if you want to re-add it (and I don't blame you, this is trial-by-PCI fire for me) | 13:38 |
opendevreview | Dan Smith proposed openstack/nova master: Add one-time-use devices docs and reno https://review.opendev.org/c/openstack/nova/+/944262 | 14:29 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Reproduce bug/2098496 https://review.opendev.org/c/openstack/nova/+/941673 | 15:29 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Ignore metadata tags in pci/stats _find_pool logic https://review.opendev.org/c/openstack/nova/+/944277 | 15:29 |
gibi | dansmith: just an FYI. This ^^ is the bugfix in the PCI in Placement codepath that fixes double allocations due to pool breakup | 15:31 |
dansmith | um, you say that like I should have context on that issue.. did we discuss this before? | 15:31 |
gibi | yepp in our last sprint planning downstream ;) | 15:32 |
gibi | you were interested in reviewing it | 15:32 |
gibi | the main issue is that if you have like 2 VFs on a single PF | 15:32 |
dansmith | okay I shall refresh myself :D | 15:32 |
gibi | and boot a VM requesting 1 VF, then delete the VM and boot again, the second VM will get 2 VFs instead of one | 15:33 |
dansmith | oh right right sorry | 15:33 |
gibi | the way the pci stats device pooling interacts with the placement driven PCI allocation leads to some broken assumptions | 15:34 |
gibi | causing the two VFs to be broken up into two pools that both refer to the same placement RP | 15:34 |
gibi | while the other side of the code assumes that devices from the same RP are always in the same pool | 15:35 |
dansmith | I totally remember the conversation now, yep | 15:35 |
gibi | now both sides of the code are improved. i) we avoid breaking up the pool ii) even if the pool is broken up in the future the other side of the code will be more graceful | 15:36 |
gibi | and not just allocate the same amount from each pool | 15:36 |
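The double allocation gibi describes can be sketched in a few lines of plain Python. This is purely illustrative, not nova's actual pci/stats code; the pool shapes, function names, and addresses are all made up:

```python
# Two VFs share one PF, hence one placement RP. If metadata tags split
# them into two pools, a consumer that takes the requested count from
# *each* pool of the RP hands out 2 VFs for a request of 1.

def naive_consume(pools, rp_uuid, count):
    """Buggy: assumes all devices of an RP live in a single pool."""
    allocated = []
    for pool in pools:
        if pool["rp_uuid"] == rp_uuid:
            allocated.extend(pool["devices"][:count])  # takes `count` per pool
    return allocated

def fixed_consume(pools, rp_uuid, count):
    """Takes at most `count` devices total across all pools of the RP."""
    allocated = []
    for pool in pools:
        for dev in pool["devices"]:
            if pool["rp_uuid"] == rp_uuid and len(allocated) < count:
                allocated.append(dev)
    return allocated

# The two VFs of the same PF ended up split into two pools.
pools = [
    {"rp_uuid": "rp-1", "devices": ["0000:81:00.2"]},
    {"rp_uuid": "rp-1", "devices": ["0000:81:00.3"]},
]

print(len(naive_consume(pools, "rp-1", 1)))  # 2 -- double allocation
print(len(fixed_consume(pools, "rp-1", 1)))  # 1
```

The fix direction matches gibi's point ii): even with a broken-up pool set, counting allocations across all pools of an RP, instead of per pool, keeps the claim correct.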
dansmith | okay I skimmed the repro thing.. I need to go learn about the pooling as I have a big gap there.. I'm also surprised this affects placement, but I guess because of some healing | 15:45 |
dansmith | but the "doesn't end up like we started" part of that makes sense of course | 15:45 |
sean-k-mooney | the pooling of the pci devices? | 15:46 |
gibi | sean-k-mooney: yepp | 15:46 |
gibi | dansmith: exactly I also think it affects placement due to healing, I will assert that in the repro.. | 15:47 |
sean-k-mooney | nova's pci tracker kind of works like placement, but from before placement was a thing. we summarised the "inventories" of devices into pools and that's what we expose to the scheduler | 15:47 |
sean-k-mooney | rather than all the devices on each host | 15:47 |
dansmith | I assume the pools are because of numa nodes or something? | 15:47 |
sean-k-mooney | not exactly | 15:48 |
sean-k-mooney | they are partly related ot that | 15:48 |
sean-k-mooney | but they can be split on other things like neutron physnet | 15:48 |
gibi | or rp_uuid :) | 15:49 |
sean-k-mooney | ya that was grafted on after the fact for pci in placement | 15:49 |
gibi | dansmith: (confirmed locally that indeed the healing part breaks the placement view) | 15:49 |
sean-k-mooney | before, a pool of VFs (think inventory of VFs) was grouped by the vendor id and product id, and i think parent and/or physnet? | 15:50 |
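That grouping can be illustrated with a toy key function. The field names and devices below are hypothetical; nova's real pooling logic lives in nova/pci/stats.py:

```python
from collections import Counter

def pool_key(dev):
    # Devices that agree on every key field collapse into one pool.
    return (dev["vendor_id"], dev["product_id"],
            dev.get("parent_addr"), dev.get("physnet"))

devs = [
    {"vendor_id": "8086", "product_id": "1515",
     "parent_addr": "0000:81:00.0", "physnet": "physnet0"},
    {"vendor_id": "8086", "product_id": "1515",
     "parent_addr": "0000:81:00.0", "physnet": "physnet0"},
    {"vendor_id": "8086", "product_id": "10fb",
     "parent_addr": None, "physnet": None},
]

# Summarise the device list into pools with counts.
pools = Counter(pool_key(d) for d in devs)
for key, count in pools.items():
    print(key, count)
# the two identical VFs form one pool of 2; the PF is its own pool of 1
```

This is the "summary instead of every device" idea: the scheduler only ever sees the (key, count) pairs, not the individual addresses.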
dansmith | but.. is there a pool object somewhere? seems like maybe they're just generated on the fly when we go to process a request? | 15:51 |
sean-k-mooney | yes | 15:51 |
sean-k-mooney | it's stored in the compute node table | 15:51 |
sean-k-mooney | that's where the scheduler gets them from to add to the host state object | 15:51 |
dansmith | oh yeah I see | 15:52 |
dansmith | but this pci/stats thing (which seems not very related to stats) defines Pool as a dict | 15:52 |
sean-k-mooney | no its an ovo | 15:53 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/objects/pci_device_pool.py | 15:53 |
dansmith | I see that | 15:53 |
sean-k-mooney | well it's stored as a json blob, if that's what you're asking | 15:53 |
dansmith | https://review.opendev.org/c/openstack/nova/+/944277/1/nova/pci/stats.py line 39 | 15:53 |
dansmith | ^ is what I'm talking about | 15:53 |
sean-k-mooney | we convert it here https://github.com/openstack/nova/blob/master/nova/objects/compute_node.py#L321-L326 | 15:54 |
sean-k-mooney | which is called in save on the compute node here https://github.com/openstack/nova/blob/master/nova/objects/compute_node.py#L360 | 15:55 |
gibi | stats pool concept works with dict yes, the DB model works with jsonified ovos | 15:55 |
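The split gibi describes looks roughly like this. Purely illustrative: the real column holds a serialized PciDevicePoolList OVO with its own envelope, not bare dicts:

```python
import json

# In-memory side (pci/stats.py): pools are worked on as plain dicts.
pools = [{"vendor_id": "8086", "product_id": "1515", "count": 2}]

# DB side: the compute_nodes.pci_stats text column stores a JSON blob,
# reconstructed into objects when the compute node record is loaded.
blob = json.dumps(pools)
restored = json.loads(blob)
print(restored == pools)  # True
```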
dansmith | and the PciDeviceStats seems to call a PciDevicePoolList "stats" | 15:55 |
sean-k-mooney | ya | 15:56 |
dansmith | which it then converts to a dict which it calls "pools" | 15:56 |
dansmith | wtf? | 15:56 |
sean-k-mooney | the naming is, shall we say, inconsistent | 15:56 |
dansmith | I was going to pick a different adjective.. possibly an emoji | 15:56 |
sean-k-mooney | the db column is called pci_stats | 15:57 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/db/main/models.py#L227-L229 | 15:57 |
dansmith | smh | 15:57 |
sean-k-mooney | the stats name came from the fact it's a summary | 15:57 |
sean-k-mooney | i.e. a count and some metadata | 15:58 |
dansmith | so PciDevicePool has a product and vendor but not an address, | 15:58 |
sean-k-mooney | rather then each device | 15:58 |
sean-k-mooney | correct | 15:58 |
dansmith | so I assume we take the alias map and generate all the PciDevicePool objects we need with the appropriate count, even if count=1 because they just specified one address? | 15:58 |
sean-k-mooney | it's a count of how many of that type of device (product/vendor) are available | 15:58 |
sean-k-mooney | yes | 15:58 |
sean-k-mooney | if the pci dev spec only allows 1 device you get a pool of 1 | 15:59 |
dansmith | this code is, I think, a very nice concrete example of why I normally just gloss over anytime PCI stuff comes up and try to ignore it :) | 15:59 |
dansmith | ack | 15:59 |
sean-k-mooney | i actually like this code, partly because to me it's very placement-like | 15:59 |
dansmith | does the grouping per physnet, per rp, etc, etc happen in compute node so we report the proper splitting to scheduler? | 16:00 |
sean-k-mooney | well conceptually | 16:00 |
sean-k-mooney | dansmith: yes it does | 16:00 |
dansmith | sean-k-mooney: I mean the name confusion and the converting everything to mis-named dicts to work on :) | 16:00 |
sean-k-mooney | it's part of the resource summary view we create as a periodic task | 16:00 |
dansmith | okay makes sense | 16:00 |
sean-k-mooney | dansmith: some of this code i think predates when you added nova objects | 16:01 |
sean-k-mooney | so ya the conventions were not as well established | 16:01 |
dansmith | sean-k-mooney: I don't think it predates objects, but I know it's old | 16:02 |
sean-k-mooney | perhaps it was 2012 ish | 16:02 |
sean-k-mooney | not important now i guess | 16:03 |
gibi | another "complication" is that pool filtering happens twice in this code: once from the PCIPassthroughFilter and once from the pci claim on the compute. But it happens a bit differently, as the former only handles counts while the latter handles real device addresses to allocate | 16:03 |
sean-k-mooney | it actully happens 3 times | 16:04 |
gibi | true, twice in the scheduler | 16:04 |
sean-k-mooney | it happens in both the pci passthrough filter and again in the numa topology filter | 16:05 |
sean-k-mooney | the pci filter really just checks that there are enough devices of the correct type | 16:06 |
sean-k-mooney | the numa filter needs to also take into account the numa constraints | 16:06 |
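The count-based scheduler check versus the address-based compute claim could be sketched like this. Hypothetical shapes and names, not nova's code:

```python
def scheduler_passes(pools, product_id, wanted):
    """Count-based: are there enough devices of the requested type?"""
    total = sum(p["count"] for p in pools if p["product_id"] == product_id)
    return total >= wanted

def compute_claim(pools, product_id, wanted):
    """Address-based: pick concrete devices and shrink the pool."""
    claimed = []
    for p in pools:
        while (p["product_id"] == product_id and p["devices"]
               and len(claimed) < wanted):
            claimed.append(p["devices"].pop())
            p["count"] -= 1
    return claimed

pools = [{"product_id": "1515", "count": 2,
          "devices": ["0000:81:00.2", "0000:81:00.3"]}]

print(scheduler_passes(pools, "1515", 1))  # True
print(compute_claim(pools, "1515", 1))     # one concrete address
print(pools[0]["count"])                   # 1 left in the pool
```

The scheduler-side filters only need the first function's answer; only the compute-side claim ever touches real addresses, which is why the two paths can disagree when the pooling assumptions break.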
opendevreview | Dan Smith proposed openstack/nova master: Support "one-time-use" PCI devices https://review.opendev.org/c/openstack/nova/+/943816 | 17:30 |
opendevreview | Dan Smith proposed openstack/nova master: Add one-time-use devices docs and reno https://review.opendev.org/c/openstack/nova/+/944262 | 17:30 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Reproduce bug/2098496 https://review.opendev.org/c/openstack/nova/+/941673 | 17:38 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Ignore metadata tags in pci/stats _find_pool logic https://review.opendev.org/c/openstack/nova/+/944277 | 17:38 |
*** __ministry is now known as Guest11263 | 18:06 | |
atmark | hello, does ceph driver for nova support rbd namespaces? | 19:59 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!