Tuesday, 2021-08-17

opendevreviewYongli He proposed openstack/nova master: Accelerator smartnic SRIOV support  https://review.opendev.org/c/openstack/nova/+/804320 02:26
gibimy broadband has been acting up in the last 24 hours so I might not immediately see pings 07:24
*** rpittau|afk is now known as rpittau07:54
aarentsHi gibi, thanks for the comment on https://review.opendev.org/c/openstack/nova/+/764435, I replied 07:57
lyarwoodgibi: \o hey how was PTO? 09:38
lyarwoodhttps://review.opendev.org/c/openstack/nova/+/804275 (and the various patches either side) could use reviews this week if anyone has time 09:50
lyarwoodoops, that should be https://review.opendev.org/c/openstack/nova/+/804230/ but that API change is also ready 09:51
lyarwoodI was just going to add novaclient support for the microversion alongside 09:51
lyarwoodanother bugfix https://review.opendev.org/c/openstack/nova/+/802317 is also ready for review FWIW09:52
* lyarwood sorts out the novaclient change before doing some master reviews of his own09:52
gibi___aarents: ack, I will get back to that10:22
gibi___lyarwood: o/ PTO was good, thanks. now I have broadband issues :/ but I will get to both of your reviews10:23
gibi___stephenfin: I respun the pps series yesterday and fixed your nits along the way. So if you have time for a quick re-review then that would be appreciated10:26
stephenfingibi___: will do10:29
gibi___stephenfin: thanks10:29
Gowthami__Hi all, hope you are doing fine. The commit https://review.opendev.org/c/openstack/nova/+/764482/ was made for bug https://launchpad.net/bugs/1581977. The tempest test introduced along with that commit has been failing on "IBM PowerKVM CI", which is unable to ping the floating IP. The "guest-instance-1.domain.com" VM created in https://review.opendev.org/c/openstack/tempest/+/795699 (ServersTestFqdnHostnames.test_create_server_with_fqdn_name) is active but couldn't ping the vm from its na 10:32
Gowthami__The tempest run is being executed on a devstack VM; please find the error link: https://oplab9.parqtec.unicamp.br/pub/ppc64el/openstack/nova/82/764482/2/check/tempest-dsvm-full-focal-py3/de4be5a/job-output.txt openstack console log show doesn't show any error: "Trying to load:  from: /pci@800000020000000/scsi@3 ...   Successfully loaded"  May I ask if you could suggest a way forward for this? 10:32
lyarwoodGowthami__: that's from within the guestOS itself10:37
lyarwoodGowthami__: so whatever guest image you're using doesn't seem to boot fully in this example10:37
lyarwoodGowthami__: I would highly doubt that is due to the test and/or fix you have referenced above10:38
lyarwoodGowthami__: and it's likely more of an issue with your test env (lack of resources given to each test instance?) or guest image (a recent update maybe?)10:38
lyarwood2021-08-17 07:31:10.919117 | devstack-focal-newcloud | ++ stackrc:source:691                       :   DEFAULT_IMAGE_NAME=cirros-0.4.0-ppc64le-disk10:40
lyarwood2021-08-17 07:31:10.921287 | devstack-focal-newcloud | ++ stackrc:source:692                       :   DEFAULT_IMAGE_FILE_NAME=cirros-0.4.0-ppc64le-disk.img10:40
lyarwood2021-08-17 07:31:10.923262 | devstack-focal-newcloud | ++ stackrc:source:693                       :   IMAGE_URLS+=http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-ppc64le-disk.img10:40
lyarwoodI'd recommend trying to use 0.5.2 tbh10:41
lyarwood2021-08-17 07:31:10.883720 | devstack-focal-newcloud | ++ stackrc:source:672                       :   CIRROS_VERSION=0.4.010:42
lyarwoodmissed that this job is hardcoded to 0.4.0,10:43
gibi___stephenfin: I have a request in https://review.opendev.org/c/openstack/nova/+/799684/5/nova/tests/unit/db/api/test_migrations.py10:55
stephenfingibi / gibi___: replied11:00
stephenfin(done in a follow-up https://review.opendev.org/c/openstack/nova/+/800484/4/tox.ini)11:00
songwenping__stephenfin: hi, have you already fixed oslo.db for the 8.5.0 version, regarding the changed MySQL conflict message? 11:04
stephenfinsongwenping__: that's probably better asked on #openstack-oslo, but the answer is the patches have been merged but they have not been released yet11:06
songwenping__got it, thanks.11:07
stephenfinsongwenping__: https://review.opendev.org/c/openstack/releases/+/80484411:08
opendevreviewMerged openstack/nova stable/wallaby: virt: Add destroy_secrets kwarg to destroy and cleanup  https://review.opendev.org/c/openstack/nova/+/79625711:10
gibi___stephenfin: thanks11:13
songwenping__stephenfin: thanks, I'll wait for the release patch to merge. 11:14
gibi___stephenfin: I have comments in https://review.opendev.org/c/openstack/nova/+/80007811:29
gibi___stephenfin, lyarwood : I'm done with the alembic series, I'm mostly +2. 11:33
gibi___stephenfin: thanks for working on it11:33
lyarwoodACK I think I still had some left to review in that series, I'll try to finish it today11:37
gibi___lyarwood: yepp, that is why I pinged you, as I saw you were doing active review previously on that series11:38
lyarwoodah I see, thanks11:38
* lyarwood is still catching up after being offline sick11:38
Gowthami__<lyarwood> Thank you . Will try with 0.5.2 and also increase the resources too.11:50
lyarwoodGowthami__: yeah FWIW if you do try 0.5.2 you need to raise the resources anyway https://github.com/cirros-dev/cirros/issues/53 11:51
gibi___aarents: I still have concerns about https://review.opendev.org/c/openstack/nova/+/764435/5/nova/virt/libvirt/driver.py#9970 11:57
sean-k-mooneygibi___: my understanding is we are never meant to attempt to roll back a live migration once we have actually started it in qemu 12:00
sean-k-mooneywe can roll back if we call migrate on libvirt and it immediately returns with an error 12:00
sean-k-mooneybut once it starts we dont roll back unless it times out 12:00
sean-k-mooneybesides timeout, do we have other cases where we roll back after the migration has started today? 12:01
sean-k-mooneyim not sure that aarents' patch will solve the issue they are trying to solve in this case either 12:03
sean-k-mooneybut for different reasons, the migration may continue as they said and the vm can end up on the destination 12:03
sean-k-mooneyso reverting the db state may or may not be the correct thing to do 12:04
sean-k-mooneyfor example im concerned about what happens with post copy 12:04
sean-k-mooneythe instance would still be migrating but running on the dest and we would have already executed part or all of post_live_migration assuming we received the post_copy_resume event before the monitor connection died 12:06
sean-k-mooneywhich should mean the host is already updated. 12:06
aarentsgibi___: Hm, but I think I call live_migration_abort() of the libvirt driver, not the one from the manager, and it only calls the libvirt API, but now I have doubts 12:07
sean-k-mooneyin the libvirt driver it just does https://review.opendev.org/plugins/gitiles/openstack/nova/+/refs/changes/35/764435/5/nova/virt/libvirt/driver.py#9434 12:07
aarentssean-k-mooney: yes thanks for the link12:08
sean-k-mooneyso i think that is fine, it will just call libvirt 12:12
sean-k-mooneyalthough if the monitor connection is down it may not be able to manage the vm 12:12
* gibi_ hates vodafone12:13
aarentssean-k-mooney: yes in that case it will not work12:13
sean-k-mooneywhich is the case you are trying to fix right. in the event the monitor connection drops you want to abort the migration job 12:14
sean-k-mooneyyou can tell libvirt to abort the migration 12:14
sean-k-mooneybut it may or may not be able to comply 12:14
sean-k-mooneyi assume the except Exception: is to catch the libvirt error that is raised when that happens12:15
gibi_aarents: oops sorry I jumped to the wrong driver.live_migration_abort call. 12:16
aarentsthis will work only for network, RPC, or DB issues, not for libvirt issues 12:16
sean-k-mooneyaarents: so for those cases im not sure we want to abort the migration 12:17
sean-k-mooneyaarents: unless you want to abort all other operations when that happens 12:17
aarentssean-k-mooney: or it may work if there is only one flap from libvirt12:17
sean-k-mooneyspawns, deletes, etc. 12:17
sean-k-mooneyits a larger change but to me what feels like a more robust change would be to suspend the green thread if the connection is closed and resume it when we reconnect and only try to send the rpc call then 12:20
sean-k-mooneyrealistically if the rpc bus is down there is nothing on the compute we can do to update the db state 12:20
aarentssean-k-mooney: honestly, the change just ensures we kill the job regardless of whether the state can be updated in the DB; we lost some instances due to that, as explained in the bug 12:24
gibi_hm, so assuming we have the RPC down. the patch aborts the libvirt job. then raises the exception as today. That exception is expected to update the instance and migration states, which will not happen while the RPC is down. the nova compute RPC call to update the DB will time out eventually I guess. 12:24
aarentsgibi_: yes12:24
gibi_so the nova DB will still see the migration as running 12:24
gibi_but the compute already aborted it12:25
gibi_does the conductor time out the migration eventually too?12:25
sean-k-mooneyaarents: have you tested this with post-copy enabled12:25
sean-k-mooneygibi_: i think the timeout happens at the compute level 12:25
sean-k-mooneynot the conductor12:25
gibi_sean-k-mooney: so there is no cleanup triggered be the conductor for this aborted migration12:26
gibi_s/be/by/12:26
sean-k-mooneyim not sure12:26
aarentsgibi_: yes, there will be an inconsistency that needs operator intervention, but the vm will be safe because it is still referenced in the source host & running on the source host 12:26
gibi_aarents: I see. that was the missing piece12:26
sean-k-mooneyaarents: again i dont know if that is always correct12:27
gibi_aarents: so this change does not try to fix a DB inconsistency but tries to save the VM 12:27
sean-k-mooneyyou have ignored my post-copy question12:27
aarentssean-k-mooney: good question12:28
aarentsgibi_: exactly, I was not clear12:28
gibi_sean-k-mooney: so you suggest that the save move would be to abort the non post-copy migrations and let the post-copy migrations run forward12:28
gibi_/save/safe/12:28
sean-k-mooneygibi_: yes12:28
sean-k-mooneybut only after we are in the post copy phase 12:28
gibi_sean-k-mooney: I guess we document that post-copy means no way back12:28
sean-k-mooneywell we can abort until we enter the post copy phase12:29
aarentssean-k-mooney: I don't have much experience with post copy in operation 12:29
sean-k-mooneybut when we hit post copy suspend we call post_live_migration 12:29
sean-k-mooneyand update the host and port bindings 12:29
sean-k-mooneyaarents: we dont have access to the last known state of the instance at this point do we 12:31
sean-k-mooneyfrom a libvirt perspective 12:31
sean-k-mooneyassuming not then i would make the abort conditional on "not postcopy_enabled" 12:31
sean-k-mooneyto be on the safe side12:31
sean-k-mooneygibi_: looking at https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py i dont see anything that looks like cleanup logic once a migration has started12:33
aarentssean-k-mooney: no we don't have access to the instance state12:33
sean-k-mooneywe handle messaging timeouts etc. from check_can_live_migrate_destination and other cases but there seems to be no overall timeout enforced by the conductor 12:34
sean-k-mooneywhich kind of makes sense since the live migration timeout options are virt driver specific 12:34
sean-k-mooneyas is whether we abort or force complete when the timeout expires 12:35
gibi_sean-k-mooney: ack, I got it now that aarents' goal is to save the VM running state instead of avoiding the DB inconsistency in case of RPC/DB issue12:35
gibi_sean-k-mooney, aarents: I'm fine with the patch with addition of the "not postcopy_enabled" condition as sean-k-mooney suggests12:36
sean-k-mooneygibi_: i think i would be ok with it with that added also 12:37
gibi_cool12:37
aarentsgibi_: sean-k-mooney yep this "not postcopy_enabled" condition makes sense 12:37
aarentsI will add that, thanks12:38
lyarwoodhttps://libvirt.org/html/libvirt-libvirt-domain.html#virDomainAbortJob FWIW12:39
lyarwoodIn case the job is a migration in a post-copy mode, virDomainAbortJob will report an error (see virDomainMigrateStartPostCopy for more details).12:39
lyarwoodhttps://libvirt.org/html/libvirt-libvirt-domain.html#virDomainMigrateStartPostCopy has some more context12:40
lyarwoodOn the other hand once the guest is running on the destination host, the migration can no longer be rolled back because none of the hosts has complete state. If this happens, libvirt will leave the domain paused on both hosts with VIR_DOMAIN_PAUSED_POSTCOPY_FAILED reason. It's up to the upper layer to decide what to do in such case. Because of this, libvirt will refuse to cancel post-copy migration via virDomainAbortJob.12:40
lyarwoodso tbh I don't think we need to check anything12:41
lyarwoodoh wait I missed that live_migration_abort is raising the error back, sigh12:42
sean-k-mooneyya although we are catching and ignoring that with a log in aarents' patch 12:45
sean-k-mooneyi guess we could rely on that behavior but i would prefer to have a comment to that effect honestly 12:45
sean-k-mooneyjust to not forget that 12:45
sean-k-mooneyas the next time i see the abort ill get suspicious about post copy again. 12:46
gibi_aarents: are you seeing this ^^ :)12:49
aarentsgibi_: Yes, so I will add a comment that says that abort may not work in case of post copy? 12:50
gibi_aarents: I guess you need to catch the error returned from abort and ignore it12:50
aarentsSo I drop the warning12:52
aarents?12:52
gibi_aarents: sorry, so you are already catching the error from abort, that is OK 12:53
gibi_keep the warning too12:53
gibi_just add a note as sean-k-mooney requested12:53
aarentsAnd is there a consensus about sean-k-mooney's suggestion to change except Exception to except libvirt.libvirtError: ? 12:53
aarentsgibi_: ok12:53
gibi_yepp go with libvirtError12:53
aarentsok cool12:55
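The outcome of the discussion above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the patch under review: `abort_job_best_effort`, `monitor_migration` and `FakeLibvirtError` are hypothetical names, and `FakeLibvirtError` stands in for `libvirt.libvirtError` so the snippet runs without libvirt installed. It catches only the libvirt-style error from the abort call, keeps the warning, and carries the comment that was requested about post-copy.

```python
import logging

LOG = logging.getLogger(__name__)


class FakeLibvirtError(Exception):
    """Hypothetical stand-in for libvirt.libvirtError (keeps the sketch runnable)."""


def abort_job_best_effort(abort_job, instance_uuid, abort_error=FakeLibvirtError):
    """Ask the hypervisor to abort the migration job without ever raising.

    Returns True if the abort call succeeded, False if it was refused.
    """
    try:
        abort_job()
    except abort_error as exc:
        # NOTE: libvirt refuses to cancel a post-copy migration via
        # virDomainAbortJob, so a failure here is expected in that case
        # and no explicit post-copy check is made before calling abort.
        LOG.warning("Failed to abort live migration job for %s: %s",
                    instance_uuid, exc)
        return False
    return True


def monitor_migration(monitor, abort_job, instance_uuid):
    """Run the monitor; if it fails for any reason, abort the job and re-raise."""
    try:
        monitor()
    except Exception:
        # The original error (for example an RPC or DB failure) still
        # propagates; the abort only tries to keep the VM on the source.
        abort_job_best_effort(abort_job, instance_uuid)
        raise
```

The narrow `except` in the abort helper mirrors the agreement to prefer the libvirt error type over a bare `except Exception`, while the outer handler stays broad because the monitoring failure itself can come from the network, RPC or DB.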
opendevreviewStephen Finucane proposed openstack/nova master: docs: Add documentation on database migrations  https://review.opendev.org/c/openstack/nova/+/80007813:05
opendevreviewStephen Finucane proposed openstack/nova master: db: Final cleanups  https://review.opendev.org/c/openstack/nova/+/80048413:05
opendevreviewStephen Finucane proposed openstack/nova master: tests: Enable SADeprecationWarning warnings  https://review.opendev.org/c/openstack/nova/+/80470813:05
opendevreviewStephen Finucane proposed openstack/nova master: WIP tests: Enable SQLAlchemy 2.0 deprecation warnings  https://review.opendev.org/c/openstack/nova/+/80470913:05
stephenfingibi_: ^13:05
opendevreviewLee Yarwood proposed openstack/nova master: api: Introduce microversion 2.89 adding attachment_id to responses  https://review.opendev.org/c/openstack/nova/+/80427513:07
gibistephenfin: ack13:09
gibistephenfin: thanks, now I'm +2 all the way13:10
*** thelounge555 is now known as thelounge5513:35
opendevreviewStephen Finucane proposed openstack/nova master: tests: Enable SQLAlchemy 2.0 deprecation warnings  https://review.opendev.org/c/openstack/nova/+/80470913:53
opendevreviewStephen Finucane proposed openstack/nova master: Replace use of Engine.scalar(), Engine.execute()  https://review.opendev.org/c/openstack/nova/+/80487813:53
opendevreviewElod Illes proposed openstack/nova stable/rocky: [stable-only] Fix lower-constraints job  https://review.opendev.org/c/openstack/nova/+/76991013:55
opendevreviewAlexandre arents proposed openstack/nova master: libvirt: Abort live-migration job when monitoring fails  https://review.opendev.org/c/openstack/nova/+/76443514:00
opendevreviewElod Illes proposed openstack/nova stable/rocky: [stable-only] Fix lower-constraints job  https://review.opendev.org/c/openstack/nova/+/76991014:28
gibimelwitt: hi! I left feedback in https://review.opendev.org/c/openstack/nova/+/713301 the most concerning for me is the dependency on an oslo.limit patch as non client libraries are going to feature freeze this week14:50
gansomelwitt, gibi hi! if you have a few minutes could you please take a look at this 1-liner fix https://review.opendev.org/c/openstack/nova/+/804303 ?  Thanks in advance 15:49
gibiganso: looking15:50
spatelsean-k-mooney hey! i am upgrading minor version of victoria and during upgrade at this step i hit this issue - https://paste.opendev.org/show/808150/15:52
gibiFYI, nova meeting starts in 5 minutes here in the channel 15:54
gibiganso: does the 1 vcpu + multiqueue case work for anything other than vif_type=tap? 15:56
gansogibi that code path is only reached when vif_type=tap. If using openvswitch, it wasn't impacted by the previous patch (that introduced the regression), neither this. I am not sure if the same problem happens with ovs, but since the original problem didn't, I believe this one also doesn't15:57
gibiganso: ok, let me try with ovs15:59
sean-k-mooneyvif_type=tap is not currently used with ovs15:59
sean-k-mooneyit was added temporarily and then removed 16:00
sean-k-mooneyi believe the only thing that uses vif_type=tap today is calico 16:00
gibisean-k-mooney: yepp the bug mentions calico16:00
gibisean-k-mooney: https://bugs.launchpad.net/nova/+bug/193960416:00
melwittgibi: ah, right. it's not a hard dep, so we can untie it. it's nice-to-have since it will cache limit for a repeated try N, N-1, N-2, etc for a multi create16:00
sean-k-mooneygibi: ah ok ill look at it after the meeting16:01
gibimelwitt: OK, it is easier then. lets see if the oslo patch lands before the deadline and if not then just remove the depends-on16:01
gibibut now lets have a meeting16:01
gibi#startmeeting nova16:01
opendevmeetMeeting started Tue Aug 17 16:01:38 2021 UTC and is due to finish in 60 minutes.  The chair is gibi. Information about MeetBot at http://wiki.debian.org/MeetBot.16:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:01
opendevmeetThe meeting name has been set to 'nova'16:01
melwittgibi: ++ thanks16:01
gibisean-k-mooney: thanks16:01
gibi#topic Bugs (stuck/critical) 16:02
gibino critical bug open16:02
gibi#link 15 new untriaged bugs (+4 since the last meeting): #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New16:02
gibiis there any specific bug to discuss today?16:03
gibiI see ganso has one in the open discussion, lets bring that up here16:04
gibi(ganso): bug "Compute node deletes itself if rebooted without DNS": https://bugs.launchpad.net/nova/+bug/193992016:04
gibiwas this a design choice? acceptable solutions discussion16:04
gibiEOM16:04
gansogibi thanks16:04
gansoso, IMO this is a critical bug, and after reading the code and the way it works it kinda feels like a design choice16:05
gansobecause seems like it was intentionally implemented for it to scan for "orphan compute nodes" and delete them, clear the allocations and RP, etc16:05
gibiyes that was intentional16:05
gansobut it is producing this effect which is very undesirable16:05
gibiso in your infra the compute host can change hostname and that causing the issue16:06
gansoas I suggested in the bug, a possible solution I see is to compare the host field in the nova.compute_nodes table 16:06
gansoif it is the same, then we would skip this16:06
gansogibi: it is not that it "can" change the hostname. But it happens due to external reasons16:06
gansolike, lack of connectivity when it boots, a DNS outage, etc16:07
melwittit will recover in that it will create a new compute node etc. the main thing that is "unique" is the hostname, that's what's stored in the instance.host and a whole lot of other places. so changing the name you break all the associations and in reality you essentially have a new/different service and compute node16:07
melwittif the associations were done using UUID it would be a different story. but unfortunately it is what it is and would take a large work to change it IMHO16:08
gansomelwitt: right, so the instance.host captures the entire FQDN, and the FQDN is what is changing, therefore when that changes, running instances are no longer identifiable as running in that node16:08
melwittright16:08
gansomelwitt: so that is another side-effect of that FQDN changing problem, but I am not proposing changing that. I am just proposing to skip this "deletion" step if the compute_nodes.host field does not change16:09
gansoit will avoid part of the issues 16:09
gansomelwitt it will not avoid the issue you described, but 1 issue is better than 2 I think16:10
dansmithI'm missing the distinction I think16:10
gibibut compute_nodes.host comes from the DB, doesn't it? so it won't ever change 16:11
sean-k-mooneyganso: right so nova does not support compute hosts changing hostname today16:11
gansogibi: doesn't it derive from the FQDN it reads from the system?16:11
sean-k-mooneyso if it is changing for external reasons that is not expected to work out of the box16:11
dansmithsean-k-mooney: ++16:12
gansosean-k-mooney: right, but I'm not proposing that it does support, but just stop doing what it is doing today. That thing about orphan compute nodes isn't supposed to address changing hostnames either16:12
gibiganso: when the ComputeNode is created then yes, it is coming from the hostname reported by libvirt, but never changes after16:13
sean-k-mooneyganso: even if we did not clean up the orphan compute nodes the instance.host is used to make rpc calls to the host that the instance is on 16:13
sean-k-mooneyso unless you hardcode the host parameter in the nova.conf so it does not change 16:14
gansodansmith: when FQDN changes from "host.domain" to "host.domain1" or just "host" it causes the compute node to delete itself from the DB, clear allocations, RP, etc, and the new name will not match the instances.host field as melwitt mentioned. Out of all those consequences, I'd suggest skipping the compute node deletion, because this is an error state, to avoid deleting up all allocations and RPs, so the node can more easily go 16:14
gansoback to normal once the FQDN is fixed and the service is restarted16:14
sean-k-mooneythat will still break16:14
sean-k-mooneyganso: the compute service will not do that by default16:14
sean-k-mooneythe compute service will auto register16:14
gansosean-k-mooney: yes, that will still be broken, as it is today, no need to fix that right now16:14
sean-k-mooneybut it wont auto delete 16:14
gansosean-k-mooney: well it does, it thinks there was an orphan and deletes it16:15
sean-k-mooneyganso: what deletes it16:15
sean-k-mooneyi think i missed that16:15
gansosean-k-mooney: https://github.com/openstack/nova/blob/b0099aa8a28a79f46cfc79708dcd95f07c1e685f/nova/compute/manager.py#L999716:15
sean-k-mooneyis this a clustered hypervisor 16:16
gansosean-k-mooney: "host.domain" changes to "host", so it deletes "host.domain" from the compute nodes table and creates a new one, as if the node was brand new16:16
sean-k-mooneye.g.  ironic or hyperv or something like vmware16:16
dansmithganso: because that's a hostname change16:16
gansosean-k-mooney: no, it is just a regular compute node with a libvirt compute service16:16
dansmithganso: arrange for that not to happen, that's the solution, IMHO 16:17
gansodansmith: unfortunately it is beyond control16:17
sean-k-mooneywell for the libvirt driver that is entirely unsupported 16:17
sean-k-mooneythe other way to fix this is to make sure your canonical hostname is not the fqdn 16:17
gansomy proposal is to leave it in an error state to prevent it from deleting allocations and RP16:17
sean-k-mooneye.g. in /etc/hosts set <ip> <short hostname> <fqdn> 16:18
dansmithsean-k-mooney: or /etc/domainname, but yeah, totally fixable, IMHO16:19
sean-k-mooneyim still confused how nodenames is a list in this case 16:19
gansosean-k-mooney: hmm I see, that would override the one currently being provided by the domain provider16:19
sean-k-mooneyor rather how, when the fqdn changes, we are actually getting anything back from the db 16:20
sean-k-mooneyi was expecting it to not match anything16:20
dansmithcan't you set the hostname nova uses in the config anyway? to hard-code it per host so it doesn't change, I thought we had that16:20
sean-k-mooneyunless you have something like host1.<domain1> host1.<domain2> 16:20
gansodansmith: looking in the code now16:20
sean-k-mooneydansmith: you can set the hostname used by the compute service16:21
dansmithwe might not want to hold up the meeting to discuss this to completion16:21
sean-k-mooneynot the hypervisor_hostname 16:21
sean-k-mooneywhich in this case comes from libvirt 16:21
dansmithsean-k-mooney: ah, so that's fixed but the nodename is always from hostname, right okay16:21
gansooh yea, console_host16:21
gansodefault=socket.gethostname()16:21
sean-k-mooneyganso: not console host but ill get the link and send it to you16:21
sean-k-mooneyi think we can move on and come back to this after the meeting16:21
gansosean-k-mooney: thanks, yes. Thanks for the suggestions!16:22
gibilets come back to this16:22
gibithanks sean-k-mooney dansmith melwitt 16:22
sean-k-mooneyganso: https://github.com/openstack/nova/blob/master/nova/conf/netconf.py#L52-L70 16:22
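The two points made in this thread can be sketched as follows. This is a simplified model, not nova's actual cleanup code: `find_orphan_nodes` and `canonical_name` are hypothetical helper names, and the address 192.0.2.10 is a documentation placeholder. The first helper shows why a record stored as "host.domain" looks orphaned once the driver starts reporting "host"; the second shows that the first name after the address in an /etc/hosts entry is the canonical one, which is why the `<ip> <short hostname> <fqdn>` ordering was suggested.

```python
def find_orphan_nodes(db_nodenames, driver_nodenames):
    """Return the DB node names the driver no longer reports, in DB order."""
    reported = set(driver_nodenames)
    return [name for name in db_nodenames if name not in reported]


def canonical_name(hosts_line):
    """Return the canonical hostname of one /etc/hosts entry, or None.

    The format is: <address> <canonical name> [aliases...]; anything
    after '#' is a comment.
    """
    fields = hosts_line.split("#", 1)[0].split()
    return fields[1] if len(fields) >= 2 else None


# The FQDN lookup degraded from "host.domain" to "host" after a reboot,
# so the previously stored node name is no longer reported by the driver.
print(find_orphan_nodes(["host.domain"], ["host"]))    # ['host.domain']

# With the short name listed first, the canonical name does not depend on
# the domain part being resolvable.
print(canonical_name("192.0.2.10 host host.domain"))   # host
```

Pinning the canonical name in /etc/hosts keeps the reported node name stable across DNS outages, so the stored name and the reported name keep matching.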
gibiany other bug that needs attention?16:22
gibi#topic Gate status 16:23
gibiNova gate bugs #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure16:23
gibiI dont see new gate bugs in that list16:23
gibiand also I pushed plenty of patches yesterday without many failures from Zuul16:24
gibiso I think master CI looks good16:24
sean-k-mooney:)16:24
gibiany recent failures?16:24
melwittyeah I think the troubling one is the libvirt/qemu one.. that we're doing non voting on live migration job over16:24
gibiyeah, skipping that helped a lot 16:25
gibiplacement period jobs are green too #link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly16:25
gibianything else about the gate?16:25
gibi#topic Release Planning 16:26
gibiMilestone 3 and therefore Feature Freeze is at 3rd of September which is in 2 weeks.16:26
gibilets land things :)16:26
gibiNon client library freeze is this week. 16:26
gibios-vif: https://review.opendev.org/q/project:openstack/os-vif+status:open+branch:master nothing important seems to be pending16:26
gibios-resource-classes: https://review.opendev.org/q/project:openstack/os-resource-classes+status:open ditto nothing is pending16:27
sean-k-mooneyyes i might try to address https://bugs.launchpad.net/os-vif/+bug/1939542 16:27
gibios-traits: https://review.opendev.org/q/project:openstack/os-traits+status:open there seem pending reviews for traits needed by ongoing features e.g.: COMPUTE_GRAPHICS_MODEL_BOCHS and HW_FIRMWARE_UEFI 16:27
sean-k-mooneybut im fine with backporting it too16:27
gibisean-k-mooney: sure, bugs are easy, as the fix is backportable16:27
gibibut os-traits has some new trait proposals, and if they do not land then the features depending on them are blocked in Xena 16:28
gibiso let's close those this week 16:28
sean-k-mooneyim not sure about HW_FIRMWARE_UEFI16:28
sean-k-mooneybut ill review it16:29
sean-k-mooneytechnically that is stating that the host has uefi boot capability 16:29
gibithe BOCHS trait probably needs kashyap's answer as stephenfin had some feedback on https://review.opendev.org/c/openstack/os-traits/+/794807 16:29
sean-k-mooneyas in the host can boot in uefi mode, not that it can virtualise it 16:29
sean-k-mooneyso i think HW_FIRMWARE_UEFI should be COMPUTE_FIRMWARE_UEFI 16:30
gibiohh, that is a good point16:30
gibistephenfin: ^^ :)16:30
sean-k-mooneythe BOCHS trait looks correct but ill read stephenfin's comments 16:30
gibisean-k-mooney: thanks16:31
gibianything else about the coming lib feature freeze?16:31
*** rpittau is now known as rpittau|afk16:31
gibi#topic PTG Planning 16:32
gibievery info is in the PTG etherpad #link https://etherpad.opendev.org/p/nova-yoga-ptg16:32
gibiIf you see a need for a specific cross project section then please let me know16:32
gibis/section/session/16:33
gibiany question about the PTG?16:34
gibi#topic Stable Branches 16:35
gibistable/queens is blocked (tempest-full-py3 @ "Starting Horizon", probably due to queens-eol of horizon)16:35
gibiall the other branches' gate look OK16:35
gibiEOM from elodilles 16:35
elodillesi've proposed a quick fix for queens gate: https://review.opendev.org/c/openstack/devstack/+/80488916:35
gibielodilles: thanks16:36
gibiany other news from stable-land?16:36
elodillesnothing from me16:36
gibiOK moving on16:37
gibiI'm skipping libvirt subteam as bauzas_away is on PTO16:37
gibi#topic Open discussion 16:37
gibi(melwitt): unified limits series is ready for review (https://blueprints.launchpad.net/nova/+spec/unified-limits-nova) https://review.opendev.org/q/topic:bp/unified-limits-nova16:37
gibiI started on that ^^ and will continue tomorrow16:37
opendevreviewMerged openstack/nova stable/wallaby: libvirt: Do not destroy volume secrets during _hard_reboot  https://review.opendev.org/c/openstack/nova/+/79625816:38
gibibut one more core is needed16:38
gibiwho feels the power?16:38
melwittyeah just wanted to give a quick heads up that this is up-to-date, as some know it was stalled for awhile. it's a "tech preview" status where the legacy quota APIs are read-only and there are no quota migration tools, it is DIY for operators to try out 16:39
sean-k-mooneydansmith: lyarwood  do ye have time to review the unified limits series16:39
melwittI have added some tempest test coverage that Depends-On it that can be looked at to see it working16:39
dansmithdo I? no. should I? yes. Will I? I'll try :)16:39
sean-k-mooney:)16:39
melwitthehe ++16:39
gibi:)16:40
melwittthanks all for listening, we can move on I think16:40
gibiok16:40
gibithere is one more topic on the wiki16:41
gibi(gibi): PTL nomination is open. As I noted in my Xena nomination, I will not run for the 4th time as Nova PTL.16:41
gibiif you have questions about the role as you consider running for it then feel free to ask me16:41
gibinothing else on the agenda 16:43
gibiis there anything else to discuss today?16:43
gibiif not then thanks for joining 16:44
gansogibi: I will do more testing with the hostname config and /etc/hosts later today and I will mark that bug as invalid if successful (probably will be) =)16:45
gibiganso: cool, thanks16:45
gibi#endmeeting 16:45
opendevmeetMeeting ended Tue Aug 17 16:45:20 2021 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:45
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2021/nova.2021-08-17-16.01.html16:45
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2021/nova.2021-08-17-16.01.txt16:45
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2021/nova.2021-08-17-16.01.log.html16:45
sean-k-mooneygibi: ganso so i looked at https://review.opendev.org/c/openstack/nova/+/804303 in parallel16:46
sean-k-mooneyit looks ok to me modulo some nits16:46
gansosean-k-mooney: thank you very much! I will address them this afternoon!16:47
melwittdansmith: tangentially related, I updated the oslo.limit caching patch a couple of weeks ago to address your comments if you wanted to take another look https://review.opendev.org/c/openstack/oslo.limit/+/80281416:52
gibiganso, sean-k-mooney: I also checked and it seems in the non tap case vcpu=1 and multiqueue works today16:52
dansmithmelwitt: ack16:53
sean-k-mooneygibi: ya its weird, i expect it to work and just configure 1 queue 17:04
gansogibi: thanks! so the patch looks good?17:17
gibiganso: yapp17:18
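One possible reading of the fix being reviewed, sketched under assumptions: a multiqueue tap device is only requested when the guest has more than one vCPU, since a single vCPU can only use one queue anyway. The helper name `tap_wants_multiqueue` and the exact condition are hypothetical and chosen for illustration; the actual change is the one in the linked review and may differ.

```python
def tap_wants_multiqueue(vcpus, image_requests_multiqueue):
    """Hypothetical helper: decide whether to create a multiqueue tap device.

    Multiqueue is only useful with more than one vCPU, so a 1-vCPU guest
    falls back to a plain single-queue tap device even if the image asks
    for multiqueue.
    """
    return bool(image_requests_multiqueue) and vcpus > 1


for vcpus in (1, 2, 4):
    print(vcpus, tap_wants_multiqueue(vcpus, image_requests_multiqueue=True))
```

Under this assumption the 1-vCPU case degrades to a single queue instead of failing, which matches the behaviour reported above for the non-tap path.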
melwittjohnthetubaguy[m]: not sure if you would be able to take a quick look, but are you opposed to the idea of putting global limits in keystone as well, instead of setting them in config? https://review.opendev.org/c/openstack/nova/+/712142/14#message-76a84195c59afe78a2a26cbfd8d710bb2ad1016517:53
opendevreviewRodrigo Barbieri proposed openstack/nova master: Fix 1vcpu error with multiqueue and vif_type=tap  https://review.opendev.org/c/openstack/nova/+/80430318:03
gansosean-k-mooney: I had tested nova.instances.vcpus doing resizes and seeing that the value in that variable has the same content as the new flavor. I also think nova.instances.vcpus is more performant where it does not need to join tables to get that value18:17
sean-k-mooneymelwitt: e.g. having global_vcpu_limit or something in keystone and then using that 18:17
sean-k-mooneyganso: its a copy of the flavor value 18:18
sean-k-mooneywe likely should remove it in the future 18:18
sean-k-mooneyand make it a property that just gets it from the flavor 18:18
sean-k-mooneyganso: but the xml generation etc. will never use instance.vcpus 18:19
gansosean-k-mooney: what if the flavor is edited? where will the original value be saved?18:19
sean-k-mooneyganso: in the instance_extra table 18:19
gansosean-k-mooney: oh I see, so it is not directly from the flavors table18:19
sean-k-mooneywe make a copy of the flavor per instance18:19
sean-k-mooneyganso: no its not from the api db 18:19
sean-k-mooneyinstance.flavor.vcpus is coming from the copy of the flavor created when the instance was created 18:20
sean-k-mooneyinstance.vcpus is identical 18:20
sean-k-mooneyganso: if others are ok with it it should work 18:20
sean-k-mooneyganso: i just thought we had deprecated instance.vcpus already 18:21
sean-k-mooneyalong with instance.memory_mb and the other things that are in the flavor 18:21
gansosean-k-mooney: I probably would need to retest a resize to see if instances.get_flavor().vcpus gets the old or the new flavor18:23
sean-k-mooneywell for resize we have separate flavors 18:23
gansosean-k-mooney: I was happy that instances.vcpu was consistent for resizes18:23
sean-k-mooneyganso: i dont know if we have testing that enforces that which is why i was nervous with using it18:24
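The idea floated above, of making the vCPU count a property derived from the per-instance flavor copy rather than a separately stored field, could look roughly like this. It is a toy model with hypothetical `Flavor`, `Instance` and `boot` definitions, not nova's object model; it only illustrates why a per-instance copy is unaffected by later edits to the flavor catalogue while still following a resize.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Flavor:
    name: str
    vcpus: int
    memory_mb: int


@dataclass
class Instance:
    # Per-instance copy of the flavor, taken when the instance is created.
    flavor: Flavor

    @property
    def vcpus(self):
        # Derived from the embedded flavor, so the two values cannot drift.
        return self.flavor.vcpus


def boot(catalog, flavor_name):
    """Create an instance with its own copy of the catalogue flavor."""
    return Instance(flavor=replace(catalog[flavor_name]))


catalog = {"m1.small": Flavor("m1.small", 1, 2048)}
inst = boot(catalog, "m1.small")

# Editing the catalogue afterwards does not change the running instance.
catalog["m1.small"] = Flavor("m1.small", 4, 2048)
print(inst.vcpus)   # 1

# A resize swaps the embedded flavor, and the property follows it.
inst.flavor = Flavor("m1.large", 4, 8192)
print(inst.vcpus)   # 4
```

With a derived property there is no second value to keep consistent across resizes, which is the concern raised about relying on the stored field without tests that enforce it.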
opendevreviewRodrigo Barbieri proposed openstack/nova master: Fix 1vcpu error with multiqueue and vif_type=tap  https://review.opendev.org/c/openstack/nova/+/80430321:01
