Wednesday, 2022-11-09

melwittclarkb: this is the test coverage we have for rescue https://github.com/openstack/tempest/blob/master/tempest/api/compute/servers/test_server_rescue.py and it's enabled in the tempest-integrated-compute job for example https://zuul.opendev.org/t/openstack/build/c35d560c76a24e45959aa609ac372d67/log/controller/logs/tempest_conf.txt#7001:37
opendevreviewAmit Uniyal proposed openstack/nova stable/train: Adds a repoducer for post live migration fail  https://review.opendev.org/c/openstack/nova/+/86380606:23
opendevreviewAmit Uniyal proposed openstack/nova stable/train: [compute] always set instance.host in post_livemigration  https://review.opendev.org/c/openstack/nova/+/86405506:23
opendevreviewAmit Uniyal proposed openstack/nova stable/train: Adds a repoducer for post live migration fail  https://review.opendev.org/c/openstack/nova/+/86380607:48
opendevreviewAmit Uniyal proposed openstack/nova stable/train: [compute] always set instance.host in post_livemigration  https://review.opendev.org/c/openstack/nova/+/86405507:48
opendevreviewNobuhiro MIKI proposed openstack/nova master: libvirt: add maxphysaddr support  https://review.opendev.org/c/openstack/nova/+/86409108:20
samuelkunkel[m]Good morning,... (full message at <https://matrix.org/_matrix/media/r0/download/matrix.org/rADMLssdKgBpMiHEvywbsOpx>)10:06
fricklersamuelkunkel[m]: your message has been truncated by the matrix bridge. I suggest not to use matrix in order to join IRC. if you think that this is still the right solution for you, make sure your messages are not too long10:21
fricklerin particular avoiding to send multiline messages may be helpful10:22
samuelkunkel[m]ah sure, sorry. I can try to make it single line. Links still should work? gonna look for a different client...10:23
samuelkunkel[m]we are currently facing an issue in yoga with libvirt 8.0 for reporting mdev devices10:23
samuelkunkel[m]in particular https://review.opendev.org/c/openstack/nova/+/83897610:23
samuelkunkel[m]is this still being worked on?10:24
samuelkunkel[m](hope it is readable now)10:24
fricklerseems bauzas was the last one working on it10:25
samuelkunkel[m]currently I will use the quick fix provided https://review.opendev.org/c/openstack/nova/+/83897610:25
bauzasfrickler: yup, I need to update my change10:26
bauzasit's a priority I have10:26
samuelkunkel[m]that sounds nice, if you need somebody to test - feel free to  reach out to me, have some nodes with mdevs to play on10:38
ygk_12345HI all11:56
sean-k-mooneysamuelkunkel[m]: we not only plan to fix that but also backport the fix to wallaby, as we require it for our downstream product that far back, and there is no point doing it downstream only since the fix is backportable12:04
sean-k-mooneyso given you're on yoga that should hopefully also address your use case12:05
samuelkunkel[m]yes, that sounds great12:05
samuelkunkel[m]I assume there is currently no estimation possible on a timeframe?12:05
sean-k-mooneywell the patch that's proposed actually works, we just need a few comments addressed12:06
auniyalHi sean-k-mooney 12:06
sean-k-mooneydownstream we have a deadline of mid december to address this so i am strongly hoping that we can address this upstream before then so our product team does not start asking me about it12:07
auniyalhow can we run tox functional locally in train branch12:07
auniyaltox -e functional fails12:07
sean-k-mooneyuse the python3 version12:07
sean-k-mooneyor a vm/container based on ubuntu 18.04?12:07
samuelkunkel[m]I can second that, it also works on my yoga setup on a non productive cluster. Thanks for the clarification. Until the fix is backported I just use the patch12:07
samuelkunkel[m]thanks for all the information12:07
sean-k-mooneyauniyal: so on train you can use tox -e functional-py36 or tox -e functional-py3712:09
sean-k-mooneyauniyal: i would either use ubuntu 18.04/ubuntu-bionic or centos 8 stream to run the tests12:10
sean-k-mooneywe use 18.04 in the ci https://github.com/openstack/nova/blob/stable/train/.zuul.yaml#L72-L11912:11
auniyalgot same error, I think it needs some package/module 12:11
auniyalhttps://paste.opendev.org/show/bsE4F25vNPl8BaGh7I6I/12:11
sean-k-mooneyyou are trying to use 3.812:11
auniyaloh in here - /usr/lib/python3.8/runpy.py12:12
sean-k-mooneydo you have 3.6 available12:12
auniyalno right now 3.612:12
auniyal3.812:12
sean-k-mooneyya 3.8 was not released/supported by train12:13
auniyalif I create venv of 3.6 and install test-requirements.txt in it12:13
auniyalwill it work12:13
sean-k-mooneyso if you want to run these you need to use an operating system that was supported, hence why i said centos 8 stream or ubuntu 18.0412:13
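A small sketch of the interpreter check discussed above, assuming only what was said in the channel: stable/train functional tests run under `functional-py36` or `functional-py37`. The helper name `pick_train_tox_env` is hypothetical; it just looks for a matching interpreter on PATH and does not run tox itself.

```python
import shutil

# tox environments mentioned above for stable/train, keyed by interpreter name.
TRAIN_FUNCTIONAL_ENVS = {
    "python3.6": "functional-py36",
    "python3.7": "functional-py37",
}


def pick_train_tox_env(which=shutil.which):
    """Return the first tox env whose interpreter is on PATH, else None.

    `which` is injectable so the lookup can be exercised on a machine
    that does not have those interpreters installed.
    """
    for interpreter, env in TRAIN_FUNCTIONAL_ENVS.items():
        if which(interpreter):
            return env
    return None


if __name__ == "__main__":
    env = pick_train_tox_env()
    if env:
        print(f"run: tox -e {env}")
    else:
        print("no python3.6/3.7 found; use an ubuntu 18.04 or centos 8 stream vm/container")
```

On a host that only ships python 3.8 (as in the paste above) this prints the fallback hint rather than an env name.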
auniyalack, will go with ubuntu 18, 12:14
auniyalthanks Sean12:14
sean-k-mooneyif your host os is too new things like sqlite might have issues 12:14
sean-k-mooneybasically where we have python modules that wrap c libs12:15
sean-k-mooneyif your host os lib is too new then the old python binding might not work12:15
sean-k-mooneyso if you're currently using say the latest fedora you are likely to have issues with old releases like train12:15
sean-k-mooneyi generally use vms or containers to work around that if i hit that12:16
auniyalyeah I am using vm , devstack on ubuntu  2012:16
auniyalfor these tests, will go with ubuntu 1812:17
sean-k-mooneyack i used to keep a few vms around for backporting 12:17
sean-k-mooneyi do that less now just because it's rare that i need older than 3.812:18
auniyalack12:19
sean-k-mooneyi think we added 3.8 in ussuri so train is really the only release that does not support 3.8 officially now12:19
sean-k-mooneyon it was victoria12:20
auniyalfor ussuri I was also dependent on zuul, but as there were fewer conflicts it needed fewer tests12:21
sean-k-mooneyfrickler: by the way i have been using matrix on and off via the element client pretty seamlessly for irc12:23
sean-k-mooneyfrickler: i still use weechat as my main irc client12:24
sean-k-mooneybut if im not at my work laptop i sometimes use the element client from my personal laptop or ipad to chat via the matrix.org bridge12:24
sean-k-mooneyso ya if you keep messages relatively short (3-4 lines)  it works fine, i havent hit the length limit personally12:25
sean-k-mooneyof the irc alternatives i have used matrix is really the only one i tolerate12:26
sean-k-mooneyif the element desktop client ever gets the ability to sign into two matrix accounts at once it might even be something i would consider as a replacement for weechat12:27
fricklersean-k-mooney: there have also been issues where the bridge disconnects but you do not notice on the matrix side, so my personal suggestion is still to not use this, ymmv12:27
sean-k-mooneyya i have not had that issue but i still use irc as my primary interface and matrix as what i use when im traveling or not working from my normal location12:29
sean-k-mooneyso i probably would not notice if there were temporary issues12:30
admin1i have a vm which is always in a  pause state in the hypervisor ..    trying to unpause using virsh gives error: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreateWithFlags) .. the vm is backed by volume on ceph, but ceph is fine and there are no locks 12:33
admin1what can i do to check/troubleshoot this issue12:33
admin1i rebooted the hypervisor as well, no luck 12:33
sean-k-mooneythis might be a lock created by qemu12:34
sean-k-mooneyhave you tried stopping the vm and starting it12:35
sean-k-mooneye.g. via a hard reboot12:35
admin1when i do a virsh destroy,  it disappears from virsh list --all 12:35
admin1when i start again (horizon/cli) appears back 12:36
admin1with a paused state12:36
sean-k-mooneyack12:36
sean-k-mooneydid you check the qemu instance log for any errors12:37
sean-k-mooneythis does not sound like a nova issue by the way12:37
sean-k-mooneythis sounds like an issue at the qemu/libvirt level and/or perhaps the ceph interaction12:37
sean-k-mooneyyou dont happen to have a kvm error in the instance log do you?12:38
sean-k-mooneywe hit an issue with ubuntu 22.04 where libvirt incorrectly detected the cpu model 12:38
sean-k-mooneyit enabled amd cpu flags in the domain on an intel host12:39
sean-k-mooneythat left the vm in a paused state12:39
sean-k-mooneyalthough that would not explain the lock message, but i would check the qemu instance log in any case12:39
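A rough sketch of the "check the qemu instance log" step, under stated assumptions: the log directory `/var/log/libvirt/qemu` is the usual libvirt default but may differ per distro or deployment, and the patterns below are loose illustrations, not an exhaustive list of qemu/kvm failure messages. The function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed default location of per-domain qemu logs; adjust for your deployment.
QEMU_LOG_DIR = Path("/var/log/libvirt/qemu")

# Illustrative patterns only. \berror\b avoids matching the werror=/rerror=
# options that appear in the qemu command line recorded in the log.
SUSPICIOUS = re.compile(
    r"\berror\b|doesn't support requested feature|terminating on signal",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return the lines that match any of the illustrative patterns."""
    return [line.rstrip("\n") for line in lines if SUSPICIOUS.search(line)]


def scan_instance_log(domain_name, log_dir=QEMU_LOG_DIR):
    """Scan <log_dir>/<domain_name>.log, e.g. the name shown by virsh list --all."""
    path = Path(log_dir) / f"{domain_name}.log"
    with path.open(errors="replace") as handle:
        return suspicious_lines(handle)
```

Reading these logs normally needs root on the hypervisor; the matches are only a starting point for reading the surrounding lines by hand.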
admin1sean-k-mooney thanks . i know what to check for now 12:40
admin1where does qemu/libvirt read the ceph connection details like mon addresses ? 12:40
admin1from /etc/ceph/ceph.conf ? 12:40
admin1or is it internally somewhere else 12:41
sean-k-mooneywe get them from the cinder attachment connection info and then store them in our db and pass it to libvirt12:43
sean-k-mooneyso no not from the ceph.conf12:43
sean-k-mooneyin recent releases of openstack (xena+) we have a nova-manage command to refresh the attachment info 12:45
sean-k-mooneyhttps://docs.openstack.org/nova/latest/cli/nova-manage.html#volume-attachment-refresh12:45
admin1this one is not xena yet12:45
admin1i want to remove 2x mons and use only 1 remaining mon 12:45
admin1how do I update/edit this db ? 12:45
sean-k-mooneywith great pain and care12:46
sean-k-mooneyso we added this command to nova-manage because this is stored in a json blob in the db12:46
sean-k-mooneywhile it can be modified it's a pain to do12:46
sean-k-mooneyadmin1: one option would be to grab a xena container or create a xena virtual env and just run nova-manage12:47
sean-k-mooneyi believe this is implemented such that if you have the new version of nova-manage and point it to an old cloud it can work but im not 100% certain of that12:47
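To make the "json blob" point concrete, here is a read-only sketch that pulls monitor addresses out of a connection info blob. The key names (`driver_volume_type`, `data`, `hosts`, `ports`) are my assumption about how rbd connection info is laid out and should be checked against a real row; the blob below is made up, and nothing here writes to the database.

```python
import json


def rbd_monitors(connection_info_json):
    """Return 'host:port' strings from an rbd connection info blob, else []."""
    info = json.loads(connection_info_json)
    if info.get("driver_volume_type") != "rbd":
        return []
    data = info.get("data", {})
    hosts = data.get("hosts", [])
    ports = data.get("ports", [])
    return [f"{host}:{port}" for host, port in zip(hosts, ports)]


# Made-up example blob; real entries carry more keys and real addresses.
EXAMPLE = json.dumps({
    "driver_volume_type": "rbd",
    "data": {
        "name": "volumes/volume-example",
        "hosts": ["192.0.2.11", "192.0.2.12", "192.0.2.13"],
        "ports": ["6789", "6789", "6789"],
    },
})

if __name__ == "__main__":
    print(rbd_monitors(EXAMPLE))
```

Because the monitor list is embedded per attachment like this, changing it means rewriting every affected blob consistently, which is why a supported refresh command is preferable to hand edits.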
admin1you mean have binaries of xena but connect to existing db to manage/manipulate the entries ? 12:48
sean-k-mooneyya12:48
sean-k-mooneyso bauzas gibi correct me if im wrong but we have had customers do that right^12:48
sean-k-mooneyuse the updated container with this command to repair old dbs when connection info is out of date12:48
sean-k-mooneyadmin1: i think we have a downstream backport of this by the way to some release which is why im not 100% sure how we used this downstream with train12:49
sean-k-mooneyadmin1: ya so we have it backported downstream to train in our 16.2 product12:51
sean-k-mooneyand i think we have had customers use the 16.2 container to fix this on queens/osp 1312:52
admin1i am on osa tag 23.1.2 12:52
admin1wallaby12:53
sean-k-mooneywe  cannot backport db/object/rpc changes even downstream so the fact it works on train implies this is very self contained, meaning you should be able to use it with wallaby12:54
admin1: invalid choice: 'volume_attachment' on this 12:54
admin1i have to boot a new container, point to the existing one and try from there12:54
sean-k-mooneyyep12:55
opendevreviewAmit Uniyal proposed openstack/nova stable/train: Adds a repoducer for post live migration fail  https://review.opendev.org/c/openstack/nova/+/86380613:46
opendevreviewAmit Uniyal proposed openstack/nova stable/train: [compute] always set instance.host in post_livemigration  https://review.opendev.org/c/openstack/nova/+/86405513:46
dvo-plvHello, everyone, Could you please review our comments on the following blueprint: https://review.opendev.org/c/openstack/nova-specs/+/85929013:55
*** slaweq_ is now known as slaweq14:09
*** dasm|off is now known as dasm14:10
admin1sean-k-mooney,is this a libvirt-secrets-gone thing or  a ceph thing ? https://gist.githubusercontent.com/a1git/67cc7dab45f9bff536296670ab6ce65d/raw/450a2e31125e01826a90b1f133bc9b4821f807e8/gistfile1.txt 14:19
admin1my mons were deleted completely .. i recreated those from osds 14:20
admin1and cinder client  was added with the same keys 14:20
admin1most vms started, a few come with this error14:20
sean-k-mooneydid the mon ips change14:22
sean-k-mooneypresumably the secret is the same14:23
sean-k-mooneybut ya it could be that either the secret is missing or the user auth info changed14:23
sean-k-mooneythe secret has the ceph keyring inside14:23
sean-k-mooneyi dont have a ceph deployment to check but i believe that is tied to a specific pool/user uuid 14:24
sean-k-mooneyim not really sure how you recover on the ceph side from all mons going away14:24
sean-k-mooneybut if any of the uuids changed then you might need to get new keyrings and update the secret14:25
sean-k-mooneychanging the mon ips is not supported in an openstack env since it requires db surgery to fix14:26
admin1the mons were gone totally, but the ips did not change 14:26
admin1all 3 mons are back in quorum 14:26
admin1ceph is healthy and  most of the vms started OK14:26
sean-k-mooneyok the ips, cluster uuid and secret are the main things14:26
admin1there are 2 i know of that show this behaviour 14:26
admin1there is no lock 14:26
sean-k-mooneyso if the ips are the same no need to update the nova db unless the cluster id changed14:27
admin1cluster name /fsid all is same14:27
sean-k-mooneyya fsid was what i meant14:27
admin1fsid is the same14:27
sean-k-mooneyso if that's the same then, provided the keyring in the secret is still valid, you are probably ok14:27
sean-k-mooneyhave you tried using that to list the volumes on the pool14:28
admin1got it 14:32
clarkbmelwitt: thanks. It does look like bfv is tested, but it does also appear that the image used for rescuing is modified to set its bus and device types? I wonder if that is what we are missing here. The rescue command itself doesn't appear to take those arguments so these would need to be specified before hand on a special image? I guess that lends more weight to having15:50
clarkbdedicated images as a required part of the rescue process?15:50
melwittclarkb: can be image properties or expressed as extra specs in the flavor. I can test out a change that would add a flavor create setting bus and device types in the test16:07
clarkbmelwitt: either way it is something that the cloud or cloud user would need to be aware of. Currently the default rescue behavior is to reuse the same image the rescued node booted off of. This is problematic because of the label specifier collisions, but also because if the image itself is broken you'd still be broken in a rescue. That leads users to using another image, but16:09
clarkbthere isn't any clear indication to me as a user that I need to use a special image.16:09
opendevreviewDan Smith proposed openstack/nova master: Test ceph-multistore with a real image  https://review.opendev.org/c/openstack/nova/+/86086416:10
clarkbI suspect the solution here is to make it clear to cloud operators that rescue has requirements x y z (I don't know what they all are yet) and that they should provide an image that meets those requirements16:11
melwittclarkb: hm yeah you are probably right it's only image properties, this section doesn't mention using a flavor to do it https://docs.openstack.org/nova/latest/user/rescue.html#stable-device-instance-rescue16:11
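As a sketch of what "image properties" means in practice here, the helper below builds the `openstack image set` command that tags a rescue image with a bus and device type. The property names `hw_rescue_device` and `hw_rescue_bus` are the ones I understand the linked stable device rescue section to describe; the helper name, the image name and the example values are hypothetical, and the command is only printed, not executed.

```python
import shlex


def rescue_image_set_command(image, device=None, bus=None):
    """Build an `openstack image set` argv that sets rescue properties."""
    if device is None and bus is None:
        raise ValueError("set at least one of device or bus")
    cmd = ["openstack", "image", "set"]
    if device is not None:
        cmd += ["--property", f"hw_rescue_device={device}"]
    if bus is not None:
        cmd += ["--property", f"hw_rescue_bus={bus}"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # "rescue-image", "disk" and "virtio" are placeholder values.
    print(shlex.join(rescue_image_set_command("rescue-image", device="disk", bus="virtio")))
```

Check the accepted values for each property against the rescue documentation for your release before applying them to a real image.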
clarkbalso I wonder if nova should drop the default behavior of reusing the running image and instead force people to explicitly provide one16:12
clarkbI suspect there are scenarios where reusing the image would work, but in the vast majority it seems unlikely16:13
clarkband that would help provide signal that something different is required here16:13
melwittI think we could do that in a new API microversion to avoid breaking anyone who is using it the old way and succeeding ... but the fact that openstackclient defaults to lowest microversion makes it more difficult to signal imho16:16
clarkbya and I think users could manually specify the same image if they really did need/want that16:18
clarkbit just wouldn't be provided as a default (which I think users expect to work)16:18
melwittyeah, I think that makes sense16:19
opendevreviewMerged openstack/nova stable/yoga: [compute] always set instance.host in post_livemigration  https://review.opendev.org/c/openstack/nova/+/86187217:19
opendevreviewAmit Uniyal proposed openstack/nova stable/ussuri: add regression test case for bug 1978983  https://review.opendev.org/c/openstack/nova/+/86260317:31
opendevreviewAmit Uniyal proposed openstack/nova stable/ussuri: For evacuation, ignore if task_state is not None  https://review.opendev.org/c/openstack/nova/+/86260417:31
opendevreviewmelanie witt proposed openstack/nova-specs master: Re-propose spec for ephemeral storage encryption  https://review.opendev.org/c/openstack/nova-specs/+/86413818:45
darkhorseHi team19:29
darkhorse class NovaSession():19:29
darkhorse    def __init__(self):19:29
darkhorse        self.auth = v3.Password(KEYSTONE_URL, username=OPENSTACK_ADMIN, password=OPENSTACK_ADMIN_PASS, project_name=ADMIN_PROJECT, user_domain_id=DOMAIN_ID,19:29
darkhorse                   project_domain_id=PROJECT_DOMAIN_ID)19:29
darkhorse        self.sess = session.Session(self.auth)19:29
darkhorse        self.nova2 = nova_client.Client(2, session=self.sess)19:29
darkhorseI use this code to create a nova session. I want to use the internal IP address since my app runs on the controller. I set KEYSTONE_URL to the internal keystone endpoint address but the client still sends requests to the nova public ip address.19:31
darkhorseIs there a setting that I can tell the client to use internal address?19:31
opendevreviewDan Smith proposed openstack/nova master: Test ceph-multistore with a real image  https://review.opendev.org/c/openstack/nova/+/86086419:34
melwittdarkhorse: you might try the 'interface' kwarg to Client (interface=$the_name_of_your_internal_endpoint_in_the_service_catalog) which will get passed to the keystone adapter https://docs.openstack.org/keystoneauth/latest/api/keystoneauth1.adapter.html19:57
melwittthere's also endpoint_override to provide the full url but the interface discovery is nicer I think if it works19:58
darkhorsemelwitt: thank you i ended up using nova_client.Client(2, session=self.sess, endpoint_type='internal')20:02
darkhorseinterface kwarg seems to be deprecated.20:02
melwittdarkhorse: ok cool. it's the other way around I think, endpoint_type is an old name now used as an alias20:03
darkhorseok thank you.20:04
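For reference, a sketch of the call darkhorse reported working, wrapped so the endpoint choice is explicit. It uses `endpoint_type` only because that is what was reported to work above; which of `endpoint_type` and `interface` is the preferred name is exactly what the two messages above disagree on, so treat it as an assumption for your novaclient version. The function names and the `"default"` domain ids are hypothetical.

```python
def nova_client_kwargs(sess, interface="internal"):
    """Keyword arguments for novaclient's Client, selecting the catalog interface."""
    return {"session": sess, "endpoint_type": interface}


def make_nova_client(auth_url, username, password, project_name,
                     user_domain_id="default", project_domain_id="default",
                     interface="internal"):
    """Create a nova client that resolves nova's endpoint via `interface`."""
    # Imported lazily so nova_client_kwargs works without the libraries installed.
    from keystoneauth1 import session
    from keystoneauth1.identity import v3
    from novaclient import client as nova_client

    auth = v3.Password(auth_url, username=username, password=password,
                       project_name=project_name,
                       user_domain_id=user_domain_id,
                       project_domain_id=project_domain_id)
    sess = session.Session(auth=auth)
    return nova_client.Client(2, **nova_client_kwargs(sess, interface))
```

Pointing `auth_url` at the internal keystone endpoint only affects authentication; the nova endpoint is looked up separately in the service catalog, which is why the interface has to be selected on the client as well.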
opendevreviewmelanie witt proposed openstack/nova-specs master: Re-propose spec for ephemeral encryption for libvirt  https://review.opendev.org/c/openstack/nova-specs/+/86414722:30
*** dasm is now known as dasm|offp23:03
*** dasm|offp is now known as dasm|off23:03
opendevreviewmelanie witt proposed openstack/nova-specs master: Re-propose per process healthchecks  https://review.opendev.org/c/openstack/nova-specs/+/86415023:44
