Monday, 2021-09-06

opendevreviewJorhson Deng proposed openstack/nova master: recheck the attachment_id after the reschedule successful  https://review.opendev.org/c/openstack/nova/+/79620902:41
opendevreviewmelanie witt proposed openstack/placement master: Narrow scope of set allocations database transaction  https://review.opendev.org/c/openstack/placement/+/80701405:57
opendevreviewmelanie witt proposed openstack/placement master: Add reproducer for Allocation/Inventory update race bug  https://review.opendev.org/c/openstack/placement/+/80749305:58
gibigood morning06:49
gibithe nova-next job constantly failing on master since Friday https://zuul.opendev.org/t/openstack/builds?job_name=nova-next&project=openstack%2Fnova&branch=master06:50
gibie.g. https://zuul.opendev.org/t/openstack/build/f5b881e3601f4160a5f82a9b4cdc10ad/log/job-output.txt#62198-6220006:50
gibireported a bug https://bugs.launchpad.net/nova/+bug/194274006:58
fricklergibi: looks placement-y once again. maybe https://review.opendev.org/c/openstack/releases/+/80658006:58
gibifrickler: yes06:58
gibiyes placement-y06:58
gibiI will look into it shortly06:58
gibicould be the new osc-placement release, yes06:58
lyarwoodgibi: Morning, I'm going to be afk this morning looking after my sick kid once again, I'll try to help with the nova-next stuff this afternoon \o07:14
gibilyarwood: ack, hope the kid will be better soon07:15
bauzasmorning folks07:39
bauzasgibi: ack, will look up, I need to write the prelude section :D07:39
gibibauzas: morning, I'm debugging the nova-next job now, so you are free to work with the prelude :)07:40
bauzas:D07:44
bauzas(good opportunity for looking at what we eventually merged during this cycle fwiw)07:44
bauzas(in case people want to work on a prelude change for Yoga :p )07:45
gibifrickler: indeed the new osc-placement release broke the test; the bug is in https://review.opendev.org/c/openstack/osc-placement/+/80445807:57
gibifrickler: I will propose a fix for osc-placement. Can we release a new osc-placement lib for Xena?07:58
gibielodilles: ^^ ?07:58
fricklergibi: I looked at that patch, but assumed it would only change behavior when actively requesting that version07:59
fricklergibi: releasing bug-fixes should always be possible, too07:59
gibifrickler: it moved some code around outside of the version guard 08:00
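(The failing operation is roughly the following; the consumer UUID is a placeholder, and "errored out" is inferred from the fix title rather than confirmed in the log:)

    # 'allocation show' against a consumer that has no allocations; with the
    # broken osc-placement release this presumably errored out instead of
    # printing an empty allocation list
    openstack resource provider allocation show 11111111-2222-3333-4444-555555555555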
opendevreviewJorhson Deng proposed openstack/nova master: recheck the attachment_id after the reschedule successful  https://review.opendev.org/c/openstack/nova/+/79620908:17
opendevreviewPierre-Samuel Le Stang proposed openstack/nova master: Fix instance's image_ref lost on failed unshelving  https://review.opendev.org/c/openstack/nova/+/80755108:52
opendevreviewBalazs Gibizer proposed openstack/osc-placement master: Repro allocation show bug with empty allocation  https://review.opendev.org/c/openstack/osc-placement/+/80755309:12
gibibauzas: ^^09:12
gibisorry, I was too fast, it will fail ...09:13
opendevreviewBalazs Gibizer proposed openstack/osc-placement master: Repro allocation show bug with empty allocation  https://review.opendev.org/c/openstack/osc-placement/+/80755309:22
opendevreviewPierre-Samuel Le Stang proposed openstack/nova master: Fix instance's image_ref lost on failed unshelving  https://review.opendev.org/c/openstack/nova/+/80755109:27
opendevreviewPierre-Samuel Le Stang proposed openstack/nova master: Fix instance's image_ref lost on failed unshelving  https://review.opendev.org/c/openstack/nova/+/80755509:27
opendevreviewBalazs Gibizer proposed openstack/osc-placement master: Repro allocation show bug with empty allocation  https://review.opendev.org/c/openstack/osc-placement/+/80755309:39
opendevreviewPierre-Samuel Le Stang proposed openstack/nova master: Fix instance's image_ref lost on failed unshelving  https://review.opendev.org/c/openstack/nova/+/80755109:48
opendevreviewBalazs Gibizer proposed openstack/osc-placement master: Fix allocation show / unset on empty allocation  https://review.opendev.org/c/openstack/osc-placement/+/80755609:51
gibibauzas: please prioritize ^^09:53
opendevreviewBalazs Gibizer proposed openstack/nova master: DNM: check nova-next with osc-placement fix  https://review.opendev.org/c/openstack/nova/+/80755809:57
*** bhagyashris_ is now known as bhagyashris10:00
gibibauzas: fyi, I've created the yoga series in launchpad https://launchpad.net/nova/yoga10:49
gibibauzas: I will set it to active once Xena RC1 is cut10:50
sean-k-mooneyhehe i probably should do that for os-vif although i generally only do it when we need to backport something and the series does not exist10:59
sean-k-mooneythe last one i created was victoria...10:59
gibisean-k-mooney: I use the series and the milestones to track bps 10:59
sean-k-mooneyya for os-vif we just have bugs for RFEs so it tends to be less important11:00
sean-k-mooneyill quickly add wallaby and yoga11:00
gibiack11:02
viks__hi, i'm doing some disk I/O testing in my openstack setup... In my test, i see that write speed in the VM is much lower compared to that of the host.. So is there any way i can increase the VM disk I/O? is there any comparison or something which says by what percentage disk write speed degrades in a VM compared to the host? Can someone plz give some direction?11:02
sean-k-mooneyok done 11:02
opendevreviewBalazs Gibizer proposed openstack/nova master: [doc] port-resource-request-groups not landed in Xena  https://review.opendev.org/c/openstack/nova/+/80756411:07
sean-k-mooneyviks__: it should be close to identical at least in the raw backend11:41
sean-k-mooneyfor qcow it should also be close but we expect some overhead while the file is growing. we don't really maintain the lvm image_backend actively but it used to give slightly better write performance than qcow, though i think raw was on par11:42
sean-k-mooneyyou may want to explore using virtio-scsi instead of virtio-blk, which is our default11:42
sean-k-mooneyviks__: we do not support multiple io threads or multiqueue currently, so if you have multiple disks the performance will not scale per disk11:43
sean-k-mooneybut for a vm with a single root disk and preallocated storage the performance should be close (within single digit percent) to that of the host11:44
viks__sean-k-mooney:  does `virtio-scsi` apply to a vm with a single root disk/preallocated storage, or only to attached cinder volumes?11:47
sean-k-mooneywhen you enable virtio-scsi via hw_disk_bus it is used for all storage11:48
sean-k-mooneymy suggestion to try it is because it supports a slightly different feature set, like trim11:48
sean-k-mooneyso depending on whether you're using ssds and on your workload, sometimes virtio-scsi performs better even for a vm with 1 root disk and no cinder volumes11:49
sean-k-mooneyvirtio-scsi is our recommendation whenever using ceph11:49
lyarwoodthat reminds me I need to read up on the virtio-blk trim support thread11:49
sean-k-mooneybut for local storage we generally suggest staying with virtio-blk unless you have measured a performance increase with virtio-scsi11:50
sean-k-mooneyvirtio-scsi is generally better when you use a large number of cinder volumes as it does not consume a pci device slot per volume11:51
sean-k-mooneyso it scales better in that regard but again all io is handled by the qemu emulator thread since we do not use io threads11:51
sean-k-mooneyso that will be the bottleneck regardless of whether you use virtio-blk or virtio-scsi11:51
viks__sean-k-mooney:  ok thanks for the inputs.. i'll explore and test `virtio-scsi` also and see if i have some findings..11:55
opendevreviewMerged openstack/os-vif stable/ussuri: Refactor code of linux_net to more cleaner and increase performace  https://review.opendev.org/c/openstack/os-vif/+/76541912:49
viks__sean-k-mooney:  here is what i have tested... https://paste.openstack.org/show/808597/  These are from my default setup.. i see a huge difference... but as per you, it should not differ too much even in the default setup.. am i doing something wrong or missing some configuration?12:50
sean-k-mooneycorrect, it should not differ much. so some questions: what virt driver are you using (libvirt?) and what images_backend have you configured (qcow? raw? flat? lvm?)12:51
sean-k-mooneyhave you set your disk cache mode or is it at the default, and have you enabled image preallocation or not12:52
opendevreviewMerged openstack/os-vif stable/ussuri: Fix - os-vif fails to get the correct UpLink Representor  https://review.opendev.org/c/openstack/os-vif/+/76596712:52
sean-k-mooneyviks__: unless sysbench preallocates the file and then writes to it, the large delta there may just be from growing the root disk12:54
sean-k-mooneyviks__: so you might need to rerun that12:54
sean-k-mooneyfor fio i know it preallocates and there we are seeing about a 2x delta12:54
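(A minimal fio invocation along these lines isolates raw write speed from qcow2 file growth; the filename and all parameters here are illustrative:)

    # sequential 4k direct writes to a preallocated 1G file, so the numbers
    # are not skewed by the qcow2 image growing underneath
    fio --name=writetest --filename=/mnt/test/fio.dat --size=1G \
        --rw=write --bs=4k --direct=1 --ioengine=libaio --fallocate=native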
bauzasgibi: ok, so you need to create a new series in launchpad?12:54
bauzasI didn't know about it12:54
bauzasgibi: could you please add this for https://docs.openstack.org/nova/latest/contributor/ptl-guide.html ?12:55
bauzasah, nevermind12:56
bauzashttps://docs.openstack.org/nova/latest/contributor/ptl-guide.html#immediately-after-rc12:56
viks__sean-k-mooney: when i rerun sysbench i still get the below, which is way below the numbers on the host:12:58
viks__https://www.irccloud.com/pastebin/sf7VLtQi/12:58
sean-k-mooneyviks__: ack but without answers to my other questions i can't really help12:59
sean-k-mooneyviks__: the default parameters that are used will depend on your deployment tool, so that is why i am asking what your images_backend is set to and your cache modes, and also whether you have preallocated the image or not13:00
viks__sean-k-mooney:  oh sorry... it's libvirt/qcow,  `preallocate_images = none,  disk_cachemodes = file=writeback,block=writeback`13:02
sean-k-mooneyok so preallocate_images=none means there will be some degradation while the qcow is expanded, but that does not account for the fio delta since fio preallocates the files in the guest to prevent that from being an issue13:05
sean-k-mooneywriteback is generally quite good for performance. you could try setting the cache mode to none; if your host supports O_DIRECT that might improve performance13:06
sean-k-mooneyif you leave the cachemode unset we will default to none if O_DIRECT is supported, or use writeback, which was our default until recently13:06
viks__sean-k-mooney: ok.. what is the easy way to check for `O_DIRECT` support in the host?13:09
sean-k-mooneywell you were using --direct=1 in your fio test so it should support it13:10
sean-k-mooneyO_DIRECT is the file open flag used for direct io13:11
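(One quick way to probe that, assuming the default /var/lib/nova/instances path; dd fails with "Invalid argument" if the filesystem rejects O_DIRECT:)

    # write one block with O_DIRECT; an "Invalid argument" error means the
    # filesystem does not support it
    dd if=/dev/zero of=/var/lib/nova/instances/odirect_probe bs=4096 count=1 oflag=direct
    rm /var/lib/nova/instances/odirect_probe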
viks__ok... 13:11
sean-k-mooneyfirst thing i would try is setting the cachemode for file to none13:11
opendevreviewBalazs Gibizer proposed openstack/osc-placement master: Repro allocation show bug with empty allocation  https://review.opendev.org/c/openstack/osc-placement/+/80755313:12
opendevreviewBalazs Gibizer proposed openstack/osc-placement master: Fix allocation show / unset with empty allocation  https://review.opendev.org/c/openstack/osc-placement/+/80755613:12
sean-k-mooneythen hard reboot the vm and see if that improves the performance13:12
viks__sean-k-mooney: ok thanks... will try the same13:12
sean-k-mooneyif that has no effect i would try setting preallocate_images=space and then try changing images_backend to raw. you could also in parallel create a second vm with hw_disk_bus=virtio-scsi to compare side by side with virtio-blk13:14
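(A sketch of the nova.conf knobs discussed above, for the libvirt driver; values are the ones suggested here, and nova-compute needs a restart plus a hard reboot of the vm to pick them up:)

    [DEFAULT]
    preallocate_images = space

    [libvirt]
    images_type = raw
    disk_cachemodes = file=none,block=none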
sean-k-mooneyviks__: for really write heavy workloads lvm used to slightly outperform raw but that required extra setup and it is less well maintained13:15
sean-k-mooneyso if the other tunings don't help in your case, that is what i would try last13:15
sean-k-mooneyviks__: outside of openstack there are some host level sysctl tunings you can do to improve guest io performance, like changing the io scheduler or dirty data writeback thresholds13:16
sean-k-mooneytuned can be used as a reference for some of the commonly tuned settings13:18
sean-k-mooneyhttps://github.com/redhat-performance/tuned/blob/master/profiles/throughput-performance/tuned.conf#L24-L5913:18
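(For example, that profile sets dirty-data writeback thresholds along these lines; the values below are believed to match the linked tuned.conf but should be verified there before applying:)

    sysctl -w vm.dirty_ratio=40              # allow more dirty pages before blocking writers
    sysctl -w vm.dirty_background_ratio=10   # start background writeback earlier
    # the io scheduler is set per block device; inspect the current one with
    # (device name is an example):
    cat /sys/block/sda/queue/scheduler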
viks__sean-k-mooney:  thanks for all the inputs.. will do some testing based on these .... thanks again...13:19
opendevreviewBalazs Gibizer proposed openstack/nova master: DNM: check nova-next with osc-placement fix  https://review.opendev.org/c/openstack/nova/+/80755813:22
sean-k-mooneyviks__: if you are testing virtio-scsi, the image options are hw_disk_bus=scsi hw_scsi_model=virtio-scsi by the way13:25
sean-k-mooneyi previously mistyped it as hw_disk_bus=virtio-scsi13:25
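(Concretely, on a glance image that would be the following; the image name is a placeholder:)

    openstack image set \
        --property hw_disk_bus=scsi \
        --property hw_scsi_model=virtio-scsi \
        my-test-image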
viks__sean-k-mooney: ok...13:26
bauzashuzzah, I can see my review priorities: https://review.opendev.org/q/label:Review-Priority%253E%253D1%252Csbauza14:06
opendevreviewMerged openstack/osc-placement master: Repro allocation show bug with empty allocation  https://review.opendev.org/c/openstack/osc-placement/+/80755314:20
opendevreviewMerged openstack/osc-placement master: Fix allocation show / unset with empty allocation  https://review.opendev.org/c/openstack/osc-placement/+/80755614:20
gibilyarwood, bauzas: ^^ thanks for the quick approve. I will ask for a release freeze exception shortly14:41
gibiRFE requested: http://lists.openstack.org/pipermail/openstack-discuss/2021-September/024686.html14:53
slaweqhi stable nova cores, can somebody check https://review.opendev.org/c/openstack/nova/+/791421 /.14:55
slaweq?14:55
slaweqthx in advance for help :)14:55
gibiis the USA out today due to Labor Day?15:28
kashyapgibi: Yeah15:31
kashyapAlso Canada, IIRC15:31
gibiOK, then I don't wait for them today :)15:31
bauzasgibi: sorry was on a meeting15:57
bauzasgibi: yup, saw your FFE request but as kashyap said and you guessed, a whole portion of the world located between the Pacific and Atlantic oceans and above a certain latitude is currently shut down for the day15:58
gibibauzas: no worries, the release is now queued15:59
bauzassaid a French guy working15:59
gibibauzas: I looked at melwitt's repro patch https://review.opendev.org/c/openstack/placement/+/807493 and I left some ideas but no real solutions yet15:59
bauzasgibi: honestly we face the limits of the single-commit design we have with Placement16:01
gibibauzas: this is now more about writing a sane repro test that exercises transaction isolation with mysql (as sqlite does not have it)16:03
gibiI think the fix melwitt proposed on top is sane but the repro test is racy and it actually forks processes16:03
gibiwhich is scary16:04
bauzasgibi: yup, I saw the tox modification and I understood the reasoning16:04
melwittI will try to find a different way to do it. it's inherently problematic to try and repro this because once one path starts a transaction, trying to do something in the middle of it (to cause the race state) poses a record locking problem. at one point I was trying to fake it by just returning bogus things to make it think it hit a generation conflict (without doing any database write), which worked to make it retry but that wasn't16:12
melwitt showing the effect of the consistent read problem inside the transaction16:12
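(The race, roughly, in SQL terms; the table and columns here are simplified from placement's schema:)

    -- session A, REPEATABLE READ (the mysql default)
    BEGIN;
    SELECT generation FROM resource_providers WHERE id = 1;  -- snapshot says 5
    -- session B commits an update here, bumping the generation to 6
    UPDATE resource_providers SET generation = 6
        WHERE id = 1 AND generation = 5;  -- sees the committed row: 0 rows matched, conflict
    -- but a plain re-SELECT in session A still returns 5 from its snapshot,
    -- so retrying inside the same transaction keeps reading stale data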
gibimelwitt: yes, exactly, that was the dead end I went down this afternoon :/16:21
gibimelwitt: I think at some point we can accept that we cannot reliably test this in a func env, simply land the fix, and monitor the tempest jobs to see if it resolved the race or not16:22
melwittgibi: I went down the dead end three days in a row 😑16:22
melwittgibi: yeah. I was thinking that too. I kept thinking there has to be a way but if there is I'm not clever enough to find it haha16:23
gibimelwitt: but enjoy your day off, we can continue this tomorrow16:24
melwittkk, thanks16:24
*** elodilles is now known as elodilles_pto20:32
