Tuesday, 2023-07-25

sapd1Hi everyone, I'm running OpenStack Train. I have a problem when resizing an instance with PCI devices (GPU passthrough) to a flavor without PCI devices. 01:56
sapd1Does anyone know how to fix this problem?01:56
sapd1When I perform resize: No valid host was found. No valid host found for resize (HTTP 400) (Request-ID: req-9d6cf3f9-8f6f-482f-b0b1-95ec286a0fc3)01:57
auniyalsapd1 - as the error says, No Valid Host. 05:33
auniyalbut if you are sure this is not correct and are still looking for the reason, can you add --debug to your resize command to find out exactly which call failed and why? Then, once you have found that, look for the req-ID in the compute logs; it will give you a better idea of the failure. If there are any tracebacks in the logs, you can file a bug here:05:33
auniyalhttps://bugs.launchpad.net/nova05:33
sapd1auniyal: this bug: https://bugs.launchpad.net/nova/+bug/194100507:07
auniyalit seems fixed07:09
sapd1auniyal: but it's not merged yet 07:09
auniyaloh yes, in train it's not merged07:10
sapd1auniyal: I patched it in my system. 07:19
opendevreviewKonrad Gube proposed openstack/nova master: Use Cinder's os-extend_volume_completion volume action.  https://review.opendev.org/c/openstack/nova/+/87356007:40
bauzasdansmith: gmann: it occurs to me that nova-next is almost failing on volume detach due to some kind of filesystem check (that's a guess) and kinda related to what you told this night (for me)08:32
bauzashttps://6d9b97dc35f887a99105-3214406b4544fce2f9d807df6ea4fe3f.ssl.cf5.rackcdn.com/886232/4/check/nova-next/31d3775/testr_results.html08:32
sean-k-mooneyis that using cinder lvm?09:02
sean-k-mooneyi.e. not ceph09:03
sean-k-mooneyit's not quite the same as what https://github.com/openstack/devstack/commit/58c80b2424623096e4a1f7a901f424be0ce6cb3f addressed, and actually that should help there in any case09:04
sean-k-mooneyi think https://review.opendev.org/c/openstack/tempest/+/886991 could still help 09:05
sean-k-mooneybauzas: i think it's https://bugs.launchpad.net/tempest/+bug/202485909:06
bauzassean-k-mooney: -ish yeah09:07
sean-k-mooneyi just rebased melwitt's patch to see if it will pass ci09:08
sean-k-mooneywe have made some other improvements so it might help but i'm not sure09:09
sean-k-mooneywe could also enable caching in qemu09:09
sean-k-mooneyso turn on the writeback cache09:09
sean-k-mooneyand see if that can hide the slow storage performance09:09
sean-k-mooneyhehe i was thinking of setting https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.disk_cachemodes to writeback09:13
sean-k-mooneybut we could also set it to unsafe ... that would definitely be faster09:14
bauzasfwiw, I only see this pattern on nova-next09:15
bauzasor, I'd rather say, this failure incidence is way higher on this job than any other one in the nova check and gate pipelines09:15
sean-k-mooney[libvirt]/disk_cachemodes=file=none,block=unsafe,network=writeback09:16
sean-k-mooneyi'm going to add a patch to nova-next to configure that and let's see if it fails09:17
opendevreviewsean mooney proposed openstack/nova master: [WIP] use disk caching to hide slow cinder performance  https://review.opendev.org/c/openstack/nova/+/88938309:23
sean-k-mooneybauzas: ^ that might work although in production you would not use unsafe but for ci it should be fine09:25
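The cache-mode setting discussed above is a comma-separated list of disk_type=cache_mode pairs. A minimal Python sketch of parsing and checking such a value follows. The helper names (parse_disk_cachemodes, unsafe_disk_types) are hypothetical, and the sets of accepted disk types and cache modes are assumptions taken from the nova configuration reference rather than from this log.

```python
# Assumed from the nova configuration reference for [libvirt]/disk_cachemodes;
# not confirmed anywhere in this log.
VALID_DISK_TYPES = {"file", "block", "network", "mount"}
VALID_CACHE_MODES = {
    "default", "none", "writethrough", "writeback", "directsync", "unsafe",
}


def parse_disk_cachemodes(value):
    """Parse 'file=none,block=unsafe' into {'file': 'none', 'block': 'unsafe'}."""
    modes = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        disk_type, sep, mode = item.partition("=")
        if not sep:
            raise ValueError(f"expected disk_type=cache_mode, got {item!r}")
        if disk_type not in VALID_DISK_TYPES:
            raise ValueError(f"unknown disk type {disk_type!r}")
        if mode not in VALID_CACHE_MODES:
            raise ValueError(f"unknown cache mode {mode!r}")
        modes[disk_type] = mode
    return modes


def unsafe_disk_types(modes):
    """Return the disk types configured with the 'unsafe' cache mode, sorted."""
    return sorted(t for t, m in modes.items() if m == "unsafe")


if __name__ == "__main__":
    # The value proposed in the channel for the CI experiment.
    parsed = parse_disk_cachemodes("file=none,block=unsafe,network=writeback")
    print(parsed)
    print("unsafe for:", unsafe_disk_types(parsed))
```

A check like unsafe_disk_types could be used to flag a config that is acceptable for CI but, as noted in the channel, not something to run in production.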
dvo-plvHello, nova folks. Maybe you will have a chance to review this commit, thank you https://review.opendev.org/c/openstack/nova/+/87607509:37
sean-k-mooneyi started that review twice already so i probably should go finish it :)09:39
sean-k-mooneydvo-plv: looks good to me09:47
kashyapWhile I'm off the next 3 days, anyone willing to get this over the line?  It's just failing on some timeouts: https://review.opendev.org/c/openstack/nova/+/88725511:35
kashyap(I see sean-k-mooney already did a recheck on the dep patch; thanks!)11:36
sean-k-mooneyit still needs review from others but i set RP+2 on it as well11:37
kashyapYeah, saw that, thank you!11:38
auniyalsean-k-mooney, and others, how can I set debug=true for nova-scheduler? In the devstack nova.conf it's already set in [DEFAULT]; there is another group, [scheduler], and I tried there as well but it made no difference12:53
auniyalI am trying to see https://github.com/openstack/nova/blob/815683ea86492d3ed77b04cc56f3db87e2b8c47d/nova/weights.py#L13612:53
auniyalin scheduler logs while launching VM12:53
auniyalis there any other conf I should look into ?12:55
sean-k-mooneydevstack runs in debug mode by default13:06
auniyalyes, still these logs are not coming in "journalctl -u devstack@n-sch"13:07
sean-k-mooney stack@upstream-devstack:~$ sudo journalctl -u devstack@n-sch | grep debug | wc13:09
sean-k-mooney     10     172    293213:09
auniyalyes, debug logs are coming in n-sch but not these logs https://github.com/openstack/nova/blob/815683ea86492d3ed77b04cc56f3db87e2b8c47d/nova/weights.py#L13613:11
auniyalso I think they should come while creating a VM13:12
auniyalright ?13:12
sean-k-mooneyi'm seeing the log in the ci13:12
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/fd53404ef23341828ae15f8ccf596e6c/log/controller/logs/screen-n-sch.txt#101813:13
auniyalyes, here they are coming, I have a single-node devstack env locally13:15
auniyalit should be right ?13:15
sean-k-mooney/etc/nova/nova.conf as rendered by devstack has debug=true set in the default section13:23
sean-k-mooneythat is what the scheduler uses13:24
auniyalack, thanks 13:30
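A quick way to confirm what the scheduler will read is to parse the rendered config and check the debug flag in [DEFAULT]. This is a minimal sketch, assuming the file is plain INI that Python's configparser can read (nova itself uses oslo.config, which this does not replicate); debug_enabled is a hypothetical helper name.

```python
import configparser


def debug_enabled(conf_text):
    """Return True if 'debug' is set to a true value in the [DEFAULT] section."""
    # interpolation=None: treat '%' in values literally.
    # strict=False: tolerate repeated options or sections instead of raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    # fallback=False covers a config with no 'debug' option at all.
    return parser.getboolean("DEFAULT", "debug", fallback=False)


if __name__ == "__main__":
    sample = """
[DEFAULT]
debug = true

[scheduler]
max_attempts = 3
"""
    print(debug_enabled(sample))
```

On a devstack node the text would come from the rendered file, for example debug_enabled(open("/etc/nova/nova.conf").read()). If the flag is true and a given log line is still missing, the cause lies elsewhere than the debug setting.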
sean-k-mooneythe backport does not seem to be working properly13:33
sean-k-mooneywait why are you proposing https://review.opendev.org/c/openstack/nova/+/889311 directly to yoga13:34
sean-k-mooneyoh you are not13:34
sean-k-mooneythis is incorrect13:35
auniyalack14:01
opendevreviewalecorps proposed openstack/nova master: Workaround for issues with ephemeral disk named disk.local during resize  https://review.opendev.org/c/openstack/nova/+/88822014:10
opendevreviewAmit Uniyal proposed openstack/nova master: Added context manager for instance lock  https://review.opendev.org/c/openstack/nova/+/87364814:14
opendevreviewAmit Uniyal proposed openstack/nova master: Disconnecting volume from the compute host  https://review.opendev.org/c/openstack/nova/+/87744614:14
bauzaselodilles: gibi: I'll need to stop leading the nova meeting after 20-25 mins14:37
bauzasUggla and me are going to visit artom who's in the surroundings14:37
opendevreviewMaxim Monin proposed openstack/nova master: Server Rescue leads to Server ERROR state if original image is deleted  https://review.opendev.org/c/openstack/nova/+/87238514:39
elodillesbauzas: hmmm... i might be late like 20-25 mins, actually :S14:42
elodilles(though i hope not...)14:43
auniyalhey melwitt, I heard PS[number] from you a few times, but I'm not sure what it is, like PS1 or PS10 14:47
bauzasauniyal: PS = patchset14:47
bauzasthe revision number in gerrit14:47
auniyalso in one change patchset 1014:47
auniyalack, thanks 14:48
gibibauzas: I probably need to skip today's meeting (or I will be on and off during it)14:56
bauzasnp14:57
bauzashonestly, given this and that, I don't want to skip but I'll say that we'll just check a few things14:57
gibiack14:59
auniyalgibi, bauzas, dansmith, sean-k-mooney, melwitt can you please review these stable branch patches - https://etherpad.opendev.org/p/release-liaison-PatchesToReview15:52
auniyalmost of them are clean cherry-pick and already have 1 +2 and good to merge15:52
sean-k-mooneythe gates on stable should be unblocked15:54
sean-k-mooneythe nova-lvm and ceph issues should be fixed on all branches15:54
bauzasauniyal: ack, my main prio before leaving for 3 weeks is about features reviews but I'll try15:55
elodillesyepp, they are unblocked, thanks sean-k-mooney for the patches \o/15:56
bauzas#startmeeting nova16:00
opendevmeetMeeting started Tue Jul 25 16:00:23 2023 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.16:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:00
opendevmeetThe meeting name has been set to 'nova'16:00
bauzashey, let's try to have a 10-min meeting if we can16:00
dansmitho/16:00
bauzas#link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting16:00
gmanno/16:00
auniyalo/16:01
elodilleso/16:01
bauzasok, starting 16:01
bauzasif someone wants to continue discussing, I can pass the chair16:01
bauzas#topic Bugs (stuck/critical) 16:01
bauzas#info No Critical bug16:01
bauzas#link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 40 new untriaged bugs (+0 since the last meeting)16:01
bauzasUggla made a good effort on triaging this week, thanks 16:01
bauzashe shared with me https://etherpad.opendev.org/p/nova-bug-triage-2023072516:02
bauzas#info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster16:02
bauzaselodilles: can you be the baton owner for this week ?16:02
elodillesyepp 16:02
bauzascool thanks !16:02
bauzas#info bug baton is being passed to elodilles16:02
bauzasany bug to discuss ?16:02
elodillesi'll be off on Friday, but otherwise fine!16:02
bauzaselodilles: I'll be off next week so :)16:03
bauzas#topic Gate status 16:03
elodillestouche ;)16:03
bauzas#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:03
bauzas#link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&pipeline=periodic-weekly Nova&Placement periodic jobs status16:03
bauzas#info nova-emulation back to work16:03
bauzas#link https://zuul.openstack.org/build/2eb47b5dcb944b7eb03224e6ee599a5716:03
dansmithgmann and I have been working on a bunch of patches to address job timeout issues16:03
bauzasnow we have a failure on tempest-integrated-placement 16:03
dansmithstill lots in flight, but we're hopeful that it will improve those issues at least16:03
bauzasbut we'll see next week16:03
dansmithbut there are still lots of other spurious fails16:04
bauzasdansmith: I was about to mention the other pipelines but the periodic-weekly16:04
bauzas#info Please look at the gate failures and file a bug report with the gate-failure tag.16:04
dansmithreally need all hands on deck to improve this situation16:04
gmannyeah, timeouts nowadays are due to multiple reasons and we are trying to improve a few of them16:04
bauzasdansmith: so, yeah, there are a bunch of patches in flight16:04
bauzasdansmith: gmann: I don't know if you have seen my pings today but yeah, there is another suspect16:05
bauzas#link https://bugs.launchpad.net/tempest/+bug/202485916:05
dansmithbauzas: the ping was about the mkfs issue?16:05
dansmiththat's known and unclear how to resolve, AFAICT16:05
bauzasyup16:05
gmannbauzas: I saw that, I did not look into that except melwitt trying with some change 16:05
gmannyeah mkfs 16:05
dansmithbut I'm highly concerned that we've got a group of workers that is suddenly very very IO constrained16:05
bauzassean-k-mooney has a WIP proposal16:06
dansmithbecause we're seeing 100% slowdown on some nodes that seems IO-related, like that mkfs issue16:06
dansmithmelwitt had a proposal but it turned out not to help, last I checked16:06
gmannyeah16:06
sean-k-mooneythe nova-lvm job passed with it but i don't know if it was failing consistently before16:06
bauzasabout using disk caching16:06
bauzasbut I'm torn enabling it16:06
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/88938316:06
bauzasanyway, I don't disagree, we need to continue digging into it16:07
sean-k-mooneywell its partly enabled already 16:07
gmannsean-k-mooney: might not be so consistent but I see this 2-3 times in a week16:07
bauzasdansmith: gmann: from the most remaining issues, is nova just a coal canary or is responsible for some of them ?16:07
sean-k-mooneyi was suggesting setting disk_cachemodes: "file=none,block=unsafe,network=writeback"16:07
sean-k-mooneyblock could also be writeback16:07
dansmithbauzas: need more triage to know really16:08
dansmithbauzas: I've been focusing on the timeout stuff for a week16:08
gmannbauzas: timeout is more of test runner unbalancing on test worker and slow test etc etc16:08
bauzasok, I'll try to get more hands on it before I leave16:08
gmannbecause timeout hold many of the gate fixes so that should be fixed first :)16:09
bauzasack16:09
bauzasmoving on16:09
bauzas#topic Release Planning 16:09
bauzas#link https://releases.openstack.org/bobcat/schedule.html16:09
bauzas#link https://etherpad.opendev.org/p/nova-bobcat-blueprint-status Etherpad for tracking blueprints status16:09
bauzas#info 5 weeks before FeatureFreeze16:09
bauzasas I said before, my attention will go to features sets this week16:10
bauzasin theory, today was a Feature Review Day16:10
bauzasbut I feel many people have different priorities16:10
bauzasso I haven't really called it 16:10
bauzas#topic Review priorities 16:11
bauzas#link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement+OR+project:openstack/os-traits+OR+project:openstack/os-resource-classes+OR+project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/osc-placement)+(label:Review-Priority%252B1+OR+label:Review-Priority%252B2)16:11
bauzas#info As a reminder, people eager to review changes can +1 to indicate their interest, +2 for asking cores to also review16:11
bauzas#topic Stable Branches 16:11
bauzaselodilles: your time16:11
elodilles #info nova-lvm / nova-ceph-multistore jobs are fixed on all branches (2023.1, zed, yoga) \o/16:11
elodilles#info stable/victoria gate fix has also landed16:11
elodilles#info gates from 2023.1 back till train should be OK16:11
elodilles#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:11
elodillesEOM16:11
bauzasthanks16:11
bauzas#info train-eol patch proposed https://review.opendev.org/c/openstack/releases/+/88536516:12
bauzaswe're missing a second release core vote16:12
bauzasbut should happen eventually16:12
elodillesyepp16:12
bauzasauniyal had a point, I'll rephrase it 16:12
bauzas#info Please review these backport patches of stable 2023.1, zed and yoga for next minor release16:12
bauzas#info most of these already have one +2.16:12
bauzas#link https://etherpad.opendev.org/p/release-liaison-PatchesToReview16:12
bauzasand the stable branches are now back in healthy state16:13
bauzas#topic Open discussion 16:13
bauzasI have one item16:13
bauzasas I said previously, I'll be on PTO starting next Tues16:14
bauzasgibi volunteered for running the Aug 15 meeting16:14
bauzas(and Aug 22)16:14
bauzasbut we may need someone to chair the Aug 1 and Aug 8 meetings16:14
bauzasor we skip them16:14
gibijepp I'm still OK with the 15 and 2216:15
bauzasso, anyone fancy volunteering ? if not, nevermind, I'll send an email to cancel the two weeks16:15
bauzasI can try to handle Aug 1 meeting as I won't (yet) be on a plane, but I can't promise16:16
bauzasok, let's consider the meetings cancelled16:16
bauzas#info Aug 1 and Aug 8 Nova meetings are CANCELLED16:17
bauzas#action bauzas to notify the ML accordingly16:17
bauzasthat's it16:17
bauzasfor me16:17
bauzasanything anyone ?16:17
bauzasok, thanks then16:18
bauzas#endmeeting16:18
opendevmeetMeeting ended Tue Jul 25 16:18:41 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:18
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2023/nova.2023-07-25-16.00.html16:18
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2023/nova.2023-07-25-16.00.txt16:18
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2023/nova.2023-07-25-16.00.log.html16:18
gibio/16:19
elodillesthanks, too o/16:19
gmannthanks o/16:21
opendevreviewsean mooney proposed openstack/nova master: [WIP] use disk caching to hide slow cinder performance  https://review.opendev.org/c/openstack/nova/+/88938316:58
sean-k-mooneyif we decide this is the correct approach in general i can do ^ in devstack instead16:59
sean-k-mooneyor in the base job but i'm adding it to all the in-tree jobs to get more input on whether it affects the ci jobs positively or negatively16:59
dansmithsean-k-mooney: I think doing it in devstack itself is the wrong move, but maybe in some of the devstack job defs.. this is really an optimization for our own CI and not something we'd want anyone running without knowing it, even in devstack, IMHO17:02
dansmithI'll be interested to see if that helps significantly.. I've seen that bug manifest only occasionally in my surveying recently17:02
sean-k-mooneyi was thinking of doing it like the mysql memory thing17:02
sean-k-mooneye.g. off by default with a macro/var to turn it on in ci17:03
dansmithyeah, it could be a flag, sure17:03
sean-k-mooneybasically i just dont want to have to copy paste that to even more jobs over time :)17:03
sean-k-mooneyim not sure if it will help or not but we will see17:04
sean-k-mooneyin theory it should, not sure17:04
opendevreviewMaxim Monin proposed openstack/nova master: Server Rescue leads to Server ERROR state if original image is deleted  https://review.opendev.org/c/openstack/nova/+/87238519:24
