Tuesday, 2017-12-19

*** chyka has quit IRC00:00
rybridgesnot much seems pertinent in the libvirtd log00:00
rybridgesI dont think that this is related to libvirtd or qemu actually00:02
*** mingyu has quit IRC00:02
rybridgeswe are running a juno deployment with identical libvirtd / qemu versions and config00:02
rybridgesand we do not see this problem00:02
rybridgesbut when we run the same libvirtd/qemu setup with the ocata codebase, we see this issue00:03
*** rcernin has quit IRC00:09
*** rcernin_ has joined #openstack-nova00:09
rybridgescould be a problem with the libvirt-python version in ocata00:26
rybridgesthe upper constraints is capped at 2.5.000:26
rybridgesbut that is completely broken in rhel environments, cant even install it00:26
rybridgesso we tried 3.5.000:26
rybridgesthat wasnt working00:26
rybridgestried 3.10.000:27
rybridgesalso not working00:27
rybridgesnow trying 3.7.000:27
rybridgesand it seems to be working00:27
rybridgesi have suspended 40 instances without error00:27
*** yangyapeng has joined #openstack-nova00:27
rybridgesdoh00:27
rybridgestake that back00:28
rybridgestried 20 in parallel00:28
rybridgesstill got a few errors00:28
clarkbrybridges: libvirt-python is supposed to be compatible with any libvirt that is the same release as it or an older release. so libvirt-python 3.0 can talk to libvirt 2.5 but libvirt-python 2.5 can't talk to libvirt 3.000:31
clarkbthis is why rhel 7.4 broke the 2.5.0 cap00:31
clarkb(they did a major upgrade of libvirt)00:31
*** yangyapeng has quit IRC00:31
rybridgesright00:32
rybridgesyes00:32
rybridges2.5 breaks on later versions of libvirt00:32
rybridgesso we had to switch up00:33
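The compatibility rule clarkb describes above (the python binding works with a libvirtd of the same release or older, but not with a newer one) can be sketched as a small version check. This is an illustrative helper, not part of any real libvirt API; the function name and dotted-version format are assumptions:

```python
def bindings_compatible(binding_version, daemon_version):
    """Return True if a libvirt-python release can talk to a libvirtd daemon.

    Per the rule discussed above: the binding works with a daemon of the
    same release or older, but not with a newer daemon.
    """
    def parse(version):
        # "3.10.0" -> (3, 10, 0); tuple comparison handles multi-digit parts
        return tuple(int(part) for part in version.split("."))

    return parse(binding_version) >= parse(daemon_version)

# libvirt-python 3.0.0 can talk to libvirt 2.5.0 ...
print(bindings_compatible("3.0.0", "2.5.0"))  # True
# ... but libvirt-python 2.5.0 cannot talk to libvirt 3.0.0
print(bindings_compatible("2.5.0", "3.0.0"))  # False
```

This is also why the rhel 7.4 libvirt upgrade broke the 2.5.0 upper constraint: the pinned binding ended up older than the daemon.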
*** kumarmn has joined #openstack-nova00:35
*** hiro-kobayashi has joined #openstack-nova00:35
*** marst has quit IRC00:35
*** trinaths has joined #openstack-nova00:36
*** tuanla____ has joined #openstack-nova00:38
*** kumarmn has quit IRC00:40
*** moshele has joined #openstack-nova00:40
*** rcernin_ has quit IRC00:50
*** psachin has joined #openstack-nova00:53
*** annp has quit IRC00:53
*** annp has joined #openstack-nova00:53
*** jose-phillips has quit IRC00:53
*** hoangcx has quit IRC00:53
*** hieulq has quit IRC00:53
*** daidv has quit IRC00:53
*** tuanla____ has quit IRC00:53
*** daidv has joined #openstack-nova00:54
*** tuanla____ has joined #openstack-nova00:54
*** hoangcx has joined #openstack-nova00:54
*** hieulq has joined #openstack-nova00:54
*** jose-phillips has joined #openstack-nova00:55
*** catintheroof has joined #openstack-nova00:56
*** edmondsw has joined #openstack-nova00:58
*** moshele has quit IRC01:00
*** chyka has joined #openstack-nova01:01
*** gyee has quit IRC01:01
*** phuongnh has joined #openstack-nova01:02
*** moshele has joined #openstack-nova01:03
*** huanxie has quit IRC01:04
*** huanxie has joined #openstack-nova01:06
*** chyka has quit IRC01:06
*** salv-orlando has quit IRC01:13
*** salv-orlando has joined #openstack-nova01:14
*** yangyapeng has joined #openstack-nova01:16
*** yangyapeng has quit IRC01:17
*** yangyapeng has joined #openstack-nova01:17
*** salv-orlando has quit IRC01:18
*** catintheroof has quit IRC01:21
*** vishwanathj has joined #openstack-nova01:30
openstackgerritOpenStack Proposal Bot proposed openstack/nova master: Updated from global requirements  https://review.openstack.org/52888101:31
*** Apoorva_ has joined #openstack-nova01:32
*** linkmark has quit IRC01:33
*** Apoorva has quit IRC01:36
*** Apoorva_ has quit IRC01:37
mriedemalex_xu: here is a question about something from long ago https://review.openstack.org/#/c/97727/01:44
openstackgerritOpenStack Proposal Bot proposed openstack/python-novaclient master: Updated from global requirements  https://review.openstack.org/52891101:44
mriedemalex_xu: why does populate_retry not check for MaxRetriesExceeded if max_attempts = 1?01:44
mriedemi realize that means reschedules are disabled, but why wouldn't we compare num_attempts > max_attempts?01:45
*** Dinesh_Bhor has joined #openstack-nova01:46
mriedemi guess that's what the code always did...01:46
mriedemgoes way back to https://review.openstack.org/#/c/9540/01:48
*** claudiub has joined #openstack-nova01:48
mriedemoh nvm, i know why01:51
mriedemif max_attempts == 1, we never set the retry key in the filter properties passed to compute01:51
*** trungnv has joined #openstack-nova01:52
mriedemhttps://github.com/openstack/nova/blob/master/nova/compute/manager.py#L185501:52
mriedemand then we don't reschedule01:52
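For readers following along, a heavily simplified sketch of the behaviour mriedem describes: when max_attempts is 1 the 'retry' key is never placed in the filter properties, so the compute manager's reschedule path has nothing to work with. Names and structure here are illustrative, not the actual nova scheduler_utils code:

```python
class MaxRetriesExceeded(Exception):
    pass


def populate_retry(filter_properties, max_attempts, instance_uuid="fake-uuid"):
    # Reschedules disabled: never set the 'retry' key, so the compute
    # manager's reschedule check is a no-op.
    if max_attempts == 1:
        return

    retry = filter_properties.setdefault(
        'retry', {'num_attempts': 0, 'hosts': []})
    retry['num_attempts'] += 1

    if retry['num_attempts'] > max_attempts:
        raise MaxRetriesExceeded(
            'Exceeded max scheduling attempts %d for instance %s'
            % (max_attempts, instance_uuid))


props = {}
populate_retry(props, max_attempts=1)
print('retry' in props)  # False: nothing passed down, no reschedule

populate_retry(props, max_attempts=3)
print(props['retry']['num_attempts'])  # 1
```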
openstackgerritOpenStack Proposal Bot proposed openstack/nova master: Updated from global requirements  https://review.openstack.org/52888101:53
*** edmondsw has quit IRC01:53
openstackgerritMatt Riedemann proposed openstack/nova master: Don't try to delete build requests on MaxRetriesExceeded  https://review.openstack.org/52883501:53
mriedemmelwitt: ^01:53
mriedemgah, that also goes back to newton01:54
*** andreas_s has joined #openstack-nova01:55
*** andreas_s has quit IRC01:59
*** penick has quit IRC02:07
*** moshele has quit IRC02:08
*** rcernin has joined #openstack-nova02:11
*** zhangjl has joined #openstack-nova02:11
*** zhangjl has left #openstack-nova02:11
*** moshele has joined #openstack-nova02:12
*** salv-orlando has joined #openstack-nova02:14
*** moshele has quit IRC02:15
openstackgerritMatt Riedemann proposed openstack/nova master: Don't try to delete build request during a reschedule  https://review.openstack.org/52883502:16
*** r-daneel has quit IRC02:17
rybridgesso I think i found the root problem with suspend02:17
rybridgeswhen i run suspend like this: openstack server suspend <uuid1> <uuid2> <uuid3> <uuid4> <uuid5>..... <uuid40>02:17
rybridgesalmost all of the instances go to error state02:17
rybridgesbut02:17
*** psachin has quit IRC02:18
rybridgeswhen I run suspend in a simple for loop02:18
rybridgeslike this:02:18
rybridgesfor i in {1..20}02:18
rybridgesdo02:18
*** trinaths has left #openstack-nova02:18
rybridges    openstack server suspend ryan-rhel68-$i &02:18
rybridgesdone02:19
rybridgesi get no errors02:19
rybridgesall of the instances go to suspended state (and NOT error state like the first command)02:19
mriedemthe compute api only takes a single instance for suspend,02:19
*** salv-orlando has quit IRC02:20
mriedemso not sure what osc cli is doing02:20
mriedemlooks like it should be doing the same thing as you are, in a loop https://github.com/openstack/python-openstackclient/blob/master/openstackclient/compute/v2/server.py#L213102:20
*** moshele has joined #openstack-nova02:20
*** tuanla____ has quit IRC02:20
*** daidv has quit IRC02:20
rybridgesright02:21
rybridgesi was just looking at that02:21
rybridgesit looks like it should do essentially the same thing as the loop02:21
rybridgesbut its not02:21
mriedemwhat is the actual error in the nova logs?02:21
rybridgesbecause 80% of the instances go to error state02:21
*** daidv has joined #openstack-nova02:21
*** tuanla____ has joined #openstack-nova02:21
rybridgeshttps://pastebin.com/jTedyZVJ02:21
rybridgesweird libvirt error02:21
rybridgesbut i dont get that at all when i call suspend in a loop from a shell script02:22
mriedemhuh, shouldn't make any difference02:22
mriedemdefinitely looks like you're killing libvirt02:22
mriedemseeing libvirt crash in the libvirtd logs or syslog?02:22
rybridgesi checked the libvirtd log02:23
rybridgesand did not see anything useful at all02:23
rybridgesnot really any errors that seem meaningful02:23
rybridgeseven if that was the case02:23
mriedemi don't know why it would be any different02:23
rybridgeswhy would running the command in one way crash it and running the command in another way be just fine02:23
mriedemeither way you're running it02:23
rybridgesyea02:23
rybridgesit is though, i have 4 ocata clusters02:24
mriedemunless there is some timing difference02:24
rybridgesall of them the behavior is like this02:24
rybridgeswe have an ntp server02:24
rybridgesit also cant be timing02:24
rybridgesbecause if it was02:24
rybridgesit would be reproducible with both commands02:24
rybridgesright?02:24
rybridgesunless02:24
rybridgesone command is doing something different than the other02:24
mriedemwell,02:24
rybridgesdo you know if that .suspend() call is asynch?02:24
mriedemthere is overhead to simply issuing an osc command02:25
mriedemit is02:25
mriedemhttps://github.com/openstack/python-openstackclient/blob/master/openstackclient/compute/v2/server.py#L213102:25
mriedemoops02:25
mriedemhttps://github.com/openstack/python-openstackclient/blob/master/openstackclient/compute/v2/server.py#L213102:25
mriedemdamn02:25
mriedemanyway yeah it's an rpc cast from api to compute02:25
rybridgesright02:25
mriedemso i'm wondering if your script is hitting the osc overhead just enough that each iteration is slow enough02:25
*** claudiub has quit IRC02:25
rybridgeshmm could be02:26
mriedembut when doing them in batch via osc itself, it doesn't have the per-issue overhead02:26
rybridgesin theory, you would think that running the script would actually be calling that .suspend() method slower than passing all the uuids02:26
mriedemtry running both using timeit?02:26
mriedemthat's what i'm saying,02:26
mriedemi think the script way is slower02:26
mriedemand you're slowing it down, effectively load balancing :)02:26
rybridgesyea that makes sense02:26
mriedemso you don't DoS libvirt02:26
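The "accidental load balancing" effect described above (per-invocation CLI startup cost spacing out the requests) can be made deliberate. A hedged sketch of throttling a batch of suspends so libvirt is never hit with all of them at once; `suspend_server` is a placeholder for the real API call (e.g. a novaclient suspend), not a real client method:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def suspend_server(uuid):
    # Placeholder for the real suspend request; here it just echoes the uuid.
    return uuid


def throttled_suspend(uuids, max_workers=4, delay=0.0):
    """Suspend servers with bounded concurrency and optional pacing.

    max_workers caps how many requests are in flight at once; delay spaces
    out submissions, mimicking the per-invocation overhead of calling the
    CLI once per server in a shell loop.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = []
        for uuid in uuids:
            futures.append(pool.submit(suspend_server, uuid))
            time.sleep(delay)
        # Collect in submission order so results line up with the input.
        return [f.result() for f in futures]


print(throttled_suspend(['uuid-%d' % i for i in range(5)], max_workers=2))
```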
*** gcb has joined #openstack-nova02:27
mriedemi didn't know osc actually let you specify a list of uuids to perform some action02:28
rybridgeswell02:28
rybridgesit wasnt always like that02:28
rybridgesin juno we could not do that for the suspend command02:28
mriedemyeah but now you guys are all upgraded to ocata02:28
mriedemand have shiny new ways to kill yourselves02:28
rybridgeslololol02:29
*** Apoorva has joined #openstack-nova02:29
lbragstadmriedem: responded with more context/questions, hopefully it's clearer https://review.openstack.org/#/c/525772/102:31
mriedemlbragstad: i think v1 of this thing needs to probably default to allowing whatever we support today,02:32
mriedemwhich is admin == god02:32
mriedemso in this thing, god == system scope02:32
mriedemyes?02:32
lbragstadso - ['system', 'project']02:32
mriedemyeah,02:33
lbragstadbecause right now if you're admin you're god02:33
mriedemand then for deployments that are doing a god -> project admin -> sheep setup, they can tweak their policy02:33
lbragstadand can do anything everywhere02:33
mriedemcburgess: ^02:33
mriedemcburgess would be a good person to ask because i think he's in the god role02:33
mriedemi.e. the hosting company operator02:33
lbragstadright02:34
lbragstadso the big question is, how much power do i want to give customers without giving them the power to hose my deployment02:34
*** psachin has joined #openstack-nova02:36
mriedemtoday by default its all or none right?02:37
mriedemadmin or not admin02:37
lbragstadpretty much02:37
mriedemok so i would think in queens, anything that's an admin rule by default today, would be system and project scopes02:38
mriedemfor compat02:38
mriedemthen over time you could start restricting the defaults from system to just project with release notes02:38
*** trungnv has quit IRC02:38
mriedemthese are just defaults in the code, and can be overridden02:38
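As a concrete illustration of "defaults in the code that can be overridden": a deployer doing the god -> project admin -> sheep split could tighten individual rules in policy.yaml. The rule name below follows nova's os_compute_api naming convention but is a hypothetical example; check the policy sample for your release before using it:

```yaml
# policy.yaml override (hypothetical example): keep suspend admin-only
# rather than relying on the in-code default.
"os_compute_api:os-suspend-server:suspend": "rule:admin_api"
```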
lbragstadso - i kinda tried to go about doing that here: https://review.openstack.org/#/c/528847/102:38
*** AlexeyAbashkin has joined #openstack-nova02:38
lbragstadand i'd be super curious to get cburgess' feedback on that02:38
mriedemoh so you have a global switch02:39
lbragstadwhere an operator can go through and flip that switch once they have the right role infrastructure in place02:39
lbragstadand they have audited their users to have the right roles02:39
rybridgesso the whole reason why i was asking about suspend originally is because snapshots were failing02:39
rybridgesand the snapshot flow (to my knowledge) is suspend > snapshot > resume02:39
lbragstad(e.g. bob had the admin role but based on good faith, he didn't hose my deployment)02:39
rybridgesand it was always failing on suspend02:40
rybridgesand they still fail most of the time on suspend02:40
rybridgeswith the same error above02:40
mriedemrybridges: what libvirt calls suspend is likely != the compute api suspend02:40
rybridgeseven though i cannot reproduce the error with suspending on the cli with the loop02:40
mriedemhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L178602:41
mriedemhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L268602:41
mriedemformer is what libvirt calls on the guest during a snapshot02:41
mriedemlatter is what you get with 'openstack server suspend'02:42
rybridgesoh02:42
rybridgesok that is interesting02:42
mriedemoh jeez, nvm02:43
mriedemself.suspend(context, instance)02:43
mriedemderp02:43
mriedemyou're right02:43
*** AlexeyAbashkin has quit IRC02:43
mriedemi was thinking of this https://github.com/openstack/nova/blob/master/nova/virt/libvirt/guest.py#L60002:43
mriedemrybridges: did you see where/why the snapshot was actually failing? have you tried doing live snapshots?02:43
rybridgesthe snapshots are failing with the exact same error as i posted in the pastebin above02:44
mriedemyou might want to try live snapshot if libvirt / qemu on the host is new enough02:44
rybridgesit looks like it is just failing on the suspend02:44
mriedemwe don't call suspend if you do a live snapshot02:44
rybridgeswe are running the latest libvirt / qemu that is available for rhel702:44
mriedemwhich is what?02:45
mriedemhttps://github.com/openstack/nova/blob/stable/ocata/nova/conf/workarounds.py#L6802:45
*** Tom-Tom has joined #openstack-nova02:45
rybridgescan you do live snapshot from horizon?02:45
mriedemlive vs cold is a config option in nova-compute in this case02:45
rybridgesi dont see where to do that02:46
mriedemby default it's cold02:46
mriedemwe removed that in queens https://github.com/openstack/nova/commit/980d0fcd75c2b15ccb0af857a9848031919c6c7d02:46
mriedemso now it's always live02:46
mriedemwell, live by default02:46
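The switch mriedem is referring to lives in the [workarounds] group of nova.conf on the compute node (removed in Queens, per the commit linked above). In Ocata, enabling live snapshots would look roughly like this; verify the option's default and behaviour for your release before relying on it:

```ini
# /etc/nova/nova.conf on the nova-compute host (Ocata)
[workarounds]
# False = allow live snapshots when libvirt/qemu support them;
# the historical default disabled them (cold snapshot via suspend).
disable_libvirt_livesnapshot = False
```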
rybridgesok this is very interesting02:47
mriedemusing libvirt 3.6.0 and qemu 2.10 we haven't seen issues with live snapshot in CI02:47
rybridgesi will try this now02:47
rybridgesok02:47
mriedemused to see about a 25% failure rate with live snapshot using libvirt 1.2.2 back in the day02:47
rybridgesok02:48
*** windsn has quit IRC02:48
rybridgesyou said that conf option should be on the hypervisor right?02:48
rybridgesfor nova compute02:48
mriedemyeah02:48
rybridgesnot in nova api02:48
rybridgesok02:48
mriedemit's read from the nova-compute service02:48
mriedemif that works, penick owes me a ginger ale in dublin02:49
mriedemeither way i'm hanging it up for the night02:49
rybridgeshaha02:50
rybridgesi will be in ireland for ptg in feb02:50
rybridgesso ill get you a ginger ale too02:50
*** mriedem has quit IRC02:56
*** Dinesh_Bhor has quit IRC03:01
*** yamahata has quit IRC03:12
*** edmondsw has joined #openstack-nova03:15
*** salv-orlando has joined #openstack-nova03:15
*** abhishekk has joined #openstack-nova03:16
*** mingyu has joined #openstack-nova03:17
*** yamahata has joined #openstack-nova03:17
*** edmondsw has quit IRC03:19
*** salv-orlando has quit IRC03:20
*** armax has quit IRC03:22
*** mingyu has quit IRC03:22
*** markvoelker has joined #openstack-nova03:24
*** takashin has quit IRC03:25
*** yamahata has quit IRC03:27
*** lyan has quit IRC03:30
*** tbachman has quit IRC03:42
*** tbachman has joined #openstack-nova03:43
*** mingyu has joined #openstack-nova03:45
*** mingyu has quit IRC03:49
*** dave-mccowan has quit IRC03:51
*** Dinesh_Bhor has joined #openstack-nova03:52
*** sridharg has joined #openstack-nova03:55
*** markvoelker has quit IRC03:58
*** armax has joined #openstack-nova04:01
*** Apoorva has quit IRC04:04
*** itlinux_ has joined #openstack-nova04:04
*** salv-orlando has joined #openstack-nova04:16
*** gouthamr has quit IRC04:18
*** fragatina has quit IRC04:20
*** fragatina has joined #openstack-nova04:20
*** salv-orlando has quit IRC04:21
*** takashin has joined #openstack-nova04:25
*** psachin has quit IRC04:30
*** psachin has joined #openstack-nova04:35
*** andreas_s has joined #openstack-nova04:40
*** Tom-Tom has quit IRC04:41
*** janki has joined #openstack-nova04:43
*** andreas_s has quit IRC04:45
*** ratailor has joined #openstack-nova04:48
*** phuongnh has quit IRC04:51
*** annp has quit IRC04:51
*** huanxie has quit IRC04:51
*** phuongnh has joined #openstack-nova04:51
*** annp has joined #openstack-nova04:52
openstackgerritTakashi NATSUME proposed openstack/nova master: api-ref: Verify parameters in servers.inc  https://review.openstack.org/52820104:52
openstackgerritTakashi NATSUME proposed openstack/nova master: api-ref: Verify parameters in servers.inc  https://review.openstack.org/52820104:52
*** armax has quit IRC04:53
*** daidv has quit IRC04:54
*** hieulq has quit IRC04:54
*** tuanla____ has quit IRC04:54
*** hoangcx has quit IRC04:54
*** huanxie has joined #openstack-nova04:54
*** tuanla____ has joined #openstack-nova04:55
*** daidv has joined #openstack-nova04:55
*** hieulq has joined #openstack-nova04:55
*** hoangcx has joined #openstack-nova04:55
*** markvoelker has joined #openstack-nova04:55
*** gcb has quit IRC04:56
*** yamamoto has joined #openstack-nova04:57
*** gcb has joined #openstack-nova04:58
*** yamamoto has quit IRC05:07
*** psachin has quit IRC05:07
*** yamamoto has joined #openstack-nova05:08
*** janki has quit IRC05:09
*** janki has joined #openstack-nova05:10
*** Tom-Tom has joined #openstack-nova05:11
*** psachin has joined #openstack-nova05:17
*** salv-orlando has joined #openstack-nova05:17
openstackgerritNakanishi Tomotaka proposed openstack/nova master: Use Placement API to check resource usage  https://review.openstack.org/52895305:21
*** salv-orlando has quit IRC05:22
*** armax has joined #openstack-nova05:22
*** yamamoto has quit IRC05:27
*** markvoelker has quit IRC05:28
*** armax has quit IRC05:28
*** yamamoto has joined #openstack-nova05:30
*** penick has joined #openstack-nova05:32
*** tuanla____ has quit IRC05:33
*** Dinesh_Bhor has quit IRC05:34
*** Dinesh_Bhor has joined #openstack-nova05:34
*** yamamoto has quit IRC05:36
*** penick has quit IRC05:36
*** penick_ has joined #openstack-nova05:36
*** Dinesh_Bhor has quit IRC05:38
*** Dinesh_Bhor has joined #openstack-nova05:39
*** Dinesh_Bhor has quit IRC05:42
*** Dinesh_Bhor has joined #openstack-nova05:43
*** links has joined #openstack-nova05:47
*** chyka has joined #openstack-nova05:50
*** chyka has quit IRC05:51
*** tuanla____ has joined #openstack-nova05:58
*** Tom-Tom has quit IRC06:02
openstackgerritRajesh Tailor proposed openstack/nova master: Host addition host-aggregate should be case-sensitive  https://review.openstack.org/49833406:05
openstackgerritRajesh Tailor proposed openstack/nova master: Fix case-sensitivity for metadata keys  https://review.openstack.org/50488506:06
*** itlinux_ has quit IRC06:08
*** annp has quit IRC06:08
*** annp has joined #openstack-nova06:09
*** afazekas has quit IRC06:11
*** afazekas has joined #openstack-nova06:11
*** penick_ has quit IRC06:12
*** karthiks has joined #openstack-nova06:13
*** namnh has joined #openstack-nova06:13
*** itlinux_ has joined #openstack-nova06:14
*** itlinux_ has quit IRC06:15
*** chyka has joined #openstack-nova06:15
*** chyka has quit IRC06:20
*** armax has joined #openstack-nova06:21
*** janki has quit IRC06:22
*** Dinesh_Bhor has quit IRC06:23
*** markvoelker has joined #openstack-nova06:25
*** armax has quit IRC06:28
*** yamamoto has joined #openstack-nova06:35
*** hiro-kobayashi has quit IRC06:36
*** Dinesh_Bhor has joined #openstack-nova06:37
*** moshele has quit IRC06:40
*** Tom-Tom has joined #openstack-nova06:41
*** yamamoto has quit IRC06:41
*** trungnv has joined #openstack-nova06:43
*** moshele has joined #openstack-nova06:49
*** edmondsw has joined #openstack-nova06:51
*** moshele has quit IRC06:55
*** edmondsw has quit IRC06:55
*** markvoelker has quit IRC06:59
*** _gryf has joined #openstack-nova06:59
*** jchhatbar has joined #openstack-nova07:00
openstackgerritjichenjc proposed openstack/nova master: Remove 'nova-manage shell' command  https://review.openstack.org/52183507:00
openstackgerritjichenjc proposed openstack/nova master: Remove 'nova-manage account' and 'nova-manage project'  https://review.openstack.org/52183307:00
openstackgerritjichenjc proposed openstack/nova master: Remove 'nova-manage logs' command  https://review.openstack.org/52213307:00
*** kumarmn has joined #openstack-nova07:00
*** kumarmn has quit IRC07:05
*** chyka has joined #openstack-nova07:08
*** armax has joined #openstack-nova07:10
*** armax has quit IRC07:11
*** ludo has joined #openstack-nova07:12
*** ludo is now known as Guest7202807:12
Guest72028Hello All07:12
*** chyka has quit IRC07:13
*** pchavva has quit IRC07:14
*** andreas_s has joined #openstack-nova07:18
*** salv-orlando has joined #openstack-nova07:19
Guest72028I have a question for nova specialists about nova resources: is it possible to have spare compute nodes? is it possible to reserve compute resources, and to prioritize the rebuild order of VMs during the evacuation process?07:19
*** claudiub has joined #openstack-nova07:20
*** nore_rabel has joined #openstack-nova07:22
*** salv-orlando has quit IRC07:23
*** rcernin has quit IRC07:31
*** Eran_Kuris has quit IRC07:32
*** ircuser-1 has joined #openstack-nova07:39
*** tetsuro has quit IRC07:43
*** yamamoto has joined #openstack-nova07:45
*** Dave has quit IRC07:48
*** Dave has joined #openstack-nova07:49
*** _gryf has quit IRC07:52
*** AlexeyAbashkin has joined #openstack-nova07:52
*** hoangcx has quit IRC07:53
*** salv-orlando has joined #openstack-nova07:54
*** sahid has joined #openstack-nova07:55
*** takashin has left #openstack-nova08:00
*** rcernin has joined #openstack-nova08:02
*** hoangcx has joined #openstack-nova08:05
*** Dinesh_Bhor has quit IRC08:07
*** Dinesh_Bhor has joined #openstack-nova08:08
*** ralonsoh has joined #openstack-nova08:10
*** daidv has quit IRC08:10
*** hieulq has quit IRC08:10
*** tuanla____ has quit IRC08:10
*** tuanla____ has joined #openstack-nova08:11
*** daidv has joined #openstack-nova08:11
*** hieulq has joined #openstack-nova08:11
*** Dinesh_Bhor has quit IRC08:17
*** Dinesh_Bhor has joined #openstack-nova08:18
*** Dinesh_Bhor has quit IRC08:21
*** xinliang has quit IRC08:22
*** alexchadin has joined #openstack-nova08:22
*** moshele has joined #openstack-nova08:24
*** Dinesh_Bhor has joined #openstack-nova08:24
openstackgerrit龚肖 proposed openstack/nova stable/pike: compute: Catch binding failed exception while init host  https://review.openstack.org/52898508:25
*** Guest72028 has left #openstack-nova08:25
*** andreas__ has joined #openstack-nova08:30
*** andreas_s has quit IRC08:34
*** xinliang has joined #openstack-nova08:34
openstackgerritTommyLike proposed openstack/nova master: Remove redundant try/except block when authorize  https://review.openstack.org/52899108:35
openstackgerritMr Rambo proposed openstack/nova master: Fix the problems that volume-backed server rebuild  https://review.openstack.org/52899408:35
*** edmondsw has joined #openstack-nova08:39
*** Dinesh_Bhor has quit IRC08:43
*** edmondsw has quit IRC08:44
*** damien_r has joined #openstack-nova08:45
*** salv-orlando has quit IRC08:45
*** salv-orlando has joined #openstack-nova08:46
*** alexchadin has quit IRC08:47
*** mdnadeem has joined #openstack-nova08:47
*** alexchadin has joined #openstack-nova08:48
*** priteau has joined #openstack-nova08:50
*** jpena|off is now known as jpena08:51
*** salv-orlando has quit IRC08:51
*** trungnv has quit IRC08:52
*** karthiks has quit IRC08:56
*** markvoelker has joined #openstack-nova08:56
*** cdent has joined #openstack-nova08:57
*** chyka has joined #openstack-nova08:58
*** salv-orlando has joined #openstack-nova08:58
*** brault has joined #openstack-nova08:59
*** andreas_s has joined #openstack-nova09:00
*** chyka has quit IRC09:02
*** andreas__ has quit IRC09:04
lyarwoodmdbooth: thanks for the reviews yesterday btw, is your uuid series ready for review?09:05
*** Dinesh_Bhor has joined #openstack-nova09:05
mdboothlyarwood: Mostly, yes.09:07
mdboothlyarwood: Well, most of it09:07
mdboothlyarwood: I'd very much like your input on this one: https://review.openstack.org/#/c/528363/09:08
mdboothBut I'm about to rebase that into the main series, because I need it for the next patch09:08
*** karthiks has joined #openstack-nova09:08
mdboothThe series is here: https://review.openstack.org/#/q/topic:bp/local-disk-serial-numbers+(status:open+OR+status:merged)09:09
*** Dinesh_Bhor has quit IRC09:10
lyarwoodmdbooth: ack, looking09:10
*** ttsiouts has quit IRC09:11
*** ttsiouts has joined #openstack-nova09:11
mdboothlyarwood: Thanks09:13
*** Dinesh_Bhor has joined #openstack-nova09:22
*** Dinesh_Bhor has quit IRC09:24
*** Dinesh_Bhor has joined #openstack-nova09:25
*** mvk has quit IRC09:25
*** jaianshu has joined #openstack-nova09:27
*** markvoelker has quit IRC09:29
*** Dinesh_Bhor has quit IRC09:30
*** lucas-afk is now known as lucasagomes09:32
*** Dinesh_Bhor has joined #openstack-nova09:33
*** derekh has joined #openstack-nova09:36
openstackgerritMr Rambo proposed openstack/nova master: Fix the problems that volume-backed server rebuild  https://review.openstack.org/52874009:37
*** josecastroleon has joined #openstack-nova09:37
maciejjozefczykgibi jaypipes: Hello, thanks for your comments :) Please check my response https://review.openstack.org/#/c/520024/ Thank you!09:41
*** Dinesh_Bhor has quit IRC09:42
*** ratailor has quit IRC09:44
*** yangyapeng has quit IRC09:45
*** mvk has joined #openstack-nova09:52
*** afazekas has quit IRC10:01
*** namnh has quit IRC10:01
lyarwoodmdbooth: so the DriverVolumeBlockDevice change LGTM, could we set self.connection_info earlier and just pass that around within the method?10:02
mdboothlyarwood: Which function are you referring to specifically?10:05
lyarwoodmdbooth: _legacy_volume_attach or _volume_attach in block_device.py10:05
*** josecastroleon has quit IRC10:06
mdboothSec, just reading and digesting your review comment10:06
mdboothlyarwood: Ah, you're talking about *not* changing the interface?10:07
*** josecastroleon has joined #openstack-nova10:07
mdboothAnd continuing to pass connection info explicitly?10:07
*** afazekas has joined #openstack-nova10:07
lyarwoodmdbooth: no, just setting self.connection_info when we actually fetch it from cinder10:07
mdboothAh...10:07
mdboothYep, that would be cleaner.10:08
*** norman has joined #openstack-nova10:08
mdboothIt would make the patch a bit noisier, though...10:08
mdboothAnd it's already pretty noisy.10:08
mdboothHmm...10:09
lyarwoodyeah I assumed that's why you didn't do it tbh10:09
normanhi all, anyone know why the domain XML of a live-migrated instance has <features> under the <cpu> section, but newly booted instances do not?10:16
normanI tried going through the code but failed to find clues. I am still using Mitaka, not sure whether the new version is OK or not10:17
TahvokHey guys, I'm unable to find a good example of ComputeCapabilitiesFilter. Do you apply it on the flavor's metadata or on the compute host somehow? If it should be on the compute host, where exactly do I specify my 'capabilities' filter?10:17
*** MikeG451 has joined #openstack-nova10:18
*** karthiks has quit IRC10:20
*** fragatin_ has joined #openstack-nova10:22
*** fragatina has quit IRC10:22
*** norman has quit IRC10:23
*** psachin has quit IRC10:24
*** norman has joined #openstack-nova10:24
*** namnh has joined #openstack-nova10:25
*** namnh has quit IRC10:25
*** markvoelker has joined #openstack-nova10:26
*** edmondsw has joined #openstack-nova10:27
openstackgerritMerged openstack/nova stable/ocata: Make request_spec.spec MediumText  https://review.openstack.org/52833210:28
openstackgerritMerged openstack/nova master: Fix the formatting for 2.56 in the compute REST API history doc  https://review.openstack.org/52811410:28
*** moshele has quit IRC10:29
*** mvk has quit IRC10:31
*** edmondsw has quit IRC10:32
*** mvk has joined #openstack-nova10:32
*** phuongnh has quit IRC10:37
*** yikun has quit IRC10:37
*** josecastroleon has quit IRC10:37
*** yikun has joined #openstack-nova10:37
*** sambetts|afk is now known as sambetts10:39
lyarwoodkashyap: http://logs.openstack.org/38/528338/4/check/legacy-tempest-dsvm-multinode-live-migration/d867726/logs/subnode-2/screen-n-cpu.txt.gz#_2017-12-18_20_20_13_894 - seeing this on stable/newton, LM failure for a single paused instance, looks like the remote libvirtd didn't respond in time, have you seen this before?10:41
lyarwoodkashyap: http://logs.openstack.org/38/528338/4/check/legacy-tempest-dsvm-multinode-live-migration/d867726/job-output.txt.gz#_2017-12-18_20_20_56_230121 is the tempest failure10:41
* kashyap clicks10:41
*** jchhatbar is now known as janki10:41
kashyaplyarwood: Is this only stable/newton?10:42
lyarwoodkashyap: that's the only place I've seen it thus far10:42
*** inara has quit IRC10:42
* kashyap checks the other libvirt log to see about the timeout10:42
* kashyap notes to himself: If you see "subnode-2" in the URL, that means it's the 'source' host.)10:43
*** norman has quit IRC10:43
*** brault has quit IRC10:44
*** inara has joined #openstack-nova10:44
*** brault has joined #openstack-nova10:45
*** karthiks has joined #openstack-nova10:46
*** yangyapeng has joined #openstack-nova10:47
*** brault has quit IRC10:50
kashyaplyarwood: So I looked at all the logs, one thing that potentially jumps out at me is in the source QEMU log:10:50
kashyap[...]10:50
kashyap[...]10:50
kashyapwarning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]10:50
kashyapmain-loop: WARNING: I/O thread spun for 1000 iterations10:50
kashyapNow, that warning isn't really an egregious error (& upstream QEMU is aware of it; it's a hard thing to fix),  but that might be contributing to it10:50
kashyapWhere "it" being the timeout we see:10:51
kashyap2017-12-18 20:20:13.880+0000: 16820: error : virKeepAliveTimerInternal:143 : internal error: connection closed due to keepalive timeout10:51
kashyap2017-12-18 20:20:13.881+0000: 16825: error : virKeepAliveTimerInternal:143 : internal error: connection closed due to keepalive timeout10:51
kashyap(From the source)10:51
*** brault has joined #openstack-nova10:51
* kashyap checks with a migration dev 10:51
lyarwoodkashyap: kk, so the dest was stuck, didn't send a keepalive and everything dies?10:52
kashyaplyarwood: Yeah, from the destination livirtd log:10:52
kashyap    2017-12-18 20:20:19.394+0000: 23816: error : qemuMonitorIOWrite:545 : Unable to write to monitor: Broken pipe10:52
kashyapThe above means libvirt lost access to the QMP socket connection, i.e. VM died10:53
*** yangyapeng has quit IRC10:53
lyarwoodkashyap: it's paused, that shouldn't cause the QMP socket to die however right?10:53
*** huanxie has quit IRC10:54
*** huanxie has joined #openstack-nova10:54
*** yamamoto has quit IRC10:55
kashyaplyarwood: Yeah, clearly something is wonky.  I'll check this w/ Dave Gilbert as he meditates on migration10:55
*** yamamoto has joined #openstack-nova10:55
kashyapBut the process _is_ killed, as we see from the destination (http://logs.openstack.org/38/528338/4/check/legacy-tempest-dsvm-multinode-live-migration/d867726/logs/subnode-2/libvirt/qemu/instance-00000004.txt.gz):10:55
kashyap    2017-12-18T20:17:51.612132Z qemu-system-x86_64: terminating on signal 15 from pid 1682010:55
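If the keepalive timeout is the proximate cause, the server-side keepalive behaviour can be relaxed in libvirtd.conf on both hosts as a mitigation while the underlying migration issue is investigated. The option names are real libvirtd.conf settings; the values below are illustrative only:

```ini
# /etc/libvirt/libvirtd.conf (both source and destination hosts)
# Seconds between keepalive probes sent to connected clients.
keepalive_interval = 5
# Number of unanswered probes before the connection is closed.
keepalive_count = 10
```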
*** Tom-Tom has quit IRC10:56
*** yamamoto has quit IRC10:56
*** yamamoto has joined #openstack-nova10:56
*** fragatina has joined #openstack-nova10:57
*** abhishekk has quit IRC10:57
*** fragatin_ has quit IRC10:57
openstackgerritChen Hanxiao proposed openstack/nova master: libvirt: don't call sync_guest_time if qga is not enabled  https://review.openstack.org/52483610:58
lyarwoodkashyap: kk, thanks, I'm going to recheck this change and see if we can hit it again10:58
kashyapSo it is the live block migration, right10:58
kashyap(To pause it)10:58
*** damien_r has left #openstack-nova10:58
kashyapLiveMigrationTest.test_live_block_migration_paused10:58
lyarwoodkashyap: yup, LM of a paused instance without shared storage10:59
*** markvoelker has quit IRC11:00
openstackgerritMatthew Booth proposed openstack/nova master: Rename block_device_info_get_root  https://review.openstack.org/52902811:01
openstackgerritMatthew Booth proposed openstack/nova master: Add local_root to block_device_info  https://review.openstack.org/52902911:01
openstackgerritMatthew Booth proposed openstack/nova master: Expose driver_block_device fields as attributes  https://review.openstack.org/52836211:03
openstackgerritMatthew Booth proposed openstack/nova master: Pass DriverBlockDevice to driver.attach_volume  https://review.openstack.org/52836311:03
*** yangyapeng has joined #openstack-nova11:03
*** yangyapeng has quit IRC11:08
*** salv-orlando has quit IRC11:09
*** salv-orlando has joined #openstack-nova11:09
*** yamamoto has quit IRC11:09
*** andreas_s has quit IRC11:10
*** andreas_s has joined #openstack-nova11:11
openstackgerritMerged openstack/nova master: Implement query param schema for migration index  https://review.openstack.org/51864411:13
*** yamamoto has joined #openstack-nova11:13
kashyaplyarwood: So a couple of things from interacting w/ Dan & Dave from QEMU:11:14
*** salv-orlando has quit IRC11:14
kashyap(1) You see the _later_ log messages (on destination) are 3 before the earlier log message.11:14
kashyapSo that's some weird timestamps there.11:15
kashyap(2) We're not the first to hit this case; there's this existing bug https://bugzilla.redhat.com/show_bug.cgi?id=1367620 ("storage migration fails due to keepalive timeout")11:17
openstackbugzilla.redhat.com bug 1367620 in libvirt "storage migration fails due to keepalive timeout" [High,Assigned] - Assigned to jdenemar11:17
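For reference, the timeout being hit here is governed by the keepalive knobs in libvirtd.conf on each side of the migration; a sketch using the documented defaults (tune per deployment):

```ini
# /etc/libvirt/libvirtd.conf
# A connection is declared dead after keepalive_count unanswered
# probes sent keepalive_interval seconds apart, which produces the
# "connection closed due to keepalive timeout" error shown above.
keepalive_interval = 5
keepalive_count = 5
```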
*** yamamoto has quit IRC11:18
openstackgerritMatthew Booth proposed openstack/nova master: Expose BDM uuid to drivers  https://review.openstack.org/52903711:18
kashyaplyarwood: Wonder if you have a link to how many times this was hit in the past / this week?11:19
lyarwoodkashyap: I don't have one to hand now but I can create one, and an upstream bug for this11:21
mdboothlyarwood: Don't know if you're still looking at it, but I've been messing with that series this morning.11:22
*** AlexeyAbashkin has quit IRC11:22
lyarwoodassuming we've seen it more than once11:22
mdboothNot quite finished yet.11:22
lyarwoodmdbooth: kk I stopped after the earlier change sorry11:22
mdboothlyarwood: NP. I'd have been messing you about anyway.11:23
kashyaplyarwood: I'm creating a potential reproducer, and can file one with that11:23
kashyaplyarwood: But if you've already drafted a bug / issue, go ahead & submit it11:23
* kashyap looks at logstash meanwhile11:24
*** andreas_s has quit IRC11:24
*** yangyapeng has joined #openstack-nova11:24
*** huanxie has quit IRC11:25
*** andreas_s has joined #openstack-nova11:26
openstackgerritMerged openstack/nova master: Remove 'nova-manage shell' command  https://review.openstack.org/52183511:27
*** gszasz has joined #openstack-nova11:28
*** yangyapeng has quit IRC11:29
*** huanxie has joined #openstack-nova11:30
*** andreas_s has quit IRC11:30
openstackgerritMatthew Booth proposed openstack/nova master: Give volume DriverBlockDevice classes a common prefix  https://review.openstack.org/52634611:32
openstackgerritMatthew Booth proposed openstack/nova master: Add DriverLocalImageBlockDevice  https://review.openstack.org/52634711:32
*** david_8 has quit IRC11:33
*** carthaca_ has quit IRC11:33
*** mkoderer_ has quit IRC11:33
*** tpatzig_4 has quit IRC11:33
*** tpatzig_5 has joined #openstack-nova11:34
*** carthaca_1 has joined #openstack-nova11:34
*** david_9 has joined #openstack-nova11:34
*** dgonzalez_ has joined #openstack-nova11:34
*** carthaca_ has joined #openstack-nova11:34
*** mkoderer_ has joined #openstack-nova11:34
openstackgerritMatthew Booth proposed openstack/nova master: Use real block_device_info data in test_blockinfo  https://review.openstack.org/52791611:34
openstackgerritMatthew Booth proposed openstack/nova master: Rename block_device_info_get_root  https://review.openstack.org/52902811:34
openstackgerritMatthew Booth proposed openstack/nova master: Add local_root to block_device_info  https://review.openstack.org/52902911:34
openstackgerritMatthew Booth proposed openstack/nova master: Expose driver_block_device fields as attributes  https://review.openstack.org/52836211:34
openstackgerritMatthew Booth proposed openstack/nova master: Pass DriverBlockDevice to driver.attach_volume  https://review.openstack.org/52836311:34
*** dgonzalez_ has quit IRC11:36
*** carthaca_1 has quit IRC11:36
*** AlexeyAbashkin has joined #openstack-nova11:38
*** brault has quit IRC11:39
*** andreas_s has joined #openstack-nova11:40
*** alexchadin has quit IRC11:44
*** yangyapeng has joined #openstack-nova11:45
*** brault has joined #openstack-nova11:45
*** yangyapeng has quit IRC11:49
*** andreas_s has quit IRC11:50
*** brault has quit IRC11:50
*** andreas_s has joined #openstack-nova11:51
*** andreas_s has quit IRC11:53
*** andreas_s has joined #openstack-nova11:53
openstackgerritStephen Finucane proposed openstack/nova master: Change 'InstancePCIRequest' spec field  https://review.openstack.org/44925711:53
openstackgerritStephen Finucane proposed openstack/nova master: Add Neutron port capabilities to devspec in request  https://review.openstack.org/45177711:53
openstackgerritStephen Finucane proposed openstack/nova master: Format NIC features using os-traits definitions  https://review.openstack.org/46605111:53
*** smatzek has joined #openstack-nova11:56
*** markvoelker has joined #openstack-nova11:57
*** gszasz has quit IRC11:59
*** gszasz has joined #openstack-nova12:00
stephenfinralonsoh: If you have the time to address it, I rebased and left a question on https://review.openstack.org/#/c/449257/12:00
openstackgerritMerged openstack/nova master: Pass mountpoint to volume attachment_update  https://review.openstack.org/52746812:00
*** huanxie has quit IRC12:01
ralonsohstephenfin: I'll take a look at those patches on Friday. This patch depends on two other patches, 449257 and 45177712:02
stephenfinralonsoh: 449257 is the one I'm referring to :)12:02
ralonsohstephenfin: https://review.openstack.org/#/c/466051/ shouldn't be on top of master12:02
ralonsohstephenfin: ok, thanks! I'll take a look at those patches on Friday12:03
stephenfinralonsoh: It isn't - it's on top of 449257 and 45177712:03
ralonsohstephenfin: I was lookin at the wrong patch12:03
ralonsohlooking12:04
gibijaypipes: some extra madness for the server_group functional tests https://bugs.launchpad.net/nova/+bug/173901312:04
openstackLaunchpad bug 1739013 in OpenStack Compute (nova) "nova.tests.functional.test_server_group.ServerGroupTest*.test_evacuate_with_anti_affinity does not validate that evacuation really happens" [Undecided,New]12:04
*** yangyapeng has joined #openstack-nova12:05
*** huanxie has joined #openstack-nova12:05
*** yamamoto has joined #openstack-nova12:06
*** alexchadin has joined #openstack-nova12:06
openstackgerritYikun Jiang (Kero) proposed openstack/python-novaclient master: Microversion 2.58 - Instance actions list pagination  https://review.openstack.org/52860112:07
*** Brin has joined #openstack-nova12:09
*** salv-orlando has joined #openstack-nova12:10
*** yamamoto has quit IRC12:10
*** yangyapeng has quit IRC12:11
*** zhangbailin_ has joined #openstack-nova12:12
*** Brin has quit IRC12:12
*** salv-orlando has quit IRC12:14
*** edmondsw has joined #openstack-nova12:16
*** annp has quit IRC12:17
*** zhangbailin_ has quit IRC12:17
jaypipesgibi: not sure how much more madness you can get in that... :)12:17
ebbexI've created a server with swap, where both root and swap are rbd, yet I have a "huge" swap-file on my compute node under nova/instances/_base/, and I see in the logs a "nova-rootwrap touch -c ...ova/instances/_base/swap_16384" going off about once a minute on that compute node. Where can I read up on the code that creates that file, and how long is the swap-file supposed to stay there?12:18
jaypipesebbex: the swap file should stay there for the life of the VM (since it's the swap content for the image...)12:19
jaypipesebbex: though I'm not sure why you'd see the touch -c command show up more than once. that's odd...12:20
*** edmondsw has quit IRC12:20
ebbex"virsh domblklist instance-00000041" gives:  vdb vms/6e366e4b-2d87-48ac-a99c-999706e7e4f0_disk.swap, (on the ceph cluster) which I take it is where the instance gets to write swap to, right? No actual writes going to the _base/swap12:22
*** links has quit IRC12:23
ebbexI'm afraid that we might end up with a full disk thanks to swap images on our computenode as we have really small disks there. Yet vast amounts of storage on ceph.12:24
jaypipesebbex: hmm, I'm not sure. sure... mdbooth you around?12:25
lyarwoodmdbooth: ^ that smells like a bug, looksing at the code the fetch_func for creating swap is always _create_swap in nova/virt/libvirt/driver.py12:26
lyarwoodlooksing12:26
lyarwood:|12:26
jaypipesmdbooth: does swap file get fulfilled by local disk even when ceph is used?12:26
jaypipesoh, hey lyarwood :)12:26
lyarwood\o_ morning12:26
*** mlavalle has joined #openstack-nova12:28
ebbexjaypipes: Yeah, I think it's kinda odd touching the file every minute, if the ImageCache tries something like _remove_old_enough*. I don't really understand how it's all supposed to hang together.12:28
*** yangyapeng has joined #openstack-nova12:29
jaypipesebbex: we've sent up the bat-signal for mdbooth :) hopefully he can share his insight on this (I'm afraid I'm not proficient enough in this area of the codebase)12:29
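The periodic `touch -c` ebbex is seeing matches the image cache's age-tracking pattern: base files still in use are kept alive by bumping their mtime, so the cleanup task's age check skips them. A minimal, hypothetical sketch of that pattern (not nova's actual code — the real logic lives in nova/virt/libvirt/imagecache.py):

```python
import os
import time


def mark_in_use(base_file: str) -> None:
    """Equivalent of `touch -c <file>`: bump mtime, never create."""
    if os.path.exists(base_file):
        os.utime(base_file, None)  # set atime/mtime to "now"


def remove_if_old_enough(base_file: str, max_age_seconds: int) -> bool:
    """Delete a cached base file only if nothing touched it recently."""
    age = time.time() - os.path.getmtime(base_file)
    if age > max_age_seconds:
        os.remove(base_file)
        return True
    return False
```

This is why the touch can recur every periodic-task run even when an rbd-backed instance never writes to the local copy: as long as the base file is considered in use, its mtime keeps getting refreshed.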
*** markvoelker has quit IRC12:29
*** brault has joined #openstack-nova12:30
ebbex:)12:30
*** gszasz has quit IRC12:33
*** yangyapeng has quit IRC12:34
*** huanxie has quit IRC12:36
*** yamamoto has joined #openstack-nova12:38
*** yamamoto has quit IRC12:39
*** tuanla____ has quit IRC12:39
*** lucasagomes is now known as lucas-hungry12:42
jaypipesstephenfin: I'd just go ahead and take over that InstancePCIRequest patch... ralonsoh, you cool with that?12:42
*** huanxie has joined #openstack-nova12:44
*** gszasz has joined #openstack-nova12:45
*** alexchadin has quit IRC12:46
ralonsohjaypipes, stephenfin: but I've been taking care of my remaining patches. Anyway, if doing this gets this 8-month-old patch merged, that's ok12:47
openstackgerritClaudiu Belu proposed openstack/nova master: tests: autospecs all the mock.patch usages  https://review.openstack.org/47077512:48
*** jpena is now known as jpena|lunch12:48
*** weshay_pto is now known as weshay12:48
*** yangyapeng has joined #openstack-nova12:49
*** aarefiev has joined #openstack-nova12:50
jaypipesralonsoh: cool. it's just that stephenfin has been making some other changes around InstancePCIRequest object to support NUMA PCI affinity policy, so I thought it would be easier to have him take it over.12:52
*** ralonsoh has quit IRC12:53
*** yangyapeng has quit IRC12:53
jaypipesgibi: are you planning on pushing a patch around that test_server_groups.py bug?12:54
*** claudiub|2 has joined #openstack-nova12:57
*** zhurong has joined #openstack-nova13:00
*** claudiub has quit IRC13:00
*** zhurong has quit IRC13:02
*** zhurong has joined #openstack-nova13:03
*** claudiub has joined #openstack-nova13:04
gibijaypipes: yes, I'm working on that right now13:07
jaypipesgibi: cool.13:07
*** claudiub|2 has quit IRC13:07
*** yangyapeng has joined #openstack-nova13:10
*** salv-orlando has joined #openstack-nova13:11
*** janki has quit IRC13:11
*** janki has joined #openstack-nova13:11
*** catintheroof has joined #openstack-nova13:11
*** catintheroof has quit IRC13:12
openstackgerritMerged openstack/nova master: Deprecate configurable Hide Server Address Feature  https://review.openstack.org/52629713:12
*** catintheroof has joined #openstack-nova13:12
*** huanxie has quit IRC13:14
*** yangyapeng has quit IRC13:15
*** salv-orlando has quit IRC13:15
*** links has joined #openstack-nova13:18
*** r-daneel has joined #openstack-nova13:19
*** rcernin has quit IRC13:20
*** huanxie has joined #openstack-nova13:20
*** r-daneel has quit IRC13:20
*** zhurong has quit IRC13:25
*** markvoelker has joined #openstack-nova13:27
*** markvoelker has quit IRC13:28
*** jaianshu has quit IRC13:28
*** markvoelker has joined #openstack-nova13:29
*** yangyapeng has joined #openstack-nova13:30
openstackgerritBalazs Gibizer proposed openstack/nova master: Fix false positive server group functional tests  https://review.openstack.org/52906313:30
gibijaypipes: ^^13:30
jaypipescool, thanks13:30
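The false positive gibi is fixing: asserting only that the server is ACTIVE after evacuation passes even when the instance never left the downed host. A hedged sketch of the stronger check (the helper is hypothetical, not the actual patch; the host field is the standard OS-EXT-SRV-ATTR extension):

```python
def assert_really_evacuated(server_before: dict, server_after: dict) -> None:
    """An evacuation only really happened if the instance is healthy
    AND it landed on a different host than the one it started on."""
    assert server_after["status"] == "ACTIVE"
    # The assertion the false-positive tests were missing: status alone
    # passes even if the instance never moved off the failed compute.
    assert (server_after["OS-EXT-SRV-ATTR:host"]
            != server_before["OS-EXT-SRV-ATTR:host"])
```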
*** Tom-Tom has joined #openstack-nova13:31
cdentgibi: since you've spent a lot of time in the servers functional tests, can you recall how good the coverage is for the various migrations? I'm hoping that existing tests cover https://review.openstack.org/#/c/528089/13:34
*** r-daneel has joined #openstack-nova13:34
*** lucas-hungry is now known as lucasagomes13:34
*** yangyapeng has quit IRC13:35
*** diga has joined #openstack-nova13:36
*** stvnoyes has joined #openstack-nova13:36
*** pchavva has joined #openstack-nova13:38
*** ralonsoh has joined #openstack-nova13:39
*** yangyapeng has joined #openstack-nova13:39
*** yamamoto has joined #openstack-nova13:40
*** liverpooler has joined #openstack-nova13:40
*** yangyapeng has quit IRC13:44
*** dave-mccowan has joined #openstack-nova13:45
*** yamamoto has quit IRC13:48
*** jpena|lunch is now known as jpena13:49
*** huanxie has quit IRC13:51
*** Tom-Tom_ has joined #openstack-nova13:51
*** mingyu has joined #openstack-nova13:52
*** Tom-Tom has quit IRC13:54
*** huanxie has joined #openstack-nova13:56
*** mriedem has joined #openstack-nova13:58
*** lyan has joined #openstack-nova14:01
bauzasmmm, blaming libvirt/driver.py seems a bit habit :p14:02
bauzass/bit/bad14:02
*** links has quit IRC14:03
*** yangyapeng has joined #openstack-nova14:06
gibicdent: I think it is safe to assume that the changes in https://review.openstack.org/#/c/528089/ are covered by the existing functional tests14:10
gibicdent: I put the patch on my review list14:11
cdentthanks gibi14:11
*** salv-orlando has joined #openstack-nova14:11
mriedemedleafe: the failure on https://review.openstack.org/#/c/511358/ is because we aren't removing the existing allocations for the instance (from the tried and failed host) before we try allocating resources on the alternate14:11
mriedemso the report client thinks we're doing a move operation, which we aren't14:12
*** yangyapeng has quit IRC14:12
openstackgerritBalazs Gibizer proposed openstack/nova master: Fix false positive server group functional tests  https://review.openstack.org/52906314:13
openstackgerritBalazs Gibizer proposed openstack/nova master: Fix false positive server group functional tests  https://review.openstack.org/52906314:15
mriedem_move_operation_alloc_request is broken if we get the allocation candidates using 1.1214:15
*** salv-orlando has quit IRC14:16
edleafemriedem: ok, just settling in. Will look over that shortly14:17
*** links has joined #openstack-nova14:17
*** abhishekk has joined #openstack-nova14:18
*** abhishekk_ has joined #openstack-nova14:19
*** Tom-Tom_ has quit IRC14:19
*** Tom-Tom has joined #openstack-nova14:19
mriedemi'll open a bug for the _move_operation_alloc_request thing14:20
*** salv-orlando has joined #openstack-nova14:23
mriedemhttps://bugs.launchpad.net/nova/+bug/173904214:23
openstackLaunchpad bug 1739042 in OpenStack Compute (nova) "_move_operation_alloc_request fails with TypeError when using 1.12 version allocation request" [Undecided,New]14:23
kashyapmriedem: When you get a moment, my 'logstash' foo isn't helping me; I want to see how many times this error has occurred: "error: connection closed due to keepalive timeout"14:24
kashyapPutting it verbatim here http://logstash.openstack.org/#/dashboard/file/logstash.json14:24
*** Tom-Tom has quit IRC14:24
kashyapDidn't help.14:24
mriedemkashyap: where is it originating from?14:24
kashyapmriedem: stable/newton14:24
mriedemwhich file?14:24
kashyapLet me get a link14:24
kashyapmriedem: There - http://logs.openstack.org/38/528338/4/check/legacy-tempest-dsvm-multinode-live-migration/d867726/job-output.txt.gz#_2017-12-18_20_20_56_23012114:24
*** yangyapeng has joined #openstack-nova14:24
kashyapIt's this one: LiveMigrationTest.test_live_block_migration_paused14:24
mriedemi don't see "error: connection closed due to keepalive timeout" in there at all14:25
kashyapI debugged it a bit this morning w/ upstream libvirt & QEMU folks.  And I'm setting up a reproducer to see if I can get to it14:25
kashyapmriedem: Ah, sorry; that error actually comes from libvirtd log, let me get that link14:25
mriedemwe don't index the libvirtd logs14:25
mriedemwhich is why it's not in logstash14:25
lyarwoodit's also in n-cpu FWIW14:25
lyarwoodhttp://logs.openstack.org/38/528338/4/check/legacy-tempest-dsvm-multinode-live-migration/d867726/logs/subnode-2/screen-n-cpu.txt.gz#_2017-12-18_20_20_13_89414:26
kashyapmriedem: There - http://logs.openstack.org/38/528338/4/check/legacy-tempest-dsvm-multinode-live-migration/d867726/logs/subnode-2/libvirt/libvirtd.txt.gz#_2017-12-18_20_20_13_88014:26
kashyapAh-ha14:26
kashyapmriedem: Any reason we don't index it?14:26
mriedemhttp://logs.openstack.org/38/528338/4/check/legacy-tempest-dsvm-multinode-live-migration/d867726/logs/subnode-2/screen-n-cpu.txt.gz#_2017-12-18_20_20_13_894 is debug14:26
mriedemwe index INFO+14:26
*** huanxie has quit IRC14:26
mriedemwe don't index libvirtd because it kills the indexer14:26
mriedemtoo much content14:26
* kashyap nods14:26
kashyapOkay, the screen-n-cpu.txt has it14:26
lyarwoodmriedem: it's also above in ERROR14:26
kashyapYeah, it's in ERROR14:27
mriedemhttp://logs.openstack.org/38/528338/4/check/legacy-tempest-dsvm-multinode-live-migration/d867726/logs/subnode-2/screen-n-cpu.txt.gz#_2017-12-18_20_20_13_89314:27
mriedemok that should work14:27
kashyapmriedem: Do you know how it could kill the indexer?  Due to its size?14:27
mriedemkashyap: yes14:27
mriedemsize14:27
mriedemhttp://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Live%20Migration%20failure%3A%20internal%20error%3A%20connection%20closed%20due%20to%20keepalive%20timeout%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d14:28
kashyapAh, interesting, even ~217K is too much?14:28
kashyap(Anyway, that's fine.)14:28
* kashyap clicks14:28
kashyapmriedem++14:28
mriedemkashyap: it's that times however many jobs we run PER DAY14:28
kashyapNo Karma bot14:28
*** smatzek has quit IRC14:29
*** yangyapeng has quit IRC14:29
kashyapOkay, only 2 hits so far; thanks mriedem14:29
mriedemyeah, but all on newton14:30
kashyapRight14:30
*** andreas_s has quit IRC14:30
mriedemso i'm guessing it's related to the version of libvirt/qemu we have in newton14:30
kashyap(Don't know the root cause of it yet; could be QEMU, could be libvirt.  There are 2 other bugs filed for them - https://bugzilla.redhat.com/show_bug.cgi?id=1367620)14:30
openstackbugzilla.redhat.com bug 1367620 in libvirt "storage migration fails due to keepalive timeout" [High,Assigned] - Assigned to jdenemar14:30
*** kumarmn has joined #openstack-nova14:30
mriedemi don't think we're using UCA packages in newton jobs14:30
*** andreas_s has joined #openstack-nova14:30
mriedem^ is libvirt 1.3.1 and qemu 2.514:31
mriedemwe're way newer than that on queens14:31
kashyapYeah, saw the versions earlier in the day14:31
*** huanxie has joined #openstack-nova14:31
kashyapIs it worth it to use UCA in that case?  Maybe not, for these rare one-off cases14:31
mriedemnot at this point for newton14:32
kashyapYep, noted.14:32
mriedembauzas: this is a regression introduced in newton https://review.openstack.org/#/c/528835/ - would be good to get your review on that14:32
openstackgerritBernhard M. Wiedemann proposed openstack/nova master: Fix 4 doc typos  https://review.openstack.org/52908414:33
bauzasmriedem: ack, looking14:35
*** abhishekk has quit IRC14:35
bauzasmriedem: ah, good call14:36
bauzasI remember we had a shit number of races for the BuildRequest object14:37
edleafemriedem: about the func test failure: this line should de-allocate against the instance: https://review.openstack.org/#/c/511358/43/nova/compute/manager.py@177814:38
mriedemright so what was added there in newton was just for novalidhost on the initial create14:38
mriedembut didn't take into account reschedules14:39
mdboothmriedem: Any chance you could have another look at the BDM uuid patches? https://review.openstack.org/#/c/242602/25 and the following 2 are the ones which do the db modification. I addressed your review comments.14:39
bauzasmriedem: the point is that we were not having cell conductors yet14:39
mriedemedleafe: ah, well, that's a race :)14:39
mriedemedleafe: we cast to build_instances *before* compute cleans up the allocations14:40
bauzasmriedem: now that we reschedule per cell conductors, yes it's a problem14:40
mriedembauzas: you could still run newton in split MQ mode14:40
mriedemand split db14:40
mriedemi think anyway14:40
edleafemriedem: so it's only locked for build, not claim14:40
bauzasmriedem: sure14:41
mriedemedleafe: the lock in compute doesn't matter14:41
mriedemcompute rpc casts to conductor build_instances14:41
mriedemand then goes to delete the allocation for the instance14:41
edleafemriedem: that's my point - it's only locking builds for that host14:42
mriedemin fact, this could overwrite what conductor claims on the alternate if the timing window hits it just right14:42
edleafeI'll move the allocation cleanup so it is run before the cast14:42
mriedemedleafe: you can't just move it,14:42
mriedemit's there for reschedules and any other kind of failure14:43
mriedemedleafe: i think this:14:43
mriedemfails = (build_results.FAILED,14:43
mriedem                             build_results.RESCHEDULED)14:43
edleafemriedem: all of the other cleanups are in _do_build_and_run_instance()14:43
mriedembecomes just build_results.FAILED14:43
mriedembut if we change that then self._build_failed() won't get called...14:44
bauzasmriedem: looking at http://www.voidspace.org.uk/python/mock/magicmock.html#mock.NonCallableMagicMock14:45
*** andreas_s has quit IRC14:45
bauzasmriedem: it means that we call it, then we would have an exception ?14:45
mriedembauzas: yes14:45
bauzasinteresting14:45
bauzasI didn't know that14:45
mriedemedleafe: so if you're going to leave the cleanup in the compute, then i think we can only call https://review.openstack.org/#/c/511358/43/nova/compute/manager.py@1778 if result == build_results.FAILED in that block14:45
*** andreas_s has joined #openstack-nova14:45
mriedembecause we still need to call self._build_failed()14:45
mriedemand then *add* rt.reportclient.delete_allocation_for_instance(instance.uuid) right before we cast to build_instances14:46
mriedemyeah?14:46
edleafemriedem: I can split the code running under that conditional so that the deallocation only runs for FAILED, but the rest runs for both14:46
*** burt has joined #openstack-nova14:46
bauzasmriedem: any reason why you're not just using http://www.voidspace.org.uk/python/mock/mock.html#mock.Mock.called ?14:46
mriedembauzas: one less thing to do14:46
edleafeyeah, that's where I was going to move it to. I'll just copy the call.14:47
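The reordering being converged on above, as a hedged sketch (class and method names are illustrative, not nova's exact API): compute must drop its own allocation before the cast, because the cast triggers conductor's claim against the alternate host, and a late cleanup can race with and overwrite that claim.

```python
class FakeReportClient:
    """Stand-in for the placement report client (hypothetical)."""
    def __init__(self, log):
        self.log = log

    def delete_allocation_for_instance(self, instance_uuid):
        self.log.append("delete_allocation")


class FakeConductorAPI:
    """Stand-in for the conductor RPC API (hypothetical)."""
    def __init__(self, log):
        self.log = log

    def build_instances(self, instance_uuid):
        # In real nova this is an async cast; conductor claims
        # resources on the alternate host after receiving it.
        self.log.append("cast_build_instances")


def reschedule_after_failure(reportclient, conductor_api, instance_uuid):
    # Delete the failed host's allocation *first*; doing it after the
    # cast leaves a window where the cleanup lands on top of the claim
    # conductor has already made against the alternate host.
    reportclient.delete_allocation_for_instance(instance_uuid)
    conductor_api.build_instances(instance_uuid)
```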
*** gouthamr has joined #openstack-nova14:47
mriedembauzas: NonCallableMock just does the thing i already want14:47
bauzasI see14:47
bauzasanyway, I don't want to discuss the pattern14:47
*** felipemonteiro has joined #openstack-nova14:47
*** cleong has joined #openstack-nova14:47
bauzasmy point is just that when reviewing the change, we need to understand that noncallablemock already supports that14:47
*** andreas_s has quit IRC14:48
bauzaswithout needing to verify the call count14:48
*** andreas_s has joined #openstack-nova14:48
bauzasless explicit, but interesting tho14:48
mriedemwe = you?14:48
mriedemnow you know :)14:48
mriedemi expect to see it in all of your new tests now14:48
bauzasheh14:49
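The two mocking styles being compared, side by side: a NonCallableMock fails loudly at the call site, while a plain Mock needs an explicit check afterwards.

```python
from unittest import mock

# Style 1: NonCallableMock — calling it raises TypeError immediately,
# so a test patched this way fails at the exact call site without
# needing a follow-up assertion on the call count.
guard = mock.NonCallableMock()
call_rejected = False
try:
    guard()
except TypeError:
    call_rejected = True

# Style 2: plain Mock — the call silently succeeds and must be
# verified afterwards via .called / assert_called_once().
plain = mock.Mock()
plain()
```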
openstackgerritJackie Truong proposed openstack/python-novaclient master: Microversion 2.59 - Add trusted_image_certificates  https://review.openstack.org/50039614:50
*** jmlowe has joined #openstack-nova14:51
*** gszasz has quit IRC14:57
*** smatzek has joined #openstack-nova14:58
*** pchavva has quit IRC14:59
openstackgerritMerged openstack/nova stable/newton: Make request_spec.spec MediumText  https://review.openstack.org/52833815:00
*** yangyapeng has joined #openstack-nova15:01
mriedemhuh https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/newton15:02
mriedemto eol or not to eol15:02
lyarwood\o/15:02
*** huanxie has quit IRC15:02
mriedemi worry about not having https://review.openstack.org/#/c/528835/ in newton15:02
*** pchavva has joined #openstack-nova15:03
*** andreas_s has quit IRC15:03
mriedembut i also don't know how many people running newton are going to have conductor split out yet and not have the cell conductor configured to hit the api db15:03
mriedemas bauzas noted, probably not a real worry15:03
*** andreas_s has joined #openstack-nova15:03
*** salv-orlando has quit IRC15:04
*** salv-orlando has joined #openstack-nova15:04
mriedemunrelated, i also started thinking about https://review.openstack.org/#/q/topic:fix-bfv-boot-resources+(status:open+OR+status:merged) again...15:06
mriedemand whether or not we should just take on the debt since shared provider modeling is who knows how far off yet15:06
*** huanxie has joined #openstack-nova15:08
*** salv-orlando has quit IRC15:09
* cdent feels shame15:09
*** mingyu has quit IRC15:10
*** marst has joined #openstack-nova15:10
*** andreas_s has quit IRC15:12
*** gszasz has joined #openstack-nova15:13
mriedemno shame intended15:13
*** andreas_s has joined #openstack-nova15:13
mriedemit's that we put that off for a few releases because we were saying placement would fix the problem, and we haven't yet, and people (ops) ask for it at least once per cycle15:13
jaypipescdent shaming is indeed the best kind of shaming. second only to pug shaming.15:14
openstackgerritMerged openstack/nova master: [placement] Add x-openstack-request-id in API ref  https://review.openstack.org/52300715:14
cdentmriedem: don't worry, I'll feel shame, even for things entirely outside my control and/or the result of perfectly reasonable decision making processes15:15
cdentI may be part pug15:15
*** salv-orlando has joined #openstack-nova15:15
*** aarefiev has quit IRC15:20
*** chyka has joined #openstack-nova15:20
*** sahid has quit IRC15:20
*** armax has joined #openstack-nova15:20
*** andreas_s has quit IRC15:23
*** karthiks has quit IRC15:25
*** chyka has quit IRC15:26
*** chyka has joined #openstack-nova15:26
*** andreas_s has joined #openstack-nova15:27
maciejjozefczykjaypipes: Hey :) I responded to your comment https://review.openstack.org/#/c/520024/ Could you please check it? Could we discuss it once you've checked it? Maybe at Thursday's meeting? Thanks :)15:29
*** felipemonteiro has quit IRC15:29
openstackgerritJackie Truong proposed openstack/python-novaclient master: Microversion 2.59 - Add trusted_image_certificates  https://review.openstack.org/50039615:30
*** felipemonteiro has joined #openstack-nova15:30
*** chyka has quit IRC15:31
*** awaugama has joined #openstack-nova15:34
*** andreas_s has quit IRC15:34
*** andreas_s has joined #openstack-nova15:34
*** slunkad_ has quit IRC15:36
*** liverpooler has quit IRC15:38
jaypipesmaciejjozefczyk: I should be able to get to that patch today, yes.15:38
*** eharney has joined #openstack-nova15:38
*** huanxie has quit IRC15:38
*** slunkad has joined #openstack-nova15:39
maciejjozefczykjaypipes: thanks a lot :)15:39
*** gszasz has quit IRC15:39
*** yikun_jiang has joined #openstack-nova15:41
*** slunkad has quit IRC15:43
*** yikun has quit IRC15:44
*** huanxie has joined #openstack-nova15:44
*** liusheng has quit IRC15:45
*** liusheng has joined #openstack-nova15:45
*** elod has quit IRC15:47
*** liverpooler has joined #openstack-nova15:48
*** gszasz has joined #openstack-nova15:51
*** edmondsw has joined #openstack-nova15:52
*** salv-orlando has quit IRC15:52
*** salv-orlando has joined #openstack-nova15:53
*** nore_rabel has quit IRC15:54
*** edmondsw has quit IRC15:56
*** salv-orlando has quit IRC15:57
*** slunkad has joined #openstack-nova15:59
mriedemlyarwood: artom: bauzas: did we or did we not say that we needed a minor version bump on stable for the release with the schema migration?16:07
lyarwoodmriedem: we don't \need\ it for anything but I think we agreed it would be nice to have a minor version bump for this, yes.16:07
*** josecastroleon has joined #openstack-nova16:10
mriedemok here is ocata https://review.openstack.org/52910016:11
*** jmlowe has quit IRC16:13
mriedemand newton: https://review.openstack.org/52910216:14
*** huanxie has quit IRC16:14
*** damien_r has joined #openstack-nova16:20
*** huanxie has joined #openstack-nova16:20
*** brault has quit IRC16:20
*** brault has joined #openstack-nova16:21
*** salv-orlando has joined #openstack-nova16:25
*** brault_ has joined #openstack-nova16:25
*** brault has quit IRC16:26
*** andreas_s has quit IRC16:26
*** andreas_s has joined #openstack-nova16:26
openstackgerritMerged openstack/nova master: Updated from global requirements  https://review.openstack.org/52888116:27
*** janki has quit IRC16:31
mriedemjaypipes: on maciejjozefczyk's patch, i'm assuming the shutdown instances thing is a problem because of _update_usage_from_instance which is called between the initial compute node update and the final one,16:32
mriedemand _update_usage_from_instance calls self.stats.update_stats_for_instance(instance, is_removed_instance)16:32
mriedemwhich looks at things like vm_sate16:32
mriedem*state16:32
jaypipesyeah16:32
mriedemand calls _update_usage16:33
mriedemi'm not sure wth cn.current_workload = self.stats.calculate_workload() is for16:33
mriedemno filters use that, it's just for reporting out of the API i guess16:34
*** moshele has joined #openstack-nova16:34
jaypipesmriedem: switched my vote on it.16:34
*** r-daneel has quit IRC16:38
openstackgerritMerged openstack/python-novaclient master: Updated from global requirements  https://review.openstack.org/52891116:39
mriedemjaypipes: i think he still has changes to make16:39
mriedemper my earlier review16:39
mriedemin _check_for_nodes_rebalance16:39
*** moshele has quit IRC16:39
*** andreas_s has quit IRC16:40
*** gyee has joined #openstack-nova16:42
*** penick has joined #openstack-nova16:42
jaypipesmriedem: sure, though that's only going to be valid for baremetal nodes...16:46
jaypipesmriedem: not sure there's much of a race interval for that... but maybe16:47
*** huanxie has quit IRC16:50
*** andreas_s has joined #openstack-nova16:53
*** huanxie has joined #openstack-nova16:56
*** lucasagomes is now known as lucas-afk16:56
*** sridharg has quit IRC16:56
*** AlexeyAbashkin has quit IRC17:01
openstackgerritStephen Finucane proposed openstack/nova master: console: introduce framework for RFB authentication  https://review.openstack.org/34539717:01
openstackgerritStephen Finucane proposed openstack/nova master: console: introduce the VeNCrypt RFB authentication scheme  https://review.openstack.org/34539817:01
openstackgerritStephen Finucane proposed openstack/nova master: console: Provide an RFB security proxy implementation  https://review.openstack.org/34539917:01
openstackgerritStephen Finucane proposed openstack/nova master: doc: Document TLS security setup for noVNC proxy  https://review.openstack.org/50054417:01
*** salv-orlando has quit IRC17:01
*** salv-orl_ has joined #openstack-nova17:01
cdentjaypipes: speaking of cdent shaming, I was hoping you were going to shame me for my infinite resource classes crack in the latest placement update17:02
*** andreas_s has quit IRC17:03
jaypipescdent: haven't gotten that far yet.17:04
jaypipescdent: still trying to wrestle with friggin server groups.17:05
*** r-daneel has joined #openstack-nova17:07
*** imacdonn has quit IRC17:11
*** imacdonn has joined #openstack-nova17:11
*** ludovic_ has joined #openstack-nova17:13
ludovic_Hello everyone17:13
stephenfinludovic_: o/17:13
openstackgerritMerged openstack/os-vif master: Check if interface belongs to a Linux Bridge before removing  https://review.openstack.org/52607917:14
mnaserSpec-ing out new compute and in the interest of deployers and doing things in the open we're planning to publish the document ... I wanted to gather some feedback: what sort of cpu overcommit have you run/seen people run?17:14
ludovic_Maybe someone can help me understand the Filter Scheduler ? I have found a strange scheduler behaviour during host-evacuate17:14
ludovic_I have two compute Nodes.  The first compute Node with 31 instances ( total  RAM allocated = 177152 M) , the second one with 2 instances ( Total RAM allocated = 24576 )17:18
ludovic_RAM of computes Nodes is  196483 (memory_mb)17:19
ludovic_Host-evacuate worked but 5 VMs were in ERROR with insufficient memory (nova-compute log) and one was with NO STATE (no host found by the RamFilter)17:20
ludovic_I expected the same ERROR on these 6 instances17:21
ludovic_I used tripleO to deploy my OpenStack environment17:22
ludovic_with Ocata repository17:22
ludovic_nova.conf :  enabled_filters=RetryFilter,AggregateInstanceExtraSpecsFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter17:24
*** mvk has quit IRC17:25
ludovic_Maybe someone can explain this situation ? Thanks a lot17:25
ludovic_nova specialist ?17:26
*** andreas_s has joined #openstack-nova17:26
*** huanxie has quit IRC17:26
*** shaner has quit IRC17:27
ludovic_has someone already written a scheduler filter to prioritize rebuilding instances ?17:29
*** ralonsoh has quit IRC17:30
*** huanxie has joined #openstack-nova17:32
*** brault_ has quit IRC17:38
*** brault has joined #openstack-nova17:38
*** edmondsw has joined #openstack-nova17:40
*** brault has quit IRC17:43
*** andreas_s has quit IRC17:43
*** Apoorva has joined #openstack-nova17:43
*** Apoorva has quit IRC17:43
*** Apoorva has joined #openstack-nova17:44
jaypipesmriedem: http://paste.openstack.org/show/629349/ .. is there some other place other nova/api/openstack/api_version_request.py that I need to bump a max microversion?17:44
*** edmondsw has quit IRC17:44
* jaypipes only used to the placement microversion dance, not the main nova api one..17:45
*** derekh has quit IRC17:47
*** andreas_s has joined #openstack-nova17:48
*** diga has quit IRC17:48
*** catintheroof has quit IRC17:50
*** shaner has joined #openstack-nova17:50
cfriesenquestion about the DB interfaces...why does some code go through nova.db.api and other code directly uses nova.db.sqlalchemy.api ?17:50
*** josecastroleon has quit IRC17:51
*** catintheroof has joined #openstack-nova17:51
*** mdnadeem has quit IRC17:52
*** damien_r has quit IRC17:52
*** alee is now known as alee_lunch17:54
mriedemjaypipes: yeah the version samples17:57
mriedemjaypipes: https://github.com/openstack/nova/tree/master/doc/api_samples/versions17:58
jaypipesmriedem: I looked there but all I see is an interpolation marker for max_api_version17:58
jaypipesmriedem: gah, never mind.17:59
jaypipesmriedem: sigh...17:59
jaypipesmriedem: was looking in nova/tests/functional/api_samples/17:59
jaypipesmriedem: have I mentioned I hate these? :)17:59
*** r-daneel_ has joined #openstack-nova18:01
*** r-daneel has quit IRC18:01
*** r-daneel_ is now known as r-daneel18:01
*** huanxie has quit IRC18:02
*** andreas_s has quit IRC18:02
*** sambetts is now known as sambetts|afk18:03
cfriesenis all quota information now going into the API DB?  or will we still put some in the main DB?18:04
openstackgerritMerged openstack/nova master: Fix 4 doc typos  https://review.openstack.org/52908418:05
jaypipesmelwitt: see cfriesen's ? above...18:06
*** catintheroof has quit IRC18:07
*** catintheroof has joined #openstack-nova18:08
*** huanxie has joined #openstack-nova18:08
*** harlowja has joined #openstack-nova18:15
*** chyka has joined #openstack-nova18:18
*** corey_ has joined #openstack-nova18:21
*** cleong has quit IRC18:22
*** AlexeyAbashkin has joined #openstack-nova18:22
*** links has quit IRC18:23
*** corey_ is now known as cleong18:24
*** AlexeyAbashkin has quit IRC18:27
*** jpena is now known as jpena|off18:33
*** cfriesen has quit IRC18:33
*** cfriesen has joined #openstack-nova18:33
*** andreas_s has joined #openstack-nova18:39
*** gszasz has quit IRC18:39
*** huanxie has quit IRC18:40
openstackgerritMerged openstack/python-novaclient master: CommandError is raised for invalid server fields  https://review.openstack.org/52511018:40
*** burt has quit IRC18:40
*** burt has joined #openstack-nova18:43
*** andreas_s has quit IRC18:43
*** huanxie has joined #openstack-nova18:44
*** chyka has quit IRC18:45
*** AlexeyAbashkin has joined #openstack-nova18:46
*** cleong has quit IRC18:47
*** chyka_ has joined #openstack-nova18:49
*** chyka_ has quit IRC18:49
*** cleong has joined #openstack-nova18:49
*** chyka_ has joined #openstack-nova18:50
mriedemcfriesen: api db18:51
mriedemcfriesen: starting in pike we don't use the usages or reservations tables anymore18:51
*** brault has joined #openstack-nova18:52
*** brault has quit IRC18:53
*** brault has joined #openstack-nova18:54
*** chyka_ has quit IRC18:54
*** dtantsur is now known as dtantsur|afk18:56
*** catintheroof has quit IRC18:56
*** cdent has quit IRC18:56
*** r-daneel has quit IRC18:57
*** AlexeyAbashkin has quit IRC18:58
*** elod has joined #openstack-nova18:59
openstackgerritJay Pipes proposed openstack/nova-specs master: Support aggregate affinity scheduler filters  https://review.openstack.org/52913519:01
*** andreas_s has joined #openstack-nova19:01
*** sbezverk has joined #openstack-nova19:02
*** r-daneel has joined #openstack-nova19:03
*** alee_lunch is now known as alee19:04
*** chyka has joined #openstack-nova19:12
*** huanxie has quit IRC19:14
*** andreas_s has quit IRC19:15
*** chyka has quit IRC19:17
*** huanxie has joined #openstack-nova19:20
*** kumarmn has quit IRC19:21
*** ludovic_ has quit IRC19:27
*** edmondsw has joined #openstack-nova19:28
*** catintheroof has joined #openstack-nova19:29
*** awaugama has quit IRC19:29
openstackgerritJay Pipes proposed openstack/nova-specs master: Support aggregate affinity scheduler filters  https://review.openstack.org/52913519:31
*** damien_r has joined #openstack-nova19:32
*** edmondsw has quit IRC19:32
*** edmondsw has joined #openstack-nova19:37
*** openstack has joined #openstack-nova19:43
*** ChanServ sets mode: +o openstack19:43
*** edmondsw has quit IRC19:44
*** gouthamr has quit IRC19:45
*** nore_rabel has joined #openstack-nova19:45
*** yamahata has joined #openstack-nova19:47
*** claudiub|2 has joined #openstack-nova19:48
*** nore__ has joined #openstack-nova19:49
*** claudiub has quit IRC19:50
*** huanxie has quit IRC19:50
*** itlinux has quit IRC19:52
*** itlinux has joined #openstack-nova19:52
*** fragatina has quit IRC19:56
*** huanxie has joined #openstack-nova19:56
*** damien_r has quit IRC19:59
*** gouthamr has joined #openstack-nova20:00
*** gouthamr has quit IRC20:01
*** nore_rabel has quit IRC20:04
*** nore__ has quit IRC20:05
rybridgesSo I am seeing errors every time i make a snapshot20:11
*** thingee has left #openstack-nova20:11
rybridgesregardless of whether i use disable_libvirt_livesnapshot = true or disable_libvirt_livesnapshot = false20:11
rybridgesthe error is the same every time20:11
rybridgesit happens on the hypervisor20:11
rybridgesthis is nova-compute.log https://pastebin.com/WiqYtZzF20:12
rybridgesthis is the error for libvirt log: https://pastebin.com/4NeSuNvX20:13
rybridgesit's impossible to take a snapshot on ocata from the horizon ui20:14
rybridgesas far as we can tell20:14
*** mvk has joined #openstack-nova20:16
mriedemhttps://www.jrssite.com/wordpress/?p=30220:18
mriedemhttps://bugs.launchpad.net/nova/+bug/138115320:18
openstackLaunchpad bug 1381153 in OpenStack Compute (nova) "Cannot create instance live snapshots in Centos7 (icehouse)" [Undecided,Invalid]20:18
*** claudiub|2 has quit IRC20:23
*** huanxie has quit IRC20:23
*** huanxie has joined #openstack-nova20:23
*** huanxie has quit IRC20:26
*** jmlowe has joined #openstack-nova20:28
cfriesenmriedem: thanks for the quota answer earlier...in objects.Quotas we're still looking at both the api DB and the main DB.  is that left over from Pike?  Presumably now we could remove the main DB access?20:28
mriedemcfriesen: that's for the online data migration20:30
aleehi - does anyone know how to force nova to re-fetch images from glance?20:30
mriedemso we can't remove that until people have run through the online data migrations,20:30
cfriesenmriedem: that should have happened in Pike though, right?  so we could remove it in Q?20:30
mriedemand we have a blocker schema migration in place to enforce that there are no quota limits/classes in the main cell db20:30
mriedemcfriesen: you'd have to add a blocker migration20:30
cfriesenmriedem: okay20:31
*** nore_rabel has joined #openstack-nova20:32
*** nore__ has joined #openstack-nova20:32
*** huanxie has joined #openstack-nova20:32
*** AlexeyAbashkin has joined #openstack-nova20:37
*** kumarmn has joined #openstack-nova20:38
*** kumarmn has quit IRC20:41
*** kumarmn has joined #openstack-nova20:41
*** AlexeyAbashkin has quit IRC20:41
*** nore_rabel has quit IRC20:46
*** nore__ has quit IRC20:46
*** nore_rabel has joined #openstack-nova20:46
*** Apoorva has quit IRC20:51
*** penick has quit IRC20:53
*** smatzek has quit IRC20:54
*** ludovic has joined #openstack-nova21:02
*** chyka has joined #openstack-nova21:02
*** catintheroof has quit IRC21:02
*** huanxie has quit IRC21:02
*** catintheroof has joined #openstack-nova21:03
openstackgerritEd Leafe proposed openstack/nova master: Make conductor pass and use host_lists  https://review.openstack.org/51135821:04
openstackgerritEd Leafe proposed openstack/nova master: Change compute RPC to use alternates for resize  https://review.openstack.org/52643621:04
edleafemriedem: ^^ should address the functional test failures21:04
*** MikeG451 has quit IRC21:05
ludovicHi, I'd like to ask a few questions to nova experts: are there any tips to prioritize the evacuation order of instances21:06
*** chyka has quit IRC21:06
openstackgerritMerged openstack/nova master: Some nit fix in multi_cell_list  https://review.openstack.org/52759721:06
openstackgerritMerged openstack/nova master: doc: add note about fixing admin-only APIs without a microversion  https://review.openstack.org/52742121:07
ludovicor is it possible to reserve spare nodes to ensure host-evacuation21:07
*** catintheroof has quit IRC21:07
mriedemludovic: no and no,21:08
mriedemyou can specify a target host during evacuate, but that doesn't reserve it21:08
*** huanxie has joined #openstack-nova21:08
mriedemand you evacuate one instance at a time,21:08
mriedemso if there is priority, the caller handles that21:08
*** cleong has quit IRC21:10
ludovicok, so we can't ensure a good SLA21:10
ludovicif we can't reserve resources to securely evacuate all workload21:11
ludovicthe SLA is impacted21:11
ludovicmriedem: does the scheduler offer a trick to overcome this?21:15
*** edmondsw has joined #openstack-nova21:17
mriedemludovic: how is this any different than the scheduler correctly picking a host during the initial server create?21:19
mriedembesides reschedules21:19
ludovicexcuse me, maybe we can have a private discussion about the Filter scheduler VS host-evacuate process with HA compute instances ?21:20
*** Apoorva has joined #openstack-nova21:20
mriedemwhy private?21:20
*** penick has joined #openstack-nova21:20
ludovicin order to not disturb the room21:21
ludovicnp21:21
ludovicI'm testing the HA compute instances with tripleo deployment21:21
ludovicso the nova-evacuate pacemaker process is working well21:22
ludovic but i was surprised by the RamFilter scheduler21:22
*** edmondsw has quit IRC21:22
ludovicto explain the case, I have two compute Nodes. The first compute Node with 31 instances ( total RAM allocated = 177152 M) , the second one with 2 instances ( Total RAM allocated = 24576 )21:24
ludovicRAM of computes Nodes is  196483 (memory_mb)21:24
ludovicThe Host-evacuate worked but 5 VMs were in ERROR with insufficient memory (nova-compute log) and one was with NO STATE (no host found by the RamFilter)21:26
ludovicI expected the same ERROR on these 6 instances21:26
ludovicHOST NOT FOUND or ERROR with insufficient memory21:27
mriedemso you're evacuating the 31 instances on the one compute node to the other compute node with 2 instances?21:27
ludovicSo i don't understand the result21:27
ludovicyes21:28
mriedemand 5 of 31 fail21:28
ludovic621:28
mriedemwhich release?21:28
ludovic5 with insufficient memory and 1 with no valid host found21:28
mriedemif you're in pike+ and using the FilterScheduler, you should remove the RamFilter21:28
ludovicocata21:28
mriedemso the problem is,21:29
mriedemthe scheduler has a point in time snapshot of the resources from the compute, per request,21:29
ludovici use ocata for the moment21:29
mriedemand the ram usage doesn't change until one of the evacuations makes it to the compute and claims those resources, and updates the compute node record in the db, which the scheduler will read in the next scheduling attempt21:29
mriedemso the problem is if you're sending all 31 evacuate requests in a for loop, for example, with no time in between for the scheduler to catch up to the changes in the computes,21:30
mriedemthe scheduler thinks the compute is fine and sends the instance there, but the claim on the compute might fail because another request claimed those resources in the meantime21:30
ludovicyes exactly21:31
mriedemthis should be fixed in pike,21:31
mriedembecause in pike, the RamFilter can be removed and the FilterScheduler uses the Placement service to claim the resources during scheduling21:31
mriedemany claim collisions on the compute node in pike due to concurrent requests will be retried up to 3 times21:32
mriedemotherwise what you're hitting is latent behavior21:32
ludovicah ok so with ocata the process is not efficient enough ?21:32
mriedemcorrect; the late claim on the compute has always been a known issue with scheduling21:32
mriedemwith server create, if you hit this, we would reschedule to another compute21:33
mriedemwe don't do reschedules with evacuate though21:33
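mriedem's explanation above (the scheduler works from a stale point-in-time snapshot, and the authoritative claim only happens later on the compute) can be sketched as a toy model. Everything here is hypothetical and heavily simplified compared to the real RamFilter and compute claims:

```python
# Hypothetical sketch of the race mriedem describes: the scheduler filters
# against a point-in-time snapshot of compute RAM, so a burst of parallel
# evacuations can all pass the RamFilter and then collide when claiming
# memory on the destination compute.

class ComputeNode:
    def __init__(self, memory_mb):
        self.memory_mb = memory_mb
        self.used_mb = 0

    def claim(self, amount_mb):
        # The authoritative check happens here, on the compute,
        # not in the scheduler's cached view.
        if self.used_mb + amount_mb > self.memory_mb:
            raise MemoryError('insufficient memory')
        self.used_mb += amount_mb

def schedule(snapshot_free_mb, requests_mb, node):
    """Filter each request against a stale snapshot, then claim for real."""
    results = []
    for req in requests_mb:
        # RamFilter-style check against the snapshot taken before any of
        # these requests landed -- it never sees the earlier claims.
        if req > snapshot_free_mb:
            results.append('NoValidHost')
            continue
        try:
            node.claim(req)
            results.append('ok')
        except MemoryError:
            results.append('insufficient memory')
    return results

node = ComputeNode(memory_mb=8192)
# Snapshot says 8192 MB free; six 2048 MB evacuations arrive at once.
print(schedule(8192, [2048] * 6, node))
# ['ok', 'ok', 'ok', 'ok', 'insufficient memory', 'insufficient memory']
```

In Pike the FilterScheduler claims resources in Placement during scheduling, which closes most of this window and retries claim collisions.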
ludovicBut is it possible to influence the evacuate with max_concurrent_builds ?21:33
ludovicin nova.conf ?21:33
ludovicby default this parameter is 10 , if we reduce to one  ?21:34
mriedemno, max_concurrent_builds is for server create, not evacuate21:36
ludovicah ok i understand .21:36
mriedemthere is no option like that for limiting evacuates21:36
mriedemthat's not to say one couldn't be added, but adding that in queens or pike doesn't make sense when we've solved this part of the problem in the scheduler21:37
mriedemstill, it seems reasonable to allow limiting the number of concurrent evacuates on a given compute, since we do that for spawn and live migrate21:38
mriedemi wouldn't be opposed to adding something like that21:39
*** huanxie has quit IRC21:40
ludovicok thank you for your explanations, it's valuable to me because I'm testing openstack for a big French company you know21:40
ludovicAnd the goal is to see if we can in the future replace the massive usage of VMware with OpenStack you know21:41
mriedemoolala21:41
ludovicnear future21:42
*** pchavva has quit IRC21:42
mriedemok. would be cool if you could test this out on a pike deployment.21:42
mriedemto make sure the filter scheduler + placement is correctly handling this for you21:42
ludovicThat's why i asked if reservation and /or prioritizing exist under OpenStack21:42
mriedemremember to remove the RamFilter in pike if you're using the FilterScheduler since (1) it's redundant and (2) it will remove the memory_mb claim in the compute21:42
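For reference, the filter list ludovic pasted earlier, with the RamFilter dropped as mriedem recommends for Pike's FilterScheduler, would look something like this (hypothetical nova.conf fragment; section name assumed):

```ini
[filter_scheduler]
# RamFilter removed: in Pike the FilterScheduler claims MEMORY_MB via the
# Placement service during scheduling, so the filter is redundant and
# keeping it would interfere with the memory_mb claim on the compute.
enabled_filters = RetryFilter,AggregateInstanceExtraSpecsFilter,AvailabilityZoneFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter
```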
ludovicok21:43
*** huanxie has joined #openstack-nova21:44
ludovicdon't you think it would be interesting to add the possibility of having spare Compute Nodes ?21:44
ludovicwith aggregate host spare for example21:45
mriedemyou mean build something into nova to mark specific computes as only used for evacuate?21:45
ludovicor in an aggregate to prioritize important workloads when evacuating21:45
ludovicyes for example21:45
mriedemyeah idk, maybe. i wouldn't want to change the evacuate api to pass through scheduler hints probably.21:46
ludovicThat would ensure a very good SLA because the evacuate process would be secured21:46
*** priteau_ has joined #openstack-nova21:46
mriedemi don't know how many deployments just have compute nodes lying around as spares for evacuate21:46
mriedemyou can also control which host is used client-side21:47
mriedemas noted21:47
mriedemso as a client, if you have a special "evacuate" host aggregate, you could round robin through those hosts and send it with the evacuate request21:48
mriedembut, as your evacuate aggregate starts to fill up, it is no longer spare capacity21:48
mriedemso...21:48
openstackgerritEd Leafe proposed openstack/nova master: Make conductor pass and use host_lists  https://review.openstack.org/51135821:49
openstackgerritEd Leafe proposed openstack/nova master: Change compute RPC to use alternates for resize  https://review.openstack.org/52643621:49
edleafe^^ fixed pep8 booboo21:50
*** priteau has quit IRC21:50
ludovicit seems to not be easy to design ...21:50
ludovicevacuate is temporary21:50
ludovicuntil the source node is repaired21:51
ludovicso the goal is to fail back and so free the evacuate aggregate21:51
mriedemludovic: sure, that's why it's not something built natively into nova21:52
mriedemnova provides the API so a higher level service can orchestrate whatever you need here21:52
mriedemedleafe: ack, will run the ironic patch on that21:53
jose-phillipshi any idea21:55
*** nore_rabel has quit IRC21:55
jose-phillipswhy i'm getting this error on devstack21:56
*** wind has joined #openstack-nova21:56
jose-phillipscan't apply process capabilities -121:59
jose-phillipsusing qemu21:59
rybridgesfor the record mriedem, i tried recompiling qemu like you said... took the latest version i could find here http://ftp.redhat.com/redhat/linux/enterprise/7Server/en/RHEV/SRPMS/22:00
windHi, I'm trying to make a rest-api call from ironic to nova, was hoping someone could give me a hint how to do so ... ironic.conf doesn't have anything in there for nova, so should i build a keystoneclient session, use it to retrieve the token and compute-api endpoint, and then send the GET & PUT commands22:00
rybridgesstill get the exact same error22:00
windAny suggestions would be really helpful.... I'm new to openstack22:01
*** kumarmn has quit IRC22:04
*** kumarmn has joined #openstack-nova22:05
*** eharney has quit IRC22:05
mnaser2017-12-19 22:04:53.593 44047 DEBUG nova.compute.resource_tracker [req-2858b7b9-d273-4347-aff3-dfa3fa10f4cd - - - - -] We're on a Pike compute host in a deployment with Ocata compute hosts. Auto-correcting allocations to handle Ocata-style assumptions. _update_usage_from_instance /usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py:1042 <=== wouldnt this be nice if this was a warning instead of22:05
mnaserDEBUG?22:05
mnasergiven that I just found out Nova thinks there are Ocata hosts in my all-pike installation (i guess something is wrong somewhere)22:06
ludovicmriedem: Just to be sure, in order to influence the host-evacuate process, i don't have any option other than using host aggregates with flavors and extra_specs ?22:08
mnaserludovic: yes, but afaik migrations use the flavor and extra_specs that it had at the moment of provisioning22:08
mnaserex: if your flavor A had aggregate "foo" when a VM is booted, and you changed the aggregate to "bar", and then do a migration/evacuate, it will still try to look for "foo"22:09
*** priteau_ has quit IRC22:09
mnaserafaik that has been my experience but i might be wrong?22:09
*** priteau has joined #openstack-nova22:09
ludovicoh yes so it's necessary to provide two aggregates at the moment of provisioning, "prod" and "spare"22:10
mnaserwell ideally if you launched an instance in 'prod', you dont want it to be evacuated into 'staging'22:11
mnaserbecause maybe your host aggregate hardware in staging gets turned off at 5pm22:11
* mnaser goes back to wrestling nova not getting/processing any live migrations22:12
*** claudiub|2 has joined #openstack-nova22:12
mnaseroh would you look at that22:13
mnaser"Unable to submit allocation for instance 42e2e0cd-0dd2-48c0-b873-ed9cd08a451a" .. placement returning 400, JSON does not validate: None is not of type 'string' ... instance['project_id'] == None somehow?!22:13
*** kumarmn has quit IRC22:14
mnaserdoes this ring any bells to anyone or should i start diving in22:14
*** priteau has quit IRC22:14
penickWorking hard to get live migrations functional is like cranking on the handle for a jack-in-the-box. Except the thing in the box is a fist. And it punches you in the face.22:14
*** huanxie has quit IRC22:14
ludovicmnaser: that sounds like the JsonFilter not working, no ?22:15
*** kumarmn has joined #openstack-nova22:15
mnaserpenick: its always worked, but i guess this is a weird pike corner case, i see this - https://bugs.launchpad.net/nova/+bug/1701129 but it should be in pike which we're running22:16
openstackLaunchpad bug 1701129 in OpenStack Compute (nova) "Functional tests fail intermittently with 400 Bad Request from placement" [Low,Fix released] - Assigned to melanie witt (melwitt)22:16
* mnaser thinks22:16
*** penick has quit IRC22:19
*** huanxie has joined #openstack-nova22:20
mnaseri guess for some reason RequestSpec is getting an empty project_id22:21
openstackgerritLance Bragstad proposed openstack/nova master: Add scope_types to server policies  https://review.openstack.org/52577222:21
openstackgerritMerged openstack/nova master: Convert ext filesystem resizes to privsep.  https://review.openstack.org/51751622:23
openstackgerritMerged openstack/nova master: Move flushing block devices to privsep.  https://review.openstack.org/51901022:23
openstackgerritMerged openstack/nova master: [placement] Separate API schemas (resource_class)  https://review.openstack.org/52061122:23
openstackgerritMerged openstack/nova master: Update nova-status and docs for nova-compute requiring placement 1.14  https://review.openstack.org/52650522:23
*** gouthamr has joined #openstack-nova22:23
openstackgerritMerged openstack/nova master: Deduplicate functional test code  https://review.openstack.org/52622722:23
openstackgerritMerged openstack/nova master: Fix possible TypeError in VIF.fixed_ips  https://review.openstack.org/52792022:24
*** lyan has quit IRC22:24
*** penick has joined #openstack-nova22:25
*** rcernin has joined #openstack-nova22:28
*** salv-orl_ has quit IRC22:36
*** salv-orlando has joined #openstack-nova22:37
*** penick has quit IRC22:40
*** marst has quit IRC22:41
*** salv-orlando has quit IRC22:41
*** salv-orlando has joined #openstack-nova22:42
*** penick has joined #openstack-nova22:43
mnaserinstance = common.get_instance(self.compute_api, context, id) <== would anyone know if this supplies project_id by default?22:49
mnaserbecause that's the instance which is passed down to conductor and by the time its at the scheduler, instance.project_id == None which then in turn makes it fail the request to the placement api22:50
*** huanxie has quit IRC22:50
mnaserfurther investigation - {"project_id": null, "user_id": "695d5f386eed440cb0e38455e1afdc9e", "allocations": [{"resource_provider": {"uuid": "5d5c5177-29bb-484f-9cc6-928360afa195"}, "resources": {"MEMORY_MB": 512, "VCPU": 2, "DISK_GB": 20}}, {"resource_provider": {"uuid": "4e43861e-ee36-40b7-ba7b-2239b46a1609"}, "resources": {"VCPU": 2, "MEMORY_MB": 512, "DISK_GB": 20}}]} .. for some reason, user_id comes in but22:55
mnaserproject_id doesn't.  fwiw, this is a server created in 2015.22:55
*** huanxie has joined #openstack-nova22:56
mnaserthe user_id is the user of the one executing the live migration, not the owner of the instance oddly enough22:57
*** lennyb has quit IRC22:58
*** lennyb has joined #openstack-nova23:00
*** edmondsw has joined #openstack-nova23:06
mnaserok.. request_spec record has project_id set to null for that vm23:06
mnaserin the database23:06
mnaserwhy and how.. :(23:06
mriedemhmm, not sure why the project_id would be null23:06
mriedemshould come off the context23:07
mriedemsorry, was on a call for the last hour23:07
mnasermriedem: no problem, its null because .. its null in the request_specs table too..23:07
mnaseri wonder why23:07
mriedemyou said it's a really old instance right?23:07
mnaseryes mriedem23:07
mriedemok reqspec is created here https://github.com/openstack/nova/blob/16.0.4/nova/compute/api.py#L89923:08
mnaserthe created_at for the requestspec is "2017-03-07 02:28:47"23:08
mnaserbut no updated_at23:08
mriedemhttps://github.com/openstack/nova/blob/16.0.4/nova/objects/request_spec.py#L41123:08
mriedem2017-03-07 is ocata right?23:09
mriedemi'm wondering if this was a request spec created for an older instance23:09
mriedemwhat's the created_at on the instance?23:09
mnaserit was23:09
mnaser2015 created_at, 2017 requestspec23:09
mriedemok in ocata this is the routine for creating requestspecs for old instances23:09
mriedemhttps://github.com/openstack/nova/blob/stable/ocata/nova/objects/request_spec.py#L59023:09
mriedemwhich https://github.com/openstack/nova/blob/stable/ocata/nova/objects/request_spec.py#L40523:10
mriedemhowever,23:10
mriedemif that's an admin context, from the online data migration, it won't have a project id...23:10
*** edmondsw has quit IRC23:10
mriedemhttps://github.com/openstack/nova/blob/stable/ocata/nova/cmd/manage.py#L77623:11
mnaserwhich explains how we landed in this case23:11
mriedemhttps://github.com/openstack/nova/blob/stable/ocata/nova/context.py#L31323:11
mriedemyup23:11
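What mriedem's chain of links shows can be condensed into a hypothetical sketch (names heavily simplified; the real code lives in nova/objects/request_spec.py, nova/cmd/manage.py and nova/context.py):

```python
# Hypothetical condensation of the bug mriedem traced above: the online
# data migration in nova-manage builds request specs for old instances
# under get_admin_context(), whose project_id is None, and the buggy
# spec-builder trusted the context over the instance.

class AdminContext:
    # Stand-in for nova.context.get_admin_context(): no project set.
    project_id = None

def buggy_project_id(context, instance):
    # What the Ocata migration routine effectively stored.
    return context.project_id

def fixed_project_id(context, instance):
    # The fix: prefer the instance's own project_id.
    return instance['project_id'] or context.project_id

instance = {'project_id': 'abc123'}
print(buggy_project_id(AdminContext(), instance))  # None
print(fixed_project_id(AdminContext(), instance))  # abc123
```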
mnaseri guess its probably not the only one23:11
mriedemprobably not23:11
mriedemok so you're hitting this trying to live migrate that instance right?23:12
mnasermriedem: yes but i believe that any operations involving placement will likely fail23:12
mriedemso that's this http://git.openstack.org/cgit/openstack/nova/tree/nova/scheduler/client/report.py#n114123:12
mriedemthe scheduler is trying to create allocations in placement on the target node for that instance23:12
mnasercorrect, and because im not forcing it, it goes through the scheduler23:13
mnaserand the scheduler tacks on project_id from the request_spec23:13
mriedemyup https://github.com/openstack/nova/blob/16.0.4/nova/scheduler/filter_scheduler.py#L28723:14
mriedemand in this case, the instance project_id is likely != the context.project_id because the context is the admin user23:14
mriedemdoing the live migration23:14
mriedemSOB23:14
mnaseri looked at the number of request_specs23:15
mnaserand its pretty terrifying to have to update it all23:15
mnaserlol23:15
mriedemthe number of reqspecs that don't have a project_id set?23:15
*** burt has quit IRC23:15
mnaseri didnt want to run that query because im pretty sure ill burn down the sql server23:15
mnaserclose to a million records and i probably would have to wildcard match it23:15
mriedemselect count(*) from nova_api.request_specs where project_id is null and deleted = 0;23:16
mriedem?23:16
mnaserrequest_specs contains a json thingy called 'spec'23:16
mnaser{"nova_object.version": "1.5", ...}23:17
mriedemoh right23:17
mriedemyeah the request_specs.spec is a serialized json blob of the object23:17
mriedemso forget your db query23:17
mriedemjaypipes: ^23:17
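Since request_specs.spec is a serialized JSON blob rather than real columns, the broken rows have to be found in Python instead of SQL. A hypothetical sketch of that scan (the nova_object.data key layout is an assumption based on the blob format mnaser pasted):

```python
import json

# Hypothetical sketch of why a plain SQL WHERE can't find the broken
# rows: the spec column is a serialized versioned-object blob, so the
# project_id has to be checked after deserializing each row in Python.

def rows_missing_project_id(rows):
    """rows: iterable of (instance_uuid, spec_json) from request_specs."""
    broken = []
    for instance_uuid, spec_json in rows:
        spec = json.loads(spec_json)
        data = spec.get('nova_object.data', {})
        if data.get('project_id') is None:
            broken.append(instance_uuid)
    return broken

rows = [
    ('uuid-1', json.dumps({'nova_object.version': '1.5',
                           'nova_object.data': {'project_id': None}})),
    ('uuid-2', json.dumps({'nova_object.version': '1.5',
                           'nova_object.data': {'project_id': 'p1'}})),
]
print(rows_missing_project_id(rows))  # ['uuid-1']
```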
mriedemmnaser: well, i could hack something up for you quickish23:17
mriedemmnaser: have you reported a bug yet?23:17
mnasermriedem: i havent yet, i just kinda discovered how i ended up here with your information23:18
mnaser(i got as far as .. request spec doesnt have project id) but the online migration confirms it23:18
mriedemok, i can start hacking up a workaround if you can report a bug23:18
* mriedem wonders if we should hold up the newton eol for this23:18
mnasermriedem: just out of curiosity, is project_id/user_id actually used by the placement api ?23:18
mriedemnot yet23:19
mnaserbut i guess we dont want to make it from bad to worse23:19
mriedemthe long-term idea is we can leverage the allocations with the project/user information for doing things like counting quotas without iterating the cells23:19
mnasergotcha23:20
mnaseralright let me write up a bug23:20
mriedemthis would be very wrong for that though https://github.com/openstack/nova/blob/16.0.4/nova/scheduler/filter_scheduler.py#L29323:20
mriedemif we're live migrating or evacuating23:20
mnaseri guess thats why it says todo :>23:21
mriedemheh23:21
mriedemmelwitt: ^ a todo to keep in mind if we ever want to use placement allocations to mine data for counting quotas23:21
mriedemwe aren't storing the correct user_id for all allocations23:21
melwittso we should have one claim per allocation or?23:23
mriedemwhen migrating or evacuating, by default the context is the admin23:24
mriedemb/c those are admin apis23:24
mriedemso the user_id we're storing in the allocation for the instance is from the admin, but the project_id should come from the instance, which is the user23:24
melwittyeah, I see. guh23:24
*** felipemonteiro has quit IRC23:24
*** felipemonteiro has joined #openstack-nova23:24
*** kumarmn has quit IRC23:25
*** kumarmn has joined #openstack-nova23:26
*** huanxie has quit IRC23:27
*** andreas_s has joined #openstack-nova23:28
melwittdoes it maybe work out because allocations are updated by the compute host every update interval? would it auto heal the user/project once we fix it?23:28
mriedemno23:29
mriedemcomputes don't mess with allocations once you're upgraded to pike23:29
mnasermriedem: https://bugs.launchpad.net/nova/+bug/173931823:30
openstackLaunchpad bug 1739318 in OpenStack Compute (nova) "Online data migration context does not contain project_id" [Undecided,New]23:30
mriedemmnaser: thanks23:30
melwitthm, I thought that's what update_available_resource did23:30
mriedemmelwitt: used to did23:30
mnaseralso looks like the claim resources which did `project_id = spec_obj.project_id` was moved to scheduler utils23:30
melwittdamn23:30
mnaserso that might make things more challenging to backport.23:30
*** kumarmn has quit IRC23:30
mnaser(or if you have to solve the user_id one)23:31
*** kumarmn has joined #openstack-nova23:31
*** huanxie has joined #openstack-nova23:32
*** andreas_s has quit IRC23:32
*** moshele has joined #openstack-nova23:32
*** itlinux has quit IRC23:33
mnasermriedem: so is it time to write a little script to iterate all request specs, and for those with null, look up the project_id from the instances table and update the spec with it?23:36
mriedemmnaser: i think we might need that too, but i have a workaround i think we can use for now,23:38
mriedemplus fixing that busted migration routine for people that haven't hit this yet23:38
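A hypothetical sketch of the backfill script mnaser floats above (table and column names are assumptions, and rows are modeled as plain dicts rather than real DB rows):

```python
import json

# Hypothetical backfill along the lines mnaser suggests: for each request
# spec whose embedded project_id is null, look the value up in the
# instances table and rewrite the serialized blob.

def backfill(request_specs, instances_by_uuid):
    """Return the number of specs that were repaired in place."""
    fixed = 0
    for row in request_specs:
        spec = json.loads(row['spec'])
        data = spec.setdefault('nova_object.data', {})
        if data.get('project_id') is not None:
            continue  # already good
        instance = instances_by_uuid.get(row['instance_uuid'])
        if instance is None:
            continue  # orphaned spec; leave it for manual inspection
        data['project_id'] = instance['project_id']
        row['spec'] = json.dumps(spec)
        fixed += 1
    return fixed

rows = [{'instance_uuid': 'u1',
         'spec': json.dumps({'nova_object.data': {'project_id': None}})}]
print(backfill(rows, {'u1': {'project_id': 'p9'}}))  # 1
```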
mnasermriedem: im working on a small fix for that busted migration routine as it seems pretty trivial23:38
*** penick has quit IRC23:39
mnasermriedem: im noticing a lot (most fields) are nullable=True .. can I drop that for project_id or is that a design decision?23:41
mnaserif i cant drop it, i can raise an exception in from_components if context.project_id is none (and add a unit test for that), then fix the layer above it to make sure it always supplies a project_id23:42
openstackgerritMatt Riedemann proposed openstack/nova master: Use instance.project_id when creating request specs for old instances  https://review.openstack.org/52918423:42
openstackgerritMatt Riedemann proposed openstack/nova master: WIP: Workaround missing RequestSpec.project_id when moving an instance  https://review.openstack.org/52918523:42
mriedemmnaser: this is my start ^23:42
mnaseroh okay :P23:42
mnaserc'mon gerrit23:43
*** catinthe_ has joined #openstack-nova23:43
openstackgerritMatt Riedemann proposed openstack/nova master: WIP: Workaround missing RequestSpec.project_id when moving an instance  https://review.openstack.org/52918523:45
mriedem^ handles the other cases23:45
mriedemtonyb: think we might want to hold up https://review.openstack.org/#/c/529102/ for https://review.openstack.org/52918423:47
mnasermriedem: the patch for the fix looks good, but just a question, do you want to drop nullable=True to make sure that it will never save (in case we ever likely run into this again?)23:47
mriedemmnaser: that will require a version bump on the object and isn't something we can backport23:47
mriedemit's something we can do on master, but not critical atm23:47
mnaserah okay, figured there was a reason behind it23:47
mriedemi'll leave a todo23:48
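What dropping nullable=True would buy can be shown with an in-memory SQLite table standing in for request_specs (illustrative only; as mriedem notes, nova itself would also need an object version bump, so it is a master-only change):

```python
# Demonstrates that a NOT NULL constraint on project_id rejects the bad
# write instead of silently persisting NULL.
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE request_specs ('
             'instance_uuid TEXT NOT NULL, '
             'project_id TEXT NOT NULL)')  # nullable=True dropped

conn.execute("INSERT INTO request_specs VALUES ('uuid-1', 'p1')")  # ok
try:
    conn.execute("INSERT INTO request_specs VALUES ('uuid-2', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the NULL write is refused at the database layer
```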
*** kumarmn has quit IRC23:48
mriedemmnaser: i don't suppose you have a recreate of this in staging that you can test out with the workaround patch?23:49
mnasermriedem: i dont think i can recreate this scenario.. we just rebuilt our local dev cloud from scratch a few weeks ago :(23:50
mnaserit was too bad because it was running since newton23:50
mriedemok, we could probably recreate it though with devstack. create a new instance, delete its request spec from the db directly, then run the migration routine23:50
mriedemthen try to migrate that instance23:50
mnasermriedem: we probably dont have to get that far, seeing project_id non-null in the request_specs table would be enough to show that this bug specifically was resolved23:51
openstackgerritMerged openstack/nova master: Deduplicate instance.create notification samples  https://review.openstack.org/52345623:51
mriedemtrue23:52
mriedemi mean, you could just test this in prod, but...i didn't want to ask23:52
mnasermriedem: i could probably patch up the live migration one only23:52
*** catintheroof has joined #openstack-nova23:52
mnasersince really nothing can break there because its an admin api only23:52
*** catintheroof has quit IRC23:52
mnaseri wouldnt be able to test the migrate and conductor changes as those are too critical tbh23:52
*** catintheroof has joined #openstack-nova23:53
mnaserby conductor, the conductor manager change that is23:53
mriedemyeah23:55
mnaseroh fun times23:55
mnaserthis will conflict in stable/pike23:55
mnaser_get_request_spec_for_select_destinations doesnt exist in stable/pike .. not in my stable/pike23:55
*** catinthe_ has quit IRC23:55
mnasertasks23:56
mriedemthat code is in _find_destination in pike23:57
*** ludovic has quit IRC23:57
*** catintheroof has quit IRC23:58
mriedemhttps://github.com/openstack/nova/blob/stable/pike/nova/conductor/tasks/live_migrate.py#L26423:58
mnaserok i see23:58
*** jdurgin has quit IRC23:59

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!