Friday, 2021-09-24

06:41 *** rpittau|afk is now known as rpittau
08:50 *** odyssey4me is now known as Guest847
11:56 <snadge> i have dhcp and dhcpv6 enabled on the wan interface, and now the ipv4 address times out after 1800 seconds
11:56 <snadge> it's like it won't renew.. but if i renew manually it does, pretty frustrating
11:57 <snadge> i could just set the ipv4 address statically.. as it is a static ip, but that's annoying, i want to know why it's doing that
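For anyone hitting the same thing: a quick way to inspect the lease timers and force a renewal by hand, assuming an ISC dhclient setup (the interface name eth0 and the lease file path are placeholders that vary by distro):

    # inspect the current lease and its renew/rebind/expire timers
    cat /var/lib/dhcp/dhclient.leases

    # release the ipv4 lease, then re-request it verbosely so the
    # DHCPREQUEST/DHCPACK exchange is visible
    dhclient -r eth0
    dhclient -v eth0

If the manual renew works but the automatic one never fires, comparing the lease's renew timer against what the server hands out is usually the next step.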
12:58 <spatel> noonedeadpunk yesterday i upgraded openstack V to W without any issue.
12:58 <spatel> it took 8 hours to complete the full upgrade
13:06 <mgariepy> nice spatel, what is your network stack? ovs? or still lxb?
13:06 <spatel> lxb
13:06 <mgariepy> how many hosts do you have?
13:06 <spatel> this is production, around 100
13:07 <mgariepy> cool.
13:07 <noonedeadpunk> great news!
13:07 <spatel> is it normal to take 8 hours?
13:07 <noonedeadpunk> against 100 hosts?
13:07 <spatel> yes
13:08 <spatel> i have one more environment which has 328 computes :)
13:08 <mgariepy> when i do an upgrade like that i tend to split the task over a couple of days.
13:08 <spatel> mgariepy is it ok if infra runs on wallaby and computes run on victoria?
13:08 <noonedeadpunk> me too
13:08 <noonedeadpunk> yes, totally
13:08 <spatel> i thought they don't work with mixed versions
13:09 <mgariepy> control plane first on day 1 (infra + keystone + other services) and then nova/neutron on day 2.
13:09 <spatel> oh!! damn it.. i didn't know that
13:09 <mgariepy> well, your upgrade was live, and for 8 hours you had a mix of the 2 releases! ;)
13:09 <spatel> next time i will split it out... it would be good to put a note in the official doc in case people are not aware :)
13:10 <mgariepy> you can also split that over 3 days if you want!
13:10 <noonedeadpunk> we split it over a week lol
13:10 <mgariepy> depending on how many hosts :P
13:10 <mgariepy> my cloud is kinda small so.. 2 days is enough haha
13:10 <noonedeadpunk> to give customers the exact time of when, and what exactly, can fail
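For context, a staged run like the one described above could look roughly like this with the stock OpenStack-Ansible playbooks (a sketch only; the playbook names are the standard ones, but the split across days is purely the operator's choice):

    # day 1: control plane (infrastructure services + keystone)
    openstack-ansible setup-hosts.yml
    openstack-ansible setup-infrastructure.yml
    openstack-ansible os-keystone-install.yml

    # day 2: data plane (nova/neutron across the computes)
    openstack-ansible os-nova-install.yml
    openstack-ansible os-neutron-install.yml

The all-in-one alternative is scripts/run-upgrade.sh, which is roughly what turns into the single 8-hour run discussed above.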
13:12 <spatel> i have noticed that when you upgrade/restart the OVS agent, the network takes a hit
13:12 <spatel> is that true?
13:12 <noonedeadpunk> yep, ovs restart does break networking on the compute
13:12 <spatel> so we have to take downtime, that sucks...
13:12 <noonedeadpunk> because eventually you shut down and re-create the bridges
13:13 <spatel> my big problem is that in the remote datacenter i am planning to use dpdk (because sriov doesn't support bonding), so i have to use ovs
13:14 <spatel> noonedeadpunk as you know i haven't upgraded ceph yet.. i am planning to test that in the lab and do it in production later
13:15 <spatel> noonedeadpunk i have a very stupid question: what if i lose the deployment node, how do i rebuild it in that case?
13:15 <noonedeadpunk> you need to back up /etc/openstack_deploy :)
13:26 <noonedeadpunk> or store it in git
13:26 <mgariepy> and push it somewhere.
13:26 <noonedeadpunk> not the best idea to store user_secrets in git though
13:27 <mgariepy> i encrypt the secrets and store them on a private git server.
13:29 <mgariepy> how do you guys do that, noonedeadpunk?
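One way to do it, sketched under the assumption of ansible-vault for the encryption (the remote URL is a placeholder):

    cd /etc/openstack_deploy

    # encrypt the secrets file in place before it ever touches git
    ansible-vault encrypt user_secrets.yml

    git init
    git add -A && git commit -m "backup of deployment configuration"
    git remote add origin git@git.example.com:ops/openstack_deploy.git
    git push -u origin master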
13:29 <jamesdenton> anyone here run into an issue with the dashboard, where after logging in you get a 504 after a while? Looking at nova-api, it's just spinning its wheels on /v2.1/os-simple-tenant-usage/
13:31 <jamesdenton> repeated (successful) requests for GET /v2.1/os-simple-tenant-usage/7a8df96a3c6a47118e60e57aa9ecff54 (project). strange
13:32 <mgariepy> is it only one node that is failing?
13:33 <jamesdenton> i see the same behavior across all controllers running that service
13:34 <jamesdenton> it's the Project -> Compute -> Overview page that's problematic, not all of horizon
13:34 <mgariepy> do you see the error in the nova logs?
13:36 <mgariepy> when i do an upgrade on horizon i usually destroy one of the containers and create a new one, just to have a quick fallback if the upgrade fails for mysterious reasons..
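In OpenStack-Ansible terms that recreate is a pair of the stock lxc container playbooks plus a reinstall (a sketch; the --limit target is a placeholder, taken from your own inventory):

    # destroy one horizon container, rebuild it, and redeploy horizon into it
    openstack-ansible lxc-containers-destroy.yml --limit infra1_horizon_container-abcd1234
    openstack-ansible lxc-containers-create.yml --limit infra1_horizon_container-abcd1234
    openstack-ansible os-horizon-install.yml --limit infra1_horizon_container-abcd1234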
13:36 <jamesdenton> https://pastebin.com/ndzhmk1B
13:37 <jamesdenton> horizon logs look clean, but something is making repeated requests to this url
13:37 <jamesdenton> which is related to that overview page
13:40 <mgariepy> do the requests all come from the same horizon container?
13:40 <mgariepy> or mostly**
13:42 <jamesdenton> ahh, good question, i'll have to check. it's "load balanced" but i need to make sure
13:43 <mgariepy> the connections to horizon are sticky per client IIRC
13:43 <jamesdenton> nova usage-list uses the same os-simple-tenant-usage calls, testing it now.
13:45 <mgariepy> your loadbalancer would log the balancing; in the nova api logs, if you list the last 1000 requests, are they coming mostly from the same host?
13:45 <jamesdenton> this is just a lab env
13:46 <mgariepy> lol ok and? take the last 100 calls then haha
13:46 <jamesdenton> :D
13:46 <mgariepy> you are the only client?
13:46 <jamesdenton> yes, i'm the only client. Problem persists across a reboot, too, so i wonder if there's something funky in the DB
13:47 <mgariepy> flush memcached
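Flushing memcached can be done straight from the shell, assuming the default bind address and port (adjust both to wherever your deployment runs memcached):

    # invalidate every cached entry on the local memcached instance
    echo 'flush_all' | nc -q 1 127.0.0.1 11211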
13:47 <jamesdenton> it happened after a recent upgrade, but i can't recall which
13:48 <mgariepy> is it a 3 ctrl node deployment?
13:48 <jamesdenton> yes
13:48 <mgariepy> block-migrate on hdds is soooo painful
13:51 <mgariepy> did you try to remove one ctrl node from the lb to see if it's only one node that is causing the issue?
14:02 <jamesdenton> yeah, i've done the usual stuff.
14:07 <mgariepy> you should try to start the weekend early haha
14:17 <Adri2000> hi... this backport https://review.opendev.org/c/openstack/openstack-ansible-os_swift/+/806210 would be happy with another +2 :)
14:22 <mgariepy> Adri2000, done.
14:23 <Adri2000> thank you mgariepy!
14:51 <noonedeadpunk> it would also be great to review https://review.opendev.org/q/topic:"osa%252Fgalera_pki"+(status:open) :)
15:17 <jamesdenton> mgariepy So, I've isolated this to some issue w/ nova api versioning. But maybe it's really something with haproxy, dunno. If I use microversion <= 2.39 it works, >= 2.40 fails
15:17 <jamesdenton> https://docs.openstack.org/nova/latest/reference/api-microversion-history.html
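Per that history, 2.40 is the microversion that added limit/marker pagination to the os-simple-tenant-usage endpoints, which lines up with the symptom. Pinning the microversion from the CLI is an easy way to reproduce the split (a sketch using the novaclient CLI mentioned earlier):

    # pre-pagination behaviour: usage comes back in one shot
    nova --os-compute-api-version 2.39 usage-list

    # 2.40+ pages through usage records with limit/marker; the failing path here
    nova --os-compute-api-version 2.40 usage-list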
15:35 <mgariepy> jamesdenton, a mismatch of the microversion between horizon/nova and other stuff?
15:37 <jamesdenton> i can't imagine so, microversion 2.40 is pretty old (older than this env, even). but i'm not really sure what the deal is. running wallaby now, and this issue started happening in this env during victoria, IIRC. this lab sees some abuse. at this point it's the principle of the thing - i need to fix it to keep myself sane
15:38 <jamesdenton> i might adjust the endpoints to bypass haproxy and see if it occurs directly against nova-api
15:42 <mgariepy> keep us in the loop
15:43 <mgariepy> i don't see why haproxy would cause an issue with a version like that.
15:44 <jamesdenton> I was thinking maybe a difference in the payload was causing an issue. but the same thing happens directly against nova api when i change the endpoint
15:58 <jamesdenton> mgariepy so, it looks like there was something about terminated instances that was causing an issue. The issue persisted even after deleting all instances from all projects. Had to run "nova-manage db archive_deleted_rows"
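For reference, with the flags that keep the archive running until everything soft-deleted has been moved (standard nova-manage options; in an OSA deployment this runs inside the nova-api container's venv):

    # move soft-deleted rows from the main tables into the shadow tables
    nova-manage db archive_deleted_rows --until-complete --verbose

    # optionally drop the archived rows for good afterwards
    nova-manage db purge --all --verbose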
15:59 <mgariepy> when did you install that system? is it an old install upgraded over the last 10 years?
15:59 <jamesdenton> Austin -> Wallaby
15:59 <jamesdenton> Not really, probably Queens/Rocky -> W
15:59 <mgariepy> LOL
15:59 <jamesdenton> but this issue crept up some time in the last few weeks
16:00 <mgariepy> did you run nova api with debug?
16:00 <mgariepy> how did you see the issue with the deleted instances?
16:01 <jamesdenton> oh yeah. in fact, i thought it was fixed but it's not. as soon as i created a new instance, the problem came back. I mentioned deleted instances because you could still see them returned in the payload
16:01 <mgariepy> db migration missing?
16:02 <mgariepy> if nova errors out on a db issue there should be a traceback somewhere, no?
16:03 <mgariepy> unless it's in try: except: pass haha
16:03 <jamesdenton> i would think so. the only thing i see in the logs is repeated attempts against os-simple-tenant-usage
16:35 <opendevreview> Merged openstack/openstack-ansible-os_swift stable/ussuri: Revert "split templates to work around configparser bug" https://review.opendev.org/c/openstack/openstack-ansible-os_swift/+/806210
