Wednesday, 2022-06-22

*** dviroel|afk is now known as dviroel00:07
*** ysandeep|out is now known as ysandeep02:09
*** carloss_ is now known as carloss06:12
*** ysandeep is now known as ysandeep|afk07:02
fanfi_hi, is there sb who can help me with bootstrap script ? https://paste.opendev.org/show/bgD9fuewymHEY2Ip1kYL/07:57
jrosser_fanfi_: which branch is that? master?08:12
fanfi_* master08:12
fanfi_  remotes/origin/HEAD -> origin/master08:12
jrosser_did you start from a clean host there?08:13
fanfi_yes08:13
fanfi_ubuntu 2008:16
noonedeadpunkthat sounds like installation of collections fails08:16
noonedeadpunkah08:16
jrosser_i think we need to address this https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/openstack-ansible.rc#L5608:17
jrosser_here https://github.com/openstack/openstack-ansible/blob/master/scripts/bootstrap-ansible.sh#L194-L20408:18
noonedeadpunkhm. how haven't we caught that?08:19
noonedeadpunkI was running AIO multiple times...08:19
noonedeadpunkwill you patch that?08:19
jrosser_yes let me take a look08:28
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix tasks name for collections bootstrap  https://review.opendev.org/c/openstack/openstack-ansible/+/84713408:29
Adri2000hello jrosser_, I've just replied to https://review.opendev.org/c/openstack/openstack-ansible/+/846787 - but I understand my patch may be wrong, given that the CI fails anyway. happy to discuss further here to try to understand what would be a proper fix08:34
jrosser_noonedeadpunk: ^ regarding this - we have a bunch of code with conditionals on `(galera_root_user != 'root')` and i can't remember if that stuff is all transitional08:35
jrosser_and it's really confusing to read actually08:36
Adri2000my understanding is that this patch https://opendev.org/openstack/openstack-ansible-galera_server/commit/931f3c74a78774f319acfb6867ff9742b9b46e3d has the consequence that /root/.my.cnf is no longer dropped in the galera container. and that has to be done by the client part of the galera role, later in the process.08:37
* noonedeadpunk in a meeting08:43
anskiyNot sure if this relates, but  I've been doing this: https://paste.opendev.org/show/bA0puUg2Sv9ueZtgaR5j/, prior to running any openstack-ansible things.08:44
jrosser_fanfi_: i am not managing to reproduce your error :(08:56
jrosser_i made a fresh 20.04 VM, apt update / dist-upgrade, git clone https://opendev.org/openstack/openstack-ansible.git, cd openstack-ansible, ./scripts/bootstrap-ansible.sh08:58
fanfi_yes, i did same on fresh ubuntu 20.04 VM https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/deploymenthost.html08:59
noonedeadpunkSo.08:59
noonedeadpunkWhen user is root, we need my.cnf to be installed since role will set password on the user and auth with socket won't be available09:00
fanfi_i am using MAAS so it's easy to redeploy the VM, let me try it again 09:01
jrosser_hold on09:01
noonedeadpunkWhen user is not root, we don't need that, as socket path is default. We basically don't even need my.cnf inside galera containers, as socket auth will be default way as well09:01
fanfi_oka09:01
jrosser_fanfi_: can you try this? https://paste.opendev.org/show/b19m4Pl8NhG9LA0bmsx8/09:01
damiandabrowski[m]wasn't it fixed by https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/829260 ?09:01
noonedeadpunkAdri2000: ^09:02
noonedeadpunkso basically question - why you need my.cnf inside galera containers?09:03
anskiydamiandabrowski[m]: I'll give it a try, thank you!09:03
Adri2000noonedeadpunk: hmm ok. in my case, running `mysql` in the galera container was failing, it'd not connect to the socket. I'm currently setting up an AIO master to check whether I can reproduce the problem or if at least I can understand what was wrong in my deployment.09:03
Adri2000noonedeadpunk: indeed I don't need my.cnf explicitly, I just need to be able to login to mysql09:03
noonedeadpunkWell, that could be the case in your deployment, if you upgraded OS as example?09:04
jrosser_i think this is my confusion about why we have that conditional code still09:04
jrosser_if the client is never needed in the galera containers in newer deploys.....09:04
noonedeadpunkSo root user is still set for password auth (ie it was older deployment when we used root user), but when you re-deployed container, my.cnf just vanished09:04
noonedeadpunkjrosser_: it's needed if user is root09:05
noonedeadpunkif it's been explicitly overridden as an example09:05
noonedeadpunkAdri2000: so basically, if you indeed re-deployed containers and can't connect, then likely you need either to set `galera_root_user: root` or reset root user auth to socket instead of password09:06
noonedeadpunkAs we indeed didn't cover upgrade path in terms of containers re-deployment or OS upgrade.09:07
noonedeadpunkor well. you can always revert to using root user09:07
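A minimal sketch of the rule noonedeadpunk describes above, assuming a hypothetical helper `needs_client_my_cnf`. Only the `galera_root_user` variable comes from the discussion; the actual role logic may differ.

```python
def needs_client_my_cnf(galera_root_user: str) -> bool:
    """Return True when /root/.my.cnf is needed inside the galera container.

    Hypothetical helper, not part of the role. Per the discussion, when the
    admin user is 'root' the role sets a password on it, so socket auth is
    not available and a client config is required. With any other admin
    user, root keeps socket auth and no client config is needed.
    """
    return galera_root_user == "root"


# A deployment that kept (or reverted to) the root user needs the file,
# one that moved to the 'admin' user does not.
print(needs_client_my_cnf("root"))
print(needs_client_my_cnf("admin"))
```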
Adri2000it's a deployment where we recently upgraded from ubuntu 18.04 to 20.04, and then from victoria to wallaby... from what I see the transition from "root" to "admin" was done correctly09:07
Adri2000let me do some more testing and I'll come back later today with some explanation, hopefully :)09:08
*** ysandeep|afk is now known as ysandeep09:09
noonedeadpunkMaybe you need to just drop 'root'@'%' ?09:09
noonedeadpunkIe https://review.opendev.org/c/openstack/openstack-ansible/+/775684/8/releasenotes/notes/galera_root_user-43c292688ddc4f1d.yaml09:09
noonedeadpunkSure!09:09
admin1https://docs.openstack.org/openstack-ansible-os_neutron/latest/configure-network-services.html -- i am guessing firewall_v2 and vpnaas is not ON by default ? 09:43
noonedeadpunkWe should drop fwaas mentioning from docs I believe...09:51
noonedeadpunkbut no, they're not09:51
jrosser_fwaas is coming back some time isnt it09:53
jrosser_but no idea if that is only OVN or smth09:54
noonedeadpunkhuh09:55
noonedeadpunkit's not ovn09:55
noonedeadpunkhttps://docs.openstack.org/neutron/latest/admin/fwaas-v2-scenario.html09:55
noonedeadpunkbut how it should be installed...09:55
noonedeadpunkI also don't understand how it's different from port security now...09:56
jrosser_its at the router isnt it?09:58
noonedeadpunkv1 yes on the router09:59
noonedeadpunkv2 on ports?09:59
noonedeadpunkhttps://docs.openstack.org/neutron/latest/admin/fwaas.html09:59
noonedeadpunkbut indeed I see that https://opendev.org/openstack/neutron-fwaas revived recently09:59
noonedeadpunkugh09:59
noonedeadpunkneed to catch up with that10:00
anskiynoonedeadpunk: fwaas works only with OVS10:18
noonedeadpunkyeah... that's true. 10:19
anskiyit doesn't break if you're using OVN tho, just silently allows all traffic :)10:20
admin1so fwaas is out . is vpnaas future proof ? 10:29
admin1ovs -> ovn proof 10:29
*** dviroel__ is now known as dviroel11:28
*** ysandeep is now known as ysandeep|afk12:23
*** ysandeep|afk is now known as ysandeep13:22
Adri2000noonedeadpunk, jrosser_: tested on a brand new OSA AIO master: indeed `mysql` will allow to login inside the galera container. however it seems that works as it authenticates with the root user via unix socket. my understanding of what you said, is that it should work with the admin user via unix socket. `show grants for admin@localhost` will show that the admin user is14:41
Adri2000actually not authorized to login via unix_socket. in other words if you drop the root user (as recommended in the release note), then `mysql` in the container will no longer log you in.14:41
Adri2000also, is it at all expected that the user is present on a brand new deployment? if we recommend deleting it on upgrades, it should not be there on a fresh install?14:42
Adri2000finally, I noticed /etc/mysql/debian.cnf mentions the root user twice - shouldn't it be replaced with admin?14:43
Adri2000that's OSA master on Ubuntu 20.04, fwiw14:43
Adri2000*** I meant: is it expected that the *root* user is present on a brand new deployment14:44
jrosser_i think that the intention is that for everything beyond bootstrapping the admin user is done from the utility container14:45
noonedeadpunkAdri2000: but in release notes recommended to drop root@%. There's still root@localhost that should be able to login to the socket?14:58
noonedeadpunkand root@% is not present on brand new deployments I believe14:58
*** dviroel is now known as dviroel|lunch14:59
*** ysandeep is now known as ysandeep|out15:00
Adri2000right only root@localhost is present. so it's expected that login from the galera container happens via root/unix_socket rather than admin/unix_socket?15:11
noonedeadpunkyup15:12
Adri2000ok. what about the root+password credentials in /etc/mysql/debian.cnf? I'm not sure it causes actual problems, but it should be something else right? either admin+password, or just root/unix_socket15:12
lowercas_Has anyone run into an issue where openstack nova is complaining about cell v2 when upgrading from wallaby to xena where it appears the api is calling an older api version which isn't compatible with the new xena version?15:12
noonedeadpunkBasically this comes from recommendation never to mess up with root user on mariadb15:13
Adri2000anyway, thanks noonedeadpunk and jrosser_ for your help. I'll rethink about all of this tomorrow as I had enough mysql for today =) and will certainly drop my patch15:13
noonedeadpunkI can't recall, but that could be opened as separate packaging bug or smth15:13
Adri2000yeah, I can do a bug report for that part15:14
jrosser_Adri2000: if you think we could document better, it would be cool to know where / what15:14
noonedeadpunkas what mariadb folks told us is that modifying root user in any way is smth that should be avoided and not good practise15:14
Adri2000understood15:15
noonedeadpunkbtw isn't in debian.cnf some other user used by default than root?15:16
noonedeadpunklike system-maint or smth like that?15:16
Adri2000maybe by default, but I believe OSA will rewrite it: https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/tasks/galera_server_post_install.yml#L151-L155 and root is hardcoded in the template: https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/templates/debian.cnf.j215:18
noonedeadpunkok, that is weird/wrong....15:19
lowercas_found it: https://bugs.launchpad.net/openstack-ansible/+bug/173673115:21
lowercas_^ for my bug15:21
noonedeadpunklowercas_: don't we cover that with https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/master/tasks/nova_db_setup.yml#L60-L87 ?15:23
noonedeadpunklooking at patch, it should be already present for Wallaby15:24
lowercas_noonedeadpunk: https://paste.centos.org/view/ee5f5b8115:30
lowercas_your query doesn't appear to be grabbing the uuid, it's grabbing the whole line?15:30
noonedeadpunkI think we rely more on the return code than on the content15:33
noonedeadpunkbtw15:33
lowercas_hmm, yea i see that.15:34
noonedeadpunkSo basically, when there's already cell1 present `Create the cell1 mapping entry in the nova API DB` just should not run15:35
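A rough sketch of the "return code, not content" pattern mentioned above. The helper name `command_succeeded` is hypothetical and the commands are stand-ins run through the Python interpreter, not real `nova-manage` invocations; the role's actual task may be written differently.

```python
import subprocess
import sys


def command_succeeded(argv):
    """Run argv and report success from the exit status only.

    Whatever the command prints (a whole line, a UUID, nothing) is
    captured and ignored; only a zero return code counts as success.
    """
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode == 0


# Stand-in for a lookup that finds cell1: prints a full line, exits 0.
found = [sys.executable, "-c", "print('| cell1 | some-uuid | ...')"]
# Stand-in for a lookup that finds nothing: exits non-zero.
missing = [sys.executable, "-c", "import sys; sys.exit(1)"]

# The follow-up "create" step would be skipped when the lookup succeeded.
for argv in (found, missing):
    print("skip create" if command_succeeded(argv) else "run create")
```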
lowercas_root@infra1-nova-api-container-f0847039:/openstack/venvs/nova-24.1.0# bin/nova-manage cell_v2 verify_instance --uuid 70bd4168-3fc1-4b6e-b2bb-6b0faf31470215:36
lowercas_Instance 70bd4168-3fc1-4b6e-b2bb-6b0faf314702 is not mapped to a cell (upgrade is incomplete) or instance does not exist15:36
lowercas_that's essentially where i am stuck15:36
*** lowercas_ is now known as lowercase15:37
opendevreviewMerged openstack/openstack-ansible master: Fix tasks name for collections bootstrap  https://review.opendev.org/c/openstack/openstack-ansible/+/84713416:12
*** dviroel|lunch is now known as dviroel16:31
spatelAny outstanding issue to upgrade from Wallaby->Xena ?16:56
fanfi_hi, guys did you get following error during bootstrap ? https://paste.opendev.org/show/b7ndfzMTPjsKujtj6ANU/19:07
noonedeadpunkfanfi_: galaxy.ansible.com can have plenty of different issues :(19:10
noonedeadpunkmost of them are intermittent though and should resolve on their own19:11
fanfi_:( sweet 19:11
noonedeadpunkwe have only 2 collections being installed from galaxy though19:14
noonedeadpunk(maybe 3)19:14
noonedeadpunkyou can replace them with installation from github but there're drawbacks 19:15
noonedeadpunkas they won't have version when installed19:15
noonedeadpunkfanfi_: but to have that said it's all really about galaxy.ansible.com  which we don't control19:19
jrosser_fanfi_: i could not reproduce either your error with the linear strategy nor the SSL error which you showed me earlier19:21
fanfi_sure, i will need to check the collections that must be installed19:21
jrosser_are you behind some strange corporate SSL man-in-the-middle firewall?19:21
noonedeadpunkyou can also create user-collection-requirements.yml with content https://paste.openstack.org/show/bVtF9LZofRiWIB8ShzqD/19:22
noonedeadpunkIt could be also some CDN thing...19:23
fanfi_not sure, probably not. it is lab env. but in corporate building 19:23
noonedeadpunkonce I wrote it, I'm not sure this would replace galaxy source.. .Or well, I'm pretty sure it's not...19:24
noonedeadpunkbut you can just in-place manually overwrite them in /opt/openstack-ansible/ansible-collection-requirements.yml19:25
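For illustration only, a sketch of what a git-sourced collection entry could look like, expressed as the data structure that a requirements file would serialize. The helper `to_git_requirement` is hypothetical, the version string is a placeholder that assumes the repository has a matching tag, and the content of the paste linked above is not known, so this is not a reproduction of it.

```python
def to_git_requirement(repo_url: str, version: str) -> dict:
    """Build a git-sourced collection entry for a requirements file.

    With type 'git', the 'name' field carries the repository URL and
    'version' names a branch, tag, or commit in that repository.
    """
    return {"name": repo_url, "type": "git", "version": version}


# Placeholder values: the tag is assumed to exist in the repository.
requirements = {
    "collections": [
        to_git_requirement(
            "https://github.com/ansible-collections/community.general",
            "2.1.1",
        )
    ]
}

print(requirements["collections"][0]["type"])
```

As noted in the channel, collections installed this way do not carry a version once installed, which is the drawback of bypassing galaxy.ansible.com.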
fanfi_okay ...looks like our corporate issue with ssl :( ..so i need to install  the collections by different way 19:33
spatelAny idea about this error ? - https://paste.opendev.org/show/bIur5LQFlctDbuCnAfn4/19:33
spateldeploying 23.3.0 wallaby on new cloud 19:33
jrosser_what version of the community.general collection do you have19:38
jrosser_it should be 2.1.1 https://github.com/openstack/openstack-ansible/blob/stable/wallaby/ansible-collection-requirements.yml#L619:39
spatelchecking..19:41
admin1spatel, i have faced that error somewhere .. but can't remember when 19:42
spatelits version: 2.1.119:42
admin1i upgraded a few to 23.3.0 without issues 19:42
spatellet me re-run playbook 19:42
admin1check haproxy stats .. does it show all OK ? 19:43
spatelits showing ok 19:43
admin1galera ,. one should be green and 2 other in standby type ( and not error ) 19:44
spateli found some DNS related issue which i just fixed so lets see19:44
admin1in my case, my galera was not in sync/broken 19:44
*** dviroel is now known as dviroel|afk19:48
spatelsomething is really messed up.. i am destroying container and re-building it19:58
spatelgalera is not happy somehow 19:58
spateljrosser_ admin1 no error after destroy container and re-deploy galera 20:04
admin1yours is a new cloud, so nice 20:05
spatelyes 20:05
spatelbuilding 78 nodes cloud in singapore region 20:06
admin1mine was an existing one where upgrade ( somehow ) wiped my old galera and brought up a new database .. ( it said cluster name mismatch ) .. but i had backups just 5 mins before, so had to manually bring up galera and re-run 20:06
spatelonce it done, my goal is to go back to OVN and test LXD to OVN migration stuff20:06
jrosser_spatel: if you read the error message it is a task delegated from the galera container to the host, managing the haproxy settings20:10
spatelwhat does that mean?20:14
spatelThis is what i think, i had DNS problem on haproxy public vip and that may have created the issue. 20:15
spatelit could be just me and my error20:16
spatelI have to leave now but i will see if i hit any other error :)20:18
spatelsoon i have plan to upgrade all production clouds from Wallaby->Xena so hope all go well 20:18
spatelGood night! see you tomorrow 20:19
*** dviroel|afk is now known as dviroel20:43
*** dviroel is now known as dviroel|afk21:22
