spatel | jrosser Hey! | 14:18 |
jrosser | hello | 14:19 |
spatel | Around? | 14:19 |
jrosser | a bit, yes :) | 14:19 |
spatel | https://paste.opendev.org/show/bIAsi8mx3HrA2gG0UTVd/ | 14:19 |
spatel | What is wrong here.. I am pulling my hair :( | 14:19 |
spatel | trying to install magnum-cluster-api | 14:20 |
spatel | Cluster type (vm, ubuntu, kubernetes) not supported | 14:20 |
spatel | I found this reference thread but no solution yet - https://kubernetes.slack.com/archives/C05Q8TDTK6Z/p1711704187186329 | 14:21 |
* jrosser shakes fist at slack | 14:23 |
jrosser | spatel: so there are messages from the magnum-cluster-api almost as soon as magnum-conductor starts (you did restart it?!), like this https://zuul.opendev.org/t/openstack/build/ce6d644ed18144869e501b512ec165b1/log/logs/openstack/aio1-magnum-container-d0d4b96e/magnum-conductor.service.journal-19-13-04.log.txt#1921 | 14:29 |
jrosser | where it does not find the manila service, for example | 14:29 |
jrosser | that shows that the driver is loaded | 14:29 |
spatel | I did restart both api and conductor, but if you think I should do conductor first then I can do it in that order | 14:30 |
jrosser | the order should not matter | 14:31 |
jrosser | when it says that the cluster type is not supported it means that for some reason the driver is not loaded, like the thread in slack says | 14:31 |
jrosser | i have never seen this though | 14:31 |
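For context: magnum resolves the (server_type, os_distro, coe) tuple from a cluster template to a driver via stevedore entry points in the `magnum.drivers` namespace, so "Cluster type ... not supported" means no loaded driver claims that tuple. A minimal sketch for listing what is actually discoverable, assuming a kolla-style venv path:

```bash
# List every driver entry point discoverable in the magnum.drivers namespace.
# /var/lib/kolla/venv is a hypothetical path - use whichever venv runs magnum.
/var/lib/kolla/venv/bin/python - <<'EOF'
from stevedore import extension

mgr = extension.ExtensionManager(namespace="magnum.drivers",
                                 invoke_on_load=False)
print(sorted(mgr.names()))
EOF
```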
spatel | hmm but i did install magnum-cluster-api with the pip command as mentioned | 14:33 |
spatel | Does it have anything to do with the mgmt cluster? | 14:33 |
jrosser | no i don't think so | 14:34 |
spatel | This is my mgmt cluster logs - https://paste.opendev.org/show/bwir9RYgJZPjFSbzUlfQ/ | 14:35 |
spatel | I thought mcapi goes out and talks to the mgmt k8s cluster and then registers | 14:35 |
jrosser | register? | 14:39 |
jrosser | you are trying to create the cluster template - this is something entirely internal to magnum and its db | 14:40 |
jrosser | but it tells you that it does not know what to do for a cluster with the tuple (vm, ubuntu, kubernetes) | 14:41 |
jrosser | which suggests that the driver is not loaded for some reason | 14:43 |
spatel | hmm | 14:52 |
jrosser | it is also true that the first thing that the driver does is set up the API object for the management cluster https://github.com/vexxhost/magnum-cluster-api/blob/724a3fbb19889342fc427298729d1e3f88c35161/magnum_cluster_api/driver.py#L46 | 14:54 |
jrosser | though whether that actually touches the API at all is another matter | 14:54 |
spatel | Let me redeploy mgmt cluster | 14:56 |
spatel | something funny going on | 14:57 |
spatel | I am waiting for mnaser to respond :) | 14:57 |
jrosser | how did you deploy it? | 15:07 |
spatel | https://paste.opendev.org/show/bVYdx1iH7YXna9AnCT8a/ | 15:11 |
spatel | but i am using the latest version of clusterctl | 15:11 |
spatel | instead of the old one | 15:12 |
jrosser | oh well i meant magnum-cluster-api | 15:12 |
spatel | pip install magnum-cluster-api | 15:12 |
jrosser | you took care of upper-constraints etc when using pip? | 15:12 |
spatel | no | 15:13 |
spatel | All I do is just pip install magnum-cluster-api | 15:13 |
spatel | I did deploy a couple of clusters like that before and it all worked | 15:13 |
jrosser | oh sure | 15:14 |
jrosser | all i am saying is that maybe at that time you got lucky with the versions of things that were on pypi | 15:14 |
jrosser | perhaps today not so lucky | 15:14 |
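An unconstrained pip install resolves dependencies against whatever is newest on PyPI that day, which is what "lucky with the versions" refers to. Pinning against the coordinated release's upper-constraints file avoids that; a sketch, with the venv path as an assumption:

```bash
# Install the driver pinned to the 2024.1 tested dependency set
# (the venv path is hypothetical - use the venv that runs magnum).
/var/lib/kolla/venv/bin/pip install \
    -c https://releases.openstack.org/constraints/upper/2024.1 \
    magnum-cluster-api
```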
spatel | I am also following - https://www.roksblog.de/openstack-magnum-cluster-api-driver/ | 15:14 |
jrosser | but i have no idea | 15:14 |
spatel | I am going back to older version and see | 15:15 |
jrosser | i mean more the dependencies | 15:15 |
spatel | Hmm.. all I can do is go back to the older version and try | 15:15 |
jrosser | but like i say i have no idea about installing "open loop" like this without any constraints | 15:15 |
jrosser | that's why there is a ton of complexity in the way that OSA makes venvs, to ensure that everything is consistent | 15:16 |
spatel | hmm | 15:16 |
jrosser | ideally you need to use whatever mechanism kolla has to insert extra python packages at the point that the container image is built | 15:17 |
jrosser | (i have no idea if this is possible at all) | 15:17 |
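kolla does in fact have such a mechanism: Dockerfile Jinja2 template overrides applied at kolla-build time. A hedged sketch, assuming the conventional <image>_footer block name for the magnum base image:

```bash
# Bake the driver into the image at build time via a template override
# (the block name magnum_base_footer is an assumption).
cat > /tmp/template-overrides.j2 <<'EOF'
{% extends parent_template %}
{% block magnum_base_footer %}
RUN pip install magnum-cluster-api
{% endblock %}
EOF
kolla-build --template-override /tmp/template-overrides.j2 magnum
```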
sykebenX | noonedeadpunk: Good afternoon! Just wanted to thank you for your help yesterday and report back that I found out what appears to have been causing the issue. It seems that the SSH multiplexing session from a previous run as an unprivileged user (ANSIBLE_REMOTE_USER=nonroot) was being re-used on subsequent runs, and as a result, even though the SSH command that initiates the lxc-attach during gather container facts specifies the root user, | 18:15 |
sykebenX | the re-used session does not respect that | 18:15 |
sykebenX | the workaround was to simply `killall ssh` to kill the multiplex sessions and re-run with ANSIBLE_REMOTE_USER=root | 18:15 |
sykebenX | Still not 100% sure why running as nonroot in the first place will not work, but this seems to at least be part of the problem I was having | 18:16 |
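A more surgical alternative to `killall ssh` is to tear down only the multiplexed master connections with ssh's control commands; the ControlPath and hostname below are placeholders for whatever the ansible config actually uses:

```bash
# Check for, then close, a stale multiplexed master connection
# (ControlPath and host "infra1" are hypothetical - match your ansible config).
ssh -O check -o ControlPath=$HOME/.ssh/cp-%r@%h-%p infra1 || true
ssh -O exit  -o ControlPath=$HOME/.ssh/cp-%r@%h-%p infra1
# then re-run with the intended remote user
ANSIBLE_REMOTE_USER=root openstack-ansible setup-hosts.yml
```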
jrosser | sykebenX: I’m not sure if we say that non root is supported or not | 18:20 |
jrosser | we certainly do not test that at all | 18:20 |
sykebenX | Fair enough! I assumed that based on this document: https://docs.openstack.org/openstack-ansible/latest/user/security/non-root.html | 18:21 |
jrosser | aaaah ok | 18:22 |
jrosser | it would probably be not too difficult to set up a CI job to check that works properly | 18:23 |
jrosser | ultimately you end up with become=true so the difference between root and non-root is pretty small, but I understand that some places have policy about this | 18:24 |
sykebenX | jrosser: Yeah I agree it's probably a bit unnecessary, but yes, it is a requirement for some orgs | 18:25 |
spatel | jrosser no luck | 18:39 |
spatel | I did downgrade everything to the working environment | 18:39 |
spatel | I have 2023.1 running in production with mcapi and everything working well | 18:40 |
jrosser | I think you have to do some more active debugging | 18:40 |
spatel | I took 2023.1 magnum container and put it on 2024.1 cloud | 18:40 |
spatel | I did enable debug but not a single error | 18:40 |
spatel | everything looks clean... | 18:40 |
jrosser | you can use this to check that the driver is visible to magnum (I think) https://pypi.org/project/entry-point-inspector/ | 18:41 |
spatel | I am running mcapi in yoga + zed + 2023.1 with same config and same install method | 18:41 |
spatel | only with 2024.1 am I seeing this issue.. I don't think it's a 2024.1 release issue | 18:41 |
jrosser | you can add some debug prints to the mcapi driver to see if it even inits | 18:42 |
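A quick-and-dirty way to act on that suggestion is to drop a marker at the top of the installed driver module and watch for it at service start; the venv path is an assumption, and since this edits installed code it is strictly throwaway debugging:

```bash
# Locate the installed driver module and prepend an import-time marker.
DRV=$(/var/lib/kolla/venv/bin/python -c \
    "import magnum_cluster_api.driver as d; print(d.__file__)")
sed -i '1i import sys; print("mcapi driver imported", file=sys.stderr)' "$DRV"
# restart magnum-conductor and watch its journal for the marker
```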
jrosser | well, if it was me I would spin an AIO and test locally | 18:42 |
jrosser | but then for OSA we put in ~1 year of effort to make that possible | 18:42 |
spatel | Let me enable debug, but I can tell you it's not saying anything in debug at all.. not even a "capi" keyword in the entire log search | 18:43 |
jrosser | that’s how I get confidence that it will work for me in prod when I need it - that the upstream tests are representative and working | 18:43 |
jrosser | then that still points to the driver not loading | 18:44 |
jrosser | does kolla use a venv inside the docker container? | 18:44 |
spatel | Question, how does magnum know that it needs to load mcapi? | 18:44 |
spatel | Yes, docker uses a venv | 18:44 |
jrosser | through the stevedore framework I think | 18:44 |
jrosser | which is why I pointed you to entrypoint-inspector | 18:45 |
jrosser | so that you can check the visibility of the installed driver | 18:45 |
spatel | guide me.. how.. let me first install entrypoint-inspector | 18:45 |
spatel | pip install entrypoint-inspector | 18:46 |
jrosser | I am not at my work computer right now | 18:46 |
spatel | ERROR: Could not find a version that satisfies the requirement entrypoint-inspector | 18:46 |
spatel | let me google | 18:46 |
jrosser | kind of https://doughellmann.com/projects/entry-point-inspector/ | 18:47 |
jrosser | ah maybe entry-point-inspector | 18:47 |
jrosser | I am just on my phone :) | 18:47 |
spatel | works! | 18:48 |
spatel | Thank you so much for the help.. from your phone :) | 18:48 |
spatel | I have installed that package | 18:48 |
spatel | now enable debug and restart service? | 18:48 |
jrosser | no, use the epi command to look at the drivers for magnum | 18:49 |
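For reference, entry-point-inspector installs an `epi` command; inspecting the magnum driver namespace looks roughly like this (the driver name in the last line is a placeholder, not confirmed from the log):

```bash
# Show every entry point registered under magnum.drivers.
epi group show magnum.drivers
# Inspect a single driver entry point (name is hypothetical).
epi ep show magnum.drivers k8s_cluster_api_ubuntu
```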
spatel | ep show | 18:50 |
spatel | let me post output | 18:50 |
spatel | https://paste.opendev.org/show/bMzDNe0qidpLyijcPetw/ | 18:51 |
spatel | looks like the driver is loaded | 18:51 |
jrosser | well, it’s discoverable | 18:52 |
spatel | hmm | 18:52 |
spatel | so how do I make sure it's loading during service start? | 18:52 |
jrosser | I’m not sure | 18:54 |
jrosser | it almost sounds like you have magnum running in one place and the driver installed somewhere else | 18:55 |
jrosser | did you check that the magnum service actually runs from the same venv? | 18:55 |
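One way to verify that, with paths hedged: find the interpreter the conductor actually runs under, then confirm the driver package is importable from exactly that interpreter:

```bash
# Which python is magnum-conductor actually running from?
ps aux | grep '[m]agnum-conductor'
# Confirm the driver lives in that same venv (path is an assumption).
/var/lib/kolla/venv/bin/pip show magnum-cluster-api
/var/lib/kolla/venv/bin/python -c "import magnum_cluster_api; print(magnum_cluster_api.__file__)"
```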
spatel | hmm! that is a good point | 18:56 |
spatel | let me check.. hold on.. | 18:56 |
noonedeadpunk | sykebenX: I think this could also be due to the usage of session persistence that speeds up ansible dramatically, but maybe you can play with that... https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/openstack-ansible.rc#L52 | 18:56 |
spatel | but this is how I am doing it everywhere and it works in most cases.. | 18:56 |
spatel | jrosser it's in the right place - https://paste.opendev.org/show/bF2bVw5I5k1eKBaoX1Zq/ | 18:57 |
spatel | magnum and magnum-capi are in the same venv | 18:58 |
spatel | Do you think it's an SDK problem and not the real server? | 19:00 |
sykebenX | noonedeadpunk: Thank you! I will investigate that | 19:01 |
spatel | jrosser no kidding it works.... | 19:06 |
spatel | looks like i found something... | 19:06 |
jrosser | oh?! | 19:06 |
spatel | just hold on.. let me prove myself | 19:07 |
spatel | jrosser here is the culprit https://paste.opendev.org/show/brp45S4hl9YgtuKAoYBs/ | 19:09 |
spatel | this line was in the magnum.conf file | 19:09 |
spatel | somehow 2024.1 has this option added | 19:10 |
spatel | as soon as I removed it, all good | 19:10 |
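The pasted culprit is not preserved here, but given the kolla commit linked further down, the line was plausibly a [drivers] disabled_drivers entry that kolla renders when no management-cluster kubeconfig is configured, and which also masks the vexxhost driver. A hedged sketch of what to look for:

```bash
# Check magnum.conf for a driver-masking option of this shape
# (the option value shown is an assumption, not the actual paste contents).
grep -A1 '^\[drivers\]' /etc/magnum/magnum.conf
# [drivers]
# disabled_drivers = <drivers claiming the (vm, ubuntu, kubernetes) tuple>
```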
noonedeadpunk | stackhpc influence on kolla ?:) | 19:10 |
spatel | Yes yes.. they are pushing their driver | 19:11 |
noonedeadpunk | if you ever happen to try out their driver - I will be all ears to hear about it | 19:11 |
jrosser | that is probably because it uses the same vm/ubuntu/k8s tuple, so they cannot co-exist | 19:12 |
noonedeadpunk | I actually wonder if our control cluster could be used "as is" with their driver | 19:12 |
noonedeadpunk | yup | 19:12 |
jrosser | probably | 19:12 |
noonedeadpunk | which is in fact kind of a stupid problem to have... imo, it's high time to invent a better label for images... | 19:13 |
noonedeadpunk | anyway | 19:13 |
spatel | Agreed... | 19:14 |
spatel | This is damn confusing with the same variable but multiple drivers | 19:14 |
spatel | it took 3 days of my life :( | 19:15 |
noonedeadpunk | also having an "ubuntu" value select magnum drivers... is not intuitive | 19:15 |
noonedeadpunk | also os_distro is indeed used to define the distribution, and we mark all existing images with os_distro... | 19:16 |
noonedeadpunk | due to https://docs.scs.community/standards/scs-0102-v1-image-metadata/#technical-requirements-and-features | 19:16 |
noonedeadpunk | so it produces soooo many more issues.... | 19:17 |
spatel | Thank you jrosser for helping me out.. I like that entry-point stuff.. good to know | 19:22 |
-opendevstatus- NOTICE: Gerrit will undergo a short restart to pick up some bugfixes for the 3.10 release that we upgraded to. | 19:24 |
jrosser | spatel: https://opendev.org/openstack/kolla-ansible/commit/4879656058c43a88eb934ae0c61251aaa5ff82b9 | 19:45 |
jrosser | this all seems to be because of not defining magnum_kubeconfig_file_path | 19:45 |
spatel | Just found that from slack while talking to a kolla developer - magnum_kubeconfig_file_path is not defined | 19:45 |
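Per that commit, the knob is a kolla-ansible variable; a sketch of the fix, with the kubeconfig path as an assumption:

```bash
# Point kolla-ansible at the CAPI management-cluster kubeconfig so its
# magnum.conf template stops masking the driver (the path is hypothetical).
cat >> /etc/kolla/globals.yml <<'EOF'
magnum_kubeconfig_file_path: /etc/magnum/kubeconfig
EOF
# then re-render the magnum config
kolla-ansible -i inventory reconfigure --tags magnum
```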
jrosser | snap :) | 19:46 |
spatel | haha! bummer (This is a bad variable.. it should throw an error instead of being a silent killer) | 19:46 |
jrosser | anyway good to understand - nice it’s working now | 19:48 |
spatel | indeed.. lots of hidden jewels we discovered | 19:52 |