Thursday, 2023-07-13

ftl_masonildikov: Based on the installer logs, it appears that the manifest for k8s dashboard was referencing cert-manager where kubernetes-dashboard was expected.00:09
ftl_masonI'm going to create a Launchpad account and log an issue there.00:10
ildikovOh, interesting00:10
ildikovThanks for looking into it!00:10
ftl_masonHowever, it doesn't look like I'm out of the woods yet.  I was able to login to the controller's web UI, but when I got there and reviewed the alerts it appeared there was an issue with the controller.  It recommended that I lock and unlock the controller, which I did.  This triggered a reboot, but the VM won't come back up.  It's just stuck at the bootloader.00:11
ftl_masonI'm running VirtualBox on Ubuntu 22.04.  I don't know if the problem is VirtualBox or StarlingX, but thus far my experience setting up a controller has been really poor.  I'm 3 full days into this and still don't have a functioning controller.00:13
ftl_masonI'm not sure in which direction I should head now.  I'm thinking that perhaps the VM route is a dead end right now and I should just try bare-metal?  I think I have enough resources kicking around my office that I can build an AIO-SX system on bare metal and hopefully spin up an AIO-SX subcloud on another machine.00:16
ftl_masonHowever, I'm struggling a bit to understand exactly what the network interface configuration needs to look like to accomplish that.  I've seen some suggestions that it can all be done on a single NIC with VLANs, some suggestions that I need multiple discrete networks and some other permutations.  I'll try going through the documentation again and see if I can get this working on bare metal.00:19
ftl_masonRight now the main driver behind this effort is simply to find out what the real-world minimum requirements are for an AIO-SX subcloud.  The documentation suggests that I need a minimum of a Xeon-D and a lot of RAM, yet I heard from a few people at the conference that it'll run in a VM with as little as 8GB of RAM and 4 vCPUs.  I want to know what I can actually get away with.00:23
ftl_masonI have to run out for a few min.  If anyone responds, I'll continue the conversation shortly.00:24
ftl_masonI'm back.00:44
ildikovHmm, that’s not good :(01:05
ildikovI’ve never done the install myself01:05
ildikovOutBackDingo: have you or your team ever done the virtual setup? ^^01:06
ildikovOr maybe able to suggest something on network config for bare metal?01:06
OutBackDingoyes we have on kvm, and then i filed patches for it, which i could never complete01:06
OutBackDingosince then its been "closed"01:07
ildikovOk, so not everything got fixed then01:07
ildikovI know Bruno’s team picked some of that up, but I think he signed off for the day already01:08
OutBackDingono... but we did have it up on kvm, it was a matter of only changing the machine model for alternative operating system ... ie... fedora01:08
OutBackDingolet me finish what im into and read the conversation, see if i can help01:09
OutBackDingogive me like 30 mins01:09
ildikovSounds good, thank you!01:09
ftl_masonOutBackDingo: Thank you!01:09
OutBackDingoftl_mason: so first whats the environment kvm or vbox ?01:10
OutBackDingoVirtualBox on Ubuntu 22.0401:11
ftl_masonI'm willing to do whatever works.  Both Eddy and Bruno have recommended the VirtualBox route, as it seems like that's where most of the work has been.01:11
OutBackDingopffft :)01:11
OutBackDingoto be quite honest, the kvm way just worked out of the box.... previously, not sure why any of that would have changed as all their work was vbox based01:12
ftl_masonIt's possible it's just me doing something odd or misunderstanding the documentation.01:13
OutBackDingoand what git repo did you use01:13
ftl_masonhttps://opendev.org/starlingx/virtual-deployment01:13
OutBackDingoand what doc ?01:14
OutBackDingojust so i can see what you followed01:14
ftl_masonI also tried this one https://github.com/zbsarashki/stx-labs-openInfraVancouver2023/tree/main01:14
ftl_masonThe URL I just posted shows the instructions that I followed from that repo.01:15
ftl_masonThese are the instructions that I followed for the Pybox install.  This install worked just fine, but when I locked and unlocked the VM it triggered a reboot and the VM wouldn't boot after that.  https://opendev.org/starlingx/virtual-deployment/src/branch/master/virtualbox/pybox01:16
ftl_masonI haven't played around with VirtualBox to see if I can get the VM to boot again.01:16
OutBackDingohttps://github.com/zbsarashki/stx-labs-openInfraVancouver2023/tree/main/libvirt looks viable... however you also have already installed vbox, which honestly i dont use... never saw the need for it as kvm just worked01:17
OutBackDingoand this is the repo and guide that we based our fedora deployment from https://github.com/zbsarashki/stx-labs-openInfraVancouver2023/tree/main/libvirt01:19
OutBackDingoso it should also work fine on ubuntu out of the box01:19
ftl_masonI followed these instructions for my first attempt https://docs.starlingx.io/r/stx.8.0/deploy_install_guides/release/virtual/aio_simplex.html01:19
ftl_masonWhich really didn't work.01:19
ftl_masonI must be missing something really obvious then.01:20
OutBackDingommm possibly... 01:21
OutBackDingolet me try something right quick01:21
ftl_masonSure01:21
OutBackDingook...01:54
OutBackDingothis works so far https://github.com/zbsarashki/stx-labs-openInfraVancouver2023/tree/main/libvirt01:54
OutBackDingo./setup_configuration.sh -i /var/lib/libvirt/images/pool/starlingx-intel-x86-64-cd.iso -c simplex01:55
OutBackDingoand you have to change stxbr to madbr01:56
OutBackDingoin the setup_configuration.sh01:56
OutBackDingo```❯ brctl show01:56
OutBackDingobridge name      bridge id          STP enabled  interfaces01:56
OutBackDingobr-98a1f1554240  8000.0242824763d4  no           veth143a4e401:56
OutBackDingodocker0          8000.02429a98242f  no01:56
OutBackDingomadbr1           8000.460c19283202  no           vnet5401:56
OutBackDingomadbr2           8000.3eac164718e6  no           vnet5501:56
OutBackDingomadbr3           8000.ba6c78ed84cc  no           vnet5601:56
OutBackDingomadbr4           8000.929c5f4a71a8  no           vnet5701:56
OutBackDingovirbr0           8000.525400ca2351  yes```01:56
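
A minimal sketch of the bridge rename described above, assuming the bridge prefix appears as the literal string stxbr in setup_configuration.sh and that madbr is simply the prefix OutBackDingo chose. The helper name rename_bridge_prefix is hypothetical and not part of the repo; it keeps a .bak copy so the change can be reverted.

```shell
# Hypothetical helper (not from the repo): replace a bridge-name prefix in a
# script and keep the untouched original as FILE.bak.
# Usage: rename_bridge_prefix FILE OLD NEW
rename_bridge_prefix() {
    local file="$1"
    local old="$2"
    local new="$3"
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    # -i.bak edits in place and writes the original to FILE.bak
    sed -i.bak "s/${old}/${new}/g" "$file"
}

# Example, run from the libvirt directory of the repo:
#   rename_bridge_prefix setup_configuration.sh stxbr madbr
```

Running brctl show afterwards, as in the paste above, is a quick way to confirm the bridges came up under the new names.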
ftl_masonOk, I'll try that later this evening.  I'll let you know how I make out.  Thanks for your help! 01:57
brunomunizOk, I had to go through the logs to make sure I read everything http://eavesdrop.openstack.org/irclogs/%23starlingx/07:13
brunomunizRegarding the VM being stuck at the bootloader, I remember some reports related to that and us switching mostly to graphical install type to avoid that (via the "--install-mode graphical" parameter). IIRC on existing VMs, connecting to the serial port via "socat" would also unstick the VM. =| 07:17
haoiihow do I connect to the serial port via socat?07:20
brunomunizSomething like "socat <address> stdio,raw,escape=0x1d,echo=0,icanon=0"07:24
brunomunizYou can find the address on the virtualbox config (either via CLI or GUI)07:25
brunomuniz"vboxmanage list vms --long | grep 'UART 1'" to take a quick look at what you might have there.07:27
haoiivboxmanage list vms --long | grep 'UART 1' UART 1:                      I/O base: 0x03f8, IRQ: 4, attached to pipe (server) '/tmp/STX8-AIOSX_serial', 16550A UART 1:                      I/O base: 0x03f8, IRQ: 4, attached to pipe (server) '/tmp/hli_StarlingX-controller-0_serial', 16550A07:28
haoiiit is the bottom one I am trying to fix, the VirtualBox VM starts but it is stuck on a black screen.07:29
brunomunizTry the socat with this /tmp/hli_StarlingX-controller-0_serial address and hit enter when it connects (it should just hang there for a while)07:30
haoiiI am sorry, this is not my field of competence. There will be some dumb questions, but I do not understand what the serial address is.07:34
brunomunizThere might be some dumb answers as well :) 07:40
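
A small sketch for pulling the pipe path out of the UART 1 lines shown above, so it can be handed to socat. The function name uart_pipe_paths is made up for illustration, and the pattern only assumes the "attached to pipe (server) '<path>'" wording seen in haoii's paste; other VirtualBox versions may print this differently.

```shell
# Hypothetical filter: read `vboxmanage list vms --long` output on stdin and
# print the host pipe path of every UART that is attached to a pipe.
uart_pipe_paths() {
    # In basic regex, ( and ) are literal; \( \) capture the quoted path
    sed -n "s/.*attached to pipe ([a-z]*) '\([^']*\)'.*/\1/p"
}

# Example:
#   vboxmanage list vms --long | grep 'UART 1' | uart_pipe_paths
```

The output lists one path per matching VM but does not say which VM owns which pipe, so the pipe name still has to be matched to the VM by eye.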
brunomunizThink of it as just a way to interact with the VMs console.07:40
brunomuniz(the same thing that you usually see when you start a VM in VirtualBox and it opens up in a window - if you're on a GUI environment)07:41
haoiiAh, I understand. It seems that socat needs two addresses, what is the second address I should give it?07:44
brunomunizI just do an address and then a set of parameters for the connection07:49
brunomunizIn my case, for example, I just did "socat TCP4:localhost:10001 stdio,raw,escape=0x1d,echo=0,icanon=0"...07:51
brunomunizFor you it should be "socat /tmp/hli_StarlingX-controller-0_serial stdio,raw,escape=0x1d,echo=0,icanon=0"07:52
brunomuniz(the escape sequence would be CTRL+] then <enter>, so you don't get stuck in yet another place)07:53
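
The two socat invocations above differ only in the address, so they can be sketched as one helper. This is an illustration, not part of pybox: serial_console_cmd is a hypothetical name, and it only prints the command rather than running it. For host pipes it uses the UNIX-CONNECT: form that daniel-caires reports later in this log, on the assumption that a path starting with / is a VirtualBox host pipe (a UNIX socket on Linux).

```shell
# Hypothetical helper: print (do not run) a socat command for a VM serial console.
# A leading "/" is treated as a host-pipe path; anything else is passed through
# unchanged, for example TCP4:localhost:10001.
serial_console_cmd() {
    local addr="$1"
    # escape=0x1d makes CTRL+] the key that ends the session
    local opts="stdio,raw,escape=0x1d,echo=0,icanon=0"
    case "$addr" in
        /*) printf 'socat UNIX-CONNECT:%s %s\n' "$addr" "$opts" ;;
        *)  printf 'socat %s %s\n' "$addr" "$opts" ;;
    esac
}

# Examples:
#   serial_console_cmd /tmp/hli_StarlingX-controller-0_serial
#   serial_console_cmd TCP4:localhost:10001
```

Printing the command first makes it easy to check the address before attaching to the console.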
haoiiI still got a black screen. Have not gotten anything on the screen on this one. Have one which I tried with libvirt, but there I have other problems:p07:59
brunomunizIt takes some time to restart. If you connect to the serial port via socat then restart the VM (on another terminal or via GUI) do you see anything?08:01
brunomuniz(sorry, brb)08:04
brunomunizRegarding the virtualization tool, we're doing most of the work in VirtualBox because we got feedback from the community in at least two different meetings back in March/April that it's what most devs use, so it made sense for us to put most of our work there (also being the one with an existing automation that hadn't been touched in a while)08:27
haoiiThe socat then stopped. It was running as long as the VM was running, but did not give anything back.08:27
brunomunizBut I don't have any favorites. In a perfect world we would have a customizable automation/installation tool that could interface with either virtualbox or libvirt on the underlying OS.08:28
brunomunizWere you able to reconnect right after the VM restarted?08:30
brunomunizIt didn't show anything even with the VM booting up? That's weird.08:30
brunomunizIf you try VirtualBox again, try the pybox thing with "--install-mode graphical". This solved problems like this for us before, although we're not sure exactly what the problem was, tbh.08:32
haoiiokay, I might try that then. Thanks!08:34
haoiiNow I remembered my PyBox error: 2023-07-13 10:43:41,326: Expecting text within 3.0 minutes: Press08:47
haoiithis step fails, as no text appears on the screen.08:47
haoiiCan not find any error within the VirtualBox GUI08:47
haoiiIs this known to anyone, and is there a possible cause?08:48
haoiiThis time I ran the installation with --install-mode graphical instead of serial08:51
haoiistill had the same error, so assume it is not related to the installation mode08:51
haoiibeen following this guide, which was linked here one of the previous days: https://opendev.org/starlingx/virtual-deployment/src/branch/master/virtualbox/pybox#installation-and-usage09:02
brunomunizThis comes from a function that expects a given text to appear on the console. There's a few calls that use the 3 minutes timeout, mostly related to logging in for the first time and then changing the password. Can you paste the logs from a few lines above so we know what the code was doing?10:06
brunomunizHave you defined the password with "--password <something>"?10:06
brunomunizThe instructions (if you copy and paste) will use an environment variable called $STX_INSTALL_PASSWORD.10:07
brunomunizThe current version, however, does some basic validation of the password (size, special chars etc) to make sure it will be accepted later by the OS, so I'm thinking it should be something else.10:14
brunomunizThe logs right above this error should point us to what specifically is failing.10:15
daniel-cairesAbout the VM not booting, it seems to be a problem with VirtualBox when a Host Pipe is used. A review is currently in progress that will change the serial port from host pipe to TCP ( https://review.opendev.org/c/starlingx/virtual-deployment/+/887301v ). But you can do this manually: once you change to TCP, put a port as the path, then your VM should boot normally.11:10
daniel-cairesAs Bruno said, using graphical as --install-mode you will see the VM booting, otherwise it will be blank for about 6 minutes while it boots11:12
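
A hedged sketch of the manual host pipe to TCP change described above. It only prints the vboxmanage command so it can be reviewed first. The helper name uart_to_tcp_cmd is hypothetical, the VM name in the example is a guess based on the pipe name seen earlier, and port 10001 is simply the port Bruno used in his socat example. The --uartmode1 tcpserver option is the VBoxManage modifyvm setting for the first serial port; the VM needs to be powered off for modifyvm to apply.

```shell
# Hypothetical helper: print the command that would switch a VM's first serial
# port to a TCP server on the given port. Nothing is executed here.
uart_to_tcp_cmd() {
    local vm="$1"
    local port="$2"
    case "$port" in
        ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
    esac
    printf 'vboxmanage modifyvm %s --uartmode1 tcpserver %s\n' "$vm" "$port"
}

# Example (VM name is an assumption, check `vboxmanage list vms`):
#   uart_to_tcp_cmd hli_StarlingX-controller-0 10001
# Afterwards the console would be reachable with something like:
#   socat TCP4:localhost:10001 stdio,raw,escape=0x1d,echo=0,icanon=0
```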
haoiiiGot past my error, seemed to be a memory error. Now the installation ran very far, beyond the ansible playbook step. Now I want to retry, but it wont let me. Is there a good tool for sending the logs? I assume they are not wanted here as they are a bit long.11:15
daniel-cairesThere are a few stages that can only be run once, as they make some configurations that the system won't allow making twice. 11:21
haoiiiyes it wont let me retry the pybox installation process11:22
daniel-cairesI often use the --snapshot parameter so I can return to the point after some stage and run the one that failed again11:22
daniel-cairesfrom the very beginning?11:22
haoiiiyes, I tried to delete the VM to see if it helped, but it didnt. So now I have neither the VM nor the snapshots...11:23
daniel-cairesthat's weird. If anyone knows somewhere where the log can be posted so we can take a look11:25
daniel-cairesAlthough I think it did happen once to me, but it got resolved once I changed the labname11:26
haoiiiAh that helped:)11:27
daniel-cairesGreat!11:27
daniel-cairesJust one thing about the socat, may be irrelevant but the full command that works for me is "socat UNIX-CONNECT:'<tmp/address>' stdio,raw,escape=0x1d,echo=0,icanon=0" if using host pipe as serial mode 11:33
dpereira_haoiii, in the future you can use https://paste.opendev.org/ to send your log output.11:37
haoiiithanks!11:46
haoiiihttps://paste.opendev.org/show/beFo6fjdPsy7Pjq0B68R/12:21
haoiiiSo here are the logs of the error I get, after the ansible playbook stage of the pybox StarlingX installation12:21
haoiiiseems to be ssh related12:22
daniel-cairesssh is not something I'm very familiar with, so I may not be of much help but let's try :) It seems it didn't find the sshpass folder, did you install it? And rsync as well?12:28
haoiiirsync is installed it says, did install ssh now. Still cant locate the folder.12:33
brunomuniz[m](hopefully) all the dependencies are installed with: sudo apt install virtualbox socat git rsync sshpass openssh-client python3-pip python3-venv12:36
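
A quick way to double-check that the tools from the apt line above are actually on the PATH, sketched as a small function. The name missing_commands is made up, and the mapping from packages to command names in the example (virtualbox to vboxmanage, openssh-client to ssh and ssh-keygen, python3-pip to pip3) is an assumption; python3-venv has no command of its own and is not covered.

```shell
# Hypothetical helper: print each given command name that is NOT found on PATH.
# Prints nothing when everything is available.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example:
#   missing_commands vboxmanage socat git rsync sshpass ssh ssh-keygen python3 pip3
```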
haoiiiYes of course I did that step before launching the installation, so it should be fine, but apparently there is some problem.12:37
brunomuniz[m]It's not locating the known_hosts file, apparently, right?12:41
brunomuniz[m]Does the command work by itself? The "ssh-keygen -f "/home/hli/.ssh/known_hosts" -R [127.0.0.1]:3122"?12:41
haoiiibut that is done on the VM or on the host machine?12:44
brunomuniz[m]That's done on the host machine.12:44
brunomuniz[m]Try to backup whatever is there in your known_hosts file with something like "mv /home/hli/.ssh/known_hosts /home/hli/.ssh/known_hosts.bkp"12:44
brunomuniz[m]I assume hli is your username on the host machine.12:44
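
A sketch that combines the backup step above with one extra guess: if known_hosts (or ~/.ssh itself) does not exist, ssh-keygen -R has nothing to edit, so the helper leaves an empty file in place. Creating the empty file is an assumption about the failure, not something confirmed in this conversation, and reset_known_hosts is a hypothetical name.

```shell
# Hypothetical helper: back up DIR/known_hosts to known_hosts.bkp if it exists,
# then leave an empty known_hosts with the permissions ssh expects.
# Usage: reset_known_hosts DIR   (DIR is normally "$HOME/.ssh")
reset_known_hosts() {
    local dir="$1"
    local file="$dir/known_hosts"
    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1
    if [ -f "$file" ]; then
        mv "$file" "$file.bkp" || return 1
    fi
    : > "$file" || return 1
    chmod 600 "$file"
}

# Example, on the host machine:
#   reset_known_hosts "$HOME/.ssh"
#   ssh-keygen -f "$HOME/.ssh/known_hosts" -R "[127.0.0.1]:3122"
```

The old entries stay available in known_hosts.bkp and can be moved back once the installer gets past the SSH step.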
brunomuniz[m]haoiii: I believe you're not available anymore, but I just found a situation that might be similar to what you're facing. I can explain my conclusions and I can also take a look at your full logs if you want, to confirm my theory.15:28
brunomuniz[m]In the meantime, you can try two things.15:28
brunomuniz[m]1. vboxmanage natnetwork start --netname NatNetwork; vboxmanage natnetwork stop --netname NatNetwork15:29
brunomuniz[m]This was something that we noticed last week (and I found other reports about VirtualBox not being able to handle the port forwards to a Nat Network).15:30
brunomuniz[m]Just now I noticed that the port forwards looked fine (via netstat I could see the ports listening on my host), but my SSH connection (which relies on a port forward to the NatNetwork) only started working again after I recreated the whole NatNetwork on my system (delete, then re-create).15:31
brunomuniz[m]So, the second thing would be:15:32
brunomuniz[m]2. Recreate the NatNetwork in VirtualBox from scratch. I did it via GUI, but I can paste a one-liner that does it if you need.15:32
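
The one-liner Bruno offers is not in this log, so the following is only a guess at what a delete-then-recreate could look like with vboxmanage natnetwork remove and add. The helper natnetwork_recreate_cmds is hypothetical and just prints the commands. The network range is not given anywhere in the conversation, so it is a required argument, and --enable --dhcp on are assumed settings.

```shell
# Hypothetical helper: print the two commands that would delete and re-create
# a VirtualBox NAT network. Nothing is executed here.
# Usage: natnetwork_recreate_cmds NAME CIDR
natnetwork_recreate_cmds() {
    local name="$1"
    local cidr="$2"
    if [ -z "$name" ] || [ -z "$cidr" ]; then
        echo "usage: natnetwork_recreate_cmds NAME CIDR" >&2
        return 1
    fi
    printf 'vboxmanage natnetwork remove --netname %s\n' "$name"
    printf 'vboxmanage natnetwork add --netname %s --network %s --enable --dhcp on\n' "$name" "$cidr"
}

# Example (take the range from `vboxmanage list natnets` before removing anything):
#   natnetwork_recreate_cmds NatNetwork "$CIDR"
```

Removing the network also drops its port-forwarding rules, so it is worth noting the existing rules from vboxmanage list natnets first; they would have to be added again afterwards.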
