Wednesday, 2024-01-31

*** zigo_ is now known as zigo09:43
jrossermorning11:53
noonedeadpunko/12:28
opendevreviewAndrew Bonney proposed openstack/openstack-ansible master: WIP: [doc] Update distribution upgrades document for 2023.1/jammy  https://review.opendev.org/c/openstack/openstack-ansible/+/90683213:03
noonedeadpunkandrewbonney: I've commented our solution to one of your TODO items fwiw13:13
andrewbonneyTa. I've got a copy of your script, just haven't got to using it during this process yet13:13
noonedeadpunkah, ok, was not sure if I've ever shared this13:13
noonedeadpunkalso likely it should be polished a bit... but well13:14
jrossernoonedeadpunk: do you know how nested virt is supposed to work in CI jobs?13:14
jrosseri see that there are nested virt enabled flavors here https://opendev.org/openstack/project-config/src/branch/master/nodepool/nl03.opendev.org.yaml#L147-L18813:14
noonedeadpunki somehow thought that nested virt is a requirement from infra side before joining13:15
jrosserbut we always make `nova_virt_type: qemu` in user_variables for any zuul job13:15
jrosserand this is terribly slow for the capi job, like 10x slower13:15
noonedeadpunkqemu is very slow, yes13:16
noonedeadpunkas generally tempest just spawns VMs with cirros to check connectivity and drops them13:16
jrosserso i was wondering if we are somehow doing it wrong, and the nodeset choice is supposed to decide if it's nested virt or not13:16
noonedeadpunkor well, maybe cases with octavia where LBs are spawned, but all cases have quite small workload13:17
jrosserwe can detect this at runtime with `cat /sys/module/kvm_amd/parameters/nested`13:18
noonedeadpunkso you mean that with nested virt we should be able to use kvm?13:18
jrosser(or kvm_intel as needed)13:18
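The runtime check jrosser describes could be sketched roughly as below. This is only an illustration, not the code from the proposed change: the function name `nested_virt_enabled` and the mapping of the result to a `kvm`/`qemu` choice are assumptions, and the accepted values (`1`, `Y`) reflect what the kvm_amd and kvm_intel modules commonly report for this parameter.

```python
from pathlib import Path


def nested_virt_enabled(sys_module_dir="/sys/module"):
    """Return True if a loaded KVM module reports nested virtualisation enabled.

    Hypothetical helper: checks the `nested` parameter of kvm_intel and
    kvm_amd under the given sysfs module directory.
    """
    for module in ("kvm_intel", "kvm_amd"):
        param = Path(sys_module_dir) / module / "parameters" / "nested"
        try:
            value = param.read_text().strip()
        except OSError:
            continue  # module not loaded (or not readable) on this host
        if value in ("1", "Y", "y"):
            return True
    return False


# Assumed usage: fall back to plain qemu emulation when the check fails.
virt_type = "kvm" if nested_virt_enabled() else "qemu"
print(virt_type)
```

As the rest of the discussion shows, a positive result here does not guarantee that nested virt actually works on a given provider, so a check like this is probably best treated as a hint rather than a decision.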
jrosseryeah so in my cloud here i am running AIO for capi with nested virt enabled13:18
jrosserand because it's not a zuul job it actually is using kvm accel13:18
noonedeadpunkand the holds you saw all had it?13:19
jrosserand the magnum cluster creates in 4 mins13:19
jrosserand in the CI hold i have, the nodes are nested virt enabled, but we set it to qemu in zuul user_variables13:19
noonedeadpunkyeah, I see13:19
jrosserand then it takes like 40mins+ to create the capi cluster13:19
noonedeadpunkmakes sense to use kvm when we can, sure13:20
jrosserand the CPUs are just pegged at 100% on qemu processes13:20
noonedeadpunkmaybe this will get some boost for other jobs as well13:20
jrosseryeah, and i guess if i can make it dynamic detection then if the provider supports it, it will use it13:21
noonedeadpunkI feel like it should be >90% of cases...13:22
noonedeadpunkLike testing kata would be impossible otherwise...13:23
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Enable nested virtualisation in the AIO when it is available  https://review.opendev.org/c/openstack/openstack-ansible/+/90732714:11
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver  https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/90519914:12
* noonedeadpunk wonders if execution time will drop significantly14:23
jrosseri guess in a normal job this will only affect booting cirros, so perhaps not14:23
jrosserbut the capi job makes amphora and two ubuntus......14:23
noonedeadpunkyeah, true... 14:23
noonedeadpunktempest run still takes 8mins...14:25
noonedeadpunkSo if it drops to 3 - 5 mins, that's noticeable...14:25
jrosserthat would be good14:25
noonedeadpunkjrosser: seems it's not working out: https://zuul.opendev.org/t/openstack/build/bad24e6c50114083a56c7ae392f99dfe/log/logs/openstack/aio1-utility/tempest_run.log.txt15:17
noonedeadpunkI know it's a distro job, but it fails very much like it would with such a change15:17
noonedeadpunkwith the VM crashing on startup15:18
jrosserhmm maybe it needs to be opt-in for the job15:26
noonedeadpunkand you have KVM not QEMU locally?15:27
noonedeadpunkLike, you sure about that?:)15:27
noonedeadpunkjrosser: as we are already trying to guess kvm/qemu here: https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/master/tasks/nova_virt_detect.yml15:29
noonedeadpunkso eventually, we could just undefine it and rely on nova role...15:30
jrosserargh i see15:31
opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Allow nova role to detect virtalisation type in CI jobs  https://review.opendev.org/c/openstack/openstack-ansible/+/90732715:36
jrossernoonedeadpunk: in my local AIO vm i see `-accel kvm` on the /usr/bin/qemu-system-x86_64 process15:37
noonedeadpunkmhm, I see15:37
noonedeadpunkjrosser: fwiw, it's failing quite dramatically as well17:03
noonedeadpunkbut not for everything....17:04
jrosserok so there is probably very good reason to have nested virt specific nodesets17:04
noonedeadpunkI wonder what's wrong though, and why it's not caught by the nova role then17:05
jrosseras it may just be br0k in certain providers17:05
noonedeadpunkas conditions there looked about right17:05
jrosserwell if you have unfortunate kernel troubles between host/guest doesn't this all go a bit bad?17:05
jrosserafaik it's very sensitive to host side things17:05
jrosserif you codesearch for the virt_type stuff it's just tons of "use qemu in CI" comments all over17:07
noonedeadpunkyeah, that's probably the reason why it's there17:09
jrosser anyway - the node type for the capi job is one where this is actually supposed to work17:12
jrosserso we still need a way to opt in17:12
jrosserOMG it works https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199?tab=change-view-tab-header-zuul-results-summary17:44
jrosser\o/17:44
noonedeadpunkwow17:56
noonedeadpunkthat is veeeery promising :)17:56
noonedeadpunkonly 2h:)17:57
noonedeadpunkeven faster than upgrade job :D17:57
jrosserthey are fast nodes18:47
TheCompWizjrosser: I finally broke down and tried the AIO install, and I'm getting yet-another-brick wall.  The bootstrap_aio script dies when trying to "Download EPEL gpg keys"19:32
TheCompWizwait... why on earth would it be trying to install centos EPEL keys on ubuntu?19:38
noonedeadpunkTheCompWiz: that is the correct question :D19:39
TheCompWizansible facts say distribution is ubuntu... 19:50
jrossercan you share the output?19:50
TheCompWizhttps://paste.openstack.org/show/bPrmCUBAQKp3mRy0LpGb/19:51
TheCompWizthe ansible facts shows this: "ansible_os_family": "Debian"19:51
TheCompWizand everything I see says the "Install EPEL" block shouldn't even be run... 19:52
TheCompWizbrb 1 sec...   switching PCs.19:54
*** TheCompWiz is now known as TCW19:55
TheCompWizback19:56
jrosserits not to do with debian or not19:58
jrosserthe error says `error while evaluating conditional ('s3fs' in systemd_mount_types)`19:58
jrosserseeing the output of tasks to do with setting up the AIO storage is going to be the only way to understand this20:11
noonedeadpunkTheCompWiz: to be more specific `'_bootstrap_host_data_disk_device' is undefined.`20:36
TheCompWizwhich is odd... because I set export BOOTSTRAP_OPTS="bootstrap_host_data_disk_device=sdb bootstrap_host_data_disk_fs_type=xfs bootstrap_host_public_interface=ens34"20:37
TheCompWizbefore running bootstrap.20:37
TheCompWizand yes, sdb does exist.20:37
TheCompWizand was partitioned & formatted.20:37
noonedeadpunkaha, ok, that's important input20:39
dmsimard[m]noonedeadpunk: are you going to fosdem after all? :P20:39
noonedeadpunkdmsimard[m]: nah :( Like I even got time and funding, but task workload is just /o\20:39
noonedeadpunkso next time I guess20:40
dmsimard[m]I know the feeling, there'll be a next time no stress20:40
noonedeadpunkTheCompWiz: that actually should have worked....20:47
noonedeadpunkI'd need to try to reproduce that, but only tomorrow :(20:48
noonedeadpunkeventually then it should have failed waaaay earlier I guess20:49
noonedeadpunklike here: https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/tasks/check-requirements.yml#L135-L14420:50
noonedeadpunkTheCompWiz: oh.... what if you try to drop partition table?20:53
noonedeadpunkthat looks like a bug20:53
noonedeadpunkso we define _bootstrap_host_data_disk_device only when no partition table exists20:54
noonedeadpunkhttps://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/tasks/prepare_data_disk.yml#L36-L3820:54
noonedeadpunkBut the next task assumes it's defined and uses another condition20:55
noonedeadpunkwhich I think is exactly what you see20:55
noonedeadpunkso solution - add disk, set everything the same, but don't bother yourself with partition creation on the drive :D20:55
noonedeadpunkTheCompWiz: ^20:55
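The failure mode noonedeadpunk describes (a value set under one condition and then read under a different one) can be modelled in a few lines. This is a simplified sketch and not the bootstrap-host role's actual tasks: the function `prepare_data_disk` and the second condition (a device was requested) are assumptions made purely for illustration; only the variable name comes from the discussion.

```python
def prepare_data_disk(device, has_partition_table):
    """Toy model of two steps guarded by different conditions."""
    facts = {}

    # Step 1: the fact is only set when the disk has no partition table.
    if not has_partition_table:
        facts["_bootstrap_host_data_disk_device"] = device

    # Step 2: guarded by a different (assumed) condition, so it can run
    # even when step 1 was skipped and the fact was never set.
    if device is not None:
        return facts["_bootstrap_host_data_disk_device"]  # KeyError if unset
    return None


# An unpartitioned disk works; a pre-partitioned one hits the undefined value.
print(prepare_data_disk("sdb", has_partition_table=False))
try:
    prepare_data_disk("sdb", has_partition_table=True)
except KeyError as exc:
    print(f"undefined: {exc}")
```

In this model the workaround of wiping the partition table simply makes step 1 run again, which matches the suggestion given above.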
TheCompWiznoonedeadpunk: wiped partitions, and re-running bootstrap-aio.  I'll paste the results.21:22
TheCompWizhmmm... that seems to have worked.  ... not sure how I ended up in that situation.21:24
spatelHow to force detach volume?21:54
spatelI have one VM that has a single volume attached twice21:54
