Wednesday, 2021-12-22

00:35 *** gmann_afk is now known as gmann
00:42 *** pmannidi|AFK is now known as pmannidi
08:25 *** amoralej|off is now known as amoralej
10:28 <dtantsur> morning ironic
10:32 <holtgrewe> dtantsur: good morning, the crowd here appears to be thinning out with the end of the year approaching
10:32 <dtantsur> yep, exactly :)
10:32 <dtantsur> holtgrewe: do you have a winter break?
10:35 <holtgrewe> dtantsur: usually, yes, but not this year
10:35 <dtantsur> oh
10:50 * dtantsur is wondering if noon is a good time for breakfast
11:02 <janders> good morning / afternoon dtantsur holtgrewe and Ironic o/
11:05 <holtgrewe> o/
11:10 <dtantsur> hey janders
11:26 *** sshnaidm|afk is now known as sshnaidm
12:28 <holtgrewe> I have assigned a capability "node:NODENAME" to my baremetal host. Now I want to target this particular baremetal host with nova. What would the scheduler hints look like? I'm trying to follow https://fatmin.com/2018/08/20/openstack-mapping-ironic-hostnames-to-nova-hostnames/
12:32 <dtantsur> holtgrewe: this is tripleo-specific stuff. I don't think you can just use it on a generic Ironic.
12:32 <holtgrewe> dtantsur: :-(
12:32 <dtantsur> I'm curious why you need to target a specific host with nova.
12:32 <holtgrewe> And I'm on Kolla/Kayobe.
12:32 <dtantsur> This is considered an anti-pattern.
12:33 <holtgrewe> I want to reproduce my old xCAT-based HPC deployment.
12:33 <dtantsur> If you have admin rights, I think you can ask nova for a specific hypervisor?
12:33 <dtantsur> each Ironic node becomes its own hypervisor in Nova
12:33 <dtantsur> I think hypervisor ID == node UUID
12:34 <holtgrewe> OK, I'm admin
12:34 <holtgrewe> life is hard, root passwords tend to help from time to time ;-)
12:34 <dtantsur> :)
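
For reference, a minimal sketch of verifying the mapping dtantsur describes: with the Ironic virt driver, each bare metal node is exposed as its own Nova hypervisor, and the hypervisor hostname is the Ironic node UUID (admin credentials assumed):

    openstack hypervisor list
    openstack baremetal node list --fields uuid name provision_state
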
12:55 <janders> see you tomorrow Ironic o/
12:56 <janders> (and to those who are about to start their Christmas holiday - Merry Christmas and Happy New Year!)
12:56 <janders> I'll be back for one more day tomorrow
13:03 <holtgrewe> Is there a way to show the current nova configuration, including the scheduler_filter section with applied defaults?
13:09 <dtantsur> holtgrewe: I don't know. Ironic in debug mode logs its configuration on start-up, but I'm not sure if Nova does it.
13:13 <holtgrewe> dtantsur: thanks
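
There is no single Nova command for this, but a hedged sketch for a Kolla/Kayobe deployment would be to read the rendered config inside the scheduler container (the container name nova_scheduler is assumed to be the kolla-ansible default) and, if needed, enable debug so the effective options are logged at start-up:

    docker exec nova_scheduler grep -E 'enabled_filters|available_filters' /etc/nova/nova.conf
    # With debug = True in nova.conf, the scheduler (like other oslo.config
    # based services) also logs its effective configuration at start-up.
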
13:16 *** amoralej is now known as amoralej|lunch
13:42 <holtgrewe> dtantsur: you are right, this one is specific to https://github.com/openstack/tripleo-common/blob/master/tripleo_common/filters/capabilities_filter.py
13:48 <dtantsur> holtgrewe: maybe https://docs.openstack.org/nova/xena/admin/availability-zones.html#using-availability-zones-to-select-hosts can help?
13:49 <holtgrewe> dtantsur: that looks very helpful
13:52 <dtantsur> I remember there was a way to target a hypervisor specifically, but this is all I could find
13:53 <holtgrewe> I'll ask in #openstack-kolla what they think is the best way
13:53 <dtantsur> the old docs have this: https://docs.openstack.org/nova/train/admin/availability-zones.html#using-explicit-host-and-or-node
13:53 <holtgrewe> I could inject something like the tripleo thing
13:53 <dtantsur> I'm not sure if it's still supported or not. worth trying?
13:53 <holtgrewe> dtantsur: it's still in the docs
13:53 * dtantsur does not use nova nowadays
13:55 <holtgrewe> I'm deploying via the openstack collection... which only has this availability zone...
13:59 <TheJulia> Admin users should be able to request instances to be deployed to specific hypervisors in nova
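
For reference, a hedged sketch of the "explicit host and/or node" targeting from the Nova docs linked above; whether the syntax is still accepted depends on the Nova release, and every name/UUID here is a placeholder. With admin credentials:

    openstack server create \
        --image my-image \
        --flavor my-baremetal-flavor \
        --network my-network \
        --availability-zone nova::<ironic-node-uuid> \
        my-instance
    # The availability zone argument is zone:host:node; with Ironic, the
    # "node" part is the hypervisor hostname, i.e. the Ironic node UUID.
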
14:00 <holtgrewe> TheJulia: that functionality does not appear to be exposed via the Ansible collections. Anyway, mgoddard convinced me to see the light: define one flavor per host generation and, rather than the HPC hammer/nail, go for a baremetal cloud.
14:02 <TheJulia> Heh, okay. I mean… if you were to propose a patch to the ansible modules… we might know some people :)
14:03 <TheJulia> Flavor per host doesn’t sound that ideal either, but that is the far friendlier option to users
14:03 <holtgrewe> TheJulia: thanks for the offer.
14:03 <TheJulia> Anyway, time to wake up, make coffee, and load up the car
14:03 <holtgrewe> TheJulia: I guess all I needed was the reminder that one should try to use one's shiny new tools in their idiomatic ways.
14:04 <TheJulia> It does often help
14:05 <mgoddard> TheJulia: I think holtgrewe means flavor per host type/generation rather than per host
14:06 <holtgrewe> yes
14:06 <TheJulia> Oh, good
14:07 * TheJulia hears tons of coyotes from her kitchen and wonders if it is because she is on IRC at the moment
14:10 <holtgrewe> And here is my brand new flavor bm.2020-11-c6420 ...
14:11 <TheJulia> Yay!
14:11 * TheJulia makes coffee, otherwise the drive to Flagstaff today will not be fun
14:27 <timeu_> holtgrewe: FYI we use traits or CUSTOM_RESOURCE on the flavors to target specific node types (cpu, memory, gpu, etc) for our test baremetal HPC cluster. works quite well.
14:27 <dtantsur> good morning TheJulia
14:28 <holtgrewe> timeu_: thanks for the info, I'm actually dealing with ~8 delivery batches of hardware here, having a flavor for each works well enough I think
14:28 <holtgrewe> I need to put some labels statically into the slurm.conf anyway.
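
For reference, a hedged sketch of the flavor-per-hardware-generation approach being discussed, following the usual Ironic resource-class scheduling pattern; the class name, flavor name and sizes are placeholders:

    # Tag every node of one delivery batch with the same resource class:
    openstack baremetal node set <node> --resource-class baremetal.2020-11-c6420

    # Create a flavor that requests exactly one node of that class and
    # zeroes out the standard VCPU/RAM/disk accounting:
    openstack flavor create --vcpus 40 --ram 192000 --disk 400 bm.2020-11-c6420
    openstack flavor set bm.2020-11-c6420 \
        --property resources:CUSTOM_BAREMETAL_2020_11_C6420=1 \
        --property resources:VCPU=0 \
        --property resources:MEMORY_MB=0 \
        --property resources:DISK_GB=0

    # Traits, as timeu_ mentions, can be layered on top, e.g.:
    # openstack flavor set bm.2020-11-c6420 --property trait:CUSTOM_FAST_NVME=required
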
14:29 * holtgrewe is bumping the quotas for the hpc project to 999,999,999
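
A minimal sketch of what that quota bump might look like (the project name and exact limits are assumptions):

    openstack quota set --instances 999999999 --cores 999999999 --ram 999999999 hpc
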
14:29 <timeu_> yeah we render our slurm config dynamically based on the heat output + ansible for the various node types. works quite well
14:29 *** amoralej|lunch is now known as amoralej
14:29 <holtgrewe> I'm not yet at the point where I am going down the rabbit hole of heat
14:30 * dtantsur feels like a lot of experience exchange could be happening
14:30 <timeu_> we have been using heat but we are considering switching to terraform
14:30 <timeu_> because it gives you a bit more transparency about what is happening when you scale the stack up/down
14:31 <timeu_> in general it works well, however writing the heat templates is much more verbose/cumbersome than HCL
14:32 <holtgrewe> timeu_: for now I'm replacing proxmox for VMs and xCAT for bare metal with openstack
14:32 <holtgrewe> I have existing Ansible playbooks for the old infrastructure and now want to do that switch first
14:33 <holtgrewe> next up would be looking at having neutron configure the switches (I understand that's possible)
14:33 <timeu_> yeah, sounds like a good plan
14:33 <holtgrewe> My use case is like ... one HPC
14:33 <holtgrewe> ;-)
14:34 <timeu_> yeah that's basically what we are planning to do for the next iteration of our HPC system. Right now it's fully virtualized on OpenStack but the goal is to move it to a baremetal deployment on OpenStack.
14:34 <timeu_> we do also integrate ironic with neutron but we are using Cisco SDN
14:35 <dtantsur> have you considered doing a talk for the opendev summit next year?
14:35 <holtgrewe> Actually, once I have a sizeable portion of nodes ported over, iDRAC/BIOS settings cleaned up, everything booting from UEFI, ... I need to get my all-NVMe CephFS up and running and tuned so I can office-space-printer-scene my HDD-based GPFS system
14:35 <timeu_> but it does work quite well, including trunk ports, so we can get rid of 200 1G cables and just go with the 100G ones (should also improve air flow in the racks), and the 1G cables/NICs flap like crazy
14:35 <holtgrewe> timeu_: OK... I don't understand much about SDN
14:36 <timeu_> make sure to record it then ;-)
14:36 <holtgrewe> timeu_: we have 2x10GbE/2x40GbE networking in the old part of the cluster, old blade enclosures
14:36 <holtgrewe> and 2x25GbE/2x100GbE in the new part of the cluster
14:36 <holtgrewe> I currently have 10MB/sec throughput on the HDD file system
14:37 <timeu_> quite a bit of connectivity there ;-)
14:37 <holtgrewe> with the NVMe system I'm pretty certain the network will be the bottleneck
14:37 <holtgrewe> yeah, life science sure wants their I/O
14:37 <holtgrewe> CPU cycles per GB are quite low compared to things like FEM simulations
14:37 <timeu_> yeah I think with NVMe 100G is required.
14:38 <holtgrewe> some people might say that it's essentially "gunzip -c | awk | gzip" performance-wise ;-)
14:38 <holtgrewe> And I'm not even looking at the imaging people who either do 100M images of 4kb or large HDF files with only random seeks.
14:39 <timeu_> yeah we also have those, but it's true that an all-flash parallel filesystem gives you the biggest performance boost across the board for various HPC workloads
14:40 <timeu_> but yeah we also have all kinds of edge cases with regard to I/O workloads, including the thousands of small temporary files in one folder that are usually poison for a parallel filesystem ;-)
14:40 <timeu_> I can feel the pain
14:41 <holtgrewe> timeu_: yes... if I can move my 24 recent nodes across data centers early next year, I can connect modern CPUs with 2x25GbE to the ceph system on one switch. I'm interested in how the IO500 benchmark looks there
14:41 <holtgrewe> Well, rather thousands of small files on NVMe than on HDD ;-)
14:41 <holtgrewe> even the GPFS metadata lives on HDD, fast spinning ones, but still
14:42 <timeu_> yeah, but we also saw our all-flash beegfs cluster crawl due to metadata operations in certain cases, but it's definitely better than rust ;-)
14:42 <holtgrewe> ;-)
14:43 <holtgrewe> What's your experience with BeeGFS? When I was researching it, it always felt more like a /scratch than a /home
14:43 * holtgrewe wonders whether that's actually OT here
14:45 <timeu_> yeah it's definitely more than a /scratch, although it has been very reliable. Almost no crashes, but we have a relatively small setup (~300TB, 12 nodes).
14:45 <holtgrewe> timeu_: sounds good
14:45 <timeu_> there are features such as mirroring (metadata + data) but they come with performance overhead, so we don't use them
14:46 <holtgrewe> I've used two clusters ... 8 and 12 years ago, that had a dying ... FraunhoferFS, as it was called back then
14:47 <holtgrewe> But that's ages ago
14:47 <holtgrewe> it's good to hear that things have improved
15:38 <opendevreview> Merged openstack/ironic-python-agent stable/ussuri: Re-read the partition table with partx -a  https://review.opendev.org/c/openstack/ironic-python-agent/+/821785
15:38 <opendevreview> Merged openstack/ironic-python-agent stable/ussuri: Re-read the partition table with partx -a, part 2  https://review.opendev.org/c/openstack/ironic-python-agent/+/821786
15:38 <opendevreview> Verification of a change to openstack/ironic-python-agent stable/train failed: Re-read the partition table with partx -a  https://review.opendev.org/c/openstack/ironic-python-agent/+/821791
16:17 <dmellado> dtantsur: hi again man xD
16:18 <dtantsur> o/
16:18 <dmellado> I'm hitting an issue which I'm not sure about
16:18 <dmellado> Failed to prepare to deploy: Could not link image http://192.168.200.8:8082/ipa.kernel from /var/lib/ironic/master_images/1b75d649-9e6d-5c0a-966c-36b1a84264fc.converted to /httpboot/ef4bbe4a-3ecd-4bae-888e-52c406707206/deploy_kernel, error: [Errno 18] Invalid cross-device link: '/var/lib/ironic/master_images/1b75d649-9e6d-5c0a-966c-36b1a84264fc.converted' -> '/httpboot/ef4bbe4a-3ecd-4bae-888e-52c406707206/deploy_kernel'
16:18 <dmellado> does this ring a bell?
16:18 <dmellado> if not, I'll try debugging, but it may be one of those 'oh, that' things
16:19 <dtantsur> dmellado: yeah. something we should likely document: our caches work via hardlinking.
16:19 <dtantsur> so if you have /var and /httpboot on different devices, it won't work
16:19 <dtantsur> is that the case?
16:20 <dmellado> let me check
16:21 <dmellado> https://paste.openstack.org/show/boRymb2crVqUTDjB73ds/
16:21 <dmellado> I don't see anything extraordinary there
16:22 <dmellado> maybe I should try to hardlink that myself
16:22 <dtantsur> yeah, give it a try
16:23 <dmellado> aha, so it failed as well, interesting
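
For reference, a minimal sketch of reproducing that check by hand, using the paths from the error above. A hardlink can only be created within a single filesystem, so the device IDs of both ends have to match:

    stat -c '%d %m %n' \
        /var/lib/ironic/master_images/1b75d649-9e6d-5c0a-966c-36b1a84264fc.converted \
        /httpboot/
    # If the device numbers differ, ln fails exactly like Ironic did:
    ln /var/lib/ironic/master_images/1b75d649-9e6d-5c0a-966c-36b1a84264fc.converted \
        /httpboot/test-hardlink   # expected: "Invalid cross-device link"
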
16:23 <dtantsur> honestly, we should probably move Ironic stuff to /var/lib completely
16:23 <dmellado> ++
16:23 <dtantsur> I guess these /tftpboot and /httpboot are historical
16:24 <dmellado> I assume a soft link won't work
16:24 <dtantsur> dmellado: no, it won't, because of the way we deal with caches (we rely on being able to check the link count)
16:25 <dmellado> how about moving http_boot_folder to /var/lib/httpboot?
16:25 <dtantsur> this is what I'm pondering, yes
16:25 <dtantsur> you can try it. if it works, we should probably change bifrost's default
16:27 <dmellado> checking
16:29 <dtantsur> I'll prepare a patch meanwhile
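
A hedged sketch of the workaround under discussion: override bifrost's http_boot_folder so it lands on the same filesystem as Ironic's image cache. The exact playbook invocation and inventory path depend on how bifrost was installed:

    cd bifrost/playbooks
    ansible-playbook -i inventory/target install.yaml \
        -e http_boot_folder=/var/lib/ironic/httpboot
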
16:33 <dmellado> any way I can check its status, besides 'deploying'?
16:34 <dtantsur> its = which?
16:34 <dmellado> the server one
16:34 <dmellado> now it says provisioning state = wait call-back
16:34 <dtantsur> dmellado: ironic-conductor logs, the virtual console of the machine (if you have access)
16:39 <dmellado> hmmm it seems that it gets stuck waiting on pxe
16:40 <dmellado> but that workaround re: the hardlink worked
16:41 <opendevreview> Dmitry Tantsur proposed openstack/bifrost master: Move /{tftp,http}boot to /var/lib/ironic  https://review.opendev.org/c/openstack/bifrost/+/822743
16:41 <dtantsur> dmellado: ^^
16:49 <dmellado> hmmm stupid question
16:50 <dmellado> it tried to pxe
16:50 <dmellado> then it gets stuck waiting on net0, which is the mac address I set up in baremetal.json
16:50 <dmellado> that should be it, shouldn't it?
16:50 <dmellado> it responds to ipmi commands and reboots
16:51 <dmellado> on the ironic side it gets stuck on wait call-back
17:02 <dtantsur> dmellado: so, it doesn't DHCP? doesn't get the ramdisk? or?
17:03 <dmellado> it doesn't dhcp
17:03 <dmellado> may be some weird config on the host, though
17:04 <dmellado> network_interface - does it refer to the network interface it would use for dhcp
17:04 <dmellado> or the bmc one?
17:05 <dmellado> dtantsur:
17:06 <dmellado> ?
17:06 <dtantsur> dmellado: it's the host interface you're using for DHCP and other boot business
17:06 <dmellado> gotcha, then it was wrong
17:06 <dtantsur> not the BMC one
17:06 <dtantsur> docs updates are very welcome
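
A hedged sketch of what dtantsur describes: bifrost's network_interface must name the host NIC that faces the provisioning/DHCP network, not anything related to the BMC. The interface name here is a placeholder:

    # with bifrost-cli:
    ./bifrost-cli install --network-interface eno2
    # or with the playbook-based install:
    ansible-playbook -i inventory/target install.yaml -e network_interface=eno2
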
17:06 <dmellado> I'll put that there, as at least I'm finding some issues (and being annoying)
17:07 <dmellado> hope that at least it helps fix some stuff around, dtantsur ;)
17:08 <dtantsur> it helps a lot!
17:09 <dtantsur> especially the docs, it's hard to see that something is not obvious after 7 years on the project
17:09 <dmellado> ++
17:14 <dmellado> d'oh, kernel panic???
17:15 <dtantsur> Oo
17:15 <dtantsur> I wonder which IPA image it is using by default
17:15 <dmellado> If you don't mind I may ping you tomorrow morning
17:16 <dmellado> and if you have some time to do a quick meet
17:16 <dtantsur> I'm on a break starting tomorrow
17:16 <dmellado> as it's probably me doing something stupid
17:16 <dmellado> oh, then after xmas
17:16 <dtantsur> dmellado: try setting use_tinyipa=false in bifrost if not already
17:16 * dtantsur is curious why we even default to that...
17:17 <dtantsur> dmellado: do you see which operating system it's trying to boot? CentOS or TinyCoreLinux?
17:18 <dmellado> just a huge kernel panic
17:18 <dmellado> I'll retry and be back later or my wife would kill me xD
17:18 <dtantsur> fair :D
17:18 <dmellado> do I need to reinstall
17:19 <dmellado> bifrost?
17:19 <dmellado> or would these changes be picked up on the fly, so I can just re-enroll and redeploy?
17:19 <dmellado> dtantsur: ?
17:19 <dmellado> off I go, will read later xD
17:19 <dmellado> thanks!
17:19 <dtantsur> reinstalling will be needed
17:19 <dmellado> ack
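
A hedged sketch of the re-install dtantsur suggests, switching from the TinyIPA ramdisk to a full, DIB-built IPA image; the inventory path follows the earlier examples and is an assumption:

    ansible-playbook -i inventory/target install.yaml -e use_tinyipa=false
    # afterwards, re-enroll and redeploy the node as before
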
17:22 <opendevreview> Dmitry Tantsur proposed openstack/bifrost master: Change the default image to a DIB-built one  https://review.opendev.org/c/openstack/bifrost/+/822751
17:22 <dtantsur> dmellado: ^^
17:22 <dtantsur> on this positive note, I'm wishing everyone a great rest of the week, great holidays for those celebrating, and see you in January!
17:32 <JayF> Have a great holiday Dmitry (& others)!
17:34 *** amoralej is now known as amoralej|off
