Wednesday, 2021-12-22

00:35 *** gmann_afk is now known as gmann
00:42 *** pmannidi|AFK is now known as pmannidi
08:25 *** amoralej|off is now known as amoralej
10:28 <dtantsur> morning ironic
10:32 <holtgrewe> dtantsur: good morning, the crowd here appears to be thinning out with the end of the year approaching
10:32 <dtantsur> yep, exactly :)
10:32 <dtantsur> holtgrewe: do you have a winter break?
10:35 <holtgrewe> dtantsur: usually, yes, but not this year
10:35 <dtantsur> oh
10:50 * dtantsur is wondering if noon is a good time for breakfast
11:02 <janders> good morning / afternoon dtantsur holtgrewe and Ironic o/
11:05 <holtgrewe> o/
11:10 <dtantsur> hey janders
11:26 *** sshnaidm|afk is now known as sshnaidm
12:28 <holtgrewe> I have assigned a capability "node:NODENAME" to my baremetal host. Now I want to target this particular baremetal host with nova. What would the scheduler hints look like? I'm trying to follow https://fatmin.com/2018/08/20/openstack-mapping-ironic-hostnames-to-nova-hostnames/
12:32 <dtantsur> holtgrewe: this is tripleo-specific stuff. I don't think you can just use it on a generic Ironic.
12:32 <holtgrewe> dtantsur: :-(
12:32 <dtantsur> I'm curious why you need to target a specific host with nova.
12:32 <holtgrewe> And I'm on Kolla/Kayobe.
12:32 <dtantsur> This is considered an anti-pattern.
12:33 <holtgrewe> I want to reproduce my old xCAT-based HPC deployment.
12:33 <dtantsur> If you have admin rights, I think you can ask nova for a specific hypervisor?
12:33 <dtantsur> each Ironic node becomes its own hypervisor in Nova
12:33 <dtantsur> I think hypervisor ID == node UUID
12:34 <holtgrewe> OK, I'm admin
12:34 <holtgrewe> life is hard, root passwords tend to help from time to time ;-)
12:34 <dtantsur> :)
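
For reference, a minimal sketch of verifying the mapping dtantsur describes: with the Ironic virt driver, each bare metal node is exposed as its own Nova hypervisor, and the hypervisor hostname is the Ironic node UUID (admin credentials assumed):

    openstack hypervisor list
    openstack baremetal node list --fields uuid name provision_state
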
12:55 <janders> see you tomorrow Ironic o/
12:56 <janders> (and to those who are about to start their Christmas holiday - Merry Christmas and Happy New Year!)
12:56 <janders> I'll be back for one more day tomorrow
13:03 <holtgrewe> Is there a way to show the current nova configuration, including the scheduler_filter section with applied defaults?
13:09 <dtantsur> holtgrewe: I don't know. Ironic in debug mode logs its configuration on start-up, but I'm not sure if Nova does it.
13:13 <holtgrewe> dtantsur: thanks
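
There is no single Nova command for this, but a hedged sketch for a Kolla/Kayobe deployment would be to read the rendered config inside the scheduler container (the container name nova_scheduler is assumed to be the kolla-ansible default) and, if needed, enable debug so the effective options are logged at start-up:

    docker exec nova_scheduler grep -E 'enabled_filters|available_filters' /etc/nova/nova.conf
    # With debug = True in nova.conf, the scheduler (like other oslo.config
    # based services) also logs its effective configuration at start-up.
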
13:16 *** amoralej is now known as amoralej|lunch
13:42 <holtgrewe> dtantsur: you are right, this one is specific to https://github.com/openstack/tripleo-common/blob/master/tripleo_common/filters/capabilities_filter.py
13:48 <dtantsur> holtgrewe: maybe https://docs.openstack.org/nova/xena/admin/availability-zones.html#using-availability-zones-to-select-hosts can help?
13:49 <holtgrewe> dtantsur: that looks very helpful
13:52 <dtantsur> I remember there was a way to target a hypervisor specifically, but this is all I could find
13:53 <holtgrewe> I'll ask in #openstack-kolla what they think is the best way
13:53 <dtantsur> the old docs have this: https://docs.openstack.org/nova/train/admin/availability-zones.html#using-explicit-host-and-or-node
13:53 <holtgrewe> I could inject something like the tripleo thing
13:53 <dtantsur> I'm not sure if it's still supported or not. worth trying?
13:53 <holtgrewe> dtantsur: it's still in the docs
13:53 * dtantsur does not use nova nowadays
13:55 <holtgrewe> I'm deploying via the openstack collection... which only has this availability zone...
13:59 <TheJulia> Admin users should be able to request instances to be deployed to specific hypervisors in nova
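
For reference, a hedged sketch of the "explicit host and/or node" targeting from the Nova docs linked above; whether the syntax is still accepted depends on the Nova release, and every name/UUID here is a placeholder. With admin credentials:

    openstack server create \
        --image my-image \
        --flavor my-baremetal-flavor \
        --network my-network \
        --availability-zone nova::<ironic-node-uuid> \
        my-instance
    # The availability zone argument is zone:host:node; with Ironic, the
    # "node" part is the hypervisor hostname, i.e. the Ironic node UUID.
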
14:00 <holtgrewe> TheJulia: that functionality does not appear to be exposed via the Ansible collections. Anyway, mgoddard convinced me to see the light: define one flavor per host generation and, rather than the HPC hammer/nail, go for a baremetal cloud.
14:02 <TheJulia> Heh, okay. I mean… if you were to propose a patch to the ansible modules… we might know some people :)
14:03 <TheJulia> Flavor per host doesn’t sound that ideal either, but that is the far friendlier option to users
14:03 <holtgrewe> TheJulia: thanks for the offer.
14:03 <TheJulia> Anyway, time to wake up, make coffee, and load up the car
14:03 <holtgrewe> TheJulia: I guess all I needed was the reminder that one should try to use one's shiny new tools in their idiomatic ways.
14:04 <TheJulia> It does often help
14:05 <mgoddard> TheJulia: I think holtgrewe means flavor per host type/generation rather than per host
14:06 <holtgrewe> yes
14:06 <TheJulia> Oh, good
14:07 * TheJulia hears tons of coyotes from her kitchen and wonders if it is because she is on IRC at the moment
14:10 <holtgrewe> And here is my brand new flavor bm.2020-11-c6420 ...
14:11 <TheJulia> Yay!
14:11 * TheJulia makes coffee, otherwise the drive to Flagstaff today will not be fun
14:27 <timeu_> holtgrewe: FYI we use traits or CUSTOM_RESOURCE on the flavors to target specific node types (cpu, memory, gpu, etc) for our test baremetal HPC cluster. works quite well.
14:27 <dtantsur> good morning TheJulia
14:28 <holtgrewe> timeu_: thanks for the info, I'm actually dealing with ~8 delivery batches of hardware here, having a flavor for each works well enough I think
14:28 <holtgrewe> I need to put some labels statically into the slurm.conf anyway.
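
For reference, a hedged sketch of the flavor-per-hardware-generation approach being discussed, following the usual Ironic resource-class scheduling pattern; the class name, flavor name and sizes are placeholders:

    # Tag every node of one delivery batch with the same resource class:
    openstack baremetal node set <node> --resource-class baremetal.2020-11-c6420

    # Create a flavor that requests exactly one node of that class and
    # zeroes out the standard VCPU/RAM/disk accounting:
    openstack flavor create --vcpus 40 --ram 192000 --disk 400 bm.2020-11-c6420
    openstack flavor set bm.2020-11-c6420 \
        --property resources:CUSTOM_BAREMETAL_2020_11_C6420=1 \
        --property resources:VCPU=0 \
        --property resources:MEMORY_MB=0 \
        --property resources:DISK_GB=0

    # Traits, as timeu_ mentions, can be layered on top, e.g.:
    # openstack flavor set bm.2020-11-c6420 --property trait:CUSTOM_FAST_NVME=required
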
14:29 * holtgrewe is bumping the quotas for the hpc project to 999,999,999
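
A minimal sketch of what that quota bump might look like (the project name and exact limits are assumptions):

    openstack quota set --instances 999999999 --cores 999999999 --ram 999999999 hpc
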
14:29 <timeu_> yeah we render our slurm config dynamically based on the heat output + ansible for the various node types. works quite well
14:29 *** amoralej|lunch is now known as amoralej
14:29 <holtgrewe> I'm not yet at the point where I am going down the rabbit hole of heat
14:30 * dtantsur feels like a lot of experience exchange could be happening
14:30 <timeu_> we have been using heat but we are considering switching to terraform
14:30 <timeu_> because it gives you a bit more transparency about what is happening when you scale the stack up/down
14:31 <timeu_> in general it works well, however writing the heat templates is much more verbose/cumbersome than HCL
14:32 <holtgrewe> timeu_: for now I'm replacing proxmox for VMs and xCAT for bare metal with openstack
14:32 <holtgrewe> I have existing Ansible playbooks for the old infrastructure and now want to do that switch first
14:33 <holtgrewe> next up would be looking at having neutron configure the switches (I understand that's possible)
14:33 <timeu_> yeah, sounds like a good plan
14:33 <holtgrewe> My use case is like ... one HPC
14:33 <holtgrewe> ;-)
14:34 <timeu_> yeah that's basically what we are planning to do for the next iteration of our HPC system. Right now it's fully virtualized on OpenStack but the goal is to move it to a baremetal deployment on OpenStack.
14:34 <timeu_> we do also integrate ironic with neutron but we are using Cisco SDN
14:35 <dtantsur> have you considered doing a talk for the opendev summit next year?
14:35 <holtgrewe> Actually, once I have a sizeable portion of nodes ported over, iDRAC/BIOS settings cleaned up, everything booting from UEFI, ... I need to get my all-NVMe CephFS up and running and tuned so I can office-space-printer-scene my HDD-based GPFS system
14:35 <timeu_> but it does work quite well, including trunk ports, so we can get rid of 200 1G cables and just go with the 100G ones (should also improve air flow in the racks), and the 1G cables/NICs flap like crazy
14:35 <holtgrewe> timeu_: OK... I don't understand much about SDN
14:36 <timeu_> make sure to record it then ;-)
14:36 <holtgrewe> timeu_: we have 2x10GbE/2x40GbE networking in the old part of the cluster, old blade enclosures
14:36 <holtgrewe> and 2x25GbE/2x100GbE in the new part of the cluster
14:36 <holtgrewe> I currently have 10MB/sec throughput on the HDD file system
14:37 <timeu_> quite a bit of connectivity there ;-)
14:37 <holtgrewe> with the NVMe system I'm pretty certain the network will be the bottleneck
14:37 <holtgrewe> yeah, life science sure wants their I/O
14:37 <holtgrewe> CPU cycles per GB are quite low compared to things like FEM simulations
14:37 <timeu_> yeah I think with NVMe 100G is required.
14:38 <holtgrewe> some people might say that it's essentially "gunzip -c | awk | gzip" performance-wise ;-)
14:38 <holtgrewe> And I'm not even looking at the imaging people who either do 100M images of 4kb or large HDF files with only random seeks.
14:39 <timeu_> yeah we also have those, but it's true that an all-flash parallel filesystem gives you the biggest performance boost across the board for various HPC workloads
14:40 <timeu_> but yeah we also have all kinds of edge cases with regard to I/O workloads, including the thousands of small temporary files in one folder that are usually poison for a parallel filesystem ;-)
14:40 <timeu_> I can feel the pain
14:41 <holtgrewe> timeu_: yes... if I can move my 24 recent nodes across data centers early next year, I can connect modern CPUs with 2x25GbE to the ceph system on one switch. I'm interested in how the IO500 benchmark looks there
14:41 <holtgrewe> Well, rather thousands of small files on NVMe than on HDD ;-)
14:41 <holtgrewe> even the GPFS metadata lives on HDD, fast spinning ones, but still
14:42 <timeu_> yeah, but we also saw our all-flash beegfs cluster crawl due to metadata operations in certain cases, but it's definitely better than rust ;-)
14:42 <holtgrewe> ;-)
14:43 <holtgrewe> What's your experience with BeeGFS? When I was researching it, it always felt more like a /scratch than a /home
14:43 * holtgrewe wonders whether that's actually OT here
14:45 <timeu_> yeah it's definitely more than a /scratch, although it has been very reliable. Almost no crashes, but we have a relatively small setup (~300TB, 12 nodes).
14:45 <holtgrewe> timeu_: sounds good
14:45 <timeu_> there are features such as mirroring (metadata + data) but they come with performance overhead, so we don't use them
14:46 <holtgrewe> I've used two clusters ... 8 and 12 years ago, that had a dying ... FraunhoferFS, as it was called back then
14:47 <holtgrewe> But that's ages ago
14:47 <holtgrewe> it's good to hear that things have improved
15:38 <opendevreview> Merged openstack/ironic-python-agent stable/ussuri: Re-read the partition table with partx -a  https://review.opendev.org/c/openstack/ironic-python-agent/+/821785
15:38 <opendevreview> Merged openstack/ironic-python-agent stable/ussuri: Re-read the partition table with partx -a, part 2  https://review.opendev.org/c/openstack/ironic-python-agent/+/821786
15:38 <opendevreview> Verification of a change to openstack/ironic-python-agent stable/train failed: Re-read the partition table with partx -a  https://review.opendev.org/c/openstack/ironic-python-agent/+/821791
16:17 <dmellado> dtantsur: hi again man xD
16:18 <dtantsur> o/
16:18 <dmellado> I'm hitting an issue which I'm not sure about
16:18 <dmellado> Failed to prepare to deploy: Could not link image http://192.168.200.8:8082/ipa.kernel from /var/lib/ironic/master_images/1b75d649-9e6d-5c0a-966c-36b1a84264fc.converted to /httpboot/ef4bbe4a-3ecd-4bae-888e-52c406707206/deploy_kernel, error: [Errno 18] Invalid cross-device link: '/var/lib/ironic/master_images/1b75d649-9e6d-5c0a-966c-36b1a84264fc.converted' -> '/httpboot/ef4bbe4a-3ecd-4bae-888e-52c406707206/deploy_kernel'
16:18 <dmellado> does this ring a bell?
16:18 <dmellado> if not, I'll try debugging, but it may be one of those 'oh, that' things
16:19 <dtantsur> dmellado: yeah. something we should likely document: our caches work via hardlinking.
16:19 <dtantsur> so if you have /var and /httpboot on different devices, it won't work
16:19 <dtantsur> is that the case?
16:20 <dmellado> let me check
16:21 <dmellado> https://paste.openstack.org/show/boRymb2crVqUTDjB73ds/
16:21 <dmellado> I don't see anything extraordinary there
16:22 <dmellado> maybe I should try to hardlink that myself
16:22 <dtantsur> yeah, give it a try
16:23 <dmellado> aha, so it failed as well, interesting
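
For reference, a minimal sketch of reproducing that check by hand, using the paths from the error above. A hardlink can only be created within a single filesystem, so the device IDs of both ends have to match:

    stat -c '%d %m %n' \
        /var/lib/ironic/master_images/1b75d649-9e6d-5c0a-966c-36b1a84264fc.converted \
        /httpboot/
    # If the device numbers differ, ln fails exactly like Ironic did:
    ln /var/lib/ironic/master_images/1b75d649-9e6d-5c0a-966c-36b1a84264fc.converted \
        /httpboot/test-hardlink   # expected: "Invalid cross-device link"
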
16:23 <dtantsur> honestly, we should probably move Ironic stuff to /var/lib completely
16:23 <dmellado> ++
16:23 <dtantsur> I guess these /tftpboot and /httpboot are historical
16:24 <dmellado> I assume a soft link won't work
16:24 <dtantsur> dmellado: no, it won't, because of the way we deal with caches (we rely on being able to check the link count)
16:25 <dmellado> how about moving http_boot_folder to /var/lib/httpboot?
16:25 <dtantsur> this is what I'm pondering, yes
16:25 <dtantsur> you can try it. if it works, we should probably change bifrost's default
16:27 <dmellado> checking
16:29 <dtantsur> I'll prepare a patch meanwhile
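
A hedged sketch of the workaround under discussion: override bifrost's http_boot_folder so it lands on the same filesystem as Ironic's image cache. The exact playbook invocation and inventory path depend on how bifrost was installed:

    cd bifrost/playbooks
    ansible-playbook -i inventory/target install.yaml \
        -e http_boot_folder=/var/lib/ironic/httpboot
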
16:33 <dmellado> any way I can check its status, besides 'deploying'?
16:34 <dtantsur> its = which?
16:34 <dmellado> the server one
16:34 <dmellado> now it says provisioning state = wait call-back
16:34 <dtantsur> dmellado: ironic-conductor logs, the virtual console of the machine (if you have access)
16:39 <dmellado> hmmm it seems that it gets stuck waiting on pxe
16:40 <dmellado> but that workaround re: the hardlink worked
16:41 <opendevreview> Dmitry Tantsur proposed openstack/bifrost master: Move /{tftp,http}boot to /var/lib/ironic  https://review.opendev.org/c/openstack/bifrost/+/822743
16:41 <dtantsur> dmellado: ^^
16:49 <dmellado> hmmm stupid question
16:50 <dmellado> it tried to pxe
16:50 <dmellado> then it gets stuck waiting on net0, which is the mac address I set up in baremetal.json
16:50 <dmellado> that should be it, shouldn't it?
16:50 <dmellado> it responds to ipmi commands and reboots
16:51 <dmellado> on the ironic side it gets stuck on wait call-back
17:02 <dtantsur> dmellado: so, it doesn't DHCP? doesn't get the ramdisk? or?
17:03 <dmellado> it doesn't dhcp
17:03 <dmellado> may be some weird config on the host, though
17:04 <dmellado> network_interface - does it refer to the network interface it would use for dhcp
17:04 <dmellado> or the bmc one?
17:05 <dmellado> dtantsur:
17:06 <dmellado> ?
17:06 <dtantsur> dmellado: it's the host interface you're using for DHCP and other boot business
17:06 <dmellado> gotcha, then it was wrong
17:06 <dtantsur> not the BMC one
17:06 <dtantsur> docs updates are very welcome
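
A hedged sketch of what dtantsur describes: bifrost's network_interface must name the host NIC that faces the provisioning/DHCP network, not anything related to the BMC. The interface name here is a placeholder:

    # with bifrost-cli:
    ./bifrost-cli install --network-interface eno2
    # or with the playbook-based install:
    ansible-playbook -i inventory/target install.yaml -e network_interface=eno2
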
17:06 <dmellado> I'll put that there, as at least I'm finding some issues (and being annoying)
17:07 <dmellado> hope that at least it helps fix some stuff around, dtantsur ;)
17:08 <dtantsur> it helps a lot!
17:09 <dtantsur> especially the docs, it's hard to see that something is not obvious after 7 years on the project
17:09 <dmellado> ++
17:14 <dmellado> d'oh, kernel panic???
17:15 <dtantsur> Oo
17:15 <dtantsur> I wonder which IPA image it is using by default
17:15 <dmellado> If you don't mind I may ping you tomorrow morning
17:16 <dmellado> and if you have some time to do a quick meet
17:16 <dtantsur> I'm on a break starting tomorrow
17:16 <dmellado> as it's probably me doing something stupid
17:16 <dmellado> oh, then after xmas
17:16 <dtantsur> dmellado: try setting use_tinyipa=false in bifrost if not already
17:16 * dtantsur is curious why we even default to that...
17:17 <dtantsur> dmellado: do you see which operating system it's trying to boot? CentOS or TinyCoreLinux?
17:18 <dmellado> just a huge kernel panic
17:18 <dmellado> I'll retry and be back later or my wife would kill me xD
17:18 <dtantsur> fair :D
17:18 <dmellado> do I need to reinstall
17:19 <dmellado> bifrost?
17:19 <dmellado> or would these changes be picked up on the fly, so I can just re-enroll and redeploy?
17:19 <dmellado> dtantsur: ?
17:19 <dmellado> off I go, will read later xD
17:19 <dmellado> thanks!
17:19 <dtantsur> reinstalling will be needed
17:19 <dmellado> ack
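
A hedged sketch of the re-install dtantsur suggests, switching from the TinyIPA ramdisk to a full, DIB-built IPA image; the inventory path follows the earlier examples and is an assumption:

    ansible-playbook -i inventory/target install.yaml -e use_tinyipa=false
    # afterwards, re-enroll and redeploy the node as before
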
17:22 <opendevreview> Dmitry Tantsur proposed openstack/bifrost master: Change the default image to a DIB-built one  https://review.opendev.org/c/openstack/bifrost/+/822751
17:22 <dtantsur> dmellado: ^^
17:22 <dtantsur> on this positive note, I'm wishing everyone a great rest of the week, great holidays for those celebrating, and see you in January!
17:32 <JayF> Have a great holiday Dmitry (& others)!
17:34 *** amoralej is now known as amoralej|off
