Tuesday, 2021-12-14

opendevreviewMerged openstack/neutron stable/wallaby: [stable-only] "_clean_logs_by_target_id" to use old notifications  https://review.opendev.org/c/openstack/neutron/+/82156300:17
opendevreviewGhanshyam proposed openstack/neutron master: Updating python testing as per Yoga testing runtime  https://review.opendev.org/c/openstack/neutron/+/81920102:13
opendevreviewGhanshyam proposed openstack/os-vif master: Updating python testing classifier as per Yoga testing runtime  https://review.opendev.org/c/openstack/os-vif/+/81920402:20
opendevreviewFederico Ressi proposed openstack/neutron master: Change tobiko CI job in the periodic queue  https://review.opendev.org/c/openstack/neutron/+/81397708:45
bbezakHello, did anybody observe this bug? https://bugs.launchpad.net/neutron/+bug/1953510 - looks pretty important, as the neutron OVN metadata agent cannot reconnect correctly to a new OVN SB raft leader - and OVS 2.16 does automatic leader transfer while taking snapshots.09:03
ralonsohbbezak, is it reproducible just restarting the OVN SB?09:08
bbezakralonsoh: yes, very much so - tested just now09:12
ralonsohbbezak, ok, I'll try to reproduce it locally and debug it09:12
ralonsohthanks for the report09:12
bbezakfurthermore I can see weird behaviour: old metadata agents that were removed before (like 6 months prior) came back in down status in openstack network agent list. looks like a different issue I guess09:13
bbezakbut it is happening at the same time though09:14
ralonsohbbezak, removed how? from the API? the chassis?09:14
bbezakremoved via openstack network agent delete09:14
ralonsohok09:14
ralonsohjust a heads-up09:14
bbezakand before that I stopped its services on the hosts, of course09:14
ralonsohthis is for clean-up purposes09:15
ralonsohthat means: when you delete a chassis and the ovn-controller does not end gracefully09:15
ralonsohthe chassis record will remain in the SB DB09:15
ralonsohthat means Neutron will see it as an active chassis09:15
ralonsohthe Neutron API allows removing the controller and metadata agents associated with this chassis09:15
ralonsohbut as I said, for clean-up purposes, only when the Chassis and Chassis_Private records do not exist09:16
bbezakright, so what would be the proper procedure for removing the old metadata agent entries then? remove them directly within OVN?09:18
opendevreviewSlawek Kaplonski proposed openstack/neutron-tempest-plugin master: Update irrelevant-files for api/scenario jobs  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82116009:19
opendevreviewSlawek Kaplonski proposed openstack/neutron-tempest-plugin master: Update irrelevant-files for stadium project's tests  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82141409:19
ralonsohbbezak, if you didn't properly stop the ovn-controller, then the metadata and the controller agents will be still present in the DB09:20
ralonsohsorry, in the Neutron API09:20
bbezakok09:20
ralonsohyou should remove the Chassis and Chassis_Private records09:20
ralonsohthat will trigger the agent deletion09:20
bbezakok, but you mean in OVN SB - I need to get rid of only metadata agent - as this node won't be hypervisor anymore. So only chassis_private to remote then?09:31
bbezakI meant to remove ;)09:32
ralonsohbbezak, if there are no hypervisors in a compute, then you need to stop the ovn-controller 09:32
ralonsohthat will remove the Chassis and the Chassis_Private09:32
ralonsohboth are created and deleted together09:32
bbezakbut it is a controller/network node09:32
ralonsohyou can't have one without the other09:32
bbezak(supposed to be hyper-converged at the beginning)09:33
ralonsohI don't know right now how to stop having a metadata agent on a node09:35
ralonsohbut it doesn't matter09:35
ralonsohthat won't affect your architecture09:35
ralonsohand the VMs will retrieve metadata from the local metadata servers09:35
bbezakyeah, indeed, but I don't like agents being down :), not a huge deal though, the first issue is more important09:36
bbezaksorry for sidetracking09:36
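The cleanup flow ralonsoh describes above (stale Chassis/Chassis_Private records keep dead agents visible in the Neutron API, and deleting the records triggers the agent deletion) could be sketched roughly as below. This is not Neutron code: the helper names are made up, and the `ovn-sbctl` invocations assume the standard OVN CLI.

```python
# Rough sketch: find Chassis records in the OVN SB DB whose host no longer
# runs ovn-controller, and delete them so Neutron drops the corresponding
# stale controller/metadata agents. Helper names are hypothetical.
import subprocess


def parse_chassis_names(sbctl_output: str) -> list:
    """Parse the output of `ovn-sbctl --bare --columns name list Chassis`."""
    return [line.strip() for line in sbctl_output.splitlines() if line.strip()]


def find_stale_chassis(all_chassis, live_hosts):
    """Chassis records whose host is no longer expected to run ovn-controller."""
    return [name for name in all_chassis if name not in live_hosts]


def delete_chassis(name):
    # `ovn-sbctl chassis-del` removes the Chassis record (recent OVN versions
    # also drop the paired Chassis_Private), which is what lets Neutron
    # delete the associated agents.
    subprocess.check_call(["ovn-sbctl", "chassis-del", name])


# Usage against a live OVN SB DB (commented out here):
#   out = subprocess.check_output(
#       ["ovn-sbctl", "--bare", "--columns", "name", "list", "Chassis"], text=True)
#   for stale in find_stale_chassis(parse_chassis_names(out), {"compute-1"}):
#       delete_chassis(stale)
```

Per the discussion above, this should only be needed when ovn-controller did not exit gracefully; a clean stop removes both records by itself.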
opendevreviewRodolfo Alonso proposed openstack/neutron master: Remove the expired reservations in a separate DB transaction  https://review.opendev.org/c/openstack/neutron/+/82159210:01
ralonsohslaweq, https://review.opendev.org/c/openstack/neutron/+/81839910:03
ralonsohif you have a couple of mins to review it10:04
ralonsohthanks in advance10:04
ralonsohah, and this one: https://review.opendev.org/c/openstack/neutron/+/82127110:04
ralonsohobondarev, lajoskatona ^^10:04
ralonsohif you have time10:04
slaweqralonsoh: sure. I will check it in few minutes10:10
ralonsohthanks!10:10
lajoskatonaralonsoh: me too :-)10:10
ralonsohthanks!10:10
slaweqralonsoh are You sure that failed test_neutron_status_cli test in patch https://review.opendev.org/c/openstack/neutron/+/818399 is not related to that change?10:17
ralonsohslaweq, hmmm I saw the other one... sorry10:18
ralonsohit is related but I really don't know what happened10:18
ralonsohthe DB connection is also used in other checks10:18
slaweqmaybe it was just some issue with MySQL backend10:19
slaweqI will recheck it once again to see if it will be green10:19
ralonsohyes, but a bad coincidence with this new check...10:19
ralonsohas I said, there are other checks that use the DB backend10:20
ralonsohnot only this one10:20
opendevreviewLajos Katona proposed openstack/neutron-tempest-plugin master: Add logs for test_floatingip_port_details  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82155710:23
opendevreviewLajos Katona proposed openstack/neutron master: Add logs for port_bound_to_host  https://review.opendev.org/c/openstack/neutron/+/82159910:27
opendevreviewMerged openstack/neutron stable/victoria: [stable-only] "_clean_logs_by_target_id" to use old notifications  https://review.opendev.org/c/openstack/neutron/+/82156410:45
bbezakralonsoh: FYI: just updated bug with debug logs from metadata agent - https://bugs.launchpad.net/neutron/+bug/195351011:05
ralonsohthanks11:05
slaweqralonsoh11:06
slaweqlajoskatona please check https://review.opendev.org/q/Ia3fbde3304ab5f3c309dc62dbf58274afbcf4614 when You will have few minutes11:06
slaweqthx in advance11:06
ralonsohslaweq, sure11:06
opendevreviewMerged openstack/neutron stable/ussuri: [stable-only] "_clean_logs_by_target_id" to use old notifications  https://review.opendev.org/c/openstack/neutron/+/82156511:16
*** bluex_ is now known as bluex11:17
bluexI'm getting a warning from _is_session_semantic_violated caused (if I'm not wrong) by (ml2.plugin)delete_port calling (l3_db)disassociate_floatingips, which tries to send an AFTER_UPDATE notification while the parent transaction from delete_port is still active11:44
bluexBecause of that, FIP notifications from delete_port seem to never be sent11:44
bluexIs this a known issue?11:44
ralonsohbluex, when is this happening?11:45
ralonsohdo you have logs?11:45
ralonsohslaweq, https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/821079 can you check my comment there?11:45
ralonsohbluex, what backend are you using?11:46
ralonsoh(that should have been my first question)11:46
opendevreviewRodolfo Alonso proposed openstack/neutron master: [OVN][Placement] Attend to chassis configuration events  https://review.opendev.org/c/openstack/neutron/+/82169311:49
bluexralonsoh, sure give me sec, I'll post relevant logs somewhere11:51
bluexas for backend, I'm not exactly sure what backend you're asking about11:52
ralonsohbluex, OVS, OVN, linux bridge...11:52
bluexOVS11:53
bluexbtw https://paste.openstack.org/ gives me 502, is there other recommended paste platform11:55
ralonsohbluex, use this11:56
ralonsohhttps://etherpad.opendev.org/p/problem_fips11:56
slaweqralonsoh I just replied12:00
slaweqI like Your idea of combining both tests there12:00
slaweqthx12:00
ralonsohthanks!12:00
ykarelmany neutron jobs are red https://zuul.openstack.org/status12:01
ykarellikely some infra issue, checking12:01
bluexralonsoh, relevant section of logs pasted12:01
bluexA solution I'm thinking about is: no longer checking _is_session_semantic_violated when notification is published and instead performing open transaction check directly in _ObjectChangeHandler.dispatch_events12:02
bluexThen if transaction is still ongoing we can put notification back on queue with some delay (1s for example) instead of discarding it.12:02
ralonsohbluex, where are the logs?12:03
ykarelohkk checked, pip install fails for different modules, also it seems only a few providers are impacted, so jobs are going to fail unless it's fixed12:07
ykarelwill check which providers are impacted and will ping infra to get it cleared12:08
ykarelralonsoh, slaweq fyi ^12:08
ralonsohI'm aware, yes12:09
ralonsohcould be just a problem in pip mirror12:09
ralonsohthat happens sometimes12:09
slaweqthx ykarel for info and for taking care of it12:10
bluexralonsoh, sorry, etherpad didn't appreciate having to paste >300 lines of logs so I pasted only traceback (10 lines at once) for now12:14
ralonsohbluex, but what is the method that is handling this event and raising this exception?12:14
ralonsohthat's the key point here12:14
ralonsohand we don't see that in the logs12:14
bluexok, added some previous logs12:19
bluextraceback comes from plugins.ml2.ovo_rpc._ObjectChangeHandler._is_session_semantic_violated12:19
bluexwhile plugins.ml2.plugin.Ml2Plugin.delete_port is being executed12:20
bluexI should also mention it's executed on an older version of the stable/stein branch, but the relevant code is the same as on master12:38
lajoskatonaykarel: thanks for the headsup12:47
ralonsohbluex, what is networking_ovh?12:50
ralonsohthis is not the Neutron ml2 plugin12:50
ralonsohthis is your own plugin12:50
ykarellajoskatona, ralonsoh slaweq the identified pip modules were refreshed for the affected providers for now12:51
ykarelif it still happens for others, we need to reach out to #opendev with details12:51
lajoskatonaykarel: thanks, keep an eye on it :-)12:52
opendevreviewLajos Katona proposed openstack/neutron master: Fix stackalytics' link  https://review.opendev.org/c/openstack/neutron/+/82170413:03
bluexralonsoh, yup, it's a proprietary OVH plugin extending some parts of Neutron; in the logs provided you can assume it is neutron instead of networking_ovh, since all of the code mentioned so far comes directly from neutron13:10
opendevreviewLuis Tomas Bolivar proposed openstack/neutron master: Use Port_Binding up column to set Neutron port status  https://review.opendev.org/c/openstack/neutron/+/82154413:18
ralonsohbluex, right, this is a bug. Please, can you report it? in launchpad13:37
ralonsohbluex, but please, identify the methods that this handler is calling13:38
bluexsure13:40
bluexbtw any thoughts about solution I suggested?13:41
ralonsohto add a delay in any operation? No, not at all, under any circumstance13:42
bluexit won't be a real delay (like sleep); instead, since we have a loop in dispatch_events in an external thread, we can utilize it to essentially send the notification after some time has passed13:44
bluexI was more worried about no longer performing the _is_session_semantic_violated check13:45
opendevreviewMaor Blaustein proposed openstack/neutron-tempest-plugin master: Add test_create_and_update_port_with_dns_name  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82107913:48
bluex* plugins.ml2.ovo_rpc._ObjectChangeHandler.dispatch_events13:48
bluexbut I don't see any reason why we should drop AFTER notifications when the transaction is not finished13:50
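For illustration only, the re-queue idea bluex sketches above might look something like this. `has_open_transaction` is a stand-in for the `_is_session_semantic_violated` check (not the real Neutron helper), and the delay is the 1 s bluex mentions:

```python
# Sketch of a dispatch loop that re-queues (instead of drops) AFTER_* events
# whose parent DB transaction is still open. Names are illustrative.
import queue
import time

RETRY_DELAY = 1.0  # seconds, per bluex's suggestion


def has_open_transaction(session) -> bool:
    # Stand-in for the semantic-violation check; SQLAlchemy 1.4+ sessions
    # expose in_transaction() for exactly this question.
    return session is not None and session.in_transaction()


def dispatch_events(events, publish, session=None):
    """Drain the queue once; events not yet publishable go back on the queue
    with a retry timestamp instead of being discarded."""
    pending = []
    while True:
        try:
            not_before, event = events.get_nowait()
        except queue.Empty:
            break
        if time.monotonic() < not_before or has_open_transaction(session):
            # Not ready yet: push back with a delay instead of dropping it.
            pending.append((time.monotonic() + RETRY_DELAY, event))
        else:
            publish(event)
    for item in pending:
        events.put(item)
```

In the real `_ObjectChangeHandler` this loop already runs in a separate thread, which is what would make the deferred delivery possible without sleeping in the caller.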
opendevreviewRodolfo Alonso proposed openstack/neutron master: Remove the expired reservations in a separate DB transaction  https://review.opendev.org/c/openstack/neutron/+/82159213:53
opendevreviewBence Romsics proposed openstack/neutron master: Make the dead vlan actually dead  https://review.opendev.org/c/openstack/neutron/+/82089714:00
lajoskatona#startmeeting networking14:00
opendevmeetMeeting started Tue Dec 14 14:00:46 2021 UTC and is due to finish in 60 minutes.  The chair is lajoskatona. Information about MeetBot at http://wiki.debian.org/MeetBot.14:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.14:00
opendevmeetThe meeting name has been set to 'networking'14:00
mlavalleo/14:00
lajoskatonao/14:00
slaweqhi14:00
frickler\o14:00
amotokihi14:00
isabekHi14:00
njohnstono/14:00
manubhi14:01
damiandabrowski[m]hi!14:01
obondarevhi14:01
lajoskatona#topic Announcements14:02
lajoskatonaThe usual reminder for Yoga cycle calendar https://releases.openstack.org/yoga/schedule.html14:02
rubasovo/14:02
bcafarelo/14:02
lajoskatonaLast week the TC organized meeting series was started for operator pain points14:02
lajoskatona#link http://lists.openstack.org/pipermail/openstack-discuss/2021-December/026169.html14:03
lajoskatonasummary of the meeting from Rico: #link http://lists.openstack.org/pipermail/openstack-discuss/2021-December/026245.html14:03
lajoskatonaand there is a nice etherpad: https://etherpad.opendev.org/p/pain-point-elimination14:03
ralonsohhi14:04
lajoskatonahm, I can send the link to the line where Neutron is mentioned: https://etherpad.opendev.org/p/pain-point-elimination#L26014:04
lajoskatona4 topics were mentioned for neutron:14:05
slaweqI created one LP bug from that etherpad last week14:05
slaweqI think that ykarel is working on it now14:05
lajoskatonaneutron (via python calls) and OVN (via C calls) can have different ideas about what the hostname is, particularly if a deployer (rightly or wrongly) sets their hostname to be an FQDN14:05
lajoskatonaslaweq: thanks14:06
amotokihttps://bugs.launchpad.net/neutron/+bug/1953716 this is a bug slaweq filed14:06
lajoskatonayes and we have another for "full" solution for network cascade deletion: https://bugs.launchpad.net/neutron/+bug/187031914:06
ykarelo/14:07
lajoskatonaThe next one from the list is more of an opinion: OpenVSwitch Agent excess logging at INFO level14:08
ralonsohthis is the opposite to what we did 2 years ago14:08
slaweqexactly14:08
ralonsohmoving the debug messages to INFO14:08
lajoskatonaexactly :-)14:08
slaweqpersonally I think it's better how it is now14:08
ralonsohI have the same opinion14:08
ralonsoh(maybe there is something that could improve)14:08
lajoskatonaperhaps we can do something which lands in the middle :-)14:09
lajoskatonahalf info half debug :-)14:09
ralonsohthat's an option, yes14:09
slaweqfor me personally it's a very hard issue because when everything works fine, You need almost no logs14:09
slaweqbut if something is going wrong, You basically need to have debug enabled always14:09
slaweqso in case of the problem, our current INFO logging isn't even enough for me :)14:10
lajoskatonayeah, it's more of a feedback that the agent loop is ok14:11
amotokifrom my operator experience, it would be nice if INFO messages mention what goes well and what goes wrong. Detail is not needed; DEBUG should help us in such cases.14:11
amotokiI am not sure the loop message in ovs-agent really helps14:12
fricklerwould it be an option to move some logs to WARNING, so that deployers getting too many logs can turn off INFO?14:13
lajoskatonaamotoki: thanks14:13
lajoskatonafrom an operator/deployer perspective it's an extra pain to have different (log) settings for just the ovs-agent, for example, and the easiest way from their perspective is to change the code :-)14:15
amotokiI think we can start from the logging guideline https://specs.openstack.org/openstack/openstack-specs/specs/log-guidelines.html 14:16
amotokiinterpretation of the log levels might differ per person14:16
lajoskatonaamotoki: good that you mentioned it; the guideline was mentioned too, but I'm not sure it is up-to-date14:17
lajoskatonaand another point for OVN: Spoofing of DNS responses seems very wrong behavior14:17
frickleramotoki: from that, a lot of errors should likely only be warnings14:18
lajoskatonato tell the truth I don't know the details of it, because the meeting had finished, and I'm not sure there was anybody with insight into this one14:18
* slaweq will be back in 5 minutes14:19
ralonsohlajoskatona, is there a LP bug?14:20
ralonsohfor the OVN DNS issue14:20
amotokiwe can add the labels defined at https://etherpad.opendev.org/p/pain-point-elimination#L1014:21
lajoskatonaralonsoh: nothing: https://etherpad.opendev.org/p/pain-point-elimination#L27914:21
ralonsohok14:21
amotokiif we need more info, we can use the INFO NEEDED label14:21
lajoskatonaamotoki: I added now14:22
amotokithanks14:22
lajoskatonafor logging, the TC will work on having fresh logging guidelines; I just saw it in the etherpad14:23
njohnstonI know a bit about OVN DNS spoofing, but I am not an expert.  I think dalvarez knows more.  14:24
dalvarezhi, can i help somehow? o/14:24
lajoskatonanjohnston, dalvarez: Hi, there's an "operator pain points" etherpad and related discussion14:25
lajoskatona njohnston, dalvarez: and one point is: "OVN: Spoofing of DNS responses seems very wrong behavior" without more details14:25
lajoskatona#link https://etherpad.opendev.org/p/pain-point-elimination#L27914:25
lajoskatonaif you can think about it please check the etherpad if this issue is relevant at all14:26
njohnstonSo when OVN sees a DNS request originating on a local net it captures it, compares it against its local cache, answers it if it can, and then allows it to proceed out via normal routing if there is no match14:27
dalvarezlajoskatona: ack, i agree we need more info. DNS is not really spoofing anything but acting as a DNS server itself; not a full-fledged server though, so we may be missing some options or maybe it's not fully compliant with the standard... :?14:28
njohnstonthat causes a change in functionality for private networks that don't have egress; in ML2/OVS they could resolve DNS via the local dnsmasq but in OVN it is not possible14:28
njohnston^ s/resolve DNS/resolve external DNS/14:28
njohnstonwhere external DNS includes Designate14:28
ralonsohright, this is a feature gap, OVN still doesn't support external DNS servers14:29
ralonsohnjohnston, is there a LP bug?14:30
lajoskatonanjohnston, ralonsoh, dalvarez: thanks for the explanation14:30
njohnstonI'll have to go back and check, we last talked about it maybe 6-8 months ago14:30
fricklerIIUC the issue is more that the instance sends a query to e.g. 8.8.8.8, but the query never reaches that server, instead OVN creates a fake response that looks like 8.8.8.8 sent it14:31
njohnstonralonsoh: https://bugs.launchpad.net/neutron/+bug/190295014:32
ralonsohnjohnston, thanks!14:32
lajoskatonaI will add these insights to the etherpad, check if we have a gap written for it, and we can continue the discussion under a LP if there's already one14:32
opendevreviewMaor Blaustein proposed openstack/neutron-tempest-plugin master: Add test_create_and_update_port_with_dns_name  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82107914:33
lajoskatonaThat's all from the meeting last week14:33
lajoskatona#topic Community Goals14:34
lajoskatonait was again something the TC worked on and discussed: #link http://lists.openstack.org/pipermail/openstack-discuss/2021-December/026280.html (look for "Community-wide goal updates")14:34
lajoskatonamail from gmann: #link http://lists.openstack.org/pipermail/openstack-discuss/2021-December/026218.html14:34
lajoskatonaCurrent Selected (Active) Goals: 14:35
lajoskatonaMigrate from oslo.rootwrap to oslo.privsep14:35
lajoskatonaConsistent and Secure Default RBAC14:35
slaweqregarding Secure RBAC I proposed patch with update of the policies: https://review.opendev.org/c/openstack/neutron/+/82120814:35
slaweqI forgot to reply to ralonsoh's comments there14:35
ralonsohregarding to the first one, privsep, I think we are done14:36
lajoskatonawith privsep migration we are good, I think we even finished stadium project migration14:36
ralonsohwe still use rootwrap to spawn the privsep daemons14:36
ralonsohthis is like some kind of "inception"14:36
ralonsohI would remove this part14:36
ralonsohremoving any dependency from rootwrap14:36
lajoskatonaralonsoh, slaweq: thanks14:36
lajoskatonaThis was more of a heads-up (as we are progressing well / have finished these) that we have these 2 goals for this cycle again14:39
lajoskatona#topic Bugs14:40
lajoskatonaamotoki was bug deputy last week: http://lists.openstack.org/pipermail/openstack-discuss/2021-December/026335.html14:40
lajoskatonaAs I checked there are a few unassigned bugs:14:41
lajoskatona#link https://bugs.launchpad.net/neutron/+bug/195438414:41
lajoskatonathis one is again a DNS-related bug14:42
ralonsohI think we had the same discussion last week14:42
ralonsohthis is for designate14:42
fricklerI can look into that14:42
amotokibut it is a case with linux bridge (not OVN)14:42
ralonsohif you don't use it, then this dns-domain name should be the same as the VM name14:43
ralonsohI'll find the LP bug I'm talking about14:43
lajoskatonafrickler, amotoki, ralonsoh: thanks14:44
njohnstonI wonder if this behavior change in Nova may affect things: https://specs.openstack.org/openstack/nova-specs/specs/xena/implemented/configurable-instance-hostnames.html14:45
njohnstonIn any event, it was not something I was aware of until recently, so it's good to boost exposure within the team14:45
opendevreviewMerged openstack/neutron-tempest-plugin master: Add test_create_port_with_dns_name  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82045614:45
lajoskatonaamotoki: do you have other highlights from last week's bugs?14:46
amotokione thing is https://bugs.launchpad.net/neutron/+bug/195347814:47
amotokiit is related to the timing of the vif-plugged event.14:47
amotokiwe need to clarify when we send the event and what is expected from both nova and neutron perspective.14:48
amotokiin addition, there are two bugs on the OVN metadata agent. it would be nice if folks familiar with OVN could check them.14:48
ralonsohI'm checking the issues with the OVN metadata agent14:49
lajoskatonahttps://bugs.launchpad.net/neutron/+bug/1953510 & https://bugs.launchpad.net/neutron/+bug/1953295 I think14:49
amotokilajoskatona: correct. thanks14:50
ralonsohat least the first one (I'll assign it to me)14:50
damiandabrowski[m]I would really appreciate a helping hand here: https://bugs.launchpad.net/neutron/+bug/195290714:50
damiandabrowski[m]I have already spent several hours on troubleshooting and proposed a few possible solutions, but that's probably all I can do.14:50
amotokiregarding the vif-plugged event bug, I can look into it in detail, but more eyes would be appreciated so I'll leave it unassigned.14:51
lajoskatonadamiandabrowski[m]: I will check your proposals, thanks for working on it14:52
damiandabrowski[m]thank You!14:52
lajoskatonaamotoki: is it written somewhere when we send events like vif-plugged?14:53
amotokilajoskatona: I don't think so.... unfortunately14:54
ralonsohno and this is something I said I was going to write14:54
ralonsoh(in the PTG) but I didn't have time yet14:54
lajoskatonaperhaps we can start by syncing with nova and writing it down, and then checking each backend14:54
amotoki+114:55
lajoskatonaralonsoh: ok, thanks14:55
fricklerdid anyone look at the discussion I had with mdbooth yesterday? tldr: trunk creation/deletion races with unplugging the baseport from a server14:57
fricklerno bug reported yet, I was waiting to ask mdbooth whether they would like to do that14:57
lajoskatonafrickler: that's a good idea to have a bug if they have issue14:58
lajoskatonaOk, the CI meeting soon will start, so we have to finish now14:58
fricklermaybe it was known already or expected somehow, that's why I'm asking14:58
fricklerfinal question: is anyone looking at the OVN/n-d-r issue yet?14:59
lajoskatona#endmeeting15:00
opendevmeetMeeting ended Tue Dec 14 15:00:10 2021 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:00
opendevmeetMinutes:        https://meetings.opendev.org/meetings/networking/2021/networking.2021-12-14-14.00.html15:00
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/networking/2021/networking.2021-12-14-14.00.txt15:00
opendevmeetLog:            https://meetings.opendev.org/meetings/networking/2021/networking.2021-12-14-14.00.log.html15:00
slaweq#startmeeting neutron_ci15:00
opendevmeetMeeting started Tue Dec 14 15:00:21 2021 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'neutron_ci'15:00
lajoskatonasorry frickler :-)15:00
ralonsohhi again15:00
slaweqwelcome on another meeting :)15:00
lajoskatonafrickler: I'll try to find some time to check the trunk discussion15:00
lajoskatonaHi15:00
slaweqGrafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate15:01
obondarevhi15:01
ykarelhi15:01
mlavalleo/o/15:01
slaweqI think we can start now15:01
bcafarelhi again15:01
slaweq#topic Actions from previous meetings15:01
slaweqslaweq to check missing grafana logs for periodic jobs15:02
slaweqactually ykarel already proposed fix https://review.opendev.org/c/openstack/project-config/+/82098015:02
slaweqthx ykarel15:02
slaweqnext one15:02
slaweqralonsoh to check  https://bugs.launchpad.net/neutron/+bug/195348115:02
ralonsoh#link https://review.opendev.org/c/openstack/neutron/+/82091115:03
* dulek doesn't want to interrupt, but the CI is broken at the moment due to pecan bump in upper-constraints.txt. Ignore if you're already aware.15:03
ralonsohactually this is the right meeting to share this update15:04
slaweqthx ralonsoh for the fix15:04
mlavallethanks dulek 15:04
lajoskatonadulek: thanks, recheck is useless now?15:04
slaweqand thx dulek for the info - it's very good timing for that :)15:04
dulekI think it's useless.15:04
dulekI mean recheck.15:04
lajoskatonaok, thanks15:04
ralonsohgood to know15:04
mlavalleyeah, chances are it's useless :-)15:04
dulekThe conflict is caused by:15:05
dulek    The user requested pecan>=1.3.215:05
dulek    The user requested (constraint) pecan===1.4.115:05
dulekThis is the error. And yes, I can't really understand it either, because I don't think this conflicts.15:05
ykareldulek, log link? just to check if it failed recently, as that issue was fixed for some provider some time back15:05
fricklerah, that was likely an issue with pypi, see discussion in #opendev15:05
lajoskatonamlavalle: https://youtu.be/SJUhlRoBL8M15:05
dulekhttps://0d4538c7b62deb4c15ac-e6353f24b162df9587fa55d858fbfadc.ssl.cf5.rackcdn.com/819502/3/check/openstack-tox-pep8/a4f6148/job-output.txt15:05
fricklerhopefully solved by refreshing CDN15:05
ykarelapprox 2 hour back15:06
slaweq++ thx ykarel and frickler15:06
ykarel^ logs are older than that15:06
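As a side note, dulek's reading of the error above is right: the two specifiers shouldn't conflict, which can be verified with the `packaging` library (the same specifier machinery pip vendors for its resolver):

```python
# Check that pecan>=1.3.2 (requirement) and pecan===1.4.1 (upper constraint)
# are mutually satisfiable: version 1.4.1 matches both specifiers, so the
# reported "conflict" was a mirror/CDN artifact, not a real resolution error.
from packaging.specifiers import SpecifierSet

requested = SpecifierSet(">=1.3.2")     # from the project's requirements
constraint = SpecifierSet("===1.4.1")   # from upper-constraints.txt

print(requested.contains("1.4.1"), constraint.contains("1.4.1"))
```

Both checks return True, consistent with frickler's conclusion that the failure came from stale pypi CDN content rather than the constraints themselves.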
slaweqok, lets move on with the meeting15:07
slaweq#topic Stable branches15:07
slaweqbcafarel any updates?15:07
slaweqI think it's not in bad shape recently15:08
bcafarelindeed I think we are good overall15:08
bcafarelI was off yesterday so still checking a few failed runs in train, but nothing that looks 100% reproducible :)15:08
slaweqgreat, thx15:09
slaweq#topic Stadium projects15:09
slaweqanything in stadium what we should discuss today?15:09
slaweqlajoskatona15:09
lajoskatonaeverything is ok as far as I know15:09
opendevreviewRodolfo Alonso proposed openstack/neutron master: Remove the expired reservations in a separate DB transaction  https://review.opendev.org/c/openstack/neutron/+/82159215:09
slaweqthat's great, thx :)15:09
slaweq#topic Grafana15:09
lajoskatonaperhaps some advertisement: if you have time please check tap-as-a-service reviews: https://review.opendev.org/q/project:openstack%252Ftap-as-a-service+status:open 15:09
slaweqlajoskatona sure15:10
slaweqlajoskatona all of them? or just those which don't have merge conflicts now?15:10
mlavalleseveral of them have merge conflicts15:10
lajoskatonaslaweq: sorry, the recent ones; for some reason that link is not working for me now15:12
slaweqk15:12
lajoskatonahttps://review.opendev.org/q/project:openstack/tap-as-a-service+status:open+-age:8week15:13
lajoskatonasorry for spamming the meeting....15:13
slaweqlajoskatona I will review them tomorrow15:14
lajoskatonaslaweq: thanks15:14
slaweqok, lets get back to grafana15:15
slaweqhttp://grafana.openstack.org/dashboard/db/neutron-failure-rate15:15
slaweqoverall things don't look bad IMO15:16
slaweqand I also saw the same while looking at results of the failed jobs from last week15:16
opendevreviewMaor Blaustein proposed openstack/neutron-tempest-plugin master: Add test_create_and_update_port_with_dns_name  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82107915:17
slaweqtoday I also run my script to check number of rechecks recently15:17
slaweqand we are improving I think15:17
slaweqweek 2021-47 was 3.38 rechecks on average15:17
slaweqweek 2021-48 - 3.5515:17
slaweqweek 2021-49 - 2.9115:18
mlavallenice, trending in the right direction!15:18
ralonsohmuch better than before15:18
opendevreviewBernard Cafarelli proposed openstack/neutron stable/train: DNM: test tempest train-last tag  https://review.opendev.org/c/openstack/neutron/+/81659715:18
slaweqit's at least not 13 rechecks on average to get a patch merged, as it was a few weeks ago15:18
lajoskatona+115:18
mlavalleit's a big improvement15:18
slaweqand one more thing15:18
obondarevcool!15:18
slaweqI spent some time last week, going through the list of "gate-failure" bugs15:19
ykarelcool15:19
slaweq#link https://tinyurl.com/2p9x6yr215:19
slaweqI closed about 40 of them15:19
slaweqbut still we have many opened15:19
slaweqso if You would have some time, please check that list15:19
ralonsohsure15:19
slaweqmaybe something is already fixed and we can close it15:19
slaweqor maybe You want to work on some of those issues :)15:20
mlavalleso the homework is to close as many as possible?15:20
mlavalleohh, now I understand15:20
slaweqalso, please use that list to check if You didn't hit some already known issue before rechecking patch15:20
lajoskatonaslaweq: thanks, and thanks for closing so many of old bugs15:20
slaweqI want to remind You that we should not only recheck patches but try to identify the reason of the failure and open a bug or link to the existing one in the recheck comment :)15:21
slaweqthat's what we agreed last week on the ci meeting15:21
slaweqanything else You want to talk regarding Grafana or related stuff?15:22
slaweqif not, I think we can move on15:22
lajoskatonanothing from me15:22
ykareljust that i pushed https://review.opendev.org/c/openstack/project-config/+/821706 today15:23
ykarelto fix dashboard with recent changes15:23
lajoskatonaykarel: thanks15:24
slaweqthx15:25
slaweq#topic fullstack/functional15:25
fricklerI merged that already, feel free to ping me for future updates15:25
ykarelThanks frickler 15:26
slaweqI opened one new bug related to functional job's failures: https://bugs.launchpad.net/neutron/+bug/195475115:26
slaweqI noticed it I think twice this week15:26
slaweqso probably we will need to investigate that15:26
slaweqI will add some extra logs there to know better what's going on in that test and why it's failing15:27
slaweqas for now it's not easy to tell exactly what was the problem there15:27
slaweq#action slaweq to add some extra logs to the test, related to https://bugs.launchpad.net/neutron/+bug/1954751 to help further debugging15:27
slaweq#topic Tempest/Scenario15:28
slaweqregarding scenario jobs, the only issue I noticed that is really impacting us often now is jobs' timeouts https://bugs.launchpad.net/neutron/+bug/195347915:29
slaweqI saw at least 2 or 3 times such timeouts this week again15:29
slaweqykarel15:29
slaweqproposed https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82106715:29
slaweqto use nested virt in those jobs15:30
slaweqand wanted to discuss what do You think about it15:30
slaweqIMO that sounds worth trying (again)15:30
ralonsohfor sure yes15:30
slaweqmaybe this time it will work for us better15:30
ralonsohdo we have figures? to compare performance15:30
lajoskatonamy concern/question is regarding the availability of these nodes15:31
slaweqthe only issue with that, IIUC, is that there is a limited pool of nodes/providers which provide that possibility15:31
ykarelyes initial results are good with that but worth trying it in all patches 15:31
slaweqso jobs may be executed longer15:31
lajoskatonaif it will be a bottleneck as jobs are waiting for nodes/resources15:31
ykarelthose scenario jobs are now taking approx 1 hour less to finish15:31
ralonsohand are less prone to CI failures, right?15:31
ykarelyes i have not seen any ssh timeout failure in those tests yet15:32
lajoskatonasounds good15:32
ralonsohhmmm I know the availability is an issue, but sounds promising15:32
obondarev+1 let's try15:32
ralonsoh+1 right15:32
mlavalle+115:32
lajoskatona+1, we can revisit if we see bottleneck :-)15:32
slaweqyes, that's also my opinion about it - better wait a bit longer than recheck :)15:32
ralonsohright!15:33
ykarelyes availability is a concern, but if we have less queue time then we should be good15:33
mlavalleyes, the overall effect might be a net reduction in wait time and increase in throughput15:33
ykareland also another issue is when all the providers are down that provide those nodes15:33
ykarelwe will have issues15:33
ykarelbut if that happens rarely we can switch/switch-out from those nodes15:33
ralonsohanother fire in OVH? I hope no15:34
slaweqLOL15:34
mlavalleLOL15:34
ykarelwill send a separate patch to easily allow switching/reverting from those nodes15:35
slaweqykarel thx a lot15:36
slaweqok, lets move on15:36
slaweq#topic Periodic15:36
slaweqI see that since yesterday neutron-ovn-tempest-ovs-master-fedora seems to be broken (again)15:36
slaweqanyone want to check it?15:36
slaweqif not, I can do it15:36
mlavalleI'll help15:37
slaweqthx mlavalle15:37
slaweq#action mlavalle to check failing neutron-ovn-tempest-ovs-master-fedora job15:37
slaweqso that's all what I had for today15:38
* mlavalle has missed seeing his name associated with an action item in the CI meeting :-)15:38
slaweqthere is still that topic about ci improvements from last week15:38
slaweqbut TBH I didn't prepare anything for today as I thought that maybe it will be better to talk about it next week on a video call15:38
slaweqwdyt?15:38
slaweqshould we continue discussion today or next week on video?15:38
mlavallesounds good15:38
ralonsohperfect, I like video calls for CI meetings15:38
mlavallewhat video service do we use?15:39
lajoskatonamlavalle: jitsi15:39
mlavallecool!15:39
slaweqok, so lets continue that topic next week then15:40
lajoskatonaI like the video personally, it's extra work for you to keep the logs written15:40
slaweqif there are no other topics for today, I can give You back about 20 minutes15:41
mlavalle+115:41
ralonsohsee you tomorrow!15:41
slaweqlajoskatona it's not a big thing to do TBH, and I like video meetings too :)15:41
ykarel+115:41
slaweqthx for attending the meeting today15:41
slaweqhave a great day and see You online :)15:41
mlavalleo/15:41
slaweq#endmeeting15:41
opendevmeetMeeting ended Tue Dec 14 15:41:59 2021 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:41
opendevmeetMinutes:        https://meetings.opendev.org/meetings/neutron_ci/2021/neutron_ci.2021-12-14-15.00.html15:41
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/neutron_ci/2021/neutron_ci.2021-12-14-15.00.txt15:41
opendevmeetLog:            https://meetings.opendev.org/meetings/neutron_ci/2021/neutron_ci.2021-12-14-15.00.log.html15:41
lajoskatonao/15:41
obondarevo/15:42
bluexif you have some time please take a look at https://bugs.launchpad.net/neutron/+bug/195478515:43
bluexAny feedback on provided possible solutions would be great15:43
ralonsohbluex, can you provide a way to reproduce it?15:44
ralonsohsome steps, commands, ...15:44
mlavalle+1, yeah, that always helps15:47
bluexI'll try to figure that out, for now I'm not sure15:48
jhartkopfHey, I have a question regarding floating IP quota. Is it possible to set specific quota per FIP pool? For example, if I have two FIP pools "public" and "internal", can I set a quota of 1 for public and a quota of 5 for internal?15:54
ralonsohjhartkopf, you can specify a quota per (project, resource)15:55
opendevreviewSlawek Kaplonski proposed openstack/neutron master: Allocate IPs in bulk requests in separate transactions  https://review.opendev.org/c/openstack/neutron/+/82172715:55
ralonsohslaweq, ^^ checking this!15:55
slaweqralonsoh ^^ please check when You will have some time15:55
ralonsohhahaha15:55
slaweqin my tests improvement was significant15:55
ralonsohperfect!15:55
slaweqwe can talk about it tomorrow if You want15:56
ralonsohno way, very cool figures!15:56
slaweqdulek also, maybe You want to apply that patch in Your env and try https://review.opendev.org/c/openstack/neutron/+/82172715:56
slaweqit's related to that bug with timeout on haproxy which we talked about few weeks ago15:56
slaweqralonsoh yeah, but a lot of that improvement I got when I moved allocation for each IP to the separate transaction really15:57
slaweqwhen allocation for e.g. 10 ports was in one transaction, it was about 1:45 on average15:57
dulekslaweq: Oh man, Christmas came early this year!15:58
slaweqdulek please first try :)15:59
slaweqin my simple reproducer (without kuryr) improvement is pretty good but I would like to see feedback from You also15:59
dulekslaweq: Sure, I should be able to try it.16:00
slaweqthx a lot16:00
dulekBTW - here's our approach to the problem: https://review.opendev.org/c/openstack/kuryr-kubernetes/+/82162616:00
dulekWe'll get it in anyway - we got to support many OpenStacks that are there in the wild.16:01
dulekslaweq: So the idea behind your patch is that with each transaction being separate there is a smaller window for conflicts?16:01
slaweqdulek the idea is to do IP allocation for ports in separate transactions - so if one of them will fail, only that one will be retried16:02
slaweqand when IPs for all ports are allocated, do everything else with single transaction, as it was so far16:03
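slaweq's description above can be sketched roughly as follows (a hypothetical illustration of the idea in https://review.opendev.org/c/openstack/neutron/+/821727, not Neutron's actual API — all names here are made up):

```python
# Sketch: allocate each port's IP in its own transaction, so a DB
# conflict retries only that single allocation rather than the whole
# bulk request. DBConflict and allocate_ip are illustrative stand-ins.

class DBConflict(Exception):
    """Stand-in for a retriable DB error (e.g. a duplicate IP allocation)."""

def allocate_ips_separately(ports, allocate_ip, max_retries=3):
    """Allocate one IP per port, each in its own 'transaction'."""
    allocations = {}
    for port in ports:
        for attempt in range(max_retries):
            try:
                # In Neutron this would be a separate DB transaction.
                allocations[port] = allocate_ip(port)
                break
            except DBConflict:
                if attempt == max_retries - 1:
                    raise
    # Once all IPs are allocated, the rest of the work can proceed in a
    # single transaction, as before.
    return allocations
```

The payoff is that a conflict on one allocation no longer forces a retry of the whole bulk operation.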
jhartkopfralonsoh: So a resource would be one FIP? Or can this also be a whole FIP pool?16:05
ralonsohjhartkopf, no, a resource is FIP, port, security group, SG rules, network, router (I'm missing something)16:08
ralonsohthat means you can set the quota for FIPs in a project, or the quota for networks in a project16:08
ralonsoh$openstack quota show (that will list the quotas per project)16:08
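For reference, per-project quota management with python-openstackclient looks roughly like this (project name and values are illustrative); as ralonsoh notes, the granularity is (project, resource), not per FIP pool:

```shell
# Show current quotas for a project (illustrative project name)
openstack quota show my-project

# Cap floating IPs for that project; there is no per-FIP-pool knob,
# only per-(project, resource) limits.
openstack quota set --floatingips 1 my-project
openstack quota set --networks 20 --routers 5 my-project
```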
jhartkopfralonsoh: Ok, so it doesn't seem to be possible to do this for whole FIP pools. Do you know if this feature has been discussed before?16:14
ralonsohjhartkopf, not that I'm aware16:14
ralonsohand this should be compatible not only with Neutron but with all projects using quota16:15
jhartkopfralonsoh: Do you think this would be a welcome feature to invest some time to create a spec for?16:22
jhartkopfOr to further discuss this in general16:22
ralonsohjhartkopf, to be honest (but this is my particular opinion), there is no need for this feature right now (I know, I know, you do). And as I commented, that should be aligned with other projects16:23
ralonsohso this is more complex than you expect16:23
ralonsohcan't you use the project quota assigned to FIP?16:24
ralonsohin any case, if you want to discuss it, you can raise a topic in neutron driver's meeting16:25
ralonsohon fridays16:25
ralonsohone sec16:25
ralonsohhttps://meetings.opendev.org/#Neutron_drivers_Meeting16:25
ralonsohyou can amend the agenda and add this topic for discussion16:25
ralonsohyou can initially propose this RFE in a bug, with a description16:26
ralonsohbefore spending too much time in a spec16:26
jhartkopfralonsoh: Ok, thank you. I didn't expect this to be that complex though.16:27
jhartkopfWhat do you mean with the project quota assigned to FIP?16:28
ralonsohyou can assign a quota for FIP for a specific project16:28
ralonsohproject_1, FIP: 100, project_2, FIP: 20016:28
ralonsohyou can have different users assigned to different projects with different quotas16:29
ralonsohif that works for you16:29
jhartkopfAh I see16:29
jhartkopfI think the problem is that we'd like to set different quotas for different FIP pools16:30
jhartkopfSo we could allow usage of 1 publicly routable IP but 5 IPs that are routable internally only16:31
ralonsohjhartkopf, where are you using FIP pools?16:33
ralonsohthis seems to be a resource of nova networks16:33
ralonsoh(no, not nova networks but a compute API resource)16:34
ralonsohjhartkopf, https://github.com/openstack/nova/blob/master/api-ref/source/os-floating-ip-pools.inc16:35
ralonsohthis is deprecated16:35
jhartkopfralonsoh: Yes, we are using them with Nova instances.16:40
jhartkopfBut I cannot imagine that the feature itself is deprecated16:41
opendevreviewMaor Blaustein proposed openstack/neutron-tempest-plugin master: Add test_create_and_update_port_with_dns_name  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82107916:44
ralonsohjhartkopf, what version are you running?16:44
jhartkopfralonsoh: Ussuri16:45
opendevreviewMerged openstack/neutron-tempest-plugin master: Switch scenario jobs to nested-virt nodes  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82106717:15
jhartkopfralonsoh: Could you give me a hint where to further discuss this feature? Would this be part of Neutron? Nova seems to use those APIs (thus deprecation of the proxy API).17:23
ralonsohjhartkopf, this is in Neutron now but I can't enable it. In any case, a FIP pool (according to the description) is just the list of FIPs per subnet17:25
jhartkopfralonsoh: Ok, will have to look further into that. Thanks a lot!17:30
opendevreviewMaor Blaustein proposed openstack/neutron-tempest-plugin master: Add test_create_and_update_port_with_dns_name  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/82107917:36
opendevreviewMerged openstack/neutron stable/train: Cleanup router for which processing added router failed  https://review.opendev.org/c/openstack/neutron/+/81767718:29
masterpe[m]Hi, if I add a static route in a neutron router where the destination already exists as a connected interface/route, neutron does a replace of that route, and so it removes the connected route.20:06
masterpe[m]see https://bugs.launchpad.net/neutron/+bug/1954777 20:06
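The behaviour masterpe describes maps to the difference between `ip route replace` and `ip route add` (a minimal illustration of the semantics, not Neutron's actual code path; addresses are documentation ranges):

```shell
# 'replace' overwrites an existing route to the same destination, which
# is how a connected route can be silently clobbered:
ip route add 192.0.2.0/24 dev eth1             # connected-style route
ip route replace 192.0.2.0/24 via 203.0.113.1  # replaces it

# 'add' would instead fail with "File exists" (EEXIST), leaving the
# original route intact.
```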
opendevreviewAdam Harwell proposed openstack/neutron master: Tweak port metadata test to be more reliable  https://review.opendev.org/c/openstack/neutron/+/82177922:36
opendevreviewAdam Harwell proposed openstack/neutron master: Tweak port metadata test to be more reliable  https://review.opendev.org/c/openstack/neutron/+/82177922:59

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!