*** armax has joined #openstack-neutron-ovn | 00:10 | |
*** thumpba has joined #openstack-neutron-ovn | 00:13 | |
*** thumpba has quit IRC | 00:17 | |
*** aginwala has quit IRC | 00:42 | |
*** aginwala has joined #openstack-neutron-ovn | 00:45 | |
*** aginwala has quit IRC | 00:46 | |
*** aginwala has joined #openstack-neutron-ovn | 00:46 | |
*** gangil1 has joined #openstack-neutron-ovn | 00:47 | |
Sam-I-Am | moo. | 00:47 |
---|---|---|
*** gangil has quit IRC | 00:48 | |
*** gangil1 has quit IRC | 00:48 | |
*** gangil has joined #openstack-neutron-ovn | 00:48 | |
*** gangil has joined #openstack-neutron-ovn | 00:48 | |
*** aginwala has quit IRC | 01:13 | |
*** aginwala has joined #openstack-neutron-ovn | 01:20 | |
*** aginwala has quit IRC | 01:35 | |
*** azbiswas has quit IRC | 01:39 | |
*** azbiswas has joined #openstack-neutron-ovn | 01:39 | |
*** azbiswas has quit IRC | 01:43 | |
*** azbiswas has joined #openstack-neutron-ovn | 01:57 | |
*** gangil has quit IRC | 02:07 | |
*** azbiswas_ has joined #openstack-neutron-ovn | 02:08 | |
*** terryw has joined #openstack-neutron-ovn | 02:14 | |
*** azbiswas has quit IRC | 02:15 | |
*** otherwiseguy has quit IRC | 02:15 | |
*** openstackgerrit has quit IRC | 02:15 | |
*** openstackgerrit has joined #openstack-neutron-ovn | 02:24 | |
*** armax has quit IRC | 02:43 | |
*** jckasper has joined #openstack-neutron-ovn | 02:53 | |
*** jckasper has quit IRC | 03:11 | |
*** jckasper has joined #openstack-neutron-ovn | 03:11 | |
*** jckasper has quit IRC | 03:15 | |
*** azbiswas_ has quit IRC | 03:52 | |
*** azbiswas has joined #openstack-neutron-ovn | 03:53 | |
*** azbiswas has quit IRC | 03:57 | |
*** azbiswas has joined #openstack-neutron-ovn | 04:04 | |
*** aginwala has joined #openstack-neutron-ovn | 04:04 | |
*** jckasper has joined #openstack-neutron-ovn | 04:30 | |
*** armax has joined #openstack-neutron-ovn | 04:34 | |
*** aginwala has quit IRC | 04:34 | |
*** jckasper has quit IRC | 04:36 | |
*** jckasper has joined #openstack-neutron-ovn | 04:36 | |
*** jckasper has quit IRC | 04:40 | |
*** salv-orl_ has joined #openstack-neutron-ovn | 04:41 | |
*** jckasper has joined #openstack-neutron-ovn | 04:44 | |
*** salv-orlando has quit IRC | 04:44 | |
*** jckasper has quit IRC | 04:45 | |
*** jckasper has joined #openstack-neutron-ovn | 04:45 | |
*** jckasper has quit IRC | 04:50 | |
*** armax has quit IRC | 04:51 | |
*** armax has joined #openstack-neutron-ovn | 05:05 | |
*** aginwala has joined #openstack-neutron-ovn | 05:08 | |
*** aginwala has quit IRC | 05:12 | |
openstackgerrit | Babu Shanmugam proposed openstack/networking-ovn: DPDK support for OVN https://review.openstack.org/275103 | 05:25 |
*** armax has quit IRC | 05:37 | |
*** allan_h has quit IRC | 05:56 | |
*** aginwala has joined #openstack-neutron-ovn | 07:02 | |
*** numans has quit IRC | 07:48 | |
openstackgerrit | Babu Shanmugam proposed openstack/networking-ovn: Enabling qos support through Logical_Port.options https://review.openstack.org/265798 | 08:01 |
*** nate_gone has joined #openstack-neutron-ovn | 08:07 | |
*** nate_gone is now known as njohnston | 08:08 | |
*** aginwala_ has joined #openstack-neutron-ovn | 08:09 | |
*** aginwala has quit IRC | 08:11 | |
*** palexster has joined #openstack-neutron-ovn | 08:11 | |
*** palexster has quit IRC | 08:12 | |
*** palexster has joined #openstack-neutron-ovn | 08:12 | |
*** jckasper has joined #openstack-neutron-ovn | 08:15 | |
*** jckasper has quit IRC | 08:20 | |
*** aginwala_ has quit IRC | 08:43 | |
*** azbiswas has quit IRC | 08:44 | |
*** aginwala has joined #openstack-neutron-ovn | 08:44 | |
*** openstackgerrit has quit IRC | 08:47 | |
*** openstackgerrit_ has joined #openstack-neutron-ovn | 08:47 | |
*** openstackgerrit_ is now known as openstackgerrit | 08:48 | |
*** numans has joined #openstack-neutron-ovn | 08:48 | |
*** aginwala has quit IRC | 08:49 | |
*** jckasper has joined #openstack-neutron-ovn | 09:11 | |
*** jckasper has quit IRC | 09:16 | |
*** yamamoto has quit IRC | 10:10 | |
*** yamamoto has joined #openstack-neutron-ovn | 10:10 | |
*** yamamoto has quit IRC | 10:38 | |
*** salv-orlando has joined #openstack-neutron-ovn | 10:42 | |
*** azbiswas has joined #openstack-neutron-ovn | 10:44 | |
*** salv-orl_ has quit IRC | 10:45 | |
*** roeyc has joined #openstack-neutron-ovn | 10:48 | |
*** yamamoto has joined #openstack-neutron-ovn | 10:49 | |
*** yamamoto has quit IRC | 10:49 | |
*** azbiswas has quit IRC | 10:50 | |
openstackgerrit | Numan Siddique proposed openstack/networking-ovn: Neutron ovn northbound db sync tool https://review.openstack.org/277805 | 11:24 |
*** jckasper has joined #openstack-neutron-ovn | 11:24 | |
*** jckasper has quit IRC | 11:29 | |
*** _Mic22 has joined #openstack-neutron-ovn | 11:30 | |
*** Mic22 has quit IRC | 11:32 | |
*** Mic22 has joined #openstack-neutron-ovn | 11:38 | |
*** _Mic22 has quit IRC | 11:41 | |
*** yamamoto has joined #openstack-neutron-ovn | 11:50 | |
*** yamamoto has quit IRC | 11:58 | |
*** ig0r_ has joined #openstack-neutron-ovn | 12:12 | |
*** rtheis has joined #openstack-neutron-ovn | 12:38 | |
*** azbiswas has joined #openstack-neutron-ovn | 12:47 | |
*** azbiswas has quit IRC | 12:51 | |
*** jckasper has joined #openstack-neutron-ovn | 13:07 | |
*** jckasper has quit IRC | 13:11 | |
*** jckasper has joined #openstack-neutron-ovn | 13:11 | |
*** jckasper has quit IRC | 13:15 | |
*** jckasper has joined #openstack-neutron-ovn | 13:48 | |
Sam-I-Am | moo. | 13:48 |
*** brad_behle has joined #openstack-neutron-ovn | 14:02 | |
russellb | quack | 14:13 |
*** palexster1 has joined #openstack-neutron-ovn | 14:14 | |
*** palexster has quit IRC | 14:14 | |
*** palexster1 is now known as palexster | 14:14 | |
Sam-I-Am | russellb: moooorning! | 14:15 |
Sam-I-Am | russellb: up for a philosophical discussion on routing in ovn? | 14:30 |
*** allan_h has joined #openstack-neutron-ovn | 14:53 | |
openstackgerrit | Ryan Moats proposed openstack/networking-ovn: (WIP) Support port security API extension https://review.openstack.org/269219 | 14:58 |
openstackgerrit | Ryan Moats proposed openstack/networking-ovn: (WIP) Support Allowed address pairs https://review.openstack.org/269535 | 14:58 |
*** ig0r_ has quit IRC | 15:21 | |
*** azbiswas has joined #openstack-neutron-ovn | 15:33 | |
azbiswas | russellb: Have you ever encountered "referential integrity violation" in the ovsdb. | 15:44 |
Sam-I-Am | azbiswas: i've seen it in the gate | 15:44 |
azbiswas | Sam-I-Am: Thanks, do we know the reason behind it - Is it multiple api workers trying to modify the DB at the same time? | 15:45 |
Sam-I-Am | dont know | 15:46 |
Sam-I-Am | do we set workers >1 by default? | 15:46 |
azbiswas | In devstack yes - not sure what the gate does | 15:46 |
azbiswas | I'm not sure if the api workers is the reason behind it either. | 15:47 |
Sam-I-Am | definitely needs some investigation. i dont see it in the neutron/ml2 jobs. | 15:47 |
azbiswas | In our situation, there were 2 api workers operating on the same Logical Switch/ACL association at the same time. The 1st api worker was able to commit its changes correctly, the 2nd api worker failed. | 15:49 |
russellb | sounds like an expected type of error when there are multiple workers | 15:54 |
russellb | we need to be able to handle those errors, it means we need to redo the transaction using the updated db state | 15:55 |
*** njohnston has quit IRC | 15:55 | |
*** HenryG has quit IRC | 15:55 | |
Sam-I-Am | i wonder what the default # of workers is | 15:55 |
Sam-I-Am | i know increasing rpc workers creates a mess | 15:55 |
russellb | it's > 1 | 15:55 |
Sam-I-Am | is it >1 for neutron/ml2 too? | 15:55 |
Sam-I-Am | i thought it was based on number of cpus or something | 15:55 |
russellb | it has changed a few times | 15:56 |
Sam-I-Am | surprise :) | 15:56 |
russellb | neutron defaults to # of CPUs | 15:57 |
russellb | the gate cuts that back | 15:57 |
russellb | but it's still multiple | 15:57 |
azbiswas | Neutron retries the transaction if the error returned is RETRY | 15:57 |
azbiswas | but in this case it was error | 15:58 |
russellb | oh, i see | 15:58 |
russellb | can you send the full error? | 15:58 |
azbiswas | "details":"Table Logical_Switch column acls row fa55edc2-72de-4d02-8ccc-19f10be2dc1e references nonexistent row 036dc043-f434-48a3-b767-9b68653f3fa3 in table ACL.","error":"referential integrity violation"} | 15:58 |
azbiswas | There were 2 create ports on the same provider network at the same time on 2 different api workers | 16:00 |
azbiswas | There were 4 existing ports on the provider networks, say A, B, C, D and api worker 1 is adding port X, api worker 2 is adding port Y. | 16:01 |
*** HenryG has joined #openstack-neutron-ovn | 16:02 | |
russellb | i see... | 16:03 |
azbiswas | api worker 1 tries to commit [DelACL A, AddACL A, DelACL B, AddACL B, DelACL C, AddACL C, DelACL D, AddACL D, DelACL X, AddACL X] | 16:03 |
azbiswas | and succeeds | 16:03 |
azbiswas | api worker 2 tries to commit [DelACL A, AddACL A, DelACL B, AddACL B, DelACL C, AddACL C, DelACL D, AddACL D, DelACL Y, AddACL Y] | 16:03 |
azbiswas | and fails | 16:03 |
azbiswas | since api worker doesn't know about X as yet, it doesn't operate on X - that's a separate problem. | 16:04 |
azbiswas | or maybe the same? | 16:04 |
* russellb reading code, not ignoring | 16:05 | |
russellb | oops i have a meeting that started 5 minutes ago | 16:05 |
azbiswas | We can discuss later - The nonexistent ACL row was deleted by api worker 1 but MAY have been contained in the LSwitch ACL list transaction of api worker 2 - possibly. | 16:06 |
russellb | yes that would explain it | 16:11 |
russellb | and if that's it i think i know the fix | 16:11 |
russellb | we should be calling verify() | 16:13 |
azbiswas | I'm still trying to wrap my head around why the 2nd api worker LSwitch ACL list would contain an old ACL, they are both deleting the existing A,B,C,D ACLs | 16:13 |
russellb | to make sure that when we update a column, it previously had the values we think it did | 16:13 |
azbiswas | we are calling lswitch.verify() after each add/delete of ACL | 16:14 |
russellb | hm ok | 16:14 |
azbiswas | So each api worker sees its own version of lswitch = idlutils.row_by_value(self.api.idl, 'Logical_Switch', ..)? | 16:19 |
*** roeyc has quit IRC | 16:23 | |
*** thumpba has joined #openstack-neutron-ovn | 16:26 | |
*** flaviof has quit IRC | 16:38 | |
*** lrichard has quit IRC | 16:39 | |
*** lrichard has joined #openstack-neutron-ovn | 16:39 | |
*** armax has joined #openstack-neutron-ovn | 16:41 | |
*** salv-orl_ has joined #openstack-neutron-ovn | 16:41 | |
*** salv-orlando has quit IRC | 16:44 | |
*** flaviof has joined #openstack-neutron-ovn | 16:53 | |
*** numans has quit IRC | 17:13 | |
Sam-I-Am | russellb: moo? | 17:20 |
*** regXboi has joined #openstack-neutron-ovn | 17:24 | |
*** manand has joined #openstack-neutron-ovn | 17:24 | |
*** manand has quit IRC | 17:25 | |
russellb | azbiswas: yes, each worker has its own local copy of the db, when it does row_by_value(), it's from local cache | 17:42 |
russellb | verify() is supposed to be saying "before i update this column, make sure it still has the previous value i thought it had" | 17:42 |
russellb | you said we're calling verify() after each add/delete, that might be wrong | 17:43 |
russellb | after instead of before i mean | 17:43 |
russellb | i'll have to dig into the code some more | 17:43 |
russellb | can someone file a bug about this? | 17:43 |
russellb | Sam-I-Am: hi | 17:43 |
Sam-I-Am | russellb: hey | 17:44 |
azbiswas | I'll file the bug, the verify() is called at each run_idl() call for AddACL and DelACL | 17:44 |
russellb | in any case, it definitely sounds like we've missed something in proper transaction handling | 17:45 |
Sam-I-Am | russellb: i had a long debate with myself yesterday about routing... particularly the provider-private/nat/fip bits. maybe you have some time to discuss at some point? | 17:46 |
russellb | sure, though i don't know what our plans are for that yet | 17:47 |
Sam-I-Am | well, the concern is, do we really NEED plans | 17:47 |
Sam-I-Am | hence the philosophical part :) | 17:47 |
russellb | azbiswas and chandrav did a fantastic job on a floating IP proposal (that i've been bad about not responding to yet) | 17:47 |
russellb | heh | 17:48 |
russellb | did it get solved in my sleep? | 17:48 |
russellb | or? | 17:48 |
* azbiswas perks up about floating IP discussion :) | 17:48 | |
*** igordcard has joined #openstack-neutron-ovn | 17:48 | |
Sam-I-Am | well, i'm kind of wondering... is the concept of floating ips flawed and just something we've all gotten used to having? | 17:48 |
russellb | some people claim that | 17:50 |
russellb | however, it's a well established thing that we're expected to have | 17:50 |
Sam-I-Am | seems like its there to work around the limits of ipv4 addresses | 17:50 |
Sam-I-Am | there is no floating ip for v6 | 17:50 |
Sam-I-Am | nat is a hack | 17:50 |
russellb | right. | 17:50 |
russellb | are we doing ipv6 only? | 17:50 |
russellb | (no :-p) | 17:50 |
Sam-I-Am | lol | 17:50 |
russellb | i'm not sure we can get away with *not* supporting nat and floating ips | 17:51 |
russellb | it's the most common setup | 17:51 |
Sam-I-Am | well, maybe, maybe not | 17:51 |
Sam-I-Am | for a long time, it was assumed that The Way with neutron was a provider network, private network(s), routers, and floating IPs for vms that need access from the outside world | 17:52 |
Sam-I-Am | people quickly found out the limitations of the network node... a performance bottleneck and single point of failure | 17:53 |
* russellb nods | 17:53 | |
Sam-I-Am | along came the networking guide which outlined the ability to attach vms directly to provider nets, or use a hybrid approach | 17:53 |
Sam-I-Am | for the hybrid approach, vms that need to be public have an interface on a provider network, and optionally a private network that may or may not allow snat. | 17:54 |
russellb | scaling that well really needs "routed networks" | 17:55 |
russellb | right? | 17:55 |
Sam-I-Am | define routed networks | 17:55 |
Sam-I-Am | you mean rather than snatted? | 17:55 |
russellb | i mean the neutron routed networks spec | 17:56 |
Sam-I-Am | oh, ehhh.... | 17:57 |
russellb | part of what you're getting at kind of sounds like this existential debate about how much we want virtual networking in the first place | 17:57 |
Sam-I-Am | haha | 17:57 |
Sam-I-Am | well, i was hoping that wasnt the case... | 17:57 |
russellb | maybe it's not | 17:58 |
russellb | but | 17:58 |
russellb | with both floating IPs and direct plugging to the public network, we currently assume all IPs are valid on every hypervisor | 17:59 |
russellb | in larger deployments, that starts to not be great, as i understand it | 17:59 |
Sam-I-Am | things like dvr go to great extents to preserve the conventional floating ip logic more or less using layer 2. sure, each compute node can respond to floating IPs for vms that reside on it, but at the expense of complexity and resiliency | 18:00 |
russellb | gateways help in one way (limit the scope of what nodes need this) | 18:00 |
russellb | the other way is to expose more info about the scalable L3 fabric directly to the compute hosts (that's where routed networks comes in) | 18:00 |
Sam-I-Am | dvr also doesnt solve snat for fixed ips, because what it does is broken. | 18:00 |
Sam-I-Am | well, thats sort of where i was going | 18:01 |
Sam-I-Am | conventional networks either use mct, vrrp, or rely on routing protocols to handle ecmp and failover | 18:02 |
Sam-I-Am | neutron l3ha is vrrp, and that's all good. its well understood and works. with the proper scheduling of routers, it scales better, but still not going to match hardware routers/nat | 18:03 |
azbiswas | russellb: I gotta run, will be following the ovn meeting in read only mode, but I noticed that in the neutron code the verify is done before update while in the networking-ovn code the verify is done after the update. | 18:05 |
azbiswas | row.external_ids = {'neutron:lport': self.lport} | 18:05 |
azbiswas | acls = getattr(lswitch, 'acls', []) | 18:05 |
azbiswas | acls.append(row.uuid) | 18:05 |
azbiswas | setattr(lswitch, 'acls', acls) | 18:05 |
azbiswas | lswitch.verify('acls') | 18:05 |
russellb | ok, i think we have that backwards | 18:05 |
azbiswas | i'll try out the change and see if it makes a difference | 18:06 |
russellb | azbiswas: great thank you!! | 18:06 |
russellb | hopefully that's all it is ... | 18:06 |
* azbiswas nods | 18:06 | |
*** azbiswas has quit IRC | 18:06 | |
russellb | (OVN meeting in 10 minutes) | 18:06 |
Sam-I-Am | russellb: i think what i'm getting at is if we're going to do routing in neutron, it needs to be REAL routing, not some hacky layer2 stuff | 18:07 |
Sam-I-Am | i was sort of getting the feeling that the nat/fip spec for ovn looked too much like dvr | 18:07 |
russellb | from a high level yes | 18:07 |
russellb | implemented quite differently | 18:07 |
russellb | but yeah. | 18:07 |
Sam-I-Am | ovn/ovs is already overwhelming for most operators to comprehend, hence why linuxbridge is gaining popularity | 18:08 |
Sam-I-Am | dvr adds significant complexity that arguably overshadows what it aims to solve | 18:08 |
Sam-I-Am | i'm afraid that we're going to implement something thats fundamentally broken just to say we have it rather than looking at the bigger picture of what routing in neutron means | 18:09 |
*** gangil has joined #openstack-neutron-ovn | 18:11 | |
*** gangil has joined #openstack-neutron-ovn | 18:11 | |
russellb | right now we just have the east-west part ... as for north/south, the picture is totally unclear to me until there's a proposal for gateways | 18:12 |
russellb | i hope you can chime in on that, because it sounds like you have a better idea than i do | 18:12 |
Sam-I-Am | yeah... east-west that doesnt rely on a single node is one of the two things DVR does | 18:13 |
russellb | right | 18:13 |
russellb | that part seems sane | 18:13 |
russellb | right? | 18:13 |
Sam-I-Am | yeah, it makes sense | 18:13 |
*** aginwala has joined #openstack-neutron-ovn | 18:14 | |
Sam-I-Am | what doesnt make sense is n-s with fips | 18:14 |
russellb | ok, it'd be good to express that view on the floating IP proposal on the ovs dev mailing list | 18:14 |
russellb | (meeting time) | 18:15 |
*** zhouhan has joined #openstack-neutron-ovn | 18:15 | |
Sam-I-Am | i'm trying to make sure i dont sound completely silly. just after a long internal debate, it seemed like we're trying to implement fips because they've just been there. | 18:15 |
Sam-I-Am | russellb: yeah, catch you on the other side | 18:15 |
zhouhan | hi | 18:15 |
aginwala | hi | 18:16 |
Sam-I-Am | mestery: i'd also like to get your opinion on these things | 18:16 |
*** azbiswas_ has joined #openstack-neutron-ovn | 18:18 | |
*** chandrav has joined #openstack-neutron-ovn | 18:21 | |
*** numans has joined #openstack-neutron-ovn | 18:48 | |
*** arosen12 has left #openstack-neutron-ovn | 18:54 | |
*** arosen12 has joined #openstack-neutron-ovn | 18:54 | |
*** ig0r_ has joined #openstack-neutron-ovn | 18:58 | |
*** ig0r_ has quit IRC | 19:03 | |
*** azbiswas_ has quit IRC | 19:04 | |
*** ig0r_ has joined #openstack-neutron-ovn | 19:08 | |
*** jckasper has quit IRC | 19:15 | |
*** jckasper has joined #openstack-neutron-ovn | 19:17 | |
russellb | anyone interested in testing this to see if you can reproduce it? https://bugs.launchpad.net/networking-ovn/+bug/1543795 | 19:19 |
openstack | Launchpad bug 1543795 in networking-ovn "Ping through 2 routers is not working" [Undecided,New] | 19:19 |
*** numans has quit IRC | 19:19 | |
mestery | regXboi: I can try that at the top of the hour | 19:27 |
mestery | I have a meeting in 3 minutes | 19:27 |
mestery | russellb not regXboi | 19:27 |
chandrav | russellb: Is that a valid configuration ? | 19:27 |
russellb | i think there's a typo in the report | 19:27 |
russellb | (same address on networks 2 and 3) | 19:28 |
russellb | assuming that's a typo, seems like it should be valid | 19:28 |
chandrav | How can network 2 be connected to 2 routers | 19:28 |
chandrav | it has only 1 gateway address (10.2.0.1) | 19:28 |
russellb | ah, well that's a good point :) | 19:29 |
chandrav | for a given network there can be only 1 default gateway. I doubt if openstack allows such a configuration | 19:30 |
russellb | thanks chandrav, i guess i hadn't really thought through what they were doing | 19:30 |
chandrav | yw | 19:31 |
chandrav | i am seeing a problem with provider networks with the latest tip of the tree. the ports connecting the br-provider and br-int do not show up. anyone else seen this issue yet ? | 19:36 |
russellb | chandrav: it might be a change in behavior you didn't know about | 19:38 |
russellb | chandrav: the patch ports get created later, once you create a port on the provider network | 19:39 |
russellb | and there's going to be a patch port per VM now | 19:39 |
russellb | https://github.com/openvswitch/ovs/commit/e90aeb578ffc0cbae377b6251c2d956de98dacad | 19:39 |
chandrav | russellb: thx, i'll look into it | 19:40 |
russellb | and then zhouhan has a nice patch to change it yet again :) | 19:47 |
russellb | so that we don't create a separate lswitch for every port on a provider network | 19:47 |
*** azbiswas has joined #openstack-neutron-ovn | 19:50 | |
*** aginwala has quit IRC | 19:56 | |
*** allan_h has quit IRC | 19:59 | |
*** allan_h has joined #openstack-neutron-ovn | 20:02 | |
*** aginwala has joined #openstack-neutron-ovn | 20:04 | |
Sam-I-Am | russellb: moo | 20:26 |
Sam-I-Am | russellb: got a q about the devstack plugin | 20:26 |
Sam-I-Am | russellb: in vagrant, the controller node needs neither ovn-northd nor ovn-controller | 20:28 |
Sam-I-Am | ovn-northd is on the db node and ovn-controller is only on compute nodes | 20:28 |
Sam-I-Am | however, if you disable both of those services, install_ovn never runs | 20:28 |
regXboi | Sam-I-Am: mestery and I ran into that | 20:29 |
Sam-I-Am | sounds like we can just ditch that conditional? | 20:29 |
Sam-I-Am | we already know the deployment uses networking-ovn at that point | 20:29 |
regXboi | I was thinking of running install_ovn if q-svc is defined | 20:29 |
Sam-I-Am | that might work. really if any q* service is enabled. | 20:30 |
regXboi | yeah, I think that sounds right | 20:30 |
Sam-I-Am | do the conventional neutron agents like dhcp and md need ovn-controller at all? | 20:31 |
russellb | dhcp does | 20:31 |
russellb | not sure about md, i don't remember how that one works to be honest | 20:31 |
Sam-I-Am | magic | 20:31 |
russellb | any agent that creates ports needs ovn-controller | 20:31 |
azbiswas | l3 does as well | 20:31 |
russellb | so yep, l3 needs it too | 20:32 |
Sam-I-Am | azbiswas: yeah. i'm assuming we're not using the conventional l3 agent. | 20:32 |
Sam-I-Am | but i figured it would apply | 20:32 |
azbiswas | right, just threw it out there. | 20:32 |
russellb | that's some validation we could put into our devstack plugin actually | 20:32 |
russellb | if (q-l3 || q-dhcp) && !ovn-controller, error | 20:32 |
Sam-I-Am | possibly | 20:33 |
russellb | if nova-compute && !ovn-controller, error | 20:33 |
Sam-I-Am | thinking about that | 20:34 |
Sam-I-Am | first thing, if is_ovn_service_enabled q-* then do main loop | 20:35 |
Sam-I-Am | since there's no generic ovn service | 20:36 |
Sam-I-Am | but we know if there's neutron bits, ovn should probably be there too | 20:36 |
Sam-I-Am | russellb: is this even a thing - is_service_enabled ovn ? | 20:39 |
Sam-I-Am | from plugin.sh | 20:39 |
russellb | historical ... | 20:40 |
Sam-I-Am | ok, makes sense. | 20:40 |
russellb | first version didn't support multi-node | 20:40 |
russellb | it was only "ovn" and it enabled everything | 20:40 |
*** aginwala has quit IRC | 20:40 | |
Sam-I-Am | does anything still use it? | 20:40 |
*** aginwala has joined #openstack-neutron-ovn | 20:40 | |
russellb | no docs or anything no | 20:41 |
Sam-I-Am | debating where i might put this sanity check. in here... or in another function... or just in the main loop | 20:41 |
russellb | dunno | 20:41 |
Sam-I-Am | or if its just a docs thing too | 20:45 |
russellb | detecting obviously broken config seems like a good thing if easy enough to do | 20:45 |
russellb | not critical though | 20:46 |
*** aginwala has quit IRC | 21:02 | |
*** aginwala has joined #openstack-neutron-ovn | 21:03 | |
*** jckasper has quit IRC | 21:09 | |
*** aginwala has quit IRC | 21:11 | |
*** aginwala has joined #openstack-neutron-ovn | 21:16 | |
*** allan_h has quit IRC | 21:24 | |
*** allan_h has joined #openstack-neutron-ovn | 21:24 | |
Sam-I-Am | oops, disabling ovn-controller also yielded another problem | 21:32 |
Sam-I-Am | yay fun bugs | 21:32 |
*** allan_h has quit IRC | 21:39 | |
*** arosen121 has joined #openstack-neutron-ovn | 21:47 | |
*** brad_behle has quit IRC | 21:49 | |
*** allan_h has joined #openstack-neutron-ovn | 21:51 | |
*** arosen12 has quit IRC | 21:51 | |
*** aginwala has quit IRC | 21:54 | |
*** thumpba has quit IRC | 21:54 | |
*** manand has joined #openstack-neutron-ovn | 21:57 | |
*** aginwala has joined #openstack-neutron-ovn | 21:57 | |
*** arosen12 has joined #openstack-neutron-ovn | 22:00 | |
*** jckasper has joined #openstack-neutron-ovn | 22:02 | |
*** arosen121 has quit IRC | 22:03 | |
*** igordcard has quit IRC | 22:09 | |
*** aginwala_ has joined #openstack-neutron-ovn | 22:10 | |
*** aginwala has quit IRC | 22:10 | |
*** rtheis has quit IRC | 22:10 | |
*** c_z has joined #openstack-neutron-ovn | 22:29 | |
*** jckasper has quit IRC | 22:30 | |
*** jckasper has joined #openstack-neutron-ovn | 22:31 | |
*** jckasper has quit IRC | 22:35 | |
*** jckasper has joined #openstack-neutron-ovn | 22:37 | |
c_z | I tried going through this tutorial http://blog.russellbryant.net/2015/05/14/an-ez-bake-ovn-for-openstack/ but the second devstack host never finished setting up. just says Waiting for ovn-controller to start ... done. | 22:37 |
c_z | any thoughts on why this happened? | 22:38 |
Sam-I-Am | c_z: good question | 22:40 |
Sam-I-Am | so the controller node hung? | 22:40 |
c_z | yeah | 22:41 |
Sam-I-Am | are there any firewalls on these nodes? | 22:41 |
c_z | I checked nova hypervisor-list too, it isn't showing up | 22:42 |
c_z | I don't think so | 22:42 |
Sam-I-Am | is the ovsdb listening on the controller port 6640? | 22:42 |
*** salv-orlando has joined #openstack-neutron-ovn | 22:42 | |
c_z | how can I check that? | 22:42 |
Sam-I-Am | ss -lntp | grep 6640 | 22:42 |
c_z | yeah looks like it is | 22:43 |
Sam-I-Am | can the compute node get there? | 22:43 |
*** salv-orl_ has quit IRC | 22:44 | |
c_z | yeah | 22:45 |
c_z | I'm pretty new to this so I apologize if this is something silly | 22:47 |
Sam-I-Am | well, it could be something broken in devstack | 22:49 |
Sam-I-Am | c_z: not sure how updated that blog post is | 22:51 |
Sam-I-Am | does this change anything you did? https://github.com/openstack/networking-ovn/blob/master/doc/source/testing.rst | 22:51 |
Sam-I-Am | err... | 22:51 |
Sam-I-Am | http://docs.openstack.org/developer/networking-ovn/testing.html | 22:51 |
c_z | I've gone through that one too, same result | 22:53 |
Sam-I-Am | so the first node worked ok? | 22:53 |
c_z | yeah | 22:54 |
c_z | and I can boot VMs fine | 22:54 |
Sam-I-Am | can you find anything in the logs on the compute node? | 22:54 |
Sam-I-Am | usually ovn-controller hanging means it can't reach the db for some reason | 22:54 |
c_z | I honestly don't know what to look for | 22:54 |
Sam-I-Am | so from the compute node, you can telnet to the controller node on port 6640? | 22:55 |
*** aginwala_ has quit IRC | 22:56 | |
*** aginwala has joined #openstack-neutron-ovn | 22:56 | |
c_z | no, looks like I can't | 23:00 |
Sam-I-Am | but its listening on the controller node? | 23:03 |
c_z | I think so | 23:05 |
*** regXboi has quit IRC | 23:07 | |
Sam-I-Am | so, ss -lntp shows 6640 listening? | 23:07 |
Sam-I-Am | and the ovsdb process is running? | 23:07 |
c_z | if I do that on the first host it shows 6640 listening | 23:08 |
c_z | was I supposed to do it on the second? | 23:08 |
Sam-I-Am | might as well | 23:10 |
c_z | well on the second it shows nothing | 23:11 |
Sam-I-Am | ok | 23:11 |
Sam-I-Am | on the controller node, what does iptables -S show? | 23:11 |
*** jckasper has quit IRC | 23:13 | |
*** jckasper has joined #openstack-neutron-ovn | 23:13 | |
*** jckasper has quit IRC | 23:14 | |
*** jckasper has joined #openstack-neutron-ovn | 23:14 | |
russellb | c_z: so it says waiting for ovn-controller to start, but it *does* say "done." right? | 23:16 |
russellb | and this is a compute node? | 23:16 |
russellb | the next command the devstack plugin runs is "ovs-appctl" | 23:16 |
russellb | can you see if there is an ovs-appctl process running? | 23:16 |
c_z | it looks like there is | 23:21 |
c_z | and yes, it says done after waiting for ovn-controller | 23:23 |
c_z | but it never showed the host ip or said stack.sh completed | 23:23 |
russellb | ok | 23:24 |
russellb | so ovs-appctl is hanging then | 23:24 |
russellb | c_z: try ... sudo ovs-appctl -t ovn-controller list-commands | 23:25 |
*** manand has quit IRC | 23:26 | |
c_z | so I tried on both nodes | 23:27 |
c_z | the first node it lists commands, but it hangs for the second node | 23:27 |
russellb | ok, well at least it's consistent!! | 23:27 |
Sam-I-Am | lol | 23:27 |
c_z | lol | 23:27 |
russellb | is ovn-controller actually running? | 23:27 |
russellb | or did devstack mistakenly think so | 23:28 |
russellb | and if it *is* running, what's in the log | 23:28 |
c_z | hmmm it looks like it | 23:29 |
c_z | is there a better way to check than ps aux? | 23:29 |
russellb | it should be running in one of the screen tabs | 23:29 |
russellb | screen -x ... switch to the tab running ovn-controller | 23:29 |
russellb | maybe ovn-controller is mis-configured and it's stuck in some startup.. | 23:29 |
* russellb has to step away ... leave whatever info you have and i'll keep looking later (maybe tomorrow) | 23:31 | |
c_z | looks like it is attempting to connect over and over | 23:31 |
russellb | ok, so mis-configured somehow | 23:31 |
russellb | I take it you set SERVICE_HOST in local.conf to the IP of the first node? | 23:31 |
c_z | yes I did | 23:32 |
russellb | ok | 23:32 |
russellb | wellllll | 23:32 |
russellb | i'll see if i can reproduce tomorrow, not sure what it | 23:32 |
russellb | is | 23:32 |
c_z | ok, thanks for the help! | 23:32 |
c_z | you too, Sam-I-Am | 23:32 |
c_z | it could be user error :P | 23:33 |
Sam-I-Am | c_z: does it say what its trying to connect to? | 23:33 |
c_z | 2016-02-11T23:34:47Z|00817|reconnect|INFO|tcp:10.0.2.15:6640: connecting... 2016-02-11T23:34:47Z|00818|reconnect|INFO|tcp:10.0.2.15:6640: connection attempt failed (Connection refused) 2016-02-11T23:34:47Z|00819|reconnect|INFO|tcp:10.0.2.15:6640: waiting 8 seconds before reconnect | 23:33 |
Sam-I-Am | what is 2.15? | 23:35 |
c_z | sorry, that's the ip of the first node | 23:35 |
c_z | 10.0.2.15 | 23:35 |
Sam-I-Am | ok, so back on the controller node... | 23:35 |
Sam-I-Am | what does iptables -S say? | 23:35 |
Sam-I-Am | also what distro is this? | 23:35 |
c_z | ubuntu 14.04 | 23:36 |
c_z | here is the iptables result for the first node http://pastebin.com/raw/hr2tW8Vc | 23:36 |
c_z | and for the second http://pastebin.com/raw/DNz6QxVL | 23:37 |
Sam-I-Am | hmm | 23:38 |
Sam-I-Am | on the controller node, what is the output of 'ss -lntp' ? | 23:38 |
Sam-I-Am | whats the ip of the second node? | 23:38 |
Sam-I-Am | 10.0.2.x smells like a virtualbox network... or an lxc default. | 23:39 |
c_z | I am doing this on virtualbox... is that a mistake? | 23:40 |
Sam-I-Am | no | 23:42 |
Sam-I-Am | but i'm thinking your vms cant actually see each other | 23:42 |
c_z | you know, I bet you're right | 23:42 |
Sam-I-Am | so, since you're using virtualbox, did you know we offer vagrant scripts? | 23:43 |
c_z | yeah | 23:43 |
Sam-I-Am | its a lot more elegant | 23:43 |
c_z | hmm, maybe I should go that route | 23:45 |
c_z | I think I am gonna come back to this tomorrow, I need to take a break haha | 23:48 |
c_z | thanks for all the help :) | 23:48 |
Sam-I-Am | no problem | 23:49 |
Sam-I-Am | brain hurt sucks | 23:50 |
Sam-I-Am | its not unusual around here | 23:50 |
c_z | haha good to know | 23:51 |
*** c_z has left #openstack-neutron-ovn | 23:51 | |
Sam-I-Am | that looks a lot like an ibm ip | 23:58 |
*** chandrav has quit IRC | 23:59 |
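
Editorial sketch of the verify() ordering question discussed between 16:13 and 18:06. This is a toy model in plain Python, not the OVS IDL or networking-ovn code: the `Txn` class, its methods, and the `race` helper are hypothetical names made up for illustration. The model assumes that a verify issued after the column has already been written records no prerequisite, which is the behavior russellb and azbiswas suspect. In this simplified model the symptom of the wrong order is a silently lost update rather than the referential integrity violation seen in the real database.

```python
class Conflict(Exception):
    """Raised when a verified column changed since the snapshot was taken."""


class Txn:
    """Toy transaction: each worker works from its own local copy of the db."""

    def __init__(self, db):
        self.db = db
        self.cache = dict(db)   # local, possibly stale, snapshot
        self.prereqs = {}       # column -> value we expect the server to hold
        self.writes = {}        # column -> new value to commit

    def verify(self, column):
        # Assumption of this model: once a column has been written in this
        # transaction, verify() no longer records a prerequisite.
        if column not in self.writes:
            self.prereqs[column] = self.cache[column]

    def get(self, column):
        return self.writes.get(column, self.cache[column])

    def set(self, column, value):
        self.writes[column] = value

    def commit(self):
        # Reject the whole transaction if any verified column has changed.
        for column, expected in self.prereqs.items():
            if self.db[column] != expected:
                raise Conflict(column)
        self.db.update(self.writes)


def add_acl(txn, acl, verify_first):
    if verify_first:
        txn.verify("acls")
    txn.set("acls", txn.get("acls") + (acl,))
    if not verify_first:
        txn.verify("acls")  # too late: nothing is recorded


def race(verify_first):
    """Two workers snapshot the same state, then both try to add an ACL."""
    db = {"acls": ("A", "B")}
    w1, w2 = Txn(db), Txn(db)
    add_acl(w1, "X", verify_first)
    add_acl(w2, "Y", verify_first)
    w1.commit()
    try:
        w2.commit()
        outcome = "committed"
    except Conflict:
        outcome = "conflict"  # the caller would redo the work on fresh state
    return outcome, db["acls"]
```

With verify before the update, `race(True)` returns `("conflict", ("A", "B", "X"))`, so the second worker learns it must retry. With verify after the update, `race(False)` returns `("committed", ("A", "B", "Y"))`, and the first worker's ACL X is overwritten without any error.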
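
Editorial sketch of the reachability check suggested during the 22:37 to 23:51 debugging session, where the compute node was asked to telnet to the controller on port 6640. It uses only the Python standard library; the function name `can_connect` is a hypothetical helper, and it only tests that a TCP connection opens, not that the OVSDB server behind it is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run from the second node, `can_connect("10.0.2.15", 6640)` would use the address and port shown in the ovn-controller log in this session. A False result from the compute node while `ss -lntp` shows the port listening on the controller would point at the network between the two VMs, which is where the conversation ended up.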