Sunday, October 26, 2008

Debugging an E-BGP multihop scenario

I have the following frame-relay hub and spoke topology:

R5----R2----R6

R2 is the hub, all routers are in the 150.100.100.0/24 subnet.

R2 = 150.100.100.2
R5 = 150.100.100.5
R6 = 150.100.100.6

Please note that R2 has one multipoint subinterface connected to R5 and R6. Blogspot doesn't like text drawings so I must draw it like above.

All routers are in separate sub-ASes of a BGP confederation. R2 can only peer with one of the spokes, and R5 and R6 must peer with each other.

The peers will not come up without ebgp-multihop configured, but suppose we forgot that. What kind of debugging could we do to lead us to that conclusion?

1) debug ip bgp

R6#
*May 25 04:46:18.587: BGP: 150.100.100.5 open active, local address 150.100.100.6
*May 25 04:46:48.587: BGP: 150.100.100.5 open failed: Connection timed out; remote host not responding, open active delayed 29387ms (35000ms max, 28% jitter)

This debug command shows us that BGP never gets past the Active state. RFC 1771 tells us this about the Active state:

"In this state BGP is trying to acquire a peer by initiating a transport protocol connection."

So our TCP connection is not completing. Do we have IP connectivity to R5? Sure:

R6#ping 150.100.100.5

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 150.100.100.5, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 100/102/108 ms
R6#


So now we can look higher up the protocol stack (e.g. filtering), or maybe the problem is still in the IP layer. In this case we have no ACLs applied.
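
If you want to confirm that rather than take my word for it, something like the following should do. This is only a sketch: the interface name is the one that shows up in the debug output on R6, and the same check would have to be repeated on R2 and R5 with their own interface names.

R6#show ip interface Serial0/1/0
R6#show ip access-lists

The first command reports whether an inbound or outgoing access list is set on the interface, and the second lists any ACLs defined on the router along with their match counters.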

2) debug ip packet detail

What I am looking for here is packets from R5, sourced from 150.100.100.5. The debug shows that I am getting none! However, it does show ICMP type 11 messages arriving from R2 immediately after I send a packet to R5:

R6#
*May 25 04:51:44.539: IP: tableid=0, s=150.100.100.6 (local), d=150.100.100.5 (Serial0/1/0), routed via FIB
*May 25 04:51:44.539: IP: s=150.100.100.6 (local), d=150.100.100.5 (Serial0/1/0), len 44, sending
*May 25 04:51:44.539: TCP src=24713, dst=179, seq=1584149779, ack=0, win=16384 SYN
*May 25 04:51:44.563: IP: tableid=0, s=150.100.100.2 (Serial0/1/0), d=150.100.100.6 (Serial0/1/0), routed via RIB
*May 25 04:51:44.563: IP: s=150.100.100.2 (Serial0/1/0), d=150.100.100.6 (Serial0/1/0), len 56, rcvd 3
*May 25 04:51:44.563: ICMP type=11, code=0


Seems that R2 is telling us something about our packet sent to R5.
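
A side note on this debug: on a router carrying real traffic, an unfiltered debug ip packet can be overwhelming. A minimal sketch of how it could be limited to the BGP session and the ICMP replies is below. The ACL number 199 is an arbitrary choice of mine, not something from the lab.

! ACL 199 is a hypothetical number, pick any free extended ACL
! Match BGP in both directions plus any ICMP coming back
access-list 199 permit tcp any any eq bgp
access-list 199 permit tcp any eq bgp any
access-list 199 permit icmp any any

R6#debug ip packet 199 detail
R6#undebug all

The undebug all at the end simply turns the debugging off again once you have seen enough.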

3) debug ip icmp

*May 25 04:53:23.839: ICMP: time exceeded rcvd from 150.100.100.2

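Here we get our answer. ICMP type 11, code 0 is "time exceeded in transit". At this point we realize that our TCP SYN packets sent to R5 leave R6 with an IP TTL of 1, the default for an eBGP session, and thus are getting dropped by R2 when it routes them from one spoke to the other. The ping succeeded because it is sent with a much larger TTL.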
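
For reference, a minimal sketch of the fix on R6. The sub-AS numbers (65005, 65006) and the confederation identifier (100) are made up for illustration, since I haven't listed the real ones here; substitute your own. A hop count of 2 should be enough in this topology because R2 is the only router in the path.

! Hypothetical AS numbers: sub-AS 65006 for R6, sub-AS 65005 for R5,
! confederation identifier 100
router bgp 65006
 bgp confederation identifier 100
 bgp confederation peers 65005
 neighbor 150.100.100.5 remote-as 65005
 ! Raise the TTL on packets to this peer from 1 to 2
 neighbor 150.100.100.5 ebgp-multihop 2

R5 needs the matching ebgp-multihop line on its neighbor statement for 150.100.100.6, otherwise its own packets will still die at R2.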

Do you know any other commands that would help you come to this conclusion?
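
A few candidates to start the discussion, listed as commands only since I have not captured output for them:

R6#traceroute 150.100.100.5
R6#show frame-relay map
R6#show ip bgp neighbors 150.100.100.5

I would expect the traceroute to list 150.100.100.2 as a hop before R5 even though both spokes are in the same subnet, and the frame-relay map to show R5's address mapped to the DLCI that leads to the hub. Treat both as expectations rather than verified results.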

1 comment:

  1. Man...am having the exact same problem and was reading through this like crazy and then bang! so what happened after all?

    Cheers,
