Wednesday, August 11, 2010

A Brief Introduction to LISP

At Cisco Live 2009 I was introduced to the Locator/ID Separation Protocol (LISP).  I thought the idea seemed interesting, but I didn’t quite follow the practical purpose for it.  While planning my Cisco Live 2010 schedule I made sure to revisit this topic to get a better understanding of it.  After the breakout session and a few hours of experimenting, I believe I have a good feel for the issue LISP is attempting to solve, and the manner in which it intends to solve it.

By the way, I’m going to skip the requisite joke about the recursive programming language.  I will say that in college I spent a semester and a half programming in Scheme.  The first semester was an Introduction to Computer Science course, and wasn’t too bad.  The half semester was an advanced course called Artificial Intelligence.  It was a struggle for me to wrap my head around all the recursion.  When I reached the halfway point of the course, I asked the instructor about my outlook for a good grade.  She politely suggested I consider dropping the course :)

 

The IP Routing Problem

Before discussing LISP, it is useful to compare/contrast how DNS works versus how Internet IP routing works.  Both DNS and IP routing deal with large databases.  With DNS, the database is truly distributed.  End user DNS servers (for example, your corporate Internet-attached DNS server) are configured with specific names for their authoritative zones, plus enough information to allow them to look up any other information they might need.  Random DNS entries (for example, www.cisco.com) are not pre-loaded into your DNS server.  If you need to resolve this name, the DNS server requests the information from an upstream server.  Caching adds some efficiency, but does not change the overall structure.  This system has allowed the number of DNS zones to scale well into the millions.

The database for IP routing is handled quite differently.  In the Internet’s Default Free Zone (DFZ), all routers must have the entire Internet routing database.  Summarization can help alleviate this requirement to a degree, but summarization also comes at the price of less accurate routing information.  The IPv4 routing database currently holds 325,000 routes, give or take a few thousand.  Ultimately the IPv6 table will be as large, or more likely considerably larger, and the increased size of the address space will result in larger memory requirements.  And remember, this is high-speed router memory, not generic DRAM.  Wouldn’t it be great if we could transition IP routing from its current ‘replicated database’ model to a distributed database, like DNS?  As a matter of fact, that’s the goal of LISP.

 

Following a Connection in a LISP-Enabled World

Let’s go through a simple example in a fully LISP-enabled environment.  A user PC would determine the destination IP address of a web server via DNS and create a standard IP packet, with its own IP as the source and the IP of the web server as the destination.  The packet would be routed through the user’s LAN until it reached an Internet gateway.

 


At this point, the LISP-aware ISP router (in LISP-speak, the Ingress Tunnel Router, or ITR) would perform a lookup for the destination IP address.  An answer would come back with the IP address of the Egress Tunnel Router (ETR) for that destination.  The ITR would then encapsulate the user’s packet in a LISP packet, with a source IP of the ITR’s ISP interface and a destination of the ETR’s ISP interface.  This packet would then be sent through the Internet to the ETR.

The ETR receives the LISP-encapsulated packet, removes the header and routes the native IP packet (with the user’s PC as a source IP and the web server’s IP as a destination) into the local LAN, to the web server.  Return traffic from the web server to the user follows the same procedure in reverse (the web server’s ISP router acts as the ITR and the user’s ISP router is the ETR).

As a side note, the “Ingress” and “Egress” designations for Tunnel Routers are relative to the LISP-encapsulated tunnel.  The router that encapsulates a packet into LISP is the ingress router, and the router that removes the LISP header is the egress router.
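The walk-through above can be sketched in a few lines of Python.  This is only a conceptual model, not how any router implements it: the Packet and LispPacket classes, the MAPPINGS table and all of the addresses are made up for illustration, and the lookup is keyed by a single host address to keep things short.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Packet:
    """A native IP packet as built by the user PC (hypothetical model)."""
    src: str
    dst: str
    payload: bytes


@dataclass(frozen=True)
class LispPacket:
    """The user's packet wrapped in an outer header (hypothetical model)."""
    outer_src: str  # ISP-facing address of the ITR
    outer_dst: str  # ISP-facing address of the ETR
    inner: Packet   # original packet, carried unchanged


# Assumed stand-in for the destination lookup: destination host -> ETR address.
MAPPINGS: Dict[str, str] = {"192.0.2.80": "203.0.113.2"}


def itr_encapsulate(packet: Packet, itr_addr: str) -> Optional[LispPacket]:
    """Look up the ETR for the destination and wrap the packet."""
    etr_addr = MAPPINGS.get(packet.dst)
    if etr_addr is None:
        return None  # no mapping known for this destination
    return LispPacket(outer_src=itr_addr, outer_dst=etr_addr, inner=packet)


def etr_decapsulate(lisp_packet: LispPacket) -> Packet:
    """Strip the outer header and hand back the native packet."""
    return lisp_packet.inner


original = Packet(src="198.51.100.10", dst="192.0.2.80", payload=b"GET /")
tunneled = itr_encapsulate(original, itr_addr="203.0.113.1")
delivered = etr_decapsulate(tunneled)
```

Notice that the Internet core only ever sees outer_src and outer_dst, while the web server receives exactly the packet the user PC sent.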

 

Tell Me Again… Why Is This Better?

On the surface this seems like extra work for little to no benefit.  Let’s dig a bit deeper to see how this helps each component in the path.

User PC – Nothing is different

ITR – This router only needs a default route to the Internet.  It is not clear from this example, but even if there is redundancy, in a fully LISP-enabled environment, only a default route is required.

Internet PE / P routers – This is where the magic happens.  The Internet PE and P routers only need routes for the ISP interfaces of the customer routers.  All packets between customers will be encapsulated, with the source and destination IP addresses coming from the WAN circuits.  Memory requirements are greatly reduced in these devices.

ETR – Only requires a default route to the Internet.

Web Server – Nothing is different
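To put the PE / P router savings in concrete terms, here is a toy comparison.  The sites, prefixes and WAN addresses are invented for the example, and the LISP case assumes the worst case of one core route per customer WAN address (in practice an ISP would likely summarize those as well).

```python
from typing import Dict, List, Tuple

# Hypothetical customer sites: (ISP-facing WAN address, prefixes behind it).
SITES: Dict[str, Tuple[str, List[str]]] = {
    "site-a": ("203.0.113.1", ["192.0.2.0/25", "192.0.2.128/25"]),
    "site-b": ("203.0.113.2", ["198.51.100.0/24"]),
    "site-c": ("203.0.113.3", ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24"]),
}

# Today: the core must carry every customer prefix.
core_routes_today = sorted(
    prefix for _, prefixes in SITES.values() for prefix in prefixes
)

# Fully LISP-enabled: the core only needs to reach each site's WAN address.
core_routes_lisp = sorted(wan for wan, _ in SITES.values())

print(len(core_routes_today), "routes today vs", len(core_routes_lisp), "with LISP")
```

The gap grows with every customer that splits its address space into more specific prefixes, since that only adds entries to the first list.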

 

Drawbacks

So what are the drawbacks to LISP?  I can think of several:

MTU Issues – At its heart, LISP is a tunneling technology.  All tunneling technologies suffer from potential MTU issues.

Complexity – This paradigm is clearly different from what most of us are comfortable with.  But we didn’t enter this field thinking nothing would ever change, right?

Delay – The initial packet towards a new destination will likely get dropped, because the destination lookup takes time.  After the first packet the destination information is cached, so subsequent packets should flow without delay.  According to the Cisco LISP team, testing has shown this isn’t as big an issue as it appears in theory.

A few optimizations have been suggested to deal with this delay issue.  First, a set of common destinations could be pre-programmed into ITRs (subnets associated with Google, for example).  Colin McNamara had a particularly interesting suggestion of ITRs performing DNS reply snooping, as it is highly likely that a DNS lookup will be followed quickly by an initial packet to that destination.  I’m not sure if this is being worked on, but it seems like a great idea.
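The first-packet behavior and the pre-programming idea can be modeled roughly as follows.  The MapCache class, its method names and the addresses are all hypothetical, and the lookup is completed by an explicit call rather than asynchronously, purely to keep the sketch short.

```python
from typing import Dict, List, Optional


class MapCache:
    """Toy model of an ITR's cache of destination -> ETR mappings."""

    def __init__(self, mapping_system: Dict[str, str]) -> None:
        self._mapping_system = mapping_system  # stand-in for the real lookup
        self._cache: Dict[str, str] = {}
        self._pending: List[str] = []

    def forward(self, dst: str) -> Optional[str]:
        """Return the ETR for dst, or None (packet dropped) on a cache miss."""
        etr = self._cache.get(dst)
        if etr is None and dst not in self._pending:
            self._pending.append(dst)  # a miss triggers a lookup
        return etr

    def complete_lookups(self) -> None:
        """Pretend the outstanding lookups have been answered."""
        for dst in self._pending:
            etr = self._mapping_system.get(dst)
            if etr is not None:
                self._cache[dst] = etr
        self._pending.clear()

    def preload(self, dst: str, etr: str) -> None:
        """Pre-program a mapping, e.g. for a popular destination."""
        self._cache[dst] = etr


cache = MapCache({"192.0.2.80": "203.0.113.2"})
first = cache.forward("192.0.2.80")    # miss: first packet is dropped
cache.complete_lookups()
second = cache.forward("192.0.2.80")   # hit: forwarded to the ETR

cache.preload("198.51.100.25", "203.0.113.9")
popular = cache.forward("198.51.100.25")  # hit on the very first packet
```

DNS reply snooping would amount to calling something like preload() as soon as the ITR sees the DNS answer go by, before the PC sends its first packet.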

 

What is Missing?

I glossed over the entire destination IP address lookup portion of LISP.  I will post a follow-up article describing this step.  For now, trust me when I say that it is not much different than DNS.

I also skipped over the other advantages of LISP.  For one, the ETR completely controls how inbound traffic is delivered to it.  If a destination IP address has multiple ISP gateways, those gateways can instruct ITRs to load-balance between destinations.  Experienced network engineers should immediately see the power of this feature.  This could signal the end of our rudimentary BGP-based load-balancing mechanisms (AS prepending, subnet splitting/disaggregation, etc.).
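As a rough illustration of ETR-controlled load-balancing, suppose the lookup answer lists each gateway with a weight and the ITR splits flows accordingly.  The LISP specification attaches a priority and a weight to each locator; this sketch models only the weight, and the pick_locator function, the flow identifiers and the addresses are my own inventions.

```python
import hashlib
from typing import List, Tuple


def pick_locator(locators: List[Tuple[str, int]], flow_id: str) -> str:
    """Pick one ETR address for a flow, in proportion to the weights.

    Hashing the flow identifier keeps every packet of a flow on the
    same gateway, which avoids reordering.
    """
    total = sum(weight for _, weight in locators)
    if total <= 0:
        raise ValueError("at least one locator needs a positive weight")
    digest = hashlib.sha256(flow_id.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") % total
    for addr, weight in locators:
        if point < weight:
            return addr
        point -= weight
    raise AssertionError("unreachable: point is always below total")


# The destination site asks for a 75/25 split across its two gateways.
locators = [("203.0.113.2", 75), ("203.0.113.6", 25)]
flows = [f"198.51.100.10:{port}->192.0.2.80:80" for port in range(2000)]
choices = [pick_locator(locators, flow) for flow in flows]
share_first = choices.count("203.0.113.2") / len(choices)
```

Changing the split is just a matter of the destination site publishing new weights; no BGP tricks are involved.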

A second benefit that LISP provides is its ability to send non-native traffic over a routed backbone.  In the example above I did not specify any of the IP addressing involved.  It is possible for the user PC and the web server to use IPv6, while the ISP network uses IPv4.  The ITR would receive an IPv6 packet from the user and perform a lookup which resolves to the IPv4 ISP address of the ETR.  The ITR would then encapsulate the IPv6 packet into an IPv4 LISP packet, which would then be sent over the ISP infrastructure.  When the ETR receives the LISP-encapsulated packet, it strips the header off and routes the IPv6 packet towards the web server.  This can also happen in reverse, where a pair of IPv4 speakers communicate over an IPv6 backbone.
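Since the outer header follows the backbone’s address family rather than the user’s, the encapsulation overhead (and therefore the MTU concern from the Drawbacks section) depends only on the ISP network.  The arithmetic below assumes the usual LISP data-plane layout of an outer IP header with no options or extension headers, a UDP header and an 8-byte LISP header, plus a 1500-byte link MTU; the function names are just for this example.

```python
# Header sizes in bytes (outer IP header without options/extension headers).
OUTER_IP_HEADER = {"ipv4": 20, "ipv6": 40}
UDP_HEADER = 8
LISP_HEADER = 8


def lisp_overhead(backbone_family: str) -> int:
    """Bytes added to every packet by LISP encapsulation."""
    return OUTER_IP_HEADER[backbone_family] + UDP_HEADER + LISP_HEADER


def max_inner_packet(link_mtu: int, backbone_family: str) -> int:
    """Largest user packet that fits without fragmentation."""
    return link_mtu - lisp_overhead(backbone_family)


for family in ("ipv4", "ipv6"):
    print(family, lisp_overhead(family), max_inner_packet(1500, family))
```

The inner packet’s address family never enters the calculation, which is exactly why IPv6 hosts can ride over an IPv4 backbone and vice versa.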

Lastly, I completely bypassed the technologies for transitioning to a fully LISP-enabled Internet.  They do exist, and it is very possible to deploy LISP incrementally.  We are long past the point of Flag Days.

 

What’s Next?

The LISP presentations I’ve seen have spent a lot of time describing the benefits of LISP for IPv4.  I’m not sure that is the correct place to focus.  As stated above, we are at about 325,000 IPv4 Internet routes.  At most I foresee a doubling in size of the routing table, to somewhere around 600,000 routes.  This factors in the usage of the 10% or so of remaining IPv4 space, as well as increased use of subnetting for load-balancing purposes.  I see the LISP protocol filling two needs.  First, it can be a great transitioning technology to IPv6, as well as a way to keep IPv4 alive over an IPv6-only infrastructure.  In an upcoming LISP blog post I will demonstrate how I am using LISP to reach the IPv6 Internet over an IPv4-only ISP.

Even more importantly, we are at the very beginning of the IPv6 route table explosion.  If the LISP team can get significant traction with LISP6, we can avoid the routing table bloat issue we’ve run into with IPv4.  Remember, the IPv4 routing table isn’t going anywhere, so we will only be adding to our problems with the deployment of IPv6.

If you are keeping track, I’ve promised two upcoming blog posts on LISP (my implementation experience and how the mapping database system works).  In the meantime, there are two useful resources online – lisp4.cisco.com and www.lisp4.net.  For a more scholarly take on LISP, see Petr Lapukhov’s article at iNE.com.

2 comments:

Sam Crooks said...

sounds like... MPLS in LISP clothes

Jeremy Filliben said...

Sam C,

I agree, it is very MPLS L3VPN-like. I thought about using that analogy, but I couldn't make it work as well as the DNS analogy. From a conceptual POV, the main difference is that MPLS L3VPN requires all PEs to have the full routing table, so there is no lookup mechanism.

Once the lookup is complete, I agree, it is very similar to MPLS.