Lately there has been a lot of talk around the network, with a corresponding conflation of terms and hyperbole around “Network Virtualization”, including Hypervisor, Software Defined Networking, Network Abstraction Layer, SDN, OpenFlow, etc.
Recently a blog entry entitled “Networking Needs a VMWare (Part 1: Address Virtualization)” appeared on Martin Casado’s blog which tries to make a case for comparing the memory virtualization capability in today’s modern hypervisors to network virtualization.
The post left me with an uneasy feeling, because it does not fully describe why we are seeing this activity in the network domain, specifically activity meant to deal with the broken address architecture. This post tries to bring some clarity to that and to dig deeper into the root causes or problems in networking which have led us to this point.
The synopsis in the blog reads:
One of the key strengths of a hypervisor lies in its insertion of a completely new address space below the guest OS’s view of what it believes to be the physical address space. And while there are several possible ways to interpose on network address space to achieve some form of virtualization, encapsulation provides the closest analog to the hierarchical memory virtualization used in compute. It does so by taking advantage of the hierarchy inherent in the physical topology, and allowing both the virtual and physical address spaces to support complete forwarding and addressing models. However, like memory virtualization’s page table, encapsulation requires maintenance of the address mappings (rules to tunnel mappings). The interface for doing so should be open, and a good candidate for that interface is OpenFlow.
The author of the blog post is trying to describe a well-known aphorism by David Wheeler, which states: “All problems in computer science can be solved by another level of indirection”. This statement is at the heart of “virtualization” as well as other references in communications layering, computer architecture and programming models.
Sidebar OSI Model
Lots of networking professionals like to refer to the 7-layer OSI model when talking about network abstractions. The problem is that the OSI model was never adopted; in addition, most OSI engineers agree that the top three layers of the OSI model (Application, Presentation and Session) belong in “one” application layer. We utilize a derivative of that model, which is essentially the four layers represented in the TCP/IP model.
Let's first try to define what an address is and then what is meant by encapsulation, being careful not to conflate these two important yet independent terms.
Addressing and Naming
The first thing to recognize is that the Internet is composed of two name spaces: what we call the Domain Name System and the Internet Address Space. In the context of addressing, these turn out to be just synonyms for each other with different scopes. Generally we can describe an address space as consisting of a name space with a set of identifiers within a given scope.
An address space in a modern computer system is location-dependent but hardware-independent thanks to the virtual memory manager and “memory virtualization”. The objective, of course, is to present a logical address space which can be larger than the physical memory space, giving each process the illusion that it owns the entire physical address space. This is a very important indirection mechanism: if we didn’t have it, applications would have to share a much smaller pool of available memory. Does anyone remember DOS?
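To make the indirection concrete, here is a minimal sketch of a per-process page table. The names (`AddressSpace`, `map_page`, `translate`) and the 4 KiB page size are assumptions chosen for illustration; a real memory manager uses multi-level tables and hardware support.

```python
PAGE_SIZE = 4096  # bytes per page (assumed for this sketch)


class AddressSpace:
    """Toy per-process page table: virtual page number -> physical frame number."""

    def __init__(self):
        self.page_table = {}

    def map_page(self, vpn, pfn):
        # Record the binding between a virtual page and a physical frame.
        self.page_table[vpn] = pfn

    def translate(self, vaddr):
        # Split the virtual address into a page number and an offset,
        # then swap the page number for the frame number.
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            raise LookupError(f"page fault at virtual address {vaddr:#x}")
        return self.page_table[vpn] * PAGE_SIZE + offset


# Two processes use the same virtual address but land in different frames.
proc_a = AddressSpace()
proc_b = AddressSpace()
proc_a.map_page(0, 7)
proc_b.map_page(0, 3)
addr_a = proc_a.translate(0x10)
addr_b = proc_b.translate(0x10)
```

Each process sees address `0x10` as its own, and only the table knows where the data physically lives. That binding table is the level of indirection being discussed here.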
“Another problem with TCP/IP is that the real name of an application is not the text form that humans type; it’s an IP address and its well-known port number. As if an application name were a macro for a jump point through a well-known low memory address. – Professor John Day”
Binding a service that needs to be relocatable to a location-dependent address is why we have such problems with mobility today (in fact we may even conclude that we are missing a layer). Given the size and failure rates of today’s modern data centers, this problem also impacts the reliability of the services and applications that consumers of today’s web-scale companies depend on.
So while this is a very important part of OS design, it’s completely different from how the Internet works, because the address system we use today has no such indirection without breaking the architecture (i.e. NATs, load balancers, etc.).
If this is true, is the IP address system currently used on the Internet “location-dependent”? Well, actually, IP addresses were distributed as “location-independent” names, not addresses. There are current attempts to correct this, such as LISP and HIP, as well as “BeyondIP” solutions such as RINA.
So it turns out that the root of the problem in relation to addressing is that we don’t have the right level of indirection: according to Saltzer and Day, we need a “location-independent” name to identify the application or service, but all we have is a location-dependent address, which is just a symbolic name!
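As a rough illustration of the missing indirection, consider a directory that binds a location-independent service name to whatever location-dependent address currently hosts it. Everything in this sketch (the `NameDirectory` class, the service name, the addresses) is hypothetical and does not describe how LISP, HIP or RINA actually work.

```python
class NameDirectory:
    """Toy binding service: location-independent name -> current address."""

    def __init__(self):
        self._bindings = {}

    def bind(self, service_name, address):
        # Create or update the binding; the name itself never changes.
        self._bindings[service_name] = address

    def resolve(self, service_name):
        # Clients ask for the name and receive the current location.
        return self._bindings[service_name]


directory = NameDirectory()
directory.bind("billing", ("10.0.1.5", 8080))
before_move = directory.resolve("billing")

# The service relocates; only the binding is updated.
directory.bind("billing", ("10.0.9.20", 8080))
after_move = directory.resolve("billing")
```

Clients that hold the name keep working after the move. Clients that cached the address do not, which is the mobility problem described above.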
What is encapsulation?
Object Oriented Programming refers to encapsulation as a pattern by which [“the object’s data is contained and hidden in the object and access to it restricted to members of that class”]. In networking we use encapsulation to define the different layers of the protocol stack, each of which, as we know, “hides” its data from members not in the layer. In this way the protocol model forms the “hour-glass” shape, minimizing the interface and encapsulating the implementation.
Sidebar Leaky Abstractions
Of course this isn’t completely true, as the current protocol model of TCP/IP is subject to a “leaky abstraction”. For instance, there is no reason for the TCP logic to dive into the IP packet to read the TOS field; doing so would be a “layer violation”. Yet we know that TCP reaches into IP to compute the pseudo-header checksum. This rule can be dismissed if we think of TCP/IP as actually one layer, as it was before 1978. But the reality of the broken address architecture leads to the “middle boxes”, which must violate the layers in order to rewrite the appropriate structures and stitch the connection back together.
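The pseudo-header dependency can be sketched in a few lines. The function names below are my own, and the code assumes IPv4 and a TCP segment whose checksum field is zeroed before computing. The arithmetic is the standard 16-bit ones' complement Internet checksum.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, tcp_segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus the TCP segment."""
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(tcp_segment))
    )
    return internet_checksum(pseudo_header + tcp_segment)
```

Because the IP addresses feed into the TCP checksum, a NAT that rewrites an address must also patch the transport header, which is exactly the kind of layer violation the middle boxes are forced into.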
So how does encapsulation help?
In networking we use encapsulations all the time.
We essentially encapsulate the data structures which need to be isolated (the invariants) with some other tag, header, etc. in order to hide the implementation. So in 802.1Q we use the C-TAG to denote a broadcast domain or VLAN, and in VXLAN we encapsulate the host's traffic within a completely new IP shell in order to “bridge” it across without leaking the protocol primitives that the host stack needs to process within a hypervisor’s stack.
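As a small sketch of what that wrapping looks like on the wire, the code below prepends and strips only the 8-byte VXLAN header (flags plus a 24-bit VNI, as laid out in RFC 7348). The outer Ethernet, IP and UDP headers are deliberately left out, and the function names are assumptions of this sketch.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend a VXLAN header to an inner Ethernet frame."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(payload: bytes):
    """Return (vni, inner_frame) from a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, payload[8:]
```

The inner frame passes through untouched. The outer network only ever looks at the shell, while the VNI plays a role similar to the VLAN ID in a C-TAG.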
From the blog: “encapsulation provides the closest analog to the hierarchical memory virtualization in compute”
So in the context of a “hierarchy”, yes, we encapsulate to hide, but not for the same reasons we have memory hierarchies (i.e. SRAM (cache) and DRAM). This generalization is where the blog post goes south.
So really what is the root of the problem and how is SDN an approach to solve it?
As stated earlier, we need a “location-independent” name to identify the application or service, but all we have is a location-dependent address, which is just a symbolic name! If we go back to Saltzer we see that’s only part of the problem, as we need a few more addresses/names and the binding services to go with them.
One interesting example of this is the implementation of Serval from Mike Freedman at Princeton University. Serval actually breaks the binding between the application/service name and the inter-networking address (although there are deeper problems than this, since we seem to be missing a network layer somewhere). Serval accomplishes this through the manipulation of forwarding tables via OpenFlow, although it can be adapted to use any programmable interface if one exists. Another example is the NDN Project led by Van Jacobson.
Yes, it is unfair to conflate “Network Virtualization” with “OS Virtualization”, as they deal with different levels of abstraction, state and purpose. Just as hypervisors were invented to “simulate” a hardware platform, there is a need to “simulate” or abstract the network in order to build higher-level services and simplify the interface (not necessarily the implementation). In fact, a case can be made that “OS Virtualization” may eventually diminish in importance as we find better mechanisms for dealing with isolation and protection of the host stack, while network virtualization will extend beyond the existing solutions and even existing protocols, allowing us to take on a new set of challenges. This is what makes SDN so important: not the implementation but the interface. Once we have this interface, which is protocol-independent, we can start to look at fixing the really hard problems in networking at large scale.