Exploring the Justification for a “New IP”

Well, we have another new-IP story emerging, one that includes both some of our old familiar themes and some new-ish ones.  The questions are first, whether any of the suggestions make sense, and second, whether they could be implemented if they did.  I think there may even be a better way to achieve the same goals.

A Light Reading story lays out the issue with perhaps a tolerable touch of assumption.  Future 5G applications like “remote surgery, haptic suits, holographic calls and self-driving robots” might be impossible in an IP world because of the inherent latency of packet handling in IP networks.  A new IP, or something decidedly non-IP, could reduce that latency and get those robots moving (perhaps to do surgery in haptic suits).

There is no question that there are applications whose latency requirements would be difficult or impossible to meet over the Internet.  Answering the two questions posed at the opening of this blog means weighing the credibility of those applications, the technical credibility of proposed solutions, and the likelihood that the solutions would be adopted by operators, which is a question of return on investment.

I don’t think many people would rule out any of the possible low-latency applications that the article lays out.  That’s not the same as saying there’s pent-up demand for them, or that their value would justify a major fork-lift to infrastructure.  To a degree, I think the proponents of new-IP models are guilty of circular justification.  We need 5G.  5G needs a differentiator.  Low latency is a differentiator.  Therefore, we need to rebuild the Internet to offer low latency so we can justify 5G.  The obvious problem with that sequence is that creating the “driver” for 5G to justify investing in it demands a much larger driver, to justify a much broader infrastructure change.

There is no clear justification for broad changes to IP to lower latency.  The Internet today demonstrates that IP is fine for what we’re doing with it.  Sure, it could be better (we’ll get to that below), but to say that it’s not suitable because it won’t support self-driving cars or factory automation begs the question of how we got both of them with the old IP.

Factory automation doesn’t require us to haul sensor data a thousand miles or more.  What factories run with sensors in one city and controllers in another?  That would be a silly approach.  Industrial IoT may not always be line-of-sight, but it’s darn sure unlikely to be intercontinental.  And self-driving cars?  Why would you put vehicle automation anywhere but in the vehicle?  Sure, the vehicle needs information on routes and traffic conditions, but those aren’t the things with a latency problem.  Pedestrian steps off sidewalk without looking, or vehicle runs the light.  Onboard sensors are how those things are detected.

I’m not saying that low-latency is stupid and unnecessary, only that it’s not broadly justified if it costs much to achieve it.  The article says that latencies of more than one or two milliseconds could disrupt remote surgery or self-drive.  We’ve already seen the latter doesn’t really need low latency in the network.  How about remote surgery?  It is possible that we might want a specialist a thousand miles away to perform an operation, isn’t it?

Yes, but.  The “but” is that just having one surgeon doing that doesn’t mean we have to rebuild the Internet.  Do we think that all surgery would be remote?  Remember, “remote” would mean somewhere far enough to need the Internet to make the connection.  It would require a lot of surgery, and a lot of surgeons, to justify a major Internet-wide shift.  Remember, we’re not going to spend a trillion dollars for something that might happen, eventually, on a small scale.  There’d be a better way (again, see below).

OK, let’s move past the fact that none of the missions cited as a justification for remaking the Internet and IP have real broad-based demand at the scale needed.  What about the simple question of whether there would be a better way of doing IP?  The article quotes John Grant of the European Telecommunications Standards Institute (ETSI).  “All these extra things you want to do to make IP usable in a mobile system have involved putting extra headers on the packet, and that means you have to do more processing and send more bits over the interface.”

Wait!  Now we’re doing mobile remote surgery?  But let’s put this aside too.  Extra headers, more packet processing per header, more bits on the interface.  Well, yes, that’s true.  The average Internet packet is about 570 bytes, of which about 22 bytes is the IP header.  Control processes like IoT, including likely remote surgery, would require smaller packets because incremental changes in things have to be reported quickly.  We could say that an IoT packet might run only about 40 bytes, over half of which would be header.
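To put rough numbers on the header-overhead point, here is a minimal sketch using only the packet and header sizes quoted above.  The function name is just for illustration, and the sizes are the post's approximations, not measurements.

```python
def header_share(packet_bytes, header_bytes):
    """Fraction of a packet's bytes taken up by the header."""
    return header_bytes / packet_bytes

IP_HEADER_BYTES = 22      # approximate header size quoted in the post
AVG_PACKET_BYTES = 570    # approximate average Internet packet
IOT_PACKET_BYTES = 40     # the post's guess at a small control/IoT packet

avg_share = header_share(AVG_PACKET_BYTES, IP_HEADER_BYTES)
iot_share = header_share(IOT_PACKET_BYTES, IP_HEADER_BYTES)

print(f"average packet: {avg_share:.1%} header")
print(f"IoT packet:     {iot_share:.1%} header")
```

On these figures the header is under 4% of an average packet but over half of a small control packet, which is why the overhead complaint only really bites for tiny, frequent messages.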

Let’s do some math.  The speed of bits in copper is about 100,000 miles per second, and in fiber optics it’s roughly 120,000 miles per second.  Taking the fiber figure, we’d have a propagation delay of 8.3 x 10^-6 seconds per mile, or 8.3 x 10^-3 milliseconds per mile.  So, if our surgeon is more than 240 miles from the patient, even the 2-millisecond upper end of the “1 or 2 milliseconds” cited as the tolerable delay for remote surgery is already shot.

You can’t speed up light, and it’s hard to see how remote surgery limited to a maximum of 240 miles is a game-changer.  Also note that this is one-way delay; whatever the surgeon does in response has to travel back along the same path as the event that caused it.  A full event/response round trip fits within a millisecond of propagation delay only out to about 60 miles, so I think the robotic surgery story is too limited an application, given physical limits of data transmission.
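The distance limits above fall straight out of the fiber speed.  A small sketch, assuming the roughly 120,000 miles/second figure used in this post and ignoring every source of delay other than propagation (the function names are illustrative):

```python
FIBER_MILES_PER_SEC = 120_000  # approximate figure used in the post

def one_way_delay_ms(miles):
    """Propagation delay over a fiber path of the given length."""
    return miles / FIBER_MILES_PER_SEC * 1000

def max_reach_miles(budget_ms, round_trip=False):
    """Farthest endpoint separation that fits a propagation-only budget.

    With round_trip=True the budget must cover the event going out
    and the response coming back, so only half is available each way.
    """
    one_way_budget_ms = budget_ms / 2 if round_trip else budget_ms
    return one_way_budget_ms / 1000 * FIBER_MILES_PER_SEC

print(max_reach_miles(2))                    # one-way, 2 ms budget
print(max_reach_miles(1, round_trip=True))   # round trip, 1 ms budget
print(one_way_delay_ms(1000))                # a thousand-mile specialist
```

The first two calls reproduce the 240-mile and 60-mile limits, and the last shows that a specialist a thousand miles away is more than 8 milliseconds out before any equipment touches the packet.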

Handling, meaning serialization delay of the packets at the data rate plus the handling delay of the devices, queuing, etc., is also a factor in total delay.  Where it might add up is in the network, where a packet has to be read, switched, and written multiple times.  Industry savvy says that a decent core router with a typical packet would introduce about 52 microseconds per hop, absent congestion.  The delay on ingress to the Internet and on egress to the destination would be longer because the speeds would be lower.  The point is that absent congestion and propagation delay between nodes, you could do about 20 hops in a millisecond.
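To illustrate why slower access links add more delay than core hops, here is a sketch of the serialization component alone: the time to clock a packet's bits onto a link.  The two link speeds are assumptions picked purely for illustration; the packet size and the per-hop figure are the ones quoted in this post.

```python
def serialization_delay_us(packet_bytes, link_bps):
    """Microseconds needed to clock one packet onto a link."""
    return packet_bytes * 8 / link_bps * 1_000_000

PACKET_BYTES = 570         # the post's average packet size
ACCESS_LINK_BPS = 100e6    # assumed 100 Mbps access link (illustrative)
CORE_LINK_BPS = 10e9       # assumed 10 Gbps core link (illustrative)
PER_HOP_US = 52            # the post's per-hop core router figure

access_us = serialization_delay_us(PACKET_BYTES, ACCESS_LINK_BPS)
core_us = serialization_delay_us(PACKET_BYTES, CORE_LINK_BPS)
hops_per_ms = 1000 / PER_HOP_US

print(f"access link: {access_us:.1f} us, core link: {core_us:.3f} us")
print(f"hops per millisecond at {PER_HOP_US} us each: {hops_per_ms:.1f}")
```

Under these assumed speeds, serialization on the access link costs about a hundred times what it does in the core, and the per-hop figure works out to a little over 19 hops per millisecond.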

Would changing the headers or forwarding strategy improve this?  I’ve not seen much to suggest that a big improvement is possible.  Most big routers have chips for table lookup, and while you might reduce the time required to switch a packet a bit by making the tables smaller, it’s not clear how easy that would be without rethinking what’s in them, which would mean restructuring how destinations are addressed.  Even if you did that, what would the benefit be, given the other factors in switching delay?

Which are related to queuing and congestion.  Some of the measures people are proposing for reducing latency are designed to manage bandwidth better.  Have we learned nothing in the last decade?  Capacity these days is almost always cheaper than managing capacity, particularly when you consider that the latter approach has tended to increase complexity and opex.

My view on this is the same as it’s always been.  The network of the future doesn’t need a new IP.  It might benefit from things like segment routing, but the biggest gains would come simply by creating an incredibly rich optical network that provided direct optical paths between all the major areas.  If we presumed a hop from source to metro on-ramp, to destination off-ramp, to destination, we would have 4 transit hops.  If everything was oversupplied with capacity, we’d have little chance for congestion, and that would mean 8 round-trip hops for about 0.4 milliseconds, plus round-trip fiber transit delays of, say for 500 miles each way, 8.3 milliseconds.

This pair of numbers tells the whole story, I think.  The great majority of latency, even in an optimum new-IP structure, comes from the source we can’t change—the speed of light in fiber.  We’re proposing to tweak something that accounts for (in my example) 4.8% of the latency.  And to do that, we invent a new IP?  I think that we could address legitimate low-latency requirements by augmenting capacity, reducing hops with more direct optical pathways, and even separating traffic that does require low latency from “normal” Internet traffic.  No new IP required.
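The split between handling and propagation can be checked with a short sketch.  It assumes the figures used in this post (52 microseconds per hop, fiber at roughly 120,000 miles/second, 4 hops and 500 miles each way) and no congestion at all; the function name is illustrative.

```python
FIBER_MILES_PER_SEC = 120_000  # approximate fiber speed from the post
PER_HOP_US = 52                # per-hop handling figure from the post

def round_trip_budget_ms(one_way_miles, one_way_hops):
    """Return (handling_ms, propagation_ms) for an event/response round trip."""
    handling_ms = 2 * one_way_hops * PER_HOP_US / 1000
    propagation_ms = 2 * one_way_miles / FIBER_MILES_PER_SEC * 1000
    return handling_ms, propagation_ms

handling_ms, propagation_ms = round_trip_budget_ms(one_way_miles=500, one_way_hops=4)
handling_share = handling_ms / (handling_ms + propagation_ms)

print(f"handling:    {handling_ms:.2f} ms")
print(f"propagation: {propagation_ms:.2f} ms")
print(f"handling share of total: {handling_share:.1%}")
```

With these inputs handling is a bit over 0.4 milliseconds against about 8.3 milliseconds of propagation, so it stays under 5% of the total.  Even a protocol that made per-hop handling free would leave the other 8.3 milliseconds untouched.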

This doesn’t address the other issue mixed into the narrative, mobile networks and 5G.  It is true that mobility poses major issues, because users move between cells and so their own IP address isn’t sufficient to route them correctly.  It’s also true that the solution to this problem, which is tunneling user traffic to the right cell based on mobile registration, adds header overhead and processing.  But….

…is that necessarily a bad thing?  It is if you want to assume that low-latency, edge computing, and the like are all justifications in themselves, not things that need to be justified.  We’re not there yet.

…is there an approach on the table to fix it, without changing IP overall?  Well, back in the heyday of SDN, I proposed that an SDN “route”, established as it was based on centrally imposed rules, could include a topology I called the “whip”.  It was held at one end and swirled around at the other.  If we established an SDN route for a given mobile device, from the packet gateway as the “held” end of the whip, we could centrally move the tip around from cell to cell.  I submit that this would work, that it would simplify mobility (the mobility management system only has to tell the central SDN controller when something moves), and that it would radically simplify the EPC or 5G core handling.
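To make the “whip” idea a little more concrete, here is a toy sketch of the bookkeeping a central controller might do.  Everything in it is hypothetical: the class and method names, the gateway and cell labels, and the idea that a handover is a single call are my illustration, not any real SDN controller’s API.

```python
class WhipController:
    """Toy central controller: one route per device, anchored at the gateway."""

    def __init__(self, gateway):
        self.gateway = gateway   # the "held" end of every whip
        self.tips = {}           # device id -> current cell (the moving tip)

    def attach(self, device_id, cell):
        """Create the route when a device first registers in a cell."""
        self.tips[device_id] = cell

    def handover(self, device_id, new_cell):
        """Called by mobility management when the device changes cells.

        Only the tip moves; the gateway end of the route is untouched.
        """
        if device_id not in self.tips:
            raise KeyError(f"no route for {device_id}")
        self.tips[device_id] = new_cell

    def route_for(self, device_id):
        """Return (held end, tip) for the device's current route."""
        return (self.gateway, self.tips[device_id])


controller = WhipController(gateway="pgw-1")   # hypothetical gateway name
controller.attach("phone-a", "cell-1")
controller.handover("phone-a", "cell-2")
print(controller.route_for("phone-a"))
```

The point of the sketch is only that a move becomes a single update to centrally held state rather than a change to the packets themselves; how a real controller would push that change into the forwarding devices is left out.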

Might a wiser, identity-based rather than location-based, system of routing have done better for us had it been adopted before Internet usage exploded?  Sure, the usual 20-20 hindsight applies.  I do not think that major revisions to the Internet are justified just to solve the mobility problem, which is fairly local in scope.  Until we can teleport people around, we’re not going to rearrange the relationship between cell sites and smartphones across continents in a flash.  It’s worthwhile to consider the “whip” model of forwarding path via SDN, or a similar approach, to modernize mobility support.  No need for a new IP here either.

It’s going to be very difficult to create a “new IP” at this point.  One obvious reason is that the Internet’s clients and services dominate our society, and the total investment in client-side technology alone would drive a deep stake into the ground at the site of “current IP”.  The second reason is that the “needs” being cited to justify pulling out that stake are themselves in need of justifications.  We might have done things differently had we known where all this was heading.  We might even have done it better, but we didn’t know, didn’t do, and so now the incremental rather than revolutionary approach is the right answer.  We can tweak IP here and there, but it will remain what it was, and is, for a long time to come.