Latency in 5G, Legacy in 4G
Don Brown & Stephen Wilkus
In developing wireless 5G standards, we have an opportunity to further reduce latency, the time delays, in future wireless networks. In fact, there appears to be unanimous opinion that 5G standards should have less than 1 millisecond (msec) of latency. But why?
In considering results from neurology and studies of interactive games, and in considering the current state of network latency, we do not see compelling business requirements for lower latencies, except insofar as such improvements can also improve throughput and connection setup times. Support for high speed trains may also benefit from lower latencies.
Before discussing the motivation behind a latency requirement of ≤1 msec, let’s be clear on what we mean by latency. The various proposals for 5G are typically specific about the numerical goals for the standard but rarely specific about what the numbers really mean. Some talk of latency as End to End delay, or round trip times, transmit time interval (TTI), ping times, Radio Link Layer TX to ACK times, call setup time, etc.; but nearly all say “it” should be no more than 1 msec. To be specific:
Transmit Time Interval (TTI): The minimum length of time of a UE specific transmission.
In the case of LTE, one subframe is 1 msec long and consists of 2 time slots. This is the smallest scheduled time interval that can be allocated to a UE. Before one can start transmitting a burst of encoded and error protected data, one must have the complete transport block, which means that there is at least this much delay between getting the data from a microphone, camera or other sensor and transmitting it. One can say that LTE has a 1 msec TTI.
Large IP packets may need to be segmented into multiple TTIs depending upon the coding and modulation schemes chosen to adapt to the channel quality. This segmentation can lead to a single IP packet being scheduled onto several subframes.
HARQ processing time: There is a reasonable chance that a received transmission will be in error, typically assumed to be about 10%. When this happens, a Hybrid Automatic Repeat reQuest (HARQ) is exchanged between the eNodeB and the User Equipment (UE). The latency of a wireless system needs to account for the processing time to decode and error check a transport block, send a retransmission request and expect one or more retransmissions. These retransmissions are one important source of jitter in the timing.
In the case of LTE, the HARQ processing time delay is 4 subframes (4 msec), so a retransmission requires 8 msec, with a chance of several more such requests depending upon interference levels, signal strength and congestion. This is shown in the following figure. With TTI bundling of the sort used in VoLTE there is a 12 msec delay.
For TDD-LTE, the HARQ delay is 9 to 10 msec, and 13 to 16 msec for TTI bundling of the sort used with VoLTE.
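The effect of retransmissions on delay and jitter can be sketched numerically. The following is a minimal illustration, not a model of a real scheduler: it assumes every attempt fails independently with the same 10% probability (real HARQ combining improves the odds on each retry), that the first transmission occupies one 1 msec subframe, and that each retransmission adds a fixed HARQ round trip. The function name and its parameters are hypothetical; harq_rtt_ms defaults to 8 msec, the usual FDD LTE figure, and can be changed.

```python
def harq_delay_stats(bler=0.10, first_tx_ms=1.0, harq_rtt_ms=8.0, max_tx=4):
    """Return (expected_delay_ms, outcomes) for one transport block.

    outcomes is a list of (delay_ms, probability) pairs, one per attempt.
    Blocks that still fail after max_tx attempts are ignored and the
    remaining probabilities are renormalised (an assumption of this sketch).
    """
    outcomes = []
    for attempt in range(max_tx):
        # Probability that the block first succeeds on this attempt.
        p_success_here = (bler ** attempt) * (1.0 - bler)
        # Each earlier failure costs one full HARQ round trip.
        delay_ms = first_tx_ms + attempt * harq_rtt_ms
        outcomes.append((delay_ms, p_success_here))

    total = sum(p for _, p in outcomes)
    outcomes = [(d, p / total) for d, p in outcomes]
    expected_ms = sum(d * p for d, p in outcomes)
    return expected_ms, outcomes


if __name__ == "__main__":
    expected, outcomes = harq_delay_stats()
    print(f"expected delay: {expected:.2f} msec")
    for delay, prob in outcomes:
        print(f"  {delay:5.1f} msec with probability {prob:.4f}")
```

Under these assumptions most blocks arrive after a single subframe, but roughly one in ten is delayed by a full HARQ round trip, which is the jitter referred to above.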
Frame size: The minimum time period between system transmissions from a radio that includes feedback from the other end of the link.
As illustrated in the previous figure, in LTE, the frame is 10 msec long and is the periodicity of the Physical Broadcast Channel (PBCH) used for synchronization with the Master Information Block (MIB). Note that ideally, when datagrams are small, and channel quality is good, UE to eNodeB to UE times can be as little as 5 msec, which is less than the frame sizes. This is commonly misunderstood in discussions of latency; an acknowledged transmission can be faster than the frame interval.
The Round Trip Time (RTT) typically refers to the “ping time” to send a short IP packet from the UE to a server in the Internet and receive a reply back. Because Ping time is easily measured from any smart phone, tablet or laptop, the press typically reports these ping times as latencies. These numbers are dominated by the network delays between the base station and the servers or other end points illustrated on the far right of the previous figure. The internet may introduce seconds of delays when connections go through satellite links or intercontinental routes.
Discontinuous Reception (DRX): Receiving the Physical Downlink Control Channel every 1 msec to listen for pages from the network would waste battery capacity. Rather than drain the battery so quickly, UEs use DRX, in which they skip many frames and wake up only every 32 frames (or so) to check for relevant downlink signals. This is not relevant when the UE is in actively connected mode (Cell_DCH), but it creates a long latency of many tens of msec for unscheduled messaging.
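As a rough illustration of the DRX penalty, the sketch below converts a wake-up cycle expressed in frames into a mean and worst-case wait. It assumes the 10 msec LTE frame length and that an unscheduled downlink message is equally likely to arrive at any point in the cycle; the function name is hypothetical.

```python
FRAME_MS = 10.0  # LTE radio frame length in msec


def drx_wait_ms(cycle_frames=32, frame_ms=FRAME_MS):
    """Return the DRX cycle length and the wait a downlink message sees.

    Assumes message arrivals are uniformly distributed over the cycle,
    so the mean wait is half a cycle and the worst case is a full cycle.
    """
    cycle_ms = cycle_frames * frame_ms
    return {
        "cycle_ms": cycle_ms,
        "mean_wait_ms": cycle_ms / 2.0,
        "worst_wait_ms": cycle_ms,
    }


if __name__ == "__main__":
    print(drx_wait_ms(32))
```

For a 32 frame cycle this gives a 320 msec cycle and a mean wait of 160 msec, far above any 1 msec target.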
These various measures of latency and communications delays have regularly improved over time, as suggested in the comparative plot below. It shows minimum round trip ping times to the OOKLA “speedtest” server of 32 and 44 msec for LTE 4G (on AT&T and Verizon service, respectively), compared with 88 msec for UMTS HSPA 3G service on an iPhone 4 (AT&T). (The iPhone 4S measurements were all made at the same location and night, while the others were measured in much more varied conditions.)
The 32 msec minimum LTE ping time may appear at odds with the theoretical minimum of 5 msec round trip time discussed above, but the 5 msec figure was only for a UE transmission to be acknowledged from the eNodeB, while the 32 msec measured ping time was to a server located in the internet over 40 km away and with several intermediate nodes along the way. OpenSignal has reported LTE latency of 98 msec averaged over several operators.
There are several reasons to try to reduce the TTI, frame, HARQ and setup times in making 5G. For example, reducing the TTI time slot interval directly reduces the feedback time, enabling smaller buffers and more efficient and timely feedback. But we should be clear that end to end times are determined primarily by network considerations, and that further improvements in the air interface will not help end to end delays improve substantially.
As an example, the very fastest fiber optic link between the Chicago and New York stock exchanges has been optimized with extravagant deployments of particularly straight paths to get to 13 msec round trip times. It turns out that the high-frequency traders on Wall Street want the fastest possible link from their computers to the trading computers on Wall Street.
One company, Spread Networks® offers a dedicated network connection from Chicago to NJ/NYC for this specific purpose.
Chicago to NYC is about 1140 km in a straight line. Light travels through fiber at about 200 km per 1 ms, so light takes about 6.5 ms just to travel from Chicago to NYC, one way (in a straight line), or about 13 ms round trip. So, given Spread Networks® report of taking about 14.5 ms, this means that there is an additional 1.5 ms for the signal to go through the regenerators, computers, routers and other switching equipment, round trip. (Purpose-built microwave links between Chicago and New York City claim to have reduced the time to ~8.6 ms round trip, thanks to the fact that air has a lower refractive index than glass. The speed of light limit is 7.6 msec, so they have done an excellent job of reducing regeneration and error correction delays.)
From this extravagant system, we are led to conclude that 82 miles or 132 km is as far as one could backhaul without incurring 1 msec of additional round trip delay. So when 5G proponents talk of 1 msec E2E latencies, we are restricted to distances much less than 82 miles, or about the distance between New York City and Philadelphia, PA.
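The distance limit can be reproduced with a short calculation. This sketch assumes a straight 1140 km Chicago to New York path, the vacuum speed of light for the theoretical floor, and a simple linear scaling of the roughly 8.6 msec round trip reported for the microwave links; the function names are hypothetical.

```python
C_KM_PER_MS = 299.792458   # speed of light in vacuum, km per msec
KM_PER_MILE = 1.609344


def round_trip_ms(distance_km, speed_km_per_ms=C_KM_PER_MS):
    """Propagation-only round trip time over a straight path."""
    return 2.0 * distance_km / speed_km_per_ms


def max_one_way_km(budget_rtt_ms, path_km, measured_rtt_ms):
    """One-way distance reachable within a round trip budget.

    Scales a measured link linearly: if path_km costs measured_rtt_ms
    round trip, each msec of budget buys path_km / measured_rtt_ms km.
    """
    return budget_rtt_ms * path_km / measured_rtt_ms


if __name__ == "__main__":
    floor = round_trip_ms(1140.0)
    reach_km = max_one_way_km(1.0, 1140.0, 8.6)
    print(f"speed of light floor: {floor:.1f} msec round trip")
    print(f"reach for 1 msec: {reach_km:.0f} km "
          f"({reach_km / KM_PER_MILE:.0f} miles)")
```

The result is about 7.6 msec for the speed of light floor and about 132 km (82 miles) of reach per msec of round trip budget, before any processing or queuing delay is counted.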
This suggests one approach to reduce End to End (E2E) latencies: offloading local traffic at the base station. This would allow two interactive gamers or two vehicles that are within the same cell to communicate with subframe time latencies. This would express local traffic without incurring the delays in the network to the right of the Serving Gateway (SGW) shown in the first figure.
Which Applications need low latencies?
Which applications, and what business cases, drive the need for low latencies?
A number of proponents have suggested that 5G will enable what is loosely called “Tactile Networks.” This is to serve very responsive applications such as gaming and vehicle control systems.
However, we find from neurological studies that conduction velocities of nerves are on the order of a few inches per millisecond. To conduct pain 1 meter, from, say, fingertips to brainstem, takes 29 to 200 msec with the Aδ axons, as indicated in the following figure. This is even without motor feedback or cognitive processing. 
Once Electro Mechanical Delay (EMD) is considered, we see that there are tens of milliseconds of delay in even reflex responses. 
In interactive computer games, researchers tell us that in the most demanding games, first person shooter and racing games, latencies of about 50 msec are inconsequential. One oft-cited article suggests that the threshold for first person shooter games and racing is 100 msec. (Though a graphic shows some improvement in lap times for a racing game as the latency is decreased below 100 msec.)
It is worth remembering that the frame rate in film is 24 frames per second, or about 41.7 msec per frame, which the eye does not detect. That is to say, many displays would not even present a gamer with a new view of the racetrack more often than about every 20 msec. The European Broadcasting Union recommendation on Lip-Synch, the time delay between audio and video content, states that audio/video synch should be within +40 msec to -60 msec (audio before/after video), but broadcasts are often off by 100 msec. This further supports the notion that the human nervous system is insensitive to latencies of tens of msec.
Remember how proud you were of yourself when you caught an object that had fallen from a tabletop? To drop 1 meter takes about 450 msec, much longer than the 1 msec response times proposed to enable “tactile networks.”
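The drop time can be checked from the free-fall relation t = sqrt(2h/g). The sketch below assumes an object released from rest, no air resistance and standard gravity; the function name is hypothetical.

```python
import math

G_M_PER_S2 = 9.80665  # standard gravity


def fall_time_ms(height_m, g=G_M_PER_S2):
    """Time in msec for an object released from rest to fall height_m."""
    return 1000.0 * math.sqrt(2.0 * height_m / g)


if __name__ == "__main__":
    print(f"1 m drop: {fall_time_ms(1.0):.0f} msec")
```

A 1 meter drop comes out at roughly 450 msec, hundreds of times longer than a 1 msec latency target.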
Why might we need latencies under 1 millisecond?
Communications between autonomous automobiles are both local (likely within the same cell) and potentially urgent. However, even here we observe that at 55 MPH a car moves 1 inch in 1 msec. So latency in inter-car communications of even 10 msec corresponds to less than a foot, or 25 cm. Air bags deploy in 15 to 30 msec.
As a result, the authors suggest that aside from research funding opportunities, very low latencies of ≤1 msec have no clear business drivers, with the exception of generally improving overall throughput and channel sensing at speeds corresponding to high-speed trains. In such cases, and for these reasons alone, it appears that improvements to the latencies inherent in the air interface may be warranted, but otherwise the business imperatives are not apparent.
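The distance a vehicle covers during a given latency follows directly from its speed. A minimal sketch, assuming constant speed over the interval; the function name is hypothetical.

```python
MPH_TO_M_PER_S = 1609.344 / 3600.0  # metres per second in one mile per hour
CM_PER_INCH = 2.54


def distance_cm(speed_mph, latency_ms):
    """Distance in cm travelled at constant speed during latency_ms."""
    metres = speed_mph * MPH_TO_M_PER_S * (latency_ms / 1000.0)
    return metres * 100.0


if __name__ == "__main__":
    for latency in (1, 10, 15, 30):
        d = distance_cm(55, latency)
        print(f"{latency:2d} msec at 55 MPH: "
              f"{d:5.1f} cm ({d / CM_PER_INCH:4.1f} in)")
```

At 55 MPH this gives just under 1 inch per msec and just under 25 cm for 10 msec, while the 15 to 30 msec needed to deploy an air bag corresponds to roughly 37 to 74 cm of travel.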
In fact, for sensor networks, and similar machine-to-machine communications, time diversity from repeated transmissions or HARQ may be more helpful to communicating high value bits through extended link budgets with penetration through walls and earth, than low latency. A delay of many seconds in communicating an alert of a flooded basement or a utility meter reading seems a valuable tradeoff in the interest of reliability and range.
 IWPC white paper, Mobile Multi Gigabit (Mogig) Wireless Networks And Terminals – 5000x Working Group, April 2, 2014. http://iwpc.org/WhitePapers.aspx#5000x. METIS requirements, presentations by Samsung, Intel, Ericsson, 5GNow, etc. etc.
 Presentation by Howard Benn, Jan 2014, Vision and Key Features for 5th Generation (5G) Cellular. Available on-line at: http://cambridgewireless.co.uk/Presentation/RadioTech_30.01.14_HowardBenn.Samsung.pdf
 Ericsson white paper, “5G Radio Access, Challenges for 2020 and Beyond.” June 2013. Available at: http://www.ericsson.com/res/docs/whitepapers/wp-5g.pdf
 METIS Document Number: ICT-317669-METIS/D1.1, Scenarios, requirements and KPIs for 5G mobile and wireless system, April 29, 2013. Available on line at: https://www.metis2020.com/wp-content/uploads/deliverables/METIS_D1.1_v1.pdf
 Here we define latency as the time difference between the start of a transmission and the receipt of its acknowledgement from the other end of the radio link, as defined in the excellent paper, Blajić, Nogulić, and Družijanić, "Latency Improvements in 3G Long Term Evolution." Mipro CTI, svibanj (2006), available on-line at: http://nashville.dyndns.org:800/WirelessDownloads/_lte/Core%20EPC%20and%20SAE/LatencyImprovementsInLTE.pdf
 Bontu, C.S.; Illidge, E., "DRX mechanism for power saving in LTE," Communications Magazine, IEEE , vol.47, no.6, pp.48,55, June 2009. available on line at: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5116800&isnumber=5116787
 Samuel Johnston, “LTE Latency: How does it compare to other technologies?” report of OpenSignal March 10, 2014. Available at: http://opensignal.com/blog/2014/03/10/lte-latency-how-does-it-compare-to-other-technologies/
 Spread Networks®, “Latencies for Ultra Low Latency Service: Latency between Chicago – 350 E. Cermak and New Jersey Trading Venues.” Available at: http://www.spreadnetworks.com/media/11244/wavelength_latencies_chicago_to_nj_12_2013a.pdf and http://spreadnetworks.com/products/ultra-low-latency-services/carteret-to-chicago-dark-fiber-–-1300-milliseconds-roundtrip/
 Jake Thomases, “Capital Markets to Embrace Microwaves for Data Feeds,” Source: Waters | 16 Aug 2013, available at: http://www.waterstechnology.com/waters/feature/2289570/capital-markets-to-embrace-microwaves-for-data-feeds
 Gerhard Fettweis, “The Tactile Internet – Driving 5G,” ETSI Future Mobile Summit, Nov 21, 2013. available on line at: http://docbox.etsi.org/Workshop/2013/201311_FUTUREMOBILESUMMIT/11_TECHNICALUNIofDRESDEN_FETTWEIS.pdf
 Gerhard Fettweis, “5G – What will it be: The Tactile Internet,” July 30, 2013, available at: http://icc2013.ieee-icc.org/speakers_17_198889650.pdf
 ElectroMechanical Delays (EMD) of reflex responses (which do not go through the brain) are measured to be from 7 msec to 40.8msec (Zhou, Shi, Lawson, David, Morrison, William, “Electromechanical delay in isometric muscle contractions evoked by voluntary, reflex and electrical stimulation,” European Journal of Applied Physiology and Occupational Physiology, 1995, Volume 70, Issue 2, pp 138-145)
 Claypool, Mark, and Kajal Claypool. "Latency can kill: precision and deadline in online games." Proceedings of the first annual ACM SIGMM conference on Multimedia systems. ACM, 2010. http://dl.acm.org/citation.cfm?id=1730863
 Claypool, & Claypool, “Latency and Player Actions in Online Games,” Communications of the ACM, Nov. 2006/ Vol. 49, No. 11, available at: http://web.cs.wpi.edu/~claypool/papers/precision-deadline/final.pdf