Friday, September 25, 2015

Thanks Google for Open Source TCP Fix!

The Google transport networking crew (QUIC, TCP, etc..) deserve a shout out for identifying and fixing a nearly decade old Linux kernel TCP bug that I think will have an outsized impact on performance and efficiency for the Internet.

Their patch addresses a problem with cubic congestion control, which is the default algorithm on many Linux distributions. The problem can be roughly summarized as the controller mistakenly characterizing the lack of congestion reports over a quiescent period as positive evidence that the network is not congested and therefore it should send at a faster rate when sending resumes. When put like this, its obvious that an endpoint that is not moving any traffic cannot use the lack of errors as information in its feedback loop.

The end result is that applications that oscillate between transmitting lots of data and then laying quiescent for a bit before returning to high rates of sending will transmit way too fast when returning to the sending state. This consequence of this is self induced packet loss along with retransmissions, wasted bandwidth, out of order packet delivery, and application level stalls.

Unfortunately a number of common web use cases are clear triggers here. Any use of persistent connections, where the burst of web data on two separate pages is interspersed with time for the user to interpret the data is an obvious trigger. A far more dangerous class of triggers is likely to be the various HTTP based adaptive streaming media formats where a series of chunks of media are transferred over time on the same HTTP channel. And of course, Linux is a very popular platform for serving media.

As with many bugs, it all seems so obvious afterwards - but tracking this stuff down is the work of quality non-glamorous engineering. Remember that TCP is robust enough that it seems to work anyhow - even at the cost of reduced network throughput in this case. Kudos to the google team for figuring it out, fixing it up, and especially for open sourcing the result. The whole web, including Firefox users, will benefit.

Thanks!