Using Apache HTTPD's mod_deflate, we recently enabled gzip compression by default on all Go Daddy 4GH sites. Reducing the size of documents might be seen primarily as a way to reduce network bandwidth consumption. However, I will explain the real reason: to improve our customers' website page load performance.
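If you want to spot-check whether a given server is actually serving gzip, it's enough to request a page with an Accept-Encoding header and inspect the response. A minimal sketch, using a stand-in URL:

```python
import urllib.request

# Hypothetical spot check: ask for gzip and see whether the server
# honors it. The URL is a stand-in; substitute any page to test.
req = urllib.request.Request(
    "https://example.com/",
    headers={"Accept-Encoding": "gzip"},
)
with urllib.request.urlopen(req) as resp:
    # Prints "gzip" if compression is enabled for this response
    print(resp.headers.get("Content-Encoding"))
```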
You might think that with today's high-speed DSL, cable, and fiber networks, downloading a few hundred KB of data would be more or less instantaneous. If you do the math, a 20Mbps cable connection is able to fetch a 200KB HTML file in about 75ms. In the real world, that generally doesn't happen; you're likely seeing download times for the first HTML document in the hundreds of milliseconds.
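As a quick sanity check on that arithmetic (the exact figure shifts a few milliseconds depending on whether a KB is counted as 1,000 or 1,024 bytes):

```python
# Back-of-the-envelope math: pushing a 200KB document through a 20Mbps
# pipe, ignoring latency and protocol overhead entirely.
doc_bits = 200 * 1024 * 8          # 200KB document, in bits
line_rate = 20 * 10**6             # 20Mbps line rate, in bits/second
print(f"{doc_bits / line_rate * 1000:.0f}ms")   # ~82ms
```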
“Who stole my bandwidth here?” you may be wondering. Not to worry, it’s still there, and I’ll explain what is really happening in a moment.
Before we go into details about why we can't utilize the full line speed on the first HTML download, let's look at some real examples of page load time with and without gzip compression.
We created a test setup using a typical high-traffic, large-content website. The HTML alone for this page is 200KB uncompressed and 40KB compressed. As a side note, an 80-90% size reduction is pretty typical for HTML with standard gzip deflate implementations.
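If you want to eyeball the ratio for your own pages, Python's standard gzip module is enough for a rough check; `page.html` below is a hypothetical stand-in for the test page:

```python
import gzip

# Compress a sample HTML document and compare sizes. HTML compresses
# well because markup is highly repetitive.
with open("page.html", "rb") as f:
    html = f.read()

compressed = gzip.compress(html)
saving = 1 - len(compressed) / len(html)
print(f"{len(html):,} -> {len(compressed):,} bytes ({saving:.0%} smaller)")
```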
We fetched the HTML file from a couple of places in the world: one very close to the server, and one further away. Though somewhat exaggerated, the test clearly visualizes the problem:
| | Uncompressed | Compressed |
| --- | --- | --- |
| Theoretical limit on 20Mbps | 78ms | 15ms |
| Low latency (17ms RTT) | 182ms | 105ms |
| High latency (159ms RTT) | 1,340ms | 840ms |
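A rough version of this comparison is easy to reproduce. The sketch below times the same fetch with and without gzip negotiated; the URL is a stand-in for the test page:

```python
import time
import urllib.request

URL = "https://example.com/"   # stand-in for the test page

def timed_fetch(encoding: str) -> None:
    """Fetch URL advertising a single Accept-Encoding and report the
    on-the-wire size and elapsed time (urllib does not decompress gzip,
    so len(body) is the transferred size)."""
    req = urllib.request.Request(URL, headers={"Accept-Encoding": encoding})
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    ms = (time.perf_counter() - start) * 1000
    print(f"{encoding:>8}: {len(body):>7} bytes in {ms:.0f}ms")

timed_fetch("identity")   # uncompressed
timed_fetch("gzip")       # compressed
```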
What is going on here? Why am I not getting the promised 20Mbps download speed when I fetch the initial HTML page? Why does latency have such a tremendous impact on the download times? There are two reasons for this, and both are artifacts of how the network protocol (TCP) was designed:
- TCP 3-way handshake: Creating a new TCP network connection requires one and a half round-trips between the server and the client.
- Congestion avoidance: These are safety mechanisms that prevent senders from completely overloading a busy network with too much traffic.
The performance impact of these two mechanisms varies depending on the latency, or round-trip time (RTT), between the client and the server. They both also benefit from HTTP keep-alive connections (or "PConns"), which is why you should always support them if possible.
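As an illustration, here is a minimal sketch of keep-alive in action: one connection, two requests, where the second request skips the handshake and starts with a warmed-up congestion window. The host is a stand-in:

```python
import http.client
import time

# Reusing one TCP connection (HTTP keep-alive) pays the handshake and
# slow-start cost once instead of on every request.
conn = http.client.HTTPSConnection("example.com")  # hypothetical host
for attempt in (1, 2):
    start = time.perf_counter()
    conn.request("GET", "/")
    resp = conn.getresponse()
    resp.read()   # drain the body so the socket can be reused
    print(f"request {attempt}: {(time.perf_counter() - start) * 1000:.0f}ms")
conn.close()
```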
For our purposes, let’s focus on congestion avoidance, because this is what we have improved upon with our gzip content compression.
There are many different congestion control mechanisms and protocols in use today, with numerous configuration options. Let’s focus on one typical setup:
- On a freshly established connection, at most three TCP packets can be outstanding on the wire, waiting for an acknowledgement. This is referred to as the initial congestion window size.
- The number of TCP packets that can be outstanding doubles after every successful acknowledgement.
- There is an upper boundary; for the sake of this example, let's assume it is 128KB. The sender will also back off as necessary due to packet losses. (A short sketch of this growth pattern follows below.)
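Here is a minimal sketch of that growth, assuming a typical 1,460-byte payload per packet; both that figure and the 128KB ceiling are illustrative numbers, not universal constants:

```python
# Start at 3 packets, double every round trip, cap at the ceiling.
MSS = 1460                 # assumed bytes of payload per TCP packet
CAP_BYTES = 128 * 1024     # assumed upper boundary from the example

cwnd_packets = 3           # initial congestion window
for rtt in range(1, 8):
    in_flight = min(cwnd_packets * MSS, CAP_BYTES)
    print(f"RTT {rtt}: up to {in_flight // MSS} packets "
          f"({in_flight:,} bytes) in flight")
    cwnd_packets *= 2
```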
What does this mumbo-jumbo really mean? Let’s examine what happens on the wire when an HTTP request is processed on a new TCP connection:
On a new connection, after the TCP 3-way handshake is completed, the server is initially allowed to send no more than 3 packets (#0-2). The server then has to wait for an acknowledgement (ACK) from the client. Upon receiving the ACK, the window size (the number of packets allowed) doubles, and the server sends 6 packets (#3-8). Then the server again has to wait for an ACK. This process continues (sending #9-20, and so on) until the document is completely downloaded.
The 40KB compressed document can be completely sent in 5 round-trips, while the 200KB uncompressed document requires 8 round-trips. So even though the congestion window grows exponentially over time, the number of round-trips is noticeably higher for the larger object.
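Here is a back-of-the-envelope model of that count. It assumes a 1,460-byte payload per packet and charges two round-trips for the handshake plus the HTTP request; real stacks differ (delayed ACKs, different initial windows, packet loss), so it lands on or within one round-trip of the 5 and 8 above rather than reproducing them exactly:

```python
MSS = 1460        # assumed bytes of payload per packet
INIT_CWND = 3     # initial congestion window from the setup above

def round_trips(doc_bytes: int) -> int:
    """Round-trips to deliver a document on a fresh connection: two for
    the handshake plus the HTTP request, then one per congestion
    window, with the window doubling each time."""
    packets = -(-doc_bytes // MSS)   # ceil(doc_bytes / MSS)
    cwnd, rtts = INIT_CWND, 2
    while packets > 0:
        packets -= cwnd
        cwnd *= 2
        rtts += 1
    return rtts

for label, kb in (("compressed", 40), ("uncompressed", 200)):
    print(f"{label} ({kb}KB): ~{round_trips(kb * 1024)} round-trips")
```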
What does this mean for our two use cases? Using the information above, the math is simple:
| | Uncompressed | Compressed |
| --- | --- | --- |
| Low latency (17ms RTT) | 8 * 17 = 136ms | 5 * 17 = 85ms |
| High latency (159ms RTT) | 8 * 159 = 1,272ms | 5 * 159 = 795ms |
These numbers don't take into account anything outside the theoretical limitations of a newly created TCP connection. They do, however, match up very closely with the real-world measurements above. This is the main reason gzip compression makes such a tremendous impact on the HTML download: reducing the number of packets, and therefore the number of round-trips, yields real improvements in page load times!
Content compression with gzip delivers significant performance improvements, particularly on higher-latency networks over longer distances. We have shown that reducing the number of network packets when fetching the HTML on a newly established TCP connection has a direct impact on download times. Enabling gzip compression is virtually free, although there are a few specific cases, such as content that is already compressed, where gzip may not be the best choice.