Being Pushy
I spent a few days last week in Stockholm attending the HTTP Workshop, and took part in many fascinating discussions. One of them revolved around HTTP push: its advantages, disadvantages, and the results we see from early experiments on that front.
The general attitude towards push was skeptical, due to the not-so-great results presented from early deployments, so I'd like to share my slightly-more-optimistic opinion.
# What can push do that preload can't
A recurring theme from the skeptics was that "push only saves 1 RTT compared to preload". That is often not true in practice, as there is one major use case that push enables and preload does not.
# Utilizing server think-time
HTML responses are rarely static resources nowadays. They are often dynamically generated using a higher-level language (which may be slightly on the slower side) while gathering the info needed for their creation from a database. While the back-end's response time is something you can and should optimize, response times on the order of hundreds of milliseconds are not uncommon.
There's common advice to "flush early" your HTML: start sending the first chunks of your HTML in parallel with querying the database and constructing its dynamic parts. However, not all server-side architectures make early flushing easy to implement.
Another factor that makes early flushing harder than it should be is that at the time we need to start sending data to the browser, we're not yet sure that response construction will complete successfully. In case something in the response-creation logic goes wrong (e.g. a database error, or server-side code failing to run), we need to build a way to "roll back" the already-sent response in our application logic and display an error message instead.
While it's certainly possible to do that (even automatically), there's no generic way to do that today as part of the protocol.
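As a rough illustration of the "flush early" idea, here is a minimal sketch of a generator-based WSGI handler. All names here (`render_head`, `query_database`, `render_body`) are hypothetical stand-ins, not a real framework API:

```python
# Sketch: early-flushing a response from a generator-based WSGI app.
# render_head / query_database / render_body are hypothetical names.

def render_head():
    # Static document start: can be sent before any back-end work.
    return "<!doctype html><html><head><link rel=stylesheet href=/s.css></head><body>"

def query_database():
    # Stand-in for a slow back-end call (the "think-time").
    return {"title": "Hello"}

def render_body(data):
    return "<h1>%s</h1></body></html>" % data["title"]

def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    # Flush the static head immediately...
    yield render_head().encode()
    # ...then do the slow work and stream the dynamic remainder.
    data = query_database()
    yield render_body(data).encode()
```

Note that this sketch also shows the roll-back problem: once the first chunk is yielded, a failure inside `query_database` can no longer be turned into a clean error page.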
So, the common scenario is one where the Web server waits a few hundred milliseconds for the back-end to construct the page, and only then starts to send it down. This is the point where we hit slow start, so we can only send around 14KB in our first RTT, 28KB in the second, etc. Therefore, it takes us think-time + slow-start time to deliver our HTML. And during that think-time the browser has no idea which resources will be needed next, so it doesn't send any requests for the critical-path resources.
And even if we're trying to be smart and add preload headers for those resources, they do nothing to utilize that think-time if we don't early-flush the document's start.
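For reference, this is what such preload headers look like. A small sketch that builds the `Link` header value (the resource paths are made up for illustration):

```python
# Build a Link header with rel=preload entries for critical resources.
# The paths below are hypothetical examples.

def preload_header(resources):
    # Each entry is (URL, as-attribute), e.g. ("/app.css", "style").
    return ", ".join("<%s>; rel=preload; as=%s" % (url, kind)
                     for url, kind in resources)

header = preload_header([("/critical.css", "style"), ("/app.js", "script")])
# Sent as: Link: </critical.css>; rel=preload; as=style, </app.js>; rel=preload; as=script
```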
Now compare that to what we can do with H2 push. The server can use the think-time to push the required critical resources, typically CSS and JS. So by the time the think-time is over, there's a good chance we've already sent all the required critical resources to the browser.
For extra credit, these resources also warm up our TCP connection and increase its congestion window, making sure that on the first RTT after the think-time the HTML could be sent using a congestion window of 28KB, 56KB or even more (depending on think-time and how much we pushed during it).
Let's take a look at a concrete example: how would the loading of a 120KB HTML page with 24KB of critical CSS and 74KB of critical JS look over a network with an RTT of 100ms and infinite bandwidth?
Without push, we wait 300ms for HTML generation, then 4 RTTs to send the HTML due to slow start, and another RTT for the JS and CSS requests to come in and their responses to be sent. Overall: 800ms to first render.
With push, the CSS and JS are sent as soon as the request for the HTML arrives. It takes 3 RTTs to send them (again, due to slow start), and they bump the CWND up to ~112KB, so when the HTML is ready it can be sent down within a single RTT. Overall time to first render: 400ms.
That's a 50% speedup to first render! Not too shabby...
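The arithmetic above can be sketched with a toy slow-start model, assuming (as in the example) an initial congestion window of 14KB that doubles every round trip:

```python
# Toy model of the example above: 100ms RTT, infinite bandwidth,
# initial congestion window of 14KB that doubles each round trip.
RTT_MS = 100
THINK_MS = 300
INIT_CWND_KB = 14

def rtts_to_send(size_kb, cwnd_kb=INIT_CWND_KB):
    """Round trips needed to send size_kb under the doubling model.
    Returns (rtts, congestion window after sending)."""
    rtts = 0
    while size_kb > 0:
        size_kb -= cwnd_kb
        cwnd_kb *= 2
        rtts += 1
    return rtts, cwnd_kb

# Without push: think-time, 4 RTTs for the 120KB HTML (14+28+56+22),
# then one more round trip for the CSS/JS request and response.
html_rtts, _ = rtts_to_send(120)
no_push_ms = THINK_MS + (html_rtts + 1) * RTT_MS

# With push: the 98KB of CSS+JS goes out during the think-time and
# warms the window, so the HTML then fits in a single RTT.
push_rtts, cwnd = rtts_to_send(24 + 74)   # 3 RTTs, window now 112KB
with_push_ms = THINK_MS + 1 * RTT_MS

print(no_push_ms, with_push_ms)  # 800 400
```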
# Where push is not-so-great
One of the reasons I believe people are Using It Wrong™ when it comes to push is that they're using it in scenarios where it doesn't provide much benefit, or even causes actual harm.
# Blindly pushing static resources
One of the major things you can do wrong with push is saying to yourself: "Hey, I have these static resources that all my pages need; I'll just configure them to be pushed on all pages".
The main reason this is a bad idea is caching. These resources are likely to be in the browser's cache after the user visits the first page, yet you keep pushing them to no end. You could argue that it's no worse than inlining all those resources, and you'd be right, but I'd argue back that inlining all those resources would also be a bad idea :)
So, if you are blindly pushing resources that way, make sure that it's only stuff you would have inlined, which is basically your critical CSS. Otherwise, you run a risk of making repeat visits significantly slower.
You may think that stream resets will save you from wasting too much bandwidth and time on pushing already-cached resources. You'd be wrong. Apparently, not all browsers check their caches and terminate push streams of cached resources. And even if they do, you're still sending the resource data for a full RTT before the stream reset reaches the server. If you're doing that for multiple resources, that can add up to a lot of wasted data.
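A rough upper bound on the waste: the server keeps sending for a full RTT before the reset arrives, so each unnecessary push can burn up to roughly bandwidth × RTT of data. The numbers below are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope bound on data wasted per pushed-but-cached
# resource: the server sends for a full RTT before RST_STREAM lands.
# Illustrative assumptions: a 5 Mbps link with a 100ms RTT.
bandwidth_kbps = 5000            # kilobits per second
rtt_ms = 100

wasted_kb = bandwidth_kbps / 8 * (rtt_ms / 1000)
print(wasted_kb)  # up to ~62.5 KB per reset, before window limits
```

In practice the congestion window caps this, but pushing several cached resources at once multiplies it.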
# Getting stuff into the browser's cache
You may think that push gets stuff into the browser's cache and can be used to e.g. invalidate current resources. At least at the moment, that is not the case. One of the topics of discussion in the workshop revolved around the fact that we may need to change current push behavior to support direct interaction with the browser's cache, but right now, push is simply not doing that. Pushed responses go into this special push-only cache, and they go into the HTTP cache only when there's an actual request for them.
So if you're pushing resources in the hope that they'll be used in some future navigation, the browser may throw them out of the push cache well before they're actually needed.
At least that's the way the implementations work today.
# Filling the pipe after the HTML was sent down
Often in the page's download cycle there are gaps in the utilized bandwidth, meaning that we're not sending down the required resources as fast as we could be, usually due to late discovery of those resources by the browser.
While you should try to fill in these gaps by sending down resources the page needs, it is often better to do that with preload rather than push. Since preload takes caching, cookies and content negotiation into account, it doesn't run the risks of over-sending or sending the wrong resource that push does. For filling in these gaps, push has no advantage over preload, only disadvantages, so it's significantly better to use preload for that purpose.
# Cache Digests
We saw that one of push's big disadvantages is that the server is not necessarily aware of the browser's cache state, so when pushing we run the risk of pushing something that's already in the cache.
There's a proposed standard extension that would resolve that called cache-digests. The basic idea is that the browser would send a digest to the server when the HTTP/2 connection is initialized, and the server can then estimate with high accuracy if a resource is in the browser's cache before sending it down.
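The flavor of the idea can be sketched as a small probabilistic digest of hashed URLs. This is only a toy illustration of the concept; the actual proposal encodes the hashes as a Golomb-coded set on the wire, which this sketch does not attempt:

```python
import hashlib

# Toy cache digest: hash each cached URL into a small value space and
# collect the values. Mimics the idea only, not the real wire format.
DIGEST_SPACE = 2 ** 20  # false-positive probability ~ N / 2^20

def hash_url(url):
    digest = hashlib.sha256(url.encode()).digest()
    return int.from_bytes(digest[:8], "big") % DIGEST_SPACE

def build_digest(cached_urls):
    # Browser side: a digest of everything currently in the cache,
    # sent once when the HTTP/2 connection is set up.
    return {hash_url(u) for u in cached_urls}

def probably_cached(digest, url):
    # Server side: consult the digest before pushing. Can return a
    # false positive (skip a push needlessly), never a false negative.
    return hash_url(url) in digest

digest = build_digest(["/app.css", "/app.js"])
probably_cached(digest, "/app.css")   # True: already cached, don't push
probably_cached(digest, "/new.js")    # almost certainly False: push it
```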
It's still early days for that proposal and it may have to be somewhat simplified in order to make its implementation less expensive, but I'd argue that currently H2 push is only half a feature without it.
# To sum it up
H2 push can be used to significantly improve loading performance, and when used right can speed up the very first critical-path load, resulting in improved performance metrics across the board.
Push is still very much new technology, and like all new tools, it may take a while before we figure out the optimal way to use it. Often that means one or two sore thumbs along the way.
So, initial results from early experiments may not be everything that we hoped for, but let's treat those results as an indication that we need to get smarter about the way we use push, rather than concluding it's not a useful feature.
2 comments
asilvas — Tue, 02 Aug 2016 16:59:20 GMT
Good article, thanks. My findings are very similar. What I was pleased to find was that both high & low latency networks benefited 25-35% from H2 Push in page load times compared to that of HTTP/2 (with multiplexing only).
https://github.com/asilvas/...
Cache Digests will fill a critical limitation in H2 Push. I had published a competing draft (found https://github.com/asilvas/... ), but Cache Digests already has quite a bit of momentum, and all I care about is getting this lack of client state squashed ASAP.
Uwe Trenkner — Wed, 03 Aug 2016 07:03:39 GMT
It clearly was the server think-time that made us use server push for a certain page. In this case, a big Zend application is working in the background, which takes some time to reply.
In this waterfall chart one can see that the first byte of HTML comes in only after 1048ms: http://www.webpagetest.org/...
By that time, the h2o web server is long since done pushing 19 different assets (css, js, svg) to the browser, which will be used on that page. A perfect case for server push.
The assets are also used on related pages and can then be re-used from the browser cache. And because h2o has CASPer (cache-aware server-push), a mechanism similar to cache-digest, the assets are not pushed again in repeat view: http://www.webpagetest.org/...
Thus, for us, cache-aware server-push is a no-regret mechanism to improve the performance of that page.