Summary for the impatient
Lossless compression with current formats can reduce image size on the web by 12.5%.
PNG24 images with an alpha channel comprise 14% of image traffic on the web. We can cut their size by 80% using WebP.
Lossless optimization can save 1.5% of overall Internet traffic!*
Converting PNG24 images with an alpha channel to WebP can save 2.1% of overall Internet traffic!!!
That's 2.8 Tbps!!! That's over 10 million kitty photos per second**!!! Save bandwidth for the kittens!
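The kitten figure is easy to sanity-check. Here is a minimal sketch, assuming decimal units (1 KB = 1,000 bytes, 1 Tbps = 10^12 bits per second) and the 35KB-per-photo figure from the footnote. The function name is made up for illustration.

```python
def photos_per_second(bandwidth_bps: int, photo_bytes: int) -> int:
    """How many photos of photo_bytes each fit into bandwidth_bps bits per second."""
    bytes_per_second = bandwidth_bps // 8
    return bytes_per_second // photo_bytes


saved_bps = 2_800_000_000_000  # 2.8 Tbps, from the summary
kitty_bytes = 35_000           # 35KB per photo, from the footnote (decimal KB assumed)

print(photos_per_second(saved_bps, kitty_bytes))  # 10000000
```

With binary kilobytes (35 x 1,024 bytes) the result lands slightly below 10 million, so the headline number depends on which unit you pick.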
How it began
A couple of months ago, I attended (the awesome) Mobilism Conference in Amsterdam, and got introduced to the legendary Steve Souders. We got talking about images and the possibilities in lossless compression. He offered to send me the image URLs for the top 200K Alexa sites so that I could run some lossless compression analysis on them. How can you say no to that?
So, 5.8M image URLs later, I started downloading, analyzing and optimizing images. That took ages, mostly because ls is slow with 5.8M files in a single directory. (I have since found a solution to that.)
Bytes distribution according to type:
|Image format||% from overall images|
|Optimization||Savings|
|JPEG EXIF removal||6.6%|
|JPEG EXIF removal & optimized Huffman||13.3%|
|JPEG EXIF removal, optimized Huffman & convert to progressive||15.1%|
Overall it seems that with these lossless optimization techniques, about 12.5% of image data can be saved.
- I used jpegtran for the JPEG optimization and pngcrush for the PNG optimization.
- In order to speed things up, I did the optimization experiments over a sample of 100K random images from each type.
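The three JPEG steps in the table map naturally onto jpegtran switches. The sketch below builds such a command line; it is an illustration, not the exact invocation used for the experiment. The function and parameter names are hypothetical, and the choice of `-copy none` to drop EXIF data is an assumption about how the stripping was done.

```python
def jpegtran_command(src, dst, optimize_huffman=False, progressive=False):
    """Build a jpegtran argument list for lossless JPEG optimization."""
    cmd = ["jpegtran", "-copy", "none"]  # drop EXIF and other extra markers
    if optimize_huffman:
        cmd.append("-optimize")          # recompute optimal Huffman tables
    if progressive:
        cmd.append("-progressive")       # rewrite as a progressive JPEG
    cmd += ["-outfile", dst, src]
    return cmd


# All three steps combined, as in the last row of the table
print(" ".join(jpegtran_command("in.jpg", "out.jpg",
                                optimize_huffman=True, progressive=True)))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a machine where jpegtran is installed. Comparing the size of `dst` against `src` afterwards gives the per-image savings.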
PNG24 images with an alpha channel (PNG color type 6, or PNG24α for short) are the only way to put high-quality real-life images with an alpha channel on the web today. This is the reason they comprise 14% of overall image traffic on the web. What distinguishes them from other image formats is that in most cases, they are the wrong format for the job. JPEGs represent real-life images with significantly smaller byte sizes, so the only reason PNG24α images are used instead is their alpha channel. That's where WebP fits in.
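Spotting color type 6 images in a large collection only requires reading the first few bytes of each file, since the color type sits in the IHDR chunk right after the PNG signature. A minimal sketch of that check follows; the function names are my own, and this is not necessarily how the images in this analysis were classified.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_color_type(data: bytes) -> int:
    """Return the color type stored in the IHDR chunk of PNG data."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # IHDR is always the first chunk: 4-byte length, 4-byte type, 13 data bytes
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing or malformed IHDR chunk")
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return color_type


def is_truecolor_alpha(data: bytes) -> bool:
    """True for PNG color type 6 (truecolor with an alpha channel)."""
    return png_color_type(data) == 6
```

Reading the first 26 bytes of a file (`open(path, "rb").read(26)`) is enough input for both functions.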
WebP is a new(ish) image format from Google. It is a derivative of their VP8 video codec, and provides significant size savings. One of the killer features of their latest release is an alpha channel. It means that PNG24α images can be converted to WebP (in its lossy variant) with minimal quality loss and huge savings.
PNG24α => WebP
I ran that conversion on the set of 100K PNG24α images. What I got was an 80% size reduction on average for these images. Even if they don't say it out loud, Google get similar results in their latest study (0.6 bits per pixel for lossy WebP vs. 3.6 bits per pixel for PNG).
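To see why those bits-per-pixel figures count as "similar", it helps to turn them into a relative reduction. A small sketch, with helper names of my own choosing:

```python
def bits_per_pixel(file_bytes: int, width: int, height: int) -> float:
    """Average number of bits spent on each pixel of an image."""
    return file_bytes * 8 / (width * height)


def size_reduction(before: float, after: float) -> float:
    """Relative reduction when going from `before` to `after`."""
    return 1 - after / before


# Figures quoted above: 3.6 bpp for PNG vs. 0.6 bpp for lossy WebP
print(f"{size_reduction(3.6, 0.6):.1%}")  # 83.3%
```

That is in the same ballpark as the 80% average measured on the 100K sample.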
What's the catch?
There are 2 problems with deploying WebP today:
- Browser support
- WebP's previous version is currently only supported by Chrome, Android & Opera. WebP's current version will probably be supported in Chrome in 3-6 months.
- Firefox has refused to implement the format in its previous incarnation for various reasons. Let's hope they will reconsider the format in its current version.
- Microsoft and Apple have not made any public comments.
- Lack of fallback mechanisms for the <img> element.
- That means that implementing WebP requires server side logic, and caching that varies according to User-Agent.
- The proposed <picture> element does not include such a mechanism either. It probably should.
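To make the server-side logic concrete, here is a rough sketch of the kind of User-Agent sniffing this forces on authors today. Everything in it is an assumption for illustration: the function name, the file naming scheme, and the crude substring test, which simply mirrors the browser list above and would need a proper version check in practice.

```python
# Browsers listed above as supporting WebP (naive substring match, assumption)
WEBP_UA_TOKENS = ("Chrome", "Android", "Opera")


def choose_image(base_name: str, user_agent: str):
    """Pick the WebP or PNG variant of an image and the response headers."""
    supports_webp = any(token in user_agent for token in WEBP_UA_TOKENS)
    if supports_webp:
        ext, mime = "webp", "image/webp"
    else:
        ext, mime = "png", "image/png"
    headers = {
        "Content-Type": mime,
        # Tell caches that the response depends on the requesting browser
        "Vary": "User-Agent",
    }
    return f"{base_name}.{ext}", headers
```

The `Vary: User-Agent` header is what makes this painful: intermediate caches end up storing a separate copy for nearly every distinct User-Agent string, which is exactly the cache-busting a client-side fallback mechanism would avoid.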
What have I not tested yet?
I have not yet tested WebP's benefits for lossy images, which Google claim to be around 30%. These savings are likely to make WebP even more attractive.
- Better lossless image compression using current formats can save web authors 12.5% of image data. Web authors should start using the free tools that do that, and should start doing this today! No excuses!
- WebP in its latest incarnation can provide dramatically higher savings, especially in the use case of real-life alpha channel photos. It would increase potential image data savings to at least 21.7%.
- We need browsers to either support WebP or offer a better alternative. Current file formats are not good enough, especially for the use case of real-life photos with an alpha channel.
- We need a fallback mechanism in HTML that will enable browsers and authors to experiment with new file formats without cache-busting server side hacks.
* Assuming that images comprise 15% of overall Internet traffic, which is a conservative assumption
** Assuming 35KB per kitty photo, similar to this one: