Google, Wikipedia, Twitter and other major players on the web have moved (most of) their services to HTTPS delivery. – Google went even further and developed SPDY, which tunnels HTTP traffic over TLS and promises fast page-load times despite the overhead of encryption. SPDY’s successor, HTTP/2, will most likely be supported by Firefox and Chrome only for sites served via TLS.
So, this time I wanted to know whether a CDN can deliver website assets via HTTPS/SPDY faster than, or at least as fast as, via plain HTTP.
Here’s my initial situation…
As in the last years, I’ve used two VPS instances (the classic separation of an Apache frontend and a MySQL backend) at DreamHost in combination with Amazon CloudFront (origin pull). For the CDN this means that assets are requested from the origin server (the frontend VPS) whenever a file is not yet, or no longer, cached on the edge server the user communicates with.
This site and JourneyCalculator both generate moderate (though highly valued) traffic with some spikes, e.g. when being featured somewhere, or during DDoS or brute-force attacks. Anyway, observing the stats in the Amazon Web Services (AWS) console, I noticed that many cache misses occur. Searching the AWS forums and Google, I didn’t find much information about the ‘real’ cache lifetime of assets on CloudFront edge caches. An AWS document only states that “if an object in an edge location isn’t frequently requested, CloudFront might evict the object – remove the object before its expiration date – to make room for objects that have been requested more recently” (cf. CloudFront Developer Guide).
Clearly, cost-saving measures are necessary to keep the price per GB low, but for my use cases this was suboptimal. In fact, it means that at least some pages load more slowly than they would without a CDN, because a non-cached object requires two requests (i.e. client/browser ↔ edge server ↔ origin server), which will most likely increase the page-load time and therefore provide a worse user experience than without the CDN.
On top of that, SPDY is still not supported by CloudFront.
An alternative to CloudFront
So, it was time to try something new. SPDY in particular has been on my bucket list, because I enjoy the idea of encrypted transfer combined with better page-load times; in most cases, performance and security are natural enemies. Btw. thanks to Robert, a team member at the company I used to work for, who pointed me to the ongoing development of SPDY some time ago.
First the client check: as of August 2015, close to 80% of all browsers used worldwide support SPDY.
Regarding performance, I started by researching the Internet for metrics and benchmarks, but only came up with a lot of synthetic tests and a paper by Jitu Padhye and Henrik Frystyk Nielsen, both at Microsoft, which evaluated an older version of SPDY. Please note that there is a lot of information on the Internet comparing HTTPS to SPDY, for example Google’s statement about their Apache module mod_spdy, but that’s not what I’m trying to evaluate.
I set up a KeyCDN pull zone for each of my projects and wrote a script to remote-control Chrome. The script
- cleans the browser cache
- dials up different VPNs
- opens a list of URLs (retrieved from my sitemaps)
to simulate world-wide traffic.
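Sketched in shell, the core of such a loop looks roughly like this. To be clear, this is a minimal sketch, not the actual script: the VPN profiles, the Chrome invocation, and the URL list file name are placeholders.

```shell
# visit_all BROWSER_CMD URL_FILE: open every URL in the file with the
# given browser command, one after another.
visit_all() {
  local browser=$1 list=$2
  while read -r url; do
    $browser "$url"   # word-splitting on $browser is intentional here
    sleep 1           # give the page time to load before the next one
  done < "$list"
}

# Full run (sketch): for each VPN profile, connect, start Chrome with a
# fresh profile directory (i.e. an empty cache), then visit all sitemap URLs.
# for vpn in vpns/*.ovpn; do
#   sudo openvpn --config "$vpn" --daemon && sleep 10
#   rm -rf /tmp/chrome-bench                 # empty browser cache
#   visit_all "google-chrome --user-data-dir=/tmp/chrome-bench" urls.txt
#   sudo killall openvpn
# done
```

Using a throw-away profile directory per run is a simple way to guarantee an empty browser cache without touching the regular Chrome installation.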
I ran all tests at least twice (a couple of hours each with about 60 pages and 30 VPNs) to make sure that the results were accurate. In contrast to CloudFront, this time my assets would stay on their respective edge servers as specified in their cache-control headers.
Besides pull zones, KeyCDN also offers so-called push zones, where you upload your content to a central server via (encrypted) FTP or rsync. Though the assets are not yet cached on the edge servers, they become available instantly upon request.
Nice one. This means that I can offload a rarely changing collection of assets, which I don’t want to manage with cache invalidations, to their infrastructure, and no requests will hit my (not SPDY-enabled) VPS. Here’s the lengthy cronjob-invoked rsync call that only includes certain file types, in my case WordPress core files.
/usr/bin/rsync --prune-empty-dirs --delete -rlthvz --chmod=u=rwX,g=rX \
  --include='*/' \
  --include='*.png' --include='*.jpg' --include='*.jpeg' --include='*.gif' \
  --include='*.css' --include='*.js' --include='*.svg' \
  --include='*.woff' --include='*.woff2' --include='*.ttf' \
  --include='*.ico' --include='*.eot' \
  --exclude='*' \
  /PATH/TO/WEB_ROOT/ USER@rsync.keycdn.com:zones/ZONE/
To use it yourself, please replace /PATH/TO/WEB_ROOT/, USER and ZONE. I’ve also included a retry in case the connection fails and implemented some parameterization, which is beyond the scope of this article, so I won’t go into detail here.
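For those who want a starting point anyway, a generic retry wrapper can be as small as the following sketch. The function name and the back-off intervals are my own choices, not the actual implementation used for the cronjob.

```shell
# retry MAX_TRIES CMD [ARGS...]: run CMD until it succeeds, at most
# MAX_TRIES times, backing off a little longer after each failure.
retry() {
  local max=$1; shift
  local n=1
  until "$@"; do
    if [ "$n" -ge "$max" ]; then
      echo "giving up after $n attempts" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$n"   # simple linear back-off before the next attempt
  done
}

# Usage (placeholders as in the rsync call above):
# retry 3 /usr/bin/rsync ... /PATH/TO/WEB_ROOT/ USER@rsync.keycdn.com:zones/ZONE/
```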
While observing the CDN access logs, I saw a lot of REVALIDATED entries besides the expected MISS and HIT statuses you get from an edge cache. Wanting to know more about this behavior, I contacted KeyCDN’s support and, within 10 minutes on a Saturday (!), got the following reply: files on push zones are revalidated every 15 minutes against their root server, and only files whose size or timestamp changed are updated on the edge caches/POPs (Points of Presence). The revalidation itself is very quick. Files that are not requested are purged after a couple of days.
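For spot checks you don’t have to dig through the access logs at all: KeyCDN reports the edge-cache result of a request in the X-Cache response header. A tiny helper (a sketch of mine, the zone URL in the comment is a placeholder) extracts it:

```shell
# cache_status: extract the X-Cache value from HTTP response headers
# read on stdin, normalized to upper case.
cache_status() {
  grep -i '^x-cache:' | awk '{ print toupper($2) }' | tr -d '\r'
}

# Typical invocation against an edge server (URL is a placeholder):
# curl -s -o /dev/null -D - https://ZONE-ID.kxcdn.com/style.css | cache_status
```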
Finally: HTTP vs. SPDY
So far, so good. – After the new architecture had been set up, I wanted to benchmark the page-load time of HTTP vs. SPDY assets on an HTTP page.
First of all, I used my automated browser script to warm up the edge caches. Then good old webpagetest.org had to run dozens of tests from different locations. The results seemed promising: despite the fact that an SSL handshake was necessary to initiate the connection, DOMContentLoaded and full page-load times were equal to or about 10% faster than with HTTP assets, provided the page contained more than five assets (e.g. 1x CSS, 1x JS, 3x image files). The more assets, the bigger the savings.
Then I used the dotcom-monitor benchmark tool, which tests your pages from 22 different locations (remark: about five locations are unavailable most of the time, but you still get a good picture of global response times). The results were similar, in other words about 10% savings.
Please note that, according to the numbers derived from a warm-up shell script (about three million cURL requests to one thousand assets distributed over their network within a few weeks), a cache HIT from KeyCDN’s pull zone is about 10-20% faster than a REVALIDATED response from their push zone, depending on the size of the asset and the edge server location.
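The warm-up script essentially boiled down to a cURL loop like the following sketch; the file names are placeholders, and the real script iterated over the full asset list per POP.

```shell
# warm_up URL_FILE LOG_FILE: fetch each URL listed in the file once with
# cURL and append "<url> <total-time-in-seconds>" lines to the log file.
warm_up() {
  local list=$1 log=$2
  while read -r url; do
    curl -s -o /dev/null \
         -w '%{url_effective} %{time_total}\n' "$url" >> "$log"
  done < "$list"
}

# Example: warm_up assets.txt timings.log
```

Collecting `time_total` per request is also what makes the HIT vs. REVALIDATED comparison above possible in the first place.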
More information on performance optimization with SPDY
- Domain sharding may decrease SPDY’s performance benefits, so this technique should be re-evaluated.
- An additional benefit of delivering HTTPS-assets on HTTP-pages is that all assets are already browser-side cached if the user decides to switch from HTTP to HTTPS or vice-versa. This happens for instance on some sites if the user logs in.
Conclusion – Happy camper
A remark: All links to KeyCDN are affiliate links. In case you consider signing up with them, you’ll get a start-up credit of 25 GB from them and I thank you for supporting this experiment. The costs for data-transfer start at $0.04/GB for all continents.