Francesco Altomare
Hasenheide 9
10967 Berlin (Germany)
Mobile: +49 151 65623284


On GZIP

My dear Reader,

Let’s cut the crap and talk serious optimization today, with GZIP as a first-choice topic. Why GZIP? Because it matters (amongst everything else we’ll discuss in this Blog).

You have a Website, which is most likely also your Business, and you may even have some CDN or Cloud Service boosting your performance. Metaphorically speaking, the Internetworks are the very roads leading to your virtual Shop or Newspaper Stand.
You also know humans (your Audience) are lazy, so why would I – as a human – go buy my newspaper from Stand A (a 10-minute walk) when Stand B is only a 5-minute walk away? I may still go to Stand A for the time being because “it’s an old friend of mine”, so I enjoy the talk and gossip whilst there (Content Quality), but over time – you know how that goes eventually – I’ll turn to Stand B.

Be Stand B! That’s what we’ve been telling you for a long time now.

Amongst all the things that turn the old-town cobbled roads to your Stand into clean, 16-lane highways, GZIP is a go-to, in case you haven’t enabled it already.

We will assume you know GZIP (DEFLATE) is a compression method, and we will also assume you know where and how to activate it in an Apache/Nginx environment (if not, please check this very well written Post on the subject); something we won’t assume, though, is that you have ever taken a serious look at the topic.
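For reference, enabling it on Nginx boils down to a handful of directives in the http or server block. The snippet below is a minimal sketch, not a tuned production config – adjust the compression level and MIME types to your own setup:

```nginx
# Enable gzip for text-based responses (sketch – tune for your environment)
gzip on;
gzip_comp_level 5;     # 1 = fastest, 9 = smallest; 4-6 is a common middle ground
gzip_min_length 256;   # don't bother compressing tiny responses
gzip_types text/plain text/css application/javascript application/json image/svg+xml;
gzip_vary on;          # emit "Vary: Accept-Encoding" for intermediate caches
```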

We have seen an increasingly large number of Businesses (even well-established Brand names) literally drop their jaws when shown that their content was being delivered 100% plain, and when we explained that they had been losing money, performance and, ultimately, Audience all along.

The purpose of activating – and especially fine-tuning – GZIP Compression is manifold; let’s have a look at the pros and cons it brings:

Cons:

– It costs CPU: compressing assets on an Origin Infrastructure (and, respectively, decompressing them in a Browser) requires computing cycles: the more assets you compress, the more your sorry CPU will have to cycle and compute compressed responses.

Now, if you have tons of CPU available and a distributed WebServer Architecture this may not be a hassle at all, but if you already struggle to cool down your CPUs and keep them at bay without compression, this may turn into a serious showstopper.
In fact, I have seen instances where disabling GZIP Compression (and freeing up lots of CPU for responding to incoming connections and starting data transfers) outweighed keeping Compression on – for images, for instance – and skimming that ~10% off total bandwidth. Give this point some serious observation.
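The trade-off is easy to measure for yourself. This sketch (plain Python, with a made-up payload) times how much CPU each compression level burns on a few megabytes of HTML-like text:

```python
import gzip
import time

# A few MB of repetitive, HTML-like text – a stand-in for real pages
payload = b"<div class='item'><p>Hello, dear Reader!</p></div>\n" * 50_000

for level in (1, 5, 9):
    start = time.process_time()
    compressed = gzip.compress(payload, compresslevel=level)
    cpu_ms = (time.process_time() - start) * 1000
    print(f"level {level}: {len(payload)} -> {len(compressed)} bytes, "
          f"{cpu_ms:.1f} ms CPU")
```

Higher levels shave a few extra bytes at a disproportionate CPU cost; on a busy Origin, a mid-range level is usually the sweet spot.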

– It doesn’t help a lot with formats such as images: that is also a typical counterargument when it comes to GZIP, as file formats such as JPEG are in fact already compressed.
What to do in this case: as per above, if CPU is constrained you may want to leave it off, especially for compressed file formats; still, if you have CPU to dedicate to it, saving a further ~10% of your total daily image traffic (check Cacheability, check Offload, check CDNs in later Posts) isn’t a bad idea, is it?
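You can see the difference in compressibility with a quick experiment – here random bytes stand in for an already-compressed JPEG (an approximation, but statistically close):

```python
import gzip
import os

text = b"<p>The quick brown fox jumps over the lazy dog.</p>" * 1000
jpeg_like = os.urandom(len(text))  # random bytes approximate already-compressed data

text_gz = gzip.compress(text)
jpeg_gz = gzip.compress(jpeg_like)

print(f"text:      {len(text)} -> {len(text_gz)} bytes")
print(f"jpeg-like: {len(jpeg_like)} -> {len(jpeg_gz)} bytes")
```

The text shrinks dramatically; the incompressible payload actually grows by a few bytes of gzip framing – compressing it is pure CPU waste.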

– It doesn’t work on older Browsers: we’re all aware Netscape 1 didn’t support GZIP/DEFLATE compression, nor did many other older Browsers. Web Archaeologists have confirmed, though, that in 2014 we live in a speedier world than back then, so unless you really need to assure compatibility with pre-2006 Technologies, you can rest assured all of today’s browsers will work with GZIP/DEFLATE.

You will likely find no electricity in abandoned medieval castles, nor will you see pictures from the Men on the Moon pre-July 1969, but that shouldn’t prevent you from talking to today’s and tomorrow’s devices.

Pros:

– You save bandwidth: with or without a CDN/Cloud in the middle, your Hosting Solution may have some contracted bandwidth; it actually most likely will. If so, saving up to 80% of data transfer on text, CSS, JS, HTML and so on can save you some serious costs.
Many WebAdmins take for granted that if their total monthly traffic equals X GB, and a month later it peaks at X+Y GB, that is a good thing: more traffic, more costs to sustain their passionate readers and so on. Nothing could be further from the truth: you don’t want traffic to boom, you want visits, revenue and customer satisfaction to boom – my dear Reader!
So what better than making revenue whilst cutting costs on your Hosting Platform?
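To put hypothetical numbers on it (both figures below are assumptions – plug in your own contracted volume and measured compression ratio):

```python
# Hypothetical monthly figures – substitute your own
monthly_text_gb = 500       # text/css/js/html transferred per month, uncompressed
compression_saving = 0.75   # typical gzip saving on text (often 60-80%)

saved_gb = monthly_text_gb * compression_saving
billed_gb = monthly_text_gb - saved_gb
print(f"Billed for {billed_gb:.0f} GB instead of {monthly_text_gb} GB "
      f"({saved_gb:.0f} GB saved)")
```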

– You boost performance: delivery performance as a whole, in an HTTP(S) transaction, depends on more factors than we can explain in this paragraph alone. Today, let us take a look at the few elements of it that pertain to GZIP compression.
Transport-wise, you as a Client establish serial connections to one or more given Servers and, leaving aside all other time deltas, the transfer starts; assuming that the transfer rate stays equal throughout the whole transaction, wouldn’t you agree that downloading a 20 kB page vs. a 100 kB page takes one fifth of the time? Just imagine how happy your clients would be checking your content in one fifth, one third, half or even 70% of the time they’re used to.
There are many studies regarding drop/abandon/bounce rates on pages exceeding a given time to load, and we’ll be glad to expand on them in future Posts; as a rule of thumb, though, you yourself would be happy to access the same content faster than ever, wouldn’t you, my dear Reader?
Some may counter: “You take the page weight down by compressing it, but at the same time you add load to the Browser by forcing it to uncompress every single asset it receives. What use is that?”
The theory is right: yes, Browsers are going to spend CPU, lots of it, when uncompressing objects, and that – depending on the available CPU – may get in the way of a smooth end-user experience.
The real question is, as per the last “Con” above: who do you really deal with ordinarily? Do you deal with end users running stone-age-old Pentium 2 Processors? If so, then yes, we agree you should carefully study the pros and cons from a performance perspective before enabling Compression in your Production environment; if not, and your average end user’s CPU hovers around 10% utilization, it shouldn’t really be an issue, and the few milliseconds of overhead needed to uncompress assets will easily be outweighed by the actual seconds you save delivering compressed objects.
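Milliseconds vs. seconds is not an exaggeration. The sketch below measures decompression cost on ~260 kB of markup and compares it with the transfer time saved on an assumed 100 kB/s link (both the payload and the link rate are made-up examples):

```python
import gzip
import time

page = b"<li>item</li>" * 20_000   # ~260 kB of repetitive markup
gz = gzip.compress(page)

start = time.process_time()
gzip.decompress(gz)
decompress_ms = (time.process_time() - start) * 1000

link_bytes_per_s = 100_000          # assumed steady 100 kB/s transfer rate
saved_s = (len(page) - len(gz)) / link_bytes_per_s

print(f"Decompression cost:  {decompress_ms:.1f} ms")
print(f"Transfer time saved: {saved_s:.2f} s")
```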

In a Cloud/CDN deployment:

If, as we strongly suggest to all successful Businesses and Partners of ours, your Origin Infrastructure already sits behind a Cloud or a CDN, the delivery chain gets much more interesting and performant! The most interesting aspect here that you may want to know about is:

– Independence of GZIP Settings between Origin Server and CDN/Cloud: as you may know, by moving to a Cloud/CDN delivery model the transport interactions will look like below:

Origin Server(s) <—> Cloud/CDN Servers <—> End Users

The way your Origin talks to the Cloud/CDN, and the way the Cloud/CDN talks to your Audience are, in many more respects than just GZIP, independent of each other. What does that mean? That you can turn GZIP on or off on each and every transport segment downstream to the end users.
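In other words, the edge can compress (or decompress) on your behalf. This toy sketch models the two segments in plain Python – it illustrates the principle only, and is not a real CDN API:

```python
import gzip

# Toy model of the two transport segments
origin_response = b"<html>" + b"<p>article text</p>" * 2_000 + b"</html>"

# Segment 1: Origin -> CDN, uncompressed (the Origin's CPU stays idle)
edge_copy = origin_response

# Segment 2: CDN -> End User, compressed at the edge
to_user = gzip.compress(edge_copy)

print(f"Origin -> CDN: {len(edge_copy)} bytes plain")
print(f"CDN -> user:   {len(to_user)} bytes gzipped")
```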

Why does that matter?
Let us present you two high-level use cases:

1) Origin with CPU constraints

 

Solution [Spoiler]: Origin -> CDN not compressed, CDN -> End Users compressed

 

Once upon a time, we had a Prospect delivering assets compressed from their Origin alone; the benefit of compressing assets (also to the effect of reducing bandwidth), however, also cost them lots of CPU: CPU levels always hovered around 70-80%, with dangerous spikes and timeouts to end users whenever peak traffic arrived. They were billed heavily for their traffic.
We integrated a CDN for their Business and configured it to serve compressed assets: we consequently saw traffic to the Customer’s Origin drop thanks to the cacheability of assets, and end users benefited a lot from the proximity of the Cloud/CDN’s PoPs; on the other hand, the Origin alone still took 3-4 seconds to respond for uncacheable assets. The Origin was still struggling.
So we turned off compression on the Origin side and, voilà! Response times from the Origin dropped by a good couple of seconds, Origin CPU levels fell to 40-50%, and end users saw even better performance afterwards.

2) Powerful Origin with weak End-User machines

Solution [Spoiler]: Origin -> CDN compressed, CDN -> End Users not compressed

 

Once upon another time, we came across a Customer whose assets couldn’t be compressed toward end users, because they delivered computing-intensive data that end users’ Browsers already struggled with, let alone the added overhead of decompressing assets.
On the other hand, delivering everything plain from their Origin earned them salty bills from their Hosting Provider. Still, they had a powerful Origin Infrastructure and wanted to make more use of it and its constantly low CPU levels (around 10% on average).
What we did was integrate a CDN and let the CDN serve uncompressed data to the End Users; then, alongside the Origin traffic reduction due to caching, we introduced GZIP on their Origin, and as a consequence our Customer’s traffic bills more than halved, whilst their CPUs finally got to work a little more (CPU levels rose to ~30%).
And end users keep on getting speedier, uncompressed content.

We hope this introduction to GZIP and its use cases rang a bell with you, and we look forward to reading your feedback. Tell us about your story!
