While a significant portion of the content of this site is geared towards optimizations and efficiency, I think that summing up a few of those ideas in a single article can be helpful, and will follow nicely from the previous article on Website Optimizations.
The last article focused mainly on content rather than server setup; this article focuses more on the latter.
Before we get started though, it is important to address the question of why bother to improve site efficiency. Firstly, a more efficient site is more likely to keep visitors – patience is a virtue that is somewhat lacking. A third of people won’t wait more than 4s for a page to load, and every fraction of a second increase in page load time can have an impact on bounce rate. Furthermore, an inefficient site taxes a server – which means that the server cannot handle as many requests – and the server consumes more energy as well. Finally, inefficient sites require more bandwidth – which of course, increases the cost of running the site. An efficient site is a win for everyone – your visitors, the environment, and your pocketbook.
In order to load a website:
- Each domain must be resolved (DNS Lookup)
- A connection must be established to the server (Ping)
- The server must process the request (PHP/MySQL)
- The server must respond (headers, first byte of data)
- The client must receive the data (Downloading)
In many cases, we tend to consider only the size of a page, but in reality, this accounts for only a very small portion of the load time. We will briefly look at each point above and consider what can be done to reduce the time spent in each case. As always, this is by no means comprehensive, but is meant to provide more of a primer to the topic.
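To make the steps above a little more concrete, here is a rough sketch in Python that times each phase of a single plain HTTP request. The function name time_request and the keys it returns are my own choices for illustration, and the numbers it produces are only approximations (a real browser does much more, and HTTPS adds a further handshake).

```python
import socket
import time


def time_request(host, port=80, path="/"):
    """Return rough timings (in seconds) for each phase of one HTTP request."""
    timings = {}

    start = time.perf_counter()
    # DNS lookup: resolve the host name to an address.
    family, socktype, proto, _, address = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    timings["dns"] = time.perf_counter() - start

    start = time.perf_counter()
    # Connection: the TCP handshake costs about one round trip.
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(10)
    sock.connect(address)
    timings["connect"] = time.perf_counter() - start

    with sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        start = time.perf_counter()
        sock.sendall(request.encode("ascii"))
        # Server processing and response: wait for the first byte.
        first = sock.recv(1)
        timings["first_byte"] = time.perf_counter() - start

        start = time.perf_counter()
        # Download: read until the server closes the connection.
        size = len(first)
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            size += len(chunk)
        timings["download"] = time.perf_counter() - start

    timings["bytes"] = size
    return timings
```

Calling time_request("example.com") returns a dictionary with the time spent in each phase; on a small page, the download is frequently not the largest entry.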
Resolving the domain
Servers are addressed on a network with numeric – IP – addresses, as opposed to the ‘friendly’ text based domain names we typically use. For each domain name a browser encounters, it must first perform a DNS lookup in order to resolve that domain name into an IP address. While, in theory, the use of IP addresses instead of domain names will cut down on the time needed for this step, IP addresses tend to be ‘meaningless’ to humans (i.e. by looking at it, one doesn’t know where they are being sent) and in most shared hosting environments, a domain name (passing the Host header) is necessary to reach the correct site.
Domain names are resolved from right (the top level domain (TLD) – e.g. .com, .net, etc.) to left (the domain, subdomain, etc). DNS resolution will begin with a root server, which will provide a list of DNS servers for the specified TLD, any of which will point to a nameserver for the domain name. Subdomains may be listed on the same nameserver as the parent domain or on their own nameserver.
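The right-to-left order can be illustrated with a few lines of Python. This is only a sketch: delegation_chain is a hypothetical helper, and it assumes a separate delegation at every label, which (as noted above for subdomains) is not always the case.

```python
def delegation_chain(name):
    """List the zones consulted, in order, when resolving a domain name."""
    labels = name.rstrip(".").split(".")
    chain = ["."]  # the root zone comes first
    # Walk the labels from right (TLD) to left (subdomain).
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(delegation_chain("www.example.com"))
# ['.', 'com.', 'example.com.', 'www.example.com.']
```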
SimpleDNS’ Trace DNS Delegation is helpful in seeing this process in action.
As with all network based services, proximity improves response time – therefore, if the nameservers are located closer to the visitor’s computer, the time needed for the DNS lookup will be lower.
Most operating systems cache DNS lookups, so a repeated lookup can be quickly done locally instead of requiring a fresh lookup.
Under Windows, you can view your DNS cache by running ipconfig /displaydns from a command prompt.
You can empty (flush) your DNS cache by running ipconfig /flushdns.
If a site loads resources from multiple domains (including subdomains), a DNS lookup must be performed for each new domain, hence there is a cost to loading resources from multiple domains – however, there is also something to be gained.
As most small sites do not have a distributed server network at their disposal, the best bet for minimizing DNS lookup times is probably to go with an external service – most registrars (e.g. GoDaddy) provide nameservers, and other services such as the newly released Amazon Route 53 are worth looking at.
Keep in mind though, that it is only the first visit that will require the DNS lookup, as the visitor’s computer should cache the DNS request. Moreover, loading a resource that is common to other sites (e.g. the Google Analytics script), should not require an additional DNS lookup.
Finally, there are some instances where a server will need to perform DNS lookups. In such cases, it might have merit to install a caching DNS server (such as dnscache, the caching resolver from the djbdns suite that also includes TinyDNS), so that repeated requests for the same domain name can be processed much faster.
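The idea behind a caching resolver can be sketched within an application as well. The class below is a hypothetical illustration, not a replacement for a real caching DNS server: it assumes a fixed lifetime (ttl) for every entry, whereas a real resolver honours the TTL supplied with each DNS record.

```python
import socket
import time


class CachingResolver:
    """Remember lookups for ttl seconds (a fixed TTL is a simplification)."""

    def __init__(self, ttl=300, resolve=socket.gethostbyname, clock=time.monotonic):
        self.ttl = ttl
        self.resolve = resolve
        self.clock = clock
        self._cache = {}  # name -> (address, expiry time)

    def lookup(self, name):
        now = self.clock()
        entry = self._cache.get(name)
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh: no network traffic
        address = self.resolve(name)
        self._cache[name] = (address, now + self.ttl)
        return address
```

Only the first lookup of a name within the ttl window reaches the network; later calls are answered from memory.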
Establishing a Connection with the Server
For every request, data must pass between the server and the visitor’s computer. The minimum round trip time can be estimated by running a ping (of course, some firewalls are set up to block ICMP requests and will not respond to ping). The only way to really decrease the time this step takes is to move the server closer to the visitor – the general suggestion is that the server should at least be on the same continent.
One way to provide the needed proximity to the visitor is to use a content delivery network – a CDN provides a distributed network of servers from which content is served. In a CDN such as CloudFront, content is pulled from an origin server (e.g. S3 or another server) and then cached on the remote server from which it is served. The potential downside to using a CDN is that changes to content on the origin server are not always live immediately, since the remote servers keep serving their cached copy until it expires or is invalidated.
The Server’s Response
Once we have reached the server, the server must respond with headers and content. For static content, this is determined largely by the efficiency of the server. Servers which are designed to serve both static and dynamic content often load many modules that are unneeded for serving static content. Switching to a lightweight server for the purposes of serving static content can greatly decrease the time spent serving such content. Common examples of lightweight servers include nginx and lighttpd.
An easy test of the load time is to run ApacheBench (ab) from the server hosting the content (i.e. to minimize other variables) – nginx serves a cached copy of the index page of this site in a mean time of around 10ms, compared to Apache, which takes around 100ms to serve the same (cached) page. For the sake of comparison, generating the page from scratch takes Apache upwards of 250ms.
Now, most content (images, CSS, JS, etc) is static (and, if not, should be made static); typically, however, some content starts off being dynamic. In order to generate dynamic content, there is generally some code that must execute (e.g. PHP), quite possibly a few files that need to be included, and almost certainly a few database requests (MySQL). Each of these, particularly the database requests, contributes to the total processing time for the page. While the ideal scenario would be to minimize the processing time, the next best scenario is to cache the page once it is generated. By generating static content, the overhead of processing the page and all associated database requests is eliminated. This means that your server can serve far more requests, and users have a more responsive experience.
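As a minimal sketch of this kind of caching (in Python rather than PHP, and with hypothetical names throughout), the function below renders a page only when no static copy exists yet. It assumes that slug is a safe file name and that the cached file is deleted whenever the underlying content changes.

```python
from pathlib import Path


def cached_page(cache_dir, slug, render):
    """Return the page for slug, rendering it only if no static copy exists."""
    path = Path(cache_dir) / f"{slug}.html"
    if path.exists():
        return path.read_text(encoding="utf-8")  # no code or database work
    html = render(slug)  # the expensive part: includes, queries, templates
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    return html
```

Because the cached copy is an ordinary file, a lightweight server can also be pointed at the cache directory and serve it without running any code at all.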
Receiving the data
Since this stage depends largely on the size of the data, send the minimum amount of data possible. Gzipping content will go a long way towards achieving this. Optimizing images (choosing the best image format, using services such as Smush.it, avoiding resizing images in HTML, using thumbnails) and, in some cases, even using AJAX to load only the new portion of the page instead of the entire page will also contribute to minimizing the time spent receiving data.
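To get a feel for how much gzip can save on text, the following sketch compresses some made-up, repetitive markup with Python’s gzip module. The helper gzip_ratio and the sample markup are purely illustrative; real pages are less repetitive and will not compress quite as well.

```python
import gzip


def gzip_ratio(data):
    """Return compressed size divided by original size for a bytes object."""
    return len(gzip.compress(data)) / len(data)


# Hypothetical markup: repetitive, like a long list in real HTML.
page = b'<li class="item"><a href="/post">A post title</a></li>\n' * 200
print(f"{len(page)} bytes, gzip ratio {gzip_ratio(page):.3f}")
```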
A quick point about images:
- GIFs tend to be preferable for very small images with few colours
- PNGs are usually the best choice for most images (especially graphics)
- JPGs are best suited for images with many colours (e.g. photos)
Furthermore, working with a clean graphic (with minimal artifacts) will usually lead to the best compression and highest quality.
For text files (HTML, JS, CSS), removing extraneous content (e.g. comments, excess white space, etc) can have a significant impact on the total size. While this is a tedious task if done manually, there exist tools that are quite helpful in accomplishing this:
- CSS: YUI Compressor, CSS Tidy
- HTML: HTML Tidy
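To show what such tools do at their simplest, here is a deliberately naive CSS minifier. It is only a sketch: it assumes the stylesheet contains no strings or url() values with comment markers, braces or semicolons inside them, which the dedicated tools above handle properly.

```python
import re


def minify_css(css):
    """Naive CSS minifier: strips comments and collapses whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # remove comments
    css = re.sub(r"\s+", " ", css)  # collapse runs of whitespace
    css = re.sub(r"\s*([{};,])\s*", r"\1", css)  # trim around punctuation
    css = re.sub(r":\s+", ":", css)  # trim after the colon in declarations
    css = css.replace(";}", "}")  # the last semicolon in a block is optional
    return css.strip()


source = """
/* header */
body {
    margin: 0;
    color: #333;
}
"""
print(minify_css(source))
# body{margin:0;color:#333}
```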
With respect to CSS and JS, many times we reference an entire library when we only need a single JS function or a few CSS selectors. Cutting out the unneeded code can go a long way to reducing the file size and keeping things easier to understand and improve in the future. For CSS, there is a tool available to help with this task – Dust Me Selectors. Additionally, Google’s Page Speed will output similar information on unused CSS selectors.
Of course, it is always best to not have to receive content at all – correctly setting Cache-Control and Expires headers will go a long way towards limiting the number of resources that must be re-downloaded for repeat visitors. In this case, the browser will either not check for an updated copy of the file (faster, but the visitor will not always be viewing the ‘current’ version of the site, which might be acceptable for some static content but is not always ideal), or will check to see if the content has been updated (it should receive a 304 Not Modified response). In the latter scenario, the total time is again dependent on the server – a highly optimized server will be able to respond in a time approaching the round trip time.
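The second scenario (checking whether the content has changed) can be sketched as follows. The function respond and its parameters are hypothetical, and using a content hash as the ETag is just one possible choice; the point is only that a matching If-None-Match header lets the server reply with a 304 and no body.

```python
import hashlib


def respond(body, request_headers, max_age=3600):
    """Return (status, headers, body) for a cacheable static resource."""
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"ETag": etag, "Cache-Control": f"public, max-age={max_age}"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # client copy is current: send no body
    return 200, headers, body


status, headers, _ = respond(b"body { margin: 0 }", {})
repeat_status, _, repeat_body = respond(
    b"body { margin: 0 }", {"If-None-Match": headers["ETag"]}
)
print(status, repeat_status, len(repeat_body))
# 200 304 0
```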
The above ideas, for the most part, consider things as if a browser requests a single file at a time – however, this is not the case. All major browsers will request between 2 and 8 files from the same domain simultaneously. Requests beyond the browser’s limit are queued, and must wait for the previous request to finish processing before they are processed. One way of taking advantage of this concurrency is to serve content from different domains/subdomains – in this way, the browser will load more files simultaneously.
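A back-of-the-envelope way to see the effect of the limit is to count how many ‘rounds’ of requests are needed. The sketch below assumes, unrealistically, that every file takes the same time to load and that files are spread evenly across hosts; the default limit of 6 is simply a value I picked from within the range above.

```python
import math


def request_rounds(num_files, per_host_limit=6, hosts=1):
    """Estimate how many sequential batches are needed to fetch num_files."""
    return math.ceil(num_files / (per_host_limit * hosts))


print(request_rounds(24))           # 4 rounds from a single domain
print(request_rounds(24, hosts=2))  # 2 rounds when split over two domains
```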
An even better idea is to reduce the number of requests. Combining two files into a single file allows compression (Gzip) to be more efficient, reduces the time needed for DNS lookups, processing, and round trips, and frees up a place in the browser’s queue for another file to be processed.
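The compression benefit of combining files is easy to check. The sketch below uses two made-up stylesheets (the names and rules are hypothetical) and compares the total size when each is gzipped separately with the size when they are gzipped as one file.

```python
import gzip


def gzip_sizes(files):
    """Return (total size gzipped separately, size gzipped as one file)."""
    separate = sum(len(gzip.compress(f)) for f in files)
    combined = len(gzip.compress(b"".join(files)))
    return separate, combined


# Two hypothetical stylesheets with similar rules.
menu = b"".join(b".menu-%d { color: #333; margin: 0; }\n" % i for i in range(50))
footer = b"".join(b".footer-%d { color: #333; margin: 0; }\n" % i for i in range(50))
separate, combined = gzip_sizes([menu, footer])
print(separate, combined)
```

The combined figure comes out smaller, both because the gzip header and trailer are only paid once and because patterns found in the first file can be reused for the second.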
Ideally, common libraries (unmodified) would be served from a common location (e.g. Google’s AJAX Libraries). The advantage here is that these libraries are used by many sites, so there is a good chance that the visitor already has the file cached on their computer.
Other JS should be combined into a single file and hosted on a CDN, and the same goes for CSS. Unless CSS/JS is specific to a given page, or there is very little of it, externalize it so that it can be cached. Remember to load external CSS before external JS, as browsers often block other requests until JS files have completely downloaded.
Combine images together using CSS sprites to reduce the number of requests and improve the compression used by the image container – the SpriteMe script provides a good starting point for this process.
On the topic of avoiding unnecessary requests, try to minimize the number of redirects you have on a page (WhereGoes is an interesting tool for checking redirects), and avoid loading files that will return a 404 (page not found) error; missing favicons are especially common in this area.
Google’s release of mod_pagespeed for Apache promises to help with some of the above optimizations, but keep in mind that it does not handle concepts such as CSS sprites, DNS, and CDNs, and despite its optimizations (including caching), a lightweight server will still far outperform Apache for static content.
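Once the images have been combined, each one is shown by shifting the sprite with background-position. As a rough sketch, the hypothetical helper below generates those CSS rules, assuming the icons sit side by side in a single row and share the same height.

```python
def sprite_rules(sprite_url, icons, height):
    """Build CSS rules for icons laid side by side in one sprite image.

    icons is a list of (class_name, width) pairs in left-to-right order.
    """
    rules = []
    x = 0  # horizontal offset of the current icon within the sprite
    for name, width in icons:
        rules.append(
            f".{name} {{ background: url({sprite_url}) -{x}px 0 no-repeat; "
            f"width: {width}px; height: {height}px; }}"
        )
        x += width
    return "\n".join(rules)


print(sprite_rules("sprite.png", [("icon-home", 16), ("icon-mail", 24)], 16))
```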
As with most optimizations, Firebug, PageSpeed, and YSlow make excellent starting points, and remember that there is always room for improvement.