Varnish and Pressflow (Drupal) - VCL tweaks for achieving a high hitrate

Submitted by Janak on Thu, 09/30/2010 - 11:58

The default Varnish config for Pressflow by Four Kitchens is an excellent starting point and gets you up and running with relatively little pain and effort. Having done a fair amount of Varnish tweaking for my personal and work websites, I came across a couple of tweaks that resulted in a phenomenal improvement in the Varnish hit rate.

Vary User Agent

By default, Varnish stores a separate copy of a page for every distinct User-Agent string whenever the backend sends a Vary: User-Agent header. Unless the website is designed to behave differently for each user agent, this wastes cache space and results in a very high miss rate. Most sites only distinguish between IE, which gets its own style sheets, and every other browser.

Unless the site has a separate mobile version, Varnish only needs to cache two variants of each page: one for IE and one for every other agent.

Varnish can be configured with something like this to reduce the number of unique cached variants:


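sub vcl_recv {
    # Collapse every User-Agent string into one of two buckets, so that
    # a backend "Vary: User-Agent" yields at most two cached variants.
    if (req.http.user-agent ~ "MSIE") {
        set req.http.user-agent = "MSIE";
    } else {
        set req.http.user-agent = "Mozilla";
    }
}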
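
If the site does serve a separate mobile version, the same idea extends to a third bucket. The following is only a sketch: the "Mobile" bucket name and the list of user agent tokens are assumptions chosen for illustration, and they would need to match whatever the backend actually checks for.

```vcl
sub vcl_recv {
    # Hypothetical token list: adjust it to the devices the site targets.
    if (req.http.User-Agent ~ "(iPhone|iPod|Android|BlackBerry)") {
        # Assumed bucket name, not something the backend knows by default.
        set req.http.User-Agent = "Mobile";
    } elsif (req.http.User-Agent ~ "MSIE") {
        set req.http.User-Agent = "MSIE";
    } else {
        set req.http.User-Agent = "Mozilla";
    }
}
```

Note that the backend receives the rewritten header as well, so any server-side browser detection has to recognise these bucket names rather than the original User-Agent strings.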

Normalizing Vhost namespace

Since a website can be accessed as both http://www.foo.com and http://foo.com, Varnish will cache a separate copy of each page for each of these hostnames, because the Host header is part of the cache hash. Combined with the User-Agent issue above, the number of cached combinations becomes fairly high.

If you need to serve the same web site from multiple URLs, Varnish can be configured like this:


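# Inside vcl_recv: serve www.foo.com and foo.com from the same cache entries.
# The dots are escaped so they match only a literal "."; a port is allowed.
if (req.http.host ~ "^(www\.)?foo\.com(:[0-9]+)?$") {
    set req.http.host = "foo.com";
}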
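
When several domains sit behind the same Varnish instance, listing each one by hand gets tedious. A more generic variant, sketched here on the assumption that every site should be cached under its hostname without a port and without a leading "www.", could use regsub() instead:

```vcl
sub vcl_recv {
    # Only touch the header when the client actually sent one.
    if (req.http.host) {
        # Drop an explicit port such as ":80" or ":8080".
        set req.http.host = regsub(req.http.host, ":[0-9]+$", "");
        # Drop a leading "www." so both spellings share one cache entry.
        set req.http.host = regsub(req.http.host, "^www\.", "");
    }
}
```

This only makes sense if the backend serves identical content for both spellings of every hostname; a site that really does differ between www and non-www would need to be excluded.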

Further Reading