Like many others, I have been drawn in by the appeal of websockets and their use in (near) real-time communication. As such, one of my current projects uses Node.js and websockets (via socket.io). To maximize compatibility, I would, of course, like my Node.js site to run on port 80. My server, however, is not used exclusively for this project – it also has traditional PHP/MySQL sites running on it. Which brings me to my problem:
My current setup has Varnish as a caching layer – to cache the dynamic PHP scripts – and Nginx as a webserver. Together, these have excellent performance. My objective in adding Node.js is to have it run behind both of these. Varnish binds to port 80 and provides the publicly accessible interface to all sites on the server – this will allow me to cache dynamically generated content from Node.js as well as the content that is currently cached. Nginx will serve my static content – from some brief tests, it appears to far outperform Node.js in this area. Finally, any requests for dynamic content or websockets will be handled by Node.js.
As is good practice, static content will be served from a separate subdomain, but I would like all remaining content (including the websockets) to be served from the main domain. The rest of this article outlines the configurations I have in place to attain the above.
To recap, the objectives are:
- Have a single public port for both websocket and ‘regular’ data
- Be able to optionally cache some resources using Varnish
- Serve (uncached) static assets directly from nginx (which may then be cached by Varnish)
- Pass requests for ‘web pages’ to nginx, and from there proxy to Node.js
- Pass websocket requests directly (from Varnish) to Node.js (bypassing nginx).
My server stack is:
- Varnish (v3.0.2) – port 80
- Nginx (v1.0.14) – port 81
- Node.js (v0.6.13) – port 1337
- Socket.io (v0.9.2)
- Express (v2.5.8)
- Operating system is Amazon’s Linux (v2011.09)
- Also tested on CentOS (v6.2)
Varnish
Below is an edited version of my /etc/varnish/default.vcl. Some customizations and parts irrelevant to the topic at hand have been edited out.
#define backends and timeouts
backend default {
    .host = "127.0.0.1";
    .port = "81";
    .connect_timeout = 5s;
    .first_byte_timeout = 30s;
    .between_bytes_timeout = 60s;
    .max_connections = 800;
}
backend nodejs {
    .host = "127.0.0.1";
    .port = "1337";
    .connect_timeout = 1s;
    .first_byte_timeout = 2s;
    .between_bytes_timeout = 60s;
    .max_connections = 800;
}
#Removed: ACL for purging
sub vcl_recv {
    set req.backend = default;
    set req.grace = 120s;
    #set the correct IP so my backends don’t log all requests as coming from Varnish
    if (req.restarts == 0) {
        if (req.http.x-forwarded-for) {
            set req.http.X-Forwarded-For =
                req.http.X-Forwarded-For + ", " + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }
    #remove port, so that hostname is normalized
    set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");
    #Removed: code for purging
    #part of Varnish’s default config
    if (req.request != "GET" &&
        req.request != "HEAD" &&
        req.request != "PUT" &&
        req.request != "POST" &&
        req.request != "TRACE" &&
        req.request != "OPTIONS" &&
        req.request != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }
    if (req.request != "GET" && req.request != "HEAD") {
        return (pass);
    }
    #pipe websocket connections directly to Node.js
    if (req.http.Upgrade ~ "(?i)websocket") {
        set req.backend = nodejs;
        return (pipe);
    }
    #do not cache large static files
    if (req.url ~ "\.(avi|flv|mp(e?)g|mp4|mp3|gz|tgz|bz2|tbz|ogg)$") {
        return (pass);
    }
    #general URL manipulation and cookie removal
    #lines 60-109 from https://github.com/mattiasgeniar/varnish-3.0-configuration-templates/blob/d86d6c1d7d3d0ddaf92019dd5ef5ce66c9e53700/default.vcl
    if (req.http.Host ~ "^(www\.)?example.com") {
        #Removed: Redirect for URL normalization using error 701
        # Requests made to this path, relate to websockets - pass does not seem to work (even for XHR polling)
        if (req.url ~ "^/socket.io/") {
            set req.backend = nodejs;
            return (pipe);
        }
    #My other PHP/MySQL sites get included here, each in its own block
    } else if (req.http.Host ~ "^(www\.)?thatsgeeky.com") {
        #...
    }
    # part of Varnish’s default config
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (lookup);
}
sub vcl_pipe {
    #we need to copy the upgrade header
    if (req.http.upgrade) {
        set bereq.http.upgrade = req.http.upgrade;
    }
    #closing the connection is necessary for some applications – I haven’t had any issues with websockets keeping the line below commented out
    #set bereq.http.Connection = "close";
    return (pipe);
}
# sub vcl_pass - unmodified
# sub vcl_hash - mostly unmodified – added hash by content-encoding
# sub vcl_hit - mostly unmodified – added PURGE code
# sub vcl_miss – mostly unmodified – added PURGE code
# sub vcl_fetch - mostly unmodified – added set beresp.grace = 30m; and some site specific additions
# sub vcl_deliver - modify some headers
# sub vcl_error - custom error page and handle redirects for URL normalization
# sub vcl_init - unmodified
# sub vcl_fini – unmodified
Since Nginx does not handle websocket requests (although there is a TCP module that may help with this), we cannot send websocket requests to Nginx – they must go directly to Node.js. As such, we must set up two backend definitions – one for Nginx and one for Node.js. The specific timeout parameters for each backend are a personal preference and are largely arbitrary.
As with most setups, vcl_recv is the function with the most going on. In addition to the standard parts found in the Varnish config, the above looks for websocket connections and will send them directly to Node.js (the code comes directly from the Varnish documentation). It should be mentioned that I do not let Node.js serve the socket.io client. My pages call it from a different location and it is served by Nginx.
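To make the order of the checks in vcl_recv easier to follow, here is a rough model of the same decisions in JavaScript. It is only a sketch for reasoning about the VCL, not something Varnish runs: the function name `route`, the shape of the request object, and the backend labels "nginx" and "nodejs" are my own assumptions, and the purging, redirect and cookie-handling parts that were edited out of the VCL are not modelled.

```javascript
// Hypothetical model of the routing decisions in vcl_recv (not Varnish code).
// Header names are expected in lower case, as Node.js presents them.
const KNOWN_METHODS = ["GET", "HEAD", "PUT", "POST", "TRACE", "OPTIONS", "DELETE"];
const LARGE_FILES = /\.(avi|flv|mp(e?)g|mp4|mp3|gz|tgz|bz2|tbz|ogg)$/;

function route(req) {
  const headers = req.headers || {};
  // Strip the port so the hostname is normalized.
  const host = (headers.host || "").replace(/:[0-9]+/, "");

  // Unknown methods are piped to the default backend.
  if (!KNOWN_METHODS.includes(req.method)) {
    return { backend: "nginx", action: "pipe" };
  }
  // Anything other than GET/HEAD is passed (never cached).
  if (req.method !== "GET" && req.method !== "HEAD") {
    return { backend: "nginx", action: "pass" };
  }
  // Websocket upgrades bypass Nginx entirely.
  if (/websocket/i.test(headers.upgrade || "")) {
    return { backend: "nodejs", action: "pipe" };
  }
  // Large static files are not cached.
  if (LARGE_FILES.test(req.url)) {
    return { backend: "nginx", action: "pass" };
  }
  // socket.io traffic for the Node.js site is piped as well.
  if (/^(www\.)?example\.com/.test(host) && /^\/socket\.io\//.test(req.url)) {
    return { backend: "nodejs", action: "pipe" };
  }
  // Requests with credentials or cookies are not cacheable by default.
  if (headers.authorization || headers.cookie) {
    return { backend: "nginx", action: "pass" };
  }
  return { backend: "nginx", action: "lookup" };
}

console.log(route({ method: "GET", url: "/login", headers: { host: "example.com" } }));
```

One thing the model makes visible: because the method check comes first, a POST (for example from XHR polling) is passed to the default backend before the /socket.io/ rule is ever reached.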
Nginx
The config below is simply the section that I include for the one site – common options (from nginx.conf) are not included.
upstream node_js {
    server 127.0.0.1:1337;
    server 127.0.0.1:1337;
}
server {
    listen *:81;
    server_name example.com www.example.com static.example.com;
    root /var/www/example.com/web;
    error_log /var/log/nginx/example.com/error.log info;
    access_log /var/log/nginx/example.com/access.log timed;
    #removed error page setup
    #home page
    location = / {
        proxy_pass http://node_js;
    }
    #everything else
    location / {
        try_files $uri $uri/ @proxy;
    }
    location @proxy {
        proxy_pass http://node_js;
    }
    #removed some standard settings I use
}
Firstly, with the upstream block, we define our backend servers. You’ll note that I have the same server listed twice. This is because of the way Nginx falls back in the event of a backend failure: with a second entry available, it will give the request another try.
The objective with the above is to serve all files that exist using Nginx, and to proxy all other requests to Node.js. The location = / block is needed because the root directory exists, but I want it handled by Node.js and not Nginx.
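The effect of the three location blocks can be sketched as a small decision function. This is a simplified model, assuming the filesystem is represented by two hypothetical sets (`files` and `dirs`) of paths relative to the site root; the function name `resolve` and the return labels are mine, and the real try_files handling has more detail than is shown here.

```javascript
// Hypothetical model of the Nginx location blocks above (not Nginx code).
// files: Set of existing file paths; dirs: Set of existing directory paths.
function resolve(uri, files, dirs) {
  // location = / : the home page always goes to Node.js,
  // even though the root directory exists on disk.
  if (uri === "/") {
    return "node_js";
  }
  // try_files $uri : an existing file is served by Nginx.
  if (files.has(uri)) {
    return "nginx";
  }
  // try_files $uri/ : an existing directory is also handled by Nginx.
  if (dirs.has(uri)) {
    return "nginx";
  }
  // @proxy : everything else is proxied to the node_js upstream.
  return "node_js";
}

// Example paths are made up for illustration.
const files = new Set(["/css/site.css", "/js/socket.io.min.js"]);
const dirs = new Set(["/css", "/js"]);
console.log(resolve("/css/site.css", files, dirs));
console.log(resolve("/login", files, dirs));
```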
In order to track timings through each layer, I use a modified log command (timed, above, instead of main). Also, the IP addresses are updated so that Node.js doesn’t see all requests as originating from Nginx.
set_real_ip_from 127.0.0.1;
real_ip_header X-Forwarded-For;
log_format timed '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent" $request_time $upstream_response_time $pipe';
port_in_redirect off;
One of the nice things about this setup is that all data passing through Nginx can be compressed – without needing to add modules to Node.js, which will undoubtedly be slower.
Tracking a Request
Just for interest’s sake, it is possible to track a request through the entire server stack – I occasionally do this to track down the cause of any delays. The following request was for a login page of a Node.js site I am working on.
Varnish:
Logging is done with varnishncsa, using the following:
varnishncsa -F "%h %l %u %t \"%m %U %H\" %s %b \"%{Referer}i\" %{X-Varnish}o %{Varnish:time_firstbyte}x"
The logged request is as follows:
xxx.xxx.xxx.xxx - - [26/Mar/2012:12:11:14 -0400] "GET /login HTTP/1.1" 200 601 "-" 1866086403 0.006932020
Matching the XID (1866086403) to output from varnishlog, gives the full request timings:
11 ReqEnd c 1866086403 1332778274.036613464 1332778274.043595552 0.000080347 0.006932020 0.000050068
Looking at this a bit more closely, we see that Varnish took:
- 0.000080347s from the time the request was accepted until processing started
- 0.006932020s from the start of processing to the start of delivering (essentially backend time)
- 0.000050068s from the start of delivery to the end of the request
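If you do this often, the ReqEnd record can be split into named fields with a few lines of JavaScript. The sketch below assumes the field order described above (XID, start time, end time, then the three intervals); the function name `parseReqEnd` and the property names are my own choices.

```javascript
// Parse a varnishlog ReqEnd record into named timing fields.
// Assumed layout: <fd> ReqEnd <c|b> <xid> <start> <end> <wait> <backend> <deliver>
function parseReqEnd(line) {
  const fields = line.trim().split(/\s+/);
  const idx = fields.indexOf("ReqEnd");
  if (idx === -1 || fields.length < idx + 8) {
    throw new Error("not a complete ReqEnd record");
  }
  // Skip the tag itself and the c/b marker that follows it.
  const [xid, start, end, wait, backend, deliver] = fields.slice(idx + 2);
  return {
    xid: xid,                        // kept as a string for matching against varnishncsa
    start: start,                    // epoch timestamps kept as strings to avoid
    end: end,                        // losing digits in floating point
    waitSeconds: Number(wait),       // accepted -> processing started
    backendSeconds: Number(backend), // processing started -> delivery started
    deliverSeconds: Number(deliver)  // delivery started -> request finished
  };
}

const record = "11 ReqEnd c 1866086403 1332778274.036613464 1332778274.043595552 0.000080347 0.006932020 0.000050068";
const timings = parseReqEnd(record);
console.log(timings.xid, timings.backendSeconds);
```

The timestamps are left as strings on purpose: a double cannot hold nanosecond precision at this magnitude, so subtracting them directly would be less accurate than using the intervals Varnish already reports.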
Nginx:
The log is generated using the log format mentioned earlier that I reference with the name ‘timed’:
log_format timed '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '$request_time $upstream_response_time';
The logged request shows up as follows:
xxx.xxx.xxx.xxx - - [26/Mar/2012:12:11:14 -0400] "GET /login HTTP/1.1" 200 613 "-" 0.006 0.006 .
We can see that Nginx took:
- 0.006s ($request_time) in total to process the request
- 0.006s ($upstream_response_time) of the total time was taken to obtain the response from the upstream server.
Node.js:
As part of my application, I have the following line (which sends the output to the file defined in access_logfile):
app.use(express.logger({format: ':req[X-Forwarded-For] - - [:date] ":method :url HTTP/:http-version" :status :res[content-length] - :response-time ms', stream: access_logfile }));
The logged response is:
xxx.xxx.xxx.xxx - - [Mon, 26 Mar 2012 16:11:14 GMT] "GET /login HTTP/1.0" 200 1155 - 4 ms
From the above, it appears that Node.js took:
- 4ms to process and return the request
Obviously the time format used by express.logger is a bit different (and I am recording less information), but the relevant data is present.
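The logger format above writes the X-Forwarded-For header as-is. Since the VCL appends client.ip to any value the client already sent, the header can contain several comma-separated addresses, and the last one is the address Varnish itself saw. A minimal sketch for picking that entry out is below; the function names and the fallback argument are assumptions of mine, and the addresses are documentation-range examples, not real log data.

```javascript
// Split an X-Forwarded-For header into its individual addresses.
function forwardedChain(header) {
  if (!header) {
    return [];
  }
  return header
    .split(",")
    .map(function (part) { return part.trim(); })
    .filter(function (part) { return part.length > 0; });
}

// Return the address appended by the nearest trusted proxy (Varnish here),
// or the fallback (e.g. the socket address) when the header is missing.
function clientIp(header, fallback) {
  const chain = forwardedChain(header);
  return chain.length > 0 ? chain[chain.length - 1] : fallback;
}

console.log(clientIp("198.51.100.2, 203.0.113.7", "127.0.0.1"));
```

Entries before the last one were supplied by the client, so they are useful for debugging but should not be trusted for anything security related.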
An interesting observation here is the change in protocol. The request came in as HTTP/1.1, was sent to Nginx as the same, but Nginx sent the request to Node.js as HTTP/1.0. This is well documented, so not exactly a surprise – but interesting to see regardless.
Another point to note is the changing response size. The original response from Node.js was 1155 bytes. After going through Nginx it came out at 613 bytes (since it was gzipped). Finally, my VCL modifies some of the headers (not shown above) for a final size of 601 bytes.
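The size reduction from gzip is easy to reproduce in isolation with Node's built-in zlib module. This is only an illustration of the effect, assuming a reasonably recent Node.js (the synchronous zlib calls are not available in v0.6); the sample markup is made up and will not compress by the same ratio as the login page above.

```javascript
const zlib = require("zlib");

// Made-up, repetitive markup standing in for a rendered page.
const html = "<ul>" + "<li class=\"item\">Log in to continue</li>".repeat(40) + "</ul>";

const raw = Buffer.from(html, "utf8");
const gzipped = zlib.gzipSync(raw);

// Compare the size before and after compression.
console.log("uncompressed bytes:", raw.length);
console.log("gzipped bytes:", gzipped.length);
```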