Caching Made Easy Part 2: Pass The Stack

Part I of our Memcached Series explored the concept of using memcached to cache entire pages. In Part II we further increase the responsiveness of your application by finding and removing unnecessary steps. By caching in memcached we are already bypassing an unholy number of database queries and foreach() loops, not to mention all the application logic needed to decide whether a user can view the page. If we skip the application stack entirely for anonymous page requests, we can additionally bypass the fastcgi handoff, the forking of fastcgi processes, the bootstrap of your application, and the initial database connection (if not part of bootstrapping).

For this, we push nginx to the foreground to front-end the PHP/apache2 stack! There are memcached modules for apache2, but they work more in the fashion of a reverse proxy, which differs from the warm-cache, serve-only setup we covered in Part I. To provide our webservers with the comfort and support they deserve, we use the nginx memcached module and the prefix/keyspace you designed before.

Below we configure a basic nginx catchall server which forwards requests back to apache for you. This is essentially a reverse proxy without request caching: nginx only serves what your application has already stored in memcached and hands everything else to apache. The plus side to this method is that one configuration covers all of your application stacks, since you are using a safe key naming scheme and $host is passed back to the apache stack.

Our sample nginx configuration with inline comments:

server {
   listen   [::]:80;
   access_log off;
   error_log  /dev/null crit;
   server_name yourwebasset.com;
 
   # only needed if serving static content
   root /var/www/htdocs;
 
   location / {
      # Go ahead and serve static content.
      # If you don't want to serve static content,
      # then change location @fastpath to /
      # and remove this location declaration.
      try_files $uri @fastpath;
   }
 
   #Magic lies within
   location @fastpath {
 
      # If memcached can't find an entry, the module returns a 404.
      # We intercept 404 pages and pass them along to
      # our predefined location.  The location has access
      # to all the same variables as before and doesn't know
      # that a prior 404 was generated other than our debug
      # variable.
      error_page 404 = @appstack;
 
      # I like to use these for debugging. This is the default
      # reason for passing on to @appstack.  Variables are
      # preserved through location passing
      set $reason "MISS";
 
      # This is only passed on to the client if we actually get a hit,
      # otherwise the add_header is thrown away when we
      # pass processing onto @appstack
      # Remember the cache control headers from Part I? The Expires
      # date in the past and must-revalidate keep browsers from
      # caching the page themselves.
      add_header X-Cache-debug "HIT";
      add_header Expires "Fri, 01 Jun 1979 12:01:00 GMT";
      add_header Cache-Control "must-revalidate, post-check=0, pre-check=0";
 
      # Check to see if someone passed in ?nocache=1
      # This is not required
      # but good for testing your app
      if ( $args ~* "nocache=1" ) {
          set $reason "nocache override";
          return 404;
      }
 
      # Only GET/HEAD requests are safe
      if ( $request_method !~ ^(GET|HEAD)$) {
          set $reason "method not safe";
          return 404;
      }
 
      # From Part I we set a unique authentication
      # token in the cookies. Here we check and see if
      # this token is set and, if so, skip memcache for
      # authenticated users.
      if ( $http_cookie ~* "AUTH_UID" ) {
          set $reason "authenticated";
          return 404;
      }
 
      # If you want to cache URL arguments, use $request_uri in the key
      # below; otherwise, use $uri. $request_uri preserves the arguments
      # in the key, so it results in more cache misses but can cache
      # potentially more complex pages.
      # This conditional is needed when using $uri; comment it out
      # if you switch to $request_uri.
      if ( $args ) {
          set $reason "arguments";
          return 404;
      }
      set $memcached_key prefix-$scheme://$host$uri;
      memcached_pass 127.0.0.1:11211;
 
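      # Since we can't store any complex data, we blindly return the value.
      # Make sure you don't gzip any data that you store in memcached,
      # and turn off client-side compression too: some memcached client
      # libraries compress stored values (with fastlz, for example), and
      # nginx would serve those bytes as-is.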
      default_type text/html;
      charset utf-8;
   }
 
 
   location ~ \.php$ {
      # protect your php files (or any files)
      error_page 592 = @appstack;
      return 592;
   }
 
   # if you are using fastcgi, you can use that 
   # here instead of the proxy_pass items
   location @appstack {
 
      proxy_pass  http://127.0.0.1:8080;
      proxy_redirect off;
      proxy_set_header   Host             $host;
      proxy_set_header   X-Real-IP        $remote_addr;
      proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
      proxy_max_temp_file_size 0;
 
      client_max_body_size       10m;
      client_body_buffer_size    128k;
      proxy_connect_timeout      90;
      proxy_send_timeout         90;
      proxy_read_timeout         90;
 
      proxy_buffer_size          4k;
      proxy_buffers              4 32k;
      proxy_busy_buffers_size    64k;
      proxy_temp_file_write_size 64k;
 
      add_header X-Cache-debug $reason;
   }
}
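
For the fast path to ever produce a hit, the application has to store each anonymous page under exactly the key nginx looks up, and only for requests nginx would actually serve from memcached. The sketch below mirrors the key format and the bypass checks from location @fastpath, including the optional arguments check that goes with $uri keys. It is written in Python purely for illustration (the stack in this series is PHP), and the names cache_key, bypass_reason, KEY_PREFIX and AUTH_COOKIE are hypothetical. It assumes the path is already in the normalized form nginx exposes as $uri; paths containing spaces or other special characters may need extra escaping that is not handled here.

```python
from typing import Optional

KEY_PREFIX = "prefix"      # assumption: same literal prefix as in $memcached_key
AUTH_COOKIE = "AUTH_UID"   # cookie name checked in location @fastpath


def cache_key(scheme: str, host: str, path: str) -> str:
    """Build the key nginx looks up: prefix-$scheme://$host$uri."""
    return f"{KEY_PREFIX}-{scheme}://{host}{path}"


def bypass_reason(method: str, query_string: str, cookie_header: str) -> Optional[str]:
    """Return why a request skips the cache, or None if it is cacheable.

    Follows the order of the `if` checks in location @fastpath.
    """
    # nginx's ~* operator is a case-insensitive substring match here
    if "nocache=1" in query_string.lower():
        return "nocache override"
    # the method check in the config is case-sensitive
    if method not in ("GET", "HEAD"):
        return "method not safe"
    if AUTH_COOKIE.lower() in cookie_header.lower():
        return "authenticated"
    # nginx also treats the literal value "0" as false; ignored in this sketch
    if query_string:
        return "arguments"
    return None


# Usage: only store a rendered page when nginx would be able to serve it.
if bypass_reason("GET", "", "") is None:
    key = cache_key("http", "yourwebasset.com", "/about")
    # store the raw, uncompressed HTML under `key` with your memcached client
```

Keeping the store-side rules identical to the nginx rules matters: a page stored for a request nginx will never look up only wastes memory, while a page stored under a slightly different key is never found.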

If you aren't already using nginx+fastcgi, you'll need to configure your application to use the X-Forwarded-For and X-Real-IP headers instead of the standard IP detection. Because nginx proxies each request from the client back to apache, every connection apache sees originates from 127.0.0.1. While undesirable, this is easily corrected by reading the client address from the right headers.
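
As a rough sketch of that header handling, the function below picks the client address from X-Real-IP, falling back to the last X-Forwarded-For entry, and only when the connection really came from the local proxy. It is Python for illustration only; client_ip and TRUSTED_PROXIES are hypothetical names, the headers are assumed to arrive as a plain dict keyed by these exact names, and it assumes nginx runs on the same host as apache, as in the configuration above.

```python
import ipaddress

# assumption: nginx proxies from the same machine, so only loopback is trusted
TRUSTED_PROXIES = {"127.0.0.1", "::1"}


def client_ip(remote_addr: str, headers: dict) -> str:
    """Return the best guess at the real client address."""
    # Never trust forwarded headers on connections that bypassed the proxy.
    if remote_addr not in TRUSTED_PROXIES:
        return remote_addr

    candidates = [headers.get("X-Real-IP", "")]
    # $proxy_add_x_forwarded_for appends $remote_addr to any list the client
    # sent, so the last entry is the address nginx itself saw.
    forwarded = headers.get("X-Forwarded-For", "")
    if forwarded:
        candidates.append(forwarded.split(",")[-1])

    for candidate in candidates:
        candidate = candidate.strip()
        try:
            ipaddress.ip_address(candidate)  # reject empty or malformed values
        except ValueError:
            continue
        return candidate
    return remote_addr


print(client_ip("127.0.0.1", {"X-Real-IP": "203.0.113.7"}))
```

Earlier entries in X-Forwarded-For are supplied by the client and can be forged, which is why this sketch ignores them.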

To recap, in this series (Part I and Part II) we have gone over how to cache anonymous pages with memcached and nginx. This paradigm is best implemented when the majority of site traffic is made up of anonymous users, with ads or other personalized content served via AJAX. Sites where most traffic comes from authenticated users can still employ many of these concepts, but they require more detailed logic on when to serve a cached page, which page elements to load via AJAX, and so on.

Keep a look out for more entries like this from myself and other SoftLayer developers in the coming months. Want to see something specific? Leave a comment below as to what you’d like to see from a SoftLayer developer’s perspective and we’ll do our best to deliver!
