Everyone has an opinion about how you should write your application. Whether the topic is the structure, the language, or the database you choose (or whether you use one at all), plenty of people will tell you what you should do, but not how to do it. So I'm going to focus on the application tier and address several issues regarding application session management, one of the trickiest aspects of high-performance and cloud-deployed applications and something that most developers take for granted.
A Confession about Sessions
It's often easier to use the built-in session management tools in your application server (not a best practice, but I've done it, too), and it will usually keep working even once you have multiple servers behind a load balancer, as long as you enable sticky sessions. But built-in tools typically keep session data in the memory of a single server, which causes problems when that server has to be restarted, such as at release time, or when it fails: the sessions it was holding go with it.
Managing session data and deciding what's kept in memory versus other data stores is something that developers often don't consider until there's a problem with the application. While it would be nice to be able to get rid of session data altogether, realistically it's something that we should still consider and plan for from an infrastructure and configuration management perspective. Keeping in mind the mantra "build it right the first time," here are some good, better, and best options for you to consider.
Good Option: Get your session data out of the same process as your application or web server so that, if it is set up properly, you can restart your web server and retain session data. This is a bit of a niche solution, but some platforms such as .NET allow you to run the State Service separately from the web server (IIS). The main advantage is that you can reload your application or even redeploy without disrupting user sessions, although the data is still kept on the same server. On the .NET platform this is as simple as a small change to your application's web.config file and starting a Windows service. Keep in mind that the State Service requires the objects you store in the session to be serializable.
Better Option: Ship your session data to a database, or take advantage of built-in features for session data storage and replication across your application farm. Some application servers, like JBoss or Tomcat, have built-in mechanisms to handle cross-server replication of session data. These setups are configuration-heavy, but they have a significant advantage: session data no longer lives on only one server. More server technologies are also building in solutions to distribute session data and even cache data.
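To make the database route concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `DatabaseSessionStore` class, the `sessions` table layout, the 30-minute timeout, and the use of `sqlite3` as a stand-in for whatever shared database your servers can all reach.

```python
import json
import sqlite3
import time
import uuid


class DatabaseSessionStore:
    """Hypothetical session store backed by a SQL table (sqlite3 as a stand-in)."""

    def __init__(self, conn, ttl_seconds=1800):
        self.conn = conn
        self.ttl = ttl_seconds  # assumed 30-minute session timeout
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "id TEXT PRIMARY KEY, data TEXT NOT NULL, expires_at REAL NOT NULL)"
        )

    def create(self):
        # Random, hard-to-guess session ID that the client sends back in a cookie.
        session_id = uuid.uuid4().hex
        self.save(session_id, {})
        return session_id

    def save(self, session_id, data):
        # Serialize to JSON and push the expiry forward on every write.
        self.conn.execute(
            "INSERT OR REPLACE INTO sessions (id, data, expires_at) VALUES (?, ?, ?)",
            (session_id, json.dumps(data), time.time() + self.ttl),
        )
        self.conn.commit()

    def load(self, session_id):
        row = self.conn.execute(
            "SELECT data, expires_at FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        # Unknown or expired sessions are treated the same way: no session.
        if row is None or row[1] < time.time():
            return None
        return json.loads(row[0])
```

Because the web server process holds no session state of its own in this sketch, restarting or replacing it does not affect the stored sessions.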
Something to consider is how you'll manage the machines in the cluster so that they are aware of each other. RightScale tags are a great fit for this scenario, and we use this mechanism ourselves at RightScale to manage the auto-attachment process of web servers to our load balancer with the HAProxy ServerTemplate™. RightScale ServerTemplates are built from modular images, scripts, and variable inputs to enable you to dynamically provision your servers at boot time using your chosen configuration and variable inputs.
Best Option: Check out a solution like memcached in a clustered configuration, which is both out-of-process relative to the web server and off-machine, so that any server can serve up the same session. Couchbase also offers a great distributed session toolset. Solutions like these are tuned to manage sessions, or simply to reduce the overhead of storing data, so that your session data is saved and retrieved efficiently. Get it set up right and you could even consider ditching that sticky session configuration on your load balancer.
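The idea that any server can serve the same session can be sketched as follows. This is an illustration only: `InMemoryCacheClient` is a hypothetical stand-in so the example runs without a memcached server, and its `get`/`set`/`delete` methods are an assumed interface, so a real memcached or Couchbase client would need a thin adapter. `CacheSessionStore` and the `session:` key prefix are likewise made up for the example.

```python
import json
import uuid


class InMemoryCacheClient:
    """Hypothetical stand-in for a memcached client (no server required)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


class CacheSessionStore:
    """Keeps session data in a shared cache instead of web server memory."""

    def __init__(self, client, prefix="session:"):
        self.client = client
        self.prefix = prefix  # namespaces session keys inside the cache

    def create(self, data):
        session_id = uuid.uuid4().hex
        self.save(session_id, data)
        return session_id

    def save(self, session_id, data):
        self.client.set(self.prefix + session_id, json.dumps(data))

    def load(self, session_id):
        raw = self.client.get(self.prefix + session_id)
        return None if raw is None else json.loads(raw)

    def destroy(self, session_id):
        self.client.delete(self.prefix + session_id)


# Two "web servers" share one cache, so either can serve the same session.
shared_cache = InMemoryCacheClient()
web_server_a = CacheSessionStore(shared_cache)
web_server_b = CacheSessionStore(shared_cache)

session_id = web_server_a.create({"user": "alice", "cart": [42]})
print(web_server_b.load(session_id))
```

Since neither store object holds any state beyond the shared client, a request can land on either server, which is what makes sticky sessions optional.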
A Few More Tips on Web Application Session Management
Keep in mind that when you start shipping your session data to a centralized server, you'll need to make sure that your data is serialized and de-serialized properly. So if you're migrating from another way of managing sessions, don't forget to test that the change doesn't break your application. The most common serialization issues are the time it takes to serialize large session objects and the problems that can occur when serializing binary objects. Those are two great reasons to use application session objects only when absolutely necessary.
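A quick way to catch these issues before a migration is a round-trip check. The sketch below assumes JSON as the serialization format; the `check_session_serialization` helper and the 4096-byte size threshold are arbitrary choices for illustration, not recommendations.

```python
import json
import time


def check_session_serialization(session_data, max_bytes=4096):
    """Round-trip session data through JSON and report size and timing."""
    start = time.perf_counter()
    try:
        encoded = json.dumps(session_data).encode("utf-8")
    except (TypeError, ValueError) as exc:
        # Raised for values JSON cannot represent, such as raw bytes.
        return {"serializable": False, "error": str(exc)}
    decoded = json.loads(encoded.decode("utf-8"))
    elapsed = time.perf_counter() - start
    return {
        "serializable": True,
        # False when the data changes shape, e.g. tuples come back as lists.
        "round_trip_ok": decoded == session_data,
        "size_bytes": len(encoded),
        "too_large": len(encoded) > max_bytes,  # assumed threshold
        "seconds": elapsed,
    }


print(check_session_serialization({"user": "alice", "cart": [1, 2]}))
print(check_session_serialization({"avatar": b"\x89PNG"}))
```

Running a check like this against representative session contents flags binary values and oversized objects early, before they show up as slow or failing requests.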
The devil's always in the details, so automating the work of standing up these systems, and building your infrastructure on a set of proven assets, is a huge advantage that will save you tons of time over the long run. RightScale offers a number of built-in tools, scripts, and Chef recipes to set up these types of configurations, and they can serve as a great starting point, with ServerTemplates that have been tested by both RightScale and the entire community.