Session management is one of the key topics that every serious web developer and architect needs to master. This article goes through several of them:
- Persistent vs non-persistent web connections – web performance!
- Concerns of using cookies – security and size limitations
- Server-side session management challenges in scalable web applications
- Achieving linear scalability through stateless servers - start moving the session to the client
Today, I will walk through all these topics at a high level. A series of follow-up articles will develop each topic further if necessary. Let's start!
Persistent vs non-persistent web connections
- HTTP is a stateless protocol. Before HTTP 1.1, it also did not keep connections open by default: each request made by a Web browser, for an image, an HTML page, or other Web object, was made over a new connection.
- HTTP 1.1 made persistent connections (i.e. keep-alive) the default, so a Web browser can establish a single connection through which multiple requests can be made.
- A persistent connection helps performance, but the HTTP requests themselves remain stateless. So how can state be maintained across requests?
- Normally, we keep the session on the server side and give the client a session id, which it sends back to link subsequent requests to the same session.
- Normally, the client (most often a web browser) stores the session id in a cookie.
- However, if cookies are disabled, the session id is normally embedded in the URL (i.e. URL rewriting).
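The scheme in the list above can be sketched roughly as follows. This is a minimal illustration and not how any particular servlet container implements it: the class SessionStore, the cookie name SID and the ";sid=" path parameter are all hypothetical names chosen for the example.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory session store; every name here is illustrative.
class SessionStore {
    private static final SecureRandom RANDOM = new SecureRandom();
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Creates an empty server-side session and returns its random id.
    String create() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        String id = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    // Returns the session attributes for this id, or null if the id is unknown.
    Map<String, Object> find(String id) {
        return id == null ? null : sessions.get(id);
    }

    // Extracts the session id from a raw Cookie header such as "theme=dark; SID=abc".
    static String idFromCookieHeader(String cookieHeader) {
        if (cookieHeader == null) {
            return null;
        }
        for (String pair : cookieHeader.split(";")) {
            String[] kv = pair.trim().split("=", 2);
            if (kv.length == 2 && kv[0].equals("SID")) {
                return kv[1];
            }
        }
        return null;
    }

    // URL rewriting fallback: embed the id in the path, before any query string.
    static String rewriteUrl(String url, String id) {
        int q = url.indexOf('?');
        return q < 0
                ? url + ";sid=" + id
                : url.substring(0, q) + ";sid=" + id + url.substring(q);
    }

    public static void main(String[] args) {
        SessionStore store = new SessionStore();
        String id = store.create();
        store.find(id).put("user", "alice");

        // Cookie path: the browser sends the id back in the Cookie header.
        String fromCookie = idFromCookieHeader("theme=dark; SID=" + id);
        System.out.println(store.find(fromCookie).get("user"));

        // Cookie-less path: the id travels inside every link instead.
        System.out.println(rewriteUrl("/cart?item=1", id));
    }
}
```

Either way, the client only ever holds the id; the attributes themselves stay in the server's map.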
Concerns of using cookies
What do we need to pay attention to when we store info in a cookie?
- Size limitation and security concerns.
- How long can a cookie last? By default, it expires when the browser exits. In Java, you can call cookie.setMaxAge(int) with a large number of seconds if you want the info to stay in the cookie for a long time. Calling setMaxAge(0) deletes the cookie.
- Normally, we don't keep all state info in the cookie: the information could be sensitive, and we cannot protect it because it sits in the client's filesystem. Apart from that, cookies are limited in size. Because of these two concerns, we normally store just the session id in the cookie and keep the session on the server side. This approach saves bandwidth as well.
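The three lifetime cases above can be tried without a servlet container. The sketch below uses java.net.HttpCookie from the standard library as a stand-in for the servlet Cookie class, which is an assumption made only to keep the example self-contained. Its setMaxAge takes a long rather than an int, but the value is likewise a number of seconds. The class CookieLifetimeDemo and the cookie names are hypothetical.

```java
import java.net.HttpCookie;

// Stand-in demo using java.net.HttpCookie; class and cookie names are illustrative.
class CookieLifetimeDemo {

    // Session cookie: max age is left at its default of -1, so the browser
    // drops the cookie when it exits. Only the session id is stored in it.
    static HttpCookie sessionCookie(String sessionId) {
        HttpCookie cookie = new HttpCookie("SID", sessionId);
        cookie.setHttpOnly(true); // hide the id from page scripts
        cookie.setSecure(true);   // only send it over HTTPS
        return cookie;
    }

    // Persistent cookie: survives browser restarts for the given number of seconds.
    static HttpCookie persistent(String name, String value, long seconds) {
        HttpCookie cookie = new HttpCookie(name, value);
        cookie.setMaxAge(seconds);
        return cookie;
    }

    // A max age of 0 tells the browser to delete the cookie.
    static HttpCookie expire(String name) {
        HttpCookie cookie = new HttpCookie(name, "");
        cookie.setMaxAge(0);
        return cookie;
    }

    public static void main(String[] args) {
        long thirtyDays = 30L * 24 * 60 * 60;
        System.out.println(sessionCookie("abc123").getMaxAge());
        System.out.println(persistent("lang", "en", thirtyDays).getMaxAge());
        System.out.println(expire("lang").hasExpired());
    }
}
```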
Server side session management challenges
At first glance, keeping the session on the server side sounds like a great solution. However, it raises concerns as soon as the system needs to scale. Imagine you need to replicate client session state across multiple servers to achieve high availability. Both the replication time and the memory limit of each server will prevent your system from scaling linearly. To solve or minimize this, we can be selective about what info we store in the session, use sticky sessions to avoid replicating every session to every machine, or even store the state on the client where possible, for example with a rich client UI (e.g. Flex or Silverlight). A post will be written about this topic later on.
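To get a feel for the replication cost, one can measure how many bytes a session turns into when serialized, since that payload has to be copied to every replica and held in its memory. The sketch below is only a rough illustration using plain Java serialization. The class ReplicationCost and the attribute names are hypothetical, and real containers may use a different wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

// Illustrative helper: estimates the payload a server would ship per session replica.
class ReplicationCost {

    // Serializes the session attributes and returns the size in bytes.
    static int serializedSize(HashMap<String, Serializable> attributes) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(attributes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        // A lean session: just enough to identify the user.
        HashMap<String, Serializable> lean = new HashMap<>();
        lean.put("userId", "42");

        // A heavy session: the same data plus a cached 100 KB blob.
        HashMap<String, Serializable> heavy = new HashMap<>(lean);
        heavy.put("cachedReport", new byte[100_000]);

        System.out.println("lean session bytes:  " + serializedSize(lean));
        System.out.println("heavy session bytes: " + serializedSize(heavy));
    }
}
```

Multiplying the heavy figure by the number of active sessions and the number of replicas shows why being selective about session content matters.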
Transient vs Persistent State
- A session on the server can time out (typically after ~30 minutes of inactivity).
- A session on the server can be persisted to a file across Tomcat restarts.
- Persistent state should be stored in a database.
- Objects put in the session should be Serializable.
- Avoid putting too much info in the session, because it all becomes baggage during session replication. One server crashing from memory depletion can spread to other servers via session replication. Not good! Should we reconsider storing the session on the client? This article talks about it.
- Session replication is needed to support failover. Sticky sessions are simpler, but the session data is lost when the box goes down. We can designate one or two servers as backups to avoid losing the session. To use the sticky session approach, we need to identify the "sticky" part: what can we use to link separate requests to the same server? Using the IP address can potentially overload a box, because some Internet service providers route many clients through a small set of proxy servers. This subject can be developed further. We will come back to it later!
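One possible "sticky" key is the session id itself rather than the IP address, so that clients behind the same proxy still spread across boxes. The sketch below assumes a fixed list of servers and picks a primary by hashing the session id, with the next server in the list as its single backup. The class StickyRouter and the server names are hypothetical, and this is not how any specific load balancer does it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative router: maps a session id to a primary server and one backup.
class StickyRouter {
    private final List<String> servers;

    StickyRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers);
    }

    // floorMod keeps the index non-negative even when hashCode() is negative.
    private int indexFor(String sessionId) {
        return Math.floorMod(sessionId.hashCode(), servers.size());
    }

    // Every request carrying the same session id lands on the same server.
    String primaryFor(String sessionId) {
        return servers.get(indexFor(sessionId));
    }

    // The session is copied to one neighbor only, not to the whole cluster.
    String backupFor(String sessionId) {
        return servers.get((indexFor(sessionId) + 1) % servers.size());
    }

    public static void main(String[] args) {
        StickyRouter router = new StickyRouter(Arrays.asList("app1", "app2", "app3"));
        String sessionId = "abc";
        System.out.println("primary: " + router.primaryFor(sessionId));
        System.out.println("backup:  " + router.backupFor(sessionId));
    }
}
```

A plain modulo like this reassigns many sessions whenever the server list changes, so it is only meant to show the idea of keying on the session id and limiting replication to a designated backup.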