Friday, June 8, 2007

The internals of HTTP Session

Understanding HTTP requests:
When a browser makes a request, the host name is first resolved through a DNS server to obtain the server's IP address. The request is then routed over IP to the server that hosts the application. The protocol used is HTTP, whose messages have several parts, such as a request method (GET vs. POST) and headers.

Ex:
GET /index.html HTTP/1.1
Host: www.example.com

Response:
HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
Etag: "3f80f-1b6-3e1cb03b"
Accept-Ranges: bytes
Content-Length: 438
Connection: close
Content-Type: text/html; charset=UTF-8
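To make the parts of a response concrete, here is a minimal Java sketch that splits the head of a raw response into its status code and headers. The class name RawHttpResponse is made up for illustration, and the parsing is deliberately simplified (for example, header names are kept exactly as received, although HTTP treats them as case-insensitive). A real client would rely on an HTTP library instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: parses the head of a raw HTTP response.
class RawHttpResponse {
    final int statusCode;
    final Map<String, String> headers = new LinkedHashMap<>();

    RawHttpResponse(String raw) {
        // Lines are separated by CRLF; the first line is the status line.
        String[] lines = raw.split("\r\n");
        // The status line looks like: HTTP/1.1 200 OK
        String[] statusParts = lines[0].split(" ", 3);
        statusCode = Integer.parseInt(statusParts[1]);
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                break; // a blank line ends the header section
            }
            int colon = lines[i].indexOf(':');
            if (colon < 0) {
                continue; // this sketch simply skips malformed lines
            }
            String name = lines[i].substring(0, colon).trim();
            String value = lines[i].substring(colon + 1).trim();
            headers.put(name, value);
        }
    }

    public static void main(String[] args) {
        RawHttpResponse r = new RawHttpResponse(
            "HTTP/1.1 200 OK\r\nContent-Length: 438\r\nConnection: close\r\n\r\n");
        System.out.println(r.statusCode + " " + r.headers);
    }
}
```

Only the first colon on a line separates the name from the value, so a header such as Date, whose value itself contains colons, is kept whole.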

HTTP is a stateless protocol.

What is meant by Stateless protocol?

A stateless protocol means that when a caller makes a request and gets a response back, the server does not remember who the caller was. The next time the same caller makes a request, it is treated as a fresh request.

But many applications need to maintain state for the user. Ex: A user logs in with a user id and password, then visits more pages, but does not want to enter the username on each of them. The server then needs to keep state details for the user, such as the username, and use them for the rest of the pages.

What's the solution?

The common method for solving this problem is to send and request cookies. Other methods include server-side sessions, hidden variables (when the current page is a form), and URL-encoded parameters (such as index.php?userid=3).

Server side sessions:

In J2EE this is called HttpSession. Session data is typically stored in the memory space of the web server, but there are different ways it can be stored, and the choice depends on the web server/application server implementation.

i) In memory: the session is stored in the memory space of the web server/app server process.
ii) In a database: the session is stored in a database.

In some cases a separate service (another process) manages session persistence.

This means the way sessions are stored is not part of any spec or standard and is part of the server implementation.
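As a rough illustration of the in-memory option, the following Java sketch keeps sessions in a map keyed by a random session id. InMemorySessionStore and its methods are hypothetical names; a real application server implements this internally and exposes it through HttpSession, so this only shows the idea under those assumptions.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory session store; real containers do this internally.
class InMemorySessionStore {
    private static final SecureRandom RANDOM = new SecureRandom();

    // session id -> attributes stored for that session
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Creates an empty session and returns its randomly generated id.
    String createSession() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        StringBuilder id = new StringBuilder();
        for (byte b : bytes) {
            id.append(String.format("%02X", b)); // two hex digits per byte
        }
        sessions.put(id.toString(), new ConcurrentHashMap<>());
        return id.toString();
    }

    // Returns the attributes for a session id, or null if the id is unknown.
    Map<String, Object> find(String sessionId) {
        return sessions.get(sessionId);
    }

    // Drops the session, for example on logoff or timeout.
    void invalidate(String sessionId) {
        sessions.remove(sessionId);
    }

    public static void main(String[] args) {
        InMemorySessionStore store = new InMemorySessionStore();
        String id = store.createSession();
        store.find(id).put("username", "alice");
        System.out.println(id + " -> " + store.find(id));
    }
}
```

A database-backed store would keep the same lookup-by-id shape but read and write rows instead of map entries.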

Now when a request comes in from a client browser, the cookie that contains the session id is passed to the server, which reads it and uses it to look up the session data. Session cookies are transient in nature, i.e. they are not stored in the file system but can be seen using certain tools. These cookies are set to expire (be deleted) when the browser is closed. Cookies that last beyond a user's session (i.e., the "Remember Me" option) are termed "persistent" cookies. Persistent cookies are usually stored on the user's hard drive. Their location depends on the particular operating system and browser (e.g., C:\Documents and Settings\username\Cookies for Internet Explorer on Windows 2000).

Cookie content looks like this:

JSESSIONID=1607EB0EF22CAAA1FC82056B4978F40D
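On the server side, the session id has to be picked out of the Cookie request header before the session can be looked up. A minimal Java sketch of that step follows. CookieHeader is a hypothetical helper written for this post; servlet containers do this parsing for you.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits a Cookie request header value into name/value pairs.
class CookieHeader {
    // Example input: "Cookievalue=abc; JSESSIONID=1607EB0EF22CAAA1FC82056B4978F40D"
    static Map<String, String> parse(String headerValue) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String pair : headerValue.split(";")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // ignore fragments without a name=value shape
            }
            cookies.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return cookies;
    }

    public static void main(String[] args) {
        Map<String, String> cookies =
            parse("JSESSIONID=1607EB0EF22CAAA1FC82056B4978F40D");
        System.out.println(cookies.get("JSESSIONID"));
    }
}
```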

In cases where the browser rejects cookies, one can configure the application server to use URL rewriting, where the session id is embedded in the URL.
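In a servlet container the rewriting is normally done by passing each link through HttpServletResponse.encodeURL. The sketch below only imitates the result with plain string handling, so that the shape of a rewritten URL is visible. UrlRewriting and encodeUrl are hypothetical names, and the ;jsessionid= form is the one servlet containers conventionally use.

```java
// Hypothetical helper that imitates what URL rewriting produces.
class UrlRewriting {
    // Inserts ";jsessionid=<id>" after the path, before any query string or fragment.
    static String encodeUrl(String url, String sessionId) {
        int cut = url.length();
        int query = url.indexOf('?');
        int fragment = url.indexOf('#');
        if (query >= 0) {
            cut = query;
        }
        if (fragment >= 0 && fragment < cut) {
            cut = fragment;
        }
        return url.substring(0, cut) + ";jsessionid=" + sessionId + url.substring(cut);
    }

    public static void main(String[] args) {
        System.out.println(encodeUrl("/cart/view?item=3",
            "1607EB0EF22CAAA1FC82056B4978F40D"));
    }
}
```

Because the id now travels in every link, it can leak through bookmarks, logs, and referrers, which is one reason cookies are usually preferred when the browser accepts them.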


This shows how the session id is held in the browser's memory space. Cookievalue=admin@eforceglobal.com is the persistent cookie.

How do different application servers maintain sessions?

Jrun: http://www.adobe.com/cfusion/knowledgebase/index.cfm?id=tn_18131
BEA: http://edocs.bea.com/wls/docs90/webapp/sessions.html
WAS: http://www.phptr.com/articles/article.asp?p=332851&seqNum=2&rl=1

Session replication & Clusters:

What are clusters?
For high availability we group a set of application servers, each of which is capable of handling user requests. Depending on load, any of the servers can be used at any given point in time.

Failover & Session replication:
Now when a given server goes down, it should not appear to the user that something has gone wrong. The session data should move over to another server, which can take over request processing from that point onwards. Thus, for failover to take place, session replication needs to be done, which means session data needs to be shared with the other servers. If the session is stored in memory, it needs to be copied to the other servers in the cluster; if it is in the database, shared session data is supported automatically.
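The in-memory case can be pictured with a toy Java sketch in which every write to a session is copied to the other nodes. ClusterNode is a hypothetical class, and the copy happens inside one JVM purely for illustration; real application servers serialize the session and send it over the network, each with its own replication strategy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical cluster node that keeps sessions in memory.
class ClusterNode {
    final String name;
    final Map<String, Map<String, Object>> sessions = new HashMap<>();
    final List<ClusterNode> peers = new ArrayList<>();

    ClusterNode(String name) {
        this.name = name;
    }

    // Writes an attribute locally, then copies the whole session to every peer.
    void setAttribute(String sessionId, String key, Object value) {
        Map<String, Object> session =
            sessions.computeIfAbsent(sessionId, id -> new HashMap<>());
        session.put(key, value);
        for (ClusterNode peer : peers) {
            // Each peer gets its own copy, not a shared reference.
            peer.sessions.put(sessionId, new HashMap<>(session));
        }
    }

    // Returns the attribute, or null if this node has no such session or key.
    Object getAttribute(String sessionId, String key) {
        Map<String, Object> session = sessions.get(sessionId);
        return session == null ? null : session.get(key);
    }

    public static void main(String[] args) {
        ClusterNode a = new ClusterNode("A");
        ClusterNode b = new ClusterNode("B");
        a.peers.add(b);
        a.setAttribute("S1", "username", "alice");
        // If A goes down, B can still answer for session S1.
        System.out.println(b.getAttribute("S1", "username"));
    }
}
```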

Session binding listeners:
This addresses the case where we would like to take some action when a session is created or expires. Ex: Every time a user session expires we would like to clean up resources, even though the user may not have clicked the Logoff button but rather just closed the window, or has not accessed the system for longer than the session timeout period. In these cases session binding listeners come in handy.
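In the servlet API this callback is provided by HttpSessionBindingListener, whose valueBound and valueUnbound methods are invoked when an object is added to or removed from a session. Since the servlet classes are not part of the plain JDK, the Java sketch below imitates the mechanism with hypothetical stand-ins (BindingListener, SimpleSession, UserResources) whose callbacks take just the attribute name. It is meant to show the flow, not the real API signatures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the servlet API's HttpSessionBindingListener.
interface BindingListener {
    void valueBound(String name);
    void valueUnbound(String name);
}

// Hypothetical session that notifies attributes implementing BindingListener.
class SimpleSession {
    private final Map<String, Object> attributes = new HashMap<>();

    void setAttribute(String name, Object value) {
        Object old = attributes.put(name, value);
        if (old instanceof BindingListener) {
            ((BindingListener) old).valueUnbound(name);
        }
        if (value instanceof BindingListener) {
            ((BindingListener) value).valueBound(name);
        }
    }

    // Called on logoff or when the session times out; unbinds every attribute.
    void invalidate() {
        Map<String, Object> copy = new HashMap<>(attributes);
        attributes.clear();
        for (Map.Entry<String, Object> e : copy.entrySet()) {
            if (e.getValue() instanceof BindingListener) {
                ((BindingListener) e.getValue()).valueUnbound(e.getKey());
            }
        }
    }
}

// Example attribute that records events; real code would release resources here.
class UserResources implements BindingListener {
    final List<String> events = new ArrayList<>();

    public void valueBound(String name) {
        events.add("bound:" + name);
    }

    public void valueUnbound(String name) {
        events.add("unbound:" + name);
    }
}
```

The cleanup runs from invalidate, so it happens whether the user clicked Logoff or the server expired the session after the timeout.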

Session hijacking means someone gets hold of the session id, fakes requests with it, and performs operations on the application.

Reference:
http://www.webopedia.com/TERM/S/session_cookie.html
http://java.sun.com/developer/EJTechTips/2003/tt0626.html
Analyze browser cookies: http://www.sandsprite.com/Sleuth/overviews/Sleuth1.4_overview.html
http://www.imperva.com/application_defense_center/glossary/session_hijacking.html
http://en.wikipedia.org/wiki/Session_hijacking

2 comments:

Anish Biswas said...

Please visit following site which acts as a Brand Licensing product and as well as a framework which has been referenced in the article.

http://www.xpflow.com

Unknown said...

Check out this article on session management at BEA Dev2Dev:

http://dev2dev.bea.com/pub/a/2005/05/session_management.html

For more information on Coherence*Web, see:

http://wiki.tangosol.com/display/COH32UG/Installing+Coherence*Web+Session+Management+Module

Peace,

Cameron.