
Session Management – Part 1

Session management is one of the key topics that every serious web developer and architect needs to master. This article walks through the following topics:

  • Persistent vs non-persistent web connections – web performance!
  • Concerns when using cookies – security and size limitations
  • Server-side session management challenges in scalable web applications
  • Achieving linear scalability through stateless servers - start moving the session to the client

Today, I will walk through all these topics at a high level. Follow-up articles will develop each topic further if necessary. Let's start!

Persistent vs non-persistent web connections

  1. HTTP is a stateless protocol. Before HTTP 1.1, it also did not keep connections open by default: each request made by a Web browser, for an image, an HTML page, or another Web object, was made over a new connection.
  2. HTTP 1.1 introduced persistent connections (i.e. Keep-Alive), so a Web browser can establish a single connection through which multiple requests can be made.
  3. Either way, the requests themselves remain stateless, so how can state be maintained across HTTP requests?
    • Normally, we keep the session on the server side and give the client a session id that it can use to link subsequent requests to the same session.
    • Normally, the client (most often a web browser) stores the session id in a cookie.
    • However, if cookies are disabled, the session id is normally embedded in the URL (i.e. URL rewriting).
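To make the session id idea concrete, here is a minimal sketch in plain Java. It is not the Servlet API's HttpSession; the class SessionStore, its methods and the ";jsessionid=" path parameter format used for URL rewriting are assumptions chosen for illustration.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory session store, for illustration only.
class SessionStore {
    private final SecureRandom random = new SecureRandom();
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Creates an empty server-side session and returns the id handed to the client.
    String create() {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        String id = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    // Finds the session for an id read from a cookie or a rewritten URL; null if unknown.
    Map<String, Object> find(String id) {
        return id == null ? null : sessions.get(id);
    }

    // URL rewriting fallback: put the id in front of the query string as a path parameter.
    static String rewriteUrl(String url, String id) {
        String param = ";jsessionid=" + id;
        int q = url.indexOf('?');
        return q < 0 ? url + param : url.substring(0, q) + param + url.substring(q);
    }

    public static void main(String[] args) {
        SessionStore store = new SessionStore();
        String id = store.create();
        store.find(id).put("user", "alice");
        System.out.println(rewriteUrl("/cart?item=1", id));
        System.out.println(store.find(id).get("user"));
    }
}
```

Only the id travels with each request; everything stored under it stays on the server.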

Concerns when using cookies

What do we need to pay attention to when we store info in a cookie?

  1. Size limitations and security concerns.
  2. How long can a cookie last? By default, it expires when the browser exits. In Java, you can call cookie.setMaxAge(int) with a large number of seconds if you want the info to last a long time in the cookie. If you call setMaxAge(0), the cookie is deleted.
  3. Normally, we don't keep all state info in the cookie, because the information could be sensitive and we cannot protect it while it sits in the client's filesystem. Apart from that, cookies are limited in size. Because of these two concerns, we normally store only the session id in the cookie and keep the session on the server side. This approach saves bandwidth as well.
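The lifetime rules above can be tried out without a servlet container. The sketch below uses java.net.HttpCookie from the JDK, which is a different class from the servlet Cookie (its setMaxAge takes a long rather than an int), but it follows the same convention: a negative max age means "until the browser exits" and zero means "delete". The class name CookieLifetimeDemo and the cookie name SESSIONID are assumptions.

```java
import java.net.HttpCookie;

// Illustrates cookie lifetimes with the JDK's java.net.HttpCookie (not the servlet Cookie class).
class CookieLifetimeDemo {
    // Default max age is -1: the cookie is discarded when the browser exits.
    static HttpCookie sessionCookie(String sessionId) {
        return new HttpCookie("SESSIONID", sessionId);
    }

    // A positive max age (in seconds) keeps the cookie across browser restarts.
    static HttpCookie persistentCookie(String sessionId, long seconds) {
        HttpCookie cookie = new HttpCookie("SESSIONID", sessionId);
        cookie.setMaxAge(seconds);
        return cookie;
    }

    // A max age of 0 tells the client to delete the cookie.
    static HttpCookie deletedCookie(String name) {
        HttpCookie cookie = new HttpCookie(name, "");
        cookie.setMaxAge(0);
        return cookie;
    }

    public static void main(String[] args) {
        System.out.println(sessionCookie("abc").getMaxAge());
        System.out.println(persistentCookie("abc", 30L * 24 * 60 * 60).getMaxAge());
        System.out.println(deletedCookie("SESSIONID").hasExpired());
    }
}
```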

Server side session management challenges

At first glance, keeping the session on the server side sounds like a great solution. However, when it comes to scale, it raises concerns. Imagine you need to replicate client session state across multiple servers to achieve high availability. Both the replication time and the memory limit will keep your system from scaling linearly. To solve or minimize this, we can be selective about what we store in the session, use sticky sessions to avoid replicating every session to all machines, or even store the state on the client where possible, for example with a rich client UI (e.g. Flex or Silverlight). A post will be written about this topic later on.

Transient vs Persistent State

  1. A session on the server can time out (after ~30 minutes of inactivity).
  2. A session on the server can be persisted to a file across Tomcat restarts.
  3. Persistent state should be stored in a database.
  4. Objects put in the session should be Serializable.
  5. Avoid putting too much info in the session, because we don't want to carry too much baggage during session replication. One server crashing from memory depletion can spread to other servers via session replication. Not good! Should we reconsider storing the session in the client? This article talks about it.
  6. Session replication is needed to support failover. Sticky sessions are simpler, but the session data is lost when the box goes down. We can designate one or two servers as backups for each session to avoid that loss. To go for the sticky session approach, we need to identify the "sticky" part: what can we use to link separate requests to the same server? Using the IP address can potentially overload a box, because some Internet service providers route many clients through a small set of proxy servers. This subject can be developed further. We will come back to it later!
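One possible way to pick the "sticky" part is to hash the session id instead of the IP address. The sketch below is only an illustration of that idea; the class StickyRouter, the server names and the "next server in the list is the backup" rule are assumptions, not something a particular load balancer does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sticky-session router keyed on the session id rather than the client IP.
class StickyRouter {
    private final List<String> servers;

    StickyRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers);
    }

    private int indexFor(String sessionId) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        return Math.floorMod(sessionId.hashCode(), servers.size());
    }

    // Every request carrying the same session id lands on the same server.
    String primaryFor(String sessionId) {
        return servers.get(indexFor(sessionId));
    }

    // The neighbouring server holds the replica used when the primary is down.
    String backupFor(String sessionId) {
        return servers.get((indexFor(sessionId) + 1) % servers.size());
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("app1");
        names.add("app2");
        names.add("app3");
        StickyRouter router = new StickyRouter(names);
        System.out.println(router.primaryFor("session-42") + " / " + router.backupFor("session-42"));
    }
}
```

With this scheme a session only needs to be replicated to one backup instead of to every machine.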

Powerful combination: JMX + Annotation + AOP

What is AOP?

AOP is a way to modularize cross-cutting concerns. OK, what does “modularize” really mean? Modularization is the encapsulation of a unit of functionality. It is exactly what a “Class” does in the OO world. How about “cross-cutting concerns”? Basically, it means any functionality that spans multiple modules/ classes, such as transaction management, security, caching and performance monitoring. To understand how AOP works, we first look at the common terms in this area:

  1. Join point – An identifiable point in the execution of a program like method invocation, exception thrown.
  2. Pointcut – Program construct that selects join points and collects context at those points. AspectJ has a rich pointcut expression language!
  3. Advice – Code to be executed at a join point that has been selected by a pointcut.

I find these terms easier to understand if I think of a join point as a point in the code that generates an event, a pointcut as a way to define which events to capture, and advice as the event handler.

AOP is indeed a powerful way to factor system or infrastructure-related code out of the business-oriented code. Typically, we use it to take care of the transaction, security and profiling aspects. But that doesn’t stop you from being creative in this domain. With a bit more creativity, you can also do the following:

  1. Exception translation – checked to runtime
  2. Catch ConcurrencyFailureExceptions and transparently retry if an idempotent operation fails with, for example, a deadlock loser exception.

How do I use Spring AOP in my project?

I have been told to report the elapsed time for all calls to the database. If I didn’t know how to use AOP, I might end up adding timing code around every JDBC call. That would tangle performance monitoring code with my main-line business logic, and the same logic would be scattered all over my data access code. Bad!! That is why we need to know how to factor the performance monitoring code out into an aspect.

Here we use the @AspectJ annotation approach to implement the aspect. An “Around” advice intercepts the start and end of every repository method. Here is what the Spring 2.5 reference states:

Spring 2.0 introduces a simpler and more powerful way of writing custom aspects using either a schema-based approach or the @AspectJ annotation style. Both of these styles offer fully typed advice and use of the AspectJ pointcut language, while still using Spring AOP for weaving.

If you use @AspectJ annotations, you need to put <aop:aspectj-autoproxy/> in your application-context.xml. Spring’s proxy-based AOP is limited to intercepting method invocations; if you need other kinds of join points, you have to move to full AspectJ weaving. You can still use AspectJ syntax in your pointcut expressions, and you don’t need to build the application with ajc (the AspectJ compiler) to do so, because Spring AOP understands @AspectJ aspects. I strongly suggest you use annotation-driven AOP because it is cleaner and simpler. Working with AOP, I have faced two questions:

  1. How do I select the methods I want to intercept without hardcoding method or package names in my pointcut expression, so that my aspect or pointcut doesn’t contain application-specific information? Look into the “Annotation and AOP” section.
  2. How do I turn AOP on and off without restarting the web application? I would use JMX. Look into the “What is JMX?” section.

Annotation and AOP

Annotations provide a better way than code signatures to select join points, which leads to loosely coupled aspects. In fact, you can see an annotation as another signature of a method in a different dimension. A method can have multiple annotations, and each concern only cares about its own annotation. This is called a multidimensional signature space. For example,

@Authentication("bankOperation")
@Transactional(REQUIRED)
public void credit(){…}

Pointcut uses annotation to capture join points. For example:

execution(@Transactional * *.*(..)) Execution of a method annotated as Transactional
execution((@Transactional *) *.*(..)) Execution of a method that returns object annotated as Transactional
execution(* (@Transactional *).*(..)) Execution of a method defined for type annotated as Transactional

Selection can use annotation types and annotation values. What is more, annotation values can be used in the advice implementation.
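To show the idea of selecting join points by annotation without pulling in Spring or AspectJ, here is a rough stand-in built on a JDK dynamic proxy. It is not how Spring evaluates a pointcut expression; the @Timed annotation, the AccountRepository interface and the withTiming helper are all made up for this sketch. The handler plays the role of the around advice and only measures methods that carry the annotation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class AnnotationTimingDemo {
    // Hypothetical marker annotation: describes WHAT the join point is, not how to handle it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Timed {}

    interface AccountRepository {
        @Timed
        String findOwner(int accountId);

        String ping();
    }

    static class InMemoryAccountRepository implements AccountRepository {
        public String findOwner(int accountId) { return "owner-" + accountId; }
        public String ping() { return "pong"; }
    }

    // Names of the methods the "advice" actually measured.
    static final List<String> timedCalls = new ArrayList<>();

    // Wraps the target in a proxy whose handler times only @Timed methods.
    static <T> T withTiming(T target, Class<T> iface) {
        Object proxy = Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface},
            (self, method, args) -> {
                boolean selected = method.isAnnotationPresent(Timed.class);
                long start = System.nanoTime();
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // rethrow the target's own exception
                } finally {
                    if (selected) {
                        timedCalls.add(method.getName());
                        System.out.println(method.getName() + " took "
                            + (System.nanoTime() - start) + " ns");
                    }
                }
            });
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        AccountRepository repo = withTiming(new InMemoryAccountRepository(), AccountRepository.class);
        repo.findOwner(7); // measured
        repo.ping();       // not measured
    }
}
```

The handler never mentions a method or package name, which is the decoupling the annotation approach is after.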

Here is a great video from Parleys that talked about “Leveraging Annotation with AOP”. I have included some key points Ramnivas made here:

  • Write your pointcut in a smart way to avoid an annotation mess. Try to use naming and package conventions to help you. For example, if you want to write an app log for all public-facing service methods, you can use “public” with a package name wildcard containing “service”.
  • If you really need an annotation like @Transactional because the designer has no way to define the pointcut beforehand, use the annotation to describe what the join point is, not how to handle it. That way, your transaction aspect only needs to care about the @Transactional annotation and stays decoupled from the application.
  • You can piggyback annotations. For example, you can make all entities auditable via declare @type: @Entity *: @Auditable;

How does Spring AOP work internally?

The magic behind AOP is the Proxy/ Decorator/ Interceptor/ Filter pattern. To me, all those patterns are conceptually the same: each presents itself as the target object (by implementing the same interface), intercepts method calls and executes the injected logic. You can also have more than one interceptor invoked in series. In Spring AOP, there is one thing we need to pay attention to:

However, once the call has finally reached the target object, …any method calls that it may make on itself, such as this.bar() or this.foo(), are going to be invoked against the this reference, and not the proxy. This has important implications. It means that self-invocation is not going to result in the advice associated with a method invocation getting a chance to execute. To handle this, either you refactor your code so that the self-invocation does not happen (best approach) or you make the self-invocation go through the proxy, like ((Pojo) AopContext.currentProxy()).bar() (an invasive approach because it totally couples your code to Spring AOP, and it makes the class itself aware of the fact that it is being used in an AOP context, which flies in the face of AOP. Avoid using it).

However, it must be noted that AspectJ does not have this self-invocation issue because it is not a proxy-based AOP framework.
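The self-invocation issue is easy to reproduce with a plain JDK dynamic proxy, which is one of the proxy mechanisms Spring AOP can use. The sketch below is not Spring code; it borrows the Pojo, foo() and bar() names from the passage above, and the class SelfInvocationDemo plus the counter standing in for the advice are assumptions.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

class SelfInvocationDemo {
    interface Pojo {
        void foo();
        void bar();
    }

    static class SimplePojo implements Pojo {
        public void foo() {
            // Direct call on "this": it never passes through the proxy.
            this.bar();
        }

        public void bar() {
            // Business logic would go here.
        }
    }

    // Builds a proxy whose handler counts how many times the "advice" ran.
    static Pojo proxyFor(Pojo target, AtomicInteger adviceRuns) {
        return (Pojo) Proxy.newProxyInstance(Pojo.class.getClassLoader(), new Class<?>[] {Pojo.class},
            (self, method, args) -> {
                adviceRuns.incrementAndGet();
                return method.invoke(target, args);
            });
    }

    public static void main(String[] args) {
        AtomicInteger adviceRuns = new AtomicInteger();
        Pojo pojo = proxyFor(new SimplePojo(), adviceRuns);
        pojo.foo(); // the advice runs for foo() only; the inner bar() call is not intercepted
        System.out.println("advice runs after foo(): " + adviceRuns.get());
        pojo.bar(); // called through the proxy, so the advice runs
        System.out.println("advice runs after bar(): " + adviceRuns.get());
    }
}
```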

What is JMX?

In short, it is a way to enable management and monitoring of Java applications over a generic API. JMX has a simple architecture made up of an instrumentation level, an agent level and a distributed services level. At the instrumentation level, we register MBeans with the MBeanServer. In simple terms, an MBean is a JavaBean with a defined management interface that exposes attributes and operations to the world. The MBeanServer acts as a broker that decouples communication among application MBeans and/or remote clients.

Combine AOP and JMX

Aspects are statically defined and intercept calls at runtime. It is hard to take one out or add another after the application has started. However, with JMX, you can enable and disable an aspect at runtime by having it skip its advice code. On the other hand, you can also use JMX to configure and report SLA metrics, for example to configure thresholds and send notifications of violations. That sounds very interesting to me. There are other interesting usages mentioned in the Parleys video as well:

  1. Service blocking – throw an exception when you don’t want users to call a particular service for a period of time, especially during maintenance.
  2. Caching management – I am currently using the interceptor pattern and IoC to intercept dao method calls for cache lookup.
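Here is a small sketch of the on/off switch idea using only the JDK's JMX classes. The MBean TimingSwitch, the object name example:type=TimingSwitch and the around helper that stands in for an around advice are all assumptions for illustration; in a Spring application the advice would consult a flag like this one before measuring. Once registered, the Enabled attribute can be flipped from a JMX console such as jconsole without restarting the application.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;
import javax.management.Attribute;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxAspectSwitchDemo {
    // Standard MBean interface: must be public and named <ImplementationClass>MBean.
    public interface TimingSwitchMBean {
        boolean isEnabled();
        void setEnabled(boolean enabled);
        long getTimedCalls();
    }

    public static class TimingSwitch implements TimingSwitchMBean {
        private volatile boolean enabled = true;
        private final AtomicLong timedCalls = new AtomicLong();

        public boolean isEnabled() { return enabled; }
        public void setEnabled(boolean enabled) { this.enabled = enabled; }
        public long getTimedCalls() { return timedCalls.get(); }
    }

    // Registers the switch with the platform MBeanServer under a hypothetical name.
    static TimingSwitch register(String name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName objectName = new ObjectName(name);
            if (server.isRegistered(objectName)) {
                server.unregisterMBean(objectName);
            }
            TimingSwitch timingSwitch = new TimingSwitch();
            server.registerMBean(timingSwitch, objectName);
            return timingSwitch;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    // What a management console does when an operator flips the attribute.
    static void setEnabledViaJmx(String name, boolean enabled) {
        try {
            ManagementFactory.getPlatformMBeanServer()
                .setAttribute(new ObjectName(name), new Attribute("Enabled", enabled));
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stand-in for an around advice: skips the monitoring logic when the switch is off.
    static <T> T around(TimingSwitch timingSwitch, Supplier<T> call) {
        if (!timingSwitch.isEnabled()) {
            return call.get();
        }
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            timingSwitch.timedCalls.incrementAndGet();
            System.out.println("call took " + (System.nanoTime() - start) + " ns");
        }
    }

    public static void main(String[] args) {
        TimingSwitch timingSwitch = register("example:type=TimingSwitch");
        around(timingSwitch, () -> "row");                      // measured
        setEnabledViaJmx("example:type=TimingSwitch", false);
        around(timingSwitch, () -> "row");                      // skipped
        System.out.println("timed calls: " + timingSwitch.getTimedCalls());
    }
}
```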

Reference

  1. JavaOne 07 – JMX, AOP and Spring (Nice Presentation)
  2. Parley’s AOP and JMX (Video)
  3. Simplifying Enterprise Applications with Spring 2.0 and AspectJ
  4. Workflow Orchestration Using AOP
  5. Performance Monitoring with AOP and JMX

 


Streaming data to your grid

Push data to client

A traditional web application is based on the request and response model: information is delivered as a single payload and the connection to the client is closed immediately afterwards. To keep the client in sync, we normally poll the server periodically. This approach may generate unacceptable load on the server. To solve this problem, we want a push mechanism from server to client. This is where Comet comes in. Comet is a generic term describing various approaches to send data asynchronously from a Web server to a client without the need for the client to explicitly request the data. It is an essential technique for any real-time, event-driven web application, where the majority of events occur on the server and data must be “pushed” frequently to the client. To achieve this, Comet servers must maintain a continuous connection to each client for the duration of the session.

 

OK, so how do we maintain a continuous connection to each client for the duration of the session?

If you try to adapt a traditional server to the Comet methodology, it may not scale and often fails after a few thousand simultaneously open connections. A true Comet implementation requires a very different kind of server architecture to be efficient and scalable. One example is Liberator, a solid Comet server that is used by the financial industry. However, it is written in C and is not open source, although a FREE edition is distributed.

To understand this statement a little better, we need to know how traditional web containers handle requests. They follow a thread-per-request model.

  1. The client, typically a browser, sends a request for a resource to a web server.
  2. The server has a listening thread that keeps track of incoming connections.
  3. When a request arrives, the server uses one process or thread to process the request.
  4. The resource is returned to the client and the connection is closed.

In this model, the number of requests that can be served per second depends on two things:

  1. How many threads there are to handle client requests
  2. How long it takes to serve one request

If all server threads are busy, incoming requests are put in a queue, and the server returns to them when threads become free. The number of requests handled per second can be greater than the number of allowed simultaneous connections, as long as the time required to process a request is very short (well under a second). In other words, you can serve more requests in a second than you have threads.

However, there is one breed of applications that needs to hold on to connections. Think of applications that require real-time data coming to clients (stock tickers), or applications where low latency is required. In the traditional web model above, the browser has to reconnect to get new data (polling). If updates can happen with high frequency (e.g. a chat application), the polling frequency also has to increase. An alternative to high-frequency polling is a push-based application. In a push-based application, once the browser connects to the server, the server maintains the connection until the browser times out (the server response stream is not closed) and keeps flushing data down the connection as it becomes available.

In a servlet container, to hold the connection, your thread cannot exit the service method; otherwise the response stream will be closed. So you block the thread on some condition within the service method. When push data becomes available, the thread writes to the response stream and enters the blocked state again. As long as you hold on to the connection, you cannot return this thread to the thread pool, and as more and more “push” connections are established you will run out of threads! To remedy the problem, the possible solutions are:

  1. Increase # of server threads.
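The thread starvation described above can be simulated with a plain thread pool. This is only a model of a servlet container, not Tomcat itself: the class BlockingPushDemo and its method are made up, each task stands for a service method that blocks while waiting for push data, and the pool stands for the container's worker threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BlockingPushDemo {
    // Returns how many "push" connections are stuck in the queue because
    // every worker thread is blocked holding an earlier connection open.
    static int queuedConnections(int poolSize, int connections) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(Math.min(poolSize, connections));
        CountDownLatch dataAvailable = new CountDownLatch(1);
        try {
            for (int i = 0; i < connections; i++) {
                pool.execute(() -> {
                    started.countDown();
                    try {
                        dataAvailable.await(); // the "service method" blocks, holding its thread
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await(); // wait until every worker thread is occupied
            return pool.getQueue().size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            dataAvailable.countDown(); // release the blocked threads
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("waiting connections: " + queuedConnections(2, 5));
    }
}
```

Raising the pool size only moves the limit; the waiting connections disappear only when holding a connection no longer ties up a thread.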

Flex Push

There is some confusion about whether BlazeDS supports real-time messaging. Yes, it does. In fact, BlazeDS has a full spectrum of channel types ranging from simple polling, to near-real-time polling, to real-time streaming.

  1. Simple polling – ping the server from the Flex client using the traditional request and response model.
  2. Near-real-time polling (long polling) – Instead of acknowledging right away, the server can hold the polling request until there’s a message for the client. This ensures that messages are delivered to the client as soon as they become available. The caveat of long polling is the thread limitation in most application servers. At this moment, BlazeDS cannot support more than a few hundred long-polling clients on most application servers. However, this problem could be resolved once servers like Tomcat start to support asynchronous, non-blocking connection threads. Update: Now Tomcat 6 supports NIO.
  3. Real-time streaming – BlazeDS supports real-time message streaming over AMF and HTTP. Unlike long polling, which closes and reopens the connection upon receiving a message, streaming keeps the connection open at all times. Streaming suffers from the same thread blocking issue as long polling. A cap must be set so the server is not hung by idle threads.

The reason people are confused is that Adobe doesn’t release its proprietary push solution, RTMP, with BlazeDS. So RTMP isn’t available as a channel in the BlazeDS configuration files. BlazeDS lives in a Servlet container and is hence constrained by the one-thread-per-connection limit, whereas LCDS has NIO-based channels that can scale up to thousands of requests. On the other hand, BlazeDS has the advantage that it works over port 80/443, whereas LCDS uses a separate port for persistent connections, which requires firewall configuration. Once the servlet that implements BlazeDS is revved to support Comet Events under Tomcat 6, and then Jetty Continuations, the long polling technique will be fine.

UPDATE: We are waiting for a solution that supports Comet Events under Tomcat 6. Then BlazeDS can be coupled to the Tomcat NIO HTTP listener and scale as well as any NIO-based server software.

I have learnt from this article that you can create a channel set on the client side, so Flex can fail over to other channels until it gets connected or the list is exhausted.

Marc has put effort into building a better, spreadsheet-like data grid in Flex. (check this out)

Reference

Here are the references I used for this article

  1. Tuning Apache and Tomcat for Web 2.0 comet application
  2. Performance of Grids for Streaming Data: this shows you the performance numbers on various frontend technologies. Again, Flex shows us a good result.
  3. Are raining comets and threads? – Comet Daily
  4. Comet & Java: Threaded Vs Nonblocking IO
  5. JDK 1.6 uses epoll to implement NIO
  6. BlazeDS dev guide
  7. Achieve performance breakthrough using BlazeDS: Farata Systems put effort into writing its own NIO channel that runs on Jetty 7 and received promising results.

 


Plenty of Fish – Cash cow!

A site called “PlentyOfFish.com” is currently getting 30 million hits a day. The number itself doesn’t blow me away. What surprises me is that the site is basically operated by a single man, Markus Frind. How did he achieve that? If you want to hear how he does it, you can listen to his interview from this link. Otherwise, you can read the summary I took from the interview.

The stuff I learnt from Markus

You may think that Markus must spend a lot of $$ to maintain his site. A picture of a server farm may pop up in your head. Hahaha… all he needs is just 1 web server and 3 database servers. This is a cost that you and I can afford. No need to write a business plan and wait for VC $$ nowadays.

Here are some quick tips from Markus

  1. You need a lot of RAM. RAM is cheap, so go ahead and load your box with tons of RAM!
  2. Markus uses the Akamai CDN to offload the bandwidth of fetching images across different locales.
  3. Separate read and write database operations.
  4. Markus uses one database as the master for writes and 2 databases as slaves to handle the searches (reads). According to him, radius-based searches demand lots of resources. “If you have one system to do just one thing, it will do it much efficiently.”
  5. Markus puts RAM in both the web and db servers. “If you can load your whole db in the RAM, do it!”
  6. Optimizing db access is the key to handling lots of requests.
  7. Denormalization is necessary if you want to reduce the number of joins that can potentially slow down your queries.
  8. PlentyOfFish.com is purely based on “Word of Mouth” marketing. Do things right and your users will spread the word for you. Cheapest marketing strategy ever!
  9. PlentyOfFish.com is a FREE site. Because it is free, it doesn’t have strict requirements such as uptime. It can be down without much of an issue.
  10. PlentyOfFish.com is monetized solely through advertisements like Google Ads. With just this, Markus is making around 10 million annually. Amazing!
  11. PlentyOfFish.com runs purely on a Microsoft stack: IIS, ASP.NET and SQL Server. In fact, you could build it with other solutions such as Apache, Spring and MySQL.
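Tips 3 and 4 (separating reads from writes) can be sketched as a tiny router. This is not how PlentyOfFish implements it; the class ReadWriteRouter, the server names and the rule "anything starting with SELECT is a read" are simplifying assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical router: writes go to the master, reads are spread over the slaves.
class ReadWriteRouter {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = new ArrayList<>(slaves);
    }

    String route(String sql) {
        // Simplification: treat only SELECT statements as reads.
        boolean read = sql.trim().toUpperCase(Locale.ROOT).startsWith("SELECT");
        if (!read || slaves.isEmpty()) {
            return master;
        }
        // Round-robin over the slaves.
        return slaves.get(Math.floorMod(next.getAndIncrement(), slaves.size()));
    }

    public static void main(String[] args) {
        List<String> slaves = new ArrayList<>();
        slaves.add("slave1");
        slaves.add("slave2");
        ReadWriteRouter router = new ReadWriteRouter("master", slaves);
        System.out.println(router.route("INSERT INTO users VALUES (1)"));
        System.out.println(router.route("SELECT * FROM users"));
        System.out.println(router.route("SELECT * FROM users"));
    }
}
```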

I love to see how people like Markus beat giants like Match.com. One man beats hundreds of people with a simple system setup. Incredible! Folks, there is no excuse for whining about having no $$ to start your business!

Although Markus makes it sound easy in the interview, there are areas the interviewer didn’t cover:

  1. The PlentyOfFish.com web front end does not look good. How did it attract its first set of users? FREE
  2. If you go to a FREE site without data, you may leave right away. How did PlentyOfFish.com attract its first real users? Did PlentyOfFish.com crawl competitors’ data to bootstrap the site?
  3. PlentyOfFish.com makes its $$ purely from Google AdSense. However, according to John Chow, AdSense is not a good place to make $$. Why is that?

What could possibly go wrong with his approach:

His database architecture is the traditional master-slave approach. It can offload reads but not writes. Obviously, the master becomes the write bottleneck and a single point of failure. As load increases, the cost of replication increases as well: replication costs CPU, network bandwidth and disk IO. The slaves fall behind and serve stale data. The folks at YouTube had a big problem with replication overhead as they scaled. This problem can be tackled with sharding/ federation. I will discuss this topic later.

 


Amazon Web Service Solutions

When we talk about SOA, I think of Amazon. It is the company that took SOA to the next level, proving to the world that it is a viable solution. Great! I decided to spend some time learning from Amazon by reviewing the web services it provides, reading the related interviews and blogs, studying how to build an application on top of its infrastructure, and developing an application that consumes data from its Web Services. I believe the best way to learn SOA is to get a taste of the services provided by a company that relies heavily on it to scale its business.

Before I delve deeper, I need to clarify one thing. Many people use the terms SOA and Web Service interchangeably. To be honest, I was one of them. However, by definition, they are not the same. SOA is about design; Web services are a specific technology set that supports distributed computing. Web services make it easier to create a service-based system, but only if your developers are using SOA design principles, where functions are packaged into modular, shareable, distributable services that can be used and reused by multiple consumers.

At Amazon, each service is independent and encapsulates 3 things: data, business logic and a public service interface. Each service owns its data, which is never accessed directly by other services. According to its CTO, this is the core architecture that scales Amazon.
 

 

Video Presentation

Jinesh Varia is an evangelist from Amazon. In his presentation, he shows how to build a regular-expression based search engine called “GrepTheWeb” on top of the Amazon infrastructure – SQS, SimpleDB, EC2 and S3. The most interesting thing he mentions in this presentation is the on-demand architecture powered by Hadoop and the Amazon infrastructure. “At time t0, you have no infrastructure. At time t1, when regular expression comes in, the system reaches the execution phase and the whole infrastructure is ready for it. At time t2, the request is fulfilled, the whole infrastructure is gone…” This gives me a taste of cloud computing and how powerful it can be.

 

Web Resource

High Scalability posts an article about Amazon architecture. The author follows up with different resources and consolidates key information he found.

 


Powerful Full Text Search – Part 3 Solr

Introduction of Solr

Solr is a standalone enterprise search server with a web-services like API. You put documents in it (called "indexing") via XML over HTTP (RESTful). You query it via HTTP GET and receive XML results.

  • Advanced Full-Text Search Capabilities
  • Optimized for High Volume Web Traffic
  • Standards Based Open Interfaces – XML and HTTP
  • Comprehensive HTML Administration Interfaces
  • Scalability – Efficient Replication to other Solr Search Servers
  • Flexible and Adaptable with XML configuration
  • Extensible Plugin Architecture

Set up Solr

To set up Solr, you should follow this guideline. After setting up Solr, you practically have an indexing service up and running.

The HTTP/XML interface of the indexer has two main access points: the update URL, which maintains the index, and the select URL, which is used for queries. In the default configuration, they are found at:

  • http://[hostname:port]/solr/update
  • http://[hostname:port]/solr/select

To add a document to the index, we POST an XML representation of the fields to index to the update URL. You can also delete and update documents (i.e. re-post a document with the same unique key). All change operations need a commit to be flushed to the file system. On the other hand, once we have indexed some data, an HTTP GET on the select URL does the querying.
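As a rough illustration of the two access points, the sketch below only builds the request payloads and URLs; it does not talk to a server. The class SolrRequestSketch, the field names and the localhost:8983 base URL are assumptions, while the add/doc/field and commit elements and the q parameter follow Solr's XML-over-HTTP interface described above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Builds (but does not send) the XML to POST to /solr/update and the URL to GET from /solr/select.
class SolrRequestSketch {
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Body to POST to the update URL in order to add one document.
    static String addDocumentXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(field.getKey())).append("\">")
               .append(escapeXml(field.getValue())).append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    // Body to POST to the update URL so pending changes are flushed and become searchable.
    static String commitXml() {
        return "<commit/>";
    }

    // URL to GET for a query; baseUrl is a placeholder such as http://localhost:8983.
    static String selectUrl(String baseUrl, String query) {
        try {
            return baseUrl + "/solr/select?q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("title", "Tom & Jerry");
        System.out.println(addDocumentXml(doc));
        System.out.println(commitXml());
        System.out.println(selectUrl("http://localhost:8983", "title:solr search"));
    }
}
```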

Powerful features Behind Solr

If you followed the guideline above, you are already familiar with indexing, searching and facet browsing. Now let's get down to how to make Solr a scalable solution with great performance.

Caching

TBA

Distribution and Replication

For applications that receive large volumes of queries, a single Solr server may not be enough to meet performance requirements. Therefore, Solr provides mechanisms for replicating the Lucene index across multiple servers that are part of a load-balanced suite of query servers. The replication process is handled through a combination of event listeners enabled through the solrconfig.xml file and several shell scripts (located in solr/bin of the example application).

In a replicating architecture, one Solr server acts as the master server, providing copies of the index (called snapshots) to one or more slave servers that handle query requests. Indexing commands are sent to the master server and queries are sent to the slave servers. The master server can create snapshots manually or by configuring the <updateHandler> section of solrconfig.xml to trigger snapshot creation when commit and/or optimize events are received. In either the manual or the event-driven process, the snapshooter script is invoked on the master server, creating a directory on the server named snapshot.yyyymmddHHMMSS where yyyymmddHHMMSS is the actual time the snapshot was created. The slave servers then use rsync to copy only those files in the Lucene index that have been changed.

<listener event="postCommit" class="solr.RunExecutableListener">
    <str name="exe">snapshooter</str>
    <str name="dir">solr/bin</str>
    <bool name="wait">true</bool>
    <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
    <arr name="env"> <str>MYVAR=val1</str> </arr>
</listener>

Reference

Below are some cool references I found:

  1. Search smarter with Apache Solr, Part 1: Essential features and the Solr schema
  2. Search smarter with Apache Solr, Part 2: Solr for the enterprise
  3. Advanced Lucene

 

 


Grid Computing – Part 1 Introduction

Introduction from Cameron Purdy

 


Tomcat Performance Tuning

Most companies I have worked for use Tomcat as their Servlet container. It is the de facto standard, just as Apache is for web servers. However, most of us just drop our war file into the webapps folder and use Tomcat with all its out-of-the-box default settings. That works fine in a development environment but may not in production. This article gives advice in several areas:

  1. Production Tomcat Architecture
  2. Tuning tomcat for performance
  3. Resolving problems which affect availability

Production Tomcat Architecture

In production Tomcat relies on a number of resources which can impact its overall performance. Understanding the overall system architecture is key to tuning performance and troubleshooting problems.

  1. Hardware: CPU(s), memory, network IO and file IO
  2. OS: SMP (symmetric multiprocessing) and thread support
  3. JVM: version, tuning memory usage, and tuning GC
  4. Tomcat: version (example, Tomcat 6 supports NIO)
  5. Application: Application design can have the largest impact on overall performance
  6. Database: the number of concurrent db connections allowed (pooling and object caching)
  7. Web Server: Apache can sit in front of Tomcat and serve the static content. It can also do load balancing across multiple Tomcat instances.
  8. Network: Network delays.
  9. Remote Client: How fast is the communication protocol? Content can be compressed. 

Performance Tuning

How to measure and test performance

  • Request latency is key b/c it reflects the responsiveness of your site for visitors.
  • Test environment should match production as closely as possible.
  • It is important to simulate a realistic data volume on the database side.
  • Test HTTP requests with different request parameters (test corner cases)
  • Use a load test tool to simulate the traffic (e.g. JMeter)
  • Final tests should run over longer periods, such as days, because JVM performance changes over time and can actually improve when using HotSpot. Memory leaks, temporary db unavailability, etc. can only be found when running longer tests.

JVM version, memory usage and GC

  • Sun Java 1.3 and later releases include the HotSpot profiling optimizer, which is tuned for long-running server applications.
  • Tomcat will freeze processing of all requests while the JVM is performing GC. On a poorly tuned JVM this can last tens of seconds. Most GCs should take less than 1 second and should never exceed 10 seconds.
  • Tune the -Xms (initial/minimum) and -Xmx (maximum) Java heap sizes (setting them to the same value can improve GC performance).
  • Make sure the memory used by the Java process always stays resident in physical memory and is not swapped out to disk.
  • Use -Xincgc to enable incremental garbage collection (this applies to older JVMs; the option was deprecated in JDK 8).
  • Try reducing the per-thread stack size with -Xss.
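
To see what heap limits the JVM actually ended up with and how much time it has spent in GC, the standard java.lang.management API can be queried from inside the process. The sketch below is only an illustration (the class name GcSnapshot is made up); in a real deployment the same numbers are usually read from GC logs or a monitoring tool.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Illustrative snapshot of heap sizing and accumulated GC activity for the running JVM. */
final class GcSnapshot {

    /** Total milliseconds spent in GC so far, summed over all collectors. */
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        // maxMemory() roughly reflects -Xmx; totalMemory() is the heap committed so far.
        System.out.println("max heap:       " + rt.maxMemory() / mb + " MB");
        System.out.println("committed heap: " + rt.totalMemory() / mb + " MB");
        System.out.println("free heap:      " + rt.freeMemory() / mb + " MB");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
        System.out.println("total GC time:  " + totalGcTimeMs() + " ms");
    }
}
```

Running it with different -Xms and -Xmx values is a quick way to confirm that the startup arguments were picked up.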

Tomcat version and configuration

  • Tomcat 6 supports NIO.
  • Set “reloadable” to false – this removes unnecessary change-detection overhead.
  • Set “liveDeploy” to false – liveDeploy controls whether your webapps directory is periodically checked for new WAR files. The check is done by a background thread.
  • Set “debug” to 0.
  • Set “swallowOutput” to true – this makes sure all output to stdout or stderr from a web application is directed to the web application log rather than the console or catalina.out, which makes it easier to troubleshoot problems.
  • Connector configuration – minProcessors, maxProcessors, acceptCount, enableLookups. Don’t set acceptCount too high, because it sets the number of pending requests allowed to queue while awaiting processing. It is better to deny a few requests than to overload Tomcat and cause problems for all requests. Set “enableLookups” to false because DNS lookups can add significant delays.

Database connection pool

  • We use the connection pool provided by Spring instead.
  • Using middleware to persist and cache objects from your database can significantly improve performance because there are fewer database calls and less JVM thrashing from creating, and later garbage collecting, the objects built for each result set.

Application design and profiling

  • If the data used to generate a dynamic page rarely changes, turn it into a static page that you regenerate periodically.
  • Cache dynamic pages.
  • Use a tool like JProbe to profile your web applications during the development phase.
  • Look for possible thread synchronization bottlenecks.
  • Date and time handling can be a thread synchronization bottleneck.
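
The date and time point is not spelled out further here. One common reading, which is an assumption on my part, is that a shared java.text.SimpleDateFormat is not thread-safe and therefore ends up behind a lock that every request thread has to wait on. On Java 8 and later, java.time.format.DateTimeFormatter is immutable and thread-safe, so one shared instance needs no synchronization. A minimal sketch (the class name Timestamps is illustrative):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Illustrative lock-free timestamp formatting shared by many request threads. */
final class Timestamps {

    // Immutable and thread-safe: safe to share without a synchronized block.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    static String format(Instant instant) {
        return FORMAT.format(instant);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
                Thread.currentThread().getName() + " " + format(Instant.now()));
        Thread a = new Thread(task, "worker-a");
        Thread b = new Thread(task, "worker-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```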

Troubleshooting

Collecting and analyzing log data

Common problems

  • Broken pipe – for the HTTP Connector, this indicates that the remote client aborted the request. For the web server JK Connector, it indicates that the web server process or thread was terminated. These are normal and rarely due to a problem with Tomcat. However, if you have a long-running request, the connectionTimeout may close the connection before you send your response back.
  • Tomcat freezes or pauses with no requests being processed – usually due to a long JVM GC pause. A long pause can cause a cascading effect and high load once Tomcat starts handling requests again. Don’t set “acceptCount” too high, and use the java -verbose:gc startup argument to collect GC data.
  • Out of memory error (OutOfMemoryError) – look into the application code to fix the leak (a profiling tool can help). Increase the memory available to the JVM via -Xmx. Restart Tomcat!
  • Database connection failure – connections are used up when traffic spikes.
  • Random connection close exception – this happens when you close your connection twice. On the first close(), the connection returns to the pool, where it may be picked up by another thread. The second close() may then close a connection that is being used by that other thread. Don’t close a connection twice; use JdbcTemplate from Spring to avoid this problem.
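
The double-close problem can be illustrated without a real database. The sketch below uses a toy pool of named "connections" (ToyPool and Handle are hypothetical and are not a JDBC or Spring API) in which the borrowed handle ignores every close() after the first, so a stray second call cannot hand the same connection back twice. In application code, try-with-resources gives a similar guarantee by closing each resource exactly once.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy pool (hypothetical, not a real JDBC pool) showing an idempotent close(). */
final class ToyPool {
    private final Deque<String> idle = new ArrayDeque<>();

    ToyPool(String... connections) {
        for (String c : connections) {
            idle.push(c);
        }
    }

    /** Hands out an idle connection; throws NoSuchElementException if none is left. */
    synchronized Handle borrow() {
        return new Handle(idle.pop());
    }

    synchronized int idleCount() {
        return idle.size();
    }

    private synchronized void giveBack(String connection) {
        idle.push(connection);
    }

    /** A borrowed connection; only the first close() returns it to the pool. */
    final class Handle implements AutoCloseable {
        private final String connection;
        private final AtomicBoolean closed = new AtomicBoolean(false);

        private Handle(String connection) {
            this.connection = connection;
        }

        String name() {
            return connection;
        }

        @Override
        public void close() {
            // compareAndSet succeeds for exactly one caller, even under concurrency.
            if (closed.compareAndSet(false, true)) {
                giveBack(connection);
            }
        }
    }

    public static void main(String[] args) {
        ToyPool pool = new ToyPool("conn-1");
        Handle h = pool.borrow();
        try (Handle scoped = h) {
            System.out.println("using " + scoped.name());
        } // try-with-resources closes the handle here
        h.close(); // accidental second close is ignored
        System.out.println("idle connections: " + pool.idleCount());
    }
}
```

Without the guard in close(), the second call would push "conn-1" into the idle list twice, and two threads could then borrow the same connection.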



Scale your site via Amazon solution – EC2 and S3

Steps to use EC2

  1. To get started, read here.
  2. Create a custom AMI

How others use it

  1. How SmugMug uses Amazon solution
  2. http://www.rajiv.com/blog/2008/02/04/amazon-ec2/

Art of using database indexes

Need of Indexes

Imagine you have a table of user info that contains 50 million rows. Without an index, running a query like the one below requires a full table scan. Clearly this is not efficient, as it is an O(n) operation.

SELECT * FROM user_info WHERE last_name = 'Tom'

But if we index the last_name column, the last_name values are kept sorted alphabetically. Now if you look up last_name = ‘Tom’, you can go directly to the entries that start with ‘T’. Internally, the index contains the fields you have indexed and the position of the matching records (i.e. a pointer).

Cost of Indexes

  1. Because the database needs to maintain a separate list of the indexed values, there is a cost to keep it updated as your data changes.
  2. Indexes cost space. It is a trade-off between space and time. Let’s say the user_info table has 2 billion rows and last_name is 8 bytes long: you are looking at roughly 16 GB of space for the data portion of the index. Add 4-8 bytes for each row pointer (another 8-16 GB), and it goes up to as much as 32 GB.

With these costs, you don’t want to index every column in a table.
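
The space estimate above is simple arithmetic and can be reproduced for other table sizes. This is a rough sketch only: it counts key bytes plus pointer bytes per row and ignores B-tree page overhead and fill factor, so real indexes will be somewhat larger. The class name is illustrative.

```java
/** Back-of-the-envelope index size estimate; ignores B-tree page and node overhead. */
final class IndexSizeEstimate {

    /** Rough index size in bytes: (key bytes + pointer bytes) for every row. */
    static long estimateBytes(long rows, int keyBytes, int pointerBytes) {
        return rows * (keyBytes + pointerBytes);
    }

    public static void main(String[] args) {
        long rows = 2_000_000_000L;
        long gb = 1_000_000_000L; // decimal gigabytes, matching the rough figures in the text
        System.out.println("keys only:      " + estimateBytes(rows, 8, 0) / gb + " GB");
        System.out.println("4-byte pointer: " + estimateBytes(rows, 8, 4) / gb + " GB");
        System.out.println("8-byte pointer: " + estimateBytes(rows, 8, 8) / gb + " GB");
    }
}
```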

Type of Indexes

 

  1. Multicolumn indexes – narrow down the set of rows that a single-column index would match (i.e. they are more selective). If you instead index each field separately, the database may not use all of those indexes at the same time. MySQL, for example, has traditionally used only one index per table per query. To choose which index to use, MySQL uses index statistics to decide which index will return fewer rows.
  2. Partial indexes – if you don’t have much space for your index, you can specify that only a subset of the bytes of the indexed value be used.
  3. Clustered indexes – in MySQL, MyISAM keeps the indexes completely separate from the row data. With clustered indexes, the index and the record itself are “clustered” together. InnoDB uses clustered indexes. In Oracle, clustered indexes are known as “index-organized tables”. This type of index reduces two lookups (index and record data) to one. Internally, a clustered index determines the order in which records in the table are physically stored. Therefore a table can have only one clustered index. The leaf nodes of a clustered index contain the data pages.

A deeper look at MySQL: the InnoDB storage engine creates a clustered index for every table. If the table has a primary key, that is the clustered index. If not, and there is no suitable unique index to use instead, InnoDB internally assigns a six-byte unique ID to every row and uses that as the clustered index (don’t let it generate a useless one for you). All indexes are B-trees. In InnoDB, the primary key’s leaf nodes are the data. Secondary indexes have a pointer to the data at their leaf nodes (in InnoDB, the primary key value of the row).

Index Structure

  • B-Tree Indexes – balanced tree indexes: a tree structure that never becomes lopsided as nodes are added and removed. It gives O(log n) performance for single-record lookups. Unlike a binary tree, a B-tree has many keys per node and doesn’t grow tall or deep as quickly as a binary tree. It is very good for range-based queries as well. For example, for the query below, the server simply finds the first Ray record and the last Robert record. It then knows that everything in between also matches. The same is true of virtually any query that involves a range of values (>, <, MIN, MAX, BETWEEN xx AND yy).

SELECT * FROM user_info WHERE last_name BETWEEN 'Ray' AND 'Robert'

  • Hash Indexes – give very fast O(1) lookups but are less flexible and predictable than other indexes. The key is hashed and lookups compare hash values, so the original ordering is lost and range-based queries cannot use the index. A uniform distribution of hash values is the key here.
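
The difference between an ordered index and a hash index can be tried out with Java's standard collections. This is an analogy rather than a database implementation: TreeMap is a red-black tree, not a B-tree, but like a B-tree it keeps keys sorted, while HashMap only supports exact-match lookups. The sample names and the class name are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Analogy only: a sorted map stands in for a B-tree index, a hash map for a hash index. */
final class OrderedVsHashed {

    /** Keys between from and to (inclusive), found through the sorted structure. */
    static List<String> range(TreeMap<String, Integer> index, String from, String to) {
        return new ArrayList<>(index.subMap(from, true, to, true).keySet());
    }

    public static void main(String[] args) {
        String[] lastNames = {"Adams", "Ray", "Reed", "Rice", "Robert", "Smith", "Tom"};
        TreeMap<String, Integer> ordered = new TreeMap<>(); // last_name -> row position
        Map<String, Integer> hashed = new HashMap<>();
        for (int row = 0; row < lastNames.length; row++) {
            ordered.put(lastNames[row], row);
            hashed.put(lastNames[row], row);
        }
        // Both structures answer an equality lookup directly.
        System.out.println("hashed.get(Tom)  = " + hashed.get("Tom"));
        System.out.println("ordered.get(Tom) = " + ordered.get("Tom"));
        // Only the ordered structure can serve BETWEEN 'Ray' AND 'Robert' without visiting every key.
        System.out.println("range = " + range(ordered, "Ray", "Robert"));
    }
}
```

With the hash map, the same range query would require checking every key, which is the in-memory equivalent of a full table scan.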

Index Limitation

  1. An index generally doesn’t help with searches that start with a wildcard or with regular expression searches.
  2. If index selectivity is very low (for example, the query matches more than about 30% of rows), a table scan may be better.
