Archive | January, 2009

Load and stress testing my website

Purpose of Load and Stress testing

The key goals of load testing are:

  1. Find out whether your website can support the expected number of concurrent users.
  2. Find out at what load the app breaks.

To do that, you normally follow these steps:

  1. Identify the primary user path.
  2. Identify the expected number of concurrent users (both now and in the future).
  3. Set up virtual users to hit the app. Load generation capacity is the key factor in picking the right tool. You don't want to invest in a lot of hardware just to generate the load you want, right? :wink:
  4. Run the test.
  5. Analyze the results (throughput under load, average response time under load).
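
To make the last step concrete, here is a minimal sketch of how the two numbers could be computed from raw measurements. The input format (a plain list of response times in milliseconds plus the length of the measured window) is an assumption made for illustration, not the output format of any particular tool.

```python
from statistics import mean


def summarize_run(response_times_ms, duration_seconds):
    """Summarize one load-test run.

    response_times_ms: one response time (in ms) per completed request
                       (hypothetical input format).
    duration_seconds:  wall-clock length of the measured window.
    """
    if not response_times_ms or duration_seconds <= 0:
        raise ValueError("need at least one sample and a positive duration")
    return {
        "requests": len(response_times_ms),
        "throughput_rps": len(response_times_ms) / duration_seconds,
        "avg_response_ms": mean(response_times_ms),
        "max_response_ms": max(response_times_ms),
    }


# Four requests completed within a two-second window.
summary = summarize_run([120, 80, 100, 300], duration_seconds=2)
print(summary)
```

Comparing these numbers across runs with an increasing number of virtual users shows where throughput stops growing and response time starts climbing.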

 

Challenges of load testing a Flex app

We have a web application whose Flex front end talks to a J2EE back end via AMF. How do we load and stress test this system? We can certainly run the load test directly against the back end. However, that may require exposing our services through a servlet and load testing them as typical RESTful web services. If we want to simulate Flex and load test through AMF, we need a way to capture the AMF requests from the Flex client and replay them in our load testing suite. To do that, we can use the Charles HTTP proxy to capture the AMF requests and tell JMeter to replay them against our application. This article gives the details. However, Charles is a commercial product. If you want a free solution, you can try this.

I know that JMeter comes with a proxy for recording requests. You can try it out to see whether you can examine the AMF messages. Let me know if it works!

JMeter

Load is all of the users using your web site at a point in time. It includes users making requests to your web site as well as those reading pages from previous requests. However, we need some way to differentiate between all clients and those clients actually making requests. We use the terms concurrent load (all connected users) and active load (users currently making requests) to make this distinction. We use JMeter to generate the load on our system. In terms of load generation, we should make sure that we can simulate the peak load. JMeter is a great load testing tool. I have heard that Google uses it to load and stress test its applications.

To use JMeter, follow the steps below:

  1. Set up a Thread Group – use it to model concurrent virtual users and to decide how you want the load to be generated.
    • Number of threads.
    • The ramp-up period (the amount of time JMeter takes to create the total number of threads). If the ramp-up period is zero, JMeter creates all the threads at once at the beginning of the test and sends out their requests immediately, potentially saturating the server and, more importantly, misleadingly inflating the load. That is, the server could become overloaded not because the average hit rate is high, but because all the threads' first requests arrive simultaneously, causing an unusual initial peak hit rate. The rule of thumb for determining a reasonable ramp-up period is to keep the initial hit rate close to the average hit rate.
    • The number of times to execute the test.
    • If the client machine running JMeter lacks enough computing power to model a heavy load, JMeter's distributed testing feature allows you to control multiple remote JMeter engines from a single JMeter console.
  2. Introduce user think time
  3. Specify response-time requirements and validate test results.
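
The ramp-up rule of thumb can be turned into simple arithmetic. The sketch below assumes that each thread fires its first request as soon as it is started, so the initial hit rate is roughly the number of threads divided by the ramp-up period. The function names are made up for illustration and are not part of JMeter.

```python
def initial_hit_rate(num_threads, ramp_up_seconds):
    """Approximate hits per second produced by the threads' first requests."""
    if ramp_up_seconds == 0:
        # All threads start at once: an unbounded spike in this simple model.
        return float("inf")
    return num_threads / ramp_up_seconds


def suggested_ramp_up(num_threads, avg_hits_per_second):
    """Ramp-up period that keeps the initial hit rate equal to the average."""
    if avg_hits_per_second <= 0:
        raise ValueError("average hit rate must be positive")
    return num_threads / avg_hits_per_second


# 100 virtual users and an expected average of 5 hits per second.
ramp_up = suggested_ramp_up(100, 5)
print(ramp_up, initial_hit_rate(100, ramp_up))
```

With these example numbers the threads are spread over 20 seconds instead of all starting in the same instant.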

To get familiar with it, here are some articles I found useful:

  1. Stress testing with JMeter by Daniel Rubio – Linux.com
  2. Load testing with Apache JMeter by Kulvir Singh Bhogal – Devx.com
  3. JMeter distributed testing
  4. JMeter recording testing
  5. Load test your drupal app scalability with JMeter – Part 1, Part 2
  6. Load testing with JMeter – powerpoint – very good!
  7. Scalability Factors of JMeter in Performance Testing – response size, response time, protocol, hardware configuration, load generating tool architecture and configuration, complexity of client-side processing.
  8. JMeter Tips – Javaworld

 Reference

  1. Someone has written a nice article in sys-con to talk about how to load test Flex application.
  2. Use FlexUnit for stress testing
  3. Test Remote Data Service via FlexUnit
  4. Load testing with log replay – interesting idea

Speed up your website via caching

Introduction

Caching is a crucial performance tuning strategy, especially when your system has a high read-to-write ratio. You can apply caching at different levels, from the client browser cache all the way to the disk cache on the server side. Let's take a brief look at where we can cache, following the path a request takes to be fulfilled:

  1. Client browser cache
  2. CDN
    • A CDN is a network, like Akamai, where a web site such as JustProposed.com can offload high-bandwidth static files like photos and videos, so that my web site doesn’t need huge bandwidth of its own. Since bandwidth is a major expense, especially as we grow or when we get slashdotted (in which case we run out of bandwidth), a CDN looks interesting. However, Akamai is too expensive for us. So, we will go for the free network, Coral CDN.
    • Apart from the bandwidth, JustProposed.com has lots of non-USA users who sometimes find my site slow to use. So, a CDN also gives us proximity advantages.
    • To use Coral CDN, you simply append .nyud.net:8080 to the hostname in the URL of your expensive resources. For example, http://www.justproposed.com/raydoris/myphoto.jpg becomes http://www.justproposed.com.nyud.net:8080/raydoris/myphoto.jpg
    • Coral looks great. The only problem I have with it is that it runs on a high port, so people behind proxy servers that don’t support HTTP on anything but port 80 will have problems. To use Coral, follow these instructions.
  3. Reverse proxy server and content accelerator – Squid
    • Why not use Apache as the reverse proxy instead of putting Squid in front of Apache? Here are some of the benefits of this setup. The main reason is that Apache, in its process-based configuration, ties up a whole process for each request it serves, which eats up lots of resources.
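
The Coral rewrite described in the CDN item is just a hostname change, so it is easy to script. Below is a minimal sketch. The function name is made up, and the suffix is passed in as a parameter rather than hard-coded, so use whatever suffix the Coral instructions give you (the value in the example is my assumption).

```python
from urllib.parse import urlsplit


def coralize(url, coral_suffix):
    """Return url with coral_suffix appended to its hostname."""
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError("URL has no hostname: %r" % url)
    if parts.port is not None:
        # The suffix brings its own port, so an explicit port cannot be kept.
        raise ValueError("URL already carries an explicit port: %r" % url)
    return parts._replace(netloc=parts.hostname + "." + coral_suffix).geturl()


photo = coralize("http://www.justproposed.com/raydoris/myphoto.jpg",
                 "nyud.net:8080")
print(photo)
```

Only the expensive resources (photos, videos) need to go through such a helper. The pages themselves can keep their normal URLs.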

 

There are several things that you need to look at when you go for a caching approach:

  1. What to cache? The data used by most web applications varies in how dynamic it is, from completely static to changing at every request. Everything that has some degree of stability can be cached. However, because resources (i.e. memory) are limited, I always pick the items that are most frequently accessed and/or most expensive to compute and retrieve.

Application level caching (for J2EE)

JCS – Java Caching System

  1. Configuration
    • To understand the power of JCS, the best way is to look at its configuration file. To find out what each configurable parameter does, take a look at this article.
  2. Integrate with Spring
    • To use JCS with Spring, take a look at this article. It talks about how to create a wrapper or interceptor for your DAO and inject it into your service for caching purposes. To implement the cache as an aspect with full control over what to cache and how, it doesn’t use the declarative Spring Modules caching approach. Regular dependency injection can do the trick!
  3. Distributed caching
    • JCS is a front-tier cache that can be configured to maintain consistency across multiple servers by using a centralized remote server (client-server) or by lateral distribution (peer-to-peer) of cache updates. 
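
The wrapper idea from the Spring item can be sketched independently of JCS. The code below is a language-neutral illustration in Python, not the JCS or Spring API: `CachingDao`, `SlowDao` and their methods are hypothetical names, and a plain dictionary with an expiry time stands in for the real cache region.

```python
import time


class SlowDao:
    """Stand-in for a real DAO; counts how often the 'database' is read."""

    def __init__(self):
        self.reads = 0
        self._rows = {1: "alice"}

    def find_by_id(self, item_id):
        self.reads += 1
        return self._rows.get(item_id)

    def save(self, item_id, value):
        self._rows[item_id] = value


class CachingDao:
    """Wraps a DAO: reads go through the cache, writes invalidate it."""

    def __init__(self, dao, max_age_seconds=60.0, clock=time.monotonic):
        self._dao = dao
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}  # item_id -> (expires_at, value)

    def find_by_id(self, item_id):
        now = self._clock()
        entry = self._entries.get(item_id)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit
        value = self._dao.find_by_id(item_id)  # cache miss: ask the real DAO
        self._entries[item_id] = (now + self._max_age, value)
        return value

    def save(self, item_id, value):
        self._dao.save(item_id, value)
        self._entries.pop(item_id, None)  # drop the stale entry


dao = SlowDao()
cached = CachingDao(dao)
first = cached.find_by_id(1)
second = cached.find_by_id(1)   # served from the cache
reads_after_two_lookups = dao.reads
cached.save(1, "bob")
third = cached.find_by_id(1)    # re-read after invalidation
```

The service only sees the wrapper, which is injected in place of the raw DAO, so the decision of what to cache and for how long stays in one place.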

Reference

  1. Speed up your LAMP stack with lighttpd
  2. Squid and Apache on the same server – have Squid listen on port 80 and Apache listen on port 8080
  3. Squid configuration variable

 


Streaming data to your grid

Push data to client

A traditional web application is based on the request and response model: information is delivered as a single payload and the connection to the client is closed immediately afterwards. To keep the client in sync, we normally poll the server periodically. This approach may generate unacceptable load on the server. To solve this problem, we want a push mechanism from server to client. This is what Comet is for. Comet is a generic term describing various approaches to sending data asynchronously from a web server to a client without the client having to explicitly request the data. It is an essential technique for any real-time, event-driven web application where the majority of events occur on the server and data must be “pushed” frequently to the client. To achieve this, Comet servers must maintain a continuous connection to each client for the duration of the session.

OK. How do we maintain a continuous connection to each client for the duration of the session?

If you try to adapt a traditional server to the Comet methodology, it may not scale and often fails after a few thousand simultaneously open connections. A true Comet implementation requires a very different kind of server architecture to be efficient and scalable. One example is Liberator, a solid Comet server that is used by the financial industry. However, it is written in C and is not open source, although a FREE edition is distributed.

To understand this statement a little better, we need to know how traditional web containers handle requests. They follow a one-thread-per-request model.

  1. The client, typically a browser, sends a request for a resource to a web server.
  2. The server has a listening thread that keeps track of incoming connections.
  3. When a request arrives, the server uses one process or thread to process the request.
  4. The resource is returned to the client and the connection is closed.

In this model, the number of requests that can be served in a second depends on two things:

  1. How many threads are there to handle the client requests
  2. How long it takes to serve one request.

If all server threads are busy, incoming requests are put in a queue. The server returns to the queued requests when threads become free. The number of requests handled per second can be much greater than the number of allowed simultaneous connections, because the time required to process a request is very short. In other words, you can serve more requests in a second than you have threads.
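
A quick back-of-the-envelope calculation shows why. This is a simplified model that assumes every request takes the same time and that the threads are always kept busy. The numbers are made up.

```python
def max_requests_per_second(num_threads, ms_per_request):
    """Upper bound on throughput when each thread serves one request at a time."""
    if ms_per_request <= 0:
        raise ValueError("service time must be positive")
    return num_threads * 1000 / ms_per_request


# 200 worker threads, 50 ms to serve one request.
print(max_requests_per_second(200, 50))      # short requests
# The same pool when each "request" is a connection held open for 30 seconds.
print(max_requests_per_second(200, 30_000))  # long-held connections
```

The first case gives 4000 requests per second from only 200 threads. The second case drops to under 7 per second, which is the problem that long-lived connections create in this model.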

However, there is one breed of applications that needs to hold on to its connections. Think of applications that require real-time data coming to clients (stock tickers) or applications where low latency is required. In the traditional web model above, the browser has to reconnect to get the new data (polling). If new updates can happen with high frequency (e.g. a chat application), then the polling frequency also has to increase. An alternative to high-frequency polling is a push-based application. Once the browser connects, the server keeps the connection open until the browser times out (the server response stream is not closed) and keeps flushing data down the connection as it becomes available.

In a servlet container, holding the connection means your thread cannot exit the service method. Otherwise, the response stream will be closed. So you block the thread on some condition within the service method. When push data becomes available, the thread writes to the response stream and enters the blocked state again. As long as you hold on to the connection, you cannot return this thread to the thread pool. And as more and more “push” connections are established, you run out of threads! To remedy the problem, the possible solutions are:

  1. Increase the number of server threads.
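
The thread starvation described above can be reproduced in a few lines. The sketch below is a simulation with a plain thread pool, not servlet code: each "push connection" is a task that parks its worker thread until it is released, and the function name is made up for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def count_connected_clients(pool_size, clients):
    """Return how many clients hold a worker while every connection stays open."""
    release = threading.Event()        # set when the connections may close
    started = threading.Semaphore(0)   # signalled once per client that got a thread
    connected = []

    def hold_connection(client_id):
        connected.append(client_id)
        started.release()
        release.wait()                 # block the worker, like a held push connection

    expected = min(pool_size, clients)
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for client_id in range(clients):
            pool.submit(hold_connection, client_id)
        for _ in range(expected):      # wait until every available worker is parked
            started.acquire()
        served = len(connected)        # the remaining clients are stuck in the queue
        release.set()                  # let everything finish so the pool can shut down
    return served


print(count_connected_clients(pool_size=2, clients=5))
```

With two workers and five clients, only two clients are connected while the other three wait in the queue. Raising the pool size moves the limit but does not remove it.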

Flex Push

There is some confusion about whether BlazeDS supports real-time messaging. Yes, it does :wink:. In fact, BlazeDS has a full spectrum of channel types ranging from simple polling, to near-real-time polling, to real-time streaming.

  1. Simple polling – the Flex client pings the server using the traditional request and response model.
  2. Near-real-time polling (long polling) – Instead of acknowledging right away, the server holds the polling request until there’s a message for the client. This ensures that messages are delivered to the client as soon as they become available. The caveat of long polling is the thread limitation in most application servers. At this moment, BlazeDS cannot support more than a few hundred long-polling clients on most application servers. However, this problem could be resolved once servers like Tomcat start to support asynchronous, non-blocking connection threads. Update: Tomcat 6 now supports NIO.
  3. Real-time streaming – BlazeDS supports real-time message streaming over AMF and HTTP. Unlike long polling, which closes and reopens the connection upon receiving a message, streaming keeps the connection open at all times. Streaming suffers from the same thread-blocking issue as long polling. A cap must be set so that the server is not hung by idle threads.

The reason people are confused is that Adobe has not released its proprietary push solution, RTMP, to BlazeDS. So, RTMP isn’t available as a channel in the BlazeDS configuration files. BlazeDS lives in a servlet container and is hence constrained by the one-thread-per-connection limit, whereas LCDS has NIO-based channels that can scale up to thousands of requests. On the other hand, BlazeDS has the advantage that it works over port 80/443, whereas LCDS uses a separate port for persistent connections, which requires firewall configuration. Once the servlet that implements BlazeDS is revved to support Comet Events under Tomcat 6, and then Jetty Continuations, the long-polling technique will be fine.

UPDATE: We are waiting for a solution that supports Comet Events under Tomcat 6. Then BlazeDS can be coupled to the Tomcat NIO HTTP listener and scale as well as any NIO-based server software.

I have learnt from this article that you can create a channel set on the client side, so Flex can fail over to other channels until it gets connected or the list is exhausted.
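
The fail-over behaviour of a channel set boils down to "try each channel in order until one connects". The sketch below illustrates that logic only. It is not the Flex ChannelSet API, and the channel names and connect functions are hypothetical.

```python
def connect_with_failover(channels):
    """Try (name, connect) pairs in order; return the first that succeeds.

    connect is a zero-argument callable that returns a connection object
    or raises ConnectionError.
    """
    failed = []
    for name, connect in channels:
        try:
            return name, connect()
        except ConnectionError:
            failed.append(name)  # remember the failure and move on
    raise ConnectionError("all channels failed: " + ", ".join(failed))


def streaming_blocked():
    raise ConnectionError("streaming blocked by a proxy")


# Preferred channel first, most conservative channel last.
channel_set = [
    ("streaming", streaming_blocked),
    ("long-polling", lambda: "long-polling connection"),
    ("polling", lambda: "polling connection"),
]
used, connection = connect_with_failover(channel_set)
print(used)
```

Ordering the list from the most real-time channel to plain polling means a client behind a restrictive proxy still ends up connected, just with higher latency.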

Marc has put an effort to build a better data grid like a spreadsheet in Flex. (check this out)

Reference

Here are the references I used for this article

  1. Tuning Apache and Tomcat for Web 2.0 comet application
  2. Performance of Grids for Streaming Data – This shows you the performance numbers for various front-end technologies. Again, Flex shows a good result.
  3. Are raining comets and threads? – Comet Daily
  4. Comet & Java: Threaded Vs Nonblocking IO
  5. JDK 1.6 uses epoll to implement NIO
  6. BlazeDS dev guide
  7. Achieve performance breakthrough using BlazeDS – Farata Systems put in the effort to write its own NIO channel that runs on Jetty 7 and got promising results.

 


Reporting solution!

Open source reporting

My company needs a reporting engine but doesn’t want to go for an expensive commercial one like MicroStrategy. In fact, I don’t know why we would need to pay so much when there are tools out there for FREE. As usual, I googled the Net and found two seemingly promising open source reporting solutions.

  1. Pentaho Reporting
  2. Jasper Reporting

Both of them are bundled with a suite of tools for OLAP, data mining, ETL, etc. For me, I just want a non-invasive reporting engine that can easily be integrated into our architecture. To my dismay, I found out Pentaho doesn’t go this route. It basically gives you a preconfigured reporting server. You build your reports and deploy them following the manual. However, I have hardly ever seen a reporting solution that could satisfy all the business requirements without customization. All I expected from Pentaho was a jar file with documentation that shows me how to use its API to generate reports in different formats and how to integrate with our database. I attempted to look into the code and extract the parts I wanted from Pentaho. However, I found out the engine is actually not that powerful. Once you strip out the workflow part, it is basically a simple SQL executor that renders the result according to the UI info embedded in the report definition. What is wrong with that?

  1. We want to handle pagination and data streaming because our data volume is huge. In Pentaho, you need to take care of these yourself. So, you write your own SQL, paginate it yourself, and stream it yourself if the result set is huge. Isn’t that what we are already doing without it? Apart from that, each report in Pentaho needs a report definition. It supports dynamic SQL via token replacement. That is too primitive: I want control flow, because I may decide which tables to join based on the input filters.
  2. On the UI side, Pentaho helps you render your result into graphs, tables, etc. Again, I don’t like this UI solution either. I found that JFreeChart is not as powerful as the Flex solution. I am adopting Flex, and it gives me much more powerful visualization tools. All I want is to ship the data from my query’s result to my Flex app.

How about Jasper? Pretty much the same, but the good thing about Jasper is that it gives you the jar and the documentation on how to use it, instead of a reporting server like Pentaho. So, I can use it as a report renderer to generate PDF and Excel, like the other utility libraries I use. So, what is my final solution?

I finally decided to create my own report definition that my Flex UI can take and render into the reporting interface. So, I don’t need to create a form for each report. Apart from that, my report definition has an iBatis SQL template embedded. So, I can leverage its dynamic SQL feature, which supports control flow logic, and its automatic population of result classes. Yes, I still need to handle pagination and streaming myself. But, at least, it already saves me time. The populated result objects are returned to Flex via AMF. So, I don’t need to marshal and unmarshal them in XML. That saves processing time and costs less bandwidth. In the end, my solution combines the best on the market:
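
To show what "dynamic SQL with control flow" plus hand-rolled pagination means in practice, here is a minimal sketch in Python. It is not iBatis syntax. The `orders` and `customers` tables, the filter names and the function name are all invented for the example, and `LIMIT ... OFFSET` assumes a database dialect that supports it.

```python
def build_report_query(filters, page, page_size):
    """Return (sql, params) for one page of a hypothetical orders report."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")

    sql = ["SELECT o.id, o.total FROM orders o"]
    where, params = [], []

    # Control flow: only join the customers table when a filter needs it.
    if "country" in filters:
        sql.append("JOIN customers c ON c.id = o.customer_id")
        where.append("c.country = ?")
        params.append(filters["country"])
    if "min_total" in filters:
        where.append("o.total >= ?")
        params.append(filters["min_total"])

    if where:
        sql.append("WHERE " + " AND ".join(where))

    # Pagination: a stable order plus a window over the result set.
    sql.append("ORDER BY o.id LIMIT ? OFFSET ?")
    params.extend([page_size, (page - 1) * page_size])
    return " ".join(sql), params


plain_sql, plain_params = build_report_query({}, page=1, page_size=50)
joined_sql, joined_params = build_report_query({"country": "US"}, page=3, page_size=50)
print(joined_sql, joined_params)
```

Values are always passed as bound parameters, never pasted into the SQL string, which is also what separates a dynamic SQL template from plain token replacement.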

  1. Powerful reporting widgets provided by Flex
  2. Fast streaming and RPC protocol – AMF
  3. Good dynamic sql generation and mapping tool from iBatis
  4. Good reporting rendering tool from Jasper that helps me to do PDF and Excel generation

My solution is more flexible, as I can plug in a Hibernate mapping if I don’t want to write my own SQL at all. Apart from that, no UI work is needed to deploy a new report unless my generic reporting interface is not enough.

Later, if I really need the workflow engine provided by Pentaho, I can plug it in. Again, though, the documentation provided doesn’t give clear instructions or APIs for how to do it.

Reference

Below are references I used to build my solution:

  1. Flexible reporting with JasperReport and iBatis
  2. How Kodo JPA handles large result sets (its optimization guide is a good reference even if you don’t use Kodo)
  3. Process Large Result Sets in Java Web Application
  4. Streaming architecture

 


Linux System Overview – File System

Linux File System Basics

Ext3 (the successor of Ext2) is the standard file system for Linux: it is robust, fast and suitable for all fields of use. The main difference between the two is that Ext3 has a journal that records pending operations for fast recovery in the event of a system crash. This record guarantees a consistent file system at all times and reduces the time needed for checking a file system from several hours to a few seconds, because instead of checking the entire disk, the system can check just those areas noted in the journal as having pending operations.

Like all decent Unix file systems, Ext3 uses three general data structures: directories, inodes and data blocks. Directories only contain file names and the inode numbers assigned to them. Each file has one inode that contains a list of the addresses of its data blocks. The list is needed because a file’s content is normally not stored in contiguous disk blocks: files are constantly added and deleted and their sizes change (i.e. external fragmentation). If the file content is scattered, it takes longer to retrieve because the disk head physically has to seek more often.

http://www.heise-online.co.uk/images/110398/0/1

Under the hood, each disk block can span multiple disk sectors, and each sector is 512 bytes in size. A disk sector is the smallest addressable unit on a hard disk. Ext3 uses block sizes of 1024, 2048 or 4096 bytes. In theory, Ext3 supports block sizes up to 64 KB, but on x86 and x64 architectures, 4 KB is the maximum: this block size corresponds to that of the kernel’s memory pages in RAM, which makes paging easier for the operating system. Ext3 uses 32-bit values (4 bytes, like an int in Java) for block numbers, which means that it can only address about four billion blocks: 4 TB at a block size of 1024 bytes, 16 TB at 4096 bytes. So, a larger block size allows you to create a larger file system. On the other hand, large blocks can waste a lot of disk space because files always use whole blocks even if they only contain a few bytes: on average, every file wastes half a block. The larger the blocks and the smaller the files, the more noticeable the effect is. This effect is called internal fragmentation.
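
The numbers in the paragraph above follow directly from the 32-bit block numbers. Here is the arithmetic as a small sketch. The one-million-files figure is just an example I picked.

```python
BLOCK_NUMBER_BITS = 32
TIB = 2 ** 40
GIB = 2 ** 30


def max_filesystem_bytes(block_size):
    """Largest file system addressable with 32-bit block numbers."""
    return (2 ** BLOCK_NUMBER_BITS) * block_size


def average_wasted_bytes(num_files, block_size):
    """Expected internal fragmentation: half a block per file on average."""
    return num_files * block_size // 2


for block_size in (1024, 2048, 4096):
    print(block_size, max_filesystem_bytes(block_size) // TIB, "TiB")

# One million small files on 4 KB blocks waste close to 2 GiB on average.
print(average_wasted_bytes(1_000_000, 4096) / GIB)
```

So the block size is a trade-off: 4 KB blocks quadruple the addressable size compared with 1 KB blocks, and they also quadruple the average waste per file.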

Optimization with sacrifice

An efficient file system needs to quickly find the data belonging to a file name. For example, for the filename “abc.txt”, the OS needs to traverse a list of directory entries before the inode of the file is located (how many depends on the depth of the folder hierarchy) and then follow the data block pointers to retrieve the content. To optimize the speed, Ext3 writes the inodes into static tables on the disk during formatting. One consequence of this is that the number of inodes can’t be altered after the file system has been set up. As every file needs to be assigned one specific inode, there can’t be more files than inodes. This does not scale to very large numbers of files. By default, mke2fs creates one inode for every 4 KB in file systems up to 512 MB, otherwise one inode for every 8 KB. Although you can tune this ratio to increase the number of inodes, it can only be set at setup time (it is not dynamic). By the way, each inode itself consumes 128 bytes.
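
A quick calculation shows how the bytes-per-inode ratio fixes the maximum number of files and how much space the inode tables take. The file system sizes below are examples I chose. Real file systems reserve some space for other structures, so treat the results as approximations.

```python
INODE_SIZE = 128  # bytes per inode, as stated above
MIB = 2 ** 20
GIB = 2 ** 30


def inode_count(filesystem_bytes, bytes_per_inode):
    """Number of inodes created at format time for a given ratio."""
    return filesystem_bytes // bytes_per_inode


def inode_table_bytes(count):
    """Disk space consumed by the static inode tables."""
    return count * INODE_SIZE


# A 1 GiB file system with one inode per 8 KB and per 4 KB.
for ratio in (8192, 4096):
    count = inode_count(1 * GIB, ratio)
    print(ratio, count, inode_table_bytes(count) // MIB, "MiB")
```

Halving the ratio doubles both the maximum number of files and the space spent on inode tables, and the choice is locked in once the file system is created.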

Handle large size file

How is it possible to fit the millions of data block numbers required for gigabyte-sized files into a static data structure of 128 bytes? It isn’t. One Ext3 inode stores exactly 15 block numbers. The first twelve point directly to data blocks, block 13 points to a data block containing block numbers (indirectly addressed blocks), block 14 to a block pointing to blocks with block numbers (double indirect), and block 15 points to triple indirect blocks. Therefore, at a block size of 4 KB (that is, 1024 block numbers of 4 bytes each per indirect block), one inode can handle 12 + 1024 + 1024^2 + 1024^3, around a billion block numbers. The resulting maximum file size is just over 4 TB.
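
The same sum can be checked in code. This sketch only covers the pointer arithmetic described above, so it shows what the addressing scheme allows. Other limits in a real implementation are not modelled.

```python
TIB = 2 ** 40


def max_file_blocks(block_size, pointer_size=4, direct_pointers=12):
    """Blocks reachable from one inode: direct + single + double + triple indirect."""
    pointers_per_block = block_size // pointer_size
    return (direct_pointers
            + pointers_per_block
            + pointers_per_block ** 2
            + pointers_per_block ** 3)


blocks = max_file_blocks(4096)
print(blocks)                 # about a billion block numbers
print(blocks * 4096 / TIB)    # just over 4 TiB
```

Almost all of the capacity comes from the triple indirect block. The twelve direct pointers matter for speed instead, since small files can be read without touching any indirect block.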

Power of B-Tree Indexing

Now you know how an inode uses hierarchical pointers to handle large files. However, if a directory has tons of files, how do directory entries make it efficient to locate the inode? Ext2 originally stored the file names within a directory as a linked list. While this is an elegantly simple data structure, it has the disadvantage that operations take longer and longer as the number of entries grows. Ext3 can manage directories in a B-tree-like structure if dir_index is set (not the default). This drastically speeds up directory operations. Performance loss is then only experienced when directories are filled with hundreds of thousands of files. This is usually caused by a caching effect, as the Linux kernel doesn’t use unlimited memory for caching directory structures even if you add more memory.

$ sudo tune2fs -O dir_index /dev/hda1

Run the above command as root. Do note that the index will take up more space, but then hard disk space is not too expensive nowadays. If you don’t want to tweak the OS default settings but still want to store a large number of files, you can restructure the directory so that it does not contain that many files. Without doing this, on a default (untuned) Ext3 partition, each subsequent write degrades horribly past about 2000 files in a directory. So, keeping each directory to within 2000 files should be fine. If you want to go this route, here are some approaches to restructuring your folders:

  1. Date based – YEAR > MONTH > DATE > HOUR, if your files are uniformly distributed across time.
  2. Hash based – break the hash down into several parts and use them as folder names (check this)
  3. Id based – reverse the id and break it down into groups of 2 digits to make sure the files are uniformly distributed
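
The three layouts can be sketched as small path-building functions. The function names, the choice of SHA-256 and the two-level depth are assumptions made for illustration. Any stable hash and any depth that keeps directories small would do.

```python
import hashlib
from datetime import datetime


def date_path(created, name):
    """YEAR/MONTH/DATE/HOUR layout."""
    return created.strftime("%Y/%m/%d/%H") + "/" + name


def hash_path(name, levels=2, width=2):
    """Use the leading hex digits of a hash of the name as folder names."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join(parts + [name])


def id_path(file_id, levels=2, width=2):
    """Reverse the id so the fast-changing low digits pick the top folder."""
    digits = str(file_id)[::-1].ljust(levels * width, "0")
    parts = [digits[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join(parts + [str(file_id)])


print(date_path(datetime(2009, 1, 15, 13), "access.log"))
print(hash_path("abc.txt"))
print(id_path(123456))
```

All three are deterministic, so the path can be recomputed later from the file's metadata (timestamp, name or id) without keeping a separate lookup table.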

NOTE: I don’t want to use random number here as I want to locate the file via its metadata later.

Alternative solution for large number of files in a directory

ReiserFS can handle up to 2^31 files per directory (that’s 2 billion), with a max of 2^32 (4 billion) files on the whole file system. It can handle up to 64000 subdirectories in a directory. Ext3 has a limit of 32000 subdirectories per directory. The max number of files per directory is theoretically unlimited (actually around 130 trillion), but performance becomes terrible above 10-15 thousand files. The max number of total files on the file system is limited by the number of inodes you have. With a 1 GB file system and one inode per 4 KB (the default), you have around 260000 inodes, and that’s also the max number of files you can have.

Reference

Here are some good references

  1. The Unix and Internet Fundamental How to – Eric Raymond
  2. Tuning Linux file system – Ext3 by Oliver Diedrich
  3. Handling large number of files in a directory – Roopinder Singh
  4. Super fast Ext4 filesystem arrives in Ubuntu 9.04 – If the benchmark is correct, it dramatically outperforms all the other file systems available today.
  5. Introduction to Linux file systems and files
  6. Extreme performance monitoring and tuning in Linux
  7. Simple Help with simple answer - simplehelp.net

Flex Annotated Charting

Recently, I wanted to extend the LineChart in Flex. I wanted a line chart annotated with events, like Google Finance.

First of all, I googled the Net to see whether anyone had already done it. It would be even better if I could find an open source project related to this. Below are the interesting things I found:
  1. Dow Jones Interactive Chart (commercial – it is exactly what I am looking for)
  2. Interactive Bubble Chart (open – although it is not exactly what I want, I believe the code can benefit me if I need to customize the line chart. :cool: I may just need to draw small interactive bubbles on the line to get my job done!)
  3. This demo gives you tons of chart samples. They are all great examples, although none of them satisfies my current need.
  4. This demo is close to what I want. From this demo, I noticed I can use “annotationElement” to draw on top of the data series. However, the trick is to convert the data points to pixel coordinates so that I can draw something that moves along with the graph even when someone stretches it. To make things easier, Ely Greenfield has created DataDrawingCanvas, which helps us draw on the chart by specifying data points instead of pixel coordinates. This class extends ChartElement like AnnotationElement does (blog). That is amazing!! Thanks!!! :smile:
  5. Google Finance Chart (It is exactly what I want. I wonder whether I can get the source for it)
    • I have found the blog and powerpoint of this sample (1/7/2009)
    • Google uses the Flash/JavaScript Integration Kit to get it to work (blog) – I heard that it is a very nice combination of Flash and AJAX. This is similar to MeasureMap’s use of the kit.
    • It is an open source example!!! (code). Thanks to Brendan Meutzner!!
    • Brendan also shows us how he created his demo in 5 steps to help us understand how to build it ourselves.
    • Step 1, Step 2, Step 3, Step 4, Step 5 – enjoy!!
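
The data-to-pixel conversion mentioned in item 4 is a linear mapping. The sketch below shows the idea in plain Python. It is not the Flex charting API, the function name and argument layout are made up, and it assumes linear axes with the pixel origin at the top-left corner of the plot area.

```python
def data_to_pixel(x, y, data_bounds, plot_size):
    """Map a data point to pixel coordinates inside the plot area.

    data_bounds: ((x_min, x_max), (y_min, y_max)) of the visible axes.
    plot_size:   (width, height) of the plot area in pixels.
    """
    (x_min, x_max), (y_min, y_max) = data_bounds
    width, height = plot_size
    px = (x - x_min) / (x_max - x_min) * width
    # Screen y grows downward, so flip the vertical axis.
    py = height - (y - y_min) / (y_max - y_min) * height
    return px, py


bounds = ((0, 10), (0, 100))
normal = data_to_pixel(5, 50, bounds, (400, 200))
stretched = data_to_pixel(5, 50, bounds, (800, 200))
print(normal, stretched)
```

Because the annotation is stored as a data point and converted on every redraw, it stays attached to the same spot on the line when the chart is resized.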

Reference

Useful resources:

  1. Data Visualization by Tom Gonzalez. (Tom created an open source visualization framework named Axiis. It looks great. Once I get a chance, I will dig into it) – 7/31/2009
  2. Building a Flex Component by Ely Greenfield
  3. Create component and enforce separation of concerns
  4. http://www.edwardtufte.com/tufte/ (Edward Tufte – famous guy in data visualization)
  5. http://www.insideria.com/2008/03/image-manipulation-in-flex.html (Image Manipulation)

 
