Tag Archives: rpc

Evolution of XML parsing technologies

Introduction

A few years ago there were two main XML parsing technologies: SAX and DOM.

  1. SAX is event-driven: events are fired and forgotten as the XML is parsed. Advantages: it doesn’t need to hold the whole XML document in memory, and the first event is emitted without waiting for the whole document to be parsed. Disadvantages: it uses a push API in which the parser keeps control during parsing, so clients cannot steer the parse, and it is a poor fit for XML manipulation.
  2. DOM converts the XML into an object tree in memory before any manipulation. Advantages: the XML is easier to manipulate. Disadvantages: it uses a lot of memory, which makes it a poor choice for documents larger than a few MB or for memory-constrained environments such as J2ME.
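To make the contrast concrete, here is a minimal sketch that uses only the JDK's built-in SAX and DOM parsers. The class name `SaxVsDom`, the helper methods and the sample `<order>` document are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxVsDom {
    // Hypothetical sample document, used only for this illustration.
    static final String XML = "<order><item>pen</item><item>ink</item></order>";

    // SAX: the parser pushes events into the handler; nothing is kept
    // in memory except what the handler chooses to record.
    static List<String> saxElementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    // DOM: the whole document is turned into a tree first, then queried.
    static int domItemCount(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("item").getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(saxElementNames(XML));
        System.out.println(domItemCount(XML));
    }
}
```

Note that the SAX handler has no way to tell the parser to pause; it can only react to whatever event arrives next.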

A pull API is a more comfortable alternative for stream processing of XML. It is based on the familiar iterator design pattern rather than the observer design pattern. In a pull API, the client program asks the parser for the next piece of information, rather than the parser telling the client program when the next datum is available. In a pull API the client program drives the parser; in a push API the parser drives the client. This idea led to the invention of StAX.
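The iterator style described above can be sketched with the StAX reader that ships with the JDK (`javax.xml.stream`). The class name `PullFirstItem`, the method `firstItem` and the sample document are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class PullFirstItem {
    // Returns the text of the first <item> element, or null if there is none.
    // The client drives the parser: it calls next() only while it still
    // needs data, and simply stops asking once it has what it wants.
    static String firstItem(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document.
        System.out.println(firstItem("<order><item>pen</item><item>ink</item></order>"));
    }
}
```

Because the loop returns as soon as the first match is found, the rest of the document is never read, which is exactly what a push parser cannot offer.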

In this article, I will introduce a new object model from Axis2 named AXIOM, which uses StAX underneath for XML parsing. With it, XML parsing uses less memory and gives the client better control.

Evolution of Axis

One of the first-generation SOAP engines, Apache SOAP, uses a DOM-based object model internally to represent the XML document, and its XML handling forces the entire object model to be built at once. The second-generation Apache Axis shifted to SAX to avoid keeping the complete information in memory. SAX, however, has a major constraint: it is built around a "push" technique, and once parsing of the XML document starts it cannot be stopped. To get over this hurdle, Apache Axis has to record the SAX events. So, effectively, the XML message still has to be kept in memory in the form of SAX events, which makes Apache Axis yet another memory-intensive programming model.

Axis2 avoids keeping the complete SOAP message in memory by introducing a new object model for representing the SOAP message: AXIOM. AXIOM takes a dramatically new approach. Although AXIOM has an "external" resemblance to DOM, the difference is that it generates objects only when they are required. This "on-demand building" feature gives AXIOM the edge needed to overcome the memory barrier that early SOAP engines failed to pass.
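The idea of on-demand building can be illustrated without AXIOM itself. The sketch below is not AXIOM's API; `LazyItems`, `get` and `builtCount` are hypothetical names, and the class only handles flat `<item>` elements. It wraps a JDK StAX reader and advances the stream only as far as the caller's request requires, caching what it has already built.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical illustration of on-demand building; not AXIOM's API. */
public class LazyItems {
    private final XMLStreamReader reader;
    // Items materialised so far; grows only when a caller asks for more.
    private final List<String> built = new ArrayList<>();

    LazyItems(String xml) throws XMLStreamException {
        reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
    }

    /** Returns the text of the i-th item, or null if the document has fewer items. */
    String get(int i) throws XMLStreamException {
        // Pull from the stream only until item i has been built.
        while (built.size() <= i && reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "item".equals(reader.getLocalName())) {
                built.add(reader.getElementText());
            }
        }
        return i < built.size() ? built.get(i) : null;
    }

    /** Number of items built so far. */
    int builtCount() {
        return built.size();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical sample document.
        LazyItems items = new LazyItems(
                "<order><item>pen</item><item>ink</item><item>pad</item></order>");
        System.out.println(items.get(0) + " built=" + items.builtCount());
        System.out.println(items.get(2) + " built=" + items.builtCount());
    }
}
```

Asking for the first item builds one object and leaves the rest of the document unread; only a later request for a further item moves the parser forward.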

An interesting feature of AXIOM is that it is based on pull parsing. It is capable of generating pull events from the object model that has been built. Further, if the object model happens to be only half built, AXIOM is capable of shifting to the underlying pull parser to generate pull events directly from the stream. The heart of AXIOM is the XML pull parser, since it is the only parsing model that supports pausing the parsing process. AXIOM uses the Streaming API for XML (StAX), which keeps it easy to manipulate while using only a fraction of the memory used by a conventional object model. Combined with the speed of the streaming pull parser, this puts Axis2 well ahead of its predecessors in terms of efficiency and speed.

Apart from the new parser, Axis2 also has other new features:

  1. Pluggable data binding – you can pick and choose among JAXB, Castor and XMLBeans for XML-to-Java conversion.
  2. Improved support for message-style interaction (RPC vs. message-based).
  3. Improved handlers.

This article focuses on parsing technology, so I will not discuss the other new features of Axis2 in detail. If you want to find out more, read this.

 

Reference

An Introduction to StAX

Fast and lightweight object model for XML

 

 


Why Flex for RIA, not AJAX?

Here is the list of reasons why I chose Flex for RIA development.

  1. Write Once, Deploy Everywhere – Flex generates SWF files that run on the Flash Player VM and behave consistently across different browsers, and later even on mobile phones. With this, browser compatibility issues are basically offloaded to Adobe.
  2. Solid programming model with rich widgets and libraries.
  3. AMF makes communication between Flex objects and Java POJOs possible, with no need for verbose XML. Check out BlazeDS.
  4. The Flex IDE is an Eclipse plugin that provides step-by-step debugging, a UI design console, code completion and more. Working with ActionScript feels like working with Java.
  5. The Flex SDK is open source and free.
  6. Great support for video streaming.
  7. Integrates with HTML, JavaScript and CSS, so adoption is not invasive.
  8. Supports offline applications via AIR – Adobe has been working on the Adobe Integrated Runtime (AIR), which allows existing web application development skills to be used to build and deploy desktop applications. AIR is still in early development, but promises to let developers use their newly learned Flex skills to build desktop applications. No need to learn Swing, applets, etc.
  9. Provides several RPC methods such as HTTPService, WebService, AMF and JSON. AMF is 10x faster than SOAP. James Ward developed his Census Flex application to provide performance benchmarks for the different RPC methods in the mainstream RIA technologies. (Download)
  10. You can keep the state in the Flex app and make your server completely stateless.
  11. More to come! :)

 
