Monthly Archives: July 2008

Evolution of XML parsing technologies

Introduction A few years ago there were two main XML parsing technologies: SAX and DOM. SAX is event-driven: events are fired and forgotten as the XML is parsed. Advantages: It doesn’t need to cache the whole XML document in memory, and you don’t need to wait until the whole XML has been parsed before [...]
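To make the event-driven idea concrete, here is a minimal sketch using Python's standard xml.sax module. The handler name TitleCollector, the helper collect_titles and the sample document are hypothetical, made up for illustration; the point is only that the parser calls back as it goes and never builds the whole tree in memory.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Hypothetical handler: collects the text of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Fired when an opening tag is read.
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        # Fired when a closing tag is read; the element is then forgotten.
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def collect_titles(xml_text):
    handler = TitleCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


# Made-up sample input.
SAMPLE = "<posts><post><title>SAX</title></post><post><title>DOM</title></post></posts>"

if __name__ == "__main__":
    print(collect_titles(SAMPLE))
```

Each title becomes available as soon as its closing tag is seen, which is the "no need to wait" advantage described above.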

Salesforce.com opens up Google Data API

Salesforce + Google It is good news that Salesforce.com has made the Google Data API available on its platform. To better understand the full potential of the new platform, I googled around to see whether anyone had talked about it. Here is the first article I found that covers some use cases on [...]

Powerful Full Text Search – Part 3 Solr

Introduction of Solr Solr is a standalone enterprise search server with a web-services-like API. You put documents in it (called "indexing") via XML over HTTP (RESTful). You query it via HTTP GET and receive XML results. Advanced Full-Text Search Capabilities; Optimized for High Volume Web Traffic; Standards Based Open Interfaces – XML and HTTP [...]
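As a rough sketch of the "XML over HTTP" idea, the Python snippet below builds an add-document message and a query URL without contacting any server. The base URL, the field names id and title, and the helper names are assumptions for illustration only; in a typical setup the XML would be POSTed to the server's update handler and the URL fetched with an HTTP GET.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET

# Assumed location of a local Solr instance; adjust for your own install.
SOLR_BASE = "http://localhost:8983/solr"


def build_add_xml(doc):
    """Build an <add><doc>...</doc></add> message from a dict of fields."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_query_url(query, rows=10):
    """Build the GET URL for a search; the response would come back as XML."""
    return SOLR_BASE + "/select?" + urlencode({"q": query, "rows": rows})


if __name__ == "__main__":
    # Hypothetical document with made-up field names.
    print(build_add_xml({"id": "1", "title": "Solr"}))
    print(build_query_url("title:solr"))
```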

Powerful Full Text Search – Part 2 Nutch

Introduction of Nutch & Hadoop After Lucene, the author created another powerful tool called Nutch. Nutch is a powerful crawler built on top of Lucene. With Nutch, you can launch a multi-threaded crawler to obtain information from the Net. At the time of writing, Nutch is at version 0.9. Nutch comes [...]

Powerful Full Text Search Engine – Part 1 Lucene Introduction

Introduction of Lucene I have heard of Lucene and its powerful full text search capability many times. Today, I decided to take a look at it. Before diving into the user guide, I went to Google Tech Talks to find a video related to Lucene first. Here is what I found: After I finished [...]

Grid Computing – Part 1 Introduction

Introduction from Cameron Purdy