Posted by admin on July 16, 2008
Introduction A few years ago there were two main XML parsing technologies: SAX and DOM. SAX is event-driven: events are fired as the XML is parsed and then forgotten. Advantages: it doesn't need to cache the whole XML document in memory, and you don't need to wait until the whole XML has been parsed before [...]
Posted by admin on July 11, 2008
Salesforce + Google It is good news that Salesforce.com has made the Google Data API available on its platform. To better understand the full potential of the new platform, I googled around to see whether anyone had written about it. Here is the first article I found that covers some use cases on [...]
Posted by admin on July 6, 2008
Introduction of Solr Solr is a standalone enterprise search server with a web-services-like API. You put documents in it (called "indexing") via XML over HTTP (RESTful). You query it via HTTP GET and receive XML results. Advanced Full-Text Search Capabilities; Optimized for High Volume Web Traffic; Standards Based Open Interfaces – XML and HTTP [...]
Posted by admin on July 6, 2008
Introduction of Nutch & Hadoop After Lucene, its author created another powerful tool called Nutch. Nutch is a crawler built on top of Lucene. With Nutch, you can launch a multi-threaded crawler to gather information from the web. At the time of writing, Nutch is at version 0.9. Nutch comes [...]
Posted by admin on July 4, 2008
Introduction of Lucene I have heard of Lucene and its powerful full-text search capability many times. Today, I decided to take a look at it. Before diving into the user guide, I went to Google Tech Talks to find a video about Lucene. Here is what I found: After I finished [...]
Posted by admin on July 3, 2008
Introduction from Cameron Purdy