Posted by admin on November 18, 2009
Starting to learn Hive
As I mentioned in my last article, I was getting excited about the potential of Hive. Today, I decided to start my journey of learning it. I found a great introductory video that gives you a nice warm-up on using Hive (a basic knowledge of how Hadoop and MapReduce work would [...]
Posted by admin on June 26, 2008
Getting into the Business Intelligence World
When I dug deeper into business intelligence, I found that it is a huge topic, ranging from reporting to data mining. As with any knowledge-acquisition plan, I set a series of milestones for myself. If you are interested, here is my list: Get and prepare your data Data [...]
Posted by admin on February 19, 2008
Create 2 tables, Item(id) and Item_log(item_id, price), and populate them:
insert into item(id) values(1);
insert into item(id) values(2);
insert into item(id) values(3);
insert into item(id) values(4);
insert into item_log(item_id, price) values(1, 100);
insert into item_log(item_id, price) values(1, 100);
insert into item_log(item_id, price) values(1, 100);
insert into item_log(item_id, price) values(1, 200);
insert into item_log(item_id, price) values(1, 200);
[...]
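The excerpt names the two tables but does not show how they are created. Here is a minimal sketch of the setup, assuming SQLite (through Python's built-in sqlite3 module) and INTEGER columns. The column types and the helper name build_sample_db are assumptions, not taken from the post, and only the insert statements visible in the excerpt are included.

```python
import sqlite3

# Assumption: INTEGER columns; the post gives only table and column names.
SCHEMA = """
CREATE TABLE item (id INTEGER);
CREATE TABLE item_log (item_id INTEGER, price INTEGER);
"""

# Only the inserts visible in the excerpt; the rest are elided in the source.
INSERTS = """
INSERT INTO item(id) VALUES (1);
INSERT INTO item(id) VALUES (2);
INSERT INTO item(id) VALUES (3);
INSERT INTO item(id) VALUES (4);
INSERT INTO item_log(item_id, price) VALUES (1, 100);
INSERT INTO item_log(item_id, price) VALUES (1, 100);
INSERT INTO item_log(item_id, price) VALUES (1, 100);
INSERT INTO item_log(item_id, price) VALUES (1, 200);
INSERT INTO item_log(item_id, price) VALUES (1, 200);
"""


def build_sample_db():
    """Create an in-memory database holding the two sample tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executescript(INSERTS)
    return conn


conn = build_sample_db()
item_count = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
log_rows = conn.execute(
    "SELECT item_id, price FROM item_log ORDER BY price"
).fetchall()
print(item_count, log_rows)
```

An in-memory database keeps the experiment disposable; the same statements should work against most SQL databases after adjusting the column types.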
Posted by admin on June 22, 2007
To build a data warehouse, you use the techniques of dimensional modeling. Here are some guidelines you can follow: Divide the world into measurements and context. Numeric measurements are placed in the fact table, whereas context is broken down into dimensions. A fact table in a pure star schema consists of multiple foreign keys, each paired with [...]
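To make the split between measurements and context concrete, here is a small star-schema sketch, again assuming SQLite through Python's sqlite3 module. The table names (dim_product, dim_date, fact_sales), the columns, and the sample rows are all hypothetical and only illustrate the guidelines above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Context lives in dimension tables (hypothetical names and columns).
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER);

-- Numeric measurements live in the fact table, next to one
-- foreign key per dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    quantity    INTEGER,
    amount      INTEGER
);

INSERT INTO dim_product VALUES (1, 'widget'), (2, 'gadget');
INSERT INTO dim_date VALUES (20070101, 2007), (20080101, 2008);
INSERT INTO fact_sales VALUES
    (1, 20070101, 2, 20),
    (1, 20080101, 1, 10),
    (2, 20070101, 3, 45);
""")

# A typical warehouse query: aggregate a measurement, grouped by context.
totals = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(totals)
```

The query only touches the fact table and the one dimension it groups by, which is the access pattern a star schema is meant to keep simple.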
Posted by admin on June 16, 2007
Operational databases are most commonly designed using normalized modeling, often using third normal form or entity-relationship modeling. Normalized database schemas are tuned to support fast updates and inserts by minimizing the number of rows that must be changed when recording new data. Example: Order-Management Schema for an operational database. Data warehouses differ from operational databases in the [...]
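As a rough illustration of why normalization keeps updates cheap, here is a tiny order-management sketch, assuming SQLite through Python's sqlite3 module. The customer and orders tables and their rows are hypothetical and are not the schema from the original post. Because the customer's city is stored once, changing it touches a single row no matter how many orders refer to that customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical normalized schema: customer details are stored once
-- and orders refer to them by key.
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT,
    city        TEXT
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id)
);
INSERT INTO customer VALUES (1, 'Acme', 'Boston');
INSERT INTO orders VALUES (10, 1), (11, 1), (12, 1);
""")

# Recording the customer's move changes one row, not one row per order.
cursor = conn.execute(
    "UPDATE customer SET city = 'Chicago' WHERE customer_id = 1"
)
rows_changed = cursor.rowcount

# Every order still sees the new city through the join.
cities = conn.execute("""
    SELECT DISTINCT c.city
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
""").fetchall()
print(rows_changed, cities)
```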
Posted by admin on June 15, 2007
For those who don’t want to go down the licensing path, open source is definitely a better solution. But can an open source DBMS be used to build your data warehouse? I am not the best person to answer this question, but I have seen more and more small and medium-sized companies launch their business [...]
Posted by admin on June 15, 2007
The goal of this post is to walk you through an awesome business intelligence framework named “Pentaho”. I believe in the philosophy of “Learn by Practice”, so I will show you the steps to get Pentaho up and running for a fictitious company. Through this exercise, you should be able to understand how Pentaho works [...]
Posted by admin on June 6, 2007
I am currently looking into Pentaho for my project. It looks very promising so far. Here is a video that talks about its architecture. Enjoy.