Tag Archives: etl

Hibernate vs iBatis

Hibernate is great. However, I don’t think it fits every data access requirement. At its core, it is an ORM tool that helps you map your object model to a relational model. If you have full control of your relational model and perform lots of CRUD operations, it is certainly a great tool for you. Its transparent persistence, two-level caching, dirty checking, lazy/eager data retrieval and SQL generation can indeed save us lots of development time. Still, one tool doesn’t fit all! Why not?

At my current company, I have created a reporting tool that interfaces with a dimensional model in a data warehouse. In this setting, you deal with star schemas with denormalized dimensions. Oftentimes I need to tune query performance by looking into the explain plan. Without full control of the SQL, that job is hard to do. Apart from that, a reporting tool mostly issues read-only, set-based queries against the data warehouse, and the result sets returned don’t fit into my OO model at all. Again, Hibernate just doesn’t fit in. People in my company argue that I should use named queries in Hibernate for the sake of sticking with the standard. I am like, OK, whatever… I already know a tool called iBatis that lets me do my job cleanly. Why the hell would I have any motivation to try named queries, which are basically a way to bypass the ORM model and query the database directly? What benefit would I get from that? The cache? Our fact and dimension tables are updated by ETL, not by my reporting app. Unless the ETL sends me an event when the update cycle finishes so that I can flush my cache, I simply don’t think the cache helps me much.

Anyway, this is just my own little perspective. You don’t have to agree with me. The point is not that I dislike Hibernate, but that I don’t like being pushed to use it only because it is “standard” to someone. If Hibernate could help me construct my SQL based on user input, stream my results directly to my presentation layer, populate my model automatically from a mapping I provide, detect data warehouse changes and take care of my cache, then I would be more than happy to adopt it in my reporting app. Otherwise, I am not eager to dump my iBatis DAO layer unless the political game leaves me no choice.
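To make "construct my SQL based on user input" a bit more concrete, here is a minimal sketch of the idea. It is not iBatis and not the actual reporting tool: it uses Python's sqlite3 as a stand-in database, and the tables `dim_store` and `fact_sales`, the `ALLOWED_COLUMNS` whitelist and the `build_report_query` helper are all hypothetical names invented for illustration.

```python
import sqlite3

# Hypothetical star schema: one fact table and one denormalized dimension.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, region TEXT, country TEXT);
    CREATE TABLE fact_sales (store_id INTEGER, amount REAL);
    INSERT INTO dim_store VALUES (1, 'West', 'US'), (2, 'East', 'US'), (3, 'West', 'CA');
    INSERT INTO fact_sales VALUES (1, 10.0), (1, 5.0), (2, 7.0), (3, 2.0);
""")

# Only whitelisted dimension columns may ever appear in the SQL text.
ALLOWED_COLUMNS = {"region", "country"}


def build_report_query(group_by, filters):
    """Return (sql, params) for a read-only aggregate over the star schema."""
    if group_by not in ALLOWED_COLUMNS:
        raise ValueError(f"unsupported group-by column: {group_by}")
    where, params = [], []
    for column, value in filters.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unsupported filter column: {column}")
        where.append(f"d.{column} = ?")  # the value is bound, never concatenated
        params.append(value)
    sql = (
        f"SELECT d.{group_by}, SUM(f.amount) AS total "
        "FROM fact_sales f JOIN dim_store d ON d.store_id = f.store_id"
    )
    if where:
        sql += " WHERE " + " AND ".join(where)
    sql += f" GROUP BY d.{group_by} ORDER BY d.{group_by}"
    return sql, params


# The user picks the grouping column and the filters; the SQL stays fully visible.
sql, params = build_report_query("region", {"country": "US"})
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Because the generated statement is plain SQL, it can be pasted straight into an explain plan, which is the kind of control the post is asking for.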


Common DBA jobs

Export schema/ data out from mysql

To export the schema and/or data, you can use the mysqldump command:

mysqldump -u [username] -p[password] -d [schema_name] > [filename].sql

  1. -d (--no-data) means no data; it dumps just the schema.
  2. -B (--databases) is needed to dump multiple schemas.
  3. -h [hostname] connects to a server on another host.

Export data out from postgresql

  1. Export table data from postgresql to csv format
  2. Backup and restore database in postgresql

However, if you want to export a SQL result set to CSV in PostgreSQL, consider using the COPY command. COPY ... FROM goes the other way and loads a file into a table.

COPY ( select statement ) TO STDOUT WITH CSV;
COPY stock FROM 'mydir/Stock.csv' WITH CSV;
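When COPY is not an option (for example, when the export has to happen in application code), the same kind of output can be produced on the client side. The sketch below is only an approximation of what COPY ... TO STDOUT WITH CSV emits, with no header line and NULL written as an empty field. It uses Python's csv module with sqlite3 standing in for PostgreSQL, and the `stock` columns and the `export_csv` helper are assumptions made up for the example.

```python
import csv
import io
import sqlite3

# sqlite3 stands in for PostgreSQL; the columns of `stock` are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (symbol TEXT, name TEXT, price REAL);
    INSERT INTO stock VALUES ('ABC', 'Acme, Inc.', 12.5), ('XYZ', NULL, 3.0);
""")


def export_csv(connection, query, out):
    """Write every row returned by `query` to `out` as CSV, without a header."""
    writer = csv.writer(out, lineterminator="\n")
    for row in connection.execute(query):
        writer.writerow(row)  # None (SQL NULL) is written as an empty field


buffer = io.StringIO()
export_csv(conn, "SELECT symbol, name, price FROM stock ORDER BY symbol", buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Fields that contain a comma are quoted by the csv module, so the file stays parseable.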

Run sql script using mysql command

To run a script file as input, use the following command:

mysql [schema_name] -u [username] -p[password] < [filename].sql

SQL Tips

There are times when we want to put logic in SQL without writing a stored procedure. Here are some functions that may get you there:

  • Conditional statement – CASE WHEN xxx THEN abc WHEN yyy THEN bbc …ELSE ccc END

UPDATE Account SET Sales_Location__c =
    CASE WHEN Sales_Country__c != '' THEN Sales_Country__c
         WHEN Country__c != '' THEN Country__c
         ELSE '--' END;
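To see how the branches of that CASE expression fall through, here is a small runnable sketch. It uses Python's sqlite3 with an in-memory table as a stand-in, so the `Name` column and the sample rows are invented for the demonstration; the table name, the other column names and the '--' fallback are taken from the statement above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account ("
    "Name TEXT, Sales_Country__c TEXT, Country__c TEXT, Sales_Location__c TEXT)"
)
conn.executemany(
    "INSERT INTO Account (Name, Sales_Country__c, Country__c) VALUES (?, ?, ?)",
    [
        ("a", "US", "CA"),  # first WHEN matches
        ("b", "", "CA"),    # first WHEN fails, second WHEN matches
        ("c", None, ""),    # NULL != '' is not true, so ELSE applies
    ],
)

# Branches are tested in order; the first one that is true supplies the value.
conn.execute(
    """
    UPDATE Account SET Sales_Location__c =
        CASE WHEN Sales_Country__c != '' THEN Sales_Country__c
             WHEN Country__c != '' THEN Country__c
             ELSE '--' END
    """
)

locations = dict(conn.execute("SELECT Name, Sales_Location__c FROM Account"))
print(locations)
```

Row "c" shows that a NULL country is handled by the ELSE branch as well, because a comparison with NULL never evaluates to true.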

  • COALESCE (input1, input2, …) – This function takes as many parameters as you want and returns the first non-NULL one. Suppose we have a table A with three columns, FullName, CompleteName and DisplayName, any of which can contain NULL values. We want to select DisplayName, but if it is NULL, return FullName, and if that is also NULL, return CompleteName. We can easily do this in one SELECT statement (see also: COALESCE vs ISNULL):

SELECT COALESCE(DisplayName, FullName, CompleteName) FROM A;
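The fallback order can be checked with a few sample rows. This is a minimal sketch using Python's sqlite3 and an in-memory copy of table A; the names in the rows are made up, and the query is the one shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (FullName TEXT, CompleteName TEXT, DisplayName TEXT)")
conn.executemany(
    "INSERT INTO A (FullName, CompleteName, DisplayName) VALUES (?, ?, ?)",
    [
        ("Ann Lee", "Ann B. Lee", "ann"),  # DisplayName is present
        ("Bob Roy", "Bob C. Roy", None),   # falls back to FullName
        (None, "Cy D. Poe", None),         # falls back to CompleteName
        (None, None, None),                # everything is NULL
    ],
)

names = [
    row[0]
    for row in conn.execute(
        "SELECT COALESCE(DisplayName, FullName, CompleteName) FROM A ORDER BY rowid"
    )
]
print(names)
```

When every argument is NULL, COALESCE itself returns NULL, which shows up as None in the last row.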

ETL

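In MySQL, you can export db1 from one server and import it into db2 on another server in a single step. For example, on the db2 host, you can issue the following command, where [hostname] is the host that serves db1. To copy just one table, append the table name after db1.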

/usr/bin/mysqldump --force --compress --opt -u [username] -p[password] -h [hostname] db1 \
  | /usr/bin/mysql -u [username] -p[password] -D db2
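If this copy has to run from a scheduled ETL job rather than by hand, the same pipeline can be driven from a script. The following is a rough sketch using Python's subprocess module; the helper names `build_copy_commands` and `copy_database` are hypothetical, and the flags and binary paths are simply the ones from the command above.

```python
import subprocess


def build_copy_commands(user, password, src_host, src_db, dest_db):
    """Return the (dump, load) argument lists for the mysqldump | mysql pipeline."""
    dump_cmd = [
        "/usr/bin/mysqldump", "--force", "--compress", "--opt",
        "-u", user, f"-p{password}", "-h", src_host, src_db,
    ]
    load_cmd = ["/usr/bin/mysql", "-u", user, f"-p{password}", "-D", dest_db]
    return dump_cmd, load_cmd


def copy_database(user, password, src_host, src_db, dest_db):
    """Stream a dump of src_db on src_host into dest_db on the local server."""
    dump_cmd, load_cmd = build_copy_commands(user, password, src_host, src_db, dest_db)
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    load = subprocess.Popen(load_cmd, stdin=dump.stdout)
    dump.stdout.close()  # so mysqldump sees a broken pipe if mysql exits early
    load_status = load.wait()
    dump_status = dump.wait()
    if dump_status != 0 or load_status != 0:
        raise RuntimeError(
            f"copy failed: mysqldump={dump_status}, mysql={load_status}"
        )
```

Calling `copy_database("etl", "secret", "db1-host", "db1", "db2")` (placeholder values) would run the copy. Checking both exit statuses is the main reason to wrap it this way, since a plain shell pipeline reports only the status of the last command unless pipefail is set.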
