Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Warning: in_array() [function.in-array]: Wrong datatype for second argument in /home/.pomey/justpropose/solutionhacker.com/wp-content/plugins/attachment-manager/wp-attachment-manager.php on line 402
Introduction
Data is the most important asset to the companies. There are many topics related to data such as data visualization, data representation, data processing, data warehouse and data intelligence. This page will dedicate to provide you information related to these topics.
Level 1 – Basic skills
This covers the basic skills you need to be good at before diving into some advanced topics like data warehouse and data mining.
- Understand key database concepts (Normalization and Join, View, Indexing, Replication, Transaction Management and Storage)
- Write effective SQL
- Outer join vs inner join
- SQL Overview
- How does database indexing work? B-Tree? Most indexes are too large to fit into memory, which means that they are going to be stored on disk. Since I/O is usually the most expensive thing you can do in a computer system, these indexes need to be stored in an I/O efficient way. A B-tree is a good way to do this. If we make the nodes the size of a physical I/O block, it would take one I/O to move to a lower depth in the tree. In the example below, an index was created on a first name kind of field. If every level were an I/O it would take 3 I/Os to find Mary (or any other leaf). However, if the index you choose has low cardinality, it may cause the index page to split. So, you may end up doing more IOs and pull out lots of records. B-Tree may not do much better than full table scan. 10% selectivity is the minimum selectivity necessary for a b-tree index to be helpful.(selectivity% = unique index/ total # of record * 100%). To handle low cardinality columns, you can either use a different kind of index (e.g. Bitmap index) or combine that column with another to make a highly selective composite index.
- Composite indexes (sometimes called multicolumn indexes) are generally recommended if queries often contain the same columns joined with AND or if you have keys with skewed domain counts or low cardinality. The left-most column in a compound index should be the one that occurs most often in the queries. The order of the columns in the WHERE clause once had to match the order in the compound index; however, modern optimizers make that rule obsolete.
- If you can satisfy your query by index value alone, you don’t need to fetch the data page from disk and it is much faster. If the table’s row size is big, using a covering index will dramatically increase query performance because the index will act like a small table (ie. covering index – the indexes that cover what users want). There are limitations, though. The DBMS will never use covering indexes when there are joins or groupings.
- Clustering an index means storing data rows according to the index order. Searching on a clustered index is much faster. You can have only one clustered index per table; usually, the clustering index is the primary key. You don’t need to explicitly create indexes on primary key columns because the system creates them implicitly. To qualify as a primary key, the value of the columns should not be nullable and should be unique on the row. In order to make the primary key the clustering index, use CREATE TABLE. If you try to add it later using an ALTER TABLE statement, the system might not create it as a clustered index.
- Index carries its cost. OLTP has operations like insert, delete and update would need to update the indexes as well. Therefore, we should index our table smartly.
- MySQL Performance Tuning (good summary)
- MVCC vs row-locking for supporting concurrency. In short, MVCC doesn’t use lock to support concurrency. When an
UPDATEis performed, a new row is created and expires the old row. It’s the same row — just different versions. When a query is arrived, it returns the rows with right version(s)/ snapshot. With MVCC, you can also do hot backup without locking the whole table. However, if the dataset is large, it may have storage issue b/c it consumes few times more spaces. JBoss Cache adopts this theory to implement its thread-safe caching. The theory is that readers continue reading shared state, while writers copy the shared state, increment a version id, and write that shared state back after verifying that the version is still valid (i.e., another concurrent writer has not changed this state first – reference. - Programming on database (Trigger and Stored Procedure)
- Understand data representation
- Common DBA jobs
Level 2 – Data visualization
I use built-in Flex components like Chart, Grid, Pie to visualize data. There are some 3rd parties useful data visualization libraries out there for your exploration. Take a look at this.
Level 3 – Data Warehouse
- How data is stored in data warehouse?
- Data is stored as dimensional model in DW (normally in star schema with fact and dimension tables). If the data volume is not too large, you may have all data in single db instance. Otherwise, you need to partition your data across different database server instances. In Greenplum, all tables are distributed, which means a table is divided into non-overlapping sets of rows or parts. Each part resides on a single database known as a segment. The parts are distributed across all of the available segments using a sophisticated hashing algorithm. Database administrators choose the hash key (one or more table columns) when defining the table.
- How to build a large scale data warehouse? I would suggest you to adopt the share-nothing architecture that can economically scale from terabytes to petabytes with massively parallel query performance. It is not the solution that Oracle RAC provides you because it is using shared-disk architecture. From I have known, only Teradata, Netezza and Greenplum provide you this solution. Among them, Greenplum looks like the most cost effective. You may ask me why share-nothing is better. Let’s look at the RAC first:
- Oracle’s scale-out story is RAC, which requires a big and expensive shared-disk infrastructure (i.e. SAN) for coordination. Problem is that SANs only have ~4GBytes/s bandwidth out, so I/O scalability is inherently capped.
- Oracle is an OLTP warehouse at heart. It has never been good at running large parallel analytical queries. Anyone with experience trying to implement a 50TB Oracle warehouse will tell you that it takes rocket-science tuning to get it to behave.
- Oracle RAC requires its database servers separated from the storage (i.e. SAN or Exadata Storage Servers), because each database server needs to be able to see all the data in order to function. This means that the processing is happening on an entirely different box than the one accessing the data in disk. For more detail, look at this article.
- Look into greenplum that built on top of many Postgresql dbs
- Greenplum is a distributed database system. This means that the data stored in the database system is physically located on more than one database server (referred to as segments in Greenplum – Postgresql 8.2.5). These individual database servers are connected by a communications network (referred to as the interconnect in Greenplum – communicate via reliable UDP instead of TCP to remove the scalability limit of 1000 segment instances). An essential feature of a true distributed database is that users and client programs work as if they were accessing one single database on a local machine (in Greenplum, this entry-point database is referred to as the master - parses and optimizes query, then creates and dispatches the parallel query plan to segments where the user data sit). The fact that the database is distributed across several machines is seamless to the users of the system.
- All database operations—such as table scans, joins, aggregations, and sorts—execute in parallel across the segments simultaneously. Each operation is performed on a segment database independent of the data associated with the other segment databases.
- In order to harness the power of a cluster, query processing software had to evolve to take advantage of multiple disks, processors, and network links operating in parallel. To achieve this, the rows of a table were partitioned across multiple machines with separate disks, enabling parallel I/O scans of big tables. Basic relational query operators like selection, join, grouping and aggregation were reinvented to run in parallel via similar partitioning schemes: the operations undertaken by each node in the cluster are the same, but the data being pumped through the fabric is automatically partitioned to allow each node to work on its piece of the operator independently. Finally, these architectures allowed multiple relational operators to operate at the same time, allowing pipeline parallelism in which an operator producing a data stream runs in parallel with the operator consuming it.
- On the other hand, Greenplum has leveraged the idea of MapReduce to achieve parallel processing behind the scene. Here is a great white paper about this.
- Challenges to load large amount of data loaded into DW in a given maintenance window
- Greenplum supports fast, parallel data loading with its external tables feature. External tables can also be accessed in ‘single row error isolation’ mode, allowing administrators to filter out bad rows during a load operation into a separate error table, while still loading properly formatted rows. By using external tables in conjunction with Greenplum Database’s parallel file server (gpfdist), administrators can achieve maximum parallelism and load bandwidth from their Greenplum Database system. Greenplum has demonstrated load rates in excess of 2 TB an hour.
Level 4 – Collective intelligence
Once you gather lots of data, you want to analyze this to provide you useful information. Information retrieval is all about how to extract intelligence from tons of data. Below are some of the good links I obtain from the Net
- Mi Islita – It is a research site about information retrieval, data mining, and search engine technologies. Their content is frequently referenced or used by both IR scholars and search engine marketers.
- IR Thought – new home of Mi Islita’s blog
- Search Engine People – a blog about search engine optimization.
Level 5 – Operational Analytics
Fully automated decision support, which uses a combination of business rule engines, rules-driven workflow engines, and custom code, takes the human factor completely out of the exploration-insight-action cycle of business intelligence (BI). By doing so, companies improve efficiency by making actions faster, more consistent, and less error-prone.

Reference
Below are some interesting references related to database and data intelligence

0 Comments.