Tuesday 1 September 2015

2. Data Warehouse Architecture - An overview | Data Warehouse Tutorial

A quick video to understand standard Data warehouse architecture. It consists of following layers

1. Data Source layer
2. ETL
3. Staging Area
4. Data warehouse - Metadata, Summary and Raw Data
5. OLAP, Reporting and Data Mining

Data Source Layer
Data warehouse is populated from multiple sources for an organisation. All these source system comes under Data Source layer. Some of the source systems are listed below:

1. Operations Systems -- such as Sales, HR, Inventory relational database.
2. ERP (SAP) and CRM (SalesForce.com) Systems.
3. Web server logs and Internal market research data.
4. Third-party data - such as census data, demographics data, or survey data.

Staging Area
Data gets pulled from the data source into the Staging Area.This is a temporary data store area where data is pulled prior to being cleaned and transformed into a data warehouse. Having one common area makes it easier for subsequent data processing / integration.

Metadata means data for the data. It is one of the most important aspects for data warehouse environments. Metadata helps the Business Analyst or the decision maker to find what data is in the warehouse and use that data effectively and efficiently.

Summary Data
Summary Data contains data int eh aggregated form. Data from source systems is summarized and computed as per requirements and in a way it answers specific business questions. e.g YTD , MTD, QTD information rolling over couple of years. It helps to increase the system performance by fetching data from pre computed and aggregated data.

Online analytical processing, or OLAP is an approach to answering multi-dimensional queries in faster and quick time. OLAP tools such as Cubes, Oracle BI suite, helps users to analyze multidimensional data effectively from multiple perspectives. 
OLAP consists of three basic analytical operations: consolidation (roll-up), drill-down, and slicing and dicing.

Data Mining
Data Mining is the process of discovering patterns in large data sets. The goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. which can be used to increase/decrease revenue, increase/descrease product supply or even cust or increase costs.It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. 

Also, visit my other articles and videos:

2. ETL Overview
3. What are Dimension Tables?