Friday, January 3, 2014

Fact Tables and Dimension Tables

In Datawarehouse we use can use two types database :
    1. snowflake schema
    2. star schema
   
Generally we used star schema on other hand snowflake schema used very rare.

What is star schema:-
Star schema use fact tables and dimension tables. In star schema only one join
establishes the relationship between the fact table and any one of the dimension tables.A star
schema has one fact table and is associated with numerous dimensions table and depicts a star.

Fact tables:
    A fact table typically has two types of columns: those that contain numeric facts (often called measurements),
    and those that are foreign keys to dimension tables.
    A fact table contains either detail-level facts or facts that have been aggregated.
   
    In other words , if we design query for report we always use aggregate functions in select clause of query.
    The column we used in select clause    for aggregation is always taken from fact table i.e. called numeric facts.

Dimension tables:
    A dimension is a structure, often composed of one or more hierarchies, that categorizes data. Dimensional
    attributes help to describe the dimensional value. They are normally descriptive, textual values an.
    Dimension tables are generally small in size as compared to fact table.
   
    In other words, these are columns used in query for "group by" clause and used when we need some extra detail
    for the aggregate columns.
   

A example for fact ,dimension table as following:

Fact table :- Fact_UserData
Dimension table :- Dim_User, Dim_Location, Dim_Occupation

Fact_UserData
    User ID
    Location ID
    Occupation ID
    columns contains numeric data that can be meaningfully aggregated

Dim_User
    User ID
    Gender
    Ethnicity

Dim_Location
    Location ID
    District
    City
    State

Dim_Occupation
    Occupation ID
    Occupation Name

query might be like this
select
    aggregate_function(column numeric data),columns_from_dim_table
from
Fact_UserData join Dim_User on Fact_UserData.User ID=Dim_User.User ID
left outer join Dim_Location on Fact_UserData.Location ID=Dim_Location.Location ID
left outer join Dim_Occupation on Fact_UserData.Occupation ID=Dim_Occupation.Occupation ID
group by dim_table_columns

   
What is Snowflake schema:-
    The snowflake schema (sometimes called snowflake join schema) is a more complex schema
    than the star schema because the tables which describe the dimensions are normalized. It is used when a dimensional
    table becomes very big.In snow flake schema since there is relationship between the dimensions Tables it
    has to do many joins to fetch the data.Every dimension table is associated with sub dimension table.
    A star schema may be partially normalized (snowflaked), with related information stored
    in multiple related dimension tables, to support specific data warehousing needs.



OLAP data is typically stored in a star schema or snowflake schema in a relational data warehouse or in a special-purpose data management system. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables.

No comments:

Post a Comment

web stats