Dimensional modeling is a technique, used in data warehouse design, for conceptualizing and visualizing data models as a set of measures that are described by common aspects of the business. It is especially useful for summarizing and rearranging the data and presenting views of the data to support data analysis.
Dimensional Modeling Vocabulary
A fact table is the primary table in a dimensional model where the numerical performance measurements of the business are stored. The term fact represents a business measure that can be used in analyzing the business or business processes. The most useful facts are numeric and additive.
For e.g. Sales, Quantity ordered
Dimension tables are integral companions to a fact table. The dimension tables contain the textual descriptors of the business. Each dimension is defined by its single primary key, which serves as the basis for referential integrity with any given fact table to which it is joined.
Dimensions are the parameters over which we want to perform Online Analytical Processing (OLAP). For example, in a database for analyzing all sales of products, common dimensions could be:
- · Time
- · Location/region
- · Customers
- · Salesperson
Dimensional Modeling Process
Identify business process
In dimensional modeling, the best unit of analysis is the business process in which the organization has the most interest. Select the business process for which the dimensional model will be designed. Based on the selection, the requirements for the business process are gathered.
At this phase we focus on business processes, rather than on business departments, so that we can deliver consistent information more economically throughout the organization. If we establish departmentally bound dimensional models, we’ll inevitably duplicate data with different labels and terminology.
For example, we’d build a single dimensional model to handle orders data rather than building separate models for the sales and marketing departments, which both want to access orders data.
The granularity of a fact is the level of detail at which it is recorded. If data is to be analyzed effectively, it must all be at the same level of granularity. As a general rule, data should be kept at the highest (most detailed) level of granularity.
For example, grain definitions can include the following items:
- A line item on a grocery receipt
- A monthly snapshot of a bank account statement
Identify dimensions & facts
Our next step in creating a model is to identify the measures and dimensions within our requirements.
A user typically needs to evaluate, or analyze, some aspect of the organizations business. The requirements that have been collected must represent the two key elements of this analysis: what is being analyzed, and the evaluation criteria for what is being analyzed. We refer to the evaluation criteria as measures and what is being analyzed as dimensions.
If we are clear about the grain, then the dimensions typically can be identified quite easily. With the choice of each dimension, we will list all the discrete, text like attributes that will flesh out each dimension table. Examples of common dimensions include date, product, customer, transaction type, and status.
Facts are determined by answering the question, “What are we measuring?” Business users are keenly interested in analyzing these business process performance measures.
Creating a dimension table
Now that we have identified dimensions, next we need to identify members , hierarchies & properties or attributes of each dimension that we need to store in our table.
A dimension contains many dimension members. A dimension member is a distinct name or identifier used to determine a data items position. For example, all months, quarters, and years make up a time dimension, and all cities, regions, and countries make up a geography dimension.
We can arrange the members of a dimension into one or more hierarchies. Each hierarchy can also have multiple hierarchy levels. Every member of a dimension does not locate on one hierarchy structure.
Creating a fact table
Together, one set of dimensions and its associated measures make up what we call a fact. Organizing the dimensions and measures into facts is the next step. This is the process of grouping dimensions and measures together in a manner that can address the specified requirements. All candidate facts in a design must be true to the grain defined in step 2. Facts that clearly belong to a different grain must be in a separate fact table.