Data Cube Technology (OLAP)
11 important questions on Data Cube Technology (OLAP)
Why is there a need for OLAP?
•DWs are expected to support a wide range of queries, usually about past events
•However, there is a need for decision-making about future actions, based on sophisticated ‘what-if’ and ‘why’ questions
•OLAP partially fills in this gap by supporting more advanced analysis
•OLAP applies more complex calculations including trend algorithms, forecasting, percentage growths, … on large volumes of multi-dimensional data
• Note: in general, Data Mining supports even more complex data analysis (using even more complex data structures)!!
From which broader category is OLAP part of?
This is an example of an OLAP. What are more types of OLAP?
•OLAP implemented via a specialized multidimensional database (or data store) that summarizes transactions into multidimensional views ahead of time: DWs can be used as underlying data model
2. Relational OLAP (ROLAP)
•The implementation of an OLAP database on top of an existing relational database where multidimensional views can be created on-the-fly using complex SQL queries
•Database OLAP (DOLAP): Relational DBMS that include OLAP functionality
•Web OLAP (WOLAP): OLAP accessible by a web browser
•Desktop OLAP: simple OLAP for local analysis
3.Hybrid OLAP (HOLAP): combining MOLAP and ROLAP
- Higher grades + faster learning
- Never study anything twice
- 100% sure, 100% understanding
What are the attributes in a multidimensional data cube?
•The corresponding classes follow from partitioning the attribute
•E.g., for ‘Time’, classes may be hours, days, weeks, years, …
•Dictates the granularity of fact attributes, which may need to be summarized (recall Lecture 2)
•Based only on summarized data, partitions are irreversible!
Often both aggregated and non-aggregated data is stored in the DW
What are fact attributes in an data cube?
•Each fact may define one new dimension (e.g., ‘offerPrice’)
•The corresponding classes follow from partitioning the fact
•Classes may be cheap, medium, high, …
•Classes may be defined by certain intervals [0, 50), [50, 100]
What are the two key steps in converting tabular fact data into a multidimensional array?
2. Find the value of each entry in the multidimensional array by:aggregating the values of the target attribute (if a class is a fact attribute) OR
counting the number of records that have the (discrete) attribute values corresponding to that entry (if a class is a partition of a fact attribute)
OR
Cells not defined by the data are assumed to have a value of 0
What is the key operation of an OLAP?
Explain what the table shows per color? For example
•1 three-dimensional summary (the detailed data); •3 two-dimensional summary (over 1 dimension);
•3 one-dimensional summary (over 2 dimensions);
•1 zero-dimensional summary (the overall total).
• Blue: two of the one-dimensional aggregates,
• Yellow: and the overall total
Explain the terms slicing and dicing in OLAP Operations.
•Dicing involves selecting a subset of cells by specifying a range of attribute values
Explain the terms roll-up and drill-down in OLAP operations.
•from detailed to summarized data
•may also be performed by aggregation w.r.t. and removal of one dimension
•Disaggregation of the monthly sales totals into daily sales totals is termed ‘drill-down’
•from summarized to detailed data
•May also be performed by adding one dimension
Explain the term cume in OLAP Operations.
E.g., ‘show the quarterly sales for branch B003, along with the cumulative quarter sales’
- Solution: assume we have a table BranchQuarterSales with attributes branchno, quarter, and quarterlySales,
then we can write:
The question on the page originate from the summary of the following study material:
- A unique study and practice tool
- Never study anything twice again
- Get the grades you hope for
- 100% sure, 100% understanding