OLAP business databases - data warehousing & ETL

7 important questions on OLAP business databases - data warehousing & ETL

Essential features of Inmon's and Kimballs models

Especially overall approach, architectural structure, complexity of the method and tools are important to know

What is a data lake?




a database that holds raw data in its narrative format. The data lake approach copies all the data that might be relevant. Whereas the data warehouse approach cleans the data, the data lake approach doesn’t clean the data. Mostly, the two exist next to each other because different employees have different needs. This approach is also called schema on read

Which 4 types of data conversion do we know?


1. Simple: data type conversion
For example:
      a. Dd-mm-yyyy hh:mm  mmddyy
      b. Integer 0/1  Boolean true/false
      c. Double/float  numeric
2. Complex type conversion
Address format, categorical data
3. Currency conversions
4. Language conversions
  • Higher grades + faster learning
  • Never study anything twice
  • 100% sure, 100% understanding
Discover Study Smart

Which 5 types of ETL cleaning do we know?


1. Data must be validated and filtered  garbage in = garbage out (GIGO)
2. Data in the real world is dirty
         a. Incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data
         b. Noisy: containing errors or outliers
         c. Inconsistent: containing discrepancies in codes or names
Therefore, data quality is critical.
3. It is often neglected or casually handled
4. Problems exposed when data is summarized
5. Prevention is better, but who does the cleaning?

What is solution to unify all the different terms used in descriptions?

Text mining to unify the different goods and cluster them. However, the LT solution is in the business. Create awareness by the employees about filling in the form correctly.

Which two types of ETL Load do we know?


1. Integral load:
a. All the records are taken and processes by comparing them with the records in the data warehouse
b. Extreme case: load deletes dw tables and replaces them with complete new versions (= full load)
c. When no time stamps are present it is difficult to do an incremental load.

2. Delta/incremental loada. Only the changed records are taken and processed systematically in the dw

Which 6 ETL tool vendors do we know?


1. Informatica PowerCenter
2. Bluesky Integration Studio
3. SAP Data Integrator
4. IBM WebSphere Data Integration Suite SAS enterprise data integration
5. Oracle Warehouse Builder
6. MS SQL Server Integration Services

The question on the page originate from the summary of the following study material:

  • A unique study and practice tool
  • Never study anything twice again
  • Get the grades you hope for
  • 100% sure, 100% understanding
Remember faster, study better. Scientifically proven.
Trustpilot Logo