Summary: Business Intelligence & Business Analytics
1 BIBA
1.1.2 Introduction to Databases
What is a database and what does it consist of?
Definition database: “A collection of related tables, designed, maintained and utilized by multiple users, with software to update & query the data”
A database system consists of:
- 1. Data (the database)
- 2. Software
- 3. Hardware
- 4. Users
What is a database management system (DBMS) and how is it used?
A database management system (DBMS) is the software that controls the data
- E.g. Oracle, DB2, MS Access, MS SQL Server (Azure)
- MySQL (open source)
- Used through a query language, e.g. SQL (Structured Query Language)
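As a minimal sketch of how a DBMS is used through SQL, the snippet below uses Python's standard-library `sqlite3` module. SQLite is not one of the systems listed above; it is assumed here only because it needs no installation. The `book` table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database: the DBMS manages storage, updates and queries.
conn = sqlite3.connect(":memory:")

# Define a table (hypothetical example data).
conn.execute("CREATE TABLE book (bno INTEGER PRIMARY KEY, title TEXT, year INTEGER)")

# Update the data with SQL INSERT statements.
conn.executemany(
    "INSERT INTO book (bno, title, year) VALUES (?, ?, ?)",
    [(1, "Database Basics", 2015), (2, "Data Warehousing", 2019), (3, "Analytics 101", 2021)],
)

# Query the data with SQL SELECT.
recent_titles = [
    row[0]
    for row in conn.execute("SELECT title FROM book WHERE year >= 2019 ORDER BY bno")
]
print(recent_titles)
conn.close()
```

The same `CREATE TABLE`, `INSERT` and `SELECT` statements would look very similar on the other systems, although each DBMS has its own SQL dialect.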
What is the key database terminology?
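- Database = a collection of related tables
- Table = structured list of data of a specific type, organized in columns and rows
- Record/Tuple = a row in a table
- Field/Attribute = a column in a table
- Domain = the set of values allowed for a field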
1.1.3 Relational database
What are a primary key (PK) and a foreign key (FK)?
PK: Field(s) that uniquely identify each record in a table
- E.g.: Bno (Book), Rno (Reader), Bno + Rno + Loan date (Loan)
- Null value (= no data entry) not allowed for PK
FK: Field(s) in one table that refer to the PK of another table
- E.g.: Vendor sells Products
- Precisely 1 vendor per product
- Conversely, a vendor might sell multiple products
- FK = ‘Vendor_Code’ [Product table]
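The Vendor/Product example can be sketched with Python's standard-library `sqlite3` module. The column names other than `vendor_code`, the sample rows, and the helper `try_insert` are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("""
    CREATE TABLE vendor (
        vendor_code INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE product (
        product_code INTEGER PRIMARY KEY,
        description TEXT NOT NULL,
        -- FK: every product points at exactly one existing vendor
        vendor_code INTEGER NOT NULL REFERENCES vendor (vendor_code)
    )
""")

# One vendor sells multiple products.
conn.execute("INSERT INTO vendor VALUES (10, 'Acme')")
conn.execute("INSERT INTO product VALUES (1, 'Hammer', 10)")
conn.execute("INSERT INTO product VALUES (2, 'Saw', 10)")

def try_insert(sql):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# PK must be unique: product_code 1 already exists.
duplicate_pk_ok = try_insert("INSERT INTO product VALUES (1, 'Drill', 10)")
# FK must match an existing PK: there is no vendor 99.
unknown_vendor_ok = try_insert("INSERT INTO product VALUES (3, 'Drill', 99)")
print(duplicate_pk_ok, unknown_vendor_ok)
```

Both inserts are rejected by the DBMS, so the two valid products of vendor 10 are the only rows left in `product`.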
What are 4 trends in the database world?
- Trend 1: From disk-based to in-memory databases (e.g. SAP HANA)
- Trend 2: From on-premise databases to cloud databases (e.g. MS Azure, Google Cloud SQL)
- Database use becomes an operating expense instead of a capital expense
- Trend 3: NoSQL ("Not only SQL") databases
- Hypothesis: for analytics, relational databases are dominant
- NoSQL database types: key-value, document, graph, wide-column
- Trend 4: Alternative data representations
- E.g. XML for storing document-oriented files with hierarchies; queried with XQuery
1.2.3 Data warehouse architectures
Which DW development approaches are there and which is the best?
- Data mart approach (bottom-up)
- DW = a collection of data marts
- Dimensional modeling
- Consistency achieved by conformed dimensions
- E.g.: Independent data marts, Bus, ‘Canned data warehouse’
- Enterprise DW approach (top-down)
- DW = one integrated database
- Entity-relationship modeling
- E.g.: Hub & spoke: EDW + dependent data marts, Federated DW
- Which approach is best?
- There is no one-size-fits-all strategy for DW; it depends on: management’s information needs, information interdependence between organizational units, …
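To illustrate what "consistency achieved by conformed dimensions" can mean in the data mart approach, here is a rough sketch using Python's standard-library `sqlite3` module. All table names, column names and rows are hypothetical: two data marts (sales and inventory) keep their own fact tables but share one product dimension, so their figures can be combined per product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conformed dimension: one definition shared by both data marts.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);

    -- Fact table of a (hypothetical) sales data mart.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product (product_key),
        units_sold INTEGER
    );

    -- Fact table of a (hypothetical) inventory data mart.
    CREATE TABLE fact_inventory (
        product_key INTEGER REFERENCES dim_product (product_key),
        units_in_stock INTEGER
    );

    INSERT INTO dim_product VALUES (1, 'Hammer'), (2, 'Saw');
    INSERT INTO fact_sales VALUES (1, 5), (1, 3), (2, 4);
    INSERT INTO fact_inventory VALUES (1, 20), (2, 7);
""")

# Aggregate each fact table separately, then combine via the shared dimension.
rows = conn.execute("""
    SELECT p.name, s.sold, i.stock
    FROM dim_product AS p
    JOIN (SELECT product_key, SUM(units_sold) AS sold
          FROM fact_sales GROUP BY product_key) AS s
      ON s.product_key = p.product_key
    JOIN (SELECT product_key, SUM(units_in_stock) AS stock
          FROM fact_inventory GROUP BY product_key) AS i
      ON i.product_key = p.product_key
    ORDER BY p.name
""").fetchall()
print(rows)
```

If each data mart defined its own, incompatible product table instead, a combined query like this would not be possible without first reconciling the keys.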
1.6.2 Naïve Bayes
Why is Laplace smoothing needed?
From previous slides, the probability of α given class c: P(Outlook = “Sunny” | PlayTennis = “Yes”) = 0
- Problem:
- An attribute value doesn’t occur with every class
- Probability of α given class c becomes 0
- Having a probability of zero is problematic, because it wipes out all information in the other probabilities
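To see why a single zero wipes out everything else, recall the usual Naive Bayes score for a class c given attribute values a_1, ..., a_n. The notation below is assumed for illustration and is not taken from the slides.

```latex
\[
P(c \mid a_1, \dots, a_n) \;\propto\; P(c) \prod_{i=1}^{n} P(a_i \mid c)
\]
```

The score is a product, so if any single factor P(a_i | c) equals 0, the whole score for class c is 0, no matter how strongly the other attributes support that class.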
What is Laplace smoothing?
- Laplace Smoothing, or Correction, or Estimator
- Incorporates a small-sample correction in every probability computation
- Increases both the numerator and the denominator of each estimate
- Thus, no probability will be zero
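A minimal sketch of the correction, assuming the common add-one form: the count in the numerator is increased by `alpha` and the denominator by `alpha` times the number of distinct attribute values. The function name, parameters and sample data are hypothetical; the data are chosen so that "Sunny" never occurs and are not the records from the slides.

```python
from collections import Counter

def conditional_probability(values, target, n_categories, alpha=0):
    """Estimate P(attribute = target | class) from the attribute values
    observed within one class.

    alpha = 0 gives the raw relative frequency; alpha = 1 is add-one
    (Laplace) smoothing: alpha is added to the numerator and
    alpha * n_categories to the denominator.
    """
    counts = Counter(values)
    return (counts[target] + alpha) / (len(values) + alpha * n_categories)

# Hypothetical Outlook values among the PlayTennis = "Yes" records.
outlook_given_yes = ["Overcast", "Rain", "Overcast", "Rain", "Rain", "Overcast"]
n_outlook_values = 3  # Sunny, Overcast, Rain

raw = conditional_probability(outlook_given_yes, "Sunny", n_outlook_values)
smoothed = conditional_probability(outlook_given_yes, "Sunny", n_outlook_values, alpha=1)
print(raw, smoothed)
```

With `alpha=1` the estimate for "Sunny" becomes 1/9 instead of 0, and the smoothed estimates over all three Outlook values still sum to 1.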
What are the advantages and disadvantages of Naive Bayes?
- Naive Bayes is Not So Naïve:
- Its beauty is in its simplicity
- Ability to handle categorical variables directly
- Computationally efficient
- Good classification performance, especially when the number of predictors is very large
- Negative aspects:
- Requires a very large number of records to obtain good results
- Independence assumption may not hold for some attributes
1.7.1 Quizzes
Suppose you are given 8 items (records) with numerical variables X1 and X2 and a dependent variable y that corresponds to color (blue/red). A plot illustrates the data in 2D space. Task: predict the class of new item (1,1) using k-nearest neighbors with Euclidean distance and K=3.
1. Class of item (1,1) is red.
2. Class of item (1,1) is blue.
Red
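The procedure behind this kind of question can be sketched as follows. The plot with the actual 8 records is not reproduced in this summary, so the points and labels below are hypothetical and are NOT the quiz data; the function name `knn_predict` is also an assumption. The sketch uses `math.dist`, which requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_predict(points, labels, query, k):
    """Classify `query` by majority vote among its k nearest points
    (Euclidean distance)."""
    by_distance = sorted(
        zip(points, labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 8 records (X1, X2) with colors; not the points from the quiz plot.
points = [(0, 0), (1, 2), (2, 1), (0, 3), (3, 3), (4, 2), (3, 0), (4, 4)]
labels = ["red", "red", "blue", "blue", "blue", "red", "blue", "red"]

prediction = knn_predict(points, labels, query=(1, 1), k=3)
print(prediction)
```

For these made-up points, the three nearest neighbors of (1,1) are (1,2), (2,1) and (0,0), with two red votes against one blue. Using an odd K with two classes avoids tied votes.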