Summary: Big Data
Big Data Concepts part 2
What 4 types of data processing modes are there?
- Transaction processing
- Batch processing
- Real-time processing
- Near real-time processing
What different deadlines are there in real-time processing?
- Hard - missing a deadline is a total system failure.
- Firm - infrequent misses are tolerable, but may degrade the system's quality of service. The usefulness of a result is zero after its deadline.
- Soft - the usefulness of a result degrades after its deadline, thereby degrading the system's quality of service.
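The three deadline types can be sketched as a function that returns how useful a result still is when it arrives late. This is only an illustration: the names `Deadline` and `result_value`, the 0.0 to 1.0 scale, and the linear decay for soft deadlines are assumptions, not part of the flashcards.

```python
from enum import Enum


class Deadline(Enum):
    HARD = "hard"
    FIRM = "firm"
    SOFT = "soft"


def result_value(kind: Deadline, lateness: float, decay: float = 0.5) -> float:
    """Return the usefulness (0.0 to 1.0) of a result that arrives
    `lateness` seconds after its deadline (<= 0 means on time)."""
    if lateness <= 0:
        return 1.0  # on time: full value for every deadline type
    if kind is Deadline.HARD:
        # Hard: a single miss counts as total system failure.
        raise RuntimeError("hard deadline missed: total system failure")
    if kind is Deadline.FIRM:
        # Firm: the late result is worthless, but the system keeps running.
        return 0.0
    # Soft: value degrades with lateness (linear decay is an assumption).
    return max(0.0, 1.0 - decay * lateness)
```

For example, `result_value(Deadline.FIRM, 1.0)` is zero, while `result_value(Deadline.SOFT, 1.0)` still has some value left.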
What are the steps in data processing?
- Data acquisition
- Data staging
- Data analysis
- Application analysis
- Visualization
What is a data warehouse staging area?
A temporary location where data from source systems is copied during the extract, transform and load (ETL) process.
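A minimal sketch of where a staging area sits in ETL, assuming an in-memory list stands in for the staging area and another for the warehouse. The sample CSV and the names `extract`, `transform` and `load` are hypothetical.

```python
import csv
import io

# Hypothetical extract from a source system (note the untidy spacing).
SOURCE_CSV = "id,amount\n1, 10.50\n2, 7.25\n"


def extract(raw: str) -> list[dict]:
    """Copy source rows unchanged into the staging area."""
    return list(csv.DictReader(io.StringIO(raw)))


def transform(staged: list[dict]) -> list[dict]:
    """Clean and type the staged rows before loading."""
    return [
        {"id": int(row["id"]), "amount": float(row["amount"].strip())}
        for row in staged
    ]


def load(rows: list[dict], target: list[dict]) -> None:
    """Append the transformed rows to the warehouse table."""
    target.extend(rows)


staging_area = extract(SOURCE_CSV)   # raw copy of the source data
warehouse: list[dict] = []
load(transform(staging_area), warehouse)
staging_area.clear()                 # staging is temporary
```

The staging copy keeps the transformation work away from the source system, and it is emptied once the load has finished.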
What is a data lake?
- A storage repository that holds a vast amount of raw data in its native format, including structured, semi-structured and unstructured data.
- Data structure and requirements are not defined until the data is needed.
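The idea that structure is defined only when the data is needed (often called schema-on-read) can be sketched as follows. The list `lake`, the sample records and the reader `read_clicks` are made up for illustration.

```python
import json

# Raw records kept in their native format, tagged only with that format.
lake = [
    ("json", '{"user": "ann", "clicks": 3}'),
    ("csv", "bob,5"),
    ("text", "carol clicked 2 times"),
]


def read_clicks(fmt: str, raw: str):
    """Apply a (user, clicks) schema at read time, where one is known."""
    if fmt == "json":
        record = json.loads(raw)
        return record["user"], int(record["clicks"])
    if fmt == "csv":
        user, clicks = raw.split(",")
        return user, int(clicks)
    return None  # no schema for free text yet; it simply stays in the lake


parsed = (read_clicks(fmt, raw) for fmt, raw in lake)
clicks = dict(pair for pair in parsed if pair is not None)
```

Nothing was rejected when the data was stored: the free-text record is still in `lake` and can be interpreted later, once someone defines how to read it.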
What are the characteristics of a data lake?
- Retain all data
- Support all data types
- Support all users
- Adapt easily to changes
Big Data Concepts part 1
What are possible sources for big data?
- Web and social media data
- Machine data
- Sensing data
- Transaction data
- Internet of Things
How can you best manage unstructured data?
Have it flow into a data lake in its raw format.
What is semi-structured data?
- Falls between structured and unstructured data.
- Form of structured data that does not conform with the formal structure of data models.
- BUT contains tags or other markers to separate semantic elements and enforce hierarchies within the data.
- Examples: markup and data formats such as XML, JSON and HTML, and Twitter data.
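A small example of semi-structured data, assuming a made-up JSON document: the keys act as markers that separate the elements and express the hierarchy, yet the two records do not share one fixed set of fields. The helper `field_paths` is hypothetical.

```python
import json

# Two records in one document; each carries its own set of fields.
doc = """
[
  {"name": "Ann", "contact": {"email": "ann@example.com"}},
  {"name": "Bob", "tags": ["admin", "ops"]}
]
"""
records = json.loads(doc)


def field_paths(value, prefix=""):
    """Yield the dotted key paths found in a nested JSON object."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from field_paths(child, f"{prefix}.{key}" if prefix else key)
    else:
        yield prefix  # leaf value (string, number, list, ...)


paths = [sorted(field_paths(record)) for record in records]
# paths == [["contact.email", "name"], ["name", "tags"]]
```

A relational table would force both records into the same columns; here each record describes its own structure.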
What types of metadata are there?
- Structural metadata - indicates how compound objects are put together. E.g. how pages are ordered to form chapters.
- Descriptive metadata - describes a resource for purposes such as discovery and identification. E.g. elements such as title, author and keywords.
- Administrative metadata - provides information to help manage a resource. E.g. when and how a file was created, and who can access it.