Summary: Big Data

Note: only 96 flashcards and notes are available for this material, so this summary might not be complete.

Read the summary and the most important questions on Big Data

  • Big Data Concepts part 2


  • What are the 4 data processing modes?

    • Transaction processing
    • Batch processing
    • Real-time processing
    • Near real-time processing
  • What different deadlines are there in real-time processing?

    • Hard - missing a deadline is a total system failure.
    • Firm - infrequent misses are tolerable, but may degrade the system's quality of service. The usefulness of a result is zero after its deadline.
    • Soft - the usefulness of a result degrades after its deadline, thereby degrading the system's quality of service.
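
The three deadline types above can be sketched as a function that scores how useful a result still is when it arrives. This is only an illustration: the names `Deadline`, `HardDeadlineMissed` and `usefulness`, the 0.0 to 1.0 scale, and the linear decay for soft deadlines are all assumptions, not part of the course material.

```python
from enum import Enum


class Deadline(Enum):
    HARD = "hard"
    FIRM = "firm"
    SOFT = "soft"


class HardDeadlineMissed(Exception):
    """Raised when a hard deadline is missed (treated as total system failure)."""


def usefulness(kind: Deadline, finished_at: float, deadline: float,
               decay_per_second: float = 0.5) -> float:
    """Return the usefulness (0.0 to 1.0) of a result delivered at finished_at."""
    if finished_at <= deadline:
        return 1.0  # on time: full value for every deadline type
    lateness = finished_at - deadline
    if kind is Deadline.HARD:
        # Hard: a single miss counts as a failure of the whole system.
        raise HardDeadlineMissed(f"missed by {lateness:.2f}s")
    if kind is Deadline.FIRM:
        # Firm: the late result is worthless, but the system keeps running.
        return 0.0
    # Soft: the value shrinks with lateness (linear decay is an assumption).
    return max(0.0, 1.0 - decay_per_second * lateness)
```

For example, with a deadline at t = 10 and a result at t = 11, the soft case keeps part of its value, the firm case drops to zero, and the hard case raises an exception.
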
  • What are the steps in data processing?

    1. Data acquisition
    2. Data staging
    3. Data analysis
    4. Application analysis
    5. Visualization
  • What is a data warehouse staging area?

    A temporary location where data from source systems is copied during the extract, transform and load (ETL) process.
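
A minimal sketch of how a staging area sits inside an ETL run, using an in-memory SQLite database. The table names `staging_orders` and `fact_orders`, the sample rows, and the cleaning rules are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical source-system rows: (order_id, amount as text, country code)
SOURCE_ROWS = [(1, "19.99", "nl"), (2, "5.00", "de"), (3, "n/a", "nl")]


def run_etl(conn: sqlite3.Connection, source_rows) -> int:
    """Copy rows into a staging table, clean them, load them, clear staging."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS staging_orders (order_id, amount, country)")
    cur.execute("CREATE TABLE IF NOT EXISTS fact_orders "
                "(order_id INTEGER PRIMARY KEY, amount REAL, country TEXT)")

    # Extract: copy source data unchanged into the staging area.
    cur.executemany("INSERT INTO staging_orders VALUES (?, ?, ?)", source_rows)

    # Transform: parse amounts, normalise country codes, skip unparseable rows.
    staged = cur.execute("SELECT order_id, amount, country FROM staging_orders").fetchall()
    clean = []
    for order_id, amount, country in staged:
        try:
            clean.append((order_id, float(amount), country.upper()))
        except ValueError:
            continue  # a real pipeline would log or quarantine this row

    # Load: write the cleaned rows into the warehouse table.
    cur.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", clean)

    # The staging area is temporary, so empty it after a successful load.
    cur.execute("DELETE FROM staging_orders")
    conn.commit()
    return len(clean)
```

Calling `run_etl(sqlite3.connect(":memory:"), SOURCE_ROWS)` loads the two parseable rows and leaves the staging table empty afterwards.
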
  • What is a data lake?

    • A storage repository that holds a vast amount of raw data in its native format, including structured, semi-structured and unstructured data.
    • Data structure and requirements are not defined until the data is needed.
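
The idea that structure is defined only when the data is needed is often called schema-on-read. A small sketch, assuming a hypothetical in-memory "lake" of raw strings and a hypothetical `read_clicks` function that imposes a schema for one specific question:

```python
import json

# The "lake": raw records kept exactly as they arrived, with no shared schema.
raw_lake = [
    '{"user": "ann", "clicks": 3}',          # semi-structured JSON
    'bob,7',                                  # structured CSV line
    'free-text note with no fields at all',   # unstructured text
]


def read_clicks(lake):
    """Apply a schema only now, at read time, for one specific question."""
    result = {}
    for record in lake:
        try:
            doc = json.loads(record)
        except json.JSONDecodeError:
            doc = None
        if isinstance(doc, dict) and "user" in doc and "clicks" in doc:
            result[doc["user"]] = int(doc["clicks"])
            continue
        parts = record.split(",")
        if len(parts) == 2 and parts[1].strip().isdigit():
            result[parts[0].strip()] = int(parts[1])
        # Anything else stays in the lake untouched for other uses.
    return result
```

The raw records are never modified: a different question later can read the same lake with a different schema.
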
  • What are the characteristics of a data lake?

    • Retain all data
    • Support all data types
    • Support all users
    • Adapt easily to changes
  • Big Data Concepts part 1


  • What are possible sources for big data?

    • Web and social media data
    • Machine data
    • Sensing data
    • Transaction data
    • Internet of Things
  • How can you best manage unstructured data?

    Have it flow into a data lake in its raw format.
  • What is semi-structured data?

    • Falls between structured and unstructured data.
    • A form of structured data that does not conform to the formal structure of data models,
    • but contains tags or other markers to separate semantic elements and enforce hierarchies within the data.
    • Examples: markup languages such as XML and HTML, JSON, and Twitter data.
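
To make the "tags and hierarchies" point concrete, here is a small sketch that reads a hypothetical XML document with Python's standard `xml.etree.ElementTree` module. The document content and the `summarize` function are made up for illustration; note that the two records do not have the same fields, which a fixed relational schema would not allow.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: tags mark the semantic elements and nest them,
# but the two <post> records do not share the same set of fields.
DOC = """
<posts>
  <post id="1"><author>ann</author><text>hello</text></post>
  <post id="2"><author>bob</author><text>hi</text><tags><tag>big-data</tag></tags></post>
</posts>
""".strip()


def summarize(xml_text: str):
    """Walk the hierarchy and pull out the fields that happen to be present."""
    root = ET.fromstring(xml_text)
    rows = []
    for post in root.findall("post"):
        rows.append({
            "id": post.get("id"),                                # attribute
            "author": post.findtext("author"),                   # child element
            "tags": [t.text for t in post.findall("tags/tag")],  # optional nested list
        })
    return rows
```

The first post simply yields an empty tag list, because the optional `<tags>` element is absent.
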
  • What types of metadata are there?

    • Structural metadata - indicates how compound objects are put together. E.g. how pages are ordered to form chapters.
    • Descriptive metadata - describes a resource for purposes such as discovery and identification. E.g. elements such as title, author, keywords.
    • Administrative metadata - provides information to help manage a resource. E.g. when and how a file was created, who can access it, etc.
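
As a rough illustration, the three metadata types could be kept side by side for a single document. The `Metadata` class, its field contents, and the helper functions below are hypothetical; they only show which kind of question each type answers.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    structural: dict = field(default_factory=dict)      # how the parts fit together
    descriptive: dict = field(default_factory=dict)     # what the resource is about
    administrative: dict = field(default_factory=dict)  # how the resource is managed


# Hypothetical values for one scanned report.
report = Metadata(
    structural={"chapters": [[1, 2, 3], [4, 5]]},  # page numbers per chapter
    descriptive={"title": "Example report", "author": "A. Author",
                 "keywords": ["big data", "metadata"]},
    administrative={"created": "2020-01-01", "format": "PDF", "access": ["staff"]},
)


def page_order(meta: Metadata) -> list:
    """Structural: rebuild the reading order of pages from the chapter layout."""
    return [page for chapter in meta.structural.get("chapters", []) for page in chapter]


def matches(meta: Metadata, keyword: str) -> bool:
    """Descriptive: support discovery by keyword."""
    return keyword.lower() in (k.lower() for k in meta.descriptive.get("keywords", []))


def can_access(meta: Metadata, role: str) -> bool:
    """Administrative: check who is allowed to open the resource."""
    return role in meta.administrative.get("access", [])
```
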