Big data context

14 important questions on Big data context

What are the differences between dealing with new data versus existing data in research?

• Experiments, surveys, and interviews generate 'new' data.
• Content analysis and 'big data' analysis work with data that already exist.
• The two differ in their implications for sampling, ethics, validity, and reliability.

What does the term "quantitative" mean in content analysis?

In content analysis, "quantitative" refers to counting how often something occurs within the analyzed communications.

Why is it important for content analysis to be systematic?

Content analysis needs to be systematic to ensure consistency and reliability in the process.
- Explicit rules are established for sampling and analysis

What is the significance of content analysis being objective?

Objectivity in content analysis means that the rules used for sampling and analysis are clear and unambiguous.
- Unambiguous rules leave little room for a coder's own interpretation
- This avoids subjective biases

How many steps are involved in doing a content analysis according to Treadwell & Davies?

The process involves seven steps:
- Develop a hypothesis
- Define the content to be analyzed
- Sample the content
- Select units for coding
- Develop a coding scheme
- Code the units
- Count occurrences of the coded units
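The later steps (sample, select units, code, count) can be sketched in a few lines of Python. This is only an illustration: the headlines, the keyword lists in `CODING_SCHEME`, and the function `code_unit` are made-up assumptions, not part of Treadwell & Davies' procedure.

```python
import random
from collections import Counter

# Step 2: the content to be analyzed (hypothetical headlines, one unit each)
headlines = [
    "Inflation hits record high",
    "New hospital opens downtown",
    "Local team wins cup",
    "Tax cuts announced for small firms",
    "Vaccine rollout reaches rural areas",
    "Jobs report beats expectations",
]

# Step 3: draw a random sample (fixed seed so the draw is repeatable)
rng = random.Random(42)
sample = rng.sample(headlines, k=4)

# Step 5: a coding scheme (assumed keyword rules, checked in order)
CODING_SCHEME = [
    ("economy", ("tax", "inflation", "jobs")),
    ("health", ("hospital", "vaccine")),
]

def code_unit(unit):
    """Step 6: assign a unit to exactly one category, or to 'other'."""
    words = unit.lower().split()
    for category, keywords in CODING_SCHEME:
        if any(keyword in words for keyword in keywords):
            return category
    return "other"

# Step 7: count occurrences of each coded category
counts = Counter(code_unit(unit) for unit in sample)
print(counts)
```

Returning the first matching category keeps the categories exclusive, and the 'other' fallback keeps the scheme exhaustive.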

What is an advantage of a dictionary-based approach in text classification?

- Very consistent: the same word is always assigned to the same category, which enhances reliability

What is a disadvantage of a dictionary-based approach in text classification?

- Does not use context to disambiguate words
- May lead to occasional misclassifications, threatening validity

How can the sentences "I’m a huge fan of baseball. I have a big collection of bats." and "I’m a huge fan of stuffed nocturnal animals. I have a big collection of bats" demonstrate a limitation of dictionary-based approaches?

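- Both sentences use the word "bats", but it means sports equipment in the first and animals in the second
- A dictionary assigns "bats" to the same category both times, so one of the two sentences is misclassified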
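A minimal sketch of this limitation, assuming a tiny made-up dictionary (`DICTIONARY`) and a hypothetical helper `code_text`; real dictionaries are far larger, but the mechanism is the same.

```python
import re
from collections import Counter

# Assumed toy dictionary: every word maps to exactly one category
DICTIONARY = {
    "baseball": "sports",
    "bats": "sports",
    "animals": "animals",
}

def code_text(text):
    """Count dictionary categories in a text, ignoring context entirely."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(DICTIONARY[t] for t in tokens if t in DICTIONARY)

s1 = "I'm a huge fan of baseball. I have a big collection of bats."
s2 = "I'm a huge fan of stuffed nocturnal animals. I have a big collection of bats."

print(code_text(s1))  # "bats" counted as sports: correct here
print(code_text(s2))  # "bats" counted as sports again: wrong here
```

The lookup is perfectly consistent (reliable), yet the second sentence still receives a "sports" code, which is the validity problem.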

What are the coding rules for content analysis?

- Coding categories should be exhaustive: every coding unit must be assigned to a category, with the 'other' category kept as small as possible.
- Categories should be mutually exclusive: each coding unit is allocated to exactly one category.
- Coders can assess multiple aspects of each unit; these aspects need not be mutually exclusive with one another.
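The two rules can be checked mechanically once units have been coded. The sketch below assumes a hypothetical data layout (a dict from unit to the list of categories a coder assigned) and a made-up function name, `check_coding`.

```python
def check_coding(assignments, categories):
    """Return a dict of units that violate the coding rules.

    assignments: dict mapping each coding unit to a list of assigned categories
    categories:  set of categories defined in the coding scheme
    """
    problems = {}
    for unit, assigned in assignments.items():
        if len(assigned) == 0:
            problems[unit] = "no category (scheme not exhaustive)"
        elif len(assigned) > 1:
            problems[unit] = "several categories (not exclusive)"
        elif assigned[0] not in categories:
            problems[unit] = "category not in scheme"
    return problems

def share_other(assignments):
    """Fraction of units coded as 'other'; a large share suggests missing categories."""
    n_other = sum(1 for assigned in assignments.values() if assigned == ["other"])
    return n_other / len(assignments)

# Hypothetical coded units
scheme = {"economy", "health", "other"}
coded = {
    "unit 1": ["economy"],
    "unit 2": ["economy", "health"],  # violates exclusivity
    "unit 3": [],                     # violates exhaustiveness
    "unit 4": ["other"],
}
print(check_coding(coded, scheme))
print(share_other(coded))
```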

What are the characteristics of big data often defined by "the three V's"?

- Volume: it is not a sample, but a record of 'everything'; can be broad (many variables) and deep (many data points per variable)
- Variety: includes text, images, audio, video; structured (databases) and unstructured (e.g., chats)
- Velocity: often real-time or with little lag

What does the fourth 'V', Veracity, refer to in the context of big data?

- Veracity concerns how truthful and trustworthy the data are
- Big data are often a by-product of everyday behavior
- They are therefore not distorted by observer effects or artificial settings
- Interpretation may not always be straightforward

What approach does big data lend itself well to in research?

• Exploratory rather than confirmatory research questions
• Exploration is sometimes aided by visualization
• Gaining insights from data (induction) rather than testing specific predictions on data (deduction)
• Looking for correlations instead of causality

What are some opportunities in big data research?

• Big data may contain rare phenomena and hard-to-reach populations
• Larger sample size reduces the risk of sampling error (systematic bias in how the data arise can remain)
• Can uncover unexpected correlations not predicted by theory
• Allows construction of more sophisticated statistical models
• Caveat: searching many variables may turn up spurious correlations
• Caveat: more sophisticated models carry a risk of overfitting
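How easily spurious correlations arise can be shown with a small simulation on pure noise. The numbers here (40 variables, 30 observations, seed 1) are arbitrary assumptions; 0.361 is the standard two-tailed critical value of Pearson's r at p < .05 for n = 30.

```python
import random
from itertools import combinations
from math import sqrt

rng = random.Random(1)
N_OBS = 30          # data points per variable (deep)
N_VARS = 40         # number of variables (broad)
R_CRITICAL = 0.361  # |r| above this is "significant" at p < .05 for n = 30

# Every variable is independent random noise: no real relationships exist
variables = [[rng.gauss(0, 1) for _ in range(N_OBS)] for _ in range(N_VARS)]

def pearson(a, b):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

pairs = list(combinations(range(N_VARS), 2))
significant = [
    (i, j) for i, j in pairs
    if abs(pearson(variables[i], variables[j])) > R_CRITICAL
]
print(f"{len(significant)} of {len(pairs)} pairs look significant by chance")
```

With 780 pairs tested at the 5% level, roughly 5% are expected to pass the threshold even though none of the relationships is real.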

What are some challenges faced in big data research?

- Spurious correlations: in broad datasets with many variables, some correlations reach statistical significance purely by chance, even in randomly generated data.
- Overfitting: a complex statistical model may fit the existing data well yet predict new data poorly.
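Overfitting can be illustrated by comparing a simple and a very flexible model on simulated data. Everything here is an assumption for illustration: the data follow y = 2x plus noise, the simple model is a least-squares line, and the flexible model is a nearest-neighbour "memorizer" standing in for an overly complex statistical model.

```python
import random

rng = random.Random(0)

def make_data(n):
    """Simulated data: y = 2x plus Gaussian noise (assumed for illustration)."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 2 * x + rng.gauss(0, 1)))
    return data

train = make_data(500)  # the "existing" data
test = make_data(500)   # "new" data from the same process

def fit_line(data):
    """Simple model: ordinary least-squares straight line."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(data):
    """Flexible model: predict the y of the closest training point."""
    return lambda x: min(data, key=lambda point: abs(point[0] - x))[1]

def mse(model, data):
    """Mean squared prediction error of a model on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

line = fit_line(train)
memorizer = fit_memorizer(train)

print("line      train/test MSE:", mse(line, train), mse(line, test))
print("memorizer train/test MSE:", mse(memorizer, train), mse(memorizer, test))
```

The memorizer reproduces the existing data perfectly because it has also learned the noise, which is why its error on new data is larger than that of the simple line.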
