HDFS and Map-Reduce
9 important questions on HDFS and Map-Reduce
What are immutable files?
- Files whose identity and content cannot be changed.
- They can be copied freely, and the copies do not need to be tracked because they can never change.
What are the three types of sharing architecture?
- Share everything: a central database on a server and a shared disk. E.g. Unix FS
- Share disks: the database is not shared, but the SAN disks are. E.g. Oracle RAC
- Share nothing: neither the database nor the disks are shared. E.g. HDFS
What does HDFS mean and how does it work?
- HDFS stands for Hadoop Distributed File System
- Splits files into blocks and replicates each block across several nodes in a cluster
- Blocks are immutable
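The block-and-replica idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block size is shrunk to 4 bytes so the output is readable, the node names (`dn1` to `dn4`) are made up, and the round-robin placement is a simplification (real HDFS uses its own placement policy).

```python
BLOCK_SIZE = 4      # bytes; tiny on purpose (real HDFS blocks are far larger)
REPLICATION = 3     # number of copies kept of every block (assumed value)


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a file's bytes into fixed-size blocks; the last one may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str],
                   replication: int = REPLICATION) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct nodes, round-robin.

    Hypothetical placement rule, chosen only to keep the sketch short.
    Requires replication <= len(nodes) so the copies land on distinct nodes.
    """
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


blocks = split_into_blocks(b"hello hdfs!")
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])

for block_id, holders in placement.items():
    print(block_id, blocks[block_id], holders)
```

Because every block lives on several nodes, losing a single node does not lose any block.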
What different types of nodes are there?
- Name node (master): runs on a single node. Holds file metadata (which blocks are where).
- Data node (slave): stores the actual data blocks.
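The division of labour between the two node types can be illustrated with a small in-memory model. The class and method names below (`NameNode.register`, `NameNode.locate`, `DataNode.store`, `read_file`) are hypothetical and are not the Hadoop API; the sketch only shows that the name node answers "which blocks are where" while the data nodes hold the bytes.

```python
class DataNode:
    """Holds block contents, keyed by block id."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: dict[str, bytes] = {}

    def store(self, block_id: str, data: bytes) -> None:
        self.blocks[block_id] = data

    def read(self, block_id: str) -> bytes:
        return self.blocks[block_id]


class NameNode:
    """Holds only metadata: for each file, its ordered blocks and their holders."""

    def __init__(self) -> None:
        self.files: dict[str, list[tuple[str, list[str]]]] = {}

    def register(self, filename: str, block_id: str, node_names: list[str]) -> None:
        self.files.setdefault(filename, []).append((block_id, node_names))

    def locate(self, filename: str) -> list[tuple[str, list[str]]]:
        return self.files[filename]


def read_file(name_node: NameNode, data_nodes: dict[str, DataNode],
              filename: str) -> bytes:
    """Ask the name node where the blocks are, then fetch them from data nodes."""
    parts = []
    for block_id, node_names in name_node.locate(filename):
        # Any replica would do; this sketch simply takes the first one listed.
        parts.append(data_nodes[node_names[0]].read(block_id))
    return b"".join(parts)


data_nodes = {name: DataNode(name) for name in ("dn1", "dn2")}
name_node = NameNode()

# Write two blocks, each replicated on both data nodes.
for block_id, chunk in (("blk_0", b"hello "), ("blk_1", b"world")):
    for node in data_nodes.values():
        node.store(block_id, chunk)
    name_node.register("greeting.txt", block_id, ["dn1", "dn2"])

content = read_file(name_node, data_nodes, "greeting.txt")
print(content)
```

Note that the file contents never pass through the name node in this model; it only hands out locations.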
What are the traits of HDFS?
- No file updates - immutable blocks
- Write once, read many times
- Large blocks, sequential reads
- Designed for batch processing
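As a rough illustration of "write once, read many times", the sketch below models a store that accepts each block exactly once and rejects any later overwrite. `WriteOnceStore` and the block id `blk_1` are hypothetical names for this example only, not part of HDFS.

```python
class WriteOnceStore:
    """Toy block store: a block id can be written once and then only read."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def write(self, block_id: str, data: bytes) -> None:
        if block_id in self._blocks:
            # No in-place updates: an existing block is immutable.
            raise PermissionError(f"block {block_id!r} is already written")
        self._blocks[block_id] = bytes(data)

    def read(self, block_id: str) -> bytes:
        return self._blocks[block_id]


store = WriteOnceStore()
store.write("blk_1", b"log line 1\n")

# Reading is allowed any number of times and always returns the same bytes.
first = store.read("blk_1")
second = store.read("blk_1")

try:
    store.write("blk_1", b"changed")
    overwrite_rejected = False
except PermissionError:
    overwrite_rejected = True

print(first == second, overwrite_rejected)
```

Since a block never changes after it is written, readers and replicas never need to coordinate about updates.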
What is the read time?
What are pure functions?
What does the map function do?
What does the reduce function do?
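A word count is the customary way to illustrate the last three questions. The sketch below is a single-process illustration, not the Hadoop API: `map_words`, `shuffle` and `reduce_counts` are hypothetical names, and the input lines are made up. Both `map_words` and `reduce_counts` are pure functions in the usual sense: their result depends only on their arguments and they change nothing outside themselves, so they can be run on any node, in any order, and rerun safely.

```python
from collections import defaultdict


def map_words(line: str) -> list[tuple[str, int]]:
    """Map step: turn one input record into (key, value) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Group all values by key, as happens between the map and reduce steps."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_counts(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: combine every value seen for one key into a single result."""
    return key, sum(values)


lines = [
    "HDFS stores blocks",
    "blocks are immutable",
    "HDFS replicates blocks",
]

mapped = [pair for line in lines for pair in map_words(line)]
counts = dict(reduce_counts(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

In a real cluster the map calls would run on the nodes that hold the input blocks and the grouped values would be sent to the reducers; here everything runs in one process to keep the example short.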