BIBA - Association Rules - Cluster Analysis
4 important questions on BIBA - Association Rules - Cluster Analysis
How is cluster analysis used in the real world?
- Finance
  - Balanced portfolios: given various stocks, find clusters based on financial performance variables, such as return (daily, weekly, or monthly), volatility, etc.
  - Industry analysis: find groups of similar firms based on measures such as growth rate, profitability, market size, product range, and presence in various markets
- Market segmentation
  - Create groups of customers based on past purchasing behavior, demographic characteristics, or other customer features
- Medicine, e.g., divide data into healthy and suspicious clusters
- Better navigation of search results
  - Grouping of search results thematically
Which measures are used to measure the distance between records during clustering?
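- Euclidean distance (blue line): the straight-line distance between two records, d(A, B) = sqrt( (a1 - b1)^2 + … + (ap - bp)^2 )
- Manhattan distance (red line): the sum of the absolute differences per attribute, d(A, B) = |a1 - b1| + … + |ap - bp|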
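As a minimal sketch of the two measures above, assuming each record is a tuple of numeric attributes of equal length (the function names and sample points are illustrative, not taken from the study material):

```python
import math

def euclidean_distance(a, b):
    # Square root of the sum of squared per-attribute differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    # Sum of absolute per-attribute differences.
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical sample records with two numeric attributes each.
p = (0.0, 0.0)
q = (3.0, 4.0)

print(euclidean_distance(p, q))  # 5.0
print(manhattan_distance(p, q))  # 7.0
```

The Manhattan distance is never smaller than the Euclidean distance for the same pair of records, since it follows the axes instead of the straight line.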
How can you calculate the distance with binary/nominal attributes during clustering?
- Proportion of unequal attributes out of the total number of attributes
- Example
  - XA: ('young', 'myope', 'no', 'reduced', 'none')
  - XB: ('young', 'hypermetrope', 'no', 'reduced', 'none')
  - → d(A, B) = 1/5 (only the second of the five attributes differs)
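The example can be checked with a short sketch, assuming each record is a tuple of nominal values in the same attribute order (the function name `mismatch_distance` is illustrative):

```python
def mismatch_distance(a, b):
    # Proportion of attributes whose values differ between the two records.
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b)) / len(a)

xa = ('young', 'myope', 'no', 'reduced', 'none')
xb = ('young', 'hypermetrope', 'no', 'reduced', 'none')

print(mismatch_distance(xa, xb))  # 0.2, i.e. 1/5
```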
How do we measure distance between clusters?
- For clusters A = {A1, …, Am} and B = {B1, …, Bn}:
- Minimum Distance:
  - The distance between the records Ai and Bj that are closest to each other
  - min( distance( Ai, Bj ) ), i=1, …, m and j=1, …, n
- Maximum Distance:
  - The distance between the records Ai and Bj that are farthest from each other
  - max( distance( Ai, Bj ) ), i=1, …, m and j=1, …, n
- Average Distance:
  - The average of the distances over all possible pairs of records
  - avg( distance( Ai, Bj ) ), i=1, …, m and j=1, …, n
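The three cluster-to-cluster measures can be sketched as follows, assuming clusters are lists of numeric tuples and Euclidean distance is used between records (the helper names and sample clusters are illustrative; any record-level distance could be passed in instead):

```python
import math
from itertools import product

def euclidean(a, b):
    # Record-level distance; assumed here, other measures work as well.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pairwise_distances(cluster_a, cluster_b, distance=euclidean):
    # Distances for every pair (Ai, Bj), i = 1..m and j = 1..n.
    return [distance(a, b) for a, b in product(cluster_a, cluster_b)]

def minimum_distance(cluster_a, cluster_b, distance=euclidean):
    return min(pairwise_distances(cluster_a, cluster_b, distance))

def maximum_distance(cluster_a, cluster_b, distance=euclidean):
    return max(pairwise_distances(cluster_a, cluster_b, distance))

def average_distance(cluster_a, cluster_b, distance=euclidean):
    d = pairwise_distances(cluster_a, cluster_b, distance)
    return sum(d) / len(d)

# Hypothetical clusters whose points lie on one axis, so the
# pairwise distances are simply 4, 6, 3 and 5.
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(4.0, 0.0), (6.0, 0.0)]

print(minimum_distance(A, B))  # 3.0
print(maximum_distance(A, B))  # 6.0
print(average_distance(A, B))  # 4.5
```

In hierarchical clustering these three choices are commonly called single, complete, and average linkage.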