Association rules

9 important questions on Association rules

What are association rules?

Identify item clusters in event-based or transaction-based databases
Usage: display items together, recommendation in online shopping
  • Study of "what goes with what"
  • Customers who bought X also bought Y

What are the rules in Association-rules?

  • Represented in an IF-THEN format
    • “IF” part: antecedent
    • “THEN” part: consequent
  • Both correspond to sets of items (called itemsets)
  • Itemsets are:
    • Possible combinations of items (e.g., products)
    • Can also be a single item
    • NOT records of what people buy
  • Antecedent and consequent are disjoint
    • I.e., have no items in common
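
As a minimal sketch of the IF-THEN structure, the snippet below splits one itemset into every possible antecedent/consequent pair. The function name `rules_from_itemset` and the grocery items are illustrative assumptions, not part of the original notes.

```python
from itertools import combinations

def rules_from_itemset(itemset):
    """Yield every (antecedent, consequent) split of an itemset."""
    items = sorted(itemset)
    # The antecedent takes 1 .. n-1 items; the consequent takes the rest,
    # so neither side is ever empty.
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            consequent = [i for i in items if i not in antecedent]
            yield frozenset(antecedent), frozenset(consequent)

# Hypothetical itemset used only for illustration.
rules = list(rules_from_itemset({"bread", "butter", "milk"}))
for antecedent, consequent in rules:
    # Antecedent and consequent are disjoint by construction.
    assert antecedent.isdisjoint(consequent)
    print(f"IF {sorted(antecedent)} THEN {sorted(consequent)}")
```

A three-item itemset yields six candidate rules this way; which of them are worth keeping is decided later by the strength measures.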

What are frequent itemsets?

Combinations of items that occur with high frequency among the transactions. The criterion for “frequent” is “support”.
  • Support of an itemset: number, or percentage, of transactions that include every item in the itemset
  • Support of a rule: number, or percentage, of transactions that include both the antecedent and the consequent
A frequent itemset is an itemset whose support meets or exceeds a minimum support selected by the user.
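
The support calculation can be sketched as follows. The five transactions, the helper names `support_count` and `support`, and the 40% threshold are made-up assumptions for illustration only.

```python
# Hypothetical transactions; each one is the set of items bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk"},
    {"bread", "butter", "jam"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

def support(itemset, transactions):
    """Support expressed as a fraction of all transactions."""
    return support_count(itemset, transactions) / len(transactions)

min_support = 0.4  # assumed user-chosen threshold
candidate = {"bread", "butter"}
# Treats "meets the threshold" as frequent; use > for a strict cutoff.
is_frequent = support(candidate, transactions) >= min_support
print(support_count(candidate, transactions), support(candidate, transactions), is_frequent)
```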

What is the goal of the apriori algorithm?

Generate frequent itemsets

How does an apriori algorithm work?


For k-itemsets:

  • User sets a minimum support criterion
  • Generate the list of one-itemsets
  • Drop the ones below the support criterion
  • Use the list of one-itemsets to generate the two-itemsets
  • Drop the ones below the support criterion
  • Use the list of two-itemsets to generate the three-itemsets
  • Drop the ones below the support criterion
  • ... (continue until k-itemsets)
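
The steps above can be sketched roughly as below. This is a simplified, unoptimized illustration and not a reference implementation; the function name `apriori`, the `min_count` parameter (a support count rather than a percentage), and the sample transactions are assumptions.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    def keep_frequent(candidates):
        # Count each candidate and drop those below the support criterion.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_count}

    # Start from the one-itemsets.
    items = {item for t in transactions for item in t}
    current = keep_frequent({frozenset([i]) for i in items})
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Build k-itemsets by joining pairs of frequent (k-1)-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical transactions used only for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk"},
    {"bread", "butter", "jam"},
]
frequent = apriori(transactions, min_count=2)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

The loop stops as soon as no k-itemset survives the support criterion, since no larger itemset can be frequent if none of its subsets are.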

How is the strength of a rule assessed?

We need to measure the strength of the association implied by a rule.
Measures:
  • Support
    • Number of transactions that include all items from the antecedent and consequent
  • Confidence
    • Share of the transactions containing the antecedent that also contain the consequent
  • Lift ratio

What is the Lift ratio?

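Compares the confidence of the rule with a benchmark value: lift ratio = confidence / benchmark confidence. The benchmark assumes that the consequent is independent of the antecedent. A lift ratio above 1 suggests that the antecedent and consequent occur together more often than independence would predict.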

What is benchmark confidence?

Number of transactions that contain the consequent, as a percentage of all transactions
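
As a small worked sketch of the three measures, the code below computes support, confidence, benchmark confidence, and lift ratio for one rule. The helper names and the sample transactions are illustrative assumptions.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t)

def rule_measures(antecedent, consequent, transactions):
    """Strength measures for the rule IF antecedent THEN consequent.

    Assumes the antecedent and the consequent each occur in at least
    one transaction; otherwise a division by zero is raised.
    """
    n = len(transactions)
    both = count(antecedent | consequent, transactions)
    confidence = both / count(antecedent, transactions)
    benchmark = count(consequent, transactions) / n
    return {
        "support": both / n,
        "confidence": confidence,
        "benchmark_confidence": benchmark,
        "lift": confidence / benchmark,
    }

# Hypothetical transactions used only for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk"},
    {"bread", "butter", "jam"},
]
measures = rule_measures({"bread"}, {"butter"}, transactions)
print(measures)
```

In this made-up data, butter appears in 3 of 5 transactions overall but in 3 of the 4 transactions that contain bread, so the lift ratio is above 1.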

Explain the binary incidence matrix.

  • Columns are items
  • Rows are transactions
  • Cells indicate the presence or absence of items in transactions
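
A binary incidence matrix can be built from a list of transactions along these lines. The transactions are made up for illustration, and plain nested lists stand in for whatever table structure a real tool would use.

```python
# Hypothetical transactions used only for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk"},
    {"bread", "butter", "jam"},
]

# Columns: one per distinct item, in a fixed (alphabetical) order.
items = sorted({item for t in transactions for item in t})

# Rows: one per transaction; 1 marks presence, 0 marks absence.
matrix = [[1 if item in t else 0 for item in items] for t in transactions]

# Column sums give the support count of each single item.
item_counts = {item: sum(row[j] for row in matrix) for j, item in enumerate(items)}

print(items)
for row in matrix:
    print(row)
print(item_counts)
```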


  • Find all rules with a support count of at least 2
    • equivalent to a percentage of 20%
    • rules with items that were purchased together in at least 20% of the transactions
  • Compute support of the items
