Lecture causality
16 important questions on Lecture causality
What is the definition of a causal relationship according to various philosophers?
- What constitutes causality is one of the biggest questions of philosophy.
- Requirements for a relation to count as causal, according to some philosophers, include:
- B invariably follows A (David Hume)
- A is an Insufficient but Nonredundant part of an Unnecessary but Sufficient condition for B (INUS condition; John Mackie)
- B counterfactually depends on A: if A had not happened, B would not have happened (David Lewis)
- and more...
What are the disadvantages of philosophical accounts of causality?
- Philosophical definitions of causality typically assume a deterministic situation, whereas statistics deals with probabilistic rather than deterministic relations.
- This mismatch between probabilistic statistical analyses and deterministic theories of causality made causal relationships appear impossible to establish through statistics.
What was done to better connect theories about causality with statistical methods that can back up causal claims?
- Judea Pearl suggested an alternative approach based on structural equation models (structural causal models)
- He argues that causal relations should be framed in terms of interventions on a model: given a causal model, what would happen to B if we changed A?
- He stated that causal relationships become clear from interventions, not from mere observations.
Describe Pearl's approach to modelling causality.
- Standard statistical language has no terms that describe a causal relationship between A and B.
- He asks: how can we statistically express the fact that rain causes the pavement to be wet?
- He gets no further than a variant of “rain and wet pavements are correlated, and it rained first”
- This is less powerful than our verbal notion of causality, which implies that if it had not rained, the pavement would have been dry.
- To solve this, he introduced new syntax: the do-operator, Do(A = a), denoting an intervention that sets A to the value a.
With this new operator, how did Pearl then define causation?
- Definition: A causal effect of A on B is present whenever E(B|Do(A=a)) ≠ E(B)
- In words: the expected value of B after intervening to set A = a must differ from the unconditional expected value of B (see the sketch after this list).
- This fixes a structural equation, which can be represented in a graph by drawing a directed arrow from A to B whenever (in the model structure) changing A affects B but not vice versa:
- The arrows in the model do not depict regression coefficients; they depict intervention coefficients.
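A minimal simulation sketch of this definition, under an assumed linear-Gaussian toy model (A := U_A, B := 2·A + U_B; the coefficient 2 and the intervention value a = 1 are illustrative choices, not from the lecture):

```python
# Toy structural model: A := U_A,  B := 2*A + U_B  (assumed for illustration).
# We contrast the observational E(B) with the interventional E(B | Do(A = 1)).
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Observational regime: A is generated by its own structural equation.
A_obs = rng.normal(size=n)
B_obs = 2 * A_obs + rng.normal(size=n)

# Interventional regime: Do(A = 1) overrides A's equation; B's equation stays intact.
A_do = np.ones(n)
B_do = 2 * A_do + rng.normal(size=n)

print("E(B)             ~", round(B_obs.mean(), 2))   # ~ 0
print("E(B | Do(A = 1)) ~", round(B_do.mean(), 2))    # ~ 2: A has a causal effect on B
```

Because the two expectations differ, the definition above declares a causal effect of A on B in this toy model.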
What concept turned out to be key to identify causal relationships from data?
- Pearl and Glymour et al. simultaneously developed the insight that causal relations as encoded in structural equations are not revealed by correlations or conditional probabilities alone
- Instead, conditional independence relations are key to the identification of causal structure
How can conditional independence be used to determine a causal relationship from data?
- Instead of trying to determine the causal relationship between two variables in a system containing only those two variables, shift attention from bivariate to multivariate systems and then ask two new questions:
- 1) Which conditional independence relations are implied by a given causal structure?
- 2) Which causal structures are implied by a given set of conditional independence relations?
With this knowledge, how can we infer causal relationships from conditional independence?
- If we combine small networks to build larger networks, we can develop a graphical criterion (d-separation) to deduce the implied CI relations from a causal graph, i.e., we can read them off the graph rather than solve equations (a d-separation sketch follows after this list)
- If we have a dataset, we can establish which of a set of possible causal graphs could have generated the CI relations observed
- If certain links cannot be deleted from the graph then it is in principle possible to establish causal relations from non-experimental data (pace David Hume)
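A sketch of such a graphical criterion: d-separation, checked with the classic "ancestral subgraph + moralisation" recipe. The four-node DAG (A → B → C plus the collider A → D ← C) is an assumed toy example, and the helper function is written for illustration rather than taken from any particular library.

```python
# d-separation on a toy DAG, implemented with basic networkx primitives.
import networkx as nx

def d_separated(G, xs, ys, zs):
    """True iff every path between xs and ys in DAG G is blocked given zs."""
    relevant = set(xs) | set(ys) | set(zs)
    # 1. Keep only X, Y, Z and their ancestors (the ancestral subgraph).
    keep = set(relevant)
    for v in relevant:
        keep |= nx.ancestors(G, v)
    H = G.subgraph(keep)
    # 2. Moralise: marry parents that share a child, then drop edge directions.
    M = nx.Graph(H.to_undirected())
    for child in H.nodes:
        parents = list(H.predecessors(child))
        for i in range(len(parents)):
            for j in range(i + 1, len(parents)):
                M.add_edge(parents[i], parents[j])
    # 3. Remove the conditioning set; X and Y are d-separated iff now disconnected.
    M.remove_nodes_from(zs)
    return all(not nx.has_path(M, x, y) for x in xs for y in ys)

G = nx.DiGraph([("A", "B"), ("B", "C"), ("A", "D"), ("C", "D")])
print(d_separated(G, {"A"}, {"C"}, {"B"}))        # True: the chain A->B->C is blocked by B
print(d_separated(G, {"A"}, {"C"}, {"B", "D"}))   # False: conditioning on collider D opens A-D-C
```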
What is the result of conditioning on the collider?
- Conditioning on a collider (a variable with two incoming arrows) induces a spurious dependence between its otherwise independent causes (see the sketch below).
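A minimal sketch, using an assumed toy collider A → C ← B: A and B are generated independently, yet selecting cases by the value of the collider C ("conditioning on C") makes them look associated.

```python
# Collider bias in a toy model A -> C <- B (assumed for illustration).
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
A = rng.normal(size=n)
B = rng.normal(size=n)          # independent of A by construction
C = A + B + 0.5 * rng.normal(size=n)

sel = C > 1.0                   # conditioning on the collider by selecting on its value
print("corr(A, B)           ~", round(np.corrcoef(A, B)[0, 1], 2))            # ~ 0
print("corr(A, B | C large) ~", round(np.corrcoef(A[sel], B[sel])[0, 1], 2))  # clearly negative
```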
Summarise what DAGs are used for.
- If you have a causal network that consists of variables coupled through (directed and acyclic) structural relations...
- then you can tell which conditional independence patterns will arise...
- just by looking at the picture
What is the use of knowing conditional independencies in your data?
- And in the other direction: if you have a set of conditional independencies, you can search for the possible causal networks that could have produced them
- If you can assume that the data are generated by a causal network, then you can sometimes conclude that certain relations have to be present
- This means that causal inference from correlational data is possible in principle
- Thus: conditional independencies can be deduced from a causal structure, and causal structure can (partly) be inferred from the conditional independencies in the data.
What are equivalence classes?
- An equivalence class is a set of different causal models (DAGs) that are equally compatible with the data: they imply the same conditional independencies and hence the same factorization of the joint probability distribution.
- The edges in these models are in the same locations, but some of their directions differ.
- The induction problem resurfaces as the inability to choose between members of an equivalence class of DAGs (see the sketch below).
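A minimal sketch with two assumed toy models: the chain A → B → C and the fork A ← B → C are members of the same equivalence class; both imply a marginal A–C dependence together with A ⟂ C | B, so the conditional independencies in the data cannot tell them apart.

```python
# Chain and fork generate the same conditional-independence pattern (toy example).
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing z out of both."""
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    return np.corrcoef(rx, ry)[0, 1]

# Chain: A -> B -> C
A = rng.normal(size=n)
B = 0.8 * A + rng.normal(size=n)
C = 0.8 * B + rng.normal(size=n)
print("chain: corr(A,C) ~ %.2f, corr(A,C|B) ~ %.2f"
      % (np.corrcoef(A, C)[0, 1], partial_corr(A, C, B)))

# Fork: A <- B -> C
B = rng.normal(size=n)
A = 0.8 * B + rng.normal(size=n)
C = 0.8 * B + rng.normal(size=n)
print("fork : corr(A,C) ~ %.2f, corr(A,C|B) ~ %.2f"
      % (np.corrcoef(A, C)[0, 1], partial_corr(A, C, B)))
# Both report a clearly nonzero corr(A,C) and a near-zero corr(A,C|B):
# the same CI pattern, hence the same equivalence class.
```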
How can you infer the causal model from the data?
- There are several approaches:
- Constraint-based:
- Use the causal calculus to find the DAGs that are consistent with the conditional independencies
- Example: IC algorithm (implemented by the PC algorithm)
- Score-based:
- Define a fit function (e.g. an information criterion, R², a goodness-of-fit measure) and search for the DAG that fits best (a toy score-based sketch follows after this list)
- Example: Hill-Climbing algorithm
- Hybrid algorithms:
- Combine constraint-based approaches with score-based approaches
- Example: Sparse-Candidate algorithm
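A toy score-based sketch, under assumptions chosen for illustration: linear-Gaussian data generated from A → B → C, a hand-rolled Gaussian BIC score, and exhaustive scoring of all three-node DAGs instead of a real search procedure such as Hill-Climbing.

```python
# Score every DAG on three variables with a Gaussian BIC and report the best ones.
import itertools
import numpy as np

rng = np.random.default_rng(4)
n = 5_000
data = {"A": rng.normal(size=n)}
data["B"] = 0.8 * data["A"] + rng.normal(size=n)   # assumed true structure: A -> B -> C
data["C"] = 0.8 * data["B"] + rng.normal(size=n)
nodes = ["A", "B", "C"]

def bic_node(child, parents):
    """Gaussian BIC contribution of the structural equation child ~ parents."""
    y = data[child]
    X = np.column_stack([data[p] for p in parents] + [np.ones(n)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = np.var(y - X @ beta)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return loglik - 0.5 * (len(parents) + 2) * np.log(n)   # penalise extra parameters

def is_dag(edges):
    """True iff some ordering of the nodes puts every edge 'forward'."""
    return any(all(order.index(a) < order.index(b) for a, b in edges)
               for order in itertools.permutations(nodes))

possible_edges = [(a, b) for a in nodes for b in nodes if a != b]
scored = []
for r in range(len(possible_edges) + 1):
    for edges in itertools.combinations(possible_edges, r):
        if not is_dag(edges):
            continue
        parents = {v: [a for a, b in edges if b == v] for v in nodes}
        scored.append((sum(bic_node(v, parents[v]) for v in nodes), edges))

scored.sort(reverse=True)
for score, edges in scored[:3]:
    print(round(score, 1), edges)
# The top-scoring graphs should be the Markov-equivalent DAGs A->B->C, C->B->A and
# B->A plus B->C: even score-based search ends up choosing within an equivalence class.
```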
What is the power of (undirected) networks as opposed to causal models?
- Most network models are aimed at identifying conditional (in)dependence relations
- These are graphically represented in a network
- However, unlike causal approaches, networks are model-free: you don't have to assume a particular causal model a priori
- This makes them very useful in exploratory data analysis
- Practice reasoning about conditional (in)dependence
- The downside of undirected networks is that we cannot draw causal conclusions; the advantage is that we do not have to commit to strong causal assumptions to estimate the network (see the partial-correlation sketch below).
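A minimal sketch of this model-free flavour, with assumed linear-Gaussian toy data: the undirected conditional-independence network is read off the inverse covariance (precision) matrix, where a near-zero partial correlation means "no edge".

```python
# Partial-correlation (Gaussian graphical model) network for toy chain data.
import numpy as np

rng = np.random.default_rng(5)
n = 20_000
A = rng.normal(size=n)
B = 0.8 * A + rng.normal(size=n)
C = 0.8 * B + rng.normal(size=n)          # chain A - B - C, so A ⟂ C | B
X = np.column_stack([A, B, C])

precision = np.linalg.inv(np.cov(X, rowvar=False))
# Standardise to partial correlations: pcor(i, j) = -K_ij / sqrt(K_ii * K_jj)
d = np.sqrt(np.diag(precision))
pcor = -precision / np.outer(d, d)
np.fill_diagonal(pcor, 1.0)
print(np.round(pcor, 2))   # the (A, C) entry is ~ 0: no A-C edge in the network
```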
Can you interpret the latent variable model as a causal model?
- Latent variable models behave like causal models (common cause models)
- The most important assumption of the latent variable model is local independence: the indicators are conditionally independent given the common cause (see the sketch below).
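A minimal sketch of local independence, under an assumed one-factor toy model with two indicators (the loadings of 0.7 are illustrative): the indicators are correlated marginally but become (approximately) uncorrelated once the latent common cause is partialled out.

```python
# Local independence in a toy one-factor (common cause) model.
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
L = rng.normal(size=n)                    # latent common cause
X1 = 0.7 * L + rng.normal(size=n)
X2 = 0.7 * L + rng.normal(size=n)

def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing z out of both."""
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    return np.corrcoef(rx, ry)[0, 1]

print("corr(X1, X2)     ~", round(np.corrcoef(X1, X2)[0, 1], 2))   # clearly positive
print("corr(X1, X2 | L) ~", round(partial_corr(X1, X2, L), 2))     # ~ 0: local independence
```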
Summarize how causality can be studied and inferred.
- conditional independence is a window onto the causal world
- However not all relations derive from directed causal paths
- Network analysis provides tools to assess conditional independence in a model-free way
- If you're willing to make causal assumptions, search algorithms can be used to recover (part of) the causal structure.