Causal Inference

Front-door Criterion

Bruno Gonçalves

Published in

Data For Science

5 min read · Nov 15, 2020

This is the twelfth post in the series in which we work our way through “Causal Inference In Statistics”, a nice Primer co-authored by Judea Pearl himself.

You can find the previous post here and all the relevant Python code in the companion GitHub Repository:

DataForScience/Causality

How do causes lead to effects? Can you associate the cause leading to the observed effect? Big Data opens the doors for…

github.com

While I will do my best to introduce the content in a clear and accessible way, I highly recommend that you get the book yourself and follow along. So, without further ado, let’s get started!

3.4 — Front-Door Criterion

The Front-Door Criterion is a complementary approach to identifying sets of variables we can use to estimate causal effects from non-experimental data. It is particularly useful when we are unable to identify any set of variables that satisfies the Back-Door Criterion discussed previously.

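Pearl motivates the Front-Door Criterion by going back to the smoking-cancer problem, using this DAG:

[Figure: DAG with Smoking (X) → Tar (Z) → Cancer (Y), plus an unmeasured Genotype (U) with U → X and U → Y]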

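Here our goal is to estimate the causal effect of Smoking (X) on Cancer (Y) while being unable to directly measure the Genotype (U). From the DAG we can see that no set of measured variables satisfies the back-door criterion for this effect, as U is unmeasured. The effect of X on Tar (Z), however, is identifiable: the only back-door path from X to Z, X ← U → Y ← Z, is blocked by the collider at Y, so we can immediately write:

$$P(Z = z \mid do(X = x)) = P(Z = z \mid X = x)$$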
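As a quick numerical check of the claim that conditioning on X is enough to get the effect on Z but not the effect on Y, here is a minimal sketch that simulates a structural model consistent with the DAG. Every probability in it (and the helper names `simulate` and `mean`) is an illustrative assumption of mine, not a value from the book or the companion repository.

```python
import random


def simulate(n, rng, do_x=None):
    """Draw n samples of (x, z, y). U is generated but never returned."""
    samples = []
    for _ in range(n):
        u = rng.random() < 0.5                      # Genotype (hidden)
        if do_x is None:
            x = rng.random() < (0.8 if u else 0.2)  # Smoking depends on U
        else:
            x = bool(do_x)                          # intervention: cut U -> X
        z = rng.random() < (0.9 if x else 0.1)      # Tar depends only on X
        y = rng.random() < 0.1 + 0.5 * z + 0.3 * u  # Cancer depends on Z and U
        samples.append((int(x), int(z), int(y)))
    return samples


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rng = random.Random(42)
observed = simulate(100_000, rng)             # passive observation
intervened = simulate(100_000, rng, do_x=1)   # everyone forced to X = 1

p_z_given_x1 = mean(z for x, z, _ in observed if x == 1)
p_z_do_x1 = mean(z for _, z, _ in intervened)
p_y_given_x1 = mean(y for x, _, y in observed if x == 1)
p_y_do_x1 = mean(y for _, _, y in intervened)

print(f"P(Z=1 | X=1)     = {p_z_given_x1:.3f}")
print(f"P(Z=1 | do(X=1)) = {p_z_do_x1:.3f}")
print(f"P(Y=1 | X=1)     = {p_y_given_x1:.3f}")
print(f"P(Y=1 | do(X=1)) = {p_y_do_x1:.3f}")
```

Under these assumed parameters the two Z estimates should agree up to sampling noise (both close to 0.9), while the conditional P(Y=1 | X=1) should land near 0.79 and the interventional P(Y=1 | do(X=1)) near 0.70. The gap is the confounding introduced by U.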

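On the other hand, we can identify the effect of Tar (Z) on Cancer (Y) by using the back-door criterion: conditioning on X blocks the back-door path Z ← X ← U → Y, which gives:

$$P(Y = y \mid do(Z = z)) = \sum_{x} P(Y = y \mid Z = z, X = x)\, P(X = x)$$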
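The back-door adjustment for the effect of Z on Y can be sketched on simulated observational data. This is only an illustration: the structural probabilities and the function names (`p_y1_do_z`, `p_y1_given_z`, and so on) are hypothetical choices of mine, and U is deliberately left out of the recorded data to mimic the unmeasured Genotype.

```python
import random
from collections import Counter

rng = random.Random(0)
counts = Counter()  # counts[(x, z, y)] over the observed variables only

for _ in range(200_000):
    u = rng.random() < 0.5                      # Genotype (never recorded)
    x = rng.random() < (0.8 if u else 0.2)      # Smoking
    z = rng.random() < (0.9 if x else 0.1)      # Tar
    y = rng.random() < 0.1 + 0.5 * z + 0.3 * u  # Cancer
    counts[int(x), int(z), int(y)] += 1

n = sum(counts.values())


def p_x(x):
    """Empirical P(X = x)."""
    return sum(c for (xi, _, _), c in counts.items() if xi == x) / n


def p_y1_given_zx(z, x):
    """Empirical P(Y = 1 | Z = z, X = x)."""
    return counts[x, z, 1] / (counts[x, z, 0] + counts[x, z, 1])


def p_y1_do_z(z):
    """Back-door adjustment: sum_x P(Y = 1 | Z = z, X = x) P(X = x)."""
    return sum(p_y1_given_zx(z, x) * p_x(x) for x in (0, 1))


def p_y1_given_z(z):
    """Unadjusted P(Y = 1 | Z = z), shown for comparison."""
    hits = sum(counts[x, z, 1] for x in (0, 1))
    total = sum(counts[x, z, y] for x in (0, 1) for y in (0, 1))
    return hits / total


for z in (0, 1):
    print(f"z={z}: adjusted {p_y1_do_z(z):.3f}, unadjusted {p_y1_given_z(z):.3f}")
```

With these assumed parameters the true interventional values are 0.25 for Z = 0 and 0.75 for Z = 1, so the adjusted estimates should sit close to those numbers, while the unadjusted conditionals are pulled away from them by the open path through X and U.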

Now we can chain the two expressions together to obtain the causal effect of X on Y:

$$P(Y = y \mid do(X = x)) = \sum_{z} P(Y = y \mid do(Z = z))\, P(Z = z \mid do(X = x))$$

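The motivation for this expression is clear if we consider a two-stage intervention. If we set the value of X, we can determine how Z responds, and we can then intervene again to fix Z at each of its values. By doing this for every value of Z, weighting each by how likely it is under the first intervention, we are able to determine the effect of X on Y! The general expression, known as the front-door formula, is:

$$P(Y = y \mid do(X = x)) = \sum_{z} P(Z = z \mid X = x) \sum_{x'} P(Y = y \mid X = x', Z = z)\, P(X = x')$$

where x' is a summation index running over the values of X, distinct from the intervention value x.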
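To see the front-door formula at work, the following sketch builds the exact observational distribution P(x, z, y) of a small binary model that matches the DAG, applies the formula using only that observed distribution, and compares the result with the interventional truth computed from the full model. The probability tables and helper names (`bern`, `prob`, `front_door`, `truth`) are assumptions made up for this example. It is not the code from the companion repository.

```python
from itertools import product

# Hypothetical structural model for U -> X -> Z -> Y and U -> Y.
P_U1 = 0.5
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}
P_Z1_GIVEN_X = {0: 0.1, 1: 0.9}


def p_y1_given_zu(z, u):
    return 0.1 + 0.5 * z + 0.3 * u


def bern(p1, value):
    """Probability of a binary value when P(value = 1) = p1."""
    return p1 if value == 1 else 1.0 - p1


# Observational joint over (x, z, y): U is summed out, as if unmeasured.
joint = {}
for u, x, z, y in product((0, 1), repeat=4):
    p = (bern(P_U1, u) * bern(P_X1_GIVEN_U[u], x)
         * bern(P_Z1_GIVEN_X[x], z) * bern(p_y1_given_zu(z, u), y))
    joint[x, z, y] = joint.get((x, z, y), 0.0) + p


def prob(x=None, z=None, y=None):
    """Marginal probability of the specified assignments (None = sum out)."""
    return sum(p for (xi, zi, yi), p in joint.items()
               if x in (None, xi) and z in (None, zi) and y in (None, yi))


def front_door(y, x):
    """sum_z P(z | x) * sum_x' P(y | x', z) P(x'), from observed data only."""
    total = 0.0
    for z in (0, 1):
        p_z_given_x = prob(x=x, z=z) / prob(x=x)
        inner = sum(prob(x=xp, z=z, y=y) / prob(x=xp, z=z) * prob(x=xp)
                    for xp in (0, 1))
        total += p_z_given_x * inner
    return total


def truth(y, x):
    """P(y | do(x)) computed from the full model, using the hidden U."""
    return sum(bern(P_Z1_GIVEN_X[x], z) * bern(P_U1, u)
               * bern(p_y1_given_zu(z, u), y)
               for z, u in product((0, 1), repeat=2))


def naive(y, x):
    """Plain conditional P(y | x), biased by the confounder U."""
    return prob(x=x, y=y) / prob(x=x)


for x in (0, 1):
    print(f"x={x}: front-door {front_door(1, x):.2f}, "
          f"truth {truth(1, x):.2f}, naive {naive(1, x):.2f}")
# x=0: front-door 0.30, truth 0.30, naive 0.21
# x=1: front-door 0.70, truth 0.70, naive 0.79
```

Because the joint is exact rather than sampled, the front-door estimate matches the truth up to floating-point error, whereas the naive conditional overstates the effect of smoking (0.58 instead of 0.40 on the risk-difference scale in this toy model).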