{{Short description|Type of statistical variable}}
In statistics, '''bad controls''' are variables that, when included in a regression, introduce an unintended discrepancy between the regression coefficients and the causal effects that those coefficients are supposed to measure. They are contrasted with confounders, which are "'''good controls'''" and must be included to remove omitted-variable bias.<ref name=Cinelli2020/><ref name=Angrist2014/><ref name=Angrist2008/> The problem arises when a bad control is itself an outcome of the variable of interest (or closely resembles one) in the causal model, so that adjusting for it eliminates part of the causal path being estimated. In other words, bad controls might just as well be dependent variables in the model under consideration.<ref name=Angrist2008/> Angrist and Pischke (2008) further distinguish two types of bad controls: a simple bad-control scenario and a proxy-control scenario, in which the included variable partially controls for omitted factors but is itself partially affected by the variable of interest.<ref name=Angrist2008/> Pearl (1995) provides a graphical method for identifying good controls using causal diagrams together with the '''back-door criterion''' and the '''front-door criterion'''.<ref name=Pearl1995/>

== Examples ==

=== ''Simple'' bad control ===

thumb|alt=causal diagram of education, work type and wages variables|Causal diagram showing a simple bad control. If we control for work type <math>T</math> when regressing wages <math>W</math> on education <math>E</math>, we block the causal path <math>E \to T \to W</math>, and the resulting regression coefficient no longer has a causal interpretation.

A simplified example studies the effect of education <math>E</math> on wages <math>W</math>.<ref name=Angrist2008/> In this thought experiment, two levels of education <math>E</math> are possible (lower and higher), and two types of jobs <math>T</math> are performed (white-collar and blue-collar work). When estimating the causal effect of education on an individual's wages, it might be tempting to control for work type <math>T</math>. However, work type is a mediator (<math>E \to T \to W</math>) in the causal relationship between education and wages (see causal diagram). Controlling for it removes the part of the effect that education has on wages through work type, so the regression coefficient on education can no longer be read as the causal effect of education.
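
The mediator problem can be illustrated with a small simulation. The following Python sketch is not taken from the cited sources: the data-generating process, the coefficient values (a direct effect of 1.0, a white-collar premium of 2.0, and probabilities 0.8 and 0.2 of white-collar work for higher and lower education) and the helper function <code>ols</code> are hypothetical choices made only for illustration. Under these assumptions the total effect of education on wages is 1.0 + 2.0 × (0.8 − 0.2) = 2.2.

```python
import random


def ols(y, *columns):
    """Return OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [(1.0,) + tuple(float(v) for v in xs) for xs in zip(*columns)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


random.seed(0)
n = 20_000

# Assumed (hypothetical) data-generating process for E -> T -> W and E -> W.
E = [random.random() < 0.5 for _ in range(n)]            # higher education?
T = [random.random() < (0.8 if e else 0.2) for e in E]   # white-collar job?
W = [1.0 * e + 2.0 * t + random.gauss(0.0, 1.0) for e, t in zip(E, T)]

without_control = ols(W, E)     # regress W on E only
with_control = ols(W, E, T)     # regress W on E and the mediator T

print("coefficient on E, T not controlled:", round(without_control[1], 2))
print("coefficient on E, T controlled:    ", round(with_control[1], 2))
```

With these assumed parameters, the regression that omits work type should give a coefficient close to the total effect of 2.2, whereas the regression that adjusts for work type should give a coefficient close to the direct effect of 1.0, because the part of the effect transmitted through <math>T</math> is absorbed by the control.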

=== Bad proxy-control ===

thumb|alt=causal diagram of education, innate ability, late ability and wages|Causal diagram showing a bad proxy-control. If we control for late ability <math>L</math> when regressing wages <math>W</math> on education <math>E</math>, we open a new non-causal path <math>E \to L \leftarrow I \to W</math> and thus introduce collider bias.

Another example of a bad control arises when attempting to control for innate ability while estimating the effect of education <math>E</math> on wages <math>W</math>.<ref name=Angrist2008/> In this example, innate ability <math>I</math> (thought of as, for example, IQ at pre-school age) is a variable that influences wages <math>W</math>, but its value is unavailable to researchers at the time of estimation. Instead, they use IQ test scores <math>L</math> measured before entering work, or late ability, as a proxy variable for innate ability, and regress wages on education while adjusting for late ability. Unfortunately, late ability (in this thought experiment) is causally determined by both education and innate ability. Because <math>L</math> is a collider on the path <math>E \to L \leftarrow I \to W</math>, controlling for it opens this previously blocked non-causal path and introduces collider bias into the model. On the other hand, if both links <math>E \to L</math> and <math>I \to L</math> are strong, one can expect a strong (non-causal) correlation between <math>I</math> and <math>E</math>, and thus a large omitted-variable bias if <math>I</math> is not controlled for. This issue, however, is separate from the causality problem.
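
The collider bias described above can also be sketched numerically. The following Python simulation is not from the cited sources and rests on several hypothetical assumptions: education is generated independently of innate ability, all variables are standardised and normally distributed, every structural coefficient equals 1.0, and the helper <code>ols</code> is an illustrative least-squares routine. Under these assumptions the true effect of <math>E</math> on <math>W</math> is 1.0, and the population coefficient on <math>E</math> after adjusting for <math>L</math> works out to 0.5.

```python
import random


def ols(y, *columns):
    """Return OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [(1.0,) + tuple(float(v) for v in xs) for xs in zip(*columns)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


random.seed(1)
n = 20_000

# Assumed (hypothetical) data-generating process for E -> L <- I -> W and E -> W.
I = [random.gauss(0.0, 1.0) for _ in range(n)]   # innate ability (unobserved)
E = [random.gauss(0.0, 1.0) for _ in range(n)]   # education, independent of I
L = [e + i + random.gauss(0.0, 1.0) for e, i in zip(E, I)]   # late ability
W = [1.0 * e + 1.0 * i + random.gauss(0.0, 1.0) for e, i in zip(E, I)]

no_proxy = ols(W, E)        # regress W on E only
with_proxy = ols(W, E, L)   # regress W on E and the collider L

print("coefficient on E, L not controlled:", round(no_proxy[1], 2))
print("coefficient on E, L controlled:    ", round(with_proxy[1], 2))
```

In this setup, the regression without the proxy should return a coefficient near the true effect of 1.0, while adjusting for <math>L</math> should pull the coefficient on <math>E</math> towards 0.5. The simulation deliberately omits any link from <math>I</math> to <math>E</math>, so it isolates the collider bias and does not reproduce the omitted-variable bias that arises when innate ability also influences education.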

== References ==

<references>
<ref name=Cinelli2020>{{cite journal |title=A crash course in good and bad controls |vauthors=Cinelli C, Forney A, Pearl J |journal=Sociological Methods & Research |year=2020 |publisher=SAGE Publications |url=http://ftp.cs.ucla.edu/pub/stat_ser/r493.pdf }}</ref>
<ref name=Angrist2008>{{cite book |title=Mostly Harmless Econometrics: An Empiricist's Companion |vauthors=Angrist JD, Pischke JS |isbn=0691120358 |year=2008 |publisher=Princeton University Press }}</ref>
<ref name=Angrist2014>{{cite book |title=Mastering ’metrics: The path from cause to effect |vauthors=Angrist JD, Pischke JS |isbn=9780691152844 |year=2014 |publisher=Princeton University Press }}</ref>
<ref name=Pearl1995>{{cite journal |vauthors=Pearl J |title=Causal diagrams for empirical research |journal=Biometrika |volume=82 |number=4 |pages=669–688 |year=1995 |issn=0006-3444 |doi=10.1093/biomet/82.4.669 |url=https://doi.org/10.1093/biomet/82.4.669 |url-access=subscription }}</ref>
</references>

[[Category:Statistical concepts]]