Identifiability

{{Short description|Statistical property which a model must satisfy to allow precise inference}}
{{For-multi|the related problem in economics|Parameter identification problem|the concept of identifiability in the area of system identification|Structural identifiability}}

In statistics, '''identifiability''' is a property which a model must satisfy for precise inference to be possible. A model is '''identifiable''' if it is theoretically possible to learn the true values of this model's underlying parameters after obtaining an infinite number of observations from it. Mathematically, this is equivalent to saying that different values of the parameters must generate different probability distributions of the observable variables. Usually the model is identifiable only under certain technical restrictions, in which case the set of these requirements is called the '''identification conditions'''.

A model that fails to be identifiable is said to be '''non-identifiable''' or '''unidentifiable''': two or more parametrizations are observationally equivalent. In some cases, even though a model is non-identifiable, it is still possible to learn the true values of a certain subset of the model parameters. In this case we say that the model is '''partially identifiable'''. In other cases it may be possible to learn the location of the true parameter up to a certain finite region of the parameter space, in which case the model is '''set identifiable'''.

Aside from strictly theoretical exploration of the model properties, '''identifiability''' can be referred to in a wider scope when a model is tested with experimental data sets, using identifiability analysis.<ref> {{Cite journal| doi = 10.1093/bioinformatics/btp358| volume = 25| issue = 15| pages = 1923–1929| last1 = Raue| first1 = A.| last2 = Kreutz| first2 = C.| last3 = Maiwald| first3 = T.| last4 = Bachmann| first4 = J.| last5 = Schilling| first5 = M.| last6 = Klingmuller| first6 = U.| last7 = Timmer| first7 = J.| title = Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood| journal = Bioinformatics| date = 2009-08-01| pmid=19505944| doi-access = free}} </ref>

==Definition==
Let <math>\mathcal{P}=\{P_\theta:\theta\in\Theta\}</math> be a statistical model with parameter space <math>\Theta</math>. We say that <math>\mathcal{P}</math> is '''identifiable''' if the mapping <math>\theta\mapsto P_\theta</math> is one-to-one:<ref>{{harvnb|Lehmann|Casella|1998|loc=Ch. 1, Definition 5.2}}</ref>
: <math> P_{\theta_1}=P_{\theta_2} \quad\Rightarrow\quad \theta_1=\theta_2 \quad\ \text{for all } \theta_1,\theta_2\in\Theta. </math>

This definition means that distinct values of ''θ'' should correspond to distinct probability distributions: if ''θ''<sub>1</sub> ≠ ''θ''<sub>2</sub>, then also ''P''<sub>''θ''<sub>1</sub></sub> ≠ ''P''<sub>''θ''<sub>2</sub></sub>.<ref>{{harvnb|van der Vaart|1998|page=62}}</ref> If the distributions are defined in terms of the probability density functions (pdfs), then two pdfs should be considered distinct only if they differ on a set of non-zero measure (for example, the two functions ƒ<sub>1</sub>(''x'') = '''1'''<sub>0 ≤ ''x'' &lt; 1</sub> and ƒ<sub>2</sub>(''x'') = '''1'''<sub>0 ≤ ''x'' ≤ 1</sub> differ only at the single point ''x'' = 1, a set of measure zero, and thus cannot be considered as distinct pdfs).
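As a minimal illustration of the definition, the following Python sketch uses a hypothetical model (chosen for this example, not taken from the cited sources) in which an observation is normal with mean ''a'' + ''b'' and unit variance, and the parameter is ''θ'' = (''a'', ''b''). The function name <code>density</code> and the parameter values are assumptions. Two distinct parameter values give the same density at every grid point, so the map from ''θ'' to the distribution is not one-to-one.

```python
import math

def density(x, theta):
    """Density of N(a + b, 1) at x, for theta = (a, b) (hypothetical model)."""
    a, b = theta
    mean = a + b
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

theta_1 = (1.0, 2.0)
theta_2 = (0.5, 2.5)   # a different parameter value with the same sum a + b

# Compare the two densities on a grid of points from -5.0 to 10.0.
grid = [k / 10.0 for k in range(-50, 101)]
max_gap = max(abs(density(x, theta_1) - density(x, theta_2)) for x in grid)

print("parameters differ:", theta_1 != theta_2)
print("largest density gap on the grid:", max_gap)
```

In this sketch only the sum ''a'' + ''b'' affects the distribution, so the sum can be learned from data while ''a'' and ''b'' separately cannot.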

Identifiability of the model in the sense of invertibility of the map <math>\theta\mapsto P_\theta</math> is equivalent to being able to learn the model's true parameter if the model can be observed indefinitely long. Indeed, if {''X<sub>t</sub>''} ⊆ ''S'' is the sequence of observations from the model, then by the strong law of large numbers,
: <math> \frac 1 T \sum_{t=1}^T \mathbf{1}_{\{X_t\in A\}} \ \xrightarrow{\text{a.s.}}\ \Pr[X_t\in A], </math>
for every measurable set ''A'' ⊆ ''S'' (here '''1'''<sub>{...}</sub> is the indicator function). Thus, with an infinite number of observations we will be able to find the true probability distribution ''P''<sub>0</sub> in the model, and since the identifiability condition above requires that the map <math>\theta\mapsto P_\theta</math> be invertible, we will also be able to find the true value of the parameter which generated the given distribution ''P''<sub>0</sub>.
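The convergence of empirical frequencies can be illustrated numerically. The sketch below assumes, purely for illustration, independent draws from the uniform distribution on [0, 1) and the set ''A'' = [0.2, 0.5), so that Pr[''X'' ∈ ''A''] = 0.3. The function name <code>empirical_frequency</code>, the seed and the sample sizes are hypothetical choices.

```python
import random

def empirical_frequency(samples, lower, upper):
    """Fraction of samples falling in the set A = [lower, upper)."""
    hits = sum(1 for x in samples if lower <= x < upper)
    return hits / len(samples)

rng = random.Random(0)   # fixed seed so that the run is reproducible
true_p = 0.3             # Pr[X in A] for X ~ Uniform(0, 1) and A = [0.2, 0.5)

# The gap between the empirical frequency and true_p tends to shrink as T grows.
for T in (100, 10_000, 200_000):
    samples = [rng.random() for _ in range(T)]
    freq = empirical_frequency(samples, 0.2, 0.5)
    print(T, freq, abs(freq - true_p))
```

A finite simulation can only suggest the limit; the almost-sure convergence itself is a statement about the infinite sequence.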

==Examples==

===Example 1===
Let <math>\mathcal{P}</math> be the normal location-scale family:
: <math> \mathcal{P} = \Big\{\ f_\theta(x) = \tfrac{1}{\sqrt{2\pi}\sigma} e^{ -\frac{1}{2\sigma^2}(x-\mu)^2 }\ \Big|\ \theta=(\mu,\sigma): \mu\in\mathbb{R}, \,\sigma\!>0 \ \Big\}. </math>
Then
: <math> \begin{align} & f_{\theta_1}(x)=f_{\theta_2}(x) \\[6pt] \Longleftrightarrow {} & \frac 1 {\sqrt{2\pi}\sigma_1} \exp\left( -\frac 1 {2\sigma_1^2} (x-\mu_1)^2 \right) = \frac 1 {\sqrt{2\pi}\sigma_2} \exp\left( -\frac 1 {2\sigma_2^2}(x-\mu_2)^2 \right) \\[6pt] \Longleftrightarrow {} & \frac 1 {\sigma_1^2}(x-\mu_1)^2 + 2\ln \sigma_1 = \frac 1 {\sigma_2^2}(x-\mu_2)^2 + 2\ln \sigma_2 \\[6pt] \Longleftrightarrow {} & x^2 \left(\frac 1 {\sigma_1^2}-\frac 1 {\sigma_2^2}\right) - 2x\left(\frac{\mu_1}{\sigma_1^2}-\frac{\mu_2}{\sigma_2^2} \right) + \left(\frac{\mu_1^2}{\sigma_1^2}-\frac{\mu_2^2}{\sigma_2^2}+2\ln\sigma_1-2\ln\sigma_2\right) = 0 \end{align} </math>
This expression is equal to zero for almost all ''x'' only when all its coefficients are equal to zero, which is only possible when |''σ''<sub>1</sub>| = |''σ''<sub>2</sub>| and ''μ''<sub>1</sub> = ''μ''<sub>2</sub>. Since the scale parameter ''σ'' is restricted to be greater than zero, we conclude that the model is identifiable: ƒ<sub>''θ''<sub>1</sub></sub> = ƒ<sub>''θ''<sub>2</sub></sub> ⇔ ''θ''<sub>1</sub> = ''θ''<sub>2</sub>.
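The coefficient argument can be checked numerically. The following sketch computes the coefficients of the quadratic in ''x'' obtained by taking −2 times the logarithm of each density and subtracting; the function names and the particular parameter values are assumptions made for the illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x; sigma must be positive."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def quadratic_coefficients(theta_1, theta_2):
    """Coefficients (c2, c1, c0) of -2*ln f_theta1(x) + 2*ln f_theta2(x) in powers of x."""
    (mu_1, s_1), (mu_2, s_2) = theta_1, theta_2
    c2 = 1.0 / s_1**2 - 1.0 / s_2**2
    c1 = -2.0 * (mu_1 / s_1**2 - mu_2 / s_2**2)
    c0 = (mu_1**2 / s_1**2 - mu_2**2 / s_2**2
          + 2.0 * (math.log(s_1) - math.log(s_2)))
    return c2, c1, c0

same = quadratic_coefficients((0.0, 1.0), (0.0, 1.0))     # identical parameters
shifted = quadratic_coefficients((0.0, 1.0), (0.5, 1.0))  # different location
scaled = quadratic_coefficients((0.0, 1.0), (0.0, 2.0))   # different scale

print("same:   ", same)      # all coefficients vanish
print("shifted:", shifted)   # the coefficient of x is non-zero
print("scaled: ", scaled)    # the coefficient of x^2 is non-zero
```

Whenever some coefficient is non-zero the quadratic vanishes at no more than two points, so the two densities differ almost everywhere.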

===Example 2===
Let <math>\mathcal{P}</math> be the standard linear regression model:
: <math> y = \beta'x + \varepsilon, \quad \mathrm{E}[\,\varepsilon\mid x\,]=0 </math>
(where ′ denotes matrix transpose). Then the parameter ''β'' is identifiable if and only if the matrix <math> \mathrm{E}[xx'] </math> is invertible. Thus, this is the '''identification condition''' in the model.
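A small sketch can show what failure of this condition looks like. It assumes two regressors and no intercept, and replaces the expectation by an average over a fixed, hypothetical design; all function names and values are illustrative. When the second regressor is exactly twice the first, the second-moment matrix is singular and two different coefficient vectors give the same regression function.

```python
def second_moment_matrix(xs):
    """Sample analogue of E[x x'] for a list of 2-dimensional regressors."""
    n = len(xs)
    m11 = sum(x[0] * x[0] for x in xs) / n
    m12 = sum(x[0] * x[1] for x in xs) / n
    m22 = sum(x[1] * x[1] for x in xs) / n
    return [[m11, m12], [m12, m22]]

def det2(m):
    """Determinant of a 2x2 matrix; zero means the matrix is not invertible."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def predict(beta, x):
    """Regression function beta'x for 2-dimensional beta and x."""
    return beta[0] * x[0] + beta[1] * x[1]

zs = [-2.0, -1.0, 1.0, 2.0]
collinear = [(z, 2.0 * z) for z in zs]   # second regressor is twice the first
free = [(z, z * z) for z in zs]          # regressors are not collinear

beta_a = (1.0, 0.0)
beta_b = (-1.0, 1.0)   # differs from beta_a, but beta'x agrees when x2 = 2*x1

print("det, collinear design:", det2(second_moment_matrix(collinear)))
print("det, free design:     ", det2(second_moment_matrix(free)))
print("same fit on collinear:",
      all(predict(beta_a, x) == predict(beta_b, x) for x in collinear))
print("same fit on free:     ",
      all(predict(beta_a, x) == predict(beta_b, x) for x in free))
```

In the collinear design only the combination ''β''<sub>1</sub> + 2''β''<sub>2</sub> enters the regression function, so the individual coefficients cannot be recovered.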

===Example 3===
Suppose <math>\mathcal{P}</math> is the classical errors-in-variables linear model:
: <math>\begin{cases} y = \beta x^* + \varepsilon, \\ x = x^* + \eta, \end{cases}</math>
where (''ε'', ''η'', ''x*'') are jointly normal independent random variables with zero expected value and unknown variances, and only the variables (''x'', ''y'') are observed. This model is not identifiable;<ref name="riersol">{{harvnb|Reiersøl|1950}}</ref> only the product ''β''σ²<sub>∗</sub> is (where σ²<sub>∗</sub> is the variance of the latent regressor ''x*''). This is also an example of a set identifiable model: although the exact value of ''β'' cannot be learned, we can guarantee that it must lie somewhere in the interval (''β''<sub>yx</sub>, 1÷''β''<sub>xy</sub>), where ''β''<sub>yx</sub> is the coefficient in OLS regression of ''y'' on ''x'', and ''β''<sub>xy</sub> is the coefficient in OLS regression of ''x'' on ''y''.<ref>{{harvnb|Casella|Berger|2002|page=583}}</ref>
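For the jointly normal case, the observational equivalence can be made concrete with population moments. The sketch below relies on the fact that a zero-mean bivariate normal distribution is determined by its covariance matrix; the function name and the two parameter sets are hypothetical, and ''β'' is assumed positive. Both parameter sets imply the same variances and covariance of (''x'', ''y''), and both values of ''β'' fall inside the interval formed from the two OLS slopes.

```python
import math

def observed_moments(beta, var_star, var_eps, var_eta):
    """(Var(x), Cov(x, y), Var(y)) implied by y = beta*x* + eps, x = x* + eta."""
    var_x = var_star + var_eta
    cov_xy = beta * var_star
    var_y = beta**2 * var_star + var_eps
    return var_x, cov_xy, var_y

# Two different (hypothetical) parameter sets with the same product beta * var_star.
params_1 = dict(beta=1.0, var_star=2.0, var_eps=1.0, var_eta=1.0)
params_2 = dict(beta=1.25, var_star=1.6, var_eps=0.5, var_eta=1.4)

m1 = observed_moments(**params_1)
m2 = observed_moments(**params_2)
equivalent = all(math.isclose(a, b) for a, b in zip(m1, m2))

var_x, cov_xy, var_y = m1
beta_yx = cov_xy / var_x            # population OLS slope of y on x
beta_xy = cov_xy / var_y            # population OLS slope of x on y
lower, upper = beta_yx, 1.0 / beta_xy

print("same observable moments:", equivalent)
print("interval for beta:", (lower, upper))
```

The data therefore cannot distinguish between the two parameter sets, although they do confine ''β'' to the interval computed from the observable moments.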

If we abandon the normality assumption and require that ''x*'' be '''not''' normally distributed, retaining only the independence condition ''ε'' ⊥ ''η'' ⊥ ''x*'', then the model becomes identifiable.<ref name="riersol"/>

== References ==
=== Citations ===
{{Reflist}}

=== Sources ===
{{refbegin}}
* {{Citation | last1 = Casella | first1 = George | author1-link = George Casella | last2 = Berger | first2 = Roger L. | author2-link = Roger Lee Berger | title = Statistical Inference | year = 2002 | publisher = Thomson Learning | edition = 2nd | isbn = 0-534-24312-6 | lccn = 2001025794 }}
* {{Citation | last = Hsiao | first = Cheng | title = Identification | year = 1983 | series = Handbook of Econometrics, Vol. 1, Ch.4 | publisher = North-Holland Publishing Company }}
* {{Citation | last1 = Lehmann | first1 = E. L. | author1-link = Erich Leo Lehmann | last2 = Casella | first2 = G. | title = Theory of Point Estimation | edition = 2nd | year = 1998 | publisher = Springer | isbn = 0-387-98502-6 }}
* {{Citation | last = Reiersøl | first = Olav | year = 1950 | title = Identifiability of a linear relation between variables which are subject to error | journal = Econometrica | volume = 18 | issue = 4 | pages = 375–389 | jstor = 1907835 | doi = 10.2307/1907835 }}
* {{Citation | last = van der Vaart | first = A. W. | title = Asymptotic Statistics | year = 1998 | publisher = Cambridge University Press | isbn = 978-0-521-49603-2 }}
{{refend}}

==Further reading==
* {{citation | first1 = É. | last1 = Walter | author1-link = Eric Walter | first2 = L. | last2 = Pronzato | title = Identification of Parametric Models from Experimental Data | year = 1997 | publisher = Springer }}

===Econometrics===
* {{cite journal | last = Lewbel | first = Arthur | author-link = Arthur Lewbel | title = The Identification Zoo: Meanings of Identification in Econometrics | journal = Journal of Economic Literature | publisher = American Economic Association | volume = 57 | issue = 4 | date = 2019-12-01 | issn = 0022-0515 | doi = 10.1257/jel.20181361 | pages = 835–903 | s2cid = 125792293 | url = https://www.aeaweb.org/doi/10.1257/jel.20181361.ds | url-access = subscription }}
* {{Cite journal | doi = 10.1146/annurev-economics-082912-110231 | volume = 5 | issue = 1 | pages = 457–486 | last = Matzkin | first = Rosa L. | authorlink = Rosa Matzkin | title = Nonparametric Identification in Structural Economic Models | journal = Annual Review of Economics | date = 2013 }}
* {{Cite journal | doi = 10.2307/1913267 | issn = 0012-9682 | volume = 39 | issue = 3 | pages = 577–591 | last = Rothenberg | first = Thomas J. | title = Identification in Parametric Models | journal = Econometrica | date = 1971 | jstor = 1913267 }}

[[Category:Estimation theory]]