Truncated normal distribution

{{Short description|Type of probability distribution}} {{distinguish||text=rectified normal distribution, where negative elements are reset to zero, nor a censored normal distribution, where some elements are known to be outside of a specific range}} {{More footnotes needed|date=June 2010}} {{Probability distribution | name = | type = density | pdf_image = tnormPDF.png | pdf_caption = Probability density function for the truncated normal distribution for different sets of parameters. In all cases, ''a'' = −10 and ''b'' = 10. For the black: ''μ'' = −8, ''σ'' = 2; blue: ''μ'' = 0, ''σ'' = 2; red: ''μ'' = 9, ''σ'' = 10; orange: ''μ'' = 0, ''σ'' = 10. | cdf_image = tnormCDF.svg | cdf_caption = Cumulative distribution function for the truncated normal distribution for different sets of parameters. In all cases, ''a'' = −10 and ''b'' = 10. For the black: ''μ'' = −8, ''σ'' = 2; blue: ''μ'' = 0, ''σ'' = 2; red: ''μ'' = 9, ''σ'' = 10; orange: ''μ'' = 0, ''σ'' = 10. | notation = <math>\xi=\frac{x-\mu}{\sigma},\ \alpha=\frac{a-\mu}{\sigma},\ \beta=\frac{b-\mu}{\sigma}</math> <math>Z = \Phi(\beta)-\Phi(\alpha)</math> | parameters = {{nowrap|<math>\mu \in \mathbb{R}</math>}} <math>\sigma^2 \geq 0</math> (but see definition) <math>a \in \mathbb{R}</math> — minimum value of <math>x</math> <math>b \in \mathbb{R}</math> — maximum value of <math>x</math> (<math>b > a</math>) | support = <math>x \in [a, b]</math> | pdf = <math>f(x;\mu,\sigma, a,b) = \frac{\varphi(\xi)}{\sigma Z}\,</math><ref name='ist-lecture-4'>{{cite web | title=Lecture 4: Selection|url=http://web.ist.utl.pt/~ist11038/compute/qc/,truncG/lecture4k.pdf | website=web.ist.utl.pt | publisher=Instituto Superior Técnico|access-date=14 July 2015|page=1|date=November 11, 2002}}</ref> | cdf = <math>F(x;\mu,\sigma, a,b) = \frac{\Phi(\xi) - \Phi(\alpha)}{Z}</math> | mean = <math>\mu + \frac{\varphi(\alpha)-\varphi(\beta)}{Z}\sigma</math> | mode = <math>\left\{\begin{array}{ll}a, & \mathrm{if}\ \mu<a \\ \mu, & \mathrm{if}\ 
a\le\mu\le b\\ b, & \mathrm{if}\ \mu>b\end{array}\right.</math> | variance = <math>\sigma^2\left[1-\frac{\beta\varphi(\beta)-\alpha\varphi(\alpha)}{Z} -\left(\frac{\varphi(\alpha)-\varphi(\beta)}{Z}\right)^2\right]</math> | median = <math>\mu + \Phi^{-1}\left(\frac{\Phi(\alpha)+\Phi(\beta)}{2}\right) \sigma</math> | skewness = | kurtosis = | entropy = <math>\ln(\sqrt{2 \pi e} \sigma Z) + \frac{\alpha\varphi(\alpha)-\beta\varphi(\beta)}{2Z}</math> | mgf = <math>e^{\mu t + \sigma^2 t^2 / 2} \left[ \frac{ \Phi(\beta- \sigma t) - \Phi(\alpha - \sigma t) }{\Phi(\beta) - \Phi(\alpha) } \right] </math> | char = }}

In probability and statistics, the '''truncated normal distribution''' is the probability distribution derived from that of a normally distributed random variable by bounding the random variable from either below or above (or both). The truncated normal distribution has wide applications in statistics and econometrics.

==Definitions==

Suppose <math> X </math> has a normal distribution with mean <math>\mu</math> and variance <math>\sigma^2</math> and lies within the interval <math>(a,b), \text{with} \; -\infty \leq a < b \leq \infty</math>. Then <math>X</math> conditional on <math> a < X < b </math> has a truncated normal distribution.

Its probability density function, <math>f</math>, for <math> a \leq x \leq b </math>, is given by

<math display="block"> f(x;\mu,\sigma,a,b) = \frac{1}{\sigma}\,\frac{\varphi(\frac{x - \mu}{\sigma})}{\Phi(\frac{b - \mu}{\sigma}) - \Phi(\frac{a - \mu}{\sigma}) }</math>

and by <math>f=0</math> otherwise.

Here, <math display="block">\varphi(\xi)=\frac{1}{\sqrt{2 \pi}}\exp\left(-\frac{1}{2}\xi^2\right)</math> is the probability density function of the standard normal distribution and <math>\Phi(\cdot)</math> is its cumulative distribution function <math display="block">\Phi(x) = \frac{1}{2} \left( 1+\operatorname{erf}(x/\sqrt{2}) \right).</math> By definition, if <math>b=\infty</math>, then <math>\Phi\left(\tfrac{b - \mu}{\sigma}\right) =1</math>, and similarly, if <math>a = -\infty</math>, then <math>\Phi\left(\tfrac{a - \mu}{\sigma}\right) = 0</math>.
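The density and distribution function above can be sketched in a few lines of Python using only the standard library. The function names (<code>std_normal_pdf</code>, <code>std_normal_cdf</code>, <code>truncnorm_pdf</code>, <code>truncnorm_cdf</code>) are illustrative choices, not taken from any library, and the sketch assumes a real, positive <math>\sigma</math>.

```python
import math


def std_normal_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal CDF, Phi(z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncnorm_pdf(x, mu, sigma, a, b):
    """Density of N(mu, sigma^2) truncated to [a, b]; zero outside [a, b]."""
    if x < a or x > b:
        return 0.0
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    # Normalizing constant Z = Phi(beta) - Phi(alpha); infinite bounds give 1 or 0.
    z_norm = std_normal_cdf(beta) - std_normal_cdf(alpha)
    return std_normal_pdf((x - mu) / sigma) / (sigma * z_norm)


def truncnorm_cdf(x, mu, sigma, a, b):
    """CDF of N(mu, sigma^2) truncated to [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    z_norm = std_normal_cdf(beta) - std_normal_cdf(alpha)
    return (std_normal_cdf((x - mu) / sigma) - std_normal_cdf(alpha)) / z_norm
```

Passing <code>-math.inf</code> or <code>math.inf</code> as a bound reproduces the one-sided and untruncated cases, since <code>math.erf</code> returns ∓1 at ∓∞.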

The above formulae show that when <math>-\infty<a<b<+\infty</math> the scale parameter <math>\sigma^2</math> of the truncated normal distribution is allowed to assume negative values. The parameter <math>\sigma</math> is in this case imaginary, but the function <math>f</math> is nevertheless real, positive, and normalizable. The scale parameter <math>\sigma^2</math> of the untruncated normal distribution must be positive because the distribution would not be normalizable otherwise. The doubly truncated normal distribution, on the other hand, can in principle have a negative scale parameter (which is different from the variance, see summary formulae), because no such integrability problems arise on a bounded domain. In this case the distribution cannot be interpreted as an untruncated normal conditional on <math> a < X < b </math>, of course, but can still be interpreted as a maximum-entropy distribution with first and second moments as constraints, and has an additional peculiar feature: it presents ''two'' local maxima instead of one, located at <math>x=a</math> and <math>x=b</math>.

==Properties==

The truncated normal is one of two possible maximum entropy probability distributions for a fixed mean and variance constrained to the interval [a,b], the other being the truncated ''U''.<ref>{{Cite journal |last1=Dowson |first1=D. |last2=Wragg |first2=A. |date=September 1973 |title=Maximum-entropy distributions having prescribed first and second moments (Corresp.) |journal=IEEE Transactions on Information Theory |volume=19 |issue=5 |pages=689–693 |doi=10.1109/TIT.1973.1055060 |issn=1557-9654}}</ref> Truncated normals with fixed support form an exponential family. Nielsen<ref>{{cite journal | author1 = Frank Nielsen | title = Statistical Divergences between Densities of Truncated Exponential Families with Nested Supports: Duo Bregman and Duo Jensen Divergences | journal = Entropy | publisher= MDPI | year=2022 | doi=10.3390/e24030421 | volume=24 | number = 3| page=421 | pmid=35327931 | pmc=8947456 | bibcode = 2022Entrp..24..421N | doi-access=free }}</ref> reported closed-form formulas for the Kullback-Leibler divergence and the Bhattacharyya distance between two truncated normal distributions when the support of the first distribution is nested in the support of the second.

===Moments===

If the random variable has been truncated only from below, some probability mass has been shifted to higher values, giving a first-order stochastically dominating distribution and hence increasing the mean to a value higher than the mean <math>\mu</math> of the original normal distribution. Likewise, if the random variable has been truncated only from above, the truncated distribution has a mean less than <math>\mu.</math>

Regardless of whether the random variable is bounded above, below, or both, the truncation is a mean-preserving contraction combined with a mean-changing rigid shift, and hence the variance of the truncated distribution is less than the variance <math>\sigma^2</math> of the original normal distribution.

==== Two sided truncation ====
Source:<ref>{{Cite book | last1 = Johnson | first1 = Norman Lloyd | title = Continuous Univariate Distributions | volume = 1 | date = 1994 | publisher = Wiley | first2 = Samuel | last2 = Kotz | first3 = N. | last3 = Balakrishnan | isbn = 0-471-58495-9 | edition = 2nd | location = New York | oclc = 29428092| at = Section 10.1}}</ref>

Let <math>\alpha = (a-\mu)/\sigma</math> and <math>\beta = (b-\mu)/\sigma </math>. Then: <math display="block"> \operatorname{E}(X \mid a<X<b) = \mu - \sigma\frac{\varphi(\beta) - \varphi(\alpha)}{\Phi(\beta)-\Phi(\alpha)} </math> and <math display="block"> \operatorname{Var}(X \mid a<X<b) = \sigma^2\left[ 1 - \frac{\beta\varphi(\beta) - \alpha\varphi(\alpha)}{\Phi(\beta)-\Phi(\alpha)} -\left(\frac{\varphi(\beta) - \varphi(\alpha)}{\Phi(\beta)-\Phi(\alpha)}\right)^2\right]</math>

Care must be taken in the numerical evaluation of these formulas, which can result in catastrophic cancellation when the interval <math>[a,b]</math> does not include <math>\mu</math>. There are better ways to rewrite them that avoid this issue.<ref name=":0">{{Citation|last=Fernandez-de-Cossio-Diaz|first=Jorge|title=TruncatedNormal.jl: Compute mean and variance of the univariate truncated normal distribution (works far from the peak)|date=2017-12-06|url=https://github.com/cossio/TruncatedNormal.jl|access-date=2017-12-06}}</ref>

==== One sided truncation (of lower tail) ====
Sources:<ref>{{cite book |last=Greene |first= William H. |title= Econometric Analysis | edition = 5th |publisher= Prentice Hall |year= 2003 |isbn= 978-0-13-066189-0 }}</ref><ref>{{Cite journal |last1=del Castillo |first1=Joan |date=March 1994 |title=The singly truncated normal distribution: A non-steep exponential family |url=https://www.ism.ac.jp/editsec/aism/pdf/046_1_0057.pdf|journal= Annals of the Institute of Statistical Mathematics |volume=46 |issue=1 |pages=57–66 |doi=10.1007/BF00773592 }}</ref>

In this case <math>\; b=\infty, \; \varphi(\beta)=0, \; \Phi(\beta)=1,</math> so that

<math display="block"> \operatorname{E}(X \mid X>a) = \mu +\sigma \varphi(\alpha)/Z ,\!</math>

and

<math display="block"> \operatorname{Var}(X \mid X>a) = \sigma^2[1+ \alpha \varphi(\alpha)/Z- (\varphi(\alpha)/Z)^2 ],</math>

where <math> Z=1-\Phi(\alpha). </math>
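The lower-tail formulas can be sketched in Python as follows. The function name <code>truncnorm_lower_mean_var</code> is hypothetical. As a design choice of this sketch, not something stated in the sources above, <math>Z = 1-\Phi(\alpha)</math> is computed with the complementary error function so that it keeps precision when <math>\alpha</math> is large. The ratio <math>\varphi(\alpha)/Z</math> is known in econometrics as the inverse Mills ratio.

```python
import math


def truncnorm_lower_mean_var(mu, sigma, a):
    """Mean and variance of N(mu, sigma^2) conditioned on X > a."""
    alpha = (a - mu) / sigma
    phi_alpha = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)
    # Z = 1 - Phi(alpha), computed as erfc(alpha / sqrt(2)) / 2 to avoid
    # subtracting two nearly equal numbers when alpha is large.
    z_norm = 0.5 * math.erfc(alpha / math.sqrt(2.0))
    lam = phi_alpha / z_norm  # inverse Mills ratio phi(alpha) / Z
    mean = mu + sigma * lam
    var = sigma ** 2 * (1.0 + alpha * lam - lam * lam)
    return mean, var
```

For the standard normal truncated at <math>a=0</math> (the half-normal distribution), the formulas reduce to a mean of <math>\sqrt{2/\pi}</math> and a variance of <math>1-2/\pi</math>, which gives a quick check of the sketch.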

==== One sided truncation (of upper tail) ====
In this case <math>\; a=\alpha=-\infty, \; \varphi(\alpha)=0, \; \Phi(\alpha) = 0,</math> then

<math display="block"> \operatorname{E}(X \mid X<b) = \mu -\sigma\frac{\varphi(\beta)}{\Phi(\beta)} ,</math> <math display="block"> \operatorname{Var}(X \mid X<b) = \sigma^2\left[1-\beta \frac{\varphi(\beta)}{\Phi(\beta)}- \left(\frac{\varphi(\beta)}{\Phi(\beta)} \right)^2\right].</math>

{{harvtxt|Barr|Sherrill|1999}} give a simpler expression for the variance of one sided truncations. Their formula is in terms of the chi-square CDF, which is implemented in standard software libraries. {{harvtxt|Bebu|Mathew|2009}} provide formulas for (generalized) confidence intervals around the truncated moments.

===== A recursive formula =====

As for the non-truncated case, there is a recursive formula for the truncated moments.<ref>Document by Eric Orjebin, "https://people.smp.uq.edu.au/YoniNazarathy/teaching_projects/studentWork/EricOrjebin_TruncatedNormalMoments.pdf"</ref>

In particular, for <math>n\geq0</math>, we have

<math display="block"> \operatorname{E}\left[ \left(\frac{x-\mu}{\sigma}\right)^{n+2}\right]=\frac{\alpha^{n+1}\varphi(\alpha)-\beta^{n+1}\varphi(\beta)}{\Phi(\beta)-\Phi(\alpha)}+(n+1)\operatorname{E}\left[ \left(\frac{x-\mu}{\sigma}\right)^{n}\right]. </math>

====== Proof ======
By the change of variables <math>\xi=(x-\mu)/\sigma</math>, one obtains <math display="block"> \operatorname{E}\left[ \left(\frac{x-\mu}{\sigma}\right)^{n+2}\right] =\int_{\alpha}^{\beta}\frac{\xi^{n+2}\varphi(\xi)}{\Phi(\beta)-\Phi(\alpha)}d\xi. </math> Using <math>\varphi'(\xi) = -\xi\varphi(\xi), </math> integration by parts yields <math display="block"> \operatorname{E}\left[ \left(\frac{x-\mu}{\sigma}\right)^{n+2}\right] =\left[\frac{-\xi^{n+1}\varphi(\xi)}{\Phi(\beta)-\Phi(\alpha)}\right]_{\alpha}^{\beta}+(n+1)\int_\alpha^\beta\frac{\xi^n\varphi(\xi)}{\Phi(\beta)-\Phi(\alpha)}d\xi, </math> which gives the equation to be proven.

===== Multivariate =====

Computing the moments of a multivariate truncated normal is harder.

==Generating values from the truncated normal distribution==
{{further|Pseudo-random number sampling}}
{{External links|section|date=May 2022}}

A random variate <math>x</math> defined as <math> x = \Phi^{-1}( \Phi(\alpha) + U\cdot(\Phi(\beta)-\Phi(\alpha)))\sigma + \mu </math>, with <math>\Phi</math> the cumulative distribution function of the standard normal distribution, <math>\Phi^{-1}</math> its inverse, and <math>U</math> a uniform random number on <math>(0, 1)</math>, follows the normal distribution with mean <math>\mu</math> and variance <math>\sigma^2</math> truncated to the range <math>(a, b)</math>. This is simply the inverse transform method for simulating random variables. Although it is one of the simplest methods, it can either fail when sampling in the tail of the normal distribution,<ref>{{cite book|last1=Kroese|first1=D. P.|author-link1=Dirk Kroese| last2=Taimre|first2=T.|last3=Botev|first3=Z. I.|title=Handbook of Monte Carlo methods|year=2011 | publisher=John Wiley & Sons}}</ref> or be much too slow.<ref name="boLec17">{{cite conference |title=Simulation from the Normal Distribution Truncated to an Interval in the Tail |last1=Botev |first1=Z. I. |last2=L'Ecuyer |first2=P. |date=2017 | publisher=ACM | isbn=978-1-63190-141-6 |book-title=10th EAI International Conference on Performance Evaluation Methodologies and Tools |pages=23–29 |location= 25th–28th Oct 2016 Taormina, Italy |doi= 10.4108/eai.25-10-2016.2266879 }} </ref> Thus, in practice, one has to find alternative methods of simulation.
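A minimal Python sketch of the inverse transform method is given below, using <code>statistics.NormalDist</code> from the standard library for the standard normal <math>\Phi</math> and <math>\Phi^{-1}</math>. The function name and the <code>rng</code> parameter are illustrative assumptions. The final clamp to <math>[a, b]</math> is an addition of this sketch to absorb rounding error and is not part of the method as described.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal: mean 0, standard deviation 1


def sample_truncnorm_inverse(mu, sigma, a, b, rng=random):
    """Draw one value from N(mu, sigma^2) truncated to (a, b) by inversion.

    Works when (a, b) is not far in a tail. Deep in a tail Phi(alpha) and
    Phi(beta) round to the same float (or to 0 or 1), and inv_cdf then
    raises StatisticsError or returns badly quantized values.
    """
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    p_lo = _STD.cdf(alpha)
    p_hi = _STD.cdf(beta)
    u = rng.random()
    # Map U uniformly onto the probability interval (Phi(alpha), Phi(beta)).
    p = p_lo + u * (p_hi - p_lo)
    x = mu + sigma * _STD.inv_cdf(p)
    # Clamp to guard against rounding slightly outside the interval.
    return min(max(x, a), b)
```

Passing a seeded <code>random.Random</code> instance as <code>rng</code> makes the draws reproducible.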

One such truncated normal generator (implemented in [http://www.mathworks.com/matlabcentral/fileexchange/53180-truncated-normal-generator Matlab] and in R as [https://cran.r-project.org/web/packages/TruncatedNormal trandn.R]) is based on an acceptance-rejection idea due to Marsaglia.<ref>{{cite journal|last1=Marsaglia|first1=George|title=Generating a variable from the tail of the normal distribution| journal=Technometrics | date=1964 | volume=6 | issue=1 | pages=101–102 | doi=10.2307/1266749 |jstor=1266749}}</ref> Despite the slightly suboptimal acceptance rate of {{harvtxt|Marsaglia|1964}} in comparison with {{harvtxt|Robert|1995}}, Marsaglia's method is typically faster,<ref name="boLec17"/> because it does not require the costly numerical evaluation of the exponential function.
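The acceptance-rejection idea can be illustrated with the commonly cited form of the tail algorithm attributed to {{harvtxt|Marsaglia|1964}}: propose from a Rayleigh distribution restricted to <math>x>a</math> and accept with probability <math>a/x</math>. The Python sketch below is an illustration under that assumption, with a hypothetical function name. It is not the code of the Matlab or R implementations mentioned above, and it covers only the standard normal conditioned on <math>X>a</math> with <math>a>0</math>.

```python
import math
import random


def sample_std_normal_tail(a, rng=random):
    """Draw from the standard normal conditioned on X > a, for a > 0.

    Proposal: x = sqrt(a^2 - 2 ln U1), which has density proportional to
    x * exp(-x^2 / 2) on x > a. Accepting with probability a / x leaves a
    density proportional to exp(-x^2 / 2) on x > a, the target.
    """
    if a <= 0.0:
        raise ValueError("this sketch assumes a > 0")
    while True:
        u1 = 1.0 - rng.random()  # in (0, 1], so log(u1) is finite
        x = math.sqrt(a * a - 2.0 * math.log(u1))
        if rng.random() * x <= a:
            return x
```

A draw from <math>N(\mu,\sigma^2)</math> conditioned on <math>X>a</math> with <math>\alpha=(a-\mu)/\sigma>0</math> is then <code>mu + sigma * sample_std_normal_tail(alpha)</code>. The acceptance probability approaches 1 as the truncation point moves further into the tail, which is the regime where inversion breaks down.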

For more on simulating a draw from the truncated normal distribution, see {{harvtxt|Robert|1995}}, {{harvtxt|Lynch|2007|at= Section 8.1.3 (pages 200–206)}}, and {{harvtxt|Devroye|1986}}. The [https://cran.r-project.org/web/packages/msm/index.html MSM] package in R has a function, [https://web.archive.org/web/20120208134826/http://rss.acs.unt.edu/Rdoc/library/msm/html/tnorm.html rtnorm], that generates draws from a truncated normal. The [https://cran.r-project.org/web/packages/truncnorm/ truncnorm] package in R also has functions to draw from a truncated normal.

{{harvtxt|Chopin|2011}} proposed ([https://arxiv.org/abs/1201.6140 arXiv]) an algorithm inspired by the Ziggurat algorithm of Marsaglia and Tsang (1984, 2000), which is usually considered the fastest Gaussian sampler, and which is also very close to Ahrens's algorithm (1995). Implementations can be found in [http://www.crest.fr/ckfinder/userfiles/files/Pageperso/chopin/truncnorm_20120618.tgz C], [http://miv.u-strasbg.fr/mazet/rtnorm/rtnormCpp.zip C++], [http://miv.u-strasbg.fr/mazet/rtnorm/rtnormM.zip Matlab] and [http://www.christophlassner.de/blog/2013/08/12/Generation-of-Truncated-Gaussian-Samples/ Python].

Sampling from the ''multivariate'' truncated normal distribution is considerably more difficult.<ref name="bo16">{{cite journal | last1=Botev|first1=Z. I.|title=The normal law under linear restrictions: simulation and estimation via minimax tilting | journal=Journal of the Royal Statistical Society, Series B| volume=79| pages=125–148| date=2016| doi=10.1111/rssb.12162 | arxiv=1603.04166|s2cid=88515228}}</ref> Exact or perfect simulation is only feasible in the case of truncation of the normal distribution to a polytope region.<ref name="bo16"/><ref>{{cite book |last1=Botev |first1=Zdravko |last2=L'Ecuyer |first2=Pierre |editor-last=Puliafito |editor-first=Antonio|title=Systems Modeling: Methodologies and Tools. EAI/Springer Innovations in Communication and Computing.|publisher=Springer, Cham| date=2018| pages=115–132 | chapter=Chapter 8: Simulation from the Tail of the Univariate and Multivariate Normal Distribution | isbn=978-3-319-92377-2 |name-list-style=amp|doi=10.1007/978-3-319-92378-9_8 |s2cid=125554530 }}</ref> In more general cases, {{harvtxt|Damien|Walker|2001}} introduce a general methodology for sampling truncated densities within a Gibbs sampling framework. Their algorithm introduces one latent variable and, in that setting, is computationally more efficient than the algorithm of {{harvtxt|Robert|1995}}.

==Notes==
{{reflist}}

==References==
* {{cite book |last1=Botev |first1=Zdravko |last2=L'Ecuyer |first2=Pierre |editor-last=Puliafito |editor-first=Antonio| title=Systems Modeling: Methodologies and Tools | series = EAI/Springer Innovations in Communication and Computing | publisher=Springer, Cham|date=2018|pages=115–132 |chapter=Chapter 8: Simulation from the Tail of the Univariate and Multivariate Normal Distribution|isbn=978-3-319-92377-2 |name-list-style=amp|doi=10.1007/978-3-319-92378-9_8 |s2cid=125554530 }}
* {{cite book |first=Luc |last=Devroye |url=http://www.eirene.de/Devroye.pdf |title=Non-Uniform Random Variate Generation |publisher=Springer-Verlag |place=New York |year=1986 |access-date=2012-04-12 |archive-date=2014-08-18 |archive-url=https://web.archive.org/web/20140818200854/http://www.eirene.de/Devroye.pdf |url-status=dead }}
* {{cite book |last=Greene |first= William H. |title= Econometric Analysis (5th ed.)|publisher= Prentice Hall |year= 2003 |isbn= 978-0-13-066189-0 }}
* Norman L. Johnson and Samuel Kotz (1970). ''Continuous univariate distributions-1'', chapter 13. John Wiley & Sons.
* {{cite book|last=Lynch|first=Scott|title=Introduction to Applied Bayesian Statistics and Estimation for Social Scientists| year=2007|publisher=Springer|location=New York|isbn=978-1-4419-2434-6|url=https://www.springer.com/social+sciences/book/978-0-387-71264-2}}
* {{cite journal|last=Robert|first=Christian P.|title=Simulation of truncated normal variables|journal=Statistics and Computing | year=1995|volume=5|issue=2|pages=121–125|doi=10.1007/BF00143942|arxiv=0907.4010|s2cid=15943491}}
* {{cite journal|last1=Barr|first1=Donald R.|last2=Sherrill|first2=E.Todd|title=Mean and variance of truncated normal distributions| journal=The American Statistician | year=1999 | volume=53|issue=4| pages=357–361| doi=10.1080/00031305.1999.10474490}}
* {{cite journal|last1=Bebu|first1=Ionut|last2=Mathew|first2=Thomas|title=Confidence intervals for limited moments and truncated moments in normal and lognormal models|journal=Statistics and Probability Letters| year=2009| volume=79| issue=3| pages=375–380|doi=10.1016/j.spl.2008.09.006}}
* {{cite journal|last1=Damien|first1=Paul|last2=Walker|first2=Stephen G.|title=Sampling truncated normal, beta, and gamma densities|journal=Journal of Computational and Graphical Statistics| year=2001| volume=10| issue=2| pages=206–215| doi=10.1198/10618600152627906|s2cid=123156320}}
* {{Cite journal |last=Chopin |first=Nicolas |date=2011-04-01 |title=Fast simulation of truncated Gaussian distributions |url=https://link.springer.com/article/10.1007/s11222-009-9168-1 |journal=Statistics and Computing |language=en |volume=21 |issue=2 |pages=275–288 |doi=10.1007/s11222-009-9168-1 |issn=1573-1375|arxiv=1201.6140 }}
* {{cite web|last1=Burkardt|first1=John|title=The Truncated Normal Distribution|url=https://people.sc.fsu.edu/~jburkardt/presentations/truncated_normal.pdf|website=Department of Scientific Computing website|publisher=Florida State University|access-date=15 February 2018}}
