# Multivariate t-distribution

> Mediated Wiki article. Canonical URL: https://mediated.wiki/source/Multivariate_t-distribution
> Markdown URL: https://mediated.wiki/source/Multivariate_t-distribution.md
> Source: https://en.wikipedia.org/wiki/Multivariate_t-distribution
> Source revision: 1336176555
> License: Creative Commons Attribution-ShareAlike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/)

Multivariable generalization of the Student's t-distribution

| Multivariate *t* | |
| --- | --- |
| Notation | $t_{p}(\boldsymbol{\mu},\boldsymbol{\Sigma},\nu)$ |
| Parameters | $\boldsymbol{\mu}=[\mu_{1},\dots,\mu_{p}]^{\mathsf{T}}$ location (real $p\times 1$ vector); $\boldsymbol{\Sigma}$ scale matrix (positive-definite real $p\times p$ matrix); $\nu>0$ (real) degrees of freedom |
| Support | $\mathbf{x}\in\mathbb{R}^{p}$ |
| PDF | $\dfrac{\Gamma\left[(\nu+p)/2\right]}{\Gamma(\nu/2)\,\nu^{p/2}\pi^{p/2}\left\lvert\boldsymbol{\Sigma}\right\rvert^{1/2}}\left[1+\dfrac{1}{\nu}(\mathbf{x}-\boldsymbol{\mu})^{\mathsf{T}}\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})\right]^{-(\nu+p)/2}$ |
| CDF | No analytic expression, but see text for approximations |
| Mean | $\boldsymbol{\mu}$ if $\nu>1$; else undefined |
| Median | $\boldsymbol{\mu}$ |
| Mode | $\boldsymbol{\mu}$ |
| Variance | $\dfrac{\nu}{\nu-2}\boldsymbol{\Sigma}$ (covariance matrix) if $\nu>2$; else undefined |
| Skewness | 0 if $\nu>3$; else undefined |

In [statistics](/source/Statistics), the **multivariate *t*-distribution** (or **multivariate Student distribution**) is a [multivariate probability distribution](/source/Multivariate_probability_distribution). It is a generalization to [random vectors](/source/Random_vector) of the [Student's *t*-distribution](/source/Student's_t-distribution), which is a distribution applicable to univariate [random variables](/source/Random_variable). While the case of a [random matrix](/source/Random_matrix) could be treated within this structure, the [matrix *t*-distribution](/source/Matrix_t-distribution) is distinct and makes particular use of the matrix structure.

## Definition

One common method of construction of a multivariate *t*-distribution, for the case of $p$ dimensions, is based on the observation that if $\mathbf{y}$ and $u$ are independent and distributed as $N(\mathbf{0},\boldsymbol{\Sigma})$ and $\chi_{\nu}^{2}$ (i.e. [multivariate normal](/source/Multivariate_normal_distribution) and [chi-squared distributions](/source/Chi-squared_distribution)) respectively, where $\boldsymbol{\Sigma}$ is a $p\times p$ matrix and $\boldsymbol{\mu}$ is a constant vector, then the random variable $\mathbf{x}=\mathbf{y}/\sqrt{u/\nu}+\boldsymbol{\mu}$ has the density[1]

$$\frac{\Gamma\left[(\nu+p)/2\right]}{\Gamma(\nu/2)\,\nu^{p/2}\pi^{p/2}\left|\boldsymbol{\Sigma}\right|^{1/2}}\left[1+\frac{1}{\nu}\left(\mathbf{x}-\boldsymbol{\mu}\right)^{\mathsf{T}}\boldsymbol{\Sigma}^{-1}\left(\mathbf{x}-\boldsymbol{\mu}\right)\right]^{-(\nu+p)/2}$$

and is said to be distributed as a multivariate *t*-distribution with parameters $\boldsymbol{\Sigma},\boldsymbol{\mu},\nu$. Note that $\boldsymbol{\Sigma}$ is not the covariance matrix, since the covariance is given by $\frac{\nu}{\nu-2}\boldsymbol{\Sigma}$ (for $\nu>2$).
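The density can be evaluated directly from the formula. The following is a minimal sketch, assuming NumPy and SciPy are available; the function name `mvt_logpdf` and the example parameter values are illustrative and not part of the article. It works on the log scale for numerical stability and compares the result with SciPy's `multivariate_t`, whose `shape` argument plays the role of the scale matrix $\boldsymbol{\Sigma}$.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import multivariate_t


def mvt_logpdf(x, mu, Sigma, nu):
    """Log-density of t_p(mu, Sigma, nu) at a single point x."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    p = mu.shape[0]
    diff = x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), computed with a linear solve.
    quad = diff @ np.linalg.solve(Sigma, diff)
    _, logdet = np.linalg.slogdet(Sigma)
    log_norm = (gammaln((nu + p) / 2) - gammaln(nu / 2)
                - 0.5 * p * np.log(nu * np.pi) - 0.5 * logdet)
    return log_norm - 0.5 * (nu + p) * np.log1p(quad / nu)


# Illustrative values only.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.3], [0.3, 0.5]])
nu = 5.0
x = np.array([0.5, -1.0])

ours = mvt_logpdf(x, mu, Sigma, nu)
ref = multivariate_t.logpdf(x, loc=mu, shape=Sigma, df=nu)
```

The two values should agree to floating-point precision.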

The constructive definition of a multivariate *t*-distribution simultaneously serves as a sampling algorithm:

1. Generate $u\sim\chi_{\nu}^{2}$ and $\mathbf{y}\sim N(\mathbf{0},\boldsymbol{\Sigma})$, independently.

1. Compute $\mathbf{x}\gets\mathbf{y}\sqrt{\nu/u}+\boldsymbol{\mu}$.
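The two steps above can be sketched in a few lines of NumPy. This is an illustrative implementation, not a reference one: the function name `sample_mvt`, the seed, and the parameter values are assumptions made for the example. The empirical covariance is compared with $\frac{\nu}{\nu-2}\boldsymbol{\Sigma}$, which requires $\nu>2$.

```python
import numpy as np


def sample_mvt(mu, Sigma, nu, size, rng):
    """Draw `size` samples from t_p(mu, Sigma, nu) via the constructive definition."""
    mu = np.asarray(mu, dtype=float)
    p = mu.shape[0]
    # Step 1: independent chi-squared and zero-mean normal draws.
    u = rng.chisquare(nu, size=size)
    y = rng.multivariate_normal(np.zeros(p), Sigma, size=size)
    # Step 2: rescale each normal draw by sqrt(nu / u) and shift by mu.
    return y * np.sqrt(nu / u)[:, None] + mu


rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.3], [0.3, 0.5]])
nu = 10.0

samples = sample_mvt(mu, Sigma, nu, 200_000, rng)
sample_mean = samples.mean(axis=0)
sample_cov = np.cov(samples, rowvar=False)
expected_cov = nu / (nu - 2) * Sigma
```

For small $\nu$ the sample covariance converges slowly (or not at all when $\nu\le 2$), so a moderately large $\nu$ is used here.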

This formulation gives rise to the hierarchical representation of a multivariate *t*-distribution as a scale-mixture of normals: $u\sim\mathrm{Ga}(\nu/2,\nu/2)$, where $\mathrm{Ga}(a,b)$ indicates a gamma distribution with density proportional to $x^{a-1}e^{-bx}$, and $\mathbf{x}\mid u$ conditionally follows $N(\boldsymbol{\mu},u^{-1}\boldsymbol{\Sigma})$. Here $u$ denotes the mixing variable, which corresponds to the ratio $u/\nu$ of the sampling algorithm, since a $\chi_{\nu}^{2}$ variable divided by $\nu$ is $\mathrm{Ga}(\nu/2,\nu/2)$.
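The hierarchical form can be simulated directly. The sketch below is illustrative (seed and parameter values are assumptions); note that NumPy's gamma sampler is parameterised by shape and *scale*, so the rate $\nu/2$ becomes a scale of $2/\nu$.

```python
import numpy as np

rng = np.random.default_rng(1)
nu, n = 10.0, 200_000
mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, -0.4], [-0.4, 2.0]])

# Mixing variable u ~ Ga(nu/2, rate nu/2); NumPy uses scale = 1 / rate.
u = rng.gamma(shape=nu / 2, scale=2 / nu, size=n)

# Conditionally on u, x ~ N(mu, Sigma / u): scale a N(0, Sigma) draw by u^{-1/2}.
L = np.linalg.cholesky(Sigma)
z = rng.standard_normal((n, 2)) @ L.T
x = mu + z / np.sqrt(u)[:, None]

mixing_mean = u.mean()              # E[u] = 1 for Ga(nu/2, nu/2)
cov_hat = np.cov(x, rowvar=False)   # should approach nu / (nu - 2) * Sigma
```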

In the special case $\nu=1$, the distribution is a [multivariate Cauchy distribution](/source/Cauchy_distribution#Multivariate_Cauchy_distribution).

## Derivation

There are in fact many candidates for the multivariate generalization of [Student's *t*-distribution](/source/Student's_t-distribution). An extensive survey of the field has been given by Kotz and Nadarajah (2004). The essential issue is to define a probability density function of several variables that is the appropriate generalization of the formula for the univariate case. In one dimension ($p=1$), with $t=x-\mu$ and $\Sigma=1$, we have the [probability density function](/source/Probability_density_function)

$$f(t)=\frac{\Gamma[(\nu+1)/2]}{\sqrt{\nu\pi}\,\Gamma[\nu/2]}\left(1+t^{2}/\nu\right)^{-(\nu+1)/2}$$

and one approach is to use a corresponding function of several variables. This is the basic idea of [elliptical distribution](/source/Elliptical_distribution) theory, where one writes down a corresponding function of $p$ variables $t_{i}$ that replaces $t^{2}$ by a quadratic function of all the $t_{i}$. It is clear that this only makes sense when all the marginal distributions have the same [degrees of freedom](/source/Degrees_of_freedom_(statistics)) $\nu$. With $\mathbf{A}=\boldsymbol{\Sigma}^{-1}$, one has a simple choice of multivariate density function

$$f(\mathbf{t})=\frac{\Gamma((\nu+p)/2)\left|\mathbf{A}\right|^{1/2}}{\sqrt{\nu^{p}\pi^{p}}\,\Gamma(\nu/2)}\left(1+\sum_{i,j=1}^{p,p}A_{ij}t_{i}t_{j}/\nu\right)^{-(\nu+p)/2}$$

which is the standard but not the only choice.

An important special case is the standard **bivariate *t*-distribution**, *p* = 2:

$$f(t_{1},t_{2})=\frac{\left|\mathbf{A}\right|^{1/2}}{2\pi}\left(1+\sum_{i,j=1}^{2,2}A_{ij}t_{i}t_{j}/\nu\right)^{-(\nu+2)/2}$$

Note that $\frac{\Gamma\left(\frac{\nu+2}{2}\right)}{\pi\nu\,\Gamma\left(\frac{\nu}{2}\right)}=\frac{1}{2\pi}$, which follows from $\Gamma\left(\frac{\nu}{2}+1\right)=\frac{\nu}{2}\,\Gamma\left(\frac{\nu}{2}\right)$.

Now, if $\mathbf{A}$ is the identity matrix, the density is

$$f(t_{1},t_{2})=\frac{1}{2\pi}\left(1+(t_{1}^{2}+t_{2}^{2})/\nu\right)^{-(\nu+2)/2}.$$

The difficulty with the standard representation is revealed by this formula, which does not factorize into the product of the marginal one-dimensional distributions. When $\Sigma$ is diagonal the standard representation can be shown to have zero [correlation](/source/Pearson_product-moment_correlation_coefficient) but the [marginal distributions](/source/Marginal_distribution) are not [statistically independent](/source/Statistical_independence).
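The failure to factorize can be checked numerically. The following sketch, assuming SciPy is available and using an arbitrary evaluation point and $\nu=3$ chosen for illustration, compares the standard bivariate density (identity $\mathbf{A}$) with the product of two univariate Student's *t* densities.

```python
import numpy as np
from scipy.stats import multivariate_t, t as student_t

nu = 3.0
point = np.array([2.0, 2.0])  # illustrative evaluation point

# Joint density of the standard bivariate t (A = I, zero location).
joint = multivariate_t.pdf(point, loc=np.zeros(2), shape=np.eye(2), df=nu)

# Each marginal is a univariate Student's t with the same nu.
product = student_t.pdf(point[0], df=nu) * student_t.pdf(point[1], df=nu)

# Closed form: (1 / (2 pi)) * (1 + (t1^2 + t2^2) / nu)^(-(nu + 2) / 2).
closed_form = (1 + point @ point / nu) ** (-(nu + 2) / 2) / (2 * np.pi)
```

`joint` matches `closed_form`, while `product` differs from both, so the components are uncorrelated but not independent.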

A notable spontaneous occurrence of the elliptical multivariate distribution is its formal mathematical appearance when least squares methods are applied to multivariate normal data such as the classical Markowitz minimum variance econometric solution for asset portfolios.[2]

## Cumulative distribution function

The definition of the [cumulative distribution function](/source/Cumulative_distribution_function) (cdf) in one dimension can be extended to multiple dimensions by defining the following probability (here $\mathbf{x}$ is a real vector):

$$F(\mathbf{x})=\mathbb{P}(\mathbf{X}\leq\mathbf{x}),\quad\text{where}\;\;\mathbf{X}\sim t_{\nu}(\boldsymbol{\mu},\boldsymbol{\Sigma}).$$

There is no simple formula for $F(\mathbf{x})$, but it can be [approximated numerically](http://www.mathworks.com/matlabcentral/fileexchange/53796) via [Monte Carlo integration](/source/Monte_Carlo_integration).[3][4][5]
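As an illustration of the idea only, the sketch below estimates $F(\mathbf{x})$ by plain Monte Carlo: draw samples with the constructive definition and count the fraction falling componentwise below $\mathbf{x}$. This is not the method of the cited references, which use more refined estimators; the function name `mvt_cdf_mc` and all parameter values are assumptions made for the example.

```python
import numpy as np


def mvt_cdf_mc(x, mu, Sigma, nu, n_samples, rng):
    """Plain Monte Carlo estimate of P(X <= x), with a binomial standard error."""
    mu = np.asarray(mu, dtype=float)
    p = mu.shape[0]
    u = rng.chisquare(nu, size=n_samples)
    y = rng.multivariate_normal(np.zeros(p), Sigma, size=n_samples)
    draws = y * np.sqrt(nu / u)[:, None] + mu
    # Indicator that every component lies at or below the corresponding bound.
    inside = np.all(draws <= np.asarray(x, dtype=float), axis=1)
    estimate = inside.mean()
    std_error = np.sqrt(estimate * (1 - estimate) / n_samples)
    return estimate, std_error


rng = np.random.default_rng(2)
estimate, std_error = mvt_cdf_mc(
    np.zeros(2), np.zeros(2), np.eye(2), 4.0, 200_000, rng
)
```

For the spherical bivariate case centred at the origin, symmetry gives $F(\mathbf{0})=1/4$, which provides a simple sanity check on the estimate.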

## Conditional Distribution

The conditional distribution was developed by Muirhead[6] and Cornish,[7] and later derived using the simpler chi-squared ratio representation above by Roth[1] and Ding.[8] Let the vector $X$ follow a multivariate *t* distribution and partition it into two subvectors of $p_{1},p_{2}$ elements:

$$X_{p}=\begin{bmatrix}X_{1}\\X_{2}\end{bmatrix}\sim t_{p}\left(\mu_{p},\Sigma_{p\times p},\nu\right)$$

where $p_{1}+p_{2}=p$, the known mean vectors are $\mu_{p}=\begin{bmatrix}\mu_{1}\\\mu_{2}\end{bmatrix}$ and the scale matrix is $\Sigma_{p\times p}=\begin{bmatrix}\Sigma_{11}&\Sigma_{12}\\\Sigma_{21}&\Sigma_{22}\end{bmatrix}$.

Roth and Ding find the conditional distribution $p(X_{1}|X_{2})$ to be a new *t*-distribution with modified parameters.

$$X_{1}|X_{2}\sim t_{p_{1}}\left(\mu_{1|2},\,\frac{\nu+d_{2}}{\nu+p_{2}}\Sigma_{11|2},\,\nu+p_{2}\right)$$

An equivalent expression in Kotz et al. is somewhat less concise.

Thus the conditional distribution is most easily represented as a two-step procedure. First form the intermediate distribution $X_{1}|X_{2}\sim t_{p_{1}}\left(\mu_{1|2},\Psi,\tilde{\nu}\right)$ above; then, using the parameters below, the explicit conditional distribution becomes

$$f(X_{1}|X_{2})=\frac{\Gamma\left(\frac{\tilde{\nu}+p_{1}}{2}\right)}{\Gamma\left(\frac{\tilde{\nu}}{2}\right)\left(\pi\,\tilde{\nu}\right)^{p_{1}/2}\left|\boldsymbol{\Psi}\right|^{1/2}}\left[1+\frac{1}{\tilde{\nu}}\left(X_{1}-\mu_{1|2}\right)^{\mathsf{T}}\boldsymbol{\Psi}^{-1}\left(X_{1}-\mu_{1|2}\right)\right]^{-(\tilde{\nu}+p_{1})/2}$$

where

- $\tilde{\nu}=\nu+p_{2}$ is the effective degrees of freedom: $\nu$ is augmented by the number of disused variables $p_{2}$.

- $\mu_{1|2}=\mu_{1}+\Sigma_{12}\Sigma_{22}^{-1}\left(X_{2}-\mu_{2}\right)$ is the conditional mean of $X_{1}$.

- $\Sigma_{11|2}=\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$ is the [Schur complement](/source/Schur_complement) of $\Sigma_{22}$ in $\Sigma$.

- $d_{2}=(X_{2}-\mu_{2})^{\mathsf{T}}\Sigma_{22}^{-1}(X_{2}-\mu_{2})$ is the squared [Mahalanobis distance](/source/Mahalanobis_distance) of $X_{2}$ from $\mu_{2}$ with scale matrix $\Sigma_{22}$.

- $\Psi=\frac{\nu+d_{2}}{\tilde{\nu}}\Sigma_{11|2}$ is the conditional scale matrix for $\tilde{\nu}>0$.

- $\Sigma_{cov}=\frac{\tilde{\nu}}{\tilde{\nu}-2}\Psi=\frac{\nu+d_{2}}{\tilde{\nu}-2}\Sigma_{11|2}$ is the conditional covariance matrix for $\tilde{\nu}>2$.
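The conditional parameters can be computed mechanically from the partitioned quantities. The sketch below is a hedged illustration, assuming NumPy and SciPy; the helper `conditional_mvt_params` and the numerical values are hypothetical. It also checks the result against the ratio of the joint density to the marginal density of $X_{2}$, using the fact (stated in the section on marginal distributions) that $X_{2}\sim t(p_{2},\mu_{2},\Sigma_{22},\nu)$.

```python
import numpy as np
from scipy.stats import multivariate_t, t as student_t


def conditional_mvt_params(mu, Sigma, nu, x2, p1):
    """Return (mu_{1|2}, Psi, nu_tilde) for X1 | X2 = x2."""
    mu, Sigma, x2 = (np.asarray(a, dtype=float) for a in (mu, Sigma, x2))
    mu1, mu2 = mu[:p1], mu[p1:]
    S11, S12 = Sigma[:p1, :p1], Sigma[:p1, p1:]
    S21, S22 = Sigma[p1:, :p1], Sigma[p1:, p1:]
    p2 = mu2.shape[0]
    resid = x2 - mu2
    S22_inv_resid = np.linalg.solve(S22, resid)
    d2 = resid @ S22_inv_resid                       # squared Mahalanobis distance
    mu_cond = mu1 + S12 @ S22_inv_resid              # conditional mean
    schur = S11 - S12 @ np.linalg.solve(S22, S21)    # Schur complement Sigma_{11|2}
    nu_tilde = nu + p2
    Psi = (nu + d2) / nu_tilde * schur               # conditional scale matrix
    return mu_cond, Psi, nu_tilde


# Illustrative 3-dimensional example with p1 = 2, p2 = 1.
mu = np.array([1.0, 2.0, 0.0])
Sigma = np.array([[2.0, 0.5, 1.0],
                  [0.5, 1.0, 0.2],
                  [1.0, 0.2, 4.0]])
nu = 3.0
x1, x2 = np.array([0.3, 1.0]), np.array([2.0])

mu_cond, Psi, nu_tilde = conditional_mvt_params(mu, Sigma, nu, x2, p1=2)

# f(x1 | x2) should equal f(x1, x2) / f(x2).
joint = multivariate_t.pdf(np.concatenate([x1, x2]), loc=mu, shape=Sigma, df=nu)
marginal_x2 = student_t.pdf(x2[0], df=nu, loc=mu[2], scale=np.sqrt(Sigma[2, 2]))
conditional = multivariate_t.pdf(x1, loc=mu_cond, shape=Psi, df=nu_tilde)
```

Unlike the multivariate normal case, the conditional scale matrix depends on the observed $X_{2}$ through $d_{2}$.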

## Copulas based on the multivariate *t*

The use of such distributions is enjoying renewed interest due to applications in [mathematical finance](/source/Mathematical_finance), especially through the use of the Student's *t* [copula](/source/Copula_(statistics)).[9]

## Elliptical representation

Constructed as an [elliptical distribution](/source/Elliptical_distribution),[10] take the simplest centralised case with spherical symmetry and no scaling, $\Sigma=\operatorname{I}$; then the multivariate *t*-PDF takes the form

$$f_{X}(X)=g(X^{\mathsf{T}}X)=\frac{\Gamma\left(\frac{\nu+p}{2}\right)}{(\nu\pi)^{p/2}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1+\nu^{-1}X^{\mathsf{T}}X\right)^{-(\nu+p)/2}$$

where $X=(x_{1},\cdots,x_{p})^{\mathsf{T}}$ is a $p$-vector and $\nu$ is the degrees of freedom as defined in Muirhead[6] section 1.5. The covariance of $X$ is

$$\operatorname{E}\left(XX^{\mathsf{T}}\right)=\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}f_{X}(x_{1},\dots,x_{p})\,XX^{\mathsf{T}}\,dx_{1}\dots dx_{p}=\frac{\nu}{\nu-2}\operatorname{I}$$

The aim is to convert the Cartesian PDF to a radial one. Kibria and Joarder[11] define the radial measure $r_{2}=R^{2}=\frac{X^{\mathsf{T}}X}{p}$ and, noting that the density is dependent only on $r_{2}$, we get

$$\operatorname{E}[r_{2}]=\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}f_{X}(x_{1},\dots,x_{p})\,\frac{X^{\mathsf{T}}X}{p}\,dx_{1}\dots dx_{p}=\frac{\nu}{\nu-2}$$

which is equivalent to the variance of the $p$-element vector $X$ treated as a univariate heavy-tail zero-mean random sequence with uncorrelated, yet statistically dependent, elements.

### Radial Distribution

$r_{2}=\frac{X^{\mathsf{T}}X}{p}$ follows the [Fisher-Snedecor](/source/Fisher-Snedecor_distribution) or $F$ distribution:

$$r_{2}\sim f_{F}(p,\nu)=B\left(\frac{p}{2},\frac{\nu}{2}\right)^{-1}\left(\frac{p}{\nu}\right)^{p/2}r_{2}^{\,p/2-1}\left(1+\frac{p}{\nu}r_{2}\right)^{-(p+\nu)/2}$$

having mean value $\operatorname{E}[r_{2}]=\frac{\nu}{\nu-2}$. $F$-distributions arise naturally in tests of sums of squares of sampled data after normalization by the sample standard deviation.

By a change of random variable to $y=\frac{p}{\nu}r_{2}=\frac{X^{\mathsf{T}}X}{\nu}$ in the equation above, retaining the $p$-vector $X$, we have

$$\operatorname{E}[y]=\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}f_{X}(X)\,\frac{X^{\mathsf{T}}X}{\nu}\,dx_{1}\dots dx_{p}=\frac{p}{\nu-2}$$

and probability distribution

$$\begin{aligned}f_{Y}(y\,|\,p,\nu)&=\left|\frac{p}{\nu}\right|^{-1}B\left(\frac{p}{2},\frac{\nu}{2}\right)^{-1}\left(\frac{p}{\nu}\right)^{p/2}\left(\frac{p}{\nu}\right)^{-(p/2-1)}y^{\,p/2-1}\left(1+y\right)^{-(p+\nu)/2}\\[2ex]&=B\left(\frac{p}{2},\frac{\nu}{2}\right)^{-1}y^{\,p/2-1}\left(1+y\right)^{-(\nu+p)/2}\end{aligned}$$

which is a regular [Beta-prime distribution](/source/Beta-prime_distribution) $y\sim\beta\,'\left(y;\frac{p}{2},\frac{\nu}{2}\right)$ having mean value $\frac{\frac{1}{2}p}{\frac{1}{2}\nu-1}=\frac{p}{\nu-2}$.

### Cumulative Radial Distribution

Given the Beta-prime distribution, the radial cumulative distribution function of $y$ is known:

$$F_{Y}(y)=I\left(\frac{y}{1+y};\,\frac{p}{2},\frac{\nu}{2}\right)B\left(\frac{p}{2},\frac{\nu}{2}\right)^{-1}$$

where $I$ is the incomplete [Beta function](/source/Beta_function) and applies with a spherical $\Sigma$ assumption.
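These relationships can be cross-checked with standard library routines. The sketch below, assuming SciPy and using illustrative values of $p$, $\nu$ and $y$, evaluates the radial CDF three ways: through the incomplete Beta function, through the Beta-prime distribution of $y$, and through the $F$ distribution of $r_{2}=\frac{\nu}{p}y$.

```python
import numpy as np
from scipy.special import betainc
from scipy.stats import betaprime, f as fisher_f

p, nu = 3, 5.0
y = np.array([0.1, 0.5, 1.0, 4.0])  # illustrative evaluation points

# scipy.special.betainc is the *regularised* incomplete beta function,
# so it already includes the division by B(p/2, nu/2).
cdf_incomplete_beta = betainc(p / 2, nu / 2, y / (1 + y))

# y = X^T X / nu follows a Beta-prime distribution with parameters (p/2, nu/2).
cdf_beta_prime = betaprime.cdf(y, p / 2, nu / 2)

# Equivalent statement for r_2 = X^T X / p = (nu / p) * y, which is F(p, nu).
cdf_f = fisher_f.cdf(nu / p * y, p, nu)
```

All three arrays should coincide up to floating-point error.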

In the scalar case, $p=1$, the distribution is equivalent to Student-*t* with the equivalence $t^{2}=y^{2}\sigma^{-1}$, the variable *t* having double-sided tails for CDF purposes, i.e. the "two-tail-t-test".

The radial distribution can also be derived via a straightforward coordinate transformation from Cartesian to spherical. A constant radius surface at $R=\left(X^{\mathsf{T}}X\right)^{1/2}$ with PDF $p_{X}(X)\propto\left(1+\nu^{-1}R^{2}\right)^{-(\nu+p)/2}$ is an iso-density surface. Given this density value, the quantum of probability on a shell of surface area $A_{R}$ and thickness $\delta R$ at $R$ is $\delta P=p_{X}(R)\,A_{R}\,\delta R$.

The enclosed $p$-sphere of radius $R$ has surface area $A_{R}=\frac{2\pi^{p/2}R^{\,p-1}}{\Gamma(p/2)}$. Substitution into $\delta P$ shows that the shell has element of probability $\delta P=p_{X}(R)\frac{2\pi^{p/2}R^{p-1}}{\Gamma(p/2)}\delta R$, which is equivalent to the radial density function

$$f_{R}(R)=\frac{\Gamma\left(\frac{1}{2}(\nu+p)\right)}{\nu^{\,p/2}\pi^{\,p/2}\,\Gamma\left(\frac{1}{2}\nu\right)}\,\frac{2\pi^{p/2}R^{p-1}}{\Gamma(p/2)}\left(1+\frac{R^{2}}{\nu}\right)^{-(\nu+p)/2}$$

which further simplifies to

$$f_{R}(R)=\frac{2}{\nu^{1/2}B\left(\frac{1}{2}p,\frac{1}{2}\nu\right)}\left(\frac{R^{2}}{\nu}\right)^{(p-1)/2}\left(1+\frac{R^{2}}{\nu}\right)^{-(\nu+p)/2}$$

where $B(*,*)$ is the [Beta function](/source/Beta_function).

Changing the radial variable to $y=R^{2}/\nu$ returns the previous Beta Prime distribution

$$f_{Y}(y)=\frac{1}{B\left(\frac{1}{2}p,\frac{1}{2}\nu\right)}\,y^{\,p/2-1}\left(1+y\right)^{-(\nu+p)/2}$$

To scale the radial variables without changing the radial shape function, define the scale matrix $\Sigma=\alpha\operatorname{I}$, yielding a 3-parameter Cartesian density function, i.e. the probability $\Delta_{P}$ in volume element $dx_{1}\dots dx_{p}$ is

$$\Delta_{P}\big(f_{X}(X\,|\,\alpha,p,\nu)\big)=\frac{\Gamma\left(\frac{1}{2}(\nu+p)\right)}{(\nu\pi)^{\,p/2}\alpha^{\,p/2}\,\Gamma\left(\frac{1}{2}\nu\right)}\left(1+\frac{X^{\mathsf{T}}X}{\alpha\nu}\right)^{-(\nu+p)/2}dx_{1}\dots dx_{p}$$

or, in terms of the scalar radial variable $R$,

$$f_{R}(R\,|\,\alpha,p,\nu)=\frac{2}{\alpha^{1/2}\,\nu^{1/2}B\left(\frac{1}{2}p,\frac{1}{2}\nu\right)}\left(\frac{R^{2}}{\alpha\,\nu}\right)^{(p-1)/2}\left(1+\frac{R^{2}}{\alpha\,\nu}\right)^{-(\nu+p)/2}$$

### Radial Moments

The moments of all the radial variables, under the spherical distribution assumption, can be derived from the Beta Prime distribution. If $Z\sim\beta'(a,b)$ then $\operatorname{E}(Z^{m})=\frac{B(a+m,b-m)}{B(a,b)}$, a known result. Thus, for the variable $y=\frac{X^{\mathsf{T}}X}{\nu}$ we have

$$\operatorname{E}(y^{m})=\frac{B\left(\frac{1}{2}p+m,\frac{1}{2}\nu-m\right)}{B\left(\frac{1}{2}p,\frac{1}{2}\nu\right)}=\frac{\Gamma\left(\frac{1}{2}p+m\right)\,\Gamma\left(\frac{1}{2}\nu-m\right)}{\Gamma\left(\frac{1}{2}p\right)\,\Gamma\left(\frac{1}{2}\nu\right)},\quad\nu/2>m$$

The moments of $r_{2}=\frac{\nu}{p}\,y$ are

$$\operatorname{E}(r_{2}^{m})=\left(\frac{\nu}{p}\right)^{m}\operatorname{E}(y^{m})$$

(for $m=1$ this recovers $\operatorname{E}[r_{2}]=\frac{\nu}{\nu-2}$), while introducing the scale matrix $\alpha\operatorname{I}$ yields

$$\operatorname{E}(r_{2}^{m}\,|\,\alpha)=\alpha^{m}\left(\frac{\nu}{p}\right)^{m}\operatorname{E}(y^{m})$$

Moments relating to the radial variable $R=\left(X^{\mathsf{T}}X\right)^{1/2}$ are found by setting $R=(\alpha\nu y)^{1/2}$ and $M=2m$, whereupon

$$\begin{aligned}\operatorname{E}(R^{M})&=\operatorname{E}\!\left[\left((\alpha\nu y)^{1/2}\right)^{M}\right]=(\alpha\nu)^{M/2}\operatorname{E}(y^{M/2})\\[1ex]&=(\alpha\nu)^{M/2}\,\frac{B\left(\frac{1}{2}(p+M),\frac{1}{2}(\nu-M)\right)}{B\left(\frac{p}{2},\frac{\nu}{2}\right)}\end{aligned}$$
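As a worked check of the expression for $\operatorname{E}(R^{M})$ with $R=\left(X^{\mathsf{T}}X\right)^{1/2}$, the sketch below evaluates it with log-Beta functions for numerical stability. The function name `radial_moment` and the parameter values are illustrative assumptions. For $M=2$ the result should equal the trace of the covariance matrix $\frac{\nu}{\nu-2}\alpha\operatorname{I}$, namely $\frac{\alpha\,p\,\nu}{\nu-2}$.

```python
import numpy as np
from scipy.special import betaln


def radial_moment(M, p, nu, alpha=1.0):
    """E(R^M) for R = ||X||, X spherical t with scale matrix alpha * I (needs nu > M)."""
    if nu <= M:
        raise ValueError("the moment of order M exists only for nu > M")
    # Ratio of Beta functions computed on the log scale.
    log_ratio = betaln((p + M) / 2, (nu - M) / 2) - betaln(p / 2, nu / 2)
    return (alpha * nu) ** (M / 2) * np.exp(log_ratio)


p, nu, alpha = 3, 7.0, 2.0                      # illustrative values
second_moment = radial_moment(2, p, nu, alpha)  # E(R^2) = E(X^T X)
trace_of_cov = alpha * p * nu / (nu - 2)        # trace of nu / (nu - 2) * alpha * I
```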

## Linear Combinations and Affine Transformation

### Full Rank Transform

This closely relates to the multivariate normal method and is described in Kotz and Nadarajah, Kibria and Joarder, Roth, and Cornish. Starting from a somewhat simplified version of the central MV-t pdf,

$$f_{X}(X)=\frac{\mathrm{K}}{\left|\Sigma\right|^{1/2}}\left(1+\nu^{-1}X^{\mathsf{T}}\Sigma^{-1}X\right)^{-(\nu+p)/2},$$

where $\mathrm{K}$ is a constant and $\nu$ is arbitrary but fixed, let $\Theta\in\mathbb{R}^{p\times p}$ be a full-rank matrix and form the vector $Y=\Theta X$. Then, by straightforward change of variables,

$$f_{Y}(Y)=\frac{\mathrm{K}}{\left|\Sigma\right|^{1/2}}\left(1+\nu^{-1}Y^{\mathsf{T}}\Theta^{-\mathsf{T}}\Sigma^{-1}\Theta^{-1}Y\right)^{-(\nu+p)/2}\left|\frac{\partial Y}{\partial X}\right|^{-1}$$

The matrix of partial derivatives is $\frac{\partial Y_{i}}{\partial X_{j}}=\Theta_{i,j}$ and the Jacobian becomes $\left|\frac{\partial Y}{\partial X}\right|=\left|\Theta\right|$. Thus

$$f_{Y}(Y)=\frac{\mathrm{K}}{\left|\Sigma\right|^{1/2}\left|\Theta\right|}\left(1+\nu^{-1}Y^{\mathsf{T}}\Theta^{-\mathsf{T}}\Sigma^{-1}\Theta^{-1}Y\right)^{-(\nu+p)/2}$$

The denominator reduces to

$$\left|\Sigma\right|^{1/2}\left|\Theta\right|=\left|\Sigma\right|^{1/2}\left|\Theta\right|^{1/2}\left|\Theta^{\mathsf{T}}\right|^{1/2}=\left|\Theta\Sigma\Theta^{\mathsf{T}}\right|^{1/2}$$

In full:

$$f_{Y}(Y)=\frac{\Gamma\left[(\nu+p)/2\right]}{\Gamma(\nu/2)\,(\nu\,\pi)^{\,p/2}\left|\Theta\Sigma\Theta^{\mathsf{T}}\right|^{1/2}}\left(1+\nu^{-1}Y^{\mathsf{T}}\left(\Theta\Sigma\Theta^{\mathsf{T}}\right)^{-1}Y\right)^{-(\nu+p)/2}$$

which is a regular MV-*t* distribution.

In general, if $X\sim t_{p}(\mu,\Sigma,\nu)$ and $\Theta\in\mathbb{R}^{p\times p}$ has full rank $p$, then

$$\Theta X+c\sim t_{p}(\Theta\mu+c,\,\Theta\Sigma\Theta^{\mathsf{T}},\,\nu)$$
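The change-of-variables argument can be verified numerically at a single point. The sketch below assumes NumPy and SciPy; the helper `affine_mvt_params` and the specific matrices are illustrative. It checks that $f_{Y}(\Theta x+c)=f_{X}(x)/\left|\det\Theta\right|$ when $Y$ is given the transformed parameters.

```python
import numpy as np
from scipy.stats import multivariate_t


def affine_mvt_params(mu, Sigma, nu, Theta, c):
    """Parameters of Theta @ X + c when X ~ t_p(mu, Sigma, nu)."""
    return Theta @ mu + c, Theta @ Sigma @ Theta.T, nu


mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])
nu = 4.0
Theta = np.array([[1.0, 2.0], [0.0, 3.0]])  # full rank, determinant 3
c = np.array([0.5, 0.5])

mu_y, Sigma_y, nu_y = affine_mvt_params(mu, Sigma, nu, Theta, c)

# Change of variables: f_Y(Theta x + c) = f_X(x) / |det Theta|.
x = np.array([0.2, 0.7])
f_x = multivariate_t.pdf(x, loc=mu, shape=Sigma, df=nu)
f_y = multivariate_t.pdf(Theta @ x + c, loc=mu_y, shape=Sigma_y, df=nu_y)
jacobian = abs(np.linalg.det(Theta))
```

The degrees of freedom are unchanged by the transformation; only the location and scale matrix are mapped.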

### Marginal Distributions

This is a special case of the rank-reducing linear transform below. Kotz defines marginal distributions as follows. Partition $X\sim t(p,\mu,\Sigma,\nu)$ into two subvectors of $p_{1},p_{2}$ elements:

$$X_{p}=\begin{bmatrix}X_{1}\\X_{2}\end{bmatrix}\sim t\left(p_{1}+p_{2},\mu_{p},\Sigma_{p\times p},\nu\right)$$

with $p_{1}+p_{2}=p$, means $\mu_{p}=\begin{bmatrix}\mu_{1}\\\mu_{2}\end{bmatrix}$, and scale matrix $\Sigma_{p\times p}=\begin{bmatrix}\Sigma_{11}&\Sigma_{12}\\\Sigma_{21}&\Sigma_{22}\end{bmatrix}$;

then $X_{1}\sim t\left(p_{1},\mu_{1},\Sigma_{11},\nu\right)$ and $X_{2}\sim t\left(p_{2},\mu_{2},\Sigma_{22},\nu\right)$, such that

$$f(X_{1})=\frac{\Gamma\left[(\nu+p_{1})/2\right]}{\Gamma(\nu/2)\,(\nu\,\pi)^{\,p_{1}/2}\left|\boldsymbol{\Sigma}_{11}\right|^{1/2}}\left[1+\frac{1}{\nu}(\mathbf{X}_{1}-\boldsymbol{\mu}_{1})^{\mathsf{T}}\boldsymbol{\Sigma}_{11}^{-1}(\mathbf{X}_{1}-\boldsymbol{\mu}_{1})\right]^{-(\nu+p_{1})/2}$$

$$f(X_{2})=\frac{\Gamma\left[(\nu+p_{2})/2\right]}{\Gamma(\nu/2)\,(\nu\,\pi)^{\,p_{2}/2}\left|\boldsymbol{\Sigma}_{22}\right|^{1/2}}\left[1+\frac{1}{\nu}(\mathbf{X}_{2}-\boldsymbol{\mu}_{2})^{\mathsf{T}}\boldsymbol{\Sigma}_{22}^{-1}(\mathbf{X}_{2}-\boldsymbol{\mu}_{2})\right]^{-(\nu+p_{2})/2}$$

If a transformation is constructed in the form

$$\Theta_{p_{1}\times p}=\begin{bmatrix}1&\cdots&0&\cdots&0\\0&\ddots&0&\cdots&0\\0&\cdots&1&\cdots&0\end{bmatrix}$$

then the vector $Y=\Theta X$, as discussed below, has the same distribution as the marginal distribution of $X_{1}$.
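A small numerical illustration of the marginal result follows, assuming NumPy and SciPy and using a hypothetical bivariate example with $p_{1}=1$. The selection matrix extracts $\mu_{1}$ and $\Sigma_{11}$, and the marginal density at one point is compared with the joint density integrated over $X_{2}$ by quadrature.

```python
import numpy as np
from scipy import integrate
from scipy.stats import multivariate_t, t as student_t

mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])
nu = 4.0
p, p1 = 2, 1

# Selection matrix Theta = [I_{p1} | 0] picks out the first p1 coordinates.
Theta = np.hstack([np.eye(p1), np.zeros((p1, p - p1))])
mu_1 = Theta @ mu
Sigma_11 = Theta @ Sigma @ Theta.T

# Integrate the joint density over x2 at a fixed x1 (illustrative point).
x1 = 0.3
integrated, _ = integrate.quad(
    lambda x2: multivariate_t.pdf([x1, x2], loc=mu, shape=Sigma, df=nu),
    -np.inf, np.inf,
)

# Claimed marginal: univariate t with location mu_1, scale sqrt(Sigma_11), same nu.
marginal = student_t.pdf(x1, df=nu, loc=mu_1[0], scale=np.sqrt(Sigma_11[0, 0]))
```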

### Rank-Reducing Linear Transform

In the linear transform case, if $\Theta$ is a rectangular matrix $\Theta\in\mathbb{R}^{m\times p},\,m<p$, of rank $m$, the result is dimensionality reduction. Here, the Jacobian $\left|\Theta\right|$ is seemingly rectangular but the value $\left|\Theta\Sigma\Theta^{\mathsf{T}}\right|^{1/2}$ in the denominator of the pdf is nevertheless correct. There is a discussion of rectangular matrix product determinants in Aitken.[12] In general, if $X\sim t(p,\mu,\Sigma,\nu)$ and $\Theta\in\mathbb{R}^{m\times p}$ has full rank $m$, then

$$Y=\Theta X+c\sim t(m,\,\Theta\mu+c,\,\Theta\Sigma\Theta^{\mathsf{T}},\,\nu)$$

$$f_{Y}(Y)=\frac{\Gamma\left[(\nu+m)/2\right]}{\Gamma(\nu/2)\,(\nu\,\pi)^{\,m/2}\left|\Theta\Sigma\Theta^{\mathsf{T}}\right|^{1/2}}\left[1+\frac{1}{\nu}(Y-c_{1})^{\mathsf{T}}(\Theta\Sigma\Theta^{\mathsf{T}})^{-1}(Y-c_{1})\right]^{-(\nu+m)/2},\quad c_{1}=\Theta\mu+c$$

*In extremis*, if *m* = 1 and $\Theta$ becomes a row vector, then the scalar *Y* follows a univariate double-sided Student-t distribution defined by $t^{2}=Y^{2}/\sigma^{2}$ with the same $\nu$ degrees of freedom. Kibria et al. use the affine transformation to find the marginal distributions, which are also MV-*t*.
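The $m=1$ case lends itself to a quick simulation check. In the sketch below (illustrative seed, parameters and row vector; NumPy and SciPy assumed), $X$ is drawn with the constructive definition, projected to a scalar $Y=\Theta X+c$, and the empirical CDF of $Y$ is compared with a univariate Student's *t* having location $\Theta\mu+c$, scale $\sigma=\left(\Theta\Sigma\Theta^{\mathsf{T}}\right)^{1/2}$ and the same $\nu$.

```python
import numpy as np
from scipy.stats import t as student_t

rng = np.random.default_rng(3)
mu = np.array([1.0, -1.0, 0.5])
Sigma = np.array([[2.0, 0.6, 0.0],
                  [0.6, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
nu, n = 5.0, 200_000
Theta = np.array([[1.0, -2.0, 0.5]])  # m = 1: a single row vector
c = np.array([0.25])

# Draw X ~ t_3(mu, Sigma, nu) using the constructive definition.
u = rng.chisquare(nu, size=n)
normal_draws = rng.multivariate_normal(np.zeros(3), Sigma, size=n)
X = normal_draws * np.sqrt(nu / u)[:, None] + mu
Y = (X @ Theta.T + c).ravel()

# Parameters of the projected scalar distribution.
loc = (Theta @ mu + c).item()
sigma = np.sqrt((Theta @ Sigma @ Theta.T).item())

grid = np.array([-3.0, 0.0, 3.0, 6.0])
empirical_cdf = np.array([(Y <= g).mean() for g in grid])
theoretical_cdf = student_t.cdf(grid, df=nu, loc=loc, scale=sigma)
```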

- During affine transformations of variables with elliptical distributions all vectors must ultimately derive from one initial isotropic spherical vector $Z$ whose elements remain 'entangled' and are not statistically independent.

- A vector of independent student-*t* samples is not consistent with the multivariate *t* distribution.

- Adding two sample multivariate *t* vectors generated with independent Chi-squared samples and different $\nu$ values, $1/\sqrt{u_{1}/\nu_{1}}$ and $1/\sqrt{u_{2}/\nu_{2}}$, will not produce internally consistent distributions, though they will yield a [Behrens-Fisher problem](/source/Behrens-Fisher_problem).[13]

- Taleb compares many examples of fat-tail elliptical *vs* non-elliptical multivariate distributions.

## Related concepts

- In univariate statistics, the [Student's *t*-test](/source/Student's_t-test) makes use of [Student's *t*-distribution](/source/Student's_t-distribution)

- The elliptical multivariate-*t* distribution arises spontaneously in linearly constrained least squares solutions involving multivariate normal source data, for example the Markowitz global minimum variance solution in financial portfolio analysis,[14][15][2] which addresses an ensemble of normal random vectors or a random matrix. It does not arise in ordinary least squares (OLS) or multiple regression with fixed dependent and independent variables, a problem that tends to produce well-behaved normal error probabilities.

- [Hotelling's *T*-squared distribution](/source/Hotelling's_T-squared_distribution) is a distribution that arises in multivariate statistics.

- The [matrix *t*-distribution](/source/Matrix_t-distribution) is a distribution for random variables arranged in a matrix structure.


## See also

- [Multivariate normal distribution](/source/Multivariate_normal_distribution), which is the limiting case of the multivariate Student's t-distribution when ν ↑ ∞ {\displaystyle \nu \uparrow \infty } .

- [Chi distribution](/source/Chi_distribution), the [pdf](/source/Probability_density_function) of the scaling factor in the construction of the Student's t-distribution and also of the [2-norm](/source/Norm_(mathematics)#p-norm) (or [Euclidean norm](/source/Euclidean_norm)) of a multivariate normally distributed vector (centered at zero).

- [Rayleigh distribution § Student's t](/source/Rayleigh_distribution#Student's_t), random vector length of multivariate *t*-distribution

- [Mahalanobis distance](/source/Mahalanobis_distance)
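The limiting relationship with the multivariate normal distribution noted in the list above can be checked directly from the density. The sketch below is illustrative only: the function names and the parameterization by the squared Mahalanobis distance `q` and the log-determinant of the scale matrix are assumptions made for brevity. It evaluates the multivariate *t* log-density from the PDF formula and the multivariate normal log-density at the same point, so the gap between them can be inspected as ν grows.

```python
import math


def mvt_logpdf(q, p, nu, log_det_sigma=0.0):
    """Log-density of t_p(mu, Sigma, nu) at a point whose squared
    Mahalanobis distance (x - mu)^T Sigma^{-1} (x - mu) equals q."""
    return (math.lgamma((nu + p) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * p * math.log(nu * math.pi)
            - 0.5 * log_det_sigma
            - 0.5 * (nu + p) * math.log1p(q / nu))


def mvn_logpdf(q, p, log_det_sigma=0.0):
    """Log-density of the p-variate normal N(mu, Sigma) at the same point."""
    return -0.5 * p * math.log(2.0 * math.pi) - 0.5 * log_det_sigma - 0.5 * q


if __name__ == "__main__":
    q, p = 2.0, 2
    for nu in (5, 50, 500, 1e6):
        gap = mvt_logpdf(q, p, nu) - mvn_logpdf(q, p)
        print(f"nu = {nu:>9}: log-density gap = {gap:.6g}")
```

The gap should shrink roughly in proportion to 1/ν, which is consistent with the factor [1 + q/ν]^(−(ν+p)/2) approaching exp(−q/2).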

## References

1. ^ [***a***](#cite_ref-:0_1-0) [***b***](#cite_ref-:0_1-1) Roth, Michael (17 April 2013). ["On the Multivariate t Distribution"](http://users.isy.liu.se/en/rt/roth/student.pdf) (PDF). *Automatic Control group. Linköping University, Sweden*. [Archived](https://web.archive.org/web/20220731142649/http://users.isy.liu.se/en/rt/roth/student.pdf) (PDF) from the original on 31 July 2022. Retrieved 1 June 2022.

1. ^ [***a***](#cite_ref-:2_2-0) [***b***](#cite_ref-:2_2-1) Bodnar, T; Okhrin, Y (2008). ["Properties of the Singular, Inverse and Generalized inverse Partitioned Wishart Distribution"](https://core.ac.uk/download/pdf/82469023.pdf) (PDF). *Journal of Multivariate Analysis*. **99** (Eqn.20): 2389–2405. [doi](/source/Doi_(identifier)):[10.1016/j.jmva.2008.02.024](https://doi.org/10.1016%2Fj.jmva.2008.02.024).

1. **[^](#cite_ref-bochen22_3-0)** Botev, Z.; Chen, Y.-L. (2022). ["Chapter 4: Truncated Multivariate Student Computations via Exponential Tilting."](https://doi.org/10.1007/978-3-031-10193-9_4). In Botev, Zdravko; Keller, Alexander; Lemieux, Christiane; Tuffin, Bruno (eds.). *Advances in Modeling and Simulation: Festschrift for Pierre L'Ecuyer*. Springer. pp. 65–87. [doi](/source/Doi_(identifier)):[10.1007/978-3-031-10193-9_4](https://doi.org/10.1007%2F978-3-031-10193-9_4). [ISBN](/source/ISBN_(identifier)) [978-3-031-10192-2](https://en.wikipedia.org/wiki/Special:BookSources/978-3-031-10192-2).

1. **[^](#cite_ref-boLec16_4-0)** Botev, Z. I.; L'Ecuyer, P. (6 December 2015). "Efficient probability estimation and simulation of the truncated multivariate student-t distribution". *2015 Winter Simulation Conference (WSC)*. Huntington Beach, CA, USA: IEEE. pp. 380–391. [doi](/source/Doi_(identifier)):[10.1109/WSC.2015.7408180](https://doi.org/10.1109%2FWSC.2015.7408180). [hdl](/source/Hdl_(identifier)):[1959.4/unsworks_38275](https://hdl.handle.net/1959.4%2Funsworks_38275).

1. **[^](#cite_ref-Genz_5-0)** Genz, Alan (2009). [*Computation of Multivariate Normal and t Probabilities*](https://www.springer.com/statistics/computational+statistics/book/978-3-642-01688-2). Lecture Notes in Statistics. Vol. 195. Springer. [doi](/source/Doi_(identifier)):[10.1007/978-3-642-01689-9](https://doi.org/10.1007%2F978-3-642-01689-9). [ISBN](/source/ISBN_(identifier)) [978-3-642-01689-9](https://en.wikipedia.org/wiki/Special:BookSources/978-3-642-01689-9). [Archived](https://web.archive.org/web/20220827214814/https://link.springer.com/book/10.1007/978-3-642-01689-9) from the original on 2022-08-27. Retrieved 2017-09-05.

1. ^ [***a***](#cite_ref-:1_6-0) [***b***](#cite_ref-:1_6-1) Muirhead, Robb (1982). *Aspects of Multivariate Statistical Theory*. USA: Wiley. pp. 32–36, Theorem 1.5.4. [ISBN](/source/ISBN_(identifier)) [978-0-471-76985-9](https://en.wikipedia.org/wiki/Special:BookSources/978-0-471-76985-9).

1. **[^](#cite_ref-7)** Cornish, E A (1954). ["The Multivariate t-Distribution Associated with a Set of Normal Sample Deviates"](https://www.publish.csiro.au/PH/pdf/PH540531). *Australian Journal of Physics*. **7**: 531–542. [doi](/source/Doi_(identifier)):[10.1071/PH550193](https://doi.org/10.1071%2FPH550193).

1. **[^](#cite_ref-8)** Ding, Peng (2016). ["On the Conditional Distribution of the Multivariate t Distribution"](https://www.tandfonline.com/doi/full/10.1080/00031305.2016.1164756). *The American Statistician*. **70** (3): 293–295. [arXiv](/source/ArXiv_(identifier)):[1604.00561](https://arxiv.org/abs/1604.00561). [doi](/source/Doi_(identifier)):[10.1080/00031305.2016.1164756](https://doi.org/10.1080%2F00031305.2016.1164756). [S2CID](/source/S2CID_(identifier)) [55842994](https://api.semanticscholar.org/CorpusID:55842994).

1. **[^](#cite_ref-9)** Demarta, Stefano; McNeil, Alexander (2004). ["The t Copula and Related Copulas"](https://www.risknet.de/uploads/tx_bxelibrary/t-Copula-Demarta-ETH.pdf) (PDF). *Risknet*.

1. **[^](#cite_ref-10)** Osiewalski, Jacek; Steele, Mark (1996). "Posterior Moments of Scale Parameters in Elliptical Sampling Models". *Bayesian Analysis in Statistics and Econometrics*. Wiley. pp. 323–335. [ISBN](/source/ISBN_(identifier)) [0-471-11856-7](https://en.wikipedia.org/wiki/Special:BookSources/0-471-11856-7).

1. **[^](#cite_ref-11)** Kibria, K M G; Joarder, A H (Jan 2006). ["A short review of multivariate t distribution"](https://jsr.isrt.ac.bd/wp-content/uploads/40n1_5.pdf) (PDF). *Journal of Statistical Research*. **40** (1): 59–72. [doi](/source/Doi_(identifier)):[10.1007/s42979-021-00503-0](https://doi.org/10.1007%2Fs42979-021-00503-0). [S2CID](/source/S2CID_(identifier)) [232163198](https://api.semanticscholar.org/CorpusID:232163198).

1. **[^](#cite_ref-12)** Aitken, A. C. (1948). *Determinants and Matrices* (5th ed.). Edinburgh: Oliver and Boyd. Chapter IV, section 36.

1. **[^](#cite_ref-13)** Giron, Javier; del Castillo, Carmen (2010). ["The multivariate Behrens–Fisher distribution"](https://doi.org/10.1016%2Fj.jmva.2010.04.008). *Journal of Multivariate Analysis*. **101** (9): 2091–2102. [doi](/source/Doi_(identifier)):[10.1016/j.jmva.2010.04.008](https://doi.org/10.1016%2Fj.jmva.2010.04.008).

1. **[^](#cite_ref-14)** Okhrin, Y; Schmid, W (2006). ["Distributional Properties of Portfolio Weights"](https://www.sciencedirect.com/science/article/abs/pii/S0304407605001442). *Journal of Econometrics*. **134**: 235–256. [doi](/source/Doi_(identifier)):[10.1016/j.jeconom.2005.06.022](https://doi.org/10.1016%2Fj.jeconom.2005.06.022).

1. **[^](#cite_ref-15)** Bodnar, T; Dmytriv, S; Parolya, N; Schmid, W (2019). "Tests for the Weights of the Global Minimum Variance Portfolio in a High-Dimensional Setting". *IEEE Transactions on Signal Processing*. **67** (17): 4479–4493. [arXiv](/source/ArXiv_(identifier)):[1710.09587](https://arxiv.org/abs/1710.09587). [Bibcode](/source/Bibcode_(identifier)):[2019ITSP...67.4479B](https://ui.adsabs.harvard.edu/abs/2019ITSP...67.4479B). [doi](/source/Doi_(identifier)):[10.1109/TSP.2019.2929964](https://doi.org/10.1109%2FTSP.2019.2929964).

## Literature

- Kotz, Samuel; Nadarajah, Saralees (2004). *Multivariate t Distributions and Their Applications*. Cambridge University Press. [ISBN](/source/ISBN_(identifier)) [978-0521826549](https://en.wikipedia.org/wiki/Special:BookSources/978-0521826549).

- Cherubini, Umberto; Luciano, Elisa; Vecchiato, Walter (2004). *Copula methods in finance*. John Wiley & Sons. [ISBN](/source/ISBN_(identifier)) [978-0470863442](https://en.wikipedia.org/wiki/Special:BookSources/978-0470863442).

- Taleb, Nassim Nicholas (2023). *Statistical Consequences of Fat Tails* (1st ed.). Academic Press. [ISBN](/source/ISBN_(identifier)) [979-8218248031](https://en.wikipedia.org/wiki/Special:BookSources/979-8218248031).

## External links

- [Copula Methods vs Canonical Multivariate Distributions: the multivariate Student T distribution with general degrees of freedom](https://web.archive.org/web/20061202010900/http://www.mth.kcl.ac.uk/~shaww/web_page/papers/MultiStudentc.pdf)

- [Multivariate Student's *t* distribution](http://www.statlect.com/mcdstu1.htm)


---
Adapted from the Wikipedia article [Multivariate t-distribution](https://en.wikipedia.org/wiki/Multivariate_t-distribution) by Wikipedia contributors ([contributor history](https://en.wikipedia.org/wiki/Multivariate_t-distribution?action=history)). Available under [Creative Commons Attribution-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/). Changes may have been made.