Jacobi's formula

{{short description|Formula for the derivative of a matrix determinant}}

In [[matrix calculus]], '''Jacobi's formula''' expresses the [[derivative]] of the [[determinant]] of a matrix ''A'' in terms of the [[adjugate]] of ''A'' and the derivative of ''A''.<ref>{{harvtxt|Magnus|Neudecker|1999|pp=149–150}}, Part Three, Section 8.3</ref>

If {{mvar|A}} is a differentiable map from the real numbers to {{math|''n'' × ''n''}} matrices, then

:<math> \frac{d}{dt} \det A(t) = \operatorname{tr} \left (\operatorname{adj}(A(t)) \, \frac{dA(t)}{dt}\right ) = \left(\det A(t) \right) \cdot \operatorname{tr} \left (A(t)^{-1} \cdot \, \frac{dA(t)}{dt}\right )</math>

where {{math|tr(''X'')}} is the [[trace (linear algebra)|trace]] of the matrix {{mvar|X}} and <math>\operatorname{adj}(X)</math> is its [[adjugate matrix]]. (The latter equality only holds if ''A''(''t'') is [[Invertible matrix|invertible]].)
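The first equality can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation: the helper names <code>det</code>, <code>adjugate</code> and <code>trace_of_product</code>, the sample map <code>A(t)</code> and its hand-computed derivative <code>dA(t)</code> are all illustrative assumptions, and the recursive determinant is only practical for small matrices.

```python
import math

def det(m):
    """Determinant via Laplace expansion along the first row (small n only)."""
    if not m:
        return 1.0  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def adjugate(m):
    """Adjugate: the transpose of the cofactor matrix."""
    n = len(m)
    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
        return (-1) ** (i + j) * det(minor)
    return [[cofactor(j, i) for j in range(n)] for i in range(n)]

def trace_of_product(x, y):
    """tr(XY) = sum over i, j of X[i][j] * Y[j][i]."""
    n = len(x)
    return sum(x[i][j] * y[j][i] for i in range(n) for j in range(n))

def A(t):
    # Hypothetical sample map from the reals to 3 x 3 matrices.
    return [[1 + t, t * t, 0.0], [math.sin(t), 2.0, t], [1.0, t, math.cos(t)]]

def dA(t):
    # Entrywise derivative of A(t), computed by hand.
    return [[1.0, 2 * t, 0.0], [math.cos(t), 0.0, 1.0], [0.0, 1.0, -math.sin(t)]]

t, h = 0.7, 1e-5
lhs = (det(A(t + h)) - det(A(t - h))) / (2 * h)   # central difference of det A(t)
rhs = trace_of_product(adjugate(A(t)), dA(t))     # tr(adj(A) dA/dt)
print(lhs, rhs)
```

The two printed numbers should agree up to the finite-difference error of the left-hand side.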

As a special case,

:<math>{\partial \det(A) \over \partial A_{ij}} = \operatorname{adj}(A)_{ji} = \operatorname{adj}(A)^T_{ij}.</math>
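The index transposition in this special case is easy to get wrong, so a small numerical check may help. This sketch perturbs one entry at a time and compares the difference quotient with the corresponding adjugate entry; the helpers <code>det</code> and <code>adjugate</code> and the sample matrix are illustrative assumptions, not part of any library.

```python
def det(m):
    """Determinant via Laplace expansion along the first row (small n only)."""
    if not m:
        return 1.0  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def adjugate(m):
    """Adjugate: the transpose of the cofactor matrix."""
    n = len(m)
    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
        return (-1) ** (i + j) * det(minor)
    return [[cofactor(j, i) for j in range(n)] for i in range(n)]

A = [[2.0, -1.0, 0.5], [1.0, 3.0, -2.0], [0.0, 4.0, 1.0]]  # arbitrary example
adjA = adjugate(A)
h = 1e-4
max_err = 0.0
for i in range(3):
    for j in range(3):
        bumped = [row[:] for row in A]
        bumped[i][j] += h                          # perturb only entry (i, j)
        numeric = (det(bumped) - det(A)) / h       # approximates d det / d A_ij
        max_err = max(max_err, abs(numeric - adjA[j][i]))  # note the swapped indices
print(max_err)
```

Because the determinant is an affine function of any single entry, the forward difference here is exact up to rounding error.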

Equivalently, if {{mvar|dA}} stands for the [[differential (infinitesimal)|differential]] of {{mvar|A}}, the general formula is

:<math> d \det (A) = \operatorname{tr} (\operatorname{adj}(A) \, dA) = \det (A) \operatorname{tr} \left (A^{-1} d A\right )</math>

The formula is named after the mathematician [[Carl Gustav Jacob Jacobi|Carl Jacobi]].

==Derivation==
===Via matrix computation===
'''Theorem.''' (Jacobi's formula) For any differentiable map ''A'' from the real numbers to ''n'' × ''n'' matrices,

: <math>d \det (A) = \operatorname{tr} (\operatorname{adj}(A) \, dA).</math>

''Proof.'' [[Laplace expansion|Laplace's formula]] for the determinant of a matrix ''A'' can be stated as

:<math>\det(A) = \sum_j A_{ij} \operatorname{adj}^{\rm T} (A)_{ij}.</math>

Notice that the summation is performed over some arbitrary row ''i'' of the matrix.
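That the choice of row does not matter can be illustrated with a short sketch. It assumes a hypothetical helper <code>cofactor(m, i, j)</code>, which plays the role of adj<sup>T</sup>(''A'')<sub>''ij''</sub>, and an arbitrary 3 × 3 example matrix.

```python
def det(m):
    """Determinant via Laplace expansion along the first row (small n only)."""
    if not m:
        return 1.0  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cofactor(m, i, j):
    """Cofactor of entry (i, j), i.e. the (i, j) entry of the transposed adjugate."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)

A = [[2.0, -1.0, 0.5], [1.0, 3.0, -2.0], [0.0, 4.0, 1.0]]  # arbitrary example
n = len(A)
# Expand along each row i in turn; every expansion should give det(A).
expansions = [sum(A[i][j] * cofactor(A, i, j) for j in range(n)) for i in range(n)]
print(expansions, det(A))
```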

The determinant of ''A'' can be considered to be a function of the elements of ''A'':

:<math>\det(A) = F\,(A_{11}, A_{12}, \ldots , A_{21}, A_{22}, \ldots , A_{nn})</math>

so that, by the [[chain rule]], its differential is

:<math>d \det(A) = \sum_i \sum_j {\partial F \over \partial A_{ij}} \,dA_{ij}.</math>

This summation is performed over all ''n''×''n'' elements of the matrix.

To find ∂''F''/∂''A''<sub>''ij''</sub>, consider that on the right-hand side of Laplace's formula the index ''i'' can be chosen at will. (Any choice eventually yields the same result, but a poor choice can make the calculation much harder.) In particular, it can be chosen to match the first index of ∂ / ∂''A''<sub>''ij''</sub>:

:<math>{\partial \det(A) \over \partial A_{ij}} = {\partial \sum_k A_{ik} \operatorname{adj}^{\rm T}(A)_{ik} \over \partial A_{ij}} = \sum_k {\partial (A_{ik} \operatorname{adj}^{\rm T}(A)_{ik}) \over \partial A_{ij}}</math>

Thus, by the [[product rule]],

:<math>{\partial \det(A) \over \partial A_{ij}} = \sum_k {\partial A_{ik} \over \partial A_{ij}} \operatorname{adj}^{\rm T}(A)_{ik} + \sum_k A_{ik} {\partial \operatorname{adj}^{\rm T}(A)_{ik} \over \partial A_{ij}}.</math>

Now, if an element of a matrix ''A''<sub>''ij''</sub> and a [[minor (linear algebra)|cofactor]] adj<sup>T</sup>(''A'')<sub>''ik''</sub> of element ''A''<sub>''ik''</sub> lie on the same row (or column), then the cofactor will not be a function of ''A''<sub>''ij''</sub>, because the cofactor of ''A''<sub>''ik''</sub> is expressed in terms of elements not in its own row (nor column). Thus,

:<math>{\partial \operatorname{adj}^{\rm T}(A)_{ik} \over \partial A_{ij}} = 0,</math>

:<math>{\partial \det(A) \over \partial A_{ij}} = \sum_k \operatorname{adj}^{\rm T}(A)_{ik} {\partial A_{ik} \over \partial A_{ij}}.</math>

All the elements of ''A'' are independent of each other, i.e.

:<math>{\partial A_{ik} \over \partial A_{ij}} = \delta_{jk},</math>

where ''δ'' is the [[Kronecker delta]], so

:<math>{\partial \det(A) \over \partial A_{ij}} = \sum_k \operatorname{adj}^{\rm T}(A)_{ik} \delta_{jk} = \operatorname{adj}^{\rm T}(A)_{ij}.</math>

Therefore,

:<math>d(\det(A)) = \sum_i \sum_j \operatorname{adj}^{\rm T}(A)_{ij} \,d A_{ij} = \sum_j \sum_i \operatorname{adj}(A)_{ji} \,d A_{ij} = \sum_j (\operatorname{adj}(A) \,d A)_{jj} = \operatorname{tr}(\operatorname{adj}(A) \,dA).\ \square</math>

===Via chain rule===
'''Lemma 1.''' <math>\det'(I)=\mathrm{tr}</math>, where <math>\det'</math> is the differential of <math>\det</math>.

This equation means that the differential of <math>\det</math>, evaluated at the [[identity matrix]], is equal to the trace. The differential <math>\det'(I)</math> is a linear operator that maps an ''n'' × ''n'' matrix to a [[real number]].
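As a numerical illustration of the lemma, the difference quotient of the determinant at the identity in a direction ''T'' should approach tr ''T'' as the step shrinks. The sketch below assumes an arbitrary 3 × 3 direction matrix and a hypothetical helper <code>directional_quotient</code>; it illustrates the statement and is not a proof.

```python
def det(m):
    """Determinant via Laplace expansion along the first row (small n only)."""
    if not m:
        return 1.0  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

T = [[2.0, 1.0, 0.0], [0.0, -1.0, 3.0], [1.0, 4.0, 1.0]]  # arbitrary direction
n = len(T)
trace_T = sum(T[i][i] for i in range(n))

def directional_quotient(eps):
    """(det(I + eps*T) - det(I)) / eps for a given step eps."""
    shifted = [[(1.0 if i == j else 0.0) + eps * T[i][j] for j in range(n)]
               for i in range(n)]
    return (det(shifted) - 1.0) / eps

for eps in (1e-2, 1e-4, 1e-6):
    print(eps, directional_quotient(eps), trace_T)
```

The gap between the quotient and the trace shrinks roughly in proportion to the step, as expected for a first-order approximation.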

''Proof.'' Using the definition of a [[directional derivative]] together with one of its basic properties for differentiable functions, we have

:<math>\det'(I)(T)=\nabla_T \det(I)=\lim_{\varepsilon\to0}\frac{\det(I+\varepsilon T)-\det I}{\varepsilon}</math>

<math>\det(I+\varepsilon T)</math> is a polynomial in <math>\varepsilon</math> of degree at most ''n''. It is closely related to the [[characteristic polynomial]] of <math>T</math>. The [[constant term]] of that polynomial (its value at <math>\varepsilon = 0</math>) is 1, while the coefficient of the linear term in <math>\varepsilon</math> is <math>\mathrm{tr}\ T</math>. Therefore the limit equals <math>\mathrm{tr}\ T</math>, which is the claim.

'''Lemma 2.''' For an invertible matrix ''A'', we have: <math>\det'(A)(T)=\det A \; \mathrm{tr}(A^{-1}T)</math>.

''Proof.'' Consider the following function of ''X'':

:<math>\det X = \det (A A^{-1} X) = \det (A) \ \det(A^{-1} X)</math>

We calculate the differential of <math>\det X</math> and evaluate it at <math>X = A</math> using Lemma 1, the equation above, and the chain rule:

:<math>\det'(A)(T) = \det A \ \det'(I) (A^{-1} T) = \det A \ \mathrm{tr}(A^{-1} T)</math>

'''Theorem.''' (Jacobi's formula) <math>\frac{d}{dt} \det A = \mathrm{tr}\left(\mathrm{adj}\ A\frac{dA}{dt}\right)</math>

''Proof.'' If <math>A</math> is invertible, by Lemma 2, with <math>T = dA/dt</math>

:<math>\frac{d}{dt} \det A = \det A \; \mathrm{tr} \left(A^{-1} \frac{dA}{dt}\right) = \mathrm{tr} \left( \mathrm{adj}\ A \; \frac{dA}{dt} \right)</math>

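using the equation relating the [[adjugate]] of <math>A</math> to <math>A^{-1}</math>, namely <math>\mathrm{adj}\ A = \det A \; A^{-1}</math>. Both sides of the formula are polynomials, and hence continuous functions, in the entries of <math>A</math> and <math>dA/dt</math>. Since the set of invertible matrices is dense in the space of matrices, the formula therefore holds for all matrices.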

===Via diagonalization===

Both sides of the Jacobi formula are polynomials in the matrix coefficients of {{mvar|A}} and {{mvar|A'}}. It is therefore sufficient to verify the polynomial identity on the dense subset where the eigenvalues of {{mvar|A}} are distinct and nonzero.

If {{mvar|A}} factors differentiably as <math>A=BC</math>, then

:<math> \mathrm{tr}(A^{-1}A')= \mathrm{tr}((BC)^{-1}(BC)')= \mathrm{tr}(B^{-1}B')+ \mathrm{tr}(C^{-1}C'). </math>

In particular, if {{mvar|L}} is invertible, then <math>I=L^{-1}L</math> and

:<math> 0=\mathrm{tr}(I^{-1}I')= \mathrm{tr}(L(L^{-1})')+ \mathrm{tr}(L^{-1}L'). </math>

Since {{mvar|A}} has distinct eigenvalues, there exists a differentiable complex invertible matrix {{mvar|L}} such that <math>A = L^{-1}DL</math> and {{mvar|D}} is diagonal. Then

:<math> \mathrm{tr}(A^{-1}A')= \mathrm{tr}(L(L^{-1})')+ \mathrm{tr}(D^{-1}D')+ \mathrm{tr}(L^{-1}L')= \mathrm{tr}(D^{-1}D'). </math>

Let <math>\lambda_i</math>, <math>i=1,\ldots,n</math> be the eigenvalues of {{mvar|A}}. Then

:<math> \left(\ln\det A\right)' = \left(\sum_{i=1}^{n}\ln \lambda_i \right)' = \sum_{i=1}^n \lambda_i'/\lambda_i = \mathrm{tr}(D^{-1}D')= \mathrm{tr}(A^{-1}A'), </math>

which is the Jacobi formula for matrices {{mvar|A}} with distinct nonzero eigenvalues.

==Corollary==
The following is a useful relation connecting the [[Trace (linear algebra)|trace]] to the determinant of the associated [[matrix exponential]]:

{{Equation box 1
|indent =:
|equation = <math> \det e^{B} = e^{\operatorname{tr} \left(B\right)}</math>
|cellpadding= 6
|border
|border colour = #0073CF
|bgcolor=#F9FFF7}}

This statement is clear for diagonal matrices, and a proof of the general claim follows.

For any differentiable [[invertible matrix]] <math>A(t)</math>, the section [[#Via chain rule|"Via chain rule"]] above showed that

:<math>\frac{d}{dt} \det A(t) = \det A(t) \; \operatorname{tr} \left(A(t)^{-1} \, \frac{d}{dt} A(t)\right)</math>

Considering <math>A(t) = \exp(tB)</math> in this equation yields:

: <math>\frac{d}{dt} \det e^{tB} =\operatorname{tr}(B) \det e^{tB}</math>

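The desired result follows by solving this [[ordinary differential equation]] with the initial condition <math>\det e^{0 \cdot B} = \det I = 1</math>, which gives <math>\det e^{tB} = e^{t \operatorname{tr}(B)}</math>, and then setting <math>t = 1</math>.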
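The identity can also be checked numerically. The sketch below approximates the matrix exponential by a truncated Taylor series, which is a deliberately simple stand-in chosen for illustration and not how numerical libraries compute it; the helper names <code>matmul</code> and <code>expm_taylor</code>, the number of terms and the sample matrix are all assumptions.

```python
import math

def det(m):
    """Determinant via Laplace expansion along the first row (small n only)."""
    if not m:
        return 1.0  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_taylor(b, terms=40):
    """Truncated Taylor series: sum of B^k / k! for k = 0 .. terms - 1."""
    n = len(b)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term B^k / k!, starts at I
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, b)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

B = [[0.5, 1.0, 0.0], [-0.3, 0.2, 0.7], [0.1, -0.4, -0.6]]  # arbitrary example
lhs = det(expm_taylor(B))                          # det(e^B)
rhs = math.exp(sum(B[i][i] for i in range(3)))     # e^(tr B)
print(lhs, rhs)
```

The truncated series is adequate here only because the entries of the sample matrix are small; for matrices of large norm a dedicated routine would be the safer choice.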

==Applications==
Several forms of the formula underlie the [[Faddeev–LeVerrier algorithm]] for computing the [[characteristic polynomial]], and explicit applications of the [[Cayley–Hamilton theorem]]. For example, starting from the following equation, which was proved above:

:<math>\frac{d}{dt} \det A(t) = \det A(t) \ \operatorname{tr} \left(A(t)^{-1} \, \frac{d}{dt} A(t)\right)</math>

and using <math>A(t) = t I - B</math>, we get:

:<math>\frac{d}{dt} \det (tI-B) = \det (tI-B) \operatorname{tr}[(tI-B)^{-1}] = \operatorname{tr}[\operatorname{adj} (tI-B)]</math>

where adj denotes the [[adjugate matrix]].
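The last identity says that the derivative of the characteristic polynomial of ''B'' equals the trace of the adjugate of ''tI'' − ''B''. A minimal numerical sketch, assuming an arbitrary 3 × 3 matrix and the illustrative helpers <code>det</code>, <code>adjugate</code>, <code>shifted</code> and <code>char_poly</code>:

```python
def det(m):
    """Determinant via Laplace expansion along the first row (small n only)."""
    if not m:
        return 1.0  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def adjugate(m):
    """Adjugate: the transpose of the cofactor matrix."""
    n = len(m)
    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
        return (-1) ** (i + j) * det(minor)
    return [[cofactor(j, i) for j in range(n)] for i in range(n)]

B = [[2.0, -1.0, 0.5], [1.0, 3.0, -2.0], [0.0, 4.0, 1.0]]  # arbitrary example
n = len(B)

def shifted(t):
    """Return the matrix tI - B."""
    return [[(t if i == j else 0.0) - B[i][j] for j in range(n)] for i in range(n)]

def char_poly(t):
    """Characteristic polynomial det(tI - B) evaluated at t."""
    return det(shifted(t))

t, h = 1.5, 1e-4
numeric = (char_poly(t + h) - char_poly(t - h)) / (2 * h)  # central difference
adj = adjugate(shifted(t))
analytic = sum(adj[i][i] for i in range(n))                # tr adj(tI - B)
print(numeric, analytic)
```

Unlike the middle expression in the formula, the adjugate form needs no inverse, so the check also works when ''t'' is an eigenvalue of ''B''.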

==Remarks==
{{Reflist}}

==References==
* {{cite book |first1=Jan R. |last1=Magnus |first2=Heinz |last2=Neudecker |title=Matrix Differential Calculus with Applications in Statistics and Econometrics |publisher=Wiley |year=1999 |edition=Revised |isbn=0-471-98633-X |url=https://books.google.com/books?id=0CXXdKKiIpQC }}
* {{cite book |last=Bellman |first=Richard |year=1997 |title=Introduction to Matrix Analysis |publisher=SIAM |isbn=0-89871-399-4 |url=https://books.google.com/books?id=QVCflvTPYE8C }}

{{DEFAULTSORT:Jacobi's Formula}}
[[Category:Determinants]]
[[Category:Matrix theory]]
[[Category:Articles containing proofs]]