Formula for the derivative of a matrix determinant
In matrix calculus, Jacobi's formula expresses the derivative of the determinant of a matrix A in terms of the adjugate of A and the derivative of A.[1]
If A is a differentiable map from the real numbers to n × n matrices, then
$${\frac{d}{dt}}\det A(t)=\operatorname{tr}\left(\operatorname{adj}(A(t))\,\frac{dA(t)}{dt}\right)=\left(\det A(t)\right)\cdot\operatorname{tr}\left(A(t)^{-1}\,\frac{dA(t)}{dt}\right)$$
where tr(X) is the trace of the matrix X and adj(X) is its adjugate matrix. (The latter equality only holds if A(t) is invertible.)
As a special case,
$$\frac{\partial\det(A)}{\partial A_{ij}}=\operatorname{adj}(A)_{ji}.$$
Equivalently, if dA stands for the differential of A, the general formula is
$$d\det(A)=\operatorname{tr}\left(\operatorname{adj}(A)\,dA\right)=\det(A)\,\operatorname{tr}\left(A^{-1}\,dA\right)$$
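As a numerical sanity check, the derivative of det A(t) computed by finite differences should match the trace formula. The following sketch uses NumPy with an arbitrary smooth matrix path A(t) chosen purely for illustration:

```python
import numpy as np

def A(t):
    # Arbitrary differentiable path of 3x3 matrices, chosen for illustration.
    return np.array([[np.cos(t), t**2,      1.0],
                     [t,         np.exp(t), 0.0],
                     [2.0,       np.sin(t), 1.0 + t]])

def adjugate(M):
    # adj(M) = det(M) * M^{-1} when M is invertible.
    return np.linalg.det(M) * np.linalg.inv(M)

t, h = 0.7, 1e-6
dA = (A(t + h) - A(t - h)) / (2 * h)  # central-difference approximation of A'(t)
lhs = (np.linalg.det(A(t + h)) - np.linalg.det(A(t - h))) / (2 * h)
rhs = np.trace(adjugate(A(t)) @ dA)
print(lhs, rhs)  # the two values agree up to the finite-difference error
```

Because A(t) is invertible at the chosen t, the equivalent form det A · tr(A⁻¹ A′) would give the same value.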
The formula is named after the mathematician Carl Gustav Jacob Jacobi.
Derivation
Via matrix computation
Theorem. (Jacobi's formula) For any differentiable map A from the real numbers to n × n matrices,
$$d\det(A)=\operatorname{tr}\left(\operatorname{adj}(A)\,dA\right).$$
Proof. Laplace's formula for the determinant of a matrix A can be stated as
$$\det(A)=\sum_{j}A_{ij}\operatorname{adj}^{\rm T}(A)_{ij}.$$
Notice that the summation is performed over some arbitrary row i of the matrix.
The determinant of A can be considered to be a function of the elements of A:
$$\det(A)=F\,(A_{11},A_{12},\ldots,A_{21},A_{22},\ldots,A_{nn})$$
so that, by the chain rule, its differential is
$$d\det(A)=\sum_{i}\sum_{j}\frac{\partial F}{\partial A_{ij}}\,dA_{ij}.$$
This summation is performed over all n×n elements of the matrix.
To find ∂F/∂Aij, note that on the right-hand side of Laplace's formula the index i can be chosen at will. (Any choice eventually yields the same result, but some make the calculation much easier.) In particular, it can be chosen to match the first index of ∂/∂Aij:
$$\frac{\partial\det(A)}{\partial A_{ij}}=\frac{\partial\sum_{k}A_{ik}\operatorname{adj}^{\rm T}(A)_{ik}}{\partial A_{ij}}=\sum_{k}\frac{\partial\left(A_{ik}\operatorname{adj}^{\rm T}(A)_{ik}\right)}{\partial A_{ij}}$$
Thus, by the product rule,
$$\frac{\partial\det(A)}{\partial A_{ij}}=\sum_{k}\frac{\partial A_{ik}}{\partial A_{ij}}\operatorname{adj}^{\rm T}(A)_{ik}+\sum_{k}A_{ik}\frac{\partial\operatorname{adj}^{\rm T}(A)_{ik}}{\partial A_{ij}}.$$
Now, if an element Aij of the matrix and a cofactor adjT(A)ik of element Aik lie in the same row (or column), then the cofactor is not a function of Aij, because the cofactor of Aik is expressed in terms of elements not in its own row (nor column). Thus,
$$\frac{\partial\operatorname{adj}^{\rm T}(A)_{ik}}{\partial A_{ij}}=0,$$
so
$$\frac{\partial\det(A)}{\partial A_{ij}}=\sum_{k}\operatorname{adj}^{\rm T}(A)_{ik}\frac{\partial A_{ik}}{\partial A_{ij}}.$$
All the elements of A are independent of each other, i.e.
$$\frac{\partial A_{ik}}{\partial A_{ij}}=\delta_{jk},$$
where δ is the Kronecker delta, so
$$\frac{\partial\det(A)}{\partial A_{ij}}=\sum_{k}\operatorname{adj}^{\rm T}(A)_{ik}\delta_{jk}=\operatorname{adj}^{\rm T}(A)_{ij}.$$
Therefore,
$$d(\det(A))=\sum_{i}\sum_{j}\operatorname{adj}^{\rm T}(A)_{ij}\,dA_{ij}=\sum_{j}\sum_{i}\operatorname{adj}(A)_{ji}\,dA_{ij}=\sum_{j}\left(\operatorname{adj}(A)\,dA\right)_{jj}=\operatorname{tr}\left(\operatorname{adj}(A)\,dA\right).\ \square$$
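The entrywise identity ∂det(A)/∂Aij = adjT(A)ij derived above can also be checked numerically. This sketch (NumPy, with a random matrix chosen for illustration) compares a finite-difference gradient of the determinant against the transposed adjugate:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))                 # random matrix, almost surely invertible
adjT = (np.linalg.det(A) * np.linalg.inv(A)).T  # adj(A)^T via adj(A) = det(A) A^{-1}

h = 1e-6
grad = np.zeros_like(A)
for i in range(4):
    for j in range(4):
        E = np.zeros_like(A)
        E[i, j] = h
        # central difference for d(det)/dA_ij
        grad[i, j] = (np.linalg.det(A + E) - np.linalg.det(A - E)) / (2 * h)

print(np.max(np.abs(grad - adjT)))  # close to zero
```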
Via chain rule
Lemma 1. $\det{}'(I)=\operatorname{tr}$, where $\det{}'$ is the differential of $\det$.

This equation means that the differential of $\det$, evaluated at the identity matrix, is equal to the trace. The differential $\det{}'(I)$ is a linear operator that maps an n × n matrix to a real number.
Proof. Using the definition of a directional derivative together with one of its basic properties for differentiable functions, we have
$$\det{}'(I)(T)=\nabla_{T}\det(I)=\lim_{\varepsilon\to 0}\frac{\det(I+\varepsilon T)-\det I}{\varepsilon}$$
$\det(I+\varepsilon T)$ is a polynomial in $\varepsilon$ of order n. It is closely related to the characteristic polynomial of $T$. The constant term in that polynomial (the term with $\varepsilon=0$) is 1, while the linear term in $\varepsilon$ is $\operatorname{tr}(T)$, so the limit above equals $\operatorname{tr}(T)$. $\square$
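Lemma 1 can be illustrated numerically: for small ε, det(I + εT) − 1 is approximately ε·tr(T). A minimal sketch in NumPy, with a random direction T chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((3, 3))   # arbitrary direction in matrix space
I = np.eye(3)

eps = 1e-6
# difference quotient from the directional-derivative definition; det(I) = 1
directional = (np.linalg.det(I + eps * T) - np.linalg.det(I)) / eps
print(directional, np.trace(T))   # agree up to O(eps)
```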
Lemma 2. For an invertible matrix A, we have:

$$\det{}'(A)(T)=\det A\;\operatorname{tr}\left(A^{-1}T\right).$$
Proof. Consider the following function of X:
$$\det X=\det(AA^{-1}X)=\det(A)\,\det(A^{-1}X)$$
We calculate the differential of $\det X$ and evaluate it at $X=A$ using Lemma 1, the equation above, and the chain rule:
$$\det{}'(A)(T)=\det A\;\det{}'(I)(A^{-1}T)=\det A\;\operatorname{tr}(A^{-1}T)$$
Theorem. (Jacobi's formula)

$$\frac{d}{dt}\det A(t)=\operatorname{tr}\left(\operatorname{adj}(A(t))\,\frac{dA(t)}{dt}\right).$$
Proof. If $A$ is invertible, then by Lemma 2, with $T=\frac{dA}{dt}$,
$$\frac{d}{dt}\det A=\det A\;\operatorname{tr}\left(A^{-1}\frac{dA}{dt}\right)=\operatorname{tr}\left(\operatorname{adj}A\;\frac{dA}{dt}\right)$$
using the equation relating the adjugate of $A$ to $A^{-1}$. Now the formula holds for all matrices, since both sides are continuous in A and the set of invertible matrices is dense in the space of matrices.
Via diagonalization
Both sides of the Jacobi formula are polynomials in the matrix coefficients of A and A′. It is therefore sufficient to verify the polynomial identity on the dense subset where the eigenvalues of A are distinct and nonzero.
If A factors differentiably as $A=BC$, then
$$\operatorname{tr}(A^{-1}A')=\operatorname{tr}\left((BC)^{-1}(BC)'\right)=\operatorname{tr}(B^{-1}B')+\operatorname{tr}(C^{-1}C').$$
In particular, if L is invertible, then $I=LL^{-1}$ and
$$0=\operatorname{tr}(I^{-1}I')=\operatorname{tr}\left(L(L^{-1})'\right)+\operatorname{tr}(L^{-1}L').$$
Since A has distinct eigenvalues, there exists a differentiable complex invertible matrix L such that $A=LDL^{-1}$ and D is diagonal.
Then
$$\operatorname{tr}(A^{-1}A')=\operatorname{tr}\left(L(L^{-1})'\right)+\operatorname{tr}(D^{-1}D')+\operatorname{tr}(L^{-1}L')=\operatorname{tr}(D^{-1}D').$$
Let $\lambda_i$, $i=1,\ldots,n$, be the eigenvalues of A.
Then
$$\frac{\det(A)'}{\det(A)}=\sum_{i=1}^{n}\frac{\lambda_i'}{\lambda_i}=\operatorname{tr}(D^{-1}D')=\operatorname{tr}(A^{-1}A'),$$
which is the Jacobi formula for matrices A with distinct nonzero eigenvalues.
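The eigenvalue form of the identity can be checked concretely. The sketch below differentiates the sorted real eigenvalues of a symmetric matrix path by finite differences and compares Σ λi′/λi against tr(A⁻¹A′); the path itself is an arbitrary choice with distinct nonzero eigenvalues:

```python
import numpy as np

def A(t):
    # Arbitrary symmetric path: real eigenvalues, returned sorted by eigvalsh,
    # so the eigenvalue pairing stays consistent for nearby t.
    return np.array([[2.0 + t, 0.5 * t, 0.1],
                     [0.5 * t, 3.0,     t],
                     [0.1,     t,       5.0 - t]])

t, h = 0.3, 1e-6
lam  = np.linalg.eigvalsh(A(t))
dlam = (np.linalg.eigvalsh(A(t + h)) - np.linalg.eigvalsh(A(t - h))) / (2 * h)
eig_side = np.sum(dlam / lam)                    # sum of lambda_i' / lambda_i

dA = (A(t + h) - A(t - h)) / (2 * h)             # central-difference A'(t)
trace_side = np.trace(np.linalg.inv(A(t)) @ dA)  # tr(A^{-1} A')
print(eig_side, trace_side)
```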
Corollary
The following is a useful relation connecting the trace to the determinant of the associated matrix exponential:

$$\det e^{B}=e^{\operatorname{tr}(B)}$$

This statement is clear for diagonal matrices, and a proof of the general claim follows.
For any invertible matrix $A(t)$, we showed in the previous section ("Via chain rule") that
$$\frac{d}{dt}\det A(t)=\det A(t)\;\operatorname{tr}\left(A(t)^{-1}\,\frac{d}{dt}A(t)\right)$$
Considering $A(t)=e^{tB}$ in this equation yields:
$$\frac{d}{dt}\det e^{tB}=\operatorname{tr}(B)\,\det e^{tB}$$
The desired result follows as the solution to this ordinary differential equation, with the initial condition $\det e^{0\cdot B}=\det I=1$.
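The corollary is easy to verify for a symmetric matrix, where the matrix exponential can be formed directly from an eigendecomposition. A sketch with an arbitrary symmetric B:

```python
import numpy as np

# Arbitrary symmetric matrix, so exp(B) can be built from an eigendecomposition.
B = np.array([[1.0,  0.4, 0.0],
              [0.4, -2.0, 0.3],
              [0.0,  0.3, 0.5]])
w, V = np.linalg.eigh(B)              # B = V diag(w) V^T
expB = V @ np.diag(np.exp(w)) @ V.T   # matrix exponential of B

print(np.linalg.det(expB), np.exp(np.trace(B)))  # both equal e^{tr(B)}
```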
Applications
Several forms of the formula underlie the Faddeev–LeVerrier algorithm for computing the characteristic polynomial, and explicit applications of the Cayley–Hamilton theorem. For example, starting from the following equation, which was proved above:
$$\frac{d}{dt}\det A(t)=\det A(t)\;\operatorname{tr}\left(A(t)^{-1}\,\frac{d}{dt}A(t)\right)$$
and using $A(t)=tI-B$, we get:
$$\frac{d}{dt}\det(tI-B)=\det(tI-B)\,\operatorname{tr}\left[(tI-B)^{-1}\right]=\operatorname{tr}\left[\operatorname{adj}(tI-B)\right]$$
where adj denotes the adjugate matrix.
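This identity for the derivative of p(t) = det(tI − B) can likewise be checked numerically. The sketch below uses a random B and a value of t chosen outside its spectrum so that tI − B is invertible:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))

def p(t):
    # p(t) = det(tI - B), a monic polynomial of degree 4 in t
    return np.linalg.det(t * np.eye(4) - B)

t, h = 5.0, 1e-5          # t chosen large enough that tI - B is invertible
M = t * np.eye(4) - B
adjM = np.linalg.det(M) * np.linalg.inv(M)   # adj(tI - B)

dp = (p(t + h) - p(t - h)) / (2 * h)         # finite-difference p'(t)
print(dp, np.trace(adjM))
```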
References