Efficient Automatic Differentiation of Matrix Functions
incollection
Author(s) Peder A. Olsen, Steven J. Rennie, Vaibhava Goel
Published in Recent Advances in Algorithmic Differentiation
Editor(s) Shaun Forth, Paul Hovland, Eric Phipps, Jean Utke, Andrea Walther
Year 2012
Publisher Springer
Abstract Forward and reverse mode automatic differentiation methods for functions that take a vector argument make derivative computation efficient. However, the determinant and inverse of a matrix are not readily expressed in the language of vectors. The derivative of a function f(X) for a d×d matrix X is itself a d×d matrix. The second derivative, or Hessian, is a d^2 × d^2 matrix, and so computing and storing the Hessian can be very costly. In this paper, we present a new calculus for matrix differentiation, and introduce a new matrix operation, the box product, to accomplish this. The box product can be used to elegantly and efficiently compute both the first and second order matrix derivatives of any function that can be expressed solely in terms of arithmetic, transposition, trace and log determinant operations. The Hessian of such a function can be implicitly represented as a sum of Kronecker, outer, and box products, which allows us to compute the Newton step efficiently. Whereas the direct computation requires O(d^4) storage and O(d^6) operations, the indirect representation of the Hessian allows the storage to be reduced to O(kd^2), where k is the number of times the variable X occurs in the expression for the derivative. Likewise, the cost of computing the Newton direction is reduced to O(kd^5) in general, and O(d^3) for k = 1 and k = 2.
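To make the complexity claim concrete: when the implicit Hessian consists of a single Kronecker term (the k = 1 case), the Newton system H vec(V) = vec(W) with H = A ⊗ B is equivalent to the matrix equation B V A^T = W (column-major vec convention), which two d×d linear solves handle in O(d^3) time and O(d^2) storage. The NumPy sketch below is illustrative only, not code from the paper; A, B, and W are arbitrary stand-in matrices.

import numpy as np

d = 5
rng = np.random.default_rng(0)
A = rng.standard_normal((d, d)) + d * np.eye(d)  # stand-ins, kept well conditioned
B = rng.standard_normal((d, d)) + d * np.eye(d)
W = rng.standard_normal((d, d))

# Direct route: form H explicitly -- O(d^4) storage, O(d^6) dense solve.
H = np.kron(A, B)
v_direct = np.linalg.solve(H, W.flatten(order="F"))

# Structured route: V = B^{-1} W A^{-T} via two d-by-d solves, O(d^3).
V = np.linalg.solve(A, np.linalg.solve(B, W).T).T

assert np.allclose(v_direct, V.flatten(order="F"))

The paper's box-product terms admit analogous structured solves; the explicit Kronecker matrix above is built only to verify the identity (A ⊗ B) vec(V) = vec(B V A^T).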
Cross-References Forth2012RAi
AD Theory and Techniques Hierarchical Approach
BibTeX
@INCOLLECTION{
Olsen2012EAD,
title = "Efficient Automatic Differentiation of Matrix Functions",
doi = "10.1007/978-3-642-30023-3_7",
author = "Peder A. Olsen and Steven J. Rennie and Vaibhava Goel",
abstract = "Forward and reverse mode automatic differentiation methods for functions that take
a vector argument make derivative computation efficient. However, the determinant and inverse of a
matrix are not readily expressed in the language of vectors. The derivative of a function $f(X)$ for
a $d \times d$ matrix $X$ is itself a $d \times d$ matrix. The second derivative, or
Hessian, is a $d^2 \times d^2$ matrix, and so computing and storing the Hessian can be very costly.
In this paper, we present a new calculus for matrix differentiation, and introduce a new matrix
operation, the box product, to accomplish this. The box product can be used to elegantly and
efficiently compute both the first and second order matrix derivatives of any function that can be
expressed solely in terms of arithmetic, transposition, trace and log determinant operations. The
Hessian of such a function can be implicitly represented as a sum of Kronecker, outer, and box
products, which allows us to compute the Newton step efficiently. Whereas the direct computation
requires $O(d^4)$ storage and $O(d^6)$ operations, the indirect representation of the Hessian allows
the storage to be reduced to $O(kd^2)$, where $k$ is the number of times the variable $X$ occurs in
the expression for the derivative. Likewise, the cost of computing the Newton direction is reduced
to $O(kd^5)$ in general, and $O(d^3)$ for $k = 1$ and $k = 2$.",
pages = "71--81",
crossref = "Forth2012RAi",
booktitle = "Recent Advances in Algorithmic Differentiation",
series = "Lecture Notes in Computational Science and Engineering",
publisher = "Springer",
address = "Berlin",
volume = "87",
editor = "Shaun Forth and Paul Hovland and Eric Phipps and Jean Utke and Andrea Walther",
isbn = "978-3-642-30023-3",
issn = "1439-7358",
year = "2012",
ad_theotech = "Hierarchical Approach"
}