In mathematics, the Hessian matrix (or simply the Hessian) is a square matrix of the second-order partial derivatives of a scalar-valued function, or scalar field. It describes the local curvature of a function of several variables. Hessian matrices are often used in optimization, for example in Newton's method (also called the Newton-Raphson method).
$$
\mathbf{H} f=\left[ \begin{array}{cccc}
\frac{\partial^{2} f}{\partial x^{2}} & \frac{\partial^{2} f}{\partial x \partial y} & \frac{\partial^{2} f}{\partial x \partial z} & \cdots \\
\frac{\partial^{2} f}{\partial y \partial x} & \frac{\partial^{2} f}{\partial y^{2}} & \frac{\partial^{2} f}{\partial y \partial z} & \cdots \\
\frac{\partial^{2} f}{\partial z \partial x} & \frac{\partial^{2} f}{\partial z \partial y} & \frac{\partial^{2} f}{\partial z^{2}} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{array}\right]
$$

Example 1: Computing a Hessian

Problem: Compute the Hessian of $f(x, y)=x^{3}-2 x y-y^{6}$.

Solution: First compute both partial derivatives:

$$
\begin{aligned}
f_{x}(x, y)&=\frac{\partial}{\partial x}\left(x^{3}-2 x y-y^{6}\right)=3 x^{2}-2 y \\
f_{y}(x, y)&=\frac{\partial}{\partial y}\left(x^{3}-2 x y-y^{6}\right)=-2 x-6 y^{5}
\end{aligned}
$$

With these, we compute all four second partial derivatives:

$$
\begin{aligned}
f_{x x}(x, y)&=\frac{\partial}{\partial x}\left(3 x^{2}-2 y\right)=6 x \\
f_{x y}(x, y)&=\frac{\partial}{\partial y}\left(3 x^{2}-2 y\right)=-2 \\
f_{y x}(x, y)&=\frac{\partial}{\partial x}\left(-2 x-6 y^{5}\right)=-2 \\
f_{y y}(x, y)&=\frac{\partial}{\partial y}\left(-2 x-6 y^{5}\right)=-30 y^{4}
\end{aligned}
$$

The Hessian matrix in this case is a $2\times 2$ matrix with these functions as entries:

$$
\mathbf{H} f(x, y)=\left[ \begin{array}{cc}
f_{x x}(x, y) & f_{x y}(x, y) \\
f_{y x}(x, y) & f_{y y}(x, y)
\end{array}\right]=\left[ \begin{array}{cc}
6 x & -2 \\
-2 & -30 y^{4}
\end{array}\right]
$$

Example 2

Problem: Consider the function $f(x)=x^{\top} A x+b^{\top} x+c$, where $A$ is an $n \times n$ matrix, $b$ is a vector of length $n$, and $c$ is a constant.

1. Determine the gradient of $f$: $\nabla f(x)$.
2. Determine the Hessian of $f$: $H_{f}(x)$.

Solution:

Compute the gradient $\nabla f(x)$. Written as a column vector, the row vector $x^{\top} A$ becomes $A^{\top} x$:

$$
\begin{aligned}
\nabla f(x)&=\overbrace{\frac{\partial x^{\top}}{\partial x}\,(Ax)+\left(x^{\top}\,\frac{\partial (Ax)}{\partial x}\right)^{\top}}^{\text{product rule}}+\frac{\partial\, b^{\top}x}{\partial x}+\frac{\partial c}{\partial x}\\
&= Ax + \left(x^{\top} A\right)^{\top}+b \\
&= Ax + A^{\top}x + b \\
&= (A+A^{\top})x + b
\end{aligned}
$$

Compute the Hessian $H_{f}(x)$:

$$
H_{f}(x) = \frac{\partial \nabla f(x)}{\partial x} = A + A^{\top}
$$

In particular, if $A$ is symmetric, then $H_{f}(x) = 2A$.
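
Both worked examples can be sanity-checked numerically. The following is a minimal sketch, not a definitive implementation: it approximates the Hessian with central finite differences and compares it with the closed-form results derived above. The helper names (`numerical_hessian`, `max_abs_diff`), the step size `h`, and the sample values chosen for $A$, $b$, $c$ and the evaluation points are illustrative assumptions, not part of the original examples.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Matrix = List[List[float]]


def numerical_hessian(f: Callable[[Vector], float], x: Vector, h: float = 1e-3) -> Matrix:
    """Approximate the Hessian of f at x with central finite differences."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di: float, dj: float) -> float:
                # Evaluate f with coordinate i moved by di and coordinate j by dj.
                p = list(x)
                p[i] += di
                p[j] += dj
                return f(p)

            # Central difference for d^2 f / (dx_i dx_j).
            # For i == j this reduces to a second difference with step 2h.
            hess[i][j] = (
                shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)
            ) / (4.0 * h * h)
    return hess


def max_abs_diff(m1: Matrix, m2: Matrix) -> float:
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))


# Example 1: f(x, y) = x^3 - 2xy - y^6
def f1(p: Vector) -> float:
    x, y = p
    return x**3 - 2 * x * y - y**6


def hessian_f1(p: Vector) -> Matrix:
    """Closed-form Hessian from Example 1."""
    x, y = p
    return [[6 * x, -2.0], [-2.0, -30 * y**4]]


# Example 2: f(x) = x^T A x + b^T x + c, with assumed sample values.
A = [[1.0, 2.0], [3.0, 4.0]]  # deliberately non-symmetric
b = [1.0, -1.0]
c = 5.0


def f2(x: Vector) -> float:
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return quad + sum(b[i] * x[i] for i in range(n)) + c


def hessian_f2() -> Matrix:
    """Closed-form Hessian from Example 2: A + A^T (independent of x)."""
    n = len(A)
    return [[A[i][j] + A[j][i] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    p1 = [1.5, -0.5]
    p2 = [0.3, -0.7]
    print("Example 1 max error:", max_abs_diff(numerical_hessian(f1, p1), hessian_f1(p1)))
    print("Example 2 max error:", max_abs_diff(numerical_hessian(f2, p2), hessian_f2()))
```

Choosing a non-symmetric `A` is deliberate: it shows that the Hessian is $A + A^{\top}$ rather than $2A$, while the numerical result is still symmetric, matching the equality $f_{xy} = f_{yx}$ seen in Example 1.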