Automatic Differentiation of generic PDE-constrained functions

GridapTopOpt provides serial and distributed automatic differentiation methods for generic PDE-constrained functionals and arbitrary maps of those functionals. This works by:

Implementing adjoint methods for linear and non-linear finite element problems. We do this by creating new so-called StateMap operators AffineFEStateMap and NonlinearFEStateMap, these act like AffineFEOperator and FEOperator but also (efficently) implement adjoint methods.
The adjoint methods implemented in (1) work by overloading the rrule method from ChainRules.jl. As a result, more general backwards automatic differentation packages such as Zygote.jl can be utilised to compute derivatives of generic (and possibly quite complicated) maps.

Let's consider the following introductory example. Suppose we wish to differentiate

\[J(\kappa,u(\kappa)) = \left(\int_\Omega(u(\kappa)-u_\textrm{obs})^2~\mathrm{d}\boldsymbol{x}\right)^{1/2}\]

where $u_\textrm{obs}$ is some observed data and $u$ depends on $\kappa$ through Poisson's equation: find $u\in H^1_g(\Omega)$ such that $a(u,v) = l(v)$ for all $v\in H^1_0(\Omega)$ where

\[\begin{aligned} a(u,v) &= \int_\Omega\kappa\nabla u\cdot\nabla v~\mathrm{d}\boldsymbol{x},\\ l(v) &= \int_\Omega vf~\mathrm{d}\boldsymbol{x}, \end{aligned}\]

and $f(x,y) = y$ and $g(x,y) = x$. The derivative of $J(u(\kappa),\kappa)$ with respect to $\kappa$ can be found using the following snippet:

using Gridap, GridapTopOpt

f(x) = x[2]
g(x) = x[1]

model = CartesianDiscreteModel((0,1,0,1), (10,10))
Ω = Triangulation(model)
dΩ = Measure(Ω, 2)
K = TestFESpace(model, ReferenceFE(lagrangian, Float64, 0))
V = TestFESpace(model, ReferenceFE(lagrangian, Float64, 1); dirichlet_tags="boundary")
U = TrialFESpace(V,g)
a(u, v, κ) = ∫(κ * ∇(v) ⋅ ∇(u))dΩ
b(v, κ) = ∫(v*f)dΩ
κ_to_u = AffineFEStateMap(a,b,U,V,K)
l2_norm = StateParamMap((u, κ) -> ∫(u ⋅ u)dΩ,κ_to_u)
u_obs = interpolate(x -> sin(2π*x[1]), V) |> get_free_dof_values
function κ_to_J(κ)
  u = κ_to_u(κ)
  # ... (other maps if neccessary)
  sqrt(l2_norm(u-u_obs, κ))
end
κ0h = interpolate(1.0, K)
val, grad = val_and_gradient(J, get_free_dof_values(κ0h))

The above is quite standard other than AffineFEStateMap, StateParamMap, and val_and_gradient. These work as follows:

AffineFEStateMap represents an (affine) FE operator that maps from $\kappa$ to $u$, and implements adjoint methods.
StateParamMap is a wrapper for a function that produces a DomainContribution and handles partial differentiation in a Gridap-friendly. It also implements rrule methods so that Zygote can use the partial derivatives.
val_and_gradient is a custom function for computing derivatives using Zygote. This works in almost the same way as Zygote.withgradient, except val_and_gradient is compatible with PartitionedArrays for distributed computation. Note that in serial, Zygote.withgradient will work as usual.

When $J$ maps to multiple numbers (e.g., the case of having an objective and constraints), val_and_gradient can be replaced with val_and_jacobian.

Finally, we verify the above using finite differences as follows:

using FiniteDiff, Test
fdm_grad = FiniteDiff.finite_difference_gradient(κ_to_J, get_free_dof_values(κ0h))
@test maximum(abs,grad[1] - fdm_grad)/maximum(abs, grad[1]) < 1e-7

Additional examples, as well as examples using GridapDistributed, can be found in src/tests/.

Note

Refer to CustomPDEConstrainedFunctionals and CustomEmbeddedPDEConstrainedFunctionals when attempting to use this with the level-set topology optimisation methods in GridapTopOpt.

Second-order AD of generic PDE-constrained functions

Warning

The Hessian-vector product capability is new. Please report any issues that you encounter.

The capability to compute Hessian-vector products was added in GridapTopOpt v0.5. Under the hood, we use forward-over-reverse automatic differentiation, however most users will only ever need the Hvp function. Below, we demonstrate how to compute the Hessian-vector product.

using Gridap, GridapTopOpt

f(x) = x[2]
g(x) = x[1]

model = CartesianDiscreteModel((0,1,0,1), (2,2))
Ω = Triangulation(model)
dΩ = Measure(Ω, 2)
reffe = ReferenceFE(lagrangian, Float64, 1)
K = TestFESpace(model, reffe)
V = TestFESpace(model, reffe; dirichlet_tags="boundary")
U = TrialFESpace(V,g)
a(u, v, κ) = ∫(κ*(κ+1) * ∇(v) ⋅ ∇(u))dΩ
b(v, κ) = ∫(v*f)dΩ
κ_to_u = AffineFEStateMap(a,b,U,V,K;diff_order=2)
l2_norm = StateParamMap((u, κ) -> ∫(u ⋅ u + 0κ)dΩ,κ_to_u;diff_order=2) # (!!)
u_obs = interpolate(x -> sin(2π*x[1]), V) |> get_free_dof_values
function J(κ)
  u = κ_to_u(κ)
  sqrt(l2_norm(u-u_obs, κ))
end
κ0h = interpolate(1.0, K)
val, grad = val_and_gradient(J, get_free_dof_values(κ0h))
# Hessian-vector product
vh = interpolate(0.5, K)
Hv = Hvp(J, get_free_dof_values(κ0h),get_free_dof_values(vh))

Here we set the optional parameter diff_order = 2 in AffineFEStateMap and StateParamMap to specify that we will be computing the Hessian-vector product in our computations. Note that we reduce the mesh size to make finite differences tractable.

We again verify using finite differences:

using FiniteDifferences, Test
κ = get_free_dof_values(κ0h)
v = get_free_dof_values(vh)
function κ_to_j(κ)
    κh = FEFunction(K,κ)
    op = AffineFEOperator((u,v)->a(u,v,κh),v->b(v,κh),U,V)
    u = solve(op) |> get_free_dof_values
    sqrt(l2_norm(u-u_obs, κ))
end
g_fd = κ->FiniteDifferences.jacobian(central_fdm(5,1),κ_to_j,κ)
Hv_fd = FiniteDifferences.jacobian(central_fdm(5,1),g_fd,κ)[1]*v
maximum(abs,Hv-Hv_fd)/maximum(abs,Hv)

Here we use FiniteDifferences instead of FiniteDiff so that we can specify the finite difference scheme.

Note

The map denoted by (!!) in the above has ... + 0κ in the integrand. This is currently required for functionals that implicitly depond on the parameter. This will be investigated in future.

Another issue is Gridap's long complilation time when computing Hessian's with AD. This will be investigated in future.