Three (Real-World) Case Studies
The applicability of OpenAD as a tool for the automatic generation of
robust and efficient adjoint code is best illustrated by real-world
test cases. We consider the following three example codes:
- flow in a driven cavity
- boxmodel for thermohaline circulation
- shallow water model
The idea of this page is to show the results that can be obtained by
the reader after downloading the provided
OpenAD executables and their
application
to the three codes above.
The first test problem is from the
MINPACK-2
test problem collection.
The 2D (x-y-plane) flow in a driven cavity is
formulated as a boundary value problem, which is discretized by
standard finite difference approximations to obtain a system of
nonlinear equations.
We choose equal numbers of steps in both the
x- and y-directions. The number of independent variables is equal to
the number of mesh points that are not part of the boundary.
For the given original
code
we use OpenAD to generate
This code was provided by Patrick
Heimbach. It is used for the solution of a (generalized) eigenvalue
problem arising in physical oceanography,
where
substantial effort in understanding the
ocean circulation's role in the
variability of the climate system,
on time scales of decades to millennia and beyond,
is being directed at investigating
the so-called 'thermohaline' circulation (THC).
This refers to the contribution to
the ocean circulation which is driven by density gradients
and thus controlled by temperature and salinity properties
and its associated fluxes.
It plays a crucial role in connecting the surface
to the deep ocean through deep-water formation which occurs at some
isolated convection sites at high latitudes mainly in the subpolar
Atlantic ocean, such as the Labrador Sea and the
Greenland-Irminger-Norwegian (GIN) Seas.
This code has a number of global variables located in a module
which is also subject to the code transformation resulting in a version
of the module
with variables having an active type. Again, we show
that is generated by OpenAD for the given original
code.
Initially we had used an alternative approach that uses a tangent-linear code to to write a
tape that consists of the local Jacobians at the current point.
A serious disadvantage of this strategy is the requirement
to hash physical memory addresses (computed by a
C-routine) to entries in a work
array. This mapping
is implemented in form of a support
module. As usual, a driver is
necessary
to call the tangent-linear code in the correct context. Using the
tangent-linear model to generate a tape that can be used in an
interpretive
adjoint sweep has the advantage, that directional derivatives and
adjoints can be
computed during a single execution of the model code [Naumann2003CTL].
The third test case is
a shallow water model used in an earlier study
by Losch and Wunsch on bottom topography as a control variable in ocean
models. This code was originally written in Fortran 77 and is
differentiable
with the commercial adjoint model compiler for Fortran codes
TAF. There have
been a few Fortran90 extensions to the source and it contains some of
the basic language features also used in the
MITgcm.
First we show a version of the original code
and its
tangent-linear as generated by OpenAD.
While it is possible to actually execute the tangent linear mode only
for very small test problems, the numerical
results illustrate the
superiority of the tangent-linear model over the evaluation of finite
difference quotients.
For larger, practically interesting problems we obviously have to use
an
adjoint model for the gradient calculation. In terms of storage
requirements both the split mode and
the joint mode with subroutine level checkpointing have rather large
storage requirements. Therefore we use an approach similar to the one
used with TAF with a two level nested checkpointing at the main loop in
the subroutine forward_model.
This is done by wrapping the outer and
the inner loop body in subroutines respectively such that the
checkpoints become normal subroutine level checkpoints. All callees of
the inner loop wrapper are adjoint in split mode. All other routines
are done in joint mode [Naumann2005STf].
For this approach we left the original
split into several source code files in place. These files can be found
in the donwloadable test binaries along
with instructions on performing the
here described transformations and executing the result.
The following table illustrates the relative effects of a sequence of
various changes to the code generation on storage requirements and run
time of the adjoint model. The blue numbers indicate a testing
setup with with a 20x20
resolution, the green numbers refer to a setup
with a 180x80 resolution.
sequence of changes to the
adjoint model
|
run time
|
doublestape storage requirements
|
comments
|
doubles
|
integers
|
strict joint mode (1 template);
trivial activity analysis;
checkpoint storage local to subroutine
|
3:39
143:37
|
9.803
317.803
|
30.574
1.080.754
|
a pure split mode is infeasible
due to the implied memory requirements for the tape
|
2 level checkpointing (4
templates);
|
2:59
141:20
|
30.287
1.052.527
|
91.293
3.256.033
|
the low level subroutines in the
call tree are handled by split mode, therefore the maximal size of the
tape held at a particular point in time increases. Note in particular
that the time for the 180x80 configuration does not decrease in this
experiment as the data access dominates the run time
|
modified reversal for "simple"
loops;
improved taping/checkpointing code;
|
0:45
46:02
|
30.287
1.052.527
|
7.794
259.854
|
in "simple" loops we avoid
storing certain addresses
|
interprocedural activity analysis
|
0:30
21:51 |
10.801
388.801
|
904
14.964 |
reducing the set of active
variables, i.e. the amount of
values to be taped.
|
For the configuration that is of interest to our colleagues at MIT's
Department of Earth, Atmospheric, and
Planetary Sciences the runtime requirements are still significant.
Work is underway to improve the efficiency of the generated code and
the support libraries. Below we show as an example output a map of
sensitivities of zonal volume transport through the Drake Passage to
changes in bottom topography everywhere in a barotropic ocean model
computed by P. Heimbach.
