Adjoint via AD (∂f/∂y)

This page covers differentiating interpolation output with respect to data values y using automatic differentiation.

For differentiating with respect to query coordinates xq, see AD Support (1D) and AD Support (ND).

Background

Interpolation is a linear operation on data values: the output is a weighted sum of nearby data points. The gradient ∂f/∂y recovers these weights — equivalent to the adjoint (transpose) of the interpolation operator.

| Direction | Operation | Description |
|---|---|---|
| Forward (gather) | y_interp = A * y_data | Weighted sum of nearby data → interpolated value |
| Adjoint (scatter) | ∇y_data = Aᵀ * δ | Distribute residuals back to data nodes |
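The forward (gather) direction can be sketched in a few lines of pure Julia. This is an illustrative stand-in, not the library's implementation: for linear interpolation at a fixed query, the output is a dot product wᵀy, so ∂f/∂y is simply the weight row w.

```julia
# Sketch: one row of the interpolation matrix A for linear interpolation.
x = 0.0:1.0:5.0
y = cos.(x)
xq = 1.25

i = searchsortedlast(x, xq)          # left bracketing index (here 2)
t = (xq - x[i]) / (x[i+1] - x[i])    # fractional position (here 0.25)

w = zeros(length(y))                 # the weight row: this is ∂f/∂y
w[i]   = 1 - t
w[i+1] = t

f = sum(w .* y)                      # the value a linear interpolation returns
```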
Zygote & Enzyme support for cubic splines (1D and ND)

cubic_interp(x, y, xq; ...) and cubic_interp(grids, data, queries; ...) provide analytical adjoint rules via CubicAdjoint / CubicAdjointND, so Zygote and Enzyme can differentiate ∂f/∂y for cubic splines without tracing through the tridiagonal solve. See Backend Support below for details.

Quick Start

using FastInterpolations, ForwardDiff

x = 0.0:1.0:5.0
y = cos.(collect(x))

f(y) = linear_interp(x, y, 1.25)
∇y = ForwardDiff.gradient(f, y)
6-element Vector{Float64}:
 0.0
 0.75
 0.25
 0.0
 0.0
 0.0

The query point xq = 1.25 lies between x[2] = 1.0 and x[3] = 2.0 with fractional position t = 0.25, so the gradient places weights (1 - t, t) = (0.75, 0.25) on y[2] and y[3].

All Methods Work

ForwardDiff.gradient(y -> constant_interp(x, y, 1.25), y)
6-element Vector{Float64}:
 0.0
 1.0
 0.0
 0.0
 0.0
 0.0
ForwardDiff.gradient(y -> quadratic_interp(x, y, 1.25), y)
6-element Vector{Float64}:
 -0.09375
  0.9375
  0.15625
  0.0
  0.0
  0.0
ForwardDiff.gradient(y -> cubic_interp(x, y, 1.25), y)
6-element Vector{Float64}:
 -0.04788427033492824
  0.7912679425837321
  0.32158343301435405
 -0.07726525119617224
  0.014129784688995214
 -0.0018316387559808608

For cubic and quadratic splines, the gradient is non-local: the tridiagonal solve couples all data points, so changing any y[i] can affect the output. The dominant weights are still near the query point, with exponentially decaying influence further away.
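The coupling can be seen directly in a toy tridiagonal system (made-up coefficients standing in for the actual spline stencil): perturbing one entry of the right-hand side moves every entry of the solution, with rapidly decaying magnitude away from the perturbed index.

```julia
using LinearAlgebra

# Toy tridiagonal system standing in for the spline solve (illustrative numbers).
T = Tridiagonal(fill(1.0, 5), fill(4.0, 6), fill(1.0, 5))

e3 = zeros(6); e3[3] = 1.0   # unit perturbation of "data point" 3
response = T \ e3            # every entry of the solution responds
```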

ND Interpolants

xg = range(0.0, 1.0, 5)
yg = range(0.0, 1.0, 5)
data = [cos(xi) * cos(yj) for xi in xg, yj in yg]

∇data = ForwardDiff.gradient(d -> linear_interp((xg, yg), d, (0.3, 0.7)), data)
5×5 Matrix{Float64}:
 0.0  0.0  0.0   0.0   0.0
 0.0  0.0  0.16  0.64  0.0
 0.0  0.0  0.04  0.16  0.0
 0.0  0.0  0.0   0.0   0.0
 0.0  0.0  0.0   0.0   0.0
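The pattern above factors as an outer product of 1D hat-function weights, which gives a quick way to sanity-check the ND gradient. A pure-Julia sketch (not library code):

```julia
# Sketch: bilinear weights are the outer product of the 1D linear weights.
xg = range(0.0, 1.0; length=5)
yg = range(0.0, 1.0; length=5)
xq, yq = 0.3, 0.7

ix = searchsortedlast(xg, xq); tx = (xq - xg[ix]) / step(xg)
iy = searchsortedlast(yg, yq); ty = (yq - yg[iy]) / step(yg)

wx = zeros(5); wx[ix] = 1 - tx; wx[ix+1] = tx
wy = zeros(5); wy[iy] = 1 - ty; wy[iy+1] = ty

W = wx * wy'    # same sparsity and values as ∇data above
```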

Periodic Boundary Conditions

Both inclusive and exclusive PeriodicBC work:

t = range(0.0, 2π, 7)
y_per = cos.(collect(t))   # cos(0) == cos(2π) == 1.0

ForwardDiff.gradient(y -> cubic_interp(t, y, 1.0; bc=PeriodicBC()), y_per)
7-element Vector{Float64}:
  0.037161667723129926
  0.99564092509889
 -0.03326733500485853
  0.008219844616930736
  0.00038795653713567444
 -0.009771670765473396
  0.0016286117942455648

Batch Queries: Jacobian

For multiple query points, use ForwardDiff.jacobian to build the interpolation matrix:

xqs = [1.25, 3.7]

J = ForwardDiff.jacobian(y -> [linear_interp(x, y, xq) for xq in xqs], y)
2×6 Matrix{Float64}:
 0.0  0.75  0.25  0.0  0.0  0.0
 0.0  0.0   0.0   0.3  0.7  0.0
# Adjoint: scatter residuals back to data nodes
δ = [1.0, -0.5]
∇y = J' * δ              # Aᵀ * δ
6-element Vector{Float64}:
  0.0
  0.75
  0.25
 -0.1499999999999999
 -0.3500000000000001
  0.0
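For large grids, forming J explicitly is wasteful. The same Aᵀδ product can be accumulated matrix-free with a scatter loop; this is an illustrative sketch for the linear case, not the library's adjoint API:

```julia
# Sketch: adjoint as a scatter. Each residual is pushed onto the two
# grid nodes bracketing its query point.
x = 0.0:1.0:5.0
xqs = [1.25, 3.7]
δ = [1.0, -0.5]

∇y = zeros(length(x))
for (xq, d) in zip(xqs, δ)
    i = searchsortedlast(x, xq)
    t = (xq - x[i]) / (x[i+1] - x[i])
    ∇y[i]   += (1 - t) * d    # same weights as the forward gather
    ∇y[i+1] += t * d
end
```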

Example: Loss Function

xq_obs = [0.3, 1.2, 2.7, 3.8, 4.5]
y_obs = [0.1, 0.8, 0.3, -0.5, -0.9]

function loss(y)
    predictions = [linear_interp(x, y, xq) for xq in xq_obs]
    return sum((predictions .- y_obs) .^ 2)
end

ForwardDiff.gradient(loss, y)
6-element Vector{Float64}:
  1.0669269684646188
 -0.2643256211276015
 -0.851098288196639
 -1.6533396764226285
  0.3615478486822414
  0.7150092822998072

This gradient tells you how to adjust each data point to reduce the loss — the foundation for iterative fitting and inverse problems.
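A minimal sketch of such an iterative fit, assuming plain gradient descent with the closed-form gradient 2Aᵀ(Ay − y_obs) for the linear case (the step size and iteration count here are arbitrary choices, not library defaults):

```julia
# Sketch: fit the data values y to observations by gradient descent.
x = 0.0:1.0:5.0
xq_obs = [0.3, 1.2, 2.7, 3.8, 4.5]
y_obs  = [0.1, 0.8, 0.3, -0.5, -0.9]

# Hat-function weight matrix A (one row per observation).
A = zeros(length(xq_obs), length(x))
for (k, xq) in enumerate(xq_obs)
    i = searchsortedlast(x, xq)
    t = (xq - x[i]) / (x[i+1] - x[i])
    A[k, i]   = 1 - t
    A[k, i+1] = t
end

y = zeros(length(x))
for _ in 1:500
    y .-= 0.1 .* (2 .* (A' * (A * y .- y_obs)))   # ∇loss = 2Aᵀ(Ay − y_obs)
end
final_loss = sum(abs2, A * y .- y_obs)
```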

Performance

ForwardDiff processes y in chunks (default chunk size = 8), reusing the existing optimized interpolation code paths. Because FastInterpolations is highly optimized for forward evaluation, the Dual number overhead remains modest.

Representative timings (Apple M1 Pro, single query ∂f/∂y, Julia 1.12):

| N (grid points) | linear | cubic |
|---|---|---|
| 11 | 0.4 μs | 0.6 μs |
| 51 | 1.4 μs | 4.6 μs |
| 201 | 10 μs | 67 μs |
| 1001 | 203 μs | 1.7 ms |

Cubic is slower per chunk because each chunk re-runs the tridiagonal forward/backward substitution with Dual-valued data. The LU structure (grid spacing, cache) is reused, but the solve itself must execute for each chunk of Dual numbers.
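If chunking overhead matters, the chunk size can be pinned via ForwardDiff's `GradientConfig`. A sketch with a stand-in objective rather than an interpolation call (in practice the closure would be `y -> cubic_interp(x, y, xq)`):

```julia
using ForwardDiff

# Stand-in objective; replace with an interpolation closure in real use.
f(y) = sum(abs2, y)

y = collect(range(0.0, 1.0; length=16))
cfg = ForwardDiff.GradientConfig(f, y, ForwardDiff.Chunk{4}())  # 4 Duals per pass
∇ = ForwardDiff.gradient(f, y, cfg)
```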

Backend Support

Computing ∂f/∂y requires the one-shot API — there is no alternative. Pre-building an interpolant itp = cubic_interp(x, y) bakes y into the struct as fixed constants; calling gradient(itp, xq) then gives ∂f/∂xq (coordinate sensitivity), not ∂f/∂y. To differentiate with respect to data, the AD backend must see y as a live variable, which means it must trace through the entire construction path: f(y) = cubic_interp(x, y, xq).

This construction path includes in-place array mutations (tridiagonal solve for cubic/quadratic, slope recurrence for quadratic), which limits backend compatibility:

| Backend | constant | linear | quadratic | cubic (1D) | cubic (ND) |
|---|---|---|---|---|---|
| ForwardDiff | ✅ (1D/ND) | ✅ (1D/ND) | ✅ (1D/ND) | ✅ | ✅ |
| Zygote | ✅ | ✅ | ❌¹ | ✅² | ✅² |
| Enzyme | ❌³ | ❌³ | ❌³ | ✅² | ✅² |

¹ Quadratic one-shot mutates arrays during the spline solve, which Zygote's source-to-source transformation cannot differentiate through.
² Cubic uses an analytical adjoint via CubicAdjoint / CubicAdjointND — the AD backend never traces through the tridiagonal solve.
³ Enzyme encounters LLVM codegen errors on the one-shot construction path.

How It Works

ForwardDiff replaces each y[i] with a Dual number y[i] + ε. The interpolation code runs unchanged because all operations are generic over <: Real, and ForwardDiff.Dual <: Real supports the four operations the kernels rely on (+, -, *, muladd).
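The mechanism can be seen with a toy dual type (a simplified stand-in, not ForwardDiff.Dual): any generic code written in terms of + and * automatically propagates the derivative component.

```julia
# Toy forward-mode dual number: value plus one derivative component.
struct D <: Real
    v::Float64   # primal value
    ε::Float64   # derivative w.r.t. the seeded input
end
Base.:+(a::D, b::D) = D(a.v + b.v, a.ε + b.ε)
Base.:-(a::D, b::D) = D(a.v - b.v, a.ε - b.ε)
Base.:*(a::D, b::D) = D(a.v * b.v, a.v * b.ε + a.ε * b.v)
Base.:*(a::Real, b::D) = D(a * b.v, a * b.ε)

# A generic linear blend written only with + and *, like the interpolation kernels:
lerp(y1, y2, t) = (1 - t) * y1 + t * y2

# Seed y1 with ε = 1 to read off ∂lerp/∂y1 at t = 0.25:
g = lerp(D(0.5, 1.0), D(-0.4, 0.0), 0.25)
```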

∂f/∂xq vs ∂f/∂y

|  | ∂f/∂xq (coordinate) | ∂f/∂y (data) |
|---|---|---|
| Use for | Position optimization, sensitivity | Inverse problems, data fitting |
| Fastest method | deriv keyword (analytical) | Cubic: CubicAdjoint; others: ForwardDiff |
| Docs | 1D, ND | This page |

See Also

  • Adjoint Operators: Conceptual overview — what adjoints are, where they appear, and the mathematical formulation
  • Cubic Adjoint (1D): Native CubicAdjoint API for zero-allocation, matrix-free adjoint computation
  • Cubic Adjoint (ND): Native CubicAdjointND API for N-dimensional adjoint computation