Internals & Design Documents

This section provides links to internal design documents for developers.

Design Documents

The following documents are available directly in the GitHub repository:

Architecture Overview

FastInterpolations.jl internal architecture:

  • Operation Types (src/ops.jl): EvalValue, EvalDeriv1, EvalDeriv2 traits for dispatch
  • Kernel Functions (src/*_kernels.jl): Pure math functions for interpolation and derivatives
  • Dispatch Macros (src/utils.jl): Runtime-to-compile-time conversion via @_dispatch_deriv
  • Boundary Conditions (src/bc_types.jl): NaturalBC, ClampedBC, PeriodicBC types
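The trait-plus-kernel split above can be illustrated with a self-contained sketch. The trait names mirror the list; the `kernel` function and its cubic-segment bodies are invented for illustration and are not the package's actual internals:

```julia
# Minimal sketch of operation-type traits driving kernel selection.
# EvalValue / EvalDeriv1 mirror the trait names above; the kernels are toy
# stand-ins for a cubic segment p(t) = c0 + c1*t + c2*t^2 + c3*t^3, t ∈ [0, 1].
struct EvalValue end
struct EvalDeriv1 end

# Kernel functions: pure math, one method per operation trait (Horner form).
kernel(::EvalValue,  c, t) = c[1] + t*(c[2] + t*(c[3] + t*c[4]))
kernel(::EvalDeriv1, c, t) = c[2] + t*(2 * c[3] + 3 * t * c[4])

c  = (1.0, 2.0, 3.0, 4.0)          # coefficients c0..c3
v  = kernel(EvalValue(),  c, 0.5)  # value at t = 0.5
d1 = kernel(EvalDeriv1(), c, 0.5)  # first derivative at t = 0.5
```

Because the trait is a concrete type, the choice of kernel is resolved at compile time with zero runtime cost.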

Anchored Queries (Internal API)

For maximum performance in hot loops with fixed query points, anchored queries pre-compute grid positions to skip O(log n) binary search entirely.

Internal API

Functions prefixed with _ are internal and may change without notice.

How It Works

Anchored queries pre-compute:

  1. Which interval each query point falls into
  2. The local coordinate within that interval

This eliminates O(log n) binary search on every evaluation (~2x speedup).
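The two pre-computed quantities can be sketched in plain Julia. The toy `anchor` helper below is an illustration only, not the package's `_anchor_query`, whose return type is internal:

```julia
# For each query point, pre-compute the interval index and the local
# coordinate within that interval, so a hot loop never repeats the
# O(log n) searchsortedlast lookup.
function anchor(x::AbstractRange, xq)
    map(xq) do q
        i = clamp(searchsortedlast(x, q), 1, length(x) - 1)  # interval index
        t = (q - x[i]) / step(x)                             # local coordinate in [0, 1]
        (i, t)
    end
end

x = range(0.0, 10.0, 101)          # uniform grid with spacing 0.1
anchors = anchor(x, [0.25, 9.99])  # one (interval, coordinate) pair per query
```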

Usage Pattern

x = range(0.0, 10.0, 100)
y = sin.(x)          # data on the grid
xq = range(0.0, 10.0, 500)
out = similar(xq)

# Pre-compute anchors ONCE
aq_vec = FastInterpolations._anchor_query(x, xq)

# Use in hot loop
for i in 1:10000
    cubic_interp!(out, x, y, aq_vec)
end

With SeriesInterpolant

SeriesInterpolant uses a unified matrix storage with point-contiguous layout, enabling SIMD-optimized scalar queries (10-120× faster than looping over individual interpolants).

y1, y2, y3 = sin.(x), cos.(x), sin.(2 .* x)   # three series on one grid
sitp = cubic_interp(x, [y1, y2, y3])
outputs = [similar(xq) for _ in 1:3]

# Pre-compute anchors
aq_vec = FastInterpolations._anchor_query(x, xq, Val(:cubic))

# Fastest possible path - scalar queries especially benefit from SIMD
for i in 1:10000
    sitp(outputs, aq_vec)
end
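The point-contiguous layout behind this speedup can be sketched in plain Julia. The matrix shape and helper below are illustrative assumptions, not the actual storage type:

```julia
# Sketch of point-contiguous storage: an (n_series × n_points) matrix, so one
# column holds every series' value at a single grid point. Julia is
# column-major, so a scalar query reads contiguous memory (SIMD-friendly).
n_series, n_points = 3, 100
ys = [sin.(range(0, 2π, n_points)) .* k for k in 1:n_series]
data = reduce(vcat, permutedims.(ys))   # n_series × n_points, point-contiguous

# All series' values at grid point j: one contiguous slice.
point_j(data, j) = @view data[:, j]
col = point_j(data, 10)
```

Looping over separate interpolants instead touches n_series scattered arrays per query point, which defeats vectorization.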

Small Series Caveat

For very small series counts (n ≤ 3) with vector queries only, the anchor-allocation overhead can make a manual loop marginally faster (~10-25%). For scalar queries, or for n ≥ 4, SeriesInterpolant always wins.

Advanced Optimization

Cache Optimization

The one-shot API uses auto-caching to avoid redundant LU factorizations.

Cache keys are based on: (x_grid, method, boundary_condition, extrapolation)

# Same cache key → cache HIT
x = 0:0.1:10
y1, y2 = sin.(x), cos.(x)
cubic_interp(x, y1, xq)  # cache miss (first call builds the factorization)
cubic_interp(x, y2, xq)  # cache HIT (reuses factorization)

# Different x object → cache MISS
cubic_interp(collect(0:0.1:10), y1, xq)  # new object = cache miss!

Define grid once outside loops:

x = range(0.0, 10.0, 100)
xq = range(0.0, 10.0, 500)
out = similar(xq)

for step in 1:10000
    y = compute(step)             # new data each step, same grid
    cubic_interp!(out, x, y, xq)  # cache HIT every time
end

Thread Safety

FastInterpolations.jl is thread-safe, but output buffers must not be shared between threads. Note that indexing buffers by Threads.threadid() is only safe when iterations are pinned to their threads, so use the :static scheduler:

# Create one output buffer per thread
thread_outputs = [similar(xq) for _ in 1:Threads.nthreads()]

Threads.@threads :static for i in 1:1000
    tid = Threads.threadid()
    cubic_interp!(thread_outputs[tid], x, y[i], xq)
end
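An alternative pattern avoids `Threads.threadid()` entirely by splitting the work into chunks, each owning its buffer. A self-contained sketch, with a toy `process!` standing in for an in-place interpolation call:

```julia
# Thread-safe pattern without threadid(): partition the indices into chunks
# and allocate one buffer per chunk, owned by the task that runs the chunk.
function threaded_map!(process!, results, inputs, nchunks)
    chunks = Iterators.partition(eachindex(inputs), cld(length(inputs), nchunks))
    Threads.@threads for chunk in collect(chunks)
        buf = zeros(4)                 # buffer owned by this chunk only
        for i in chunk
            process!(buf, inputs[i])   # in-place work into the private buffer
            results[i] = sum(buf)
        end
    end
    results
end

inputs = [fill(Float64(i), 4) for i in 1:16]
results = zeros(16)
threaded_map!((buf, v) -> (buf .= v), results, inputs, Threads.nthreads())
```

This stays correct under any scheduler because no two iterations ever touch the same buffer.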

Type Stability

Ensure type stability for maximum performance:

# Type-stable: compile-time method selection
result = cubic_interp(x, y, xq)

# Type-unstable: `method` could be either function, so every call
# incurs runtime dispatch (slower)
method = user_input == "cubic" ? cubic_interp : linear_interp
result = method(x, y, xq)  # dynamic dispatch

# Fix: use if-else for compile-time dispatch
if user_input == "cubic"
    result = cubic_interp(x, y, xq)
else
    result = linear_interp(x, y, xq)
end
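When the method genuinely is chosen at runtime, a function barrier confines the dynamic dispatch to a single call: everything inside the barrier compiles against the concrete function type. A self-contained sketch with toy stand-ins (not the package's functions):

```julia
# Function barrier: pick the function once, then call through a barrier so
# the hot loop specializes on the concrete function type.
interp_cubic(v)  = v .^ 3     # toy stand-ins for cubic_interp / linear_interp
interp_linear(v) = v .* 2

# One dynamic dispatch here...
choose(name) = name == "cubic" ? interp_cubic : interp_linear

# ...and `run` compiles a specialized method for each concrete `f`.
function run(f, data, n)
    acc = 0.0
    for _ in 1:n
        acc += sum(f(data))
    end
    acc
end

result = run(choose("cubic"), [1.0, 2.0], 1_000)
```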