API Reference

API Summary

Macros

  • @with_pool name expr - Recommended. Injects the global, task-local pool under the name name. Automatically checkpoints on entry and rewinds on exit.
  • @maybe_with_pool name expr - Same as @with_pool, but can be toggled on/off at runtime via MAYBE_POOLING_ENABLED[].

Functions

  • acquire!(pool, T, dims...) - Returns a view for most T: SubArray{T,1} for 1D, ReshapedArray{T,N} for N-D. For T === Bit, returns a native BitVector/BitArray{N}. Allocates 0 bytes on a cache hit.
  • acquire!(pool, T, dims::Tuple) - Tuple overload of acquire! (e.g., acquire!(pool, T, size(x))).
  • acquire!(pool, x::AbstractArray) - similar-style: acquires an array matching eltype(x) and size(x).
  • unsafe_acquire!(pool, T, dims...) - Returns a native Array/CuArray (CPU: Vector{T} for 1D, Array{T,N} for N-D). For T === Bit, returns a native BitVector/BitArray{N} (equivalent to acquire!). Use only for FFI or strict type constraints.
  • unsafe_acquire!(pool, x::AbstractArray) - similar-style: acquires a raw array matching eltype(x) and size(x).
  • checkpoint!(pool) - Saves the current pool state (stack pointer).
  • rewind!(pool) - Restores the pool to the last checkpoint, freeing all arrays acquired since then.
  • pool_stats(pool) - Prints detailed statistics about pool usage.
  • get_task_local_pool() - Returns the task-local pool instance.
  • empty!(pool) - Clears all internal storage, releasing all memory.

Convenience Functions

Default element type is Float64 (CPU) or Float32 (CUDA).

  • zeros!(pool, [T,] dims...) - Zero-initialized view. Equivalent to acquire! + fill!(0).
  • ones!(pool, [T,] dims...) - One-initialized view. Equivalent to acquire! + fill!(1).
  • trues!(pool, dims...) - Bit-packed BitVector/BitArray{N} filled with true.
  • falses!(pool, dims...) - Bit-packed BitVector/BitArray{N} filled with false.
  • similar!(pool, A) - View matching eltype(A) and size(A).

Types

  • AdaptiveArrayPool - The main pool type. Create with AdaptiveArrayPool().
  • Bit - Sentinel type to request packed BitVector storage (1 bit per element).
  • DisabledPool{Backend} - Sentinel type used when pooling is disabled.

Configuration & Utilities

  • USE_POOLING - Compile-time constant to disable all pooling.
  • MAYBE_POOLING_ENABLED - Runtime Ref{Bool} for @maybe_with_pool.
  • POOL_DEBUG - Runtime Ref{Bool} to enable safety validation.
  • set_cache_ways!(n) - Sets the N-way cache size.

Detailed Reference

AdaptiveArrayPools.@maybe_with_pool (Macro)
@maybe_with_pool pool_name expr
@maybe_with_pool expr

Conditionally enables pooling based on MAYBE_POOLING_ENABLED[]. If disabled, pool_name becomes nothing, and acquire! falls back to standard allocation.

Useful for libraries that want to let users control pooling behavior at runtime.

Function Definition

Like @with_pool, wrap function definitions:

@maybe_with_pool pool function process_data(data)
    tmp = acquire!(pool, Float64, length(data))  # Conditionally pooled
    tmp .= data
    sum(tmp)
end

Block Usage

MAYBE_POOLING_ENABLED[] = false
@maybe_with_pool pool begin
    v = acquire!(pool, Float64, 100)  # Falls back to Vector{Float64}(undef, 100)
end
AdaptiveArrayPools.@with_pool (Macro)
@with_pool pool_name expr
@with_pool expr
@with_pool :backend pool_name expr
@with_pool :backend expr

Executes code within a pooling scope with automatic lifecycle management. Calls checkpoint! on entry and rewind! on exit (even if errors occur).

If pool_name is omitted, a hidden variable is used (useful when you don't need to reference the pool directly).

Backend Selection

Use a symbol to specify the pool backend:

  • :cpu - CPU pools (default)
  • :cuda - GPU pools (requires using CUDA)
# CPU (default)
@with_pool pool begin ... end

# GPU via CUDA
@with_pool :cuda pool begin ... end

Function Definition

Wrap function definitions to inject pool lifecycle into the body:

# Long form function
@with_pool pool function compute_stats(data)
    tmp = acquire!(pool, Float64, length(data))
    tmp .= data
    mean(tmp), std(tmp)
end

# Short form function
@with_pool pool fast_sum(data) = begin
    tmp = acquire!(pool, eltype(data), length(data))
    tmp .= data
    sum(tmp)
end

Block Usage

# With explicit pool name
@with_pool pool begin
    v = acquire!(pool, Float64, 100)
    v .= 1.0
    sum(v)
end

# Without pool name (for simple blocks)
@with_pool begin
    inner_function()  # inner function can use get_task_local_pool()
end

Nesting

Nested @with_pool blocks work correctly - each maintains its own checkpoint.

@with_pool p1 begin
    v1 = acquire!(p1, Float64, 10)
    inner = @with_pool p2 begin
        v2 = acquire!(p2, Float64, 5)
        sum(v2)
    end
    # v1 is still valid here
    sum(v1) + inner
end
AdaptiveArrayPools._acquire_impl! (Method)
_acquire_impl!(pool, Type{T}, n) -> SubArray{T,1,Vector{T},...}
_acquire_impl!(pool, Type{T}, dims...) -> ReshapedArray{T,N,...}

Internal implementation of acquire!. Called directly by macro-transformed code (no type touch recording). User code calls acquire! which adds recording.

AdaptiveArrayPools._can_use_typed_path (Method)
_can_use_typed_path(pool::AbstractArrayPool, tracked_mask::UInt16) -> Bool

Check if the typed (fast) checkpoint/rewind path is safe to use.

Returns true when all touched types at the current depth are a subset of the tracked types (bitmask subset check) AND no non-fixed-slot types were touched.

The subset check: (touched_mask & ~tracked_mask) == 0 means every bit set in touched_mask is also set in tracked_mask.
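The subset test is plain bit arithmetic and can be checked standalone; the mask values below are illustrative only, not the package's actual slot assignments:

```julia
# Bit i set = fixed-slot type i was touched (by acquire!) or tracked
# (covered by the typed checkpoint). Masks here are made up for the demo.
is_subset(touched::UInt16, tracked::UInt16) = (touched & ~tracked) == UInt16(0)

tracked     = UInt16(0b0011)  # checkpoint covered slots 0 and 1
touched_ok  = UInt16(0b0001)  # only slot 0 acquired -> subset holds
touched_bad = UInt16(0b0101)  # slot 2 acquired but never tracked

is_subset(touched_ok, tracked)   # true:  fast typed path is safe
is_subset(touched_bad, tracked)  # false: must take the fallback path
```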

AdaptiveArrayPools._disabled_pool_expr (Method)
_disabled_pool_expr(backend::Symbol) -> Expr

Generate expression for DisabledPool singleton based on backend. Used when pooling is disabled to preserve backend context.

AdaptiveArrayPools._ensure_body_has_toplevel_lnn (Method)
_ensure_body_has_toplevel_lnn(body, source)

Ensure body has a LineNumberNode pointing to user source at the top level.

  • Scans first few args to handle Expr(:meta, ...) from @inline etc.
  • If first LNN points to user file (same as source.file), preserve it
  • If first LNN points elsewhere (e.g., macros.jl), replace with source LNN
  • If no LNN exists, prepend source LNN
  • If source.file === :none (REPL/eval), don't clobber valid file LNNs

Returns a new Expr to avoid mutating the original AST.

AdaptiveArrayPools._extract_acquire_types (Function)
_extract_acquire_types(expr, target_pool) -> Set{Any}

Extract type arguments from acquire/convenience function calls in an expression. Only extracts types from calls where the first argument matches target_pool. This prevents AST pollution when multiple pools are used in the same block.

Supported functions:

  • acquire! and its alias acquire_view!
  • unsafe_acquire! and its alias acquire_array!
  • zeros!, ones!, similar!
  • unsafe_zeros!, unsafe_ones!, unsafe_similar!

Handles various forms:

  • [unsafe_]acquire!(pool, Type, dims...): extracts Type directly
  • acquire!(pool, x): generates eltype(x) expression
  • zeros!(pool, dims...) / ones!(pool, dims...): Float64 (default)
  • zeros!(pool, Type, dims...) / ones!(pool, Type, dims...): extracts Type
  • similar!(pool, x): generates eltype(x) expression
  • similar!(pool, x, Type, ...): extracts Type
AdaptiveArrayPools._extract_local_assignments (Function)
_extract_local_assignments(expr, locals=Set{Symbol}()) -> Set{Symbol}

Find all symbols that are assigned locally in the expression body. These cannot be used for typed checkpoint since they're defined after checkpoint!.

Detects patterns like: T = eltype(x), local T = ..., etc.

AdaptiveArrayPools._filter_static_types (Function)
_filter_static_types(types, local_vars=Set{Symbol}()) -> (static_types, has_dynamic)

Filter types for typed checkpoint/rewind generation.

  • Symbols NOT in local_vars are passed through (type parameters, global types)
  • Symbols IN local_vars trigger fallback (defined after checkpoint!)
  • Parametric types like Vector{T} trigger fallback
  • eltype(x) expressions: usable if x does NOT reference a local variable

Type parameters (T, S from where clause) resolve to concrete types at runtime. Local variables (T = eltype(x)) are defined after checkpoint! and cannot be used.

AdaptiveArrayPools._find_first_lnn_index (Method)
_find_first_lnn_index(args) -> Union{Int, Nothing}

Find the index of the first LineNumberNode in the leading prefix of args.

Scans sequentially, skipping Expr(:meta, ...) nodes (inserted by @inline, @inbounds, etc.). Returns nothing as soon as a non-meta, non-LNN expression is encountered—this prevents matching LNNs deeper in the AST.

Example AST prefix patterns

  • [Expr(:meta,:inline), LNN, ...] → returns 2
  • [LNN, ...] → returns 1
  • [Expr(:meta,:inline), Expr(:call,...), LNN, ...] → returns nothing (stopped at call)
AdaptiveArrayPools._fix_try_body_lnn! (Method)
_fix_try_body_lnn!(expr, source)

Fix LineNumberNodes inside try blocks to point to user source. Julia's stack trace uses the LAST LNN before error location for line numbers. By replacing the first LNN in try body with source LNN, we ensure correct line numbers in stack traces.

Scans first few args to handle Expr(:meta, ...) from @inline etc. If source.file === :none (REPL/eval), don't clobber valid file LNNs. Modifies expr in-place and returns it.

AdaptiveArrayPools._generate_lazy_checkpoint_call (Method)
_generate_lazy_checkpoint_call(pool_expr)

Generate a depth-only checkpoint call for dynamic-selective mode (use_typed=false). Much lighter than full checkpoint!: only increments depth and pushes bitmask sentinels.

AdaptiveArrayPools._generate_lazy_rewind_call (Method)
_generate_lazy_rewind_call(pool_expr)

Generate selective rewind code for dynamic-selective mode (use_typed=false). Delegates to _lazy_rewind! — a single function call, symmetric with _lazy_checkpoint! for checkpoint. This avoids let-block overhead in finally clauses (which can impair Julia's type inference and cause boxing).

AdaptiveArrayPools._generate_pool_code_with_backend (Method)
_generate_pool_code_with_backend(backend, pool_name, expr, force_enable)

Generate pool code for a specific backend (e.g., :cuda, :cpu). Uses _get_pool_for_backend(Val{backend}()) for zero-overhead dispatch.

Includes type-specific checkpoint/rewind optimization (same as regular @with_pool).

AdaptiveArrayPools._generate_typed_checkpoint_call (Method)
_generate_typed_checkpoint_call(pool_expr, types)

Generate bitmask-aware checkpoint call. When types are known at compile time, emits a conditional:

  • if touched types ⊆ tracked types → typed checkpoint (fast path)
  • otherwise → _typed_lazy_checkpoint! (typed checkpoint + set bit 14 for lazy first-touch checkpointing of extra types touched by helpers)
AdaptiveArrayPools._generate_typed_rewind_call (Method)
_generate_typed_rewind_call(pool_expr, types)

Generate bitmask-aware rewind call. When types are known at compile time, emits a conditional:

  • if touched types ⊆ tracked types → typed rewind (fast path)
  • otherwise → _typed_lazy_rewind! (rewinds tracked | touched mask; all touched types have Case A checkpoints via bit 14 lazy mode)
AdaptiveArrayPools._get_pool_for_backend (Method)
_get_pool_for_backend(::Val{:cpu}) -> AdaptiveArrayPool

Get task-local pool for the specified backend.

Extensions add methods for their backends (e.g., Val{:cuda}). Using Val{Symbol} enables compile-time dispatch and full inlining, achieving zero overhead compared to Dict-based registry.

Example (in CUDA extension)

@inline AdaptiveArrayPools._get_pool_for_backend(::Val{:cuda}) = get_task_local_cuda_pool()
AdaptiveArrayPools._lazy_checkpoint! (Method)
_lazy_checkpoint!(pool::AdaptiveArrayPool)

Lightweight checkpoint for lazy mode (use_typed=false macro path).

Increments _current_depth and pushes bitmask sentinels — but does not save n_active for any fixed-slot typed pool. The _LAZY_MODE_BIT (bit 15) in _touched_type_masks marks this depth as lazy mode so that _record_type_touch! can trigger lazy first-touch checkpoints.

Existing others entries are eagerly checkpointed since there is no per-type tracking for non-fixed-slot pools; Case B in _rewind_typed_pool! handles any new others entries created during the scope (n_active starts at 0 = sentinel).

Performance: ~2ns vs ~540ns for full checkpoint!.

AdaptiveArrayPools._lazy_rewind! (Method)
_lazy_rewind!(pool::AdaptiveArrayPool)

Complete rewind for lazy mode (use_typed=false macro path).

Reads the combined mask at the current depth, rewinds only the fixed-slot pools whose bits are set, handles any others entries, then pops the depth metadata.

Called directly from the macro-generated finally clause as a single function call (matching the structure of _lazy_checkpoint! for symmetry and performance).

AdaptiveArrayPools._looks_like_type (Method)
_looks_like_type(expr) -> Bool

Heuristic to check if an expression looks like a type. Returns true for: uppercase Symbols (Float64, Int), curly expressions (Vector{T}), GlobalRef to types.
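A simplified, hypothetical re-implementation of this heuristic (the actual version also accepts GlobalRefs to types, which this sketch omits):

```julia
# Hypothetical sketch: treat uppercase symbols (Float64, Int) and
# curly expressions (Vector{T}) as "looks like a type".
looks_like_type(s::Symbol) = isuppercase(first(string(s)))
looks_like_type(ex::Expr)  = ex.head === :curly
looks_like_type(x)         = false   # anything else: not a type

looks_like_type(:Float64)        # true  (uppercase symbol)
looks_like_type(:(Vector{T}))    # true  (curly expression)
looks_like_type(:mydims)         # false (lowercase symbol)
```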

AdaptiveArrayPools._record_type_touch! (Method)
_record_type_touch!(pool::AbstractArrayPool, ::Type{T})

Record that type T was touched (acquired) at the current checkpoint depth. Called by acquire! and convenience wrappers; macro-transformed calls use _acquire_impl! directly (bypassing this for zero overhead).

For fixed-slot types, sets the corresponding bit in _touched_type_masks. For non-fixed-slot types, sets _touched_has_others flag.

AdaptiveArrayPools._selective_rewind_fixed_slots! (Method)
_selective_rewind_fixed_slots!(pool::AdaptiveArrayPool, mask::UInt16)

Rewind only the fixed-slot typed pools whose bits are set in mask.

Each of the 8 fixed-slot pools maps to bits 0–7 (same encoding as _fixed_slot_bit). Bits 8–15 (mode flags) are not checked here — callers must strip them before passing the mask (e.g. mask & _TYPE_BITS_MASK).

Unset bits are skipped entirely: for pools that were acquired without a matching checkpoint, _rewind_typed_pool! Case B safely restores from the parent checkpoint.

AdaptiveArrayPools._tracked_mask_for_types (Method)
_tracked_mask_for_types(types::Type...) -> UInt16

Compute compile-time bitmask for the types tracked by a typed checkpoint/rewind. Uses @generated for zero-overhead constant folding.

Returns UInt16(0) when called with no arguments. Non-fixed-slot types contribute UInt16(0) (their bit is 0).

AdaptiveArrayPools._typed_lazy_checkpoint! (Method)
_typed_lazy_checkpoint!(pool::AdaptiveArrayPool, types::Type...)

Typed checkpoint that enables lazy first-touch checkpointing for extra types touched by helpers (use_typed=true, _can_use_typed_path=false path).

Calls checkpoint!(pool, types...) (checkpoints only the statically-known types), then sets _TYPED_LAZY_BIT (bit 14) in _touched_type_masks[depth] to signal typed lazy mode.

_record_type_touch! checks (mask & _MODE_BITS_MASK) != 0 (bit 14 OR bit 15) to trigger a lazy first-touch checkpoint for each extra type on first acquire, ensuring Case A (not Case B) applies at rewind and parent n_active is preserved correctly.

AdaptiveArrayPools._typed_lazy_rewind! (Method)
_typed_lazy_rewind!(pool::AdaptiveArrayPool, tracked_mask::UInt16)

Selective rewind for typed mode (use_typed=true) fallback path.

Called when _can_use_typed_path returns false (helpers touched types beyond the statically-tracked set). Rewinds only pools whose bits are set in tracked_mask | touched_mask. All touched types have Case A checkpoints, guaranteed by the _TYPED_LAZY_BIT mode set in _typed_lazy_checkpoint!.

AdaptiveArrayPools._uses_local_var (Method)
_uses_local_var(expr, local_vars) -> Bool

Check if an expression uses any local variable (recursively). Handles field access (x.y.z) and indexing (x[i]) by checking the base variable.

This is used to detect cases like acquire!(pool, cp1d.t_i_average) where cp1d is defined locally - the eltype expression can't be evaluated at checkpoint time since cp1d doesn't exist yet.

AdaptiveArrayPools.acquire! (Method)
acquire!(pool, x::AbstractArray) -> SubArray

Acquire an array with the same element type and size as x (similar to similar(x)).

Example

A = rand(10, 10)
@with_pool pool begin
    B = acquire!(pool, A)  # Same type and size as A
    B .= A .* 2
end
AdaptiveArrayPools.acquire! (Method)
acquire!(pool, Type{T}, n) -> view type
acquire!(pool, Type{T}, dims...) -> view type
acquire!(pool, Type{T}, dims::NTuple{N,Int}) -> view type

Acquire a pooled array of type T with size n or dimensions dims.

Returns a pooled array (backend-dependent type):

  • CPU 1D: SubArray{T,1,Vector{T},...} (parent is Vector{T})
  • CPU N-D: ReshapedArray{T,N,...} (zero creation cost)
  • Bit (T === Bit): BitVector / BitArray{N} (chunks-sharing, SIMD optimized)
  • CUDA: CuArray{T,N} (unified N-way cache)

For CPU numeric arrays, the returned views are StridedArray subtypes, so they remain compatible with BLAS and broadcasting.

For type-unspecified paths (struct fields without concrete type parameters), use unsafe_acquire! instead - cached native array instances can be reused.

Example

@with_pool pool begin
    v = acquire!(pool, Float64, 100)      # 1D view
    m = acquire!(pool, Float64, 10, 10)   # 2D view
    v .= 1.0
    m .= 2.0
    sum(v) + sum(m)
end

See also: unsafe_acquire! for native array access.

AdaptiveArrayPools.acquire_view! (Function)
acquire_view!(pool, Type{T}, dims...)

Alias for acquire!.

Explicit name emphasizing the return type is a view (SubArray/ReshapedArray), not a raw Array. Use when you prefer symmetric naming with acquire_array!.

AdaptiveArrayPools.checkpoint! (Method)
checkpoint!(pool::AdaptiveArrayPool, types::Type...)

Save state for multiple specific types. Uses @generated for zero-overhead compile-time unrolling. Increments _current_depth once for all types.

AdaptiveArrayPools.checkpoint! (Method)
checkpoint!(pool::AdaptiveArrayPool)

Save the current pool state (n_active counters) to internal stacks.

This is called automatically by @with_pool and related macros. After warmup, this function has zero allocation.
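The underlying discipline is a counter-plus-stack protocol. A toy model makes it concrete (ToyPool and the toy_* functions are hypothetical illustrations, not package API): acquiring bumps a live-array counter, checkpoint! saves it, rewind! restores it.

```julia
# Toy model of the checkpoint/rewind mechanism (not the package's API).
mutable struct ToyPool
    n_active::Int            # arrays currently handed out
    checkpoints::Vector{Int} # saved n_active values (the "stack pointer"s)
end
ToyPool() = ToyPool(0, Int[])

toy_checkpoint!(p::ToyPool) = push!(p.checkpoints, p.n_active)
toy_acquire!(p::ToyPool)    = (p.n_active += 1)
toy_rewind!(p::ToyPool)     = (p.n_active = pop!(p.checkpoints))

p = ToyPool()
toy_checkpoint!(p)               # save n_active = 0
toy_acquire!(p); toy_acquire!(p) # two arrays live
toy_rewind!(p)                   # everything since the checkpoint is "freed"
p.n_active                       # back to 0; memory stays pooled for reuse
```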

See also: rewind!, @with_pool

AdaptiveArrayPools.checkpoint! (Method)
checkpoint!(pool::AdaptiveArrayPool, ::Type{T})

Save state for a specific type only. Used by optimized macros that know which types will be used at compile time.

Also updates _current_depth and bitmask state for type touch tracking.

~77% faster than full checkpoint! when only one type is used.

AdaptiveArrayPools.default_eltype (Method)
default_eltype(pool) -> Type

Default element type for convenience functions when type is not specified. CPU pools default to Float64, CUDA pools to Float32.

Backends can override this to provide appropriate defaults.

AdaptiveArrayPools.falses! (Method)
falses!(pool, dims...) -> BitArray
falses!(pool, dims::Tuple) -> BitArray

Acquire a bit-packed boolean array filled with false from the pool.

Equivalent to Julia's falses(dims...) but using pooled memory. Uses ~8x less memory than zeros!(pool, Bool, dims...).

Example

@with_pool pool begin
    bv = falses!(pool, 100)       # BitVector, all false
    bm = falses!(pool, 10, 10)    # BitMatrix, all false
end

See also: trues!, zeros!, acquire!

AdaptiveArrayPools.get_bitarray! (Method)
get_bitarray!(tp::BitTypedPool, dims::NTuple{N,Int}) -> BitArray{N}

Get a BitArray{N} that shares chunks with the pooled BitVector.

Uses N-way cache for BitArray reuse. Unlike Array which requires unsafe_wrap for each shape, BitArray can reuse cached entries by modifying dims/len fields when ndims matches (0 bytes allocation).

Cache Strategy

  • Exact match: Return cached BitArray directly (0 bytes)
  • Same ndims: Modify dims/len/chunks of cached entry (0 bytes)
  • Different ndims: Create new BitArray{N} and cache it (~944 bytes)

Implementation Notes

  • BitVector (N=1): size() uses len field, dims is ignored
  • BitArray{N>1}: size() uses dims field
  • All BitArrays share chunks with the pool's backing BitVector
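The chunks-sharing idea mirrors what Base itself does: reshape on a BitArray returns a new BitArray whose chunks field aliases the original storage, as this standalone check shows:

```julia
bv = falses(64)          # BitVector: one 64-bit chunk of storage
bm = reshape(bv, 8, 8)   # BitMatrix sharing the same chunks

bm[1, 1] = true          # a write through the matrix view...
bv[1]                    # ...is visible through the vector: true
bm.chunks === bv.chunks  # true: both wrap the same UInt64 buffer
```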

Safety

The returned BitArray is only valid within the @with_pool scope. Do NOT use after the scope ends (use-after-free risk).

AdaptiveArrayPools.get_task_local_cuda_pools (Function)
get_task_local_cuda_pools() -> Dict{Int, CuAdaptiveArrayPool}

Returns the dictionary of all CUDA pools for the current task (one per device).

Requires CUDA.jl to be loaded. Throws an error if CUDA extension is not available.

AdaptiveArrayPools.get_task_local_pool (Method)
get_task_local_pool() -> AdaptiveArrayPool

Retrieves (or creates) the AdaptiveArrayPool for the current Task.

Each Task gets its own pool instance via task_local_storage(), ensuring thread safety without locks.

AdaptiveArrayPools.get_view! (Method)
get_view!(tp::AbstractTypedPool{T}, n::Int)

Get a 1D vector view of size n from the typed pool. Returns cached view on hit (zero allocation), creates new on miss.

AdaptiveArrayPools.ones! (Method)
ones!(pool, dims...) -> view
ones!(pool, T, dims...) -> view
ones!(pool, dims::Tuple) -> view
ones!(pool, T, dims::Tuple) -> view

Acquire a one-initialized array from the pool.

Equivalent to acquire!(pool, T, dims...) followed by fill!(arr, one(T)). Default element type depends on pool backend (CPU: Float64, CUDA: Float32). See default_eltype.

Example

@with_pool pool begin
    v = ones!(pool, 100)              # Uses default_eltype(pool)
    m = ones!(pool, Float32, 10, 10)  # Matrix{Float32} view, all ones
end

See also: zeros!, similar!, acquire!

AdaptiveArrayPools.pool_stats (Method)
pool_stats(tp::AbstractTypedPool; io::IO=stdout, indent::Int=0, name::String="")

Print statistics for a TypedPool or BitTypedPool.

AdaptiveArrayPools.pool_stats (Method)
pool_stats(pool::AdaptiveArrayPool; io::IO=stdout)

Print detailed statistics about pool usage with colored output.

Example

pool = AdaptiveArrayPool()
@with_pool pool begin
    v = acquire!(pool, Float64, 100)
    pool_stats(pool)
end
AdaptiveArrayPools.pool_stats (Method)
pool_stats(; io::IO=stdout)

Print statistics for all task-local pools (CPU and CUDA if loaded).

Example

@with_pool pool begin
    v = acquire!(pool, Float64, 100)
    pool_stats()  # Shows all pool stats
end
AdaptiveArrayPools.pooling_enabled (Method)
pooling_enabled(pool) -> Bool

Returns true if pool is an active pool, false if pooling is disabled.

Examples

@maybe_with_pool pool begin
    if pooling_enabled(pool)
        # Using pooled memory
    else
        # Using standard allocation
    end
end

See also: DisabledPool

AdaptiveArrayPools.reset! (Method)
reset!(tp::AbstractTypedPool)

Reset state without clearing allocated storage. Sets n_active = 0 and restores checkpoint stacks to sentinel state.

AdaptiveArrayPools.reset! (Method)
reset!(pool::AdaptiveArrayPool)

Reset pool state without clearing allocated storage.

This function:

  • Resets all n_active counters to 0
  • Restores all checkpoint stacks to sentinel state
  • Resets _current_depth and type touch tracking state

Unlike empty!, this preserves all allocated vectors, views, and N-D arrays for reuse, avoiding reallocation costs.

Use Case

When functions that acquire from the pool are called without proper checkpoint!/rewind! management, n_active can grow indefinitely. Use reset! to cleanly restore the pool to its initial state while keeping allocated memory available.

Example

pool = AdaptiveArrayPool()

# Some function that acquires without checkpoint management
function compute!(pool)
    v = acquire!(pool, Float64, 100)
    # ... use v ...
    # No rewind! called
end

for _ in 1:1000
    compute!(pool)  # n_active grows each iteration
end

reset!(pool)  # Restore state, keep allocated memory
# Now pool.n_active == 0, but vectors are still available for reuse

See also: empty!, rewind!

AdaptiveArrayPools.rewind! (Method)
rewind!(pool::AdaptiveArrayPool, types::Type...)

Restore state for multiple specific types in reverse order. Decrements _current_depth once after all types are rewound.

AdaptiveArrayPools.rewind! (Method)
rewind!(pool::AdaptiveArrayPool)

Restore the pool state (n_active counters) from internal stacks. Uses _checkpoint_depths to accurately determine which entries to pop vs restore.

Only the counters are restored; allocated memory remains for reuse. Handles touched types by checking _checkpoint_depths for accurate restoration.

Safety: If called at global scope (depth=1, no pending checkpoints), automatically delegates to reset! to safely clear all n_active counters.

See also: checkpoint!, reset!, @with_pool

AdaptiveArrayPools.rewind! (Method)
rewind!(pool::AdaptiveArrayPool, ::Type{T})

Restore state for a specific type only. Also updates _current_depth and bitmask state.

AdaptiveArrayPools.safe_prod (Method)
safe_prod(dims::NTuple{N, Int}) -> Int

Compute the product of dimensions with overflow checking.

Throws OverflowError if the product exceeds typemax(Int), preventing memory corruption from integer overflow in unsafe_wrap operations.

Rationale

Without overflow checking, large dimensions like (10^10, 10^10) would wrap around to a small value, causing unsafe_wrap to create an array view that indexes beyond allocated memory.

Performance

Adds ~0.3-1.2 ns overhead (<1%) compared to unchecked prod(), which is negligible relative to the 100-200 ns cost of the full allocation path.

AdaptiveArrayPools.set_cache_ways! (Method)
set_cache_ways!(n::Int)

Set the number of cache ways for N-D array caching. Requires Julia restart to take effect.

Higher values reduce cache eviction but increase memory usage per slot.

Arguments

  • n::Int: Number of cache ways (valid range: 1-16)

Example

using AdaptiveArrayPools
AdaptiveArrayPools.set_cache_ways!(8)  # Double the default
# Restart Julia to apply the change
AdaptiveArrayPools.similar! (Method)
similar!(pool, array) -> view
similar!(pool, array, T) -> view
similar!(pool, array, dims...) -> view
similar!(pool, array, T, dims...) -> view

Acquire an uninitialized array from the pool, using a template array for defaults.

  • similar!(pool, A): same element type and size as A
  • similar!(pool, A, T): element type T, same size as A
  • similar!(pool, A, dims...): same element type as A, specified dimensions
  • similar!(pool, A, T, dims...): element type T, specified dimensions

Example

A = rand(10, 10)
@with_pool pool begin
    B = similar!(pool, A)              # Same type and size
    C = similar!(pool, A, Float32)     # Float32, same size
    D = similar!(pool, A, 5, 5)        # Same type, different size
    E = similar!(pool, A, Int, 20)     # Int, 1D
end

See also: zeros!, ones!, acquire!

AdaptiveArrayPools.trues! (Method)
trues!(pool, dims...) -> BitArray
trues!(pool, dims::Tuple) -> BitArray

Acquire a bit-packed boolean array filled with true from the pool.

Equivalent to Julia's trues(dims...) but using pooled memory. Uses ~8x less memory than ones!(pool, Bool, dims...).

Example

@with_pool pool begin
    bv = trues!(pool, 100)        # BitVector, all true
    bm = trues!(pool, 10, 10)     # BitMatrix, all true
end

See also: falses!, ones!, acquire!

AdaptiveArrayPools.unsafe_acquire! (Method)
unsafe_acquire!(pool, x::AbstractArray) -> Array

Acquire a raw array with the same element type and size as x (similar to similar(x)).

Example

A = rand(10, 10)
@with_pool pool begin
    B = unsafe_acquire!(pool, A)  # Matrix{Float64}, same size as A
    B .= A .* 2
end
AdaptiveArrayPools.unsafe_acquire! (Method)
unsafe_acquire!(pool, Type{T}, n) -> backend's native array type
unsafe_acquire!(pool, Type{T}, dims...) -> backend's native array type
unsafe_acquire!(pool, Type{T}, dims::NTuple{N,Int}) -> backend's native array type

Acquire a native array backed by pool memory.

Returns the backend's native array type:

  • CPU: Array{T,N} (via unsafe_wrap)
  • Bit (T === Bit): BitVector / BitArray{N} (chunks-sharing; equivalent to acquire!)
  • CUDA: CuArray{T,N} (via unified view cache)

For CPU pools, since Array instances are mutable references, cached instances can be returned directly without creating new wrapper objects—ideal for type-unspecified paths. For CUDA pools, this delegates to the same unified N-way cache as acquire!.
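The unsafe_wrap trick on the CPU path can be seen in isolation: wrapping a vector's memory as a matrix creates an Array header over the same storage. Here GC.@preserve stands in for the pool's role of keeping the backing vector alive:

```julia
buf = Vector{Float64}(undef, 100)   # backing storage (the pool's role)
GC.@preserve buf begin
    # A is a 10x10 Matrix{Float64} header over buf's memory; no copy.
    A = unsafe_wrap(Array, pointer(buf), (10, 10))
    A[1, 1] = 3.0
    buf[1]                          # 3.0: A and buf share memory
end
```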

Safety Warning

The returned array is only valid within the @with_pool scope. Using it after the scope ends leads to undefined behavior (use-after-free, data corruption).

Do NOT call resize!, push!, or append! on returned arrays - this causes undefined behavior as the memory is owned by the pool.

When to Use

  • Type-unspecified paths: Struct fields without concrete type parameters (e.g., _pooled_chain::PooledChain instead of _pooled_chain::PooledChain{M})
  • FFI calls expecting raw pointers
  • APIs that strictly require native array types

Allocation Behavior

  • CPU: Cache hit 0 bytes, cache miss ~112 bytes (Array header via unsafe_wrap)
  • CUDA: Cache hit ~0 bytes, cache miss ~80 bytes (CuArray wrapper creation)

Example

@with_pool pool begin
    A = unsafe_acquire!(pool, Float64, 100, 100)  # Matrix{Float64} (CPU) or CuMatrix{Float64} (CUDA)
    B = unsafe_acquire!(pool, Float64, 100, 100)
    C = similar(A)  # Regular allocation for result
    mul!(C, A, B)   # BLAS uses A, B directly
end
# A and B are INVALID after this point!

See also: acquire! for view-based access.

AdaptiveArrayPools.unsafe_ones! (Method)
unsafe_ones!(pool, dims...) -> Array
unsafe_ones!(pool, T, dims...) -> Array
unsafe_ones!(pool, dims::Tuple) -> Array
unsafe_ones!(pool, T, dims::Tuple) -> Array

Acquire a one-initialized raw array (not a view) from the pool.

Equivalent to unsafe_acquire!(pool, T, dims...) followed by fill!(arr, one(T)). Default element type depends on pool backend (CPU: Float64, CUDA: Float32). See default_eltype.

Example

@with_pool pool begin
    v = unsafe_ones!(pool, 100)              # Uses default_eltype(pool)
    m = unsafe_ones!(pool, Float32, 10, 10)  # Array{Float32}, all ones
end

See also: unsafe_zeros!, ones!, unsafe_acquire!

AdaptiveArrayPools.unsafe_similar! (Method)
unsafe_similar!(pool, array) -> Array
unsafe_similar!(pool, array, T) -> Array
unsafe_similar!(pool, array, dims...) -> Array
unsafe_similar!(pool, array, T, dims...) -> Array

Acquire an uninitialized raw array (not a view) from the pool, using a template array for defaults.

  • unsafe_similar!(pool, A): same element type and size as A
  • unsafe_similar!(pool, A, T): element type T, same size as A
  • unsafe_similar!(pool, A, dims...): same element type as A, specified dimensions
  • unsafe_similar!(pool, A, T, dims...): element type T, specified dimensions

Example

A = rand(10, 10)
@with_pool pool begin
    B = unsafe_similar!(pool, A)              # Same type and size, raw array
    C = unsafe_similar!(pool, A, Float32)     # Float32, same size
    D = unsafe_similar!(pool, A, 5, 5)        # Same type, different size
end

See also: similar!, unsafe_acquire!

AdaptiveArrayPools.unsafe_zeros! (Method)
unsafe_zeros!(pool, dims...) -> Array
unsafe_zeros!(pool, T, dims...) -> Array
unsafe_zeros!(pool, dims::Tuple) -> Array
unsafe_zeros!(pool, T, dims::Tuple) -> Array

Acquire a zero-initialized raw array (not a view) from the pool.

Equivalent to unsafe_acquire!(pool, T, dims...) followed by fill!(arr, zero(T)). Default element type depends on pool backend (CPU: Float64, CUDA: Float32). See default_eltype.

Example

@with_pool pool begin
    v = unsafe_zeros!(pool, 100)              # Uses default_eltype(pool)
    m = unsafe_zeros!(pool, Float32, 10, 10)  # Array{Float32}, all zeros
end

See also: unsafe_ones!, zeros!, unsafe_acquire!

AdaptiveArrayPools.zeros! (Method)
zeros!(pool, dims...) -> view
zeros!(pool, T, dims...) -> view
zeros!(pool, dims::Tuple) -> view
zeros!(pool, T, dims::Tuple) -> view

Acquire a zero-initialized array from the pool.

Equivalent to acquire!(pool, T, dims...) followed by fill!(arr, zero(T)). Default element type depends on pool backend (CPU: Float64, CUDA: Float32). See default_eltype.

Example

@with_pool pool begin
    v = zeros!(pool, 100)              # Uses default_eltype(pool)
    m = zeros!(pool, Float32, 10, 10)  # Matrix{Float32} view, all zeros
end

See also: ones!, similar!, acquire!

source
Base.empty!Method
empty!(tp::BitTypedPool)

Clear all internal storage for BitTypedPool, releasing all memory, and restore the sentinel values used by the 1-based sentinel pattern.

source
Base.empty!Method
empty!(tp::TypedPool)

Clear all internal storage for TypedPool, releasing all memory, and restore the sentinel values used by the 1-based sentinel pattern.

source
Base.empty!Method
empty!(pool::AdaptiveArrayPool)

Completely clear the pool, releasing all stored vectors and resetting all state.

This is useful when you want to free memory or start fresh without creating a new pool instance.

Example

pool = AdaptiveArrayPool()
v = acquire!(pool, Float64, 1000)
# ... use v ...
empty!(pool)  # Release all memory

Warning

Any SubArrays previously acquired from this pool become invalid after empty!.

source
AdaptiveArrayPools.AdaptiveArrayPoolType
AdaptiveArrayPool

Multi-type memory pool with fixed slots for common types and IdDict fallback for others. Zero allocation after warmup. NOT thread-safe - use one pool per Task.
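
Example

A minimal sketch of typical usage, built from functions documented elsewhere in this reference (acquire!, checkpoint!, rewind!):

```julia
using AdaptiveArrayPools

pool = AdaptiveArrayPool()          # create once; use one pool per Task (not thread-safe)
checkpoint!(pool)                   # save the current stack pointer
v = acquire!(pool, Float64, 1000)   # first call allocates; later calls reuse memory
fill!(v, 0.0)
rewind!(pool)                       # free everything acquired since the checkpoint
```

In most code the @with_pool macro is preferred, since it performs the checkpoint!/rewind! pair automatically.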

source
AdaptiveArrayPools.BackendNotLoadedErrorType
BackendNotLoadedError <: Exception

Error thrown when a backend-specific operation is attempted but the backend package is not loaded.

Example

@maybe_with_pool :cuda pool begin
    zeros!(pool, 10)  # Throws if CUDA.jl not loaded
end

source
AdaptiveArrayPools.BitType
Bit

Sentinel type for bit-packed boolean storage via BitVector.

Use Bit instead of Bool in pool operations to get memory-efficient bit-packed arrays (1 bit per element vs 1 byte for Vector{Bool}).

Usage

@with_pool pool begin
    # BitVector (1 bit per element, ~8x memory savings)
    bv = acquire!(pool, Bit, 1000)

    # vs Vector{Bool} (1 byte per element)
    vb = acquire!(pool, Bool, 1000)

    # Convenience functions work too
    mask = falses!(pool, 100)       # BitVector filled with false
    flags = trues!(pool, 100)       # BitVector filled with true
end

Return Types (Unified for Performance)

Unlike other types, Bit always returns native BitVector/BitArray:

  • 1D: BitVector (both acquire! and unsafe_acquire!)
  • N-D: BitArray{N} (reshaped, preserves SIMD optimization)

This design ensures users always get SIMD-optimized performance without needing to remember which API to use.

Performance

BitVector operations such as count(), sum(), and bitwise operations are roughly 10x to 100x faster than the equivalent operations on SubArray{Bool} because they use SIMD-optimized algorithms on packed 64-bit chunks.

@with_pool pool begin
    bv = acquire!(pool, Bit, 10000)
    fill!(bv, true)
    count(bv)  # Uses fast SIMD path automatically
end

Memory Safety

The returned BitVector shares its internal chunks array with the pool. It is only valid within the @with_pool scope - using it after the scope ends leads to undefined behavior (use-after-free risk).

See also: trues!, falses!, BitTypedPool

source
AdaptiveArrayPools.BitTypedPoolType
BitTypedPool <: AbstractTypedPool{Bool, BitVector}

Specialized pool for BitVector arrays with memory reuse.

Unlike TypedPool{Bool} which stores Vector{Bool} (1 byte per element), this pool stores BitVector (1 bit per element, ~8x memory efficiency).

Unified API (Always Returns BitVector)

Unlike other types, both acquire! and unsafe_acquire! return BitVector for the Bit type. This design ensures users always get SIMD-optimized performance without needing to choose between APIs.

  • acquire!(pool, Bit, n)BitVector (SIMD optimized)
  • unsafe_acquire!(pool, Bit, n)BitVector (same behavior)
  • trues!(pool, n)BitVector filled with true
  • falses!(pool, n)BitVector filled with false

Fields

  • vectors: Backing BitVector storage
  • nd_arrays: Cached wrapper BitVectors (sharing the underlying chunks)
  • nd_dims: Cached lengths for wrapper cache validation
  • nd_ptrs: Cached chunk pointers for invalidation detection
  • nd_next_way: Round-robin counter for N-way cache
  • n_active: Count of currently active arrays
  • _checkpoint_*: State management stacks (1-based sentinel pattern)

Usage

@with_pool pool begin
    # All return BitVector with SIMD performance
    bv = acquire!(pool, Bit, 100)              # BitVector
    count(bv)                                  # Fast SIMD path

    # Convenience functions
    t = trues!(pool, 50)                       # BitVector filled with true
    f = falses!(pool, 50)                      # BitVector filled with false
end

Performance

Operations such as count(), sum(), and bitwise operations are roughly 10x to 100x faster than the equivalent operations on SubArray{Bool} because BitVector uses SIMD-optimized algorithms on packed 64-bit chunks.

See also: trues!, falses!, Bit

source
AdaptiveArrayPools.DisabledPoolType
DisabledPool{Backend}

Sentinel type for disabled pooling that preserves backend context. When USE_POOLING=false (compile-time) or MAYBE_POOLING_ENABLED[]=false (runtime), macros return DisabledPool{backend}() instead of nothing.

Backend symbols:

  • :cpu - Standard Julia arrays
  • :cuda - CUDA.jl CuArrays (defined in extension)

This enables @with_pool :cuda to return correct array types even when pooling is off.

Example

# When USE_POOLING=false:
@with_pool :cuda pool begin
    v = zeros!(pool, 10)  # Returns CuArray{Float32}, not Array{Float64}!
end

See also: pooling_enabled, DISABLED_CPU

source
AdaptiveArrayPools.TypedPoolType
TypedPool{T} <: AbstractTypedPool{T, Vector{T}}

Internal structure managing pooled vectors for a specific element type T.

Fields

Storage

  • vectors: Backing Vector{T} storage (actual memory allocation)

1D Cache (for acquire!(pool, T, n))

  • views: Cached SubArray views for zero-allocation 1D access
  • view_lengths: Cached lengths for fast Int comparison (SoA pattern)

N-D Array Cache (for unsafe_acquire! only, N-way set associative)

  • nd_arrays: Cached N-D Array objects (length = slots × CACHE_WAYS)
  • nd_dims: Cached dimension tuples for cache hit validation
  • nd_ptrs: Cached pointer values to detect backing vector resize
  • nd_next_way: Round-robin counter per slot (length = slots)

State Management (1-based sentinel pattern)

  • n_active: Count of currently active (checked-out) arrays
  • _checkpoint_n_active: Saved n_active values at each checkpoint (sentinel: [0])
  • _checkpoint_depths: Depth of each checkpoint entry (sentinel: [0])

Note

acquire! for N-D returns ReshapedArray (zero creation cost), so no caching needed. Only unsafe_acquire! benefits from N-D caching since unsafe_wrap allocates 112 bytes.
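
Example

A short sketch of the distinction described above; the return types follow the acquire!/unsafe_acquire! contract documented in this reference:

```julia
@with_pool pool begin
    m  = acquire!(pool, Float64, 4, 4)         # ReshapedArray view; created on the fly, no caching needed
    m2 = unsafe_acquire!(pool, Float64, 4, 4)  # native Matrix{Float64}; served from the N-D cache
end
```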

source
AdaptiveArrayPools.CACHE_WAYSConstant
CACHE_WAYS

Number of cache ways per slot for N-way set associative cache. Supports up to CACHE_WAYS different dimension patterns per slot without thrashing.

Default: 4 (handles most use cases well)

Configuration

using AdaptiveArrayPools
AdaptiveArrayPools.set_cache_ways!(8)  # Restart Julia to take effect

Or manually in LocalPreferences.toml:

[AdaptiveArrayPools]
cache_ways = 8

Valid range: 1-16 (higher values increase memory but reduce eviction)

source
AdaptiveArrayPools.FIXED_SLOT_FIELDSConstant
FIXED_SLOT_FIELDS

Field names for fixed slot TypedPools. Single source of truth for foreach_fixed_slot.

When modifying, also update: struct definition, get_typed_pool! dispatches, constructor. Tests verify synchronization automatically.

source
AdaptiveArrayPools.MAYBE_POOLING_ENABLEDConstant
MAYBE_POOLING_ENABLED

Runtime flag for @maybe_with_pool macro only. When false, @maybe_with_pool will use nothing as the pool, causing acquire! to allocate normally.

Note: This only affects @maybe_with_pool. @with_pool ignores this flag (always uses pooling).

For complete removal of pooling overhead at compile time, use USE_POOLING instead.

Default: true
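
Example

A sketch of toggling the flag at runtime (assumes the default CPU backend):

```julia
AdaptiveArrayPools.MAYBE_POOLING_ENABLED[] = false
@maybe_with_pool pool begin
    v = zeros!(pool, 10)   # pooling disabled: falls back to a normal allocation
end
AdaptiveArrayPools.MAYBE_POOLING_ENABLED[] = true
```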

source
AdaptiveArrayPools.POOL_DEBUGConstant
POOL_DEBUG

When true, @with_pool macros validate that returned values don't reference pool memory (which would be unsafe).

Default: false
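
Example

A sketch of the escape pattern this flag is meant to catch: returning pool-backed memory from the scope is unsafe, since the pool rewinds when the scope ends (see the empty! warning above):

```julia
result = @with_pool pool begin
    v = acquire!(pool, Float64, 10)
    copy(v)   # safe: copy the data out; returning `v` itself would reference pool memory
end
```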

source
AdaptiveArrayPools.USE_POOLINGConstant
USE_POOLING::Bool

Compile-time constant (master switch) to completely disable pooling. When false, all macros (@with_pool, @maybe_with_pool) generate code that uses nothing as the pool, causing acquire! to fall back to normal allocation.

This enables zero-overhead when pooling is disabled, as the compiler can eliminate all pool-related code paths.

Configuration via Preferences.jl

Set in your project's LocalPreferences.toml:

[AdaptiveArrayPools]
use_pooling = false

Or programmatically (requires restart):

using Preferences
Preferences.set_preferences!(AdaptiveArrayPools, "use_pooling" => false)

Default: true

source