# Testing Guide

## Purpose

The test setup separates quick correctness checks from expensive benchmark runs and tutorial/notebook validation.
The goals are:

- fast PR feedback for changed code
- reproducible baselines for selected regression tests
- scheduled heavy runs (large and benchmark)
- smoke checks to keep tutorials and notebooks up to date
## Folder Structure

The tests directory is organized by intent first, then by package/backend:

- `tests/unit/`: small, focused unit tests
  - no heavy runtime expectations
  - split into `zoomy_core` and `zoomy_jax`
- `tests/regression/`: behavior/regression checks on small canonical cases
  - includes tutorial import smoke checks
  - split into `zoomy_core` and `zoomy_jax`
- `tests/benchmarks/`: expensive performance-oriented tests
  - disabled by default in local/PR runs
- `tests/scripts/`: script-style test scenarios (recent SWE v2 benchmark/check scripts)
  - imported by pytest smoke/regression tests
- `tests/results/baselines/`: tiny reference artifacts used for regression comparisons
  - created automatically if missing or when `ZOOMY_CREATE_BASELINES=1`
- `tests/common/`: shared test helpers (baseline storage, utilities)
- `tests/notebooks/`: notebook smoke list and notebook testing support files
- `tests/old/`: archived legacy tests kept for historical reference
  - excluded from active pytest discovery
- `tests/reporting/`: test-report generation scripts (HTML/JUnit)
  - notebook validation and jupytext compile checks
## Markers
Important pytest markers:
- `small`: quick tests for PR/local iteration
- `tutorial`: tutorial smoke checks (orthogonal intent marker)
- `jax`, `numpy`: backend-specific grouping
- `core`, `amrex`, `petsc`, `firedrake`: runtime/container-specific grouping
- `large`, `benchmark`: expensive tests (scheduled/manual)
Default local fast run:

```shell
pytest tests -m "small or tutorial"
```

Recommended stack-selective runs:

```shell
pytest tests -m "small and core"
pytest tests -m "small and jax"
pytest tests -m "small and amrex"
pytest tests -m "small and petsc"
pytest tests -m "small and firedrake"
```

Tutorial checks should always combine `tutorial` with a runtime marker:

```shell
pytest tests -m "small and tutorial and core"
# or:
pytest tests -m "small and tutorial and jax"
```
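Because the marker set above spans intents, backends, and runtimes, each marker has to be registered so `pytest --strict-markers` accepts it. The snippet below is an illustrative `conftest.py` sketch, not the project's actual configuration; the marker names match this guide, but the real registration may live in `pyproject.toml` or differ in wording.

```python
# conftest.py (illustrative sketch; the real project config may differ)

# Marker names taken from this guide; help texts are paraphrased.
MARKERS = {
    "small": "quick tests for PR/local iteration",
    "tutorial": "tutorial smoke checks (orthogonal intent marker)",
    "jax": "JAX backend grouping",
    "numpy": "NumPy backend grouping",
    "core": "core runtime/container grouping",
    "amrex": "AMReX runtime/container grouping",
    "petsc": "PETSc runtime/container grouping",
    "firedrake": "Firedrake runtime/container grouping",
    "large": "expensive tests (scheduled/manual)",
    "benchmark": "benchmark tests (scheduled/manual)",
}


def pytest_configure(config):
    # Registering markers here avoids PytestUnknownMarkWarning and
    # makes the suite safe to run with --strict-markers.
    for name, help_text in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {help_text}")
```

With markers registered, combined selections such as `-m "small and tutorial and core"` fail loudly on typos instead of silently selecting nothing.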
## Baseline Workflow

Some regression tests compare against compact baseline files in `tests/results/baselines`.

- If a baseline exists, the test compares current output against it.
- If the baseline is missing, it is created on first run.

To refresh baselines intentionally:

```shell
ZOOMY_CREATE_BASELINES=1 pytest tests -m small
```
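The create-if-missing / compare-if-present behavior can be sketched as a small helper. This is a hypothetical illustration, not the actual code in `tests/common`; the function name, JSON storage format, and tolerance are all assumptions, and only the `ZOOMY_CREATE_BASELINES` variable and baseline directory come from this guide.

```python
import json
import os
from pathlib import Path


def compare_or_create_baseline(name, result, root=Path("tests/results/baselines"), atol=1e-10):
    """Hypothetical baseline helper: create the baseline if absent (or when
    ZOOMY_CREATE_BASELINES=1 forces a refresh), otherwise compare against it."""
    path = root / f"{name}.json"
    refresh = os.environ.get("ZOOMY_CREATE_BASELINES") == "1"
    if refresh or not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(list(result)))
        return True  # baseline (re)created; nothing to compare yet
    baseline = json.loads(path.read_text())
    return len(result) == len(baseline) and all(
        abs(a - b) <= atol for a, b in zip(result, baseline)
    )
```

A regression test would assert the return value, so the first run seeds the baseline and later runs fail on drift.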
## CI Workflows

### Containers workflow

`.github/workflows/build-containers.yml` (workflow name: Containers)

Builds and pushes stack images to GHCR (`zoomy_core`, `zoomy_jax`, `zoomy_firedrake`, dev bases, placeholders, …) when Dockerfiles, `install/*.yml`, or selected library `pyproject.toml` files change.
### Smart test workflow

`.github/workflows/tests-report.yml` (workflow name: Smart Tests)

- Runtime: each stack job logs into GHCR, pulls the matching image, bind-mounts the repo at `/workspace`, runs `pip install -e` for the relevant `library/*` packages inside the container, then runs `tests/reporting/generate_test_report.py` (so CI exercises the checked-out tree against the image's solver stack).
- Path-aware test selection on PRs per runtime group:
  - tutorial tests run inside their runtime group job (no dedicated tutorial runtime lane)
  - runtime groups: `core`/`jax`, `amrex`, `dmplex`/`fenicsx` (split paths; `dmplex` and `firedrake` share one container), `firedrake`
  - PRs that only touch containers, `install/`, or the Containers workflow hit the infra path and run all stack jobs.
- After images: when Containers finishes successfully, Smart Tests is triggered again via `workflow_run`, checking out the same commit (`workflow_run.head_sha`) so pulls of `:latest` match the images just pushed. Pushes that match the Smart Tests path filters can still start a run in parallel; the `workflow_run` pass is the one guaranteed to see fresh GHCR tags for container-only changes.
- Scheduled and manual large/benchmark runs: one job per stack (same backends as small), merged into `test-reports-large-bundle`; manual runs have an optional large-test toggle.
- Reporting: HTML + JUnit per stack job; follow-up jobs merge stack artifacts into `test-reports-small-bundle` and `test-reports-large-bundle` so docs can download two artifacts (small vs large, each with per-stack folders). Render Webpage downloads the latest completed bundles of those names before building the book, and can also run after Smart Tests via `workflow_run` (so the usual chain is Containers → Smart Tests → Render Webpage when all succeed).
- Optional extra dependency pins for local or legacy setups live under `tests/requirements/*.txt`; the Smart Tests workflow does not use those files on the GitHub runner (dependencies come from the container images plus editable installs of the repo).
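The path-aware selection described above can be sketched as a mapping from changed files to stack jobs. Everything here is hypothetical: the group names, the glob patterns, and the infra triggers are illustrative stand-ins for the actual filters in `tests-report.yml`.

```python
from fnmatch import fnmatch

# Hypothetical path→group rules; the real workflow's filters may differ.
GROUP_PATTERNS = {
    "core": ["library/zoomy_core/*"],
    "jax": ["library/zoomy_jax/*"],
    "firedrake": ["library/zoomy_firedrake/*"],
}

# Changes here hit the "infra path" and run every stack job.
INFRA_PATTERNS = [
    "containers/*",
    "install/*",
    ".github/workflows/build-containers.yml",
]


def select_groups(changed_paths):
    """Return the set of stack jobs to run for a PR's changed files."""
    if any(fnmatch(p, pat) for p in changed_paths for pat in INFRA_PATTERNS):
        return set(GROUP_PATTERNS)  # infra change: run all stack jobs
    return {
        group
        for group, pats in GROUP_PATTERNS.items()
        for p in changed_paths
        if any(fnmatch(p, pat) for pat in pats)
    }
```

A PR touching only `library/zoomy_jax/` would then schedule only the `jax` job, while an `install/` change fans out to every stack.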
### Notebook workflow

`.github/workflows/notebooks.yml`

- PR: validate changed notebooks + jupytext temporary-conversion compile check
- schedule/manual: validate all notebooks
- optional smoke execution using `tests/notebooks/smoke_notebooks.txt`
## Notebook Policy

- Source of truth remains `.ipynb` (for docs publishing).
- No paired `.py` notebook files are committed.
- `jupytext` is used only transiently in checks:
  - convert notebook content to temporary Python text
  - run compile/syntax check
  - discard temporary files

This keeps notebook docs authoritative while still improving maintainability and CI diagnostics.
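The convert-then-compile idea can be illustrated without jupytext at all, since `.ipynb` files are plain JSON. The sketch below is a simplified stand-in for the CI check, not the actual implementation: it pulls code cells straight from notebook JSON and syntax-checks them with Python's built-in `compile`, never executing any cell.

```python
import json


def compile_check(notebook_json: str) -> list:
    """Simplified stand-in for the jupytext-based check: syntax-check each
    code cell of a notebook given as raw JSON text. Returns a list of
    (cell_index, error_message) tuples; empty means all cells parse."""
    nb = json.loads(notebook_json)
    errors = []
    for i, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown/raw cells are not syntax-checked
        source = "".join(cell.get("source", []))
        try:
            compile(source, f"cell_{i}", "exec")  # parse only, no execution
        except SyntaxError as exc:
            errors.append((i, exc.msg))
    return errors
```

Because only `compile()` is used, this catches broken syntax cheaply; actually running notebooks is left to the optional smoke execution step.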