11. Troubleshooting and Diagnostics
Build/configuration failures
verify CMake summary matches intended backend flags
disable unavailable optional backends explicitly
reconfigure from clean build directory after major flag changes
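One way to check the configured flags against what was intended is to read the build directory's CMakeCache.txt, where CMake records entries as NAME:TYPE=VALUE. The following is a minimal sketch, not part of the project's tooling. The option names (EXAMPLE_ENABLE_MPI, EXAMPLE_ENABLE_CUDA) are hypothetical placeholders; substitute the names your CMake summary actually prints.

```python
def parse_cmake_cache(text):
    """Parse CMakeCache.txt content into a {name: value} dict."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and the two comment styles CMake writes into the cache.
        if not line or line.startswith("#") or line.startswith("//"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        name = key.split(":", 1)[0]  # drop the ":TYPE" suffix
        entries[name] = value
    return entries


def is_on(value):
    """Treat CMake's usual true-like spellings as enabled."""
    return value.strip().upper() in {"ON", "1", "TRUE", "YES", "Y"}


def flag_mismatches(cache, intended):
    """Return {flag: (intended, actual)} for flags that differ from intent."""
    mismatches = {}
    for flag, want in intended.items():
        # A flag absent from the cache is treated as disabled.
        have = is_on(cache.get(flag, "OFF"))
        if have != want:
            mismatches[flag] = (want, have)
    return mismatches


if __name__ == "__main__":
    # Hypothetical option names, used for illustration only.
    sample = """\
# This is the CMakeCache file.
//Enable the MPI backend
EXAMPLE_ENABLE_MPI:BOOL=ON
//Enable the CUDA backend
EXAMPLE_ENABLE_CUDA:BOOL=OFF
"""
    intended = {"EXAMPLE_ENABLE_MPI": True, "EXAMPLE_ENABLE_CUDA": True}
    print(flag_mismatches(parse_cmake_cache(sample), intended))
```

In practice the text would come from the real cache file in the build directory. A non-empty result after changing flags suggests a stale cache, which is the case where reconfiguring from a clean build directory helps.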
Runtime capability mismatches
Symptoms:
backend unavailable errors at runtime
unexpected single-rank behavior
Actions:
print capability checks at startup (has_mpi/has_cuda/...)
verify runtime environment and launch mode
verify code path is backend-gated correctly
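The startup capability print mentioned above can be sketched as follows. This is an illustration under assumptions: the guide names has_mpi and has_cuda but does not say whether they are functions or plain attributes, so the helper accepts either. The SimpleNamespace object is a stand-in for the real binding module.

```python
import sys
from types import SimpleNamespace


def capability_report(lib, names=("has_mpi", "has_cuda")):
    """Return {name: True/False/None}; None means `lib` has no such check."""
    report = {}
    for name in names:
        value = getattr(lib, name, None)
        if value is None:
            report[name] = None
        elif callable(value):
            report[name] = bool(value())
        else:
            report[name] = bool(value)
    return report


def print_capabilities(lib, names=("has_mpi", "has_cuda"), stream=None):
    """Print one line per capability so the startup log records the runtime view."""
    out = sys.stderr if stream is None else stream
    report = capability_report(lib, names)
    for name, available in report.items():
        if available is None:
            state = "check not found"
        else:
            state = "yes" if available else "no"
        print(f"[capabilities] {name}: {state}", file=out, flush=True)
    return report


if __name__ == "__main__":
    # Stand-in object for illustration only; pass the real binding module instead.
    fake_lib = SimpleNamespace(has_mpi=lambda: True, has_cuda=False)
    print_capabilities(fake_lib)
```

Comparing this output between a working and a failing launch is a quick way to tell a missing backend from an incorrect launch mode.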
Collective hangs or deadlocks
Common causes:
mismatched collective participation
diverging control flow by rank
rank-specific early return before collective
Debug steps:
add rank-scoped logging around collective boundaries
minimize test case to smallest reproducer
verify every rank reaches each collective call
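The debug steps above can be sketched as a small tracing helper plus a log check. This is an assumption-laden sketch, not library functionality: trace_collective, ranks_missing, and the log line format are hypothetical names chosen here. The rank is read from environment variables set by common launchers (Open MPI, MPICH/Hydra, Slurm); pass it explicitly if your bindings expose it.

```python
import os
import re
import sys
from contextlib import contextmanager

# Rank variables exported by common launchers (Open MPI, MPICH/Hydra, Slurm).
_RANK_ENV_VARS = ("OMPI_COMM_WORLD_RANK", "PMI_RANK", "SLURM_PROCID")


def detect_rank(environ=None):
    """Best-effort rank lookup from the launcher environment; 0 if not found."""
    env = os.environ if environ is None else environ
    for var in _RANK_ENV_VARS:
        if var in env:
            return int(env[var])
    return 0


@contextmanager
def trace_collective(name, rank, stream=None):
    """Log entry and exit around a collective call, flushing immediately.

    `name` must not contain whitespace. The exit line is written only when
    the wrapped call returns normally.
    """
    out = sys.stderr if stream is None else stream
    print(f"[rank {rank}] enter {name}", file=out, flush=True)
    yield
    print(f"[rank {rank}] exit {name}", file=out, flush=True)


_LINE = re.compile(r"\[rank (\d+)\] (enter|exit) (\S+)")


def ranks_missing(log_lines, name, world_size, event="enter"):
    """Return the sorted ranks that never logged `event` for collective `name`."""
    seen = set()
    for line in log_lines:
        match = _LINE.search(line)
        if match and match.group(2) == event and match.group(3) == name:
            seen.add(int(match.group(1)))
    return sorted(set(range(world_size)) - seen)


if __name__ == "__main__":
    rank = detect_rank()
    with trace_collective("barrier#1", rank):
        pass  # the real collective call goes here

    # Merged logs from a hypothetical 3-rank run where rank 1 returned early.
    merged = ["[rank 0] enter barrier#1", "[rank 2] enter barrier#1"]
    print(ranks_missing(merged, "barrier#1", world_size=3))
```

Ranks that never log "enter" point to diverging control flow or an early return. Ranks that log "enter" but never "exit" are the ones blocked waiting for the others.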
Binding-specific issues
C ABI
validate handles before use
ensure matching create/destroy pairs
Python
ensure correct PYTHONPATH for local extension builds
verify _dtl extension matches active Python interpreter
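A quick way to check the interpreter match is to compare the extension's filename suffix with the suffixes the running interpreter accepts, which the standard library exposes as importlib.machinery.EXTENSION_SUFFIXES. The sketch below assumes the built file sits directly in a sys.path directory and is named _dtl plus a platform suffix; if it lives inside a package directory, pass that directory in search_paths instead.

```python
import os
import sys
from importlib.machinery import EXTENSION_SUFFIXES


def extension_suffix(filename):
    """Return everything from the first dot of the basename onward."""
    base = os.path.basename(filename)
    _stem, dot, rest = base.partition(".")
    return dot + rest


def matches_interpreter(filename):
    """True if the running interpreter accepts a file with this suffix."""
    return extension_suffix(filename) in EXTENSION_SUFFIXES


def find_extension_files(stem, search_paths=None):
    """List files named '<stem>.<something>' on the given paths (default: sys.path)."""
    found = []
    for directory in (sys.path if search_paths is None else search_paths):
        directory = directory or "."  # an empty sys.path entry means the current directory
        if not os.path.isdir(directory):
            continue
        for entry in sorted(os.listdir(directory)):
            if entry.startswith(stem + "."):
                found.append(os.path.join(directory, entry))
    return found


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("accepted suffixes:", EXTENSION_SUFFIXES)
    for path in find_extension_files("_dtl"):
        status = "ok" if matches_interpreter(path) else "built for another interpreter?"
        print(f"{path}: {status}")
```

If nothing is listed, the build directory is probably missing from PYTHONPATH. If a file is listed but flagged, it was likely built against a different Python version or ABI.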
Fortran
verify bind(c) signatures and type mapping
confirm explicit destroy calls for handles
Diagnostics best practices
include rank, backend, and context info in logs
keep deterministic reproducer scripts for failures
separate environment/setup issues from library contract issues
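As one possible way to follow the first practice, the sketch below uses Python's standard logging module to stamp rank, backend, and a per-call context string on every line. The names make_diag_logger and DiagAdapter, the field names, and the line format are all hypothetical choices made here, not part of the library.

```python
import logging


class DiagAdapter(logging.LoggerAdapter):
    """Merge per-call `extra` fields over the adapter's fixed rank/backend fields."""

    def process(self, msg, kwargs):
        extra = dict(self.extra)
        extra.update(kwargs.get("extra") or {})
        kwargs["extra"] = extra
        return msg, kwargs


def make_diag_logger(rank, backend, stream=None, name="diag"):
    """Return a logger that prints rank, backend, and context on every line."""
    logger = logging.getLogger(f"{name}.rank{rank}")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep output in one place, not duplicated by the root logger
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
        handler.setFormatter(logging.Formatter(
            "%(levelname)s rank=%(rank)s backend=%(backend)s ctx=%(ctx)s %(message)s"
        ))
        logger.addHandler(handler)
    # "-" is the placeholder context when a call supplies none.
    return DiagAdapter(logger, {"rank": rank, "backend": backend, "ctx": "-"})


if __name__ == "__main__":
    log = make_diag_logger(rank=2, backend="mpi")
    log.info("startup")
    log.error("allreduce failed", extra={"ctx": "step=12"})
```

Keeping the fields in a fixed key=value layout makes logs from different ranks easy to merge and filter when building a reproducer.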
Additional resources
docs/user_guide/troubleshooting.md
docs/process/known_issues_workflow.md