3. Environment, Context, and Backends

Lifecycle model

DTL runtime usage generally follows this sequence:

initialize or create the environment
create one or more contexts
run container, view, and algorithm workloads
finalize and clean up
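
The ordering above can be sketched with scope-based lifetimes. This is a minimal illustration, not DTL's actual API: SketchEnvironment, SketchContext, run_workload, and run_lifecycle are hypothetical stand-ins, and the sketch assumes that cleanup happens in reverse order of creation when objects leave scope.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins, not the real DTL types.
// Each step appends an event so the resulting order can be inspected.
static std::vector<std::string> g_events;

class SketchEnvironment {
public:
    SketchEnvironment() { g_events.push_back("env:init"); }
    ~SketchEnvironment() { g_events.push_back("env:finalize"); }
    SketchEnvironment(const SketchEnvironment&) = delete;
    SketchEnvironment& operator=(const SketchEnvironment&) = delete;
};

class SketchContext {
public:
    explicit SketchContext(const SketchEnvironment&) { g_events.push_back("ctx:create"); }
    ~SketchContext() { g_events.push_back("ctx:destroy"); }
};

void run_workload(const SketchContext&) { g_events.push_back("work:run"); }

// Runs one full lifecycle and returns the recorded event order.
std::vector<std::string> run_lifecycle() {
    g_events.clear();
    {
        SketchEnvironment env;    // 1. initialize/create environment
        SketchContext ctx(env);   // 2. create context(s)
        run_workload(ctx);        // 3. run workloads
    }                             // 4. cleanup: context first, then environment
    return g_events;
}
```

Tying the context's lifetime to a scope nested inside the environment's scope makes it hard to finalize the environment while a context still refers to it.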

Environment responsibilities

The environment manages the runtime lifecycle and tracks backend capabilities. It is the point at which backend availability (MPI, GPU, and so on) is resolved, so capability queries should go through it.

Typical capability queries on the environment:

has_mpi()
has_cuda()
has_hip()
has_nccl()
has_shmem()
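
As an illustration of how these queries might be consumed, the sketch below collects the names of the available backends, for example for a startup log line. The has_*() names are taken from the list above; SketchCapabilities and available_backends are hypothetical, and the struct only imitates an environment so the example is self-contained.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in: it exposes the has_*() names listed above,
// but it is not the real DTL environment type.
struct SketchCapabilities {
    bool mpi = false;
    bool cuda = false;
    bool hip = false;
    bool nccl = false;
    bool shmem = false;

    bool has_mpi() const { return mpi; }
    bool has_cuda() const { return cuda; }
    bool has_hip() const { return hip; }
    bool has_nccl() const { return nccl; }
    bool has_shmem() const { return shmem; }
};

// Returns the names of all backends the environment reports as available.
// Written as a template so it accepts any type with these five queries.
template <typename Env>
std::vector<std::string> available_backends(const Env& env) {
    std::vector<std::string> names;
    if (env.has_mpi())   names.push_back("mpi");
    if (env.has_cuda())  names.push_back("cuda");
    if (env.has_hip())   names.push_back("hip");
    if (env.has_nccl())  names.push_back("nccl");
    if (env.has_shmem()) names.push_back("shmem");
    return names;
}
```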

Context responsibilities

A context encapsulates the active domains and the communication identity under which operations run. Pass the context explicitly to distributed containers and collectives so that each one has a clear, visible dependency on it.

Typical context queries:

rank and size
root status
device affinity / device id
validity checks
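
The queries above might look roughly like the following. This is a sketch under stated assumptions rather than DTL's real context type: SketchContext and describe are hypothetical, rank 0 is assumed to be the root, and a device id of -1 is assumed to mean that no device is bound.

```cpp
#include <string>

// Hypothetical stand-in for a context, not the real DTL type.
struct SketchContext {
    int rank_value = 0;
    int size_value = 1;
    int device = -1;  // assumption: -1 means "no device bound"

    int rank() const { return rank_value; }
    int size() const { return size_value; }
    bool is_root() const { return rank_value == 0; }  // assumption: rank 0 is root
    bool has_device() const { return device >= 0; }
    int device_id() const { return device; }

    // A context is considered valid when the rank lies in [0, size).
    bool valid() const {
        return size_value >= 1 && rank_value >= 0 && rank_value < size_value;
    }
};

// Builds a short human-readable summary, checking validity first.
std::string describe(const SketchContext& ctx) {
    if (!ctx.valid()) return "invalid context";
    std::string out = "rank " + std::to_string(ctx.rank()) +
                      " of " + std::to_string(ctx.size());
    if (ctx.is_root()) out += " (root)";
    if (ctx.has_device()) {
        out += ", device " + std::to_string(ctx.device_id());
    } else {
        out += ", host only";
    }
    return out;
}
```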

Backend-aware programming guidance

Check capabilities before selecting a backend-specific path, and branch on the result.
Keep host fallback paths for environments where GPU backends are unavailable.
Treat MPI participation as a collective contract: every participating rank must reach the same collective call, so a collective must not be made optional on some call paths.

Single-rank and non-MPI mode

DTL supports non-MPI execution for local development and many workflows. In this mode:

rank is effectively 0
size is effectively 1
collective APIs generally degenerate to local behavior

This mode is useful for unit tests and local correctness development.
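
To make the degenerate behavior concrete, the sketch below shows what collectives reduce to when there is exactly one participant. LocalContext, all_reduce_sum, all_gather, and global_mean are hypothetical names, not DTL functions; the point is only that code written against a collective interface can still run and be tested locally.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical single-rank context: rank is 0 and size is 1.
struct LocalContext {
    int rank() const { return 0; }
    int size() const { return 1; }
};

// With one participant, a sum over all ranks is just the local value.
template <typename T>
T all_reduce_sum(const LocalContext&, const T& local) {
    return local;
}

// With one participant, gathering yields a one-element vector.
template <typename T>
std::vector<T> all_gather(const LocalContext& ctx, const T& local) {
    return std::vector<T>(static_cast<std::size_t>(ctx.size()), local);
}

// An algorithm expressed through collectives; in single-rank mode it
// computes the ordinary local mean.
double global_mean(const LocalContext& ctx, const std::vector<double>& local) {
    const double sum =
        all_reduce_sum(ctx, std::accumulate(local.begin(), local.end(), 0.0));
    const double count = all_reduce_sum(ctx, static_cast<double>(local.size()));
    return count > 0.0 ? sum / count : 0.0;
}
```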

MPI/GPU domain composition

Advanced flows may compose contexts with additional domains (e.g., CUDA/NCCL). Check availability before adding a domain, and handle errors explicitly around every domain-adding operation.
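
One way to structure such a guarded operation is to return an explicit status instead of assuming success. The sketch below is illustrative only: Domain, AddStatus, SketchEnvironment, SketchContext, and try_add are hypothetical names, and the real DTL interface for adding domains may differ in both shape and error-reporting style.

```cpp
#include <set>

// Hypothetical names for illustration only.
enum class Domain { Cuda, Nccl };
enum class AddStatus { Ok, Unavailable, AlreadyPresent };

// Stand-in environment exposing two of the capability queries.
struct SketchEnvironment {
    bool cuda = false;
    bool nccl = false;
    bool has_cuda() const { return cuda; }
    bool has_nccl() const { return nccl; }
};

// Stand-in context that records which domains have been added.
class SketchContext {
public:
    // Adds a domain only if the environment reports it as available.
    // The caller must inspect the returned status.
    AddStatus try_add(Domain d, const SketchEnvironment& env) {
        const bool available =
            (d == Domain::Cuda) ? env.has_cuda() : env.has_nccl();
        if (!available) return AddStatus::Unavailable;
        if (!domains_.insert(d).second) return AddStatus::AlreadyPresent;
        return AddStatus::Ok;
    }

    bool has(Domain d) const { return domains_.count(d) != 0; }

private:
    std::set<Domain> domains_;
};
```

A caller would typically treat Unavailable as the signal to stay on the host path rather than as a fatal error.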

Operational best practices

create a context once per logical execution scope
avoid repeatedly creating and destroying contexts in hot loops
keep context validity checks at API boundaries in mixed-language stacks
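
These practices can be sketched as a boundary function that validates the context once and then reuses it for every iteration. CountingContext and run_steps are hypothetical; the pointer parameter imitates a handle passed in from another language, and the creation counter exists only to show that the loop does not construct new contexts.

```cpp
// Hypothetical stand-in that counts how many contexts were created.
struct CountingContext {
    static int created;
    bool ok = true;

    CountingContext() { ++created; }
    bool valid() const { return ok; }
};
int CountingContext::created = 0;

// Boundary function, e.g. the entry point behind a C or Python binding.
// Validates the context once, then reuses it inside the hot loop.
// Returns the number of steps run, or -1 if the context is unusable.
int run_steps(const CountingContext* ctx, int steps) {
    if (ctx == nullptr || !ctx->valid()) return -1;  // check at the boundary
    int done = 0;
    for (int i = 0; i < steps; ++i) {
        // The hot loop reuses *ctx; no context is created or destroyed here.
        ++done;
    }
    return done;
}
```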

Next step

Continue with Chapter 4 for core data model usage.