Legacy Deep-Dive: Views
This page is retained as a detailed reference. The canonical user path is now the chaptered handbook.
Primary chapter: 05-views-iteration-and-data-access.md
Runtime and handles: Runtime and Handle Model
Detailed Reference (Legacy)
Views are the central interface layer in DTL. They expose access and iteration semantics while constraining communication and invalidation behavior.
Overview
DTL provides four view types, each serving a distinct purpose:
| View | Purpose | Communication | Iterator Category |
|---|---|---|---|
| local_view | Local-only access | Never | Random-access |
| global_view | Global logical access | On remote_ref get()/put() | N/A (returns remote_ref) |
| segmented_view | Bulk distributed iteration | Never (per-segment) | Forward (over segments) |
| remote_ref | Explicit remote element | Explicit | N/A (proxy type) |
The DTL View Philosophy
DTL follows a clear hierarchy:
Fast path: Use local_view or segmented_view for bulk operations (no communication)
Correct path: Use global_view + remote_ref for sparse remote access (explicit communication)
Forbidden path: No implicit T& for potentially remote elements (prevents hidden communication)
local_view
local_view provides STL-compatible access to locally-owned elements only.
Basic Usage
dtl::distributed_vector<double> vec(1000, size, rank);
auto local = vec.local_view();
// Direct element access
local[0] = 42.0;
double val = local[10];
// Bounds-checked access
try {
double x = local.at(999999);
} catch (const std::out_of_range& e) {
// Index out of bounds
}
// Size information
std::size_t n = local.size(); // Number of local elements
bool empty = local.empty();
STL Compatibility
local_view is fully compatible with STL algorithms:
auto local = vec.local_view();
// Range-based for loop
for (double& x : local) {
x *= 2.0;
}
// STL algorithms
std::sort(local.begin(), local.end());
std::fill(local.begin(), local.end(), 0.0);
auto sum = std::accumulate(local.begin(), local.end(), 0.0);
auto it = std::find(local.begin(), local.end(), 42.0);
auto count = std::count_if(local.begin(), local.end(),
[](double x) { return x > 0; });
// Reverse iteration
for (auto rit = local.rbegin(); rit != local.rend(); ++rit) {
// Process in reverse
}
// Iterator arithmetic (random-access)
auto mid = local.begin() + local.size() / 2;
auto dist = std::distance(local.begin(), mid);
No Communication Guarantee
Critical guarantee: local_view operations NEVER communicate.
This is enforced by design:
Local views only access locally-owned elements
All operations are pure local memory operations
No network traffic, no MPI calls, no latency
auto local = vec.local_view();
// These operations are ALL local-only:
local[0] = 1.0; // Direct memory write
double x = local[0]; // Direct memory read
std::sort(local.begin(), local.end()); // Local sort
auto sum = std::accumulate(local.begin(), local.end(), 0.0); // Local accumulation
This guarantee makes local_view the primary interface for performance-critical code.
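One way to picture why these operations cannot communicate is to model a local view as nothing more than a pointer and a length over memory the rank already owns. The sketch below is a hypothetical stand-in (local_view_sketch and local_sum are illustrative names, not DTL types), and it assumes the local block is stored contiguously.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a local view: a non-owning pointer + length
// over memory that this rank already owns. Not the DTL implementation.
template <typename T>
class local_view_sketch {
public:
    using iterator = T*;

    local_view_sketch(T* data, std::size_t size) : data_(data), size_(size) {}

    iterator begin() const { return data_; }
    iterator end() const { return data_ + size_; }
    std::size_t size() const { return size_; }
    bool empty() const { return size_ == 0; }

    // Element access is a plain memory load/store (bounds asserted in debug builds).
    T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    T* data_;
    std::size_t size_;
};

// Raw pointers are random-access iterators, so STL algorithms apply directly.
static_assert(
    std::is_same<std::iterator_traits<local_view_sketch<double>::iterator>::iterator_category,
                 std::random_access_iterator_tag>::value,
    "pointer iterators are random-access");

// Sums the locally owned elements; touches only local memory.
inline double local_sum(std::vector<double>& owned) {
    local_view_sketch<double> view(owned.data(), owned.size());
    return std::accumulate(view.begin(), view.end(), 0.0);
}
```

Because every operation reduces to pointer arithmetic on local storage, there is no place in such a type where a network call could occur.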
global_view
global_view represents the logical global container with explicit remote access.
Global Indexing
dtl::distributed_vector<double> vec(1000, size, rank);
auto global = vec.global_view();
// Global index space
dtl::size_type global_size = global.size(); // 1000 (total across all ranks)
remote_ref Access
Key principle: Global indexing returns remote_ref<T>, not T&.
auto global = vec.global_view();
// operator[] returns remote_ref<T>, NOT T&
auto ref = global[500]; // Type: remote_ref<double>
// You CANNOT do this:
// double& bad = global[500]; // COMPILE ERROR: no implicit conversion
// You MUST explicitly read/write:
double val = ref.get(); // Explicit read (may communicate)
ref.put(99.0); // Explicit write (may communicate)
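Whether get() or put() actually communicates depends on which rank owns the index. As a rough illustration, the sketch below computes ownership under an assumed block partition in which the first n % p ranks hold one extra element. The type block_partition_sketch and its members are hypothetical, and DTL's real partition policies may differ.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical block partition: n elements over p ranks, with the first
// (n % p) ranks holding one extra element. DTL's real policy may differ.
struct block_partition_sketch {
    std::size_t n;  // global element count
    std::size_t p;  // number of ranks

    // Number of elements owned by `rank`.
    std::size_t local_size(std::size_t rank) const {
        return n / p + (rank < n % p ? 1 : 0);
    }

    // Global index of the first element owned by `rank`.
    std::size_t global_offset(std::size_t rank) const {
        std::size_t base = n / p, extra = n % p;
        return rank * base + (rank < extra ? rank : extra);
    }

    // Rank that owns global index `g`.
    std::size_t owner(std::size_t g) const {
        assert(p > 0 && g < n);
        std::size_t base = n / p, extra = n % p;
        std::size_t boundary = extra * (base + 1);  // end of the larger blocks
        if (g < boundary) return g / (base + 1);
        return extra + (g - boundary) / base;
    }

    // True when accessing index `g` from `rank` needs no communication.
    bool is_local(std::size_t g, std::size_t rank) const {
        return owner(g) == rank;
    }
};
```

With 1000 elements on 4 ranks, index 500 falls at the start of rank 2's block, so under this assumed partition only rank 2 could serve global[500] without communication.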
ND Global Indexing
For tensors, global view uses ND indices:
dtl::distributed_tensor<double, 2> mat({100, 100}, size, rank);
auto global = mat.global_view();
// ND global index
auto ref = global({50, 50}); // remote_ref<double> for element (50, 50)
double val = ref.get();
ref.put(42.0);
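An ND index has to be mapped onto a linear position before ownership can be decided. The helper below sketches that mapping for a row-major layout. Row-major order is an assumption here, since this page does not state which layout DTL tensors use, and linearize_row_major is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Row-major linearization of an N-dimensional index (an assumption; the
// source does not state which layout DTL tensors use).
template <std::size_t N>
std::size_t linearize_row_major(const std::array<std::size_t, N>& index,
                                const std::array<std::size_t, N>& extents) {
    std::size_t linear = 0;
    for (std::size_t d = 0; d < N; ++d) {
        assert(index[d] < extents[d]);
        linear = linear * extents[d] + index[d];  // Horner-style accumulation
    }
    return linear;
}
```

For the 100 x 100 tensor above, element (50, 50) maps to 50 * 100 + 50 = 5050 under this layout.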
remote_ref
remote_ref<T> is DTL’s “syntactically loud” proxy for fine-grained remote access.
Syntactic Loudness
DTL’s core design principle requires that remote access be explicit. remote_ref achieves this by:
No implicit conversion to T& - You cannot accidentally get a reference
No implicit conversion to T* - You cannot accidentally get a pointer
No implicit conversion to bool - No implicit truth testing
No implicit dereference - Must call .get() explicitly
auto global = vec.global_view();
auto ref = global[500];
// These all FAIL to compile:
// double& bad1 = ref; // No implicit T& conversion
// double* bad2 = &ref; // No implicit T* conversion
// if (ref) { } // No implicit bool conversion
// double bad3 = *ref; // No implicit dereference
// This is the ONLY way to read:
double val = ref.get();
// This is the ONLY way to write:
ref.put(42.0);
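The compile-time behavior listed above can be approximated with a small proxy class that declares no conversion operators and deletes the address-of and dereference operators. This is a sketch under those assumptions, not the DTL definition: loud_ref_sketch is a hypothetical name, and the remote slot is faked with a plain pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical proxy in the spirit of remote_ref<T>; names and layout are
// illustrative only. "Remote" storage is faked with a plain pointer.
template <typename T>
class loud_ref_sketch {
public:
    explicit loud_ref_sketch(T* slot) : slot_(slot) {}

    // The only read and write entry points.
    T get() const { return *slot_; }
    void put(const T& value) { *slot_ = value; }

    // No conversion operators are declared, so T&, T*, and bool conversions
    // do not exist. Address-of and dereference are deleted outright.
    T* operator&() = delete;
    T& operator*() = delete;

private:
    T* slot_;
};

static_assert(!std::is_convertible<loud_ref_sketch<double>, double&>::value,
              "no implicit T& conversion");
static_assert(!std::is_convertible<loud_ref_sketch<double>, double*>::value,
              "no implicit T* conversion");
static_assert(!std::is_convertible<loud_ref_sketch<double>, bool>::value,
              "no implicit bool conversion");
static_assert(!std::is_convertible<loud_ref_sketch<double>, double>::value,
              "no implicit value conversion");
```

The static assertions make the restrictions checkable at compile time, so every potential communication point shows up as a visible get() or put() call.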
Operations
Basic Read/Write
auto ref = global[idx];
// Synchronous read
double val = ref.get();
// Synchronous write
ref.put(42.0);
Error Handling
Under the result-based error policy:
// get() returns result<T>
auto result = ref.get();
if (result.has_value()) {
double val = result.value();
} else {
auto error = result.error();
// Handle communication error
}
// put() returns result<void>
auto put_result = ref.put(42.0);
if (!put_result) {
// Handle write error
}
Under the throwing error policy:
try {
double val = ref.get();
ref.put(42.0);
} catch (const dtl::communication_error& e) {
// Handle error
}
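The two policies can be related by a thin adapter: a result-style value is either unwrapped or turned into an exception. The sketch below uses hypothetical types (result_sketch, comm_error_sketch, value_or_throw). DTL's own result<T> and communication_error may expose different members, and the sketch assumes T is default-constructible.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical error type; stands in for a communication failure.
struct comm_error_sketch : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical value-or-error holder. Assumes T is default-constructible.
template <typename T>
class result_sketch {
public:
    static result_sketch success(T value) {
        result_sketch r;
        r.value_ = std::move(value);
        r.ok_ = true;
        return r;
    }
    static result_sketch failure(std::string message) {
        result_sketch r;
        r.message_ = std::move(message);
        return r;
    }

    bool has_value() const { return ok_; }
    explicit operator bool() const { return ok_; }
    const T& value() const { assert(ok_); return value_; }
    const std::string& error() const { assert(!ok_); return message_; }

private:
    result_sketch() = default;
    T value_{};
    std::string message_;
    bool ok_ = false;
};

// Adapter from the result-based style to the throwing style.
template <typename T>
T value_or_throw(const result_sketch<T>& r) {
    if (!r) throw comm_error_sketch(r.error());
    return r.value();
}
```

Code written against the result style can therefore be reused under a throwing policy by routing each call through an adapter of this kind.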
Identity Information
auto ref = global[idx];
// Query the global index
auto global_idx = ref.global_index();
// Query the owning rank
dtl::rank_t owner = ref.owner();
// Check if local
bool is_local = ref.is_local();
When to Use
Use remote_ref for:
Debugging and correctness verification
Sparse remote operations (few elements)
Algorithms that need explicit remote access
Prototyping before optimization
Avoid remote_ref for:
Dense iteration over remote data
Performance-critical inner loops
Bulk operations (use halo exchange or redistribution instead)
// BAD: Per-element remote access in a loop
auto global = vec.global_view();
double sum = 0.0;
for (dtl::size_type i = 0; i < global.size(); ++i) {
sum += global[i].get(); // SLOW: one communication per element
}
// GOOD: Local computation + collective reduction
auto local = vec.local_view();
double local_sum = std::accumulate(local.begin(), local.end(), 0.0);
double global_sum;
MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
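The gap between the two patterns can be estimated with a simple latency/bandwidth model, in which every message pays a fixed latency plus a per-byte cost. The sketch below is only such a model: link_model and both cost functions are hypothetical, the inputs are assumptions about a network rather than DTL measurements, and the bulk case models one point-to-point transfer, not the cost of a collective.

```cpp
#include <cassert>
#include <cstddef>

// Simple latency/bandwidth cost model. The numbers a caller plugs in are
// assumptions about a network, not measurements of DTL.
struct link_model {
    double latency_s;         // fixed cost per message, in seconds
    double bytes_per_second;  // sustained bandwidth
};

// Cost of fetching `count` elements one message at a time.
inline double per_element_cost(const link_model& link, std::size_t count,
                               std::size_t element_bytes) {
    double per_message = link.latency_s + element_bytes / link.bytes_per_second;
    return count * per_message;
}

// Cost of moving the same elements in a single bulk message.
inline double bulk_cost(const link_model& link, std::size_t count,
                        std::size_t element_bytes) {
    return link.latency_s + (count * element_bytes) / link.bytes_per_second;
}
```

Assuming 1 microsecond of latency and 10 GB/s of bandwidth, fetching 1000 doubles one at a time costs about 1 ms in this model, while a single bulk transfer costs about 1.8 microseconds. Latency dominates the per-element loop.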
segmented_view
segmented_view is DTL’s primary performance substrate for distributed algorithms.
The Performance Path
The DTL performance model is:
Iterate segments locally (no communication)
Compute local results
Communicate in bulk (collectives, halo exchange)
Repeat
segmented_view enables step 1 efficiently.
Basic Usage
dtl::distributed_vector<double> vec(1000, size, rank);
auto segv = vec.segmented_view();
// Iterate over local segments
for (auto& segment : segv.segments()) {
// Each segment is a local-only view
auto local_range = segment.local_range();
for (double& x : local_range) {
x *= 2.0; // Process locally
}
}
Segment Iteration
Each segment provides:
for (auto& segment : segv.segments()) {
// Global index information
auto global_start = segment.global_offset();
auto global_end = segment.global_offset() + segment.size();
// Local iterable range (STL-compatible)
auto range = segment.local_range();
// Use with STL algorithms
std::transform(range.begin(), range.end(), range.begin(),
[](double x) { return x * x; });
// Segment metadata
auto seg_id = segment.id(); // Stable ID for debugging
}
Segmented Distributed Algorithms
DTL algorithms are built on segmented iteration:
// Distributed reduce pattern
template<typename Container, typename T, typename BinaryOp>
T distributed_reduce(Container& c, T init, BinaryOp op) {
auto segv = c.segmented_view();
// Step 1: Local partial reduction (no communication)
T local_result = init;
for (auto& segment : segv.segments()) {
for (auto& x : segment.local_range()) {
local_result = op(local_result, x);
}
}
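// Step 2: Global reduction (collective communication)
// Placeholder: combine local_result across ranks here (MPI_Allreduce or similar).
// Until that collective is added, only this rank's partial result is returned.
T global_result = local_result;
return global_result;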
}
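The two-step pattern above can be exercised without MPI by simulating several ranks in one process. The sketch below is hypothetical throughout (segment_sketch, local_partial, and simulated_reduce are not DTL names). It also makes one design choice explicit: each rank starts from the operation's identity, and the caller's initial value is applied once, so it is not counted once per rank.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical segment: a contiguous block of locally stored values plus
// the global index of its first element.
struct segment_sketch {
    std::size_t global_offset;
    std::vector<double> values;
};

// Step 1 on one rank: fold every local segment into a partial result.
// `identity` must be the identity of `op` (0 for +, 1 for *).
template <typename BinaryOp>
double local_partial(const std::vector<segment_sketch>& segments,
                     double identity, BinaryOp op) {
    double acc = identity;
    for (const auto& seg : segments)
        for (double x : seg.values)
            acc = op(acc, x);
    return acc;
}

// Step 2, simulated in one process: combine per-rank partials the way an
// allreduce would, then apply the caller's initial value exactly once.
template <typename BinaryOp>
double simulated_reduce(const std::vector<std::vector<segment_sketch>>& ranks,
                        double init, double identity, BinaryOp op) {
    double total = init;
    for (const auto& rank_segments : ranks)
        total = op(total, local_partial(rank_segments, identity, op));
    return total;
}
```

In a real run the outer loop would be replaced by a collective, and only the inner local_partial step would execute on each rank.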
No Communication Guarantee
Like local_view, segmented_view guarantees no communication during iteration:
auto segv = vec.segmented_view();
// These operations are ALL local-only:
for (auto& seg : segv.segments()) { // Local iteration
for (auto& x : seg.local_range()) { // Local range access
x = 0.0; // Local memory write
}
}
Communication happens only when you explicitly call collective operations.
View Validity and Invalidation
Views track structural epochs to ensure safety.
Structural Operations Invalidate Views
Certain operations change the container’s structure and invalidate all views:
| Operation | Invalidates Views? |
|---|---|
|  | Yes |
|  | Yes |
| Element modification | No |
|  | No |
Detection and Failure
DTL detects use of invalidated views:
auto local = vec.local_view();
// Use view normally
local[0] = 42.0;
// Structural operation
vec.resize(2000);
// View is now INVALID
// Using it will fail deterministically:
local[0] = 1.0; // Debug: assertion failure
// Release: returns structural_invalidation error
Safe Pattern
Always obtain fresh views after structural operations:
void process(dtl::distributed_vector<double>& vec) {
auto local = vec.local_view();
// Phase 1: Process
for (double& x : local) {
x *= 2.0;
}
// Phase 2: Resize
vec.resize(vec.global_size() * 2);
// Phase 3: Process again - GET FRESH VIEW
auto fresh_local = vec.local_view(); // Must get new view
for (double& x : fresh_local) {
x += 1.0;
}
}
Epoch Checking
Views carry an epoch at creation:
auto local = vec.local_view();
auto epoch_at_creation = local.epoch();
// After structural operation
vec.resize(2000);
// Views from before resize have stale epoch
// Container has advanced epoch
// Comparison detects staleness
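A minimal way to implement this kind of check is for the container to keep a counter that advances on every structural operation and for each view to remember the counter value it saw at creation. The sketch below illustrates that idea with hypothetical types (epoch_vector_sketch and its nested view). It is not DTL's implementation, and it routes element access through the container for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container that bumps an epoch counter on structural change.
class epoch_vector_sketch {
public:
    explicit epoch_vector_sketch(std::size_t n) : data_(n, 0.0) {}

    // Structural operation: storage may move, so advance the epoch.
    void resize(std::size_t n) {
        data_.resize(n);
        ++epoch_;
    }

    std::uint64_t epoch() const { return epoch_; }

    // Hypothetical view that remembers the epoch it was created in.
    class view {
    public:
        explicit view(epoch_vector_sketch& owner)
            : owner_(&owner), epoch_(owner.epoch()) {}

        std::uint64_t epoch() const { return epoch_; }

        // Stale once the container's epoch has moved past the view's.
        bool valid() const { return epoch_ == owner_->epoch(); }

        // Access goes through the container; the assert fires in debug
        // builds when a stale view is used.
        double& operator[](std::size_t i) const {
            assert(valid() && "view used after structural change");
            return owner_->data_[i];
        }

    private:
        epoch_vector_sketch* owner_;
        std::uint64_t epoch_;
    };

    view local_view() { return view(*this); }

private:
    std::vector<double> data_;
    std::uint64_t epoch_ = 0;
};
```

After a resize, valid() on the old view returns false while a freshly obtained view compares equal to the container's epoch, which matches the stale/fresh distinction described in this section.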
Best Practices
1. Prefer Local Views
For any operation on local data, use local_view:
// GOOD: Local view for local operations
auto local = vec.local_view();
std::sort(local.begin(), local.end());
// BAD: Global view when you only need local data
auto global = vec.global_view();
for (std::size_t i = vec.global_offset(); i < vec.global_offset() + vec.local_size(); ++i) {
auto ref = global[i]; // Unnecessary indirection
double val = ref.get();
}
2. Use Segmented Views for Distributed Algorithms
// GOOD: Segmented iteration
auto segv = vec.segmented_view();
double local_sum = 0.0;
for (auto& seg : segv.segments()) {
for (double x : seg.local_range()) {
local_sum += x;
}
}
// Then collective reduction...
3. Bulk Communication Over Point-to-Point
// BAD: Per-element remote access
for (dtl::size_type i = 0; i < 1000; ++i) {
remote_data[i] = global[i].get(); // 1000 communications!
}
// GOOD: Use halo exchange or redistribution
auto halo = tensor.halo_view(1);
halo.exchange(); // One bulk communication
4. Check View Validity in Long-Running Code
void long_computation(Container& c) {
auto local = c.local_view();
for (int iteration = 0; iteration < 1000; ++iteration) {
// Process
for (auto& x : local) {
x = compute(x);
}
// If structure might change
if (needs_resize(iteration)) {
c.resize(new_size);
local = c.local_view(); // Refresh view
}
}
}
5. Document Communication Points
Make communication explicit in your code:
void distributed_compute(Container& c) {
auto local = c.local_view();
// Phase 1: Local computation (no communication)
for (auto& x : local) {
x = expensive_local_compute(x);
}
// COMMUNICATION POINT
auto halo = c.halo_view(1);
halo.exchange(); // <-- Communication here
// Phase 2: Stencil with halo data
// ...
// COMMUNICATION POINT
double local_result = local_reduce();
double global_result;
MPI_Allreduce(&local_result, &global_result, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD); // <-- Communication here
}
Summary
| View | Use For | Communication |
|---|---|---|
| local_view | STL-like local operations | Never |
| global_view | Explicit global indexing | On remote_ref get()/put() |
| segmented_view | Distributed algorithms | Never (bulk ops are separate) |
| remote_ref | Sparse remote access | Explicit on each operation |
Key takeaway: DTL makes communication explicit. Use local views for performance, remote_ref for correctness, and segmented views for scalable distributed algorithms.
See Also
Containers Guide - Container types and construction
Algorithms Guide - DTL distributed algorithms