Task-Aware Environment Augmentation for Reliable Navigation via SCoDA

Shielded Conditional Diffusion for Environment Augmentation — generating sparse, task-aware fiducial layouts that keep a robot localized while following a planned trajectory.

Anonymous Authors

Paper under double-blind review

CoRL 2026 · Submission

SCoDA decides where to add a small budget of visual fiducials so a planned trajectory can be executed reliably under disturbances — placing corrective observations only where they change the outcome.

Abstract

TL;DR — instrument the environment, not the robot

Reliable trajectory execution under partial observability depends not only on a feasible geometric path, but on whether the robot receives informative observations while executing it. Existing approaches keep the environment fixed and adapt the robot through belief-space planning, active localization, or extra sensing. We flip this perspective and address task-aware environment augmentation: given a mapped environment, a planned trajectory, and a small budget of visual fiducial markers, where should the environment be augmented so the trajectory executes reliably under uncertainty? We present SCoDA, which learns a conditional distribution over high-performing fiducial layouts from closed-loop rollout data, conditioned on the environment, trajectory, disturbance context, and desired execution profile. Its shielded sampler reasons about where along the trajectory pose corrections should occur and steers generation toward task-relevant, finite-budget layouts. Across simulated benchmarks and hardware deployments, SCoDA improves execution reliability and completion time over strong baselines.

Why placement matters

The same path can fail or succeed

Between observations, pose error grows under noise and disturbances. Poor or missing fiducial support lets that error compound into tracking failure, while a few well-timed observations are enough to keep the robot on the planned trajectory.
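To make the effect of observation timing concrete, here is a toy sketch, not the paper's model: a 1D path where position variance grows linearly with distance travelled and each fiducial sighting applies a scalar Kalman update. The function name `max_error_std` and the noise values `q` and `r` are illustrative assumptions.

```python
def max_error_std(length, marker_positions, q=0.04, r=0.01):
    """Largest position std-dev (m) reached along a 1D path of given length.

    marker_positions: distances (m), strictly inside (0, length), where a fix occurs.
    q: variance added per metre travelled (assumed); r: fix noise variance (assumed).
    """
    var, prev, worst = 0.0, 0.0, 0.0
    for pos in sorted(marker_positions) + [length]:
        var += q * (pos - prev)        # drift accumulates between fixes
        worst = max(worst, var)        # error peaks just before a fix
        if pos < length:
            var = var * r / (var + r)  # scalar Kalman update at the fix
        prev = pos
    return worst ** 0.5


no_markers = max_error_std(10.0, [])
clustered = max_error_std(10.0, [1.0, 1.5, 2.0])
spread = max_error_std(10.0, [2.5, 5.0, 7.5])
print(no_markers, clustered, spread)
```

With the same three-marker budget, the clustered layout leaves a long uncorrected stretch and peaks at a much larger error than the spread layout. In this toy model, where the observations fall matters as much as how many there are.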

Task-aware fiducial augmentation examples
Real and simulated environments. SCoDA places sparse fiducials exactly where localization support improves execution, rather than spending the budget on already-easy segments.

How SCoDA works

Trajectory-progress generation + shielded sampling

The key idea: instead of generating fiducial poses directly in the workspace, SCoDA generates where along the trajectory each pose correction is needed, then maps that back to a feasible physical layout. Task-critical regions become intervals; redundant markers become points that are too close together.
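As a rough illustration of the trajectory-progress idea, the sketch below maps a normalized progress value in [0, 1] to a point on a polyline trajectory by arc length. It assumes the trajectory is given as a list of waypoints. The function name `progress_to_point` is hypothetical, and the later step of turning that point into a feasible fiducial pose is not covered here.

```python
import numpy as np


def progress_to_point(waypoints, s):
    """Map normalized progress s in [0, 1] to a point on a polyline path."""
    pts = np.asarray(waypoints, dtype=float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)  # segment lengths
    cum = np.concatenate([[0.0], np.cumsum(seg)])       # arc length at each waypoint
    target = np.clip(s, 0.0, 1.0) * cum[-1]             # arc length to reach
    # Interpolate each coordinate against cumulative arc length.
    return np.array([np.interp(target, cum, pts[:, d]) for d in range(pts.shape[1])])


path = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]  # an L-shaped path of length 4
print(progress_to_point(path, 0.5))   # the corner of the L
print(progress_to_point(path, 0.75))  # halfway up the second leg
```

In this representation a layout of K markers is just K scalars, so "too close together" becomes a simple gap check between sorted progress values.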

Trajectory-progress parameterization
SCoDA decouples when a pose correction is needed from where the fiducial is physically placed, reasoning in trajectory-progress space before instantiating a workspace layout.
SCoDA pipeline
Pipeline. Each training example pairs a candidate layout with its task context and rollout outcome; a 1D-UNet learns to recover high-performing layouts. At inference, classifier-free guidance plus a shielded reverse diffusion step steer samples toward task-relevant, well-separated, finite-budget layouts.
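The two inference-time ingredients named in the caption can be sketched as follows. The guidance formula is the standard classifier-free guidance combination. The `shield` function is only a stand-in: this page does not specify the actual shield, so a simple projection onto sorted, minimally separated progress values in [0, 1] is assumed. The names `cfg_noise`, `shield`, and `min_gap` are hypothetical.

```python
import numpy as np


def cfg_noise(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: move from the unconditional toward the conditional prediction."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)


def shield(progress, min_gap):
    """Assumed shield: project onto sorted points in [0, 1] at least min_gap apart."""
    s = np.sort(np.clip(np.asarray(progress, dtype=float), 0.0, 1.0))
    k = len(s)
    if k == 0:
        return s
    if (k - 1) * min_gap > 1.0:
        raise ValueError("marker budget does not fit at this spacing")
    for i in range(1, k):             # forward pass: push crowded points right
        s[i] = max(s[i], s[i - 1] + min_gap)
    s[-1] = min(s[-1], 1.0)
    for i in range(k - 2, -1, -1):    # backward pass: pull points back inside [0, 1]
        s[i] = min(s[i], s[i + 1] - min_gap)
    return s


print(shield([0.95, 0.97, 0.99], 0.1))
```

In a sampler built this way, the projection would be applied to the denoised sample after each reverse step, so intermediate samples never drift toward redundant or out-of-range placements. Whether SCoDA's shield works by projection is an assumption of this sketch.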

Results

Strongest layout under a fixed budget — at a fraction of the cost
1. Strong execution with just-enough augmentation

Under a fixed fiducial budget, SCoDA is the strongest placement method that needs no test-time search, in both simulation and hardware; only the rollout-based Rollout-Opt upper bound scores higher in simulation. Adding markers alone (random or periodic) is not enough: SCoDA learns which observations support closed-loop execution.

2. Context-aware placements

Generated fiducials concentrate near disturbance entry points, obstacle-dense segments, and parts of the path where tracking error would otherwise compound — not on already-easy stretches.

3. Matches rollout-optimized quality without test-time search

SCoDA approaches the expensive Rollout-Opt upper bound in simulation and reaches roughly 90% success and 95% waypoint following on hardware, using only a learned generator and shielded inference under the same fixed marker budget as the baselines.

4. Amortization removes the placement-time bottleneck

Layouts are produced in under a second with zero rollouts, versus hundreds of seconds of closed-loop search for Rollout-Opt.

Hardware deployment on a Crazyflie 2.0 quadrotor: 45 trials, budget K = 8.

| Method | Success rate (%) | Waypoint following (%) | Completion time (s) |
|---|---|---|---|
| No-Augmentation | 4.6 ± 3.1 | 24.2 ± 5.4 | 58.4 ± 1.4 |
| Random | 13.7 ± 5.1 | 36.9 ± 6.2 | 55.6 ± 2.3 |
| Periodic-K | 57.4 ± 7.4 | 72.8 ± 6.1 | 44.8 ± 3.4 |
| Visibility-Greedy | 68.7 ± 6.9 | 78.9 ± 5.7 | 40.9 ± 3.1 |
| SCoDA (ours) | 91.2 ± 4.4 | 94.7 ± 3.2 | 29.5 ± 2.1 |

Simulated trajectory execution in FalconGym 2.0: K = 6, mean ± s.e. over 150 rollouts. RO (Rollout-Opt) is an expensive upper bound.

| Method | Success, nominal (%) | Success, disturbed (%) | Completion time, disturbed (s) |
|---|---|---|---|
| No-Augmentation | 5.4 ± 1.8 | 1.8 ± 1.1 | 59.2 ± 0.8 |
| Random | 9.6 ± 2.4 | 5.9 ± 1.9 | 56.8 ± 1.2 |
| Periodic-K | 64.8 ± 4.2 | 43.7 ± 4.1 | 47.4 ± 2.0 |
| Visibility-Greedy | 72.6 ± 3.8 | 57.9 ± 4.0 | 44.0 ± 1.9 |
| Deviation-Greedy | 84.1 ± 3.0 | 77.3 ± 3.4 | 35.2 ± 1.5 |
| Rollout-Opt (upper bound) | 98.7 ± 0.9 | 97.8 ± 1.1 | 27.9 ± 0.8 |
| SCoDA (ours) | 96.9 ± 1.3 | 94.7 ± 1.7 | 29.2 ± 1.1 |

SCoDA nearly matches Rollout-Opt while producing layouts in ~0.18 s with zero rollouts (vs. ~417 s and 4500 rollouts).
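As a quick check on the amortization claim, the reported placement times imply a speedup of roughly three orders of magnitude. The snippet below only restates the two approximate figures quoted above; the variable names are illustrative.

```python
scoda_seconds = 0.18         # reported SCoDA placement time (approximate)
rollout_opt_seconds = 417.0  # reported Rollout-Opt placement time (approximate)

speedup = rollout_opt_seconds / scoda_seconds
print(f"about {speedup:.0f}x faster")  # about 2317x faster
```

Because both inputs are rounded, the ratio is best read as "on the order of 2000x" rather than as a precise figure.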

Qualitative placements in simulation and hardware
Context-aware placements on simulated (top) and hardware (bottom) tasks. SCoDA matches rollout-optimized quality orders of magnitude faster, while baselines fail to complete the task.

Simulation rollouts

Random baseline vs. SCoDA · FalconGym 2.0
Scenario 1: Random baseline (drifts / fails)
Scenario 1: SCoDA (completes)
Scenario 2: Random baseline (drifts / fails)
Scenario 2: SCoDA (completes)

More SCoDA rollouts on additional environments

Scenario 3: SCoDA (completes)
Scenario 3, variant: SCoDA (completes)

Hardware deployment

Crazyflie 2.0 · flight + executed trajectory

Each card shows the onboard flight (top) and the trajectory the robot actually followed under that fiducial placement (bottom). Baseline placements leave long observation gaps; SCoDA's layout keeps the robot tracking the planned path.

Periodic placement (baseline): executed trajectory
SCoDA placement (ours): executed trajectory
Random placement (baseline): executed trajectory
SCoDA placement (ours): executed trajectory

Takeaway

Design the environment for the task

SCoDA shows that diffusion can act as a generative designer for just-enough environment augmentation: a trajectory-indexed representation, rollout-conditioned training, and a task-space shield together focus a finite marker budget on the corrective observations that matter most. Reliable execution under partial observability improves not by making the whole environment observable, but by placing limited observations where they change the outcome.

Limitations. SCoDA optimizes where to place a fixed number K of markers but does not choose K, assumes a reference trajectory and mapped environment, and relies on rollout supervision that must match the deployment setting. Extensions to multi-robot localization and full 3D visibility are promising future directions.

BibTeX

 
@inproceedings{scoda2026,
  title     = {Task-Aware Environment Augmentation for Reliable
               Navigation via Shielded Conditional Diffusion},
  author    = {Anonymous Authors},
  booktitle = {Conference on Robot Learning (CoRL)},
  year      = {2026},
  url       = {https://scoda-diffusion.github.io/}
}