ScatterAI
Issue #1 · March 10, 2026

CBCT Tells You Where the Tissue Was. Ultrasound Tells You Where It Is Now.

Research

01 [Robotics] CBCT Tells You Where the Tissue Was. Ultrasound Tells You Where It Is Now.

Interventional navigation relies on CBCT for 3D anatomical context — but CBCT is a snapshot. The moment respiration shifts an organ or a probe deforms soft tissue, that snapshot is wrong. Surgeons navigate against a map that no longer matches the territory.

This framework uses a robotic ultrasound probe as a continuous deformation sensor to keep the CBCT map current. Calibration-initialized alignment with LC2-based rigid refinement establishes the initial multimodal correspondence between ultrasound and CBCT coordinate spaces. From there, USCorUNet — a lightweight correlation-based UNet — tracks intraoperative tissue motion from live ultrasound frames and propagates those deformations back into the CBCT volume, updating slices in real time without re-acquiring CT. The key move: ultrasound doesn’t replace CBCT’s anatomical resolution; it patches CBCT’s temporal blindness.
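
To make the correlate-then-warp idea concrete, here is a minimal PyTorch sketch: correlate features of a reference and a live ultrasound frame, regress a dense displacement field, and resample a CBCT slice along it. Every layer size, class name, and the simple 2D formulation are illustrative assumptions, not USCorUNet’s published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CorrDeformTracker(nn.Module):
    """Toy correlation-based deformation tracker (illustrative, not USCorUNet)."""

    def __init__(self, feat_ch=16, radius=3):
        super().__init__()
        self.radius = radius
        self.encoder = nn.Sequential(
            nn.Conv2d(1, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        corr_ch = (2 * radius + 1) ** 2  # one channel per candidate offset
        self.decoder = nn.Sequential(
            nn.Conv2d(corr_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1),  # per-pixel (dx, dy)
        )

    def correlation(self, f_ref, f_live):
        # Dot-product between each reference pixel and the live-frame
        # features at every offset within a (2r+1)^2 search window.
        r = self.radius
        f_pad = F.pad(f_live, (r, r, r, r))
        H, W = f_ref.shape[-2:]
        vols = [
            (f_ref * f_pad[..., dy:dy + H, dx:dx + W]).sum(dim=1, keepdim=True)
            for dy in range(2 * r + 1)
            for dx in range(2 * r + 1)
        ]
        return torch.cat(vols, dim=1)

    def forward(self, ref_frame, live_frame):
        corr = self.correlation(self.encoder(ref_frame), self.encoder(live_frame))
        return self.decoder(corr)  # dense displacement field, (B, 2, H, W)

def warp_slice(cbct_slice, flow):
    """Resample a CBCT slice along the predicted ultrasound-derived flow."""
    B, _, H, W = cbct_slice.shape
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=torch.float32),
        torch.arange(W, dtype=torch.float32),
        indexing="ij",
    )
    base = torch.stack((xs, ys), dim=-1).expand(B, H, W, 2)
    grid = base + flow.permute(0, 2, 3, 1)   # apply (dx, dy) per pixel
    gx = 2 * grid[..., 0] / (W - 1) - 1      # normalize to [-1, 1] for grid_sample
    gy = 2 * grid[..., 1] / (H - 1) - 1
    return F.grid_sample(cbct_slice, torch.stack((gx, gy), dim=-1), align_corners=True)
```

A navigation loop in this spirit would run the tracker on each incoming frame pair and apply warp_slice to the currently displayed CBCT plane, keeping the map current without re-acquisition.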

The catch is integration friction. Robotic ultrasound adds a physical instrument to an already crowded interventional suite, and “real time” depends on USCorUNet inference latency holding up under production OR conditions; neither the workflow fit nor the latency claim is validated in a clinical trial here. For teams building navigation systems for liver, kidney, or abdominal interventions where respiratory motion regularly exceeds 10–20 mm, this deformation-proxy architecture is worth tracking closely.

Key takeaways:

- A robotic ultrasound probe serves as a continuous deformation sensor that keeps the static CBCT map current, rather than replacing CBCT.
- LC2-based rigid refinement bootstraps the ultrasound–CBCT alignment; USCorUNet then tracks tissue motion and propagates it into the CBCT volume in real time.
- Neither OR integration nor inference latency is yet validated in a clinical trial.

Source: Robotic Ultrasound Makes CBCT Alive


02 [Evaluation] RLVR Rewards the Right Answer for the Wrong Reasons — CLIPO Fixes the Mechanism

RLVR trains models to reason by rewarding correct final answers. The problem: a rollout can reach the right answer through flawed intermediate steps — copying the answer, skipping logic, hallucinating a plausible chain. Standard RLVR can’t tell the difference. It rewards the outcome and reinforces the broken path.

CLIPO adds a contrastive loss over successful rollouts. Instead of treating each correct trajectory independently, it optimizes across multiple correct reasoning paths simultaneously, forcing the model to learn the invariant structure they share — the logical moves that appear consistently across correct solutions, not the surface patterns that happen to land on the right answer. Process-wrong-but-outcome-correct rollouts get penalized because their internal structure diverges from genuinely correct trajectories, even when their final tokens match. This is cross-trajectory regularization rather than per-sample outcome scoring.
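
The paper’s exact objective isn’t reproduced here, but the cross-trajectory idea maps onto a standard supervised-contrastive loss in which correct rollouts of the same problem are positives and everything else is a negative. A minimal PyTorch sketch, with emb and problem_ids as assumed inputs:

```python
import torch
import torch.nn.functional as F

def cross_rollout_contrastive_loss(emb, problem_ids, temperature=0.1):
    """Supervised-contrastive loss over correct-rollout embeddings.

    emb:         (N, D) pooled representations of correct rollouts
    problem_ids: (N,) id of the problem each rollout solves

    Correct rollouts of the same problem are pulled together, forcing
    shared reasoning structure. A generic SupCon objective used only to
    illustrate cross-trajectory regularization -- not CLIPO's exact loss.
    """
    emb = F.normalize(emb, dim=-1)
    sim = emb @ emb.T / temperature                     # (N, N) similarities
    n = emb.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=emb.device)
    pos_mask = (problem_ids.unsqueeze(0) == problem_ids.unsqueeze(1)) & ~self_mask

    # log-softmax over all non-self pairs, then average over positive pairs
    sim = sim.masked_fill(self_mask, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_counts
    return loss[pos_mask.any(dim=1)].mean()   # skip rollouts with no positive
```

In practice a term like this would sit alongside the usual RLVR policy objective as a weighted auxiliary loss; how CLIPO pools and weights trajectories is specified in the paper, not here.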

The catch: this approach requires multiple correct rollouts per problem to compute a meaningful contrastive signal — which means it’s harder to apply in regimes where correct trajectories are sparse (exactly the hard-problem regime where reward sparsity already bites). For teams running RLVR pipelines on problems with high Pass@K, this is a direct plug-in improvement. For low Pass@K regimes, solve the exploration problem first.

Key takeaways:

- Outcome-only RLVR reinforces flawed reasoning that happens to reach correct answers; CLIPO penalizes it by contrasting structure across multiple correct rollouts.
- The contrastive signal requires several correct rollouts per problem, so the method fits high Pass@K regimes and plugs directly into existing RLVR pipelines there.
- In low Pass@K regimes, solve the exploration problem first.

Source: CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR


03 [Image Gen] Missing Brain Scans Don’t Need to Be Collected — They Can Be Generated

Clinical Alzheimer’s datasets almost always have missing modalities. A patient has an MRI but no PET scan. Another has FDG-PET but no amyloid imaging. The standard response is to drop those subjects or impute crudely. ACADiff treats the missing scan as a generation target instead.

The mechanism: three specialized diffusion generators handle bidirectional synthesis across sMRI, FDG-PET, and AV45-PET. Each denoises in latent space while attending to whatever modalities are available. Two design choices carry the weight. First, adaptive fusion dynamically reconfigures the conditioning pathway based on which inputs exist at inference time — the same model handles any combination of present and absent modalities without retraining. Second, clinical metadata (age, MMSE score, diagnosis stage) gets encoded via GPT-4o into semantic prompt embeddings that steer the synthesis toward clinically plausible anatomy. The model isn’t just hallucinating a brain scan; it’s generating one conditioned on what the patient’s chart says they should look like.
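
Here is a minimal sketch of the availability-aware conditioning idea, assuming pooled per-modality embeddings and a precomputed prompt embedding as inputs; the module names and the gated-sum fusion are illustrative stand-ins, not ACADiff’s published design.

```python
import torch
import torch.nn as nn

class AdaptiveFusionConditioner(nn.Module):
    """Illustrative availability-aware conditioner (not ACADiff's exact design).

    Fuses whichever modality embeddings are present at inference time with a
    clinical-prompt embedding, so one model serves any present/absent pattern.
    """

    def __init__(self, dim=256, modalities=("smri", "fdg_pet", "av45_pet")):
        super().__init__()
        self.modalities = modalities
        self.proj = nn.ModuleDict({m: nn.Linear(dim, dim) for m in modalities})
        self.prompt_proj = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, 1)  # learned per-token weighting

    def forward(self, feats, prompt_emb):
        """feats: dict of modality name -> (B, dim) embedding; missing keys = absent."""
        tokens = [self.prompt_proj(prompt_emb)]
        for m in self.modalities:
            if m in feats:  # conditioning path reconfigures around what exists
                tokens.append(self.proj[m](feats[m]))
        tokens = torch.stack(tokens, dim=1)       # (B, n_present + 1, dim)
        weights = torch.softmax(self.gate(tokens), dim=1)
        return (weights * tokens).sum(dim=1)      # (B, dim) condition vector
```

The returned condition vector would feed the latent diffusion denoiser (for example, as cross-attention context), so absent modalities simply drop out of the token set instead of requiring retraining.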

The catch: evaluation runs on ADNI, a relatively clean research cohort. Real clinical data is noisier, acquisition protocols vary across scanners, and GPT-4o prompt encoding adds an external dependency that may behave unpredictably on sparse or nonstandard clinical notes. For teams building Alzheimer’s diagnostic pipelines, the practical value isn’t replacing imaging — it’s rescuing subjects who would otherwise be excluded from multimodal analyses due to incomplete acquisition.

Key takeaways:

- ACADiff generates missing sMRI, FDG-PET, or AV45-PET scans rather than dropping subjects with incomplete acquisitions.
- Adaptive fusion handles any combination of present and absent modalities without retraining; GPT-4o-encoded clinical metadata steers synthesis toward clinically plausible anatomy.
- Validation is limited to the relatively clean ADNI cohort, and the GPT-4o dependency may behave unpredictably on sparse or nonstandard clinical notes.

Source: Adaptive Clinical-Aware Latent Diffusion for Multimodal Brain Image Generation and Missing Modality Imputation