Research Project

Fast & Faithful: Diffusion Drift

Do Accelerated Diffusion Language Models Reason Faithfully?

Anirud Aggarwal*, Omkar Pathak*, Nayana Gadde*
*Equal contribution
University of Maryland
Course Project — CMSC 848R: Language Model Interpretability (Instructor: Sarah Wiegreffe)
Reasoning dynamics in diffusion language models. PCA projections of hidden-state trajectories from the baseline LLaDA-8B model show that correct (green) and incorrect (red) examples gradually diverge as denoising progresses, illustrating how answer-relevant structure emerges over diffusion time.

Abstract

Recent work has explored diffusion language models (DLMs) as an alternative to autoregressive (AR) generation for reasoning tasks, yet little is known about the faithfulness of their intermediate reasoning trajectories. This study introduces a preliminary framework for measuring Diffusion Chain-of-Thought (DoT) faithfulness and provides an initial empirical analysis using the LLaDA-8B model and its accelerated variant, dLLM-Cache.

Using trajectory-level linear probes on the GSM8K benchmark, we examine how answer-relevant information emerges and evolves across diffusion steps, and how caching affects this process. Results show that correctness information appears early in the diffusion trajectory, accumulates over time, and remains largely preserved under acceleration with only modest degradation.

While limited to a single acceleration method and probing-based evaluation, these findings provide early evidence that DLM reasoning dynamics can retain causal coherence under efficiency-oriented modifications. Future work will extend this framework with further diagnostics and acceleration methods to build a more complete understanding of faithfulness in diffusion-based reasoning.

Method Overview

We investigate the faithfulness of Diffusion Chain-of-Thought (DoT) and how training-free acceleration techniques affect this property. Unlike autoregressive models, where reasoning unfolds token-by-token, DLMs denoise bidirectionally across a sequence, exposing a rich state trajectory over diffusion time that can be interpreted as "latent thoughts."
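To make this trajectory concrete, the toy sketch below illustrates mask-based reverse diffusion: a fully masked sequence is progressively unmasked over $T$ steps, and the intermediate states form the trajectory we interpret as latent thoughts. This is an illustrative sketch only, with all names ours; it is not the actual LLaDA sampler.

import torch

MASK_ID = 0
T = 8            # reverse diffusion steps (we use T = 64 in experiments)
seq_len = 16

def toy_denoiser(x):
    """Stand-in for the DLM: returns per-position token predictions and
    confidences. A real model would run a bidirectional transformer on x."""
    logits = torch.randn(seq_len, 100)
    conf, tokens = logits.softmax(-1).max(-1)
    return tokens + 1, conf   # +1 keeps predictions distinct from MASK_ID

x = torch.full((seq_len,), MASK_ID)
trajectory = [x.clone()]                     # one state per diffusion step
for t in range(T):
    masked = x == MASK_ID
    if not masked.any():
        break
    tokens, conf = toy_denoiser(x)
    # Unmask the most confident still-masked positions this step.
    k = max(1, int(masked.sum()) // (T - t))
    idx = torch.where(masked, conf, torch.tensor(-1.0)).topk(k).indices
    x[idx] = tokens[idx]
    trajectory.append(x.clone())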

Defining Faithfulness

We call a DLM faithful if its intermediate denoising states encode the causal information needed to produce the final answer; equivalently, a DoT is faithful if and only if its intermediate trajectory lies on the model's minimal causal path from input to prediction.
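Since the minimal causal path is not directly observable, we operationalize faithfulness with a probe-based proxy: $F(t) = \Pr\big[g_t(x_t) = y\big]$, where $g_t$ is a linear probe trained to predict the final answer's correctness $y$ from the step-$t$ latent $x_t$. This notation is ours; a high $F(t)$ at early $t$ indicates that answer-determining information is already present in the trajectory.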

Linear Probing

For each diffusion timestep $t$, we train lightweight linear probes on the intermediate latent state $x_t$: a classification probe that predicts whether the model's final answer will be correct, and a regression probe that predicts the numeric answer itself. Probe accuracy as a function of $t$ indicates how early answer-relevant information emerges in the trajectory.
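A minimal sketch of the classification-probe procedure, assuming hidden states have already been extracted and mean-pooled into an array H of shape (num_examples, T, d_model); the probe is plain logistic regression, and all names are ours:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe_accuracy_per_step(H, y, seed=0):
    """Train one linear probe per diffusion step t and report held-out
    accuracy at predicting final-answer correctness y (0/1) from the
    mean-pooled hidden state H[:, t, :]."""
    n, T, d = H.shape
    accs = []
    for t in range(T):
        X_tr, X_te, y_tr, y_te = train_test_split(
            H[:, t, :], y, test_size=0.2, random_state=seed, stratify=y)
        probe = LogisticRegression(max_iter=1000)
        probe.fit(X_tr, y_tr)
        accs.append(probe.score(X_te, y_te))
    return np.array(accs)   # accs[t]: how much correctness info x_t carries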

Experimental Setup

We evaluate the baseline LLaDA-8B model alongside dLLM-Cache, a training-free acceleration method that reuses intermediate representations across diffusion steps. Experiments use the GSM8K math word problem benchmark with $T = 64$ reverse diffusion steps.
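The sketch below shows where the per-step states come from; init_masked_sequence and denoise_step are hypothetical stand-ins for the actual LLaDA sampling interface, included only to illustrate the collection loop:

import torch

T = 64   # reverse diffusion steps, matching our setup

@torch.no_grad()
def collect_trajectory(model, prompt_ids):
    """Record one mean-pooled hidden state per reverse-diffusion step.
    init_masked_sequence and denoise_step are hypothetical stand-ins for
    the actual LLaDA sampling interface; dLLM-Cache would reuse
    representations inside denoise_step without changing this loop."""
    x = model.init_masked_sequence(prompt_ids)   # hypothetical
    states = []
    for t in range(T):
        x, hidden = model.denoise_step(x, t)     # hypothetical: (batch, seq, d)
        states.append(hidden.mean(dim=1).cpu())  # mean-pool -> (batch, d)
    return torch.stack(states, dim=1)            # (batch, T, d_model)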

Key Results

Model Performance and Efficiency

Comparison of baseline LLaDA-8B against dLLM-Cache on GSM8K. Accuracy is measured via exact match on the final numerical answer, and latency is averaged over 500 samples.

Model                  Accuracy   Latency   Speedup
LLaDA-8B (Baseline)    39.70%     6.31 s    1.00×
dLLM-Cache             36.80%     3.54 s    1.78×
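Exact match is computed by comparing the final number in the generated text against the gold answer; the extraction rule below is our assumption about answer formatting, not an official GSM8K parser:

import re

def extract_final_number(text):
    """Take the last number in the generation as the final answer; this
    extraction rule is our assumption, not an official GSM8K parser."""
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return nums[-1] if nums else None

def exact_match(pred_text, gold_answer):
    pred = extract_final_number(pred_text)
    return pred is not None and float(pred) == float(gold_answer)

# e.g. exact_match("... so she pays $42 in total.", "42") -> True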
Classification probe accuracy over diffusion steps. The probe predicts answer correctness from mean-pooled hidden states. Both models show increasing accuracy in later steps, with the baseline achieving higher peak accuracy (83.5% vs 76.0% at step 63).
Speed-faithfulness tradeoff for dLLM-Cache. The model achieves 1.78× speedup while preserving most faithfulness metrics, suggesting caching maintains the causal structure of reasoning.
Convergence of hidden states. The L2 distance from each intermediate hidden state to the final-step state decreases over diffusion steps, with baseline and dLLM-Cache following nearly identical trajectories toward the final representation.
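Given a stored trajectory, the convergence curve above reduces to a per-step L2 distance against the final state; a short sketch using the same (num_examples, T, d_model) layout as before:

import numpy as np

def l2_to_final(H):
    """Mean L2 distance from each step's state to the final (step T-1)
    state; H has shape (num_examples, T, d_model). The last entry is 0
    by construction."""
    final = H[:, -1:, :]
    return np.linalg.norm(H - final, axis=-1).mean(axis=0)   # shape (T,)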

Key Findings

  • Early information emergence: Classification probes show above-chance accuracy even in early steps (63.9% baseline, 64.2% cached over steps 0–31), indicating that correctness information emerges well before the final answer is decoded.
  • Accumulation over time: Classification accuracy rises steadily across diffusion steps, peaking at 83.5% (baseline) and 76.0% (dLLM-Cache) at the final step.
  • Caching preserves structure: Caching modestly reduces probe accuracy (~1.6 points on average) but preserves overall trajectory patterns and reasoning structure.
  • Regression limitations: Regression probes fail to predict numeric answers (negative R² throughout), indicating numeric values are not linearly encoded in hidden space.
  • Favorable efficiency-faithfulness tradeoff: dLLM-Cache achieves a 1.78× speedup while maintaining 92.7% of baseline accuracy and 97.6% of baseline mean classification probe accuracy (checked in the sketch below).
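The tradeoff figures in the last bullet follow directly from the table above; a quick check (the 97.6% probe-accuracy retention additionally requires the per-step probe means, which we do not reproduce here):

# Derived quantities from the table and probe curves above.
baseline_acc, cached_acc = 39.70, 36.80    # GSM8K exact match (%)
baseline_lat, cached_lat = 6.31, 3.54      # mean latency (s)

speedup = baseline_lat / cached_lat        # 6.31 / 3.54 ≈ 1.78
acc_retention = cached_acc / baseline_acc  # ≈ 0.927 -> 92.7%
print(f"speedup {speedup:.2f}x, accuracy retained {acc_retention:.1%}")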

Citation

If you find our work useful in your research, please consider citing:

@techreport{aggarwal2025fastandfaithful,
    title={Fast \& Faithful: Diffusion Drift -- Do Accelerated Diffusion Language Models Reason Faithfully?},
    author={Anirud Aggarwal and Omkar Pathak and Nayana Gadde},
    year={2025},
    institution={University of Maryland},
    note={Course Project, CMSC 848R: Language Model Interpretability (Instructor: Sarah Wiegreffe)}
}