TLDR: A new research paper introduces Spectrally Anisotropic Gaussian Diffusion (SAGD), a method that modifies the forward noise in Diffusion Probabilistic Models (DPMs) using a frequency-diagonal covariance. This allows for explicit control over the model’s inductive biases, enabling it to emphasize or suppress specific frequency bands during training. SAGD has shown improved generative performance across various datasets and can even facilitate ‘selective omission,’ where models learn to ignore known corruptions in designated frequency ranges, all while maintaining a probabilistically consistent Gaussian forward process.
Diffusion Probabilistic Models (DPMs) have become powerful tools for generating realistic data such as images. However, the underlying assumptions, or ‘inductive biases’, that guide these models often remain implicit. A new research paper introduces an approach that builds these biases explicitly into the training and sampling processes of diffusion models, making them more adaptable to the specific characteristics of the data they are trying to model.
The core of this new method, termed Spectrally Anisotropic Gaussian Diffusion (SAGD), involves replacing the standard, uniform ‘forward noise’ with an ‘anisotropic noise operator’. Imagine noise that isn’t just random static, but rather structured in a way that emphasizes or de-emphasizes certain frequencies in the data. This operator uses a structured, frequency-diagonal covariance, which essentially means it can selectively add noise to different frequency bands of an image or data point.
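To make the idea of frequency-structured noise concrete, here is a minimal NumPy sketch of one way to draw it: filter ordinary white Gaussian noise in the Fourier domain. The function names (`radial_frequency`, `shape_noise`), the 32x32 size, and the 0.1 cutoff are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def radial_frequency(shape):
    """Radial spatial frequency (cycles per pixel) of every 2-D FFT bin."""
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    return np.sqrt(fx**2 + fy**2)

def shape_noise(white, weights):
    """Scale each Fourier coefficient of `white` by a real, non-negative weight."""
    spectrum = np.fft.fft2(white)
    # Weights that depend only on the radial frequency are symmetric under
    # k -> -k, so the inverse FFT is real up to round-off.
    return np.fft.ifft2(weights * spectrum).real

rng = np.random.default_rng(0)
white = rng.standard_normal((32, 32))
radius = radial_frequency(white.shape)

# Weights of 1 everywhere reproduce the usual isotropic noise.
isotropic = shape_noise(white, np.ones_like(radius))
# A hypothetical low-pass choice: noise only below 0.1 cycles per pixel.
low_only = shape_noise(white, (radius < 0.1).astype(float))
```

Because the filter is linear, the result is still Gaussian; only its covariance changes, and that covariance is diagonal in the Fourier basis.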
This novel noise operator is versatile, unifying concepts like band-pass masks (which allow only specific frequency ranges to pass) and power-law weightings (which adjust the strength of noise based on frequency). This allows researchers to either highlight or suppress designated frequency bands during the noising process, all while keeping the overall forward process Gaussian, which is crucial for the mathematical consistency of diffusion models.
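In symbols, one way to write such a forward process is sketched below. The notation is an assumption made for illustration and may differ from the paper's: F is the discrete Fourier transform, w is a vector of non-negative per-frequency weights, and alpha_t, sigma_t are the usual signal and noise scales.

```latex
\begin{aligned}
x_t &= \alpha_t\, x_0 + \sigma_t\, \epsilon, \qquad
\epsilon = F^{-1}\bigl(w \odot F\xi\bigr), \quad \xi \sim \mathcal{N}(0, I), \\
\epsilon &\sim \mathcal{N}(0, \Sigma_w), \qquad
\Sigma_w = F^{-1} \operatorname{diag}(w^2)\, F .
\end{aligned}
```

Since the noise is a linear transform of a Gaussian, it remains Gaussian, which is why the forward process stays Gaussian for any choice of w. Setting every weight to 1 gives the identity covariance and recovers standard isotropic diffusion; for real-valued data the weights must satisfy w(k) = w(-k).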
The researchers derived the mathematical relationship for how the model learns with these anisotropic covariances. They demonstrated that, under certain conditions, the learned model can still accurately recover the true data distribution as the noise level approaches zero. However, the anisotropy fundamentally reshapes the ‘probability-flow path’ – essentially, how the model transitions from pure noise to a coherent data sample. This means the model learns to prioritize different aspects of the data based on how the noise is structured.
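The following is a standard Gaussian identity rather than a statement quoted from the paper, and it assumes notation introduced here: the forward marginal has mean alpha_t x_0 and covariance sigma_t^2 Sigma, with Sigma invertible (all frequency weights strictly positive). It indicates how the noise covariance enters the quantity the model learns.

```latex
\begin{aligned}
q(x_t \mid x_0) &= \mathcal{N}\bigl(\alpha_t x_0,\ \sigma_t^2 \Sigma\bigr), \\
\nabla_{x_t} \log q(x_t \mid x_0)
  &= -\frac{1}{\sigma_t^2}\, \Sigma^{-1} (x_t - \alpha_t x_0)
   = -\frac{1}{\sigma_t}\, \Sigma^{-1} \epsilon, \\
\nabla_{x_t} \log q_t(x_t)
  &= -\frac{1}{\sigma_t}\, \Sigma^{-1}\, \mathbb{E}[\epsilon \mid x_t].
\end{aligned}
```

Compared with the isotropic case, the only new ingredient is the factor Sigma^{-1}, which rescales the score frequency by frequency. The paper's exact conditions and statement may differ from this sketch.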
Empirical results from the study are compelling. SAGD models consistently outperformed standard diffusion models across several vision datasets, including MNIST, CIFAR-10, Domainnet-Quickdraw, Wiki-Art, and FFHQ. This suggests that by carefully designing the forward noise, models can achieve better generative performance. A particularly interesting finding is the concept of ‘selective omission’. SAGD allows models to learn while deliberately ignoring known corruptions that are confined to specific frequency bands. For instance, if an image is corrupted with noise in a particular frequency range, SAGD can be configured to ignore that range, effectively recovering the clean, uncorrupted data.
The paper highlights two main ways to implement this frequency-based noise control: power-law weighting (plw-SAGD) and a two-band mixture (bpm-SAGD). Power-law weighting applies a radial slope in the log-log power spectrum, allowing for emphasis on either low frequencies (for coarser structures) or high frequencies (for sharper textures). The two-band mixture, on the other hand, uses band-pass masks to combine noise from specific low and high-frequency ranges, offering precise control over which frequencies are affected.
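The two weighting schemes can be sketched as per-frequency amplitude weights. This is one reading of the description above, not the paper's code: the function names, the handling of the zero-frequency bin, the unit-mean-power normalization, and the band edges and gains in the usage lines are all assumptions.

```python
import numpy as np

def radial_frequency(shape):
    """Radial spatial frequency (cycles per pixel) of every 2-D FFT bin."""
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    return np.sqrt(fx**2 + fy**2)

def power_law_weights(shape, slope):
    """Amplitude weights whose square (the noise power) follows radius**slope."""
    radius = radial_frequency(shape)
    # Assumption: the DC bin borrows the smallest nonzero radius so that a
    # negative slope does not divide by zero.
    radius[0, 0] = np.min(radius[radius > 0])
    weights = radius ** (slope / 2.0)
    # Assumption: rescale to unit mean power, which keeps the per-pixel
    # variance of the filtered noise at 1.
    return weights / np.sqrt(np.mean(weights**2))

def two_band_weights(shape, low_band, high_band, gain_low, gain_high):
    """Gains applied inside two half-open radial bands [lo, hi); zero elsewhere."""
    radius = radial_frequency(shape)
    low_mask = (radius >= low_band[0]) & (radius < low_band[1])
    high_mask = (radius >= high_band[0]) & (radius < high_band[1])
    return gain_low * low_mask + gain_high * high_mask

flat = power_law_weights((8, 8), slope=0.0)   # slope 0: isotropic noise
plw = power_law_weights((8, 8), slope=-2.0)   # negative slope: favors low frequencies
bpm = two_band_weights((8, 8), low_band=(0.0, 0.1), high_band=(0.3, 0.8),
                       gain_low=0.7, gain_high=0.3)
```

A negative slope concentrates noise power in low frequencies and a positive slope in high frequencies, while the two-band weights leave frequencies outside both bands without any noise.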
The practical implications are significant. Because SAGD modifies only the forward covariance, it can be integrated into existing diffusion model implementations with minimal code changes, preserving the rest of the established pipeline. This makes it a simple yet principled way to tailor the inductive biases in DPMs, opening new avenues for more targeted and flexible generative modeling. For more in-depth details, see the full research paper.
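As a rough illustration of how small the change to a forward noising step could be, the sketch below adds an optional `weights` argument to an otherwise standard step. The function name `forward_sample`, its signature, and the low-pass weights are hypothetical; the paper's actual implementation may differ.

```python
import numpy as np

def forward_sample(x0, alpha_t, sigma_t, rng, weights=None):
    """Return (x_t, eps) with x_t = alpha_t * x0 + sigma_t * eps."""
    eps = rng.standard_normal(x0.shape)
    if weights is not None:
        # The only change relative to the isotropic step: reweight the noise
        # per frequency bin (the FFT acts on the last two axes).
        eps = np.fft.ifft2(weights * np.fft.fft2(eps)).real
    return alpha_t * x0 + sigma_t * eps, eps

# Usage on a batch of two three-channel 16x16 arrays (all zeros for simplicity).
x0 = np.zeros((2, 3, 16, 16))
fy, fx = np.fft.fftfreq(16)[:, None], np.fft.fftfreq(16)[None, :]
radius = np.sqrt(fx**2 + fy**2)
low_pass = (radius < 0.2).astype(float)  # hypothetical frequency weights

x_iso, eps_iso = forward_sample(x0, 0.8, 0.6, np.random.default_rng(0))
x_aniso, eps_aniso = forward_sample(x0, 0.8, 0.6, np.random.default_rng(0),
                                    weights=low_pass)
```

With `weights=None` the function is the familiar isotropic step, so the rest of a training loop would not need to change. In a noise-prediction setup the returned `eps` would presumably serve as the regression target, although the summary above does not spell out the loss.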


