TLDR: A new study defines the fundamental limits of robust watermarking for generative AI, proving that no scheme can survive if more than half of its encoded bits are modified. It also shows that a simple crop-and-resize operation can practically erase state-of-the-art image watermarks without visible degradation, confirming these theoretical limits are already met in practice.
In an era where generative AI models like GPT, Llama, and Stable Diffusion are blurring the lines between human-created and machine-generated content, the challenge of reliably distinguishing between the two has become paramount. Watermarking has emerged as a leading strategy to address this, by embedding a secret, imperceptible pattern into AI outputs that can later be verified with a key.
Ideally, a watermark should be robust enough to survive significant modifications, undetectable to those without the key, and should not degrade the quality or meaning of the original content. While previous research has explored cryptographic watermarking using pseudorandom error-correcting codes (PRCs) that offer both undetectability and robustness, a fundamental question remained: what are the ultimate limits of robustness for any cryptographically sound watermark?
A recent research paper, “The Coding Limits of Robust Watermarking for Generative Models”, by Danilo Francati, Yevin Nikhel Goonatilake, Shubham Pawar, Daniele Venturi, and Giuseppe Ateniese, tackles this question head-on. The authors introduce a novel concept called “messageless secret-key codes” to formalize the essential requirements for robust watermarking: soundness (avoiding false positives), tamper detection (identifying modifications), and pseudorandomness (making the watermark indistinguishable from random noise).
Through this abstraction, the paper establishes a precise information-theoretic threshold for watermark robustness. For binary outputs, no watermarking scheme can reliably survive if more than half of the encoded bits are modified. More generally, for alphabets of size q (with q ≥ 2), the limit is a (1 – 1/q) fraction of the symbols, which reduces to 1/2 in the binary case q = 2. This means there is a fundamental barrier that no watermarking scheme, regardless of its design, can overcome once tampering exceeds this fraction.
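The intuition behind the 1/2 barrier can be seen in a minimal simulation (an illustrative sketch, not the paper's formalism): an adversary who flips each bit independently with probability 1/2 produces a string that is uniformly random and carries no statistical trace of the original codeword.

```python
import random

def agreement(a, b):
    """Fraction of positions where two bit strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def tamper(bits, p):
    """Flip each bit independently with probability p."""
    return [b ^ (random.random() < p) for b in bits]

random.seed(0)
n = 100_000
codeword = [random.randint(0, 1) for _ in range(n)]

# At p = 0.5 the tampered string is uniformly random and statistically
# independent of the codeword: agreement drops to ~1/2, exactly what an
# unrelated random string would show, so no detector can distinguish them.
agree_mild = agreement(codeword, tamper(codeword, 0.3))
agree_half = agreement(codeword, tamper(codeword, 0.5))
print(f"30% flips: agreement {agree_mild:.3f}")  # well above 1/2, detectable
print(f"50% flips: agreement {agree_half:.3f}")  # ~1/2, indistinguishable
```

Below the threshold, residual correlation with the codeword survives and detection remains possible; at the threshold, that correlation vanishes entirely.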
Complementing this impossibility result, the researchers also provide explicit constructions of codes that nearly achieve these theoretical limits. These constructions, which are efficient and operate in linear time, demonstrate that it is possible to build watermarking schemes that tolerate errors up to just under half of the bits in the binary case, or just under (1 – 1/q) of the symbols in the q-ary case. This confirms that the robustness achieved by current PRC-based watermarking schemes is, in fact, information-theoretically optimal.
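Even a toy repetition code with majority-vote decoding illustrates tolerance of any error rate strictly below 1/2 (this sketch is only an illustration of the error threshold — the paper's constructions are far more rate-efficient and additionally provide pseudorandomness):

```python
import random

def repeat_encode(bits, r):
    """Toy code: repeat each bit r times (NOT the paper's construction,
    which is far more efficient, but the error threshold is the same)."""
    return [b for b in bits for _ in range(r)]

def repeat_decode(symbols, r):
    """Majority vote over each block of r copies; a bit is recovered
    whenever fewer than half the copies in its block were flipped."""
    return [int(sum(symbols[i:i + r]) > r // 2)
            for i in range(0, len(symbols), r)]

random.seed(1)
msg = [random.randint(0, 1) for _ in range(100)]
r = 2001
code = repeat_encode(msg, r)

# Flip 44% of the coded symbols -- close to, but below, the 1/2 limit.
# Majority vote still recovers every message bit with overwhelming
# probability; past 1/2, recovery becomes information-theoretically impossible.
noisy = [s ^ (random.random() < 0.44) for s in code]
decoded = repeat_decode(noisy, r)
print(decoded == msg)  # True (with overwhelming probability)
```

Both encoding and decoding here run in linear time in the codeword length, mirroring (in spirit, not in rate) the efficiency property the paper establishes for its constructions.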
To validate their theoretical findings in a practical setting, the team conducted experiments on a state-of-the-art PRC-based image watermarking scheme by Gunn, Zhao, and Song (ICLR 2025). They discovered a remarkably simple yet effective attack: a “crop-and-resize” operation. By cropping just 15 pixels from each side of a watermarked 512×512 image and then resizing it back to its original dimensions using bicubic interpolation, the watermark was consistently erased.
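The geometry of the attack is easy to reproduce. The sketch below uses a pure-Python bilinear resampler for self-containment (the paper's experiment uses bicubic interpolation, but the coordinate-resampling effect is the same); the toy image and the `crop_and_resize` helper are illustrative, not the authors' code.

```python
def crop_and_resize(img, margin):
    """Crop `margin` pixels from every side, then resample back to the
    original size. Bilinear interpolation is used here for brevity; the
    paper's attack uses bicubic, but the geometric effect is the same."""
    h, w = len(img), len(img[0])
    cropped = [row[margin:w - margin] for row in img[margin:h - margin]]
    ch, cw = len(cropped), len(cropped[0])
    out = []
    for y in range(h):
        fy = y * (ch - 1) / (h - 1)   # map output row into cropped image
        y0 = int(fy); ty = fy - y0
        y1 = min(y0 + 1, ch - 1)
        row = []
        for x in range(w):
            fx = x * (cw - 1) / (w - 1)
            x0 = int(fx); tx = fx - x0
            x1 = min(x0 + 1, cw - 1)
            top = cropped[y0][x0] * (1 - tx) + cropped[y0][x1] * tx
            bot = cropped[y1][x0] * (1 - tx) + cropped[y1][x1] * tx
            row.append(top * (1 - ty) + bot * ty)
        out.append(row)
    return out

# 512x512 toy grayscale image; crop 15 px per side, as in the experiment.
img = [[(x * y) % 256 for x in range(512)] for y in range(512)]
attacked = crop_and_resize(img, margin=15)
print(len(attacked), len(attacked[0]))  # 512 512
```

With Pillow, the equivalent transformation on a real image is roughly `Image.open(path).crop((15, 15, 497, 497)).resize((512, 512), Image.BICUBIC)`.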
Visually, the cropped and resized images were almost indistinguishable from the originals, preserving their perceptual quality. However, this seemingly benign transformation reliably flipped about half of the latent signs in the image’s underlying representation. This critical 50% error rate is precisely the theoretical threshold at which belief-propagation decoding, used by the watermark detector, fails to recover the original codeword.
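The resulting decoding failure can be mimicked with a toy threshold detector. The actual scheme decodes via belief propagation; `detect`, the 0.6 threshold, and the ±1 sign vectors below are illustrative assumptions standing in for that machinery.

```python
import random

def detect(latent_signs, key_signs, threshold=0.6):
    """Toy sign-agreement detector: a stand-in for the scheme's
    belief-propagation decoder, which likewise needs the latent signs
    to agree with the keyed pattern well beyond the 50% chance level."""
    agree = sum(a == b for a, b in zip(latent_signs, key_signs)) / len(key_signs)
    return agree >= threshold

random.seed(2)
n = 10_000
key = [random.choice([-1, 1]) for _ in range(n)]

watermarked = key[:]                                          # embedded signs
attacked = [s * random.choice([-1, 1]) for s in watermarked]  # ~50% sign flips
clean = [random.choice([-1, 1]) for _ in range(n)]            # unwatermarked

print(detect(watermarked, key))  # True: watermark present
print(detect(attacked, key))     # False: ~50% agreement, decoding fails
print(detect(clean, key))        # False: soundness, no false positive
```

After the attack, the watermarked image's latent signs are statistically indistinguishable from those of an image that never carried a watermark.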
The success of the crop-and-resize attack highlights a crucial distinction: unlike noise, blur, color shifts, or compression, which alter pixel values but maintain the image’s coordinate system, cropping and resizing globally resamples the image. This forces the encoder to reinterpret the visual content within a new coordinate system, scrambling the latent sign pattern to the point where it becomes indistinguishable from random noise, thus overwhelming the error-correcting capacity of the watermark.
This research offers a complete characterization of robust watermarking, pinpointing the exact threshold where robustness breaks down, providing constructions that meet this bound, and offering experimental confirmation that this limit is already being reached in practice. The findings suggest that any significant future advancements in watermark robustness will likely require entirely new approaches, potentially leveraging semantic or structural features of content rather than relying solely on cryptographic pseudorandomness.


