TLDR: ActiveMark is a novel watermarking method for Visual Foundation Models (VFMs) that embeds digital watermarks into a model’s internal representations by leveraging ‘massive activations’ in specific layers. The technique lets owners verify their intellectual property even after the model has been fine-tuned or pruned, achieving high detection rates for watermarked models and low false-positive rates for independent ones, while remaining computationally efficient.
Visual Foundation Models (VFMs) are powerful AI systems trained on vast datasets, capable of adapting to many computer vision tasks like image classification and segmentation. Their development requires significant investment in data collection and training, making them valuable assets for their owners. However, the ease with which these models can be copied and redistributed illegally poses a significant challenge to protecting intellectual property rights.
To address this, researchers are developing methods to verify ownership. One prominent approach is watermarking, where specific information is embedded into a model by modifying its internal parameters. This embedded information can then be checked to confirm ownership. Another method, fingerprinting, generates a unique identifier for a model without altering it, and ownership is verified by comparing fingerprints.
A new method called ActiveMark has been introduced, specifically designed for watermarking visual foundation models. ActiveMark embeds digital watermarks into the hidden representations of a select set of input images. This approach leverages a concept known as “massive activations,” which are unusually high response values observed in specific layers or tokens within a VFM. These massive activations often dominate subsequent layers and are found to be ideal locations for embedding watermarks due to their significant impact on the model’s internal representations.
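The idea of “massive activations” can be made concrete with a small sketch: scan one block’s output for values whose magnitude dwarfs the typical scale. The `ratio` threshold and the detection rule below are illustrative assumptions, not the paper’s exact criterion.

```python
import torch

def find_massive_activations(hidden_states, ratio=100.0):
    """Flag activation values far above the typical magnitude.

    hidden_states: (batch, tokens, channels) output of one transformer block.
    A value counts as 'massive' when |x| exceeds `ratio` times the median |x|
    (an illustrative rule; the paper's precise criterion may differ).
    """
    mags = hidden_states.abs()
    mask = mags > ratio * mags.median()
    # (sample, token, channel) coordinates of massive activations
    return mask.nonzero(as_tuple=False)

# Toy example: a small block output with one injected spike
torch.manual_seed(0)
h = torch.randn(1, 4, 8) * 0.1
h[0, 2, 5] = 50.0  # simulate a massive activation
coords = find_massive_activations(h)
```

Here the single spiked entry is the only coordinate returned; in a real VFM, such outliers recur at consistent layers and tokens, which is what makes them stable anchors for a watermark.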
The process involves fine-tuning a small number of the VFM’s later layers, along with training lightweight encoder and decoder networks. The encoder injects a user-specific binary signature (watermark) into a chosen channel of the internal activation of a preselected transformer block. This modified representation then passes through the rest of the VFM. A decoder network at the final block extracts the binary message, allowing for ownership verification.
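The embed/extract pipeline described above can be sketched with minimal stand-in networks. All names, shapes, and the choice of linear layers here are assumptions for illustration; the paper’s actual encoder and decoder architectures are not reproduced.

```python
import torch
import torch.nn as nn

# Illustrative dimensions: signature length, token count, feature width,
# and the preselected channel to perturb (all hypothetical values).
NUM_BITS, TOKENS, DIM, CHANNEL = 32, 16, 64, 7

encoder = nn.Linear(NUM_BITS, TOKENS)  # signature -> per-token perturbation
decoder = nn.Linear(DIM, NUM_BITS)     # final features -> recovered bits

def embed(hidden, signature):
    """Inject the binary signature into one channel of a block's activation.

    hidden: (batch, TOKENS, DIM) activation of the preselected block.
    signature: (NUM_BITS,) binary watermark in {0, 1}.
    """
    perturb = encoder(signature.float())          # (TOKENS,)
    hidden = hidden.clone()
    hidden[:, :, CHANNEL] = hidden[:, :, CHANNEL] + perturb
    return hidden

def extract(final_features):
    """Decode the binary message from the final block's features."""
    logits = decoder(final_features.mean(dim=1))  # pool tokens -> (batch, NUM_BITS)
    return (logits > 0).int()

sig = torch.randint(0, 2, (NUM_BITS,))
h = torch.randn(2, TOKENS, DIM)
h_marked = embed(h, sig)          # modified representation continues through the VFM
bits = extract(h_marked)          # untrained nets: bits won't match sig until training
```

In the full method, `h_marked` would pass through the remaining VFM blocks before decoding, and both small networks are trained jointly with the later VFM layers.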
The training objective for ActiveMark is twofold: it ensures that the watermarked model’s internal representations remain very similar to the original model’s, and it forces the extracted watermark to be nearly identical to the embedded one. This balance ensures that the watermark is successfully embedded and extractable with minimal impact on the model’s functional performance.
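The two-term objective can be written as a fidelity loss plus an extraction loss. The MSE/BCE pairing and the weighting `lam` are assumptions for this sketch; the paper’s exact loss terms and balance may differ.

```python
import torch
import torch.nn.functional as F

def activemark_loss(h_wm, h_orig, logits, signature, lam=1.0):
    """Illustrative two-term training objective.

    fidelity:   keep watermarked representations close to the original model's.
    extraction: push the decoded message toward the embedded signature.
    `lam` balances the two terms (a hypothetical hyperparameter).
    """
    fidelity = F.mse_loss(h_wm, h_orig)
    extraction = F.binary_cross_entropy_with_logits(logits, signature.float())
    return fidelity + lam * extraction

# Toy tensors standing in for real activations and decoder outputs
h_orig = torch.randn(2, 16, 64)
h_wm = h_orig + 0.01 * torch.randn_like(h_orig)   # nearly identical representations
logits = torch.randn(2, 32)                       # decoder outputs, pre-sigmoid
sig = torch.randint(0, 2, (2, 32))
loss = activemark_loss(h_wm, h_orig, logits, sig)
```

Minimizing the first term preserves the model’s functional behavior; minimizing the second makes the watermark reliably extractable, which is exactly the balance the paragraph above describes.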
To evaluate its effectiveness, ActiveMark measures the “watermark detection rate,” which indicates how reliably the embedded watermark can be extracted. A good watermarking method should have a high detection rate for copies of the watermarked model and a very low detection rate for independent, non-watermarked models. The researchers also developed a statistical method to set a threshold for detection, minimizing the chances of falsely identifying a non-watermarked model or failing to detect a watermarked one.
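One standard way to set such a threshold, which may or may not match the paper’s exact statistical test, is a binomial tail bound: an independent model should recover each watermark bit with probability 1/2, so we pick the smallest matched-bit count whose chance under that null falls below a target false-positive rate.

```python
from math import comb

def detection_threshold(num_bits, alpha=1e-6):
    """Smallest matched-bit count k with P(X >= k) <= alpha under the null
    that each bit matches with probability 1/2 (a non-watermarked model).
    A standard binomial bound; the paper's exact test may differ.
    """
    total = 2 ** num_bits
    tail = 0
    for k in range(num_bits, -1, -1):
        tail += comb(num_bits, k)
        if tail / total > alpha:
            return k + 1
    return 0

thr = detection_threshold(32)  # bits that must match out of 32 to claim ownership
```

Declaring a model watermarked only when at least `thr` of the extracted bits match keeps the false-positive probability below `alpha`, while a genuinely watermarked copy should match nearly all bits.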
Experiments showed that early transformer blocks are not suitable for embedding, yielding low detection rates and high extraction errors. In contrast, specific later blocks, particularly block 12 in models like CLIP, demonstrated high detection rates and low errors. This block also happens to be one of the first layers where massive activations emerge, supporting the hypothesis that these regions are effective for watermark embedding.
ActiveMark was tested for robustness against common model modifications, such as fine-tuning for downstream tasks (like image classification and segmentation) and pruning (reducing model size). The results indicate that the watermarks remain detectable even after these significant alterations. For example, fine-tuning a watermarked CLIP model for semantic segmentation still yielded high detection rates.
When compared to other general-purpose watermarking techniques like ADV-TRA and IPGuard, ActiveMark demonstrated superior watermark detection rates for both positive (functional copies) and negative (independent) suspect models. Furthermore, ActiveMark significantly reduced the computational time required for watermark embedding, taking only 34.63 minutes compared to 1663.70 minutes for ADV-TRA and 1868.54 minutes for IPGuard on a single GPU.
In conclusion, ActiveMark offers a novel, robust, and efficient solution for watermarking visual foundation models. It is designed to be model-agnostic, meaning the owner only needs to perform the embedding procedure once. The watermark remains detectable even after fine-tuning for various tasks, and the method effectively distinguishes between legitimate copies and independent models, making it highly applicable in practical scenarios. For more technical details, refer to the full research paper.


