TLDR: Researchers at MIT propose Manifold-approximated Kernel Alignment (MKA), a new metric for comparing data representations that incorporates manifold geometry. MKA addresses limitations of existing methods such as Centered Kernel Alignment (CKA) by focusing on local data relationships through k-nearest neighbors. Empirical evaluations show MKA is more robust, more consistent, and less sensitive to hyperparameters across synthetic and real-world datasets, offering a more reliable measure of representational similarity for neural network analysis and representation learning.
This research introduces a new method for comparing data representations called Manifold-approximated Kernel Alignment (MKA). It aims to improve upon the widely used Centered Kernel Alignment (CKA) by incorporating the underlying geometry of data manifolds.
CKA is a popular metric for understanding how different data representations, such as those learned by neural networks, relate to each other. It works by aligning “kernels”, matrices that capture pairwise relationships within a dataset. However, the researchers point out that CKA often fails to respect the underlying manifold structure of the data and can behave inconsistently across data scales, which is why its reliability has been questioned in several studies.
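To make the kernel-alignment idea concrete, here is a minimal sketch of *linear* CKA, one common instance of the metric (the data and dimensions below are arbitrary illustrations, not from the paper):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two representations.

    X, Y: (n_samples, n_features) arrays; feature dimensions may differ.
    """
    # Center each feature column so the Gram matrices are centered kernels
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # HSIC-style cross-similarity, normalized by each kernel's own norm
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    self_x = np.linalg.norm(X.T @ X, "fro")
    self_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (self_x * self_y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))
# CKA is invariant to orthogonal rotations of the feature space
Q, _ = np.linalg.qr(rng.normal(size=(32, 32)))
print(round(linear_cka(X, X @ Q), 4))  # → 1.0
```

Because every pair of points contributes to the Gram matrices, global scale and dense regions dominate the score, which is exactly the sensitivity the paper targets.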
The core idea behind MKA is to integrate manifold geometry into the alignment process. The “manifold hypothesis” suggests that high-dimensional data, such as medical images or neuroimaging data, often lies on or near a lower-dimensional curved structure (a manifold) embedded within that high-dimensional space. Manifold approximation techniques, like t-SNE and UMAP, are designed to uncover this hidden structure.
MKA leverages manifold approximation to define a unique kernel that is non-linear and non-Mercer. This kernel is often sparse and is typically derived using the k-nearest neighbor (KNN) algorithm. Unlike CKA, which considers all possible pairs of data points, MKA focuses on local relationships by only considering k-nearest neighbors. This approach makes the kernel less sensitive to outliers and imposes a rank order within each row of the kernel matrix.
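A rough sketch of this local-neighborhood idea follows. Note this is an illustrative simplification, not the paper's actual kernel: the binary kNN graph and the normalized-inner-product alignment below are hypothetical stand-ins for MKA's manifold-derived kernel and alignment step.

```python
import numpy as np

def knn_kernel(X, k=10):
    """Sparse kNN kernel: row i marks the k nearest neighbors of point i.

    Illustrative only -- MKA derives its kernel from manifold
    approximation; this binary kNN graph is a simplification.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-neighbors
    idx = np.argsort(d, axis=1)[:, :k]   # k nearest neighbors per row
    K = np.zeros_like(d)
    K[np.repeat(np.arange(len(X)), k), idx.ravel()] = 1.0
    return K

def kernel_alignment(K1, K2):
    """Normalized inner product of two kernels (a stand-in for
    MKA's alignment step, not the paper's exact formula)."""
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 16))
# Translation preserves every neighborhood, so alignment is exactly 1
print(kernel_alignment(knn_kernel(X), knn_kernel(X + 5.0)))  # → 1.0
```

Because each row of the kernel depends only on the k nearest neighbors, distant outliers and global shifts cannot change it, which is the intuition behind MKA's robustness results below.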
The researchers developed a theoretical framework for MKA and conducted extensive empirical evaluations. Their findings suggest that MKA is more consistent across varying data dimensionality and under topology-preserving changes of shape. It also appears to capture the underlying data topology more effectively and is less sensitive to hyperparameters than CKA and other contemporary methods.
Experiments on synthetic datasets, including Swiss-roll and S-curve shapes, demonstrated MKA’s ability to correctly align topologically equivalent structures, where CKA sometimes failed. MKA also showed greater robustness to the number of nearest neighbors (k) compared to other methods like kCKA. Further tests on “rings” and “clusters” datasets confirmed MKA’s superior ability to track changes in data structure and its robustness to the ‘k’ parameter.
In scenarios involving perturbed Gaussian spots and lost correspondence, MKA proved stricter toward feature perturbations and more consistent across hyperparameter settings. It also showed robustness to data translation, maintaining high alignment scores even when data points were moved far apart, a setting where some other methods fail.
The paper also highlights MKA’s performance on the Representational Similarity (ReSi) Benchmark, a collection of tests for evaluating alignment metrics across different domains (vision, natural language processing, and graphs). MKA achieved strong performance, particularly in the vision domain, and remained competitive in NLP and graph tasks. This suggests MKA is a consistent and parameter-light choice across modalities.
When analyzing neural network representations, MKA revealed a different perspective compared to CKA. While CKA often shows a “block structure” in neural network layers, MKA, by focusing on local neighborhoods, significantly reduces or eliminates this structure, especially in later layers. This indicates that MKA is less sensitive to dominant high-density regions and large distances in the data, providing a more nuanced view of how representations evolve within networks.
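The “block structure” in question comes from computing a similarity score for every pair of layers and plotting the resulting grid. A minimal sketch of that layer-by-layer grid, using linear CKA on toy random activations (hypothetical data, not the paper's networks):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two (n_samples, n_features) activation matrices."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    return (np.linalg.norm(Y.T @ X, "fro") ** 2
            / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")))

# Toy "layers": independent random activations standing in for the
# representations captured at successive layers of a network.
rng = np.random.default_rng(2)
layers = [rng.normal(size=(500, 16)) for _ in range(4)]

# The layer-by-layer grid that similarity papers visualize as a heatmap;
# block structure appears when groups of layers score near 1 together.
S = np.array([[linear_cka(A, B) for B in layers] for A in layers])
print(np.round(S, 2))
```

Swapping `linear_cka` for a local-neighborhood metric like MKA in this grid is what reveals the reduced block structure the authors report.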
The authors have made an implementation of MKA available, and the code used for the experiments is also publicly accessible. This work paves the way for applying manifold approximation in diverse fields, including neuroscience for brain activity monitoring and graph learning for protein interactions. For more technical details, the full research paper can be accessed here: Manifold Approximation leads to Robust Kernel Alignment.


