TLDR: Researchers at MIT propose Manifold-approximated Kernel Alignment (MKA), a new metric for comparing data representations that incorporates manifold geometry. MKA addresses limitations of existing methods such as Centered Kernel Alignment (CKA) by focusing on local data relationships through k-nearest neighbors. Empirical evaluations show MKA is more robust, more consistent, and less sensitive to hyperparameters across synthetic and real-world datasets, offering a more reliable measure of representational similarity for neural network analysis and representation learning.
This research introduces a new method for comparing data representations called Manifold-approximated Kernel Alignment (MKA). It aims to improve upon the widely used Centered Kernel Alignment (CKA) by incorporating the underlying geometry of data manifolds.
CKA is a popular metric for understanding how different data representations, such as those learned by neural networks, relate to each other. It works by aligning “kernels”, matrices that capture pairwise relationships within a dataset. However, the researchers point out that CKA often fails to respect the underlying manifold structure of the data and can behave inconsistently across data scales, which is why its reliability has been questioned in several studies.
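To make the kernel-alignment idea concrete, here is a minimal sketch of *linear* CKA, one common instance of the metric (the data and dimensions below are arbitrary illustrations, not from the paper):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two representations.

    X, Y: (n_samples, n_features) arrays; feature dimensions may differ.
    """
    # Center each feature column so the Gram matrices are centered kernels
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # HSIC-style cross-similarity, normalized by each kernel's own norm
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    self_x = np.linalg.norm(X.T @ X, "fro")
    self_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (self_x * self_y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))
# CKA is invariant to orthogonal rotations of the feature space
Q, _ = np.linalg.qr(rng.normal(size=(32, 32)))
print(round(linear_cka(X, X @ Q), 4))  # → 1.0
```

Because every pair of points contributes to the Gram matrices, global scale and dense regions dominate the score, which is exactly the sensitivity the paper targets.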
The core idea behind MKA is to integrate manifold geometry into the alignment process. The “manifold hypothesis” suggests that high-dimensional data, such as medical images or neuroimaging data, often lies on or near a lower-dimensional curved structure (a manifold) embedded within that high-dimensional space. Manifold approximation techniques, like t-SNE and UMAP, are designed to uncover this hidden structure.
MKA leverages manifold approximation to define a unique kernel that is non-linear and non-Mercer. This kernel is often sparse and is typically derived using the k-nearest neighbor (KNN) algorithm. Unlike CKA, which considers all possible pairs of data points, MKA focuses on local relationships by only considering k-nearest neighbors. This approach makes the kernel less sensitive to outliers and imposes a rank order within each row of the kernel matrix.
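A rough sketch of this local-neighborhood idea follows. Note this is an illustrative simplification, not the paper's actual kernel: the binary kNN graph and the normalized-inner-product alignment below are hypothetical stand-ins for MKA's manifold-derived kernel and alignment step.

```python
import numpy as np

def knn_kernel(X, k=10):
    """Sparse kNN kernel: row i marks the k nearest neighbors of point i.

    Illustrative only -- MKA derives its kernel from manifold
    approximation; this binary kNN graph is a simplification.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-neighbors
    idx = np.argsort(d, axis=1)[:, :k]   # k nearest neighbors per row
    K = np.zeros_like(d)
    K[np.repeat(np.arange(len(X)), k), idx.ravel()] = 1.0
    return K

def kernel_alignment(K1, K2):
    """Normalized inner product of two kernels (a stand-in for
    MKA's alignment step, not the paper's exact formula)."""
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 16))
# Translation preserves every neighborhood, so alignment is exactly 1
print(kernel_alignment(knn_kernel(X), knn_kernel(X + 5.0)))  # → 1.0
```

Because each row of the kernel depends only on the k nearest neighbors, distant outliers and global shifts cannot change it, which is the intuition behind MKA's robustness results below.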
The researchers developed a theoretical framework for MKA and conducted extensive empirical evaluations. Their findings suggest that MKA is more consistent across varying data dimensionality and under topology-preserving changes of shape. It also appears to capture the underlying data topology more effectively and is less sensitive to hyperparameters than CKA and other contemporary methods.
Experiments on synthetic datasets, including Swiss-roll and S-curve shapes, demonstrated MKA’s ability to correctly align topologically equivalent structures, where CKA sometimes failed. MKA also showed greater robustness to the number of nearest neighbors (k) compared to other methods like kCKA. Further tests on “rings” and “clusters” datasets confirmed MKA’s superior ability to track changes in data structure and its robustness to the ‘k’ parameter.
In scenarios involving perturbed Gaussian spots and lost correspondence, MKA proved stricter toward feature perturbations and more consistent across hyperparameter settings. It also showed robustness to data translation, maintaining high alignment scores even when data points were moved far apart, a setting where some other methods fail.
The paper also highlights MKA’s performance on the Representational Similarity (ReSi) Benchmark, a collection of tests for evaluating alignment metrics across different domains (vision, natural language processing, and graphs). MKA achieved strong performance, particularly in the vision domain, and remained competitive in NLP and graph tasks. This suggests MKA is a consistent and parameter-light choice across modalities.
When analyzing neural network representations, MKA revealed a different perspective compared to CKA. While CKA often shows a “block structure” in neural network layers, MKA, by focusing on local neighborhoods, significantly reduces or eliminates this structure, especially in later layers. This indicates that MKA is less sensitive to dominant high-density regions and large distances in the data, providing a more nuanced view of how representations evolve within networks.
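The “block structure” in question comes from computing a similarity score for every pair of layers and plotting the resulting grid. A minimal sketch of that layer-by-layer grid, using linear CKA on toy random activations (hypothetical data, not the paper's networks):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two (n_samples, n_features) activation matrices."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    return (np.linalg.norm(Y.T @ X, "fro") ** 2
            / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")))

# Toy "layers": independent random activations standing in for the
# representations captured at successive layers of a network.
rng = np.random.default_rng(2)
layers = [rng.normal(size=(500, 16)) for _ in range(4)]

# The layer-by-layer grid that similarity papers visualize as a heatmap;
# block structure appears when groups of layers score near 1 together.
S = np.array([[linear_cka(A, B) for B in layers] for A in layers])
print(np.round(S, 2))
```

Swapping `linear_cka` for a local-neighborhood metric like MKA in this grid is what reveals the reduced block structure the authors report.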
The authors have made an implementation of MKA available, and the code used for the experiments is also publicly accessible. This work paves the way for applying manifold approximation in diverse fields, including neuroscience for brain activity monitoring and graph learning for protein interactions. For more technical details, the full research paper can be accessed here: Manifold Approximation leads to Robust Kernel Alignment.


