TLDR: PLAME is a novel model that enhances protein structure prediction by generating high-quality Multiple Sequence Alignments (MSAs), especially for proteins with limited evolutionary information. It leverages pretrained protein language models, a unique conservation-diversity loss function, and an MSA selection method (HiFiAD) to produce biologically plausible alignments. This approach significantly boosts the accuracy of tools like AlphaFold2 and AlphaFold3, and can enable faster, MSA-free predictions with comparable accuracy.
Understanding the intricate 3D shapes of proteins is a fundamental challenge in biology, crucial for developing new drugs and comprehending biological processes. Recent breakthroughs, notably AlphaFold, have dramatically improved our ability to predict these structures with high accuracy. However, many of these advanced prediction models rely heavily on Multiple Sequence Alignments (MSAs).
MSAs line up related protein sequences position by position, much like a family tree laid out so that relatives can be compared and their divergence over time becomes visible. The patterns of conservation and variation across the alignment provide vital evolutionary clues that help predict a protein’s structure. The problem arises with “low-homology proteins” (those with few known relatives) and “orphan proteins” (those with no known relatives), where MSA information is scarce or completely missing. This limitation significantly hampers the effectiveness of even the most advanced folding models.
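To make the idea of “evolutionary clues” concrete, here is a minimal sketch that measures how variable each column of a tiny alignment is. The toy sequences and the `column_entropy` helper are made up for illustration and do not come from the PLAME paper; the measure itself is plain Shannon entropy, where 0 means a column never changes.

```python
import math
from collections import Counter

def column_entropy(msa):
    """Shannon entropy (in bits) of each alignment column; a gap counts as a symbol."""
    length = len(msa[0])
    assert all(len(row) == length for row in msa), "aligned rows must share one length"
    entropies = []
    for col in range(length):
        counts = Counter(row[col] for row in msa)
        total = sum(counts.values())
        h = sum((n / total) * math.log2(total / n) for n in counts.values())
        entropies.append(h)
    return entropies

# Hypothetical toy alignment: four related sequences, five columns.
toy_msa = [
    "MKVLA",
    "MKILA",
    "MRVLG",
    "MKV-A",
]

entropy = column_entropy(toy_msa)
for position, h in enumerate(entropy):
    label = "conserved" if h == 0 else "variable"
    print(f"column {position}: {h:.3f} bits ({label})")
```

With only four rows the estimate is crude, which is exactly the situation low-homology proteins are in: the fewer relatives there are, the less reliable these per-column signals become.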
To tackle this, researchers have introduced a novel model called PLAME (PLm Aligner for MSA Enhancement). PLAME is designed to generate high-quality MSAs, particularly for these challenging proteins, by leveraging the power of pretrained protein language models. Unlike previous methods that might struggle with limited evolutionary data, PLAME taps into the rich “evolutionary embeddings” learned by these language models. These embeddings provide a deeper, more informative understanding of protein evolution, even when traditional MSA data is sparse.
A key innovation in PLAME is its “conservation-diversity loss” function. It pushes the generated MSAs to reproduce highly conserved (unchanging) regions accurately, since these are critical for a protein’s core function and structure, while keeping sufficient diversity in less conserved areas. This balance is vital for creating biologically plausible MSAs that improve prediction quality.
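The article does not give the formula for this loss, so the following is only a rough sketch of the general idea, not PLAME’s actual objective. It assumes a per-position conservation score between 0 and 1, penalises wrong residues more strongly where conservation is high, and rewards a spread-out prediction where conservation is low. The function name, its arguments, and the `diversity_weight` parameter are all hypothetical.

```python
import math

def conservation_diversity_loss(pred_probs, target_idx, conservation, diversity_weight=0.1):
    """
    Illustrative loss, NOT the formula from the PLAME paper.

    pred_probs:   one probability list per position (each sums to 1)
    target_idx:   index of the reference residue at each position
    conservation: score in [0, 1] per position, 1 = fully conserved
    """
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for probs, target, cons in zip(pred_probs, target_idx, conservation):
        cross_entropy = -math.log(probs[target] + eps)
        pred_entropy = -sum(p * math.log(p + eps) for p in probs)
        # Conserved positions pay for missing the reference residue;
        # variable positions are rewarded for keeping several options open.
        total += cons * cross_entropy - diversity_weight * (1.0 - cons) * pred_entropy
    return total / len(pred_probs)

# Toy check over a 4-letter alphabet.
confident = [[0.97, 0.01, 0.01, 0.01]]
uniform = [[0.25, 0.25, 0.25, 0.25]]

print(conservation_diversity_loss(confident, [0], [1.0]))  # conserved site, confident guess
print(conservation_diversity_loss(uniform, [0], [1.0]))    # conserved site, vague guess
print(conservation_diversity_loss(confident, [0], [0.0]))  # variable site, confident guess
print(conservation_diversity_loss(uniform, [0], [0.0]))    # variable site, vague guess
```

In this toy setup, a confident correct prediction scores best at a conserved site, while a spread-out prediction scores best at a variable site, which is the balance the paragraph above describes.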
PLAME also introduces an MSA selection method called High-Fidelity Appropriate Diversity (HiFiAD). It acts as a filter, screening the generated MSAs and keeping those that strike the right balance between being faithful to known protein characteristics and offering useful diversity. This selection step reduces noise so that only the most beneficial MSAs are used to improve folding performance.
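The exact criteria HiFiAD uses are not described here, so the sketch below only illustrates the flavour of such a filter under a simple assumption: a generated sequence is kept if its identity to the query falls inside a band, high enough to stay faithful and low enough to add diversity. The thresholds and the function names are hypothetical, not taken from the paper.

```python
def sequence_identity(a, b):
    """Fraction of aligned positions with the same residue, ignoring gapped positions."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)

def select_candidates(query, candidates, min_identity=0.4, max_identity=0.95):
    """Illustrative filter, NOT the HiFiAD algorithm: keep unique rows inside an identity band."""
    kept = []
    seen = set()
    for seq in candidates:
        if len(seq) != len(query) or seq in seen:
            continue  # skip misaligned rows and exact duplicates
        identity = sequence_identity(query, seq)
        if min_identity <= identity <= max_identity:
            kept.append(seq)
            seen.add(seq)
    return kept

query = "MKVLAGT"
generated = [
    "MKVLAGT",  # identical to the query: adds no diversity
    "MKILAGS",  # similar but not identical
    "AAAAAAA",  # too far from the query to be trusted
    "MKILAGS",  # duplicate of an earlier row
    "MRV-AGT",  # similar, with one gap
]

selected = select_candidates(query, generated)
print(selected)
```

A band on identity is only one possible proxy for “fidelity” and “diversity”; a real selection method could just as well score candidates with a language model or a structure predictor.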
According to the authors’ experiments, PLAME achieves state-of-the-art performance, significantly improving protein structure prediction accuracy on challenging low-homology and orphan protein datasets. It has demonstrated consistent enhancements when used with both AlphaFold2 and the newer AlphaFold3. Interestingly, PLAME can also act as an “adapter,” allowing models like ESMFold (which typically predicts structures from a single sequence without an MSA) to achieve AlphaFold2-level accuracy while keeping ESMFold’s much faster inference speed. This means faster and more accurate predictions for a wider range of proteins.
In essence, PLAME addresses a critical bottleneck in protein structure prediction by providing a robust way to generate high-quality evolutionary information, even for the most elusive proteins. This advancement holds significant promise for accelerating drug discovery and deepening our understanding of biological functions. More details about this research are available in the full paper.


