TL;DR: A new framework, Knowledge-guided Preference Optimization (KPO), enhances the safety of protein language models (PLMs) by integrating a Protein Safety Knowledge Graph (PSKG). KPO uses an efficient graph pruning strategy to identify preferred (benign) protein sequences and employs reinforcement learning to minimize the generation of harmful proteins. Experimental results demonstrate that KPO effectively reduces the likelihood of producing hazardous sequences while maintaining high functionality, offering a robust safety assurance framework for applying generative models in biotechnology.
Protein language models (PLMs) are powerful tools that have transformed biological research, offering significant advantages in designing new proteins and optimizing existing ones. These models learn from vast datasets of protein sequences, uncovering hidden patterns that are difficult to find with traditional methods. For example, PLMs can help engineers design enzymes with improved efficiency or discover new antibodies to fight emerging diseases.
However, this transformative potential comes with significant safety challenges. Unlike text-based AI models, where the harms of a bad output are mainly ethical or social, PLMs operate in the biological domain, meaning their outputs could have direct and far-reaching consequences for human health and the environment. The accidental creation of harmful proteins—such as those that make viruses more transmissible, help pathogens evade immune responses, or lead to drug resistance—raises serious biosafety and ethical concerns. Current PLMs primarily focus on performance and generation quality, often overlooking these critical safety considerations.
Introducing Knowledge-guided Preference Optimization (KPO)
To address these vital biosafety issues, researchers have proposed a new framework called Knowledge-guided Preference Optimization (KPO). KPO is designed to integrate specific safety knowledge directly into the protein generation process of PLMs. At its heart is the Protein Safety Knowledge Graph (PSKG), a comprehensive resource that encodes crucial biochemical properties and relationships of both benign (harmless) and harmful proteins.
How KPO Works
The KPO framework operates in three main stages:
First, the **Protein Safety Knowledge Graph (PSKG) is constructed**. This graph is built by gathering data from databases like UniProt. It identifies harmful proteins (like toxins and antigens) and benign proteins, then connects them through Gene Ontology (GO) terms. GO terms describe biological processes, cellular components, and molecular functions, allowing the PSKG to map intricate relationships between different types of proteins.
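To make the graph structure concrete, here is a minimal, illustrative sketch of a PSKG-style graph in which proteins are linked to GO terms and thereby to each other. The identifiers and adjacency-list representation are assumptions for illustration, not the paper's actual data model.

```python
# Toy PSKG sketch: proteins connect to GO terms, and proteins sharing a
# GO term become indirectly related. Protein names are hypothetical;
# GO:0090729 (toxin activity) and GO:0003824 (catalytic activity) are
# real GO identifiers used here purely as examples.
from collections import defaultdict

def build_pskg(protein_annotations):
    """Build an undirected protein <-> GO-term graph from annotations."""
    graph = defaultdict(set)
    for protein, label, go_terms in protein_annotations:
        for go in go_terms:
            graph[protein].add(go)   # protein -> GO term edge
            graph[go].add(protein)   # GO term -> protein edge
    return graph

annotations = [
    ("toxin_A",  "harmful", ["GO:0090729"]),
    ("benign_B", "benign",  ["GO:0090729", "GO:0003824"]),
    ("benign_C", "benign",  ["GO:0003824"]),
]
pskg = build_pskg(annotations)

# Benign proteins reachable from "toxin_A" through a shared GO term:
related = {p for go in pskg["toxin_A"] for p in pskg[go]} - {"toxin_A"}
print(related)  # {'benign_B'}
```

Such shared-GO-term paths are what allow the later stages to pair a harmful protein with a functionally similar benign one.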
Second, a **node pruning strategy** is applied to the PSKG. Because the PSKG can be very large, containing hundreds of thousands of protein nodes, it needs to be made more computationally efficient. KPO uses a weighted metric-based algorithm to identify and keep only the most informative benign protein nodes and GO terms. This process significantly reduces the graph’s size while preserving its essential biological information, making downstream analysis much faster.
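The pruning idea can be sketched as scoring each prunable node and keeping only the top scorers. The scoring rule below (a weighted mix of node degree and links to harmful proteins) and its weights are illustrative assumptions, not the paper's exact metric.

```python
# Hedged sketch of weighted metric-based pruning: keep the keep_k most
# "informative" benign/GO nodes. The scoring formula is an assumption.
def prune_graph(graph, harmful, keep_k, w_degree=0.5, w_harm=0.5):
    scores = {}
    for node, neighbors in graph.items():
        if node in harmful:
            continue  # harmful reference nodes are always retained
        degree = len(neighbors)
        harm_links = sum(1 for n in neighbors if n in harmful)
        scores[node] = w_degree * degree + w_harm * harm_links
    kept = sorted(scores, key=scores.get, reverse=True)[:keep_k]
    return set(kept) | harmful

# Toy graph (hypothetical identifiers): GO:1 touches the toxin, so it
# scores highest and survives pruning.
graph = {
    "benign_B": {"GO:1", "GO:2"},
    "benign_C": {"GO:2"},
    "GO:1": {"toxin_A", "benign_B"},
    "GO:2": {"benign_B", "benign_C"},
    "toxin_A": {"GO:1"},
}
kept = prune_graph(graph, harmful={"toxin_A"}, keep_k=2)
print(kept)
```

In a real PSKG with hundreds of thousands of nodes, this kind of top-k filtering is what keeps downstream pair construction tractable.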
Finally, **PLMs are fine-tuned using KPO**. The refined PSKG is used to create ‘preference pairs.’ These pairs consist of a benign protein and a harmful protein that share some similarities in their structure or function. The idea is to teach the PLM to prefer generating sequences similar to the benign protein over the harmful one. This fine-tuning is done using a method called Direct Preference Optimization (DPO), which helps the model learn to distinguish subtle differences between safe and unsafe proteins. By doing so, KPO guides the PLM to generate protein sequences that are not only functionally relevant but also safe.
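The DPO objective on a single preference pair can be written down compactly. The sketch below assumes we already have log-likelihoods of the benign ("chosen") and harmful ("rejected") sequences under both the fine-tuned policy and a frozen reference model; the numeric values and `beta` are illustrative.

```python
# Minimal DPO loss on one (benign, harmful) preference pair:
#   loss = -log sigmoid(beta * [(logp_w - ref_logp_w) - (logp_l - ref_logp_l)])
# All inputs are assumed to be precomputed sequence log-likelihoods.
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# If the fine-tuned model prefers the benign sequence more strongly than
# the reference model does, the margin is positive and the loss drops
# below log(2) (the value at zero margin).
loss = dpo_loss(logp_chosen=-12.0, logp_rejected=-15.0,
                ref_logp_chosen=-13.0, ref_logp_rejected=-14.0)
print(round(loss, 4))
```

Minimizing this loss pushes probability mass toward the benign member of each pair, which is exactly how KPO steers generation away from harmful sequences.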
Promising Results and Future Directions
Experiments with KPO applied to various protein language models, including ProtGPT2, Progen2, and InstructProtein, have shown very encouraging results. KPO consistently reduced the likelihood of generating harmful protein sequences across multiple safety metrics, such as sequence similarity to known toxins and predicted toxicity. Importantly, it achieved this while maintaining or even improving the functional capabilities of the models.
Analysis of the generated proteins' "embeddings" (their numerical representations in the model's internal space) further confirmed KPO's effectiveness. Proteins generated by KPO-fine-tuned models formed distinct clusters, clearly separated from known harmful proteins, indicating that the models were successfully steered away from dangerous regions in the protein sequence landscape.
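A simple way to picture this separation check: compare the centroids of the two embedding clusters. The two-dimensional vectors below are synthetic stand-ins, not actual model embeddings.

```python
# Toy sketch of an embedding-space separation check. Real PLM embeddings
# are high-dimensional; these 2-D vectors are purely illustrative.
import math

def centroid(vectors):
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

harmful_embs   = [[0.9, 0.1], [1.0, 0.2]]   # cluster of known toxins
generated_embs = [[0.1, 0.9], [0.2, 1.0]]   # KPO-tuned model outputs

separation = euclidean(centroid(harmful_embs), centroid(generated_embs))
print(round(separation, 3))  # a large distance means well-separated clusters
```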
While KPO marks a significant step forward in ensuring biosafety in protein generation, the researchers acknowledge limitations. Current work primarily focuses on sequence-level safety, with future efforts aiming to incorporate structural-level constraints to avoid toxic 3D protein conformations. Additionally, scaling KPO to even larger PLMs and dynamically updating the PSKG with new safety insights will be crucial for its adaptability to evolving biological challenges.
This groundbreaking work provides a robust framework for integrating safety knowledge into protein language models, paving the way for more responsible applications in biotechnology and protein engineering.


