TLDR: SynTwins is a novel retrosynthesis-guided framework that generates synthetically accessible molecular analogs. It mimics expert chemists’ strategies through a three-step process: retrosynthesis, similar building block searching, and virtual synthesis. This approach ensures that AI-designed molecules are not only structurally similar to desired targets but also feasible to synthesize. SynTwins outperforms existing machine learning models in generating synthesizable analogs, offering a practical solution to accelerate drug and material discovery by bridging the gap between computational design and experimental synthesis.
In the rapidly evolving field of drug and material discovery, Artificial Intelligence (AI) has emerged as a powerful tool for generating novel molecules with desired properties. However, a significant challenge persists: many AI-designed molecules are incredibly difficult, if not impossible, to synthesize in a laboratory. This disconnect between computational design and practical synthesis creates a major bottleneck, slowing down the development of new medicines and materials.
A groundbreaking new framework called SynTwins aims to bridge this critical gap. Developed by researchers Shuan Chen, Gunwook Nam, and Yousung Jung, SynTwins offers a novel approach to designing molecular analogs that are not only structurally similar to target molecules but are also readily synthesizable using existing chemical reactions and commercially available building blocks. This innovative method emulates the intuitive strategies employed by experienced chemists, making the molecular design process more practical and efficient.
How SynTwins Works: A Chemist’s Intuition in Code
SynTwins operates through a clever three-step process that mirrors how a human chemist might approach a synthesis challenge:
1. Retrosynthesis: First, SynTwins takes a target molecule and performs a multi-step retrosynthetic analysis. This means it breaks down the complex target molecule into simpler precursor molecules, essentially working backward from the final product to its potential starting materials. This step identifies the key structural components.
2. Building Block Search: Once the precursors are identified, SynTwins searches for similar, readily available building blocks. This is crucial because it ensures that the components needed for synthesis are commercially accessible. The search ranks candidate building blocks by structural similarity to each identified precursor and keeps those that remain compatible with the subsequent reactions.
3. Virtual Synthesis: Finally, SynTwins virtually synthesizes new molecular analogs by applying established forward-reaction templates to the newly found building blocks. This step reconstructs the molecule, ensuring that the resulting analog can be formed using known and feasible chemical reactions.
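The three steps above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions, not the authors' implementation: molecules are reduced to sets of made-up substructure tokens, the catalog and the single "amide" template are hypothetical, retrosynthesis is a single hand-coded disconnection rather than a multi-step analysis, and Tanimoto similarity over token sets stands in for whatever similarity measure the real framework uses.

```python
from itertools import product

# Hypothetical toy catalog: name -> (reactive handle, substructure tokens).
CATALOG = {
    "benzoic_acid": ("acid", {"aromatic_ring"}),
    "toluic_acid": ("acid", {"aromatic_ring", "methyl"}),
    "acetic_acid": ("acid", {"methyl"}),
    "piperidine": ("amine", {"sat_ring"}),
    "morpholine": ("amine", {"sat_ring", "ether"}),
    "dimethylamine": ("amine", {"methyl"}),
}

# Hypothetical reaction template: linkage formed -> handles it consumes.
TEMPLATES = {"amide": ("acid", "amine")}


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def retrosynthesize(target):
    """Step 1 (toy, single step): split the target at its linkage into precursors."""
    linkage, left, right = target
    handle_left, handle_right = TEMPLATES[linkage]
    return linkage, [(handle_left, left), (handle_right, right)]


def search_blocks(precursor, k=2):
    """Step 2: top-k catalog blocks with the same handle, ranked by similarity."""
    handle, tokens = precursor
    scored = [
        (tanimoto(tokens, block_tokens), name)
        for name, (block_handle, block_tokens) in CATALOG.items()
        if block_handle == handle  # keeps the block compatible with the template
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


def virtual_synthesis(linkage, block_names):
    """Step 3: toy forward reaction; the product keeps all block tokens plus the linkage."""
    tokens = {linkage}
    for name in block_names:
        tokens |= CATALOG[name][1]
    return tokens


def generate_analogs(target, k=2):
    """Run the three steps and rank the analogs by similarity to the target."""
    linkage, precursors = retrosynthesize(target)
    target_tokens = {linkage} | target[1] | target[2]
    choices = [search_blocks(p, k) for p in precursors]
    analogs = []
    for combo in product(*choices):
        score = tanimoto(target_tokens, virtual_synthesis(linkage, combo))
        analogs.append((score, combo))
    analogs.sort(key=lambda pair: (-pair[0], pair[1]))
    return analogs


# Target whose left fragment (with a fluoro group) is not in the catalog.
target = ("amide", {"aromatic_ring", "fluoro"}, {"sat_ring"})
for score, blocks in generate_analogs(target):
    print(f"{score:.2f}", " + ".join(blocks))
```

Because every analog is assembled only from catalog entries and a known template, each one comes with its own synthesis route, which is the sense in which the output is synthesizable by design. A realistic version would replace the token sets with molecular fingerprints and the template table with real reaction templates from a cheminformatics toolkit.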
Outperforming AI and Bridging the Gap
What makes SynTwins particularly remarkable is its ability to outperform state-of-the-art machine learning (ML) models in generating synthetically accessible analogs while maintaining high structural similarity to the original target molecules. Unlike many existing AI-driven approaches that might propose molecules without considering their practical synthesis, SynTwins is ‘synthesizable by design’.
The framework has been rigorously evaluated across diverse molecular datasets, including virtual molecules, ChEMBL molecules, USPTO molecules, and FDA-approved drugs. In these evaluations, SynTwins consistently demonstrated superior performance in both reconstructing original molecules and generating highly similar, synthesizable analogs. For instance, it showed a significantly higher reconstruction rate and better structural similarity scores compared to other leading models like ChemProjector and SynFormer, especially for real-world molecules like those found in USPTO patents and FDA-approved drugs.
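To make the two evaluation metrics concrete, here is a minimal sketch of how a reconstruction rate and an average similarity score could be computed. The definitions are assumptions for illustration (a target counts as reconstructed when its best analog has similarity 1.0), the function name `evaluate` is hypothetical, and the input values are made-up placeholders, not results from the paper.

```python
def evaluate(best_similarities):
    """Summarize per-target results.

    best_similarities: list of floats in [0, 1], one per target molecule,
    giving the similarity of the best generated analog to that target.
    """
    if not best_similarities:
        raise ValueError("need at least one target")
    n = len(best_similarities)
    # Assumed definition: similarity of 1.0 means the target itself was rebuilt.
    reconstructed = sum(1 for s in best_similarities if s >= 1.0 - 1e-9)
    return {
        "reconstruction_rate": reconstructed / n,
        "mean_similarity": sum(best_similarities) / n,
    }


# Placeholder values for illustration only.
summary = evaluate([1.0, 0.5, 1.0, 0.5])
print(summary)
```

Reporting both numbers is useful because a method can score well on average similarity while rarely recovering the exact target, or the reverse.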
Practical Implications for Drug Discovery
The impact of SynTwins extends beyond theoretical design. When SynTwins is integrated with existing molecule optimization frameworks, the resulting hybrid approach produces synthetically feasible molecules with property profiles comparable to those generated by unconstrained molecule generators. This means that researchers can design molecules that not only have the desired biological or material properties but are also synthesizable by design, since each one is assembled from known reactions and commercially available building blocks.
Furthermore, SynTwins offers several key advantages. It does not rely on computationally intensive machine learning models, making it more robust and adaptable to different reaction conditions and building block sets without requiring extensive retraining. This efficiency is a significant benefit, as training some ML models can require thousands of GPU hours, whereas SynTwins operates efficiently through its search strategy.
By providing a practical solution to the synthesis-design gap, SynTwins is set to accelerate the discovery of new, viable molecules for a wide range of applications, from pharmaceuticals to advanced materials. It represents a significant step forward in translating computational molecular designs into tangible laboratory successes. You can read the full research paper here.


