A new deep-learning framework combines protein language models and 3D structure to accelerate vaccine antigen discovery.
Vaccines remain the most powerful tool for preventing infectious diseases, but designing them is far from simple. One of the biggest challenges is identifying the right protective antigens: the specific pathogen proteins that can trigger strong and effective immune responses. Pathogens can produce thousands of proteins, and experimentally testing each one is slow, costly, and often impractical during outbreaks.
Now, a new study introduces an artificial intelligence–based framework that could dramatically speed up this process.
The researchers developed a computational pipeline called PLGDL (Protein Language and Geometric Deep Learning), designed to predict which pathogen proteins are most likely to serve as effective vaccine antigens.
What sets PLGDL apart is its ability to integrate:
- Protein language models, which learn patterns directly from amino acid sequences
- Geometric deep learning, which captures three-dimensional structural features of proteins
By combining sequence and structure, the framework avoids reliance on hand-engineered features, a common source of bias and inconsistency in earlier prediction tools.
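To make the idea of combining the two views concrete, here is a minimal sketch of one generic way to do it: join a sequence-derived vector and a structure-derived vector, then score the result. This is not the PLGDL architecture, which this summary does not describe in detail. The function names, the concatenation step, the logistic scorer, and every number below are assumptions chosen purely for illustration.

```python
import math


def fuse_embeddings(sequence_embedding, structure_embedding):
    """Concatenate two per-protein vectors into one feature vector.

    Hypothetical helper: in a real system the first vector would come from
    a protein language model and the second from a geometric model.
    """
    return list(sequence_embedding) + list(structure_embedding)


def antigen_probability(features, weights, bias=0.0):
    """Logistic score between 0 and 1 from a weighted sum of the features.

    A stand-in for a trained classifier head; the weights are not learned here.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Made-up illustrative values, not real embeddings or trained weights.
seq_emb = [0.2, -0.1, 0.4]      # assumed sequence-derived features
struct_emb = [0.3, 0.0]         # assumed structure-derived features
weights = [1.0, 0.5, -0.5, 2.0, 0.0]

features = fuse_embeddings(seq_emb, struct_emb)
probability = antigen_probability(features, weights)
print(f"Predicted antigen probability: {probability:.3f}")
```

The point of the sketch is only that both information sources feed a single prediction, without anyone choosing individual features by hand.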
Robust Across Pathogens
PLGDL was tested on both newly constructed datasets and publicly available benchmark datasets. Across these evaluations, the model showed strong and consistent performance in predicting protective antigens from:
- Viruses
- Bacteria
- Eukaryotic pathogens
This versatility suggests the framework could be broadly useful across infectious disease research, rather than being limited to a single pathogen class.
To demonstrate real-world utility, the researchers applied PLGDL to the ongoing Mpox outbreak. Impressively, the model:
- Rapidly identified several previously known protective antigens
- Discovered a new candidate antigen, G10R, not previously highlighted as a vaccine target
This ability to both confirm known biology and uncover new candidates highlights the framework’s potential for rapid response during emerging outbreaks, when time is critical.
Traditional antigen discovery relies heavily on experimental screening and expert-driven feature selection, which can be slow and difficult to scale. PLGDL offers a high-performance screening tool that does not depend on hand-picked features and can prioritize the most promising antigens early, narrowing the field before experimental validation.
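As a rough illustration of what narrowing the field can look like in practice, the sketch below ranks candidate proteins by a predicted score and keeps only the top few for laboratory follow-up. The function name, the protein labels, and the scores are all hypothetical placeholders, not output from PLGDL or results from the study.

```python
def prioritize(scores, top_k):
    """Return the top_k (protein, score) pairs, highest score first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical predicted scores for placeholder proteins (not real data).
predicted = {
    "protein_A": 0.91,
    "protein_B": 0.12,
    "protein_C": 0.67,
    "protein_D": 0.45,
}

shortlist = prioritize(predicted, top_k=2)
for name, score in shortlist:
    print(f"{name}: {score:.2f}")
```

In a real pipeline the cutoff would depend on how many candidates a laboratory can afford to validate experimentally.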
By synergistically combining protein language understanding with 3D structural reasoning, the framework represents a methodological advance in how computational immunology supports vaccine design.
As AI tools continue to mature, approaches like PLGDL could become central to next-generation vaccine pipelines, helping researchers move faster from genome sequence to vaccine candidate, especially during epidemics and pandemics.
Researchers have developed an AI framework that uses both protein sequence and structure to predict protective vaccine antigens. The tool performs robustly across viruses, bacteria, and eukaryotic pathogens, and when applied to the Mpox outbreak it recovered known protective antigens and flagged a new candidate, G10R, offering a powerful new approach for rapid vaccine development.
Journal article: Zai, X., et al., 2025. Integrating protein language and geometric deep learning models for enhanced vaccine antigen prediction. Nature Communications.
Summary by Stefan Botha










