22–23 May 2025
HUN-REN Centre
Europe/Budapest timezone

pLMSAV: A Delta-Embedding Approach for Predicting Pathogenic Single Amino Acid Variants

23 May 2025, 09:00
30m

HUN-REN Centre

1054 Budapest Alkotmány utca 29.
Lecture Session V

Speaker

Dr Orsolya Gereben (Institute of Biophysics and Radiation Biology, Semmelweis University)

Description

Predicting whether single amino acid variants (SAVs) in proteins lead to pathogenic outcomes is a critical challenge in molecular biology and precision medicine. Experimentally determining the effects of all possible mutations, or even of only those observed in affected individuals, is infeasible. While existing state-of-the-art tools such as AlphaMissense show promise, their performance remains insufficient for diagnostic applications, they are often challenging to run locally, and most are restricted to human sequences. To address these limitations, we developed pLMSAV, a simple yet effective predictor leveraging protein language models (pLMs). Our method computes delta-embeddings by subtracting the embedding of the mutant sequence from that of the wild-type sequence. These delta-embedding vectors serve as input for a convolutional neural network used for training and prediction. To prevent data leakage, we trained our model on the well-characterized, labeled Eff10k set and evaluated it on a non-homologous subset of ClinVar data. Our results demonstrate that this approach performs exceptionally well on Eff10k test folds and reasonably well on ClinVar test sets. Notably, pLMSAV excels in resolving ambiguous predictions by AlphaMissense, also outperforming REVEL predictions for these cases. Therefore, we will integrate these REVEL-enhanced predictions into our widely used AlphaMissense web application (URL). We anticipate that incorporating delta-embeddings into other mutation effect predictors or mutant structure prediction methods will further enhance their accuracy and utility in diverse biological contexts.
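The delta-embedding step described above can be illustrated with a minimal sketch. It is not the pLMSAV implementation: `toy_embed` is a hypothetical stand-in for a real protein language model, and `apply_sav`, `delta_embedding`, the embedding dimension, and the example sequence are all assumptions made for illustration. Only the subtraction itself (wild-type embedding minus mutant embedding) is taken from the abstract.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_embed(sequence: str, dim: int = 8) -> np.ndarray:
    """Hypothetical stand-in for a pLM; returns an array of shape (len(sequence), dim).

    Each residue gets a fixed pseudo-random vector, which is then averaged
    with its immediate neighbours so that a substitution also shifts the
    embeddings of adjacent positions, loosely mimicking a contextual model.
    """
    rng = np.random.default_rng(0)  # fixed seed: the same residue always maps to the same vector
    table = rng.normal(size=(len(AMINO_ACIDS), dim))
    per_residue = np.stack([table[AMINO_ACIDS.index(aa)] for aa in sequence])
    padded = np.pad(per_residue, ((1, 1), (0, 0)), mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0


def apply_sav(sequence: str, position: int, new_aa: str) -> str:
    """Return the sequence with the residue at a 1-based position replaced."""
    if sequence[position - 1] == new_aa:
        raise ValueError("substitution does not change the sequence")
    return sequence[: position - 1] + new_aa + sequence[position:]


def delta_embedding(wild_type: str, mutant: str, embed=toy_embed) -> np.ndarray:
    """Wild-type embedding minus mutant embedding, position by position."""
    if len(wild_type) != len(mutant):
        raise ValueError("a SAV must preserve sequence length")
    return embed(wild_type) - embed(mutant)


wild_type = "MKTAYIAKQR"                  # made-up example sequence
mutant = apply_sav(wild_type, 5, "W")     # Y5W substitution
delta = delta_embedding(wild_type, mutant)
print(delta.shape)
```

In this toy setting the delta is nonzero only in a small window around the substituted residue, because the stand-in embedder only looks at immediate neighbours. With a real pLM the context is much wider, so the delta would generally be nonzero across the sequence. An array of this kind is what would be passed to the convolutional network; the abstract does not describe that network's architecture, so it is not sketched here.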

Primary authors

Dr Orsolya Gereben (Institute of Biophysics and Radiation Biology, Semmelweis University)
Dr Hedvig Tordai (Institute of Biophysics and Radiation Biology, Semmelweis University)
Prof. Tamás Hegedűs (Institute of Biophysics and Radiation Biology, Semmelweis University)
