New publication in ChemPhysChem

briza

3 weeks ago

Sequeiros-Borja C, Škoda P, Brezovsky J, 2026: Toward Automatic Derivation of Geometry-Based Descriptors as Surrogates for Complex Computational Approaches in Enzyme–Substrate Prediction. ChemPhysChem 27: e202500883. full text dataset

Accurate prediction of enzyme–substrate (ES) interactions remains a fundamental challenge in biocatalysis and drug discovery. While machine learning (ML) approaches have shown promise, they require extensive training data and often lack mechanistic interpretability. Here, we present a novel methodology that automatically derives geometry-based descriptors from ES complex structures to predict substrate specificity. Our approach simplifies complex catalytic mechanisms into interpretable geometric filters comprising critical inter-atomic distances and accessibility parameters for atomic pairs. We validated this methodology using two mechanistically distinct enzyme families with minimal training data: haloalkane dehalogenases (HLDs; 9 enzymes and 53 substrates) and aldehyde reductases (AldRs; 9 enzymes and 36 substrates). The filters demonstrated robust performance across chemically diverse substrates. On testing datasets, the derived filters achieved an average accuracy of 77% and a sensitivity of 94% for HLDs, and an average recall of 57% for true substrates of AldRs, exceeding state-of-the-art ML methods for substrate prediction on these datasets. Crucially, the geometric descriptors directly correspond to catalytic requirements, providing mechanistic insights into substrate recognition. This interpretable, mechanism-based approach requires minimal training data and can be readily applied to newly characterized enzymes, offering a powerful tool for enzyme engineering and substrate screening applications.
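To give a feel for what a distance-based geometric filter and its evaluation might look like, here is a minimal Python sketch. It is not the implementation from the paper: the class `DistanceCriterion`, the functions `passes_filter` and `accuracy_and_sensitivity`, the atom labels, and the 3.5 Å cutoff are all hypothetical names and values chosen for illustration, and the accessibility parameters mentioned in the abstract are left out entirely.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class DistanceCriterion:
    """One inter-atomic distance requirement (hypothetical structure)."""
    atom_a: str
    atom_b: str
    max_distance: float  # in angstroms; illustrative cutoff, not from the paper


def passes_filter(coords, criteria):
    """Return True if every atom pair lies within its distance cutoff.

    coords maps an atom label to its (x, y, z) position in the ES complex.
    """
    return all(
        dist(coords[c.atom_a], coords[c.atom_b]) <= c.max_distance
        for c in criteria
    )


def accuracy_and_sensitivity(predicted, actual):
    """Compute accuracy and sensitivity (recall of true substrates)."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    true_neg = sum(1 for p, a in pairs if not p and not a)
    positives = sum(1 for a in actual if a)
    accuracy = (true_pos + true_neg) / len(pairs)
    sensitivity = true_pos / positives if positives else float("nan")
    return accuracy, sensitivity


if __name__ == "__main__":
    # Invented example: a single nucleophile-to-substrate-carbon distance check.
    criteria = [DistanceCriterion("nucleophile_O", "substrate_C", 3.5)]
    near = {"nucleophile_O": (0.0, 0.0, 0.0), "substrate_C": (3.0, 0.0, 0.0)}
    far = {"nucleophile_O": (0.0, 0.0, 0.0), "substrate_C": (5.0, 0.0, 0.0)}
    print(passes_filter(near, criteria), passes_filter(far, criteria))
```

In this toy setup a complex is predicted to be a productive ES pair only when all criteria hold at once, which mirrors the idea that each geometric descriptor stands for a catalytic requirement. Sensitivity here is the same quantity as the recall of true substrates reported for AldRs.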