https://arxiv.org/api/LnDvoA0bJRLM27fys8mWHJsBi0A 2026-03-22T20:14:48Z 6642 255 15 http://arxiv.org/abs/2510.25132v1 EnzyControl: Adding Functional and Substrate-Specific Control for Enzyme Backbone Generation 2025-10-29T03:22:32Z Designing enzyme backbones with substrate-specific functionality is a critical challenge in computational protein engineering. Current generative models excel in protein design but face limitations in binding data, substrate-specific control, and flexibility for de novo enzyme backbone generation. To address this, we introduce EnzyBind, a dataset with 11,100 experimentally validated enzyme-substrate pairs specifically curated from PDBbind. Building on this, we propose EnzyControl, a method that enables functional and substrate-specific control in enzyme backbone generation. Our approach generates enzyme backbones conditioned on MSA-annotated catalytic sites and their corresponding substrates, which are automatically extracted from curated enzyme-substrate data. At the core of EnzyControl is EnzyAdapter, a lightweight, modular component integrated into a pretrained motif-scaffolding model, allowing it to become substrate-aware. A two-stage training paradigm further refines the model's ability to generate accurate and functional enzyme structures. Experiments show that EnzyControl achieves the best performance across structural and functional metrics on EnzyBind and EnzyBench benchmarks, with particularly notable improvements of 13% in designability and 13% in catalytic efficiency compared to the baseline models. The code is released at https://github.com/Vecteur-libre/EnzyControl. 
2025-10-29T03:22:32Z Chao Song Zhiyuan Liu Han Huang Liang Wang Qiong Wang Jianyu Shi Hui Yu Yihang Zhou Yang Zhang http://arxiv.org/abs/2510.24649v1 Local Electromagnetic Fields Enable Fast Redox Sensing by Physically Accelerating Cysteine Oxidation 2025-10-28T17:16:33Z Hydrogen peroxide oxidises cysteine residues to control protein function, yet bulk rate constants predict hours for changes that occur in cells in seconds. This work shows that local electromagnetic fields (EMFs), ubiquitous in proteins, membranes and nanodomains, can lawfully modulate the Eyring barrier and orientate reactants, accelerating cysteine oxidation without changing the underlying chemistry. Embedding a field term into the Eyring expression demonstrates that plausible local EMFs with realistic dipole changes accelerate rate constants by orders of magnitude. This local acceleration reconciles the discrepancy between predicted and observed rates of H2O2-mediated cysteine oxidation. The framework generates falsifiable predictions, for example that vibrational Stark readouts in thiolate-peroxide complexes should fall within predicted ranges, and reframes rate constants as mutable, field-conditioned parameters. Cysteine redox sensing is fast not because the chemistry is exotic, but because the physics is local. 2025-10-28T17:16:33Z 11 pages, 2 figures James N. Cobley http://arxiv.org/abs/2508.07328v2 Cation-DNA outer sphere coordination in DNA polymorphism 2025-10-28T15:28:48Z There are two approaches to describing DNA-ion interactions. The physical approach is an analysis of electrostatic interactions between ions and charges on the DNA molecule. The coordination chemistry approach is a search for modes of direct binding of ions to ionophores of DNA. We study both the inner and outer sphere coordination of ions by ionophores of the A and C forms of DNA in molecular dynamics simulations in two low-polarity solvents: ethanol-water and methanol-water mixtures. 
We show that the counterion-DNA outer sphere coordination plays a key role in the experimentally observed conformational polymorphism of the DNA molecule: a transition to the A form in ethanol and to the C form in methanol. We identify the ionophores responsible for the existence of the A- and C-complexes. In both complexes, the ions' inner sphere ligands are mostly water molecules: the ions reside in water clusters. In the ethanol-water mixture, the water clusters are large, the major groove of the A-DNA is filled with water, and all ionophores are accessible to ions. In the methanol-water mixture, the water clusters are small, and a large number of methanol clusters are present near the DNA surface. They interfere with the coordination of ions in one of the ionophores of the major groove, and also with other ionophores near phosphates. Therefore, in methanol, the interaction energy of counterions with A-DNA cannot compensate for the repulsion between closely located phosphates. Consequently, the ions fill the more accessible ionophores of the C-complex, converting DNA into the C form. 2025-08-10T12:58:51Z 22 pages, 10 figures Elena A. Zubova Ivan A. Strelnikov http://arxiv.org/abs/2510.22304v2 ODesign: A World Model for Biomolecular Interaction Design 2025-10-28T09:13:37Z Biomolecular interactions underpin almost all biological processes, and their rational design is central to programming new biological functions. Generative AI models have emerged as powerful tools for molecular design, yet most remain specialized for individual molecular types and lack fine-grained control over interaction details. Here we present ODesign, an all-atom generative world model for all-to-all biomolecular interaction design. ODesign allows scientists to specify epitopes on arbitrary targets and generate diverse classes of binding partners with fine-grained control. 
Across entity-, token-, and atom-level benchmarks in the protein modality, ODesign demonstrates superior controllability and performance to modality-specific baselines. Extending beyond proteins, it generalizes to nucleic acid and small-molecule design, enabling interaction types such as protein-binding RNA/DNA and RNA/DNA-binding ligands that were previously inaccessible. By unifying multimodal biomolecular interactions within a single generative framework, ODesign moves toward a general-purpose molecular world model capable of programmable design. ODesign is available at https://odesign.lglab.ac.cn 2025-10-25T14:16:17Z Odin Zhang Xujun Zhang Haitao Lin Cheng Tan Qinghan Wang Yuanle Mo Qiantai Feng Gang Du Yuntao Yu Zichang Jin Ziyi You Peicong Lin Yijie Zhang Yuyang Tao Shicheng Chen Jack Xiaoyu Chen Chenqing Hua Weibo Zhao Runze Ma Yunpeng Xia Kejun Ying Jun Li Yundian Zeng Lijun Lang Peichen Pan Hanqun Cao Zihao Song Bo Qiang Jiaqi Wang Pengfei Ji Lei Bai Jian Zhang Chang-yu Hsieh Pheng Ann Heng Siqi Sun Tingjun Hou Shuangjia Zheng http://arxiv.org/abs/2510.23379v1 Symbolic Neural Generation with Applications to Lead Discovery in Drug Design 2025-10-27T14:29:22Z We investigate a relatively underexplored class of hybrid neurosymbolic models integrating symbolic learning with neural reasoning to construct data generators meeting formal correctness criteria. In \textit{Symbolic Neural Generators} (SNGs), symbolic learners examine logical specifications of feasible data from a small set of instances -- sometimes just one. Each specification in turn constrains the conditional information supplied to a neural-based generator, which rejects any instance violating the symbolic specification. Like other neurosymbolic approaches, SNG exploits the complementary strengths of symbolic and neural methods. 
The outcome of an SNG is a triple $(H, X, W)$, where $H$ is a symbolic description of feasible instances constructed from data, $X$ a set of generated new instances that satisfy the description, and $W$ an associated weight. We introduce a semantics for such systems, based on the construction of appropriate \textit{base} and \textit{fibre} partially-ordered sets combined into an overall partial order, and outline a probabilistic extension relevant to practical applications. In this extension, SNGs result from searching over a weighted partial ordering. We implement an SNG combining a restricted form of Inductive Logic Programming (ILP) with a large language model (LLM) and evaluate it on early-stage drug design. Our main interest is the description and the set of potential inhibitor molecules generated by the SNG. On benchmark problems -- where drug targets are well understood -- SNG performance is statistically comparable to state-of-the-art methods. On exploratory problems with poorly understood targets, generated molecules exhibit binding affinities on par with leading clinical candidates. Experts further find the symbolic specifications useful as preliminary filters, with several generated molecules identified as viable for synthesis and wet-lab testing. 2025-10-27T14:29:22Z 37 pages, 15 figures; partial overlap of experimental results with https://doi.org/10.1101/2025.02.14.634875 Ashwin Srinivasan A Baskar Tirtharaj Dash Michael Bain Sanjay Kumar Dey Mainak Banerjee http://arxiv.org/abs/2506.05596v2 Zero-shot protein stability prediction by inverse folding models: a free energy interpretation 2025-10-27T11:02:49Z Inverse folding models have proven to be highly effective zero-shot predictors of protein stability. Despite this success, the link between the amino acid preferences of an inverse folding model and the free-energy considerations underlying thermodynamic stability remains incompletely understood. 
A better understanding would not only be of theoretical interest but could also provide the basis for stronger zero-shot stability prediction. In this paper, we take steps to clarify the free-energy foundations of inverse folding models. Our derivation reveals the standard practice of likelihood ratios as a simplistic approximation and suggests several paths towards better estimates of the relative stability. We empirically assess these approaches and demonstrate that considerable gains in zero-shot performance can be achieved with fairly simple means. 2025-06-05T21:15:13Z Jes Frellsen Maher M. Kassem Tone Bengtsen Lars Olsen Kresten Lindorff-Larsen Jesper Ferkinghoff-Borg Wouter Boomsma http://arxiv.org/abs/2510.21161v2 RiboPO: Preference Optimization for Structure- and Stability-Aware RNA Design 2025-10-27T01:54:44Z Designing RNA sequences that reliably adopt specified three-dimensional structures while maintaining thermodynamic stability remains challenging for synthetic biology and therapeutics. Current inverse folding approaches optimize for sequence recovery or single structural metrics, failing to simultaneously ensure global geometry, local accuracy, and ensemble stability, three interdependent requirements for functional RNA design. This gap becomes critical when designed sequences encounter dynamic biological environments. We introduce RiboPO, a Ribonucleic acid Preference Optimization framework that addresses this multi-objective challenge through reinforcement learning from physical feedback (RLPF). RiboPO fine-tunes gRNAde by constructing preference pairs from composite physical criteria that couple global 3D fidelity and thermodynamic stability. Preferences are formed using structural gates, pLDDT geometry assessments, and thermostability proxies with variability-aware margins, and the policy is updated with Direct Preference Optimization (DPO). 
On RNA inverse folding benchmarks, RiboPO demonstrates a superior balance of structural accuracy and stability. Compared to the best non-overlap baselines, our multi-round model improves Minimum Free Energy (MFE) by 12.3% and increases secondary-structure self-consistency (EternaFold scMCC) by 20%, while maintaining competitive 3D quality and high sequence diversity. In sampling efficiency, RiboPO achieves 11% higher pass@64 than the gRNAde base under the conjunction of multiple requirements. A multi-round variant with preference-pair reconstruction delivers additional gains on unseen RNA structures. These results establish RLPF as an effective paradigm for structure-accurate and ensemble-robust RNA design, providing a foundation for extending to complex biological objectives. 2025-10-24T05:25:54Z 9 pages, 2 figures. Equal contribution: Minghao Sun, Hanqun Cao, Zhou Zhang. Corresponding author: Fang Wu, Yang Zhang Minghao Sun Hanqun Cao Zhou Zhang Chen Wei Liang Wang Tianrui Jia Zhiyuan Liu Tianfan Fu Xiangru Tang Yejin Choi Pheng-Ann Heng Fang Wu Yang Zhang http://arxiv.org/abs/2511.05513v1 Molecular Dynamics Simulations of Membrane Selectivity of Star Peptides Across Different Bacterial and Mammalian Bilipids 2025-10-25T09:05:07Z Structurally nanoengineered antimicrobial peptide polymers (SNAPPs) are emerging as promising selective agents against bacterial membranes. In this study, we used all atom molecular dynamics simulation techniques to investigate the interaction of a promising cationic SNAPP architecture (Alt-SNAPP with 8 arms made of alternating lysine and valine residues) with modelled Gram-negative, Gram-positive, mammalian, and red blood cell membranes. Alt-SNAPP exhibited rapid and stable binding to bacterial membranes, driven by electrostatic interactions with anionic lipids such as phosphatidylglycerol (PG) and cardiolipin (CL), and supported by membrane fluidity. 
In contrast, mammalian and red blood cell membranes, enriched in zwitterionic lipids and cholesterol, resisted peptide association entirely. Analyses of center of mass distance, partial density, hydrogen bonding, and interaction energy confirmed that SNAPP remains fully excluded from host-like membranes while forming stable, multivalent interactions with bacterial bilayers. These findings provide mechanistic insight into membrane selectivity of SNAPP and offer a molecular framework for designing next-generation antimicrobial polymers with minimal off-target toxicity. 2025-10-25T09:05:07Z Amal Jayawardena Andrew Hung Greg Qiao Neil O'Brien-Simpson Elnaz Hajizadeh http://arxiv.org/abs/2510.22179v1 Beyond Autophagy: VPS39 Deficiency Triggers Migrasome-Driven Stress Adaptation Revealed by Super-Resolution Imaging 2025-10-25T06:18:48Z Autophagy and migrasome formation constitute critical cellular mechanisms for maintaining cellular homeostasis; however, their potential compensatory interplay remains poorly understood. In this study, we identify VPS39, a core component of the HOPS complex, as a molecular switch coordinating these processes. Genetic ablation of VPS39 not only impairs autophagic flux but also triggers cell migration through upregulation of RhoA/Rac1 GTPases, consequently facilitating migrasome formation. Using super-resolution microscopy, we further demonstrate that migrasomes serve as an alternative disposal route for damaged mitochondria during VPS39-induced autophagy impairment, revealing a novel stress adaptation mechanism. Our work establishes a previously unrecognized autophagy-migrasome axis and provides direct visual evidence of organelle quality control via migrasomal extrusion. These findings position VPS39-regulated pathway switching as a potential therapeutic strategy for neurodegenerative diseases characterized by autophagy dysfunction. 
2025-10-25T06:18:48Z Xuelei Pang Weiyun Sun Ning Jing Wenwen Gong Cuifang Kuang Xu Liu Hanbing Li Yu-Hui Zhang Yubing Han http://arxiv.org/abs/2511.05510v1 TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles 2025-10-24T13:11:47Z Understanding the dynamic behavior of proteins is critical to elucidating their functional mechanisms, yet generating realistic, temporally coherent trajectories of protein ensembles remains a significant challenge. In this work, we introduce a novel hierarchical autoregressive framework for modeling protein dynamics that leverages the intrinsic multi-scale organization of molecular motions. Unlike existing methods that focus on generating static conformational ensembles or treat dynamic sampling as an independent process, our approach characterizes protein dynamics as a Markovian process. The framework employs a two-scale architecture: a low-resolution model captures slow, collective motions driving major conformational transitions, while a high-resolution model generates detailed local fluctuations conditioned on these large-scale movements. This hierarchical design ensures that the causal dependencies inherent in protein dynamics are preserved, enabling the generation of temporally coherent and physically realistic trajectories. By bridging high-level biophysical principles with state-of-the-art generative modeling, our approach provides an efficient framework for simulating protein dynamics that balances computational efficiency with physical accuracy. 2025-10-24T13:11:47Z Yaoyao Xu Di Wang Zihan Zhou Tianshu Yu Mingchen Chen http://arxiv.org/abs/2510.20788v1 Predicting Protein-Nucleic Acid Flexibility Using Persistent Sheaf Laplacians 2025-10-23T17:53:33Z Understanding the flexibility of protein-nucleic acid complexes, often characterized by atomic B-factors, is essential for elucidating their structure, dynamics, and functions, such as reactivity and allosteric pathways. 
Traditional models such as Gaussian Network Models (GNM) and Elastic Network Models (ENM) often fall short in capturing multiscale interactions, especially in large or complex biomolecular systems. In this work, we apply the Persistent Sheaf Laplacian (PSL) framework for the B-factor prediction of protein-nucleic acid complexes. The PSL model integrates multiscale analysis, algebraic topology, combinatoric Laplacians, and sheaf theory for data representation. It reveals topological invariants in its harmonic spectra and captures the homotopic shape evolution of data with its non-harmonic spectra. Its localization enables accurate B-factor predictions. We benchmark our method on three diverse datasets, including protein-RNA and nucleic-acid-only structures, and demonstrate that PSL consistently outperforms existing models such as GNM and multiscale FRI (mFRI), achieving up to a 21% improvement in Pearson correlation coefficient for B-factor prediction. These results highlight the robustness and adaptability of PSL in modeling complex biomolecular interactions and suggest its potential utility in broader applications such as mutation impact analysis and drug design. 2025-10-23T17:53:33Z Nicole Hayes Ekaterina Merkurjev Guo-Wei Wei http://arxiv.org/abs/2410.14621v3 JAMUN: Bridging Smoothed Molecular Dynamics and Score-Based Learning for Conformational Ensembles 2025-10-23T05:16:59Z Conformational ensembles of protein structures are immensely important both for understanding protein function and drug discovery in novel modalities such as cryptic pockets. Current techniques for sampling ensembles such as molecular dynamics (MD) are computationally inefficient, while many recent machine learning methods do not transfer to systems outside their training data. We propose JAMUN which performs MD in a smoothed, noised space of all-atom 3D conformations of molecules by utilizing the framework of walk-jump sampling. 
JAMUN enables ensemble generation for small peptides at rates an order of magnitude faster than traditional molecular dynamics. The physical priors in JAMUN enable transferability to systems outside of its training data, even to peptides that are longer than those it was originally trained on. Our model, code and weights are available at https://github.com/prescient-design/jamun. 2024-10-18T17:21:25Z 37 pages, accepted at NeurIPS 2025 Ameya Daigavane Bodhi P. Vani Darcy Davidson Saeed Saremi Joshua Rackers Joseph Kleinhenz http://arxiv.org/abs/2510.19947v1 Modelling multiscale architecture of biofilm extracellular matrix and its role in oxygen transport 2025-10-22T18:18:33Z The extracellular matrix of biofilms presents a dense and intricate architecture. Numerous biophysical properties of the matrix surrounding microbial cells contribute to the heterogeneity of biofilms and their functions at the microscale. Previous mathematical models assume the matrix to be homogeneous, often overlooking the need for a detailed mechanistic understanding of the extracellular space. In this theoretical study, we introduce a novel cell-capsule approach to investigate geometric patterns in biofilm morphology and predict their role in oxygen transport. The thickness of the capsule and the arrangement of cell-capsule patterns can influence matrix heterogeneity, providing a clear picture of biofilm structure. By incorporating the bacterial capsule as a distinct, low-diffusivity phase, our novel cell-capsule model reveals that this architecture acts as a significant 'resistance-in-series' barrier. We found that a thick capsule/dense matrix arrangement can reduce local oxygen transfer by approximately 70%, a substantial drop that may drive further research into oxygen limitations during early-stage biofilm development. 2025-10-22T18:18:33Z 5 figures, 3 tables Raghu K. 
Moorthy Eoin Casey http://arxiv.org/abs/2507.05101v2 PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs 2025-10-22T15:38:08Z Deep learning-based computational methods have achieved promising results in predicting protein-protein interactions (PPIs). However, existing benchmarks predominantly focus on isolated pairwise evaluations, overlooking a model's capability to reconstruct biologically meaningful PPI networks, which is crucial for biology research. To address this gap, we introduce PRING, the first comprehensive benchmark that evaluates protein-protein interaction prediction from a graph-level perspective. PRING curates a high-quality, multi-species PPI network dataset comprising 21,484 proteins and 186,818 interactions, with well-designed strategies to address both data redundancy and leakage. Building on this gold-standard dataset, we establish two complementary evaluation paradigms: (1) topology-oriented tasks, which assess intra- and cross-species PPI network construction, and (2) function-oriented tasks, including protein complex pathway prediction, GO module analysis, and essential protein justification. These evaluations not only reflect the model's capability to understand the network topology but also facilitate protein function annotation, biological module detection, and even disease mechanism analysis. Extensive experiments on four representative model categories, consisting of sequence similarity-based, naive sequence-based, protein language model-based, and structure-based approaches, demonstrate that current PPI models have potential limitations in recovering both structural and functional properties of PPI networks, highlighting the gap in supporting real-world biological applications. We believe PRING provides a reliable platform to guide the development of more effective PPI prediction models for the community. The dataset and source code of PRING are available at https://github.com/SophieSarceau/PRING. 
2025-07-07T15:21:05Z Xinzhe Zheng Hao Du Fanding Xu Jinzhe Li Zhiyuan Liu Wenkang Wang Tao Chen Wanli Ouyang Stan Z. Li Yan Lu Nanqing Dong Yang Zhang http://arxiv.org/abs/2309.16519v4 AtomSurf: Surface Representation for Learning on Protein Structures 2025-10-22T14:23:58Z While there has been significant progress in evaluating and comparing different representations for learning on protein data, the role of surface-based learning approaches remains poorly understood. In particular, there is a lack of direct and fair benchmark comparison between the best available surface-based learning methods and alternative representations such as graphs. Moreover, the few existing surface-based approaches either use surface information in isolation or, at best, perform global pooling between surface and graph-based architectures. In this work, we fill this gap by first adapting a state-of-the-art surface encoder for protein learning tasks. We then perform a direct and fair comparison of the resulting method against alternative approaches within the Atom3D benchmark, highlighting the limitations of pure surface-based learning. Finally, we propose an integrated approach, which allows learned feature sharing between graphs and surface representations on the level of nodes and vertices across all layers. We demonstrate that the resulting architecture achieves state-of-the-art results on all tasks in the Atom3D benchmark, while adhering to the strict benchmark protocol, as well as more broadly on binding site identification and binding pocket classification. Furthermore, we use coarsened surfaces and optimize our approach for efficiency, making our tool competitive in training and inference time with existing techniques. Code can be found online: https://github.com/Vincentx15/atomsurf 2023-09-28T15:25:17Z Published as a conference paper at The Thirteenth International Conference on Learning Representations (ICLR 2025). 
The official open-access version is available at https://openreview.net/forum?id=ARQIJXFcTH Vincent Mallet Souhaib Attaiki Yangyang Miao Bruno Correia Maks Ovsjanikov