http://arxiv.org/abs/2510.25132v1
EnzyControl: Adding Functional and Substrate-Specific Control for Enzyme Backbone Generation
Updated: 2025-10-29T03:22:32Z

Designing enzyme backbones with substrate-specific functionality is a critical challenge in computational protein engineering. Current generative models excel in protein design but face limitations in binding data, substrate-specific control, and flexibility for de novo enzyme backbone generation. To address this, we introduce EnzyBind, a dataset with 11,100 experimentally validated enzyme-substrate pairs specifically curated from PDBbind. Building on this, we propose EnzyControl, a method that enables functional and substrate-specific control in enzyme backbone generation. Our approach generates enzyme backbones conditioned on MSA-annotated catalytic sites and their corresponding substrates, which are automatically extracted from curated enzyme-substrate data. At the core of EnzyControl is EnzyAdapter, a lightweight, modular component integrated into a pretrained motif-scaffolding model, allowing it to become substrate-aware. A two-stage training paradigm further refines the model's ability to generate accurate and functional enzyme structures. Experiments show that our EnzyControl achieves the best performance across structural and functional metrics on EnzyBind and EnzyBench benchmarks, with particularly notable improvements of 13\% in designability and 13\% in catalytic efficiency compared to the baseline models.
The code is released at https://github.com/Vecteur-libre/EnzyControl.
Published: 2025-10-29T03:22:32Z
Authors: Chao Song, Zhiyuan Liu, Han Huang, Liang Wang, Qiong Wang, Jianyu Shi, Hui Yu, Yihang Zhou, Yang Zhang

http://arxiv.org/abs/2510.24649v1
Local Electromagnetic Fields Enable Fast Redox Sensing by Physically Accelerating Cysteine Oxidation
Updated: 2025-10-28T17:16:33Z

Hydrogen peroxide oxidises cysteine residues to control protein function, yet bulk rate constants predict hours for changes that occur in cells in seconds. This work shows that local electromagnetic fields (EMFs), ubiquitous in proteins, membranes and nanodomains, can lawfully modulate the Eyring barrier and orientate reactants, accelerating cysteine oxidation without changing the underlying chemistry. Embedding a field term into the Eyring expression demonstrates that plausible local EMFs with realistic dipole changes accelerate rate constants by orders of magnitude. This local acceleration reconciles the discrepancy between predicted and observed rates of H2O2-mediated cysteine oxidation. The framework generates falsifiable predictions, for example that vibrational Stark readouts in thiolate peroxide complexes should fall within predicted ranges, and reframes rate constants as mutable, field-conditioned parameters. Cysteine redox sensing is fast not because the chemistry is exotic, but because the physics is local.

Published: 2025-10-28T17:16:33Z
Comment: 11 pages, 2 figures
Authors: James N. Cobley

http://arxiv.org/abs/2508.07328v2
Cation-DNA outer sphere coordination in DNA polymorphism
Updated: 2025-10-28T15:28:48Z

There are two approaches to describing DNA-ion interactions. The physical approach is an analysis of electrostatic interactions between ions and charges on the DNA molecule. The coordination chemistry approach is a search for modes of direct binding of ions to ionophores of DNA.
We study both the inner and outer sphere coordination of ions by ionophores of the A and C forms of DNA in molecular dynamics simulations in two low-polarity solvents: ethanol-water and methanol-water mixtures. We show that the counterion-DNA outer sphere coordination plays a key role in the experimentally observed conformational polymorphism of the DNA molecule: a transition to the A form in ethanol and to the C form in methanol. We identify the ionophores responsible for the existence of the A- and C-complexes. In both complexes, the ions' inner sphere ligands are mostly water molecules: the ions reside in water clusters. In the ethanol-water mixture, the water clusters are large, the major groove of the A-DNA is filled with water, and all ionophores are accessible to ions. In the methanol-water mixture, the water clusters are small, and a large number of methanol clusters are present near the DNA surface. They interfere with the coordination of ions in one of the ionophores of the major groove, and also with other ionophores near phosphates. Therefore, in methanol, the interaction energy of counterions with A-DNA cannot compensate for the repulsion between closely located phosphates. As a result, the ions fill more accessible ionophores of the C-complex, converting DNA into the C form.

Published: 2025-08-10T12:58:51Z
Comment: 22 pages, 10 figures
Authors: Elena A. Zubova, Ivan A. Strelnikov

http://arxiv.org/abs/2510.22304v2
ODesign: A World Model for Biomolecular Interaction Design
Updated: 2025-10-28T09:13:37Z

Biomolecular interactions underpin almost all biological processes, and their rational design is central to programming new biological functions. Generative AI models have emerged as powerful tools for molecular design, yet most remain specialized for individual molecular types and lack fine-grained control over interaction details. Here we present ODesign, an all-atom generative world model for all-to-all biomolecular interaction design.
ODesign allows scientists to specify epitopes on arbitrary targets and generate diverse classes of binding partners with fine-grained control. Across entity-, token-, and atom-level benchmarks in the protein modality, ODesign demonstrates superior controllability and performance to modality-specific baselines. Extending beyond proteins, it generalizes to nucleic acid and small-molecule design, enabling interaction types such as protein-binding RNA/DNA and RNA/DNA-binding ligands that were previously inaccessible. By unifying multimodal biomolecular interactions within a single generative framework, ODesign moves toward a general-purpose molecular world model capable of programmable design. ODesign is available at https://odesign.lglab.ac.cn.

Published: 2025-10-25T14:16:17Z
Authors: Odin Zhang, Xujun Zhang, Haitao Lin, Cheng Tan, Qinghan Wang, Yuanle Mo, Qiantai Feng, Gang Du, Yuntao Yu, Zichang Jin, Ziyi You, Peicong Lin, Yijie Zhang, Yuyang Tao, Shicheng Chen, Jack Xiaoyu Chen, Chenqing Hua, Weibo Zhao, Runze Ma, Yunpeng Xia, Kejun Ying, Jun Li, Yundian Zeng, Lijun Lang, Peichen Pan, Hanqun Cao, Zihao Song, Bo Qiang, Jiaqi Wang, Pengfei Ji, Lei Bai, Jian Zhang, Chang-yu Hsieh, Pheng Ann Heng, Siqi Sun, Tingjun Hou, Shuangjia Zheng

http://arxiv.org/abs/2510.23379v1
Symbolic Neural Generation with Applications to Lead Discovery in Drug Design
Updated: 2025-10-27T14:29:22Z

We investigate a relatively underexplored class of hybrid neurosymbolic models integrating symbolic learning with neural reasoning to construct data generators meeting formal correctness criteria. In \textit{Symbolic Neural Generators} (SNGs), symbolic learners examine logical specifications of feasible data from a small set of instances -- sometimes just one. Each specification in turn constrains the conditional information supplied to a neural-based generator, which rejects any instance violating the symbolic specification. Like other neurosymbolic approaches, SNG exploits the complementary strengths of symbolic and neural methods.
The outcome of an SNG is a triple $(H, X, W)$, where $H$ is a symbolic description of feasible instances constructed from data, $X$ a set of generated new instances that satisfy the description, and $W$ an associated weight. We introduce a semantics for such systems, based on the construction of appropriate \textit{base} and \textit{fibre} partially-ordered sets combined into an overall partial order, and outline a probabilistic extension relevant to practical applications. In this extension, SNGs result from searching over a weighted partial ordering. We implement an SNG combining a restricted form of Inductive Logic Programming (ILP) with a large language model (LLM) and evaluate it on early-stage drug design. Our main interest is the description and the set of potential inhibitor molecules generated by the SNG. On benchmark problems -- where drug targets are well understood -- SNG performance is statistically comparable to state-of-the-art methods. On exploratory problems with poorly understood targets, generated molecules exhibit binding affinities on par with leading clinical candidates. Experts further find the symbolic specifications useful as preliminary filters, with several generated molecules identified as viable for synthesis and wet-lab testing.

Published: 2025-10-27T14:29:22Z
Comment: 37 pages, 15 figures; partial overlap of experimental results with https://doi.org/10.1101/2025.02.14.634875
Authors: Ashwin Srinivasan, A Baskar, Tirtharaj Dash, Michael Bain, Sanjay Kumar Dey, Mainak Banerjee

http://arxiv.org/abs/2506.05596v2
Zero-shot protein stability prediction by inverse folding models: a free energy interpretation
Updated: 2025-10-27T11:02:49Z

Inverse folding models have proven to be highly effective zero-shot predictors of protein stability. Despite this success, the link between the amino acid preferences of an inverse folding model and the free-energy considerations underlying thermodynamic stability remains incompletely understood.
A better understanding would not only be of theoretical interest but could also provide the basis for stronger zero-shot stability prediction. In this paper, we take steps to clarify the free-energy foundations of inverse folding models. Our derivation reveals the standard practice of likelihood ratios as a simplistic approximation and suggests several paths towards better estimates of the relative stability. We empirically assess these approaches and demonstrate that considerable gains in zero-shot performance can be achieved with fairly simple means.

Published: 2025-06-05T21:15:13Z
Authors: Jes Frellsen, Maher M. Kassem, Tone Bengtsen, Lars Olsen, Kresten Lindorff-Larsen, Jesper Ferkinghoff-Borg, Wouter Boomsma

http://arxiv.org/abs/2510.21161v2
RiboPO: Preference Optimization for Structure- and Stability-Aware RNA Design
Updated: 2025-10-27T01:54:44Z

Designing RNA sequences that reliably adopt specified three-dimensional structures while maintaining thermodynamic stability remains challenging for synthetic biology and therapeutics. Current inverse folding approaches optimize for sequence recovery or single structural metrics, failing to simultaneously ensure global geometry, local accuracy, and ensemble stability, three interdependent requirements for functional RNA design. This gap becomes critical when designed sequences encounter dynamic biological environments. We introduce RiboPO, a Ribonucleic acid Preference Optimization framework that addresses this multi-objective challenge through reinforcement learning from physical feedback (RLPF). RiboPO fine-tunes gRNAde by constructing preference pairs from composite physical criteria that couple global 3D fidelity and thermodynamic stability. Preferences are formed using structural gates, pLDDT geometry assessments, and thermostability proxies with variability-aware margins, and the policy is updated with Direct Preference Optimization (DPO).
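The DPO update named in the RiboPO abstract above can be illustrated with the standard DPO objective for a single preference pair. This is a minimal sketch and not the RiboPO implementation: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for illustration, and in practice the log-probabilities would come from the fine-tuned policy and a frozen reference model.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair of sequences.

    All arguments are sequence log-probabilities; the names and the
    default beta are hypothetical, chosen only for this sketch.
    """
    # Implicit rewards: log-ratio of policy to reference for each sequence.
    chosen_reward = policy_logp_chosen - ref_logp_chosen
    rejected_reward = policy_logp_rejected - ref_logp_rejected
    margin = beta * (chosen_reward - rejected_reward)
    # Loss is -log(sigmoid(margin)), written in a numerically stable form.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss equals log 2 when the policy and the reference rank the pair identically, and it decreases as the policy raises the preferred sequence relative to the rejected one.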
On RNA inverse folding benchmarks, RiboPO demonstrates a superior balance of structural accuracy and stability. Compared to the best non-overlap baselines, our multi-round model improves Minimum Free Energy (MFE) by 12.3% and increases secondary-structure self-consistency (EternaFold scMCC) by 20%, while maintaining competitive 3D quality and high sequence diversity. In sampling efficiency, RiboPO achieves 11% higher pass@64 than the gRNAde base under the conjunction of multiple requirements. A multi-round variant with preference-pair reconstruction delivers additional gains on unseen RNA structures. These results establish RLPF as an effective paradigm for structure-accurate and ensemble-robust RNA design, providing a foundation for extending to complex biological objectives.

Published: 2025-10-24T05:25:54Z
Comment: 9 pages, 2 figures. Equal contribution: Minghao Sun, Hanqun Cao, Zhou Zhang. Corresponding author: Fang Wu, Yang Zhang
Authors: Minghao Sun, Hanqun Cao, Zhou Zhang, Chen Wei, Liang Wang, Tianrui Jia, Zhiyuan Liu, Tianfan Fu, Xiangru Tang, Yejin Choi, Pheng-Ann Heng, Fang Wu, Yang Zhang

http://arxiv.org/abs/2511.05513v1
Molecular Dynamics Simulations of Membrane Selectivity of Star Peptides Across Different Bacterial and Mammalian Bilipids
Updated: 2025-10-25T09:05:07Z

Structurally nanoengineered antimicrobial peptide polymers (SNAPPs) are emerging as promising selective agents against bacterial membranes. In this study, we used all-atom molecular dynamics simulation techniques to investigate the interaction of a promising cationic SNAPP architecture (Alt-SNAPP with 8 arms made of alternating lysine and valine residues) with modelled Gram-negative, Gram-positive, mammalian, and red blood cell membranes. Alt-SNAPP exhibited rapid and stable binding to bacterial membranes, driven by electrostatic interactions with anionic lipids such as phosphatidylglycerol (PG) and cardiolipin (CL), and supported by membrane fluidity.
In contrast, mammalian and red blood cell membranes, enriched in zwitterionic lipids and cholesterol, resisted peptide association entirely. Analyses of center of mass distance, partial density, hydrogen bonding, and interaction energy confirmed that SNAPP remains fully excluded from host-like membranes while forming stable, multivalent interactions with bacterial bilayers. These findings provide mechanistic insight into membrane selectivity of SNAPP and offer a molecular framework for designing next-generation antimicrobial polymers with minimal off-target toxicity.

Published: 2025-10-25T09:05:07Z
Authors: Amal Jayawardena, Andrew Hung, Greg Qiao, Neil OBrien-Simpson, Elnaz Hajizadeh

http://arxiv.org/abs/2510.22179v1
Beyond Autophagy: VPS39 Deficiency Triggers Migrasome-Driven Stress Adaptation Revealed by Super-Resolution Imaging
Updated: 2025-10-25T06:18:48Z

Autophagy and migrasome formation constitute critical cellular mechanisms for maintaining cellular homeostasis; however, their potential compensatory interplay remains poorly understood. In this study, we identify VPS39, a core component of the HOPS complex, as a molecular switch coordinating these processes. Genetic ablation of VPS39 not only impairs autophagic flux but also triggers cell migration through upregulation of RhoA/Rac1 GTPases, consequently facilitating migrasome formation. Using super-resolution microscopy, we further demonstrate that migrasomes serve as an alternative disposal route for damaged mitochondria during VPS39-induced autophagy impairment, revealing a novel stress adaptation mechanism. Our work establishes a previously unrecognized autophagy-migrasome axis and provides direct visual evidence of organelle quality control via migrasomal extrusion.
These findings position VPS39-regulated pathway switching as a potential therapeutic strategy for neurodegenerative diseases characterized by autophagy dysfunction.

Published: 2025-10-25T06:18:48Z
Authors: Xuelei Pang, Weiyun Sun, Ning Jing, Wenwen Gong, Cuifang Kuang, Xu Liu, Hanbing Li, Yu-Hui Zhang, Yubing Han

http://arxiv.org/abs/2511.05510v1
TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles
Updated: 2025-10-24T13:11:47Z

Understanding the dynamic behavior of proteins is critical to elucidating their functional mechanisms, yet generating realistic, temporally coherent trajectories of protein ensembles remains a significant challenge. In this work, we introduce a novel hierarchical autoregressive framework for modeling protein dynamics that leverages the intrinsic multi-scale organization of molecular motions. Unlike existing methods that focus on generating static conformational ensembles or treat dynamic sampling as an independent process, our approach characterizes protein dynamics as a Markovian process. The framework employs a two-scale architecture: a low-resolution model captures slow, collective motions driving major conformational transitions, while a high-resolution model generates detailed local fluctuations conditioned on these large-scale movements. This hierarchical design ensures that the causal dependencies inherent in protein dynamics are preserved, enabling the generation of temporally coherent and physically realistic trajectories.
By bridging high-level biophysical principles with state-of-the-art generative modeling, our approach provides an efficient framework for simulating protein dynamics that balances computational efficiency with physical accuracy.

Published: 2025-10-24T13:11:47Z
Authors: Yaoyao Xu, Di Wang, Zihan Zhou, Tianshu Yu, Mingchen Chen

http://arxiv.org/abs/2510.20788v1
Predicting Protein-Nucleic Acid Flexibility Using Persistent Sheaf Laplacians
Updated: 2025-10-23T17:53:33Z

Understanding the flexibility of protein-nucleic acid complexes, often characterized by atomic B-factors, is essential for elucidating their structure, dynamics, and functions, such as reactivity and allosteric pathways. Traditional models such as Gaussian Network Models (GNM) and Elastic Network Models (ENM) often fall short in capturing multiscale interactions, especially in large or complex biomolecular systems. In this work, we apply the Persistent Sheaf Laplacian (PSL) framework for the B-factor prediction of protein-nucleic acid complexes. The PSL model integrates multiscale analysis, algebraic topology, combinatorial Laplacians, and sheaf theory for data representation. It reveals topological invariants in its harmonic spectra and captures the homotopic shape evolution of data with its non-harmonic spectra. Its localization enables accurate B-factor predictions. We benchmark our method on three diverse datasets, including protein-RNA and nucleic-acid-only structures, and demonstrate that PSL consistently outperforms existing models such as GNM and multiscale FRI (mFRI), achieving up to a 21% improvement in Pearson correlation coefficient for B-factor prediction.
These results highlight the robustness and adaptability of PSL in modeling complex biomolecular interactions and suggest its potential utility in broader applications such as mutation impact analysis and drug design.

Published: 2025-10-23T17:53:33Z
Authors: Nicole Hayes, Ekaterina Merkurjev, Guo-Wei Wei

http://arxiv.org/abs/2410.14621v3
JAMUN: Bridging Smoothed Molecular Dynamics and Score-Based Learning for Conformational Ensembles
Updated: 2025-10-23T05:16:59Z

Conformational ensembles of protein structures are immensely important both for understanding protein function and for drug discovery in novel modalities such as cryptic pockets. Current techniques for sampling ensembles such as molecular dynamics (MD) are computationally inefficient, while many recent machine learning methods do not transfer to systems outside their training data. We propose JAMUN, which performs MD in a smoothed, noised space of all-atom 3D conformations of molecules by utilizing the framework of walk-jump sampling. JAMUN enables ensemble generation for small peptides an order of magnitude faster than traditional molecular dynamics. The physical priors in JAMUN enable transferability to systems outside of its training data, even to peptides that are longer than those originally trained on. Our model, code and weights are available at https://github.com/prescient-design/jamun.

Published: 2024-10-18T17:21:25Z
Comment: 37 pages, accepted at NeurIPS 2025
Authors: Ameya Daigavane, Bodhi P. Vani, Darcy Davidson, Saeed Saremi, Joshua Rackers, Joseph Kleinhenz

http://arxiv.org/abs/2510.19947v1
Modelling multiscale architecture of biofilm extracellular matrix and its role in oxygen transport
Updated: 2025-10-22T18:18:33Z

The extracellular matrix of biofilms presents a dense and intricate architecture. Numerous biophysical properties of the matrix surrounding microbial cells contribute to the heterogeneity of biofilms and their functions at the microscale.
Previous mathematical models assume the matrix to be homogeneous, often overlooking the need for a detailed mechanistic understanding of the extracellular space. In this theoretical study, we introduce a novel cell-capsule approach to investigate geometric patterns in biofilm morphology and predict their role in oxygen transport. The thickness of the capsule and the arrangement of cell-capsule patterns can influence matrix heterogeneity, providing a clear picture of biofilm structure. By incorporating the bacterial capsule as a distinct, low-diffusivity phase, our novel cell-capsule model reveals that this architecture acts as a significant 'resistance-in-series' barrier. We found that a thick capsule/dense matrix arrangement can reduce local oxygen transfer by approximately 70%, a substantial drop that may drive further research into oxygen limitations during early-stage biofilm development.

Published: 2025-10-22T18:18:33Z
Comment: 5 figures, 3 tables
Authors: Raghu K. Moorthy, Eoin Casey

http://arxiv.org/abs/2507.05101v2
PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs
Updated: 2025-10-22T15:38:08Z

Deep learning-based computational methods have achieved promising results in predicting protein-protein interactions (PPIs). However, existing benchmarks predominantly focus on isolated pairwise evaluations, overlooking a model's capability to reconstruct biologically meaningful PPI networks, which is crucial for biology research. To address this gap, we introduce PRING, the first comprehensive benchmark that evaluates protein-protein interaction prediction from a graph-level perspective. PRING curates a high-quality, multi-species PPI network dataset comprising 21,484 proteins and 186,818 interactions, with well-designed strategies to address both data redundancy and leakage.
Building on this gold-standard dataset, we establish two complementary evaluation paradigms: (1) topology-oriented tasks, which assess intra- and cross-species PPI network construction, and (2) function-oriented tasks, including protein complex pathway prediction, GO module analysis, and essential protein justification. These evaluations not only reflect the model's capability to understand the network topology but also facilitate protein function annotation, biological module detection, and even disease mechanism analysis. Extensive experiments on four representative model categories, consisting of sequence similarity-based, naive sequence-based, protein language model-based, and structure-based approaches, demonstrate that current PPI models have potential limitations in recovering both structural and functional properties of PPI networks, highlighting the gap in supporting real-world biological applications. We believe PRING provides a reliable platform to guide the development of more effective PPI prediction models for the community. The dataset and source code of PRING are available at https://github.com/SophieSarceau/PRING.

Published: 2025-07-07T15:21:05Z
Authors: Xinzhe Zheng, Hao Du, Fanding Xu, Jinzhe Li, Zhiyuan Liu, Wenkang Wang, Tao Chen, Wanli Ouyang, Stan Z. Li, Yan Lu, Nanqing Dong, Yang Zhang

http://arxiv.org/abs/2309.16519v4
AtomSurf : Surface Representation for Learning on Protein Structures
Updated: 2025-10-22T14:23:58Z

While there has been significant progress in evaluating and comparing different representations for learning on protein data, the role of surface-based learning approaches remains poorly understood. In particular, there is a lack of direct and fair benchmark comparison between the best available surface-based learning methods and alternative representations such as graphs. Moreover, the few existing surface-based approaches either use surface information in isolation or, at best, perform global pooling between surface and graph-based architectures.
In this work, we fill this gap by first adapting a state-of-the-art surface encoder for protein learning tasks. We then perform a direct and fair comparison of the resulting method against alternative approaches within the Atom3D benchmark, highlighting the limitations of pure surface-based learning. Finally, we propose an integrated approach, which allows learned feature sharing between graphs and surface representations on the level of nodes and vertices across all layers.
We demonstrate that the resulting architecture achieves state-of-the-art results on all tasks in the Atom3D benchmark, while adhering to the strict benchmark protocol, as well as more broadly on binding site identification and binding pocket classification. Furthermore, we use coarsened surfaces and optimize our approach for efficiency, making our tool competitive in training and inference time with existing techniques. Code can be found online: https://github.com/Vincentx15/atomsurf

Published: 2023-09-28T15:25:17Z
Comment: Published as a conference paper at The Thirteenth International Conference on Learning Representations (ICLR 2025). The official open-access version is available at https://openreview.net/forum?id=ARQIJXFcTH
Journal reference: The Thirteenth International Conference on Learning Representations (ICLR), 2025
Authors: Vincent Mallet, Souhaib Attaiki, Yangyang Miao, Bruno Correia, Maks Ovsjanikov