http://arxiv.org/abs/2504.11249v3 Cryo-em images are intrinsically low dimensional 2025-09-03T20:25:50Z Simulation-based inference provides a powerful framework for cryo-electron microscopy, employing neural networks in methods like CryoSBI to infer biomolecular conformations via learned latent representations. This latent space represents a rich opportunity, encoding valuable information about the physical system and the inference process. Harnessing this potential hinges on understanding the underlying geometric structure of these representations. We investigate this structure by applying manifold learning techniques to CryoSBI representations of hemagglutinin (simulated and experimental). We reveal that these high-dimensional data inherently populate low-dimensional, smooth manifolds, with simulated data effectively covering the experimental counterpart. By characterizing the manifold's geometry using Diffusion Maps and identifying its principal axes of variation via coordinate interpretation methods, we establish a direct link between the latent structure and key physical parameters. Discovering this intrinsic low-dimensionality and interpretable geometric organization not only validates the CryoSBI approach but enables us to learn more from the data structure and provides opportunities for improving future inference strategies by exploiting this revealed manifold geometry. 2025-04-15T14:46:25Z Luke Evans Octavian-Vlad Murad Lars Dingeldein Pilar Cossio Roberto Covino Marina Meila 10.1103/txrb-fw3z http://arxiv.org/abs/2509.01550v1 OTMol: Robust Molecular Structure Comparison via Optimal Transport 2025-09-01T15:27:31Z Root-mean-square deviation (RMSD) is widely used to assess structural similarity in systems ranging from flexible ligand conformers to complex molecular cluster configurations. 
Despite its wide utility, RMSD calculation is often challenged by inconsistent atom ordering, indistinguishable configurations in molecular clusters, and potential chirality inversion during alignment. These issues highlight the necessity of accurate atom-to-atom correspondence as a prerequisite for meaningful alignment. Traditional approaches often rely on heuristic cost matrices combined with the Hungarian algorithm, yet these methods underutilize the rich intra-molecular structural information and may fail to generalize across chemically diverse systems. In this work, we introduce OTMol, a method that formulates the molecular alignment task as a fused supervised Gromov-Wasserstein (fsGW) optimal transport problem. By leveraging the intrinsic geometric and topological relationships within each molecule, OTMol eliminates the need for manually defined cost functions and enables a principled, data-driven matching strategy. Importantly, OTMol preserves key chemical features such as molecular chirality and bond connectivity consistency. We evaluate OTMol across a wide range of molecular systems, including Adenosine triphosphate, Imatinib, lipids, small peptides, and water clusters, and demonstrate that it consistently achieves low RMSD values while preserving computational efficiency. Importantly, OTMol maintains molecular integrity by enforcing one-to-one mappings between entire molecules, thereby avoiding erroneous many-to-one alignments that often arise in comparing molecular clusters. Our results underscore the utility of optimal transport theory for molecular alignment and offer a generalizable framework applicable to structural comparison tasks in cheminformatics, molecular modeling, and related disciplines. 
2025-09-01T15:27:31Z Xiaoqi Wei Xuhang Dai Yaqi Wu Yanxiang Zhao Yingkai Zhang Zixuan Cang http://arxiv.org/abs/2509.01038v1 Learning residue level protein dynamics with multiscale Gaussians 2025-09-01T00:38:44Z Many methods have been developed to predict static protein structures; however, understanding the dynamics of protein structure is essential for elucidating biological function. While molecular dynamics (MD) simulations remain the in silico gold standard, their high computational cost limits scalability. We present DynaProt, a lightweight, SE(3)-invariant framework that predicts rich descriptors of protein dynamics directly from static structures. By casting the problem through the lens of multivariate Gaussians, DynaProt estimates dynamics at two complementary scales: (1) per-residue marginal anisotropy as $3 \times 3$ covariance matrices capturing local flexibility, and (2) joint scalar covariances encoding pairwise dynamic coupling across residues. From these dynamics outputs, DynaProt achieves high accuracy in predicting residue-level flexibility (RMSF) and, remarkably, enables reasonable reconstruction of the full covariance matrix for fast ensemble generation. Notably, it does so using orders of magnitude fewer parameters than prior methods. Our results highlight the potential of direct protein dynamics prediction as a scalable alternative to existing methods. 2025-09-01T00:38:44Z Mihir Bafna Bowen Jing Bonnie Berger http://arxiv.org/abs/2508.21350v1 The Quantum Compass Mechanism in Cryptochromes 2025-08-29T06:24:27Z Cryptochrome flavoproteins are prime candidates for mediating magnetic sensing in migratory animals via the radical pair mechanism (RPM), a spin-dependent process initiated by photoinduced electron transfer. The canonical FAD-tryptophan radical pair exhibits pronounced anisotropic hyperfine couplings, enabling sensitivity to geomagnetic fields. 
However, maintaining spin coherence under physiological conditions and explaining responses to weak radiofrequency fields remain unresolved challenges. Alternative radicals, such as superoxide and ascorbate, have been proposed to enhance anisotropy or suppress decoherence. This review summarizes the quantum basis of magnetoreception, evaluates both canonical and alternative radical pair models, and discusses amplification strategies including triads, spin scavenging, and bystander radicals. Emphasis is placed on how molecular geometry, exchange and dipolar interactions, and hyperfine topology modulate magnetic sensitivity. Key open questions and future directions are outlined, highlighting the need for structural and dynamical data under physiological conditions. 2025-08-29T06:24:27Z Zou Chengye Liu Ya-jun Wang Beibei http://arxiv.org/abs/2504.16621v3 Stable electron-irradiated [1-$^{13}$C]alanine radicals for clinically viable metabolic imaging with Dynamic Nuclear Polarization 2025-08-28T21:12:08Z Dissolution Dynamic Nuclear Polarisation (dDNP) increases the sensitivity of magnetic resonance experiments by $>10^4$-fold, permitting isotopically-labelled molecules to be transiently visible in MRI scans. dDNP requires a source of unpaired electrons in contact with labelled nuclei, cooled to $\sim$1K, and spin-pumped into a given state by microwaves. These electrons are usually chemical radicals, requiring removal by filtration prior to injection into humans. Alternative sources, such as UV irradiation, generate lower polarisation and require cryogenic transport. We present ultra-high-dose-rate electron irradiation as a novel alternative for generating non-persistent radicals in alanine/glycerol mixtures. These are stable for months at room temperature, quench spontaneously upon dissolution, are present in dose-dependent concentrations, and generate comparable nuclear polarisation to trityl radicals used clinically (20\%) through a novel mechanism. 
This process is inherently sterilising, permitting imaging of alanine metabolism \textit{in vivo}. As well as scientific novelty, this overcomes the biggest barrier to clinically translating dDNP. 2025-04-23T11:23:23Z This is the original version of a manuscript submitted to Science Advances. Per editorial policy this is not the version post the first revision, but rather the manuscript as originally submitted Catriona H. E. Rooney Department of Physiology, Anatomy and Genetics, University of Oxford Justin Y. C. Lau GE HealthCare, Waukesha, WI, USA Esben S. S. Hansen The MR Research Centre, Aarhus University, Aarhus, Denmark Nichlas Vous Christensen The MR Research Centre, Aarhus University, Aarhus, Denmark Duy A. Dang The MR Research Centre, Aarhus University, Aarhus, Denmark Kristoffer Petersson Department of Oncology, University of Oxford Iain D. C. Tullis Department of Oncology, University of Oxford Borivoj Vojnovic Department of Oncology, University of Oxford Sean Smart Department of Oncology, University of Oxford Jarrod Lewis Department of Material Science, University of Oxford William Myers CAESR, Department of Chemistry, University of Oxford Zoe Richardson Department of Physics, University of Oxford Brett W. C. Kennedy Department of Physiology, Anatomy and Genetics, University of Oxford Alice M. Bowen Department of Material Science, University of Oxford Lotte Bonde Bertelsen The MR Research Centre, Aarhus University, Aarhus, Denmark Christoffer Laustsen The MR Research Centre, Aarhus University, Aarhus, Denmark Damian J. Tyler Department of Physiology, Anatomy and Genetics, University of Oxford OCMR, Cardiovascular Medicine, University of Oxford Jack J. 
Miller The MR Research Centre, Aarhus University, Aarhus, Denmark 10.1126/sciadv.adz4334 http://arxiv.org/abs/2506.06294v2 GLProtein: Global-and-Local Structure Aware Protein Representation Learning 2025-08-28T09:38:11Z Proteins are central to biological systems, participating as building blocks across all forms of life. Despite advancements in understanding protein functions through protein sequence analysis, there remains potential for further exploration in integrating protein structural information. We argue that the structural information of proteins is not limited to their 3D information but also encompasses information ranging from amino acid molecules (local information) to protein-protein structure similarity (global information). To address this, we propose \textbf{GLProtein}, the first framework in protein pre-training that incorporates both global structural similarity and local amino acid details to enhance prediction accuracy and functional insights. GLProtein innovatively combines protein-masked modelling with triplet structure similarity scoring, protein 3D distance encoding and substructure-based amino acid molecule encoding. Experimental results demonstrate that GLProtein outperforms previous methods in several bioinformatics tasks, including protein-protein interaction prediction and contact prediction. 2025-05-17T14:45:13Z Accepted to EMNLP 2025 Findings Yunqing Liu Wenqi Fan Xiaoyong Wei Qing Li http://arxiv.org/abs/2508.19829v1 Single-molecule biophysics 2025-08-27T12:27:08Z Biological molecules, like all active matter, use free energy to generate force and motion which drive them out of thermal equilibrium, and undergo inherent dynamic interconversion between metastable free energy states separated by levels barely higher than stochastic thermal energy fluctuations. 
Here, we explore the founding and emerging approaches of the field of single-molecule biophysics which, unlike traditional ensemble average approaches, enable the detection and manipulation of individual molecules and facilitate exploration of biomolecular heterogeneity and its impact on transitional molecular kinetics and underpinning molecular interactions. We discuss the ground-breaking technological innovations which scratch far beyond the surface into open questions of real physiology, that correlate orthogonal data types and interplay empirical measurement with theoretical and computational insights, many of which are enabling artificial matter to be designed inspired by biological systems. And finally, we examine how these insights are helping to develop new physics framed around biology. 2025-08-27T12:27:08Z arXiv admin note: text overlap with arXiv:1704.06837 Mark C Leake http://arxiv.org/abs/2508.19800v1 A Multi-Layered Framework for Modeling Human Biology: From Basic AI Agents to a Full-Body AI Agent 2025-08-27T11:35:29Z We envision the Full-Body AI Agent as a comprehensive AI system designed to simulate, analyze, and optimize the dynamic processes of the human body across multiple biological levels. By integrating computational models, machine learning tools, and experimental platforms, this system aims to replicate and predict both physiological and pathological processes, ranging from molecules and cells to tissues, organs, and entire body systems. Central to the Full-Body AI Agent is its emphasis on integration and coordination across these biological levels, enabling analysis of how molecular changes influence cellular behaviors, tissue responses, organ function, and systemic outcomes. With a focus on biological functionality, the system is designed to advance the understanding of disease mechanisms, support the development of therapeutic interventions, and enhance personalized medicine. 
We propose two specialized implementations to demonstrate the utility of this framework: (1) the metastasis AI Agent, a multi-scale metastasis scoring system that characterizes tumor progression across the initiation, dissemination, and colonization phases by integrating molecular, cellular, and systemic signals; and (2) the drug AI Agent, a system-level drug development paradigm in which a drug AI-Agent dynamically guides preclinical evaluations, including organoids and chip-based models, by providing full-body physiological constraints. This approach enables the predictive modeling of long-term efficacy and toxicity beyond what localized models alone can achieve. These two agents illustrate the potential of Full-Body AI Agent to address complex biomedical challenges through multi-level integration and cross-scale reasoning. 2025-08-27T11:35:29Z Aoqi Wang Jiajia Liu Jianguo Wen Yangyang Luo Zhiwei Fan Liren Yang Xi Hu Ruihan Luo Yankai Yu Sophia Li Weiling Zhao Xiaobo Zhou http://arxiv.org/abs/2508.19632v1 TopoBind: Multi-Modal Prediction of Antibody-Antigen Binding Free Energy via Sequence Embeddings and Structural Topology 2025-08-27T07:12:32Z Predicting the binding free energy between antibodies and antigens is a key challenge in structure-aware biomolecular modeling, with direct implications for antibody design. Most existing methods either rely solely on sequence embeddings or struggle to capture complex structural relationships, thus limiting predictive performance. In this work, we present a novel framework that integrates sequence-based representations from pre-trained protein language models (ESM-2) with a set of topological features. 
Specifically, we extract contact map metrics reflecting residue-level connectivity, interface geometry descriptors characterizing cross-chain interactions, distance map statistics quantifying spatial organization, and persistent homology invariants that systematically capture the emergence and persistence of multi-scale topological structures - such as connected components, cycles, and cavities - within individual proteins and across the antibody-antigen interface. By leveraging a cross-attention mechanism to fuse these diverse modalities, our model effectively encodes both global and local structural organization, thereby substantially enhancing the prediction of binding free energy. Extensive experiments demonstrate that our model consistently outperforms sequence-only and conventional structural models, achieving state-of-the-art accuracy in binding free energy prediction. 2025-08-27T07:12:32Z 17 pages, 9 figures Ciyuan Yu Hongzong Li Jiahao Ma Shiqin Tang Ye-Fan Hu Jian-Dong Huang http://arxiv.org/abs/2508.01799v2 Contrastive Multi-Task Learning with Solvent-Aware Augmentation for Drug Discovery 2025-08-27T04:11:51Z Accurate prediction of protein-ligand interactions is essential for computer-aided drug discovery. However, existing methods often fail to capture solvent-dependent conformational changes and lack the ability to jointly learn multiple related tasks. To address these limitations, we introduce a pre-training method that incorporates ligand conformational ensembles generated under diverse solvent conditions as augmented input. This design enables the model to learn both structural flexibility and environmental context in a unified manner. The training process integrates molecular reconstruction to capture local geometry, interatomic distance prediction to model spatial relationships, and contrastive learning to build solvent-invariant molecular representations. 
Together, these components lead to significant improvements, including a 3.7% gain in binding affinity prediction, an 82% success rate on the PoseBusters Astex docking benchmarks, and an area under the curve of 97.1% in virtual screening. The framework supports solvent-aware, multi-task modeling and produces consistent results across benchmarks. A case study further demonstrates sub-angstrom docking accuracy with a root-mean-square deviation of 0.157 angstroms, offering atomic-level insight into binding mechanisms and advancing structure-based drug design. 2025-08-03T15:25:42Z 10 pages, 4 figures Jing Lan Hexiao Ding Hongzhao Chen Yufeng Jiang Nga-Chun Ng Gerald W. Y. Cheng Zongxi Li Jing Cai Liang-ting Lin Jung Sun Yoo http://arxiv.org/abs/2404.02360v2 FraGNNet: A Deep Probabilistic Model for Tandem Mass Spectrum Prediction 2025-08-27T03:28:45Z Compound identification from tandem mass spectrometry (MS/MS) data is a critical step in the analysis of complex mixtures. Typical solutions for the MS/MS spectrum to compound (MS2C) problem involve comparing the unknown spectrum against a library of known spectrum-molecule pairs, an approach that is limited by incomplete library coverage. Compound to MS/MS spectrum (C2MS) models can improve retrieval rates by augmenting real libraries with predicted MS/MS spectra. Unfortunately, many existing C2MS models suffer from problems with mass accuracy, generalization, or interpretability. We develop a new probabilistic method for C2MS prediction, FraGNNet, that can efficiently and accurately simulate MS/MS spectra with high mass accuracy. Our approach formulates the C2MS problem as learning a distribution over molecule fragments. FraGNNet achieves state-of-the-art performance in terms of prediction error and surpasses existing C2MS models as a tool for retrieval-based MS2C. 
2024-04-02T23:16:15Z Transactions on Machine Learning Research (TMLR), 08/2025 Adamo Young Fei Wang David S Wishart Bo Wang Russell Greiner Hannes Röst http://arxiv.org/abs/2508.19049v1 Evaluation of in vitro antibacterial activity and phytochemical profile of aqueous leaf extract of Asystasia variabilis 2025-08-26T14:03:50Z This study evaluated the in vitro antibacterial effect and the phytochemical profile of an aqueous extract of fresh mature leaves of Asystasia variabilis, a Sri Lankan indigenous plant, against four common wound-infecting bacteria (Staphylococcus aureus, Bacillus subtilis, Pseudomonas aeruginosa and Escherichia coli) using the Kirby-Bauer disk diffusion test. Gentamicin (10 μg/disk) and distilled water were used as positive and negative controls, respectively. The study revealed, for the first time, that the extract possessed significant antibacterial activity against all four test organisms in a concentration-dependent manner (r values ranging from 0.921 to 0.992, P < 0.01), with inhibition zone diameters ranging between 8 and 28 mm. The highest antibacterial activity was exhibited against B. subtilis at 1000 μg/disk (27.43 ± 0.02 mm). The extract showed inhibitory effects comparable to gentamicin towards B. subtilis and P. aeruginosa at 500 μg/disk and towards E. coli and S. aureus at 1000 μg/disk. Qualitative phytochemical screening revealed the presence of flavonoids, tannins, phenols, cardiac glycosides, amino acids, carbohydrates, alkaloids and saponins. Therefore, it is likely that the antibacterial effect of the extract is mediated by synergistic mechanisms. Furthermore, the results of this study scientifically justified the claims of traditional and folk medicine in the treatment of abscesses, wounds and ulcers and indicated the potential for the development of a novel drug from mature leaves of Asystasia variabilis. 
2025-08-26T14:03:50Z R Wijerathna NAV Asanthi WD Ratnasooriya RN Pathirana NRM Nelumdeniya http://arxiv.org/abs/2508.19025v1 In-vitro Anti-bacterial Activity of Methanol and Aqueous Crude Extracts of Horsfieldia iryaghedhi 2025-08-26T13:43:40Z Aims: Over the past two decades, the rise of multidrug resistance (MDR) in bacteria has posed a significant threat to global health. The urgent need for new treatment alternatives has brought attention to the potential of plants, which harbor a wealth of unexplored phytochemicals with therapeutic properties. This study aims to evaluate the anti-bacterial efficacy of methanol and aqueous extracts from the leaves and bark of Horsfieldia iryaghedhi in vitro. Methodology: Aqueous and methanol extracts were obtained by the cold maceration method. The in vitro anti-bacterial activity of methanol and aqueous leaf, bark, and combination extracts was determined against the gram-negative bacterium Escherichia coli (ATCC 25922) and the gram-positive bacterium Staphylococcus aureus (ATCC 25923). The anti-bacterial assay for different concentrations of each extract was conducted through the well-diffusion method, with gentamicin serving as the positive control. Results: Methanol leaf and combination extracts of Horsfieldia iryaghedhi showed a positive anti-bacterial response at their highest concentrations of 1000 mcg/mL and 500 mcg/mL against the gram-positive bacterium Staphylococcus aureus, while none of the extracts showed anti-bacterial activity against gram-negative E. coli at the tested concentrations. 
Conclusion: The study concludes that methanol extracts of H. iryaghedhi should be further analyzed for their anti-bacterial activity, and that they may contain potential lead molecules that can be developed as antibiotics. 2025-08-26T13:43:40Z RMHKK Rajapaksha EMN Fernando AWMKK Bandara NRM Nelumdeniya ARN Silva 10.9734/aprj/2024/v12i4259 http://arxiv.org/abs/2408.09896v2 Instruction-Based Molecular Graph Generation with Unified Text-Graph Diffusion Model 2025-08-26T07:02:24Z Recent advancements in computational chemistry have increasingly focused on synthesizing molecules based on textual instructions. Integrating graph generation with these instructions is complex, leading most current methods to use molecular sequences with pre-trained large language models. In response to this challenge, we propose a novel framework, named $\textbf{UTGDiff (Unified Text-Graph Diffusion Model)}$, which utilizes language models for discrete graph diffusion to generate molecular graphs from instructions. UTGDiff features a unified text-graph transformer as the denoising network, derived from pre-trained language models and minimally modified to process graph data through attention bias. Our experimental results demonstrate that UTGDiff consistently outperforms sequence-based baselines in tasks involving instruction-based molecule generation and editing, achieving superior performance with fewer parameters given an equivalent level of pretraining corpus. Our code is available at https://github.com/ran1812/UTGDiff. 2024-08-19T11:09:15Z ECAI 2025 Yuran Xiang Haiteng Zhao Chang Ma Zhi-Hong Deng http://arxiv.org/abs/2508.18493v1 Disordered But Rhythmic: the role of intrinsic protein disorder in eukaryotic circadian timing 2025-08-25T21:07:14Z Intrinsically disordered protein regions (IDRs) are found across all domains of life and are characterized by a lack of stable 3D structure. 
Nevertheless, IDRs play critical roles in the most tightly regulated cellular processes, including in the core circadian clock. The molecular oscillator at the heart of circadian regulation leverages IDRs as dynamic interaction modules for activation and repression to support robust timekeeping and expand clock output and regulation. Here, we cover the biophysical mechanisms conferred by IDRs and their modulators. We survey the intrinsically disordered regions in clock proteins that are widely prevalent from fungi to mammals and discuss the importance of IDRs to the core clock and beyond. 2025-08-25T21:07:14Z Emery T. Usher Jacqueline F. Pelham