https://arxiv.org/api/u2OjdWRwnQTw2HAj+GF/34576H0 2026-03-24T08:29:59Z 4112 90 15 http://arxiv.org/abs/2508.03584v2 Decoding and Engineering the Phytobiome Communication for Smart Agriculture 2025-12-15T11:34:13Z

Smart agriculture applications, integrating technologies like the Internet of Things and machine learning/artificial intelligence (ML/AI) into agriculture, hold promise to address modern challenges of rising food demand, environmental pollution, and water scarcity. Alongside the concept of the phytobiome, which defines the area including the plant, its environment, and associated organisms, and the recent emergence of molecular communication (MC), there exists an important opportunity to advance agricultural science and practice using communication theory. In this article, we argue for using a communication engineering perspective to develop a holistic understanding of phytobiome communication and to bridge the gap between phytobiome communication and smart agriculture. First, an overview of phytobiome communication via molecular and electrophysiological signals is presented and a multi-scale framework modeling the phytobiome as a communication network is conceptualized. Then, plant experiments demonstrate how this framework can be used to model electrophysiological signals. Furthermore, possible smart agriculture applications, such as smart irrigation and targeted delivery of agrochemicals, are proposed through engineering the phytobiome communication. These applications merge ML/AI methods with the Internet of Bio-Nano-Things enabled by MC and pave the way towards more efficient, sustainable, and eco-friendly agricultural production. Finally, the implementation challenges, open research issues, and industrial outlook for these applications are discussed.

2025-08-05T15:50:19Z Accepted for IEEE Communications Magazine Fatih Gulec Hamdan Awan Nigel Wallbridge Andrew W. Eckford 10.1109/MCOM.001.2400570 http://arxiv.org/abs/2401.12498v2 Understanding Cellular Noise with Optical Perturbation and Deep Learning 2025-12-15T06:39:56Z

Noise plays a crucial role in the regulation of cellular and organismal function and behavior. Exploring noise's impact is key to understanding fundamental biological processes, such as gene expression, signal transduction, and the mechanisms of development and evolution. Currently, a comprehensive method to quantify the dynamical behavior of cellular noise within these biochemical systems is lacking. In this study, we introduce an optically-controlled perturbation system utilizing the light-sensitive Phytochrome B (PhyB) from \textit{Arabidopsis thaliana}, which enables precise noise modulation with high spatial-temporal resolution. Our system exhibits exceptional sensitivity to light, reacting consistently to pulsed light signals, distinguishing it from other photoreceptor-based promoter systems that respond to a single light wavelength. To characterize our system, we developed a stochastic model for phytochromes that accounts for photoactivation/deactivation, thermal reversion, and the dynamics of the light-activated gene promoter system. To precisely control our system, we determined the rate constants for this model using an omniscient deep neural network that can directly map rate constant combinations to time-dependent state joint distributions. By adjusting the activation rates through light intensity and degradation rates via N-terminal mutagenesis, we illustrate that our optically-controlled perturbation can effectively modulate molecular expression level as well as noise. Our results highlight the potential of employing an optically-controlled gene perturbation system as a noise-controlled stimulus source. This approach, when combined with the analytical capabilities of a sophisticated deep neural network, enables the accurate estimation of rate constants from observational data in a broad range of biochemical reaction networks.

2024-01-23T05:48:20Z Withdrawn by the authors to conduct additional validation experiments. The results will be updated in a future version Chuanbo Liu Yu Fu Lu Lin Elliot L. Elson Jin Wang http://arxiv.org/abs/2512.10309v2 Tracking large chemical reaction networks and rare events by neural networks 2025-12-13T16:56:45Z

Chemical reaction networks are widely used to model stochastic dynamics in chemical kinetics, systems biology and epidemiology. Solving the chemical master equation that governs these systems poses a significant challenge because the state space grows exponentially with system size. The development of autoregressive neural networks offers a flexible framework for this problem; however, its efficiency is limited, especially for high-dimensional systems and in scenarios with rare events. Here, we push the frontier of the neural-network approach by exploiting faster optimizations such as natural gradient descent and the time-dependent variational principle, achieving a 5- to 22-fold speedup, and by leveraging enhanced-sampling strategies to capture rare events. We demonstrate reduced computational cost and higher accuracy over the previous neural-network method in challenging reaction networks, including the mitogen-activated protein kinase (MAPK) cascade network, the hitherto largest biological network handled by the previous approaches of solving the chemical master equation. We further apply the approach to spatially extended reaction-diffusion systems, the Schlögl model with rare events, on two-dimensional lattices, beyond the recent tensor-network approach that handles one-dimensional lattices. The present approach thus enables efficient modeling of chemical reaction networks in general.

2025-12-11T05:55:44Z Jiayu Weng Xinyi Zhu Jing Liu Linyuan Lü Pan Zhang Ying Tang http://arxiv.org/abs/2512.00696v3 Hierarchical Molecular Language Models (HMLMs) 2025-12-12T22:51:42Z

Artificial intelligence (AI) is reshaping computational and network biology by enabling new approaches to decode cellular communication networks. We introduce Hierarchical Molecular Language Models (HMLMs), a novel framework that models cellular signaling as a specialized molecular language, where signaling molecules function as tokens, protein interactions define syntax, and functional consequences constitute semantics. HMLMs employ a transformer-based architecture adapted to accommodate graph-structured signaling networks through information transducers, mathematical entities that capture how molecules receive, process, and transmit signals. The architecture integrates multi-modal data sources across molecular, pathway, and cellular scales through hierarchical attention mechanisms and scale-bridging operators that enable information flow across biological hierarchies. Applied to a complex network of cardiac fibroblast signaling, HMLMs outperformed traditional approaches in temporal dynamics prediction, particularly under sparse sampling conditions. Attention-based analysis revealed biologically meaningful crosstalk patterns, including previously uncharacterized interactions between signaling pathways. By bridging molecular mechanisms with cellular phenotypes through AI-driven molecular language representation, HMLMs establish a foundation for biology-oriented large language models (LLMs) that could be pre-trained on comprehensive pathway datasets and applied across diverse signaling systems and tissues, advancing precision medicine and therapeutic discovery.

2025-11-30T02:09:27Z The current version includes minor revisions to the preprint v2 (arXiv preprint arXiv:2512.00696), Added the Supplementary materials section Hasi Hays Yue Yu William J. Richardson http://arxiv.org/abs/2512.11495v1 Mechanisms of thrombin inhibition by protein S and the TFPIα-fVshort-protein S complex 2025-12-12T11:42:45Z

Protein S (PS) is a notable anticoagulant implicated in both bleeding and thrombotic disorders, making it a promising drug target. Importantly, PS enhances the anticoagulant function of TFPI$α$, likely circulating in the bloodstream together with TFPI$α$ and a truncated form of factor V (fVshort) in the trimolecular complex, TFPI$α$-fVshort-PS, which we call protein S complex (PSC). PSC has been proposed to strongly inhibit thrombin production by enhancing the ability of TFPI$α$ to inhibit clotting factor Xa up to 100-fold and by localizing to platelet membranes, limiting fXa activity shortly after coagulation starts. Yet, exactly how PS functions with TFPI$α$ as an anticoagulant remains poorly understood. To investigate, we extend an experimentally validated mathematical model of blood coagulation to include PSC and free PS (not part of PSC) in the plasma, as well as free PS and TFPI$α$ in platelets. We find that shortly after coagulation initiation, PSC strongly inhibits thrombin production. We find that the (unknown) magnitude of the enhanced affinity of PSC binding to inhibit fXa critically regulates PSC's impact on thrombin production. We find that under flow, PSC can unexpectedly accumulate on platelets to concentrations ~50 times higher than in the plasma. We also find that PSC limits thrombin production by occupying fV-specific binding sites on platelets. Our results show that changes in PSC can dramatically impact the severity of pathological bleeding disorders. For the East Texas bleeding disorder, elevated PSC concentrations eliminate thrombin bursts, leading to bleeding. With fV deficiency, reducing PSC rescues thrombin production in severe fV deficiency and returns thrombin production due to mild fV deficiency to normal. Finally, thrombin production in severe hemophilia A can be substantially improved by blocking PSC's anticoagulant function.

2025-12-12T11:42:45Z Submitted to Biophysical Journal on Dec 11, 2025 Alexander G. Ginsberg Josefin Ahnström James T. B. Crawley Karin Leiderman Dougald M. Monroe Keith B. Neeves Suzanne F. Sindi Aaron L. Fogelson http://arxiv.org/abs/2512.11927v1 Gene regulatory network inference algorithm based on spectral signed directed graph convolution 2025-12-12T00:54:53Z

Accurately reconstructing Gene Regulatory Networks (GRNs) is crucial for understanding gene functions and disease mechanisms. Single-cell RNA sequencing (scRNA-seq) technology provides vast data for computational GRN reconstruction. Since GRNs are ideally modeled as signed directed graphs to capture activation/inhibition relationships, the most intuitive and reasonable approach is to design feature extractors based on the topological structure of GRNs to extract structural features, then combine them with biological characteristics for research. However, traditional spectral graph convolution struggles with this representation. Thus, we propose MSGRNLink, a novel framework that explicitly models GRNs as signed directed graphs and employs magnetic signed Laplacian convolution. Experiments across simulated and real datasets demonstrate that MSGRNLink outperforms all baseline models in AUROC. Parameter sensitivity analysis and ablation studies confirmed its robustness and the importance of each module. In a bladder cancer case study, MSGRNLink predicted more known edges and edge signs than benchmark models, further validating its biological relevance.
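To make the idea of a magnetic Laplacian on a signed directed graph concrete, here is a minimal sketch of one possible construction. The specific choices (symmetrized signed weights for the magnitude, a phase taken from the direction of the unsigned edges, and the charge parameter `q = 1/8`) are assumptions made for illustration and may differ from the operator MSGRNLink actually uses. The function name and the three-gene toy network are hypothetical.

```python
import cmath
import math


def magnetic_signed_laplacian(adj, q=0.125):
    """Build a Hermitian Laplacian L = D - H for a signed directed graph.

    adj[i][j] is +1 if gene i activates gene j, -1 if it inhibits it, else 0.
    Illustrative definition (an assumption, not taken from the paper):
      magnitude: symmetrized signed weight 0.5 * (a_ij + a_ji)
      phase:     2 * pi * q * (|a_ij| - |a_ji|), which encodes edge direction
      degree:    0.5 * sum_j (|a_ij| + |a_ji|)
    """
    n = len(adj)
    lap = [[0j] * n for _ in range(n)]
    for i in range(n):
        degree = 0.5 * sum(abs(adj[i][j]) + abs(adj[j][i]) for j in range(n))
        for j in range(n):
            sym = 0.5 * (adj[i][j] + adj[j][i])
            phase = 2.0 * math.pi * q * (abs(adj[i][j]) - abs(adj[j][i]))
            h = sym * cmath.exp(1j * phase)
            lap[i][j] = (degree if i == j else 0.0) - h
    return lap


# Hypothetical toy network: gene 0 activates 1, gene 1 inhibits 2, gene 2 activates 0.
adj = [
    [0, 1, 0],
    [0, 0, -1],
    [1, 0, 0],
]
lap = magnetic_signed_laplacian(adj)
for row in lap:
    print(["{:.3f}".format(z) for z in row])
```

Because the magnitude part is symmetric and the phase part is antisymmetric, the resulting matrix is Hermitian, so it has real eigenvalues and can be used in spectral filters. With `q = 1/8`, the four edge types (activation or inhibition, in either direction) map to four distinct complex values.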

2025-12-12T00:54:53Z Rijie Xi Weikang Xu Wei Xiong Yuannong Ye Bin Zhao http://arxiv.org/abs/2510.05383v2 Mathematical Analysis for a Class of Stochastic Copolymerization Processes 2025-12-11T16:01:32Z

We study a stochastic model of a copolymerization process that has been extensively investigated in the physics literature. The main questions of interest include: (i) what are the criteria for transience, null recurrence, and positive recurrence in terms of the system parameters; (ii) in the transient regime, what are the limiting fractions of the different monomer types; and (iii) in the transient regime, what is the speed of growth of the polymer? Previous studies in the physics literature have addressed these questions using heuristic methods. Here, we utilize rigorous mathematical arguments to derive the results from the physics literature. Moreover, the techniques developed allow us to generalize to the copolymerization process with finitely many monomer types. We expect that the mathematical methods used and developed in this work will also enable the study of even more complex models in the future.

2025-10-06T21:19:11Z 38 pages David F. Anderson Jingyi Ma Praful Gagrani http://arxiv.org/abs/2512.10708v1 Saturation-Based Atom Provenance Tracing in Chemical Reaction Networks 2025-12-11T14:48:00Z

Atom tracing is essential for understanding the fate of labeled atoms in biochemical reaction networks, yet existing computational methods either simplify label correlations or suffer from combinatorial explosion. We introduce a saturation-based framework for enumerating labeling patterns that directly operates on atom-atom maps without requiring flux data or experimental measurements. The approach models reaction semantics using Kleisli morphisms in the powerset monad, allowing for compositional propagation of atom provenance through reaction networks. By iteratively saturating all possible educt combinations of reaction rules, the method exhaustively enumerates labeled molecular configurations, including multiplicities and reuse. Allowing arbitrary initial labeling patterns - including identical or distinct labels - the method expands only isotopomers reachable from these inputs, keeping the configuration space as small as necessary and avoiding the full combinatorial growth characteristic of previous approaches. In principle, even every atom could carry a distinct identifier (e.g., tracing all carbon atoms individually), illustrating the generality of the framework beyond practical experimental limitations. The resulting template instance hypergraph captures the complete flow of atoms between compounds and supports projections tailored to experimental targets. Customizable labeling sets significantly reduce generated network sizes, providing efficient and exact atom traces focused on specific compounds or available isotopes. Applications to the tricarboxylic acid cycle and glycolytic pathways demonstrate that the method fully automatically reproduces known labeling patterns and discovers steady-state labeling behavior. The framework offers a scalable, mechanistically transparent, and generalizable foundation for isotopomer modeling and experiment design.
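The two ingredients named in this abstract, Kleisli composition in the powerset monad and saturation to a fixed point, can be sketched in a few lines. This is a toy under stated assumptions, not the authors' implementation: it tracks single labeled atoms as `(compound, position)` pairs, whereas the paper works with multi-educt reaction rules and whole labeled configurations. The atom map `ATOM_MAP` and all function names are hypothetical.

```python
def bind(xs, f):
    """Powerset-monad bind: apply f to every element and take the union."""
    out = set()
    for x in xs:
        out |= f(x)
    return out


def compose(f, g):
    """Kleisli composition: run f, then g on each of its results."""
    return lambda x: bind(f(x), g)


def saturate(seeds, step):
    """Apply `step` repeatedly until no new elements appear."""
    seen = set(seeds)
    frontier = set(seeds)
    while frontier:
        frontier = bind(frontier, step) - seen
        seen |= frontier
    return seen


# Hypothetical atom-atom map: where can each labeled atom end up after one reaction?
ATOM_MAP = {
    ("A", 1): {("B", 1), ("C", 2)},
    ("B", 1): {("D", 1)},
    ("C", 2): {("D", 3)},
    ("D", 1): {("A", 1)},  # a cycle, to show that saturation terminates
}


def step(atom):
    """One reaction step as a Kleisli arrow: atom -> set of possible successor atoms."""
    return ATOM_MAP.get(atom, set())


two_steps = compose(step, step)
after_two = two_steps(("A", 1))
reachable = saturate({("A", 1)}, step)
print(sorted(after_two))
print(sorted(reachable))
```

Only atoms reachable from the seed label are ever generated, which mirrors the abstract's point that the configuration space stays as small as the chosen initial labeling allows.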

2025-12-11T14:48:00Z Marcel Friedrichs Daniel Merkle http://arxiv.org/abs/2512.10588v1 Why a chloroplast needs its own genome tethered to the thylakoid membrane -- Co-location for Redox Regulation 2025-12-11T12:28:23Z

A chloroplast is a subcellular organelle of photosynthesis in plant and algal cells. A chloroplast genome encodes proteins of the photosynthetic electron transport chain and ribosomal proteins required to express them. Chloroplast-encoded photosynthetic proteins are mostly intrinsic to the chloroplast thylakoid membrane where they drive vectorial electron and proton transport. There they function in close contact with proteins whose precursors are encoded in the cell nucleus for cytosolic synthesis, subsequent processing, and import into the chloroplast. The protein complexes of photosynthetic electron transport thus contain subunits with one of two quite different sites of synthesis. If most chloroplast proteins result from expression of nuclear genes then why not all? What selective pressure accounts for the persistence of the chloroplast genome? One proposal is that photosynthetic electron transport itself governs expression of genes for its own components: co-location of chloroplast genes with their gene products allows redox regulation of gene expression, thereby resulting in self-adjustment of protein stoichiometry in response to environmental change. This hypothesis posits Co-Location for Redox Regulation, termed CoRR, as the primary reason for the retention of genomes in both photosynthetic chloroplasts and respiring mitochondria. I propose that redox regulation affects all stages of chloroplast gene expression and that this integrated control is mediated by a chloroplast mesosome or nucleoid - a structure that tethers chloroplast DNA to the thylakoid.

2025-12-11T12:28:23Z 15 pages, 1 figure John F. Allen http://arxiv.org/abs/2501.19030v2 A network-driven framework for enhancing gene-disease association studies in coronary artery disease 2025-12-08T14:01:19Z

Transcriptome-wide association studies (TWAS) link genetic variation to complex traits by leveraging expression quantitative trait loci (eQTL) data. However, most implementations are typically limited to local (cis-acting) effects and fail to account for long-range (trans) regulatory influences mediated through gene networks. We introduce GRN-TWAS, a framework that reconstructs gene regulatory networks (GRNs) and integrates their topology into gene expression prediction models, thereby propagating distal (trans) regulatory effects through tissue-specific gene networks to trait- or disease-associated phenotypes. By incorporating network-derived trans-eQTLs, GRN-TWAS generates gene expression imputation models that capture both local and distal genetic components, enabling a more complete, systems-level view of genetic regulation consistent with the omnigenic model hypothesis. Using genotype and multi-tissue expression data from 600 coronary artery disease (CAD) cases in the STARNET study together with GWAS summary statistics, we show that GRN-TWAS improves gene-expression prediction and sharpens discovery of CAD-associated genes. Across seven tissues, the framework identified 5,779 transcriptome-wide significant genes, more than 50\% of which appear to be previously unreported in the CAD literature. A knowledge-based gene-ranking engine then prioritized 882 genes as highly CAD-relevant, including 237 regulated exclusively through trans effects. Key-driver analysis highlighted 18 putative trans mediators with high network centrality and disease relevance, offering mechanistic hypotheses that complement association signals. Collectively, these results demonstrate that embedding network topology into TWAS improves discovery and interpretability by exposing tissue-specific regulatory routes from genotype to phenotype and expanding the landscape of gene-disease associations.

2025-01-31T10:54:39Z Revised version, 12 pages, 6 figures, 1 table; code available at https://github.com/guutama/GRN-TWAS; Tex Source includes a file appendix.pdf with supplementary materials (4 pages supplementary methods, 3 supplementary tables, 21 supplementary figures) Gutama Ibrahim Mohammad Johan LM Björkegren Tom Michoel http://arxiv.org/abs/2512.07116v1 Structure-conditioned input-to-state stability for layer-by-layer molecular computations in parallel chemical reaction networks 2025-12-08T02:55:14Z

Molecular computation in chemical reaction networks (CRNs) now constitutes a foundational framework for designing programmable biological systems. However, prevailing design methodologies largely treat the parallelism of chemical reactions as a liability, which has motivated a shift in research focus toward leveraging parallelism to implement layer-by-layer computations of composite functions in coupled mass-action systems (MASs). MASs exhibiting this property are termed composable. Present composability verification for MASs mainly depends on input-to-state stability (ISS) conditions, with structural characteristics of networks remaining underexplored. This paper investigates the structural conditions under which two MASs are composable. By leveraging ISS-Lyapunov functions, we identify a class of CRN architectures, whose reduced systems have zero deficiency, that guarantee composability with other networks. We also extend our conclusions to encompass some CRN architectures possessing nonzero deficiency. Some examples are presented to demonstrate the validity of our theoretical results. Finally, we employ our methods to devise an algorithm for constructing MASs capable of executing specified molecular computations.

2025-12-08T02:55:14Z Renlei Jiang Chuanhou Gao Denis Dochain http://arxiv.org/abs/2512.06294v1 Interpretable Neural Approximation of Stochastic Reaction Dynamics with Guaranteed Reliability 2025-12-06T04:45:31Z

Stochastic Reaction Networks (SRNs) are a fundamental modeling framework for systems ranging from chemical kinetics and epidemiology to ecological and synthetic biological processes. A central computational challenge is the estimation of expected outputs across initial conditions and times, a task that is rarely solvable analytically and becomes computationally prohibitive with current methods such as Finite State Projection or the Stochastic Simulation Algorithm. Existing deep learning approaches offer empirical scalability, but provide neither interpretability nor reliability guarantees, limiting their use in scientific analysis and in applications where model outputs inform real-world decisions. Here we introduce DeepSKA, a neural framework that jointly achieves interpretability, guaranteed reliability, and substantial computational gains. DeepSKA yields mathematically transparent representations that generalise across states, times, and output functions, and it integrates this structure with a small number of stochastic simulations to produce unbiased, provably convergent, and dramatically lower-variance estimates than classical Monte Carlo. We demonstrate these capabilities across nine SRNs, including nonlinear and non-mass-action models with up to ten species, where DeepSKA delivers accurate predictions and orders-of-magnitude efficiency improvements. This interpretable and reliable neural framework offers a principled foundation for developing analogous methods for other Markovian systems, including stochastic differential equations.
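For context, the classical baseline this abstract compares against can be sketched as follows: the Stochastic Simulation Algorithm (Gillespie) combined with plain Monte Carlo averaging to estimate an expected output. This is not DeepSKA. The birth-death network, its rate values, and the function names are hypothetical choices, picked because the exact mean is known in closed form and can serve as a check.

```python
import math
import random


def ssa_birth_death(k, g, x0, t_end, rng):
    """Simulate 0 -> X (rate k) and X -> 0 (rate g * x); return X at time t_end."""
    t, x = 0.0, x0
    while True:
        birth, death = k, g * x
        total = birth + death
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            return x
        if rng.random() * total < birth:
            x += 1
        else:
            x -= 1


def monte_carlo_mean(k, g, x0, t_end, n_runs, rng):
    """Plain Monte Carlo estimate of E[X(t_end)] from independent SSA runs."""
    return sum(ssa_birth_death(k, g, x0, t_end, rng) for _ in range(n_runs)) / n_runs


rng = random.Random(1)
k, g, x0, t_end = 10.0, 1.0, 0, 2.0  # hypothetical parameters
estimate = monte_carlo_mean(k, g, x0, t_end, n_runs=2000, rng=rng)
# Closed-form mean of the linear birth-death process.
exact = x0 * math.exp(-g * t_end) + (k / g) * (1.0 - math.exp(-g * t_end))
print("Monte Carlo estimate:", round(estimate, 3))
print("Exact mean:          ", round(exact, 3))
```

The standard error of this estimator shrinks only as one over the square root of the number of runs, which is the cost that variance-reduction schemes such as the one described in the abstract aim to lower.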

2025-12-06T04:45:31Z Quentin Badolle Arthur Theuer Zhou Fang Ankit Gupta Mustafa Khammash http://arxiv.org/abs/2512.04937v1 A Systemic Pathological Network Model and Combinatorial Intervention Strategies for Alzheimer's Disease 2025-12-04T16:06:14Z

Alzheimer's disease (AD) persists as a paramount challenge in neurological research, characterized by the pathological hallmarks of amyloid-$β$ (A$β$) plaques and neurofibrillary tangles composed of hyperphosphorylated tau. This review synthesizes the evolving understanding of AD pathogenesis, moving beyond the linear amyloid cascade hypothesis to conceptualize the disease as a cross-talk of intricately interacting pathologies, encompassing A$β$, tau, and neuroinflammation as the foundation of a phase-adapted pathological network model. This evolving pathophysiological understanding parallels a transformation in diagnostic paradigms, where biomarker-based strategies such as the AT(N) framework enable early disease detection during preclinical or prodromal stages. Within this new landscape, while anti-A$β$ monoclonal antibodies (e.g., lecanemab, donanemab) represent a breakthrough as the first disease-modifying therapies, their modest efficacy underscores the limitation of single-target approaches. Therefore, I explore the compelling rationale for combination therapies that simultaneously target A$β$ pathology, aberrant tau, and neuroinflammation. Looking forward, I emphasize emerging technological platforms such as gene editing and biophysical neuromodulation in advancing precision medicine. Ultimately, the integration of early biomarker detection, multi-target therapeutic strategies, and AI-driven patient stratification charts a promising roadmap toward fundamentally altering the trajectory of AD. The future of AD management will be defined by preemptive, biomarker-guided, and personalized combination interventions. Keywords: Alzheimer's disease, amyloid-$β$, tau pathology, neuroinflammation, combination therapy, multi-target therapy, precision medicine, biomarkers

2025-12-04T16:06:14Z 23 pages She Xutong http://arxiv.org/abs/2512.03191v1 A Comprehensive Review of Casein Kinase 2 in Drosophila Circadian Timing and Its Biomedical Relevance 2025-12-02T19:43:37Z

Circadian rhythms are endogenous 24-hour oscillations that regulate physiology, metabolism, sleep-wake cycles, and cellular homeostasis. Drosophila melanogaster, a genetically tractable model organism, has played a foundational role in uncovering the molecular mechanisms of circadian rhythms. The discovery of major clock genes, including period (per), timeless (tim), clock (clk), cycle (cyc), double time (dbt), and regulators such as Casein kinase 2 (CK2), emerged primarily from Drosophila research. CK2 operates as a critical post-translational regulator of PER protein phosphorylation, stability, nuclear entry, and degradation. Because PER dynamics dictate the timing and robustness of circadian rhythms in both flies and mammals, altered CK2 activity can profoundly impact rhythmic behaviour. CK2 dysregulation contributes not only to circadian disruption in Drosophila but also models broader pathological processes relevant to cancer, metabolic disease, neurodegeneration, and psychiatric disorders. This review synthesises CK2's molecular role in the Drosophila clock system, includes insights from computational modelling of CK2-PER dynamics, integrates tables throughout the text, and summarises the implications of dysregulated PER phosphorylation for human health.

2025-12-02T19:43:37Z Yasmin Fatima Md. Zubair Malik Prashant Ankur Jain http://arxiv.org/abs/2512.02908v1 Imperfect molecular detection renormalizes apparent kinetic rates in stochastic gene regulatory networks 2025-12-02T16:25:05Z

Imperfect molecular detection in single-cell experiments introduces technical noise that obscures the true stochastic dynamics of gene regulatory networks. While binomial models of molecular capture provide a principled description of imperfect detection, they have so far been analyzed only for simple gene-expression models that do not explicitly account for regulation. Here, we extend binomial models of capture to general gene regulatory networks to understand how imperfect capture reshapes the observed time-dependent statistics of molecular counts. Our results reveal when capture effects correspond to a renormalization of a subset of the kinetic rates and when they cannot be absorbed into effective rates, providing a systematic basis for interpreting noisy single-cell measurements. In particular, we show that rate renormalization emerges either under significant transcription factor abundance or when promoter-state transitions occur on a distinct (much slower or faster) timescale than other reactions. In these cases, technical noise causes the apparent mean burst size of synthesized gene products to appear reduced while transcription factor binding reactions appear faster. These effects hold for gene regulatory networks of arbitrary connectivity and remain valid under time-dependent kinetic rates.
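The simplest instance of rate renormalization can be checked numerically. For unregulated expression with Poisson-distributed true counts of mean λ, detecting each molecule independently with probability p (binomial capture) yields Poisson-distributed observed counts of mean pλ, which is indistinguishable from perfect detection with a synthesis rate scaled by p. The sketch below illustrates only this textbook case, not the regulated networks analyzed in the paper. The values of `lam`, `p`, and `n_cells` and the helper names are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def capture(true_count, p, rng):
    """Binomial capture: each molecule is detected independently with probability p."""
    return sum(1 for _ in range(true_count) if rng.random() < p)


def mean_and_fano(samples):
    """Return the sample mean and the Fano factor (variance / mean)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, var / mean


rng = random.Random(0)
lam, p, n_cells = 20.0, 0.3, 20000  # hypothetical values
true_counts = [sample_poisson(lam, rng) for _ in range(n_cells)]
observed = [capture(c, p, rng) for c in true_counts]

true_mean, true_fano = mean_and_fano(true_counts)
obs_mean, obs_fano = mean_and_fano(observed)
print("true:     mean = {:.2f}, Fano = {:.2f}".format(true_mean, true_fano))
print("observed: mean = {:.2f}, Fano = {:.2f}".format(obs_mean, obs_fano))
print("p * lam = {:.2f}".format(p * lam))
```

Both the true and the observed counts have a Fano factor close to 1, so in this case the capture step leaves no signature other than an apparently lower synthesis rate.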

2025-12-02T16:25:05Z 24 pages, 5 figures Iryna Zabaikina Ramon Grima