https://arxiv.org/api/G7Y4khpBXr7K/FqcIaEVcp/zlc0
2026-06-13T18:33:03Z

http://arxiv.org/abs/2601.00613v1
Personalized Forecasting of Glycemic Control in Type 1 and 2 Diabetes Using Foundational AI and Machine Learning Models
2026-01-02T08:58:11Z
Background: Accurate week-ahead forecasts of continuous glucose monitoring (CGM) derived metrics could enable proactive diabetes management, but relative performance of modern tabular learning approaches is incompletely defined.
Methods: We trained and internally validated four regression models (CatBoost, XGBoost, AutoGluon, tabPFN) to predict six week-ahead CGM metrics (TIR, TITR, TAR, TBR, CV, MAGE, and related quantiles) using 4,622 case-weeks from two cohorts (T1DM n=3,389; T2DM n=1,233). Performance was assessed with mean absolute error (MAE) and mean absolute relative difference (MARD); quantile classification was summarized via confusion-matrix heatmaps.
Results: Across T1DM and T2DM, all models produced broadly comparable performance for most targets. For T1DM, MARD for TIR, TITR, TAR and MAGE ranged 8.5 to 16.5% while TBR showed large MARD (mean ~48%) despite low MAE. AutoGluon and tabPFN showed lower MAE than XGBoost for several targets (e.g., TITR: p<0.01; TAR/TBR: p<0.05 to 0.01). For T2DM, MARD ranged 7.8 to 23.9% and TBR relative error was ~78%; tabPFN outperformed other models for TIR (p<0.01), and AutoGluon/tabPFN outperformed CatBoost/XGBoost on TAR (p<0.05). Inference time per 1,000 cases varied markedly (tabPFN 699 s; AutoGluon 2.7 s; CatBoost 0.04 s; XGBoost 0.04 s).
Conclusions: Week-ahead CGM metrics are predictable with reasonable accuracy using modern tabular models, but low-prevalence hypoglycemia remains difficult to predict in relative terms. Advanced AutoML and foundation models yield modest accuracy gains at substantially higher computational cost.
2026-01-02T08:58:11Z
Simon Lebech Cichosz, Stine Hangaard, Thomas Kronborg, Peter Vestergaard, Morten Hasselstrøm Jensen

http://arxiv.org/abs/2512.24489v1
High Space-bandwidth Product Label-free Examination of iPSC-derived Brain Organoids via Fourier Ptychographic Microscopy
2025-12-30T22:17:44Z
Fourier ptychographic microscopy (FPM) is a promising quantitative phase imaging technique that enables high-resolution, label-free imaging over a large field-of-view. Here, we present the first application of FPM for the quantitative analysis of human brain organoid slices, providing a powerful, cost-effective, and label-free enhancement to the current gold-standard fluorescence microscopy. Brain organoids, prepared as thin (5 micrometer) slices, were imaged with a custom-built FPM system consisting of a standard light microscope (4x, 0.2 NA objective) and a 7x7 LED array. This configuration achieved a synthetic numerical aperture of 0.54 and a spatial resolution of approximately 488 nm across an area of 2.077 x 3.65 mm. Fluorescence microscopy was used in parallel for neurons, astrocytes, and nuclei labeling, providing rich fluorescence imaging. Moreover, we designed an automated method to merge classical resolution fluorescence images to visualize the whole brain organoid and align it with the numerically increased space-bandwidth product FPM image. The provided alignment method enables rich phase-fluorescence correlative imaging. Based on the segmentation performed on the stitched fluorescence images, we devised a quantitative phase analysis revealing a higher mean optical thickness of the nuclei versus astrocytes and neurons.
Notably, nuclei located in neurogenic regions consistently exhibited significantly higher phase values (optical path difference) compared to nuclei elsewhere, suggesting cell-type-specific biophysical signatures. The label-free, quantitative, and high-throughput capabilities of the FPM approach demonstrated here make it a powerful and accessible tool for future structural and functional studies of whole-section brain organoid development and disease modeling studies.
2025-12-30T22:17:44Z
14 pages, 6 figures
Mikolaj Krysa, Mikolaj Rogalski, Piotr Arcab, Pawel Goclowski, Kamil Kalinowski, Piotr Zdańkowski, Vishesh K. Dubey, Mukesh Varshney, Balpreet S. Ahluwalia, Maciej Trusiak
10.1109/JSTQE.2025.3650095

http://arxiv.org/abs/2601.03277v1
MixRx: Predicting Drug Combination Interactions with LLMs
2025-12-28T05:37:56Z
MixRx uses Large Language Models (LLMs) to classify drug combination interactions as Additive, Synergistic, or Antagonistic, given a multi-drug patient history. We evaluate the performance of four models: GPT-2, Mistral Instruct 2.0, and their fine-tuned counterparts. Our results showed a potential for such an application, with the Mistral Instruct 2.0 Fine-Tuned model providing an average accuracy score on standard and perturbed datasets of 81.5%. This paper aims to further develop an upcoming area of research that evaluates if LLMs can be used for biological prediction tasks.
2025-12-28T05:37:56Z
Risha Surana, Cameron Saidock, Hugo Chacon

http://arxiv.org/abs/2510.23620v2
Genotype-Phenotype Integration through Machine Learning and Personalized Gene Regulatory Networks for Cancer Metastasis Prediction
2025-12-25T07:09:37Z
Metastasis is the leading cause of cancer-related mortality, yet most predictive models rely on shallow architectures and neglect patient-specific regulatory mechanisms. Here, we integrate classical machine learning and deep learning to predict metastatic potential across multiple cancer types.
Gene expression profiles from the Cancer Cell Line Encyclopedia were combined with a transcription factor-target prior from DoRothEA, focusing on nine metastasis-associated regulators. After selecting differential genes using the Kruskal-Wallis test, ElasticNet, Random Forest, and XGBoost models were trained for benchmarking. Personalized gene regulatory networks were then constructed using PANDA and LIONESS and analyzed through a graph attention neural network (GATv2) to learn topological and expression-based representations. While XGBoost achieved the highest AUROC (0.7051), the GNN captured non-linear regulatory dependencies at the patient level. These results demonstrate that combining traditional machine learning with graph-based deep learning enables a scalable and interpretable framework for metastasis risk prediction in precision oncology.
2025-10-22T05:20:13Z
39 pages, 14 figures. Preliminary version of ongoing collaborative research; a substantially revised manuscript is in preparation
Jiwei Fu, Chunyu Yang

http://arxiv.org/abs/2511.14817v2
Systematic Reconstruction of Disease Networks from Longitudinal Blood Data for Causal Discovery and Intervention Analysis
2025-12-24T21:06:22Z
We explore the hyperparameters and introduce a methodological framework to convert disease patterns from time series data of blood test results into correlation graphs for causal hypothesis exploration. The networks represent hypotheses that can then be validated or rejected both for causal discovery and causal analysis (under intervention). We synthetically recreated a repository of 105 typical disease longitudinal patterns extracted from medical guidance and research literature of common blood markers to build a systematic pipeline to translate multidimensional clinical data into intervenable disease networks for causal discovery and causal analysis.
This study demonstrates that knowledge graphical models reconstructed from longitudinal data can transform routine medical data into clinically interpretable structures. By integrating multiple thresholding strategies and causal graph design, the framework aims to move beyond statistical correlation toward clinically testable inference networks. These results highlight a practical pathway for more transparent, explainable, and scalable tools in clinical decision support for AI training, precision healthcare and predictive medicine, offering interpretable, clinically actionable outputs that support safer use of AI in differential diagnosis.
2025-11-18T08:59:30Z
15 pages
David Patrick Duys Montealegre, Alexander Fulton, Mahta Haghighat Ghahfarokhi, Abicumaran Uthamacumaran, Hector Zenil

http://arxiv.org/abs/2512.21408v1
MorphoCloud: Democratizing Access to High-Performance Computing for Morphological Data Analysis
2025-12-24T20:10:35Z
The digitization of biological specimens has revolutionized the field of morphology, creating large collections of 3D data, and microCT in particular. This revolution was initially supported by the development of open-source software tools, specifically the SlicerMorph extension to the open-source image analytics platform 3D Slicer. Through SlicerMorph and 3D Slicer, biologists, morphologists and scientists in related fields have all the necessary tools to import, visualize and analyze these complex and large datasets in a single platform that is flexible and expandable, without the need for proprietary software that hinders scientific collaboration and sharing.
Yet, a significant "compute gap" remains: While data and software are now open and accessible, the necessary high-end computing resources to run them are often not equally accessible in all institutions, and particularly lacking at Primarily Undergraduate Institutions (PUIs) and other educational settings. Here, we present MorphoCloud, an "IssuesOps"-based platform that leverages GitHub Actions and the JetStream2 cloud farm to provide on-demand, research-grade computing environments to researchers working with 3D morphological datasets. By delivering a GPU-accelerated full desktop experience via a web browser, MorphoCloud eliminates hardware barriers, enabling complex 3D analysis and AI-assisted segmentation. This paper explains the platform and its architecture, as well as use cases it is designed to support.
2025-12-24T20:10:35Z
13 pages, 3 tables, 2 figures
A. Murat Maga, Jean-Christophe Fillion-Robin
10.12688/f1000research.176328.1

http://arxiv.org/abs/2512.20933v1
Intrinsic limits of timekeeping precision in gene regulatory cascades
2025-12-24T04:29:57Z
Multiple cellular processes are triggered when the concentration of a regulatory protein reaches a critical threshold. Previous analyses have characterized timing statistics for single-gene systems. However, many biological timers are based on cascades of genes that activate each other sequentially. Here, we develop an analytical framework to describe the timing precision of such cascades using a burst-dilution hybrid stochastic model. We first revisit the single-gene case and recover the known result of an optimal activation threshold that minimizes first-passage-time (FPT) variability. Extending this concept to two-gene cascades, we identify three distinct optimization regimes determined by the ratio of intrinsic noise levels and the protein dilution rate, defining when coupling improves or worsens timing precision compared to a single-gene strategy.
Generalizing to cascades of arbitrary gene length, we obtain a simple mathematical condition that determines when a new gene in the cascade can decrease the timing noise based on its intrinsic noise and protein dilution rate. In the specific case of a cascade of identical genes, our analytical results predict suppression of FPT noise with increasing cascade length and the existence of a mean time that decreases relative timing fluctuations. Together, these results define the intrinsic limits of timekeeping precision in gene regulatory cascades and provide a minimal analytical framework to explore timing control in biological systems.
2025-12-24T04:29:57Z
Juan Sebastian Hernandez, Cesar Nieto, Juan Manuel Pedraza, Abhyudai Singh

http://arxiv.org/abs/2512.20805v1
Computational optimisation of slow cooling profiles for the cryopreservation of cells in suspension
2025-12-23T22:05:23Z
The cryopreservation of biological materials is a highly complex process, as it involves numerous factors such as the cooling and thawing procedures, the administration of cryoprotective agents (CPAs), as well as the type and composition of cells. While theoretical work has yielded a better understanding of the processes occurring during cryopreservation, the design of cryopreservation protocols and their parameters is currently predominantly based on heuristic optimization. Here, we propose a mathematical method to optimise the cooling dynamics in slow-cooling, to reduce the risk of injury. We derive our method from first principles and provide computational predictions. Moreover, we assess the predictions with data obtained from the literature, as well as novel experimental results. Overall, we provide a generic computational approach to generate improved slow-cooling profiles for the cryopreservation of cells in suspension.
2025-12-23T22:05:23Z
Jack Lee Jennings, Sanja Bojic, Lukas Breitwieser, Alex Sharpe, Roman Bauer

http://arxiv.org/abs/2512.18681v1
Trick or Treat? Free-ranging dogs use human behavioural cues for foraging
2025-12-21T10:39:08Z
Animals that display behavioural flexibility and adaptability thrive in urban environments, due to their ability to exploit novel anthropogenic resources. Since humans are an important component of such urban environments, animals that apply heterospecific learning in their decision-making are more likely to succeed as urban adapters. Free-ranging dogs, which have been living in human-dominated environments for centuries, are excellent urban adapters. In this study, we sought to understand the role and extent of human behavioural cues in decision-making during foraging by free-ranging dogs. We investigated whether these dogs were more attracted to items that humans appeared to be eating. When presented with a real and a fake biscuit, the dogs showed a clear preference for the food item. Between two identical biscuits, they chose the one that had been bitten by a human. However, when a fake biscuit was bitten and presented with a real one, the dogs failed to choose one over the other, suggesting a strong influence of the human-provided cue of biting over the natural cue of the smell of the food item. The dogs displayed a left bias during food choice across experimental conditions. These results demonstrate that dog foraging choices in urban environments are a mix of heterospecific learning and independent decision-making, highlighting an important facet behind their success in anthropogenic habitats. This also underscores the high level of dependence that free-ranging dogs have on humans in the urban habitat, not only as a source of food, but as an integral part of their ecological niche.
2025-12-21T10:39:08Z
5 figures
Rohan Sarkar, Sharmistha Maji, Tuhin Subhra Pal, Achal Dharmalal Rajratna, Avik Ghosh, Madhurima Roy, Sampurna Bag, Srijaya Nandi, Arpan Bhattacharyya, S. Sivasubramaniam, Avirup Chakraborty, Anindita Bhadra

http://arxiv.org/abs/2512.19766v1
Growth, yield and quality response of two industrial potato cultivars to chelated potassium and humic acid during fall season
2025-12-21T07:41:17Z
The study was carried out to determine the response of two industrial potato cultivars of Netherlands origin (Hermes and Challenger) to chelated potassium fertilizer and humic acid in terms of growth, yield, and quality during the fall season of 2024. The cultivars were planted in an open field at the educational field of the Horticulture Department, College of Agricultural Engineering Sciences, University of Sulaimani, Sulaymaniyah, Kurdistan Region, Iraq (GPS reading: latitude 35.53576 N, longitude 45.36663 E; altitude 741 m above sea level). A factorial randomized complete block design (RCBD) with three replications was used in this study.
2025-12-21T07:41:17Z
Dawan Sardar Hama Ali, Luqman Garib Karim Barznjy

http://arxiv.org/abs/2512.18438v1
Life as Non-Normal Chemical Accelerator
2025-12-20T17:24:28Z
Life is commonly described as a self-organized, far-from-equilibrium process that maintains internal order by consuming free energy and exporting entropy. This thermodynamic view underlies diverse theoretical frameworks -- from autopoiesis and relational biology to autocatalytic sets and hypercycles -- yet dissipation is typically treated as a necessary consequence of living organization rather than as a property shaped by its internal dynamics. Here, through explicit calculations of biotic chemical reactions and empirical documentation, we show that living systems universally function as non-normal chemical accelerators. Their elevated entropy production emerges from the asymmetric and hierarchical architecture of their biochemical networks. We introduce a general conceptual and mathematical framework in which biological structuration is understood as a dynamical property.
Characterized by asymmetric couplings and transient amplification despite asymptotic stability, non-normal dynamics are shown to naturally generate kinetic acceleration, enhanced energy throughput, and phase-transition-like reorganizations without classical bifurcations. In this view, biological organization is not merely compatible with dissipation but actively structured to amplify free-energy flux and entropy export. We support this perspective with empirical and theoretical evidence that biochemical networks generically give rise to intrinsically non-normal operators through non-reciprocal interactions and hierarchical design. This framework yields testable predictions for dissipation rates, robustness, and evolutionary design principles, and suggests a kinetic principle of evolution in which living systems preferentially construct increasingly non-normal reaction architectures, driving sustained amplification of chemical fluxes and entropy flow.
2025-12-20T17:24:28Z
16 pages, 1 figure
Didier Sornette, Virgile Troude

http://arxiv.org/abs/2512.19762v1
Allelopathy of Rumex spp.: A review
2025-12-20T11:05:28Z
The genus Rumex of the Polygonaceae family is widespread in the world, particularly in the northern hemisphere, and includes about 250 species. Species of this genus are used for medicinal purposes and for their allelopathic impacts. Regarding allelopathy, many allelochemicals have been detected in different Rumex species. Therefore, plant extracts, leachates, and plant residues of different species of Rumex have been studied for their effects on seed germination and plant growth in recipient plants. Also, various species of Rumex were tested for their allelopathic capacities to control weeds, insects, and plant pathogens. Besides, it was revealed that the allelopathic impact of Rumex spp. was variable depending on extract concentration, the plant part of the Rumex spp., and the species of the recipient plant.
The present review presents the results of studies that examined the allelopathic effect on different aspects of crops, weeds, insects, and plant pathogens.
2025-12-20T11:05:28Z
Aswan University Journal of Environmental Studies Vol. 6, No. 4, pp. 322-334, 2025
Aram Akram Mohammed
10.21608/aujes.2025.367558.1335

http://arxiv.org/abs/2512.17847v1
The use of kinematics to quantify gait attributes and predict gait scores in dairy cows
2025-12-19T17:49:37Z
Detecting walking pattern abnormalities in dairy cows early on holds the potential to reduce the occurrence of clinical lameness. This study aimed to predict gait scores in non-clinically lame dairy cows by using gait attributes based on kinematic data. Markers were placed on 20 anatomical landmarks on 12 dairy cows. The cows were walked multiple times through a corridor while recorded by six cameras, representing 69 passages. Specific gait attributes were computed from the 3D coordinates of the hoof markers. Gait was visually assessed using a 5-point numerical rating system (NRS). Due to the limited number of observations with NRS lower than 2 (n = 1) and higher than 3 (n = 6), the NRS labels were combined into three groups, representing NRS <= 2, NRS = 2.5, and NRS >= 3. The dataset was split into training and testing sets (70:30 ratio), stratified by the distribution of the NRS categories. Random forest (RF), gradient boosting machine (GBM), extreme gradient boosting machine (XGBM), and support vector machine (SVM) with a radial basis kernel models were trained using k-fold repeated cross-validation with hyperparameters defined using Bayesian optimization. Accuracy, sensitivity, specificity, F1 score, and balanced accuracy were calculated to measure model performance. The GBM model performed best, achieving an overall accuracy and F1 score of 0.65 in the testing set.
The findings of this study contribute to the development of an automated monitoring system for early identification of gait abnormalities, thereby enhancing the welfare and longevity of dairy cows.
2025-12-19T17:49:37Z
27 pages, 3 figures, 5 tables
Celia Julliot, Gabriel M. Dallago, Amir Nejati, Abdoulaye B. Diallo, Elsa Vasseur

http://arxiv.org/abs/2512.17966v1
A generalized framework for procedural generation of three-dimensional static and dynamic plant model geometries
2025-12-18T22:14:46Z
This work presents a new framework for procedural generation of dynamic 3D plant model geometries, which has been implemented in the Helios modeling system. Key goals of this work were to develop a model that 1) has a generalized set of parameters that are conserved across species, which are botanically-consistent and readily measurable; 2) significantly reduces the time and effort needed to create photorealistic, dynamically evolving plant models; 3) allows for encoding of the entire plant structure into a character-based representation that can be integrated with machine learning models; and 4) includes realistic and computationally efficient collision physics. A model framework that satisfies these specifications is presented in this report. The model was implemented in the Helios C++ and PyHelios Python frameworks, which are open-source libraries that can be used to generate 3D plant geometries based on this model.
2025-12-18T22:14:46Z
Brian N. Bailey

http://arxiv.org/abs/2505.01146v4
Retrieval-Augmented Generation in Biomedicine: A Survey of Technologies, Datasets, and Clinical Applications
2025-12-18T11:03:59Z
Large language models (LLMs) in biomedicine face a fundamental conflict between static parameter knowledge and the dynamic nature of clinical evidence. Retrieval-Augmented Generation (RAG) addresses this by grounding generation in external data, yet it introduces new complexities in latency and architecture.
This survey synthesizes the biomedical RAG landscape (2020-2025), classifying systems into naive, advanced, and modular paradigms. Beyond a technological taxonomy, we formalize the biomedical RAG trilemma, identifying the inherent trade-offs between reasoning depth, inference latency, and data privacy that constrain current clinical deployment. We analyze how recent agentic workflows enhance diagnostic reasoning but risk prohibitive latency, and how privacy constraints dictate the choice between powerful cloud-based models and local deployment. Finally, we outline the alignment gap in multimodal RAG and propose future directions for self-correcting, verifiable clinical agents.
2025-05-02T09:44:51Z
49 pages
Jiawei He, Boya Zhang, Hossein Rouhizadeh, Yingjian Chen, Rui Yang, Jin Lu, Xudong Chen, Nan Liu, Douglas Teodoro