https://arxiv.org/api//kEDlBrNUDfRgh82W4SwTYYhhtM 2026-06-13T12:46:52Z 1591 15 15 http://arxiv.org/abs/2605.24353v1 ViViD-5K: Vineyard vision dataset for field-based berry detection and segmentation and grape cluster closure estimation 2026-05-23T02:30:02Z

Cluster closure, defined as the progressive filling of gaps between the berries in a grape bunch, is a key trait in vineyard management, impacting disease risk. However, traditional visual scoring methods are labor-intensive, subjective, and lack temporal resolution. Existing datasets rarely support fine-grained berry-level analysis, limiting the development of robust deep learning models. In this work, we present ViViD-5K, a large-scale in-field Vineyard Vision Dataset containing 5,000 images with dense annotations, including over 648,000 berry centroids and cluster segmentation masks spanning 13 grape varieties. Building on this dataset, we introduce GrapeSAM, a two-stage visual pipeline that combines point-based berry localization with prompt-based segmentation using Segment Anything, followed by transformer-based cluster segmentation. The pipeline enables automated, in-field estimation of cluster closure with minimal supervision. Quantitative results demonstrate strong segmentation and counting accuracy across diverse conditions, while visualizations confirm robustness on both in-domain and out-of-domain samples. This work provides a scalable and objective alternative to manual compactness scoring and supports high-throughput grape phenotyping with enhanced spatial detail.

2026-05-23T02:30:02Z Xiangzhi Tong Chengrui Zhang Mac Flaherty Andre Matteo Garcia Dominic Gorman Jonathan Jaramillo Justine E. Vanden Heuvel Yu Jiang http://arxiv.org/abs/2605.22145v1 Persistent Homology as a Morphological Signature of Fibrin Networks 2026-05-21T08:17:19Z

We present an investigation of the applicability of topological data analysis (TDA) to the study of high-resolution confocal microscopy images of fibrin network structures from patients with oesophageal cancer undergoing intended curative surgery. Investigation of clot structure brings new knowledge about blood coagulation, risk of bleeding, and thrombosis in this group of patients. Images of fibrin network formation in the collected blood samples were captured by confocal microscopy and three-dimensional z-stacks were analysed. Each z-stack was cropped to a centre region for analysis, the validity of which is assessed in detail. Overall, we found no significant differences in fibrin network topology across the perioperative period, and no consistent differences in network structure between the standard and intervention groups.
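
As a rough illustration of what a persistence computation involves, the sketch below computes zero-dimensional persistent homology (connected components) of a Vietoris-Rips filtration on a small point cloud. It is a minimal sketch under stated assumptions, not the pipeline used in the paper: the function name `h0_persistence` and the point-cloud input are hypothetical, and the abstract does not say which filtration or homology dimensions were applied to the confocal z-stacks.

```python
from itertools import combinations
from math import dist, inf


def h0_persistence(points):
    """Return (birth, death) pairs for connected components (H0).

    Hypothetical helper: every point is born at scale 0, and a component
    dies at the edge length where it merges into another component.
    """
    if not points:
        return []

    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairwise edges from shortest to longest.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge at this scale
            deaths.append(length)

    # One component survives at every scale, so its death is infinite.
    return [(0.0, d) for d in deaths] + [(0.0, inf)]
```

Calling `h0_persistence` on a list of coordinate tuples returns one pair per point. Long-lived pairs correspond to well-separated groups of points, short-lived pairs to points that merge almost immediately.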

2026-05-21T08:17:19Z 12 pages, 5 figures Thomas Burnett Theresa Reinhold Bea Bleile Sophie Raynor Freya Jensen Martin Hermann Tua Gyldenholm Yossi Bokor Bleile http://arxiv.org/abs/2605.18900v1 A Logistic Regression Model to Predict Malaria Severity in Children 2026-05-17T12:53:02Z

Malaria is one of the leading causes of death worldwide. Researchers have sought to develop predictive models for malaria outbreaks based on meteorological data, climate data and the breeding cycle of Plasmodium, the causative agent of malaria. This study predicts the severity of malaria based on environmental and biological factors. A logistic regression model was developed to predict the severity of malaria from factors such as sickle cell disease, stagnant water, garbage dumps, wet lawns, and the use of treated mosquito nets, achieving an accuracy of 83.3%. The study was carried out in the Bosomtwe District of Ghana with 417 respondents. It was deduced that although children in the District are highly prone to malaria infection, the severity is very low. The study concludes that, in machine learning model development, adequate representation of each class label in the sample is as important as a large sample size.
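
For readers unfamiliar with the model class, the following is a minimal sketch of binary logistic regression fitted by batch gradient descent. It assumes each risk factor is encoded as a 0/1 indicator and severity as a 0/1 label. The function names, the encoding and the hyperparameters are illustrative assumptions, not details taken from the study. The `per_class_recall` helper reflects the abstract's point about class representation, since overall accuracy can hide poor performance on a rare class.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss.

    X is a list of feature rows (hypothetical 0/1 indicators), y a list of
    0/1 labels. lr and epochs are arbitrary illustrative defaults.
    """
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, label in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row)))
            err = p - label  # derivative of the log-loss w.r.t. the score
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, row):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) >= 0.5 else 0


def per_class_recall(y_true, y_pred):
    """Fraction of correctly predicted samples within each true class."""
    recall = {}
    for cls in set(y_true):
        preds = [p for t, p in zip(y_true, y_pred) if t == cls]
        recall[cls] = sum(1 for p in preds if p == cls) / len(preds)
    return recall
```

Reporting `per_class_recall` next to overall accuracy makes it visible when a model scores well mainly by predicting the majority class.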

2026-05-17T12:53:02Z Eur. J. Electr. Eng. Comput. Sci. 8 (2024) 31-35 Mary Opokua Ansong Asare Yaw Obeng Samuel King Opoku 10.24018/ejece.2024.8.2.614 http://arxiv.org/abs/2605.22848v1 From Simulation to Discovery: AI Enabled Probabilistic Emulation of Mechanistic Crop Systems 2026-05-15T23:56:01Z

Global food security depends on predicting crop responses to climate variability, yet process based crop models remain too computationally expensive for large scale exploration of genotype and environment interactions. Here we develop a probabilistic neural emulator of APSIM that reproduces key maize growth processes across 13 outputs with high fidelity (with R^2 of 0.93) while reducing simulation time by several orders of magnitude. Trained on two million simulations spanning diverse genetic, soil, and management conditions, and augmented with a convolutional synthetic weather generator that produces physically consistent climate sequences, the framework enables scalable exploration of crop responses under realistic and diverse environmental inputs while providing calibrated predictive uncertainty without costly Bayesian inference. Applying this framework across 100,000 trait configurations, six soil environments in Iowa and Illinois, and climate projections through the year 2100 under two emissions scenarios, we identify 181 maize trait combinations that consistently maintain high yield across all tested conditions, an analysis infeasible with the mechanistic model alone. We further show that radiation use efficiency and temperature driven root dynamics are dominant drivers of yield resilience. Notably, projected yield distributions vary substantially across locations, with some lower productivity sites exhibiting yield increases under future climate scenarios, indicating that climate change may reshape regional yield potential in nonintuitive ways. These results demonstrate how uncertainty aware emulation transforms mechanistic crop simulation from a computational bottleneck into an on demand discovery engine, one capable of interrogating the full genotype, environment and management space at a scale no process-based model can match.

2026-05-15T23:56:01Z Mojdeh Saadati Juan Panelo Gustavo Visentini Soumik Sarkar Carlos Messina Baskar Ganapathysubramanian http://arxiv.org/abs/2605.16466v1 An exponential logarithmic measure of drug receptor binding and saturation 2026-05-15T10:47:14Z

Ligand receptor interactions are commonly assessed through equilibrium occupancy and pharmacodynamic measures that describe binding and saturation by means of bounded response curves. Thermodynamic approaches relate binding affinity to logarithmic concentration scaling, while probabilistic descriptions of occupancy arise from exponential relations. We introduce an exponential logarithmic descriptor (ELD) that integrates ligand availability and thermodynamic binding propensity within a single quantity. The logarithmic component corresponds to a thermodynamic term derived from concentration dependent free energy relations, whereas the exponential component is represented through an inverse normalized concentration term corresponding to the reciprocal of the exponential occupancy factor emerging from Boltzmann type binding formulations. We explored ELD behavior through numerical simulations spanning sub affinity, transition and saturating concentration regimes under multiple affinity conditions and time dependent exposure profiles. Compared with conventional occupancy curves, ELD retained a broader dynamic range and revealed asymmetric sensitivity across concentration scales, particularly at low exposure and near saturation, where bounded occupancy measures progressively compress variability. The resulting behavior reflects the coexistence of amplification and constraint processes within ligand receptor dynamics. ELD may provide quantitative representation for biological systems in which exponential and logarithmic processes coexist across different scales. Potential applications include characterization of dose response transitions, identification of subtherapeutic and saturating exposure states, comparison of compounds with different affinities, normalization across heterogeneous datasets and continuous tracking of pharmacodynamic regimes during time dependent exposure.

2026-05-15T10:47:14Z 10 pages, 2 figures Arturo Tozzi http://arxiv.org/abs/2605.13893v1 From Organization to Viability: A Multi-Level Analysis of Gait Dynamics Under Occlusal Constraint 2026-05-12T10:15:18Z

Clinical interpretation often assumes that observable performance provides sufficient information about the organization of an adaptive system. However, similar observable performance may correspond to distinct latent organizations. This study extends a previous multi-level framework by introducing a fourth analytical level centered on longitudinal viability. Using an exploratory single-case design in a Parkinsonian patient, gait data were recorded with instrumented insoles under three occlusal conditions: neutral natural occlusion (ONL), a 2.5-degree increase in vertical dimension of occlusion (OC2.5), and a 3-degree increase in vertical dimension of occlusion (OC3). Two measurement sessions were conducted eleven weeks apart, during which the participant underwent a structured sensorimotor intervention. The vertical dimension of occlusion was considered as an experimentally varied constraint applied to an adaptive neuromechanical system. Although observable performance remained globally comparable across conditions, PCA-based latent-space analysis revealed differentiated longitudinal centroid displacements. OC3 exhibited the smallest displacement, ONL an intermediate displacement, and OC2.5 the largest displacement. This hierarchy supports the relevance of a Level 4 framework centered on viability, understood here as an exploratory proxy for a configuration's capacity to maintain lower longitudinal reorganization over time. These findings remain within-subject, exploratory, and non-causal. They do not establish a validated clinical threshold, causal occlusal effect, or therapeutic optimum. More generally, the work suggests that clinical relevance cannot be inferred solely from instantaneous performance or static latent structure, but may also depend on the capacity of a configuration to sustain a coherent trajectory over time.
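
The longitudinal centroid displacement described above can be sketched as the Euclidean distance between the mean latent-space position of one condition in the first session and its mean position in the second. This is a minimal sketch that assumes the PCA projection has already been computed and that each gait cycle is one coordinate tuple. The function names are hypothetical, and the paper may define displacement differently.

```python
from math import dist
from statistics import fmean


def centroid(points):
    """Mean position of a set of latent-space points (one tuple per gait cycle)."""
    return tuple(fmean(axis) for axis in zip(*points))


def centroid_displacement(session_1, session_2):
    """Euclidean distance between the centroids of two sessions."""
    return dist(centroid(session_1), centroid(session_2))


def rank_by_displacement(sessions):
    """Order condition labels from smallest to largest displacement.

    `sessions` maps a condition label to a (session_1, session_2) pair of
    point lists. The labels and coordinates are supplied by the caller.
    """
    return sorted(
        sessions,
        key=lambda label: centroid_displacement(*sessions[label]),
    )
```

Under this reading, a smaller displacement means the latent configuration of that condition moved less between the two sessions, which is the quantity the abstract uses as an exploratory proxy for viability.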

2026-05-12T10:15:18Z 16 pages, 2 figures. Exploratory single-case study at the interface of quantitative biology, gait analysis, occlusion, sensorimotor regulation, latent-space modeling, and machine learning Jacques Raynal Pierre Slangen Elsa Raynal Jacques Margerit http://arxiv.org/abs/2605.18791v1 SpecX: A Large-Scale Benchmark for Multi-Modal Spectroscopy and Cross-Paradigm Evaluation 2026-05-11T04:12:58Z

Existing spectral benchmarks are limited in scale, modality alignment, and evaluation scope, and typically focus on either specialized models or multimodal language models (MLLMs). We introduce SpecX, a large-scale benchmark for multi-modal spectroscopy with cross-paradigm evaluation. SpecX contains 1.7M molecules with diverse spectral modalities, including NMR (1H, 13C, HSQC), IR, MS, UV, Raman, and FL, and is organized into three tiers: a large-scale dataset for pretraining, an aligned multi-spectral subset for benchmarking, and a high-quality experimental subset for evaluation. SpecX supports a range of tasks such as molecular elucidation, spectrum simulation, and spectral understanding, and enables unified evaluation across both specialized spectral models and MLLMs. Experiments show that specialized models excel at signal-level modeling, while MLLMs exhibit strengths in high-level reasoning but lack precise spectral grounding. SpecX establishes a unified benchmark for spectral intelligence and highlights the need for spectrum-native foundation models.

2026-05-11T04:12:58Z 9 pages, 1 figure Chengrui Xiang Tengfei Ma Yujie Chen Tong Wang Haowen Chen Xiangxiang Zeng http://arxiv.org/abs/2605.11028v1 Morpho-Physiological and Genetic Diversity of Crataegus Taxa (Rosaceae) in Selected Locations of Iraqi Kurdistan-Region 2026-05-10T20:55:27Z

The Kurdistan region of Iraq is one of the world's major semi-arid phytogeographic zones and, owing to its geographical location and ecology, hosts many important fruit species. Mountain hawthorn (Crataegus spp.) is an important wild, edible, deciduous fruit tree in the region, valued for its ornamental, economic, industrial, and medicinal uses. In the present study, morphological, phytochemical, and molecular marker systems were applied to sixty-one hawthorn accessions from different locations in the Iraqi Kurdistan region between April 2022 and September 2023. Phenotypic markers have proven extremely useful in studies of genetic diversity in hawthorn genotypes. The morphological results identified seven taxa (five species and two hybrids): Crataegus azarolus, Crataegus meyeri, Crataegus monogyna, Crataegus orientalis, Crataegus pentagyna, Crataegus azarolus x Crataegus meyeri, and Crataegus azarolus x Crataegus pentagyna. There was significant variation among ecotypes in plant type, reproductive stage, fruit morphology, and production uses. Analysis of variance of the fruit physio-morphological data revealed a high level of significant variability (P 0.01) among accessions. The most important characteristics for explaining fruit morphological variability were 11 variables: fruit weight (FW), fruit length (FL), fruit width (FW), seed length (SL), seed width (SW), number of seeds per fruit (NSF), volume solution (VS), fruit fresh weight (WOF), seed weight (WS), potential of hydrogen (pH), and moisture content (MC). All of these traits differed significantly among the studied accessions.

2026-05-10T20:55:27Z 96 pages Karzan Ezzalddin Mohammed http://arxiv.org/abs/2605.10994v1 Internally triggered retrospective learning in neural networks 2026-05-09T14:30:43Z

Learning in artificial neural networks usually relies on continuous, externally driven weight updates, in which parameters are modified at every step in response to incoming data, error signals or reward feedback. In this setting, routine and informative inputs contribute similarly to parameter adjustment. We introduce a learning approach in which parameter updates are governed by internally generated events arising from the network's own representational dynamics. During ongoing activity, synaptic interactions are accumulated as latent traces encoding recent coactivation patterns, without immediately modifying the underlying parameters. In parallel, an internal predictive process estimates the evolving latent state, while a scalar measure of discrepancy between predicted and observed states is continuously computed. When discrepancy exceeds an adaptive threshold derived from recent error statistics, a learning event is triggered, inducing a retrospective update selectively integrating past activity into the current configuration. We performed simulations using a minimal neural network exposed to structured sequential inputs with transient perturbations. We found that learning occurs through sparse, temporally localized events associated with increases in prediction error, leading to stepwise changes in synaptic efficacy and discrete transitions in latent state organization. By selectively reorganizing parameters in response to internally detected discrepancies, our episodic updating may reduce unnecessary parameter drift while preserving informative patterns. Potential applications include systems requiring selective adaptation to rare or informative inputs such as physiological, industrial or environmental monitoring, edge computing under limited energy budgets, autonomous systems operating in dynamic conditions and sequential computational data processing.
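
The event-triggered scheme described above can be illustrated with a deliberately small scalar toy. This is a sketch under several assumptions that are not in the abstract: the latent state is taken to be the input itself, the internal predictor simply expects the next state to equal the current one, the trace is an exponential moving average, and the threshold is the mean plus `k` standard deviations of recent errors. All names and default values are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def run_event_triggered_learning(inputs, window=20, k=3.0, decay=0.9,
                                 lr=0.5, min_history=5):
    """Toy scalar model in which the weight changes only at learning events."""
    w = 0.0          # parameter, modified only when an event is triggered
    trace = 0.0      # latent trace of recent activity
    predicted = 0.0  # internal prediction of the current latent state
    errors = deque(maxlen=window)  # recent discrepancies for the threshold
    events = []      # time steps at which a learning event occurred

    for t, x in enumerate(inputs):
        state = x  # assumption: the latent state equals the input
        error = abs(state - predicted)

        # Accumulate activity in the trace without touching the weight.
        trace = decay * trace + (1.0 - decay) * state

        # Adaptive threshold derived from recent error statistics.
        if len(errors) >= min_history:
            threshold = mean(errors) + k * pstdev(errors)
            if error > threshold:
                # Retrospective update: fold the accumulated trace into w.
                w += lr * (trace - w)
                events.append(t)

        errors.append(error)
        predicted = state  # naive predictor: expect the same state next step

    return w, events
```

On a constant input with one transient perturbation, the weight stays fixed during the routine stretch and changes only around the perturbation, which mirrors the sparse, stepwise updates the abstract reports.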

2026-05-09T14:30:43Z 13 pages, 2 figures Arturo Tozzi http://arxiv.org/abs/2605.10979v1 Statin Recommendations among US Adults with the 2026 Dyslipidemia Guidelines 2026-05-09T03:01:39Z

Importance: The 2026 multisociety dyslipidemia guideline recommended the PREVENT equations in place of the PCE equations, introduced 30-year risk assessment as a new treatment pathway, and lowered risk-based treatment thresholds. The net population impact of these concurrent changes on statin recommendations is unknown. Objective: To estimate changes in statin recommendations under 2026 PREVENT-based dyslipidemia guidelines compared with 2018 PCE-based guidelines. Design and Participants: Cross-sectional analysis of pooled data from NHANES, spanning 2011-2023 and comprising 24,199 participants aged 30-79 years. Main Outcomes and Measures: Number and proportion of US adults receiving or recommended for statin therapy. Results: At the class 1 threshold, the number of US adults receiving or recommended for statin therapy decreased by an estimated 3.0 million (95% CI, 2.3 million to 3.6 million), with larger reductions among Black adults (-4.2 percentage points [pp]), men (-4.0pp), and adults aged 50-69 years (-5.6pp). At the class 2 threshold--which additionally recommends statins for adults aged 30-59 years based on 30-year risk--the number of adults recommended increased by an estimated 20.8 million (95% CI, 19.6 million to 22.0 million), or +11.6pp. The increase was largest among adults aged 50-59 years (+19.7pp) and 40-49 years (+14.8pp). Conclusions: The net population impact of the 2026 dyslipidemia guidelines depends critically on which recommendation class is applied. At the class 1 threshold, statin recommendations decreased modestly; at the class 2 threshold, inclusion of 30-year risk assessment substantially expanded recommendations, particularly among younger adults. These divergent effects underscore the importance of the 30-year risk criterion as a major driver of new eligibility and the need for outcomes and equity monitoring during guideline implementation.

2026-05-09T03:01:39Z James A. Diao Thomas A. Buckley Andrew Z. Zhou Smaraki Dash Rishi K. Wadhera Arjun K. Manrai http://arxiv.org/abs/2605.07035v1 Genetic Information as a "Chord" of Chemical Oscillations: Emergence of Catalyst-RNA Systems Driven by Superposed Rhythms 2026-05-07T23:26:01Z

A central challenge in the origin of life is understanding how catalytic peptide-like polymers and information-bearing nucleic acid-like polymers emerged as an interdependent system. This study constructs a primordial cognitive model incorporating two internal Lotka-Volterra chemical oscillators to investigate, through simulation, whether a catalytic loop, primordial tRNAs, and nucleic acids that record and amplify them, can form through the interaction of polymers represented by binary (0/1) sequences. In this model, a mechanism was introduced where the synthesis of internal oscillations provides a temporal bias for 0/1 selection during polymer elongation, while generated functional sequences are protected, recorded, and re-amplified. Simulation results demonstrated that the proposed cognitive model significantly outperformed a contrast model based on random 0/1 selection in terms of the establishment rate of catalytic loops, the accumulation of functional molecules, polymer elongation, and the reduction of Shannon entropy in sequence distribution. Furthermore, this superiority was generally maintained across sensitivity analyses, including batch calculations with different random seeds. While this study is a computational model based on abstract binary sequences and simplified translation/replication rules rather than a direct reconstruction of life's origin, it provides a working hypothesis for the interdependent emergence of catalytic function and information retention by demonstrating that internal oscillations can bias sequence exploration within a framework linking autocatalytic networks, recording, and group selection. Future research must verify the generality and empirical validity of this framework by expanding monomer types, evolving into multi-oscillator systems, and establishing correspondences with compartmentalized experimental systems.

2026-05-07T23:26:01Z Takeshi Ishida http://arxiv.org/abs/2603.12278v2 Unsupervised Anomaly Detection in Wearable Foot Sensor Data: A Baseline Feasibility Study Towards Diabetic Foot Ulcer Prevention 2026-05-07T14:39:57Z

Diabetic foot ulcers (DFUs) are a severe complication of diabetes associated with significant morbidity, amputation risk, and healthcare burden. Developing effective continuous monitoring frameworks requires first establishing reliable baseline models of normal foot biomechanics. This paper presents a feasibility study of an anomaly detection framework applied to time-series data from wearable foot sensors, specifically NTC thin-film thermocouples for temperature and FlexiForce A401 pressure sensors for plantar load monitoring. Data were collected from healthy adult subjects across 312 capture sessions on an instrumented pathway, generating 93,790 valid multi-sensor readings spanning September 2023 to June 2024. Two unsupervised algorithms, Isolation Forest and K-Nearest Neighbors using Local Outlier Factor (KNN/LOF), were applied to detect statistical deviations in foot temperature and pressure signals. Results show that Isolation Forest is more sensitive to subtle, distributed anomalies, while KNN/LOF identifies concentrated extreme deviations but flags a higher proportion of sessions not corroborated by Isolation Forest. Since no clinical ground truth is available, this difference is interpreted as lower specificity under the shared 5 percent contamination assumption rather than a confirmed false-positive rate. A mild positive correlation (0.41-0.48) between pressure and temperature features supports the case for combined multi-modal monitoring. These findings establish a validated baseline analytical pipeline and provide a methodological foundation for future clinical validation studies involving diabetic patients, where the relationship between detected anomalies and DFU-related pathophysiology can be directly assessed.
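
To make the comparison between the two detectors concrete, the sketch below shows how a shared contamination rate can turn per-reading anomaly scores into flags, and how sessions flagged by one method but not corroborated by the other can be listed. It is a minimal sketch, not the paper's pipeline: the scores are assumed to come from whichever detector is in use, with higher meaning more anomalous, and the rule that a session counts as flagged when any of its readings is flagged is an assumption. All function names are hypothetical.

```python
def flag_readings(scores, contamination=0.05):
    """Return the indices of the most anomalous readings.

    Flags the top `contamination` fraction of readings by score
    (at least one reading is always flagged).
    """
    n_flagged = max(1, round(contamination * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_flagged])


def flagged_sessions(session_ids, flagged_indices):
    """Sessions containing at least one flagged reading (assumed rule)."""
    return {session_ids[i] for i in flagged_indices}


def uncorroborated(sessions_a, sessions_b):
    """Sessions flagged by method A that method B does not also flag."""
    return sessions_a - sessions_b
```

Because both methods flag the same fraction of readings, any difference in the number of uncorroborated sessions comes from how the flagged readings are spread across sessions, not from one method flagging more readings overall.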

2026-02-27T17:30:27Z 36 pages, 19 figures. Published in Biomedical Signal Processing and Control, Vol. 123, Part A, 110416, September 2026. https://doi.org/10.1016/j.bspc.2026.110416 Biomedical Signal Processing and Control, Vol. 123, Part A, 110416 (2026) Md Tanvir Hasan Turja 10.1016/j.bspc.2026.110416 http://arxiv.org/abs/2511.09584v3 Similarity Analysis of Blood Count Reference Intervals Across Continents Reveals No Reproducible Population or Geography-Linked Structure and Supports Personalised Values 2026-05-04T16:21:26Z

Blood reference intervals (RIs) underpin diagnostic interpretation and therapeutic monitoring worldwide. However, many widely used RI systems originate from limited historical cohorts and have been propagated across health systems without harmonised derivation protocols, shared metadata, or cross-population validation. Consequently, the global RI landscape reflects a heterogeneous mixture of legacy standards and local laboratory practices rather than a biologically grounded framework. Here we examine published Complete Blood Count (CBC) reference intervals, one of the most commonly used laboratory panels worldwide. We compiled CBC RI data from 28 countries and analysed their similarity using variability mapping, hierarchical clustering, information-theoretic distances, cohesion benchmarking, and nonlinear manifold visualisation. Body mass index (BMI) served as a methodological positive benchmark and exhibited clear continent-level clustering (mean cohesion approximately 0.78-0.81). In contrast, CBC reference intervals showed no reproducible geography-linked clustering across methods, with uniformly high cohesion scores (mean approximately 1.27-1.30). Weak signals in red-cell indices (MCV, HGB) were unstable across sexes and distance metrics. This absence of structure should not be interpreted as evidence that current CBC reference intervals represent universal biological standards. Rather, it is more consistent with the fragmented and historically inherited nature of the global RI landscape. These findings indicate that published CBC reference intervals do not encode coherent global structure and provide limited support for universal population-based diagnostic thresholds. Instead, they support a transition toward recalibrated and personalised reference frameworks based on longitudinal individual baselines and harmonised derivation standards.

2025-11-12T11:08:21Z 25 pages Kunlin Wu Abicumaran Uthamacumaran Hector Zenil http://arxiv.org/abs/2511.10888v2 Multi-omic Enriched Blood-Derived Digital Signatures Reveal Mechanistic and Confounding Disease Clusters for Differential Diagnosis 2026-05-03T23:45:33Z

Understanding disease relationships through blood biomarkers offers a pathway toward data-driven taxonomy and precision medicine. In this study, we constructed a digital blood twin, a computational model derived from 103 disease signatures comprising longitudinal hematological and biochemical analytes. Profiles were standardized into a unified disease-analyte matrix, and pairwise Pearson correlations were computed to assess similarity across conditions. Hierarchical clustering revealed consistent grouping of hematopoietic disorders, while metabolic, endocrine, and respiratory diseases were more heterogeneous, reflecting weaker internal cohesion. To evaluate cluster structure, the tree was partitioned at a stringent distance threshold, yielding 16 groups. Enrichment analysis of the largest and most heterogeneous cluster demonstrated convergence on cytokine-signaling pathways, indicating shared inflammatory mechanisms that transcend conventional clinical boundaries. PCA and UMAP corroborated the correlation-based results, consistently separating hematological diseases as a distinct cluster. Random Forest feature selection identified neutrophils, mean corpuscular volume, red blood cell count, and platelet count as the most discriminative analytes, reinforcing the role of hematopoietic markers as key drivers of disease stratification. Collectively, these findings show that blood-derived digital signatures can recover clinically meaningful disease clusters while uncovering mechanistic overlaps across categories. This network physiology framework highlights the potential of integrating routine laboratory data with computational methods to refine disease ontology, map comorbidities, and advance precision diagnostics.
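
The correlation-and-clustering step can be sketched as follows: compute pairwise Pearson correlations between disease profiles, convert them to distances, and group diseases whose distance falls under a threshold. This is a minimal sketch under assumptions the abstract does not confirm: the distance is taken as 1 minus the correlation, and groups are formed as in single-linkage clustering cut at `max_distance`. The paper's linkage method and threshold are not stated, and all names below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def cluster_by_correlation(profiles, max_distance):
    """Group profiles linked by correlation distance <= max_distance.

    `profiles` maps a disease label to its analyte vector. Linking every
    pair under the threshold and taking connected components gives the
    same groups as cutting a single-linkage tree at that distance.
    """
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if 1.0 - pearson(profiles[a], profiles[b]) <= max_distance:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())
```

A stricter (smaller) `max_distance` yields more, tighter groups, which is the sense in which a stringent threshold partitions the tree into many clusters.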

2025-11-14T01:49:59Z 47 pages with supplementary Bolin Liu Abicumaran Uthamacumaran Alexander Fulton Hector Zenil http://arxiv.org/abs/2605.02952v1 DynoSys: A Dynamic Systems Framework for Multimodal Integration of Genetic, Environmental, and Neurobiological Signals 2026-05-02T00:29:33Z

Understanding the development of adolescent behavioral and mental health outcomes requires integrating genetic predisposition, environmental exposures, and neurobiological processes over time. Here, we present a unified quantitative framework that models the human body as a dynamic system, where genetic factors form the foundational state, environmental exposures act as time-varying inputs, the brain might serve as a mediation processor, and behavioral phenotypes emerge as system outputs. Using longitudinal data from the Adolescent Brain Cognitive Development (ABCD) Study, we construct harmonized multi-domain representations across six phenotypes: externalizing behavior, internalizing behavior, and four substance use initiation outcomes (alcohol, nicotine, cannabis, and any substance use). We integrate polygenic risk scores (PRS), multi-domain environmental features, and multimodal neuroimaging representations derived through stability selection and dimensionality reduction. Our framework supports both continuous longitudinal modeling and survival-based event modeling through a unified data structure. We further develop interpretable domain-level representations using principal components, weighted risk scores, and cluster-based summaries. These representations enable downstream modeling using survival analysis, state-space models, and machine learning approaches. This work establishes a scalable and interpretable framework for studying how genetic and environmental factors interact over time to shape behavioral outcomes, providing a foundation for identifying modifiable risk factors and informing early intervention strategies.

2026-05-02T00:29:33Z Mengman Wei Qian Peng