https://arxiv.org/api/77Ptvk4l+/+miiDooL//8R+2QLA 2026-03-28T11:03:55Z 22843 75 15 http://arxiv.org/abs/2510.01803v3 The Perceived Impact of Environment on Health in Italy: a Penalized Ordinal Regression Approach 2026-03-20T16:25:30Z Understanding how individuals perceive their living environment is a complex task, as it reflects both personal and contextual determinants. In this paper, we address this task by analyzing the environmental module of the Italian nationwide health surveillance system PASSI (Progressi delle Aziende Sanitarie per la Salute in Italia), integrating it with contextual information at the municipal level, including socio-economic indicators, pollution exposure, and other geographical characteristics. Methodologically, we adopt a penalized semi-parallel cumulative ordinal regression model to analyze how subjective perceptions are shaped by both personal and territorial determinants. The approach balances flexibility and interpretability by allowing both parallel and non-parallel effects while regularizing estimates to address multicollinearity and separation issues. We use the model as an analytical tool to uncover the determinants of positivity and neutrality in environmental perceptions, defined as factors that contribute the most to improving perception or increasing the sense of neutrality. The results are diverse. First, results reveal significant heterogeneity across Italian territories, indicating that local characteristics strongly shape environmental perception. Second, various individual factors interact with contextual influences to shape perceptions. Third, hazardous environmental factors, such as higher PM2.5 levels, appear to be associated with poorer environmental perception, suggesting a tendency among respondents to recognize specific environmental issues. Overall, the approach demonstrates strong potential for application and provides useful insights for environmental policy planning. 
2025-10-02T08:44:39Z Mattia Stival Angela Andreella Gaia Bertarelli Catarina Midões Stefano Federico Tonellato Stefano Campostrini http://arxiv.org/abs/2603.20052v1 Uncertainty in wind and solar projections depends on global and regional climate models 2026-03-20T15:36:34Z Ensembles of regional-global climate model combinations show substantial spread in projected wind and solar resources. Using 31 RCM-GCM pairs, we quantify the sources of this spread with a spatially and seasonally resolved variance decomposition, separating contributions from RCMs and GCMs. For both wind speed and solar radiation, RCMs dominate the variability in the absolute historical fields. In contrast, projected changes in wind speed are largely controlled by the driving GCMs, except in mountainous regions where RCM-induced variance becomes larger than that induced by GCMs. For solar radiation, contributions are strongly season-dependent, with RCMs dominating in summer and GCMs in winter. Our findings support that GCM and RCM variability together define the uncertainty of wind and solar climate projections. This provides guidance for designing climate model ensembles that better support uncertainty-aware energy system decisions under climate change. 2026-03-20T15:36:34Z Nina Effenberger Reto Knutti http://arxiv.org/abs/2603.16982v3 Trajectory Stability and Signature Diagnostics for Comet-Based Interstellar Navigation 2026-03-20T15:18:46Z Interstellar objects (ISOs) motivate a coupled mission-design and inference question relevant to spacecraft dynamics and control in extreme environments: if volatile-rich, rotating comet-like bodies were used for sustained deep-space navigation by exploiting pre-existing hyperbolic motion and in-situ propellant, what stability requirements arise under non-gravitational forcing, and what astrometric signatures might distinguish active stabilization from uncontrolled natural dynamics? 
We develop a stability-theoretic framework for trajectory tracking with jet-actuated correction, and show that high-speed transit geometry -- including debris-belt avoidance and encounter phasing -- tightly constrains feasible trajectories, making long-horizon tracking stability mission-critical. We model tracking residuals as the balance of disturbances and corrective action, and derive stability conditions across four levels: disturbance-energy stability, outer-loop contraction, actuator-memory stability, and rotation-mediated (Floquet) stability. The analysis implies residual diagnostics that can motivate empirical tests: under comparable forcing, effective stabilization is expected to strengthen short-horizon error correction, reduce event-conditioned persistence and variance clustering, regularize standardized innovations, and yield bounded post-shock recovery. More broadly, the framework provides a reference for deep-space guidance and control under nonlinear, multi-field disturbances and for planetary-defense concepts involving attitude shaping or impulsive kinetic impact. 2026-03-17T15:30:15Z 31 pages, 2 figures, 4 added references Bo Pieter Johannes Andrée http://arxiv.org/abs/2603.20015v1 On the Calibration of Bayesian Success Criteria and Operating Characteristics for Clinical Trials 2026-03-20T14:57:37Z Recently, the U.S. Food and Drug Administration (FDA) released draft guidance \citep{FDA2026} signaling a paradigm shift that facilitates the use of Bayesian methodology as the primary analysis and decision framework for drug approval. The cornerstone and fundamental challenge of this framework is the specification and calibration of Bayesian success criteria to control decision errors, ensuring reliable clinical and regulatory outcomes. In this work, we systematically investigate various Bayesian decision-error metrics, their theoretical interrelationships, and their alignment with conventional Frequentist counterparts. 
This investigation provides critical theoretical insights and practical guidance on calibrating Bayesian success criteria and operating characteristics to ensure robust decision-making and the integrity of public health decisions. We illustrate this framework using a clinical trial evaluating revascularization strategies for cardiogenic shock. A Shiny application will be available at www.trialdesign.org to assist sponsors and regulators in evaluating calibration strategies consistent with recent regulatory perspectives. 2026-03-20T14:57:37Z Peng Yang Li Wang Ying Yuan http://arxiv.org/abs/2603.19986v1 Probabilistic Estimation of Hidden Migrant Fatalities Along the Central Mediterranean Route 2026-03-20T14:33:46Z Estimating the number of migrants who die or go missing along dangerous routes such as the Central Mediterranean remains challenging as available records are incomplete. Some incidents are never documented, and fatalities associated with such unobserved incidents are absent from observed totals. We propose a Bayesian approach for probabilistic estimation of total migrant fatalities in such settings. Building on recent developments in multiple-systems estimation, we develop a time-stratified latent-class framework that accommodates missing fatality counts for unobserved incidents. We apply the method to recoded incident-level data from the Missing Migrants Project for the Central Mediterranean route from 2014 to 2025, encompassing 25,712 fatalities across 1,562 incidents. Our model yields 95% credible intervals of 30,426-39,172 fatalities and 2,200-2,591 deadly incidents, indicating that approximately 66%-85% of fatalities and 60%-71% of incidents are reflected in the available data. We estimate that unreported fatalities were concentrated between 2014 and 2016. Furthermore, we document that reporting likelihood increases with incident severity, implying that smaller incidents are most likely to remain undetected. 
While contingent on modeling assumptions and incomplete data, our method provides a broadly applicable and principled alternative to naive data adjustment methods. 2026-03-20T14:33:46Z Gregor Zens Zoe Sigman http://arxiv.org/abs/2509.01597v2 Statistics-Friendly Confidentiality Protection for Establishment Data, with Applications to the QCEW 2026-03-20T14:30:40Z Confidentiality for business data is an understudied area of disclosure avoidance, where legacy methods struggle to provide acceptable results. Standard formal privacy techniques for person-level data, like differential privacy, are designed to protect against membership inference and hence do not provide suitable confidentiality/utility trade-offs due to the highly skewed nature of business data and because extreme outlier records are often important contributors to query answers. Prior proposals, therefore, took a personalized differential privacy approach that allowed privacy parameters to degrade for the outlying records -- larger establishments get weaker membership inference guarantees. However, providing guarantees to some entities that are strictly weaker than guarantees for others is problematic from a policy standpoint. In this paper, we propose a novel confidentiality framework for business data with a focus on interpretability for policy makers. Instead of protecting against membership inference, which is often not a concern in business data, we protect against attribute inferences that are too precise. In our framework, data curators specify a neighbor function that is used to define uncertainty interval bands around an establishment's attribute values and the privacy parameters govern the strength of indistinguishability between values within the same uncertainty interval. We propose two query-answering mechanisms under this framework and evaluate them on: (1) a confidential Quarterly Census of Employment and Wages (QCEW) dataset produced by the U.S. 
Bureau of Labor Statistics (this was done through a cooperative agreement), and (2) a substitute dataset that we created from public sources (and will publicly release). 2025-09-01T16:29:54Z 42 pages (13 main text, 2 references, and 27 appendix pages), 13 figures (4 in main text) Kaitlyn Webb Prottay Protivash John Durrell Daniell Toth Aleksandra Slavković Daniel Kifer http://arxiv.org/abs/2508.15954v2 A Heuristic Framework of Variable Neighborhood Descent Methods for the Large-Scale Multi-Level Facility Location Problem in Supply Chain Networks 2026-03-20T14:22:55Z This paper addresses the single-assignment, uncapacitated, multi-level facility location (MFL) problem, a strategic decision-making process critical to the design of long-term supply chain networks. Specifically, we examine four- and five-level facility location structures (k-LFL), modeled as a location-allocation problem where demand nodes must be assigned to open facilities across hierarchical levels. Although the MFL has been addressed in the literature, solutions to large-scale, realistic problems involving thousands of nodes are lacking. This paper proposes a heuristic framework based on the Variable Neighborhood Descent (VND) metaheuristic with a multi-start strategy. We develop and compare four variants: Basic Variable Neighborhood Descent (BVND), Pipe Variable Neighborhood Descent (PVND), Cyclic Variable Neighborhood Descent (CVND), and Union Variable Neighborhood Descent (UVND). In each case, a multi-start strategy with strong diversification components is employed. Extensive computational experiments compare the methods on large-scale instances involving up to 10,000 customers, 150 distribution centers, 50 warehouses, and 30 plants. Each algorithm settled into a unique, statistically significant computational time when solving these problems. Sensitivity analyses, supported by non-parametric statistical methods, validate the effectiveness of the proposed heuristic framework. 
2025-08-21T20:46:34Z 48 pages 3 figures Haibo Wang Bahram Alidaee http://arxiv.org/abs/2603.19899v1 Deep Autocorrelation Modeling for Time-Series Forecasting: Progress and Prospects 2026-03-20T12:31:08Z Autocorrelation is a defining characteristic of time-series data, where each observation is statistically dependent on its predecessors. In the context of deep time-series forecasting, autocorrelation arises in both the input history and the label sequences, presenting two central research challenges: (1) designing neural architectures that model autocorrelation in history sequences, and (2) devising learning objectives that model autocorrelation in label sequences. Recent studies have made strides in tackling these challenges, but a systematic survey examining both aspects remains lacking. To bridge this gap, this paper provides a comprehensive review of deep time-series forecasting from the perspective of autocorrelation modeling. In contrast to existing surveys, this work makes two distinctive contributions. First, it proposes a novel taxonomy that encompasses recent literature on both model architectures and learning objectives -- whereas prior surveys neglect or inadequately discuss the latter aspect. Second, it offers a thorough analysis of the motivations, insights, and progression of the surveyed literature from a unified, autocorrelation-centric perspective, providing a holistic overview of the evolution of deep time-series forecasting. The full list of papers and resources is available at https://github.com/Master-PLC/Awesome-TSF-Papers. 
2026-03-20T12:31:08Z Hao Wang Licheng Pan Qingsong Wen Jialin Yu Zhichao Chen Chunyuan Zheng Xiaoxi Li Zhixuan Chu Chao Xu Mingming Gong Haoxuan Li Yuan Lu Zhouchen Lin Philip Torr Yan Liu http://arxiv.org/abs/2603.20349v1 Prediction intervals for overdispersed multinomial data with application to historical controls 2026-03-20T12:14:41Z In pharmaceutical and toxicological research, historical control data are increasingly used to validate concurrent control groups, typically via the construction of historical control limits. While methods have been described for continuous and dichotomous endpoints, approaches for overdispersed multinomial data, common in developmental and reproductive toxicology or histopathology, are currently lacking. This article introduces and compares methods for constructing simultaneous prediction intervals for future multinomial observations subject to overdispersion. We investigate a range of frequentist approaches, including asymptotic approximations and bootstrap techniques (incorporating symmetric, asymmetric, and marginal calibration, as well as rank-based methods), alongside Bayesian hierarchical models. Extensive simulation studies assessing simultaneous coverage probability and the balance of lower and upper tail error probabilities show that standard asymptotic methods and simple Bonferroni adjustments yield liberal intervals, especially for small sample sizes or rare event categories. In contrast, bootstrap methods, specifically the Marginal Calibration and Rank-Based Simultaneous Confidence Sets, provide reliable error control and equal tail probabilities across diverse scenarios involving varying cluster sizes and degrees of overdispersion. These methods fill an important gap for multinomial endpoints and support the validation of concurrent controls using historical control data, in line with the recent European Food Safety Authority scientific opinion on the use and reporting of historical control data. 
2026-03-20T12:14:41Z Sören Budig Frank Schaarschmidt Max Menssen http://arxiv.org/abs/2603.22320v1 Bridging the Gap Between Climate Science and Machine Learning in Climate Model Emulation 2026-03-20T10:41:50Z While climate models provide insights for climate decision-making, their use is constrained by significant computational and technical demands. Although machine learning (ML) emulators offer a way to bypass the high computational costs, their effective use remains challenging. The hurdles are diverse, ranging from limited accessibility and a lack of specialized knowledge to a general mistrust of ML methods that are perceived as insufficiently physical. Here, we introduce a framework to overcome these barriers by integrating both climate science and machine learning perspectives. We find that designing easy-to-adopt emulators that address a clearly defined task and demonstrating their reliability offers a promising path for bridging the gap between our two fields. 2026-03-20T10:41:50Z Luca Schmidt Nina Effenberger http://arxiv.org/abs/2508.18716v3 Dynamic Count Models with Flexible Innovation Processes for Irregular Maritime Migration 2026-03-20T10:41:01Z Motivated by the challenge of analyzing the dynamics of weekly sea border crossings in the Mediterranean (2015-2025) and the English Channel (2018-2025), we develop a Bayesian dynamic framework for modeling heteroskedastic count time series. Building on theoretical considerations and empirical stylized facts, our approach utilizes a Poisson random walk model that allows for heavy-tailed innovations or stochastic volatility dynamics, while incorporating an explicit mechanism to separate structural from sampling zeros. Posterior inference is carried out via a straightforward Markov chain Monte Carlo algorithm. Applying this methodology to Mediterranean and English Channel data, we compare alternative model specifications through a comprehensive out-of-sample forecasting exercise. 
Using log predictive scores and empirical coverage at predictive quantiles to evaluate each model, we find strong evidence for stochastic volatility in migration innovations. These models deliver the strongest out-of-sample forecasts with empirical coverage close to nominal levels up to the 99th percentile. Our framework can be used to develop risk indicators with direct policy implications for improving governance and preparedness for migration surges. More broadly, the methodology extends to other zero-inflated non-stationary count time series applications, including epidemiological surveillance and public safety incident monitoring. 2025-08-26T06:27:41Z Gregor Zens Jakub Bijak http://arxiv.org/abs/2510.17641v5 Are penalty shootouts better than a coin toss? Evidence from international club football in Europe 2026-03-20T10:17:08Z Penalty shootouts play a crucial role in the knockout stage of major football tournaments. Their importance has substantially increased from the 2021/22 season, when the Union of European Football Associations (UEFA) scrapped the away goals rule. Our paper examines whether the outcome of a penalty shootout can be predicted in UEFA club competitions. Based on all shootouts between 2000 and 2025, no evidence is found for the effect of the kicking order, the field of the match, or psychological momentum. In contrast to previous results, we do not detect any relationship between shootout success and relative team strength, quantified by differences in Elo ratings and the implied winning probability. Thus, the hypothesis that penalty shootouts are close to a coin toss in international competitions for European football clubs cannot be rejected. 
2025-10-20T15:21:44Z 23 pages, 5 figures, 7 tables László Csató Dóra Gréta Petróczy http://arxiv.org/abs/2603.20345v1 Towards Improved Short-term Hypoglycemia Prediction and Diabetes Management based on Refined Heart Rate Data 2026-03-20T10:13:31Z Hypoglycemia is a severe condition of decreased blood glucose, specifically below 70 mg/dL (3.9 mmol/L). This condition can often be asymptomatic and challenging to predict in individuals with type 1 diabetes (T1D). Research on hypoglycemic prediction typically uses a combination of blood glucose readings and heart rate data to predict hypoglycemic events. Given that these features are collected through wearable sensors, they can sometimes have missing values, necessitating efficient imputation methods. This work makes significant contributions to the current state of the art by introducing two novel imputation techniques for imputing heart rate values over short-term horizons: Controlled Weighted Rational Bézier Curves (CRBC) and Controlled Piecewise Cubic Hermite Interpolating Polynomial with mapped peaks and valleys of Control Points (CMPV). In addition to these imputation methods, we employ two metrics to capture data patterns, alongside a combined metric that integrates the strengths of both individual metrics with RMSE scores for a comprehensive evaluation of the imputation techniques. According to our combined metric assessment, CMPV outperforms the alternatives with an average score of 0.33 across all time gaps, while CRBC follows with a score of 0.48. These findings clearly demonstrate the effectiveness of the proposed imputation methods in accurately filling in missing heart rate values. Moreover, this study facilitates the detection of abnormal physiological signals, enabling the implementation of early preventive measures for more accurate diagnosis. 
2026-03-20T10:13:31Z 10 pages, 2 tables Vaibhav Gupta Florian Grensing Beyza Cinar Louisa van den Boom Maria Maleshkova http://arxiv.org/abs/2603.20343v1 A practical introduction to ODE modelling in Stan for biological systems 2026-03-20T10:02:39Z Integrating dynamical systems models with time series data is a central part of contemporary mathematical biology. With the rich variety of available models and data, numerous methods and computational tools have been developed for these purposes. One such tool is Stan, a freely available and open-source probabilistic programming framework that provides efficient methods for estimating model parameters from data using computational Bayesian inference algorithms. Stan includes built-in mechanisms for working with ordinary differential equation (ODE) models, which are widely used in mathematical biology and related fields to study simulated, experimental, and real-world systems that change over time. Through step-by-step worked examples, including both pedagogical toy models and applications with real data, this article provides a practical, self-contained introduction to performing parameter estimation and model evaluation for first-order linear and nonlinear ODE models in Stan. The article also explains key statistical methods that underpin Stan and discusses computational Bayesian modelling in the context of biological applications. 2026-03-20T10:02:39Z 23 pages, 10 figures Sara Hamis John Forslund Cici Chen Gu Jodie A. Cochrane http://arxiv.org/abs/2603.19756v1 Extraction of tabulated statistical results with tableParser 2026-03-20T08:41:37Z Tabulated content is omnipresent in scientific literature. This work presents the R package *tableParser*, designed to extract and postprocess tables from NISO-JATS-encoded XML, HTML, DOCX, and, with limitations, PDF documents. *tableParser* focuses on extracting and analyzing statistical test results reported in scientific publications. 
It can be used for large-scale analysis of effect sizes, reporting practices, or summarization of results, as well as for checking completeness and consistency of standard test results in unpublished documents. Documents can be processed in three decoding levels. *table2matrix()* compiles all tables into a list of character matrices with captions and footnotes. *table2text()* collapses the matrix contents into human-readable text, mimicking a screen reader. Optionally, many common codings that are reported within the table's caption and footnote can be used to decode and expand the table's content. The collapsed and decoded table content can be further processed to match an ideal input for the extraction of statistical standard results with the *standardStats()* function from the *JATSdecoder* package. The output of *table2stats()* is a data frame with all detected standard results as columns and, if calculation is possible, a recalculated p-value. If desired, an automated consistency check of the reported and the coded p-values with the recalculated p-value can be initiated. *tableParser* works best on barrier-free HTML tables encoded in NISO-JATS, where captions and footnotes are clearly identifiable. By guessing the tables' captions and footnotes conservatively, the processing of tables within HTML and DOCX documents is comparably robust. Technically, tables in PDFs often fail to be correctly extracted, with captions and footnotes not detectable. Therefore, a decoding of codes is not possible, which lowers *tableParser*'s decoding accuracy on PDFs. 2026-03-20T08:41:37Z 16 pages, 14 tables Ingmar Böschen