https://arxiv.org/api/bJeJODwVKhbyr2EwUUWSTgm9z90 2026-05-15T23:42:41Z 11856 0 15 http://arxiv.org/abs/2512.11948v2 Data-driven modeling of multivariate stochastic trajectories -- Application to water waves 2026-05-14T08:54:26Z A data-driven methodology is proposed to model the distribution of multivariate stochastic trajectories from an observed sample. As a first step, each trajectory in the sample is reduced to a vector of features by means of Functional Principal Component Analysis. Next, the joint distribution of features is modeled using (i) a non-parametric vine copula approach for the bulk of the distribution, and (ii) the conditional modeling framework of Heffernan and Tawn (2004) for the multivariate tail. The method is applied to the modeling of water waves. The dataset used is the DeRisk database, which consists of numerical simulations of water waves. The analysis is restricted to the portion of the wave period between the free-surface zero-upcrossing and the wave crest. The kinematic variables considered are the free-surface slope, the normal component of the fluid velocity at the free surface, and the vertical Lagrangian acceleration of the fluid at the free surface. The stochastic trajectories of these three variables are modeled jointly. The vertical Lagrangian acceleration of the fluid is employed to enforce a wave-breaking filter in the stochastic model. The number of hyperparameters in the stochastic framework is reduced to three, and a stepwise calibration strategy is proposed for their adjustment. The capabilities of the model are illustrated by predicting the distributions of selected response variables and by generating synthetic trajectories. 
2025-12-12T18:46:37Z 26 pages, 5 figures Romain Hascoët http://arxiv.org/abs/2605.14131v1 Double Metric Learning for Building Directed Graphs with Chain Connections for the ATLAS ITk Detector 2026-05-13T21:31:28Z Graph construction is an essential step in the Graph Neural Network (GNN) based tracking pipelines. The goal of the graph construction is to construct a graph that contains only the defined true edge connections between nodes (detector hits). A promising approach for the graph construction is through the Metric Learning approach, where a node representation in an embedding space is learned, and nodes are connected according to their distance in the embedding space. The loss function for the metric learning in this case is a contrastive loss encouraging the true pairs of nodes to be close to each other, and pulling away the false pairs of nodes. This approach presents a conflict of the learning objective for the hopping connections when a true edge is defined as a chain connection in a particle track. To address the conflict for this case, we propose a ``Double Metric Learning'' approach, where two node representations are learned. A directed graph can then be constructed based on the distance between the two representations from two nodes respectively. We test this idea with the ATLAS ITk detector at the HL-LHC using the ATLAS ITk simulation and show better graph construction performance particularly for particles with high transverse momentum compared to the Simple Metric Learning approach. We also show that Double Metric Learning is able to accurately predict edge direction. 
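The directed-graph construction described in the Double Metric Learning abstract can be illustrated with a small sketch. It assumes each hit has already been mapped to two learned representations, here called `source_emb` and `target_emb`, and that an edge i -> j is kept when the source representation of i lies within `radius` of the target representation of j. These names, the radius rule, and the toy 1D embeddings are illustrative assumptions, not the authors' implementation.

```python
import math

def build_directed_edges(source_emb, target_emb, radius):
    """Connect i -> j when source_emb[i] is within `radius` of target_emb[j].

    Hypothetical helper: the embeddings stand in for the outputs of two
    learned networks; self-loops are excluded.
    """
    edges = []
    for i, s in enumerate(source_emb):
        for j, t in enumerate(target_emb):
            if i != j and math.dist(s, t) < radius:
                edges.append((i, j))
    return edges

# Toy chain of three hits: the target representation of hit j coincides
# with the source representation of hit j-1, so only consecutive hits match.
source = [(0.0,), (1.0,), (2.0,)]
target = [(-1.0,), (0.0,), (1.0,)]
edges = build_directed_edges(source, target, radius=0.5)
print(edges)
```

In this toy setting the hopping pair (0, 2) is not connected, and the asymmetric distance gives each edge a direction, which a single shared embedding cannot express.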
2026-05-13T21:31:28Z 7 pages, 5 figures Proceedings of the CTD 2025, PROC-CTD2025-071 Jay Chan http://arxiv.org/abs/2605.13627v1 SINAPSE: A lightweight deep learning framework for accurate and explainable neutron-$γ$ discrimination 2026-05-13T14:53:47Z Traditionally, neutron-$γ$ discrimination in organic scintillators relies on techniques such as time-of-flight (ToF) selection and pulse-shape discrimination (PSD). However, particle identification through graphical cuts remains challenging in the low-charge regime due to poor signal-to-noise ratios (SNR). In this work, we propose SINAPSE, a lightweight deep learning framework for accurate and explainable neutron-$γ$ discrimination in the low-charge regime. The framework employs a dual-branch architecture that combines a 1-dimensional convolutional autoencoder for waveform denoising with a classifier for particle identification. Random augmentations are applied to high-SNR waveforms to simulate low-charge conditions, enabling robust extrapolation into regimes where conventional PSD labels are unreliable. We show that SINAPSE achieves superior denoising performance compared to conventional digital signal processing techniques, and outputs well-calibrated probabilities, consistent with traditional graphical cuts. Finally, we apply SHAP (SHapley Additive exPlanations) values to show that model decisions are driven by physically meaningful pulse-shape features, confirming consistency with established PSD principles. 
2026-05-13T14:53:47Z 13 pages, 13 figures Thomas Carreau Adrien Matta Owen Syrett Benoît Mauss David Etasse Audrey Chatillon Cyril Lenain Pierre Morfouace Julien Taieb David Regnier Patrick Copp Matthew Devlin Charlène Surault Jason Surbrook http://arxiv.org/abs/2602.13094v2 A Quantum Reservoir Computing Approach to Quantum Stock Movement Forecasting in Quantum-Invested Markets 2026-05-13T13:54:41Z We present a quantum reservoir computing (QRC) framework based on a small-scale quantum system comprising at most six interacting qubits, designed for nonlinear financial time-series forecasting. We apply the model to predict future daily closing trading volumes of 20 quantum-sector publicly traded companies over the period from April 11, 2020, to April 11, 2025, as well as minute-by-minute trading volumes during out-of-market hours on July 7, 2025. Our analysis identifies optimal reservoir parameters that yield stock trend (up/down) classification accuracies exceeding $86 \%$. Importantly, the QRC model is platform-agnostic and can be realized across diverse physical implementations of qubits, including superconducting circuits and trapped ions. These results demonstrate the expressive power and robustness of small-scale quantum reservoirs for modeling complex temporal correlations in financial data, highlighting their potential applicability to real-world forecasting tasks on near-term quantum hardware. 2026-02-13T17:00:03Z 16 pages, 9 figures Wendy Otieno Alexandre Zagoskin Alexander G. Balanov Juan Totero Gongora Sergey E. Savel'ev http://arxiv.org/abs/2509.19929v4 Geometric Autoencoder Priors for Bayesian Inversion: Learn First Observe Later 2026-05-13T12:33:14Z Uncertainty Quantification (UQ) is paramount for inference in engineering. A common inference task is to recover full-field information of physical systems from a small number of noisy observations, a usually highly ill-posed problem. 
Sharing information from multiple distinct yet related physical systems can alleviate this ill-posedness. Critically, engineering systems often have complicated variable geometries prohibiting the use of standard multi-system Bayesian UQ. In this work, we introduce Geometric Autoencoders for Bayesian Inversion (GABI), a framework for learning geometry-aware generative models of physical responses that serve as highly informative geometry-conditioned priors for Bayesian inversion. Following a ``learn first, observe later'' paradigm, GABI distills information from large datasets of systems with varying geometries, without requiring knowledge of governing PDEs, boundary conditions, or observation processes, into a rich latent prior. At inference time, this prior is seamlessly combined with the likelihood of a specific observation process, yielding a geometry-adapted posterior distribution. Our proposed framework is architecture-agnostic. A creative use of Approximate Bayesian Computation (ABC) sampling yields an efficient implementation that utilizes modern GPU hardware. We test our method on: steady-state heat over rectangular domains; Reynolds-Averaged Navier-Stokes (RANS) flow around airfoils; Helmholtz resonance and source localization on 3D car bodies; RANS airflow over terrain. We find: the predictive accuracy to be comparable to deterministic supervised learning approaches in the restricted setting where supervised learning is applicable; UQ to be well calibrated and robust on challenging problems with complex geometries. 
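For readers unfamiliar with ABC sampling, the following is a minimal sketch of plain rejection ABC. It is not the GABI implementation: the scalar Gaussian prior and the linear `forward` map are stand-ins (assumptions) for a learned latent prior and a decoder composed with an observation operator, and the function name `abc_rejection` is hypothetical.

```python
import random

def abc_rejection(sample_prior, forward, y_obs, tolerance, n_draws, seed=0):
    """Keep prior draws whose simulated observation lies near y_obs."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        z = sample_prior(rng)            # draw a latent sample from the prior
        if abs(forward(z) - y_obs) < tolerance:
            accepted.append(z)           # approximate posterior sample
    return accepted

# Toy problem: standard normal prior, observation y = 2 z, observed y = 1.
posterior = abc_rejection(
    sample_prior=lambda rng: rng.gauss(0.0, 1.0),
    forward=lambda z: 2.0 * z,
    y_obs=1.0,
    tolerance=0.1,
    n_draws=20000,
)
print(len(posterior), sum(posterior) / len(posterior))
```

Each draw is independent of the others, which is one reason this style of sampler maps naturally onto batched GPU evaluation.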
2025-09-24T09:38:11Z Arnaud Vadeboncoeur Gregory Duthé Mark Girolami Eleni Chatzi http://arxiv.org/abs/2605.12405v1 An analytical approach to calculating stationary PDFs for reflected random walks with an application to BESS-based ramp-rate control 2026-05-12T17:03:06Z A Wiener-Hopf-type integral equation for the stationary PDF of a reflected random walk is derived rigorously based on modern probability theory, and an application to battery energy storage systems (BESS), specifically the sizing of the inverter, is discussed in depth. The methodological steps include the construction of a Markov kernel, the derivation of a Fredholm integral equation of the second kind for the PDF of the BESS power, and an analytical solution of the equation based on a Neumann series. The analytical results were compared against numerical solutions obtained with the Nyström method, as well as against the results of an algorithmic simulation using simulated input time series. The use of truncated versions of the analytic solution allows for the construction of simplified design rules for the power systems practitioner. General insights into inverter sizing criteria of storage systems for ramp-rate control of variable renewable energy (VRE) sources such as wind and solar are provided. 2026-05-12T17:03:06Z 43 pages, 6 figures Carlos Colchero Diego Jiménez-Arreguín Álvaro Herrera Jorge E. Pérez-García Oliver Probst http://arxiv.org/abs/2605.12007v1 A geometry-aligned multi-fidelity framework for uncertainty quantification of wildfire spread 2026-05-12T11:55:52Z Forward propagation of input uncertainties in physics-based wildfire models is computationally prohibitive, limiting the use of high-fidelity simulators in risk assessment workflows. 
This work introduces a geometry-aligned bi-fidelity surrogate framework that addresses the convection-dominated nature of wildfire spread by mapping low- and high-fidelity solution snapshots onto a common reference domain prior to basis selection and reconstruction. Unlike conventional bi-fidelity schemes, which combine spatially shifted snapshots and thus suffer from oscillations and excess basis requirements near sharp fronts, the proposed mapping aligns the dominant front geometry through per-variable shift/stretch transforms in 1D and an activity indicator-based affine alignment in 2D, so that reduced bases compare physically corresponding structures rather than displaced ones. Building on the ADfiRe physics-based simulator, we demonstrate the method on 1D and 2D test cases in which low- and high-fidelity models differ in mesh resolution and physical completeness. Across both settings, the geometry-aligned surrogate reproduces full-field temperature and fuel composition with substantially lower error than its unmapped counterpart, eliminates Gibbs-type oscillations near steep gradients, and recovers high-fidelity probability density functions for key quantities of interest (e.g., maximum temperature, evaporated moisture, and burned area). After offline training, online predictions are roughly three orders of magnitude cheaper than direct high-fidelity evaluation, making the framework a practical building block for many-query uncertainty quantification once the offline cost is amortized over enough queries. We discuss the conditions under which the geometric alignment is most effective, its limitations for non-convex or topologically complex fronts, and the path toward validation against real data. 
2026-05-12T11:55:52Z 23 pages, 15 figures Konstantinos Vogiatzoglou Costas Papadimitriou Vasilis Bontozoglou Petros Koumoutsakos Han Gao http://arxiv.org/abs/2602.22776v3 Optimization-based Unfolding in High-Energy Physics 2026-05-12T07:45:24Z In experimental High-Energy Physics, unfolding refers to the problem of estimating the underlying distribution of a physical observable from detector-level data, in the presence of statistical fluctuations and systematic uncertainties. Starting from its reformulation as a regularized quadratic optimization problem, we develop a framework to address unfolding using both classical and quantum-compatible methods. In particular, we derive a Quadratic Unconstrained Binary Optimization (QUBO) representation of the unfolding objective, allowing direct implementation on quantum annealing and hybrid quantum-classical solvers. The proposed approach is implemented in QUnfold, an open-source Python package integrating classical mixed-integer solvers and D-Wave's hybrid quantum solver. We benchmark the method against widely used unfolding techniques in RooUnfold, including response Matrix Inversion, Iterative Bayesian Unfolding, and Singular Value Decomposition unfolding, using synthetic datasets with controlled distortion effects. Our results demonstrate that the optimization-based approach achieves competitive reconstruction accuracy across multiple distributions while naturally accommodating regularization within the objective function. This work establishes a unified optimization perspective on unfolding and provides a practical pathway for exploring quantum-enhanced methods in experimental HEP data analysis. 
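To make the QUBO reformulation of unfolding concrete, here is a small sketch under stated assumptions. It takes the objective to be ||R x - d||^2 + lam ||L x||^2 with a first-difference regularizer L, encodes each bin content as x_i = sum_k 2^k b_{i,k}, and minimizes the resulting binary quadratic form by brute force. The regularizer choice, the function names, and the brute-force solver are illustrative assumptions; this does not use or reproduce the QUnfold API.

```python
import itertools

def build_qubo(R, d, lam, n_bits):
    """Return (Q, index) so that b^T Q b equals the objective up to a constant."""
    n_rows, n_bins = len(R), len(R[0])
    # G = R^T R + lam * L^T L, with L the first-difference operator.
    G = [[sum(R[m][i] * R[m][j] for m in range(n_rows)) for j in range(n_bins)]
         for i in range(n_bins)]
    for i in range(n_bins - 1):
        G[i][i] += lam
        G[i + 1][i + 1] += lam
        G[i][i + 1] -= lam
        G[i + 1][i] -= lam
    # The linear part of ||R x - d||^2 is -2 d^T R x.
    h = [-2.0 * sum(R[m][i] * d[m] for m in range(n_rows)) for i in range(n_bins)]
    # Flatten (bin, bit) pairs; each binary variable carries weight 2^k.
    index = [(i, 2 ** k) for i in range(n_bins) for k in range(n_bits)]
    n = len(index)
    Q = [[0.0] * n for _ in range(n)]
    for p, (i, wp) in enumerate(index):
        for q, (j, wq) in enumerate(index):
            Q[p][q] = wp * wq * G[i][j]
        Q[p][p] += wp * h[i]  # b^2 = b, so linear terms sit on the diagonal
    return Q, index

def brute_force_minimum(Q):
    """Exhaustively minimize b^T Q b (only feasible for tiny problems)."""
    n = len(Q)
    def energy(b):
        return sum(Q[p][q] * b[p] * b[q] for p in range(n) for q in range(n))
    return min(itertools.product((0, 1), repeat=n), key=energy)

def decode(bits, index, n_bins):
    """Map a bit string back to integer bin contents."""
    x = [0] * n_bins
    for b, (i, w) in zip(bits, index):
        x[i] += b * w
    return x

# Toy example: a 2-bin truth [2, 1] smeared by a symmetric response matrix.
R = [[0.8, 0.2], [0.2, 0.8]]
d = [1.8, 1.2]
Q, index = build_qubo(R, d, lam=1e-3, n_bits=2)
x_hat = decode(brute_force_minimum(Q), index, n_bins=2)
print(x_hat)
```

On realistic problems the exhaustive search would be replaced by a mixed-integer or annealing-based solver; the matrix Q is the piece such solvers consume.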
2026-02-26T09:11:34Z Simone Gasperini Gianluca Bianco Marco Lorusso Carla Rieger Michele Grossi http://arxiv.org/abs/2605.11637v1 Computed Tomography Reconstruction Algorithm Using Markov Random Field Model 2026-05-12T06:58:44Z X-ray computed tomography (CT) reveals the internal structures of materials non-destructively from a tilt series of projected images. Filtered back projection (FBP) is a widely adopted reconstruction algorithm in CT owing to its small computational cost. Under low-dose or sparse-view conditions, however, FBP often amplifies noise, severely degrading the reconstructed images. In this study, we evaluated the performance of a Bayesian CT reconstruction algorithm based on the Markov random field model under such adverse conditions. Through simulations, we demonstrated that the proposed algorithm achieves higher reconstruction performance than FBP under both low-dose and sparse-view conditions. The hyperparameters are estimated by minimizing the Bayesian free energy, enabling adaptive reconstruction that reflects the noise characteristics of the observed projection data. These results suggest that the proposed algorithm can broaden the applicability of CT to dose-sensitive applications and time-constrained measurements, where only limited observed projection data are available. 2026-05-12T06:58:44Z 17 pages, 7 figures Taiga Shimomiya Taichi Kusumi Yuichi Yokoyama Masayuki Uesugi Akihisa Takeuchi Yuki Sada Hayaru Shouno Masato Okada http://arxiv.org/abs/2605.11359v1 CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing 2026-05-12T00:24:30Z Scientific data processing often requires task-specific algorithms or AI models, creating a barrier for domain scientists who need to analyze their data but may not have extensive computing or image-processing expertise. This barrier is especially pronounced when data are noisy, have a high dynamic range, are sparsely labeled, or are only loosely specified. 
We introduce CVEvolve, an autonomous agentic harness with a zero-code interface for scientific data-processing algorithm discovery. CVEvolve combines a multi-round search strategy with tools for code execution, evaluation implementation, history management, holdout testing, and optional inspection of scientific data and visual outputs. The search alternates between discovery and improvement actions, and uses lineage-aware stochastic candidate sampling to balance exploration and exploitation. We demonstrate CVEvolve on x-ray fluorescence microscopy image registration, Bragg peak detection, and high-energy diffraction microscopy image segmentation. Across these tasks, CVEvolve discovers algorithms that improve over baseline methods, while holdout test tracking helps identify candidates that generalize better than later over-optimized alternatives. These results show that zero-code, autonomous LLM-powered algorithm development can help domain scientists turn unstructured scientific image data into practical algorithms and downstream scientific discoveries. 2026-05-12T00:24:30Z Ming Du Xiangyu Yin Yanqi Luo Dishant Beniwal Songyuan Tang Hemant Sharma Mathew J. Cherukara http://arxiv.org/abs/2605.11197v1 The Same Problem by Different Names: Unifying Regression Dilution and Regression to the Mean 2026-05-11T20:04:13Z Regression to the Mean and Regression Dilution are often viewed as unrelated issues in the clinical and ecological literatures. In reality, they are different names for the same problem: measurement error in an independent variable that biases the perceived relationship between two factors. This study unifies these traditions by comparing specialized clinical tools, like the Berry correction, with standard structural estimators such as Major Axis and Reduced Major Axis regression. Using an analytical framework, we evaluate how these methods perform across various noise levels and sample sizes. 
Our results show that the Berry method is a specialized tool designed for clinical scenarios where a 1:1 relationship is expected. However, applying it to ecological trade-offs with negative slopes can lead to severe errors. We provide maps of optimality to identify which estimator most accurately recovers the true biological signal under different conditions. By reconciling these disparate methods, we offer a principled guide for researchers to choose the correct tool based on their data's noise profile rather than their disciplinary tradition. 2026-05-11T20:04:13Z José F. Fontanari Mauro Santos http://arxiv.org/abs/2605.10856v1 Improving search efficiency via adaptive acquisition function selection in discrete black-box optimization 2026-05-11T17:03:01Z In discrete-variable black-box optimization, the number of candidate solutions grows combinatorially, while each evaluation is often expensive. Therefore, it is important to identify promising solutions efficiently within a limited number of trials. Bayesian Optimization of Combinatorial Structures (BOCS), an existing parametric method, works effectively when only a small amount of data is available. However, as the number of observations increases, BOCS tends to repeatedly propose points that have already been evaluated, which leads to search stagnation. A random-point addition strategy has been proposed to address this issue when an evaluated point is proposed, but it cannot sufficiently exploit information from promising data obtained so far. In this study, we propose a hybrid method that uses BOCS as the main search framework and generates alternative unevaluated points using a Gaussian process only when search stagnation is detected. In the Gaussian-process-based component, multiple Lower Confidence Bound (LCB) acquisition functions are adaptively selected to dynamically control the balance between exploitation and exploration. 
Numerical experiments using fully connected Quadratic Unconstrained Binary Optimization (QUBO) and Higher-order Unconstrained Binary Optimization (HUBO) as black-box functions show that the proposed method finds solutions with better objective values than the conventional random-point addition method in both settings. Additional analyses show that its effectiveness comes from selecting points that promote search progress within Hamming-distance neighborhoods, rather than simply adding low-energy points near promising solutions. Experiments with sparse surrogate models for quantum annealer applications further suggest the importance of retaining near-fully connected representational capacity. 2026-05-11T17:03:01Z Reo Shikanai Masayuki Ohzeki http://arxiv.org/abs/2602.07165v2 PoissonRatioUQ: An R package for band ratio uncertainty quantification 2026-05-11T15:09:34Z We introduce an R package for Bayesian modeling and uncertainty quantification for problems involving count ratios. The modeling relies on the assumption that the quantity of interest is the ratio of Poisson means rather than the ratio of counts. We provide multiple different options for retrieval of this quantity for problems with and without spatial information included. Some added capability for uncertainty quantification for problems of the form $Z=(mT+z_0)^{p}$, where $Z$ is the intensity ratio and $T$ the quantity of interest, is included. 2026-02-06T20:06:52Z Description of the R package in https://github.com/mfleduc/PoissonRatioUQ. 
Contains some updated information about new functions that have been added Matthew LeDuc Tomoko Matsuo http://arxiv.org/abs/2603.20423v4 From the Stochastic Embedding Sufficiency Theorem to a Superspace Diffusion Framework 2026-05-11T11:26:59Z A generalisation of Takens' delay-coordinate embedding theorem to stochastic systems, the Stochastic Embedding Sufficiency Theorem, is an inverse methodology enabling non-parametric recovery of both drift and diffusion fields from scalar time series without prior assumptions about the governing physics. A blind protocol using only time series data is applied to nine domains: classical mechanics, statistical mechanics, nuclear physics, quantum mechanics, chemical kinetics, electromagnetism, relativistic quantum mechanics, quantum harmonic oscillator dynamics, and quantum electrodynamics. Fundamental constants (the Boltzmann constant, the Planck constant, the speed of light, the Fano factor, and the Van Kampen scaling exponent) emerge in both drift and diffusion channels without prior specification. The recovered diffusion coefficients, viewed across domains, constitute an empirical pattern, the $σ$-continuum, in which $k_B$, $\hbar$, and $c$ play structurally distinct roles. The Gravitational Diffusion Theorem, derived from the fluctuation-dissipation theorem, massless mode structure of linearised gravity, and gravitational self-coupling via the equivalence principle, determines the gravitational diffusion coefficient as one Planck length per square root of Planck time. Four canonical axioms formalise the framework, within which the noise character, drift, covariance operator, and fluctuation amplitude are uniquely determined by theorem, yielding the superspace diffusion hypothesis: $\mathrm{d}g_{ij} = \mathcal{D}_{ij}[g]\,\mathrm{d}τ+ \ell_P\,\mathrm{d}W_{ij}$ where all coefficients are non-parametric, first-principles consequences of the axioms. 
An implication of the hypothesis is that coarse-graining of the superspace Fokker-Planck equation via Mori-Zwanzig projection yields predictions for galactic-scale gravitational acceleration testable against kinematic data. 2026-03-20T18:48:24Z Carolina Garcia Lucía Perea Durán Agnese Venezia Alex Conradie http://arxiv.org/abs/2605.10333v1 BB plot: A Tool for Accurate Model Selection Using Bayes factors 2026-05-11T10:35:00Z A common task in physics and astronomy is studying which of the competing hypotheses the data prefer. This is usually done by computing the Bayes factor between the two hypotheses, and either interpreting it in terms of the posterior odds or as a ranking statistic for a frequentist p-value test. Here we describe a relationship between the Bayes factor and its distributions under the two competing hypotheses, called the Bayes factor-Bayes factor (BB) relationship, expressed as a diagnostic plot. Using examples from gravitational wave (GW) astronomy, we demonstrate how the BB plot can validate the accuracy of Bayes factor calculations. The BB relationship may also be useful for estimating background distributions of the Bayes factor at low computational cost, even analytically in some cases. We apply this technique in the context of wave-optics lensing of GWs, extrapolating the background distribution from GWTC4 to put a rough bound of $\lesssim 4.1 σ$ on the statistical significance of GW231123. 2026-05-11T10:35:00Z 15 pages, 8 figures Ankur Barsode
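The BB plot abstract does not spell out the relationship it uses. Assuming it refers to the standard likelihood-ratio identity p(B | H1) = B p(B | H0), a quick numerical check is possible on a toy problem. The sketch below assumes a single datum x with H0: x ~ N(0, 1) and H1: x ~ N(mu, 1), for which ln B = mu x - mu^2/2 is Gaussian under both hypotheses. The model and function names are illustrative choices, not taken from the paper.

```python
import math

def log_bf_density(log_b, mu, under_h1):
    """Density of ln B for the toy Gaussian model.

    ln B ~ N(+mu^2/2, mu^2) under H1 and N(-mu^2/2, mu^2) under H0.
    """
    mean = 0.5 * mu * mu if under_h1 else -0.5 * mu * mu
    var = mu * mu
    return math.exp(-(log_b - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def density_ratio(log_b, mu):
    """Ratio p(ln B | H1) / p(ln B | H0); the Jacobian to B cancels."""
    return log_bf_density(log_b, mu, True) / log_bf_density(log_b, mu, False)

for log_b in (-2.0, -0.5, 0.0, 1.0, 2.5):
    print(log_b, density_ratio(log_b, mu=1.5), math.exp(log_b))
```

If the identity holds, the two printed columns agree, so plotting the empirical density ratio against B on log axes should follow the diagonal; departures would indicate a miscalculated Bayes factor.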