https://arxiv.org/api/bJeJODwVKhbyr2EwUUWSTgm9z90
2026-05-15T23:42:41Z

http://arxiv.org/abs/2512.11948v2
Data-driven modeling of multivariate stochastic trajectories -- Application to water waves
2026-05-14T08:54:26Z
A data-driven methodology is proposed to model the distribution of multivariate stochastic trajectories from an observed sample. As a first step, each trajectory in the sample is reduced to a vector of features by means of Functional Principal Component Analysis. Next, the joint distribution of features is modeled using (i) a non-parametric vine copula approach for the bulk of the distribution, and (ii) the conditional modeling framework of Heffernan and Tawn (2004) for the multivariate tail. The method is applied to the modeling of water waves. The dataset used is the DeRisk database, which consists of numerical simulations of water waves. The analysis is restricted to the portion of the wave period between the free-surface zero-upcrossing and the wave crest. The kinematic variables considered are the free-surface slope, the normal component of the fluid velocity at the free surface, and the vertical Lagrangian acceleration of the fluid at the free surface. The stochastic trajectories of these three variables are modeled jointly. The vertical Lagrangian acceleration of the fluid is employed to enforce a wave-breaking filter in the stochastic model. The number of hyperparameters in the stochastic framework is reduced to three, and a stepwise calibration strategy is proposed for their adjustment. The capabilities of the model are illustrated by predicting the distributions of selected response variables and by generating synthetic trajectories.
2025-12-12T18:46:37Z
26 pages, 5 figures
Romain Hascoët

http://arxiv.org/abs/2605.14131v1
Double Metric Learning for Building Directed Graphs with Chain Connections for the ATLAS ITk Detector
2026-05-13T21:31:28Z
Graph construction is an essential step in Graph Neural Network (GNN) based tracking pipelines.
The goal of graph construction is to build a graph that contains only the defined true edge connections between nodes (detector hits). A promising approach to graph construction is Metric Learning, where a node representation in an embedding space is learned, and nodes are connected according to their distance in the embedding space. The loss function for the metric learning in this case is a contrastive loss that encourages true pairs of nodes to be close to each other and pushes false pairs of nodes apart. This approach creates a conflict in the learning objective for hopping connections when a true edge is defined as a chain connection in a particle track. To address this conflict, we propose a "Double Metric Learning" approach, where two node representations are learned. A directed graph can then be constructed based on the distance between the two representations, taken from the two nodes respectively. We test this idea with the ATLAS ITk detector at the HL-LHC using the ATLAS ITk simulation and show better graph construction performance, particularly for particles with high transverse momentum, compared to the Simple Metric Learning approach. We also show that Double Metric Learning is able to accurately predict edge direction.
2026-05-13T21:31:28Z
7 pages, 5 figures
Proceedings of the CTD 2025, PROC-CTD2025-071
Jay Chan

http://arxiv.org/abs/2605.13627v1
SINAPSE: A lightweight deep learning framework for accurate and explainable neutron-$γ$ discrimination
2026-05-13T14:53:47Z
Traditionally, neutron-$γ$ discrimination in organic scintillators relies on techniques such as time-of-flight (ToF) selection and pulse-shape discrimination (PSD). However, particle identification through graphical cuts remains challenging in the low-charge regime due to poor signal-to-noise ratios (SNR).
In this work, we propose SINAPSE, a lightweight deep learning framework for accurate and explainable neutron-$γ$ discrimination in the low-charge regime. The framework employs a dual-branch architecture that combines a 1-dimensional convolutional autoencoder for waveform denoising with a classifier for particle identification. Random augmentations are applied to high-SNR waveforms to simulate low-charge conditions, enabling robust extrapolation into regimes where conventional PSD labels are unreliable. We show that SINAPSE achieves superior denoising performance compared to conventional digital signal processing techniques, and outputs well-calibrated probabilities, consistent with traditional graphical cuts. Finally, we apply SHAP (SHapley Additive exPlanations) values to show that model decisions are driven by physically meaningful pulse-shape features, confirming consistency with established PSD principles.
2026-05-13T14:53:47Z
13 pages, 13 figures
Thomas Carreau, Adrien Matta, Owen Syrett, Benoît Mauss, David Etasse, Audrey Chatillon, Cyril Lenain, Pierre Morfouace, Julien Taieb, David Regnier, Patrick Copp, Matthew Devlin, Charlène Surault, Jason Surbrook

http://arxiv.org/abs/2602.13094v2
A Quantum Reservoir Computing Approach to Quantum Stock Movement Forecasting in Quantum-Invested Markets
2026-05-13T13:54:41Z
We present a quantum reservoir computing (QRC) framework based on a small-scale quantum system comprising at most six interacting qubits, designed for nonlinear financial time-series forecasting. We apply the model to predict future daily closing trading volumes of 20 quantum-sector publicly traded companies over the period from April 11, 2020, to April 11, 2025, as well as minute-by-minute trading volumes during out-of-market hours on July 7, 2025. Our analysis identifies optimal reservoir parameters that yield stock trend (up/down) classification accuracies exceeding $86\%$.
Importantly, the QRC model is platform-agnostic and can be realized across diverse physical implementations of qubits, including superconducting circuits and trapped ions. These results demonstrate the expressive power and robustness of small-scale quantum reservoirs for modeling complex temporal correlations in financial data, highlighting their potential applicability to real-world forecasting tasks on near-term quantum hardware.
2026-02-13T17:00:03Z
16 pages, 9 figures
Wendy Otieno, Alexandre Zagoskin, Alexander G. Balanov, Juan Totero Gongora, Sergey E. Savel'ev

http://arxiv.org/abs/2509.19929v4
Geometric Autoencoder Priors for Bayesian Inversion: Learn First Observe Later
2026-05-13T12:33:14Z
Uncertainty Quantification (UQ) is paramount for inference in engineering. A common inference task is to recover full-field information of physical systems from a small number of noisy observations, a usually highly ill-posed problem. Sharing information from multiple distinct yet related physical systems can alleviate this ill-posedness. Critically, engineering systems often have complicated variable geometries prohibiting the use of standard multi-system Bayesian UQ. In this work, we introduce Geometric Autoencoders for Bayesian Inversion (GABI), a framework for learning geometry-aware generative models of physical responses that serve as highly informative geometry-conditioned priors for Bayesian inversion. Following a "learn first, observe later" paradigm, GABI distills information from large datasets of systems with varying geometries, without requiring knowledge of governing PDEs, boundary conditions, or observation processes, into a rich latent prior. At inference time, this prior is seamlessly combined with the likelihood of a specific observation process, yielding a geometry-adapted posterior distribution. Our proposed framework is architecture-agnostic.
A creative use of Approximate Bayesian Computation (ABC) sampling yields an efficient implementation that utilizes modern GPU hardware. We test our method on: steady-state heat over rectangular domains; Reynolds-Averaged Navier-Stokes (RANS) flow around airfoils; Helmholtz resonance and source localization on 3D car bodies; RANS airflow over terrain. We find: the predictive accuracy to be comparable to deterministic supervised learning approaches in the restricted setting where supervised learning is applicable; UQ to be well calibrated and robust on challenging problems with complex geometries.
2025-09-24T09:38:11Z
Arnaud Vadeboncoeur, Gregory Duthé, Mark Girolami, Eleni Chatzi

http://arxiv.org/abs/2605.12405v1
An analytical approach to calculating stationary PDFs for reflected random walks with an application to BESS-based ramp-rate control
2026-05-12T17:03:06Z
A Wiener-Hopf-type integral equation for the stationary PDF of a reflected random walk is derived rigorously based on modern probability theory, and an application to battery energy storage systems (BESS), specifically the sizing of the inverter, is discussed in depth. The methodological steps include the construction of a Markov kernel, the derivation of a Fredholm integral equation of the second kind for the PDF of the BESS power, and an analytical solution of the equation based on a Neumann series. The analytical results were compared against numerical solutions obtained with the Nyström method, as well as against the results of an algorithmic simulation using simulated input time series. The use of truncated versions of the analytic solution allows for the construction of simplified design rules for the power systems practitioner. General insights into inverter sizing criteria of storage systems for ramp-rate control of variable renewable energy (VRE) sources such as wind and solar are provided.
2026-05-12T17:03:06Z
43 pages, 6 figures
Carlos Colchero, Diego Jiménez-Arreguín, Álvaro Herrera, Jorge E. Pérez-García, Oliver Probst

http://arxiv.org/abs/2605.12007v1
A geometry-aligned multi-fidelity framework for uncertainty quantification of wildfire spread
2026-05-12T11:55:52Z
Forward propagation of input uncertainties in physics-based wildfire models is computationally prohibitive, limiting the use of high-fidelity simulators in risk assessment workflows. This work introduces a geometry-aligned bi-fidelity surrogate framework that addresses the convection-dominated nature of wildfire spread by mapping low- and high-fidelity solution snapshots onto a common reference domain prior to basis selection and reconstruction. Unlike conventional bi-fidelity schemes, which combine spatially shifted snapshots and thus suffer from oscillations and excess basis requirements near sharp fronts, the proposed mapping aligns the dominant front geometry through per-variable shift/stretch transforms in 1D and an activity indicator-based affine alignment in 2D, so that reduced bases compare physically corresponding structures rather than displaced ones. Building on the ADfiRe physics-based simulator, we demonstrate the method on 1D and 2D test cases in which low- and high-fidelity models differ in mesh resolution and physical completeness. Across both settings, the geometry-aligned surrogate reproduces full-field temperature and fuel composition with substantially lower error than its unmapped counterpart, eliminates Gibbs-type oscillations near steep gradients, and recovers high-fidelity probability density functions for key quantities of interest (e.g., maximum temperature, evaporated moisture, and burned area). After offline training, online predictions are roughly three orders of magnitude cheaper than direct high-fidelity evaluation, making the framework a practical building block for many-query uncertainty quantification once the offline cost is amortized over enough queries.
We discuss the conditions under which the geometric alignment is most effective, its limitations for non-convex or topologically complex fronts, and the path toward validation against real data.
2026-05-12T11:55:52Z
23 pages, 15 figures
Konstantinos Vogiatzoglou, Costas Papadimitriou, Vasilis Bontozoglou, Petros Koumoutsakos, Han Gao

http://arxiv.org/abs/2602.22776v3
Optimization-based Unfolding in High-Energy Physics
2026-05-12T07:45:24Z
In experimental High-Energy Physics, unfolding refers to the problem of estimating the underlying distribution of a physical observable from detector-level data, in the presence of statistical fluctuations and systematic uncertainties. Starting from its reformulation as a regularized quadratic optimization problem, we develop a framework to address unfolding using both classical and quantum-compatible methods. In particular, we derive a Quadratic Unconstrained Binary Optimization (QUBO) representation of the unfolding objective, allowing direct implementation on quantum annealing and hybrid quantum-classical solvers. The proposed approach is implemented in QUnfold, an open-source Python package integrating classical mixed-integer solvers and D-Wave's hybrid quantum solver. We benchmark the method against widely used unfolding techniques in RooUnfold, including response Matrix Inversion, Iterative Bayesian Unfolding, and Singular Value Decomposition unfolding, using synthetic data with controlled distortion effects. Our results demonstrate that the optimization-based approach achieves competitive reconstruction accuracy across multiple distributions while naturally accommodating regularization within the objective function.
This work establishes a unified optimization perspective on unfolding and provides a practical pathway for exploring quantum-enhanced methods in experimental HEP data analysis.
2026-02-26T09:11:34Z
Simone Gasperini, Gianluca Bianco, Marco Lorusso, Carla Rieger, Michele Grossi

http://arxiv.org/abs/2605.11637v1
Computed Tomography Reconstruction Algorithm Using Markov Random Field Model
2026-05-12T06:58:44Z
X-ray computed tomography (CT) reveals the internal structure of materials non-destructively from a tilt series of projected images. Filtered back projection (FBP) is a widely adopted reconstruction algorithm in CT owing to its small computational cost. Under low-dose or sparse-view conditions, however, FBP often amplifies noise, severely degrading the reconstructed images. In this study, we evaluated the performance of a Bayesian CT reconstruction algorithm based on the Markov random field model under such adverse conditions. Through simulations, we demonstrated that the proposed algorithm shows higher reconstruction performance than FBP under both low-dose and sparse-view conditions. The hyperparameters are estimated by minimizing the Bayesian free energy, enabling adaptive reconstruction that reflects the noise characteristics of the observed projection data. These results suggest that the proposed algorithm can broaden the applicability of CT to dose-sensitive applications and time-constrained measurements, where only limited observed projection data are available.
2026-05-12T06:58:44Z
17 pages, 7 figures
Taiga Shimomiya, Taichi Kusumi, Yuichi Yokoyama, Masayuki Uesugi, Akihisa Takeuchi, Yuki Sada, Hayaru Shouno, Masato Okada

http://arxiv.org/abs/2605.11359v1
CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing
2026-05-12T00:24:30Z
Scientific data processing often requires task-specific algorithms or AI models, creating a barrier for domain scientists who need to analyze their data but may not have extensive computing or image-processing expertise.
This barrier is especially pronounced when data are noisy, have a high dynamic range, are sparsely labeled, or are only loosely specified. We introduce CVEvolve, an autonomous agentic harness with a zero-code interface for scientific data-processing algorithm discovery. CVEvolve combines a multi-round search strategy with tools for code execution, evaluation implementation, history management, holdout testing, and optional inspection of scientific data and visual outputs. The search alternates between discovery and improvement actions, and uses lineage-aware stochastic candidate sampling to balance exploration and exploitation. We demonstrate CVEvolve on x-ray fluorescence microscopy image registration, Bragg peak detection, and high-energy diffraction microscopy image segmentation. Across these tasks, CVEvolve discovers algorithms that improve over baseline methods, while holdout test tracking helps identify candidates that generalize better than later over-optimized alternatives. These results show that zero-code, autonomous LLM-powered algorithm development can help domain scientists turn unstructured scientific image data into practical algorithms and downstream scientific discoveries.
2026-05-12T00:24:30Z
Ming Du, Xiangyu Yin, Yanqi Luo, Dishant Beniwal, Songyuan Tang, Hemant Sharma, Mathew J. Cherukara

http://arxiv.org/abs/2605.11197v1
The Same Problem by Different Names: Unifying Regression Dilution and Regression to the Mean
2026-05-11T20:04:13Z
Regression to the Mean and Regression Dilution are often viewed as unrelated issues in the clinical and ecological literatures. In reality, they are different names for the same problem: measurement error in an independent variable that biases the perceived relationship between two factors. This study unifies these traditions by comparing specialized clinical tools, like the Berry correction, with standard structural estimators such as Major Axis and Reduced Major Axis regression.
Using an analytical framework, we evaluate how these methods perform across various noise levels and sample sizes. Our results show that the Berry method is a specialized tool designed for clinical scenarios where a 1:1 relationship is expected. However, applying it to ecological trade-offs with negative slopes can lead to severe errors. We provide maps of optimality to identify which estimator most accurately recovers the true biological signal under different conditions. By reconciling these disparate methods, we offer a principled guide for researchers to choose the correct tool based on their data's noise profile rather than their disciplinary tradition.
2026-05-11T20:04:13Z
José F. Fontanari, Mauro Santos

http://arxiv.org/abs/2605.10856v1
Improving search efficiency via adaptive acquisition function selection in discrete black-box optimization
2026-05-11T17:03:01Z
In discrete-variable black-box optimization, the number of candidate solutions grows combinatorially, while each evaluation is often expensive. Therefore, it is important to identify promising solutions efficiently within a limited number of trials. Bayesian Optimization of Combinatorial Structures (BOCS), an existing parametric method, works effectively when only a small amount of data is available. However, as the number of observations increases, BOCS tends to repeatedly propose points that have already been evaluated, which leads to search stagnation. A random-point addition strategy has been proposed to address this issue when an evaluated point is proposed, but it cannot sufficiently exploit information from promising data obtained so far. In this study, we propose a hybrid method that uses BOCS as the main search framework and generates alternative unevaluated points using a Gaussian process only when search stagnation is detected.
In the Gaussian-process-based component, multiple Lower Confidence Bound (LCB) acquisition functions are adaptively selected to dynamically control the balance between exploitation and exploration. Numerical experiments using fully connected Quadratic Unconstrained Binary Optimization (QUBO) and Higher-order Unconstrained Binary Optimization (HUBO) as black-box functions show that the proposed method finds solutions with better objective values than the conventional random-point addition method in both settings. Additional analyses show that its effectiveness comes from selecting points that promote search progress within Hamming-distance neighborhoods, rather than simply adding low-energy points near promising solutions. Experiments with sparse surrogate models for quantum annealer applications further suggest the importance of retaining near-fully connected representational capacity.
2026-05-11T17:03:01Z
Reo Shikanai, Masayuki Ohzeki

http://arxiv.org/abs/2602.07165v2
PoissonRatioUQ: An R package for band ratio uncertainty quantification
2026-05-11T15:09:34Z
We introduce an R package for Bayesian modeling and uncertainty quantification for problems involving count ratios. The modeling relies on the assumption that the quantity of interest is the ratio of Poisson means rather than the ratio of counts. We provide multiple different options for retrieval of this quantity for problems with and without spatial information included. Some added capability for uncertainty quantification for problems of the form $Z=(mT+z_0)^{p}$, where $Z$ is the intensity ratio and $T$ the quantity of interest, is included.
2026-02-06T20:06:52Z
Description of the R package in https://github.com/mfleduc/PoissonRatioUQ. Contains some updated information about new functions that have been added
Matthew LeDuc, Tomoko Matsuo

http://arxiv.org/abs/2603.20423v4
From the Stochastic Embedding Sufficiency Theorem to a Superspace Diffusion Framework
2026-05-11T11:26:59Z
A generalisation of Takens' delay-coordinate embedding theorem to stochastic systems, the Stochastic Embedding Sufficiency Theorem, is an inverse methodology enabling non-parametric recovery of both drift and diffusion fields from scalar time series without prior assumptions about the governing physics.
A blind protocol using only time series data is applied to nine domains: classical mechanics, statistical mechanics, nuclear physics, quantum mechanics, chemical kinetics, electromagnetism, relativistic quantum mechanics, quantum harmonic oscillator dynamics, and quantum electrodynamics. Fundamental constants (the Boltzmann constant, the Planck constant, the speed of light, the Fano factor, and the Van Kampen scaling exponent) emerge in both drift and diffusion channels without prior specification. The recovered diffusion coefficients, viewed across domains, constitute an empirical pattern, the $σ$-continuum, in which $k_B$, $\hbar$, and $c$ play structurally distinct roles. The Gravitational Diffusion Theorem, derived from the fluctuation-dissipation theorem, massless mode structure of linearised gravity, and gravitational self-coupling via the equivalence principle, determines the gravitational diffusion coefficient as one Planck length per square root of Planck time.
Four canonical axioms formalise the framework, within which the noise character, drift, covariance operator, and fluctuation amplitude are uniquely determined by theorem, yielding the superspace diffusion hypothesis:
$\mathrm{d}g_{ij} = \mathcal{D}_{ij}[g]\,\mathrm{d}τ + \ell_P\,\mathrm{d}W_{ij}$
where all coefficients are non-parametric, first-principles consequences of the axioms. An implication of the hypothesis is that coarse-graining of the superspace Fokker-Planck equation via Mori-Zwanzig projection yields predictions for galactic-scale gravitational acceleration testable against kinematic data.
2026-03-20T18:48:24Z
Carolina Garcia, Lucía Perea Durán, Agnese Venezia, Alex Conradie

http://arxiv.org/abs/2605.10333v1
BB plot: A Tool for Accurate Model Selection Using Bayes factors
2026-05-11T10:35:00Z
A common task in physics and astronomy is studying which of the competing hypotheses the data prefer. This is usually done by computing the Bayes factor between the two hypotheses, and either interpreting it in terms of the posterior odds or as a ranking statistic for a frequentist p-value test. Here we describe a relationship between the Bayes factor and its distributions under the two competing hypotheses, called the Bayes factor-Bayes factor (BB) relationship, expressed as a diagnostic plot. Using examples from gravitational wave (GW) astronomy, we demonstrate how the BB plot can validate the accuracy of Bayes factor calculations. The BB relationship may also be useful for estimating background distributions of the Bayes factor at low computational cost, even analytically in some cases. We apply this technique in the context of wave-optics lensing of GWs, extrapolating the background distribution from GWTC4 to put a rough bound of $\lesssim 4.1 σ$ on the statistical significance of GW231123.
2026-05-11T10:35:00Z
15 pages, 8 figures
Ankur Barsode
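
The last abstract above names a relationship between the Bayes factor and its distributions under the two hypotheses but does not state it. A minimal sketch follows, assuming it refers to the standard identity p(B | H1) / p(B | H0) = B (the same ratio holds for densities of ln B, since the Jacobian cancels). That identification is an assumption, as are the toy Gaussian model (H0: x ~ N(0, 1), H1: x ~ N(mu, 1)) and every function name below; none of it is taken from the paper.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def log_bayes_factor(x, mu):
    """ln B for the toy problem H1: x ~ N(mu, 1) versus H0: x ~ N(0, 1)."""
    return mu * x - 0.5 * mu * mu


def log_b_density(log_b, mu, hypothesis):
    """Density of ln B under 'H0' or 'H1' in the toy problem.

    ln B is linear in x, so it is Gaussian with standard deviation |mu|
    and mean +mu^2/2 under H1 or -mu^2/2 under H0.
    """
    mean = 0.5 * mu * mu if hypothesis == "H1" else -0.5 * mu * mu
    return normal_pdf(log_b, mean, abs(mu))


def bb_ratio(log_b, mu):
    """Ratio of the H1 and H0 densities of ln B, evaluated at log_b."""
    return log_b_density(log_b, mu, "H1") / log_b_density(log_b, mu, "H0")


if __name__ == "__main__":
    mu = 1.5  # hypothetical signal strength, chosen only for illustration
    for log_b in (-2.0, -0.5, 0.0, 1.0, 3.0):
        # The two printed columns should agree if the identity holds.
        print(log_b, bb_ratio(log_b, mu), math.exp(log_b))
```

In a diagnostic setting, one would presumably replace the analytic densities with histograms of Bayes factors computed on simulated H0 and H1 data; a systematic departure of the histogram ratio from B would then point to an inaccurate Bayes factor calculation.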