StochTree: BART-based modeling in R and Python

2026-02-20T06:35:11Z

stochtree is a C++ library for Bayesian tree ensemble models such as BART and Bayesian Causal Forests (BCF), as well as user-specified variations. Unlike previous BART packages, stochtree provides bindings to both R and Python for full interoperability. stochtree boasts a more comprehensive range of models relative to previous packages, including heteroskedastic forests, random effects, and treed linear models. Additionally, stochtree offers flexible handling of model fits: the ability to save model fits, reinitialize models from existing fits (facilitating improved model initialization heuristics), and pass fits between R and Python. On both platforms, stochtree exposes lower-level functionality, allowing users to specify models incorporating Bayesian tree ensembles without needing to modify C++ code. We illustrate the use of stochtree in three settings: i) straightfoward applications of existing models such as BART and BCF, ii) models that include more sophisticated components like heteroskedasticity and leaf-wise regression models, and iii) as a component of custom MCMC routines to fit nonstandard tree ensemble models.

Preconditioned Robust Neural Posterior Estimation for Misspecified Simulators

2026-02-20T05:32:35Z

Simulation-based inference (SBI) enables parameter estimation for complex stochastic models with intractable likelihoods when model simulation is feasible. Neural posterior estimation (NPE) is a popular SBI approach that often achieves accurate inference with far fewer simulations than classical approaches. But in practice, neural approaches can be unreliable for two reasons: incompatible data summaries arising from model misspecification yield unreliable posteriors due to extrapolation, and prior-predictive draws can produce extreme summaries that lead to difficulties in obtaining an accurate posterior for the observed data of interest. Existing preconditioning schemes target well-specified settings, and their behaviour under misspecification remains unexplored. We study preconditioning under misspecification and propose preconditioned robust neural posterior estimation, which computes data-dependent weights that focus training near the observed summaries and fits a robust neural posterior approximation. We also introduce a forest-proximity preconditioning approach that uses tree-based proximity scores to down-weight outlying simulations and concentrate computation around the observed dataset. Across two synthetic examples and one real example with incompatible summaries and extreme prior-predictive behaviour, we demonstrate that preconditioning combined with robust NPE increases stability and improves accuracy, calibration, and posterior-predictive fit over standard baseline methods.

A variational framework for modal estimation

2026-02-20T03:18:47Z

We approach multivariate mode estimation through Gibbs distributions and introduce GERVE (Gibbs-measure Entropy-Regularised Variational Estimation), a likelihood-free framework that approximates Gibbs measures directly from samples by maximizing an entropy-regularised variational objective with natural-gradient updates. GERVE brings together kernel density estimation, mean-shift, variational inference, and annealing in a single platform for mode estimation. It fits Gaussian mixtures that concentrate on high-density regions and yields cluster assignments from responsibilities, with reduced sensitivity to the chosen number of components. We provide theory in two regimes: as the Gibbs temperature approaches zero, mixture components converge to population modes; at fixed temperature, maximisers of the empirical objective exist, are consistent, and are asymptotically normal. We also propose a bootstrap procedure for per-mode confidence ellipses and stability scores. Simulation and real-data studies show accurate mode recovery and emergent clustering, robust to mixture overspecification. GERVE is a practical likelihood-free approach when the number of modes or groups is unknown and full density estimation is impractical.

Model Error Embedding with Orthogonal Gaussian Processes

2026-02-20T01:01:34Z

Computational models of complex physical systems often rely on simplifying assumptions which inevitably introduce model error, with consequent predictive errors. Given data on model observables, the estimation of parameterized model-error representations, along with other model parameters, would be ideally done while separating the contributions of each of the two sets of parameters, in order to ensure meaningful stand-alone model predictions. This work builds an embedded model error framework using a weight-space representation of Gaussian processes (GPs) to flexibly capture model-error spatiotemporal correlations and enable inference with GP-embedding in non-linear models. To disambiguate model and model-error/bias parameters, we extend an existing orthogonal GP method to the embedded model-error setting and derive appropriate orthogonality constraints. To address the increased dimensionality introduced by the GP representation, we employ the likelihood-informed subspace method. The construction is demonstrated on linear and non-linear examples, where it effectively corrects model predictions to match data trends. Extrapolation beyond the training data recovers the prior predictive distribution, and the orthogonality constraints lead to meaningful stand-alone model predictions and nearly uncorrelated posteriors between model and model-error parameters.

Nested Sampling with Slice-within-Gibbs: Efficient Evidence Calculation for Hierarchical Bayesian Models

2026-02-19T14:43:59Z

We present Nested Sampling with Slice-within-Gibbs (NS-SwiG), an algorithm for Bayesian inference and evidence estimation in high-dimensional models whose likelihood admits a factorization, such as hierarchical Bayesian models. We construct a procedure to sample from the likelihood-constrained prior using a Slice-within-Gibbs kernel: an outer update of hyperparameters followed by inner block updates over local parameters. A likelihood-budget decomposition caches per-block contributions so that each local update checks feasibility in constant time rather than recomputing the global constraint at linearly growing cost. This reduces the per-replacement cost from quadratic to linear in the number of groups, and the overall algorithmic complexity from cubic to quadratic under standard assumptions. The decomposition extends naturally beyond independent observations, and we demonstrate this on Markov-structured latent variables. We evaluate NS-SwiG on challenging benchmarks, demonstrating scalability to thousands of dimensions and accurate evidence estimates even on posterior geometries where state-of-the-art gradient-based samplers can struggle.

Diffusive Scaling Limits of Forward Event-Chain Monte Carlo: Provably Efficient Exploration with Partial Refreshment

2026-02-19T05:19:21Z

Piecewise deterministic Markov process samplers are attractive alternatives to Metropolis--Hastings algorithms. A central design question is how to incorporate partial velocity refreshment to ensure ergodicity without injecting excessive noise. Forward Event-Chain Monte Carlo (FECMC) is a generalization of the Bouncy Particle Sampler (BPS) that addresses this issue through a stochastic reflection mechanism, thereby reducing reliance on global refreshment moves. Despite promising empirical performance, its theoretical efficiency remains largely unexplored. We develop a high-dimensional scaling analysis for standard Gaussian targets and prove that the negative log-density (or potential) process of FECMC converges to an Ornstein--Uhlenbeck diffusion, under the same scaling as BPS. We derive closed-form expressions for the limiting diffusion coefficients of both methods by analyzing their associated radial momentum processes and solving the corresponding Poisson equations. These expressions yield a sharp efficiency comparison: the diffusion coefficient of FECMC is strictly larger than that of optimally tuned BPS, and the optimum for FECMC is attained at zero global refreshment. Specifically, they imply an approximately eightfold increase in effective sample size per event over optimal BPS. Numerical experiments confirm the predicted diffusion coefficients and show that the resulting efficiency gains remain substantial for a range of non-Gaussian targets. Finally, as an application of these results, we propose an asymptotic variance estimator for Piecewise deterministic Markov processes that becomes increasingly efficient in high dimensions by extracting information from the velocity variable.

Large-sample analysis of cost functionals for inference under the coalescent

2026-02-18T17:46:09Z

The coalescent is a foundational model of latent genealogical trees under neutral evolution, but suffers from intractable sampling probabilities. Methods for approximating these sampling probabilities either introduce bias or fail to scale to large sample sizes. We show that a class of cost functionals of the coalescent with recurrent mutation and a finite number of alleles converge to tractable processes in the infinite-sample limit. A particular choice of costs yields insight about importance sampling methods, which are a classical tool for coalescent sampling probability approximation. These insights reveal that the behaviour of coalescent importance sampling algorithms differs markedly from standard sequential importance samplers, with or without resampling. We conduct a simulation study to verify that our asymptotics are accurate for algorithms with finite (and moderate) sample sizes. Our results constitute the first theoretical description of large-sample importance sampling algorithms for the coalescent, provide heuristics for the a priori optimisation of computational effort, and identify settings where resampling is harmful for algorithm performance. We observe strikingly different behaviour for importance sampling methods under the infinite sites model of mutation, which is regarded as a good and more tractable approximation of finite alleles mutation in most respects.

Whittle-Matérn Fields with Variable Smoothness

2026-02-18T16:29:10Z

We introduce and analyze a nonlocal generalization of Whittle--Matérn Gaussian fields in which the smoothness parameter varies in space through the fractional order, $s=s(x)\in[\underline{s}\,,\bar{s}]\subset(0,1)$. The model is defined via an integral-form operator whose kernel is constructed from the modified Bessel function of the second kind and whose local singularity is governed by the symmetric exponent $β(x,y)=(s(x)+s(y))/2$. This variable-order nonlocal formulation departs from the classical constant-order pseudodifferential setting and raises new analytic and numerical challenges. We develop a novel variational framework adapted to the kernel, prove existence and uniqueness of weak solutions on truncated bounded domains, and derive Sobolev regularity of the Gaussian (spectral) solution controlled by the minimal local order: realizations lie in $H^r(G)$ for every $r<2\underline{s}-\tfrac{d}{2}$ (here $H^r(G)$ denotes the Sobolev space on the bounded domain $G$), hence in $L_2(G)$ when $\underline s>d/4$. We also present a finite-element sampling method for the integral model, derive error estimates, and provide numerical experiments in one dimension that illustrate the impact of spatially varying smoothness on samples covariances. Computational aspects and directions for scalable implementations are discussed.

Block Empirical Likelihood Inference for Longitudinal Generalized Partially Linear Single-Index Models

2026-02-18T10:47:59Z

Generalized partially linear single-index models (GPLSIMs) provide a flexible and interpretable semiparametric framework for longitudinal outcomes by combining a low-dimensional parametric component with a nonparametric index component. For repeated measurements, valid inference is challenging because within-subject correlation induces nuisance parameters and variance estimation can be unstable in semiparametric settings. We propose a profile estimating-equation approach based on spline approximation of the unknown link function and construct a subject-level block empirical likelihood (BEL) for joint inference on the parametric coefficients and the single-index direction. The resulting BEL ratio statistic enjoys a Wilks-type chi-square limit, yielding likelihood-free confidence regions without explicit sandwich variance estimation. We also discuss practical implementation, including constrained optimization for the index parameter, working-correlation choices, and bootstrap-based confidence bands for the nonparametric component. Simulation studies and an application to the epilepsy longitudinal study illustrate the finite-sample performance.

HAL-MLE Log-Splines Density Estimation (Part I: Univariate)

2026-02-18T08:16:05Z

We study nonparametric maximum likelihood estimation of probability densities under a total variation (TV) type penalty, sectional variation norm (also named as Hardy-Krause variation). TV regularization has a long history in regression and density estimation, including results on $L^2$ and KL divergence convergence rates. Here, we revisit this task using the Highly Adaptive Lasso (HAL) framework. We formulate a HAL-based maximum likelihood estimator (HAL-MLE) using the log-spline link function from \citet{kooperberg1992logspline}, and show that in the univariate setting the bounded sectional variation norm assumption underlying HAL coincides with the classical bounded TV assumption. This equivalence directly connects HAL-MLE to existing TV-penalized approaches such as local adaptive splines \citep{mammen1997locally}. We establish three new theoretical results: (i) the univariate HAL-MLE is asymptotically linear, (ii) it admits pointwise asymptotic normality, and (iii) it achieves uniform convergence at rate $n^{-(k+1)/(2k+3)}$ up to logarithmic factors for the smoothness order $k \geq 1$. These results extend existing results from \citet{van2017uniform}, which previously guaranteed only uniform consistency without rates when $k=0$. We will include the uniform convergence for general dimension $d$ in the follow-up work of this paper. The intention of this paper is to provide a unified framework for the TV-penalized density estimation methods, and to connect the HAL-MLE to the existing TV-penalized methods in the univariate case, despite that the general HAL-MLE is defined for multivariate cases.

Quantifying and Attributing Submodel Uncertainty in Stochastic Simulation Models and Digital Twins

2026-02-18T00:06:39Z

Stochastic simulation is widely used to study complex systems composed of various interconnected subprocesses, such as input processes, routing and control logic, optimization routines, and data-driven decision modules. In practice, these subprocesses may be inherently unknown or too computationally intensive to directly embed in the simulation model. Replacing these elements with estimated or learned approximations introduces a form of epistemic uncertainty that we refer to as submodel uncertainty. This paper investigates how submodel uncertainty affects the estimation of system performance metrics. We develop a framework for quantifying submodel uncertainty in stochastic simulation models and extend the framework to digital-twin settings, where simulation experiments are repeatedly conducted with the model initialized from observed system states. Building on approaches from input uncertainty analysis, we leverage bootstrapping and Bayesian model averaging to construct quantile-based confidence or credible intervals for key performance indicators. We propose a tree-based method that decomposes total output variability and attributes uncertainty to individual submodels in the form of importance scores. The proposed framework is model-agnostic and accommodates both parametric and nonparametric submodels under frequentist and Bayesian modeling paradigms. A synthetic numerical experiment and a more realistic digital-twin simulation of a contact center illustrate the importance of understanding how and how much individual submodels contribute to overall uncertainty.

Solving Fredholm Integral Equations of the Second Kind via Wasserstein Gradient Flows

2026-02-17T22:39:37Z

Motivated by a recent method for approximate solution of Fredholm equations of the first kind, we develop a corresponding method for a class of Fredholm equations of the \emph{second kind}. In particular, we consider the class of equations for which the solution is a probability measure. The approach centres around specifying a functional whose gradient flow admits a minimizer corresponding to a regularized version of the solution of the underlying equation and using a mean-field particle system to approximately simulate that flow. Theoretical support for the method is presented, along with some illustrative numerical results.

MARLEM: A Multi-Agent Reinforcement Learning Simulation Framework for Implicit Cooperation in Decentralized Local Energy Markets

2026-02-17T22:22:45Z

This paper introduces a novel, open-source MARL simulation framework for studying implicit cooperation in LEMs, modeled as a decentralized partially observable Markov decision process and implemented as a Gymnasium environment for MARL. Our framework features a modular market platform with plug-and-play clearing mechanisms, physically constrained agent models (including battery storage), a realistic grid network, and a comprehensive analytics suite to evaluate emergent coordination. The main contribution is a novel method to foster implicit cooperation, where agents' observations and rewards are enhanced with system-level key performance indicators to enable them to independently learn strategies that benefit the entire system and aim for collectively beneficial outcomes without explicit communication. Through representative case studies (available in a dedicated GitHub repository in https://github.com/salazarna/marlem, we show the framework's ability to analyze how different market configurations (such as varying storage deployment) impact system performance. This illustrates its potential to facilitate emergent coordination, improve market efficiency, and strengthen grid stability. The proposed simulation framework is a flexible, extensible, and reproducible tool for researchers and practitioners to design, test, and validate strategies for future intelligent, decentralized energy systems.

Lattice Random Walk Discretisations of Stochastic Differential Equations

2026-02-17T17:59:01Z

We introduce a lattice random walk discretisation scheme for stochastic differential equations (SDEs) that samples binary or ternary increments at each step, suppressing complex drift and diffusion computations to simple 1 or 2 bit random values. This approach is a significant departure from traditional floating point discretisations and offers several advantages; including compatibility with stochastic computing architectures that avoid floating-point arithmetic in place of directly manipulating the underlying probability distribution of a bitstream, elimination of Gaussian sampling requirements, robustness to quantisation errors, and handling of non-Lipschitz drifts. We prove weak convergence and demonstrate the advantages through experiments on various SDEs, including state-of-the-art diffusion models.

FARS: Factor Augmented Regression Scenarios in R

2026-02-17T11:02:15Z

In the context of macroeconomic/financial time series, the FARS package provides a comprehensive framework in R for the construction of conditional densities of the variable of interest based on the factor-augmented quantile regressions (FA-QRs) methodology, with the factors extracted from multi-level dynamic factor models (ML-DFMs) with potential overlapping group-specific factors. Furthermore, the package also allows the construction of measures of risk as well as modeling and designing economic scenarios based on the conditional densities. In particular, the package enables users to: (i) extract global and group-specific factors using a flexible multi-level factor structure; (ii) compute asymptotically valid confidence regions for the estimated factors, accounting for uncertainty in the factor loadings; (iii) obtain estimates of the parameters of the FA-QRs together with their standard deviations; (iv) recover full predictive conditional densities from estimated quantiles; (v) obtain risk measures based on extreme quantiles of the conditional densities; and (vi) estimate the conditional density and the corresponding extreme quantiles when the factors are stressed.