https://arxiv.org/api/TqU4Lu8GzE5ngFhwH8jDx/oyFA0
2026-04-04T10:45:17Z
3487425515

http://arxiv.org/abs/2603.20518v2
Multi-dimensional Mortality (MDMx): Sex-Age-Specific Model Life Tables, Fitting, Prediction from Summary Mortality Indicators, and Forecasting
2026-03-25T13:52:28Z
Demographers rely on a variety of tools and methods to work with mortality schedules - model life tables, fitting methods, summary-indicator prediction, and forecasting - largely developed independently and not providing structurally coherent sex-specific outputs. The multi-dimensional mortality model (MDMx) unifies all four within one Tucker tensor decomposition demonstrated using the Human Mortality Database (HMD). Period life tables from the HMD are organized as a four-way tensor of logit(1qx) indexed by sex, age, country, and year. Shared factor matrices for sex and age make every output schedule structurally coherent by construction. From this decomposition four capabilities emerge: model life tables via clustering and smooth within-regime trajectories; life table fitting via a three-stage algorithm with Bayes-factor disruption detection; summary-indicator prediction mapping child or adult mortality to complete schedules, reformulating SVD-Comp in tensor coordinates; and forecasting via a damped local linear trend Kalman filter on PCA-reduced core matrices with hierarchical drift.
2026-03-20T21:35:35Z
Samuel J. Clark

http://arxiv.org/abs/2603.24299v1
Mortality Forecasting as a Flow Field in Tucker Decomposition Space
2026-03-25T13:38:25Z
Mortality forecasting methods in the Lee-Carter tradition extrapolate temporal components via time-series models, producing forecasts that can systematically underpredict life expectancy at long horizons and require ad hoc adjustments for sex coherence. We reframe forecasting as integrating a flow field through the low-dimensional score space of a Tucker tensor decomposition of multi-population mortality data from the Human Mortality Database.
PCA reduction of the effective core matrices reveals that the mortality transition is essentially a one-dimensional flow: a scalar speed function advances the level, trajectory functions supply the structural scores, and the Tucker reconstruction produces complete sex-specific mortality schedules at each horizon. An era-weighted speed function adapts to contemporary dynamics at each forecast origin, and empirically calibrated convergence rates control relaxation from country-specific to canonical mortality structure. The system is evaluated by leave-country-out cross-validation with a 50-year horizon against Lee-Carter and Hyndman-Ullah benchmarks.
2026-03-25T13:38:25Z
Samuel J. Clark

http://arxiv.org/abs/2603.24276v1
Rethinking Individual Risk and Aggregation in Survival Analysis: A Latent Mechanism Framework
2026-03-25T13:08:57Z
Survival analysis provides a well-established framework for modeling time-to-event data, with hazard and survival functions formally defined as population-level quantities. In applied work, however, these quantities are often interpreted as representing individual-level risk, despite the absence of a clear generative account linking individual risk mechanisms to observed survival data. This paper develops a latent hazard framework that makes this relationship explicit by modeling event times as arising from unobserved, individual-specific hazard mechanisms and viewing population-level survival quantities as aggregates over heterogeneous mechanisms. Within this framework, we show that individual hazard trajectories are not identifiable from survival data under partial information. More generally, the conditional distribution of latent hazard mechanisms given covariates is structurally non-identifiable, even when population-level survival functions are fully known. This non-identifiability arises from the aggregation inherent in survival data and persists independently of model flexibility or estimation strategy.
Finally, we show that classical survival models can be systematically reinterpreted according to how they handle this unresolved conditional mechanism distribution. This paper provides a unified framework for understanding heterogeneity, identifiability, and interpretation in survival analysis, and clarifies how population-level survival models should be interpreted when individual risk mechanisms are only partially observed, thereby establishing explicit information constraints for principled modeling and inference.
2026-03-25T13:08:57Z
Xijia Liu

http://arxiv.org/abs/2603.24263v1
XT-REM: A Two-Component Model for Meta-Analysis of Extreme Event Proportions
2026-03-25T12:55:30Z
In this paper, we introduce a novel model for the meta-analysis of proportions that integrates the standard random-effects model (REM) with an extreme value theory (EVT)-based component. The proposed model, named XT-REM (Extreme-Tail Random Effects Model), extends the classical REM framework by explicitly accounting for extreme proportions through a partial segmentation of the study set based on a predefined threshold. While the majority of proportions are modeled using REM, proportions exceeding the threshold are analyzed using the Generalized Pareto Distribution (GPD).
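As a rough illustration of the threshold-based segmentation described above, the following Python sketch splits a set of study proportions at a predefined threshold and fits a Generalized Pareto Distribution to the excesses by the method of moments. The function names, the example data, the threshold value, and the choice of a method-of-moments fit are all assumptions made for illustration; the abstract does not say how XT-REM estimates its tail component.

```python
from statistics import mean, variance


def split_at_threshold(proportions, threshold):
    """Separate study proportions into a central bulk and tail excesses."""
    bulk = [p for p in proportions if p <= threshold]
    # Tail studies are represented by their excess over the threshold.
    excesses = [p - threshold for p in proportions if p > threshold]
    return bulk, excesses


def gpd_method_of_moments(excesses):
    """Method-of-moments estimates (shape, scale) for a GPD.

    Based on mean = scale / (1 - shape) and
    variance = scale**2 / ((1 - shape)**2 * (1 - 2 * shape)),
    which hold when shape < 1/2.
    """
    m = mean(excesses)
    s2 = variance(excesses)  # sample variance; needs at least two values
    ratio = m * m / s2
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)
    return shape, scale


# Hypothetical event proportions from ten studies (illustrative only).
proportions = [0.02, 0.03, 0.05, 0.04, 0.06, 0.03, 0.05, 0.20, 0.30, 0.40]
bulk, excesses = split_at_threshold(proportions, threshold=0.10)
shape, scale = gpd_method_of_moments(excesses)
```

In this sketch the bulk list would go to an ordinary random-effects analysis, while the fitted shape and scale describe only the studies above the threshold.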
This formulation enables a dual interpretation of meta-analytic results, providing both an aggregate estimate for the central bulk of studies and a separate characterization of tail behavior. The XT-REM framework accommodates heteroskedastic variance structures inherent to proportion data, while preserving identifiability and consistency.
Using real-world data on immunotherapy-related adverse events, together with simulation studies calibrated to empirical settings, we demonstrate that XT-REM yields a comparable central estimate while enabling a more explicit assessment of tail behavior, including high-percentile extreme proportions. Compared with the classical REM, XT-REM achieves higher log-likelihood values and lower AIC in the considered scenarios, indicating a better fit within this modeling framework.
In summary, XT-REM offers a theoretically grounded and practically useful extension of random-effects meta-analysis, with potential relevance to clinical contexts in which extreme event rates carry important implications for risk assessment.
2026-03-25T12:55:30Z
Under preparation for submission to Computational Statistics & Data Analysis. Includes simulation study and real-world application of the XT-REM model
Jovana Dedeić, Jelena Ivetić, Srđan Milićević, Katarina Vidojević, Marija Delić

http://arxiv.org/abs/2402.08151v4
Perturbative adaptive importance sampling for Bayesian LOO cross-validation
2026-03-25T12:53:33Z
Importance sampling (IS) is an efficient stand-in for model refitting in performing leave-one-out (LOO) cross-validation (CV) on a Bayesian model. IS inverts the Bayesian update for a single observation by reweighting posterior samples. The so-called importance weights have high variance -- we resolve this issue through adaptation by transformation. We observe that removing a single observation perturbs the posterior by $\mathcal{O}(1/n)$, motivating bijective transformations of the form $T(θ)=θ+ h Q(θ)$ for $0<h\ll 1$. We introduce several such transformations: partial moment matching, which generalizes prior work on affine moment-matching with a tunable step size; log-likelihood descent, which partially inverts the Bayesian update for an observation; and gradient flow steps that minimize the KL divergence or IS variance. The gradient flow and likelihood descent transformations require Jacobian determinants, which are available via auto-differentiation; we additionally derive closed-form expressions for logistic regression and shallow ReLU networks.
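The reweighting idea above can be made concrete with a small sketch of plain, untransformed importance-sampling LOO. Given the likelihood of the held-out observation at each posterior draw, the importance weights are proportional to the reciprocal likelihood, and the LOO predictive density reduces to a harmonic mean of the likelihood values. This is the standard baseline estimator, not the adaptive transformations the abstract introduces; the function names and the toy numbers are assumptions for illustration.

```python
def loo_importance_weights(likelihoods):
    """Self-normalised weights w_s proportional to 1 / p(y_i | theta_s)."""
    raw = [1.0 / lik for lik in likelihoods]
    total = sum(raw)
    return [w / total for w in raw]


def loo_predictive_density(likelihoods):
    """IS estimate of p(y_i | y_{-i}): the harmonic mean of the likelihoods."""
    weights = loo_importance_weights(likelihoods)
    return sum(w * lik for w, lik in zip(weights, likelihoods))


def effective_sample_size(weights):
    """Kish effective sample size; small values signal high-variance weights."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical likelihood values p(y_i | theta_s) at four posterior draws.
likelihoods = [0.5, 0.25, 0.5, 0.25]
weights = loo_importance_weights(likelihoods)
density = loo_predictive_density(likelihoods)
ess = effective_sample_size(weights)
```

A low effective sample size relative to the number of draws is the usual symptom of the weight-variance problem that motivates adapting the sampling distribution.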
We tested the methodology on classification ($n\ll p$), count regression (Poisson and zero-inflated negative binomial), and survival analysis problems, finding that no single transformation dominates but their combination nearly eliminates the need to refit.
2024-02-13T01:03:39Z
Submitted
Joshua C Chang, Xiangting Li, Tianyi Su, Shixin Xu, Hao-Ren Yao, Julia Porcino, Carson Chow

http://arxiv.org/abs/2510.26485v3
Discovering Causal Relationships Between Time Series With Spatial Structure
2026-03-25T12:20:49Z
Causal discovery is the subfield of causal inference concerned with estimating the structure of cause-and-effect relationships in a system of interrelated variables, as opposed to quantifying the strength or describing the form of causal effects. As interest in causal discovery builds in fields such as ecology, public health, and environmental sciences where data are regularly collected with spatial and temporal structures, approaches must evolve to manage autocorrelation and complex confounding. As it stands, the few proposed causal discovery algorithms for spatiotemporal data require summarizing across locations, ignore spatial autocorrelation, and/or scale poorly to high dimensions. Here, we introduce our developing framework that extends time-series causal discovery to systems with spatial structure, building upon work on causal discovery across contexts and methods for handling spatial confounding in causal effect estimation. We close by outlining remaining gaps in the literature and directions for future research.
2025-10-30T13:38:08Z
10 pages, 2 figures
Rebecca F. Supple (School of Mathematics and Statistics, University of St Andrews; Centre for Research into Ecological and Environmental Modelling, University of St Andrews)
Hannah Worthington (School of Mathematics and Statistics, University of St Andrews; Centre for Research into Ecological and Environmental Modelling, University of St Andrews)
Ben Swallow (School of Mathematics and Statistics, University of St Andrews; Centre for Research into Ecological and Environmental Modelling, University of St Andrews)

http://arxiv.org/abs/2603.24227v1
Identification of NMF by choosing maximum-volume basis vectors
2026-03-25T12:00:53Z
In nonnegative matrix factorization (NMF), minimum-volume-constrained NMF is a widely used framework for identifying the solution of NMF by making basis vectors as similar as possible. This typically induces sparsity in the coefficient matrix, with each row containing zero entries. Consequently, minimum-volume-constrained NMF may fail for highly mixed data, where such sparsity does not hold. Moreover, the estimated basis vectors in minimum-volume-constrained NMF may be difficult to interpret as they may be mixtures of the ground truth basis vectors. To address these limitations, in this paper we propose a new NMF framework, called maximum-volume-constrained NMF, which makes the basis vectors as distinct as possible. We further establish an identifiability theorem for maximum-volume-constrained NMF and provide an algorithm to estimate it. Experimental results demonstrate the effectiveness of the proposed method.
2026-03-25T12:00:53Z
Qianqian Qi, Zhongming Chen, Peter G. M. van der Heijden

http://arxiv.org/abs/2603.24201v1
A Bayesian Dynamic Latent Space Model for Weighted Networks
2026-03-25T11:23:00Z
A new dynamic latent space eigenmodel (LSM) is proposed for weighted temporal networks. The model accommodates integer-valued weights, excess of zeros, time-varying node positions (features), and time-varying network sparsity.
The latent positions evolve according to a vector autoregressive process that accounts for lagged and contemporaneous dependence across nodes and features, a characteristic neglected in the LSM literature. A Bayesian approach is used to address two of the primary sources of inference intractability in dynamic LSMs: latent feature estimation and the choice of latent space dimension. We employ an efficient auxiliary-mixture sampler that performs data augmentation and supports conditionally conjugate prior distributions. A point-process representation of the network weights and the finite-dimensional distribution of the latent processes are used to derive a multi-move sampler in which each feature trajectory is drawn in a single block, without recursions. This sampling strategy is new to the network literature and can significantly reduce computational time while improving chain mixing. To avoid trans-dimensional samplers, a Laplace approximation of the partial marginal likelihood is used to design a partially collapsed Gibbs sampler. Overall, our procedure is general, as it can be easily adapted to static and dynamic settings, as well as to other discrete or continuous weight distributions.
2026-03-25T11:23:00Z
Roberto Casarin, Matteo Iacopini, Antonio Peruzzi

http://arxiv.org/abs/2501.06844v3
REML implementations of kernel-based genomic prediction models for genotype x environment x management interactions
2026-03-25T10:39:29Z
High-throughput pheno-, geno-, and envirotyping allows characterization of plant genotypes and the trials they are evaluated in, producing different types of data. These different data modalities can be integrated into statistical or machine learning models for genomic prediction in several ways. One commonly used approach within the analysis of multi-environment trial data in plant breeding is to create linear or nonlinear kernels which are subsequently used in linear mixed models (LMMs) to model genotype by environment (G$\times$E) interactions.
Current implementations of these kernel-based LMMs present a number of opportunities in terms of methodological extensions. Here we show how these models can be implemented in standard software, allowing direct restricted maximum likelihood (REML) estimation of all parameters. We also further extend the models by combining the kernels with unstructured covariance matrices for three-way interactions in genotype by environment by management (G$\times$E$\times$M) datasets, while simultaneously allowing for environment-specific genetic variances. We show how the models incorporating nonlinear kernels and heterogeneous variances maximize the amount of genetic variance captured by environmental covariables and perform best in prediction settings. We discuss the opportunities regarding models with multiple kernels or kernels obtained after environmental feature selection, as well as the similarities to models regressing phenotypes on latent and observed environmental covariables. Finally, we discuss the flexibility provided by our implementation in terms of modeling complex plant breeding datasets, allowing for straightforward integration of phenomics, enviromics, and genomics.
2025-01-12T15:30:11Z
Killian A. C. Melsen, Salvador Gezan, Daniel J. Tolhurst, Fred A. van Eeuwijk, Carel F. W. Peeters

http://arxiv.org/abs/2603.24122v1
Scoring Rules with Normalized Upper Order Statistics for Tail Inference
2026-03-25T09:33:47Z
This paper proposes a scoring-rule-based method for ranking predictive distributions in the Fréchet domain that is able to distinguish between different tail indices. The approach is built on normalized order statistics and exploits proper scoring rules to compare tail limit distributions in a distributional framework, with direct relevance for insurance claim-severity tails. On the theoretical side, consistency and asymptotic normality for empirical tail scores based on normalized upper order statistics are obtained through residual estimation theory.
Simulation results demonstrate that the scoring-rule-based approach is capable of discriminating between different tail behaviors in finite samples and that trends in the scaling have only a minor impact on stability. We further show that optimizing scoring rules (equivalently, minimizing the associated loss form) yields consistent tail-index estimators and that the classical Hill estimator arises as a special case. The performance of the proposed method is investigated and compared with the Hill estimator across a range of tail indices. Lastly, we analyze an automobile claim-severity data set to demonstrate how scoring rules can be used to rank predictive models based on tail predictions in actuarial settings.
2026-03-25T09:33:47Z
8 figures, 1 table
Martin Bladt, Christoffer Øhlenschlæger

http://arxiv.org/abs/2603.24108v1
Aitchison Geometry on the Simplex for Uncertainty Quantification in Bayesian Hyperspectral Image Unmixing
2026-03-25T09:14:04Z
Most algorithms for hyperspectral image unmixing produce point estimates of fractional abundances of the materials to be separated. However, in the absence of reliable ground truth, the ability to perform abundance uncertainty quantification (UQ) should be an important feature of algorithms, e.g. to evaluate how hard the unmixing problem is and how much the results should be trusted. The usual modeling assumptions in Bayesian models for unmixing rely heavily on the Euclidean geometry of the simplex and typically disregard spatial information. In addition, to our knowledge, abundance UQ is close to nonexistent. In this paper, we propose to leverage Aitchison geometry from the compositional data analysis literature to provide practitioners with alternative tools for modeling prior abundance distributions. In particular, we show how to design simplex-valued Gaussian Process priors using this geometry.
Then we link Aitchison geometry to constrained sampling algorithms in the literature, and propose UQ diagnostics that comply with the constraints on abundance vectors. We illustrate these concepts on real and simulated data.
2026-03-25T09:14:04Z
Hector Blondel, Lucas Drumetz, Thierry Chonavel

http://arxiv.org/abs/2405.17669v4
Bayesian Nonparametrics for Principal Stratification with Continuous Post-Treatment Variables
2026-03-25T09:07:39Z
Principal stratification provides a causal inference framework for investigating treatment effects in the presence of a post-treatment variable. Principal strata play a key role in characterizing the treatment effect by identifying groups of units with the same or similar values for the potential post-treatment variable at all treatment levels. The literature has focused mainly on binary post-treatment variables. Few papers have considered continuous post-treatment variables. In the presence of a continuous post-treatment variable, a challenge is how to identify and characterize meaningful coarsenings of the latent principal strata that lead to interpretable principal causal effects. This paper introduces the Confounders-Aware SHared atoms BAyesian mixture (CASBAH), a novel approach for principal stratification with binary treatment and continuous post-treatment variables. CASBAH leverages Bayesian nonparametric priors with an innovative hierarchical structure for the potential post-treatment outcomes that overcomes some of the limitations of previous works. Specifically, the novel features of our method allow for (i) identifying coarsened principal strata through a data-adaptive approach and (ii) providing a comprehensive quantification of the uncertainty surrounding stratum membership. Through Monte Carlo simulations, we show that the proposed methodology performs better than existing methods in characterizing the principal strata and estimating principal effects of the treatment.
Finally, CASBAH is applied to a case study in which we estimate the causal effects of US national air quality regulations on pollution levels and health outcomes.
2024-05-27T21:47:41Z
Dafne Zorzetto, Antonio Canale, Fabrizia Mealli, Francesca Dominici, Falco J. Bargagli-Stoffi

http://arxiv.org/abs/2603.24632v1
Estimation in moderately misspecified models
2026-03-25T08:25:20Z
Suppose data are fitted to some parametric model but that the true model happens to be one with an additional parameter. When a parameter is to be estimated one can use likelihood estimation in the wider model or in the narrow model. Including the extra parameter in the model means less bias but larger sampling variability. Two basic questions are addressed in this article. (i) Just how much misspecification can the narrow model tolerate? In the context of a large-sample moderate misspecification framework we find a surprisingly simple, sharp, and general answer. There is effectively a 'tolerance radius' around a given narrow model, inside of which narrow estimation is more precise than wide estimation for all estimands. This is computed in a selection of examples that also demonstrate the degree of robustness of important standard methods against moderate incorrectness of the model under which they are optimal. (ii) Are there other estimators that work well both under narrow and wide circumstances? We discuss several possibilities and propose some new procedures. All methods are compared in a broad large-sample performance study.
2026-03-25T08:25:20Z
31 pages, 1 figure.
Statistical Research Report, Department of Mathematics, University of Oslo, from May 1993, but arXiv'd March 2026
Nils Lid Hjort

http://arxiv.org/abs/2603.24041v1
Minimal Sufficient Representations for Self-interpretable Deep Neural Networks
2026-03-25T07:51:21Z
Deep neural networks (DNNs) achieve remarkable predictive performance but remain difficult to interpret, largely due to overparameterization that obscures the minimal structure required for interpretation. Here we introduce DeepIn, a self-interpretable neural network framework that adaptively identifies and learns the minimal representation necessary for preserving the full expressive capacity of standard DNNs. We show that DeepIn can correctly identify the minimal representation dimension, select relevant variables, and recover the minimal sufficient network architecture for prediction. The resulting estimator achieves optimal non-asymptotic error rates that adapt to the learned minimal dimension, demonstrating that recovering minimal sufficient structure fundamentally improves generalization error. Building on these guarantees, we further develop hypothesis testing procedures for both selected variables and learned representations, bridging deep representation learning with formal statistical inference. Across biomedical and vision benchmarks, DeepIn improves both predictive accuracy and interpretability, reducing error by up to 30% on real-world datasets while automatically uncovering human-interpretable discriminative patterns. Our results suggest that interpretability and statistical rigor can be embedded directly into deep architectures without sacrificing performance.
2026-03-25T07:51:21Z
Zhiyao Tan, Liu Li, Huazhen Lin

http://arxiv.org/abs/2603.24025v1
i-IF-Learn: Iterative Feature Selection and Unsupervised Learning for High-Dimensional Complex Data
2026-03-25T07:35:38Z
Unsupervised learning of high-dimensional data is challenging due to irrelevant or noisy features obscuring underlying structures.
It's common that only a few features, called the influential features, meaningfully define the clusters. Recovering these influential features is helpful in data interpretation and clustering. We propose i-IF-Learn, an iterative unsupervised framework that jointly performs feature selection and clustering. Our core innovation is an adaptive feature selection statistic that effectively combines pseudo-label supervision with unsupervised signals, dynamically adjusting based on intermediate label reliability to mitigate error propagation common in iterative frameworks. Leveraging low-dimensional embeddings (PCA or Laplacian eigenmaps) followed by $k$-means, i-IF-Learn simultaneously outputs an influential feature subset and clustering labels. Numerical experiments on gene microarray and single-cell RNA-seq datasets show that i-IF-Learn significantly surpasses classical and deep clustering baselines. Furthermore, using our selected influential features as preprocessing substantially enhances downstream deep models such as DeepCluster, UMAP, and VAE, highlighting the importance and effectiveness of targeted feature selection.
2026-03-25T07:35:38Z
28 pages, 5 figures, including appendix. Accepted at AISTATS
Chen Ma, Wanjie Wang, Shuhao Fan
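
The embedding-then-clustering step mentioned in the i-IF-Learn abstract above (a low-dimensional embedding followed by $k$-means) can be sketched in a few lines of plain Python. The sketch projects the data onto the first principal component by power iteration and then runs two-cluster $k$-means on the scores. It covers only this generic step, not the adaptive feature selection statistic or the iteration the abstract describes; the function names and the toy data are assumptions for illustration.

```python
import math


def first_principal_scores(rows, iterations=200):
    """Project centred rows onto the leading principal direction."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Unnormalised covariance matrix X^T X of the centred data.
    cov = [[sum(c[a] * c[b] for c in centred) for b in range(p)]
           for a in range(p)]
    v = [1.0] * p
    for _ in range(iterations):  # power iteration for the top eigenvector
        v = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return [sum(c[j] * v[j] for j in range(p)) for c in centred]


def two_means_1d(values, iterations=50):
    """Lloyd's algorithm with k = 2 on scalars, seeded at the min and max."""
    centres = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(x - centres[0]) <= abs(x - centres[1]) else 1
                  for x in values]
        for k in (0, 1):
            members = [x for x, lab in zip(values, labels) if lab == k]
            if members:
                centres[k] = sum(members) / len(members)
    return labels


# Hypothetical data: two groups separated along the first two features;
# the third feature carries no cluster information.
data = [
    [0.0, 0.1, 0.3], [0.2, 0.0, -0.2], [0.1, 0.2, 0.1],
    [5.0, 5.1, -0.1], [5.2, 4.9, 0.2], [4.9, 5.0, 0.0],
]
scores = first_principal_scores(data)
labels = two_means_1d(scores)
```

In this toy setting the uninformative third feature has little effect on the leading direction, but with many such features the embedding degrades, which is the situation feature selection is meant to address.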