The data-driven extreme value distribution: non-parametric tail estimation with a derived stability criterion

2026-06-10T15:01:58Z

Quantifying the likelihood of extreme events underpins risk assessment, yet classical Extreme Value Theory relies on asymptotic assumptions that fail in the data-sparse, non-stationary regimes practitioners increasingly face. We introduce the Data-Driven Extreme Value Distribution (DDEVD), a non-parametric estimator that aggregates all observations metastatistically and reconstructs the base distribution with a kernel, removing parametric tail assumptions. We derive its optimal bandwidth and prove a stability law $m < C\,n^{1+\gamma/2}$ relating reliable extrapolation to the extreme value index $\gamma$. In sub-hourly Alpine precipitation, DDEVD recovers stable 100-year return levels from single decades (calibration ratio $0.96$), departing from the full-record reference by over $50\,\%$ in fewer than one window in fifty, versus one in five for a GEV fit. In metallurgical micrographs, it matches a generalised extreme-value fit on the safety-relevant grain-size tail, where the standard log-normal over-predicts by $58\,\%$ at $1\,\mathrm{cm}^{2}$.
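
The metastatistical idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' DDEVD estimator. It smooths the distribution of all ordinary events with a Gaussian kernel, raises that distribution to the power of the number of events per year, and inverts it for a return level. The Silverman rule-of-thumb bandwidth, the function names, and the bisection solver are assumptions made for illustration; the paper derives its own optimal bandwidth. A Gaussian kernel also extrapolates only weakly beyond the largest observation.

```python
import math

def kernel_cdf(x, sample, bandwidth):
    """Gaussian-kernel-smoothed CDF of the ordinary events, evaluated at x."""
    return sum(
        0.5 * (1.0 + math.erf((x - s) / (bandwidth * math.sqrt(2.0))))
        for s in sample
    ) / len(sample)

def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth (an assumption, not the paper's optimal choice)."""
    n = len(sample)
    mean = sum(sample) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * std * n ** (-0.2)

def return_level(sample, events_per_year, return_period_years):
    """Solve F(x) ** events_per_year = 1 - 1/T for x by bisection."""
    h = silverman_bandwidth(sample)
    target = 1.0 - 1.0 / return_period_years
    lo, hi = min(sample) - 10.0 * h, max(sample) + 10.0 * h
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, sample, h) ** events_per_year < target:
            lo = mid   # annual-maximum CDF still below target: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `return_level(sample, events_per_year=50, return_period_years=100)` on a list of ordinary-event magnitudes returns the level whose estimated annual exceedance probability is 1/100 under these assumptions.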

Weibull-Stationary Stochastic Differential Equations for Conditional Long-Horizon Wind Power Forecasting

2026-06-10T13:54:42Z

We present a one-month-ahead conditional probabilistic framework for wind-power forecasting at ten-minute resolution. Monthly Weibull shape and scale parameters are estimated from serially dependent SCADA wind-speed data, corrected through a Godambe covariance, and forecast by a heteroskedastic Kalman filter on a bivariate VAR(1) state-space model. Conditional on the MMSE forecasted Weibull invariant law, we construct and compare three positive wind-speed SDE models: an Ornstein-Uhlenbeck-Weibull transform, a Fokker-Planck drift-first diffusion, and a Fokker-Planck diffusion-first model. The simulated wind-speed ensembles are mapped to power through a calibrated XGBoost power curve. Applied to January 2021 data from a Senvion MM92 turbine at Kelmarsh Wind Farm, the three SDE formulations are statistically indistinguishable in probabilistic accuracy, with mean CRPS values between 1.569 and 1.575 m/s. The diffusion-first model is therefore preferred on computational grounds, reducing runtime by about a factor of seven relative to the OU-Weibull model. In the power domain, the Wasserstein distance between simulated and observed distributions is 26.1-27.6 kW, below $1.4\%$ of rated capacity, while the monthly energy-yield bias is about $-7.3\%$ for the examined month. Exceedance-probability errors remain below 1.6 percentage points over the 0-1500 kW range and about 2.2 percentage points near rated power. These quantities provide decision-relevant probabilistic inputs for downstream operational problems, rather than completed reserve, storage, market, or fatigue-optimization decisions. Full marginalisation over the Kalman predictive law of the Weibull parameters is left as a natural extension.
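
One common way to obtain a positive process with a Weibull invariant law is to simulate a stationary Ornstein-Uhlenbeck process and push it through the Gaussian CDF and the Weibull quantile function. The sketch below shows that construction only; the paper's OU-Weibull transform may differ in detail. The mean-reversion rate `theta`, the step `dt`, and the function name are hypothetical.

```python
import math
import random

def simulate_ou_weibull(shape, scale, theta, dt, n_steps, rng):
    """Simulate a positive path whose stationary law is Weibull(shape, scale).

    A unit-variance stationary OU process X is advanced with its exact
    one-step transition, then mapped through V = F_W^{-1}(Phi(X)).
    """
    a = math.exp(-theta * dt)        # exact one-step autoregression coefficient
    s = math.sqrt(1.0 - a * a)       # innovation scale keeping Var(X) = 1
    x = rng.gauss(0.0, 1.0)          # start in the stationary law
    path = []
    for _ in range(n_steps):
        x = a * x + s * rng.gauss(0.0, 1.0)
        u = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))   # Phi(x)
        u = min(max(u, 1e-12), 1.0 - 1e-12)              # guard the logarithm
        path.append(scale * (-math.log(1.0 - u)) ** (1.0 / shape))
    return path
```

Because the map from X to V is monotone, the serial dependence of the OU process carries over to the wind-speed path while the marginal law stays Weibull.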

ChargeBD: Character-Aware Heterogeneous Agent Reasoning for Guided Engineering in Battery Development

2026-06-10T13:22:26Z

Redox-flow battery (RFB) research spans molecular design, electrolyte optimization, electrode and membrane materials, stack operation, system management, and safety analysis, making it a constrained, multi-scale, and multi-objective energy-storage R&D problem. Although large language models (LLMs) can support scientific knowledge integration and proposal generation, generic LLM reasoning remains insufficiently adaptive across innovation-oriented exploration, rule-based execution, mechanistic modeling, and system-level trade-offs. Here we introduce ChargeBD, a character-aware heterogeneous-agent reasoning framework for guided engineering in battery development. Starting from a 50-question RFB-specific task set, we construct a 500-question ESS-LLM Benchmark and define MBTI-inspired persona agents as structured cognitive-bias templates rather than psychometric instruments or representations of real personalities. DeepSeek-V3-Plus is selected as the shared base model, and 16 MBTI-inspired persona agents are evaluated to construct a persona capability matrix and a cognitive advantage matrix.

A latent class approach to assess the effects of dynamic adherence to polytherapy in heart failure patients

2026-06-10T12:59:49Z

Heart failure (HF) treatment relies heavily on pharmacotherapy, particularly combining multiple therapies as recommended by clinical guidelines. However, non-adherence to prescribed regimens remains a significant challenge, contributing to increased hospitalizations and poorer patient outcomes. This study introduces a novel methodological pipeline that integrates Latent Markov Models (LMM) with dynamic adherence modeling to evaluate adherence behaviors and their impact on HF rehospitalization. Using administrative healthcare data from Lombardy, Italy, we analyzed 6,818 patients hospitalized for HF between July and December 2020. Adherence was assessed monthly over a six-month observation period, and adherence profiles were linked to clinical outcomes using Cox regression. Seven latent behavioral profiles were identified, reflecting varying levels and trajectories of adherence. The findings revealed that higher adherence levels significantly reduced the risk of rehospitalization. Patients with consistently high adherence exhibited a 56% lower risk of HF rehospitalization compared to those with low adherence. Importantly, improving adherence during the observation period was associated with better survival probabilities, highlighting the potential benefits of timely interventions. Additionally, adherence behaviors were influenced by factors such as age, comorbidity burden, and hospitalization during the observation period. This study underscores the importance of dynamic and personalized strategies to monitor and enhance adherence to polytherapy. By linking adherence patterns to clinical outcomes, the proposed approach offers actionable insights for improving patient management and reducing the burden of HF on healthcare systems.

Hierarchical excitatory processes for modelling event-time data in the presence of exogenous stimuli

2026-06-10T07:49:24Z

We introduce the Hierarchical Excitatory Process (HEP), a flexible point process model for event-time data observed under repeated external stimuli. The proposed framework models the conditional intensity of a point process as a superposition of excitation effects induced by external stimuli, characterised by kernels with parameters dynamically evolving over time. This hierarchical construction enables modulation of excitation strength across repeated stimuli, providing an interpretable structure. We establish likelihood-based inference for the proposed model and embed HEP within a model-based clustering framework to identify latent groups sharing similar response dynamics. Simulation studies demonstrate the model's ability to recover evolving latent patterns, and an application to spike train recordings from the sea slug Aplysia pedal ganglion illustrates how HEPs are able to characterise stimulus-driven excitability of neurons across repeated stimulation under different experimental conditions.
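
A conditional intensity built as a superposition of stimulus-induced excitation effects can be sketched as follows. The exponential kernel shape, the per-stimulus `strengths`, and the constant `baseline` are assumptions for illustration; the abstract does not specify the kernel family or how its parameters evolve.

```python
import math

def intensity(t, baseline, stimulus_times, strengths, decay):
    """Conditional intensity at time t: a baseline rate plus one exponentially
    decaying excitation term per stimulus that occurred strictly before t.
    Letting `strengths` vary by stimulus mimics modulation across repeats."""
    total = baseline
    for s, alpha in zip(stimulus_times, strengths):
        if s < t:
            total += alpha * math.exp(-decay * (t - s))
    return total
```

For example, `intensity(2.0, 1.0, [1.0], [2.0], 1.0)` adds a single excitation term of size `2 * exp(-1)` to the baseline rate.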

Data-Driven Dynamic Assortment in Online Platforms: Learning about Two Sides

2026-06-10T06:30:43Z

We study a dynamic assortment problem on a two-sided service platform with incomplete information and heterogeneous customers in a discrete-time setting. In each period, a customer arrives seeking service, and the platform chooses an assortment of sellers to display. The customer then proposes a transaction to at most one seller in the assortment according to a multinomial logit choice model. After a fixed number of periods, sellers review the proposals they have received and each chooses at most one customer according to another multinomial logit choice model, after which the cycle repeats. A key challenge is that the platform does not know the choice-model parameters of either customers or sellers in advance. To our knowledge, this is the first study of a dynamic assortment problem in which both sides' choice parameters are unknown. We develop a data-driven algorithm that learns these parameters while optimizing the platform's objective over time. We evaluate performance using regret, which measures revenue loss relative to a clairvoyant benchmark that knows all parameters and customer arrivals in advance. We show that the algorithm's worst-case regret grows polylogarithmically over time, and we derive a matching lower bound, establishing its rate optimality.
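
For readers unfamiliar with the multinomial logit model, the following sketch computes the probability that a customer proposes to each displayed seller, with an outside option standing in for "no proposal". The utilities are hypothetical inputs, and normalising the outside option's utility to zero is a standard convention assumed here rather than taken from the paper.

```python
import math

def mnl_choice_probabilities(utilities):
    """Return (probability of choosing each displayed seller,
    probability of choosing nobody). The outside option has utility 0,
    so its weight in the denominator is exp(0) = 1."""
    weights = [math.exp(u) for u in utilities]
    denom = 1.0 + sum(weights)
    return [w / denom for w in weights], 1.0 / denom
```

Adding a seller to the assortment raises the denominator for everyone, which is why the choice of assortment changes each seller's chance of receiving a proposal.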

DeepRHP: A Hybrid Variational Autoencoder for Designing Random Heteropolymers as Protein Mimics

2026-06-10T04:28:51Z

Synthetic random heteropolymers (RHPs), consisting of a predefined set of monomers, offer an approach toward the design of protein-like materials. These RHPs, if designed appropriately, can mimic protein behavior and function. As such, there is a need for computational tools to efficiently guide RHP design. We bridge this gap by developing DeepRHP, a modified variational autoencoder (VAE) model under a semi-supervised framework. By equipping a classical VAE with an additional feature-based VAE, DeepRHP forces the latent space to capture structures of critical chemical features as well as individual RHP sequence patterns. In this sense, our method is versatile by allowing any relevant features to be incorporated in a hybrid manner. We demonstrate the effectiveness of DeepRHP by suggesting potential monomer compositions that stabilize membrane proteins (e.g. Aquaporin Z) in non-native environments and cross-validating our prediction with published results. The concordance between our model and true RHP function suggests strong potential in utilizing hybrid autoencoder architectures to guide RHP design for proteins and other biological compounds.

Hierarchical Probabilistic Conformal Prediction for Distributed Energy Resources Adoption

2026-06-10T03:16:57Z

The rapid growth of distributed energy resources (DERs) presents both opportunities and operational challenges for electric grid management. Accurately predicting DER adoption is critical for proactive infrastructure planning, but the inherent uncertainty and spatial disparity of DER growth complicate traditional forecasting approaches. Moreover, the hierarchical structure of distribution grids demands that predictions satisfy statistical guarantees at both the circuit and substation levels, a non-trivial requirement for reliable decision-making. In this paper, we propose a novel uncertainty quantification framework for DER adoption predictions that ensures validity across hierarchical grid structures. Leveraging a multivariate Hawkes process to model DER adoption dynamics and a tailored split conformal prediction algorithm, we introduce a new nonconformity score that preserves statistical guarantees under aggregation while maintaining prediction efficiency. We establish theoretical validity under mild conditions and demonstrate through empirical evaluation on customer-level solar panel installation data from Indianapolis, Indiana that our method consistently outperforms existing baselines in both predictive accuracy and uncertainty calibration.
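
The split conformal step can be sketched with the standard absolute-residual score. This is the textbook procedure, not the hierarchical nonconformity score proposed in the paper; the function name and inputs are assumptions.

```python
import math

def split_conformal_halfwidth(calibration_residuals, alpha):
    """Half-width q such that [prediction - q, prediction + q] has marginal
    coverage of at least 1 - alpha when calibration and test points are
    exchangeable."""
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))   # rank of the conformal quantile
    if k > n:
        return math.inf                       # too few calibration points
    return scores[k - 1]
```

The `(n + 1)` in the rank is what delivers the finite-sample guarantee; using the plain empirical quantile would under-cover slightly.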

Program Evaluation with Remotely Sensed Outcomes

2026-06-10T01:37:08Z

We study causal inference in experiments and quasi-experiments, where the economic outcome is imperfectly measured by a remotely sensed variable. The remotely sensed variable is low-cost, scalable, and predictive of the economic outcome in observational data; examples include satellite imagery and mobile phone activity. We model the remotely sensed variable as post-outcome: variation in the economic outcome causes variation in the remotely sensed variable. For example, changes in environmental quality cause changes in satellite imagery, not vice versa. Under this assumption, we propose a formula to nonparametrically identify the causal parameter by combining experimental and observational data. We develop a method for $n^{-1/2}$ inference that is robust to misspecification and that does not restrict the algorithms used to process remotely sensed variables.

Inferring Piece Value in Chess and Chess Variants

2026-06-10T01:12:30Z

We use logistic regression to estimate the value of the pieces in standard chess and several chess variants, namely Chess 960, Atomic chess, Antichess, and Horde chess. We perform our regressions on several years of data from Lichess, the free and open-source internet chess server. We use the published player ratings to control for the confounding effect of differential player skill. We adjust for the attenuation bias in regressions due to the noise in observed ratings. We find that major piece values, relative to the value of a pawn, are fairly consistent with historical valuation systems. However, we find a slightly higher value for bishops than for knights. We find that piece values are smaller, in absolute value, in Atomic chess and Antichess than in standard chess. We also present approximate values of the pieces to equalize odds when players of varying skill face off. We briefly consider self-play experiments using the Stockfish engine, which give a contrasting view of piece value.

Post Selection Estimation of Sharpe Ratios

2026-06-10T01:00:57Z

We consider the problem of estimating the true Sharpe ratio of an asset selected for having the highest observed in-sample Sharpe ratio among many assets. We discuss estimators based on the polyhedral lemma, James-Stein shrinkage, debiasing the expected maximum Sharpe ratio, thresholding, and empirical Bayes. We test these estimators in simulations, computing bias and root mean square error across different values of sample size, number of assets, and spread and shape of population Sharpe ratios. We also compute rank correlation of the estimators against the underlying quantity, simulating how these estimators might be used to compare or rank the output of different teams that perform this selection process. We find that the James-Stein estimator provides the best performance across many different realistic values of the relevant parameters, followed by the GMLEB estimator of Jiang and Zhang. These results are fairly robust to correlation of asset returns, with some caveats.
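
Shrinkage of observed Sharpe ratios can be sketched with the positive-part James-Stein estimator that shrinks toward the grand mean. The sampling variance of each per-period Sharpe ratio is approximated here by `1 / n_obs`, which is an assumption that holds roughly for small Sharpe ratios and independent returns; the paper's exact estimator may differ.

```python
def james_stein_sharpe(sharpes, n_obs):
    """Shrink per-asset Sharpe ratios toward their grand mean.

    Assumes every ratio was estimated from n_obs periods and has sampling
    variance of about 1 / n_obs (an approximation, not from the paper).
    """
    k = len(sharpes)
    if k < 4:
        raise ValueError("shrinkage toward the grand mean needs at least 4 assets")
    var = 1.0 / n_obs
    mean = sum(sharpes) / k
    ss = sum((s - mean) ** 2 for s in sharpes)
    # Positive-part factor: never shrink past the grand mean.
    factor = max(0.0, 1.0 - (k - 3) * var / ss) if ss > 0 else 0.0
    return [mean + factor * (s - mean) for s in sharpes]
```

The asset with the highest observed ratio is pulled down the most in absolute terms, which is the direction of correction that selection on the maximum calls for.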

Estimating Spatially-Smoothed Fiber Orientation Distribution from Diffusion-MRI Experiments

2026-06-10T00:23:49Z

Diffusion-weighted magnetic resonance imaging (D-MRI) is a noninvasive in vivo technique for probing the microstructural architecture of biological tissues. At each voxel, the fiber orientation distribution (FOD) characterizes local fiber configurations and orientations and is therefore a central object of estimation in D-MRI analysis. We propose the Nearest-Neighbor Adaptive Regression Model (NARM), a spatially adaptive framework for FOD estimation that performs weighted local likelihood estimation over nested spatial neighborhoods, where the weights jointly encode spatial proximity and similarity among neighboring FODs, measured by either the optimal transport or Hellinger distance. To prevent over-smoothing while preserving structural heterogeneity, we introduce a voxel-wise rescaling scheme and a data-driven stopping rule based on minimum nearest-neighbor dissimilarity. We further develop a configuration-aware strategy for selecting the similarity-smoothing parameter, allowing the smoothing strength to adapt to local fiber complexity. Simulation studies demonstrate that NARM improves FOD estimation accuracy relative to voxel-wise methods and the existing spatial smoothing approach PMARM. Application to test-retest data from the Human Connectome Project additionally shows that NARM yields more reproducible FOD estimates. Implementation details and scripts for the simulation and real data analyses are available at https://github.com/jie108/NARM
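
The Hellinger distance named above as one of the two similarity measures can be sketched for FODs that have been discretised onto a common grid of directions and normalised to sum to one. The discretisation and the function name are assumptions; the repository linked above holds the authors' implementation.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions on the same grid.
    Returns 0 for identical inputs and 1 for inputs with disjoint support."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same grid")
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))
```

Because the distance is bounded in [0, 1], it is easy to turn into a similarity weight for neighbouring voxels.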

How should covariates be handled in randomized trials? Empirical evidence from 50 trials and recommendations for practice

2026-06-09T20:54:46Z

Background and Objective: Covariate adjustment can improve precision and power in randomized clinical trials and is recommended by major regulatory agencies. However, there is limited empirical evidence on how different adjustment strategies perform across diverse real-world trials, leaving uncertainty about which methods and covariates should be prespecified in statistical analysis plans. We aim to address this gap and provide practical recommendations. Methods: We conducted a large-scale empirical study using individual-level data from 50 publicly available randomized trials (29,094 participants; 574 treatment-outcome comparisons). We compared commonly used covariate-adjusted estimators, including analysis of covariance, inverse-probability weighting, g-computation, and machine-learning-based approaches, combined with three covariate-selection strategies. Performance was evaluated using precision gains, changes in point estimates, computational reliability, and the probability that covariate adjustment altered statistical significance relative to an unadjusted analysis. Results: Covariate adjustment improved precision in most settings, with a median variance reduction of 13.3% for continuous outcomes and 4.6% for binary outcomes. Parsimonious regression approaches using a small prespecified set of prognostic covariates performed as well as or better than more complex methods, particularly in small to medium samples. Machine-learning-based estimators did not provide additional precision and were more prone to computational failure for binary outcomes. Conclusions: Across trials, parsimonious covariate adjustment provided consistent efficiency gains without introducing systematic bias. These findings support routine covariate adjustment in primary trial analyses. All curated datasets and analysis code are openly released to support future clinical research.
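
Analysis of covariance with a single prognostic covariate can be sketched in closed form: the treatment coefficient from an ordinary least-squares fit of the outcome on an intercept, the treatment indicator, and the covariate equals the difference in means corrected by the pooled within-arm slope. The function name and the single-covariate restriction are assumptions made to keep the example short.

```python
def ancova_effect(y, x, t):
    """Return (unadjusted, adjusted) treatment effect estimates.

    y: outcomes, x: one baseline covariate, t: 0/1 treatment indicators.
    The adjusted estimate is the OLS coefficient on t in y ~ 1 + t + x.
    """
    arms = {0: [], 1: []}
    for yi, xi, ti in zip(y, x, t):
        arms[ti].append((yi, xi))
    sxy = sxx = 0.0
    means = {}
    for arm, rows in arms.items():
        my = sum(r[0] for r in rows) / len(rows)
        mx = sum(r[1] for r in rows) / len(rows)
        means[arm] = (my, mx)
        sxy += sum((r[0] - my) * (r[1] - mx) for r in rows)
        sxx += sum((r[1] - mx) ** 2 for r in rows)
    slope = sxy / sxx                                   # pooled within-arm slope
    unadjusted = means[1][0] - means[0][0]
    adjusted = unadjusted - slope * (means[1][1] - means[0][1])
    return unadjusted, adjusted
```

In a randomized trial both estimates target the same effect; the adjusted one removes the part of the difference in means explained by chance covariate imbalance.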

Bayesian Causal Machine Learning for Cure Models

2026-06-09T19:55:11Z

In survival studies, treatments can benefit patients through different mechanisms: a treatment may increase the probability of being cured or delay failure among patients who are not cured. Quantifying which mechanism is dominant, and whether it varies across subpopulations, is clinically important, yet there is limited work in the causal machine learning literature addressing this problem. Standard causal survival learners target finite-horizon survival or restricted mean survival time, while many cure models capture cure structures without estimating causal effects. In this work, we define meaningful causal effects in the presence of a cured subpopulation and introduce BartCure, a Bayesian causal machine learning approach for estimating them. The causal effects we recommend decompose the causal effect on restricted mean survival time into a stochastic cure and stochastic latency component, and we relate these new effects to both stochastic intervention effects and causal effects in principal strata. In simulations, BartCure is competitive for estimating average effects and is especially effective at conservatively detecting the direction of treatment-effect heterogeneity. We apply BartCure to estimate average and subgroup causal effects and to identify treatment effect heterogeneity in the CALGB 40101 breast cancer trial.

Design-Based Cross-Validation for Comparing Small Area Estimators

2026-06-09T18:40:26Z

Subnational monitoring of public health and development indicators often relies on household surveys where data are sparse at the desired spatial resolution. Small area estimation (SAE) methods address this challenge by borrowing strength across areas and incorporating auxiliary information. However, comparing these estimators remains difficult in the absence of ground truth. We propose a design-based cross-validation framework for evaluating small area estimators that accommodates complex survey designs. Our approach enables model-agnostic comparisons between area-level and unit-level SAE models. We derive a decomposition of the conditional mean squared error that yields a consistent cross-validation score, show that finite-sample comparisons carry an unidentifiable bias that can be bounded, and use this bound as a principled threshold for ranking models. We further show that leave-one-area-out cross-validation, a popular alternative, targets extrapolation rather than smoothing error and can reverse the correct ranking. We evaluate the framework through extensive design-based simulations and apply it to compare subnational female literacy estimators in Zambia using the 2024 Demographic and Health Survey. The framework applies to prevalence mapping and other SAE problems, and to any small area estimator irrespective of the underlying model class.