https://arxiv.org/api/CUCV81dl0V6lXbT9wsKruBpLt84 2026-03-22T14:25:49Z 1629 60 15 http://arxiv.org/abs/2511.04469v4 Towards Causal Market Simulators 2026-01-21T13:14:57Z Market generators using deep generative models have shown promise for synthetic financial data generation, but existing approaches lack causal reasoning capabilities essential for counterfactual analysis and risk assessment. We propose a Time-series Neural Causal Model VAE (TNCM-VAE) that combines variational autoencoders with structural causal models to generate counterfactual financial time series while preserving both temporal dependencies and causal relationships. Our approach enforces causal constraints through directed acyclic graphs in the decoder architecture and employs the causal Wasserstein distance for training. We validate our method on synthetic autoregressive models inspired by the Ornstein-Uhlenbeck process, demonstrating superior performance in counterfactual probability estimation with L1 distances as low as 0.03-0.10 compared to ground truth. The model enables financial stress testing, scenario analysis, and enhanced backtesting by generating plausible counterfactual market trajectories that respect underlying causal mechanisms. 2025-11-06T15:44:07Z ICAIF 2025 Workshop on Rethinking Financial Time-Series Dennis Thumm Luis Ontaneda Mijares http://arxiv.org/abs/2601.12167v1 Using Directed Acyclic Graphs to Illustrate Common Biases in Diagnostic Test Accuracy Studies 2026-01-17T21:07:41Z Background: Diagnostic test accuracy (DTA) studies, like etiological studies, are susceptible to various biases including reference standard error bias, partial verification bias, spectrum effect, confounding, and bias from misassumption of conditional independence. While directed acyclic graphs (DAGs) are widely used in etiological research to identify and illustrate bias structures, they have not been systematically applied to DTA studies. 
Methods: We developed DAGs to illustrate the causal structures underlying common biases in DTA studies. For each bias, we present the corresponding DAG structure and demonstrate the parallel with equivalent biases in etiological studies. We use real-world examples to illustrate each bias mechanism. Results: We demonstrate that five major biases in DTA studies can be represented using DAGs with clear structural parallels to etiological studies: reference standard error bias corresponds to exposure misclassification, misassumption of conditional independence creates spurious correlations similar to unmeasured confounding, spectrum effect parallels effect modification, confounding operates through backdoor paths in both settings, and partial verification bias mirrors selection bias. These DAG representations reveal the causal mechanisms underlying each bias and suggest appropriate correction strategies. Conclusions: DAGs provide a valuable framework for understanding bias structures in DTA studies and should complement existing quality assessment tools like STARD and QUADAS-2. We recommend incorporating DAGs during study design to prospectively identify potential biases and during reporting to enhance transparency. DAG construction requires interdisciplinary collaboration and sensitivity analyses under alternative causal structures. 2026-01-17T21:07:41Z Yang Lu Nandini Dendukuri http://arxiv.org/abs/2505.00356v3 On the retraining frequency of global models in retail demand forecasting 2026-01-14T09:57:38Z In an era of increasing computational capabilities and growing environmental consciousness, organizations face a critical challenge in balancing the accuracy of forecasting models with computational efficiency and sustainability. Global forecasting models, lowering the computational time, have gained significant attention over the years. 
However, the common practice of retraining these models with new observations raises important questions about the costs of forecasting. Using ten different machine learning and deep learning models, we analyzed various retraining scenarios, ranging from continuous updates to no retraining at all, across two large retail demand datasets. We showed that less frequent retraining strategies maintain the forecast accuracy while reducing the computational costs, providing a more sustainable approach to large-scale forecasting. We also found that machine learning models are a marginally better choice to reduce the costs of forecasting when coupled with less frequent model retraining strategies as the frequency of the data increases. Our findings challenge the conventional belief that frequent retraining is essential for maintaining forecasting accuracy. Instead, periodic retraining offers a good balance between predictive performance and efficiency, both in the case of point and probabilistic forecasting. These insights provide actionable guidelines for organizations seeking to optimize forecasting pipelines while reducing costs and energy consumption. 2025-05-01T07:00:29Z Machine Learning with Applications, 22 , 100769 (2025) Marco Zanotti 10.1016/j.mlwa.2025.100769 http://arxiv.org/abs/2601.08167v2 Proactive Anomaly Screen for Multiple Endpoints Using Bayesian Latent Class Modeling: A k-Step Ahead Approach 2026-01-14T03:57:47Z In clinical trials, ensuring the quality and validity of data for downstream analysis and results is paramount, thus necessitating thorough data monitoring. This typically involves employing edit checks and manual queries during data collection. Edit checks consist of straightforward schemes programmed into relational databases, though they lack the capacity to assess data intelligently. 
In contrast, manual queries are initiated by data managers who manually scrutinize the collected data, identifying discrepancies needing clarification or correction. Manual queries pose significant challenges, particularly when dealing with large-scale data in late-phase clinical trials. Moreover, they are reactive rather than predictive, meaning they address issues after they arise rather than preemptively preventing errors. In this paper, we propose a joint model for multiple endpoints, focusing on primary and key secondary measures, using a Bayesian latent class approach. This model incorporates adjustments for risk monitoring factors, enabling proactive, $k$-step ahead, detection of conflicting or anomalous patterns within the data. Furthermore, we develop individualized dynamic predictions at consecutive time-points to identify potential anomalous values based on observed data. This analysis can be integrated into electronic data capture systems to provide objective alerts to stakeholders. We present simulation results and demonstrate effectiveness of this approach with real-world data. 2026-01-13T02:57:11Z Yuxi Zhao Margaret Gamalo http://arxiv.org/abs/2601.07534v1 Bayesian Handwriting Evidence Evaluation using MANOVA via Fourier-Based Extracted Features 2026-01-12T13:35:47Z This paper proposes a novel statistical approach that aims at the identification of valid and useful patterns in handwriting examination via Bayesian modeling. Starting from a sample of characters selected among 13 French native writers, an accurate loop reconstruction can be achieved through Fourier analysis. The contour shape of handwritten characters can be described by the first four pairs of Fourier coefficients and by the surface size. Six Bayesian models are considered for such handwritten features. These models arise from two likelihood structures: (a) a multivariate Normal model, and (b) a MANOVA model that accounts for character-level variability. 
For each likelihood, three different prior formulations are examined, resulting in distinct Bayesian models: (i) a conjugate Normal-Inverse-Wishart prior, (ii) a hierarchical Normal-Inverse-Wishart prior, and (iii) a Normal-LogNormal-LKJ prior specification. The hierarchical prior formulations are of primary interest because they can incorporate the between-writers variability, a distinguishing element that sets writers apart. These approaches do not allow calculation of the marginal likelihood in a closed-form expression. Therefore, bridge sampling is used to estimate it. The Bayes factor is estimated to compare the performance of the proposed models and to evaluate their efficiency for discriminating purposes. Bayesian MANOVA with Normal-LogNormal-LKJ prior showed an overall better performance, in terms of discriminatory capacity and model fitting. Finally, a sensitivity analysis for the elicitation of the prior distribution parameters is performed. 2026-01-12T13:35:47Z Lampis Tzai Ioannis Ntzoufras Silvia Bozza http://arxiv.org/abs/2405.01342v2 Data-Driven Strategies for Detecting and Sampling Misrepresented Subgroups 2026-01-11T19:15:11Z Economic policy research frequently examines population well-being, with a particular focus on the relationships between unequal living conditions, low educational attainment, and social exclusion. Sample surveys, such as EU-SILC, are widely used for this purpose and inform public policy; yet, their sampling designs may fail to adequately represent rare, hard-to-sample, or under-covered subgroups. This limitation can hinder socio-demographic analyses and evidence-based policy design. We propose a generalisable approach based on univariate and multivariate unsupervised learning techniques to detect outliers in survey data that may signal under-represented subgroups. Identified groups can then be characterised to inform targeted resampling strategies that improve survey inclusiveness. 
An empirical application using the 2019 EU-SILC data for the Italian region of Liguria shows that citizenship, material deprivation, large household size, and economic vulnerability are key indicators of under-representation. 2024-05-02T14:45:37Z G. Lancia F. Mecatti E. Riccomagno http://arxiv.org/abs/2511.11412v4 MajinBook: An open catalogue of digital world literature with likes 2026-01-09T16:11:46Z This data paper introduces MajinBook, an open catalogue designed to facilitate the use of shadow libraries--such as Library Genesis and Z-Library--for computational social science and cultural analytics. By linking metadata from these vast, crowd-sourced archives with structured bibliographic data from Goodreads, we create a high-precision corpus of over 539,000 references to English-language books spanning three centuries, enriched with first publication dates, genres, and popularity metrics like ratings and reviews. Our methodology prioritizes natively digital EPUB files to ensure machine-readable quality, while addressing biases in traditional corpora like HathiTrust, and includes secondary datasets for French, German, and Spanish. We evaluate the linkage strategy for accuracy, release all underlying data openly, and discuss the project's legal permissibility under EU and US frameworks for text and data mining in research. 2025-11-14T15:44:27Z 9 pages, 5 figures, 1 table Antoine Mazières Thierry Poibeau http://arxiv.org/abs/2601.05490v1 How Carbon Border Adjustment Mechanism is Energizing the EU Carbon Market and Industrial Transformation 2026-01-09T02:45:52Z The global carbon market is fragmented and characterized by limited pricing transparency and empirical evidence, creating challenges for investors and policymakers in identifying carbon management opportunities. The European Union is among several regions that have implemented emissions pricing through an Emissions Trading System (EU ETS). 
While the EU ETS has contributed to emissions reductions, it has also raised concerns related to international competitiveness and carbon leakage, particularly given the strong integration of EU industries into global value chains. To address these challenges, the European Commission proposed the Carbon Border Adjustment Mechanism (CBAM) in 2021. CBAM is designed to operate alongside the EU ETS by applying a carbon price to selected imported goods, thereby aligning carbon costs between domestic and foreign producers. It will gradually replace existing carbon leakage mitigation measures, including the allocation of free allowances under the EU ETS. The initial scope of CBAM covers electricity, cement, fertilizer, aluminium, iron, and steel. As climate policies intensify under the Paris Agreement, CBAM-like mechanisms are expected to play an increasingly important role in managing carbon-related trade risks and supporting the transition to net zero emissions. 2026-01-09T02:45:52Z 17 Pages; 4 Figures Joseph Nyangon Brecht Seifi http://arxiv.org/abs/2512.09790v2 A Conversation with Mike West 2026-01-06T18:15:19Z Mike West is currently the Arts & Sciences Distinguished Professor Emeritus of Statistics and Decision Sciences at Duke University. Mike's research in Bayesian analysis spans multiple interlinked areas: theory and methods of dynamic models in time series analysis, foundations of inference and decision analysis, multivariate and latent structure analysis, stochastic computation and optimisation, among others. Inter-disciplinary R&D has ranged across applications in commercial forecasting, dynamic networks, finance, econometrics, signal processing, climatology, systems biology, genomics and neuroscience, among other areas. Among Mike's currently active research areas are forecasting, causal prediction and decision analysis in business, economic policy and finance, as well as in personal decision making. 
Mike led the development of academic statistics at Duke University from 1990-2002, and has been broadly engaged in professional leadership elsewhere. He is past president of the International Society for Bayesian Analysis (ISBA), and has served in founding roles and as board member for several professional societies, national and international centres and institutes. Recipient of numerous awards, Mike has been active in research with various companies, banks, government agencies and academic centres, co-founder of a successful biotechnology company, and board member for several financial and IT companies. He has published 4 books, several edited volumes and over 200 papers. Mike has worked with many undergraduate and Master's research students, and as of 2025 has mentored around 65 primary PhD students and postdoctoral associates who moved to academic, industrial or governmental positions involving advanced statistical and data science research. 2025-12-10T16:10:21Z Hedibert F. Lopes Filippo Ascolani http://arxiv.org/abs/2512.23110v2 Assessing the Effects of Macroeconomic Variables on Child Mortality in D-8 Countries Using Panel Data Analysis 2026-01-04T23:55:51Z This research analyses the axiomatic link among health expenditures, inflation rate, and gross national income (GNI) per capita concerning the child mortality (CMU5) rate in D-8 nations, employing panel data analysis from 1995 to 2014. Utilising conventional panel unit root tests and linear regression models, we establish that education expenditures, in conjunction with health expenditures, inflation rate, and GNI per capita, display stationarity at level. Additionally, we examine fixed effects and random effects estimators for the pertinent variables, utilising metrics such as the Hausman Test (HT) and comparisons with CCMR correlations. 
Our data demonstrate that the CMU5 rate in D-8 nations has steadily decreased, according to a somewhat negative linear regression model, therefore slightly undermining the fourth Millennium Development Goal (MDG4) of the World Health Organisation (WHO). 2025-12-28T23:17:53Z 13 pages, 3 Figures, 4 tables, updated references M. Waseem Akram Binita Shahi M. Javed Akram http://arxiv.org/abs/2512.24930v1 Constraints on the perfect phylogeny mixture model and their effect on reducing degeneracy 2025-12-31T15:39:15Z The perfect phylogeny mixture (PPM) model is useful due to its simplicity and applicability in scenarios where mutations can be assumed to accumulate monotonically over time. It is the underlying model in many tools that have been used, for example, to infer phylogenetic trees for tumor evolution and reconstruction. Unfortunately, the PPM model gives rise to substantial ambiguity -- in that many different phylogenetic trees can explain the same observed data -- even in the idealized setting where data are observed perfectly, i.e. fully and without noise. This ambiguity has been studied in this perfect setting by Pradhan et al. 2018, which proposed a procedure to bound the number of solutions given a fixed instance of observation data. Beyond this, studies have been primarily empirical. Recent work (Myers et al. 2019) proposed adding extra constraints to the PPM model to tackle ambiguity. In this paper, we first show that the extra constraints of Myers et al. 2019, called longitudinal constraints (LC), often fail to reduce the number of distinct trees that explain the observations. We then propose novel alternative constraints to limit solution ambiguity and study their impact when the data are observed perfectly. Unlike the analysis in Pradhan et al. 2018, our theoretical results regarding both the inefficacy of the LC and the extent to which our new constraints reduce ambiguity are not tied to a single observation instance. 
Rather, our theorems hold over large ensembles of possible inference problems. To the best of our knowledge, we are the first to study degeneracy in the PPM model in this ensemble-based theoretical framework. 2025-12-31T15:39:15Z John Marangola Azadeh Sheikholeslami José Bento http://arxiv.org/abs/2512.23371v1 Domain matters: Towards domain-informed evaluation for link prediction 2025-12-29T11:04:36Z Link prediction, a foundational task in complex network analysis, has extensive applications in critical scenarios such as social recommendation, drug target discovery, and knowledge graph completion. However, existing evaluations of algorithms often rely on experiments conducted on a limited number of networks, assuming consistent performance rankings across domains. Despite the significant disparities in generative mechanisms and semantic contexts, previous studies often improperly highlight "universally optimal" algorithms based solely on a naive average over networks across domains. This paper systematically evaluates 12 mainstream link prediction algorithms across 740 real-world networks spanning seven domains. We present substantial empirical evidence elucidating the performance of algorithms in specific domains. These findings reveal a notably low degree of consistency in inter-domain algorithm rankings, a phenomenon that stands in stark contrast to the high degree of consistency observed within individual domains. Principal Component Analysis shows that response vectors formed by the rankings of the 12 algorithms cluster distinctly by domain in low-dimensional space, thus confirming domain attributes as a pivotal factor affecting algorithm performance. We propose a metric called Winner Score that could identify the superior algorithm in each domain: Non-Negative Matrix Factorization for social networks, Neighborhood Overlap-aware Graph Neural Networks for economics, Graph Convolutional Networks for chemistry, and L3-based Resource Allocation for biology. 
However, these domain-specific top-performing algorithms tend to exhibit suboptimal performance in other domains. This finding underscores the importance of aligning an algorithm's mechanism with the network structure. 2025-12-29T11:04:36Z Yilin Bi Junhao Bian Shuyan Wan Shuaijia Wang Tao Zhou http://arxiv.org/abs/2512.23053v1 LLteacher: A Tool for the Integration of Generative AI into Statistics Assignments 2025-12-28T19:39:45Z As generative AI becomes increasingly embedded in everyday life, the thoughtful and intentional integration of AI-based tools into statistics education has become essential. We address this need with a focus on homework assignments and we propose the use of LLMs as a companion to complete homework by developing an open-source tool named LLteacher. This LLM-based tool preserves learning processes and it guides students to engage with AI in ways that support their learning, while ensuring alignment with course content and equitable access. We illustrate LLteacher's design and functionality with examples from an undergraduate Statistical Computing course in R, showing how it supports two distinct pedagogical goals: recalling prior knowledge and discovering new concepts. While this is an initial version, LLteacher demonstrates one possible pathway for integrating generative AI into statistics courses, with strong potential for adaptation to other types of classes and assignments. 2025-12-28T19:39:45Z Emanuela Furfaro Simone Mosciatti http://arxiv.org/abs/2512.20068v2 Change Point Detection and Mean-Field Dynamics of Variable Productivity Hawkes Processes 2025-12-26T21:23:24Z Many self-exciting systems change because endogenous amplification, as opposed to exogenous forcing, varies. We study a Hawkes process with fixed background rate and kernel, but piecewise time-varying productivity. 
For exponential kernels we derive closed-form mean-field relaxation after a change and a deterministic surrogate for post-change Fisher information, revealing a boundary layer in which change time information localises and saturates, while post-change level information grows linearly beyond a short transient. These results motivate a Bayesian change point procedure that stabilizes inference on finite windows. We illustrate the method on invasive pneumococcal disease incidence in The Gambia, identifying a decline in productivity aligned with pneumococcal conjugate vaccine rollout. 2025-12-23T05:43:55Z Conor Kresin Boris Baeumer Sophie Phillips http://arxiv.org/abs/1006.5471v10 Cognitive Constructivism and the Epistemic Significance of Sharp Statistical Hypotheses in Natural Sciences 2025-12-24T09:07:31Z This book presents our case in defense of a constructivist epistemological framework and the use of compatible statistical theory and inference tools. The basic metaphor of decision theory is the maximization of a gambler's expected fortune, according to his own subjective utility, prior beliefs and learned experiences. This metaphor has proven to be very useful, leading the development of Bayesian statistics since its XX-th century revival, rooted in the work of de Finetti, Savage and others. The basic metaphor presented in this text, as a foundation for cognitive constructivism, is that of an eigen-solution, and the verification of its objective epistemic status. The FBST - Full Bayesian Significance Test - is the cornerstone of a set of statistical tools conceived to assess the epistemic value of such eigen-solutions, according to their four essential attributes, namely, sharpness, stability, separability and composability. We believe that this alternative perspective, complementary to the one offered by decision theory, can provide powerful insights and make pertinent contributions in the context of scientific research. 2010-06-28T21:15:07Z 453 pages J. M. Stern