https://arxiv.org/api/X6ju4/30NMTYFPkvC4IsqC6Irj0
Updated: 2026-06-10T12:28:04Z

http://arxiv.org/abs/2509.01778v2
Grid Transmission Evaluation for Solar Deployment and Data Center Growth
Updated: 2025-09-11T13:07:26Z
The rapid growth of renewable energy deployment and data center demand in the United States has intensified challenges in grid interconnection, with project delays and escalating costs threatening both economic expansion and energy reliability. This study investigates transmission constraints using the IEEE 39-bus New England Power System model to evaluate the simultaneous interconnection of a 1 GW solar facility and a 1 GW data center load. Employing PSSE and Python-based automation (psspy), we conducted 1,560 load flow simulations across varying siting configurations to assess branch overloads and transmission line limits. Results revealed that only 14 configurations avoided overloads, while most scenarios highlighted recurring congestion on specific network branches, particularly between buses 21 and 22. Optimal siting was identified with the load at bus #35 and the generator at bus #39, yielding minimal overloads (maximum 91.1% loading). Conversely, poor siting decisions resulted in severe congestion with maximum branch loading above 220%. The findings underscore the critical importance of optimized siting and modernized, automated interconnection studies to reduce delays and costs in renewable integration. This research demonstrates the potential of advanced modeling tools to accelerate interconnection processes, improve system reliability, and inform future strategies for balancing renewable energy deployment with rising data center demand.
Published: 2025-09-01T21:22:51Z
Comments: arXiv admin note: This paper has been withdrawn by arXiv due to disputed authorship
Authors: Kajal Sheth, Dhvanil Patel, Shyam Kareepadath Sajeev

http://arxiv.org/abs/2509.08744v1
Who has the best probabilities? 
Luck versus skill in prediction tournaments
Updated: 2025-09-10T16:34:49Z
An informal and elementary introduction to probability scoring and forecast verification and improvement, slightly extended from Significance 22:3(2025)16, which might be useful for less mathematical readers as a prologue to the classic review by Gneiting and Raftery [Strictly proper scoring rules, prediction, and estimation, Journal of the American Statistical Association 102 (2007): 359].
Published: 2025-09-10T16:34:49Z
Comments: 13 pages, 2 figures
Journal reference: Significance vol.22 no.3 (2025) 16-21
Authors: Niall MacKay
DOI: 10.1093/jrssig/qmaf023

http://arxiv.org/abs/2509.08451v1
Comparing Methodologies for Ranking Alternatives: A case study in assessing bank financial performance
Updated: 2025-09-10T09:48:12Z
Bank financial performance encapsulates an institution's capacity to effectively manage its assets, capital, and operational activities to generate profits and ensure stability. Evaluating this performance necessitates the integration of diverse metrics, including profitability indicators, loan growth rates, capital utilization efficiency, and more. Nevertheless, directly comparing the financial performance across different banks presents a complex challenge due to inherent disparities in their specific performance parameters. Multi-criteria decision-making (MCDM) techniques are frequently employed to navigate this intricate assessment. This study undertakes a comparative analysis of various MCDM approaches in evaluating bank financial performance. Our investigation encompasses both a comparison of methods for assigning weights to criteria and a comparison of methodologies for ranking the alternatives (banks). We examine five distinct weighting methods: Equal, Entropy, MEREC, LOPCOW, and SPC. Concurrently, three alternative ranking methods (Probability, TOPSIS, and RAM) are compared. These comparisons are conducted within the context of a case study involving the performance assessment of 19 banks. 
The findings indicate that the highest degree of stability in ranking bank financial performance is achieved when the Entropy method is utilized for criteria weighting in conjunction with the Probability method for ranking alternatives.
Published: 2025-09-10T09:48:12Z
Comments: 17 pages, 9 tables
Authors: Dong Trung Chinh, Nguyen Thi Thu Hien, Pham Huong Quynh, Vu Quang Minh

http://arxiv.org/abs/2509.08187v1
A Comparative Analysis of Multi-Criteria Decision-Making (MCDM) Methods
Updated: 2025-09-09T23:20:18Z
Multi-Criteria Decision-Making (MCDM) techniques have found widespread application across diverse fields. The rapid evolution of MCDM has led to the development of hundreds of methods, each employing distinct approaches. However, due to inherent algorithmic differences, various MCDM methods often yield divergent results when applied to the same specific problem. This study undertakes a comparative analysis of four particular methods: RAM, MOORA, FUCA, and CURLI, within a defined case study. The evaluation context involves ranking 30 Vietnamese banks based on six criteria: capital adequacy, asset quality, management capability, earnings ability, liquidity, and sensitivity to market risk. Prior to this analysis, these banks had also been ranked by the CAMELS rating system. The CAMELS rankings serve as a benchmark to assess the performance of the RAM, MOORA, FUCA, and CURLI methods. Our findings indicate that FUCA and CURLI are highly suitable methods for this application, demonstrating Spearman's rank correlation coefficients with CAMELS of 0.9996 and 0.9984, respectively. In contrast, both RAM and MOORA proved unsuitable, exhibiting very low Spearman's correlation coefficients of -1.0296 against the CAMELS ranking.
Published: 2025-09-09T23:20:18Z
Comments: 10 pages, 3 tables
Journal reference: Engineering, Technology & Applied Science Research, Vol. 15, No. 
5, 2025, 26369-26375
Authors: Nguyen Thi Thu Hien, Pham Huong Quynh, Vu Quang Minh
DOI: 10.48084/etasr.12782

http://arxiv.org/abs/2509.08183v1
Chaotic Bayesian Inference: Strange Attractors as Risk Models for Black Swan Events
Updated: 2025-09-09T23:11:23Z
We introduce a new risk modeling framework where chaotic attractors shape the geometry of Bayesian inference. By combining heavy-tailed priors with Lorenz and Rossler dynamics, the models naturally generate volatility clustering, fat tails, and extreme events. We compare two complementary approaches: Model A, which emphasizes geometric stability, and Model B, which highlights rare bursts using Fibonacci diagnostics. Together, they provide a dual perspective for systemic risk analysis, linking Black Swan theory to practical tools for stress testing and volatility monitoring.
Published: 2025-09-09T23:11:23Z
Comments: 13 pages, 5 figures. Includes supplementary baseline diagnostics
Authors: Crystal Rust

http://arxiv.org/abs/2506.07437v2
One-dimensional quantile-stratified sampling and its application in statistical simulations
Updated: 2025-09-06T06:04:39Z
In this paper we examine quantile-stratified samples from a known univariate probability distribution, with stratification occurring over a partition of the quantile regions in the distribution. We examine some general properties of this sampling method and we contrast it with standard IID sampling to highlight its similarities and differences. We examine the applications of this sampling method to various statistical simulations including importance sampling. 
We conduct simulation analysis to compare the performance of standard importance sampling against the quantile-stratified importance sampling to see how they each perform on a range of functions.
Published: 2025-06-09T05:25:33Z
Authors: Ben O'Neill

http://arxiv.org/abs/2509.05277v1
Bridge Modal Identification using Single Moving Sensor under Random Traffic Loading
Updated: 2025-09-05T17:34:21Z
This paper explores the feasibility of utilizing the response recorded by a single moving sensor to identify the modal parameters of a bridge system under different loading conditions, such as known excitation and unknown random traffic-induced vibrations. The sensor traverses the bridge and captures its dynamic response (acceleration). The natural frequencies and damping ratios are identified using the moving sensor data in the frequency domain. In the case of known inputs, these parameters are then used to obtain the mode shapes, expressed as a linear combination of basic orthonormal polynomials (BOPs), with the coefficients of the BOPs in the linear combinations obtained via optimization. A statistical formulation is proposed to estimate the mode shapes in the case of unknown random traffic-induced vibrations, including the effect of road roughness. It is shown that the absolute values of the mode shapes are proportional to the ensemble standard deviation (SD) of the modal responses. This approach requires the sensor to traverse the bridge multiple times, with the mode shapes identified both in the time domain using variances and in the frequency domain through the evolutionary power spectrum of these responses. The random traffic loading is modeled such that vehicle arrival times follow a Poisson distribution, while the mass and velocity of the vehicles are assumed to follow uniform distributions. To incorporate the effect of road roughness, modeled as a homogeneous random field, a vehicle-bridge-interaction (VBI) model is utilized. 
Numerical validation under the different loading conditions demonstrates that a single moving sensor can be used to identify the modal parameters quite accurately, with high spatial resolution of the identified mode shapes, offering a cost-effective and efficient alternative for bridge health monitoring.
Published: 2025-09-05T17:34:21Z
Comments: 56 pages, 18 figures
Authors: Dhiraj Ghosh, Suparno Mukhopadhyay, Shaily Jain

http://arxiv.org/abs/2509.04546v1
The Actuary's Final Word on Algorithmic Decision Making
Updated: 2025-09-04T16:59:20Z
Paul Meehl's foundational work "Clinical versus Statistical Prediction" provided early theoretical justification and empirical evidence of the superiority of statistical methods over clinical judgment. Despite a century of empirical evidence supporting Meehl's central thesis, from early parole prediction studies in the 1920s to modern meta-analyses, confusion persists regarding when and why his troubling finding applies. This paper provides a contemporary theoretical justification for Meehl's result. Importantly, Meehl's prediction problems require a small set of possible outcomes and machine-readable data. Second, individual predictions and decisions are evaluated only on average. This formulation leads to a natural analysis from statistical decision theory, which shows that statistical rules are more accurate than clinical intuition almost by definition. Meehl's prediction paradox is an example of metrical determinism, where the rules of evaluation implicitly determine the best procedure. 
The decision-theoretic analysis of Meehl's problem elucidates the utility of algorithmic systems as decision-support tools, but also reveals their natural shortcomings, inducing expertise erosion, decision fatigue, and the usurpation of discretionary judgment.
Published: 2025-09-04T16:59:20Z
Authors: Benjamin Recht

http://arxiv.org/abs/2508.07754v2
Asymptotic Consistency and Generalization in Hybrid Models of Regularized Selection and Nonlinear Learning
Updated: 2025-08-31T10:45:21Z
This study explores how different types of supervised models perform in the task of predicting and selecting relevant variables in high-dimensional contexts, especially when the data is very noisy. We analyzed three approaches: regularized models (such as Lasso, Ridge, and Elastic Net), black-box models (such as Random Forest, XGBoost, LightGBM, CatBoost, and H2O GBM), and hybrid models that combine both approaches: regularization with nonlinear algorithms. Based on simulations inspired by the Friedman equation, we evaluated 23 models using three complementary metrics: RMSE, Jaccard index, and recall rate. The results reveal that, although black-box models excel in predictive accuracy, they lack interpretability and simplicity, essential factors in many real-world contexts. Regularized models, on the other hand, proved to be more sensitive to an excess of irrelevant variables. In this scenario, hybrid models stood out for their balance: they maintain good predictive performance, identify relevant variables more consistently, and offer greater robustness, especially as the sample size increases. 
Therefore, we recommend using this hybrid framework in market applications, where it is essential that the results make sense in a practical context and support decisions with confidence.
Published: 2025-08-11T08:36:03Z
Authors: Luciano Ribeiro Galvão, Rafael de Andrade Mora

http://arxiv.org/abs/2508.21523v1
Quantile Function-Based Models for Neuroimaging Classification Using Wasserstein Regression
Updated: 2025-08-29T11:23:50Z
We propose a novel quantile function-based approach for neuroimaging classification using Wasserstein-Fréchet regression, specifically applied to the detection of mild traumatic brain injury (mTBI) based on the MEG and MRI data. Conventional neuroimaging classification methods for mTBI detection typically extract summary statistics from brain signals across the different epochs, which may result in the loss of important distributional information, such as variance, skewness, kurtosis, etc. Our approach treats complete probability density functions of epoch space results as functional response variables within a Wasserstein-Fréchet regression framework, thereby preserving the full distributional characteristics of epoch results from $L_{1}$ minimum norm solutions. The global Wasserstein-Fréchet regression model incorporating covariates (age and gender) allows us to directly compare the distributional patterns between healthy control subjects and mTBI patients. The classification procedure computes Wasserstein distances between estimated quantile functions from control and patient groups, respectively. These distances are then used as the basis for diagnostic decisions. This framework offers a statistically principled approach to improving diagnostic accuracy in mTBI detection. 
In practical applications, the test accuracy on unseen data from Innovision IP's dataset achieves up to 98\%.
Published: 2025-08-29T11:23:50Z
Comments: 17 pages, 2 figures
Authors: Jie Li, Gary Green, Jian Zhang

http://arxiv.org/abs/2211.08637v3
Near-Peer Mentoring in Data Science: A Plot for Mutual Growth
Updated: 2025-08-26T18:11:02Z
Universities have been expanding undergraduate data science programs. Involving graduate students in these new opportunities can foster their growth as data science educators. We describe two programs that employ a near-peer mentoring structure, in which graduate students mentor undergraduates, to (1) strengthen their teaching and mentoring skills and (2) provide research and learning experiences for undergraduates from diverse backgrounds. In the Data Science for Social Good program, undergraduate participants work in teams to tackle a data science project with social impact. Graduate mentors guide project work and provide just-in-time teaching and feedback. The Stanford Mentoring in Data Science course offers training in effective and inclusive mentorship strategies. In an experiential learning framework, enrolled graduate students are paired with undergraduate students from non-R1 schools, whom they mentor through weekly one-on-one remote meetings. In end-of-program surveys, mentors reported growth through both programs. Drawing from these experiences, we developed a self-paced mentor training guide, which engages teaching, mentoring and project management abilities. These initiatives and the shared materials can serve as prototypes of future programs that cultivate mutual growth of both undergraduate and graduate students in a high-touch, inclusive, and encouraging environment.
Published: 2022-11-16T03:13:01Z
Authors: Chiara Sabatti, Qian Zhao
DOI: 10.1080/00031305.2025.2550314

http://arxiv.org/abs/2508.19070v1
Replicability: Terminology, Measuring Success, and Strategy
Updated: 2025-08-26T14:26:36Z
Empirical science needs to be based on facts and claims that can be reproduced. 
This calls for replicating the studies that proclaim the claims, but practice in most fields still fails to implement this idea. When such studies emerged in the past decade, the results were generally disappointing. There have been an overwhelming number of papers addressing the ``reproducibility crisis'' in the last 20 years. Nevertheless, terminology is not yet settled, and there is no consensus about when a replication should be called successful. This paper intends to clarify such issues. A fundamental problem in empirical science is that usual claims only state that effects are non-zero, and such statements are scientifically void. An effect must have a \emph{relevant} size to become a reasonable item of knowledge. Therefore, estimation of an effect, with an indication of precision, forms a substantial scientific task, whereas testing it against zero does not. A relevant effect is one that is shown to exceed a relevance threshold. This paradigm has implications for the judgement on replication success.
A further issue is the unavoidable variability between studies, called heterogeneity in meta-analysis. Therefore, it is of little value, again, to test for zero difference between an original effect and its replication, but exceedance of a corresponding relevance threshold should be tested. In order to estimate the degree of heterogeneity, more than one replication is needed, and an appropriate indication of the precision of an estimated effect requires such an estimate.
These insights, which are discussed in the paper, show the complexity of obtaining solid scientific results, implying the need for a strategy to make replication happen.
Published: 2025-08-26T14:26:36Z
Comments: 36 pages, 3 figures
Authors: Werner A. Stahel (ETH Zurich, Switzerland)

http://arxiv.org/abs/2508.14009v2
Understanding Pedagogical Content Knowledge of Data Science Instructors: An Inaugural Framework
Updated: 2025-08-25T20:27:53Z
As data science emerges as a distinct academic discipline, introductory data science (IDS) courses have also drawn attention to their role in providing foundational knowledge of data science to students. IDS courses not only help students transition to higher education but also expose students to the field, often for the first time. They are often taught by instructors without formal training in data science or pedagogy, creating a unique context for examining their pedagogical content knowledge (PCK). This study explores IDS instructors' PCK, particularly how instructors' varied backgrounds interact with their instructional practices. Employing empirical phenomenological methodology, we conducted semi-structured interviews to understand the nature of their PCK. Comparing instructors' PCK was inherently challenging due to their diverse backgrounds and teaching contexts. Prior experiences played a central role in shaping participants' instructional choices. Their perceptions regarding the goals and rationale for teaching data science reflected three distinct orientations. Instructors also acknowledged students entering IDS courses often brought preconceived notions that shaped their learning experiences. Despite the absence of national guidelines, participants demonstrated notable overlap in foundational IDS content, though some instructors felt less confident with advanced or specialized topics. Additionally, instructors commonly employed formative and summative assessment approaches, though few explicitly labeled their practices using these terms. 
The findings highlight key components of PCK in IDS and offer insights into supporting instructor development through targeted training and curriculum design. This work contributes to ongoing efforts to build capacity in data science education and expand the scope of PCK research into new interdisciplinary domains.
Published: 2025-08-19T17:15:14Z
Comments: 76 pages, 3 tables
Authors: Sinem Demirci, Mine Doğucu, Andrew Zieffler, Joshua M. Rosenberg

http://arxiv.org/abs/2507.12424v3
Hierarchical Temporal Point Process Modeling of Aggressive Behavior Onset in Psychiatric Inpatient Youth with Autism for Branching Factor Estimation
Updated: 2025-08-19T21:19:02Z
Aggressive behavior in autistic inpatient youth often arises in temporally clustered bursts, complicating efforts to distinguish external triggers from internal escalation. The sample population branching factor (the expected number of new onsets triggered by a given event) is a key summary of self-excitation in behavior dynamics. Prior pooled models overestimate this quantity by ignoring patient-specific variability. We addressed this using a hierarchical Hawkes process with an exponential kernel and edge-effect correction, allowing partial pooling across patients. This approach reduces bias from high-frequency individuals and stabilizes estimates for those with sparse data. Bayesian inference was performed using the No U-Turn Sampler, with model evaluation via convergence diagnostics, power-scaling sensitivity analysis, and multiple Goodness-of-Fit (GOF) metrics: PSIS-LOO, the Lewis test with Durbin's modification, and residual analysis based on the Random Time Change Theorem (RTCT). The hierarchical model yielded a significantly lower and more precise branching factor estimate (mean 0.742 ± 0.026) than the pooled model (0.899 ± 0.015) and narrower intervals than the unpooled model (0.717 ± 0.139). This led to a threefold smaller cascade of events per onset under the hierarchical model. 
Sensitivity analyses confirmed robustness to prior and likelihood perturbations, while the unpooled model showed instability for sparse individuals. GOF measures consistently favored the hierarchical model or rated it on par with the alternatives. Hierarchical Hawkes modeling with edge-effect correction provides robust estimation of branching dynamics by capturing both within- and between-patient variability. This enables clearer separation of endogenous from exogenous events, supports linkage to physiological signals, and enhances early warning systems, individualized treatment, and resource allocation in inpatient care.
Published: 2025-07-16T17:11:48Z
Comments: Submitted to BMC Medical Research Methodology
Authors: Michael Potter, Michael Everett, Deniz Erdogmus, Yuna Watanabe, Tales Imbiriba, Matthew S. Goodwin

http://arxiv.org/abs/2508.11726v1
Relationship Between Leisure Activities, Stress Management Methods, Study Methods, and Methods of Learning New Things Among First-Year Statistics Students
Updated: 2025-08-15T07:09:03Z
The interplay between leisure activities, stress management methods, studying methods, and methods of learning new things is crucial and affects performance in all aspects of life. On the other hand, data science and statistics are rapidly growing fields with high demands across universities. Thus, this study aimed to identify the similarities and dissimilarities between the four dimensions: leisure activities, stress management methods, studying methods and methods of learning new things. The participants of this study were first-year undergraduates studying statistics at one of the universities in Sri Lanka. There were 117 students in the sample (65 female, 52 male). A self-reported questionnaire was used to collect data. First, individual responses for each question under each dimension were visualized using tile maps separately for males and females to identify similarities and dissimilarities in responses. Next, individuals were clustered based on the responses for each dimension separately. 
Finally, all resulting clusters were re-clustered to identify the relationships between the dimensions. In all cluster analyses, we used Jaccard distance with hierarchical clustering using the complete linkage method. The results were visualized using tile maps. Across all four dimensions we considered, the top activities were either listening to music or lectures and watching videos or TV shows, suggesting that individuals are introverts and passive learners. There was no strong relationship between these dimensions. By identifying these clusters and relationships, educators can tailor instructional approaches to enhance engagement and effectiveness in diverse learning environments.
Published: 2025-08-15T07:09:03Z
Comments: 23 pages, 10 figures
Authors: Thiyanga S. Talagala
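
The clustering step named in the last abstract above (Jaccard distance with complete-linkage hierarchical clustering) can be illustrated with a minimal sketch. This is not the authors' code: the function names `jaccard_distance` and `complete_linkage`, the toy `responses` data, and the choice to stop at a fixed number of clusters are all assumptions made for illustration. Each respondent is represented as the set of options they selected.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B|; treated as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def complete_linkage(items, n_clusters):
    """Agglomerative clustering where the distance between two clusters
    is the largest pairwise Jaccard distance between their members."""
    clusters = [[i] for i in range(len(items))]
    while len(clusters) > n_clusters:
        best = None
        for x, y in combinations(range(len(clusters)), 2):
            d = max(
                jaccard_distance(items[i], items[j])
                for i in clusters[x]
                for j in clusters[y]
            )
            # Keep the closest pair; ties go to the pair found first.
            if best is None or d < best[0]:
                best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return [sorted(c) for c in clusters]


# Hypothetical questionnaire responses (not data from the study).
responses = [
    {"music", "videos"},
    {"music", "videos", "reading"},
    {"sports", "walking"},
    {"sports", "walking", "music"},
]
clusters = complete_linkage(responses, 2)
print(clusters)  # [[0, 1], [2, 3]]
```

Complete linkage merges two groups only when even their most dissimilar members are close, which tends to give compact clusters. For real questionnaire data a library routine such as SciPy's hierarchical clustering would normally replace this quadratic-time loop.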