https://arxiv.org/api/bSnvrTCoNisv8Fsb57Gf5/ybdNc 2026-06-10T17:23:38Z 10652 300 15 http://arxiv.org/abs/2605.10729v1 On Distributed Parallelization Strategies for Particle-in-Fourier Schemes 2026-05-11T15:34:13Z We present and compare distributed parallelization strategies for the particle-in-Fourier (PIF) schemes used in kinetic plasma simulations. The different strategies are (i) domain decomposition, where both the particles and Fourier modes are split between the MPI ranks; (ii) particle decomposition, where only the particles are split between the ranks and each rank carries all the modes; and (iii) space-time decomposition, in which time parallelization based on the parareal algorithm is added on top of the particle decomposition. We describe the different communication patterns involved in each of the strategies, the parameter regimes where they work best, and explain their advantages and disadvantages. We implement the strategies within the open-source, performance-portable library IPPL and conduct scaling studies with 3D-3V Landau damping and Penning trap benchmark problems on the Alps and JUWELS Booster supercomputers. We analyze the dominant component timings in each of the strategies and identify areas for future optimizations. 2026-05-11T15:34:13Z Sriramkrishnan Muralikrishnan Paul Fischill Andreas Adelmann Robert Speck http://arxiv.org/abs/2605.10678v1 A Performance-Portable, Massively Parallel Distributed Nonuniform FFT 2026-05-11T14:56:41Z The nonuniform fast Fourier transform (NUFFT) enables spectral methods for problems with irregularly spaced samples, with applications in medical imaging, molecular dynamics, and kinetic plasma simulations. Existing implementations are limited to shared-memory execution, restricting problem sizes to what fits on a single node. We present the first distributed, performance-portable NUFFT for heterogeneous supercomputers. Our Kokkos-based implementation runs without modification on NVIDIA and AMD GPUs.
We develop multiple spreading and interpolation kernels optimized for different accuracy requirements and architectures. Our spreading kernels match or exceed the single-GPU throughput of the state-of-the-art CUDA-based NUFFT library cuFINUFFT at production particle densities, while our Kokkos-based implementation additionally supports AMD GPUs. Strong scaling experiments on Alps (NVIDIA GH200), JUWELS Booster (NVIDIA A100), and LUMI (AMD MI250X) demonstrate scaling up to 1024 GPUs. At scale, the distributed FFT is a significant part of the total runtime, making higher NUFFT accuracy less expensive. We apply the method to massively parallel Particle-in-Fourier simulations of Landau damping with up to $1024^3$ Fourier modes and 8.6 billion particles on Alps, JUWELS, and LUMI, demonstrating that distributed NUFFTs enable kinetic plasma simulations at resolutions previously inaccessible to spectral particle methods. 2026-05-11T14:56:41Z Accepted in The Platform for Advanced Scientific Computing (PASC26) conference proceedings Paul Fischill Andreas Adelmann Sriramkrishnan Muralikrishnan http://arxiv.org/abs/2603.15917v2 Data-efficient Bayesian-guided design selection from large candidate sets: Application to hyperelastic stochastic metamaterials 2026-05-11T14:54:40Z From a pool of admissible designs, we aim to identify a structure that achieves a target macroscopic stress response. For each candidate, the response is obtained from a high-fidelity oracle, such as expensive computational homogenization or experiments. We consider cases in which (i) the geometry cannot be conveniently parameterized, rendering gradient-based optimization inapplicable, and (ii) brute-force evaluation of all candidates is infeasible due to costly oracle queries. To tackle this challenge, we propose a Bayesian-guided design selection framework. 
The dimensionality of design variants is reduced through statistical feature engineering, and the resulting low-dimensional descriptors are mapped to effective hyperelastic constitutive parameters using a multi-output Gaussian process surrogate. The surrogate is trained using uncertainty-driven active learning with only a limited number of high-fidelity oracle evaluations. The surrogate shortlists promising candidates, and since its accuracy is inherently limited, the final selection of the optimal design is performed through high-fidelity oracle evaluations within the shortlist. In numerical test cases, we consider a design set of 50,000 candidate structures. Active learning requires labeling less than half a percent of the entire candidate set. Bayesian-guided design selection reaches a prescribed error threshold with only a handful of oracle evaluations in most cases. 2026-03-16T21:09:57Z Hooman Danesh Henning Wessels http://arxiv.org/abs/2605.10454v1 A Flexible Raspberry Pi-Based Data Logger Platform for Modbus Sensors with Ansible Deployment 2026-05-11T12:24:00Z This article presents LibrePiLogger, an open-source data logging platform based on the Raspberry Pi for environmental monitoring using Modbus sensors over RS-485. The system combines the AtmosPyre Python library for sensor communication with Ansible-based deployment automation, allowing researchers to deploy sensor networks by editing a single YAML inventory file. Two hardware configurations are described: a minimal setup using a Raspberry Pi Zero with an RS-485 HAT, and a maximal setup using a Raspberry Pi 4 with a USB-to-RS-485 converter. Currently implemented sensors include the Vaisala GMP252 for CO$_2$ and the RadonTech AlphaTRACER for $^{222}$Rn, with new sensors requiring approximately 100 lines of Python following a provided driver template. Data is logged to timestamped CSV files with JSON metadata. 
The system has been deployed for continuous CO$_2$ and $^{222}$Rn monitoring in a karst environment since spring 2025 and remains in active operation, demonstrating reliable long-term performance. All hardware designs, software, and deployment scripts are released under the GNU General Public License v3.0. Total hardware costs range from 54 to 63 EUR (excluding housing), depending on the configuration. 2026-05-11T12:24:00Z Submitted to HardwareX Leon Keim Steffen Hägele Vivien Langhans Holger Class http://arxiv.org/abs/2603.03756v4 MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier 2026-05-11T11:18:02Z While large language models (LLMs) show promise in scientific discovery, existing research focuses on inference or feedback-driven training, leaving the direct modeling of the generative reasoning process, $P(\text{hypothesis}|\text{background})$ ($P(h|b)$), unexplored. We demonstrate that directly training $P(h|b)$ is mathematically intractable due to the combinatorial complexity ($O(N^k)$) inherent in retrieving and composing inspirations from a vast knowledge base. To break this barrier, we introduce MOOSE-Star, a unified framework that enables tractable and scalable training of $P(h|b)$, while supporting more scalable inference. In the best case, MOOSE-Star reduces complexity from exponential to logarithmic ($O(\log N)$) by (1) training on decomposed subtasks derived from the probabilistic equation of discovery, (2) employing motivation-guided hierarchical search to enable logarithmic retrieval and prune irrelevant subspaces, and (3) utilizing bounded composition for robustness against retrieval noise. To facilitate this, we release TOMATO-Star, a dataset of 108,717 decomposed papers (38,400 GPU hours) for training. Empirically, MOOSE-Star scales continuously with training data and inference budget, whereas direct brute-force sampling hits a complexity wall.
2026-03-04T06:11:18Z Accepted by ICML 2026 Zonglin Yang Lidong Bing http://arxiv.org/abs/2605.10340v1 Learning to Focus Synthetic Aperture Radar On-line with State-Space Models 2026-05-11T10:43:40Z Conventional focusing methods for Synthetic Aperture Radar (SAR) employ block processing efficiently but remain latency-heavy processes that prevent the realisation of a closed-loop cognitive SAR vision system. We present the first Online SAR Processor (OSP), an online image-formation framework that treats SAR sensing as a stream and produces focused SAR image output line by line during acquisition. OSP uses a tiny state-space surrogate model trained with teacher-student distillation and multi-stage losses. We evaluate the method on 300GB of SAR data from Maya4, a Sentinel-1-derived dataset containing raw, range-compressed, range-cell-migration-corrected, and azimuth-compressed products. Relative to a linewise digital-signal-processing baseline, OSP delivers approximately 70$\times$ lower latency and 130$\times$ lower memory use; on a single AMD CPU core it processes one row in 16 ms with a memory footprint of 6 MB whilst maintaining a focusing quality high enough to support downstream decisions, which we illustrate with vessel detection and flood-mapping tasks. 2026-05-11T10:43:40Z Sebastian Fieldhouse Roberto Del Prete Gabriele Daga Nathaniel Rensly Gabriele Meoni Kea-Tiong Tang http://arxiv.org/abs/2605.10297v1 QuantWeather: Quantile-Aware Probabilistic Forecasting for Subseasonal Precipitation 2026-05-11T09:55:36Z Subseasonal precipitation forecasting is inherently uncertain due to chaotic atmospheric dynamics, making reliable uncertainty estimation essential for real-world applications. Existing approaches typically represent uncertainty through ensemble forecasts rather than directly modeling predictive distributions. 
However, due to systematic model biases, raw ensemble outputs are often not well calibrated and cannot be directly interpreted as reliable uncertainty estimates. As a result, operational systems rely on post-hoc calibration based on reforecast datasets, which are computationally expensive to generate and maintain. To address these limitations, we propose QuantWeather, an end-to-end probabilistic forecasting framework with a dual-head design. The probabilistic and deterministic heads are supervised with separate objectives and optimized jointly. The framework further supports stochastic sampling, enabling probabilistic outputs even with a single stochastic forward pass and allowing optional multi-sample aggregation. Extensive experiments show that QuantWeather demonstrates superior probabilistic forecasting skill while substantially reducing inference-time computational and storage costs. 2026-05-11T09:55:36Z Lei Chen Xinyu Su Xiaohui Zhong Hao Li http://arxiv.org/abs/2508.07697v8 Semantic-Enhanced Time-Series Forecasting via Large Language Models 2026-05-11T07:13:47Z Time series forecasting plays a significant role in finance, energy, meteorology, and IoT applications. Recent studies have leveraged the generalization capabilities of large language models (LLMs) to adapt to time series forecasting, achieving promising performance. However, existing studies focus on token-level modal alignment, instead of bridging the intrinsic modality gap between linguistic knowledge structures and time series data patterns, greatly limiting the semantic representation. To address this issue, we propose a novel Semantic-Enhanced LLM (SE-LLM) that explores the inherent periodicity and anomalous characteristics of time series to embed into the semantic space to enhance the token embedding. This process enhances the interpretability of tokens for LLMs, thereby activating the potential of LLMs for temporal sequence analysis. 
Moreover, existing Transformer-based LLMs excel at capturing long-range dependencies but are weak at modeling short-term anomalies in time-series data. Hence, we propose a plugin module embedded within self-attention that models long-term and short-term dependencies to effectively adapt LLMs to time-series analysis. Our approach freezes the LLM and reduces the sequence dimensionality of tokens, greatly reducing computational consumption. Experiments demonstrate the superior performance of our SE-LLM against the state-of-the-art (SOTA) methods. 2025-08-11T07:19:21Z 22 pages, 6 figures Hao Liu Xiaoxing Zhang Chun Yang Xiaobin Zhu http://arxiv.org/abs/2412.03939v3 A quantum nonlinear solver based on the asymptotic numerical method 2026-05-11T06:49:10Z Quantum computing offers a promising avenue for advancing computational methods in science and engineering. In this work, we introduce the quantum asymptotic numerical method (qANM), a framework for solving nonlinear problems using quantum computing. Based on the principle of high-order perturbation techniques, the proposed method uses Taylor series expansions to transform complex nonlinear systems into sequences of linear equations. We integrate the method with the variational quantum linear solver and a quantum-enhanced Jacobi method. Numerical simulations on a quantum simulator validate the convergence of the method. In particular, the high-order ANM formulation demonstrates robustness in addressing nonlinear problems by effectively capturing the solution path through Taylor series expansions. Furthermore, a highlight of this work is a proof-of-principle experiment on a superconducting quantum processor. Despite the noise inherent in near-term quantum hardware, the experiment achieves 98% accuracy in tracking the nonlinear solution path. We believe this work provides a useful reference for applying quantum computing to nonlinear computational mechanics.
2024-12-05T07:39:29Z 37 pages, 19 figures, 1 table, submitted to Elsevier Yongchun Xu Zengtao Kuang Qun Huang Jie Yang Hamid Zahrouni Michel Potier-Ferry Kaixuan Huang Jia-Chi Zhang Heng Fan Heng Hu http://arxiv.org/abs/2605.09852v1 Fairness of Explanations in Artificial Intelligence (AI): A Unifying Framework, Axioms, and Future Direction toward Responsible AI 2026-05-11T01:09:06Z Machine learning algorithms are being used in high-stakes decisions, including those in criminal justice, healthcare, credit, and employment. The research community has responded with two largely independent research fields: \emph{algorithmic fairness}, which targets equitable outcomes, and \emph{explainable AI} (XAI), which targets interpretable reasoning. This survey identifies and maps a novel blind spot at their intersection, which is a model that can satisfy every standard fairness criterion in its outputs while being profoundly unfair in its \emph{reasoning process}. We refer to this as the procedural bias, and mitigating it requires treating the fairness of explanations as a distinct object of scientific study. To our knowledge, we provide the first unified theoretical and literature review of this emerging field and elucidate the drawbacks of post-hoc explainers in certifying explanation fairness. Our central contribution is a \emph{conditional invariance framework} formalizing explanation fairness as the requirement that explanations should be indifferent regardless of the protected attributes $ P(E(X) \in \cdot \mid X_\text{rel} = x_\text{rel},\, A = a) = P(E(X) \in \cdot \mid X_\text{rel} = x_\text{rel},\, A = b)$ for all task-relevant $x$, a single principle from which all existing explanation fairness metrics emerge as partial operationalizations. 
We introduce a seven-dimensional taxonomy, identify three generative mechanisms of explanation inequity (representation-driven, explanation-model mismatch, actionability-driven), and propose a canonical six-step evaluation workflow for operationalizing explanation fairness audits in practice. 2026-05-11T01:09:06Z 53 pages, 1 figure Gideon Popoola John Sheppard http://arxiv.org/abs/2605.09629v1 Image-Based Whole-Heart Cardiac Flow Simulations in Health and Congenital Heart Disease 2026-05-10T16:17:30Z Intracardiac flow patterns are shaped by the coupled motion of the cardiac chambers and heart valves and provide important information about cardiac function. However, clinical flow imaging remains limited by exam times, noise, resolution, and incomplete details of the three-dimensional flow. Computational fluid dynamics (CFD) can potentially provide detailed flow quantification and predictive insight into treatment outcomes, but clinical translation requires frameworks that reproduce patient-specific measurements while balancing physiological realism, computational cost, and modeling effort. Herein, we present an image-based, patient-specific computational framework for simulating whole-heart intracardiac hemodynamics that balances physiological fidelity with computational efficiency. The framework first employs machine learning-based segmentation and mesh propagation to reconstruct moving cardiac anatomies from time-resolved images. CFD simulations are then performed to resolve blood flow in deforming domains, while resistive immersed surfaces (RIS) are used to model all four cardiac valves with physiologically realistic opening and closing dynamics. The framework was applied to model hemodynamics in a healthy adult and a pediatric patient with complex congenital heart disease (CHD). In the healthy case, the simulations reproduced physiologic pressure-volume behavior, valve timing, and ventricular vortex formation. 
In the CHD case, simulated chamber and vessel pressures showed agreement with cardiac catheterization measurements. Simulated flow fields were qualitatively consistent with 4D-Flow MRI, while providing higher-resolution visualization of flow structures that were partially obscured by imaging artifacts. Comparison between the healthy and CHD cases further revealed altered diastolic flow organization and elevated normalized viscous dissipation in the CHD heart. 2026-05-10T16:17:30Z Fanwei Kong Aaron Brown Michael Loecher Perry S. Choi Lei Shi Michael Ma Daniel B. Ennis Alison Marsden http://arxiv.org/abs/2605.09610v1 SmartEval: A Benchmark for Evaluating LLM-Generated Smart Contracts from Natural Language Specifications 2026-05-10T15:47:46Z We introduce SmartEval, a benchmark for systematically evaluating the quality of Solidity smart contracts generated by large language models (LLMs) from natural language specifications. SmartEval provides a corpus of 9,000 generated contracts paired with expert-written ground-truth implementations drawn from the FSMSCG dataset, a five-dimensional evaluation rubric covering functional completeness, variable fidelity, state-machine correctness, business-logic fidelity, and code quality, and a reproducible generation-and-evaluation pipeline. To validate the benchmark's reliability, we conduct three independent empirical studies: a five-condition ablation study (N=300 per condition) isolating the contribution of each pipeline component, a human expert evaluation by three Columbia University PhD researchers confirming automated scores align with expert judgment to within 0.34 points, and external security analysis via the Slither static analyzer confirming 79.4% agreement between the LLM auditor and a non-LLM rule-based tool. 
Systematic analysis of 9,000 generated contracts reveals characteristic failure modes (logic omissions at 35.3%, state transition errors at 23.4%, and complexity-driven degradation) and quantifies a +8.29 composite-score advantage of generated contracts over ground-truth implementations, attributable to LLMs' literal specification-following behavior. SmartEval establishes a reproducible, validated foundation for empirical research on LLM smart contract synthesis quality, with all data, evaluation code, and generated contracts publicly released. 2026-05-10T15:47:46Z Abhinav Goel Agostino Capponi Alfio Gliozzo Chaitya Shah http://arxiv.org/abs/2602.03916v3 SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild? 2026-05-10T15:38:24Z Spatial reasoning is a fundamental aspect of human cognition, yet it remains a major challenge for contemporary vision-language models (VLMs). Prior work largely relied on synthetic or LLM-generated environments with limited task designs and puzzle-like setups, failing to capture the real-world complexity, visual noise, and diverse spatial relationships that VLMs encounter. To address this, we introduce SpatiaLab, a comprehensive benchmark for evaluating VLMs' spatial reasoning in realistic, unconstrained contexts. SpatiaLab comprises 1,400 visual question-answer pairs across six major categories: Relative Positioning, Depth & Occlusion, Orientation, Size & Scale, Spatial Navigation, and 3D Geometry, each with five subcategories, yielding 30 distinct task types. Each subcategory contains at least 25 questions, and each main category includes at least 200 questions, supporting both multiple-choice and open-ended evaluation. Experiments across diverse state-of-the-art VLMs, including open- and closed-source models, reasoning-focused, and specialized spatial reasoning models, reveal a substantial gap in spatial reasoning capabilities compared with humans. 
In the multiple-choice setup, InternVL3.5-72B achieves 54.93% accuracy versus 87.57% for humans. In the open-ended setting, all models show a performance drop of around 10-25%, with GPT-5-mini scoring highest at 40.93% versus 64.93% for humans. These results highlight key limitations in handling complex spatial relationships, depth perception, navigation, and 3D geometry. By providing a diverse, real-world evaluation framework, SpatiaLab exposes critical challenges and opportunities for advancing VLMs' spatial reasoning, offering a benchmark to guide future research toward robust, human-aligned spatial understanding. SpatiaLab is available at: https://spatialab-reasoning.github.io/. 2026-02-03T17:52:02Z Accepted to ICLR 2026 (https://openreview.net/forum?id=fWWUPOb0CT). 92 Pages. 42 Figures and 29 Tables ICLR 2026 Azmine Toushik Wasi Wahid Faisal Abdur Rahman Mahfuz Ahmed Anik Munem Shahriar Mohsin Mahmud Topu Sadia Tasnim Meem Rahatun Nesa Priti Sabrina Afroz Mitu Md. Iqramul Hoque Shahriyar Zaman Ridoy Mohammed Eunus Ali Majd Hawasly Mohammad Raza Md Rizwan Parvez http://arxiv.org/abs/2605.09467v1 Evaluating Transit Accessibility to Education and Effects of Operational Delays in Japanese Regional Cities: A Case Study of Matsumoto City 2026-05-10T10:38:12Z Realistic assessments of school commuting accessibility in areas with infrequent public transport services require accounting for operational delays; however, the impact of these delays has not been sufficiently examined. This study evaluates high-school accessibility in Matsumoto City, a regional city in Japan, using GTFS data representing both scheduled timetables and actual operating conditions. Accessibility levels are assessed under scheduled operations, while the effects of delays are examined through a comparative analysis based on actual delay measurements over a five-day workweek. Furthermore, a sensitivity analysis of travel-time thresholds was conducted. 
Results show that, when walking, cycling to stations, and public transport use are allowed, 78% of children under 15 can reach at least one high school within a 90-minute round trip, and 67% within a 60-minute round trip. Extending the threshold to 120 minutes enables access to nearly all schools in the city center, but the overall proportion increases only marginally to 81%. Delay impacts are particularly pronounced along bus routes connecting the central station with suburban areas, while in some areas, delays generate idiosyncratic events, where irregular transfers and reduced waiting times result in improved accessibility. Results underscore the need for both short-term measures, such as adjusting school start times, prioritizing buses, and introducing dedicated school routes, and long-term strategies, such as incorporating public transport accessibility into school consolidation decisions, to guarantee fair access to education opportunities without relying on private vehicles. 2026-05-10T10:38:12Z 50 pages (with appendix), 25 figures Itsuki Sato Kiyoshi Takami Giancarlos Parady http://arxiv.org/abs/2605.09288v1 MC$^2$: Monte Carlo Correction for Fast Elliptic PDE Solving 2026-05-10T03:32:46Z Partial differential equation (PDE) solvers underpin scientific computing, but real-world deployment is bounded by compute. Classical Monte Carlo solvers such as Walk-on-Spheres (WoS) are unbiased and geometry-agnostic but are slow. Learned solvers are fast but biased and brittle under distribution shift. We present \textbf{MC$^2$}, a hybrid WoS-Neural Network (WoS-NN) PDE solver that treats a low-budget Monte Carlo solution as a structured estimator of the true field and learns a single-pass neural correction to recover a high-fidelity solution. MC$^2$ matches the accuracy of solutions using over $1000\times$ more Monte Carlo compute, outperforming all evaluated classical, denoising, and neural-operator baselines.
To enable reproducible study of finite-compute PDE solving, we additionally release \textbf{PDEZoo}, the largest standardized elliptic PDE benchmark to date: 2M PDEs spanning five elliptic families and unlimited geometric compositions, with analytic ground truth and multi-budget Monte Carlo trajectories. Together \textbf{MC$^2$} and \textbf{PDEZoo} (1) empirically establish that finite-sample Monte Carlo error is structured, learnable, and correctable in a single forward pass, (2) show that we can solve PDEs $\sim$\textbf{1000x} faster than with just WoS, and (3) provide the evaluation infrastructure the field has so far lacked. 2026-05-10T03:32:46Z Ethan Hsu Hong Meng Yam Ivan Ge