https://arxiv.org/api/w94O7GPgeTxEUtgtlH+QvKtvA78 2026-05-16T00:28:12Z 10445 15 15 http://arxiv.org/abs/2605.12701v1 Do Fair Models Reason Fairly? Counterfactual Explanation Consistency for Procedural Fairness in Credit Decisions 2026-05-12T19:54:25Z Machine learning algorithms in socially sensitive domains (e.g., credit decisions) often focus on equalizing predictive outcomes. However, satisfying these metrics does not guarantee that models use the same reasoning for different groups. We show that existing outcome-fair models can still apply fundamentally different reasoning to individuals, a ``hidden procedural bias'' missed by standard fairness metrics and algorithms. We propose Counterfactual Explanation Consistency (CEC), a framework that detects and mitigates this bias by aligning feature attributions between individuals and their counterfactual counterparts. Key contributions include a nearest-neighbor counterfactual generation method, a modified baseline for integrated gradient comparisons, an individual-level procedural fairness metric, and a corresponding training loss. We introduce a taxonomy identifying ``Regime B'' (same outcome, different reasoning) as a critical blind spot. Experiments on synthetic data, German Credit, Adult Income, and HMDA mortgage data demonstrate that outcome-fair baselines exhibit substantial hidden bias, while CEC substantially reduces it with modest utility cost. 2026-05-12T19:54:25Z Gideon Popoola John Sheppard http://arxiv.org/abs/2605.12435v1 Environment-Adaptive Preference Optimization for Wildfire Prediction 2026-05-12T17:31:00Z Predicting rare extreme events such as wildfires from meteorological data requires models that remain reliable under evolving environmental conditions. This problem is inherently long-tailed: wildfire events are rare but high-impact, while most observations correspond to non-fire conditions, causing standard learning objectives to underemphasize the minority class (fire) that matters most. In addition, models trained on historical distributions often fail under distribution shifts, exhibiting degraded performance in new environments. To this end, we propose Environment-Adaptive Preference Optimization (EAPO), a framework that adapts prediction to the target environment with long-tail distribution. Given a new input distribution, we first construct distribution-aligned datasets via $k$-nearest neighbor retrieval. We then perform a hybrid fine-tuning procedure on this local manifold, combining supervised learning with preference optimization, as well as emphasizing on rare extreme events. EAPO refines decision boundaries while avoiding conflicting signals from heterogeneous training data. We evaluate EAPO on a real-world wildfire prediction task with environmental shifts. EAPO achieves robust performance (ROC-AUC 0.7310) and improves detection in extreme regimes, demonstrating its effectiveness in dynamic wildfire prediction systems. 2026-05-12T17:31:00Z Enyi Jiang Wu Sun http://arxiv.org/abs/2605.12007v1 A geometry-aligned multi-fidelity framework for uncertainty quantification of wildfire spread 2026-05-12T11:55:52Z Forward propagation of input uncertainties in physics-based wildfire models is computationally prohibitive, limiting the use of high-fidelity simulators in risk assessment workflows. This work introduces a geometry-aligned bi-fidelity surrogate framework that addresses the convection-dominated nature of wildfire spread by mapping low- and high-fidelity solution snapshots onto a common reference domain prior to basis selection and reconstruction. Unlike conventional bi-fidelity schemes, which combine spatially shifted snapshots and thus suffer from oscillations and excess basis requirements near sharp fronts, the proposed mapping aligns the dominant front geometry through per-variable shift/stretch transforms in 1D and an activity indicator-based affine alignment in 2D, so that reduced bases compare physically corresponding structures rather than displaced ones. Building on the ADfiRe physics-based simulator, we demonstrate the method on 1D and 2D test cases in which low- and high-fidelity models differ in mesh resolution and physical completeness. Across both settings, the geometry-aligned surrogate reproduces full-field temperature and fuel composition with substantially lower error than its unmapped counterpart, eliminates Gibbs-type oscillations near steep gradients, and recovers high-fidelity probability density functions for key quantities of interest (e.g., maximum temperature, evaporated moisture, and burned area). After offline training, online predictions are roughly three orders of magnitude cheaper than direct high-fidelity evaluation, making the framework a practical building block for many-query uncertainty quantification once the offline cost is amortized over enough queries. We discuss the conditions under which the geometric alignment is most effective, its limitations for non-convex or topologically complex fronts, and the path toward validation against real data. 2026-05-12T11:55:52Z 23 pages, 15 figures Konstantinos Vogiatzoglou Costas Papadimitriou Vasilis Bontozoglou Petros Koumoutsakos Han Gao http://arxiv.org/abs/2605.11798v1 Advancing Dynamic Ride-Pooling Simulation -- A Highly Scalable Dispatcher 2026-05-12T08:56:50Z In ride-pooling, a fleet of vehicles is dynamically dispatched to bring travelers from A to B, trying to pool riders with similar itineraries to improve the use of resources compared to taxis or private cars. Ride-pooling is considered a core building block of future transport systems with autonomous vehicles. In this paper, we introduce Mt-KaRRi, a novel dispatcher for dynamic ride-pooling that leverages state-of-the-art shortest-path algorithms to process millions of travelers per hour. We add a simple mode choice model and use realistic travel demand in three different urban areas for extensive experiments. We find that our dispatcher scales well with a response time per request of around 1ms even for our largest instances. We show how this scalability can be used to conduct ride-pooling studies at unprecedented scale. For instance, we determine how the quality of rides and usage of vehicle resources develop for tens of thousands of vehicles and millions of travelers. We envision Mt-KaRRi as a tool for future ride-pooling simulation studies at scale. 2026-05-12T08:56:50Z Moritz Laupichler Robin Andre Kim Kandler Peter Sanders Peter Vortisch http://arxiv.org/abs/2605.11784v1 Crash Assessment via Mesh-Based Graph Neural Networks and Physics-Aware Attention 2026-05-12T08:50:49Z Full-vehicle crash simulations are computationally expensive, limiting their use in iterative design exploration. This work investigates learned hybrid surrogate models (MeshTransolver, MeshGeoTransolver, and MeshGeoFLARE) for predicting time-resolved structural deformation fields in an industrial lateral pole-impact benchmark. We evaluate whether neural surrogates can reproduce full-field crash kinematics with sufficient accuracy, spatial regularity, and structural plausibility for engineering interpretation. The proposed architectures combine local mesh message passing, geometry-aware global attention, and sparse contact-aware correction for autoregressive crash rollout. We compare mesh-based graph neural networks, attention-based geometric models, and hybrid architectures under a common training and hyperparameter configuration. The hybrid models capture both short-range structural interactions and long-range deformation patterns, while a sparse contact-aware variant assesses the effect of dynamic proximity interactions during rollout. On a 25-sample full-vehicle test set, the best hybrid model achieves a temporal mean root-mean-square error of 3.20 mm. While geometry-aware attention baselines are quantitatively competitive, qualitative side-view inspection shows they can introduce local spatial noise and deformation irregularities that complicate structural interpretation. In contrast, hybrid mesh-attention models provide the best balance between scalar accuracy, survival-space consistency, and physically interpretable displacement fields. These results suggest that crash surrogate assessment should combine global error metrics with downstream safety-relevant quantities and qualitative field inspection. The proposed methodology enables fast full-field predictions while preserving essential structural information for industrial crash-engineering analysis. 2026-05-12T08:50:49Z 40 pages, 15 figures, 7 tables Gabriel Curtosi Carlos Manuel Ruiz Ruiz Fabiola Cavaliere Xabier Larráyoz Izcara http://arxiv.org/abs/2605.11759v1 A nonlinear extension of parametric model embedding for dimensionality reduction in parametric shape design 2026-05-12T08:32:39Z Dimensionality reduction is essential in simulation-based shape design, where high-dimensional parameterizations hinder optimization, surrogate modeling, and systematic design-space exploration. Parametric Model Embedding (PME) addresses this issue by constructing reduced variables from geometric information while preserving an explicit backmapping to the original design parameters. However, PME is intrinsically linear and may become inefficient when the sampled design space is governed by nonlinear geometric variability. This paper introduces a nonlinear extension of PME, denoted NLPME. The proposed framework preserves the defining principle of PME -- geometry-driven latent variables and parameter-mediated reconstruction -- while replacing the linear reduced subspace with a nonlinear latent representation. Geometry is not reconstructed directly from the latent variables; instead, the latent representation is decoded into admissible design parameters, and the corresponding geometry is recovered through a forward parametric map. The method is assessed on a bio-inspired autonomous underwater glider with a 32-dimensional parametric shape description and a CAD-based geometry-generation process. NLPME reaches a 5\% reconstruction-error threshold with \(N=5\) latent variables, compared with \(N=8\) for linear PME, and a 1\% threshold with \(N=9\), compared with \(N=15\) for PME. Comparison with a deep autoencoder shows that most of the nonlinear compression gain can be retained while preserving an explicit backmapping to the original design variables. The results establish NLPME as a compact, admissible, and engineering-compatible nonlinear reduced representation for parametric shape design spaces. 2026-05-12T08:32:39Z Andrea Serani Giorgio Palma Matteo Diez http://arxiv.org/abs/2605.11733v1 Position: LLM Inference Should Be Evaluated as Energy-to-Token Production 2026-05-12T08:15:04Z LLM inference is still evaluated mainly as a model or software problem: accuracy, latency, throughput, and hardware utilization. This is incomplete. At deployment scale, the relevant output is a quality-conditioned token produced under joint constraints from effective compute, delivered data-center power, cooling capacity, PUE, and utilization. We argue that the ML community should treat inference as \emph{energy-to-token production}. We formalize this view with a dimensionally consistent Token Production Function in which token rate is bounded by both compute-per-token and energy-per-token ceilings. Listed API prices vary by over an order of magnitude across providers, but we use price dispersion only as directional motivation, not as causal evidence of marginal cost. The core physical question is instead: under fixed quality and service targets, when does the binding constraint move from theoretical peak compute toward delivered power, cooling, and operational efficiency? Under this framing, system optimizations -- latent KV-cache compression, sparse or heavily compressed attention, quantization, routing, and difficulty-adaptive reasoning -- are not merely local engineering tricks. They are energy-to-token levers because they reduce FLOPs/token, joules/token, memory traffic, or utilization losses under fixed $(q^{*},s^{*})$. We therefore call for inference papers and benchmarks to report Joules/token, active binding constraint, PUE-adjusted delivered power, and utilization-adjusted token output alongside accuracy and latency. 2026-05-12T08:15:04Z https://dominic789654.github.io/energy-to-token/ Xiang Liu Shimiao Yuan Zhenheng Tang Peijie Dong Kaiyong Zhao Qiang Wang Bo Li Xiaowen Chu http://arxiv.org/abs/2605.08916v2 Diffusion Restore: Real-Time Markov Chain Monte Carlo Light Transport 2026-05-12T06:56:50Z We present Diffusion Restore, a real-time framework for diffusion-based MCMC light transport. MCMC methods are highly suitable for sampling from complex high-dimensional distributions and for approximating integrals over them. In practice, they are often the only viable solution when direct sampling is not possible and alternative methods are either inefficient or cannot be applied due to the structure of the target distribution. However, controlling the exploration of the target distribution in MCMC methods remains challenging. Efficient exploration requires a balance between local exploration and global discovery, and local dynamics must rapidly explore individual modes without getting stuck or exhibiting excessive backtracking. The problem of global discovery has recently been addressed by the introduction of the Restore framework. In this work, we build on this framework and focus on improving local exploration. We show how to choose diffusion-based local dynamics within the Restore framework while completely avoiding Metropolis-adjustment, which is known to slow down convergence. Furthermore, we model these dynamics as nonreversible, introducing momentum in the drift and thereby enabling more directed exploration of the target distribution compared to reversible, random-walk-like dynamics. We provide a theoretical justification for the validity of our choice of local dynamics. Empirically, we demonstrate across diverse scenes that Diffusion Restore outperforms all existing MCMC light transport methods and establishes a new state of the art. In addition, we present a GPU implementation in ray tracing and compute shaders and achieve real-time frame rates. This demonstrates that Diffusion Restore is not only superior in offline rendering, but also outperforms traditional Path Tracing methods in real-time rendering settings, such as interactive applications and games. 2026-05-09T12:31:52Z Sascha Holl Gurprit Singh Hans-Peter Seidel http://arxiv.org/abs/2605.09523v2 HS-FNO: History-Space Fourier Neural Operator for Non-Markovian Partial Differential Equations 2026-05-12T04:49:45Z Neural operators provide fast surrogate models for time-dependent partial differential equations, but their standard autoregressive use usually assumes that the instantaneous field $u(t,\cdot)$ is a complete state. This assumption fails for delay equations, distributed-memory systems, and other non-Markovian dynamics: two trajectories may agree at time $t$ and nevertheless have different futures because their histories differ. We introduce the History-Space Fourier Neural Operator (HS-FNO), a neural operator for delay and memory-driven PDEs formulated on the lifted state $u_t(θ,x)=u(t+θ,x)$, $θ\in[-τ,0]$. The key computational step is to decompose one history-state update into a learned predictor for the newly exposed future slice and an exact shift-append transport for the portion of the history window already known from the previous state. This avoids learning deterministic history coordinates, reduces the learned output dimension, and enforces the natural discrete history update. We test HS-FNO on five benchmark families covering delayed reaction--diffusion, spatial epidemiology, nonlocal neural-field dynamics, delayed waves, and distributed-memory closures. Across ten random seeds, HS-FNO attains the lowest aggregate one-step, history-space, and rollout errors among the principal baselines. The largest gain occurs in autoregressive prediction, where aggregate rollout error decreases from $0.241$, $0.188$, and $0.185$ for current-state, lag-stack, and unconstrained history-to-history operators, respectively, to $0.094$. The same model uses fewer parameters than unconstrained history prediction. These results indicate that enforcing the discrete shift structure of history-state evolution is an effective inductive bias for non-Markovian PDE surrogate modeling. 2026-05-10T13:14:59Z 15 pages, 4 figures, 1 table. Code at https://github.com/lennonshikhman/hs-fno/ Lennon J. Shikhman http://arxiv.org/abs/2605.11524v1 EqOD: Symmetry-Informed Stability Selection for PDE Identification 2026-05-12T04:48:38Z Data-driven identification of partial differential equations (PDEs) relies on sparse regression over a candidate library of differential operators, where larger libraries inflate false positives under observation noise and smaller libraries risk missing true terms. We introduce Equivariant Operator Discovery (EqOD), a fully automatic method combining two library reduction mechanisms. When Galilean invariance is detected from trajectory data via a weak-form structural test, EqOD uses the symmetry-reduced library, eliminating terms that our Galilean exclusion result proves to be absent from the governing equation. Otherwise, it applies randomized LASSO stability selection guided by classical false-positive bounds. A residual-based fallback prevents degradation below the full-library baseline. On 8 PDEs at 4 noise levels, EqOD attains $F_1 = 1.000 \pm 0.000$ on Heat at $20\%$ noise, where WF-LASSO obtains $0.475 \pm 0.181$, official PySINDy 2.0 obtains $0.000$, and the WSINDy reimplementation obtains $0.789$. Under the strict criterion that the mean F1 difference exceeds the larger of the two standard deviations, EqOD wins 7 of 32 cells. WF-LASSO wins none, and the remaining 25 cells are ties. Across all 32 cells, EqOD outperforms PySINDy 2.0.0 in 23 of 32 cells, and all 5 PySINDy wins occur on reaction PDEs. External validation on WeakIdent and PINN-SR datasets gives $F_1 = 1.000$ on all 5 clean benchmarks. NLS, 2D, coupled-system, and cylinder-wake extensions are reported. The Galilean library reduction is proved under explicit autonomy and library assumptions. The stability-selection step is motivated by classical false-positive bounds, while formal guarantees for correlated PDE design matrices remain open. 2026-05-12T04:48:38Z 45 pages, 16 figures Gnankan Landry Regis N'guessan Bum Jun Kim http://arxiv.org/abs/2604.03409v2 Generative AI for material design: A mechanics perspective from burgers to matter 2026-05-12T00:29:55Z Generative artificial intelligence offers a new paradigm to design matter in high-dimensional spaces. However, its underlying mechanisms remain difficult to interpret and limit adoption in computational mechanics. This gap is striking because its core tools-diffusion, stochastic differential equations, and inverse problems-are fundamental to the mechanics of materials. Here we show that diffusion-based generative AI and computational mechanics are rooted in the same principles. We illustrate this connection using a three-ingredient burger as a minimal benchmark for material design in a low-dimensional space, where both forward and reverse diffusion admit analytical solutions: Markov chains with Bayesian inversion in the discrete case and the Ornstein-Uhlenbeck process with score-based reversal in the continuous case. We extend this framework to a high-dimensional design space with 146 ingredients and 8.9x10^43 possible configurations, where analytical solutions become intractable. We therefore learn the discrete and continuous reverse processes using neural network models that infer inverse dynamics from data. We train the models on only 2,260 recipes and generate one million samples that capture the statistical structure of the data, including ingredient prevalence and quantitative composition. We further generate five new burgers and validate them in a blinded restaurant-based sensory study with n = 101 participants, where three of the AI-designed burgers outperform the classical Big Mac in overall liking, flavor, and texture. These results establish diffusion-based generative modeling as a physically grounded approach to design in high-dimensional spaces. They position generative AI as a natural extension of computational mechanics, with applications from burgers to matter, and establish a path toward data-driven, physics-informed generative design. 2026-04-03T19:12:01Z 25 pages, 15 figures, 2 tables Vahidullah Tac Ellen Kuhl http://arxiv.org/abs/2605.11355v1 gym-invmgmt: An Open Benchmarking Framework for Inventory Management Methods 2026-05-12T00:22:15Z Inventory-policy comparisons are often difficult to interpret because performance depends on the evaluation contract as much as on the policy itself. Differences in topology, demand regime, information access, feasibility constraints, shortage treatment, and Key Performance Indicator (KPI) definitions can change method rankings. We present gym-invmgmt, a Gymnasium-compatible extension of the OR-Gym inventory-management lineage for auditable cross-paradigm evaluation. The benchmark evaluates optimization, heuristic, and learned controllers under a shared CoreEnv transition, reward, action-bound, and KPI contract, while varying stress conditions through a 22-scenario core grid plus four supplemental MARL-mode rows. Within these released scenarios, informed stochastic programming provides the strongest non-oracle reference, reflecting the value of scenario hedging under forecast access, but at substantially higher online computational cost. Among learned controllers, the Proximal Policy Optimization Transformer variant (PPO-Transformer) achieves the strongest learned-policy quality at fast inference, while Residual Reinforcement Learning (Residual RL) provides competitive hybrid performance. The graph neural network variant (PPO-GNN) is highly competitive on the default divergent topology but less robust on the serial topology. Imitation learning performs well in stationary regimes but degrades under demand shift, and the bounded Large Language Model (LLM) policy-parameter baseline is best interpreted as a diagnostic controller rather than an autonomous inventory optimizer. Overall, the benchmark identifies scenario-conditioned leaders while showing that performance depends jointly on information access, demand shift, topology, and policy representation. 2026-05-12T00:22:15Z 16 pages, 4 figures Reza Barati Qinmin Vivian Hu http://arxiv.org/abs/2605.11346v1 Physics-Informed Teacher-Student Ensemble Learning for Traffic State Estimation with a Varying Speed Limit Scenario 2026-05-11T23:59:31Z Physics-informed deep learning (PIDL) neural networks have shown their capability as a useful instrument for transportation practitioners in utilizing the underlying relationship between the state variables for traffic state estimation (TSE). Another efficient traffic management approach is implementing varying speed limits (VSLs) on transportation corridors to control traffic and mitigate congestion. However, the existing training architecture of PIDL in the literature cannot accommodate the changing traffic characteristics on a freeway with VSL. To tackle this challenge, we propose a novel framework integrating teacher-student ensemble training with PIDL neural networks for TSE under VSL scenarios. The physics of flow conservation law is encoded locally in the teacher models by PIDL, and the student model uses a multi-layer perceptron classifier (MLP) to identify traffic characteristics and selects the ensemble member of PIDL neural networks for TSE. This integrated framework provides a natural solution for capturing the heterogeneity of VSL and accurately addressing the TSE problem. The case study results validate the proposed ensemble approach, demonstrating its superior performance in TSE compared to other popular baseline methods, as indicated by relative L2 error. 2026-05-11T23:59:31Z The IEEE International Conference on Intelligent Transportation Systems (ITSC) 2026 Archie J. Huang Dongdong Wang Shaurya Agarwal Mohamed Abdel-Aty Md Mahmudul Islam Muhammad Shahbaz http://arxiv.org/abs/2505.11517v4 Information-Theoretic Grid Topology Reconstruction using Low-Precision Smart Meter Data 2026-05-11T21:55:35Z Accurate knowledge of power grid topology is a prerequisite for effective state estimation and grid stability. While data-driven methods for topology reconstruction exist, the minimum requirements for measurement quality, specifically regarding quantization, precision, and sampling frequency, remain under-explored. This study investigates the data fidelity required to reconstruct distribution grid topologies using voltage magnitude measurements. Adopting an information-theoretic approach, we utilize the Chow-Liu algorithm to generate maximum spanning trees based on mutual information. Rather than proposing a new reconstruction algorithm, our primary contribution is a comprehensive sensitivity analysis of the measurement data itself. We systematically evaluate the impact of data bit-depth, significant digit truncation, time-window length, and different mutual information estimators on reconstruction accuracy. We validate this approach using IEEE test cases (via MATPOWER) and time-series data from GridLAB-D. Our results demonstrate that grid topology can be successfully recovered even with highly quantized 8-bit data or millivolt-level precision. However, performance degrades significantly when downsampling intervals exceed 20 minutes or when data availability is limited to short durations. These findings establish an optimistic theoretical lower bound, suggesting that costly high-precision instrumentation may not be strictly necessary for structural inference under ideal conditions. This rigorous baseline provides a foundation for future evaluations of noisy real world smart meter data and hybrid approaches that incorporate existing engineering priors. 2025-05-07T20:09:04Z Daniel T. Speckhard http://arxiv.org/abs/2605.10831v1 SLIM: Sparse Latent Steering for Interpretable and Property-Directed LLM-Based Molecular Editing 2026-05-11T16:47:25Z Large language models possess strong chemical reasoning capabilities, making them effective molecular editors. However, property-relevant information is implicitly entangled across their dense hidden states, providing no explicit handle for property control: a substantial fraction of edits fail to improve or even degrade target properties. To address these issues, we propose SLIM (Sparse Latent Interpretable Molecular editing), a plug-and-play framework that decomposes the editor's hidden states into sparse, property-aligned features via a Sparse Autoencoder with learnable importance gates. Steering in this sparse feature space precisely activates property-relevant dimensions, improving editing success rate without modifying model parameters. The same sparse basis further supports interpretable analysis of editing behavior. Experiments on the MolEditRL benchmark across four model architectures and eight molecular properties show consistent gains over baselines, with improvements of up to 42.4 points. 2026-05-11T16:47:25Z Mingxu Zhang Yuhan Li Lujundong Li Dazhong Shen Hui Xiong Ying Sun