http://arxiv.org/abs/2606.02614v1 Margin Play: A Multi-Agent System For Public Policy Analysis In The Brazilian Equatorial Margin 2026-05-26T13:47:24Z

The Brazilian Equatorial Margin (BEM) is Brazil's next offshore oil frontier, with operations expected to begin in 2026 in the Foz do Amazonas basin. Its assets are fiscally and territorially linked primarily to Maranhao -- the state with the lowest HDI in the Federation (0.676, IBGE 2022). This raises the central policy question: under what conditions does BEM exploration generate net positive externalities for Maranhao? The problem is intrinsically multi-agent: the Federal Government seeks revenue and energy security; the state seeks regional welfare under constitutional royalty earmarking; the operator maximizes profit under risk; ANP and IBAMA hold conflicting mandates; and Amazonian communities prioritize territorial and environmental vectors over monetary income. We present Margin Play, a Multi-Agent Reinforcement Learning (MARL) system simulating these tensions under Brazilian empirical calibration and classical economic literature. It implements six agents under the CTDE paradigm, trained with BRO-MARL. Results from 60,000 episodes across six scenarios indicate the answer is conditional on the institutional regime: under the reference baseline, the welfare gain is marginal (Waval approx. 1.68), whereas the MA-Prospero configuration yields Delta W = +17.5% and Delta Rcom = +21.3%, with a lower environmental liability (Eamb = 0.048 vs. 0.076). The fundamental problem is not a trade-off between production and welfare, but the choice of public policy regime linked to exploration.

2026-05-26T13:47:24Z Antonio de Sousa Leitão Filho Fabrício Saul Lima Selby Mykael Lima dos Santos Rejani Bandeira Vieira Sousa Luís Jorge Mesquita de Jesus Dennys Correia da Silva Allan Kardec Duailibe Barros Filho http://arxiv.org/abs/2503.21450v3 CMADiff: Cross-Modal Aligned Diffusion for Controllable Protein Generation 2026-05-26T13:35:36Z

AI-assisted protein design has emerged as a critical tool for advancing biotechnology, as deep generative models have demonstrated their reliability in this domain. However, most existing models primarily utilize protein sequence or structural data for training, neglecting the physicochemical properties of proteins. Moreover, they offer limited control over protein generation under intuitive conditions. To address these limitations, we propose CMADiff, a novel framework that enables controllable protein generation by aligning the physicochemical properties of protein sequences with text-based descriptions through a latent diffusion process. Specifically, CMADiff employs a Conditional Variational Autoencoder (CVAE) to integrate physicochemical features as conditional input, forming a robust latent space that captures biological traits. In this latent space, we apply a conditional diffusion process, which is guided by BioAligner, a contrastive learning-based module that aligns text descriptions with protein features, enabling text-driven control over protein sequence generation. Validated by a series of evaluations including AlphaFold3, the experimental results indicate that CMADiff outperforms protein sequence generation benchmarks and holds strong potential for future applications. The implementation and code are available at https://github.com/HPC-NEAU/PhysChemDiff.

2025-03-27T12:41:48Z Changjian Zhou Yuexi Qiu Jia Song http://arxiv.org/abs/2605.27011v1 Advances in polyconvex anisotropic hyperelasticity 2026-05-26T13:29:19Z

A key challenge in material theory is the formulation of models that satisfy all common mechanical constitutive conditions while retaining sufficient flexibility. In this context, several important modeling aspects remain unresolved for polyconvex anisotropic hyperelasticity. We address some of these challenges and apply our results to physics-augmented neural network (PANN) constitutive modeling. The main contributions of this paper are as follows: (1) We propose a new polyconvex PANN constitutive model for anisotropic hyperelasticity based on triclinic invariants and group symmetrization. For finite symmetry groups, this model fulfills all common mechanical constitutive conditions a priori. (2) We propose a group symmetrization-based method for the construction of polyconvex invariants for finite symmetry groups. Based on this, we derive a new integrity basis for a tetragonal symmetry group and a new functional basis for a cubic symmetry group. To the best of our knowledge, these are the first polyconvex integrity or functional bases for symmetry groups characterized by structural tensors of order higher than two. (3) We provide an extensive introduction to the construction of polyconvex integrity and functional bases, which form the basis of polyconvex invariant-based constitutive models. We discuss polyconvex bases for triclinic, isotropic, transversely isotropic, monoclinic, rhombic, tetragonal, and cubic symmetry groups. (4) We benchmark the polyconvex PANN constitutive models with highly nonlinear homogenization data of cubic metamaterials.

2026-05-26T13:29:19Z Dominik K. Klein Karl A. Kalina Rogelio Ortigosa Jesús Martínez-Frutos Markus Kästner Oliver Weeger http://arxiv.org/abs/2602.01941v2 FluxNet: Learning Capacity-Constrained Local Transport Operators for Conservative and Bounded PDE Surrogates 2026-05-26T08:42:10Z

Autoregressive learning of time-stepping operators provides an effective approach to data-driven partial differential equation (PDE) simulation, yet for conservation laws it faces a fundamental challenge: learned updates may violate global conservation over long rollouts. For the important subclass of mass-conservation-type equations, the problem is compounded by inherent physical bounds (e.g., nonnegativity or concentrations in [0,1]) whose violation further destabilizes predictions. We introduce FluxNet, which learns cumulative transport amounts representing the total conserved quantity redistributed between each cell and a configurable neighborhood over the full surrogate interval. A conservative update guarantees exact discrete conservation by construction; modular capacity-constrained transport heads (L, U, and D) enforce lower bounds, upper bounds, or near-zero dual-bound violations through architectural design. Unlike flux-rate surrogates that require temporal integration and thus inherit CFL constraints, FluxNet involves no such integration; configurable transport neighborhoods enable large-timestep prediction at full spatial resolution. Ghost cells extend the framework to non-periodic boundaries. Experiments on four benchmarks (1D convection--diffusion, 2D shallow water, 1D traffic flow, 2D Cahn--Hilliard) demonstrate exact conservation, structural bound preservation, architecture modularity, and superior stability over flux-rate surrogates at large temporal strides. The code is publicly available at: https://github.com/Lan-zs/FluxNet.
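
The idea of a conservative, lower-bounded transport update can be illustrated with a minimal sketch. This is not FluxNet's architecture: the function names, the 1D periodic grid, the nearest-neighbor stencil, and the softmax parameterization are all assumptions made for illustration. Each cell splits its own content into "stay", "send left", and "send right" fractions, so no cell can send more than it holds (nonnegativity) and everything sent is received by a neighbor (exact discrete conservation).

```python
import math

def softmax(scores):
    """Map raw scores to nonnegative fractions that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def conservative_step(u, scores):
    """One transport update on a 1D periodic grid (illustrative only).

    u      : list of nonnegative cell values
    scores : per-cell raw scores (stay, left, right); in a learned surrogate
             these would be network outputs, here they are given by hand.
    """
    n = len(u)
    new_u = [0.0] * n
    for i in range(n):
        stay, left, right = softmax(scores[i])
        new_u[i] += u[i] * stay               # amount kept in the cell
        new_u[(i - 1) % n] += u[i] * left     # amount moved to the left neighbor
        new_u[(i + 1) % n] += u[i] * right    # amount moved to the right neighbor
    return new_u

u0 = [0.0, 1.0, 0.5, 0.0]
scores = [(0.0, 1.0, -1.0), (2.0, 0.0, 0.5), (-1.0, 0.3, 0.3), (0.0, 0.0, 0.0)]
u1 = conservative_step(u0, scores)
print(sum(u0), sum(u1))  # the two totals agree up to floating-point rounding
```

Because the update only redistributes existing content, the total is preserved regardless of what the scores are; an upper bound would need an additional constraint on how much each cell may receive, which this sketch does not attempt.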

2026-02-02T10:44:10Z ICML2026 Zishuo Lan Junjie Li Lei Wang Jincheng Wang http://arxiv.org/abs/2410.10398v3 Are LLMs Socially Adaptive? Contrasting Belief Evolution in Large Language Models and Humans 2026-05-26T07:24:28Z

As large language models (LLMs) increasingly engage in complex social interactions, ensuring that their behaviors align with human ethical principles and intentions, known as value alignment, has become a critical scientific challenge. Existing benchmarks often rely on static assessments and fail to capture the longitudinal dynamics of decision-making or the latent cognitive processes driving agent behavior. In this work, we propose FairMindSim, a realistic simulation benchmark rooted in social psychology that evaluates alignment through continuous economic games. To move beyond black-box observations, we introduce the Belief-Reward Alignment Behavior Evolution Model (BREM), a probabilistic framework that formalizes decision-making as a dynamic trade-off between maximizing extrinsic rewards and upholding intrinsic beliefs. We conducted a large-scale comparative study involving 1,017 human participants and ten LLMs, including GPT-5 and Gemini-3-Pro. Our experimental results reveal a capability-linked, nonlinear empirical trend in the Third-Party Punishment (TPP) game. Mid-capability models exhibit rigid, algorithmic aggression characterized by over-punishment, while frontier models show a convergence of restraint and a shift toward human-like leniency as reasoning capabilities scale. Furthermore, using BREM, we decompose agents' longitudinal decision dynamics and find that more advanced models better balance conflicting objectives by reducing belief-action inconsistency. Our contributions provide a standardized protocol for psychological stress testing and an interpretable mechanism for analyzing the longitudinal evolution of AI alignment in controlled social dilemma settings.

2024-10-14T11:39:05Z KDD 2026 Oral Yu Lei Hao Liu Chengxing Xie Songjia Liu Zhiyu Yin Canyu Chen Guohao Li Philip Torr Zhen Wu http://arxiv.org/abs/2604.27604v2 Decoding Scientific Experimental Images: The SPUR Benchmark for Perception, Understanding, and Reasoning 2026-05-26T07:05:48Z

We introduce SPUR, a comprehensive benchmark for scientific experimental image perception, understanding, and reasoning, comprising 4,264 question-answering (QA) pairs derived from 1,084 expert-curated images. SPUR features three key innovations: (1) Panel-Level Fine-Grained Perception: evaluating the visual perception of multimodal large language models (MLLMs) across three dimensions (numerical, morphological, and information localization) on six fine-grained panel types; (2) Cross-Panel Relation Understanding: utilizing complex images with an average of 14.3 panels per sample to evaluate MLLMs' ability to decipher intricate cross-panel relations; (3) Expert-Level Reasoning: assessment of qualitative and quantitative reasoning across five experimental paradigms to determine if models can infer conclusions from evidence as human experts do. Comprehensive evaluation of 20 MLLMs and four multimodal Chain-of-Thought (MCoT) methods reveals that current models fall significantly short of the expert-level requirements for scientific image interpretation, underscoring a critical bottleneck in AI for Science (AI4S) research.

2026-04-30T08:57:18Z Accepted to ACL 2026 Main Conference Junpeng Ding Zichen Tang Haihong E Mengyuan Ji Yang Liu Haolin Tian Haiyang Sun Pengqi Sun Yang Xu Yichen Liu Haocheng Gao Zijie Xi Ruomeng Jiang Peizhi Zhao Rongjin Li Yuanze Li Jiacheng Liu Zhongjun Yang Jintong Chen Siying Lin http://arxiv.org/abs/2605.25639v2 AeroTSBoost: Temporal-Statistical Boosting for Real-World UAV Telemetry Anomaly Mining 2026-05-26T04:41:14Z

Mining anomalies from unmanned aerial vehicle (UAV) state-estimation logs is challenging because failures are sparse, temporally structured, and distributed across heterogeneous PX4 telemetry streams with variable sensor availability and missing values. We present AeroTSBoost, a temporal-statistical boosting framework for real-world UAV telemetry anomaly mining. AeroTSBoost aligns multivariate flight logs, converts each window into deterministic descriptors that capture distributional shifts, quantile structure, endpoint drift, local dynamics, and lag correlation, and trains a class-balanced LightGBM detector. On UAV-SEAD, AeroTSBoost achieves the strongest AUPRC among evaluated classical, supervised tabular, neural reconstruction, recurrent, Granger-causality-based, and frequency-domain baselines. Across five seeds, it reaches $0.7516\pm0.0043$ AUPRC and $0.5342\pm0.0108$ threshold-swept event F1, improving AUPRC by 5.79 absolute points over the strongest non-AeroTSBoost baseline. Under purged chronological and leave-log-out protocols, it remains the best AUPRC method, reaching $0.6066\pm0.0193$ and $0.6388\pm0.0315$, respectively. On related ALFA fixed-wing UAV fault logs, AeroTSBoost reaches $0.9259\pm0.0076$ leave-sequence-out AUPRC, ahead of RandomForest ($0.8835\pm0.0797$) and moments-only ($0.8700\pm0.0481$). These results show that deterministic temporal-statistical representations remain highly competitive for sparse anomaly mining in operational cyber-physical telemetry.
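
To make the notion of deterministic per-window descriptors concrete, here is a small sketch for a single univariate window. The specific feature set, the feature names, and the function `window_descriptors` are assumptions chosen to mirror the categories named in the abstract (distribution, quantile structure, endpoint drift, local dynamics, lag correlation); the paper's actual descriptors may differ, and the LightGBM training step is not shown.

```python
import statistics

def window_descriptors(window):
    """Compute a few deterministic summary features for one telemetry window.

    window: list of floats with at least two samples (hypothetical input format).
    """
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)
    q1, median, q3 = statistics.quantiles(window, n=4)   # quartile cut points
    drift = window[-1] - window[0]                       # endpoint drift
    diffs = [abs(b - a) for a, b in zip(window, window[1:])]
    mean_abs_diff = statistics.fmean(diffs)              # local dynamics
    centered = [x - mean for x in window]
    denom = sum(c * c for c in centered)
    numer = sum(a * b for a, b in zip(centered, centered[1:]))
    lag1 = numer / denom if denom > 0 else 0.0           # lag-1 autocorrelation
    return {
        "mean": mean, "std": std,
        "q1": q1, "median": median, "q3": q3,
        "drift": drift, "mean_abs_diff": mean_abs_diff,
        "lag1_autocorr": lag1,
    }

features = window_descriptors([1.0, 2.0, 3.0, 4.0, 5.0])
print(features)
```

Each window maps to a fixed-length feature vector, which is what allows a tabular gradient-boosting model to be trained on logs of varying length.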

2026-05-25T09:40:37Z Junhao Wei Haochen Li Yanxiao Li Yifu Zhao Dexing Yao Baili Lu Xudong Ye Sio-Kei Im Yapeng Wang Xu Yang http://arxiv.org/abs/2605.19267v2 Bounding LVR in AMMs via Secant-Tangent Divergence and Collateralized Liquidity Scaling 2026-05-26T02:00:16Z

Automated Market Makers face a geometric dilemma: expanding liquidity depth to reduce execution slippage increases Liquidity Providers' exposure to toxic arbitrage, quantified as Loss-Versus-Rebalancing (LVR). We study the Hybrid Liquidity-Collateral Pool (HLCP), a stylized architecture that aims to partially decouple execution quality from active risk exposure through an N-scaled virtual invariant and a collateral buffer. The analysis first characterizes the geometric divergence between execution slippage and marginal-price deviation, then uses this divergence to motivate a trigger-based collateral injection rule. In a stylized duopoly model, under hyper-saturated background liquidity and non-zero volatility or collateral yield, adopting the HLCP is a Nash equilibrium and Pareto-improving relative to a standard AMM benchmark. Empirically, we examine two settings. Under a stochastic-volatility-with-jumps stress scenario, the trigger policy avoids one-shot total buffer depletion under the imposed control law and simulated shock path. Using 2025 Uniswap V2 data with zero collateral yield, the HLCP exhibits lower realized LVR and higher net LP return than the standard CPMM benchmark in the sample considered.
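
The gap between execution slippage and marginal-price deviation can be seen in a plain constant-product pool. The sketch below assumes a fee-free x*y = k market maker and is not the HLCP design; the function name and the reserve figures are illustrative. The average execution price is the slope of the secant through the pre- and post-trade reserve points, while the marginal price is the slope of the tangent, and the two move by different amounts for the same trade.

```python
def cpmm_trade(x, y, dx):
    """Sell dx of asset X into a fee-free constant-product pool with reserves (x, y)."""
    dy = y * dx / (x + dx)            # amount of Y paid out, keeps x*y constant
    pre_price = y / x                 # marginal (tangent) price before the trade
    exec_price = dy / dx              # average (secant) price the trader receives
    post_price = (y - dy) / (x + dx)  # marginal price after the trade
    return {
        "slippage": 1.0 - exec_price / pre_price,    # equals dx / (x + dx)
        "price_move": 1.0 - post_price / pre_price,  # equals 1 - (x / (x + dx))**2
    }

shallow = cpmm_trade(100.0, 100.0, 10.0)
deep = cpmm_trade(1000.0, 1000.0, 10.0)   # same trade, reserves scaled by 10
print(shallow, deep)
```

Scaling the reserves lowers slippage for the trader, but in a standard pool it also means more capital is exposed to arbitrage when the external price moves, which is the dilemma the abstract describes.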

2026-05-19T02:29:39Z Hyoungsung Kim Yong-Suk Park http://arxiv.org/abs/2605.19939v2 Uncertainty-aware Machine Learning Interatomic Potentials via Learned Functional Perturbations 2026-05-25T20:58:41Z

Machine Learning Interatomic Potentials (MLIPs) achieve near ab initio accuracy at a fraction of the cost of quantum-mechanical simulations, yet they remain prone to silent failures on out-of-distribution configurations, making principled uncertainty quantification (UQ) essential for error-aware simulations and active learning. Existing non-ensemble UQ methods for MLIPs rely either on variational inference or on parametric distributional assumptions, both of which add architectural complexity and hyper-parameters that must be tuned per task. Inspired by recent advances in probabilistic weather forecasting, we propose a simpler alternative: turn a deterministic MLIP into a probabilistic one through learned functional perturbations and finetune it end-to-end with the Continuous Ranked Probability Score (CRPS), a proper scoring rule. We validate the approach with an equivariant GNN (P-EGNN) trained from scratch and by finetuning the foundation model Orb-v3 for silica. On the N-body charged particle benchmark, P-EGNN improves CRPS over the state-of-the-art Bayesian MLIP method BLIP by 19-32% across all training sizes; on silica, P-Orb raises the Spearman correlation between predicted uncertainty and actual error from 0.75 (BLIP-Orb) to 0.84.
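
For readers unfamiliar with CRPS, the following sketch computes the standard sample-based estimate for a scalar target, CRPS = E|X - y| - 0.5 E|X - X'|, where X and X' are independent draws from the predictive distribution. How the paper generates its samples and applies the score to energies or forces is not specified here; the function name `crps_ensemble` and the toy numbers are assumptions.

```python
def crps_ensemble(samples, y):
    """Sample-based CRPS for a scalar observation y (lower is better).

    Uses the empirical distribution of `samples`, averaging over all ordered
    pairs (including i == j) for the spread term.
    """
    m = len(samples)
    accuracy = sum(abs(x - y) for x in samples) / m
    spread = sum(abs(a - b) for a in samples for b in samples) / (m * m)
    return accuracy - 0.5 * spread

print(crps_ensemble([0.0, 2.0], 1.0))   # 0.5
print(crps_ensemble([1.0], 1.0))        # 0.0
```

The first term rewards samples close to the observation and the second rewards spread, so a model is penalized both for being wrong and for being overconfident; with a single sample the score reduces to the absolute error.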

2026-05-19T15:00:06Z Olga Zaghen Maksim Zhdanov Dario Coscia David R. Wessels Erik J. Bekkers http://arxiv.org/abs/2606.07567v1 SurfDesign: Effective Protein Design on Molecular Surfaces 2026-05-25T19:53:02Z

Protein function is largely determined by molecular surface geometry and physicochemical complementarity, yet most protein design methods condition only on backbone structure. We introduce SurfDesign, a surface-conditioned protein design framework that models molecular surfaces as continuous geometric manifolds and integrates them with pretrained protein language models. SurfDesign employs surface-based equivariant message passing to capture surface normals, curvature, and directional geometry, together with a parameter-efficient fine-tuning strategy. Focusing on functional protein design, we show that SurfDesign consistently outperforms prior surface-conditioned and backbone-only methods on de novo binder and enzyme design benchmarks. We also report strong performance on inverse-folding benchmarks as a diagnostic of structural compatibility. Our results highlight manifold-aware surface representations as a principled foundation for functional protein and enzyme design. Code is available at https://github.com/smiles724/SurfDesign.

2026-05-25T19:53:02Z KDD 2026 AI4Science Fang Wu Shuting Jin Xiangru Tang Mark Gerstein Xiangxiang Zeng Yejin Choi Jure Leskovec Jinbo Xu http://arxiv.org/abs/2605.26279v1 Constraint acquisition needs better benchmarks 2026-05-25T19:05:12Z

Constraint Acquisition (CA) and related research on the validation and enhancement of Mathematical Programming (MP) models from domain knowledge artifacts are currently limited by inadequate benchmarks. This deficiency impedes reproducibility and cross-study comparability, slowing the maturation of CA methods. Existing benchmarks were designed for solver evaluation rather than for assessing CA algorithms. They are loosely organized, treat individual problems inconsistently, and omit the domain knowledge artifacts required by CA methods. This work presents MPMMine, a benchmark suite designed to assess algorithms that discover, validate, and enhance MP models using diverse domain knowledge artifacts. MPMMine is guided by consistency, standardization, completeness, extensibility, openness, and version control. It adopts a uniform structure and relies on open formats: MiniZinc, CommonMark, and JSON. It provides multiple models per problem, tens of instances per model, and thousands of solutions and non-solutions in both integer and continuous domains, alongside natural-language descriptions to support text-to-model methods.

2026-05-25T19:05:12Z 12 pages, 1 figure, for the associated dataset, see https://github.com/MPMMine/MPMMine Rafał Stachowiak Tomasz P. Pawlak http://arxiv.org/abs/2605.27459v1 Real-Time In Silico Modeling of Postprandial Macronutrient Kinetics: A Validated Computational Engine for Nutrition Research and Digital Health 2026-05-25T18:03:46Z

Simulation of post-prandial pharmacokinetics, such as muscle protein synthesis (MPS) through mTORC1 and insulin-induced glucose uptake, is often challenging due to the computational intensity of the multi-compartmental approach. In this study, I introduce an in silico metabolic simulator that uses bi-compartmental Bateman kinetic processes, gamma-variate distributions, and finite state machine reasoning to solve temporal differential equations instantaneously, generating metabolic curves and predictions depending on input meals. The novel underlying algorithm was custom-built entirely independent of third-party libraries or external services. This original computational engine, bridging the gap between academia and the digital health sector, is integrated within a web dashboard and provided as a service via REST APIs. The average response time is approximately 135 ms with a maximum below 750 ms. The multi-dimensional model was calibrated using a Landmark Validation approach across diverse dietary conditions (Whey Protein, mixed meal, OGTT) and optimized via Grid Search. Ultimately, the system achieved a global physiologically optimal Mean Absolute Percentage Error (MAPE) of $\sim18\%$ while maintaining an algorithmic complexity of $O(n \log n)$.
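
As a point of reference, the textbook Bateman function for first-order absorption and elimination has the closed form C(t) = D·ka / (V·(ka − ke)) · (exp(−ke·t) − exp(−ka·t)), which can be evaluated directly without numerical integration. The sketch below is that generic formula only, not the paper's engine; the parameter values, the units, and the function names are hypothetical.

```python
import math

def bateman(t, dose, ka, ke, volume=1.0):
    """Concentration at time t for first-order absorption (ka) and elimination (ke).

    Requires ka != ke. All parameter values used below are illustrative.
    """
    scale = dose * ka / (volume * (ka - ke))
    return scale * (math.exp(-ke * t) - math.exp(-ka * t))

def bateman_tmax(ka, ke):
    """Time of peak concentration, from setting dC/dt = 0."""
    return math.log(ka / ke) / (ka - ke)

ka, ke = 1.5, 0.3                       # per-hour rate constants (hypothetical)
t_peak = bateman_tmax(ka, ke)
curve = [bateman(0.25 * i, dose=30.0, ka=ka, ke=ke) for i in range(33)]  # 0 to 8 h
print(round(t_peak, 3), round(max(curve), 3))
```

Because the curve is a closed-form expression, evaluating it on a time grid costs one pass over the grid, which is consistent with the sub-second response times the abstract reports for a whole-meal simulation.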

2026-05-25T18:03:46Z Alberto Calderone http://arxiv.org/abs/2605.26076v1 AI-Powered Sustainable Finance: An Integrative Taxonomy and Framework of AI Applications for Sustainable Investment Decision-Making 2026-05-25T17:41:19Z

The integration of Artificial Intelligence into sustainable finance represents a transformative paradigm shift in how Environmental, Social, and Governance factors are analyzed, predicted, and incorporated into investment decisions. This review provides a comprehensive taxonomy of AI approaches applicable to sustainable investment decision-making, categorizing methodologies based on their underlying algorithms and their impact on ESG-related financial processes. The proposed AI Taxonomy includes machine learning paradigms -- including supervised, unsupervised, and reinforcement learning -- as well as natural language processing techniques and optimization algorithms, examining their specific applications in ESG score prediction, controversy detection, portfolio management, and sustainability report analysis. By synthesizing findings from the recent literature, a framework emerges on AI-powered sustainable finance that identifies technological applications to overcome ESG data barriers.

2026-05-25T17:41:19Z Eduardo C. Garrido-Merchán Esther Vaquero Lafuente Elisa Aracil http://arxiv.org/abs/2605.26066v1 The Evolution of Digital Twins from Reactive to Agentic Systems 2026-05-25T17:30:09Z

Digital twins are evolving into self-learning, autonomous systems that link models, data, and human interaction. Realizing their full potential depends on interoperability, standardization, and the integration of artificial intelligence and advanced computational reasoning across sectors.

2026-05-25T17:30:09Z Nat Comput Sci 6, 6-10 (2026) Omer San Adil Rasheed Eda Bozdemir Jun Deng 10.1038/s43588-025-00944-0 http://arxiv.org/abs/2605.22663v2 Therm-FM: Foundation Model is ALL YOU NEED for 3D-ICs Thermal Simulation 2026-05-25T17:22:04Z

Data-driven thermal predictors for 3D-ICs are often trained from scratch for each chip design using many high-fidelity finite-element simulations, leading to high data-generation cost and costly cross-design reuse. We propose Therm-FM, a neural operator framework that adapts a pretrained partial differential equation (PDE) foundation model to steady-state and transient 3D-IC thermal simulation. The motivation is that steady-state and transient chip-level heat conduction respectively share elliptic and parabolic operator structures with diffusion-type PDEs, allowing pretrained diffusion priors to provide an effective initialization for thermal-field prediction under heterogeneous materials, dense TSV/microbump interconnects, and package-level boundary conditions. To further reduce data-generation cost, Therm-FM incorporates a thermal-equivalent multi-fidelity training strategy that uses low-cost approximate simulations for thermal-domain adaptation and limited high-fidelity samples for calibration. Experiments on public HotSpot benchmarks and industrial 3D-IC package benchmarks show that Therm-FM achieves up to a 10.6x reduction in mean error and surpasses prior best accuracy with less than 20% of the training data. In cross-chip adaptation, it matches or surpasses full-data baselines in several metrics using only 10--30 target samples. We release datasets, source code, and pretrained models at https://github.com/haiyangxin/Therm-FM.

2026-05-21T16:03:48Z 14 pages, 10 figures, extended version of a DAC 2026 paper Zhen Huang Haiyang Xin Wenkai Yang Yangbo Wei Zhiping Yu Yu Zhang Wei W. Xing Ting-Jung Lin Lei He