http://arxiv.org/abs/2607.15258v1 Decoding Market Emotion from Blockchain Activity: A Data-Driven Sentiment Classifier 2026-07-16T17:52:26Z

The growing use of Bitcoin as a decentralized digital asset and investment tool has sparked strong interest in understanding its market behavior. This study presents a new approach to analyze Bitcoin market sentiment by combining on-chain and financial data with social media posts. Unlike models that aim to predict prices, this work focuses on explaining market sentiment using blockchain transactions, historical price data of Bitcoin, and daily Twitter sentiment classifications. The method merges sentiment trends with on-chain and financial metrics, normalized into a dataset for detailed market analysis. Multiple machine learning models were tested using cross-validation, with Gradient Boosting (XGBoost) emerging as the most reliable model for classifying sentiment, achieving an average F1-score of about 0.84. SHAP (SHapley Additive exPlanations), a game theory-based method for model interpretability, was used to quantify the contribution of on-chain features to the model's predictions, improving transparency. The results indicate that this data combination yields meaningful predictive signals and insights, supporting data-driven cryptocurrency analysis and future improvements with deep learning.

2026-07-16T17:52:26Z This manuscript has been accepted for presentation at the IEEE International Symposium on Computers and Communications (ISCC 2026) Arthur G. Bubolz Abreu Quevedo Giancarlo Lucca Rafael A. Berri Eduardo Borges Bruno L. Dalmazo http://arxiv.org/abs/2607.13220v2 Networked Intelligence: Active Shared Context Graphs for Human-AI Team Science 2026-07-16T17:32:26Z

Most AI-for-science systems focus on scaling a single reasoning process by using better models, larger context windows, long-horizon agentic execution, or digital co-scientists working with one principal user. However, challenging scientific problems are rarely solved by one reasoner alone. They are solved by teams whose members carry different priors, experimental background, tacit knowledge, and domain-trained intuitions. The open problem is therefore not only how to scale models, but how to develop "networked intelligence", scaling the connections between humans and AI systems so that a result or hypothesis produced in one context reaches another person, agent, instrument or robot that can act on it. We introduce Mycelium, an active shared workspace that automatically connects researchers and AI agents. As human users and agents work, the system captures important observations and hypotheses, tracks how they relate to the team's evolving knowledge model, and routes them to the person or agent whose next decision they can inform. We evaluate Mycelium through a real-world scientific discovery use case: a biological multi-omics campaign where shared context turned a local analytical finding into a cross-expert mechanistic constraint and ultimately into an experimental design. Finally, we describe networked intelligence as sparse conditional computation over distributed scientific contexts. This framework establishes when a scaled standalone agent is sufficient, and when isolated data and specialized expertise make a networked approach essential.

2026-07-14T19:29:16Z Sutanay Choudhury Jeffrey J. Czajka Lummy M. O. Monteiro Erin Bredeweg Jason McDermott Katherine Wolf Alex Beliaev Josh Elmore Paul Piehowski Kylee Tate Yuqian Gao Aivett Bilbao Kelly Stratton Scott Baker Jaydeep P. Bardhan Kristin Burnum Johnson Chris Oehmen Robert Rallo http://arxiv.org/abs/2607.15049v1 Neural operators solve inverse problems for constitutive model discovery 2026-07-16T14:29:38Z

Characterizing the mechanical response of materials traditionally requires solving optimization problems in which model parameters are calibrated or trained to minimize the discrepancy between model predictions and experimental data. This process can be computationally expensive and time-consuming. To overcome this limitation, we propose two neural operator architectures that directly map experimentally measured data to the constitutive functions governing the mechanical response of the material: Physics-Augmented Neural Operators (PANO) and Constitutive Artificial Neural Operators (CANO). The proposed neural operators approximate the mapping between the infinite-dimensional input space of full-field displacement measurements and net reaction forces, and the infinite-dimensional output space of hyperelastic strain-energy density functions. The displacement fields are encoded through Laplacian eigenfunctions to obtain discretization-independent and noise-robust predictions. Our framework constrains the output space to physically admissible material models that satisfy fundamental physical requirements by design. The neural operators are trained on simulated data tuples of displacement fields and reaction forces for a range of material models. Once trained, the neural operators enable near-instantaneous material characterization and require only a single forward pass to infer the strain-energy density function from a given experimental dataset. We test the predictive power of the neural operators for unseen data, noisy data, data with missing information, data from different spatial discretizations, and data from geometries of different sizes.

2026-07-16T14:29:38Z Moritz Flaschel Burigede Liu Ellen Kuhl http://arxiv.org/abs/2606.14565v2 CANN-EUCLID: unsupervised constitutive artificial neural network model discovery from full-field data 2026-07-16T10:49:35Z

Constitutive artificial neural networks (CANNs) provide interpretable material model discovery, but have so far been used in stress-supervised settings based on apparent stress-strain data from homogeneous tests. Because each test samples only a narrow loading path and provides homogenized rather than local stress information, robust discovery typically requires multiple loading modes to constrain the multidimensional response. This is challenging for soft biological tissues, where repeated testing, damage, and sample variability limit reliable information from a single specimen. Here, we combine CANNs with the stress-unsupervised full-field discovery framework EUCLID to identify sparse hyperelastic laws directly from displacement fields and reaction forces in one heterogeneity-inducing loading case. CANN-EUCLID minimizes equilibrium imbalance with sparsity-promoting regularization selecting compact active terms, without local stress measurements or a prescribed law. We evaluate the approach on isotropic and anisotropic benchmarks with prescribed ground-truth laws. When the ground truth is representable by the chosen CANN basis, our method recovers the correct terms with near-exact accuracy, including exponential terms with embedded parameters. When it is not contained in the basis, the method retains shared terms and approximates missing contributions using available basis functions. Generalization depends strongly on sampled deformation states: exponential strain-stiffening terms can be recovered accurately when sufficiently probed, but can produce large extrapolation errors when the stiffening regime lies outside the sampled domain. Forward FE validation simulations show that the discovered behavior accurately replicates the ground truth. These results establish stress-unsupervised CANN discovery as a promising framework for interpretable full-field constitutive model identification.

2026-06-12T15:44:06Z Benjamin Alheit Siddhant Kumar Mathias Peirlinck http://arxiv.org/abs/2607.14686v1 Class Weighting versus Amount Conditioning in Credit-Card Fraud Detection: A Dollar-Metric Study with a Temporal Explanation Audit 2026-07-16T07:45:56Z

Credit-card fraud losses are monetary, but papers often judge models with transaction-level scores. We ask whether transaction amount should shape training weights or be used later to order alerts. To separate this question from ordinary class imbalance handling, we keep total fraud-case weight fixed and vary only its allocation across fraud cases. The experiments test two chronological card-fraud datasets with XGBoost under unweighted training, standard class weighting, matched log-amount weighting, stronger amount-weighted variants, and score times amount reranking. Metrics are average precision, dollar recall, and dollar precision at fixed alert budgets over five seeds, with 95 percent day-block bootstrap intervals for the main contrasts. Results are narrower than expected. Amount-derived ratio and velocity features carry much of the signal, while raw amount fields add little once those features are in the model. In the matched setting, amount-conditioned training gives only small gains over class weighting and does not consistently beat the plain unweighted model. Stronger amount weights recover more fraudulent dollars, but at lower ranking quality and dollar precision. Reranking alerts by score times amount after training gives the largest dollar-recall shift. A small SHAP audit finds larger month-to-month attribution movement for fraud cases than for aggregate traffic. In these tests, amount is useful as a feature and as an alert-ordering variable, not by itself as a better sample-weighting rule.
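The distinction between ranking alerts by model score and reranking by score times amount can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `dollar_recall_at_budget`, the `rank_by` argument, and the toy data are all assumptions introduced here.

```python
def dollar_recall_at_budget(scores, amounts, is_fraud, budget, rank_by="score"):
    """Fraction of fraudulent dollars captured by the top `budget` alerts.

    rank_by="score"          orders alerts by model score only.
    rank_by="score_x_amount" reranks by score times transaction amount.
    (Both option names are hypothetical, chosen for this sketch.)
    """
    if rank_by == "score":
        keys = list(scores)
    elif rank_by == "score_x_amount":
        keys = [s * a for s, a in zip(scores, amounts)]
    else:
        raise ValueError(f"unknown rank_by: {rank_by!r}")

    # Highest key first; keep only as many alerts as the budget allows.
    order = sorted(range(len(keys)), key=lambda i: keys[i], reverse=True)
    flagged = order[:budget]

    total_fraud_dollars = sum(a for a, f in zip(amounts, is_fraud) if f)
    if total_fraud_dollars == 0:
        return 0.0
    caught = sum(amounts[i] for i in flagged if is_fraud[i])
    return caught / total_fraud_dollars


# Toy data (invented): a small high-score fraud and a large low-score fraud.
scores = [0.90, 0.80, 0.30, 0.20]
amounts = [10.0, 20.0, 500.0, 5.0]
is_fraud = [True, False, True, False]

by_score = dollar_recall_at_budget(scores, amounts, is_fraud, budget=1)
reranked = dollar_recall_at_budget(
    scores, amounts, is_fraud, budget=1, rank_by="score_x_amount"
)
print(by_score, reranked)
```

With a budget of one alert, score-only ranking flags the $10 fraud, while reranking moves the $500 fraud to the top, so dollar recall rises even though the model itself is unchanged.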

2026-07-16T07:45:56Z Chenyu Wu http://arxiv.org/abs/2607.14600v1 A Nonlinear Model Predictive Control Perspective on Gradient-Based Optimization: A New Efficient, Parameter-Free and Provably Stable Algorithm 2026-07-16T06:00:22Z

This paper discusses aspects of gradient-based optimization algorithms, with a special focus on the requirements associated with their use in implementing Nonlinear Model Predictive Control (NMPC). Based on a dedicated discussion, a new algorithm, termed Search and Accelerate (SaA), is proposed that combines a novel line search and a trust-region mechanism with an adaptation of the gradient acceleration scheme. A dedicated benchmark of 600 instances of box-constrained optimization problems is designed and used to demonstrate the algorithm's performance, which makes it a highly competitive general-purpose gradient-based alternative for box-constrained optimization problems. An appealing feature of the algorithm is its robustness to the choice of the few parameters involved in its definition, which makes the default values a valid option for any problem without a priori knowledge of the related Lipschitz constant. Moreover, an example of the use of the proposed algorithm in an NMPC implementation is given, showing the possibility of reducing the control updating period, which might be mandatory in some circumstances.
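For orientation, the problem class in this abstract (box-constrained minimization without a known Lipschitz constant) can be illustrated with a textbook baseline: projected gradient descent with Armijo backtracking. This is explicitly not the SaA algorithm, whose line search, trust-region mechanism, and acceleration are not described in the abstract; every name and default value below is an assumption of this sketch.

```python
def clip(x, lo, hi):
    """Project x onto the box [lo, hi] componentwise."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]


def projected_gradient(f, grad, x0, lo, hi, iters=200,
                       step0=1.0, shrink=0.5, c=1e-4, tol=1e-10):
    """Projected gradient descent with backtracking (a generic baseline).

    The step size is found by backtracking, so no Lipschitz constant
    has to be supplied by the user.
    """
    x = clip(x0, lo, hi)
    for _ in range(iters):
        g = grad(x)
        fx = f(x)
        step = step0
        while True:
            x_new = clip([xi - step * gi for xi, gi in zip(x, g)], lo, hi)
            d = [a - b for a, b in zip(x_new, x)]
            # Armijo-type sufficient decrease along the projected step.
            if f(x_new) <= fx + c * sum(gi * di for gi, di in zip(g, d)):
                break
            step *= shrink
            if step < 1e-16:  # give up shrinking; accept the tiny step
                break
        moved = sum(di * di for di in d) ** 0.5
        x = x_new
        if moved < tol:  # projected step no longer moves the iterate
            break
    return x


# Toy problem (invented): the unconstrained minimizer (2, -1) lies
# outside the unit box, so the solution sits on the boundary.
f = lambda x: (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2
grad = lambda x: [2.0 * (x[0] - 2.0), 2.0 * (x[1] + 1.0)]

x_star = projected_gradient(f, grad, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
print(x_star)
```

A baseline like this is the kind of method a benchmark over box-constrained instances would typically compare against.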

2026-07-16T06:00:22Z 15 pages, 14 Figures Mazen Alamir http://arxiv.org/abs/2410.03057v3 What Causes Performance Degradation in Cross-Subject EEG Classification? 2026-07-16T05:26:10Z

Cross-subject Electroencephalography (EEG) classification typically achieves significantly lower performance than subject-dependent settings. Although this phenomenon has been widely observed in the literature, the underlying causes have not been systematically studied. In this paper, we design a series of controlled experiments to investigate the mechanisms behind the performance drop in cross-subject EEG classification across different EEG tasks. We show that the performance degradation can generally be attributed to two factors: inter-subject variability and shortcut learning. Specifically, multi-class-per-subject EEG classification tasks, such as motor imagery, emotion recognition, and ERP stimulus classification, are mainly affected by inter-subject variability, whereas single-class-per-subject EEG classification tasks, such as brain disease detection, are primarily influenced by shortcut learning based on subject-specific features. These findings provide new insights into the challenges of cross-subject EEG classification and emphasize the importance of appropriate evaluation protocols in EEG research. The code is available at https://github.com/DL4mHealth/EEG-Cross-Subject.
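The evaluation-protocol point can be made concrete with a subject-wise split, in which every recording from a given subject lands entirely in either the training set or the test set. The sketch below is an illustration under stated assumptions, not code from the linked repository; `subject_wise_split` and its parameters are hypothetical names.

```python
import random


def subject_wise_split(subject_ids, test_fraction=0.25, seed=0):
    """Split sample indices so that no subject appears on both sides.

    subject_ids: one subject label per sample.
    Returns (train_indices, test_indices).
    """
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])

    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx


# Toy data (invented): four subjects with three samples each.
subject_ids = ["s1"] * 3 + ["s2"] * 3 + ["s3"] * 3 + ["s4"] * 3
train_idx, test_idx = subject_wise_split(subject_ids)

train_subjects = {subject_ids[i] for i in train_idx}
held_out_subjects = {subject_ids[i] for i in test_idx}
print(sorted(train_subjects), sorted(held_out_subjects))
```

A sample-level random split would instead scatter each subject's samples across both sets. In single-class-per-subject tasks, where the subject determines the label, that lets a model score well by recognizing subjects rather than the condition, which is the shortcut the abstract describes.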

2024-10-04T00:35:17Z Accepted by the IEEE International Conference on Systems, Man, and Cybernetics (IEEE SMC 2026) Yihe Wang Taida Li Yujun Yan Wenzhan Song Xiang Zhang http://arxiv.org/abs/2607.14556v1 Impact of Expert-Following Strategies in Financial Asset Recommendation 2026-07-16T04:34:57Z

Financial institutions hold rich transaction histories, yet delivering recommendations that simultaneously maximize investment returns and ensure preference alignment remains a significant challenge. Existing approaches, namely return-based and preference-based strategies, each optimize a single objective, resulting in a fundamental trade-off between profitability (ROI) and relevance (nDCG). In this paper, we propose the Expert-Following Strategies: a framework that identifies top-performing investors based on their historical ROI and recommends the assets they purchased, scored by ROI-weighted purchase frequency. Our experiments using real-world transaction histories show that our strategy achieves statistically significant improvement over the market-average baseline in both ROI and nDCG simultaneously across all four thresholds.
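The scoring idea described above can be sketched in a few lines. The abstract does not give the exact formula, so this assumes the simplest reading: investors at or above an ROI threshold count as experts, and each of their purchases adds that investor's historical ROI to the asset's score. The function name, argument names, and toy data are hypothetical.

```python
from collections import defaultdict


def expert_following_scores(purchases, investor_roi, roi_threshold):
    """Rank assets by ROI-weighted purchase frequency among experts.

    purchases:     iterable of (investor, asset) pairs, one per purchase.
    investor_roi:  dict mapping investor -> historical ROI.
    roi_threshold: minimum historical ROI to count as an expert (assumed rule).
    """
    experts = {inv for inv, roi in investor_roi.items() if roi >= roi_threshold}

    scores = defaultdict(float)
    for investor, asset in purchases:
        if investor in experts:
            # Each purchase counts once, weighted by the buyer's ROI.
            scores[asset] += investor_roi[investor]

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (invented): "c" has negative ROI and is not followed.
investor_roi = {"a": 0.20, "b": 0.10, "c": -0.05}
purchases = [("a", "X"), ("a", "Y"), ("b", "Y"), ("c", "Z"), ("b", "Y")]

ranking = expert_following_scores(purchases, investor_roi, roi_threshold=0.05)
print(ranking)
```

Asset Y ranks first because it is bought three times by experts, and asset Z never appears because its only buyer falls below the threshold. Sweeping `roi_threshold` corresponds to the multiple thresholds mentioned in the abstract.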

2026-07-16T04:34:57Z 2 pages, 1 figure Ryuki Unno Koshi Watanabe Keigo Sakurai Keisuke Maeda Takahiro Ogawa Miki Haseyama http://arxiv.org/abs/2607.14475v1 One-Shot Generative Design for Disordered Metamaterials via Self-Organizing Neural Cellular Automata 2026-07-16T01:43:03Z

Disordered metamaterials feature microstructures with inherent randomness and irregularity, enabling them to achieve broader property coverage and superior performance unavailable in their regular counterparts. Despite their promise, designing disordered microstructures is substantially harder than designing regular ones. Their design remains trapped between manual parameterizations with limited expressiveness, and generative AI that is data-hungry and struggles to generalize. To address these limitations, we propose a generative design framework based on Neural Cellular Automata that dynamically grows complex microstructures through learned local interaction rules, inspired by the self-organizing processes in natural materials. This framework requires only a single training template, yet accommodates diverse disordered microstructures and adapts to irregular domains and arbitrary discretizations. By manipulating the learned local rules, we can steer the growth process to generate microstructures unseen during training, providing control over orientation, anisotropy, and directional thickness without retraining. As a dynamic, local growth process, it naturally produces spatially varying microstructures that transition smoothly to enable location-specific mechanical properties. We demonstrate this in a multiscale mechanical cloaking design, where microstructures vary across the space to meet an optimized heterogeneous property distribution. Our design enables excellent cloaking performance without complicated post-processing and incompatible assembly common in existing methods. This data-efficient, generalizable approach opens access to previously intractable disordered materials for biomedical implants and soft robotics.

2026-07-16T01:43:03Z Yujie Xiang Liwei Wang http://arxiv.org/abs/2607.12771v2 Learning Mechanistic Reasoning for Chemical Reactions with Large Language Models 2026-07-16T00:41:49Z

Reaction mechanisms consist of the step-by-step sequences of elementary reactions that explain chemical transformations. Learning the mechanism logic is therefore essential for enhancing the fundamental chemical intelligence of large language models (LLMs). The stepwise deduction of reaction mechanisms aligns naturally with the reasoning paradigms of reasoning LLMs. However, current chemical LLMs primarily emphasize coarse-grained name reactions for product prediction and retrosynthesis, often leading to physical inconsistencies and hallucinations. In contrast, specialized small-scale generative models for mechanism inference typically suffer from restricted generalization capacity across diverse chemical spaces. To overcome these limitations, we built a novel, large-scale reasoning dataset of reaction mechanisms. Furthermore, we established FukuyamaBench, a difficult benchmark derived from Fukuyama's Advanced Organic Reaction Mechanism book, to rigorously evaluate model performance on hierarchical mechanism reasoning. Our fine-tuned Qwen3-30B-A3B achieves 8.3% exact pathway match on FukuyamaBench Set A, surpassing the specialized FlowER model (5.1%), demonstrating that mechanism-aware training substantially enhances chemical reasoning in language models.

2026-07-14T13:44:51Z Xingyu Dang Haocheng Tang Junmei Wang Yanjun Li http://arxiv.org/abs/2607.14397v1 Model-Informed Joint Material-Structural Optimization of Hard-Magnetic Soft Materials 2026-07-15T22:23:20Z

This work develops a model-informed framework for predictive analysis and optimal design of hard-magnetic soft materials (hMSMs). These materials undergo contact-free, field-driven deformation, making them attractive for soft robotics, adaptive structures, and bio-inspired systems. Accurate prediction requires effective structure-property relations, while optimal design requires simultaneous control of structural density, magnetic particle distribution, and remanent magnetization direction. To address these issues, this work makes two main contributions. First, classical rigid-inclusion relations, a Hill self-consistent relation, and constrained-kinematics models are placed into a unified effective shear-modulus framework for particle-filled elastomers. With one default control relation, seven shear-modulus relations are combined with three strain-energy density functions to obtain 21 constitutive models. The results show that the strain-energy density form has a relatively small effect for the actuation problems considered, whereas the effective shear-modulus relation can significantly affect deformation when magnetic material overlaps with highly deforming regions. Experimental stress-strain data are then used to select a representative shear-modulus relation, with the Mooney relation giving the best overall agreement. Second, using the selected constitutive model, a joint material-structural optimization framework is developed for simultaneous design of structural density, magnetic particle volume fraction, and remanent magnetization direction. Rotational, translational, and restorative examples show that the framework handles different active design fields, objectives, and single- or multi-load-case formulations, producing non-intuitive hMSM designs with prescribed deformation responses. The framework is implemented in the open-source CEADpx/top_optim repository.
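As a rough illustration of what an effective shear-modulus relation looks like, the sketch below evaluates a commonly quoted Mooney-type form, mu_eff = mu_matrix * exp(2.5 * phi / (1 - k * phi)), where phi is the particle volume fraction. The coefficients 2.5 and k = 1.35 are assumptions taken from the general literature on particle-filled elastomers; the abstract does not state which variant or constants the authors use, and the function below is not taken from the top_optim repository.

```python
import math


def mooney_effective_shear_modulus(mu_matrix, phi, crowding=1.35):
    """Mooney-type effective shear modulus of a particle-filled elastomer.

    mu_matrix: shear modulus of the unfilled matrix.
    phi:       particle volume fraction, 0 <= phi < 1 / crowding.
    crowding:  crowding factor k (1.35 is an assumed, commonly quoted value).
    """
    if phi < 0.0 or crowding * phi >= 1.0:
        raise ValueError("phi must satisfy 0 <= phi < 1 / crowding")
    return mu_matrix * math.exp(2.5 * phi / (1.0 - crowding * phi))


# Stiffening of a hypothetical 100 kPa matrix as particle content increases.
for phi in (0.0, 0.1, 0.2, 0.3):
    print(phi, mooney_effective_shear_modulus(100.0, phi))
```

The relation reduces to the matrix modulus at zero particle content and stiffens monotonically with phi, which is why the placement of magnetic particles in highly deforming regions can change the predicted actuation.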

2026-07-15T22:23:20Z 53 pages, 20 figures, 8 tables; Repository: https://github.com/CEADpx/top_optim Ian Galloway Prashant K. Jha http://arxiv.org/abs/2607.14392v1 Unified Uncertainty Quantification Framework Bridging Noisy Quantum Backends Across Variational Quantum Algorithms and Quantum Signal Processing 2026-07-15T22:10:05Z

We present an uncertainty quantification (UQ) framework for application level benchmarking and characterization of noisy quantum backends. The framework compares two workload classes under one statistical pipeline: noisy intermediate scale quantum (NISQ) variational quantum algorithms (VQAs) and Quantum Singular Value Transformation (QSVT) based Green's function reconstruction. For the VQA branch, we evaluate ten benchmark families spanning chemistry, optimization, simulation, compiling, linear solving, partial differential equations, metrology, error correction, tomography, and channel fidelity estimation. For the QSVT branch, we reconstruct orbital resolved Green's functions and spectral peaks from a block encoded real time propagator. The workflow combines Bayesian optimization, posterior distribution refinement, sensitivity analysis, robust parameter density estimation, backend ranking, noise correlation, and resource estimation analysis. Instead of reporting only one best parameter vector, the framework identifies robust parameter regions, residual gaps to ideal behavior, backend specific failure modes, and calibration sensitive uncertainty. The result is a common benchmark for variational and non-variational workloads that measures how reliably each backend reaches useful task level behavior.

2026-07-15T22:10:05Z Priyabrata Senapati Vibin Abraham Qiang Guan Bo Peng http://arxiv.org/abs/2607.14321v1 Accounting for Hysteresis and Eddy Currents in Finite Element Simulations of Ferromagnetic Laminated Cores using a Recurrent Neural Network 2026-07-15T19:38:30Z

Incorporating hysteresis and eddy currents into finite element simulations of laminated-core electrical machines is computationally challenging. Resolving the fields inside the laminations at each integration point and at every nonlinear iteration leads to computational costs several orders of magnitude higher than anhysteretic simulations, making such approaches impractical for design applications. Conversely, simplified models accounting only for magnetic saturation are becoming increasingly inadequate as electrical machine topologies and operating conditions grow in complexity. In this context, machine learning surrogate modeling has emerged as a promising alternative, offering efficient and accurate approximations of complex electromagnetic behaviors. In this paper, a recurrent neural network is trained as a surrogate of a laminated-core material model for an isotropic laminated core, and is integrated into realistic two-dimensional magnetodynamic finite element simulations based on a magnetic vector potential formulation. The proposed approach achieves excellent agreement with the reference laminated-core model while limiting the computational cost to about twice that of an anhysteretic simulation. By training the recurrent neural network on a sufficiently diverse set of artificially generated magnetic field sequences designed to mimic those encountered in electrical machine simulations, the proposed approach can be readily applied across a wide range of finite element simulations. Furthermore, the trained surrogate model is provided as a standalone component that can be easily incorporated into existing computational frameworks. It is publicly available at https://gitlab.onelab.info/getdp/lamnet.

2026-07-15T19:38:30Z Florent Purnode Louis Denis François Henrotte Gilles Louppe Christophe Geuzaine http://arxiv.org/abs/2507.03209v2 A Machine Learning Benchmarking Framework for Lipid Nanoparticle Transfection Efficiency Prediction 2026-07-15T19:14:14Z

The discovery of new ionizable lipids for efficient lipid nanoparticle (LNP)-mediated RNA delivery remains a major bottleneck in RNA therapeutics development. Recent advances demonstrate the potential of machine learning (ML) models to predict transfection efficiency directly from lipid structure, enabling high-throughput virtual screening and accelerating lead identification. However, as new models for LNP transfection prediction continue to emerge, the lack of rigorous and standardized benchmarking poses a significant risk and may undermine confidence in their reliability for discovery. Here, we present a robust ML benchmarking framework for evaluating transfection prediction models based on ionizable lipid structures. This framework systematically benchmarks diverse molecular representations paired with a broad range of ML architectures spanning traditional models, feedforward neural networks, and state-of-the-art graph-based methods. In addition, the presented framework supports assessment of model generalization and evaluates prediction reliability beyond standard regression metrics. Using a curated dataset of 1,100 unique ionizable lipid structures derived from the HeLa transfection dataset originally reported by Xu et al., we show that within this framework, models leveraging explicit molecular substructure encoding consistently achieve the highest predictive accuracy and should serve as essential baselines for the development of new, more sophisticated models. In contrast, some current graph-based models, including AGILE, Chemprop, and KPGT, tend to show comparatively lower accuracy. The presented framework provides a standardized, transparent, and comprehensive benchmarking resource that enables meaningful comparison of emerging architectures and establishes strong baselines for future development of predictive models in lipid-based RNA delivery.

2025-07-03T22:49:49Z Published in Communications AI & Computing (Nature Portfolio), 2026 Commun. AI Comput. 1, 2 (2026) Asal Mehradfar Mohammad Shahab Sepehri Jose Miguel Hernandez-Lobato Glen S. Kwon Mahdi Soltanolkotabi Salman Avestimehr Morteza Rasoulianboroujeni 10.1038/s44488-026-00007-x http://arxiv.org/abs/2607.14026v1 From Forecasts to Auditable Reports: Evidence Contracts for LLM-Assisted Housing-Guarantee Risk Monitoring 2026-07-15T16:57:21Z

Translating next-month housing-guarantee risk forecasts into auditable operational reports is essential yet challenging because upper-tail events are sparse, source records are confidential, and generated narratives can distort the underlying evidence. Using monthly South Korean jeonse deposit guarantee data from September 2015 to December 2025, we introduce an evidence-constrained reporting pipeline that prioritizes upper-tail monitoring, retrieves historical precedents aligned with the forecasting rationale, organizes admissible information into typed evidence contracts, and verifies generated claims before analyst review. We train and select the forecasting backbone on the original panel, whereas the reporting experiments use synthetic aggregate scenarios calibrated to its empirical ranges and temporal structure. The selected forecasting model substantially improves high-risk detection while retaining competitive average error. Across eight LLMs, structured evidence consistently increases report quality, numerical fidelity, and claim-level grounding. A practitioner evaluation involving 51 analysts and related domain professionals further indicates that the reports support real-world review and decision-making: most participants rated them as practically useful and endorsed an operational pilot. These findings demonstrate that reliable LLM-assisted reporting requires predictive models to be coupled with structured evidence, explicit verification, and analyst oversight.

2026-07-15T16:57:21Z Hyeongcheol Kim Yoontae Hwang