Smart Timing for Mining: A Deep Learning Framework for Bitcoin Hardware ROI Prediction

2026-05-25T17:08:15Z

Bitcoin mining hardware acquisition requires strategic timing due to volatile markets, rapid technological obsolescence, and protocol-driven revenue cycles. Despite mining's evolution into a capital-intensive industry, there is little guidance on when to purchase new Application-Specific Integrated Circuit (ASIC) hardware, and no prior computational frameworks address this decision problem. We address this gap by formulating hardware acquisition as a time series classification task, predicting whether purchasing ASIC machines yields profitable (Return on Investment (ROI) >= 1), marginal (0 < ROI < 1), or unprofitable (ROI <= 0) returns within one year. We propose MineROI-Net, an open-source Transformer-based architecture designed to capture multi-scale temporal patterns in mining profitability. Evaluated on data from 20 ASIC miners released between 2015 and 2024 across diverse market regimes, MineROI-Net outperforms recurrent, convolutional, and attention-based baselines, achieving 83.2% accuracy and 83.5% macro F1-score. The model demonstrates strong economic relevance, achieving 97.8% precision in detecting unprofitable periods and 81.5% precision in detecting profitable ones, while avoiding misclassifying profitable scenarios as unprofitable and vice versa. These results indicate that MineROI-Net offers a practical, data-driven tool for timing mining hardware acquisitions, potentially reducing financial risk in capital-intensive mining operations.
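
The three ROI classes stated in the abstract can be written as a small labeling function. This is a minimal sketch: the function name `roi_class` and the label strings are illustrative, and the abstract does not say how the one-year ROI itself is computed, so it is taken here as a given number.

```python
def roi_class(roi: float) -> str:
    """Map a one-year ROI value to the three classes named in the abstract.

    Thresholds follow the abstract: ROI >= 1 is profitable,
    0 < ROI < 1 is marginal, ROI <= 0 is unprofitable.
    """
    if roi >= 1:
        return "profitable"
    if roi > 0:
        return "marginal"
    return "unprofitable"


# Hypothetical ROI values, used only to exercise each branch.
labels = [roi_class(r) for r in (1.4, 1.0, 0.35, 0.0, -0.2)]
```

The boundary cases matter for label construction: an ROI of exactly 1 falls in the profitable class and an ROI of exactly 0 in the unprofitable class.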

PolyGnosis 2.0: Enhancing LLM Reasoning via Agentic Harness Engineering for Polymarket and OSINT Insight Extraction

2026-05-25T15:30:54Z

This paper introduces PolyGnosis 2.0, a pioneering multi-agent architecture designed to extract predictive intelligence by synthesizing Polymarket anomaly signals with global Open Source Intelligence (OSINT) streams, specifically the Global Database of Events, Language, and Tone (GDELT). We define and target "Perspective Mismatches", the narrative divergence between Polymarket sentiment and global media flows, as high-alpha trading signals. Moving beyond generic agentic superiority, we rigorously quantify the efficacy of "Harness Engineering" techniques, including reflection loops, tool-calling, divide-and-conquer partitioning (D&C), and chain-of-thought (CoT), within high-noise financial domains. Our empirical evaluation against human-expert benchmarks reveals that while structural partitioning is mandatory for multi-dimensional alignment, unconstrained terminal reflection actively induces logical drift. Furthermore, we identify a pervasive "consensus bias" across all agent configurations during narrative reasoning, necessitating deterministic validation. Ultimately, we isolate a Pareto-optimal configuration that achieves professional-grade analytical precision while minimizing latency and token overhead, providing a robust blueprint for autonomous intelligence in prediction markets.

AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery

2026-05-25T14:50:02Z

Artificial intelligence research increasingly depends on prolonged cycles of reproduction, debugging, and iterative refinement to achieve State-Of-The-Art (SOTA) performance, creating a growing need for systems that can accelerate the full pipeline of empirical model optimization. In this work, we introduce AutoSOTA, an end-to-end automated research system that advances the latest SOTA models published in top-tier AI papers to reproducible and empirically improved new SOTA models. We formulate this problem through three tightly coupled stages: resource preparation and goal setting; experiment evaluation; and reflection and ideation. To tackle this problem, AutoSOTA adopts a multi-agent architecture with eight specialized agents that collaboratively ground papers to code and dependencies, initialize and repair execution environments, track long-horizon experiments, generate and schedule optimization ideas, and supervise validity to avoid spurious gains. We evaluate AutoSOTA on recent research papers collected from eight top-tier AI conferences under filters for code availability and execution cost. Across these papers, AutoSOTA achieves strong end-to-end performance in both automated replication and subsequent optimization. Specifically, it successfully discovers 105 new SOTA models that surpass the original reported methods, averaging approximately five hours per paper. Case studies spanning LLM, NLP, computer vision, time series, and optimization further show that the system can move beyond routine hyperparameter tuning to identify architectural innovations, algorithmic redesigns, and workflow-level improvements. These results suggest that end-to-end research automation can serve not only as a performance optimizer, but also as a new form of research infrastructure that reduces repetitive experimental burden and helps redirect human attention toward higher-level scientific creativity.

From Reports to Ontologies: Ontology-Guided Representation Learning for 12-Lead ECG

2026-05-25T14:12:23Z

The 12-lead electrocardiogram (ECG) is a quasi-periodic, multi-channel signal with diagnostic content spanning timescales from millisecond waveform morphology to multi-second rhythm dynamics. Existing ECG representation learning relies on signal-only self-supervision or ECG-text multimodal alignment, neither of which exploits the structured diagnostic codes attached to every clinical recording. We present \textbf{MAR-ECG}, an ontology-guided masked autoregressive framework that supervises the encoder with a curated 40-node SNOMED-CT cardiac graph through \emph{graph alignment}, eliminating the need for paired clinical reports. MAR-ECG combines two complementary objectives. First, \emph{graph-smoothed contrastive learning} (GSCL) anchors the encoder's rhythm-pooled features to the SNOMED graph, softening supervision targets by ontology distance so that clinically related concepts reinforce one another rather than function as hard negatives. Second, \emph{multi-scale physiological supervision} complements GSCL with signal-derived patch auxiliaries that target rhythm-physiology statistics extracted automatically from the input, extending supervision beyond the patch tier at no annotation cost. Pretrained on ${\sim}40$K publicly available 12-lead ECGs with SNOMED-CT codes and evaluated by frozen linear probing on five downstream classification benchmarks, MAR-ECG consistently outperforms a strong masked-autoregressive baseline, with mean gains in the low-label regime. Despite the absence of paired clinical text, MAR-ECG achieves performance competitive with state-of-the-art multimodal ECG-text methods.
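
To make the idea of softening supervision targets by ontology distance concrete, the sketch below turns a hard label into a distribution over graph nodes that decays with hop distance. The abstract does not give the GSCL formula, so everything here is an assumption for illustration: the exponential decay, the `temperature` parameter, the function names, and the four-node toy graph, which stands in for the 40-node SNOMED-CT graph.

```python
import math
from collections import deque


def hop_distances(graph, source):
    """Breadth-first search: hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def soft_targets(graph, label, temperature=1.0):
    """Soft target over nodes, weight proportional to exp(-distance / temperature).

    Assumed form, not taken from the paper. Nodes unreachable from `label`
    are simply absent from the result (implicit weight zero).
    """
    dist = hop_distances(graph, label)
    weights = {n: math.exp(-d / temperature) for n, d in dist.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# Hypothetical undirected toy ontology (adjacency lists), not real SNOMED-CT.
toy_graph = {
    "arrhythmia": ["atrial_fibrillation", "atrial_flutter"],
    "atrial_fibrillation": ["arrhythmia"],
    "atrial_flutter": ["arrhythmia"],
    "myocardial_infarction": [],
}
targets = soft_targets(toy_graph, "atrial_fibrillation")
```

Under this form, a sibling concept such as the toy `atrial_flutter` node receives a small positive target instead of acting as a hard negative, which is the behavior the abstract describes.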

Branched Signature Kernel Solvers for ODEs with rough Single-Trajectory signals

2026-05-25T13:22:07Z

We develop a branched signature kernel solver for linear and nonlinear ordinary differential equations driven by a \emph{single observed trajectory} of a possibly rough forcing signal -- a setting that arises naturally in earthquake engineering, finance, biology, and structural health monitoring, where the forcing is observed exactly once and the solver must respect the underlying physical law without recourse to an ensemble of realizations. Two ingredients are new. First, a \emph{count-sampling} construction turns the single observation into a hierarchical family of $N+1$ nested training paths on which the branched signature kernel can be evaluated; this allows the signature kernel machinery, originally designed for multi-realization regression problems, to operate on a single-trajectory observation. Second, a kernel-collocation framework places the ansatz either on the highest-order derivative of the solution (with lower derivatives recovered by integrating the kernel) or on the solution itself (after $m$-fold integration of the ODE). We prove a universal approximation theorem for the branched signature kernel, leveraging the Hairer--Kelly morphism to express branched signature evaluations through geometric signatures of time-extended paths. The offline solver is extended to a streaming Test/Train/Retrain protocol with closed-form online updates in the linear case and scalar Newton steps in the nonlinear case. Numerical experiments on six benchmarks (El-Centro earthquake displacement, the Solow capital-stock model, an fBM-driven second-order ODE, a forced Duffing oscillator, a path-dependent Arias-intensity-degraded oscillator with variable coefficients, and a noisy Kuramoto phase-oscillator system) show that the branched signature-kernel solver delivers accurate, stable predictions across all regimes.

FLOATBench: A Dataset and Benchmark for Floating Offshore Wind Turbine Tower Fatigue

2026-05-25T11:18:24Z

Most of the world's offshore wind resource lies in waters too deep for fixed-bottom foundations, making floating offshore wind turbines (FOWTs) essential for deep-water deployment. As the industry scales toward $22$ MW class designs, tower fatigue becomes increasingly critical because larger structures amplify the coupled aero-hydro-servo-elastic loads induced by continuous wind and wave excitation. Accurate fatigue-damage prediction is therefore central to certification, design optimization, and cost reduction. Yet the field lacks a shared surrogate benchmark: studies report different simulations, splits, and metrics, making methods difficult to compare. We present FLOATBench, a public tabular benchmark with $582{,}120$ per-section fatigue-damage labels across three $22$ MW FOWT tower geometries, derived from $19{,}404$ high-fidelity OpenFAST simulations across the three towers ($6{,}468$ per tower: $1{,}078$ aligned wind/wave operating points $\times$ six turbulence seeds), labeled at $30$ cross-sections per tower. FLOATBench includes a regime-aware alpha-shape partition of the joint wind/wave operating envelope, stratifying test points into in-train, interpolation, and extrapolation regimes. It is paired with a reproducible evaluation harness covering three protocol levels: random validation (E1), within-tower regime-aware evaluation (E2), and cross-tower transfer (E3). The regime-aware protocol reveals rank shifts between global and extrapolation performance that random-split leaderboards cannot detect. To the authors' knowledge, FLOATBench is the first FOWT fatigue benchmark for tabular surrogate modeling, and offers an evaluation protocol that generalizes to engineering surrogates defined over physical operating envelopes. Dataset and code are available at: https://github.com/Joao97ribeiro/FLOATBench.
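
The dataset sizes quoted in the abstract follow from one another, which the short check below reproduces. The variable names are illustrative; the numbers are the ones stated above.

```python
# Counts as stated in the abstract.
operating_points = 1078   # aligned wind/wave operating points
turbulence_seeds = 6      # seeds per operating point
towers = 3                # 22 MW tower geometries
cross_sections = 30       # labeled cross-sections per tower

# Each operating point is simulated once per seed, for every tower.
sims_per_tower = operating_points * turbulence_seeds
total_sims = sims_per_tower * towers

# Every simulation yields one fatigue-damage label per cross-section.
total_labels = total_sims * cross_sections

print(sims_per_tower, total_sims, total_labels)
```

The three printed values match the per-tower simulation count, the total simulation count, and the label count reported in the abstract.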

DeepSeekMath Meets Order Book: Group-Aware Policy Optimization for High-Frequency Directional Trading

2026-05-25T07:33:56Z

This paper studies reinforcement learning for high-frequency trading on limit order books by pairing an Order-Flow-based state model with policy-gradient methods. Instead of value-based RL techniques such as tabular Q-learning, our approach deploys policy-based methods such as vanilla PPO and DeepSeekMath-inspired variants such as GRPO and GSPO, which use group-normalized updates and downside-aware shaping. On backtests with financial assets AMZN, AAPL, and GOOG under a simplified backtesting setup based on spread-scaled rewards, these new policies improve net average PnL, profitability, and drawdown over the Q-learning baseline. Our results show that (1) Order-Flow signals are an adequate state for policy RL and (2) group-aware PPO surrogates are preferable to value-based baselines.
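
The group-normalized update used by GRPO-style methods replaces a learned value baseline with statistics of a group of sampled rollouts: each reward is centered by the group mean and scaled by the group standard deviation. The sketch below shows only that normalization step. The function name, the `eps` guard, the choice of population standard deviation, and the reward values are assumptions for illustration; the abstract does not specify how groups are formed on order-book data or how downside-aware shaping is applied.

```python
from statistics import mean, pstdev


def group_normalized_advantages(rewards, eps=1e-8):
    """Center each reward by the group mean and scale by the group std.

    `eps` guards against division by zero when all rewards in the
    group are identical (every advantage is then zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four rollouts sampled from the same state.
group_rewards = [0.5, -0.2, 0.1, 0.4]
advantages = group_normalized_advantages(group_rewards)
```

Because the advantages sum to roughly zero within each group, rollouts are only rewarded relative to their peers, which is what removes the need for a separate critic.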

Effective information gathering for ore estimation, evaluation and perspectives on adaptive sampling

2026-05-25T03:13:51Z

A computational/analytics framework for assessing the value of drill-hole information in ore grade estimation is described using Gaussian processes and statistics. A distinguishing feature is that it presents both a near-term and a long-term vision, circumvents conditional simulations, and avoids making rigid assumptions such as stationarity and uncorrelated errors. Two experiments are devised to cater for situations where geological domains are differentiated or mixed. In scenario 1, performance (learning) curves are obtained to inform in-fill drilling and spacing considerations consistent with current practice. Analysis shows it is possible to estimate the incremental cost and reward via a proxy measure without relying on the ground truth, using insights obtained from a similar deposit, adjacent bench or domain. Scenario 2 examines adaptive sampling strategies and focuses on applying these in geologically complex areas with discontinuities and heterogeneous composition. Evaluation is based on structural similarity and on the mean and uncertainty of the posterior predictive distribution for the grade. The results highlight situations where regular grid sampling is suboptimal, and demonstrate that an adaptive strategy targeting spatial complexity can narrow this gap. The proposed methodology could be used in the future in an exploration--exploitation setting that involves sampling, machine learning, reasoning and cooperation between robots with embodied intelligence on a mine site.

Data-Driven Structural Health Monitoring of Short Carbon Fiber-Reinforced Polymer Composites via Multiphysics Phase-Field Simulation

2026-05-24T23:06:40Z

Short carbon fiber-reinforced polymer (SCFRP) composites exploit the intrinsic conductivity of the carbon fiber network for self-sensing, yet no predictive model couples their anisotropic, rate-dependent fracture to piezoresistive damage identification. This work presents a finite deformation multiphysics phase-field framework coupling a viscoelastic-viscoplastic constitutive model, an anisotropic crack resistance formulation, and a piezoresistive conductivity model. The three sub-problems are unified through the second-order fiber orientation tensor, which simultaneously defines fiber family directions, crack resistance anisotropy, and principal conduction paths of the carbon fiber network. A damage-coupled conductivity tensor captures both strain-driven geometric-kinematic resistance changes and irreversible network severance driven by the phase-field variable. The framework is coupled to an eight-electrode electrical impedance tomography configuration, and the normalized inter-electrode conductance ratios serve as inputs to a feedforward artificial neural network that infers normalized crack length and mechanical compliance without mechanical sensing. The network achieves $R^2 = 0.99$ on held-out configurations, confirming generalization across the microstructure space. The framework establishes a physics-based, computationally efficient route for real-time structural health monitoring and inverse damage assessment in SCFRP composites.

Samudra 2: Scaling Ocean Emulators across Resolutions

2026-05-24T20:17:51Z

Ocean general circulation models (OGCMs) are essential to climate science but computationally expensive, limiting ensemble size and forcing scenarios. Neural emulators promise orders-of-magnitude speedups, yet existing ocean emulators have not combined fine spatial resolution with multi-year autoregressive rollouts. Samudra, the first autoregressive neural ocean emulator to produce multi-decade global rollouts, is limited to $1^\circ$ resolution and exhibits two long-horizon failure modes: \emph{variance collapse}, the loss of temporal variability, and \emph{imprinting artifacts}, in which velocity patterns leak into deep-ocean fields. We present Samudra 2, which introduces a wider U-Net backbone with modified ConvNeXt-style blocks and a reduced block-internal expansion factor, together with a dynamic loss that reweights output channels according to their prediction errors, strengthening gradients for slow-evolving deep-ocean fields. At $1^\circ$, Samudra 2 increases upper-ocean global-mean temperature $R^2$ from 0.56 to 0.87 and reduces deep-ocean temperature error by roughly sevenfold. The same architecture scales to $1/2^\circ$ and $1/4^\circ$ over approximately 8-year autoregressive rollouts, recovering mesoscale eddies and sharp western boundary currents. Running on a single GPU, Samudra 2 enables larger ensembles for sea-level projections, ocean heat uptake, and climate variability studies. We provide code, documentation, and benchmark resources at https://openathena.ai/Ocean_Emulator/.

Representation Without Control: Testing the Realization Effect in Language Models

2026-05-24T16:07:34Z

Large language models are increasingly used as behavioral simulators, but it remains unclear when their outputs reflect human-like cognitive mechanisms rather than prompt-sensitive surface patterns. We study this question through the realization effect, a well-characterized finding in behavioral economics in which risk-taking differs systematically after paper versus realized gains and losses. We evaluate LLM behavior at three levels: prompt-only behavioral sensitivity, linear readout of internal representations, and causal control via activation steering. Prompt-only results show systematic condition sensitivity, but the directional pattern does not reproduce human realization-effect predictions. Gemma's residual stream contains a linearly decodable realization-status signal at layer 18 that generalizes to held-out prompts. Steering along this direction does not, however, reliably shift downstream risk choices, a null result that holds across positive scales and in a negative sign-symmetry run. Behavioral sensitivity, latent readout, and causal control are three distinct properties that do not automatically co-occur, and successful latent readout is insufficient evidence that a model behaviorally relies on a representation during downstream decision-making.

Courant: a State-Adaptive Perceiver-Based Neural Surrogate with Local Support and Interpretable Field Decomposition

2026-05-24T14:55:12Z

We introduce "Courant", a Perceiver-based encoder-processor-decoder surrogate model whose latent features exhibit adaptive specialization and local support in the physical space, enabling functionality akin to an adaptive hp-refinement scheme, an attribute that is highly desirable in traditional numerical solvers and scientific machine learning broadly. The proposed architecture combines a shared random Fourier feature coordinate embedding, state-adapted latent queries, and a lightweight decoder. Courant is trained end-to-end with steady or transient simulation data and only a standard $L_2$ prediction loss in the physical space, achieving competitive accuracy on benchmarks. We demonstrate that Courant's inductive biases yield latents that are interpretable by design: they develop multiscale geometric specialization in the simulation domain and track coherent structures in the time-dependent case, acting analogously to time-evolving spatial basis functions and allowing for decoding a compact, geometry-anchored, partition-of-unity-like decomposition of the simulated field.
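
A random Fourier feature coordinate embedding, one of the components named in the abstract, maps a coordinate $x$ to $[\cos(2\pi Bx), \sin(2\pi Bx)]$ with a fixed random matrix $B$. The sketch below is a generic version of that construction in plain Python, not the paper's implementation: the function name, the Gaussian scale `sigma`, the feature count, and the seed are all assumptions for illustration.

```python
import math
import random


def make_rff_embedding(in_dim, num_features, sigma=1.0, seed=0):
    """Return an embedding x -> [cos(2*pi*Bx), sin(2*pi*Bx)].

    B is a fixed (num_features x in_dim) matrix with Gaussian entries,
    sampled once so that every coordinate shares the same embedding.
    """
    rng = random.Random(seed)
    B = [[rng.gauss(0.0, sigma) for _ in range(in_dim)]
         for _ in range(num_features)]

    def embed(x):
        # Project the coordinate onto each random frequency vector.
        proj = [2.0 * math.pi * sum(b * xi for b, xi in zip(row, x))
                for row in B]
        return [math.cos(p) for p in proj] + [math.sin(p) for p in proj]

    return embed


embed = make_rff_embedding(in_dim=2, num_features=8)
features = embed([0.25, 0.75])  # hypothetical 2-D mesh coordinate
```

The output has twice as many entries as `num_features`, and a larger `sigma` biases the embedding toward higher spatial frequencies.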

Noise-Robust Financial Numerical Entity Attribute Tagging

2026-05-24T07:31:34Z

Financial Numerical Entity (FNE) understanding aims to recover the meaning of numerical mentions in financial reports. Existing studies primarily focus on concept name prediction and face two important limitations. First, labels derived from inline XBRL may contain errors because filings are usually prepared manually. Second, other important FNE attributes, such as reporting-time relation, measurement scale, and accounting sign, are less emphasized. We propose \textbf{NO}ise-\textbf{R}obust Tagging for Rich Financial Numerical Entity \textbf{A}ttributes (\textsc{NORA}) to address these gaps. NORA uses task-aware instance-specific weighting to attenuate the influence of noisy labels during training, and we further propose the Neighborhood Prior-adjusted KNN (NPK) filtering method for more reliable evaluation on real-world noisy test sets. In addition, we construct a large-scale benchmark containing 6.6 million instances with multi-attribute labels and filing metadata. Experiments show that \textsc{NORA} performs strongly compared with state-of-the-art noisy-label baselines, including Co-teaching, Mixup, SSR, and SelfMix. Moreover, NORA is robust under both unfiltered and noise-filtered test settings. It achieves the best Accuracy, Macro F1, and Weighted F1 for concept name and time-relation prediction, while remaining competitive on scale and sign prediction. These results demonstrate the value of jointly modeling rich FNE attributes while accounting for label noise in real-world financial filings.

Exascale Hybrid Numerical-AI Ensembles for Operational Flood-Season Forecasting in East Asia: 15-km Decadal Hindcasts and 1-km High-Resolution Capability

2026-05-24T06:52:51Z

Seasonal forecasting of summer rainfall in East Asia remains a grand challenge, as predictability at 3 to 6 month lead times is constrained by the spring predictability barrier, weak large-scale signals, and localized nonlinear convective extremes. We address this challenge with CAPES, which integrates a kilometer-resolution coupled regional model with atmosphere, land, and ocean components and a data-driven AI seasonal forecasting system. At 15 km resolution, the fused workflow combines 174 numerical members from varying start times, physics schemes, and parameter perturbations with 1,600 AI members generated from initial and physical perturbations. Using the full LineShine system, CAPES completes ten annual 1,774-member hindcasts for 2016 to 2025 within 14.6 hours, improving the mean prediction score from ECMWF's 71.8 to 75.9 and delivering a major gain in operational forecasting capability. The 1-km configuration further enables fine-scale typhoon simulation and establishes the feasibility of kilometer-scale fused ensemble forecasting on a one-week timescale.

Sloan's Analytical Gömböc at Published $β$: A Strict-Convexity-Constrained Reanalysis

2026-05-23T23:19:12Z

Varkonyi and Domokos (2006) proved that convex homogeneous bodies with exactly one stable and one unstable equilibrium point exist. Sloan (2023) gave the first analytical parameterization, with radial function $R(θ,φ)$ having exactly two critical points on $S^2$. This is the v2 amendment-of-record of arXiv:2604.17120. v1 claimed Sloan's parameterization does not produce mono-monostatic bodies and reported a 13-member catalog of Fourier/radial extensions certified at ECS=1 via mesh-vertex drainage-basin analysis. Following correspondence with P. L. Varkonyi (BME), an analytical verification suite was built around the Varkonyi-Gauss identity. Finding 1: Sloan's parameterization does produce mono-monostatic bodies in a strictly-convex sub-regime ($β\lesssim 0.036$), where $K_{\min} > 0$ and the identity certifies ECS=1. v1 missed this because its mesh-vertex oracle over-counted on shallow COM-height landscapes. At Sloan's published $β=0.05$, strict convexity is lost ($K_{\min}=-0.569$; $K<0$ over 4.01% of surface); the identity's precondition fails. v1's "global surface information" mechanism is replaced by the strict-convexity precondition. Finding 2: Of v1's 13 catalog instances only Phase-1 ($β=0.023149$, $a_1=0.234433$, $k=1$) survives identity-based verification; the remaining twelve were per-$k$ optimizer extrema overshooting the strict-convex boundary. Probing the regime interior verifies further mono-monostatic bodies in $k=2$ and $k=3$ sub-families: the verified set is an open regime in $(β, a_1, k)$, not a discrete list. Finding 3: v1's ECS=1 readings for the 9 radial-family members reflected drainage-basin merging; the $r=0.9993$ gentleness-robustness correlation is retracted.