https://arxiv.org/api/Sc9VJW6IWAklcL8lfIt+OW5KCDM 2026-06-13T14:51:18Z 2259 45 15

http://arxiv.org/abs/2605.17724v1 Sequential Structure in Intraday Futures Data: LSTM vs Gradient Boosting on MNQ 2026-05-18T01:03:28Z This paper compares gradient boosting and long short-term memory (LSTM) architectures for intraday directional prediction in Micro E-Mini Nasdaq 100 futures (MNQ). Motivated by recent foundation-model research on financial candlestick data, including the Kronos architecture, we test whether five-minute OHLCV bar sequences contain exploitable sequential predictive structure at the scale of a single instrument dataset. Using 944 trading days from 2021-2025, four model configurations are evaluated under strict expanding-window walk-forward validation across three out-of-sample periods. The target variable is whether the session close exceeds the 10:30 AM open by more than ten points. No configuration produces statistically significant out-of-sample accuracy above the 51.8% base rate. Combined OOS accuracies range from 50.00% to 50.89% across gradient boosting variants, while the LSTM achieves 50.59%. Permutation tests yield p-values of 0.135 for the best gradient boosting model and 0.515 for the LSTM, indicating no statistically significant predictive edge. Feature importance instability across walk-forward folds suggests noise fitting rather than stable structural signal capture. The results indicate that four years of single-instrument five-minute OHLCV data are insufficient for reliable sequential ML-based intraday forecasting. The primary contribution is a documented evaluation of a Kronos-inspired architecture on a constrained real-world dataset, providing an empirical lower bound on data scale requirements for sequential financial ML. 2026-05-18T01:03:28Z 18 pages, 4 figures. All results based on out-of-sample walk-forward validation and permutation testing.
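The permutation test mentioned in the abstract above can be illustrated with a minimal sketch. It assumes binary direction labels and a one-sided test on accuracy with label shuffling; the function name, the number of permutations, and the add-one correction are illustrative choices, not details taken from the paper.

```python
import random


def permutation_p_value(y_true, y_pred, n_permutations=2000, seed=0):
    """One-sided permutation p-value for classification accuracy.

    Hypothetical helper: shuffles the true labels to break any link with
    the predictions, then counts how often the shuffled accuracy is at
    least as high as the observed accuracy.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    rng = random.Random(seed)
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        acc = sum(t == p for t, p in zip(shuffled, y_pred)) / n
        if acc >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_good + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Toy data only: alternating labels and a model that always predicts "up".
    labels = [1, 0] * 20
    always_up = [1] * 40
    print(permutation_p_value(labels, always_up))
```

A constant predictor scores the same accuracy under every shuffle, so its p-value is 1; a p-value is small only when the predictions align with the labels more closely than shuffling can reproduce.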
Data: MNQ futures (2021-2025) Mathias Mesfin

http://arxiv.org/abs/2605.17425v1 The Viability of Blockchain Markets under Discrete Clearing and Paid Priority 2026-05-17T12:39:46Z This paper develops a model to evaluate the viability of blockchain markets as the sole venue for price formation. Blockchains clear at discrete intervals called block time, and transactions are executed sequentially according to priority fees paid by traders who compete for queue position. We show that these features undermine the viability of markets. Paid-priority ordering induces endogenous selection, where only traders with sufficiently high valuations participate. The participation cutoff rises with competition, which intensifies with lower information costs or higher liquidity demand. This hinders price discovery and biases prices. It also impairs liquidity: the cutoff concentrates trading among aggressive traders and increases adverse selection that liquidity suppliers absorb in a single clearing round. Although longer block times enhance consensus security, they amplify these effects and can cause markets to shut down. 2026-05-17T12:39:46Z Available at SSRN: 5290232 Agostino Capponi Álvaro Cartea Fayçal Drissi

http://arxiv.org/abs/2605.17307v1 Deep Reinforcement Learning Framework for Diversified Portfolio Management Across Global Equity Markets 2026-05-17T07:50:37Z This study develops and evaluates a deep reinforcement learning framework for dynamic portfolio allocation across global equity markets. The Soft Actor-Critic algorithm is used to learn continuous portfolio weights within a Markov Decision Process, incorporating transaction costs, turnover penalties, and diversification constraints into the reward function.
Five model configurations are compared, varying in reward formulation, policy structure (flat versus hierarchical Dirichlet), portfolio constraints, and temporal encoder (LSTM versus Transformer), and evaluated via walk-forward optimization across sixteen out-of-sample folds spanning 2003-2026 on the Nasdaq-100, Nikkei 225, and Euro Stoxx 50. Results show that RL strategies achieve competitive risk-adjusted performance primarily in the Euro Stoxx 50, where statistically significant abnormal returns are observed, but the central hypothesis is only partially confirmed: no strategy achieves statistically significant excess returns relative to Buy and Hold under HAC-robust inference across all markets. Regime analysis reveals that RL adds the most value during periods of elevated uncertainty, while ensemble aggregation across markets improves risk-adjusted performance and confirms the benefits of geographic diversification. 2026-05-17T07:50:37Z 67 pages, 11 figures, 16 tables Kamil Kashif Robert Ślepaczuk

http://arxiv.org/abs/2503.08833v2 Randomization in Optimal Execution Games 2026-05-15T23:26:52Z We study optimal execution in markets with transient price impact in a competitive setting with $N$ traders. Motivated by prior negative results on the existence of pure Nash equilibria, we consider randomized strategies for the traders and whether allowing such strategies can restore the existence of equilibria. We show that given a randomized strategy, there is a non-randomized strategy with strictly lower expected execution cost, and moreover this de-randomization can be achieved by a simple averaging procedure. As a consequence, Nash equilibria cannot contain randomized strategies, and non-existence of pure equilibria implies non-existence of randomized equilibria. Separately, we also establish uniqueness of equilibria. Both results hold in a general transaction cost model given by a strictly positive definite impact decay kernel and a convex trading cost.
2025-03-11T19:15:19Z 33 pages Steven Campbell Marcel Nutz

http://arxiv.org/abs/2501.09638v2 Optimal Execution among $N$ Traders with Transient Price Impact 2026-05-15T23:18:13Z We study $N$-player optimal execution games in an Obizhaeva--Wang model of transient price impact. When the game is regularized by an instantaneous cost on the trading rate, a unique equilibrium exists and we derive its closed form. Without regularization, by contrast, there is no equilibrium. We prove that existence is restored if (and only if) a very particular, time-dependent cost on block trades is added to the model. In that case, the equilibrium is particularly tractable. We show that this equilibrium is the limit of the regularized equilibria as the instantaneous cost parameter $\varepsilon$ tends to zero. Moreover, we explain the seemingly ad-hoc block cost as the limit of the equilibrium instantaneous costs. Notably, in contrast to the single-player problem, the optimal instantaneous costs do not vanish in the limit $\varepsilon\to0$. We use this tractable equilibrium to study the cost of liquidating in the presence of predators and the cost of anarchy. Our results also give a new interpretation to the erratic behaviors previously observed in discrete-time trading games with transient price impact. 2025-01-16T16:27:52Z 66 pages, 4 figures, 1 table Steven Campbell Marcel Nutz

http://arxiv.org/abs/2604.24366v2 The Anatomy of a Decentralized Prediction Market: Microstructure Evidence from the Polymarket Order Book 2026-05-14T14:44:36Z We study the microstructure of Polymarket, the largest on-chain prediction market, using a continuous tick-level archive of the public order-book feed (30 billion events over 52 days) joined to the authoritative on-chain trade record.
On a pre-registered stratified panel of 600 markets we report eight stylized facts: a longshot spread premium; a depth profile closer to uniform than to top-of-book; a null block-clock alignment effect; broad maker-wallet diversity with a concentrated tail; category-conditional effective-spread differences; a sub-50 ms median archive-ingestion delay with a multi-second tail; a self-counterparty wash share with median 1% and a 22% upper tail (well below Cong et al. 2023's 25-70% for unregulated crypto venues -- a sanity bound, not an apples-to-apples reference); and a cross-sectional depth profile explained by market duration, price level, and volume, with no residual time-to-close effect. The paper also contributes a measurement result: trade direction inferred from Polymarket's public order-book feed agrees with on-chain ground truth on only ~59% of buckets (panel mean 0.615, 95% CI [0.58, 0.65]), well below the ~80% Lee-Ready accuracy on Nasdaq. The effective half-spread changes sign between feed- and on-chain trade directions on 67%/50% of markets across two 7-day windows; Kyle's lambda on 60%/43%. Microstructure work on Polymarket therefore needs to source trade direction from on-chain OrderFilled events; we release a replication package that performs the join. 2026-04-27T12:01:14Z 16 pages, 9 figures, 5 tables. JEL: G14, G12, G19, C58, L86. v2: scope narrowing in Section 3, SF2 redone with full per-level depth profile, SF8 reframed (time-to-close coefficient becomes NS once duration and p(1-p) are controlled). Replication: https://github.com/philippdubach/polymarket-microstructure ; Zenodo DOI 10.5281/zenodo.19811426 Philipp D. Dubach

http://arxiv.org/abs/2605.02287v2 Per-Market Information Leakage and Order-Flow Skill: Two Methodological Lenses on Informed Trading in Decentralized Prediction Markets 2026-05-14T04:11:16Z April 2026 saw notable methodological convergence in the academic study of informed trading on decentralized prediction markets.
Three approaches surfaced almost simultaneously: Mitts and Ofir (2026) apply a composite screen to over 210,000 wallet-market pairs; Gomez-Cram et al. (2026) apply an event-level sign-randomization test to Polymarket's complete transaction history, classifying 3.14% of accounts as "skilled winners" and separately flagging 1,950 accounts as "insiders" via a lifecycle heuristic; Nechepurenko (2026) develops the Information Leakage Score (ILS) framework, which quantifies per-market information front-loading at an article-derived public-event timestamp. This paper provides a methodological comparison. The central claim is that these are three distinct layers of detection, not competing methods on a single layer. Sign-randomization is best understood as an account-level test of persistent directional skill conditional on opportunity selection -- not a direct test of insider trading, and not a per-market measure. The heuristic insider flag is separate from the skill classifier, applies to a population the classifier excludes by design, and has unknown precision. The Polymarket sample pools politics, sports, crypto, and other categories with different information technologies, so a platform-wide "skilled winner" classification is mechanism-ambiguous. The January 2026 U.S.-Venezuela operation cluster, where the DOJ indictment of Master Sergeant Gannon Van Dyke provides a rare external enforcement benchmark, illustrates how the layers stack: lifecycle heuristics identify suspicious accounts; legal investigation addresses non-public-information possession; per-market scoring would quantify how much information was leaked into each contract. A combined pipeline gains in precision because each layer filters a different dimension. 2026-05-04T07:22:20Z v2 (May 2026): added Revision Note section. No methodological-comparison changes. 
21 pages, 4 tables Maksym Nechepurenko

http://arxiv.org/abs/2605.02286v2 Empirical Evaluation of Deadline-Resolved Information Leakage on Documented Polymarket Insider Cases 2026-05-14T04:09:11Z This paper reports an end-to-end empirical evaluation of the deadline-Information Leakage Score (ILS-dl) extension introduced in the companion methodology paper. The deadline-ILS extends the original ILS to deadline-resolved prediction-market contracts, the dominant structural form of publicly documented insider trading on Polymarket. We anchor the evaluation in the 2026 U.S.-Iran conflict cluster of the ForesightFlow Insider Cases (FFIC) inventory, the largest documented deadline cluster. The evaluation has four parts: per-category exponential-hazard estimation, a single-case ILS-dl computation, cross-market wallet analysis, and methodological refinements. Hazard-rate estimation produces an adequate exponential fit for military-geopolitics markets (KS p = 0.426, half-life 2.9 days, n = 18) and a preliminary fit for corporate-disclosure markets (n = 5). The regulatory-decision category is rejected as bimodal (p = 0.023). On the largest applicable FFIC contract ("US forces enter Iran by April 30," $269M volume), the article-derived public-event timestamp yields ILS-dl = +0.113 versus a resolution-anchored proxy value of -0.331: a 0.444 shift in magnitude on opposite sides of zero, demonstrating that the extension distinguishes signal from proxy artefact. Pre-event drift is mild, and short-window variants (30-min, 2-hour) are exactly zero. Cross-market wallet analysis identifies 332 wallets active in both major Iran-cluster markets, but the available trade history covers only the resolution-settlement window. v2 (May 2026) corrects the hazard fit to the full Tier-3 population; the v1 estimate lies inside the v2 95% CI. 2026-05-04T07:22:13Z v2 (May 2026): hazard-rate fits updated to full Tier-3 population (n=18 for military_geopolitics, was n=9).
v1 estimate lies inside v2 95% CI. Esports taxonomy correction applied. No conclusion changes. 11 pages, 6 tables Maksym Nechepurenko

http://arxiv.org/abs/2605.00493v2 ForesightFlow: An Information Leakage Score Framework for Prediction Markets 2026-05-14T04:06:48Z ForesightFlow is an Information Leakage Score (ILS) framework for detecting informed trading on decentralized prediction markets. For an event-resolved binary market, the score quantifies the fraction of the terminal information move priced in before the public news event. Three operational scope conditions (edge effect, non-trivial total move, anchor sensitivity) are stated as preconditions for interpretation. The score admits a Murphy-decomposition reading that connects label generation to the proper-scoring-rule literature. A pilot empirical evaluation surfaces three findings. First, a resolution-anchored proxy for the public-event timestamp does not separate event-resolved markets from a matched control population (Mann-Whitney p = 1e-6, separation reversed), demonstrating that proxy quality is itself a binding constraint. Second, the article-derived timestamp on a single high-stakes case shifts the score by 0.444 in magnitude relative to the proxy and lies on the opposite side of zero. Third, an audit of the publicly documented Polymarket insider record reveals that documented cases are systematically deadline-resolved, falling outside the original ILS scope (0 of 24 FFIC inventory markets satisfied original scope conditions). This last finding motivates a deadline-ILS extension introduced in Section 7, anchored at the public-event timestamp rather than the news timestamp, and equipped with a per-category exponential hazard baseline for the time-to-event distribution. The extension closes the gap between the methodology and the population in which insider trading has been empirically documented.
An end-to-end evaluation of the extension on the 2026 U.S.-Iran conflict cluster is reported in a companion paper. We release the FFIC inventory, the resolution-typology classification of the 911,237-market corpus, and all code at github.com/ForesightFlow. 2026-05-01T08:04:29Z v2 (May 2026): added Revision Note section; No methodology changes. 41 pages, 12 tables, 4 figures Maksym Nechepurenko

http://arxiv.org/abs/2605.23978v1 Algometrics: Forecasting Under Algorithmic Feedback 2026-05-13T20:05:54Z In algorithmic markets, predictive models become part of the data-generating process they aim to forecast. Once their outputs are converted into trades, allocations, execution schedules, or risk controls, they change the future data on which they are evaluated. I introduce algometrics, a framework for time series whose evolution depends on the predictive algorithms forecasting them. The framework distinguishes historical risk, measured under passive forecasting, from deployment risk, measured when forecasts drive actions. I prove three results. First, deployment risk is not identifiable from passive historical data alone: even in a one-step linear feedback model, infinitely many algorithm-mediated environments induce the same historical law while implying different deployment risks for the same forecaster. Second, historical model rankings can invert under crowding, so a predictor with lower passive error can have higher deployment error once similar algorithms are adopted. Third, randomized or instrumented actions identify short-horizon linear feedback, and I derive a finite-sample bound for deployment-risk estimation. These results suggest that time-series benchmarks in algorithmic markets should report feedback sensitivity alongside predictive accuracy.
2026-05-13T20:05:54Z Marc Schmitt

http://arxiv.org/abs/2605.11640v1 Fill-Side Non-Retail Trading on Polymarket: An Empirical Study of Behavioral Tiers and Microstructure Signatures Under Quote-Attribution Constraints 2026-05-12T07:01:35Z Prediction markets cannot exist without market makers, arbitrageurs, and other non-retail liquidity providers, yet the supply-side microstructure of Polymarket-class venues has not been characterized at on-chain pseudonymous-address scale. This paper studies non-retail participation on Polymarket using an empirical run on the PMXT v2 archive over 2026-04-21 through 2026-04-27 (13,356,931 OrderFilled events; 77,204 addresses with five+ fills; 43,116 markets). We report three findings. First, Polymarket's off-chain CLOB architecture renders address-level quote-lifecycle attribution permanently unavailable: OrderPlaced and OrderCancelled events are off-chain and absent from public archives, so quote-intensity, two-sided-ratio, and posted-spread features cannot be built at address level. We document this as a structural validity-gate failure (G-QUOTE-LIFE universal fail) and restrict analysis to a six-feature fill-side vector. Second, density-based clustering (DBSCAN, fifteen sensitivity configurations) on the fill-side vector produces a single dense cluster with zero noise: fill-side behavior in the empirical window is unimodal under the six-feature vector, contradicting the pre-registered hypothesis of four-to-five separable archetypes. Third, robust retail vs non-retail separation is achievable through clustering-independent feature-tier stratification: whale-tier, high-frequency-operator, and power-trader tiers jointly hold 81.4% of total notional across 12.6% of addresses. Address-level market-making and liquidity-provision claims are withdrawn per the G-QUOTE-LIFE failure; spoof-by-non-fill manipulation detection is downgraded to market-level book diagnostics.
A privacy-respecting derived-dataset deposit accompanies the paper as Bundle 3 of the PMXT family. Fourth paper in a four-paper programme on event-linked perpetuals and leveraged prediction-market microstructure. 2026-05-12T07:01:35Z 52 pages, 6 figures, 12 tables. Empirical microstructure on 13.36M Polymarket OrderFilled events. Companion dataset (PMXT Bundle 3; Zenodo DOI inserted before announcement) and code at https://github.com/ForesightFlow/event-linked-perps. Fourth in a four-paper programme; sister papers at SSRN 6748278, 6748298, 6748318 Maksym Nechepurenko

http://arxiv.org/abs/2605.11423v1 A Validated Volatility-Volume-Gap Classifier for Regime Identification in MNQ Intraday Data 2026-05-12T02:17:32Z This paper constructs and validates a composite day-classification system for Micro E-Mini Nasdaq 100 futures (MNQ) using three pre-market observable conditions: first-30-minute return magnitude, overnight gap magnitude, and abnormal opening-bar volume relative to a rolling baseline. Using 947 regular trading days of five-minute data from 2021-2025, we find that classifier-positive days exhibit statistically distinct intraday behavior, including directional morning drift followed by systematic late-session reversal. Despite these descriptive characteristics, all tested directional trading strategies fail institutional validation standards after transaction costs and multi-year consistency requirements are applied. The highest-performing configuration achieves T = 1.46 and mean net +7.80 points but fails year-stability criteria. The primary contribution is the validation of the Volatility-Volume-Gap (VVG) classifier as a descriptive regime-identification framework and the documentation of failed attempts to convert these statistical patterns into deployable trading signals under realistic execution constraints. 2026-05-12T02:17:32Z 18 pages, 4 figures. All results based on out-of-sample walk-forward validation.
Data: MNQ futures (2021-2025) Mathias Mesfin

http://arxiv.org/abs/2605.23959v1 When Alpha Disappears: A One-Switch Benchmark for Decision-Time Leakage in Financial Backtests 2026-05-12T01:28:10Z We introduce When Alpha Disappears, a paired evaluation benchmark for diagnosing decision-time leakage in financial machine-learning backtests. Rather than treating leakage as a binary property, the benchmark estimates protocol-induced inflation by toggling one evaluation convention at a time around a clean $t{+}1$-open reference, while holding the data panel, walk-forward split, model family, horizon, portfolio rule, and cost convention fixed. Across two daily-OHLCV equity panels, six model families, and yearly tests from 2016--2024, we find that inflation is highly selective: centered temporal features and same-day-open execution with post-open daily-bar information cause large and stable increases in both predictive and trading metrics, whereas global normalization, future-informed graph structure, and same-day-close execution are weak in most settings. The benchmark is diagnostic rather than a claim of tradable alpha, and is intended to make evaluation assumptions, failure modes, and protocol fragility directly measurable. 2026-05-12T01:28:10Z 19 pages, including figures, tables, and the appendix Fan Zhang Zhen Li Sijia Peng Yu Chen

http://arxiv.org/abs/2605.11180v1 The Value of Information: A Puzzle 2026-05-11T19:42:47Z We show that under mild assumptions, the total value of information to informed traders in the market can be measured by the covariance between price changes and order flow. This covariance captures noise trader losses, which equal informed trader gains when market making is competitive. We estimate the value of information using high frequency data on US equities at about $3.5 million per year for the average stock.
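As a rough illustration of the covariance measure described in this abstract, the sketch below computes a plain sample covariance between per-interval price changes and signed order flow. The function name, the toy numbers, and the linear-impact toy relationship are assumptions made for illustration; the paper's actual estimator, sampling frequency, and scaling to dollars per year are not given here.

```python
def price_flow_covariance(price_changes, order_flow):
    """Unbiased sample covariance between price changes and signed order flow.

    Hypothetical helper: both inputs are equally long sequences, one value
    per time interval (for example, per bar).
    """
    n = len(price_changes)
    if n != len(order_flow) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_dp = sum(price_changes) / n
    mean_q = sum(order_flow) / n
    cross = sum((dp - mean_dp) * (q - mean_q)
                for dp, q in zip(price_changes, order_flow))
    return cross / (n - 1)


if __name__ == "__main__":
    # Toy data only: signed order flow and price changes that respond
    # linearly to it with an assumed impact coefficient of 0.1.
    flow = [1.0, -2.0, 3.0, -1.0, 2.0]
    dprice = [0.1 * q for q in flow]
    print(price_flow_covariance(dprice, flow))
```

In this toy setup the covariance equals the assumed impact coefficient times the variance of order flow, so it grows with both the price sensitivity to flow and the amount of flow.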
The aggregate value of information is about 0.04% of market cap, which is considerably lower than the 0.67% in fees investors pay each year searching for superior returns (French 2008). We discuss potential resolutions for these puzzling findings. 2026-05-11T19:42:47Z Ohad Kadan Asaf Manela

http://arxiv.org/abs/2410.14927v2 Hierarchical Reinforced Trader (HRT): A Bi-Level Approach for Optimizing Stock Selection and Execution 2026-05-11T16:31:19Z Automated equity trading requires converting noisy market and news signals into executable portfolio decisions under risk, turnover, and transaction costs. We propose Hierarchical Reinforced Trader (HRT), a bi-level reinforcement learning framework for text-aware portfolio management in multi-asset equity markets. HRT separates trading into two coordinated decisions: a factorized sparse High-Level Controller (HLC) selects asset-level increase, reduce, or hold directions from compact market and text-derived signals, while a risk-aware Low-Level Controller (LLC) converts these directions into feasible portfolio weight adjustments under turnover, drawdown, and text-risk penalties. This decomposition avoids enumerating the full joint action space and makes selection and execution easier to inspect. We evaluate HRT on an open stock-news benchmark with a fixed 89-stock Nasdaq universe, using 2013--2018 for training, 2019 for validation, and 2020--2023 for final out-of-sample testing; the test horizon is restricted to 2020--2023 due to public benchmark data availability under the same timestamp-clean text-aware protocol. Across market-proxy, same-universe portfolio, alpha-only, flat-RL, and hierarchical ablation baselines, HRT delivers the strongest learning-based return--risk--cost trade-off. The full model improves Sharpe from 1.06 for HRT-Base to 1.24, reduces daily turnover from 0.112 to 0.090, and remains robust under transaction-cost stress.
These results suggest that separating sparse directional selection from risk-aware execution is an effective way to incorporate market forecasts and text-derived risk signals into portfolio management. 2024-10-19T01:29:38Z Zijie Zhao Roy E. Welsch
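
The Sharpe and daily-turnover figures quoted in the HRT abstract are standard portfolio metrics. The sketch below shows one common way to compute them, assuming a zero risk-free rate, 252 trading periods per year, and one-way turnover defined as half the sum of absolute weight changes. These conventions and both function names are assumptions; the abstract does not state which definitions the paper uses.

```python
import math


def annualized_sharpe(returns, periods_per_year=252):
    """Mean over sample standard deviation, scaled by sqrt(periods_per_year).

    Assumes a zero risk-free rate (an illustrative simplification).
    """
    n = len(returns)
    if n < 2:
        raise ValueError("need at least 2 returns")
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    if var == 0:
        raise ValueError("returns have zero variance")
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


def mean_daily_turnover(weights):
    """Average one-way turnover between consecutive daily weight vectors.

    Turnover on a day is taken as 0.5 * sum(|w_t - w_{t-1}|), an assumed
    convention; some sources omit the factor 0.5.
    """
    if len(weights) < 2:
        raise ValueError("need at least 2 weight vectors")
    daily = [
        0.5 * sum(abs(a - b) for a, b in zip(curr, prev))
        for prev, curr in zip(weights, weights[1:])
    ]
    return sum(daily) / len(daily)


if __name__ == "__main__":
    # Toy data only, not taken from the paper.
    print(annualized_sharpe([0.01, -0.01, 0.02, 0.0]))
    print(mean_daily_turnover([[0.5, 0.5], [0.6, 0.4], [0.6, 0.4]]))
```

Because turnover conventions differ by a factor of two across sources, values computed this way are only comparable to published figures when the definition is known to match.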