http://arxiv.org/abs/2602.10798v2 Trading in CEXs and DEXs with Priority Fees and Stochastic Delays 2026-02-19T09:45:20Z

We develop a mixed control framework that combines absolutely continuous controls with impulse interventions subject to stochastic execution delays. The model extends current impulse control formulations by allowing (i) the controller to choose the mean of the stochastic delay of their impulses and (ii) multiple pending orders, so that several impulses can be submitted and executed asynchronously at random times. The framework is motivated by an optimal trading problem between centralized (CEX) and decentralized (DEX) exchanges. In DEXs, traders control the distribution of the execution delay through the priority fee paid, introducing a fundamental trade-off between delays, uncertainty, and costs. We study the optimal trading problem of an agent exploiting trading signals in CEXs and DEXs. From a mathematical perspective, we derive the associated dynamic programming principle of this new class of impulse control problems, and establish the viscosity properties of the corresponding quasi-variational inequalities. From a financial perspective, our model provides insights on how to carry out execution across CEXs and DEXs, highlighting how traders manage latency risk optimally through priority fee selection. We show that employing the optimal priority fee significantly outperforms non-strategic fee selection.

2026-02-11T12:39:58Z Philippe Bergault Yadh Hafsi Leandro Sánchez-Betancourt http://arxiv.org/abs/2508.04344v2 Performative Market Making 2026-02-18T17:35:37Z

Financial models do not merely analyse markets, but actively shape them. This effect, known as performativity, describes how financial theories and the subsequent actions based on them influence market processes, by creating self-fulfilling prophecies. Although discussed in the literature on economic sociology, this deeply rooted phenomenon lacks mathematical formulation in financial markets. Our paper closes this gap by breaking down the canonical separation of diffusion processes between the description of the market environment and the financial model. We do that by embedding the model in the process itself, creating a closed feedback loop, and demonstrate how prices change towards greater conformity to the prevailing financial model used in the market. We further show, with closed-form solutions and machine learning, how a performative market maker can reverse engineer the current dominant strategies in the market and effectively arbitrage them while maintaining competitive quotes and superior P&L.

2025-08-06T11:37:26Z Charalampos Kleitsikas Stefanos Leonardos Carmine Ventre http://arxiv.org/abs/2508.16595v2 Empirical Analysis of the Model-Free Valuation Approach: Hedging Gaps, Conservatism, and Trading Opportunities 2026-02-17T05:38:03Z

In this paper we study the quality of model-free valuation approaches for financial derivatives by systematically evaluating the difference between model-free super-hedging strategies and the realized payoff of financial derivatives using historical option prices from several constituents of the S&P 500 between 2018 and 2022. In particular, our study allows us to describe empirically the realized gap between the payoff and the model-free hedging strategy, so that we can quantify the degree to which model-free approaches are overly conservative. Our results imply that the model-free hedging approach is only marginally more conservative than industry-standard models such as the Heston model while being model-free at the same time. This finding, its statistical description and the model-independence of the hedging approach enable us to construct an explicit trading strategy which, as we demonstrate, can be profitably applied in financial markets, and additionally possesses the desirable feature of explicit control of its downside risk, since its model-free construction prevents losses pathwise.

2025-08-09T00:06:12Z Zixing Chen Yihan Qi Shanlan Que Julian Sester Xiao Zhang http://arxiv.org/abs/2602.15182v1 Autodeleveraging as Online Learning 2026-02-16T20:42:34Z

Autodeleveraging (ADL) is a last-resort loss socialization mechanism used by perpetual futures venues when liquidation and insurance buffers are insufficient to restore solvency. Despite the scale of perpetual futures markets, ADL has received limited formal treatment as a sequential control problem. This paper provides a concise formalization of ADL as online learning on a PNL-haircut domain: at each round, the venue selects a solvency budget and a set of profitable trader accounts. The profitable accounts are liquidated to cover shortfalls up to the solvency budget, with the aim of recovering exchange-wide solvency. In this model, ADL haircuts apply to positive PNL (unrealized gains), not to posted collateral principal. Using our online learning model, we provide robustness results and theoretical upper bounds on how poorly a mechanism can perform at recovering solvency. We apply our model to the October 10, 2025 Hyperliquid stress episode. The regret caused by Hyperliquid's production ADL queue is about 50\% of an upper bound on regret, calibrated to this event, while our optimized algorithm achieves about 2.6\% of the same bound. In dollar terms, the production ADL model over-liquidates trader profits by up to \$51.7M. We also counterfactually evaluated algorithms inspired by our online learning framework that perform better and found that the best algorithm reduces overshoot to \$3M. Our results provide simple, implementable mechanisms for improving ADL in live perpetuals exchanges.
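The idea of haircuts that apply only to positive PNL, capped by a solvency budget, can be illustrated with a minimal Python sketch. The pro-rata rule, the function name pro_rata_haircuts, and the sample numbers are assumptions made for illustration; they are not the paper's algorithm or Hyperliquid's production queue.

```python
def pro_rata_haircuts(pnls, budget):
    """Hypothetical pro-rata rule (an assumption, not the paper's method).

    Spreads a solvency budget across accounts in proportion to their
    positive PNL. Accounts with zero or negative PNL are never haircut,
    and no account loses more than its unrealized gain.
    """
    positive_total = sum(p for p in pnls if p > 0)
    if positive_total <= 0 or budget <= 0:
        return [0.0 for _ in pnls]
    # Fraction of each unrealized gain to take, capped at 100%.
    fraction = min(1.0, budget / positive_total)
    return [p * fraction if p > 0 else 0.0 for p in pnls]


if __name__ == "__main__":
    # Illustrative account PNLs and budget, not data from the paper.
    print(pro_rata_haircuts([100.0, -50.0, 300.0], 200.0))
```

Under this rule the total haircut equals the budget whenever aggregate positive PNL is large enough to cover it, and equals the aggregate positive PNL otherwise.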

2026-02-16T20:42:34Z Tarun Chitra Nagu Thogiti Mauricio Jean Pieer Trujillo Ramirez Victor Xu http://arxiv.org/abs/2512.01112v3 Autodeleveraging: Impossibilities and Optimization 2026-02-16T20:30:07Z

Autodeleveraging (ADL) is a last-resort loss socialization mechanism for perpetual futures venues. It is triggered when solvency-preserving liquidations fail. Despite the dominance of perpetual futures in the crypto derivatives market, with over \$60 trillion of volume in 2024, there has been no formal study of ADL. In this paper, we provide the first rigorous model of ADL. We prove that ADL mechanisms face a fundamental \emph{trilemma}: no policy can simultaneously satisfy exchange \emph{solvency}, \emph{revenue}, and \emph{fairness} to traders. This impossibility theorem implies that as participation scales, a novel form of \emph{moral hazard} grows asymptotically, rendering `zero-loss' socialization impossible. On the positive side, we show that three classes of ADL mechanisms can optimally navigate this trilemma to provide fairness, robustness to price shocks, and maximal exchange revenue. We analyze these mechanisms on the Hyperliquid dataset from October 10, 2025, when ADL was used repeatedly to close \$2.1 billion of positions in 12 minutes. By comparing production ADL to transparent benchmark allocations, we find that Hyperliquid's production algorithm overshot the minimum trader profit haircut required to cover the shortfall. Our methodology suggests the excess profits lost by profitable traders are between \$45.0M and \$51.7M. In terms of the positions liquidated, this corresponds to roughly \$653.6M of positions being closed. This comparison also suggests that Binance overutilized ADL far more than Hyperliquid. Our results show both theoretically and empirically that optimized ADL mechanisms can dramatically reduce losses of trader profitability while maintaining exchange solvency.

2025-11-30T22:17:49Z Update 1: Empirical data given new cleaned data from Mauricio Trujillo (@ConejoCapital) Update 2: Corrections from public feedback; corrected empirical analysis Tarun Chitra http://arxiv.org/abs/2602.14670v1 FactorMiner: A Self-Evolving Agent with Skills and Experience Memory for Financial Alpha Discovery 2026-02-16T11:48:52Z

Formulaic alpha factor mining is a critical yet challenging task in quantitative investment, characterized by a vast search space and the need for domain-informed, interpretable signals. However, finding novel signals becomes increasingly difficult as the library grows due to high redundancy. We propose FactorMiner, a lightweight and flexible self-evolving agent framework designed to navigate this complex landscape through continuous knowledge accumulation. FactorMiner combines a Modular Skill Architecture that encapsulates systematic financial evaluation into executable tools with a structured Experience Memory that distills historical mining trials into actionable insights (successful patterns and failure constraints). By instantiating the Ralph Loop paradigm -- retrieve, generate, evaluate, and distill -- FactorMiner iteratively uses memory priors to guide exploration, reducing redundant search while focusing on promising directions. Experiments on multiple datasets across different assets and markets show that FactorMiner constructs a diverse library of high-quality factors with competitive performance, while maintaining low redundancy among factors as the library scales. Overall, FactorMiner provides a practical approach to scalable discovery of interpretable formulaic alpha factors under the "Correlation Red Sea" constraint.

2026-02-16T11:48:52Z Yanlong Wang Jian Xu Hongkang Zhang Shao-Lun Huang Danny Dongning Sun Xiao-Ping Zhang http://arxiv.org/abs/2509.12456v2 Reinforcement Learning-Based Market Making as a Stochastic Control on Non-Stationary Limit Order Book Dynamics 2026-02-14T18:57:48Z

Reinforcement Learning has emerged as a promising framework for developing adaptive and data-driven strategies, enabling market makers to optimize decision-making policies based on interactions with the limit order book environment. This paper explores the integration of a reinforcement learning agent in a market-making context, where the underlying market dynamics have been explicitly modeled to capture observed stylized facts of real markets, including clustered order arrival times, non-stationary spreads and return drifts, stochastic order quantities and price volatility. These mechanisms aim to enhance stability of the resulting control agent, and serve to incorporate domain-specific knowledge into the agent policy learning process. Our contributions include a practical implementation of a market making agent based on the Proximal-Policy Optimization (PPO) algorithm, alongside a comparative evaluation of the agent's performance under varying market conditions via a simulator-based environment. As evidenced by our analysis of the financial return and risk metrics when compared to a closed-form optimal solution, our results suggest that the reinforcement learning agent can effectively be used under non-stationary market conditions, and that the proposed simulator-based environment can serve as a valuable tool for training and pre-training reinforcement learning agents in market-making scenarios.

2025-09-15T21:08:13Z 9 pages, 8 figures, 3 tables, 31 equations Rafael Zimmer Oswaldo Luiz do Valle Costa http://arxiv.org/abs/2408.11773v2 Deviations from the Nash equilibrium in a two-player optimal execution game with reinforcement learning 2026-02-13T15:27:13Z

The use of reinforcement learning algorithms in financial trading is becoming increasingly prevalent. However, the autonomous nature of these algorithms can lead to unexpected outcomes that deviate from traditional game-theoretical predictions and may even destabilize markets. In this study, we examine a scenario in which two autonomous agents, modelled with Double Deep Q-Learning, learn to liquidate the same asset optimally in the presence of market impact, under the Almgren-Chriss (2000) framework. We show that the strategies learned by the agents deviate significantly from the Nash equilibrium of the corresponding market impact game. Notably, the learned strategies exhibit a supra-competitive solution, which might be compatible with tacit collusive behaviour, closely aligning with the Pareto-optimal solution. We further explore how different levels of market volatility influence the agents' performance and the equilibria they discover, including scenarios where volatility differs between the training and testing phases.

2024-08-21T16:54:53Z Fabrizio Lillo Andrea Macrì http://arxiv.org/abs/2505.07078v5 Can LLM-based Financial Investing Strategies Outperform the Market in Long Run? 2026-02-12T16:17:00Z

Large Language Models (LLMs) have recently been leveraged for asset pricing tasks and stock trading applications, enabling AI agents to generate investment decisions from unstructured financial data. However, most evaluations of LLM timing-based investing strategies are conducted on narrow timeframes and limited stock universes, overstating effectiveness due to survivorship and data-snooping biases. We critically assess their generalizability and robustness by proposing FINSABER, a backtesting framework evaluating timing-based strategies across longer periods and a larger universe of symbols. Systematic backtests over two decades and 100+ symbols reveal that previously reported LLM advantages deteriorate significantly under a broader cross-section and a longer-term evaluation. Our market regime analysis further demonstrates that LLM strategies are overly conservative in bull markets, underperforming passive benchmarks, and overly aggressive in bear markets, incurring heavy losses. These findings highlight the need to develop LLM strategies that prioritise trend detection and regime-aware risk controls over mere scaling of framework complexity.

2025-05-11T18:02:21Z KDD 2026, Datasets & Benchmarks Track Weixian Waylon Li Hyeonjun Kim Mihai Cucuringu Tiejun Ma http://arxiv.org/abs/2602.12104v1 Liquidation Dynamics in DeFi and the Role of Transaction Fees 2026-02-12T15:58:40Z

Liquidations of collateral are the primary safeguard for solvency of lending protocols in decentralized finance. However, the mechanics of liquidations expose these protocols to predatory price manipulations and other forms of Maximal Extractable Value (MEV). In this paper, we characterize the optimal liquidation strategy, via a dynamic program, from the perspective of a profit-maximizing liquidator when the spot oracle is given by a Constant Product Market Maker (CPMM). We explicitly model Oracle Extractable Value (OEV) where liquidators manipulate the CPMM with sandwich attacks to trigger profitable liquidation events. We derive closed-form liquidation bounds and prove that CPMM transaction fees act as a critical security parameter. Crucially, we demonstrate that fees do not merely reduce attacker profits, but can make such manipulations unprofitable for an attacker. Our findings suggest that CPMM transaction fees serve a dual purpose: compensating liquidity providers and endogenously hardening CPMM oracles against manipulation without the latency of time-weighted averages or medianization.
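To see why a CPMM fee raises the cost of moving the pool price and then unwinding, consider the following minimal Python sketch of a constant-product pool. It assumes a proportional fee charged on the input amount and retained in the pool (a common convention, but an assumption here); the function names and reserve values are hypothetical and do not reproduce the paper's liquidation model or bounds.

```python
def swap_x_for_y(x_reserve, y_reserve, dx, fee):
    """Sell dx of asset X to the pool; returns new reserves and Y received.

    Assumption: the fee is taken on the input and stays in the pool.
    """
    dx_eff = dx * (1.0 - fee)
    dy = y_reserve * dx_eff / (x_reserve + dx_eff)
    return x_reserve + dx, y_reserve - dy, dy


def swap_y_for_x(x_reserve, y_reserve, dy, fee):
    """Sell dy of asset Y to the pool; returns new reserves and X received."""
    dy_eff = dy * (1.0 - fee)
    dx = x_reserve * dy_eff / (y_reserve + dy_eff)
    return x_reserve - dx, y_reserve + dy, dx


def round_trip_loss(x_reserve, y_reserve, dx, fee):
    """X lost by pushing the price with dx and immediately unwinding."""
    x1, y1, dy = swap_x_for_y(x_reserve, y_reserve, dx, fee)
    _, _, dx_back = swap_y_for_x(x1, y1, dy, fee)
    return dx - dx_back


if __name__ == "__main__":
    # Illustrative reserves and trade size, not values from the paper.
    for fee in (0.0, 0.003, 0.01):
        print(fee, round_trip_loss(1000.0, 1000.0, 100.0, fee))
```

With a zero fee and no intervening trades the round trip is costless in this sketch, while any positive fee makes the manipulator lose part of the input. A manipulation is then only worthwhile if the liquidation profit it triggers exceeds that loss.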

2026-02-12T15:58:40Z 28 pages, 9 figures Agathe Sadeghi Zachary Feinstein http://arxiv.org/abs/2602.12030v1 Time-Inhomogeneous Volatility Aversion for Financial Applications of Reinforcement Learning 2026-02-12T15:00:28Z

Sequential decision problems are common in finance, and reinforcement learning (RL) is emerging as a promising tool for optimising them without the need for analytical tractability. However, the objective of classical RL is the expected cumulative reward, while financial applications typically require a trade-off between return and risk. In this work, we focus on settings where one cares about the time split of the total return, ruling out most risk-aware generalisations of RL which optimise a risk measure defined on the latter. We note that a preference for homogeneous splits, which we found satisfactory for hedging, can be unfit for other problems, and therefore propose a new risk metric which still penalises uncertainty of the single rewards, but allows for an arbitrary planning of their target levels. We study the properties of the resulting objective and the generalisation of learning algorithms to optimise it. Finally, we show numerical results on toy examples.

2026-02-12T15:00:28Z 18 pages, 6 figures Federico Cacciamani Roberto Daluiso Marco Pinciroli Michele Trapletti Edoardo Vittori http://arxiv.org/abs/2602.10785v1 A novel approach to trading strategy parameter optimization using double out-of-sample data and walk-forward techniques 2026-02-11T12:19:24Z

This study introduces a novel approach to walk-forward optimization by parameterizing the lengths of training and testing windows. We demonstrate that the performance of a trading strategy using the Exponential Moving Average (EMA) evaluated within a walk-forward procedure based on the Robust Sharpe Ratio is highly dependent on the chosen window size. We investigated the strategy on intraday Bitcoin data at six frequencies (1 minute to 60 minutes) using 81 combinations of walk-forward window lengths (1 day to 28 days) over a 19-month training period. The two best-performing parameter sets from the training data were applied to a 21-month out-of-sample testing period to ensure data independence. The strategy was only executed once during the testing period. To further validate the framework, strategy parameters estimated on Bitcoin were applied to Binance Coin and Ethereum. Our results suggest the robustness of our custom approach. In the training period for Bitcoin, all combinations of walk-forward windows outperformed a Buy-and-Hold strategy. During the testing period, the strategy performed similarly to Buy-and-Hold but with lower drawdown and a higher Information Ratio. Similar results were observed for Binance Coin and Ethereum. The real strength was demonstrated when a portfolio combining Buy-and-Hold with our strategies outperformed all individual strategies and Buy-and-Hold alone, achieving the highest overall performance and a 50 percent reduction in drawdown. A conservative fee of 0.1 percent per transaction was included in all calculations. A cost sensitivity analysis was performed as a sanity check, revealing that the strategy's break-even point was around 0.4 percent per transaction. This research highlights the importance of optimizing walk-forward window lengths and emphasizes the value of single-time out-of-sample testing for reliable strategy evaluation.
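Parameterizing the training and testing window lengths of a walk-forward procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the rolling scheme that advances by one test window, the function names, and the standard EMA recursion with alpha = 2 / (span + 1) are choices made here, not details confirmed by the study.

```python
def walk_forward_windows(n_obs, train_len, test_len):
    """Return [(train_range, test_range), ...] as half-open index pairs.

    Assumption: windows roll forward by test_len, so test segments
    are contiguous and never overlap.
    """
    windows = []
    start = 0
    while start + train_len + test_len <= n_obs:
        train = (start, start + train_len)
        test = (start + train_len, start + train_len + test_len)
        windows.append((train, test))
        start += test_len
    return windows


def ema(prices, span):
    """Exponential moving average seeded with the first price."""
    alpha = 2.0 / (span + 1.0)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1.0 - alpha) * out[-1])
    return out


if __name__ == "__main__":
    # Hypothetical sizes: 10 observations, train on 4, test on 2.
    for train, test in walk_forward_windows(10, 4, 2):
        print("train", train, "test", test)
```

In each window the strategy parameters (for example the EMA span) would be chosen on the train range and applied to the test range. Treating train_len and test_len as tunable inputs is what makes the window lengths themselves part of the optimization.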

2026-02-11T12:19:24Z 40 pages, 8 figures, 11 tables Tomasz Mroziewicz Robert Ślepaczuk http://arxiv.org/abs/2505.07820v2 Revisiting the Excess Volatility Puzzle Through the Lens of the Chiarella Model 2026-02-10T16:53:47Z

We amend and extend the Chiarella model of financial markets to deal with arbitrary long-term value drifts in a consistent way. This allows us to improve upon existing calibration schemes, opening the possibility of calibrating individual monthly time series instead of classes of time series. The technique is employed on spot prices of four asset classes from ca. 1800 onward (stock indices, bonds, commodities, currencies). The so-called fundamental value is a direct output of the calibration, which allows us to (a) quantify the amount of excess volatility in these markets, which we find to be large (e.g. a factor $\approx$ 4 for stock indices) and consistent with previous estimates; and (b) determine the distribution of mispricings (i.e. the difference between market price and value), which we find in many cases to be bimodal. Both findings are strongly at odds with the Efficient Market Hypothesis. We also study in detail the 'sloppiness' of the calibration, that is, the directions in parameter space that are weakly constrained by data. The main conclusions of our study are remarkably consistent across different asset classes, and reinforce the hypothesis that the medium-term fate of financial markets is determined by a tug-of-war between trend followers and fundamentalists.

2025-05-12T17:59:46Z 20 pages plus 11 pages of appendices, 11+12 figures, 2+6 tables Jutta G. Kurth Adam A. Majewski Jean-Philippe Bouchaud http://arxiv.org/abs/2511.13277v2 Stationary Distributions of the Mode-switching Chiarella Model 2026-02-10T13:02:51Z

We derive the stationary distribution in various regimes of the extended Chiarella model of financial markets. This model is a stochastic nonlinear dynamical system that encompasses dynamical competition between a (saturating) trending and a mean-reverting component. We find the so-called mispricing distribution and the trend distribution to be unimodal Gaussians in the small noise, small feedback limit. Slow trends yield Gaussian-cosh mispricing distributions that allow for a P-bifurcation: unimodality occurs when mean-reversion is fast, bimodality when it is slow. The critical point of this bifurcation is established and refutes previous ad-hoc reports and differs from the bifurcation condition of the dynamical system itself. For fast, weakly coupled trends, deploying the Furutsu-Novikov theorem reveals that the result is again unimodal Gaussian. For the same case with higher coupling we disprove another claim from the literature: bimodal trend distributions do not generally imply bimodal mispricing distributions. The latter becomes bimodal only for stronger trend feedback. The exact solution in this last regime remains unfortunately beyond our proficiency.

2025-11-17T11:50:14Z 7 pages, 4 figures, 12 pages of appendices Jutta G. Kurth Jean-Philippe Bouchaud http://arxiv.org/abs/2501.15106v2 Solving Optimal Execution Problems via In-Context Operator Networks 2026-02-07T04:21:48Z

We propose a novel transformer-based neural network architecture (ICON-OCnet) for solving optimal order execution problems in the presence of unknown price impact. Our architecture facilitates data-driven in-context operator learning for the incurred price impact by merging offline pre-training with online few-shot prompting inference. First, the operator learning component (ICON) learns the prevailing price impact environment from only a few executed trade and price impact trajectories (time series data) provided as context. Second, we employ ICON as a surrogate operator to train a neural network policy (OCnet) for the optimal order execution strategy for the price impact regime inferred from the in-context examples. We study the efficiency of our approach for linear propagator models with path-dependent transient price impact and explicitly known optimal execution strategies. In this model class, price impact persists and decays over time according to some propagator kernel. We illustrate that ICON is capable of accurately inferring the underlying price impact model from the data prompts, even for propagator kernels not seen in the training data. Moreover, we demonstrate that ICON-OCnet correctly retrieves the exact optimal order execution strategy for the model generating the in-context examples. Our methodology is very general, offering a new sample-based approach to solving path-dependent optimal stochastic control problems with unknown state dynamics.

2025-01-25T07:15:47Z 27 pages, 11 figures Tingwei Meng Moritz Voß Nils Detering Giulio Farolfi Stanley Osher Georg Menz