https://arxiv.org/api/yJQ/NlGgRfk3rAqPR3bm+74r5eI2026-06-14T09:10:25Z225930015http://arxiv.org/abs/2508.17837v1Bimodal Dynamics of the Artificial Limit Order Book Stock Exchange with Autonomous Traders2025-08-25T09:36:00ZThis paper explores the bifurcative dynamics of an artificial stock market exchange (ASME) with endogenous, myopic traders interacting through a limit order book (LOB). We showed that agent-based price dynamics possess intrinsic bistability, which is not a result of randomness but an emergent property of micro-level trading rules, where even identical initial conditions lead to qualitatively different long-run price equilibria: a deterministic zero-price state and a persistent positive-price equilibrium. The study also identifies a metastable region with elevated volatility between the basins of attraction and reveals distinct transient behaviors for trajectories converging to these equilibria. Furthermore, we observe that the system is neither entirely regular nor fully chaotic. By highlighting the emergence of divergent market outcomes from uniform beginnings, this work contributes a novel perspective on the inherent path dependence and complex dynamics of artificial stock markets.2025-08-25T09:36:00ZMatej SteinbacherMitja SteinbacherMatjaz Steinbacherhttp://arxiv.org/abs/2508.14784v1Graph Learning for Foreign Exchange Rate Prediction and Statistical Arbitrage2025-08-20T15:29:31ZWe propose a two-step graph learning approach for foreign exchange statistical arbitrages (FXSAs), addressing two key gaps in prior studies: the absence of graph-learning methods for foreign exchange rate prediction (FXRP) that leverage multi-currency and currency-interest rate relationships, and the disregard of the time lag between price observation and trade execution. In the first step, to capture complex multi-currency and currency-interest rate relationships, we formulate FXRP as an edge-level regression problem on a discrete-time spatiotemporal graph. 
This graph consists of currencies as nodes and exchanges as edges, with interest rates and foreign exchange rates serving as node and edge features, respectively. We then introduce a graph-learning method that leverages the spatiotemporal graph to address the FXRP problem. In the second step, we present a stochastic optimization problem to exploit FXSAs while accounting for the observation-execution time lag. To address this problem, we propose a graph-learning method that enforces constraints through projection and ReLU, maximizes risk-adjusted return by leveraging a graph with exchanges as nodes and influence relationships as edges, and utilizes the predictions from the FXRP method for the constraint parameters and node features. Moreover, we prove that our FXSA method satisfies empirical arbitrage constraints. The experimental results demonstrate that our FXRP method yields statistically significant improvements in mean squared error, and that the FXSA method achieves a 61.89% higher information ratio and a 45.51% higher Sortino ratio than a benchmark. Our approach provides a novel perspective on FXRP and FXSA within the context of graph learning.2025-08-20T15:29:31ZYoonsik HongDiego Klabjanhttp://arxiv.org/abs/2508.14656v1Deep Learning for Short Term Equity Trend Forecasting: A Behavior Driven Multi Factor Approach2025-08-20T12:15:32ZThis study proposes a behaviorally-informed multi-factor stock selection framework that integrates short-cycle technical alpha signals with deep learning. We design a dual-task multilayer perceptron (MLP) that jointly predicts five-day future returns and directional price movements, thereby capturing nonlinear market behaviors such as volume-price divergence, momentum-driven herding, and bottom reversals. The model is trained on 40 carefully constructed factors derived from price-volume patterns and behavioral finance insights. 
Empirical evaluation demonstrates that the dual-task MLP achieves superior and stable performance across both predictive accuracy and economic relevance, as measured by information coefficient (IC), information ratio (IR), and portfolio backtesting results. Comparative experiments further show that deep learning methods outperform linear baselines by effectively capturing structural interactions between factors. This work highlights the potential of structure-aware deep learning in enhancing multi-factor modeling and provides a practical framework for short-horizon quantitative investment strategies.2025-08-20T12:15:32ZYuqi Luanhttp://arxiv.org/abs/2508.00554v3ContestTrade: A Multi-Agent Trading System Based on Internal Contest Mechanism2025-08-18T06:13:10ZIn financial trading, large language model (LLM)-based agents demonstrate significant potential. However, the high sensitivity to market noise undermines the performance of LLM-based trading systems. To address this limitation, we propose a novel multi-agent system featuring an internal competitive mechanism inspired by modern corporate management structures. The system consists of two specialized teams: (1) Data Team - responsible for processing and condensing massive market data into diversified text factors, ensuring they fit the model's constrained context. (2) Research Team - tasked with making parallelized multipath trading decisions based on deep research methods. The core innovation lies in implementing a real-time evaluation and ranking mechanism within each team, driven by authentic market feedback. Each agent's performance undergoes continuous scoring and ranking, with only outputs from top-performing agents being adopted. The design enables the system to adaptively adjust to dynamic environment, enhances robustness against market noise and ultimately delivers superior trading performance. 
Experimental results demonstrate that our proposed system significantly outperforms prevailing multi-agent systems and traditional quantitative investment methods across diverse evaluation metrics. ContestTrade is open-sourced on GitHub at https://github.com/FinStep-AI/ContestTrade.2025-08-01T11:48:13ZLi ZhaoRui SunZuoyou JiangBo YangYuxiao BaiMengting ChenXinyang WangJing LiZuo Baihttp://arxiv.org/abs/2508.08698v1DiffVolume: Diffusion Models for Volume Generation in Limit Order Books2025-08-12T07:42:00ZModeling limit order books (LOBs) dynamics is a fundamental problem in market microstructure research. In particular, generating high-dimensional volume snapshots with strong temporal and liquidity-dependent patterns remains a challenging task, despite recent work exploring the application of Generative Adversarial Networks to LOBs. In this work, we propose a conditional \textbf{Diff}usion model for the generation of future LOB \textbf{Volume} snapshots (\textbf{DiffVolume}). We evaluate our model across three axes: (1) \textit{Realism}, where we show that DiffVolume, conditioned on past volume history and time of day, better reproduces statistical properties such as marginal distribution, spatial correlation, and autocorrelation decay; (2) \textit{Counterfactual generation}, allowing for controllable generation under hypothetical liquidity scenarios by additionally conditioning on a target future liquidity profile; and (3) \textit{Downstream prediction}, where we show that the synthetic counterfactual data from our model improves the performance of future liquidity forecasting models. 
Together, these results suggest that DiffVolume provides a powerful and flexible framework for realistic and controllable LOB volume generation.2025-08-12T07:42:00Z13 pages, 6 figures, 3 tablesZhuohan WangCarmine Ventrehttp://arxiv.org/abs/2508.21075v1A Stream Pipeline Framework for Digital Payment Programming based on Smart Contracts2025-08-12T03:58:19ZDigital payments play a pivotal role in the burgeoning digital economy. Moving forward, enhancing digital payment systems requires programmability, beyond mere efficiency and convenience, to meet evolving needs and complexities. Smart contract platforms like Central Bank Digital Currency (CBDC) networks and blockchains support programmable digital payments. However, the prevailing paradigm of programming payment logic involves coding smart contracts with programming languages, leading to high costs and significant security challenges. This paper presents a novel and versatile method for payment programming on distributed ledger technologies (DLTs): transforming digital currencies into token streams, then pipelining smart contracts to authorize, aggregate, lock, direct, and dispatch these streams efficiently from source to target accounts. From a small set of configurable templates, a few specialized smart contracts can be generated that support most payment logic through configuration and composition. This approach could substantially reduce the cost of payment programming and enhance security, self-enforcement, adaptability, and controllability, and thus holds the potential to become an essential component in the infrastructure of the digital economy.2025-08-12T03:58:19Z5 pages, 2 figuresZijia MengVictor Fenghttp://arxiv.org/abs/2508.08152v1Optimal Fees for Liquidity Provision in Automated Market Makers2025-08-11T16:30:02ZPassive liquidity providers (LPs) in automated market makers (AMMs) face losses due to adverse selection (LVR), which static trading fees often fail to offset in practice. 
We study the key determinants of LP profitability in a dynamic reduced-form model where an AMM operates in parallel with a centralized exchange (CEX), traders route their orders optimally to the venue offering the better price, and arbitrageurs exploit price discrepancies. Using large-scale simulations and real market data, we analyze how LP profits vary with market conditions such as volatility and trading volume, and characterize the optimal AMM fee as a function of these conditions. We highlight the mechanisms driving these relationships through extensive comparative statics, and confirm the model's relevance through market data calibration. A key trade-off emerges: fees must be low enough to attract volume, yet high enough to earn sufficient revenues and mitigate arbitrage losses. We find that under normal market conditions, the optimal AMM fee is competitive with the trading cost on the CEX and remarkably stable, whereas in periods of very high volatility, a high fee protects passive LPs from severe losses. These findings suggest that a threshold-type dynamic fee schedule is both robust enough to market conditions and improves LP outcomes.2025-08-11T16:30:02Z43 pages, 23 figures, 8 tablesSteven CampbellPhilippe BergaultJason MilionisMarcel Nutzhttp://arxiv.org/abs/2508.06914v1Prediction of high-frequency futures return directions based on the mean uncertainty classification methods: An application in China's future market2025-08-09T09:56:48ZIn this paper, we mainly focus on the prediction of short-term average return directions in China's high-frequency futures market. As minor fluctuations with limited amplitude and short duration are typically regarded as random noise, only price movements of sufficient magnitude qualify as statistically significant signals. Therefore data imbalance emerges as a key problem during predictive modeling. 
From the viewpoint of data distribution imbalance, we employ the mean-uncertainty logistic regression (mean-uncertainty LR) classification method under the sublinear expectation (SLE) framework, and further propose the mean-uncertainty support vector machines (mean-uncertainty SVM) method for the prediction. Corresponding investment strategies are developed based on the prediction results. For data selection, we utilize trading data and limit order book data of the 15 most liquid products among the most active contracts in China's futures market. Empirical results demonstrate that, compared with conventional LR-related and SVM-related imbalanced data classification methods, the two mean-uncertainty approaches yield significant advantages in both classification metrics and average returns per trade.2025-08-09T09:56:48Z19 pages, 3 figuresYing PengYifan ZhangXin Wanghttp://arxiv.org/abs/2508.16598v1Sizing the Risk: Kelly, VIX, and Hybrid Approaches in Put-Writing on Index Options2025-08-09T08:31:00ZThis paper examines systematic put-writing strategies applied to S&P 500 Index options, with a focus on position sizing as a key determinant of long-term performance. Despite the well-documented volatility risk premium, where implied volatility exceeds realized volatility, the practical implementation of short-dated volatility-selling strategies remains underdeveloped in the literature. This study evaluates three position sizing approaches: the Kelly criterion, VIX-based volatility regime scaling, and a novel hybrid method combining both. Using SPXW options with expirations from 0 to 5 days, the analysis explores a broad design space, including moneyness levels, volatility estimators, and memory horizons. Results show that ultra-short-dated, far out-of-the-money options deliver superior risk-adjusted returns. The hybrid sizing method consistently balances return generation with robust drawdown control, particularly under low-volatility conditions such as those seen in 2024. 
The study offers new insights into volatility harvesting, introducing a dynamic sizing framework that adapts to shifting market regimes. It also contributes practical guidance for constructing short-dated option strategies that are robust across market environments. These findings have direct applications for institutional investors seeking to enhance portfolio efficiency through systematic exposure to volatility premia.2025-08-09T08:31:00ZMaciej Wysockihttp://arxiv.org/abs/2508.16589v1ARL-Based Multi-Action Market Making with Hawkes Processes and Variable Volatility2025-08-07T21:50:30ZWe advance market-making strategies by integrating Adversarial Reinforcement Learning (ARL), Hawkes Processes, and variable volatility levels while also expanding the action space available to market makers (MMs). To enhance the adaptability and robustness of these strategies -- which can quote always, quote only on one side of the market or not quote at all -- we shift from the commonly used Poisson process to the Hawkes process, which better captures real market dynamics and self-exciting behaviors. We then train and evaluate strategies under volatility levels of 2 and 200. Our findings show that the 4-action MM trained in a low-volatility environment effectively adapts to high-volatility conditions, maintaining stable performance and providing two-sided quotes at least 92\% of the time. 
This indicates that incorporating flexible quoting mechanisms and realistic market simulations significantly enhances the effectiveness of market-making strategies.2025-08-07T21:50:30ZICAIF '24: Proceedings of the 5th ACM International Conference on AI in Finance, November 14--17, 2024, Brooklyn, NY, USAZiyi WangCarmine VentreMaria Polukarov10.1145/3677052.3698695http://arxiv.org/abs/2508.16588v1Robust Market Making: To Quote, or not To Quote2025-08-07T21:49:24ZMarket making is a popular trading strategy, which aims to generate profit from the spread between the quotes posted on either side of the market. It has been shown that training market makers (MMs) with adversarial reinforcement learning helps overcome the risks due to changing market conditions and leads to robust performance. Prior work assumes, however, that MMs keep quoting throughout the trading process, but in practice this is not required, even for ``registered'' MMs (that only need to satisfy quoting ratios defined by the market rules). In this paper, we build on this line of work and enrich the strategy space of the MM by allowing it to occasionally not quote or to provide single-sided quotes. Towards this end, in addition to the MM agents that provide continuous bid-ask quotes, we have designed two new agents with increasingly rich action spaces. The first has the option to provide bid-ask quotes or refuse to quote. The second has the option to provide bid-ask quotes, refuse to quote, or only provide single-sided ask or bid quotes. We employ a model-driven approach to empirically compare the performance of the continuously quoting MM with the two agents above in various types of adversarial environments. We demonstrate how occasional refusal to provide bid-ask quotes improves returns and/or Sharpe ratios. 
The quoting ratios of well-trained MMs can basically meet any market requirements, reaching up to 99.9$\%$ in some cases.2025-08-07T21:49:24ZICAIF '23: Proceedings of the Fourth ACM International Conference on AI in Finance, November 27--29, 2023, Brooklyn, NY, USAZiyi WangCarmine VentreMaria Polukarov10.1145/3604237.3626858http://arxiv.org/abs/2508.02247v2ByteGen: A Tokenizer-Free Generative Model for Orderbook Events in Byte Space2025-08-07T04:31:56ZGenerative modeling of high-frequency limit order book (LOB) dynamics is a critical yet unsolved challenge in quantitative finance, essential for robust market simulation and strategy backtesting. Existing approaches are often constrained by simplifying stochastic assumptions or, in the case of modern deep learning models like Transformers, rely on tokenization schemes that affect the high-precision, numerical nature of financial data through discretization and binning. To address these limitations, we introduce ByteGen, a novel generative model that operates directly on the raw byte streams of LOB events. Our approach treats the problem as an autoregressive next-byte prediction task, for which we design a compact and efficient 32-byte packed binary format to represent market messages without information loss. The core novelty of our work is the complete elimination of feature engineering and tokenization, enabling the model to learn market dynamics from its most fundamental representation. We achieve this by adapting the H-Net architecture, a hybrid Mamba-Transformer model that uses a dynamic chunking mechanism to discover the inherent structure of market messages without predefined rules. Our primary contributions are: 1) the first end-to-end, byte-level framework for LOB modeling; 2) an efficient packed data representation; and 3) a comprehensive evaluation on high-frequency data. 
Trained on over 34 million events from CME Bitcoin futures, ByteGen successfully reproduces key stylized facts of financial markets, generating realistic price distributions, heavy-tailed returns, and bursty event timing. Our findings demonstrate that learning directly from byte space is a promising and highly flexible paradigm for modeling complex financial systems, achieving competitive performance on standard market quality metrics without the biases of tokenization.2025-08-04T09:48:42Z21 pages, 3 tables, 5 figuresYang LiZhi Chenhttp://arxiv.org/abs/2508.03474v1Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets2025-08-05T14:06:50ZPolymarket is a prediction market platform where users can speculate on future events by trading shares tied to specific outcomes, known as conditions. Each market is associated with a set of one or more such conditions. To ensure proper market resolution, the condition set must be exhaustive -- collectively accounting for all possible outcomes -- and mutually exclusive -- only one condition may resolve as true. Thus, the collective prices of all related outcomes should be \$1, representing a combined probability of 1 of any outcome. Despite this design, Polymarket exhibits cases where dependent assets are mispriced, allowing for purchasing (or selling) a certain outcome for less than (or more than) \$1, guaranteeing profit. This phenomenon, known as arbitrage, could enable sophisticated participants to exploit such inconsistencies.
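The pricing condition described above can be illustrated with a minimal sketch. It assumes a single set of mutually exclusive, exhaustive outcomes, each paying 1 USD if it resolves true, and it ignores fees, order book depth, and slippage. The function name `rebalancing_arbitrage` and the example prices are hypothetical and are not taken from the paper or from Polymarket.

```python
def rebalancing_arbitrage(prices, tol=1e-9):
    """Check one set of mutually exclusive, exhaustive outcomes.

    prices: price in USD of one share of each outcome; exactly one
            outcome pays 1 USD at resolution (assumption of this sketch).
    Returns (action, profit) where profit is per complete set of shares.
    """
    total = sum(prices)
    if total < 1.0 - tol:
        # Buying one share of every outcome costs `total` and pays 1 USD.
        return ("buy_all", 1.0 - total)
    if total > 1.0 + tol:
        # Selling one share of every outcome collects `total` and owes 1 USD.
        return ("sell_all", total - 1.0)
    # Prices are consistent with a total probability of 1.
    return ("none", 0.0)


if __name__ == "__main__":
    # Hypothetical three-outcome market whose prices sum to less than 1 USD.
    print(rebalancing_arbitrage([0.40, 0.35, 0.20]))
```

In this sketch the profit is locked in at trade time because exactly one outcome pays out regardless of which one it is; in practice, fees and limited liquidity at the quoted prices would reduce or remove it.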
In this paper, we conduct an empirical arbitrage analysis on Polymarket data to answer three key questions: (Q1) What conditions give rise to arbitrage? (Q2) Does arbitrage actually occur on Polymarket? (Q3) Has anyone exploited these opportunities? A major challenge in analyzing arbitrage between related markets lies in the scalability of comparisons across a large number of markets and conditions, with a naive analysis requiring $O(2^{n+m})$ comparisons. To overcome this, we employ a heuristic-driven reduction strategy based on timeliness, topical similarity, and combinatorial relationships, further validated by expert input.
Our study reveals two distinct forms of arbitrage on Polymarket: Market Rebalancing Arbitrage, which occurs within a single market or condition, and Combinatorial Arbitrage, which spans across multiple markets. We use on-chain historical order book data to analyze when these types of arbitrage opportunities have existed, and when they have been executed by users. We find a realized estimate of 40 million USD of profit extracted.2025-08-05T14:06:50ZOriol SaguilloVahid GhafouriLucianna KifferGuillermo Suarez-Tangilhttp://arxiv.org/abs/2508.03217v1Measuring DEX Efficiency and The Effect of an Enhanced Routing Method on Both DEX Efficiency and Stakeholders' Benefits2025-08-05T08:45:38ZThe efficiency of decentralized exchanges (DEXs) and the influence of token routing algorithms on market performance and stakeholder outcomes remain underexplored. This paper introduces the concept of Standardized Total Arbitrage Profit (STAP), computed via convex optimization, as a systematic measure of DEX efficiency. We prove that executing the trade order maximizing STAP and reintegrating the resulting transaction fees eliminates all arbitrage opportunities-both cyclic arbitrage within DEXs and between DEXs and centralized exchanges (CEXs). In a fully efficient DEX (i.e., STAP = 0), the monetary value of target tokens received must not exceed that of the source tokens, regardless of the routing algorithm. Any violation indicates arbitrage potential, making STAP a reliable metric for arbitrage detection. Using a token graph comprising 11 tokens and 18 liquidity pools based on Uniswap V2 data, we observe a decline in DEX efficiency between June 21 and November 8, 2024. Simulations comparing two routing algorithms-Yu Zhang et al.'s line-graph-based method and the depth-first search (DFS) algorithm-show that employing more profitable routing improves DEX efficiency and trader returns over time. 
Moreover, while total value locked (TVL) remains stable with the line-graph method, it increases under the DFS algorithm, indicating greater aggregate benefits for liquidity providers.2025-08-05T08:45:38ZYu ZhangClaudio J. Tessonehttp://arxiv.org/abs/2508.02971v1Modeling Loss-Versus-Rebalancing in Automated Market Makers via Continuous-Installment Options2025-08-05T00:30:24ZThis paper mathematically models a constant-function automated market maker (CFAMM) position as a portfolio of exotic options, known as perpetual American continuous-installment (CI) options. This model replicates an AMM position's delta at each point in time over an infinite time horizon, thus taking into account the perpetual nature and optionality to withdraw of liquidity provision. This framework yields two key theoretical results: (a) It proves that the AMM's adverse-selection cost, loss-versus-rebalancing (LVR), is analytically identical to the continuous funding fees (the time value decay or theta) earned by the at-the-money CI option embedded in the replicating portfolio. (b) A special case of this model derives an AMM liquidity position's delta profile and boundaries that suffer approximately constant LVR, up to a bounded residual error, over an arbitrarily long forward window. Finally, the paper describes how the constant volatility parameter required by the perpetual option can be calibrated from the term structure of implied volatilities and estimates the errors for both implied volatility calibration and LVR residual error. Thus, this work provides a practical framework enabling liquidity providers to choose an AMM liquidity profile and price boundaries for an arbitrarily long, forward-looking time window where they can expect an approximately constant, price-independent LVR. 
The results establish a rigorous option-theoretic interpretation of AMMs and their LVR, and provide actionable guidance for liquidity providers in estimating future adverse-selection costs and optimizing position parameters.2025-08-05T00:30:24ZSrisht Fateh SinghReina Ke Xin LiSamuel GaskinYuntao WuJeffrey KlinckPanagiotis MichalopoulosZissis PoulosAndreas Veneris