http://arxiv.org/abs/2502.18625v2 The Market Maker's Dilemma: Navigating the Fill Probability vs. Post-Fill Returns Trade-Off 2025-11-23T00:21:15Z

Using data from a live trading experiment on the Binance Bitcoin perpetual, we examine the effects of (i) basic order book mechanics and (ii) the persistence of price changes from immediate to short timescales, revealing the interplay between returns, queue sizes, and orders' queue positions. We document a fundamental trade-off: a negative correlation between maker fill likelihood and post-fill returns. This dictates that viable maker strategies often require a contrarian approach, counter-trading the prevailing order book imbalance. These dynamics render commonly-cited strategies highly unprofitable, leading us to model 'Reversals': situations where a contrarian maker strategy at the touch proves effective.

2025-02-25T20:22:59Z Jakob Albers Mihai Cucuringu Sam Howison Alexander Y. Shestopaloff http://arxiv.org/abs/2512.04099v1 Partial multivariate transformer as a tool for cryptocurrencies time series prediction 2025-11-22T21:59:32Z

Forecasting cryptocurrency prices is hindered by extreme volatility and a methodological dilemma between information-scarce univariate models and noise-prone full-multivariate models. This paper investigates a partial-multivariate approach to balance this trade-off, hypothesizing that a strategic subset of features offers superior predictive power. We apply the Partial-Multivariate Transformer (PMformer) to forecast daily returns for BTCUSDT and ETHUSDT, benchmarking it against eleven classical and deep learning models. Our empirical results yield two primary contributions. First, we demonstrate that the partial-multivariate strategy achieves significant statistical accuracy, effectively balancing informative signals with noise. Second, we examine an observable disconnect between this statistical performance and practical trading utility: lower prediction error did not consistently translate to higher financial returns in simulations. This finding challenges the reliance on traditional error metrics and highlights the need to develop evaluation criteria more aligned with real-world financial objectives.

2025-11-22T21:59:32Z Accepted for publication in the proceedings of ICTAI 2025 Andrzej Tokajuk Jarosław A. Chudziak http://arxiv.org/abs/2511.17479v1 Emergence of Randomness in Temporally Aggregated Financial Tick Sequences 2025-11-21T18:27:59Z

Market efficiency implies that stock returns are intrinsically unpredictable, a property that makes markets comparable to random number generators. We present a novel methodology to investigate ultra-high-frequency financial data and to evaluate the extent to which tick-by-tick returns resemble random sequences. We extend the analysis of ultra-high-frequency stock market data by applying comprehensive sets of randomness tests, beyond the usual reliance on serial correlation or entropy measures. Our purpose is to extensively analyze the randomness of these data using statistical tests from standard batteries that evaluate different aspects of randomness. We illustrate the effect of time aggregation in transforming highly correlated high-frequency trade data into random streams. More specifically, we use many of the tests in the NIST Statistical Test Suite and in the TestU01 battery (in particular the Rabbit and Alphabit sub-batteries) to show that the degree of randomness of financial tick data increases with the aggregation level in transaction time. Additionally, the comprehensive nature of our tests uncovers novel patterns, such as non-monotonic behaviors in predictability for certain assets. This study demonstrates a model-free approach for both assessing randomness in financial time series and generating pseudo-random sequences from them, with potential relevance in several applications.
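
As an illustration of the kind of test involved, the sketch below implements the frequency (monobit) test from the NIST Statistical Test Suite and applies it to price changes sampled at a chosen aggregation step. The helper `returns_to_bits` and its encoding (up-move = 1, down-move = 0, unchanged prices dropped) are assumptions made for this example, not the encoding used in the paper.

```python
import math


def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for the hypothesis that 0s and 1s are balanced."""
    n = len(bits)
    if n == 0:
        raise ValueError("need at least one bit")
    # Map 1 -> +1 and 0 -> -1, then sum.
    s = sum(1 if b else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


def returns_to_bits(prices, step):
    """Hypothetical encoding: keep every `step`-th trade price (aggregation in
    transaction time), then emit 1 for an up-move and 0 for a down-move.
    Unchanged prices are dropped."""
    sampled = prices[::step]
    bits = []
    for prev, cur in zip(sampled, sampled[1:]):
        if cur != prev:
            bits.append(1 if cur > prev else 0)
    return bits
```

A small p-value (for example below 0.01) rejects the hypothesis of balanced bits. Repeating the test for increasing values of `step` gives one simple view of how aggregation affects a single aspect of randomness.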

2025-11-21T18:27:59Z Silvia Onofri Andrey Shternshis Stefano Marmi http://arxiv.org/abs/2502.07071v3 TRADES: Generating Realistic Market Simulations with Diffusion Models 2025-11-20T09:51:16Z

Financial markets are complex systems characterized by high statistical noise, nonlinearity, volatility, and constant evolution. Thus, modeling them is extremely hard. Here, we address the task of generating realistic and responsive Limit Order Book (LOB) market simulations, which are fundamental for calibrating and testing trading strategies, performing market impact experiments, and generating synthetic market data. We propose a novel TRAnsformer-based Denoising Diffusion Probabilistic Engine for LOB Simulations (TRADES). TRADES generates realistic order flows as time series conditioned on the state of the market, leveraging a transformer-based architecture that captures the temporal and spatial characteristics of high-frequency market data. There is a notable absence of quantitative metrics for evaluating generative market simulation models in the literature. To tackle this problem, we adapt the predictive score, a metric measured as an MAE, to market data by training a stock price predictive model on synthetic data and testing it on real data. We compare TRADES with previous works on two stocks, reporting a 3.27 and 3.48 improvement over SoTA according to the predictive score, demonstrating that we generate useful synthetic market data for financial downstream tasks. Furthermore, we assess TRADES's market simulation realism and responsiveness, showing that it effectively learns the conditional data distribution and successfully reacts to an experimental agent, opening the way to possible calibrations and evaluations of trading strategies and market impact experiments. To perform the experiments, we developed DeepMarket, the first open-source Python framework for LOB market simulation with deep learning. In our repository, we include a synthetic LOB dataset composed of TRADES's generated simulations.
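
The train-on-synthetic, test-on-real idea behind the predictive score can be sketched as follows. The predictor here is a plain least-squares autoregression chosen only to keep the example short. It is an assumption and stands in for whatever predictive model the paper trains. The names `lagged_matrix` and `predictive_score` are hypothetical.

```python
import numpy as np


def lagged_matrix(series, n_lags):
    """Build (X, y): each row of X holds n_lags consecutive values plus an
    intercept column, and y holds the value that follows them."""
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - n_lags
    lags = [series[i:i + n_rows] for i in range(n_lags)]
    X = np.column_stack(lags + [np.ones(n_rows)])
    y = series[n_lags:]
    return X, y


def predictive_score(synthetic, real, n_lags=3):
    """Fit a linear one-step predictor on synthetic prices, then report its
    mean absolute error (MAE) on real prices. Lower is better."""
    X_syn, y_syn = lagged_matrix(synthetic, n_lags)
    coef, *_ = np.linalg.lstsq(X_syn, y_syn, rcond=None)
    X_real, y_real = lagged_matrix(real, n_lags)
    return float(np.mean(np.abs(X_real @ coef - y_real)))
```

The score is low only if the synthetic series carries the same predictive structure as the real one, which is why it serves as a proxy for the usefulness of generated data in downstream tasks.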

2025-01-31T19:43:13Z 8 pages ECAI 2025. Volume 413: Pages 3703 - 3710 Leonardo Berti Bardh Prenkaj Paola Velardi 10.3233/FAIA251249 http://arxiv.org/abs/2511.15262v1 Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution 2025-11-19T09:26:23Z

We investigate the use of Reinforcement Learning for the optimal execution of meta-orders, where the objective is to execute incrementally large orders while minimizing implementation shortfall and market impact over an extended period of time. Departing from traditional parametric approaches to price dynamics and impact modeling, we adopt a model-free, data-driven framework. Since policy optimization requires counterfactual feedback that historical data cannot provide, we employ the Queue-Reactive Model to generate realistic and tractable limit order book simulations that encompass transient price impact, and nonlinear and dynamic order flow responses. Methodologically, we train a Double Deep Q-Network agent on a state space comprising time, inventory, price, and depth variables, and evaluate its performance against established benchmarks. Numerical simulation results show that the agent learns a policy that is both strategic and tactical, adapting effectively to order book conditions and outperforming standard approaches across multiple training configurations. These findings provide strong evidence that model-free Reinforcement Learning can yield adaptive and robust solutions to the optimal execution problem.

2025-11-19T09:26:23Z Tomas Espana Yadh Hafsi Fabrizio Lillo Edoardo Vittori http://arxiv.org/abs/2511.13614v1 Market-Dependent Communication in Multi-Agent Alpha Generation 2025-11-17T17:19:56Z

Multi-strategy hedge funds face a fundamental organizational choice: should analysts generating trading strategies communicate, and if so, how? We investigate this using 5-agent LLM-based trading systems across 450 experiments spanning 21 months, comparing five organizational structures from isolated baseline to collaborative and competitive conversation. We show that communication improves performance, but optimal communication design depends on market characteristics. Competitive conversation excels in volatile technology stocks, while collaborative conversation dominates stable general stocks. Finance stocks resist all communication interventions. Surprisingly, all structures, including isolated agents, converge to similar strategy alignments, challenging assumptions that transparency causes harmful diversity loss. Performance differences stem from behavioral mechanisms: competitive agents focus on stock-level allocation while collaborative agents develop technical frameworks. Conversation quality scores show zero correlation with returns. These findings demonstrate that optimal communication design must match market volatility characteristics, and sophisticated discussions don't guarantee better performance.

2025-11-17T17:19:56Z Jerick Shi Burton Hollifield http://arxiv.org/abs/2511.12490v1 Discovery of a 13-Sharpe OOS Factor: Drift Regimes Unlock Hidden Cross-Sectional Predictability 2025-11-16T07:55:00Z

We document a high-performing cross-sectional equity factor that achieves out-of-sample Sharpe ratios above 13 through regime-conditional signal activation. The strategy combines value and short-term reversal signals only during stock-specific drift regimes, defined as periods when individual stocks show more than 60 percent positive days in trailing 63-day windows. Under these conditions, the factor delivers annualized returns of 158.6 percent with 12.0 percent volatility and a maximum drawdown of minus 11.9 percent. Using rigorous walk-forward validation across 20 years of S&P 500 data (2004 to 2024), we show performance roughly 13 times stronger than market benchmarks on a risk-adjusted basis, produced entirely out-of-sample with frozen parameters. The factor passes extensive robustness tests, including 1,000 randomization trials with p-values below 0.001, and maintains Sharpe ratios above 7 even under 30 percent parameter perturbations. Exposure to standard risk factors is negligible, with total R-squared values below 3 percent. We provide mechanistic evidence that drift regimes reshape market microstructure by amplifying behavioral biases, altering liquidity patterns, and creating conditions where cross-sectional price discovery becomes systematically exploitable. Conservative capacity estimates indicate deployable capital of 100 to 500 million dollars before noticeable performance degradation.
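
The drift-regime definition stated above (more than 60 percent positive days in a trailing 63-day window) can be written down directly. This is a minimal sketch: the function name `drift_regime` is hypothetical, and including the current day in the trailing window is an assumption, since the abstract does not specify the alignment.

```python
import numpy as np


def drift_regime(returns, window=63, threshold=0.60):
    """Flag days on which the share of positive daily returns over the
    trailing `window` days (current day included) exceeds `threshold`."""
    returns = np.asarray(returns, dtype=float)
    positive = (returns > 0).astype(float)
    # Prefix sums make each rolling count a single subtraction.
    csum = np.concatenate([[0.0], np.cumsum(positive)])
    flags = np.zeros(len(returns), dtype=bool)
    for end in range(window, len(returns) + 1):
        share = (csum[end] - csum[end - window]) / window
        flags[end - 1] = share > threshold
    return flags
```

With a 63-day window the threshold is first crossed at 38 positive days (38/63 is about 0.603), while 37 positive days (about 0.587) is not enough. Days before a full window is available are left unflagged.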

2025-11-16T07:55:00Z This paper presents a regime-conditioned long short equity factor with out-of-sample Sharpe above 13, validated over 20 years with frozen parameters. We include conservative costs, impact modeling, stress tests, and capacity estimates of 100 to 500 million dollars with annualized returns above 70 percent. At 1 billion dollars, the annualized return is 33.6 percent. Feedback is welcome Mainak Singha http://arxiv.org/abs/2511.12129v1 A Practical Machine Learning Approach for Dynamic Stock Recommendation 2025-11-15T09:32:03Z

Stock recommendation is vital to investment companies and investors. However, no single stock selection strategy always wins, and analysts may not have enough time to check all stocks in the S&P 500 (the Standard & Poor's 500). In this paper, we propose a practical scheme that recommends stocks from the S&P 500 using machine learning. Our basic idea is to buy and hold the top 20% of stocks dynamically. First, we select representative stock indicators with good explanatory power. Second, we take five frequently used machine learning methods (linear regression, ridge regression, stepwise regression, random forest, and generalized boosted regression) to model stock indicators and quarterly log-return in a rolling window. Third, we choose the model with the lowest Mean Square Error in each period to rank stocks. Finally, we test the selected stocks with portfolio allocation methods such as equally weighted, mean-variance, and minimum-variance. Our empirical results show that the proposed scheme outperforms the long-only strategy on the S&P 500 index in terms of Sharpe ratio and cumulative returns. This work is fully open-sourced on GitHub: https://github.com/AI4Finance-Foundation/Dynamic-Stock-Recommendation-Machine_Learning-Published-Paper-IEEE
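
The per-period selection step (pick the model with the lowest mean squared error, rank stocks by its predictions, keep the top 20%) might look like the sketch below. It assumes each model's predictions are already available as arrays. The function name, argument layout, and the use of a validation set for the MSE are assumptions for illustration, not the interface of the linked repository.

```python
import numpy as np


def select_top_stocks(val_predictions, val_actual, next_predictions,
                      tickers, top_fraction=0.20):
    """Return (best_model_name, selected_tickers) for one period.

    val_predictions:  dict model name -> predicted returns on validation data
    val_actual:       realized returns for the same validation data
    next_predictions: dict model name -> predicted next-period return per ticker
    """
    actual = np.asarray(val_actual, dtype=float)
    mse = {
        name: float(np.mean((np.asarray(pred, dtype=float) - actual) ** 2))
        for name, pred in val_predictions.items()
    }
    best = min(mse, key=mse.get)
    scores = np.asarray(next_predictions[best], dtype=float)
    n_top = max(1, int(round(top_fraction * len(tickers))))
    order = np.argsort(-scores)  # highest predicted return first
    return best, [tickers[i] for i in order[:n_top]]
```

Calling this once per rolling window yields the sequence of portfolios that the allocation methods (equally weighted, mean-variance, minimum-variance) would then weight.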

2025-11-15T09:32:03Z Accepted by IEEE TrustCom/BigDataSE 2018. Supported by AI4Finance Foundation Hongyang Yang Xiao-Yang Liu Qingwei Wu http://arxiv.org/abs/2511.12120v1 Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy 2025-11-15T09:15:10Z

Stock trading strategies play a critical role in investment. However, it is challenging to design a profitable strategy in a complex and dynamic stock market. In this paper, we propose an ensemble strategy that employs deep reinforcement schemes to learn a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using three actor-critic based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). The ensemble strategy inherits and integrates the best features of the three algorithms, thereby robustly adjusting to different market situations. In order to avoid the large memory consumption in training networks with continuous action space, we employ a load-on-demand technique for processing very large data. We test our algorithms on the 30 Dow Jones stocks that have adequate liquidity. The performance of the trading agent with different reinforcement learning algorithms is evaluated and compared with both the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy. The proposed deep ensemble strategy is shown to outperform the three individual algorithms and two baselines in terms of the risk-adjusted return measured by the Sharpe ratio. This work is fully open-sourced on GitHub: https://github.com/AI4Finance-Foundation/Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020

2025-11-15T09:15:10Z Accepted by ICAIF '20: Proceedings of the First ACM International Conference on AI in Finance. Conference program: https://ai-finance.org/2020program/ Hongyang Yang Xiao-Yang Liu Shan Zhong Anwar Walid http://arxiv.org/abs/2306.06031v2 FinGPT: Open-Source Financial Large Language Models 2025-11-15T08:39:26Z

Large language models (LLMs) have shown the potential of revolutionizing natural language processing tasks in diverse domains, sparking great interest in finance. Accessing high-quality financial data is the first challenge for financial LLMs (FinLLMs). While proprietary models like BloombergGPT have taken advantage of their unique data accumulation, such privileged access calls for an open-source alternative to democratize Internet-scale financial data. In this paper, we present an open-source large language model, FinGPT, for the finance sector. Unlike proprietary models, FinGPT takes a data-centric approach, providing researchers and practitioners with accessible and transparent resources to develop their FinLLMs. We highlight the importance of an automatic data curation pipeline and the lightweight low-rank adaptation technique in building FinGPT. Furthermore, we showcase several potential applications as stepping stones for users, such as robo-advising, algorithmic trading, and low-code development. Through collaborative efforts within the open-source AI4Finance community, FinGPT aims to stimulate innovation, democratize FinLLMs, and unlock new opportunities in open finance. Two associated code repos are https://github.com/AI4Finance-Foundation/FinGPT and https://github.com/AI4Finance-Foundation/FinNLP

2023-06-09T16:52:00Z Accepted by the FinLLM Symposium at IJCAI 2023. Recipient of the Best Presentation Award (Hongyang Yang). Workshop link: https://finllm.github.io/workshop. This is the first official FinGPT paper; please cite this work when referencing FinGPT Hongyang Yang Xiao-Yang Liu Christina Dan Wang http://arxiv.org/abs/2310.14144v2 Unwinding Stochastic Order Flow: When to Warehouse Trades 2025-11-12T22:28:49Z

We study how to unwind stochastic order flow with minimal transaction costs. Stochastic order flow arises, e.g., in the central risk book (CRB), a centralized trading desk that aggregates order flows within a financial institution. The desk can warehouse in-flow orders, ideally netting them against subsequent opposite orders (internalization), or route them to the market (externalization) and incur costs related to price impact and bid-ask spread. We model and solve this problem for a general class of in-flow processes, enabling us to study in detail how in-flow characteristics affect optimal strategy and core trading metrics. Our model allows for an analytic solution in semi-closed form and is readily implementable numerically. Compared with a standard execution problem where the order size is known upfront, the unwind strategy exhibits an additive adjustment for projected future in-flows. Its sign depends on the autocorrelation of orders; only truth-telling (martingale) flow is unwound myopically. In addition to analytic results, we present extensive simulations for different use cases and regimes, and introduce new metrics of practical interest.

2023-10-22T00:51:12Z To appear in 'Mathematical Finance' Marcel Nutz Kevin Webster Long Zhao http://arxiv.org/abs/2503.08287v2 Liquidity Competition Between Brokers and an Informed Trader 2025-11-12T17:32:40Z

We study a multi-agent setting in which brokers transact with an informed trader. Through a sequential Stackelberg-type game, brokers manage trading costs and adverse selection with an informed trader. In particular, supplying liquidity to the informed trader allows the brokers to speculate based on the flow information. They simultaneously attempt to minimize inventory risk and trading costs with the lit market based on the informed order flow, also known as the internalization-externalization strategy. We solve in closed form for the trading strategy that the informed trader uses with each broker and propose a system of equations which classify the equilibrium strategies of the brokers. By solving these equations numerically, we may study the resulting strategies in equilibrium. Finally, we formulate a competitive game between brokers in order to determine the liquidity prices subject to precommitment supplied to the informed trader and provide a numerical example in which the resulting equilibrium is not Pareto efficient.

2025-03-11T10:59:49Z Ryan Donnelly Zi Li http://arxiv.org/abs/2501.02963v3 A data-driven merit order: Learning a fundamental electricity price model 2025-11-12T13:21:52Z

Electricity price forecasting approaches generally fall into two categories: data-driven models, which learn from historical patterns, or fundamental models, which simulate market mechanisms. We propose a novel and highly efficient data-driven merit order model that integrates both paradigms. The model embeds the classical expert-based merit order as a nested special case, allowing all key parameters, such as plant efficiencies, bidding behavior, and available capacities, to be estimated directly from historical data, rather than assumed. We further enhance the model with critical embedded extensions such as hydro power, cross-border flows and corrections for underreported capacities, which considerably improve forecasting accuracy. Applied to the German day-ahead market, our model outperforms both classic fundamental and state-of-the-art machine learning models. It retains the interpretability of fundamental models, offering insights into marginal technologies, fuel switches, and dispatch patterns, elements which are typically inaccessible to black-box machine learning approaches. This transparency and high computational efficiency make it a promising new direction for electricity price modeling.

2025-01-06T12:16:09Z Energy Economics, 154 (2026) 109114 Paul Ghelasi Florian Ziel 10.1016/j.eneco.2025.109114 http://arxiv.org/abs/2511.08571v1 Forecast-to-Fill: Benchmark-Neutral Alpha and Billion-Dollar Capacity in Gold Futures (2015-2025) 2025-11-11T18:52:06Z

We test whether simple, interpretable state variables (trend and momentum) can generate durable out-of-sample alpha in one of the world's most liquid assets, gold. Using a rolling 10-year training and 6-month testing walk-forward from 2015 to 2025 (2,793 trading days), we convert a smoothed trend-momentum regime signal into volatility-targeted, friction-aware positions through fractional, impact-adjusted Kelly sizing and ATR-based exits. Out of sample, the strategy delivers a Sharpe ratio of 2.88 and a maximum drawdown of 0.52 percent, net of 0.7 basis-point linear cost and a square-root impact term (gamma = 0.02). A regression on spot-gold returns yields a 43 percent annualized return (CAGR approximately 43 percent) and a 37 percent alpha (Sharpe = 2.88, IR = 2.09) at a 15 percent volatility target with beta approximately 0.03, confirming benchmark-neutral performance. Bootstrap confidence intervals ([2.49, 3.27]) and SPA tests (p = 0.000) confirm statistical significance and robustness to latency, reversal, and cost stress. We conclude that forecast-to-fill engineering, which links transparent signals to executable trades with explicit risk, cost, and impact control, can transform modest predictability into allocator-grade, billion-dollar-scalable alpha.

2025-11-11T18:52:06Z Institutional-grade systematic framework: Sharpe 2.88, $1B capacity, benchmark-neutral. Seeking feedback on live deployment considerations, multi-asset extensions, and operational implementation at scale Mainak Singha Jose Aguilera-Toste Vinayak Lahiri http://arxiv.org/abs/2511.06177v1 Push-response anomalies in high-frequency S&P 500 price series 2025-11-09T01:34:00Z

We test the hypothesis that consecutive intraday price changes in the most liquid U.S. equity ETF (SPY) are conditionally nonrandom. Using NBBO event-time data for about 1,500 regular trading days, we form for every lag L ordered pairs of a backward price increment ("push") and a forward price increment ("response"), standardize them, and estimate the expected responses on a fine grid of push magnitudes. The resulting lag-by-magnitude maps reveal a persistent structural shift: for short lags (1-5,000 ticks), expected responses cluster near zero across most push magnitudes, suggesting high short-term efficiency; beyond that range, pronounced tails emerge, indicating that larger historical pushes increasingly correlate with nonzero conditional responses. We also find that large negative pushes are followed by stronger positive responses than equally large positive pushes, consistent with asymmetric liquidity replenishment after sell-side shocks. Decomposition into symmetric and antisymmetric components and the associated dominance curves confirm that short-horizon efficiency is restored only partially. The evidence points to an intraday, lag-resolved anomaly that is invisible in unconditional returns and that can be used to define tradable pockets and risk controls.
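
The push-response construction described above can be sketched for a single lag as follows. The standardization (division by the sample standard deviation) and the quantile-based magnitude grid are assumptions chosen for brevity. The abstract does not specify either, and the name `push_response` is hypothetical.

```python
import numpy as np


def push_response(prices, lag, n_bins=5):
    """For one lag, pair the backward increment ("push") with the forward
    increment ("response") at every admissible time, standardize both, and
    average the response within quantile bins of the push.

    Returns (mean push per bin, mean response per bin)."""
    p = np.asarray(prices, dtype=float)
    t = np.arange(lag, len(p) - lag)
    push = p[t] - p[t - lag]
    response = p[t + lag] - p[t]
    # Assumed standardization: scale each series by its sample std.
    push = push / push.std()
    response = response / response.std()
    # Assumed grid: equal-count bins of the standardized push.
    edges = np.quantile(push, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, push, side="right") - 1, 0, n_bins - 1)
    centers = np.array([push[idx == k].mean() for k in range(n_bins)])
    mean_response = np.array([response[idx == k].mean() for k in range(n_bins)])
    return centers, mean_response
```

Stacking the output over a range of lags gives a lag-by-magnitude map of the kind the abstract describes. A flat line near zero corresponds to no conditional predictability at that lag. The sketch assumes a non-constant series with enough distinct push values to populate every bin.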

2025-11-09T01:34:00Z Dmitrii Vlasiuk Mikhail Smirnov