http://arxiv.org/abs/2312.15950v3 The implied volatility surface (also) is path-dependent 2025-10-14T09:25:25Z

We propose a new model for forecasting both the implied volatility surface and the underlying asset price. In the spirit of Guyon and Lekeufack (2023), who are interested in the dependence of volatility indices (e.g. the VIX) on the paths of the associated equity indices (e.g. the S&P 500), we first study how the implied volatility of vanilla options can be predicted using the past trajectory of the underlying asset price. Our empirical study reveals that a large part of the movements of the at-the-money-forward implied volatility, for times-to-maturity of up to two years, can be explained using the past returns and their squares. Moreover, we show that this feedback effect gets weaker as the time-to-maturity increases. Building on this new stylized fact, we fit to historical data a parsimonious version of the SSVI parameterization (Gatheral and Jacquier, 2014) of the implied volatility surface, relying on only four parameters, and show that the two parameters ruling the at-the-money-forward implied volatility as a function of the time-to-maturity exhibit a path-dependent behavior with respect to the underlying asset price. Finally, we propose a model for the joint dynamics of the implied volatility surface and the underlying asset price. The latter is modelled using a variant of the path-dependent volatility model of Guyon and Lekeufack. The former is obtained by adding a feedback effect of the underlying asset price onto the two parameters ruling the at-the-money-forward implied volatility in the parsimonious SSVI parameterization, and by specifying Ornstein-Uhlenbeck processes for the residuals of these two parameters and Jacobi processes for the two other parameters. Thanks to this model, we are able to simulate highly realistic paths of implied volatility surfaces that are free from static arbitrage.
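For readers unfamiliar with SSVI, the following minimal sketch evaluates the generic SSVI total variance of Gatheral and Jacquier (2014) on a single maturity slice. It is not the four-parameter parsimonious version fitted in the paper, whose exact form the abstract does not give. The function names, the choice to pass the curvature `phi` as a plain number, and the sample parameter values are all assumptions made for illustration.

```python
import math

def ssvi_total_variance(k, theta, rho, phi):
    """Generic SSVI total implied variance w(k) at log-moneyness k.

    theta: at-the-money-forward total variance of the slice (theta > 0)
    rho:   skew parameter with |rho| < 1
    phi:   curvature phi(theta) > 0 for this slice, passed as a number
           (assumption: the functional form of phi is left unspecified)
    """
    return 0.5 * theta * (
        1.0 + rho * phi * k + math.sqrt((phi * k + rho) ** 2 + 1.0 - rho ** 2)
    )

def ssvi_implied_vol(k, theta, rho, phi, maturity):
    """Implied volatility from total variance: sigma = sqrt(w / T)."""
    return math.sqrt(ssvi_total_variance(k, theta, rho, phi) / maturity)

# Illustrative values only (not calibrated to any data).
theta, rho, phi, T = 0.04, -0.7, 1.5, 1.0
atm_vol = ssvi_implied_vol(0.0, theta, rho, phi, T)
```

At `k = 0` the formula collapses to `w = theta`, which is why the parameters governing `theta` as a function of maturity are the ones that control the at-the-money-forward implied volatility.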

2023-12-26T08:31:29Z Hervé Andrès CERMICS Alexandre Boumezoued CERMICS, MATHRISK Benjamin Jourdain CERMICS, MATHRISK http://arxiv.org/abs/2505.11243v2 A Set-Sequence Model for Time Series 2025-10-13T17:44:53Z

Many prediction problems across science and engineering, especially in finance and economics, involve large cross-sections of individual time series, where each unit (e.g., a loan, stock, or customer) is driven by unit-level features and latent cross-sectional dynamics. While sequence models have advanced per-unit temporal prediction, capturing cross-sectional effects often still relies on hand-crafted summary features. We propose Set-Sequence, a model that learns cross-sectional structure directly, enhancing expressivity and eliminating manual feature engineering. At each time step, a permutation-invariant Set module summarizes the unit set; a Sequence module then models each unit's dynamics conditioned on both its features and the learned summary. The architecture accommodates unaligned series, supports varying numbers of units at inference, integrates with standard sequence backbones (e.g., Transformers), and scales linearly in cross-sectional size. Across a synthetic contagion task and two large-scale real-world applications, equity portfolio optimization and loan risk prediction, Set-Sequence significantly outperforms strong baselines, delivering higher Sharpe ratios, improved AUCs, and interpretable cross-sectional summaries.
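The data flow described above can be sketched in a few lines. This is only an illustration of the idea of a permutation-invariant summary feeding a per-unit model: the paper's Set module is learned, whereas the sketch below substitutes a plain feature-wise mean, and the names `set_summary` and `sequence_inputs` are hypothetical.

```python
def set_summary(unit_features):
    """Permutation-invariant summary of one time step.

    Stand-in for a learned Set module (assumption): the feature-wise
    mean over all units, computed in a single pass over the units.
    """
    n = len(unit_features)
    dim = len(unit_features[0])
    return [sum(u[j] for u in unit_features) / n for j in range(dim)]

def sequence_inputs(unit_features):
    """Per-unit input for the Sequence module at one time step:
    the unit's own features concatenated with the shared summary."""
    summary = set_summary(unit_features)
    return [list(u) + summary for u in unit_features]

# Three units with two features each at a single time step.
units = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
inputs = sequence_inputs(units)
```

Because the summary does not depend on the order or the number of units, the same code handles a different cross-section size at inference, and its cost grows linearly with the number of units.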

2025-05-16T13:36:07Z Presented at the Workshop on Financial AI at ICLR 2025 Elliot L. Epstein Apaar Sadhwani Kay Giesecke http://arxiv.org/abs/2510.11616v1 Attention Factors for Statistical Arbitrage 2025-10-13T16:56:30Z

Statistical arbitrage exploits temporal price differences between similar assets. We develop a framework to jointly identify similar assets through factors, identify mispricing and form a trading policy that maximizes risk-adjusted performance after trading costs. Our Attention Factors are conditional latent factors that are the most useful for arbitrage trading. They are learned from firm characteristic embeddings that allow for complex interactions. We identify time-series signals from the residual portfolios of our factors with a general sequence model. Estimating factors and the arbitrage trading strategy jointly is crucial to maximize profitability after trading costs. In a comprehensive empirical study we show that our Attention Factor model achieves an out-of-sample Sharpe ratio above 4 on the largest U.S. equities over a 24-year period. Our one-step solution yields an unprecedented Sharpe ratio of 2.3 net of transaction costs. We show that weak factors are important for arbitrage trading.

2025-10-13T16:56:30Z Accepted to the 6th ACM International Conference on AI in Finance Elliot L. Epstein Rose Wang Jaewon Choi Markus Pelger http://arxiv.org/abs/2311.12330v2 A General Framework for Importance Sampling with Markov Random Walks 2025-10-13T06:32:07Z

Although stochastic models driven by latent Markov processes are widely used, the classical importance sampling methods based on exponential tilting for these models suffer from difficulties in computing the eigenvalues and associated eigenfunctions, and in the plausibility of the indirect asymptotic large deviation regime for the variance of the estimator. We propose a general importance sampling framework that twists the observable and latent processes separately using a link function that directly minimizes the estimator's variance. An optimal link function is chosen within the locally asymptotically normal family. We show the logarithmic efficiency of the proposed estimator. As applications, we estimate an overflow probability under a pandemic model and the CoVaR, a measurement of the co-dependent financial systemic risk. Both applications are beyond the scope of traditional importance sampling methods due to their nonlinear features.

2023-11-21T03:45:58Z 69 pages, 2 figures, 4 tables Cheng-Der Fuh Yanwei Jia Steven Kou http://arxiv.org/abs/2510.10878v1 Identifying and Quantifying Financial Bubbles with the Hyped Log-Periodic Power Law Model 2025-10-13T01:06:16Z

We propose a novel model, the Hyped Log-Periodic Power Law Model (HLPPL), for the problem of quantifying and detecting financial bubbles, an ever-fascinating one for academics and practitioners alike. Bubble labels are generated using a Log-Periodic Power Law (LPPL) model, sentiment scores, and a hype index we introduced in previous research on NLP forecasting of stock return volatility. Using these tools, a dual-stream transformer model is trained with market data and machine learning methods, resulting in a time series of confidence scores as a Bubble Score. A distinctive feature of our framework is that it captures phases of extreme overpricing and underpricing within a unified structure. We achieve an average annualized return of 34.13 percent when backtesting U.S. equities during the period 2018 to 2024, while the approach exhibits a remarkable generalization ability across industry sectors. Its conservative bias in predicting bubble periods minimizes false positives, a feature which is especially beneficial for market signaling and decision-making. Overall, this approach utilizes both theoretical and empirical advances for real-time positive and negative bubble identification and measurement with HLPPL signals.
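As background, the classical LPPL model referred to above describes the expected log-price before a critical time `tc` as a power law decorated with log-periodic oscillations. The sketch below evaluates that standard formula only. It does not implement the HLPPL extension, the hype index, or the transformer, none of which are specified in the abstract. The parameter names and the sign convention of the phase `phi` are assumptions (both signs appear in the literature).

```python
import math

def lppl_log_price(t, tc, m, omega, a, b, c, phi):
    """Classical LPPL expected log-price at time t < tc.

    ln p(t) = a + b*(tc - t)**m + c*(tc - t)**m * cos(omega*ln(tc - t) - phi)

    tc:    critical time (end of the bubble regime)
    m:     power-law exponent
    omega: angular log-frequency of the oscillations
    a, b, c, phi: level, power-law amplitude, oscillation amplitude, phase
    """
    dt = tc - t
    if dt <= 0:
        raise ValueError("LPPL is only defined for t < tc")
    power = dt ** m
    return a + b * power + c * power * math.cos(omega * math.log(dt) - phi)

# Illustrative parameters only (not fitted to any price series).
value = lppl_log_price(t=0.0, tc=1.0, m=0.5, omega=8.0, a=5.0, b=-1.0, c=0.1, phi=0.0)
```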

2025-10-13T01:06:16Z Zheng Cao Xingran Shao Yuheng Yan Helyette Geman http://arxiv.org/abs/2510.10526v1 Integrating Large Language Models and Reinforcement Learning for Sentiment-Driven Quantitative Trading 2025-10-12T09:49:39Z

This research develops a sentiment-driven quantitative trading system that leverages a large language model, FinGPT, for sentiment analysis, and explores a novel method for signal integration using a reinforcement learning algorithm, Twin Delayed Deep Deterministic Policy Gradient (TD3). We compare the performance of strategies that integrate sentiment and technical signals using both a conventional rule-based approach and a reinforcement learning framework. The results suggest that sentiment signals generated by FinGPT offer value when combined with traditional technical indicators, and that the reinforcement learning algorithm presents a promising approach for effectively integrating heterogeneous signals in dynamic trading environments.

2025-10-12T09:49:39Z Wo Long Wenxin Zeng Xiaoyu Zhang Ziyao Zhou http://arxiv.org/abs/2503.12305v2 Intraday Battery Dispatch for Hybrid Renewable Energy Assets 2025-10-10T20:52:20Z

We develop a mathematical model for intraday dispatch of co-located wind-battery energy assets. Focusing on the primary objective of firming grid-side actual production vis-a-vis the preset day-ahead hourly generation targets, we conduct a comprehensive study of the resulting stochastic control problem across different firming formulations and wind generation dynamics. Among others, we provide a closed-form solution in the special case of a quadratic objective and linear dynamics, as well as design a novel adaptation of a Gaussian Process-based Regression Monte Carlo algorithm for our setting. Extensions studied include an asymmetric loss function for peak shaving, capturing the cost of battery cycling, and the role of battery duration. In the applied portion of our work, we calibrate our model to a collection of 140+ wind-battery assets in Texas, benchmarking the economic benefits of firming based on outputs of a realistic unit commitment and economic dispatch solver.

2025-03-16T00:32:07Z Thiha Aung Mike Ludkovski http://arxiv.org/abs/2510.09247v1 Application of Deep Reinforcement Learning to At-the-Money S&P 500 Options Hedging 2025-10-10T10:35:50Z

This paper explores the application of deep Q-learning to hedging at-the-money options on the S&P 500 index. We develop an agent based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, trained to simulate hedging decisions without making explicit model assumptions on price dynamics. The agent was trained on historical intraday prices of S&P 500 call options across years 2004-2024, using a single time series of six predictor variables: option price, underlying asset price, moneyness, time to maturity, realized volatility, and current hedge position. A walk-forward procedure was applied for training, which led to nearly 17 years of out-of-sample evaluation. The performance of the deep reinforcement learning (DRL) agent is benchmarked against the Black-Scholes delta-hedging strategy over the same period. We assess both approaches using metrics such as annualized return, volatility, information ratio, and Sharpe ratio. To test the models' adaptability, we performed simulations across varying market conditions and added constraints such as transaction costs and risk-awareness penalties. Our results show that the DRL agent can outperform traditional hedging methods, particularly in volatile or high-cost environments, highlighting its robustness and flexibility in practical trading contexts. While the agent consistently outperforms delta-hedging, its performance deteriorates when the risk-awareness parameter is higher. We also observed that the longer the time interval used for volatility estimation, the more stable the results.

2025-10-10T10:35:50Z 35 pages Zofia Bracha Paweł Sakowski Jakub Michańków http://arxiv.org/abs/2508.17086v2 Detecting Multilevel Manipulation from Limit Order Book via Cascaded Contrastive Representation Learning 2025-10-10T07:32:36Z

Trade-based manipulation (TBM) undermines the fairness and stability of financial markets drastically. Spoofing, one of the most covert and deceptive TBM strategies, exhibits complex anomaly patterns across multilevel prices, while often being simplified as a single-level manipulation. These patterns are usually concealed within the rich, hierarchical information of the Limit Order Book (LOB), which is challenging to leverage due to high dimensionality and noise. To address this, we propose a representation learning framework combining a cascaded LOB representation architecture with supervised contrastive learning. Extensive experiments demonstrate that our framework consistently improves detection performance across diverse models, with Transformer-based architectures achieving state-of-the-art results. In addition, we conduct systematic analyses and ablation studies to investigate multilevel manipulation and the contributions of key components for detection, offering broader insights into representation learning and anomaly detection for complex time series data.

2025-08-23T16:57:32Z Yushi Lin Peng Yang http://arxiv.org/abs/2510.08268v1 Multi-Agent Analysis of Off-Exchange Public Information for Cryptocurrency Market Trend Prediction 2025-10-09T14:25:49Z

Cryptocurrency markets present unique prediction challenges due to their extreme volatility, 24/7 operation, and hypersensitivity to news events, with existing approaches suffering from poor key information extraction and poor sideways market detection, which are critical for risk management. We introduce a theoretically-grounded multi-agent cryptocurrency trend prediction framework that advances the state-of-the-art through three key innovations: (1) an information-preserving news analysis system with formal theoretical guarantees that systematically quantifies market impact, regulatory implications, volume dynamics, risk assessment, technical correlation, and temporal effects using large language models; (2) an adaptive volatility-conditional fusion mechanism with proven optimal properties that dynamically combines news sentiment and technical indicators based on market regime detection; (3) a distributed multi-agent coordination architecture with low communication complexity enabling real-time processing of heterogeneous data streams. Comprehensive experimental evaluation on Bitcoin across three prediction horizons demonstrates statistically significant improvements over a state-of-the-art natural language processing baseline, establishing a new paradigm for financial machine learning with broad implications for quantitative trading and risk management systems.

2025-10-09T14:25:49Z Kairan Hong Jinling Gan Qiushi Tian Yanglinxuan Guo Rui Guo Runnan Li http://arxiv.org/abs/2510.07444v1 Minimizing the Value-at-Risk of Loan Portfolio via Deep Neural Networks 2025-10-08T18:45:12Z

Risk management is a prominent issue in peer-to-peer lending. An investor may naturally reduce his risk exposure by diversifying instead of putting all his money on one loan. In that case, an investor may want to minimize the Value-at-Risk (VaR) or Conditional Value-at-Risk (CVaR) of his loan portfolio. We propose a low degree of freedom deep neural network model, DeNN, as well as a high degree of freedom model, DSNN, to tackle the problem. In particular, our models predict not only the default probability of a loan but also the time when it will default. The experiments demonstrate that both models can significantly reduce the portfolio VaRs at different confidence levels, compared to benchmarks. More interestingly, the low degree of freedom model, DeNN, outperforms DSNN in most scenarios.
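To make the two risk measures concrete, the sketch below computes empirical VaR and CVaR from a sample of portfolio losses. It illustrates the definitions only and is not the DeNN or DSNN model. The function name `var_cvar` is hypothetical, and the estimator shown (an order-statistic quantile, with CVaR taken as the mean of losses at or above the VaR) is one common convention among several.

```python
import math

def var_cvar(losses, alpha):
    """Empirical VaR and CVaR of a loss sample at confidence level alpha.

    VaR:  smallest observed loss such that at least a fraction alpha of
          the sample lies at or below it.
    CVaR: average of the observed losses at or above the VaR
          (one common empirical convention, assumed here).
    """
    ordered = sorted(losses)
    idx = max(math.ceil(alpha * len(ordered)) - 1, 0)
    var = ordered[idx]
    tail = ordered[idx:]
    cvar = sum(tail) / len(tail)
    return var, cvar

# Ten hypothetical portfolio losses (positive numbers are losses).
sample_losses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
var_90, cvar_90 = var_cvar(sample_losses, 0.9)  # → (9.0, 9.5)
```

CVaR is never smaller than VaR at the same level, since it averages only the losses in the tail beyond the VaR threshold.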

2025-10-08T18:45:12Z IJCAI 2017 Workshop on AI Applications in E-Commerce Albert Di Wang Ye Du http://arxiv.org/abs/2510.07180v1 Bayesian Portfolio Optimization by Predictive Synthesis 2025-10-08T16:18:11Z

Portfolio optimization is a critical task in investment. Most existing portfolio optimization methods require information on the distribution of returns of the assets that make up the portfolio. However, such distribution information is usually unknown to investors. Various methods have been proposed to estimate distribution information, but their accuracy greatly depends on the uncertainty of the financial markets. Due to this uncertainty, a model that predicts the distribution information well at one point in time may be less accurate than another model at a different time. To solve this problem, we investigate a method for portfolio optimization based on Bayesian predictive synthesis (BPS), one of the Bayesian ensemble methods for meta-learning. We assume that investors have access to multiple asset return prediction models. By using BPS with dynamic linear models to combine these predictions, we can obtain a Bayesian predictive posterior about the mean rewards of assets that accommodates the uncertainty of the financial markets. In this study, we examine how to construct mean-variance portfolios and quantile-based portfolios based on the predicted distribution information.

2025-10-08T16:18:11Z Masahiro Kato Kentaro Baba Hibiki Kaibuchi Ryo Inokuchi http://arxiv.org/abs/2510.07099v1 Diffusion-Augmented Reinforcement Learning for Robust Portfolio Optimization under Stress Scenarios 2025-10-08T14:56:50Z

In the ever-changing and intricate landscape of financial markets, portfolio optimisation remains a formidable challenge for investors and asset managers. Conventional methods often struggle to capture the complex dynamics of market behaviour and align with diverse investor preferences. To address this, we propose an innovative framework, termed Diffusion-Augmented Reinforcement Learning (DARL), which synergistically integrates Denoising Diffusion Probabilistic Models (DDPMs) with Deep Reinforcement Learning (DRL) for portfolio management. By leveraging DDPMs to generate synthetic market crash scenarios conditioned on varying stress intensities, our approach significantly enhances the robustness of training data. Empirical evaluations demonstrate that DARL outperforms traditional baselines, delivering superior risk-adjusted returns and resilience against unforeseen crises, such as the 2025 Tariff Crisis. This work offers a robust and practical methodology to bolster stress resilience in DRL-driven financial applications.

2025-10-08T14:56:50Z Himanshu Choudhary Arishi Orra Manoj Thakur http://arxiv.org/abs/2508.18427v2 Tracing Positional Bias in Financial Decision-Making: Mechanistic Insights from Qwen2.5 2025-10-07T17:56:00Z

The growing adoption of large language models (LLMs) in finance exposes high-stakes decision-making to subtle, underexamined positional biases. The complexity and opacity of modern model architectures compound this risk. We present the first unified framework and benchmark that not only detects and quantifies positional bias in binary financial decisions but also pinpoints its mechanistic origins within open-source Qwen2.5-instruct models (1.5B-14B). Our empirical analysis covers a novel, finance-authentic dataset revealing that positional bias is pervasive, scale-sensitive, and prone to resurfacing under nuanced prompt designs and investment scenarios, with recency and primacy effects revealing new vulnerabilities in risk-laden contexts. Through transparent mechanistic interpretability, we map how and where bias emerges and propagates within the models to deliver actionable, generalizable insights across prompt types and scales. By bridging domain-specific audit with model interpretability, our work provides a new methodological standard for both rigorous bias diagnosis and practical mitigation, establishing essential guidance for responsible and trustworthy deployment of LLMs in financial systems.

2025-08-25T19:18:50Z Fabrizio Dimino Krati Saxena Bhaskarjit Sarmah Stefano Pasquali 10.1145/3768292.3770394 http://arxiv.org/abs/2510.06095v1 A Microstructure Analysis of Coupling in CFMMs 2025-10-07T16:23:35Z

The programmable and composable nature of smart contract protocols has enabled the emergence of novel market structures and asset classes that are architecturally frictional to implement in traditional financial paradigms. This fluidity has produced an understudied class of market dynamics, particularly in coupled markets where one market serves as an oracle for the other. In such market structures, purchases or liquidations through the intermediate asset create coupled price action between the intermediate and final assets, leading to basket inflation or deflation when denominated in the riskless asset. This paper examines the microstructure of this inflationary dynamic given two constant function market makers (CFMMs) as the intermediate market structures, attempting to quantify their contributions to the former relative to familiar pool metrics such as price drift, trade size, and market depth. Further, a concrete case study is developed, where both markets are constant product markets. The intention is to shed light on the market design process within such coupled environments.
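For orientation, the mechanics of a single constant product market and of a trade routed through two of them can be sketched as follows. This shows only the textbook fee-less invariant `x * y = k`. It is not the paper's coupling model or its inflation metric, and the function names, the absence of fees, and the pool sizes are assumptions chosen for illustration.

```python
def swap_out(reserve_in, reserve_out, amount_in):
    """Output amount from a fee-less constant product pool.

    The trade keeps reserve_in * reserve_out constant:
    (reserve_in + amount_in) * (reserve_out - amount_out) = reserve_in * reserve_out
    """
    return reserve_out * amount_in / (reserve_in + amount_in)

def routed_swap(pool_ab, pool_bc, amount_a):
    """Route a purchase A -> B -> C through two pools (hypothetical setup).

    Each pool is a (reserve_in, reserve_out) pair. Returns the amount of C
    received and the updated reserves of both pools.
    """
    amount_b = swap_out(pool_ab[0], pool_ab[1], amount_a)
    amount_c = swap_out(pool_bc[0], pool_bc[1], amount_b)
    new_ab = (pool_ab[0] + amount_a, pool_ab[1] - amount_b)
    new_bc = (pool_bc[0] + amount_b, pool_bc[1] - amount_c)
    return amount_c, new_ab, new_bc

# Riskless asset A, intermediate asset B, final asset C; equal-sized pools.
amount_c, pool_ab, pool_bc = routed_swap((1000.0, 1000.0), (1000.0, 1000.0), 100.0)
# Marginal prices after the trade: B in units of A, and C in units of B.
price_b_in_a = pool_ab[0] / pool_ab[1]
price_c_in_b = pool_bc[0] / pool_bc[1]
```

In this toy setup a single purchase through the intermediate asset raises both marginal prices at once, which is the kind of coupled price action the abstract refers to.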

2025-10-07T16:23:35Z Althea Sterrett Austin Adams