http://arxiv.org/abs/2606.11798v1 Deterministic Policy Gradient for Learning Equilibrium in Time-Inconsistent Control Problems 2026-06-10T08:31:54Z

In this paper, we develop a continuous-time model-free reinforcement learning algorithm to learn deterministic equilibrium policies in general time-inconsistent control problems. Utilizing the extended Hamilton-Jacobi-Bellman system, we recast the original time-inconsistent problem into an equivalent two-stage problem. In the first stage, for given auxiliary functions, we employ the deterministic policy gradient approach to learn an optimal policy in an auxiliary time-consistent control problem. In the second stage, given the updated policy, we exploit the inner fixed point iterations and some martingale characterizations to learn the auxiliary functions. As a theoretical contribution, we provide some mild model assumptions and establish the convergence of inner fixed point iterations. By repeating this actor-critic style of iterations across two stages, our algorithm aims to learn the equilibrium under different sources of time-inconsistency in a unified manner. The superior effectiveness of the proposed algorithm is illustrated in two classical financial applications with time-inconsistency: mean-variance portfolio management and optimal tracking portfolio under non-exponential discounting.

2026-06-10T08:31:54Z Keywords: Time-inconsistent control, two-stage reformulation, model-free continuous-time reinforcement learning, deterministic policy gradient, fixed point iteration Xin Guo Yijie Huang Xiang Yu http://arxiv.org/abs/2606.10658v1 Post-Quantum Secure Federated DeFi for Inclusive Banking 2026-06-09T10:06:55Z

Recent advances in error-corrected qubits have accelerated the timeline for practical quantum computing. This poses a threat to cryptographic primitives used to secure financial systems, government infrastructure, communication networks, and DeFi (Decentralized Finance) ecosystems. This paper introduces a post-quantum secure federated DeFi framework that enables inter-bank collaboration to improve the inclusivity of individuals underserved by local lenders due to limited financial histories. Multiple banks contribute encrypted information batches to a virtual server, where lattice-based Fully Homomorphic Encryption (FHE) enables end-to-end homomorphic computation. The server fuses local data-driven probabilistic assessments, expert beliefs, and verifiable evidence generated by the NASA-IBM Prithvi Geospatial Foundation Model (GFM), in encrypted format. Decentralized technologies are employed to ensure tamper-proof evidence and auditable accountability for all encrypted data exchanges between institutions and the server. The framework is tested on agricultural lending decisions for rural borrowers in Virginia.

2026-06-09T10:06:55Z Swati Sachan Dale Fickett Richard Buchinger Theo Miller 10.1109/cai68641.2026.11536585 http://arxiv.org/abs/2606.10245v1 A Fast Implied Volatility Method with Expansions 2026-06-08T23:24:15Z

We present a regime-split Black--Scholes implied volatility solver in which every initial seed is a fully closed-form analytical expression, derived from the asymptotic structure of the Black--Scholes price in its natural domain. At the money, series reversion of an exact Gaussian identity yields a fourth-order seed with error $\mathcal{O}(s^8)$. In the moderate out-of-the-money region, successive Gaussian CDF approximations of increasing order produce explicit initial seed formulas whose accuracy is proved numerically, with no iteration or numerical inversion at the seed stage. In the deep out-of-the-money region, a Gaussian tail cancellation identity -- the Mills ratio -- reveals the asymptotic structure of the Black--Scholes price and motivates a ratio-corrected seed that achieves near-machine-precision initialisation for large moneyness. All regime boundaries are derived analytically from CDF truncation tolerances and numerical solver theoretical error bounds, with no empirically tuned constants. A universal fourth-order Householder polisher then drives all regimes to machine precision, with mean update iterations strictly below two on both standard and granular benchmark grids -- meeting and surpassing the two-iteration target established by the highest-accuracy reference implementation in the literature (Jäckel, 2015). The resulting C implementation achieves a $1.73$--$1.85\times$ throughput gain over the state-of-the-art benchmark (Jäckel, 2015) under identical hardware and compiler conditions, with maximum absolute error $\mathcal{O}(10^{-14})$, stable across grid configurations. A Python/Numba implementation confirms portability. All source code is publicly available.
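The seed-then-polish structure described above can be illustrated with a deliberately simplified sketch. It assumes a European call with no dividends, a constant seed, and a plain Newton iteration; it is not the paper's closed-form regime seeds or its fourth-order Householder polisher. All function names are hypothetical.

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bs_call(S, K, T, r, sigma):
    # Black-Scholes price of a European call (no dividends).
    sq = sigma * math.sqrt(T)
    d1 = (math.log(S / K) + (r + 0.5 * sigma * sigma) * T) / sq
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d1 - sq)

def bs_vega(S, K, T, r, sigma):
    # Derivative of the call price with respect to sigma.
    sq = sigma * math.sqrt(T)
    d1 = (math.log(S / K) + (r + 0.5 * sigma * sigma) * T) / sq
    return S * norm_pdf(d1) * math.sqrt(T)

def implied_vol(price, S, K, T, r, seed=0.2, tol=1e-12, max_iter=50):
    # Newton polish from a constant seed (an assumption of this sketch).
    sigma = seed
    for _ in range(max_iter):
        diff = bs_call(S, K, T, r, sigma) - price
        if abs(diff) < tol:
            break
        vega = bs_vega(S, K, T, r, sigma)
        if vega < 1e-14:
            break  # ill-conditioned: stop rather than divide by ~0
        sigma = max(sigma - diff / vega, 1e-8)  # keep sigma positive
    return sigma

target = bs_call(100.0, 100.0, 1.0, 0.01, 0.25)
print(implied_vol(target, 100.0, 100.0, 1.0, 0.01))
```

A better seed reduces the number of polish steps, which is the quantity the abstract reports as staying below two on average.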

2026-06-08T23:24:15Z Alper Hekimoglu Ismail Hakki Gokgoz http://arxiv.org/abs/2606.09478v1 Volatility Forecasting and Return Prediction under Market Regimes: Evidence from High-Frequency Chinese Equity Data 2026-06-08T13:36:57Z

This study investigates whether regime-dependent volatility forecasting and machine-learning-based return prediction can be jointly integrated to improve both statistical forecasting performance and economic strategy outcomes in equity markets. Using high-frequency CSI 300 Index data from 2005 to 2023, a sequential two-stage framework is developed. In the first stage, realized volatility is modeled using regime-augmented HARQ specifications combined with Markov-switching GJR-GARCH filtering to capture long-memory dynamics, asymmetry, and structural market regimes. In the second stage, volatility forecasts, regime indicators, and return-related predictors are incorporated into an XGBoost return-prediction model estimated through a strictly walk-forward out-of-sample procedure. The empirical results demonstrate that regime-aware volatility forecasting consistently outperforms baseline HARQ models across forecast evaluation metrics and is generally supported by formal forecast comparison tests. In contrast, return predictability remains weak, state-dependent, and concentrated primarily in low-volatility regimes. Although naive predictive trading strategies generally fail after accounting for realistic transaction costs, carefully designed implementations incorporating volatility scaling, low-volatility gating, threshold calibration, and turnover controls can improve defensive economic performance. The findings suggest that the practical value of predictive systems in financial markets may depend less on generating strong unconditional return forecasts and more on transforming weak state-dependent signals into economically robust portfolio allocation rules. Overall, the study contributes by integrating econometric volatility modeling, regime classification, machine-learning return prediction, and implementation realism within a unified framework.

2026-06-08T13:36:57Z 41 pages, 16 figures, 21 tables Xinyue Fang Robert Ślepaczuk http://arxiv.org/abs/2604.27210v2 Fast-Vollib: A Fast Implied Volatility Library for Python with PyTorch, JAX, and CUDA Fused-Kernel Backends 2026-06-08T10:21:56Z

We present fast-vollib, an open-source Python library that provides high-performance European option pricing, implied volatility (IV) computation, and Greeks under the Black-76, Black-Scholes, and Black-Scholes-Merton models. The library is designed as a drop-in alternative to the de-facto-standard py_vollib and py_vollib_vectorized packages, with pluggable PyTorch and JAX execution backends, a CUDA fused-kernel Triton contribution for batched IV workloads, and a compatibility-first public API. In addition to a vectorized Halley-method IV solver, fast-vollib ships an experimental, fully-vectorized implementation of Jäckel's "Let's Be Rational" (LBR) algorithm with NumPy/Numba, torch.compile, JAX, and Triton single-pass GPU kernels for batched option chains. This note announces the library and describes its public API surface, with source, documentation, and packaging artifacts available at: GitHub (https://github.com/raeidsaqur/fast-vollib), Docs (https://raeidsaqur.github.io/fast-vollib/), PyPI (https://pypi.org/project/fast-vollib/).

2026-04-29T21:29:32Z 5 pages, 1 figure, 1 table. Software announcement / reference note. Code: https://github.com/raeidsaqur/fast-vollib. Install: pip install fast-vollib Raeid Saqur http://arxiv.org/abs/2603.07600v4 Differential Machine Learning for 0DTE Options with Stochastic Volatility and Jumps 2026-06-08T05:48:13Z

We present a differential machine learning method for zero-days-to-expiry (0DTE) options under a stochastic-volatility jump-diffusion model. To handle the ultra-short-maturity regime, we express the option price in Black-Scholes form with a maturity-gated variance correction, combining supervision on prices and Greeks with a PIDE-residual penalty. Prices and Greeks are derived from a single trained pricing network, while jump-term identifiability is ensured by a jump-operator network fitted jointly in a three-stage procedure. The method improves jump-term approximation relative to one-stage baselines while maintaining comparable pricing errors. Furthermore, it reduces errors in Greeks, produces stable one-day delta hedges, and offers significant speedups over Fourier-based benchmarks. Calibration experiments demonstrate the network's efficiency as a pricer, and incorporating jump-intensity price sensitivity into the learning process further improves the overall model fit. We also consider a jump rough Heston model.

2026-03-08T12:10:24Z Takayuki Sakuma http://arxiv.org/abs/2606.08379v1 TT-DAC-PS: Twin-Target Deterministic Actor-Critic with Policy Smoothing for Optimal Trade Execution 2026-06-07T00:20:29Z

This study addresses the optimal execution of large stock sell programs by introducing TT-DAC-PS (Twin-Target Deterministic Actor-Critic with Policy Smoothing), a deterministic actor-critic architecture that combines twin exponential-moving-average critic targets with pessimistic min backup, TD3-style target policy smoothing noise, delayed actor updates, and conservative Q regularisation to curb overestimation. Exploration uses Ornstein-Uhlenbeck (OU) noise with a hybrid schedule: deterministic episode-wise decay, variance-guided adjustment based on recent reward dispersion, and a Soft Actor-Critic (SAC)-style temperature that is learned and mapped to the noise scale. The environment integrates Almgren-Chriss (AC) trade impact with Limit Order Book (LOB) prices and volumes, normalised state features, per-step volume participation caps, and a utility-based reward. The trade execution algorithm is applied to LOB data for ten U.S. stocks. Performance is assessed against reinforcement-learning baseline algorithms, including Proximal Policy Optimisation (PPO), Soft Actor-Critic (SAC), and Advantage Actor-Critic (A2C), as well as alternative trade execution algorithms, including Time-Weighted Average Price (TWAP), Volume-Weighted Average Price (VWAP), and AC. The proposed model consistently reduces mean implementation shortfall percentage with competitive variance, outperforming classical baselines and standard reinforcement-learning benchmark models.
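As a rough illustration of three of the ingredients named above (pessimistic min backup over twin targets, target policy smoothing noise, and exponential-moving-average target updates), the following sketch computes a single bootstrapped target for a scalar action. The function names, default hyperparameters, and the toy actor and critics are assumptions for illustration, not the paper's implementation.

```python
import random

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def smoothed_min_target(reward, next_state, done,
                        actor_target, critic1_target, critic2_target,
                        rng, gamma=0.99, noise_std=0.2, noise_clip=0.5,
                        action_low=0.0, action_high=1.0):
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(actor_target(next_state) + noise,
                       action_low, action_high)
    # Pessimistic backup: take the smaller of the two target critics.
    q_min = min(critic1_target(next_state, next_action),
                critic2_target(next_state, next_action))
    # No bootstrapping past a terminal step (done == 1.0).
    return reward + gamma * (1.0 - done) * q_min

def ema_update(target_params, online_params, tau=0.005):
    # Exponential-moving-average update of target parameters.
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]

# Toy usage with hypothetical constant critics.
rng = random.Random(0)
actor = lambda s: 0.5
q1 = lambda s, a: 1.0
q2 = lambda s, a: 2.0
print(smoothed_min_target(0.5, None, 0.0, actor, q1, q2, rng))
```

Taking the minimum of the two critics biases the target downward, which is the usual motivation for this construction as a guard against overestimation.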

2026-06-07T00:20:29Z 21 pages, 1 figure, 3 tables Ilia Zaznov Atta Badii Julian Kunkel Alfonso Dufour http://arxiv.org/abs/2606.08285v1 Beyond Agent Architecture: Execution Assumptions and Reproducibility in LLM-Based Trading Systems 2026-06-06T18:14:29Z

Large language models (LLMs) and agentic systems are increasingly proposed for financial trading, yet their reported performance remains difficult to compare because studies vary in data provenance, temporal split discipline, execution timing, turnover treatment, and transaction-cost modeling. This article presents a targeted topical review and reproducibility audit of execution realism in LLM-based trading research. A coded evidence matrix covering 30 trade-relevant primary studies is used to assess point-in-time controls, split transparency, held-out evaluation, cost and turnover treatment, execution semantics, universe definition, and artifact release. Across the audited sample, architecture reporting is generally clearer than the evaluation assumptions needed to judge whether a trading result is economically interpretable or reproducible. A 10-equity worked example is included only as a methodological scaffold to illustrate how explicit friction and timing choices can materially compress active-strategy results. The main conclusion is that the next useful step for LLM trading research is not only better agent design, but also clearer reporting standards for execution realism, reproducibility, and evaluation comparability.

2026-06-06T18:14:29Z Junyi Yao Zihao Zheng http://arxiv.org/abs/2606.08232v1 Hour-Aware Adaptive Risk Management for Autonomous Memecoin Trading: A Multi-Layer Intelligence Framework 2026-06-06T15:40:05Z

This paper measures hour-of-day effects, filter precision, fragility, and realised yield in a 15-day paper-traded deployment of an autonomous memecoin trading system on Solana decentralised exchanges. The 190-trade sample (March 29 to April 12, 2026) shows a 40.5 percent win rate, mean per-trade return of +0.62 percent, cumulative +117.7 percent (net SOL +0.039), skewness -1.21, excess kurtosis 6.61. A Mann-Whitney U test of the three poorest-performing UTC hours (2, 13, 23) against the others yields U = 1,274, p = 0.22; directional but not significant at n = 190. The three hours were selected in-sample, so the comparison is exploratory, not confirmatory. A parallel counterfactual rejection-tracking system collected 4,874 forward-sample observations across 184 distinct rejection events. Of those events, 17.9 percent reached a 50 percent drawdown from reference within 24 hours; 26.0 percent of forward samples recorded the rejected token below half-reference. The filter stack avoided these realised drawdowns: evidence that the rejection criteria are net-positive against forward-market outcomes. Fragility is the principal caveat. Removing the top three trades (1.6 percent of sample) turns the cumulative return into a loss. Profitability rests on a small number of large winners and is structurally fragile. The dataset and audit script are deposited under CC-BY-4.0 (Zenodo DOI 10.5281/zenodo.20043302).
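The fragility check described above (drop the largest winners and see whether the total stays positive) can be sketched as follows. The sketch assumes the cumulative figure is the simple sum of per-trade percentage returns, which is consistent with 190 trades at a mean of +0.62 percent giving roughly +117.8 percent. The return values below are hypothetical and are not the paper's data.

```python
def cumulative_return(returns):
    # Simple (non-compounded) sum of per-trade percentage returns.
    return sum(returns)

def cumulative_without_top(returns, k):
    # Remove the k largest trades, then sum the remainder.
    kept = sorted(returns)[:-k] if k > 0 else list(returns)
    return sum(kept)

# Hypothetical per-trade returns in percent: a few large winners, many small losses.
trades = [40.0, 30.0, 25.0, -10.0, -20.0, -30.0, -15.0]

print(cumulative_return(trades))          # total over all trades
print(cumulative_without_top(trades, 3))  # total after dropping the top three
```

If the sign of the second figure differs from the first, profitability depends on a handful of outliers, which is the caveat the abstract raises.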

2026-06-06T15:40:05Z 15 pages, 4 figures. Companion paper to RED-2400 (arXiv:2605.12151) and PRFS methodology (arXiv submit/7684836). SSRN abstract ID 6564803. Zenodo concept DOI 10.5281/zenodo.20043302 Arati Uday Kamat 10.5281/zenodo.20043302 http://arxiv.org/abs/2606.08228v1 Post-Rejection Follow-up Sampling: A Methodology for Counterfactual Outcome Measurement in Algorithmic DEX Trading 2026-06-06T15:32:38Z

Algorithmic trading systems on decentralised exchanges (DEXs) reject most candidate tokens they evaluate. The counterfactual outcome of rejected candidates (what would have happened had the system entered) is rarely measured. This paper introduces Post-Rejection Follow-up Sampling (PRFS). A separate tracking subsystem samples each rejected token's price and liquidity at a configurable cadence, over a horizon of up to twenty-four hours. PRFS produces the data needed to evaluate filter precision against actual market outcomes of rejected candidates, not against synthetic backtest reconstructions. The methodology, data architecture, and deposit format are described in Section III. The companion dataset contains 67,000 forward-outcome observation rows across 2,997 rejection events spanning 457 unique mints, collected over a continuous eight-day window (2026-04-10 to 2026-04-19, UTC). Approximately 55 percent of rejection events receive at least one forward observation; coverage at the mint level is complete. The principal binding constraint on downstream classification is per-event horizon density, not event-level coverage. PRFS is dataset-independent. It generalises to any algorithmic decision system in which rejections substantially outnumber executions.

2026-06-06T15:32:38Z 12 pages. Companion methodology paper to RED-2400 (arXiv:2605.12151). Currently under review at Ledger. SSRN abstract ID 6607301. Zenodo concept DOI 10.5281/zenodo.20043516 Arati Uday Kamat 10.5281/zenodo.20043516 http://arxiv.org/abs/2605.01176v3 Decision-Induced Ranking Explains Prediction Inflation and Excessive Turnover in SPO-Based Portfolio Optimization 2026-06-05T08:17:08Z

Decision-focused learning (DFL) is attractive for portfolio optimization because it trains predictors according to downstream decision quality rather than prediction accuracy alone. However, DFL based on the SPO (Smart "Predict, then Optimize") surrogate may produce inflated return signals and unstable portfolio reallocations. This study provides a KKT-based interpretation showing that portfolio decisions can be viewed as ranking over risk- and transaction-cost-adjusted marginal scores. Empirically, we examine prediction inflation and excessive turnover in SPO-trained portfolios, and evaluate clipping, min-max rescaling, and partial portfolio adjustment as practical stabilization mechanisms. The results suggest that realistic output constraints and portfolio-level turnover control improve the implementability of SPO-based portfolio strategies.
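The three stabilization mechanisms named above admit simple generic forms, sketched below. These are textbook-style definitions chosen for illustration; the exact variants, bounds, and adjustment rate used in the paper are not given in the abstract, so every name and parameter here is an assumption.

```python
def clip_predictions(preds, lo, hi):
    # Clipping: cap each predicted return to the interval [lo, hi].
    return [max(lo, min(hi, p)) for p in preds]

def minmax_rescale(preds, lo, hi):
    # Min-max rescaling: map predictions linearly onto [lo, hi],
    # preserving their ranking.
    pmin, pmax = min(preds), max(preds)
    if pmax == pmin:
        return [0.5 * (lo + hi)] * len(preds)
    scale = (hi - lo) / (pmax - pmin)
    return [lo + (p - pmin) * scale for p in preds]

def partial_adjust(w_prev, w_target, eta):
    # Partial adjustment: move only a fraction eta toward the new target.
    return [(1.0 - eta) * a + eta * b for a, b in zip(w_prev, w_target)]

def turnover(w_prev, w_new):
    # L1 distance between consecutive weight vectors.
    return sum(abs(a - b) for a, b in zip(w_prev, w_new))

w_prev, w_target = [0.5, 0.5], [1.0, 0.0]
w_new = partial_adjust(w_prev, w_target, eta=0.5)
print(w_new, turnover(w_prev, w_new), turnover(w_prev, w_target))
```

Because partial adjustment is a convex combination, it preserves the budget constraint when both weight vectors sum to one, and it scales turnover by the factor `eta`.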

2026-05-02T00:48:16Z Yi Wang Takashi Hasuike http://arxiv.org/abs/2410.23587v5 Moments by Integrating the Moment-Generating Function 2026-06-04T17:29:28Z

We introduce a general integral framework for computing fractional, complex, absolute, and logarithmic moments from the moment-generating function (MGF) under explicit regularity conditions. By evaluating a complex extension of the MGF along a vertical contour, we obtain exact integral expressions that bypass the need for explicit probability densities and high-order derivatives. We establish conditions for negative fractional moments using the symmetric Cauchy principal value, including the requirement that the distribution have no point mass at the centering point. We demonstrate the theoretical scope and computational practicality of the framework through applications to the normal-inverse Gaussian distribution and a semicontinuous compound Poisson-Gamma distribution. In the latter case, the framework handles point masses at the boundary by evaluating conditional fractional moments.

2024-10-31T02:58:56Z Peter Reinhard Hansen Chen Tong http://arxiv.org/abs/2606.17065v1 PIVOT: Bridging Black-Scholes Implied-Volatility and Price Objectives via Differentiable Jäckel Operator 2026-06-04T14:43:38Z

Modern option-learning systems operate in two coordinates: price space, where markets quote and no-arbitrage constraints are most naturally enforced, and implied volatility (IV) space, where volatility surfaces are smoothed, regularized, and evaluated. The bottleneck is interface, not approximation: Jäckel's seminal "Let's Be Rational" (LBR) solver already inverts the Black-Scholes price to machine precision efficiently. What is missing is a differentiable layer that preserves LBR in the forward pass and avoids backpropagating through its branch logic. Such a layer must also confront the unavoidable singularity of the inverse map in the low-vega regime, where the sensitivity 1/vega diverges as vega -> 0. We close this gap with PIVOT, the Price-Implied-Volatility Objective Translator. PIVOT keeps the LBR forward pass intact and supplies the backward pass by implicit differentiation through the smooth Black-Scholes/Black-76 price map, with an explicit gating contract: invalid domains return NaN, well-conditioned rows receive the exact 1/vega gradient, and low-vega rows are attenuated rather than silently regularized. On a single H100, a fused Triton kernel reaches 1.79e9 IV/s at machine precision (9.3e-14 max relative error vs. the reference C solver); end-to-end label generation sustains 48.9M/s on synthetic chains and 16.6M/s on SPX OptionMetrics. In a HyperIV-style one-day reproduction on SPX, PIVOT-augmented objectives Pareto-dominate the baselines, reducing held-out price MAE by up to 43.4% and the strongest three-seed gated objective improving price MAE by 38.8% and IV MAE by 21.3% jointly; cross-asset results on RUT, VIX, and NDX show directional price-MAE gains of 40.1%, 24.2%, and 16.7%, while an ungated IV-roundtrip control collapses to a degenerate near-zero surface, confirming the gate as a correctness contract rather than a tuning knob.
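The exact 1/vega gradient mentioned above follows from the implicit function theorem. The derivation below is a standard sketch for the Black-Scholes call without dividends; the notation ($C$, $\sigma^\ast$, $\theta$, $\mathcal{V}$) is introduced here for illustration and is not taken from the paper.

```latex
% Price map C(\sigma;\theta), with \theta = (S, K, T, r) held fixed.
% The implied volatility \sigma^\ast(P,\theta) is defined implicitly by
\begin{align}
  C\bigl(\sigma^\ast(P,\theta);\theta\bigr) &= P .
\end{align}
% Differentiating both sides with respect to P gives
\begin{align}
  \frac{\partial C}{\partial \sigma}\,\frac{\partial \sigma^\ast}{\partial P} = 1
  \quad\Longrightarrow\quad
  \frac{\partial \sigma^\ast}{\partial P} = \frac{1}{\mathcal{V}},
  \qquad
  \mathcal{V} := \frac{\partial C}{\partial \sigma}
             = S\,\varphi(d_1)\,\sqrt{T},
\end{align}
% with d_1 = \frac{\ln(S/K) + (r + \sigma^2/2)\,T}{\sigma\sqrt{T}}
% and \varphi the standard normal density.
% Differentiating with respect to any other input \theta_i gives
\begin{align}
  \frac{\partial \sigma^\ast}{\partial \theta_i}
  = -\,\frac{1}{\mathcal{V}}\,\frac{\partial C}{\partial \theta_i} .
\end{align}
```

Both gradients carry the factor $1/\mathcal{V}$, so they diverge as $\mathcal{V} \to 0$. This is the low-vega singularity that the abstract's gating contract addresses, and it arises without differentiating through the solver's branch logic.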

2026-06-04T14:43:38Z 30 pages, 17 figures, 12 tables Raeid Saqur Yannick Limmer Anastasis Kratsios Blanka Horvath Hans Buehler http://arxiv.org/abs/2606.05733v1 Zero-Copy Semantic Contagion: An In-Memory Streaming Architecture for Evolving Attention Graphs 2026-06-04T05:48:56Z

Per-ticker forecasting models dominate financial time-series work yet remain blind to cross-company propagation: a foundry disruption in Taiwan does not register in a single-asset model until Apple's own price has already moved. To address this limitation, we introduce a heterogeneous Rust-Python streaming architecture that maps cross-company attention as a continuous-time graph driven directly from text. We show that on the ingestion side, a zero-copy Rust edge parses news records in $\sim$100 ns and scans the target equity universe in $\sim$1.2 $\mu$s. On the inference end, a multivariate Neural Hawkes Process featuring per-node continuous-time LSTM states and a bilinear latent projection propagates directed excitation, while an adaptive pruning rule bounds the computational cost of dynamic neighborhood updates. Combining these stages, we demonstrate an end-to-end processing latency of $\sim$13 ms per incoming news record on a single commodity CPU. Evaluated on a one-month temporal holdout of the FNSPID corpus (638 articles across 47 tickers), the system delivers a $1.70\times$ precision lift over random at the 90th-percentile next-day return threshold, and $3.36\times$ over a same-sector baseline. Crucially, removing the graph topology collapses precision to zero, confirming that the dynamic attention network is the sole driver of cross-company signal in this architecture.

2026-06-04T05:48:56Z Accepted to the 2026 ACM SIGMOD Workshop on Data Management for the Modern Financial Systems (FinDS). 10 pages, 4 figures Kabir Murjani http://arxiv.org/abs/2509.19663v2 Long-Range Dependence in Financial Markets: Empirical Evidence and Generative Modeling Challenges 2026-06-04T01:13:32Z

This study provides a comprehensive empirical investigation of long-range dependence (LRD) in financial markets and evaluates the ability of deep generative models to reproduce such temporal structures. Using daily data from three representative sectors--equity (S&P 500, DAX, Nikkei 225), commodities (Wheat, Corn, Soybeans), and energy (UNG, USO, XLE)--we examine the presence of LRD through three complementary approaches: rescaled range (R/S) analysis, detrended fluctuation analysis (DFA), and an ARFIMA--FIGARCH model with Student's $t$-distributed innovations. The empirical evidence suggests that while mean returns exhibit limited persistence, pronounced long memory is consistently observed in conditional volatility across most assets. Building on these findings, we assess whether Quant Generative Adversarial Networks (Quant GANs) can learn and reproduce these stylized temporal dependencies. Although the generated series successfully mimic heavy-tailed return distributions and certain aspects of volatility clustering, they generally fail to capture the magnitude and consistency of LRD observed in real data, particularly in volatility dynamics. These results highlight an important limitation of current deep generative architectures in modeling slow-decaying dependence structures and underscore the need for incorporating explicit long-memory mechanisms when synthetic financial data are intended for risk management or long-horizon forecasting applications.
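Of the three approaches listed above, rescaled range (R/S) analysis is the simplest to sketch. The version below is a minimal classical estimator, assuming non-overlapping windows, the population standard deviation, and an ordinary least-squares fit of log(R/S) on log(window size). It applies no small-sample bias correction, and the function names and window sizes are illustrative rather than the paper's settings.

```python
import math
import random

def rescaled_range(chunk):
    # R: range of cumulative deviations from the chunk mean.
    # S: population standard deviation of the chunk.
    n = len(chunk)
    mean = sum(chunk) / n
    cum, cum_min, cum_max = 0.0, float("inf"), float("-inf")
    for x in chunk:
        cum += x - mean
        cum_min = min(cum_min, cum)
        cum_max = max(cum_max, cum)
    std = math.sqrt(sum((x - mean) ** 2 for x in chunk) / n)
    return (cum_max - cum_min) / std if std > 0 else float("nan")

def hurst_rs(series, window_sizes):
    # Slope of log(mean R/S) against log(window size) estimates the
    # Hurst exponent.
    xs, ys = [], []
    for n in window_sizes:
        chunks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        values = [rescaled_range(c) for c in chunks]
        values = [v for v in values if not math.isnan(v)]
        if values:
            xs.append(math.log(n))
            ys.append(math.log(sum(values) / len(values)))
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

# Synthetic white noise as a no-memory reference series.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2048)]
print(hurst_rs(noise, [16, 32, 64, 128, 256, 512]))
```

For an independent series the estimate should sit near 0.5, typically somewhat above it at these window sizes because of the estimator's known small-sample bias. Values well above 0.5 on absolute or squared returns would point to the kind of long memory in volatility that the abstract reports.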

2025-09-24T00:41:14Z 28 pages, 8 figures, 7 tables Yifan He Svetlozar Rachev