https://arxiv.org/api/DkmmJN9SCEUH6D7D/9IU3qxdEkI
2026-06-17T07:57:59Z

http://arxiv.org/abs/2606.11798v1
Deterministic Policy Gradient for Learning Equilibrium in Time-Inconsistent Control Problems
2026-06-10T08:31:54Z
In this paper, we develop a continuous-time model-free reinforcement learning algorithm to learn deterministic equilibrium policies in general time-inconsistent control problems. Utilizing the extended Hamilton-Jacobi-Bellman system, we recast the original time-inconsistent problem into an equivalent two-stage problem. In the first stage, for given auxiliary functions, we employ the deterministic policy gradient approach to learn an optimal policy in an auxiliary time-consistent control problem. In the second stage, given the updated policy, we exploit the inner fixed point iterations and some martingale characterizations to learn the auxiliary functions. As a theoretical contribution, we provide some mild model assumptions and establish the convergence of inner fixed point iterations. By repeating this actor-critic style of iterations across two stages, our algorithm aims to learn the equilibrium under different sources of time-inconsistency in a unified manner. The superior effectiveness of the proposed algorithm is illustrated in two classical financial applications with time-inconsistency: mean-variance portfolio management and optimal tracking portfolio under non-exponential discounting.
2026-06-10T08:31:54Z
Keywords: Time-inconsistent control, two-stage reformulation, model-free continuous-time reinforcement learning, deterministic policy gradient, fixed point iteration
Xin Guo, Yijie Huang, Xiang Yu

http://arxiv.org/abs/2606.10658v1
Post-Quantum Secure Federated DeFi for Inclusive Banking
2026-06-09T10:06:55Z
Recent advances in error-corrected qubits have accelerated the timeline for practical quantum computing.
It poses a threat to cryptographic primitives used to secure financial systems, government infrastructure, communication networks, and DeFi (Decentralized Finance) ecosystems. This paper introduces a post-quantum secure federated DeFi framework that enables inter-bank collaboration to improve the inclusivity of individuals underserved by local lenders due to limited financial histories. Multiple banks contribute encrypted information batches to a virtual server, where lattice-based Fully Homomorphic Encryption (FHE) enables end-to-end homomorphic computation. The server fuses local data-driven probabilistic assessments, expert beliefs, and verifiable evidence generated by the NASA-IBM Prithvi Geospatial Foundation Model (GFM), in encrypted format. Decentralized technologies are employed to ensure tamper-proof evidence and auditable accountability for all encrypted data exchanges between institutions and the server. The framework is tested on agricultural lending decisions for rural borrowers in Virginia.
2026-06-09T10:06:55Z
Swati Sachan, Dale Fickett, Richard Buchinger, Theo Miller
10.1109/cai68641.2026.11536585

http://arxiv.org/abs/2606.10245v1
A Fast Implied Volatility Method with Expansions
2026-06-08T23:24:15Z
We present a regime-split Black--Scholes implied volatility solver in which every initial seed is a fully closed-form analytical expression, derived from the asymptotic structure of the Black--Scholes price in its natural domain. At the money, series reversion of an exact Gaussian identity yields a fourth-order seed with error $\mathcal{O}(s^8)$. In the moderate out-of-the-money region, successive Gaussian CDF approximations of increasing order produce explicit initial seed formulas whose accuracy is proved numerically, with no iteration or numerical inversion at the seed stage.
In the deep out-of-the-money region, a Gaussian tail cancellation identity -- the Mills ratio -- reveals the asymptotic structure of the Black--Scholes price and motivates a ratio-corrected seed that achieves near-machine-precision initialisation for large moneyness. All regime boundaries are derived analytically from CDF truncation tolerances and numerical solver theoretical error bounds, with no empirically tuned constants. A universal fourth-order Householder polisher then drives all regimes to machine precision, with mean update iterations strictly below two on both standard and granular benchmark grids -- meeting and surpassing the two-iteration target established by the highest-accuracy reference implementation in the literature (Jäckel, 2015). The resulting C implementation achieves a $1.73$--$1.85\times$ throughput gain over the state-of-the-art benchmark (Jäckel, 2015) under identical hardware and compiler conditions, with maximum absolute error $\mathcal{O}(10^{-14})$, stable across grid configurations. A Python/Numba implementation confirms portability. All source code is publicly available.
2026-06-08T23:24:15Z
Alper Hekimoglu, Ismail Hakki Gokgoz

http://arxiv.org/abs/2606.09478v1
Volatility Forecasting and Return Prediction under Market Regimes: Evidence from High-Frequency Chinese Equity Data
2026-06-08T13:36:57Z
This study investigates whether regime-dependent volatility forecasting and machine-learning-based return prediction can be jointly integrated to improve both statistical forecasting performance and economic strategy outcomes in equity markets. Using high-frequency CSI 300 Index data from 2005 to 2023, a sequential two-stage framework is developed. In the first stage, realized volatility is modeled using regime-augmented HARQ specifications combined with Markov-switching GJR-GARCH filtering to capture long-memory dynamics, asymmetry, and structural market regimes.
In the second stage, volatility forecasts, regime indicators, and return-related predictors are incorporated into an XGBoost return-prediction model estimated through a strictly walk-forward out-of-sample procedure. The empirical results demonstrate that regime-aware volatility forecasting consistently outperforms baseline HARQ models across forecast evaluation metrics and is generally supported by formal forecast comparison tests. In contrast, return predictability remains weak, state-dependent, and concentrated primarily in low-volatility regimes. Although naive predictive trading strategies generally fail after accounting for realistic transaction costs, carefully designed implementations incorporating volatility scaling, low-volatility gating, threshold calibration, and turnover controls can improve defensive economic performance. The findings suggest that the practical value of predictive systems in financial markets may depend less on generating strong unconditional return forecasts and more on transforming weak state-dependent signals into economically robust portfolio allocation rules. Overall, the study contributes by integrating econometric volatility modeling, regime classification, machine-learning return prediction, and implementation realism within a unified framework.
2026-06-08T13:36:57Z
41 pages, 16 figures, 21 tables
Xinyue Fang, Robert Ślepaczuk

http://arxiv.org/abs/2604.27210v2
Fast-Vollib: A Fast Implied Volatility Library for Python with PyTorch, JAX, and CUDA Fused-Kernel Backends
2026-06-08T10:21:56Z
We present fast-vollib, an open-source Python library that provides high-performance European option pricing, implied volatility (IV) computation, and Greeks under the Black-76, Black-Scholes, and Black-Scholes-Merton models.
The library is designed as a drop-in alternative to the de-facto-standard py_vollib and py_vollib_vectorized packages, with pluggable PyTorch and JAX execution backends, a CUDA fused-kernel Triton contribution for batched IV workloads, and a compatibility-first public API. In addition to a vectorized Halley-method IV solver, fast-vollib ships an experimental, fully-vectorized implementation of Jäckel's "Let's Be Rational" (LBR) algorithm with NumPy/Numba, torch.compile, JAX, and Triton single-pass GPU kernels for batched option chains. This note announces the library and describes its public API surface, with source, documentation, and packaging artifacts available at: GitHub (https://github.com/raeidsaqur/fast-vollib), Docs (https://raeidsaqur.github.io/fast-vollib/), PyPI (https://pypi.org/project/fast-vollib/).
2026-04-29T21:29:32Z
5 pages, 1 figure, 1 table. Software announcement / reference note. Code: https://github.com/raeidsaqur/fast-vollib. Install: pip install fast-vollib
Raeid Saqur

http://arxiv.org/abs/2603.07600v4
Differential Machine Learning for 0DTE Options with Stochastic Volatility and Jumps
2026-06-08T05:48:13Z
We present a differential machine learning method for zero-days-to-expiry (0DTE) options under a stochastic-volatility jump-diffusion model. To handle the ultra-short-maturity regime, we express the option price in Black-Scholes form with a maturity-gated variance correction, combining supervision on prices and Greeks with a PIDE-residual penalty. Prices and Greeks are derived from a single trained pricing network, while jump-term identifiability is ensured by a jump-operator network fitted jointly in a three-stage procedure. The method improves jump-term approximation relative to one-stage baselines while maintaining comparable pricing errors. Furthermore, it reduces errors in Greeks, produces stable one-day delta hedges, and offers significant speedups over Fourier-based benchmarks.
Calibration experiments demonstrate the network's efficiency as a pricer, and incorporating jump-intensity price sensitivity into the learning process further improves the overall model fit. We also consider a jump rough Heston model.
2026-03-08T12:10:24Z
Takayuki Sakuma

http://arxiv.org/abs/2606.08379v1
TT-DAC-PS: Twin-Target Deterministic Actor-Critic with Policy Smoothing for Optimal Trade Execution
2026-06-07T00:20:29Z
This study addresses the optimal execution of large stock sell programs by introducing TT-DAC-PS (Twin-Target Deterministic Actor-Critic with Policy Smoothing), a deterministic actor-critic architecture that combines twin exponential-moving-average critic targets with pessimistic min backup, TD3-style target policy smoothing noise, delayed actor updates, and conservative Q regularisation to curb overestimation. Exploration uses Ornstein-Uhlenbeck (OU) noise with a hybrid schedule: deterministic episode-wise decay, variance-guided adjustment based on recent reward dispersion, and a Soft Actor-Critic (SAC)-style temperature that is learned and mapped to the noise scale. The environment integrates Almgren-Chriss (AC) trade impact with Limit Order Book (LOB) prices and volumes, normalised state features, per-step volume participation caps, and a utility-based reward. The trade execution algorithm is applied to LOB data for ten U.S. stocks. Performance is assessed against reinforcement-learning baseline algorithms, including Proximal Policy Optimisation (PPO), Soft Actor-Critic (SAC), and Advantage Actor-Critic (A2C), as well as alternative trade execution algorithms, including Time-Weighted Average Price (TWAP), Volume-Weighted Average Price (VWAP), and AC.
The proposed model consistently reduces mean implementation shortfall percentage with competitive variance, outperforming classical baselines and standard reinforcement-learning benchmark models.
2026-06-07T00:20:29Z
21 pages, 1 figure, 3 tables
Ilia Zaznov, Atta Badii, Julian Kunkel, Alfonso Dufour

http://arxiv.org/abs/2606.08285v1
Beyond Agent Architecture: Execution Assumptions and Reproducibility in LLM-Based Trading Systems
2026-06-06T18:14:29Z
Large language models (LLMs) and agentic systems are increasingly proposed for financial trading, yet their reported performance remains difficult to compare because studies vary in data provenance, temporal split discipline, execution timing, turnover treatment, and transaction-cost modeling. This article presents a targeted topical review and reproducibility audit of execution realism in LLM-based trading research. A coded evidence matrix covering 30 trade-relevant primary studies is used to assess point-in-time controls, split transparency, held-out evaluation, cost and turnover treatment, execution semantics, universe definition, and artifact release. Across the audited sample, architecture reporting is generally clearer than the evaluation assumptions needed to judge whether a trading result is economically interpretable or reproducible. A 10-equity worked example is included only as a methodological scaffold to illustrate how explicit friction and timing choices can materially compress active-strategy results.
The main conclusion is that the next useful step for LLM trading research is not only better agent design, but also clearer reporting standards for execution realism, reproducibility, and evaluation comparability.
2026-06-06T18:14:29Z
Junyi Yao, Zihao Zheng

http://arxiv.org/abs/2606.08232v1
Hour-Aware Adaptive Risk Management for Autonomous Memecoin Trading: A Multi-Layer Intelligence Framework
2026-06-06T15:40:05Z
This paper measures hour-of-day effects, filter precision, fragility, and realised yield in a 15-day paper-traded deployment of an autonomous memecoin trading system on Solana decentralised exchanges. The 190-trade sample (March 29 to April 12, 2026) shows a 40.5 percent win rate, mean per-trade return of +0.62 percent, cumulative +117.7 percent (net SOL +0.039), skewness -1.21, excess kurtosis 6.61. A Mann-Whitney U test of the three poorest-performing UTC hours (2, 13, 23) against the others yields U = 1,274, p = 0.22; directional but not significant at n = 190. The three hours were selected in-sample, so the comparison is exploratory, not confirmatory. A parallel counterfactual rejection-tracking system collected 4,874 forward-sample observations across 184 distinct rejection events. Of those events, 17.9 percent reached a 50 percent drawdown from reference within 24 hours; 26.0 percent of forward samples recorded the rejected token below half-reference. The filter stack avoided these realised drawdowns: evidence that the rejection criteria are net-positive against forward-market outcomes. Fragility is the principal caveat. Removing the top three trades (1.6 percent of sample) flips the cumulative return to a loss. Profitability rests on a small number of large winners and is structurally fragile. The dataset and audit script are deposited under CC-BY-4.0 (Zenodo DOI 10.5281/zenodo.20043302).
2026-06-06T15:40:05Z
15 pages, 4 figures. Companion paper to RED-2400 (arXiv:2605.12151) and PRFS methodology (arXiv submit/7684836). SSRN abstract ID 6564803.
Zenodo concept DOI 10.5281/zenodo.20043302
Arati Uday Kamat
10.5281/zenodo.20043302

http://arxiv.org/abs/2606.08228v1
Post-Rejection Follow-up Sampling: A Methodology for Counterfactual Outcome Measurement in Algorithmic DEX Trading
2026-06-06T15:32:38Z
Algorithmic trading systems on decentralised exchanges (DEXs) reject most candidate tokens they evaluate. The counterfactual outcome of rejected candidates (what would have happened had the system entered) is rarely measured. This paper introduces Post-Rejection Follow-up Sampling (PRFS). A separate tracking subsystem samples each rejected token's price and liquidity at a configurable cadence, over a horizon of up to twenty-four hours. PRFS produces the data needed to evaluate filter precision against actual market outcomes of rejected candidates, not against synthetic backtest reconstructions. The methodology, data architecture, and deposit format are described in Section III. The companion dataset contains 67,000 forward-outcome observation rows across 2,997 rejection events spanning 457 unique mints, collected over a continuous eight-day window (2026-04-10 to 2026-04-19, UTC). Approximately 55 percent of rejection events receive at least one forward observation; coverage at the mint level is complete. The principal binding constraint on downstream classification is per-event horizon density, not event-level coverage. PRFS is dataset-independent. It generalises to any algorithmic decision system in which rejections substantially outnumber executions.
2026-06-06T15:32:38Z
12 pages. Companion methodology paper to RED-2400 (arXiv:2605.12151). Currently under review at Ledger. SSRN abstract ID 6607301.
Zenodo concept DOI 10.5281/zenodo.20043516
Arati Uday Kamat
10.5281/zenodo.20043516

http://arxiv.org/abs/2605.01176v3
Decision-Induced Ranking Explains Prediction Inflation and Excessive Turnover in SPO-Based Portfolio Optimization
2026-06-05T08:17:08Z
Decision-focused learning (DFL) is attractive for portfolio optimization because it trains predictors according to downstream decision quality rather than prediction accuracy alone. However, SPO(Smart, Predict then Optimize surrogate)-based DFL may produce inflated return signals and unstable portfolio reallocations. This study provides a KKT-based interpretation showing that portfolio decisions can be viewed as ranking over risk- and transaction-cost-adjusted marginal scores. Empirically, we examine prediction inflation and excessive turnover in SPO-trained portfolios, and evaluate clipping, min-max rescaling, and partial portfolio adjustment as practical stabilization mechanisms. The results suggest that realistic output constraints and portfolio-level turnover control improve the implementability of SPO-based portfolio strategies.
2026-05-02T00:48:16Z
Yi Wang, Takashi Hasuike

http://arxiv.org/abs/2410.23587v5
Moments by Integrating the Moment-Generating Function
2026-06-04T17:29:28Z
We introduce a general integral framework for computing fractional, complex, absolute, and logarithmic moments from the moment-generating function (MGF) under explicit regularity conditions. By evaluating a complex extension of the MGF along a vertical contour, we obtain exact integral expressions that bypass the need for explicit probability densities and high-order derivatives. We establish conditions for negative fractional moments using the symmetric Cauchy principal value, including the requirement that the distribution have no point mass at the centering point.
We demonstrate the theoretical scope and computational practicality of the framework through applications to the normal-inverse Gaussian distribution and a semicontinuous compound Poisson-Gamma distribution. In the latter case, the framework handles point masses at the boundary by evaluating conditional fractional moments.
2024-10-31T02:58:56Z
Peter Reinhard Hansen, Chen Tong

http://arxiv.org/abs/2606.17065v1
PIVOT: Bridging Black-Scholes Implied-Volatility and Price Objectives via Differentiable Jäckel Operator
2026-06-04T14:43:38Z
Modern option-learning systems operate in two coordinates: price space, where markets quote and no-arbitrage constraints are most naturally enforced, and implied volatility (IV) space, where volatility surfaces are smoothed, regularized, and evaluated. The bottleneck is interface, not approximation: Jäckel's seminal "Let's Be Rational" (LBR) solver already inverts the Black-Scholes price to machine precision efficiently. What is missing is a differentiable layer that preserves LBR in the forward pass and avoids backpropagating through its branch logic. Such a layer must also confront the unavoidable singularity of the inverse map in the low-vega regime, where the sensitivity 1/vega diverges as vega -> 0.
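Editorial illustration of the low-vega singularity mentioned above. This is a standard Black-Scholes identity, not text from the abstract; the notation (spot $S$, strike $K$, rate $r$, maturity $T$, a non-dividend-paying European call with price $C(\sigma)$, standard normal density $\varphi$) is assumed here for the sketch. By the inverse function theorem,

$$
\frac{\partial \sigma}{\partial C} = \frac{1}{\partial C / \partial \sigma} = \frac{1}{\mathcal{V}},
\qquad
\mathcal{V} = S\,\varphi(d_1)\sqrt{T},
\qquad
d_1 = \frac{\ln(S/K) + \left(r + \tfrac{1}{2}\sigma^2\right)T}{\sigma\sqrt{T}}.
$$

Because $\varphi(d_1) = e^{-d_1^2/2}/\sqrt{2\pi}$, vega $\mathcal{V}$ decays rapidly for options far in or out of the money (and vanishes as $T \to 0$ away from the money), so the gradient $1/\mathcal{V}$ grows without bound in exactly those regions.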
We close this gap with PIVOT, the Price-Implied-Volatility Objective Translator. PIVOT keeps the LBR forward pass intact and supplies the backward pass by implicit differentiation through the smooth Black-Scholes/Black-76 price map, with an explicit gating contract: invalid domains return NaN, well-conditioned rows receive the exact 1/vega gradient, and low-vega rows are attenuated rather than silently regularized. On a single H100, a fused Triton kernel reaches 1.79e9 IV/s at machine precision (9.3e-14 max relative error vs. the reference C solver); end-to-end label generation sustains 48.9M/s on synthetic chains and 16.6M/s on SPX OptionMetrics. In a HyperIV-style one-day reproduction on SPX, PIVOT-augmented objectives Pareto-dominate the baselines, reducing held-out price MAE by up to 43.4%, with the strongest three-seed gated objective improving price MAE by 38.8% and IV MAE by 21.3% jointly; cross-asset results on RUT, VIX, and NDX show directional price-MAE gains of 40.1%, 24.2%, and 16.7%, while an ungated IV-roundtrip control collapses to a degenerate near-zero surface, confirming the gate as a correctness contract rather than a tuning knob.
2026-06-04T14:43:38Z
30 pages, 17 figures, 12 tables
Raeid Saqur, Yannick Limmer, Anastasis Kratsios, Blanka Horvath, Hans Buehler

http://arxiv.org/abs/2606.05733v1
Zero-Copy Semantic Contagion: An In-Memory Streaming Architecture for Evolving Attention Graphs
2026-06-04T05:48:56Z
Per-ticker forecasting models dominate financial time-series work yet remain blind to cross-company propagation: a foundry disruption in Taiwan does not register in a single-asset model until Apple's own price has already moved. To address this limitation, we introduce a heterogeneous Rust-Python streaming architecture that maps cross-company attention as a continuous-time graph driven directly from text. We show that on the ingestion side, a zero-copy Rust edge parses news records in $\sim$100 ns and scans the target equity universe in $\sim$1.2 $\mu$s.
On the inference end, a multivariate Neural Hawkes Process featuring per-node continuous-time LSTM states and a bilinear latent projection propagates directed excitation, while an adaptive pruning rule bounds the computational cost of dynamic neighborhood updates. Combining these stages, we demonstrate an end-to-end processing latency of $\sim$13 ms per incoming news record on a single commodity CPU. Evaluated on a one-month temporal holdout of the FNSPID corpus (638 articles across 47 tickers), the system delivers a $1.70\times$ precision lift over random at the 90th-percentile next-day return threshold, and $3.36\times$ over a same-sector baseline. Crucially, removing the graph topology collapses precision to zero, confirming that the dynamic attention network is the sole driver of cross-company signal in this architecture.
2026-06-04T05:48:56Z
Accepted to the 2026 ACM SIGMOD Workshop on Data Management for the Modern Financial Systems (FinDS). 10 pages, 4 figures
Kabir Murjani

http://arxiv.org/abs/2509.19663v2
Long-Range Dependence in Financial Markets: Empirical Evidence and Generative Modeling Challenges
2026-06-04T01:13:32Z
This study provides a comprehensive empirical investigation of long-range dependence (LRD) in financial markets and evaluates the ability of deep generative models to reproduce such temporal structures. Using daily data from three representative sectors--equity (S&P 500, DAX, Nikkei 225), commodities (Wheat, Corn, Soybeans), and energy (UNG, USO, XLE)--we examine the presence of LRD through three complementary approaches: rescaled range (R/S) analysis, detrended fluctuation analysis (DFA), and an ARFIMA--FIGARCH model with Student's $t$-distributed innovations. The empirical evidence suggests that while mean returns exhibit limited persistence, pronounced long memory is consistently observed in conditional volatility across most assets.
Building on these findings, we assess whether Quant Generative Adversarial Networks (Quant GANs) can learn and reproduce these stylized temporal dependencies. Although the generated series successfully mimic heavy-tailed return distributions and certain aspects of volatility clustering, they generally fail to capture the magnitude and consistency of LRD observed in real data, particularly in volatility dynamics. These results highlight an important limitation of current deep generative architectures in modeling slow-decaying dependence structures and underscore the need for incorporating explicit long-memory mechanisms when synthetic financial data are intended for risk management or long-horizon forecasting applications.
2025-09-24T00:41:14Z
28 pages, 8 figures, 7 tables
Yifan He, Svetlozar Rachev
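
Editorial illustration for the last entry above: a minimal sketch of classical rescaled range (R/S) analysis, one of the three LRD diagnostics the abstract names. It is not the authors' code. The function names `rescaled_range` and `hurst_rs`, the doubling window sizes, and the synthetic Gaussian input are all assumptions made for this example.

```python
import math
import random


def rescaled_range(window):
    """R/S of one window: range of cumulative mean-deviations over the std."""
    n = len(window)
    mean = sum(window) / n
    cum = 0.0
    lo = hi = 0.0  # cumulative deviations always end at 0, so 0 is in range
    for x in window:
        cum += x - mean
        lo = min(lo, cum)
        hi = max(hi, cum)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    return (hi - lo) / std if std > 0 else None


def hurst_rs(series, min_window=16):
    """Hurst exponent estimate: OLS slope of log(mean R/S) on log(window size).

    Window sizes double from min_window up to len(series) // 2 (an assumed,
    simple choice); the series must be long enough to give two sizes.
    """
    n = len(series)
    xs, ys = [], []
    w = min_window
    while w <= n // 2:
        # Non-overlapping windows of length w
        vals = [rescaled_range(series[i:i + w]) for i in range(0, n - w + 1, w)]
        vals = [v for v in vals if v is not None]
        if vals:
            xs.append(math.log(w))
            ys.append(math.log(sum(vals) / len(vals)))
        w *= 2
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Synthetic i.i.d. Gaussian "returns" (no long memory by construction).
random.seed(0)
returns = [random.gauss(0.0, 1.0) for _ in range(4096)]
h = hurst_rs(returns)
print(f"Estimated Hurst exponent: {h:.3f}")
```

For a series without long memory the estimate should sit near 0.5, although the classical R/S statistic is known to be biased upward in small windows; values well above 0.5 on absolute or squared returns would be the kind of volatility persistence the abstract reports.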