https://arxiv.org/api/osUYdcL7cGD7tuCowB+zxmxPk2U
2026-03-22T14:45:25Z

http://arxiv.org/abs/2512.12871v1
CapOptix: An Options-Framework for Capacity Market Pricing
Updated: 2025-12-14T23:08:38Z
Electricity markets are under increasing pressure to maintain reliability amidst rising renewable penetration, demand variability, and occasional price shocks. Traditional capacity market designs often fall short in addressing this by relying on expected-value metrics of energy unserved, which overlook risk exposure in such systems. In this work, we present CapOptix, a capacity pricing framework that interprets capacity commitments as reliability options, i.e., financial derivatives of wholesale electricity prices. CapOptix characterizes the capacity premia charged by accounting for structural price shifts modeled by the Markov Regime Switching Process. We apply the framework to historical price data from multiple electricity markets and compare the resulting premium ranges with existing capacity remuneration mechanisms.
Published: 2025-12-14T23:08:38Z
Authors: Millend Roy, Agostino Capponi, Vladimir Pyltsov, Yinbo Hu, Vijay Modi

http://arxiv.org/abs/2512.12526v1
Empirical Mode Decomposition and Graph Transformation of the MSCI World Index: A Multiscale Topological Analysis for Graph Neural Network Modeling
Updated: 2025-12-14T02:35:38Z
This study applies Empirical Mode Decomposition (EMD) to the MSCI World index and converts the resulting intrinsic mode functions (IMFs) into graph representations to enable modeling with graph neural networks (GNNs). Using CEEMDAN, we extract nine IMFs spanning high-frequency fluctuations to long-term trends. Each IMF is transformed into a graph using four time-series-to-graph methods: natural visibility, horizontal visibility, recurrence, and transition graphs. Topological analysis shows clear scale-dependent structure: high-frequency IMFs yield dense, highly connected small-world graphs, whereas low-frequency IMFs produce sparser networks with longer characteristic path lengths.
Visibility-based methods are more sensitive to amplitude variability and typically generate higher clustering, while recurrence graphs better preserve temporal dependencies. These results provide guidance for designing GNN architectures tailored to the structural properties of decomposed components, supporting more effective predictive modeling of financial time series.
Published: 2025-12-14T02:35:38Z
Comment: 19 pages, 3 figures, 6 tables
Authors: Agustín M. de los Riscos, Julio E. Sandubete, Diego Carmona-Fernández, León Beleña

http://arxiv.org/abs/2512.12499v1
Explainable Prediction of Economic Time Series Using IMFs and Neural Networks
Updated: 2025-12-13T23:58:27Z
This study investigates the contribution of Intrinsic Mode Functions (IMFs) derived from economic time series to the predictive performance of neural network models, specifically Multilayer Perceptrons (MLP) and Long Short-Term Memory (LSTM) networks. To enhance interpretability, DeepSHAP is applied, which estimates the marginal contribution of each IMF while keeping the rest of the series intact. Results show that the last IMFs, representing long-term trends, are generally the most influential according to DeepSHAP, whereas high-frequency IMFs contribute less and may even introduce noise, as evidenced by improved metrics upon their removal. Differences between MLP and LSTM highlight the effect of model architecture on feature relevance distribution, with LSTM allocating importance more evenly across IMFs.
Published: 2025-12-13T23:58:27Z
Comment: 12 pages, 10 figures, 8 tables
Authors: Pablo Hidalgo, Julio E. Sandubete, Agustín García-García

http://arxiv.org/abs/2511.13384v4
CBDC Stress Test in a Dual-Currency Setting
Updated: 2025-12-13T16:34:30Z
This study explores the potential impact of introducing a Central Bank Digital Currency (CBDC) on financial stability in an emerging dual-currency economy (Romania), where the domestic currency (RON) coexists with the euro. It develops an integrated analytical framework combining econometrics, machine learning, and behavioural modelling.
CBDC adoption probabilities are estimated using XGBoost and logistic regression models trained on behavioural and macro-financial indicators rather than survey data. Liquidity stress simulations assess how banks would respond to deposit withdrawals resulting from CBDC adoption, while VAR, MSVAR, and SVAR models capture the macro-financial transmission of liquidity shocks into credit contraction and changes in monetary conditions. The findings indicate that CBDC uptake (co-circulating Digital RON and Digital EUR) would be moderate at issuance, amounting to around EUR 1 billion, primarily driven by digital readiness and trust in the central bank. The study concludes that a non-remunerated, capped CBDC, designed primarily as a means of payment rather than a store of value, can be introduced without compromising financial stability. In dual-currency economies, differentiated holding limits for domestic and foreign digital currencies (e.g., Digital RON versus Digital Euro) are crucial to prevent uncontrolled euroisation and preserve monetary sovereignty. A prudent design with moderate caps, non-remuneration, and macroprudential coordination can transform CBDC into a digital liquidity buffer and a complementary monetary policy instrument that enhances resilience and inclusion rather than destabilising the financial system.
Published: 2025-11-17T13:55:02Z
Comment: 724 pages, including annexes; most figures and tables included; if not, then referenced
Authors: Catalin Dumitrescu

http://arxiv.org/abs/2512.12334v1
Extending the application of dynamic Bayesian networks in calculating market risk: Standard and stressed expected shortfall
Updated: 2025-12-13T13:55:28Z
In the last five years, expected shortfall (ES) and stressed ES (SES) have become key required regulatory measures of market risk in the banking sector, especially following events such as the global financial crisis. Thus, finding ways to optimize their estimation is of great importance.
We extend the application of dynamic Bayesian networks (DBNs) to the estimation of 10-day 97.5% ES and stressed ES, building on prior work applying DBNs to value at risk. Using the S&P 500 index as a proxy for the equities trading desk of a US bank, we compare the performance of three DBN structure-learning algorithms with several traditional market risk models, using either the normal or the skewed Student's t return distributions. Backtesting shows that all models fail to produce statistically accurate ES and SES forecasts at the 2.5% level, reflecting the difficulty of modeling extreme tail behavior. For ES, the EGARCH(1,1) model (normal) produces the most accurate forecasts, while, for SES, the GARCH(1,1) model (normal) performs best. All distribution-dependent models deteriorate substantially when using the skewed Student's t distribution. The DBNs perform comparably to the historical simulation model, but their contribution to tail prediction is limited by the small weight assigned to their one-day-ahead forecasts within the return distribution. Future research should examine weighting schemes that enhance the influence of forward-looking DBN forecasts on tail risk estimation.
Published: 2025-12-13T13:55:28Z
Authors: Eden Gross, Ryan Kruger, Francois Toerien

http://arxiv.org/abs/2512.14744v1
VERAFI: Verified Agentic Financial Intelligence through Neurosymbolic Policy Generation
Updated: 2025-12-12T17:17:43Z
Financial AI systems suffer from a critical blind spot: while Retrieval-Augmented Generation (RAG) excels at finding relevant documents, language models still generate calculation errors and regulatory violations during reasoning, even with perfect retrieval. This paper introduces VERAFI (Verified Agentic Financial Intelligence), an agentic framework with neurosymbolic policy generation for verified financial intelligence.
VERAFI combines state-of-the-art dense retrieval and cross-encoder reranking with financial tool-enabled agents and automated reasoning policies covering GAAP compliance, SEC requirements, and mathematical validation. Our comprehensive evaluation on FinanceBench demonstrates remarkable improvements: while traditional dense retrieval with reranking achieves only 52.4% factual correctness, VERAFI's integrated approach reaches 94.7%, an 81% relative improvement. The neurosymbolic policy layer alone contributes a 4.3 percentage point gain over pure agentic processing, specifically targeting persistent mathematical and logical errors. By integrating financial domain expertise directly into the reasoning process, VERAFI offers a practical pathway toward trustworthy financial AI that meets the stringent accuracy demands of regulatory compliance, investment decisions, and risk management.
Published: 2025-12-12T17:17:43Z
Authors: Adewale Akinfaderin, Shreyas Subramanian

http://arxiv.org/abs/2512.10913v1
Reinforcement Learning in Financial Decision Making: A Systematic Review of Performance, Challenges, and Implementation Strategies
Updated: 2025-12-11T18:42:19Z
Reinforcement learning (RL) is an innovative approach to financial decision making, offering specialized solutions to complex investment problems where traditional methods fail. This review analyzes 167 articles from 2017-2025, focusing on market making, portfolio optimization, and algorithmic trading. It identifies key performance issues and challenges in RL for finance. Generally, RL offers advantages over traditional methods, particularly in market making. This study proposes a unified framework to address common concerns such as explainability, robustness, and deployment feasibility. Empirical evidence with synthetic data suggests that implementation quality and domain knowledge often outweigh algorithmic complexity.
The study highlights the need for interpretable RL architectures for regulatory compliance, enhanced robustness in nonstationary environments, and standardized benchmarking protocols. Organizations should focus less on algorithm sophistication and more on market microstructure, regulatory constraints, and risk management in decision-making.
Published: 2025-12-11T18:42:19Z
Comment: Paper submitted to Management Science
Authors: Mohammad Rezoanul Hoque, Md Meftahul Ferdaus, M. Kabir Hassan

http://arxiv.org/abs/2512.10584v1
Volatility time series modeling by single-qubit quantum circuit learning
Updated: 2025-12-11T12:23:32Z
We employ single-qubit quantum circuit learning (QCL) to model the dynamics of volatility time series. To assess its effectiveness, we generate synthetic data using the Rational GARCH model, which is specifically designed to capture volatility asymmetry. Our results show that QCL-based volatility predictions preserve the negative return-volatility correlation, a hallmark of asymmetric volatility dynamics. Moreover, analysis of the Hurst exponent and multifractal characteristics indicates that the predicted series, like the original synthetic data, exhibits anti-persistent behavior and retains its multifractal structure.
Published: 2025-12-11T12:23:32Z
Comment: 9 pages, 10 figures, accepted for 14th International Conference on Mathematical Modeling in Physical Sciences,
Authors: Tetsuya Takaishi

http://arxiv.org/abs/2512.14735v1
PyFi: Toward Pyramid-like Financial Image Understanding for VLMs via Adversarial Agents
Updated: 2025-12-11T06:04:33Z
This paper proposes PyFi, a novel framework for pyramid-like financial image understanding that enables vision language models (VLMs) to reason through question chains in a progressive, simple-to-complex manner.
At the core of PyFi is PyFi-600K, a dataset comprising 600K financial question-answer pairs organized into a reasoning pyramid: questions at the base require only basic perception, while those toward the apex demand increasing levels of capability in financial visual understanding and expertise. This data is scalable because it is synthesized without human annotations, using PyFi-adv, a multi-agent adversarial mechanism under the Monte Carlo Tree Search (MCTS) paradigm, in which, for each image, a challenger agent competes with a solver agent by generating question chains that progressively probe deeper capability levels in financial visual reasoning. Leveraging this dataset, we present fine-grained, hierarchical, and comprehensive evaluations of advanced VLMs in the financial domain. Moreover, fine-tuning Qwen2.5-VL-3B and Qwen2.5-VL-7B on the pyramid-structured question chains enables these models to answer complex financial questions by decomposing them into sub-questions with gradually increasing reasoning demands, yielding average accuracy improvements of 19.52% and 8.06%, respectively, on the dataset. All resources of code, dataset and models are available at: https://github.com/AgenticFinLab/PyFi .
Published: 2025-12-11T06:04:33Z
Authors: Yuqun Zhang, Yuxuan Zhao, Sijia Chen

http://arxiv.org/abs/2512.05156v2
Semantic Faithfulness and Entropy Production Measures to Tame Your LLM Demons and Manage Hallucinations
Updated: 2025-12-08T15:12:35Z
Evaluating faithfulness of Large Language Models (LLMs) to a given task is a complex challenge. We propose two new unsupervised metrics for faithfulness evaluation using insights from information theory and thermodynamics. Our approach treats an LLM as a bipartite information engine where hidden layers act as a Maxwell demon controlling transformations of context $C$ into answer $A$ via prompt $Q$. We model Question-Context-Answer (QCA) triplets as probability distributions over shared topics.
Topic transformations from $C$ to $Q$ and $A$ are modeled as transition matrices ${\bf Q}$ and ${\bf A}$ encoding the query goal and actual result, respectively. Our semantic faithfulness (SF) metric quantifies faithfulness for any given QCA triplet by the Kullback-Leibler (KL) divergence between these matrices. Both matrices are inferred simultaneously via convex optimization of this KL divergence, and the final SF metric is obtained by mapping the minimal divergence onto the unit interval [0,1], where higher scores indicate greater faithfulness. Furthermore, we propose a thermodynamics-based semantic entropy production (SEP) metric in answer generation, and show that high faithfulness generally implies low entropy production. The SF and SEP metrics can be used jointly or separately for LLM evaluation and hallucination control. We demonstrate our framework on LLM summarization of corporate SEC 10-K filings.
Published: 2025-12-04T03:47:37Z
Comment: 23 pages, 6 figures
Authors: Igor Halperin

http://arxiv.org/abs/2511.01869v2
BondBERT: What we learn when assigning sentiment in the bond market
Updated: 2025-12-08T09:21:22Z
Bond markets respond differently to macroeconomic news compared to equity markets, yet most sentiment models are trained primarily on general financial or equity news data. However, bond prices often move in the opposite direction to economic optimism, making general or equity-based sentiment tools potentially misleading. We introduce BondBERT, a transformer-based language model fine-tuned on bond-specific news. BondBERT can act as the perception and reasoning component of a financial decision-support agent, providing sentiment signals that integrate with forecasting models. We propose a generalisable framework for adapting transformers to low-volatility, domain-inverse sentiment tasks by compiling and cleaning 30,000 UK bond market articles (2018-2025).
BondBERT's sentiment predictions are compared against FinBERT, FinGPT, and Instruct-FinGPT using event-based correlation, up/down accuracy analyses, and LSTM forecasting across ten UK sovereign bonds. We find that BondBERT consistently produces positive correlations with bond returns, and achieves higher alignment and forecasting accuracy than the three baseline models. These results demonstrate that domain-specific sentiment adaptation better captures fixed income dynamics, bridging a gap between NLP advances and bond market analytics.
Published: 2025-10-21T09:18:03Z
Comment: 8 pages, 3 figures, author manuscript accepted for ICAART 2026: 18th International Conference on Agents and Artificial Intelligence, Mar. 2026, Marbella, Spain
Authors: Toby Barter, Zheng Gao, Eva Christodoulaki, Jing Chen, John Cartlidge

http://arxiv.org/abs/2512.07162v1
DeepSVM: Learning Stochastic Volatility Models with Physics-Informed Deep Operator Networks
Updated: 2025-12-08T04:53:23Z
Real-time calibration of stochastic volatility models (SVMs) is computationally bottlenecked by the need to repeatedly solve coupled partial differential equations (PDEs). In this work, we propose DeepSVM, a physics-informed Deep Operator Network (PI-DeepONet) designed to learn the solution operator of the Heston model across its entire parameter space. Unlike standard data-driven deep learning (DL) approaches, DeepSVM requires no labelled training data. Rather, we employ a hard-constrained ansatz that enforces terminal payoffs and static no-arbitrage conditions by design. Furthermore, we use Residual-based Adaptive Refinement (RAR) to stabilize training in difficult regions subject to high gradients. Overall, DeepSVM achieves a final training loss of $10^{-5}$ and predicts highly accurate option prices across a range of typical market dynamics.
While pricing accuracy is high, we find that the model's derivatives (Greeks) exhibit noise in the at-the-money (ATM) regime, highlighting the specific need for higher-order regularization in physics-informed operator learning.
Published: 2025-12-08T04:53:23Z
Authors: Kieran A. Malandain, Selim Kalici, Hakob Chakhoyan

http://arxiv.org/abs/2507.22712v2
Order-Flow Filtration and Directional Association with Short-Horizon Returns
Updated: 2025-12-08T04:09:43Z
Electronic markets generate dense order flow with many transient orders, which degrade directional signals derived from the limit order book (LOB). We study whether simple structural filters on order lifetime, modification count, and modification timing sharpen the association between order book imbalance (OBI) and short-horizon returns in BankNifty index futures, where unfiltered OBI is already known to be a strong short-horizon directional indicator. The efficacy of each filter is evaluated using a three-step diagnostic ladder: contemporaneous correlations, linear association between discretised regimes, and Hawkes event-time excitation between OBI and return regimes. Our results indicate that filtration of the aggregate order flow produces only modest changes relative to the unfiltered benchmark. By contrast, when filters are applied on the parent orders of executed trades, the resulting OBI series exhibits systematically stronger directional association. Motivated by recent regulatory initiatives to curb noisy order flow, we treat the association between OBI and short-horizon returns as a policy-relevant diagnostic of market quality.
We then compare unfiltered and filtered OBI series, using tick-by-tick data from the National Stock Exchange of India, to infer how structural filters on the order flow affect OBI-return dynamics in an emerging market setting.
Published: 2025-07-30T14:22:47Z
Comment: 21 pages
Authors: Aditya Nittur Anantha, Shashi Jain, Prithwish Maiti

http://arxiv.org/abs/2512.06620v1
Unveiling Hedge Funds: Topic Modeling and Sentiment Correlation with Fund Performance
Updated: 2025-12-07T01:52:51Z
The hedge fund industry presents significant challenges for investors due to its opacity and limited disclosure requirements. This pioneering study introduces two major innovations in financial text analysis. First, we apply topic modeling to hedge fund documents, an unexplored domain for automated text analysis, using a unique dataset of over 35,000 documents from 1,125 hedge fund managers. We compared three state-of-the-art methods: Latent Dirichlet Allocation (LDA), Top2Vec, and BERTopic. Our findings reveal that LDA with 20 topics produces the most interpretable results for human users and demonstrates higher robustness in topic assignments when the number of topics varies, while Top2Vec shows superior classification performance. Second, we establish a novel quantitative framework linking document sentiment to fund performance, transforming qualitative information traditionally requiring expert interpretation into systematic investment signals. In sentiment analysis, contrary to expectations, the general-purpose DistilBERT outperforms the finance-specific FinBERT in generating sentiment scores, demonstrating superior adaptability to diverse linguistic patterns found in hedge fund documents that extend beyond specialized financial news text. Furthermore, sentiment scores derived using DistilBERT in combination with Top2Vec show stronger correlations with subsequent fund performance compared to other model combinations.
These results demonstrate that automated topic modeling and sentiment analysis can effectively process hedge fund documents, providing investors with new data-driven decision support tools.
Published: 2025-12-07T01:52:51Z
Authors: Chang Liu

http://arxiv.org/abs/2303.09406v2
Exploiting Supply Chain Interdependencies for Stock Return Prediction: A Full-State Graph Convolutional LSTM
Updated: 2025-12-07T01:43:39Z
Stock return prediction is fundamental to financial decision-making, yet traditional time series models fail to capture the complex interdependencies between companies in modern markets. We propose the Full-State Graph Convolutional LSTM (FS-GCLSTM), a novel temporal graph neural network that incorporates value-chain relationships to enhance stock return forecasting. Our approach features two key innovations: First, we represent inter-firm dependencies through value-chain networks, where nodes correspond to companies and edges capture supplier-customer relationships, enabling the model to leverage information beyond historical price data. Second, FS-GCLSTM applies graph convolutions to all LSTM components - current input features, previous hidden states, and cell states - ensuring that spatial information from the value-chain network influences every aspect of the temporal update mechanism. We evaluate FS-GCLSTM on Eurostoxx 600 and S&P 500 datasets using LSEG value-chain data. While not achieving the lowest traditional prediction errors, FS-GCLSTM consistently delivers superior portfolio performance, attaining the highest annualized returns, Sharpe ratios, and Sortino ratios across both markets. Performance gains are more pronounced in the denser Eurostoxx 600 network, and robustness tests confirm stability across different input sequence lengths, demonstrating the practical value of integrating value-chain data with temporal graph neural networks.
Published: 2023-03-07T17:24:04Z
Authors: Chang Liu
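
The FS-GCLSTM abstract above describes graph convolutions applied to the current input, the previous hidden state, and the cell state. The NumPy sketch below illustrates that idea for a single time step; it is not the authors' implementation. The symmetric normalization of the adjacency matrix, the single shared weight matrices, and the names `normalized_adjacency`, `fs_gclstm_step`, `W`, `U`, and `b` are all assumptions made for illustration.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetrically normalize A + I (a common GCN choice, assumed here)."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def fs_gclstm_step(a_norm, x, h_prev, c_prev, params):
    """One LSTM step in which input, hidden state, and cell state are all
    aggregated over graph neighbours before the usual gate computations.

    a_norm: (N, N) normalized adjacency; x: (N, F); h_prev, c_prev: (N, H).
    params: hypothetical dict with W (F, 4H), U (H, 4H), b (4H,).
    """
    gx = a_norm @ x        # graph-convolved input features
    gh = a_norm @ h_prev   # graph-convolved previous hidden state
    gc = a_norm @ c_prev   # graph-convolved previous cell state (assumption)

    z = gx @ params["W"] + gh @ params["U"] + params["b"]
    i, f, o, g = np.split(z, 4, axis=1)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)

    c = f * gc + i * g     # cell update uses the neighbour-aggregated cell state
    h = o * np.tanh(c)
    return h, c
```

With an adjacency matrix of all zeros, `normalized_adjacency` returns the identity and the step reduces to an ordinary LSTM cell, which gives a quick sanity check on the graph-related terms.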