http://arxiv.org/abs/2512.08270v1 Reasoning Models Ace the CFA Exams 2025-12-09T05:57:19Z

Previous research has reported that large language models (LLMs) demonstrate poor performance on the Chartered Financial Analyst (CFA) exams. However, recent reasoning models have achieved strong results on graduate-level academic and professional examinations across various disciplines. In this paper, we evaluate state-of-the-art reasoning models on a set of mock CFA exams consisting of 980 questions across three Level I exams, two Level II exams, and three Level III exams. Using the same pass/fail criteria from prior studies, we find that most models clear all three levels. The models that pass, ordered by overall performance, are Gemini 3.0 Pro, Gemini 2.5 Pro, GPT-5, Grok 4, Claude Opus 4.1, and DeepSeek-V3.1. Specifically, Gemini 3.0 Pro achieves a record score of 97.6% on Level I. Performance is also strong on Level II, led by GPT-5 at 94.3%. On Level III, Gemini 2.5 Pro attains the highest score with 86.4% on multiple-choice questions while Gemini 3.0 Pro achieves 92.0% on constructed-response questions.

2025-12-09T05:57:19Z Jaisal Patel Yunzhe Chen Kaiwen He Keyi Wang David Li Kairong Xiao Xiao-Yang Liu http://arxiv.org/abs/2512.07526v1 The Suicide Region: Option Games and the Race to Artificial General Intelligence 2025-12-08T13:00:23Z

Standard real options theory predicts delay in exercising the option to invest or deploy when extreme asset volatility or technological uncertainty is present. However, in the current race to develop artificial general intelligence (AGI), sovereign actors are exhibiting behaviors contrary to theoretical predictions: the US and China are accelerating AI investment despite acknowledging the potential for catastrophic failure from AGI misalignment. We resolve this puzzle by formalizing the AGI race as a continuous-time preemption game with endogenous existential risk. In our model, the cost of failure is no longer bounded only by the sunk cost of investment (I), but rather by a systemic ruin parameter (D) that is correlated with development velocity and shared globally. As the disutility of catastrophe is embedded in both players' payoffs, the risk term mathematically cancels out of the equilibrium indifference condition. This creates a "suicide region" in the investment space where competitive pressures force rational agents to deploy AGI systems early, despite a negative risk-adjusted net present value. Furthermore, we show that "warning shots" (sub-existential disasters) will fail to deter AGI acceleration, as the winner-takes-all nature of the race remains intact. The race can only be halted if the cost of ruin is internalized, making safety research a prerequisite for economic viability. We derive the critical private liability threshold required to restore the option value of waiting and propose mechanism design interventions that can better ensure safe AGI research and socially responsible deployment.

2025-12-08T13:00:23Z 25 pages, 1 figure David Tan http://arxiv.org/abs/2507.22712v2 Order-Flow Filtration and Directional Association with Short-Horizon Returns 2025-12-08T04:09:43Z

Electronic markets generate dense order flow with many transient orders, which degrade directional signals derived from the limit order book (LOB). We study whether simple structural filters on order lifetime, modification count, and modification timing sharpen the association between order book imbalance (OBI) and short-horizon returns in BankNifty index futures, where unfiltered OBI is already known to be a strong short-horizon directional indicator. The efficacy of each filter is evaluated using a three-step diagnostic ladder: contemporaneous correlations, linear association between discretised regimes, and Hawkes event-time excitation between OBI and return regimes. Our results indicate that filtration of the aggregate order flow produces only modest changes relative to the unfiltered benchmark. By contrast, when filters are applied on the parent orders of executed trades, the resulting OBI series exhibits systematically stronger directional association. Motivated by recent regulatory initiatives to curb noisy order flow, we treat the association between OBI and short-horizon returns as a policy-relevant diagnostic of market quality. We then compare unfiltered and filtered OBI series, using tick-by-tick data from the National Stock Exchange of India, to infer how structural filters on the order flow affect OBI-return dynamics in an emerging market setting.
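
As a rough illustration of the quantity being filtered, the sketch below computes one common definition of order book imbalance, (bid volume - ask volume) / (bid volume + ask volume), after dropping short-lived orders. The `Order` fields, the `min_lifetime_ms` threshold, and this particular OBI formula are assumptions made for illustration; the abstract does not state the paper's exact definitions or filter settings.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Order:
    side: str           # "bid" or "ask"
    quantity: float     # resting quantity
    lifetime_ms: float  # time the order stayed in the book (hypothetical field)


def order_book_imbalance(orders: Iterable[Order], min_lifetime_ms: float = 0.0) -> float:
    """Return (bid - ask) / (bid + ask) over orders that pass the lifetime filter."""
    bid = ask = 0.0
    for order in orders:
        if order.lifetime_ms < min_lifetime_ms:
            continue  # treat very short-lived orders as transient noise
        if order.side == "bid":
            bid += order.quantity
        elif order.side == "ask":
            ask += order.quantity
    total = bid + ask
    # An empty (or fully filtered) book is reported as balanced.
    return 0.0 if total == 0 else (bid - ask) / total
```

With `min_lifetime_ms=0.0` the function returns the unfiltered benchmark; raising the threshold gives a filtered series that can be compared against it.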

2025-07-30T14:22:47Z 21 pages Aditya Nittur Anantha Shashi Jain Prithwish Maiti http://arxiv.org/abs/2512.15728v1 FedSight AI: Multi-Agent System Architecture for Federal Funds Target Rate Prediction 2025-12-05T16:45:18Z

The Federal Open Market Committee (FOMC) sets the federal funds rate, shaping monetary policy and the broader economy. We introduce FedSight AI, a multi-agent framework that uses large language models (LLMs) to simulate FOMC deliberations and predict policy outcomes. Member agents analyze structured indicators and unstructured inputs such as the Beige Book, debate options, and vote, replicating committee reasoning. A Chain-of-Draft (CoD) extension further improves efficiency and accuracy by enforcing concise multistage reasoning. Evaluated on 2023-2024 meetings, FedSight CoD achieved accuracy of 93.75% and stability of 93.33%, outperforming baselines including MiniFed and Ordinal Random Forest (RF), while offering transparent reasoning aligned with real FOMC communications.

2025-12-05T16:45:18Z NeurIPS 2025 Generative AI in Finance Workshop Yuhan Hou Tianji Rao Jeremy Tan Adler Viton Xiyue Zhang David Ye Abhishek Kodi Sanjana Dulam Aditya Paul Yikai Feng http://arxiv.org/abs/2512.03709v1 The Effect of High-Speed Rail Connectivity on Capital Market Earnings Forecast Error: Evidence from the Chinese Stock Market 2025-12-03T12:00:11Z

This study examines how China's high-speed rail (HSR) expansion affects analyst earnings forecast errors from an economic information friction perspective. Using firm-year panel data from 2008-2019, a period that covers HSR's early introduction and rapid nationwide rollout, the findings show that analysts' relative earnings forecast errors (RFE) decline significantly only after firms' cities become connected by high-speed rail. The placebo test, which artificially shifts HSR connectivity 3 years earlier than the actual opening year, yields an insignificant DID coefficient, rejecting the possibility that forecast errors were improving before the infrastructure shock. This supports the conclusion that forecast error reduction is linked to real geographic accessibility improvements rather than coincidence, pre-existing trends, or analyst anticipation. Economically, the study highlights that HSR reduces analysts' costs of gathering private, incremental information, particularly soft information obtained via plant or management visits. The rail network does not directly alter firms' internal capital allocation or earnings generation paths, but it lowers spatial barriers to information collection, enabling analysts to update EPS expectations under reduced travel friction. This work provides intuitive evidence that geography and mobility improvements contribute to forecasting accuracy in China's emerging, decentralized capital market corridors, and it encourages future research to consider transport accessibility as an exogenous information cost shock rather than an internal firm-capital shock.
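
The placebo logic described above can be sketched with a simple 2x2 difference-in-differences on group means. This is a simplification under stated assumptions: a single common opening year, no fixed effects or controls, and a hypothetical `(treated, year, forecast_error)` record layout. The study itself uses firm-year panel data and its exact specification is not given in the abstract.

```python
from statistics import mean
from typing import List, Tuple

# (treated, year, forecast_error); hypothetical record layout
Obs = Tuple[bool, int, float]


def did_estimate(data: List[Obs], event_year: int) -> float:
    """2x2 DID: (treated post - pre) minus (control post - pre)."""

    def cell_mean(treated: bool, post: bool) -> float:
        return mean(
            y for is_treated, year, y in data
            if is_treated == treated and (year >= event_year) == post
        )

    treated_change = cell_mean(True, True) - cell_mean(True, False)
    control_change = cell_mean(False, True) - cell_mean(False, False)
    return treated_change - control_change


def placebo_estimate(data: List[Obs], event_year: int, shift: int = 3) -> float:
    """Re-run DID on pre-event data only, pretending the event came `shift` years earlier."""
    pre_event = [obs for obs in data if obs[1] < event_year]
    return did_estimate(pre_event, event_year - shift)
```

A placebo estimate close to zero alongside a clearly negative actual estimate is the pattern the abstract reports: no improvement in forecast errors before the real connection date.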

2025-12-03T12:00:11Z Shilong Han http://arxiv.org/abs/2512.03189v1 The First Crypto President: Presidential Power and Cryptocurrency Markets During Trump's Second Term (2025-2029) 2025-12-02T19:39:03Z

This paper analyzes the intersection of presidential authority and cryptocurrency markets during Donald J. Trump's second term (2025-2029). We examine developments from 2024 through October 2025, focusing on how executive influence, family business ventures, and digital assets became intertwined in ways that blurred boundaries between public office and private profit. Using a mixed-methods approach that combines quantitative market data with qualitative institutional assessment, we identify politically linked digital assets as a distinct class characterized by reflexive valuations, asymmetric risk distribution, and systemic vulnerabilities. The Trump family's integrated cryptocurrency ecosystem reached peak valuations exceeding eleven billion dollars before collapsing by more than one trillion in market capitalization following a tariff announcement in October 2025. Results highlight conflicts of interest, failures in market microstructure, and the emergence of political finance as a monetizable phenomenon in the digital age. The study contributes to understanding how presidential signaling reshapes capital flows, how politically branded tokens function as quasi-currencies, and how sudden policy actions can trigger cascading liquidations across global digital asset systems.

2025-12-02T19:39:03Z 32 pages, 9 tables, 8 figures. Submitted to Journal of Business Economics and Finance. Revised version includes updated October-November 2025 market data Habib Badawi http://arxiv.org/abs/2512.07887v1 Does it take two to tango: Interaction between Credit Default Swaps and National Stock Indices 2025-12-01T14:03:07Z

This paper investigates the short- and long-run interaction between the BIST-100 index and CDS prices from January 2008 to May 2015 using the ARDL technique. The paper documents several findings. First, the ARDL analysis shows that a 1 TL increase in CDS prices lowers the BIST-100 index by 22.5 TL in the short run and 85.5 TL in the long run. Second, a 1000 TL increase in the BIST index causes reductions of 25 TL and 44 TL in Turkey's CDS prices in the short and long run, respectively. Third, in the long run, a one percent increase in the interest rate lowers the BIST index by 359 TL, and a one percent increase in the inflation rate raises CDS prices by 13.34 TL. In the short run, these impacts are limited to 231 TL and 5.73 TL, respectively. Fourth, a one-kurush increase in the TL/USD exchange rate leads to reductions of 24.5 TL (short run) and 78 TL (long run) in the BIST, while it raises CDS prices by 2.5 TL (short run) and 3 TL (long run). Fifth, each negative political event decreases the BIST by 237 TL in the short run and 538 TL in the long run, while it increases CDS prices by 33 TL in the short run and 89 TL in the long run. These findings point to the heavily dollar-indebted capital structure of Turkish firms and the excessive sensitivity of financial markets to political uncertainty. Finally, the paper provides evidence that BIST and CDS, together with the control variables, drift too far apart and converge to a long-run equilibrium at a moderate monthly speed.

2025-12-01T14:03:07Z Journal of Economics and Financial Analysis, 2018, 2(1), pp.129-149 Yhlas Sovbetov Hami Saka http://arxiv.org/abs/2512.07886v1 The Endogenous Constraint: Hysteresis, Stagflation, and the Structural Inhibition of Monetary Velocity in the Bitcoin Network (2016-2025) 2025-11-30T19:51:43Z

Bitcoin operates as a macroeconomic paradox: it combines a strictly predetermined, inelastic monetary issuance schedule with a stochastic, highly elastic demand for scarce block space. This paper empirically validates the Endogenous Constraint Hypothesis, positing that protocol-level throughput limits generate a non-linear negative feedback loop between network friction and base-layer monetary velocity. Using a verified Transaction Cost Index (TCI) derived from Blockchain.com on-chain data and Hansen's (2000) threshold regression, we identify a definitive structural break at the 90th percentile of friction (TCI ~ 1.63). The analysis reveals a bifurcation in network utility: while the network exhibits robust velocity growth of +15.44% during normal regimes, this collapses to +6.06% during shock regimes, yielding a statistically significant Net Utility Contraction of -9.39% (p = 0.012). Crucially, Instrumental Variable (IV) tests utilizing Hashrate Variation as a supply-side instrument fail to detect a significant relationship in a linear specification (p=0.196), confirming that the velocity constraint is strictly a regime-switching phenomenon rather than a continuous linear function. Furthermore, we document a "Crypto Multiplier" inversion: high friction correlates with a +8.03% increase in capital concentration per entity, suggesting that congestion forces a substitution from active velocity to speculative hoarding.

2025-11-30T19:51:43Z 42 pages, 13 figures. JEL Classification: E41, E51, G15, C24 Hamoon Soleimani http://arxiv.org/abs/2512.00142v1 DeFi TrustBoost: Blockchain and AI for Trustworthy Decentralized Financial Decisions 2025-11-28T18:30:39Z

This research introduces the Decentralized Finance (DeFi) TrustBoost Framework, which combines blockchain technology and Explainable AI to address challenges faced by lenders underwriting small business loan applications from low-wealth households. The framework is designed with a strong emphasis on fulfilling four crucial requirements of blockchain and AI systems: confidentiality, compliance with data protection laws, resistance to adversarial attacks, and compliance with regulatory audits. It presents a technique for tamper-proof auditing of automated AI decisions and a strategy for on-chain (inside-blockchain) and off-chain data storage to facilitate collaboration within and across financial organizations.

2025-11-28T18:30:39Z 19 pages Swati Sachan Dale S. Fickett http://arxiv.org/abs/2511.15214v2 Corporate Earnings Calls and Analyst Beliefs 2025-11-25T18:42:49Z

Economic behavior is shaped not only by quantitative information but also by the narratives through which such information is communicated and interpreted (Shiller, 2017). I show that narratives extracted from earnings calls significantly improve the prediction of both realized earnings and analyst expectations. To uncover the underlying mechanisms, I introduce a novel text-morphing methodology in which large language models generate counterfactual transcripts that systematically vary topical emphasis (the prevailing narrative) while holding quantitative content fixed. This framework allows me to precisely measure how analysts under- and over-react to specific narrative dimensions. The results reveal systematic biases: analysts over-react to sentiment (optimism) and under-react to narratives of risk and uncertainty. Overall, the analysis offers a granular perspective on the mechanisms of expectation formation through the competing narratives embedded in corporate communication.

2025-11-19T08:06:46Z Giuseppe Matera http://arxiv.org/abs/2206.15365v10 Most claimed statistical findings in cross-sectional return predictability are likely true 2025-11-19T14:48:22Z

The false discovery rate (FDR) measures the share of false positives in a set of statistical tests. I develop simple and intuitive bounds on the FDR in cross-sectional predictability publications. The simplest bound requires just a few lines of math and finds $\text{FDR} \le 25\%$ based on summary statistics in eight out of nine previous studies. A more refined bound finds $\text{FDR} \le 9\%$. The FDR is small because randomly selecting accounting ratios produces statistically significant predictability far more often than would occur if there were no predictability. The bounds also reconcile the disparate FDR estimates in the literature.
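
One elementary way to obtain a bound of this kind follows from Bayes' rule: FDR = P(null | significant) = P(significant | null) P(null) / P(significant), which is at most alpha / P(significant) when tests are run at level alpha. The sketch below encodes only that inequality. It is not necessarily the paper's exact derivation, and the function name and example inputs are hypothetical.

```python
def fdr_upper_bound(alpha: float, share_significant: float) -> float:
    """Upper bound on the false discovery rate: alpha / P(significant), capped at 1.

    alpha             -- significance level of each test, e.g. 0.05
    share_significant -- fraction of all tests that come out significant
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not 0.0 < share_significant <= 1.0:
        raise ValueError("share_significant must lie in (0, 1]")
    return min(1.0, alpha / share_significant)


# Hypothetical inputs chosen only to show the arithmetic:
# a 5% test level and 20% of tests significant give 0.05 / 0.20.
example_bound = fdr_upper_bound(alpha=0.05, share_significant=0.20)
```

The bound tightens as the share of significant results rises above the test level, which matches the abstract's intuition that significant predictability appears far more often than it would under no predictability.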

2022-06-30T15:36:31Z Andrew Y. Chen http://arxiv.org/abs/2511.15456v1 Know Your Intent: An Autonomous Multi-Perspective LLM Agent Framework for DeFi User Transaction Intent Mining 2025-11-19T14:15:23Z

As Decentralized Finance (DeFi) develops, understanding user intent behind DeFi transactions is crucial yet challenging due to complex smart contract interactions, multifaceted on-/off-chain factors, and opaque hex logs. Existing methods lack deep semantic insight. To address this, we propose the Transaction Intent Mining (TIM) framework. TIM leverages a DeFi intent taxonomy built on grounded theory and a multi-agent Large Language Model (LLM) system to robustly infer user intents. A Meta-Level Planner dynamically coordinates domain experts to decompose multiple perspective-specific intent analyses into solvable subtasks. Question Solvers handle the tasks with multi-modal on/off-chain data, while a Cognitive Evaluator mitigates LLM hallucinations and ensures verifiability. Experiments show that TIM significantly outperforms machine learning models, single LLMs, and single Agent baselines. We also analyze core challenges in intent inference. This work helps provide a more reliable understanding of user motivations in DeFi, offering context-aware explanations for complex blockchain activity.

2025-11-19T14:15:23Z Written in 2025 Q1 Qian'ang Mao Yuxuan Zhang Jiaman Chen Wenjun Zhou Jiaqi Yan http://arxiv.org/abs/2511.15364v1 Anonymization and Information Loss 2025-11-19T11:44:48Z

We show that while anonymization effectively obscures firm identity, it significantly reduces the power of textual understanding, thereby diminishing models' ability to extract meaningful economic signals from financial texts. This information loss is particularly severe when numerical and object entities are removed from texts and is amplified in texts characterized by high linguistic uncertainty and firm specificity. Importantly, in the setting of sentiment extraction from earnings call transcripts, we find that information loss induced by anonymization is more pervasive and severe than the effects of look-ahead bias, suggesting that the costs of anonymization may outweigh its benefits in certain financial applications.

2025-11-19T11:44:48Z Ke Wu Baozhong Yang Zhenkun Ying Dexin Zhou http://arxiv.org/abs/2511.15123v1 Causal Inference in Financial Event Studies 2025-11-19T04:57:19Z

Financial event studies, ubiquitous in finance research, typically use linear factor models with known factors to estimate abnormal returns and identify causal effects of information events. This paper demonstrates that when factor models are misspecified -- an almost certain reality -- traditional event study estimators produce inconsistent estimates of treatment effects. The bias is particularly severe during volatile periods, over long horizons, and when event timing correlates with market conditions. We derive precise conditions for identification and expressions for asymptotic bias. As an alternative, we propose synthetic control methods that construct replicating portfolios from control securities without imposing specific factor structures. Revisiting four empirical applications, we show that some established findings may reflect model misspecification rather than true treatment effects. While traditional methods remain reliable for short-horizon studies with random event timing, our results suggest caution when interpreting long-horizon or volatile-period event studies and highlight the importance of quasi-experimental designs when available.
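
For reference, the traditional estimator discussed above can be sketched with a single-factor market model: fit alpha and beta by least squares on an estimation window, then treat event-window returns in excess of the fitted value as abnormal. The single market factor, the function name, and the window layout are assumptions for illustration; the abstract refers to linear factor models in general.

```python
from statistics import mean
from typing import List, Sequence


def market_model_abnormal_returns(
    est_asset: Sequence[float],
    est_market: Sequence[float],
    evt_asset: Sequence[float],
    evt_market: Sequence[float],
) -> List[float]:
    """Abnormal returns = event-window return minus (alpha + beta * market return)."""
    if len(est_asset) != len(est_market) or len(evt_asset) != len(evt_market):
        raise ValueError("asset and market series must have equal lengths")
    market_mean = mean(est_market)
    asset_mean = mean(est_asset)
    # Ordinary least squares slope and intercept on the estimation window.
    sxx = sum((m - market_mean) ** 2 for m in est_market)
    if sxx == 0:
        raise ValueError("market returns must vary in the estimation window")
    sxy = sum((m - market_mean) * (r - asset_mean) for m, r in zip(est_market, est_asset))
    beta = sxy / sxx
    alpha = asset_mean - beta * market_mean
    return [r - (alpha + beta * m) for r, m in zip(evt_asset, evt_market)]
```

If the true return process contains factors omitted from this model, the fitted alpha and beta absorb them only on average over the estimation window, which is where the bias described in the abstract can enter.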

2025-11-19T04:57:19Z Paul Goldsmith-Pinkham Tianshu Lyu http://arxiv.org/abs/2512.02029v1 HODL Strategy or Fantasy? 480 Million Crypto Market Simulations and the Macro-Sentiment Effect 2025-11-19T03:46:37Z

Crypto enthusiasts claim that buying and holding crypto assets yields high returns, often citing Bitcoin's past performance to promote other tokens and fuel fear of missing out. However, understanding the real risk-return trade-off and what factors affect future crypto returns is crucial as crypto becomes increasingly accessible to retail investors through major brokerages. We examine the HODL strategy through two independent analyses. First, we implement 480 million Monte Carlo simulations across 378 non-stablecoin crypto assets, net of trading fees and the opportunity cost of 1-month Treasury bills, and find strong evidence of survivorship bias and extreme downside concentration. At the 2-3 year horizon, the median excess return is -28.4 percent, the 1 percent conditional value at risk indicates that tail scenarios wipe out principal after all costs, and only the top quartile achieves very large gains, with a mean excess return of 1,326.7 percent. These results challenge the HODL narrative: across a broad set of assets, simple buy-and-hold loads extreme downside risk onto most investors, and the miracles mostly belong to the luckiest quarter. Second, using a Bayesian multi-horizon local projection framework, we find that endogenous predictors based on realized risk-return metrics have economically negligible and unstable effects, while macro-finance factors, especially the 24-week exponential moving average of the Fear and Greed Index, display persistent long-horizon impacts and high cross-basket stability. Where significant, a one-standard-deviation sentiment shock reduces forward top-quartile mean excess returns by 15-22 percentage points and median returns by 6-10 percentage points over 1-3 year horizons, suggesting that macro-sentiment conditions, rather than realized return histories, are the dominant indicators for future outcomes.
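
The summary statistics quoted above (median excess return, 1 percent conditional value at risk, top-quartile mean) can be computed from a list of simulated excess returns roughly as follows. This is a minimal sketch: the function names are hypothetical, and CVaR is taken here as the mean of the worst `ceil(level * n)` outcomes, which is one common empirical convention and may differ from the paper's.

```python
import math
from statistics import median
from typing import Dict, Sequence


def cvar(returns: Sequence[float], level: float = 0.01) -> float:
    """Mean of the worst ceil(level * n) outcomes (at least one)."""
    if not returns:
        raise ValueError("returns must be non-empty")
    k = max(1, math.ceil(level * len(returns)))
    worst = sorted(returns)[:k]
    return sum(worst) / k


def top_share_mean(returns: Sequence[float], share: float = 0.25) -> float:
    """Mean of the best ceil(share * n) outcomes (at least one)."""
    if not returns:
        raise ValueError("returns must be non-empty")
    k = max(1, math.ceil(share * len(returns)))
    best = sorted(returns, reverse=True)[:k]
    return sum(best) / k


def summarize(returns: Sequence[float]) -> Dict[str, float]:
    """Collect the three statistics for one set of simulated excess returns."""
    return {
        "median": median(returns),
        "cvar_1pct": cvar(returns, 0.01),
        "top_quartile_mean": top_share_mean(returns, 0.25),
    }
```

A negative median combined with a very large top-quartile mean is the signature of the skewed distribution the abstract describes.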

2025-11-19T03:46:37Z Weikang Zhang Alison Watts