https://arxiv.org/api/QEWNeWoqInax+RwAtVa+tc76hVQ
2026-04-12T13:27:16Z
296248015

http://arxiv.org/abs/2310.08678v1
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
2023-10-12T19:28:57Z
Large Language Models (LLMs) have demonstrated remarkable performance on a wide range of Natural Language Processing (NLP) tasks, often matching or even beating state-of-the-art task-specific models. This study aims at assessing the financial reasoning capabilities of LLMs. We leverage mock exam questions of the Chartered Financial Analyst (CFA) Program to conduct a comprehensive evaluation of ChatGPT and GPT-4 in financial analysis, considering Zero-Shot (ZS), Chain-of-Thought (CoT), and Few-Shot (FS) scenarios. We present an in-depth analysis of the models' performance and limitations, and estimate whether they would have a chance at passing the CFA exams. Finally, we outline insights into potential strategies and improvements to enhance the applicability of LLMs in finance. In this perspective, we hope this work paves the way for future studies to continue enhancing LLMs for financial reasoning through rigorous evaluation.
2023-10-12T19:28:57Z
Ethan Callanan, Amarachi Mbakwe, Antony Papadimitriou, Yulong Pei, Mathieu Sibue, Xiaodan Zhu, Zhiqiang Ma, Xiaomo Liu, Sameena Shah

http://arxiv.org/abs/2310.07110v1
Valuation Duration of the Stock Market
2023-10-11T01:12:18Z
At the peak of the tech bubble, only 0.57% of market valuation comes from dividends in the next year. Taking the ratio of total market value to the value of one-year dividends, we obtain a valuation-based duration of 175 years. In contrast, at the height of the global financial crisis, more than 2.2% of market value is from dividends in the next year, implying a duration of 46 years. What drives valuation duration? We find that market participants have limited information about cash flow beyond one year.
Therefore, an increase in valuation duration is due to a decrease in the discount rate rather than good news about long-term growth. Accordingly, valuation duration negatively predicts annual market return with an out-of-sample $R^2$ of 15%, robustly outperforming other predictors in the literature. While the price-dividend ratio reflects the overall valuation level, our valuation-based measure of duration captures the slope of the valuation term structure. We show that valuation duration, as a discount rate proxy, is a critical state variable that augments the price-dividend ratio in spanning the (latent) state space for stock-market dynamics.
2023-10-11T01:12:18Z
Ye Li, Chen Wang

http://arxiv.org/abs/2310.05322v1
Market Crowds' Trading Behaviors, Agreement Prices, and the Implications of Trading Volume
2023-10-09T00:59:24Z
It has been long that literature in financial academics focuses mainly on price and return but much less on trading volume. In the past twenty years, it has already linked both price and trading volume to economic fundamentals, and explored the behavioral implications of trading volume such as investor's attitude toward risks, overconfidence, disagreement, and attention etc. However, what is surprising is how little we really know about trading volume. Here we show that trading volume probability represents the frequency of market crowd's trading action in terms of behavior analysis, and test two adaptive hypotheses relevant to the volume uncertainty associated with price in China stock market. The empirical work reveals that market crowd trade a stock in efficient adaptation except for simple heuristics, gradually tend to achieve agreement on an outcome or an asset price widely on a trading day, and generate such a stationary equilibrium price very often in interaction and competition among themselves no matter whether it is highly overestimated or underestimated.
This suggests that asset prices include not only a fundamental value but also private information, speculative, sentiment, attention, gamble, and entertainment values etc. Moreover, market crowd adapt to gain and loss by trading volume increase or decrease significantly in interaction with environment in any two consecutive trading days. Our results demonstrate how interaction between information and news, the trading action, and return outcomes in the three-term feedback loop produces excessive trading volume which includes various internal and external causes.
2023-10-09T00:59:24Z
57 pages, 11 figures, 5 tables
Proceedings of 2013 China Finance Review International Conference, 845-897 (2013)
Leilei Shi, Bing Han, Yingzi Zhu, Liyan Han, Yiwen Wang, Yan Piao

http://arxiv.org/abs/2310.01063v1
Combining Deep Learning and GARCH Models for Financial Volatility and Risk Forecasting
2023-10-02T10:18:13Z
In this paper, we develop a hybrid approach to forecasting the volatility and risk of financial instruments by combining common econometric GARCH time series models with deep learning neural networks. For the latter, we employ Gated Recurrent Unit (GRU) networks, whereas four different specifications are used as the GARCH component: standard GARCH, EGARCH, GJR-GARCH and APARCH. Models are tested using daily logarithmic returns on the S&P 500 index as well as gold and Bitcoin prices, with the three assets representing quite distinct volatility dynamics. As the main volatility estimator, also underlying the target function of our hybrid models, we use the price-range-based Garman-Klass estimator, modified to incorporate the opening and closing prices. Volatility forecasts resulting from the hybrid models are employed to evaluate the assets' risk using the Value-at-Risk (VaR) and Expected Shortfall (ES) at two different tolerance levels of 5% and 1%. Gains from combining the GARCH and GRU approaches are discussed in the contexts of both the volatility and risk forecasts.
In general, it can be concluded that the hybrid solutions produce more accurate point volatility forecasts, although it does not necessarily translate into superior VaR and ES forecasts.
2023-10-02T10:18:13Z
25 pages, 11 figures
Jakub Michańków, Łukasz Kwiatkowski, Janusz Morajda

http://arxiv.org/abs/2310.00490v1
A systematic review of early warning systems in finance
2023-09-30T21:11:59Z
Early warning systems (EWSs) are critical for forecasting and preventing economic and financial crises. EWSs are designed to provide early warning signs of financial troubles, allowing policymakers and market participants to intervene before a crisis expands. The 2008 financial crisis highlighted the importance of detecting financial distress early and taking preventive measures to mitigate its effects. In this bibliometric review, we look at the research and literature on EWSs in finance. Our methodology included a comprehensive examination of academic databases and a stringent selection procedure, which resulted in the final selection of 616 articles published between 1976 and 2023. Our findings show that more than 90% of the papers were published after 2006, indicating the growing importance of EWSs in financial research. According to our findings, recent research has shifted toward machine learning techniques, and EWSs are constantly evolving. We discovered that research in this area could be divided into four categories: bankruptcy prediction, banking crisis, currency crisis and emerging markets, and machine learning forecasting. Each cluster offers distinct insights into the approaches and methodologies used for EWSs. To improve predictive accuracy, our review emphasizes the importance of incorporating both macroeconomic and microeconomic data into EWS models.
To improve their predictive performance, we recommend more research into incorporating alternative data sources into EWS models, such as social media data, news sentiment analysis, and network analysis.
2023-09-30T21:11:59Z
20 pages, 12 figures
Ali Namaki, Reza Eyvazloo, Shahin Ramtinnia

http://arxiv.org/abs/2309.17379v1
Handling missing data in Burundian sovereign bond market
2023-09-29T16:28:18Z
Constructing an accurate yield curve is essential for evaluating financial instruments and analyzing market trends in the bond market. However, in the case of the Burundian sovereign bond market, the presence of missing data poses a significant challenge to accurately constructing the yield curve. In this paper, we explore the limitations and data availability constraints specific to the Burundian sovereign market and propose robust methodologies to effectively handle missing data. The results indicate that the Linear Regression method and the Previous value method perform consistently well across variables, approximating a normal distribution for the error values. The non-parametric Missing Value Imputation using Random Forest (miss-Forest) method performs well for coupon rates but poorly for bond prices, and the Next value method shows mixed results. Ultimately, the Linear Regression (LR) method is recommended for imputing missing data due to its ability to approximate normality and predictive capabilities. However, filling missing values with previous values has high accuracy, thus, it will be the best choice when we have less information to be able to increase accuracy for LR.
This research contributes to the development of financial products, trading strategies, and overall market development in Burundi by improving our understanding of the yield curve dynamics.
2023-09-29T16:28:18Z
Irène Irakoze, Rédempteur Ntawiratsa, David Niyukuri

http://arxiv.org/abs/2309.17322v1
Assessing Look-Ahead Bias in Stock Return Predictions Generated By GPT Sentiment Analysis
2023-09-29T15:30:32Z
Large language models (LLMs), including ChatGPT, can extract profitable trading signals from the sentiment in news text. However, backtesting such strategies poses a challenge because LLMs are trained on many years of data, and backtesting produces biased results if the training and backtesting periods overlap. This bias can take two forms: a look-ahead bias, in which the LLM may have specific knowledge of the stock returns that followed a news article, and a distraction effect, in which general knowledge of the companies named interferes with the measurement of a text's sentiment. We investigate these sources of bias through trading strategies driven by the sentiment of financial news headlines. We compare trading performance based on the original headlines with de-biased strategies in which we remove the relevant company's identifiers from the text. In-sample (within the LLM training window), we find, surprisingly, that the anonymized headlines outperform, indicating that the distraction effect has a greater impact than look-ahead bias. This tendency is particularly strong for larger companies--companies about which we expect an LLM to have greater general knowledge. Out-of-sample, look-ahead bias is not a concern but distraction remains possible.
Our proposed anonymization procedure is therefore potentially useful in out-of-sample implementation, as well as for de-biased backtesting.
2023-09-29T15:30:32Z
17 pages, 2 figures
Paul Glasserman, Caden Lin

http://arxiv.org/abs/2311.05840v1
Predictive AI for SME and Large Enterprise Financial Performance Management
2023-09-22T11:04:32Z
Financial performance management is at the core of business management and has historically relied on financial ratio analysis using Balance Sheet and Income Statement data to assess company performance as compared with competitors. Little progress has been made in predicting how a company will perform or in assessing the risks (probabilities) of financial underperformance. In this study I introduce a new set of financial and macroeconomic ratios that supplement standard ratios of Balance Sheet and Income Statement. I also provide a set of supervised learning models (ML Regressors and Neural Networks) and Bayesian models to predict company performance. I conclude that the new proposed variables improve model accuracy when used in tandem with standard industry ratios. I also conclude that Feedforward Neural Networks (FNN) are simpler to implement and perform best across 6 predictive tasks (ROA, ROE, Net Margin, Op Margin, Cash Ratio and Op Cash Generation); although Bayesian Networks (BN) can outperform FNN under very specific conditions. BNs have the additional benefit of providing a probability density function in addition to the predicted (expected) value. The study findings have significant potential helping CFOs and CEOs assess risks of financial underperformance to steer companies in more profitable directions; supporting lenders in better assessing the condition of a company and providing investors with tools to dissect financial statements of public companies more accurately.
2023-09-22T11:04:32Z
8 pages plus appendix.
Thesis for MSc in AI at QMUL
Ricardo Cuervo

http://arxiv.org/abs/2209.13623v3
Publication Bias in Asset Pricing Research
2023-09-21T14:26:27Z
Researchers are more likely to share notable findings. As a result, published findings tend to overstate the magnitude of real-world phenomena. This bias is a natural concern for asset pricing research, which has found hundreds of return predictors and little consensus on their origins.
Empirical evidence on publication bias comes from large scale meta-studies. Meta-studies of cross-sectional return predictability have settled on four stylized facts that demonstrate publication bias is not a dominant factor: (1) almost all findings can be replicated, (2) predictability persists out-of-sample, (3) empirical $t$-statistics are much larger than 2.0, and (4) predictors are weakly correlated. Each of these facts has been demonstrated in at least three meta-studies.
Empirical Bayes statistics turn these facts into publication bias corrections. Estimates from three meta-studies find that the average correction (shrinkage) accounts for only 10 to 15 percent of in-sample mean returns and that the risk of inference going in the wrong direction (the false discovery rate) is less than 10%.
Meta-studies also find that $t$-statistic hurdles exceed 3.0 in multiple testing algorithms and that returns are 30 to 50 percent weaker in alternative portfolio tests. These facts are easily misinterpreted as evidence of publication bias effects. We clarify these misinterpretations and others, including the conflating of ``mostly false findings'' with ``many insignificant findings,'' ``data snooping'' with ``liquidity effects,'' and ``failed replications'' with ``insignificant ad-hoc trading strategies.''
Meta-studies outside of the cross-sectional literature are rare. The four facts from cross-sectional meta-studies provide a framework for future research. We illustrate with a preliminary re-examination of equity premium predictability.
2022-09-27T18:39:34Z
Andrew Y. Chen, Tom Zimmermann

http://arxiv.org/abs/2309.10546v1
Mean Absolute Directional Loss as a New Loss Function for Machine Learning Problems in Algorithmic Investment Strategies
2023-09-19T11:52:13Z
This paper investigates the issue of an adequate loss function in the optimization of machine learning models used in the forecasting of financial time series for the purpose of algorithmic investment strategies (AIS) construction. We propose the Mean Absolute Directional Loss (MADL) function, solving important problems of classical forecast error functions in extracting information from forecasts to create efficient buy/sell signals in algorithmic investment strategies. Finally, based on the data from two different asset classes (cryptocurrencies: Bitcoin and commodities: Crude Oil), we show that the new loss function enables us to select better hyperparameters for the LSTM model and obtain more efficient investment strategies, with regard to risk-adjusted return metrics on the out-of-sample data.
2023-09-19T11:52:13Z
12 pages, 6 figures
Jakub Michańków, Paweł Sakowski, Robert Ślepaczuk

http://arxiv.org/abs/2309.13064v1
InvestLM: A Large Language Model for Investment using Financial Domain Instruction Tuning
2023-09-15T02:59:31Z
We present a new financial domain large language model, InvestLM, tuned on LLaMA-65B (Touvron et al., 2023), using a carefully curated instruction dataset related to financial investment. Inspired by less-is-more-for-alignment (Zhou et al., 2023), we manually curate a small yet diverse instruction dataset, covering a wide range of financial related topics, from Chartered Financial Analyst (CFA) exam questions to SEC filings to Stackexchange quantitative finance discussions.
InvestLM shows strong capabilities in understanding financial text and provides helpful responses to investment related questions. Financial experts, including hedge fund managers and research analysts, rate InvestLM's response as comparable to those of state-of-the-art commercial models (GPT-3.5, GPT-4 and Claude-2). Zero-shot evaluation on a set of financial NLP benchmarks demonstrates strong generalizability. From a research perspective, this work suggests that a high-quality domain specific LLM can be tuned using a small set of carefully curated instructions on a well-trained foundation model, which is consistent with the Superficial Alignment Hypothesis (Zhou et al., 2023). From a practical perspective, this work develops a state-of-the-art financial domain LLM with superior capability in understanding financial texts and providing helpful investment advice, potentially enhancing the work efficiency of financial professionals. We release the model parameters to the research community.
2023-09-15T02:59:31Z
Link: https://github.com/AbaciNLP/InvestLM
Yi Yang, Yixuan Tang, Kar Yan Tam

http://arxiv.org/abs/2311.10720v1
Cryptocurrency in the Aftermath: Unveiling the Impact of the SVB Collapse
2023-09-15T01:25:58Z
In this paper, we explore the aftermath of the Silicon Valley Bank (SVB) collapse, with a particular focus on its impact on crypto markets. We conduct a multi-dimensional investigation, which includes a factual summary, analysis of user sentiment, and examination of market performance.
Based on such efforts, we uncover a somewhat counterintuitive finding: the SVB collapse did not lead to the destruction of cryptocurrencies; instead, they displayed resilience.
2023-09-15T01:25:58Z
Qin Wang, Guangsheng Yu, Shiping Chen

http://arxiv.org/abs/2311.10717v1
Arguably Adequate Aqueduct Algorithm: Crossing A Bridge-Less Block-Chain Chasm
2023-09-12T03:24:57Z
We consider the problem of being a cross-chain wealth management platform with deposits, redemptions and investment assets across multiple networks. We discuss the need for blockchain bridges to facilitate fund flows across platforms. We point out several issues with existing bridges. We develop an algorithm - tailored to overcome current constraints - that dynamically changes the utilization of bridge capacities and hence the amounts to be transferred across networks. We illustrate several scenarios using numerical simulations.
2023-09-12T03:24:57Z
Finance Research Letters, September 2023, 104421
Ravi Kashyap
10.1016/j.frl.2023.104421

http://arxiv.org/abs/2309.05783v1
A New Framework to Estimate Return on Investment for Player Salaries in the National Basketball Association
2023-09-11T19:30:44Z
The National Basketball Association (NBA) imposes a player salary cap. It is therefore useful to develop tools to measure the relative realized return of a player's salary given their on court performance. Very few such studies exist, however. We thus present the first known framework to estimate a return on investment (ROI) for NBA player contracts.
The framework operates in five parts: (1) decide on a measurement time horizon, such as the standard 82-game NBA regular season; (2) calculate the novel game contribution percentage (GCP) measure we propose, which is a single game summary statistic that sums to unity for each competing team and is comprised of traditional, playtype, hustle, box outs, defensive, tracking, and rebounding per game NBA statistics; (3) estimate the single game value (SGV) of each regular season NBA game using a standard currency conversion calculation; (4) multiply the SGV by the vector of realized GCPs to obtain a series of realized per-player single season cash flows; and (5) use the player salary as an initial investment to perform the traditional ROI calculation. We illustrate our framework by compiling a novel, sharable dataset of per game GCP statistics and salaries for the 2022-2023 NBA regular season. A scatter plot of ROI by salary for all players is presented, including the top and bottom 50 performers. Notably, missed games are treated as defaults because GCP is a per game metric. This allows for break-even calculations between high-performing players with frequent missed games and average performers with few missed games, which we demonstrate with a comparison of the 2023 NBA regular seasons of Anthony Davis and Brook Lopez. We conclude by suggesting uses of our framework, discussing its flexibility through customization, and outlining potential future improvements.
2023-09-11T19:30:44Z
39 pages, 5 figures, 6 tables
Appl Stochastic Models Bus Ind. 41 (2025) e70020
Jackson P. Lautier
10.1002/asmb.70020

http://arxiv.org/abs/2309.05560v1
New News is Bad News
2023-09-11T15:47:50Z
An increase in the novelty of news predicts negative stock market returns and negative macroeconomic outcomes over the next year. We quantify news novelty - changes in the distribution of news text - through an entropy measure, calculated using a recurrent neural network applied to a large news corpus.
Entropy is a better out-of-sample predictor of market returns than a collection of standard measures. Cross-sectional entropy exposure carries a negative risk premium, suggesting that assets that positively covary with entropy hedge the aggregate risk associated with shifting news language. Entropy risk cannot be explained by existing long-short factors.
2023-09-11T15:47:50Z
Paul Glasserman, Harry Mamaysky, Jimmy Qin