https://arxiv.org/api/+jIlo2N5cb8lr1IaLOw8/coZb4I
2026-04-09T08:31:52Z

http://arxiv.org/abs/2507.20494v1
Deep Reputation Scoring in DeFi: zScore-Based Wallet Ranking from Liquidity and Trading Signals
Updated: 2025-07-28T03:12:27Z
As decentralized finance (DeFi) evolves, distinguishing between user behaviors - liquidity provision versus active trading - has become vital for risk modeling and on-chain reputation. We propose a behavioral scoring framework for Uniswap that assigns two complementary scores: a Liquidity Provision Score that assesses strategic liquidity contributions, and a Swap Behavior Score that reflects trading intent, volatility exposure, and discipline. The scores are constructed using rule-based blueprints that decompose behavior into volume, frequency, holding time, and withdrawal patterns. To handle edge cases and learn feature interactions, we introduce a deep residual neural network with densely connected skip blocks inspired by the U-Net architecture. We also incorporate pool-level context such as total value locked (TVL), fee tiers, and pool size, allowing the system to differentiate similar user behaviors across pools with varying characteristics. Our framework enables context-aware and scalable DeFi user scoring, supporting improved risk assessment and incentive design. Experiments on Uniswap v3 data show its usefulness for user segmentation and protocol-aligned reputation systems. Although we refer to our metric as zScore, it is independently developed and methodologically different from the cross-protocol system proposed by Udupi et al. Our focus is on role-specific behavioral modeling within Uniswap using blueprint logic and supervised learning.
Published: 2025-07-28T03:12:27Z
Comments: 10 pages, 5 figures. Independently developed system by Zeru Finance for decentralized user scoring.
Not submitted to any conference or journal
Authors: Dhanashekar Kandaswamy, Ashutosh Sahoo, Akshay SP, Gurukiran S, Parag Paul, Girish G N

http://arxiv.org/abs/2508.06500v1
Negative redispatch power for green hydrogen production: Game changer or lame duck? A German perspective
Updated: 2025-07-25T10:57:19Z
Following years of controversial discussions about the risks of market-based redispatch, the German transmission network operators finally installed regional redispatch markets by the end of 2024. Since water electrolysers are eligible market participants, the otherwise downwards redispatched renewable energy can be used for green hydrogen production in compliance with European law. To show how different price levels in regional redispatch markets affect green hydrogen production cost and thus the incentive for electrolyser market participation, we use historical redispatch time series and evaluate various power purchase scenarios. Our results show that low price levels can lead to notable production cost reductions, potentially counteracting uncertainties in redispatch power availability and thus incentivising system-beneficial electrolyser siting. In contrast, the possibility of high price levels can nullify an increase in the competitiveness of German and European green hydrogen through production cost reductions and discourage market participation.
Published: 2025-07-25T10:57:19Z
Authors: Jonathan Brandt, Astrid Bensmann, Richard Hanke-Rauschenbach

http://arxiv.org/abs/2507.22932v1
FinMarBa: A Market-Informed Dataset for Financial Sentiment Classification
Updated: 2025-07-24T16:27:32Z
This paper presents a novel hierarchical framework for portfolio optimization, integrating lightweight Large Language Models (LLMs) with Deep Reinforcement Learning (DRL) to combine sentiment signals from financial news with traditional market indicators. Our three-tier architecture employs base RL agents to process hybrid data, meta-agents to aggregate their decisions, and a super-agent to merge decisions based on market data and sentiment analysis.
Evaluated on data from 2018 to 2024, after training on 2000-2017, the framework achieves a 26% annualized return and a Sharpe ratio of 1.2, outperforming equal-weighted and S&P 500 benchmarks. Key contributions include scalable cross-modal integration, a hierarchical RL structure for enhanced stability, and open-source reproducibility.
Published: 2025-07-24T16:27:32Z
Comments: 8 pages
Authors: Baptiste Lefort, Eric Benhamou, Beatrice Guez, Jean-Jacques Ohana, Ethan Setrouk, Alban Etienne

http://arxiv.org/abs/2507.17624v1
Homeownership as Life Cycle Goldmine: Evidence from Macrohistory
Updated: 2025-07-23T15:53:02Z
Should a household buy a home? Using data from 16 developed countries spanning 1870 to 2020, this study provides a resounding affirmative answer. Contrary to popular expert advice, homeownership enhances life cycle wealth by up to 9% and welfare by up to 23%, compared to an all-equity investment strategy. Homeownership reduces wealth portfolio risk and improves wealth equality, though it comes at the cost of lower working-life wealth and curtailed financial asset holdings. Gains are heterogeneous: Low-income (high-income) households gain more in wealth (welfare), and home purchase during periods of moderately low interest rates and high housing prices maximizes these benefits.
Published: 2025-07-23T15:53:02Z
Authors: Yang Bai, Shize Li, Jialu Shen

http://arxiv.org/abs/2108.02283v2
Machine Learning Classification and Portfolio Allocation: with Implications from Machine Uncertainty
Updated: 2025-07-23T15:52:55Z
We use multi-class machine learning classifiers to identify the stocks that outperform or underperform other stocks. The resulting long-short portfolios achieve annual Sharpe ratios of 1.67 (value-weighted) and 3.35 (equal-weighted), with annual alphas ranging from 29% to 48%. These results persist after controlling for machine learning regressions and remain robust among large-cap stocks. Machine uncertainty, as measured by predicted probabilities, impairs the prediction performance.
Stocks with higher machine uncertainty experience lower returns, particularly when human proxies of information uncertainty align with machine uncertainty. Consistent with the literature, such an effect is driven by the past underperformers.
Published: 2021-08-04T20:48:27Z
Authors: Yang Bai, Kuntara Pukthuanthong

http://arxiv.org/abs/2507.07296v1
Time Series Foundation Models for Multivariate Financial Time Series Forecasting
Updated: 2025-07-09T21:43:06Z
Financial time series forecasting presents significant challenges due to complex nonlinear relationships, temporal dependencies, variable interdependencies and limited data availability, particularly for tasks involving low-frequency data, newly listed instruments, or emerging market assets. Time Series Foundation Models (TSFMs) offer a promising solution through pretraining on diverse time series corpora followed by task-specific adaptation. This study evaluates two TSFMs (Tiny Time Mixers (TTM) and Chronos) across three financial forecasting tasks: US 10-year Treasury yield changes, EUR/USD volatility, and equity spread prediction. Results demonstrate that TTM exhibits strong transferability. When fine-tuning both the pretrained version of TTM and an untrained model with the same architecture, the pretrained version achieved 25-50% better performance when fine-tuned on limited data and 15-30% improvements even when fine-tuned on lengthier datasets. Notably, TTM's zero-shot performance outperformed naive benchmarks in volatility forecasting and equity spread prediction, with the latter demonstrating that TSFMs can surpass traditional benchmark models without fine-tuning. The pretrained model consistently required 3-10 fewer years of data than the untrained model to reach comparable performance, demonstrating significant sample-efficiency gains.
However, while TTM outperformed naive baselines, traditional specialised models matched or exceeded its performance in two of three tasks, suggesting TSFMs prioritise breadth over task-specific optimisation. These findings indicate that TSFMs, though still nascent, offer substantial promise for financial forecasting, particularly in noisy, data-constrained tasks, but achieving competitive performance likely requires domain-specific pretraining and architectural refinements tailored to financial time series characteristics.
Published: 2025-07-09T21:43:06Z
Comments: 66 pages
Authors: Ben A. Marconi

http://arxiv.org/abs/2502.21206v3
Chronologically Consistent Large Language Models
Updated: 2025-07-06T01:02:58Z
Large language models are increasingly used in social sciences, but their training data can introduce lookahead bias and training leakage. A good chronologically consistent language model requires efficient use of training data to maintain accuracy despite time-restricted data. Here, we overcome this challenge by training a suite of chronologically consistent large language models, ChronoBERT and ChronoGPT, which incorporate only the text data that would have been available at each point in time. Despite this strict temporal constraint, our models achieve strong performance on natural language processing benchmarks, outperforming or matching widely used models (e.g., BERT), and remain competitive with larger open-weight models. Lookahead bias is model- and application-specific because even if a chronologically consistent language model has poorer language comprehension, a regression or prediction model applied on top of the language model can compensate. In an asset pricing application predicting next-day stock returns from financial news, we find that ChronoBERT and ChronoGPT's real-time outputs achieve Sharpe ratios comparable to a much larger Llama model, indicating that lookahead bias is modest.
Our results demonstrate a scalable, practical framework to mitigate training leakage, ensuring more credible backtests and predictions across finance and other social science domains.
Published: 2025-02-28T16:25:50Z
Authors: Songrun He, Linying Lv, Asaf Manela, Jimmy Wu

http://arxiv.org/abs/2504.07929v3
Market-Based Portfolio Variance
Updated: 2025-07-05T11:43:01Z
Variance measures the risk that investors take on with a portfolio. An investor who holds a portfolio without trading its shares can, at the current time, use the time series of market trades made in the portfolio's securities during the averaging interval to assess the portfolio's current return and variance, and hence its current risk. We show how the time series of trades in the portfolio's securities determines the time series of trades in the portfolio treated as a single market security. The time series of trades in the portfolio determines its return and variance in the same form as the time series of trades in individual securities determine their returns and variances. Any portfolio is therefore described in the same way as any single market security. The time series of portfolio trades defines the decomposition of the portfolio variance by its securities, which is a quadratic form in the relative amounts invested in the securities. Its coefficients are themselves quadratic forms in the relative numbers of shares of those securities. If one assumes that the volumes of all consecutive trades in each security are constant, the decomposition of the portfolio variance coincides with Markowitz's (1952) variance, which ignores the effects of random trade volumes. Using a variance that accounts for the randomness of trade volumes could help major institutions like BlackRock, JP Morgan, and the U.S.
Fed to adjust their models, like Aladdin and Azimov, to the reality of random markets.
Published: 2025-04-10T17:44:57Z
Comments: 17 pages
Authors: Victor Olkhov

http://arxiv.org/abs/2404.11745v5
Piercing the Veil of TVL: DeFi Reappraised
Updated: 2025-07-04T09:52:20Z
Total value locked (TVL) is widely used to measure the size and popularity of decentralized finance (DeFi). However, TVL can be easily manipulated and inflated through "double counting" activities such as wrapping and leveraging. As existing methodologies addressing double counting are inconsistent and flawed, we propose a new framework, termed "total value redeemable (TVR)", to assess the true underlying value of DeFi. Our formal analysis reveals how DeFi's complex network spreads financial contagion via derivative tokens, increasing TVL's sensitivity to external shocks. To quantify double counting, we construct the DeFi multiplier, which mirrors the money multiplier in traditional finance (TradFi). This measurement reveals substantial double counting in DeFi, finding that the gap between TVL and TVR reached $139.87 billion during the peak of DeFi activity on December 2, 2021, with a TVL-to-TVR ratio of approximately 2. We conduct sensitivity tests to evaluate the stability of TVL compared to TVR, demonstrating that the former is significantly less stable than the latter, especially during market downturns: A 25% decline in the price of Ether (ETH) leads to a $1 billion greater non-linear decrease in TVL compared to TVR via the liquidations triggered by derivative tokens. We also document that the DeFi money multiplier is positively correlated with crypto market indicators and negatively correlated with macroeconomic indicators. Overall, our findings suggest that TVR is more reliable and stable than TVL.
Published: 2024-04-17T20:51:54Z
Authors: Yichen Luo, Yebo Feng, Jiahua Xu, Paolo Tasca

http://arxiv.org/abs/2507.03233v1
Economic Policy Taxonomy
Updated: 2025-07-04T00:22:22Z
This paper proposes a framework for categorizing economic policies in the form of a tree taxonomy.
The purpose of this approach is to construct an exhaustive and standardized list of actions that a governing authority has access to and can change to control an economy. This is advantageous from two perspectives: by having an exhaustive list of tools, it becomes easier to construct "complete" models (i.e., models that take in all empirical data and aim to simulate economic dynamics) of an economy and understand what the assumptions of these models are; and by knowing all available actions, economic strategies can be devised that target specific economic performance metrics with an exhaustive list of policies.
Published: 2025-07-04T00:22:22Z
Comments: 38 pages, 9 figures, 9 tables
Authors: Rem Sadykhov, Geoff Goodell, Philip Treleaven

http://arxiv.org/abs/2507.01995v1
Fair sharing ratios of Profit and Loss sharing contracts
Updated: 2025-06-30T13:11:19Z
We consider the Islamic Profit and Loss (PL) sharing contract, possibly combined with an agency contract, and introduce the notion of $c$-fair profit sharing ratios ($c = (c_1, \ldots, c_d) \in (\mathbb{R}^{\star})^d$, where $d$ is the number of partners), which aims to determine both the profit sharing ratios and the induced expected maturity payoffs of each partner $\ell$ according to its contribution, determined by the rate component $c_{\ell}$ of the vector $c$, to the global success of the project. We show several new results that elucidate the relation between these profit sharing ratios and various important economic factors such as the investment risk, the labor and the capital, thereby giving a way of choosing them in connection with the real economy.
The design of our approach allows the use of the full range of econometric models or more general stochastic diffusion models to compute or approximate the quantities of interest.
Published: 2025-06-30T13:11:19Z
Authors: Abass Sagna

http://arxiv.org/abs/2507.01991v1
FinAI-BERT: A Transformer-Based Model for Sentence-Level Detection of AI Disclosures in Financial Reports
Updated: 2025-06-29T09:33:29Z
The proliferation of artificial intelligence (AI) in financial services has prompted growing demand for tools that can systematically detect AI-related disclosures in corporate filings. While prior approaches often rely on keyword expansion or document-level classification, they fall short in granularity, interpretability, and robustness. This study introduces FinAI-BERT, a domain-adapted transformer-based language model designed to classify AI-related content at the sentence level within financial texts. The model was fine-tuned on a manually curated and balanced dataset of 1,586 sentences drawn from 669 annual reports of U.S. banks (2015 to 2023). FinAI-BERT achieved near-perfect classification performance (accuracy of 99.37 percent, F1 score of 0.993), outperforming traditional baselines such as Logistic Regression, Naive Bayes, Random Forest, and XGBoost. Interpretability was ensured through SHAP-based token attribution, while bias analysis and robustness checks confirmed the model's stability across sentence lengths, adversarial inputs, and temporal samples. Theoretically, the study advances financial NLP by operationalizing fine-grained, theme-specific classification using transformer architectures.
Practically, it offers a scalable, transparent solution for analysts, regulators, and scholars seeking to monitor the diffusion and framing of AI across financial institutions.
Published: 2025-06-29T09:33:29Z
Comments: The FinAI-BERT model can be directly loaded via Hugging Face Transformers (https://huggingface.co/bilalzafar/FinAI-BERT) for sentence-level AI disclosure classification
Authors: Muhammad Bilal Zafar

http://arxiv.org/abs/2507.01990v1
Integrating Large Language Models in Financial Investments and Market Analysis: A Survey
Updated: 2025-06-29T05:25:31Z
Large Language Models (LLMs) have been employed in financial decision making, enhancing analytical capabilities for investment strategies. Traditional investment strategies often utilize quantitative models, fundamental analysis, and technical indicators. However, LLMs have introduced new capabilities to process and analyze large volumes of structured and unstructured data, extract meaningful insights, and enhance decision-making in real-time. This survey provides a structured overview of recent research on LLMs within the financial domain, categorizing research contributions into four main frameworks: LLM-based Frameworks and Pipelines, Hybrid Integration Methods, Fine-Tuning and Adaptation Approaches, and Agent-Based Architectures. This study provides a structured review of recent LLM research on applications in stock selection, risk assessment, sentiment analysis, trading, and financial forecasting. By reviewing the existing literature, this study highlights the capabilities, challenges, and potential directions of LLMs in financial markets.
Published: 2025-06-29T05:25:31Z
Authors: Sedigheh Mahdavi, Kristin Jiating, Kristin Chen, Pradeep Kumar Joshi, Lina Huertas Guativa, Upmanyu Singh

http://arxiv.org/abs/2507.01987v1
Predicting and Explaining Customer Data Sharing in the Open Banking
Updated: 2025-06-28T01:24:59Z
The emergence of Open Banking represents a significant shift in financial data management, influencing financial institutions' market dynamics and marketing strategies.
This increased competition creates opportunities and challenges, as institutions manage data inflow to improve products and services while mitigating data outflow that could aid competitors. This study introduces a framework to predict customers' propensity to share data via Open Banking and interprets this behavior through Explanatory Model Analysis (EMA). Using data from a large Brazilian financial institution with approximately 3.2 million customers, a hybrid data balancing strategy incorporating ADASYN and NEARMISS techniques was employed to address the infrequency of data sharing and enhance the training of XGBoost models. These models accurately predicted customer data sharing, achieving 91.39% accuracy for inflow and 91.53% for outflow. The EMA phase combined the Shapley Additive Explanations (SHAP) method with the Classification and Regression Tree (CART) technique, revealing the most influential features on customer decisions. Key features included the number of transactions and purchases in mobile channels, interactions within these channels, and credit-related features, particularly credit card usage across the national banking system. These results highlight the critical role of mobile engagement and credit in driving customer data-sharing behaviors, providing financial institutions with strategic insights to enhance competitiveness and innovation in the Open Banking environment.
Published: 2025-06-28T01:24:59Z
Authors: João B. G. de Brito, Rodrigo Heldt, Cleo S. Silveira, Matthias Bogaert, Guilherme B. Bucco, Fernando B. Luce, João L. Becker, Filipe J. Zabala, Michel J. Anzanello

http://arxiv.org/abs/2505.10370v2
Optimal Post-Hoc Theorizing
Updated: 2025-06-27T20:11:32Z
For many economic questions, the empirical results are not interesting unless they are strong. For these questions, theorizing before the results are known is not always optimal.
Instead, the optimal sequencing of theory and empirics trades off a "Darwinian Learning" effect from theorizing first with a "Statistical Learning" effect from examining the data first. This short paper formalizes the tradeoff in a Bayesian model. In the modern era of mature economic theory and enormous datasets, I argue that post hoc theorizing is typically optimal.
Published: 2025-05-15T14:56:03Z
Authors: Andrew Y. Chen