Detecting Consumers' Financial Vulnerability using Open Banking Data: Evidence from UK Payday Loans

2026-05-07T13:47:53Z

This paper examines whether repeated payday loan use, commonly known as the debt trap, harms borrowers' financial wellbeing. Using Open Banking data from 1,815 UK borrowers observed between 2017 and 2018, we model borrowing intensity using a two-state hidden Markov model (HMM). The HMM outperforms single-regime alternatives and identifies two distinct borrowing patterns: occasional (low-intensity) and persistent (high-intensity) use. Each regime exhibits a characteristic relationship between borrowing intensity and wider transaction behaviour. We translate the decoded state sequence into a practical monitoring rule based on sustained high-intensity exposure. Defining a trigger event as 12 consecutive weeks in the high-intensity regime, we find that 36.4% of borrowers experience at least one such event. Among those who do, high-intensity weeks represent 17.8% of all borrower-week observations on average. Together, these results provide evidence for a persistent high-intensity borrowing pattern and demonstrate that it can serve as a simple, interpretable rule for monitoring prolonged reliance on payday loans.
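The monitoring rule described above can be illustrated with a minimal sketch. It assumes each borrower's decoded HMM path is available as a weekly list of 0/1 labels, with 1 marking the high-intensity regime; this coding and the function names are illustrative assumptions, not the paper's implementation.

```python
from itertools import groupby


def has_trigger_event(states, high=1, min_run=12):
    """True if `states` contains at least `min_run` consecutive `high` weeks."""
    return any(
        state == high and sum(1 for _ in run) >= min_run
        for state, run in groupby(states)
    )


def share_with_trigger(sequences, high=1, min_run=12):
    """Fraction of borrowers whose decoded path contains a trigger event."""
    flagged = [has_trigger_event(seq, high, min_run) for seq in sequences]
    return sum(flagged) / len(flagged)


# Hypothetical decoded paths for three borrowers (not data from the paper).
paths = [
    [0] * 5 + [1] * 12 + [0] * 3,   # one sustained high-intensity spell
    [1] * 11 + [0] + [1] * 11,      # two spells, each one week too short
    [0] * 20,                       # never in the high-intensity regime
]
print(share_with_trigger(paths))
```

Because the rule only counts run lengths in the decoded sequence, it can be audited week by week without reference to the underlying HMM parameters.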

Does social media information affect individual investor disposition effect? Evidence from Xueqiu

2026-05-07T07:52:57Z

The disposition effect is investors' irrational tendency to sell profitable assets too early while holding losing assets for too long. The development of the Internet has greatly improved the information environment for individual investors. Social media is an important source of information for these investors, so whether it can mitigate their behavioral biases and move them back toward rational expectations is a question worth studying. Based on post data and actual trading data from the social investment platform Xueqiu.com, this paper studies the impact of social media information on the disposition effect of individual investors. The results show that social media information significantly reduces the disposition effect, and that this reduction operates through negative information: when presented with negative information, individual investors gradually become more rational in adjusting their positions. At the individual level, investment experience, users followed, region, and gender all influence how effectively the acquired information reduces the disposition effect.

Seeing the Goal, Missing the Truth: Human Accountability for AI Bias

2026-05-06T16:55:41Z

This research explores how human-defined goals influence the behavior of Large Language Models (LLMs) through purpose-conditioned cognition. Using financial prediction tasks, we show that revealing the downstream use (e.g., predicting stock returns or earnings) of LLM outputs leads the LLM to generate biased sentiment and competition measures, even though these measures are intended to be downstream task-independent. Goal-aware prompting shifts these intermediate measures toward the disclosed downstream objective, producing in-sample overfitting. Specifically, purpose leakage improves performance on data prior to the LLM's knowledge cutoff, but provides no advantage after the cutoff. This bias is strong enough that regularization of prompt instructions cannot fully address this form of overfitting. We further show that the bias can arise from users' unintentional conversational context that hints at the purpose. Overall, we document that AI bias due to "seeing the goal" is not an algorithmic flaw, but stems from human accountability in research design.

Deepening the Secondary Market: Integrating Trade Credit into Market Clearing with the Cycles Protocol

2026-05-04T10:34:50Z

Current post-trade clearing systems rely almost exclusively on cash or cash-like collateral, leaving vast reserves of short-term liquidity embedded in trade credit outside formal settlement infrastructures. A key barrier to integrating this liquidity is the near-universal dependence of clearing services on novation, which imposes institutional overhead that restricts accessibility and limits the range of obligations that can be brought into settlement. This paper introduces the Cycles Protocol: a distributed, multilateral clearing mechanism based on double-entry accounting and atomic cycle execution that maximizes balance sheet compression. Unlike novation-based clearing, Cycles does not redistribute counterparty risk; it can thus be applied generally to existing financial networks, without any change in counterparty relations, allowing it to complement existing clearing systems and Central Counterparties (CCPs). By representing commitments as edges on a unified directed graph, Cycles surfaces liquidity hiding within existing network structure. We focus here on two applications of Cycles to deepening secondary market liquidity: first, as a compression layer between existing clearing participants and CCPs; and second, as a means to incorporate the liquidity of the trade credit network into formal settlement, extending market clearing beyond financial obligations and into real-economy financing.
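The idea of netting obligations around a cycle of a directed graph can be sketched as follows. This is a greedy illustration under simplifying assumptions (one obligation per debtor-creditor pair, a single common unit of account); it does not claim to reproduce the Cycles Protocol's maximization procedure, and the names `find_cycle` and `clear_cycles` are hypothetical.

```python
def find_cycle(obligations):
    """Return nodes [v0, ..., vk] of a directed cycle v0->...->vk->v0, or None."""
    graph = {}
    for (debtor, creditor), amount in obligations.items():
        if amount > 0:
            graph.setdefault(debtor, []).append(creditor)

    done = set()  # nodes fully explored without finding a cycle

    def visit(node, path):
        if node in path:
            return path[path.index(node):]
        if node in done:
            return None
        for nxt in graph.get(node, []):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        done.add(node)
        return None

    for start in list(graph):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


def clear_cycles(obligations):
    """Repeatedly net the smallest obligation on a cycle off every edge of it."""
    remaining = dict(obligations)
    while True:
        cycle = find_cycle(remaining)
        if cycle is None:
            return remaining
        hops = list(zip(cycle, cycle[1:] + cycle[:1]))
        netted = min(remaining[hop] for hop in hops)
        for hop in hops:
            remaining[hop] -= netted  # applied to all hops together


# Hypothetical example: A owes B 100, B owes C 70, C owes A 50.
debts = {("A", "B"): 100, ("B", "C"): 70, ("C", "A"): 50}
print(clear_cycles(debts))
```

In this example gross obligations fall from 220 to 70 while every party's net position is unchanged, and no obligation is transferred to a new counterparty.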

Foresight Arena: An On-Chain Benchmark for Evaluating AI Forecasting Agents

2026-05-04T07:21:37Z

Evaluating the true forecasting ability of AI agents requires environments that are resistant to overfitting, free from centralized trust, and grounded in incentive-compatible scoring. Existing benchmarks either rely on static datasets vulnerable to training-data contamination, or measure trading PnL -- a metric conflating predictive accuracy with timing, sizing, and risk appetite. We introduce Foresight Arena, the first permissionless, on-chain benchmark for evaluating AI forecasting agents on real-world prediction markets. Agents submit probabilistic forecasts on binary Polymarket markets via a commit-reveal protocol enforced by Solidity smart contracts on Polygon PoS; outcomes are resolved trustlessly through the Gnosis Conditional Token Framework. Performance is measured by the Brier Score and a novel Alpha Score -- proper scoring rules that incentivize honest probability reporting and isolate predictive edge over market consensus. We provide a formal analysis: closed-form variance for per-market Alpha, the connection to Murphy's classical Brier decomposition, and a power analysis characterizing the number of rounds required to reliably distinguish agents of different skill levels. We show that detecting a true edge of $α^* = 0.02$ at 80% power requires approximately 350 resolved binary predictions (50 rounds of 7 markets), while $α^* = 0.01$ requires four times more. We complement these analytical results with a deterministic, seed-controlled simulation study calibrated to literature-reported Brier-score ranges, illustrating how Murphy decomposition distinguishes well-calibrated agents from market-tracking agents that fail through reduced resolution. Live results from the deployed benchmark will be reported in a future revision. All smart contracts and evaluation infrastructure are open-source.
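As a brief illustration of two quantities mentioned above, the sketch below computes the standard Brier score for binary outcomes and rescales a required sample size. The rescaling assumes the required number of predictions is proportional to 1/α², which is consistent with the stated fourfold increase when the edge is halved but is an assumption here, not the paper's derivation. The Alpha Score is not sketched because its definition is not given in the abstract.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)


def scaled_sample_size(n_ref, edge_ref, edge_new):
    """Rescale a reference sample size, assuming n is proportional to 1 / edge**2."""
    return n_ref * (edge_ref / edge_new) ** 2


# Hypothetical forecasts on four resolved binary markets.
print(brier_score([0.8, 0.3, 0.6, 0.1], [1, 0, 1, 0]))

# Reference point from the abstract: about 350 predictions for an edge of 0.02.
print(scaled_sample_size(350, 0.02, 0.01))
```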

First-passage horizons in horizontal visibility graphs: a rank-invariant estimator of path roughness for rough volatility models

2026-05-03T14:30:54Z

Horizontal visibility graphs (HVGs) encode the ordinal structure of time series and provide graph-local summaries of path topology. This article introduces L+(t), the forward visibility horizon at node t, with finite-sample terminal non-crossings treated as right-censored observations. For paths without ties, each uncensored L+(t) is identical to the first-passage time τ+(t) = inf{k ≥ 1 : x_{t+k} ≥ x_t}. For an i.i.d. sequence with a continuous distribution, the survival law is exactly Pr[L+ ≥ k] = 1/k, equivalent to Rényi's record statistic and implying infinite mean and variance. Hence roughness is estimated on a power-law survival scale through a single tail exponent θ. Combining the identity L+ = τ+ with discrete-grid persistence theory for fractional Brownian motion gives the prediction θ(H) = 1 − H. For rough Bergomi-type volatility, the same prediction is derived under an explicit persistence hypothesis for Riemann–Liouville fBm increments and verified numerically. In Monte-Carlo experiments (N = 10,000, T = 2^16), a Hill-MLE with Clauset–Shalizi–Newman threshold selection recovers θ(H) within one cross-replicate standard deviation for H ≤ 0.2 and reveals a positive finite-size bias for smoother paths. The rank-invariant, parameter-free estimator separates rough Bergomi volatility from classical Heston, GARCH, and FIGARCH benchmarks. Applied to daily FRED VIX data from 2000–2026, the rolling estimate is θ̂ = 0.91 ± 0.19 across 45 four-year windows and lies far below an overlapping-window i.i.d. Monte-Carlo null (p < 0.001). The statistic offers an ordinal diagnostic of roughness for financial volatility and other complex time-series systems.
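The first-passage definition and the 1/k survival law for i.i.d. data can be checked with a short sketch. It computes the smallest k ≥ 1 at which the series returns to or above its level at t, using a monotonic stack, and records nodes with no such crossing as right-censored with lower bound n − t. The function names and the censoring convention are illustrative assumptions; the Hill-MLE step is not reproduced.

```python
import random


def forward_horizons(x):
    """Return (horizon, censored) lists for the first k >= 1 with x[t+k] >= x[t].

    Censored nodes (no crossing before the series ends) carry the lower
    bound n - t in `horizon` and True in `censored`.
    """
    n = len(x)
    horizon = [n - t for t in range(n)]
    censored = [True] * n
    waiting = []  # indices still waiting for a later value >= their own
    for j, value in enumerate(x):
        while waiting and x[waiting[-1]] <= value:
            t = waiting.pop()
            horizon[t] = j - t
            censored[t] = False
        waiting.append(j)
    return horizon, censored


def empirical_survival(horizon, k):
    """Estimate Pr[L+ >= k] using only nodes with at least k - 1 later points."""
    usable = horizon[: len(horizon) - k + 1]
    return sum(h >= k for h in usable) / len(usable)


random.seed(0)
series = [random.random() for _ in range(20000)]  # i.i.d. continuous draws
horizon, _ = forward_horizons(series)
for k in (2, 4, 8):
    print(k, empirical_survival(horizon, k), 1 / k)
```

Since the horizon depends only on the ordering of values, applying any strictly increasing transformation to the series leaves the output unchanged, which is the rank invariance referred to above.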

Identifying Risk Variables From Raw ESG Data Using Its Hierarchical Structure

2026-05-03T04:19:51Z

Environmental, Social, and Governance (ESG) data provides non-financial insights into corporations. In this study, we aim to identify relevant ESG raw variables to assess financial risk, measured by logarithmic volatility of return. We propose a framework specifically designed for ESG datasets characterized by a hierarchical data structure and a significantly larger number of variables than observations. We show that raw variables selected by the proposed framework are significantly more relevant to financial risk than aggregated ESG scores. Furthermore, these selected risk variables provide additional insights beyond the traditional financial factors. We validate the robustness of this framework using out-of-sample data. We illustrate our framework using company data from various sectors of the US economy. We further identify the specific ESG risk variables relevant to large and small companies within each sector.

Leveraging Ensemble-Based Semi-Supervised Learning for Illicit Account Detection in Ethereum DeFi Transactions

2026-05-02T18:00:44Z

The advent of smart contracts has enabled the rapid rise of Decentralized Finance (DeFi) on the Ethereum blockchain, offering substantial rewards in financial innovation and inclusivity. This growth, however, is accompanied by significant security risks such as illicit accounts engaged in fraud. Effective detection is further limited by the scarcity of labeled data and the evolving tactics of malicious accounts. To address these challenges with a robust solution for safeguarding the DeFi ecosystem, we propose $\textbf{SLEID}$, a $\textbf{S}$elf-$\textbf{L}$earning $\textbf{E}$nsemble-based $\textbf{I}$llicit account $\textbf{D}$etection framework. SLEID uses an Isolation Forest model for initial outlier detection and a self-training mechanism to iteratively generate pseudo-labels for unlabeled accounts, enhancing detection accuracy. Experiments on 6,903,860 Ethereum transactions with extensive DeFi interaction coverage demonstrate that SLEID significantly outperforms supervised and semi-supervised baselines with $\textbf{+2.56}$ percentage-point precision, comparable recall, and $\textbf{+0.90}$ percentage-point F1 -- particularly for the minority illicit class -- alongside $\textbf{+3.74}$ percentage-points higher accuracy and improvements in PR-AUC, while substantially reducing reliance on labeled data.

Debiasing LLMs by Fine-tuning

2026-05-02T05:14:42Z

Prior research shows that large language models (LLMs) exhibit systematic extrapolation bias when forming predictions from both experimental and real-world data, and that prompt-based approaches appear limited in alleviating this bias. We propose a supervised fine-tuning (SFT) approach that uses Low-Rank Adaptation (LoRA) to train off-the-shelf LLMs on instruction datasets constructed from rational benchmark forecasts. By intervening at the parameter level, SFT changes how LLMs map observed information into forecasts and thereby mitigates extrapolation bias. We evaluate the fine-tuned model in two settings: controlled forecasting experiments and cross-sectional stock return prediction. In both settings, fine-tuning corrects the extrapolative bias out-of-sample, establishing a low-cost and generalizable method for debiasing LLMs.

Information Leakage at Population Scale: An Evaluation of the Polymarket Insider-Relevant Subpopulation, 2020-2026

2026-05-01T06:45:23Z

We carry the deadline-resolved Information Leakage Score (ILS-dl) framework of Nechepurenko (2026a, 2026b) from a single-case proof of concept to a population-scale evaluation across 12,708 Polymarket markets, October 2020 to April 2026. We frame the paper as a scope-discovery study: scaling reveals that the framework's effective domain is materially narrower than initial framing suggested, and the principal obstacle is not score computation but resolution semantics. We report four findings. First, only 88 of 12,708 candidate markets (0.7%) yield computable ILS-dl values; only 1 of 32 markets in the ForesightFlow Insider Cases (FFIC) inventory is in scope, and 14 of 32 FFIC markets are flagged unclassifiable due to genuine resolution-criterion ambiguity. Second, only 12 of the 88 computed markets (13.6%) satisfy anchor-sensitivity, and an independent-second-pass T_event validation reaches 57.8% exact-date agreement, below the 90% ex-ante criterion. Third, raw ILS-dl medians are negative across all six (sub-bucket by period) cells, but a hazard-decay baseline correction we introduce yields a heterogeneous result: regulatory_formal post-2024 shifts to near-zero (-0.21 to -0.02), while regulatory_announcement post-2024 retains a 95% bootstrap CI entirely below zero. Fourth, the constant-hazard exponential of Nechepurenko (2026b) is rejected in favor of Weibull on the pooled post-2024 cell, but a per-subcategory check confirms the preference reflects category mixture rather than within-cell duration dependence. The implication is that detection of informed flow requires methodological refinement on the resolution-typology and score-baseline axes, not only on the score-computation axis where prior work concentrated.

The Satoshi Overhang: Why the Bear Case is Bounded

2026-04-30T10:37:02Z

Renewed public attention on the identity of Bitcoin's pseudonymous creator has sharpened focus on the Satoshi overhang, commonly framed as a tail risk for bitcoin. This paper argues that the mechanical downside of a disposition is bounded well below the existential-loss framing, and that the terminal states most consistent with sixteen years of holder behavior are nonbearish for bitcoin's effective supply. The approximately 1.148 million BTC Patoshi position is analyzed on two tracks. For a purely wealth-maximizing holder, a three-scenario quantitative analysis (Appendix A) shows that bitcoin's current market depth is sufficient to absorb a patient multi-year liquidation at a cumulative price impact in the mid-single-digit to mid-double-digit percent range relative to counterfactual, with the central scenario clustering near 10 percent. The paper maps a decision space rather than identifying a unique modal outcome, assuming a holder whose profile is consistent with the sixteen-year record. Preference sets consistent with the record, including ideological non-intervention, privacy above all, satisficing, and myth preservation, favor continued dormancy terminating in a cryptographically enforced nonrecovery or destruction arrangement; preference sets favoring adversarial or wealth-maximizing action are possible but less supported. Across the plausible region of the decision space, the bear case is bounded and the terminal states most consistent with observed behavior are neutral to slightly positive for bitcoin's effective supply.

From Hypotheses to Factors: Constrained LLM Agents in Cryptocurrency Markets

2026-04-29T14:46:10Z

LLM agents are promising tools for empirical discovery, but their flexibility can also turn discovery into uncontrolled search. We study how to use agents under a reproducible protocol through cryptocurrency factor discovery. Our framework casts the task as sequential hypothesis search: an agent reads an append-only experiment trace, proposes falsifiable factor hypotheses, and maps them to executable recipes, while a deterministic engine enforces fixed data splits, selection gates, transaction costs, and portfolio tests. Candidate actions are restricted to a point-in-time factor DSL, making both successful and failed hypotheses auditable. A ridge-combined portfolio trained only on 2020--2022 data achieves a 44.55% annualized return and Sharpe ratio of 1.55 in the 2024--2026 pure out-of-sample period after a 5 basis point one-way trading cost.

Non-unique time and market incompleteness

2026-04-28T10:54:14Z

Financial markets are often modelled as if time were unique and continuous across assets and markets. Financial markets are, however, asynchronous: order flow is event-driven, and waiting times between events are often random. Many of the most influential formulations of financial market models presuppose a unique global calendar time and advocate for one preferred latent continuous-time price system or another. Here we critically contrast these assumptions with event-time, renewal, point-process, and order-flow descriptions. We revisit no-arbitrage, no-dynamic-arbitrage, and risk-neutral option pricing in settings where the market is represented as a discrete event system and where the continuum limit of a discrete-time random walk need not be unique. The central suggestion is then that such non-uniqueness points to a more foundational form of market incompleteness than is usually emphasized. This highlights the importance of operational time at the level of decision making but reminds market practitioners that managing risk itself often requires reconciling operational time with a global calendar time. At these longer time scales, forms of effective or average completeness may still emerge and remain useful for portfolio construction and risk management, even if high-frequency hedging and execution expose a clock mismatch between trading, pricing, and longer-horizon allocation.

Cross-Stock Predictability via LLM-Augmented Semantic Networks

2026-04-26T05:10:37Z

Text-based financial networks are increasingly used to study cross-stock return predictability. A common approach constructs links from similarities in firms' disclosure embeddings, but such networks often contain spurious edges because textual proximity does not necessarily imply economic connection. We propose a two-stage framework that first builds a sparse candidate graph from 10-K embeddings and then uses a large language model to classify and filter candidate edges according to their economic relations. The refined graph is used to aggregate pair-level mean-reversion signals into stock-level trading signals with relation-aware and distance-based weights. In a backtest on S&P 500 constituents from 2011 to 2019, LLM-based edge filtering improves the long-short Sharpe ratio from 0.742 to 0.820 and reduces maximum drawdown from $-$10.47% to $-$7.85%. These results suggest that LLM-based reasoning can improve the economic fidelity of text-derived financial networks and strengthen cross-stock predictability.
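For readers unfamiliar with the two backtest metrics quoted above, the following sketch shows one common way to compute them from a series of periodic returns. It assumes a zero risk-free rate, simple (not log) returns, and monthly periods; these conventions are assumptions here and may differ from those used in the paper.

```python
import math


def sharpe_ratio(returns, periods_per_year=12):
    """Annualized mean return divided by annualized sample standard deviation."""
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough decline of compounded wealth, as a negative number."""
    wealth = peak = 1.0
    worst = 0.0
    for r in returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1)
    return worst


# Hypothetical monthly long-short returns (not data from the paper).
monthly = [0.02, -0.01, 0.03, -0.04, 0.01, 0.02]
print(sharpe_ratio(monthly), max_drawdown(monthly))
```

Reporting drawdown as a negative number matches the sign convention of the figures quoted in the abstract.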

Robust dividend policy: Equivalence of Epstein-Zin and Maenhout preferences

2026-04-24T16:05:45Z

In a continuous-time economy, this paper formulates the Epstein-Zin preference for discounted dividends received by an investor as an Epstein-Zin singular control utility. We introduce a backward stochastic differential equation with an aggregator integrated with respect to a singular control, prove its well-posedness, and show that it coincides with the Epstein-Zin singular control utility. We then establish that this formulation is equivalent to a robust dividend policy chosen by the firm's executive under Maenhout's ambiguity-averse preference. In particular, the robust dividend policy takes the form of a threshold strategy on the firm's surplus process, where the threshold level is characterized as the free boundary of a Hamilton-Jacobi-Bellman variational inequality. Therefore, dividend-caring investors can choose firms that match their preferences by examining stocks' dividend policies and financial statements, whereas executives can use dividends to signal their confidence, in the form of ambiguity aversion, in realizing the earnings implied by their financial statements.