Sub-City Real Estate Price Index Forecasting at Weekly Horizons Using Satellite Radar and News Sentiment

2026-02-20T19:25:48Z

Reliable real estate price indicators are typically published at city level and low frequency, limiting their use for neighborhood-scale monitoring and long-horizon planning. We study whether sub-city price indices can be forecast at weekly frequency by combining physical development signals from satellite radar with market narratives from news text. Using over 350,000 transactions from the Dubai Land Department (2015-2025), we construct weekly price indices for 19 sub-city regions and evaluate forecasts from 2 to 34 weeks ahead. Our framework fuses regional transaction history with Sentinel-1 SAR backscatter, news sentiment combining lexical tone and semantic embeddings, and macroeconomic context. Results are strongly horizon dependent: at horizons up to 10 weeks, price history alone matches multimodal configurations, but beyond 14 weeks sentiment and SAR become critical. At long horizons (26-34 weeks), the full multimodal model reduces mean absolute error from 4.48 to 2.93 (35% reduction), with gains statistically significant across regions. Nonparametric learners consistently outperform deep architectures in this data regime. These findings establish benchmarks for weekly sub-city index forecasting and demonstrate that remote sensing and news sentiment materially improve predictability at strategically relevant horizons.
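
As a quick check of the headline figure, the relative error reduction can be recomputed from the two MAE values quoted in the abstract. The helper name `relative_reduction` is an illustrative assumption, not something taken from the paper.

```python
def relative_reduction(baseline_mae: float, model_mae: float) -> float:
    """Fractional reduction in MAE relative to the baseline (0.10 means 10%)."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return (baseline_mae - model_mae) / baseline_mae


# MAE values quoted in the abstract for the 26-34 week horizons.
reduction = relative_reduction(4.48, 2.93)
print(f"{reduction:.1%}")  # 34.6%, which the abstract rounds to 35%
```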

Beyond the Numbers: Causal Effects of Financial Report Sentiment on Bank Profitability

2026-02-19T21:32:59Z

This study establishes the causal effects of market sentiment on firm profitability, moving beyond traditional correlational analyses. It leverages a causal forest machine learning methodology to control for numerous confounding variables, enabling systematic analysis of heterogeneity and non-linearities often overlooked. A key innovation is the use of a pre-trained FinancialBERT to generate sentiment scores from quarterly reports, which are then treated as causal interventions impacting profitability dynamics like returns and volatilities. Utilizing a comprehensive dataset from NEPSE, NRB, and individual financial institutions, the research employs SHAP analysis to identify influential profit predictors. A two-pronged causal analysis further explores how sentiment's impact is conditioned by Loan Portfolio/Asset Composition and Balance Sheet Strength/Leverage. Average Treatment Effect analyses, combined with SHAP insights, reveal statistically significant causal associations between certain balance sheet and expense management variables and profitability. This advanced causal machine learning framework significantly extends existing literature, providing a more robust understanding of how financial sentiment truly impacts firm performance.

Statistical modeling of SOFR term structure

2026-02-17T11:36:35Z

The SOFR derivatives market remains illiquid and incomplete, so it is not amenable to classical risk-neutral term structure models, which assume perfect liquidity and completeness. This paper develops a statistical SOFR term structure model that is well suited for risk management and derivatives pricing within the incomplete markets paradigm. The model incorporates relevant macroeconomic factors that drive central bank policy rates, which in turn cause the jumps often observed in SOFR rates. The model is easy to calibrate to historical data, current market quotes, and the user's views concerning the future development of the relevant macroeconomic factors. It is also well suited for the large-scale simulations often required in risk management, portfolio optimization, and indifference pricing of interest rate derivatives.

Quantum Reservoir Computing for Statistical Classification in a Superconducting Quantum Circuit

2026-02-17T10:29:25Z

We analyze numerically the performance of Quantum Reservoir Computing (QRC) for statistical and financial problems. We use a reservoir composed of two superconducting islands coupled via their charge degrees of freedom. The key non-linear elements that provide the reservoir with rich and complex dynamics are the Josephson junctions that connect each island to the ground. We show that QRC implemented in this circuit can accurately classify complex probability distributions, including those with heavy tails, and identify regimes in correlated time series, such as periods of high volatility generated by standard econometric models. We find QRC to outperform some of the best classical methods when the amount of information is limited. This demonstrates its potential to be a noise-resilient quantum learning approach capable of tackling real-world problems within currently available superconducting platforms. We further discuss how to improve our QRC algorithm in real superconducting hardware to benefit from a much larger Hilbert space.

Broken Symmetry of Stock Returns -- a Modified Jones-Faddy Skew t-Distribution

2026-02-16T16:37:56Z

We argue that the negative skew and positive mean of the distribution of stock returns are largely due to the broken symmetry of the stochastic volatility governing gains and losses. Starting with stochastic differential equations for stock returns and for stochastic volatility, we argue that the distribution of stock returns can be effectively split in two -- for gains and losses -- assuming a difference in the parameters of their respective stochastic volatilities. The modified Jones-Faddy skew t-distribution used here allows this to be reflected in a single organic distribution that tends to meaningfully capture this asymmetry. We illustrate its application to the distribution of daily S&P500 returns, including an analysis of its tails.

Predicting the success of new crypto-tokens: the Pump.fun case

2026-02-16T15:53:13Z

We study the dynamics of tokens launched on Pump.fun, a Solana-based launchpad platform, to identify the determinants of token success. Pump.fun employs a bonding curve mechanism to bootstrap initial liquidity, possibly leading to graduation to the on-chain market, which can be seen as token success. We build predictive models of the probability of graduation conditional on the current amount of Solana locked in the bonding curve and a set of explanatory variables that capture structural and behavioral aspects of the launch process. Conditioning the graduation probability on these variables significantly improves its predictive power, providing insights into early-stage market behavior, speculative and manipulative dynamics, and the informational efficiency of bonding-curve-based token launches.

A Quadratic Link between Out-of-Sample $R^2$ and Directional Accuracy

2026-02-14T20:17:37Z

This study provides a novel perspective on the metric disconnect phenomenon in financial time series forecasting through an analytical link that reconciles the out-of-sample $R^2$ ($R^2_{\text{OOS}}$) and directional accuracy (DA). In particular, using the random walk model as a baseline and assuming that sign correctness is independent of the realized magnitude, we show that these two metrics exhibit a quadratic relationship for MSE-optimal point forecasts. For point forecasts with modest DAs, the theoretical value of $R^2_{\text{OOS}}$ is intrinsically negligible. Thus, a negative empirical $R^2_{\text{OOS}}$ is expected if the model is suboptimal or affected by finite sample noise.
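
The two metrics being linked can be computed as in the minimal sketch below. It assumes that the random walk baseline corresponds to a zero forecast for returns, so the benchmark error is the sum of squared realized returns; the function names and the synthetic data are illustrative assumptions, and the sketch does not reproduce the paper's quadratic formula.

```python
import random


def r2_oos(actual, forecast):
    """Out-of-sample R^2 against a zero-return benchmark (assumed random walk baseline)."""
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    sse_bench = sum(a ** 2 for a in actual)  # benchmark forecast is 0 for every period
    if sse_bench == 0:
        raise ValueError("benchmark error is zero; R^2 is undefined")
    return 1.0 - sse_model / sse_bench


def directional_accuracy(actual, forecast):
    """Share of periods where forecast and realization have the same sign (ties count as misses)."""
    hits = sum(1 for a, f in zip(actual, forecast) if a * f > 0)
    return hits / len(actual)


# Synthetic example: returns are a weak predictable signal plus much larger noise,
# and the forecast is the signal itself.
rng = random.Random(42)
signal = [rng.gauss(0.0, 0.2) for _ in range(5000)]
returns = [s + rng.gauss(0.0, 1.0) for s in signal]

print("R2_OOS:", round(r2_oos(returns, signal), 4))
print("DA:    ", round(directional_accuracy(returns, signal), 4))
```

Varying the signal-to-noise ratio in the synthetic data is one way to see how the two metrics move together.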

The Extremity Premium: Sentiment Regimes and Adverse Selection in Cryptocurrency Markets

2026-02-14T07:04:31Z

Using the Crypto Fear & Greed Index and Bitcoin daily data, we document that sentiment extremity predicts excess uncertainty beyond realized volatility. Extreme fear and extreme greed regimes exhibit significantly higher spreads than neutral periods -- a phenomenon we term the "extremity premium." Extended validation on the full Fear & Greed history (February 2018--January 2026, N = 2,896) confirms the finding: within-volatility-quintile comparisons show a significant premium (p < 0.001, Cohen's d = 0.21), Granger causality from uncertainty to spreads is strong (F = 211), and placebo tests reject the null (p < 0.0001). The effect replicates on Ethereum and across 6 of 7 market cycles. However, the premium is sensitive to functional form: comprehensive regression controls absorb regime effects, while nonparametric stratification preserves them. We interpret this as evidence that sentiment extremity captures volatility-regime interactions not fully represented by parametric controls -- consistent with, but not conclusively separable from, the F&G Index's embedded volatility component. An agent-based model reproduces the pattern qualitatively. The results suggest that intensity, not direction, drives uncertainty-linked liquidity withdrawal in cryptocurrency markets, though identification of "pure" sentiment effects from volatility remains an open challenge.
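
For reference, the effect size reported above can be computed as in the following sketch, which uses the common pooled-standard-deviation form of Cohen's d. Whether the paper uses exactly this variant is an assumption, and the spread values are made up for illustration.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with pooled sample standard deviation (one common definition)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Made-up spreads for days in extreme-sentiment and neutral regimes
# within a single volatility quintile (hypothetical numbers).
extreme_spreads = [0.031, 0.027, 0.035, 0.029, 0.033]
neutral_spreads = [0.028, 0.026, 0.030, 0.027, 0.031]
print(round(cohens_d(extreme_spreads, neutral_spreads), 3))
```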

Same Returns, Different Risks: How Cryptocurrency Markets Process Infrastructure vs Regulatory Shocks

2026-02-14T06:46:52Z

We investigate whether cryptocurrency markets differentiate between infrastructure failures and regulatory enforcement at the return level, complementing a companion conditional variance analysis that finds 5.7 times larger volatility impacts from infrastructure events (p = 0.0008). Using event-level block bootstrap inference on 31 events across Bitcoin, Ethereum, Solana, and Cardano (2019-2025), we find no statistically significant difference in cumulative abnormal returns between infrastructure failures (-7.6%) and regulatory enforcement (-11.1%): the difference of +3.6 pp has p = 0.81 with 95% CI [-25.3%, +30.9%]. This null acquires substantive meaning alongside the companion's highly significant variance result: the same events that produce indistinguishable return responses generate dramatically different volatility signatures. Markets differentiate shock types through the risk channel -- the second moment -- rather than expected returns. The block bootstrap methodology, which resamples entire events to preserve cross-sectional correlation, reveals that prior parametric approaches systematically understate uncertainty by inflating degrees of freedom. Results are robust across eight specifications including permutation tests, leave-one-out analysis, and the Ibragimov-Mueller few-cluster test.
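
The event-level resampling idea can be sketched as follows: each event carries the cumulative abnormal returns (CARs) of all assets, and whole events are drawn with replacement so that the assets hit by the same event stay together. This is a minimal percentile-bootstrap sketch under stated assumptions; the function names, the resampling within each event type, and the CAR values are all hypothetical and not taken from the paper.

```python
import random
from statistics import mean


def group_mean(events):
    """Mean CAR over all asset-level observations in a list of events."""
    return mean(car for event in events for car in event)


def event_block_bootstrap(infra, regul, n_boot=2000, seed=0):
    """Point estimate and 95% percentile CI for the difference in mean CAR.

    Each element of `infra` / `regul` is one event: a list of CARs, one per asset.
    Events are resampled whole, which preserves cross-asset correlation within an event.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        infra_star = rng.choices(infra, k=len(infra))  # draw events with replacement
        regul_star = rng.choices(regul, k=len(regul))
        diffs.append(group_mean(infra_star) - group_mean(regul_star))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return group_mean(infra) - group_mean(regul), (lower, upper)


# Made-up CARs for four assets per event (hypothetical numbers).
infra_events = [[-0.05, -0.09, -0.12, -0.07], [0.02, -0.01, -0.04, 0.01], [-0.20, -0.18, -0.25, -0.22]]
regul_events = [[-0.10, -0.14, -0.09, -0.12], [-0.03, -0.05, -0.02, -0.06], [-0.15, -0.21, -0.19, -0.17]]

estimate, (ci_low, ci_high) = event_block_bootstrap(infra_events, regul_events)
print(f"difference = {estimate:+.3f}, 95% CI = [{ci_low:+.3f}, {ci_high:+.3f}]")
```

Because the unit of resampling is the event rather than the individual asset return, the effective sample size is the number of events, which is why the resulting intervals tend to be wide when events are few.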

Large and Deep Factor Models

2026-02-13T21:59:30Z

We show that a deep neural network (DNN) trained to construct a stochastic discount factor (SDF) admits a sharp additive decomposition that separates nonlinear characteristic discovery from the pricing rule that aggregates them. The economically relevant component of this decomposition is governed by a new object, the Portfolio Tangent Kernel (PTK), which captures the features learned by the network and induces an explicit linear factor pricing representation for the SDF. In population, the PTK-implied SDF converges to a ridge-regularized version of the true SDF, with the effective strength of regularization determined by the spectral complexity of the PTK. Using U.S. equity data, we show that the PTK representation delivers large and statistically significant performance gains, while its spectral complexity has risen sharply, by roughly a factor of six since the early 2000s, imposing increasingly tight limits on finite-sample pricing performance.

Integrating granular data into a multilayer network: an interbank model of the euro area for systemic risk assessment

2026-02-11T15:50:53Z

Micro-structural models of contagion and systemic risk emphasize that shock propagation is inherently multi-channel, spanning counterparty exposures, short-term funding and roll-over risk, securities cross-holdings, and common-asset (fire-sale) spillovers. Empirical implementations, however, often rely on stylized or simulated networks, or focus on a single exposure dimension, reflecting the practical difficulty of reconciling heterogeneous granular collections into a coherent representation with consistent identifiers and consolidation rules. We close part of this gap by constructing an empirically grounded multilayer network for euro area significant banking groups that integrates several supervisory and statistical datasets into layer-consistent exposure matrices defined on a common node set. Each layer corresponds to a distinct transmission channel: long- and short-term credit, securities cross-holdings, short-term secured funding, and overlapping external portfolios. Nodes are enriched with balance-sheet information to support model calibration. We document pronounced cross-layer heterogeneity in connectivity and centrality, and show that an aggregated (flattened) representation can mask economically relevant structure and misidentify the institutions that are systemically important in specific markets. We then illustrate how the resulting network disciplines standard systemic-risk analytics by implementing a centrality-based propagation measure and a micro-structural agent-based framework on real exposures. The approach provides a data-grounded basis for layer-aware systemic-risk assessment and stress testing across multiple dimensions of the banking network.

Predictive AI with External Knowledge Infusion: Datasets and Benchmarks for Stock Markets

2026-02-11T12:36:45Z

Fluctuations in stock prices are influenced by a complex interplay of factors that go beyond mere historical data. These factors, themselves influenced by external forces, encompass inter-stock dynamics, broader economic factors, various government policy decisions, outbreaks of wars, etc. Furthermore, all of these factors are dynamic and exhibit changes over time. In this paper, for the first time, we tackle the forecasting problem under external influence by proposing learning mechanisms that not only learn from historical trends but also incorporate external knowledge from temporal knowledge graphs. Since there are no such datasets or temporal knowledge graphs available, we study this problem with stock market data, and we construct comprehensive temporal knowledge graph datasets. In our proposed approach, we model relations on external temporal knowledge graphs as events of a Hawkes process on graphs. With extensive experiments, we show that learned dynamic representations effectively rank stocks based on returns across multiple holding periods, outperforming related baselines on relevant metrics.
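
To make the Hawkes-process idea concrete, the sketch below evaluates the conditional intensity of a textbook univariate Hawkes process with an exponential kernel, where each past event temporarily raises the rate of future events. The abstract does not specify the paper's on-graph formulation, so the kernel choice, the parameter values, and the function name are assumptions for illustration only.

```python
import math


def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """lambda(t) = mu + sum over past events t_i < t of alpha * exp(-beta * (t - t_i)).

    mu is the baseline rate, alpha the jump per event, beta the decay speed.
    """
    excitation = sum(alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t)
    return mu + excitation


# Hypothetical event timestamps (e.g. times at which a relation appears in the graph).
events = [1.0, 1.5, 4.0]
for t in (0.5, 2.0, 4.1, 10.0):
    print(f"t={t:>4}: intensity={hawkes_intensity(t, events):.4f}")
```

The intensity sits at the baseline before any event, jumps after each event, and decays back toward the baseline when no new events arrive.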

Holistic Multi-Scale Inference of the Leverage Effect: Efficiency under Dependent Microstructure Noise

2026-02-10T07:25:48Z

This paper addresses the long-standing challenge of estimating the leverage effect from high-frequency data contaminated by dependent, non-Gaussian microstructure noise. We depart from the conventional reliance on pre-averaging or volatility "plug-in" methods by introducing a holistic multi-scale framework that operates directly on the leverage effect. We propose two novel estimators: the Subsampling-and-Averaging Leverage Effect (SALE) and the Multi-Scale Leverage Effect (MSLE). Central to our approach is a shifted window technique that constructs a noise-unbiased base estimator, significantly simplifying the multi-scale architecture. We provide a rigorous theoretical foundation for these estimators, establishing central limit theorems and stable convergence results that remain valid under both noise-free and dependent-noise settings. The primary contribution to estimation efficiency is a specifically designed weighting strategy for the MSLE estimator. By optimizing the weights based on the asymptotic covariance structure across scales and incorporating finite-sample variance corrections, we achieve substantial efficiency gains over existing benchmarks. Extensive simulation studies and an empirical analysis of 30 U.S. assets demonstrate that our framework consistently yields smaller estimation errors and superior performance in realistic, noisy market environments.

Reproducing the first and second moments of empirical degree distributions

2026-02-09T16:57:26Z

The study of probabilistic models for the analysis of complex networks represents a flourishing research field. Among these models, Exponential Random Graphs (ERGs) have gained increasing attention over the years. So far, only linear ERGs have been extensively employed to gain insight into the structural organisation of real-world complex networks. None, however, is capable of accounting for the variance of the empirical degree distribution. To this aim, non-linear ERGs must be considered. After showing that the usual mean-field approximation forces the degree-corrected version of the two-star model to degenerate, we define a fitness-induced variant of it. Such a 'softened' model is capable of reproducing the sample variance, while retaining the explanatory power of its linear counterpart, within a purely canonical framework.

NANSDE-Net: A neural SDE framework for generating time series with memory

2026-02-09T00:53:28Z

Modeling time series with long- or short-memory characteristics is a fundamental challenge in many scientific and engineering domains. While fractional Brownian motion has been widely used as a noise source to capture such memory effects, its incompatibility with Itô calculus limits its applicability in neural stochastic differential equation (SDE) frameworks. In this paper, we propose a novel class of noise, termed Neural Network-kernel ARMA-type noise (NA-noise), which is an Itô-process-based alternative capable of capturing both long- and short-memory behaviors. The kernel function defining the noise structure is parameterized via neural networks and decomposed into a product form to preserve the Markov property. Based on this noise process, we develop NANSDE-Net, a generative model that extends Neural SDEs by incorporating NA-noise. We prove the theoretical existence and uniqueness of the solution under mild conditions and derive an efficient backpropagation scheme for training. Empirical results on both synthetic and real-world datasets demonstrate that NANSDE-Net matches or outperforms existing models, including fractional SDE-Net, in reproducing long- and short-memory features of the data, while maintaining computational tractability within the Itô calculus framework.