Macro Economists in the Machine: A Multi-Agent LLM Framework for Commodity-Related ETF Portfolio Construction

2026-06-06T18:07:28Z

We test whether large language models (LLMs) add value in commodity portfolio construction when the information set and implementation rules are held fixed across strategies. A Hawkish Agent (inflation-tightening prior), a Dovish Agent (growth-easing prior), a Debate Agent, and a deterministic z-score Rule Agent each receive identical FRED macro z-scores and route their tilt signals through the same portfolio engine. Across 124 weekly rebalancing dates spanning the 2023 U.S. rate peak and the 2024-2025 soft landing, all three LLM strategies outperform the Rule Agent in Sharpe terms; the Hawkish and Debate Agents record the largest gains (ΔSharpe = +0.044 and +0.040, both p < 0.10 under a block bootstrap) and preserve a net-of-cost advantage over the passive inverse-volatility benchmark at one-way trading costs up to 30 basis points, while the Rule Agent's thin margin over passive disappears at approximately 5 basis points. The Debate Agent does not outperform the best single agent (ΔSharpe = -0.004, p = 0.769); its contribution is bias correction -- averaging out the Dovish Agent's miscalibrated prior -- rather than deliberation-generated return. The performance advantage is concentrated in the soft-landing sub-period, the evaluation window spans a single rate cycle, and the reported p-values are unadjusted for multiple comparisons. Within these limits, the results suggest that an LLM acting as a constrained macro-interpretation function can add modest but economically meaningful value over a transparent rule layer, though the margin is small and its persistence beyond this sample is unknown.
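
The passive inverse-volatility benchmark and the net-of-cost comparison can be illustrated with a minimal sketch. It is not the paper's portfolio engine: the function names, the definition of turnover as the one-way traded fraction per rebalance, and the annualisation factor of 52 are assumptions made for illustration.

```python
import numpy as np

def inverse_vol_weights(returns: np.ndarray) -> np.ndarray:
    """Weights proportional to 1/sigma_i, normalised to sum to one.

    `returns` has shape (periods, assets).
    """
    vol = returns.std(axis=0, ddof=1)
    inv = 1.0 / vol
    return inv / inv.sum()

def net_returns(gross: np.ndarray, turnover: np.ndarray, cost_bps: float) -> np.ndarray:
    """Subtract one-way trading costs (assumed: cost = turnover * bps / 10,000)."""
    return gross - turnover * cost_bps / 1e4

def sharpe(returns: np.ndarray, periods_per_year: int = 52) -> float:
    """Annualised Sharpe ratio with a zero risk-free rate (an assumption)."""
    return float(np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1))

# Hypothetical two-asset history: the second asset is twice as volatile.
history = np.array([[0.01, 0.02], [-0.01, -0.02], [0.01, 0.02], [-0.01, -0.02]])
weights = inverse_vol_weights(history)

# A 1% gross weekly return with 50% turnover at 30 bps one-way cost.
net = net_returns(np.array([0.01]), np.array([0.5]), cost_bps=30.0)
```

Raising `cost_bps` until a strategy's net Sharpe ratio falls to the benchmark's gives a break-even cost of the kind the abstract reports.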

Hour-Aware Adaptive Risk Management for Autonomous Memecoin Trading: A Multi-Layer Intelligence Framework

2026-06-06T15:40:05Z

This paper measures hour-of-day effects, filter precision, fragility, and realised yield in a 15-day paper-traded deployment of an autonomous memecoin trading system on Solana decentralised exchanges. The 190-trade sample (March 29 to April 12, 2026) shows a 40.5 percent win rate, mean per-trade return of +0.62 percent, cumulative +117.7 percent (net SOL +0.039), skewness -1.21, excess kurtosis 6.61. A Mann-Whitney U test of the three poorest-performing UTC hours (2, 13, 23) against the others yields U = 1,274, p = 0.22; the effect is directional but not significant at n = 190. The three hours were selected in-sample, so the comparison is exploratory, not confirmatory. A parallel counterfactual rejection-tracking system collected 4,874 forward-sample observations across 184 distinct rejection events. Of those events, 17.9 percent reached a 50 percent drawdown from reference within 24 hours; 26.0 percent of forward samples recorded the rejected token below half-reference. The filter stack avoided these realised drawdowns, which is evidence that the rejection criteria are net-positive against forward-market outcomes. Fragility is the principal caveat. Removing the top three trades (1.6 percent of sample) turns the cumulative return negative. Profitability rests on a small number of large winners and is structurally fragile. The dataset and audit script are deposited under CC-BY-4.0 (Zenodo DOI 10.5281/zenodo.20043302).
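
The fragility check described above (drop the largest winners and recompute the cumulative return) can be sketched as follows. The trade list is hypothetical, not the deposited dataset. The cumulative return is taken as the plain sum of per-trade percentage returns, an assumption that matches the reported figures (190 trades at a mean of +0.62 percent is roughly +117.7 percent).

```python
def cumulative_without_top_k(returns_pct: list[float], k: int) -> float:
    """Sum of per-trade returns after removing the k largest trades."""
    if not 0 <= k <= len(returns_pct):
        raise ValueError("k must be between 0 and the number of trades")
    kept = sorted(returns_pct)[: len(returns_pct) - k]
    return sum(kept)

# Hypothetical sample: three large winners and one hundred small losers.
trades = [50.0, 40.0, 30.0] + [-1.0] * 100

full_sample = cumulative_without_top_k(trades, 0)
ex_top_three = cumulative_without_top_k(trades, 3)
```

In this toy sample the full-sample total is positive while the total excluding the top three trades is negative, which is the pattern the abstract calls structurally fragile.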

Post-Rejection Follow-up Sampling: A Methodology for Counterfactual Outcome Measurement in Algorithmic DEX Trading

2026-06-06T15:32:38Z

Algorithmic trading systems on decentralised exchanges (DEXs) reject most candidate tokens they evaluate. The counterfactual outcome of rejected candidates (what would have happened had the system entered) is rarely measured. This paper introduces Post-Rejection Follow-up Sampling (PRFS). A separate tracking subsystem samples each rejected token's price and liquidity at a configurable cadence, over a horizon of up to twenty-four hours. PRFS produces the data needed to evaluate filter precision against actual market outcomes of rejected candidates, not against synthetic backtest reconstructions. The methodology, data architecture, and deposit format are described in Section III. The companion dataset contains 67,000 forward-outcome observation rows across 2,997 rejection events spanning 457 unique mints, collected over a continuous eight-day window (2026-04-10 to 2026-04-19, UTC). Approximately 55 percent of rejection events receive at least one forward observation; coverage at the mint level is complete. The principal binding constraint on downstream classification is per-event horizon density, not event-level coverage. PRFS is dataset-independent. It generalises to any algorithmic decision system in which rejections substantially outnumber executions.

Market Making and Transient Impact in Spot FX

2026-06-05T16:18:28Z

Dealers in foreign exchange markets provide bid and ask prices to their clients at which they are happy to buy and sell, respectively. To manage risk, dealers can skew their quotes and hedge in the interbank market. Hedging offers certainty but comes with transaction costs and market impact. Optimal market making with execution has previously been addressed within the Almgren-Chriss market impact model, which includes instantaneous and permanent components. However, there is overwhelming empirical evidence of the transient nature of market impact, with instantaneous and permanent impacts arising as the two limiting cases. In this note, we consider an intermediate scenario and study the interplay between risk management and impact resilience.

Diffusive in plain sight: An inconspicuous law of market impact

2026-06-05T09:00:19Z

Decomposing impact as the difference between realized and counterfactual returns and requiring both to be diffusive yields an identity that restricts admissible impact scaling at the level of individual participants. This constraint implies the square-root law in the information-neutral regime and a crossover to linear impact under strong informational coupling, consistent with empirical observations. In the weak-coupling regime, cumulative market impact is itself diffusive -- a diagnostic that many propagator and latent liquidity models fail to satisfy.

Multiperiod Groundwater Markets

2026-06-05T03:40:43Z

Motivated by the emergence of local groundwater exchanges, we construct and analyze stochastic models of dynamic groundwater markets. Our primary focus is endogenizing the price formation and groundwater pumping strategies in a closed market with stochastic groundwater allocations and opportunities for intertemporal transfer through rights banking. In our model, several agents, interpreted as farmers or agricultural districts, make competitive decisions on water consumption to produce a basket of goods, as well as on trading allocations among themselves, or banking them for future periods. We define the respective discrete-time non-zero-sum non-cooperative game and construct its sub-game perfect Nash equilibria characterized by the groundwater price process $\{p^\circ(t)\}$. We furthermore construct an algorithm to determine equilibrium strategies and prices through a machine learning approach on top of best-response iterations. Extensive numerical experiments illustrate dynamic phenomena, including the role of groundwater recharge dynamics, agents' risk aversion and groundwater allocations. Our model provides insights into competitive effects in environmental markets with banking features.

Competition in Dealer Markets with Internalisation and Externalisation

2026-06-04T17:15:54Z

We model a market with multiple dealers who compete for client order flow by dynamically updating their bid and ask quotes for a risky asset. Dealers aim to maximise expected profits while controlling inventory risk by skewing their quotes to attract offsetting order flow (internalisation) or by directly offloading positions in the market (externalisation). Using a variational approach, we derive a closed-form equilibrium for the resulting Nash competition, shedding light on key features of dealer market dynamics. We show that dealers relying on internalisation are compelled to increase their externalisation activity when competing with externalising dealers. This strategic shift in equilibrium leads to significantly higher hedging costs for all dealers and substantially wider spreads for clients.

The Impact of Market Informedness on Market Makers' Profitability

2026-06-04T08:53:39Z

This paper examines the impact of market informedness on the profitability of market makers. In contrast to the existing literature, the analysis is conducted in a complex market environment featuring heterogeneous market-making agents that differ in terms of information sets and aversion to inventory risk, endogenous price formation, exogenous fundamental value dynamics, and self-exciting market order flow. The paper also establishes finite-horizon stability guarantees for the resulting state-dependent Hawkes market-taker process, including non-explosion, exponential mispricing integrability, occupation-time bounds, and a pathwise mispricing tail estimate. To address the market-making problem, the study employs a reinforcement learning framework based on the multi-agent proximal policy optimization (MAPPO) algorithm in a centralized training with decentralized execution (CTDE) setting. The study shows that informed market order flow is particularly dangerous in poorly informed markets, leading to severe adverse-selection risk. Although the complex market dynamics together with stochastic training give rise to locally non-monotonic outcomes, the results nevertheless reveal an overall upward trend in market makers' profitability as market informedness increases, suggesting that price discovery resulting from higher market informedness offsets the negative impact of adverse selection.

Fairness and Strategy-Proofness in Automated Market Makers

2026-06-03T14:46:55Z

No deployed automated market maker lets its liquidity providers vote on the trading function. We show this is structural, not an oversight. On the weighted-product family with $n \geq 3$ assets, no aggregation rule is at once fair and strategy-proof. Arrovian fairness forces a unique form, the weighted Aitchison centroid, the weighted geometric mean of the providers' preferred pools. But fairness forces mean-type aggregation and strategy-proofness forces median-type, and the only rule that is both is a single-provider dictator. The obstruction is sharp: it vanishes at $n = 2$, where a fair strategy-proof rule exists. Under the Frongillo--Papireddygari--Waggoner equivalence, the centroid is Genest's logarithmic opinion pool, and the impossibility transfers to externally Bayesian pooling.
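
As an illustration of the aggregation rule named above, the sketch below computes a weighted geometric mean of weight vectors and renormalises it onto the simplex. It assumes that each provider's preferred pool is represented by a strictly positive weight vector summing to one and that voting power is given by a `stakes` vector; both are illustrative choices rather than the paper's formal setup.

```python
import numpy as np

def weighted_aitchison_centroid(pools, stakes) -> np.ndarray:
    """Weighted geometric mean of compositions, renormalised to sum to one.

    pools  : (providers, assets) array of strictly positive weight vectors
    stakes : (providers,) array of non-negative voting weights (assumed)
    """
    pools = np.asarray(pools, dtype=float)
    stakes = np.asarray(stakes, dtype=float)
    alpha = stakes / stakes.sum()
    # Averaging in log space is the geometric mean in the original space.
    geo_mean = np.exp(alpha @ np.log(pools))
    return geo_mean / geo_mean.sum()

# Two hypothetical providers with equal stake over three assets.
centroid = weighted_aitchison_centroid(
    pools=[[0.8, 0.1, 0.1], [0.2, 0.4, 0.4]],
    stakes=[1.0, 1.0],
)
```

Because the rule averages log-weights, a provider can shift the outcome by overstating a preference. That is the general weakness of mean-type aggregation, and it is the tension with strategy-proofness that the abstract describes.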

Dynamic Multi-Pair Trading Strategy in Cryptocurrency Markets with Deep Reinforcement Learning

2026-06-03T08:10:33Z

This study tests whether Deep Reinforcement Learning (DRL), applied as a specialized execution overlay, can enhance pair trading in highly volatile cryptocurrency markets. Although classical implementations of the strategy have proven successful in traditional equities, they frequently exhibit rigidity and suffer from severe divergence risks when applied to high-variance environments. To address this, we develop a hierarchical "Filter-then-Rank" pair selection methodology and a proprietary "Fixed Risk, Adaptive Mean" execution model. The system employs a Proximal Policy Optimization (PPO) agent with a Long Short-Term Memory (LSTM) layer to govern execution decisions within strict deterministic risk management boundaries. Evaluated on 1-hour interval data from the Binance USD-M Futures market, the optimized RL policy substantially outperformed the heuristic baseline out of sample. A stationary circular block bootstrap robustness check indicates that the agent's risk-adjusted outperformance is statistically significant at the 10 percent level but falls marginally short of the stricter 5 percent threshold, a shortfall that reflects the extreme idiosyncratic variance characteristic of digital assets. Ultimately, this study contributes to the quantitative finance literature by introducing a hybrid architecture that combines statistical arbitrage with DRL execution policies. It also delivers a novel framework for safe reinforcement learning via deterministic shielding, and the results suggest that anchoring a neural policy to statistically robust boundaries mitigates severe divergence risks.

Cost of Manipulation in AMM-Based Oracles

2026-06-02T12:10:23Z

We study the robustness of AMM-based on-chain price oracles to strategic manipulation. An attacker trades against constant product automated market makers (CPMMs) to distort an on-chain oracle, arbitrageurs restore cross-pool and cross-venue consistency, and an oracle designer chooses how to aggregate pool quotes. Taking an efficient-market-hypothesis (EMH) view of the off-chain "true" price, we define the \emph{cost of manipulation} as the minimal mark-to-market loss that an attacker must incur to move the oracle by a given multiplicative factor. For independent CPMMs, we derive closed-form single-pool manipulation formulas and solve the attacker-designer game for weighted means and weighted medians, showing that liquidity weights maximize the minimum cost of manipulation within these classes for weighted medians (for any distortion level) and, for weighted means, locally as the distortion tends to zero. For larger distortions, weighted means become more fragile: optimal weights can depend on the target distortion and no single choice is uniformly optimal across distortion levels. In a frictionless CPMM model with cross-pool arbitrage, the manipulation cost depends only on the total quote depth and coincides across symmetric aggregators. We extend this framework to multi-asset star architectures, confirming that liquidity weights remain optimal in the same sense. Finally, we bridge theory and practice by incorporating dwell times and rate limits, providing a quantitative yardstick to size oracles against the explicit economic costs of attack.
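
For intuition on the single-pool case, the sketch below works through the textbook fee-free constant-product calculation. It is not taken from the paper, whose closed forms may include additional features. The assumptions are: reserves `(x0, y0)` with pool price `y/x`, no fees, no arbitrage response, and the attacker's position valued at the pre-trade price `p0 = y0 / x0`. Under those assumptions the loss is `y0 * (sqrt(gamma) + 1/sqrt(gamma) - 2)` for a price move by factor `gamma`.

```python
import math

def cpmm_manipulation_cost(x0: float, y0: float, gamma: float) -> float:
    """Mark-to-market loss, in units of Y, of moving the price y/x by `gamma`.

    Assumes a fee-free pool with x * y = k and values the attacker's trade
    at the pre-trade price p0 = y0 / x0 (taken to be the "true" price).
    """
    p0 = y0 / x0
    s = math.sqrt(gamma)
    x1, y1 = x0 / s, y0 * s          # reserves after the trade; x1 * y1 = x0 * y0
    paid_y = y1 - y0                 # Y sent to the pool (negative if received)
    received_x = x0 - x1             # X taken from the pool (negative if sent)
    return paid_y - p0 * received_x

# Hypothetical pool with 100 units on each side; quadruple the quoted price.
cost_up = cpmm_manipulation_cost(100.0, 100.0, 4.0)
cost_down = cpmm_manipulation_cost(100.0, 100.0, 0.25)
```

The cost scales linearly with pool depth `y0` and is symmetric in `gamma` and `1/gamma`. This matches the intuition behind weighting pools by liquidity.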

RED-2400: A Public Benchmark of Algorithmically-Rejected Trading Events with Outcome Labels

2026-06-01T23:41:55Z

RED-2400 is a public benchmark of 6,660 algorithmically-rejected trading events from a live Solana decentralised-exchange filter stack, observed continuously over 22 calendar days (2026-04-10T21:10Z through 2026-05-02T21:48Z, UTC). Each rejection event is linked to its post-rejection price-and-liquidity trajectory. The deposit contains 169,123 forward-outcome observations and 1,837 graveyard-tracker lifecycle snapshots, covering 1,076 distinct mints in the rejection registry and 1,075 in the forward-observation file. Outcome labels follow the five-tier classification rule introduced by a related methodology paper [Kamat 2026c]. The deposit includes a lifecycle-tracker file that permits external validation of any subset of those labels against observed token-lifecycle ground truth. Filter labels are anonymised to filter_1 through filter_8; source-collector identifiers to source_a and source_b. Liquidity and 24-hour volume are quantised to the nearest power of two, preserving heavy-tailed shape while preventing operational-threshold inference. This is the first window of a planned series; subsequent windows will extend the time horizon and enable regime-stratified analysis. "RED-2400" is a brand name, not a count; current cohort sizes are listed below and do not equal 2,400.
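
The power-of-two quantisation of liquidity and volume can be sketched as below. The abstract does not say whether "nearest" is measured on a linear or a logarithmic scale, nor how zero values are handled. This sketch assumes rounding in log2 space and maps non-positive inputs to zero; both choices are assumptions.

```python
import math

def quantise_pow2(value: float) -> float:
    """Round a positive value to the nearest power of two in log2 space.

    Non-positive inputs are mapped to 0.0 (an assumption for illustration).
    """
    if value <= 0:
        return 0.0
    return 2.0 ** round(math.log2(value))

# Hypothetical liquidity figures.
quantised = [quantise_pow2(v) for v in (1000.0, 700.0, 0.3, 0.0)]
```

Rounding in log space keeps the relative error bounded by a factor of sqrt(2) in either direction, so the heavy-tailed shape of the distribution survives while exact thresholds do not.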

Strategic Users in a Priority Queue with Bulk Service on Blockchains

2026-05-31T14:57:23Z

This paper analyzes transaction fees on blockchains by considering that they form a priority queue and users play a queueing game. Using an M/G^K/1 priority queue model, we provide new insights into the dynamics governing transaction fees and their impact on user behavior. We derive semi-closed form expressions for steady-state quantities and extend the relationship between user delay costs and transaction fees to general block generation times. We apply the model to the Bitcoin network and simulate user responses under various scenarios. Cross-chain analysis across Bitcoin, Dogecoin, and Litecoin reveals similarities in normalized cost structures.

The Privacy Subsidy in Continuous-Time Kyle: Cumulative Welfare under Noise-Perturbed Order-Flow Observation

2026-05-31T05:56:52Z

We extend the closed-form privacy-subsidy result of Nakamura (2026, arXiv:2605.15746) from the single-period Kyle model to continuous time. A committed Bayesian automated market maker observes the aggregate order flow perturbed by an independent Brownian privacy channel of diffusion intensity $σ_\varepsilon$. Under the Markovian linear equilibrium, the price-impact coefficient is $λ = σ_v / \sqrt{σ_u^2 + σ_\varepsilon^2}$ -- constant in time -- and the cumulative expected transfer from the protocol's liquidity pool to traders over $[0,1]$ is $|Π_M| = σ_v σ_\varepsilon^2 / \sqrt{σ_u^2 + σ_\varepsilon^2}$. We then establish a structural correspondence between this cumulative privacy subsidy and Loss-Versus-Rebalancing (Milionis et al. 2022), identifying privacy-noise welfare as the order-flow observation analog of LVR's price observation gap. The result completes the continuous-time Kyle leg of the program of quantifying break-even fees for committed-AMM exchanges under privacy-aggregated information environments.
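
The two closed forms stated above can be evaluated directly. The sketch below only transcribes them; the function and parameter names are illustrative and the numerical inputs are hypothetical.

```python
import math

def price_impact(sigma_v: float, sigma_u: float, sigma_eps: float) -> float:
    """lambda = sigma_v / sqrt(sigma_u^2 + sigma_eps^2), as stated in the abstract."""
    return sigma_v / math.sqrt(sigma_u**2 + sigma_eps**2)

def privacy_subsidy(sigma_v: float, sigma_u: float, sigma_eps: float) -> float:
    """|Pi_M| = sigma_v * sigma_eps^2 / sqrt(sigma_u^2 + sigma_eps^2)."""
    return sigma_v * sigma_eps**2 / math.sqrt(sigma_u**2 + sigma_eps**2)

# Hypothetical parameters chosen so the square root is exact (3-4-5 triangle).
lam = price_impact(sigma_v=1.0, sigma_u=3.0, sigma_eps=4.0)
subsidy = privacy_subsidy(sigma_v=1.0, sigma_u=3.0, sigma_eps=4.0)
```

Written this way, the subsidy equals `lam * sigma_eps**2`. It vanishes when there is no privacy noise, and price impact falls as the noise intensity grows.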

The Privacy Subsidy in Glosten-Milgrom: Bid-Ask Spread and Welfare under Flip-Noise Direction Observation

2026-05-31T05:51:56Z

We derive a closed-form bid-ask spread and welfare decomposition for the Glosten-Milgrom (1985) sequential-trading model when the market maker observes the trade direction perturbed by a binary flip channel of probability $η$ -- a natural information-theoretic model of privacy mechanisms acting on the direction signal. Under a committed Bayesian market-maker pricing rule, the equilibrium spread is $μ(1-2η)Δ$, where $μ$ is the informed-trader fraction and $Δ = v_H - v_L$ the value range. The welfare decomposition identifies a per-trade transfer $μηΔ$ from the protocol's liquidity pool to traders -- the "privacy subsidy", mirroring the Gaussian-Kyle analog established in prior work. The result extends the privacy-subsidy concept from continuous Gaussian to discrete two-state microstructure, demonstrating robustness across both classical models. Primary application: MPC-based matching engines with $\varepsilon$-differentially-private direction disclosure, where the engine prices on a noisy direction signal.
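
The spread formula can be checked numerically with a one-trade Bayesian update. The sketch assumes the standard textbook setup, which the abstract does not spell out: a prior of 1/2 on the high value, informed traders who buy exactly when the value is high, uninformed traders who buy or sell with equal probability, and a market maker who quotes the posterior mean given the observed (possibly flipped) direction.

```python
import math

def gm_quotes_with_flip_noise(mu: float, eta: float, v_low: float, v_high: float):
    """Return (bid, ask) when the observed direction is flipped with prob. eta."""
    # True buy probabilities conditional on the asset value.
    p_buy_high = mu + (1 - mu) / 2
    p_buy_low = (1 - mu) / 2
    # Probabilities of *observing* a buy after the flip channel.
    q_high = (1 - eta) * p_buy_high + eta * (1 - p_buy_high)
    q_low = (1 - eta) * p_buy_low + eta * (1 - p_buy_low)
    # Posterior on the high value with a prior of 1/2 (assumed).
    post_high_given_buy = q_high / (q_high + q_low)
    post_high_given_sell = (1 - q_high) / ((1 - q_high) + (1 - q_low))
    delta = v_high - v_low
    ask = v_low + post_high_given_buy * delta
    bid = v_low + post_high_given_sell * delta
    return bid, ask

# Hypothetical parameters: 40% informed flow, 10% flip probability.
mu, eta, v_low, v_high = 0.4, 0.1, 90.0, 110.0
bid, ask = gm_quotes_with_flip_noise(mu, eta, v_low, v_high)
closed_form_spread = mu * (1 - 2 * eta) * (v_high - v_low)
```

Under these assumptions `ask - bid` agrees with the closed form `μ(1-2η)Δ`. At `eta = 0.5` the direction signal carries no information and the spread collapses to zero.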