https://arxiv.org/api/BPnEfZeBb9BBKn6Rqa3KH2kkSyA 2026-06-14T03:24:48Z 3249 210 15 http://arxiv.org/abs/2504.11258v3 Multi-Agent Reinforcement Learning for Greenhouse Gas Offset Credit Markets 2026-03-11T20:54:51Z Climate change is a major threat to the future of humanity, and its impacts are being intensified by excess man-made greenhouse gas emissions. One method governments can employ to control these emissions is to provide firms with emission limits and penalize any excess emissions above the limit. Excess emissions may also be offset by firms who choose to invest in carbon reducing and capturing projects. These projects generate offset credits which can be submitted to a regulating agency to offset a firm's excess emissions, or they can be traded with other firms. In this work, we characterize the finite-agent Nash equilibrium for offset credit markets. As computing Nash equilibria is an NP-hard problem, we utilize the modern reinforcement learning technique Nash-DQN to efficiently estimate the market's Nash equilibria. We demonstrate not only the validity of employing reinforcement learning methods applied to climate themed financial markets, but also the significant financial savings emitting firms may achieve when abiding by the Nash equilibria through numerical experiments. 2025-04-15T14:56:42Z Liam Welsh Udit Grover Sebastian Jaimungal http://arxiv.org/abs/2603.02820v3 Optimal Consumption and Portfolio Choice with No-Borrowing Constraint in the Kim-Omberg Model: The Complete Market Case 2026-03-11T10:54:22Z In this paper, we study an intertemporal utility maximization problem in which an investor chooses consumption and portfolio strategies in the presence of a stochastic factor and a no-borrowing constraint. In the spirit of the Kim-Omberg model, the stochastic factor represents the expected excess return of the risky asset. 
It is perfectly negatively correlated with shocks to the risky asset, and follows an Ornstein-Uhlenbeck process, thereby capturing the mean reversion of expected excess returns, a feature well supported by empirical evidence in financial markets. The investor seeks to maximize expected utility from consumption, subject to the constraint that wealth remains nonnegative at all times. To address the dynamic no-borrowing constraint, we use Lagrange duality to transform the primal problem into a singular control problem in the dual space. We then characterize the solution to the dual singular control problem via an auxiliary two-dimensional optimal stopping problem featuring stochastic volatility, and subsequently retrieve the primal value function as well as the optimal portfolio and consumption plans. Finally, a numerical study is conducted to derive economic and financial implications. 2026-03-03T10:16:30Z This new version fixes a mistake found in the previous version Giorgio Ferrari Tim Niclas Schütz http://arxiv.org/abs/2603.10569v1 Win-score promotion gates in aggregator-routed RFQ markets: A two-tier stochastic control model 2026-03-11T09:24:14Z We study market making in aggregator-routed RFQ markets where platform routing depends on slowly varying dealer performance scores. We propose a two-tier stochastic control model that separates RFQ-level price competition from a macro routing layer: tier A represents aggregator flow whose opportunity intensity is multiplied by a promotion gate driven by the dealer's win score, while tier B captures background flow that is not gated and does not update the score. RFQs arrive in multiple sizes and the dealer chooses a size-ladder of bid/ask offsets; conditional on winning, trades earn spread minus an adverse selection correction and contribute to inventory risk. The resulting Hamilton-Jacobi-Bellman equation admits a reduced Bergault-Guéant operator form with explicit win/lose branches for the score on tier A. 
Using the envelope-theorem argument, we express optimal controls through derivatives of the one-dimensional reduced Hamiltonians, yielding an interpretable mapping from optimal win probabilities to optimal offsets. In the long-memory regime, we derive an adiabatic approximation that separates fast inventory dynamics from slow score dynamics. A quadratic inventory ansatz and quadratic Hamiltonian expansion lead to a quasi-stationarity inventory-curvature scaling and a one-dimensional score drift field. For steep (logistic) promotion gates, the score dynamics can exhibit fold bifurcations, bistability, and hysteresis, producing an endogenous "campaign vs. harvest" pattern in optimal quoting. Numerical experiments confirm this behaviour and highlight the stabilizing role of background flow in maintaining inventory-mixing capacity even when the dealer is weakly promoted. 2026-03-11T09:24:14Z 12 pages, 8 figures Alexander Barzykin http://arxiv.org/abs/2408.09335v3 Exploratory Optimal Stopping: A Singular Control Formulation 2026-03-11T07:17:41Z This paper explores continuous-time and state-space optimal stopping problems from a reinforcement learning perspective. We begin by formulating the stopping problem using randomized stopping times, where the decision maker's control is represented by the probability of stopping within a given time, specifically a bounded, non-decreasing, càdlàg control process. To encourage exploration and facilitate learning, we introduce a regularized version of the problem by penalizing the performance criterion with the cumulative residual entropy of the randomized stopping time. The regularized problem takes the form of an (n+1)-dimensional degenerate singular stochastic control with finite-fuel, where the regularized free boundary becomes the graph of a function mapping the state variable of the original stopping problem into the probability of stopping. 
We address this singular control problem through the dynamic programming principle, which enables us to identify the unique optimal exploratory strategy. Finally, we propose both model-based and model-free reinforcement learning algorithms tailored for exploratory optimal stopping problems. We establish policy improvement guarantees for the proposed algorithms. Moreover, the model-free method is of actor-critic type and it is scalable in high dimensions under neural network parameterization. 2024-08-18T02:31:55Z 49 pages, 3 figures Jodi Dianetti Giorgio Ferrari Renyuan Xu http://arxiv.org/abs/2603.09773v1 Global universality via discrete-time signatures 2026-03-10T15:12:03Z We establish global universal approximation theorems on spaces of piecewise linear paths, stating that linear functionals of the corresponding signatures are dense with respect to $L^p$- and weighted norms, under an integrability condition on the underlying weight function. As an application, we show that piecewise linear interpolations of Brownian motion satisfy this integrability condition. Consequently, we obtain $L^p$-approximation results for path-dependent functionals, random ordinary differential equations, and stochastic differential equations driven by Brownian motion. 2026-03-10T15:12:03Z Mihriban Ceylan David J. Prömel http://arxiv.org/abs/2603.09669v1 Competition between DEXs through Dynamic Fees 2026-03-10T13:41:31Z We find an approximate Nash equilibrium in a game between decentralized exchanges (DEXs) that compete for order flow by setting dynamic trading fees. We characterize the equilibrium via a coupled system of partial differential equations and derive tractable approximate closed-form expressions for the equilibrium fees. Our analysis shows that the two-regime structure found in monopoly models persists under competition: pools alternate between raising fees to deter arbitrage and lowering fees to attract noise trading and increase volatility. 
Under competition, however, the switching boundary shifts from the oracle price to a weighted average of the oracle and competitors' exchange rates. Our numerical experiments show that, holding total liquidity fixed, an increase in the number of competing DEXs reduces execution slippage for strategic liquidity takers and lowers fee revenue per DEX. Finally, the effect on noise traders' slippage depends on market activity: they are worse off in low-activity markets but better off in high-activity ones. 2026-03-10T13:41:31Z Leonardo Baggiani Martin Herdegen Leandro Sanchez-Betancourt http://arxiv.org/abs/2603.20243v1 Two-Factor Hull-White Model Revisited: Correlation Structure for Two-Factor Interest Rate Model in CVA Calculation 2026-03-10T05:14:13Z The development of credit valuation adjustment (CVA) (valuation adjustments [XVA]) [Green] has increased the importance of simple interest rate models such as the Hull-White model [Tan14] [Tsuchiya]. This is because the XVA model is an FX hybrid model, and is tractable only when the interest rate part is a simple Gaussian model. For the XVA calculation of interest rate instruments, de-correlation of the yield curve can be important even for the swap portfolio. Capturing the correlation structure in the two-factor Hull-White model is an integral element of CVA (XVA) modeling. However, the correlation structure in the two-factor Hull-White model has not been studied enough except for the analysis in [AndersenPiterbarg]. In this study, the correlation structure of the two-factor Hull-White model is analyzed in detail. The correlation structure of co-initial swap rates is investigated using a combination of the approximation formula and Monte-Carlo simulation. The Hull-White model captures the de-correlation of the yield curve only when the parameters (volatilities and mean reversion strength) satisfy certain relationships, making the valuation of XVA by the two-factor Hull-White model effective. 
2026-03-10T05:14:13Z Osamu Tsuchiya http://arxiv.org/abs/2509.03439v2 Concentration Inequalities for Sub-Weibull Random Tensors 2026-03-09T23:08:56Z We extend the theory of concentration inequalities to simple random tensors with heavy-tailed coefficients. Specifically, we consider the class of sub-Weibull distributions $\mathcal{S}_α$ for $α\in [1, 2]$. We establish concentration bounds for Euclidean functions of such tensors, exhibiting a phase transition between sub-gaussian and heavy-tailed regimes. Our results rely on a new Generalized Maximal Inequality for products of heavy-tailed random variables and a martingale analysis using Nagaev-type inequalities. 2025-09-03T16:07:35Z Yunfan Zhao http://arxiv.org/abs/2603.08552v1 Nonconcave Portfolio Choice under Smooth Ambiguity 2026-03-09T16:16:25Z We study continuous-time portfolio choice with nonlinear payoffs under smooth ambiguity and Bayesian learning. We develop a general framework for dynamic, non-concave asset allocation that accommodates nonlinear payoffs, broad utility classes, and flexible ambiguity attitudes. Dynamic consistency is obtained by a robust representation that recasts the ambiguity-averse problem as ambiguity-neutral with distorted priors. This structure delivers explicit trading rules by combining nonlinear filtering with the martingale approach and nests standard concave and linear-payoff benchmarks. As a leading application, delegated management with convex incentives illustrates that ambiguity aversion shifts beliefs toward adverse states, limits the range of states that would otherwise trigger more aggressive risk taking, and reduces volatility through lower risky exposure. 2026-03-09T16:16:25Z 36 pages, 8 figures Emanuele Borgonovo An Chen Massimo Marinacci Shihao Zhu http://arxiv.org/abs/2311.03538v4 On an Optimal Stopping Problem with a Discontinuous Reward 2026-03-09T14:30:05Z We study an optimal stopping problem with an unbounded, time-dependent and discontinuous reward function. 
This problem is motivated by the pricing of a variable annuity contract with guaranteed minimum maturity benefit, under the assumption that the policyholder's surrender behaviour maximizes the risk-neutral value of the contract. We consider a general fee and surrender charge function, and give a condition under which optimal stopping always occurs at maturity. Using an alternative representation for the value function of the optimization problem, we study its analytical properties and the resulting surrender (or exercise) region. In particular, we show that the non-emptiness and the shape of the surrender region are fully characterized by the fee and the surrender charge functions, which provides a powerful tool to understand their interrelation and how it affects early surrenders and the optimal surrender boundary. Under certain conditions on these two functions, we develop three representations for the value function; two are analogous to their American option counterpart, and one is new to the actuarial and American option pricing literature. 2023-11-06T21:18:59Z Anne Mackay Marie-Claude Vachon 10.13140/RG.2.2.36565.40160 http://arxiv.org/abs/2603.07863v1 Choice of Collateral Currency in Differential Swaps 2026-03-09T00:35:45Z The role of collateral in derivative pricing has evolved beyond credit risk mitigation, particularly following the global financial crisis, when funding costs and basis spreads became central to valuation practices. This development coincided with the transition from the London Interbank Offered Rate (LIBOR) to risk-free rates (RFRs) and the increasing standardization of collateralised trading. We study the valuation and hedging of a class of differential swaps referencing backward-looking averages of overnight rates, with SOFR swaps appearing as a particular instance. The focus is on the impact of the collateral currency. Extending earlier results of Ding et al. [Math. 
Finance 36 (2026), pp.~180--202], we allow the collateral account to be denominated in a currency different from that of the contractual cash flows and derive explicit pricing and hedging strategies using a futures-based replication approach. We show that the choice of collateral currency can have a non-trivial effect on both valuation and risk management. In particular, foreign-currency collateral can introduce additional risk exposures even when contractual cash flows are entirely denominated in the domestic currency. A numerical study demonstrates that collateral effects can lead to significant valuation adjustments and therefore need to be properly incorporated in modern multi-currency modelling frameworks. 2026-03-09T00:35:45Z 5 figures Yining Ding Ruyi Liu Marek Rutkowski http://arxiv.org/abs/2412.21192v2 Rough differential equations for volatility 2026-03-08T21:50:04Z We introduce a canonical way of performing the joint lift of a Brownian motion $W$ and a low-regularity adapted stochastic rough path $\mathbf{X}$, extending [Diehl, Oberhauser and Riedel (2015). A Lévy area between Brownian motion and rough paths with applications to robust nonlinear filtering and rough partial differential equations]. Applying this construction to the case where $\mathbf{X}$ is the canonical lift of a one-dimensional fractional Brownian motion (possibly correlated with $W$) completes the partial rough path of [Fukasawa and Takano (2024). A partial rough path space for rough volatility]. We use this to model rough volatility with the versatile toolkit of rough differential equations (RDEs), namely by taking the price and volatility processes to be the solution to a single RDE. We argue that our framework is already interesting when $W$ and $X$ are independent, as correlation between the price and volatility can be introduced in the dynamics. The lead-lag scheme of [Flint, Hambly, and Lyons (2016). 
Discretely sampled signals and the rough Hoff process] is extended to our fractional setting as an approximation theory for the rough path in the correlated case. Continuity of the solution map transforms this into a numerical scheme for RDEs. We numerically test this framework and use it to calibrate a simple new rough volatility model to market data. 2024-12-30T18:57:29Z Revised version Ofelia Bonesini Emilio Ferrucci Ioannis Gasteratos Antoine Jacquier http://arxiv.org/abs/2603.07752v1 Dynamic slippage control and rejection feedback in spot FX market making 2026-03-08T18:00:03Z We study an OTC FX market-making problem, built on the Avellaneda-Stoikov tradition, in which a dealer streams size-dependent quotes on a discrete ladder and manages inventory risk over a finite horizon under Poisson arrivals of trade requests. Adverse selection is modelled through latency-driven price moves over a delay window, represented by Gaussian marks whose conditional means can depend on the quoted spread, capturing selective client reaction to stale quotes. The dealer can address latency risk through trade rejection when slippage breaches a tolerance threshold. We treat slippage tolerance as an explicit control jointly optimized with quotes: upon receiving a trade request, the dealer chooses an acceptance/rejection rule, which makes the trade economically akin to an embedded option written on the latency price move. We further introduce rejection feedback through an EMA-based rejection score used as a reputation proxy, so that client intensity is endogenously modulated by past rejections via a multiplicative factor. Using dynamic programming, we derive a Markov control problem with state variables (inventory, rejection-score) and show how the rejection decision enters the HJB equation through Hamiltonians that include an expectation over the latency mark and a maximization over both quote and rejection rule parameters. 
For practical control evaluation, we develop an adiabatic-quadratic approximation: fixing reputation on the inventory-control time scale, expanding Hamiltonians to second order, and adopting a quadratic ansatz in inventory, yielding tractable Riccati-type ODEs and closed-form expressions for approximate quotes and slippage thresholds. This approximation provides a fast surrogate for policy design and enables self-consistent calibration of rejection behaviour. 2026-03-08T18:00:03Z 18 pages, 10 figures Alexander Barzykin http://arxiv.org/abs/2603.07692v1 Understanding the Long-Only Minimum Variance Portfolio 2026-03-08T15:47:09Z For a covariance matrix coming from a factor model of returns, we investigate the relationship between the long-only global minimum variance portfolio and the asset exposures to the factors. In the case of a 1-factor model, we provide a rigorous and explicit description of the long-only solution in terms of the parameters of the covariance matrix. For $q>1$ factors, we provide a description of the long-only portfolio in geometric terms. The results are illustrated with empirical daily returns of US stocks. 2026-03-08T15:47:09Z 25 pages, 6 figures Nick L. Gunther Alec N. Kercheval Ololade Sowunmi http://arxiv.org/abs/2603.07616v1 SABR Type Libor (Forward) Market Model (SABR/LMM) with time-dependent skew and smile 2026-03-08T13:03:59Z Volatility Skew and Smile of Interest Rate products (Swaption and Caplet) are represented by SABR (Stochastic Alpha Beta Rho model). So, the Interest Rate derivatives model for pricing the callable exotic swaps should be comparable to the SABR volatility surface. In the interest rate derivatives models, Libor Market Model (LMM) (in a post-Libor world, Forward Market Model (FMM)) is one of the most popular models used in the market. So, there are many attempts to develop LMMs that are comparable to the SABR surface. Such a model is called SABR/LMM. 
There are many references for SABR/LMM, but most of them only treat SABR/LMM, which is not flexible enough to be used practically in global banks. The purpose of this paper is to provide a comprehensive definition of SABR/LMM and a complete description of how it is to be implemented. 2026-03-08T13:03:59Z Osamu Tsuchiya