http://arxiv.org/abs/2603.07227v2
Estimating changes in extreme quantiles over time, applied to desert temperatures
Updated: 2026-03-13T15:44:40Z
We quantify changes DeltaQ in 100-year return values for regional annual maxima and minima of near-surface atmospheric temperature from output of five CMIP6 models, for five of the Earth's desert regions, over the interval (2025,2125). We use generalised extreme value (GEV) regression to characterise changes in extremes, considering a range of different parametric forms for the variation of GEV parameters with time, and coupling models for different scenarios so that they provide a common GEV tail in the first year of observation. Parameters are estimated using Bayesian inference.
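As a rough illustration of the quantity DeltaQ described above, the following Python sketch computes a T-year return value from GEV parameters and differences it between two years. The linear trend in the location parameter, the parameter values, and the function names are assumptions made for illustration only; the abstract says only that a range of parametric forms was considered and that parameters were estimated by Bayesian inference, which this sketch does not attempt.

```python
import math


def gev_return_level(mu, sigma, xi, period=100.0):
    """Level exceeded with probability 1/period in any one year (annual maxima)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV family as the shape parameter tends to zero.
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)


def delta_q(mu0, mu1, sigma, xi, t_start, t_end, period=100.0):
    """Change in the return value between t_start and t_end.

    Assumes (for illustration) mu(t) = mu0 + mu1 * (t - t_start),
    with sigma and xi held constant in time.
    """
    q_start = gev_return_level(mu0, sigma, xi, period)
    q_end = gev_return_level(mu0 + mu1 * (t_end - t_start), sigma, xi, period)
    return q_end - q_start


# Hypothetical parameter values, not taken from the paper.
change = delta_q(mu0=45.0, mu1=0.03, sigma=1.2, xi=-0.1, t_start=2025, t_end=2125)
print(f"DeltaQ = {change:.2f}")
```

With only the location parameter trending, DeltaQ reduces to mu1 times the length of the interval; letting sigma or xi vary with time would make the change depend on the return period as well. Annual minima can be handled with the same function by applying it to the negated series and negating the result.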
We perform a simulation study using ground truth models generating data qualitatively similar to the CMIP6 output, to assess the relative performance of different information criteria in selecting models from a set of candidates, to minimise error in predictions of DeltaQ. The Bayesian information criterion (BIC) provides best performance, out-performing the divergence and widely-applicable information criteria in particular.
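The selection step described above can be illustrated with a short Python sketch of BIC-based model choice. It uses the standard definition BIC = k ln(n) - 2 ln(L), where k is the number of parameters, n the number of observations and L the maximised likelihood. The candidate names, log-likelihoods and sample size below are hypothetical, and how the authors evaluate the criterion within their Bayesian fits is not stated in the abstract.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_by_bic(candidates, n_obs):
    """Return the name of the candidate with the smallest BIC.

    candidates maps a model name to (maximised log-likelihood, parameter count).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_obs))


# Hypothetical candidate GEV regression forms and fit summaries.
candidates = {
    "constant": (-250.0, 3),
    "linear_mu": (-240.0, 4),
    "linear_mu_sigma": (-239.5, 5),
}
best = select_by_bic(candidates, n_obs=100)
print(best)
```

The penalty term k ln(n) grows with sample size, so BIC tends to favour simpler forms than criteria with a fixed per-parameter penalty.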
Using BIC-selected GEV regression models, we estimate joint posterior distributions of DeltaQ over three forcing scenarios, for different combinations of region, GCM and climate ensemble. Estimates show a consistent trend across regions, GCMs and climate ensembles, of DeltaQ increasing with climate scenario for both regional annual maxima and minima. Aggregating posterior distributions over climate ensembles and GCMs, we find evidence for significant increases in DeltaQ for regional annual maxima under more severe forcing scenarios for all desert regions. Similar but weaker and less significant trends are observed for regional annual minima.
Published: 2026-03-07T14:17:48Z
Authors: Callum Leach, Kevin Ewans, Philip Jonathan

http://arxiv.org/abs/2603.12449v1
FloeNet: A mass-conserving global sea ice emulator that generalizes across climates
Updated: 2026-03-12T21:00:26Z
We introduce FloeNet, a machine-learning emulator trained on the Geophysical Fluid Dynamics Laboratory global sea ice model, SIS2. FloeNet is a mass-conserving model, emulating 6-hour mass and area budget tendencies related to sea ice and snow-on-sea-ice growth, melt, and advection. We train FloeNet using simulated data from a reanalysis-forced ice-ocean simulation and test its ability to generalize to pre-industrial control and 1% CO2 climates. FloeNet outperforms a non-conservative model at reproducing sea ice and snow-on-sea-ice mean state, trends, and inter-annual variability, with volume anomaly correlations above 0.96 in the Antarctic and 0.76 in the Arctic, across all forcings. FloeNet also produces the correct thermodynamic vs dynamic response to forcing, enabling physical interpretability of emulator output. Finally, we show that FloeNet outputs high-fidelity coupling-related variables, including ice-surface skin temperature, ice-to-ocean salt flux, and melting energy fluxes.
We hypothesize that FloeNet will improve polar climate processes within existing atmosphere and ocean emulators.
Published: 2026-03-12T21:00:26Z
Comment: 4 Figures, 18 supplementary figures
Authors: William Gregory, Mitchell Bushuk, James Duncan, Elynn Wu, Adam Subel, Spencer K. Clark, Bill Hurlin, Oliver Watt-Meyer, Alistair Adcroft, Chris Bretherton, Laure Zanna

http://arxiv.org/abs/2601.02090v2
Flo: A data-driven limited-area storm surge model
Updated: 2026-03-12T09:32:11Z
We present Flo, a data-driven storm surge model, covering the North Sea, Norwegian Sea and Barents Sea. The model is built using the Anemoi framework for creating machine learning weather forecasting systems, developed by the European Centre for Medium-Range Weather Forecasts and partners. The model is based on a graph neural network, and is capable of simulating water level due to atmospheric effects (wind stress and inverse barometer effect, i.e. the non-tidally induced part of the total water level; the residual water level) at a horizontal resolution of 4 km and a temporal resolution of 1 hour with a quality comparable to the numerical model on which it was trained. The model was trained using a dataset consisting of 43 years of atmospheric data from the 3-km Norwegian Reanalysis hindcast for mean sea level pressure and winds, and the NORA-Surge hindcast for water level. Evaluation was done by comparing results from hindcast runs of the Flo model against independent observations from more than 90 water level gauges along the European coast, and against the NORA-Surge hindcast. The evaluation shows that Flo produces hindcasts with accuracy similar to the NORA-Surge hindcast, and it is shown that the model can resolve key physical processes. As the NORA-Surge hindcast used for training does not include data assimilation, Flo is not expected to systematically outperform the numerical model when evaluated against observations.
Nevertheless, the present work represents an important step towards complementing traditional physics-based storm surge modelling with machine learning approaches and the framework establishes a strong foundation for future developments, particularly for training storm surge models that offer more flexibility for incorporating observations and other additional data sources.
Published: 2026-01-05T13:17:28Z
Authors: Nils Melsom Kristensen, Mateusz Matuszak, Paulina Tedesco, Ina Kristine Berentsen Kullmann, Johannes Röhrs

http://arxiv.org/abs/2603.11345v1
Leveraging higher-order time integration methods for improved computational efficiency in a rainshaft model
Updated: 2026-03-11T22:21:16Z
Cloud and precipitation microphysics packages in atmospheric general circulation models typically use first-order time integration methods with a large time step, requiring ad hoc limiters and substepping of the sedimentation scheme to prevent solutions from becoming unstable. We show that in the latest version of the Energy Exascale Earth System Model, E3SMv3, the rain microphysics provided by the Predicted Particle Properties (P3) scheme is underresolved in time at the model's default 300 s time step. The P3 scheme requires limiters to guarantee stability, but those limiters make large discretization errors more difficult to detect. When the time step of the P3 scheme is reduced to sufficiently capture correct microphysics behavior, wall clock time of the simulation is increased by nearly a factor of 40.
Instead of reducing the microphysics time step, we recommend using higher-order time integrators based on Runge-Kutta methods, which offer improved solution accuracy at comparable computational costs. A key to obtaining computationally efficient microphysics results is the use of adaptive time stepping, which also eliminates the need for specialized substepping procedures in the sedimentation process. We also analyze individual microphysical processes by extracting inverse timescales from Jacobians of the process rates, which gives insight about the maximum time step each process is able to take while maintaining stability and accuracy, and about how individual processes should be grouped together for most efficient results. The proposed integrators can achieve the accuracy level required to correctly model rain microphysics parameterizations more than 10x faster than the P3 scheme.
Published: 2026-03-11T22:21:16Z
Authors: Justin Dong, Sean P. Santos, Steven B. Roberts, Christopher J. Vogl, Carol S. Woodward

http://arxiv.org/abs/2603.09974v2
Task Aware Modulation Using Representation Learning for Upscaling of Terrestrial Carbon Fluxes
Updated: 2026-03-11T16:27:50Z
Accurately upscaling terrestrial carbon fluxes is central to estimating the global carbon budget, yet remains challenging due to the sparse and regionally biased distribution of ground measurements. Existing data-driven upscaling products often fail to generalize beyond observed domains, leading to systematic regional biases and high predictive uncertainty. We introduce Task-Aware Modulation with Representation Learning (TAM-RL), a framework that couples spatio-temporal representation learning with a knowledge-guided encoder-decoder architecture and a loss function derived from the carbon balance equation.
Across 150+ flux tower sites representing diverse biomes and climate regimes, TAM-RL improves predictive performance relative to existing state-of-the-art datasets, reducing RMSE by 8-9.6% and increasing explained variance (R2) from 19.4% to 43.8%, depending on the target flux. These results demonstrate that integrating physically grounded constraints with adaptive representation learning can substantially enhance the robustness and transferability of global carbon flux estimates.
Published: 2026-03-10T17:59:29Z
Comment: Accepted to the KGML Bridge at AAAI 2026 (non-archival)
Authors: Aleksei Rozanov, Arvind Renganathan, Vipin Kumar

http://arxiv.org/abs/2603.20257v1
Constructing efficient score functions for rare event simulation in high-dimensional ocean-climate models
Updated: 2026-03-11T14:30:25Z
Calculating transition probabilities between different states of multistable climate tipping systems is computationally challenging in high-dimensional models. Targeted algorithms, such as the Trajectory-Adaptive Multilevel Splitting (TAMS) method, require an adequate score function to be successful, i.e., to provide an estimate of a transition probability with an acceptable variance when only a relatively small ensemble of model trajectories can be computed. Here, we present a data-driven method to derive a score function based on projecting the model dynamics in a reduced state space. Using a spatially two-dimensional partial differential equation model of the Atlantic Meridional Overturning Circulation, we show that this score function performs better than currently available ones. Using the new score function, transition probabilities can be determined with low variance, even in the case of small noise amplitudes.
Besides purely noise-induced transitions, we also consider the scenario of combined stochastic and time-dependent deterministic forcing, presenting a strategy to efficiently simulate AMOC tipping events in global ocean and climate models subject to transient climate change.
Published: 2026-03-11T14:30:25Z
Authors: Lucas Esclapez, Valérian Jacques-Dumas, Reyk Börner, Laurent Soucasse, Henk A. Dijkstra

http://arxiv.org/abs/2603.07331v2
Causal Attribution of Coastal Water Clarity Degradation to Nickel Processing Expansion at the Indonesia Morowali Industrial Park, Sulawesi
Updated: 2026-03-11T07:35:14Z
Indonesia's nickel ore export ban has driven rapid expansion of smelting and hydrometallurgical processing capacity at the Indonesia Morowali Industrial Park (IMIP), now the world's largest integrated nickel processing complex, on the coast of Central Sulawesi. Whether this industrialization has degraded the adjacent marine environment remains unquantified. We apply Bayesian structural time-series (BSTS) causal inference to a multi-decadal, multi-sensor satellite ocean color record of the diffuse attenuation coefficient at 490 nm, $K_d(490)$, to test for a causal link between IMIP expansion and nearshore turbidity change. A consensus structural breakpoint, a significant posterior causal effect estimated against a Banda Sea counterfactual, and a distribution-free placebo rank test collectively establish that coastal water clarity deteriorated after the transition from initial nickel pig iron production to hyper-expansion of high-pressure acid leaching facilities for battery-grade nickel. Satellite-derived land cover analysis independently corroborates this timing, showing substantial built-area growth and concurrent tree cover loss within the IMIP footprint. The resulting euphotic zone shoaling occurs in oligotrophic waters supporting high marine biodiversity, where even moderate optical degradation may impair coral photosynthesis and compress depth-dependent reef habitat.
These findings quantify a marine environmental cost absent from Indonesia's mineral downstreaming policy discourse and demonstrate a transferable, satellite-based quasi-experimental framework for causal impact assessment at coastal industrial sites in data-limited tropical settings.
Published: 2026-03-07T20:22:27Z
Comment: 19 pages, 8 figures
Authors: Sandy Hardian Susanto Herho, Alfita Puspa Handayani, Iwan Pramesti Anwar, Faruq Khadami, Karina Aprilia Sujatmiko, Doandy Yonathan Wibisono, Rusmawan Suwarman, Dasapta Erwin Irawan

http://arxiv.org/abs/2603.09868v1
CarbonBench: A Global Benchmark for Upscaling of Carbon Fluxes Using Zero-Shot Learning
Updated: 2026-03-10T16:33:28Z
Accurately quantifying terrestrial carbon exchange is essential for climate policy and carbon accounting, yet models must generalize to ecosystems underrepresented in sparse eddy covariance observations. Despite this challenge being a natural instance of zero-shot spatial transfer learning for time series regression, no standardized benchmark exists to rigorously evaluate model performance across geographically distinct locations with different climate regimes and vegetation types.
We introduce CarbonBench, the first benchmark for zero-shot spatial transfer in carbon flux upscaling. CarbonBench comprises over 1.3 million daily observations from 567 flux tower sites globally (2000-2024). It provides: (1) stratified evaluation protocols that explicitly test generalization across unseen vegetation types and climate regimes, separating spatial transfer from temporal autocorrelation; (2) a harmonized set of remote sensing and meteorological features to enable flexible architecture design; and (3) baselines ranging from tree-based methods to domain-generalization architectures. By bridging machine learning methodologies and Earth system science, CarbonBench aims to enable systematic comparison of transfer learning methods, serves as a testbed for regression under distribution shift, and contributes to next-generation climate modeling efforts.
Published: 2026-03-10T16:33:28Z
Authors: Aleksei Rozanov, Arvind Renganathan, Yimeng Zhang, Vipin Kumar

http://arxiv.org/abs/2603.20250v1
Developing Machine Learning-Based Watch-to-Warning Severe Weather Guidance from the Warn-on-Forecast System
Updated: 2026-03-10T14:18:03Z
While machine learning (ML) post-processing of convection-allowing model (CAM) output for severe weather hazards (large hail, damaging winds, and/or tornadoes) has shown promise for very short lead times (0-3 hours), its application to slightly longer forecast windows remains relatively underexplored. In this study, we develop and evaluate a grid-based ML framework to predict the probability of severe weather hazards over the next 2-6 hours using forecast output from the Warn-on-Forecast System (WoFS). Our dataset includes WoFS ensemble forecasts valid every 5 minutes out to 6 hours from 108 days during the 2019-2023 NOAA Hazardous Weather Testbed Spring Forecasting Experiments.
We train ML models to generate probabilistic forecasts of severe weather akin to Storm Prediction Center outlooks (i.e., likelihood of a tornado, severe wind, or severe hail event within 36 km of each point). We compare a histogram gradient-boosted tree (HGBT) model and a deep learning U-Net approach against a carefully calibrated baseline generated from 2-5 km updraft helicity. Results indicate that the HGBT and U-Net outperform the baseline, particularly at higher probability thresholds. The HGBT achieves the best performance metrics, but predicted probabilities cap at 60% while the U-Net forecasts extend to 100%. Similar to previous studies, the U-Net produces spatially smoother guidance than the tree-based method. These findings add to the growing evidence of the effectiveness of ML-based CAM post-processing for providing short-term severe weather guidance.
Published: 2026-03-10T14:18:03Z
Comment: 28 pages, 7 figures
Authors: Montgomery Flora, Samuel Varga, Corey Potvin, Noah Lang

http://arxiv.org/abs/2603.07893v2
Designing probabilistic AI monsoon forecasts to inform agricultural decision-making
Updated: 2026-03-10T14:09:08Z
Hundreds of millions of farmers make high-stakes decisions under uncertainty about future weather. Forecasts can inform these decisions, but available choices and their risks and benefits vary between farmers. We introduce a decision-theory framework for designing useful forecasts in settings where the forecaster cannot prescribe optimal actions because farmers' circumstances are heterogeneous. We apply this framework to the case of seasonal onset of monsoon rains, a key date for planting decisions and agricultural investments in many tropical countries. We develop a system for tailoring forecasts to the requirements of this framework by blending systematically benchmarked artificial intelligence (AI) weather prediction models with a new "evolving farmer expectations" statistical model.
This statistical model applies Bayesian inference to historical observations to predict time-varying probabilities of first-occurrence events throughout a season. The blended system yields more skillful Indian monsoon forecasts at longer lead times than its components or any multi-model average. In 2025, this system was deployed operationally in a government-led program that delivered subseasonal monsoon onset forecasts to 38 million Indian farmers, skillfully predicting that year's early-summer anomalous dry period. This decision-theory framework and blending system offer a pathway for developing climate adaptation tools for large vulnerable populations around the world.
Published: 2026-03-09T02:25:12Z
Authors: Colin Aitken, Rajat Masiwal, Adam Marchakitus, Katherine Kowal, Mayank Gupta, Tyler Yang, Amir Jina, Pedram Hassanzadeh, William R. Boos, Michael Kremer

http://arxiv.org/abs/2603.09196v1
Joint Diagnostics of Circumsolar Sky Brightness Using Coronagraphic Measurements and Aerosol Optical Inversions at Mauna Loa
Updated: 2026-03-10T05:10:03Z
Atmospheric aerosols strongly influence daytime sky quality for solar coronal imaging, yet few studies directly link aerosol properties and sky-brightness measurements within ~2° of the Sun. Here we compare externally occulted coronagraphic measurements of near-Sun radiance with aerosol-constrained inferences derived from direct-Sun and sky photometry. Our analysis focuses on Mauna Loa Observatory, a well-characterized high-altitude site for atmospheric and solar observations. We present coronagraphic measurements of near-Sun radiance at 1.54 +/- 0.77° from solar disk center acquired between 2006 and 2007 by an ATST Sky Brightness Monitor (SBM). These data are directly compared with circumsolar radiances inferred at 1.54° using AERONET almucantar measurements and aerosol optical retrievals.
We find quantitative agreement between these two approaches, enabling extension to multi-decadal analyses of circumsolar radiance and its relationship to aerosol properties and related proxies (e.g., the Angstrom exponent) using AERONET data from 2000-2025. Near-Sun radiances are expressed relative to the solar disk-center radiance, facilitating direct comparison with related studies. Finally, we synthesize physically based true-color images of the circumsolar sky under representative aerosol conditions as an observational aid, in part to illustrate that visually enhanced solar aureoles do not necessarily imply poor infrared coronal observing conditions. This methodology provides an extended framework for assessing daytime coronal sky quality at existing and future observing sites.
Published: 2026-03-10T05:10:03Z
Comment: 17 pages, 14 figures. Accepted for publication in ApJ
Journal reference: ApJ 1000 250 (2026)
Authors: Thomas A. Schad, Paul Bryans, Andre Fehlmann, Sarah Gibson, David M. Harrington, Lucas A. Tarr, Steven Tomczyk, Jeffrey G. Yepez
DOI: 10.3847/1538-4357/ae4ec2

http://arxiv.org/abs/2603.08550v1
Emergence of an Advective Boundary Layer in Monsoon Cross-Equatorial Flow: Scaling, Dynamics, and Idealized Models
Updated: 2026-03-09T16:15:04Z
The conventional Ekman model of the tropical boundary layer neglects nonlinear momentum advection and breaks down near the equator, where Coriolis effects are weak. During South Asian monsoon onset, we identify a dynamical regime transition to an advective boundary layer (ABL). Reanalysis links this transition to a shift in the zonal momentum balance from frictional to meridional-advection control as cross-equatorial flow intensifies, accompanied by increasing local Rossby number and vanishing absolute vorticity, signaling the breakdown of Ekman balance. A scaling analysis shows that this transition occurs when the meridional length scales of geopotential and zonal wind contract such that their product approaches $φ/f^2$.
In the resulting ABL regime, kinetic energy is governed by a balance between its generation and advection, yielding a linear diagnostic relation between meridional geopotential gradient and meridional wind. A simple theoretical model predicts that the sensitivity of this relation is controlled by an advective timescale that equals the inertial timescale ($1/f$) at the transition latitude, where zonal and meridional wind speeds become comparable. Testing this framework in idealized aquaplanet experiments confirms that stronger cross-equatorial pressure gradients and slower planetary rotation rates amplify advective effects and shift the transition latitude poleward. Across experiments, the sensitivity of meridional winds to the geopotential gradient remains tightly linked to $1/f$ at the transition latitude. Together, these results establish the ABL as a distinct dynamical regime, with important implications for monsoon onset, intraseasonal variability, and the representation of tropical boundary layer processes in climate models.
Published: 2026-03-09T16:15:04Z
Authors: Rajat Masiwal, Ashwin K Seshadri, Vishal Dixit

http://arxiv.org/abs/2603.08366v1
Observations and numerical simulations of a valley-exit wind in the Alpine Bolzano basin
Updated: 2026-03-09T13:29:02Z
The characteristics of the nocturnal drainage wind flowing from the tributary Isarco Valley into the Bolzano basin, in the Italian Alps, during wintertime are investigated. Analyses are performed by combining measurements from an intensive field campaign and the output of four high-resolution numerical simulations, run with the Weather Research and Forecasting (WRF) model using different planetary boundary-layer (PBL) schemes. Two episodes are identified, based on the vertical temperature stratification in the basin and the evolution of the drainage flow at the valley exit.
Numerical results show that the drainage flow behaves as a valley-exit wind, whose main structure at the exit of the valley is well captured by the model independently of the PBL scheme. However, the model struggles to correctly reproduce the temperature stratification in the basin, with better results when a PBL scheme including, among others, a prognostic equation for the temperature variance and a counter-gradient term is used. This has an impact on the simulation of the onset and duration of the valley-exit wind, which are sensitive to the temperature contrasts between the valley and the basin. Overall, the model is able to reproduce the different behavior of the drainage wind at the exit of the valley in the two case studies. It is found that the presence of a cold air pool in the basin favors an upward trajectory of the flow at the exit of the valley, resulting in unperturbed calm wind conditions in the lower levels. On the other hand, with weak temperature stratification, the drainage flow closely follows the topography, resulting in strong winds also near the surface.
Published: 2026-03-09T13:29:02Z
Comment: Under 'Minor Revision' on the Quarterly Journal of the Royal Meteorological Society
Authors: Federica Gucci, Andrea Zonato, Marco Falocchi, Dino Zardi, Lorenzo Giovannini

http://arxiv.org/abs/2603.07859v1
Impacts of Jet Stream Structure on Cyclone Merging and Persistent Anticyclones: Insights from Dry Idealized Simulations
Updated: 2026-03-09T00:21:21Z
Midlatitude jet streams exhibit substantial variability in latitude, width, and vertical depth on synoptic to multi-decadal timescales. While the upper-level dynamics of baroclinic waves have been extensively studied, the sensitivity of the extreme-generating, low-level phenomena to these variations remains underexplored. Here, we systematically investigate this sensitivity using dry, adiabatic idealized experiments with the GFDL FV3 dry dynamical core initialized with analytically specified jets.
We identify jet variations that control synoptic-scale features of interest. Results indicate that poleward-shifted jets accelerate initial cyclone intensification and favor anticyclonic Rossby Wave Breaking (RWB). These wave-breaking tendencies are consistent with established baroclinic paradigms, validating the newly configured idealized simulations. Additionally, jet width regulates the likelihood of surface cyclone merging. Poleward-shifted, broader, and higher jets produce more frequent cyclone merging, generating intense wind extremes. Finally, we show that poleward-shifted, broad, deep jets dynamically precondition the flow for persistent stationary anticyclones in the absence of diabatic contributions. Together, these findings illustrate how changes in jet stream structure may modulate midlatitude weather extremes.
Published: 2026-03-09T00:21:21Z
Authors: Mingfei Ren, Gan Zhang, Kai-Yuan Cheng, Lucas Harris, Talia Tamarin-Brodsky, Joseph Mouallem

http://arxiv.org/abs/2603.07849v1
Genuine Increases in Tropical Cyclone Intensities
Updated: 2026-03-08T23:57:54Z
Kossin et al. (2020) report a rising ratio of satellite observations of major C3-C5 storms relative to all C1-C5 storms from 1979 to 2017. Decomposing their R = N(C3+)/N(C1+) statistic into per-category shares shows that their trend was driven primarily by fewer C1 rather than more C3-C5 observations. From the first half to the second half of their sample period, their per-year C1 observations fell by 17%. However, extending the record through 2023 greatly changes the picture. Although the relative decline in C1 observations persists, C3 and C4 observations now increase, too. The signal about the intensification of storms now becomes genuine in the extended sample, in that it is driven no longer only by fewer weak but now also by more strong tropical cyclone observations.
Published: 2026-03-08T23:57:54Z
Comment: 14 pages, 2 figures, 3 tables
Authors: Ivo Welch
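
The decomposition argument in the last abstract can be illustrated with a small Python sketch: the ratio R = N(C3+)/N(C1+) rises when C1 counts fall, even if the C3-C5 counts do not change at all. The per-category counts below are invented purely for illustration (only the 17% fall in C1 echoes the abstract), and the function names are hypothetical.

```python
CATEGORIES = ("C1", "C2", "C3", "C4", "C5")
MAJOR = ("C3", "C4", "C5")


def major_ratio(counts):
    """R = N(C3+) / N(C1+): major-storm observations over all observations."""
    total = sum(counts[c] for c in CATEGORIES)
    return sum(counts[c] for c in MAJOR) / total


def category_shares(counts):
    """Share of each category in the total; R is the sum of the C3-C5 shares."""
    total = sum(counts[c] for c in CATEGORIES)
    return {c: counts[c] / total for c in CATEGORIES}


# Invented counts: only C1 changes between the two periods (a 17% fall).
first_half = {"C1": 100, "C2": 50, "C3": 30, "C4": 15, "C5": 5}
second_half = dict(first_half, C1=83)

r_first = major_ratio(first_half)
r_second = major_ratio(second_half)
print(f"R first half:  {r_first:.3f}")
print(f"R second half: {r_second:.3f}")
```

Because the denominator includes C1, a rising R alone cannot distinguish fewer weak storms from more strong ones; comparing the per-category counts or shares side by side is what separates the two cases.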