http://arxiv.org/abs/2604.07137v2 What's in the latent space? Exploring coupled tropical Pacific variability within a Multi-branch $β$-Variational Autoencoder 2026-04-11T13:11:14Z

What is encoded in the latent space of a multi-branch $β$-variational autoencoder ($β$-VAE) trained on coupled tropical Pacific climate fields? To answer this question, we assess the reconstruction skill and physical interpretability of the latent space of a multi-branch $β$-VAE trained on sea surface temperature, ocean heat content, and outgoing longwave radiation across the tropical Pacific from a 500-year preindustrial control simulation. The model generalizes well, with only modest degradation from training to test performance, and preserves the dominant basin-scale structure of all three fields. Latent-space diagnostics show that variability is organized unevenly across dimensions: sea surface temperature is concentrated in a smaller subset of latent dimensions, whereas ocean heat content and outgoing longwave radiation are more broadly distributed across multiple dimensions. Comparisons with conventional tropical Pacific diagnostics further show that several latent dimensions align with known El Niño and La Niña variability, while others capture related coupled ocean-atmosphere variability on decadal or longer timescales. Sensitivity experiments and latent traversals identify dimensions associated with eastern-Pacific-like, central-Pacific-like, coastal, subsurface-dominant, and atmosphere-dominant variability. Together, these results show that the multi-branch $β$-variational autoencoder yields a skillful and physically informative reduced representation of coupled tropical Pacific variability.
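
For readers unfamiliar with the objective, a minimal sketch of a generic $β$-VAE loss may help. It assumes a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction term. The function name `beta_vae_loss`, the default `beta=4.0`, and the single-input layout are illustrative assumptions and are not taken from the paper. A multi-branch model would presumably combine one reconstruction term per field, which is not shown here.

```python
import numpy as np

def beta_vae_loss(x, x_hat, mu, log_var, beta=4.0):
    """Generic beta-VAE objective (hypothetical helper, not the authors' code).

    x, x_hat : (batch, features) inputs and reconstructions
    mu, log_var : (batch, latent) posterior mean and log-variance
    """
    # Reconstruction term: squared error summed over features, averaged over the batch.
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    # Closed-form KL divergence between N(mu, diag(exp(log_var))) and N(0, I),
    # kept per latent dimension so its distribution across dimensions can be inspected.
    kl_per_dim = -0.5 * (1.0 + log_var - mu ** 2 - np.exp(log_var))
    kl = np.mean(np.sum(kl_per_dim, axis=1))
    # Total loss, reconstruction term, and batch-mean KL for each latent dimension.
    return recon + beta * kl, recon, kl_per_dim.mean(axis=0)
```

The per-dimension KL returned as the third value is one common way to see which latent dimensions carry information; whether the paper's latent-space diagnostics use it is not stated in the abstract.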

2026-04-08T14:31:07Z 52 pages, 17 figures, 4 tables. Version 2 (04/09/2026) Emily F. Wisinski Maria J. Molina Kyle J. C. Hall Hannah Bao Salil Mahajan Nan Rosenbloom John Fasullo http://arxiv.org/abs/2604.10094v1 Global monitoring of methane point sources using deep learning on hyperspectral radiance measurements from EMIT 2026-04-11T08:27:56Z

Anthropogenic methane (CH4) point sources drive near-term climate forcing, safety hazards, and system inefficiencies. Space-based imaging spectroscopy is emerging as a tool for identifying emissions globally, but existing approaches largely rely on manual plume identification. Here we present the Methane Analysis and Plume Localization with EMIT (MAPL-EMIT) model, an end-to-end vision transformer framework that leverages the complete radiance spectrum from the Earth Surface Mineral Dust Source Investigation (EMIT) instrument to jointly retrieve methane enhancements across all pixels within a scene. This approach brings together spectral and spatial context to significantly lower detection limits. MAPL-EMIT simultaneously supports enhancement quantification, plume delineation, and source localization, even for multiple overlapping plumes. The model was trained on 3.6 million physics-based synthetic plumes injected into global EMIT radiance data. Synthetic evaluation confirms the model's ability to identify plumes with high recall and precision and to capture weaker plumes relative to existing matched-filter approaches. On real-world benchmarks, MAPL-EMIT captures 79% of known hand-annotated NASA L2B plume complexes across a test set of 1084 EMIT granules, while capturing twice as many plausible plumes as were identified by human analysts. Further validation against coincident airborne data, top-emitting landfills, and controlled release experiments confirms the model's ability to identify previously uncaptured sources. By incorporating model-generated metrics such as spectral fit scores and estimated noise levels, the framework can further limit false-positive rates. Overall, MAPL-EMIT enables high-throughput implementation on the full EMIT catalog, shifting methane monitoring from labor-intensive workflows to a rapid, scalable paradigm for global plume mapping at the facility scale.

2026-04-11T08:27:56Z 43 pages, 27 figures, 4 tables Vishal V. Batchu Michelangelo Conserva Alex Wilson Anna M. Michalak Varun Gulshan Philip G. Brodrick Andrew K. Thorpe Christopher V. Arsdale http://arxiv.org/abs/2604.09510v1 On the Methodology for Assessing Vegetation Impacts on the Atmospheric Branch of the Hydrological Cycle 2026-04-10T17:28:49Z

China has undertaken unprecedented, state-driven vegetation restoration on a continental scale. This large-scale land-surface intervention offers a rare opportunity to assess how deliberate biospheric change influences climate-relevant processes, especially the hydrological cycle. Of particular interest is how increased water use by additional vegetation affects terrestrial water availability, including streamflow that sustains both ecosystems and human society. Here we evaluate the methodological basis for addressing this question in light of recently available data on hydrological change in China. Revisiting the atmospheric branch of the hydrological cycle, we argue that water yield depends fundamentally on vegetation-induced changes in atmospheric circulation. When the effects of vegetation on atmospheric dynamics are neglected, as in moisture-recycling-based approaches, the analysis is predisposed by construction toward diagnosing a negative effect of additional vegetation on water yield. Given the nonlinear dependence of precipitation on atmospheric moisture, we further suggest that streamflow reductions associated with added vegetation in dry regions reflect a transient phase of early ecological succession rather than a long-term outcome. As ecosystems mature and regional moisture regimes evolve, this relationship may reverse, generating a positive feedback between vegetation cover and water availability. We briefly discuss recent observational evidence consistent with this interpretation. We conclude that robust assessment of vegetation impacts on water yield requires frameworks that explicitly couple vegetation change, atmospheric processes, and hydrological responses. Such an approach is essential for distinguishing short-term trade-offs from longer-term system trajectories and for informing sustainable land management under continued ecosystem restoration and conservation.

2026-04-10T17:28:49Z 13 pages, 2 figures, 1 table A. M. Makarieva A. V. Nefiodov A. D. Nobre L. A. Cuartas F. Pasini D. Andrade http://arxiv.org/abs/2604.09346v1 OTProf: estimating high-resolution profiles of optical turbulence ($C_n^2$) from reanalysis using deep learning 2026-04-10T14:17:58Z

Accurate high-resolution vertical profiles of optical turbulence ($C_n^2$), which reflect local meteorology and topography, are crucial for ground-based optical astronomy and free-space optical communication. However, measuring these profiles or generating them with numerical weather models requires substantial operational or computational effort. In this work, we present OTProf, a deep-learning method that estimates high-resolution $C_n^2$ profiles from widely available coarse-resolution ERA5 reanalysis data. We evaluate the approach in the Netherlands and compare it with the commonly used Hufnagel-Valley model. Overall, OTProf reproduces the vertical structure of $C_n^2$ more accurately than Hufnagel-Valley and yields more accurate estimates of the Fried parameter $r_0$ and the scintillation index $σ_I^2$. As is typical in machine learning, the $C_n^2$ predictions are slightly smoothed compared to reference data, especially in cases of rare strong turbulence. This smoothing affects the integrated parameters, sometimes leading to overly optimistic $r_0$ and $σ_I^2$ values. Despite this limitation, OTProf offers a more accurate, efficient, and physically consistent alternative to traditional analytical models and computationally expensive mesoscale models.
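
To make the baseline and the integrated quantity concrete, the sketch below evaluates the standard Hufnagel-Valley profile and integrates it into a plane-wave Fried parameter, $r_0 = [0.423\,k^2 \sec ζ \int C_n^2(h)\,dh]^{-3/5}$. The default wind speed (21 m/s) and ground-level term ($1.7\times10^{-14}$ m$^{-2/3}$) are the textbook "HV 5/7" values. The function names, the 500 nm wavelength, and the 0 to 30 km grid are illustrative assumptions. The abstract does not say which Hufnagel-Valley parameters were used in the comparison.

```python
import numpy as np

def hufnagel_valley(h, v=21.0, a=1.7e-14):
    """Hufnagel-Valley Cn2 profile [m^(-2/3)] at heights h [m] above ground.

    Defaults correspond to the textbook HV 5/7 parameters (an assumption here).
    """
    h = np.asarray(h, dtype=float)
    return (0.00594 * (v / 27.0) ** 2 * (1e-5 * h) ** 10 * np.exp(-h / 1000.0)
            + 2.7e-16 * np.exp(-h / 1500.0)
            + a * np.exp(-h / 100.0))

def fried_parameter(h, cn2, wavelength=500e-9, zenith_deg=0.0):
    """Plane-wave Fried parameter r0 [m] from a vertical Cn2 profile."""
    h = np.asarray(h, dtype=float)
    cn2 = np.asarray(cn2, dtype=float)
    k = 2.0 * np.pi / wavelength                      # optical wavenumber [rad/m]
    # Trapezoidal integration of Cn2 over height.
    integral = np.sum(0.5 * (cn2[1:] + cn2[:-1]) * np.diff(h))
    sec_zenith = 1.0 / np.cos(np.radians(zenith_deg))
    return (0.423 * k ** 2 * sec_zenith * integral) ** (-3.0 / 5.0)

heights = np.arange(0.0, 30001.0, 1.0)               # 0 to 30 km in 1 m steps
r0 = fried_parameter(heights, hufnagel_valley(heights))
```

Because $r_0$ scales with the integrated $C_n^2$ to the power $-3/5$, any underestimate of strong turbulence layers raises $r_0$, which is consistent with the "overly optimistic" behavior the abstract attributes to smoothing.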

2026-04-10T14:17:58Z Maximilian Pierzyna Sukanta Basu Rudolf Saathof http://arxiv.org/abs/2509.26258v3 EnScale: Temporally-consistent multivariate generative downscaling via proper scoring rules 2026-04-10T13:05:06Z

The practical use of future climate projections from global circulation models (GCMs) is often limited by their coarse spatial resolution, requiring downscaling to generate high-resolution data. Regional climate models (RCMs) provide this refinement, but are computationally expensive. To address this issue, machine learning (ML) models can learn the downscaling function, mapping coarse GCM outputs to high-resolution fields. Among these, generative approaches aim to capture the full conditional distribution of RCM data given coarse-scale GCM data, which is characterized by large variability and thus challenging to model accurately. We introduce EnScale, a generative ML framework emulating the full GCM-to-RCM map by training on multiple pairs of GCM and corresponding RCM data. It first adjusts large-scale mismatches between GCM and coarsened RCM data, followed by a super-resolution step to generate high-resolution fields. To efficiently model the high-dimensional output, the super-resolution step employs a novel class of sparse local stochastic layers. Both steps employ generative models optimized with the energy score, a proper scoring rule. Compared to state-of-the-art ML downscaling approaches, our setup reduces computational cost by about one order of magnitude. EnScale jointly emulates multiple variables -- temperature, precipitation, solar radiation, and wind -- spatially consistent over Central Europe. In addition, we propose a variant EnScale-t that enables temporally consistent downscaling. We establish a comprehensive evaluation framework across various categories including calibration, spatial and temporal structure, extremes, and multivariate dependencies. Comparison with diverse benchmarks demonstrates EnScale(-t)'s competitive performance and computational efficiency, offering a promising approach for accurate and temporally consistent RCM emulation.
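
The energy score mentioned above can be estimated directly from samples. For a predictive distribution $P$ and an observation $y$ it is $\mathrm{ES}(P, y) = \mathbb{E}\lVert X - y\rVert - \tfrac{1}{2}\,\mathbb{E}\lVert X - X'\rVert$ with $X, X'$ drawn independently from $P$. The sketch below is a plain NumPy estimator under that definition. The function name `energy_score` and the choice of averaging over distinct sample pairs are assumptions made for illustration and are not taken from the EnScale code.

```python
import numpy as np

def energy_score(samples, y):
    """Sample-based energy score (lower is better); hypothetical helper.

    samples : (m, d) array with m >= 2 draws from the predictive distribution
    y       : (d,) observed target
    """
    samples = np.asarray(samples, dtype=float)
    y = np.asarray(y, dtype=float)
    m = samples.shape[0]
    # Mean Euclidean distance between each sample and the observation.
    term_obs = np.mean(np.linalg.norm(samples - y, axis=1))
    # Mean Euclidean distance between distinct sample pairs (diagonal is zero).
    diffs = samples[:, None, :] - samples[None, :, :]
    pairwise = np.linalg.norm(diffs, axis=2)
    term_spread = pairwise.sum() / (m * (m - 1))
    return term_obs - 0.5 * term_spread
```

The first term rewards samples that lie close to the observation, while the second rewards ensemble spread, so a generator cannot minimize the score by collapsing to a single deterministic prediction.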

2025-09-30T13:46:14Z Updates according to suggestions by anonymous reviewers: improved methodology for temporal consistency; add preliminary results for extrapolation to unseen GCMs; add further evaluation via histograms, ACFs and for climate change signal; improved explanations and wordings in several places Maybritt Schillinger Maxim Samarin Xinwei Shen Reto Knutti Nicolai Meinshausen http://arxiv.org/abs/2604.06760v2 Single Scattering Properties for an Ensemble of Randomly Oriented Convex Polyhedra in Geometrical Optics Regime 2026-04-10T01:53:37Z

To study how geometrical shape affects the light-scattering properties of an ensemble of randomly oriented particles, the single scattering matrices, including complete polarization information, are calculated statistically for a group of crystals with random geometrical shapes and a group of hexagonal prisms with various aspect ratios, using the geometrical optics approximation. For comparison, the single scattering matrices for an individual random irregular crystal and an individual hexagonal prism are also presented. It should be noted that all statistical simulation experiments in this study are restricted to the following conditions: diffraction and absorption effects are neglected, calculations are performed at a single fixed wavelength, particles are assumed to be randomly oriented, and the simulations are limited to the regime where the geometric optics approximation is applicable. Using a unified computational framework for scattering matrices of convex polyhedra, we carried out a series of statistical numerical simulations. The flexibility of this framework in modifying particle geometry enables a systematic investigation of shape-dependent scattering characteristics. The results demonstrate that regular and irregular particles exhibit noticeably different scattering matrix signatures, and ensembles of irregular particles yield smooth and featureless non-zero matrix elements. In contrast, ensembles of regular hexagonal particles with varying aspect ratios retain common geometric scattering features.

2026-04-08T07:26:52Z Quan Mu http://arxiv.org/abs/2604.03906v2 Improving Model Performance by Adapting the KGE Metric to Account for System Non-Stationarity 2026-04-09T23:34:09Z

Geoscientific systems tend to be characterized by pronounced temporal non-stationarity, arising from seasonal and climatic variability in hydrometeorological drivers, and from natural and anthropogenic changes to land use and cover. As has been pointed out, such variability renders "the assumption of statistical stationarity obsolete in water management", and requires us to "account for, rather than ignore, non-stationary trends" in the data. However, metrics used for model development are typically based on the implicit and unjustifiable assumption that the data generating process is time-stationary. Here, we introduce the JKGE_ss metric (adapted from KGE_ss) that detects and accounts for dynamical non-stationarity in the statistical properties of the data and thereby improves information extraction and model performance. Unlike NSE and KGE_ss, which use the long-term mean as a benchmark against which to evaluate model efficiency, JKGE_ss emphasizes reproduction of temporal variations in system storage. We tested the robustness of the new metric by training physical-conceptual and data-based catchment-scale models of varying complexity across a wide range of hydroclimatic conditions, from recent-precipitation-dominated to snow-dominated to strongly arid. In all cases, the result was improved reproduction of system temporal dynamics at all time scales, across wet to dry years, and over the full range of flow levels (especially recession periods). Since traditional metrics fail to adequately account for temporal shifts in system dynamics, potentially resulting in misleading assessments of model performance under changing conditions, we recommend the adoption of JKGE_ss for geoscientific model development.
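
For context, the sketch below computes the standard Kling-Gupta efficiency, $\mathrm{KGE} = 1 - \sqrt{(r-1)^2 + (α-1)^2 + (β-1)^2}$, where $r$ is the linear correlation, $α$ the ratio of standard deviations, and $β$ the ratio of means between simulation and observation. The skill-score rescaling shown, which maps the long-term-mean benchmark ($\mathrm{KGE} = 1 - \sqrt{2}$) to zero, is one common convention and is assumed here to be what KGE_ss denotes. The JKGE_ss metric itself is not reproduced, because the abstract does not give its formula. Function names are illustrative.

```python
import numpy as np

def kge(sim, obs):
    """Standard Kling-Gupta efficiency (1 is a perfect match)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]        # linear correlation
    alpha = sim.std() / obs.std()          # variability ratio
    beta = sim.mean() / obs.mean()         # bias ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)

def kge_skill_score(sim, obs):
    """KGE rescaled so the observed-mean benchmark scores 0 (assumed convention)."""
    # Predicting the observed mean everywhere gives KGE = 1 - sqrt(2).
    return 1.0 - (1.0 - kge(sim, obs)) / np.sqrt(2.0)
```

All three components are computed over the full record, which is the time-stationary benchmark the abstract argues against.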

2026-04-05T00:17:10Z M Jawad HV Gupta YH Wang MA Farmani A Behrangi GY Niu http://arxiv.org/abs/2604.08772v1 CERBERUS: A Three-Headed Decoder for Vertical Cloud Profiles 2026-04-09T21:16:42Z

Atmospheric clouds exhibit complex three-dimensional structure and microphysical details that are poorly constrained by the predominantly two-dimensional satellite observations available at global scales. This mismatch complicates data-driven learning and evaluation of cloud processes in weather and climate models, contributing to ongoing uncertainty in atmospheric physics. We introduce CERBERUS, a probabilistic inference framework for generating vertical radar reflectivity profiles from geostationary satellite brightness temperatures, near-surface meteorological variables, and temporal context. CERBERUS employs a three-headed encoder-decoder architecture to predict a zero-inflated (ZI) vertically-resolved distribution of radar reflectivity. Trained and evaluated using ground-based Ka-band radar observations at the ARM Southern Great Plains site, CERBERUS recovers coherent structures across cloud regimes, generalizes to withheld test periods, and provides uncertainty estimates that reflect physical ambiguity, particularly in multilayer and dynamically complex clouds. These results demonstrate the value of distribution-based learning targets for bridging observational scales, introducing a path toward model-relevant synthetic observations of clouds.

2026-04-09T21:16:42Z Accepted for oral presentation at 2026 ICLR workshop on Machine Learning for Remote Sensing Emily K. deJong Nipun Gunawardena Kevin Smalley Hassan Beydoun Peter Caldwell http://arxiv.org/abs/2604.08634v1 Resolving satellite-in situ mismatches in Net Primary Production using high-frequency in situ bio-optical observations in the subpolar Northwest Atlantic 2026-04-09T17:06:10Z

Net primary productivity (NPP) forms the basis of the biological carbon pump, but its estimates in high-latitude regions remain highly uncertain despite its disproportionate importance for the global carbon sink. Optical satellites are limited by cloud cover, low irradiance, and shallow light penetration, with uncertainties further exacerbated by the lack of in situ validations and regional model tuning for NPP measurements. This study compared two satellite-based models, a global (VGPM) and a regionally tuned (BIO) NPP model, with a time series of in situ NPP. Using a high-frequency, depth-resolved moored profiler in the subpolar Northwest Atlantic (56°N) in 2016, in situ NPP was estimated from daily bio-optical profiles and prior measurement of photosynthesis-irradiance (P-I) parameters. Our findings indicated that the satellite-derived estimates overestimated depth-integrated NPP by a factor of 2.5 to 4. However, the reasons for the discrepancies varied between the VGPM and BIO models. VGPM used global photosynthetic parameters with a simplified depth assumption, leading to an unrealistic vertical structure for depth-integrated NPP, even though its surface values were lower than in situ estimates. A major phytoplankton bloom in June-July was missed by VGPM, likely due to the use of non-regionally calibrated OCI Chl-a, which led to an underestimation of biomass. In contrast, the BIO model used regionally tuned POLY4 Chl-a products, and the differences in the assignment of P-I parameters accounted for the remaining discrepancies. This study showed that good agreement between satellite and in situ NPP is achievable if the challenge of P-I assignment can be overcome. We recommend further studies to investigate discrepancies of NPP estimates in high-latitude regions, focusing on data sources and model choices, as well as improving regional model calibration to enhance NPP accuracy.

2026-04-09T17:06:10Z 39 pages, 12 figures Kitty Kam Emmanuel Devred Stephanie Clay Mohammad M. Amirian Andrew Irwin Dariia Atamanchuk Uta Send Douglas W. R. Wallace http://arxiv.org/abs/2604.08442v1 Ecohydrological Controls on Moist Convection and Long-Term Rainfall Feedback 2026-04-09T16:39:16Z

To elucidate how land surface state and soil moisture dynamics regulate moist convection, and how convective rainfall subsequently reshapes surface and root-zone hydrology, we develop a stochastic dynamical model that couples soil moisture, vegetation hydraulics, atmospheric boundary layer evolution, and convective available potential energy (CAPE). We show that CAPE depends not only on the free-tropospheric environment but also on soil moisture, through its control of surface fluxes, boundary-layer growth, and the timing of the intersection between the atmospheric boundary layer and the lifting condensation level (LCL). Soil texture and plant properties strongly modulate convective potential during dry-down. Loamy sand favors convection at relatively high soil moisture and maintains the largest CAPE at the time of LCL-ABL crossing across drying conditions. In contrast, sandy soils exhibit high CAPE when wet but lose convective potential rapidly as they dry. As matric potential becomes more negative, convection is increasingly suppressed in finer, loamy clay textures. Plant functional type further shapes dry-down dynamics: water-use-maximizing strategies can enhance dry persistence via stomatal closure during drying, whereas more conservative strategies can sustain convection for longer periods. On longer timescales, stochastic rainfall forcing with CAPE-dependent precipitation intensity produces persistent wet and dry soil moisture regimes, with switching times that depend on soil hydraulic properties, plant physiological traits, and atmospheric conditions.

2026-04-09T16:39:16Z 36 pages, 11 tables and figures Elizabeth Cultra Jun Yin Mark Bartlett Amilcare Porporato http://arxiv.org/abs/2601.23190v2 Hybrid physics-data-driven modeling for sea ice thermodynamics and transfer learning 2026-04-09T16:35:19Z

This study explores a physics-data driven hybrid approach for sea-ice column physics models, in which a machine learning (ML) component acts as a state-dependent parameterization of forecast errors. We examine how perturbations in snow thermodynamics and sea-ice radiative properties affect forecast errors, and train dedicated neural networks (NNs) for each model configuration. The performance of the hybrid models is evaluated for long lead-time forecasts and compared against a benchmark system based on climatological forecast-error estimates. The NN-based hybrids prove to be stable, robust to initial condition and atmospheric forcing errors, and consistently outperform their climatology-based counterpart. To derive guiding principles for efficiently handling possible physical model updates, we perform transfer learning experiments to test whether pretrained NNs optimized for one model configuration can be successfully adapted to another. Results indicate that direct evaluation of pretrained networks on the target task provides useful insights into their adaptability, recommending transfer learning whenever performance exceeds a trivial baseline. Finally, a feature-importance analysis shows that atmospheric forcing inputs have negligible influence on NN predictive skill, while ice-layer enthalpies play a key role in achieving satisfactory performance.

2026-01-30T17:12:39Z Giovanni De Cillis Alberto Carrassi Julien Brajard Laurent Bertino Matteo Broccoli Dorotea Iovino Tobias Sebastian Finn Marc Bocquet http://arxiv.org/abs/2602.08022v2 Linear Response and Optimal Fingerprinting for Nonautonomous Systems 2026-04-09T11:40:00Z

We provide a link between response theory, pullback measures, and the optimal fingerprinting method that paves the way for a) predicting the impact of acting forcings on time-dependent systems and b) attributing observed anomalies to acting forcings when the reference state is not time-independent. We derive formulas for linear response theory for time-dependent Markov chains and diffusion processes. We discuss existence, uniqueness, and differentiability of the equivariant measure under general (not necessarily slow or periodic) perturbations of the transition kernels. Our results allow for extending the theory of optimal fingerprinting for detection and attribution of climate change (or change in any complex system) when the background state is time-dependent and when the optimal solution is sought for multiple time slices at the same time. We provide numerical support for the findings by applying our theory to a modified version of the Ghil-Sellers energy balance model. We verify the precision of response theory - even in a coarse-grained setting - in predicting the impact of increasing CO$_2$ concentration on the temperature field. Additionally, we show that the optimal fingerprinting method developed here is capable of attributing the climate change signal to multiple acting forcings across a vast time horizon.

2026-02-08T15:53:41Z 28 pages, 3 figures, updated discussion and bibliography, full database and codes online Valerio Lucarini http://arxiv.org/abs/2604.08055v1 Dissipating the correlation smokescreen: Causal decomposition of the radiative effects of biomass burning aerosols over the South-East Atlantic 2026-04-09T10:04:07Z

Biomass burning aerosols (BBAs) from Southern Africa seasonally overlie the semi-permanent South-East Atlantic (SEA) stratocumulus deck, impacting the region's energy budget through complex aerosol-cloud-radiation-meteorology interactions. Climate model intercomparison initiatives, like the Aerosol Comparisons between Observations and Models (AeroCom), have highlighted the large inter-model variability for BBA radiative effects, especially over the SEA, due to parameterization of emission modeling and smoke properties. Observational constraints are needed to reduce these uncertainties, but correlative observational studies are typically affected by confounding meteorological influences. We propose a physically informed statistical approach, based on causal graphs applied to satellite observations, to disentangle BBA influences on shortwave radiation over the SEA and identify the main sources of statistical biases plaguing observational studies. We find that, during the fire season, BBAs cause a regional shortwave cooling of -2.5 W m$^{-2}$, which can be decomposed into equal contributions from three physical pathways: aerosol-radiation interactions (ARI), adjustments to ARI, and aerosol-cloud interactions (ACI). We also perform ablation experiments with graph variants to investigate the main sources of confounding - like large-scale winds, humidity-biased retrievals or spatial aggregation of data - and show that they result in biased radiative effect estimates (between -50 $\%$ and +15 $\%$). Once free of such biases, our derived causal estimates of smoke radiative effects can be used as observational constraints to improve climate models.

2026-04-09T10:04:07Z Climate Informatics 2026 Emilie Fons Isabel L. McCoy Tom Beucler David Neubauer Ulrike Lohmann http://arxiv.org/abs/2604.07861v1 Comparing Ocean Forecasts Driven with Machine Learning-based and Physics-based Atmospheric Forcings 2026-04-09T06:20:32Z

Operational ocean forecasting systems conventionally employ dynamical ocean models driven by atmospheric forcing derived from numerical weather prediction (NWP) models. Recent advancements in artificial intelligence and machine learning (ML) have led to the development of ML-based atmospheric weather models, which have competitive, if not better, medium-range forecast accuracy compared to traditional NWP systems. This study evaluates the impact of ML-based atmospheric forcing on ocean forecast skill through two sets of 10-day forecasts using the UK Met Office GOSI9 configuration of the NEMO dynamical ocean model. Both experiments share identical ocean initial conditions but differ in atmospheric forcing: one uses ECMWF's ML-based AIFS model, while the other uses the Australian Bureau of Meteorology's physics-based NWP model, ACCESS-G3. Forecasts were initialized on the first day of each month over the period 2023-2024. The quality of the atmospheric forcing was assessed by comparing AIFS and ACCESS-G3 forecast skill against both ECMWF reanalysis v5 (ERA5) and ACCESS-G3 analyses. Results indicate that AIFS consistently outperforms ACCESS-G3, either from the initial forecast time or after the first few days. Oceanic forecast skill was evaluated against both the GOSI9 reanalysis and observations, focusing on key surface variables including sea surface temperature, salinity, sea level, and ocean currents. The ocean forecasts forced with AIFS atmospheric data exhibit comparable or enhanced predictive skill compared to those forced with ACCESS-G3 data. These findings underscore the potential of ML-based atmospheric models to replace traditional NWP forcing in operational ocean forecasting systems, offering improved accuracy and computational efficiency.

2026-04-09T06:20:32Z 67 pages, 42 Figures Xiaobing Zhou Frank Colberg Debra Hudson Yonghong Yin Griffith Young Christopher Bladwell Catherine Deburgh-Day http://arxiv.org/abs/2507.09211v2 Capturing Unseen Spatial Heat Extremes Through Dependence-Aware Generative Modeling 2026-04-09T04:11:15Z

Observed records of climate extremes provide an incomplete view of risk, missing "unseen" events beyond historical experience. Ignoring spatial dependence further underestimates hazards striking multiple locations simultaneously. We introduce DeepX-GAN (Dependence-Enhanced Embedding for Physical eXtremes - Generative Adversarial Network), a deep generative model that explicitly captures the spatial structure of rare extremes. Its zero-shot generalizability enables simulation of statistically plausible extremes beyond the observed record, validated against long climate model large-ensemble simulations. We define two unseen types: direct-hit extremes that affect the target and near-miss extremes that narrowly miss. These unrealized events reveal hidden risks and can either prompt proactive adaptation or reinforce a sense of false resilience. Applying DeepX-GAN to the Middle East and North Africa shows that unseen heat extremes disproportionately threaten countries with high vulnerability and low socioeconomic readiness. Future warming is projected to expand and shift these extremes, creating persistent hotspots in Northwest Africa and the Arabian Peninsula, and new hotspots in Central Africa, necessitating spatially adaptive risk planning.

2025-07-12T09:06:45Z Xinyue Liu Xiao Peng Shuyue Yan Yuntian Chen Dongxiao Zhang Zhixiao Niu Hui-Min Wang Xiaogang He