https://arxiv.org/api/VvxC6wLvMX/P5U/G9Viz6P3OMw4
2026-06-18T20:53:41Z
838149515

http://arxiv.org/abs/2602.17708v1
Spectral Homogenization of the Radiative Transfer Equation via Low-Rank Tensor Train Decomposition
2026-02-12T22:41:39Z
Radiative transfer in absorbing-scattering media requires solving a transport equation across a spectral domain with 10^5 - 10^6 molecular absorption lines. Line-by-line (LBL) computation is prohibitively expensive, while existing approximations sacrifice spectral fidelity. We show that the Young-measure homogenization framework produces solution tensors I that admit low-rank tensor-train (TT) decompositions whose bond dimensions remain bounded as the spectral resolution Ns increases. Using molecular line parameters from the HITRAN database for H2O and CO2, we demonstrate that: (i) the TT rank saturates at r = 8 (at tolerance e = 10^-6) from Ns = 16 to 4096, independent of single-scattering albedo, Henyey-Greenstein asymmetry, temperature, and pressure; (ii) quantized tensor-train (QTT) representations achieve sub-linear storage scaling; (iii) in a controlled comparison using identical opacity data and transport solver, the homogenized approach achieves over an order of magnitude lower L2 error than the correlated-k distribution at equal cost; and (iv) for atomic plasma opacity (aluminum at 60 eV, TOPS database), the TT rank saturates at r = 15 with fundamentally different spectral structure (bound-bound and bound-free transitions spanning 12 decades of dynamic range), confirming that rank boundedness is a property of the transport equation rather than any particular opacity source. These results establish that the spectral complexity of radiative transfer has a finite effective rank exploitable by tensor decomposition, complementing the spatial-angular compression achieved by existing TT and dynamical low-rank approaches.
2026-02-12T22:41:39Z
30 pages; submitted for publication
Y. Sungtaek Ju

http://arxiv.org/abs/2602.15056v1
Reconstructing Carbon Monoxide Reanalysis with Machine Learning
2026-02-12T21:58:00Z
The Copernicus Atmospheric Monitoring Service provides reanalysis products for atmospheric composition by combining model simulations with satellite observations. The quality of these products depends strongly on the availability of the observational data, which can vary over time as new satellite instruments become available or are discontinued, such as Carbon Monoxide (CO) observations of the Measurements Of Pollution In The Troposphere (MOPITT) satellite in early 2025. Machine learning offers a promising approach to compensate for such data losses by learning systematic discrepancies between model configurations. In this study, we investigate machine learning methods to predict monthly-mean total column of Carbon Monoxide re-analysis from a control model simulation.
2026-02-12T21:58:00Z
Paula Harder, Johannes Flemming

http://arxiv.org/abs/2602.11825v1
CAAL: Confidence-Aware Active Learning for Heteroscedastic Atmospheric Regression
2026-02-12T11:09:58Z
Quantifying the impacts of air pollution on health and climate relies on key atmospheric particle properties such as toxicity and hygroscopicity. However, these properties typically require complex observational techniques or expensive particle-resolved numerical simulations, limiting the availability of labeled data. We therefore estimate these hard-to-measure particle properties from routinely available observations (e.g., air pollutant concentrations and meteorological conditions). Because routine observations only indirectly reflect particle composition and structure, the mapping from routine observations to particle properties is noisy and input-dependent, yielding a heteroscedastic regression setting. With a limited and costly labeling budget, the central challenge is to select which samples to measure or simulate.
While active learning is a natural approach, most acquisition strategies rely on predictive uncertainty. Under heteroscedastic noise, this signal conflates reducible epistemic uncertainty with irreducible aleatoric uncertainty, causing limited budgets to be wasted in noise-dominated regions. To address this challenge, we propose a confidence-aware active learning framework (CAAL) for efficient and robust sample selection in heteroscedastic settings. CAAL consists of two components: a decoupled uncertainty-aware training objective that separately optimises the predictive mean and noise level to stabilise uncertainty estimation, and a confidence-aware acquisition function that dynamically weights epistemic uncertainty using predicted aleatoric uncertainty as a reliability signal. Experiments on particle-resolved numerical simulations and real atmospheric observations show that CAAL consistently outperforms standard AL baselines. The proposed framework provides a practical and general solution for the efficient expansion of high-cost atmospheric particle property databases.
2026-02-12T11:09:58Z
17 pages in total
Fei Jiang, Jiyang Xia, Junjie Yu, Mingfei Sun, Hugh Coe, David Topping, Dantong Liu, Zhenhui Jessie Li, Zhonghua Zheng

http://arxiv.org/abs/2602.11313v1
Hierarchical Testing of a Hybrid Machine Learning-Physics Global Atmosphere Model
2026-02-11T19:34:50Z
Machine learning (ML)-based models have demonstrated high skill and computational efficiency, often outperforming conventional physics-based models in weather and subseasonal predictions. While prior studies have assessed their fidelity in capturing synoptic-scale atmospheric dynamics, their performance across timescales and under out-of-distribution forcing, such as +3K or +4K uniform-warming forcings, and the sources of their biases remain elusive, even though both are needed to establish model reliability for Earth science.
Here, we design three sets of experiments targeting synoptic-scale phenomena, interannual variability, and out-of-distribution uniform-warming forcings. We evaluate the Neural General Circulation Model (NeuralGCM), a hybrid model integrating a dynamical core with ML-based components, against observations and physics-based Earth system models (ESMs). At the synoptic scale, NeuralGCM captures the evolution and propagation of extratropical cyclones with performance comparable to ESMs. At the interannual scale, when forced by El Niño-Southern Oscillation sea surface temperature (SST) anomalies, NeuralGCM successfully reproduces associated teleconnection patterns but exhibits deficiencies in capturing nonlinear response. Under out-of-distribution uniform-warming forcings, NeuralGCM simulates similar responses in global-average temperature and precipitation and reproduces large-scale tropospheric circulation features similar to those in ESMs. Notable weaknesses include overestimating the tracks and spatial extent of extratropical cyclones, biases in the teleconnected wave train triggered by tropical SST anomalies, and differences in upper-level warming and stratospheric circulation responses to SST warming compared to physics-based ESMs. The causes of these weaknesses were explored.
2026-02-11T19:34:50Z
48 pages, 9 figures
Ziming Chen, L. Ruby Leung, Wenyu Zhou, Jian Lu, Sandro W. Lubis, Ye Liu, Chuan-Chieh Chang, Bryce E. Harrop, Ya Wang, Mingshi Yang, Gan Zhang, Yun Qian

http://arxiv.org/abs/2602.15052v1
Blackening Cryosphere: Revealing Hotspot Shifts and HGB-Based Forecasting of Absorbing Aerosol Threats over the Himalayan Frozen Frontiers
2026-02-11T10:40:19Z
Black carbon and mineral dust are key absorbing aerosols that influence atmospheric radiation and increasingly threaten global cryospheric stability. This study examines the long-range transport and seasonal variability of these aerosols over Pakistan and their movement toward the western Himalayas.
Using satellite-derived Absorption Aerosol Optical Depth (AAOD) data from 2019 to mid-2025, we analyse their spatiotemporal behaviour across Pakistan's urban lowlands and high-altitude regions. Fifteen-day aggregated AAOD fields are used to track seasonal transport into glaciated terrain, where deposited aerosols can darken snow and ice and accelerate melt. For high-AAOD events, a probabilistic forecasting approach based on machine learning (ML) was developed. Using geographical, seasonal, and lagged indicators, a histogram-based gradient boosting classifier was trained to predict AAOD exceedance one step in advance. ROC-AUC, PR-AUC, and the Brier score were used to assess the model's performance. The results show high predictive capacity and good probability calibration, with values of 0.791, 0.269, and 0.028, respectively. Forecasts indicate that areas adjacent to Himalayan glaciers consistently exhibit the highest probability of increasing AAOD, signalling an elevated risk of aerosol-induced snowmelt.
2026-02-11T10:40:19Z
Under review
Abira Sengupta, Ayoti Banerjee, Sarbani Palit, Brendon Woodford

http://arxiv.org/abs/2511.23460v2
Near-inertial waves enhance vertical transport at ocean fronts
2026-02-11T07:47:03Z
The interactions between near-inertial waves (NIWs) and submesoscale currents in the surface ocean are challenging to deconvolve due to their overlapping temporal and spatial scales. The frequency of NIW is modulated by the relative vorticity, $ζ$, of submesoscale currents, which varies between positive and negative $ζ$ of $O(f)$ on spatial scales of 1 -- 10~$km$, particularly across fronts where the horizontal buoyancy gradient, $\nabla_H b$, is intensified. The effective NIW frequency $f_{\scriptstyle{eff}} = f + ζ/2$ can therefore also vary by $O(f)$ on these scales, causing the waves to be out of phase. This generates periodic convergence and divergence in the surface layer, particularly at fronts.
The resulting vertical motion, known as inertial pumping, is traditionally considered to be reversible. However, the strong vertical shear of the horizontal velocity at fronts, $v_z \sim |\nabla_H b|/f$, implies that not all of the water that is pumped downward will return. We examine the effect of this asymmetry on the vertical transport of tracers with an ambient vertical gradient, analogous to biogeochemical tracers, such as oxygen and dissolved organic carbon. Using numerical simulations of an unstable front forced by NIW, we demonstrate that inertial pumping can lead to net vertical tracer transport. Spectral analysis of the vertical tracer flux given by the covariance between tracer and vertical velocity anomalies reveals that the interaction of strong NIW with submesoscale currents enhances the vertical exchange at the front on both the sub-inertial and inertial time scales.
2025-11-28T18:52:30Z
33 pages, 1 Table, 9 Figures, SI Text S1 to S5, Table S1, SI Movie S1 and SI Figures S1-S11
Nihar Paul (Woods Hole Oceanographic Institution, Massachusetts, US), Amala Mahadevan (Woods Hole Oceanographic Institution, Massachusetts, US)

http://arxiv.org/abs/2602.10309v1
Insights from Ex-Typhoon Halong (2025) -- An Arctic Cyclone of Tropical Origin
2026-02-10T21:27:36Z
An Arctic cyclone, Ex-Typhoon Halong, produced strong winds and devastating flooding in southwestern Alaska during 11-12 October 2025. This study examines the evolution of Halong after its transition into an extratropical cyclone through the analysis of ERA5 reanalysis and WRF model simulations. It is found that warm sea surface temperature (SST) anomalies over the western North Pacific preconditioned ex-Halong for intensification by increasing water-vapor content and reducing static stability. Quasi-geostrophic lifting associated with a subsequent interaction with another extratropical cyclone led to the rapid deepening of ex-Halong.
This case demonstrates that tropical cyclones can transition into extratropical systems that are intensified by anomalously warm ocean waters, exacerbating impacts in high latitudes. Further analyses indicate that an increasing fraction of Alaskan cyclones has originated in tropical latitudes (south of 30°N) in recent decades. In particular, the frequency of Arctic cyclones of tropical origin increased by a factor of four in August and by a factor of three in September during 1980-2025 compared with 1940-1979.
2026-02-10T21:27:36Z
This Work has been submitted to Bulletin of the American Meteorological Society (BAMS). Copyright in this Work may be transferred without further notice
Mingshi Yang, Zhuo Wang, John E. Walsh, James D. Doyle, Richard L. Thoman, Alice K. DuVivier

http://arxiv.org/abs/2509.08790v2
Entropy-Stable Discontinuous Spectral-Element Methods for the Spherical Shallow Water Equations in Covariant Form
2026-02-09T18:05:52Z
We introduce discontinuous spectral-element methods of arbitrary order that are well balanced, conservative of mass, and conservative or dissipative of total energy (i.e., a mathematical entropy function) for a covariant flux formulation of the rotating shallow water equations with variable bottom topography on curved manifolds such as the sphere. The proposed methods are based on a skew-symmetric splitting of the tensor divergence in covariant form, which we implement and analyze within a general flux-differencing framework using tensor-product summation-by-parts operators. Such schemes are proven to satisfy semi-discrete mass and energy conservation on general unstructured quadrilateral grids in addition to well balancing for arbitrary continuous bottom topographies, with energy dissipation resulting from a suitable choice of numerical interface flux.
Furthermore, the proposed covariant formulation permits an analytical representation of the geometry and associated metric terms while satisfying the aforementioned entropy stability, conservation, and well-balancing properties without the need to approximate the metric terms so as to enforce discrete metric identities. Numerical experiments on cubed-sphere grids are presented in order to verify the schemes' structure-preservation properties as well as to assess their accuracy and robustness within the context of several standard test cases characteristic of idealized atmospheric flows. Our theoretical and numerical results support the further development of the proposed methodology towards a full dynamical core for numerical weather prediction and climate modelling, as well as broader applications to other hyperbolic and advection-dominated systems of partial differential equations on curved manifolds.
2025-09-10T17:18:36Z
38 pages, 10 figures. Reproducibility repository: https://github.com/tristanmontoya/paper-2025-spherical-shallow-water/
Tristan Montoya, Andrés M. Rueda-Ramírez, Gregor J. Gassner

http://arxiv.org/abs/2512.05377v4
China Regional 3km Downscaling Based on Residual Corrective Diffusion Model
2026-02-09T07:24:20Z
A fundamental challenge in numerical weather prediction is to efficiently produce high-resolution forecasts. A common solution is applying downscaling methods, which include dynamical downscaling and statistical downscaling, to the outputs of global models. This work focuses on statistical downscaling, which establishes statistical relationships between low-resolution and high-resolution historical data using statistical models. Deep learning has emerged as a powerful tool for this task, giving rise to various high-performance super-resolution models, which can be directly applied for downscaling, such as diffusion models and Generative Adversarial Networks. This work relies on a diffusion-based downscaling framework named CorrDiff.
In contrast to the original work of CorrDiff, the region considered in this work is nearly 40 times larger, and we consider not only surface variables as in the original work but also high-level variables (six pressure levels) as target downscaling variables. In addition, a global residual connection is added to improve accuracy. In order to generate the 3km forecasts for the China region, we apply our trained models to the 25km global grid forecasts of CMA-GFS, an operational global model of the China Meteorological Administration (CMA), and SFF, a data-driven deep learning-based weather model developed from Spherical Fourier Neural Operators (SFNO). CMA-MESO, a high-resolution regional model, is chosen as the baseline model. The experimental results demonstrate that the forecasts downscaled by our method generally outperform the direct forecasts of CMA-MESO in terms of MAE for the target variables. Our forecasts of radar composite reflectivity show that CorrDiff, as a generative model, can generate fine-scale details that lead to more realistic predictions compared to the corresponding deterministic regression models.
2025-12-05T02:27:08Z
Honglu Sun, Hao Jing, Zhixiang Dai, Sa Xiao, Wei Xue, Jian Sun, Qifeng Lu

http://arxiv.org/abs/2507.09202v3
XiChen: A global weather observation-to-forecast machine learning system via four-dimensional variational gradient-guided flexible assimilation
2026-02-08T08:31:42Z
Machine Learning (ML) has shown great promise in revolutionizing weather forecasting, yet most ML systems still rely on initial conditions generated by Numerical Weather Prediction (NWP) systems. End-to-end ML models aim to eliminate this dependency, but they often rely on observation-specific encoders and require redesign or retraining when observation sources change, thereby limiting their operational robustness.
Here, we introduce XiChen, a global weather observation-to-forecast ML system via four-dimensional variational (4DVar) gradient-guided flexible assimilation. We demonstrate that the gradient of the 4DVar cost function serves as a physically grounded interface that maps heterogeneous observations into a common state space. This novel formulation enables XiChen to flexibly assimilate diverse conventional and raw satellite observations while preserving physical consistency. Experiments show that the system achieves forecasting metrics competitive with operational NWP systems. This work provides a practical and physically consistent route toward operational ML-based global weather forecasting systems with heterogeneous and evolving observations.
2025-07-12T08:46:58Z
24 pages, 6 figures
Wuxin Wang, Weicheng Ni, Lilan Huang, Tao Hao, Ben Fei, Shuo Ma, Taikang Yuan, Yanlai Zhao, Kefeng Deng, Xiaoyong Li, Hongze Leng, Boheng Duan, Lei Bai, Weimin Zhang, Junqiang Song, Kaijun Ren

http://arxiv.org/abs/2510.09891v2
Probabilistic bias adjustment of seasonal predictions of Arctic Sea Ice Concentration
2026-02-07T00:26:25Z
Seasonal forecast of Arctic sea ice concentration is key to mitigate the negative impact and assess potential opportunities posed by the rapid decline of sea ice coverage. Seasonal prediction systems based on climate models often show systematic biases and complex spatio-temporal errors that grow with the forecasts. Consequently, operational predictions are routinely bias corrected and calibrated using retrospective forecasts. For predictions of Arctic sea ice concentration, error corrections are mainly based on one-to-one post-processing methods including climatological mean or linear regression correction and, more recently, machine learning. Such deterministic adjustments are confined at best to the limited number of costly-to-run ensemble members of the raw forecast. However, decision-making requires proper quantification of uncertainty and likelihood of events, particularly of extremes.
We introduce a probabilistic error correction framework based on a conditional Variational Autoencoder model to map the conditional distribution of observations given the biased model prediction. This method naturally allows for generating large ensembles of adjusted forecasts. We evaluate our model using deterministic and probabilistic metrics and show that the adjusted forecasts are better calibrated, closer to the observational distribution, and have smaller errors than climatological mean adjusted forecasts.
2025-10-10T22:17:29Z
Parsa Gooya, Reinel Sospedra-Alfonso

http://arxiv.org/abs/2602.07108v1
Machine Learning-Ready Data Sets for the Analysis and Nowcasting of Atmospheric Radiation at Aviation Altitudes
2026-02-06T18:38:24Z
Nowcasting and forecasting of the radiation environment in the Earth's lower atmosphere are critical for the safety of aircraft and spacecraft crews and passengers. Currently, this problem is addressed by employing statistical and physics-based models that take into account particle transport and precipitation. However, given the increased number of radiation measurements available to the community, it is possible to start developing data-driven approaches. We prepared Machine Learning-ready (ML-ready) datasets to nowcast the effective dose rates at aviation altitudes. The presented datasets contain 92,476 individual measurements from 589 flights obtained by the Automated Radiation Measurements for Aerospace Safety (ARMAS) experiment from 2013 to 2023. The ARMAS measurements are augmented with the properties of the Geospace environment, such as solar soft X-ray and proton fluxes, solar wind properties, secondary cosmic ray neutrons, space weather indexes, and global solar activity indicators (such as daily sunspot number).
ARMAS data are separated into three partitions, ensuring that (1) the data points from a single flight remain within the same partition, and (2) each partition samples the flight locations and Geospace environment conditions equally. Several versions of the datasets allow predictions based on point-in-time measurements and use up to 24 hours of Geospace parameter history. The test of the use case demonstrates a possibility of nowcasting ARMAS measurements with accuracies slightly better than the considered physics-based models. The publicly available ML-ready datasets could serve as the first step in data preparation for ML-driven nowcasting and forecasting of the radiation environment.
2026-02-06T18:38:24Z
24 pages, 8 figures, 1 table, accepted to Space Weather Journal
Viacheslav M Sadykov, Zachary M Watkins, Dustin Kempton, William Jones, Sanjib K C, Griffin T Goodwin, Xiaochun He, W Kent Tobiska, Irina Kitiashvili, Christopher Mertens, Shubha Ranjan, D Glenn Deardorff, Ryan Spaulding

http://arxiv.org/abs/2602.06473v1
On large-scale oceanic wind-drift currents
2026-02-06T08:01:52Z
Starting from the Navier--Stokes equations in rotating spherical coordinates with constant density and eddy viscosity varying only with depth, and appropriate, physically motivated boundary conditions, we derive an asymptotic model for the description of non-equatorial wind-generated oceanic drift currents. We do not invoke any tangent-plane approximations, thus allowing for large-scale flows that would not be captured by the classical $f$-plane approach. The strategy is to identify two small intrinsic scales for the flow (namely, the ratio between the depth of the Ekman layer and the Earth's radius, and the Rossby number) and, after a careful scaling, perform a double asymptotic expansion with respect to these small parameters.
This leads to a system of linear ordinary differential equations with nonlinear boundary conditions for the leading-order dynamics, in addition to which we identify the governing equations for the first-order correction with respect to the Rossby number. First, we establish the existence and uniqueness of the solution to the leading-order equations and show that the solution behaves like a classical Ekman spiral for any eddy viscosity profile; moreover, we discuss the solution of the equations for the first-order correction, for which we also provide a priori bounds in terms of the leading-order solution. Finally, we discuss several cases of explicit eddy viscosity profiles (constant, linearly decreasing, linearly increasing, piecewise linear, and exponentially decaying) and compute the surface deflection angle of the wind-drift current. We obtain results that are remarkably consistent with observations.
2026-02-06T08:01:52Z
Christian Puntini, Luigi Roberti, Eduard Stefanescu

http://arxiv.org/abs/2602.05248v2
Thermodynamic Origin of Degree-Day Scaling in Phase-Change Systems
2026-02-06T07:00:01Z
Phase transitions impose topological constraints on thermodynamic state variables, masking energetic fluctuations at the phase boundary. This constraint is most apparent in melting systems, where temperature remains pinned despite continued energy input. Here we resolve this information loss by introducing a latent temperature, a counterfactual trajectory describing the system's unconstrained thermal evolution. We show that energy conservation alone enforces a rigorous duality between the total latent heat dissipated during phase change and the accumulated exceedance of the latent temperature above the melting point. This duality is mathematically equivalent to the one-dimensional Wasserstein-1 distance between the latent and observed temperature trajectories, with the transport cost set by a characteristic surface dissipation timescale and melting energy.
Applied to ice-sheet surface melting, this timescale admits a direct physical interpretation in terms of radiative and turbulent heat loss. The same framework yields a first-principles derivation of the empirical Positive Degree Day law and predicts realistic degree-day factors that emerge from surface energy balance, without ad hoc calibration. More broadly, phase change emerges as an optimal transport process that projects continuous energetic variability onto a constrained thermodynamic boundary.
2026-02-05T03:05:15Z
Zhiang Xie

http://arxiv.org/abs/2602.06287v1
Toward generative machine learning for boosting ensembles of climate simulations
2026-02-06T00:54:19Z
Accurately quantifying uncertainty in predictions and projections arising from irreducible internal climate variability is critical for informed decision making. Such uncertainty is typically assessed using ensembles produced with physics-based climate models. However, computational constraints impose a trade-off between generating the large ensembles required for robust uncertainty estimation and increasing model resolution to better capture fine-scale dynamics. Generative machine learning offers a promising pathway to alleviate these constraints. We develop a conditional Variational Autoencoder (cVAE) trained on a limited sample of climate simulations to generate arbitrarily large ensembles. The approach is applied to output from monthly CMIP6 historical and future scenario experiments produced with the Canadian Centre for Climate Modelling and Analysis' (CCCma's) Earth system model CanESM5. We show that the cVAE model learns the underlying distribution of the data and generates physically consistent samples that reproduce realistic low and high moment statistics, including extremes. Compared with more sophisticated generative architectures, cVAEs offer a mathematically transparent, interpretable, and computationally efficient framework.
Their simplicity leads to some limitations, such as overly smooth outputs, spectral bias, and underdispersion, that we discuss along with strategies to mitigate them. Specifically, we show that incorporating output noise improves the representation of climate-relevant multiscale variability, and we propose a simple method to achieve this. Finally, we show that cVAE-enhanced ensembles capture realistic global teleconnection patterns, even under climate conditions absent from the training data.
2026-02-06T00:54:19Z
SI_Toward_generative_machine_learning_for_boosting_the_ensembles_size_of_climate_simulation.pdf contains Supplementary Information
Parsa Gooya, Reinel Sospedra-Alfonso, Johannes Exenberger