http://arxiv.org/abs/2511.19447v2 A model of the Unity High Definition Render Pipeline, with applications to flat-panel and head-mounted display characterization 2026-03-31T20:58:27Z

Game engines such as Unity and Unreal Engine have become popular tools for creating perceptual and behavioral experiments in complex, interactive environments. They are often used with flat-panel displays, and also with head-mounted displays. Here I describe and test a mathematical model of luminance and color in Unity's High Definition Render Pipeline (HDRP). I show that the HDRP has several non-obvious features, such as nonlinearities applied to material properties and rendered values, that must be taken into account in order to show well-controlled stimuli. I also show how the HDRP can be configured to display gamma-corrected luminance and color, and I provide software to create the specialized files needed for gamma correction.
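
As a rough illustration of what gamma correction involves, the sketch below assumes a display whose luminance follows a simple power law, L = l_max * v^gamma, and inverts that law so that requested luminance maps linearly to achieved luminance. This is not the paper's HDRP model or its software; the function names, the exponent 2.2, and the peak luminance of 100 cd/m^2 are assumptions chosen for illustration.

```python
import numpy as np

def display_luminance(v, l_max=100.0, gamma=2.2):
    """Assumed display model: luminance (cd/m^2) for a normalized drive value v in [0, 1]."""
    return l_max * np.clip(v, 0.0, 1.0) ** gamma

def gamma_correct(target, l_max=100.0, gamma=2.2):
    """Drive value that produces the target luminance under the assumed power law."""
    return np.clip(target / l_max, 0.0, 1.0) ** (1.0 / gamma)

# Request five evenly spaced luminances and check what the modeled display shows.
targets = np.linspace(0.0, 100.0, 5)
drive = gamma_correct(targets)
achieved = display_luminance(drive)
```

Under this toy model `achieved` matches `targets`. A real display would need a measured luminance function in place of the assumed power law.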

2025-11-18T02:42:19Z 27 pages, 9 figures Richard F. Murray

http://arxiv.org/abs/2604.00225v1 Pupil Design for Computational Wavefront Estimation 2026-03-31T20:42:52Z

Establishing a precise connection between imaged intensity and the incident wavefront is essential for emerging applications in adaptive optics, holography, computational microscopy, and non-line-of-sight imaging. While prior work has shown that breaking symmetries in pupil design enables wavefront recovery from a single intensity measurement, there is little guidance on how to design a pupil that improves wavefront estimation. In this work we introduce a quantitative asymmetry metric to bridge this gap and, through an extensive empirical study and supporting analysis, demonstrate that increasing asymmetry enhances wavefront recoverability. We analyze the trade-offs in pupil design, and the impact on light throughput along with performance in noise. Both large-scale simulations and optical bench experiments are carried out to support our findings.
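
The role of symmetry can be seen in a small numerical sketch. Under a Fraunhofer (Fourier-transform) imaging model, a real, point-symmetric pupil gives the same intensity for a phase phi(x) and its "twin" -phi(-x), so a single intensity image cannot tell them apart; an asymmetric pupil removes that tie. The code does not implement the paper's asymmetry metric. The grid size, the disk and notch shapes, and the test phase are all assumptions chosen for illustration.

```python
import numpy as np

def psf(pupil, phase):
    """Far-field intensity of a pupil carrying the given phase (Fraunhofer model)."""
    return np.abs(np.fft.fft2(pupil * np.exp(1j * phase))) ** 2

def point_reflect(a):
    """Return a(-x, -y) on the periodic FFT grid (index n maps to -n mod N)."""
    return np.roll(np.flip(a, axis=(0, 1)), 1, axis=(0, 1))

n = 64
c = np.fft.fftfreq(n, d=1.0 / n)            # integer coordinates with the origin at index 0
y, x = np.meshgrid(c, c, indexing="ij")
r2 = x**2 + y**2

disk = r2 <= 20**2                           # point-symmetric pupil
notched = disk & ~((x > 8) & (np.abs(y) < 4))  # same disk with a one-sided notch

phase = 2.0 * r2 / 20**2 + 0.5 * (x / 20) ** 3  # defocus-like (even) plus an odd term
twin = -point_reflect(phase)                     # differs from phase in the even part

same_for_disk = np.allclose(psf(disk, phase), psf(disk, twin))
same_for_notched = np.allclose(psf(notched, phase), psf(notched, twin))
```

Here `same_for_disk` is true and `same_for_notched` is false: the symmetric pupil cannot distinguish the two wavefronts from intensity alone, whereas the notched one can.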

2026-03-31T20:42:52Z Ali Almuallem Nicholas Chimitt Bole Ma Qi Guo Stanley H. Chan

http://arxiv.org/abs/2507.16962v3 Harmonization in Magnetic Resonance Imaging: A Survey of Acquisition, Image-level, and Feature-level Methods 2026-03-31T20:24:56Z

Magnetic resonance imaging (MRI) has greatly advanced neuroscience research and clinical diagnostics. However, imaging data collected across different scanners, acquisition protocols, or imaging sites often exhibit substantial heterogeneity, known as batch effects or site effects. These non-biological sources of variability can obscure true biological signals, reduce reproducibility and statistical power, and severely impair the generalizability of learning-based models across datasets. Image harmonization is grounded in the central hypothesis that site-related biases can be eliminated or mitigated while preserving meaningful biological information, thereby improving data comparability and consistency. This review provides a comprehensive overview of key concepts, methodological advances, publicly available datasets, and evaluation metrics in the field of MRI harmonization. We systematically cover the full imaging pipeline and categorize harmonization approaches into prospective acquisition and reconstruction, retrospective image-level and feature-level methods, and traveling-subject-based techniques. By synthesizing existing methods and evidence, we revisit the central hypothesis of image harmonization and show that, although site invariance can be achieved with current techniques, further evaluation is required to verify the preservation of biological information. To this end, we summarize the remaining challenges and highlight key directions for future research, including the need for standardized validation benchmarks, improved evaluation strategies, and tighter integration of harmonization methods across the imaging pipeline.

2025-07-22T19:06:02Z 27 pages, 6 figures, 3 tables Qinqin Yang Firoozeh Shomal-Zadeh Ali Gholipour

http://arxiv.org/abs/2603.05537v2 Sketch It Out: Exploring Label-Free Structural Cues for Multimodal Gait Recognition 2026-03-31T19:00:57Z

Gait recognition is a non-intrusive biometric technique for security applications, yet existing studies are dominated by silhouette- and parsing-based representations. Silhouettes are sparse and miss internal structural details, limiting discriminability. Parsing enriches silhouettes with part-level structures, but relies heavily on upstream human parsers (e.g., label granularity and boundary precision), leading to unstable performance across datasets and results that are sometimes even inferior to those of silhouettes. We revisit gait representations from a structural perspective and describe a design space defined by edge density and supervision form: silhouettes use sparse boundary edges with weak single-label supervision, while parsing uses denser cues with strong semantic priors. In this space, we identify an underexplored paradigm: dense part-level structure without explicit semantic labels, and introduce Sketch as a new visual modality for gait recognition. Sketch extracts high-frequency structural cues (e.g., limb articulations and self-occlusion contours) directly from RGB images via edge-based detectors in a label-free manner. We further show that label-guided parsing and label-free sketch are semantically decoupled and structurally complementary. Based on this, we propose SketchGait, a hierarchically disentangled multi-modal framework with two independent streams for modality-specific learning and a lightweight early-stage fusion branch to capture structural complementarity. Extensive experiments on SUSTech1K and CCPG validate the proposed modality and framework: SketchGait achieves 92.9% Rank-1 on SUSTech1K and 93.1% mean Rank-1 on CCPG.

2026-03-04T05:48:07Z 10 pages, 3 figures Chao Zhang Zhuang Zheng Ruixin Li Zhanyong Mei

http://arxiv.org/abs/2602.11873v2 Time-resolved aortic 3D shape reconstruction from a limited number of cine 2D MRI slices 2026-03-31T14:44:35Z

Background and Objective: To assess the feasibility and accuracy of reconstructing time-resolved, three-dimensional, subject-specific aortic geometries from a limited number of standard cine 2D magnetic resonance imaging (MRI) acquisitions. This is achieved by coupling a statistical shape model with a differentiable volumetric mesh optimization algorithm. Methods: Cine 2D MRI slices were manually segmented and used to reconstruct subject-specific aortic geometries via a differentiable mesh optimization algorithm, constrained by a statistical shape model. Optimal slice positioning was first evaluated on synthetic data, followed by in-vivo acquisition in 30 subjects (19 volunteers and 11 aortic stenosis patients). Time-resolved aortic geometries were reconstructed, from which geometric descriptors and radial strain were derived. In a subset of 10 subjects, 4D flow MRI data was acquired to provide volumetric reference for peak-systolic shape comparison. Results: Accurate reconstruction was achieved using as few as six cine 2D MRI slices. Agreement with 4D flow MRI reference data yielded a Dice score of (89.9 +/- 1.6) %, Intersection over Union of (81.7 +/- 2.7) %, Hausdorff distance of (7.3 +/- 3.3) mm, and Chamfer distance of (3.7 +/- 0.6) mm. The mean absolute radius error along the aortic arch was (0.8 +/- 0.6) mm. Secondary analysis demonstrated significant differences in geometric features and radial strain across age groups, with strain decreasing progressively with age at values of (11.00 +/- 3.11) x 10^-2 vs. (3.74 +/- 1.25) x 10^-2 vs. (2.89 +/- 0.87) x 10^-2 for the young, mid-age, and elderly groups, respectively. Conclusion: The proposed framework enables reconstruction of time-resolved, subject-specific aortic geometries from a limited number of standard cine 2D MRI acquisitions, providing a practical basis for downstream computational analysis.
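
For readers less familiar with the overlap metrics quoted above, the sketch below computes the Dice score and Intersection over Union (IoU) for two binary masks. It is a generic illustration rather than the authors' evaluation code; the function names and the toy masks are assumptions. For a single pair of masks the two metrics are tied by IoU = Dice / (2 - Dice), which is roughly consistent with the reported means (0.899 / 1.101 is about 0.817), although means taken over subjects need not satisfy the identity exactly.

```python
import numpy as np

def dice_score(a, b):
    """Dice coefficient of two binary masks: 2|A and B| / (|A| + |B|)."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    return 1.0 if total == 0 else 2.0 * np.logical_and(a, b).sum() / total

def iou_score(a, b):
    """Intersection over Union of two binary masks: |A and B| / |A or B|."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return 1.0 if union == 0 else np.logical_and(a, b).sum() / union

# Toy masks: a 2x2 square against a 2x3 rectangle that contains it.
pred = np.zeros((4, 4), dtype=bool)
ref = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:3] = True   # 4 pixels
ref[1:3, 1:4] = True    # 6 pixels, 4 of them shared with pred

dice = dice_score(pred, ref)   # 2*4 / (4+6) = 0.8
iou = iou_score(pred, ref)     # 4 / 6
```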

2026-02-12T12:23:41Z Gloria Wolkerstorfer Stefano Buoso Rabea Schlenker Jochen von Spiczak Robert Manka Sebastian Kozerke

http://arxiv.org/abs/2604.00070v1 Brain MR Image Synthesis with Multi-contrast Self-attention GAN 2026-03-31T13:43:53Z

Accurate and complete multi-modal Magnetic Resonance Imaging (MRI) is essential for neuro-oncological assessment, as each contrast provides complementary anatomical and pathological information. However, acquiring all modalities (e.g., T1c, T1n, T2, T2f) for every patient is often impractical due to time, cost, and patient discomfort, potentially limiting comprehensive tumour evaluation. We propose 3D-MC-SAGAN (3D Multi-Contrast Self-Attention generative adversarial network), a unified 3D multi-contrast synthesis framework that generates high-fidelity missing modalities from a single T2 input while explicitly preserving tumour characteristics. The model employs a multi-scale 3D encoder-decoder generator with residual connections and a novel Memory-Bounded Hybrid Attention (MBHA) block to capture long-range dependencies efficiently, and is trained with a WGAN-GP critic and an auxiliary contrast-conditioning branch to produce T2f, T1n, and T1c volumes within a single unified network. A frozen 3D U-Net-based segmentation module introduces a segmentation-consistency constraint to preserve lesion morphology. The composite objective integrates adversarial, reconstruction, perceptual, structural similarity, contrast-classification, and segmentation-guided losses to align global realism with tumour-preserving structure. Extensive evaluation on 3D brain MRI datasets demonstrates that 3D-MC-SAGAN achieves state-of-the-art quantitative performance and generates visually coherent, anatomically plausible contrasts with improved distribution-level realism. Moreover, it maintains tumour segmentation accuracy comparable to fully acquired multi-modal inputs, highlighting its potential to reduce acquisition burden while preserving clinically meaningful information.

2026-03-31T13:43:53Z Note: This work has been submitted to the IEEE for possible publication Zaid A. Abod Furqan Aziz

http://arxiv.org/abs/2309.16205v2 Generative AI Enables Structural Brain Network Construction from fMRI via Symmetric Diffusion Learning 2026-03-31T11:59:38Z

Mapping from functional connectivity (FC) to structural connectivity (SC) can facilitate multimodal brain network fusion and discover potential biomarkers for clinical implications. However, it is challenging to establish a reliable non-linear mapping between SC and functional magnetic resonance imaging (fMRI) directly. In this paper, a novel symmetric diffusive generative adversarial network-based fMRI-to-SC (DiffGAN-F2S) model is proposed to predict SC from brain fMRI in a unified framework. Specifically, the proposed DiffGAN-F2S leverages denoising diffusion probabilistic models (DDPMs) and adversarial learning to efficiently generate symmetric and high-fidelity SC from fMRI in a few steps. By designing the dual-channel multi-head spatial attention (DMSA) and graph convolutional modules, the symmetric graph generator first captures global relations among directly and indirectly connected brain regions, then models the local brain region interactions. It can uncover the complex mapping relations between fMRI and symmetric structural connectivity. Furthermore, a spatially connected consistency loss is devised to constrain the generator to preserve global-local topological information for accurate symmetric SC prediction. Tested on the public Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the proposed model effectively generates empirical SC-preserved connectivity from four-dimensional imaging data and shows superior performance in SC prediction compared with other related models. Furthermore, the proposed model can identify the vast majority of important brain regions and connections derived from the empirical method, providing an alternative way to fuse multimodal brain networks and analyze clinical brain disease.

2023-09-28T06:55:50Z 12 pages Qiankun Zuo Bangjun Lei Wanyu Qiu Changhong Jing Jin Hong Shuqiang Wang

http://arxiv.org/abs/2603.24176v2 Modeling Spatiotemporal Neural Frames for High Resolution Brain Dynamic 2026-03-31T09:59:29Z

Capturing dynamic spatiotemporal neural activity is essential for understanding large-scale brain mechanisms. Functional magnetic resonance imaging (fMRI) provides high-resolution cortical representations that form a strong basis for characterizing fine-grained brain activity patterns. The high acquisition cost of fMRI limits large-scale applications, therefore making high-quality fMRI reconstruction a crucial task. Electroencephalography (EEG) offers millisecond-level temporal cues that complement fMRI. Leveraging this complementarity, we present an EEG-conditioned framework for reconstructing dynamic fMRI as continuous neural sequences with high spatial fidelity and strong temporal coherence at the cortical-vertex level. To address sampling irregularities common in real fMRI acquisitions, we incorporate a null-space intermediate-frame reconstruction, enabling measurement-consistent completion of arbitrary intermediate frames and improving sequence continuity and practical applicability. Experiments on the CineBrain dataset demonstrate superior voxel-wise reconstruction quality and robust temporal consistency across whole-brain and functionally specific regions. The reconstructed fMRI also preserves essential functional information, supporting downstream visual decoding tasks. This work provides a new pathway for estimating high-resolution fMRI dynamics from EEG and advances multimodal neuroimaging toward more dynamic brain activity modeling.

2026-03-25T10:53:11Z CVPR 2026 Wanying Qu Jianxiong Gao Wei Wang Yanwei Fu

http://arxiv.org/abs/2601.11691v3 Explainable histomorphology-based survival prediction of glioblastoma, IDH-wildtype 2026-03-31T09:40:28Z

Glioblastoma, IDH-wildtype (GBM-IDHwt) is the most common malignant brain tumor. While histomorphology is a crucial component of GBM-IDHwt diagnosis, it is not further considered for prognosis. Here, we present an explainable artificial intelligence (AI) framework to identify and interpret histomorphological features associated with patient survival. The framework combines an explainable multiple instance learning (MIL) architecture that directly identifies prognostically relevant image tiles with a sparse autoencoder (SAE) that maps these tiles to interpretable visual patterns. The MIL model was trained and evaluated on a new real-world dataset of 720 GBM-IDHwt cases from three hospitals and four cancer registries across Germany. The SAE was trained on 1,878 whole-slide images from five independent public glioblastoma collections. Despite the many factors influencing survival time, our method showed some ability to discriminate between patients living less than 180 days or more than 360 days solely based on histomorphology (AUC: 0.67; 95% CI: 0.63-0.72). Cox proportional hazards regression confirmed a significant survival difference between predicted groups after adjustment for established prognostic factors (hazard ratio: 1.47; 95% CI: 1.26-1.72). Three neuropathologists categorized the identified visual patterns into seven distinct histomorphological groups, revealing both established prognostic features and unexpected associations, the latter being potentially attributable to surgery-related confounders. The presented explainable AI framework facilitates prognostic biomarker discovery in GBM-IDHwt and beyond, highlighting promising histomorphological features for further analysis and exposing potential confounders that would be hidden in black-box models.

2026-01-16T15:35:12Z Jan-Philipp Redlich Friedrich Feuerhake Stefan Nikolin Nadine Sarah Schaadt Sarah Teuber-Hanselmann Joachim Weis Sabine Luttmann Andrea Eberle Christoph Buck Timm Intemann Pascal Birnstill Klaus Kraywinkel Jonas Ort Peter Boor André Homeyer

http://arxiv.org/abs/2603.29438v1 Polyhedral Unmixing: Bridging Semantic Segmentation with Hyperspectral Unmixing via Polyhedral-Cone Partitioning 2026-03-31T08:46:09Z

Semantic segmentation and hyperspectral unmixing are two central problems in spectral image analysis. The former assigns each pixel a discrete label corresponding to its material class, whereas the latter estimates pure material spectra, called endmembers, and, for each pixel, a vector representing material abundances in the observed scene. Despite their complementarity, these two problems are usually addressed independently. This paper aims to bridge these two lines of work by formally showing that, under the linear mixing model, pixel classification by dominant materials induces polyhedral-cone regions in the spectral space. We leverage this fundamental property to propose a direct segmentation-to-unmixing pipeline that performs blind hyperspectral unmixing from any semantic segmentation by constructing a polyhedral-cone partition of the space that best fits the labeled pixels. Signed distances from pixels to the estimated regions are then computed, linearly transformed via a change of basis in the distance space, and projected onto the probability simplex, yielding an initial abundance estimate. This estimate is used to extract endmembers and recover final abundances via matrix pseudo-inversion. Because the segmentation method can be freely chosen, the user gains explicit control over the unmixing process, while the rest of the pipeline remains essentially deterministic and lightweight. Beyond improving interpretability, experiments on three real datasets demonstrate the effectiveness of the proposed approach when associated with appropriate clustering algorithms, and show consistent improvements over recent deep and non-deep state-of-the-art methods. The code is available at: https://github.com/antoine-bottenmuller/polyhedral-unmixing
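
The last two steps of the pipeline described above (projection onto the probability simplex, then endmember and abundance recovery by pseudo-inversion) can be sketched under the linear mixing model X = A E. This is not the authors' implementation, which is available at the repository linked in the abstract. The sort-based projection routine, the synthetic data, and the noisy `scores` array that stands in for the transformed signed distances are all assumptions made for illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of each row of v onto the probability simplex (returns a 2-D array)."""
    v = np.atleast_2d(np.asarray(v, dtype=float))
    n = v.shape[1]
    u = np.sort(v, axis=1)[:, ::-1]                 # rows sorted in descending order
    css = np.cumsum(u, axis=1) - 1.0
    k = np.arange(1, n + 1)
    rho = (u - css / k > 0).sum(axis=1)             # size of the support of each projection
    theta = css[np.arange(v.shape[0]), rho - 1] / rho
    return np.maximum(v - theta[:, None], 0.0)

# Synthetic linear mixing model X = A @ E (sizes and values are illustrative).
rng = np.random.default_rng(0)
E_true = rng.uniform(0.1, 1.0, size=(3, 10))        # 3 endmembers, 10 spectral bands
A_true = rng.dirichlet(np.ones(3), size=200)        # 200 pixels, abundances on the simplex
X = A_true @ E_true                                 # noise-free observed spectra

# Hypothetical stand-in for the transformed signed distances of the paper.
scores = A_true + 0.05 * rng.standard_normal(A_true.shape)

A_init = project_to_simplex(scores)                 # initial abundance estimate
E_hat = np.linalg.pinv(A_init) @ X                  # endmembers by least squares
A_hat = X @ np.linalg.pinv(E_hat)                   # final abundances by pseudo-inversion
```

In this sketch the final abundances are not re-projected, so they are not guaranteed to be non-negative or to sum to one. Whether and how the paper enforces those constraints is not stated in the abstract.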

2026-03-31T08:46:09Z Antoine Bottenmuller (CMM, PSL, STIM) Etienne Decencière (CMM, PSL, STIM) Petr Dokládal (CMM, PSL, STIM)

http://arxiv.org/abs/2508.14475v4 Fine-grained Image Quality Assessment for Perceptual Image Restoration 2026-03-31T08:20:40Z

Recent years have witnessed remarkable achievements in perceptual image restoration (IR), creating an urgent demand for accurate image quality assessment (IQA), which is essential for both performance comparison and algorithm optimization. Unfortunately, existing IQA metrics exhibit inherent weaknesses on IR tasks, particularly when distinguishing fine-grained quality differences among restored images. To address this dilemma, we contribute the first-of-its-kind fine-grained image quality assessment dataset for image restoration, termed FGRestore, comprising 18,408 restored images across six common IR tasks. Beyond conventional scalar quality scores, FGRestore is also annotated with 30,886 fine-grained pairwise preferences. Based on FGRestore, we conduct a comprehensive benchmark of existing IQA metrics, which reveals significant inconsistencies between score-based IQA evaluations and fine-grained restoration quality. Motivated by these findings, we further propose FGResQ, a new IQA model specifically designed for image restoration, which features both coarse-grained score regression and fine-grained quality ranking. Extensive experiments and comparisons demonstrate that FGResQ significantly outperforms state-of-the-art IQA metrics. Code and model weights have been released at https://sxfly99.github.io/FGResQ-Home.

2025-08-20T06:58:32Z Accepted by AAAI2026 Xiangfei Sheng Xiaofeng Pan Zhichao Yang Pengfei Chen Leida Li

http://arxiv.org/abs/2603.29404v1 Rich-U-Net: A medical image segmentation model for fusing spatial depth features and capturing minute structural details 2026-03-31T08:07:32Z

Medical image segmentation is of great significance in the analysis of illness. The use of deep neural networks in medical image segmentation can help doctors extract regions of interest from complex medical images, thereby improving diagnostic accuracy and enabling better assessment of the condition to formulate treatment plans. However, most current medical image segmentation methods struggle to extract spatial information from medical images accurately and to capture complex structures and variations. In this article, we introduce the Rich-U-Net model, which effectively integrates both spatial and depth features. This fusion enhances the model's capability to detect fine structures and intricate details within complex medical images. Our multi-level and multi-dimensional feature fusion and optimization strategies enable the model to achieve fine structure localization and accurate segmentation results. Experiments on the ISIC2018, BUSI, GLAS, and CVC datasets show that Rich-U-Net surpasses other state-of-the-art models in Dice, IoU, and HD95 metrics.

2026-03-31T08:07:32Z Zhuoyi Fang Kexuan Shi Jiajia Liu Qiang Han

http://arxiv.org/abs/2603.29181v1 Retinal Malady Classification using AI: A novel ViT-SVM combination architecture 2026-03-31T02:48:28Z

Macular holes, central serous retinopathy, and diabetic retinopathy are among the most widespread maladies of the eye and can cause partial or complete vision loss, so early detection of these defects is essential for the well-being of the patient. This study introduces a hybrid architecture based on a Vision Transformer and a Support Vector Machine (ViT-SVM) and analyses its performance in classifying optical coherence tomography (OCT) scans, with the aim of automating the early detection of these retinal defects.

2026-03-31T02:48:28Z 6th International Conference on Computing Methodologies and Communication (ICCMC), Erode, India, 2022, pp. 1659-1664 Shashwat Jha Vishvaditya Luhach Raju Poddar 10.1109/ICCMC53470.2022.9753876

http://arxiv.org/abs/2603.29034v1 The Surprising Effectiveness of Noise Pretraining for Implicit Neural Representations 2026-03-30T22:01:00Z

The approximation and convergence properties of implicit neural representations (INRs) are known to be highly sensitive to parameter initialization strategies. While several data-driven initialization methods demonstrate significant improvements over standard random sampling, the reasons for their success -- specifically, whether they encode classical statistical signal priors or more complex features -- remain poorly understood. In this study, we explore this phenomenon through a series of experimental analyses leveraging noise pretraining. We pretrain INRs on diverse noise classes (e.g., Gaussian, Dead Leaves, Spectral) and measure their ability to both fit unseen signals and encode priors for an inverse imaging task (denoising). Our analyses on image and video data reveal a surprising finding: simply pretraining on unstructured noise (Uniform, Gaussian) dramatically improves signal fitting capacity compared to all other baselines. However, unstructured noise also yields poor deep image priors for denoising. In contrast, we also find that noise with the classic $1/|f|^{\alpha}$ spectral structure of natural images achieves an excellent balance of signal fitting and inverse imaging capabilities, performing on par with the best data-driven initialization methods. This finding enables more efficient INR training in applications lacking sufficient prior domain-specific data. For more details, visit the project page at https://kushalvyas.github.io/noisepretraining.html
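
One common way to produce noise with a power-law amplitude spectrum is to shape white Gaussian noise in the Fourier domain, as in the minimal sketch below. It is only an illustration of the spectral structure mentioned above and is not taken from the paper or its project page; the function name, image size, exponent, and the final rescaling to [0, 1] are assumptions.

```python
import numpy as np

def spectral_noise(size=64, alpha=1.0, seed=0):
    """White Gaussian noise reshaped so its amplitude spectrum falls off as 1/|f|^alpha."""
    rng = np.random.default_rng(seed)
    white = rng.standard_normal((size, size))
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    f = np.sqrt(fx**2 + fy**2)
    f[0, 0] = 1.0                          # placeholder to avoid dividing by zero at DC
    spectrum = np.fft.fft2(white) / f**alpha
    spectrum[0, 0] = 0.0                   # drop the DC term (zero-mean image)
    img = np.fft.ifft2(spectrum).real      # the filter is symmetric, so the result is real
    return (img - img.min()) / (img.max() - img.min())  # rescale to [0, 1]

noise = spectral_noise(size=64, alpha=1.0, seed=0)
```

Setting `alpha=0` leaves the noise white apart from the removed mean, and larger exponents give smoother, more cloud-like images.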

2026-03-30T22:01:00Z Accepted to CVPR 2026. Project page: https://kushalvyas.github.io/noisepretraining.html Kushal Vyas Alper Kayabasi Daniel Kim Vishwanath Saragadam Ashok Veeraraghavan Guha Balakrishnan

http://arxiv.org/abs/2603.29014v1 End-to-end optimization of sparse ultrasound linear probes 2026-03-30T21:21:25Z

Ultrasound imaging faces a trade-off between image quality and hardware complexity caused by dense transducers. Sparse arrays are one popular solution to mitigate this challenge. This work proposes an end-to-end optimization framework that jointly learns sparse array configuration and image reconstruction. The framework integrates a differentiable Image Formation Model with a hard Straight-Through Estimator (STE) selection mask, unrolled Iterative Soft-Thresholding Algorithm (ISTA) deconvolution, and a residual Convolutional Neural Network (CNN). The objective combines physical consistency (Point Spread Function (PSF) and convolutional formation model) with structural fidelity (contrast, Side-Lobe-Ratio (SLR), entropy, and row diversity). Simulations using a 3.5 MHz probe show that the learned configuration preserves axial and lateral resolution with half of the active elements. This physics-guided, data-driven approach enables compact, cost-efficient ultrasound probe design without sacrificing image quality, and it can be extended to 3-D volumetric imaging.
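
To make the ISTA component concrete, the sketch below runs plain (not unrolled, not learned) ISTA on a toy 1-D sparse deconvolution problem, minimizing 0.5 * ||H x - y||^2 + lam * ||x||_1. It does not reproduce the paper's framework: the STE selection mask, the CNN, and the learned parameters are omitted, and the Gaussian blur matrix, regularization weight, and iteration count are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1: shrink every entry toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def objective(H, y, x, lam):
    """Value of 0.5 * ||H x - y||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((H @ x - y) ** 2) + lam * np.sum(np.abs(x))

def ista(H, y, lam=0.05, n_iter=500):
    """Plain ISTA: a gradient step on the data term followed by soft-thresholding."""
    step = 1.0 / np.linalg.norm(H, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(H.shape[1])
    for _ in range(n_iter):
        grad = H.T @ (H @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Toy problem: two point scatterers blurred by a Gaussian point spread function.
n = 64
idx = np.arange(n)
H = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 2.0) ** 2)  # blur matrix (toy PSF)
x_true = np.zeros(n)
x_true[[20, 40]] = [1.0, 0.7]
y = H @ x_true
x_hat = ista(H, y)
```

An unrolled variant would fix `n_iter` to a small number and treat quantities such as the step size and threshold as trainable, which is the usual meaning of "unrolled" in this context.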

2026-03-30T21:21:25Z Accepted at the IEEE International Symposium on Biomedical Imaging (ISBI 2026) Sergio Urrea Adrian Basarab Hervé Liebgott Henry Arguello