https://arxiv.org/api/YATQ3s2IqKJ2VvGosQkiPC6Kokc 2026-06-25T14:30:52Z 9383 1260 15 http://arxiv.org/abs/2511.19189v1 AvatarBrush: Monocular Reconstruction of Gaussian Avatars with Intuitive Local Editing 2025-11-24T14:58:11Z The efficient reconstruction of high-quality and intuitively editable human avatars presents a pressing challenge in the field of computer vision. Recent advancements, such as 3DGS, have demonstrated impressive reconstruction efficiency and rapid rendering speeds. However, intuitive local editing of these representations remains a significant challenge. In this work, we propose AvatarBrush, a framework that reconstructs fully animatable and locally editable avatars using only a monocular video input. We propose a three-layer model to represent the avatar and, inspired by mesh morphing techniques, design a framework to generate the Gaussian model from local information of the parametric body model. Compared to previous methods that require scanned meshes or multi-view captures as input, our approach reduces costs and enhances editing capabilities such as body shape adjustment, local texture modification, and geometry transfer. Our experimental results demonstrate superior quality across two datasets and emphasize the enhanced, user-friendly, and localized editing capabilities of our method. 2025-11-24T14:58:11Z Mengtian Li Shengxiang Yao Yichen Pan Haiyao Xiao Zhongmei Li Zhifeng Xie Keyu Chen http://arxiv.org/abs/2511.17443v2 GRAPHIC--Guidelines for Reviewing Algorithmic Practices in Human-centred Design and Interaction for Creativity 2025-11-24T09:38:31Z Artificial Intelligence (AI) has been increasingly applied to creative domains, leading to the development of systems that collaborate with humans in design processes. In Graphic Design, integrating computational systems into co-creative workflows presents specific challenges, as it requires balancing scientific rigour with the subjective and visual nature of design practice. Following the PRISMA methodology, we identified 872 articles, resulting in a final corpus of 71 publications describing 68 unique systems. Based on this review, we introduce GRAPHIC (Guidelines for Reviewing Algorithmic Practices in Human-centred Design and Interaction for Creativity), a framework for analysing computational systems applied to Graphic Design. Its goal is to understand how current systems support human-AI collaboration in the Graphic Design discipline. The framework comprises main dimensions, which our analysis revealed to be essential across diverse system types: (1) Collaborative Panorama, (2) Processes and Modalities, and (3) Graphic Design Principles. Its application revealed research gaps, including the need to balance initiative and control between agents, improve communication through explainable interaction models, and promote systems that support transformational creativity grounded in core design principles. 2025-11-21T17:42:09Z 20 pages, 16 figures Joana Rovira Martins Pedro Martins Ana Boavida http://arxiv.org/abs/2511.18900v1 MatMart: Material Reconstruction of 3D Objects via Diffusion 2025-11-24T08:58:14Z Applying diffusion models to physically-based material estimation and generation has recently gained prominence. In this paper, we propose \ttt, a novel material reconstruction framework for 3D objects, offering the following advantages. First, \ttt\ adopts a two-stage reconstruction, starting with accurate material prediction from inputs and followed by prior-guided material generation for unobserved views, yielding high-fidelity results. Second, by utilizing progressive inference alongside the proposed view-material cross-attention (VMCA), \ttt\ enables reconstruction from an arbitrary number of input images, demonstrating strong scalability and flexibility. Finally, \ttt\ achieves both material prediction and generation capabilities through end-to-end optimization of a single diffusion model, without relying on additional pre-trained models, thereby exhibiting enhanced stability across various types of objects. Extensive experiments demonstrate that \ttt\ achieves superior performance in material reconstruction compared to existing methods. 2025-11-24T08:58:14Z Xiuchao Wu Pengfei Zhu Jiangjing Lyu Xinguo Liu Jie Guo Yanwen Guo Weiwei Xu Chengfei Lyu http://arxiv.org/abs/2511.18873v1 Neural Texture Splatting: Expressive 3D Gaussian Splatting for View Synthesis, Geometry, and Dynamic Reconstruction 2025-11-24T08:26:32Z 3D Gaussian Splatting (3DGS) has emerged as a leading approach for high-quality novel view synthesis, with numerous variants extending its applicability to a broad spectrum of 3D and 4D scene reconstruction tasks. Despite its success, the representational capacity of 3DGS remains limited by the use of 3D Gaussian kernels to model local variations. Recent works have proposed to augment 3DGS with additional per-primitive capacity, such as per-splat textures, to enhance its expressiveness. However, these per-splat texture approaches primarily target dense novel view synthesis with a reduced number of Gaussian primitives, and their effectiveness tends to diminish when applied to more general reconstruction scenarios. In this paper, we aim to achieve concrete performance improvement over state-of-the-art 3DGS variants across a wide range of reconstruction tasks, including novel view synthesis, geometry and dynamic reconstruction, under both sparse and dense input settings. To this end, we introduce Neural Texture Splatting (NTS). At the core of our approach is a global neural field (represented as a hybrid of a tri-plane and a neural decoder) that predicts local appearance and geometric fields for each primitive. By leveraging this shared global representation that models local texture fields across primitives, we significantly reduce model size and facilitate efficient global information exchange, demonstrating strong generalization across tasks. Furthermore, our neural modeling of local texture fields introduces expressive view- and time-dependent effects, a critical aspect that existing methods fail to account for. Extensive experiments show that Neural Texture Splatting consistently improves models and achieves state-of-the-art results across multiple benchmarks. 2025-11-24T08:26:32Z SIGGRAPH Asia 2025 (conference track), Project page: https://19reborn.github.io/nts/ Yiming Wang Shaofei Wang Marko Mihajlovic Siyu Tang http://arxiv.org/abs/2105.01610v4 Reliving the Dataset: Combining the Visualization of Road Users' Interactions with Scenario Reconstruction in Virtual Reality 2025-11-24T08:02:45Z One core challenge in the development of automated vehicles is their capability to deal with a multitude of complex trafficscenarios with many, hard to predict traffic participants. As part of the iterative development process, it is necessary to detect criticalscenarios and generate knowledge from them to improve the highly automated driving (HAD) function. In order to tackle this challenge,numerous datasets have been released in the past years, which act as the basis for the development and testing of such algorithms.Nevertheless, the remaining challenges are to find relevant scenes, such as safety-critical corner cases, in these datasets and tounderstand them completely.Therefore, this paper presents a methodology to process and analyze naturalistic motion datasets in two ways: On the one hand, ourapproach maps scenes of the datasets to a generic semantic scene graph which allows for a high-level and objective analysis. Here,arbitrary criticality measures, e.g. TTC, RSS or SFF, can be set to automatically detect critical scenarios between traffic participants.On the other hand, the scenarios are recreated in a realistic virtual reality (VR) environment, which allows for a subjective close-upanalysis from multiple, interactive perspectives. 2021-05-04T16:39:06Z Accepted for publication at ICITE 2021 Lars Töttel Maximilian Zipfl Daniel Bogdoll Marc René Zofka J. Marius Zöllner 10.1007/978-981-19-2259-6_39 http://arxiv.org/abs/2512.03052v1 LATTICE: Democratize High-Fidelity 3D Generation at Scale 2025-11-24T03:22:19Z We present LATTICE, a new framework for high-fidelity 3D asset generation that bridges the quality and scalability gap between 3D and 2D generative models. While 2D image synthesis benefits from fixed spatial grids and well-established transformer architectures, 3D generation remains fundamentally more challenging due to the need to predict both spatial structure and detailed geometric surfaces from scratch. These challenges are exacerbated by the computational complexity of existing 3D representations and the lack of structured and scalable 3D asset encoding schemes. To address this, we propose VoxSet, a semi-structured representation that compresses 3D assets into a compact set of latent vectors anchored to a coarse voxel grid, enabling efficient and position-aware generation. VoxSet retains the simplicity and compression advantages of prior VecSet methods while introducing explicit structure into the latent space, allowing positional embeddings to guide generation and enabling strong token-level test-time scaling. Built upon this representation, LATTICE adopts a two-stage pipeline: first generating a sparse voxelized geometry anchor, then producing detailed geometry using a rectified flow transformer. Our method is simple at its core, but supports arbitrary resolution decoding, low-cost training, and flexible inference schemes, achieving state-of-the-art performance on various aspects, and offering a significant step toward scalable, high-quality 3D asset creation. 2025-11-24T03:22:19Z Technical Report Zeqiang Lai Yunfei Zhao Zibo Zhao Haolin Liu Qingxiang Lin Jingwei Huang Chunchao Guo Xiangyu Yue http://arxiv.org/abs/2511.18680v1 Inverse Rendering for High-Genus Surface Meshes from Multi-View Images 2025-11-24T01:44:09Z We present a topology-informed inverse rendering approach for reconstructing high-genus surface meshes from multi-view images. Compared to 3D representations like voxels and point clouds, mesh-based representations are preferred as they enable the application of differential geometry theory and are optimized for modern graphics pipelines. However, existing inverse rendering methods often fail catastrophically on high-genus surfaces, leading to the loss of key topological features, and tend to oversmooth low-genus surfaces, resulting in the loss of surface details. This failure stems from their overreliance on Adam-based optimizers, which can lead to vanishing and exploding gradients. To overcome these challenges, we introduce an adaptive V-cycle remeshing scheme in conjunction with a re-parametrized Adam optimizer to enhance topological and geometric awareness. By periodically coarsening and refining the deforming mesh, our method informs mesh vertices of their current topology and geometry before optimization, mitigating gradient issues while preserving essential topological features. Additionally, we enforce topological consistency by constructing topological primitives with genus numbers that match those of ground truth using Gauss-Bonnet theorem. Experimental results demonstrate that our inverse rendering approach outperforms the current state-of-the-art method, achieving significant improvements in Chamfer Distance and Volume IoU, particularly for high-genus surfaces, while also enhancing surface details for low-genus surfaces. 2025-11-24T01:44:09Z 3DV2026 Accepted (Poster) Xiang Gao Xinmu Wang Xiaolong Wu Jiazhi Li Jingyu Shi Yu Guo Yuanpeng Liu Xiyun Song Heather Yu Zongfang Lin Xianfeng David Gu http://arxiv.org/abs/2511.18441v1 ReCoGS: Real-time ReColoring for Gaussian Splatting scenes 2025-11-23T13:25:14Z Gaussian Splatting has emerged as a leading method for novel view synthesis, offering superior training efficiency and real-time inference compared to NeRF approaches, while still delivering high-quality reconstructions. Beyond view synthesis, this 3D representation has also been explored for editing tasks. Many existing methods leverage 2D diffusion models to generate multi-view datasets for training, but they often suffer from limitations such as view inconsistencies, lack of fine-grained control, and high computational demand. In this work, we focus specifically on the editing task of recoloring. We introduce a user-friendly pipeline that enables precise selection and recoloring of regions within a pre-trained Gaussian Splatting scene. To demonstrate the real-time performance of our method, we also present an interactive tool that allows users to experiment with the pipeline in practice. Code is available at https://github.com/loryruta/recogs. 2025-11-23T13:25:14Z Project page is available at https://github.com/loryruta/recogs Lorenzo Rutayisire Nicola Capodieci Fabio Pellacini http://arxiv.org/abs/2511.09361v2 Computational Caustic Design for Surface Light Source 2025-11-22T13:18:30Z Designing freeform surfaces to control light based on real-world illumination patterns is challenging, as existing caustic lens designs often assume oversimplified point or parallel light sources. We propose representing surface light sources using an optimized set of point sources, whose parameters are fitted to the real light source's illumination using a novel differentiable rendering framework. Our physically-based rendering approach simulates light transmission using flux, without requiring prior knowledge of the light source's intensity distribution. To efficiently explore the light source parameter space during optimization, we apply a contraction mapping that converts the constrained problem into an unconstrained one. Using the optimized light source model, we then design the freeform lens shape considering flux consistency and normal integrability. Simulations and physical experiments show our method more accurately represents real surface light sources compared to point-source approximations, yielding caustic lenses that produce images closely matching the target light distributions. 2025-11-12T14:23:17Z Accepted to IEEE Transactions on Visualization and Computer Graphics Sizhuo Zhou Yuou Sun Bailin Deng Juyong Zhang 10.1109/TVCG.2025.3633081 http://arxiv.org/abs/2511.17932v1 Novel View Synthesis from A Few Glimpses via Test-Time Natural Video Completion 2025-11-22T06:08:29Z Given just a few glimpses of a scene, can you imagine the movie playing out as the camera glides through it? That's the lens we take on \emph{sparse-input novel view synthesis}, not only as filling spatial gaps between widely spaced views, but also as \emph{completing a natural video} unfolding through space. We recast the task as \emph{test-time natural video completion}, using powerful priors from \emph{pretrained video diffusion models} to hallucinate plausible in-between views. Our \emph{zero-shot, generation-guided} framework produces pseudo views at novel camera poses, modulated by an \emph{uncertainty-aware mechanism} for spatial coherence. These synthesized frames densify supervision for \emph{3D Gaussian Splatting} (3D-GS) for scene reconstruction, especially in under-observed regions. An iterative feedback loop lets 3D geometry and 2D view synthesis inform each other, improving both the scene reconstruction and the generated views. The result is coherent, high-fidelity renderings from sparse inputs \emph{without any scene-specific training or fine-tuning}. On LLFF, DTU, DL3DV, and MipNeRF-360, our method significantly outperforms strong 3D-GS baselines under extreme sparsity. 2025-11-22T06:08:29Z Accepted to NeurIPS 2025 Yan Xu Yixing Wang Stella X. Yu http://arxiv.org/abs/2511.17501v1 Native 3D Editing with Full Attention 2025-11-21T18:59:26Z Instruction-guided 3D editing is a rapidly emerging field with the potential to broaden access to 3D content creation. However, existing methods face critical limitations: optimization-based approaches are prohibitively slow, while feed-forward approaches relying on multi-view 2D editing often suffer from inconsistent geometry and degraded visual quality. To address these issues, we propose a novel native 3D editing framework that directly manipulates 3D representations in a single, efficient feed-forward pass. Specifically, we create a large-scale, multi-modal dataset for instruction-guided 3D editing, covering diverse addition, deletion, and modification tasks. This dataset is meticulously curated to ensure that edited objects faithfully adhere to the instructional changes while preserving the consistency of unedited regions with the source object. Building upon this dataset, we explore two distinct conditioning strategies for our model: a conventional cross-attention mechanism and a novel 3D token concatenation approach. Our results demonstrate that token concatenation is more parameter-efficient and achieves superior performance. Extensive evaluations show that our method outperforms existing 2D-lifting approaches, setting a new benchmark in generation quality, 3D consistency, and instruction fidelity. 2025-11-21T18:59:26Z Weiwei Cai Shuangkang Fang Weicai Ye Xin Dong Yunhan Yang Xuanyang Zhang Wei Cheng Yanpei Cao Gang Yu Tao Chen http://arxiv.org/abs/2508.17995v2 Topology Aware Neural Interpolation of Scalar Fields 2025-11-21T18:16:26Z This paper presents a neural scheme for the topology-aware interpolation of time-varying scalar fields. Given a time-varying sequence of persistence diagrams, along with a sparse temporal sampling of the corresponding scalar fields, denoted as keyframes, our interpolation approach aims at "inverting" the non-keyframe diagrams to produce plausible estimations of the corresponding, missing data. For this, we rely on a neural architecture which learns the relation from a time value to the corresponding scalar field, based on the keyframe examples, and reliably extends this relation to the non-keyframe time steps. We show how augmenting this architecture with specific topological losses exploiting the input diagrams both improves the geometrical and topological reconstruction of the non-keyframe time steps. At query time, given an input time value for which an interpolation is desired, our approach instantaneously produces an output, via a single propagation of the time input through the network. Experiments interpolating 2D and 3D time-varying datasets show our approach superiority, both in terms of data and topological fitting, with regard to reference interpolation schemes. Our implementation is available at this GitHub link : https://github.com/MohamedKISSI/Topology-Aware-Neural-Interpolation-of-Scalar-Fields.git. 2025-08-25T13:04:21Z Mohamed Kissi Keanu Sisouk Joshua A. Levine Julien Tierny http://arxiv.org/abs/2511.17014v1 Parameter-Free Neural Lens Blur Rendering for High-Fidelity Composites 2025-11-21T07:32:05Z Consistent and natural camera lens blur is important for seamlessly blending 3D virtual objects into photographed real-scenes. Since lens blur typically varies with scene depth, the placement of virtual objects and their corresponding blur levels significantly affect the visual fidelity of mixed reality compositions. Existing pipelines often rely on camera parameters (e.g., focal length, focus distance, aperture size) and scene depth to compute the circle of confusion (CoC) for realistic lens blur rendering. However, such information is often unavailable to ordinary users, limiting the accessibility and generalizability of these methods. In this work, we propose a novel compositing approach that directly estimates the CoC map from RGB images, bypassing the need for scene depth or camera metadata. The CoC values for virtual objects are inferred through a linear relationship between its signed CoC map and depth, and realistic lens blur is rendered using a neural reblurring network. Our method provides flexible and practical solution for real-world applications. Experimental results demonstrate that our method achieves high-fidelity compositing with realistic defocus effects, outperforming state-of-the-art techniques in both qualitative and quantitative evaluations. 2025-11-21T07:32:05Z Accepted by ISMAR 2025 with oral presentation. 10 pages, 11 figures Lingyan Ruan Bin Chen Taehyun Rhee http://arxiv.org/abs/2511.16831v1 Vorion: A RISC-V GPU with Hardware-Accelerated 3D Gaussian Rendering and Training 2025-11-20T22:24:17Z 3D Gaussian Splatting (3DGS) has recently emerged as a foundational technique for real-time neural rendering, 3D scene generation, volumetric video (4D) capture. However, its rendering and training impose massive computation, making real-time rendering on edge devices and real-time 4D reconstruction on workstations currently infeasible. Given its fixed-function nature and similarity with traditional rasterization, 3DGS presents a strong case for dedicated hardware in the graphics pipeline of next-generation GPUs. This work, Vorion, presents the first GPGPU prototype with hardware-accelerated 3DGS rendering and training. Vorion features scalable architecture, minimal hardware change to traditional rasterizers, z-tiling to increase parallelism, and Gaussian/pixel-centric hybrid dataflow. We prototype the minimal system (8 SIMT cores, 2 Gaussian rasterizer) using TSMC 16nm FinFET technology, which achieves 19 FPS for rendering. The scaled design with 16 rasterizers achieves 38.6 iterations/s for training. 2025-11-20T22:24:17Z Yipeng Wang Mengtian Yang Chieh-pu Lo Jaydeep P. Kulkarni http://arxiv.org/abs/2511.16349v1 CRISTAL: Real-time Camera Registration in Static LiDAR Scans using Neural Rendering 2025-11-20T13:34:34Z Accurate camera localization is crucial for robotics and Extended Reality (XR), enabling reliable navigation and alignment of virtual and real content. Existing visual methods often suffer from drift, scale ambiguity, and depend on fiducials or loop closure. This work introduces a real-time method for localizing a camera within a pre-captured, highly accurate colored LiDAR point cloud. By rendering synthetic views from this cloud, 2D-3D correspondences are established between live frames and the point cloud. A neural rendering technique narrows the domain gap between synthetic and real images, reducing occlusion and background artifacts to improve feature matching. The result is drift-free camera tracking with correct metric scale in the global LiDAR coordinate system. Two real-time variants are presented: Online Render and Match, and Prebuild and Localize. We demonstrate improved results on the ScanNet++ dataset and outperform existing SLAM pipelines. 2025-11-20T13:34:34Z Joni Vanherck Steven Moonen Brent Zoomers Kobe Werner Jeroen Put Lode Jorissen Nick Michiels