https://arxiv.org/api/YATQ3s2IqKJ2VvGosQkiPC6Kokc
2026-06-25T14:30:52Z

http://arxiv.org/abs/2511.19189v1
AvatarBrush: Monocular Reconstruction of Gaussian Avatars with Intuitive Local Editing
2025-11-24T14:58:11Z
The efficient reconstruction of high-quality and intuitively editable human avatars presents a pressing challenge in the field of computer vision. Recent advancements, such as 3DGS, have demonstrated impressive reconstruction efficiency and rapid rendering speeds. However, intuitive local editing of these representations remains a significant challenge. In this work, we propose AvatarBrush, a framework that reconstructs fully animatable and locally editable avatars using only a monocular video input. We propose a three-layer model to represent the avatar and, inspired by mesh morphing techniques, design a framework to generate the Gaussian model from local information of the parametric body model. Compared to previous methods that require scanned meshes or multi-view captures as input, our approach reduces costs and enhances editing capabilities such as body shape adjustment, local texture modification, and geometry transfer. Our experimental results demonstrate superior quality across two datasets and emphasize the enhanced, user-friendly, and localized editing capabilities of our method.
2025-11-24T14:58:11Z
Mengtian Li, Shengxiang Yao, Yichen Pan, Haiyao Xiao, Zhongmei Li, Zhifeng Xie, Keyu Chen

http://arxiv.org/abs/2511.17443v2
GRAPHIC--Guidelines for Reviewing Algorithmic Practices in Human-centred Design and Interaction for Creativity
2025-11-24T09:38:31Z
Artificial Intelligence (AI) has been increasingly applied to creative domains, leading to the development of systems that collaborate with humans in design processes. In Graphic Design, integrating computational systems into co-creative workflows presents specific challenges, as it requires balancing scientific rigour with the subjective and visual nature of design practice.
Following the PRISMA methodology, we identified 872 articles, resulting in a final corpus of 71 publications describing 68 unique systems. Based on this review, we introduce GRAPHIC (Guidelines for Reviewing Algorithmic Practices in Human-centred Design and Interaction for Creativity), a framework for analysing computational systems applied to Graphic Design. Its goal is to understand how current systems support human-AI collaboration in the Graphic Design discipline. The framework comprises three main dimensions, which our analysis revealed to be essential across diverse system types: (1) Collaborative Panorama, (2) Processes and Modalities, and (3) Graphic Design Principles. Its application revealed research gaps, including the need to balance initiative and control between agents, improve communication through explainable interaction models, and promote systems that support transformational creativity grounded in core design principles.
2025-11-21T17:42:09Z
20 pages, 16 figures
Joana Rovira Martins, Pedro Martins, Ana Boavida

http://arxiv.org/abs/2511.18900v1
MatMart: Material Reconstruction of 3D Objects via Diffusion
2025-11-24T08:58:14Z
Applying diffusion models to physically-based material estimation and generation has recently gained prominence. In this paper, we propose MatMart, a novel material reconstruction framework for 3D objects, offering the following advantages. First, MatMart adopts a two-stage reconstruction, starting with accurate material prediction from inputs and followed by prior-guided material generation for unobserved views, yielding high-fidelity results. Second, by utilizing progressive inference alongside the proposed view-material cross-attention (VMCA), MatMart enables reconstruction from an arbitrary number of input images, demonstrating strong scalability and flexibility.
Finally, MatMart achieves both material prediction and generation capabilities through end-to-end optimization of a single diffusion model, without relying on additional pre-trained models, thereby exhibiting enhanced stability across various types of objects. Extensive experiments demonstrate that MatMart achieves superior performance in material reconstruction compared to existing methods.
2025-11-24T08:58:14Z
Xiuchao Wu, Pengfei Zhu, Jiangjing Lyu, Xinguo Liu, Jie Guo, Yanwen Guo, Weiwei Xu, Chengfei Lyu

http://arxiv.org/abs/2511.18873v1
Neural Texture Splatting: Expressive 3D Gaussian Splatting for View Synthesis, Geometry, and Dynamic Reconstruction
2025-11-24T08:26:32Z
3D Gaussian Splatting (3DGS) has emerged as a leading approach for high-quality novel view synthesis, with numerous variants extending its applicability to a broad spectrum of 3D and 4D scene reconstruction tasks. Despite its success, the representational capacity of 3DGS remains limited by the use of 3D Gaussian kernels to model local variations. Recent works have proposed to augment 3DGS with additional per-primitive capacity, such as per-splat textures, to enhance its expressiveness. However, these per-splat texture approaches primarily target dense novel view synthesis with a reduced number of Gaussian primitives, and their effectiveness tends to diminish when applied to more general reconstruction scenarios. In this paper, we aim to achieve concrete performance improvement over state-of-the-art 3DGS variants across a wide range of reconstruction tasks, including novel view synthesis, geometry and dynamic reconstruction, under both sparse and dense input settings. To this end, we introduce Neural Texture Splatting (NTS). At the core of our approach is a global neural field (represented as a hybrid of a tri-plane and a neural decoder) that predicts local appearance and geometric fields for each primitive.
By leveraging this shared global representation that models local texture fields across primitives, we significantly reduce model size and facilitate efficient global information exchange, demonstrating strong generalization across tasks. Furthermore, our neural modeling of local texture fields introduces expressive view- and time-dependent effects, a critical aspect that existing methods fail to account for. Extensive experiments show that Neural Texture Splatting consistently improves models and achieves state-of-the-art results across multiple benchmarks.
2025-11-24T08:26:32Z
SIGGRAPH Asia 2025 (conference track), Project page: https://19reborn.github.io/nts/
Yiming Wang, Shaofei Wang, Marko Mihajlovic, Siyu Tang

http://arxiv.org/abs/2105.01610v4
Reliving the Dataset: Combining the Visualization of Road Users' Interactions with Scenario Reconstruction in Virtual Reality
2025-11-24T08:02:45Z
One core challenge in the development of automated vehicles is their capability to deal with a multitude of complex traffic scenarios with many hard-to-predict traffic participants. As part of the iterative development process, it is necessary to detect critical scenarios and generate knowledge from them to improve the highly automated driving (HAD) function. In order to tackle this challenge, numerous datasets have been released in the past years, which act as the basis for the development and testing of such algorithms. Nevertheless, the remaining challenges are to find relevant scenes, such as safety-critical corner cases, in these datasets and to understand them completely. Therefore, this paper presents a methodology to process and analyze naturalistic motion datasets in two ways: On the one hand, our approach maps scenes of the datasets to a generic semantic scene graph which allows for a high-level and objective analysis. Here, arbitrary criticality measures, e.g.
TTC, RSS or SFF, can be set to automatically detect critical scenarios between traffic participants. On the other hand, the scenarios are recreated in a realistic virtual reality (VR) environment, which allows for a subjective close-up analysis from multiple, interactive perspectives.
2021-05-04T16:39:06Z
Accepted for publication at ICITE 2021
Lars Töttel, Maximilian Zipfl, Daniel Bogdoll, Marc René Zofka, J. Marius Zöllner
10.1007/978-981-19-2259-6_39

http://arxiv.org/abs/2512.03052v1
LATTICE: Democratize High-Fidelity 3D Generation at Scale
2025-11-24T03:22:19Z
We present LATTICE, a new framework for high-fidelity 3D asset generation that bridges the quality and scalability gap between 3D and 2D generative models. While 2D image synthesis benefits from fixed spatial grids and well-established transformer architectures, 3D generation remains fundamentally more challenging due to the need to predict both spatial structure and detailed geometric surfaces from scratch. These challenges are exacerbated by the computational complexity of existing 3D representations and the lack of structured and scalable 3D asset encoding schemes. To address this, we propose VoxSet, a semi-structured representation that compresses 3D assets into a compact set of latent vectors anchored to a coarse voxel grid, enabling efficient and position-aware generation. VoxSet retains the simplicity and compression advantages of prior VecSet methods while introducing explicit structure into the latent space, allowing positional embeddings to guide generation and enabling strong token-level test-time scaling. Built upon this representation, LATTICE adopts a two-stage pipeline: first generating a sparse voxelized geometry anchor, then producing detailed geometry using a rectified flow transformer.
Our method is simple at its core, but supports arbitrary resolution decoding, low-cost training, and flexible inference schemes, achieving state-of-the-art performance on various aspects, and offering a significant step toward scalable, high-quality 3D asset creation.
2025-11-24T03:22:19Z
Technical Report
Zeqiang Lai, Yunfei Zhao, Zibo Zhao, Haolin Liu, Qingxiang Lin, Jingwei Huang, Chunchao Guo, Xiangyu Yue

http://arxiv.org/abs/2511.18680v1
Inverse Rendering for High-Genus Surface Meshes from Multi-View Images
2025-11-24T01:44:09Z
We present a topology-informed inverse rendering approach for reconstructing high-genus surface meshes from multi-view images. Compared to 3D representations like voxels and point clouds, mesh-based representations are preferred as they enable the application of differential geometry theory and are optimized for modern graphics pipelines. However, existing inverse rendering methods often fail catastrophically on high-genus surfaces, leading to the loss of key topological features, and tend to oversmooth low-genus surfaces, resulting in the loss of surface details. This failure stems from their overreliance on Adam-based optimizers, which can lead to vanishing and exploding gradients. To overcome these challenges, we introduce an adaptive V-cycle remeshing scheme in conjunction with a re-parametrized Adam optimizer to enhance topological and geometric awareness. By periodically coarsening and refining the deforming mesh, our method informs mesh vertices of their current topology and geometry before optimization, mitigating gradient issues while preserving essential topological features. Additionally, we enforce topological consistency by constructing topological primitives with genus numbers that match those of ground truth using the Gauss-Bonnet theorem.
Experimental results demonstrate that our inverse rendering approach outperforms the current state-of-the-art method, achieving significant improvements in Chamfer Distance and Volume IoU, particularly for high-genus surfaces, while also enhancing surface details for low-genus surfaces.
2025-11-24T01:44:09Z
3DV2026 Accepted (Poster)
Xiang Gao, Xinmu Wang, Xiaolong Wu, Jiazhi Li, Jingyu Shi, Yu Guo, Yuanpeng Liu, Xiyun Song, Heather Yu, Zongfang Lin, Xianfeng David Gu

http://arxiv.org/abs/2511.18441v1
ReCoGS: Real-time ReColoring for Gaussian Splatting scenes
2025-11-23T13:25:14Z
Gaussian Splatting has emerged as a leading method for novel view synthesis, offering superior training efficiency and real-time inference compared to NeRF approaches, while still delivering high-quality reconstructions. Beyond view synthesis, this 3D representation has also been explored for editing tasks. Many existing methods leverage 2D diffusion models to generate multi-view datasets for training, but they often suffer from limitations such as view inconsistencies, lack of fine-grained control, and high computational demand. In this work, we focus specifically on the editing task of recoloring. We introduce a user-friendly pipeline that enables precise selection and recoloring of regions within a pre-trained Gaussian Splatting scene. To demonstrate the real-time performance of our method, we also present an interactive tool that allows users to experiment with the pipeline in practice. Code is available at https://github.com/loryruta/recogs.
2025-11-23T13:25:14Z
Project page is available at https://github.com/loryruta/recogs
Lorenzo Rutayisire, Nicola Capodieci, Fabio Pellacini

http://arxiv.org/abs/2511.09361v2
Computational Caustic Design for Surface Light Source
2025-11-22T13:18:30Z
Designing freeform surfaces to control light based on real-world illumination patterns is challenging, as existing caustic lens designs often assume oversimplified point or parallel light sources.
We propose representing surface light sources using an optimized set of point sources, whose parameters are fitted to the real light source's illumination using a novel differentiable rendering framework. Our physically-based rendering approach simulates light transmission using flux, without requiring prior knowledge of the light source's intensity distribution. To efficiently explore the light source parameter space during optimization, we apply a contraction mapping that converts the constrained problem into an unconstrained one. Using the optimized light source model, we then design the freeform lens shape considering flux consistency and normal integrability. Simulations and physical experiments show our method more accurately represents real surface light sources compared to point-source approximations, yielding caustic lenses that produce images closely matching the target light distributions.
2025-11-12T14:23:17Z
Accepted to IEEE Transactions on Visualization and Computer Graphics
Sizhuo Zhou, Yuou Sun, Bailin Deng, Juyong Zhang
10.1109/TVCG.2025.3633081

http://arxiv.org/abs/2511.17932v1
Novel View Synthesis from A Few Glimpses via Test-Time Natural Video Completion
2025-11-22T06:08:29Z
Given just a few glimpses of a scene, can you imagine the movie playing out as the camera glides through it? That's the lens we take on \emph{sparse-input novel view synthesis}, not only as filling spatial gaps between widely spaced views, but also as \emph{completing a natural video} unfolding through space.
We recast the task as \emph{test-time natural video completion}, using powerful priors from \emph{pretrained video diffusion models} to hallucinate plausible in-between views. Our \emph{zero-shot, generation-guided} framework produces pseudo views at novel camera poses, modulated by an \emph{uncertainty-aware mechanism} for spatial coherence. These synthesized frames densify supervision for \emph{3D Gaussian Splatting} (3D-GS) for scene reconstruction, especially in under-observed regions. An iterative feedback loop lets 3D geometry and 2D view synthesis inform each other, improving both the scene reconstruction and the generated views.
The result is coherent, high-fidelity renderings from sparse inputs \emph{without any scene-specific training or fine-tuning}. On LLFF, DTU, DL3DV, and MipNeRF-360, our method significantly outperforms strong 3D-GS baselines under extreme sparsity.
2025-11-22T06:08:29Z
Accepted to NeurIPS 2025
Yan Xu, Yixing Wang, Stella X. Yu

http://arxiv.org/abs/2511.17501v1
Native 3D Editing with Full Attention
2025-11-21T18:59:26Z
Instruction-guided 3D editing is a rapidly emerging field with the potential to broaden access to 3D content creation. However, existing methods face critical limitations: optimization-based approaches are prohibitively slow, while feed-forward approaches relying on multi-view 2D editing often suffer from inconsistent geometry and degraded visual quality. To address these issues, we propose a novel native 3D editing framework that directly manipulates 3D representations in a single, efficient feed-forward pass. Specifically, we create a large-scale, multi-modal dataset for instruction-guided 3D editing, covering diverse addition, deletion, and modification tasks. This dataset is meticulously curated to ensure that edited objects faithfully adhere to the instructional changes while preserving the consistency of unedited regions with the source object. Building upon this dataset, we explore two distinct conditioning strategies for our model: a conventional cross-attention mechanism and a novel 3D token concatenation approach. Our results demonstrate that token concatenation is more parameter-efficient and achieves superior performance.
Extensive evaluations show that our method outperforms existing 2D-lifting approaches, setting a new benchmark in generation quality, 3D consistency, and instruction fidelity.
2025-11-21T18:59:26Z
Weiwei Cai, Shuangkang Fang, Weicai Ye, Xin Dong, Yunhan Yang, Xuanyang Zhang, Wei Cheng, Yanpei Cao, Gang Yu, Tao Chen

http://arxiv.org/abs/2508.17995v2
Topology Aware Neural Interpolation of Scalar Fields
2025-11-21T18:16:26Z
This paper presents a neural scheme for the topology-aware interpolation of time-varying scalar fields. Given a time-varying sequence of persistence diagrams, along with a sparse temporal sampling of the corresponding scalar fields, denoted as keyframes, our interpolation approach aims at "inverting" the non-keyframe diagrams to produce plausible estimations of the corresponding, missing data. For this, we rely on a neural architecture which learns the relation from a time value to the corresponding scalar field, based on the keyframe examples, and reliably extends this relation to the non-keyframe time steps. We show how augmenting this architecture with specific topological losses exploiting the input diagrams improves both the geometrical and topological reconstruction of the non-keyframe time steps. At query time, given an input time value for which an interpolation is desired, our approach instantaneously produces an output, via a single propagation of the time input through the network. Experiments interpolating 2D and 3D time-varying datasets show our approach's superiority, both in terms of data and topological fitting, with regard to reference interpolation schemes. Our implementation is available at this GitHub link: https://github.com/MohamedKISSI/Topology-Aware-Neural-Interpolation-of-Scalar-Fields.git.
2025-08-25T13:04:21Z
Mohamed Kissi, Keanu Sisouk, Joshua A.
Levine, Julien Tierny

http://arxiv.org/abs/2511.17014v1
Parameter-Free Neural Lens Blur Rendering for High-Fidelity Composites
2025-11-21T07:32:05Z
Consistent and natural camera lens blur is important for seamlessly blending 3D virtual objects into photographed real scenes. Since lens blur typically varies with scene depth, the placement of virtual objects and their corresponding blur levels significantly affect the visual fidelity of mixed reality compositions. Existing pipelines often rely on camera parameters (e.g., focal length, focus distance, aperture size) and scene depth to compute the circle of confusion (CoC) for realistic lens blur rendering. However, such information is often unavailable to ordinary users, limiting the accessibility and generalizability of these methods. In this work, we propose a novel compositing approach that directly estimates the CoC map from RGB images, bypassing the need for scene depth or camera metadata. The CoC values for virtual objects are inferred through a linear relationship between its signed CoC map and depth, and realistic lens blur is rendered using a neural reblurring network. Our method provides a flexible and practical solution for real-world applications. Experimental results demonstrate that our method achieves high-fidelity compositing with realistic defocus effects, outperforming state-of-the-art techniques in both qualitative and quantitative evaluations.
2025-11-21T07:32:05Z
Accepted by ISMAR 2025 with oral presentation. 10 pages, 11 figures
Lingyan Ruan, Bin Chen, Taehyun Rhee

http://arxiv.org/abs/2511.16831v1
Vorion: A RISC-V GPU with Hardware-Accelerated 3D Gaussian Rendering and Training
2025-11-20T22:24:17Z
3D Gaussian Splatting (3DGS) has recently emerged as a foundational technique for real-time neural rendering, 3D scene generation, and volumetric video (4D) capture.
However, its rendering and training impose massive computation, making real-time rendering on edge devices and real-time 4D reconstruction on workstations currently infeasible. Given its fixed-function nature and similarity with traditional rasterization, 3DGS presents a strong case for dedicated hardware in the graphics pipeline of next-generation GPUs. This work, Vorion, presents the first GPGPU prototype with hardware-accelerated 3DGS rendering and training. Vorion features a scalable architecture, minimal hardware change to traditional rasterizers, z-tiling to increase parallelism, and Gaussian/pixel-centric hybrid dataflow. We prototype the minimal system (8 SIMT cores, 2 Gaussian rasterizers) using TSMC 16nm FinFET technology, which achieves 19 FPS for rendering. The scaled design with 16 rasterizers achieves 38.6 iterations/s for training.
2025-11-20T22:24:17Z
Yipeng Wang, Mengtian Yang, Chieh-pu Lo, Jaydeep P. Kulkarni

http://arxiv.org/abs/2511.16349v1
CRISTAL: Real-time Camera Registration in Static LiDAR Scans using Neural Rendering
2025-11-20T13:34:34Z
Accurate camera localization is crucial for robotics and Extended Reality (XR), enabling reliable navigation and alignment of virtual and real content. Existing visual methods often suffer from drift and scale ambiguity, and depend on fiducials or loop closure. This work introduces a real-time method for localizing a camera within a pre-captured, highly accurate colored LiDAR point cloud. By rendering synthetic views from this cloud, 2D-3D correspondences are established between live frames and the point cloud. A neural rendering technique narrows the domain gap between synthetic and real images, reducing occlusion and background artifacts to improve feature matching. The result is drift-free camera tracking with correct metric scale in the global LiDAR coordinate system. Two real-time variants are presented: Online Render and Match, and Prebuild and Localize.
We demonstrate improved results on the ScanNet++ dataset and outperform existing SLAM pipelines.
2025-11-20T13:34:34Z
Joni Vanherck, Steven Moonen, Brent Zoomers, Kobe Werner, Jeroen Put, Lode Jorissen, Nick Michiels