http://arxiv.org/abs/2602.01674v2
VRGaussianAvatar: Integrating 3D Gaussian Avatars into VR
Updated: 2026-05-04T04:40:10Z
We present VRGaussianAvatar, an integrated system that enables real-time full-body 3D Gaussian Splatting (3DGS) avatars in virtual reality using only head-mounted display (HMD) tracking signals. The system adopts a parallel pipeline with a VR Frontend and a GA Backend. The VR Frontend uses inverse kinematics to estimate full-body pose and streams the resulting pose along with stereo camera parameters to the backend. The GA Backend stereoscopically renders a 3DGS avatar reconstructed from a single image. To improve stereo rendering efficiency, we introduce Binocular Batching, which jointly processes left and right eye views in a single batched pass to reduce redundant computation and support high-resolution VR displays. We evaluate VRGaussianAvatar with quantitative performance tests and a within-subject user study against image- and video-based mesh avatar baselines. Results show that VRGaussianAvatar sustains interactive VR performance and yields higher perceived appearance similarity, embodiment, and plausibility. Project page and source code are available at https://vrgaussianavatar.github.io.
Published: 2026-02-02T05:42:40Z
Accepted as an IEEE TVCG paper at IEEE VR 2026 (journal track)
Authors: Hail Song, Boram Yoon, Seokhwan Yang, Seoyoung Kang, Hyunjeong Kim, Henning Metzmacher, Woontack Woo
DOI: 10.1109/TVCG.2026.3680708

http://arxiv.org/abs/2605.02086v1
GETA-3DGS: Automatic Joint Structured Pruning and Quantization for 3D Gaussian Splatting
Updated: 2026-05-03T23:12:14Z
3D Gaussian splatting (3DGS) is a state-of-the-art representation for real-time photorealistic novel-view synthesis, yet a single high-fidelity scene typically occupies hundreds of megabytes to several gigabytes, exceeding the budgets of mobile, immersive, and volumetric video platforms.
Existing 3DGS compression methods (e.g., HAC++, FlexGaussian, LP-3DGS) treat pruning, quantization, and entropy coding as separate stages and rely on hand-tuned heuristics (opacity thresholds, fixed bit-widths, SH truncation), limiting cross-scene generalization and preventing users from specifying a target rate or quality budget. We propose GETA-3DGS, to our knowledge the first end-to-end automatic joint structured pruning and quantization framework for 3DGS. Building on GETA for joint pruning-quantization of deep networks, we contribute: (i) a 3DGS-aware quantization-aware dependency graph (QADG) treating each Gaussian primitive as a group with five attribute sub-nodes and degree-aware SH sub-nodes; (ii) a render-aware saliency fusing transmittance-weighted contribution, screen-space gradient, and pixel coverage into a Gaussian-level importance score; and (iii) a heterogeneous per-attribute mixed-precision scheme co-optimized with structural sparsity under a projected partial saliency-guided (PPSG) descent guarantee. On Mip-NeRF 360, Tanks and Temples, and Deep Blending, GETA-3DGS operates directly on raw Gaussian primitives rather than a post-hoc anchor representation, delivering ~5x storage reduction over Vanilla 3DGS with no per-scene thresholds. Bit-width policy is the dominant rate-distortion lever: a uniform 6-bit cap costs up to -6.74 dB on view-dependent scenes versus our heterogeneous allocation, matching an information-theoretic reverse-water-filling analysis we develop. 
GETA-3DGS is complementary to existing codecs: entropy coding (HAC++, CompGS) is downstream, so the two can be composed.
Published: 2026-05-03T23:12:14Z
Authors: Baobing Zhang, Wanxin Sui

http://arxiv.org/abs/2512.08309v4
InfiniteDiffusion: Bridging Learned Fidelity and Procedural Utility for Open-World Terrain Generation
Updated: 2026-05-03T20:00:14Z
For decades, procedural worlds have been built on procedural noise functions such as Perlin noise, which are fast and infinite, yet fundamentally limited in realism and large-scale coherence. Conversely, diffusion models offer unprecedented fidelity but remain generally confined to bounded canvases. We introduce InfiniteDiffusion, a training-free algorithm that reformulates diffusion sampling for lazy and unbounded generation, bridging the fidelity of diffusion models with the properties that made procedural noise indispensable: seamless infinite extent, seed-consistency, and constant-time random access. To demonstrate the utility of this approach, we present Terrain Diffusion, a framework for learned procedural terrain generation with a procedural noise-like interface. Our framework outpaces orbital velocity by 9 times on a consumer GPU, enabling realistic terrain generation at interactive rates. We integrate a hierarchical stack of diffusion models to couple planetary context with local detail, a compact Laplacian encoding to stabilize outputs across Earth-scale dynamic ranges, and an open-source infinite-tensor framework for constant-memory manipulation of unbounded tensors.
Together, these components position diffusion models as a practical foundation for the next generation of infinite virtual worlds.
Published: 2025-12-09T07:10:35Z
Project website: https://xandergos.github.io/terrain-diffusion/ Code: https://github.com/xandergos/terrain-diffusion/
Authors: Alexander Goslin

http://arxiv.org/abs/2605.01925v1
CADFS: A Big CAD Program Dataset and Framework for Computer-Aided Design with Large Language Models
Updated: 2026-05-03T15:12:36Z
We introduce CADFS, a data-centric framework that enables large vision-language models to generate complex CAD design histories. Existing generative CAD systems are restricted to sketch-extrude operations due to simplified representations and limited datasets. We address this by introducing a FeatureScript-based representation and constructing a dataset of 450k real-world CAD models spanning 15 modeling operations. We obtain the dataset via a new pipeline that reconstructs clean, executable FeatureScript programs and provides multimodal annotations. Fine-tuning a VLM on this representation yields state-of-the-art results in text-conditioned CAD generation and image-based reconstruction, producing more accurate, diverse, and feature-rich designs than prior frameworks. Ablations show that each individual component of our framework, i.e., the FeatureScript representation, the extended operation set, and representation-aligned textual descriptions, significantly improves performance. Our framework substantially broadens the complexity and realism achievable in generative CAD.
The CADFS framework and the new dataset are available at https://voyleg.github.io/cadfs/.
Published: 2026-05-03T15:12:36Z
Accepted to CVPR 2026
Authors: Vladislav Pyatov, Gleb Bobrovskikh, Saveliy Galochkin, Nikita Boldyrev, Oleg Voynov, Alexander Filippov, Gonzalo Ferrer, Peter Wonka, Evgeny Burnaev

http://arxiv.org/abs/2605.01919v1
Greed for the Spheres: A Signed Distance Interpolation Method
Updated: 2026-05-03T15:00:18Z
We propose a method to interpolate Signed Distance Function (SDF) data from a discrete set of samples. Unlike prior work, our approach ensures that the new SDF data values are fully consistent with the input and each other, such that the augmented data still corresponds to a geometrically realizable surface. We express the theoretical properties of SDFs as hard geometric constraints, and construct an efficient greedy algorithm for consistent SDF interpolation that is made even faster with powerful parallelized GPU preprocessing. We exemplify the usefulness of our method by evaluating it on three practical applications: global SDF refinement, in which the SDF data is upsampled without knowledge of the ground truth; mesh reconstruction, where our method can reconstruct highly detailed surfaces using global information from coarse input SDFs; and repair of pseudo-SDFs, which result from many pipelines such as CSG Boolean operations and must be turned into valid SDFs for downstream processing tasks. Our refined SDFs are guaranteed to be consistent with the input, where previous methods have no such guarantee.
Published: 2026-05-03T15:00:18Z
Authors: Letao Chen, Sanju Mupparaju, Christopher Batty, Silvia Sellán, Oded Stein

http://arxiv.org/abs/2605.01854v1
High-Fidelity Mobile Avatars with Pruned Local Blendshapes
Updated: 2026-05-03T12:51:05Z
We propose a method to reconstruct high-fidelity human avatars from multi-view video that can run on mobile devices. Many works can model high-quality Gaussian-based full-body avatars from multi-view video.
However, these methods require heavy computation to obtain pose-dependent appearance, making deployment on mobile devices very difficult. Recent methods distill from pretrained models and model pose-dependent nonlinear Gaussian attributes by linearly combining global pose features with blendshapes. Although they can run on mobile devices, they suffer some loss of detail. We observe that nearby Gaussians are often highly correlated within a local region of the body, and can be linearly modeled with less error. Therefore, we use local linear blendshapes in small body parts to capture global nonlinear changes of Gaussian attributes. To further reduce computation and model size, we propose to remove blendshapes for Gaussians whose attributes change little, yielding a minimal blendshape representation. Our method is an end-to-end training method without a pretrained model. To make it run on multiple devices, we implement our method using WebGPU. Experiments show that our method can render high-quality human avatars with better details, and can reach 120 FPS at 2K resolution on mobile devices.
Published: 2026-05-03T12:51:05Z
CVPR 2026. Project page https://gapszju.github.io/webavatar/
Authors: Youyi Zhan, He Wang, Tianjia Shao, Kun Zhou

http://arxiv.org/abs/2605.01536v1
The Antipodal Method: Fast, Accurate, and Robust 3D Generalized Winding Numbers
Updated: 2026-05-02T17:01:54Z
Generalized winding numbers provide a robust measure of point insidedness for 3D surfaces - whether open, self-intersecting, or non-manifold - and are central to numerous geometry processing tasks. However, existing methods trade off between accuracy and computational efficiency, limiting their use in interactive and large-scale applications.
We introduce a new formulation and algorithm for computing generalized winding numbers that is both fast and accurate to arbitrary precision, applicable to meshes and parametric surfaces. Our approach expresses the winding number as the sum of two intuitive geometric quantities: the signed number of ray-surface intersections and a boundary integral over the surface's projection onto the unit sphere. This insight leads to an efficient discretization that avoids expensive surface integrals and spherical arrangements.
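As background for the quantity discussed above, the sketch below evaluates the generalized winding number of a triangle mesh by the classical direct summation of signed solid angles (using the Van Oosterom and Strackee formula). It is not the antipodal formulation the abstract describes; it is the naive baseline that faster methods are typically measured against. The function names `solid_angle` and `winding_number` and the tetrahedron test data are illustrative assumptions.

```python
import math


def solid_angle(a, b, c):
    """Signed solid angle of triangle (a, b, c) as seen from the origin."""
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    la, lb, lc = (math.sqrt(dot(v, v)) for v in (a, b, c))
    # Scalar triple product a . (b x c); its sign encodes the orientation.
    det = (a[0] * (b[1] * c[2] - b[2] * c[1])
           - a[1] * (b[0] * c[2] - b[2] * c[0])
           + a[2] * (b[0] * c[1] - b[1] * c[0]))
    denom = la * lb * lc + dot(a, b) * lc + dot(b, c) * la + dot(c, a) * lb
    return 2.0 * math.atan2(det, denom)


def winding_number(vertices, faces, q):
    """Sum the signed solid angles of all triangles, normalized by 4*pi."""
    total = 0.0
    for i, j, k in faces:
        # Shift each triangle so that the query point sits at the origin.
        a, b, c = (tuple(p - s for p, s in zip(vertices[n], q)) for n in (i, j, k))
        total += solid_angle(a, b, c)
    return total / (4.0 * math.pi)


# Hypothetical test mesh: a unit tetrahedron with outward-facing triangles.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
inside = winding_number(verts, faces, (0.1, 0.1, 0.1))
outside = winding_number(verts, faces, (2.0, 2.0, 2.0))
```

For a closed, outward-oriented mesh this sum is 1 at interior points and 0 at exterior points (up to floating-point error); for open or self-intersecting surfaces it takes other values, which is what makes the measure robust. The cost is linear in the number of triangles per query.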
For meshes, our method achieves average speedups of $22\times$ on a CPU compared to the fastest precise methods and $3\times$ compared to the fastest approximation method, while maintaining full precision. On a GPU, for moderately complex meshes we reach a throughput of $10^9$ queries per second, or $4K$ generalized winding number slices at 120 FPS ($13\times$ faster than a naive GPU method). For parametric surfaces, our method is on average $5.6\times$ faster than the state-of-the-art method, with the same precision. Our method naturally handles complex topologies and non-manifold inputs. We extensively validate its accuracy, robustness, and time performance. Our code is available at https://github.com/MartensCedric/antipodal.
Published: 2026-05-02T17:01:54Z
11 pages
Authors: Cedric Martens, Philip Trettner, Mikhail Bessmeltsev
DOI: 10.1145/3811323

http://arxiv.org/abs/2605.01456v1
How Historians Use Visualization: A Corpus-Backed Taxonomy and Analysis for Cross-Disciplinary Practice
Updated: 2026-05-02T14:07:31Z
Visualization in historical research is shifting from isolated attempts to systematic practices. However, data-driven evidence about how historians actually use visualization remains scarce. We present a corpus-driven, mixed-methods study that combines analysis of images from 4,142 research articles across history and digital humanities journals with a collaboratively developed visualization taxonomy and a semi-automatic labeling pipeline. We construct a corpus of 14,021 images, classify 4,831 visualization instances using a hierarchical, domain-informed taxonomy, and analyze patterns of visualization adoption across venues, history subfields, and time. To interpret these patterns, we conduct interviews with 11 historians and use HiFigAtlas system as a boundary object to support joint inspection of the corpus. We identify distinct roles for visualizations in historical research: primary-source, evidence-synthesis, communicative, confirmative, and exploratory.
We further find that while historians pursue diverse goals with figures, persistent epistemological and practical barriers, such as uncertainty, provenance, justification burden, and publication constraints, impede the adoption of visualization. This work contributes a grounded account of visualization use in historical scholarship and points to opportunities to better support domain-specific needs.
Published: 2026-05-02T14:07:31Z
Authors: Xinyue Chen, Yu Zhang, Weili Zheng, Chiteng Ma, Xiaoru Yuan
DOI: 10.1111/cgf.70468

http://arxiv.org/abs/2604.09132v2
Strips as Tokens: Artist Mesh Generation with Native UV Segmentation
Updated: 2026-05-02T02:39:12Z
Recent advancements in autoregressive transformers have demonstrated remarkable potential for generating artist-quality meshes. However, the token ordering strategies employed by existing methods typically fail to meet professional artist standards, where coordinate-based sorting yields inefficiently long sequences, and patch-based heuristics disrupt the continuous edge flow and structural regularity essential for high-quality modeling. To address these limitations, we propose Strips as Tokens (SATO), a novel framework with a token ordering strategy inspired by triangle strips. By constructing the sequence as a connected chain of faces that explicitly encodes UV boundaries, our method naturally preserves the organized edge flow and semantic layout characteristic of artist-created meshes. A key advantage of this formulation is its unified representation, enabling the same token sequence to be decoded into either a triangle or quadrilateral mesh. This flexibility facilitates joint training on both data types: large-scale triangle data provides fundamental structural priors, while high-quality quad data enhances the geometric regularity of the outputs. Extensive experiments demonstrate that SATO consistently outperforms prior methods in terms of geometric quality, structural coherence, and UV segmentation.
Project page: https://ruixu.me/html/SATO/index.html
Published: 2026-04-10T09:13:09Z
ACM Transactions on Graphics. SIGGRAPH 2026
Authors: Rui Xu, Dafei Qin, Kaichun Qiao, Qiujie Dong, Huaijin Pi, Qixuan Zhang, Longwen Zhang, Lan Xu, Jingyi Yu, Wenping Wang, Taku Komura

http://arxiv.org/abs/2601.06035v2
Investigating Anthropometric Fidelity in SAM 3D Body
Updated: 2026-05-02T00:45:24Z
The release of SAM 3D Body is a recent development in human mesh recovery, demonstrating improved performance in producing clean, topologically coherent meshes from single images. By leveraging the Momentum Human Rig (MHR), it achieves robustness to occlusion and diverse poses. However, our evaluation reveals a specific and consistent limitation: the model struggles to reconstruct detailed anthropometric deviations, particularly in populations exhibiting distinctive morphological alterations such as geriatric muscle atrophy, scoliosis, or pregnancy, even when these features are prominent in the input image. In this paper, we investigate this phenomenon not as a failure of the model's capacity, but as a byproduct of the "perception-distortion trade-off". We posit that the architectural reliance on the low-dimensional parametric MHR representation, combined with semantic-invariant conditioning (DINOv3) and annotation-based alignment, creates a pervasive "regression to the mean" effect. We analyze these mechanisms to understand why individual biological details are smoothed out.
Furthermore, we state our contributions by proposing specific, constructive pathways for future work, such as implicit-explicit hybrid representations and Medical-in-the-Loop alignment, to extend the baseline performance of SAM 3D Body into the high-precision medical domain.
Published: 2025-12-02T05:33:17Z
Authors: Aizierjiang Aiersilan, Ruting Cheng, James Hahn

http://arxiv.org/abs/2506.22899v3
Neural Cellular Automata: From Cells to Pixels
Updated: 2026-05-01T23:39:16Z
Neural Cellular Automata (NCAs) are bio-inspired dynamical systems in which identical cells iteratively apply a learned local update rule to self-organize into complex patterns, exhibiting regeneration, robustness, and spontaneous dynamics. Despite their success in texture synthesis and morphogenesis, NCAs remain largely confined to low-resolution outputs. This limitation stems from (1) training time and memory requirements that grow quadratically with grid size, (2) the strictly local propagation of information that impedes long-range cell communication, and (3) the heavy compute demands of real-time inference at high resolution. In this work, we overcome this limitation by pairing an NCA that evolves on a coarse grid with a lightweight implicit decoder that maps cell states and local coordinates to appearance attributes, enabling the same model to render outputs at arbitrary resolution. Moreover, because both the decoder and NCA updates are local, inference remains highly parallelizable. To supervise high-resolution outputs efficiently, we introduce task-specific losses for morphogenesis (growth from a seed) and texture synthesis with minimal additional memory and computation overhead.
Our experiments across 2D/3D grids and mesh domains demonstrate that our hybrid models produce high-resolution outputs in real-time, and preserve the characteristic self-organizing behavior of NCAs.
Published: 2025-06-28T14:30:21Z
9 pages, 14 figures, +8 pages of Appendix (20 figures in total)
SIGGRAPH 2026
Authors: Ehsan Pajouheshgar, Yitao Xu, Ali Abbasi, Alexander Mordvintsev, Wenzel Jakob, Sabine Süsstrunk

http://arxiv.org/abs/2602.19182v2
Thin Plate Spline Surface Reconstruction via the Method of Matched Sections
Updated: 2026-05-01T21:15:07Z
This paper further develops the Method of Matched Sections (MMS), a robust numerical framework for the solution of boundary value problems governed by partial differential equations. It demonstrates its unique applicability to the challenges of surface modeling, which lie at the intersection of computational mechanics and computer graphics. This work shows how the MMS successfully bridges this gap. By decomposing the domain into an assembly of 1D directional components matched along their entire boundaries, the method inherently enforces the continuity of all variational parameters, including second-order (curvature) and third-order (shear) derivatives. We demonstrate the method's advanced capabilities in high-fidelity surface reconstruction and blending, showing that it consistently generates energetically optimal, fair surfaces even from complex boundary conditions or sparse internal data points. By advancing the application of the MMS, this research establishes it as a powerful, physics-informed geometric tool that satisfies the dual demands of rigorous numerical analysis and aesthetic computer-aided design.
Published: 2025-12-06T14:53:25Z
Authors: Igor Orynyak, Kirill Danylenko, Danylo Tavrov

http://arxiv.org/abs/2506.18867v4
Efficient B-Spline Finite Elements for Cloth Simulation
Updated: 2026-05-01T19:38:55Z
We present an efficient B-spline finite element method (FEM) for cloth simulation.
While higher-order FEM has long promised higher accuracy, its adoption in cloth simulators has been limited by its larger computational costs while generating results with similar visual quality. Our contribution is a full algorithmic pipeline that makes cloth simulation using quadratic B-spline surfaces faster than standard linear FEM in practice while consistently improving accuracy and visual fidelity. Using quadratic B-spline basis functions, we obtain a globally $C^1$-continuous displacement field that supports consistent discretization of both membrane and bending energies, effectively reducing locking artifacts and mesh dependence common to linear elements. To close the performance gap, we introduce a reduced integration scheme that separately optimizes quadrature rules for membrane and bending energies, an accelerated Hessian assembly procedure tailored to the spline structure, and an optimized linear solver based on partial factorization. Together, these optimizations make high-order, smooth cloth simulation competitive at scale, yielding an average $2\times$ speedup over linear FEM in our tests. Extensive experiments demonstrate improved accuracy, wrinkle detail, and robustness, including contact-rich scenarios, relative to linear FEM and recent higher-order approaches. Our method enables realistic wrinkling dynamics across a wide range of material parameters and supports practical garment animation, providing a new promising spatial discretization for high-quality cloth simulation.
Published: 2025-06-23T17:36:11Z
25 pages, 28 figures
Authors: Yuqi Meng, Yihao Shi, Kemeng Huang, Zixuan Lu, Ning Guo, Taku Komura, Yin Yang, Minchen Li
DOI: 10.1145/3811278

http://arxiv.org/abs/2605.00632v1
BlenderRAG: High-Fidelity 3D Object Generation via Retrieval-Augmented Code Synthesis
Updated: 2026-05-01T13:09:30Z
Automatic generation of executable Blender code from natural language remains challenging, with state-of-the-art LLMs producing frequent syntactic errors and geometrically inconsistent objects.
We present BlenderRAG, a retrieval-augmented generation system that operates on a curated multimodal dataset of 500 expert-validated examples (text, code, image) across 50 object categories. By retrieving semantically similar examples during generation, BlenderRAG improves compilation success rates from 40.8% to 70.0% and semantic normalized alignment from 0.41 to 0.77 (CLIP similarity) across four state-of-the-art LLMs, without requiring fine-tuning or specialized hardware, making it immediately accessible for deployment. The dataset and code will be available at https://github.com/MaxRondelli/BlenderRAG.
Published: 2026-05-01T13:09:30Z
Authors: Massimo Rondelli, Francesco Pivi, Maurizio Gabbrielli

http://arxiv.org/abs/2605.00569v1
2D-SuGaR: Surface-Aware Gaussian Splatting for Geometrically Accurate Mesh Reconstruction
Updated: 2026-05-01T11:09:29Z
3D Gaussian Splatting (3DGS) has emerged as a powerful technique for generating photorealistic renderings of a scene in real-time. However, the volumetric nature of 3DGS limits its ability to accurately capture surface geometry. To address this, 2D Gaussian Splatting (2DGS) was proposed to enable view-consistent and geometrically accurate surface reconstruction from multi-view images. However, 2DGS can be sensitive to the initialization of the Gaussian primitives. Reliance on Structure-from-Motion (SfM) initializations, which can produce poor estimates on challenging image sets, may lead to subpar results. In this work, we enhance 2DGS by incorporating monocular depth and normal priors to improve both geometric accuracy and robustness. We propose a depth-guided initialization strategy for Gaussians and introduce a clustering-based technique for pruning degenerate Gaussians. We evaluate our method on the DTU dataset, where it achieves state-of-the-art results in mesh reconstruction while preserving high-quality novel view synthesis.
Published: 2026-05-01T11:09:29Z
Authors: Prajwal Gupta C. R., Divyam Sheth, Jinjoo Ha, Mirela Ostrek, Justus Thies
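The 2D-SuGaR abstract above mentions a depth-guided initialization for Gaussians but gives no details. One generic building block for any such scheme is back-projecting a depth map into 3D points that can seed primitive positions. The sketch below shows only that step under a pinhole camera model; the function name `backproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the rule of skipping non-positive depths are assumptions for illustration, not the authors' procedure.

```python
def backproject_depth(depth, fx, fy, cx, cy):
    """Lift a depth map (list of rows) to camera-space 3D points.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy). Pixels with non-positive depth are treated as invalid.
    """
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d <= 0.0:
                continue  # hypothetical convention: 0 marks a missing depth
            # Pixel (u, v) at depth d maps to (X, Y, Z) in camera space.
            points.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
    return points


# Tiny 2x2 depth map with one invalid pixel.
seeds = backproject_depth([[2.0, 0.0], [4.0, 1.0]], fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

In practice a monocular depth estimate is usually defined only up to scale and shift, so it would first have to be aligned to the scene (for example against sparse SfM points) before points like these are usable as initial positions.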