http://arxiv.org/abs/2601.04370v1 End-to-end differentiable design of geometric waveguide displays 2026-01-07T20:19:11Z

Geometric waveguides are a promising architecture for optical see-through augmented reality displays, but their performance is severely bottlenecked by the difficulty of jointly optimizing non-sequential light transport and polarization-dependent multilayer thin-film coatings. Here we present the first end-to-end differentiable optimization framework for geometric waveguides that couples non-sequential Monte Carlo polarization ray tracing with a differentiable transfer-matrix thin-film solver. A differentiable Monte Carlo ray tracer avoids the exponential growth of deterministic ray splitting while enabling gradient backpropagation from eyebox metrics to design parameters. With memory-saving strategies, we optimize more than one thousand layer-thickness parameters and billions of non-sequential ray-surface intersections on a single multi-GPU workstation. Automated layer pruning is achieved by starting from over-parameterized stacks and driving redundant layers to zero thickness under discrete manufacturability constraints, effectively performing topology optimization to discover optimal coating structures. On a representative design, starting from random initialization within thickness bounds, our method increases light efficiency from 4.1\% to 33.5\% and improves eyebox and FoV uniformity by $\sim$17$\times$ and $\sim$11$\times$, respectively. Furthermore, we jointly optimize the waveguide and an image preprocessing network to improve perceived image quality. Our framework not only enables system-level, high-dimensional coating optimization inside the waveguide, but also expands the scope of differentiable optics for next-generation optical design.
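
As a rough illustration of what a transfer-matrix thin-film solver computes, the sketch below evaluates the reflectance of a layer stack with the standard characteristic-matrix method. It is not the authors' solver: it assumes normal incidence, lossless real refractive indices, and no polarization dependence, and the names `layer_matrix`, `matmul2`, and `reflectance` are hypothetical. The paper's solver is additionally differentiable and polarization-aware, which this plain-Python version does not attempt.

```python
import cmath
import math


def layer_matrix(n, d, wavelength):
    """Characteristic matrix of one homogeneous layer at normal incidence.

    n: refractive index (assumed real here), d: thickness, in the same
    length unit as wavelength.
    """
    delta = 2 * math.pi * n * d / wavelength  # phase thickness of the layer
    c, s = cmath.cos(delta), cmath.sin(delta)
    return ((c, 1j * s / n), (1j * n * s, c))


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def reflectance(n_in, layers, n_sub, wavelength):
    """Reflectance of a stack; layers is a list of (index, thickness),
    ordered from the incident medium towards the substrate."""
    m = ((1, 0), (0, 1))
    for n, d in layers:
        m = matmul2(m, layer_matrix(n, d, wavelength))
    # Apply the stack matrix to the substrate vector (1, n_sub).
    b = m[0][0] + m[0][1] * n_sub
    c = m[1][0] + m[1][1] * n_sub
    r = (n_in * b - c) / (n_in * b + c)  # amplitude reflection coefficient
    return abs(r) ** 2


if __name__ == "__main__":
    n1 = math.sqrt(1.5)  # ideal single-layer antireflection index for glass
    print(reflectance(1.0, [], 1.5, 550.0))                         # bare glass
    print(reflectance(1.0, [(n1, 550.0 / (4 * n1))], 1.5, 550.0))   # quarter-wave coating
    print(reflectance(1.0, [(2.0, 0.0)], 1.5, 550.0))               # zero-thickness layer
```

A layer of zero thickness contributes the identity matrix, so it leaves the reflectance unchanged. This is why driving a layer's thickness to zero, as in the pruning described above, is equivalent to removing it from the stack.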

2026-01-07T20:19:11Z Xinge Yang Zhaocheng Liu Zhaoyu Nie Qingyuan Fan Zhimin Shi Jim Bonar Wolfgang Heidrich http://arxiv.org/abs/2601.04348v1 SCAR-GS: Spatial Context Attention for Residuals in Progressive Gaussian Splatting 2026-01-07T19:34:51Z

Recent advances in 3D Gaussian Splatting have allowed for real-time, high-fidelity novel view synthesis. Nonetheless, these models have significant storage requirements for large and medium-sized scenes, hindering their deployment over cloud and streaming services. Some of the most recent progressive compression techniques for these models rely on progressive masking and scalar quantization techniques to reduce the bitrate of Gaussian attributes using spatial context models. While effective, scalar quantization may not optimally capture the correlations of high-dimensional feature vectors, which can potentially limit the rate-distortion performance. In this work, we introduce a novel progressive codec for 3D Gaussian Splatting that replaces traditional methods with a more powerful Residual Vector Quantization approach to compress the primitive features. Our key contribution is an auto-regressive entropy model, guided by a multi-resolution hash grid, that accurately predicts the conditional probability of each successive transmitted index, allowing for coarse and refinement layers to be compressed with high efficiency.
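
To make the Residual Vector Quantization idea concrete, here is a minimal sketch of generic RVQ encoding and progressive decoding. It is an illustration under assumptions, not the paper's codec: the codebooks are tiny hand-written examples, the function names are hypothetical, and the auto-regressive entropy model and hash grid described in the abstract are not modelled.

```python
def nearest(codebook, v):
    """Index of the codeword closest to v in squared Euclidean distance."""
    def sqdist(c):
        return sum((ci - vi) ** 2 for ci, vi in zip(c, v))
    return min(range(len(codebook)), key=lambda i: sqdist(codebook[i]))


def rvq_encode(vector, codebooks):
    """Quantize the vector stage by stage; each stage codes the residual
    left over by the previous stages."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices


def rvq_decode(indices, codebooks, stages=None):
    """Sum the selected codewords; using fewer stages gives a coarser result."""
    stages = len(indices) if stages is None else stages
    out = [0.0] * len(codebooks[0][0])
    for codebook, i in zip(codebooks[:stages], indices[:stages]):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


if __name__ == "__main__":
    # Hypothetical two-stage codebooks: a coarse stage and a refinement stage.
    codebooks = [
        [[0.0, 0.0], [4.0, 4.0]],
        [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    ]
    idx = rvq_encode([4.9, 4.2], codebooks)
    print(idx, rvq_decode(idx, codebooks, stages=1), rvq_decode(idx, codebooks))
```

Decoding with only the first index yields the coarse approximation, and each additional index refines it, which is what makes this kind of quantizer a natural fit for progressive transmission.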

2026-01-07T19:34:51Z Diego Revilla Pooja Suresh Anand Bhojan Ooi Wei Tsang http://arxiv.org/abs/2601.04194v1 Choreographing a World of Dynamic Objects 2026-01-07T18:59:40Z

Dynamic objects in our physical 4D (3D + time) world are constantly evolving, deforming, and interacting with other objects, leading to diverse 4D scene dynamics. In this paper, we present a universal generative pipeline, CHORD, for CHOReographing Dynamic objects and scenes and synthesizing these types of phenomena. Traditional rule-based graphics pipelines for creating these dynamics are based on category-specific heuristics, and are labor-intensive and not scalable. Recent learning-based methods typically demand large-scale datasets, which may not cover all object categories of interest. Our approach instead inherits the universality of video generative models by proposing a distillation-based pipeline to extract the rich Lagrangian motion information hidden in the Eulerian representations of 2D videos. Our method is universal, versatile, and category-agnostic. We demonstrate its effectiveness by conducting experiments to generate a diverse range of multi-body 4D dynamics, show its advantage compared to existing methods, and demonstrate its applicability in generating robotic manipulation policies. Project page: https://yanzhelyu.github.io/chord

2026-01-07T18:59:40Z Yanzhe Lyu Chen Geng Karthik Dharmarajan Yunzhi Zhang Hadi Alzayer Shangzhe Wu Jiajun Wu http://arxiv.org/abs/1001.3974v3 Modelación y Visualización Tridimensional Interactiva de Variables Eléctricas en Celdas de Electro-Obtención con Electrodos Bipolares 2026-01-07T17:25:45Z

The use of floating bipolar electrodes in copper electro-winning cells represents an emerging technology that promises economic and operational impacts. This article presents a computational tool designed for the simulation and analysis of these electrochemical systems. Based on the generalization and optimization of an existing 2D finite difference model for calculating electrical variables in rectangular cells, a new 3D model capable of processing complex geometries, not necessarily rectangular, has been developed. At the same time, a new analytical method for estimating potentials in floating electrodes is introduced, overcoming the inaccuracies of previous heuristic approaches. The analysis of the results is supported by an interactive visualization technique of three-dimensional vector fields as flow lines.

2010-01-22T12:57:59Z 6 pages, 3 figures, in Spanish. For more details, see arXiv:1001.4002 [cs.GR]. Metadata-only update: Authors' names standardized (maternal surnames removed; paternal surnames as sole last name). Title orthography corrected with TeX accents. Abstract refined Anales del XIV Congreso de la Asociacion Chilena de Control Automatico, ACCA, 2000, pp. 362-367 César Mena Ricardo Sánchez Lautaro Salazar http://arxiv.org/abs/2508.08930v2 How Does a Virtual Agent Decide Where to Look? Symbolic Cognitive Reasoning for Embodied Head Rotation 2026-01-07T03:50:59Z

Natural head rotation is critical for believable embodied virtual agents, yet this micro-level behavior remains largely underexplored. While head-rotation prediction algorithms could, in principle, reproduce this behavior, they typically focus on visually salient stimuli and overlook the cognitive motives that guide head rotation. This yields agents that look at conspicuous objects while overlooking obstacles or task-relevant cues, diminishing realism in a virtual environment. We introduce SCORE, a Symbolic Cognitive Reasoning framework for Embodied Head Rotation: a data-agnostic approach that produces context-aware head movements without task-specific training or hand-tuned heuristics. A controlled VR study (N=20) identifies five motivational drivers of human head movements: Interest, Information Seeking, Safety, Social Schema, and Habit. SCORE encodes these drivers as symbolic predicates, perceives the scene with a Vision-Language Model (VLM), and plans head poses with a Large Language Model (LLM). The framework employs a hybrid workflow: the VLM-LLM reasoning is executed offline, after which a lightweight FastVLM performs online validation to suppress hallucinations while maintaining responsiveness to scene dynamics. The result is an agent that predicts not only where to look but also why, generalizing to unseen scenes and multi-agent crowds while retaining behavioral plausibility.

2025-08-12T13:32:18Z 13 pages, 8 figures. Accepted to SIGGRAPH Asia Conference Papers '25 SIGGRAPH Asia Conference Papers '25, December 15-18, 2025, Hongkong Juyeong Hwang Seong-Eun Hong JaeYoung Seon Hyeongyeop Kang 10.1145/3757377.3763849 http://arxiv.org/abs/2601.03114v1 Stroke Patches: Customizable Artistic Image Styling Using Regression 2026-01-06T15:44:18Z

We present a novel, regression-based method for artistically styling images. Unlike recent neural style transfer or diffusion-based approaches, our method allows for explicit control over the stroke composition and level of detail in the rendered image through the use of an extensible set of stroke patches. The stroke patch sets are procedurally generated by small programs that control the shape, size, orientation, density, color, and noise level of the strokes in the individual patches. Once trained on a set of stroke patches, a U-Net based regression model can render any input image in a variety of distinct, evocative and customizable styles.

2026-01-06T15:44:18Z 39th Conference on Neural Information Processing Systems (NeurIPS 2025) Creative AI Track Ian Jaffray John Bronskill http://arxiv.org/abs/2601.03319v1 CaricatureGS: Exaggerating 3D Gaussian Splatting Faces With Gaussian Curvature 2026-01-06T13:56:28Z

A photorealistic and controllable 3D caricaturization framework for faces is introduced. We start with an intrinsic Gaussian curvature-based surface exaggeration technique, which, when coupled with texture, tends to produce over-smoothed renders. To address this, we resort to 3D Gaussian Splatting (3DGS), which has recently been shown to produce realistic free-viewpoint avatars. Given a multiview sequence, we extract a FLAME mesh, solve a curvature-weighted Poisson equation, and obtain its exaggerated form. However, directly deforming the Gaussians yields poor results, necessitating the synthesis of pseudo-ground-truth caricature images by warping each frame to its exaggerated 2D representation using local affine transformations. We then devise a training scheme that alternates real and synthesized supervision, enabling a single Gaussian collection to represent both natural and exaggerated avatars. This scheme improves fidelity, supports local edits, and allows continuous control over the intensity of the caricature. In order to achieve real-time deformations, an efficient interpolation between the original and exaggerated surfaces is introduced. We further analyze and show that it has a bounded deviation from closed-form solutions. In both quantitative and qualitative evaluations, our results outperform prior work, delivering photorealistic, geometry-controlled caricature avatars.

2026-01-06T13:56:28Z Eldad Matmon Amit Bracha Noam Rotstein Ron Kimmel http://arxiv.org/abs/2601.02829v1 Resolution deficits drive simulator sickness and compromise reading performance in virtual environments 2026-01-06T09:01:16Z

Extended reality (XR) is evolving into a general-purpose computing platform, yet its adoption for productivity is hindered by visual fatigue and simulator sickness. While these symptoms are often attributed to latency or motion conflicts, the precise impact of textual clarity on physiological comfort remains undefined. Here we show that sub-optimal effective resolution, the clarity that reaches the eye after the full display-optics-rendering pipeline, is a primary driver of simulator sickness during reading tasks in both virtual reality and video see-through environments. By systematically manipulating end-to-end effective resolution on a unified logMAR scale, we measured reading psychophysics and sickness symptoms in a controlled within-subjects study. We find that reading performance and user comfort degrade exponentially as resolution drops below 0 logMAR (normal visual acuity). Notably, our results reveal 0 logMAR as a key physiological tipping point: resolutions better than this threshold yield naked-eye-level performance with minimal sickness, whereas poorer resolutions trigger rapid, non-linear increases in nausea and oculomotor strain. These findings suggest that the cognitive and perceptual effort required to resolve blurry text directly compromises user comfort, establishing human-eye resolution as a critical baseline for the design of future ergonomic XR systems.
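
For readers unfamiliar with the logMAR scale, the following sketch shows how an effective resolution could be placed on it. By definition, logMAR is the base-10 logarithm of the minimum angle of resolution (MAR) in arcminutes, so 0 logMAR corresponds to 1 arcminute. The conversion from pixels per degree is an assumption made here for illustration (one resolvable detail per pixel, hence MAR = 60 / ppd); the abstract does not state how the study measured effective resolution, and `logmar_from_ppd` is a hypothetical helper.

```python
import math


def logmar_from_ppd(pixels_per_degree):
    """Approximate logMAR for a display, assuming one resolvable detail
    per pixel so that MAR (arcmin) = 60 / pixels_per_degree."""
    if pixels_per_degree <= 0:
        raise ValueError("pixels_per_degree must be positive")
    mar_arcmin = 60.0 / pixels_per_degree
    return math.log10(mar_arcmin)


if __name__ == "__main__":
    for ppd in (15, 30, 60, 120):
        print(ppd, round(logmar_from_ppd(ppd), 3))
```

Under this assumption, 60 pixels per degree maps to 0 logMAR, lower pixel densities give positive (poorer) values, and higher densities give negative (better) values.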

2026-01-06T09:01:16Z 18 pages, 7 figures, 7 tables Jialin Wang Xinru Cheng Boyong Hou Hai-Ning Liang http://arxiv.org/abs/2601.02805v1 The perceptual gap between video see-through displays and natural human vision 2026-01-06T08:28:23Z

Video see-through (VST) technology aims to seamlessly blend virtual and physical worlds by reconstructing reality through cameras. While manufacturers promise perceptual fidelity, it remains unclear how close these systems are to replicating natural human vision across varying environmental conditions. In this work, we quantify the perceptual gap between the human eye and different popular VST headsets (Apple Vision Pro, Meta Quest 3, Quest Pro) using psychophysical measures of visual acuity, contrast sensitivity, and color vision. We show that despite hardware advancements, all tested VST systems fail to match the dynamic range and adaptability of the naked eye. While high-end devices approach human performance in ideal lighting, they exhibit significant degradation in low-light conditions, particularly in contrast sensitivity and acuity. Our results map the physiological limitations of digital reality reconstruction, establishing a specific perceptual gap that defines the roadmap for achieving indistinguishable VST experiences.

2026-01-06T08:28:23Z 19 pages, 9 figures, 4 tables Jialin Wang Songming Ping Kemu Xu Yue Li Hai-Ning Liang http://arxiv.org/abs/2601.02096v1 Dancing Points: Synthesizing Ballroom Dancing with Three-Point Inputs 2026-01-05T13:24:12Z

Ballroom dancing is a structured yet expressive motion category. Its highly diverse movement and complex interactions between leader and follower dancers make its understanding and synthesis challenging. We demonstrate that the three-point trajectory available from a virtual reality (VR) device can effectively serve as a dancer's motion descriptor, simplifying the modeling and synthesis of interplay between dancers' full-body motions down to sparse trajectories. Thanks to the low dimensionality, we can employ an efficient MLP network to predict the follower's three-point trajectory directly from the leader's three-point input for certain types of ballroom dancing, addressing the challenge of modeling high-dimensional full-body interaction. The compact yet explicit representation also prevents our method from overfitting. By leveraging the inherent structure of the movements and carefully planning the autoregressive procedure, we show that a deterministic neural network is able to translate three-point trajectories into a virtual embodied avatar, a task that is typically considered under-constrained and thought to require generative models for common motions. In addition, we demonstrate that this deterministic approach generalizes beyond small, structured datasets like ballroom dancing, and performs robustly on larger, more diverse datasets such as LaFAN. Our method provides a computationally and data-efficient solution, opening new possibilities for immersive paired dancing applications. Code and pre-trained models for this paper are available at https://peizhuoli.github.io/dancing-points.

2026-01-05T13:24:12Z Peizhuo Li Sebastian Starke Yuting Ye Olga Sorkine-Hornung http://arxiv.org/abs/2511.11618v3 On The Topology of Polygonal Meshes 2026-01-05T13:07:58Z

This paper is an introductory and informal exposition on the topology of polygonal meshes. We begin with a broad overview of topological notions and discuss how homeomorphisms, homotopy, and homology can be used to characterise topology. We move on to define polygonal meshes and make a distinction between intrinsic topology and extrinsic topology, which depends on the space in which the mesh is immersed. A distinction is also made between quantitative topological properties and qualitative properties. Next, we outline proofs of the Euler and the Euler-Poincaré formulas. The Betti numbers are then defined in terms of the Euler-Poincaré formula and other mesh statistics rather than as cardinalities of the homology groups, which allows us to avoid abstract algebra. Finally, we discuss how it is possible to cut a polygonal mesh such that it becomes a topological disc.
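
As a small worked example of the mesh statistics involved, the sketch below computes the Euler characteristic V - E + F from a face list and derives the genus from the relation V - E + F = 2 - 2g. That relation is used here only in its simplest form, which assumes a single connected, closed, orientable manifold mesh; the function names are hypothetical and the code does not check those assumptions.

```python
def euler_characteristic(faces):
    """V - E + F for a polygonal mesh given as a list of vertex-index loops."""
    vertices = set()
    edges = set()
    for face in faces:
        k = len(face)
        for i in range(k):
            a, b = face[i], face[(i + 1) % k]
            vertices.add(a)
            edges.add((min(a, b), max(a, b)))  # undirected edge key
    return len(vertices) - len(edges) + len(faces)


def genus_closed_connected(faces):
    """Genus g from chi = 2 - 2g (assumes one closed, orientable component)."""
    return (2 - euler_characteristic(faces)) // 2


if __name__ == "__main__":
    tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
    # 3x3 grid of quads with wrap-around in both directions: a torus.
    n = 3
    torus = [
        (i * n + j, i * n + (j + 1) % n,
         ((i + 1) % n) * n + (j + 1) % n, ((i + 1) % n) * n + j)
        for i in range(n) for j in range(n)
    ]
    print(euler_characteristic(tetrahedron), genus_closed_connected(tetrahedron))
    print(euler_characteristic(torus), genus_closed_connected(torus))
```

The tetrahedron has 4 vertices, 6 edges, and 4 faces, giving a characteristic of 2 and genus 0; the wrapped grid has 9 vertices, 18 edges, and 9 faces, giving a characteristic of 0 and genus 1.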

2025-11-05T13:56:24Z 26 pages, 22 figures (including nine in the margin) Andreas Bærentzen http://arxiv.org/abs/2601.02072v1 SketchRodGS: Sketch-based Extraction of Slender Geometries for Animating Gaussian Splatting Scenes 2026-01-05T12:51:12Z

Physics simulation of slender elastic objects often requires discretization as a polyline. However, constructing a polyline from Gaussian splatting is challenging, as Gaussian splatting lacks connectivity information and the configuration of Gaussian primitives contains considerable noise. This paper presents a method to extract a polyline representation of the slender parts of objects in a Gaussian splatting scene from the user's sketching input. Our method robustly constructs a polyline mesh that represents the slender parts using a screen-space shortest-path analysis that can be solved efficiently with dynamic programming. We demonstrate the effectiveness of our approach in several in-the-wild examples.

2026-01-05T12:51:12Z Presented at SIGGRAPH Asia 2025 (Technical Communications). Best Technical Communications Award Proceedings of the SIGGRAPH Asia 2025 Technical Communications, Article No. 29, pp. 1 - 4 Haato Watanabe Nobuyuki Umetani 10.1145/3757376.3771403 http://arxiv.org/abs/2412.10977v2 Point Cloud to Mesh Reconstruction: Methods, Trade-offs, and Implementation Guide 2026-01-05T09:49:06Z

Reconstructing meshes from point clouds is a fundamental task in computer vision with applications spanning robotics, autonomous systems, and medical imaging. Selecting an appropriate learning-based method requires understanding trade-offs between computational efficiency, geometric accuracy, and output constraints. This paper categorizes over fifteen methods into five paradigms -- PointNet family, autoencoder architectures, deformation-based methods, point-move techniques, and primitive-based approaches -- and provides practical guidance for method selection. We contribute: (1) a decision framework mapping input/output requirements to suitable paradigms, (2) a failure mode analysis to assist practitioners in debugging implementations, (3) standardized comparisons on ShapeNet benchmarks, and (4) a curated list of maintained codebases with implementation resources. By synthesizing both theoretical foundations and practical considerations, this work serves as an entry point for practitioners and researchers new to learning-based 3D mesh reconstruction.

2024-12-14T21:39:43Z Fatima Zahra Iguenfer Achraf Hsain Hiba Amissa Yousra Chtouki http://arxiv.org/abs/2506.21811v2 Revisiting Graph Analytics Benchmark 2026-01-04T06:07:09Z

The rise of graph analytics platforms has led to the development of various benchmarks for evaluating and comparing platform performance. However, existing benchmarks often fall short of fully assessing performance due to limitations in core algorithm selection, data generation processes (and the corresponding synthetic datasets), as well as the neglect of API usability evaluation. To address these shortcomings, we propose a novel graph analytics benchmark. First, we select eight core algorithms by extensively reviewing both academic and industrial settings. Second, we design an efficient and flexible data generator and produce eight new synthetic datasets as the default datasets for our benchmark. Lastly, we introduce a multi-level large language model (LLM)-based framework for API usability evaluation, the first of its kind in graph analytics benchmarks. We conduct comprehensive experimental evaluations on existing platforms (GraphX, PowerGraph, Flash, Grape, Pregel+, Ligra and G-thinker). The experimental results demonstrate the superiority of our proposed benchmark.

2025-03-04T08:11:27Z Lingkai Meng Yu Shao Long Yuan Longbin Lai Peng Cheng Xue Li Wenyuan Yu Wenjie Zhang Xuemin Lin Jingren Zhou http://arxiv.org/abs/2601.01361v1 VARTS: A Tool for the Visualization and Analysis of Representative Time Series Data 2026-01-04T04:18:22Z

Large-scale time series visualization often suffers from excessive visual clutter and redundant patterns, making it difficult for users to understand the main temporal trends. To address this challenge, we present VARTS, an interactive visual analytics tool for representative time series selection and visualization. Building upon our previous work M4-Greedy, VARTS integrates M4-based sampling, DTW-based similarity computation, and greedy selection into a unified workflow for the identification and visualization of representative series. The tool provides a responsive graphical interface that allows users to import time series datasets, perform representative selection, and visualize both raw and reduced data through multiple coordinated views. By reducing redundancy while preserving essential data patterns, VARTS effectively enhances visual clarity and interpretability for large-scale time series analysis. The demo video is available at https://youtu.be/mS9f12Rf0jo.
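
The DTW-based similarity step mentioned above can be illustrated with the textbook dynamic-programming formulation of dynamic time warping. This is a minimal sketch and not VARTS's implementation: it uses absolute difference as the local cost, applies no warping window, and the name `dtw_distance` is hypothetical. The M4-based sampling and greedy selection stages are not shown.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance between
    two numeric sequences, with |x - y| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # same shape, different pacing
    print(dtw_distance([0, 0], [1, 1]))           # constant offset
```

Because the alignment may repeat samples, two series with the same shape but different pacing get a distance of zero, which is the property that makes DTW useful for grouping redundant series before picking representatives.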

2026-01-04T04:18:22Z Duosi Jin Jianqiu Xu Guidong Zhang