http://arxiv.org/abs/2509.26213v1 Palace: A Library for Interactive GPU-Accelerated Large Tensor Processing and Visualization 2025-09-30T13:13:58Z

Tensor datasets (two-, three-, or higher-dimensional) are fundamental to many scientific fields utilizing imaging or simulation technologies. Advances in these methods have led to ever-increasing data sizes and, consequently, interest in and development of out-of-core processing and visualization techniques, although mostly as specialized solutions. Here we present Palace, an open-source, cross-platform, general-purpose library for interactive and accelerated out-of-core tensor processing and visualization. Through a high-performance asynchronous concurrent architecture and a simple compute-graph interface, Palace enables the interactive development of out-of-core pipelines on workstation hardware. We demonstrate on benchmarks that Palace outperforms or matches state-of-the-art systems for volume rendering and hierarchical random-walker segmentation and demonstrate applicability in use cases involving tensors from 2D images up to 4D time series datasets.

2025-09-30T13:13:58Z Dominik Drees Benjamin Risse http://arxiv.org/abs/2509.26055v1 GaussEdit: Adaptive 3D Scene Editing with Text and Image Prompts 2025-09-30T10:31:31Z

This paper presents GaussEdit, a framework for adaptive 3D scene editing guided by text and image prompts. GaussEdit leverages 3D Gaussian Splatting as its backbone for scene representation, enabling convenient Region of Interest selection and efficient editing through a three-stage process. The first stage involves initializing the 3D Gaussians to ensure high-quality edits. The second stage employs an Adaptive Global-Local Optimization strategy to balance global scene coherence and detailed local edits and a category-guided regularization technique to alleviate the Janus problem. The final stage enhances the texture of the edited objects using a sophisticated image-to-image synthesis technique, ensuring that the results are visually realistic and align closely with the given prompts. Our experimental results demonstrate that GaussEdit surpasses existing methods in editing accuracy, visual fidelity, and processing speed. By successfully embedding user-specified concepts into 3D scenes, GaussEdit is a powerful tool for detailed and user-driven 3D scene editing, offering significant improvements over traditional methods.

2025-09-30T10:31:31Z IEEE Transactions on Visualization and Computer Graphics. 2025 Zhenyu Shu Junlong Yu Kai Chao Shiqing Xin Ligang Liu 10.1109/TVCG.2025.3556745 http://arxiv.org/abs/2405.20188v3 SPARE: Symmetrized Point-to-Plane Distance for Robust Non-Rigid 3D Registration 2025-09-30T10:18:42Z

Existing optimization-based methods for non-rigid registration typically minimize an alignment error metric based on the point-to-point or point-to-plane distance between corresponding point pairs on the source surface and target surface. However, these metrics can result in slow convergence or a loss of detail. In this paper, we propose SPARE, a novel formulation that utilizes a symmetrized point-to-plane distance for robust non-rigid registration. The symmetrized point-to-plane distance relies on both the positions and normals of the corresponding points, which yields a more accurate approximation of the underlying geometry and can achieve higher accuracy than existing methods. To solve this optimization problem efficiently, we introduce an as-rigid-as-possible regularization term to estimate the deformed normals and propose an alternating minimization solver using a majorization-minimization strategy. Moreover, for effective initialization of the solver, we incorporate a deformation graph-based coarse alignment that improves registration quality and efficiency. Extensive experiments show that the proposed method greatly improves the accuracy of non-rigid registration problems and maintains relatively high solution efficiency. The code is publicly available at https://github.com/yaoyx689/spare.
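
To make the three distance metrics named in this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the symmetric form known from the symmetric ICP literature, ((p - q) . (n_p + n_q))^2; the exact formulation used in SPARE may differ. All function names are illustrative.

```python
import math

def sub(a, b):
    """Component-wise difference of two 3D vectors."""
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    """Component-wise sum of two 3D vectors."""
    return tuple(x + y for x, y in zip(a, b))

def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))

def point_to_point(p, q):
    """Squared Euclidean distance between corresponding points."""
    d = sub(p, q)
    return dot(d, d)

def point_to_plane(p, q, n_q):
    """Squared distance from p to the tangent plane at q (unit normal n_q)."""
    return dot(sub(p, q), n_q) ** 2

def symmetric_point_to_plane(p, n_p, q, n_q):
    """Assumed symmetric form: uses the normals at both points."""
    return dot(sub(p, q), add(n_p, n_q)) ** 2

# Two points on the same unit circle, with outward normals.
angle = 0.5
p, n_p = (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)
q = (math.cos(angle), math.sin(angle), 0.0)
n_q = q  # on a unit circle the outward normal equals the position

print(point_to_point(p, q))
print(point_to_plane(p, q, n_q))
print(symmetric_point_to_plane(p, n_p, q, n_q))
```

In this example the two points lie on a common circle, so the assumed symmetric residual vanishes while the point-to-point and point-to-plane residuals do not. This illustrates why a metric that uses both normals can approximate curved geometry more closely.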

2024-05-30T15:55:04Z Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence Yuxin Yao Bailin Deng Junhui Hou Juyong Zhang 10.1109/TPAMI.2025.3598630 http://arxiv.org/abs/2509.23336v2 DiffTex: Differentiable Texturing for Architectural Proxy Models 2025-09-30T10:15:45Z

Simplified proxy models are commonly used to represent architectural structures, reducing storage requirements and enabling real-time rendering. However, the geometric simplifications inherent in proxies result in a loss of fine color and geometric details, making it essential for textures to compensate for the loss. Preserving the rich texture information from the original dense architectural reconstructions remains a daunting task, particularly when working with unordered RGB photographs. We propose an automated method for generating realistic texture maps for architectural proxy models at the texel level from an unordered collection of registered photographs. Our approach establishes correspondences between texels on a UV map and pixels in the input images, with each texel's color computed as a weighted blend of associated pixel values. Using differentiable rendering, we optimize blending parameters to ensure photometric and perspective consistency, while maintaining seamless texture coherence. Experimental results demonstrate the effectiveness and robustness of our method across diverse architectural models and varying photographic conditions, enabling the creation of high-quality textures that preserve visual fidelity and structural detail.
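
As an illustration of the "weighted blend of associated pixel values" described above, here is a minimal sketch. It assumes the blending weights come from a softmax over per-pixel scores, which is one common way to keep weights positive and normalized. The abstract does not state the paper's actual parameterization or its differentiable-rendering optimization, and neither is reproduced here. The names `softmax` and `blend_texel` are hypothetical.

```python
import math

def softmax(logits):
    """Turn unconstrained scores into positive weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def blend_texel(pixel_colors, logits):
    """Blend candidate RGB pixel colors for one texel using softmax weights."""
    weights = softmax(logits)
    return tuple(
        sum(w * color[channel] for w, color in zip(weights, pixel_colors))
        for channel in range(3)
    )

# One texel seen in two photographs: a red pixel and a blue pixel.
candidates = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
print(blend_texel(candidates, [0.0, 0.0]))   # equal scores
print(blend_texel(candidates, [10.0, 0.0]))  # first view strongly preferred
```

Under this assumption, an optimizer would adjust the scores (here passed as `logits`) so that the blended texture agrees with the input photographs.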

2025-09-27T14:39:53Z ACM TOG and SIGGRAPH Asia 2025 (Patent Protected); Project page: https://vcc.tech/research/2025/DiffTex Weidan Xiong Yongli Wu Bochuan Zeng Jianwei Guo Dani Lischinski Daniel Cohen-Or Hui Huang 10.1145/3763312 http://arxiv.org/abs/2508.15372v2 Image-Conditioned 3D Gaussian Splat Quantization 2025-09-30T08:12:48Z

3D Gaussian Splatting (3DGS) has attracted considerable attention for enabling high-quality real-time rendering. Although 3DGS compression methods have been proposed for deployment on storage-constrained devices, two limitations hinder archival use: (1) they compress medium-scale scenes only to the megabyte range, which remains impractical for large-scale scenes or extensive scene collections; and (2) they lack mechanisms to accommodate scene changes after long-term archival. To address these limitations, we propose an Image-Conditioned Gaussian Splat Quantizer (ICGS-Quantizer) that substantially enhances compression efficiency and provides adaptability to scene changes after archiving. ICGS-Quantizer improves quantization efficiency by jointly exploiting inter-Gaussian and inter-attribute correlations and by using shared codebooks across all training scenes, which are then fixed and applied to previously unseen test scenes, eliminating the overhead of per-scene codebooks. This approach effectively reduces the storage requirements for 3DGS to the kilobyte range while preserving visual fidelity. To enable adaptability to post-archival scene changes, ICGS-Quantizer conditions scene decoding on images captured at decoding time. The encoding, quantization, and decoding processes are trained jointly, ensuring that the codes, which are quantized representations of the scene, are effective for conditional decoding. We evaluate ICGS-Quantizer on 3D scene compression and 3D scene updating. Experimental results show that ICGS-Quantizer consistently outperforms state-of-the-art methods in compression efficiency and adaptability to scene changes. Our code, model, and data will be publicly available on GitHub.
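
The storage saving from a shared codebook can be sketched with plain vector quantization: each attribute vector is stored as the index of its nearest codebook entry, and the codebook itself is stored once for all scenes. This is a generic illustration under that assumption, not ICGS-Quantizer itself. The learned encoder, the joint training, and the image-conditioned decoder are not represented, and the toy codebook values are made up.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_code(vector, codebook):
    """Index of the codebook entry closest to the given vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))

def encode(vectors, codebook):
    """Replace every vector by the index of its nearest code."""
    return [nearest_code(v, codebook) for v in vectors]

def decode(indices, codebook):
    """Look the indices back up in the shared codebook."""
    return [codebook[i] for i in indices]

# Toy shared codebook (hypothetical values) and three attribute vectors.
shared_codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
attributes = [(0.1, -0.1), (0.9, 1.2), (0.2, 0.8)]

codes = encode(attributes, shared_codebook)
print(codes)
print(decode(codes, shared_codebook))
```

Because the codebook is fixed and shared, a new scene only needs to store the integer indices, which is the source of the per-scene overhead reduction the abstract mentions.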

2025-08-21T09:07:26Z Xinshuang Liu Runfa Blark Li Keito Suzuki Truong Nguyen http://arxiv.org/abs/2504.15657v2 Neural Kinematic Bases for Fluids 2025-09-30T02:23:30Z

We propose mesh-free fluid simulations that exploit a kinematic neural basis for velocity fields represented by an MLP. We design a set of losses that ensures that these neural bases approximate fundamental physical properties such as orthogonality, divergence-freeness, boundary alignment, and smoothness. Our neural bases can then be used to fit an input sketch of a flow, which will inherit the same fundamental properties from the bases. We can then animate such a flow in real time using standard time integrators. Our neural bases can accommodate different domains, moving boundaries, and naturally extend to three dimensions.

2025-04-22T07:28:28Z Yibo Liu Zhixin Fang Sune Darkner Noam Aigerman Kenny Erleben Paul Kry Teseo Schneider 10.1145/3757377.3763925 http://arxiv.org/abs/2509.25504v1 XR Blocks: Accelerating Human-centered AI + XR Innovation 2025-09-29T21:00:53Z

We are on the cusp where Artificial Intelligence (AI) and Extended Reality (XR) are converging to unlock new paradigms of interactive computing. However, a significant gap exists between the ecosystems of these two fields: while AI research and development is accelerated by mature frameworks like JAX and benchmarks like LMArena, prototyping novel AI-driven XR interactions remains a high-friction process, often requiring practitioners to manually integrate disparate, low-level systems for perception, rendering, and interaction. To bridge this gap, we present XR Blocks, a cross-platform framework designed to accelerate human-centered AI + XR innovation. XR Blocks strives to provide a modular architecture with plug-and-play components for core abstractions in AI + XR: user, world, peers; interface, context, and agents. Crucially, it is designed with the mission of "reducing frictions from idea to reality", thus accelerating rapid prototyping of AI + XR apps. Built upon accessible technologies (WebXR, three.js, TensorFlow, Gemini), our toolkit lowers the barrier to entry for XR creators. We demonstrate its utility through a set of open-source templates, samples, and advanced demos, empowering the community to quickly move from concept to interactive XR prototype. Site: https://xrblocks.github.io

2025-09-29T21:00:53Z David Li Nels Numan Xun Qian Yanhe Chen Zhongyi Zhou Evgenii Alekseev Geonsun Lee Alex Cooper Min Xia Scott Chung Jeremy Nelson Xiuxiu Yuan Jolica Dias Tim Bettridge Benjamin Hersh Michelle Huynh Konrad Piascik Ricardo Cabello David Kim Ruofei Du http://arxiv.org/abs/2509.25392v1 Interpolated Adaptive Linear Reduced Order Modeling for Deformation Dynamics 2025-09-29T18:48:54Z

Linear reduced-order modeling (ROM) is widely used for efficient simulation of deformation dynamics, but its accuracy is often limited by the fixed linearization of the reduced mapping. We propose a new adaptive strategy for linear ROM that allows the reduced mapping to vary dynamically in response to the evolving deformation state, significantly improving accuracy over traditional linear approaches. To further handle large deformations, we introduce a historical displacement basis combined with Grassmann interpolation, enabling the system to recover robustly even in challenging scenarios. We evaluate our method through quantitative online-error analysis and qualitative comparisons with principal component analysis (PCA)-based linear ROM simulations, demonstrating substantial accuracy gains while preserving comparable computational costs.

2025-09-29T18:48:54Z Yutian Tao Maurizio Chiaramonte Pablo Fernandez http://arxiv.org/abs/2509.25387v1 Computational Design and Single-Wire Sensing of 3D Printed Objects with Integrated Capacitive Touchpoints 2025-09-29T18:42:50Z

Producing interactive 3D printed objects currently requires laborious 3D design and post-instrumentation with off-the-shelf electronics. Multi-material 3D printing using conductive PLA presents opportunities to mitigate these challenges. We present a computational design pipeline that embeds multiple capacitive touchpoints into any 3D model that has a closed mesh without self-intersection. With our pipeline, users define touchpoints on the 3D object's surface to indicate interactive regions. Our pipeline then automatically generates a conductive path to connect the touch regions. This path is optimized to output unique resistor-capacitor delays when each region is touched, so that all regions can be sensed through a double-wire or single-wire connection. We illustrate our approach's utility with five computational and sensing performance evaluations (achieving 93.35% mean accuracy for single-wire) and six application examples. Our sensing technique supports existing uses (e.g., prototyping) and highlights the growing promise to produce interactive devices entirely with 3D printing. Project website: https://github.com/d-rep-lab/3dp-singlewire-sensing
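
The idea of telling touchpoints apart by their resistor-capacitor delay can be sketched with the textbook charging curve of a single RC stage, V(t) = Vcc (1 - exp(-t / (R C))). Solving for the time to reach a threshold voltage gives t = -R C ln(1 - Vth / Vcc). The sketch below is an idealization under that assumption and not the paper's sensing pipeline. The resistances, the touch capacitance, and the touchpoint names are hypothetical values chosen for illustration.

```python
import math

def rise_time(resistance_ohm, capacitance_f, threshold_fraction=0.5):
    """Time for an ideal RC stage to charge to a fraction of the supply voltage."""
    return -resistance_ohm * capacitance_f * math.log(1.0 - threshold_fraction)

def classify(measured_delay, expected_delays):
    """Pick the touchpoint whose expected delay is closest to the measurement."""
    return min(expected_delays,
               key=lambda name: abs(expected_delays[name] - measured_delay))

# Hypothetical path resistances up to each touchpoint, and a hypothetical
# capacitance added by a finger touching the region.
TOUCH_CAPACITANCE_F = 100e-12
path_resistance_ohm = {"top": 10e3, "side": 20e3, "base": 40e3}

expected = {name: rise_time(r, TOUCH_CAPACITANCE_F)
            for name, r in path_resistance_ohm.items()}

measured = 1.5e-6  # seconds, a made-up measurement
print(expected)
print(classify(measured, expected))
```

Because each region sits behind a different resistance in this model, each produces a distinct delay. That distinctness is the property the abstract says the generated path is optimized for.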

2025-09-29T18:42:50Z 19 pages, 14 figures, to be published in Proceedings of ACM SCF 2025 S. Sandra Bae Takanori Fujiwara Danielle Albers Szafir Ellen Yi-Luen Do Michael L. Rivera 10.1145/3745778.3766650 http://arxiv.org/abs/2509.25134v1 LayerD: Decomposing Raster Graphic Designs into Layers 2025-09-29T17:50:12Z

Designers craft and edit graphic designs in a layer representation, but layer-based editing becomes impossible once the design is composited into a raster image. In this work, we propose LayerD, a method to decompose raster graphic designs into layers for a re-editable creative workflow. LayerD addresses the decomposition task by iteratively extracting unoccluded foreground layers. We propose a simple yet effective refinement approach taking advantage of the assumption that layers often exhibit uniform appearance in graphic designs. As decomposition is ill-posed and the ground-truth layer structure may not be reliable, we develop a quality metric that addresses the difficulty. In experiments, we show that LayerD successfully achieves high-quality decomposition and outperforms baselines. We also demonstrate the use of LayerD with state-of-the-art image generators and layer-based editing.

2025-09-29T17:50:12Z ICCV 2025, Project page: https://cyberagentailab.github.io/LayerD/ , GitHub: https://github.com/CyberAgentAILab/LayerD Tomoyuki Suzuki Kang-Jun Liu Naoto Inoue Kota Yamaguchi http://arxiv.org/abs/2509.25079v1 UniLat3D: Geometry-Appearance Unified Latents for Single-Stage 3D Generation 2025-09-29T17:21:23Z

High-fidelity 3D asset generation is crucial for various industries. While recent 3D pretrained models show strong capability in producing realistic content, most are built upon diffusion models and follow a two-stage pipeline that first generates geometry and then synthesizes appearance. Such a decoupled design tends to produce geometry-texture misalignment and non-negligible cost. In this paper, we propose UniLat3D, a unified framework that encodes geometry and appearance in a single latent space, enabling direct single-stage generation. Our key contribution is a geometry-appearance Unified VAE, which compresses high-resolution sparse features into a compact latent representation -- UniLat. UniLat integrates structural and visual information into a dense low-resolution latent, which can be efficiently decoded into diverse 3D formats, e.g., 3D Gaussians and meshes. Based on this unified representation, we train a single flow-matching model to map Gaussian noise directly into UniLat, eliminating redundant stages. Trained solely on public datasets, UniLat3D produces high-quality 3D assets in seconds from a single image, achieving superior appearance fidelity and geometric quality. More demos and code are available at https://unilat3d.github.io/

2025-09-29T17:21:23Z Project page: https://unilat3d.github.io/ Guanjun Wu Jiemin Fang Chen Yang Sikuang Li Taoran Yi Jia Lu Zanwei Zhou Jiazhong Cen Lingxi Xie Xiaopeng Zhang Wei Wei Wenyu Liu Xinggang Wang Qi Tian http://arxiv.org/abs/2509.25058v1 CharGen: Fast and Fluent Portrait Modification 2025-09-29T17:09:30Z

Interactive editing of character images with diffusion models remains challenging due to the inherent trade-off between fine-grained control, generation speed, and visual fidelity. We introduce CharGen, a character-focused editor that combines attribute-specific Concept Sliders, trained to isolate and manipulate attributes such as facial feature size, expression, and decoration, with the StreamDiffusion sampling pipeline for more interactive performance. To counteract the loss of detail that often accompanies accelerated sampling, we propose a lightweight Repair Step that reinstates fine textures without compromising structural consistency. Across extensive ablation studies, comparisons with open-source InstructPix2Pix and closed-source Google Gemini, and a comprehensive user study, CharGen achieves two-to-four-fold faster edit turnaround with precise editing control and identity-consistent results. Project page: https://chargen.jdihlmann.com/

2025-09-29T17:09:30Z Project page: https://chargen.jdihlmann.com/ Jan-Niklas Dihlmann Arnela Killguss Hendrik P. A. Lensch http://arxiv.org/abs/2509.24986v1 Light-SQ: Structure-aware Shape Abstraction with Superquadrics for Generated Meshes 2025-09-29T16:18:32Z

In user-generated-content (UGC) applications, non-expert users often rely on image-to-3D generative models to create 3D assets. In this context, primitive-based shape abstraction offers a promising solution for UGC scenarios by compressing high-resolution meshes into compact, editable representations. To this end, effective shape abstraction must be structure-aware, characterized by low overlap between primitives, part-aware alignment, and primitive compactness. We present Light-SQ, a novel superquadric-based optimization framework that explicitly emphasizes structure-awareness from three aspects. (a) We introduce SDF carving to iteratively update the target signed distance field, discouraging overlap between primitives. (b) We propose a block-regrow-fill strategy guided by structure-aware volumetric decomposition, enabling structural partitioning to drive primitive placement. (c) We implement adaptive residual pruning based on SDF update history to suppress over-segmentation and ensure compact results. In addition, Light-SQ supports multiscale fitting, enabling localized refinement to preserve fine geometric details. To evaluate our method, we introduce 3DGen-Prim, a benchmark extending 3DGen-Bench with new metrics for both reconstruction quality and primitive-level editability. Extensive experiments demonstrate that Light-SQ enables efficient, high-fidelity, and editable shape abstraction with superquadrics for complex generated geometry, advancing the feasibility of 3D UGC creation.

2025-09-29T16:18:32Z SIGGRAPH Asia 2025. Project Page https://johann.wang/Light-SQ/ Yuhan Wang Weikai Chen Zeyu Hu Runze Zhang Yingda Yin Ruoyu Wu Keyang Luo Shengju Qian Yiyan Ma Hongyi Li Yuan Gao Yuhuan Zhou Hao Luo Wan Wang Xiaobin Shen Zhaowei Li Kuixin Zhu Chuanlang Hong Yueyue Wang Lijie Feng Xin Wang Chen Change Loy 10.1145/3757377.3763835 http://arxiv.org/abs/2508.21344v2 ARGS: Advanced Regularization on Aligning Gaussians over the Surface 2025-09-29T13:10:55Z

Reconstructing high-quality 3D meshes and visuals from 3D Gaussian Splatting (3DGS) remains a central challenge in computer graphics. Although existing models such as SuGaR offer effective solutions for rendering, there is still room to improve both visual fidelity and scene consistency. This work builds upon SuGaR by introducing two complementary regularization strategies that address common limitations in both the shape of individual Gaussians and the coherence of the overall surface. The first strategy introduces an effective rank regularization, motivated by recent studies on Gaussian primitive structures. This regularization discourages extreme anisotropy (specifically, "needle-like" shapes) by favoring more balanced, "disk-like" forms that are better suited for stable surface reconstruction. The second strategy integrates a neural Signed Distance Function (SDF) into the optimization process. The SDF is regularized with an Eikonal loss to maintain proper distance properties and provides a continuous global surface prior, guiding Gaussians toward better alignment with the underlying geometry. These two regularizations aim to improve both the fidelity of individual Gaussian primitives and their collective surface behavior. The final model can produce more accurate and coherent visuals from 3DGS data.
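
The two quantities named in this abstract can be illustrated numerically. The sketch below assumes the usual definition of effective rank as the exponential of the Shannon entropy of the normalized spectrum, taken here over the squared scales of a Gaussian. It also assumes the standard Eikonal residual (|grad f| - 1)^2, evaluated with central differences on an analytic sphere SDF standing in for a neural one. The exact loss terms and weights used in ARGS are not given in the abstract and are not reproduced.

```python
import math

def effective_rank(scales):
    """exp(entropy) of the normalized squared scales of one Gaussian."""
    energies = [s * s for s in scales]
    total = sum(energies)
    probs = [e / total for e in energies]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return math.exp(entropy)

def eikonal_residual(sdf, point, h=1e-4):
    """(|grad sdf| - 1)^2 at a point, using central finite differences."""
    grad = []
    for axis in range(3):
        plus = list(point)
        minus = list(point)
        plus[axis] += h
        minus[axis] -= h
        grad.append((sdf(plus) - sdf(minus)) / (2.0 * h))
    norm = math.sqrt(sum(g * g for g in grad))
    return (norm - 1.0) ** 2

def sphere_sdf(point, radius=1.0):
    """Exact signed distance to a sphere centered at the origin."""
    return math.sqrt(sum(c * c for c in point)) - radius

print(effective_rank((1.0, 1e-3, 1e-3)))  # needle-like
print(effective_rank((1.0, 1.0, 1e-3)))   # disk-like
print(effective_rank((1.0, 1.0, 1.0)))    # isotropic
print(eikonal_residual(sphere_sdf, (0.3, 0.4, 1.2)))
```

Under these definitions a needle-like Gaussian has an effective rank near 1, a disk-like one near 2, and an isotropic one equal to 3, so a penalty on effective ranks close to 1 discourages needles. A true distance field has an Eikonal residual near zero, while a scaled field such as twice the sphere SDF does not.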

2025-08-29T06:05:30Z 9 pages, 4 figures Jeong Uk Lee Sung Hee Choi http://arxiv.org/abs/2503.09641v4 SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation 2025-09-29T12:51:54Z

This paper presents SANA-Sprint, an efficient diffusion model for ultra-fast text-to-image (T2I) generation. SANA-Sprint is built on a pre-trained foundation model and augmented with hybrid distillation, dramatically reducing inference steps from 20 to 1-4. We introduce three key innovations: (1) We propose a training-free approach that transforms a pre-trained flow-matching model for continuous-time consistency distillation (sCM), eliminating costly training from scratch and achieving high training efficiency. Our hybrid distillation strategy combines sCM with latent adversarial distillation (LADD): sCM ensures alignment with the teacher model, while LADD enhances single-step generation fidelity. (2) SANA-Sprint is a unified step-adaptive model that achieves high-quality generation in 1-4 steps, eliminating step-specific training and improving efficiency. (3) We integrate ControlNet with SANA-Sprint for real-time interactive image generation, enabling instant visual feedback for user interaction. SANA-Sprint establishes a new Pareto frontier in speed-quality tradeoffs, achieving state-of-the-art performance with 7.59 FID and 0.74 GenEval in only 1 step - outperforming FLUX-schnell (7.94 FID / 0.71 GenEval) while being 10x faster (0.1s vs 1.1s on H100). It also achieves 0.1s (T2I) and 0.25s (ControlNet) latency for 1024 x 1024 images on H100, and 0.31s (T2I) on an RTX 4090, showcasing its exceptional efficiency and potential for AI-powered consumer applications (AIPC). Code and pre-trained models will be open-sourced.

2025-03-12T04:53:07Z 22 pages, 11 figures, 8 tables, In submission Junsong Chen Shuchen Xue Yuyang Zhao Jincheng Yu Sayak Paul Junyu Chen Han Cai Song Han Enze Xie