http://arxiv.org/abs/2511.16298v1 Optimizing 3D Gaussian Splattering for Mobile GPUs 2025-11-20T12:25:26Z

Image-based 3D scene reconstruction, which transforms multi-view images into a structured 3D representation of the surrounding environment, is a common task across many modern applications. 3D Gaussian Splatting (3DGS) is a new paradigm for this problem and offers considerable efficiency gains over previous methods. Motivated by this, and considering the benefits of mobile device deployment (data privacy, operation without internet connectivity, and potentially faster responses), this paper develops Texture3dgs, an optimized mapping of 3DGS for a mobile GPU. A critical challenge in this area turns out to be optimizing for the two-dimensional (2D) texture cache, which must be exploited for faster execution on mobile GPUs. Because sorting dominates the computation in 3DGS on mobile platforms, the core of Texture3dgs is a novel sorting algorithm in which processing, data movement, and placement are highly optimized for 2D memory. The properties of this algorithm are analyzed using a cost model for the texture cache. In addition, we accelerate other steps of the 3DGS algorithm through improved variable layout design and other optimizations. End-to-end evaluation shows that Texture3dgs delivers up to 4.1$\times$ and 1.7$\times$ speedup for the sorting and overall 3D scene reconstruction, respectively -- while also reducing memory usage by up to 1.6$\times$ -- demonstrating the effectiveness of our design for efficient mobile 3D scene reconstruction.

2025-11-20T12:25:26Z Md Musfiqur Rahman Sanim Zhihao Shu Bahram Afsharmanesh AmirAli Mirian Jiexiong Guan Wei Niu Bin Ren Gagan Agrawal http://arxiv.org/abs/2511.16273v1 TetraSDF: Precise Mesh Extraction with Multi-resolution Tetrahedral Grid 2025-11-20T11:53:52Z

Extracting meshes that exactly match the zero-level set of neural signed distance functions (SDFs) remains challenging. Sampling-based methods introduce discretization error, while continuous piecewise affine (CPWA) analytic approaches apply only to plain ReLU MLPs. We present TetraSDF, a precise analytic meshing framework for SDFs represented by a ReLU MLP composed with a multi-resolution tetrahedral positional encoder. The encoder's barycentric interpolation preserves global CPWA structure, enabling us to track ReLU linear regions within an encoder-induced polyhedral complex. A fixed analytic input preconditioner derived from the encoder's metric further reduces directional bias and stabilizes training. Across multiple benchmarks, TetraSDF matches or surpasses existing grid-based encoders in SDF reconstruction accuracy, and its analytic extractor produces highly self-consistent meshes that remain faithful to the learned isosurfaces, all with practical runtime and memory efficiency.

2025-11-20T11:53:52Z Seonghun Oh Youngjung Uh Jin-Hwa Kim http://arxiv.org/abs/2511.16162v1 Layer-wise Noise Guided Selective Wavelet Reconstruction for Robust Medical Image Segmentation 2025-11-20T09:03:32Z

Clinical deployment requires segmentation models to stay stable under distribution shifts and perturbations. The mainstream solution is adversarial training (AT) to improve robustness; however, AT often brings a clean--robustness trade-off and high training/tuning cost, which limits scalability and maintainability in medical imaging. We propose \emph{Layer-wise Noise-Guided Selective Wavelet Reconstruction (LNG-SWR)}. During training, we inject small, zero-mean noise at multiple layers to learn a frequency-bias prior that steers representations away from noise-sensitive directions. We then apply prior-guided selective wavelet reconstruction on the input/feature branch to achieve frequency adaptation: suppress noise-sensitive bands, enhance directional structures and shape cues, and stabilize boundary responses while maintaining spectral consistency. The framework is backbone-agnostic and adds little inference overhead. It can serve as a plug-in enhancement to AT and also improves robustness without AT. On CT and ultrasound datasets, under a unified protocol with PGD-$L_{\infty}/L_{2}$ and SSAH, LNG-SWR delivers consistent gains on clean Dice/IoU and significantly reduces the performance drop under strong attacks. Combining LNG-SWR with AT yields additive gains: robustness improves further without sacrificing clean accuracy. These results indicate that LNG-SWR provides a simple, effective, engineering-friendly, and scalable path to robust medical image segmentation in both adversarial and standard training regimes.

2025-11-20T09:03:32Z Yuting Lu Ziliang Wang Weixin Xu Wei Zhang Yongqiang Zhao Yang Yu Xiaohong Zhang http://arxiv.org/abs/2511.16112v1 Clustered Error Correction with Grouped 4D Gaussian Splatting 2025-11-20T07:14:10Z

Existing 4D Gaussian Splatting (4DGS) methods struggle to accurately reconstruct dynamic scenes, often suffering from ambiguous pixel correspondences and inadequate densification in dynamic regions. We address these issues by introducing a novel method composed of two key components: (1) Elliptical Error Clustering and Error Correcting Splat Addition, which pinpoint dynamic areas to improve and initialize fitting splats, and (2) Grouped 4D Gaussian Splatting, which improves the consistency of the mapping between splats and the dynamic objects they represent. Specifically, we classify rendering errors into missing-color and occlusion types, then apply targeted corrections via backprojection or foreground splitting guided by cross-view color consistency. Evaluations on the Neural 3D Video and Technicolor datasets demonstrate that our approach significantly improves temporal consistency and achieves state-of-the-art perceptual rendering quality, improving PSNR by 0.39 dB on the Technicolor Light Field dataset. Our visualizations show improved alignment between splats and dynamic objects, and the error correction method's ability to identify errors and properly initialize new splats. Our implementation details and source code are available at https://github.com/tho-kn/cem-4dgs.

2025-11-20T07:14:10Z 16 pages, 8 figures, SIGGRAPH Asia Conference Papers 2025 Taeho Kang Jaeyeon Park Kyungjin Lee Youngki Lee http://arxiv.org/abs/2511.15908v1 SPHaptics: A Real-Time Bidirectional Haptic Interaction Framework for Coupled Rigid-Soft Body and Lagrangian Fluid Simulation in Virtual Environments 2025-11-19T22:19:27Z

Haptic feedback enhances immersion in virtual environments by allowing users to physically interact with simulated objects. Supporting accurate force responses in multiphysics systems is challenging because physically based simulation of fluid, rigid, and deformable materials is computationally demanding, especially when interaction must occur in real time. We present a unified framework for real-time, bidirectional haptic interaction with rigid bodies, deformable objects, and Lagrangian fluids in virtual reality (VR). Our approach integrates Smoothed Particle Hydrodynamics (SPH) with two-way force coupling and feedback smoothing to maintain stability and produce physically meaningful tactile responses. This enables users to manipulate objects immersed in fluid and feel reaction forces consistent with fluid-structure behavior. We demonstrate the capabilities of our framework through interactive VR scenarios involving fluid stirring, soft tissue manipulation, and rigid-body interaction. The proposed system advances haptic-enabled multiphysics simulation by unifying fluid, soft-body, and rigid-body dynamics into a single platform suitable for immersive educational applications.
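The SPH building block named in this abstract can be illustrated with the standard density summation. The sketch below is not the authors' implementation: the abstract does not say which smoothing kernel is used, so the widely used 3D poly6 kernel is assumed here, and the function names `poly6` and `density` are hypothetical. Force coupling and feedback smoothing are not shown.

```python
import math


def poly6(r, h):
    """Poly6 smoothing kernel in 3D (an assumed choice of kernel).

    Returns zero outside the support radius h.
    """
    if r < 0.0 or r > h:
        return 0.0
    return 315.0 / (64.0 * math.pi * h ** 9) * (h * h - r * r) ** 3


def density(i, positions, masses, h):
    """SPH density estimate at particle i: rho_i = sum_j m_j * W(|x_i - x_j|, h)."""
    xi = positions[i]
    rho = 0.0
    for xj, mj in zip(positions, masses):
        # The sum includes j == i, which contributes m_i * W(0, h).
        rho += mj * poly6(math.dist(xi, xj), h)
    return rho
```

A particle with no neighbours inside the support radius receives only its own contribution, `m * poly6(0.0, h)`; a real-time system would replace the all-pairs loop with a spatial grid or similar neighbour search.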

2025-11-19T22:19:27Z 9 pages, 9 figures William Baumgartner Gizem Kayar-Ceylan http://arxiv.org/abs/2511.13264v2 SymGS : Leveraging Local Symmetries for 3D Gaussian Splatting Compression 2025-11-19T14:12:47Z

3D Gaussian Splatting has emerged as a transformative technique in novel view synthesis, primarily due to its high rendering speed and photorealistic fidelity. However, its memory footprint scales rapidly with scene complexity, often reaching several gigabytes. Existing methods address this issue by introducing compression strategies that exploit primitive-level redundancy through similarity detection and quantization. We aim to surpass the compression limits of such methods by incorporating symmetry-aware techniques, specifically targeting mirror symmetries to eliminate redundant primitives. We propose a novel compression framework, SymGS, which introduces learnable mirrors into the scene, thereby eliminating local and global reflective redundancies for compression. Our framework functions as a plug-and-play enhancement to state-of-the-art compression methods (e.g., HAC) to achieve further compression. Compared to HAC, we achieve $1.66\times$ compression across benchmark datasets (up to $3\times$ on large-scale scenes). On average, SymGS enables $108\times$ compression of a 3DGS scene while preserving rendering quality. The project page and supplementary material can be found at symgs.github.io.
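The geometric idea behind mirror-based redundancy removal can be sketched as follows: if one half of a symmetric region is stored, the other half is recovered by reflecting it across a plane. This is only an illustration under stated assumptions; the abstract does not give SymGS's mirror parameterization, so the plane form (normal and offset) and the names `reflect_point` and `expand_with_mirror` are hypothetical, and only Gaussian centers are handled (covariances and colour coefficients would need their own reflection rules).

```python
def reflect_point(p, normal, offset):
    """Reflect 3D point p across the plane {x : normal . x = offset}."""
    # Normalise so the formula holds for any non-zero normal vector.
    norm = sum(c * c for c in normal) ** 0.5
    n = [c / norm for c in normal]
    d = offset / norm
    # Signed distance from p to the plane along the unit normal.
    signed_dist = sum(pc * nc for pc, nc in zip(p, n)) - d
    return [pc - 2.0 * signed_dist * nc for pc, nc in zip(p, n)]


def expand_with_mirror(centers, normal, offset):
    """Return the stored centers followed by their mirrored copies."""
    return centers + [reflect_point(c, normal, offset) for c in centers]
```

Under this sketch a mirror costs four numbers (normal and offset) regardless of how many primitives it duplicates, which is where the saving would come from when a region is close to symmetric.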

2025-11-17T11:26:09Z Project Page: https://symgs.github.io/ Keshav Gupta Akshat Sanghvi Shreyas Reddy Palley Astitva Srivastava Charu Sharma Avinash Sharma http://arxiv.org/abs/2511.15398v1 One algebra for all : Geometric Algebra methods for neurosymbolic XR scene authoring, animation and neural rendering 2025-11-19T12:53:37Z

This position paper delves into the transformative role of Geometric Algebra (GA) in advancing specific areas of Computer Graphics (CG) and Extended Reality (XR), particularly in character animation, rendering, rigging, neural rendering, and generative AI-driven scene editing. Common CG algorithms require handling rotations, translations, and dilations (uniform scalings) in operations such as object rendering, rigged model animation, soft-body deformation, and XR simulations. Traditional representation forms - such as matrices, quaternions, and vectors - often introduce limitations in precision and performance. Recent breakthroughs in the use of GA suggest it can significantly enhance these processes by encapsulating geometric forms and transformations into uniform algebraic expressions, which maintain critical geometric properties throughout multi-step transformations. Furthermore, we explore how GA can serve as a unifying mathematical substrate for neurosymbolic XR scene authoring, bridging learned neural representations and explicit geometric reasoning. This paper outlines how GA-based approaches can improve the fidelity of rigged character animations, enhance soft-body simulations, streamline real-time rendering, and optimize neural and generative AI scene editing. GA offers a coherent and efficient framework for these processes, resulting in superior visual outcomes and computational efficiency, particularly in XR environments.

2025-11-19T12:53:37Z 10 pages, 9 Figures Manos Kamarianakis Antonis Protopsaltis George Papagiannakis http://arxiv.org/abs/2507.07623v2 Capture Stage Matting: Challenges, Approaches, and Solutions for Offline and Real-Time Processing 2025-11-19T09:17:45Z

Capture stages are high-end sources of state-of-the-art recordings for downstream applications in movies, games, and other media. One crucial step in almost all pipelines is matting, i.e., separating captured performances from the background. While common matting algorithms deliver remarkable performance in other applications like teleconferencing and mobile entertainment, we found that they struggle significantly with the peculiarities of capture stage content. The goal of our work is to share insights into those challenges as a curated list of these characteristics, along with a constructive discussion of proactive interventions, and to present a guideline that helps practitioners improve their workflow and mitigate unresolved challenges. To this end, we also demonstrate an efficient pipeline to adapt state-of-the-art approaches to such custom setups without the need for extensive annotations, for both offline and real-time processing. For an objective evaluation, we introduce a validation methodology using a state-of-the-art diffusion model to demonstrate the benefits of our approach.

2025-07-10T10:45:46Z Hannah Dröge Janelle Pfeifer Saskia Rabich Reinhard Klein Matthias B. Hullin Markus Plack http://arxiv.org/abs/2511.11587v3 MedBuild AI: An Agent-Based Hybrid Intelligence Framework for Reshaping Agency in Healthcare Infrastructure Planning through Generative Design for Medical Architecture 2025-11-19T02:33:13Z

Globally, disparities in healthcare infrastructure remain stark, leaving countless communities without access to even basic services. Traditional infrastructure planning is often slow and inaccessible, and although many architects are actively delivering humanitarian and aid-driven hospital projects worldwide, these vital efforts still fall far short of the sheer scale and urgency of demand. This paper introduces MedBuild AI, a hybrid-intelligence framework that integrates large language models (LLMs) with deterministic expert systems to rebalance the early design and conceptual planning stages. As a web-based platform, it enables any region with satellite internet access to obtain guidance on modular, low-tech, low-cost medical building designs. The system operates through three agents: the first gathers local health intelligence via conversational interaction; the second translates this input into an architectural functional program through rule-based computation; and the third generates layouts and 3D models. By embedding computational negotiation into the design process, MedBuild AI fosters a reciprocal, inclusive, and equitable approach to healthcare planning, empowering communities and redefining agency in global healthcare architecture.

2025-10-17T19:55:00Z 24 pages, 16 figures. Submitted to the IJAC Special Issue "Rebalance and Reciprocity" Yiming Zhang Yuejia Xu Ziyao Wang Xin Yan Xiaosai Hao http://arxiv.org/abs/2511.15022v1 Complex-Valued 2D Gaussian Representation for Computer-Generated Holography 2025-11-19T01:41:14Z

We propose a new hologram representation based on structured complex-valued 2D Gaussian primitives, which replaces per-pixel information storage and reduces the parameter search space by up to 10:1. To enable end-to-end training, we develop a differentiable rasterizer for our representation, integrated with a GPU-optimized light propagation kernel in free space. Our extensive experiments show that our method achieves up to 2.5x lower VRAM usage and 50% faster optimization while producing higher-fidelity reconstructions than existing methods. We further introduce a conversion procedure that adapts our representation to practical hologram formats, including smooth and random phase-only holograms. Our experiments show that this procedure can effectively suppress noise artifacts observed in previous methods. By reducing the hologram parameter search space, our representation enables a more scalable hologram estimation in the next-generation computer-generated holography systems.
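To make the idea of replacing per-pixel storage with complex-valued 2D Gaussian primitives concrete, the sketch below evaluates a field as a sum of such primitives. The parameterization (center, per-axis standard deviations, rotation, amplitude, phase) is an assumption for illustration, as are the names `complex_gaussian` and `render_field`; the paper's differentiable rasterizer and light propagation kernel are not reproduced here.

```python
import cmath
import math


def complex_gaussian(x, y, prim):
    """Evaluate one complex-valued 2D Gaussian primitive at pixel (x, y).

    prim = (cx, cy, sx, sy, theta, amplitude, phase) is a hypothetical layout.
    """
    cx, cy, sx, sy, theta, amplitude, phase = prim
    dx, dy = x - cx, y - cy
    c, s = math.cos(theta), math.sin(theta)
    # Rotate the offset into the primitive's local axes.
    u = c * dx + s * dy
    v = -s * dx + c * dy
    envelope = math.exp(-0.5 * ((u / sx) ** 2 + (v / sy) ** 2))
    return amplitude * envelope * cmath.exp(1j * phase)


def render_field(primitives, width, height):
    """Sum all primitives at every pixel; returns rows of complex values."""
    return [
        [sum(complex_gaussian(x, y, p) for p in primitives) for x in range(width)]
        for y in range(height)
    ]
```

In this form the number of stored parameters depends on the number of primitives rather than on the resolution of the field, which is the property the abstract relies on when it speaks of a smaller parameter search space.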

2025-11-19T01:41:14Z 8 pages, 11 figures Yicheng Zhan Xiangjun Gao Long Quan Kaan Akşit http://arxiv.org/abs/2511.14742v1 A Neural Field-Based Approach for View Computation & Data Exploration in 3D Urban Environments 2025-11-18T18:41:28Z

Despite the growing availability of 3D urban datasets, extracting insights remains challenging due to computational bottlenecks and the complexity of interacting with data. In fact, the intricate geometry of 3D urban environments results in high degrees of occlusion and requires extensive manual viewpoint adjustments that make large-scale exploration inefficient. To address this, we propose a view-based approach for 3D data exploration, where a vector field encodes views from the environment. To support this approach, we introduce a neural field-based method that constructs an efficient implicit representation of 3D environments. This representation enables both faster direct queries, which consist of the computation of view assessment indices, and inverse queries, which help avoid occlusion and facilitate the search for views that match desired data patterns. Our approach supports key urban analysis tasks such as visibility assessments, solar exposure evaluation, and assessing the visual impact of new developments. We validate our method through quantitative experiments, case studies informed by real-world urban challenges, and feedback from domain experts. Results show its effectiveness in finding desirable viewpoints, analyzing building facade visibility, and evaluating views from outdoor spaces. Code and data are publicly available at https://urbantk.org/neural-3d.

2025-11-18T18:41:28Z Accepted at IEEE Transactions on Visualization and Computer Graphics. Code and data are publicly available at https://urbantk.org/neural-3d Stefan Cobeli Kazi Shahrukh Omar Rodrigo Valença Nivan Ferreira Fabio Miranda http://arxiv.org/abs/2503.23752v4 StrokeFusion: Vector Sketch Generation via Joint Stroke-UDF Encoding and Latent Sequence Diffusion 2025-11-18T18:33:36Z

In the field of sketch generation, raster-format trained models often produce non-stroke artifacts, while vector-format trained models typically lack a holistic understanding of sketches, leading to compromised recognizability. Moreover, existing methods struggle to extract common features from similar elements (e.g., eyes of animals) appearing at varying positions across sketches. To address these challenges, we propose StrokeFusion, a two-stage framework for vector sketch generation. It contains a dual-modal sketch feature learning network that maps strokes into a high-quality latent space. This network decomposes sketches into normalized strokes and jointly encodes stroke sequences with Unsigned Distance Function (UDF) maps, representing sketches as sets of stroke feature vectors. Building upon this representation, our framework exploits a stroke-level latent diffusion model that simultaneously adjusts stroke position, scale, and trajectory during generation. This enables high-fidelity sketch generation while supporting stroke interpolation editing. Extensive experiments on the QuickDraw dataset demonstrate that our framework outperforms state-of-the-art techniques, validating its effectiveness in preserving structural integrity and semantic features. Code and models will be made publicly available upon publication.

2025-03-31T06:03:03Z Jin Zhou Yi Zhou Hongliang Yang Pengfei Xu Hui Huang http://arxiv.org/abs/2511.14283v1 NeuralSSD: A Neural Solver for Signed Distance Surface Reconstruction 2025-11-18T09:20:15Z

We propose a generalized method, NeuralSSD, for reconstructing a 3D implicit surface from widely available point cloud data. NeuralSSD is a solver based on the neural Galerkin method, aimed at reconstructing higher-quality, more accurate surfaces from input point clouds. Implicit methods are preferred for their ability to accurately represent shapes and their robustness in handling topological changes. However, existing parameterizations of implicit fields lack explicit mechanisms to ensure a tight fit between the surface and input data. To address this, we propose a novel energy equation that balances the reliability of point cloud information. Additionally, we introduce a new convolutional network that learns three-dimensional information to achieve superior optimization results. This approach ensures that the reconstructed surface closely adheres to the raw input points and infers valuable inductive biases from point clouds, resulting in a highly accurate and stable surface reconstruction. NeuralSSD is evaluated on a variety of challenging datasets, including the ShapeNet and Matterport datasets, and achieves state-of-the-art results in terms of both surface reconstruction accuracy and generalizability.

2025-11-18T09:20:15Z Under review Zi-Chen Xi Jiahui Huang Hao-Xiang Chen Francis Williams Qun-Ce Xu Tai-Jiang Mu Shi-Min Hu http://arxiv.org/abs/2511.14205v1 FreeMusco: Motion-Free Learning of Latent Control for Morphology-Adaptive Locomotion in Musculoskeletal Characters 2025-11-18T07:23:46Z

We propose FreeMusco, a motion-free framework that jointly learns latent representations and control policies for musculoskeletal characters. By leveraging the musculoskeletal model as a strong prior, our method enables energy-aware and morphology-adaptive locomotion to emerge without motion data. The framework generalizes across human, non-human, and synthetic morphologies, where distinct energy-efficient strategies naturally appear--for example, quadrupedal gaits in Chimanoid versus bipedal gaits in Humanoid. The latent space and corresponding control policy are constructed from scratch, without demonstration, and enable downstream tasks such as goal navigation and path following--representing, to our knowledge, the first motion-free method to provide such capabilities. FreeMusco learns diverse and physically plausible locomotion behaviors through model-based reinforcement learning, guided by a locomotion objective that combines control, balancing, and biomechanical terms. To better capture the periodic structure of natural gait, we introduce a temporally averaged loss formulation, which compares simulated and target states over a time window rather than on a per-frame basis. We further encourage behavioral diversity by randomizing target poses and energy levels during training, enabling locomotion to be flexibly modulated in both form and intensity at runtime. Together, these results demonstrate that versatile and adaptive locomotion control can emerge without motion capture, offering a new direction for simulating movement in characters where data collection is impractical or impossible.
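The difference between a per-frame loss and a loss computed over a time window can be shown on a toy periodic signal. The abstract does not give the exact formula, so the sketch below assumes one plausible reading: scalar states, and a squared difference between the window means of the simulated and target trajectories, averaged over all sliding windows. The function names are hypothetical.

```python
import math


def per_frame_loss(sim, target):
    """Mean squared error computed frame by frame."""
    return sum((s - t) ** 2 for s, t in zip(sim, target)) / len(sim)


def windowed_loss(sim, target, window):
    """Squared difference of window means, averaged over all full sliding windows."""
    n_windows = len(sim) - window + 1
    total = 0.0
    for start in range(n_windows):
        s_mean = sum(sim[start:start + window]) / window
        t_mean = sum(target[start:start + window]) / window
        total += (s_mean - t_mean) ** 2
    return total / n_windows


# Toy gait-like signal: period of 20 frames, simulated copy shifted by a quarter period.
period = 20
target = [math.sin(2 * math.pi * k / period) for k in range(60)]
sim = [math.sin(2 * math.pi * (k + 5) / period) for k in range(60)]
```

With a window equal to one period, `windowed_loss(sim, target, period)` is essentially zero, while `per_frame_loss(sim, target)` is large: the windowed form tolerates a phase offset in an otherwise matching periodic motion, which the per-frame form penalizes.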

2025-11-18T07:23:46Z SIGGRAPH Asia 2025 Minkwan Kim Yoonsang Lee http://arxiv.org/abs/2410.15584v3 Deep Learning and Machine Learning -- Object Detection and Semantic Segmentation: From Theory to Applications 2025-11-18T06:43:58Z

An in-depth exploration of object detection and semantic segmentation is provided, combining theoretical foundations with practical applications. State-of-the-art advancements in machine learning and deep learning are reviewed, focusing on convolutional neural networks (CNNs), YOLO architectures, and transformer-based approaches such as DETR. The integration of artificial intelligence (AI) techniques and large language models for enhancing object detection in complex environments is examined. Additionally, a comprehensive analysis of big data processing is presented, with emphasis on model optimization and performance evaluation metrics. By bridging the gap between traditional methods and modern deep learning frameworks, valuable insights are offered for researchers, data scientists, and engineers aiming to apply AI-driven methodologies to large-scale object detection tasks.

2024-10-21T02:10:49Z 167 pages Jintao Ren Ziqian Bi Qian Niu Xinyuan Song Zekun Jiang Junyu Liu Benji Peng Sen Zhang Xuanhe Pan Jinlang Wang Keyu Chen Caitlyn Heqi Yin Pohsun Feng Yizhu Wen Tianyang Wang Silin Chen Ming Li Jiawei Xu Ming Liu