http://arxiv.org/abs/2509.07044v1 On design, analysis, and hybrid manufacturing of microstructured blade-like geometries 2025-09-08T11:31:47Z

With the evolution of new manufacturing technologies such as multi-material 3D printing, one can think of a new type of object that consists of considerably less, yet heterogeneous, material, consequently being porous, lighter and cheaper, while having the very same functionality as the original object when manufactured from one single solid material. We aim at questioning five decades of traditional paradigms in geometric CAD and focus on a new generation of CAD objects that are not solid, but contain heterogeneous free-form internal microstructures. We propose a unified manufacturing pipeline that involves all stages, namely design, optimization, manufacturing, and inspection of microstructured free-form geometries. We demonstrate our pipeline on an industrial test case of a blisk blade that sustains the desired pressure limits, yet requires significantly less material when compared to the solid counterpart.

2025-09-08T11:31:47Z 14 pages, 23 figures Pablo Antolin Michael Barton Georges-Pierre Bonneau Annalisa Buffa Amaia Calleja-Ochoa Gershon Elber Stefanie Elgeti Gaizka Gómez Escudero Alicia Gonzalez Haizea González Barrio Stefanie Hahmann Thibaut Hirschler Q Youn Honga Konstantin Key Myung-Soo Kim Michael Kofler Norberto Lopez de Lacalle Silvia de la Maza Kanika Rajain Jacques Zwar

http://arxiv.org/abs/2509.04145v2 Hyper Diffusion Avatars: Dynamic Human Avatar Generation using Network Weight Space Diffusion 2025-09-08T08:40:58Z

Creating human avatars is a highly desirable yet challenging task. Recent advancements in radiance field rendering have achieved unprecedented photorealism and real-time performance for personalized dynamic human avatars. However, these approaches are typically limited to person-specific rendering models trained on multi-view video data for a single individual, limiting their ability to generalize across different identities. On the other hand, generative approaches leveraging prior knowledge from pre-trained 2D diffusion models can produce cartoonish, static human avatars, which are animated through simple skeleton-based articulation. Therefore, the avatars generated by these methods suffer from lower rendering quality compared to person-specific rendering methods and fail to capture pose-dependent deformations such as cloth wrinkles. In this paper, we propose a novel approach that unites the strengths of person-specific rendering and diffusion-based generative modeling to enable dynamic human avatar generation with both high photorealism and realistic pose-dependent deformations. Our method follows a two-stage pipeline: first, we optimize a set of person-specific UNets, with each network representing a dynamic human avatar that captures intricate pose-dependent deformations. In the second stage, we train a hyper diffusion model over the optimized network weights. During inference, our method generates network weights for real-time, controllable rendering of dynamic human avatars. Using a large-scale, cross-identity, multi-view video dataset, we demonstrate that our approach outperforms state-of-the-art human avatar generation methods.

2025-09-04T12:15:55Z Project webpage: https://vcai.mpi-inf.mpg.de/projects/HDA/ Dongliang Cao Guoxing Sun Marc Habermann Florian Bernard

http://arxiv.org/abs/2509.06167v1 Exploring Urban Factors with Autoencoders: Relationship Between Static and Dynamic Features 2025-09-07T18:37:04Z

Urban analytics utilizes extensive datasets with diverse urban information to simulate, predict trends, and uncover complex patterns within cities. While these data enable advanced analysis, they also present challenges due to their granularity, heterogeneity, and multimodality. To address these challenges, visual analytics tools have been developed to support the exploration of latent representations of fused heterogeneous and multimodal data, discretized at a street-level of detail. However, visualization-assisted tools seldom explore the extent to which fused data can offer deeper insights than examining each data source independently within an integrated visualization framework. In this work, we developed a visualization-assisted framework to analyze whether fused latent data representations are more effective than separate representations in uncovering patterns from dynamic and static urban data. The analysis reveals that combined latent representations produce more structured patterns, while separate ones are useful in particular cases.

2025-09-07T18:37:04Z Ximena Pocco Waqar Hassan Karelia Salinas Vladimir Molchanov Luis G. Nonato

http://arxiv.org/abs/2509.05855v1 Programming tension in 3D printed networks inspired by spiderwebs 2025-09-06T22:42:11Z

Each element in tensioned structural networks -- such as tensegrity, architectural fabrics, or medical braces/meshes -- requires a specific tension level to achieve and maintain the desired shape, stability, and compliance. These structures are challenging to manufacture, 3D print, or assemble because flattening the network during fabrication introduces multiplicative inaccuracies in the network's final tension gradients. This study overcomes this challenge by offering a fabrication algorithm for direct 3D printing of such networks with programmed tension gradients, an approach analogous to the spinning of spiderwebs. The algorithm: (i) defines the desired network and prescribes its tension gradients using the force density method; (ii) converts the network into an unstretched counterpart by numerically optimizing vertex locations toward target element lengths and converting straight elements into arcs to resolve any remaining error; and (iii) decomposes the network into printable toolpaths. Optional additional steps are: (iv) flattening curved 2D networks or 3D networks to ensure 3D printing compatibility; and (v) automatically resolving any unwanted crossings introduced by the flattening process. The proposed method is experimentally validated using 2D unit cells of viscoelastic filaments, where accurate tension gradients are achieved with an average element strain error of less than 1.0%. The method remains effective for networks with element minimum length and maximum stress of 5.8 mm and 7.3 MPa, respectively. The method is used to demonstrate the fabrication of three complex cases: a flat spiderweb, a curved mesh, and a tensegrity system. The programmable tension gradient algorithm can be utilized to produce compact, integrated cable networks, enabling novel applications such as moment-exerting structures in medical braces and splints.
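The force density method named in step (i) can be illustrated with a small sketch. This is not the authors' implementation: it only shows the standard idea that, once a force density q (element force divided by element length) is prescribed per element, each free vertex in equilibrium sits at the q-weighted average of its neighbours. The function names, the Gauss-Seidel relaxation, and the four-vertex example network are assumptions made for illustration.

```python
def force_density_equilibrium(points, fixed, edges, q, iterations=500):
    """Relax free vertices to force-density equilibrium with Gauss-Seidel sweeps.

    points: dict vertex -> (x, y) initial positions
    fixed:  set of anchored vertices
    edges:  list of (i, j) vertex pairs
    q:      force density (force / length) per edge, all assumed > 0
    """
    pos = {v: list(p) for v, p in points.items()}
    neighbours = {v: [] for v in points}
    for (i, j), qe in zip(edges, q):
        neighbours[i].append((j, qe))
        neighbours[j].append((i, qe))
    for _ in range(iterations):
        for v in pos:
            if v in fixed or not neighbours[v]:
                continue
            total = sum(qe for _, qe in neighbours[v])
            for axis in range(2):
                # Equilibrium: sum_j q_ij * (x_i - x_j) = 0 for every free vertex i.
                pos[v][axis] = sum(qe * pos[n][axis] for n, qe in neighbours[v]) / total
    return {v: tuple(p) for v, p in pos.items()}


def element_forces(pos, edges, q):
    """Tension in each element: force density times current element length."""
    forces = []
    for (i, j), qe in zip(edges, q):
        length = ((pos[i][0] - pos[j][0]) ** 2 + (pos[i][1] - pos[j][1]) ** 2) ** 0.5
        forces.append(qe * length)
    return forces


# Hypothetical network: one free vertex "c" tied to three anchors.
points = {"a": (0.0, 0.0), "b": (4.0, 0.0), "d": (2.0, 4.0), "c": (1.0, 1.0)}
edges = [("a", "c"), ("b", "c"), ("d", "c")]
q = [1.0, 1.0, 2.0]  # the element to "d" is prescribed twice the force density
eq = force_density_equilibrium(points, {"a", "b", "d"}, edges, q)
forces = element_forces(eq, edges, q)
```

Raising the force density of one element pulls the free vertex toward that element's anchor, which is how a tension gradient is prescribed before the network is converted to its unstretched counterpart.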

2025-09-06T22:42:11Z Thijs Masmeijer Caleb Swain Jeff Hill Ed Habtour

http://arxiv.org/abs/2508.18597v2 SemLayoutDiff: Semantic Layout Generation with Diffusion Model for Indoor Scene Synthesis 2025-09-06T19:34:22Z

We present SemLayoutDiff, a unified model for synthesizing diverse 3D indoor scenes across multiple room types. The model introduces a scene layout representation combining a top-down semantic map and attributes for each object. Unlike prior approaches, which cannot condition on architectural constraints, SemLayoutDiff employs a categorical diffusion model capable of conditioning scene synthesis explicitly on room masks. It first generates a coherent semantic map, followed by a cross-attention-based network to predict furniture placements that respect the synthesized layout. Our method also accounts for architectural elements such as doors and windows, ensuring that generated furniture arrangements remain practical and unobstructed. Experiments on the 3D-FRONT dataset show that SemLayoutDiff produces spatially coherent, realistic, and varied scenes, outperforming previous methods.

2025-08-26T02:01:20Z Project page: https://3dlg-hcvc.github.io/SemLayoutDiff/ Xiaohao Sun Divyam Goel Angel X. Chang

http://arxiv.org/abs/2509.05595v1 PaMO: Parallel Mesh Optimization for Intersection-Free Low-Poly Modeling on the GPU 2025-09-06T04:42:05Z

Reducing the triangle count in complex 3D models is a basic geometry preprocessing step in graphics pipelines such as efficient rendering and interactive editing. However, most existing mesh simplification methods exhibit a few issues. First, they often lead to self-intersections during decimation, a major issue for applications such as 3D printing and soft-body simulation. Second, to perform simplification on a mesh in the wild, one would first need to perform re-meshing, which often suffers from surface shifts and losses of sharp features. Finally, existing re-meshing and simplification methods can take minutes when processing large-scale meshes, limiting their applications in practice. To address these challenges, we introduce a novel GPU-based mesh optimization approach containing three key components: (1) a parallel re-meshing algorithm to turn meshes in the wild into watertight, manifold, and intersection-free ones, and reduce the prevalence of poorly shaped triangles; (2) a robust parallel simplification algorithm with intersection-free guarantees; (3) an optimization-based safe projection algorithm to realign the simplified mesh with the input, eliminating the surface shift introduced by re-meshing and recovering the original sharp features. The algorithm demonstrates remarkable efficiency, simplifying a 2-million-face mesh to 20k triangles in 3 seconds on an RTX 4090. We evaluated the approach on the Thingi10K dataset and showcased its exceptional performance in geometry preservation and speed.

2025-09-06T04:42:05Z Seonghun Oh Xiaodi Yuan Xinyue Wei Ruoxi Shi Fanbo Xiang Minghua Liu Hao Su 10.1111/cgf.70267

http://arxiv.org/abs/2509.04277v1 Massively-Parallel Implementation of Inextensible Elastic Rods Using Inter-block GPU Synchronization 2025-09-04T14:51:38Z

An elastic rod is a long and thin body able to sustain large global deformations, even if local strains are small. The Cosserat rod is a non-linear elastic rod with an oriented centreline, which enables modelling of bending, stretching and twisting deformations. It can be used for physically-based computer simulation of threads, wires, ropes, as well as flexible surgical instruments such as catheters, guidewires or sutures. We present a massively-parallel implementation of the original CoRdE model as well as our inextensible variation. By superseding the CUDA Scalable Programming Model and using inter-block synchronization, we managed to simulate multiple physics time-steps per single kernel launch utilizing all the GPU's streaming multiprocessors. Under some constraints, this results in nearly constant computation time, regardless of the number of Cosserat elements simulated. When executing 10 time-steps per single kernel launch, our implementation of the original, extensible CoRdE was 40.0x faster. In a number of tests, the GPU implementation of our inextensible CoRdE modification achieved an average speed-up of 15.11x over the corresponding CPU version. Simulating a catheter/guidewire pair (2x512 Cosserat elements) in a cardiovascular application resulted in a 13.5-fold performance boost, enabling accurate real-time simulation at haptic interactive rates (0.5-1 kHz).

2025-09-04T14:51:38Z 12 pages, unpublished Przemyslaw Korzeniowski Niels Hald Fernando Bello

http://arxiv.org/abs/2510.15886v1 Structural Tree Extraction from 3D Surfaces 2025-09-04T11:01:34Z

This paper introduces a method to extract a hierarchical tree representation from 3D unorganized polygonal data. The proposed approach first extracts a graph representation of the surface, which serves as the foundation for structural analysis. A Steiner tree is then generated to establish an optimized connection between key terminal points, defined according to application-specific criteria. The structure can be further refined by leveraging line-of-sight constraints, reducing redundancy while preserving essential connectivity. Unlike traditional skeletonization techniques, which often assume volumetric interpretations, this method operates directly on the surface, ensuring that the resulting representation remains relevant for navigation-aware geometric analysis. The method is validated through two use cases: extracting structural representations from tile-based elements for procedural content generation, and identifying key points and structural metrics for automated level analysis. Results demonstrate its ability to produce simplified, coherent representations, supporting applications in procedural generation, spatial reasoning, and map analysis.

2025-09-04T11:01:34Z Diogo de Andrade Nuno Fachada

http://arxiv.org/abs/2509.04047v1 TensoIS: A Step Towards Feed-Forward Tensorial Inverse Subsurface Scattering for Perlin Distributed Heterogeneous Media 2025-09-04T09:28:20Z

Estimating scattering parameters of heterogeneous media from images is a severely under-constrained and challenging problem. Most of the existing approaches model BSSRDF either through an analysis-by-synthesis approach, approximating complex path integrals, or using differentiable volume rendering techniques to account for heterogeneity. Only a few studies have applied learning-based methods to estimate subsurface scattering parameters, and they assume homogeneous media. Interestingly, no specific distribution is known to us that can explicitly model the heterogeneous scattering parameters in the real world. Notably, procedural noise models such as Perlin and Fractal Perlin noise have been effective in representing intricate heterogeneities of natural, organic, and inorganic surfaces. Leveraging this, we first create HeteroSynth, a synthetic dataset comprising photorealistic images of heterogeneous media whose scattering parameters are modeled using Fractal Perlin noise. Furthermore, we propose Tensorial Inverse Scattering (TensoIS), a learning-based feed-forward framework to estimate these Perlin-distributed heterogeneous scattering parameters from sparse multi-view image observations. Instead of directly predicting the 3D scattering parameter volume, TensoIS uses learnable low-rank tensor components to represent the scattering volume. We evaluate TensoIS on unseen heterogeneous variations over shapes from the HeteroSynth test set, smoke and cloud geometries obtained from open-source realistic volumetric simulations, and some real-world samples to establish its effectiveness for inverse scattering. Overall, this study is an attempt to explore the Perlin noise distribution, given the lack of any such well-defined distribution in the literature, to potentially model real-world heterogeneous scattering in a feed-forward manner.
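Why low-rank tensor components are attractive for a 3D parameter volume can be seen in a toy sketch. The abstract does not state which factorization TensoIS uses, so the CP form below (a sum of rank-one terms built from per-axis vectors) is an assumption, as are the function names and the toy sizes.

```python
def cp_reconstruct(factors_x, factors_y, factors_z):
    """Rebuild a dense volume V[i][j][k] = sum_r X[r][i] * Y[r][j] * Z[r][k]."""
    rank = len(factors_x)
    nx, ny, nz = len(factors_x[0]), len(factors_y[0]), len(factors_z[0])
    return [
        [
            [
                sum(factors_x[r][i] * factors_y[r][j] * factors_z[r][k] for r in range(rank))
                for k in range(nz)
            ]
            for j in range(ny)
        ]
        for i in range(nx)
    ]


def parameter_counts(nx, ny, nz, rank):
    """Stored values: dense grid versus rank-R per-axis factor vectors."""
    return nx * ny * nz, rank * (nx + ny + nz)


# Toy rank-2 factors for a 2 x 3 x 2 volume (illustrative numbers only).
fx = [[1.0, 2.0], [0.0, 1.0]]
fy = [[1.0, 0.5, 0.0], [1.0, 1.0, 1.0]]
fz = [[2.0, 1.0], [0.5, 0.5]]
volume = cp_reconstruct(fx, fy, fz)

# For a hypothetical 64^3 grid at rank 8, the factors are far smaller than the grid.
dense, low_rank = parameter_counts(64, 64, 64, rank=8)
```

A network that predicts the factor vectors has far fewer outputs to produce than one that predicts every voxel, at the cost of restricting the volumes it can represent.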

2025-09-04T09:28:20Z To appear in Pacific Graphics 2025 (CGF Journal Track), Project page: https://yashbachwana.github.io/TensoIS/ Ashish Tiwari Satyam Bhardwaj Yash Bachwana Parag Sarvoday Sahu T. M. Feroz Ali Bhargava Chintalapati Shanmuganathan Raman

http://arxiv.org/abs/2506.12348v3 Real-Time Per-Garment Virtual Try-On with Temporal Consistency for Loose-Fitting Garments 2025-09-04T01:53:54Z

Per-garment virtual try-on methods collect garment-specific datasets and train networks tailored to each garment to achieve superior results. However, these approaches often struggle with loose-fitting garments due to two key limitations: (1) They rely on human body semantic maps to align garments with the body, but these maps become unreliable when body contours are obscured by loose-fitting garments, resulting in degraded outcomes; (2) They train garment synthesis networks on a per-frame basis without utilizing temporal information, leading to noticeable jittering artifacts. To address the first limitation, we propose a two-stage approach for robust semantic map estimation. First, we extract a garment-invariant representation from the raw input image. This representation is then passed through an auxiliary network to estimate the semantic map. This enhances the robustness of semantic map estimation under loose-fitting garments during garment-specific dataset generation. To address the second limitation, we introduce a recurrent garment synthesis framework that incorporates temporal dependencies to improve frame-to-frame coherence while maintaining real-time performance. We conducted qualitative and quantitative evaluations to demonstrate that our method outperforms existing approaches in both image quality and temporal coherence. Ablation studies further validate the effectiveness of the garment-invariant representation and the recurrent synthesis framework.

2025-06-14T04:57:21Z Zaiqiang Wu I-Chao Shen Takeo Igarashi 10.1111/cgf.70272

http://arxiv.org/abs/2509.03775v1 ContraGS: Codebook-Condensed and Trainable Gaussian Splatting for Fast, Memory-Efficient Reconstruction 2025-09-03T23:40:17Z

3D Gaussian Splatting (3DGS) is a state-of-the-art technique to model real-world scenes with high quality and real-time rendering. Typically, a higher quality representation can be achieved by using a large number of 3D Gaussians. However, using large 3D Gaussian counts significantly increases the GPU device memory needed for storing model parameters. A large model thus requires powerful GPUs with high memory capacities for training and has slower training/rendering latencies due to the inefficiencies of memory access and data movement. In this work, we introduce ContraGS, a method to enable training directly on compressed 3DGS representations without reducing the Gaussian counts, and thus with little loss in model quality. ContraGS leverages codebooks to compactly store a set of Gaussian parameter vectors throughout the training process, thereby significantly reducing memory consumption. While codebooks have been demonstrated to be highly effective at compressing fully trained 3DGS models, directly training using codebook representations is an unsolved challenge. ContraGS solves the problem of learning non-differentiable parameters in codebook-compressed representations by posing parameter estimation as a Bayesian inference problem. To this end, ContraGS provides a framework that effectively uses MCMC sampling to sample over a posterior distribution of these compressed representations. We demonstrate that ContraGS significantly reduces the peak memory during training (3.49X on average) and accelerates training and rendering (1.36X and 1.88X on average, respectively), while retaining close to state-of-the-art quality.
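The memory argument behind codebooks can be sketched as follows, assuming plain nearest-neighbour vector quantization: each Gaussian stores a small integer index into a shared table of parameter vectors instead of its own vector. The sketch covers only the storage layout, not the Bayesian/MCMC training that ContraGS describes, and every name and number in it is hypothetical.

```python
def nearest_code(vector, codebook):
    """Index of the codebook entry closest to vector (squared Euclidean distance)."""
    def dist2(code):
        return sum((a - b) ** 2 for a, b in zip(vector, code))
    return min(range(len(codebook)), key=lambda idx: dist2(codebook[idx]))


def compress(vectors, codebook):
    """Replace every parameter vector by the index of its nearest codebook entry."""
    return [nearest_code(v, codebook) for v in vectors]


def decompress(indices, codebook):
    """Look the stored indices up again to recover approximate vectors."""
    return [codebook[i] for i in indices]


def stored_values(num_vectors, dim, codebook_size):
    """Values stored without a codebook versus with one (each index counts as one value)."""
    return num_vectors * dim, codebook_size * dim + num_vectors


# Toy data: four 3-dimensional parameter vectors and a 3-entry codebook.
codebook = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.0, 1.0, 0.0)]
params = [(0.1, 0.0, -0.1), (0.9, 1.1, 1.0), (0.0, 0.8, 0.1), (1.0, 0.9, 1.2)]
indices = compress(params, codebook)
restored = decompress(indices, codebook)

# Hypothetical scale: one million 48-dimensional vectors, 4096 codebook entries.
raw, coded = stored_values(1_000_000, 48, 4096)
```

The index lookup in `nearest_code` is a discrete choice with no gradient, which is the non-differentiability the abstract refers to when it motivates sampling-based estimation.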

2025-09-03T23:40:17Z Sankeerth Durvasula Sharanshangar Muhunthan Zain Moustafa Richard Chen Ruofan Liang Yushi Guan Nilesh Ahuja Nilesh Jain Selvakumar Panneer Nandita Vijaykumar

http://arxiv.org/abs/2509.03753v1 Memory Optimization for Convex Hull Support Point Queries 2025-09-03T22:45:50Z

This paper evaluates several improvements to the memory layout of convex hulls to improve computation times for support point queries. The support point query is a fundamental part of common collision algorithms, and the work presented achieves a significant speedup depending on the number of vertices of the convex hull.
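For readers unfamiliar with the term, a support point query returns the hull vertex that lies farthest along a given direction. The brute-force sketch below shows the query over two memory layouts; the abstract does not say which layouts the paper evaluates, so the array-of-structures versus structure-of-arrays comparison is an assumption, as are the function names.

```python
def support_point_aos(vertices, direction):
    """Array-of-structures layout: vertices is a list of (x, y, z) tuples."""
    dx, dy, dz = direction
    return max(vertices, key=lambda v: v[0] * dx + v[1] * dy + v[2] * dz)


def support_point_soa(xs, ys, zs, direction):
    """Structure-of-arrays layout: one contiguous list per coordinate."""
    dx, dy, dz = direction
    best_index = 0
    best_dot = xs[0] * dx + ys[0] * dy + zs[0] * dz
    for i in range(1, len(xs)):
        dot = xs[i] * dx + ys[i] * dy + zs[i] * dz
        if dot > best_dot:
            best_index, best_dot = i, dot
    return xs[best_index], ys[best_index], zs[best_index]


# Unit cube corners as a small convex hull.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
xs, ys, zs = (list(c) for c in zip(*cube))
direction = (1.0, 2.0, -1.0)

farthest_aos = support_point_aos(cube, direction)
farthest_soa = support_point_soa(xs, ys, zs, direction)
```

Both functions do the same linear scan, so any difference between them in a compiled implementation would come from how the coordinates are laid out in memory, which is the kind of effect the paper measures.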

2025-09-03T22:45:50Z 6 pages, 15 figures Michael Greer

http://arxiv.org/abs/2509.03680v1 LuxDiT: Lighting Estimation with Video Diffusion Transformer 2025-09-03T19:59:20Z

Estimating scene lighting from a single image or video remains a longstanding challenge in computer vision and graphics. Learning-based approaches are constrained by the scarcity of ground-truth HDR environment maps, which are expensive to capture and limited in diversity. While recent generative models offer strong priors for image synthesis, lighting estimation remains difficult due to its reliance on indirect visual cues, the need to infer global (non-local) context, and the recovery of high-dynamic-range outputs. We propose LuxDiT, a novel data-driven approach that fine-tunes a video diffusion transformer to generate HDR environment maps conditioned on visual input. Trained on a large synthetic dataset with diverse lighting conditions, our model learns to infer illumination from indirect visual cues and generalizes effectively to real-world scenes. To improve semantic alignment between the input and the predicted environment map, we introduce a low-rank adaptation finetuning strategy using a collected dataset of HDR panoramas. Our method produces accurate lighting predictions with realistic angular high-frequency details, outperforming existing state-of-the-art techniques in both quantitative and qualitative evaluations.

2025-09-03T19:59:20Z Project page: https://research.nvidia.com/labs/toronto-ai/LuxDiT/ Ruofan Liang Kai He Zan Gojcic Igor Gilitschenski Sanja Fidler Nandita Vijaykumar Zian Wang

http://arxiv.org/abs/2412.00177v3 LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting 2025-09-03T17:59:08Z

We introduce LumiNet, a novel architecture that leverages generative models and latent intrinsic representations for effective lighting transfer. Given a source image and a target lighting image, LumiNet synthesizes a relit version of the source scene that captures the target's lighting. Our approach makes two key contributions: a data curation strategy from the StyleGAN-based relighting model for our training, and a modified diffusion-based ControlNet that processes both latent intrinsic properties from the source image and latent extrinsic properties from the target image. We further improve lighting transfer through a learned adaptor (MLP) that injects the target's latent extrinsic properties via cross-attention and fine-tuning. Unlike traditional ControlNet, which generates images with conditional maps from a single scene, LumiNet processes latent representations from two different images - preserving geometry and albedo from the source while transferring lighting characteristics from the target. Experiments demonstrate that our method successfully transfers complex lighting phenomena including specular highlights and indirect illumination across scenes with varying spatial layouts and materials, outperforming existing approaches on challenging indoor scenes using only images as input.

2024-11-29T18:59:11Z Corrects an evaluation bug in Table 1 due to a data normalization error. Thanks to the Sony PlayStation team for discovering and reporting the issue. The paper's core contributions, qualitative results, and user study are unaffected. We also include a minor update to the method to further improve result quality. Project page: https://luminet-relight.github.io/ Xiaoyan Xing Konrad Groh Sezer Karaoglu Theo Gevers Anand Bhattad

http://arxiv.org/abs/2509.03451v1 SmartPoser: Arm Pose Estimation with a Smartphone and Smartwatch Using UWB and IMU Data 2025-09-03T16:16:55Z

The ability to track a user's arm pose could be valuable in a wide range of applications, including fitness, rehabilitation, augmented reality input, life logging, and context-aware assistants. Unfortunately, this capability is not readily available to consumers. Systems either require cameras, which carry privacy issues, or utilize multiple worn IMUs or markers. In this work, we describe how an off-the-shelf smartphone and smartwatch can work together to accurately estimate arm pose. Moving beyond prior work, we take advantage of more recent ultra-wideband (UWB) functionality on these devices to capture absolute distance between the two devices. This measurement is the perfect complement to inertial data, which is relative and suffers from drift. We quantify the performance of our software-only approach using off-the-shelf devices, showing it can estimate the wrist and elbow joints with a median positional error of 11.0 cm, without the user having to provide training data.

2025-09-03T16:16:55Z The first two listed authors contributed equally. Published at UIST 2023 Nathan DeVrio Vimal Mollyn Chris Harrison 10.1145/3586183.3606821