https://arxiv.org/api/4PmqPHx1j+tmK8/3R6jh7iQ0W202026-06-27T19:59:02Z9390163515http://arxiv.org/abs/2509.07044v1On design, analysis, and hybrid manufacturing of microstructured blade-like geometries2025-09-08T11:31:47ZWith the evolution of new manufacturing technologies such as multi-material 3D printing, one can think of a new type of object that consists of considerably less, yet heterogeneous, material, consequently being porous, lighter and cheaper, while having the very same functionality as the original object when manufactured from one single solid material. We aim at questioning five decades of traditional paradigms in geometric CAD and focus on a new generation of CAD objects that are not solid, but contain heterogeneous free-form internal microstructures. We propose a unified manufacturing pipeline that involves all stages, namely design, optimization, manufacturing, and inspection of microstructured free-form geometries. We demonstrate our pipeline on an industrial test case of a blisk blade that sustains the desired pressure limits, yet requires significantly less material when compared to the solid counterpart.2025-09-08T11:31:47Z14 pages, 23 figuresPablo AntolinMichael BartonGeorges-Pierre BonneauAnnalisa BuffaAmaia Calleja-OchoaGershon ElberStefanie ElgetiGaizka Gómez EscuderoAlicia GonzalezHaizea González BarrioStefanie HahmannThibaut HirschlerQ Youn HongaKonstantin KeyMyung-Soo KimMichael KoflerNorberto Lopez de LacalleSilvia de la MazaKanika RajainJacques Zwarhttp://arxiv.org/abs/2509.04145v2Hyper Diffusion Avatars: Dynamic Human Avatar Generation using Network Weight Space Diffusion2025-09-08T08:40:58ZCreating human avatars is a highly desirable yet challenging task. Recent advancements in radiance field rendering have achieved unprecedented photorealism and real-time performance for personalized dynamic human avatars. 
However, these approaches are typically limited to person-specific rendering models trained on multi-view video data for a single individual, limiting their ability to generalize across different identities. On the other hand, generative approaches leveraging prior knowledge from pre-trained 2D diffusion models can produce cartoonish, static human avatars, which are animated through simple skeleton-based articulation. Therefore, the avatars generated by these methods suffer from lower rendering quality compared to person-specific rendering methods and fail to capture pose-dependent deformations such as cloth wrinkles. In this paper, we propose a novel approach that unites the strengths of person-specific rendering and diffusion-based generative modeling to enable dynamic human avatar generation with both high photorealism and realistic pose-dependent deformations. Our method follows a two-stage pipeline: first, we optimize a set of person-specific UNets, with each network representing a dynamic human avatar that captures intricate pose-dependent deformations. In the second stage, we train a hyper diffusion model over the optimized network weights. During inference, our method generates network weights for real-time, controllable rendering of dynamic human avatars. Using a large-scale, cross-identity, multi-view video dataset, we demonstrate that our approach outperforms state-of-the-art human avatar generation methods.2025-09-04T12:15:55ZProject webpage: https://vcai.mpi-inf.mpg.de/projects/HDA/Dongliang CaoGuoxing SunMarc HabermannFlorian Bernardhttp://arxiv.org/abs/2509.06167v1Exploring Urban Factors with Autoencoders: Relationship Between Static and Dynamic Features2025-09-07T18:37:04ZUrban analytics utilizes extensive datasets with diverse urban information to simulate, predict trends, and uncover complex patterns within cities. While these data enable advanced analysis, they also present challenges due to their granularity, heterogeneity, and multimodality. 
To address these challenges, visual analytics tools have been developed to support the exploration of latent representations of fused heterogeneous and multimodal data, discretized at a street-level of detail. However, visualization-assisted tools seldom explore the extent to which fused data can offer deeper insights than examining each data source independently within an integrated visualization framework. In this work, we developed a visualization-assisted framework to analyze whether fused latent data representations are more effective than separate representations in uncovering patterns from dynamic and static urban data. The analysis reveals that combined latent representations produce more structured patterns, while separate ones are useful in particular cases.2025-09-07T18:37:04ZXimena PoccoWaqar HassanKarelia SalinasVladimir MolchanovLuis G. Nonatohttp://arxiv.org/abs/2509.05855v1Programming tension in 3D printed networks inspired by spiderwebs2025-09-06T22:42:11ZEach element in tensioned structural networks -- such as tensegrity, architectural fabrics, or medical braces/meshes -- requires a specific tension level to achieve and maintain the desired shape, stability, and compliance. These structures are challenging to manufacture, 3D print, or assemble because flattening the network during fabrication introduces multiplicative inaccuracies in the network's final tension gradients. This study overcomes this challenge by offering a fabrication algorithm for direct 3D printing of such networks with programmed tension gradients, an approach analogous to the spinning of spiderwebs. 
The algorithm: (i) defines the desired network and prescribes its tension gradients using the force density method; (ii) converts the network into an unstretched counterpart by numerically optimizing vertex locations toward target element lengths and converting straight elements into arcs to resolve any remaining error; and (iii) decomposes the network into printable toolpaths. Optional additional steps are: (iv) flattening curved 2D networks or 3D networks to ensure 3D printing compatibility; and (v) automatically resolving any unwanted crossings introduced by the flattening process. The proposed method is experimentally validated using 2D unit cells of viscoelastic filaments, where accurate tension gradients are achieved with an average element strain error of less than 1.0%. The method remains effective for networks with element minimum length and maximum stress of 5.8 mm and 7.3 MPa, respectively. The method is used to demonstrate the fabrication of three complex cases: a flat spiderweb, a curved mesh, and a tensegrity system. The programmable tension gradient algorithm can be utilized to produce compact, integrated cable networks, enabling novel applications such as moment-exerting structures in medical braces and splints.2025-09-06T22:42:11ZThijs MasmeijerCaleb SwainJeff HillEd Habtourhttp://arxiv.org/abs/2508.18597v2SemLayoutDiff: Semantic Layout Generation with Diffusion Model for Indoor Scene Synthesis2025-09-06T19:34:22ZWe present SemLayoutDiff, a unified model for synthesizing diverse 3D indoor scenes across multiple room types. The model introduces a scene layout representation combining a top-down semantic map and attributes for each object. Unlike prior approaches, which cannot condition on architectural constraints, SemLayoutDiff employs a categorical diffusion model capable of conditioning scene synthesis explicitly on room masks. 
It first generates a coherent semantic map, followed by a cross-attention-based network to predict furniture placements that respect the synthesized layout. Our method also accounts for architectural elements such as doors and windows, ensuring that generated furniture arrangements remain practical and unobstructed. Experiments on the 3D-FRONT dataset show that SemLayoutDiff produces spatially coherent, realistic, and varied scenes, outperforming previous methods.2025-08-26T02:01:20ZProject page: https://3dlg-hcvc.github.io/SemLayoutDiff/Xiaohao SunDivyam GoelAngel X. Changhttp://arxiv.org/abs/2509.05595v1PaMO: Parallel Mesh Optimization for Intersection-Free Low-Poly Modeling on the GPU2025-09-06T04:42:05ZReducing the triangle count in complex 3D models is a basic geometry preprocessing step in graphics pipelines such as efficient rendering and interactive editing. However, most existing mesh simplification methods exhibit a few issues. Firstly, they often lead to self-intersections during decimation, a major issue for applications such as 3D printing and soft-body simulation. Second, to perform simplification on a mesh in the wild, one would first need to perform re-meshing, which often suffers from surface shifts and losses of sharp features. Finally, existing re-meshing and simplification methods can take minutes when processing large-scale meshes, limiting their applications in practice. To address the challenges, we introduce a novel GPU-based mesh optimization approach containing three key components: (1) a parallel re-meshing algorithm to turn meshes in the wild into watertight, manifold, and intersection-free ones, and reduce the prevalence of poorly shaped triangles; (2) a robust parallel simplification algorithm with intersection-free guarantees; (3) an optimization-based safe projection algorithm to realign the simplified mesh with the input, eliminating the surface shift introduced by re-meshing and recovering the original sharp features. 
The algorithm demonstrates remarkable efficiency, simplifying a 2-million-face mesh to 20k triangles in 3 seconds on RTX4090. We evaluated the approach on the Thingi10K dataset and showcased its exceptional performance in geometry preservation and speed.2025-09-06T04:42:05ZSeonghun OhXiaodi YuanXinyue WeiRuoxi ShiFanbo XiangMinghua LiuHao Su10.1111/cgf.70267http://arxiv.org/abs/2509.04277v1Massively-Parallel Implementation of Inextensible Elastic Rods Using Inter-block GPU Synchronization2025-09-04T14:51:38ZAn elastic rod is a long and thin body able to sustain large global deformations, even if local strains are small. The Cosserat rod is a non-linear elastic rod with an oriented centreline, which enables modelling of bending, stretching and twisting deformations. It can be used for physically-based computer simulation of threads, wires, ropes, as well as flexible surgical instruments such as catheters, guidewires or sutures. We present a massively-parallel implementation of the original CoRdE model as well as our inextensible variation. By superseding the CUDA Scalable Programming Model and using inter-block synchronization, we managed to simulate multiple physics time-steps per single kernel launch utilizing all the GPU's streaming multiprocessors. Under some constraints, this results in nearly constant computation time, regardless of the number of Cosserat elements simulated. When executing 10 time-steps per single kernel launch, our implementation of the original, extensible CoRdE was x40.0 faster. In a number of tests, the GPU implementation of our inextensible CoRdE modification achieved an average speed-up of x15.11 over the corresponding CPU version. 
Simulating a catheter/guidewire pair (2x512 Cosserat elements) in a cardiovascular application resulted in a 13.5-fold performance boost, enabling accurate real-time simulation at haptic interactive rates (0.5-1kHz).2025-09-04T14:51:38Z12 pages, unpublishedPrzemyslaw KorzeniowskiNiels HaldFernando Bellohttp://arxiv.org/abs/2510.15886v1Structural Tree Extraction from 3D Surfaces2025-09-04T11:01:34ZThis paper introduces a method to extract a hierarchical tree representation from 3D unorganized polygonal data. The proposed approach first extracts a graph representation of the surface, which serves as the foundation for structural analysis. A Steiner tree is then generated to establish an optimized connection between key terminal points, defined according to application-specific criteria. The structure can be further refined by leveraging line-of-sight constraints, reducing redundancy while preserving essential connectivity. Unlike traditional skeletonization techniques, which often assume volumetric interpretations, this method operates directly on the surface, ensuring that the resulting representation remains relevant for navigation-aware geometric analysis. The method is validated through two use cases: extracting structural representations from tile-based elements for procedural content generation, and identifying key points and structural metrics for automated level analysis. Results demonstrate its ability to produce simplified, coherent representations, supporting applications in procedural generation, spatial reasoning, and map analysis.2025-09-04T11:01:34ZDiogo de AndradeNuno Fachadahttp://arxiv.org/abs/2509.04047v1TensoIS: A Step Towards Feed-Forward Tensorial Inverse Subsurface Scattering for Perlin Distributed Heterogeneous Media2025-09-04T09:28:20ZEstimating scattering parameters of heterogeneous media from images is a severely under-constrained and challenging problem. 
Most of the existing approaches model BSSRDF either through an analysis-by-synthesis approach, approximating complex path integrals, or using differentiable volume rendering techniques to account for heterogeneity. However, only a few studies have applied learning-based methods to estimate subsurface scattering parameters, but they assume homogeneous media. Interestingly, no specific distribution is known to us that can explicitly model the heterogeneous scattering parameters in the real world. Notably, procedural noise models such as Perlin and Fractal Perlin noise have been effective in representing intricate heterogeneities of natural, organic, and inorganic surfaces. Leveraging this, we first create HeteroSynth, a synthetic dataset comprising photorealistic images of heterogeneous media whose scattering parameters are modeled using Fractal Perlin noise. Furthermore, we propose Tensorial Inverse Scattering (TensoIS), a learning-based feed-forward framework to estimate these Perlin-distributed heterogeneous scattering parameters from sparse multi-view image observations. Instead of directly predicting the 3D scattering parameter volume, TensoIS uses learnable low-rank tensor components to represent the scattering volume. We evaluate TensoIS on unseen heterogeneous variations over shapes from the HeteroSynth test set, smoke and cloud geometries obtained from open-source realistic volumetric simulations, and some real-world samples to establish its effectiveness for inverse scattering. Overall, this study is an attempt to explore Perlin noise distribution, given the lack of any such well-defined distribution in literature, to potentially model real-world heterogeneous scattering in a feed-forward manner.2025-09-04T09:28:20ZTo appear in Pacific Graphics 2025 (CGF Journal Track), Project page: https://yashbachwana.github.io/TensoIS/Ashish TiwariSatyam BhardwajYash BachwanaParag Sarvoday SahuT. M. 
Feroz AliBhargava ChintalapatiShanmuganathan Ramanhttp://arxiv.org/abs/2506.12348v3Real-Time Per-Garment Virtual Try-On with Temporal Consistency for Loose-Fitting Garments2025-09-04T01:53:54ZPer-garment virtual try-on methods collect garment-specific datasets and train networks tailored to each garment to achieve superior results. However, these approaches often struggle with loose-fitting garments due to two key limitations: (1) They rely on human body semantic maps to align garments with the body, but these maps become unreliable when body contours are obscured by loose-fitting garments, resulting in degraded outcomes; (2) They train garment synthesis networks on a per-frame basis without utilizing temporal information, leading to noticeable jittering artifacts. To address the first limitation, we propose a two-stage approach for robust semantic map estimation. First, we extract a garment-invariant representation from the raw input image. This representation is then passed through an auxiliary network to estimate the semantic map. This enhances the robustness of semantic map estimation under loose-fitting garments during garment-specific dataset generation. To address the second limitation, we introduce a recurrent garment synthesis framework that incorporates temporal dependencies to improve frame-to-frame coherence while maintaining real-time performance. We conducted qualitative and quantitative evaluations to demonstrate that our method outperforms existing approaches in both image quality and temporal coherence. 
Ablation studies further validate the effectiveness of the garment-invariant representation and the recurrent synthesis framework.2025-06-14T04:57:21ZZaiqiang WuI-Chao ShenTakeo Igarashi10.1111/cgf.70272http://arxiv.org/abs/2509.03775v1ContraGS: Codebook-Condensed and Trainable Gaussian Splatting for Fast, Memory-Efficient Reconstruction2025-09-03T23:40:17Z3D Gaussian Splatting (3DGS) is a state-of-the-art technique to model real-world scenes with high quality and real-time rendering. Typically, a higher quality representation can be achieved by using a large number of 3D Gaussians. However, using large 3D Gaussian counts significantly increases the GPU device memory for storing model parameters. A large model thus requires powerful GPUs with high memory capacities for training and has slower training/rendering latencies due to the inefficiencies of memory access and data movement. In this work, we introduce ContraGS, a method to enable training directly on compressed 3DGS representations without reducing the Gaussian counts, and thus with little loss in model quality. ContraGS leverages codebooks to compactly store a set of Gaussian parameter vectors throughout the training process, thereby significantly reducing memory consumption. While codebooks have been demonstrated to be highly effective at compressing fully trained 3DGS models, directly training using codebook representations is an unsolved challenge. ContraGS solves the problem of learning non-differentiable parameters in codebook-compressed representations by posing parameter estimation as a Bayesian inference problem. To this end, ContraGS provides a framework that effectively uses MCMC sampling to sample over a posterior distribution of these compressed representations. 
We demonstrate that ContraGS significantly reduces peak memory during training (3.49X on average) and accelerates training and rendering (1.36X and 1.88X on average, respectively), while retaining close to state-of-the-art quality.2025-09-03T23:40:17ZSankeerth DurvasulaSharanshangar MuhunthanZain MoustafaRichard ChenRuofan LiangYushi GuanNilesh AhujaNilesh JainSelvakumar PanneerNandita Vijaykumarhttp://arxiv.org/abs/2509.03753v1Memory Optimization for Convex Hull Support Point Queries2025-09-03T22:45:50ZThis paper evaluates several improvements to the memory layout of convex hulls to improve computation times for support point queries. The support point query is a fundamental part of common collision algorithms, and the work presented achieves a significant speedup depending on the number of vertices of the convex hull.2025-09-03T22:45:50Z6 pages, 15 figuresMichael Greerhttp://arxiv.org/abs/2509.03680v1LuxDiT: Lighting Estimation with Video Diffusion Transformer2025-09-03T19:59:20ZEstimating scene lighting from a single image or video remains a longstanding challenge in computer vision and graphics. Learning-based approaches are constrained by the scarcity of ground-truth HDR environment maps, which are expensive to capture and limited in diversity. While recent generative models offer strong priors for image synthesis, lighting estimation remains difficult due to its reliance on indirect visual cues, the need to infer global (non-local) context, and the recovery of high-dynamic-range outputs. We propose LuxDiT, a novel data-driven approach that fine-tunes a video diffusion transformer to generate HDR environment maps conditioned on visual input. Trained on a large synthetic dataset with diverse lighting conditions, our model learns to infer illumination from indirect visual cues and generalizes effectively to real-world scenes. 
To improve semantic alignment between the input and the predicted environment map, we introduce a low-rank adaptation finetuning strategy using a collected dataset of HDR panoramas. Our method produces accurate lighting predictions with realistic angular high-frequency details, outperforming existing state-of-the-art techniques in both quantitative and qualitative evaluations.2025-09-03T19:59:20ZProject page: https://research.nvidia.com/labs/toronto-ai/LuxDiT/Ruofan LiangKai HeZan GojcicIgor GilitschenskiSanja FidlerNandita VijaykumarZian Wanghttp://arxiv.org/abs/2412.00177v3LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting2025-09-03T17:59:08ZWe introduce LumiNet, a novel architecture that leverages generative models and latent intrinsic representations for effective lighting transfer. Given a source image and a target lighting image, LumiNet synthesizes a relit version of the source scene that captures the target's lighting. Our approach makes two key contributions: a data curation strategy from the StyleGAN-based relighting model for our training, and a modified diffusion-based ControlNet that processes both latent intrinsic properties from the source image and latent extrinsic properties from the target image. We further improve lighting transfer through a learned adaptor (MLP) that injects the target's latent extrinsic properties via cross-attention and fine-tuning.
Unlike traditional ControlNet, which generates images with conditional maps from a single scene, LumiNet processes latent representations from two different images - preserving geometry and albedo from the source while transferring lighting characteristics from the target. Experiments demonstrate that our method successfully transfers complex lighting phenomena including specular highlights and indirect illumination across scenes with varying spatial layouts and materials, outperforming existing approaches on challenging indoor scenes using only images as input.2024-11-29T18:59:11ZCorrects an evaluation bug in Table 1 due to a data normalization error. Thanks to the Sony PlayStation team for discovering and reporting the issue. The paper's core contributions, qualitative results, and user study are unaffected. We also include a minor update to the method to further improve result quality. Project page: https://luminet-relight.github.io/Xiaoyan XingKonrad GrohSezer KaraogluTheo GeversAnand Bhattadhttp://arxiv.org/abs/2509.03451v1SmartPoser: Arm Pose Estimation with a Smartphone and Smartwatch Using UWB and IMU Data2025-09-03T16:16:55ZThe ability to track a user's arm pose could be valuable in a wide range of applications, including fitness, rehabilitation, augmented reality input, life logging, and context-aware assistants. Unfortunately, this capability is not readily available to consumers. Systems either require cameras, which carry privacy issues, or utilize multiple worn IMUs or markers. In this work, we describe how an off-the-shelf smartphone and smartwatch can work together to accurately estimate arm pose. Moving beyond prior work, we take advantage of more recent ultra-wideband (UWB) functionality on these devices to capture absolute distance between the two devices. This measurement is the perfect complement to inertial data, which is relative and suffers from drift. 
We quantify the performance of our software-only approach using off-the-shelf devices, showing it can estimate the wrist and elbow joints with a median positional error of 11.0 cm, without the user having to provide training data.2025-09-03T16:16:55ZThe first two listed authors contributed equally. Published at UIST 2023Nathan DeVrioVimal MollynChris Harrison10.1145/3586183.3606821