http://arxiv.org/abs/2510.18658v1 MorphModes: Non-rigid Registration via Adaptive Skinning Eigenmodes 2025-10-21T14:08:19Z

Non-rigid registration is a crucial task with applications in medical imaging, industrial robotics, computer vision, and entertainment. Standard approaches accomplish this task using variations of the Non-Rigid Iterative Closest Point (NRICP) algorithm, which are prone to local minima and sensitive to initial conditions. We instead formulate the non-rigid registration problem as a Signed Distance Function (SDF) matching optimization problem, which provides richer shape information than traditional ICP methods. To avoid degenerate solutions, we propose to use a smooth Skinning Eigenmode subspace to parameterize the optimization problem. Finally, we propose an adaptive subspace optimization scheme to allow the resolution of localized deformations within the optimization. The result is a non-rigid registration algorithm that is more robust than NRICP, without the parameter sensitivity present in other SDF-matching approaches.

2025-10-21T14:08:19Z Gabrielle Browne Mengfei Liu Eitan Grinspun Otman Benchekroun http://arxiv.org/abs/2510.18278v1 ORDENA: ORigin-DEstiNAtion data exploration 2025-10-21T04:07:09Z

Analyzing origin-destination flows is an important problem that has been extensively investigated in several scientific fields, particularly by the visualization community. The problem becomes especially challenging when involving massive data, demanding mechanisms such as data aggregation and interactive filtering to make the exploratory process feasible. However, data aggregation tends to smooth out certain patterns, and deciding which data should be filtered is not straightforward. In this work, we propose ORDENA, a visual analytic tool to explore origin and destination data. ORDENA is built upon a simple and intuitive scatter plot where the horizontal and vertical axes correspond to origins and destinations. Therefore, each origin-destination flow is represented as a point in the scatter plot. How the points are organized in the plot layout reveals important spatial phenomena present in the data. Moreover, ORDENA provides explainability resources that allow users to better understand the relation between origin-destination flows and associated attributes. We illustrate ORDENA's effectiveness in a set of case studies, which were elaborated in collaboration with domain experts. The proposed tool has also been evaluated by domain experts not involved in its development, who provided positive feedback about ORDENA.

2025-10-21T04:07:09Z Karelia Salinas Victor Barella André Luiz Cunha Gabriel Martins de Oliveira Thales Viera Luis Gustavo Nonato http://arxiv.org/abs/2505.19713v3 CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward 2025-10-21T04:03:20Z

In this work, we introduce CAD-Coder, a novel framework that reformulates text-to-CAD as the generation of CadQuery scripts - a Python-based, parametric CAD language. This representation enables direct geometric validation, a richer modeling vocabulary, and seamless integration with existing LLMs. To further enhance code validity and geometric fidelity, we propose a two-stage learning pipeline: (1) supervised fine-tuning on paired text-CadQuery data, and (2) reinforcement learning with Group Relative Policy Optimization (GRPO), guided by a CAD-specific reward comprising both a geometric reward (Chamfer Distance) and a format reward. We also introduce a chain-of-thought (CoT) planning process to improve model reasoning, and construct a large-scale, high-quality dataset of 110K text-CadQuery-3D model triplets and 1.5K CoT samples via an automated pipeline. Extensive experiments demonstrate that CAD-Coder enables LLMs to generate diverse, valid, and complex CAD models directly from natural language, advancing the state of the art of text-to-CAD generation and geometric reasoning.
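
The geometric reward named above is based on Chamfer Distance. The abstract does not give the exact variant, so the following is only a minimal NumPy sketch of one common form (symmetric, squared distances, mean over points). The function names `chamfer_distance` and `geometric_reward`, the `scale` parameter, and the exponential mapping from distance to reward are all hypothetical and are not taken from the paper.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumption: squared Euclidean distances averaged over each set;
    other variants (unsquared, summed) are also in common use.
    """
    # Pairwise squared distances via broadcasting, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff * diff, axis=-1)
    # Mean nearest-neighbour distance from a to b, plus from b to a.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


def geometric_reward(pred: np.ndarray, target: np.ndarray, scale: float = 1.0) -> float:
    """Hypothetical reward in (0, 1]: equals 1 when the point sets coincide."""
    return float(np.exp(-scale * chamfer_distance(pred, target)))
```

In this sketch the point sets would be sampled from the surfaces of the generated and reference models; the brute-force pairwise matrix is only practical for a few thousand points per set.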

2025-05-26T09:01:56Z Advances in Neural Information Processing Systems 38 (NeurIPS 2025), pp. 59765-59789 Yandong Guan Xilin Wang Ximing Xing Jing Zhang Dong Xu Qian Yu http://arxiv.org/abs/2510.03308v2 Creative synthesis of kinematic mechanisms 2025-10-20T16:07:47Z

In this paper, we formulate the problem of kinematic synthesis for planar linkages as a cross-domain image generation task. We develop a planar linkages dataset using RGB image representations, covering a range of mechanisms: from simple types such as crank-rocker and crank-slider to more complex eight-bar linkages like Jansen's mechanism. A shared-latent variational autoencoder (VAE) is employed to explore the potential of image generative models for synthesizing unseen motion curves and simulating novel kinematics. By encoding the drawing speed of trajectory points as color gradients, the same architecture also supports kinematic synthesis conditioned on both trajectory shape and velocity profiles. We validate our method on three datasets of increasing complexity: a standard four-bar linkage set, a mixed set of four-bar and crank-slider mechanisms, and a complex set including multi-loop mechanisms. Preliminary results demonstrate the effectiveness of image-based representations for generative mechanical design, showing that mechanisms with revolute and prismatic joints, and potentially cams and gears, can be represented and synthesized within a unified image generation framework.

2025-09-30T19:32:30Z 6 pages, 6 figures Jiong Lin Jialong Ning Judah Goldfeder Hod Lipson http://arxiv.org/abs/2106.07718v5 HUMAP: Hierarchical Uniform Manifold Approximation and Projection 2025-10-20T14:04:17Z

Dimensionality reduction (DR) techniques help analysts to understand patterns in high-dimensional spaces. These techniques, often represented by scatter plots, are employed in diverse science domains and facilitate similarity analysis among clusters and data samples. For datasets containing many granularities or when analysis follows the information visualization mantra, hierarchical DR techniques are the most suitable approach since they present major structures beforehand and details on demand. This work presents HUMAP, a novel hierarchical dimensionality reduction technique designed to be flexible in preserving local and global structures and to preserve the mental map throughout hierarchical exploration. We provide empirical evidence of our technique's superiority compared with current hierarchical approaches and show a case study applying HUMAP for dataset labelling.

2021-06-14T19:27:54Z Wilson E. Marcílio-Jr Danilo M. Eler Fernando V. Paulovich Rafael M. Martins 10.1109/TVCG.2024.3471181 http://arxiv.org/abs/2510.17101v1 Shape-aware Inertial Poser: Motion Tracking for Humans with Diverse Shapes Using Sparse Inertial Sensors 2025-10-20T02:20:31Z

Human motion capture with sparse inertial sensors has gained significant attention recently. However, existing methods almost exclusively rely on a template adult body shape to model the training data, which poses challenges when generalizing to individuals with largely different body shapes (such as a child). This is primarily due to the variation in IMU-measured acceleration caused by changes in body shape. To fill this gap, we propose Shape-aware Inertial Poser (SAIP), the first solution considering body shape differences in sparse inertial-based motion capture. Specifically, we decompose the sensor measurements related to shape and pose in order to effectively model their joint correlations. Firstly, we train a regression model to transfer the IMU-measured accelerations of a real body to match the template adult body model, compensating for the shape-related sensor measurements. Then, we can easily follow the state-of-the-art methods to estimate the full body motions of the template-shaped body. Finally, we utilize a second regression model to map the joint velocities back to the real body, combined with a shape-aware physical optimization strategy to calculate global motions on the subject. Furthermore, our method relies on body shape awareness, introducing the first inertial shape estimation scheme. This is accomplished by modeling the shape-conditioned IMU-pose correlation using an MLP-based network. To validate the effectiveness of SAIP, we also present the first IMU motion capture dataset containing individuals of different body sizes. This dataset features 10 children and 10 adults, with heights ranging from 110 cm to 190 cm, and a total of 400 minutes of paired IMU-Motion samples. Extensive experimental results demonstrate that SAIP can effectively handle motion capture tasks for diverse body shapes. The code and dataset are available at https://github.com/yinlu5942/SAIP.

2025-10-20T02:20:31Z Accepted by SIGGRAPH Asia 2025 (TOG) Lu Yin Ziying Shi Yinghao Wu Xinyu Yi Feng Xu Shihui Guo http://arxiv.org/abs/2510.16966v1 A Scalable In Transit Solution for Comprehensive Exploration of Simulation Data 2025-10-19T19:13:53Z

As simulations produce more data than available disk space on supercomputers, many simulations are employing in situ analysis and visualization to reduce the amount of data that needs to be stored. While in situ visualization offers potential for substantial data reduction, its efficacy is hindered by the need for a priori knowledge. First, we need to know what visualization parameters to use to highlight features of interest. Second, we do not know ahead of time how many resources will be needed to run the in situ workflows, e.g., how many compute nodes will be needed for in situ work. In this work, we present SeerX, a lightweight, scalable in-transit in situ service that supports dynamic resource allocation and lossy compression of 3D simulation data. SeerX enables multiple simulations to offload analysis to a shared, elastic service infrastructure without MPI synchronization.

2025-10-19T19:13:53Z Paascal Grosset James Ahrens http://arxiv.org/abs/2510.16833v1 From Mannequin to Human: A Pose-Aware and Identity-Preserving Video Generation Framework for Lifelike Clothing Display 2025-10-19T13:42:03Z

Mannequin-based clothing displays offer a cost-effective alternative to real-model showcases for online fashion presentation, but lack realism and expressive detail. To overcome this limitation, we introduce a new task called mannequin-to-human (M2H) video generation, which aims to synthesize identity-controllable, photorealistic human videos from footage of mannequins. We propose M2HVideo, a pose-aware and identity-preserving video generation framework that addresses two key challenges: the misalignment between head and body motion, and identity drift caused by temporal modeling. In particular, M2HVideo incorporates a dynamic pose-aware head encoder that fuses facial semantics with body pose to produce consistent identity embeddings across frames. To address the loss of fine facial details due to latent space compression, we introduce a mirror loss applied in pixel space through a denoising diffusion implicit model (DDIM)-based one-step denoising. Additionally, we design a distribution-aware adapter that aligns statistical distributions of identity and clothing features to enhance temporal coherence. Extensive experiments on the UBC fashion dataset, our self-constructed ASOS dataset, and the newly collected MannequinVideos dataset captured on-site demonstrate that M2HVideo achieves superior performance in terms of clothing consistency, identity preservation, and video fidelity in comparison to state-of-the-art methods.

2025-10-19T13:42:03Z Xiangyu Mu Dongliang Zhou Jie Hou Haijun Zhang Weili Guan http://arxiv.org/abs/2510.14980v2 Agentic Design of Compositional Machines 2025-10-19T05:35:03Z

The design of complex machines stands as both a marker of human intelligence and a foundation of engineering practice. Given recent advances in large language models (LLMs), we ask whether they, too, can learn to create. We approach this question through the lens of compositional machine design: a task in which machines are assembled from standardized components to meet functional demands like locomotion or manipulation in a simulated physical environment. With this simplification, machine design is expressed as writing XML-like code that explicitly specifies pairwise part connections. To support this investigation, we introduce BesiegeField, a testbed built on the machine-building game Besiege, which enables part-based construction, physical simulation and reward-driven evaluation. Using BesiegeField, we benchmark state-of-the-art LLMs with agentic workflows and identify key capabilities required for success, including spatial reasoning, strategic assembly, and instruction-following. As current open-source models fall short, we explore reinforcement learning (RL) as a path to improvement: we curate a cold-start dataset, conduct RL finetuning experiments, and highlight open challenges at the intersection of language, machine design, and physical reasoning.

2025-10-16T17:59:58Z 75 pages, 31 figures, Project Page: https://besiegefield.github.io Wenqian Zhang Weiyang Liu Zhen Liu http://arxiv.org/abs/2510.16684v1 Filtering of Small Components for Isosurface Generation 2025-10-19T02:08:05Z

Let $f: \mathbb{R}^3 \rightarrow \mathbb{R}$ be a scalar field. An isosurface is a piecewise linear approximation of a level set $f^{-1}(σ)$ for some $σ\in \mathbb{R}$ built from some regular grid sampling of $f$. Isosurfaces constructed from scanned data such as CT scans or MRIs often contain extremely small components that distract from the visualization and do not form part of any geometric model produced from the data. Simple prefiltering of the data can remove such small components while having no effect on the large components that form the body of the visualization. We present experimental results on such filtering.

2025-10-19T02:08:05Z 8 pages, 6 figures, 5 tables Devin Zhao Rephael Wenger http://arxiv.org/abs/2510.21785v1 Multi-Agent Pose Uncertainty: A Differentiable Rendering Cramér-Rao Bound 2025-10-18T23:21:02Z

Pose estimation is essential for many applications within computer vision and robotics. Despite its uses, few works provide rigorous uncertainty quantification for poses under dense or learned models. We derive a closed-form lower bound on the covariance of camera pose estimates by treating a differentiable renderer as a measurement function. Linearizing image formation with respect to a small pose perturbation on the manifold yields a render-aware Cramér-Rao bound. Our approach reduces to classical bundle-adjustment uncertainty, ensuring continuity with vision theory. It also naturally extends to multi-agent settings by fusing Fisher information across cameras. Our statistical formulation has downstream applications for tasks such as cooperative perception and novel view synthesis without requiring explicit keypoint correspondences.
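
The bound described above can be written out as a short derivation sketch. The abstract does not state the noise model or notation, so the following assumes additive zero-mean Gaussian image noise with covariance $\boldsymbol{\Sigma}$, a renderer $h$ evaluated at a small pose perturbation $\boldsymbol{\xi}$ on the manifold, an unbiased estimator $\hat{\boldsymbol{\xi}}$, and noise that is independent across the $K$ cameras. All symbols are illustrative choices, not the paper's.

```latex
\begin{aligned}
\mathbf{y} &= h(\boldsymbol{\xi}) + \boldsymbol{\varepsilon},
  \qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}) \\
\mathbf{J} &= \left.\frac{\partial h}{\partial \boldsymbol{\xi}}\right|_{\boldsymbol{\xi} = \mathbf{0}} \\
\mathbf{F} &= \mathbf{J}^{\top} \boldsymbol{\Sigma}^{-1} \mathbf{J} \\
\operatorname{Cov}(\hat{\boldsymbol{\xi}}) &\succeq \mathbf{F}^{-1} \\
\mathbf{F}_{\text{fused}} &= \sum_{k=1}^{K} \mathbf{J}_k^{\top} \boldsymbol{\Sigma}_k^{-1} \mathbf{J}_k
\end{aligned}
```

Here $\mathbf{J}$ is the Jacobian of the rendered image with respect to the pose perturbation, $\mathbf{F}$ is the Fisher information, and the last line uses the additivity of Fisher information for independent measurements, with each $\mathbf{J}_k$ taken with respect to the same shared parameters.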

2025-10-18T23:21:02Z 5 pages, 3 figures, 1 table. Presented at IEEE/CVF International Conference on Computer Vision (ICCV 2025) and IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025) Arun Muthukkumar http://arxiv.org/abs/2509.25600v2 MoReFlow: Motion Retargeting Learning through Unsupervised Flow Matching 2025-10-18T06:23:55Z

Motion retargeting holds the promise of offering a larger set of motion data for characters and robots with different morphologies. Many prior works have approached this problem via either handcrafted constraints or paired motion datasets, limiting their applicability to humanoid characters or narrow behaviors such as locomotion. Moreover, they often assume a fixed notion of retargeting, overlooking domain-specific objectives like style preservation in animation or task-space alignment in robotics. In this work, we propose MoReFlow, Motion Retargeting via Flow Matching, an unsupervised framework that learns correspondences between characters' motion embedding spaces. Our method consists of two stages. First, we train tokenized motion embeddings for each character using a VQ-VAE, yielding compact latent representations. Then, we employ flow matching with conditional coupling to align the latent spaces across characters, which simultaneously learns conditioned and unconditioned matching to achieve robust but flexible retargeting. Once trained, MoReFlow enables flexible and reversible retargeting without requiring paired data. Experiments demonstrate that MoReFlow produces high-quality motions across diverse characters and tasks, offering improved controllability, generalization, and motion realism compared to the baselines.
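
The second stage above relies on flow matching. As a rough illustration of the generic technique only, the sketch below builds the training target for linear-path flow matching between two sets of latent vectors. It is not MoReFlow's conditional coupling scheme, which the abstract does not specify; the function names, array shapes, and the straight-line interpolant are assumptions made for illustration.

```python
import numpy as np


def flow_matching_pair(x0: np.ndarray, x1: np.ndarray, t: np.ndarray):
    """Interpolated points and target velocities for linear-path flow matching.

    x0, x1: source and target latents, shape (B, D) (hypothetical shapes).
    t:      interpolation times in [0, 1], shape (B,).
    """
    t = t[:, None]                      # broadcast time over the latent dimension
    x_t = (1.0 - t) * x0 + t * x1       # point on the straight line from x0 to x1
    v_target = x1 - x0                  # constant velocity along that line
    return x_t, v_target


def flow_matching_loss(v_pred: np.ndarray, v_target: np.ndarray) -> float:
    """Mean squared error between predicted and target velocities."""
    return float(np.mean(np.sum((v_pred - v_target) ** 2, axis=-1)))
```

A learned velocity network would take `x_t` and `t` (plus any conditioning) as input and be trained to minimize this loss; integrating the learned field from `t = 0` to `t = 1` then maps source latents toward the target distribution.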

2025-09-29T23:52:47Z Wontaek Kim Tianyu Li Sehoon Ha http://arxiv.org/abs/2510.16312v1 Predictability of Complex Systems 2025-10-18T02:50:08Z

The study of complex systems has attracted widespread attention from researchers in the fields of natural sciences, social sciences, and engineering. Prediction is one of the central issues in this field. Although most related studies have focused on prediction methods, research on the predictability of complex systems has received increasing attention across disciplines--aiming to provide theories and tools to address a key question: What are the limits of prediction accuracy? Predictability itself can serve as an important feature for characterizing complex systems, and accurate estimation of predictability can provide a benchmark for the study of prediction algorithms. This allows researchers to clearly identify the gap between current prediction accuracy and theoretical limits, thereby helping them determine whether there is still significant room to improve existing algorithms. More importantly, investigating predictability often requires the development of new theories and methods, which can further inspire the design of more effective algorithms. Over the past few decades, this field has undergone significant evolution. In particular, the rapid development of data science has introduced a wealth of data-driven approaches for understanding and quantifying predictability. This review summarizes representative achievements, integrating both data-driven and mechanistic perspectives. After a brief introduction to the significance of the topic in focus, we will explore three core aspects: the predictability of time series, the predictability of network structures, and the predictability of dynamical processes. Finally, we will provide extensive application examples across various fields and outline open challenges for future research.

2025-10-18T02:50:08Z En Xu Yilin Bi Hongwei Hu Xin Chen Zhiwen Yu Yong Li Yanqing Hu Tao Zhou http://arxiv.org/abs/2510.16147v1 Procedural Scene Programs for Open-Universe Scene Generation: LLM-Free Error Correction via Program Search 2025-10-17T18:32:05Z

Synthesizing 3D scenes from open-vocabulary text descriptions is a challenging, important, and recently popular application. One of its critical subproblems is layout generation: given a set of objects, lay them out to produce a scene matching the input description. Nearly all recent work adopts a declarative paradigm for this problem: using an LLM to generate a specification of constraints between objects, then solving those constraints to produce the final layout. In contrast, we explore an alternative imperative paradigm, in which an LLM iteratively places objects, with each object's position and orientation computed as a function of previously placed objects. The imperative approach allows for a simpler scene specification language while also handling a wider variety and larger complexity of scenes. We further improve the robustness of our imperative scheme by developing an error correction mechanism that iteratively improves the scene's validity while staying as close as possible to the original layout generated by the LLM. In forced-choice perceptual studies, participants preferred layouts generated by our imperative approach 82% and 94% of the time when compared against two declarative layout generation methods. We also present a simple, automated evaluation metric for 3D scene layout generation that aligns well with human preferences.

2025-10-17T18:32:05Z To appear in SIGGRAPH Asia 2025 Maxim Gumin Do Heon Han Seung Jean Yoo Aditya Ganeshan R. Kenny Jones Kailiang Fu Rio Aguina-Kang Stewart Morris Daniel Ritchie http://arxiv.org/abs/2510.16136v1 GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer 2025-10-17T18:22:04Z

Transferring appearance to 3D assets using different representations of the appearance object - such as images or text - has garnered interest due to its wide range of applications in industries like gaming, augmented reality, and digital content creation. However, state-of-the-art methods still fail when the geometry between the input and appearance objects is significantly different. A straightforward approach is to directly apply a 3D generative model, but we show that this ultimately fails to produce appealing results. Instead, we propose a principled approach inspired by universal guidance. Given a pretrained rectified flow model conditioned on image or text, our training-free method interacts with the sampling process by periodically adding guidance. This guidance can be modeled as a differentiable loss function, and we experiment with two different types of guidance including part-aware losses for appearance and self-similarity. Our experiments show that our approach successfully transfers texture and geometric details to the input 3D asset, outperforming baselines both qualitatively and quantitatively. We also show that traditional metrics are not suitable for evaluating the task because they cannot focus on local details or compare dissimilar inputs in the absence of ground-truth data. We thus evaluate appearance transfer quality with a GPT-based system objectively ranking outputs, ensuring robust and human-like assessment, as further confirmed by our user study. Beyond showcased scenarios, our method is general and could be extended to different types of diffusion models and guidance functions.

2025-10-17T18:22:04Z NeurIPS 2025. Project Page: https://sayands.github.io/guideflow3d/ Sayan Deb Sarkar Sinisa Stekovic Vincent Lepetit Iro Armeni