http://arxiv.org/abs/2603.29592v1
Bioinspired123D: Generative 3D Modeling System for Bioinspired Structures
2026-02-11T15:49:40Z
Generative AI has made rapid progress in text, image, and video synthesis, yet text-to-3D modeling for scientific design remains particularly challenging due to limited controllability and high computational cost. Most existing 3D generative methods rely on meshes, voxels, or point clouds which can be costly to train and difficult to control. We introduce Bioinspired123D, a lightweight and modular code-as-geometry pipeline that generates fabricable 3D structures directly through parametric programs rather than dense visual representations. At the core of Bioinspired123D is Bioinspired3D, a compact language model finetuned to translate natural language design cues into Blender Python scripts encoding smooth, biologically inspired geometries. We curate a domain-specific dataset of over 4,000 bioinspired and geometric design scripts spanning helical, cellular, and tubular motifs with parametric variability. The dataset is expanded and validated through an automated LLM-driven, Blender-based quality control pipeline. Bioinspired3D is then embedded in a graph-based agentic framework that integrates multimodal retrieval-augmented generation and a vision-language model critic to iteratively evaluate, critique, and repair generated scripts. We evaluate performance on a new benchmark for 3D geometry script generation and show that Bioinspired123D demonstrates a near fourfold improvement over its non-finetuned base model, while also outperforming substantially larger state-of-the-art language models despite using far fewer parameters and compute.
By prioritizing code-as-geometry representations, Bioinspired123D enables compute-efficient, controllable, and interpretable text-to-3D generation, lowering barriers to AI-driven scientific discovery in materials and structural design.
2026-02-11T15:49:40Z
Rachel K. Luu, Markus J. Buehler

http://arxiv.org/abs/2506.08043v4
Neural-Augmented Kelvinlet for Real-Time Soft Tissue Deformation Modeling
2026-02-11T14:01:54Z
Accurate and efficient modeling of soft-tissue interactions is fundamental for advancing surgical simulation, surgical robotics, and model-based surgical automation. To achieve real-time latency, classical Finite Element Method (FEM) solvers are often replaced with neural approximations; however, naively training such models in a fully data-driven manner without incorporating physical priors frequently leads to poor generalization and physically implausible predictions. We present a novel physics-informed neural simulation framework that enables real-time prediction of soft-tissue deformations under complex single- and multi-grasper interactions. Our approach integrates Kelvinlet-based analytical priors with large-scale FEM data, capturing both linear and nonlinear tissue responses. This hybrid design improves predictive accuracy and physical plausibility across diverse neural architectures while maintaining the low-latency performance required for interactive applications. We validate our method on challenging surgical manipulation tasks involving standard laparoscopic grasping tools, demonstrating substantial improvements in deformation fidelity and temporal stability over existing baselines. These results establish Kelvinlet-augmented learning as a principled and computationally efficient paradigm for real-time, physics-aware soft-tissue simulation in surgical AI.
2025-06-06T19:22:49Z
Ashkan Shahbazi, Kyvia Pereira, Jon S. Heiselman, Elaheh Akbari, Annie C. Benson, Sepehr Seifi, Xinyuan Liu, Garrison L. Johnston, Jie Ying Wu, Nabil Simaan, Michael I.
Miga, Soheil Kolouri

http://arxiv.org/abs/2507.06109v2
LighthouseGS: Indoor Structure-aware 3D Gaussian Splatting for Panorama-Style Mobile Captures
2026-02-11T13:18:42Z
We introduce LighthouseGS, a practical novel view synthesis framework based on 3D Gaussian Splatting that utilizes simple panorama-style captures from a single mobile device. While convenient, this rotation-dominant motion and narrow baseline make accurate camera pose and 3D point estimation challenging, especially in textureless indoor scenes. To address these challenges, LighthouseGS leverages rough geometric priors, such as mobile device camera poses and monocular depth estimation, and utilizes indoor planar structures. Specifically, we propose a new initialization method called plane scaffold assembly to generate consistent 3D points on these structures, followed by a stable pruning strategy to enhance geometry and optimization stability. Additionally, we present geometric and photometric corrections to resolve inconsistencies from motion drift and auto-exposure in mobile devices. Tested on real and synthetic indoor scenes, LighthouseGS delivers photorealistic rendering, outperforming state-of-the-art methods and enabling applications like panoramic view synthesis and object placement. Project page: https://vision3d-lab.github.io/lighthousegs/
2025-07-08T15:49:53Z
WACV 2026
Seungoh Han, Jaehoon Jang, Hyunsu Kim, Jaeheung Surh, Junhyung Kwak, Hyowon Ha, Kyungdon Joo

http://arxiv.org/abs/2602.09999v1
Faster-GS: Analyzing and Improving Gaussian Splatting Optimization
2026-02-10T17:22:59Z
Recent advances in 3D Gaussian Splatting (3DGS) have focused on accelerating optimization while preserving reconstruction quality. However, many proposed methods entangle implementation-level improvements with fundamental algorithmic modifications or trade performance for fidelity, leading to a fragmented research landscape that complicates fair comparison.
In this work, we consolidate and evaluate the most effective and broadly applicable strategies from prior 3DGS research and augment them with several novel optimizations. We further investigate underexplored aspects of the framework, including numerical stability, Gaussian truncation, and gradient approximation. The resulting system, Faster-GS, provides a rigorously optimized algorithm that we evaluate across a comprehensive suite of benchmarks. Our experiments demonstrate that Faster-GS achieves up to 5$\times$ faster training while maintaining visual quality, establishing a new cost-effective and resource-efficient baseline for 3DGS optimization. Furthermore, we demonstrate that optimizations can be applied to 4D Gaussian reconstruction, leading to efficient non-rigid scene optimization.
2026-02-10T17:22:59Z
Project page: https://fhahlbohm.github.io/faster-gaussian-splatting
Florian Hahlbohm, Linus Franke, Martin Eisemann, Marcus Magnor

http://arxiv.org/abs/2511.11679v2
Free-Boundary Quasiconformal Maps via a Least-squares Operator in Diffeomorphism Optimization
2026-02-10T10:05:55Z
Free-boundary diffeomorphism optimization, an important and widely occurring task in geometric modeling, computer graphics, and biological imaging, requires simultaneously determining a planar target domain and a locally bijective map with well-controlled distortion. We formulate this task through the least-squares quasiconformal (LSQC) operator and establish key structural properties of the LSQC minimizer, including well-posedness under mild conditions, invariance under similarity transformations, and resolution-independent behavior with stability under mesh refinement. We further analyze the sensitivity of the LSQC solution with respect to the Beltrami coefficient, establishing stability and differentiability properties that enable gradient-based optimization over the space of Beltrami coefficients.
To make this differentiable formulation practical at scale and to facilitate the optimization process, we introduce the Spectral Beltrami Network (SBN), a multiscale mesh-spectral surrogate that approximates the LSQC solution operator in a single differentiable forward pass. This yields SBN-Opt, an optimization framework that searches over admissible Beltrami coefficients and pinning conditions to solve free-boundary diffeomorphism objectives with explicit distortion control. Extensive experiments on equiareal parameterization and inconsistent surface registration demonstrate consistent improvements over traditional numerical algorithms.
2025-11-12T03:43:28Z
Zhehao Xu, Lok Ming Lui

http://arxiv.org/abs/2601.05765v2
More Power to the Particles: Analytic Geometry for Partial Optimal Transport-based Fluid simulation
2026-02-09T17:21:47Z
We propose unified data structures and algorithms for free-surface fluid simulations based on partial optimal transport, such as the Power Particles method or Gallouët-Mérigot's scheme. Such methods previously relied on a discretization of the cells by leveraging a classical convex cell clipping algorithm. However, this results in a heavy computational cost and a coarse approximation of the evaluated quantities. In contrast, we propose to analytically construct the generalized Laguerre cells characterized by intersections between Laguerre cells and spheres. This makes it possible to accurately compute the differential quantities used by the Newton algorithm, that is, the areas of the (curved) facets and the volumes of the (generalized) Laguerre cells. This significantly improves the convergence of the Newton algorithm, hence the robustness of the simulations, even in challenging scenarios with high velocities and shocks. Moreover, this drastically reduces the computational cost as compared to previous works.
Based on our data structure, we propose a framework that combines (1) the numerical solution mechanism for partial optimal transport, (2) the fluid simulation scheme and (3) the rendering. The aforementioned three components are implemented on the GPU, providing further speedup and avoiding data transfers. This is made possible by the compactness of our data structure combined with a massively parallel implementation. We report the result of numerical experiments featuring highly detailed, large-scale simulations and high variations of physical properties within the same simulation.
2026-01-09T12:35:45Z
Cyprien Plateau Holleville, Bruno Lévy

http://arxiv.org/abs/2602.08724v1
Rotated Lights for Consistent and Efficient 2D Gaussians Inverse Rendering
2026-02-09T14:34:06Z
Inverse rendering aims to decompose a scene into its geometry, material properties and light conditions under a certain rendering model. It has wide applications, such as view synthesis, relighting, and scene editing. In recent years, inverse rendering methods have been inspired by view synthesis approaches like neural radiance fields and Gaussian splatting, which are capable of efficiently decomposing a scene into its geometry and radiance. They then further estimate the material and lighting that lead to the observed scene radiance. However, the latter step is highly ambiguous and prior works suffer from inaccurate color and baked shadows in their albedo estimation despite their regularization. To this end, we propose RotLight, a simple capturing setup, to address the ambiguity. Compared to a usual capture, RotLight only requires the object to be rotated several times during the process. We show that as few as two rotations are effective in reducing artifacts. To further improve 2DGS-based inverse rendering, we additionally introduce a proxy mesh that not only allows accurate incident light tracing, but also enables a residual constraint and improves global illumination handling.
We demonstrate with both synthetic and real-world datasets that our method achieves superior albedo estimation while keeping efficient computation.
2026-02-09T14:34:06Z
Project Page: https://rotlight-ir.github.io/
Geng Lin, Matthias Zwicker

http://arxiv.org/abs/2602.08642v1
Forget Superresolution, Sample Adaptively (when Path Tracing)
2026-02-09T13:39:20Z
Real-time path tracing increasingly operates under extremely low sampling budgets, often below one sample per pixel, as rendering complexity, resolution, and frame-rate requirements continue to rise. While super-resolution is widely used in production, it uniformly sacrifices spatial detail and cannot exploit variations in noise, reconstruction difficulty, and perceptual importance across the image. Adaptive sampling offers a compelling alternative, but existing end-to-end approaches rely on approximations that break down in sparse regimes.
We introduce an end-to-end adaptive sampling and denoising pipeline explicitly designed for the sub-1-spp regime. Our method uses a stochastic formulation of sample placement that enables gradient estimation despite discrete sampling decisions, allowing stable training of a neural sampler at low sampling budgets. To better align optimization with human perception, we propose a tonemapping-aware training pipeline that integrates differentiable filmic operators and a state-of-the-art perceptual loss, preventing oversampling of regions with low visual impact.
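The abstract does not state which stochastic formulation is used. As an illustration only, one standard way to obtain gradients through a discrete sampling decision is the score-function (REINFORCE) estimator, sketched below for a categorical choice of one pixel. The names `logits` (per-pixel sampler scores) and `cost` (a per-pixel loss after reconstruction) are hypothetical and are not taken from the paper.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score_function_estimate(logits, cost, rng):
    """One-sample estimate of d E[cost] / d logits.

    Draws one pixel index k from softmax(logits) and returns
    cost[k] * d log p(k) / d logits, where the derivative of
    log p(k) with respect to logit j is (1 if j == k else 0) - p[j].
    """
    p = softmax(logits)
    k = rng.choices(range(len(p)), weights=p, k=1)[0]
    return [cost[k] * ((1.0 if j == k else 0.0) - p[j]) for j in range(len(p))]

def expected_estimate(logits, cost):
    """Average of the estimator over every possible draw, weighted by p."""
    p = softmax(logits)
    n = len(p)
    return [
        sum(p[k] * cost[k] * ((1.0 if j == k else 0.0) - p[j]) for k in range(n))
        for j in range(n)
    ]

def exact_gradient(logits, cost):
    """Closed-form gradient of sum_k p[k] * cost[k] with respect to the logits."""
    p = softmax(logits)
    mean_cost = sum(pk * ck for pk, ck in zip(p, cost))
    return [pj * (cj - mean_cost) for pj, cj in zip(p, cost)]
```

Averaged over all draws, the estimator matches the closed-form gradient, which is why a sampler can be trained even though each individual placement decision is discrete. In practice such estimators are usually paired with variance reduction (for example a baseline), which this sketch omits.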
In addition, we introduce a gather-based pyramidal denoising filter and a learnable generalization of albedo demodulation tailored to sparse sampling. Our results show consistent improvements over uniform sparse sampling, with notably better reconstruction of perceptually critical details such as specular highlights and shadow boundaries, and demonstrate that adaptive sampling remains effective even at minimal budgets.
2026-02-09T13:39:20Z
Martin Bálint, Corentin Salaün, Hans-Peter Seidel, Karol Myszkowski

http://arxiv.org/abs/2602.08540v1
TIBR4D: Tracing-Guided Iterative Boundary Refinement for Efficient 4D Gaussian Segmentation
2026-02-09T11:41:06Z
Object-level segmentation in dynamic 4D Gaussian scenes remains challenging due to complex motion, occlusions, and ambiguous boundaries. In this paper, we present an efficient learning-free 4D Gaussian segmentation framework that lifts video segmentation masks to 4D spaces, whose core is a two-stage iterative boundary refinement, TIBR4D. The first stage is an Iterative Gaussian Instance Tracing (IGIT) at the temporal segment level. It progressively refines Gaussian-to-instance probabilities through iterative tracing, and extracts corresponding Gaussian point clouds that better handle occlusions and preserve completeness of object structures compared to existing one-shot threshold-based methods. The second stage is a frame-wise Gaussian Rendering Range Control (RCC) via suppressing highly uncertain Gaussians near object boundaries while retaining their core contributions for more accurate boundaries. Furthermore, a temporal segmentation merging strategy is proposed for IGIT to balance identity consistency and dynamic awareness. Longer segments enforce stronger multi-frame constraints for stable identities, while shorter segments allow identity changes to be captured promptly.
Experiments on HyperNeRF and Neu3D demonstrate that our method produces accurate object Gaussian point clouds with clearer boundaries and higher efficiency compared to SOTA methods.
2026-02-09T11:41:06Z
13 pages, 6 figures, 4 tables
He Wu, Xia Yan, Yanghui Xu, Liegang Xia, Jiazhou Chen

http://arxiv.org/abs/2602.08368v1
T2VTree: User-Centered Visual Analytics for Agent-Assisted Thought-to-Video Authoring
2026-02-09T08:06:24Z
Generative models have substantially expanded video generation capabilities, yet practical thought-to-video creation remains a multi-stage, multi-modal, and decision-intensive process. However, existing tools either hide intermediate decisions behind repeated reruns or expose operator-level workflows that make exploration traces difficult to manage, compare, and reuse. We present T2VTree, a user-centered visual analytics approach for agent-assisted thought-to-video authoring. T2VTree represents the authoring process as a tree visualization. Each node in the tree binds an editable specification (intent, referenced inputs, workflow choice, prompts, and parameters) with the resulting multimodal outputs, making refinement, branching, and provenance inspection directly operable. To reduce the burden of deciding what to do next, a set of collaborating agents translates step-level intent into an executable plan that remains visible and user-editable before execution. We further implement a visual analytics system that integrates branching authoring with in-place preview and stitching for convergent assembly, enabling end-to-end multi-scene creation without leaving the authoring context. We demonstrate T2VTreeVA through two multi-scene case studies and a comparative user study, showing how the T2VTree visualization and editable agent planning support reliable refinement, localized comparison, and practical reuse in real authoring workflows.
T2VTree is available at: https://github.com/tezuka0210/T2VTree.
2026-02-09T08:06:24Z
Zhuoyun Zheng, Yu Dong, Gaorong Liang, Guan Li, Guihua Shan, Shiyu Cheng, Dong Tian, Jianlong Zhou, Jie Liang

http://arxiv.org/abs/2602.08198v1
PEGAsus: 3D Personalization of Geometry and Appearance
2026-02-09T01:41:27Z
We present PEGAsus, a new framework capable of generating Personalized 3D shapes by learning shape concepts at both Geometry and Appearance levels. First, we formulate 3D shape personalization as extracting reusable, category-agnostic geometric and appearance attributes from reference shapes, and composing these attributes with text to generate novel shapes. Second, we design a progressive optimization strategy to learn shape concepts at both the geometry and appearance levels, decoupling the shape concept learning process. Third, we extend our approach to region-wise concept learning, enabling flexible concept extraction, with context-aware and context-free losses. Extensive experimental results show that PEGAsus is able to effectively extract attributes from a wide range of reference shapes and then flexibly compose these concepts with text to synthesize new shapes. This enables fine-grained control over shape generation and supports the creation of diverse, personalized results, even in challenging cross-category scenarios. Both quantitative and qualitative experiments demonstrate that our approach outperforms existing state-of-the-art solutions.
2026-02-09T01:41:27Z
Jingyu Hu, Bin Hu, Ka-Hei Hui, Haipeng Li, Zhengzhe Liu, Daniel Cohen-Or, Chi-Wing Fu

http://arxiv.org/abs/2602.08094v1
Energy-Controllable Time Integration for Elastodynamic Contact
2026-02-08T19:29:47Z
Dynamic simulation of elastic bodies is a longstanding task in engineering and computer graphics. In graphics, numerical integrators like implicit Euler and BDF2 are preferred due to their stability at large time steps, but they tend to dissipate energy uncontrollably.
In contrast, symplectic methods like implicit midpoint can conserve energy but are not unconditionally stable and fail on moderately stiff problems. To address these limitations, we propose a general class of numerical integrators for Hamiltonian problems which are symplectic on linear problems, yet have superior stability on nonlinear problems. With this, we derive a novel energy-controllable time integrator, A-search, a simple modification of implicit Euler that can follow user-specified energy targets, enabling flexible control over energy dissipation or conservation while maintaining stability and physical fidelity. Our method integrates seamlessly with barrier-type energies and allows for inversion-free and penetration-free guarantees, making it well-suited for handling large deformations and complex collisions. Extensive evaluations over a wide range of material parameters and scenes demonstrate that A-search is biased toward keeping energy in low-frequency motion rather than dissipating it, and that it outperforms traditional methods such as BDF2 at similar total running times by maintaining energy and producing more visually desirable simulations.
2026-02-08T19:29:47Z
Kevin You, Juntian Zheng, Minchen Li

http://arxiv.org/abs/2502.09411v2
ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation
2026-02-08T16:05:16Z
Diffusion models enable high-quality and diverse visual content synthesis. However, they struggle to generate rare or unseen concepts. To address this challenge, we explore the usage of Retrieval-Augmented Generation (RAG) with image generation models. We propose ImageRAG, a method that dynamically retrieves relevant images based on a given text prompt, and uses them as context to guide the generation process. Prior approaches that used retrieved images to improve generation trained models specifically for retrieval-based generation.
In contrast, ImageRAG leverages the capabilities of existing image conditioning models, and does not require RAG-specific training. Our approach is highly adaptable and can be applied across different model types, showing significant improvement in generating rare and fine-grained concepts using different base models.
Our project page is available at: https://rotem-shalev.github.io/ImageRAG
2025-02-13T15:36:12Z
Rotem Shalev-Arkushin, Rinon Gal, Amit H. Bermano, Ohad Fried

http://arxiv.org/abs/2602.07860v1
Recovering 3D Shapes from Ultra-Fast Motion-Blurred Images
2026-02-08T08:17:35Z
We consider the problem of 3D shape recovery from ultra-fast motion-blurred images. While 3D reconstruction from static images has been extensively studied, recovering geometry from extreme motion-blurred images remains challenging. Such scenarios frequently occur in both natural and industrial settings, such as fast-moving objects in sports (e.g., balls) or rotating machinery, where rapid motion distorts object appearance and makes traditional 3D reconstruction techniques like Multi-View Stereo (MVS) ineffective.
In this paper, we propose a novel inverse rendering approach for shape recovery from ultra-fast motion-blurred images. While conventional rendering techniques typically synthesize blur by averaging across multiple frames, we identify a major computational bottleneck in the repeated computation of barycentric weights. To address this, we propose a fast barycentric coordinate solver, which significantly reduces computational overhead and achieves a speedup of up to 4.57x, enabling efficient and photorealistic simulation of high-speed motion. Crucially, our method is fully differentiable, allowing gradients to propagate from rendered images to the underlying 3D shape, thereby facilitating shape recovery through inverse rendering.
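To make the bottleneck concrete, the sketch below shows the conventional baseline the paragraph describes: blur synthesized by averaging over time samples, with barycentric weights recomputed for every sample. It is a simplified 2D illustration under stated assumptions (a single triangle, pure translation, binary coverage) and is not the authors' fast solver; the function names and parameters are hypothetical.

```python
def barycentric(p, a, b, c):
    """Barycentric weights (wa, wb, wc) of 2D point p in triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb

def blurred_coverage(p, tri, velocity, n_frames):
    """Fraction of time samples in which p lies inside the translating triangle.

    The triangle moves by velocity * t for t evenly spaced in [0, 1].
    Barycentric weights are recomputed from scratch for every sample,
    which is the repeated work identified as the bottleneck.
    """
    hits = 0
    for i in range(n_frames):
        t = i / (n_frames - 1) if n_frames > 1 else 0.0
        dx, dy = velocity[0] * t, velocity[1] * t
        moved = [(x + dx, y + dy) for x, y in tri]
        weights = barycentric(p, *moved)
        if min(weights) >= 0.0:  # inside or on the boundary
            hits += 1
    return hits / n_frames
```

The cost grows linearly with both the number of time samples and the number of pixel-triangle pairs, which is why a cheaper way to obtain the weights across time samples can pay off.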
We validate our approach on two representative motion types: rapid translation and rotation. Experimental results demonstrate that our method enables efficient and realistic modeling of ultra-fast moving objects in the forward simulation. Moreover, it successfully recovers 3D shapes from 2D imagery of objects undergoing extreme translational and rotational motion, advancing the boundaries of vision-based 3D reconstruction. Project page: https://maxmilite.github.io/rec-from-ultrafast-blur/
2026-02-08T08:17:35Z
Accepted by 3DV 2026. Project page: https://maxmilite.github.io/rec-from-ultrafast-blur/
Fei Yu, Shudan Guo, Shiqing Xin, Beibei Wang, Haisen Zhao, Wenzheng Chen

http://arxiv.org/abs/2602.07782v1
TABI: Tight and Balanced Interactive Atlas Packing
2026-02-08T02:45:26Z
Atlas packing is a key step in many computer graphics applications. Packing algorithms seek to arrange a set of charts within a fixed-size atlas with as little downscaling as possible. Many packing applications such as content creation tools, dynamic atlas generation for video games, and texture space shading require on-the-fly interactive atlas packing. Unfortunately, while many methods have been developed for generating tight high-quality packings, they are designed for offline settings and have running times two or more orders of magnitude greater than what is required for interactive performance. While real-time GPU packing methods exist, they significantly downscale packed charts compared to offline methods. We introduce a GPU packing method that targets interactive speeds, provides packing quality approaching that of offline methods, and supports flexible user control over the tradeoff between performance and quality. We observe that current real-time packing methods leave large gaps between charts and often produce asymmetric, or poorly balanced, packings. These artifacts dramatically degrade packing quality. Our Tight And Balanced method eliminates these artifacts while retaining Interactive performance.
TABI generates tight packings by compacting empty space between irregularly shaped charts both horizontally and vertically, using two approximations of chart shape that support efficient parallel processing. We balance packing outputs by automatically adjusting atlas row widths and orientations to accommodate varying chart heights. We show that our method significantly reduces chart downscaling compared to existing interactive methods while remaining orders of magnitude faster than offline alternatives.
2026-02-08T02:45:26Z
Floria Gu, Nicholas Vining, Alla Sheffer