Two-chart Beltrami Optimization for Distortion-Controlled Spherical Bijection with Application to Brain Surface Registration

2026-02-04T02:38:39Z

Many genus-0 surface mapping tasks such as landmark alignment, feature matching, and image-driven registration, can be reduced (via an initial spherical conformal map) to optimizing a spherical self-homeomorphism with controlled distortion. However, existing works lack efficient mechanisms to control the geometric distortion of the resulting mapping. To resolve this issue, we formulate this as a Beltrami-space optimization problem, where the angle distortion is encoded explicitly by the Beltrami differential and bijectivity can be enforced through the constraint $\|μ\|_{\infty}<1$. To make this practical on the sphere, we introduce the Spherical Beltrami Differential (SBD), a two-chart representation of quasiconformal self-maps of the unit sphere $\mathbb{S}^2$, together with cross-chart consistency conditions that yield a globally bijective spherical deformation (up to conformal automorphisms). Building on the Spectral Beltrami Network, we develop BOOST, a differentiable optimization framework that updates two Beltrami fields to minimize task-driven losses while regularizing distortion and enforcing consistency along the seam. Experiments on large-deformation landmark matching and intensity-based spherical registration demonstrate improved task performance meanwhile maintaining controlled distortion and robust bijective behavior. We also apply the method to cortical surface registration by aligning sulcal landmarks and matching cortical sulcal depth, achieving comparative or better registration performance without sacrificing geometric validity.

Role of Graphics in Disaster Communication: Practitioner Perspectives on Use, Challenges, and Inclusivity

2026-02-04T02:00:39Z

Information graphics, such as hazard maps, evacuation diagrams, and pictorial action guides, are widely used in disaster risk communication. These visuals are important because they convey hazard information quickly, reduce reliance on lengthy text, and support decision-making in time-critical situations. However, despite their importance, disaster information graphics do not work equally well for all audiences. In practice, many graphics remain difficult to interpret, and their accessibility for vulnerable populations is still uneven and underexplored. Despite their central role, there has been little empirical work examining how graphics shape disaster communication, what challenges practitioners face in using them, and, most importantly, how inclusive current disaster graphics are in real-world settings. To address this gap, we examine how information graphics are currently produced and used in disaster communication, what issues emerge in practice, and how inclusivity is addressed. We conducted semi-structured interviews with disaster communication practitioners and researchers to examine the role of graphics across preparedness, warning, and response contexts, as well as the barriers experienced by vulnerable communities. Our findings show that graphics are widely expected and heavily relied upon, yet significant accessibility gaps persist for groups such as people with vision impairments, older adults, and culturally and linguistically diverse communities. Participants also highlighted that inclusive adaptations are difficult to achieve during unfolding emergencies due to operational constraints, limited guidance, and resource barriers. Based on these findings, we outline recommendations for disaster management agencies and graphic designers and identify research directions for technological and adaptive support to make disaster graphics more inclusive at scale.

Multi-threaded Recast-Based A* Pathfinding for Scalable Navigation in Dynamic Game Environments

2026-02-04T01:47:18Z

While the A* algorithm remains the industry standard for game pathfinding, its integration into dynamic 3D environments faces trade-offs between computational performance and visual realism. This paper proposes a multi-threaded framework that enhances standard A* through Recast-based mesh generation, Bezier-curve trajectory smoothing, and density analysis for crowd coordination. We evaluate our system across ten incremental phases, from 2D mazes to complex multi-level dynamic worlds. Experimental results demonstrate that the framework maintains 350+ FPS with 1000 simultaneous agents and achieves collision-free crowd navigation through density-aware path coordination.

ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs

2026-02-03T18:58:31Z

Methods for finetuning generative models for concept-driven personalization generally achieve strong results for subject-driven or style-driven generation. Recently, low-rank adaptations (LoRA) have been proposed as a parameter-efficient way of achieving concept-driven personalization. While recent work explores the combination of separate LoRAs to achieve joint generation of learned styles and subjects, existing techniques do not reliably address the problem; they often compromise either subject fidelity or style fidelity. We propose ZipLoRA, a method to cheaply and effectively merge independently trained style and subject LoRAs in order to achieve generation of any user-provided subject in any user-provided style. Experiments on a wide range of subject and style combinations show that ZipLoRA can generate compelling results with meaningful improvements over baselines in subject and style fidelity while preserving the ability to recontextualize. Project page: https://ziplora.github.io

Continuous Control of Editing Models via Adaptive-Origin Guidance

2026-02-03T18:33:39Z

Diffusion-based editing models have emerged as a powerful tool for semantic image and video manipulation. However, existing models lack a mechanism for smoothly controlling the intensity of text-guided edits. In standard text-conditioned generation, Classifier-Free Guidance (CFG) impacts prompt adherence, suggesting it as a potential control for edit intensity in editing models. However, we show that scaling CFG in these models does not produce a smooth transition between the input and the edited result. We attribute this behavior to the unconditional prediction, which serves as the guidance origin and dominates the generation at low guidance scales, while representing an arbitrary manipulation of the input content. To enable continuous control, we introduce Adaptive-Origin Guidance (AdaOr), a method that adjusts this standard guidance origin with an identity-conditioned adaptive origin, using an identity instruction corresponding to the identity manipulation. By interpolating this identity prediction with the standard unconditional prediction according to the edit strength, we ensure a continuous transition from the input to the edited result. We evaluate our method on image and video editing tasks, demonstrating that it provides smoother and more consistent control compared to current slider-based editing approaches. Our method incorporates an identity instruction into the standard training framework, enabling fine-grained control at inference time without per-edit procedure or reliance on specialized datasets.

See-through: Single-image Layer Decomposition for Anime Characters

2026-02-03T17:12:36Z

We introduce a framework that automates the transformation of static anime illustrations into manipulatable 2.5D models. Current professional workflows require tedious manual segmentation and the artistic ``hallucination'' of occluded regions to enable motion. Our approach overcomes this by decomposing a single image into fully inpainted, semantically distinct layers with inferred drawing orders. To address the scarcity of training data, we introduce a scalable engine that bootstraps high-quality supervision from commercial Live2D models, capturing pixel-perfect semantics and hidden geometry. Our methodology couples a diffusion-based Body Part Consistency Module, which enforces global geometric coherence, with a pixel-level pseudo-depth inference mechanism. This combination resolves the intricate stratification of anime characters, e.g., interleaving hair strands, allowing for dynamic layer reconstruction. We demonstrate that our approach yields high-fidelity, manipulatable models suitable for professional, real-time animation applications.

Occlusion-Free Conformal Lensing for Spatiotemporal Visualization in 3D Urban Analytics

2026-02-03T17:05:05Z

The visualization of temporal data on urban buildings, such as shadows, noise, and solar potential, plays a critical role in the analysis of dynamic urban phenomena. However, in dense and geographically constrained 3D urban environments, visual representations of time-varying building data often suffer from occlusion and visual clutter. To address these two challenges, we introduce an immersive lens visualization that integrates a view-dependent cutaway de-occlusion technique and a temporal display derived from a conformal mapping algorithm. The mapping process first partitions irregular building footprints into smaller, sufficiently regular subregions that serve as structural primitives. These subregions are then seamlessly recombined to form a conformal, layered layout for our temporal lens visualization. The view-responsive cutaway is inspired by traditional architectural illustrations, preserving the overall layout of the building and its surroundings to maintain users' sense of spatial orientation. This lens design enables the occlusion-free embedding of shape-adaptive temporal displays across building facades on demand, supporting rapid time-space association for the discovery, access and interpretation of spatiotemporal urban patterns. Guided by domain and design goals, we outline the rationale behind the lens visual and interaction design choices, such as the encoding of time progression and temporal values in the conforming lens image. A user study compares our approach against conventional juxtaposition and x-ray spatiotemporal designs. Results validate the usage and utility of our lens, showing that it improves task accuracy and completion time, reduces navigation effort, and increases user confidence. From these findings, we distill design recommendations and promising directions for future research on spatially-embedded lenses in 3D visualization and urban analytics.

Point Vortex Dynamics on Closed Surfaces

2026-02-03T16:05:19Z

The theory of point vortex dynamics has existed since Kirchhoff's proposal in 1891 and is still under development with connections to many fields in mathematics. As a strong simplification of the concept of vorticity it excels in computational speed for vorticity based fluid simulations at the cost of accuracy. Recent finding by Stefanella Boatto and Jair Koiller allowed the extension of this theory on to closed surfaces. A comprehensive guide to point vortex dynamics on closed surfaces with genus zero and vanishing total vorticity is presented here. Additionally fundamental knowledge of fluid dynamics and surfaces are explained in a way to unify the theory of point vortex dynamics of the plane, the sphere and closed surfaces together with implementation details and supplement material.

Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer

2026-02-03T11:16:25Z

Text-guided color editing in images and videos is a fundamental yet unsolved problem, requiring fine-grained manipulation of color attributes, including albedo, light source color, and ambient lighting, while preserving physical consistency in geometry, material properties, and light-matter interactions. Existing training-free methods offer broad applicability across editing tasks but struggle with precise color control and often introduce visual inconsistency in both edited and non-edited regions. In this work, we present ColorCtrl, a training-free color editing method that leverages the attention mechanisms of modern Multi-Modal Diffusion Transformers (MM-DiT). By disentangling structure and color through targeted manipulation of attention maps and value tokens, our method enables accurate and consistent color editing, along with word-level control of attribute intensity. Our method modifies only the intended regions specified by the prompt, leaving unrelated areas untouched. Extensive experiments on both SD3 and FLUX.1-dev demonstrate that ColorCtrl outperforms existing training-free approaches and achieves state-of-the-art performances in both edit quality and consistency. Furthermore, our method surpasses strong commercial models such as FLUX.1 Kontext Max and GPT-4o Image Generation in terms of consistency. When extended to video models like CogVideoX, our approach exhibits greater advantages, particularly in maintaining temporal coherence and editing stability. Finally, our method also generalizes to instruction-based editing diffusion models such as Step1X-Edit and FLUX.1 Kontext dev, further demonstrating its versatility.

Pi-GS: Sparse-View Gaussian Splatting with Dense π^3 Initialization

2026-02-03T09:55:03Z

Novel view synthesis has evolved rapidly, advancing from Neural Radiance Fields to 3D Gaussian Splatting (3DGS), which offers real-time rendering and rapid training without compromising visual fidelity. However, 3DGS relies heavily on accurate camera poses and high-quality point cloud initialization, which are difficult to obtain in sparse-view scenarios. While traditional Structure from Motion (SfM) pipelines often fail in these settings, existing learning-based point estimation alternatives typically require reliable reference views and remain sensitive to pose or depth errors. In this work, we propose a robust method utilizing π^3, a reference-free point cloud estimation network. We integrate dense initialization from π^3 with a regularization scheme designed to mitigate geometric inaccuracies. Specifically, we employ uncertainty-guided depth supervision, normal consistency loss, and depth warping. Experimental results demonstrate that our approach achieves state-of-the-art performance on the Tanks and Temples, LLFF, DTU, and MipNeRF360 datasets.

WebSplatter: Enabling Cross-Device Efficient Gaussian Splatting in Web Browsers via WebGPU

2026-02-03T07:18:40Z

We present WebSplatter, an end-to-end GPU rendering pipeline for the heterogeneous web ecosystem. Unlike naive ports, WebSplatter introduces a wait-free hierarchical radix sort that circumvents the lack of global atomics in WebGPU, ensuring deterministic execution across diverse hardware. Furthermore, we propose an opacity-aware geometry culling stage that dynamically prunes splats before rasterization, significantly reducing overdraw and peak memory footprint. Evaluation demonstrates that WebSplatter consistently achieves 1.2$\times$ to 4.5$\times$ speedups over state-of-the-art web viewers.

Implicit neural representation of textures

2026-02-02T17:17:20Z

Implicit neural representation (INR) has proven to be accurate and efficient in various domains. In this work, we explore how different neural networks can be designed as a new texture INR, which operates in a continuous manner rather than a discrete one over the input UV coordinate space. Through thorough experiments, we demonstrate that these INRs perform well in terms of image quality, with considerable memory usage and rendering inference time. We analyze the balance between these objectives. In addition, we investigate various related applications in real-time rendering and down-stream tasks, e.g. mipmap fitting and INR-space generation.

Attention in Geometry: Scalable Spatial Modeling via Adaptive Density Fields and FAISS-Accelerated Kernels

2026-02-02T15:06:53Z

This work introduces Adaptive Density Fields (ADF), a geometric attention framework that formulates spatial aggregation as a query-conditioned, metric-induced attention operator in continuous space. By reinterpreting spatial influence as geometry-preserving attention grounded in physical distance, ADF bridges concepts from adaptive kernel methods and attention mechanisms. Scalability is achieved via FAISS-accelerated inverted file indices, treating approximate nearest-neighbor search as an intrinsic component of the attention mechanism. We demonstrate the framework through a case study on aircraft trajectory analysis in the Chengdu region, extracting trajectory-conditioned Zones of Influence (ZOI) to reveal recurrent airspace structures and localized deviations.

OFERA: Blendshape-driven 3D Gaussian Control for Occluded Facial Expression to Realistic Avatars in VR

2026-02-02T07:34:15Z

We propose OFERA, a novel framework for real-time expression control of photorealistic Gaussian head avatars for VR headset users. Existing approaches attempt to recover occluded facial expressions using additional sensors or internal cameras, but sensor-based methods increase device weight and discomfort, while camera-based methods raise privacy concerns and suffer from limited access to raw data. To overcome these limitations, we leverage the blendshape signals provided by commercial VR headsets as expression inputs. Our framework consists of three key components: (1) Blendshape Distribution Alignment (BDA), which applies linear regression to align the headset-provided blendshape distribution to a canonical input space; (2) an Expression Parameter Mapper (EPM) that maps the aligned blendshape signals into an expression parameter space for controlling Gaussian head avatars; and (3) a Mapper-integrated Avatar (MiA) that incorporates EPM into the avatar learning process to ensure distributional consistency. Furthermore, OFERA establishes an end-to-end pipeline that senses and maps expressions, updates Gaussian avatars, and renders them in real-time within VR environments. We show that EPM outperforms existing mapping methods on quantitative metrics, and we demonstrate through a user study that the full OFERA framework enhances expression fidelity while preserving avatar realism. By enabling real-time and photorealistic avatar expression control, OFERA significantly improves telepresence in VR communication. A project page is available at https://ysshwan147.github.io/projects/ofera/.

Under-Canopy Terrain Reconstruction in Dense Forests Using RGB Imaging and Neural 3D Reconstruction

2026-02-02T07:24:42Z

Mapping the terrain and understory hidden beneath dense forest canopies is of great interest for numerous applications such as search and rescue, trail mapping, forest inventory tasks, and more. Existing solutions rely on specialized sensors: either heavy, costly airborne LiDAR, or Airborne Optical Sectioning (AOS), which uses thermal synthetic aperture photography and is tailored for person detection. We introduce a novel approach for the reconstruction of canopy-free, photorealistic ground views using only conventional RGB images. Our solution is based on the celebrated Neural Radiance Fields (NeRF), a recent 3D reconstruction method. Additionally, we include specific image capture considerations, which dictate the needed illumination to successfully expose the scene beneath the canopy. To better cope with the poorly lit understory, we employ a low light loss. Finally, we propose two complementary approaches to remove occluding canopy elements by controlling per-ray integration procedure. To validate the value of our approach, we present two possible downstream tasks. For the task of search and rescue (SAR), we demonstrate that our method enables person detection which achieves promising results compared to thermal AOS (using only RGB images). Additionally, we show the potential of our approach for forest inventory tasks like tree counting. These results position our approach as a cost-effective, high-resolution alternative to specialized sensors for SAR, trail mapping, and forest-inventory tasks.