https://arxiv.org/api/++WlVRQmNnOV4UCzYh5l8Dt2mJQ 2026-07-19T04:38:09Z 9772 105 15 http://arxiv.org/abs/2606.26645v1 An Evaluation of Decentralized Group Formation Techniques for Flying Light Specks 2026-06-25T06:17:05Z

Group formation is fundamental for 3D displays that use Flying Light Specks, FLSs, to illuminate shapes and provide haptic interactions. An FLS is a drone with light sources that illuminates a shape. Groups of G FLSs may implement reliability techniques to tolerate FLS failures, provide kinesthetic haptic feedback in response to a user's touch, and facilitate a divide and conquer approach to challenges such as localizing FLSs to render a shape. This paper evaluates four decentralized techniques to form groups. An FLS implements a technique autonomously using asynchronous communication and without a global clock. We evaluate these techniques using synthetic point clouds with known optimal solutions and real point clouds. Obtained results show a technique named Random Subset (RS) is superior when constructing small groups (G $\leq$ 5) while a different technique named Closest Available Neighbor First (CANF) is superior when constructing large groups (G $\geq$ 10).
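
As a rough illustration of the "closest available neighbor first" idea named in the abstract, the sketch below forms groups of size G greedily from a point cloud. It is a centralized simplification written for this listing, not the paper's decentralized, asynchronous protocol; the function name `canf_groups` and the tie-breaking by index are assumptions.

```python
import math


def canf_groups(points, g):
    """Greedy, centralized sketch: each ungrouped point recruits its
    g - 1 closest still-available neighbors (hypothetical helper)."""
    available = set(range(len(points)))
    groups = []
    for seed in range(len(points)):
        if seed not in available:
            continue  # already recruited into an earlier group
        if len(available) < g:
            break  # not enough points left to form a full group
        available.remove(seed)
        # Sort remaining points by distance to the seed, ties broken by index.
        nearest = sorted(
            available, key=lambda j: (math.dist(points[seed], points[j]), j)
        )[: g - 1]
        available.difference_update(nearest)
        groups.append([seed] + nearest)
    return groups, sorted(available)


cloud = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (11, 0, 0), (5, 5, 5)]
groups, leftover = canf_groups(cloud, 2)
print(groups, leftover)  # [[0, 1], [2, 3]] [4]
```

Points that cannot be placed in a full group are returned separately, which mirrors the practical question of what to do when the point count is not a multiple of G.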

2026-06-25T06:17:05Z Appeared in ACM Multimedia Asia 2023 (MMAsia '23), December 06-08, 2023, Tainan, Taiwan. ACM, New York, NY, USA, 7 pages Hamed Alimohammadzadeh Heather Culbertson Shahram Ghandeharizadeh 10.1145/3595916.3626460 http://arxiv.org/abs/2606.26556v1 WQ-Fusion: Dynamic Gated Attention for Cross-Domain Audio Representation 2026-06-25T03:07:34Z

While pre-trained models excel in specialized tasks, learning universal representations across diverse acoustic domains remains challenging. To address this, we propose WQ-Fusion, a robust dual-encoder framework for cross-domain audio representation learning. Overcoming the limitations of static concatenation, WQ-Fusion integrates Whisper and Qwen via an Adaptive Feature Modulation module and a novel element-wise gated attention mechanism. This design enables dynamic feature selection, allowing the model to selectively emphasize relevant acoustic and semantic dimensions. Extensive experiments on the Interspeech 2026 Audio Encoder Capability Challenge (Track A) benchmark demonstrate that by effectively routing heterogeneous information, WQ-Fusion achieves a superior overall score of 0.836, significantly outperforming the strongest single-encoder baseline.
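
The abstract does not give the gating formula, so the following is only a minimal sketch of one common form of element-wise gating: a per-dimension sigmoid gate that blends two feature vectors. The names `gated_fusion` and `gate_logits` are assumptions, and in a trained model the logits would come from a learned projection rather than being passed in by hand.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(acoustic, semantic, gate_logits):
    """Blend two equal-length feature vectors dimension by dimension.

    Each output is g * acoustic + (1 - g) * semantic with g = sigmoid(logit).
    This is an assumed, generic gating form, not the paper's exact module.
    """
    if not (len(acoustic) == len(semantic) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for a, s, z in zip(acoustic, semantic, gate_logits):
        g = sigmoid(z)
        fused.append(g * a + (1.0 - g) * s)
    return fused


# A zero logit gives g = 0.5, i.e. a plain average of the two encoders.
print(gated_fusion([2.0, 4.0], [0.0, 0.0], [0.0, 0.0]))  # [1.0, 2.0]
```

Unlike static concatenation, the gate can push each dimension toward either encoder, which is the "dynamic feature selection" the abstract describes.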

2026-06-25T03:07:34Z Accepted by INTERSPEECH 2026 Mingda Lin Lei Ding Xinyue Zhou Tiantian Xiong Hanchen Pei Gongping Huang Hao Zhang Jingdong Chen Jacob Benesty http://arxiv.org/abs/2606.26368v1 An Evaluation of ABR Switching for Time-Shifted Clients in MoQ 2026-06-24T20:27:46Z

Media over QUIC enables ultra low latency video streaming over QUIC, but its default quality-switching semantics risk introducing playback gaps during periods of network congestion. The in-progress SWITCH specification for MOQ Transport aims to streamline rate adaptation for MoQ. In this work, we characterize the performance of SWITCH-style Adaptive Bitrate (ABR) for both live and time-shifted clients in a Mininet simulated topology. We validate that standard ABR algorithms can be directly applied to time-shifted playback without modification, yielding substantially higher throughput. We demonstrate that a subscriber can experience increased overall throughput after a rebuffering scenario, and we identify focal points for further optimizations of MoQ ABR switching.
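
For readers unfamiliar with rate adaptation, here is a minimal sketch of a throughput-based ABR rule of the kind the abstract calls "standard". It is a generic heuristic and makes no claim about the SWITCH specification or any MoQ API; the function name, the bitrate ladder, and the safety factor of 0.8 are assumptions.

```python
def select_bitrate(ladder_kbps, throughput_kbps, safety=0.8):
    """Pick the highest rung that fits within a safety margin of the
    estimated throughput; fall back to the lowest rung otherwise."""
    budget = throughput_kbps * safety
    feasible = [rate for rate in sorted(ladder_kbps) if rate <= budget]
    return feasible[-1] if feasible else min(ladder_kbps)


ladder = [500, 1500, 3000, 6000]
print(select_bitrate(ladder, 4000))  # 3000
print(select_bitrate(ladder, 100))   # 500
```

A time-shifted client can run the same rule as a live client; only the data it requests lies further behind the live edge.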

2026-06-24T20:27:46Z Abanisenioluwa Orojo Tanvir Redoy Samira Afzal Andrew C. Freeman http://arxiv.org/abs/2606.26196v1 From Structure to Synergy: A Survey of Vision-Language Perception Paradigm Evolution in Multimodal Large Language Models 2026-06-24T15:20:32Z

Multimodal Large Language Models (MLLMs) have recently made remarkable progress in unifying vision-language understanding and reasoning, especially following the introduction of models such as OpenAI's O-series and DeepSeek's R-series, which have driven a paradigm shift toward perception-centric intelligence. However, there remains a lack of systematic surveys that examine perception from a truly unified vision-language perspective -- one that treats vision and language as an inseparable modality. Existing reviews are often fragmented, focusing separately on either vision or language, and thus rarely capture the cross-modal evolution of perception as an integrated capability. To bridge this gap, we present the first systematic survey of unified vision-language perception in MLLMs. Specifically, we (1) formalize MLLM perception as an intrinsic, unified vision-language capability analogous to human innate perception, (2) introduce a five-stage taxonomy tracing the paradigm evolution of MLLM perception and survey representative methods and milestones at each phase, and (3) identify open challenges and outline promising research directions toward truly general, unified multimodal intelligence. We hope our study will provide both a foundational understanding and an actionable roadmap to foster further innovation on the path toward artificial general intelligence (AGI).

2026-06-24T15:20:32Z Haoxiang Sun Tao Wang Li Yuan Jian Zhao Jiancheng Lv 10.1016/j.inffus.2026.104285 http://arxiv.org/abs/2606.25906v1 OracleAnalyser: Analysing Implicit Semantics of Oracle Bone Scripts through MLLMs with Post-training 2026-06-24T14:54:41Z

With the advancement of artificial intelligence, research on oracle bone scripts has entered a new era. However, existing methods and benchmarks remain largely confined to recognition tasks, overlooking the equally crucial aspect of oracle bone analysis. To address this gap, we propose OracleAnalyser, a reasoning framework for oracle bone analysis based on post-training techniques. Specifically, we fine-tune Qwen2.5-VL-3B-Instruct through multiple post-training stages and introduce a new preference optimization algorithm, Stable Focal Preference Optimization (SFPO), tailored to the characteristics of oracle bone datasets. In addition, we release both an oracle bone reasoning dataset and an oracle bone preference dataset, and further construct a new benchmark to evaluate models' analytical capabilities for oracle bone scripts. Extensive experiments validate the superior analytical performance of OracleAnalyser, which achieves remarkable results with only 3B parameters, surpassing models with substantially larger scales.

2026-06-24T14:54:41Z Zijia Song Yelin Wang Zhengyi Ma Zitong Yu Tianheng Wang Jiahuan Zhang Taorui Wang Kaicheng Yu http://arxiv.org/abs/2512.24946v3 HaineiFRDM: Structure-Preserving Diffusion for Film Restoration under Fast Motion and Diverse Defects 2026-06-24T14:47:42Z

Existing film-restoration methods frequently fail under fast motion, producing limb disappearance and structural distortion due to inaccurate motion modeling. Moreover, high-resolution restoration under spatially-persistent and mixed defects remains insufficiently studied. We propose HaineiFRDM, a Film Restoration Diffusion Model that leverages the content modeling capability of diffusion models for content-aware restoration, removing defects while preserving scene structure. To enable scalable high-resolution restoration, we adopt a patch-wise strategy with position-aware global fusion modules to maintain cross-patch coherence. We further introduce a frequency-based module to enhance texture consistency and a patch-consistent inference framework to alleviate blocking artifacts introduced by patch-based processing. We also construct a film restoration dataset comprising categorized defect templates, professionally restored films, and realistic synthetic degradations. Extensive experiments demonstrate our superior restoration quality with strong structural consistency. Our design also reduces memory requirements, enabling high-resolution restoration on a single 24GB-VRAM GPU. Code and the dataset will be released at https://anonymous.4open.science/r/HaineiFRDM.

2025-12-31T16:18:07Z Rongji Xun Junjie Yuan Zhongjie Wang http://arxiv.org/abs/2606.25547v1 Efficient Cross-Scale Invertible Hiding Network with Spatial-Frequency Collaboration and Non-Invertible Mechanism 2026-06-24T08:24:58Z

Image hiding aims to conceal image-level messages within cover images at the same resolution. Invertible neural networks (INN)-based image hiding has emerged as an important branch. It treats concealing and revealing as a pair of inverse problems on image domain transformation and uses INN's forward and backward processes to address them. Due to architectural constraints, existing INN-based methods suffer from single-scale and single-domain feature extraction and limited nonlinear representation capability, resulting in inferior image quality. To mitigate these limitations, we propose an efficient cross-scale invertible hiding network with the spatial-frequency collaboration and the non-invertible mechanism, termed CrosInv. CrosInv exploits cross-scale and spatial-frequency collaborative features while enhancing nonlinear representation. Specifically, we introduce a cross-scale invertible module that bijectively maps inputs to cross-scale representations. To effectively integrate spatial and frequency information, the cross-scale invertible module employs pixel shuffle, Haar wavelet transformation, and their inverse operations for scale transformation. Furthermore, a non-invertible cross dense module is integrated to enhance the nonlinearity. Comprehensive experiments verify the effectiveness and superiority of the proposed CrosInv.
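
The invertibility that the abstract relies on can be seen in a one-level 2D Haar transform: it maps each 2x2 block to four sub-band coefficients and can be undone exactly. The sketch below is a plain-Python illustration with hypothetical function names, not CrosInv's module; it assumes even image height and width.

```python
def haar_forward(img):
    """One-level orthonormal 2D Haar transform over 2x2 blocks.

    Returns four half-resolution bands: average, column difference,
    row difference, and diagonal difference.
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    bands = [[[0.0] * cols for _ in range(rows)] for _ in range(4)]
    for i in range(rows):
        for j in range(cols):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            bands[0][i][j] = (a + b + c + d) / 2
            bands[1][i][j] = (a - b + c - d) / 2
            bands[2][i][j] = (a + b - c - d) / 2
            bands[3][i][j] = (a - b - c + d) / 2
    return bands


def haar_inverse(bands):
    """Exact inverse of haar_forward."""
    avg, d_col, d_row, d_diag = bands
    rows, cols = len(avg), len(avg[0])
    img = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for i in range(rows):
        for j in range(cols):
            s, t, u, v = avg[i][j], d_col[i][j], d_row[i][j], d_diag[i][j]
            img[2 * i][2 * j] = (s + t + u + v) / 2
            img[2 * i][2 * j + 1] = (s - t + u - v) / 2
            img[2 * i + 1][2 * j] = (s + t - u - v) / 2
            img[2 * i + 1][2 * j + 1] = (s - t - u + v) / 2
    return img


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
assert haar_inverse(haar_forward(image)) == image  # lossless round trip
```

Because the forward map trades spatial resolution for channels without losing information, it can serve as a scale-changing step inside an invertible network.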

2026-06-24T08:24:58Z IEEE TNNLS submitted by Junxue Yang, Xin Liao (https://msf-hnu.github.io/) Junxue Yang Xin Liao http://arxiv.org/abs/2606.25391v1 From Sounds to Scenes: A Benchmark for Evaluating Context-Aware Auditory Scene Understanding in Large Audio Language Models 2026-06-24T04:42:57Z

Recent Large Audio Language Models (LALMs) have achieved remarkable progress in audio perceptual tasks across individual acoustic layers, including speech, sound, and music. However, existing benchmarks predominantly evaluate these layers in isolation, overlooking the complex contextual relationships that arise when multiple acoustic sources co-occur in real-world auditory scenes. Real-world auditory interpretation requires Context-Aware Auditory Scene Understanding (CASU): the ability to comprehend the holistic scene by integrating sound layers. To evaluate this capability, we introduce the CASU benchmark, which assesses whether Audio LLMs can interpret auditory scenes composed of speech, acoustic events (e.g., announcements), and background environments (e.g., traffic), and reason about the logical relationships between these layers. We propose a scalable pipeline for constructing time-accurate, semi-synthetic audio streams by composing real-world scene sounds with synthetic speech. Building on this data, we design four tasks that probe scene understanding: contextual question answering, entity extraction from the scene, speaker role inference, and counterfactual reasoning where scene is manipulated. Experiments across multiple LALMs demonstrate that effective auditory scene understanding requires integration over all auditory layers, rather than reliance on speech or sound alone, underscoring the necessity of CASU for advancing complex audio understanding in LALMs.

2026-06-24T04:42:57Z Pengfei Zhang Hoang H Nguyen Kazi Shaharair Sharif Yutong Song Wenjun Huang Henry Peng Zou Pinxin Liu Honghui Xu Amir M. Rahmani http://arxiv.org/abs/2606.02800v4 Cosmos 3: Omnimodal World Models for Physical AI 2026-06-23T17:33:32Z

We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, image, video, audio, and action sequences within a unified mixture-of-transformers architecture. By supporting highly flexible input-output configurations, Cosmos 3 seamlessly unifies critical modalities for Physical AI -- effectively subsuming vision-language models, video generators, world simulators, and world-action models into a single framework. Our evaluation demonstrates that Cosmos 3 establishes a new state-of-the-art across a diverse suite of understanding and generation tasks, demonstrating omnimodal world models as scalable, general-purpose backbones for embodied agents. Our post-trained Cosmos 3 models were ranked as the best open-source Text-to-Image and Image-to-Video models by Artificial Analysis, and the best policy model by RoboArena at the time the technical report was written. To accelerate open research and deployment in Physical AI, we make our code, model checkpoints, curated synthetic datasets, and evaluation benchmark available under the Linux Foundation's OpenMDW-1.1 License at https://github.com/nvidia/cosmos and https://huggingface.co/collections/nvidia/cosmos3. The project website is available at https://research.nvidia.com/labs/cosmos-lab/cosmos3.

2026-06-01T19:12:30Z NVIDIA : Aditi Niket Agarwal Arslan Ali Jon Allen Martin Antolini Adeline Aubame Alisson Azzolini Junjie Bai Maciej Bala Yogesh Balaji Josh Bapst Aarti Basant Mukesh Beladiya Mohammad Qazim Bhat Zaid Pervaiz Bhat Dan Blick Vanni Brighella Han Cai Tiffany Cai Eric Cameracci Jiaxin Cao Yulong Cao Mark Carlson Carlos Casanova Ting-Yun Chang Yan Chang Yu-Wei Chao Prithvijit Chattopadhyay Roshan Chaudhari Chieh-Yun Chen Junyu Chen Ke Chen Qizhi Chen Wenkai Chen Xiaotong Chen Yu Chen An-Chieh Cheng Click Cheng Xiu Chia Jeana Choi Chaeyeon Chung Wenyan Cong Yin Cui Magdalena Dadela Nalin Dadhich Wenliang Dai Joyjit Daw Alperen Degirmenci Rodrigo Vieira Del Monte Robert Denomme Sameer Dharur Marco Di Lucca Ke Ding Wenhao Ding Yifan Ding Yuzhu Dong Nicole Drumheller Yilun Du Aigul Dzhumamuratova Aleksandr Efitorov Hamid Eghbalzadeh Naomi Eigbe Imad El Hanafi Hassan Eslami Benedikt Falk Jiaojiao Fan Jim Fan Amol Fasale Sergiy Fefilatyev Liang Feng Francesco Ferroni Sanja Fidler Xiao Fu Vikram Fugro Prashant Gaikwad TJ Galda Katelyn Gao Yihuai Gao Wenhang Ge Sreyan Ghosh Arushi Goel Vivek Goel Akash Gokul Rama Govindaraju Jinwei Gu Miguel Guerrero Elfie Guo Aryaman Gupta Siddharth Gururani Hugo Hadfield Song Han Ankur Handa Zekun Hao Mohammad Harrim Ali Hassani Nathan Hayes-Roth Yufan He Chris Helvig Cyrus Hogg Madison Huang Michael Huang Sophia Huang Yufan Huang Jacob Huffman DeLesley Hutchins Suneel Indupuru Boris Ivanovic Arihant Jain Joel Jang Ryan Ji Yanan Jian Dongfu Jiang Jingyi Jin Atharva Joshi Nikhilesh Joshi Pranjali Joshi Andy Ju Jaehun Jung Weiwei Kang Scott Kassekert Jan Kautz Ashna Khetan Julia Kiczka Slawek Kierat Gwanghyun Kim Kuno Kim Sunny Kim Kezhi Kong Xin Kong Zhifeng Kong Tomasz Kornuta Egor Krivov Hui Kuang Saurav Kumar Chia-Wen Kuo George Kurian Wojciech Kutak JF Lafleche Himangshu Lahkar Omar Laymoun Jayjun Lee Sanggil Lee Gabriele Leone Boyi Li Freya Li Jiajun Li Jinfeng Li Ling Li Pengcheng Li Shangru Li Tingle Li Xiaolong Li 
Xuan Li Zhaoshuo Li Zhiqi Li Hao Liang Maosheng Liao Chen-Hsuan Lin Tsung-Yi Lin Ming-Yu Liu Sifei Liu Zihan Liu Hai Loc Lu Xiangyu Lu Alice Luo Ruipu Luo Wenjie Luo Jiangran Lyu Martin Ding Ma Nic Ma Qianli Ma Dawid Majchrowski Louis Marcoux Miguel Martin Qing Miao Ashkan Mirzaei Shreyas Misra Kaichun Mo Durra Mohsin Hyejin Moon Pawel Morkisz Saeid Motiian Kirill Motkov Seungjun Nah Yashraj Narang Deepak Narayanan Thabang Ngazimbi Julian Ouyang Shubham Pachori David Page Yatian Pang Sehwi Park Mahesh Patekar Mostofa Patwary Marco Pavone Trung Pham Wei Ping Soha Pouya Shrimai Prabhumoye Varun Praveen Delin Qu Hesam Rabeti Morteza Ramezanali Marilyn Reeb Xuanchi Ren Kristen Rumley Wojciech Rymer Jun Saito Yeongho Seol John Shao Piyush Shekdar Tianwei Shen Humphrey Shi Min Shi Stella Shi Kevin Shih Mohammad Shoeybi Mateusz Sieniawski Shuran Song Alexander Sotelo Amir Sotoodeh Sunil Srinivasa Vignesh Srinivasakumar Bartosz Stefaniak Rahul Heinrich Steiger Shangkun Sun Jiaxiang Tang Shitao Tang Yangyang Tang Yue Tang Tolou Tavakkoli Kayley Ting Krzysztof Tomala Wei-Cheng Tseng Jibin Varghese Sergei Vasilev Thomas Volk Raju Wagwani Roger Waleffe Andrew Z. Wang Boxiang Wang Haoxiang Wang Qiao Wang Shihao Wang Shijie Wang Ting-Chun Wang Yan Wang Yu Wang Rohit Watve David Wehr Fangyin Wei Xinshuo Weng Jay Zhangjie Wu Kedi Wu Hongchi Xia Summer Xiao Tianjun Xiao Kevin Xie Daguang Xu Jiashu Xu Mengyao Xu Ruqing Xu Xingqian Xu Yao Xu Dinghao Yang Dong Yang Hans Yang Xiaodong Yang Xuning Yang Yichu Yang Yurong You Zhiding Yu Hao Yuan Simon Yuen Xiaohui Zeng Pengcuo Zeren Cindy Zha Haotian Zhang Jenny Zhang Jing Zhang Liangkai Zhang Paris Zhang Shun Zhang Xuanmeng Zhang Zhizheng Zhang Ann Zhao Yilin Zhao Yuliya Zhautouskaya Charles Zhou Fengzhe Zhou Shilin Zhu Yuke Zhu Dima Zhylko Artur Zolkowski http://arxiv.org/abs/2504.06138v3 Multimedia and Visual Analytics in the Agentic Era 2026-06-23T16:14:18Z

Professional users need tools to help them gain actionable insights from large multimedia collections. Foundation models and AI agents have rapidly changed the playing field, and improving their accuracy, trustworthiness, and reasoning capabilities are active topics in the computer vision, machine learning, and multimedia communities. Most current research focuses on benchmark driven algorithmic improvements. The multimedia community is the place to go beyond algorithms and consider complete multimedia analytics systems that support professional users in their complex tasks and achieve a true teaming of humans and AI. Supporting users with machine learning and visualizations has been studied for decades in the visual analytics field. In this paper, we propose a framework to bring multimedia and visual analytics together and indicate how it could impact current and new multimedia analytics solutions. Additional information can be found at https://staff.fnwi.uva.nl/m.worring/analytics-model.html

2025-04-08T15:35:59Z Marcel Worring Jan Zahálka Stef van den Elzen Maximilian T. Fischer Daniel A. Keim http://arxiv.org/abs/2507.16696v3 FISHER: A Foundation Model for Multi-Modal Industrial Signal Comprehensive Representation 2026-06-23T08:25:29Z

Industrial signal analysis is hindered by severe data heterogeneity, which we characterize as the M5 problem. Existing solutions rely on specialized models that lack robustness and scalability, while large-scale pre-training has rarely been investigated in this area. In this work, we derive a prioritized roadmap for the M5 problem and propose FISHER, a Foundation model for multi-modal Industrial Signal compreHEnsive Representation. To address the foremost multi-sampling-rate problem, FISHER utilizes a novel sub-band modeling approach that treats sampling rate increments as concatenated sub-band information, enabling the adaptive usage of full signal bandwidth without resampling. FISHER is pre-trained by teacher-student self-distillation over external audio and music data. We also establish the RMIS benchmark, comprising 19 datasets across four modalities. In experiments, FISHER outperforms 24 state-of-the-art series encoders (up to 2B) with much smaller sizes (up to 16x), showcasing groundbreaking diagnostic accuracy and remarkable versatility. We further demonstrate that 1) seamless adaptation to variable sampling rates is the key to generalization, and 2) audio and music data provide better temporal variability, which is essential for pre-training. Both FISHER and RMIS are open-sourced.

2025-07-22T15:31:16Z Accepted by IEEE TII. FISHER open-sourced on https://github.com/jianganbai/FISHER . RMIS open-sourced on https://jianganbai.github.io/RMIS Pingyi Fan Anbai Jiang Shuwei Zhang Xinhu Zheng Zhiqiang Lv Bing Han Wenrui Liang Junjie Li Wei-Qiang Zhang Yanmin Qian Xie Chen Jia Liu 10.1109/TII.2026.3698554 http://arxiv.org/abs/2604.09054v4 HAFM: Hierarchical Autoregressive Foundation Model for Music Accompaniment Generation 2026-06-23T01:22:05Z

Music accompaniment generation aims to automatically produce instrumental accompaniments that are rhythmically, harmonically, and timbrally coherent with a given vocal input, with broad applications in personalized music creation, arrangement assistance, and music education. Existing approaches, primarily operating in the symbolic domain or relying on single-stage audio generation frameworks, commonly suffer from insufficient high-level semantic structure modeling, limited acoustic detail reconstruction, and weak conditional controllability. To address these limitations, this paper proposes HAFM, a Hierarchical Autoregressive Foundation Model for vocal-conditioned music accompaniment generation. The model employs a dual-rate tokenization strategy in which $50$ Hz HuBERT semantic tokens capture high-level musical structure and $75$ Hz EnCodec acoustic tokens encode fine-grained acoustic content, enabling explicit disentanglement of semantic and acoustic representations. Building on this foundation, a three-stage cascaded generation framework is designed to progressively generate semantic tokens, coarse acoustic tokens, and fine acoustic tokens, refining the accompaniment from global structure to local detail. Objective evaluation on the MUSDB18 dataset demonstrates that the full three-stage model achieves a Fréchet Audio Distance (FAD) score of 1.71, representing an 18.6% relative improvement over the two-stage baseline (FAD = 2.10). Subjective listening tests show that the generated accompaniments achieve a 51.5% preference rate against ground-truth accompaniments in head-to-head comparisons, and substantially outperform the random baseline in terms of rhythmic alignment, harmonic compatibility, and overall musical coherence. The source code and demo are available at https://github.com/HackerHyper/HAFM.git.
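
The two numeric claims in this abstract can be checked with a few lines of arithmetic. The sketch below recomputes the relative FAD improvement from the reported scores and shows how many frames the two token rates produce for a clip; the helper names and the 10-second clip length are assumptions made for illustration.

```python
def relative_improvement(baseline, improved):
    """Fractional reduction of a lower-is-better score."""
    return (baseline - improved) / baseline


def frames_per_clip(seconds, semantic_hz=50, acoustic_hz=75):
    """Frame counts at the two token rates quoted in the abstract."""
    return seconds * semantic_hz, seconds * acoustic_hz


# FAD 2.10 (two-stage) vs 1.71 (three-stage), as reported.
print(round(relative_improvement(2.10, 1.71) * 100, 1))  # 18.6
print(frames_per_clip(10))  # (500, 750)
```

The recomputed figure matches the 18.6% stated in the abstract, and the frame counts show that the acoustic stream is 1.5 times as long as the semantic stream for the same audio.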

2026-04-10T07:27:55Z This paper is submitted to the National Conference on Man-Machine Speech Communication (NCMMSC, 2026) Jian Zhu Jianwei Cui Yunlong Xue Shihao Chen Yubang Zhang Cheng Luo Jun Sun http://arxiv.org/abs/2606.23885v1 Mind the Heads: Topological Representation Alignment for Multimodal LLMs 2026-06-22T19:30:30Z

Representation alignment has emerged as an effective approach to improve Multimodal Large Language Models (MLLMs) by regularizing their internal representations toward those of an external vision encoder. However, existing methods typically align a fixed layer of the language backbone, overlooking the fine-grained structure of Transformer models. In this work, we propose Head-Wise Representation Alignment (HeRA), a method that enforces cross-modal alignment at the level of individual attention heads. Our approach is grounded in the Platonic Representation Hypothesis, focusing on preserving the topological structure of representations (i.e., their local neighborhood relationships) across modalities. Following the Mutual K-Nearest Neighbor (MKNN) alignment metric, we introduce a contrastive objective that acts as a differentiable proxy for matching local structures. HeRA applies this objective during multimodal training to specific attention heads in the LLM, selected by their alignment score according to the MKNN metric. Counterintuitively, we find that aligning the least aligned heads yields the largest gains. Extensive evaluations across multiple MLLMs and 18 benchmarks demonstrate that HeRA consistently improves performance on challenging vision-centric tasks and serves as an effective regularizer against visual hallucinations by naturally curbing the over-reliance on linguistic priors. Our code is publicly released.
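
To make the "local neighborhood" notion concrete, the following sketch computes a mutual k-nearest-neighbor alignment score between two sets of features for the same samples: the average fraction of each sample's k neighbors that the two spaces share. It uses Euclidean distance and index tie-breaking as assumptions; the function names are hypothetical, and this is not HeRA's differentiable training objective.

```python
import math


def knn_indices(features, k):
    """For every sample, the set of indices of its k nearest other samples."""
    neighbors = []
    for i, x in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: (math.dist(x, features[j]), j))
        neighbors.append(set(others[:k]))
    return neighbors


def mutual_knn_alignment(feats_a, feats_b, k):
    """Mean overlap of k-NN sets across two representation spaces, in [0, 1]."""
    if len(feats_a) != len(feats_b):
        raise ValueError("both spaces must describe the same samples")
    nn_a = knn_indices(feats_a, k)
    nn_b = knn_indices(feats_b, k)
    shared = sum(len(a & b) for a, b in zip(nn_a, nn_b))
    return shared / (k * len(feats_a))


vision = [(0.0,), (1.0,), (10.0,), (11.0,)]
scaled = [(0.0,), (2.0,), (20.0,), (22.0,)]      # same neighborhoods
shuffled = [(0.0,), (10.0,), (1.0,), (11.0,)]    # neighborhoods swapped
print(mutual_knn_alignment(vision, scaled, 1))    # 1.0
print(mutual_knn_alignment(vision, shuffled, 1))  # 0.0
```

The score depends only on who is near whom, not on absolute distances, which is why it captures topological rather than metric agreement.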

2026-06-22T19:30:30Z Davide Caffagni Alberto Compagnoni Federico Melis Sara Sarto Pier Luigi Dovesi Mark Granroth-Wilding Marcella Cornia Lorenzo Baraldi http://arxiv.org/abs/2606.23545v1 UI-LIC: A Unified Framework for Evaluating Learned Image Compression Models 2026-06-22T16:18:00Z

The evaluation and comparison of Learned Image Compression (LIC) systems is complicated by heterogeneous software stacks, varying training conditions, and divergent evaluation methodologies. To address these challenges, we introduce UI-LIC, an open-source software framework for evaluating LIC models. We integrate six high-performance LIC models, and provide a centralized controller for performing training, inference, and analysis with shared configuration parameters. Our GUI program offers a streamlined interface to evaluate these models alongside traditional video intra-frame encoders, equalizing the compressed bitrates and calculating quality metrics such as PSNR, SSIM, VMAF, and LPIPS. Finally, we provide an interactive image analyzer with configurable quality heatmap overlays. Our framework lowers barriers to further LIC research, unlocking comparative metrics and subjective analysis with a single setup command. The open-source software is released under the MIT license and is available at github.com/BaylorMultimediaLab/UI-LIC.
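
Of the quality metrics listed, PSNR is simple enough to show in full. The sketch below follows the standard definition, 10 * log10(MAX^2 / MSE), for 8-bit grayscale images given as nested lists; it is an illustration only and is not taken from the UI-LIC code base, and the function name `psnr` is an assumption.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-size images."""
    ref = [p for row in reference for p in row]
    dis = [p for row in distorted for p in row]
    if len(ref) != len(dis):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - d) ** 2 for r, d in zip(ref, dis)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


black = [[0, 0], [0, 0]]
white = [[255, 255], [255, 255]]
print(psnr(black, black))  # inf
print(psnr(black, white))  # 0.0
```

Comparing codecs fairly requires computing such metrics at matched bitrates, which is the equalization step the abstract mentions.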

2026-06-22T16:18:00Z Nicholas J. Nolen Luc Trudeau Andrew C. Freeman http://arxiv.org/abs/2606.23526v1 Composition: Building Community with Arts, Math, and Code (Experience Report) 2026-06-22T16:09:08Z

Composition (https://composition.codes) is a free event series on art, mathematics, and code. This experience report covers Composition's event structure, artist selection process, outreach efforts for submissions and event promotion, and the community response.

2026-06-22T16:09:08Z Isidore Mohr Claire Wang