http://arxiv.org/abs/2606.09242v1
Conceptualising Reflective Use: Toward A Process Perspective On Human-AI Interaction
2026-06-08T09:16:37Z
The rapid diffusion of generative artificial intelligence (genAI) systems reshapes how individuals engage with information systems, requiring users to monitor, assess, and adapt their interaction with non-deterministic systems. Existing constructs capture elements of this engagement but do not account for the situated dynamics of the entire evaluative process in genAI use. This research-in-progress, situated in a larger endeavour towards a scale development, derives an initial conceptualisation of reflective use: a behavioural-knowledge capability that unfolds across pre-use, in-use, and post-use phases, reinforced through situated reflective knowledge gained in practice. Drawing on expert interviews and a focus group, we identify four core components of reflective use and show how they form an iterative capability cycle anchored within the motivational needs outlined in self-determination theory. Understanding reflective use is essential to ensure appropriate reliance and high decision quality, and thus provides a foundation for promoting responsible and effective human-AI interaction.
2026-06-08T09:16:37Z
8 pages, 2 figures, 1 table, published in ECIS 2026 Proceedings
Thimo Schulz, Christina Speck

http://arxiv.org/abs/2606.09239v1
Orange Lab: Lowering Barriers to Data Mining through Embedded Interactive Workflows
2026-06-08T09:14:55Z
While visual programming of data analysis workflows has become an important vehicle for the democratization of data science, such systems remain largely confined to standalone applications and offer limited support for transitioning their visual analytics solutions into interactive web environments. As a result, data analysis pipelines are difficult to share, embed, and adapt into user-facing analytical tools.
We present Orange Lab, a web-based collaborative environment for visual data analytics. At its core, Orange Lab enables users to visually construct machine learning workflows from modular components, where interactions in any component propagate seamlessly through the workflow, turning static pipelines into dynamic, reactive systems that support exploration and data-driven storytelling. Our key contribution is component exposition, a paradigm that allows authors to embed selected workflow components, or parts of their interfaces, into arbitrary web contexts, creating synchronized, interactive interfaces while hiding underlying workflow complexity. This enables the development of tailored analytical views and narrative-driven experiences that integrate data analysis directly into online materials. We demonstrate the approach through deployments in data literacy education, where embedded components guide students in hands-on exploration of machine learning concepts without requiring knowledge of the underlying system, showing that Orange Lab effectively lowers barriers to entry and supports the democratization of data science.
2026-06-08T09:14:55Z
Matej Bevec, Aleš Erjavec, Vesna Tanko, Lena Trnovec, Lan Žagar, Ana Farič, Janez Demšar, Blaž Zupan

http://arxiv.org/abs/2606.09227v1
Trustworthy Smart Fabs via Professional Proxies: Scaling Safe and Sustainable by Design (SSbD) through Industrial Data Spaces
2026-06-08T09:02:02Z
The convergence of the 2026 European Union Safe and Sustainable by Design (SSbD) framework, Corporate Sustainability Due Diligence Directive (CSDDD), and Carbon Border Adjustment Mechanism (CBAM) introduces a severe governance bottleneck for advanced semiconductor manufacturing facilities ("Smart Fabs"). Regulatory compliance demands have surpassed the capacity of manual corporate reporting, creating a direct conflict between multi-stakeholder transparency and corporate data privacy.
This paper addresses this challenge by introducing a zero-trust socio-technical orchestration framework that operationalizes a six-layer SSbD reference architecture within trustworthy industrial data spaces. We propose a shift from reactive automation to autonomous governance through "Professional Proxies": role-based agentic workflows executing within hardware-isolated trust zones. Structured as an interoperable network protocol stack, the framework coordinates an automated, five-step "relay race" between Facility, Process Engineering, and Finance proxy teams to align factory-floor yield models with macro-level sustainability mandates. By executing Virtual Metrology (VM) predictions and Federated Machine Learning (FML) inside hardware-rooted Trusted Execution Environments (TEEs), this architecture resolves the Data Sovereignty Paradox, demonstrating how fabs can export cryptographically signed compliance tokens via International Data Spaces (IDS) connectors without exposing proprietary process recipes. Ultimately, this framework provides technology managers with a verifiable, evidence-based pathway toward resilient, net-zero Industry 5.0 ecosystems.
2026-06-08T09:02:02Z
This work was accepted for presentation at the 32nd IEEE ICE/ITMC Conference, Porto, Portugal, 2026 but was subsequently withdrawn prior to publication due to submission volume limits. It is currently under consideration for publication elsewhere.
Han-Teng Liao, Chang-Yi Kao, Karen Ang

http://arxiv.org/abs/2606.09186v1
DuplexOmni: Real-Time Listening, Seeing, Thinking, and Speaking for Full-Duplex Interaction
2026-06-08T08:23:02Z
Human interaction is continuous, multimodal, and full-duplex by nature. Although recent omni models have made substantial progress in unified speech, vision, and text modeling, combining seamless real-time interaction with complex reasoning and tool use remains challenging. We present DuplexOmni, a method for real-time multimodal full-duplex interaction.
DuplexOmni separates model capability into an interaction layer and a thinking layer, which collaborate asynchronously in parallel. The interaction layer is implemented by the DuplexOmni model, an end-to-end system that processes streaming audio and video inputs while generating text and speech responses in real time. The thinking layer is a pluggable module that provides complex reasoning and tool-use capabilities. To support this method, we further develop a Writer-Director pipeline for constructing continuous-interaction training data. Experiments show that DuplexOmni achieves strong performance on multiple public benchmarks and exhibits natural full-duplex interaction ability.
2026-06-08T08:23:02Z
Muye Huang, Lingling Zhang, Xingyu Yu, Lei Shi, Zhanyu Ma, Jun Xu, Jiuchong Gao, Jinghua Hao, Renqing He, Jun Liu

http://arxiv.org/abs/2606.09041v1
Culturally-Aware AI for Cross-Boundary Community Learning: Undergraduate Innovation at the Intersection of Computation and Design
2026-06-08T05:14:01Z
Research on artificial intelligence in education (AIED) is rapidly expanding, yet technical progress often lacks human-centered grounding and adequate attention to cultural context. Community-Based Learning, a pedagogy rooted in social work, remains underrepresented in AIED research, particularly within Asia-Pacific contexts. This paper reports on cross-boundary Community-Based Learning where undergraduate students develop AI-enabled solutions for cultural heritage preservation and sustainable development. We examine how community-engaged computing operationalizes human-centered AIED across three dimensions: education, technology, and culture.
We contribute a collaborative framework for culturally-aware AIED that fosters multi-stakeholder collaboration while widening participation by dissolving disciplinary silos between social work and computational science.
2026-06-08T05:14:01Z
Jiaojiao Zhao, Weisheng Zhang, Jiawen Cai, Haibin Gao, Luyao Zhang

http://arxiv.org/abs/2606.09024v1
Personal Salience: Highlighting Is Social, but Individuality Lives in Selection
2026-06-08T04:44:51Z
Social highlighters let people mark passages that matter to them. We ask how much of an individual is recoverable from these naturalistic traces, using a co-readership identity control (the same document highlighted by many users) that holds document and topic fixed and asks whether a person's own history predicts their marks better than another reader's does. We separate generic salience (structure), crowd salience (what others marked), and personal salience (the individual residual). First, highlighting is social: which sentences you mark is predicted far better by the crowd than by structure or by a personal model, and even a well-estimated crowd, an information-privileged baseline that sees others' marks on the same document, beats a frontier LLM twin built from your other-document history; the within-document personal signal is at most a whisper (own-vs-other gap +0.017 by an embedding scorer, small but significant). Second, in sharp contrast, individuality lives in selection: asked which of the already-salient passages are yours, your own history is a strong, leakage-free predictor (gap +0.14). A topic decomposition shows this is largely stable thematic preference: it shrinks ~6-8x against a topically-matched peer, and a thin residual cannot be separated from finer topic. The non-obvious part is an asymmetry: under the same scorer the individual signal is ~6-8x weaker in salience than in selection.
Methodologically, naive history-conditioning evaluations leak (the target's own marks enter the profile in ~42% of pairs, inflating personal scores by up to +0.15 AP) and small crowds overstate personalization; our results are leakage-free and use a dense crowd and a model-matched control. Highlights carry a genuine individual signature, but a thin layer over a strong shared one, surfacing far more in which salient things a person selects than in what is salient.
2026-06-08T04:44:51Z
12 pages, 5 figures, 2 tables
Kazuki Nakayashiki, Keisuke Watanabe

http://arxiv.org/abs/2606.08965v1
Before You Scroll Again: Predicting Regretful Social Media Sessions from In-the-Wild Contextual and Wearable Sensing
2026-06-08T03:04:29Z
Users often feel regret after using social media, making regret a more ecologically valid target than screen time for understanding when phone use becomes problematic. Existing self-monitoring tools cannot anticipate regret before it occurs, and prior physiological work on social media use has been confined to the lab with research-grade sensors and curated content, leaving the question of in-the-wild prediction open. We deployed a 7-day in-the-wild experience sampling study with 21 participants, combining passive smartphone logging, a low-cost consumer smartwatch (Bangle.js 2, $80), session-level surveys (1,445 sessions), and exit interviews to investigate when and why social media sessions become regretful, and whether regret can be anticipated before a session begins.
Three findings stand out: (i) the gap between intended and actual use predicts regret far more strongly than session duration, with duration's apparent effect collapsing once intention is modeled; (ii) regret is amplified when sessions displace a valued alternative, particularly at night and following productivity-app use; and (iii) pre-session contextual features generalize across participants while physiological signals add person-specific lift, pointing toward a two-layer architecture for just-in-time adaptive interventions. Interview themes of scrolling-as-avoidance and time blindness contextualize these patterns and surface design opportunities beyond timer-based interventions.
2026-06-08T03:04:29Z
Sally Ahmed, Jan Enkmann, Kye Shimizu, Ivy Yip, Vincent Beermann, Ayse Alomar, Falk Uebernickel, Pattie Maes

http://arxiv.org/abs/2603.13679v2
Toward Scalable Co-located Practical Learning: Assisting with Computer Vision and Multimodal Analytics
2026-06-08T02:34:07Z
Co-located practical learning leaves evidence in visible actions around patients, task resources and room zones, but these traces are often recovered through live observation or retrospective video review. Fixed wide-angle video could reduce sensing burden, yet a debriefing pipeline must do more than detect behaviours: it must maintain detection after small camera-position shifts, relate the detector-derived behaviour trace to instructor-labelled outcomes and preserve room-zone context. This study evaluates a fixed-camera pipeline in repeated nursing simulation. Using a harmonised six-code taxonomy, we tested YOLO26 target-only training and two-stage source-to-target adaptation across two same-room side-view data sources. We then converted detections from 51 instructor-labelled sessions into one-second behaviour and behaviour-zone traces for rate, ordered-network, transition-network and sequence analyses.
Two-stage adaptation improved mean mAP50 from 0.815 to 0.848 for the 2021 target view and from 0.690 to 0.855 for the smaller 2022 target view; with a balanced target quota of \(N = 22\), the 2022 model reached 0.850 mAP50. In the detector-derived behaviour trace analyses, higher phone use characterised low task-performance sessions. Zone labels changed the interpretation of patient interaction: primary patient-care-zone interaction was stronger in higher-performance sessions, while secondary-zone interaction was stronger in lower-performance sessions. Ordered and transition network models showed that ordered room-zone relations contributed beyond behaviour frequency, with the strongest task-performance classifier using zoned and co-presence features. The resulting trace is most appropriate for searchable simulation debriefing, where instructors inspect detected moments rather than receive automated assessment scores.
2026-03-14T01:04:58Z
Xinyu Li, Linxuan Zhao, Yueqiao Jin, Yuchen Liu, Jin Zhou, Roberto Martinez-Maldonado, Dragan Gasevic, Lixiang Yan

http://arxiv.org/abs/2606.08936v1
Report on CHIIR 2026 Workshop on Generative AI and Academic Search (GAI&AS)
2026-06-08T02:31:14Z
This report summarizes the CHIIR 2026 Workshop on Generative AI and Academic Search (GAI&AS), which examined how GenAI is reshaping academic search systems and research practices. The workshop brought together researchers in human information interaction and information retrieval to explore key challenges and opportunities in designing and evaluating future academic search systems that integrate GenAI, moving beyond traditional document retrieval to support summarization, recommendation, synthesis, and conversational interaction. Participants' interests and discussions focused on three thematic clusters: foundations and principles, applications and opportunities, and search-as-learning.
Across these themes, the workshop highlighted the importance of academic search systems in supporting transparency, credibility, research integrity, and long-term scholarly needs, as well as in fostering higher-order cognitive processes. Participants discussed guiding theories, design principles, methodological approaches, partnerships, and community-building efforts aimed at advancing human-centered GenAI-enhanced academic search systems. Overall, the workshop demonstrated strong community interest and a diverse range of ongoing and emerging research initiatives at the intersection of GenAI and academic search.
2026-06-08T02:31:14Z
Yifan Liu, Jaime Arguello, Orland Hoeber, Chang Liu, Soo Young Rieh, Luanne Sinnamon, Dean Alvarez, Susan Archambault, Rob Capra, Henson Chen, Charles Costa, Anita Crescenzi, Zhitong (Klara) Guan, Jacek Gwizdka, Pao-Pei Huang, Gavindya Jayawardena, Ghazal Kalhor, Dagmar Kern, Oliver Koop, Alice Li, Afra Mashhadi, Gaohui Meng, Marta Micheli, Anil B. Murthy, Kevin Schott, Sebastian Schultheiß, Jiwoo Seo, Phaneendra Sivangula, Frans van der Sluis, Xiaoxuan Song, Silang Wang, Dan Zhang

http://arxiv.org/abs/2606.08927v1
In-Situ Immersive Analytics Authoring through Ergonomic Keyboard Support
2026-06-08T02:09:11Z
Immersive analytics uses augmented reality (AR) to integrate data analysis and authoring within physical environments. However, extensive text entry required for immersive analytics authoring remains a fundamental challenge in AR, as popular natural user interfaces often hinder expressive input. This paper presents the Body-Supported Keyboard (BSK), an ergonomic system that allows the mobile use of a Bluetooth keyboard in AR. We conducted a controlled study with 20 participants to compare the BSK with a standing desk during text transcription and a mobile AR scenario. The results showed slightly higher error rates but comparable task completion times.
Participants reported comfort improvements during mobile use and positive usability ratings (mean SUS = 74.5). The BSK allows users to move freely and maintain stable postures while authoring in AR. In general, the findings show evidence of the potential for body-supported input to enhance expressive and ergonomic workflows in immersive analytics and emphasize the importance of comfort and mobility in the design of AR authoring tools.
2026-06-08T02:09:11Z
31 pages, 7 tables, 5 figures
International Journal of Human-Computer Interaction, 1-27. 2026
Leonel Merino, Begoña Juliá-Nehme, Santiago Viana
10.1080/10447318.2026.2676765

http://arxiv.org/abs/2606.08914v1
Vibe Visualizing: How Visualization Novices Try (and Fail) to Generate and Interpret Visualizations with Conversational AI
2026-06-08T01:29:22Z
Conversational AI has enabled users to generate and interpret visualizations through natural language, significantly lowering the technical barrier to entry. The increased accessibility brings visualization novices into data visualization, but also exposes them to misinformation and misinterpretations. We are motivated to examine what issues can arise in interactions with current conversational AI, whether visualization novices can recognize such issues, and how they respond to them. To examine these questions, we conducted a user study on ChatGPT with 20 visualization novices, collecting their conversation logs, semi-structured interview transcripts, and Likert-scale questionnaire responses. Through thematic analysis, we developed a codebook that covers AI execution compliance, issues of AI-generated visualizations, patterns of AI responses, and prompting patterns of users. We summarized four themes, including the quality of outcomes, recurring errors from ChatGPT, misuse by users, factors that affect user trust, confidence, and verification behavior, and human-AI collaboration dynamics.
To demonstrate the generalizability of our codebook and findings, we replayed the initial user prompts on Gemini and Claude and compared the outcomes, which revealed distinct failure modes for each model. Based on the results of all analyses, we derive a set of design recommendations for future AI-assisted visualization systems. We conclude with discussions on literacy gaps, diverse human-AI collaboration dynamics, and implications for agentic visualization.
2026-06-08T01:29:22Z
Sam Yu-Te Lee, Yun-Hsin Kuo, Chifang Chou, Matthew Ward, Xiwei Xuan, Kwan-Liu Ma

http://arxiv.org/abs/2606.08912v1
Enhancing Presence, Deepening Fan Intensity: How Presence in Immersive Video Shapes Psychological Closeness to Performers
2026-06-08T01:26:23Z
Immersive video differs from conventional flat 2D video in that it is experienced as 180-degree stereoscopic video on a head-mounted display, thereby eliciting bodily and spatial subjective experience. Previous studies have shown that viewing and interpersonal distance affect Presence; however, it remains insufficiently understood how Presence differences are related to psychological closeness to content. In the present study, we examined whether differences in Presence could increase viewers' psychological closeness to performers within the content. This psychological closeness was operationally defined as fan intensity. Specifically, a live performance by a Japanese idol group was recorded as 180-degree immersive video, and a high-Presence condition (1.2 m) and a low-Presence condition (7.6 m) were established by manipulating filming distance. Twenty-four participants with different levels of prior involvement, comprising Avid fans and Casual fans, experienced both conditions in a counterbalanced within-participants design. Fan intensity was measured before and after the experience as perceived psychological overlap between the self and the performers.
The results showed that, compared with the low-Presence condition, the high-Presence condition significantly increased all Presence-related measures except the Slater-Usoh-Steed questionnaire, with the largest condition differences observed for Possible Actions, Social Presence, and Observability. Moreover, a mixed analysis of variance on changes in fan intensity revealed a significant main effect of Presence condition, indicating that the high-Presence video produced a greater increase in fan intensity than the low-Presence video. These findings suggest that filming distance in immersive video is not merely a factor that determines angle of view or composition, but a design variable that can enhance Presence and deepen fan intensity.
2026-06-08T01:26:23Z
20 pages, including 6 pages of supplementary materials; 10 figures, 2 tables
Koichi Toida, Hideto Hiranuma, Shimpei Miura, Norihiro Yamamoto, Yuki Kobayashi, Shingo Meguro

http://arxiv.org/abs/2603.29495v2
All-in-One Augmented Reality Guided Head and Neck Tumor Resection
2026-06-07T19:53:36Z
Positive margins are common in head and neck squamous cell carcinoma, yet intraoperative re-resection is often imprecise because margin locations are typically communicated verbally from pathology. We present an all-in-one augmented reality (AR) system that relocalizes positive margins from a resected specimen to the resection bed and visualizes them in situ using HoloLens 2 depth sensing and fully automated markerless surface registration. In a silicone phantom study with six medical trainees, markerless registration achieved target registration errors comparable to a marker-based baseline (median 1.8 mm vs. 1.7 mm; maximum < 4 mm). In a margin relocalization task, AR guidance reduced error from verbal guidance (median 14.2 mm) to a few millimeters (median 3.2 mm), with all AR localizations within 5 mm error.
These results support the feasibility of markerless AR margin guidance for more precise intraoperative re-excision.
2026-03-31T09:38:52Z
Yue Yang, Matthieu Chabanas, Carrie Reale, Annie Benson, Jason Slagle, Matthew Weinger, Michael Topf, Jie Ying Wu

http://arxiv.org/abs/2508.10239v3
Breaking the Curse of Knowledge: Designing Personalized Jargon Support for Real-Time Online Meetings
2026-06-07T19:49:25Z
Cross-disciplinary communication is often hindered by specialized language (i.e., jargon) and uneven background knowledge. Recent advances in speech-to-text and large language models make it possible to provide jargon support during online meetings, but generic support (i.e., defining the same terms for everyone) can overwhelm listeners with definitions they do not need. We present ParseJargon, a system for personalized jargon support in real-time online meetings. We begin with an initial prototype to probe the use of single-sentence user profiles for personalization. We conducted a controlled study and showed that even this minimal personalization enhanced listeners' comprehension and engagement over generic support because of more precise jargon identification. Guided by insights from participants' feedback, we refined the system with more advanced personalization techniques, including in-session user feedback and portable glossary-based profiles. We evaluated how these techniques can further improve jargon identification precision using data collected in the controlled study to simulate personalization over time. We also conducted a latency test, complemented by a lightweight deployment, to analyze the system's real-time capability and usability.
2025-08-13T23:42:12Z
Portions of this work appeared in CHI '26 Extended Abstracts ("Breaking the Curse of Knowledge: Toward Personalized Jargon Support in Online Meetings") and ACL '26 System Demonstrations ("ParseJargon: Personalized Real-time Jargon Support in Online Meetings")
Yifan Song, Yijun Liu, Wing Yee Au, Hon Yung Wong, Brian P. Bailey, Tal August

http://arxiv.org/abs/2605.16972v2
WhiteTesseract: Reframing the Interpretation of Cultural Heritage through XR and Conversational AI
2026-06-07T13:19:01Z
Cultural heritage exhibitions often struggle to sustain attention and support reflective engagement. Physical exhibitions rely on fixed interpretive aids that lack adaptability to individual backgrounds or curiosity, and their effectiveness depends heavily on a visitor's Personal Context, prior knowledge, and cultural literacy. Meanwhile, digital exhibitions prioritize convenience and accessibility but risk weakening the Physical and Social Contexts that define embodied cultural experience.
WhiteTesseract addresses this gap by enabling in-situ interpretation through high-resolution XR and conversational AI. The system integrates spatial intelligence via artwork recognition to allow visitors to selectively reduce environmental distractions (via diminished reality) and engage in context-aware dialogue (via large language models). The goal is to preserve the richness of the physical and social environment while providing a flexible space for personal reflection, enhancing Personal Context without compromising physical authenticity.
We deployed the system in a Claude Monet exhibition and conducted a controlled user study with 26 participants. Quantitative results showed that WhiteTesseract modulation significantly increased average viewing duration from 35.3 to 98.3 seconds (p < 0.001). Analysis of 529 visitor-AI interactions revealed that 60% extended beyond factual queries to include analytical, emotional, and comparative inquiries. These findings demonstrate how XR and AI can enrich the physical exhibition experience by supporting deeper, more personalized engagement without displacing the embodied value of cultural heritage. We discuss technical and social constraints for real-world deployment and limitations of our controlled setting.
2026-05-16T12:50:37Z
38 pages, 13 figures. Accepted for publication in ACM Journal on Computing and Cultural Heritage (JOCCH)
Jingjing Li, Zhi Liu, Xiyao Jin, Tatsuki Fushimi, Yoichi Ochiai