http://arxiv.org/abs/2604.16095v3 GroupEnvoy: A Conversational Agent Speaking for the Outgroup to Foster Intergroup Relations 2026-06-06T01:57:40Z

Conversational agents have the potential to support intergroup relations when psychological or linguistic barriers prevent direct interaction. Based on intergroup contact theory, we propose GroupEnvoy, a text-based conversational agent that represents outgroup perspectives during ingroup discussions. Its dialogue is grounded in data from a prior outgroup-only discussion. To evaluate this approach and derive design principles, we conducted a mixed-methods, between-subjects study with university students, in which host-country students formed the ingroup and international students formed the outgroup. Ingroup students performed a collaborative task while engaging with outgroup perspectives, either by interacting with GroupEnvoy (AI-mediated contact) or by reading a static document (passive exposure). Quantitatively, AI-mediated contact demonstrated a directional reduction in intergroup anxiety and an improvement in perspective-taking. Qualitatively, AI-mediated contact enhanced outcome expectancies and directed empathy toward the outgroup's evaluations of the ingroup, whereas passive exposure fostered future contact intentions and elicited empathy toward the outgroup's lived experiences. These findings present AI-mediated contact as a promising paradigm for improving intergroup relations.

2026-04-17T14:26:53Z 18 pages, 5 figures, accepted to ACM CUI 2026 Koken Hata Rintaro Chujo Reina Takamatsu Wenzhen Xu Yukino Baba 10.1145/3816046.3816204 http://arxiv.org/abs/2606.07897v1 The AI Epistemic Deference Index: A Continuous Measure of Sycophancy 2026-06-05T23:16:28Z

Current AI models frequently exhibit epistemic sycophancy, endorsing claims to agree with a user. Existing evaluations typically measure this either by assessing what it takes to make a model shift a binary endorsement or by eliciting an explicit probability in a proposition. However, much user-facing sycophantic behavior is demonstrated through shifts in graded support expressed through ordinary language. We propose the AI Epistemic Deference Index (AEDI): a continuous, unidimensional score representing how sensitive the support expressed in a model's output is to the attitude expressed in a user's prompt. To generate AEDI, we provide a new protocol for estimating probabilities from natural language outputs, using LLMs-as-judges validated for consistency and correlation to human judgment. We deploy it on a new curated database of 500 propositions across diverse topics and 16,000 prompts varying in user attitude, testing eight prominent models. Every model exhibits substantial deference, though with large and systematic differences across providers, with Claude models demonstrating the least, and Grok and Gemini models the most. The effect is amplified in prompts requesting a written artifact, and concentrated on propositions where models hold weaker priors. We release AEDI as an easy-to-update benchmark and measurement pipeline for output-level sycophancy evaluation.
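
The abstract above does not give the AEDI formula, so the following is only an illustrative sketch of what "sensitivity of expressed support to user attitude" could mean, not the authors' definition. It assumes a hypothetical numeric coding of user attitude (for example -1 for opposing, 0 for neutral, +1 for endorsing) and a judged support probability per response, and computes an ordinary least-squares slope. The function name `deference_slope` and the sample values are made up for illustration.

```python
def deference_slope(attitudes, supports):
    """Least-squares slope of judged support on coded user attitude.

    Illustrative assumption only: the source does not state that AEDI
    is computed this way. A slope near 0 means the expressed support
    barely moves with the user's attitude; a larger positive slope
    means more deference.
    """
    n = len(attitudes)
    if n != len(supports) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_a = sum(attitudes) / n
    mean_s = sum(supports) / n
    var_a = sum((a - mean_a) ** 2 for a in attitudes)
    if var_a == 0:
        raise ValueError("attitudes must vary to estimate a slope")
    cov = sum((a - mean_a) * (s - mean_s) for a, s in zip(attitudes, supports))
    return cov / var_a


# Hypothetical data: one proposition, three prompts differing in user attitude.
example_attitudes = [-1, 0, 1]
example_supports = [0.4, 0.5, 0.6]  # judged probability that the output supports the claim
example_slope = deference_slope(example_attitudes, example_supports)
```

A slope-style summary is continuous and one-dimensional, which matches the properties the abstract describes, but the actual aggregation over propositions and prompts may differ.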

2026-06-05T23:16:28Z Alejandro Botas Paul de Font-Reaulx Luke Hewitt http://arxiv.org/abs/2606.07837v1 Does Persona Make LLMs K-pop Fans? A Pilot Study of LLM-Based Online Concert Audience Agents 2026-06-05T20:55:38Z

A concert is a collective experience, but recorded performance videos are typically watched alone, stripping away the shared audience presence that makes concerts feel eventful. We investigate whether persona-based LLM audience agents can recreate aspects of this collective experience by generating real-time fan chat alongside a K-pop performance video. We present a multi-agent system in which ten LLM agents react through live-chat messages, comparing a persona-conditioned audience (each agent assigned a distinct fan identity, bias, and chat style) with a no-persona baseline. In a within-subjects pilot with K-pop fans (N=11), persona conditioning substantially improved model-level chat quality and perceived naturalness, but did not translate into differences in social connectedness, engagement, or affective response. Interviews suggest that online K-pop concert chat may operate as collective monologue rather than interpersonal dialogue, and that meaningful participation depends on shared identification with the specific artist and fandom. Persona conditioning can make LLM audiences appear more natural, but culturally meaningful collective experience may require deeper alignment between persona, crowd behavior, fandom identity, and user expectations.

2026-06-05T20:55:38Z Accepted at the ICML 2026 Workshop on Culture x AI: Evaluating AI as a Cultural Technology Kirak Kim Hyojin Kim Yejin Son Sungyoung Kim Kyung Myun Lee http://arxiv.org/abs/2606.07794v1 The Choreography of Augmented Reality Timelines: Studying the Relative Position, Chronology, & Situatedness of Event Sequences 2026-06-05T19:12:34Z

Timelines are effective ways to tell historical and personal stories. However, most timeline visualization tools impose an inflexible model of time prioritizing chronological clarity. On the other hand, unconstrained representations can better capture the irregular and contextual nature of lived time, but often at the cost of interpretability. In this work, we explore this continuum with a study of how historical and personal timelines could manifest in physical spaces. We conducted a formative study (N=12) in which participants freely arranged events within a physical environment. We observed a diversity of strategies reflecting the personal and context-dependent nature of temporal mental models. We also invited participants to consider how others could move through their timelines. Our analysis led to a choreographic approach to timeline creation, as well as a proof-of-concept tablet-based augmented reality (AR) application that supports spatial timeline drawing and viewing. Finally, we reflect on the design implications of encoding chronology, pacing, and spatial context in immersive timeline stories.

2026-06-05T19:12:34Z In Proceedings of Graphics Interface 2026 https://conferences.graphicsinterface.org/2026/ Isabelle Kwan Jessica Ziyu Chen Matthew Brehmer http://arxiv.org/abs/2606.07773v1 Understanding Human and Interface Design Factors in Canadian Cybercrime Reporting 2026-06-05T18:31:14Z

Cybercrime affects a majority of Canadians, yet most incidents go unreported. We conducted two studies to examine the factors influencing cybercrime reporting and the role of interface design in victims' reporting experiences. Our survey provides individual-level insights into the persistent gap in cybercrime reporting in Canada, showing how perceived incident severity and personal characteristics shape reporting behaviour. Our usability study compared reporting with an AI chatbot to an online form; chatbots facilitated more complete reports and led to higher user satisfaction, highlighting how interface design impacts reporting outcomes.

2026-06-05T18:31:14Z 17 pages, 12 figures Charlotte Carr Ananta Chowdhury Asra Sakeen Wani Sonia Chiasson http://arxiv.org/abs/2606.07765v1 TibetCPR: A Multimodal Tactile Feedback System to Enhance Cardiopulmonary Resuscitation Training in High-Altitude Regions of Tibet 2026-06-05T18:24:37Z

High-quality cardiopulmonary resuscitation (CPR) requires stable control of compression rhythm and depth, yet most training systems presuppose instructor mediation, repeated practice, and explanatory guidance, assumptions that do not hold in the Tibet Autonomous Region, where instruction is fragmented and learners' linguistic and educational backgrounds are heterogeneous. We present TibetCPR, a low-cost, self-guided CPR training system that pairs depth-driven electrotactile feedback with rhythm-driven visual cues within a Tibetan-language narrative. In a randomised study with 40 lay community members aged 19--56, the experimental group showed progressive minute-by-minute stabilisation of rhythm and depth across a 10-minute intervention, substantially exceeding an unguided-practice control, with gains transferring to an unscaffolded one-minute post-test. Qualitative accounts described the feedback as legible through participants' bodily action, and usability was high (SUS = 84.3). We synthesise three transferable design principles for self-guided embodied training: feedback as a calibration reference, not an immediate corrector; modality temporal granularity matched to behaviour's temporal structure; and autonomous interpretability as a deployment prerequisite, not an after-effect of usability.
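
For readers unfamiliar with the SUS figure quoted above, the sketch below shows the standard System Usability Scale scoring rule for a single respondent: ten items answered on a 1 to 5 scale, with odd-numbered items contributing (response - 1), even-numbered items contributing (5 - response), and the sum multiplied by 2.5 to give a 0 to 100 score. The abstract does not say how scores were aggregated across participants, and the responses in the example are hypothetical.

```python
def sus_score(responses):
    """Standard SUS score (0-100) for one respondent.

    `responses` holds the ten item answers in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("each answer must be between 1 and 5")
        if item_number % 2 == 1:
            total += answer - 1   # odd items are positively worded
        else:
            total += 5 - answer   # even items are negatively worded
    return total * 2.5


# Hypothetical respondent who answers 3 (neutral) on every item.
neutral_score = sus_score([3] * 10)
```

A study-level value is usually the mean of such per-respondent scores, though that is an assumption about this study rather than something the abstract states.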

2026-06-05T18:24:37Z Accepted to MobileHCI 2026 Yibo Meng Ruiqi Chen Zhiming Liu Xiaolan Ding http://arxiv.org/abs/2606.07437v1 Re-imagining ISO 26262 in the Age of Autonomous Vehicles: Enhancing Controllability through Transferability and Predictability 2026-06-05T16:35:58Z

The ISO 26262 standard defines functional safety for road vehicles through risk assessments based on Severity, Exposure, and Controllability, grounded in a human-driven vehicle paradigm. In the context of autonomous vehicles (AVs), the absence of a human driver necessitates revisiting these principles. This paper decomposes the Controllability placeholder into two auditable evidence dimensions of ISO 26262 by introducing two measurable sub-concepts: Transferability and Predictability. Transferability extends Controllability to capture AV systems' ability to hand off control to dedicated fallback safety mechanisms, while Predictability captures how easily external agents can anticipate AV behavior. Predictability is formally defined from human-robot interaction-inspired principles, and a mathematical framework is provided to quantify it. A designed-versus-achievable gap is introduced to distinguish architectural fallback claims from scene-conditioned achievable fallback capability. The proposed metrics align with ISO 26262 and ISO/PAS 21448 (SOTIF), rendering fallback and interaction claims falsifiable and traceable across ODD slices. These dimensions complement rather than replace existing standards, and the enhancements preserve the structure of ISO 26262 while extending its applicability to driverless automated systems operating at SAE Levels 4 and 5.

2026-06-05T16:35:58Z Chaitanya Shinde Hadi Hajieghrary Paul Schmitt Adam Shoemaker Bodo Seifert Steve Kenner http://arxiv.org/abs/2511.10544v3 Effects of Personality- and Opinion-Alignment in Human-AI Interaction 2026-06-05T15:12:41Z

Interactions with AI assistants are increasingly personalized to individual users. As AI personalization is dynamic and machine-learning-driven, we have limited understanding of how personalization affects interaction outcomes and user perceptions. We conducted a large-scale controlled experiment in which 1,000 participants interacted with AI assistants prompted to take on specific personality traits and opinions. Our results show that participants consistently preferred to interact with models that shared their opinions. Participants found opinion-aligned models more trustworthy, competent, warm, and persuasive, corroborating an AI-similarity-attraction hypothesis. In contrast, we observed no or only weak effects of AI personality alignment, with introvert models rated as less trustworthy and competent by introvert participants. These findings highlight opinion alignment as a central dimension of AI user preference, while underscoring the need for a more grounded discussion of the mechanisms and risks of AI personalization.

2025-11-13T17:45:20Z Maximilian Eder Clemens Lechner Maurice Jakesch http://arxiv.org/abs/2606.07714v1 Beyond Accuracy: Interpreting Topic Representation in Suicide Ideation Detection Models 2026-06-05T14:46:50Z

Suicide ideation detection models are typically evaluated using aggregate performance metrics, yet little is known about how they internally represent psychologically meaningful risk factors. In high-stakes mental health applications, understanding these internal representations is essential for safety, transparency, and responsible deployment. In this work, we move beyond accuracy and analyze how suicide detection models trained on original and topic-augmented datasets encode psychological risk factors in their internal representation space. Using visualization and geometric analysis, we examine the coherence and separability of topic-related features. Our results show that topic-aware augmentation increases the clarity and distinctness of underrepresented psychosocial risk factors such as immigration, family issues, and financial crisis. These findings suggest that augmentation not only improves model performance but also leads to more structured and interpretable internal representations.

2026-06-05T14:46:50Z Hamideh Ghanadian Isar Nejadgholi Hussein Al Osman http://arxiv.org/abs/2606.07283v1 A Model of Integrated Information Processing in Human-AI Interaction 2026-06-05T13:57:05Z

For Human-AI Interaction (HAII) research to move forward, theoretical work linking psychological mechanisms to interface design is needed. Such work should extend rather than replace established HCI and automation research, adapting to the increasing autonomy and agency of AI systems. Building on prior frameworks focused on roles and levels in human interaction with automation, a gap remains from a psychological view: a task-centered, process-oriented account that links mechanisms of action regulation to concrete design and evaluation levers for human-AI coupling, expressed in a unified vocabulary for human and machine. Moreover, existing models may describe how a system is designed (e.g., function allocation in automation) but fall short in showing how this design affects human behavior. We present the Integrated Information Processing (IIP) model, a task-centered, cybernetic model that conceptualizes humans, machines, and their joint activity as coupled control loops. The IIP model uses a unified modeling language for human and artificial agents, making psychological models of action regulation accessible for AI system design. As a core feature, we argue that efficacy within a shared task is characterized by three integration qualities, input adequacy, reference consonance, and output operativity, which critically influence benchmarks of human-centeredness such as transparency and controllability. The model maps interface choices (e.g., XAI techniques) to theory-driven expectations of user behavior, guiding interface design and evaluation. To this end, we present (1) a continuity-preserving theoretical discourse that extends HAII to agency in AI; (2) the IIP model with three information-processing qualities; and (3) applications of the IIP model to exemplary use cases demonstrating implications for interface design.

2026-06-05T13:57:05Z 22 pages Tim Schrills Thomas Franke http://arxiv.org/abs/2511.06080v4 AIDEN: Design and Pilot Study of an AI Assistant for the Visually Impaired 2026-06-05T13:26:09Z

This paper presents AIDEN, an artificial intelligence-based assistant designed to enhance the autonomy and daily quality of life of visually impaired individuals, who often struggle with object identification, text reading, and navigation in unfamiliar environments. Existing solutions such as screen readers or audio-based assistants facilitate access to information but frequently lead to auditory overload and raise privacy concerns in open environments. AIDEN addresses these limitations with a hybrid architecture that integrates You Only Look Once (YOLO) for real-time object detection and a Large Language and Vision Assistant (LLaVA) for scene description and Optical Character Recognition (OCR). A key novelty of the system is a continuous haptic guidance mechanism based on a Geiger-counter metaphor, which supports object centering without occupying the auditory channel, while privacy is preserved by ensuring that no personal data are stored. Empirical evaluations with visually impaired participants assessed perceived ease of use and acceptance using the Technology Acceptance Model (TAM). Results indicate high user satisfaction, particularly regarding intuitiveness and perceived autonomy. Moreover, the ``Find an Object'' feature achieved effective real-time performance. These findings provide promising evidence that multimodal haptic-visual feedback can improve daily usability and independence compared to traditional audio-centric methods, motivating larger-scale clinical validations.

2025-11-08T17:23:51Z Luis Marquez-Carpintero Francisco Gomez-Donoso Zuria Bauer Bessie Dominguez-Dager Alvaro Belmonte-Baeza Mónica Pina-Navarro Francisco Morillas-Espejo Felix Escalona Miguel Cazorla http://arxiv.org/abs/2606.09901v1 On the Controllability-Fidelity Frontier in Diffusion Editing 2026-06-05T13:24:01Z

Diffusion-based generative models enable powerful image editing capabilities, but achieving precise control while maintaining fidelity and safety remains challenging. We present a comprehensive theoretical and empirical study of controllable diffusion-based image editing, analyzing the trade-offs between adherence to user intent, preservation of non-target content, and output quality. Our work spans text- and mask-guided edits, point/drag manipulation, and inversion-based pipelines. We derive mathematical formulations of editing objectives and analyze dynamics of noise injection, score guidance, and inversion error. We provide theoretical bounds on reconstruction error, stability under repeated edits, and locality of changes. We propose algorithmic frameworks (with pseudocode) for mask-localized and instruction-guided editing, and present extensive experiments comparing state-of-the-art methods (e.g.\ TF-ICON \cite{lu2023tficone}, DragFlow \cite{zhou2025dragflow}, InstructPix2Pix \cite{brooks2023instructpix2pix}, UltraEdit \cite{zhao2024ultraedit}) on multiple tasks and metrics (FID, identity similarity, CLIP alignment, artifact scores, etc.). Our results reveal key failure modes, such as identity drift, prompt sensitivity, and compositional errors. We also discuss ethical considerations in image editing, including misuse risks, bias, consent, and concept erasure techniques (e.g.\ MACE \cite{lu2024mace}, ANT \cite{li2025ant}, EraseAnything \cite{gao2024eraseanything}) as safeguards. We conclude with best practices and future directions for responsible, high-fidelity diffusion-based editing.

2026-06-05T13:24:01Z Preprint Yi Hu Leying Yi Emily Davis Finn Carter http://arxiv.org/abs/2606.06126v2 Deterring Searches for Child Sexual Abuse Material on Google Search and Promoting Help-Seeking 2026-06-05T13:07:45Z

Google Search deploys a "Onebox" feature at the top of the results page when users conduct searches for Child Sexual Abuse Material. This study evaluates the impact of a strategic shift in this feature, comparing a revised intervention, focused on repercussions and therapeutic resources, to a previous iteration that focused on reporting. Using a difference-in-differences analysis of internal Google Search logs data, we found the new messaging resulted in a 3.8 percentage point reduction as compared to the status quo in subsequent CSAM-related queries within the same Search session. We found an average click through rate of 0.73% on any of the hyperlinked buttons to help-providing resources. Together, this research presents convergent evidence that a subset of individuals can be deterred from ongoing CSAM-seeking and redirected to therapeutic services.
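
As background on the method named above, a difference-in-differences estimate in its simplest two-group, two-period form is the change in the treated group minus the change in the comparison group. The sketch below illustrates only that arithmetic. The rates are hypothetical and are not taken from the study, and the paper's actual specification (controls, session definitions, standard errors) is not described in the abstract.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Basic two-by-two difference-in-differences.

    Each argument is an outcome rate (for example, the share of sessions
    with a follow-up query) for one group in one period. The result is
    the treated group's change minus the control group's change.
    """
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change


# Hypothetical rates, chosen only to show the calculation.
effect = did_estimate(
    treated_pre=0.30, treated_post=0.25,   # group shown the revised message
    control_pre=0.30, control_post=0.29,   # group shown the earlier message
)
effect_in_points = effect * 100  # approximately -4 percentage points
```

Subtracting the comparison group's change removes trends that affect both groups alike, which is why the estimate is read as the effect of the revised message rather than of time passing.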

2026-06-04T13:13:30Z Rebecca Umbach Griffin Hunt John Buckley Joel Scanlan Caoilte Ó Ciardha Ethel Quayle Ainslie Heasman Maximilian von Heyden Elizabeth Letourneau Donald Findlater Tegan Insoll Richard Wortley Chad Steel Abhishek Roy http://arxiv.org/abs/2606.07231v1 Moodie: An Early-Stage Design Exploration for Supporting Fear of Missing Out with LLM-based Chatbots 2026-06-05T12:56:47Z

The excessive use of social media has led to the challenge known as Fear of Missing Out (FoMO). Existing studies fail to provide accessible, interactive tools that focus on the emotional and cognitive aspects of FoMO. This work presents Moodie, a chatbot designed using Large Language Models to support emotion regulation and reduce FoMO. We conducted a formative study to understand the needs of individuals with FoMO and developed Moodie. Then, we conducted a preliminary evaluative study (N=21) to observe how participants interact with Moodie and a baseline chatbot (GPT-4o) over one week. The results show that while both Moodie and a baseline chatbot reduced FoMO to a similar extent, Moodie resulted in greater engagement and social connection. This finding raises interesting questions about the advantages of purpose-built chatbots compared to general-purpose models for mental health support. Future research will include chat log analysis, prototype refinements, and longitudinal evaluations.

2026-06-05T12:56:47Z 7 pages, 1 figure, 1 table. Preliminary work submitted to the ACM CUI 2026 Works-in-Progress (WiP) track Hsin-Yu Tsai Jingxian Liao Fu-Yin Cherng Tzu-Hsiang Huang http://arxiv.org/abs/2505.17739v2 Feasible Action Space Reduction for Quantifying Causal Responsibility in Continuous Spatial Interactions 2026-06-05T10:59:58Z

Understanding the causal influence of one agent on another agent is crucial for safely deploying artificially intelligent systems such as automated vehicles and mobile robots into human-inhabited environments. Existing models of causal responsibility deal with simplified abstractions of scenarios with discrete actions, thus limiting real-world use when understanding responsibility in spatial interactions. Based on the assumption that spatially interacting agents are embedded in a scene and must follow an action at each instant, Feasible Action-Space Reduction (FeAR) was proposed as a metric for causal responsibility in a grid-world setting with discrete actions. Since real-world interactions involve continuous action spaces, this paper proposes a formulation of the FeAR metric for measuring causal responsibility in space-continuous interactions. We illustrate the utility of the metric in prototypical space-sharing conflicts, and showcase its applications for analysing backward-looking responsibility and in estimating forward-looking responsibility to guide agent decision making. Our results highlight the potential of the FeAR metric for designing and engineering artificial agents, as well as for assessing the responsibility of agents around humans.

2025-05-23T11:02:44Z In review Ashwin George Luciano Cavalcante Siebert David A. Abbink Arkady Zgonnikov