https://arxiv.org/api/PJjX5y5qWGNz5zHis4opvV3deqI 2026-06-13T19:36:57Z 30934 135 15

http://arxiv.org/abs/2606.07101v1 CANote: Empowering Fact-checking Note Writing Through Scaffolded and Provenance-based Human-AI Collaboration 2026-06-05T09:52:27Z

Crowdsourced fact-checking mechanisms, such as X's Community Notes, play a critical role in mitigating the spread of misinformation. However, drafting high-quality, evidence-based debunking notes imposes a substantial burden on contributors. We present CANote, an AI-assisted debunking note writing system featuring evidence correlation and structured co-drafting. CANote scaffolds the workflow by extracting subclaims from social media posts, providing provenance through explicit links between subclaims and retrieved evidence, and generating neutral, structural drafts to support human reasoning. We evaluated CANote against manual writing (N=52 fact-checkers, N=52 lay users) on a simulated X platform and found that CANote significantly improves note quality. Notably, CANote enables lay users to write notes of comparable quality to those written by experts. While task completion time and perceived cognitive load remain comparable to manual drafting, CANote significantly increases user satisfaction. However, this assistance introduces a trade-off: a reduced sense of user ownership and control over the debunking note.

2026-06-05T09:52:27Z Shuning Zhang Jingruo Chen Yuwei Chuai Dai Shi Yifan Wang Xin Yi Hewu Li

http://arxiv.org/abs/2603.04982v3 Training for Technology: Adoption and Productive Use of Generative AI in Legal Analysis 2026-06-05T08:57:56Z

Can targeted user training unlock the productive potential of generative artificial intelligence in professional settings? We study this question using a randomized experiment in which 164 law students completed an issue-spotting examination under one of three conditions: no GenAI access, optional access to a large language model (LLM), or LLM access with a brief training intervention. Untrained LLM access proved counterproductive: relative to participants without any LLM access, untrained users wrote significantly shorter answers, committed more case misstatements, and scored marginally lower, though most differences fall short of conventional significance. Training reversed this pattern. Trained participants adopted the LLM at higher rates (41% vs. 26%; p = 0.044), scored 0.27 grade points higher than untrained users--roughly one fine grade--(p = 0.027), and stated applicable rules more accurately (p = 0.014). Principal stratification analysis suggests training operates primarily through adoption rather than effectiveness--the adoption lower bound (1.06) exceeds the effectiveness upper bound (0.42) at strict mean dominance--though confidence intervals are wide. More broadly, these findings challenge the view that GenAI primarily benefits lower-skilled workers: without training, higher-ability practitioners opt out while lower-ability users adopt but unproductively. Realizing GenAI's productivity gains requires investment in both access and instruction.

2026-03-05T09:23:30Z Benjamin M. Chen Hong Bao

http://arxiv.org/abs/2604.05360v2 OGA-AID: Clinician-in-the-loop AI Report Drafting Assistant for Multimodal Observational Gait Analysis in Post-Stroke Rehabilitation 2026-06-05T08:02:45Z

Gait analysis is essential in post-stroke rehabilitation but remains time-intensive and cognitively demanding, especially when clinicians must integrate gait videos and motion-capture data into structured reports. We present OGA-AID, a clinician-in-the-loop multi-agent large language model system for multimodal report drafting. The system coordinates 3 specialized agents to synthesize patient movement recordings, kinematic trajectories, and clinical profiles into structured assessments. Evaluated with expert physiotherapists on real patient data, OGA-AID consistently outperforms single-pass multimodal baselines with low error. In clinician-in-the-loop settings, brief expert preliminary notes further reduce error compared to reference assessments. Our findings demonstrate the feasibility of multimodal agentic systems for structured clinical gait assessment and highlight the complementary relationship between AI-assisted analysis and human clinical judgment in rehabilitation workflows.

2026-04-07T02:57:29Z 2026 CV4Clinic CVPR Workshop Proceedings Khoi T. N. Nguyen Nghia D. Nguyen Hui Yu Koh Patrick W. H. Kwong Karen Sui Geok Chua Ananda Sidarta Baosheng Yu

http://arxiv.org/abs/2506.02622v2 HORUS: A Mixed Reality Interface for Managing Teams of Mobile Robots 2026-06-05T07:58:34Z

Mixed Reality (MR) interfaces have been extensively explored for controlling mobile robots, but there is limited research on their application to managing teams of robots. This paper presents HORUS: Holistic Operational Reality for Unified Systems, a Mixed Reality interface offering a comprehensive set of tools for managing multiple mobile robots simultaneously. HORUS enables operators to monitor individual robot statuses, visualize sensor data projected in real time, and assign tasks to single robots, subsets of the team, or the entire group, all from a Mini-Map (Ground Station). The interface also provides different teleoperation modes: a mini-map mode that allows teleoperation while observing the robot model and its transform on the mini-map, and a semi-immersive mode that offers a flat, screen-like view in either single or stereo view (3D). We conducted a user study in which participants used HORUS to manage a team of mobile robots tasked with finding clues in an environment, simulating search and rescue tasks. This study compared HORUS's full-team management capabilities with individual robot teleoperation. The experiments validated the versatility and effectiveness of HORUS in multi-robot coordination, demonstrating its potential to advance human-robot collaboration in dynamic, team-based environments.

2025-06-03T08:38:46Z 7 pages, 7 figures, conference paper submitted to UR 2026 Omotoye Shamsudeen Adekoya Antonio Sgorbissa Carmine Tommaso Recchiuto

http://arxiv.org/abs/2606.07013v1 A Multi-Operator Mixed-Reality Interface for Multi-Robot Control and Coordination: Co-Located and Private Workspace Collaboration 2026-06-05T07:55:16Z

Multi-operator control of robot teams requires not only access to the same mission information, but also mechanisms for maintaining shared awareness and preventing conflicting interventions. Building on our previous HORUS interface (Holistic Operational Reality for Unified Systems), we present a mixed-reality interface that extends single-operator multi-robot supervision to collaborative multi-operator use. The system supports two complementary modes: a co-located shared workspace, in which operators observe and manipulate the same mini-map in the same physical location, and a private-workspace mode, in which operators work on the same mission through independently placed local workspaces. The architecture combines registration-driven scene construction, lightweight shared-session synchronization, and per-robot control leases to support collaborative monitoring, tasking, and teleoperation while preventing conflicting commands. We evaluated the approach in a human-subject study with 36 participants (18 pairs) controlling three Nova Carter mobile robots in two search environments. Objective task performance was comparable across the two modes, indicating that both supported effective mission execution. However, the co-located shared workspace significantly improved perceived collaboration, shared understanding, and handoff clarity, and was the preferred collaborative mode. These results indicate that physically co-locating the MR workspace improves how operators coordinate even when the underlying robot-control tools remain unchanged.
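
The abstract does not describe how the per-robot control leases are implemented. As a rough illustration of the general idea only, the sketch below assumes a time-limited lease that a single operator holds per robot; the class name `LeaseTable`, the `ttl_seconds` parameter, and the robot and operator identifiers are all hypothetical and not taken from the paper.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Lease:
    operator: str      # operator currently allowed to command the robot
    expires_at: float  # time after which the lease no longer counts


class LeaseTable:
    """Hypothetical per-robot lease table (illustrative, not the authors' design)."""

    def __init__(self, ttl_seconds: float = 5.0):
        self.ttl = ttl_seconds
        self._leases: Dict[str, Lease] = {}

    def holder(self, robot: str, now: Optional[float] = None) -> Optional[str]:
        """Return the operator holding a live lease on `robot`, or None."""
        now = time.monotonic() if now is None else now
        lease = self._leases.get(robot)
        if lease is None or lease.expires_at <= now:
            return None
        return lease.operator

    def acquire(self, robot: str, operator: str, now: Optional[float] = None) -> bool:
        """Grant or renew the lease unless another operator holds a live one."""
        now = time.monotonic() if now is None else now
        current = self.holder(robot, now)
        if current is not None and current != operator:
            return False  # conflicting command source: reject
        self._leases[robot] = Lease(operator, now + self.ttl)
        return True

    def release(self, robot: str, operator: str, now: Optional[float] = None) -> bool:
        """Release the lease; only the current holder may do so."""
        if self.holder(robot, now) != operator:
            return False
        del self._leases[robot]
        return True
```

Under these assumptions, a teleoperation command would be forwarded to a robot only when `acquire` returns `True` for the issuing operator, so two operators cannot drive the same robot at once, while leases on different robots stay independent.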

2026-06-05T07:55:16Z Submitted to RO-MAN 2026 Omotoye Shamsudeen Adekoya Antonio Sgorbissa Carmine Tommaso Recchiuto

http://arxiv.org/abs/2606.06417v2 Warning Message Content Increases Help Seeking in a Large-Scale Dark Web CSAM Intervention 2026-06-05T07:51:49Z

Warning messages have been used to disrupt individuals seeking online child sexual abuse material (CSAM) and promote engagement with support services, yet large-scale field evidence on message content remains limited, particularly in high anonymity environments. This study reports a field experiment on Ahmia.fi, a Tor search engine, examining how warning message content influences behavior. Across a 140-day period, almost 20 million searches were observed, with over 3 million searches containing known CSAM-related terms that triggered a warning linking to an anonymous self-help program. Users were exposed to warning messages varying in thematic content and framing, or a neutral message. Across a randomized comparison, a campaign-wide analysis, and interrupted time series models, message content consistently influenced engagement with help resources. All active messages increased click-through rates to help resources relative to the neutral condition, with a harm-focused message producing the strongest effects. At the platform level, click-through rates increased from 8.73% before the intervention to 15.67% during the campaign. These findings highlight the importance of message content in shaping responses to warning interventions, supporting an approach in which messaging is refined and adapted to increase engagement with support resources.
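
To put the platform-level figures in perspective, the short sketch below works out the absolute and relative change implied by the two click-through rates quoted in the abstract (8.73% and 15.67%). The function name `ctr_change` is illustrative; only the two percentages come from the source.

```python
def ctr_change(before_pct: float, during_pct: float) -> tuple:
    """Return (absolute change in percentage points, relative change in percent)."""
    absolute_pp = during_pct - before_pct
    relative_pct = absolute_pp / before_pct * 100.0
    return absolute_pp, relative_pct


# Figures quoted in the abstract: 8.73% before, 15.67% during the campaign.
absolute_pp, relative_pct = ctr_change(8.73, 15.67)
print(f"absolute: {absolute_pp:.2f} pp, relative: {relative_pct:.1f}%")
```

The difference is 6.94 percentage points, which corresponds to a relative increase of about 79.5% over the pre-intervention rate.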

2026-06-04T17:21:23Z Caoilte Ó Ciardha Joel Scanlan Tegan Insoll Juha Nurmi Nina Vaaranen-Valkonen

http://arxiv.org/abs/2507.01548v3 Telling stories, making Hanzi: AI-assisted co-creation with elderly migrants in urban China 2026-06-05T07:09:47Z

This paper explores how older migrants in urban China can record stories that everyday language and design often miss. We ran two co-creation workshops with 10 elders. Activities combined oral storytelling, facilitator-mediated AI assistance, and hand-making. Large language models proposed candidate glyphs through a facilitator. Participants crafted new Hanzi to hold their stories. The resulting characters served as memory anchors for later sharing and retelling. Our interpretive analysis shows heterogeneity and adaptive capacity among participants. Participants experienced AI as a creative initiator that lowered barriers to expression and making, especially for those with lower digital literacy. The work challenges homogenizing assumptions about older adults and the presumption of uniform capacities and needs. We contribute a workshop framework that positions AI as a backstage facilitator. We also offer insights on engaging older migrants as sources of community memory and situated cultural knowledge within inclusive urban systems.

2025-07-02T10:00:12Z Yunfei Chen Wen Zhan Peiyue Lin Ziqun Hua Ying Hu 10.21606/drs.2026.963

http://arxiv.org/abs/2606.06936v1 Personality Anchoring for Social Simulation: Linking Personality, Social Behavior, and Interaction Success with LLM Agents 2026-06-05T06:00:15Z

Social interactions are shaped by the interplay of dispositional traits and situational context, yet systematically investigating how personality configurations between individuals jointly influence social behavior across diverse social contexts remains methodologically challenging. We address this gap by introducing a simulation pipeline adapted from the CHARISMA framework, which employs well-known movie characters and public figures as psychologically grounded agents for multi-LLM social simulation using a method we term personality anchoring. We present a large-scale empirical study examining how dyadic Agreeableness composition influences social interaction outcomes across 1,010 simulated conversations. Our results reveal a monotonic relationship between dyadic Agreeableness composition and shared goal achievement, with Homogeneous-Agreeable pairs achieving success at roughly 10 times the rate of Homogeneous-Disagreeable pairs (62% vs. 6%). Behavioral mediation analysis reveals that Agreeableness shapes goal achievement partially through cooperative strategy selection, though it continues to predict outcomes within the same dominant strategy, indicating pathways beyond observable conversational behavior. Robustness analyses confirm high consistency of results across repeated simulations (ICC = 0.89) and stable personality expression across diverse scenarios, validating personality anchoring as a viable operationalization strategy.

2026-06-05T06:00:15Z Vahid Sadiri Javadi Aksa Aksa Fryderyk Róg Lucie Flek Johanne R. Trippas

http://arxiv.org/abs/2604.07732v2 Twitch Third-Party Developers' Support Seeking and Provision Practices on Discord 2026-06-05T05:42:03Z

Third-party developers (TPDs) often turn to online communities for support when they cannot get immediate responses from the platform. Twitch, a leading live streaming platform, has attracted many TPDs, around which an online support community has formed on Discord. This study explores TPDs' support practices using a mixed-methods approach (topic modeling to identify topics related to support seeking and provision, followed by an in-depth qualitative analysis of these topics) and finds that: (1) TPDs' support-seeking practices around social, technical, and policy matters are highly dependent on Twitch, and this dependence acts as a form of platform labor; (2) TPDs need to switch between Discord and Twitch when seeking and providing support, exacerbating their platform labor; (3) TPDs' flexible role practices reflect the community's flourishing on Discord but require roles that bridge the two platforms and transfer informal support seeking to possible formal support from Twitch. We propose implications for effectively managing support seeking and provision between formal and informal spaces to improve the development of TPDs. We also contribute to community support practice and to platform ecology work in CSCW.

2026-04-09T02:29:15Z Accepted by ACM CSCW 2026 Jie Cai He Zhang Yueyan Liu John M. Carroll Chun Yu 10.1145/3817021

http://arxiv.org/abs/2606.06851v1 Toward a Metaphysics of Learning Analytics: Ontological Positioning of Data, Inference, and Normativity 2026-06-05T02:51:14Z

The Learning Analytics (LA) community has undergone rapid development over the 15 years since the first LAK conference was held. However, while epistemological and ethical debates regarding the philosophical foundations of LA have been vigorous, metaphysical discussions have been sparse, signifying a lack of effort to derive the identity of LA from its internal principles. In this paper, we attempt to establish a metaphysics of LA by addressing the ontological question of "What is LA?" We do so by tracing back to LA's own definitions and principles to derive an answer from within LA itself. Specifically, we address what kind of existence the data LA operates on constitutes, identify eight agents including learners as ontological prerequisites, and clarify, via the is/ought problem, that LA does not derive norms from data. In particular, this system reveals that a class of LA practices, here termed norm-embedded LA, conflates LA's purpose with its operations, creating an ontological tension with the first principle. We also discuss connections with related fields and the limitations of this system. The metaphysics outlined here is not imposed from outside LA, but surfaces what LA itself has always implicitly presupposed.

2026-06-05T02:51:14Z 25 pages, 1 figure Kensuke Takii

http://arxiv.org/abs/2605.21351v2 The Human-AI Delegation-Verification Dilemma: Individual Strategies, Collective Equilibria and Sociotechnical Lock-in 2026-06-05T02:45:04Z

This paper takes an ecological approach toward large-scale models of hybrid human-AI intelligence. Emerging models of human-AI interaction predominantly advance the complementarity thesis, variously dubbed human-AI collaboration and human-AI hybrid intelligence. However, this oversimplifies the modalities of human-AI interaction and the possibility space for both individual and collective action that human-AI interaction potentiates. To fill these gaps, this paper develops a decision- and game-theoretic approach to the human-AI delegation-verification dilemma. First, we map out canonical decision-theoretic strategies that account for adaptive user trajectories, modeling how agents transition between strategies based on interaction feedback to reach stable equilibria. Second, we scale individually stable strategies to collective equilibria using three extrapolation principles: (a) non-communicative aggregation, (b) local social signaling, and (c) institutional norm setting. The analysis identifies the emergence of sociotechnical lock-in, a macro-behavioral state where individually adaptive delegation, in the absence of communicative and institutional safeguards, aggregates into a systemic collective action problem modeled as a prisoner's dilemma that degrades shared epistemic standards. We argue that adoption under higher communicative standards and institutional norms can mitigate suboptimal collective equilibria by imposing social commitments on individual users.

2026-05-20T16:15:52Z Angjelin Hila

http://arxiv.org/abs/2606.06800v1 Exploring Reinforcement Learning for Fluid Transitions Between Clinical Mental Healthcare and Everyday Wellness Support 2026-06-05T01:01:56Z

Mental health struggles wax and wane, yet clinical and wellness interventions typically operate separately, causing frequent breakdowns at care transitions. We explore reinforcement learning (RL) as a means to build digital health systems that deliver clinical and wellness interventions proactively, as part of a coherent care journey. We ask: what complexities does designing such a system involve? We built a contextual bandit that dynamically selects journaling prompts from clinical and wellness repertoires to optimize for an overarching health goal (sustained journaling) and deployed it in a four-week exploratory study (N=38). First, we found that many benefits of RL-optimized intervention sequences appeared only after interventions ended, raising the question: Should systems that offer coherent clinical-wellness care journeys include stepping-back periods? If so, when and how? Second, participants most engaged with RL-generated interventions deepened their engagement over time, while those most engaged with a constant intervention tended to burn out and drop out later. This raises the question: When should a system blending clinical and wellness interventions reduce intensity to prevent burnout, and when should it sustain intensity to maximize treatment gains?
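
The abstract does not say which bandit algorithm the authors used. Purely as an illustration of what a contextual bandit that picks prompts can look like, here is a minimal epsilon-greedy sketch with a running mean reward per (context, arm) pair; the class name, the context labels, the arm names, and the reward definition (1.0 if the participant journaled, else 0.0) are all assumptions, not details from the paper.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Illustrative epsilon-greedy contextual bandit (not the authors' algorithm)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)            # e.g. prompt categories (hypothetical)
        self.epsilon = epsilon            # probability of exploring a random arm
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # pulls per (context, arm)
        self.values = defaultdict(float)  # running mean reward per (context, arm)

    def select(self, context):
        """Explore with probability epsilon, otherwise pick the best-known arm."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        """Incrementally update the mean reward for this (context, arm) pair."""
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Hypothetical usage: reward is 1.0 if the participant journaled after the prompt.
bandit = EpsilonGreedyContextualBandit(["clinical_prompt", "wellness_prompt"])
arm = bandit.select("low_mood")
bandit.update("low_mood", arm, reward=1.0)
```

A deployed system would likely use a richer context representation and a more sample-efficient policy; the tabular form here is only meant to make the select-then-update loop concrete.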

2026-06-05T01:01:56Z Healthcare Beyond Reaction: Harnessing AI and Sensing for Proactive Care, Workshop at ACM Interactive Health 2026 (IH '26), July 05--08, 2026, Porto, Portugal Tony Wang Qian Yang

http://arxiv.org/abs/2606.06788v1 Explain Like I'm 5 or Whatever I Choose: Evaluating the Interactive Potential of Language Model Responses 2026-06-05T00:14:44Z

Evaluations of large language models (LLMs) in scientific information seeking tasks have become increasingly use-centric, such as conducting live or multi-turn evaluations with real users. These evaluations still assume a single, static chat interface, but as models are integrated into new interfaces, evaluations must shift to incorporate interface-specific criteria. We propose a new evaluation framework based on a formative study with 16 participants that tests models' ability to generate multiple responses to one query that differ along an interpretable axis of language (language complexity), inspired by direct manipulation interfaces from human-centered design literature. We evaluate GPT-5.1, GPT-5 mini, Claude Sonnet 4.5 + Thinking, and DeepSeek-V3.1 by generating 5 responses at different levels of language complexity for 98 scientific queries. While models vary complexity across responses, most changes remain inconsistent, with the best performing model (Claude Sonnet 4.5) only shifting reliable complexity measures in the correct direction 46% of the time. Our findings hold with increased sample size and alternative complexity levels.

2026-06-05T00:14:44Z Preprint Indu Panigrahi Tal August

http://arxiv.org/abs/2512.23128v2 It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents 2026-06-04T23:27:42Z

Web-based agents powered by large language models are increasingly used for tasks such as email management or professional networking. Their reliance on dynamic web content, however, makes them vulnerable to prompt injection attacks: adversarial instructions hidden in interface elements that persuade the agent to divert from its original task. We introduce the Task-Redirecting Agent Persuasion Benchmark (TRAP), a benchmark for studying how persuasion techniques misguide autonomous web agents on realistic tasks. Across six frontier models, agents are susceptible to prompt injection in 25% of tasks on average (13% for GPT-5 to 43% for DeepSeek-R1), with small interface or contextual changes often doubling success rates and revealing systemic, psychologically driven vulnerabilities in web-based agents. We also provide a modular social-engineering injection framework with controlled experiments on high-fidelity website clones, allowing for further benchmark expansion.

2025-12-29T01:09:10Z ICML 2026 Karolina Korgul Yushi Yang Arkadiusz Drohomirecki Piotr Błaszczyk Will Howard Lukas Aichberger Chris Russell Philip H. S. Torr Adam Mahdi Adel Bibi

http://arxiv.org/abs/2602.03121v3 Behind the Feed: A Taxonomy of User-Facing Cues for Algorithmic Transparency in Social Media 2026-06-04T22:49:51Z

People who use social media learn how the companies that run these platforms decide who gets to see what through visual indicators in the user interface (UI) of each social media site. These indicators differ across platforms and are not always easy to find on the site. It is therefore hard to compare social media platforms or to determine whether transparency leads to greater accountability or only to increased understanding. A new classification system has been developed to provide a standard way of categorizing how an algorithm is presented through UI elements and whether the company has provided any type of explanation as to why they are featured. The classification system covers three areas: design form, information content, and user agency. It can be applied to the six social media platforms currently available and serves as a reference database for identifying common archetypes of features in each social media platform's UI. The classification system will help determine whether the transparency of an algorithm functions as intended when it was developed, and it provides future design ideas that can help improve the inspectability, actionability, and contestability of algorithms.

2026-02-03T05:25:01Z ECSCW 2026 Haoze Guo Ziqi Wei 10.48340/ecscw2026_014