http://arxiv.org/abs/2504.20519v5 Large Language Model Chatbot Conversations vs Public Health Materials and Parental HPV Vaccination Intentions: A Randomized Clinical Trial 2026-06-09T13:02:32Z

Health care systems are increasingly considering large language model (LLM)-based chatbots for vaccine communication, but evidence that they improve durable, behaviorally relevant outcomes beyond existing health materials is limited. This randomized clinical trial tested whether brief, multiturn LLM chatbot interactions increased parental intention to vaccinate children against human papillomavirus (HPV) compared with no intervention and government public health materials, and whether effects persisted. Parents in the US, Canada, and UK were recruited online from March 3 to May 25, 2025, with follow-up at 15 and 45 days. Eligible participants were adults with at least one HPV vaccine-eligible child who was unvaccinated or whose vaccination status was unknown. Participants were randomized to no-message control, country-matched government materials with at least 3 minutes of exposure, or a 3-minute GPT-4o chatbot interaction using either a default persuasive style or a shorter conversational style. The primary outcome was self-reported likelihood of vaccinating the child against HPV within 12 months, measured immediately after intervention on a 0-100 scale. Follow-up outcomes included vaccination intent and self-reported vaccination at 15 and 45 days. In total, 1297 participants were randomized (mean age 42.84 years; 72.1% female). Compared with no intervention, public health materials increased immediate vaccination intent (Cohen d = 0.53; 95% CI, 0.36-0.70), as did the default chatbot (d = 0.48; 95% CI, 0.30-0.65) and conversational chatbot (d = 0.33; 95% CI, 0.17-0.49). At 45 days, neither chatbot increased intent relative to controls, whereas public health materials maintained modest effects. No intervention increased self-reported vaccination uptake. Findings suggest well-designed public health materials may match or exceed short LLM chatbot conversations for HPV vaccine promotion.
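
The effect sizes above are reported as Cohen d with 95% confidence intervals. As a rough illustration of what that statistic measures, the sketch below computes a standardized mean difference from two groups of scores using the pooled sample standard deviation, plus a large-sample normal-approximation interval. The abstract does not state how the trial estimated d or its intervals, so the function names, the interval formula, and the toy scores are all assumptions made for illustration, not the authors' analysis.

```python
from math import sqrt
from statistics import mean, variance


def cohen_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


def approx_ci(d, n1, n2, z=1.96):
    """Large-sample normal approximation to a 95% interval for d (an assumption)."""
    se = sqrt((n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2)))
    return d - z * se, d + z * se


# Made-up 0-100 intent scores, not trial data.
treated = [2, 4, 6]
untreated = [1, 3, 5]
d = cohen_d(treated, untreated)
low, high = approx_ci(d, len(treated), len(untreated))
print(d, low, high)
```

With only three scores per group the interval is very wide; the narrow intervals in the abstract reflect the trial's much larger arms.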

2025-04-29T07:59:46Z JAMA Network Open 2026 Neil K. R. Sehgal Sunny Rai Manuel Tonneau Anish K. Agarwal Joseph Cappella Melanie Kornides Lyle Ungar Alison Buttenheim Sharath Chandra Guntuku 10.1001/jamanetworkopen.2026.16822 http://arxiv.org/abs/2606.10786v1 Being and Time in XR: Other-Presentness Beyond Co-Presence 2026-06-09T12:37:20Z

Research in XR (Extended Reality) has conventionally centred upon concepts such as Presence, Embodiment, Social Presence, and Co-presence. Within these traditions, bodily action, sensory contingencies, synchronous interaction, and possibilities for action have generally been regarded as constitutive conditions for the experience of "being there" and of being with others. XR environments, however, permit the partial separation of conditions that ordinarily co-vary in everyday experience. Bodily co-presence, temporal simultaneity, spatial configuration, and social interaction need not remain inseparable. This paper approaches this possibility as a problem of other-presentness. Other-presentness refers to the conditions under which another individual is experienced as existing "here and now". The contribution of this paper does not lie in arguing that asynchronous others can evoke social responses; such observations have already been addressed within parasocial interaction and social presence research. Rather, the novelty lies in theorising XR as a technological condition capable of separating and operationalising the constitutive elements of other-presentness as design variables. Reconsidering Bodyless Presence as a methodological precedent and drawing upon experimental findings from Immersive Video research, this paper formulates Bodyless Presentness as a condition in which another individual continues to be experienced as presently existing despite attenuated bodily co-presence and weakened real-time simultaneity.

2026-06-09T12:37:20Z 5 pages, 3 figures Koichi Toida http://arxiv.org/abs/2606.10753v1 Deploying Speech-Driven 3D Facial Animation in Unreal Engine for Production-Ready Digital Humans 2026-06-09T12:03:42Z

Speech-driven 3D facial animation research has shown promising results, but most methods rely on representations that are not compatible with production pipelines. In this work, we present a deployable system that bridges this gap by enabling speech-driven 3D facial animation directly in Unreal Engine (UE) using ARKit-compatible representations. We construct the 3DMEAD-ARKit dataset by converting the MEAD corpus into blendshape sequences using MediaPipe, and retrain FaceDiffuser and ProbTalk3D-X to generate stochastic and emotion-controllable animations. We further develop a modular UE plugin with a Python backend that supports model selection and parameter control. In a perceptual user study, we compare the results to two existing commercial tools: Epic Games' MetaHuman speech-driven animator and NVIDIA Audio2Face. The results highlight the importance of comparisons among academic and commercial pipelines. We recommend watching the supplementary video. We also plan to give live demonstrations of our work at the SIGGRAPH 2026 conference.

2026-06-09T12:03:42Z 11 pages Alessandro Busacchi Kazi Injamamul Haque Zerrin Yumak 10.1145/3799825.3818695 http://arxiv.org/abs/2605.12100v2 HM-Req: A Framework for Embedding Values within CPS Human Monitoring Requirements 2026-06-09T09:36:19Z

Monitoring humans, for example, their movement or location, is essential for safe and efficient human-machine collaboration in Cyber-Physical Systems (CPS). This information allows CPS to ensure safety properties, adapt their behaviour dynamically, and coordinate with humans. To ensure that the design of a CPS respects ethical principles and the privacy of its stakeholders, system requirements, particularly those related to human monitoring, must reflect the human values of all involved stakeholders. However, human values are often underrepresented in Software Engineering -- particularly during requirements elicitation and system design, crucial phases when introducing ethically critical functionality. Stakeholder values are often implicit and conflicting, yet rarely systematically captured. Furthermore, unstructured natural language requirements introduce ambiguity and vagueness, complicating conflict resolution. To address these problems, we propose HM-Req, a requirements elicitation framework including a Controlled Natural Language (CNL) for defining human monitoring requirements. These requirements are then augmented with human values from relevant stakeholders and integrated into a Value Dashboard to detect potential conflicts that require further discussion and resolution. Validation results, applying the CNL to different datasets and conducting a survey and expert interview, provide evidence of the CNL's ability to capture diverse human monitoring requirements and demonstrate HM-Req's usefulness for requirements elicitation activities.

2026-05-12T13:15:39Z Accepted Version for publication at the 34th IEEE International Requirements Engineering Conference (RE'26). 10+2 pages Zoe Pfister Ruth Breu Michael Vierhauser http://arxiv.org/abs/2606.10627v1 Profy: Interpretable Visualization of Expertise-Dependent Motor Skills Toward Supporting Piano Practice 2026-06-09T09:28:46Z

The quality of piano performance depends on nuanced timing, articulation, and dynamic control, but practice feedback is often summary-based and hard to act on. We introduce Profy, a weakly supervised system that learns from take-level labels derived from aggregated listener ratings (expert-labeled vs. amateur-labeled) to produce time-aligned highlights for review during piano practice. We collected synchronized 1 kHz key-motion and audio from 73 pianists and used 1,083 valid takes for modeling and evaluation. The model outputs clip-level predictions together with evidence scores on a shared resampled model time base for visualization. On 20 amateur clips from short technique studies annotated by 21 expert pianists, the displayed highlight score aligns with passages that expert pianists marked for review despite training without localized labels (Pearson r=0.61, ROC-AUC 0.75). Rather than summarizing a take with a single global score, Profy helps learners decide where to inspect next by supporting scrubbing, looping, and focused replay of time-localized passages associated with expert-amateur differences.

2026-06-09T09:28:46Z Designing Interactive Systems Conference (DIS '26), June 13-17, 2026, Singapore, Singapore Kazuki Kawamura Fujiki Nakamura Hayato Nishioka Momoko Shioki Shinichi Furuya Jun Rekimoto 10.1145/3800645.3812903 http://arxiv.org/abs/2605.06234v2 RobotEQ: Transitioning from Passive Intelligence to Active Intelligence in Embodied AI 2026-06-09T07:34:06Z

Embodied AI is a prominent research topic in both academia and industry. Current research centers on completing tasks based on explicit user instructions. However, for robots to integrate into human society, they must understand which actions are permissible and which are prohibited, even without explicit commands. We refer to the user-guided AI as passive intelligence and the unguided AI as active intelligence. This paper introduces RobotEQ, the first benchmark for active intelligence, aiming to assess whether existing models can comprehend and adhere to social norms in embodied scenarios. First, we construct RobotEQ-Data, a dataset consisting of 1,894 egocentric images, spanning 10 representative embodied categories and 56 subcategories. Through extensive manual annotation, we provide 4,944 action judgment questions and 1,157 spatial grounding questions, specifying appropriate robot actions across diverse scenarios. Furthermore, we establish RobotEQ-Bench to evaluate the performance of state-of-the-art models on this task. Experimental results demonstrate that current models still fall short in achieving reliable active intelligence, particularly in spatial grounding. Meanwhile, leveraging RAG techniques to incorporate external social norm knowledge bases can generally enhance performance. This work can facilitate the transition of robotics from user-guided passive manipulation to active social compliance.

2026-05-07T13:22:26Z Kuofei Fang Xinyi Che Haomin Ouyang Shufan Zhang Xuehao Wang Qi Liu Liyi Liu Chenqi Zhang Wenxi Cai Wenyu Dai Jinyang Wu Fan Zhang Haoyu Chen Bin He Zheng Lian http://arxiv.org/abs/2605.04254v3 Hierarchical Support Vector State Partitioning for Distilling Black Box Reinforcement Learning Policies 2026-06-09T07:30:52Z

We introduce State Vector Space Partitioning (SVSP), a novel method to mimic a black box reinforcement learning policy using a set of human-interpretable subpolicies. By partitioning a distillation dataset of state-action pairs with linear support vector machine splits, SVSP constructs a compact and structured representation of the original policy. Our method improves mean return by +7.4% over previous critic-driven state partitioning attempts such as Voronoi State Partitioning (VSP) and +2.8% over the original TD3 policy, while reducing the number of required subpolicies against VSP by 82.1%. Our results pave the way towards a more flexible form of distillation where both the decision boundary and surrogate models can be chosen within a margin of the original black box behavior.

2026-05-05T19:40:05Z Accepted for poster presentation at HHAI 2026 Senne Deproost Mehrdad Asadi Ann Nowé http://arxiv.org/abs/2606.11269v1 Traits Run Deeper: Trait-Specific Asymmetric Fusion for Personality Assessment 2026-06-09T06:38:36Z

Personality assessment aims to infer stable personality traits from dynamic behaviors across language, voice, and facial cues. Since different personality dimensions are revealed through distinct behavioral perspectives, modeling trait-specific evidence is challenging. However, most existing approaches adopt a uniform multimodal fusion strategy across all dimensions, assuming identical modality contributions. This overlooks trait-specific modality preferences and introduces cross-modal interference. To address this issue, we propose a novel personality assessment framework called Traits Run Deeper, which consists of three components. Specifically, the Multimodal Foundation Representation (MFR) module constructs personality-oriented multimodal inputs and leverages psychology-informed semantic templates as anchors, enabling foundation models to capture trait-relevant information. Building upon MFR, the Trait-Specific Modality Fusion (TSMF) module acts as an asymmetric fusion mechanism, allowing each dimension to selectively exploit different modality pathways from modality-specific modeling to complementary fusion. Thus, TSMF captures heterogeneous modality preferences while reducing cross-modal contamination. Furthermore, the Distribution-Calibrated Personality Regression (DCPR) module mitigates label imbalance and central tendency bias through target distribution calibration, improving robustness and stability. Experimental results on the AVI Challenge 2026 validation set demonstrate the effectiveness of the proposed framework, reducing mean squared error (MSE) by approximately 25% compared with the baseline. Consistent improvements are observed on the official test set, where our method achieves the best performance and ranks first in the Personality Assessment Track. The source code will be made available at https://github.com/MSA-LMC/AVI2026.

2026-06-09T06:38:36Z Jia Li Qian Chen Wei Wang Xinyu Li Zhenzhen Hu Dongsheng Shao Richang Hong Meng Wang http://arxiv.org/abs/2606.10434v1 Profiling cognitive offloading in LLM-mediated synthesis writing: Volume vs. content 2026-06-09T05:21:18Z

This study compares two approaches to profiling how learners offload cognitive activity to LLMs during a synthesis writing task. Drawing on Salomon's distributed cognition and the Kintsch and van Dijk model of text comprehension, the study operationalises offloading to an LLM in two ways: as a volume of LLM use and as content of what is offloaded, both along with prior knowledge. Data from 97 university students interacting with a general-purpose LLM via a custom interface were analysed using k-means clustering. To capture the content of offloading, their prompts were interpreted as to who performs the activity (active or passive) and at what level of comprehension (local or global). Volume-based profiling (k=4) differentiated learners primarily by prior knowledge, with volume negatively associated with essay authorship. Content-based profiling (k=5) revealed qualitatively distinct patterns of offloading, from vocabulary clarification to active direction of structuring and generation to passive delegation of comprehension at both levels. These patterns reflect different fragmentation of the cognitive process, with differences in learning strategies, behavioural markers, and essay authorship. Combining volume and content of offloading could improve future analyses on how LLM use redistributes cognitive activity and its effects on learners.
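
Both profiling approaches above rest on k-means clustering of per-learner features. As a reminder of what that procedure does, here is a minimal sketch of Lloyd's algorithm on two-dimensional points. The feature choice (a scaled LLM-use volume and a scaled prior-knowledge score), the explicit starting centroids, and the toy data with k=2 are assumptions for illustration; the study itself used k=4 and k=5, and its exact features and software are not given in the abstract.

```python
from math import dist
from statistics import fmean


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            updated.append(
                tuple(fmean(axis) for axis in zip(*members)) if members else old
            )
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids


# Hypothetical learners: (scaled volume of LLM use, scaled prior knowledge).
learners = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
labels, centers = kmeans(learners, centroids=[learners[0], learners[2]])
print(labels, centers)
```

Passing the starting centroids explicitly keeps the run deterministic; practical analyses usually use several random initializations and keep the best solution.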

2026-06-09T05:21:18Z Accepted to the Proceedings of the European Conference for Technology-Enhanced Learning 2026 Oleksandra Poquet Mani Shankar Nanduri Maria Ximena Salinas Loyer Matthias Stadler Michael Sailer Jelena Jovanovic http://arxiv.org/abs/2606.10398v1 Selection, Not Salience: The Shape and Limits of Personalization in Social Highlighting 2026-06-09T04:18:08Z

Does personalizing what a reader sees pay off, and where does it stop? Using a social web highlighter and a co-readership identity control (the same document highlighted by many users, which holds document and topic fixed and asks whether a person's own history predicts their marks better than another reader's does), we map the shape and limits of personalization across reading altitudes. At the document altitude we give the clean, leakage-free, identity-controlled measurement that prior next-document evaluations could only upper-bound: a person's history identifies which documents in a co-reading neighborhood are theirs, with an own-versus-other gap of +0.169 against community negatives and +0.119 against topic-matched hard negatives (both highly significant); a content-based arm suggests the signal is not purely title-driven but is largely thematic. This is comparable to the span-level selection signal (+0.14) from our prior work: the selection signal is of comparable magnitude across altitudes (+0.12 to +0.17), most of it stable topic preference. At the sentence altitude, a two-stage personalized auto-highlight (an impersonal model proposes candidates, a personal model re-ranks them) does not improve on its impersonal baseline: two off-the-shelf zero-shot LLMs, including a frontier model, predict highlight locations worse than a lead baseline, and personal re-ranking is beaten by the salience order even on the highest-recall candidate pool, so the null is not merely a Stage-1 ceiling artifact. Measurable personalization appears primarily at the selection layer: modest (~+0.13), topic-dominated, with no reliable gain at the salience layer. We also surface a control-in-negatives bias that inflated our document gap to a spurious +0.227 until audited. Going beyond the shared salience layer may be better approached by aggregating individuals than by personalizing them harder.

2026-06-09T04:18:08Z 9 pages, 1 figure, 3 tables Kazuki Nakayashiki Keisuke Watanabe http://arxiv.org/abs/2603.20511v2 CARE: A Capability-Based Measurement Framework for Reproductive Equity in Human-AI Interaction 2026-06-09T02:51:27Z

Algorithmic systems mediate sexual and reproductive health (SRH) information seeking. Standard HCI and AI evaluation centers usability, accuracy, and interaction quality, measures designed to assess task performance and interaction quality at the system level. We introduce CARE, the Capability Approach for Reproductive Equity, a measurement framework for human-AI interaction that adds capability outcomes as a unit of evaluation above task performance. CARE functions in two parts. The Normative Design Lens identifies the resources, conversion factors, capabilities, and functionings a system should support. The Evaluation Lens assesses how design features, interaction patterns, and social conditions shape capability outcomes, tradeoffs, and lived experiences in use. We apply CARE to SRH-specific chatbots, general-purpose LLMs, and search engine features in a study with 12 participants, demonstrating that it surfaces capability outcomes standard metrics aggregate away. The same design features expanded capabilities for some users while constraining them for others: source-level organization, response format, tone, and SRH-specific features all shaped which capabilities expanded for which users and in which direction. Participants' professional backgrounds, gender identities, and prior AI familiarity further shaped these effects, producing capability outcomes that usability and accuracy metrics, aggregated across users, would not surface. These findings demonstrate capability outcomes as a measurable unit for human-AI interaction evaluation, extending existing metrics with a capability layer above task performance.

2026-03-20T21:25:56Z Alice Zhong Phoebe Chen Punya Aragula Anika Sharma Kandyce Brennan Snehalkumar 'Neil' S. Gaikwad 10.1145/3772363.3799046 http://arxiv.org/abs/2507.09788v3 TinyTroupe: An LLM-powered Multiagent Persona Simulation Toolkit 2026-06-09T02:50:22Z

Recent advances in Large Language Models (LLM) have led to a new class of autonomous agents, renewing and expanding interest in the area. LLM-powered Multiagent Systems (MAS) have thus emerged, both for assistive and simulation purposes, yet tools for realistic human behavior simulation -- with its distinctive challenges and opportunities -- remain underdeveloped. Existing MAS libraries and tools lack fine-grained persona specifications, population sampling facilities, experimentation support, and integrated validation, among other key capabilities, limiting their utility for behavioral studies, social simulation, and related applications. To address these deficiencies, in this work we introduce TinyTroupe, a simulation toolkit enabling detailed persona definitions (e.g., nationality, age, occupation, personality, beliefs, behaviors) and programmatic control via numerous LLM-driven mechanisms. This allows for the concise formulation of behavioral problems of practical interest, either at the individual or group level, and provides effective means for their solution. TinyTroupe's components are presented using representative working examples, such as brainstorming and market research sessions, thereby simultaneously clarifying their purpose and demonstrating their usefulness. Quantitative and qualitative evaluations of selected aspects are also provided, including preliminary experiments with real human behavior as control. Results highlight possibilities, limitations, and trade-offs. The approach, though realized as a specific Python implementation, is meant as a novel conceptual contribution, which can be partially or fully incorporated in other contexts. The library is available as open source at https://github.com/microsoft/tinytroupe.

2025-07-13T21:00:27Z 9 pages Paulo Salem Robert Sim Christopher Olsen Prerit Saxena Rafael Barcelos Yi Ding http://arxiv.org/abs/2606.10325v1 Design and Implementation of a Real-time Multi-site Immersive Learning System Using Photon Fusion 2026-06-09T02:19:55Z

In this paper, we develop a Virtual Reality-based immersive learning environment that allows teachers to conduct a lesson in a virtual space using Photon Fusion. The proposed system allows teachers and students to be present in the same virtual space regardless of their actual physical locations. The teachers can verbally communicate with students in real-time, interacting with 3D learning materials. By adopting Photon Fusion, the system achieves stable real-time communication and synchronization among multiple players. Evaluation results demonstrate that the proposed system provides stable communication performance, good usability, and minimal VR sickness, confirming its effectiveness as an immersive learning environment.

2026-06-09T02:19:55Z Iwai Wataru Duc V. Nguyen http://arxiv.org/abs/2606.10182v1 Creativity in the BioFoundry: Supporting scientific creativity in the age of automation 2026-06-08T21:17:53Z

Biofoundries automate biological experimentation at unprecedented scale, promising speed, reproducibility, and access. Yet automation also reshapes how scientists experience experimentation and creativity. Through in-depth interviews with nine scientists and experts across academia and industry (including biofoundry developers, automation engineers, and end-users), we examine how scientific creativity is enacted under automation. Biofoundries displace sensory cues, redistribute responsibility between humans and machines, and transform troubleshooting from an embodied, local practice into a predictive, social, and interpretive one. Rather than framing biofoundries as automation factories, we argue that they should be understood as Creativity Support Tools, whose design directly shapes how researchers notice breakdowns, exercise judgment, learn from failure, and progress through success. By connecting biofoundry practice with prior HCI work on automation, debugging, and distributed creativity, this paper demonstrates biofoundries as a distinctive and timely site for creativity research in science.

2026-06-08T21:17:53Z 13 pages, 6 figures, 2 tables, ACM Creativity and Cognition Conference 2026 Mingyan Claire Tian Sarah Sterman 10.1145/3803784.3807549 http://arxiv.org/abs/2606.10180v1 Flow Control: Steering Vision-Language-Action Models with Simple Real-Time Inputs 2026-06-08T21:16:37Z

We introduce flow control of vision-language-action (VLA) models, a simple and effective way to steer VLA actions in real-time through generic inputs, such as a keyboard. This method can be used out-of-the-box and does not require retraining or fine-tuning VLAs. It enables relatively crude user inputs to steer a VLA to align with user intent. The VLA transforms these inputs into action samples drawn from the VLA expert action distribution learned during training, so that the generated actions are high quality (conformity to the action expert distribution) and high fidelity (reflecting the user's intent). We demonstrate that flow control has many desirable properties: (1) flow control accurately and responsively steers robot actions with user inputs, (2) it is robust to suboptimal user inputs, (3) it enables users to steer VLAs to achieve significantly higher success rates and faster task completion, and (4) fine-tuning a VLA on flow control trajectories improves the autonomous policy. Together, these results provide a simple and intuitive way for users to help steer VLA actions, increasing task performance.

2026-06-08T21:16:37Z 10 pages, 5 figures Jonathan C. Kao Jason Chan Andy Wang