https://arxiv.org/api/3RAVMEdQhZppVR6xzTlBr+EG18s 2026-03-22T10:27:50Z 29041 15 15 http://arxiv.org/abs/2603.18798v1 Signals of Success and Struggle: Early Prediction and Physiological Signatures of Human Performance across Task Complexity 2026-03-19T11:47:57Z User performance is crucial in interactive systems, capturing how effectively users engage with task execution. Prospectively predicting performance enables the timely identification of users struggling with task demands. While ocular and cardiac signals are widely used to characterise performance-relevant visual behaviour and physiological activation, their potential for early prediction and for revealing the physiological mechanisms underlying performance differences remains underexplored. We conducted a within-subject experiment in a game environment with naturally unfolding complexity, using early ocular and cardiac signals to predict later performance and to examine physiological and self-reported group differences. Results show that the ocular-cardiac fusion model achieves a balanced accuracy of 0.86, and the ocular-only model shows comparable predictive power. High performers exhibited targeted gaze and adjusted visual sampling, and sustained more stable cardiac activation as demands intensified, with a more positive affective experience. These findings demonstrate the feasibility of cross-session prediction from early physiology, providing interpretable insights into performance variation and facilitating future proactive intervention. 
2026-03-19T11:47:57Z CHI2026 Yufei Cao Penny Sweetser Ziyu Chen Xuanying Zhu http://arxiv.org/abs/2603.18758v1 Dual-Model Prediction of Affective Engagement and Vocal Attractiveness from Speaker Expressiveness in Video Learning 2026-03-19T11:09:58Z This paper outlines a machine learning-enabled speaker-centric Emotion AI approach capable of predicting audience-affective engagement and vocal attractiveness in asynchronous video-based learning, relying solely on speaker-side affective expressions. Inspired by the demand for scalable, privacy-preserving affective computing applications, this speaker-centric Emotion AI approach incorporates two distinct regression models that leverage a massive corpus developed within Massive Open Online Courses (MOOCs) to enable affectively engaging experiences. The regression model predicting affective engagement is developed by assimilating emotional expressions emanating from facial dynamics, oculomotor features, prosody, and cognitive semantics, while incorporating a second regression model to predict vocal attractiveness based exclusively on speaker-side acoustic features. Notably, on speaker-independent test sets, both regression models yielded impressive predictive performance (R² = 0.85 for affective engagement and R² = 0.88 for vocal attractiveness), confirming that speaker-side affect can functionally represent aggregated audience feedback. This paper provides a speaker-centric Emotion AI approach substantiated by an empirical study discovering that speaker-side multimodal features, including acoustics, can prospectively forecast audience feedback without necessarily employing audience-side input information. 2026-03-19T11:09:58Z Preprint. 
Accepted for publication in IEEE Transactions on Computational Social Systems IEEE Transactions on Computational Social Systems, 2026 Hung-Yue Suen Kuo-En Hung Fan-Hsun Tseng 10.1109/TCSS.2026.3675249 http://arxiv.org/abs/2603.18677v1 Cognitive Amplification vs Cognitive Delegation in Human-AI Systems: A Metric Framework 2026-03-19T09:39:24Z Artificial intelligence is increasingly embedded in human decision-making, where it can either enhance human reasoning or induce excessive cognitive dependence. This paper introduces a conceptual and mathematical framework for distinguishing cognitive amplification, in which AI improves hybrid human-AI performance while preserving human expertise, from cognitive delegation, in which reasoning is progressively outsourced to AI systems. To characterize these regimes, we define a set of operational metrics: the Cognitive Amplification Index (CAI*), the Dependency Ratio (D), the Human Reliance Index (HRI), and the Human Cognitive Drift Rate (HCDR). Together, these quantities provide a low-dimensional metric space for evaluating not only whether human-AI systems achieve genuine synergistic performance, but also whether such performance is cognitively sustainable for the human component over time. The framework highlights a central design tension in human-AI systems: maximizing short-term hybrid capability does not necessarily preserve long-term human cognitive competence. We therefore argue that human-AI systems should be designed under a cognitive sustainability constraint, such that gains in hybrid performance do not come at the cost of degradation in human expertise. 2026-03-19T09:39:24Z 16 pages, 2 figures. 
Conceptual and mathematical framework for human-AI collaboration, cognitive amplification, cognitive delegation, and cognitive sustainability Eduardo Di Santi http://arxiv.org/abs/2506.05908v2 QualitEye: Public and Privacy-preserving Gaze Data Quality Verification 2026-03-19T09:30:31Z Gaze-based applications are advancing rapidly with the availability of large datasets, but ensuring data quality presents a substantial challenge when collecting data at scale. Collecting at scale further requires different parties to collaborate, which raises privacy concerns. We propose QualitEye, the first method for verifying image-based gaze data quality. QualitEye employs a new semantic representation of eye images that contains the information required for verification while excluding irrelevant information for better domain adaptation. QualitEye covers a public setting where parties can freely exchange data and a privacy-preserving setting, built on adapted private set intersection protocols, where parties can neither reveal their raw data nor derive the gaze features/labels of others. We evaluate QualitEye on the MPIIFaceGaze and GazeCapture datasets and achieve a high verification performance (with a small overhead in runtime for privacy-preserving versions). Hence, QualitEye paves the way for new gaze analysis methods at the intersection of machine learning, human-computer interaction, and cryptography. 2025-06-06T09:27:04Z Mayar Elfares Pascal Reisert Ralf Küsters Andreas Bulling http://arxiv.org/abs/2603.18578v1 Dream the Dream: Futuring Communication between LGBTQ+ and Cisgender Groups in Metaverse 2026-03-19T07:38:31Z Digital platforms frequently reproduce heteronormative norms and structural biases, limiting inclusive communication between LGBTQ+ and cisgender individuals. The Metaverse, with its affordances for identity fluidity, presence, and community governance, offers a promising site for reimagining such interactions. 
To investigate this potential, we conducted participatory design workshops involving LGBTQ+ and cisgender participants, situating them in speculative Metaverse contexts to surface barriers and co-create alternative futures. The workshops followed a three-phase process (identifying challenges, speculative problem-solving, and visualizing futures), yielding socio-spatial-technical solutions across four layers: activity, interaction, scene, and space. These findings highlight the importance of spatial cues and power dynamics in shaping digital encounters. We contribute by (1) articulating challenges of cross-group communication in virtual environments, (2) proposing inclusive design opportunities for the Metaverse, and (3) advancing principles for addressing power geometry in digital space. This work demonstrates futuring as a critical strategy for designing equitable, transformative communication infrastructures. 2026-03-19T07:38:31Z Conditionally accepted to DIS 2026 Anqi Wang Lei Han Jiahua Dong Muzhi Zhou David Yip Yuyang Wang Pan Hui http://arxiv.org/abs/2512.04316v6 ConsentDiff at Scale: Longitudinal Audits of Web Privacy Policy Changes and UI Frictions 2026-03-19T06:55:34Z Web privacy is experienced via two public artifacts: what sites state in policy texts, and the actions users are required to take in consent interfaces. Despite extensive cross-sectional audits, there is a lack of longitudinal data detailing how these artifacts change together and whether interfaces actually do what policies promise. ConsentDiff provides that longitudinal view. We build a reproducible pipeline that snapshots sites every month, semantically aligns policy clauses to track clause-level churn, and classifies consent-UI patterns by combining DOM signals with cues from screenshots. 
We introduce a novel weighted claim-UI alignment score, connecting common policy claims to observable predicates, and enabling comparisons over time, regions, and verticals. Our measurements suggest continued policy churn, systematic changes to eliminate a higher-friction banner design, and significantly higher alignment where the reject option is visible and lower-friction. 2025-12-03T23:05:42Z 5 pages, Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems (CHI EA '26) Haoze Guo http://arxiv.org/abs/2603.18535v1 Align-to-Scale: Mode Switching Technique for Unimanual 3D Object Manipulation with Gaze-Hand-Object Alignment in Extended Reality 2026-03-19T06:40:14Z As extended reality (XR) technologies rapidly become as ubiquitous as today's mobile devices, supporting one-handed interaction becomes essential for XR. However, the prevalent Gaze + Pinch interaction model only partially supports unimanual interaction: users can select, move, and rotate objects with one hand, but scaling typically requires both hands. In this work, we leverage the spatial alignment between gaze and hand as a mode switch to enable single-handed pinch-to-scale. We design and evaluate several techniques geared for one-handed scaling and assess their usability in a compound translate-scale task. Our findings show that all proposed methods effectively enable one-handed scaling, but each method offers distinct advantages and trade-offs. To this end, we derive design guidelines to support futuristic 3D interfaces with unimanual interaction. Our work helps make eye-hand 3D interaction in XR more mobile, flexible, and accessible. 2026-03-19T06:40:14Z 19 pages, 6 figures, Presented at ACM ETRA 2026 Min-yung Kim Jinwook Kim Ken Pfeuffer Sang Ho Yoon http://arxiv.org/abs/2603.18480v1 Do Vision Language Models Understand Human Engagement in Games? 
2026-03-19T04:32:12Z Inferring human engagement from gameplay video is important for game design and player-experience research, yet it remains unclear whether vision-language models (VLMs) can infer such latent psychological states from visual cues alone. Using the GameVibe Few-Shot dataset across nine first-person shooter games, we evaluate three VLMs under six prompting strategies, including zero-shot prediction, theory-guided prompts grounded in Flow, GameFlow, Self-Determination Theory, and MDA, and retrieval-augmented prompting. We consider both pointwise engagement prediction and pairwise prediction of engagement change between consecutive windows. Results show that zero-shot VLM predictions are generally weak and often fail to outperform simple per-game majority-class baselines. Memory- or retrieval-augmented prompting improves pointwise prediction in some settings, whereas pairwise prediction remains consistently difficult across strategies. Theory-guided prompting alone does not reliably help and can instead reinforce surface-level shortcuts. These findings suggest a perception-understanding gap in current VLMs: although they can recognize visible gameplay cues, they still struggle to robustly infer human engagement across games. 2026-03-19T04:32:12Z Ziyi Wang Qizan Guo Rishitosh Singh Xiyang Hu http://arxiv.org/abs/2601.14637v2 Forest-Chat: Adapting Vision-Language Agents for Interactive Forest Change Analysis 2026-03-19T04:16:46Z The increasing availability of high-resolution satellite imagery, together with advances in deep learning, creates new opportunities for forest monitoring workflows. Two central challenges in this domain are pixel-level change detection and semantic change interpretation, particularly for complex forest dynamics. 
While large language models (LLMs) are increasingly adopted for data exploration, their integration with vision-language models (VLMs) for remote sensing image change interpretation (RSICI) remains underexplored, especially beyond urban environments. This paper introduces Forest-Chat, an LLM-driven agent for forest change analysis, enabling natural language querying across multiple RSICI tasks, including change detection and captioning, object counting, deforestation characterisation, and change reasoning. Forest-Chat builds upon a multi-level change interpretation (MCI) vision-language backbone with LLM-based orchestration, incorporating zero-shot change detection via AnyChange and multimodal LLM-based zero-shot change captioning and refinement. To support adaptation and evaluation in forest environments, we introduce the Forest-Change dataset, comprising bi-temporal satellite imagery, pixel-level change masks, and semantic change captions via human annotation and rule-based methods. Forest-Chat achieves mIoU and BLEU-4 scores of 67.10% and 40.17% on Forest-Change, and 88.13% and 34.41% on LEVIR-MCI-Trees, a tree-focused subset of LEVIR-MCI. In a zero-shot capacity, it achieves 60.15% and 34.00% on Forest-Change, and 47.32% and 18.23% on LEVIR-MCI-Trees. Further experiments demonstrate the value of caption refinement for injecting geographic domain knowledge into supervised captions, and the system's limited label domain transfer onto JL1-CD-Trees. These findings demonstrate that interactive, LLM-driven systems can support accessible and interpretable forest change analysis. 
2026-01-21T04:23:33Z 28 pages, 9 figures, 12 tables, Submitted to Ecological Informatics James Brock Ce Zhang Nantheera Anantrasirichai http://arxiv.org/abs/2603.18470v1 CyberJustice Tutor: An Agentic AI Framework for Cybersecurity Learning via Think-Plan-Act Reasoning and Pedagogical Scaffolding 2026-03-19T04:04:57Z The integration of Large Language Models (LLMs) into cybersecurity education for criminal justice professionals is currently hindered by the "statelessness" of reactive chatbots and the risk of hallucinations in high-stakes legal contexts. To address these limitations, we propose the CyberJustice Tutor, an educational dialogue system powered by an Agentic AI framework. Unlike reactive chatbots, our system employs a "Think-Plan-Act" cognitive cycle, enabling autonomous goal decomposition, longitudinal planning, and dynamic context maintenance. We integrate a Pedagogical Scaffolding Layer grounded in Vygotsky's Zone of Proximal Development (ZPD), which dynamically adapts instructional support based on the learner's real-time progress. Furthermore, an Adaptive Retrieval Augmented Generation (RAG) core anchors the agent's reasoning in verified curriculum materials to ensure legal and technical accuracy. A comprehensive user study with 123 participants, including students, educators, and active law enforcement officers, validated the system's efficacy. Quantitative results demonstrate high user acceptance for Response Speed (4.7/5), Ease of Use (4.4/5), and Accuracy (4.3/5). Qualitative feedback indicates that the agentic architecture is perceived as highly effective in guiding learners through personalized paths, demonstrating the feasibility and usability of agentic AI for specialized professional education. 
2026-03-19T04:04:57Z Baiqiang Wang Yan Bai Juan Li http://arxiv.org/abs/2603.18435v1 Beyond Ray-Casting: Evaluating Controller, Free-Hand, and Virtual-Touch Modalities for Immersive Text Entry 2026-03-19T02:50:44Z Efficient text entry remains a primary bottleneck preventing Virtual Reality (VR) from evolving into a viable productivity platform. To address this, we conducted an empirical comparison of six physical input systems across three interaction styles (Controller Driven, Free Hand, and Virtual Touch), evaluating both discrete tap typing and continuous gesture typing (swiping), alongside a speech-to-text (Voice) condition as a non-physical reference modality. Results from 21 participants show that the Controller Driven Tap Gesture Combo (CD TGC) delivers the best productivity performance, achieving speeds 2.25 times higher than the slowest system and 30% faster than the current industry standard, while reducing error rates by up to 68%. A clear trade-off emerged between performance and perceived usability: although controller-based gesture input led on speed and accuracy, participants rated Virtual Touch Tap Typing highest in subjective experience, scoring 80% higher on the System Usability Scale (SUS) than the lowest-rated alternative. We further observe that Free Hand interaction remains limited by tracking stability and physical fatigue, whereas Voice input introduces practical constraints related to privacy, editing control, and immersive engagement. Together, these findings characterize the tension between throughput and natural interaction in immersive text entry and provide data-driven guidance for future VR interface design. 2026-03-19T02:50:44Z 7 figures, International Conference on Power, Electronics, Communications, Computing, and Intelligent Infrastructure 2026 Md. Tanvir Hossain Mohd Ruhul Ameen Akif Islam Md. Omar Faruqe Mahboob Qaosar A. F. M. Mahbubur Rahman Sanjoy Kumar Chakravarty M. 
Khademul Islam Molla http://arxiv.org/abs/2511.03117v2 Tracing Generative AI in Digital Art: A Longitudinal Study of Chinese Painters' Attitudes, Practices, and Identity Negotiation 2026-03-19T02:50:25Z This paper presents a five-year longitudinal mixed-methods study of 17 Chinese digital painters, examining how their attitudes and practices evolved in response to generative AI. Our findings reveal a trajectory from resistance and defensiveness, to pragmatic adoption, and ultimately to reflective reconstruction, shaped by strong peer pressures and shifting emotional experiences. Persistent concerns around copyright and creative labor highlight the ongoing negotiation of identity and values. This work contributes by offering rare longitudinal empirical data, advancing a theoretical lens of "identity and value negotiation," and providing design implications for future human-AI collaborative systems. 2025-11-05T02:02:44Z In Submission Yibo Meng Ruiqi Chen Zhuoran Lu Shuai Ma Chengxi Zang http://arxiv.org/abs/2507.04996v9 Agentic Vehicles for Human-Centered Mobility: Definition, Prospects, and System Implications 2026-03-19T02:11:36Z Autonomy, from the Greek autos (self) and nomos (law), refers to the capacity to operate according to internal rules without external control. Autonomous vehicles (AuVs) are therefore understood as systems that perceive their environment and execute pre-programmed tasks independently of external input, consistent with the SAE levels of automated driving. Yet recent research and real-world deployments have begun to showcase vehicles that exhibit behaviors outside the scope of this definition. These include natural language interaction with humans, goal adaptation, contextual reasoning, external tool use, and the handling of unforeseen ethical dilemmas, enabled in part by multimodal large language models (LLMs). 
These developments highlight not only a gap between technical autonomy and the broader cognitive and social capacities required for human-centered mobility, but also the emergence of a form of vehicle intelligence that currently lacks a clear designation. To address this gap, the paper introduces the concept of agentic vehicles (AgVs): vehicles that exhibit agency, the capacity for goal-driven reasoning, strategic adaptation, self-reflection, and purposeful engagement with complex environments. We conclude by outlining key challenges in the development and governance of AgVs and their potential role in shaping future agentic transportation systems that align with user and societal needs. 2025-07-07T13:34:49Z Jiangbo Yu Raphael Frank Luis Miranda-Moreno Sasan Jafarnejad Jonatas Augusto Manzolli Jiyao Wang Ali Eslami http://arxiv.org/abs/2602.04023v2 Exploring Emerging Norms of AI Attribution and Disclosure in Programming Education 2026-03-19T01:58:24Z Generative AI blurs the lines of authorship in computing education, creating uncertainty around how students should attribute AI assistance. To examine these emerging norms, we conducted a factorial vignette study with 94 computer science students across 102 unique scenarios, systematically manipulating assessment type, AI autonomy, student activity, prior knowledge, and human refinement effort. This paper details how these factors influence students' perceptions of ownership and disclosure preferences. Our findings indicate that attribution judgments are primarily driven by different levels of AI assistance and human refinement. We also found that students' perception of authorship significantly predicts their policy expectations. We conclude by proposing a shift from statement-style policies to process-oriented attribution, transforming disclosure into a pedagogical mechanism for fostering critical engagement with AI-generated content. 
2026-02-03T21:14:19Z Runlong Ye Oliver Huang Jessica He Michael Liut http://arxiv.org/abs/2603.18398v1 Deconstructing Open-World Game Mission Design Formula: A Thematic Analysis Using an Action-Block Framework 2026-03-19T01:42:26Z Open-world missions often rely on repeated formulas, yet designers lack systematic ways to examine pacing, variation, and experiential balance across large portfolios. We introduce the Mission Action Quality Vector (MAQV), a six-dimensional framework (covering combat, exploration, narrative, emotion, problem-solving, and uniqueness) paired with an action block grammar representing missions as gameplay sequences. Using about 2200 missions from 20 AAA titles, we apply LLM-assisted parsing to convert community walkthroughs into structured action sequences and score them with MAQV. An interactive dashboard enables designers to reveal underlying mission formulas. In a mixed-methods study with experienced players and designers, we validate the pipeline's fidelity and the tool's usability, and use thematic analysis to identify recurring design trade-offs, pacing grammars, and systematic differences by quest type and franchise evolution. Our work offers a reproducible analytical workflow, a data-driven visualization tool, and reflective insights to support more balanced, varied mission design at scale. 2026-03-19T01:42:26Z Kaijie Xu Yiwei Zhang Brian Yang Clark Verbrugge