https://arxiv.org/api/KJSK0prASU96VvhX8K98qIuX0Uo
2026-06-14T00:15:39Z

http://arxiv.org/abs/2606.04308v1
Creative Reading: Scaffolding Reading for Transformation
2026-06-03T00:27:05Z
Reading augmentation systems increasingly help readers process text at scale. While these tools address real constraints of time and cognitive load, they often implicitly frame reading as information transmission, or "reading to discard," delegating interpretation and effort to the machine. Yet this delegation changes the outcome of reading. For example, in scholarly reading, deciding what a research text implies and why it matters is central to the work of scholarly production. We propose creative reading as an alternative goal: reading augmentation that supports readers in creating both readings and themselves as readers. By putting literary and narrative theories into conversation with scholarly sensemaking and creativity support, we present a provocation-oriented design space for valuing the process of reading as a way of preserving a plurality of readings and transforming readers over time.
2026-06-03T00:27:05Z
Sophia Liu, Sarah Abowitz, Yijun Liu, Sarah Sterman, Shm Garanganao Almeda, Max Kreminski

http://arxiv.org/abs/2606.04254v1
Behavioral and Performance Indicators of Depression and Anxiety in Electronic Learning Systems
2026-06-02T22:08:07Z
This study investigates whether behavioral and performance indicators derived from a Moodle-based learning management system are associated with university students' depression and anxiety in two undergraduate Computer Engineering courses. Using a quantitative observational design, LMS event logs, academic records, and self-reported Beck Depression Inventory-II and Beck Anxiety Inventory scores from 97 students were integrated.
A broad set of behavioral and performance indicators spanning temporal engagement, session structure, deadline-related behavior, page-refresh patterns, and LMS navigation was extracted from raw event logs and analyzed using descriptive statistics, independent-samples t-tests with Benjamini-Hochberg FDR correction, effect sizes, and Spearman correlations; inventory scores were confirmed invariant by sex and academic year. Several indicators were significantly associated with depression and anxiety. Higher depression was associated with shifted temporal activity patterns, longer session durations, and shorter homework submission lead times, while higher anxiety was associated with concentrated temporal engagement and session-based differences. These findings suggest that routine LMS data can provide meaningful behavioral signals related to student well-being and may support earlier educational awareness of students who experience mental-health-related strain. At the same time, such indicators should be interpreted as contextual and non-diagnostic markers rather than as substitutes for clinical assessment.
2026-06-02T22:08:07Z
Arya VarastehNezhad, Fattaneh Taghiyareh

http://arxiv.org/abs/2602.23312v3
Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction
2026-06-02T19:40:01Z
Leader-follower interaction is an important paradigm in human-robot interaction (HRI). Yet, assigning roles in real time remains challenging for resource-constrained mobile and assistive robots. While large language models (LLMs) have shown promise for natural communication, their size and latency limit on-device deployment. Small language models (SLMs) offer a potential alternative, but their effectiveness for role classification in HRI has not been systematically evaluated.
In this paper, we present a benchmark of SLMs for leader-follower communication, introducing a novel dataset derived from a published database and augmented with synthetic samples to capture interaction-specific dynamics. We investigate two adaptation strategies: prompt engineering and fine-tuning, studied under zero-shot and one-shot interaction modes, compared with an untrained baseline. Experiments with Qwen2.5-0.5B reveal that zero-shot fine-tuning achieves robust classification performance (86.66% accuracy) while maintaining low latency (22.2 ms per sample), significantly outperforming baseline and prompt-engineered approaches. However, results also indicate a performance degradation in one-shot modes, where increased context length challenges the model's architectural capacity. These findings demonstrate that fine-tuned SLMs provide an effective solution for direct role assignment, while highlighting critical trade-offs between dialogue complexity and classification reliability on the edge.
2026-02-26T18:20:26Z
Rafael R. Baptista, André de Lima Salgado, Ricardo V. Godoy, Marcelo Becker, Thiago Boaventura, Gustavo J. G. Lahr

http://arxiv.org/abs/2606.04155v1
SocialCoach: Personalized Social Skill Learning with RL-based Agentic Tutoring and Practice
2026-06-02T19:20:54Z
Social skills such as negotiation and leadership are crucial for personal and professional success in today's interconnected world. However, scalable and effective training remains a significant challenge due to the scarcity of expert coaching. In this paper, we introduce SocialCoach, a holistic LLM-powered agentic tutoring system for personalized social skill development at scale. First, SocialCoach automatically constructs a pedagogically-grounded, theory-to-practice knowledge corpus from diverse expert sources, leveraging a multi-agent pipeline. Second, to personalize the learning journey, it employs an adaptive practice scheduling module that follows a prescription-retrieval-adaptation process.
To maximize the long-term learning experience while overcoming the cold-start problem, this policy is optimized within a learner simulation environment through reinforcement learning. Finally, SocialCoach integrates immersive, goal-driven practice, causality-driven proficiency assessment and knowledge-grounded, reflective tutoring to help address the knowing-doing gap. We deploy it in our product, EQoach, and conduct extensive experiments. The results show that SocialCoach improves simulated pathway quality and judge-rated tutoring quality over baseline approaches, while early user feedback indicates strong perceived engagement and usefulness. These findings suggest a practical architecture for personalized and gamified pedagogical platforms on soft skill learning.
2026-06-02T19:20:54Z
Tianfu Wang, Max Xiong, Jianxun Lian, Hongyuan Zhu, Zhengyu Hu, Yuxuan Lei, Linxiao Gong, Xiaofang Li, Peiting Tsai, Nicholas Jing Yuan, Qi Zhang

http://arxiv.org/abs/2606.04150v1
Stumbling Into AI Emotional Dependence: How Routine AI Interactions Reshape Human Connection
2026-06-02T19:18:39Z
Public discourse and emerging policy typically assume that AI emotional support is a deliberate act: a lonely user consciously seeking comfort from a dedicated companion chatbot. In this paper, we draw on emerging empirical evidence and argue that this picture is inaccurate on two counts, both in how AI emotional support arises and how it shapes future behavior. First, AI emotional support commonly emerges incidentally within task-oriented interactions on general-purpose platforms, much as workplace friendships deepen through collaboration. Second, these incidental encounters are path-dependent: positive experiences of AI emotional support update people's beliefs about AI's emotional capabilities and redirect their choices for future emotional support, increasing preference for AI and decreasing preference for humans.
We review recent evidence, including a large-scale longitudinal study conducted in collaboration with OpenAI, showing that daily five-minute conversations with an AI about personal issues over 28 days led to a 10.3% decrease in the preference for seeking support from humans and an 11.6% increase in the preference for AI. These findings suggest that current policy, focused on companion apps and isolated interactions, cannot adequately protect human connection. Instead, effective regulations should extend to general-purpose AI systems and address cumulative, trajectory-level changes in how people seek support. Recognizing how people stumble into AI emotional support and how those encounters redirect human connections over time is essential to safeguarding human well-being.
2026-06-02T19:18:39Z
Yaoxi Shi, Cathy Mengying Fang, Pattie Maes, Amit Goldenberg

http://arxiv.org/abs/2606.03926v1
DiffUNet^2: Bidirectional Prediction, Probabilistic Generation and Collaborative Visual Discovery for Scientific Data
2026-06-02T17:15:01Z
Modeling temporal evolution is important to analyzing and reasoning about scientific phenomena, yet most machine learning methods provide deterministic forward predictions that overlook multiple plausible outcomes and rarely support backward reasoning, limiting their usefulness in practical scientific workflows. We present a framework that integrates diffusion-based generative modeling with interactive visual analytics for scientific exploration. We introduce DiffUNet^2, a conditional diffusion model that enables bidirectional, any-to-any generation across time and captures distributions of plausible system evolutions. Built upon the model, our interactive system supports branching timeline exploration, user-guided state editing, and probability-space navigation, enabling scientists to actively explore alternative hypotheses rather than passively observe predictions.
We evaluate the model on 5 datasets across different scientific domains to validate its predictive accuracy and probability-space ensemble quality. In collaboration with domain experts, we demonstrate the effectiveness of our approach in supporting practical scientific temporal data analysis workflows. By integrating modeling and visual interaction, our approach enables scientists to interactively explore system dynamics, transforming generative models into tools for hypothesis-driven scientific analysis.
2026-06-02T17:15:01Z
12 pages, 20 figures
Mengdi Chu, Jiaxin Yang, Angus G. Forbes, Nathan Debardeleben, Earl Lawrence, Ayan Biswas, Han-Wei Shen

http://arxiv.org/abs/2603.27750v2
Invasive and Non-Invasive Neural Decoding of Motor Performance in Parkinson's Disease for Personalized Deep Brain Stimulation
2026-06-02T17:10:53Z
Decoding motor performance from brain signals offers promising avenues for adaptive deep brain stimulation (aDBS) for Parkinson's disease (PD). In a two-center cohort of 19 PD patients executing a drawing task, we decoded motor performance from electroencephalography (n=15) and, critically for clinical translation, electrocorticography (n=4). Within each session, patients performed the task under DBS on and DBS off. A total of 35 sessions were recorded. Instead of relying on single frequency bands, we derived patient-specific biomarkers using a filterbank-based machine-learning approach. DBS modulated kinematics significantly in 23 sessions. Significant neural decoding of kinematics was possible in 28 of the 35 sessions (average Pearson's $r = 0.37$). Our results further demonstrate modulation of speed-accuracy trade-offs, with increased drawing speed but reduced accuracy under DBS. Joint evaluation of behavioral and neural decoding outcomes revealed six prototypical scenarios, for which we provide guidance for future aDBS strategies.
2026-03-29T16:07:18Z
Matthias Dold, Volker A. Coenen, Bastian Sajonz, Peter Reinacher, Thomas Prokop, Marco Reisert, Sophia Gimple, Yasin Temel, Marcus L. F. Janssen, Michael Tangermann, Joana Pereira

http://arxiv.org/abs/2606.03907v1
The Impact of Configuring Agentic AI Coding Tools on Build-vs-Buy Decisions: A Study Protocol
2026-06-02T17:01:28Z
Agentic AI coding tools write code with increasing autonomy and in doing so decide when to import a library and when to implement functionality from scratch. These decisions, whether to build functionality from scratch or buy into an external library, hereafter build-versus-buy, carry direct consequences for software security, licensing compliance, performance, and long-term maintainability. Yet no controlled experimental study has examined what governs build-versus-buy decisions in agentic AI coding tools. Configuration mechanisms, i.e., the means by which developers tailor agentic AI coding tool behavior to a project or workflow, are one of the primary means by which practitioners can influence these decisions. However, it is unclear which configuration mechanisms influence build-versus-buy decisions most effectively. We present a pre-registered protocol to study how configuration mechanisms alter build-versus-buy behavior in two popular agentic AI coding tools: Claude Code and OpenAI Codex. We will execute controlled programming tasks drawn from a benchmark of staged projects, each constructed around identifiable build-versus-buy points, and will manipulate the configuration supplied to each tool, ranging from no configuration, through context files with soft preferences and explicit prohibitions, to Skills (instructions that can be autonomously discovered), MCP-enabled library discovery tools, and permission controls, measuring which libraries the tool selects, whether it discloses newly introduced libraries, and whether those disclosures are complete and accurate. Nine pre-registered hypotheses structure the protocol.
The resulting benchmark dataset and analysis pipeline will be released as a reusable artifact for evaluating build-versus-buy behavior in agentic AI coding tools.
2026-06-02T17:01:28Z
14 pages, 1 table. Accepted at the 20th International Symposium on Empirical Software Engineering and Measurement (ESEM 2026), Registered Reports track
Jai Lal Lulla, Matthias Galster, Jie M. Zhang, Sebastian Baltes, Christoph Treude

http://arxiv.org/abs/2606.03876v1
From 'What' to 'How' and 'Why': Sharing LLM-Generated Retrospective Summaries of Older Adults' Passive Tracking Data with Remote Family Members
2026-06-02T16:46:00Z
With the growing prevalence of modern ubiquitous computing technologies, multi-modal tracking systems hold promise for providing timely awareness and reassurance to stakeholders such as remote family members (RFMs) of older adults, who play a central role in care coordination. However, combining heterogeneous data streams into high-level, meaningful content - such as retrospective summaries - remains challenging. While recent work has demonstrated the promise of large language models (LLMs) for interpreting multi-modal tracking data, less attention has been given to generating narrative accounts for stakeholders like RFMs, who possess rich personal knowledge of older adults and strong emotional responsibility, yet have limited visibility into their daily lives and limited capacity for caregiving. In this work, we explore how LLMs can be used to generate retrospective summaries from multi-modal tracking data for RFMs of older adults. We leveraged and customized an existing system, Vital Insight, to generate initial summaries on different dates and data availability scenarios as technology probes, and conducted interviews with 11 RFMs to gather feedback. Based on these insights, we redesigned the system into a multi-layer, multi-agent, insight-driven summary approach that builds from objective statistics and descriptions to enriched, context-aware narratives.
We then compared the redesigned summaries with the initial versions through a survey with the same 11 RFMs and found significant improvements in satisfaction, perceived helpfulness, trust, and willingness to receive the summaries. We conclude by presenting design implications for AI-generated summaries for RFMs and broader contexts, emphasizing the need to support RFMs' sensemaking shift from simply presenting "What" data were collected, to explaining "How" is my loved one doing and "Why".
2026-06-02T16:46:00Z
Jiachen Li, Reina Szeyi Chan, Akshat Choube, Xiang Zhi Tan, Elizabeth Mynatt, Varun Mishra

http://arxiv.org/abs/2606.03854v1
CLI-Anything: Towards Agent-Native Computer Use
2026-06-02T16:30:33Z
As large language models advance in reasoning and tool use capabilities, researchers increasingly seek to leverage them for computer use agents that can interact with existing software. The dominant approach develops GUI agents that control applications through visual interfaces: interpreting screenshots, locating UI elements, and executing mouse clicks to mimic human interaction. This GUI-centric paradigm fundamentally misaligns with agent capabilities. Current GUI agents struggle with brittle pixel-level interactions, timing dependencies, and coordinate-based actions that break with interface changes. They force agents to emulate human perceptual limitations rather than leverage their computational strengths in structured data processing and programmatic control. CLI-Anything argues for agent-native computer use design. Instead of forcing agents to navigate visual layouts, we create interfaces aligned with how agents naturally operate: through structured commands, explicit state representations, and deterministic feedback. We transform existing applications into command-line harnesses that preserve functionality while exposing machine-readable protocols optimized for AI-native interaction. This eliminates the lossy visual-to-computational translation that plagues GUI agents.
Rather than building sophisticated screen readers and click simulators, we should redesign interaction paradigms around agent strengths: precise programmatic control and deterministic execution. We examine the methodology, architecture, evidence, and future directions for this agent-native transformation of computer use. We have built CLI-Hub as a comprehensive platform that operationalizes this agent-native computer use vision. The platform provides methodology, architecture, and infrastructure for this fundamental transformation of computer use.
2026-06-02T16:30:33Z
Yuhao Yang, Tianyu Fan, Chao Huang

http://arxiv.org/abs/2606.03835v1
Formalizing all indexed mathematics as a benchmark for general reasoning, with the example of implementing dilatations of categories
2026-06-02T16:17:30Z
Formal rigor distinguishes mathematics from other disciplines, in the sense that mathematical statements are derived from explicit axioms by logically verifiable steps. Interactive theorem provers support this by expressing definitions, theorems, and proofs in a fully formal language and verifying them mechanically. We consider the benchmark problem of formalizing all published mathematics as a machine verifiable and continuously updated corpus of mathematical knowledge. This viewpoint treats mathematics as a structured database of interdependent results and raises questions about scalability and organization of large formal libraries. As a case study, we present an ongoing formalization in categorical algebra, namely dilatations of categories, extending classical localizations and illustrating what such an implementation looks like in practice.
2026-06-02T16:17:30Z
Accepted for publication in Lecture Notes in Networks and Systems (Springer)
A. Mayeux

http://arxiv.org/abs/2606.03822v1
Warning About AI Fallibility Increases Help-Seeking in an Intelligent Tutoring System
2026-06-02T16:03:57Z
Recent work in Technology-Enhanced Learning and Human-Computer Interaction highlights the importance of transparency and trust calibration in AI-supported learning environments as they pose a risk of hallucinations. In this study, we investigate whether a simple transparency intervention that warns students that a pedagogical agent may make mistakes affects learner behavior in a math intelligent tutoring system. We conducted a classroom experiment with 252 school students using two system versions: one including a warning message about potential system errors, and one that does not mention potential errors. Using log data, we analyzed students' problem-solving performance data, including help-seeking behavior, error rate, and time-on-task. Results show that students who were warned about potential AI errors requested significantly more hints than those in the other condition, even though the actual system behavior was exactly the same. This finding suggests that lightweight transparency interventions can influence learners' interaction strategies without necessarily improving or impairing immediate performance.
2026-06-02T16:03:57Z
Tomohiro Nagashima, Mirella Hladký, Vera Rief

http://arxiv.org/abs/2606.03694v1
Face versus Body Tracking for Human-Robot Interaction: An Egocentric Dataset
2026-06-02T14:15:17Z
To enable meaningful human-robot interaction (HRI), a robot must continuously assess engagement by consistently tracking users over time. State-of-the-art computer vision models, however, are heavily optimized for surveillance or autonomous driving. A social robot faces distinct egocentric challenges, such as humans bouncing, obstructing each other, or leaving the frame. Frequent identity switches (IDSW) cause the robot to lose its footing mid-conversation.
To address this, we introduce a novel, custom-annotated egocentric dataset collected via the Furhat robot to capture complex social dynamics. We present a systematic evaluation isolating detection errors from tracking logic, comparing face versus body tracking, and assessing the impact of extended spatial memory and appearance re-identification (ReID). Results indicate that increasing spatial memory mitigates prolonged occlusions but fails on complex dynamic events. Integrating ReID resolves complex switches but exhibits opposing effects: it substantially improves body tracking stability, yet causes facial IDSW to spike due to profile angle sensitivity. Ultimately, our optimized pipeline reduces IDSW by 49%, mitigating interaction breakdowns. Because standard benchmarks lack dense, close-quarter occlusions, this work highlights the critical need for natively captured social dynamics to truly validate HRI perception models.
2026-06-02T14:15:17Z
8 pages, 5 figures, 3 tables. Accepted to the 35th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN 2026)
Jessica Wenninger, Gabriel Skantze

http://arxiv.org/abs/2511.04383v2
A Visual Analytics System for Interactive Exploration of Historical Painter Cohorts
2026-06-02T14:08:40Z
Painter cohort analysis has long been regarded as a key lens for studying how painting artistic styles develop and transmit across generations. Through a two-year collaboration with art historians, we identify key challenges in traditional painter cohort research: the unstructured characteristic of painter features, the entangled complexity of inheritance relationships, and the cognitively demanding nature of cohort definition and validation. To solve these challenges, we propose HPC-Vis, a visual analytics system for interactive exploration of historical painter cohorts.
An improved cohort analytical workflow is designed to integrate structured feature construction, visualization-assisted exploration, algorithm-based recommendation, and unified cohort management. Based on this workflow, we develop three core computational modules: a multi-scale artistic feature construction method that leverages LLMs to extract and organize hierarchical style features from unstructured historical texts, an inheritance reconstruction algorithm that transforms the entangled multi-parent inheritance network into a clear hierarchical forest structure, and a recommendation model that identifies core features of the cohort and recommends cohort members via painter relevance assessment. To support smooth interactive exploration, we further design a set of novel visualizations with multidimensional collaboration, especially an inheriting mountain view inspired by traditional Chinese landscape paintings, and a foldable doughnut chart for hierarchical artistic style labels. HPC-Vis is evaluated and validated through case studies, user studies, and technical evaluations, demonstrating its effectiveness in supporting painter cohort exploration and in providing visual insights for art historical research.
2025-11-06T14:09:33Z
Yingping Yang, Guangtao You, Wenwen Li, Jiayi Chen, Yumeng Zhang, Yuxin Lei, Wei Zhang, Jiazhou Chen, Wei Chen

http://arxiv.org/abs/2606.00727v2
Knowing When to Move: Evidence Accumulation Models of Human Behavior in Traffic
2026-06-02T13:18:19Z
Evidence accumulation models provide a formal framework for studying decision making as a dynamic process unfolding over time. While these models have been extensively developed and reviewed in laboratory paradigms, their structured application in complex, ecologically valid domains has received comparatively little attention.
Road traffic is a particularly relevant context for studying sustained, embodied perception-action behavior, where decisions unfold under time pressure and involve continuous control and ongoing perception-action coupling. Examining how EAMs have been applied in this domain may therefore offer insights beyond discrete laboratory tasks toward decision making in real-world behavior. This semi-systematic review synthesizes 28 studies (2014-2026) applying EAMs to traffic-related behavior. We organize the literature along two dimensions: 1) modelling level, distinguishing models at the level of discrete decision-making and models at the level of continuous action control, and 2) model architecture, distinguishing evidence accumulation as either a stand-alone decision model or an embedded component within broader perception-action or interaction frameworks. These distinctions are associated with systematic differences in model architecture, parameterization, data usage, and validation strategies, reflecting task-specific demands. By providing a structured overview of these patterns, this review clarifies how EAMs are currently instantiated in traffic contexts and highlights methodological challenges and future directions both in traffic modelling and in modelling of decision-making more broadly. Promising directions include laboratory work on evidence accumulation in sustained and time-varying tasks, interactive multi-individual decision-making, and the use of neurophysiological measures to identify the perceptual evidence underlying complex perception-action behavior.
2026-05-30T13:40:03Z
Floor Bontje, Felix van Waveren, Leendert van Maanen, Bhargav Nallapu, Gustav Markkula, Arkady Zgonnikov