http://arxiv.org/abs/2606.00418v1
Literary Emotions in Motion: A Soft Robotics Installation for Tactile Storytelling
Updated: 2026-05-29T23:14:55Z
Soft robotics is increasingly explored in artistic contexts, where tactile interaction provides audiences with embodied engagement beyond visual or auditory signals. This work presents an interactive installation that maps semantic emotion analysis of narrative text into variable stiffness of soft pneumatic modules. A natural language model identifies two dominant emotions from a predefined set of six, driving the inflation of seven hexagonally arranged soft actuators. The central actuator represents the primary emotion, while the surrounding ones express the secondary. We develop and mechanically characterize silicone actuators, called soft modules, featuring a thin membrane layer, demonstrating how this morphological control expands the achievable stiffness range while preserving simplicity and low-cost fabrication. A user study with ten participants further evaluates how multisensory coupling of stiffness and LED intensity influences emotional perception. The results suggest that stiffness modulation accompanied by color change can support emotionally meaningful and engaging tactile interaction in soft robotic installations.
Published: 2026-05-29T23:14:55Z
Notes: 8 pages, 8 figures; IEEE Robotics and Automation Magazine, 2026
Authors: Carolina Silva-Plata, Abraham Villavicencio-Carmona, Miguel Silva Plata, Stefan Escaida, Ruben Fernandez
DOI: 10.1109/MRA.2026.3693101

http://arxiv.org/abs/2606.00370v1
Agentic Authoring of Interactive Multiview Visualizations in Genomics
Updated: 2026-05-29T21:22:16Z
Diverse genomics data, scientific questions, and analysis tasks typically demand highly specialized visualizations. Therefore, users often must customize or author new ones tailored to their data.
Existing tools are usually either limited in customization or require substantial learning or programming, and even expressive tools assume visualization expertise many users lack. Agentic and large language model (LLM) approaches are increasingly applied to complex scientific tasks, including visualization. Natural-language conversational interfaces offer a promising path to democratizing the authoring of complex visualizations. In the context of genomics, these approaches face additional challenges: genomics visualizations typically integrate heterogeneous data types and are composed of multiple linked interactive views. These challenges motivate more structured LLM-based schemes. We first characterize where vanilla LLM generation succeeds and fails for genomics visualization, identifying eight quality dimensions. We then compare six schemes--direct generation, a fixed pipeline, and four agentic configurations varying in the number of specialist agents and the presence of a reviewer--across 159 cases spanning three levels of query ambiguity and specification complexity. All schemes use the Gosling visualization grammar as structured output. Agentic iteration substantially improves perceived quality over both baselines, while more complex agent architectures yield no additional benefit. We discuss implications for designing agentic systems for domain-specific visualization authoring. All supplemental materials are available at https://osf.io/uqe83.
Published: 2026-05-29T21:22:16Z
Notes: 11 pages, 12 figures
Authors: Astrid van den Brandt, Kiroong Choe, Sehi L'Yi, Devin Lange, Nils Gehlenborg

http://arxiv.org/abs/2603.03312v3
Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding
Updated: 2026-05-29T19:20:43Z
Decoding natural language from non-invasive EEG signals is a promising yet challenging task.
However, current state-of-the-art models remain constrained by three fundamental issues: Semantic Bias, where outputs collapse into generic linguistic templates; Signal Neglect, where models rely heavily on LLM priors to hallucinate fluent text even in the absence of meaningful signals; and the "BLEU Trap", where high-frequency stopwords inflate n-gram metrics, masking a lack of true semantic fidelity. To resolve these challenges, we move beyond conventional end-to-end pipelines and propose SemKey, a novel multi-stage framework that enforces signal-grounded generation through four decoupled semantic objectives: sentiment, topic, length, and surprisal. We extract these semantic anchors from EEG embeddings directly, then unify them with an Active Retrieval Decoding mechanism, compelling the LLM to ground its token generation in the neural signals rather than defaulting to linguistic priors. Furthermore, we break the BLEU Trap by establishing a comprehensive evaluation protocol using rigorous retrieval and distribution-based metrics such as Fréchet Distance. Extensive experiments demonstrate that SemKey effectively mitigates hallucinations on noise inputs and achieves SOTA performance on these robust protocols. Code will be released upon acceptance at https://github.com/xmed-lab/SemKey.
Published: 2026-02-09T02:47:07Z
Authors: Yuchen Wang, Haonan Wang, Yu Guo, Honglong Yang, Xiaomeng Li

http://arxiv.org/abs/2605.04534v2
Characterizing Students' LLM Usage Behaviors and Their Association with Learning in Critical Thinking Tasks
Updated: 2026-05-29T19:01:06Z
Large language models (LLMs) are becoming increasingly embedded in students' learning practices, yet much of what is known about how students use LLMs and how this usage impacts learning comes from problem-solving domains or constrained experimental settings. We present an analysis of data on LLM usage collected during two offerings of a research-oriented course where students learn to read, reason about, and critique academic papers.
Without restrictions on whether or how to use LLMs, students reported their LLM usage practices when asked to do these activities as a series of homework assignments during the course. This paper extends prior work done on data from a single offering of the same course by presenting a refined bottom-up categorization of LLM usage types, cross-labeled by the extent of student initiative these usages entail. Furthermore, we examine how LLM use impacts student learning, measured by performance on three midterms, looking at factors such as frequency and type of usage.
Published: 2026-05-06T06:21:04Z
Notes: EDM 2026
Authors: Minju Park, Ivan Orozco Vasquez, Cristina Conati

http://arxiv.org/abs/2606.00250v1
Effects of Varying LLM Access on Essay Writing Behavior
Updated: 2026-05-29T18:30:30Z
Investigating the degree to which large language models (LLMs) affect teaching and learning in universities can help identify strategies for integrating LLMs in a way that supports, rather than undermines, student learning outcomes. This study examined how varying levels of LLM assistance affect writing performance, engagement, and perceived authorship. We report a pilot study in which 24 college students were randomly assigned to write a short essay with no LLM access, limited access (<=3 prompts, responses capped at 100 words), or unlimited access. Overall essay quality was statistically indistinguishable across groups. Yet writing behavior and perceived authorship diverged sharply: students with limited access reported higher ownership (62.5% would submit the essay as independent work, vs. 25% in the unlimited group), stronger organizational gains, and more strategic, revision-focused prompting. The unlimited group spent more time writing, produced essays more similar to LLM output, and reported reduced creative expression.
Our findings suggest that constraining, rather than banning, LLM access may preserve authorship confidence while retaining the scaffolding benefits of AI assistance.
Published: 2026-05-29T18:30:30Z
Notes: BEA (Building Educational Applications) Workshop 2026
Authors: Julia Christenson, Karin de Langis, Shirley Anugrah Hayati, Dongyeop Kang

http://arxiv.org/abs/2605.31574v1
Can Generative AI help people navigate Radical Moral Disagreements? The CONSIDER prototype
Updated: 2026-05-29T17:43:56Z
Radical Moral Disagreements (RMDs) are highly polarising topics that are increasingly censored in everyday life, with growing evidence suggesting that this polarisation carries measurable costs to public mental health. To address these challenges, some researchers have proposed Large Language Models (LLMs) as a means to support more democratic deliberation and better moral reasoning. Yet existing tools are poorly calibrated to help people navigate RMDs, because of their intense and divisive characteristics. This paper introduces CONSIDER, a prototype for a one-to-one AI tool for RMD navigation. Drawing on Mill's account of the epistemic value of disagreement, CONSIDER aims at value clarification through structured disagreement with an opposing LLM-generated opinion. We describe CONSIDER's design logic and analyse potential risks posed by such tools to guide future development.
Published: 2026-05-29T17:43:56Z
Notes: 25 pages, 1 figure, 2 tables. Submitted manuscript
Authors: William Hohnen-Ford, Sarah Chen, Kathryn B. Francis, Madeline G. Reinecke, Ilina Singh, David Lyreskog

http://arxiv.org/abs/2509.07126v2
Gaze Prediction as Time-Series Forecasting for Virtual Reality Applications: Quantifying Performance Variability and Extreme-Case Errors
Updated: 2026-05-29T17:21:41Z
Gaze prediction is essential for addressing motion-to-photon latency and ensuring seamless foveated rendering in Virtual Reality. The reliability of gaze forecasting is highly sensitive to individual differences and the eye movements being predicted.
We evaluate recurrent, transformer-based, and classification-guided architectures to assess their generalization capabilities across oculomotor events. Using the GazeBase VR and Meta Quest Pro datasets, we analyzed the relationship between the median (P50) and high-percentile (P95) error profiles across subjects. The analysis reveals significant performance variability, showing that subjects with low P50 errors do not always exhibit the lowest extreme-case errors. Consequently, low median errors do not guarantee the robustness of the utilized solution. We discuss inference performance and address the class imbalance problem in short-term gaze prediction. These results identify a gap in standardized evaluation methods, necessitating a shift toward P95-focused, subject-specific metrics to develop reliable and perceptually stable gaze-contingent systems.
Published: 2025-09-08T18:27:58Z
Notes: 11 pages, 2 figures. To appear in the Proceedings of the 2026 ACM Symposium on Eye Tracking Research and Applications (ETRA '26). Dataset and source code available at https://hdl.handle.net/10877/23835
Authors: Kateryna Melnyk, Lee Friedman, Oleg Komogortsev
DOI: 10.1145/3797246.3803043

http://arxiv.org/abs/2605.31556v1
Vision-Language Models Suppress Female Representations Under Ambiguous Input
Updated: 2026-05-29T17:20:02Z
Alignment teaches vision-language models (VLMs) to avoid expressing demographic biases, and when gender is clearly visible they largely succeed. Far less is known about ambiguous inputs (a worker in full gear, a figure seen from behind), cases common in practice yet rarely studied. We find that minimal prompting pressure exposes occupation-gender defaults on ambiguous input images, with models collapsing to male even for strongly female-stereotyped occupations. But do these outputs reflect what models actually encode internally?
We introduce LALS (Latent Association Leaning Score), a zero-shot metric that projects visual-token activations into the model's text-embedding space to measure concept associations per token and layer. Across 15 occupations, over 800 gender-ambiguous images, and four VLMs, internal representations and outputs are systematically decoupled: models often encode a female association internally yet output male. Layer-wise analysis reveals an asymmetric filter -- male signal amplifies end-to-end while female signal peaks mid-network and is suppressed before generation -- and a color ablation shows that culturally loaded visual cues such as clothing color further modulate these internal associations.
Published: 2026-05-29T17:20:02Z
Notes: 16 pages, 12 figures, 1 table
Authors: Arnau Marin-Llobet, Simon Henniger, Mahzarin R. Banaji

http://arxiv.org/abs/2603.10468v2
G-STAR: End-to-End Global Speaker-Tracking Attributed Recognition
Updated: 2026-05-29T17:09:39Z
We study timestamped speaker-attributed automatic speech recognition (SA-ASR) for long-form, multi-party speech with overlap. In this setting, chunk-wise inference must preserve meeting-level speaker identity consistency while producing time-stamped, speaker-labeled transcripts. Prior Speech-LLM systems tend to prioritize either local diarization or global labeling, lacking the ability to jointly model fine-grained temporal boundaries and robust cross-chunk identity linking. We propose G-STAR, an end-to-end framework that couples a cache-conditioned speaker-tracking module with a Speech-LLM transcription backbone. The tracker provides structured speaker cues with temporal grounding, and the LLM generates attributed text conditioned on these cues. G-STAR supports component-wise optimization and joint end-to-end training, enabling flexible learning under heterogeneous supervision and domain shift.
Under chunk-wise decoding protocols, experiments on both oracle-segmented local evaluation and full-meeting global evaluation show strong speaker-attributed transcription performance.
Published: 2026-03-11T06:40:01Z
Notes: Submitted to EMNLP 2026
Authors: Jing Peng, Ziyi Chen, Haoyu Li, Yucheng Wang, Duo Ma, Mengtian Li, Yunfan Du, Dezhu Xu, Kai Yu, Shuai Wang

http://arxiv.org/abs/2605.26309v2
Visual Matters: Connecting Aesthetic Appeal and Production Quality of Photos, Infographics and Data Visualizations to Credibility of Social Media Posts
Updated: 2026-05-29T16:52:09Z
The rapid proliferation of visual content raises fundamental questions about how different visual formats and features shape perceived credibility. Drawing on processing fluency theory, this research examines how visuals shape credibility judgments. We focus on three popular formats (photos, infographics, and data visualizations), comparing them to text-only posts, and test how two visual features, aesthetic appeal and production quality, influence credibility through processing fluency as a mediating mechanism. Through a preregistered experiment with 1200 US participants, we found that visual posts are generally perceived as more credible than text-only posts, but this credibility advantage only applies to photos and infographics, not to data visualizations. Aesthetic appeal increases perceived credibility, partially mediated by processing fluency, while production quality had no significant effect on credibility across formats.
These findings differentiate visual formats, advance conceptualizations of visual features, and identify processing fluency as a key mechanism for theorizing credibility across multimodal contexts.
Published: 2026-05-25T20:03:59Z
Authors: Salman Khawar, Yingdan Lu, Yilang Peng, Jiyoung Yeon, Cuihua Shen

http://arxiv.org/abs/2605.31452v1
Translation Analytics for Freelancers II: Benchmarking Local LLMs for Confidential Translation Workflows
Updated: 2026-05-29T15:46:34Z
Building on our previous work, this paper develops practical, low-barrier methods for freelance translators and smaller language service providers to evaluate translation technologies using rigorous yet accessible analytic methods. Here we address a high-stakes, specialized need: offline translation for confidentiality-sensitive domains in which privacy constraints preclude the use of cloud-based engines and commercial LLMs. We expand the Reeve Foundation Trilingual Corpus (RFTC) used in our previous work into a multilingual corpus (RFMC) by adding sentence-aligned German and Simplified Chinese reference translations. We then benchmark several locally runnable language models (via Ollama) across four language directions on 1000+ sentences selected from this corpus. We use consistent single-prompt calls without fine-tuning or domain adaptation, comparing local LLM outputs against commercial NMTs (DeepL, Baidu), a frontier LLM (GPT-5.2), and professional-grade local NMT systems (OPUS-CAT, NeuralDesktop, Promt). Automatic evaluation is conducted with MATEO. Results reveal substantial variation in local LLM performance across language directions and model sizes. The best local LLMs match or surpass local NMT systems and a frontier LLM, though they remain behind top commercial NMTs. These findings underscore the viability of carefully selected local LLM translation for privacy-constrained professionals and inform future research on model scaling and multilingual capability.
Published: 2026-05-29T15:46:34Z
Notes: 20 pages.
Accepted at EAMT-2026 (Tilburg, Netherlands, June 2026)
Authors: Yuri Balashov, Rex VanHorn, Mingxi Xu, Austin Downes

http://arxiv.org/abs/2606.00182v1
The New Social Image: How AI Competency and AI Proactivity Influence Self- and Peer-Perceptions in the Workplace
Updated: 2026-05-29T14:47:48Z
Human-AI collaboration is considered the most promising way to incorporate AI in the workplace. What remains unexplored are the experiential consequences of this teaming. More specifically, it is unclear how, in a team with AI, humans perceive themselves (self-perception) and how they are perceived by their coworkers (peer perception) in terms of work ownership and job meaningfulness. In a 2x2x2 vignette study (n=50), participants rated perceptions of ownership, affect, job meaningfulness and satisfaction, and role dynamics across two levels (low/high) of AI proactivity and AI competency as within-subject factors, with point-of-view (self perception/peer perception) as between-subjects. Our results showed that AI with low competency or low proactivity generally improved feelings related to ownership, meaningfulness, satisfaction, and role dynamics, and also increased positive affect while reducing negative affect. However, these effects were often influenced by point-of-view. For instance, low AI proactivity resulted in higher job satisfaction from self-perception rather than peer perception. Based on our findings, we argue that designing AI for the future of work solely around performance metrics may not be adequate.
Highly competent and proactive AI-driven systems can have undesirable impacts on perceptions of ownership, job identity, social image and team dynamics, and consequently, job meaningfulness.
Published: 2026-05-29T14:47:48Z
Notes: Accepted for publication in Interacting with Computers (Oxford University Press)
Authors: Kuntal Ghosh, Marc Hassenzahl, Shadan Sadeghian
DOI: 10.1093/iwc/iwag033

http://arxiv.org/abs/2605.31375v1
Toward Accessible Mobile Money: A Voice-Driven, Biometrically Secured USSD Automation Framework for Visually Impaired Users
Updated: 2026-05-29T14:44:55Z
Financial inclusion has expanded significantly across Africa through mobile money services delivered primarily via USSD technology. However, visually impaired individuals continue to face accessibility and security barriers when conducting financial transactions. Current USSD systems are not designed for non-visual interaction, forcing users to rely on third-party assistance even for PIN entry, thereby increasing fraud exposure and reducing transaction confidence. Although alternative assistive technologies such as screen readers exist, they are not compatible with USSD operations, often causing sessions to time out before the user can complete a transaction. This paper presents an Android-based intelligent middleware that automates USSD transactions, integrates biometric-secured PIN injection, and introduces a privacy-preserving screen-dimming mechanism: Blackout Mode. The system leverages Android Accessibility Services, hardware-backed Keystore security, and on-device natural language parsing to enable independent, secure voice-based mobile money access.
We show that the proposed solution improves task success rates from 65-75% to more than 90% and reduces transaction completion time from 40-60 seconds to 12-15 seconds, while also improving perceived security.
Published: 2026-05-29T14:44:55Z
Authors: Sunday Ajayi, Babatunde Eric Olatunji, Eric Umuhoza

http://arxiv.org/abs/2605.03873v2
Bodyless Presence: Reconsidering the Minimal Self in Immersive Video
Updated: 2026-05-29T14:24:35Z
Immersive video, namely 180-degree and 360-degree video designed to be viewed through head-mounted displays, constitutes an important boundary case between interactive VR and conventional two-dimensional video viewing for reconsidering self-experience in XR. In immersive video, the user can select the direction of the viewpoint through head rotation, while being unable to actively change the recorded environment through walking, approaching, grasping, or manipulating. In many cases, no explicit body or avatar corresponding to the user is provided. This paper reinterprets presence in immersive video not as bodily extension or body ownership of an avatar, but as a form of self-experience in which self-location becomes relatively dominant under conditions of reduced body schema availability. This paper calls this condition a self-location-dominant state. In this state, viewpoint-directed agency is retained, whereas environment-directed agency and body ownership are constrained. Nevertheless, events such as viewpoint motion, impact, contact, and direct address may be experienced not merely as changes within an image, but as events concerning the viewpoint position at which the self is located. This paper examines this structure by connecting research on presence, the sense of embodiment, bodily self-consciousness, and the minimal self. The minimal self in immersive video is thereby redescribed not primarily in terms of agency or ownership, but in terms of viewpoint-based self-location established under conditions in which the contribution of the body schema is reduced.
This perspective provides a basis for theorising self-experience in non-interactive immersive media and for reconsidering the relation between body, viewpoint, and presence in XR.
Published: 2026-05-05T15:34:39Z
Notes: 12 pages, 3 figures. Revised version with expanded theoretical discussion of self-location, agency, body schema availability, and bodily self-consciousness. Project page: https://sites.google.com/view/bodylesspresence/
Authors: Koichi Toida

http://arxiv.org/abs/2605.31340v1
Appropriateness of Empathy in AI: A Signal-Cost Perspective
Updated: 2026-05-29T14:19:01Z
The appropriateness of empathy in AI has emerged as a critical concern, as excessive empathy risks seeming manipulative while insufficient empathy appears dismissive. While prior research has explored how to quantify empathy in AI, few studies examine whether such empathy is contextually appropriate. This paper introduces an economic perspective by applying signaling theory to human-AI conversations. We propose Signal Cost Proxies (emotional richness, perspective-taking, and contextual tailoring) mapped to affective, cognitive, and associative empathy. This multidimensional framework enables systematic evaluation of empathy not just by presence, but by its appropriateness relative to user demand.
Published: 2026-05-29T14:19:01Z
Notes: Accepted by IEEE CASCON 2025
Authors: Chi-Ching Juan, Tao Wang, Harold Lee
DOI: 10.1109/CASCON66301.2025.00102