http://arxiv.org/abs/2605.00025v2
MoDAl: Self-Supervised Neural Modality Discovery via Decorrelation for Speech Neuroprosthesis
Updated: 2026-05-26T01:46:15Z
Speech neuroprosthesis systems decode intended speech from neural activity in the absence of audible output, offering a path to restoring communication for individuals with speech-impairing conditions. Current approaches decode predominantly from motor cortical areas, discarding others -- such as area 44, part of Broca's area -- that may encode complementary linguistic information. We introduce MoDAl (Modality Decorrelation and Alignment), a framework that discovers complementary neural modalities through the interplay of two objectives in a shared projection space. A contrastive loss aligns each of several parallel brain encoders with the text embeddings of a pretrained large language model (LLM), while a decorrelation loss prevents the encoders from coalescing to duplicative representations. We prove that these objectives are in productive tension: Contrastive alignment induces transitive modality coalescence, which decorrelation must counteract for the framework to discover diverse neurolinguistic modalities. On the Brain-to-Text Benchmark '24, MoDAl reduces word error rate (WER) from 26.3% to 21.6% compared to the previous best end-to-end method, with the gain from incorporating previously discarded area 44 signals arising entirely from the decorrelation mechanism.
Analysis of the discovered modalities reveals functional specialization: Encoders receiving area 44 input capture structural and syntactic properties (sentence length, grammatical voice, wh-words), consistent with the neurolinguistic understanding of Broca's area.
Published: 2026-04-22T03:02:51Z
Authors: Yuanhao Chen, Peter Chin

http://arxiv.org/abs/2605.26325v1
Real-time, Directionality Aware 3D Ultrasound Reconstruction and Re-Slicing
Updated: 2026-05-25T21:00:53Z
Tele-ultrasound through teleoperation allows experts to perform examinations remotely in communities, but limited connectivity can lead to communication delays that reduce usability and diagnostic performance. Visual-haptic model mediated teleoperation reslices a pre-acquired ultrasound volume in real time to provide an accurate, delay-independent preview image for the sonographer. This enables fast and robust exploration before using the live image for fine tuning. However, existing reslicing techniques do not account for the directional nature of ultrasound - the fact that a structure looks different when imaged from different directions. This paper presents Directionality-Aware Reslicing (DARE), an ultrasound volume reconstruction and reslicing framework that takes directionality into account. The presented GPU-accelerated algorithm allows real-time reslicing from arbitrary viewpoints to generate accurate preview images. The method is evaluated quantitatively through image similarity metrics and qualitatively through a user study, and significantly outperforms existing reslicing methods in image similarity and realism compared to a ground truth.
This can improve the effectiveness and robustness of tele-ultrasound in low-resource areas.
Published: 2026-05-25T21:00:53Z
Authors: Tobias Jaeggi, David Gregory Black, Septimiu Salcudean

http://arxiv.org/abs/2605.20255v2
Multi-Agent Reinforcement Learning for Safe Autonomous Driving Under Pedestrian Behavioral Uncertainty
Updated: 2026-05-25T19:49:33Z
Simulation-based testing of self-driving cars (SDCs) typically relies on scripted pedestrian models that do not capture the heterogeneity and uncertainty of real crossing behavior, limiting the realism of safety assessments, especially for jaywalking, which is governed by latent personality traits the vehicle cannot observe. We hypothesize that jointly training pedestrians and the SDC with multi-agent reinforcement learning (MARL) yields more realistic interaction scenarios than training against fixed pedestrian policies, and that the behavior gap between predictable and unpredictable crossings can be measured directly from trajectories. We co-train an SDC and 12 pedestrians using Multi-Agent Proximal Policy Optimization (MAPPO): pedestrian locomotion follows scripted Dijkstra pathfinding while an RL policy controls high-level go/wait decisions, and jaywalking probability depends on a per-pedestrian trait sampled at episode start and hidden from the SDC. In 500-episode evaluations, the co-trained SDC reached 78% of goals with a 14% collision rate, versus 35%/33% for the best rule-based baseline. A speed differential metric shows the SDC traveled 2.65 m/s faster near jaywalkers than near crosswalk users at close range (0-3 m), indicating jaywalking encounters were not anticipated.
Jaywalking was 13% of crossing events but 62% of collisions, and co-training reduced collisions by 30% relative to single-agent RL as pedestrians learned to wait when the SDC approached at speed.
Published: 2026-05-18T12:02:41Z
Comment: Accepted to ICRA 2026 Workshop "8th Workshop on Long-term Human Motion Prediction"
Authors: Prakash Aryan, Kaushik Raghupathruni, Timo Kehrer, Sebastiano Panichella

http://arxiv.org/abs/2605.22774v2
CogAdapt: Transferring Clinical ECG Foundation Models to Wearable Cognitive Load Assessment via Lead Adaptation
Updated: 2026-05-25T18:20:33Z
Real-time cognitive load assessment is essential for adaptive human-computer interaction but remains challenging due to limited labeled data and poor cross-subject generalization. Recent ECG foundation models pre-trained on millions of clinical recordings offer rich representations, but cannot be directly applied to wearable devices due to sensor configuration mismatch and task differences. In this paper, we propose CogAdapt, a framework that adapts clinical ECG foundation models to wearable cognitive load assessment. CogAdapt introduces LeadBridge, a learnable adapter that transforms 3-lead wearable signals into anatomically consistent 12-lead representations, and ProFine, a progressive fine-tuning strategy that gradually unfreezes encoder layers while preventing catastrophic forgetting. Evaluations on two public datasets (CLARE and CL-Drive) under leave-one-subject-out cross-validation show that CogAdapt substantially outperforms baselines trained from scratch, achieving macro-F1 scores of 0.626 and 0.768. These results demonstrate the promise of foundation model adaptation for subject-independent cognitive load assessment from wearable sensors.
Published: 2026-05-21T17:33:35Z
Comment: 7 pages, 7 figures.
Submitted to IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI 2026)
Authors: Amir Mousavi, Erfan Nourbakhsh, Mohammad Sadegh Sirjani, Mimi Xie, Rocky Slavin, Leslie Neely, John Davis, John Quarles

http://arxiv.org/abs/2605.26196v1
"You do understand that people don't trust technology?": Explaining Trusted Execution Environments to Non-Experts
Updated: 2026-05-25T16:31:10Z
Trusted Execution Environments (TEEs) protect confidentiality and integrity of trusted applications by creating an isolated environment for executing code. Prior work has shown that users may feel more comfortable sharing data when they know it will be protected by a TEE, especially if they understand what a TEE is. In this study, we evaluated text-based explanations introducing TEEs to non-experts. We analyzed existing TEE explanations to develop candidate explanations and evaluated them via vignette scenarios with 966 crowdworkers. The explanations that enhanced understanding most were non-technical ones that highlighted specific threats that can be prevented by a TEE. Surprisingly, even the explanations that enhanced understanding had little effect on willingness to use the TEE-enhanced technology. These results provide insights into ways to communicate technical security concepts more effectively but also suggest that explaining security technology might not be enough to address users' privacy concerns.
Published: 2026-05-25T16:31:10Z
Authors: McKenna McCall, Carolina Carreira, Miguel Flores, Lorrie Faith Cranor

http://arxiv.org/abs/2605.21613v2
Simulating Learners' Task-Selection Strategies and System Constraints in Mastery Learning
Updated: 2026-05-25T15:09:49Z
Intelligent Tutoring Systems often grant learners shared control over skill and problem selection. This choice brings motivational and metacognitive benefits. At the same time, past literature suggests that learners exhibit diverse preferences and strategies in selecting tasks, for instance, by avoiding challenge.
Although underexplored, differences in learner task-selection strategies may interact with mastery learning systems that optimize task-selection based on estimated knowledge, potentially leading to undesirable student-level differences in learning outcomes. Algorithmic constraints on problem selection may help mitigate this issue. However, this possibility has not been comprehensively explored in prior work, in part because testing such constraints in real-world classrooms is costly. We propose a simulation-based framework to observe how varying learner task-selection strategies combined with system constraints shape mastery learning efficiency. Using interaction data from 261 students across two mathematical domains with different problem structures (equation solving, graph interpretation), we simulate common task-selection strategies such as Weakness Targeting and Interleaving, grounded in prior literature. We then evaluate how these strategies affect overpractice as a common measure of mastery learning efficiency. Results show substantial variability in efficiency across strategies, with risk-averse strategies producing higher levels of overpractice, especially for more complex multi-step problems. Targeted system constraints significantly reduce these inefficiencies for maladaptive strategies while having minimal impact on already efficient strategies. Together, these findings demonstrate how simulation grounded in real student data can support data-driven redesign of shared-control tutoring systems prior to classroom deployment.
Published: 2026-05-20T18:20:50Z
Comment: Accepted as short paper to the 19th annual Educational Data Mining conference (EDM '26)
Authors: Haley Noh, Aarna Chowdhary, Jeroen Ooge, Vincent Aleven, Conrad Borchers

http://arxiv.org/abs/2605.25868v1
The Timing Dependencies of Trust: Speed, Accuracy, and cBCI Neuro-Decoupling in Human-AI Teams
Updated: 2026-05-25T13:56:17Z
The speed and accuracy of an artificial teammate fundamentally alter the failure states of Human-AI integration.
While high-speed AI interventions risk inducing reflexive blind compliance, delayed interventions can induce ambiguous cognitive conflict. This study investigates how the fundamental characteristics of an in-task AI assistant, Fast/Less-Accurate (FLA-AI) versus Slow/Accurate (SA-AI), impact the synergy of Collaborative Brain-Computer Interface (cBCI) teams in a Virtual Reality drone task. Seventeen operators completed continuous search tasks under high cognitive workload while their spatial covariance was mapped using a 2D Adaptive Riemannian Oracle. The results mathematically demonstrate that AI timing dictates the mechanism of team failure. Fast AI induced instant, blind compliance; human accuracy under deception collapsed to 50.2%, and pure behavioural teams (N=8) failed to scale beyond 74.1%. In contrast, Slow AI induced delayed cognitive conflict; humans hesitated (61.1% accuracy), but N=8 behavioural teams eventually recovered to 100.0%. Crucially, the Riemannian Oracle mathematically adapted to these states: it heavily restricted temporal windows (< 0.8s) to intercept fast reflexive compliance, while widening windows (> 1.2s) to capture delayed cognitive conflict. Integrating these isolated veridical signals via Hybrid Fusion successfully rescued the Fast AI team (+7.6% at N=8) and significantly accelerated the recovery of smaller Slow AI teams (+6.9% at N=4). These findings prove that cBCI synergy is heavily contingent on the temporal dynamics of trust, providing a critical framework for designing dynamically gated Human-AI systems.
Published: 2026-05-25T13:56:17Z
Authors: Christopher Baker, Stephen Hinton, Akashdeep Nijjar, Riccardo Poli, Caterina Cinel, Tom Reed, Stephen Fairclough

http://arxiv.org/abs/2605.25856v1
Explaining Too Much? Understanding How Large Language Model Reasoning Traces Influence Performance and Metacognition
Updated: 2026-05-25T13:46:04Z
Large Language Model interfaces are increasingly verbose, exposing intermediate reasoning traces alongside final answers.
Traces are framed as transparency mechanisms, yet it is unclear how people use them to solve problems. We report a preregistered between-subjects study (N = 559) in which participants solved ten LSAT-style reasoning problems under one of three conditions: an Answer-only baseline, a Full-trace revealed before the answer, and a Summary-trace presented alongside the answer. Summaries preserved task performance at the no-trace baseline while significantly elevating trust and hedonic appeal, establishing that trace exposure shifts subjective appraisal of the interaction without bringing performance benefits. Under an open-weight reasoning model exposing verbose intermediate output, full traces additionally impaired performance relative to the answer-only baseline. Across all conditions, participants substantially overestimated their performance, and no trace format supported calibrated self-evaluation. Further analysis indicates that hedonic appeal, not trust, carries the indirect path to overestimation, consistent with a processing-fluency account. Reasoning traces are best understood as user-facing interface artifacts rather than transparent windows into model cognition, and calibration is unlikely to emerge from the traces themselves and may best be scaffolded by interactions that elicit users' own reasoning first.
Published: 2026-05-25T13:46:04Z
Comment: 27 pages, 5 figures, 9 tables
Authors: Daniela Fernandes, Daniel Buschek, Lev Tankelevitch, Thomas Kosch, Robin Welsch

http://arxiv.org/abs/2502.10311v3
ExplainReduce: Generating global explanations from many local explanations
Updated: 2026-05-25T11:26:15Z
Most commonly used non-linear machine learning methods are closed-box models, uninterpretable to humans. The field of explainable artificial intelligence (XAI) aims to develop tools to examine the inner workings of these closed boxes.
An often-used model-agnostic approach to XAI involves using simple models as local approximations to produce so-called local explanations; examples of this approach include LIME, SHAP, and SLISEMAP. This paper shows how a large set of local explanations can be reduced to a small "proxy set" of simple models, which can act as a generative global explanation. This reduction procedure, ExplainReduce, can be formulated as an optimisation problem and approximated efficiently using greedy heuristics. We show that, for many problems, as few as five explanations can faithfully emulate the closed-box model and that our reduction procedure is competitive with other model aggregation methods.
Published: 2025-02-14T17:14:02Z
Comment: 21 pages with a 36 page appendix, 8 + 39 figures, 1+1 tables. The datasets and source code used in the paper are available at https://github.com/edahelsinki/explainreduce. Accepted for publication in the 4th World Conference on eXplainable Artificial Intelligence (2026)
Authors: Lauri Seppäläinen, Mudong Guo, Kai Puolamäki

http://arxiv.org/abs/2512.23076v2
Multimodal Functional Maximum Correlation for Emotion Recognition
Updated: 2026-05-25T10:54:25Z
Emotional states manifest as coordinated yet heterogeneous physiological responses across central and autonomic systems, posing a fundamental challenge for multimodal representation learning in affective computing. Learning such joint dynamics is further complicated by the scarcity and subjectivity of affective annotations, which motivates the use of self-supervised learning (SSL). However, most existing SSL approaches rely on pairwise alignment objectives, which are insufficient to characterize dependencies among more than two modalities and fail to capture higher-order interactions arising from coordinated brain and autonomic responses.
To address this limitation, we propose Multimodal Functional Maximum Correlation (MFMC), a principled SSL framework that maximizes higher-order multimodal dependence through a Dual Total Correlation (DTC) objective. By deriving a tight sandwich bound and optimizing it using a functional maximum correlation analysis (FMCA) based trace surrogate, MFMC captures joint multimodal interactions directly, without relying on pairwise contrastive losses.
Experiments on three public affective computing benchmarks demonstrate that MFMC consistently achieves state-of-the-art or competitive performance under both subject-dependent and subject-independent evaluation protocols, highlighting its robustness to inter-subject variability. In particular, MFMC improves subject-dependent accuracy on CEAP-360VR from 78.9% to 86.8%, and subject-independent accuracy from 27.5% to 33.1% using the EDA signal alone. Moreover, MFMC remains within 0.8 percentage points of the best-performing method on the most challenging EEG subject-independent split of MAHNOB-HCI. Our code is available at https://github.com/DY9910/MFMC.
Published: 2025-12-28T20:48:02Z
Comment: manuscript accepted by IEEE Transactions on Affective Computing. Code is available at https://github.com/DY9910/MFMC
Authors: Deyang Zheng, Tianyi Zhang, Wenming Zheng, Shujian Yu
DOI: 10.1109/TAFFC.2026.3695876

http://arxiv.org/abs/2605.25664v1
Posture Clip: Sit properly or I wont let you work
Updated: 2026-05-25T10:14:37Z
Poor posture is a significant concern due to its detrimental effects on health and productivity. This paper presents a collar-clipped device called PostureClip, designed to restrict users from sitting and working at a bent angle, by blacking out the screen and resuming on correcting posture, thereby promoting better posture. The device integrates sensors and feedback mechanisms to provide real-time posture feedback to users.
To evaluate the effectiveness of PostureClip, a controlled experiment was conducted with participants (n=165) who were working on a laptop/PC for over 6 hours per day. The participants were randomly assigned to one of two intervention groups (IG1, n=54; IG2, n=55), which used the collar-clipped device, or to the control group (CG, n=56), which did not use the device. IG1 received no feedback, while IG2 received feedback from the device through notifications and further darkening of the screen. The study was conducted in the participants' office environment for 4 weeks, and metrics such as posture angle, duration of bent angle, and user feedback were collected.
Analysis revealed significant improvements in posture angle (p<0.001) and a significant reduction in bent angle duration (p<0.01) for the group using PostureClip with feedback, compared with the group without feedback and the control group (which received no intervention). The qualitative analysis of user feedback highlighted the device's ease of use, effectiveness in providing timely feedback, and positive impact on participants' awareness and habits regarding posture. These results indicate that PostureClip is an effective tool for promoting better posture during sedentary work.
Published: 2026-05-25T10:14:37Z
Comment: Published online by Cambridge University Press on 14 May 2026
Journal reference: Wearable Technologies, 7, e5 (2026)
Authors: Arka Majhi, Aparajita Mondal
DOI: 10.1017/wtc.2026.10041

http://arxiv.org/abs/2605.25643v1
WeeCare: Towards Handheld Bladder Fullness Sensing with a Conformable Pad
Updated: 2026-05-25T09:48:24Z
Patients with bladder dysfunction often lose the sensation of bladder fullness and cannot void naturally, forcing reliance on fixed-schedule catheterization that is uncomfortable and risks complications. We present WeeCare, a handheld conformable pad with fabric electrodes for on-demand bladder fullness sensing using electrical impedance tomography (EIT). The central challenge is that repeated removal and reattachment can introduce variation in electrode position and contact quality. We assess WeeCare along three axes: in-silico simulations characterizing electrode layout and noise robustness, in-vitro phantom experiments across urine salinities and filling levels, and an in-vivo human measurement for bladder fullness sensing, voiding, and filling dynamics.
Published: 2026-05-25T09:48:24Z
Authors: Zhikai Qin, Siqi Zhang, Junyi Zhu, Justin Chan

http://arxiv.org/abs/2605.25541v1
TopoAlign: Topology-Aware Visual Representation Alignment
Updated: 2026-05-25T07:58:26Z
Neural networks encode inputs as high-dimensional vectors, known as representations, that capture how models process data by encoding task-relevant structure and semantics.
Representation alignment refers to the degree to which different models, layers, or training conditions produce similar representations for the same inputs, with important implications for model interpretation, selection, and robustness analysis. Existing approaches to measure alignment primarily rely on geometric properties, such as neighborhood and cluster similarity, offering limited insight into the global organization of representations. In this work, we present TopoAlign, a topology-aware framework for visually comparing model representations from a structural perspective. Leveraging mapper graphs from topological data analysis, TopoAlign jointly analyzes graphs constructed from representations of shared inputs across different models or layers. The framework supports a top-down comparative workflow: it first performs global structure alignment via joint force-directed optimization to produce coordinated graph layouts; it then identifies local correspondences through automated detection of structurally matching regions, visualized with Bubble Sets; and finally it enables fine-grained pattern inspection through motif-based queries and membrane-inspired visualizations. We demonstrate TopoAlign through case studies on language and multimodal models, complemented by expert feedback. Our results show that TopoAlign provides meaningful insights into representation structure and alignment from a topological perspective.
Published: 2026-05-25T07:58:26Z
Authors: Xinyuan Yan, Rita Sevastjanova, Mennatallah El-Assady, Bei Wang

http://arxiv.org/abs/2606.00093v1
Agreement Metrics for LLM-as-Judge Evaluation: What to Report and Why
Updated: 2026-05-25T07:31:44Z
Validating an LLM judge against human annotations usually means reporting several agreement statistics: accuracy, precision, recall, $F_1$, Cohen's $κ$, and one or more rank correlations.
A survey of 24 recent LLM-as-judge papers finds metric choice entangled with the judgment scale, tie handling, invalid outputs, and abstention handling, and those choices rarely stated. For binary criteria -- the common case in rubric-based evaluation, where each criterion is graded MET or UNMET -- most of the reported numbers are redundant: Pearson's $r$, Spearman's $ρ$, Kendall's $τ_b$, the phi coefficient $φ$, and the Matthews Correlation Coefficient all reduce to a single number on non-degenerate binary data, so reporting several of them only creates an illusion of corroborating evidence. Cohen's $κ$ is the one agreement coefficient that adds information: it shares $φ$'s numerator but normalizes differently, and the gap between them measures how far the judge's positive-label rate has drifted from the human's. We then trace what changes when a judge may abstain with a CANNOT_ASSESS verdict: the three common ways of handling abstentions are not interchangeable preprocessing choices but answer different questions, and they break the binary equivalences. The same equivalences reappear, up to a negligible finite-sample correction, for multi-judge ensembles scored with Fleiss' $κ$ or Krippendorff's $α$. We close with a reporting checklist that names the judgment scale, the abstention and tie handling mode, coverage, the confusion matrix, and the aggregation level alongside any scalar agreement coefficient.
Published: 2026-05-25T07:31:44Z
Comment: 12 pages
Authors: Delip Rao, Chris Callison-Burch

http://arxiv.org/abs/2506.01982v4
Music Interpretation and Emotion Perception: A Computational and Neurophysiological Investigation
Updated: 2026-05-25T07:03:10Z
This study investigates emotional expression and perception in music performance using computational and neurophysiological methods. The influence of different performance settings, such as repertoire, diatonic modal etudes, and improvisation, as well as levels of expressiveness, on performers' emotional communication and listeners' reactions is explored.
Professional musicians performed various tasks, and emotional annotations were provided by both performers and the audience. Audio analysis revealed that expressive and improvisational performances exhibited unique acoustic features, while emotion analysis showed stronger emotional responses. Neurophysiological measurements indicated greater relaxation in improvisational performances. This multimodal study highlights the significance of expressivity in enhancing emotional communication and audience engagement.
Published: 2025-05-16T15:30:38Z
Comment: Accepted at SMC 2025
Authors: Vassilis Lyberatos, Spyridon Kantarelis, Ioanna Zioga, Christina Anagnostopoulou, Giorgos Stamou, Anastasia Georgaki
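
Illustrative note on the entry "Agreement Metrics for LLM-as-Judge Evaluation" (arXiv:2606.00093) listed above: its abstract states that, on non-degenerate binary data, Pearson's r, the phi coefficient, and the Matthews Correlation Coefficient reduce to a single number, while Cohen's κ normalizes differently. The following is a minimal sketch of that relationship and is not taken from the paper. The label vectors and the function names (`confusion`, `pearson`, `phi`, `cohen_kappa`) are hypothetical, chosen only for illustration.

```python
import math

def confusion(human, judge):
    """Count the 2x2 table for binary labels (1 = MET, 0 = UNMET)."""
    pairs = list(zip(human, judge))
    tp = sum(1 for h, j in pairs if h == 1 and j == 1)
    tn = sum(1 for h, j in pairs if h == 0 and j == 0)
    fp = sum(1 for h, j in pairs if h == 0 and j == 1)
    fn = sum(1 for h, j in pairs if h == 1 and j == 0)
    return tp, tn, fp, fn

def pearson(x, y):
    """Plain Pearson correlation computed from the raw label vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def phi(tp, tn, fp, fn):
    """Phi coefficient; for a 2x2 table this is the same formula as the MCC."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den

def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + tn + fp + fn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical labels: the human marks 4 of 10 items MET, the judge marks 5 of 10.
human = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
judge = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

table = confusion(human, judge)
r_value = pearson(human, judge)
phi_value = phi(*table)
kappa_value = cohen_kappa(*table)

print(f"Pearson r = {r_value:.3f}")      # 0.408
print(f"phi / MCC = {phi_value:.3f}")    # 0.408
print(f"Cohen's kappa = {kappa_value:.3f}")  # 0.400
```

In this toy example r and phi coincide, whereas κ is slightly lower because the judge's positive-label rate (0.5) differs from the human's (0.4). When the two marginal rates are equal, the two denominators agree and κ equals φ.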