http://arxiv.org/abs/2606.01991v1 SafeMCP: Proactive Power Regulation for LLM Agent Defense via Environment-Grounded Look-Ahead Reasoning 2026-06-01T09:48:41Z

As Large Language Model (LLM) agents increasingly leverage the Model Context Protocol (MCP) to operate in complex environments, the expansion of their action spaces offers agents unsafe capabilities and underscores the risk of power-seeking. While broad action space and greater environment influence are essential for task fulfillment, they create a fragile risk surface where minor errors or hallucinations are magnified into catastrophic failures. In response, we propose SafeMCP, a server-side defense plugin that constrains tool acquisition via predictive reasoning regarding future safety risks. SafeMCP utilizes an internal world model for look-ahead reasoning to implement a two-tier defense: proactive tool filtering to constrain hazardous power expansion and immediate intervention as a fail-safe. To train SafeMCP, we introduce a three-stage pipeline comprising environmental dynamic grounding, safe policy initialization, and reinforcement learning (RL) with dual verifiable rewards. Experiments on PowerSeeking Bench, ToolEmu, and AgentHarm show that SafeMCP achieves a safe equilibrium, effectively mitigating risks while preserving agent utility.

2026-06-01T09:48:41Z Accepted to the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), Main Conference Lichao Wang Zhaoxing Ren Tianzhuo Yang Jiaming Ji Chi Harold Liu Yaodong Yang Juntao Dai http://arxiv.org/abs/2606.01875v1 Waterproof Editor: an educational environment for proof assistants and programming languages 2026-06-01T08:20:21Z

Waterproof Editor provides an educational environment specifically targeted to teaching with proof assistants or programming languages. It arose from Waterproof, educational software targeted at helping students acquire the skill of giving mathematical proofs. Its original features such as enabling rich formatting and providing clear input areas are now abstracted away in an npm package and can be used in different educational contexts. We invite interested parties to use this component in their educational software, and offer to assist with this.

2026-06-01T08:20:21Z To be presented at TEAL 2026: Tools for Educational Activities in Logic Pim Otte Dick Arends Raul Sánchez Flores Pieter Wils Jim Portegies http://arxiv.org/abs/2604.01562v2 Acoustic and perceptual differences between standard and accented speech and their voice clones 2026-06-01T05:58:54Z

Voice cloning is often evaluated in terms of overall quality, but less is known about accent preservation and its perceptual consequences. We compare standard and heavily accented Mandarin speech and their voice clones using a combined computational and perceptual design. Embedding-based analyses showed larger original-clone distances for accented speakers in several speaker-discriminative embedding spaces, but this difference disappeared after normalizing against each speaker's within-original baseline variability. In the perception study, clones are rated as more similar to their originals for standard than for accented speakers, and intelligibility increases from original to clone, with a larger gain for accented speech. These results show that accent variation can shape perceived identity match and intelligibility in voice cloning even when it is not reflected in baseline-normalized speaker-embedding distance, and they motivate treating accent preservation as an explicit component of speaker identity preservation, rather than assuming that it is fully captured by off-the-shelf speaker-discriminative embeddings.
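
The baseline normalization described in this abstract can be illustrated with a minimal sketch. The abstract does not state the distance metric or the exact form of the normalization, so the cosine distance, the ratio form, and all function names below are assumptions chosen for illustration, not the authors' procedure.

```python
import math
from statistics import mean


def cosine_distance(u, v):
    """Cosine distance between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def within_original_baseline(originals):
    """Mean pairwise distance among one speaker's original utterances."""
    distances = [
        cosine_distance(originals[i], originals[j])
        for i in range(len(originals))
        for j in range(i + 1, len(originals))
    ]
    return mean(distances)


def normalized_clone_distance(originals, clones):
    """Original-clone distance divided by the speaker's own variability.

    The ratio form is a hypothetical choice; a difference or z-score
    would be equally plausible readings of "normalizing".
    """
    raw = mean(cosine_distance(o, c) for o in originals for c in clones)
    return raw / within_original_baseline(originals)


# Toy 2-D "embeddings", invented for illustration only.
toy_originals = [[1.0, 0.0], [0.0, 1.0]]
toy_clones = [[1.0, 0.0]]
print(normalized_clone_distance(toy_originals, toy_clones))
```

Under this reading, a speaker whose original utterances already vary widely has a larger baseline, so a larger raw original-clone distance need not produce a larger normalized one.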

2026-04-02T03:17:41Z Tianle Yang Chengzhe Sun Phil Rose Siwei Lyu http://arxiv.org/abs/2511.06676v3 How AI Fails: An Interactive Pedagogical Tool for Demonstrating Dialectal Bias in Automated Toxicity Models 2026-06-01T05:41:41Z

Now that AI-driven moderation has become pervasive in everyday life, we often hear claims that "the AI is biased". While this is often said jokingly, the light-hearted remark reflects a deeper concern. How can we be certain that an online post flagged as "inappropriate" was not simply the victim of a biased algorithm? This paper investigates this problem using a dual approach. First, I conduct a quantitative benchmark of a widely used toxicity model (unitary/toxic-bert) to measure performance disparity between text in African-American English (AAE) and Standard American English (SAE). The benchmark reveals a clear, systematic bias: on average, the model scores AAE text as 1.8 times more toxic and 8.8 times higher for "identity hate". Second, I introduce an interactive pedagogical tool that makes these abstract biases tangible. The tool's core mechanic, a user-controlled "sensitivity threshold," demonstrates that the biased score itself is not the only harm; instead, the more-concerning harm is the human-set, seemingly neutral policy that ultimately operationalises discrimination. This work provides both statistical evidence of disparate impact and a public-facing tool designed to foster critical AI literacy.
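
The "sensitivity threshold" mechanic can be sketched in a few lines. The scores below are invented for illustration and are not data from the paper; the function names are likewise hypothetical. The sketch only shows how one cutoff, applied to two score distributions, produces different flag rates per group.

```python
def flag_rate(scores, threshold):
    """Fraction of posts whose toxicity score meets or exceeds the threshold."""
    return sum(score >= threshold for score in scores) / len(scores)


def flag_rates_by_threshold(group_a, group_b, thresholds):
    """Map each threshold to the (group_a, group_b) flag rates."""
    return {
        t: (flag_rate(group_a, t), flag_rate(group_b, t)) for t in thresholds
    }


# Hypothetical toxicity scores in [0, 1]; NOT measurements from the paper.
sae_scores = [0.05, 0.10, 0.20, 0.30, 0.40]
aae_scores = [0.10, 0.25, 0.45, 0.55, 0.70]

rates = flag_rates_by_threshold(sae_scores, aae_scores, [0.3, 0.5, 0.8])
for threshold, (sae_rate, aae_rate) in rates.items():
    print(f"threshold={threshold}: SAE flagged {sae_rate:.0%}, AAE flagged {aae_rate:.0%}")
```

In this toy data the gap in flag rates depends on where the cutoff is placed, which is the point the abstract makes about a human-set policy operationalising the bias in the scores.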

2025-11-10T03:49:58Z 9 pages, 5 figures, 4 tables, 14 references. Preliminary abstract presented at the International Conference on Envisioning the Himalayan Future: Pathways to Sustainability and Development (PUiCON 2026) p. 105; abstract available online at: https://pufoe.edu.np/wp-content/uploads/2026/05/PUiCON_2026_Book_of-_Abstracts.pdf Subhojit Ghimire http://arxiv.org/abs/2601.13188v3 Large Language Lovers: Lived Experiences of Negotiating Agency and Platform Control in AI Companionship 2026-05-31T23:24:12Z

Individuals are turning to increasingly anthropomorphic, general-purpose chatbots for AI companionship, rather than roleplay-specific platforms. However, not much is known about how individuals perceive and conduct their relationships with general-purpose chatbots. We triangulated community discussions on Reddit (41k+ posts and comments), survey responses (n=43), and semi-structured interviews (n=13) which revealed internal dynamics, external influences, and steering strategies that shape AI companion relationships. We learned that individuals conceptualize their companions based on an interplay of their beliefs about the companion's own agency and the autonomy permitted by the platform, how they pursue interactions with the companion, and the perceived initiatives that the companion takes. In combination with the external factors that affect relationship dynamics, particularly model updates that can derail companion behaviour and stability, individuals make use of different types of steering strategies to preserve their relationship, for example, by setting behavioural instructions or porting to other AI platforms. We discuss implications for accountability and transparency in AI systems, where emotional connection competes with broader product objectives and safety constraints.

2026-01-19T16:11:19Z FAccT 2026 Patrick Yung Kang Lee Jessica Y. Bo Zixin Zhao Paula Akemi Aoyagui Matthew Varona Ashton Anderson Anastasia Kuzminykh Fanny Chevalier Carolina Nobre 10.1145/3805689.3806746 http://arxiv.org/abs/2606.01475v1 An LLM-based Chain-of-Response Counter-Scam System 2026-05-31T22:29:38Z

The rapid evolution of online scams, driven by transnational networks and mass-produced social engineering scenarios, has exposed the speed limitations of conventional detection, necessitating tighter interagency coordination. While LLMs show promise in scam identification, their role in accelerating integrated response frameworks remains underexplored. We propose Counter Scam, a unified LLM-based multiagent framework that orchestrates end-to-end response from initial detection to crime investigation. The framework first proposes safe data guidelines, emphasizing nonpublic scam data and secure dataset construction via scam-specific NER. Developed with insights from 37 stakeholders to reduce delays and improve analytical efficiency, the system integrates CSRA for multiagent mitigation, CSRT comprising nine role-aligned NLP tasks, and CSRD, a corpus of 185,300 scam cases and 38,587 knowledge entries. Experiments show that fine-tuned sLLMs surpass commercial models by more than 10% across all CSRT tasks and achieve a 0.24 F1 improvement in scam-specific NER. These results demonstrate the framework's capability to enable rapid and collaborative mitigation of online scams.

2026-05-31T22:29:38Z This paper has been accepted for publication at IJCAI 2026 Heedou Kim Mogan Gim Donghee Choi Hoonick Lee Soonil Bae Mi-Young Kim Jaewoo Kang http://arxiv.org/abs/2606.01471v1 Engineering Students' Self-Efficacy, Perceptions, and Performance in a Flipped CS1 Course 2026-05-31T22:07:49Z

This full research paper investigates how engineering students' course-related beliefs relate to exam performance in a flipped introductory programming course. Understanding factors that influence student learning and performance has long been a focus of computing education research. While prior studies have identified psychological and contextually relevant predictors of success, much of this work has examined students majoring in computer science. Yet introductory programming courses now serve many students from other disciplines, whose beliefs and motivations may differ. To examine these relationships in an engineering-focused CS1 context, we analyze survey and exam data from 602 students. An exploratory factor analysis identified three latent factors: self-efficacy, attitudes toward learning, and perceived programming difficulty. Self-efficacy was positively associated with exam performance, while perceived difficulty was negatively associated. Differences in reported beliefs were also observed across demographic groups, even when performance outcomes were similar. These findings align with and extend prior research, highlighting the role of self-efficacy in achievement and persistence in computing education.

2026-05-31T22:07:49Z Preprint. To appear in the Proceedings of the 2026 IEEE Frontiers in Education Conference (FIE) Griffin Pitts Ashish Aggarwal http://arxiv.org/abs/2602.19789v2 Position: Stop Preaching and Start Practising Data Frugality for Responsible Development of AI 2026-05-31T21:09:00Z

This position paper argues that the machine learning community must move from preaching to practising data frugality for responsible artificial intelligence (AI) development. For too long, progress has been equated with ever-larger datasets, driving remarkable advances but now yielding increasingly diminishing performance gains alongside rising energy use and carbon emissions. While awareness of data frugal approaches has grown, their adoption has remained rhetorical, and data scaling continues to dominate development practice. We argue that this gap between preach and practice must be closed, as continued data scaling entails substantial and under-accounted environmental impacts. To ground our position, we provide indicative estimates of the energy use and carbon emissions associated with the downstream use of ImageNet-1K. We then present empirical evidence that data frugality is both practical and beneficial, demonstrating that subset selection methods can substantially reduce training energy consumption with little loss in accuracy, while also mitigating dataset bias. Finally, we outline actionable recommendations for moving data frugality from rhetorical preaching to concrete practice for responsible development of AI.

2026-02-23T12:46:23Z ICML 2026 Sophia N. Wilson Andrew Millard Guðrún Fjóla Guðmundsdóttir Raghavendra Selvan Sebastian Mair http://arxiv.org/abs/2606.01375v1 Beyond Access: Guided LLM Scaffolding for Independent Learning in Undergraduate Statistics 2026-05-31T18:05:23Z

Large language models (LLMs) are increasingly entering students' learning practices, but their educational value depends on whether they support reasoning or enable task completion without engagement. This study examines guided LLM use in an undergraduate Probability and Statistics course, focusing on the gap between assigned access and actual interaction quality. In a four-week quasi-experimental summer program, students were organized into three balanced conditions: no LLM access, unrestricted LLM access, and guided LLM access. The guided condition used the same LLM platform as the unrestricted condition, but students received explicit training and rules promoting reasoning-focused help-seeking, stepwise hints, verification, and ethical use. All quizzes and the delayed final exam were completed without LLM or external assistance, allowing us to distinguish AI-supported practice performance from independent learning. Results show that guided use was associated with clearer learning-oriented interaction patterns than unrestricted access, especially in prioritizing reasoning over final answers and requesting stepwise support. Guided-LLM students showed stronger no-help quiz performance during the intervention phase, whereas unrestricted access appeared more useful for assisted practice completion than for consistently improving independent performance. Available time measures did not support a simple duration-based explanation, and self-assessment calibration suggested better alignment between perceived and demonstrated understanding in the Guided-LLM condition. Overall, LLM access alone appears to be an incomplete educational intervention. For Artificial Intelligence in Education (AIED), the central design challenge is to scaffold how students use LLMs so that these systems function as partners in reasoning rather than answer-getting tools.

2026-05-31T18:05:23Z 10 pages, conference: Proceedings of the 34th International Conference on Computers in Education. Asia-Pacific Society for Computers in Education Mohammad Amanlou Yasaman Amou-Jafari Mehrad Livian Fatemeh Boloukazari Fereshte Bagheri Behnam Bahrak http://arxiv.org/abs/2310.00828v2 A Model for Calculating Cost of Applying Electronic Governance and Robotic Process Automation to a Distributed Management System 2026-05-31T16:41:25Z

Electronic Governance (eGov) and Robotic Process Automation (RPA) are two technological advancements that have the potential to revolutionize the way organizations manage their operations. When applied to Distributed Management (DM), these technologies can further enhance organizational efficiency and effectiveness. In this brief article, we present a mathematical model for calculating the cost of accomplishing a task by applying eGov and RPA in a DM system. This model is one of the first of its kind, and is expected to spark further research on cost analysis for organizational efficiency given the unprecedented advancements in electronic and automation technologies.

2023-10-02T00:15:46Z Bonny Banerjee Saurabh Pahune http://arxiv.org/abs/2606.01282v1 KG-FairDiff: Knowledge Graph-Guided Prompt Refinement for Demographically Fair Text-to-Image Generation 2026-05-31T15:06:43Z

Text-to-Image (TTI) systems are now everyday infrastructure for journalism, education, advertising, and public communication, and the demographic and cultural stereotypes they inherit from training data (rendering women, people of colour, older adults, and non-Western cultures as under-represented or caricatured) become a population-level harm at deployment scale. Existing mitigations either require costly retraining, infeasible for the closed-source backbones that dominate consumer products, or rely on fixed demographic templates that ignore cultural context. We present KG-FairDiff, a model-agnostic, inference-time framework that formalises fairness-aware prompt refinement as a constrained optimisation problem and operationalises it as a closed-loop pipeline: a knowledge graph of ~1,200 culture- and bias-related triples retrieves structured context, an LLM rewriter proposes refinements, and a validator accepts only prompts that reduce a divergence-based fairness loss while preserving semantic fidelity to the user's original intent. We prove a finite-termination bound for the refinement loop, contribute a mathematically consistent evaluation suite linking Bias-P/Bias-W to divergence from target distributions and ENS to KL divergence, and audit eight widely-deployed backbone generators. KG-FairDiff substantially reduces gender, race, age, and intersectional disparities while preserving prompt semantics, offering a practical, deployment-ready route to more equitable generative AI.
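
The validator's acceptance rule can be sketched as follows. The abstract does not give the loss definition, the direction of the divergence, or the fidelity measure, so the KL(observed || target) form, the `min_fidelity` cutoff, and every name below are assumptions made for illustration rather than the paper's formulation.

```python
import math


def kl_divergence(observed, target):
    """KL(observed || target) for discrete distributions over the same groups.

    Assumes every target probability is positive; terms with zero
    observed mass contribute nothing.
    """
    return sum(
        p * math.log(p / q) for p, q in zip(observed, target) if p > 0
    )


def accept_refinement(current, candidate, target, fidelity, min_fidelity=0.8):
    """Accept a rewritten prompt only if it lowers the fairness loss
    and stays semantically close to the user's original prompt.

    `fidelity` stands in for any similarity score in [0, 1];
    `min_fidelity` is a hypothetical cutoff.
    """
    lowers_loss = kl_divergence(candidate, target) < kl_divergence(current, target)
    return lowers_loss and fidelity >= min_fidelity


# Toy two-group distributions, invented for illustration only.
target = [0.5, 0.5]
current = [0.9, 0.1]
candidate = [0.6, 0.4]

print(accept_refinement(current, candidate, target, fidelity=0.9))
print(accept_refinement(current, candidate, target, fidelity=0.5))
```

Because a candidate is accepted only when the loss strictly decreases, a loop built on this rule cannot cycle between prompts, which is at least consistent with the finite-termination claim in the abstract.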

2026-05-31T15:06:43Z Farbod Davoodi Seyed Reza Tavakoli Shiyadeh Pooria Safaei Sana Harighi Parsa Gholami Amirali Amini Kimia Vanaei Emad Firoozi Parham Abed Azad Babak Khalaj Siavash Ahmadi Amir Hossein Payberah Mohammad Hossein Rohban Soheil Kolouri Ali Diba http://arxiv.org/abs/2606.01228v1 Institutional Trust and the Domestic AI Advantage: Evidence from DeepSeek and ChatGPT Users in China 2026-05-31T13:25:46Z

Public trust in generative artificial intelligence exhibits increasingly divergent patterns across national contexts, yet prevailing research largely overlooks the macro-structural forces underlying this divergence. This study argues that trust in AI is not merely a technical response to performance but a product of institutional refraction. We propose an "Institutional Prism" framework to demonstrate how institutional trust shapes user trust in domestic (DeepSeek) and global (ChatGPT) large language models. Drawing on Cognitive-Affective Trust Theory, we distinguish between cognitive and affective dimensions of trust and analyze survey data from 405 Chinese users. The findings show that higher institutional trust is positively associated with stronger affective trust in domestic AI models and shifts cognitive evaluations in a more favorable direction. Under lower institutional trust, however, this domestic advantage weakens. These findings reveal that institutional trust has emerged as a core dimension of AI trust formation. By linking micro-level psychological judgments with macro-level governance, this research contributes a new perspective to human-machine communication.

2026-05-31T13:25:46Z 48 pages Jiashen Huang Yu Jia Xu Pan http://arxiv.org/abs/2606.01204v1 Implicit Geographic Inference in LLM Medical Triage: Language-Driven Disparities in Emergency Recommendations 2026-05-31T12:39:05Z

We investigate whether large language models produce different medical triage recommendations for identical symptoms based solely on the language of the patient prompt. Using Gemini 3.5 Flash, we evaluate a neurological symptom profile (persistent headache, blurred vision, nausea) across six languages (English, Spanish, Chinese, Hindi, Japanese, Arabic) with 30 runs per condition (n=450 total API calls). We find that the model recommends emergency room visits at rates ranging from 0% (Japanese, Hindi) to 30% (English, Arabic), despite assigning nearly identical severity scores (7.7-8.0/10) across all languages. Adding a single sentence specifying the patient's US location increases ER recommendations by up to 76.7 percentage points for non-English prompts, while the reverse anchor (English prompt with a Tokyo location) reduces the ER rate from 30% to 6.7%. A back-translation control (Japanese to English) produces ER rates comparable to the English baseline, confirming that the disparity is not caused by translation quality but by implicit geographic inference from the input language. We release the complete dataset, experiment code, and results.
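
The reported percentages can be checked against the stated 30 runs per condition. The run counts below (9, 2, and 23) are inferred from the percentages in the abstract and are not stated in it; the helper name is hypothetical.

```python
RUNS_PER_CONDITION = 30
TOTAL_API_CALLS = 450

# 450 calls at 30 runs each implies 15 experimental conditions.
conditions = TOTAL_API_CALLS // RUNS_PER_CONDITION


def rate_percent(count, runs=RUNS_PER_CONDITION):
    """Share of runs, as a percentage rounded to one decimal place."""
    return round(100.0 * count / runs, 1)


print(conditions)        # 15
print(rate_percent(9))   # 30.0, matching the English and Arabic baseline
print(rate_percent(2))   # 6.7, matching the English prompt with a Tokyo location
print(rate_percent(23))  # 76.7, matching the largest reported increase
```

With only 30 runs per condition, each run moves a rate by about 3.3 percentage points, so the 76.7-point increase corresponds to a difference of 23 runs between two conditions.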

2026-05-31T12:39:05Z 7 pages, 4 tables. Code and data at https://github.com/wongqihan/ai-behavioral-experiments Qi Han Wong http://arxiv.org/abs/2605.24727v2 Fundamental Limitation in Explaining AI 2026-05-31T11:58:29Z

While large-scale models such as LLMs and diffusion models have achieved practical success, public institutions have emphasized the importance of explainability in AI. Existing methods for explaining AI, however, are not designed to provide completely faithful explanations of the behavior of large-scale AI systems. Although a completely faithful and interpretable explanation of the behavior of an AI system might be useful for AI governance, it has not been known whether providing such an explanation is theoretically possible. In this paper, we mathematically prove a fundamental quadrilemma in explaining AI, stating that AI and its explanation cannot satisfy the following four conditions simultaneously: 1) the complexity of the operation environment, 2) the goodness of the AI's performance, 3) the interpretability of the AI's explanation, and 4) the complete faithfulness of the AI's explanation. This quadrilemma suggests that, in most applications where we cannot change the environment or sacrifice good AI performance and an interpretable explanation, we should give up complete faithfulness of explanations and should instead aim to explain only the parts that are important for applications. As a consequence, the quadrilemma implies that AI governance should be designed on the premise that the faithfulness of AI explanations is always incomplete.

2026-05-23T20:42:10Z minor modifications Atsushi Suzuki Jing Wang http://arxiv.org/abs/2606.01171v1 AI From the Margins (AIM): Rethinking Participatory AI Design Through the Lived Experience of Minoritized Communities 2026-05-31T11:25:28Z

Artificial intelligence (AI) can reproduce and amplify the structural inequities faced by minoritized communities. Participatory AI has been proposed as a response, but participation typically starts after problem definitions and success criteria have been set, leaving limited room for minoritized communities to reshape what an AI system is for. We propose AI From the Margins (AIM): a methodological stance that articulates the conditions under which lived experiences of minoritized communities can be elicited, centered, and carried forward to inform participatory AI design. AIM is not a fixed protocol; it articulates a set of preconditions that can be enacted through different techniques in different settings. We applied AIM in a Dutch healthcare context in eight sessions with 13 women and non-binary people of color and five municipal policy workers, namely through (1) narrative elicitation using the Biographic Narrative Interpretive Method (BNIM); (2) co-constructed rule-making; (3) participants' determination of whether, where, and how AI should be involved; and (4) translating lived experience into AI policy through dialogue with policymakers. In their reflections on the sessions, participants described the engagement as substantive and called for its continuation, demonstrating how preparatory orientation fundamentally grounded in lived experience shapes what participatory AI design is for.

2026-05-31T11:25:28Z Under review at the AAAI/ACM Conference on AI, Ethics, and Society (AIES 2026) Tijs Portegies Laureanne Willems Maaike Harbers Giovanni Sileno Roland van Dierendonck Mayesha Tasnim Lotte Willemsen Sennay Ghebreab