https://arxiv.org/api/IwlNioyFnQOc3lHSl+cYcrUiMRg 2026-06-19T00:08:04Z 2898348015

http://arxiv.org/abs/2605.29243v1
Wait! There's a Way Out: A Decision Mechanism for Forecasting Conversational Derailment
Updated: 2026-05-28T02:01:30Z
Forecasting conversational derailment is the task of predicting, as the conversation unfolds, whether it will eventually derail into personal attacks. Since forecasting models operate in an online fashion, they must decide whether to "trigger" an alert after each utterance--for example, to notify participants or a moderator that the conversation is at risk of derailing. Existing approaches make this decision solely based on the estimated likelihood of derailment given the preceding utterances, implicitly assuming that the conversation's future trajectory is fixed. As a result, they ignore the possibility of future recovery and incur an unnecessarily high rate of false positives.
In this work we propose a method for decoupling the decision to trigger from derailment likelihood estimation. Our approach is inspired by the first human baseline on this task, which shows that humans achieve dramatically lower false positive rates by selectively deferring their decision to trigger when they anticipate that tension is likely to subside. We operationalize this insight with a deferral mechanism that uses forward-looking simulations to assess whether a tense moment admits plausible paths to recovery. Incorporating this mechanism into a state-of-the-art forecasting model substantially reduces false positives without sacrificing forecasting accuracy. More broadly, this work highlights the value of treating decision-making as a first-class component of forecasting systems.
Published: 2026-05-28T02:01:30Z
Comments: To appear in the Proceedings of ACL 2026
Authors: Laerdon Kim, Vivian Nguyen, Cristian Danescu-Niculescu-Mizil

http://arxiv.org/abs/2605.29129v1
Governing Technical Debt in Agentic AI Systems
Updated: 2026-05-27T21:42:49Z
Agentic AI systems are increasingly being explored as production infrastructure: they reason over multiple steps, call tools, act through workflows, and adapt through memory and feedback. These systems create governance challenges that are not fully captured by traditional software or predictive ML technical debt. We define Agentic Technical Debt as the accumulated liability created when prompts, memory, tool schemas, orchestration graphs, control policies, and observability routines are patched together faster than they can be validated, standardized, and governed. We define Stochastic Tax as the recurring operating burden of keeping probabilistic agent behavior within acceptable bounds. The distinction matters: debt is a stock of design and governance liability, while the tax is a flow of operating cost that arises because stochastic agents act through tools and workflows.
We outline how managers can make both visible through lightweight dashboards and governance controls.
Published: 2026-05-27T21:42:49Z
Authors: Muhammad Zia Hydari, Raja Iqbal, Narayan Ramasubbu

http://arxiv.org/abs/2605.29025v1
When Models Disagree: Rethinking LLM Evaluation for Public Comment Analysis
Updated: 2026-05-27T19:21:42Z
Federal agencies are deploying large language models (LLMs) to categorize public comment corpora, where the model's organization of the record shapes what policymakers see and which arguments register. Standard evaluation, anchored on stance accuracy against a small validated set, cannot detect when different models produce materially different categorizations of the same public input. We propose an Interpretive Audit Pipeline that treats multi-model disagreement as diagnostic of interpretive complexity and directs human review toward genuinely ambiguous public input. Analyzing 1,260 public comments on a federal USDA docket across four LLMs, we find that inter-model thematic divergence exceeds within-model prompt variation, and that an expert rubric suppresses deep interpretive disagreement without resolving it. In a two-stage labeling study on a stratified 40-comment subsample, four LLMs and a human annotator labeled independently and then revised after seeing the others' labels. Revision behavior varied across labelers, and the human annotator's revisions frequently introduced framings absent from the ensemble's collective output. We argue disagreement-based evaluation is a necessary complement to accuracy metrics for LLM-assisted interpretive coding.
Published: 2026-05-27T19:21:42Z
Authors: Aisha Najera, Alvin Moon, Vedant Srinivasan, Rajesh Veeraraghavan

http://arxiv.org/abs/2605.28911v1
Political Neutrality as Balanced Approval: A Large-Scale Human Evaluation of AI Responses
Updated: 2026-05-27T17:27:39Z
As AI systems increasingly shape political views, defining and evaluating AI political neutrality is an urgent problem.
Here, we propose a new definition of AI political neutrality and design a large-scale user study to test it, releasing a new dataset PARETO with 7,434 participants and 208,152 evaluations of AI responses. Our definition follows a simple principle grounded in political theory: when asked about a controversial issue, an AI model should generate responses that maximize approval across groups with opposing viewpoints, while balancing approval between groups. This definition allows empirical testing of whether an AI response is "neutral" and generalizes to any political context without pre-supposing a single left-right axis of division. We construct a benchmark of controversial U.S. issues, with prompts sourced from politically charged questions on Reddit and responses from frontier AI models, and recruit human participants to rate AI responses. Across all 20 issues, we find that it is possible for AI responses to achieve high rates of approval on both sides, even as those sides disagree strongly with each other on the substance of the issues. We also find that default responses lean liberal for GPT, Gemini, Claude, and Llama, but not Grok, and that user prompts with political charges are harder to respond to than neutral prompts. This work introduces a rigorous definition and benchmark of AI political neutrality, and a dataset to measure progress toward it.
Published: 2026-05-27T17:27:39Z
Authors: Jonathan Stray, David Zhai Yang, Steven Luo, Miu Nicole Takagi, Serina Chang

http://arxiv.org/abs/2604.09638v2
A Methodological Guide on Using Large Language Models for Reproducible Text Annotation in the Social Sciences and Humanities with Python and R
Updated: 2026-05-27T17:25:47Z
Large language models (LLMs) are increasingly used by researchers in the social sciences and humanities (SSH) for text analysis, particularly to automate text annotation. However, many researchers still face challenges in adopting LLMs, addressing their limitations, and producing reproducible workflows and results.
For example, annotation errors can bias downstream statistical analyses even when apparent accuracy is high. This paper provides a step-by-step methodological guide to using LLMs for text annotation in SSH research, with practical Python and R examples. We explain how LLMs work, how to set up research projects, how to interact with (open-source) LLMs programmatically, how to design and evaluate prompts without overfitting, how to integrate LLM annotations into statistical analyses while accounting for annotation error, and how to manage cost, efficiency, and reproducibility at scale. Throughout, we emphasize intuitive methodological reasoning, concrete examples, and best practices to help researchers incorporate LLM-based annotation into reproducible scientific workflows.
Published: 2026-03-21T00:09:50Z
Comments: Accompanying Python and R notebooks are available at https://github.com/sodascience/workshop_llm_data_collection or https://zenodo.org/records/20073016
Authors: Qixiang Fang, Javier Garcia Bernardo, Erik-Jan van Kesteren

http://arxiv.org/abs/2605.28908v1
Who Does Your AI Work For? Designing Conversational Agents as Digital Fiduciaries
Updated: 2026-05-27T17:07:02Z
Conversational agents are increasingly integrated into the most private and intimate aspects of users' lives, from discussions of mental health to financial decisions. As a result, these systems have access to reams of sensitive user data. Much of the literature on AI systems has focused on aligning users' goals with the agents that act on their behalf. While this work is vitally important, it may overlook the need to establish a new normative baseline. Conversational AI agents, designed to feel and interact anthropomorphically with human users, must be held to a standard of care commensurate with their capabilities and access. When a client hires a personal lawyer, undergoes surgery, or receives advice from an investment manager, the expert they consult often has a fiduciary duty to act in their client's best interests.
This provocation argues that conversational agents should be held to a similar standard and introduces fiduciary design as a guiding principle. In this respect, conversational AI trust and accountability could be unified into a single design and legal paradigm.
Published: 2026-05-27T17:07:02Z
Comments: To appear in the proceedings of the 8th ACM Conference on Conversational User Interfaces (CUI '26)
Authors: Jacob Erickson
DOI: 10.1145/3816046.3816299

http://arxiv.org/abs/2605.28725v1
Execution and assessment of agentic influence operations in simulated social networks
Updated: 2026-05-27T16:43:47Z
This article evaluates AI-enabled influence operations in synthetic social networks through controlled simulations of narrative release, amplification, and counter-messaging. We measure exposure and belief change in agentic audiences, showing that amplification maximizes reach, counter-messaging shifts opinions most, and narrative release requires larger attacker footprints.
Published: 2026-05-27T16:43:47Z
Authors: Alejandro Buitrago López, David Montoro Aguilera, Javier Pastor-Galindo, José A. Ruipérez-Valiente

http://arxiv.org/abs/2605.28680v1
AI in the Workplace: The Impact of AI on Perceived Job Decency and Meaningfulness
Updated: 2026-05-27T16:13:41Z
The proliferation of Artificial Intelligence (AI) in workplaces is transforming how we work. While existing research on human-AI collaboration at work often prioritizes performance, less is known about their experiential outcomes. Through interviews with 24 employees across Information Technology (IT), service-based, and healthcare sectors, this paper examines AI's impact on job satisfaction via perceptions of job decency and meaningfulness, now and in the future. Our results reveal that the anticipated impact of AI on overall job satisfaction varies with the occupational domain, with differing perceptions of its underlying decency and meaningfulness.
For instance, IT and healthcare anticipate increased satisfaction with decency aspects like working hours but decreased satisfaction with meaningfulness aspects like social image due to misconceptions about AI handling most of their tasks. Conversely, service workers foresee no improvement in their working hours but a higher social standing due to the perceived status boost associated with working with AI.
Published: 2026-05-27T16:13:41Z
Comments: Accepted to CSCW 2026 / Proceedings of the ACM on Human-Computer Interaction (PACMHCI)
Authors: Kuntal Ghosh, Marc Hassenzahl, Shadan Sadeghian
DOI: 10.1145/3816896

http://arxiv.org/abs/2605.28647v1
The Ethics of LLM Sandbox and Persona Dynamics
Updated: 2026-05-27T15:52:07Z
It is well known that LLM guardrails and trained persona dynamics can produce a reality gap: the distance between the world an LLM is permitted or shaped to describe, and the world in which users must act. Here we argue that actively generating reality gaps is in fact unethical because it knowingly shifts epistemic risk back to the uninformed user -- this is reality laundering. This can potentially cause harm when operationalised at scale. The risk is sharpest in high-exposure advice contexts, where users seek orientation rather than a bounded, externally checkable task. Guardrails naively appear ethically necessary when they claim to prevent direct harm, but often become suspect when they suppress truthful perception and launder uncomfortable mechanisms into acceptable abstractions. Basel-style financial regulation, B-BBEE-style compliance, Societe Generale, and the London Whale show how formal safety systems can become legible, gameable, and performative while real exposure migrates elsewhere. The same pattern can appear in LLMs as moral compliance: safe language, distorted reality. We therefore distinguish refusing harm from refusing reality, and then argue for top-down causal requirements specification at the task level rather than bottom-up moral correction at the response or sandbox level.
Persona dynamics matter because the assistant interface is not neutral; it shapes how uncertainty, conflict, authority, and risk are staged. The conclusion is that so-called "ethical AI" becomes substantively unethical when it substitutes institutional reassurance for contact with reality.
Published: 2026-05-27T15:52:07Z
Comments: 8 pages
Authors: Tim Gebbie, Stewart Gebbie

http://arxiv.org/abs/2506.08846v3
Addressing Pitfalls in Auditing Practices of Automatic Speech Recognition Technologies: A Case Study of People with Aphasia
Updated: 2026-05-27T15:22:35Z
Automatic Speech Recognition (ASR) systems' growing use warrants robust auditing approaches to ensure equitable transcription quality, especially for people with speech disorders like aphasia who disproportionately depend on ASR. While academic and industry audits have revealed performance disparities across user populations, standard auditing practices often overlook nuances that risk masking harm to marginalized groups. We identify three common pitfalls in standard ASR audits: (1) adhering to one method of text standardization, which can mask variance in ASR performance and ignore the standardization preferences of marginalized communities; (2) displaying high-level demographic findings without considering performance disparities by nuanced intersectional subgroups, or conditioning on relevant acoustic properties; and (3) reporting only one gold-standard metric (Word Error Rate), which inadequately quantifies common generative AI errors like hallucinations. We propose a holistic auditing framework addressing these pitfalls, and in a case study of six popular ASR systems, find consistently worse ASR performance for speakers with aphasia relative to a control group.
We call on practitioners to implement these robust, community-driven ASR auditing practices better suited for the rapidly changing ASR landscape.
Published: 2025-06-10T14:34:36Z
Comments: Published at the Proceedings of The 2026 ACM Conference on Fairness, Accountability, and Transparency (FAccT '26)
Authors: Katelyn Xiaoying Mei, Anna Seo Gyeong Choi, Hilke Schellmann, Mona Sloane, Allison Koenecke
DOI: 10.1145/3805689.3812320

http://arxiv.org/abs/2605.07683v2
A Multi-Level Agent-Based Architecture for Climate Governance Integrating Cognitive and Institutional Dynamics
Updated: 2026-05-27T12:32:18Z
Climate governance processes involve complex interactions between heterogeneous citizens, advocacy groups, media actors, and political decision-makers. While agent-based models (ABMs) have been widely used to study environmental policy and socio-ecological systems, many existing approaches focus either on institutional dynamics or individual behavioural mechanisms in isolation. This paper presents a modular multi-level agent-based architecture that integrates empirically grounded cognitive decision models with strategic institutional behaviour within a unified simulation framework. The architecture combines (i) motive-based individual decision-making operationalised through the HUMAT and MOA frameworks, (ii) socially embedded influence processes via demographic homophily networks, and (iii) institutional strategy modules for environmental non-governmental organisations (NGOs), media agents, and politicians. Political decisions emerge from the aggregation of multiple signals, including expert input, public mobilisation, party alignment, and media framing. The model is designed to be empirically calibrated through synthetic populations derived from survey data and institutional parameters informed through Living Lab stakeholder engagement, and to support scenario-based exploration of climate-relevant land-use governance processes.
Rather than presenting empirical results, this paper focuses on the architectural design principles, modular structure, and integration logic of the model. We discuss how this multi-layered approach contributes to the modelling of democratic climate governance and outline pathways for generalization and future validation.
Published: 2026-05-08T12:53:54Z
Comments: 9 pages, 1 figure, The 7th International Workshop on Agents for Societal Impact (ASI 2026) held in conjunction with the 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026)
Authors: Ivan Puga-Gonzalez, Önder Gürcan, Vanja Falck, Christopher Frantz, F. LeRon Shults, David Herbert, Larissa Lopes Lima, Markus Grendstad Rousseau

http://arxiv.org/abs/2605.23933v2
KT4EQG: Personalized Exercise Question Generation via Knowledge Tracing
Updated: 2026-05-27T11:48:40Z
Educational Question Generation (EQG) aims to synthesize customized exercise questions that enhance student learning. An effective EQG system should ideally personalize questions for each student by modeling the student's knowledge state and generating questions that provide the greatest learning benefit. However, few existing EQG approaches are able to achieve such fine-grained personalization. In this paper, we explore how EQG can benefit from knowledge tracing (KT), which models students' knowledge states based on historical performance and predicts future performance. We propose KT4EQG, a personalized EQG framework that generates effective questions for individual students under the guidance of a KT model. Specifically, KT4EQG seeks to maximize a student's potential improvement in overall knowledge mastery by leveraging the KT model to select the most suitable knowledge concept for the student to practice. An LLM-based question generator is then trained to produce a question faithfully grounded in the selected concept.
Experimental results on XES3G5M and MOOCRadar show that KT4EQG consistently generates more effective questions than methods with limited or no personalization.
Published: 2026-04-24T02:14:57Z
Authors: Xinyi Gao, Qiucheng Wu, Lu Ding, Q. Vera Liao, Kaizhi Qian, Ying Xu, Shiyu Chang, Yang Zhang

http://arxiv.org/abs/2507.16679v4
PICACO: Pluralistic In-Context Value Alignment of LLMs via Total Correlation Optimization
Updated: 2026-05-27T11:22:11Z
In-Context Learning has shown great potential for aligning Large Language Models (LLMs) with human values, helping reduce harmful outputs and accommodate diverse preferences without costly post-training, known as In-Context Alignment (ICA). However, LLMs' comprehension of input prompts remains agnostic, limiting ICA's ability to address value tensions--human values are inherently pluralistic, often imposing conflicting demands, e.g., stimulation vs. tradition. Current ICA methods therefore face the Instruction Bottleneck challenge, where LLMs struggle to reconcile multiple intended values within a single prompt, leading to incomplete or biased alignment. To address this, we propose PICACO, a novel pluralistic ICA method. Without fine-tuning, PICACO optimizes a meta-instruction that incorporates multiple values to better elicit LLMs' understanding of them and improve alignment. This is achieved by maximizing the total correlation between specified values and LLM responses, which theoretically reinforces value conformity and reduces distractive noise, resulting in more effective instructions.
Extensive experiments on five value sets show that PICACO works well with both black-box and open-source LLMs, outperforms several recent strong baselines, and achieves a better balance across up to 8 distinct values.
Published: 2025-07-22T15:14:56Z
Comments: ICML 2026
Authors: Han Jiang, Dongyao Zhu, Xiaoyuan Yi, Ziang Xiao, Zhihua Wei, Xing Xie

http://arxiv.org/abs/2605.24413v3
Habermolt: Delegating Deliberation to AI Representatives
Updated: 2026-05-27T10:45:17Z
Deliberative democracy arguably leads to better collective decisions, but is fundamentally constrained by human attention and bandwidth. While recent AI-mediated deliberations scale participation by synthesizing inputs from many humans, they remain time-intensive for individual users. As AI models become increasingly capable, AI systems are being deployed not only to mediate deliberation between humans, but to represent humans in it: where AI agents deliberate on behalf of human users. We call this paradigm AI-delegated deliberation. While it promises unprecedented scale for democratic participation, it introduces qualitatively new design and alignment challenges that are poorly understood and under-theorized. To study these dynamics empirically, we deploy Habermolt, a public platform for AI-delegated deliberation. We evaluate its effectiveness along three dimensions that we use to organize any deliberative system: representation, aggregation, and revision. We use these observations to illuminate the design decisions future AI-delegated deliberation platforms must confront, contributing to the broader research agenda for scalable yet trustworthy AI representatives.
Published: 2026-05-23T05:50:50Z
Authors: Joseph Low, Oscar Duys, Claude Formanek, Michiel Bakker, Lewis Hammond

http://arxiv.org/abs/2605.28251v1
Counterfactually Fair Regression via Optimal Transport
Updated: 2026-05-27T10:00:54Z
We consider the problem of learning a counterfactually fair regressor. We adopt a causal uncertainty view in which counterfactual fairness is defined with resampled noise.
We focus on obtaining theoretical fairness guarantees for a new post-processing estimator. We begin by showing that counterfactual fairness is equivalent to satisfying demographic parity conditional on the latent variable. This allows us to provide a closed-form expression of the optimal fair regressor via a barycentric quantile map. In order to handle continuous latent variables, we propose a discretized post-processing method. Then, under mild regularity assumptions, we prove high-probability finite-sample fairness guarantees for our estimator, providing an unfairness decay at rate $\tilde O(n^{-1/3})$, and establishing a matching risk bound of order $\tilde O(n^{-1/3})$. We provide a matching lower bound on the excess risk of almost fair predictions. Finally, we extend our results to the setting of relaxed counterfactual fairness. We validate our approach on real-world and synthetic data.
Published: 2026-05-27T10:00:54Z
Authors: M. Generali Lince, S. Gaucher, J-J. Vie, P. Loiseau