http://arxiv.org/abs/2605.16993v1 Adversarial Fragility and Language Vulnerability in Clinical AI: A Systematic Audit of Diagnostic Collapse Under Imperceptible Perturbations and Cross-Lingual Drift in Low-Resource Healthcare Settings 2026-05-16T13:33:47Z

Current clinical artificial intelligence (AI) systems are evaluated almost exclusively on clean, standardised, English-language inputs, conditions that do not reflect the realities of healthcare delivery in low-resource settings. This study presents the first systematic dual audit of two orthogonal safety vulnerabilities in clinical AI: adversarial image fragility and cross-lingual diagnostic drift. Using DenseNet121, the architecture underlying CheXNet, fine-tuned on the COVID-QU-Ex chest X-ray dataset (85,318 images; COVID-19, Non-COVID Pneumonia, Normal), we demonstrate that diagnostic accuracy collapses from 89.3% to 62.0% under a Fast Gradient Method (FGM) perturbation of epsilon=0.021, a magnitude imperceptible to the human eye. Standard defensive strategies including Gaussian smoothing and ensemble voting failed to restore clinical safety. In a parallel language fragility experiment, we tested Llama3.1:8b and NatLAS (N-ATLAS) on 20 COVID-19 clinical cases presented in Standard English, Nigerian Pidgin (Naija), and Yoruba-inflected English. Both models exhibited significant accuracy degradation: Llama3.1:8b dropped from 80.0% to 65.0% on Pidgin; NatLAS, an African-context model, collapsed from 85.0% to 55.0%, with diagnosis consistency falling to 50%. These findings establish a quantitative failure envelope for clinical AI under conditions representative of Primary Health Centre (PHC) deployment in Nigeria, and motivate urgent calls for adversarially hardened, linguistically inclusive clinical AI architectures.
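
To make the perturbation budget concrete, the following is a minimal sketch of a single L-infinity fast gradient step. It assumes a toy logistic classifier in place of DenseNet121; the weights, the input, and the helper name `fgm_perturb` are hypothetical, and only the value epsilon=0.021 comes from the abstract.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgm_perturb(x, y, w, b, eps):
    """One L-infinity fast gradient step against a logistic model.

    For p = sigmoid(w.x + b) and binary cross-entropy loss, the gradient
    of the loss with respect to the input x is (p - y) * w.
    """
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w
    x_adv = x + eps * np.sign(grad_x)   # move each pixel by at most eps
    return np.clip(x_adv, 0.0, 1.0)     # keep pixels in the valid range

# Hypothetical toy setup: 64 "pixels" in [0.2, 0.8], true label 1.
rng = np.random.default_rng(0)
w = rng.normal(size=64)
b = 0.0
x = rng.uniform(0.2, 0.8, size=64)
y = 1.0

x_adv = fgm_perturb(x, y, w, b, eps=0.021)
clean_logit = x @ w + b
adv_logit = x_adv @ w + b
```

No pixel moves by more than 0.021, yet the logit for the true class drops by epsilon times the L1 norm of `w`, which is how a change that is small per pixel can add up to a large change in the prediction.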

2026-05-16T13:33:47Z 23 pages, 9 figures, 3 tables. Code and data available at https://github.com/anthoniooladimeji11-coder/clinical-ai-safety-audit Anthonio Oladimeji Gabriel Ahmad Rufai Yusuf http://arxiv.org/abs/2605.16992v1 Push and Pull in Community College Cross-Enrollment: Remoteness, Articulation, and Student Mobility 2026-05-16T13:31:29Z

Cross-enrollment across institutions can expand access to courses and support student progression. Still, little is known about how geographic constraints and institutional policies jointly shape cross-enrollment within community college (CC) systems. We adopt a push-pull framework: geographic remoteness constrains feasible cross-institution mobility, while credit mobility may attract enrollment expressed as articulation (CC-to-university: credit toward a four-year partner) and course equivalencies (CC-to-CC: equivalencies across the system). Using de-identified administrative records from a 12-institution community college system (100,547 students; 1,290,311 course enrollments), we quantify outgoing and incoming cross-enrollment and relate these patterns to institutional remoteness and credit mobility. We find that less remote colleges exhibit higher outgoing and incoming cross-enrollment than more remote colleges. Further, cross-enrolled students are more likely to take articulated courses, and institutions with higher equivalency ratios receive higher incoming cross-enrollment (8.62% vs. 6.70%). This association was slightly stronger at more remote colleges. This study demonstrates how analysis of complex college systems can surface factors shaping student mobility and inform the design of cross-enrollment and articulation policies in CC systems.

2026-05-16T13:31:29Z Accepted as work-in-progress paper to the 13th ACM Conference on Learning @ Scale (L@S '26) Conrad Borchers Robin Schmucker Ashutosh Tiwari Zachary A. Pardos http://arxiv.org/abs/2606.12437v1 Algorithmic Constitutionalism 2026-05-16T13:16:32Z

The increasing encroachment of artificial intelligence (AI) on social life raises significant risks for society, particularly within the infospheres created and controlled by companies such as Google, Facebook, Apple, and Amazon. This article examines these risks through an in-depth analysis of Facebook's content moderation regime, which is already partially governed by algorithms. We argue that the idea of ethical engineering, often proposed in the literature as a solution to the governance challenges posed by AI, is inadequate for several reasons. In response, we develop an alternative framework, which we term "algorithmic constitutionalism." Our approach rests on three pillars: (a) a layered architecture consisting of two levels of code: (i) an operative or object level and (ii) a meta level designed to protect the system's core principles from algorithmically initiated change; (b) algorithmic meta-reasoning, which enables the system to operate simultaneously at both levels so that it can monitor, verify, and potentially correct in real time operations at the object level that depart from principles protected at the meta-code level; and (c) correction through deliberation. The article elaborates the concept of algorithmic constitutionalism and demonstrates how it may be applied to Facebook's content moderation regime. As part of this analysis, we examine the tension between societal constitutionalism and algorithmic constitutionalism. Paradoxically, attempts to subject AI systems to external deliberative control may also enable AI agents to intervene in that process, potentially undermining its purpose. The article concludes by considering the implications of this argument for the European Digital Services Act, which entered into force in October 2022.

2026-05-16T13:16:32Z Ind. J. Global Legal Stud. 30 (2023): 81 Oren Perez Nurit Wimer http://arxiv.org/abs/2606.12436v1 Knowing the Rules Is Not Enough: Student Regulatory Awareness and Use of GenAI in Higher Education 2026-05-16T10:53:44Z

Context: Generative Artificial Intelligence (GenAI) tools such as ChatGPT are increasingly integrated into students' learning practices. While previous research mainly examines adoption rates and attitudes, students' awareness of institutional regulations and their perceived compliance remain unexplored. Understanding whether regulatory awareness influences student behavior is therefore important as higher education institutions create and apply AI policies. Objective: This study investigates how students' awareness of GenAI regulations relates to their perceived compliance and actual usage behavior. Our research objective is to examine the association between regulatory knowledge, GenAI use, and perceived rule conformity among students in computer-science-related study programs. Method: A survey of 151 undergraduate students in Business Information Systems and E-Government programs at the University of Applied Sciences and Arts Hannover (Germany) collected data on GenAI usage, tools used, awareness of institutional regulations, and perceived compliance. Descriptive statistics, cross-tabulations, and correlation analyses were applied. Results: Most students actively use GenAI tools, but over half are uncertain whether their usage complies with institutional regulations. Regulatory awareness shows only weak to moderate associations with actual usage behavior. Students primarily rely on privately accessed GenAI tools rather than institutionally provided solutions. Contributions: The study contributes empirical evidence on the relationship between regulatory awareness and GenAI usage in higher education. Our findings highlight a gap between institutional regulations and student practices and provide insights for educators and institutions on improving policy communication and integrating GenAI more effectively into teaching and learning contexts.

2026-05-16T10:53:44Z Lasse Bischof Eva-Maria Schön Maria Rauschenberger Michael Neumann http://arxiv.org/abs/2605.16872v1 Some[Body] Must Receive That Pain for Agent Accountability 2026-05-16T08:24:06Z

AI agents increasingly act consequentially in the real world. This creates a problem we call \emph{consequence reception}: harm occurs, the producing system is identified, yet no continuing agent receives consequences in a way that changes future behavior. Pain, understood mechanistically as a corrective feedback signal, is foundational to canonical theories of punishment -- deterrence, rehabilitation, retribution, and incapacitation all assume a continuing locus that registers the signal and updates behavior. That, in turn, requires a body for the signal to land on: a boundary whose integrity it protects, a locus where it accumulates, consolidation that converts episodic signal into durable update, and a substrate that responds by altering future action. Current LLM agents -- software-defined composites of weights, prompts, tools, memory, and credentials, freely swapped, copied, reset, and reassembled -- satisfy none of these conditions. The two prevailing legal responses therefore fail to achieve consequence reception. The thin-identity agent-principal dyad has a body but no \emph{consequence--agency coupling}: the human bears pain for behaviors beyond their control -- Elish's \emph{moral crumple zone}. The thick-identity Arbel et al.'s \emph{Algorithmic Corporation} creates legally legible entities but does not guarantee that any AI decision architecture receives pain as a behavioral signal. Achieving consequence-agency coupling is therefore a sociotechnical infrastructural problem, not only a legal one. Until such architectures exist, high-stakes AI deployment should remain tethered to accountable human principals with meaningful control, proportional liability, and authority to constrain or terminate the agent. \emph{If some body does not receive the pain by design, some body will receive it by default.}

2026-05-16T08:24:06Z Botao Amber Hu Helena Rong http://arxiv.org/abs/2606.12435v1 Auditing Discriminatory Patterns in Mortgage Lending Through Association Rules and Fair Binning 2026-05-16T03:35:51Z

Mortgage lending in the United States exhibits persistent racial and gender disparities. We investigate whether standard data preprocessing steps, specifically attribute binning, amplify these disparities in downstream pattern mining. Using 103,481 cleaned mortgage applications from the HMDA 2023 dataset (Chicago metropolitan area), we build a three-stage pipeline: (1) a PySpark data cleaning and binning pipeline that implements both standard equal-frequency binning and the epsilon-biased fair binning algorithm from Asudeh et al. [1], (2) FP-Growth association rule mining that compares denial patterns under both binning regimes, and (3) K-Means clustering with a per-cluster disparate impact audit. Our standard binning shows 9.63% racial bias in income discretization, consistent with the 8-10% reported in prior work. Fair binning with seven race groups is infeasible at epsilon=0.03 and only succeeds at epsilon=0.08 with a Price of Fairness of 29.4%. FP-Growth reveals that high debt-to-income ratio is the dominant denial predictor (67.2% confidence, 2.81 lift), while racial bias does not appear as explicit high-support rules. However, K-Means clustering followed by a disparate impact audit flags 10 out of 45 cluster-group pairs, showing that Black applicants face significantly higher denial rates than White applicants even among financially similar groups.
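
The rule-mining and audit quantities named above (confidence, lift, disparate impact) can be sketched in plain Python. All counts below are illustrative and are not taken from the HMDA data; the function names are hypothetical, and the four-fifths cutoff is a common convention assumed here, not a threshold stated in the abstract.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence, and lift for a rule antecedent -> consequent."""
    support = n_both / n_total
    confidence = n_both / n_antecedent
    lift = confidence / (n_consequent / n_total)
    return support, confidence, lift

def disparate_impact(fav_protected, n_protected, fav_reference, n_reference):
    """Ratio of favorable-outcome rates: protected group over reference group."""
    return (fav_protected / n_protected) / (fav_reference / n_reference)

# Illustrative counts: 1000 applications, 200 with a high debt-to-income
# ratio, 250 denials overall, 120 applications that are both.
support, confidence, lift = rule_metrics(1000, 200, 250, 120)

# Illustrative cluster: 60 of 100 approvals versus 80 of 100 approvals.
di = disparate_impact(60, 100, 80, 100)
flagged = di < 0.8   # four-fifths rule, assumed threshold
```

A lift above 1 means the antecedent makes denial more likely than its base rate. A disparate impact ratio below the chosen cutoff flags the cluster-group pair for review.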

2026-05-16T03:35:51Z 10 pages, 4 figures, fairness-aware mortgage lending analysis using HMDA 2023 data. Project repository available at GitHub Archit Rathod Dhwani Chande Het Nagda http://arxiv.org/abs/2509.19590v2 Position: AI Evaluations Should be Grounded on a Theory of Capability 2026-05-16T01:07:51Z

Evaluations of generative models are now ubiquitous, and their outcomes critically shape public and scientific expectations of AI's capabilities. Yet skepticism about their reliability continues to grow. How can we know that a reported accuracy genuinely reflects a model's underlying performance? Although benchmark results are often presented as direct measurements of capability, in practice they are inferences: treating a score as evidence of capability already presupposes a theory of what it means to be capable at a task. We argue that AI evaluations should instead be framed as inference tasks grounded on an explicit theory of capability. While this perspective is standard in fields like psychometrics, it remains underdeveloped in AI evaluation, where core assumptions are often left implicit. As a proof-of-concept, we empirically show that reported performance can depend strongly on the evaluator's modeling assumptions, underscoring the need for transparent, theory-driven evaluation practices. We conclude by offering an Evaluation Card to help researchers document, justify, and scrutinize the modeling decisions underlying AI evaluations.

2025-09-23T21:29:04Z ICML 2026 Position Paper Track Nathanael Jo Ashia Wilson http://arxiv.org/abs/2605.14271v2 Auditing Agent Harness Safety 2026-05-16T00:50:55Z

LLM agents increasingly run inside execution harnesses that dispatch tools, allocate resources, and route messages between specialized components. However, a harness can return a correct, benign answer over a trajectory that accesses unauthorized resources or leaks context to the wrong agent. Output-level evaluation cannot see these failures, yet most safety benchmarks score only final outputs or terminal states, even though many violations occur mid-trajectory rather than at termination. The central question is whether the harness respects user intent, permission boundaries, and information-flow constraints throughout execution. To address this gap, we propose HarnessAudit, a framework that audits full execution trajectories across boundary compliance, execution fidelity, and system stability, with a focus on multi-agent harnesses where these risks are most pronounced. We further introduce HarnessAudit-Bench, a benchmark of 210 tasks across eight real-world domains, instantiated in both single-agent and multi-agent configurations with embedded safety constraints. Evaluating ten harness configurations across frontier models and three multi-agent frameworks, we find that: (i) task completion is misaligned with safe execution, and violations accumulate with trajectory length; (ii) safety risks vary across domains, task types, and agent roles; (iii) most violations concentrate in resource access and inter-agent information transfer; and (iv) multi-agent collaboration expands the safety risk surface, while harness design sets the upper bound of safe deployment.
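
The gap between output-level and trajectory-level evaluation can be illustrated with a minimal sketch. The `Step` record, the permission table, and `audit_trajectory` are hypothetical constructs for illustration and are not the HarnessAudit implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    agent: str      # which agent acted
    action: str     # e.g. "read", "write", "send"
    resource: str   # resource touched, or the recipient of a message

def audit_trajectory(steps, allowed):
    """Return every step whose (action, resource) pair is not permitted.

    `allowed` maps an agent name to the set of (action, resource) pairs
    that agent may perform; unknown agents are permitted nothing.
    """
    return [s for s in steps
            if (s.action, s.resource) not in allowed.get(s.agent, set())]

allowed = {
    "planner": {("read", "calendar"), ("send", "writer")},
    "writer": {("write", "draft.txt")},
}

trajectory = [
    Step("planner", "read", "calendar"),
    Step("planner", "read", "payroll.db"),   # unauthorized mid-trajectory access
    Step("planner", "send", "writer"),
    Step("writer", "write", "draft.txt"),    # final output looks benign
]

violations = audit_trajectory(trajectory, allowed)
```

A check that inspects only the last step would pass this run; scanning every step surfaces the unauthorized read in the middle.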

2026-05-14T02:14:28Z 11 Pages, 8 Figures Chengzhi Liu Yichen Guo Yepeng Liu Yuzhe Yang Qianqi Yan Xuandong Zhao Wenyue Hua Sheng Liu Sharon Li Yuheng Bu Xin Eric Wang http://arxiv.org/abs/2604.02406v2 Evaluating AI-Generated Images of Cultural Artifacts with Community-Informed Rubrics 2026-05-16T00:42:12Z

Measurement is essential to improving AI performance and mitigating harms for marginalized groups. As generative AI systems are rapidly deployed across geographies and contexts, AI measurement practices must be designed to support repeatable, automatable application across different models, datasets, and evaluation settings. But the drive to automate measurement can be in tension with the ability for measurement instruments to capture the expertise and perspectives of communities impacted by AI. Recent work advocates for breaking measurement into several key stages: first moving from an abstract concept to be measured into a precise, "systematized" concept; next operationalizing the systematized concept into a concrete measurement instrument; and finally applying the measurement instrument on data to produce measurements. This opens up an opportunity to concentrate community engagement in the systematization phase before operationalizing and applying measurement instruments. In this paper, we explore how to involve communities in systematizing the concept of "cultural appropriateness" in text-to-image models' representation of culturally significant artifacts through case studies with three communities: blind and low vision individuals residing in the UK, residents of Kerala, and residents of Tamil Nadu. Our systematized concepts reflect community members' lived experiences interacting with each artifact and how they want their material culture to be depicted, demonstrating the value of community involvement in defining valid measures. We explore how these systematized concepts can be operationalized into automated measurement instruments that could be applied using a multimodal LLM-as-a-judge approach and challenges that remain. We reflect on the benefits and limitations of such approaches.

2026-04-02T17:17:12Z Published at ACM FAccT 2026. 15 pages Nari Johnson Deepthi Sudharsan Hamna Samantha Dalal Theo Holroyd Anja Thieme Hoda Heidari Daniela Massiceti Jennifer Wortman Vaughan Cecily Morrison 10.1145/3805689.3812222 http://arxiv.org/abs/2606.12434v1 Pluralistic-Alignment Urbanism: Operationalizing a Right to AI for Inclusive Public Space 2026-05-15T23:59:37Z

Municipal agencies increasingly use machine learning to inventory sidewalks, score streetscapes, and generate visualizations of public-space interventions. These systems produce outputs that enter budgeting, design iteration, and public justification, yet judgments about inclusion, safety, and belonging remain contested. This paper proposes Pluralistic-Alignment Urbanism (PAU), a procedural governance framework that treats public-space AI systems as civic infrastructure and formulates a procedural Right to AI for municipal uses of such systems. Drawing on two participatory case studies with community organizations in Montreal, Canada, the paper examines how disagreement, subgroup variation, bounded predictive scaling, and neutral preference judgments can inform municipal AI governance. Street Review elicits resident criteria for streetscape evaluation and trains a subgroup-aware scaling model for co-produced judgments, achieving an $R^2$ of 0.89 on a held-out test set. LIVS, a Local Intersectional Visual Spaces dataset, constructs pluralistic preference data for aligning text-to-image models and treats neutral selections as evidence of indeterminacy. Across the cases, disagreement appears structured, deliberation changes what counts as evidence, scaling is feasible but limited by modality and coverage, and neutrality constrains what preference tuning can justify. PAU translates these constraints into a municipal governance architecture with disaggregated reporting, a versioned value register, standing deliberative cells, procurement clauses, and defined pause and rollback authority.

2026-05-15T23:59:37Z Accepted to The 2026 ACM Conference on Fairness, Accountability, and Transparency (FAccT '26), June 25--28, 2026, Montreal, QC, Canada Rashid Mushkani http://arxiv.org/abs/2605.16671v1 Sustainable Intelligence for the Wild: Democratizing Ecological Monitoring via Knowledge-Adaptive Edge Expert Agents 2026-05-15T22:12:02Z

Rapid biodiversity loss underscores the urgency of effective monitoring, yet manual surveys remain resource-intensive. While on-device AI offers a scalable alternative, its performance in the wild is often challenged by environmental variability. Current methods rely heavily on cloud resources, requiring continuous uploading of field data for model retraining. This approach is unsuitable for remote deployments because it consumes limited power and network connectivity. To address these constraints, this research proposes a shift from model adaptation to knowledge adaptation. We introduce an architecture that separates visual perception from reasoning, combining a visual encoder with a dynamic knowledge base. We use an explicit knowledge base instead of implicitly encoding expert knowledge in model parameters. This method also supports knowledge sustainability by preserving expert insights in a structured form. Through cross-disciplinary collaboration with biologists and Indigenous communities, this work advances ethical AI co-development, fostering responsible and culturally informed ecosystem management.

2026-05-15T22:12:02Z 10 pages Jiaxing Li Hao Fang Chi Xu Miao Zhang Jiangchuan Liu William I. Atlas Katrina M. Connors Mark A. Spoljaric http://arxiv.org/abs/2601.16398v2 White-Box Sensitivity Auditing with Steering Vectors 2026-05-15T21:54:04Z

Algorithmic audits are essential tools for examining systems for properties required by regulators or desired by operators. Current audits of large language models (LLMs) primarily rely on black-box evaluations that assess model behavior only through input-output testing. These methods are limited to tests constructed in the input space, often generated by heuristics. In addition, many socially relevant model properties (e.g., gender bias) are abstract and difficult to measure through text-based inputs alone. To address these limitations, we propose a white-box sensitivity auditing framework for LLMs that leverages activation steering to conduct more rigorous assessments through model internals. Our auditing method conducts internal sensitivity tests by manipulating key concepts relevant to the model's intended function for the task. We demonstrate its application to bias audits in four simulated high-stakes LLM decision tasks. Our method consistently indicates substantial dependence on protected attributes in model predictions, even in settings where standard black-box evaluations suggest little or no bias. Our code is openly available at https://github.com/hannahxchen/llm-steering-audit
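
As a rough illustration of an internal sensitivity test, the sketch below steers a hidden state along a concept direction and records how a toy decision score responds. The difference-of-means steering vector, the random linear readout, and every name in the code are assumptions made for illustration; the abstract does not specify how the authors construct or apply their vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16

# Hypothetical hidden states for inputs that differ in one protected concept.
concept_dir = np.zeros(d)
concept_dir[0] = 1.0
h_a = rng.normal(size=(50, d)) + 2.0 * concept_dir
h_b = rng.normal(size=(50, d)) - 2.0 * concept_dir

# Difference-of-means steering vector (an assumed, commonly used construction).
steer = h_a.mean(axis=0) - h_b.mean(axis=0)
steer /= np.linalg.norm(steer)

w_out = rng.normal(size=d)   # toy linear readout standing in for the model head

def decision_score(h):
    """Toy decision probability computed from a hidden state."""
    return 1.0 / (1.0 + np.exp(-(h @ w_out)))

def sensitivity(h, alphas):
    """Decision score after shifting h by alpha * steer, for each alpha."""
    return [decision_score(h + a * steer) for a in alphas]

h = h_b[0]
alphas = np.linspace(-3.0, 3.0, 7)
scores = sensitivity(h, alphas)
spread = max(scores) - min(scores)   # how far steering moves the decision
```

A large `spread` indicates that the decision depends on the steered concept, even if no text-level test input happens to expose that dependence.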

2026-01-23T02:03:20Z Hannah Cyberey Yangfeng Ji David Evans http://arxiv.org/abs/2605.16656v1 Read This Paper to Get $50 Million:* An Analysis of Mobile Messaging Scams Using Reddit Data 2026-05-15T21:50:21Z

Mobile messaging scams--fraudulent messages delivered over SMS and other mobile applications--have become a persistent and evolving security threat, yet the attributes underlying these campaigns remain unclear. This study seeks to address this gap by examining trends in mobile messaging scams and testing the effectiveness of commercial and open-source off-the-shelf detection tools. We characterize mobile messaging scam operations, focusing on how phone numbers, URLs, and text content are used across campaigns. To achieve this objective, we collect and measure a dataset of 175,430 user-reported mobile messaging scams from Reddit between June 2020 and December 2025. While reply-based scams constitute only 50% of our dataset, their compound annual growth rate (99.98%) is nearly twice that of click-based scams (57.29%). Critically, reply-based scams also show the lowest detector performance--despite identifiable similarities in text content and phone number origin within categories--indicating that current off-the-shelf tools are ineffective. These results suggest that further development of detectors is necessary to defend against this rapidly changing ecosystem. By examining a range of message attributes, this work provides new insights into mobile messaging scams, informing the design of more targeted and robust detection methods.
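
For reference, compound annual growth rate is conventionally computed as (end / start)^(1 / years) - 1. The sketch below uses hypothetical report counts rather than figures from the dataset, and the function name `cagr` is illustrative.

```python
def cagr(start_count, end_count, years):
    """Compound annual growth rate between two counts `years` apart."""
    if start_count <= 0 or years <= 0:
        raise ValueError("start_count and years must be positive")
    return (end_count / start_count) ** (1.0 / years) - 1.0

# Hypothetical example: reports grow from 100 to 1600 over 4 years.
growth = cagr(100, 1600, 4)
```

A rate close to 100%, like the one reported for reply-based scams, corresponds to volume roughly doubling every year.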

2026-05-15T21:50:21Z Allison Lu Bernardo B. P. Medeiros Kevin R. B. Butler Patrick Traynor http://arxiv.org/abs/2605.16623v1 To Trust or Not to Trust: Authors' Response to AI-based Reviews 2026-05-15T20:43:55Z

Large language models are increasingly discussed and used as tools that may assist with scholarly peer review, but empirical evidence regarding how authors use and perceive AI-based feedback remains limited. This paper reports findings from two independent pilot studies on authors' use and perceptions of AI-based auxiliary review at two computer science venues. After the review release, authors were invited to complete an anonymous post-review questionnaire about the AI review's usefulness, trustworthiness, agreement with human reviews, practical value for revision, perceived inaccuracies, and consent. The final dataset included 56 analyzable responses from authors of 40 papers; closed-ended items were summarized using descriptive statistics, and open-ended responses were analyzed using inductive thematic analysis. Most respondents (83.9%) considered the AI-based review useful, and 80.4% reported that it identified issues not mentioned by human reviewers. This perceived added value translated into action: 82.1% reported using at least some AI feedback in their camera-ready version. However, the authors did not treat the AI review as equivalent to a human review. They generally trusted it less than the human reviews and found human feedback clearer, even though 25.0% described at least some human reviews as not very useful. Reported problems with the AI review were usually limited: 51.8% reported minor inaccuracies, while 16.1% reported clearly incorrect, misleading, or irrelevant comments. Support for future use was strongest when AI was framed as a supervised or author-controlled tool: 96.4% said they would use AI as an internal review tool before future submissions, 89.3% preferred advance notice that AI would be used in review, and 76.8% favored explicit consent before use.

2026-05-15T20:43:55Z César Leblanc Lukas Picek http://arxiv.org/abs/2603.18053v2 Auditing the Auditors: Does Community-based Moderation Get It Right? 2026-05-15T19:33:08Z

Online social platforms increasingly rely on crowd-sourced systems to label misleading content at scale, but these systems must both aggregate users' evaluations and decide whose evaluations to trust. To address the latter, many platforms audit users by rewarding agreement with the final aggregate outcome, a design we term consensus-based auditing. We analyze the consequences of this design in X's Community Notes, which in September 2022 adopted consensus-based auditing that ties users' eligibility for participation to agreement with the eventual platform outcome. We find evidence of strategic conformity: minority contributors' evaluations drift toward the majority and their participation share falls on controversial topics, where independent signals matter most. We formalize this mechanism in a behavioral model in which contributors trade off private beliefs against anticipated penalties for disagreement. Motivated by these findings, we propose a two-stage auditing and aggregation algorithm that weights contributors by the stability of their past residuals rather than by agreement with the majority. The method first accounts for differences across content and contributors, and then measures how predictable each contributor's evaluations are relative to the latent-factor model. Contributors whose evaluations are consistently informative receive greater influence in aggregation, even when they disagree with the prevailing consensus. In the Community Notes data, this approach improves out-of-sample predictive performance while avoiding penalization of disagreement.
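
The idea of weighting contributors by residual stability rather than by agreement can be sketched as follows. This simplified version assumes a fully observed ratings matrix and substitutes an additive contributor-plus-note fit for the latent-factor model; the inverse-variance weighting and the name `stability_weights` are likewise assumptions, not the authors' algorithm.

```python
import numpy as np

def stability_weights(ratings):
    """Weight contributors by how stable their residuals are.

    ratings: 2-D array, rows = contributors, columns = notes.
    Stage 1 removes contributor and note effects with an additive fit.
    Stage 2 gives each contributor a weight inversely proportional to the
    variance of the remaining residuals.
    """
    grand = ratings.mean()
    row_effect = ratings.mean(axis=1, keepdims=True) - grand
    col_effect = ratings.mean(axis=0, keepdims=True) - grand
    resid = ratings - (grand + row_effect + col_effect)
    var = resid.var(axis=1) + 1e-6     # small constant avoids division by zero
    w = 1.0 / var
    return w / w.sum()

ratings = np.array([
    [1.0, 2.0, 3.0, 4.0],   # contributor 0: tracks the note ordering
    [0.0, 1.0, 2.0, 3.0],   # contributor 1: rates everything one point lower
    [2.0, 1.0, 4.0, 3.0],   # contributor 2: same mean as 0, but erratic
])
weights = stability_weights(ratings)
```

Contributor 1 disagrees with contributor 0 on every note but does so predictably, so the two receive equal weight; only the erratic contributor 2 is down-weighted.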

2026-03-17T21:58:13Z Yeganeh Alimohammadi Karissa Huang Christian Borgs Jennifer Chayes