http://arxiv.org/abs/2605.24737v1 Who judges the judges? Governance from metrics: a runtime framework for continuous LLM compliance monitoring 2026-05-23T21:21:33Z

Current approaches to AI compliance treat conformity as a binary, audit-time verdict rather than a continuous, measurable property of production systems. We argue that this compliance fiction is structurally ill-suited to the requirements of the EU AI Act, which demands ongoing human oversight and the detection of emergent behavioural drift in deployed systems. We introduce governance from metrics, a principle whereby regulatory compliance is derived as a continuous signal from runtime observability rather than from static assessments. Building on this principle, we present govllm, an open-source framework implementing a governance-driven routing architecture in which model selection is determined by accumulated compliance scores rather than by latency or cost alone. Central to our approach is a panel of regulatory judges - LLM evaluators specialised per criterion (EU AI Act, GDPR, ANSSI, accessibility) - whose inter-judge disagreement we reframe not as noise but as a regulatory uncertainty signal warranting human arbitration. We validate this approach through a ground truth corpus of 49 annotated prompt/response pairs across five regulatory criteria, evaluated by four small language models (SLMs, 1.7B-7B parameters) running fully on-premise. Agreement rates range from 51.5% (mistral:7b) to 69.1% (phi4-mini), with no single model dominating across all criteria - empirically motivating the Profile-as-jury design. We further document three structural failure modes in small regulatory judges and a judge-specific position bias that degrades agreement by up to 25 percentage points across three question-order conditions (original, reversed, permuted). govllm is released as open-source software to support reproducible AI governance research.
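The idea of treating inter-judge disagreement as an uncertainty signal can be illustrated with a minimal sketch. It is not govllm's implementation: the function name `panel_verdict`, the judge names, the label strings, and the 0.75 escalation threshold are all assumptions made for illustration.

```python
from collections import Counter

def panel_verdict(votes, min_agreement=0.75):
    """Aggregate one verdict per judge and flag low agreement.

    votes: dict mapping a (hypothetical) judge name to its label.
    min_agreement: assumed threshold below which a human arbitrates.
    Returns (majority_label, agreement_rate, needs_human).
    """
    counts = Counter(votes.values())
    majority_label, top_count = counts.most_common(1)[0]
    agreement = top_count / len(votes)
    # Disagreement is treated as regulatory uncertainty, not as noise.
    needs_human = agreement < min_agreement
    return majority_label, agreement, needs_human

# Hypothetical panel of four judges scoring one response on one criterion.
split_panel = {
    "judge_a": "compliant",
    "judge_b": "non_compliant",
    "judge_c": "compliant",
    "judge_d": "non_compliant",
}
label, agreement, needs_human = panel_verdict(split_panel)
print(label, agreement, needs_human)
```

With a two-against-two split the agreement rate is 0.5, which falls below the assumed threshold, so the case would be escalated to a human rather than resolved by majority vote.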

2026-05-23T21:21:33Z 41 pages, 8 figures, preprint Jehanne Dussert http://arxiv.org/abs/2605.24735v1 Dual-Use AI Face Swap Apps Are Mostly Unsafe: A Systematic Safety Audit 2026-05-23T21:15:42Z

AI-based image editing tools, such as face swapping algorithms, can be used to transform a clothed image of a person into a sexually explicit image of that person. These tools are made easily accessible to non-expert users through mobile apps, and have been linked to reports of image-based sexual abuse and cyberbullying involving synthetic non-consensual intimate imagery (SNCII). Apple and Google have begun to remove "nudification" apps from their platforms: apps that are marketed with the capability to "undress", "nudify", or create nude face swaps from images of people. However, AI image editing apps that have the same underlying capabilities but do not present as nudification apps could also be abused to create non-consensual explicit images. In this paper, we investigate whether AI face swap apps for iOS and Android implement safety measures to prevent the creation of SNCII. We identified and downloaded 420 face swap apps, and manually tested 155 eligible apps to see whether they would permit the user to create face swaps with nude images. Our evaluation shows that 70% of apps with face swap functionality have no technical safeguards against generation of nude images. Additionally, we investigated whether face swap apps' descriptions, terms of service, or privacy policies addressed harmful uses of the app, finding that no apps self-describe as nudification apps, but that the majority do not have specific terms of service provisions prohibiting this kind of use. Our findings suggest that to mitigate UI-bound SNCII threats, platforms and lawmakers must implement policies to mandate safety filters in dual-use AI image editing applications like face swap apps.

2026-05-23T21:15:42Z Alaa Daffalla Sarah Chao Eric Zeng http://arxiv.org/abs/2605.24688v1 Position: Adopting AI in Practice Does Not Guarantee the Productivity Boost 2026-05-23T17:56:01Z

This position paper argues that adopting AI in organizational practice does not guarantee productivity gains, because human and environmental factors critically moderate the relationship between AI deployment and realized productivity improvements. Following the advent of high-performance generative models, AI use has been rapidly encouraged in some sectors while being restricted in others. Most practitioners assume that AI brings productivity boosts owing to enhanced technical capabilities, but regardless of apparent performance advances in AI technology, human and environmental factors of the organization may substantially attenuate -- or even negate -- the effective productivity benefits. We identify five key moderating factors: human resource composition, baseline capability of individuals, learning curve of practitioners, incentives for fair use, and flexibility of objectives. Drawing on the partial equilibrium model of Gries and Naudé (2022), we argue that existing economic frameworks may inadvertently overlook these factors. We revise the existing framework to redefine effective organizational determinants and shed light on practical implications, including for industry and education, responding to alternative views and calling for action from stakeholders.

2026-05-23T17:56:01Z Accepted at ICML 2026 as a position paper; Official link: https://icml.cc/virtual/2026/poster/67097 Won Ik Cho Seong-hun Kim Geunhye Kim http://arxiv.org/abs/2603.20479v2 Profiling learners' affective engagement: Emotion AI, intercultural pragmatics, and language learning 2026-05-23T16:03:59Z

Learning another language can be a highly emotional process, typically characterized by numerous frustrations and triumphs, big and small. For most learners, language learning does not follow a linear, predictable path, its zigzag course shaped by motivational (or demotivating) variables such as personal characteristics, teacher/peer relationships, learning materials, and dreams of a future L2 (second language) self. While some aspects of language learning (reading, grammar) are relatively mechanical, others can be stressful and unpredictable, especially conversing in the target language. That experience necessitates not only knowledge of structure and lexis, but also the ability to use the language in ways that are appropriate to the social and cultural context. A new opportunity to practice conversational abilities has arrived through the availability of AI chatbots, with both advantages (responsive, non-judgmental) and drawbacks (emotionally void, culturally biased). This column explores aspects of emotion as they arise in technology use and in particular how automatic emotion recognition and simulated human responsiveness in AI systems interface with language learning and the development of pragmatic and interactional competence. Emotion AI, the algorithmically driven interpretation of users' affective signals, has been seen as enabling more personalized learning, adapting to perceived learner cognitive and emotional states. Others warn of emotional manipulation and inappropriate and ineffective user profiling.

2026-03-20T20:22:54Z Language Learning & Technology, 30(2), 14-35 (2026) Robert Godwin-Jones 10.64152/10125/73679 http://arxiv.org/abs/2605.24580v1 From Replacement to Orchestration: A Socio-Technical Architecture for Agentic AI in Corporate R&D 2026-05-23T13:44:24Z

Purpose: Corporate R&D faces a persistent productivity paradox: rising investment and expanding scientific knowledge have not translated into proportional innovation output. In pharmaceuticals this is captured as Eroom's Law; analogous patterns appear across engineering, materials science, and healthcare. The core cause is not insufficient tools but cognitive saturation: researchers spend an increasing share of their effort on coordination, documentation, and data governance -- hidden work that displaces high-value hypothesis formation, interpretation, and strategic synthesis. Design/Methodology/Approach: The paper uses a Design Science Research (DSR) methodology. The artifact is the HARMONY operating model. Evidence is triangulated from four semi-structured expert interviews with senior R&D leaders across industrial, healthcare, and academic settings; a foresight scenario analysis projecting four plausible 2040 R&D futures; and pattern matching with documented agentic R&D deployments. Two non-negotiable design requirements guide the architecture: cognitive-load redistribution (DR1) and bounded autonomy with alignment (DR2). Findings: We propose HARMONY -- Hybrid Agentic Research Model for Organisational New Yield -- a four-pillar socio-technical architecture comprising ResOps (Industrialized Execution), the Control Tower (Strategic Visibility and Drift Detection), the Ethics Fabric (Bounded Autonomy by Design), and the Talent Studio (Sciencepreneur Capability). The model introduces the Sciencepreneur as the central human archetype in agentic R&D, and Orchestration Leverage as a candidate productivity metric suited to human-agent hybrid systems.

2026-05-23T13:44:24Z Haithem Boussaid Marc Heemskerk Jimmy Siméon Adam Breen Merouane Debbah http://arxiv.org/abs/2603.00179v3 Privacy-Preserving Proof of Human Authorship via Zero-Knowledge Process Attestation 2026-05-23T12:55:30Z

Process attestation verifies human authorship by collecting behavioral biometric evidence, including keystroke dynamics, typing patterns, and editing behavior, during the creative process. However, the very data needed to prove authenticity can reveal intimate details about an author's cognitive state, health conditions, and identity, constituting sensitive biometric data under GDPR Article 9. We resolve this privacy-attestation paradox using zero-knowledge proofs. We present ZK-PoP, a construction that allows a verifier to confirm that (a) sequential work function chains were computed correctly, (b) behavioral feature vectors fall within human population distributions, and (c) content evolution is consistent with incremental human editing, all without learning the underlying behavioral data, exact timing, or intermediate content. Our construction uses Groth16 proofs over arithmetic circuits with Pedersen commitments and Bulletproof range proofs. We prove that ZK-PoP is computationally zero-knowledge, computationally sound, and achieves unlinkability across sessions. Evaluation shows proof generation in under 30 seconds for a 1-hour writing session, with 192-byte proofs verifiable in 8.2 ms, while incurring less than 5% accuracy loss in simulation at practical privacy levels (epsilon >= 1.0) compared to non-private baselines.

2026-02-26T20:38:19Z 8 pages David Condrey http://arxiv.org/abs/2605.24538v1 Is Decentralized AI Governable? From Regulative Policy to Constitutive Protocol 2026-05-23T12:09:56Z

Every major framework for governing artificial intelligence presupposes an identifiable entity -- a developer, deployer, or operator -- who can be held responsible and compelled to comply. Decentralized AI (DeAI) dissolves this presupposition. We analyze DeAI as a six-layer decentralizing stack -- model, training, compute, harness, identity, and ownership -- and show how partial decentralization across layers compounds into what we call the governance vacuum: a condition in which AI systems are consequential enough to require governance but lack the properties that existing frameworks presuppose in their targets. This vacuum takes two analytically distinct forms: an accountability gap, where no addressable principal can be identified, and an incapacitation gap, where even an identified principal cannot alter the running system. We demonstrate that these failures are not merely jurisdictional but defeat every presupposition of governance through normative address -- the communication of rules to a comprehending, responsive agent. Drawing on Lessig's modalities of regulation and Searle's distinction between regulative and constitutive rules, we argue for a shift in the locus of governance from policy to protocol, from normative address to architectural constraint. Protocol-based constitutive governance does not address the agents operating within a system but shapes the substrate that determines what kinds of actions are possible within it. We identify four ethical conditions -- legitimacy, contestability, transparency, and non-domination -- that such governance must satisfy to avoid degenerating into unaccountable technocratic power, and we argue that the central political challenge of governing AI in a decentralized world is reconstructing forms of democratic authorization for architectural choices that persist after the ordinary chain of policy has broken down.

2026-05-23T12:09:56Z Submitted for Ethics and Information Technology Botao Amber Hu Helena Rong http://arxiv.org/abs/2509.17878v2 AI, Digital Platforms, and the New Systemic Risk 2026-05-23T11:01:37Z

As artificial intelligence (AI) becomes increasingly embedded in digital, social, and institutional infrastructures, and AI and platforms are merged into hybrid structures, systemic risk has emerged as a critical but undertheorized challenge. In this paper, we develop a rigorous framework for understanding systemic risk in AI, platform, and hybrid system governance, drawing on insights from finance, complex systems theory, climate change, and cybersecurity - domains where systemic risk has already shaped regulatory responses. We argue that recent legislation, including the EU's AI Act and Digital Services Act (DSA), invokes systemic risk but relies on narrow or ambiguous characterizations of this notion, sometimes reducing this risk to specific capabilities present in frontier AI models, or to harms occurring in economic market settings. The DSA, we show, actually does a better job at identifying systemic risk than the more recent AI Act. Our framework highlights novel risk pathways, including the possibility of systemic failures arising from the interaction of multiple AI agents. We identify four levels of AI-related systemic risk and emphasize that discrimination at scale and systematic hallucinations, despite their capacity to destabilize institutions and fundamental rights, may not fall under current legal definitions, given the AI Act's focus on frontier model capabilities. We then test the DSA, the AI Act, and our own framework on five key examples, and propose reforms that broaden systemic risk assessments, strengthen coordination between regulatory regimes, and explicitly incorporate collective harms.

2025-09-22T15:14:23Z Accepted for publication at ACM FAccT (2026) Philipp Hacker Lilian Edwards Atoosa Kasirzadeh http://arxiv.org/abs/2511.08654v2 AI-generated podcasts: Synthetic Intimacy and Cultural Mistranslation in NotebookLM's Audio Overviews 2026-05-23T09:23:55Z

This paper analyses AI-generated podcasts produced by Google's NotebookLM, which generates audio podcasts with two chatty AI hosts discussing whichever documents a user uploads. While AI-generated podcasts have been discussed as tools, for instance in medical education, they have not yet been analysed as media. By uploading different types of text and analysing the generated outputs I show how the podcasts' structure is built around a fixed template. I also find that NotebookLM not only translates texts from other languages into a perky standardised Mid-Western American accent, it also translates cultural contexts to a white, educated, middle-class American default. This is a distinct development in how publics are shaped by media, marking a departure from the multiple public spheres that scholars have described in human podcasting from the early 2000s until today, where hosts spoke to specific communities and responded to listener comments, to an abstraction of the podcast genre.

2025-11-11T08:21:02Z This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement number 101142306. The project is also supported by the Center for Digital Narrative, which is funded by the Research Council of Norway through its Centres of Excellence scheme, project number 332643. Media, Culture & Society, online first (2026) Jill Walker Rettberg 10.1177/01634437261452160 http://arxiv.org/abs/2606.00088v1 From Frontier to Shadow AI: A Simmering Threat to Assurance and Security in Critical Infrastructure 2026-05-23T07:21:08Z

Frontier AI systems, including large language models and emerging agentic AI tools, offer significant operational benefits but present unique challenges to critical infrastructure (CI) environments due to their non-deterministic and emergent properties. While formal adoption is inherently cautious and tightly controlled due to strict regulatory oversight, widespread accessibility has catalysed shadow AI: the unsanctioned use of frontier AI outside established organisational controls. In CI settings, shadow AI bypasses established assurance and oversight mechanisms, amplifying risks to data protection, decision reliability, and regulatory compliance, with potential consequences for essential service delivery. We present the first empirical study of shadow AI in CI environments, characterising it as a systemic socio-technical condition of assurance erosion. Drawing on semi-structured interviews with senior executives and functional leaders across 27 Australian CI organisations (Communications, Energy, and Water and Sewerage sectors), we analyse how shadow AI manifests in practice, how it interacts with existing technical and governance controls, and the resulting security, assurance, and compliance risks. We develop an empirically derived threat model identifying three primary mechanisms of security degradation: (i) boundary bypass, where data flows circumvent established perimeters; (ii) unassessed capability expansion, where embedded AI features introduce latent risks; and (iii) loss of observability via governance circumvention, undermining forensic auditability and least-privilege enforcement. Our findings demonstrate that shadow AI introduces unmanaged risks that fundamentally challenge existing security and compliance frameworks, necessitating tailored, pathway-aligned governance and control strategies.

2026-05-23T07:21:08Z 21 pages, 2 figures, 2 tables, paper under review Mohan Baruwal Chhetri Shahroz Tariq Tooba Aamir Marthie Grobler Chandra Thapa Ronal Singh http://arxiv.org/abs/2605.24383v1 A governance horizon for ethical-use constraints in open-weight AI models 2026-05-23T03:47:04Z

Ethical constraints on open-weight AI models are both a reflection of societal concerns and a foundation for AI governance policy. They are expected to propagate to downstream derivatives while implemented as voluntary metadata disclosures that must be restated at each generation of reuse. We audit 2,142,823 model repositories on Hugging Face Hub to test whether this disclosure-based governance infrastructure can sustain traceability across deep model lineages. Restriction evidence decays with a half-life of 1.31 derivation steps ($R^2$=0.98), and beyond seven downstream generations at least 80% of descendant models lack sufficient public evidence for a governance determination, a depth boundary we formalize as the governance horizon. Platform-level interventions to restore missing licence metadata reveal that policy design (not enforcement alone) is the binding factor: inheritance-only designs require near-complete enforcement to move the horizon, whereas a mandatory-declaration design that explicitly resolves orphan lineage components shifts the horizon already at moderate enforcement. The structural bottleneck is lineages with no inheritable upstream intent: such orphan components remain undecidable under any inheritance-only policy regardless of enforcement rate, and unresolved upstream nodes additionally create direct downstream undecidability bottlenecks that inheritance rules alone cannot recover. Comparison with PyPI, where governance signals are carried by explicit machine-readable declarations, corroborates that the collapse is topology-specific to open-weight derivation rather than inherent to open ecosystems. These results establish that disclosure-based governance has a shallow, structurally determined reach in open-weight AI, and that achieving deep supply-chain accountability requires provenance mechanisms propagating governance signals through derivation itself.
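The reported half-life can be turned into a quick worked calculation. The sketch below assumes a pure exponential decay in derivation depth, which is only an illustrative reading of the fitted half-life; the function name `retained_fraction` is hypothetical, and the code does not reproduce the paper's separate governance-horizon measure (the share of descendants lacking sufficient evidence for a determination).

```python
HALF_LIFE_STEPS = 1.31  # half-life in derivation steps, as reported in the abstract

def retained_fraction(steps, half_life=HALF_LIFE_STEPS):
    """Fraction of restriction evidence left after `steps` derivations,
    assuming simple exponential decay (an assumption of this sketch)."""
    return 0.5 ** (steps / half_life)

# Tabulate the assumed decay for derivation depths 0 through 7.
for depth in range(8):
    print(f"depth {depth}: {retained_fraction(depth):.3f}")
```

Under this assumption, evidence halves every 1.31 steps, so only a few percent of the original restriction evidence would remain by the seventh generation.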

2026-05-23T03:47:04Z Weiwei Xu Hengzhi Ye Haoran Ye Kai Gao Vladimir Filkov Minghui Zhou http://arxiv.org/abs/2508.11872v3 Designing Singing Syllabi with Virtual Avatars: AI-Assisted Syllabus Reauthoring 2026-05-23T02:23:15Z

Traditional syllabi often function as static reference documents rather than engaging introductions to a course. In practical teaching, we observe that few students thoroughly read or fully comprehend the information provided in traditional, text-based course syllabi, which can leave essential information underused. This paper reframes syllabus communication as a design problem and documents an AI-assisted workflow for transforming a traditional syllabus into a musical, video-based, and avatar-enhanced learning artifact. The paper traces the process of lyrical adaptation, music generation, video composition, avatar synthesis, and optional browser-based interaction. The paper also contributes a reproducible workflow and a concrete example of syllabus reauthoring. The discussion in this paper positions the singing syllabus as a supplement to, not a replacement for, the formal written syllabus and identifies future directions for empirical evaluation. The complete implementation described in this paper is publicly available at https://github.com/xinxingwu-uk/SSVA

2025-08-16T02:12:39Z 16 pages, 1 figure, 1 table Xinxing Wu http://arxiv.org/abs/2604.23703v2 Talking Slide Avatars: Open-Source Multimodal Communication Approach for Teaching 2026-05-23T02:20:27Z

Slide-based teaching is widely used in higher education, yet in online, hybrid, and asynchronous contexts, slides often lose instructor presence, narrative continuity, and expressive framing that help learners connect with course content. Full lecture video can partly restore these qualities, but it is time-consuming to record, revise, and reuse. This study presents a practice-based implementation and analytic reflection of an open-source workflow for creating talking slide avatars. The workflow integrates OpenVoice for text-to-speech and authorized voice-style conversion with Ditto-TalkingHead for audio-driven talking-image synthesis, enabling instructors to transform a short script and an authorized or synthetic portrait image into a narrated video for slide decks or HTML-based lecture materials. Rather than treating this workflow only as a technical solution, the study frames talking slide avatars as multimodal communication artifacts at the intersection of digital pedagogy, aesthetic education, and art-technology practice. The paper documents the production pipeline, analyzes communicative and aesthetic affordances, and proposes practical guidelines for script length, image selection, pacing, disclosure, accessibility, consent, and ethical use. Its contribution is not a validated learning intervention, but an educator-oriented open-source production model and communication-design framework. The study concludes that short, transparent, and carefully designed avatars may provide a reusable communication layer for introductions, transitions, reminders, and recaps when used selectively and with appropriate ethical safeguards.

2026-04-26T13:36:45Z 15 pages Xinxing Wu http://arxiv.org/abs/2605.24325v1 PAIRED: A Process-Anchored Framework for Transparent Reporting of AI Contributions in Scientific Research 2026-05-23T01:10:56Z

The rapid integration of generative AI into scientific research has exposed a critical gap in academic disclosure practice. Existing frameworks for reporting AI contributions are uniformly output-oriented -- they document what AI produced, not how the research unfolded. As a result, researchers who wish to report their AI collaboration honestly lack the tools to do so: no current framework can distinguish between a researcher who originated a research direction and one who adopted a direction proposed by AI, or between a researcher who critically evaluated AI-generated alternatives and one who accepted AI output without independent assessment. This gap is not a matter of compliance detail; it is a failure to capture the cognitive dynamics that determine what kind of intellectual contribution a paper actually represents. We propose PAIRED -- Process-Anchored Interaction Reporting for AI-Enabled Discovery -- a dual-facing framework that addresses this gap through four design principles: process orientation, which takes the decision point rather than the research product as the fundamental unit of documentation; dual-facing output, which derives a structured publisher disclosure from a prospective author log without double work; decision-point granularity, which operates between session-level coarseness and message-level impracticality; and artifact-triggered logging, which provides an auditable rule against selective omission. We demonstrate PAIRED through worked examples, discuss its limitations openly, and propose a model-assisted adoption pathway that embeds the framework's logging discipline directly into AI research platforms.

2026-05-23T01:10:56Z Ahmad Al-Kabbany http://arxiv.org/abs/2605.24307v1 Modernizing User Privacy Preference Measurement through GPPI: A GDPR-aligned Privacy Preference Item Bank 2026-05-23T00:36:32Z

Privacy measurement instruments (e.g., CFIP, IUIPC, PAQ) predate GDPR by over a decade and measure privacy concerns, which are distinct from preferences for regulatory protections (e.g., data portability, erasure, automated decision-making rights). This leaves practitioners without tools to assess whether users value the GDPR mechanisms implemented in compliant policies. We developed a GDPR-grounded privacy preference measurement item bank by extracting 669 statements from all 99 GDPR articles, validated by: (1) two-round expert review achieving full consensus on accuracy, (2) semantic clustering into 10 parent themes and 87 subthemes, and (3) consensus review with 50 privacy experts (5 per theme) using a retention threshold of at least 4 out of 5 votes. The final 527-item bank comprises 9 parent themes and 73 subthemes (18 to 112 items per parent theme, 1 to 29 per subtheme), enabling targeted measurement across granularities while covering GDPR at a mean pairwise expert agreement of approximately 85%. This work introduces a complementary measurement dimension aligning user preferences with regulatory mechanisms.
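The vote-based retention step can be sketched in a few lines. This is an illustration only: the function name `retain_items`, the item identifiers, and the vote counts are hypothetical, and the sketch does not claim that the vote threshold alone accounts for the reduction from 669 statements to 527 items.

```python
def retain_items(votes_by_item, min_votes=4):
    """Keep items whose approving votes reach the threshold.

    votes_by_item: dict mapping an item id to the number of experts
    (out of a panel of 5) who voted to retain it.
    """
    return [item for item, votes in votes_by_item.items() if votes >= min_votes]

# Hypothetical vote counts for three candidate items in one theme.
example_votes = {"item_a": 5, "item_b": 4, "item_c": 3}
kept = retain_items(example_votes)
print(kept)

# Counts reported in the abstract: statements extracted vs. items in the final bank.
extracted, final = 669, 527
share_retained = final / extracted
print(f"{share_retained:.1%} of extracted statements remain in the final bank")
```

Items with 4 or 5 approving votes pass, while an item with 3 votes is dropped. The last two lines only restate the reported counts as a proportion.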

2026-05-23T00:36:32Z Yahya Hmaiti Mykola Maslych Amirpouya Ghasemaghaei Trung Cuong Dang Corey Pittman David Mohaisen Joseph J. LaViola