Principled Uncertainty in Clinical AI: End-to-End Bayesian Modelling and Algorithmic Equity Auditing Across Multimodal Patient Data

2026-06-08T17:44:15Z

Clinical artificial intelligence (AI) systems routinely produce predictions without principled quantification of uncertainty, limiting their trustworthiness in high-stakes medical environments. This paper presents an integrated research programme addressing two interconnected problems: (1) the development of a fully end-to-end Bayesian uncertainty modelling framework for multimodal clinical data, and (2) the application of calibrated uncertainty estimates as a formal measure of algorithmic equity across patient subgroups. We construct a probabilistic deep learning architecture comprising modality-specific variational encoders, a precision-weighted late fusion mechanism, and a decomposed uncertainty output head that separates aleatoric from epistemic uncertainty. The system is trained with a composite Bayesian loss incorporating binary cross-entropy, Kullback-Leibler divergence regularisation, and an uncertainty calibration penalty. We evaluate model calibration using Expected Calibration Error (ECE = 0.096) and conduct a subgroup equity audit across facility type, socioeconomic status, age group, and biological sex on a dataset of 1,000 simulated patients. Results demonstrate that epistemic uncertainty systematically identifies underserved populations: primary/rural facility patients show a 15.3% uncertainty equity gap (p < 0.001, effect size = 0.698), low socioeconomic status patients exhibit a 6.8% gap (p < 0.001), and elderly patients show a 3.9% gap (p < 0.001), whilst no significant sex-based disparity is detected. These findings establish that calibrated uncertainty is not merely a technical property of probabilistic models but constitutes an actionable equity signal with direct clinical relevance.
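
The calibration metric named in the abstract can be illustrated with a minimal sketch. It assumes the common equal-width-bin, binary (positive-class) form of Expected Calibration Error; the abstract does not state which binning variant or bin count the authors used, so `n_bins=10` and the function name are illustrative assumptions only.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE with equal-width bins (an assumed variant, not the paper's code).

    For each bin, take |observed positive rate - mean predicted probability|
    and weight it by the fraction of samples that fall in that bin.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each probability to a bin index in 0..n_bins-1.
    bin_ids = np.clip(np.digitize(probs, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_ids == b
        if not in_bin.any():
            continue  # empty bins contribute nothing
        gap = abs(labels[in_bin].mean() - probs[in_bin].mean())
        ece += in_bin.mean() * gap
    return float(ece)

# Toy data (invented for illustration): predictions off by 0.1 in each bin.
toy_ece = expected_calibration_error([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1])
```

Computing the same quantity separately per subgroup (for example, per facility type) is one way a calibration audit could be organised, although the abstract's "uncertainty equity gap" may be defined differently.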

Human-Centred Risk Mitigation for AI-Mediated Information Manipulation: A SOCMINT Framework Based on Information Manipulation Sets

2026-06-08T17:12:11Z

AI-mediated information manipulation increasingly takes the form of social cyber attacks that target trust, attention, credibility, reputation, and decision-making rather than only technical infrastructures or isolated pieces of false content. Existing defensive approaches often oscillate between incident-level analysis, which fragments campaigns into weak signals, and attribution-first analysis, which may delay mitigation until responsibility is established. This paper proposes a SOCMINT framework based on Information Manipulation Sets (IMS) as an intermediate operational unit between individual incidents and strategic attribution. Building on the VIGINUM/EEAS use of IMS in counter-FIMI analysis, the framework treats manipulation as a coherent process involving narratives, accounts, infrastructures, temporal patterns, cross-platform migration, synthetic amplification, and cognitive targeting. The proposed pipeline moves from signal detection and diagnostic triage to IMS hypothesis construction, confidence/severity assessment, mitigation selection, and iterative update. A compact scenario illustrates how IMS-based analysis captures what content-level and attribution-first approaches miss. The paper also proposes a tabletop evaluation protocol to assess decision quality, confidence calibration, and mitigation proportionality. The main implication is that human-centred risk mitigation requires not only better detection, but also structured reasoning under uncertainty, auditable decision-making, and safeguards against over-securitising legitimate dissent.

Powering the Future of AI: Navigating the Trade-offs for Europe's Energy Transition and Net-Zero Goals

2026-06-08T15:22:38Z

The rapid expansion of AI globally has led to the proliferation of energy-intensive hyperscale data centres (DCs), making them a structurally challenging component in power system planning and operation. Using a spatially explicit optimisation model of Europe across 21 AI growth scenarios, we systematically quantify additional demand, capacity requirements, emissions, and operational impacts of DCs. Results indicate that AI could drive 73-723 TWh of extra demand by 2050, risking cumulative emissions overshoots of 67-181 MtCO2 between 2030 and 2050. Our analysis indicates that after 2030, the geography of AI infrastructure will be shaped more by firm power and system flexibility than by the mere abundance of clean energy. In moderate scenarios, AI requires an additional 200 hours of firm generation, which increases LCOE by 35 EUR/MWh in key hubs. We show that even under the pessimistic scenarios, existing infrastructure would require 70 GW of additional capacity, while under managed growth pathways, this expansion could reach 226 GW. We further find that DC workload dynamics strongly shape energy dispatch, system flexibility, and emissions, while improved efficiency significantly reduces capacity needs and system peaks. While our findings suggest that net-zero targets for 2050 may be achieved, critical emission risks may appear in the intermediate years, and the EU may compromise its carbon-neutral goals unless policies adapt to this accelerating digital transformation.

Awareness of Technological Isomorphism: Integrating AI into Elementary Mathematics Teaching on Data and Prediction, A Case Study of the Compound Line Graph

2026-06-08T15:09:42Z

The deep integration of Artificial Intelligence (AI) into elementary mathematics education necessitates a conceptual tool capable of explaining students' cognitive transition from disciplinary knowledge to AI understanding. This study proposes a novel core concept, "Awareness of Technological Isomorphism," defined as a student's metacognitive realization that their own mathematical cognitive operations (e.g., observing trends, inducing patterns, and making predictions) share an underlying logical structure with AI technical operations (e.g., pattern recognition and predictive modeling). This awareness, in turn, facilitates cognitive transfer from disciplinary mathematics to AI comprehension. Underpinned by transfer learning and metacognitive theories, this study clarifies the distinct essence of this concept from traditional "computational thinking." We demonstrate the explanatory power of this framework in two ways: elucidating the mechanism of students' cognitive leap from mathematics to AI, and guiding instructors to identify "isomorphic interfaces" within disciplinary curricula. On this basis, a three-stage pedagogical pathway (spanning "Perception, Comprehension, and Creation") is constructed alongside a corresponding evaluation rubric. This framework is empirically validated through a case study based on the "Compound Line Graph" lesson from a fifth-grade mathematics textbook in China, offering a highly replicable operational framework for the deep convergence of disciplinary instruction and AI literacy education.

I Was Scrolling and Then I Saw a Pregnant Strawberry

2026-06-08T15:01:11Z

AI minidramas (also known as fruit dramas) are short, algorithmically distributed generative AI video series featuring anthropomorphized characters that have recently emerged as a widespread phenomenon on social media platforms. This paper argues that despite their seemingly innocuous aesthetic, these videos reproduce deeply gendered narrative structures in which female characters are systematically associated with moral transgression, sexual betrayal, and reproductive capacity, and that several plots also encode the logic of racialization, i.e., the process by which visible bodily difference is morally loaded. Drawing on feminist film theory, critical race theory, and platform studies, it further argues that the generative AI aesthetic of these videos, characterized by softness, roundness, and visual cuteness, functions as a mechanism of aesthetic laundering, neutralizing the ideological weight of these narratives and enabling their circulation despite content moderation systems. This paper approaches these questions through personal observation and close reading, reflecting on the specific affordances of generative AI that make this phenomenon both possible and culturally consequential for the field of computational creativity.

MC-CPO: Mastery-Conditioned Constrained Policy Optimization for Pedagogically Safe Intelligent Tutoring Systems

2026-06-08T14:40:43Z

Intelligent tutoring systems increasingly rely on reinforcement learning to personalise instruction, yet optimising for observable engagement signals can systematically decouple learner activity from genuine knowledge acquisition. Analysing over 21 million student interactions across two deployed platforms, we find engagement events without corresponding mastery gains occur in 26.5% of interactions on Junyi Academy (72,758 students) and 3.1% on XES3G5M (14,453 students, NeurIPS 2023), confirming this pattern is directly observable in deployed educational technology at scale. We introduce Mastery-Conditioned Constrained Policy Optimisation (MC-CPO), a reinforcement learning framework that addresses this problem structurally. MC-CPO conditions the admissible instructional action space on learner mastery state: a concept becomes available only when prerequisite knowledge meets a mastery threshold, yielding an action space that expands naturally as learners acquire knowledge. Pedagogical safety constraints are enforced by construction, with formal guarantees of structural prerequisite safety, primal-dual convergence, and strict dominance over post-hoc filtering. MC-CPO is the only method to reduce reward hacking severity across all conditions. Mean per-episode mastery gain increases by 18.3% on Junyi Academy and 54.0% on XES3G5M relative to all baselines, while competitive engagement performance is maintained. These results support structural constraint modelling as a principled foundation for safer adaptive instructional policies in deployed tutoring systems.
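
The mastery-conditioned action space described above can be sketched as a prerequisite mask applied before action selection. This is a minimal illustration of the masking idea only, not the authors' implementation: it omits the primal-dual constrained optimisation entirely, and the names `admissible_actions`, `masked_policy`, the `prerequisites` mapping, and the threshold value 0.7 are all hypothetical.

```python
import numpy as np

def admissible_actions(mastery, prerequisites, threshold=0.7):
    """Boolean mask over concepts.

    A concept is admissible only if every prerequisite concept has an
    estimated mastery of at least `threshold` (hypothetical value).
    Concepts with no listed prerequisites are always admissible.
    """
    mask = np.ones(len(mastery), dtype=bool)
    for concept, prereqs in prerequisites.items():
        mask[concept] = all(mastery[p] >= threshold for p in prereqs)
    return mask

def masked_policy(logits, mask):
    """Softmax restricted to admissible actions; others get probability 0.

    Assumes at least one action is admissible (e.g. a root concept).
    """
    masked = np.where(mask, np.asarray(logits, dtype=float), -np.inf)
    masked = masked - masked[mask].max()  # stabilise the exponentials
    weights = np.exp(masked)              # exp(-inf) evaluates to 0.0
    return weights / weights.sum()

# Toy curriculum: concept 1 requires concept 0, concept 2 requires concept 1.
prereqs = {1: [0], 2: [1]}
mastery = np.array([0.9, 0.4, 0.0])
mask = admissible_actions(mastery, prereqs)
policy = masked_policy([0.0, 0.0, 5.0], mask)
```

In this toy case concept 2 carries the largest logit but receives zero probability because its prerequisite is below the threshold. Building the constraint into the action space this way, rather than filtering a sampled action afterwards, is the design choice the abstract contrasts with post-hoc filtering.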

Performative Learning Theory

2026-06-08T14:39:22Z

Performative predictions influence the very outcomes they aim to forecast. We study performative predictions that affect a sample (e.g., only existing users of an app) and/or the whole population (e.g., all potential app users). This raises the question of how well models generalize under performativity. For example, how well can we draw insights about new app users based on existing users when both of them react to the app's predictions? We address this question by embedding performative predictions into statistical learning theory. We prove generalization bounds under performative effects on the sample, on the population, and on both. A key intuition behind our proofs is that in the worst case, the population negates predictions, while the sample deceptively fulfills them. We cast such self-negating and self-fulfilling predictions as min-max and min-min risk functionals in Wasserstein space, respectively. Our analysis reveals a fundamental trade-off between performatively changing the world and learning from it: the more a model affects data, the less it can learn from it. Moreover, our analysis results in a surprising insight on how to improve generalization guarantees by retraining on performatively distorted samples. We illustrate our bounds in a case study on prediction-informed assignments of unemployed German residents to job trainings, drawing upon administrative labor market records from 1975 to 2017 in Germany.

Interpretable Crisis Behavior Analysis Using Mobility and Social Media Data

2026-06-08T14:16:36Z

Crises alter both how people move and how they communicate. During emergencies such as wildfires and pandemics, changes in mobility patterns and online emotional discourse evolve jointly, yet they are typically studied in isolation. This paper presents a unified and interpretable pipeline that integrates mobility and social media data to identify cross-domain behavioral patterns in crisis settings. The framework is evaluated through two case studies: a short-horizon analysis of the January 2025 Los Angeles wildfires (prototype case) and a longitudinal analysis of UAE COVID-19 behavior from March 2020 to December 2021 (primary case, 671 days). The pipeline aligns heterogeneous daily signals, transforms them into binary behavioral states, applies Formal Concept Analysis (FCA) to extract co-occurrence structure, mines association rules, and validates rule stability through chronological holdout testing. A structured policy-translation layer renders robust rules as operational briefs specifying triggers, lead times, and action playbooks. Results reveal clear cross-domain behavioral structure in both crises. In the wildfire case, traffic stress, fear/anger sentiment, and governance discourse are tightly coupled within a 33-day window, with key rules reaching 100% confidence and lift scores up to 2.5. In the COVID case, repeated mobility adaptation and sentiment volatility yield 8 stable same-day rules (88% holdout pass rate) and 40 clean predictive rules with 2-7 day lead horizons. The work demonstrates that interpretable multimodal fusion can produce both scientifically credible and policy-actionable crisis intelligence.
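
The rule statistics quoted in the abstract (confidence and lift) follow standard association-rule definitions, which the sketch below computes over binary daily states. The state names and the ten-day toy series are invented for illustration and are not the paper's data; the FCA step and the chronological holdout test are not reproduced here.

```python
import numpy as np

def rule_metrics(states, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    `states` maps a state name to a boolean sequence with one entry per day.
    confidence = P(consequent | antecedent); lift = confidence / P(consequent).
    Assumes the antecedent holds on at least one day.
    """
    a = np.all([np.asarray(states[k], dtype=bool) for k in antecedent], axis=0)
    c = np.all([np.asarray(states[k], dtype=bool) for k in consequent], axis=0)
    both = a & c
    support = float(both.mean())
    confidence = float(both.sum() / a.sum())
    lift = confidence / float(c.mean())
    return support, confidence, lift

# Toy ten-day series (hypothetical state names and values).
toy_states = {
    "traffic_stress": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "fear_sentiment": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
}
support, confidence, lift = rule_metrics(
    toy_states, ["traffic_stress"], ["fear_sentiment"]
)
```

Here the antecedent holds on four days and the consequent on five, so the rule has support 0.4, confidence 1.0, and lift 2.0. A lift above 1 means the consequent is more frequent on days when the antecedent holds than on average.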

Toward Operationalizing Rasmussen: Drift Observability on the Simplex for Evolving Systems

2026-06-08T12:39:21Z

Software operations increasingly rely on SLOs, traces, deployment specifications, and change events, yet dashboards and thresholding practices often expose share-like operational signals as separate scalar panels or baseline distances. This can create false alarms under benign redistribution and miss movement toward policy boundaries. Rasmussen's dynamic safety model motivates drift under competing pressures, but operationalizing it for software is difficult because relevant state variables (remaining margin, engineering effort, and risk/impact) are often compositional and their parts evolve. We formulate an automated, artifact-derived drift-monitor design that maps changing software artifacts into a stable compositional monitoring state: it extracts a current part inventory and policy constraints, maps telemetry to a positive composition, stabilizes splits, merges, and renames through lineage-aware canonical groups, and analyzes boundary-directed drift in log-ratio coordinates. The proposed monitor would report drift direction, step-to-boundary, balance-level attribution, and model-health indicators under architectural churn. We specify the approach, identify its zero/noise/lineage assumptions, and report a reproducible synthetic sanity check of boundary-aware drift and controlled part churn.
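
Log-ratio coordinates for a positive composition can be illustrated with the centred log-ratio (CLR) transform. This is a sketch under assumptions: the abstract does not say which log-ratio transform the monitor uses (its mention of balance-level attribution may imply balance-based coordinates instead), and the example shares are invented.

```python
import numpy as np

def clr(composition):
    """Centred log-ratio transform of a strictly positive composition.

    Parts are first rescaled to sum to 1 (closure); zeros must be handled
    beforehand, since log(0) is undefined.
    """
    x = np.asarray(composition, dtype=float)
    x = x / x.sum()
    log_x = np.log(x)
    return log_x - log_x.mean()

def aitchison_distance(x, y):
    """Euclidean distance between two compositions in CLR coordinates."""
    return float(np.linalg.norm(clr(x) - clr(y)))

# Hypothetical shares of an error budget across three services.
baseline = [0.2, 0.3, 0.5]
rescaled = [20.0, 30.0, 50.0]   # same shares in different units
near_edge = [0.01, 0.39, 0.60]  # first part approaching the simplex boundary

same = aitchison_distance(baseline, rescaled)
drift = aitchison_distance(baseline, near_edge)
```

Because the transform depends only on ratios between parts, `same` is zero up to floating-point error. A part shrinking toward zero, as in `near_edge`, produces a large displacement even though its absolute change is small, which is why movement toward a boundary is easier to see in these coordinates than on separate scalar panels.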

Can Data Work be Reparative?

2026-06-08T12:25:35Z

We present an ethnographic study of an alternative approach to data work, developed by a civic-tech initiative that builds datasets for training and benchmarking online safety systems. They aim to respond to online safety concerns from a feminist perspective, by building safety datasets collaboratively with those most impacted by online harms. In this paper, we examine how this approach aims to reorient data work as a site for repair and redress, and trace the struggles they encounter in the process. Specifically, we draw attention to the challenges and tensions involved in advancing just reward for data work and collective governance of AI datasets. Examining these challenges through an STS-informed lens of reparative justice and repair, we argue that the work of repairing data work (and AI) lies, fundamentally, in resetting the ties of accountability. At a time of heightened emphasis on efforts like safety evaluations and red teaming to make AI more responsible, we highlight the need to confront foundational questions about how the humans involved in these efforts relate to the datasets and systems they help produce. A reparative lens demands that we interrupt prevailing norms of data work and place at their centre, not AI or datasets, but those most harmed by the neglect, oversight and exclusion animated in the current modes of dataset production. This, we argue, offers a bold vision for responsibility and contributes towards a critical agenda for building alternative futures of data and AI practice.

Bathtubs, Boundaries, and Sandboxes: AI Regulatory Learning under Legal Uncertainty

2026-06-08T11:46:10Z

Effective regulation of AI systems is a defining policy challenge, driven by their integration into all aspects of society. To remain responsive to their rapid development and emergent properties, policymakers across the globe rely on high-level principles and abstract legal requirements. Yet, while this flexibility supports future-proofing human-centred regulations and aligning them with socio-ethical values, it also causes legal uncertainty downstream as developers, companies, and auditors struggle with translating these abstract requirements into verifiable technical requirements. Using the AI Act as an example, this paper draws on Coleman's bathtub to analyse the regulatory learning space in AI governance. It argues that legal uncertainty cannot be fully reduced ex ante and that, within reasonable bounds, it is also necessary for regulatory learning because it creates the space in which boundary negotiation over socio-technical meaning can occur. Building on this analysis, the paper shows how boundary objects and boundary negotiating artifacts help explain the translation of legal requirements into operational practice. By examining technical sandbox frameworks, it further identifies concrete properties that technical infrastructures must possess to function effectively as boundary negotiation artifacts in AI assessment. The paper concludes that legal certainty remains the long-term aim, but that premature closure of regulatory instruments risks undermining the learning processes needed for adaptive governance.

Trustworthy Smart Fabs via Professional Proxies: Scaling Safe and Sustainable by Design (SSbD) through Industrial Data Spaces

2026-06-08T09:02:02Z

The convergence of the 2026 European Union Safe and Sustainable by Design (SSbD) framework, Corporate Sustainability Due Diligence Directive (CSDDD), and Carbon Border Adjustment Mechanism (CBAM) introduces a severe governance bottleneck for advanced semiconductor manufacturing facilities ("Smart Fabs"). Regulatory compliance demands have surpassed the capacity of manual corporate reporting, creating a direct conflict between multi-stakeholder transparency and corporate data privacy. This paper addresses this challenge by introducing a zero-trust socio-technical orchestration framework that operationalizes a six-layer SSbD reference architecture within trustworthy industrial data spaces. We propose a shift from reactive automation to autonomous governance through "Professional Proxies": role-based agentic workflows executing within hardware-isolated trust zones. Structured as an interoperable network protocol stack, the framework coordinates an automated, five-step "relay race" between Facility, Process Engineering, and Finance proxy teams to align factory-floor yield models with macro-level sustainability mandates. By executing Virtual Metrology (VM) predictions and Federated Machine Learning (FML) inside hardware-rooted Trusted Execution Environments (TEEs), this architecture resolves the Data Sovereignty Paradox, demonstrating how fabs can export cryptographically signed compliance tokens via International Data Spaces (IDS) connectors without exposing proprietary process recipes. Ultimately, this framework provides technology managers with a verifiable, evidence-based pathway toward resilient, net-zero Industry 5.0 ecosystems.

Strategic Integration of Artificial Intelligence in the C-Suite: The Role of the Chief AI Officer

2026-06-08T08:42:16Z

The integration of Artificial Intelligence (AI) into corporate strategy has become critical for organizations seeking to maintain competitive advantage in the digital age. Although organizations increasingly rely on AI as a strategic and organizational resource, existing C-suite roles remain only partially equipped to govern, integrate, and leverage it coherently at the enterprise level. Organizations vary in their responses. Some create a dedicated Chief AI Officer (CAIO), others extend existing mandates into hybrid roles, and still others coordinate AI through federated structures. This paper develops a role-design theory to explain this variation. I identify three properties that distinguish AI from earlier cross-cutting enterprise technologies - distributed accountability for judgment, upstream governance, and non-stationarity - and three configurations through which organizations respond: concentrated extension, distributed extension, and role creation. The CAIO Framework links these properties to the executive design problems they generate and to the functions and capabilities required of the dedicated role. Four propositions specify when a dedicated CAIO emerges, what form an organization's response takes, when the dedicated role is effective, and how configurations evolve over time. This paper contributes to research on executive leadership, organizational design, and digital governance by offering a theory-driven account of the strategic integration of AI at the executive level.

Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook

2026-06-08T08:25:07Z

As LLMs are globally deployed, aligning their cultural value orientations is critical for safety and user engagement. However, existing benchmarks face the Construct-Composition-Context ($C^3$) challenge: they rely on discriminative, multiple-choice formats that probe value knowledge rather than true orientations, overlook subcultural heterogeneity, and are mismatched with real-world open-ended generation. We introduce DOVE, a distributional evaluation framework that directly compares human-written text distributions with LLM-generated outputs. DOVE utilizes a rate-distortion variational optimization objective to construct a compact value codebook from 10K documents, mapping text into a structured value space to filter semantic noise. Alignment is measured using unbalanced optimal transport, capturing intra-cultural distributional structures and subgroup diversity. Experiments across 12 LLMs show that DOVE achieves superior predictive validity, attaining a 31.56% correlation with downstream tasks, while maintaining high reliability with as few as 500 samples per culture.

Health-Informed Computing: Estimating and Addressing the Public Health Impact of Data Centers

2026-06-08T05:39:34Z

The surging demand for artificial intelligence (AI) has led to a rapid expansion of energy-intensive data centers, contributing to criteria air pollutant emissions and raising public health concerns that have received comparatively limited attention in sustainability assessments. This paper introduces a principled methodology to model air pollutant emissions for data centers and estimate the public health impacts. Our findings reveal that the growing demand for AI and computing technologies is projected to push the total annual public health burden of U.S. data centers up to more than $20 billion in 2028. Although national-level impacts remain modest, data center health costs are unevenly distributed: in the most affected counties, the estimated per-household health burden can reach about seven times the national average. Next, we propose a health-informed computing framework that explicitly incorporates public health impacts into data center resource management across space and time, mitigating public health costs while supporting environmental sustainability. More broadly, we recommend extending energy reporting to include the public health impacts of data centers and paying attention to all impacted communities.