http://arxiv.org/abs/2605.15934v1 Privacy is Fungibility: Why Endogenous Tokens Are Not Money 2026-05-15T13:15:32Z

In this paper, we make a case that endogenous tokens such as cryptoassets are not money. First, we define and classify tokens found on public, permissionless ledgers, contrasting them with privately issued stablecoins and proposed CBDC designs. We then discuss the work of Kahn et al. in "Money is Privacy" on cash versus simplified credit, and we extend their analysis to the situation found on most public, permissionless ledgers. Many public, permissionless ledgers utilize an account-based abstraction for balances, resulting in a default state that maps onto the most harmful models of agent interaction enumerated in "Money is Privacy". The conclusion is threefold: that most blockchain economies lack a cash-like primitive; that stablecoins do not intrinsically fulfil this role; and that the reliance of a network on an endogenous token for security exposes holders even of a privacy-preserving asset to the same risk, if that asset relies on the same global ledger state as the endogenous token.

2026-05-15T13:15:32Z 20 pages, 2 tables Alex Lynham Geoffrey Goodell http://arxiv.org/abs/2506.22440v2 From Model Design to Organizational Design: Complexity Redistribution and Trade-Offs in Generative AI 2026-05-15T13:05:40Z

This paper introduces the Generality-Accuracy-Simplicity (GAS) framework to analyze how large language models (LLMs) are reshaping organizations and competitive strategy. We argue that viewing AI as a simple reduction in input costs overlooks two critical dynamics: (a) the inherent trade-offs among generality, accuracy, and simplicity, and (b) the redistribution of complexity across stakeholders. While LLMs appear to defy the traditional trade-off by offering high generality and accuracy through simple interfaces, this user-facing simplicity masks a significant shift of complexity to infrastructure, compliance, and specialized personnel. The GAS trade-off, therefore, does not disappear but is relocated from the user to the organization, creating new managerial challenges, particularly around accuracy in high-stakes applications. We contend that competitive advantage no longer stems from mere AI adoption, but from mastering this redistributed complexity through the design of abstraction layers, workflow alignment, and complementary expertise. This study advances AI strategy by clarifying how scalable cognition relocates complexity and redefines the conditions for technology integration.

2025-06-10T15:22:09Z Sharique Hasan Alexander Oettl Sampsa Samila http://arxiv.org/abs/2412.05887v3 An Overview of Cyber Security Funding for Open Source Software 2026-05-15T11:45:48Z

Many open source software (OSS) projects need more human resources for maintenance, improvements, and sometimes even their survival. These needs allegedly apply even to vital OSS projects that can be seen as being a part of the world's critical infrastructures. To address this resourcing problem, new funding instruments for OSS projects have been established in recent years. The paper examines two such funding bodies for OSS and the projects they have funded. The focus of both funding bodies is on software security and cyber security in general. Based on qualitative thematic analysis, the results indicate that particularly OSS supply chains, network and cryptography libraries, programming languages, and operating systems and their low-level components have been funded and thus seen as critical in terms of cyber security. In addition to the qualitative results presented, the paper makes a contribution by connecting the research branches of critical infrastructure and sustainability of OSS projects. A further contribution is made by connecting the topic examined to recent cyber security regulations. Finally, an important argument is raised that neither cyber security nor project sustainability alone can entirely explain the rationales behind the funding decisions made by the two funding bodies.

2024-12-08T10:48:30Z Proceedings of the 7th International Workshop on Engineering and Cybersecurity of Critical Systems (EnCyCriS 2026), Rio de Janeiro, ACM, pp. 18-25 Jukka Ruohonen Gaurav Choudhary Adam Alami 10.1145/3786160.3788466 http://arxiv.org/abs/2604.20127v2 Trajectory-Aware Reliability Modeling of Democratic Systems 2026-05-15T10:51:51Z

Failures in complex systems often emerge through gradual degradation and the propagation of stress across interacting components rather than through isolated shocks. Democratic systems exhibit similar dynamics, where weakening institutions can trigger cascading deterioration in related institutional structures. Traditional reliability and survival models typically estimate failure risk based on the current system state but do not explicitly capture how degradation propagates through institutional networks over time. This paper introduces a trajectory-aware reliability modeling framework based on Dynamic Causal Neural Autoregression (DCNAR). The framework first estimates a causal interaction structure among institutional indicators and then models their joint temporal evolution to generate forward trajectories of system states. Failure risk is defined as the probability that predicted trajectories cross predefined degradation thresholds within a fixed horizon. Using longitudinal institutional indicators, we compare DCNAR-based trajectory risk models with discrete-time hazard and Cox proportional hazards models. Results show that trajectory-aware modeling consistently outperforms Cox models and improves risk prediction for several propagation-driven institutional failures. These findings highlight the importance of modeling dynamic system interactions for reliability analysis and early detection of systemic degradation.
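The risk definition in the abstract above (the probability that predicted trajectories cross a predefined degradation threshold within a fixed horizon) can be illustrated with a minimal Monte Carlo sketch. It does not implement DCNAR: the forward model is a hypothetical AR(1) process with Gaussian noise, and `simulate_trajectory`, `crossing_risk`, and every numeric value are assumptions chosen for illustration only.

```python
import random


def simulate_trajectory(x0, phi, drift, noise_sd, horizon, rng):
    """Simulate one forward path of a single indicator.

    Hypothetical stand-in for a learned forward model: a plain AR(1)
    process x[t+1] = drift + phi * x[t] + noise.
    """
    path = []
    x = x0
    for _ in range(horizon):
        x = drift + phi * x + rng.gauss(0.0, noise_sd)
        path.append(x)
    return path


def crossing_risk(x0, threshold, horizon, n_paths=2000, phi=0.95,
                  drift=0.0, noise_sd=0.05, seed=0):
    """Estimate P(indicator falls below `threshold` within `horizon` steps)."""
    rng = random.Random(seed)
    crossed = 0
    for _ in range(n_paths):
        path = simulate_trajectory(x0, phi, drift, noise_sd, horizon, rng)
        # A path counts as a failure if it dips below the threshold at any step.
        if min(path) < threshold:
            crossed += 1
    return crossed / n_paths


if __name__ == "__main__":
    # Toy numbers only: an indicator starting at 0.7, failure threshold at 0.5.
    print(crossing_risk(x0=0.7, threshold=0.5, horizon=5))
    print(crossing_risk(x0=0.7, threshold=0.5, horizon=20))
```

The point of the sketch is the contrast with a state-based hazard model: the estimate depends on the whole simulated path up to the horizon, not only on the current value of the indicator.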

2026-04-22T02:47:31Z Dmitry Zaytsev Valentina Kuskova Michael Coppedge http://arxiv.org/abs/2603.18221v2 Scalable and Personalized Oral Assessments Using Voice AI 2026-05-15T05:48:09Z

Students in our AI/ML course submitted polished, well-argued project analyses. Then, in class discussion, we asked them to walk through a single choice from their own work. Many could not. The writing looked great. The understanding often wasn't. Oral examinations retain an evidentiary link where written work no longer does: a student who can reason aloud, defend a decision under follow-up, and adapt when pushed demonstrates something no submitted document can certify. The obstacle has always been cost. A 25-minute oral reviewed by two graders takes roughly 30 combined instructor and TA hours for 36 students; at 100 the format is untenable. Voice AI and automated grading change the arithmetic. We built Viva, a system that conducts a personalized oral exam, then grades the transcript with a panel of three LLMs that score independently, read each other's assessments, and revise. Across two undergraduate cohorts at NYU Stern (36 students in Fall 2025, 37 in Spring 2026), grading-LLM cost stayed under one dollar per exam within the ElevenLabs subscription covering our voice minutes; for deployments exceeding an equivalent credit pool, budget about a dollar per ten minutes of graded exam time, practical for weekly assignments, not just finals. The system also broke instructively: the agent asked several questions at once, failed to randomize topics across the cohort, and a voice cloned from the professor's came across as harsh, replaced in Spring 2026 with a calm preset. These failures, with an earlier finding that a monolithic agent handling both examination and grading proved unreliable, point to five candidate transferable patterns: decompose into single-purpose modules, constrain behavior with code rather than prompts, keep randomization out of the LLM, grade with a multi-model panel whose members disagree, and choose voice characteristics with the same care as question design.
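Two of the patterns listed above, keeping randomization out of the LLM and grading with a multi-model panel, can be sketched in a few lines. This is not Viva's code: `assign_topics`, `panel_grade`, the round-robin assignment, the mean aggregation, and the `max_spread` cutoff are all hypothetical choices, and the sketch omits the step in which graders read each other's assessments and revise.

```python
import random
from statistics import mean


def assign_topics(student_ids, topics, seed):
    """Assign each student a topic in code, so the LLM never picks.

    Topics are dealt round-robin from a shuffled list, which spreads
    them evenly across the cohort; the seed makes the draw reproducible.
    """
    rng = random.Random(seed)
    order = list(topics)
    rng.shuffle(order)
    return {sid: order[i % len(order)]
            for i, sid in enumerate(sorted(student_ids))}


def panel_grade(scores, max_spread=1.0):
    """Combine independent grader scores and flag disagreement.

    `scores` maps a grader name to its numeric score. The mean and the
    `max_spread` cutoff are illustrative, not the paper's rule.
    """
    values = list(scores.values())
    spread = max(values) - min(values)
    return {
        "score": mean(values),
        "spread": spread,
        "needs_review": spread > max_spread,
    }


if __name__ == "__main__":
    print(assign_topics(["s1", "s2", "s3", "s4"], ["bias", "overfitting"], seed=7))
    print(panel_grade({"model_a": 4, "model_b": 4, "model_c": 5}))
```

Doing the draw in ordinary code makes the topic distribution auditable, and surfacing the spread between graders turns panel disagreement into a signal for human review rather than something averaged away.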

2026-03-18T19:09:06Z Panos Ipeirotis Konstantinos Rizakos http://arxiv.org/abs/2605.15473v1 Validated Hypotheses as a Lens for Human-Likeness Evaluation in AI Agents 2026-05-14T23:25:02Z

We propose using validated behavioral hypotheses as a lens for evaluating human-likeness in LLM-based agents. Our key idea is simple: If an agent is human-like, a population of such agents should reach the same inferential conclusion as the human population when run through the same experiment. Decades of social science have produced many such validated findings, each anchored to concrete experimental protocols and robustly established through independent replication. This yields an evaluation that is objective, decomposable, and scalable. We operationalize this lens through HumanStudy-Bench, an open platform that turns published human-subject studies into reusable simulation environments and administers the evaluation to configurable agents. It scores agent-human alignment on two metrics: the Probability Alignment Score (PAS) for inferential agreement and the Effect Consistency Score (ECS) for effect-size agreement. We curated an initial suite of 12 studies whose hypotheses are robustly established through independent replication, and evaluated 10 models under 4 agent designs. Results show that agent responses polarize between full replication and complete failure; agent design influences alignment more than model scale, but its effect is non-monotonic.

2026-05-14T23:25:02Z Xuan Liu HaoYang Shang Zizhang Liu Yuanjun Feng Guankai Zhai Yunze Xiao Yiwen Tu Haojian Jin http://arxiv.org/abs/2605.15468v1 GreenZ: A Sustainable UX Framework for Complex Digital Systems 2026-05-14T23:15:56Z

Digital systems have become simultaneously more powerful and more wasteful. Features accumulate that nobody uses. Data is collected that nobody analyzes. AI is deployed at significant energy and water costs for gains that a simpler approach could have achieved. And through all of it, the people who depend on these systems quietly absorb the consequences in cognitive load, lost time, and eroded trust. This paper introduces GreenZ, a three-layer Sustainable UX Framework for complex digital systems. Its three layers are a Philosophy Layer built around ten published principles, an Operational Frameworks Layer comprising five applied systems, and a Tools and Canvases Layer of practical audit instruments and decision models. Two contributions sit at the framework's core: a Digital Waste Taxonomy classifying eight distinct waste types, and an AI Sufficiency Decision Model that asks whether AI should exist in a given flow before any question of how to implement it. GreenZ v1 is theoretically grounded but empirically unvalidated. A practitioner expert review study is underway at the time of submission. The paper presents the framework's architecture, its conceptual foundations, its position relative to existing literature, and an honest account of what remains to be established.

2026-05-14T23:15:56Z 8 pages, 1 figure, 4 tables. Framework preprint. Expert review study underway. v1 Trisha Solanki http://arxiv.org/abs/2606.12429v1 Muse Spark Safety & Preparedness Report 2026-05-14T23:12:14Z

Muse Spark is the latest large language model developed by Meta. In this report, we first present evaluations for catastrophic risk domains under Meta's Advanced AI Scaling Framework, along with the evidence that informed our launch decision. We then discuss additional considerations, such as Muse Spark's broader content safety and behavioral profile, that are relevant to overall safety but fall outside the catastrophic risk domains governed by the Framework. Our preparedness results covering Chemical and Biological, Cybersecurity, and Loss of Control risks assess Muse Spark's deployment within Meta AI as presenting acceptable levels of residual risks under our Advanced AI Scaling Framework. We conducted a broad set of evaluations targeting dual-use and high-risk capabilities across these catastrophic risk domains. Those evaluations identified elevated risks prior to mitigations, with Chemical and Biological capabilities assessed as likely reaching the "high risk" category under the Advanced AI Scaling Framework before safeguards were applied. We have implemented a multi-layered set of mitigations that address the identified risks, and Muse Spark demonstrates state-of-the-art refusal across a range of benchmarks related to hazardous workflows in chemistry and biology. We therefore release Muse Spark as the underlying model of Meta AI.

2026-05-14T23:12:14Z 159 pages, 57 figures Cristina Menghini Sail Peter Ney Sail Hamza Kwisaba Sail Zifan Sail Wang Miles Turpin Felix Binder Jean-Christophe Testud Aidan Boyd Nathaniel Li Ivan Evtimov Klaudia Krawiecka Arman Zharmagambetov Jeremy Kritz Alexander R. Fabbri Daniel Song Jinpeng Miao Joonas Hjelt Meghna Ramani Leona Lan Reza Aghajani Joanna Bitton Mahesh Pasupuleti Devin Norder Khalid El-Arini Paridhi Singh Vítor Albiero Sahana CB Rashnil Chaturvedi Elahe Dabir Edoardo Debenedetti Jim Gust Ziwen Han Kat He Sean Hendryx Lifeng Jin Polina Kirichenko Sandra Lefdal Kenneth Li Asad Liaqat Inna Lin Despoina Magka Neal Mangaokar Ishita Mediratta Zach Miller Smitha Milli Niloofar Mireshghallah Saba Nazir Hung Nguyen Maximilian Nickel Kelvin Niu Kerem Oktar Bhargavi Paranjape Parth Pathak Maya Pavlova Emmanuel Ramirez David Renardy Candace Ross Yasha Sheynin Claudia Shi Shivam Singhal Evangelia Spiliopoulou Rakshith Sharma Srinivasa Jamelle Watson-Daniels Spencer Whitman Adina Williams Chen Xing Andy Zou Tommy Ma Siqi Deng James Beldock Prashant Ratanchandani Kate Plawiak Taesung Lee Ryan Victory Lindsay Hundley Rachad Alao Himaghna Bhattacharjee Jianfeng Chi Gary Frost Pegah Ghahremani Niki Howe Yuheng Huang Saeed Jahed Hannah Korevaar Trang Le Zhe Liu Jinghong Luo Qin Lyu Nina Mehrabi Abraham Montilla Chirag Nagpal Cyrus Nikolaidis Rajvardhan Oak Manoj Ravi Vidya Sarma Aman Shankar Alana Shine Eric Michael Smith Mariana Tandon Michael Tontchev Caoyu Wang Zihan Wang Corinne Wong Zheng Wu Hongyuan Zhan Justin Zhao Zexuan Zhong Chengxu Zhuang Tristan Goodman Ayaz Minhas Harrison Rudolph Victoria Jeffries Ingrid Dickinson Alex Vaughan Lauren Deason Kamalika Chaudhuri Julian Michael Shengjia Zhao Summer Yue http://arxiv.org/abs/2601.21028v2 "Unlimited Realm of Exploration and Experimentation": Methods and Motivations of AI-Generated Sexual Content Creators 2026-05-14T21:05:40Z

AI-generated media is radically changing the way content is both consumed and produced on the internet, and in no place is this potentially more visible than in sexual content. AI-generated sexual content (AIG-SC) is increasingly enabled by an ecosystem of individual AI developers, specialized third-party applications, and foundation model providers. AIG-SC raises a number of concerns from older debates about the line between pornography and obscenity to newer debates about fair use and labor displacement (in this case, of sex workers), and has spurred new regulations to curb the spread of non-consensual intimate imagery (NCII) created using the same technology used to create AIG-SC. However, despite the growing prevalence of AIG-SC, little is known about its creators, their motivations, and what types of content they produce. To inform effective governance in this space, we conducted an in-depth study to understand what AIG-SC creators make, along with how and why they make it. Interviews with 28 AIG-SC creators, ranging from hobbyists to entrepreneurs to those who moderate communities of hundreds of thousands of other creators, revealed a wide spectrum of motivations, including sexual exploration, creative expression, technical experimentation, and in a handful of cases, the creation of NCII.

2026-01-28T20:43:25Z Jaron Mink Lucy Qin Elissa M. Redmiles http://arxiv.org/abs/2605.15380v1 Eskwai for Students: Generative AI Assistant for Legal Education in Ghana 2026-05-14T20:10:32Z

Recent advances in generative AI have shown their potential to be leveraged for legal education. Yet, work on the development and deployment of such systems for legal education in the Global South is limited. In this work, we developed Eskwai for Students, a generative AI assistant to help law students with their legal education. Eskwai for Students is a retrieval-augmented generation (RAG) system that answers a wide range of legal questions for law students, grounded in a curated database of over 12K cases and 1.4K pieces of legislation from Ghana. We deployed Eskwai for Students in a 30-month (2.5-year) longitudinal study, during which 3.1K law students in Ghana made 32K queries. We evaluated the helpfulness of our AI and provide insight into the kinds of queries law students submit to this generative AI tool, some of which raise ethical concerns. This work contributes to an understanding of how law students in the Global South are using generative AI for their studies and the ways it could be leveraged responsibly to advance legal education.

2026-05-14T20:10:32Z 10 pages. Accepted at the 27th International Conference on Artificial Intelligence in Education (AIED 2026) George Boateng Philemon Badu Patrick Agyeman-Budu Samuel Ansah Evans Atompoya Evan Igwilo Lord Baah Frederick Abu-Bonsrah Victor Wumbor-Apin Kumbol http://arxiv.org/abs/2605.15376v1 Adesua: Development and Feasibility Study of an AI WhatsApp Bot for Science Learning in West Africa 2026-05-14T20:04:39Z

Sub-Saharan Africa faces persistently high student-teacher ratios and shortages of qualified teachers, limiting students' access to personalized learning support and formative assessment. To address this challenge, we present Adesua, a WhatsApp-based AI Teaching Assistant for science education that extends the Kwame for Science platform. Adesua leverages WhatsApp's widespread adoption in Africa to provide accessible, curriculum-aligned learning support for Junior High School (JHS) and Senior High School (SHS) students across West Africa. The system integrates curated textbooks and 33 years of national examination questions with generative AI to enable conversational question answering and automated assessment with feedback via a WhatsApp bot. Students can ask science questions, take timed or untimed multiple-choice tests by topic or exam year, and receive instant grading and detailed explanations of correct and incorrect responses. A 6-month feasibility deployment in 2025 had 56 active users in Ghana, including students and parents. Quantitative evaluation showed a high perceived usefulness, with a helpfulness score of 93.75% for AI-generated answers, albeit with a small number of ratings (n=16). These preliminary results provide a basis for more extensive future evaluation of a WhatsApp-based AI assistant to assess its potential to offer scalable, low-cost personalized learning support and formative assessment in resource-constrained educational contexts.
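To make the small-sample caveat above concrete, the sketch below computes a Wilson score interval for the helpfulness figure. It assumes the 93.75% score can be read as 15 helpful ratings out of n=16, which the abstract does not state explicitly; the function name `wilson_interval` and the 95% level are likewise choices made here for illustration.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Assumption: 93.75% of n=16 ratings is read as 15 helpful ratings.
    low, high = wilson_interval(15, 16)
    print(f"{low:.3f} to {high:.3f}")
```

Under that reading, the interval is wide relative to the point estimate, which is consistent with the abstract's framing of these results as preliminary.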

2026-05-14T20:04:39Z 11 pages. Accepted at the 27th International Conference on Artificial Intelligence in Education (AIED 2026) George Boateng Evans Atompoya Philemon Badu Samuel John Samuel Ansah Patrick Agyeman-Budu Victor Wumbor-Apin Kumbol http://arxiv.org/abs/2606.12428v1 Mapping AI Programs in the U.S: A Status Report from Early 2026 and an Analysis of AI Majors and Minors 2026-05-14T20:01:25Z

We present a report on the status of undergraduate Artificial Intelligence (AI) programs in the United States in Spring 2026. In so doing, we 1) describe our scraping and mapping tools, which dynamically update to track the state of AI education in the U.S., and 2) create a historic record at a time of great upheaval. The tool we developed, available at https://cicmap.ai, detects, scrapes, and displays data from more than 350 undergraduate AI programs (majors, minors, concentrations, and certificates) at 4-year universities. Our tool searched over 560 institutions to locate these programs, a sample that represents 86% of all undergraduate Computer Science (CS) graduates in the U.S. This tool allows prospective students, guidance counselors, administrators, and faculty to easily access AI program requirements and is designed to continually update as new programs emerge. To the best of our knowledge, this survey represents the most comprehensive snapshot of the state of AI programs in the U.S. to date. With this work we offer three important contributions: 1) a record of AI programs in the U.S. at a time of great upheaval; 2) a tool to explore AI programs and their requirements; and 3) an analysis of the courses required for 66 AI majors and 87 AI minors. Our analysis of majors and minors shows great variability in the size and the requirements of these degrees, but we note two takeaways. First, not all majors require a general AI course, but if they don't, they do require a Machine Learning (ML) course. Second, while more than a third of majors require an Ethics in AI course, just under a quarter of AI minors do.

2026-05-14T20:01:25Z Felix Muzny Carolyn Jones Carter Ithier Hasnain Sikora Hrutika Harshadbhai Patel Carla E. Brodley http://arxiv.org/abs/2511.19115v2 AI Consciousness and Existential Risk 2026-05-14T19:57:37Z

In AI, existential risk denotes the hypothetical threat posed by an artificial system that would possess both the capability and the objective, either directly or indirectly, to eradicate humanity. This issue is gaining prominence in scientific debate due to recent technical advancements and increased media coverage. In parallel, AI progress has sparked speculation and studies about the potential emergence of artificial consciousness. The two questions, AI consciousness and existential risk, are sometimes conflated, as if the former entailed the latter. Here, I explain that this view stems from a common confusion between consciousness and intelligence. Yet these two properties are empirically and theoretically distinct. Arguably, while intelligence is a direct predictor of an AI system's existential threat, consciousness is not. There are, however, certain incidental scenarios in which consciousness could influence existential risk, in either direction. Consciousness could be viewed as a means towards AI alignment, thereby lowering existential risk; or, it could be a precondition for reaching certain capabilities or levels of intelligence, and thus positively related to existential risk. Recognizing these distinctions can help AI safety researchers and public policymakers focus on the most pressing issues.

2025-11-24T13:48:02Z Updated for clarity and completeness following peer-review Rufin VanRullen http://arxiv.org/abs/2606.12427v1 Planning on Paper: Problem Decomposition with Diagrams in Introductory Computing 2026-05-14T19:27:02Z

Background and Context. Problem decomposition is a core concern of computing education. It has also become increasingly relevant: in response to GenAI, many CS1 educators are advocating for shifting instructional emphasis away from code writing and towards decomposition and higher-level planning. Currently, little is known about how novices approach decomposition in large, multifunction tasks. Objectives. In this study, we describe how students represent solutions to a decomposition task, and characterize common issues that arise in those representations. Method. In a 50-minute lab, students were given a description of a word game and asked to draw (with pencil and paper) a decomposition diagram for a program that would implement this game. We performed an inductive thematic analysis with negotiated agreement on 55 of the diagrams, coding salient elements (e.g. functions and the relationships between them) and issues that arose. Findings. Students used multiple representational strategies, including hierarchical function calls and sequencing (order of execution). We identified issues in notation (including use of differing, incompatible notations within the same diagram), order of execution, abstraction and reuse, encapsulation, clarity, and problem-specific misunderstandings. Implications. These findings suggest that novice decomposition is shaped by multiple underlying models of program behavior, with tensions between structural and sequence-focused reasoning. We discuss implications for decomposition instruction and future work, including clarifying representational constraints and plan tracing as simulation.

2026-05-14T19:27:02Z International Computing Education Conference (ICER) Annapurna Vadaparty Devamardeep Hayatpur Adalbert Gerald Soosai Raj Leo Porter Daniel Zingaro http://arxiv.org/abs/2605.15312v1 Beyond Performance Disparities: A Three-Level Audit of Representational Harm in CelebA 2026-05-14T18:25:17Z

Large-scale facial datasets like CelebA are widely used in computer vision, yet the cultural biases embedded in their labels remain underexplored. Fairness research has distinguished representational from allocational harms, but audits of computer vision datasets have mostly examined categorical labels, leaving open how such harms appear in learned features and model attention. This paper examines CelebA at three levels: dataset structure, learned feature weights, and spatial attention, focusing on how gendered double standards of ageing and beauty are encoded in the data and reproduced in model behaviour. First, hierarchical clustering of 202,599 images shows that the 39 attributes organise into latent trait bundles aligned with cultural archetypes: performative femininity (youth, makeup, adornment) and professional masculinity (ageing, facial hair, formal attire). Female faces, though more often rated attractive overall, incur steep penalties when assigned to ageing or masculine-coded clusters. Second, XGBoost with SHAP analysis reveals gender-specific effects, such as adiposity reducing attractiveness only for females. Third, Grad-CAM finds that predictions for female and younger male subgroups concentrate on mid-face cues, whereas predictions for older males drift toward peripheral cues such as hair and clothing. Older males attain the highest accuracy but the lowest average precision, indicating categorical exclusion of groups outside the dataset's evaluative templates. Cultural double standards thus pass from media representation into dataset labels, feature weights, and model attention, producing two representational harms: hyper-scrutiny of women under a narrow evaluative template, and exclusion of older men from the scheme entirely. Fairness metrics focused on performance disparities mask both, underscoring the need to address representational harm in fairness research.
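The combination of high accuracy and low average precision noted above can arise whenever positive labels are rare within a subgroup. The sketch below shows the mechanism on made-up labels and scores. It uses no CelebA data and is not the paper's analysis; the helper names, the 0.5 decision threshold, and the toy values are assumptions for illustration.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def average_precision(labels, scores):
    """Mean of precision-at-rank over the ranks where a positive appears."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


if __name__ == "__main__":
    # Toy subgroup: one positive among ten samples, ranked fourth by score,
    # with every score below the decision threshold.
    labels = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
    scores = [0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.08, 0.05, 0.02]
    print(accuracy(labels, scores))           # 0.9: predicting "negative" for all
    print(average_precision(labels, scores))  # 0.25: the positive is ranked 4th
```

In this toy case the classifier is right nine times out of ten simply by never predicting the positive class, while the ranking metric exposes that it cannot pick out the one positive example. That is why a comparison based on accuracy alone can hide the exclusion of a subgroup.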

2026-05-14T18:25:17Z 15 pages, 8 figures Sieun Park Yuanmo He