https://arxiv.org/api/lhAXdxzLCd6xYUXW23IhDDxObfY2026-03-28T10:40:36Z274704515http://arxiv.org/abs/2603.23474v1Evidence of political bias in search engines and language models before major elections2026-03-24T17:39:34ZSearch engines (SEs) and large language models (LLMs) are central to political information access, yet their algorithmic decisions and potential underlying biases remain underexplored. We developed a standardized, privacy-preserving, bot-and-proxy methodology to audit four SEs and two LLMs before the 2024 European Parliament and US presidential elections. We collected answers to approximately 4,360 queries related to elections in five EU countries and 15 US counties, identified political entities and topics in those answers, and mapped them to ideological positions (EU) or issue associations (US). In Europe, SE results disproportionately mentioned far-right entities beyond levels expected from polls, past elections, or media salience. In the US, Google strongly favored topics more important to Republican voters, while other search engines favored issues more relevant to Democrats. LLM responses were more balanced, although there is evidence of overrepresentation of far-right (and Green) entities. These results show evidence of bias and open important discussions on how even small skews in widely used platforms may influence democratic processes, calling for systematic audits of their outputs.2026-03-24T17:39:34Z20 pages, 4 figures; Supplementary Information : Page 22 - 74Íris DamiãoPaulo AlmeidaJoão FrancoNuno SantosPedro C. MagalhãesJoana Gonçalves-Sáhttp://arxiv.org/abs/2603.23471v1Regulating AI Agents2026-03-24T17:38:27ZAI agents -- systems that can independently take actions to pursue complex goals with only limited human oversight -- have entered the mainstream. These systems are now being widely used to produce software, conduct business activities, and automate everyday personal tasks. 
While AI agents implicate many areas of law, ranging from agency law and contracts to tort liability and labor law, they present particularly pressing questions for the most globally consequential AI regulation: the European Union's AI Act. Promulgated prior to the development and widespread use of AI agents, the EU AI Act faces significant obstacles in confronting the governance challenges arising from this transformative technology, such as performance failures in autonomous task execution, the risk of misuse of agents by malicious actors, and unequal access to the economic opportunities afforded by AI agents. We systematically analyze the EU AI Act's response to these challenges, focusing on both the substantive provisions of the regulation and, crucially, the institutional frameworks that aim to support its implementation. Our analysis of the Act's allocation of monitoring and enforcement responsibilities, reliance on industry self-regulation, and level of government resourcing illustrates how a regulatory framework designed for conventional AI systems can be ill-suited to AI agents. Taken together, our findings suggest that policymakers in the EU and beyond will need to change course, and soon, if they are to effectively govern the next generation of AI technology.2026-03-24T17:38:27ZKathrin GardhouseAmin OueslatiNoam Kolthttp://arxiv.org/abs/2603.23415v1Integrating GenAI in Filmmaking: From Co-Creativity to Distributed Creativity2026-03-24T16:49:09ZThe integration of Generative AI (GenAI) into audio-visual production is often presented as a radical break from past traditions. However, through a sociomaterial and historical lens, this paper argues that GenAI represents a new development in the long-standing negotiation between creative labor and technological possibilities. Moving beyond the limiting framework of human-machine co-creativity, we adopt an STS-based approach to investigate creativity in the making within the Filmmaking industry. 
We analyze Filmmaking as a distributed process where agency is shared across diverse human experts and non-human actors, showing how technological innovations have historically reconfigured Filmmaking practices long before the advent of AI. The article introduces an analytical taxonomy of GenAI techniques to illustrate how these technologies do not merely "assist" but can actively reconfigure professional roles, production temporalities, and film aesthetics. By linking sociomaterial configurations to aesthetic outcomes, this reframing suggests that AI technologies in Filmmaking should be seen as mediators that could enable new aesthetic possibilities by blurring the boundaries of traditional filmmaking workflows.2026-03-24T16:49:09Z33 pagesPierluigi MasaiLorenzo CartaMateusz Miroslaw Lishttp://arxiv.org/abs/2603.15998v2NLP Occupational Emergence Analysis: How Occupations Form and Evolve in Real Time -- A Zero-Assumption Method Demonstrated on AI in the US Technology Workforce, 2022-20262026-03-24T15:30:16ZOccupations form and evolve faster than classification systems can track. We propose that a genuine occupation is a self-reinforcing structure (a bipartite co-attractor) in which a shared professional vocabulary makes practitioners cohesive as a group, and the cohesive group sustains the vocabulary. This co-attractor concept enables a zero-assumption method for detecting occupational emergence from resume data, requiring no predefined taxonomy or job titles: we test vocabulary cohesion and population cohesion independently, with ablation to test whether the vocabulary is the mechanism binding the population. Applied to 8.2 million US resumes (2022-2026), the method correctly identifies established occupations and reveals a striking asymmetry for AI: a cohesive professional vocabulary formed rapidly in early 2024, but the practitioner population never cohered. 
The pre-existing AI community dissolved as the tools went mainstream, and the new vocabulary was absorbed into existing careers rather than binding a new occupation. AI appears to be a diffusing technology, not an emerging occupation. We discuss whether introducing an "AI Engineer" occupational category could catalyze population cohesion around the already-formed vocabulary, completing the co-attractor.2026-03-16T23:17:10ZThis manuscript has been withdrawn by the authors pending internal review and substantial revisionDavid Nordforshttp://arxiv.org/abs/2603.23322v1Leveraging LLMs and Social Media to Understand User Perception of Smartphone-Based Earthquake Early Warnings2026-03-24T15:24:33ZAndroid's Earthquake Alert (AEA) system provided timely early warnings to millions during the Mw 6.2 Marmara Ereglisi, Türkiye earthquake on April 23, 2025. This event, the largest in the region in 25 years, served as a critical real-world test for smartphone-based Earthquake Early Warning (EEW) systems. The AEA system successfully delivered alerts to users with high precision, offering over a minute of warning before the strongest shaking reached urban areas. This study leveraged Large Language Models (LLMs) to analyze more than 500 public social media posts from the X platform, extracting 42 distinct attributes related to user experience and behavior. Statistical analyses revealed significant relationships, notably a strong correlation between user trust and alert timeliness. Our results indicate a distinction between the engineering and the user-centric definitions of system accuracy. We found that timeliness is accuracy in the user's mind. Overall, this study provides actionable insights for optimizing alert design, public education campaigns, and future behavioral research to improve the effectiveness of such systems in seismically active regions.2026-03-24T15:24:33ZHanjing WangS. Mostafa MousaviPatrick RobertsonRichard M. 
AllenAlexie BarskiRobert BoschNivetha ThiruverahanYoungmin ChoTajinder GadhSteve MalkosBoone SpoonerGreg WimpeyMarc Stogaitishttp://arxiv.org/abs/2603.23315v1Unilateral Relationship Revision Power in Human-AI Companion Interaction2026-03-24T15:18:48ZWhen providers update AI companions, users report grief, betrayal, and loss. A growing literature asks whether the norms governing personal relationships extend to these interactions. So what, if anything, is morally significant about them? I argue that human-AI companion interaction is a triadic structure in which the provider exercises constitutive control over the AI. I identify three structural conditions of normatively robust dyads that the norms characteristic of personal relationships presuppose and show that AI companion interactions fail all three. This reveals what I call Unilateral Relationship Revision Power (URRP): the provider can rewrite how the AI interacts from a position where these revisions are not answerable within that interaction. I argue that designing interactions that exhibit URRP is pro tanto wrong because it involves cultivating normative expectations while maintaining conditions under which those expectations cannot be fulfilled. URRP has three implications: i) normative hollowing (commitment is elicited but no agent inside the interaction bears it), ii) displaced vulnerability (the user's exposure is governed by an agent not answerable to her within the interaction), and iii) structural irreconcilability (when trust breaks down, reconciliation is structurally unavailable because the agent who acted and the entity the user interacts with are different). I discuss design principles such as commitment calibration, structural separation, and continuity assurance as external substitutes for the internal constraints the triadic structure removes. 
The analysis therefore suggests that a central and underexplored problem in relational AI ethics is the structural arrangement of power over the human-AI interaction itself.2026-03-24T15:18:48Z42 pagesBenjamin Langehttp://arxiv.org/abs/2603.23273v1Systemic Gendered Citation Imbalance in Computer Science: Evidence from Conferences and Journals2026-03-24T14:37:34ZGender imbalance persists across science, technology, engineering, and mathematics (STEM) fields, including computer science, where it appears in researcher demographics, productivity, recognition, hiring, and career progression. Given computer science's rapid expansion and global influence, addressing this imbalance is essential for broadening participation and fueling innovation. Although journal-oriented disciplines exhibit consistent gender imbalances in citation practices, it remains unclear whether similar patterns arise in the conference-centric culture of computer science. Here, we systematically investigate gender imbalance in citations of conference and journal papers in computer science. We find that papers for which a woman is listed as either first or last author receive fewer citations than expected, partly because of homophilic citation tendencies (i.e., authors tend to cite papers that share specific attributes). This imbalance is especially pronounced for conference papers--particularly those published at top-tier venues--relative to journals. Moreover, we find that the prominence of the first or last author and the structure of their local co-authorship networks are potential drivers of these imbalances. By exploring how conference-centric publishing practices can amplify systemic imbalances in computer science, our study offers insights that may inform efforts to foster more equitable representation in academia.2026-03-24T14:37:34ZAccepted for publication in Scientometrics. 31 pages, 7 figures, 3 tables. 
Includes Supplementary InformationKazuki NakajimaYuya SasakiSohei TokunoGeorge Fletcherhttp://arxiv.org/abs/2410.03532v5Understanding the Key Factors Influencing Continued Use Intention Toward Intangible Cultural Heritage (ICH)-Themed Virtual Reality Games2026-03-24T14:22:01ZIntangible Cultural Heritage (ICH) faces critical challenges in the digital age, including reduced public engagement, restricted accessibility, and difficulties in communicating complex cultural practices to modern audiences. Virtual Reality (VR) games present promising opportunities for ICH preservation and transmission, yet little is known about factors shaping their user acceptance. This study introduces a VR game centered on the Qinhuai Lantern Festival, a representative ICH case. We extend the Technology Acceptance Model (TAM) by incorporating sensory, emotional, and cultural dimensions as external variables, offering a framework for examining user acceptance of ICH-oriented VR applications. We conduct a survey with 299 respondents and apply structural equation modeling. Findings show that sensory experience significantly enhances both perceived usefulness (beta = 0.401, p < 0.001) and cultural experience (beta = 0.523, p < 0.001), while emotional experience strongly predicts positive attitudes (beta = 0.428, p < 0.001) and emotional loyalty (beta = 0.517, p < 0.001). Moreover, sensory, emotional, and cultural dimensions positively influence users' attitudes and behavioral intentions. The findings provide practical guidelines for the design of future ICH-based VR games.2024-09-14T02:57:12ZThis paper is withdrawn due to issues identified in the methodology and experimental evaluation, which may affect the validity of the results. 
The authors are preparing a substantially revised versionYuanfang LiuGuanghong XieJunya ZhiWenrui ZuoRua Mae Williamshttp://arxiv.org/abs/2603.23577v1The Geometric Price of Discrete Logic: Context-driven Manifold Dynamics of Number Representations2026-03-24T13:41:57ZLarge language models (LLMs) generalize smoothly across continuous semantic spaces, yet strict logical reasoning demands the formation of discrete decision boundaries. Prevailing theories relying on linear isometric projections fail to resolve this fundamental tension. In this work, we argue that task context operates as a non-isometric dynamical operator that enforces a necessary "topological distortion." By applying Gram-Schmidt decomposition to residual-stream activations, we reveal a dual-modulation mechanism driving this process: a class-agnostic topological preservation that anchors global structure to prevent semantic collapse, and a specific algebraic divergence that directionally tears apart cross-class concepts to forge logical boundaries. We validate this geometric evolution across a gradient of tasks, from simple mapping to complex primality testing. Crucially, targeted specific vector ablation establishes a strict causal binding between this topology and model function: algebraically erasing the divergence component collapses parity classification accuracy from 100% to chance levels (38.57%). Furthermore, we uncover a three-phase layer-wise geometric dynamic and demonstrate that under social pressure prompts, models fail to generate sufficient divergence. This results in a "manifold entanglement" that geometrically explains sycophancy and hallucination. 
Ultimately, our findings revise the linear-isometric presumption, demonstrating that the emergence of discrete logic in LLMs is purchased at an irreducible cost of topological deformation.2026-03-24T13:41:57ZLong ZhangDai-jun LinWei-neng Chenhttp://arxiv.org/abs/2603.23171v1Robust Safety Monitoring of Language Models via Activation Watermarking2026-03-24T13:13:23ZLarge language models (LLMs) can be misused to reveal sensitive information, such as weapon-making instructions or writing malware. LLM providers rely on $\emph{monitoring}$ to detect and flag unsafe behavior during inference. An open security challenge is $\emph{adaptive}$ adversaries who craft attacks that simultaneously (i) evade detection while (ii) eliciting unsafe behavior. Adaptive attackers are a major concern as LLM providers cannot patch their security mechanisms, since they are unaware of how their models are being misused. We cast $\emph{robust}$ LLM monitoring as a security game, where adversaries who know about the monitor try to extract sensitive information, while a provider must accurately detect these adversarial queries at low false positive rates. Our work (i) shows that existing LLM monitors are vulnerable to adaptive attackers and (ii) designs improved defenses through $\emph{activation watermarking}$ by carefully introducing uncertainty for the attacker during inference. We find that $\emph{activation watermarking}$ outperforms guard baselines by up to $52\%$ under adaptive attackers who know the monitoring algorithm but not the secret key.2026-03-24T13:13:23Z20 pages, 17 figuresToluwani AremuDaniil OgnevSamuele PoppiNils Lukashttp://arxiv.org/abs/2603.23114v1Between Rules and Reality: On the Context Sensitivity of LLM Moral Judgment2026-03-24T12:08:16ZA human's moral decision depends heavily on the context. Yet research on LLM morality has largely studied fixed scenarios. 
We address this gap by introducing Contextual MoralChoice, a dataset of moral dilemmas with systematic contextual variations known from moral psychology to shift human judgment: consequentialist, emotional, and relational. Evaluating 22 LLMs, we find that nearly all models are context-sensitive, shifting their judgments toward rule-violating behavior. Comparing with a human survey, we find that models and humans are most triggered by different contextual variations, and that a model aligned with human judgments in the base case is not necessarily aligned in its contextual sensitivity. This raises the question of controlling contextual sensitivity, which we address with an activation steering approach that can reliably increase or decrease a model's contextual sensitivity.2026-03-24T12:08:16ZpreprintAdrian SauterMona Schirmerhttp://arxiv.org/abs/2603.23107v1Network Analysis of the Egyptian Reddit Community2026-03-24T11:57:52ZThis paper presents a network analysis of the Reddit community focused on Egypt. We collected and constructed a comprehensive dataset consisting of 23,185 users and 105 Egyptian subreddits. Through network analysis criteria such as degree analysis, degree distribution analysis, and clustering coefficient analysis, we explored the structural properties, connectivity patterns, and local clustering within the Egyptian Reddit network. The findings provide insights into the community dynamics, influential users, and information flow within the network. Our study contributes to a better understanding of online communities in the context of Egypt and sheds light on the relationships and interactions within the Egyptian Reddit community. By leveraging network analysis techniques, we uncover the importance of individual nodes, the distribution of node degrees, and the formation of tightly knit groups. 
This study contributes significantly to the understanding of online communities specific to Egypt, shedding light on relationships and interactions within the Egyptian Reddit community.2026-03-24T11:57:52Z12 pages, 7 figures, 14 tables. Conference paper submitted to ICINCOSamy ShaawatAdham HammadKarim FarhatMina ThabetWalid Gomaahttp://arxiv.org/abs/2603.23063v1Machine Learning Models for the Early Detection of Burnout in Software Engineering: a Systematic Literature Review2026-03-24T10:58:34ZBurnout is an occupational syndrome that, as in many other professions, affects the majority of software engineers. Past research studies showed important trends, including an increasing use of machine learning techniques to allow for an early detection of burnout.
This paper is a systematic literature review (SLR) of research papers that propose machine learning (ML) approaches for detecting burnout in software developers and IT professionals. Our objective is to review the accuracy and precision of the proposed ML techniques and to formulate recommendations for future researchers interested in replicating or extending those studies.
From our SLR, we observed that the majority of primary studies focus on detecting emotions or utilise emotional dimensions to detect or predict the presence of burnout. We also performed a cross-sectional study to determine which ML approach performs better at detecting emotions, and which dataset has more potential and expressivity to capture emotions.
We believe that, by identifying which ML tools and datasets perform better at detecting emotions, and indirectly at identifying burnout, our paper can be a valuable asset for progress in this important research direction.2026-03-24T10:58:34ZThis paper is under reviewTien Rahayu TuliliAyushi RastogiAndrea Capiluppihttp://arxiv.org/abs/2507.00026v2RedTopic: Toward Topic-Diverse Red Teaming of Large Language Models2026-03-24T09:55:48ZAs large language models (LLMs) are increasingly deployed as black-box components in real-world applications, red teaming has become essential for identifying potential risks. It tests LLMs with adversarial prompts to uncover vulnerabilities and improve safety alignment. Ideally, effective red teaming should be adaptive to evolving LLM capabilities and explore a broad range of harmful topics. However, existing approaches face two limitations: 1) topic-based approaches rely on pre-collected harmful topics, which limits their flexibility and adaptivity; 2) topic-free methods use reinforcement learning (RL), but they lack an explicit reward signal for exploration and tend to over-optimize a narrow objective, reducing topic diversity. To address these limitations, we propose RedTopic, a novel red teaming framework that generates topic-diverse adversarial prompts through a contextualized generation pipeline, an aggregate reward design, and a multi-objective RL training loop. Experiments show that RedTopic produces more effective and diverse adversarial prompts than existing methods, with notable improvements in integrated evaluation metrics. 
We believe RedTopic represents a step toward more adaptive and topic-diverse red teaming for large language models.2025-06-17T10:55:17ZJiale DingXiang ZhengYutao WuCong WangWei-Bin LeeLing PanXingjun MaYu-Gang Jianghttp://arxiv.org/abs/2603.23569v1Trends in Equal-Contribution Authorship: A Large-Scale Bibliometric Analysis of Biomedical Literature2026-03-24T09:33:54ZEqual-contribution authorship, in which two or more authors are designated as having contributed equally, is increasingly common in scientific publishing. Using approximately 480,000 tagged records from PubMed and PMC (2010-2024), we examine temporal trends, journal-level patterns, geographic distributions, and byline positions of equal-contributing authors. Results show a sharp rise after 2017, with both high-output mega-journals and smaller, discipline-specific journals contributing to the growth. Journal-level analysis indicates a median increase in the share of tagged articles from about 19% in 2015 to over 30% in 2024, with some journals exceeding 50%. Geographically, China accounts for the largest share (40.8% of fractionalized contributions), followed by the United States (15.2%) and Germany (5.2%). Normalizing to 2015 baselines, China shows a 13.1x increase by 2024, while even the slowest-growing countries more than tripled their levels. Analysis of normalized byline positions shows that equal-contribution designations are concentrated near the first-author position, with fewer cases in middle or last positions. These findings document a broad shift toward shared first-author credit across journal sizes and regions within the biomedical literature and suggest that journals and evaluators may need to rely more on transparent contributorship information and to monitor the use of such labels over time.2026-03-24T09:33:54ZQuantitative Science Studies, 1-15 (2026)Binbin Xu10.1162/QSS.a.476