https://arxiv.org/api/x+Qv1XbSnCxZrIS0Rrh6UdbSc8A
2026-06-10T02:58:39Z

http://arxiv.org/abs/2605.08869v1
Horizontal and Longitudinal Comparisons Among AI Subfields: A Bibliometric Perspective
2026-05-09T10:42:48Z
Artificial intelligence has recently developed rapidly with significant interdisciplinary expansion, yet existing studies often treat it as a whole, lacking systematic long-term subfield comparisons and structural analyses, thereby limiting understanding of internal differences and evolutionary mechanisms. To address this gap, we employ bibliometric methods, using expert interviews and indicator screening to construct an analytical framework. Twelve bibliometric indicators are selected across three dimensions: Impact and Dissemination, Collaboration Characteristics, and Author Characteristics. We conduct horizontal and longitudinal analyses of five subfields (AI, CV, ML, NLP, Web&IR) from 2000 to 2024. Using CSRankings classification and a dataset of 106,622 papers, we apply violin plots, chord diagrams, and Sankey diagrams to characterize structural features and evolutionary paths. Results show that these subfields have entered high-intensity knowledge diffusion: academic impact increased, knowledge dissemination accelerated, external disciplinary reliance grew, and knowledge production shifted from closed accumulation to open, interdisciplinary, multi-actor networks. On this basis, subfields exhibit significant structural differentiation: CV leads in academic impact with a task-oriented trajectory; ML shows shrinking industry collaboration but concentrated international collaboration with a relatively dispersed structure; Web&IR is strongly industry-driven with a stable collaboration network; AI shows continuous growth; NLP remains relatively stable.
Overall, this study reveals artificial intelligence evolving from unified diffusion to structural differentiation, constructs an extensible multidimensional framework, and provides a quantitative approach for understanding complex technological field evolution.
2026-05-09T10:42:48Z
66 pages, 28 figures
Zeyu Li, Yalan Jin, Shuyu Chen, Tingxin Jiang, Xinyi Chang, Lu Yuan

http://arxiv.org/abs/2605.07723v1
LLM hallucinations in the wild: Large-scale evidence from non-existent citations
2026-05-08T13:26:41Z
Large language models (LLMs) are known to generate plausible but false information across a wide range of contexts, yet the real-world magnitude and consequences of this hallucination problem remain poorly understood. Here we leverage a uniquely verifiable object - scientific citations - to audit 111 million references across 2.5 million papers in arXiv, bioRxiv, SSRN, and PubMed Central. We find a sharp rise in non-existent references following widespread LLM adoption, with a conservative estimate of 146,932 hallucinated citations in 2025 alone. These errors are diffusely embedded across many papers but especially pronounced in fields with rapid AI uptake, in manuscripts with linguistic signatures of AI-assisted writing, and among small and early-career author teams. At the same time, hallucinated references disproportionately assign credit to already prominent and male scholars, suggesting that LLM-generated errors may reinforce existing inequities in scientific recognition. Preprint moderation and journal publication processes capture only a fraction of these errors, suggesting that the spread of hallucinated content has outpaced existing safeguards.
Together, these findings demonstrate that LLM hallucinations are infiltrating knowledge production at scale, threatening both the reliability and equity of future scientific discovery as human and AI systems draw on the existing literature.
2026-05-08T13:26:41Z
Zhenyue Zhao, Yihe Wang, Toby Stuart, Mathijs De Vaan, Paul Ginsparg, Yian Yin

http://arxiv.org/abs/2404.06500v3
The Rise and Fall of the Initial Era
2026-05-08T11:19:59Z
Bibliographic data is a rich source of information that goes beyond the use cases of location and citation -- it also encodes both cultural and technological context. For most of its existence, the scholarly record has changed slowly and hence provides an opportunity to gain insight through its reflection of the cultural norms of the research community over the last four centuries. While it is often difficult to distinguish the originating driver of change, it is still valuable to consider the motivating influences that have led to changes in the structure of the scholarly record. An "initial era" is identified during which initials were used in preference to full names by authors on scholarly communications. Causes of the emergence and demise of this era are considered as well as the implications of this era on research culture and practice.
2024-04-09T17:51:42Z
20 pages, 18 figures, updated references, iterations in response to reviewer comments on MetaROR
Simon J Porter, Daniel W Hook

http://arxiv.org/abs/2605.06935v1
Faculty mobility reallocates research capacity within persistent institutional hierarchies
2026-05-07T20:51:43Z
Faculty mobility is often understood as a mechanism through which universities redistribute scientific talent and potentially improve research performance. Yet the system-level structure of mobility and its association with individual research trajectories have rarely been examined together. We link longitudinal faculty rosters from U.S. 
research universities to OpenAlex publication records and study 11,535 tenure-system faculty members who changed institutions between 2011 and 2020, with a comparison group of more than 200,000 non-moving faculty members. A directed network of faculty moves reveals a strongly hierarchical market: high-prestige institutions are net importers, lower-prestige institutions are net exporters, and the mobility hierarchy closely parallels the hierarchy observed in faculty hiring. However, event-study models that account for pre-move trajectories show little evidence of sustained post-move gains in publication volume, citation impact, or top-cited publication rates, including among upward moves to more prestigious institutions. The most consistent post-move change is collaborative: movers form new coauthor ties. We also observe modest increases in the share of papers with positive CD-index values. These patterns suggest that faculty mobility primarily reallocates existing research capacity within a persistent institutional hierarchy rather than systematically altering individual research trajectories.
2026-05-07T20:51:43Z
Erjia Yan, Chaoqun Ni

http://arxiv.org/abs/2301.06196v5
Young Male and Female Scientists: A Quantitative Exploratory Study of the Changing Demographics of the Global Scientific Workforce
2026-05-07T18:54:31Z
In this study, the global scientific workforce is explored through large-scale, generational, cross-sectional, and longitudinal approaches. We examine 4.3 million nonoccasional scientists from 38 OECD countries publishing in 1990-2021. Our interest is in the changing distribution of young male and female scientists over time across 16 STEMM (science, technology, engineering, mathematics, medicine) disciplines. We unpack the details of the changing scientific workforce using age groups. Some disciplines are already numerically dominated by women, and the change is fast in some and slow in other disciplines.
In one-third of disciplines, there are already more youngest female than male scientists. Across all disciplines combined, the majority of women are young women. And more than half of women scientists (55.02%) are located in medicine. The usefulness of global bibliometric data sources in analyzing the scientific workforce along gender, age, discipline, and time is tested. Traditional aggregated data about scientists in general hide a nuanced picture of the changing gender dynamics within and across disciplines and age groups. The limitations of bibliometric datasets are explored, and global studies are compared with national-level studies. The methodological choices and their implications are shown, and new opportunities for how to study scientists globally are discussed.
2023-01-15T21:53:29Z
40 pages, 7 tables, 12 figures
Marek Kwiek, Lukasz Szymula

http://arxiv.org/abs/2404.16835v3
Quantifying Lifetime Productivity Changes: A Longitudinal Study of 320,000 Late-Career Scientists
2026-05-07T18:47:30Z
The present study focuses on persistence in research productivity over the course of an individual's entire scientific career. We track 'late-career' scientists - scientists with at least 25 years of publishing experience (N=320,564) - in 16 STEMM (science, technology, engineering, mathematics, and medicine) and social science disciplines from 38 OECD countries for up to five decades. Our OECD sample includes 79.42% of late-career scientists globally. We examine the details of their mobility patterns as early-career, mid-career, and late-career scientists between decile-based productivity classes, from the bottom 10% to top 10% of the productivity distribution. Methodologically, we turn a large-scale bibliometric dataset (Scopus raw data) into a comprehensive, longitudinal data source for research on careers in science.
The global science system is highly immobile: half of global top performers continue their careers as top performers and one-third of global bottom performers as bottom performers. Jumpers-Up and Droppers-Down are extremely rare in science. The chances of moving radically up or down in productivity classes are marginal (1% or less). Our regression analyses show that productivity classes are highly path dependent: there is a single most important predictor of being a top performer, which is being a top performer at an earlier career stage.
2024-01-19T07:21:13Z
31 pages, 7 figures, 4 tables plus Electronic Supplementary Materials
Marek Kwiek, Lukasz Szymula

http://arxiv.org/abs/2411.00008v5
Women in Science: Measuring Participation in Europe Across Disciplines, Generations and Over Time
2026-05-07T18:43:57Z
In this research, we quantify an inflow of women into science in the past three decades. Structured Big Data allow us to estimate the contribution of women scientists to the growth of science by disciplines (N = 14 STEMM disciplines) and over time (1990-2023). A monolithic segment of STEMM science emerges from this research as divided between the disciplines in which the growth was powerfully driven by women - and the disciplines in which the role of women was marginal. There are four disciplines in which 50% of currently publishing scientists are women; and five disciplines in which more than 50% of currently young scientists are women. But there is also a cluster of four highly mathematized disciplines (MATH, COMP, PHYS, and ENG) in which the growth of science is only marginally driven by women. Digital traces left by scientists in their publications indexed in global datasets open two new dimensions in large-scale academic profession studies: time and gender. The growth of science in Europe was accompanied by growth in the number of women scientists, but with powerful cross-disciplinary and cross-generational differentiations.
We examined the share of women scientists coming from ten different age cohorts for 32 European and four comparator countries (the USA, Canada, Australia, and Japan). Our study sample was N = 1,740,985 scientists (including 39.40% women scientists). Three critical methodological challenges of using structured Big Data of the bibliometric type were discussed: gender determination, academic age determination, and discipline determination.
2024-10-17T12:26:08Z
29 pages, 5 figures
Marek Kwiek, Lukasz Szymula

http://arxiv.org/abs/2605.16340v1
Success in Science: How Global Prestige Organizes Careers
2026-05-07T18:40:38Z
This article analyzes the structure of perceived academic success. We combine survey data from 10,848 Polish scientists with their Scopus bibliometric data at the individual level. We use polychoric correlations, exploratory factor analysis, network modeling (EBICglasso), and generalized linear mixed models in ordinal and binary forms. Our results show that academic success is multidimensional, with a clear core. This core is global publication prestige. Publishing in top international journals is the node with the highest centrality, and it is connected to other career dimensions, such as citations and international collaboration. Publications in top national journals, in contrast, are peripheral. The threshold structure of the scale indicates a selection effect. The definition of success is globally oriented and strongly tied to the hierarchy of international journals.
2026-05-07T18:40:38Z
31 pages
Marek Kwiek, Wojciech Roszka

http://arxiv.org/abs/2409.05512v4
DatAasee -- A Metadata-Lake as Metadata Catalog for a Virtual Data-Lake
2026-05-07T12:46:25Z
Metadata management for distributed data sources is a long-standing but ever-growing problem. To counter this challenge in a research-data and library-oriented setting, this work constructs a data architecture, derived from the data-lake: the metadata-lake.
A proof-of-concept implementation of this proposed metadata aggregator is presented and briefly evaluated.
2024-09-09T11:10:45Z
Christian Himpe

http://arxiv.org/abs/2605.16338v1
Vidya: An AI-Driven Modular Pipeline for Archival Automation and Semantic Metadata Enrichment
2026-05-07T11:21:24Z
The large-scale digitization of historical archives has created a paradox: "dark data", digital objects lacking metadata for retrieval. Manual archival description is slow and expensive, limiting discovery and reuse. We propose Vidya, a modular pipeline that orchestrates Large Language Models (LLMs) and FOSS tools to automate semantic enrichment and archival ingestion at scale. Vidya constrains generations using YAML-defined ontologies and Pydantic validation, producing deterministic, structured JSON outputs from probabilistic models. Developed at the Laboratory for Digital Humanities and Innovation (LAMUHDI) of the State University of Ponta Grossa (UEPG), Vidya applies Maker principles and open-source practices to enable low-cost deployment in memory institutions using modest hardware. We compare LLM performance and present a cost-benefit analysis showing major gains, reducing processing time from decades to days while complying with NOBRADE and ISAD(G).
2026-05-07T11:21:24Z
Cloter Migliorini Filho, Julia Graciela Machado, Edson Armando Silva, Marcella Scoczynski

http://arxiv.org/abs/2605.06021v1
PlotPick: AI-powered batch extraction of numerical data from scientific figures
2026-05-07T11:15:39Z
Systematic reviews and meta-analyses frequently require numerical data that authors report only as figures, yet manual digitisation is slow and does not scale. We present PlotPick, an open-source tool that uses vision-language models (VLMs) to batch-extract structured tabular data from scientific figures. We evaluate six VLMs from three providers on two established chart-to-table benchmarks (ChartX and PlotQA) and compare against the dedicated chart-to-table model DePlot.
All six VLMs outperform DePlot on both benchmarks. On ChartX (restricted to bar charts, line charts, box plots, and histograms; n=300), VLMs achieve 88-96% recall versus 71% for DePlot. On PlotQA (n=529), VLMs achieve 86-99% RMSF1 versus 94% for DePlot. The gap is largest on chart types absent from the dedicated models' training data: on box plots, DePlot achieves 24% RMSF1 while VLMs achieve 83-97%. PlotPick is available at https://plotpick.streamlit.app.
2026-05-07T11:15:39Z
7 pages, 2 figures, 2 tables. Software available at https://plotpick.streamlit.app and https://github.com/tommycarstensen/plotpick
Tommy Carstensen

http://arxiv.org/abs/2605.05644v1
AoI-Guided Client Selection for Robust and Timely Federated Intrusion Detection in Cloud-Edge Security Analytics
2026-05-07T03:51:15Z
Federated learning (FL) is attractive for cloud-edge intrusion detection because it enables collaborative training over distributed telemetry without centralizing raw logs. In production security analytics pipelines, however, only a subset of clients participates in each round, and heterogeneous bandwidth, stragglers, and dropouts can cause the server to rely on stale client information. This paper studies client participation as a timeliness-aware systems problem using Age of Information (AoI). We compare three lightweight policies for federated intrusion detection: AoI-first, utility-first, and a hybrid AoI+utility rule with a tunable trade-off parameter.
Across a CIC-IDS2017 DDoS/PortScan mini subset, NSL-KDD, ToN-IoT, and a synthetic drift benchmark under clean, poisoning, and poisoning-plus-robust-aggregation settings, AoI-aware selection reduces average AoI by about 39-41% and peak AoI by about 70% relative to random sampling while keeping the per-round communication budget fixed. The hybrid policy usually preserves Macro-F1/AUC and provides an interpretable knob for balancing freshness, detection quality, and robustness, although it is not uniformly Pareto-dominant once false positive rate is included. Robustness is evaluated by combining AoI-guided selection with trimmed-mean aggregation under label-flip poisoning; the selection policy itself is not intended as a standalone Byzantine defense. The main practical message is that cloud-edge, privacy-preserving intrusion analytics can improve timeliness through a lightweight scheduling layer without changing the underlying FL participation budget.
2026-05-07T03:51:15Z
6 pages, 2 figures, 3 tables. Accepted by the 2026 10th International Conference on Cloud and Big Data Computing (ICCBDC 2026). Preprint
Chun Yin Chiu

http://arxiv.org/abs/2605.13873v1
Large Language Models for Web Accessibility: A Systematic Literature Review
2026-05-06T01:40:27Z
Web accessibility aims to ensure that web content and services are usable by people with diverse abilities. In recent years, Large Language Models (LLMs) have been increasingly explored to support accessibility-related tasks on the web, such as content generation, issue detection, and remediation. However, little is known about the characteristics of these approaches, the accessibility issues they target, the standards they follow, and how they are evaluated. In this paper, we present a systematic literature review of 38 peer-reviewed studies that investigate the use of LLMs in web accessibility contexts. We begin by performing a comprehensive search of scientific publications to identify relevant studies.
We then conduct a comparative analysis to examine the accessibility tasks addressed, the LLM models and prompting strategies employed, the system architectures adopted, the accessibility issues and guidelines considered, and the evaluation methods used across studies. Our findings show that most studies apply LLMs to text-centric and structurally explicit accessibility tasks, with WCAG serving as the primary reference framework and limited consideration of cognitive accessibility guidelines (COGA). The reviewed approaches predominantly rely on general-purpose LLMs and prompt-based interactions, while evaluation practices vary widely and often lack direct involvement of users with disabilities. We envision this review as a consolidated reference for researchers and practitioners seeking to understand the current landscape of LLM-supported web accessibility, and as a foundation to guide future research and tool development in this area.
2026-05-06T01:40:27Z
Accepted at the 23rd International Web for All Conference (W4A 2026)
Wajdi Aljedaani, Rubel Hassan Mollik

http://arxiv.org/abs/2605.04334v1
Science discussions of retracted articles on Bluesky: public scrutiny or misinformation spreading?
2026-05-05T22:41:04Z
Post-publication peer review (PPPR) has emerged as an important supplement to traditional peer review, with social media playing a growing role in publicising potential problems in published research. However, it remains unclear whether social media discussions of retracted articles primarily reflect good practices, such as exposing flaws and acknowledging retraction status, or bad practices, such as overlooking retractions and continuing to disseminate scientific misinformation. In this study, we collected Bluesky posts referencing scholarly articles from Altmetric and retrieved metadata for the referenced articles using OpenAlex.
The final dataset included 284 retracted articles with 79 pre-retraction posts and 857 post-retraction posts, 59 retraction notices with 186 posts, and 609,461 non-retracted articles with 1,344,756 posts. We manually coded Bluesky posts discussing retracted articles to identify instances of good and bad practice. The results show that posts demonstrating good practice (89.9%) substantially outnumbered those demonstrating bad practice (10.1%). Posts reflecting good practice also had more user engagement. In the pre-retraction phase, good practice posts constituted a slight minority (43.0%), whereas in the post-retraction phase they were dominant (94.2%). Most negative posts in the pre-retraction phase (90.0%) demonstrated good practice, while only 17.3% of positive posts in the post-retraction phase showed bad practice. Thus, sentiment analysis can be helpful to filter posts that could flag potential flaws before retraction, but it may struggle to accurately identify the spread of misinformation after retraction. More broadly, this study highlights the potential of Bluesky to support responsible scientific communication, public scrutiny, and research integrity.
2026-05-05T22:41:04Z
26 pages, 5 figures
Er-Te Zheng, Hui-Zhen Fu, Xiaorui Jiang, Zhichao Fang, Mike Thelwall

http://arxiv.org/abs/2605.16333v1
SotA Lens: A Network-Augmented Methodology and Tool for Exploratory State-of-the-Art Reviews
2026-05-05T19:44:35Z
Researchers often begin new projects by conducting a broad State-of-the-Art review before they are ready to define the narrow protocol required by a systematic review. This is especially common in multidisciplinary areas where terminology is unstable, communities are weakly connected, and relevant work is dispersed across technical and application domains. This paper presents SotA Lens, a network-augmented methodology and lightweight software toolkit for exploratory State-of-the-Art reviews.
The approach combines documented seed search, DOI-level metadata resolution, bounded citation expansion, directed graph construction, community detection, ranking of authors and subject terms, and human labelling of research communities. It is designed to complement, not replace, established review protocols such as PRISMA, PRISMA-ScR, systematic mapping studies, and bibliometric science mapping. The method is demonstrated through a proof-of-concept review of Dynamic Projection-Mapping and Spatial Augmented Reality. Starting from approximately 200 seed search results, the workflow produced a citation graph with 2,198 DOI-level vertices and 8,249 reference edges; a filtered largest component for 2010-2023 contained 986 vertices, 2,693 edges, and sixteen labelled communities. The contribution is both methodological and practical: SotA Lens helps researchers map broad fields, identify clusters and gaps, and produce auditable review artifacts before committing to a narrower systematic review protocol. This paper is not intended as a domain survey of Dynamic Projection-Mapping or Spatial Augmented Reality; rather, it introduces and demonstrates an original review-support methodology and software artifact using that domain as a proof-of-concept case study.
2026-05-05T19:44:35Z
11 pages, 3 figures, 2 tables; original methodology/software paper with proof-of-concept case study; software DOI: 10.5281/zenodo.19860899
Diogo Peralta Cordeiro