https://arxiv.org/api/D8AwQeQyVwO1QMFYD2Xxr7kG7nE 2026-03-22T13:23:42Z 58706015

http://arxiv.org/abs/2602.15249v1
Artificial Intelligence Specialization in the European Union: Underexplored Role of the Periphery at NUTS-3 Level
Updated: 2026-02-16T23:01:14Z
This study examines the geographical distribution of Artificial Intelligence (AI) research production across European regions at the NUTS-3 level for the period 2015-2024. Using bibliometric data from Clarivate InCites and the Citation Topics classification system, we analyze two hierarchical levels of thematic aggregation: Electrical Engineering, Electronics & Computer Science (Macro Citation Topic 4) and Artificial Intelligence & Machine Learning (Meso Citation Topic 4.61). We calculate the Relative Specialization Index (RSI) and Relative Citation Impact (RCI) for 781 NUTS-3 regions. While major metropolitan hubs such as Paris (Île-de-France), Warszawa, and Madrid lead in absolute production volume, our findings reveal that peripheral regions, particularly from Eastern Europe and Spain, exhibit the highest levels of relative AI specialization. Notably, we find virtually no correlation between regional specialization and citation impact, identifying four distinct regional profiles: high-impact specialized regions (e.g., Granada, Jaen, Vilniaus); high-volume but low-impact regions (e.g., Bugas, several Polish regions); high-impact non-specialized regions, with Fyn (Denmark) standing out as a remarkable outlier achieving exceptional citation impact (RCI > 4) despite low specialization; and diversified portfolios with selective excellence (e.g., German regions).
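The two indicators named in this abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the code assumes one common bibliometric formulation: an activity index (the region's topic share divided by the reference set's topic share) rescaled to the range -1 to 1 for RSI, and a ratio of citations per paper for RCI. The function names and the sample counts are hypothetical and are not taken from the paper.

```python
def activity_index(region_topic, region_total, ref_topic, ref_total):
    """Share of the topic in the region divided by its share in the reference set."""
    return (region_topic / region_total) / (ref_topic / ref_total)


def relative_specialization_index(region_topic, region_total, ref_topic, ref_total):
    """Assumed formulation: RSI = (AI - 1) / (AI + 1), bounded between -1 and 1."""
    ai = activity_index(region_topic, region_total, ref_topic, ref_total)
    return (ai - 1.0) / (ai + 1.0)


def relative_citation_impact(region_cites, region_papers, ref_cites, ref_papers):
    """Assumed formulation: citations per paper relative to the reference set."""
    return (region_cites / region_papers) / (ref_cites / ref_papers)


# Hypothetical counts for one region against a reference set.
rsi = relative_specialization_index(50, 1000, 2000, 100000)
rci = relative_citation_impact(400, 50, 10000, 2000)
print(f"RSI = {rsi:.2f}, RCI = {rci:.2f}")
```

Under these assumptions, a positive RSI marks a region that publishes proportionally more in the topic than the reference set does, and an RCI above 1 marks above-average citation impact.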
These results suggest that AI research represents a strategic opportunity for peripheral regions to develop competitive scientific niches, though achieving international visibility requires more than research volume alone.
Published: 2026-02-16T23:01:14Z
Comment: 6 pages, 3 figures, submitted to IEEE Computational Intelligence Magazine
Authors: Victor Herrero-Solana

http://arxiv.org/abs/2602.14755v1
Measuring the relatedness between scientific publications using controlled vocabularies
Updated: 2026-02-16T13:58:47Z
Measuring the relatedness between scientific publications is essential in many areas of bibliometrics and science policy. Controlled vocabularies provide a promising basis for measuring relatedness and are widely used in combination with Salton's cosine similarity. The latter is problematic because it only considers exact matches between terms. This article introduces two alternative methods - soft cosine and maximum term similarities - that account for the semantic similarity between non-matching terms. The article compares the accuracy of all three methods using the assignment of publications to topics in the TREC 2006 Genomics Track and the assumption that accurate relatedness measures should assign high relatedness scores to publication pairs within the same topic and low scores to pairs from separate topics. Results show that soft cosine is the most accurate method, while the most widely used version of Salton's cosine is markedly less accurate than the other methods tested.
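The contrast between Salton's cosine and soft cosine can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions, not the article's implementation: the three-term vocabulary, the term-similarity matrix `SIM`, and the two publication vectors are all hypothetical, and the article's second method (maximum term similarities) is not shown because the abstract does not define it.

```python
import math


def salton_cosine(a, b):
    """Standard cosine: only terms shared by both vectors contribute."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def soft_cosine(a, b, sim):
    """Soft cosine: sim[i][j] lets related but non-identical terms contribute."""
    def form(u, v):
        return sum(sim[i][j] * u[i] * v[j]
                   for i in range(len(u)) for j in range(len(v)))
    denom = math.sqrt(form(a, a)) * math.sqrt(form(b, b))
    return form(a, b) / denom if denom else 0.0


# Hypothetical vocabulary of three controlled terms; terms 0 and 1 are
# assumed to be closely related, term 2 unrelated to both.
SIM = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
pub_a = [1, 0, 0]  # indexed with term 0 only
pub_b = [0, 1, 0]  # indexed with term 1 only

print(salton_cosine(pub_a, pub_b))     # 0.0
print(soft_cosine(pub_a, pub_b, SIM))  # 0.8
```

The two publications share no exact term, so Salton's cosine scores them as unrelated, while soft cosine credits the assumed similarity between the two terms. With an identity matrix in place of `SIM`, soft cosine reduces to Salton's cosine.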
These findings have implications for how controlled vocabularies should be used to measure relatedness.
Published: 2026-02-16T13:58:47Z
Comment: Currently under review at Scientometrics (16 February 2026)
Authors: Emil Dolmer Alnor

http://arxiv.org/abs/2602.14384v1
M-CODE: Materials Categorization via Ontology, Dimensionality and Evolution
Updated: 2026-02-16T01:18:15Z
The rapid advancement of artificial intelligence in materials science requires data standards and data management practices that can capture the complexity of real-world structures, including surfaces, interfaces, defects, and dimensionality reduction. We present M-CODE - Materials Categorization via Ontology, Dimensionality and Evolution - a compact categorization system that links materials-science-specific terminology to a set of reusable concepts as building blocks and provenance-aware transformations. M-CODE classifies structures by dimensionality, structural complexity (from pristine to compound pristine, defective, and processed), and variants that capture common structure creation and evolution approaches. A practical implementation of the categorization is provided in an open-source codebase that includes JSON schemas, examples, and Python and TypeScript types/interfaces, designed to support reproducible dataset generation, validation, and community contributions.
Published: 2026-02-16T01:18:15Z
Comment: 13 pages, 2 figures, 5 tables
Authors: Vsevolod Biryukov, Kamal Choudhary, Timur Bazhirov

http://arxiv.org/abs/2602.14285v1
FMMD: A multimodal open peer review dataset based on F1000Research
Updated: 2026-02-15T19:36:05Z
Automated scholarly paper review (ASPR) has entered the coexistence phase with traditional peer review, where artificial intelligence (AI) systems are increasingly incorporated into real-world manuscript evaluation. In parallel, research on automated and AI-assisted peer review has proliferated. Despite this momentum, empirical progress remains constrained by several critical limitations in existing datasets.
While reviewers routinely evaluate figures, tables, and complex layouts to assess scientific claims, most existing datasets remain overwhelmingly text-centric. This bias is reinforced by a narrow focus on data from computer science venues. Furthermore, these datasets lack precise alignment between reviewer comments and specific manuscript versions, obscuring the iterative relationship between peer review and manuscript evolution. In response, we introduce FMMD, a multimodal and multidisciplinary open peer review dataset curated from F1000Research. The dataset bridges the current gap by integrating manuscript-level visual and structural data with version-specific reviewer reports and editorial decisions. By providing explicit alignment between reviewer comments and the exact article iteration under review, FMMD enables fine-grained analysis of the peer review lifecycle across diverse scientific domains. FMMD supports tasks such as multimodal issue detection and multimodal review comment generation. It provides a comprehensive empirical resource for the development of peer review research.
Published: 2026-02-15T19:36:05Z
Comment: Work in progress
Authors: Zhenzhen Zhuang, Yuqing Fu, Jing Zhu, Zhangping Zhou, Jialiang Lin

http://arxiv.org/abs/2603.00080v1
From Static Repositories to Agentic Knowledge Webs: ResearchTwin and the S-Index for Federated Human-AI Research Discovery
Updated: 2026-02-13T22:37:40Z
The exponential growth of scientific literature, datasets, and code repositories has created a discovery bottleneck that impedes knowledge synthesis and reproducibility. Traditional dissemination formats -- static PDFs, siloed code hosting, and fragmented data repositories -- fail to represent the interconnected narrative of modern research, while conventional metrics such as the H-index neglect contributions from reusable code and shared datasets.
We present ResearchTwin, an open-source federated platform that transforms a researcher's scholarly output into a conversational digital twin, with a preliminary evaluation of its deployed prototype. The system uses a Bimodal Glial-Neural Optimization (BGNO) architecture comprising a Multi-Modal Connector Layer, a Glial Layer for caching and rate management, and a Neural Layer implementing Retrieval-Augmented Generation with a provider-agnostic LLM backend. We formalize the S-index, building on our earlier QIC framework, into a composite metric that extends FAIR principles -- via a binary accessibility/licensing gate, field-normalized impact scoring, and geometric collaboration scaling -- to quantify multimodal research impact. A case study comparing two researchers with similar H-indexes but substantially different S-indexes demonstrates that the metric captures dimensions of impact -- particularly dataset and code contributions -- invisible to citation-based measures alone. ResearchTwin exposes an inter-agentic discovery API using Schema.org typed responses and HATEOAS navigation, enabling AI agents to discover cross-lab synergies. A three-tier federated architecture preserves data sovereignty while enabling global discoverability.
Published: 2026-02-13T22:37:40Z
Comment: 15 pages, 1 figure, https://github.com/martinfrasch/ResearchTwin
Authors: Martin G. Frasch

http://arxiv.org/abs/2602.12537v1
News Harvesting from Google News combining Web Scraping, LLM Metadata Extraction and SCImago Media Rankings enrichment: a case study of IFMIF-DONES
Updated: 2026-02-13T02:34:26Z
This study develops and evaluates a systematic methodology for constructing news datasets from Google News, combining automated web scraping, large language model (LLM)-based metadata extraction, and SCImago Media Rankings enrichment.
Using the IFMIF-DONES fusion energy project as a case study, we implemented a five-stage data collection pipeline across 81 region-language combinations, yielding 1,482 validated records after a 56% noise reduction. Results are compared against two licensed press databases: MyNews (2,280 records) and ProQuest Newsstream Collection (148 records). Overlap analysis reveals high complementarity, with 76% of Google News records exclusive to this platform. The dataset captures content types absent from proprietary databases, including specialized outlets, institutional communications, and social media posts. However, significant methodological challenges emerge: temporal instability requiring synchronic collection, a 100-result cap per query demanding multi-stage strategies, and unexpected noise including academic PDFs, false positives, and pornographic content infiltrating results through black hat SEO techniques. LLM-assisted extraction proved effective for structured articles but exhibited systematic hallucination patterns requiring validation protocols. We conclude that Google News offers valuable complementary coverage for communication research but demands substantial methodological investment, multi-source triangulation, and robust filtering mechanisms to ensure dataset integrity.
Published: 2026-02-13T02:34:26Z
Comment: 24 pages, 7 figures, submitted to Communication Methods and Measures
Authors: Victor Herrero-Solana

http://arxiv.org/abs/2602.12206v1
Making the complete OpenAIRE citation graph easily accessible through compact data representation
Updated: 2026-02-12T17:44:36Z
The OpenAIRE graph contains a large citation graph dataset, with over 200 million publications and over 2 billion citations. The current graph is available as a dump with metadata which, uncompressed, totals ~TB. This makes it hard to process on conventional computers. To make this network more accessible to the community, we provide a processed OpenAIRE graph that is downscaled to 32GB while preserving the full graph structure.
In addition, we offer the processed data in a very simple format, which allows straightforward further manipulation. We also provide a Python pipeline, which can be used to process future releases of the OpenAIRE graph.
Published: 2026-02-12T17:44:36Z
Authors: Joakim Skarding, Pavel Sanda

http://arxiv.org/abs/2309.04414v3
Scientific productivity as a random walk
Updated: 2026-02-12T17:33:06Z
The expectation that scientific productivity follows regular patterns over a career underpins many scholarly evaluations. However, recent studies of individual productivity patterns reveal a puzzle: the average number of papers published per year robustly follows the "canonical trajectory" of a rapid rise followed by a gradual decline, yet only about 20% of individual productivity trajectories follow this pattern. We resolve this puzzle by modeling scientific productivity as a random walk, showing that the canonical pattern can be explained as a decrease in the variance in changes to productivity in the early-to-mid career. By empirically characterizing the variable structure of 2,085 productivity trajectories of computer science faculty at 205 PhD-granting institutions, spanning 29,119 publications over 1980-2016, we (i) discover remarkably simple patterns in both early-career and year-to-year changes to productivity, and (ii) show that a random walk model of productivity both reproduces the canonical trajectory in the average productivity and captures much of the diversity of individual-level trajectories, including the lognormal distribution of cumulative productivity observed by William Shockley in 1957. We confirm that these results generalize across fields by fitting our model to a separate panel of 22,952 faculty across 12 fields from 2011 to 2023.
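The general idea of a productivity random walk whose step variance shrinks over the early-to-mid career can be explored with a toy simulation. This is only a sketch under stated assumptions and is not the authors' fitted model: the Gaussian steps, the floor at zero papers, the linear decay of the step size, and every parameter value (`sigma_early`, `sigma_late`, `settle_year`) are hypothetical choices made for illustration.

```python
import random


def simulate_career(years=30, start=1.0, sigma_early=2.0, sigma_late=0.5,
                    settle_year=10, seed=0):
    """Toy random walk of papers per year, clipped at zero.

    The step standard deviation decays linearly from sigma_early to
    sigma_late over the first settle_year years, then stays constant.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    productivity = start
    trajectory = []
    for year in range(years):
        progress = min(year / settle_year, 1.0)
        sigma = sigma_early + progress * (sigma_late - sigma_early)
        # Clip at zero: a researcher cannot publish a negative number of papers.
        productivity = max(0.0, productivity + rng.gauss(0.0, sigma))
        trajectory.append(productivity)
    return trajectory


def average_trajectory(n_careers=1000, years=30):
    """Mean productivity per career year across many simulated careers."""
    careers = [simulate_career(years=years, seed=s) for s in range(n_careers)]
    return [sum(c[t] for c in careers) / n_careers for t in range(years)]


if __name__ == "__main__":
    for year, mean in enumerate(average_trajectory()):
        print(year, round(mean, 2))
```

Comparing individual outputs of `simulate_career` with `average_trajectory` is one way to see how an aggregate curve can look regular while single trajectories vary widely.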
These results highlight the importance of variance in shaping individual scientific productivity, opening up new avenues for characterizing how systemic incentives and opportunities can be directed for aggregate effect.
Published: 2023-09-08T16:25:24Z
Authors: Sam Zhang, Nicholas LaBerge, Samuel F. Way, Daniel B. Larremore, Aaron Clauset

http://arxiv.org/abs/2602.03828v2
AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations
Updated: 2026-02-12T16:22:05Z
High-quality scientific illustrations are crucial for effectively communicating complex scientific and technical concepts, yet their manual creation remains a well-recognized bottleneck in both academia and industry. We present FigureBench, the first large-scale benchmark for generating scientific illustrations from long-form scientific texts. It contains 3,300 high-quality scientific text-figure pairs, covering diverse text-to-illustration tasks from scientific papers, surveys, blogs, and textbooks. Moreover, we propose AutoFigure, the first agentic framework that automatically generates high-quality scientific illustrations based on long-form scientific text. Specifically, before rendering the final result, AutoFigure engages in extensive thinking, recombination, and validation to produce a layout that is both structurally sound and aesthetically refined, outputting a scientific illustration that achieves both structural completeness and aesthetic appeal. Leveraging the high-quality data from FigureBench, we conduct extensive experiments to test the performance of AutoFigure against various baseline methods. The results demonstrate that AutoFigure consistently surpasses all baseline methods, producing publication-ready scientific illustrations.
The code, dataset, and Hugging Face Space are released at https://github.com/ResearAI/AutoFigure.
Published: 2026-02-03T18:41:43Z
Comment: Accepted at ICLR 2026
Authors: Minjun Zhu, Zhen Lin, Yixuan Weng, Panzhong Lu, Qiujie Xie, Yifan Wei, Sifan Liu, Qiyao Sun, Yue Zhang

http://arxiv.org/abs/2506.03527v2
Distinguishing True Influence from Hyperprolificity with Citation Distance
Updated: 2026-02-12T08:43:46Z
Accurately evaluating scholarly influence is essential for fair academic assessment, yet traditional bibliometric indicators - dominated by publication and citation counts - often favor hyperprolific authors over those with deeper, long-term impact. We propose the x-index, a novel citation-based metric that conceptualizes citation as a process of knowledge diffusion and incorporates citation distance to reflect the structural reach of scholarly work. By weighting citations according to the collaborative proximity between citing and cited authors, the x-index captures both the depth and breadth of influence within evolving academic networks. Empirical analyses show that the x-index significantly improves the rankings of Turing Award recipients while reducing those of hyperprolific authors, better aligning rankings with recognized academic merit. It also demonstrates superior discriminatory power among early-career researchers and reveals stronger sensitivity to institutional research quality. These results suggest that the x-index offers a more equitable and forward-looking alternative to existing metrics, with practical applications in talent identification, funding decisions, and academic recommendation systems.
Published: 2025-06-04T03:19:11Z
Authors: Lu Li, Yun Wan, Feng Xiao

http://arxiv.org/abs/2602.03866v3
PaperX: A Unified Framework for Multimodal Academic Presentation Generation with Scholar DAG
Updated: 2026-02-11T13:18:33Z
Transforming scientific papers into multimodal presentation content is essential for research dissemination but remains labor-intensive.
Existing automated solutions typically treat each format as an isolated downstream task, leading to redundant processing and semantic inconsistency. We introduce PaperX, a unified framework that models academic presentation generation as a structural transformation and rendering process. Central to our approach is the Scholar DAG, an intermediate representation that decouples the paper's logical structure from its final presentation syntax. By applying adaptive graph traversal strategies, PaperX generates diverse, high-quality outputs from a single source. Comprehensive evaluations demonstrate that our framework achieves state-of-the-art performance in content fidelity and aesthetic quality while significantly improving cost efficiency compared to specialized single-task agents.
Published: 2026-01-30T18:27:03Z
Comment: 29 pages, 9 figures, Project website: https://github.com/yutao1024/PaperX
Authors: Tao Yu, Minghui Zhang, Zhiqing Cui, Hao Wang, Zhongtian Luo, Shenghua Chai, Junhao Gong, Yuzhao Peng, Yuxuan Zhou, Yujia Yang, Zhenghao Zhang, Haopeng Jin, Xinming Wang, Yufei Xiong, Jiabing Yang, Jiahao Yuan, Hanqing Wang, Hongzhu Yi, Yan Huang, Liang Wang

http://arxiv.org/abs/2603.00069v1
Top performers and top journals: Persistent concentration in scientific publishing
Updated: 2026-02-10T20:21:55Z
In this research, we analyze the relationship between publishing productivity and access to highly prestigious journals, treating publishing in top journals as a stratification mechanism selecting publishing elites. We study N = 144,314 Polish scientists publishing for 30 years (1992-2021) and their N_art = 433,546 unique research articles published in the period. Using bibliometric data from Scopus, we compare the scientists belonging to the top productivity decile (the upper 10%, termed top performers) and the remaining population of scientists (90%) by discipline and period (five six-year periods).
We measure the share of publications in prestigious segments of journals, with particular reference to the 90th-99th percentiles, and we use nonlinear journal prestige-normalized productivity. Our results indicate that access to top journals (defined as the top 10% of journals indexed in Scopus) is powerfully and permanently concentrated in the group of top performers in all disciplines and periods studied. The differences between top performers and the other scientists are primarily of a qualitative nature: they are seen almost exclusively at the top of the journal hierarchy rather than in its bottom or middle segments. Our logistic regression models indicate the complementarity of quantity and quality: publishing intensity increases the probability of membership in the elite segment of top performers, especially when it is coupled with publishing in prestigious journals. Our results suggest that top journals function as selection gates to academic careers and that they function as durable mechanisms of elite reproduction in science.
Published: 2026-02-10T20:21:55Z
Comment: 36 pages plus supplementary materials
Authors: Marek Kwiek, Wojciech Roszka

http://arxiv.org/abs/2602.09817v1
AnalyticsGPT: An LLM Workflow for Scientometric Question Answering
Updated: 2026-02-10T14:23:55Z
This paper introduces AnalyticsGPT, an intuitive and efficient large language model (LLM)-powered workflow for scientometric question answering. This underrepresented downstream task addresses the subcategory of meta-scientific questions concerning the "science of science." When compared to traditional scientific question answering based on papers, the task poses unique challenges in the planning phase, namely the need for named-entity recognition of academic entities within questions and multi-faceted data retrieval involving scientometric indices, e.g. impact factors.
Beyond their exceptional capacity for treating traditional natural language processing tasks, LLMs have shown great potential in more complex applications, such as task decomposition, planning, and reasoning. In this paper, we explore the application of LLMs to scientometric question answering, and describe an end-to-end system implementing a sequential workflow with retrieval-augmented generation and agentic concepts. We also address the secondary task of effectively synthesizing the data into presentable and well-structured high-level analyses. As a database for retrieval-augmented generation, we leverage a proprietary research performance assessment platform. For evaluation, we consult experienced subject matter experts and leverage LLMs-as-judges. In doing so, we provide valuable insights on the efficacy of LLMs towards a niche downstream task. Our (skeleton) code and prompts are available at: https://github.com/lyvykhang/llm-agents-scientometric-qa/tree/acl.
Published: 2026-02-10T14:23:55Z
Authors: Khang Ly, Georgios Cheirmpos, Adrian Raudaschl, Christopher James, Seyed Amin Tabatabaei

http://arxiv.org/abs/2602.18479v1
AgentCAT: An LLM Agent for Extracting and Analyzing Catalytic Reaction Data from Chemical Engineering Literature
Updated: 2026-02-10T04:30:11Z
This paper presents a large language model (LLM) agent named AgentCAT, which extracts and analyzes catalytic reaction data from chemical engineering papers. AgentCAT serves as an alternative to overcome the long-standing data bottleneck in the chemical engineering field, and its natural-language-based interactive data analysis functionality is friendly to the community. AgentCAT also presents a formal abstraction and challenge analysis of the catalytic reaction data extraction task in an artificial intelligence-friendly manner.
This abstraction would help the artificial intelligence community understand this problem and in turn would attract more attention to address it. Technically, the complex catalytic process leads to a complicated dependency structure in catalytic reaction data with respect to elementary reaction steps, molecular behaviors, measurement evidence, etc. This dependency structure makes it challenging to guarantee the correctness and completeness of data extraction, as well as to represent the extracted data for analysis. AgentCAT addresses this challenge and makes four technical contributions: (1) a schema-governed extraction pipeline with progressive schema evolution, enabling robust data extraction from chemical engineering papers; (2) a dependency-aware reaction-network knowledge graph that links catalysts/active sites, synthesis-derived descriptors, mechanistic claims with evidence, and macroscopic outcomes, preserving process coupling and traceability; (3) a general querying module that supports natural-language exploration and visualization over the constructed graph for cross-paper analysis; (4) an evaluation on ~800 peer-reviewed chemical engineering publications demonstrating the effectiveness of AgentCAT.
Published: 2026-02-10T04:30:11Z
Authors: Wei Yang, Zihao Liu, Tao Tan, Xiao Hu, Hong Xie, Lulu Li, Xin Li, Jianyu Han, Defu Lian, Mao Ye

http://arxiv.org/abs/2603.00051v1
LitBench: A Graph-Centric Large Language Model Benchmarking Tool For Literature Tasks
Updated: 2026-02-10T04:12:29Z
While large language models (LLMs) have become the de facto framework for literature-related tasks, they still struggle to function as domain-specific literature agents due to their inability to connect pieces of knowledge and reason across domain-specific contexts, terminologies, and nomenclatures. This challenge underscores the need for a tool that facilitates such domain-specific adaptation and enables rigorous benchmarking across literature tasks.
To that end, we introduce LitBench, a benchmarking tool designed to enable the development and evaluation of domain-specific LLMs tailored to literature-related tasks. At its core, LitBench uses a data curation process that generates domain-specific literature sub-graphs and constructs training and evaluation datasets based on the textual attributes of the resulting nodes and edges. The tool is designed for flexibility, supporting the curation of literature graphs across any domain chosen by the user, whether high-level fields or specialized interdisciplinary areas. In addition to dataset curation, LitBench defines a comprehensive suite of literature tasks, ranging from node and edge level analyses to advanced applications such as related work generation. These tasks enable LLMs to internalize domain-specific knowledge and relationships embedded in the curated graph during training, while also supporting rigorous evaluation of model performance. Our results show that small domain-specific LLMs trained and evaluated on LitBench datasets achieve competitive performance compared to state-of-the-art models like GPT-4o and DeepSeek-R1. To enhance accessibility and ease of use, we open-source the tool along with an AI agent tool that streamlines data curation, model training, and evaluation.
Published: 2026-02-10T04:12:29Z
Authors: Andreas Varvarigos, Ali Maatouk, Jiasheng Zhang, Ngoc Bui, Jialin Chen, Leandros Tassiulas, Rex Ying