http://arxiv.org/abs/2511.13742v2 Review of Passenger Flow Modelling Approaches Based on a Bibliometric Analysis 2026-01-12T09:43:42Z

This paper presents a bibliometric analysis of the field of short-term passenger flow forecasting within local public transit, covering 814 publications that span from 1984 to 2024. In addition to common bibliometric analysis tools, a variant of a citation network was developed, and topic modelling was conducted. The analysis reveals that research activity exhibited sporadic patterns prior to 2008, followed by a marked acceleration, characterised by a shift from conventional statistical and machine learning methodologies (e.g., ARIMA, SVM, and basic neural networks) to specialised deep learning architectures. Based on this insight, a connection to more general fields such as machine learning and time series modelling was established. In addition to modelling, spatial, linguistic, and modal biases were identified and findings from existing secondary literature were validated and quantified. This revealed existing gaps, such as constrained data fusion, open (multivariate) data, and underappreciated challenges related to model interpretability, cost-efficiency, and a balance between algorithmic performance and practical deployment considerations. In connection with the superordinate fields, the growth in relevance of foundation models is also noteworthy.

2025-11-12T07:13:18Z Jonathan Hecht Weilian Li Ziyue Li Youness Dehbi http://arxiv.org/abs/2601.02598v2 LongDA: Benchmarking LLM Agents for Long-Document Data Analysis 2026-01-11T22:21:22Z

We introduce LongDA, a data analysis benchmark for evaluating LLM-based agents under documentation-intensive analytical workflows. In contrast to existing benchmarks that assume well-specified schemas and inputs, LongDA targets real-world settings in which navigating long documentation and complex data is the primary bottleneck. To this end, we manually curate raw data files, long and heterogeneous documentation, and expert-written publications from 17 publicly available U.S. national surveys, from which we extract 505 analytical queries grounded in real analytical practice. Solving these queries requires agents to first retrieve and integrate key information from multiple unstructured documents, before performing multi-step computations and writing executable code, which remains challenging for existing data analysis agents. To support the systematic evaluation under this setting, we develop LongTA, a tool-augmented agent framework that enables document access, retrieval, and code execution, and evaluate a range of proprietary and open-source models. Our experiments reveal substantial performance gaps even among state-of-the-art models, highlighting the challenges researchers should consider before applying LLM agents for decision support in real-world, high-stakes analytical settings.

2026-01-05T23:23:16Z Yiyang Li Zheyuan Zhang Tianyi Ma Zehong Wang Keerthiram Murugesan Chuxu Zhang Yanfang Ye http://arxiv.org/abs/2502.06472v2 KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment 2026-01-11T02:16:42Z

Maintaining comprehensive and up-to-date knowledge graphs (KGs) is critical for modern AI systems, but manual curation struggles to scale with the rapid growth of scientific literature. This paper presents KARMA, a novel framework employing multi-agent large language models (LLMs) to automate KG enrichment through structured analysis of unstructured text. Our approach employs nine collaborative agents, spanning entity discovery, relation extraction, schema alignment, and conflict resolution, that iteratively parse documents, verify extracted knowledge, and integrate it into existing graph structures while adhering to domain-specific schema. Experiments on 1,200 PubMed articles from three different domains demonstrate the effectiveness of KARMA in knowledge graph enrichment, with the identification of up to 38,230 new entities while achieving 83.1% LLM-verified correctness and reducing conflict edges by 18.6% through multi-layer assessments.

2025-02-10T13:51:36Z 24 pages, 3 figures, 2 tables Spotlight paper of NeurIPS 2025 Yuxing Lu Wei Wu Xukai Zhao Rui Peng Jinzhuo Wang http://arxiv.org/abs/2601.00840v2 A Global Atlas of Digital Dermatology to Map Innovation and Disparities 2026-01-09T20:23:53Z

The adoption of artificial intelligence in dermatology promises democratized access to healthcare, but model reliability depends on the quality and comprehensiveness of the data fueling these models. Despite rapid growth in publicly available dermatology images, the field lacks quantitative key performance indicators to measure whether new datasets expand clinical coverage or merely replicate what is already known. Here we present SkinMap, a multi-modal framework for the first comprehensive audit of the field's entire data basis. We unify the publicly available dermatology datasets into a single, queryable semantic atlas comprising more than 1.1 million images of skin conditions and quantify (i) informational novelty over time, (ii) dataset redundancy, and (iii) representation gaps across demographics and diagnoses. Despite exponential growth in dataset sizes, informational novelty across time has somewhat plateaued: Some clusters, such as common neoplasms on fair skin, are densely populated, while underrepresented skin types and many rare diseases remain unaddressed. We further identify structural gaps in coverage: Darker skin tones (Fitzpatrick V-VI) constitute only 5.8% of images and pediatric patients only 3.0%, while many rare diseases and phenotype combinations remain sparsely represented. SkinMap provides infrastructure to measure blind spots and steer strategic data acquisition toward undercovered regions of clinical space.

2025-12-27T09:22:36Z Fabian Gröger Simone Lionetti Philippe Gottfrois Alvaro Gonzalez-Jimenez Lea Habermacher Labelling Consortium Ludovic Amruthalingam Matthew Groh Marc Pouly Alexander A. Navarini http://arxiv.org/abs/2603.19237v1 Prompt engineering for bibliographic web-scraping 2026-01-09T10:00:10Z

Bibliographic catalogues store millions of records. Computer techniques such as web-scraping allow data to be extracted efficiently and accurately. The recent emergence of ChatGPT is facilitating the development of suitable prompts that configure scraping to identify and extract information from databases. The aim of this article is to define how to use prompt engineering efficiently to elaborate a suitable data entry model, able to generate, in a single interaction with ChatGPT-4o, a fully functional web-scraper programmed in PHP and adapted to the case of bibliographic catalogues. As a demonstration example, the bibliographic catalogue of the National Library of Spain, with a dataset of thousands of records, is used. The findings present an effective model for developing web-scraping programs, assisted by AI and with the minimum possible interaction. The results obtained with the model indicate that the use of prompts with large language models (LLMs) can improve the quality of scraping by understanding specific contexts and patterns, adapting to different formats and styles of presentation of bibliographic information.

2026-01-09T10:00:10Z 26 pages, 7 Tables, 2 Figures Scientometrics, 2025, 130(7), 3433-3453 Manuel Blázquez-Ochando Juan José Prieto-Gutiérrez María Antonia Ovalle-Perandones 10.1007/s11192-025-05372-5 http://arxiv.org/abs/2503.18526v2 SciClaims: An End-to-End Generative System for Biomedical Claim Analysis 2026-01-08T17:04:29Z

We present SciClaims, an interactive web-based system for end-to-end scientific claim analysis in the biomedical domain. Designed for high-stakes use cases such as systematic literature reviews and patent validation, SciClaims extracts claims from text, retrieves relevant evidence from PubMed, and verifies their veracity. The system features a user-friendly interface where users can input scientific text and view extracted claims, predictions, supporting or refuting evidence, and justifications in natural language. Unlike prior approaches, SciClaims seamlessly integrates the entire scientific claim analysis process using a single large language model, without requiring additional fine-tuning. SciClaims is optimized to run efficiently on a single GPU and is publicly available for live interaction.

2025-03-24T10:31:31Z In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations Raúl Ortega José Manuel Gómez-Pérez 10.18653/v1/2025.emnlp-demos.11 http://arxiv.org/abs/2601.05103v1 Semantically Orthogonal Framework for Citation Classification: Disentangling Intent and Content 2026-01-08T16:48:36Z

Understanding the role of citations is essential for research assessment and citation-aware digital libraries. However, existing citation classification frameworks often conflate citation intent (why a work is cited) with cited content type (what part is cited), limiting their effectiveness in automatic classification due to a dilemma between fine-grained type distinctions and practical classification reliability. We introduce SOFT, a Semantically Orthogonal Framework with Two dimensions that explicitly separates citation intent from cited content type, drawing inspiration from semantic role theory. We systematically re-annotate the ACL-ARC dataset using SOFT and release a cross-disciplinary test set sampled from ACT2. Evaluation with both zero-shot and fine-tuned Large Language Models demonstrates that SOFT enables higher agreement between human annotators and LLMs, and supports stronger classification performance and robust cross-domain generalization compared to ACL-ARC and SciCite annotation frameworks. These results confirm SOFT's value as a clear, reusable annotation standard, improving clarity, consistency, and generalizability for digital libraries and scholarly communication infrastructures. All code and data are publicly available on GitHub: https://github.com/zhiyintan/SOFT.

2026-01-08T16:48:36Z Accepted at the 29th International Conference on Theory and Practice of Digital Libraries (TPDL 2025) Changxu Duan Zhiyin Tan 10.1007/978-3-032-05409-8_12 http://arxiv.org/abs/2601.05099v1 Multi-Disciplinary Dataset Discovery from Citation-Verified Literature Contexts 2026-01-08T16:46:06Z

Identifying suitable datasets for a research question remains challenging because existing dataset search engines rely heavily on metadata quality and keyword overlap, which often fail to capture the semantic intent of scientific investigation. We introduce a literature-driven framework that discovers datasets from citation contexts in scientific papers, enabling retrieval grounded in actual research use rather than metadata availability. Our approach combines large-scale citation-context extraction, schema-guided dataset recognition with Large Language Models, and provenance-preserving entity resolution. We evaluate the system on eight survey-derived computer science queries and find that it achieves substantially higher recall than Google Dataset Search and DataCite Commons, with normalized recall ranging from an average of 47.47% to a highest value of 81.82%. Beyond recovering gold-standard datasets, the method also surfaces additional datasets not documented in the surveys. Expert assessments across five top-level Fields of Science indicate that a substantial portion of the additional datasets are considered high utility, and some are regarded as novel for the specific topics chosen by the experts. These findings establish citation-context mining as an effective and generalizable paradigm for dataset discovery, particularly in settings where datasets lack sufficient or reliable metadata. To support reproducibility and future extensions, we release our code, evaluation datasets, and results on GitHub (https://github.com/Fireblossom/citation-context-dataset-discovery).

2026-01-08T16:46:06Z Accepted at the 25th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2025) Zhiyin Tan Changxu Duan 10.1109/JCDL67857.2025.00022 http://arxiv.org/abs/2601.05051v1 Publishing FAIR and Machine-actionable Reviews in Materials Science: The Case for Symbolic Knowledge in Neuro-symbolic Artificial Intelligence 2026-01-08T15:56:17Z

Scientific reviews are central to knowledge integration in materials science, yet their key insights remain locked in narrative text and static PDF tables, limiting reuse by humans and machines alike. This article presents a case study in atomic layer deposition and etching (ALD/E) where we publish review tables as FAIR, machine-actionable comparisons in the Open Research Knowledge Graph (ORKG), turning them into structured, queryable knowledge. Building on this, we contrast symbolic querying over ORKG with large language model-based querying, and argue that a curated symbolic layer should remain the backbone of reliable neurosymbolic AI in materials science, with LLMs serving as complementary, symbolically grounded interfaces rather than standalone sources of truth.

2026-01-08T15:56:17Z 35 pages, 11 figures Jennifer D'Souza Soren Auer Eleni Poupaki Alex Watkins Anjana Devi Riikka L. Puurunen Bora Karasulu Adrie Mackus Erwin Kessels http://arxiv.org/abs/2603.00003v1 Commitment Checklist: Auditing Author Commitments in Peer Review 2026-01-08T05:37:09Z

Peer review author responses often include commitments to add experiments, release code, or clarify content in the final paper. Yet, there is currently no systematic mechanism to ensure authors fulfill these promises. In this position paper, we present a large-scale audit of author commitments using large language models (LLMs) to compare rebuttals against camera-ready versions. Analyzing the commitments from ICLR-2025 and EMNLP-2024, we find that while a majority of promised changes are implemented, a significant share (about 25%) are not, with "missing experiments" and other high-impact items among the most frequently unfulfilled. We demonstrate that LLM-based tools can feasibly detect the promises. Finally, we propose the idea of Author Commitment Checklist, which would alert authors and organizers to unaddressed promises, increasing accountability and strengthening the integrity of the peer review process. We discuss the benefits of this practice and advocate for its adoption in future conferences.

2026-01-08T05:37:09Z Chung-Chi Chen Iryna Gurevych http://arxiv.org/abs/2601.16990v1 pyBiblioNet: a Python library for a comprehensive network-based bibliometric analysis 2026-01-07T15:57:31Z

Bibliometric analysis is a critical tool for understanding the structure, dynamics, and impact of scientific research. Traditional methods often fall short in capturing the intricate relationships and evolving trends within scientific literature. To address this gap, we present pyBiblioNet, a Python library designed to facilitate comprehensive network-based bibliometric analysis, providing insights into citation networks, co-authorship networks, and keyword co-occurrence networks. The library integrates with OpenAlex, a popular and open catalogue to the global research system, enabling users to easily preprocess, visualize, and analyse bibliometric data. Key features include topic selection, automatic data download via OpenAlex APIs, creation of the root and base sets of manuscripts to analyze, creation of the citation and co-authorship networks, network visualization tools, and a suite of algorithms for computing network centralities, clustering, and community detection, all of them tailored to the bibliometric domain. Additionally, it enables the analysis of key topics and concepts using NLP techniques. We showcase the main functions of the library by performing a bibliometric analysis on the multidisciplinary "15-minute city paradigm", demonstrating the utility of pyBiblioNet in uncovering hidden patterns and emerging trends in various scientific domains. pyBiblioNet can empower researchers, librarians, and policymakers with a powerful, user-friendly tool for enhancing their bibliometric analyses and making data-driven decisions.
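
As a rough illustration of two of the network types named in this abstract, the sketch below builds a citation network and a co-authorship network from a tiny hand-made record set, using only the Python standard library. It does not use pyBiblioNet or the OpenAlex API; the record layout (`authors`, `cites`), the sample data, and all function names are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: paper id -> authors and the papers it cites.
papers = {
    "P1": {"authors": ["A", "B"], "cites": []},
    "P2": {"authors": ["B", "C"], "cites": ["P1"]},
    "P3": {"authors": ["A", "C", "D"], "cites": ["P1", "P2"]},
}


def citation_edges(records):
    """Return directed (citing, cited) edges, keeping only papers in the set."""
    return [
        (citing, cited)
        for citing, rec in records.items()
        for cited in rec["cites"]
        if cited in records
    ]


def in_degree(edges):
    """Count incoming citations per paper (a simple centrality measure)."""
    return Counter(cited for _, cited in edges)


def coauthorship_weights(records):
    """Count how many papers each unordered author pair has written together."""
    weights = Counter()
    for rec in records.values():
        for a, b in combinations(sorted(set(rec["authors"])), 2):
            weights[(a, b)] += 1
    return weights


edges = citation_edges(papers)
citations = in_degree(edges)
coauthors = coauthorship_weights(papers)
```

In-degree is only the simplest possible centrality; the clustering and community detection mentioned in the abstract would need a graph library or the package itself.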

2026-01-07T15:57:31Z This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in Scientometrics, and is available online at https://doi.org/10.1007/s11192-025-05458-0 Mirko Lai Salvatore Vilella Federica Cena Giancarlo Ruffo 10.1007/s11192-025-05458-0 http://arxiv.org/abs/2411.02005v2 Towards a valid bibliometric measure of epistemic breadth of researchers 2026-01-07T15:57:27Z

The concept of epistemic breadth of the work of a researcher refers to the scope of their knowledge claims, as reflected in published research reports. Studies of epistemic breadth have been hampered by the lack of a validated measure of the concept. Here we introduce a knowledge space approach to the measurement of epistemic breadth and propose to use the semantic similarity network of an author's publication record to operationalize a measure. In this approach, each paper has its own location in a common abstract vector space based on its content. Proximity in knowledge space corresponds to thematic similarity of publications. Candidate measures of epistemic breadth derived from aggregate similarity values of researchers' bodies of work are tested against external validation data of researchers known to have made a major change in research topic and against self-citation data. We find that some candidate measures co-vary well with known epistemic breadth of researchers in the empirical data and can serve as valid indicators of the concept.
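
To make the knowledge-space idea concrete, here is a minimal sketch of one possible aggregate: the mean pairwise cosine distance between an author's paper vectors. The abstract does not specify which candidate measures were tested, so this particular aggregate, the two-dimensional toy vectors, and the function names are assumptions made for illustration only.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_distance(paper_vectors):
    """Average (1 - cosine similarity) over all pairs of papers.

    Higher values mean the papers are more spread out in the vector space.
    Returns 0.0 when there are fewer than two papers.
    """
    pairs = list(combinations(paper_vectors, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Hypothetical authors: one whose papers all point the same way,
# one whose papers cover different directions of the space.
narrow_author = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
broad_author = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

narrow_score = mean_pairwise_distance(narrow_author)
broad_score = mean_pairwise_distance(broad_author)
```

In practice the vectors would come from a text representation of each paper's content, and a measure like this would still need the kind of external validation the abstract describes.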

2024-11-04T11:43:48Z Paul Donner Clemens Blümel http://arxiv.org/abs/2601.03628v1 Global research trends and collaborations in Fibrodysplasia Ossificans Progressiva: A bibliometric analysis (1989-2023) 2026-01-07T06:17:04Z

Fibrodysplasia Ossificans Progressiva (FOP) is a rare and debilitating genetic disorder characterized by the progressive formation of bone in muscles and connective tissues. This scientometric analysis examines the global research trends on FOP between 1989 and 2023 using bibliographic data from Web of Science. The study highlights key patterns in publication productivity, influential journals, institutions, and the geographical distribution of research. The findings reveal that the United States leads both in terms of total publications and citation impact, with significant contributions from the UK, Italy, Japan, and other European countries. Additionally, the analysis identifies the major document types, including articles and reviews, and evaluates the collaborative efforts across institutions. The study offers valuable insights into the global research landscape of FOP, providing a foundation for future studies and international collaborations.

2026-01-07T06:17:04Z 23 pages, 4 figures, Research article Muneer Ahmad Undie Felicia Nkatv Sajid Saleem 10.47524/lipr.v7i4.66 http://arxiv.org/abs/2510.16242v2 Code Contribution and Credit in Science 2026-01-06T22:27:01Z

Software development has become essential to scientific research, but its relationship to traditional metrics of scholarly credit remains poorly understood. We develop a dataset of approximately 140,000 paired research articles and code repositories, and a predictive model that matches research article authors with software repository developer accounts. We use this dataset to investigate how software development activities influence credit allocation in collaborative scientific settings. Our findings reveal significant patterns distinguishing software contributions from traditional authorship credit. We find that ~30% of articles include non-author code contributors, that is, individuals who participated in software development but received no authorship recognition. While code-contributing authors provide a ~4.2% increase in article citations, this effect becomes non-significant when controlling for domain, article type, and open access status. First authors are significantly more likely to be code contributors than other author positions. Notably, we identify a negative relationship between coding frequency and scholarly impact metrics. Authors who contribute code more frequently exhibit progressively lower h-indices than non-coding colleagues, even when controlling for publication count, author position, domain, and article type. These results suggest a disconnect between software contributions and credit, highlighting important implications for institutional reward structures and science policy.

2025-10-17T22:17:38Z Revisions after peer-review. This is the "Accepted" version of the paper! Eva Maxfield Brown Isaac Slaughter Nicholas Weber http://arxiv.org/abs/2603.19236v1 L-PRISMA: An Extension of PRISMA in the Era of Generative Artificial Intelligence (GenAI) 2026-01-06T06:08:20Z

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework provides a rigorous foundation for evidence synthesis, yet the manual processes of data extraction and literature screening remain time-consuming and restrictive. Recent advances in Generative Artificial Intelligence (GenAI), particularly large language models (LLMs), offer opportunities to automate and scale these tasks, thereby saving time and improving efficiency. However, reproducibility, transparency, and auditability, the core PRISMA principles, are being challenged by the inherent non-determinism of LLMs and the risks of hallucination and bias amplification. To address these limitations, this study integrates human-led synthesis with a GenAI-assisted statistical pre-screening step. Human oversight ensures scientific validity and transparency, while the deterministic nature of the statistical layer enhances reproducibility. The proposed approach systematically enhances PRISMA guidelines, providing a responsible pathway for incorporating GenAI into systematic review workflows.

2026-01-06T06:08:20Z ICMET 2025 Samar Shailendra Rajan Kadel Aakanksha Sharma Islam Mohammad Tahidul Urvashi Rahul Saxena