http://arxiv.org/abs/2604.01262v1 Transforming OPACs into Intelligent Discovery Systems: An AI-Powered, Knowledge Graph-Driven Smart OPAC for Digital Libraries 2026-04-01T12:48:36Z

Traditional Online Public Access Catalogues (OPACs) are becoming less effective due to the rapid growth of scholarly literature. Conventional search methods, such as keyword indexing and Boolean queries, often fail to support efficient knowledge discovery. This paper proposes a Smart OPAC framework that transforms traditional OPACs into intelligent discovery systems using artificial intelligence and knowledge graph techniques. The framework enables semantic search, thematic filtering, and knowledge graph-based visualization to enhance user interaction and exploration. It integrates multiple open scholarly data sources and applies semantic embeddings to improve relevance and contextual understanding. The system supports exploratory search, semantic navigation, and refined result filtering based on user-defined themes. Quantitative evaluation demonstrates improvements in retrieval efficiency, relevance, and reduction of information overload. The proposed approach offers practical implications for modernizing digital library services and supports next-generation research workflows. Future work includes user-centric evaluation, personalization, and dynamic knowledge graph updates.

2026-04-01T12:48:36Z 8 pages, 4 tables, 6 figures presented at Intellib 2026 International Conference M. S. Rajeevan B. Mini Devi http://arxiv.org/abs/2604.00554v1 LLM-supported document separation for printed reviews from zbMATH Open 2026-04-01T06:54:45Z

This paper presents a specialized methodology for digitizing and segmenting mathematical documents from zbMATH Open, a comprehensive database of mathematical literature, to enhance machine processing capabilities. Currently, approximately 831,000 documents exist only in scanned volumes, which makes them not machine-processable. Furthermore, these scans often span multiple pages or share pages with other documents and incorporate diverse typesetting techniques, posing challenges for automated processing. To address these issues, we evaluate various Optical Character Recognition (OCR) tools and document separation techniques, proposing an optimized pipeline that outperforms existing approaches. Our study identifies Mathpix as the most effective OCR tool for LaTeX conversion, demonstrating superior performance based on BLEU and Edit Distance metrics. For document separation, we fine-tune generative Large Language Models (LLMs) and integrate them into a Majority Voting framework, achieving 97.5% accuracy when providing the text of the document. Additionally, our method identifies the start and end indexes for 90.6% of the test dataset, with an accuracy of 98.4% on applicable cases, resulting in an overall accuracy of 89.1% on the entire dataset. This approach surpasses traditional baselines, including regular expressions, ChatGPT-4o, and computer vision-based techniques. As a practical outcome, we process 810,977 mathematical documents into machine-readable text and extract precise document boundaries for 721,288 documents in LaTeX format. These contributions significantly improve accessibility for mathematical information retrieval systems, machine learning models, and related applications.
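
The two ideas in this abstract that lend themselves to a small illustration are majority voting over several model outputs and the way the overall boundary accuracy follows from coverage times accuracy on covered cases. The sketch below is only an illustration under stated assumptions: the function names and the example labels are hypothetical and are not taken from the paper, which does not describe its voting rule in the abstract.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent prediction.

    Ties are resolved in favour of the value seen first, which is how
    Counter.most_common orders elements with equal counts.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    return Counter(predictions).most_common(1)[0][0]


def overall_accuracy(coverage, accuracy_on_covered):
    """Accuracy over the whole dataset when uncovered cases count as misses."""
    return coverage * accuracy_on_covered


# Hypothetical outputs of three fine-tuned models for one page boundary.
votes = ["new_document", "same_document", "new_document"]
print(majority_vote(votes))

# Figures quoted in the abstract: 90.6% coverage, 98.4% accuracy on those cases.
print(round(overall_accuracy(0.906, 0.984), 4))
```

With the rounded figures from the abstract the product is about 0.8915, which is consistent with the reported 89.1% once rounding of the inputs is taken into account.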

2026-04-01T06:54:45Z Submitted to SIGIR 2026 Ivan Pluzhnikov Ankit Satpute Moritz Schubotz Olaf Teschke Bela Gipp http://arxiv.org/abs/2604.16407v1 How unique are hallucinated citations offered by generative Artificial Intelligence models? 2026-03-31T21:24:34Z

This paper investigates how generative AI produces and propagates hallucinated academic references, focusing on the recurring non-existent citation 'Education Governance and Datafication' attributed to Ben Williamson and Nelli Piattoeva. Drawing on 137 accessible source papers identified through Google Scholar and Google searches, the study analyses the structure, recurrence, and onward citation of this phantom reference. It shows that hallucinated citations are not random inventions but patterned recombinations of real authors, journals, dates, and keywords, with duplication occurring in nearly 30% of cases. The paper also reports a structured interrogation of ChatGPT 5-mini about how it generates citations and finds that, absent verification, the model reconstructs plausible references from learned patterns rather than factual recall. Finally, ten AI-generated essays on datafication and school governance were examined: while most references were genuine or partly accurate, 9.2% remained hallucinated, including an exact match to the most common phantom citation. The findings highlight ongoing risks to academic integrity and show that web-enabled AI still does not fully eliminate fabricated references.

2026-03-31T21:24:34Z Dirk HR Spennemann http://arxiv.org/abs/2604.06229v1 Discoverability matters: Open access models and the translation of science into patents 2026-03-30T09:55:26Z

Scientific research is a key input into technological innovation, yet not all scientific knowledge is equally mobilized in patents. This paper examines how different scientific publishing models shape both the selection of scientific publications cited in patents and their cognitive alignment with patented technologies. Using large-scale data on non-patent references linking patents to scientific publications, combined with metadata from OpenAlex, we compare the Open Access (OA) structure of patent-cited science to that of the scientific literature. We then assess cognitive alignment using semantic similarity between patent abstracts and the abstracts of cited publications, distinguishing between citations appearing in the front section of patents and those embedded in the body of patent texts. We find that patent citations disproportionately draw on publications disseminated through highly visible and institutionally established publishing channels, particularly hybrid and bronze OA models, indicating strong selection effects. However, this dominance in citation counts does not translate into stronger cognitive alignment with patented technologies. On the contrary, publications in fully OA journals (gold and diamond OA) exhibit equal or higher semantic proximity, especially when cited in the body of patents. These results suggest that the contribution of OA to innovation depends less on access alone than on how different publishing models are embedded in information infrastructures that shape the visibility, discoverability, and use of scientific knowledge.
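
The abstract does not say how semantic similarity is computed. A common choice, assumed here purely for illustration, is the cosine similarity between embedding vectors of the patent abstract and the cited publication's abstract. The vectors below are made-up toy values, and `cosine_similarity` is a hypothetical helper, not the authors' code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_u * norm_v)


# Toy embeddings (assumed values) for a patent abstract and a cited paper.
patent_vec = [0.2, 0.7, 0.1]
paper_vec = [0.25, 0.6, 0.05]
print(cosine_similarity(patent_vec, paper_vec))
```

Values closer to 1 would indicate closer cognitive alignment under this assumed measure.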

2026-03-30T09:55:26Z Abdelghani Maddi GEMASS Chongjun Xi GEMASS Xiaoting Chen GEMASS Isabelle Dorsch GEMASS Marc-André Simard UdeM, CIRST http://arxiv.org/abs/2603.28108v1 Quid est VERITAS? A Modular Framework for Archival Document Analysis 2026-03-30T07:14:51Z

The digitisation of historical documents has traditionally been conceived as a process limited to character-level transcription, producing flat text that lacks the structural and semantic information necessary for substantive computational analysis. We present VERITAS (Vision-Enhanced Reading, Interpretation, and Transcription of Archival Sources), a modular, model-agnostic framework that reconceptualises digitisation as an integrated workflow encompassing transcription, layout analysis, and semantic enrichment. The pipeline is organised into four stages - Preprocessing, Extraction, Refinement, and Enrichment - and employs a schema-driven architecture that allows researchers to declaratively specify their extraction objectives. We evaluate VERITAS on the critical edition of Bernardino Corio's Storia di Milano, a Renaissance chronicle of over 1,600 pages. Results demonstrate that the pipeline achieves a 67.6% relative reduction in word error rate compared to a commercial OCR baseline, with a threefold reduction in end-to-end processing time when accounting for manual correction. We further illustrate the downstream utility of the pipeline's output by querying the transcribed corpus through a retrieval-augmented generation system, demonstrating its capacity to support historical inquiry.
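
To make the "relative reduction in word error rate" concrete, the following sketch computes a word-level WER with the standard edit-distance definition and then the relative reduction between two WER values. The example sentences and the two WER values are hypothetical and chosen only so that the arithmetic reproduces a 67.6% relative reduction; the abstract does not report the absolute error rates.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` lowers `baseline`."""
    return (baseline - improved) / baseline


# One deleted word out of four reference words.
print(word_error_rate("the duke of milan", "the duke milan"))

# Hypothetical absolute WERs that would yield a 67.6% relative reduction.
print(round(relative_reduction(0.25, 0.081), 3))
```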

2026-03-30T07:14:51Z to be published in: LLMs4SSH: Shaping Multilingual, Multimodal AI for the Social Sciences and Humanities, organized within the 15th Language Resource and Evaluation Conference (2026) Leonardo Bassanini Ludovico Biancardi Alfio Ferrara Andrea Gamberini Sergio Picascia Folco Vaglienti http://arxiv.org/abs/2603.27698v1 Ink Detection from Surface Topography of the Herculaneum Papyri 2026-03-29T13:55:24Z

Reading the Herculaneum papyri is challenging because both the scrolls and the ink, which is carbon-based, are carbonized. In X-ray radiography and tomography, ink detection typically relies on density- or composition-driven contrast, but carbon ink on carbonized papyrus provides little attenuation contrast. Building on the morphological hypothesis, we show that the surface morphology of written regions contains enough signal to distinguish ink from papyrus. To this end, we train machine learning models on three-dimensional optical profilometry from mechanically opened Herculaneum papyri to separate inked and uninked areas. We further quantify how lateral sampling governs learnability and how a native-resolution model behaves on coarsened inputs. We show that high-resolution topography alone contains a usable signal for ink detection. Diminishing segmentation performance with decreasing lateral resolution provides insight into the characteristic spatial scales that must be resolved on our dataset to exploit the morphological signal. These findings inform spatial resolution targets for morphology-based reading of closed scrolls through X-ray tomography.

2026-03-29T13:55:24Z 9 pages, 3 figures, 2 tables. Currently under review Giorgio Angelotti Federica Nicolardi Paul Henderson W. Brent Seales http://arxiv.org/abs/2603.28802v1 Interactive Evidence Maps for Visualizing and Understanding Systematic Reviews 2026-03-27T21:18:14Z

Systematic reviews provide comprehensive syntheses of research fields. As a result, systematic reviews often emphasize synthesizing across large bodies of literature rather than describing the studies from which the conclusions were drawn. This risks an incomplete description of the sample, which can encourage overgeneralization of the findings, obscure connections between existing work, or overshadow gaps in the literature. To address this challenge, we introduce interactive evidence maps, an accessible visualization tool that enables researchers to explore, filter, and analyze review data dynamically. Our approach leverages large language models to extract topic models that structure heterogeneous review data into an interactive, explorable knowledge map that supports deeper inspection beyond static tables and figures. We demonstrate the usefulness of interactive evidence maps using data from a published scoping review of pedagogical agents in K-12 education, and compare the results of the evidence map to those reported in the scoping review. Results show that interactive evidence maps complement traditional syntheses by enhancing transparency, supporting exploratory analysis, and revealing patterns and gaps that may not be easy to detect through narrative summaries alone.

2026-03-27T21:18:14Z Aditi Mallavarapu Rohan Khandare Mokshagna Kadiyala Neelesh Yaddanapudi Noah L. Schroeder Shan Zhang Jessica R. Gladstone http://arxiv.org/abs/2503.07823v2 Reproducibility and Artifact Consistency of the SIGIR 2022 Recommender Systems Papers Based on Message Passing 2026-03-27T12:22:03Z

Graph-based techniques relying on neural networks and embeddings have gained attention as a way to develop Recommender Systems (RS) with several papers on the topic presented at SIGIR 2022 and 2023. Given the importance of ensuring that published research is methodologically sound and reproducible, in this paper we analyze 10 graph-based RS papers, most of which were published at SIGIR 2022, and assess their impact on subsequent work published in SIGIR 2023. Our analysis reveals several critical points that require attention: (i) the prevalence of bad practices, such as erroneous data splits or information leakage between training and testing data, which call into question the validity of the results; (ii) frequent inconsistencies between the provided artifacts (source code and data) and their descriptions in the paper, causing uncertainty about what is actually being evaluated; and (iii) the preference for new or complex baselines that are weaker compared to simpler ones, creating the impression of continuous improvement even when, particularly for the Amazon-Book dataset, the state-of-the-art has significantly worsened. Due to these issues, we are unable to confirm the claims made in most of the papers that we examined and attempted to reproduce.

2025-03-10T20:09:04Z ACM Transactions on Information Systems (2025) Maurizio Ferrari Dacrema Michael Benigni Nicola Ferro 10.1145/3772275 http://arxiv.org/abs/2603.26324v1 PRISMA: Toward a Normative Information Infrastructure for Responsible Pharmaceutical Knowledge Management 2026-03-27T11:43:49Z

Most existing approaches to AI in pharmacy collapse three epistemologically distinct operations into a single technical layer: document preservation, semantic interpretation, and contextual presentation. This conflation is a root cause of recurring fragilities including loss of provenance, interpretive opacity, alert fatigue, and erosion of accountability. This paper proposes the PATOS-Lector-PRISMA (PLP) infrastructure as a normative information architecture for responsible pharmaceutical knowledge management. PATOS preserves regulatory documents with explicit versioning and provenance; Lector implements machine-assisted reading with human curation, producing typed assertions anchored to primary sources; PRISMA delivers contextual presentation through the RPDA framework (Regulatory, Prescription, Dispensing, Administration), refracting the same informational core into distinct professional views. The architecture introduces the Evidence Pack as a formal unit of accountable assertion (versioned, traceable, epistemically bounded, and curatorially validated), with assertions typified by illocutionary force. A worked example traces dipyrone monohydrate across all three layers using real system data. Developed and validated in Brazil's regulatory context, the architecture is grounded in an operational implementation comprising over 16,000 official documents and 38 curated Evidence Packs spanning five reference medications. The proposal is demonstrated as complementary to operational decision support systems, providing infrastructural conditions that current systems lack: documentary anchoring, interpretive transparency, and institutional accountability.

2026-03-27T11:43:49Z 52 pages, 3 figures, 71 references Eugenio Rodrigo Zimmer Neves Amanda Vanon Correa Camila Campioni Gabielli Pare Guglielmi Bruno Morelli http://arxiv.org/abs/2604.16387v1 Large language models for post-publication research evaluation: Evidence from expert recommendations and citation indicators 2026-03-27T11:32:18Z

Assessing the quality of scientific research is essential for scholarly communication, yet widely used approaches face limitations in scalability, subjectivity, and time delay. Recent advances in large language models (LLMs) offer new opportunities for automated research evaluation based on textual content. This study examines whether LLMs can support post-publication peer review tasks by benchmarking their outputs against expert judgments and citation-based indicators. Two evaluation tasks are constructed using articles from the H1 Connect platform: identifying high-quality articles and performing finer-grained evaluation including article rating, merit classification, and expert-style commenting. Multiple model families, including BERT models, general-purpose LLMs, and reasoning-oriented LLMs, are evaluated under multiple learning strategies. Results show that LLMs perform well in coarse-grained evaluation tasks, achieving accuracy above 0.8 in identifying highly recommended articles. However, performance decreases substantially in fine-grained rating tasks. Few-shot prompting improves performance over zero-shot settings, while supervised fine-tuning produces the strongest and most balanced results. Retrieval-augmented prompting improves classification accuracy in some cases but does not consistently strengthen alignment with citation indicators. The overall correlations between model outputs and citation indicators remain positive but moderate.

2026-03-27T11:32:18Z Mengjia Wu Yi Zhang Robin Haunschild Lutz Bornmann http://arxiv.org/abs/2603.16816v2 WildDepth: A Multimodal Dataset for 3D Wildlife Perception and Depth Estimation 2026-03-27T00:58:50Z

Depth estimation and 3D reconstruction have been extensively studied as core topics in computer vision. Starting from rigid objects with relatively simple geometric shapes, such as vehicles, research has expanded to general objects, including challenging deformable objects such as humans and animals. For animals in particular, however, most existing models are trained on datasets without metric scale, even though metric scale can help validate image-only models. To address this limitation, we present WildDepth, a multimodal dataset and benchmark suite for depth estimation, behavior detection, and 3D reconstruction, covering diverse categories of animals in domestic to wild environments with synchronized RGB and LiDAR. Experimental results show that the use of multi-modal data improves depth reliability by up to 10% RMSE, while RGB-LiDAR fusion enhances 3D reconstruction fidelity by 12% in Chamfer distance. By releasing WildDepth and its benchmarks, we aim to foster robust multimodal perception systems that generalize across domains.

2026-03-17T17:19:43Z Muhammad Aamir Naoya Muramatsu Sangyun Shin Matthew Wijers Jia-Xing Zhong Xinyu Hou Amir Patel Andrew Loveridge Andrew Markham http://arxiv.org/abs/2603.19238v2 lit-tag: An app for adding custom tags and notes to a citation database 2026-03-26T23:47:19Z

To facilitate the review, evaluation and analysis of scientific literature, the lit-tag R Shiny application provides a convenient interface for users to generate a citation database with custom, user-defined tags and notes. Lit-tag is not subject-specific and is useful for any field of research. Starting with a table of citations exported from a Zotero library and a user-generated Excel file describing a set of tags and notes fields, lit-tag provides tools for assigning tags and notes to papers ("lit-tag-builder" module) and for exporting, graphing, and generating reports from the resulting database ("lit-tag-viewer" module). The app fills a need not met by the limited tagging tools available in bibliographic software and does not require database programming skills.

2026-01-12T23:08:17Z submitted to PLOS ONE Paul McElhany Kalina Grabb Maddison Wood http://arxiv.org/abs/2603.25468v1 Improving metadata flows - The simultaneous use of multiple metadata schemas at disciplinary research data repositories 2026-03-26T14:10:57Z

This study investigates the simultaneous use of multiple metadata schemas at research data repositories. The analysis covers how eight disciplinary research data repositories from the geosciences and social sciences use disciplinary metadata schemas and the DataCite Metadata Schema, and how two metadata records describing the same dataset compare. The results show that DataCite metadata records could be improved considerably by optimizing schema crosswalks. However, the parallel use of disciplinary and multidisciplinary metadata records is complex. For example, discipline has a significant effect on the completeness of DataCite metadata. A temporal analysis also highlights that metadata workflows are diverse, and in some cases, suboptimal crosswalks are likely not the sole cause of incomplete DataCite metadata. Comparing the disciplinary metadata schemas and the DataCite Metadata Schema on a structural level reveals that most differences between schemas are the result of different approaches to modelling statements about datasets, not the lack of opportunity to express them. The element sets of both disciplinary metadata schemas and the DataCite Metadata Schema could be extended to describe datasets in more detail. These observations demonstrate that disciplinary and multidisciplinary metadata schemas serve distinct purposes. Disciplinary repositories should take full advantage of the opportunities both options provide.

2026-03-26T14:10:57Z Dorothea Strecker http://arxiv.org/abs/2603.25349v1 Reinforcing Prestige: Journal Citation Biases in Astronomy 2026-03-26T11:54:06Z

Citations are essential for recognizing scientific contributions, yet citation behavior is shaped by more than just relevance or quality. We analyzed approximately 255,000 refereed astronomy articles published between 2000 and 2025 to investigate how journals are cited relative to their publication volume and authorship context. We find that multidisciplinary journals receive disproportionately more citations, up to nine times higher than their share of articles, while field-specific journals are cited less frequently in proportion to their output. Citations to a journal also increase significantly when authors publish within it, a bias particularly pronounced in multidisciplinary journals. Although this effect has declined over the past decade, it remains notable. These patterns likely arise from a combination of topical clustering, institutional/individual publishing habits, and strategic referencing to align with editorial expectations. Our findings reveal persistent structural biases in scientific visibility and suggest that citation-based metrics should be used with greater awareness of the publishing context they reflect. We encourage authors, reviewers, and editors to remain mindful of these dynamics and strive for fairness and inclusivity when selecting references.
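
The phrase "up to nine times higher than their share of articles" describes a ratio between a journal's share of citations and its share of published articles. A minimal sketch of that ratio follows; the journal names and counts are invented for illustration, and the function is a hypothetical helper rather than the authors' actual procedure.

```python
def citation_share_ratio(articles, citations):
    """Map each journal to (share of citations) / (share of articles).

    A value above 1 means the journal is cited more than its publication
    volume alone would suggest; below 1 means it is cited less.
    """
    total_articles = sum(articles.values())
    total_citations = sum(citations.values())
    if total_articles == 0 or total_citations == 0:
        raise ValueError("totals must be positive")
    ratios = {}
    for journal, n_articles in articles.items():
        if n_articles == 0:
            continue  # ratio is undefined for a journal with no articles
        citation_share = citations.get(journal, 0) / total_citations
        article_share = n_articles / total_articles
        ratios[journal] = citation_share / article_share
    return ratios


# Invented counts: one multidisciplinary and one field-specific journal.
articles = {"Multidisciplinary J": 10, "Field J": 90}
citations = {"Multidisciplinary J": 50, "Field J": 50}
print(citation_share_ratio(articles, citations))
```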

2026-03-26T11:54:06Z This manuscript presents the full version of a shorter Research Note submitted to RNAAS Vardan Adibekyan Olivier Demangeon Tiago Campante Nuno Santos Susana Barros Artur Hakobyan http://arxiv.org/abs/2603.25761v1 A Survey of OCR Evaluation Methods and Metrics and the Invisibility of Historical Documents 2026-03-26T02:52:28Z

Optical character recognition (OCR) and document understanding systems increasingly rely on large vision and vision-language models, yet evaluation remains centered on modern, Western, and institutional documents. This emphasis masks system behavior in historical and marginalized archives, where layout, typography, and material degradation shape interpretation. This study examines how OCR and document understanding systems are evaluated, with particular attention to Black historical newspapers. We review OCR and document understanding papers and benchmark datasets published between 2006 and 2025, following the PRISMA framework. We examine how the studies report training data, benchmark design, and evaluation metrics for vision transformer and multimodal OCR systems. We find that Black newspapers and other community-produced historical documents rarely appear in reported training data or evaluation benchmarks. Most evaluations emphasize character accuracy and task success on modern layouts. They rarely capture structural failures common in historical newspapers, including column collapse, typographic errors, and hallucinated text. To put these findings into perspective, we use previous empirical studies and archival statistics from significant Black press collections to show how evaluation gaps lead to structural invisibility and representational harm. We propose that these gaps arise from organizational (meso) and institutional (macro) behaviors and structures, shaped by benchmark incentives and data governance decisions.

2026-03-26T02:52:28Z This manuscript is the author's submitted version to the ACM Conference on Fairness, Accountability, and Transparency (FAccT 2026). Please cite the final published version via ACM Digital Library when available Fitsum Sileshi Beyene Christopher L. Dancy