
STM Integrity Hub

Issue #78

Data, Numbers

by Michael Seadle

Joris van Rossum wrote a guest post for the Scholarly Kitchen on 23 May 2024. His topic was the STM Integrity Hub, which was launched in May 2022 by a number of major scholarly “publishers including Elsevier, Frontiers, Springer Nature, and Wiley”. He writes that the integrity problems are getting worse: “Last year saw a record number of retractions and publishers continue to receive submissions from papermills. Generative AI compounds those problems by making research misconduct easier through the fabrication of data, images and text. What is particularly concerning is that such fabricated data are largely undetectable with current tools, which are based on the detection of duplication and manipulation of existing content …”.¹ On the positive side he reports that “STM Integrity Hub [now has] more than 35 supporting organizations (including all the major submission systems) and over 100 people participating in working groups, task forces, and governance structure — making it a truly community-driven initiative.”¹  


The hub began with two specific services: a “Paper Mill Checker Tool [that] was the first service that we made available to publishers” and a “Duplicate Submission Checker Tool [that] started in the form of a pilot in October of last year.”¹ Van Rossum praises the results of the latter: “With a throughput of 20K manuscripts per month, the detection rate of a duplicate submission is over 1% … The duplicate submission checker is now checking for duplicates on the level of metadata, but we are moving to full text this year, making use of a technology developed by one of the participating publishers.”¹


Paper mills and duplicate submissions are certainly problems, especially for publishers. What is missing from this Scholarly Kitchen post is any specific discussion of how to detect fake data, whether created by generative AI or by more conventional means. Fake data damages scientific results more than duplicate papers do. Dealing with the paper mills could help, because their editorial control over data is typically poor. Even so, the STM Integrity Hub focuses on the problems that publishers see, rather than on the quality concerns of scholars. Perhaps its next iteration will go deeper into data and analysis problems.


1:  Rossum, Joris van. “Guest Post: The STM Integrity Hub - Connecting the Dots in a Dynamic Landscape.” The Scholarly Kitchen, May 23, 2024.


