Automating Trust: using machine learning and natural language processing to bolster better research and enhance trust in science

Dr Leslie McIntosh1

1Ripeta, Cambridge, U.S.A.

The COVID-19 pandemic has highlighted the need for scientific quality checks. Rapid review, preprint services, and other changes have expedited scientific innovation, but they have been accompanied by a growth in bad actors and, with them, bad science. We will discuss how trust in research is determined and how scanning for key indicators can bolster scientific integrity and research quality.

Leveraging natural language processing (NLP) to identify and extract key pieces of text from scientific articles, our models scan for seed phrases classified into indicators of reproducibility, professionalism, or research. Our team’s extensive groundwork included identifying and defining scientific quality indicators and reviewing thousands of papers to map where these criteria typically appear. The resulting models are highly precise, and Ripeta retrains them to respond to emerging challenges or to uphold more stringent standards.
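The seed-phrase scan described above can be illustrated with a minimal sketch. It is not Ripeta's implementation: the phrase lists, the indicator names, and the function `scan_indicators` are hypothetical, and plain substring matching stands in for the trained NLP models the abstract refers to.

```python
import re

# Hypothetical seed phrases per indicator. The phrase lists actually used
# by Ripeta are not given in the abstract; these are illustrative only.
SEED_PHRASES = {
    "funding_statement": ["funded by", "supported by", "grant"],
    "ethical_approval": ["institutional review board", "ethics committee",
                         "ethical approval"],
    "data_availability": ["data are available", "data availability",
                          "available upon request"],
    "code_availability": ["code is available", "source code", "github.com"],
}


def scan_indicators(text):
    """Return {indicator: [sentences]} for every indicator with a match."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    found = {}
    for sentence in sentences:
        lowered = sentence.lower()
        for indicator, phrases in SEED_PHRASES.items():
            # Case-insensitive substring match against each seed phrase
            if any(phrase in lowered for phrase in phrases):
                found.setdefault(indicator, []).append(sentence)
    return found


if __name__ == "__main__":
    sample = ("This work was supported by the NSF. "
              "Data are available upon request. "
              "We thank our colleagues.")
    for indicator, hits in scan_indicators(sample).items():
        print(indicator, hits)
```

Substring matching of this kind will over-match (for example, "grant" also matches "granted"), which is one reason a production system would rely on trained and periodically retrained models rather than fixed phrase lists.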

To date, we have developed quality indicators for authorship verification, funding statements, ethical approval, data availability, and code availability, to name a few. Alone and in combination, they indicate levels of trust. Data sharing alone shows significant variability: researchers share data within their papers, upon request, through a repository (recommended), or not at all. Researchers use a repository less than 15% of the time and most frequently state ‘data available upon request’.
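As a rough illustration of the four data sharing categories, the sketch below assigns a data availability statement to one of them using ordered rules. The regular expressions, the label names, and the function `classify_data_sharing` are assumptions made for this example, not the method used in the work described; a statement that matches no rule is labelled `not_stated`, which only approximates "not at all".

```python
import re

# Hypothetical rules, checked in order; the first match wins.
# Repository mentions are checked first because a statement may name a
# repository and also offer data on request.
DATA_SHARING_RULES = [
    ("repository",
     re.compile(r"\b(repository|zenodo|dryad|figshare|osf\.io|doi\.org)\b",
                re.IGNORECASE)),
    ("upon_request",
     re.compile(r"\b(up)?on (reasonable )?request\b", re.IGNORECASE)),
    ("within_paper",
     re.compile(r"\b(within|in) the "
                r"(paper|article|manuscript|supplementary|supporting)\b",
                re.IGNORECASE)),
]


def classify_data_sharing(statement):
    """Return the first matching category label, or 'not_stated'."""
    for label, pattern in DATA_SHARING_RULES:
        if pattern.search(statement):
            return label
    return "not_stated"


if __name__ == "__main__":
    examples = [
        "All data are deposited in Zenodo.",
        "Data are available upon request from the corresponding author.",
        "All relevant data are within the paper and its supporting files.",
        "",
    ]
    for text in examples:
        print(repr(text), "->", classify_data_sharing(text))
```

Counting the labels over a corpus of statements would give category shares comparable in kind to the repository and 'upon request' figures quoted above, although this sketch makes no claim to reproduce them.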

Attendees will gain a mix of automated and practical strategies to quickly identify and validate research, authorship, and data. These strategies will benefit attendees who reuse research, and may also be applied to their own research reporting practices.



Dr. Leslie D. McIntosh is the founder and CEO of Ripeta, a company formed to improve scientific research quality and reproducibility. She served as the executive director for the Research Data Alliance (RDA) – US region and as the Director of the Center for Biomedical Informatics at Washington University School of Medicine in St. Louis. Dr. McIntosh has dedicated her recent work to improving science: since 2014, she has focused first on highlighting the need for reproducible science, then on transparent reporting of science, and now on the need to build trust in science.


Oct 14 2021


3:30 pm - 3:50 pm
