A Checklist and Suggestions to Enhance Reproducibility of Deep Learning Experiments

Dr Jeaneth Machicao, Dr Ali Ben Abbes, Dr Alison Specht, Mr Leandro Meneguzzi, Prof Pedro Luiz Corrêa, Dr Romain David, Mr Danton Vellenich, Dr Rodolphe Devillers, Ms Shelley Stall

1Escola Politécnica da Universidade de São Paulo (EPUSP), Sao Paulo, Brazil, 2FRB-CESAB, Montpellier, France, 3TERN, University of Queensland, Indooroopilly, Australia, 4European Research Infrastructure on Highly Pathogenic Agents, AISBL, Brussels, Belgium, 5Espace-Dev (IRD-UM-UG-UR-UA-UNC), Montpellier, France, 6American Geophysical Union, Washington DC, USA

Background

In recent years, efforts to develop good research practices have accelerated, and issues of reproducibility and replicability (R&R) in computer science experiments have gained attention. The reproducibility of deep learning (DL) experiments is a particular challenge because of the complexity of the approaches used. Without DL technology, however, it would be impossible to solve problems that require the acquisition and analysis of enormous amounts of data across space and time. Future users could greatly benefit from the experience of creating and running such experiments, so the challenges to the reproducibility of DL experiments should be overcome.

Methods

To investigate the nature of DL reproducibility challenges in detail and to find solutions that might improve reproducibility, we assessed three DL studies that analysed visual elements from satellite and street imagery to estimate poverty levels in human settlements. We systematically detailed difficulties with the datasets, methods, and workflows of each experiment and identified actions to mitigate common problems. Our target audience includes researchers who want to replicate an experiment, authors who want to report on an experiment, and reviewers who want to evaluate DL experiments.

Results and Conclusion

As a result of this assessment, we have developed two approaches to improve the reproducibility and replicability of DL experiments: (i) a quality checklist incorporating, inter alia, the relevant FAIR principles, and (ii) guidelines to reduce the likelihood of wasted effort when deciding whether an experiment can feasibly be reproduced.


Biography:

Over the last 10 years Alison Specht has focussed on facilitating interdisciplinary groups to tackle complex environmental problems using existing data. Her major interests, apart from her domain activities, are improving data management and preservation, and data sharing and re-use. From 2009 to 2014 she established the first synthesis centre in the southern hemisphere, the Australian Centre for Ecological Analysis and Synthesis, a facility of the Terrestrial Ecosystem Research Network (www.tern.org.au). From 2015 to 2018 she was Director of FRB-CESAB, the CEntre for the Synthesis and Analysis of Biodiversity in France. Alison was a member of a DataONE (www.dataone.org) Working Group in the USA from 2010 to 2019. She is a member of several RDA interest groups. She is a co-leader of PARSEC (www.parsecproject.org), a multi-national project funded under the Belmont Forum CRA for Science-driven e-infrastructures innovation for the enhancement of transnational, interdisciplinary and transdisciplinary data use in environmental change research.
