Mr Aaron Willcox¹
¹Independent Researcher, Darwin, Australia
Biography:
Aaron Willcox is a data scientist and research software specialist with a background in academia and social research, focused on building reproducible data systems that scale. Over the past five years, his work has spanned everything from DARPA-funded reproducibility pipelines to wellbeing metrics used in national education surveys.
Aaron is drawn to both the technical and the conceptual — he enjoys thinking about systems as much as he enjoys building them. His approach blends systems thinking with practical tooling in R, version control, and containerization, enabling teams to create transparent, scalable, and reusable workflows. He’s particularly interested in Research Operations (ResOps): applying DevOps-inspired practices like CI/CD and modular workflows to improve reproducibility, reduce friction, and align research with operational best practices.
Abstract:
In the rapidly evolving landscape of data science, achieving reproducibility and transparency remains a formidable challenge. Despite the push towards open science, systemic barriers continue to impede the seamless sharing and replication of scientific knowledge. This presentation explores the integration of Research Operations (ResOps) principles, inspired by DevOps, to address these challenges. By leveraging community-driven best practices such as containerization, version control, and continuous integration, ResOps offers a framework for enhancing computational reproducibility. We will discuss the cultural and methodological shifts necessary to align institutional policies with the practical realities faced by researchers. By fostering a culture that values transparency and reproducibility, and equipping researchers with the necessary tools and training, we aim to bridge the gap between the creation and dissemination of scientific knowledge. This session will provide actionable insights and recommendations for researchers, practitioners, and policymakers dedicated to advancing reproducibility in data science.