Trends in use of scientific workflows in different science disciplines


Siddeswara Guru1, David Abramson2, Ewa Deelman3, Igor Makunin2, Hoang Nguyen2, Manish Parashar4, Damien Watkins5

1 Terrestrial Ecosystem Research Network, Brisbane, Australia

2 University of Queensland, Brisbane, Australia

3 USC Information Sciences Institute, CA, USA

4 Rutgers University, NJ, USA

5 Data61 CSIRO, Melbourne, Australia


A scientific workflow is a series of well-defined, coordinated, structured activities that enable scientific problem solving [1]. Scientific workflows have gained wide popularity in the research community for their ability to:

  • describe, manage, share and execute scientific analyses;
  • provide a high-level abstract view of scientific computation while hiding the underlying details;
  • interface with distributed computing environments;
  • capture a complete workflow as an artefact and make it a reusable entity [2];
  • capture provenance information for further analysis and knowledge re-use.

Several Scientific Workflow Management Systems (SWMS) are used across science disciplines, and some of them are very discipline-centric [3]. The basic functions of an SWMS are to design, create, execute and manage workflows. Apart from providing an execution environment for workflows, SWMS support additional functionality such as run-time execution monitoring and reporting, and workflow provenance.
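The functions listed above (execute, monitor, capture provenance) can be illustrated with a minimal Python sketch. It is not taken from Kepler, Galaxy, Workspace or any other SWMS; the function `run_workflow`, the task names and the provenance fields are all hypothetical and chosen only for illustration. It assumes a workflow can be modelled as a set of tasks with dependencies, run in dependency order.

```python
import time
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, dependencies):
    """Run tasks in dependency order and record simple provenance.

    tasks: dict mapping a task name to a function that takes a dict of
           upstream results (keyed by task name) and returns a value.
    dependencies: dict mapping a task name to the set of task names it needs.
    (Both structures are hypothetical, not the format of any real SWMS.)
    """
    results = {}
    provenance = []
    # static_order() yields each task only after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        inputs = {dep: results[dep] for dep in dependencies.get(name, ())}
        started = time.time()
        results[name] = tasks[name](inputs)
        # Record which task ran, what it consumed and what it produced.
        provenance.append({
            "task": name,
            "inputs": sorted(inputs),
            "output": results[name],
            "seconds": time.time() - started,
        })
    return results, provenance


# Hypothetical three-step analysis: load -> clean -> summarise.
tasks = {
    "load": lambda inputs: [3.0, None, 5.0, 4.0],
    "clean": lambda inputs: [x for x in inputs["load"] if x is not None],
    "summarise": lambda inputs: sum(inputs["clean"]) / len(inputs["clean"]),
}
dependencies = {"load": set(), "clean": {"load"}, "summarise": {"clean"}}

results, provenance = run_workflow(tasks, dependencies)
for record in provenance:
    print(record["task"], "<-", record["inputs"])
```

A real SWMS adds much more on top of this idea, for example scheduling on distributed resources, failure handling and persistent provenance stores; the sketch only shows why an explicit task graph makes execution order and provenance straightforward to derive.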

In this BoF, we will present an overview of some of the SWMS (e.g., Kepler, Galaxy, Workspace) used in different science disciplines. We will then run an interactive Q&A panel discussion covering the motivations and use cases for scientific workflows, how to choose the right tool for a particular application, how to build a community around a workflow management system, and future activities for better coordination in tool development.

BoF Details:

  • The BoF will have short presentations on the Kepler, Galaxy and Workspace workflow management systems used in different science disciplines, followed by a panel discussion. The panel will consist of David Abramson, Ewa Deelman and the presenters. The BoF will conclude with a closing talk from Manish Parashar.
  • The BoF session will run for 90 minutes. The first 40 minutes are allocated to an introduction to the BoF and the short presentations, the next 30 minutes to the panel discussion and the final 20 minutes to the closing talk.


  1. Talia, D., Workflow Systems for Science: Concepts and Tools. ISRN Software Engineering, 2013.
  2. Guru, S.M., I.C. Hanigan, H.A. Nguyen, E. Burns, J. Stein, W. Blanchard, D.B. Lindenmayer, and T. Clancy, Development of a cloud-based platform for reproducible science: the case study of IUCN Red List of Ecosystems Assessment. Ecological Informatics, 2016.
  3. Liu, J., E. Pacitti, P. Valduriez, and M. Mattoso, A Survey of Data-Intensive Scientific Workflow Management. Journal of Grid Computing, 2015. 13(4): p. 457-493.


Siddeswara Guru is a Data Science Director for the Terrestrial Ecosystem Research Network (TERN). He initiates, coordinates and manages ecological data, e-infrastructure and synthesis projects, and oversees data and information management activities across TERN.