Professor Debra J. Searles1,2, Jill Penridge2, Helen Morgan3
1Centre for Theoretical and Computational Molecular Science, Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, Australia,
2The Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, Australia,
3Learning and Research Services, The University of Queensland, Brisbane, Australia
“It is not sufficient simply to focus on exposing, collecting, storing, and sharing data in the raw. It is what you do with it (and when) that counts” JISC report – The future of data-driven decision making
People often talk about the research data lifecycle from the perspective of the data that flows through it, in a clean and linear way… planning, collecting, analysing, publishing… is the journey the data goes on. Neglected are the stories from the people involved in making this work. These stories are often compelling and full of challenges and controversy. To get data to flow neatly as it is curated through the data lifecycle requires hard work and persistence on the part of many different stakeholders. In this paper, we hear from three of the university stakeholders who are very much at the forefront of research and who are, each in their own way, driving towards best-practice research data management.
Modern research faces existential dilemmas, key among them the growing lack of confidence in both the outcomes of, and the need for, fundamental research as a societally valuable activity deserving of resources. This increasing uncertainty is being driven simultaneously by scandals around reproducibility, the commodification of research and the politics of science denialism. These pressures come at the same time as a significant generational shift in the research and research support workforces.
THE RESEARCHER PERSPECTIVE
For most researchers, the challenges associated with data are growing: the capabilities to generate and store it are expanding, the pressure to publish a high volume of research output rapidly is increasing, and formal requirements associated with data storage, reproducibility and integrity are being enforced. Certainly some research is not possible without fast storage and access to huge volumes of data, and for researchers in high performance computing this is vital. However, the availability of resources also makes it easier to store data that is not required, and ultimately more difficult to locate appropriate data when it is needed.
Many new resources are becoming available to assist with data management, storage and availability. However, as with any new technology, uptake can be slow, both because processes and culture within a research group need to change and because technologies, software systems and storage systems have a history of becoming outdated. Data carefully filed and stored on Zip drives and other outdated technologies becomes almost useless. Challenges associated with developing data storage processes within a research group that are practical, versatile, enduring and suitable need to be overcome, and this will be discussed.
For a researcher in high performance computing, digital data storage has always been a key issue. Today it is becoming important to the broader research community, which brings associated opportunities that will also be highlighted.
THE TRANSLATIONAL RESEARCH FOCUSSED INSTITUTE PERSPECTIVE
Data is the lifeblood of the basic research from which scientific discoveries are generated. In translational research, the reliability of data is equally or more important, as it is the justification for the commercial value that sponsors ascribe to the research in exchange for their financial and reputational support.
At an operational level, the data-related challenges to be addressed in the current research environment are varied. The existing cohort of senior academics should be encouraged to recognize the practicalities of data as a resource that needs to be sustainably managed and preserved to ensure business continuity risks are addressed. Incentives should be introduced to motivate recognition of the veracity and provenance of data as the underpinning source of value, and therefore resources, to industry.
In particular, translational research also requires pragmatic navigation of pre-“big data” university and government policies that are influenced by entrenched institutional views of how research was undertaken in the past. One final area of potential focus is the need for increased IT literacy among the senior executive of universities, who may not have recent experience with IT in a research context, to improve institutional decision-making.
Solutions will be proposed to address these challenges, but ultimately the necessary changes must be backed by effective cultural support at all levels of modern universities to have a positive impact.
THE WHOLE-OF-INSTITUTION PERSPECTIVE
Institutions face challenges in this space that are driven by a rapidly moving national and global landscape. Funders and journals are increasingly requiring that research data be provided on publication or on completion of project work.
This requires consistency with local institutional policy, and continued development of governance and systems to meet varying needs across the institution. Making compliance easier for researchers, without policing or added administrative burden, requires the development of novel workflows. These have to be designed and implemented in consultation with both researchers and operations managers.
Institutions have much to gain by building reputations as trusted providers of data. The current swing towards innovation, and the emphasis on collaborating with industry to solve real-world problems, means that data management is now more than ever a priority activity. The ability to validate, verify and reproduce research outputs, and to have a clear audit trail from published results to the underlying raw data, is a clear advantage for any organization.
CONCLUSION – GOAL OF RESEARCH INTEGRITY, BEST-PRACTICE AT ALL LEVELS, TRUSTED DATA AND RESEARCH OUTPUTS
Good data management throughout the data lifecycle requires significant engagement with the processes, systems and people at all levels. What this looks like can differ considerably, and it presents different challenges to everyone involved. Ultimately, good curation of data happens when there is a strong culture and a shared understanding among researchers of the value created.
The ways in which these three perspectives intersect will be discussed and common points of friction highlighted. The practical implications will be considered through the lens of researchers, as the effectiveness of their activities is the primary source of research success.