TURNING BIG SHIPS (DATA) ON A DIME: CHANGE MANAGEMENT AND DATA SENTENCING

Convenor: Rhys Francis

Presenters: Mr Dave Connell1, Mrs Sandra Ennor2, Mr Nicholas McPhee4, Ms Jaye Weatherburn5

1Australian Antarctic Division, Kingston, Australia, dave.connell@aad.gov.au

2Monash University, Clayton, Australia, Sandra.Ennor@monash.edu

4Monash University, Clayton, Australia, Nicholas.McPhee@monash.edu

5University of Melbourne, Parkville, Australia, jaye.weatherburn@unimelb.edu.au

 

Achieving data management goals involves change management. For a university, a growing challenge is the shift from ‘data storage is cheap – use as much as you need’ to ‘some cold data storage options are cheap, but high-end data storage is not so cheap, and it is not unlimited. Therefore we need to assign some terms and conditions – especially instigating sentencing regimes’. These are not easy conversations, and they often require deep cultural change across the board.

The aim is for the convenor to open the session with a few key questions (5 mins), including:

  • What is sentencing and why is it needed at all?
  • What incentives and techniques can be used to help researchers become more aware of sentencing requirements for data and how this can assist them?
  • What are the most effective strategies to assist researchers in capturing metadata at the start of the process?
  • Before technical tools are implemented, what are some of the main issues in gaining traction for data to be effectively sentenced and managed over time?

The convenor will then introduce the three sets of speakers. Each speaker represents a different aspect of data management, and each stage has often involved difficult conversations.

1)    Dave Connell – Australian Antarctic Division – Commonwealth of Australia (10 mins) – How to easily capture metadata – a government perspective

Although the working environment of the federal government is different to that of a university, there are a number of similarities when it comes to managing data. First and foremost of those is the need to capture metadata. At the Australian Antarctic Division our remit is primarily the management of scientific data, and over the course of two decades we have experimented with several methods of metadata collection, ranging from the unfortunately complex to the ludicrously simple. Alongside that has been the need to bring about a change in cultural attitudes towards data management and archiving. This presentation will focus on what has and hasn’t worked with regard to metadata capture, and on how other factors have assisted data management at the AAD.

2)    Sandra Ennor/Nick McPhee/Cath Nicholls – Monash University – eSolutions/MAWG/MeRC (10 mins) – Research Data Management and University Records Management – Reflections on what we have learnt and applied

Monash University (via its Monash Agent Working Group, a collaboration of Library, MeRC, eSolutions and Records and Archives) has been actively sentencing legacy electronic research data for well over a year now. The initial results have been positive, but there have been many lessons learnt during these early stages. This presentation will focus on two small case studies undertaken to date on two different sets of electronic research data. We will reflect on how well (or not) some of the traditional corporate records management activities (e.g. sentencing data) have translated across into managing research data. In particular, the presentation will focus on the role of metadata, the triggers for applying sentencing actions, and the key communication and change management techniques being applied.
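As a rough illustration of how metadata and a trigger can drive a sentencing action, the following Python sketch assigns each dataset a retention class and treats the project end date as the trigger. The class names, retention periods and the `Dataset` and `sentence` names are all hypothetical and are not taken from any Monash schedule; in practice the values would come from an institution's own retention and disposal authority.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical retention classes and periods in years (illustrative only).
RETENTION_YEARS = {"short_term": 5, "long_term": 15}


@dataclass
class Dataset:
    name: str
    retention_class: str          # "short_term", "long_term" or "permanent"
    project_end: Optional[date]   # the trigger; None while the project is active


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def sentence(ds: Dataset, today: date) -> str:
    """Return the sentencing action for one dataset as of `today`."""
    if ds.retention_class == "permanent":
        return "retain permanently"
    if ds.project_end is None:
        # Without the trigger date, no retention clock has started.
        return "no trigger yet: keep on active storage"
    review = add_years(ds.project_end, RETENTION_YEARS[ds.retention_class])
    if today >= review:
        return f"due for disposal review (since {review.isoformat()})"
    return f"retain until {review.isoformat()}"


if __name__ == "__main__":
    today = date(2018, 6, 4)
    for ds in [
        Dataset("legacy-survey", "short_term", date(2010, 6, 30)),
        Dataset("imaging-archive", "long_term", date(2010, 6, 30)),
        Dataset("active-project", "short_term", None),
    ]:
        print(ds.name, "->", sentence(ds, today))
```

The sketch makes one point concrete: without the metadata fields (retention class and trigger date) no sentence can be computed, which is one reason capturing metadata early matters.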

3)    Digital Preservation Strategy – University of Melbourne (10 mins)

Since 2016, the University of Melbourne (UoM) has been actively investing time and resources towards the establishment and implementation of digital preservation (“the series of managed activities necessary to ensure continued access to digital materials for as long as necessary” [1]) through a dedicated digital preservation project. A high-level strategy [2] and complementary roadmaps [3] have guided various project work to date, including training and skills framework development, infrastructure pilot projects, and cultural awareness improvements. Phase 1 of the UoM Digital Preservation Project concluded at the end of 2017, with much successful analysis to draw into planning a business case going forward. The business case for Phase 2 (the Implementation/Embedding project phase), including a draft preservation service architecture for the university, is currently under development.

Analysis work has shown that cultural change is essential, both to increase awareness of the importance of digital preservation and for organisations to invest in maintaining the value of digital materials over time. There is a clear need for digital preservation in different research disciplines, and for high-value data in active use. Active preservation processes must be initiated and maintained, especially for digital research material with complex dependencies (e.g. cloud, distributed data, proprietary software/hardware, complex copyright and IP), to ensure materials remain accessible and reproducible. Governance and data stewardship planning continue to be a key focus for 2018 to meet this cultural change requirement. The creation of a digital-preservation-focused wiki, bringing together disparate knowledge sources around data curation, is also a work in progress, and supports the collaborative cultural shift that is required for managing valuable digital assets over time. Iterative, agile approaches have been essential both to drive change and to build acceptance of the new processes and capabilities that digital preservation brings to an organisation.

The goal is then to allow another 20 mins for open discussion with the audience to tackle some of the questions raised at the top. Although the three institutions represented above are tackling different elements of the data management issue, when the three are placed together there is a nice holistic vision of what might be an ideal future state for the larger institutions dealing with diverse and rapidly expanding research legacy data. In particular, three themes emerge: capturing the metadata well (and early); applying sentencing and actively moving and deleting data as sentences dictate; and having an active digital preservation strategy. However, while all of these themes are desirable, achieving buy-in and relevance with the central audience (i.e. the research community itself) is not necessarily a given or easy. Our job is to make it easier for our researchers, but part of that involves having the hard chat. We hope this BoF will be an opportunity to share ideas on achieving some of these communication and change management outcomes.

REFERENCES

  1. Digital Preservation Coalition, Digital Preservation Handbook, Glossary: “Digital Preservation”, accessed 4 June 2018, https://www.dpconline.org/handbook/glossary#D
  2. University of Melbourne Digital Preservation Strategy 2015-2025 – Vision Mandate and Principles, accessed 4 June 2018, http://hdl.handle.net/11343/45135
  3. University of Melbourne Digital Preservation Strategy 2015-2025 – Implementation Roadmaps, accessed 4 June 2018, http://hdl.handle.net/11343/45136

Biographies:

Rhys Francis – Rhys spent the first decade of his career as an academic researcher in parallel and distributed computing. The next decade and a half included roles as a senior principal researcher, research programme manager and strategic leader in information and communication technologies in the Commonwealth Scientific and Industrial Research Organisation (CSIRO). His experience includes being the High Performance Scientific Computing Director for CSIRO and the National Grid Programme Manager for the Australian Partnership for Advanced Computing. From 2006 Rhys worked within the Australian Government’s National Collaborative Research Infrastructure Strategy as the facilitator for its investment plan in eResearch and subsequently as the Executive Director of the Australian eResearch Infrastructure Council. Since then, through a series of engagements, he has continued to work to harness advancing information and communication technologies to the benefit of Australian research.

Dave Connell – Dave Connell completed a Bachelor of Science (honours) degree at the University of Tasmania, and has been working at the Australian Antarctic Division since 1998 and as the metadata officer since 1999. His role is to catalogue and archive all scientific data collected by the Australian Antarctic program – specifically to ensure that scientists write high quality metadata records and archive their data in a timely manner. During his time at the AAD, he has overseen the transition from ANZLIC metadata to DIF metadata, and also developed tools for converting DIF metadata into various profiles of the ISO 19115 metadata standard. Dave is also very active in the Australian Government metadata space – reviewing and adapting ISO 19115 metadata standards for use in Australian scientific organisations. He has also worked with the Ocean Acidification International Coordination Centre to develop an ocean acidification metadata profile.

Sandra Ennor – Sandra Ennor is a Senior Records Analyst at Monash University. Sandra has embraced a career in the Records Management industry analysing recordkeeping practices, project managing system implementations, enhancing training regimes and increasing education. These elements support her driving passions, such as understanding information culture and the evolution of business processes. Sandra collaborates in Data Management and Big Data spaces, with primary objectives encompassing frameworks such as Change Management, Compliance and Governance, Networking and Rights (including recordkeeping rights of the child/student and creating accessible systems for staff).

Nicholas McPhee – Nicholas McPhee has been part of the Monash University eResearch Centre since its creation more than ten years ago and is currently working with researchers and research groups to provide them with personalised information and data management strategies. Nicholas has also been involved in the development of information and data management policies and has maintained and administered eResearch applications and data storage.

Jaye Weatherburn – Jaye Weatherburn is based in the Digital Scholarship team in Scholarly Services at the University of Melbourne, working to improve and support data stewardship and digital preservation capability.

ORCID ID: http://orcid.org/0000-0002-2325-0331

Advancing HPC and Data collaborations in Australasia

Dr Jenni Harrison1,2, Mr Mark Gray1,2, Dr Daniel Grimwood1,2, Dr Georgina Rae3, Mr Nick Jones3, Mr Allan Williams4, Mr Peter Elford5

1Pawsey Supercomputing Centre, Kensington, Australia,

2CSIRO, Kensington, Australia,

3New Zealand eScience Infrastructure, Auckland, New Zealand,

4The National Computational Infrastructure, Canberra, Australia,

5AARNet, Canberra, Australia

 

Background

Working cooperatively in an interdisciplinary team is essential to solve complex challenges faced within academic and industry research sectors [1]. There are obvious benefits to working as part of a team, such as diversity, a wide variety of knowledge, skills and strengths, greater access to resources and funding, a range of facilities and infrastructure and an expanded network of contacts, just to mention a few. As a consequence, national collaborations are often funded by governments for various reasons, including policy ambitions or to encourage knowledge creation (often in science or engineering) to facilitate economic growth [2].

An International Perspective

In the last twenty years, in the eScience / eResearch / HPC world, there has been a global increase in collaborative working between institutions and across countries, especially in Europe and the US. For example, in 2015, the Nordic eScience and eInfrastructure plan 2.0 [3] was published. This report was an update to the highly successful Nordic eScience and eInfrastructure plan, published in 2008, written and developed by the Nordic eScience Initiative (eNORIA) group [4], which is a collaboration between Norway, Finland, Sweden, Denmark and Iceland. The key areas highlighted in the Nordic eScience and eInfrastructure plan 2.0 [3] include:

  • Creating a Higher Education Arena for eScience
  • Enabling eScience as an Omnipresent Tool in Research
  • Knowledge Creation through Cross-Border Sharing of eInfrastructure

The Partnership for Advanced Computing in Europe (PRACE) was established to enable scientific and engineering research to “enhance European competitiveness for the benefit of society” [5]. PRACE operates as a cooperative, offering European researchers access to world-class computing and data resources as well as services. More generally, the PRACE partnership is also striving to reduce its impact on the environment by driving and demanding energy efficiency in HPC. With obvious parallels to the ambitions of the eNORIA group, PRACE has four main interest areas:

  • Education and training
  • HPC Access
  • Research Infrastructure
  • HPC Market Surveillance

In the US, the Big Data Innovation Hubs (established by the National Science Foundation) have been set up across the country to “accelerate partnerships among people in business, academia, and government who apply data science and analytics to help solve regional and national challenges” [6]. By nature, these Hubs are large. For example, the South Big Data Hub serves 16 States and has more than 500 members (including universities, corporations, foundations, and cities). Again, the goals of the Big Data Innovation Hubs are similar to those of the collaborations in Europe. For example, the Hubs set out to:

  • Build data science capacity for education and workforce development
  • Facilitate data sharing and shared cyber infrastructure and services.
  • Accelerate public-private partnerships that break barriers between industry, academia, and government.
  • Grow R&D communities that connect data scientists with domain scientists and practitioners.

The Australasian Context

There are no similar collaborative science partnerships existing within Australasia, although Australia (through AARNet) has a world-class national and international computer network. Australia in particular is challenged by geographic scale, with Perth, WA being the most isolated capital on the planet. At the same time, Boolardy, WA is the location of two of the Square Kilometre Array (SKA) [7] Pathfinder projects, with the SKA being one of the world’s largest science (radio astronomy) and engineering projects.

Two of the SKA precursors, the Australian SKA Pathfinder (ASKAP) and the Murchison Widefield Array (MWA), are located in Boolardy, WA, with the data processing engine located in Perth. However, when operational, it is estimated that the SKA will generate a massive quantity of data, with ~3 TB transmitted every second to the data processing engine. The sheer size of the data generated, together with its processing, analysis and storage, makes this a global problem. There are many other science and engineering projects in Australasia which could benefit considerably from collaboration, for example in Climate Modelling, Biosciences, Health Sciences, Geosciences and Informatics.
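To give a sense of scale for the ~3 TB per second estimate, the following back-of-the-envelope Python sketch converts it into a daily volume and into an equivalent number of network links. The decimal unit convention (1 PB = 1,000 TB, 1 TB = 8,000 Gb), the assumption of a sustained rate, and the 100 Gbps link speed used for comparison are all illustrative assumptions, not figures from the SKA project.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
TB_PER_PB = 1000                 # decimal units assumed
GBIT_PER_TB = 8 * 1000           # 1 TB = 8,000 gigabits (decimal)


def petabytes_per_day(rate_tb_per_s: float) -> float:
    """Daily data volume in PB for a sustained rate given in TB/s."""
    return rate_tb_per_s * SECONDS_PER_DAY / TB_PER_PB


def links_needed(rate_tb_per_s: float, link_gbps: float) -> float:
    """Number of links of `link_gbps` needed to carry the rate, ignoring overheads."""
    return rate_tb_per_s * GBIT_PER_TB / link_gbps


if __name__ == "__main__":
    rate = 3.0  # TB/s, the approximate figure quoted in the text
    print(f"{petabytes_per_day(rate):.1f} PB per day")
    print(f"{links_needed(rate, 100):.0f} links at 100 Gbps")
```

Under these assumptions, 3 TB/s sustained for a day is 3 × 86,400 = 259,200 TB, or about 259 PB per day, which corresponds to roughly 240 fully utilised 100 Gbps links.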

CHALLENGES

Being part of a successful and ongoing partnership is often complex. Collaboration is highly regarded as an effective means to tackle challenges (especially in science and engineering). However, a successful ongoing collaboration is often hard to achieve and sustain, and failure is widespread [8]. Most HPC centres serve the academic community at least to some degree. Correspondingly, some of the pressures faced by higher education are also faced by HPC providers. For example, there is often pressure (internally and externally) to respond to economic, political and/or social need [9]. Centres often have multiple stakeholders with significant competing demands in an environment of uncertain funding, particularly from the government sector. In the context of Australasia, there are also “local” issues, such as an expanse of geography, time zone differences, and language and cultural differences. How then can we create an opportunity for a successful collaborative partnership in Australasia that services the needs of local/national customers and offers the benefits of being effectively connected internationally with the rest of Australasia?

OPPORTUNITY

The Authors seek to establish an effective collaborative partnership from Singapore via Australia to New Zealand, making this proposed partnership one of great opportunity as well as diversity and significant distance (~8,000 km). If successful, this partnership of HPC centres would be one of the largest (geographically) anywhere in the world, spanning multiple countries and time zones. Some of the key areas that the partnership would initially focus on include supporting skills development and gaining access to knowledge and skills not available locally. We would expect to build on the very successful partnerships in computer networking that AARNet has undertaken. These include its trans-Pacific network capacity, which has benefited New Zealand; its world-class national networks, which span from Cairns to Boolardy (at multiples of 100 Gbps); and, more recently, Project Indigo, collaborating with Telstra, Singtel and others to provide terabit-scale capacity between Perth and Singapore (and Sydney), and the Japan-Guam-Australia cable in conjunction with Google and others. AARNet’s CloudStor “Data Hub” could also provide a platform for bringing together researchers wishing to share data, both the few with very large datasets and the many with smaller datasets.

THE STRUCTURE

The BoF session will be 60 minutes in duration. At this BoF the Authors will examine some of the challenges and opportunities that such a partnership presents. The format of the session is a mixture of presentations and facilitated discussion, as follows:

  • Understanding the background of international collaborations [10 mins];
  • The opportunities presented by an Australasian Collaborative Network [10 mins];
  • The challenges of developing and sustaining international collaborations from an Australasian perspective [10 mins];
  • Facilitated discussion regarding what an Australasian Collaborative Network would bring, including best practice [25 mins];
  • Wrap up [5 mins].

The presentations will be provided/convened by the listed presenters.

The targeted audience for the session includes anyone with an interest in how collaborations between organisations across countries can be used to facilitate the sharing of education, eResearch tools or infrastructure to support the research community. A primary goal for the BoF facilitated discussion is to initiate conversations aimed at promoting collaboration; this BoF is therefore not discipline specific.

REFERENCES

  1. Neumann, P., et al., Interdisciplinary teamwork in HPC education: Challenges, concepts, and outcomes. Journal of Parallel and Distributed Computing (2017) Vol 105, pp83-91.
  2. Wagner, C. S., International collaboration in science and technology: promises and pitfalls. In: Science and Technology Policy for Development: Dialogues at the Interface, Louk Box and Rutger Engelhard (eds) (2006) Anthem Press, London, UK. See: http://www.anthempress.com/product_info.php?cPath=96&products_id=274&osCsid=icd69js77l634iqvoni0t6vk67 [accessed June 2018]
  3. Nordic eScience and eInfrastructure plan 2.0: eScience and eInfrastructure in an international context. Available from https://neic.no/ [accessed 8th June 2018]
  4. eNORIA Group https://www.nordforsk.org/en/programmes-and-projects/projects/the-nordic-escience-initiative [accessed 8th June 2018]
  5. PRACE – The Scientific Case for High Performance Computing in Europe 2012-2020 http://www.prace-ri.eu/IMG/pdf/prace_-_the_scientific_case_-_full_text_-.pdf [accessed 8th June 2018]
  6. Big Southern Data Innovation Hub, https://southbigdatahub.org/ [accessed 8th June 2018]
  7. The Square Kilometre Array http://www.ska.gov.au/Observatory/Pages/MRO.aspx [accessed 8th June 2018]
  8. Marek, L. I., Brock, D.-J. P. & Savla, J., Evaluating Collaboration for Effectiveness: Conceptualization and Measurement. American Journal of Evaluation (2014) pp1-19.
  9. Daniel, B., Big Data and analytics in higher education: opportunities and challenges. Journal of Educational Technology (2015) Vol 46, No. 5 pp904-920.

Biography:

I migrated to WA in 2011, to direct the Data Team at the Pawsey Supercomputing Centre, a national facility providing researchers access to leading computation and data resources. I now lead strategic projects and engagement at Pawsey. In my present role, my responsibilities include directing strategic projects, developing strategy for Pawsey as well as leading and sustaining strategic partnerships with key stakeholders.

I am presently engaged in the development of a cooperative network of centres in Asia Pacific to increase diversity, form new collaborative projects, support new training initiatives, solve technical challenges and highlight new opportunities.

Digital Infrastructure in the Humanities and Social Sciences in Australia – What now and where to next?

Ms Alexis Tindall1, Mr Greg D’Arcy2, Ms Ingrid Mason3

1eRSA, Adelaide, Australia alexis.tindall@ersa.edu.au

2The University of Melbourne, Melbourne, Australia gregory.darcy@unimelb.edu.au

3Australia’s Academic and Research Network (AARNet), Canberra, Australia Ingrid.Mason@aarnet.edu.au

 

DESCRIPTION

This BoF is intended to occur in two parts – a presentation on the current HASS DEVL project, then a BoF discussion on the future of HASS infrastructure in Australia.

The Humanities, Arts and Social Sciences (HASS) infrastructure landscape is characterised by high value but fragmented data, tools and services. The HASS Data Enhanced Virtual Lab (HASS DEVL) establishes workflows across national capabilities and research institutions with the aim of creating a more cohesive and interoperable landscape.

The 2016 National Research Infrastructure Roadmap prioritised the collaboration and integration needed to make dispersed data sets more easily discoverable, accessible and researchable. Greater interoperability across data platforms and the research projects consuming and establishing data is a core goal of this project. As a consequence, the HASS community will directly benefit from the provision of national tools, improved and documented data curation standards, and improved workflows.

The project specifically targets the social sciences, linguistics, and historical studies but the data curation and interoperability frameworks ensure applicability to all HASS disciplines involving the collection of historical records, archival research, transcription, text analysis, mapping and geospatial data. Project outcomes will also broadly benefit research users of TROVE, ADA and AURIN.

But where to from here?

Part One: The HASS DEVL

The panel will discuss the work undertaken in 2018 on the HASS DEVL infrastructure, elaborating on both processes and outcomes.

This section will include lightning presentations on:

  • Workbench design and development
  • Data curation
  • Skills and training

Part Two: Future eResearch directions for HASS

A facilitated discussion sharing and responding to consultation findings, sharing our experience of project delivery across a dispersed collaborative team, and looking forward at key research infrastructure challenges for HASS.

Discussion will explore the following:

  • Challenges in development of data infrastructure that enables diverse HASS disciplines, including identifying need, ensuring accessibility, and sustainability
  • Working across a dispersed team
  • Current and future users of HASS research infrastructure – what are the tools already in use and what will be needed next? What happens after this project? What do researchers need?

Biographies:

Alexis, Greg and Ingrid lead the three streams of activity under the ARDC-funded Humanities, Arts and Social Sciences Data Enhanced Virtual Laboratory (HASS DEVL). They are working with a dispersed collaborative team representing twelve organisations to deliver a workbench environment for digital research tools, explore common issues of platform interoperability, develop a data curation framework, and put together a package of skill-building opportunities as part of the DEVL.

Community Engagement as a Means of Boosting Training Outcomes

Belinda Weaver1, Nicholas Hamilton2, Frankie Stevens3, Weisi Chen4, Aidan Wilson5

1Software and Data Carpentry, Australia, bweaver@carpentries.org

2UQ Institute for Molecular Bioscience/QCIF, Australia, n.hamilton@uq.edu.au

3Intersect, Australia, frankie@intersect.org.au

4Intersect, Australia, weisi@intersect.org.au

5Intersect, Australia, aidan@intersect.org.au

ABSTRACT

Training’s report card was marked ‘could do better’ in Tom Cochrane’s 2015 review of the NCRIS capability [1]. Needs specifically identified were a ‘more focused effort on outreach and awareness raising’, ‘more technical support skills’, and ‘relevant training for research groups’. ‘Continuing widespread concern about skills and expertise deficits’ remains despite the need for training and skills acquisition being flagged as a key issue right from the start of NCRIS investment in 2006.

As the review notes: ‘Data and software in research are useless without enthusiastic communities of people who are aware of it and possess skills to get results.’ Yet CSIRO feedback to the review stated: ‘Enhanced skills, training and career track is a systemic issue.’

Training needs to acknowledge the structural challenge caused by the ‘division of labour’ between the research workforce of academics and scientists on the one hand, and their research support staff on the other. Research support encompasses both experts in technical infrastructure provision and experts with softer skills focused on cultural change, such as imparting the benefits of research data sharing approaches. The research support skill sets provide both the “How” and the “Why”. Given that, training that engages all sides and builds community will have greater benefits, because lack of support to integrate new skills into practice is one of the main reasons people do not deploy them.

In this BoF, we present a number of community engagement models that have helped improve training uptake and outcomes.

Hacky Hours are held at several universities now, including UQ, Curtin, Griffith, UTS, La Trobe and JCU, with a new HackR Hour starting at QUT (for R users).

Belinda Weaver from Software and Data Carpentry will discuss how the building of community around training workshops can help people assimilate new skills into their research practice.

Dr Nicholas Hamilton will discuss two initiatives – his weekly drop-in bio-imaging clinic at the UQ IMB, which has helped more than 300 researchers, and his co-ordination of the week-long UQ Winter School in Computational and Mathematical Biology for the last 6 years. Winter Schools are now accompanied by a tie-in Software Carpentry workshop specifically for attendees, so they can develop the skills they need to try to put into practice whatever innovative ideas they have picked up during the week.

Dr Weisi Chen will discuss the use of Hacky Hour at UTS in Sydney to build communities around tools and skills. Dr Frankie Stevens will cover other initiatives at Intersect. Aidan Wilson will discuss Intersect’s contribution to training, with more than 6,500 researchers trained in over 650 courses at 14 universities and four state and federal government agencies, across four states and territories.

REFERENCES

  1. Cochrane, Tom ‘Status Report on the NCRIS eResearch Capability – Summary: A Report to the Australian Government Department of Education and Training.’ https://docs.education.gov.au/system/files/doc/other/abridged_eresearch_status_report_-_web.pdf

BIOGRAPHY

Belinda Weaver is the Community Development Lead for Software and Data Carpentry, global organisations that aim to make researchers more productive and their research more reliable by teaching them computational and data skills. She was formerly the eResearch Analyst Team Leader for the Queensland Cyber Infrastructure Foundation, where she helped deliver cloud solutions to Australian researchers. She was a key organiser of the Brisbane Research Bazaar events in 2016 and 2017 – cross-institutional, community-building events that taught a range of digital skills to researchers. She helped inaugurate the weekly Hacky Hour research IT advice sessions at UQ.  She is a certified Software Carpentry instructor and instructor trainer and has taught at many Software Carpentry workshops. She organised the two very successful Library Carpentry global sprints (aka hackathons) in 2016 and 2017 which updated and extended the basic lessons. Belinda has worked as a librarian, repository manager, project manager, newspaper columnist, Internet trainer and in research data management. She tweets as @cloudaus (https://twitter.com/cloudaus).

Dr Nick Hamilton is the Institute Bio-Mathematician at the Institute for Molecular Bioscience (IMB), The University of Queensland, and holds a co-appointment with the Research Computing Centre at UQ. He gained a PhD in Pure Mathematics from the University of Western Australia in 1996 and was subsequently awarded Fellowships in Australia and Belgium. In 2002, Nick made the decision to change fields into the exciting new areas of computational biology and bioinformatics, returned to Australia, and subsequently took up a position within the ARC Centre of Excellence in Bioinformatics at The University of Queensland. In 2008 he was appointed as a Laboratory Head at IMB, and Institute Bio-Mathematician in 2014, where he continues to lead a group in bio-image informatics, mathematical modelling and data visualisation, developing methodologies to deal with the current deluge of data that new microscopy imaging technologies have enabled. He also has an interest in, and has participated in, many training and engagement models such as Hacky Hour, HealthHack, ResBaz and Software Carpentry, and has chaired the Winter School in Mathematics and Computational Biology for the last 6 years. https://orcid.org/0000-0003-0331-3427

Dr. Frankie Stevens is currently Intersect’s eResearch Analyst for Southern Cross University. She has previously held roles with the national Research Data Storage Infrastructure (RDSI) Project and as eResearch Programme Manager at the University of Sydney. Frankie has 20 years’ experience working in the higher education sector in Australia and overseas. Frankie’s expertise involves developing strong relationships between research communities and local, state and national eResearch infrastructure initiatives, and has involved broad awareness raising and promotion of expert capabilities for the Australian research sector. Frankie holds a Bachelor of Science (Honours), majoring in biology with European studies (French), from the University of Sussex, and a PhD in cell biochemistry (Cancer Research) from the University of Manchester. Frankie is a published academic, and also holds a number of project and programme management qualifications.

Dr. Weisi Chen is currently Intersect’s eResearch Analyst for University of Technology Sydney and coordinator of Intersect’s training platform. With more than 4 years of eResearch training experience, Weisi has expertise in a broad range of eResearch techniques and in how eResearch training and the establishment of Hacky Hour can enhance research efficiency by improving researchers’ capability in using technologies. Weisi has a Bachelor of Engineering in Computer Science and Technology from Zhejiang University and a PhD in Computer Science and Engineering from the University of New South Wales (UNSW). Weisi has previously worked as an academic and software engineer at UNSW, where software architecture for eResearch data analysis was his main research focus, and has also been involved in a number of research projects in various domains.

Aidan Wilson is Intersect’s eResearch Analyst for the Australian Catholic University, and coordinator of Intersect’s training platform. Aidan’s research background is in documentary linguistics, concentrating on the syntax and morphology of Australia’s Aboriginal languages. He has also been actively involved in research support, and worked as a data manager for PARADISEC, an archive of Pacific and regional digital ethnographical data, including linguistic and ethnomusicological recordings. In his time at Intersect, Aidan has been involved in a number of engineering and data science projects, including secure data movement for health, medical and imaging datasets, and genome sequencing as-a-service.

Data Services Interest Group

Adrian Burton1, Ryan Fraser2, Cameron Moloney3, Aaron Sedgmen4, Jonathan Yu5, Lesley Wyborn6

1 Australian National Data Service, Canberra, Australia, adrian.burton@ands.org.au
2 CSIRO, Perth, Australia, ryan.fraser@csiro.au
3 Integrated Marine Observing System, Tasmania, Australia, Cameron.Moloney@utas.edu.au
4 Geoscience Australia, Canberra, Australia, Aaron.Sedgmen@ga.gov.au
5 CSIRO, Clayton, Australia, Jonathan.Yu@csiro.au
6 National Computational Infrastructure, Canberra, Australia, Lesley.Wyborn@anu.edu.au

 

DESCRIPTION

Data services in the research domain support the use of research collections and datasets by providing automated functions for the creation, access, processing and analysis of data. More and more data providers are publishing their data through services. It is an increasingly common design pattern for research applications and platforms, such as the Nectar Virtual Labs, to consume such data access services and connect them with other services and tools for data analysis, processing and visualisation.

An Australian community of interest working group is being formed around data service provision and consumption across the NCRIS facilities, science agencies and universities. The Data Services Interest Group aims to exchange experience, establish visibility of data services, discuss interoperability issues and identify opportunities for collective action. Consistency in implementation of standards around services has been identified by the community as a key enabler of significant opportunities in this area.

DELIVERY

This BoF will include:

1. Background on the Data Services Interest Group
2. “Lightning talks” outlining current practice in data service provision and consumption (IMOS/MARVL, AuScope/VGL, GA, NCI)
3. Input from the audience on issues and opportunities for this group

This 60-minute session will begin with short presentations, followed by an open discussion in which members of the audience will be invited to contribute.


Biographies

Adrian Burton is a Director at the Australian National Data Service (ANDS). In this capacity he has a keen interest in national services that enable data publication, data discovery and data citation, as well as the human support services that build the capability of researchers and research organisations to take advantage of data infrastructure. Adrian has provided strategic input into several national infrastructure initiatives, including Towards an Australian Research Data Commons, The National eResearch Architecture Taskforce, and the Australian Research Data Infrastructure Committee. Adrian is active in building national policy frameworks to unlock the value in the research data outputs of publicly funded research. Before being involved in research infrastructure, Dr Burton taught South Asian Linguistics and conducted research at the Australian National University, and was responsible for liaison between academic staff and central information and technology services.

Ryan Fraser is a Portfolio Manager with the CSIRO, with over 15 years of experience working in R&D, commercialisation of products and delivery to both government and industry using agile engineering methodologies. Ryan has led many Australian eResearch projects, including the AuScope Grid; Australian Spatial Research Data Commons; VGL; the Virtual Hazards, Impact and Risk Laboratory (VHIRL); and ANDS and NeCTAR projects. Ryan possesses specialised knowledge of, and has current projects in, spatial information infrastructures, data analytics, cloud computing, data management and interoperability, and has extensive experience in managing and successfully delivering projects.

Aaron Sedgmen is a trained geoscientist, although he has spent most of his career in information management and ICT, working with geoscience/geospatial data and associated information systems. Over the last twenty-five years Aaron has worked as a GIS analyst, database developer, web application developer and web operations manager, and has fulfilled various data management roles. In the last five years at Geoscience Australia Aaron has become a proponent of data access standards and enabling interoperability. Aaron is team leader of the Data Access and Services team, responsible for the governance and management of data access services, and for promoting data access standards within the agency.

Jonathan Yu is a research computer scientist specialising in information architectures, data integration, linked data, Semantic Web, data analytics and visualisation. He is part of the Environmental Informatics group in CSIRO Land and Water. He obtained his PhD in Requirements-oriented Methodology for Ontology Evaluation at RMIT University in 2008.
He currently leads and supports the development of new approaches, architectures, methods and tools for transforming and connecting information flows across the environmental domain and the broader digital economy within Australia and internationally.

Lesley Wyborn is a geochemist by training and worked for BMR/AGSO/GA for 42 years in a variety of geoscience and geoinformatics positions. In 2014 she joined the ANU and currently has a joint adjunct fellowship with National Computational Infrastructure and the Research School of Earth Sciences. She has been involved in many NCRIS-funded eResearch projects over the years. She is Deputy Chair of the Australian Academy of Science ‘Data for Science Committee’, co-chair of several RDA Interest Groups, and a member of the AGU Earth and Space Science Executive Committee.

Persistent Identifiers and Group ID Policy

Heath Marks1, Jeremy Hammond2, Craig Hamilton3, Andrew Janke4, Siobhann McCafferty5 (Convener)

  1. Australian Access Federation, Brisbane, Australia, heath.marks@aaf.edu.au
  2. University of Sydney, Sydney, Australia, jeremy.hammond@sydney.edu.au
  3. Intersect Australia Ltd, Sydney, Australia, craig@intersect.org.au
  4. University of Queensland/Research Data Services, Brisbane, Australia, a.janke@uq.edu.au
  5. Australian Access Federation, Brisbane, Australia, siobhann.mccafferty@aaf.edu.au

DESCRIPTION

Summary of topic

Persistent Identifiers (PIDs) are an essential tool of digital research data management and the evolving data management ecosystem.

The use of PIDs supports data management across all phases of research by creating chains of provenance, improving discovery and access, and ensuring that output is attributable and reportable. This is in line with the FAIR Principles [1] of making research data Findable, Accessible, Interoperable and Re-usable.

The use of PIDs is becoming standard practice for institutions, funders and infrastructure providers, and standard identifiers have been developed for individuals [2], activities or projects [3] and outputs [4]. However, there is as yet no national or international Group or Team ID.

Currently there is a range of technical solutions with the ability to create and store groups, but the larger issue remains the complexity of access policy and business rules.

The Australian Data Life Cycle Framework Project [5] has recently prompted interest in Group IDs, and there has been active discussion about development and policy between several groups and institutions, including ANDS, the Australian Access Federation (AAF), Intersect, AARNet, Research Data Services (RDS), Nectar, The University of Sydney and The University of Queensland.

This Birds of a Feather session will bring together these and further interested parties to discuss the current status of Group ID work and possible business rules and policy for a Group ID in Australasia.

Desired outcomes include the formation of a Group ID policy Working Group, and a draft list of business rules.

Session format

This session invites involvement from data management professionals working in identity management, research data management and research infrastructure management in Australia. The session will be 60 minutes long and will combine round-table and open discussion.

REFERENCES

  1. FAIR Principles. Available at: https://www.force11.org/group/fairgroup/fairprinciples, accessed 29 June 2017
  2. ORCID. Available at: https://orcid.org/, accessed 29 June 2017
  3. RAiD. Available at: https://www.raid.org.au/, accessed 29 June 2017
  4. DOI. Available at: https://www.doi.org/, accessed 29 June 2017
  5. Data Life Cycle Framework Project. Available at: https://www.dlc.edu.au/, accessed 29 June 2017

Biographies

Siobhann McCafferty is the Project Manager for the Data Life Cycle Framework Project.

Heath Marks is the Director of The Australian Access Federation, based in Brisbane, Australia.

Dr Jeremy Hammond is the Associate Director of Research ICT at the University of Sydney.

Craig Hamilton is the Engineering Manager for Intersect Australia Ltd.

Dr Andrew Janke is Informatics Fellow at the Australian National Imaging Facility, Systems Architect for the Data Life Cycle Framework Project and Senior Research Fellow at the Centre for Advanced Imaging at The University of Queensland.

Problems and Solutions Patterns in Community Development

Kheeran Dharmawardena1, Wojtek Goscinski2, Paul Box3

1NeCTAR, Parkville, Australia, Kheeran.d@nectar.org.au

2Monash University, Clayton, Australia, wojtek.goscinski@monash.edu

3CSIRO, Canberra, Australia, Paul.J.Box@csiro.au

 

DESCRIPTION

eResearch infrastructure, comprising systems, data, processes and the people providing this infrastructure (the provider community), has evolved to underpin specific communities (user communities) with specialised software and hardware requirements. Underpinning research user communities is challenging: software and data in cutting-edge areas advance quickly, meaning that software infrastructure can fast become irrelevant; research is naturally competitive, which makes collaboration a finely tuned balance; and building models for sustainability is difficult.

A range of inter-related social, institutional and economic factors can act as enablers of, or constraints on, the effective collaborative effort necessary to build and sustain research information infrastructure. A number of patterns (described below) that affect the achievement of collective goals in information infrastructure have been observed. There are sure to be many more patterns. This BoF provides an opportunity to share experiences around identified socio-technical patterns (i.e. the things that we believe hold true across different contexts) and to explore community appetite for, and approaches to, developing an infrastructure pattern book.

Some examples of patterns identified to date that might be explored include:

  • Connecting rowing and steering – governance is the decision-making process that sets the ‘rules of the game’ to ‘steer’ collective activity. Individual organisations and people do the heavy lifting (‘rowing’) to achieve agreed outcomes. If there is a real or perceived inability to influence decision outcomes in governance mechanisms, there is likely to be a disincentive to take action to achieve the outcomes, particularly where collaborative efforts are in-kind, volunteered effort rather than centrally funded.
  • Pigs and chickens [1] – decision rights should be allocated in ways that are appropriate to the needs of the community and the respective roles of individual actors. Assigning decision authority (a ‘decider’ role, as opposed to a decision input role) can give more voice in collective decision making to those who will have more skin in the implementation game, i.e. the ‘pigs’.
  • Understanding and leveraging Coalitions of the Willing (COWs) – what incentivises the folks who drive and contribute to initiatives? How can this be replicated and scaled up?
  • Elephants (in the room) – the need to surface, explore and address the sometimes hidden non-negotiables that may be trivial (‘if we don’t use protocol/widget/environment X, my org is out!’) but which may nonetheless derail collective efforts.
  • Horses (being put before carts) – what is the right sequence of institutional, social, technical and economic levers to pull to build, grow and sustain infrastructure? Patterns include technology push (build it and they’ll come) versus end-user-driven rapid ‘value’ prototyping.
  • Innovators, early adopters and the majority (diffusion of innovations) [2] – research infrastructure development is often driven by the needs of those out in front, who require more sophisticated approaches. Meeting these needs whilst bringing the rest of the community along to ensure broader adoption is critical. Meeting the users where they are now, recognising significantly different levels of capability maturity and need, is challenging.

This interactive BoF will look at these socio-technical challenges and seek to identify emergent problem and solution patterns for building user communities that help underpin research communities in the use of information systems.

REFERENCES

  1. Wikipedia, The Chicken and the Pig, Available from: https://en.wikipedia.org/wiki/The_Chicken_and_the_Pig, accessed 30 Jun 2017
  2. Wikipedia, Diffusion of innovations, Available from: https://en.wikipedia.org/wiki/Diffusion_of_innovations, accessed 30 Jun 2017

 


Biographies

Mr. Kheeran Dharmawardena, MBA, B.Comp., joined NeCTAR as the Coordinator of the Science Clouds initiative in October 2016. Prior to this he was responsible for the delivery of many ICT services at Monash University, including infrastructure delivery, service delivery, data management, IT and enterprise architecture, and eResearch. He has a special interest in the socio-technical aspects involved in the delivery of effective services.

(orcid.org/0000-0002-4292-7475)

Dr Wojtek James Goscinski is the coordinator of the Multimodal Australian ScienceS Imaging and Visualisation Environment (MASSIVE), and the External Collaborations Manager at the Monash eResearch Centre, a role in which he leads teams to develop and implement digital strategies to nurture and underpin next-generation research. He holds a PhD in Computer Science, a Bachelor of Design (Architecture), and a Bachelor of Computer Science.

Paul Box leads a CSIRO research team developing interoperable systems of systems, or ‘Information Infrastructure’. Paul has worked for more than 25 years in the geospatial information technology field.

More recently, Paul has focused attention on addressing the social rather than technical challenges of building Information Infrastructure. Coherent integrated approaches to addressing the social, institutional and economic challenges of infrastructure development are being elaborated through ‘social architecture’.

    About the conference

    eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
