Current and Future Directions for Digital Infrastructure in the Humanities and Social Sciences in Australia BoF

Dr Steven McEachern1, Dr Tully Barnett2, Associate Professor Shawn Ross3, Ms Amanda Lawrence4, Ms Sarah Nisbet5, Mr Malcolm Wolski6

1ANU, Canberra, Australia,

2Flinders University, Australia,

3Macquarie University, Sydney, Australia,

4Swinburne University, Melbourne, Australia,

5eRSA, Adelaide, Australia,

6Griffith University, Brisbane, Australia,


HASS is a multidisciplinary grouping that represents significant domain specialisation, interdisciplinarity, and transdisciplinarity. This BoF is intended to occur in two parts – a presentation on current infrastructures in the HASS community, and a guided BoF discussion on the future of HASS infrastructure in Australia.

Part One: Current HASS infrastructures

The panel will discuss examples of the current infrastructures in HASS and look at what works/doesn’t work for researchers from across archaeology, social science, arts and humanities with regards to building, getting access to or managing research infrastructure. The panel will describe their experiences with research infrastructure successes, gaps, failures, and opportunities.

This section will include lightning presentations from:

  • The Field Acquired Information Management Systems Project – FAIMS (Shawn Ross, Macquarie University)
  • The Australasian Consortium of Humanities Research Centres – ACHRC (Tully Barnett, Flinders University)
  • The Human Communication Science Virtual Lab – ALVEO (Steve Cassidy, Macquarie University)
  • Analysis & Policy Observatory (Amanda Lawrence, Swinburne University)
  • The Australian Data Archive (Steven McEachern, Australian National University)
  • Prosecutions Project and Cultures and Communities (Mark Finnane, Griffith University, Sarah Nisbet, eRSA, Malcolm Wolski, Griffith University)

Part Two: Future eResearch requirements for HASS

The panel will then facilitate an open discussion around the future eResearch needs for HASS. We will be looking to frame a discussion around how we can address and respond to the key challenges as we work towards the Platforms for HASS described in the NCRIS roadmap.

We hope the BoF discussions will address and create a dialogue across the following:

  • Data infrastructure development enabling diverse HASS data types, exchange and integration in support of HASS research
    • eResearch / data needs of HASS researchers (not easier, just different…).
    • ‘Small data’ research.
    • Getting beyond the early adopter –> early majority ‘chasm’ in the uptake of digital tools.
  • Platform interoperability between platforms and tools, to aid data sharing and more efficient use of informatics tools.
  • Deeper collaboration across HASS and with government stakeholders that may be research partners and critical data holders or the beneficiaries of translational research.



Dr. Steven McEachern is Director and Manager of the Australian Data Archive at the Australian National University, where he is responsible for the daily operations and technical and strategic development of the archive. He has high-level expertise in survey methodology and data archiving, and for over fifteen years has been actively involved in the development and application of survey research methodology and technologies in the Australian university sector.

Associate Professor Shawn Ross is Director, Data Science and eResearch, Office of the Deputy Vice-Chancellor (Research), and Associate Professor of History and Archaeology, Director, Field Acquired Information Management Systems (FAIMS) Project at Macquarie University.

Tully Barnett is a Research Fellow in the College of Humanities, Arts and Social Sciences at Flinders University in South Australia. She is Associate Director of the Australasian Consortium of Humanities Research Centres and a member of the executive board of the Australasian Association of Digital Humanities. Her research applies a critical infrastructures approach to understanding the practices of reading and literary communication in the context of digitalisation (both digitisation and the born digital). Her research fellowship is for the project Laboratory Adelaide: The Value of Culture, investigating the notion of value in culture beyond easily analysable metrics within the context of comparative cultural policy. She is the author of “The Human Trace in Google Books” in Border Crossings (2016), “Platforms for Social Reading: Material Imagery in Digital Book Formats” in Scholarly and Research Communication (2015), and she is co-author of “Counting culture to death: an Australian perspective on culture counts and quality metrics” in Cultural Trends (2017).

ORCID iD: 0000-0003-0269-5814

Amanda Lawrence is Research and Strategy Manager at Policy Online and Research Manager of the ARC Linkage project ‘Grey literature strategies: enhancing the value of research and information for public policy and practice’. She is currently completing a PhD in Communications and Media at the Swinburne Institute for Social Research. She has a Graduate Diploma in Library and Information Management from RMIT University and an Arts Degree with Honours from the University of Melbourne. Prior to moving into the library and information sector Amanda worked in literary arts management and bookselling.

ORCID iD: 0000-0003-2194-8178

Sarah Nisbet is eRSA’s Chief Operations Officer. Sarah has a Bachelor of Media from the University of Adelaide and an Industry Certificate (Festival & Event Design & Management). She is also a member of the Australian Science Communicators and the Public Relations Institute of Australia. Sarah is currently the Project Manager of the Australian National Cultures and Community Project, which is looking to enable better data sharing and discoverability between researchers and archives.

Malcolm Wolski is the Director, eResearch Services at Griffith University. Malcolm is a part of the senior leadership team providing library, information and IT services at Griffith University. In his role, he is responsible for the development, management and delivery of eResearch services to the University’s research community, which includes the associated information management systems, applications, infrastructure, high performance computing and data management, as well as a small media production service. These services are delivered through an integrated service delivery team by working closely with his colleagues in the library and enterprise IT. More recent projects and other activities have involved working closely with national and international organisations, including groups such as NeCTAR and ANDS, and he is a member of the Organisational Assembly Board of the Research Data Alliance. Malcolm has a background in IT, library, strategic planning and operations research.

Seeing the City Pulse through Social Media Analytics on the NeCTAR Research Cloud

Prof. Richard Sinnott1

1University of Melbourne, Melbourne, Australia


Over 700 Masters-level students at the University of Melbourne have been taught big data analytics on the NeCTAR Research Cloud since 2013 as part of the Cluster and Cloud Computing course. This course covers HPC programming, including MPI, as well as hands-on experience in the dynamic deployment and scaling of applications on the Cloud, typically to support big data analytics. Students are exposed to technologies including NoSQL systems such as CouchDB and Couchbase, Hadoop/HDFS and Spark, and learn how to write scalable Cloud solutions using scripting approaches such as Boto and Ansible.

This talk will focus on one example of student implementation work: real-time processing of social media data (Twitter, Instagram, Flickr and Foursquare) to better understand the way in which individuals move around the city, capturing the so-called pulse of the city. Such work provides insights into people’s daily routines that are otherwise impossible to capture. A key part of this work is in data analytics and data visualization. This includes algorithms for sentiment analysis and the scalability of these algorithms across the NeCTAR Research Cloud. We describe the technical solutions that have been adopted that reflect best practice in this space and the lessons learned in using the NeCTAR Research Cloud for such analysis. We also discuss the potential dangers of such data use and the privacy issues that they give rise to.
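As a rough illustration of the kind of analysis described above, the following sketch groups timestamped posts by hour of day and attaches a crude lexicon-based sentiment score. It is not the students' implementation: the word lists, the `created_at` and `text` field names, and the sample posts are all invented for the example, and a real pipeline would use a proper tokenizer and a far larger lexicon or a trained model.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, tiny sentiment lexicons (assumptions for illustration only).
POSITIVE = {"great", "love", "happy"}
NEGATIVE = {"delayed", "crowded", "awful"}

def score(text):
    """Positive-word count minus negative-word count (whitespace tokens only)."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def hourly_pulse(posts):
    """Return {hour_of_day: (post_count, mean_sentiment)} for the given posts."""
    buckets = defaultdict(list)
    for post in posts:
        hour = datetime.fromisoformat(post["created_at"]).hour
        buckets[hour].append(score(post["text"]))
    return {h: (len(s), sum(s) / len(s)) for h, s in sorted(buckets.items())}

# Invented sample posts, not real social media data.
posts = [
    {"created_at": "2017-06-15T08:05:00", "text": "train delayed and crowded again"},
    {"created_at": "2017-06-15T08:40:00", "text": "love the morning light"},
    {"created_at": "2017-06-15T18:10:00", "text": "great night market"},
]
pulse = hourly_pulse(posts)
```

Because each post is scored independently, the scoring step could in principle be spread across many workers, which is where the scalability questions raised in the talk come in.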



Professor Richard O. Sinnott is the Director of eResearch at the University of Melbourne and Chair of Applied Computing Systems. In these roles he is responsible for all aspects of eResearch (research-oriented IT development) at the University. He has been lead software engineer/architect on an extensive portfolio of national and international projects, with specific focus on those research domains requiring finer-grained access control (security). He is technical lead for the AURIN project and a range of other application domains. He has taught Cluster and Cloud Computing to over 400 Masters-level students at the University of Melbourne since 2013. He has supervised over 250 Masters dissertations in the last 4 years on a range of topics including big data analytics and Cloud computing, with extensive focus on social media data analytics and use of Twitter data. (Geoff and Chao are two such students!)

HASS-GLAM Networking

Ms Ingrid Mason1, Ms Hilary Goodson2, Peter Elford3

1AARNet, Sydney, Australia,

2AARNet, Sydney, Australia,

3AARNet, Sydney, Australia,



The 2016 National Collaborative Research Infrastructure Strategy Roadmap [1] outlines as key national research infrastructure: Platforms for Humanities, Arts, and Social Sciences (HASS), with a strong focus on interoperability across platforms and researcher access to physical and digital collections, including cultural collections.

Advancing research in these areas is critical to our future, and requires a nationally coordinated approach to infrastructure development to drive transformations in the way researchers discover, access, curate, and analyse Australia’s social and cultural data. (p33) 

Collaboration is a core aspect of this national strategy, reflected in its structural features and infrastructure design, and in the complex stakeholder interactions among the mix of infrastructure owners and jurisdictions referenced in the Platforms for HASS focus area. Built into this strategy is an assumption that collaboration will arise between the HASS and GLAM (Galleries, Libraries, Archives, and Museums) communities to build bridges between public and research infrastructures and establish working relationships and service boundaries.

AARNet already provides bridging infrastructure components (network and services) for the HASS and GLAM communities.  As the national research and education network (NREN), AARNet provides vital underpinning research infrastructure to this focus area of the national strategy. Many of Australia’s large state and national cultural institutions are on the NREN, and CloudStor (a cloud storage service) has been designed to be easily accessible to any collaborator or partner on a research project (with an email address).  Ideally, all parties collaborating around research (data providers and digitisation suppliers) are on high-speed network connections that support rapid data transfer, either via the research institution or through arrangements under AARNetConnect.  The combination of these research infrastructure components makes collaboration around data sharing between HASS and GLAM in Australia straightforward.

This presentation will:

  • Highlight AARNet’s experience to date with providing national research infrastructure bridging components to support HASS-GLAM collaboration.
  • Cover a selection of insights gained through that experience as entry points to aid the broader eResearch community in engaging with two diverse and highly collaborative communities that have long and well-established histories of collaboration.



There is a significant challenge ahead for the eResearch community to work together to build connecting research infrastructure to enable collaboration.  The eResearch community can draw confidence from the fact that professional communities and networking already occur between the “HA” in HASS and the GLAMs.  The Australasian Association of Digital Humanities has been running a biennial conference listserv “2Cultures” for association members and the wider community to exchange information since 2012.  In 2016, for example, the Digital Humanities Australasia conference hosted by the University of Tasmania in Hobart had a presentation from academic librarians at the University of Wollongong on their infrastructure collaboration with history academics to develop the “History Archives Portal” [2] using the Omeka platform (a platform used in common by digital humanities research and the GLAM collecting communities).  The annual National Digital Forum in Wellington, New Zealand, which commenced in 2002, has been drawing in contributions from GLAM and digital humanities (DH) communities from Australasia and internationally.  The VALA biennial digital library conference (in operation in Australia for decades) has also more recently drawn in contributions from GLAM and from DH researchers from Australasia and internationally.  A clear conjunction between DH research interests (and their data access requirements) and GLAM collection availability for mining was revealed in 2012 in a paper delivered on “Mining the treasures of Trove: new approaches and new tools” [3].  In 2015, the international Digital Humanities conference and the international LODLAM (Linked Open Data: Libraries, Archives, Museums) summit were co-located in Sydney at Western Sydney University and the State Library of New South Wales to enable crossover between DH and GLAM interests in semantics, linked open data, and cultural heritage.
In February of this year the international museum community landed in Melbourne to attend MuseumNext (a mobile professional forum) hosted by the Australian Centre for the Moving Image [4] to give and listen to talks on: artists working in museums, science galleries, the use of robots and virtual reality, museum innovation hubs, library labs and much more.  The evidence is there that collaboration flourishes between DH and GLAM: these communities share events, interests, data, ideas, and physical and digital infrastructures.



The bundling of the humanities, arts and social sciences (HASS) and galleries, libraries, archives and museums (GLAM), as communities, is a categorisation abstraction.  For example, the Australian Academy of the Humanities in its submission [5] on the National Research Infrastructure Capability Issues Paper states:

The humanities together with the arts and social sciences (the HASS sector) form a sizeable part of Australia’s research and innovation system. HASS researchers comprise 43% of the university-based research system, and HASS contributed 44% of the total number of units of evaluation in the Excellence for Research in Australia (ERA) initiative in 2012.  The HASS sector is not only sizeable but diverse, comprising some 50 disciplines at the four-digit field of research level.

The GLAM Peak [6] initiative driven by Museums Galleries Australia and National and State Libraries Australasia draws together eleven peak bodies:

  • Australian Library and Information Association (ALIA)
  • Australian Society of Archivists (ASA)
  • Council of Australasian Archives and Records Authorities (CAARA)
  • Council of Australasian Museum Directors (CAMD)
  • Council of Australian Art Museum Directors (CAAMD)
  • Council of Australian University Librarians (CAUL)
  • Federation of Australian Historical Societies (FAHS)
  • International Council of Museums (ICOM)
  • Museums Galleries Australia (MGA)
  • National Research Collections Australia (NRCA)
  • National and State Libraries Australasia (NSLA)

The members of those peak bodies include local, state, territory, and national institutions in Australia, and represent an enormous highly connected network of physical and digital infrastructure.  Complexities can arise from conflating the practices, resources, and needs of the highly diverse academic and collecting activities and professionals represented by each letter in both of these two acronyms: HASS and GLAM.



  1. 2016 National Research Infrastructure Roadmap. Available from:, accessed 15 June 2017
  2. History Archives Portal. Available from:, accessed 15 June 2017
  3. Sherratt, T. “Mining the treasures of Trove: new approaches and new tools”, VALA 2012 Conference. Available from:, accessed 15 June 2017
  4. MuseumNext. Available from:, accessed 15 June 2017
  5. Australian Academy of the Humanities, National Research Infrastructure Capability Issues Paper, September 2016. Available from:, accessed 15 June 2017
  6. GLAM Peak.  Available from:, accessed 15 June 2017



Ingrid Mason, Deployment Strategist with AARNet, provides support for engagement and the uptake of the national research and education network (NREN) and services with AARNet members across the research, cultural and collections sectors. Ingrid has worked on several NCRIS programs: Australian National Data Service, National eResearch Collaborative Tools and Resources, and Research Data Services.

Enabling better data discovery of records across archives, institutions and libraries

Professor Mark Finnane1, Ms Sarah Nisbet2

1 Griffith University, Brisbane, Australia,

2 eRSA, Adelaide, Australia,



Making use of a purpose-built structured database with the flexibility to capture data from a variety of different sources and jurisdictions, and enabling a productive partnership between researchers and volunteer transcribers, the Prosecution Project is a platform for revolutionising our understanding of criminal justice histories. We have extended the functionality of this database by operationalising a national, sustainable and scalable API standard that allows data (and metadata) sharing and transfer between archives, institutions and research projects (such as Tasmanian Archives and Heritage Office, The Prosecution Project, Trove, Queensland State Archives, and the ALA).


Criminal justice history is a substantial research field, with a vital international community of scholars among whom Australia holds special interest given the convict origins of European settlement, as well as the impact of colonisation on the contemporary challenges of Indigenous crime and incarceration rates. The research field has significant inter-disciplinary connections to a range of other humanities areas including criminology, gender studies, law and psychology.

The Prosecution Project is a major initiative in this research field. The project is digitising large-scale criminal justice record sets in Australian jurisdictions to enable new research, quantitative and qualitative, in the history of prosecution and the criminal trial.

Historically, archival records have been hidden away in different archives across the country. To access them you need to physically visit the archives, copy the original records and then transcribe the data. With the advances in digitisation and digital archiving solutions, data can be made accessible to the national and international community, and related records can be linked across multiple institutional repositories.

In this presentation we will discuss how the Prosecution Project and Griffith University are working collaboratively with archival institutions such as the Queensland State Archives and the Tasmanian Archive and Heritage Office to enable better access to archival data. The Open API project has produced a standardised method of accessing archival data. For example, this API enables access to the Prosecution Project’s transcription of court records alongside the Tasmanian Archive and Heritage Office’s digitised records. This method of providing data as a service will be able to be exploited by discovery services such as the National Library of Australia (NLA) and national and state archives.
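To make the idea of serving transcriptions alongside digitised records concrete, here is a minimal sketch of joining two record sets on a shared archive reference. It does not reproduce the project's API: the field names (`archive_ref`, `text`, `image_url`), the reference values and the example.org URLs are all hypothetical, and the records are held in memory rather than fetched over a network.

```python
def link_records(transcriptions, archive_items, key="archive_ref"):
    """Pair each transcription with the archive item sharing the same reference."""
    by_ref = {item[key]: item for item in archive_items}
    linked = []
    for t in transcriptions:
        item = by_ref.get(t[key])  # None when the archive has no matching item
        linked.append({
            "archive_ref": t[key],
            "transcription": t["text"],
            "image_url": item["image_url"] if item else None,
        })
    return linked

# Invented sample data standing in for responses from two separate services.
transcriptions = [
    {"archive_ref": "REF-001", "text": "Trial held 3 March; verdict guilty."},
    {"archive_ref": "REF-002", "text": "Case discharged."},
]
archive_items = [
    {"archive_ref": "REF-001", "image_url": "https://example.org/images/ref-001.jpg"},
]
linked = link_records(transcriptions, archive_items)
```

Keeping unmatched transcriptions in the output, with an empty image link, is a design choice in this sketch: it lets a discovery service show research-added content even before the source institution has digitised the corresponding record.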

In this presentation we will discuss the project and its outcome in detail. We will also talk about how the connectivity work has been supplemented by community building and user support and training activities to increase the uptake of the API.



Mark Finnane, Professor, Griffith University

Mark Finnane is ARC Laureate Fellow and Professor of History at Griffith University, where he works as a researcher in the Griffith Criminology Institute. Mark’s doctoral research on mental illness is the foundation for his later work on the history of policing, punishment and criminal justice. His most recent book is (with Heather Douglas) Indigenous Crime and Settler Law: White Sovereignty after Empire (Palgrave Macmillan, 2012), a study of the criminal law’s response to Aboriginal crimes of violence over the last two centuries. In 2013 he was awarded an ARC Laureate Fellowship (2013-18) to research the history of prosecution and the criminal trial in Australia.

Sarah Nisbet, COO, eRSA

Sarah Nisbet is eRSA’s Chief Operating Officer. Sarah began her career delivering communications solutions in the health care sector where she mastered the art of working across institutions, departments and organisational silos.

Sarah has a Bachelor of Media from the University of Adelaide and an Industry Certificate (Festival & Event Design & Management), she is also a member of the Australian Science Communicators and the Public Relations Institute of Australia.

Sarah is currently the Project Manager of the RDS Cultures and Community Project, which is looking to enable better data sharing and discoverability between researchers and archives. Alongside stakeholders such as the National Library of Australia, the National Archives of Australia, the Tasmanian Archives and Heritage Office (TAHO) and Queensland State Archives, we’re developing an Open API to test a concept: if a researcher adds value (transcription, metadata, annotation) to a dataset from a cultural institution, how can that be shared back to the source institution (API, catalogue record, linked records, machine-to-machine capabilities)? This pilot project aims to operationalise a national, sustainable and scalable API standard that will allow data (and metadata) sharing and transfer between the Prosecution Project, TAHO and QSA.

Informatics from Pilot to HASS Platform (SCIP)

Lyle Winton1, Greg D’Arcy2, Mitchell Harrop3, Geordie Zhang4

1 The University of Melbourne,

2 The University of Melbourne,

3 The University of Melbourne,

4 The University of Melbourne,



The Social and Cultural Informatics Platform “SCIP” has been responding to demand and growth in the digital humanities, arts, and social sciences (HASS) at the University of Melbourne since 2015 [4]. SCIP is an expert based “research platform” that provides the necessary informatics skills and develops technology platforms to support researchers, research students and research projects. SCIP consists of a core informatics team and partner network to support a broad range of research activities from researcher consultations, workshops and training, through to deeper partnerships on prioritised research projects [1, 2].  SCIP specialises in exploring and applying technology typically working in the areas of research data management, online/web resources and collections, data analytics and informatics, visualisation and digitisation.  Activity over the recent year has included over 200 expert consultations, work on 20 prioritised projects and 6 technical platforms [3], over 20 workshops and outreach events, and significant involvement in the full grant lifecycle (application, planning, execution and dissemination) supporting over $3M in ARC projects.

The SCIP platform governance involves faculty based academic leaders (our chief investigators), representatives from faculty research offices and representatives from the funding partners.  The SCIP Working Group forms an expert advisory layer and meets regularly to discuss and coordinate collaboration on new projects, to plan joint activities such as training and outreach, and to jointly address the challenges faced by projects and researchers. The working group consists of representatives from each of the SCIP network partners (eScholarship Research Centre, University Digitisation Centre, Research Platform Services, the library Digital Scholarship unit, the library Research Publications and Programs unit, Arts and MGSE faculty and subject liaison librarians, University of Melbourne Archives, and the Faculty of Arts Digital Studio).

SCIP has evolved from a pilot to an operating model that includes a balanced program of informatics support and development activities.  The model prioritises:

  • Expert support – walk-in contact locations, face to face consultations, email
  • Priority projects – core team coordinated activity, sourced from a range of activities, brokering expertise, services and platforms
  • Platform development – to develop or improve re-usable technology platforms, grounded in supporting multiple research projects. Our platforms include Omeka, Mediaflux, cultural mapping and visualization, digitisation, advanced data analysis (HPC), and data-in-the-field tools.
  • Training and workshops – working together with partners to both lead and contribute to events (eg. Research Bazaar)
  • National platforms engagement – importing and supporting national platforms for local researchers
  • ARC grant cycle support – in application for grants, in planning for technology and informatics, and in execution

Come along to our presentation to hear:

  • How we’ve developed an operating model together with Faculty and expert partners;
  • How we’re working together with Faculty to uplift the Digital HASS capability;
  • How we support current needs and develop technology platforms for emerging areas;
  • How we measure our performance and impact;
  • Examples of SCIP research projects and platform developments;
  • Our planned future activities, developments and collaborations.


  1. “About SCIP”, SCIP website,
  2. “SCIP Informatics Team”, SCIP website,
  3. “SCIP Projects & Activities”, SCIP website,
  4. Neish, P., Murray, A. & Konstantelos, L. (2015) “The role of research data repositories in social and cultural informatics and the wider open data ecosystem” eResearch Australasia Conference, Brisbane, Australia, 19th-23rd October, 2015. Available:



Dr Lyle Winton has over 15 years of experience in research infrastructure having worked within universities and on state, national, and international initiatives. Lyle also has a research background in experimental physics involving large-scale collaborations, with significant challenges in sharing knowledge, data and computational power. Currently Lyle is the Manager of Digital Scholarship within the University of Melbourne Library, as well as the Manager of an informatics platform supporting digital research in Arts and Education. These activities involve the development and evolution of services and platforms to support research in the digital age.

Greg D’Arcy has over 15 years’ experience working with research data management, digital repositories, and publishing open resources and datasets. He has managed programs and projects with government agencies, not-for-profit organisations, the cultural sector (Art Galleries, Libraries, Archives and Museums), charities and digital producers. Greg has lectured at RMIT, Monash University and Victoria University, and holds the qualifications of Master of Business and Master of Technology.

Dr Mitchell Harrop is a Humanities and Social Sciences Informatics Specialist in the Social and Cultural Informatics Platform (SCIP) at The University of Melbourne. Mitchell’s PhD research involved ethnographic studies of digital game playing and was conducted within Melbourne University’s Computing and Information Systems department. He has lectured in Informatics, Database Systems and Web Information Technologies.

Digitally Mapping Colonial Collecting of Indigenous Australian Bodily Remains: the Return, Reconcile Renew Project

Prof. Paul Turnbull1

1University of Tasmania, Launceston, Australia


In this paper, I discuss the Return, Reconcile Renew Project, a major research initiative funded by the Australian Research Council.  The project has several aims, the most important of which is the application of various e-research techniques to provenance research, primarily to assist the repatriation of the bodily remains of Old People by Indigenous Australian communities.  As I will explain in the paper, digitally investigating the collecting of ancestral bodily remains and important cultural property of Australian and other Indigenous peoples can greatly enhance the kinds of provenance research that western museums now commonly find themselves obliged to undertake.


Paul Turnbull is a Professor of History and Digital Humanities at the University of Tasmania. He is co-editor with Michael Pickering of The Long Way Home. The Meaning and Values of Repatriation (2010). His book, Museums, Science and Collecting the Indigenous Dead, will be published by Palgrave late this year. Paul is also known internationally for his creation of research-based digital resources exploring cross-cultural relations in Australia and the Pacific.

Improving the online profile of the Ngan’gi Language Collection

Dr Paddy Tobias1, Associate Professor Nick Reid2

1Intersect Australia, Armidale, Australia,

2University of New England, Armidale, Australia


For over 30 years, Associate Professor Nick Reid has conducted descriptive and documentary linguistics research in the Daly River region of the Northern Territory. Over this time A/Prof Reid has collected a large body of fieldnotes, including transcribed texts, photography, and video and audio recordings of many stories, mostly by speakers of the Ngan’gi language. Until now, online presentations of this rich collection have suffered from low discoverability and accessibility for the research community, and have allowed only passive engagement by the speech community. This paper describes the ANDS-funded Collection Enhancement Project to increase the online discoverability and analytic utility of this significant collection and offers a number of solutions for other researchers with similar collections to employ.

Of course, with the increase in discoverability come a number of needs, including the ability to present only particular content to certain user-groups, to enable a high level of interaction with the collection, and to maintain the trust, ownership and sensitivity of the speech community in the online presentation. The paper will discuss the two web-based tools chosen to meet these needs: the content management system, Mukurtu, and the inter-linear text reader for transcribed audio-visual recordings, EOPAS. Both tools are open source and well suited to presenting ethno-linguistic content. The paper will demonstrate the innovative integration of the EOPAS tool into the Mukurtu environment, the ability to define multiple user-groups, as well as the dynamic language dictionary feature of Mukurtu, all of which would be attractive for the presentation of other digital ethno-linguistic collections.
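The requirement to present only particular content to certain user-groups can be pictured with a small sketch. This is not how the content management system named above implements access control; the item titles, group names and the rule that everyone sees "public" items are assumptions made purely for illustration.

```python
# All names below are hypothetical and do not reflect any real CMS API.
ITEMS = [
    {"title": "Word list", "groups": {"public"}},
    {"title": "Story recording", "groups": {"community", "researchers"}},
    {"title": "Restricted ceremony notes", "groups": {"community"}},
]

def visible_items(items, user_groups):
    """Return titles of items shared with at least one of the user's groups."""
    allowed = set(user_groups) | {"public"}  # assumption: everyone sees public items
    return [item["title"] for item in items if item["groups"] & allowed]

researcher_view = visible_items(ITEMS, ["researchers"])
community_view = visible_items(ITEMS, ["community"])
anonymous_view = visible_items(ITEMS, [])
```

In this sketch the speech community decides visibility simply by editing the `groups` set on each item, which mirrors the idea of the community setting access privileges rather than the site administrator.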

The Ngan’gi website is an example of the potential for online presentations of cultural collections in an interactive and multi-communal way. Rather than presenting the web content in a passive manner, the Ngan’gi Language website is a forum through which linguistic researchers can access, critically engage with and contribute to materials such as transcriptions and time alignment files. In addition, the website allows the speech community to actively shape the linguistic and cultural identity presented to the wider world by adding new material to collections, setting access privileges for user-communities, and contributing to language maintenance and revival activities.



Nick Reid is an Associate Professor in the School of Behavioural, Cognitive and Social Sciences. His main linguistic specialisation is the study of Australian Aboriginal languages of the Daly River region (NT). He has published grammars, dictionaries and text collections of the two languages, Ngan’gikurunggurr and Ngen’giwumirri.

Nick’s descriptive work has fuelled other research interests, primarily within the areas of morphological and syntactic typology, language description, historical linguistics and ethnomusicology. He is particularly interested in systems of nominal and verbal classification, and has co-edited a book on nominal classification in Australian languages.

Dr Paddy Tobias – As the Intersect eResearch Analyst at the University of New England, Paddy’s role is to help researchers enhance their research projects through eResearch capabilities. Paddy’s research background is in political and social anthropology, and he is a keen observer of how digital research approaches can be applied to this academic realm.

Using distant reading techniques to understand the value of culture in a time of crisis

Dr Tully Barnett1

1Flinders University, Park Holme, Australia


The 2015 Australian Federal Government’s Senate Inquiry into the impact of the 2014 and 2015 Commonwealth Budget [1] decisions on the Arts received 2719 submissions, a corpus of over 1.8 million words that form a record of a time when governmental support for arts and culture was under threat [2]. The robust response by Australian artists, arts administrators and the broader community demonstrates that the inquiry process was a crucial intervention in a moment of industry crisis. But it also answered a gap in the sector, providing an avenue for practicing artists, arts administrators and audiences to talk about the industry and its impacts in a way they haven’t been able to do before.

This paper considers the submissions and hearings testimony as a public body of material constructing the artist and the state of the arts in the present moment under austerity. The paper reports on the application of distant reading techniques and methodologies to the corpus of 1.8 million words that form the submissions to the Senate Inquiry, including word cloud generators (see Figure 1), JSTOR Labs [3] and Voyant (Figure 2) [4]. These 2719 submissions form a body of collective valuing of culture, and I argue that we can understand something about the way that artists, arts organisations and arts audiences value culture by applying distant reading methodologies to the corpus.

Figure 1

Figure 2
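
As a rough illustration of what a word cloud generator computes before drawing anything, the following Python sketch counts the most frequent non-stopword terms across a set of texts. It is not the method used in the paper: the stopword list, the function name `word_frequencies` and the two sample sentences are invented for this example, and tools such as Voyant apply far richer processing.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; real tools ship much longer ones.
STOPWORDS = {"the", "and", "of", "to", "a", "in", "is", "for", "that", "on", "are", "our"}


def word_frequencies(texts, top_n=5):
    """Return the top_n most frequent non-stopword tokens across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase, then keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)


# Invented sentences standing in for inquiry submissions.
sample = [
    "Funding for small arts organisations is vital to our community.",
    "The arts sustain community life, and funding cuts threaten the arts.",
]
top_terms = word_frequencies(sample)
print(top_terms)
```

On a real corpus the same counts would feed the font sizes of a word cloud; the interpretive work of reading those frequencies against the submissions remains with the researcher.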


  1. Impact of the 2014 and 2015 Commonwealth Budget Decisions on the Arts Accessed 16 June 2017.
  2. Meyrick, Julian, and Tully Barnett. “Culture without ‘world’: Australian cultural policy in the age of stupid.” Cultural Trends (2017): 1-18.
  3. JSTOR Labs Accessed 16 June 2017
  4. Voyant Tools Accessed 16 June 2017


Tully Barnett is a Research Fellow in the College of Humanities, Arts and Social Sciences at Flinders University. She works on the ARC Linkage Project Laboratory Adelaide: The Value of Culture, which is designed to formulate new ways of understanding the value of arts and culture. Tully also conducts research on literary reading in the digital age. She is Associate Director of the Australasian Consortium of Humanities Research Centres and is a member of the executive board of the Australasian Association of Digital Humanities.

ORCID: 0000-0003-0269-5814

A case study in online journalism: preliminary findings and unfolding digital humanities methods

Ms Kim Doyle1, Dr Mitchell Harrop1

1University of Melbourne, Melbourne, Australia


Large-scale datasets of communication are challenging traditional, human-driven approaches to content analysis in media and communications research (Lewis et al. 2013, p.34). With online journalism and social media producing huge amounts of digital content daily, media and communication scholars are faced with the new challenge of describing and analysing this wealth of information (Günther & Quandt 2016, p.75). The cost of this research phase can impose limitations on sample sizes, and therefore on the kinds of research questions that can be addressed (Flaounas et al. 2013, p.102). Boumans and Trilling (2016, p.8) insist that ‘the sheer amount of data and the unique features of digital content call for the application of valuable new techniques.’ Yet, despite the unfolding opportunities, computational methods are not currently commonplace in digital journalism research (Boumans & Trilling 2016, p.8). Perhaps this is due to the ‘fluid and ephemeral’ (Karlsson & Sjøvaag 2016, p.179) nature of online journalism, which is currently ‘more akin to a flowing river that Web scrapers or algorithms can step into at a fixed point in time’ (ibid, p.186). In the presentation we will argue that using methodologies from computer science and computational linguistics has the potential to help disclose patterns, consistencies and organisational factors of online news production that would otherwise remain undiscovered with traditional methods alone.

This presentation is a case study in an interdisciplinary project in the field of Media Studies, which forms part of the first author’s ongoing doctoral research at The University of Melbourne. The research project is designed to explore large-scale data mining of multiple news websites, as well as utilising the Natural Language Toolkit as an analytical tool. Such a broad approach is supported by Karlsson & Sjøvaag, who argue that for reasons of methodological veracity, ‘researchers should ideally command both the access, collection and storage of digital news data’ (Karlsson & Sjøvaag 2016, p.187).

Tailoring code and data acquisition is often costly and/or time consuming; therefore ‘it is imperative that researchers share their computational solutions with other researchers’ (ibid, p.187). As such, the research project may yield a platform that is relevant and reusable for future Digital Humanities researchers. Yet reuse opportunities are not sufficient for quality research. Karlsson and Sjøvaag urge researchers ‘to publish more widely on the process of inductive method design rather than merely disseminating results from the analysis’ (Karlsson & Sjøvaag 2016, p.187). Moving towards standards for large-scale analysis of communications data therefore requires greater attention to the processes of preparing and conducting digital data collection and analysis.

The project addresses these dual concerns by constructing a platform using Apache Spark and the Zeppelin Notebook to inductively scrape, analyse and display news data using multiple programming languages within the same user interface. This builds on the Digital Humanities tradition of ‘building things’ as additional research outputs, but goes further by creating a flexible, scalable and Open Source platform for next generation Digital Humanities scholars. This solution was researcher-driven, with the indispensable support of informatics and research services at The University of Melbourne. The platform is being built in collaboration with the Social Cultural Informatics Platform (SCIP) and Research Platforms, often through a process of mutual learning. The SCIP digital humanities initiative has previously been described at eResearch Australasia (see Neish et al., 2015).
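
One way to picture the ‘flowing river’ problem of ephemeral online news is to keep a timestamped version history for each article, so that revisions between scrapes are retained rather than overwritten. The sketch below is plain Python, not Spark or Zeppelin code, and is not drawn from the project's platform; the class `SnapshotStore`, its `record` method and the example URL are hypothetical names introduced for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SnapshotStore:
    """Keeps every distinct version of the text seen at each URL."""

    # Maps url -> list of (timestamp, sha256 digest, text) tuples, oldest first.
    versions: dict = field(default_factory=dict)

    def record(self, url, text, seen_at=None):
        """Store text if it differs from the latest version; return True if stored."""
        seen_at = seen_at or datetime.now(timezone.utc)
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        history = self.versions.setdefault(url, [])
        if history and history[-1][1] == digest:
            return False  # unchanged since the previous scrape
        history.append((seen_at, digest, text))
        return True


store = SnapshotStore()
url = "https://example.org/news/story-1"  # placeholder URL, not a real source
first = store.record(url, "Council approves budget.")
repeat = store.record(url, "Council approves budget.")
revised = store.record(url, "Council approves revised budget.")
print(first, repeat, revised, len(store.versions[url]))
```

Hashing the text keeps the change check cheap, and storing the full text of each version lets later analysis compare how a story was edited over time.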

This talk will discuss initial problems encountered during the ongoing research process. For example, when building tools and platforms for Digital Humanities, it is important to recognise that humanities have their own methods ‘not based in calculation, automation, or statistical probability, but in ambiguity, interpretation, and in embodied and situated models of knowledge and knowing’ (Burdick et al. 2012, p.92). Thus, a Humanities and Social Sciences (HASS) platform must be flexible enough to incorporate mixed-methodologies, as well as mixed skill levels and skill specialisations.


Boumans, J.W. & Trilling, D., (2016). Taking stock of the toolkit: An overview of relevant automated content analysis approaches and techniques for digital journalism scholars. Digital Journalism, 4(1), pp.8–23.

Burdick, A., Drucker, J., Lunenfeld, P., Presner, T., & Schnapp, J. (2012). Digital_Humanities. MIT Press.

Flaounas, I., Ali, O., Lansdall-Welfare, T., De Bie, T., Mosdell, N., Lewis, J., & Cristianini, N. (2013). Research Methods in the Age of Digital Journalism: Massive-scale automated analysis of news-content—topics, style and gender. Digital Journalism, 1(1), 102-116.

Günther, E. & Quandt, T., (2016). Word Counts and Topic Models: Automated text analysis methods for digital journalism research. Digital Journalism, 4(1), pp.75–88.

Karlsson, M. & Sjøvaag, H., (2016). Content analysis and online news: epistemologies of analysing the ephemeral Web. Digital Journalism, 4(1), pp.177–192.

Lewis, S.C., Zamith, R. & Hermida, A., (2013). Content Analysis in an Era of Big Data: A Hybrid Approach to Computational and Manual Methods. Journal of broadcasting & electronic media, 57(1), pp.34–52.

Neish, P., Murray, A. & Konstantelos, L. (2015) “The role of research data repositories in social and cultural informatics and the wider open data ecosystem” eResearch Australasia Conference, Brisbane, Australia, 19th-23rd October, 2015. Available:



Dr Mitchell Harrop
Mitchell is a Humanities and Social Sciences Informatics Specialist in the Social and Cultural Informatics Platform (SCIP) at The University of Melbourne. Mitchell’s PhD research involved ethnographic studies of digital game playing and was conducted within Melbourne University’s Computing and Information Systems department. He has lectured in Informatics, Database Systems and Web Information Technologies.

Cultural Heritage & Library Collections as Data and their Role in Digital Humanities Infrastructure

Dr Toby Burrows1, Prof. Deb Verhoeven2, Dr Christopher McAvaney3

1University of Oxford, Oxford, UK,

2Deakin University, Burwood, Australia,

3Deakin University, Burwood, Australia


The importance of cultural heritage collections for research in the humanities, arts and social sciences has long been recognized. The digital and digitized forms of these collections are equally crucial for research which uses the methodologies, technologies and critical perspectives of the digital humanities. For the National Research Infrastructure Roadmap Report (2017), HASS platforms encompass both the physical collections and “online portals that facilitate the digitisation of and digital access to original artefacts, materials and knowledge.” [1] The Report emphasizes discoverability and accessibility as priorities, together with “enhanced digitisation aggregation and interpretation platform processes”.

At the same time, an initiative to understand these collections as data is gathering pace in the United States. Under the auspices of the Library of Congress and the Institute of Museum and Library Services, this “Collections as Data” program “aims to foster a strategic approach to developing, describing, providing access to, and encouraging reuse of collections that support computationally-driven research.” [2] One of the drivers for this initiative is the perception that, as Miriam Posner argues, “Libraries and archives [and museums] are increasingly making their materials available online, but, as a general rule, these materials aren’t of much use for computational purposes.” [3]

Instead, as Thomas Padilla has summarized the project, a “collections as data” imperative can benefit research communities by shifting cultural heritage and library practices in three key frames:
1. Generativity: to increase meaning making capacity
2. Legibility: to document and convey provenance and possibility
3. Creativity: to empower experimentation [4]

While much of the work of the “collections as data” program is focused on ways of making collections data available and accountable, there is also an interest in the relationship between data and infrastructure. The Santa Barbara Statement on Collections as Data summarises: “Working toward interoperability entails alignment with emerging and/or established community standards and infrastructure.” [5]

This presentation will evaluate three different approaches to delivering and using collections data to build HASS-oriented platforms. The British Museum’s ResearchSpace is a platform for bringing together data from cultural heritage collections by mapping to the CIDOC-CRM ontology. [6] Initially limited to the British Museum’s own collections, it is now being tested by other institutions in Europe and North America, as well as by the Collecting the West project in Western Australia. This project is using ResearchSpace to bring together data relating to Western Australian objects held in collections in Australia and Europe. [7] ResearchSpace enables researchers to work with collections data by adding annotations and arguments to objects and other entities.

OXLOD (Oxford Linked Open Data) is taking a similar approach, mapping data from Oxford University’s museums and libraries to CIDOC-CRM, in order to produce an interdisciplinary platform for cultural heritage research. An estimated 200,000 digital records will be linked and mapped in the initial phase of this project. [8]
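
To give a sense of what mapping collection records to CIDOC-CRM involves, the following Python sketch turns a flat catalogue record into subject-property-object triples. It is a simplified illustration under stated assumptions, not the mapping used by ResearchSpace or OXLOD: the record fields, the `FIELD_MAP` table, the identifier scheme and the function `record_to_triples` are all hypothetical. The class and property names follow CIDOC-CRM (E22 is labelled Human-Made Object in version 7 and Man-Made Object in earlier versions).

```python
# Assumed mapping from hypothetical record fields to a CIDOC-CRM property
# and the class of the node that property points to.
FIELD_MAP = {
    "title": ("P102_has_title", "E35_Title"),
    "location": ("P55_has_current_location", "E53_Place"),
    "object_type": ("P2_has_type", "E55_Type"),
}


def record_to_triples(record):
    """Convert one flat record into (subject, predicate, object) triples."""
    subject = f"object/{record['id']}"
    triples = [(subject, "rdf:type", "E22_Human-Made_Object")]
    for field_name, (prop, target_class) in FIELD_MAP.items():
        value = record.get(field_name)
        if value is None:
            continue  # skip fields the source record does not provide
        node = f"{target_class}/{value}"  # simplistic identifier, for illustration only
        triples.append((subject, prop, node))
        triples.append((node, "rdf:type", target_class))
    return triples


# Invented record; the values do not describe a real museum object.
record = {"id": "42", "title": "Shield", "location": "London"}
triples = record_to_triples(record)
for triple in triples:
    print(triple)
```

Because records from different institutions end up expressed with the same classes and properties, they can be queried together even when the source catalogues use different field names.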

The third service is HuNI, the Australian virtual laboratory for the humanities, which ingests collections data from library catalogues like AIATSIS, AFI and ACMI as well as data from various archives. It also aggregates data from Trove’s digitized newspaper collection, and from reference works, bibliographies and event-oriented databases. HuNI has recently added a pipeline for ingesting data from different collection types, including researcher-contributed collections via the Omeka software.

HuNI re-formats collections data by extracting entities from incoming records and making them available for linking and visualizing in the form of a network graph. Interpretations can be added to the data in the form of relationships and links, and the entities can be re-constituted into a researcher’s own collections. [9] Queries can be performed via keyword or through the graph search. Relationships between records are distinguished as either HuNI system-generated links or user-created links.
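
The distinction between system-generated and user-created links can be pictured with a small graph structure in which every link carries its origin. This Python sketch is only an illustration of that idea and does not reflect HuNI's actual data model or API; the class `EntityGraph`, its methods and the entity names are invented.

```python
from collections import defaultdict


class EntityGraph:
    """Undirected graph of entities whose links record who created them."""

    def __init__(self):
        # Maps entity -> list of (other entity, relationship label, origin).
        self.links = defaultdict(list)

    def add_link(self, source, target, relationship, origin):
        """Add a link in both directions; origin is 'system' or 'user'."""
        if origin not in ("system", "user"):
            raise ValueError("origin must be 'system' or 'user'")
        self.links[source].append((target, relationship, origin))
        self.links[target].append((source, relationship, origin))

    def neighbours(self, entity, origin=None):
        """Return sorted neighbours, optionally filtered by link origin."""
        return sorted(
            {
                other
                for other, _, link_origin in self.links.get(entity, [])
                if origin is None or link_origin == origin
            }
        )


graph = EntityGraph()
graph.add_link("Person A", "Film B", "directed", "system")  # derived from ingested records
graph.add_link("Person A", "Place C", "born in", "user")    # asserted by a researcher
all_neighbours = graph.neighbours("Person A")
user_neighbours = graph.neighbours("Person A", origin="user")
print(all_neighbours, user_neighbours)
```

Keeping the origin on each link means a researcher's interpretations can sit alongside the aggregated data without being mistaken for it.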

What these three services have in common is the idea of taking collections data and using them to create network graphs of relationships between entities – including people, places and objects. While there are other things which can be done with collections data (such as image interoperability using IIIF, and textual analysis of digital texts), network graphs are a powerful way of uncovering the meaning and significance of the knowledge embedded in cultural heritage collections. Analysis of these services will form the basis for a set of recommendations for best practice in making collections data available for computational purposes.


1. Australian Government, 2016 National Research Infrastructure Roadmap (Canberra, 2017), p. 33

2. Always Already Computational: Library Collections as Data (2017)

3. Posner, Miriam, “Actually Useful Collection Data: Some Infrastructure Suggestions”, in: Always Already Computational: Library Collections as Data: National Forum Position Statements (2017)

4. Padilla, Thomas, “On a Collections as Data Imperative”,

5. The Santa Barbara Statement on Collections as Data (2017)




9. Burrows, Toby and Deb Verhoeven, “Aggregating Cultural Heritage Data for Research Use: The Humanities Networked Infrastructure (HuNI)”, in Metadata and Semantics Research, 9th Research Conference, MTSR 2015, Manchester, UK, September 9–11, 2015: Proceedings, ed. Emmanouel Garoufallou, Richard J. Hartley, Panorea Gaitanou (Communications in Computer and Information Science, 544) (Cham: Springer, 2015), pp. 417-423



Deb Verhoeven is Professor and Chair of Media and Communication at Deakin University. She is Director of the Humanities Networked Infrastructure (HuNI).
