Thomas G. Close1,2, Francesco Sforazzini1, Phillip G. D. Ward1,3,4, Zhaolin Chen1,5, Gary F. Egan1,3,4
1Monash Biomedical Imaging, Melbourne, Australia, email@example.com
2Australian National Imaging Facility, Australia
3Australian Research Council Centre of Excellence for integrative Brain Function, Melbourne, Australia
4Monash Institute of Cognitive and Clinical Neurosciences, Monash University, Melbourne, Australia
5Department of Electrical and Computer Systems Engineering, Monash University, Melbourne, Australia
Despite the availability of well-established analysis packages, designing comprehensive and portable neuroimaging workflows is challenging due to the wide variety of tools, parameters, and scanner configurations involved. Abstraction of Repository-Centric ANAlysis (Arcana) (https://arcana.readthedocs.io) is a Python framework for designing complex workflows that are comprehensive, portable between sites, and scalable to large studies. Several neuroimaging workflows, including fibre tracking and vein analysis, were implemented using Arcana in the NiAnalysis package (https://github.com/mbi-image/nianalysis) and run over a large cohort to demonstrate the scalability of this approach. These workflows are integrated with Monash Biomedical Imaging’s XNAT repository (Marcus et al., 2007) and can be triggered on data ingest to fully automate analyses.
Arcana builds on Nipype (Gorgolewski et al., 2011) to encapsulate repository data within Python classes (Fig. 1). Central to its design is the Study class, which specifies all the products that can be derived from a specific set of acquisitions, together with the pipelines that produce them. Study objects aggregate modular repository (XNAT (Marcus et al., 2007), BIDS (Gorgolewski et al., 2016), or a custom directory format), execution (linear, multi-process and SLURM), and data-matching components in order to manage the derivation of products on demand.
Imaging contrast or modality-specific analyses are implemented in Arcana by sub-classing the Study class to specify new products and pipelines (Fig. 1). Multi-contrast/modality studies can be represented by combining Study classes in MultiStudy classes.
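The on-demand derivation and sub-classing described above can be illustrated with a minimal, self-contained sketch. It does not use the Arcana API: `ToyStudy`, `ToyDiffusionStudy`, the product names and the list-based "images" are all hypothetical stand-ins, chosen only to show how a class-level specification of products and pipelines might be resolved recursively when a product is requested.

```python
# Toy illustration only: these classes are NOT the Arcana API.
class ToyStudy:
    # Maps each derivable product to (pipeline function, names of its inputs)
    data_specs = {}

    def __init__(self, acquired):
        # 'acquired' stands in for acquisitions already held in a repository
        self._store = dict(acquired)

    def data(self, name):
        """Return the named product, deriving it and its prerequisites on demand."""
        if name not in self._store:
            pipeline, input_names = self.data_specs[name]
            inputs = [self.data(n) for n in input_names]  # resolve prerequisites first
            self._store[name] = pipeline(*inputs)
        return self._store[name]


class ToyDiffusionStudy(ToyStudy):
    # A sub-class extends the inherited specification with new products
    data_specs = {
        **ToyStudy.data_specs,
        "brain_mask": (lambda dwi: [v > 1 for v in dwi], ["dwi"]),
        "masked_dwi": (
            lambda dwi, mask: [v if m else 0 for v, m in zip(dwi, mask)],
            ["dwi", "brain_mask"],
        ),
    }


study = ToyDiffusionStudy({"dwi": [0, 3, 5, 1, 2]})
masked = study.data("masked_dwi")  # derives "brain_mask" first, then "masked_dwi"
```

Requesting only `masked_dwi` is enough here: the prerequisite `brain_mask` is derived implicitly, which mirrors the idea that a workflow designer names the product they want rather than the steps needed to reach it.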
Susceptibility-weighted imaging (SWI) (1.8 mm, 256×232×72, TE = 20 ms, TR = 30 ms), MPRAGE (1 mm, 256×240×192, TR = 2300 ms, TE = 2.07 ms), and dMRI (2 mm, 110×100×60, TE = 95 ms, TR = 8200 ms, 33 directions with b = 1500 s/mm² and 3 b = 0 volumes) datasets were acquired for 544 healthy elderly subjects.
Pipelines for producing composite vein masks from QSM, SWI and T1-weighted MRI (Ward et al., 2018), and white matter tractograms from diffusion-weighted MRI (Tournier et al., 2012), were implemented in the Arcana framework and applied to the cohort of healthy elderly subjects. Composite vein masks were produced for 535 subjects without manual intervention within 13,000 compute hours on a cluster of Intel Xeon CPU E5 2.50GHz nodes. Data is pulled from, and derived products pushed to, an XNAT repository, with processing jobs submitted to MASSIVE (Goscinski et al., 2014).
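As a rough worked check of the figures reported above, the following sketch computes the implied per-subject cost and completion rate. It assumes that the 13k compute hours is an upper bound covering only the 535 completed vein masks, and that those 535 subjects are drawn from the 544 acquired; neither assumption is stated explicitly in the text.

```python
# Figures taken from the text; their interpretation is an assumption
compute_hours = 13_000      # reported upper bound on total compute hours
subjects_completed = 535    # subjects with a composite vein mask
subjects_acquired = 544     # subjects with acquired datasets

# Upper bound on compute hours per completed subject (roughly 24.3)
hours_per_subject = round(compute_hours / subjects_completed, 1)

# Percentage of acquired subjects processed without manual intervention (roughly 98.3)
completion_rate = round(100 * subjects_completed / subjects_acquired, 1)

print(hours_per_subject, completion_rate)
```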
The encapsulation of repository data and pipeline generation by Arcana enables workflow designers to create portable, comprehensive workflows while focusing purely on the core logic of their analysis. Arcana’s modular pipeline and inheritance architecture promotes code reuse through the sharing of common segments (e.g. registration) and modifications of existing workflows. Intermediate products are saved in the repository and can be reused by subsequent analyses, saving computation time and manual quality control checks.
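The reuse of intermediate products can be sketched as follows. This is a toy model rather than Arcana code: `ToyRepository`, `derive`, `register` and the two analysis functions are hypothetical, and the "registration" step is a placeholder. The point is only that a shared segment runs once when its output is stored in, and later read back from, the repository.

```python
class ToyRepository:
    """In-memory stand-in for a data repository (hypothetical, not Arcana's API)."""

    def __init__(self):
        self._products = {}

    def has(self, name):
        return name in self._products

    def get(self, name):
        return self._products[name]

    def put(self, name, value):
        self._products[name] = value


run_counts = {}  # how many times each product was actually computed


def derive(repo, name, pipeline, *inputs):
    """Run 'pipeline' only if 'name' is not already stored in the repository."""
    if not repo.has(name):
        run_counts[name] = run_counts.get(name, 0) + 1
        repo.put(name, pipeline(*inputs))
    return repo.get(name)


def register(image):
    # Placeholder for a shared registration segment
    return [v + 1 for v in image]


def vein_analysis(repo):
    reg = derive(repo, "t1_registered", register, repo.get("t1"))
    return derive(repo, "vein_mask", lambda r: [v > 2 for v in r], reg)


def tract_analysis(repo):
    reg = derive(repo, "t1_registered", register, repo.get("t1"))
    return derive(repo, "seed_region", lambda r: [v % 2 == 0 for v in r], reg)


repo = ToyRepository()
repo.put("t1", [1, 2, 3])
veins = vein_analysis(repo)
seeds = tract_analysis(repo)  # reuses the stored "t1_registered" product
```

After both analyses have run, `run_counts["t1_registered"]` is 1: the second analysis reads the registered image back instead of recomputing it.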
When using an XNAT repository, it is possible to trigger Arcana workflows on data ingest and thereby fully automate the analysis. This automation makes it practical to analyse data from large studies as they are acquired and to identify any issues with the acquisition protocol that might arise (e.g. from scanner upgrades or hardware faults).
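One way to picture ingest-triggered analysis is a callback registered on the repository. The sketch below is purely illustrative and makes no claim about how XNAT or Arcana implement triggering: `ToyIngestRepository`, `on_ingest`, `analyse_on_ingest`, the required scan names and the file names are all hypothetical. It shows both outcomes mentioned above, automatic analysis of a complete session and early flagging of a session that deviates from the expected protocol.

```python
class ToyIngestRepository:
    """Hypothetical repository that notifies listeners whenever a session is ingested."""

    def __init__(self):
        self.sessions = {}
        self._listeners = []

    def on_ingest(self, callback):
        self._listeners.append(callback)

    def ingest(self, session_id, scans):
        self.sessions[session_id] = dict(scans)
        for callback in self._listeners:
            callback(self, session_id)


REQUIRED_SCANS = {"swi", "t1", "dwi"}  # assumed protocol, for illustration only
issues = []  # sessions whose acquisitions do not match the protocol


def analyse_on_ingest(repo, session_id):
    scans = repo.sessions[session_id]
    missing = REQUIRED_SCANS - set(scans)
    if missing:
        # Flag protocol problems as soon as the data arrive
        issues.append((session_id, sorted(missing)))
        return
    # Placeholder for running the real workflow and pushing its products back
    scans["derived"] = "vein_mask+tractogram"


repo = ToyIngestRepository()
repo.on_ingest(analyse_on_ingest)
repo.ingest("subj001", {"swi": "swi.nii", "t1": "t1.nii", "dwi": "dwi.nii"})
repo.ingest("subj002", {"t1": "t1.nii"})  # incomplete session
```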
Figure 1: UML description of the Arcana framework and its application to neuroimaging analysis.
Boxes: Python classes (blue=core, green=interchangeable, grey=specialised). Arrows: orange=data, magenta=processing, diamond=aggregated-in, triangle=subclass-of. Study.data(name) generates the requisite pipelines (specified in Study.data_specs) to produce requested data and uses the runner to execute them. Data is pulled and processed, then the products are pushed back to the repository.
By managing the complete flow of data from/to a repository in a flexible and extensible manner, Arcana enables the automation of complex analyses of large-scale neuroimaging studies.
- Marcus, D.S., Olsen, T.R., Ramaratnam, M., & Buckner, R.L. (2007), ‘The extensible neuroimaging archive toolkit’. Neuroinformatics, vol. 5, pp. 11–33.
- Gorgolewski, K., Burns, C.D., Madison, C., Clark, D., Halchenko, Y.O., Waskom, M.L., & Ghosh, S.S. (2011), ‘Nipype: A Flexible, Lightweight and Extensible Neuroimaging Data Processing Framework in Python’. Frontiers in Neuroinformatics, vol. 5.
- Gorgolewski, K.J., Auer, T., Calhoun, V.D., Craddock, R.C., Das, S., Duff, E.P., et al. (2016), ‘The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments’. Scientific Data, vol. 3.
- Ward, P.G.D., Ferris, N.J., Raniga, P., Dowe, D.L., Ng, A.C.L., Barnes, D.G., & Egan, G.F. (2018), ‘Combining images and anatomical knowledge to improve automated vein segmentation in MRI’. NeuroImage, vol. 165, pp. 294–305.
- Tournier, J.D., Calamante, F., & Connelly, A. (2012), ‘MRtrix: Diffusion tractography in crossing fiber regions’. International Journal of Imaging Systems and Technology, vol. 22, pp. 53–66.
- Goscinski, W.J., McIntosh, P., Felzmann, U.C., Maksimenko, A., Hall, C.J., Gureyev, T.D., Thompson, D., Janke, A., Galloway, G., Killeen, N.E.B., Raniga, P., Kaluza, O., Ng, A., Poudel, G., Barnes, D., Nguyen, T., Bonnington, P., & Egan, G.F. (2014), ‘The Multi-modal Australian ScienceS Imaging and Visualisation Environment (MASSIVE) high performance computing infrastructure: applications in neuroscience and neuroinformatics research’. Frontiers in Neuroinformatics.
Tom completed his PhD on global tractography methods from diffusion MRI at the University of Melbourne. He became interested in neuroinformatics during a post-doc at the Okinawa Institute of Science and Technology, Japan, where he worked on standardising model descriptions of neural systems. Since returning to Melbourne, he has been developing robust and reproducible informatics workflows for the analysis of neuroimaging data at Monash Biomedical Imaging.