Tools for creating full TVB models from individual anatomical scans
tvb-recon preprocesses MR scans to produce files that are compatible with TVB. The results can later be uploaded to TVB or used independently for modeling.
The mandatory inputs are DWI and T1 scans. Optionally, CT scans can be given as input if sensor preprocessing is needed.
We use the Pegasus WMS to connect and automate the pipeline steps. Pegasus is distributed under the Apache v2.0 license, while our own code is GPL v3.
Illustration of pipeline input and output based on imaging data of one subject.
Image taken from article [2] in the Bibliography.
For an automated pipeline run, the patient data is divided into three categories:
For a multi-patient sequential run, the data must be structured in the same way for each patient. The patient folders must also follow a common naming scheme: a shared prefix followed by the patient number (TVB1, TVB2, ...). As an example, a simple folder structure can be:
TVB_patients
├── TVB1
│   └── raw
│       └── mri
│           ├── t1_input.nii.gz
│           ├── dwi_raw.nii
│           ├── dwi.bvec
│           └── dwi.bval
└── TVB2
    └── raw
        └── mri
            ├── t1_input.nii.gz
            ├── dwi_raw.nii
            ├── dwi.bvec
            └── dwi.bval
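Before starting a run, it can help to confirm that every patient folder matches the layout above. The following is a minimal sketch, not part of tvb-recon: the helper names `missing_inputs` and `check_all` are hypothetical, and the file names are simply those shown in the example tree.

```python
from pathlib import Path

# File names taken from the example tree above.
EXPECTED_FILES = ("t1_input.nii.gz", "dwi_raw.nii", "dwi.bvec", "dwi.bval")


def missing_inputs(patients_root, patient_id):
    """Return the expected input files that are absent for one patient."""
    mri_dir = Path(patients_root) / patient_id / "raw" / "mri"
    return [name for name in EXPECTED_FILES if not (mri_dir / name).is_file()]


def check_all(patients_root, prefix="TVB"):
    """Map each patient folder (name starting with prefix) to its missing files."""
    root = Path(patients_root)
    return {
        folder.name: missing_inputs(root, folder.name)
        for folder in sorted(root.iterdir())
        if folder.is_dir() and folder.name.startswith(prefix)
    }
```

Calling `check_all("your_path_to_TVB_patients/TVB_patients")` returns one entry per patient folder; an empty list means that all four expected files were found.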
We provide a Docker image that bundles all the dependencies needed to run the tvb-recon code. The image is available on Docker Hub as thevirtualbrain/tvb-recon. Pull the most recent tag with:
# Pull the Docker image
$ docker pull thevirtualbrain/tvb-recon
It is also useful to have the tvb-recon code locally, in case some changes are necessary. Clone it with:
# Clone this repository
$ git clone https://github.com/the-virtual-brain/tvb-recon.git
# Go into the repository
$ cd tvb-recon
To use tvb-recon within the proposed Docker image, you need to know how it is configured and which steps to follow to specify your input data and start a workflow. We recommend that new users start with the default configurations and adjust their data structure to match. After a first workflow run has finished successfully, you can choose your own configurations and data structure.
The pipeline mostly processes T1 and DWI data. CT scans can optionally be processed as well, but we advise starting with T1 and DWI only. By default, the tvb-recon pipeline expects the T1 and DWI input in a certain folder structure with certain file names. These can be changed later, but keep the defaults for a first test. This means you should arrange your input data folder as follows (and rename your files as shown):
TVB_patients
├── TVB1
│   └── raw
│       └── mri
│           ├── t1_input.nii.gz
│           ├── dwi_raw.nii
│           ├── dwi.bvec
│           └── dwi.bval
└── TVB2
    └── raw
        └── mri
            ├── t1_input.nii.gz
            ├── dwi_raw.nii
            ├── dwi.bvec
            └── dwi.bval
(TVB1, TVB2, etc. are the patient IDs. If your DWI data does not consist of a NIfTI volume plus .bvec and .bval files, let us know and we will tell you how to specify it differently.)
Once your data has this folder structure, you can run the tvb-recon Docker image with the command below. Please make sure Docker has enough RAM assigned; we recommend at least 6 GB.
# To run the tvb-recon docker image
$ docker run -it -v your_path_to_TVB_patients/TVB_patients/:/home/submitter/data -v your_path_to_tvb_recon/tvb-recon/:/opt/tvb-recon thevirtualbrain/tvb-recon /bin/bash
(Replace your_path_to_TVB_patients and your_path_to_tvb_recon with the corresponding paths on your local machine.)
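The two `-v` mounts are the easiest part of this command to get wrong, because Docker expects absolute host paths for bind mounts. The sketch below assembles the same command from two local paths and resolves them first; `docker_run_command` is a hypothetical helper, not something shipped with tvb-recon.

```python
from pathlib import Path

IMAGE = "thevirtualbrain/tvb-recon"


def docker_run_command(patients_dir, recon_dir):
    """Build the `docker run` argument list with absolute host paths."""
    data = Path(patients_dir).expanduser().resolve()
    code = Path(recon_dir).expanduser().resolve()
    return [
        "docker", "run", "-it",
        "-v", f"{data}:/home/submitter/data",   # patient data mount
        "-v", f"{code}:/opt/tvb-recon",         # local tvb-recon checkout
        IMAGE, "/bin/bash",
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` from an interactive terminal, or joined with spaces and pasted into a shell.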
You can now use bash commands inside the tvb-recon container. Follow these steps:
# Run the following command and provide the sudo password: 123456
$ sudo condor_master
# Move to the pegasus folder
$ cd pegasus
# Run the pipeline with the following command. The "1" argument is the number of the patient you want to process: "1" selects TVB1. To run multiple patients (TVB1, TVB2 and TVB3), the argument should be: "1 2 3".
$ python run_sequentially.py "1"
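The relationship between the argument and the patient folders can be illustrated as follows. This is only a sketch of the naming convention described above; how run_sequentially.py actually parses its argument is not shown here, and both the helper name `patient_folders` and the whitespace-separated format are assumptions.

```python
def patient_folders(arg, prefix="TVB"):
    """Turn an argument such as "1 2 3" into ["TVB1", "TVB2", "TVB3"]."""
    # int() rejects anything that is not a plain patient number.
    return [f"{prefix}{int(token)}" for token in arg.split()]
```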
If everything is set up correctly, you should see a sequence of messages similar to the following:
...
Starting to process the subject: TVB1
...
2018.06.28 11:11:40.285 UTC: Your workflow has been started and is running in the base directory:
2018.06.28 11:11:40.293 UTC: /home/submitter/pegasus/submit/submitter/pegasus/TVB-PIPELINE/run0001
...
The job that has been started has the id: 0001
Starting to monitor the submit folder: /home/submitter/pegasus/submit/submitter/pegasus/TVB-PIPELINE/run0001 ...
Checked at Thu, 28 Jun 2018 11:11:42 and monitord.done file was not generated yet!
If the messages you see differ from these, let us know what error is reported.
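If you capture this output in a file, the subject, job id and submit folder can be pulled out automatically, which is handy when checking the workflow status afterwards. A minimal sketch, assuming the message wording matches the example above exactly; `parse_startup_log` is a hypothetical helper.

```python
import re

SUBJECT_RE = re.compile(r"Starting to process the subject: (\S+)")
JOB_ID_RE = re.compile(r"The job that has been started has the id: (\d+)")
SUBMIT_RE = re.compile(r"Starting to monitor the submit folder: (\S+)")


def parse_startup_log(text):
    """Extract subject, job id and submit folder; missing items are None."""
    def first(pattern):
        match = pattern.search(text)
        return match.group(1) if match else None

    return {
        "subject": first(SUBJECT_RE),
        "job_id": first(JOB_ID_RE),
        "submit_dir": first(SUBMIT_RE),
    }
```

A value of `None` for any key is a quick sign that the startup did not follow the expected flow.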
Once you have started the workflow, you should see a new folder named configs on your local machine at:
your_path_to_TVB_patients/TVB_patients/TVB1
This folder holds all the default configurations needed for a patient (these can be changed).
Later on, after some important steps have finished, you will also find an output folder inside:
your_path_to_TVB_patients/TVB_patients/TVB1
All the output data is stored there. The folders of most interest are:
We use the Pegasus workflow engine to automate the pipeline steps. It lets you check the status of the workflow at any time. To check the status of your current workflow:
# You can open a new terminal on the tvb-recon docker container with:
$ docker exec -i -t container_id /bin/bash
# Then run this command
$ pegasus-status -l /home/submitter/pegasus/submit/submitter/pegasus/TVB-PIPELINE/run0001
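The startup messages mention a monitord.done file in the submit folder. Assuming that this file appears only once monitoring of the run has ended (an interpretation of that log message, not a documented guarantee), a script can wait for it instead of calling pegasus-status by hand. `wait_for_monitord_done` is a hypothetical helper.

```python
import time
from pathlib import Path


def wait_for_monitord_done(submit_dir, poll_seconds=60, timeout_seconds=None):
    """Poll submit_dir until monitord.done exists; return True if it appeared."""
    marker = Path(submit_dir) / "monitord.done"
    deadline = None if timeout_seconds is None else time.monotonic() + timeout_seconds
    while not marker.exists():
        if deadline is not None and time.monotonic() >= deadline:
            return False  # gave up before the marker file showed up
        time.sleep(poll_seconds)
    return True
```

The presence of the file says nothing about success or failure, so the workflow status should still be checked with pegasus-status afterwards.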
Once a first default workflow has run successfully, you can adjust the configurations to fit your data instead of adjusting the data structure.
There are two entry points for the pipeline, both under the pegasus folder. Each of them requires some configurations to be defined first. These configurations are kept in a folder specific to each patient and are explained in the next section.
The pipeline can be started using one of the following entry points:
main_pegasus.sh
This is the most straightforward one. It starts one pipeline run for a single patient, based on a set of predefined configurations. Command to launch the pipeline with this script:
$ sh main_pegasus.sh path_to_configurations_folder path_to_dax_folder
The arguments:
The disadvantage of this entry point is that the user must manually fill in all the configuration files in the configurations folder.
run_sequentially.py
This one is a little more complex. It starts pipeline runs for a list of patients with similar configurations. As the name suggests, the runs are started sequentially.
Command to launch the pipeline with this script:
$ python run_sequentially.py
This script does not need arguments, but the user must edit the necessary configurations inside the file run_sequentially.py. The configurations to edit are described below:
All the configuration files are under pegasus/config at the top level. Some configurations are specific to the patient, some to the machine where the workflow is running, and some to the actual run. Details about each file are given below:
There are cases when the user is not satisfied with the obtained results.
Maybe the volume overlapping is not correct. Maybe more tracts or longer tracts are needed. Maybe the user has T2 scans and wants to add them.
All of these cases require a pipeline rerun. Ideally, the user reruns only the steps that are affected instead of the whole pipeline.
With Pegasus, the pipeline can be rerun partially: only the steps whose outputs are missing from rc.txt are executed again.
This is possible with Pegasus, but it is not automated; it needs user input and attention. To rerun with different parameters, the user has to:
As stated before, rc.txt contains a mapping between the generated file names and their paths. For Pegasus to rerun a group of steps, the user has to remove the output files of those steps from rc.txt.
During the rerun, rc.txt is filled in again as Pegasus reruns the steps (and all their dependencies) that produce the resources missing from rc.txt.
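Editing rc.txt by hand is error-prone when many outputs are involved. The sketch below removes the entries for a given set of file names and keeps a backup. It assumes that each line of rc.txt starts with the file name, followed by whitespace and the path; check this against your own rc.txt before relying on it. `drop_rc_entries` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def drop_rc_entries(rc_path, names_to_drop):
    """Remove the lines of rc.txt whose first field is in names_to_drop.

    A backup copy is written next to the original as rc.txt.bak.
    Returns the list of file names that were actually removed.
    """
    rc_path = Path(rc_path)
    shutil.copy2(rc_path, rc_path.with_name(rc_path.name + ".bak"))
    names_to_drop = set(names_to_drop)
    kept, removed = [], []
    for line in rc_path.read_text().splitlines(keepends=True):
        fields = line.split()
        if fields and fields[0] in names_to_drop:
            removed.append(fields[0])
        else:
            kept.append(line)
    rc_path.write_text("".join(kept))
    return removed
```

Comparing the returned list with the names you asked for shows whether any of them were already absent from rc.txt.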
The main_bnm.dax file contains an XML representation of the workflow graph. This is where the user can check all the pipeline steps and their input/output files.
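To find out which outputs belong to which step, the DAX file can be read programmatically. The sketch below assumes the usual Pegasus DAX layout, in which each step is a `job` element whose `uses` children carry `name` and `link="input"` or `link="output"` attributes; verify this against your own main_bnm.dax. `list_steps` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace such as {http://...}job down to job."""
    return tag.rsplit("}", 1)[-1]


def list_steps(dax_source):
    """Return [(job_name, inputs, outputs), ...] from a DAX path or file object."""
    steps = []
    for element in ET.parse(dax_source).getroot().iter():
        if _local(element.tag) != "job":
            continue
        uses = [child for child in element if _local(child.tag) == "uses"]
        inputs = [u.get("name") for u in uses if u.get("link") == "input"]
        outputs = [u.get("name") for u in uses if u.get("link") == "output"]
        steps.append((element.get("name"), inputs, outputs))
    return steps
```

The output names collected this way are the candidates to remove from rc.txt when a step has to be rerun.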
Pegasus already supports recovery after an unexpected shutdown: when the machine is started again, it restarts the flow. If the recovery run does not start on its own, it can be forced by calling pegasus-start inside the submit folder.
The pipeline steps are dependent on the following external tools:
The automated workflow is based on:
HTCondor
Pegasus uses HTCondor as a job scheduler.
Download the tarballs (current stable release) from here: http://research.cs.wisc.edu/htcondor/downloads/
Install for MacOS:
tar xzf condor-8.6.9-x86_64_MacOSX-stripped.tar.gz
cd condor-8.6.9-x86_64_MacOSX10-stripped
./condor_install --type=execute,manager,submit
Prepare environment:
Pegasus
This is the workflow engine we use to automate the pipeline steps.
Download tarball for MacOSX from here: https://pegasus.isi.edu/downloads/?filename=4.8.1%2Fpegasus-binary-4.8.1-x86_64_macos_10.tar.gz
Prepare the environment:
tar xzf ../pegasus-binary-4.8.1-x86_64_macos_10.tar.gz
export PATH=../pegasus-4.8.1/bin/:$PATH
Check that it works by running: pegasus-status
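After installing both tools, a quick way to confirm that the environment is prepared is to check that their commands can be found on PATH. A minimal sketch: `missing_commands` is a hypothetical helper, and the two command names are the ones used elsewhere in this document.

```python
import shutil

REQUIRED_COMMANDS = ("condor_master", "pegasus-status")


def missing_commands(commands=REQUIRED_COMMANDS):
    """Return the commands that cannot be found on PATH."""
    return [name for name in commands if shutil.which(name) is None]
```

An empty list from `missing_commands()` means both commands were found; any name in the list points to a PATH entry that still needs to be exported.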
[1] Proix T, Spiegler A, Schirner M, Rothmeier S, Ritter P, Jirsa VK. How do parcellation size and short-range connectivity affect dynamics in large-scale brain network models? Neuroimage (2016).
[2] Schirner M, Rothmeier S, Jirsa VK, McIntosh AR, Ritter P. An automated pipeline for constructing personalized virtual brains from multimodal neuroimaging data. Neuroimage (2015) Aug 15; 117:343-357.
[3] Ribeiro AS, Lacerda LM, Ferreira HA. Multimodal Imaging Brain Connectivity Analysis (MIBCA) toolbox. 2015 Jul 14.