Adopting Datalad for collaboration¶
- Created by Sebastien Tourbier - 2019 Jan 8
Move original BIDS dataset to server¶
rsync -P -v -avz -e 'ssh' --exclude 'derivatives' --exclude 'code' --exclude '.datalad' --exclude '.git' --exclude '.gitattributes' /media/localadmin/HagmannHDD/Seb/ds-newtest2/* tourbier@<SERVER_IP_ADDRESS>:/home/tourbier/Data/ds-newtest2
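Before running the copy for real, rsync's -n (dry-run) flag shows what would be transferred without moving anything. A minimal local sketch, using temporary directories as stand-ins for the source tree and the server:

```shell
# Stand-ins for the BIDS source tree and the server target (hypothetical)
src=$(mktemp -d); dst=$(mktemp -d)
touch "$src/dataset_description.json"
mkdir -p "$src/derivatives"
# -n lists what would be copied without copying anything;
# --exclude mirrors the filters used in the command above
rsync -avzn --exclude 'derivatives' "$src/" "$dst/"
ls -A "$dst"   # prints nothing: the dry run transferred no files
```

Dropping -n then performs the actual transfer with the same filter set.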
DataLad setup and dataset creation on the server (accessible via ssh)¶
Connect to server¶
ssh tourbier@<SERVER_IP_ADDRESS>
Install liblzma-dev (a dependency of pylzma, required by DataLad) and DataLad¶
sudo apt-get install liblzma-dev
pip install datalad[all]
pip install datalad_container
pip install datalad_neuroimaging
pip install datalad_revolution
Go to the source dataset directory, create a DataLad dataset and save everything¶
cd /home/tourbier/Data/ds-newtest2
datalad rev-create -f -D "Original test dataset on lab server"
datalad rev-save -m 'Source (Origin) BIDS dataset' --version-tag origin
Report on the state of dataset content¶
datalad rev-status --recursive
Processing using the Connectome Mapper BIDS App on a local workstation¶
Dataset installation¶
datalad install -s ssh://tourbier@<SERVER_IP_ADDRESS>:/home/tourbier/Data/ds-newtest2 \
/home/localadmin/Data/ds-newtest2
cd /home/localadmin/Data/ds-newtest2
Get the T1w and diffusion images to be processed, and record the commands in a bash script for reproducibility¶
datalad get -J 4 sub-*/ses-*/anat/sub-*_T1w.nii.gz
datalad get -J 4 sub-*/ses-*/dwi/sub-*_dwi.nii.gz
datalad get -J 4 sub-*/ses-*/dwi/sub-*_dwi.bvec
datalad get -J 4 sub-*/ses-*/dwi/sub-*_dwi.bval
Write datalad get commands to get_required_files_for_analysis.sh:
mkdir code
echo "datalad get -J 4 sub-*/ses-*/anat/sub-*_T1w.nii.gz" > code/get_required_files_for_analysis.sh
echo "datalad get -J 4 sub-*/ses-*/dwi/sub-*_dwi.nii.gz" >> code/get_required_files_for_analysis.sh
echo "datalad get -J 4 sub-*/ses-*/dwi/sub-*_dwi.bvec" >> code/get_required_files_for_analysis.sh
echo "datalad get -J 4 sub-*/ses-*/dwi/sub-*_dwi.bval" >> code/get_required_files_for_analysis.sh
Add all content in the code/ directory directly to git:
datalad add --to-git code
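The four echo redirections above can equivalently be written as a single heredoc, which keeps the script readable and allows a shebang line; a sketch with the same glob patterns:

```shell
mkdir -p code
cat > code/get_required_files_for_analysis.sh <<'EOF'
#!/bin/sh
# Fetch only the files required for the analysis
datalad get -J 4 sub-*/ses-*/anat/sub-*_T1w.nii.gz
datalad get -J 4 sub-*/ses-*/dwi/sub-*_dwi.nii.gz
datalad get -J 4 sub-*/ses-*/dwi/sub-*_dwi.bvec
datalad get -J 4 sub-*/ses-*/dwi/sub-*_dwi.bval
EOF
chmod +x code/get_required_files_for_analysis.sh
```

Either way, `datalad add --to-git code` then versions the script in git alongside the annexed data.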
Add the container image of the connectome mapper to the dataset¶
datalad containers-add connectomemapper-bidsapp-|release| \
--url dhub://sebastientourbier/connectomemapper-bidsapp:|release| \
--update
Save the state of the dataset prior to analysis¶
datalad rev-save -m "Seb's test dataset on local \
workstation ready for analysis with connectomemapper-bidsapp:|release|" \
--version-tag ready4analysis-<date>-<time>
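The `<date>-<time>` placeholders can be filled in automatically with `date`, so tags are unique and sort chronologically; a minimal sketch (the tag prefix is the one used above):

```shell
# Produces a tag such as ready4analysis-20190108-1542
tag="ready4analysis-$(date +%Y%m%d-%H%M)"
echo "$tag"
# then pass it on: datalad rev-save -m "..." --version-tag "$tag"
```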
Run Connectome Mapper on all subjects¶
datalad containers-run --container-name connectomemapper-bidsapp-|release| \
'/tmp' '/tmp/derivatives' participant \
--anat_pipeline_config '/tmp/code/ref_anatomical_config.ini' \
    --dwi_pipeline_config '/tmp/code/ref_diffusion_config.ini'
Save the state¶
datalad rev-save -m "Seb's test dataset on local \
workstation processed by connectomemapper-bidsapp:|release|, {Date/Time}" \
--version-tag processed-<date>-<time>
Report on the state of dataset content¶
datalad rev-status --recursive
With DataLad we don't have to keep these input files around to retain the ability to reproduce the analysis. Let's uninstall them, checking the size on disk before and after¶
datalad uninstall sub-*/*
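The before/after size check mentioned above can be done with `du`; `datalad uninstall` drops only the annexed file content, while the git history and file tree remain. A sketch (the uninstall line is shown commented, since it needs the installed dataset):

```shell
du -sh .                    # size with annexed content present
# datalad uninstall sub-*/*
du -sh .                    # size after dropping the content
```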
Local collaboration with Bob for Electrical Source Imaging¶
Processed dataset installation on Bob’s workstation¶
datalad install -s ssh://localadmin@HOS51827:/home/localadmin/Data/ds-newtest2 \
/home/bob/Data/ds-newtest2
cd /home/bob/Data/ds-newtest2
Get connectome mapper output files (Brain Segmentation and Multi-scale Parcellation) used by Bob in his analysis¶
datalad get -J 4 derivatives/cmp/sub-*/ses-*/anat/sub-*_mask.nii.gz
datalad get -J 4 derivatives/cmp/sub-*/ses-*/anat/sub-*_class-*_dseg.nii.gz
datalad get -J 4 derivatives/cmp/sub-*/ses-*/anat/sub-*_scale*_atlas.nii.gz
Write datalad get commands to get_required_files_for_analysis_by_bob.sh for reproducibility:
echo "datalad get -J 4 derivatives/cmp/sub-*/ses-*/anat/sub-*_mask.nii.gz" > code/get_required_files_for_analysis_by_bob.sh
echo "datalad get -J 4 derivatives/cmp/sub-*/ses-*/anat/sub-*_class-*_dseg.nii.gz" >> code/get_required_files_for_analysis_by_bob.sh
echo "datalad get -J 4 derivatives/cmp/sub-*/ses-*/anat/sub-*_scale*_atlas.nii.gz" >> code/get_required_files_for_analysis_by_bob.sh
Add all content in the code/ directory directly to git:
datalad add --to-git code
Update derivatives¶
cd /home/bob/Data/ds-newtest2
mkdir derivatives/cartool ...
Save the state¶
datalad rev-save -m "Bob's test dataset on local \
workstation processed by cartool:|release|, {Date/Time}" \
--version-tag processed-<date>-<time>
Report on the state of dataset content¶
datalad rev-status --recursive
With DataLad we don't have to keep these input files around to retain the ability to reproduce the analysis. Let's uninstall them, checking the size on disk before and after¶
datalad uninstall sub-*/*
datalad uninstall derivatives/cmp/*
datalad uninstall derivatives/freesurfer/*
datalad uninstall derivatives/nipype/*