API

snakebids

snakebids.bids(root: Optional[Union[str, Path]] = None, datatype: Optional[str] = None, prefix: Optional[str] = None, suffix: Optional[str] = None, subject: Optional[str] = None, session: Optional[str] = None, include_subject_dir: bool = True, include_session_dir: bool = True, **entities: str)

Helper function for generating bids paths for snakemake workflows.

File path is of the form:

[root]/[sub-{subject}]/[ses-{session}]/[datatype]/
[prefix]_[sub-{subject}]_[ses-{session}]_[{key}-{val}_ ... ]_[suffix]

Parameters:
  • root (str or Path, default=None) – root folder to include in the path (e.g. ‘results’)

  • datatype (str, default=None) – folder to include after sub-/ses- (e.g. anat, dwi )

  • prefix (str, default=None) – string to prepend to the file name (typically not defined, unless you want tpl-{tpl}, or a datatype)

  • suffix (str, default=None) – bids suffix including extension (e.g. ‘T1w.nii.gz’)

  • subject (str, default=None) – subject to use, for folder and filename

  • session (str, default=None) – session to use, for folder and filename

  • include_subject_dir (bool, default=True) – whether to include the sub-{subject} folder when subject is defined

  • include_session_dir (bool, default=True) – whether to include the ses-{session} folder when session is defined

  • **entities (dict, optional) – dictionary of bids entities (e.g. space=T1w for space-T1w)

Returns:

bids-like file path

Return type:

Path

Examples

Below is a rule using bids naming for input and output:

rule proc_img:
    input: 'sub-{subject}_T1w.nii.gz'
    output: 'sub-{subject}_space-snsx32_desc-preproc_T1w.nii.gz'

With bids() you can instead use:

rule proc_img:
   input: bids(subject='{subject}',suffix='T1w.nii.gz')
   output: bids(
       subject='{subject}',
       space='snsx32',
       desc='preproc',
       suffix='T1w.nii.gz'
   )

Note that we are not using snakemake's “functions as inputs” feature here. That feature requires a function definition that takes wildcards as its argument, and is restricted to input and params. Here, bids() is simply called to return a string.

Also note that space, desc and suffix are NOT wildcards here, only {subject} is. This makes it easy to combine wildcards and non-wildcards with bids-like naming.

However, you can still use bids() in a lambda function. This is especially useful if your wildcards are named the same as bids entities (e.g. {subject}, {session}, {task}, etc.):

rule proc_img:
    input: lambda wildcards: bids(**wildcards,suffix='T1w.nii.gz')
    output: bids(
        subject='{subject}',
        space='snsx32',
        desc='preproc',
        suffix='T1w.nii.gz'
    )

Or another example where you may have many bids-like wildcards used in your workflow:

rule denoise_func:
    input: lambda wildcards: bids(**wildcards, suffix='bold.nii.gz')
    output: bids(
        subject='{subject}',
        session='{session}',
        task='{task}',
        acq='{acq}',
        desc='denoise',
        suffix='bold.nii.gz'
    )

In this example, all the wildcards will be determined from the output and passed on to bids() for inputs. The output filename will have a ‘desc-denoise’ flag added to it.

Also note that even if you supply entities in a different order, they will be ordered according to the OrderedDict defined in the function's source. Entities that are not in that OrderedDict are placed at the end (before the suffix), in the order you provide them.
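The path layout and entity ordering described above can be sketched in plain Python. This is a simplified illustration, not the snakebids implementation: the function name sketch_bids_path and the abbreviated KNOWN_ORDER list are assumptions made for the example, and the include_subject_dir and include_session_dir flags are left out.

```python
from pathlib import Path

# Hypothetical, abbreviated entity order used only for this illustration;
# the real ordering is defined in the snakebids source and is longer.
KNOWN_ORDER = ["task", "acq", "run", "space", "desc"]


def sketch_bids_path(root=None, datatype=None, prefix=None, suffix=None,
                     subject=None, session=None, **entities):
    """Assemble a bids-like path (simplified sketch)."""
    # Known entities first, in the fixed order; unknown ones afterwards,
    # in the order the caller supplied them.
    ordered = [key for key in KNOWN_ORDER if key in entities]
    ordered += [key for key in entities if key not in KNOWN_ORDER]

    # Build the file name: prefix, sub-, ses-, entities, suffix.
    name_parts = []
    if prefix:
        name_parts.append(prefix)
    if subject:
        name_parts.append(f"sub-{subject}")
    if session:
        name_parts.append(f"ses-{session}")
    name_parts += [f"{key}-{entities[key]}" for key in ordered]
    if suffix:
        name_parts.append(suffix)

    # Build the folder: root, sub-, ses-, datatype.
    folder = Path(root) if root else Path()
    if subject:
        folder /= f"sub-{subject}"
    if session:
        folder /= f"ses-{session}"
    if datatype:
        folder /= datatype

    return folder / "_".join(name_parts)


example = sketch_bids_path(
    root="results",
    subject="{subject}",
    desc="preproc",   # supplied before space, but ordered after it
    foo="bar",        # unknown entity: placed last, before the suffix
    space="snsx32",
    suffix="T1w.nii.gz",
)
print(example.as_posix())
```

In this sketch the wildcard placeholder {subject} passes through untouched, which is what lets the returned string serve as a snakemake input or output pattern.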


snakebids.filter_list(zip_list, wildcards, return_indices_only=False)

Use this function when you are expanding over a subset of the wildcards, i.e. when your output file does not contain all the wildcards in input_wildcards.

Parameters:
  • zip_list (dict) – generated zip lists dict from config file to filter

  • wildcards (dict) – wildcard values to filter the zip lists

  • return_indices_only (bool, default=False) – return the indices of the matching wildcards

Returns:

zip list with non-matching elements removed

Return type:

dict
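A minimal pure-Python sketch of this kind of filtering is shown here to make the behaviour concrete. It is not the library's code: the name sketch_filter_list is hypothetical, and the choice to ignore wildcard keys that are absent from zip_list is an assumption.

```python
def sketch_filter_list(zip_list, wildcards, return_indices_only=False):
    """Keep only the positions of zip_list that match every wildcard value."""
    # Assumption: wildcards that are not keys of zip_list are ignored.
    keys = [key for key in wildcards if key in zip_list]

    # All lists in a zip list have the same length.
    length = len(next(iter(zip_list.values()), []))

    # A position matches when every considered wildcard agrees at that index.
    indices = [
        i for i in range(length)
        if all(zip_list[key][i] == wildcards[key] for key in keys)
    ]
    if return_indices_only:
        return indices
    return {key: [values[i] for i in indices] for key, values in zip_list.items()}


scans = {
    "dir": ["AP", "PA", "AP", "PA", "AP", "PA", "AP", "PA"],
    "acq": ["98", "98", "98", "98", "99", "99", "99", "99"],
    "subject": ["01", "01", "02", "02", "01", "01", "02", "02"],
}
filtered = sketch_filter_list(scans, {"subject": "01"})
matching = sketch_filter_list(scans, {"subject": "01"}, return_indices_only=True)
```

With return_indices_only=True the sketch returns positions in the original lists rather than a filtered dict.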

Examples

>>> import snakebids

Filtering to get all subject='01' scans:

>>> snakebids.filter_list(
...     {
...         'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
...         'acq': ['98','98','98','98','99','99','99','99'],
...         'subject': ['01','01','02','02','01','01','02','02' ]
...     },
...     {'subject': '01'}
... ) == {
...     'dir': ['AP', 'PA', 'AP', 'PA'],
...     'acq': ['98', '98', '99', '99'],
...     'subject': ['01', '01', '01', '01']
... }
True

Filtering to get all acq='98' scans:

>>> snakebids.filter_list(
...     {
...         'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
...         'acq': ['98','98','98','98','99','99','99','99'],
...         'subject': ['01','01','02','02','01','01','02','02' ]
...     },
...     {'acq': '98'}
... ) == {
...     'dir': ['AP', 'PA', 'AP', 'PA'],
...     'acq': ['98', '98', '98', '98'],
...     'subject': ['01', '01', '02', '02']
... }
True

Filtering to get all dir='AP' scans:

>>> snakebids.filter_list(
...     {
...         'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
...         'acq': ['98','98','98','98','99','99','99','99'],
...         'subject': ['01','01','02','02','01','01','02','02' ]
...     },
...     {'dir': 'AP'}
... ) == {
...     'dir': ['AP', 'AP', 'AP', 'AP'],
...     'acq': ['98', '98', '99', '99'],
...     'subject': ['01', '02', '01', '02']
... }
True

Filtering to get all subject='03' scans (i.e. no matches):

>>> snakebids.filter_list(
...     {
...         'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
...         'acq': ['98','98','98','98','99','99','99','99'],
...         'subject': ['01','01','02','02','01','01','02','02' ]
...     },
...     {'subject': '03'}
... ) == {
...     'dir': [],
...     'acq': [],
...     'subject': []
... }
True

snakebids.generate_inputs(bids_dir, pybids_inputs, derivatives=False, search_terms=None, limit_to=None, participant_label=None, exclude_participant_label=None)

Dynamically generate snakemake inputs from the pybids_inputs dict, using pybids to parse the bids dataset.

Parameters:
  • bids_dir (str) – Path to bids directory

  • pybids_inputs (dict) –

    Configuration for bids inputs, with keys as the names (str)

    Nested dicts with the following required keys:

    • "filters": Dictionary containing keyword arguments that will be passed to pybids get().

    • "wildcards": List of (str) bids tags to include as wildcards in snakemake. At minimum this should usually include ['subject','session'], plus any other wildcards that you may want to make use of in your snakemake workflow, or want to retain in the output paths. Any wildcards in this list that are not in the filename will just be ignored.

Returns:

The dict returned by this function contains seven items. Each of the following four items is a dict containing one item for each modality described by pybids_inputs.

  • "input_path": String with a wildcard-filled path that matches the images for this modality.

  • "input_zip_lists": Dictionary where each key is a wildcard entity and each value is a list of the values found for that entity. Each of these lists has length equal to the number of images matched for this modality, so they can be zipped together to get a list of the wildcard values for each image.

  • "input_lists": Dictionary where each key is a wildcard entity and each value is a list of the unique values found for that entity. These lists may not be the same length.

  • "input_wildcards": Dictionary where each key is the name of a wildcard entity, and each value is the Snakemake wildcard used for that entity.

Then there are three more top-level entries in the dictionary:

  • "subjects": A list of the subjects in the dataset.

  • "sessions": A list of the sessions in the dataset.

  • "subj_wildcards": The subject and session wildcards applicable to this dataset. {"subject": "{subject}"} if there is only one session, {"subject": "{subject}", "session": "{session}"} if there are multiple sessions.

Return type:

dict
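To illustrate how "input_path" and "input_zip_lists" fit together, the sketch below zips the per-entity lists into one dict of wildcard values per image and fills the path template with each. The data values are illustrative stand-ins for a single "bold" modality, and only standard Python is used, so this shows the idea rather than how snakemake or snakebids performs the expansion.

```python
# Illustrative values for one modality, shaped like the entries of
# "input_path" and "input_zip_lists" described above.
input_path = "bids-example/sub-{subject}/func/sub-{subject}_task-{task}_bold.nii.gz"
zip_lists = {
    "subject": ["control01", "control01"],
    "task": ["nback", "rest"],
}

# Zip the lists together: one dict of wildcard values per matched image.
per_image = [
    dict(zip(zip_lists, values))
    for values in zip(*zip_lists.values())
]

# Fill the wildcard-filled path with each combination.
paths = [input_path.format(**wildcards) for wildcards in per_image]

for path in paths:
    print(path)
```

Because the lists are zipped rather than combined as a product, only the entity combinations that were actually found in the dataset are produced.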

Notes

As an example, consider the following BIDS dataset:

bids-example/
├── dataset_description.json
├── participants.tsv
├── README
└── sub-control01
    ├── anat
    │   ├── sub-control01_T1w.json
    │   ├── sub-control01_T1w.nii.gz
    │   ├── sub-control01_T2w.json
    │   └── sub-control01_T2w.nii.gz
    ├── dwi
    │   ├── sub-control01_dwi.bval
    │   ├── sub-control01_dwi.bvec
    │   └── sub-control01_dwi.nii.gz
    ├── fmap
    │   ├── sub-control01_magnitude1.nii.gz
    │   ├── sub-control01_phasediff.json
    │   ├── sub-control01_phasediff.nii.gz
    │   └── sub-control01_scans.tsv
    └── func
        ├── sub-control01_task-nback_bold.json
        ├── sub-control01_task-nback_bold.nii.gz
        ├── sub-control01_task-nback_events.tsv
        ├── sub-control01_task-nback_physio.json
        ├── sub-control01_task-nback_physio.tsv.gz
        ├── sub-control01_task-nback_sbref.nii.gz
        ├── sub-control01_task-rest_bold.json
        ├── sub-control01_task-rest_bold.nii.gz
        ├── sub-control01_task-rest_physio.json
        └── sub-control01_task-rest_physio.tsv.gz

With the following pybids_inputs defined in the config file:

pybids_inputs:
  bold:
    filters:
      suffix: 'bold'
      extension: '.nii.gz'
      datatype: 'func'
    wildcards:
      - subject
      - session
      - acquisition
      - task
      - run

Then generate_inputs(bids_dir, pybids_inputs) would return the following dictionary:

{
    "input_path": {
        "bold": "bids-example/sub-{subject}/func/sub-{subject}_task-{task}_bold.nii.gz"
    },
    "input_zip_lists": {
        "bold": {
            "subject": ["control01", "control01"],
            "task": ["nback", "rest"]
        }
    },
    "input_lists": {
        "bold": {
            "subject": ["control01"],
            "task": ["nback", "rest"]
        }
    },
    "input_wildcards": {
        "bold": {
            "subject": "{subject}",
            "task": "{task}"
        }
    },
    "subjects": ["control01"],
    "sessions": [],
    "subj_wildcards": {"subject": "{subject}"}
}

snakebids.get_filtered_ziplist_index(zip_list, wildcards, subj_wildcards)

Use this function when you have wildcards for a single scan instance, and want to know the index of that scan, amongst that subject’s scan instances.

Parameters:
  • zip_list (dict) – lists for scans in a dataset, zipped to get each instance

  • wildcards (dict) – wildcards for the single instance whose index is being queried

  • subj_wildcards (dict) – keys of this dictionary are used to pick out the subject/(session) from the wildcards

Examples

>>> import snakebids

In this example, we have a dataset with scans from two subjects, where each subject has dir-AP and dir-PA scans, along with acq-98 and acq-99:

  • sub-01_acq-98_dir-AP_dwi.nii.gz

  • sub-01_acq-98_dir-PA_dwi.nii.gz

  • sub-01_acq-99_dir-AP_dwi.nii.gz

  • sub-01_acq-99_dir-PA_dwi.nii.gz

  • sub-02_acq-98_dir-AP_dwi.nii.gz

  • sub-02_acq-98_dir-PA_dwi.nii.gz

  • sub-02_acq-99_dir-AP_dwi.nii.gz

  • sub-02_acq-99_dir-PA_dwi.nii.gz

The zip_list produced by generate_inputs() is the set of entities that when zipped together, e.g. with expand(path, zip, **zip_list), produces the entity combinations that refer to each scan:

{
    'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
    'acq': ['98','98','98','98','99','99','99','99'],
    'subject': ['01','01','02','02','01','01','02','02']
}

The filter_list() function produces a subset of the entity combinations as a filtered zip list. This is used e.g. to get all the scans for a single subject.

This get_filtered_ziplist_index() function performs filter_list() twice:

  1. Using the subj_wildcards (e.g.: 'subject': '{subject}') to get a subject/session-specific zip_list.

  2. To return the indices from that list of the matching wildcards.

In this example, if the wildcards parameter was:

{'dir': 'PA', 'acq': '99', 'subject': '01'}

Then the first (subject/session-specific) filtered list provides this zip list:

{
    'dir': ['AP','PA','AP','PA'],
    'acq': ['98','98','99','99'],
    'subject': ['01','01','01','01']
}

which has 4 combinations, indexed from 0 to 3.

The returned value would then be the index (or indices) that matches the wildcards. In this case, since the wildcards were {'dir': 'PA', 'acq': '99', 'subject':'01'}, the return index is 3.
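The two-step procedure walked through above can be sketched in plain Python as follows. The helper names are hypothetical and this is not the library's implementation. Returning a bare integer when exactly one scan matches, and a list otherwise, is an assumption based on the "index (or indices)" wording.

```python
def _sketch_filter(zip_list, wildcards, return_indices_only=False):
    """Simplified stand-in for filter_list(), for illustration only."""
    keys = [key for key in wildcards if key in zip_list]
    length = len(next(iter(zip_list.values()), []))
    indices = [
        i for i in range(length)
        if all(zip_list[key][i] == wildcards[key] for key in keys)
    ]
    if return_indices_only:
        return indices
    return {key: [values[i] for i in indices] for key, values in zip_list.items()}


def sketch_filtered_ziplist_index(zip_list, wildcards, subj_wildcards):
    # Step 1: restrict the zip list to this subject (and session, if present).
    subj_values = {key: wildcards[key] for key in subj_wildcards}
    subj_zip_list = _sketch_filter(zip_list, subj_values)

    # Step 2: find where the full set of wildcards matches within that subset.
    indices = _sketch_filter(subj_zip_list, wildcards, return_indices_only=True)

    # Assumption: a single match is returned as an int, otherwise as a list.
    return indices[0] if len(indices) == 1 else indices


scans = {
    "dir": ["AP", "PA", "AP", "PA", "AP", "PA", "AP", "PA"],
    "acq": ["98", "98", "98", "98", "99", "99", "99", "99"],
    "subject": ["01", "01", "02", "02", "01", "01", "02", "02"],
}
index = sketch_filtered_ziplist_index(
    scans,
    {"dir": "PA", "acq": "99", "subject": "01"},
    {"subject": "{subject}"},
)
```

Only the keys of subj_wildcards are used in step 1; their values (the wildcard placeholders) play no role in the filtering.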

>>> snakebids.get_filtered_ziplist_index(
...     {
...         'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
...         'acq': ['98','98','98','98','99','99','99','99'],
...         'subject': ['01','01','02','02','01','01','02','02' ]
...     },
...     {'dir': 'PA', 'acq': '99', 'subject': '01'},
...     {'subject': '{subject}' }
... )
3

snakebids.get_wildcard_constraints(image_types)

Return a wildcard_constraints dict for snakemake to use, containing all the wildcards that are in the dynamically grabbed inputs

Parameters:

image_types (dict) –

Returns:

Dict containing wildcard constraints for all wildcards in the inputs, with typical bids naming constraints, i.e. letters and numbers: [a-zA-Z0-9]+.

snakebids.print_boilerplate()

Function to print out boilerplate to add to Snakefile. (not used anywhere yet)

snakebids.write_derivative_json(snakemake, **kwargs)

Snakemake function to read a json file, and write to a new one, adding BIDS derivatives fields for Sources and Parameters.

Parameters:

snakemake (struct Snakemake) – structure passed to snakemake python scripts, containing input, output, params, config … This function requires input.json and output.json to be defined, as it will read and write json files
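The read-modify-write pattern described here might look roughly like the sketch below. It is an illustration under stated assumptions, not the library's code: the exact contents the real function stores under Sources and Parameters are not given in this text, so the values written here are guesses, and a SimpleNamespace stands in for the snakemake object that snakemake normally supplies to scripts.

```python
import json
import tempfile
from pathlib import Path
from types import SimpleNamespace


def sketch_write_derivative_json(snakemake, **kwargs):
    """Copy input.json to output.json, adding Sources and Parameters."""
    with open(snakemake.input.json) as infile:
        sidecar = json.load(infile)

    # Assumption: Sources records the input file, and Parameters merges
    # the rule's params with any extra keyword arguments.
    sidecar["Sources"] = [str(snakemake.input.json)]
    sidecar["Parameters"] = {**snakemake.params, **kwargs}

    with open(snakemake.output.json, "w") as outfile:
        json.dump(sidecar, outfile, indent=4)


# Stand-in for the object snakemake passes to python scripts (hypothetical).
workdir = Path(tempfile.mkdtemp())
(workdir / "in.json").write_text(json.dumps({"RepetitionTime": 2.0}))
fake_snakemake = SimpleNamespace(
    input=SimpleNamespace(json=workdir / "in.json"),
    output=SimpleNamespace(json=workdir / "out.json"),
    params={"smoothing": 4},
)

sketch_write_derivative_json(fake_snakemake, note="demo")
result = json.loads((workdir / "out.json").read_text())
```

The existing fields of the input sidecar are preserved, and the two derivative fields are added alongside them.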

app

Tools to generate a Snakemake-based BIDS app.

class snakebids.app.SnakeBidsApp(snakemake_dir, skip_parse_args: bool = False, parser: argparse.ArgumentParser = ArgumentParser(prog='__main__.py', usage=None, description='Snakebids helps build BIDS Apps with Snakemake', formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error', add_help=True), configfile_path: pathlib.Path = NOTHING, snakefile_path: pathlib.Path = NOTHING, config: typing.Dict[str, typing.Any] = NOTHING, args: typing.Optional[snakebids.cli.SnakebidsArgs] = None)

Snakebids app with config and arguments.

snakemake_dir

Root directory of the snakebids app, containing the config file and workflow files.

Type:

str

parser

Parser including only the arguments specific to this Snakebids app, as specified in the config file. By default, it will use create_parser() from cli.py

Type:

ArgumentParser, optional

configfile_path

Relative path to config file (relative to snakemake_dir). By default, autocalculates based on snakemake_dir

Type:

str, optional

snakefile_path

Absolute path to the input Snakefile. By default, autocalculates based on snakemake_dir

join(snakemake_dir, snakefile_path)

Type:

str, optional

config

Contains all the configuration variables parsed from the config file and generated during the initialization of the SnakeBidsApp.

Type:

dict, optional

args

Arguments to use when running the app. By default, generated using the parser attribute, autopopulated with args from config.py

Type:

SnakebidsArgs, optional

create_descriptor(out_file)

Generate a boutiques descriptor for this Snakebids app.

run_snakemake()

Run snakemake with that config.

Workflow snakefile will read snakebids config, create inputs_config, and read that in.

snakemake_io