Dataset Manipulation

snakebids.filter_list(zip_list, filters, return_indices_only=False, regex_search=False)

Filter zip_list, including only entries with provided entity values.

Parameters:
  • zip_list (ZipListLike) – dict of zip lists, as generated from the config file, to filter

  • filters (Mapping[str, Iterable[str] | str]) – wildcard values used to filter the zip lists

  • return_indices_only (bool) – return the indices of the matching entries instead of the filtered zip list

  • regex_search (bool) – Use regex matching to filter instead of the default equality check.

Return type:

ZipList | list[int]
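
The two return forms can be illustrated with a small pure-Python stand-in. This is a sketch of the default equality-based behaviour described above, not the snakebids implementation: the name sketch_filter is hypothetical, regex matching is left out, and every filter key is assumed to be present in zip_list.

```python
def sketch_filter(zip_list, filters, return_indices_only=False):
    """Hypothetical stand-in for equality-based zip list filtering."""
    # Normalize each filter to a set of allowed values (a lone str is one value).
    allowed = {
        key: {values} if isinstance(values, str) else set(values)
        for key, values in filters.items()
    }
    # All lists in a zip list have the same length; use the first one.
    n_entries = len(next(iter(zip_list.values()), []))
    # Keep the positions where every filtered entity has an allowed value.
    keep = [
        i
        for i in range(n_entries)
        if all(zip_list[key][i] in values for key, values in allowed.items())
    ]
    if return_indices_only:
        return keep
    return {key: [values[i] for i in keep] for key, values in zip_list.items()}


zip_list = {
    "dir": ["AP", "PA", "AP", "PA", "AP", "PA", "AP", "PA"],
    "acq": ["98", "98", "98", "98", "99", "99", "99", "99"],
    "subject": ["01", "01", "02", "02", "01", "01", "02", "02"],
}

filtered = sketch_filter(zip_list, {"subject": "01"})
indices = sketch_filter(zip_list, {"subject": "01"}, return_indices_only=True)
```

Here filtered is a zip list with four entries, while indices holds the positions of those same entries in the original lists.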

Examples

>>> import snakebids

Filtering to get all subject='01' scans:

>>> snakebids.filter_list(
...     {
...         'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
...         'acq': ['98','98','98','98','99','99','99','99'],
...         'subject': ['01','01','02','02','01','01','02','02' ]
...     },
...     {'subject': '01'}
... ) == {
...     'dir': ['AP', 'PA', 'AP', 'PA'],
...     'acq': ['98', '98', '99', '99'],
...     'subject': ['01', '01', '01', '01']
... }
True

Filtering to get all acq='98' scans:

>>> snakebids.filter_list(
...     {
...         'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
...         'acq': ['98','98','98','98','99','99','99','99'],
...         'subject': ['01','01','02','02','01','01','02','02' ]
...     },
...     {'acq': '98'}
... ) == {
...     'dir': ['AP', 'PA', 'AP', 'PA'],
...     'acq': ['98', '98', '98', '98'],
...     'subject': ['01', '01', '02', '02']
... }
True

Filtering to get all dir='AP' scans:

>>> snakebids.filter_list(
...     {
...         'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
...         'acq': ['98','98','98','98','99','99','99','99'],
...         'subject': ['01','01','02','02','01','01','02','02' ]
...     },
...     {'dir': 'AP'}
... ) == {
...     'dir': ['AP', 'AP', 'AP', 'AP'],
...     'acq': ['98', '98', '99', '99'],
...     'subject': ['01', '02', '01', '02']
... }
True

Filtering to get all subject='03' scans (i.e. no matches):

>>> snakebids.filter_list(
...     {
...         'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
...         'acq': ['98','98','98','98','99','99','99','99'],
...         'subject': ['01','01','02','02','01','01','02','02' ]
...     },
...     {'subject': '03'}
... ) == {
...     'dir': [],
...     'acq': [],
...     'subject': []
... }
True

snakebids.get_filtered_ziplist_index(zip_list, wildcards, subj_wildcards)

Return the indices of all entries matching the filter query.

Parameters:
  • zip_list (dict) – lists for scans in a dataset, zipped to get each instance

  • wildcards (dict) – wildcards for the single instance, used to query its index

  • subj_wildcards (dict) – keys of this dictionary are used to pick out the subject/(session) from the wildcards

Return type:

int | list[int]

Examples

>>> import snakebids

In this example, we have a dataset with scans from two subjects, where each subject has dir-AP and dir-PA scans, along with acq-98 and acq-99:

  • sub-01_acq-98_dir-AP_dwi.nii.gz

  • sub-01_acq-98_dir-PA_dwi.nii.gz

  • sub-01_acq-99_dir-AP_dwi.nii.gz

  • sub-01_acq-99_dir-PA_dwi.nii.gz

  • sub-02_acq-98_dir-AP_dwi.nii.gz

  • sub-02_acq-98_dir-PA_dwi.nii.gz

  • sub-02_acq-99_dir-AP_dwi.nii.gz

  • sub-02_acq-99_dir-PA_dwi.nii.gz

The zip_list produced by generate_inputs() is the set of entities that, when zipped together (e.g. with expand(path, zip, **zip_list)), produces the entity combinations that refer to each scan:

{
    'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
    'acq': ['98','98','98','98','99','99','99','99'],
    'subject': ['01','01','02','02','01','01','02','02']
}
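
As a rough illustration of what "zipped together" means here, the lists can be transposed in plain Python so that each position yields one entity combination. This sketch uses the built-in zip() rather than Snakemake's expand(), and the filename template is an assumption based on the file listing above.

```python
zip_list = {
    "dir": ["AP", "PA", "AP", "PA", "AP", "PA", "AP", "PA"],
    "acq": ["98", "98", "98", "98", "99", "99", "99", "99"],
    "subject": ["01", "01", "02", "02", "01", "01", "02", "02"],
}

# Transpose the column-wise lists into one dict of entities per scan.
combos = [dict(zip(zip_list, values)) for values in zip(*zip_list.values())]

# Assumed template, matching the filenames listed in this example.
template = "sub-{subject}_acq-{acq}_dir-{dir}_dwi.nii.gz"
names = [template.format(**combo) for combo in combos]
```

Each of the eight positions maps to exactly one of the scans listed above, although the order follows the zip list rather than the alphabetical file listing.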

The filter_list() function produces a subset of the entity combinations as a filtered zip list. This is used e.g. to get all the scans for a single subject.

The get_filtered_ziplist_index() function calls filter_list() twice:

  1. With the subj_wildcards (e.g. 'subject': '{subject}'), to get a subject/session-specific zip_list.

  2. With the full wildcards, to return the indices within that list of the matching entries.

In this example, if the wildcards parameter was:

{'dir': 'PA', 'acq': '99', 'subject': '01'}

Then the first (subject/session-specific) filtered list provides this zip list:

{
    'dir': ['AP','PA','AP','PA'],
    'acq': ['98','98','99','99'],
    'subject': ['01','01','01','01']
}

which has 4 combinations, indexed from 0 to 3.

The returned value is then the index (or indices) matching the wildcards. In this case, since the wildcards were {'dir': 'PA', 'acq': '99', 'subject': '01'}, the returned index is 3.

>>> snakebids.get_filtered_ziplist_index(
...     {
...         'dir': ['AP','PA','AP','PA', 'AP','PA','AP','PA'],
...         'acq': ['98','98','98','98','99','99','99','99'],
...         'subject': ['01','01','02','02','01','01','02','02' ]
...     },
...     {'dir': 'PA', 'acq': '99', 'subject': '01'},
...     {'subject': '{subject}' }
... )
3
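
The two-step lookup described above can be sketched in plain Python. This is an illustrative stand-in under stated assumptions, not the library's code: the name sketch_filtered_index is hypothetical, matching is by simple equality, and a single match is assumed to be returned as a bare int (in line with the int | list[int] return type).

```python
def sketch_filtered_index(zip_list, wildcards, subj_wildcards):
    """Hypothetical two-step lookup: subject filter, then index search."""
    # Step 1: the keys of subj_wildcards select the subject/session
    # values from the wildcards; keep only entries that match them.
    subj_filter = {key: wildcards[key] for key in subj_wildcards}
    n_entries = len(next(iter(zip_list.values()), []))
    subj_rows = [
        i
        for i in range(n_entries)
        if all(zip_list[key][i] == value for key, value in subj_filter.items())
    ]
    subj_zip_list = {
        key: [values[i] for i in subj_rows] for key, values in zip_list.items()
    }

    # Step 2: within the subject-specific list, find the positions
    # where every wildcard matches.
    indices = [
        i
        for i in range(len(subj_rows))
        if all(
            subj_zip_list[key][i] == value
            for key, value in wildcards.items()
            if key in subj_zip_list
        )
    ]
    # Assumption: a unique match is returned as an int, otherwise a list.
    return indices[0] if len(indices) == 1 else indices


zip_list = {
    "dir": ["AP", "PA", "AP", "PA", "AP", "PA", "AP", "PA"],
    "acq": ["98", "98", "98", "98", "99", "99", "99", "99"],
    "subject": ["01", "01", "02", "02", "01", "01", "02", "02"],
}

single = sketch_filtered_index(
    zip_list,
    {"dir": "PA", "acq": "99", "subject": "01"},
    {"subject": "{subject}"},
)
several = sketch_filtered_index(
    zip_list,
    {"acq": "99", "subject": "02"},
    {"subject": "{subject}"},
)
```

The indices are relative to the subject-specific list, not the full zip list, so a partial set of wildcards such as the second call yields several positions within that subject's entries.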