Data Structures

class snakebids.BidsComponent(*, name, path, zip_lists)

Representation of a bids data component.

A component is a set of data entries all corresponding to the same type of object. Entries vary over a set of entities. For example, a component may represent all the unprocessed, T1-weighted anatomical images acquired from a group of 100 subjects, across 2 sessions, with three runs per session. Here, the subject, session, and run are the entities over which the component varies. Each entry in the component has a single value assigned for each of the three entities (e.g. subject 002, session 01, run 1).

Each entry can be defined solely by its wildcard values. The complete collection of entries can thus be stored as a table, where each row represents an entity and each column represents an entry.

BidsComponent stores and indexes this table. It uses ‘row-first’ indexing, meaning first an entity is selected, then an entry. It also has a number of properties and methods making it easier to incorporate the data in a snakemake workflow.
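The table layout and row-first indexing described above can be pictured with a plain dict of lists. This is only a sketch: the dict stands in for the component, and the variable names and values are illustrative assumptions rather than snakebids output.

```python
# Plain-dict stand-in for the table a component stores (not the snakebids class).
# Each key is an entity (a row); position i in every list is one entry (a column).
table = {
    "subject": ["001", "001", "002", "002"],
    "session": ["01", "02", "01", "02"],
    "run": ["1", "1", "1", "1"],
}

# Row-first indexing: select the entity first, then the entry.
session_of_second_entry = table["session"][1]

# Every row holds one value per entry, so all rows share the same length.
n_entries = len(table["subject"])
```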

In addition, BidsComponent stores a template path derived from the source dataset. This path is used by the expand() method to recreate the original filesystem paths.

The real power of the BidsComponent, however, is in creating derived paths based on the original dataset. Using the expand() method, you can pass new paths with {wildcard} placeholders wrapped in braces and named according to the entities in the component. These placeholders will be substituted with the entity values saved in the table, giving you a list of paths the same length as the number of entries in the component.
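As a rough illustration of this substitution, the sketch below fills a new path template once per entry of a table. The helper expand_over_table is hypothetical and uses plain str.format; it is not how snakebids implements expand(), and the output path is an invented example.

```python
def expand_over_table(template: str, table: dict[str, list[str]]) -> list[str]:
    """Fill `template` once per column of `table` (hypothetical helper)."""
    entities = list(table)
    # zip(*rows) walks the table column by column, i.e. entry by entry.
    columns = zip(*(table[entity] for entity in entities))
    return [template.format(**dict(zip(entities, column))) for column in columns]


table = {
    "subject": ["001", "001", "002"],
    "session": ["01", "02", "01"],
}
paths = expand_over_table(
    "results/sub-{subject}/ses-{session}/sub-{subject}_ses-{session}_mask.nii.gz",
    table,
)
```

The result has one path per entry, so three entries in the table give three derived paths.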

BidsComponents are immutable: their values cannot be altered.

Parameters:
  • name (str) –

  • path (str) –

name: str

Name of the component

path: str

Wildcard-filled path that matches the files for this component.

expand(paths=None, /, allow_missing=False, **wildcards)

Safely expand over given paths with component wildcards.

Uses the entity-value combinations found in the dataset to expand over the given paths. If no path is provided, expands over the component path (thus returning the original files used to create the component). Extra wildcards can be specified as keyword arguments.

By default, expansion over paths with extra wildcards not accounted for by the component causes an error. This prevents accidental partial expansion. To allow the passage of extra wildcards without expansion, set allow_missing to True.

Uses the snakemake expand under the hood.

Parameters:
  • paths (Iterable[Path | str] | Path | str | None) – Path or list of paths to expand over. If not provided, the component’s own path will be expanded over.

  • allow_missing (bool | str | Iterable[str]) – If True, allow {wildcards} in the provided paths that are not present either in the component or in the extra provided **wildcards. These wildcards will be preserved in the returned paths.

  • wildcards (str | Iterable[str]) – Each keyword should be the name of a wildcard in the provided paths. Keywords not found in the path will be ignored. Keywords take values or lists of values to be expanded over the provided paths.

Return type:

list[str]
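The interaction between extra wildcards and allow_missing can be sketched in plain Python. The function expand_sketch below is a hypothetical stand-in, not the snakebids or snakemake implementation: it handles only a single template and a boolean allow_missing, and the order of the returned paths is an artifact of this sketch.

```python
from itertools import product
from string import Formatter


def expand_sketch(template, table, allow_missing=False, **extra):
    """Hypothetical illustration of safe expansion over one template."""
    # Collect the placeholder names that appear in the template.
    fields = {name for _, name, _, _ in Formatter().parse(template) if name}
    unknown = fields - set(table) - set(extra)
    if unknown and not allow_missing:
        raise KeyError(f"wildcards not accounted for: {sorted(unknown)}")
    # Ignore keywords absent from the template; wrap single values in a list.
    extra = {
        key: [value] if isinstance(value, str) else list(value)
        for key, value in extra.items()
        if key in fields
    }
    columns = [dict(zip(table, column)) for column in zip(*table.values())]
    combos = [dict(zip(extra, values)) for values in product(*extra.values())]
    # Unknown placeholders are written back unchanged when allow_missing is True.
    keep = {name: "{" + name + "}" for name in unknown}
    return [
        template.format(**{**column, **combo, **keep})
        for column in columns
        for combo in combos
    ]


table = {"subject": ["001", "002"]}
with_extra = expand_sketch("sub-{subject}_desc-{desc}.txt", table, desc=["a", "b"])
preserved = expand_sketch("sub-{subject}_{suffix}.txt", table, allow_missing=True)
```

Calling expand_sketch("sub-{subject}_{suffix}.txt", table) without allow_missing raises KeyError in this sketch, mirroring the protection against accidental partial expansion.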

property entities: MultiSelectDict[str, list[str]]

Component entities and their associated values.

Dictionary where each key is an entity and each value is a list of the unique values found for that entity. These lists might not be the same length.
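To see why these lists can differ in length, the sketch below derives unique values from a plain-dict table. The de-duplication shown (first-seen order via dict.fromkeys) is an assumption of this example; the ordering snakebids uses is not stated here.

```python
table = {
    "subject": ["001", "001", "002", "003"],
    "session": ["01", "02", "01", "01"],
}

# Collapse each row to its unique values, keeping first-seen order.
entities = {entity: list(dict.fromkeys(values)) for entity, values in table.items()}
```

Here the subject list has three values while the session list has two, even though both rows of the table have four entries.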

filter(*, regex_search=False, **filters)

Filter component based on provided entity filters.

This method allows you to expand over a subset of your wildcards. This could be useful for extracting subjects from a specific patient group, running different rules on different acquisitions, and any other reason you may need to filter your data after the workflow has already started.

Takes entities as keyword arguments assigned to a value or list of values to select from the component. Only columns containing the provided entity-values are kept. If no matches are found, a component with all the original entities but with no values will be returned.

Returns a brand new BidsComponent. The original component is not modified.

Parameters:
  • regex_search (bool | str | Iterable[str]) – Treat filters as regex patterns when matching with entity-values.

  • filters (str | Iterable[str]) – Each keyword should be the name of an entity in the component. Entities not found in the component will be ignored. Keywords take values or a list of values to be matched with the component zip_lists.

Return type:

Self
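The column-wise selection described for filter() can be sketched on a plain-dict table. The helper filter_table is hypothetical: it accepts only a boolean regex_search, and its use of re.search for pattern matching is an assumption suggested by the parameter name rather than a statement about the library internals.

```python
import re


def filter_table(table, regex_search=False, **filters):
    """Keep only the columns of `table` whose values satisfy every filter."""
    # Ignore filters for entities absent from the table; wrap single values.
    active = {
        entity: [allowed] if isinstance(allowed, str) else list(allowed)
        for entity, allowed in filters.items()
        if entity in table
    }

    def matches(value, allowed):
        if regex_search:
            return any(re.search(pattern, value) for pattern in allowed)
        return value in allowed

    n_entries = len(next(iter(table.values()), []))
    keep = [
        i
        for i in range(n_entries)
        if all(matches(table[entity][i], allowed) for entity, allowed in active.items())
    ]
    # Build a new table; the input is left untouched.
    return {entity: [values[i] for i in keep] for entity, values in table.items()}


table = {
    "subject": ["001", "001", "002", "010"],
    "session": ["01", "02", "01", "01"],
}
first_session = filter_table(table, session="01")
regex_hits = filter_table(table, regex_search=True, subject="^00")
no_match = filter_table(table, subject="999")
```

When nothing matches, the sketch returns every original entity with an empty list, which corresponds to the "all the original entities but with no values" case above.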

pformat(max_width=None, tabstop=4)

Pretty-format component.

Parameters:
  • max_width (int | float | None) – Maximum width of characters for output. If possible, the zip_list table will be elided to fit within this width.

  • tabstop (int) – Number of spaces for output indentation

Return type:

str

property wildcards: MultiSelectDict[str, str]

Wildcards in brace-wrapped syntax.

Dictionary where each key is the name of a wildcard entity, and each value is the Snakemake wildcard used for that entity.
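The shape of this mapping can be sketched as follows. The entity names are illustrative assumptions, and the output path is an invented example of reusing the brace-wrapped values when assembling a template.

```python
entities = ["subject", "session"]

# Map each entity name to its brace-wrapped wildcard.
wildcards = {entity: "{" + entity + "}" for entity in entities}

# The values can be concatenated into a path template for later expansion.
template = "sub-" + wildcards["subject"] + "_ses-" + wildcards["session"] + "_mask.nii.gz"
```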

property zip_lists

Table of unique wildcard groupings for each member in the component.

Dictionary where each key is a wildcard entity and each value is a list of the values found for that entity. Each of these lists has length equal to the number of images matched for this modality, so they can be zipped together to get a list of the wildcard values for each file.
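Zipping the lists together, as described above, might look like the following. The zip_lists value here is a hand-written stand-in with invented entries, not something read from a dataset.

```python
zip_lists = {
    "subject": ["001", "001", "002"],
    "session": ["01", "02", "01"],
}

# Zip the rows together: each resulting dict holds the wildcard values of one file.
per_file = [dict(zip(zip_lists, column)) for column in zip(*zip_lists.values())]
```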

Legacy BidsComponent properties

The following properties are historical aliases of BidsComponent properties. There are no current plans to deprecate them, but new code should avoid them.

property BidsComponent.input_zip_lists: snakebids.types.ZipList

Alias of zip_lists.

Dictionary where each key is a wildcard entity and each value is a list of the values found for that entity. Each of these lists has length equal to the number of images matched for this modality, so they can be zipped together to get a list of the wildcard values for each file.

property BidsComponent.input_wildcards

Alias of wildcards.

Wildcards in brace-wrapped syntax.

property BidsComponent.input_name: str

Alias of name.

Name of the component

property BidsComponent.input_path: str

Alias of path.

Wildcard-filled path that matches the files for this component.

property BidsComponent.input_lists

Alias of entities.

Component entities and their associated values.

class snakebids.BidsPartialComponent(*, zip_lists)

Primitive representation of a bids data component.

See BidsComponent for an extended definition of a data component.

BidsPartialComponents are typically derived from a BidsComponent. They do not store path information, and do not represent real data files. They just have a table of entity-values, typically a subset of those present in their source BidsComponent.

Despite this, BidsPartialComponents still allow you to expand the data table over new paths, allowing you to derive paths from your source dataset.

The members of BidsPartialComponent are identical to BidsComponent with the following exceptions:

  • No name or path

  • expand() must be given a path or list of paths as the first argument

BidsPartialComponents are immutable: their values cannot be altered.

class snakebids.BidsComponentRow(iterable, /, entity)

A single row from a BidsComponent.

This class is derived by indexing a single entity from a BidsComponent or BidsPartialComponent. It should not be constructed manually.

The class is a subclass of ImmutableList and can thus be treated as a tuple. Indexing it via row[<int>] gives the entity-value of the selected entry.

The entities and wildcards properties directly return the list of unique entity-values or the {brace-wrapped-entity} name corresponding to the row, rather than a dict.

The expand() and filter() methods behave as they would in a BidsComponent with a single entity.
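The tuple-like behaviour described above can be imitated with a small tuple subclass. RowSketch is a hypothetical stand-in written for this illustration; it is not the snakebids class, it omits expand() and filter(), and its first-seen ordering of unique values is an assumption.

```python
class RowSketch(tuple):
    """Hypothetical stand-in for a single-entity row of a component."""

    def __new__(cls, values, entity):
        self = super().__new__(cls, values)
        self.entity = entity
        return self

    @property
    def entities(self):
        # Unique values of the row, returned directly rather than in a dict.
        return tuple(dict.fromkeys(self))

    @property
    def wildcards(self):
        # The entity name wrapped in wildcard braces.
        return "{" + self.entity + "}"


row = RowSketch(["001", "001", "002"], entity="subject")
second_value = row[1]
```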

Parameters:
  • iterable (Iterable[str]) –

  • entity (str) –

property entities: tuple[str, ...]

The unique values associated with the component.

property wildcards: str

The entity name wrapped in wildcard braces.

expand(paths, /, allow_missing=False, **wildcards)

Safely expand over given paths with component wildcards.

Uses the entity-values represented by this row to expand over the given paths. Extra wildcards can be specified as keyword arguments.

By default, expansion over paths with extra wildcards not accounted for by the component causes an error. This prevents accidental partial expansion. To allow the passage of extra wildcards without expansion, set allow_missing to True.

Uses the snakemake expand under the hood.

Parameters:
  • paths (Iterable[Path | str] | Path | str) – Path or list of paths to expand over

  • allow_missing (bool | str | Iterable[str]) – If True, allow {wildcards} in the provided paths that are not present either in the component or in the extra provided **wildcards. These wildcards will be preserved in the returned paths.

  • wildcards (str | Iterable[str]) – Each keyword should be the name of a wildcard in the provided paths. Keywords not found in the path will be ignored. Keywords take values or lists of values to be expanded over the provided paths.

Return type:

list[str]

filter(spec=None, /, *, regex_search=False, **filters)

Filter component based on provided entity filters.

Extracts a subset of the entity-values present in the row.

Takes entities as keyword arguments assigned to a value or list of values to select from the component. Only columns containing the provided entity-values are kept. If no matches are found, a component with all the original entities but with no values will be returned.

Returns a brand new BidsComponentRow. The original component is not modified.

Parameters:
  • spec (Iterable[str] | str | None) – Value or iterable of values associated with the ComponentRow’s entity. Equivalent to specifying .filter(entity=value).

  • regex_search (bool | str | Iterable[str]) – Treat filters as regex patterns when matching with entity-values.

  • filters (str | Iterable[str]) – Keyword-value filters, as in BidsComponent.filter(). Here, the only valid filter is the entity of the BidsComponentRow; all others will be ignored.

Return type:

Self

class snakebids.BidsDataset(data, layout=None)

A bids dataset parsed by pybids, organized into BidsComponents.

BidsDatasets are typically generated using generate_inputs(), which reads the pybids_inputs field in your snakemake config file and, for each entry, creates a BidsComponent using the provided name, wildcards, and filters.

Individual components can be accessed using bracket syntax (e.g. inputs["t1w"]).

Provides access to summarizing information, for instance, the set of all subjects or sessions found in the dataset.
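Conceptually, the dataset behaves like a mapping from component names to components, with summaries computed across all of them. The sketch below uses nested plain dicts as a stand-in; the component names, the values, and the choice to return a sorted union of subjects are assumptions of this example, not a description of the library internals.

```python
# Plain-dict stand-in: component name -> table of entity values.
components = {
    "t1w": {"subject": ["001", "002"], "session": ["01", "01"]},
    "bold": {"subject": ["001", "003"], "session": ["01", "02"]},
}

# Bracket syntax selects one component by name.
t1w = components["t1w"]

# A dataset-wide summary: every subject seen in any component.
subjects = sorted(
    {subject for table in components.values() for subject in table.get("subject", [])}
)
```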

layout: BIDSLayout | None

Underlying layout generated from pybids. Note that this will be set to None if custom paths are used to generate every component.

pformat(max_width=None, tabstop=4)

Pretty-format dataset.

Parameters:
  • max_width (int | float | None) – Maximum width of characters for output. If possible, the zip_list table will be elided to fit within this width.

  • tabstop (int) – Number of spaces for output indentation

Return type:

str

property path: dict[str, str]

Dict mapping BidsComponent names to their paths.

Warning

Deprecated since version 0.8.0: The behaviour of path will change in an upcoming release, where it will refer instead to the root path of the dataset. Please access component paths using Dataset[<component_name>].path

property zip_lists: dict[str, snakebids.types.ZipList]

Dict mapping BidsComponent names to their zip_lists.

Warning

Deprecated since version 0.8.0: The behaviour of zip_lists will change in an upcoming release, where it will refer instead to the consensus of entity groups across all components in the dataset. Please access component zip_lists using Dataset[<component_name>].zip_lists

property entities: dict[str, snakebids.utils.containers.MultiSelectDict[str, list[str]]]

Dict mapping BidsComponent names to their entities.

Warning

Deprecated since version 0.8.0: The behaviour of entities will change in the 1.0 release, where it will refer instead to the union of all entity-values across all components in the dataset. Please access component entity lists using Dataset[<component_name>].entities

property wildcards: dict[str, snakebids.utils.containers.MultiSelectDict[str, str]]

Dict mapping BidsComponent names to their wildcards.

Warning

Deprecated since version 0.8.0: The behaviour of wildcards will change in an upcoming release, where it will refer instead to the union of all entity-wildcard mappings across all components in the dataset. Please access component wildcards using Dataset[<component_name>].wildcards

property subjects: list[str]

A list of the subjects in the dataset.

property sessions: list[str]

A list of the sessions in the dataset.

property subj_wildcards: dict[str, str]

The subject and session wildcards applicable to this dataset.

{"subject":"{subject}"} if there is only one session, {"subject": "{subject}", "session": "{session}"} if there are multiple sessions.

property as_dict: BidsDatasetDict

Get the layout as a legacy dict.

Included primarily for backward compatibility with older versions of snakebids, where generate_inputs() returned a dict rather than the BidsDataset class.

Return type:

BidsDatasetDict

classmethod from_iterable(iterable, layout=None)

Construct Dataset from iterable of BidsComponents.

Return type:

BidsDataset

Legacy BidsDataset properties

The following properties are historical aliases of BidsDataset properties. There are no current plans to deprecate them, but new code should avoid them.

property BidsDataset.input_zip_lists: dict[str, snakebids.utils.containers.MultiSelectDict[str, list[str]]]

Alias of zip_lists.

Dict mapping BidsComponent names to their zip_lists.

property BidsDataset.input_wildcards: dict[str, snakebids.utils.containers.MultiSelectDict[str, str]]

Alias of wildcards.

Dict mapping BidsComponent names to their wildcards.

property BidsDataset.input_path: dict[str, str]

Alias of path.

Dict mapping BidsComponent names to their paths.

property BidsDataset.input_lists: dict[str, snakebids.utils.containers.MultiSelectDict[str, list[str]]]

Alias of entities.

Dict mapping BidsComponent names to their entities.

class snakebids.BidsDatasetDict

Dict equivalent of BidsDataset, for backwards compatibility.