Data Structures
- class snakebids.BidsComponent(*, name, path, zip_lists)
Representation of a bids data component.
A component is a set of data entries all corresponding to the same type of object. Entries vary over a set of entities. For example, a component may represent all the unprocessed, T1-weighted anatomical images acquired from a group of 100 subjects, across 2 sessions, with three runs per session. Here, the subject, session, and run are the entities over which the component varies. Each entry in the component has a single value assigned for each of the three entities (e.g. subject 002, session 01, run 1).
Each entry can be defined solely by its wildcard values. The complete collection of entries can thus be stored as a table, where each row represents an entity and each column represents an entry.
BidsComponent stores and indexes this table. It uses ‘row-first’ indexing, meaning first an entity is selected, then an entry. It also has a number of properties and methods making it easier to incorporate the data in a snakemake workflow.
In addition, BidsComponent stores a template path (BidsComponent.path) derived from the source dataset. This path is used by the expand() method to recreate the original filesystem paths.
The real power of the BidsComponent, however, is in creating derived paths based on the original dataset. Using the expand() method, you can pass new paths with {wildcard} placeholders wrapped in braces and named according to the entities in the component. These placeholders will be substituted with the entity values saved in the table, giving you a list of paths the same length as the number of entries in the component.
BidsComponents are immutable: their values cannot be altered.
- expand(paths=None, /, allow_missing=False, **wildcards)
Safely expand over given paths with component wildcards.
Uses the entity-value combinations found in the dataset to expand over the given paths. If no path is provided, expands over the component path (thus returning the original files used to create the component). Extra wildcards can be specified as keyword arguments.
By default, expansion over paths with extra wildcards not accounted for by the component causes an error. This prevents accidental partial expansion. To allow the passage of extra wildcards without expansion, set allow_missing to True.
Uses the snakemake expand under the hood.
- Parameters:
paths (Iterable[Path | str] | Path | str | None) – Path or list of paths to expand over. If not provided, the component’s own path will be expanded over.
allow_missing (bool | str | Iterable[str]) – If True, allow {wildcards} in the provided paths that are not present either in the component or in the extra provided **wildcards. These wildcards will be preserved in the returned paths.
wildcards (str | Iterable[str]) – Each keyword should be the name of a wildcard in the provided paths. Keywords not found in the path will be ignored. Keywords take values or lists of values to be expanded over the provided paths.
- Return type:
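The interaction between component entities, extra keyword wildcards, and allow_missing can be modelled with a small stand-in function. `expand_sketch` is a hypothetical name, the code does not call snakebids or snakemake, it only handles the boolean form of allow_missing, and the order of the returned paths is an assumption.

```python
from itertools import product
from string import Formatter


def expand_sketch(path, zip_lists, allow_missing=False, **wildcards):
    """Toy model of the documented behaviour; not the snakebids source."""
    # Collect every {placeholder} name that appears in the path.
    fields = {name for _, name, _, _ in Formatter().parse(path) if name}
    unknown = fields - set(zip_lists) - set(wildcards)
    if unknown and not allow_missing:
        raise KeyError(f"wildcards not accounted for: {sorted(unknown)}")

    # Keywords not found in the path are ignored; single strings become lists.
    extras = {
        key: [val] if isinstance(val, str) else list(val)
        for key, val in wildcards.items()
        if key in fields
    }

    n_entries = len(next(iter(zip_lists.values()), []))
    results = []
    for i in range(n_entries):
        entry = {entity: values[i] for entity, values in zip_lists.items()}
        # Every entry is combined with every combination of extra wildcards.
        for combo in product(*extras.values()):
            mapping = {**entry, **dict(zip(extras, combo))}
            # Unaccounted placeholders are kept verbatim, e.g. "{desc}".
            mapping.update({name: "{" + name + "}" for name in unknown})
            results.append(path.format(**mapping))
    return results


zip_lists = {"subject": ["001", "002"]}
partial = expand_sketch(
    "sub-{subject}_desc-{desc}.nii.gz", zip_lists, allow_missing=True
)
full = expand_sketch(
    "sub-{subject}_{suffix}.nii.gz", zip_lists, suffix=["T1w", "T2w"]
)
```

Without `allow_missing=True`, the first call would raise, because `{desc}` is neither an entity of the table nor an extra keyword.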
- property entities: MultiSelectDict[str, list[str]]
Component entities and their associated values.
Dictionary where each key is an entity and each value is a list of the unique values found for that entity. These lists might not be the same length.
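As a rough illustration of how unique entity values relate to the underlying table, the following stdlib-only sketch deduplicates each row. The data is invented, and keeping values in first-seen order is an assumption, not a documented guarantee.

```python
# Hypothetical table with repeated values across entries.
zip_lists = {
    "subject": ["001", "001", "002"],
    "session": ["01", "02", "01"],
    "run": ["1", "1", "1"],
}

# dict.fromkeys drops duplicates while keeping first-seen order.
entities = {
    entity: list(dict.fromkeys(values)) for entity, values in zip_lists.items()
}

# The unique-value lists need not share a length.
lengths = {entity: len(values) for entity, values in entities.items()}
```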
- filter(*, regex_search=False, **filters)
Filter component based on provided entity filters.
This method allows you to expand over a subset of your wildcards. This could be useful for extracting subjects from a specific patient group, running different rules on different acquisitions, and any other reason you may need to filter your data after the workflow has already started.
Takes entities as keyword arguments assigned to values or lists of values to select from the component. Only columns containing the provided entity-values are kept. If no matches are found, a component with all the original entities but with no values will be returned.
Returns a brand new BidsComponent. The original component is not modified.
- Parameters:
regex_search (bool | str | Iterable[str]) – Treat filters as regex patterns when matching with entity-values.
filters (str | Iterable[str]) – Each keyword should be the name of an entity in the component. Entities not found in the component will be ignored. Keywords take values or a list of values to be matched with the component zip_lists.
- Return type:
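The column-wise filtering described for filter() can be approximated as follows. `filter_sketch` is a hypothetical helper operating on a plain dict; it models only the boolean form of regex_search, and the use of `re.search` (rather than a full match) is an assumption.

```python
import re


def filter_sketch(zip_lists, regex_search=False, **filters):
    """Toy column filter over a zip_lists-style table; not the snakebids source."""
    # Entities not present in the table are ignored, as documented.
    active = {
        entity: [allowed] if isinstance(allowed, str) else list(allowed)
        for entity, allowed in filters.items()
        if entity in zip_lists
    }

    def matches(value, allowed):
        if regex_search:
            return any(re.search(pattern, value) for pattern in allowed)
        return value in allowed

    n_entries = len(next(iter(zip_lists.values()), []))
    keep = [
        i
        for i in range(n_entries)
        if all(
            matches(zip_lists[entity][i], allowed)
            for entity, allowed in active.items()
        )
    ]
    # A new table is returned; every entity is retained even if no column survives.
    return {
        entity: [values[i] for i in keep] for entity, values in zip_lists.items()
    }


table = {"subject": ["001", "002", "003"], "acq": ["mprage", "mprage", "flash"]}
mprage_only = filter_sketch(table, acq="mprage")
by_regex = filter_sketch(table, subject="00[23]", regex_search=True)
no_match = filter_sketch(table, subject="999")
```

Because the helper builds a new dict, `table` itself is left untouched, mirroring the statement that the original component is not modified.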
- pformat(max_width=None, tabstop=4)
Pretty-format component.
- property wildcards: MultiSelectDict[str, str]
Wildcards in brace-wrapped syntax.
Dictionary where each key is the name of a wildcard entity, and each value is the Snakemake wildcard used for that entity.
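A minimal sketch of what such a mapping looks like and how it might be used to assemble a path template; the entity names and filename pattern here are invented for illustration.

```python
# Hypothetical entities of a component.
entity_names = ["subject", "session"]

# Each entity maps to its brace-wrapped wildcard.
wildcards = {entity: "{" + entity + "}" for entity in entity_names}

# The wildcards can be spliced into a template without hand-writing the braces.
template = f"sub-{wildcards['subject']}_ses-{wildcards['session']}_T1w.nii.gz"
```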
- property zip_lists
Table of unique wildcard groupings for each member in the component.
Dictionary where each key is a wildcard entity and each value is a list of the values found for that entity. Each of these lists has length equal to the number of images matched for this modality, so they can be zipped together to get a list of the wildcard values for each file.
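Zipping the lists together, as described above, can be done with the built-in `zip`. The table below is made-up data standing in for a real component.

```python
# Hypothetical zip_lists: all lists have the same length (one slot per file).
zip_lists = {
    "subject": ["001", "001", "002"],
    "session": ["01", "02", "01"],
}

# zip(*values) walks the table column by column; each column describes one file.
per_file = [dict(zip(zip_lists, combo)) for combo in zip(*zip_lists.values())]
```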
Legacy BidsComponents properties
The following properties are historical aliases of BidsComponents properties. There are no current plans to deprecate them, but new code should avoid them.
- property BidsComponent.input_zip_lists: snakebids.types.ZipList
Alias of zip_lists.
Dictionary where each key is a wildcard entity and each value is a list of the values found for that entity. Each of these lists has length equal to the number of images matched for this modality, so they can be zipped together to get a list of the wildcard values for each file.
- class snakebids.BidsPartialComponent(*, zip_lists)
Primitive representation of a bids data component.
See BidsComponent for an extended definition of a data component.
BidsPartialComponents are typically derived from a BidsComponent. They do not store path information, and do not represent real data files. They just have a table of entity-values, typically a subset of those present in their source BidsComponent.
Despite this, BidsPartialComponents still allow you to expand the data table over new paths, allowing you to derive paths from your source dataset.
The members of BidsPartialComponent are identical to BidsComponent with the following exceptions:
BidsPartialComponents are immutable: their values cannot be altered.
- class snakebids.BidsComponentRow(iterable, /, entity)
A single row from a BidsComponent.
This class is derived by indexing a single entity from a BidsComponent or BidsPartialComponent. It should not be constructed manually.
The class is a subclass of ImmutableList and can thus be treated as a tuple. Indexing it via row[<int>] gives the entity-value of the selected entry.
The entities and wildcards directly return the list of unique entity-values or the {brace-wrapped-entity} name corresponding to the row, rather than a dict.
The expand() and filter() methods behave as they would in a BidsComponent with a single entity.
- expand(paths, /, allow_missing=False, **wildcards)
Safely expand over given paths with component wildcards.
Uses the entity-values represented by this row to expand over the given paths. Extra wildcards can be specified as keyword arguments.
By default, expansion over paths with extra wildcards not accounted for by the component causes an error. This prevents accidental partial expansion. To allow the passage of extra wildcards without expansion, set allow_missing to True.
Uses the snakemake expand under the hood.
- Parameters:
paths (Iterable[Path | str] | Path | str) – Path or list of paths to expand over.
allow_missing (bool | str | Iterable[str]) – If True, allow {wildcards} in the provided paths that are not present either in the component or in the extra provided **wildcards. These wildcards will be preserved in the returned paths.
wildcards (str | Iterable[str]) – Each keyword should be the name of a wildcard in the provided paths. Keywords not found in the path will be ignored. Keywords take values or lists of values to be expanded over the provided paths.
- Return type:
- filter(spec=None, /, *, regex_search=False, **filters)
Filter component based on provided entity filters.
Extracts a subset of the entity-values present in the row.
Takes entities as keyword arguments assigned to values or lists of values to select from the component. Only columns containing the provided entity-values are kept. If no matches are found, a component with all the original entities but with no values will be returned.
Returns a brand new BidsComponentRow. The original component is not modified.
- Parameters:
spec (Iterable[str] | str | None) – Value or iterable of values associated with the ComponentRow’s entity. Equivalent to specifying .filter(entity=value).
regex_search (bool | str | Iterable[str]) – Treat filters as regex patterns when matching with entity-values.
filters (str | Iterable[str]) – Keyword-value(s) filters as in filter(). Here, the only valid filter is the entity of the BidsComponentRow; all others will be ignored.
- Return type:
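The row semantics described for BidsComponentRow (tuple-like indexing, unique values, a single wildcard, and filtering by a positional spec) can be mimicked with a plain tuple. This is a stand-in for illustration only; the values are invented and the first-seen ordering of unique values is an assumption.

```python
entity = "subject"            # hypothetical entity this row belongs to
row = ("001", "001", "002")   # one value per entry, tuple-like

# Integer indexing selects the entity-value of one entry.
second_entry_value = row[1]

# A row reports a flat list of unique values and a single wildcard string,
# rather than dicts keyed by entity.
unique_values = list(dict.fromkeys(row))
wildcard = "{" + entity + "}"

# A positional spec plays the role of .filter(subject=...) for this row.
spec = ["002"]
filtered = tuple(value for value in row if value in spec)
```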
- class snakebids.BidsDataset(data, layout=None)
A bids dataset parsed by pybids, organized into BidsComponents.
BidsDatasets are typically generated using generate_inputs(), which reads the pybids_inputs field in your snakemake config file and, for each entry, creates a BidsComponent using the provided name, wildcards, and filters.
Individual components can be accessed using bracket syntax (e.g. inputs["t1w"]).
Provides access to summarizing information, for instance, the set of all subjects or sessions found in the dataset.
- Parameters:
data (Any) –
layout (BIDSLayout | None) –
- layout: BIDSLayout | None
Underlying layout generated from pybids. Note that this will be set to None if custom paths are used to generate every component.
- pformat(max_width=None, tabstop=4)
Pretty-format dataset.
- property path: dict[str, str]
Dict mapping BidsComponent names to their paths.
Warning
Deprecated since version 0.8.0: The behaviour of path will change in an upcoming release, where it will refer instead to the root path of the dataset. Please access component paths using Dataset[<component_name>].path
- property zip_lists: dict[str, snakebids.types.ZipList]
Dict mapping BidsComponent names to their zip_lists.
Warning
Deprecated since version 0.8.0: The behaviour of zip_lists will change in an upcoming release, where it will refer instead to the consensus of entity groups across all components in the dataset. Please access component zip_lists using Dataset[<component_name>].zip_lists
- property entities: dict[str, snakebids.utils.containers.MultiSelectDict[str, list[str]]]
Dict mapping BidsComponent names to their entities.
Warning
Deprecated since version 0.8.0: The behaviour of entities will change in the 1.0 release, where it will refer instead to the union of all entity-values across all components in the dataset. Please access component entity lists using Dataset[<component_name>].entities
- property wildcards: dict[str, snakebids.utils.containers.MultiSelectDict[str, str]]
Dict mapping BidsComponent names to their wildcards.
Warning
Deprecated since version 0.8.0: The behaviour of wildcards will change in an upcoming release, where it will refer instead to the union of all entity-wildcard mappings across all components in the dataset. Please access component wildcards using Dataset[<component_name>].wildcards
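The difference between the deprecated dataset-level mappings and the recommended per-component access can be sketched with a plain dict of stand-in objects. `ComponentSketch` is a hypothetical class invented for this example, and the names and paths are made up; none of this uses the snakebids classes themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen, to echo the immutability of components
class ComponentSketch:
    name: str
    path: str


components = [
    ComponentSketch("t1w", "sub-{subject}/anat/sub-{subject}_T1w.nii.gz"),
    ComponentSketch("bold", "sub-{subject}/func/sub-{subject}_bold.nii.gz"),
]

# A dataset behaves like a mapping from component name to component.
dataset = {component.name: component for component in components}

# Recommended style: go through the component.
t1w_path = dataset["t1w"].path

# Deprecated style, modelled here as a name-to-attribute mapping.
paths_by_name = {name: component.path for name, component in dataset.items()}
```

Going through the component keeps working code independent of what the dataset-level properties come to mean in later releases.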
- property subj_wildcards: dict[str, str]
The subject and session wildcards applicable to this dataset.
{"subject": "{subject}"} if there is only one session, {"subject": "{subject}", "session": "{session}"} if there are multiple sessions.
- property as_dict: BidsDatasetDict
Get the layout as a legacy dict.
Included primarily for backward compatibility with older versions of snakebids, where generate_inputs() returned a dict rather than the BidsDataset class.
- Return type:
- classmethod from_iterable(iterable, layout=None)
Construct Dataset from iterable of BidsComponents.
- Parameters:
iterable (Iterable[BidsComponent]) –
layout (BIDSLayout | None) –
- Return type:
Legacy BidsDataset properties
The following properties are historical aliases of BidsDataset properties. There are no current plans to deprecate them, but new code should avoid them.
- property BidsDataset.input_zip_lists: dict[str, snakebids.utils.containers.MultiSelectDict[str, list[str]]]
Alias of zip_lists.
Dict mapping BidsComponent names to their zip_lists.
- property BidsDataset.input_wildcards: dict[str, snakebids.utils.containers.MultiSelectDict[str, str]]
Alias of wildcards.
Dict mapping BidsComponent names to their wildcards.
- class snakebids.BidsDatasetDict
Dict equivalent of BidsInputs, for backwards-compatibility.