Dataset Creation

snakebids.generate_inputs(bids_dir: Path | str, pybids_inputs: InputsConfig, pybidsdb_dir: Path | str | None = None, pybidsdb_reset: bool = None, derivatives: bool | Path | str = False, pybids_config: str | None = None, limit_to: Iterable[str] | None = None, participant_label: Iterable[str] | str | None = None, exclude_participant_label: Iterable[str] | str | None = None, use_bids_inputs: Literal[True] | None = None, index_metadata: bool = False, validate: bool = False, pybids_database_dir: Path | str | None = None, pybids_reset_database: bool = None) -> BidsDataset
snakebids.generate_inputs(bids_dir: Path | str, pybids_inputs: InputsConfig, pybidsdb_dir: Path | str | None = None, pybidsdb_reset: bool = None, derivatives: bool | Path | str = False, pybids_config: str | None = None, limit_to: Iterable[str] | None = None, participant_label: Iterable[str] | str | None = None, exclude_participant_label: Iterable[str] | str | None = None, use_bids_inputs: Literal[False] = None, index_metadata: bool = False, validate: bool = False, pybids_database_dir: Path | str | None = None, pybids_reset_database: bool = None) -> BidsDatasetDict

Dynamically generate snakemake inputs using pybids_inputs.

PyBIDS is used to parse bids_dir. Custom paths can also be parsed by including a custom_path entry under an input's descriptor in pybids_inputs.
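
To illustrate what parsing a custom path involves, here is a minimal sketch in plain Python. It is not snakebids' actual implementation: it assumes a template whose wildcards are wrapped in braces, converts the template into a regular expression, and collects the matched values into one list per wildcard. The helper name parse_custom_path and the example paths are hypothetical.

```python
import re
from collections import defaultdict


def parse_custom_path(template, candidates):
    """Hypothetical helper: collect wildcard values from paths matching a template."""
    # re.split with a capture group alternates literal text and wildcard names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    seen = set()
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)  # literal path text
        elif part in seen:
            pattern += f"(?P={part})"  # a repeated wildcard must repeat its value
        else:
            seen.add(part)
            pattern += f"(?P<{part}>[^/]+)"  # a wildcard stays within one path level
    regex = re.compile(pattern)

    values = defaultdict(list)
    for path in candidates:
        match = regex.fullmatch(path)
        if match is None:
            continue  # this path does not follow the template
        for name, value in match.groupdict().items():
            values[name].append(value)
    return dict(values)


# Hypothetical, non-BIDS-compliant paths
paths = [
    "/data/sub-001/001_T1w.nii.gz",
    "/data/sub-002/002_T1w.nii.gz",
    "/data/sub-002/notes.txt",
]
result = parse_custom_path("/data/sub-{subject}/{subject}_{contrast}.nii.gz", paths)
print(result)
```

The file notes.txt is skipped because it does not match the template, so each wildcard ends up with one value per matching path.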

Parameters:
  • bids_dir – Path to bids directory

  • pybids_inputs

    Configuration for BIDS inputs, keyed by input name (str).

    Each value is a nested dict with the following required keys (for complete info, see InputConfig):

    • "filters": Dictionary mapping each entity to its accepted values (dict of str -> str or list of str). The entity keys should be the BIDS tags on which to filter. Each value should be an acceptable str value for that entity, or a list of acceptable str values.

    • "wildcards": List of (str) bids tags to include as wildcards in snakemake. At minimum this should usually include ['subject','session'], plus any other wildcards that you may want to make use of in your snakemake workflow, or want to retain in the output paths. Any wildcards in this list that are not in the filename will just be ignored.

    • "custom_path": Custom path to be parsed with wildcards wrapped in braces, as in /path/to/sub-{subject}/{wildcard_1}-{wildcard_2}. This path will be parsed without pybids, allowing the use of non-bids-compliant paths.

  • pybidsdb_dir – Path to the database directory. If None is provided, no database is used.

  • pybidsdb_reset – A boolean that determines whether to reset / overwrite existing database.

  • derivatives – Indicates whether pybids should look for derivative datasets under bids_dir. These datasets must be properly formatted according to bids specs to be recognized. Defaults to False.

  • limit_to – If provided, indicates which input descriptors from pybids_inputs should be parsed. For example, if pybids_inputs describes "bold" and "dwi" inputs and limit_to = ["bold"], only the "bold" inputs will be parsed and "dwi" will be ignored.

  • participant_label – Indicate one or more participants to be included in input parsing. This may cause errors if subject filters are also specified in pybids_inputs. It may not be specified if exclude_participant_label is specified.

  • exclude_participant_label – Indicate one or more participants to be excluded from input parsing. This may cause errors if subject filters are also specified in pybids_inputs. It may not be specified if participant_label is specified.

  • use_bids_inputs – If False, returns the classic BidsDatasetDict instead of BidsDataset. Setting to True is deprecated as of v0.8, as this is now the default behaviour.

  • index_metadata – If True, indexes the metadata of the BIDS directory using pybids; otherwise, indexing is skipped.

  • validate – If True, validates the BIDS directory using pybids; otherwise, validation is skipped.
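
The selection semantics of limit_to, participant_label and exclude_participant_label described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not snakebids' code: the helper names select_inputs and select_participants are hypothetical, and raising ValueError when both label arguments are given is an assumption about how the conflict might be reported.

```python
def select_inputs(pybids_inputs, limit_to=None):
    """Hypothetical helper: keep only the input descriptors named in limit_to."""
    if limit_to is None:
        return dict(pybids_inputs)
    wanted = set(limit_to)
    return {name: cfg for name, cfg in pybids_inputs.items() if name in wanted}


def _as_set(labels):
    # Both arguments accept a single str or an iterable of str.
    return {labels} if isinstance(labels, str) else set(labels)


def select_participants(subjects, participant_label=None, exclude_participant_label=None):
    """Hypothetical helper: apply an include list or an exclude list, never both."""
    if participant_label is not None and exclude_participant_label is not None:
        # Assumed error type; the two options may not be combined.
        raise ValueError(
            "participant_label and exclude_participant_label cannot both be specified"
        )
    if participant_label is not None:
        keep = _as_set(participant_label)
        return [s for s in subjects if s in keep]
    if exclude_participant_label is not None:
        drop = _as_set(exclude_participant_label)
        return [s for s in subjects if s not in drop]
    return list(subjects)


inputs = {"bold": {"wildcards": ["subject"]}, "dwi": {"wildcards": ["subject"]}}
print(list(select_inputs(inputs, limit_to=["bold"])))
print(select_participants(["001", "002"], participant_label="001"))
print(select_participants(["001", "002"], exclude_participant_label=["001"]))
```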

Returns:

Object containing organized information about the bids inputs for consumption in snakemake. See the documentation of BidsDataset for details and examples.

Return type:

BidsDataset | BidsDatasetDict

Example

As an example, consider the following BIDS dataset:

example
├── README.md
├── dataset_description.json
├── participant.tsv
├── sub-001
│   ├── ses-01
│   │   ├── anat
│   │   │   ├── sub-001_ses-01_run-01_T1w.json
│   │   │   ├── sub-001_ses-01_run-01_T1w.nii.gz
│   │   │   ├── sub-001_ses-01_run-02_T1w.json
│   │   │   └── sub-001_ses-01_run-02_T1w.nii.gz
│   │   └── func
│   │       ├── sub-001_ses-01_task-nback_bold.json
│   │       ├── sub-001_ses-01_task-nback_bold.nii.gz
│   │       ├── sub-001_ses-01_task-rest_bold.json
│   │       └── sub-001_ses-01_task-rest_bold.nii.gz
│   └── ses-02
│       ├── anat
│       │   ├── sub-001_ses-02_run-01_T1w.json
│       │   └── sub-001_ses-02_run-01_T1w.nii.gz
│       └── func
│           ├── sub-001_ses-02_task-nback_bold.json
│           ├── sub-001_ses-02_task-nback_bold.nii.gz
│           ├── sub-001_ses-02_task-rest_bold.json
│           └── sub-001_ses-02_task-rest_bold.nii.gz
└── sub-002
    ├── ses-01
    │   ├── anat
    │   │   ├── sub-002_ses-01_run-01_T1w.json
    │   │   ├── sub-002_ses-01_run-01_T1w.nii.gz
    │   │   ├── sub-002_ses-01_run-02_T1w.json
    │   │   └── sub-002_ses-01_run-02_T1w.nii.gz
    │   └── func
    │       ├── sub-002_ses-01_task-nback_bold.json
    │       ├── sub-002_ses-01_task-nback_bold.nii.gz
    │       ├── sub-002_ses-01_task-rest_bold.json
    │       └── sub-002_ses-01_task-rest_bold.nii.gz
    └── ses-02
        └── anat
            ├── sub-002_ses-02_run-01_T1w.json
            ├── sub-002_ses-02_run-01_T1w.nii.gz
            ├── sub-002_ses-02_run-02_T1w.json
            └── sub-002_ses-02_run-02_T1w.nii.gz

With the following pybids_inputs defined in the config file:

pybids_inputs:
  bold:
    filters:
      suffix: 'bold'
      extension: '.nii.gz'
      datatype: 'func'
    wildcards:
      - subject
      - session
      - acquisition
      - task
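      - run
  t1w:
    filters:
      suffix: 'T1w'
      extension: '.nii.gz'
      datatype: 'anat'
    wildcards:
      - subject
      - session
      - run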

Then generate_inputs(bids_dir, pybids_inputs) would return the following values:

BidsDataset({
    "bold": BidsComponent(
        name="bold",
        path="example/sub-{subject}/ses-{session}/func/sub-{subject}_ses-{session}_task-{task}_bold.nii.gz",
        zip_lists={
            "subject": ["001",   "001",  "001",   "001",  "002",   "002" ],
            "session": ["01",    "01",   "02",    "02",   "01",    "01"  ],
            "task":    ["nback", "rest", "nback", "rest", "nback", "rest"],
        },
    ),
    "t1w": BidsComponent(
        name="t1w",
        path="example/sub-{subject}/ses-{session}/anat/sub-{subject}_ses-{session}_run-{run}_T1w.nii.gz",
        zip_lists={
            "subject": ["001", "001", "001", "002", "002", "002", "002"],
            "session": ["01",  "01",  "02",  "01",  "01",  "02",  "02" ],
            "run":     ["01",  "02",  "01",  "01",  "02",  "01",  "02" ],
        },
    ),
})
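
The zip_lists in this output are parallel lists: the entries at the same position in each list together describe one matched file. As a minimal sketch in plain Python (it does not call snakebids, and the variable names are illustrative), the concrete "bold" files can be recovered by zipping the lists and formatting the path template. The template below is written relative to the dataset root.

```python
zip_lists = {
    "subject": ["001", "001", "001", "001", "002", "002"],
    "session": ["01", "01", "02", "02", "01", "01"],
    "task": ["nback", "rest", "nback", "rest", "nback", "rest"],
}
# Path template relative to the dataset root
template = (
    "sub-{subject}/ses-{session}/func/"
    "sub-{subject}_ses-{session}_task-{task}_bold.nii.gz"
)

keys = list(zip_lists)
paths = [
    # Pair each wildcard name with the value at the current position
    template.format(**dict(zip(keys, values)))
    for values in zip(*zip_lists.values())
]
for p in paths:
    print(p)
```

Inside a Snakemake workflow, expand(path, zip, **zip_lists) pairs the entries in the same way, so only combinations that actually exist in the dataset are produced.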