Modules

build_config.py

Generates a Vitessce View config

build_config.build_options(file_type: str, file_path: str, file_options: dict[str, Any], check_exist: bool = False) → Any

Function that creates the View config’s options for non-image files

Parameters:

file_type (str) – Type of file supported by Vitessce.
file_path (str) – Path to file.
file_options (dict[str, T.Any]) – Dictionary defining the options.
check_exist (bool, optional) – Whether to check the given path to confirm the file exists. Defaults to False.

Returns:

Options dictionary for View config file

Return type:

T.Any

build_config.build_raster_options(images: dict[str, list[dict[str, Any]]], url: str) → dict[str, Any]

Function that creates the View config’s options for image files

Parameters:

images (dict[str, list[dict[str, T.Any]]]) – Dictionary that maps each image type key (raw and label) to a list of dictionaries, one per image of that type, each holding the path and metadata for that image.
url (str) – URL of the local or remote server that will serve the files. It is prepended to each file path in the config file.

Returns:

Options dictionary for View config file

Return type:

dict[str, T.Any]
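
The structure of images and the effect of url can be illustrated with a small sketch. This is not the library's implementation: the helper name prefix_image_paths, the "path" and "md" keys, the file names and the example server URL are all assumptions made for illustration.

```python
from typing import Any


def prefix_image_paths(
    images: dict[str, list[dict[str, Any]]], url: str
) -> dict[str, list[dict[str, Any]]]:
    """Return a copy of `images` with `url` prepended to every image path.

    Hypothetical helper: "path" is an assumed name for the key that
    stores each image's location.
    """
    base = url.rstrip("/")
    prefixed: dict[str, list[dict[str, Any]]] = {}
    for image_type, entries in images.items():
        prefixed[image_type] = [
            {
                **entry,
                # Leave the path untouched when no URL is given.
                "path": f"{base}/{entry['path'].lstrip('/')}" if base else entry["path"],
            }
            for entry in entries
        ]
    return prefixed


# One list per image type key, one dictionary per image (assumed keys).
images = {
    "raw": [{"path": "sample-raw.zarr", "md": {"channel_names": ["DAPI"]}}],
    "label": [{"path": "sample-label.zarr", "md": {}}],
}
served = prefix_image_paths(images, "http://localhost:3000/")
```

The input dictionary is copied rather than modified, so the same images value can be reused with a different url.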

build_config.write_json(project: str = '', dataset: str = '', file_paths: list[str] = [], images: dict[str, list[dict[str, Any]]] = {}, url: str = '', options: dict[str, Any] | None = None, layout: str = 'minimal', custom_layout: str | None = None, title: str = '', description: str = '', config_filename_suffix: str = 'config.json', outdir: str = './') → None

This function writes a Vitessce View config JSON file

Parameters:

project (str, optional) – Project name. Defaults to “”.
dataset (str, optional) – Dataset name. Defaults to “”.
file_paths (list[str], optional) – Paths to files that will be included in the config file. Defaults to [].
images (dict[str, list[dict[str, T.Any]]], optional) – Dictionary that maps each image type key (raw and label) to a list of dictionaries, one per image of that type, each holding the path and metadata for that image. Defaults to {}.
url (str, optional) – URL of the local or remote server that will serve the files. It is prepended to each file path in the config file. Defaults to “”.
options (dict[str, T.Any], optional) – Dictionary with Vitessce config file options. Defaults to None.
layout (str, optional) – Type of predefined layout to use. Defaults to “minimal”.
custom_layout (str, optional) – String defining a Vitessce layout following its alternative syntax (see https://vitessce.github.io/vitessce-python/api_config.html#vitessce.config.VitessceConfig.layout and https://github.com/vitessce/vitessce-python/blob/1e100e4f3f6b2389a899552dffe90716ffafc6d5/vitessce/config.py#L855). Defaults to None.
title (str, optional) – Data title to show in the visualization. Defaults to “”.
description (str, optional) – Data description to show in the visualization. Defaults to “”.
config_filename_suffix (str, optional) – Config filename suffix. Defaults to “config.json”.
outdir (str, optional) – Directory to which the config file will be written. Defaults to “./”.

Raises:

SystemExit – If no valid input files are given
SystemExit – If the layout has an error that cannot be fixed

consolidate_md.py

Consolidates Zarr metadata

consolidate_md.consolidate(file_in: str) → None

Function to consolidate the metadata of a Zarr file

Parameters:

file_in (str) – Path to Zarr file

ome_zarr_metadata.py

Gets OME XML basic metadata

ome_zarr_metadata.get_metadata(xml_path: str) → str

Function that parses an OME XML file and dumps basic metadata as a JSON-formatted string

Parameters:

xml_path (str) – Path to OME XML file

Returns:

JSON formatted metadata

Return type:

str
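
A minimal sketch of this kind of extraction, using only the standard library, might look as follows. It is not the module's code: the helper name ome_xml_to_json, the choice of fields and the output key names are assumptions, and it parses an XML string rather than a file path so that the example is self-contained.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of the 2016-06 OME-XML schema.
OME_NS = {"ome": "http://www.openmicroscopy.org/Schemas/OME/2016-06"}

EXAMPLE_XML = """<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0">
    <Pixels ID="Pixels:0" DimensionOrder="XYZCT" Type="uint16"
            SizeX="2048" SizeY="1024" SizeZ="1" SizeC="2" SizeT="1">
      <Channel ID="Channel:0:0" Name="DAPI"/>
      <Channel ID="Channel:0:1" Name="GFP"/>
    </Pixels>
  </Image>
</OME>"""


def ome_xml_to_json(xml_text: str) -> str:
    """Collect a few Pixels attributes and channel names as a JSON string.

    Hypothetical helper: which fields count as "basic metadata" is an
    assumption made for this sketch.
    """
    root = ET.fromstring(xml_text)
    pixels = root.find("ome:Image/ome:Pixels", OME_NS)
    if pixels is None:
        raise ValueError("no Pixels element found in the OME XML")
    metadata = {
        "DimensionOrder": pixels.get("DimensionOrder"),
        "Type": pixels.get("Type"),
    }
    # Image dimensions are stored as string attributes; convert them to int.
    for size_key in ("SizeX", "SizeY", "SizeZ", "SizeC", "SizeT"):
        metadata[size_key] = int(pixels.get(size_key))
    metadata["channel_names"] = [
        channel.get("Name") for channel in pixels.findall("ome:Channel", OME_NS)
    ]
    return json.dumps(metadata)


metadata_json = ome_xml_to_json(EXAMPLE_XML)
```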

process_h5ad.py

Processes H5AD files into AnnData-Zarr

process_h5ad.batch_process_array(file: str, zarr_file: str, m: int, n: int, batch_size: int, chunk_size: int) → None

Function to incrementally load and write a dense matrix to Zarr

Parameters:

file (str) – Path to h5ad file
zarr_file (str) – Path to output Zarr file
m (int) – Number of rows in the matrix
n (int) – Number of columns in the matrix
batch_size (int) – Number of columns to load and write at a time
chunk_size (int) – Output Zarr column chunk size
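
The column-wise batching described by n and batch_size can be sketched in plain Python. The helper column_batches and the nested-list matrix below are illustrative assumptions; the real function reads from an h5ad file and writes to Zarr, which this sketch does not attempt.

```python
from collections.abc import Iterator


def column_batches(n: int, batch_size: int) -> Iterator[tuple[int, int]]:
    """Yield half-open (start, end) column ranges that cover 0..n."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n, batch_size):
        # The last batch may be shorter than batch_size.
        yield start, min(start + batch_size, n)


# Copy a small dense m x n matrix one column batch at a time.
source = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
m, n = len(source), len(source[0])
target = [[0] * n for _ in range(m)]
for start, end in column_batches(n, batch_size=2):
    for row in range(m):
        target[row][start:end] = source[row][start:end]
```

Only batch_size columns are held at once in each iteration, which is the point of processing the matrix incrementally.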

process_h5ad.batch_process_sparse(file: str, zarr_file: str, m: int, n: int, batch_size: int, chunk_size: int, is_csc: bool = False) → None

Function to incrementally load and write a sparse matrix to Zarr

Parameters:

file (str) – Path to h5ad file
zarr_file (str) – Path to output Zarr file
m (int) – Number of rows in the matrix
n (int) – Number of columns in the matrix
batch_size (int) – Number of rows/columns to load and write at a time
chunk_size (int) – Output Zarr column chunk size
is_csc (bool, optional) – If matrix is in CSC format instead of CSR format. Defaults to False.
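
For a CSR matrix, a batch of rows can be taken directly from the data, indices and indptr arrays without touching the rest of the matrix. The sketch below shows that slicing on plain lists; the helper name csr_row_batch and the example matrix are assumptions, and this is not the module's implementation.

```python
def csr_row_batch(
    data: list[float],
    indices: list[int],
    indptr: list[int],
    start: int,
    end: int,
) -> tuple[list[float], list[int], list[int]]:
    """Return the CSR arrays describing rows [start, end) of a CSR matrix."""
    lo, hi = indptr[start], indptr[end]
    # Row pointers of the batch are re-based so that they start at zero.
    batch_indptr = [pointer - lo for pointer in indptr[start:end + 1]]
    return data[lo:hi], indices[lo:hi], batch_indptr


# CSR form of the 3 x 4 matrix
# [[1, 0, 0, 2],
#  [0, 0, 3, 0],
#  [4, 5, 0, 0]]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
indices = [0, 3, 2, 0, 1]
indptr = [0, 2, 3, 5]

m, batch_size = 3, 2
batches = [
    csr_row_batch(data, indices, indptr, start, min(start + batch_size, m))
    for start in range(0, m, batch_size)
]
```

For a CSC matrix the same slicing applies to columns instead of rows, which matches the rows/columns wording of batch_size above.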

process_h5ad.h5ad_to_zarr(path: str | None = None, stem: str = '', adata: AnnData | None = None, chunk_size: int = 10, batch_processing: bool = False, batch_size: int = 10000, consolidate_metadata: bool = True, **kwargs) → str

This function takes an AnnData object or path to an h5ad file, ensures data is of an appropriate data type for Vitessce and writes the object to Zarr.

Parameters:

path (str, optional) – Path to the h5ad file. Defaults to None.
stem (str, optional) – Prefix for the output file. Defaults to “”.
adata (AnnData, optional) – AnnData object to process. Supersedes path. Defaults to None.
chunk_size (int, optional) – Output Zarr column chunk size. Defaults to 10.
batch_processing (bool, optional) – Whether to write the expression matrix to Zarr incrementally. Use to avoid loading the whole AnnData into memory. Defaults to False.
batch_size (int, optional) – Number of rows (if the matrix is in CSR format) or columns (if the matrix is dense or in CSC format) of the expression matrix to process at a time when batch processing. Defaults to 10000.
consolidate_metadata (bool, optional) – Whether to consolidate the metadata of the output Zarr file. Defaults to True.

Raises:

SystemError – If batch_processing is True and the matrix contains an indptr key but is in neither scipy.sparse.csr_matrix nor scipy.sparse.csc_matrix format

Returns:

Output Zarr filename

Return type:

str

process_h5ad.preprocess_anndata(adata: AnnData, compute_embeddings: bool = False, var_index: str | None = None, obs_subset: tuple[str, Any] | None = None, var_subset: tuple[str, Any] | None = None, **kwargs)

This function preprocesses an AnnData object, ensuring correct dtypes for zarr conversion

Parameters:

adata (AnnData) – AnnData object to preprocess.
compute_embeddings (bool, optional) – If X_umap and X_pca embeddings will be computed. Defaults to False.
var_index (str, optional) – Alternative var column name with var names to be used in the visualization. Defaults to None.
obs_subset (tuple(str, T.Any), optional) – Tuple containing an obs column name and one or more values to use to subset the AnnData object. Defaults to None.
var_subset (tuple(str, T.Any), optional) – Tuple containing a var column name and one or more values to use to subset the AnnData object. Defaults to None.
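
The obs_subset and var_subset tuples pair a column name with one or more accepted values. The following sketch shows that selection on a list of row dictionaries. The helper subset_rows and the example columns are hypothetical, and the real function operates on an AnnData object rather than on plain dictionaries.

```python
from typing import Any


def subset_rows(
    rows: list[dict[str, Any]], subset: tuple[str, Any]
) -> list[dict[str, Any]]:
    """Keep the rows whose value in the given column is one of the accepted values."""
    column, values = subset
    # A single value is treated as a one-element collection.
    if not isinstance(values, (list, tuple, set)):
        values = [values]
    return [row for row in rows if row[column] in values]


obs = [
    {"cell_id": "c1", "sample": "A"},
    {"cell_id": "c2", "sample": "B"},
    {"cell_id": "c3", "sample": "C"},
]
single = subset_rows(obs, ("sample", "A"))
several = subset_rows(obs, ("sample", ["A", "C"]))
```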

process_molecules.py

Processes molecules files

process_molecules.tsv_to_json(path: str, stem: str, has_header: bool = True, gene_col_name: str = 'Name', x_col_name: str = 'x_int', y_col_name: str = 'y_int', delimiter: str = '\t', x_scale: float = 1.0, y_scale: float = 1.0, x_offset: float = 0.0, y_offset: float = 0.0, gene_col_idx: int | None = None, x_col_idx: int | None = None, y_col_idx: int | None = None, filter_col_name: str | None = None, filter_col_idx: int | None = None, filter_col_value: str | None = None) → str

This function loads a TSV/CSV file containing gene names, X and Y coordinates and writes them to a JSON file supported by Vitessce

Parameters:

path (str) – Path to tsv/csv file
stem (str) – Prefix for output JSON file
has_header (bool, optional) – If input file contains a header row. Defaults to True.
gene_col_name (str, optional) – Column header name where gene names are stored. Defaults to “Name”.
x_col_name (str, optional) – Column header name where X coordinates are stored. Defaults to “x_int”.
y_col_name (str, optional) – Column header name where Y coordinates are stored. Defaults to “y_int”.
delimiter (str, optional) – Input file delimiter. Defaults to “\t”.
x_scale (float, optional) – Scale to multiply X coordinates by. Defaults to 1.0.
y_scale (float, optional) – Scale to multiply Y coordinates by. Defaults to 1.0.
x_offset (float, optional) – Offset to add to X coordinates. Defaults to 0.0.
y_offset (float, optional) – Offset to add to Y coordinates. Defaults to 0.0.
gene_col_idx (int, optional) – Column index where gene names are stored if header is not present. Defaults to None.
x_col_idx (int, optional) – Column index where X coordinates are stored if header is not present. Defaults to None.
y_col_idx (int, optional) – Column index where Y coordinates are stored if header is not present. Defaults to None.
filter_col_name (str, optional) – Column header name storing values to filter data. Defaults to None.
filter_col_idx (int, optional) – Column index storing values to filter data if header is not present. Defaults to None.
filter_col_value (str, optional) – Value expected in filter column. If a row has a different value it will not be written to output file. Defaults to None.

Raises:

SystemExit – If any column header name is not in the header row.
Exception – If coordinate values cannot be parsed to float (the underlying error is re-raised)

Returns:

Output JSON filename

Return type:

str
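
The conversion can be sketched with the standard library as below. Several points are assumptions rather than documented behaviour: the helper name molecules_to_json, the output shape (a mapping from gene name to a list of [x, y] pairs), the order in which scale and offset are applied (scale first), and reading from a string instead of a file.

```python
from __future__ import annotations

import csv
import io
import json
from collections import defaultdict


def molecules_to_json(
    text: str,
    gene_col_name: str = "Name",
    x_col_name: str = "x_int",
    y_col_name: str = "y_int",
    delimiter: str = "\t",
    x_scale: float = 1.0,
    y_scale: float = 1.0,
    x_offset: float = 0.0,
    y_offset: float = 0.0,
    filter_col_name: str | None = None,
    filter_col_value: str | None = None,
) -> str:
    """Group scaled and shifted coordinates by gene name and dump them as JSON."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    molecules: dict[str, list[list[float]]] = defaultdict(list)
    for row in reader:
        # Skip rows whose filter column does not hold the expected value.
        if filter_col_name is not None and row[filter_col_name] != filter_col_value:
            continue
        # Assumed order of operations: multiply by the scale, then add the offset.
        x = float(row[x_col_name]) * x_scale + x_offset
        y = float(row[y_col_name]) * y_scale + y_offset
        molecules[row[gene_col_name]].append([x, y])
    return json.dumps(molecules)


example_tsv = (
    "Name\tx_int\ty_int\tqc\n"
    "Actb\t10\t20\tpass\n"
    "Gapdh\t5\t8\tfail\n"
    "Actb\t12\t22\tpass\n"
)
result = molecules_to_json(
    example_tsv,
    x_scale=0.5,
    y_offset=1.0,
    filter_col_name="qc",
    filter_col_value="pass",
)
```

In this sketch a non-numeric coordinate makes float() raise, which mirrors the documented failure when coordinate values cannot be parsed.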

process_spaceranger.py

Processes SpaceRanger output

process_spaceranger.spaceranger_to_anndata(path: str, load_clusters: bool = True, load_embeddings: bool = True, load_raw: bool = False) → AnnData

Function to create an AnnData object from a SpaceRanger output directory.

Parameters:

path (str) – Path to a SpaceRanger output directory
load_clusters (bool, optional) – If cluster files should be included in the AnnData object. Defaults to True.
load_embeddings (bool, optional) – If embedding coordinates files should be included in the AnnData object. Defaults to True.
load_raw (bool, optional) – If the raw matrix count file should be loaded instead of the filtered matrix. Defaults to False.

Returns:

AnnData object created from the SpaceRanger output data

Return type:

AnnData

process_spaceranger.spaceranger_to_zarr(path: str, stem: str, load_clusters: bool = True, load_embeddings: bool = True, load_raw: bool = False, save_h5ad: bool = False, **kwargs) → str

Function to write to Zarr an AnnData object created from SpaceRanger output data

Parameters:

path (str) – Path to a SpaceRanger output directory
stem (str) – Prefix for the output Zarr filename
load_clusters (bool, optional) – If cluster files should be included in the AnnData object. Defaults to True.
load_embeddings (bool, optional) – If embedding coordinates files should be included in the AnnData object. Defaults to True.
load_raw (bool, optional) – If the raw matrix count file should be loaded instead of the filtered matrix. Defaults to False.
save_h5ad (bool, optional) – If the AnnData object should also be written to an h5ad file. Defaults to False.

Returns:

Output Zarr filename

Return type:

str

process_spaceranger.visium_label(stem: str, file_path: str, shape: tuple[int, int] | None = None, obs_subset: tuple[int, Any] | None = None, sample_id: str | None = None, relative_size: str | None = None) → None

This function writes a label image tif file with labels drawn according to an AnnData object that stores the necessary metadata within uns[“spatial”].

Parameters:

stem (str) – Prefix for the output image filename.
file_path (str) – Path to the h5ad file or spaceranger output directory.
shape (tuple[int, int], optional) – Output image shape. Defaults to None.
obs_subset (tuple(str, T.Any), optional) – Tuple containing an obs column name and one or more values to use to subset the AnnData object. Defaults to None.
sample_id (str, optional) – Sample ID string within the AnnData object. Defaults to None.
relative_size (str, optional) – Optional numerical obs column name that holds a multiplier for the spot diameter. Only useful for data that has been processed to merge spots. Defaults to None.

process_xenium.py

Processes Xenium output

process_xenium.xenium_label(stem: str, path: str, shape: tuple[int, int], resolution: float = 0.2125) → None

This function writes a label image tif file with labels drawn according to the cell segmentation polygons from the Xenium output cells.zarr.zip file

Parameters:

stem (str) – Prefix for the output image filename.
path (str) – Path to the Xenium output directory or cells.zarr.zip file
shape (tuple[int, int]) – Output image shape.
resolution (float, optional) – Pixel resolution. Defaults to 0.2125.

process_xenium.xenium_to_anndata(path: str, spatial_as_pixel: bool = True, resolution: float = 0.2125, load_clusters: bool = True, load_embeddings: bool = True) → AnnData

Function to create an AnnData object from Xenium output.

Parameters:

path (str) – Path to a Xenium output directory
spatial_as_pixel (bool, optional) – Whether spatial coordinates should be converted to pixels. Defaults to True.
resolution (float, optional) – Pixel resolution. Defaults to 0.2125.
load_clusters (bool, optional) – If cluster files should be included in the AnnData object. Defaults to True.
load_embeddings (bool, optional) – If embedding coordinates files should be included in the AnnData object. Defaults to True.

Returns:

AnnData object created from the Xenium output data

Return type:

AnnData
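
The relationship between spatial_as_pixel and resolution can be illustrated with a small conversion sketch. It assumes that resolution is the pixel size in microns per pixel, so that micron coordinates are divided by it. That interpretation and the helper name microns_to_pixels are assumptions, not taken from the module.

```python
def microns_to_pixels(
    coords: list[tuple[float, float]], resolution: float = 0.2125
) -> list[tuple[float, float]]:
    """Convert (x, y) micron coordinates to pixel coordinates.

    Assumes `resolution` is expressed in microns per pixel.
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    return [(x / resolution, y / resolution) for x, y in coords]


pixel_coords = microns_to_pixels([(0.2125, 0.425), (21.25, 42.5)])
```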

process_xenium.xenium_to_zarr(path: str, stem: str, spatial_as_pixel: bool = True, resolution: float = 0.2125, save_h5ad: bool = False, **kwargs) → str

Function to write to Zarr an AnnData object created from Xenium output data

Parameters:

path (str) – Path to a Xenium output directory
stem (str) – Prefix for the output Zarr filename
spatial_as_pixel (bool, optional) – Whether spatial coordinates should be converted to pixels. Defaults to True.
resolution (float, optional) – Pixel resolution. Defaults to 0.2125.
save_h5ad (bool, optional) – If the AnnData object should also be written to an h5ad file. Defaults to False.

Returns:

Output Zarr filename

Return type:

str

build_config_multimodal.py

Generates a Vitessce View config

build_config_multimodal.build_raster_options(images: dict[str, list[dict[str, Any]]], url: str) → dict[str, Any]

Function that creates the View config’s options for image files

Parameters:

images (dict[str, list[dict[str, T.Any]]]) – Dictionary that maps each image type key (raw and label) to a list of dictionaries, one per image of that type, each holding the path and metadata for that image.
url (str) – URL of the local or remote server that will serve the files. It is prepended to each file path in the config file.

Returns:

Options dictionary for View config file

Return type:

dict[str, T.Any]

build_config_multimodal.concat_views(views: list, axis: str = 'v')

Recursively concatenate views

build_config_multimodal.write_json(project: str = '', datasets: dict[str, dict[str]] = {}, extended_features: str | list = [], url: str = '', config_filename_suffix: str = 'config.json', title: str = '', description: str = '', outdir: str = './') → None

This function writes a Vitessce View config JSON file

Parameters:

project (str, optional) – Project name. Defaults to “”.
datasets (dict[str, dict[str]], optional) – Dictionary of datasets. Defaults to {}. Expected structure:

    {
        dataset_name: {
            “file_paths”: [],
            “images”: {“raw”: [], “label”: []},
            “options”: {},
            “obs_type”: “cell”,
            “is_spatial”: True
        }
    }

extended_features (Union[list[str], str], optional) – List of features or string of single feature on which the expression matrix was extended and var/is_{feature} is present. Defaults to [].
url (str, optional) – URL of the local or remote server that will serve the files. It is prepended to each file path in the config file. Defaults to “”.
config_filename_suffix (str, optional) – Config filename suffix. Defaults to “config.json”.
title (str, optional) – Data title to show in the visualization. Defaults to “”.
description (str, optional) – Data description to show in the visualization. Defaults to “”.
outdir (str, optional) – Directory to which the config file will be written. Defaults to “./”.
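
As an illustration of the datasets argument, the sketch below builds a dictionary with the documented structure and checks each entry for the documented keys. The dataset names, file names and the helper missing_dataset_keys are hypothetical, and whether every key is actually required by write_json is an assumption.

```python
# Keys listed in the documented structure of each dataset entry.
EXPECTED_KEYS = {"file_paths", "images", "options", "obs_type", "is_spatial"}

datasets = {
    "rna": {
        "file_paths": ["sample-rna-anndata.zarr"],
        "images": {"raw": [], "label": []},
        "options": {},
        "obs_type": "cell",
        "is_spatial": True,
    },
    "visium": {
        "file_paths": ["sample-visium-anndata.zarr"],
        "images": {"raw": [], "label": []},
        "options": {},
        "obs_type": "spot",
        "is_spatial": True,
    },
}


def missing_dataset_keys(datasets: dict[str, dict]) -> dict[str, list[str]]:
    """Map each dataset name to the documented keys it lacks (hypothetical check)."""
    report = {}
    for name, entry in datasets.items():
        missing = sorted(EXPECTED_KEYS - set(entry))
        if missing:
            report[name] = missing
    return report


problems = missing_dataset_keys(datasets)
```

A check of this kind, run before building the config, reports an incomplete entry by dataset name.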