Modules

build_config.py

Generates a Vitessce View config

build_config.build_options(file_type: str, file_path: str, file_options: dict[str, Any], check_exist: bool = False) → Any

Function that creates the View config’s options for non-image files

Parameters:
  • file_type (str) – Type of file supported by Vitessce.

  • file_path (str) – Path to file.

  • file_options (dict[str, T.Any]) – Dictionary defining the options.

  • check_exist (bool, optional) – Whether to check the given path to confirm the file exists. Defaults to False.

Returns:

Options dictionary for View config file

Return type:

T.Any

build_config.build_raster_options(images: dict[str, list[dict[str, Any]]], url: str) → dict[str, Any]

Function that creates the View config’s options for image files

Parameters:
  • images (dict[str, list[dict[str, T.Any]]]) – Dictionary containing, for each image type key (raw and label), a list of dictionaries (one per image of that type) with the corresponding path and metadata for that image.

  • url (str) – URL to prepend to each file in the config file, i.e. the URL of the local or remote server that will serve the files.

Returns:

Options dictionary for View config file

Return type:

dict[str, T.Any]
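As an illustration of the images argument, the sketch below builds a dictionary with the two image type keys described above and joins a server URL onto each path. The per-image key names path and md, the file names, the URL, and the helper prepend_url are assumptions made for this example; they are not taken from the build_config source.

```python
from typing import Any

# Hypothetical per-image entries: the key names "path" and "md" are assumed.
images: dict[str, list[dict[str, Any]]] = {
    "raw": [{"path": "sample-raw.zarr", "md": {"width": 2048, "height": 1024}}],
    "label": [{"path": "sample-label.zarr", "md": {"width": 2048, "height": 1024}}],
}


def prepend_url(url: str, path: str) -> str:
    """Join a server URL and a relative file path with exactly one slash."""
    return url.rstrip("/") + "/" + path.lstrip("/")


url = "http://localhost:3000/"  # made-up local server address
served = {
    image_type: [prepend_url(url, entry["path"]) for entry in entries]
    for image_type, entries in images.items()
}
print(served)
```

Each image type keeps its own list, so several raw or label images can be registered side by side.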

build_config.write_json(project: str = '', dataset: str = '', file_paths: list[str] = [], images: dict[str, list[dict[str, Any]]] = {}, url: str = '', options: dict[str, Any] | None = None, layout: str = 'minimal', custom_layout: str | None = None, title: str = '', description: str = '', config_filename_suffix: str = 'config.json', outdir: str = './') → None

This function writes a Vitessce View config JSON file

Parameters:
  • project (str, optional) – Project name. Defaults to “”.

  • dataset (str, optional) – Dataset name. Defaults to “”.

  • file_paths (list[str], optional) – Paths to files that will be included in the config file. Defaults to [].

  • images (dict[str, list[dict[str, T.Any]]], optional) – Dictionary containing for each image type key (raw and label) a list of dictionaries (one per image of that type) with the corresponding path and metadata for that image. Defaults to {}.

  • url (str, optional) – URL to prepend to each file in the config file, i.e. the URL of the local or remote server that will serve the files. Defaults to “”.

  • options (dict[str, T.Any], optional) – Dictionary with Vitessce config file options. Defaults to None.

  • layout (str, optional) – Type of predefined layout to use. Defaults to “minimal”.

  • custom_layout (str, optional) – String defining a Vitessce layout following its alternative syntax. See https://vitessce.github.io/vitessce-python/api_config.html#vitessce.config.VitessceConfig.layout and https://github.com/vitessce/vitessce-python/blob/1e100e4f3f6b2389a899552dffe90716ffafc6d5/vitessce/config.py#L855. Defaults to None.

  • title (str, optional) – Data title to show in the visualization. Defaults to “”.

  • description (str, optional) – Data description to show in the visualization. Defaults to “”.

  • config_filename_suffix (str, optional) – Config filename suffix. Defaults to “config.json”.

  • outdir (str, optional) – Directory to which the config file will be written. Defaults to “./”.

Raises:
  • SystemExit – If no valid files have been input

  • SystemExit – If the layout has an error that can not be fixed
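A call to write_json might look like the sketch below. The keyword names come from the signature above; every value (project name, file name, URL) is made up, and it is an assumption that build_config is importable from the working directory and that the example file name is one the function accepts. The import is guarded so the snippet also runs where the module is not installed.

```python
# Keyword arguments mirroring the documented signature; all values are made up.
write_json_kwargs = {
    "project": "my_project",
    "dataset": "sample_1",
    "file_paths": ["sample_1-anndata.zarr"],
    "url": "http://localhost:3000/",
    "layout": "minimal",
    "title": "Sample 1",
    "description": "Example dataset",
    "config_filename_suffix": "config.json",
    "outdir": "./",
}

try:
    import build_config  # assumption: the module is on the import path
except ImportError:
    build_config = None

if build_config is not None:
    try:
        build_config.write_json(**write_json_kwargs)
    except SystemExit as err:
        # Documented failure modes: no valid input files, or an unfixable layout.
        print(f"write_json aborted: {err}")
```

Catching SystemExit is optional; it is shown here only because both documented failure modes raise it.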

consolidate_md.py

Consolidates Zarr metadata

consolidate_md.consolidate(file_in: str) → None

Function to consolidate the metadata of a Zarr file

Parameters:

file_in (str) – Path to Zarr file

ome_zarr_metadata.py

Gets OME XML basic metadata

ome_zarr_metadata.get_metadata(xml_path: str) → str

Function that parses an OME XML file and dumps basic metadata as a JSON formatted str

Parameters:

xml_path (str) – Path to OME XML file

Returns:

JSON formatted metadata

Return type:

str

process_h5ad.py

Processes H5AD files into AnnData-Zarr

process_h5ad.batch_process_array(file: str, zarr_file: str, m: int, n: int, batch_size: int, chunk_size: int) → None

Function to incrementally load and write a dense matrix to Zarr

Parameters:
  • file (str) – Path to h5ad file

  • zarr_file (str) – Path to output Zarr file

  • m (int) – Number of rows in the matrix

  • n (int) – Number of columns in the matrix

  • batch_size (int) – Number of columns to load and write at a time

  • chunk_size (int) – Output Zarr column chunk size

process_h5ad.batch_process_sparse(file: str, zarr_file: str, m: int, n: int, batch_size: int, chunk_size: int, is_csc: bool = False) → None

Function to incrementally load and write a sparse matrix to Zarr

Parameters:
  • file (str) – Path to h5ad file

  • zarr_file (str) – Path to output Zarr file

  • m (int) – Number of rows in the matrix

  • n (int) – Number of columns in the matrix

  • batch_size (int) – Number of rows/columns to load and write at a time

  • chunk_size (int) – Output Zarr column chunk size

  • is_csc (bool, optional) – If matrix is in CSC format instead of CSR format. Defaults to False.
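The idea of loading and writing a matrix batch_size rows or columns at a time can be sketched as a generator of index ranges. This is a minimal illustration of the batching arithmetic only, not the process_h5ad implementation; the helper name batch_ranges is hypothetical.

```python
from typing import Iterator


def batch_ranges(total: int, batch_size: int) -> Iterator[tuple[int, int]]:
    """Yield half-open (start, end) index ranges that cover range(total)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, total, batch_size):
        # The last batch is truncated so it never runs past the matrix edge.
        yield start, min(start + batch_size, total)


# Dense or CSC matrix: iterate over the n columns in batches.
n = 25
column_batches = list(batch_ranges(n, batch_size=10))
print(column_batches)  # [(0, 10), (10, 20), (20, 25)]
```

For a CSR matrix the same ranges would be applied to the m rows instead of the n columns.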

process_h5ad.h5ad_to_zarr(path: str | None = None, stem: str = '', adata: AnnData | None = None, chunk_size: int = 10, batch_processing: bool = False, batch_size: int = 10000, consolidate_metadata: bool = True, **kwargs) → str

This function takes an AnnData object or path to an h5ad file, ensures data is of an appropriate data type for Vitessce and writes the object to Zarr.

Parameters:
  • path (str, optional) – Path to the h5ad file. Defaults to None.

  • stem (str, optional) – Prefix for the output file. Defaults to “”.

  • adata (AnnData, optional) – AnnData object to process. Supersedes path. Defaults to None.

  • chunk_size (int, optional) – Output Zarr column chunk size. Defaults to 10.

  • batch_processing (bool, optional) – If the expression matrix will be written to Zarr incrementally. Use to avoid loading the whole AnnData into memory. Defaults to False.

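  • batch_size (int, optional) – The number of rows (if matrix is in CSR format) or columns (if matrix is dense or in CSC format) of the expression matrix to process at a time when batch processing. Defaults to 10000.

  • consolidate_metadata (bool, optional) – Whether to consolidate the metadata of the output Zarr file. Defaults to True.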

Raises:

SystemError – If batch_processing is True and the matrix contains an indptr key but is in neither scipy.sparse.csr_matrix nor scipy.sparse.csc_matrix format

Returns:

Output Zarr filename

Return type:

str

process_h5ad.preprocess_anndata(adata: AnnData, compute_embeddings: bool = False, var_index: str | None = None, obs_subset: tuple[str, Any] | None = None, var_subset: tuple[str, Any] | None = None, **kwargs)

This function preprocesses an AnnData object, ensuring correct dtypes for zarr conversion

Parameters:
  • adata (AnnData) – AnnData object to preprocess.

  • compute_embeddings (bool, optional) – If X_umap and X_pca embeddings will be computed. Defaults to False.

  • var_index (str, optional) – Alternative var column name with var names to be used in the visualization. Defaults to None.

  • obs_subset (tuple(str, T.Any), optional) – Tuple containing an obs column name and one or more values to use to subset the AnnData object. Defaults to None.

  • var_subset (tuple(str, T.Any), optional) – Tuple containing a var column name and one or more values to use to subset the AnnData object. Defaults to None.
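The obs_subset and var_subset tuples pair a column name with one or more values. The sketch below shows that filtering logic on plain dictionaries rather than on an AnnData object, so it runs without extra dependencies; the helper subset_rows and the sample data are hypothetical, and the real function may handle values differently.

```python
from typing import Any


def subset_rows(rows: list[dict[str, Any]], subset: tuple[str, Any]) -> list[dict[str, Any]]:
    """Keep rows whose value in the named column matches one of the given values."""
    column, values = subset
    # Accept either a single value or a collection of values.
    if not isinstance(values, (list, tuple, set)):
        values = [values]
    return [row for row in rows if row[column] in values]


obs = [
    {"cell_id": "c1", "sample": "A"},
    {"cell_id": "c2", "sample": "B"},
    {"cell_id": "c3", "sample": "C"},
]
print(subset_rows(obs, ("sample", ["A", "C"])))  # keeps c1 and c3
print(subset_rows(obs, ("sample", "B")))         # keeps c2
```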

process_molecules.py

Processes molecules files

process_molecules.tsv_to_json(path: str, stem: str, has_header: bool = True, gene_col_name: str = 'Name', x_col_name: str = 'x_int', y_col_name: str = 'y_int', delimiter: str = '\t', x_scale: float = 1.0, y_scale: float = 1.0, x_offset: float = 0.0, y_offset: float = 0.0, gene_col_idx: int | None = None, x_col_idx: int | None = None, y_col_idx: int | None = None, filter_col_name: str | None = None, filter_col_idx: int | None = None, filter_col_value: str | None = None) → str

This function loads a TSV/CSV file containing gene names, X and Y coordinates and writes them to a JSON file supported by Vitessce

Parameters:
  • path (str) – Path to tsv/csv file

  • stem (str) – Prefix for output JSON file

  • has_header (bool, optional) – If input file contains a header row. Defaults to True.

  • gene_col_name (str, optional) – Column header name where gene names are stored. Defaults to “Name”.

  • x_col_name (str, optional) – Column header name where X coordinates are stored. Defaults to “x_int”.

  • y_col_name (str, optional) – Column header name where Y coordinates are stored. Defaults to “y_int”.

  • delimiter (str, optional) – Input file delimiter. Defaults to “\t”.

  • x_scale (float, optional) – Scale to multiply X coordinates by. Defaults to 1.0.

  • y_scale (float, optional) – Scale to multiply Y coordinates by. Defaults to 1.0.

  • x_offset (float, optional) – Offset to add to X coordinates. Defaults to 0.0.

  • y_offset (float, optional) – Offset to add to Y coordinates. Defaults to 0.0.

  • gene_col_idx (int, optional) – Column index where gene names are stored if header is not present. Defaults to None.

  • x_col_idx (int, optional) – Column index where X coordinates are stored if header is not present. Defaults to None.

  • y_col_idx (int, optional) – Column index where Y coordinates are stored if header is not present. Defaults to None.

  • filter_col_name (str, optional) – Column header name storing values to filter data. Defaults to None.

  • filter_col_idx (int, optional) – Column index storing values to filter data if header is not present. Defaults to None.

  • filter_col_value (str, optional) – Value expected in filter column. If a row has a different value it will not be written to output file. Defaults to None.

Raises:
  • SystemExit – If any column header name is not in the header row.

  • e – If coordinate values cannot be parsed to float

Returns:

Output JSON filename

Return type:

str
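The interaction of the scale, offset, and filter parameters can be sketched with the standard csv module. This is an illustration under assumptions, not the process_molecules code: the order of operations (scale first, then offset), the output shape (gene name mapped to a list of [x, y] pairs), the qc column, and the helper transform_rows are all assumed for the example.

```python
import csv
import io

# Made-up input using the documented default column names and tab delimiter.
tsv_text = "Name\tx_int\ty_int\tqc\nACTB\t10\t20\tpass\nGAPDH\t30\t40\tfail\n"


def transform_rows(text, x_scale=1.0, y_scale=1.0, x_offset=0.0, y_offset=0.0,
                   filter_col_name=None, filter_col_value=None):
    """Group transformed [x, y] coordinates by gene name."""
    molecules = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        # Skip rows whose filter column does not hold the expected value.
        if filter_col_name is not None and row[filter_col_name] != filter_col_value:
            continue
        # Assumed order of operations: scale first, then add the offset.
        x = float(row["x_int"]) * x_scale + x_offset
        y = float(row["y_int"]) * y_scale + y_offset
        molecules.setdefault(row["Name"], []).append([x, y])
    return molecules


result = transform_rows(tsv_text, x_scale=0.5, y_offset=1.0,
                        filter_col_name="qc", filter_col_value="pass")
print(result)  # {'ACTB': [[5.0, 21.0]]}
```

The float() calls raise ValueError on unparseable coordinates, which mirrors the documented failure when coordinate values cannot be parsed to float.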

process_spaceranger.py

Processes SpaceRanger output

process_spaceranger.spaceranger_to_anndata(path: str, load_clusters: bool = True, load_embeddings: bool = True, load_raw: bool = False) → AnnData

Function to create an AnnData object from a SpaceRanger output directory.

Parameters:
  • path (str) – Path to a SpaceRanger output directory

  • load_clusters (bool, optional) – If cluster files should be included in the AnnData object. Defaults to True.

  • load_embeddings (bool, optional) – If embedding coordinates files should be included in the AnnData object. Defaults to True.

  • load_raw (bool, optional) – If the raw matrix count file should be loaded instead of the filtered matrix. Defaults to False.

Returns:

AnnData object created from the SpaceRanger output data

Return type:

AnnData

process_spaceranger.spaceranger_to_zarr(path: str, stem: str, load_clusters: bool = True, load_embeddings: bool = True, load_raw: bool = False, save_h5ad: bool = False, **kwargs) → str

Function to write to Zarr an AnnData object created from SpaceRanger output data

Parameters:
  • path (str) – Path to a SpaceRanger output directory

  • stem (str) – Prefix for the output Zarr filename

  • load_clusters (bool, optional) – If cluster files should be included in the AnnData object. Defaults to True.

  • load_embeddings (bool, optional) – If embedding coordinates files should be included in the AnnData object. Defaults to True.

  • load_raw (bool, optional) – If the raw matrix count file should be loaded instead of the filtered matrix. Defaults to False.

  • save_h5ad (bool, optional) – If the AnnData object should also be written to an h5ad file. Defaults to False.

Returns:

Output Zarr filename

Return type:

str

process_spaceranger.visium_label(stem: str, file_path: str, shape: tuple[int, int] | None = None, obs_subset: tuple[int, Any] | None = None, sample_id: str | None = None, relative_size: str | None = None) → None

This function writes a label image tif file with drawn labels according to an AnnData object with the necessary metadata stored within uns[“spatial”].

Parameters:
  • stem (str) – Prefix for the output image filename.

  • file_path (str) – Path to the h5ad file or spaceranger output directory.

  • shape (tuple[int, int], optional) – Output image shape. Defaults to None.

  • obs_subset (tuple(str, T.Any), optional) – Tuple containing an obs column name and one or more values to use to subset the AnnData object. Defaults to None.

  • sample_id (str, optional) – Sample ID string within the AnnData object. Defaults to None.

  • relative_size (str, optional) – Optional numerical obs column name that holds a multiplier for the spot diameter. Only useful for data that has been processed to merge spots. Defaults to None.

process_xenium.py

Processes Xenium output

process_xenium.xenium_label(stem: str, path: str, shape: tuple[int, int], resolution: float = 0.2125) → None

This function writes a label image tif file with drawn labels according to cell segmentation polygons from the Xenium output cells.zarr.zip file

Parameters:
  • stem (str) – Prefix for the output image filename.

  • path (str) – Path to the Xenium output directory or cells.zarr.zip file

  • shape (tuple[int, int]) – Output image shape.

  • resolution (float, optional) – Pixel resolution. Defaults to 0.2125.

process_xenium.xenium_to_anndata(path: str, spatial_as_pixel: bool = True, resolution: float = 0.2125, load_clusters: bool = True, load_embeddings: bool = True) → AnnData

Function to create an AnnData object from Xenium output.

Parameters:
  • path (str) – Path to a Xenium output directory

  • spatial_as_pixel (bool, optional) – Boolean indicating whether spatial coordinates should be converted to pixels. Defaults to True.

  • resolution (float, optional) – Pixel resolution. Defaults to 0.2125.

  • load_clusters (bool, optional) – If cluster files should be included in the AnnData object. Defaults to True.

  • load_embeddings (bool, optional) – If embedding coordinates files should be included in the AnnData object. Defaults to True.

Returns:

AnnData object created from the Xenium output data

Return type:

AnnData
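The relationship between spatial_as_pixel and resolution can be illustrated with a small conversion helper. It assumes that resolution is the pixel size in microns per pixel, so that pixel coordinates are obtained by dividing by it; that interpretation and the helper microns_to_pixels are assumptions for this sketch, not taken from the process_xenium source.

```python
def microns_to_pixels(coords, resolution=0.2125):
    """Convert (x, y) coordinates in microns to pixel units."""
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    # Assumption: resolution is expressed in microns per pixel.
    return [(x / resolution, y / resolution) for x, y in coords]


centroids_um = [(21.25, 42.5), (0.0, 2.125)]  # made-up centroids in microns
centroids_px = microns_to_pixels(centroids_um)
print(centroids_px)
```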

process_xenium.xenium_to_zarr(path: str, stem: str, spatial_as_pixel: bool = True, resolution: float = 0.2125, save_h5ad: bool = False, **kwargs) → str

Function to write to Zarr an AnnData object created from Xenium output data

Parameters:
  • path (str) – Path to a Xenium output directory

  • stem (str) – Prefix for the output Zarr filename

  • spatial_as_pixel (bool, optional) – Boolean indicating whether spatial coordinates should be converted to pixels. Defaults to True.

  • resolution (float, optional) – Pixel resolution. Defaults to 0.2125.

  • save_h5ad (bool, optional) – If the AnnData object should also be written to an h5ad file. Defaults to False.

Returns:

Output Zarr filename

Return type:

str

build_config_multimodal.py

Generates a Vitessce View config

build_config_multimodal.build_raster_options(images: dict[str, list[dict[str, Any]]], url: str) → dict[str, Any]

Function that creates the View config’s options for image files

Parameters:
  • images (dict[str, list[dict[str, T.Any]]]) – Dictionary containing, for each image type key (raw and label), a list of dictionaries (one per image of that type) with the corresponding path and metadata for that image.

  • url (str) – URL to prepend to each file in the config file, i.e. the URL of the local or remote server that will serve the files.

Returns:

Options dictionary for View config file

Return type:

dict[str, T.Any]

build_config_multimodal.concat_views(views: list, axis: str = 'v')

Recursively concatenate views

build_config_multimodal.write_json(project: str = '', datasets: dict[str, dict[str]] = {}, extended_features: str | list = [], url: str = '', config_filename_suffix: str = 'config.json', title: str = '', description: str = '', outdir: str = './') → None

This function writes a Vitessce View config JSON file

Parameters:
  • project (str, optional) – Project name. Defaults to “”.

  • datasets (dict[str, dict[str]], optional) – Dictionary of datasets. Defaults to {}. Expected structure:

        {
            dataset_name: {
                “file_paths”: [],
                “images”: {“raw”: [], “label”: []},
                “options”: {},
                “obs_type”: “cell”,
                “is_spatial”: True
            }
        }

  • extended_features (Union[list[str], str], optional) – List of features, or a string naming a single feature, by which the expression matrix was extended and for which var/is_{feature} is present. Defaults to [].

  • url (str, optional) – URL to prepend to each file in the config file, i.e. the URL of the local or remote server that will serve the files. Defaults to “”.

  • config_filename_suffix (str, optional) – Config filename suffix. Defaults to “config.json”.

  • title (str, optional) – Data title to show in the visualization. Defaults to “”.

  • description (str, optional) – Data description to show in the visualization. Defaults to “”.

  • outdir (str, optional) – Directory to which the config file will be written. Defaults to “./”.
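The expected structure of the datasets argument can be written out as a concrete dictionary. In the sketch below the dataset names and file names are made up, and the helper missing_keys is hypothetical; whether every key is strictly required by write_json is not stated in this reference, so the check only compares entries against the documented structure.

```python
from typing import Any

# Two made-up datasets that follow the documented structure.
datasets: dict[str, dict[str, Any]] = {
    "rna": {
        "file_paths": ["sample-rna-anndata.zarr"],
        "images": {"raw": [], "label": []},
        "options": {},
        "obs_type": "cell",
        "is_spatial": True,
    },
    "visium": {
        "file_paths": ["sample-visium-anndata.zarr"],
        "images": {"raw": [], "label": []},
        "options": {},
        "obs_type": "spot",
        "is_spatial": True,
    },
}

EXPECTED_KEYS = {"file_paths", "images", "options", "obs_type", "is_spatial"}


def missing_keys(candidate: dict[str, dict[str, Any]]) -> dict[str, list[str]]:
    """Map each dataset name to the documented keys it lacks, if any."""
    report = {}
    for name, entry in candidate.items():
        missing = sorted(EXPECTED_KEYS - set(entry))
        if missing:
            report[name] = missing
    return report


print(missing_keys(datasets))  # {}
```

Running such a check before calling write_json makes typos in key names visible early.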