Spatial One: Pipeline Parameters

This page lists the parameters available in each SpatialOne module.

flowchart LR
    A["Visium\nExperiment"] -- High Resolution\n Image --> CS("Cell\nSegmentation")
    A -- Spot Transcripts --> CD("Cell\nDeconvolution") & QC("Quality Control")
    CS -- Segmented\nCells --> CA("Cell\nEstimation")
    CD -- Cell\nProportion --> CA
    QC --> Down("Downstream\nAnalysis")
    CA --> Down
    Down --> DA("Descriptive Analytics .") & SS("Spatial Structure .") & Co("Comparative Analysis") & TM("Interactive Visualization .")
    style A fill:#FFF9C4
    style Down fill:#C8E6C9
    style DA fill:#C8E6C9
    style SS fill:#C8E6C9
    style Co fill:#C8E6C9
    style TM fill:#C8E6C9

Parameters Index

Cell Segmentation
- Cellpose
  - Basic Parameters
  - Advanced Parameters
- Hovernet
  - Basic Parameters
  - Advanced Parameters
Cell Deconvolution
- Cell2Location
  - Basic Parameters
  - Advanced Parameters
- CARD
  - Basic Parameters
  - Advanced Parameters
Morphological Clustering
- K-means
  - Basic Parameters
  - Advanced Parameters
- Gaussian Clustering
  - Basic Parameters
Quality Control
- Basic Parameters
- Advanced Parameters
Datamerge
- Basic Parameters
- Advanced Parameters
Spatial Structure Analysis
- Whole Tissue Analysis
  - Basic Parameters
  - Advanced Parameters
- Region Analysis
  - Basic Parameters
  - Advanced Parameters

Cell Segmentation

SpatialOne's cell segmentation module implements Cellpose and Hovernet.

Cellpose

Cellpose is a generalist, deep learning-based cell segmentation algorithm designed to accurately identify and segment cells in microscopy images. SpatialOne uses the nuclei model for its analysis. To set it up, adjust downsample_factor to match the magnification of your images.

Please refer to the official documentation for additional details.

Basic Parameters

Parameter	Description
downsample_factor	Factor to downsample the image. Use a value of 2 for 40X images to match the 20X image training set.
diameter	Expected average cell size in pixels. A value of 0 allows the model to estimate cell size automatically.
batch_size	Number of tiles that will be simultaneously processed.
flow_threshold	Determines the aggressiveness of segmentation. Low values (e.g. 0.4) result in fewer segmented cells with high confidence, while high values (e.g. 0.8) result in more segmented cells with a higher risk of false positives.
n_channels	Number of channels in the image.
channels	Determines which channels will be used to segment the cells. 0,0: all channels; 1,0: blue channel; 2,0: green channel; 3,0: red channel.
overlap	Defines if tiles should overlap at the edges, ensuring cells on the border of a tile are also segmented.
patch_size	Size of the tiles the image will be broken into for analysis.
model_type	Specifies if the model should segment nuclei or cytoplasm.
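The interaction between downsample_factor, patch_size, and overlap can be illustrated with a minimal sketch. It is not SpatialOne's or Cellpose's implementation: the function names are hypothetical, and overlap is treated here as a number of pixels purely for illustration, which is an assumption.

```python
def downsampled_shape(height, width, downsample_factor):
    """Image shape after integer downsampling (e.g. factor 2 takes 40X to 20X)."""
    return height // downsample_factor, width // downsample_factor


def tile_origins(length, patch_size, overlap):
    """Start offsets of tiles covering [0, length) along one axis.

    Hypothetical helper: `overlap` is the number of pixels shared by
    neighbouring tiles (an assumption made for this sketch).
    """
    if not 0 <= overlap < patch_size:
        raise ValueError("overlap must satisfy 0 <= overlap < patch_size")
    if length <= patch_size:
        return [0]  # a single tile already covers the axis
    step = patch_size - overlap
    origins = list(range(0, length - patch_size, step))
    origins.append(length - patch_size)  # last tile sits flush with the edge
    return origins


# A hypothetical 40X image of 2000 x 2048 pixels, downsampled by 2.
height, width = downsampled_shape(2000, 2048, downsample_factor=2)
rows = tile_origins(height, patch_size=512, overlap=64)
cols = tile_origins(width, patch_size=512, overlap=0)
print(f"{len(rows) * len(cols)} tiles of 512 px for a {height} x {width} image")
```

With overlapping tiles, a nucleus cut by one tile border lies fully inside a neighbouring tile, which is the behaviour the overlap parameter is meant to provide.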

Hovernet

Parameter	Description
gpu	Determines whether to use GPU for processing.
help	Provides help or documentation for using Hovernet.
nr_inference_workers	Number of workers for inference processing.
nr_post_proc_workers	Number of workers for post-processing tasks.
tile	Size of the tile used for processing.
wsi	Option for whole slide image processing.
batch_size	Number of tiles that will be simultaneously processed.
model_mode	Hovernet model to use (fast or original).
nr_types	Number of expected nuclear types. Use 5 if uncertain.

Cell Deconvolution

Cell2Location

Basic Parameters

Parameter	Description
atlas_type	Defines the single-cell dataset that will be used as a reference for deconvolution. This parameter is crucial as the deconvolution will consider only cell types present in the atlas. Select a reference atlas according to the organ, tissue type, and species you are analyzing.
mode	Mode of operation for the Cell2Location algorithm.

Advanced Parameters

Parameter	Description
cell_abundance_threshold	If a spot has a low abundance of a given cell type (less than cell_abundance_threshold), that cell type will be excluded from the spot analysis. This parameter helps reduce noise in the output.
retrain_cell_sig	Forces retraining the cell deconvolution model even if a model has been previously created. This will likely slow down your analysis but will ensure the most recent reference data is being used.
sc_batch_key	Column storing the batch information in the single-cell reference dataset.
sc_categorical_covariate_keys	Specifies multiplicative technical effects (e.g., platform, 3' vs 5', donor effect).
sc_cell_count_cutoff	Determines the minimum number of cells in which a gene must be identified in the reference data to be considered in the analysis. Higher values may improve the confidence of the deconvolution but also increase the risk of cells not being identified.
sc_cell_percentage_cutoff2	Determines the minimum percentage of cells in the reference data in which a gene must be identified to be considered in the analysis. Higher values may improve the confidence of the deconvolution but also increase the risk of cells not being identified.
sc_label_key	Column storing the label information in the single-cell reference dataset.
sc_lr	Learning rate used to create a model from the single-cell reference data. Increasing the learning rate may increase the training speed but may also result in unstable models.
sc_max_epochs	Maximum number of epochs used to create a model from the single-cell reference data. Higher numbers may result in a more accurate model but with the risk of overfitting.
sc_nonz_mean_cutoff	Minimum mean expression that a gene must present within cells in the reference dataset. Authors recommend this value to be slightly greater than 1.
sc_use_gpu	Determines if cell deconvolution will run on a GPU.
st_N_cells_per_location	Expected average number of cells present at each spot. Providing an accurate estimate will help the deconvolution method provide a more reliable outcome.
st_detection_alpha	Hyperparameter controlling normalization of within-experiment variation in RNA detection.
st_max_epochs	Maximum number of epochs used to train the deconvolution model on the spatial transcriptomics data. Higher numbers may result in a more accurate model but with the risk of overfitting.
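As a rough illustration of what cell_abundance_threshold does, the sketch below drops cell types whose estimated abundance in a spot falls below the threshold. The data layout (a dictionary per spot), the function name, and the example values are assumptions made for this sketch, not SpatialOne's internal representation.

```python
def apply_abundance_threshold(abundances, cell_abundance_threshold):
    """Keep, for each spot, only cell types at or above the threshold.

    `abundances` maps spot barcode -> {cell type -> estimated abundance}.
    This layout is hypothetical and used only for illustration.
    """
    return {
        spot: {
            cell_type: value
            for cell_type, value in per_type.items()
            if value >= cell_abundance_threshold
        }
        for spot, per_type in abundances.items()
    }


# Made-up abundances for two spots.
example = {
    "spot_1": {"T cell": 0.02, "B cell": 0.60, "Fibroblast": 0.38},
    "spot_2": {"T cell": 0.50, "B cell": 0.01, "Fibroblast": 0.49},
}
filtered = apply_abundance_threshold(example, cell_abundance_threshold=0.05)
for spot, per_type in filtered.items():
    print(spot, sorted(per_type))
```

Raising the threshold removes more low-abundance assignments, which reduces noise at the cost of possibly discarding rare but real cell types.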

CARD

Parameter	Description
ct_select	Vector of cell type names that you are interested in deconvoluting. If no value is provided, all cells in the reference dataset will be considered.
ct_varname	Name of the column in sc_meta that specifies the cell type assignment.
min_count_gene	Defines the minimum number of counts a spatial location needs to be included in the analysis.
min_count_spot	Defines the minimum number of spots a gene should have non-zero expression in to be included in the analysis.
sc_label_key	Column storing the label information in the single-cell reference dataset.
sc_sample_key	Column storing the batch information in the single-cell reference dataset.
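The two count filters can be sketched as follows. Note the naming: min_count_gene filters spatial locations and min_count_spot filters genes, as described in the table. The function name, the order in which the filters are applied, and the use of inclusive thresholds are assumptions made for this sketch; CARD itself may differ in those details.

```python
def filter_counts(counts, genes, min_count_gene, min_count_spot):
    """Filter a genes x spots count matrix given as a list of rows.

    Returns the kept gene names and the kept spot (column) indices.
    Spots are filtered first, then genes (an assumed order).
    """
    n_spots = len(counts[0]) if counts else 0
    # min_count_gene: minimum total counts a spot needs to be kept.
    kept_spots = [
        j for j in range(n_spots)
        if sum(row[j] for row in counts) >= min_count_gene
    ]
    # min_count_spot: minimum number of kept spots with non-zero expression.
    kept_genes = [
        genes[i] for i, row in enumerate(counts)
        if sum(1 for j in kept_spots if row[j] > 0) >= min_count_spot
    ]
    return kept_genes, kept_spots


# Made-up counts: 3 genes (rows) x 3 spots (columns).
counts = [
    [5, 0, 3],
    [0, 0, 1],
    [2, 0, 4],
]
kept_genes, kept_spots = filter_counts(
    counts, ["A", "B", "C"], min_count_gene=3, min_count_spot=2
)
print(kept_genes, kept_spots)
```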

Morphological Clustering

Parameter	Description
n_clusters	Number of groups into which the data will be divided.
spot_clustering	Select this checkbox to improve cell assignment by using clustering information.

Quality Control

Basic Parameters

Parameter	Description
label	String label used to indicate filtered genes (default: "qc_mitochondrial_genes").
list_of_gene_ids	Genes in the list will be filtered as mitochondrial genes.
start_with	Genes starting with the following string will be filtered as mitochondrial genes (default: "-MT").
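A minimal sketch of how list_of_gene_ids and start_with could combine to flag mitochondrial genes is shown below. The function name and the case-sensitive matching are assumptions, and the prefix "MT-" in the usage lines is the conventional human mitochondrial prefix, used here only as an example value.

```python
def flag_mitochondrial(gene_ids, list_of_gene_ids=(), start_with=None):
    """Return one boolean per gene: True if it would be flagged.

    A gene is flagged when it appears in `list_of_gene_ids` or when its
    name starts with `start_with` (ignored if None or empty).
    Hypothetical helper, not SpatialOne's implementation.
    """
    listed = set(list_of_gene_ids)
    return [
        gene in listed or (bool(start_with) and gene.startswith(start_with))
        for gene in gene_ids
    ]


# "GENE_X" is a made-up identifier standing in for an explicitly listed gene.
genes = ["MT-CO1", "ACTB", "MT-ND1", "GENE_X"]
flags = flag_mitochondrial(genes, list_of_gene_ids=["GENE_X"], start_with="MT-")
for gene, flagged in zip(genes, flags):
    print(gene, "qc_mitochondrial_genes" if flagged else "kept")
```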

Datamerge

Basic Parameters

Parameter	Description
n_top_expressed_genes	Number of top expressed genes to include in the reporting. Default is 500.
n_top_variability_genes	Number of top variability genes to include in the reporting. Default is 500.
target_genes	List of specific genes to include in the analysis regardless of their expression levels. Default is an empty list.
annotation_file (optional)	Defines the location of the GeoJSON file containing tissue annotations.

Advanced Parameters

Parameter	Description
cell_index	Column defining the cell index. Default is "cell_ids".
cells_df_cols_omit (optional)	List of columns to omit from the cells_df. Default is ["cell_polygons"].
spot_index	Column defining the spot index. Default is "barcode".
spots_df_cols_omit (optional)	List of columns to omit from spots_df. Default is ["in_tissue", "spot_polygons", "array_col", "array_row"].
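One plausible reading of how n_top_expressed_genes, n_top_variability_genes, and target_genes combine is a union of three gene sets. The sketch below assumes that reading; ranking by mean expression and by population variance, and the function name, are assumptions rather than SpatialOne's documented behaviour.

```python
from statistics import mean, pvariance


def select_genes(expression, n_top_expressed_genes, n_top_variability_genes,
                 target_genes=()):
    """Union of top-expressed, top-variable, and explicitly targeted genes.

    `expression` maps gene name -> list of per-spot expression values
    (a hypothetical layout used only for this sketch).
    """
    by_mean = sorted(expression, key=lambda g: mean(expression[g]), reverse=True)
    by_var = sorted(expression, key=lambda g: pvariance(expression[g]), reverse=True)
    selected = set(by_mean[:n_top_expressed_genes])
    selected |= set(by_var[:n_top_variability_genes])
    # Target genes are kept regardless of expression, if present in the data.
    selected |= {g for g in target_genes if g in expression}
    return sorted(selected)


# Made-up expression values for four genes across three spots.
expression = {
    "A": [10, 10, 10],  # highly expressed, no variability
    "B": [0, 9, 0],     # highly variable
    "C": [1, 1, 1],
    "D": [0, 0, 1],     # low expression, kept only because it is targeted
}
selected = select_genes(expression, 1, 1, target_genes=["D"])
print(selected)
```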

Spatial Structure Analysis

Basic Parameters

Parameter	Description
cell_cooccur	Estimates the probability of cell co-occurrence within a given distance.
cell_counts	Bar plot summarizing cell counts in the slide.
cell_net_chord	Neighborhood Enrichment Analysis - Chord plot. Visualizes co-localization of pairs of cells.
cell_net_matrix	Neighborhood Enrichment Analysis - Z-Score Clustergram. Quantifies co-localization of pairs of cells.
cell_per_spot	Boxplot visualization of cell abundance at spot level.
cell_summary	Table summarizing cell counts in the slide.
diff_exp_annotations	Performs differential expression analysis between annotated regions (if provided).
diff_exp_clusters	Performs differential expression analysis between different spot clusters.
infilt_comparing_cell_types	Analyzes infiltration levels of the cell within and outside the analyzed region.
infilt_comparing_levels	Analyzes infiltration levels of the cell within and outside the analyzed region.
morans_cell_heatmap	Heatmap showing Moran's I statistic at cell level. Moran's I computes cell autocorrelation.
morans_gene_heatmap	Heatmap showing Moran's I statistic at gene level. Moran's I computes gene autocorrelation.
morans_cell_bar	Barplot showing highest Moran's I statistic at cell level. Moran's I computes cell autocorrelation.
morans_gene_bar	Barplot showing highest Moran's I statistic at gene level. Moran's I computes gene autocorrelation.
qc_report	Includes the Space Ranger QC report in the resulting HTML.
spot_avg_gene_counts	Bar plot summarizing average gene expression per spot.
spot_diff_exp	Differential expression analysis between spots.
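For readers unfamiliar with Moran's I, the statistic behind the morans_* reports can be computed directly from its textbook definition. The sketch below is a plain-Python illustration on a toy chain of four spots; it is not the routine SpatialOne calls, and the helper names are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for `values` given a square spatial weight matrix.

    I = (N / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2
    where m is the mean of the values and W the sum of all weights.
    """
    n = len(values)
    m = sum(values) / n
    dev = [v - m for v in values]
    total_weight = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    if total_weight == 0 or denom == 0:
        raise ValueError("Moran's I is undefined for constant values or empty weights")
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / total_weight) * (cross / denom)


def chain_weights(n):
    """Binary adjacency for n spots arranged in a line (toy neighbourhood)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(4)
smooth = morans_i([1, 2, 3, 4], w)         # neighbours have similar values
alternating = morans_i([1, -1, 1, -1], w)  # neighbours have opposite values
print(round(smooth, 3), round(alternating, 3))
```

Positive values indicate that similar values cluster in space, values near zero indicate no spatial pattern, and negative values indicate that neighbouring spots tend to differ.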

Advanced Parameters

Parameter	Description
chord_n_cells	Minimum number of cell types required to generate a chord plot.
diff_fc_thresh	Fold change threshold to use for gene comparative analysis.
diff_pval	Significance level to use for gene comparative analysis.
infilt_levels	Number of levels to consider for infiltration analysis.
moran_pval	Significance level to use for Moran's I.
n_cells_neighbors	Minimum number of cells to build a neighbors NER graph.
n_cols	Number of columns in the grid of Moran plots.
n_genes_bar	Number of top genes to visualize on Moran's I bar plot.
n_genes_exp	Number of top genes to visualize after Moran's analysis.
n_intervals	Distance intervals at which co-occurrence is computed.
n_neighs	Number of neighboring tiles when coord_type is "grid". Number of neighborhoods for non-grid data when coord_type is "generic".
n_perms	Number of permutations for Moran's I. Increasing the number of permutations will increase processing time.
n_rings	Number of rings of neighbors for grid data.
n_splits	Number of splits in which to divide the spatial coordinates for co-occurrence analysis.
net_cutoff	Cutoff for considering an interaction from neighborhood enrichment analysis.
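The way diff_fc_thresh and diff_pval jointly select genes can be sketched as a simple filter. The sketch assumes the fold change is expressed as a log2 value compared in absolute terms and that the p-value comparison is strict; both are assumptions, as are the function name and the input layout.

```python
def significant_genes(results, diff_fc_thresh, diff_pval):
    """Genes passing both the fold-change and the significance thresholds.

    `results` maps gene -> (log2 fold change, p-value). The layout and
    the use of absolute log2 fold change are assumptions of this sketch.
    """
    return [
        gene
        for gene, (log2_fc, pval) in results.items()
        if abs(log2_fc) >= diff_fc_thresh and pval < diff_pval
    ]


# Made-up differential expression results.
results = {
    "G1": (2.0, 0.001),   # large change, significant
    "G2": (0.2, 0.001),   # significant but small change
    "G3": (-1.5, 0.2),    # large change but not significant
    "G4": (-3.0, 0.01),   # strongly down-regulated, significant
}
hits = significant_genes(results, diff_fc_thresh=1.0, diff_pval=0.05)
print(hits)
```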
