Visium HD Spatial Transcriptomics Data Analysis and Visualization (VST-DAVis)

Introduction

VST-DAVis is a user-friendly, browser-based R Shiny application that lets researchers without programming expertise analyze and visualize 10x Genomics Visium HD spatial transcriptomics data. It supports single- and multiple-sample analyses as well as group comparisons. The application offers the following key functional analyses:

1. Single or Multiple Samples Analysis

This section provides various tabs to analyze one or more samples, which can be grouped into up to six groups.

1.1 Stats

Displays the spatial QC, QC plot and cell summary of the uploaded sample(s).

1.2 Sample Groups and QC Filtering

Facilitates spatial QC filtering and QC metric selection for further analysis.

1.3 Normalization and PCA Analysis

Enables sample normalization using multiple methods and generates PCA plots.

1.4 Clustering

Utilizes the Seurat clustering algorithm to group cells into clusters and visualizes them using UMAP, tSNE, and spatial images.

1.5 Marker Identification

Identifies markers for all clusters, a specific cluster, or between clusters and supports the identification of conserved markers.

1.6 Cell Type Prediction

Provides multiple options for cell type identification, including ScType, SingleR, GPTCelltype, or custom user-provided labels.

1.7 Cluster-Based Plots

Visualizes expressed genes in each cluster using Spatial Feature, Dot, Violin, Ridge, and Feature plots.

1.8 Condition-Based Analysis

Identifies expressed genes between two groups, with visualization options including Spatial Feature, Dot, Violin, Ridge, Feature, or Volcano plots.

2. Subclustering

Allows sub-clustering within one or more clusters from single or multiple sample analyses, or of cells chosen by genes of interest through positive or negative selection. Subclustering follows steps similar to those of the primary analysis.

3. Correlation Network Analysis

Uses the genesorteR package to identify the correlation between cell clusters. Provides correlation summary tables and visualizations of correlation matrix and network plots.

4. Gene Ontology (GO) Terms

Uses the clusterProfiler package to identify biological processes, molecular functions, and cellular components for marker genes. Provides GO summary tables and visualizations as Dot, Bar, Net, and UpSet plots.

5. Pathway Analysis

Employs the clusterProfiler and ReactomePA packages to identify pathways in single or multiple clusters, with results displayed as Dot, Bar, Net, and UpSet plots.

6. GSEA Analysis

Performs Gene Set Enrichment Analysis (GSEA) using the fgsea and msigdb packages to identify enriched gene sets. Results are displayed in GSEA plots, Bar plots, and PlotGseaTables.

7. Cell-Cell Communication

Uses the CellChat package to identify signaling communication between clusters, with receptor-ligand interactions visualized as Circular, Chord, Heatmap, Bubble, Bar, Violin, and Spatial plots for the selected interaction.

8. Trajectory and Pseudotime Analysis

Utilizes the Monocle3 package to order clusters in pseudotime and analyze gene function changes over time. Results include trajectory and pseudotime plots, pseudotime spatial plots, bar plots, and functional gene changes in pseudotime.

9. Co-Expression and TF analysis

9.1 Co-Expression Network Analysis

Uses the hdWGCNA package to identify co-expression networks as undirected, weighted gene networks. These are visualized through co-expression networks with modules, soft power plots, module relationship plots, module network plots, module UMAP plots, and module spatial plots.

9.2 Transcription factor regulatory network analysis

Uses the hdWGCNA package to identify transcription factors (TFs) within co-expression modules. These TFs play a key role in regulating gene expression networks in single-cell data. They are visualized through bar plots, network plots, and module UMAP plots.

Outputs and Visualization

VST-DAVis provides publication-quality plots in seven formats: JPG, TIFF, PDF, SVG, BMP, EPS, and PS. Summary tables are also generated in .csv format for easy visualization and download.

Use VST-DAVis online

VST-DAVis is deployed at: https://www.gudalab-rtools.net/VST-DAVis

Launch VST-DAVis using R and GitHub

VST-DAVis is deposited in the GitHub repository: https://github.com/GudaLab/VST-DAVis

R (>= 4.4.3)
RStudio (>= 2024.12.0)
Bioconductor (>= 3.20)
Shiny (>= 1.10.0)

Note: VST-DAVis has been tested with these versions. Using older R versions may cause installation errors. It is recommended to update R before installation.
Once R is open in the command line or in RStudio, users should run the following command in R to install the shiny package.

install.packages('shiny')

library(shiny)

Start the app

Start the R session using RStudio and run these lines:

shiny::runGitHub('VST-DAVis','GudaLab')

Alternatively, download the source code from GitHub and run the following commands in an R session using RStudio:

library(shiny)
runApp('/path/to/the/VST-DAVis-master', launch.browser=TRUE)

Usage

Please refer to our Manual tab.

Developed and maintained by

VST-DAVis was developed by Sankarasubramanian Jagadesan and Babu Guda. We share a passion for developing a user-friendly tool for biologists, particularly those who do not have access to bioinformaticians or programming expertise.

Each sample should be stored in a separate zip file according to the given structure. Users can upload multiple zip files simultaneously to analyze multiple samples.

Example file format

SpaceRanger h5 format and spatial image files
H5 File and spatial image files
SpaceRanger Matrix, Feature, Barcodes and spatial image files
Matrix, Feature, Barcodes and spatial image files

Sample names of uploaded file(s)

Number of cells in the given sample(s)

QC Plot

Spatial Feature QC Plot

Feature-Feature relationships plot

Select number of sample group(s)

Type Group1 name

Type Group2 name

Type Group3 name

Type Group4 name

Type Group5 name

Type Group6 name

Define filtering parameters

Keep the minimum number of nFeature Spatial

Keep the maximum number of nFeature Spatial

Filter cells that have more than this percentage mitochondrial counts

Number of cells after QC

Sample(s) based

Group(s) based

QC plot after filtering

Sample(s) based

Group(s) based

Spatial QC plot after filtering

Bar plots

Sample(s) based

Group(s) based

Normalization method

Integration method

Scale factor

variable genes detection

Number of top variable features

Number of dimensions (PCA)

Dimension reduction heatmap for PCA data

Elbow plot

PCA sample(s) based

PCA group(s) based

Nearest-neighbour graph construction

Number of dimensions

k.param

n.trees

Clustering parameters and integration method

Resolution

Clustering algorithm

Dimension reduction

Plot Options

UMAP t-SNE

UMAP parameters

Number of dimensions

k-nearest-neighbours

min.dist

Show label

t-SNE parameters

Number of dimensions

Show label

UMAP / t-SNE cluster plot

Cluster based count bar plot

UMAP / t-SNE cluster Spatial plot

UMAP / t-SNE condition(s) based plot

Condition(s) based count bar plot

UMAP / t-SNE sample(s) based plot

Sample(s) based count Bar plot

Number of cells in clusters

Number of cells in clusters based on condition(s)

Number of cells in clusters based on sample(s)

Spatial plot highlights each cluster separately, distinguishing individual samples

Clusters split by condition(s)

Clusters split by sample(s)

Clusters split by condition(s) and clusters

Markers identification or Differential expression analysis

Select the analysis type

Gene expression markers parameters

Minimal percentage of cells

log fold change threshold

Statistical test

Return only positive markers

group.var

Identified markers / differentially expressed genes

Conserved Markers genes

Heatmap for top 5 marker genes in cluster(s)

Predict Cell Type

If you are using GPTCelltype, please make sure 'Identify markers in all clusters' was run in the previous step. To use GPTCelltype locally, users need to update their API key by setting Sys.setenv(OPENAI_API_KEY = 'your_openai_API_key') in the global.R file.

Cell type prediction method

Select reference data

Select tissue

DE.method

Select model

Top gene numbers to predict cell type

Dim plots label

Dimplot of annotated clusters

Spatial Dimplot of annotated clusters

ScType scores

SingleR Scores

SingleR score heatmap

SingleR Delta distribution

Select the plot type to display

Please make sure 'Identify markers in all clusters' and the same 'cell prediction method' were run in the previous steps.

No. of features to display

Enter your genes for plotting (e.g., gene names separated by , )

Plot type

group.by

split.by

Spatial feature/ Dot / Violin / Ridge / Feature plot

Top or selected genes, cell counts and proportion

Differential expression analysis between two groups

Parameters to find the DEGs

Minimal percentage of cells

log fold change threshold

Statistical test

Return only positive markers

Parameters for plotting

Plot type

group.by

No. of features to display

Enter your genes for plotting (e.g., gene names separated by , )

Spatial feature / Dot / Violin / Ridge / Feature / Volcano plot

Differentially expressed genes

Number of cells in the sample(s)

QC Stats for selected sub clusters

QC Stats for selected sub clusters spatial image

Please use the same normalization method used in single or multiple samples analysis

Normalization method

Scale factor

variable genes detection

Number of top variable features

Number of dimensions (PCA)

Dimension reduction heatmap for PCA data

Elbow plot

PCA sample(s) based

PCA group(s) based

Nearest-neighbour graph construction

Number of dimensions

k.param

n.trees

Clustering parameters and integration method

Resolution

Clustering algorithm

Dimension reduction

Plot Options

UMAP t-SNE

UMAP parameters

Number of dimensions

k-nearest-neighbours

min.dist

Show label

t-SNE parameters

Number of dimensions

Show label

UMAP / t-SNE cluster plot

Cluster based count bar plot

UMAP / t-SNE cluster Spatial plot

UMAP / t-SNE condition(s) based plot

Condition(s) based count bar plot

UMAP / t-SNE sample(s) based plot

Sample(s) based count Bar plot

Number of cells in clusters

Number of cells in clusters based on condition(s)

Number of cells in clusters based on sample(s)

Spatial plot highlights each cluster separately, distinguishing individual samples

Clusters split by condition(s)

Clusters split by sample(s)

Clusters split by condition(s) and clusters

Markers identification or Differential expression analysis

Select the analysis type

Gene expression markers parameters

Minimal percentage of cells

log fold change threshold

Statistical test

Return only positive markers

group.var

Identified markers / differentially expressed genes

Conserved Markers genes

Heatmap for top 5 marker genes in cluster(s)

Predict Cell Type

If you are using GPTCelltype, please make sure 'Identify markers in all clusters' was run in the previous step.

Cell type prediction method

Select reference data

Select tissue

DE.method

Select model

Top gene numbers to predict cell type

Dim plots label

Dimplot of annotated clusters

Spatial Dimplot of annotated clusters

ScType scores

SingleR Scores

SingleR score heatmap

SingleR Delta distribution

Select the plot type to display

Please make sure 'Identify markers in all clusters' and the same 'cell prediction method' were run in the previous steps.

No. of features to display

Enter your genes for plotting (e.g., gene names separated by , )

Plot type

group.by

split.by

Spatial feature / Dot / Violin / Ridge / Feature plot

Top or selected genes, cell counts and proportion

Differential expression analysis between two groups

Parameters to find the DEGs

Minimal percentage of cells

log fold change threshold

Statistical test

Return only positive markers

Parameters for plotting

Plot type

group.by

No. of features to display

Enter your genes for plotting (e.g., gene names separated by , )

Spatial feature / Dot / Violin / Ridge / Feature / Volcano plot

Differentially expressed genes

To begin this analysis, please complete Single or Multiple samples or subclustering analysis until Cell Type Prediction and Marker Identification step.

Cell cluster correlation network analysis

Select the input data and celltype method for analysis

Input data

Select the celltype method

Correlation method

Cluster-based correlation matrix plot

Cluster-based Correlation Network plot

Cluster-based correlation table

To begin this analysis, please complete Single or Multiple samples or subclustering analysis until Cell Type Prediction and Marker Identification step.

Select the input data and cluster(s) for analysis

Input data

Enter your genes (eg: gene names separated by , )

Select the celltype method

p_val_adj

GO term parameters

Organism

Ontology

pAdjustMethod

pvalueCutoff

qvalueCutoff

Minimal size of genes

Maximal size of genes

Plot type

No. of category to plot

GO term plot

Summary table

To begin this analysis, please complete Single or Multiple samples or subclustering analysis until Cell Type Prediction and Marker Identification step.

Select the input data and cluster(s) for analysis

Pathway analysis type

Input data

Enter your genes (eg: gene names separated by , )

Select the celltype method

p_val_adj

Pathway parameters

Organism

pAdjustMethod

pvalueCutoff

qvalueCutoff

Minimal size of genes

Maximal size of genes

Plot type

No. of category to plot

Pathway plot

Summary table

To begin this analysis, please complete Single or Multiple samples or subclustering analysis until Cell Type Prediction and Marker Identification step.

Select the input data and cluster(s) for analysis

Input data

Select the celltype method

p_val_adj

GSEA parameters

Organism

Category (from MSigDB)

ScoreType

Minimal size of genes

Maximal size of genes

Number of permutations

Plot type

No. of significance to plot

GSEA plot

Summary table

To begin this analysis, please complete Single or Multiple samples or subclustering analysis until Cell Type Prediction and Marker Identification step.

Select the input data and celltype method for analysis

Input data

Select the celltype method

Cell-cell communication parameters (CellChat)

Organism

Threshold of the percent of cells expressed

Threshold of Log Fold Change

Threshold of p-values

Methods for computing the average gene expression per cell group

distance.use

scale.distance

The maximum interaction/diffusion length of ligands (Unit: microns)

contact.dependent

Center-to-center distance (unit: microns)

Minimum number of cells required in each cell group for cell-cell communication

Communication pattern k-value

Show label

Interactions plot with counts

Interactions plot with weights/strength

Interaction heatmap

Incoming and outgoing signaling patterns

Incoming and Outgoing communication pattern of target and secreting cells

Interaction table

Show all the significant interactions associated with certain signaling pathways

Show label

Interactions plot (Spatial)

Interactions plot (Circle)

Interactions plot (Chord)

Interaction heatmap

Hierarchy plot

Bubble plot

Network analysis contribution bar plot

Gene expression plot

Interaction table

To begin this analysis, please complete Single or Multiple samples or subclustering analysis until Cell Type Prediction and Marker Identification step.

Select the input data and annotation method

Input data

Select the celltype method

Parameters to Learn Trajectory

Please make sure you have used UMAP in the clustering steps

use_partition

close_loop

label_groups_by_cluster

label_branch_points

label_roots

label_leaves

Trajectory plot

Order cells in pseudotime

label_groups_by_cluster

label_branch_points

label_roots

label_leaves

Cells plotted in pseudotime

Cells ordered by Seurat cluster and Monocle3 pseudotime

Find genes that change function during pseudotime

neighbor_graph

label_groups_by_cluster

Pseudotime plot

Pseudotime plot with spatial images

List of genes that change function during pseudotime

Plot the top or user listed genes to see the changes in pseudotime

No. of genes to display

Enter your genes for plotting (e.g., gene names separated by , )

Pseudotime plot for the top selected genes

Pseudotime plot for the top selected genes with Spatial images

Co-expression network analysis
Transcription factor regulatory network analysis

To begin this analysis, please complete Single or Multiple samples or subclustering analysis until Cell Type Prediction step.

Co-expression network analysis using hdWGCNA

Select the input data and cluster(s) for analysis

Input data

Select the celltype method

select the reduction type

Construct metacells

Nearest-neighbors parameter (k)

Minimum number of cells in a particular grouping to construct metacells

Maximum number of shared cells between two metacells

Maximum target number of metacells to construct

Select soft-power

Network Type

Module eigengenes and connectivity

Scale model

Harmonized module eigengenes

Show top N hub genes

Module based UMAP Plot

No. of hub genes to label in each module

Show edges between genes in different modules (grey edges)

UMAP plot to check that the loaded data is correct

Soft power threshold plots

Co-expression network plot

Module ranked by eigengene-based connectivity kME

Module feature plots

Module feature plots with Spatial image

Module correlogram plot

Module with Seurat’s dot plot

Individual module network plots

UMAP plot for co-expression networks

Soft-power threshold table

Module assignment table

Top N hub genes

Transcription factor regulatory network analysis

Organism

Identify TFs in promoter regions (uses JASPAR 2020 database, motif scan and XGBoost)

max_depth

eta

alpha

Define TF Regulons

Threshold for regulatory score

The number of top TFs to keep for each gene

Calculate regulon expression signatures

Positive regulon score threshold

Negative regulon score threshold

Module regulatory network plot (Positive)

Module regulatory network plot (Negative)

Module regulatory network plot (Both)

Module regulatory network plot (Module UMAP)

TF network table

Select a TF of interest

Bar plot parameter

Number of top and bottom target genes

Network plot parameter

Attribute to color the network edges

Number of layers to extend the TF network

Feature plot of selected TF

Feature plot of selected TF with spatial image

Top target genes within TF regulons

TF network plot (Positive)

TF network plot (Negative)

TF network plot (Both)

Visium HD Spatial Transcriptomics Data Analysis and Visualization (VST-DAVis)

This section will introduce how to prepare input files:

Supported Input Formats:

H5 file and spatial image folder (Space Ranger output), zipped into a single archive for each sample

Space Ranger file: filtered_feature_bc_matrix.h5.

Space Ranger Matrix Files

Space Ranger files: matrix.mtx.gz, features.tsv.gz, barcodes.tsv.gz, and the spatial image folder, zipped into a single archive for each sample.
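
Before uploading, it can help to confirm that each archive contains the files for one of the two supported formats. The sketch below is only an illustration: the helper name classify_sample_files, the folder layout, and the example file names inside the spatial folder are assumptions, and the matrix file names follow the standard Space Ranger output names (matrix.mtx.gz, features.tsv.gz, barcodes.tsv.gz). It is not code taken from VST-DAVis.

```r
# Hypothetical helper (not part of VST-DAVis): classify the file listing of one
# sample archive as "h5", "matrix", or "unknown".
classify_sample_files <- function(paths) {
  files <- basename(paths)
  # Assumption: the spatial images sit in a folder named "spatial"
  has_spatial <- any(grepl("(^|/)spatial/", paths))
  has_h5 <- "filtered_feature_bc_matrix.h5" %in% files
  has_mtx <- all(c("matrix.mtx.gz", "features.tsv.gz", "barcodes.tsv.gz") %in% files)
  if (!has_spatial) return("unknown")
  if (has_h5) return("h5")
  if (has_mtx) return("matrix")
  "unknown"
}

# Illustrative listings; a real one could be read with
# utils::unzip("sample1.zip", list = TRUE)$Name
h5_sample <- c("sample1/filtered_feature_bc_matrix.h5",
               "sample1/spatial/tissue_positions.parquet",
               "sample1/spatial/scalefactors_json.json")
mtx_sample <- c("sample2/matrix.mtx.gz",
                "sample2/features.tsv.gz",
                "sample2/barcodes.tsv.gz",
                "sample2/spatial/tissue_positions.parquet")

classify_sample_files(h5_sample)
classify_sample_files(mtx_sample)
```

A listing without a spatial folder is reported as "unknown", since both formats require the spatial image files.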

Data Size and Handling:

The tool can handle input data up to 3 GB in the specified formats.
Supports analysis of single or multiple samples, including up to six sample groups.
After data upload, users can proceed with the analysis through a step-by-step workflow for the 1st Module, with the 'Next Step' button guiding users through each tab in the process.
Once the single or multiple samples analysis is complete, users can run the remaining modules as needed; no further fixed sequence of steps is required.

Output and Visualizations:

High-Quality Plot Download: Users can download plots in seven formats: JPG, TIFF, PDF, SVG, BMP, EPS, and PS. However, a few specific plots, such as those requiring exceptionally high detail or complex rendering (e.g., network graphs or high-resolution heatmaps), are only available as PDF files to preserve their quality and detail.
Summary Tables: Tables are displayed using the DT package. Users can visualize up to 100 rows (default is 10) and download the entire table as a CSV file.
Download Seurat Object: In single or multiple sample analyses, users can download the processed results as an RDS file (Seurat Object).

Example Datasets:

To ensure seamless analysis and reproducibility, VST-DAVis includes one reference dataset for each input format, sourced from NCBI, which has been pre-tested with the tool. These datasets allow users to explore the tool's functionalities and understand the analysis workflow effectively.

H5 File: GSE230207
Matrix Files: GSE244014
Example data to test the tool (TME_cold_vs_TME_hot_vs_TME_IME_vs_TME_IMS) from: GSE230207

Step-by-Step Approach for User Interaction

1. Single or Multiple samples analysis

1.1 Stats

Upload the mandatory files in zip format, as shown in the figure above:
Execution:
- Click the Submit button to run the analysis based on selected parameters (Fig. 1.1a).
Output:
- QC plots: Quality metrics before filtering (Fig. 1.1b).
- Spatial Feature QC plots: Quality metrics in spatial image (Fig. 1.1c).
- Feature-Feature Relationships Plot (Fig. 1.1d)
- Sample(s) cell counts table (Fig. 1.1e)

1.2. Sample Groups and QC Filtering

Assign sample(s) group(s):
Choose the number of group(s) based on sample grouping
- If only one sample: Select 1 group.
- For multiple samples in a single condition: Select 1 group.
- For multiple samples in different conditions: Select up to 6 groups.
Define Filtering Parameters:
- Set thresholds to remove low-quality cells and genes, and optionally filter out cells with high mitochondrial gene content.
Execution:
- Update Filtered Data: Click to apply filters and update the dataset (Fig. 1.2a).
Outputs after Filtering:
- QC metrics, spatial, and bar plots (Fig. 1.2b-f)
- Summary Table: Filtered cell count data (Fig. 1.2g,h)
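
The filtering step above can be pictured as a simple threshold test on per-cell QC metrics. The following is a minimal base R sketch on a toy data frame, not the application's own code: the column names nFeature_Spatial and percent.mt, the threshold values, and the inclusive comparisons are all assumptions made for illustration.

```r
# Toy per-cell QC metrics; in a Seurat object these would be metadata columns.
meta <- data.frame(
  cell = paste0("cell", 1:6),
  nFeature_Spatial = c(50, 300, 1200, 8000, 450, 20),
  percent.mt = c(2, 30, 5, 4, 12, 1)
)

# Illustrative thresholds (assumptions, not VST-DAVis defaults)
min_features <- 100     # "Keep the minimum number of nFeature Spatial"
max_features <- 5000    # "Keep the maximum number of nFeature Spatial"
max_percent_mt <- 15    # mitochondrial percentage cutoff

# A cell is kept only if it passes all three filters
keep <- meta$nFeature_Spatial >= min_features &
  meta$nFeature_Spatial <= max_features &
  meta$percent.mt <= max_percent_mt

filtered <- meta[keep, ]
filtered$cell
```

With these toy values, only cell3 and cell5 pass all three filters; the others fail on feature count or mitochondrial percentage.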

1.3. Normalization and PCA Analysis

Normalization Methods:
- LogNormalize: Adjusts for sequencing depth or read count differences.
  - Additional options: Scale Factor, Variable Genes Detection methods (vst, mvp, disp), Top Variable Features to use.
  - Select integration method(s): CCA or RPCA. These methods will allow users to handle dataset complexities and integrate data from multiple samples.
- SCT (SCTransform): Uses regularized negative binomial regression to normalize counts and stabilize variance before clustering and differential expression.
PCA Settings:
- Choose the PCA Dimensions (typically between 1-50) for analysis.
Execution:
- Click Submit button to start the analysis (Fig. 1.3a).
Outputs:
- PCA Heatmap (Fig. 1.3b)
- Elbow Plot (Fig. 1.3c)
- PCA Plot (sample-wise or group-wise) (Fig. 1.3d,e)
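
To make the LogNormalize option and the elbow plot concrete, here is a small base R sketch on simulated counts. It mirrors the LogNormalize definition (per-cell counts divided by the cell total, multiplied by the scale factor, then log1p-transformed) and uses prcomp as a stand-in for the PCA step. The simulated matrix and the scale factor of 10000 are assumptions, and the sketch skips variable-feature selection and scaling, so it is a simplification rather than the application's pipeline.

```r
set.seed(42)
# Simulated gene-by-cell count matrix (20 genes, 8 cells)
counts <- matrix(rpois(20 * 8, lambda = 5), nrow = 20,
                 dimnames = list(paste0("gene", 1:20), paste0("cell", 1:8)))

scale_factor <- 10000  # illustrative value for the "Scale factor" input

# LogNormalize: divide by per-cell total, multiply by scale factor, log1p
norm <- log1p(sweep(counts, 2, colSums(counts), "/") * scale_factor)

# PCA with cells as observations (rows) and genes as variables (columns)
pca <- prcomp(t(norm), center = TRUE, scale. = FALSE)

# Fraction of variance per component: the quantity an elbow plot summarizes
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)
```

The point where var_explained flattens out is the usual guide for choosing the number of PCA dimensions to carry into clustering.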

1.4. Clustering

Clustering Step:
- Find Neighbors: The user selects the dimensions to use (PCA, integrated dimensions, etc.) and the number of k-nearest neighbors.
- Clustering Algorithm: Select the Louvain, SLM, or Leiden algorithm for clustering.
- Resolution Control: The user can adjust the resolution parameter (0.1 to 1) to control the granularity of clusters.
Dimension Reduction:
- Choose between UMAP and t-SNE for dimensionality reduction.
  - For UMAP: Users can adjust parameters such as min.dist, k-nearest-neighbours, and the number of dimensions.
  - For t-SNE: Users can adjust the number of dimensions.
Execution:
- Click Submit button to start the analysis (Fig. 1.4a).
Visualize and Compare:
- Display UMAP or t-SNE plots with clustering labels and sample/condition overlays (Fig. 1.4b,d,f).
- Spatial images with the clustering labels for each sample, and with each cluster highlighted in yellow (Fig. 1.4j).
- UMAP plot split by condition (Fig. 1.4i).
- Bar charts (Fig. 1.4c,e,h) and tables show cell counts per cluster and per sample/condition (Fig. 1.4k-m).
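
For readers who want to see how the clustering inputs relate to Seurat, the sketch below wraps the corresponding Seurat calls in one function. It is an assumption about how the options map onto Seurat arguments, not the application's source: the wrapper name cluster_sketch and its default values are illustrative, and running it requires the Seurat package and an object that already has a PCA reduction.

```r
# Sketch only: defining this function does not require Seurat, but calling it
# does, together with a Seurat object that has a computed "pca" reduction.
cluster_sketch <- function(obj, dims = 1:30, k_param = 20, n_trees = 50,
                           resolution = 0.5, algorithm = 1,
                           n_neighbors = 30, min_dist = 0.3,
                           reduction = "pca") {
  # Nearest-neighbour graph construction (dimensions, k.param, n.trees)
  obj <- Seurat::FindNeighbors(obj, reduction = reduction, dims = dims,
                               k.param = k_param, n.trees = n_trees)
  # Seurat algorithm codes: 1 = Louvain, 2 = Louvain with multilevel
  # refinement, 3 = SLM, 4 = Leiden
  obj <- Seurat::FindClusters(obj, resolution = resolution,
                              algorithm = algorithm)
  # UMAP embedding (dimensions, k-nearest-neighbours, min.dist)
  obj <- Seurat::RunUMAP(obj, reduction = reduction, dims = dims,
                         n.neighbors = n_neighbors, min.dist = min_dist)
  obj
}
```

A call such as cluster_sketch(obj, resolution = 0.3) would give coarser clusters than resolution = 1; for integrated data the reduction argument would name the integrated reduction instead of "pca".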

1.5. Marker Identification

Identify markers in all clusters (FindAllMarkers):
- Customizable parameters: (Fig. 1.5a)
  - Minimum cell percentage (min.pct) to specify the minimum fraction of cells in which a gene is expressed.
  - Log fold-change threshold (logfc.threshold) to filter markers based on expression magnitude.
  - Statistical test options (test.use), including Wilcoxon rank sum (wilcox), the limma implementation of the Wilcoxon rank sum test (wilcox_limma), likelihood-ratio test (bimod), ROC analysis (roc), Student's t-test (t), logistic regression (LR), and MAST.
  - Positive markers only (only.pos), with options for yes or no, to focus on upregulated genes in the target cluster.
  - When SCTransform is used for normalization, the tool runs PrepSCTFindMarkers, which adjusts the SCT assay before differential testing so that FindMarkers and FindAllMarkers results are more reliable.
- Output: Heatmap of the top 5 genes per cluster (Fig. 1.5b), helping users visualize the distinguishing genes for each cluster and Summary table of markers or expressed genes (Fig. 1.5c).
Marker identification in one specific cluster or between two clusters (FindMarkers):
- Identifies markers for one cluster against another or against all other clusters.
- Includes all the customizable parameters noted above, enabling targeted cluster comparison with refined criteria.
- Output: A table format displaying the expressed genes for the specified clusters, ideal for in-depth comparisons.
Conserved marker identification for one vs. all cluster or between two clusters (FindConservedMarkers):
- Finds markers conserved across groups (e.g., conditions) for a cluster, or conserved markers between two specific clusters.
- Utilizes the same customizable parameters for consistency across comparisons.
- Output: A table format with expressed genes, providing insights into markers consistently expressed across groups or clusters.
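
The "top 5 genes per cluster" selection behind the heatmap can be sketched in base R on a toy marker table. The column names follow the usual FindAllMarkers output (cluster, gene, avg_log2FC, p_val_adj); the simulated values and the choice to rank by avg_log2FC are assumptions, since the ranking criterion used by the tool is not stated here.

```r
set.seed(7)
# Toy marker table with two clusters and eight candidate genes each
markers <- data.frame(
  cluster = rep(c("0", "1"), each = 8),
  gene = paste0("gene", 1:16),
  avg_log2FC = round(runif(16, 0.1, 3), 2),
  p_val_adj = runif(16, 0, 0.1)
)

# Keep the n genes with the largest avg_log2FC within each cluster
top_n_per_cluster <- function(df, n = 5) {
  parts <- lapply(split(df, df$cluster), function(d) {
    d <- d[order(-d$avg_log2FC), ]
    head(d, n)
  })
  do.call(rbind, parts)
}

top5 <- top_n_per_cluster(markers, 5)
table(top5$cluster)
```

The resulting gene list is the kind of input a heatmap of cluster-distinguishing genes would take.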

1.6. Cell Type Prediction

ScType:

Predefined Tissue Types: Users can select from 15 tissue types, including: Adrenal, Brain, Eye, Heart, Immune, Intestine, Kidney, Liver, Lung, Muscle, Pancreas, Placenta, Spleen, Stomach, Thymus.
Tissue Classification: Automatically classifies the cells based on the selected tissue type.

SingleR:

Reference Datasets: Users can use reference datasets such as: Human Primary Cell Atlas, Blueprint/ENCODE, Mouse RNA-seq, Immunological Genome Project, Database of Immune Cell Expression/eQTLs/Epigenomics, Novershtern Hematopoietic data, Monaco immune data.
Prediction: Predicts the cell types based on these well-known reference datasets.

GPTCelltype:

GPT Models: Utilizes various GPT models, including: gpt-4.5-preview, GPT-4, GPT-4-turbo, GPT-4o-mini, GPT-4o, ChatGPT-4o-latest, and GPT-3.5-turbo.
Gene Requirements: Requires a minimum number of top genes for accurate prediction.
Availability: Available via the web platform. To use it locally, users need to update their API key by setting Sys.setenv(OPENAI_API_KEY = 'your_openai_API_key') in the global.R file.
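
As a minimal sketch of the local setup, the lines below show the Sys.setenv call mentioned above together with an optional check that the key is no longer the placeholder. The helper name api_key_is_set is hypothetical and is not part of VST-DAVis or GPTCelltype.

```r
# In global.R: replace the placeholder with your own key
# (and keep the real key out of version control).
Sys.setenv(OPENAI_API_KEY = "your_openai_API_key")

# Hypothetical sanity check: TRUE only when a non-placeholder key is set
api_key_is_set <- function() {
  key <- Sys.getenv("OPENAI_API_KEY", unset = "")
  nzchar(key) && key != "your_openai_API_key"
}

api_key_is_set()  # FALSE while the placeholder is still in place
```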

Own Cell Labels:

User-Defined Labels: Users can manually input their own cell type labels for each cluster.
Cluster Grouping: If multiple clusters need the same label, users should provide the same label name for those clusters.

UMAP/t-SNE Labels:
- Label display options: Users can choose to show or hide cell type labels in the UMAP or t-SNE plots.
Execution:
- Click Detect cell type button to start the analysis (Fig. 1.6a).
Output:

Plot: Generates an image plot showing the predicted cell types (Fig. 1.6b, d).
Spatial Plot: Generates a spatial plot with the predicted cell types (Fig. 1.6c).
Summary Table: Provides a summary table with the predicted cell types and associated scores (Fig. 1.6e).

1.7. Cluster-based Plots

Gene Selection:
- Top Genes: Users can select the top features or genes (from 2 to 10).
- Custom Genes: Users may also input custom gene names by selecting them from a drop-down menu (list of genes) and entering the desired gene names as a comma-separated list.
Plot Types:
- Multiple visualization formats are available, including Spatial Plot, Dot Plot, Violin Plot, Ridge Plot, and Feature Plot. For Dot Plot, Violin Plot, and Ridge Plot, users can adjust parameters to visualize the plots for either all Seurat clusters or selected specific clusters.
Grouping and Splitting:
- Group by: Users can organize the data by Seurat clusters or labels generated from previous cell type prediction steps.
- Split by: If multiple samples are present, plots can be split by condition or sample to compare expression patterns across groups.
Execution:
- Click Generate plots button to start the analysis (Fig. 1.7a).
Output:
- Plot: The user receives the chosen plot format (Spatial Plot, violin plot, dot plot, feature plot, or ridge plot) (Fig. 1.7b-f).
- Summary Tables: The tool generates tables showing marker gene cell counts and cell proportions, providing an additional layer of quantitative insight (Fig. 1.7g).

1.8. Condition-based Analysis

Group Selection:
- Users can compare gene expression between two conditions by selecting one group per dropdown menu.
Customizable Parameters:
- Minimum Cell Percentage (min.pct): Sets the minimum fraction of cells in which a gene must be expressed.
- Log Fold-Change Threshold (logfc.threshold): Filters markers by expression magnitude.
- Statistical Tests (test.use): Users can choose from various methods, including Wilcoxon rank sum, the limma implementation of the Wilcoxon rank sum test, likelihood-ratio test (bimod), ROC, t-test, logistic regression (LR), and MAST.
- Positive Markers Only (only.pos): Option to display only upregulated genes in the target cluster.
Visualization Options:
- Multiple formats are available, including: Spatial Plot, Dot Plot, Violin Plot, Ridge Plot, Feature Plot, Volcano Plot
Grouping:
- Group By: Users can group data by Seurat clusters or predicted cell type labels.
- Number of Features: Allows display of a specific number of up- and down-regulated genes (e.g., 15). Users may also input custom gene names by selecting them from a drop-down menu (list of genes).
Execution:
- Click Submit button to start the analysis (Fig. 1.8a).
Output:
- Plot: The user receives the chosen plot type, providing a visual comparison (Fig. 1.8b-g).
- Summary Tables: The table contains the differentially expressed genes between the selected groups (Fig. 1.8h).
- This setup enables users to conduct detailed comparisons between conditions, facilitating insights into differential gene expression and cellular responses.

2. Subclustering

In VST-DAVis, users can further explore specific clusters of interest by performing subclustering analysis. This feature allows for a more granular examination of cell populations within one or multiple clusters, based on the user’s selection. Outputs similar to those described above are generated for the selected cluster(s) or cell type(s).

2.1. Cluster Selection:

Users can choose one or multiple clusters for subclustering.
Clusters can be selected based on Seurat clusters or previously predicted annotation labels.
Users can select genes of interest to extract cells for reclustering (positive selection); for example, FCN1 or multiple genes like FCN1,PSAP. When specifying multiple genes, separate each gene name with a comma.
Users can also exclude cells that express given genes and perform the analysis using the remaining cells (negative selection); for example, FCN1 or multiple genes like FCN1,PSAP. When specifying multiple genes, separate each gene name with a comma.
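
Positive and negative selection can be illustrated with a toy expression matrix in base R. The helper names parse_genes and select_cells are hypothetical, and two details are assumptions: a cell counts as expressing a gene when its value is above zero, and with several genes a cell is selected if it expresses any of them. The tool may define these differently.

```r
# Toy gene-by-cell expression matrix
expr <- matrix(c(3, 0, 0, 2,
                 0, 0, 1, 4),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("FCN1", "PSAP"), paste0("cell", 1:4)))

# Split a comma-separated gene string such as "FCN1,PSAP"
parse_genes <- function(x) trimws(strsplit(x, ",")[[1]])

# Return cell names kept by positive or negative selection
select_cells <- function(expr, gene_string, mode = c("positive", "negative")) {
  mode <- match.arg(mode)
  genes <- intersect(parse_genes(gene_string), rownames(expr))
  # Assumption: a cell "expresses" the gene set if any listed gene is > 0
  expressed <- colSums(expr[genes, , drop = FALSE] > 0) > 0
  if (mode == "positive") colnames(expr)[expressed] else colnames(expr)[!expressed]
}

select_cells(expr, "FCN1,PSAP", "positive")  # cells expressing FCN1 or PSAP
select_cells(expr, "FCN1", "negative")       # cells not expressing FCN1
```

In this toy matrix, positive selection on FCN1,PSAP keeps cell1, cell3, and cell4, while negative selection on FCN1 keeps cell2 and cell3.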

2.2. Subclustering Analysis Steps:

The subclustering process mirrors the main workflow, with dedicated tabs for each stage, allowing users to perform the following analyses on the selected cluster(s):
- Cell Stats: Overview of cell metrics within the selected clusters, including minimum gene and cell expression thresholds.
- Normalization and PCA Analysis: Options to normalize and visualize the PCA data for specific cluster(s), using the same normalization method as in the single or multiple samples analysis.
- Clustering: Allows users to re-cluster cells within the subclusters, providing insights into finer subpopulations.
- Marker Identification: Users can identify markers specific to subclusters, with options to customize parameters for marker detection.
- Cell Type Prediction: Provides options to predict cell types within the selected subclusters using ScType, SingleR, GPTCelltype, or custom labels.
- Cluster-Based Plots: Users can visualize gene expression within subclusters through Dot, Violin, Ridge, or Feature plots.
- Condition-Based Analysis: Enables differential expression comparisons within subclusters, providing insights into condition-specific gene expression patterns.

3. Correlation Network Analysis

VST-DAVis includes Cluster-Based Correlation Analysis using the genesorteR package. This feature helps users explore relationships and interactions among genes within specific clusters by calculating pairwise correlations.

Prerequisites:
- Correlation Network Analysis becomes available after completing single or multiple samples analysis or subclustering analysis up to cell type prediction.
- Users can choose to conduct analysis on: Seurat clusters or predicted cell type labels from single, multiple, or subcluster analyses.
Correlation Methods:
- Pearson
- Spearman
- Kendall
Execution:
- Click the Cluster correlation network button to run the analysis based on selected parameters (Fig. 3a).
Output:
- Correlation Heatmap: Displays the correlation values between genes within clusters in a matrix format (Fig. 3b).
- Correlation Network Plot: Depicts the relationships between genes as a network, highlighting strongly correlated pairs (Fig. 3c).
- Summary Table: Provides the complete correlation matrix for detailed analysis (Fig. 3d).

This analysis provides a deeper understanding of gene co-expression and interaction patterns within clusters, aiding in the identification of significant biological relationships.
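
To show how the three correlation methods differ, here is a minimal Python sketch (not the app's R code) that computes Pearson, Spearman, and Kendall coefficients for two toy expression profiles. The vectors are hypothetical, and the Kendall function is the simple tau-a form, which agrees with the tie-corrected tau-b only when there are no ties.

```python
import math


def pearson(x, y):
    """Pearson correlation: linear association between two vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-a: (concordant - discordant pairs) / all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


# Hypothetical profiles: monotonic but not linear.
profile_a = [1.0, 2.0, 3.0, 4.0, 5.0]
profile_b = [1.0, 4.0, 9.0, 16.0, 20.0]
r_pearson = pearson(profile_a, profile_b)
r_spearman = spearman(profile_a, profile_b)
r_kendall = kendall(profile_a, profile_b)
```

Because the toy relationship is monotonic but not linear, the rank-based Spearman and Kendall coefficients reach 1 while the Pearson coefficient stays slightly below 1.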

4. GO Term Analysis

VST-DAVis provides integrated Gene Ontology (GO) term analysis using the clusterProfiler package, enabling users to explore biological functions, molecular mechanisms, and cellular components related to gene expression patterns in single, multiple, or subcluster analyses. Users can conduct GO analysis as follows.

Prerequisites:
- GO analysis becomes available after completing single or multiple samples analysis or subclustering analysis up to cell type prediction.
Input Options:
Users can choose to conduct GO analysis on:
- Seurat clusters or predicted cell type labels from single, multiple, or subcluster analyses.
- Users can use one or multiple clusters at a time, with an adjustable parameter (p_val_adj < 0.05) for significant results.
- A custom list of genes: Users can manually enter gene names (comma-separated) to investigate GO terms for genes of specific interest.
Organisms Supported for GO Term Mapping:
VST-DAVis supports GO analysis for five organisms, mapping gene IDs to gene symbols:
- Human: org.Hs.eg.db
- Mouse: org.Mm.eg.db
- Rat: org.Rn.eg.db
- Pig: org.Ss.eg.db
- Rhesus: org.Mmu.eg.db
GO Term Analysis Parameters:
Ontology Method: Users can choose to focus on specific biological aspects or all three:
- Biological Process (BP)
- Molecular Function (MF)
- Cellular Component (CC)
- All: To analyze across all three categories.
Adjustable Parameters:
- pAdjustMethod: Select the method to adjust for multiple testing.
- pvalueCutoff: Set a cutoff for p-values.
- qvalueCutoff: Define a q-value threshold for significance.
- Minimum Size of Genes: Minimum number of genes required in a GO term.
- Maximum Size of Genes: Maximum number of genes in a GO term.
- Plot Type: Choose a visualization format (Dot Plot, Bar Plot, Network Plot, or UpSet Plot).
- Number of Categories to Plot: Select the number of categories to display (1 to 50).
Execution:
- Click the GO Term button to run the analysis based on selected parameters (Fig. 4a).
Output:
- Plots: Dot plot, Bar plot, UpSet plot and Network plot for the selected ontology categories (Fig. 4b-e).
- Summary Table: A downloadable table summarizing the GO terms, adjusted p-values, and other relevant metrics, allowing users to interpret and visualize biological insights (Fig. 4f).

This GO term analysis feature in VST-DAVis provides users with an accessible, visually informative, and comprehensive view of gene functionality across clusters and conditions, enabling enhanced biological interpretation of VST-DAVis data.
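
As background on what an over-representation test of this kind measures, the following Python sketch (an illustration, not the app's R code) computes a hypergeometric upper-tail p-value for one GO term. All counts are made up, and the function name is hypothetical.

```python
from math import comb


def enrichment_p_value(hits, list_size, term_size, background_size):
    """P(X >= hits) when `list_size` genes are drawn from a background of
    `background_size` genes, of which `term_size` are annotated to the term."""
    upper = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 3 of 5 input genes fall in a term that covers
# 10 of 100 background genes.
p = enrichment_p_value(hits=3, list_size=5, term_size=10, background_size=100)
```

With these toy numbers the p-value is about 0.0066: finding three or more term genes among five by chance is unlikely, so the term would count as over-represented before multiple-testing correction.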

5. Pathway Analysis

VST-DAVis offers pathway analysis through KEGG and Reactome databases using the clusterProfiler and ReactomePA packages. Users can gain insights into biological pathways associated with specific gene expression profiles from single, multiple, or subcluster analyses.

Prerequisites:
- Pathway analysis becomes available after completing single or multiple samples analysis or subclustering analysis up to cell type prediction.
Input Options:
Pathway analysis can be performed on:
- Seurat clusters or predicted cell type labels from single, multiple, or subcluster analyses.
- One or multiple clusters simultaneously, with results filtered by an adjusted p-value (p_val_adj < 0.05).
- A custom list of genes: Users can input specific gene names (comma-separated) to focus on pathways for genes of interest.
Organisms Supported for Pathway Mapping:
VST-DAVis enables pathway mapping for multiple organisms:
- KEGG Pathways: Supports human (org.Hs.eg.db), mouse (org.Mm.eg.db), and rat (org.Rn.eg.db) for mapping gene IDs to symbols.
- Reactome Pathways: Available for human, mouse, and rat.
Pathway Analysis Parameters:
- pAdjustMethod: Choose a method for multiple testing correction.
- pvalueCutoff: Set a threshold for p-values.
- qvalueCutoff: Define a q-value cutoff for pathway significance.
- Minimum Size of Genes: Minimum gene count per pathway.
- Maximum Size of Genes: Maximum gene count per pathway.
- Plot Type: Choose a visualization format (Dot Plot, Bar Plot, Network Plot, or UpSet Plot).
- Number of Pathways to Plot: Select the number of pathways to display (1 to 50).
Execution:
- Click the Pathway Analysis button to run the analysis with the selected parameters (Fig. 5a).
Output:
Pathway analysis results include:
- Visualizations: Dot plot, Bar plot, UpSet plot and Network plot, showcasing significant pathways (Fig. 5b-e).
- Summary Table: A downloadable table with pathway details, adjusted p-values, and other metrics for further exploration and interpretation (Fig. 5f).

The pathway analysis functionality in VST-DAVis helps users understand the biological processes and signaling pathways linked to gene expression profiles across clusters and conditions, providing a deep functional understanding of their VST-DAVis data.
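
The interplay between the gene-set size limits, the multiple-testing adjustment, and the significance cutoff can be sketched as follows. This Python example is illustrative only: the pathway names, p-values, and sizes are invented, the size limits of 10 and 500 are example values, and Benjamini-Hochberg is just one of the adjustment methods a user might select.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for step, idx in enumerate(order):
        rank = m - step  # rank of this p-value in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_pathways(results, min_size=10, max_size=500, cutoff=0.05):
    """results: dict of pathway name -> (raw p-value, gene-set size)."""
    # Size filtering happens first, so only tested sets enter the adjustment.
    tested = {n: v for n, v in results.items() if min_size <= v[1] <= max_size}
    names = list(tested)
    adjusted = bh_adjust([tested[n][0] for n in names])
    return {n: q for n, q in zip(names, adjusted) if q < cutoff}


toy_results = {
    "Pathway A": (0.001, 40),
    "Pathway B": (0.02, 8),    # too small, dropped before testing
    "Pathway C": (0.04, 120),
    "Pathway D": (0.30, 60),
}
kept = significant_pathways(toy_results)
```

In this toy run only "Pathway A" survives: "Pathway B" is removed by the minimum size, and "Pathway C" has a raw p-value below 0.05 but an adjusted value of 0.06.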

6. GSEA Analysis

The Gene Set Enrichment Analysis (GSEA) feature in VST-DAVis leverages the fgsea package to identify enriched pathways using ranked gene lists, such as those generated from differential expression analysis. This allows users to assess pathway-level expression changes and gain insights into functional changes across clusters or conditions.

Prerequisites:
- GSEA analysis can be conducted following single or multiple samples analysis or subclustering analysis up to cell type prediction.
Input Options:
GSEA analysis can be performed on:
- Seurat clusters or predicted cell type labels derived from single, multiple, or subcluster analyses.
- Single or multiple clusters simultaneously, with results filtered by an adjusted p-value (p_val_adj < 0.05).
Organisms and Gene Sets Supported:
- Organisms: Human and mouse gene ID mapping.
- Gene Set Categories: Using the msigdbr package, which provides gene sets compatible with fgsea from the Molecular Signatures Database (MSigDB). Available categories include:
  - Hallmark Gene Sets (H)
  - Positional Gene Sets (C1)
  - Curated Gene Sets (C2)
  - Regulatory Target Gene Sets (C3)
  - Computational Gene Sets (C4)
  - Ontology Gene Sets (C5)
  - Oncogenic Signature Gene Sets (C6)
  - Immunologic Signature Gene Sets (C7)
  - Cell Type Signature Gene Sets (C8)
GSEA Analysis Parameters:
- scoreType: Define the scoring method for pathway enrichment.
- Minimal Size of Genes: Minimum number of genes in a gene set.
- Maximal Size of Genes: Maximum number of genes in a gene set.
- Number of Permutations: Control the precision of p-value calculations.
- Plot Type: Choose visualization format (GSEA Plot, PlotGseaTable, Bar Plot).
- Number of Significant Pathways to Plot: Select the number of pathways to display (1 to 40).
Execution:
- Click the GSEA Analysis button to run the analysis with the selected parameters (Fig. 6a).
Output:
The GSEA analysis provides:
- Visualizations:
  - GSEA Plot: Displays the enrichment score curve (Fig. 6b).
  - PlotGseaTable: Shows enriched pathways and their enrichment scores (Fig. 6c).
  - Bar Plot: Highlights top significant pathways (Fig. 6d).
- Summary Table: A downloadable table of enriched pathways, adjusted p-values, and scores. If the user selects the top 10 significant pathways, the tool displays the top 5 upregulated and top 5 downregulated pathways (Fig. 6e).

GSEA analysis in VST-DAVis offers a powerful method for understanding pathway-level dynamics, supporting biological interpretation of VST-DAVis data through visual and quantitative assessments of enriched pathways.
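
The enrichment score curve shown in a GSEA plot is a running sum over the ranked gene list. The Python sketch below illustrates that statistic on toy data; it is not the fgsea implementation, the gene names and scores are hypothetical, and it assumes the gene set has at least one hit with a non-zero score and at least one miss.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score.

    ranked: list of (gene, score) pairs sorted by score, highest first.
    The sum goes up at genes in the set (weighted by |score|) and down at
    all other genes; the score is the largest deviation from zero.
    """
    hits = [abs(s) ** weight if g in gene_set else 0.0 for g, s in ranked]
    hit_total = sum(hits)
    n_miss = sum(1 for g, _ in ranked if g not in gene_set)
    running, best = 0.0, 0.0
    for h, (g, _) in zip(hits, ranked):
        running += h / hit_total if g in gene_set else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking, e.g. by log fold change from a differential expression test.
ranked_genes = [
    ("G1", 3.0), ("G2", 2.0), ("G3", 1.0),
    ("G4", -1.0), ("G5", -2.0), ("G6", -3.0),
]
es_top = enrichment_score(ranked_genes, {"G1", "G2"})
es_bottom = enrichment_score(ranked_genes, {"G5", "G6"})
```

A gene set concentrated at the top of the ranking gets a positive score (here 1.0) and one concentrated at the bottom gets a negative score (here -1.0), which is why results can be split into upregulated and downregulated pathways.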

7. Cell-Cell Communication Analysis

VST-DAVis integrates CellChat to enable users to analyze cell-cell communication within single or multiple samples, as well as for subclusters. This analysis identifies potential ligand-receptor interactions, allowing users to explore how different cell types or clusters communicate based on gene expression patterns.

Input Options:
- Source of Input: Users can analyze cell-cell communication using Seurat clusters or predicted cell type labels generated from single, multiple, or subcluster analysis.
- Organisms Supported: Human and mouse datasets are available for ligand-receptor interaction mapping.

7.1. Parameters for Cell-Cell Communication

Identify Over-Expressed Genes:
- Threshold of Cell Expression Percentage: Minimum percentage of cells expressing the genes.
- Log Fold Change Threshold: Minimum log fold-change required for genes to be considered over-expressed.
- p-Value Threshold: Statistical significance threshold.
Compute Communication Probability:
- Expression Method: Choose how to compute the average expression per cell group (options: triMean, truncatedMean, thresholdedMean, median).
Filter Communication:
- Minimum Cell Requirement: Minimum number of cells needed in each cell group to analyze cell-cell communication.
Communication Pattern Identification:
- Pattern k-Value: Defines the number of communication patterns to identify.
Label Option:
- Show or hide labels in plots.
Execution:
- Click the Cell-Cell communication analysis button to start the analysis (Fig. 7.1a).
Output for Cell-Cell Communication Analysis:
The analysis generates the following visual outputs:
- Interaction Plots:
  - Counts and Weights/Strength: Displays the frequency and intensity of interactions among cell groups (Fig. 7.1b,c).
  - Interaction Heatmap: Shows interaction strengths across all clusters or cell types (Fig. 7.1d).
  - Incoming and Outgoing Signaling Patterns: Visualizes communication patterns for target and secreting cells (Fig. 7.1e,f).
- Interaction Table: Includes source and target cell types, ligand-receptor pairs, and interaction scores (Fig. 7.1g).
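
The Expression Method option above controls how a group's average expression is summarized. As an illustration, this Python sketch compares a plain mean with Tukey's trimean and a truncated mean on made-up values; it is not CellChat's code, and the 25% trim fraction is only an example.

```python
from statistics import mean, quantiles


def tri_mean(values):
    """Tukey's trimean: (Q1 + 2 * median + Q3) / 4."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


def truncated_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    return mean(ordered[k:len(ordered) - k])


# Hypothetical expression of one gene across the cells of one group,
# with a few very high values.
expression = [0, 0, 1, 2, 3, 4, 10, 50]
plain = mean(expression)
robust_tri = tri_mean(expression)
robust_trunc = truncated_mean(expression, trim=0.25)
```

For these values the plain mean is 8.75, whereas the trimean (2.8125) and the truncated mean (2.5) are much less affected by the two outlying cells, so a more robust summary generally gives a more conservative group-level expression estimate.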

7.2. Analyzing Specific Signaling Pathways

For a more focused analysis, users can select a specific signaling pathway from a drop-down menu, enabling detailed visualization of the chosen pathway (Fig. 7.2a).

Outputs for Specific Signaling Pathway:
- Spatial Plot: Displays interaction intensity on the spatial image (Fig. 7.2b).
- Circle Plot: Visualizes interactions among cell groups by counts (Fig. 7.2c).
- Chord Plot: Depicts connections between cell types via ligand-receptor pairs (Fig. 7.2d).
- Interaction Heatmap: Interaction strengths among clusters for the specific pathway (Fig. 7.2e).
- Hierarchy Plot: Shows the hierarchical organization of cell types and their interactions (Fig. 7.2f).
- Bubble Plot and Bar Plot: Display interaction intensity for the selected pathway (Fig. 7.2g).
- Violin Plot: Shows expression of pathway-associated genes (Fig. 7.2h).
- Bar Plot: Shows the contribution from the network analysis as a bar plot (Fig. 7.2i).
- Signaling Pathway Table: Contains source, target, ligand, receptor, and interaction details for the specific pathway (Fig. 7.2j).

This suite of tools and visualizations enables detailed exploration of cell communication, allowing users to interpret inter-cellular signaling dynamics in VST-DAVis datasets with biological relevance.

8. Trajectory and Pseudotime Analysis

VST-DAVis integrates Monocle3 for trajectory and pseudotime analysis, allowing users to study the dynamic progression of cells over pseudotime and identify genes with functional changes along this trajectory.

Preparing for Trajectory and Pseudotime Analysis
- Prerequisites: Users must complete analysis up to the cell type prediction step in either single or multiple sample analysis, or subclustering analysis.
- Input Format: The tool automatically converts the Seurat object to Monocle3 format, and users can choose between Seurat clusters or predicted cell type labels as input.
- UMAP Requirement: UMAP should be used in clustering steps for compatibility with Monocle3.

8.1. Parameters for Learning Trajectory

Partitioning Options:
- use_partition: Toggle to specify partitions for different groups.
- close_loop: Set to close or open the trajectory loop.
- label_groups_by_cluster: Labels cell groups by cluster.
- label_branch_points, label_roots, label_leaves: Allows labeling of key points on the trajectory (branches, roots, leaves).
Execution:
- Once parameters are set, users can click the Learn Trajectory button to generate the trajectory plot (Fig. 8a).
Output:
- Trajectory Plot: Displays cell progression in trajectory space, providing insight into the cellular development path (Fig. 8b).

8.2. Pseudotime Ordering of Cells

Parameters:
- Root Cluster Selection: Users must select one cluster to serve as the root cluster, marking the starting point of pseudotime.
- Labeling Options: Parameters include options to label groups by clusters, as well as marking branch points, roots, and leaves.
Execution:
- Click the Submit button to start the analysis (Fig. 8c).
Output:
- Pseudotime Plot: Cells are arranged by pseudotime, showing the developmental trajectory (Fig. 8d).
- Bar Chart: Cells are ordered based on both Seurat clusters and Monocle3 pseudotime (Fig. 8e).
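
Conceptually, pseudotime measures how far each cell lies from the chosen root along the learned trajectory. The Python sketch below illustrates that idea as a shortest-path distance on a tiny hand-made graph; it is not Monocle3's implementation, and the node names and edge lengths are hypothetical.

```python
import heapq


def pseudotime_from_root(graph, root):
    """Shortest-path distance from `root` to every reachable node.

    graph: dict mapping node -> list of (neighbor, edge_length).
    """
    dist = {root: 0.0}
    queue = [(0.0, root)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph[node]:
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return dist


# Toy Y-shaped trajectory: A is the root, B is a branch point,
# C and D are the two leaves.
trajectory = {
    "A": [("B", 1.0)],
    "B": [("A", 1.0), ("C", 2.0), ("D", 0.5)],
    "C": [("B", 2.0)],
    "D": [("B", 0.5)],
}
pseudotime = pseudotime_from_root(trajectory, root="A")
```

Choosing a different root changes every value, which is why the root cluster selection determines how the pseudotime plot should be read.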

8.3. Identifying Genes with Functional Changes in Pseudotime

To explore gene expression dynamics along the pseudotime trajectory, users can analyze gene expression changes:

Parameters:
- Neighbor Graph Selection: Users can select between Principal Graph or K-Nearest Neighbor (KNN) to model gene expression changes.
Execution:
- Click the Find Genes button to begin identifying genes whose expression changes along pseudotime (Fig. 8f).
Output:
- Pseudotime Plot of Cells: Visual representation of cells in pseudotime with associated gene expression (Fig. 8g).
- Summary Table: Lists genes with dynamic functional changes along pseudotime (Fig. 8h).
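
Monocle3's documentation describes its test for trajectory-dependent genes as based on Moran's I, a spatial autocorrelation statistic evaluated on a neighbor graph. The Python sketch below computes a plain Moran's I with binary neighbor weights on toy data to show what the statistic captures; it is an illustration under those assumptions, not the Monocle3 code.

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights.

    values: dict mapping cell -> expression of one gene.
    neighbors: dict mapping cell -> list of neighboring cells.
    """
    n = len(values)
    mean_value = sum(values.values()) / n
    dev = {c: v - mean_value for c, v in values.items()}
    weight_sum = sum(len(nb) for nb in neighbors.values())
    cross = sum(dev[c] * dev[j] for c, nb in neighbors.items() for j in nb)
    variance_sum = sum(d * d for d in dev.values())
    return (n / weight_sum) * (cross / variance_sum)


# Four cells connected in a chain along a toy trajectory.
chain = {"c1": ["c2"], "c2": ["c1", "c3"], "c3": ["c2", "c4"], "c4": ["c3"]}
smooth_gene = {"c1": 1.0, "c2": 2.0, "c3": 3.0, "c4": 4.0}
noisy_gene = {"c1": 1.0, "c2": 4.0, "c3": 1.0, "c4": 4.0}
i_smooth = morans_i(smooth_gene, chain)
i_noisy = morans_i(noisy_gene, chain)
```

The gene that changes smoothly along the chain gets a positive value (1/3 here), while the gene that alternates between neighbors gets a negative value (-1 here); genes of the first kind are the ones such a test is meant to flag.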

8.4. Plotting Gene Expression in Pseudotime

Users can visualize specific genes to observe their expression patterns over pseudotime (Fig. 8i).

Gene Selection:
- Top Genes: By default, the tool plots the top 5 genes with dynamic changes; this number is adjustable from 1 to 10.
- Custom Genes: Users can specify a custom list of genes (comma-separated) to plot in pseudotime.
Output:
- Creates a feature plot to display gene expression across cells in pseudotime (Fig. 8j).

This functionality helps users analyze and visualize gene dynamics, offering insights into cellular progression and identifying key genes in developmental pathways.

9. Co-Expression and TF Analysis

9.1. Co-Expression Network Analysis

VST-DAVis incorporates co-expression network analysis for VST-DAVis data using the hdWGCNA package. This feature enables users to identify gene modules and their relationships in Seurat clusters or predicted cell type labels.

Prerequisites:
- Co-expression network analysis becomes available after completing single or multiple samples analysis or subclustering analysis up to cell type prediction.
- Users can analyze one cluster at a time.
Metacell Construction:
Aggregates small groups of similar cells from the same biological sample. Uses the k-Nearest Neighbors (KNN) algorithm to group similar cells and compute a metacell gene expression matrix.
- Parameters:
  - k: Number of nearest neighbors for aggregation.
  - min_cells: Minimum number of cells in a group to construct metacells.
  - max_shared: Maximum number of cells shared across two metacells.
  - target_metacells: Maximum number of target metacells to construct.
Co-Expression Network Construction:
Builds networks with customizable parameters:
- softpower: Soft-thresholding power applied to gene-gene correlations so that the resulting network approximates a scale-free topology.
- networkType: Options include signed, unsigned, or signed hybrid.
Module Eigengenes and Connectivity:
- Scales data using selectable models: linear, poisson, or negbinom.
- Allows optional Harmony batch correction to compute harmonized module eigengenes (hMEs).
Hub Gene Extraction:
- Extracts the top N hub genes for selected modules, aiding in the identification of key regulators.
Execution:
- Click the WGCNA Analysis button to start the co-expression network analysis (Fig. 9a).
Outputs:
A few plots are not available as image files and are provided as PDF files instead.
- Soft Power Plots: Visualizes the selection of the optimal soft power parameter for network construction (Fig. 9.1b).
- Co-Expression Network Visualization: Displays modules with distinct colors representing gene clusters (Fig. 9.1c).
- Ranked Genes in Modules: Provides a list of genes ranked by module membership (kME) (Fig. 9.1d).
- Feature Plots: Highlights the expression of modules or specific genes (Fig. 9.1e).
- Module Relationships Plots: Correlation between modules based on harmonized module eigengenes (hMEs) (Fig. 9.1f).
- Seurat DotPlot with Modules: Displays module-specific gene expression across clusters (Fig. 9.1g).
- Individual Module Network Plots: Visualizes the gene network for specific modules (Fig. 9.1h).
- Module UMAP Plots: Maps modules onto UMAP visualizations for spatial context (Fig. 9.1i).
- Summary Tables:
  - Soft Power Table: Lists optimal soft power values (Fig. 9.1j).
  - Module Assignment Table: Details gene-module relationships with colors (Fig. 9.1k).
  - Hub Genes Table: Identifies top hub genes per module (Fig. 9.1l).

This functionality provides a robust framework for uncovering intricate co-expression patterns and identifying key drivers in single-cell datasets.
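
The softpower and networkType parameters act together when a gene-gene correlation is turned into a network connection strength. The Python sketch below shows the standard WGCNA-style adjacency definitions for the three network types; it is an illustration rather than the hdWGCNA code, and the correlation values and the power of 6 are arbitrary examples.

```python
def adjacency(cor, power, network_type="signed"):
    """Connection strength for one gene pair from its correlation `cor`."""
    if network_type == "unsigned":
        # Strong negative and strong positive correlations both count.
        return abs(cor) ** power
    if network_type == "signed":
        # Correlation is rescaled to [0, 1] first, so negative values end low.
        return ((1 + cor) / 2) ** power
    if network_type == "signed hybrid":
        # Negative correlations are set to zero.
        return cor ** power if cor > 0 else 0.0
    raise ValueError(f"unknown network type: {network_type}")


power = 6  # example soft power
strong_pos = adjacency(0.8, power, "signed")
strong_neg_signed = adjacency(-0.8, power, "signed")
strong_neg_unsigned = adjacency(-0.8, power, "unsigned")
strong_neg_hybrid = adjacency(-0.8, power, "signed hybrid")
```

Raising the power pushes weak correlations toward zero while keeping strong ones, which is what the soft power plots help to tune. In this example a correlation of -0.8 yields about 0.26 in an unsigned network, but 0.000001 in a signed one and 0 in a signed hybrid one.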

9.2. Transcription Factor Regulatory Network Analysis

Transcription Factor (TF) Regulatory Network Analysis in VST-DAVis employs the hdWGCNA package to construct and analyze TF regulatory networks based on VST-DAVis data. This feature allows users to identify gene modules and investigate TF-mediated regulation within clusters or predicted cell type labels.

Prerequisites:
- Complete single or multiple sample analysis or subclustering analysis, including cell type prediction.
- Analysis is performed one cluster at a time.
TF Regulatory Network Construction:
- TF Binding Motif Information:
  - Human: EnsDb.Hsapiens.v86, BSgenome.Hsapiens.UCSC.hg38.
  - Mouse: EnsDb.Mmusculus.v79, BSgenome.Mmusculus.UCSC.mm10.
  - Motifs from the JASPAR 2020 database for multiple species.
- Machine Learning Model:
  - XGBoost is used to model TF regulation for each gene, with the following parameters:
  - max_depth: Maximum depth of a tree.
  - eta: Step size shrinkage used in updates to prevent overfitting.
  - alpha: L1 regularization term on weights.
- TF Regulon Strategy:
  - Strategy A (default) selects the top TFs for each gene.
  - reg_thresh: Threshold for the regulatory score.
  - n_tfs: Number of top TFs to keep for each gene.
- Regulon Expression Signatures:
  - Positive correlation: cor_thresh = 0.05. Threshold for TF-gene correlation for genes to be included in the positive regulon score.
  - Negative correlation: cor_thresh = -0.05. Threshold for TF-gene correlation for genes to be included in the negative regulon score.
Execution:
- Click the Transcription factor analysis button to start the analysis (Fig. 9.2.1a).
Output and Visualization:
- Module Regulatory Network Plots: Positive, negative, and combined regulatory network plots. Visualize TF-to-target relationships categorized by regulatory effects (Fig. 9.2.1b-e).
- Regulated Scores Table: Comprehensive list of TFs and their downstream targets (Fig. 9.2.1f).
TF-Specific Visualizations:
These plots help unravel the regulatory mechanisms governing gene expression in cellular contexts, identify key transcription factors and their target genes for hypothesis generation and validation, and explore positive and negative regulatory effects within gene modules.
- Select a TF from a dropdown menu to generate specific plots (Fig. 9.2.2a).
Outputs:
- UMAP Plots: Spatial distribution of the TF (Fig. 9.2.2b).
- Bar Plots: Contribution of the TF across modules (Fig. 9.2.2c).
- Network Plots: Positive, negative, and combined networks, with primary, secondary and tertiary targets (Fig. 9.2.2d-f).

This functionality provides a comprehensive view of transcriptional regulation in VST-DAVis data, enabling detailed exploration of TF-driven cellular processes.
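
The way reg_thresh, n_tfs, and cor_thresh interact can be sketched on a toy table of TF-gene links. This Python example is an illustration of the filtering logic only, not the hdWGCNA implementation; the TF and gene names, the scores, the use of >= for the score threshold, and the example values reg_thresh = 0.01 and n_tfs = 2 are all assumptions.

```python
def top_tfs_per_gene(edges, reg_thresh=0.01, n_tfs=2):
    """Keep, for each gene, the `n_tfs` highest-scoring TFs above `reg_thresh`.

    edges: list of (tf, gene, regulatory_score, correlation) tuples.
    """
    by_gene = {}
    for edge in edges:
        if edge[2] >= reg_thresh:
            by_gene.setdefault(edge[1], []).append(edge)
    kept = []
    for rows in by_gene.values():
        rows.sort(key=lambda r: r[2], reverse=True)
        kept.extend(rows[:n_tfs])
    return kept


def split_by_correlation(edges, cor_thresh=0.05):
    """Split links into positive and negative regulon members."""
    positive = [e for e in edges if e[3] > cor_thresh]
    negative = [e for e in edges if e[3] < -cor_thresh]
    return positive, negative


toy_edges = [
    ("TF_A", "GENE1", 0.30, 0.40),
    ("TF_B", "GENE1", 0.20, -0.30),
    ("TF_C", "GENE1", 0.005, 0.10),  # below reg_thresh, dropped
    ("TF_A", "GENE2", 0.10, 0.02),   # kept, but correlation too weak to score
]
regulon_edges = top_tfs_per_gene(toy_edges)
positive_links, negative_links = split_by_correlation(regulon_edges)
```

Here TF_A and TF_B are retained for GENE1, TF_A to GENE1 contributes to the positive regulon score, TF_B to GENE1 contributes to the negative one, and the TF_A to GENE2 link contributes to neither because its correlation lies between the two thresholds.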

Upload multiple samples, each in its own ZIP file

Each sample should be stored in a separate zip file according to the given structure. Users can upload multiple zip files simultaneously to analyze multiple samples.

If you use GPTCelltype for cell type prediction, make sure that 'Identify markers in all clusters' was run in the previous step. To use GPTCelltype locally, set your API key with Sys.setenv(OPENAI_API_KEY = 'your_openai_API_key') in the global.R file.