SCARR-Vis - Single Cell Ambient RNA Removal and Visualization

SCARR-Vis removes ambient RNA from 10x Genomics single-cell data using SoupX, DecontX, scCDC, or FastCAR and lets you compare pre vs post cleanup across QC, clustering, UMAP, heatmaps, and per-cell tables.

Data upload

Provide both the raw/droplet matrix and the filtered/cell matrix. You can upload as:

A zip folder containing matrix.mtx.gz, barcodes.tsv.gz, features.tsv.gz; or

HDF5 (.h5) from Cell Ranger / other tools.
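
Before uploading a zip folder, it can help to confirm that it contains the three expected files. The following is a minimal base-R sketch of such a check; `missing_10x_files` is a hypothetical helper written for this illustration and is not part of SCARR-Vis.

```r
# The three files a 10x matrix folder is expected to contain (from the list above).
required_10x_files <- c("matrix.mtx.gz", "barcodes.tsv.gz", "features.tsv.gz")

# Hypothetical helper: return the required files that are NOT present.
missing_10x_files <- function(file_names) {
  # Compare on base names so files nested in a sub-folder of the zip still match.
  setdiff(required_10x_files, basename(file_names))
}

# Example: file names as they might be listed inside a zip archive.
listed <- c("raw_feature_bc_matrix/matrix.mtx.gz",
            "raw_feature_bc_matrix/barcodes.tsv.gz")
missing_10x_files(listed)

# For a real archive, the names could be listed without extracting it:
# listed <- utils::unzip("raw_feature_bc_matrix.zip", list = TRUE)$Name
```

An empty result means all three files were found; in the example above, the features file is reported as missing.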

Estimate contamination

Go to Step 2. Estimate Contamination and choose a method.

SoupX: estimates sample-level ambient contamination (global ρ) from empty droplets; supports parameter tuning and cluster-aware count adjustment.

DecontX: infers per-cell contamination with a Bayesian model; uses user-provided clusters or re-clusters; outputs decontaminated counts.

scCDC: gene-specific contamination detection and correction (with doublet-aware adjustments).

FastCAR: fast ambient RNA correction using an empty-droplet UMI cutoff and gene-level contamination probability threshold, with optional ambient profiling to suggest a cutoff.

Compare results

Check QC (Pre) vs QC (Post), Estimation, Cluster Counts (Pre vs Post), UMAP (Pre vs Post), Top Genes, and the Cells table.

Outputs and Visualization

Each figure and table can be downloaded individually with the selected dimensions, or the user can click the Bulk Download option to download all publication-quality plots in JPG, TIFF, PDF, SVG, BMP, EPS, or PS format, along with summary tables in .csv format. The user can also download cleaned Seurat/SCE objects, summaries, .h5 files, matrix/barcode/feature files based on the input format, or a complete ZIP bundle containing all available outputs.

Tips

If SoupX errors on empty droplets, widen soupRange or use the automatic global soup profile fallback.
If your dataset contains erythroid/testis cells, adjust 'Optional Markers'.

Use SCARR-Vis online

SCARR-Vis is deployed at: https://www.gudalab-rtools.net/SCARR-Vis

Launch SCARR-Vis using R and GitHub

SCARR-Vis is available in the GitHub repository: https://github.com/GudaLab/SCARR-Vis
Before running the app, users must have the following installed: R (>= 4.5.2), RStudio (>= 2025.09.2), Bioconductor (>= 3.22), and Shiny (>= 1.11.1).
Note: SCARR-Vis has been tested with these versions. Older versions of R may cause errors during package installation, so it is recommended to update R to the latest version first.
Once R is open in the command line or in RStudio, run the following commands to install and load the shiny package.

install.packages('shiny')
library(shiny)
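
The version requirements listed above can be checked from within R before launching the app. This is a small sketch using only base R; the helper name `meets_minimum` is hypothetical, and the Bioconductor and RStudio versions are not checked here.

```r
# Minimum versions stated in the requirements above.
min_r     <- "4.5.2"
min_shiny <- "1.11.1"

# numeric_version() compares version strings component by component,
# so "1.11.10" is correctly treated as newer than "1.11.1".
meets_minimum <- function(installed, required) {
  numeric_version(installed) >= numeric_version(required)
}

r_ok <- meets_minimum(as.character(getRversion()), min_r)

# Only query the shiny version if the package is installed.
shiny_ok <- requireNamespace("shiny", quietly = TRUE) &&
  meets_minimum(as.character(utils::packageVersion("shiny")), min_shiny)

message("R version OK: ", r_ok, "; shiny version OK: ", shiny_ok)
```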

Start the app

Start an R session in RStudio and run this line:

shiny::runGitHub('SCARR-Vis','GudaLab')

Alternatively, download the source code from GitHub and run the following commands in the R session:

library(shiny)
runApp('/path/to/the/SCARR-Vis-master', launch.browser=TRUE)

Usage

Please refer to the Manual tab.

Developed and maintained by

SCARR-Vis was developed by Sankarasubramanian Jagadesan and Babu Guda. We share a passion for developing a user-friendly tool for biologists, particularly those who do not have access to bioinformaticians or programming expertise.

SCARR-Vis: Manual & Parameter Reference

SCARR-Vis is an R/Shiny application for interactive assessment and correction of ambient RNA contamination in single-cell and single-nucleus RNA-seq data. The interface follows a typical workflow: upload 10x matrices, choose a decontamination method, inspect pre- and post-correction QC, and explore clustering and gene expression.

Overview of the Example Dataset (GSM7681687)

Throughout this manual we use the single-cell RNA-seq sample GSM7681687 from NCBI GEO. FASTQ files were processed with Cell Ranger to generate both raw and filtered feature-barcode matrices. Because raw matrices are typically not submitted to GEO, SCARR-Vis bundles this dataset as example data so users can try the full pipeline, including ambient RNA estimation that relies on the raw matrix.

To quickly test SCARR-Vis, select “Use example data” in the upload panel and run the pipeline with the default parameters.

Upload 10x Matrices

In step 1, SCARR-Vis expects both a raw and a filtered 10x feature-barcode matrix. You can either use the bundled GSM7681687 example or upload your own matrices.

Species / genome build: choose the appropriate reference.
Raw matrix: a 10x HDF5 (.h5) file.
Filtered matrix: a zipped directory containing the barcodes, features, and matrix files, as shown in the image above.
Optional: collapse duplicate gene names; convert Ensembl IDs to gene symbols.
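
To illustrate what collapsing duplicate gene names can mean in practice, the sketch below sums the rows of a toy count matrix that share a gene name. Summation is an assumption made for this example; the source does not state which strategy SCARR-Vis applies.

```r
# Toy count matrix (genes x cells) with a duplicated gene name ("GeneA").
counts <- matrix(
  c(1, 0, 2,
    3, 1, 0,
    0, 4, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneA"),
                  c("cell1", "cell2", "cell3"))
)

# rowsum() adds up rows that share the same group label;
# reorder = FALSE keeps genes in order of first appearance.
collapsed <- rowsum(counts, group = rownames(counts), reorder = FALSE)
collapsed
```

The result has one row per unique gene name, and the total number of counts is unchanged. Note that `rowsum()` works on dense matrices; sparse 10x matrices would need a sparse-aware equivalent.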

Figure 1. Upload panel and estimation method selector. Sub-panels show: (a) upload matrices, (b) SoupX, (c) DecontX, (d) scCDC, (e) FastCAR parameter panels.

Note: when starting from FASTQ files, you must first run Cell Ranger (or a similar pipeline) to generate the raw and filtered matrices required by SCARR-Vis.

Choose Estimation Method and Parameters

In step 2 (Estimate Contamination), SCARR-Vis provides adapters for four methods:

SoupX: models background 'soup' RNA and adjusts counts.
DecontX (from the celda package): infers cell-specific contamination fractions.
scCDC: identifies global contamination-causing genes (GCGs) and optionally corrects them.
FastCAR: profiles ambient RNA using empty droplets and estimates per-cell contamination.

Each method has its own parameter panel (see Figure 1b–1e). After setting parameters, click Run to start estimation. Progress and status messages are shown in the log.

1. QC (Pre)

The QC (Pre) tab summarizes per-cell metrics before any correction:

Detected genes per cell.
UMIs per cell.
Mitochondrial percentage (species-aware MT gene pattern).

Figure 2. QC (Pre) for GSM7681687: (a) detected genes per cell, (b) UMIs per cell, (c) mitochondrial %.
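
The three metrics above can be computed directly from a count matrix. The following base-R sketch uses a toy genes-by-cells matrix and the human mitochondrial pattern `^MT-`; the object and column names are chosen for this illustration only.

```r
# Toy count matrix: rows are genes, columns are cells.
counts <- matrix(
  c(5, 0, 2,
    0, 3, 1,
    1, 1, 0,
    4, 0, 7),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("MT-CO1", "ACTB", "MT-ND1", "GAPDH"),
                  c("cell1", "cell2", "cell3"))
)

# Human mitochondrial genes start with "MT-"; use "^mt-" for mouse.
is_mito <- grepl("^MT-", rownames(counts))

qc <- data.frame(
  n_genes    = colSums(counts > 0),   # detected genes per cell
  n_umis     = colSums(counts),       # UMIs per cell
  percent_mt = 100 * colSums(counts[is_mito, , drop = FALSE]) / colSums(counts)
)
qc
```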

2. Estimation Diagnostics

The Estimation tab displays method-specific diagnostic plots.

2.1 SoupX

SoupX output includes the distribution of contamination fractions (ρ), their relationship with UMI counts, and other model diagnostics.

Figure 3.1. SoupX diagnostics: (a) ρ density, (b) ρ vs nUMIs, (c) auto estimation contamination diagnostic.

2.2 DecontX

DecontX produces a contamination density plot, ρ vs nUMIs, and a UMAP colored by estimated contamination.

Figure 3.2. DecontX diagnostics: contamination density, ρ vs nUMIs, and contamination on UMAP.

2.3 scCDC

scCDC focuses on genes driving contamination. SCARR-Vis shows UMIs per cell post vs pre, mean counts of top GCGs, and entropy vs mean expression to highlight putative contamination genes.

Figure 3.3. scCDC diagnostics: (a) UMIs per cell (post vs pre), (b) top GCGs (pre vs post), (c) entropy vs mean expression.

2.4 FastCAR

FastCAR scans a grid of empty-droplet UMI cutoffs to profile ambient RNA and identifies an appropriate threshold for empty droplets and contamination. SCARR-Vis displays the number of empty droplets and genes in ambient RNA at each cutoff, as well as the distribution of UMIs removed per cell.

Figure 3.4. FastCAR diagnostics: (a) empty-droplet profile, (b) UMIs removed per cell.

3. QC (Post)

After decontamination, the QC (Post) tab repeats the same metrics as QC (Pre) but on the corrected counts, allowing a direct comparison.

Figure 4. QC (Post) histograms for GSM7681687: detected genes, UMIs, and mitochondrial %.

4. Cluster counts (Pre vs Post)

The Cluster counts (Pre vs Post) tab compares the number of cells in each cluster before and after correction, making it easy to see whether ambient RNA disproportionately affected particular clusters.

Figure 5. Cell counts per cluster (pre vs post).

Figure 6. Bar plot of cluster counts (pre vs post).
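
A comparison of this kind can be sketched in base R by tabulating cluster labels before and after correction. The toy labels below are invented for illustration, and the sketch assumes that the same cluster label refers to a comparable cluster in both runs.

```r
# Toy cluster assignments for six cells, before and after correction.
pre_clusters  <- factor(c("0", "0", "1", "1", "1", "2"))
post_clusters <- factor(c("0", "0", "0", "1", "1", "2"))

# Use a shared set of levels so clusters missing from one run still get a row.
all_levels <- union(levels(pre_clusters), levels(post_clusters))
pre_n  <- table(factor(pre_clusters,  levels = all_levels))
post_n <- table(factor(post_clusters, levels = all_levels))

cluster_counts <- data.frame(
  cluster = all_levels,
  pre     = as.integer(pre_n),
  post    = as.integer(post_n)
)
cluster_counts$change <- cluster_counts$post - cluster_counts$pre
cluster_counts
```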

5. Cell table

The Cells Table tab lists per-cell metrics such as cluster ID, UMIs, QC statistics, and contamination estimates. Users can sort and filter rows for detailed inspection.

Figure 7. Cell-level summary table.

6. Top genes

The Top Genes tab highlights the genes most affected by ambient RNA and compares their pre- and post-correction expression.

Figure 8. Top genes affected by ambient RNA (pre vs post).
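
One simple way to rank genes by how strongly correction affected them is to compare total counts before and after. The sketch below does this on toy matrices; the ranking criterion (total UMIs removed) is an assumption for illustration and may differ from the one SCARR-Vis uses.

```r
# Toy pre- and post-correction count matrices (genes x cells).
genes <- c("GeneA", "HBB", "GeneC")
cells <- c("cell1", "cell2")
pre  <- matrix(c(10, 12,
                  8,  6,
                  5,  5), nrow = 3, byrow = TRUE, dimnames = list(genes, cells))
post <- matrix(c( 9, 11,
                  1,  0,
                  5,  5), nrow = 3, byrow = TRUE, dimnames = list(genes, cells))

removed <- rowSums(pre) - rowSums(post)   # UMIs removed per gene

top_genes <- data.frame(
  gene             = genes,
  pre_total        = rowSums(pre),
  post_total       = rowSums(post),
  removed          = removed,
  fraction_removed = removed / rowSums(pre)
)

# Most affected genes first.
top_genes <- top_genes[order(top_genes$removed, decreasing = TRUE), ]
top_genes
```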

7. UMAP / tSNE (Pre vs Post)

Global structure is visualized in the UMAP/tSNE (Pre vs Post) tab. SCARR-Vis recomputes embeddings on both the original and corrected counts using Seurat.

UMAP (Pre) and UMAP (Post).
tSNE (Pre) and tSNE (Post).

Figure 9. UMAP and tSNE embeddings for GSM7681687 before and after correction.

8. Heatmap (Pre vs Post)

The Heatmap tab visualizes expression of top variable genes across clusters. Users can switch between pre- and post-correction matrices to see how ambient removal changes gene-level patterns.

Figure 10. Heatmaps of top variable genes for pre- and post-correction data.

9. Feature plots (Pre vs Post)

The Feature Plot tab displays per-gene expression over UMAP/tSNE. Users provide one or more comma-separated gene symbols (e.g. FTH1), and SCARR-Vis plots paired pre- and post-correction feature maps.

Figure 11. Example feature plots for FTH1 (Pre and Post).
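
Comma-separated gene input of this kind is typically split, trimmed, and matched against the genes present in the dataset. The helper below, `parse_gene_input`, is a hypothetical sketch of that step and not the app's own code.

```r
# Hypothetical helper: split a comma-separated string into gene symbols and
# report which ones are present in the dataset.
parse_gene_input <- function(text, available_genes) {
  genes <- trimws(strsplit(text, ",", fixed = TRUE)[[1]])
  genes <- unique(genes[nzchar(genes)])   # drop empty entries and repeats
  list(
    found   = intersect(genes, available_genes),
    missing = setdiff(genes, available_genes)
  )
}

res <- parse_gene_input("FTH1, ACTB ,, NotAGene",
                        available_genes = c("FTH1", "ACTB", "GAPDH"))
res$found
res$missing
```

Matching here is case-sensitive, so a mouse symbol such as `Fth1` would not match the human symbol `FTH1`.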

10. Reproducibility and Session Info

SCARR-Vis provides a reproducibility summary and full R session information. The reproducibility table records the selected method, key parameter values, and dataset-level statistics. The Session Info tab shows R version, platform, and package versions. These should be included in reports or manuscripts so that analyses can be fully reproduced.

Figure 12. Reproducibility summary table with selected parameters.
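
Outside the app, the same kind of record can be kept alongside an analysis with a few lines of base R. In this sketch the file names and the example parameter rows are placeholders chosen for illustration.

```r
# Save the full session information (R version, platform, package versions).
session_file <- file.path(tempdir(), "session_info.txt")
writeLines(capture.output(sessionInfo()), session_file)

# Save the selected method and key parameters as a small table.
# The rows below are illustrative values, not the output of a real run.
run_params <- data.frame(
  parameter = c("method", "min_cells"),
  value     = c("SoupX", "3")
)
params_file <- file.path(tempdir(), "run_parameters.csv")
utils::write.csv(run_params, params_file, row.names = FALSE)
```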

11. Download outputs

After a successful run, the Status tab provides individual download buttons for the corrected Seurat object and cleaned count output, plus a Bulk Download Tables + Images option for collecting the run outputs in one ZIP file.

Bulk tables: top genes, cell-level metrics, cluster counts, and total cell counts are saved as CSV files.
Bulk images: all available QC, estimation, UMAP/tSNE, heatmap, feature plot, and method-specific images are exported with automatic height and width settings.
Image format: users choose one format for the bulk image export: JPG, TIFF, PDF, SVG, BMP, EPS, or PS.
scCDC report: when scCDC is selected and the ContaminationDetection PDF report has been generated, the report is included in the bulk ZIP under the reports folder.
Progress: bulk download, Seurat download, and cleaned-data download show a small live status notification while files are being prepared.

Pipeline summary: The Seurat-based processing (normalization, variable feature selection, PCA, neighbors, clustering, and UMAP) is run twice: first on the uploaded filtered counts (Pre) and again on the decontaminated counts (Post). The downstream visualization tabs always reflect this paired design.

Parameter reference

General

Name	Default	Min	Max	Notes
`min_cells`	3	1	—	Minimum cells per gene to retain when creating Seurat objects.

SoupX

Name	Default	Min	Max	Notes
`do_auto`	TRUE	FALSE	TRUE	If TRUE, uses autoEstCont to estimate a single global ρ for the sample.
`manual_rho`	0.05	0	1	Used only if do_auto = FALSE; uniform ρ.
`soupRange`	c(0, 100)	0	2000	UMI range of empty droplets to build soup profile.
`keepDroplets`	FALSE	FALSE	TRUE	Keeps droplet table in memory; uses more RAM.
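
To make the roles of `soupRange` and `manual_rho` concrete, the base-R sketch below builds a soup profile from low-UMI droplets and subtracts the expected ambient counts for a fixed ρ. It is a deliberately simplified illustration of the idea on toy data, not SoupX's actual estimation or count-adjustment procedure.

```r
# Toy droplet matrix (genes x droplets): three empty droplets and two cells.
droplets <- matrix(
  c( 4, 3, 1,  20,  20,
     1, 2, 1, 100, 240,
     0, 0, 0,  80, 140),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("HBB", "ACTB", "CD3E"),
                  c("empty1", "empty2", "empty3", "cell1", "cell2"))
)

soup_range <- c(0, 100)   # UMI range used to pick empty droplets
rho        <- 0.05        # fixed global contamination fraction

# Droplets whose total UMI count falls inside the soup range define the soup.
n_umis  <- colSums(droplets)
is_soup <- n_umis > soup_range[1] & n_umis < soup_range[2]

# Soup profile: fraction of ambient UMIs contributed by each gene.
soup_profile <- rowSums(droplets[, is_soup, drop = FALSE])
soup_profile <- soup_profile / sum(soup_profile)

# In a real run the cells come from the filtered matrix; here they are
# simply the droplets outside the soup range.
cells <- droplets[, !is_soup, drop = FALSE]

# Expected ambient counts per gene and cell: rho * (cell UMIs) * soup profile.
expected_soup <- outer(soup_profile, rho * colSums(cells))

# Subtract and clip at zero.
corrected <- pmax(cells - expected_soup, 0)
round(corrected, 2)
```

In this toy example HBB dominates the soup profile, so it loses the largest share of its counts, while CD3E, which is absent from the empty droplets, is left unchanged.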

DecontX

Name	Default	Min	Max	Notes
`decontx_use_clusters`	TRUE	FALSE	TRUE	If TRUE, uses Seurat clusters as priors.
`decontx_maxiter (maxIter)`	500	50	10000	Maximum EM iterations.
`decontx_delta`	10,10	>0,>0	—	Dirichlet prior hyperparameters as two numbers.
`decontx_estimateDelta`	TRUE	FALSE	TRUE	Estimate delta during fitting.
`decontx_convergence`	0.001	1e-6	0.1	EM tolerance for convergence.
`decontx_iterLogLik`	10	1	1000	Iterations between log-likelihood checks.
`decontx_varGenes`	5000	100	30000	Number of variable genes used by decontX.
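
Because `decontx_delta` is entered as two numbers in one field, the value has to be turned into a numeric vector of length two before it can be used as the `delta` argument of decontX. The helper below is a hypothetical sketch of such parsing and validation; how SCARR-Vis itself handles this input is not stated in the source.

```r
# Hypothetical helper: turn "10,10" into c(10, 10) and validate it.
parse_delta <- function(text) {
  parts  <- trimws(strsplit(text, ",", fixed = TRUE)[[1]])
  values <- suppressWarnings(as.numeric(parts))
  if (length(values) != 2 || anyNA(values) || any(values <= 0)) {
    stop("delta must be two positive numbers, for example '10,10'")
  }
  values
}

delta <- parse_delta("10, 10")
delta
```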

scCDC

Name	Default / Option	Notes
`restriction_factor`	0.5 (dropdown)	Controls aggressiveness of GCG detection.
`min.cell`	100 (dropdown)	Minimum cells per gene for estimation.
`percent.cutoff`	0.2 (dropdown)	Threshold for ambient fraction filtering.

FastCAR

Name	Default	Min	Max	Notes
`fastcar_empty_cutoff`	100	10	5000	Maximum UMIs to call a droplet 'empty'. Higher values can over-correct lowly expressed genes.
`fastcar_contam_cutoff`	0.05	0	0.5	Contamination chance cutoff used for background detection; lower is more conservative.
`fastcar_do_profile`	TRUE	FALSE	TRUE	If TRUE, runs describe.ambient.RNA.sequence to profile ambient RNA over a grid of empty-droplet cutoffs.
`fastcar_profile_start`	10	1	2000	Lower bound of UMI cutoff grid for ambient profiling.
`fastcar_profile_stop`	500	50	10000	Upper bound of UMI cutoff grid for ambient profiling.
`fastcar_profile_by`	10	1	100	Step size of UMI cutoff grid for ambient profiling.
`fastcar_use_recommended`	TRUE	FALSE	TRUE	If TRUE, uses FastCAR's recommended empty-droplet cutoff based on the ambient profile.
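
The interplay of the empty-droplet cutoff, the contamination chance cutoff, and the profiling grid can be illustrated on toy data. The base-R sketch below is a simplified reading of the approach (a per-gene correction derived from empty droplets, subtracted from every cell); it does not call FastCAR and should not be taken as its exact implementation.

```r
# Toy droplet matrix (genes x droplets): four empty droplets and two cells.
droplets <- matrix(
  c(3, 2, 5, 1,  12,   4,
    0, 1, 0, 0, 150, 200,
    0, 0, 0, 0,  90,  60),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("HBB", "ACTB", "CD3E"),
                  c("e1", "e2", "e3", "e4", "c1", "c2"))
)
n_umis <- colSums(droplets)

# Ambient profiling: count empty droplets at each cutoff on the default grid.
cutoffs <- seq(10, 500, by = 10)
profile <- data.frame(
  cutoff  = cutoffs,
  n_empty = sapply(cutoffs, function(k) sum(n_umis < k))
)

empty_cutoff  <- 100    # droplets below this UMI total are treated as empty
contam_cutoff <- 0.05   # minimum fraction of empty droplets containing a gene

empty <- droplets[, n_umis <  empty_cutoff, drop = FALSE]
cells <- droplets[, n_umis >= empty_cutoff, drop = FALSE]

# Fraction of empty droplets in which each gene is detected.
detect_frac <- rowMeans(empty > 0)

# Per-gene correction: highest count seen in an empty droplet, applied only
# to genes detected in more than contam_cutoff of the empty droplets.
to_remove <- ifelse(detect_frac > contam_cutoff, apply(empty, 1, max), 0)

# Subtract the per-gene correction from every cell and clip at zero.
corrected <- pmax(cells - to_remove, 0)
corrected
```

Raising `empty_cutoff` pulls more droplets into the ambient profile, which tends to increase the per-gene correction; this is why higher values can over-correct lowly expressed genes.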

Seurat processing (defaults used in app)

Step	Key parameters (value)	Notes
Mito %	`PercentageFeatureSet` pattern = `^MT-` (human) / `^mt-` (mouse)	Species-aware mitochondrial regex.
NormalizeData	`normalization.method="LogNormalize"`, `scale.factor=10000`	Standard log-normalization.
FindVariableFeatures	`selection.method="vst"`, `nfeatures=2000`	Top 2,000 HVGs (Seurat default).
ScaleData	`do.center=TRUE`, `do.scale=TRUE`, `verbose=FALSE`	Centers and scales features before PCA.
RunPCA	`features=VariableFeatures(object)`, `npcs=30`, `verbose=FALSE`	PCA on HVGs; 30 PCs kept.
FindNeighbors	`reduction='pca'`, `dims=1:20`, `k.param=20`	SNN graph on first 20 PCs; k=20.
FindClusters	`resolution=0.5`, `algorithm=1`	Louvain (algorithm 1) at res=0.5.
RunUMAP	`reduction='pca'`, `dims=1:20`, `n.neighbors=30`, `min.dist=0.3`, `umap.method="uwot"`, `metric="cosine"`	UMAP via uwot; first 20 PCs.
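
For readers who want to reproduce this processing outside the app, the table above can be written out as a sequence of Seurat calls. The function below is a sketch assembled from the listed values, not the app's source code; the name `run_seurat_defaults` and its arguments are chosen for this illustration, and Seurat must be installed for it to run.

```r
# Sketch of the Seurat steps listed above, wrapped in one function so it can
# be applied to both the filtered (Pre) and decontaminated (Post) counts.
run_seurat_defaults <- function(counts, mito_pattern = "^MT-") {
  if (!requireNamespace("Seurat", quietly = TRUE)) {
    stop("The Seurat package is required for this sketch.")
  }
  obj <- Seurat::CreateSeuratObject(counts = counts, min.cells = 3)
  # Use "^mt-" as mito_pattern for mouse data.
  obj[["percent.mt"]] <- Seurat::PercentageFeatureSet(obj, pattern = mito_pattern)
  obj <- Seurat::NormalizeData(obj, normalization.method = "LogNormalize",
                               scale.factor = 10000)
  obj <- Seurat::FindVariableFeatures(obj, selection.method = "vst",
                                      nfeatures = 2000)
  # Centering and scaling are controlled by do.center and do.scale.
  obj <- Seurat::ScaleData(obj, do.center = TRUE, do.scale = TRUE,
                           verbose = FALSE)
  obj <- Seurat::RunPCA(obj, features = Seurat::VariableFeatures(obj),
                        npcs = 30, verbose = FALSE)
  obj <- Seurat::FindNeighbors(obj, reduction = "pca", dims = 1:20,
                               k.param = 20)
  obj <- Seurat::FindClusters(obj, resolution = 0.5, algorithm = 1)
  obj <- Seurat::RunUMAP(obj, reduction = "pca", dims = 1:20,
                         n.neighbors = 30, min.dist = 0.3,
                         umap.method = "uwot", metric = "cosine")
  obj
}

# Paired design (placeholder object names):
# pre  <- run_seurat_defaults(filtered_counts)
# post <- run_seurat_defaults(decontaminated_counts)
```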

See the Estimation, QC, and visualization tabs for diagnostics and plots after you run the pipeline.

Expected runtime

Runtime depends on dataset size, selected method, and hardware. On a typical modern laptop (4–8 CPU cores, 16 GB RAM), running the full SCARR-Vis pipeline (SoupX, DecontX, scCDC, or FastCAR, plus QC, clustering, UMAP/tSNE, and plots) on the bundled GSM7681687 example data (fewer than 5,000 cells) usually takes roughly 2–5 minutes per method.

Instructions for Uploading Sample Files

Clarification example file format

Users can download this example dataset to better understand the required structure. Following this reference will help ensure that your files are correctly prepared and fully compatible with SCARR-Vis.