addUMAP.Rd
This function will compute a UMAP embedding and add it to an ArchRProject.
addUMAP(
ArchRProj = NULL,
reducedDims = "IterativeLSI",
name = "UMAP",
nNeighbors = 40,
minDist = 0.4,
metric = "cosine",
dimsToUse = NULL,
scaleDims = NULL,
corCutOff = 0.75,
sampleCells = NULL,
outlierQuantile = 0.9,
saveModel = TRUE,
verbose = TRUE,
seed = 1,
force = FALSE,
threads = 1,
...
)
An ArchRProject object.
The name of the reducedDims object (e.g. "IterativeLSI") to use from the designated ArchRProject.
The name for the UMAP embedding to store in the given ArchRProject object.
An integer specifying the number of nearest neighbors to use when computing the UMAP. This argument is passed to n_neighbors in uwot::umap().
A number that determines how tightly the UMAP is allowed to pack points together. This argument is passed to min_dist in uwot::umap(). For more information, see https://jlmelville.github.io/uwot/abparams.html.
A string naming the distance metric (for example "cosine") used on the reducedDims when computing the UMAP. This argument is passed to metric in uwot::umap().
A vector containing the dimensions from the reducedDims object to use in computing the embedding.
A boolean value that indicates whether to z-score the reduced dimensions for each cell. This is useful for minimizing the contribution of strong biases (dominating early PCs) and lowly abundant populations. However, this may lead to stronger sample-specific biases since it is over-weighting latent PCs. If set to NULL, this will scale the dimensions based on the value of scaleDims when the reducedDims were originally created during dimensionality reduction. This idea was introduced by Timothy Stuart.
A numeric cutoff for the correlation of each dimension to the sequencing depth. If a dimension has a correlation to sequencing depth that is greater than the corCutOff, it will be excluded from analysis.
An integer specifying the number of cells to subsample and perform UMAP embedding on. The remaining cells that were not subsampled will be re-projected into the UMAP embedding using uwot::umap_transform(). This decreases run time and memory usage but can lower the overall quality of the UMAP embedding. Only recommended for an extremely large number of cells.
A numeric (0 to 1) describing the distance quantile in the subsampled cells (see sampleCells) to use to filter poor quality re-projections. This is necessary because significant undersampling produces many outliers.
A boolean value indicating whether or not to save the UMAP model in an RDS file for downstream usage such as projection of data into the UMAP embedding.
A boolean value that indicates whether to print UMAP output.
A number to be used as the seed for random number generation. It is recommended to keep track of the seed used so that you can reproduce results downstream.
A boolean value that indicates whether to overwrite the relevant data in the ArchRProject object if the embedding indicated by name already exists.
The number of threads to be used for parallel computing. The default is 1 because setting this too high can cause C stack usage errors.
Additional parameters to pass to uwot::umap().
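The subsample-then-re-project behaviour described for sampleCells can be illustrated by calling uwot directly. This is a minimal sketch, not ArchR's internal code: the matrix `mat`, the cell names, and the subsample size of 200 are hypothetical stand-ins for a reducedDims matrix, and no outlierQuantile filtering is applied. It assumes the uwot package is installed and does nothing otherwise.

```r
# Illustrative sketch only; not ArchR's internal implementation.
set.seed(1)

# Hypothetical stand-in for a reducedDims matrix: 500 cells x 10 dimensions.
mat <- matrix(rnorm(500 * 10), nrow = 500, ncol = 10)
rownames(mat) <- paste0("cell", seq_len(nrow(mat)))

# Subsample 200 cells on which the embedding is fitted.
idx <- sample(seq_len(nrow(mat)), size = 200)

if (requireNamespace("uwot", quietly = TRUE)) {
  # Fit the UMAP on the subsample and keep the model (ret_model = TRUE)
  # so that further points can be projected into it.
  model <- uwot::umap(
    X = mat[idx, , drop = FALSE],
    n_neighbors = 40,    # corresponds to nNeighbors
    min_dist = 0.4,      # corresponds to minDist
    metric = "cosine",   # corresponds to metric
    ret_model = TRUE,
    n_threads = 1,
    verbose = FALSE
  )

  # Re-project the cells that were not subsampled into the fitted embedding.
  projected <- uwot::umap_transform(
    X = mat[-idx, , drop = FALSE],
    model = model,
    n_threads = 1,
    verbose = FALSE
  )

  # Combine fitted and re-projected coordinates into one embedding.
  embedding <- rbind(model$embedding, projected)
}
```

Because the re-projected cells never influence the fit, their positions depend entirely on how well the subsample represents the full data set, which is why the subsample size matters.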
# Get Test ArchR Project
proj <- getTestProject()
# Add UMAP for Small Project
proj <- addUMAP(proj, force = TRUE)
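A second embedding with non-default parameters can be stored alongside the default one by giving it a different name. The sketch below is hedged: the name "UMAP_tight" and the parameter values are illustrative choices, not recommendations, and it assumes the ArchR package is installed and that getEmbedding() is available to read the stored coordinates back. It does nothing if ArchR is missing.

```r
# Hypothetical name for the additional embedding.
embeddingName <- "UMAP_tight"

if (requireNamespace("ArchR", quietly = TRUE)) {
  library(ArchR)

  # Get the test ArchR project.
  proj <- getTestProject()

  # Fewer neighbors and a smaller minimum distance than the defaults.
  proj <- addUMAP(
    ArchRProj = proj,
    reducedDims = "IterativeLSI",
    name = embeddingName,
    nNeighbors = 30,
    minDist = 0.1,
    metric = "cosine",
    seed = 1,      # record the seed so the result can be reproduced
    force = TRUE   # overwrite an existing embedding with the same name
  )

  # Read the stored coordinates back as a data frame.
  emb <- getEmbedding(ArchRProj = proj, embedding = embeddingName, returnDF = TRUE)
  head(emb)
}
```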