addGroupCoverages.Rd
This function will merge cells within each designated cell group for the generation of pseudo-bulk replicates and then merge these replicates into a single insertion coverage file.
addGroupCoverages(
  ArchRProj = NULL,
  groupBy = "Clusters",
  useLabels = TRUE,
  sampleLabels = "Sample",
  minCells = 40,
  maxCells = 500,
  maxFragments = 25 * 10^6,
  minReplicates = 2,
  maxReplicates = 5,
  sampleRatio = 0.8,
  excludeChr = NULL,
  kmerLength = 6,
  threads = getArchRThreads(),
  returnGroups = FALSE,
  parallelParam = NULL,
  force = FALSE,
  verbose = TRUE,
  logFile = createLogFile("addGroupCoverages")
)
ArchRProj: An ArchRProject object.
groupBy: The name of the column in cellColData to use for grouping multiple cells together prior to generation of the insertion coverage file.
useLabels: A boolean value indicating whether to use sample labels to create sample-aware subgroupings during pseudo-bulk replicate generation.
sampleLabels: The name of a column in cellColData to use to identify samples. In most cases, this parameter should be left at its default ("Sample"), which uses the sample labels stored in cellColData$Sample. However, if your individual Arrow files do not map to individual samples, you should set this parameter so that your samples are identified accurately. This is the case, for example, in multiplexing applications where cells from different biological samples are mixed into the same reaction and demultiplexed based on a lipid barcode or genotype.
minCells: The minimum number of cells required in a given cell group to permit insertion coverage file generation.
maxCells: The maximum number of cells to use during insertion coverage file generation.
maxFragments: The maximum number of fragments per cell group to use in insertion coverage file generation. This prevents the generation of excessively large files, which would increase memory requirements.
minReplicates: The minimum number of pseudo-bulk replicates to be generated.
maxReplicates: The maximum number of pseudo-bulk replicates to be generated.
sampleRatio: The fraction of the total cells that can be sampled to generate any given pseudo-bulk replicate.
excludeChr: A character vector containing the seqnames of the chromosomes that should be excluded from this analysis.
kmerLength: The length of the k-mer used for estimating Tn5 bias.
threads: The number of threads to be used for parallel computing.
returnGroups: A boolean value that indicates whether to return sample-guided cell groupings without creating coverages. This is used mainly in addReproduciblePeakSet() when MACS2 is not being used to call peaks and peaks are instead called from a TileMatrix (peakMethod = "Tiles").
parallelParam: A list of parameters to be passed for BiocParallel/batchtools parallel computing.
force: A boolean value that indicates whether or not to skip validation and overwrite the relevant data in the ArchRProject object if insertion coverage / pseudo-bulk replicate information already exists.
verbose: A boolean value that determines whether standard output includes verbose sections.
logFile: The path to a file to be used for logging ArchR output.
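Because a cell group needs at least minCells cells to permit insertion coverage file generation, it can help to tabulate group sizes before calling the function. The following is a minimal base-R sketch and not part of ArchR: cellMeta is a made-up stand-in for cellColData, and the cluster and sample names are hypothetical.

```r
# Toy stand-in for cellColData: one row per cell (all values are made up).
cellMeta <- data.frame(
  Clusters = rep(c("C1", "C2", "C3"), times = c(120, 45, 12)),
  Sample = rep(c("S1", "S2"), length.out = 177),
  stringsAsFactors = FALSE
)

minCells <- 40  # same value as the default shown in the usage

# Number of cells in each group of the grouping column.
groupSizes <- table(cellMeta$Clusters)

# Groups that fall below the minCells threshold.
smallGroups <- names(groupSizes)[as.vector(groupSizes) < minCells]

# Cells per group and sample, which matters when useLabels = TRUE.
perSample <- table(cellMeta$Clusters, cellMeta$Sample)

print(groupSizes)
print(smallGroups)
print(perSample)
```

With a real project, the same tabulation could presumably be run on as.data.frame(getCellColData(proj)) instead of the toy data frame.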
# Get Test ArchR Project
proj <- getTestProject()
# Add Group Coverages
proj <- addGroupCoverages(proj, force = TRUE)
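To preview the cell groupings before any coverage files are written, returnGroups = TRUE can be used. The sketch below assumes ArchR is installed and that the test project can be loaded; the structure of the returned object is not described on this page, so it is only inspected with str(). The variable groupColumn is introduced here for illustration.

```r
groupColumn <- "Clusters"  # column in cellColData used for grouping (the default)

if (requireNamespace("ArchR", quietly = TRUE)) {
  library(ArchR)
  proj <- getTestProject()

  # Return the sample-guided cell groupings only; no coverages are created.
  groups <- addGroupCoverages(
    ArchRProj = proj,
    groupBy = groupColumn,
    returnGroups = TRUE,
    force = TRUE
  )
  str(groups, max.level = 2)

  # Create the coverages, spelling out the replicate-related defaults.
  proj <- addGroupCoverages(
    ArchRProj = proj,
    groupBy = groupColumn,
    minCells = 40,
    maxCells = 500,
    minReplicates = 2,
    maxReplicates = 5,
    sampleRatio = 0.8,
    force = TRUE
  )
}
```

Writing the defaults out explicitly changes nothing about the result, but it makes the replicate settings easy to adjust for small or unevenly sized groups.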