This function determines cell groups for pseudo-bulk replicates, summarizes the chosen matrix across those groups, and exports the result as a summarized experiment for an assay in an ArchRProject.

getPBGroupSE(
  ArchRProj = NULL,
  useMatrix = "GeneScoreMatrix",
  groupBy = "Clusters",
  divideN = TRUE,
  scaleTo = 10000,
  useLabels = TRUE,
  sampleLabels = "Sample",
  minCells = 40,
  maxCells = 500,
  minReplicates = 2,
  maxReplicates = 5,
  sampleRatio = 0.8,
  verbose = TRUE,
  threads = getArchRThreads(),
  logFile = createLogFile("getPBGroupSE")
)

Arguments

ArchRProj

An ArchRProject object.

useMatrix

The name of the matrix in the ArrowFiles. Use getAvailableMatrices to list the available options.

groupBy

The name of the column in cellColData to use for grouping cells together for summarizing.

divideN

A boolean describing whether to divide by the number of cells.

scaleTo

Depth normalize to this value if not NULL.

useLabels

A boolean value indicating whether to use sample labels to create sample-aware subgroupings during pseudo-bulk replicate generation.

sampleLabels

The name of a column in cellColData to use to identify samples. In most cases, this parameter should be left at its default ("Sample"), which uses the default sample labels stored in cellColData$Sample. Set it to a different column only if your individual Arrow files do not map to individual samples, so that your samples are identified accurately. This is the case in, for example, multiplexing applications where cells from different biological samples are mixed into the same reaction and demultiplexed based on a lipid barcode or genotype.

minCells

The minimum number of cells required in a given cell group to permit pseudo-bulk replicate generation.

maxCells

The maximum number of cells to use during pseudo-bulk replicate generation.

minReplicates

The minimum number of pseudo-bulk replicates to be generated.

maxReplicates

The maximum number of pseudo-bulk replicates to be generated.

sampleRatio

The fraction of the total cells that can be sampled to generate any given pseudo-bulk replicate.

verbose

A boolean specifying whether to print messages during computation.

threads

An integer specifying the number of threads to use for parallel processing.

logFile

The path to a file to be used for logging ArchR output.
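
To make the divideN and scaleTo arguments more concrete, the following base R sketch applies one plausible reading of them to a made-up feature-by-cell matrix. It is not ArchR's internal code: the toy data, the variable names (counts, groups, groupSums, groupMeans, groupScaled) and the order of the two steps are all assumptions chosen for illustration.

```r
# Toy feature-by-cell count matrix (3 features x 6 cells); values are made up.
counts <- matrix(
  c(1, 0, 2,  0, 1, 1,  3, 0, 0,  0, 2, 2,  1, 1, 0,  0, 0, 4),
  nrow = 3,
  dimnames = list(paste0("f", 1:3), paste0("cell", 1:6))
)

# Hypothetical group label per cell (stands in for a groupBy column).
groups <- c("C1", "C1", "C1", "C2", "C2", "C2")

# Sum counts over the cells of each group -> feature-by-group matrix.
groupSums <- t(rowsum(t(counts), group = groups))

# Assumed meaning of divideN = TRUE: divide each group column by its cell count.
nCells <- as.vector(table(groups)[colnames(groupSums)])
groupMeans <- sweep(groupSums, 2, nCells, "/")

# Assumed meaning of scaleTo = 10000: rescale each column to sum to scaleTo.
scaleTo <- 10000
groupScaled <- sweep(groupMeans, 2, colSums(groupMeans), "/") * scaleTo

print(round(groupScaled, 1))
```

In this sketch every group column sums to scaleTo after the last step, so the division by cell count only changes the result when scaleTo is NULL. Whether the same holds inside getPBGroupSE depends on its actual implementation.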

Examples


# Get Test ArchR Project
proj <- getTestProject()

# Get Group SE
se <- getPBGroupSE(proj, useMatrix = "PeakMatrix", groupBy = "Clusters")