14.2 ArchR Enrichment

In addition to testing peaks for motif enrichment, ArchR can test for enrichment of other, customizable feature sets. To facilitate this kind of data exploration, we have curated several feature sets that can be easily tested for enrichment in your peak regions of interest. We describe each of these curated feature sets below. This type of analysis was originally inspired by LOLA.

14.2.1 ENCODE TF Binding Sites

The ENCODE consortium has mapped TF binding sites (TFBSs) across a wide array of cell types and factors. We can use these TFBS collections to better understand our clusters. For example, when working with truly unknown cell types, these enrichments could help to elucidate cell identity. To enable analyses with these ENCODE TFBS feature sets, we simply call the addArchRAnnotations() function with collection = "EncodeTFBS". Similar to what happens when using addPeakAnnotations(), this creates a binarized representation of the overlap between all marker peaks and all ENCODE TFBSs.

projHeme5 <- addArchRAnnotations(ArchRProj = projHeme5, collection = "EncodeTFBS")
## ArchR logging to : ArchRLogs/ArchR-addArchRAnnotations-371b02290586d-Date-2022-12-23_Time-07-59-07.log
## If there is an issue, please report to github with logFile!
## Annotating Chromosomes
## 2022-12-23 07:59:09 :
##  Annotating Chr: chr1
## 2022-12-23 07:59:09 :
##  Annotating Chr: chr2
## 2022-12-23 07:59:09 :
##  Annotating Chr: chr3
## 2022-12-23 07:59:09 :
##  Annotating Chr: chr4
## 2022-12-23 07:59:10 :
##  Annotating Chr: chr5
## 2022-12-23 07:59:10 :
##  Annotating Chr: chr6
## 2022-12-23 07:59:10 :
##  Annotating Chr: chr7
## 2022-12-23 07:59:10 :
##  Annotating Chr: chr8
## 2022-12-23 07:59:11 :
##  Annotating Chr: chr9
## 2022-12-23 07:59:11 :
##  Annotating Chr: chr10
## 2022-12-23 07:59:11 :
##  Annotating Chr: chr11
## 2022-12-23 07:59:11 :
##  Annotating Chr: chr12
## 2022-12-23 07:59:11 :
##  Annotating Chr: chr13
## 2022-12-23 07:59:12 :
##  Annotating Chr: chr14
## 2022-12-23 07:59:12 :
##  Annotating Chr: chr15
## 2022-12-23 07:59:12 :
##  Annotating Chr: chr16
## 2022-12-23 07:59:12 :
##  Annotating Chr: chr17
## 2022-12-23 07:59:12 :
##  Annotating Chr: chr18
## 2022-12-23 07:59:12 :
##  Annotating Chr: chr19
## 2022-12-23 07:59:13 :
##  Annotating Chr: chr20
## 2022-12-23 07:59:13 :
##  Annotating Chr: chr21
## 2022-12-23 07:59:13 :
##  Annotating Chr: chr22
## 2022-12-23 07:59:13 :
##  Annotating Chr: chrX
## 2022-12-23 07:59:13 :
## 2022-12-23 07:59:15 : All Regions Overlap at least 1 peak!, 0.118 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-addArchRAnnotations-371b02290586d-Date-2022-12-23_Time-07-59-07.log
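
To make the idea of a binarized overlap representation concrete, here is a minimal base R sketch using made-up coordinates on a single chromosome. The peak names, region names (TF_A, TF_B, TF_C), and the overlapsAny() helper are hypothetical and are not part of ArchR; this illustrates the concept rather than ArchR's actual implementation.

```r
# Toy peaks and annotation regions on one chromosome (hypothetical coordinates).
peaks <- data.frame(
  name  = c("peak1", "peak2", "peak3", "peak4"),
  start = c(100, 500, 900, 1500),
  end   = c(200, 600, 1000, 1600)
)
regions <- list(
  TF_A = data.frame(start = c(150, 950), end = c(180, 1100)),
  TF_B = data.frame(start = 550, end = 560),
  TF_C = data.frame(start = 2000, end = 2100)
)

# Hypothetical helper: a peak overlaps a feature set when its closed interval
# intersects at least one region of that set.
overlapsAny <- function(pStart, pEnd, rStart, rEnd) {
  any(pStart <= rEnd & rStart <= pEnd)
}

# Build a logical peaks x feature-set matrix (TRUE = at least one overlap).
overlapMat <- sapply(regions, function(r) {
  mapply(overlapsAny, peaks$start, peaks$end,
         MoreArgs = list(rStart = r$start, rEnd = r$end))
})
rownames(overlapMat) <- peaks$name

print(overlapMat)
```

Each column records which peaks overlap a given feature set, regardless of how many regions they overlap. On real genomic data, this kind of overlap is typically computed with GenomicRanges::findOverlaps() rather than with a hand-written interval test.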

We can then test our marker peaks for enrichment of these ENCODE TFBSs using the peakAnnoEnrichment() function.

enrichEncode <- peakAnnoEnrichment(
    seMarker = markerPeaks,
    ArchRProj = projHeme5,
    peakAnnotation = "EncodeTFBS",
    cutOff = "FDR <= 0.1 & Log2FC >= 0.5"
  )
## ArchR logging to : ArchRLogs/ArchR-peakAnnoEnrichment-371b074a88753-Date-2022-12-23_Time-07-59-15.log
## If there is an issue, please report to github with logFile!
## 2022-12-23 07:59:21 : Computing Enrichments 1 of 11, 0.093 mins elapsed.
## 2022-12-23 07:59:21 : Computing Enrichments 2 of 11, 0.096 mins elapsed.
## 2022-12-23 07:59:21 : Computing Enrichments 3 of 11, 0.1 mins elapsed.
## 2022-12-23 07:59:21 : Computing Enrichments 4 of 11, 0.103 mins elapsed.
## 2022-12-23 07:59:21 : Computing Enrichments 5 of 11, 0.106 mins elapsed.
## 2022-12-23 07:59:22 : Computing Enrichments 6 of 11, 0.109 mins elapsed.
## 2022-12-23 07:59:22 : Computing Enrichments 7 of 11, 0.112 mins elapsed.
## 2022-12-23 07:59:22 : Computing Enrichments 8 of 11, 0.116 mins elapsed.
## 2022-12-23 07:59:22 : Computing Enrichments 9 of 11, 0.119 mins elapsed.
## 2022-12-23 07:59:22 : Computing Enrichments 10 of 11, 0.123 mins elapsed.
## 2022-12-23 07:59:23 : Computing Enrichments 11 of 11, 0.125 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-peakAnnoEnrichment-371b074a88753-Date-2022-12-23_Time-07-59-15.log

As before, peakAnnoEnrichment() returns a SummarizedExperiment object, here with one row per ENCODE TFBS feature set (689) and one column per cell group (11).

enrichEncode
## class: SummarizedExperiment 
## dim: 689 11 
## metadata(0):
## assays(10): mlog10Padj mlog10p ... CompareFrequency feature
## rownames(689): 1.CTCF-Dnd41... 2.EZH2_39-Dnd41... ...
##   688.CTCF-WERI_Rb_1... 689.CTCF-WI_38...
## rowData names(0):
## colnames(11): B CD4.M ... PreB Progenitor
## colData names(0):
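
The assay names in this object (for example mlog10p and CompareFrequency) suggest an overlap-count test. The sketch below shows how such an enrichment could be computed for a single feature set and a single cell group, assuming a one-sided hypergeometric test. That choice of test is an assumption about the general approach, not a statement of ArchR's exact implementation, and all counts and variable names are made up for illustration.

```r
# Toy counts (hypothetical values, not taken from the tutorial data).
nBackground    <- 10000  # all peaks in the peak set
nBackgroundHit <- 800    # background peaks overlapping the feature set
nCompare       <- 300    # marker peaks for one cell group
nCompareHit    <- 60     # marker peaks overlapping the feature set

# Fold enrichment: overlap rate among marker peaks relative to all peaks.
enrichment <- (nCompareHit / nCompare) / (nBackgroundHit / nBackground)

# One-sided hypergeometric p-value for seeing at least nCompareHit overlaps
# when drawing nCompare peaks at random from the background.
pval <- phyper(
  q = nCompareHit - 1,
  m = nBackgroundHit,
  n = nBackground - nBackgroundHit,
  k = nCompare,
  lower.tail = FALSE
)

# Report significance on the -log10 scale, analogous to an "mlog10p" value.
mlog10p <- -log10(pval)

print(c(enrichment = enrichment, mlog10p = mlog10p))
```

In this toy case 20% of the marker peaks overlap the feature set compared with 8% of all peaks, a 2.5-fold enrichment. Repeating the calculation for every feature set and every cell group would yield a features-by-groups matrix like the one stored in the SummarizedExperiment.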

We can create a heatmap from these enrichment results using the plotEnrichHeatmap() function.

heatmapEncode <- plotEnrichHeatmap(enrichEncode, n = 7, transpose = TRUE)
## ArchR logging to : ArchRLogs/ArchR-plotEnrichHeatmap-371b051ab0dfc-Date-2022-12-23_Time-07-59-23.log
## If there is an issue, please report to github with logFile!
## Adding Annotations..
## Preparing Main Heatmap..
## 'magick' package is suggested to install to give better rasterization.
## 
## Set `ht_opt$message = FALSE` to turn off this message.

We then plot this heatmap using ComplexHeatmap::draw().

ComplexHeatmap::draw(heatmapEncode, heatmap_legend_side = "bot", annotation_legend_side = "bot")
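
The n = 7 argument passed to plotEnrichHeatmap() above limits the heatmap to the top enriched features per cell group. A minimal base R sketch of that kind of selection is shown below on a small made-up matrix of -log10 adjusted p-values. The feature names, values, and the exact selection logic are assumptions for illustration and may differ from what plotEnrichHeatmap() does internally.

```r
# Toy features x cell-groups matrix of -log10 adjusted p-values (hypothetical).
mat <- matrix(
  c(12,  0,  1,
     9,  2,  0,
     0, 15,  3,
     1, 11,  0,
     0,  0, 20,
     2,  1,  8),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("feature", 1:6), c("B", "PreB", "Progenitor"))
)

n <- 2  # number of top features to keep per cell group

# For each cell group, take the names of the n most enriched features.
topPerGroup <- lapply(colnames(mat), function(g) {
  names(sort(mat[, g], decreasing = TRUE))[seq_len(n)]
})
names(topPerGroup) <- colnames(mat)

# The rows to display are the union of the per-group selections.
keep <- unique(unlist(topPerGroup))
subMat <- mat[keep, , drop = FALSE]

print(topPerGroup)
print(subMat)
```

With real results, a matrix like this could be pulled from the enrichment object with SummarizedExperiment::assays(enrichEncode)$mlog10Padj. Because the selection is made per group and then combined, the heatmap can contain up to n features for each cell group.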

To save an editable vectorized version of this plot, we use the plotPDF() function.

plotPDF(heatmapEncode, name = "EncodeTFBS-Enriched-Marker-Heatmap", width = 8, height = 6, ArchRProj = projHeme5, addDOC = FALSE)
## Plotting ComplexHeatmap!

14.2.2 Bulk ATAC-seq

Similar to the curated set of ENCODE TF binding sites, we have also curated peak calls from bulk ATAC-seq experiments that can be used for overlap enrichment testing. We access these bulk ATAC-seq peak sets by setting collection = "ATAC".

projHeme5 <- addArchRAnnotations(ArchRProj = projHeme5, collection = "ATAC")
## ArchR logging to : ArchRLogs/ArchR-addArchRAnnotations-371b0331322e8-Date-2022-12-23_Time-07-59-28.log
## If there is an issue, please report to github with logFile!
## Annotating Chromosomes
## 2022-12-23 07:59:29 :
##  Annotating Chr: chr1
## 2022-12-23 07:59:29 :
##  Annotating Chr: chr2
## 2022-12-23 07:59:29 :
##  Annotating Chr: chr3
## 2022-12-23 07:59:29 :
##  Annotating Chr: chr4
## 2022-12-23 07:59:30 :
##  Annotating Chr: chr5
## 2022-12-23 07:59:30 :
##  Annotating Chr: chr6
## 2022-12-23 07:59:30 :
##  Annotating Chr: chr7
## 2022-12-23 07:59:30 :
##  Annotating Chr: chr8
## 2022-12-23 07:59:30 :
##  Annotating Chr: chr9
## 2022-12-23 07:59:30 :
##  Annotating Chr: chr10
## 2022-12-23 07:59:30 :
##  Annotating Chr: chr11
## 2022-12-23 07:59:31 :
##  Annotating Chr: chr12
## 2022-12-23 07:59:31 :
##  Annotating Chr: chr13
## 2022-12-23 07:59:31 :
##  Annotating Chr: chr14
## 2022-12-23 07:59:31 :
##  Annotating Chr: chr15
## 2022-12-23 07:59:31 :
##  Annotating Chr: chr16
## 2022-12-23 07:59:31 :
##  Annotating Chr: chr17
## 2022-12-23 07:59:31 :
##  Annotating Chr: chr18
## 2022-12-23 07:59:31 :
##  Annotating Chr: chr19
## 2022-12-23 07:59:32 :
##  Annotating Chr: chr20
## 2022-12-23 07:59:32 :
##  Annotating Chr: chr21
## 2022-12-23 07:59:32 :
##  Annotating Chr: chr22
## 2022-12-23 07:59:32 :
##  Annotating Chr: chrX
## 2022-12-23 07:59:32 :
## 2022-12-23 07:59:33 : All Regions Overlap at least 1 peak!, 0.083 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-addArchRAnnotations-371b0331322e8-Date-2022-12-23_Time-07-59-28.log

We then test our marker peaks for enrichment of these bulk ATAC-seq peaks by setting peakAnnotation = "ATAC".

enrichATAC <- peakAnnoEnrichment(
    seMarker = markerPeaks,
    ArchRProj = projHeme5,
    peakAnnotation = "ATAC",
    cutOff = "FDR <= 0.1 & Log2FC >= 0.5"
  )
## ArchR logging to : ArchRLogs/ArchR-peakAnnoEnrichment-371b038f7e6c-Date-2022-12-23_Time-07-59-33.log
## If there is an issue, please report to github with logFile!
## 2022-12-23 07:59:38 : Computing Enrichments 1 of 11, 0.084 mins elapsed.
## 2022-12-23 07:59:38 : Computing Enrichments 2 of 11, 0.086 mins elapsed.
## 2022-12-23 07:59:38 : Computing Enrichments 3 of 11, 0.088 mins elapsed.
## 2022-12-23 07:59:39 : Computing Enrichments 4 of 11, 0.09 mins elapsed.
## 2022-12-23 07:59:39 : Computing Enrichments 5 of 11, 0.098 mins elapsed.
## 2022-12-23 07:59:39 : Computing Enrichments 6 of 11, 0.1 mins elapsed.
## 2022-12-23 07:59:39 : Computing Enrichments 7 of 11, 0.101 mins elapsed.
## 2022-12-23 07:59:39 : Computing Enrichments 8 of 11, 0.103 mins elapsed.
## 2022-12-23 07:59:39 : Computing Enrichments 9 of 11, 0.104 mins elapsed.
## 2022-12-23 07:59:40 : Computing Enrichments 10 of 11, 0.106 mins elapsed.
## 2022-12-23 07:59:40 : Computing Enrichments 11 of 11, 0.108 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-peakAnnoEnrichment-371b038f7e6c-Date-2022-12-23_Time-07-59-33.log

As before, the output is a SummarizedExperiment object containing the enrichment results.

enrichATAC
## class: SummarizedExperiment 
## dim: 96 11 
## metadata(0):
## assays(10): mlog10Padj mlog10p ... CompareFrequency feature
## rownames(96): Brain_Astrocytes Brain_Excitatory_neurons ... Heme_MPP
##   Heme_NK
## rowData names(0):
## colnames(11): B CD4.M ... PreB Progenitor
## colData names(0):

We can create an enrichment heatmap from this SummarizedExperiment using plotEnrichHeatmap().

heatmapATAC <- plotEnrichHeatmap(enrichATAC, n = 7, transpose = TRUE)
## ArchR logging to : ArchRLogs/ArchR-plotEnrichHeatmap-371b02ab3f014-Date-2022-12-23_Time-07-59-40.log
## If there is an issue, please report to github with logFile!
## Adding Annotations..
## Preparing Main Heatmap..
## 'magick' package is suggested to install to give better rasterization.
## 
## Set `ht_opt$message = FALSE` to turn off this message.

We then plot this heatmap using ComplexHeatmap::draw().

ComplexHeatmap::draw(heatmapATAC, heatmap_legend_side = "bot", annotation_legend_side = "bot")

To save an editable vectorized version of this plot, we use the plotPDF() function.

plotPDF(heatmapATAC, name = "ATAC-Enriched-Marker-Heatmap", width = 8, height = 6, ArchRProj = projHeme5, addDOC = FALSE)
## Plotting ComplexHeatmap!

14.2.3 CODEX TFBS

The same type of analyses can be performed for CODEX TFBSs by setting collection = "Codex".

projHeme5 <- addArchRAnnotations(ArchRProj = projHeme5, collection = "Codex")
## ArchR logging to : ArchRLogs/ArchR-addArchRAnnotations-371b03bf780a8-Date-2022-12-23_Time-07-59-45.log
## If there is an issue, please report to github with logFile!
## Annotating Chromosomes
## 2022-12-23 07:59:46 :
##  Annotating Chr: chr1
## 2022-12-23 07:59:46 :
##  Annotating Chr: chr2
## 2022-12-23 07:59:46 :
##  Annotating Chr: chr3
## 2022-12-23 07:59:46 :
##  Annotating Chr: chr4
## 2022-12-23 07:59:46 :
##  Annotating Chr: chr5
## 2022-12-23 07:59:46 :
##  Annotating Chr: chr6
## 2022-12-23 07:59:47 :
##  Annotating Chr: chr7
## 2022-12-23 07:59:47 :
##  Annotating Chr: chr8
## 2022-12-23 07:59:47 :
##  Annotating Chr: chr9
## 2022-12-23 07:59:47 :
##  Annotating Chr: chr10
## 2022-12-23 07:59:47 :
##  Annotating Chr: chr11
## 2022-12-23 07:59:47 :
##  Annotating Chr: chr12
## 2022-12-23 07:59:47 :
##  Annotating Chr: chr13
## 2022-12-23 07:59:47 :
##  Annotating Chr: chr14
## 2022-12-23 07:59:47 :
##  Annotating Chr: chr15
## 2022-12-23 07:59:48 :
##  Annotating Chr: chr16
## 2022-12-23 07:59:48 :
##  Annotating Chr: chr17
## 2022-12-23 07:59:48 :
##  Annotating Chr: chr18
## 2022-12-23 07:59:48 :
##  Annotating Chr: chr19
## 2022-12-23 07:59:48 :
##  Annotating Chr: chr20
## 2022-12-23 07:59:48 :
##  Annotating Chr: chr21
## 2022-12-23 07:59:48 :
##  Annotating Chr: chr22
## 2022-12-23 07:59:48 :
##  Annotating Chr: chrX
## 2022-12-23 07:59:48 :
## 2022-12-23 07:59:49 : All Regions Overlap at least 1 peak!, 0.066 mins elapsed.
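## ArchR logging successful to : ArchRLogs/ArchR-addArchRAnnotations-371b03bf780a8-Date-2022-12-23_Time-07-59-45.log

We then test our marker peaks for enrichment of these CODEX TFBSs by setting peakAnnotation = "Codex".

enrichCodex <- peakAnnoEnrichment(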
    seMarker = markerPeaks,
    ArchRProj = projHeme5,
    peakAnnotation = "Codex",
    cutOff = "FDR <= 0.1 & Log2FC >= 0.5"
  )
## ArchR logging to : ArchRLogs/ArchR-peakAnnoEnrichment-371b04237471d-Date-2022-12-23_Time-07-59-49.log
## If there is an issue, please report to github with logFile!
## 2022-12-23 07:59:54 : Computing Enrichments 1 of 11, 0.083 mins elapsed.
## 2022-12-23 07:59:54 : Computing Enrichments 2 of 11, 0.084 mins elapsed.
## 2022-12-23 07:59:54 : Computing Enrichments 3 of 11, 0.085 mins elapsed.
## 2022-12-23 07:59:54 : Computing Enrichments 4 of 11, 0.087 mins elapsed.
## 2022-12-23 07:59:55 : Computing Enrichments 5 of 11, 0.094 mins elapsed.
## 2022-12-23 07:59:55 : Computing Enrichments 6 of 11, 0.095 mins elapsed.
## 2022-12-23 07:59:55 : Computing Enrichments 7 of 11, 0.097 mins elapsed.
## 2022-12-23 07:59:55 : Computing Enrichments 8 of 11, 0.098 mins elapsed.
## 2022-12-23 07:59:55 : Computing Enrichments 9 of 11, 0.099 mins elapsed.
## 2022-12-23 07:59:55 : Computing Enrichments 10 of 11, 0.1 mins elapsed.
## 2022-12-23 07:59:55 : Computing Enrichments 11 of 11, 0.101 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-peakAnnoEnrichment-371b04237471d-Date-2022-12-23_Time-07-59-49.log

As before, the output is a SummarizedExperiment object containing the enrichment results.

enrichCodex
## class: SummarizedExperiment 
## dim: 189 11 
## metadata(0):
## assays(10): mlog10Padj mlog10p ... CompareFrequency feature
## rownames(189): 1.STAT5-No_drug_(DMSO)... 2.RUNX3-GM12878_cell_fr... ...
##   188.TP53-codex_Embryonic... 189.TP53-codex_Embryonic...
## rowData names(0):
## colnames(11): B CD4.M ... PreB Progenitor
## colData names(0):

We can create an enrichment heatmap from this SummarizedExperiment using plotEnrichHeatmap().

heatmapCodex <- plotEnrichHeatmap(enrichCodex, n = 7, transpose = TRUE)
## ArchR logging to : ArchRLogs/ArchR-plotEnrichHeatmap-371b043a51484-Date-2022-12-23_Time-07-59-56.log
## If there is an issue, please report to github with logFile!
## Adding Annotations..
## Preparing Main Heatmap..
## 'magick' package is suggested to install to give better rasterization.
## 
## Set `ht_opt$message = FALSE` to turn off this message.

We then plot this heatmap using ComplexHeatmap::draw().

ComplexHeatmap::draw(heatmapCodex, heatmap_legend_side = "bot", annotation_legend_side = "bot")

To save an editable vectorized version of this plot, we use the plotPDF() function.

plotPDF(heatmapCodex, name = "Codex-Enriched-Marker-Heatmap", width = 8, height = 6, ArchRProj = projHeme5, addDOC = FALSE)
## Plotting ComplexHeatmap!
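
Because the three collections in this section follow the same annotate, enrich, and plot steps, the workflow can be wrapped in a small helper. The sketch below is one possible way to do this. The function name runCollectionEnrichment() and its argument names are hypothetical and not part of ArchR. It uses only the ArchR calls shown in this section, and it assumes that the peakAnnotation name matches the collection name, as it does for "EncodeTFBS", "ATAC", and "Codex" above.

```r
# Hypothetical helper (not part of ArchR): runs the annotate -> enrich ->
# heatmap steps from this section for a single curated collection.
runCollectionEnrichment <- function(proj, markers, collection,
                                    cutOff = "FDR <= 0.1 & Log2FC >= 0.5",
                                    n = 7) {
  # Add the curated annotation collection to the project.
  proj <- ArchR::addArchRAnnotations(ArchRProj = proj, collection = collection)

  # Test the marker peaks for enrichment of each feature set in the collection.
  enrich <- ArchR::peakAnnoEnrichment(
    seMarker = markers,
    ArchRProj = proj,
    peakAnnotation = collection,
    cutOff = cutOff
  )

  # Build the enrichment heatmap object (drawn separately by the caller).
  heatmap <- ArchR::plotEnrichHeatmap(enrich, n = n, transpose = TRUE)

  # Return the updated project together with the results.
  list(proj = proj, enrich = enrich, heatmap = heatmap)
}

# Example usage (requires ArchR and the projHeme5 / markerPeaks objects
# created earlier in this tutorial), left commented out here:
# resCodex  <- runCollectionEnrichment(projHeme5, markerPeaks, "Codex")
# projHeme5 <- resCodex$proj
# ComplexHeatmap::draw(resCodex$heatmap, heatmap_legend_side = "bot",
#                      annotation_legend_side = "bot")
```

The helper returns the updated project along with the results because addArchRAnnotations() returns a modified ArchRProject, which needs to be reassigned for the annotations to persist.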