14.2 ArchR Enrichment
In addition to testing peaks for enrichment of motifs, ArchR also enables the determination of more customizable enrichments. To facilitate this level of data exploration, we have curated a few different feature sets that can be easily tested for enrichment in your peak regions of interest. We describe each of those curated feature sets below. This type of analysis was originally inspired by LOLA.
14.2.1 ENCODE TF Binding Sites
The ENCODE consortium has mapped TF binding sites (TFBSs) across a wide array of cell types and factors. We can use these TFBS collections to better understand our clusters. For example, in the context of truly unknown cell types, these enrichments could help to elucidate cell identity. To enable analyses with these ENCODE TFBS feature sets, we simply call the addArchRAnnotations() function with collection = "EncodeTFBS". Similar to what happens when using addPeakAnnotations(), this creates a binarized representation of the overlap between all marker peaks and all ENCODE TFBSs.
projHeme5 <- addArchRAnnotations(ArchRProj = projHeme5, collection = "EncodeTFBS")
## ArchR logging to : ArchRLogs/ArchR-addArchRAnnotations-917905f15-Date-2025-02-06_Time-02-12-19.230764.log
## If there is an issue, please report to github with logFile!
## Annotation ArchR-Hg19-v1.Anno does not exist! Downloading..
## Annotating Chromosomes
## 2025-02-06 02:12:23.450598 :
## Annotating Chr: chr1
## 2025-02-06 02:12:23.45558 :
## Annotating Chr: chr2
## 2025-02-06 02:12:23.683719 :
## Annotating Chr: chr3
## 2025-02-06 02:12:23.852107 :
## Annotating Chr: chr4
## 2025-02-06 02:12:23.993586 :
## Annotating Chr: chr5
## 2025-02-06 02:12:24.100873 :
## Annotating Chr: chr6
## 2025-02-06 02:12:24.224218 :
## Annotating Chr: chr7
## 2025-02-06 02:12:24.373781 :
## Annotating Chr: chr8
## 2025-02-06 02:12:24.505743 :
## Annotating Chr: chr9
## 2025-02-06 02:12:24.619775 :
## Annotating Chr: chr10
## 2025-02-06 02:12:24.730418 :
## Annotating Chr: chr11
## 2025-02-06 02:12:24.850021 :
## Annotating Chr: chr12
## 2025-02-06 02:12:25.0099 :
## Annotating Chr: chr13
## 2025-02-06 02:12:25.170325 :
## Annotating Chr: chr14
## 2025-02-06 02:12:25.256685 :
## Annotating Chr: chr15
## 2025-02-06 02:12:25.367161 :
## Annotating Chr: chr16
## 2025-02-06 02:12:25.482679 :
## Annotating Chr: chr17
## 2025-02-06 02:12:25.611102 :
## Annotating Chr: chr18
## 2025-02-06 02:12:25.864134 :
## Annotating Chr: chr19
## 2025-02-06 02:12:25.941038 :
## Annotating Chr: chr20
## 2025-02-06 02:12:26.069982 :
## Annotating Chr: chr21
## 2025-02-06 02:12:26.168135 :
## Annotating Chr: chr22
## 2025-02-06 02:12:26.235897 :
## Annotating Chr: chrX
## 2025-02-06 02:12:26.315668 :
## 2025-02-06 02:12:28.295146 : All Regions Overlap at least 1 peak!, 0.151 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-addArchRAnnotations-917905f15-Date-2025-02-06_Time-02-12-19.230764.log
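To make the idea of a "binarized representation of the overlap" concrete, here is a minimal base R sketch on toy data. The peak coordinates, the region sets TF_A and TF_B, and the helper overlapsAny() are all hypothetical and only illustrate the concept; this is not how ArchR stores or computes its annotation matches internally.

```r
# Toy peak set: one row per peak (chromosome, start, end).
peaks <- data.frame(
  chr   = c("chr1", "chr1", "chr2", "chr2"),
  start = c(100, 500, 100, 900),
  end   = c(200, 600, 200, 1000)
)

# Toy annotation collection: each element is one region set
# (for example, the binding sites of one TF in one cell type).
regionSets <- list(
  TF_A = data.frame(chr = c("chr1", "chr2"), start = c(150, 950), end = c(160, 960)),
  TF_B = data.frame(chr = "chr1", start = 550, end = 700)
)

# Hypothetical helper: TRUE if a peak overlaps at least one region
# of the set on the same chromosome.
overlapsAny <- function(peaks, regions) {
  vapply(seq_len(nrow(peaks)), function(i) {
    any(
      regions$chr == peaks$chr[i] &
        regions$start <= peaks$end[i] &
        regions$end >= peaks$start[i]
    )
  }, logical(1))
}

# Peaks x region sets matrix of 0/1 overlap indicators.
overlapMatrix <- vapply(
  regionSets,
  function(r) as.integer(overlapsAny(peaks, r)),
  integer(nrow(peaks))
)
rownames(overlapMatrix) <- paste0(peaks$chr, ":", peaks$start, "-", peaks$end)
overlapMatrix
```

Each column of such a matrix marks which peaks overlap one feature set, which is the information an overlap enrichment test needs.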
We can then test for enrichment of these ENCODE TFBSs with our peak set using the peakAnnoEnrichment() function.
enrichEncode <- peakAnnoEnrichment(
    seMarker = markerPeaks,
    ArchRProj = projHeme5,
    peakAnnotation = "EncodeTFBS",
    cutOff = "FDR <= 0.1 & Log2FC >= 0.5"
)
## ArchR logging to : ArchRLogs/ArchR-peakAnnoEnrichment-9435e37c9-Date-2025-02-06_Time-02-12-30.285966.log
## If there is an issue, please report to github with logFile!
## 2025-02-06 02:12:33.503953 : Computing Enrichments 1 of 11, 0.054 mins elapsed.
## 2025-02-06 02:12:33.582179 : Computing Enrichments 2 of 11, 0.055 mins elapsed.
## 2025-02-06 02:12:33.666145 : Computing Enrichments 3 of 11, 0.056 mins elapsed.
## 2025-02-06 02:12:33.749031 : Computing Enrichments 4 of 11, 0.058 mins elapsed.
## 2025-02-06 02:12:33.829075 : Computing Enrichments 5 of 11, 0.059 mins elapsed.
## 2025-02-06 02:12:33.91125 : Computing Enrichments 6 of 11, 0.06 mins elapsed.
## 2025-02-06 02:12:33.987118 : Computing Enrichments 7 of 11, 0.062 mins elapsed.
## 2025-02-06 02:12:34.066629 : Computing Enrichments 8 of 11, 0.063 mins elapsed.
## 2025-02-06 02:12:34.149615 : Computing Enrichments 9 of 11, 0.064 mins elapsed.
## 2025-02-06 02:12:34.227038 : Computing Enrichments 10 of 11, 0.066 mins elapsed.
## 2025-02-06 02:12:34.309604 : Computing Enrichments 11 of 11, 0.067 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-peakAnnoEnrichment-9435e37c9-Date-2025-02-06_Time-02-12-30.285966.log
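For intuition about what an overlap enrichment test measures, the sketch below runs a hypergeometric test on toy counts in base R. The choice of a hypergeometric test is an assumption made for illustration, as is every number used; the exact statistic computed inside peakAnnoEnrichment() may differ.

```r
# Toy setup: 1000 peaks in total, the first 100 are marker peaks for one cell group.
nPeaks <- 1000
isMarker <- rep(c(TRUE, FALSE), times = c(100, 900))

# Binary overlap vector for one feature set: 40 of the marker peaks and
# 60 of the remaining peaks overlap it (values chosen for illustration).
overlapsFeature <- c(
  rep(c(TRUE, FALSE), times = c(40, 60)),
  rep(c(TRUE, FALSE), times = c(60, 840))
)

nMarker <- sum(isMarker)                           # marker peaks tested
nOverlapAll <- sum(overlapsFeature)                # all peaks overlapping the feature
nOverlapMarker <- sum(isMarker & overlapsFeature)  # marker peaks overlapping the feature

# Fold enrichment: overlap frequency among marker peaks relative to all peaks.
enrichment <- (nOverlapMarker / nMarker) / (nOverlapAll / nPeaks)

# Probability of seeing at least nOverlapMarker overlaps when drawing
# nMarker peaks at random from the full peak set.
pValue <- phyper(
  nOverlapMarker - 1,
  nOverlapAll,
  nPeaks - nOverlapAll,
  nMarker,
  lower.tail = FALSE
)

enrichment
pValue
```

With many feature sets tested at once, p-values like this one would normally be corrected for multiple testing, for example with p.adjust().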
As previously, peakAnnoEnrichment() returns a SummarizedExperiment object.
enrichEncode
## class: SummarizedExperiment
## dim: 689 11
## metadata(0):
## assays(10): mlog10Padj mlog10p ... CompareFrequency feature
## rownames(689): 1.CTCF-Dnd41... 2.EZH2_39-Dnd41... ...
## 688.CTCF-WERI_Rb_1... 689.CTCF-WI_38...
## rowData names(0):
## colnames(11): B CD4.M ... PreB Progenitor
## colData names(0):
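The printed summary shows that the object holds several assays, including mlog10Padj, with features as rows and cell groups as columns. A quick way to explore such a matrix is to rank features within one cell group. The sketch below uses a toy matrix so that it runs on its own; the feature names, the values, and the helper topFeatures() are hypothetical.

```r
# Toy stand-in for one enrichment assay: rows are features, columns are cell groups.
# With the real object, a comparable matrix could likely be obtained with
#   as.matrix(SummarizedExperiment::assay(enrichEncode, "mlog10Padj"))
mlog10Padj <- matrix(
  c(12.0, 0.5,
     3.2, 8.1,
     0.0, 1.3,
     7.7, 0.2),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("CTCF-A", "EZH2-A", "GATA1-B", "PAX5-C"),
    c("B", "Progenitor")
  )
)

# Hypothetical helper: the n features with the highest value for one cell group.
topFeatures <- function(mat, group, n = 3) {
  ranked <- sort(mat[, group], decreasing = TRUE)
  head(ranked, n)
}

topFeatures(mlog10Padj, "B", n = 2)
```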
We can create a heatmap from these enrichment results using the plotEnrichHeatmap() function.
heatmapEncode <- plotEnrichHeatmap(enrichEncode, n = 7, transpose = TRUE)
## ArchR logging to : ArchRLogs/ArchR-plotEnrichHeatmap-94f7a8b86-Date-2025-02-06_Time-02-12-34.515397.log
## If there is an issue, please report to github with logFile!
## Adding Annotations..
## Preparing Main Heatmap..
## 'magick' package is suggested to install to give better rasterization.
##
## Set `ht_opt$message = FALSE` to turn off this message.
And then plot this heatmap using ComplexHeatmap::draw().
To save an editable vectorized version of this plot, we use the plotPDF() function.
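The two steps just described could be wrapped as follows. This is a sketch rather than the exact calls used to produce the figures in this chapter: the helper drawAndSaveEnrichHeatmap() is hypothetical, and the legend placement, PDF dimensions, and file name are assumptions.

```r
# Hypothetical helper (not part of ArchR): draw an enrichment heatmap and
# save a vectorized PDF copy via the project's plotting utilities.
drawAndSaveEnrichHeatmap <- function(heatmap, proj, name, width = 8, height = 6) {
  # Draw to the active graphics device with both legends at the bottom.
  ComplexHeatmap::draw(
    heatmap,
    heatmap_legend_side = "bottom",
    annotation_legend_side = "bottom"
  )
  # Write an editable PDF; addDOC = FALSE keeps the creation date
  # out of the file name.
  ArchR::plotPDF(
    heatmap,
    name = name,
    width = width,
    height = height,
    ArchRProj = proj,
    addDOC = FALSE
  )
  invisible(heatmap)
}

# Usage with the objects created above (the file name is an arbitrary choice):
# drawAndSaveEnrichHeatmap(heatmapEncode, projHeme5, "EncodeTFBS-Enriched-Marker-Heatmap")
```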
14.2.2 Bulk ATAC-seq
Similar to the curated set of ENCODE TF binding sites, we have also curated peak calls from bulk ATAC-seq experiments that can be used for overlap enrichment testing. We access these bulk ATAC-seq peak sets by setting collection = "ATAC".
projHeme5 <- addArchRAnnotations(ArchRProj = projHeme5, collection = "ATAC")
## ArchR logging to : ArchRLogs/ArchR-addArchRAnnotations-9619b4756-Date-2025-02-06_Time-02-12-38.58154.log
## If there is an issue, please report to github with logFile!
## Annotating Chromosomes
## 2025-02-06 02:12:38.84977 :
## Annotating Chr: chr1
## 2025-02-06 02:12:38.854547 :
## Annotating Chr: chr2
## 2025-02-06 02:12:38.968705 :
## Annotating Chr: chr3
## 2025-02-06 02:12:39.069345 :
## Annotating Chr: chr4
## 2025-02-06 02:12:39.161145 :
## Annotating Chr: chr5
## 2025-02-06 02:12:39.249762 :
## Annotating Chr: chr6
## 2025-02-06 02:12:39.333078 :
## Annotating Chr: chr7
## 2025-02-06 02:12:39.42061 :
## Annotating Chr: chr8
## 2025-02-06 02:12:39.500039 :
## Annotating Chr: chr9
## 2025-02-06 02:12:39.580578 :
## Annotating Chr: chr10
## 2025-02-06 02:12:39.657583 :
## Annotating Chr: chr11
## 2025-02-06 02:12:39.73432 :
## Annotating Chr: chr12
## 2025-02-06 02:12:39.818257 :
## Annotating Chr: chr13
## 2025-02-06 02:12:39.898198 :
## Annotating Chr: chr14
## 2025-02-06 02:12:39.95936 :
## Annotating Chr: chr15
## 2025-02-06 02:12:40.031763 :
## Annotating Chr: chr16
## 2025-02-06 02:12:40.105844 :
## Annotating Chr: chr17
## 2025-02-06 02:12:40.17736 :
## Annotating Chr: chr18
## 2025-02-06 02:12:40.262019 :
## Annotating Chr: chr19
## 2025-02-06 02:12:40.326714 :
## Annotating Chr: chr20
## 2025-02-06 02:12:40.41167 :
## Annotating Chr: chr21
## 2025-02-06 02:12:40.484403 :
## Annotating Chr: chr22
## 2025-02-06 02:12:40.543997 :
## Annotating Chr: chrX
## 2025-02-06 02:12:40.609008 :
## 2025-02-06 02:12:41.59324 : All Regions Overlap at least 1 peak!, 0.05 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-addArchRAnnotations-9619b4756-Date-2025-02-06_Time-02-12-38.58154.log
We then test our marker peaks for enrichment of these bulk ATAC-seq peaks by setting peakAnnotation = "ATAC".
enrichATAC <- peakAnnoEnrichment(
    seMarker = markerPeaks,
    ArchRProj = projHeme5,
    peakAnnotation = "ATAC",
    cutOff = "FDR <= 0.1 & Log2FC >= 0.5"
)
## ArchR logging to : ArchRLogs/ArchR-peakAnnoEnrichment-94cfd4317-Date-2025-02-06_Time-02-12-43.632264.log
## If there is an issue, please report to github with logFile!
## 2025-02-06 02:12:47.515811 : Computing Enrichments 1 of 11, 0.065 mins elapsed.
## 2025-02-06 02:12:47.562992 : Computing Enrichments 2 of 11, 0.065 mins elapsed.
## 2025-02-06 02:12:47.607953 : Computing Enrichments 3 of 11, 0.066 mins elapsed.
## 2025-02-06 02:12:47.653693 : Computing Enrichments 4 of 11, 0.067 mins elapsed.
## 2025-02-06 02:12:47.699384 : Computing Enrichments 5 of 11, 0.068 mins elapsed.
## 2025-02-06 02:12:47.748599 : Computing Enrichments 6 of 11, 0.069 mins elapsed.
## 2025-02-06 02:12:47.795038 : Computing Enrichments 7 of 11, 0.069 mins elapsed.
## 2025-02-06 02:12:47.84374 : Computing Enrichments 8 of 11, 0.07 mins elapsed.
## 2025-02-06 02:12:47.892506 : Computing Enrichments 9 of 11, 0.071 mins elapsed.
## 2025-02-06 02:12:47.939472 : Computing Enrichments 10 of 11, 0.072 mins elapsed.
## 2025-02-06 02:12:47.986307 : Computing Enrichments 11 of 11, 0.073 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-peakAnnoEnrichment-94cfd4317-Date-2025-02-06_Time-02-12-43.632264.log
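The cutOff argument is a plain string that describes which marker peaks are tested. The base R sketch below shows one way such an expression can be evaluated against per-peak statistics. The toy table is made up, and evaluating the string with eval(parse()) is an assumption about the general approach, not a description of ArchR's internal code.

```r
# Toy marker statistics for five peaks in one cell group.
markerStats <- data.frame(
  peak   = paste0("peak", 1:5),
  FDR    = c(0.001, 0.05, 0.20, 0.09, 0.01),
  Log2FC = c(2.1, 0.3, 1.5, 0.6, -1.0)
)

cutOff <- "FDR <= 0.1 & Log2FC >= 0.5"

# Evaluate the expression with the data frame's columns in scope,
# giving one TRUE/FALSE value per peak.
passes <- eval(parse(text = cutOff), envir = markerStats)

# Peaks that are both significant and more accessible in this group.
markerStats$peak[passes]
```

Only peaks that satisfy both conditions are kept, so a stricter FDR or a higher Log2FC threshold shrinks the set of peaks that is tested for enrichment.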
As before, the output of peakAnnoEnrichment() is a SummarizedExperiment object with information on the enrichment results.
enrichATAC
## class: SummarizedExperiment
## dim: 96 11
## metadata(0):
## assays(10): mlog10Padj mlog10p ... CompareFrequency feature
## rownames(96): Brain_Astrocytes Brain_Excitatory_neurons ... Heme_MPP
## Heme_NK
## rowData names(0):
## colnames(11): B CD4.M ... PreB Progenitor
## colData names(0):
We can create an enrichment heatmap from this SummarizedExperiment using plotEnrichHeatmap().
heatmapATAC <- plotEnrichHeatmap(enrichATAC, n = 7, transpose = TRUE)
## ArchR logging to : ArchRLogs/ArchR-plotEnrichHeatmap-976a252e6-Date-2025-02-06_Time-02-12-48.157383.log
## If there is an issue, please report to github with logFile!
## Adding Annotations..
## Preparing Main Heatmap..
## 'magick' package is suggested to install to give better rasterization.
##
## Set `ht_opt$message = FALSE` to turn off this message.
And plot this heatmap using ComplexHeatmap::draw().
To save an editable vectorized version of this plot, we use the plotPDF() function.
14.2.3 CODEX TFBS
The same type of analyses can be performed for CODEX TFBSs by setting collection = "Codex".
projHeme5 <- addArchRAnnotations(ArchRProj = projHeme5, collection = "Codex")
## ArchR logging to : ArchRLogs/ArchR-addArchRAnnotations-94096281b-Date-2025-02-06_Time-02-12-52.267269.log
## If there is an issue, please report to github with logFile!
## Annotating Chromosomes
## 2025-02-06 02:12:52.536008 :
## Annotating Chr: chr1
## 2025-02-06 02:12:52.540543 :
## Annotating Chr: chr2
## 2025-02-06 02:12:52.619001 :
## Annotating Chr: chr3
## 2025-02-06 02:12:52.696476 :
## Annotating Chr: chr4
## 2025-02-06 02:12:52.774483 :
## Annotating Chr: chr5
## 2025-02-06 02:12:52.840419 :
## Annotating Chr: chr6
## 2025-02-06 02:12:52.906912 :
## Annotating Chr: chr7
## 2025-02-06 02:12:52.975936 :
## Annotating Chr: chr8
## 2025-02-06 02:12:53.044106 :
## Annotating Chr: chr9
## 2025-02-06 02:12:53.116323 :
## Annotating Chr: chr10
## 2025-02-06 02:12:53.185412 :
## Annotating Chr: chr11
## 2025-02-06 02:12:53.253278 :
## Annotating Chr: chr12
## 2025-02-06 02:12:53.322582 :
## Annotating Chr: chr13
## 2025-02-06 02:12:53.391308 :
## Annotating Chr: chr14
## 2025-02-06 02:12:53.450126 :
## Annotating Chr: chr15
## 2025-02-06 02:12:53.513028 :
## Annotating Chr: chr16
## 2025-02-06 02:12:53.574894 :
## Annotating Chr: chr17
## 2025-02-06 02:12:53.63786 :
## Annotating Chr: chr18
## 2025-02-06 02:12:53.707179 :
## Annotating Chr: chr19
## 2025-02-06 02:12:53.764299 :
## Annotating Chr: chr20
## 2025-02-06 02:12:53.838341 :
## Annotating Chr: chr21
## 2025-02-06 02:12:53.9003 :
## Annotating Chr: chr22
## 2025-02-06 02:12:53.954982 :
## Annotating Chr: chrX
## 2025-02-06 02:12:54.012035 :
## 2025-02-06 02:12:54.747732 : All Regions Overlap at least 1 peak!, 0.041 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-addArchRAnnotations-94096281b-Date-2025-02-06_Time-02-12-52.267269.log
enrichCodex <- peakAnnoEnrichment(
    seMarker = markerPeaks,
    ArchRProj = projHeme5,
    peakAnnotation = "Codex",
    cutOff = "FDR <= 0.1 & Log2FC >= 0.5"
)
## ArchR logging to : ArchRLogs/ArchR-peakAnnoEnrichment-91010b3a1-Date-2025-02-06_Time-02-12-56.416156.log
## If there is an issue, please report to github with logFile!
## 2025-02-06 02:12:58.098723 : Computing Enrichments 1 of 11, 0.028 mins elapsed.
## 2025-02-06 02:12:58.132984 : Computing Enrichments 2 of 11, 0.029 mins elapsed.
## 2025-02-06 02:12:58.167791 : Computing Enrichments 3 of 11, 0.029 mins elapsed.
## 2025-02-06 02:12:58.201671 : Computing Enrichments 4 of 11, 0.03 mins elapsed.
## 2025-02-06 02:12:58.235424 : Computing Enrichments 5 of 11, 0.03 mins elapsed.
## 2025-02-06 02:12:58.269162 : Computing Enrichments 6 of 11, 0.031 mins elapsed.
## 2025-02-06 02:12:58.303835 : Computing Enrichments 7 of 11, 0.031 mins elapsed.
## 2025-02-06 02:12:58.338797 : Computing Enrichments 8 of 11, 0.032 mins elapsed.
## 2025-02-06 02:12:58.373664 : Computing Enrichments 9 of 11, 0.033 mins elapsed.
## 2025-02-06 02:12:58.408577 : Computing Enrichments 10 of 11, 0.033 mins elapsed.
## 2025-02-06 02:12:58.443187 : Computing Enrichments 11 of 11, 0.034 mins elapsed.
## ArchR logging successful to : ArchRLogs/ArchR-peakAnnoEnrichment-91010b3a1-Date-2025-02-06_Time-02-12-56.416156.log
enrichCodex
## class: SummarizedExperiment
## dim: 189 11
## metadata(0):
## assays(10): mlog10Padj mlog10p ... CompareFrequency feature
## rownames(189): 1.STAT5-No_drug_(DMSO)... 2.RUNX3-GM12878_cell_fr... ...
## 188.TP53-codex_Embryonic... 189.TP53-codex_Embryonic...
## rowData names(0):
## colnames(11): B CD4.M ... PreB Progenitor
## colData names(0):
heatmapCodex <- plotEnrichHeatmap(enrichCodex, n = 7, transpose = TRUE)
## ArchR logging to : ArchRLogs/ArchR-plotEnrichHeatmap-92b42ed9b-Date-2025-02-06_Time-02-12-58.594438.log
## If there is an issue, please report to github with logFile!
## Adding Annotations..
## Preparing Main Heatmap..
## 'magick' package is suggested to install to give better rasterization.
##
## Set `ht_opt$message = FALSE` to turn off this message.
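Because the ENCODE, bulk ATAC-seq, and CODEX analyses all follow the same three calls, the workflow can be captured in a small wrapper. The function enrichCollection() below is a hypothetical convenience helper, not part of ArchR; it only chains the calls shown in this section with the same arguments.

```r
# Hypothetical helper (not part of ArchR): run the annotation, enrichment,
# and heatmap steps for one curated collection.
enrichCollection <- function(proj, seMarker, collection,
                             cutOff = "FDR <= 0.1 & Log2FC >= 0.5", n = 7) {
  # Add the curated feature sets to the project under the collection's name.
  proj <- ArchR::addArchRAnnotations(ArchRProj = proj, collection = collection)
  # Test the marker peaks for enrichment of each feature set.
  enrich <- ArchR::peakAnnoEnrichment(
    seMarker = seMarker,
    ArchRProj = proj,
    peakAnnotation = collection,
    cutOff = cutOff
  )
  # Build the heatmap object from the enrichment results.
  heatmap <- ArchR::plotEnrichHeatmap(enrich, n = n, transpose = TRUE)
  # Return the updated project as well, since the annotation is stored in it.
  list(proj = proj, enrich = enrich, heatmap = heatmap)
}

# Usage with the objects from this chapter:
# resCodex <- enrichCollection(projHeme5, markerPeaks, "Codex")
# projHeme5 <- resCodex$proj
```

If a collection has already been added to the project, the annotation step may need to be skipped or re-run with the force argument of addArchRAnnotations().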
We can then plot this