This demo showcases the analytic power of XGR for making sense of eQTL SNPs summarised from eQTL mappings, including enrichment analysis, similarity analysis and network analysis. SNP-level enrichment analysis in XGR is unique in being aware of the ontology tree, in contrast to conventional analysis that uses a flat list of GWAS Catalog traits. Similarity analysis adds a new dimension to interpreting eQTL SNPs: it shows not only their relevance to GWAS traits but also how similar they are to each other in terms of their trait profiles (i.e. ontology annotation profiles).
library(XGR)
# Specify the locations of built-in data
RData.location <- "http://galahad.well.ox.ac.uk/bigdata"
Here we illustrate the functionality XGR provides for interpreting cis-eQTLs (SNPs) that are induced by 24-hour interferon gamma (IFN24), 24-hour LPS (LPS24) or 2-hour LPS (LPS2), or that are detected in the naive state.
# Load cis-eQTL mapping results
cis <- xRDataLoader(RData.customised='JKscience_TS2A', RData.location=RData.location)
# Create a data frame for cis-eQTLs significantly induced in naive state
ind <- which(cis$Naive_t>0 & cis$Naive_fdr<0.05)
df_cis_Naive <- cis[ind, c('variant','Symbol','Naive_t','Naive_fdr')]
# Create a data frame for cis-eQTLs significantly induced by LPS2
ind <- which(cis$LPS2_t>0 & cis$LPS2_fdr<0.05)
df_cis_LPS2 <- cis[ind, c('variant','Symbol','LPS2_t','LPS2_fdr')]
# Create a data frame for cis-eQTLs significantly induced by LPS24
ind <- which(cis$LPS24_t>0 & cis$LPS24_fdr<0.05)
df_cis_LPS24 <- cis[ind, c('variant','Symbol','LPS24_t','LPS24_fdr')]
# Create a data frame for cis-eQTLs significantly induced by IFN24
ind <- which(cis$IFN_t>0 & cis$IFN_fdr<0.05)
df_cis_IFN24 <- cis[ind, c('variant','Symbol','IFN_t','IFN_fdr')]
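The four filters above share one pattern: keep rows with a positive t-statistic and an FDR below 0.05 for a given condition. A minimal sketch of that pattern follows, assuming a hypothetical helper `subset_induced` (not part of XGR) and a made-up toy table standing in for `cis`.

```r
# Hypothetical helper (not part of XGR): keep eQTLs with a positive t-statistic
# and an FDR below a cutoff for one condition, identified by its column prefix.
subset_induced <- function(df, prefix, fdr.cutoff = 0.05) {
  t_col <- paste0(prefix, "_t")
  fdr_col <- paste0(prefix, "_fdr")
  ind <- which(df[[t_col]] > 0 & df[[fdr_col]] < fdr.cutoff)
  df[ind, c("variant", "Symbol", t_col, fdr_col)]
}

# Toy stand-in for the 'cis' table (all values are made up for illustration)
toy <- data.frame(
  variant = c("rsA", "rsB", "rsC"),
  Symbol = c("G1", "G2", "G3"),
  LPS2_t = c(5.1, -4.2, 1.0),
  LPS2_fdr = c(0.001, 0.002, 0.4),
  stringsAsFactors = FALSE
)

# Only rsA has both a positive t-statistic and an FDR below 0.05
toy_LPS2 <- subset_induced(toy, "LPS2")
```

With the real data, the same helper could be called as `subset_induced(cis, "Naive")`, `subset_induced(cis, "IFN")` and so on, given the column naming seen above.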
Conventionally, SNP-based enrichment analysis uses only the traits reported in GWAS studies. However, these GWAS traits can be mapped onto EFO terms. Using EFO enables us to look at a general term (representing a group of related traits) and its annotated SNPs, which include both GWAS-reported SNPs (‘original annotations’) and SNPs inherited from its child terms (‘inherited annotations’).
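The distinction between original and inherited annotations can be illustrated with a small base-R sketch. The ontology, term names and SNP IDs below are made up, and the propagation loop is only an illustration of the idea, not XGR's implementation.

```r
# Toy ontology: child term -> parent term (made-up names)
parent_of <- c(asthma = "respiratory_disease",
               respiratory_disease = "disease",
               psoriasis = "skin_disease",
               skin_disease = "disease")

# Original annotations: term -> SNPs reported directly for that term
original <- list(asthma = c("rs1", "rs2"),
                 psoriasis = "rs3",
                 respiratory_disease = "rs4")

# Walk from a term up to the root, collecting all of its ancestors
ancestors <- function(term) {
  out <- character(0)
  while (term %in% names(parent_of)) {
    term <- parent_of[[term]]
    out <- c(out, term)
  }
  out
}

# Propagate each term's SNPs to all of its ancestors
propagated <- original
for (term in names(original)) {
  for (anc in ancestors(term)) {
    propagated[[anc]] <- union(propagated[[anc]], original[[term]])
  }
}

# Inherited annotations are the propagated SNPs that were not original
inherited <- lapply(names(propagated), function(term) {
  setdiff(propagated[[term]], original[[term]])
})
names(inherited) <- names(propagated)
```

In this toy example the general term `disease` has no original annotations but inherits all four SNPs, which is what makes it testable for enrichment at all.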
As a matter of routine, SNP-based enrichment analysis also considers SNPs in linkage disequilibrium (LD) with the input SNPs.
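As a rough sketch of what considering LD SNPs means, the input list can be expanded with partners whose r2 reaches a threshold. The LD table, the helper `expand_by_ld` and the use of `>=` at the threshold are all assumptions for illustration; XGR obtains LD information from its own built-in data.

```r
# Toy LD table (made-up SNP IDs and r2 values); each row is one SNP pair in LD
ld <- data.frame(
  lead = c("rs1", "rs1", "rs2", "rs9"),
  partner = c("rs10", "rs11", "rs12", "rs13"),
  r2 = c(0.95, 0.60, 0.85, 0.99),
  stringsAsFactors = FALSE
)

# Hypothetical helper: add LD partners at or above an r2 cutoff to the input SNPs
expand_by_ld <- function(snps, ld, r2.cutoff = 0.8) {
  keep <- ld$lead %in% snps & ld$r2 >= r2.cutoff
  union(snps, ld$partner[keep])
}

# rs11 is dropped (r2 below the cutoff); rs13 is dropped (rs9 is not an input SNP)
expanded <- expand_by_ld(c("rs1", "rs2"), ld)
```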
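Enrichment analysis is done using the disease part of EFO. The necessity of using EFO is justified by comparing the following 2 scenarios, labelled ‘EFO (-)’ and ‘EFO (+)’ below, which differ only in whether the true-path rule is applied (true.path.rule=F versus true.path.rule=T):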
This is demonstrated using cis-eQTLs induced by 24-hour interferon gamma.
df_cis <- df_cis_IFN24
data <- df_cis$variant
EFO (-)
eTerm_noEF <- xEnricherSNPs(data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, true.path.rule=F, RData.location=RData.location)
xEnrichViewer(eTerm_noEF, 10)
Barplot of enriched terms
bp_noEF <- xEnrichBarplot(eTerm_noEF, top_num="auto", displayBy="zscore")
bp_noEF
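The barplot ranks terms by z-score. As a point of reference, one common way to score the overlap between input SNPs and the SNPs annotated to a term is the hypergeometric model, sketched below with made-up counts. This is the textbook formula and may differ in detail from what XGR computes internally.

```r
# Made-up counts for a single term
N <- 1000  # background SNPs
K <- 50    # background SNPs annotated to the term
n <- 100   # input SNPs
k <- 15    # input SNPs annotated to the term (the overlap)

# Expected overlap and its variance under the hypergeometric model
expected <- n * K / N
variance <- n * (K / N) * (1 - K / N) * (N - n) / (N - 1)

# z-score: how many standard deviations the observed overlap lies above expectation
zscore <- (k - expected) / sqrt(variance)

# Upper-tail p-value: probability of observing k or more annotated input SNPs
pvalue <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)
```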
EFO (+)
eTerm_EF <- xEnricherSNPs(data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, true.path.rule=T, RData.location=RData.location)
xEnrichViewer(eTerm_EF, 10)
Barplot of enriched terms
bp_EF <- xEnrichBarplot(eTerm_EF, top_num="auto", displayBy="zscore")
bp_EF
Comparing EFO-based enrichment results (whether using EFO or not)
Under FDR cutoff at 0.01
list_eTerm <- list(eTerm_noEF, eTerm_EF)
names(list_eTerm) <- c('EFO (-)', 'EFO (+)')
bp_EF_FDR_001 <- xEnrichCompare(list_eTerm, displayBy="zscore", FDR.cutoff=0.01, bar.label.size=3)
bp_EF_FDR_001 + theme(axis.text.y=element_text(size=10))
DAGplot of enriched terms in the context of the EF tree, with nodes/terms colored according to how many times they are called significant. The name of each significant term is also shown, prefixed in the form ‘x1-x2’, where x1 refers to ‘EFO (-)’ and x2 to ‘EFO (+)’. Each of x1 and x2 is ‘1’ or ‘0’, denoting whether or not the term is called significant in that analysis.
xEnrichDAGplotAdv(bp_EF_FDR_001, displayBy="nSig", colormap="white-lightcyan-cyan", layout.orientation="left_right", node.info=c("term_name"), graph.node.attrs=list(fontsize=20), newpage=F)
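The ‘x1-x2’ prefix and the node coloring described above can be mimicked in a few lines of base R. The adjusted p-values below are made up, and the code only illustrates the encoding; it is not how XGR builds the plot.

```r
# Made-up adjusted p-values for three terms under the two analyses
adjp <- cbind("EFO (-)" = c(0.20, 0.004, 0.50),
              "EFO (+)" = c(0.003, 0.001, 0.30))
rownames(adjp) <- c("term_A", "term_B", "term_C")
FDR.cutoff <- 0.01

# 1 if the term is called significant in that analysis, 0 otherwise
flags <- (adjp < FDR.cutoff) * 1

# Number of analyses calling each term significant (what the node color encodes)
nSig <- rowSums(flags)

# Prefix of the form 'x1-x2', one per term
prefix <- apply(flags, 1, paste, collapse = "-")
```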
Under FDR cutoff at 0.05
list_eTerm <- list(eTerm_noEF, eTerm_EF)
names(list_eTerm) <- c('EFO (-)', 'EFO (+)')
bp_EF_FDR_005 <- xEnrichCompare(list_eTerm, displayBy="zscore", FDR.cutoff=0.05, bar.label.size=3)
bp_EF_FDR_005 + theme(axis.text.y=element_text(size=10))
DAGplot of enriched terms in the context of the EF tree, with nodes/terms colored according to how many times they are called significant. The name of each significant term is also shown, prefixed in the form ‘x1-x2’, where x1 refers to ‘EFO (-)’ and x2 to ‘EFO (+)’. Each of x1 and x2 is ‘1’ or ‘0’, denoting whether or not the term is called significant in that analysis.
xEnrichDAGplotAdv(bp_EF_FDR_005, displayBy="nSig", colormap="white-lightcyan-cyan", layout.orientation="left_right", node.info=c("term_name"), graph.node.attrs=list(fontsize=30), newpage=F)
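To see why the two cutoffs can give different pictures, the sketch below adjusts a handful of made-up p-values with the Benjamini-Hochberg procedure (an assumed choice of adjustment method here) and counts the terms passing each cutoff.

```r
# Made-up raw p-values for five terms
p <- c(0.0001, 0.002, 0.004, 0.03, 0.2)

# Benjamini-Hochberg adjustment (assumed method for this illustration)
adjp <- p.adjust(p, method = "BH")

# Number of terms called significant under each cutoff
n_sig_001 <- sum(adjp < 0.01)
n_sig_005 <- sum(adjp < 0.05)
```

Here the fourth term has an adjusted p-value between the two cutoffs, so it is only called significant at 0.05.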
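Enrichment analysis is done using the disease part of EFO. The necessity of using EFO and respecting the ontology tree structure is justified by comparing the following 3 scenarios: ‘EFO (-)’, ‘EFO (+) & Tree (-)’ and ‘EFO (+) & Tree (+)’. They differ in whether the true-path rule is applied (true.path.rule) and in whether a tree-aware algorithm is used (ontology.algorithm="lea" rather than "none"):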
This is demonstrated using cis-eQTLs induced by 24-hour interferon gamma.
df_cis <- df_cis_IFN24
data <- df_cis$variant
EFO (-)
eTerm_noEF <- xEnricherSNPs(data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, true.path.rule=F, ontology.algorithm="none", RData.location=RData.location)
xEnrichViewer(eTerm_noEF, 10)
Barplot of enriched terms
bp_noEF <- xEnrichBarplot(eTerm_noEF, top_num="auto", displayBy="zscore")
bp_noEF
DAGplot of enriched terms
xEnrichDAGplot(eTerm_noEF, top_num="auto", displayBy="fdr", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=25,fontcolor="blue"), newpage=F)
EFO (+) & Tree (-)
eTerm_EF_noTree <- xEnricherSNPs(data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, true.path.rule=T, ontology.algorithm="none", RData.location=RData.location)
xEnrichViewer(eTerm_EF_noTree, 10)
Barplot of enriched terms
bp_EF_noTree <- xEnrichBarplot(eTerm_EF_noTree, top_num="auto", displayBy="zscore")
bp_EF_noTree
EFO (+) & Tree (+)
eTerm_EF_Tree <- xEnricherSNPs(data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, true.path.rule=T, ontology.algorithm="lea", RData.location=RData.location)
xEnrichViewer(eTerm_EF_Tree, 10)
Barplot of enriched terms
bp_EF_Tree <- xEnrichBarplot(eTerm_EF_Tree, top_num="auto", displayBy="zscore")
bp_EF_Tree
DAGplot of enriched terms
xEnrichDAGplot(eTerm_EF_Tree, top_num="auto", displayBy="fdr", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=25,fontcolor="blue"), newpage=F)
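The intuition behind a tree-aware algorithm is that a general term should not be called significant merely because it inherits the SNPs of a more significant child. The sketch below is a deliberately simplified, one-level illustration of that elimination idea using made-up SNP sets; it is an assumption about the general approach and is not taken from XGR's source.

```r
# Hypergeometric upper-tail p-value for the overlap between a term and the input
enrich_p <- function(term_snps, input, background) {
  k <- length(intersect(term_snps, input))
  phyper(k - 1, length(term_snps), length(background) - length(term_snps),
         length(input), lower.tail = FALSE)
}

background <- paste0("rs", 1:200)
input <- paste0("rs", 1:20)

# Toy annotations after propagation: the parent contains all of its child's SNPs
child <- paste0("rs", 1:12)                # 12 SNPs, all among the input
parent <- c(child, paste0("rs", 101:130))  # plus 30 SNPs, none among the input

p_child <- enrich_p(child, input, background)
p_parent <- enrich_p(parent, input, background)

# If the child is more significant, drop its SNPs from the parent and recompute;
# the parent's final p-value is the larger of the original and recomputed values
if (p_child < p_parent) {
  p_recalc <- enrich_p(setdiff(parent, child), input, background)
  p_parent_final <- max(p_parent, p_recalc)
} else {
  p_parent_final <- p_parent
}
```

In this toy case the parent looks enriched only because of its child, so after elimination its p-value rises to 1 and only the specific term remains significant.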
Comparing EFO-based enrichment results (whether using EFO and respecting ontology tree)
Under FDR cutoff at 0.01
list_eTerm <- list(eTerm_noEF, eTerm_EF_noTree, eTerm_EF_Tree)
names(list_eTerm) <- c('EFO (-)', 'EFO (+) & Tree (-)', 'EFO (+) & Tree (+)')
bp_FDR_001 <- xEnrichCompare(list_eTerm, displayBy="zscore", FDR.cutoff=0.01, bar.label.size=3)
bp_FDR_001 + theme(axis.text.y=element_text(size=10))
DAGplot of enriched terms in the context of the EF tree, with nodes/terms colored according to how many times they are called significant. The name of each significant term is also shown, prefixed in the form ‘x1-x2-x3’, where x1 refers to ‘EFO (-)’, x2 to ‘EFO (+) & Tree (-)’ and x3 to ‘EFO (+) & Tree (+)’. Each of x1 to x3 is ‘1’ or ‘0’, denoting whether or not the term is called significant in that analysis.
xEnrichDAGplotAdv(bp_FDR_001, displayBy="nSig", colormap="white-lightcyan-cyan", layout.orientation="left_right", node.info=c("term_name"), graph.node.attrs=list(fontsize=20), newpage=F)
This is demonstrated using cis-eQTL SNPs induced by 24-hour interferon gamma (IFN24), 24-hour LPS (LPS24) or 2-hour LPS (LPS2), and those detected in the naive state. All analyses use the disease part of EFO, consider LD SNPs and respect the ontology tree.
EFO enrichments for cis-eQTLs significantly induced in naive state
data <- df_cis_Naive$variant
eTerm_EF_Naive <- xEnricherSNPs(data=data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, ontology.algorithm="lea", RData.location=RData.location)
xEnrichViewer(eTerm_EF_Naive, 10)
Barplot of enriched terms
bp_EF_Naive <- xEnrichBarplot(eTerm_EF_Naive, top_num="auto", displayBy="zscore")
bp_EF_Naive
DAGplot of enriched terms
xEnrichDAGplot(eTerm_EF_Naive, top_num="auto", displayBy="fdr", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=25,fontcolor="blue"), newpage=F)
EFO enrichments for cis-eQTLs significantly induced by LPS2
data <- df_cis_LPS2$variant
eTerm_EF_LPS2 <- xEnricherSNPs(data=data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, ontology.algorithm="lea", RData.location=RData.location)
xEnrichViewer(eTerm_EF_LPS2, 10)
Barplot of enriched terms
bp_EF_LPS2 <- xEnrichBarplot(eTerm_EF_LPS2, top_num="auto", displayBy="zscore")
bp_EF_LPS2
DAGplot of enriched terms
xEnrichDAGplot(eTerm_EF_LPS2, top_num="auto", displayBy="fdr", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=30,fontcolor="blue"), newpage=F)
EFO enrichments for cis-eQTLs significantly induced by LPS24
data <- df_cis_LPS24$variant
eTerm_EF_LPS24 <- xEnricherSNPs(data=data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, ontology.algorithm="lea", RData.location=RData.location)
xEnrichViewer(eTerm_EF_LPS24, 10)
Barplot of enriched terms
bp_EF_LPS24 <- xEnrichBarplot(eTerm_EF_LPS24, top_num="auto", displayBy="zscore")
bp_EF_LPS24
DAGplot of enriched terms
xEnrichDAGplot(eTerm_EF_LPS24, top_num="auto", displayBy="fdr", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=30,fontcolor="blue"), newpage=F)
EFO enrichments for cis-eQTLs significantly induced by IFN24
data <- df_cis_IFN24$variant
eTerm_EF_IFN24 <- xEnricherSNPs(data=data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, ontology.algorithm="lea", RData.location=RData.location)
xEnrichViewer(eTerm_EF_IFN24, 10)
Barplot of enriched terms
bp_EF_IFN24 <- xEnrichBarplot(eTerm_EF_IFN24, top_num="auto", displayBy="zscore")
bp_EF_IFN24
DAGplot of enriched terms
xEnrichDAGplot(eTerm_EF_IFN24, top_num="auto", displayBy="adjp", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=25,fontcolor="blue"), newpage=F)
Comparing EFO-based enrichment results for cis-eQTLs induced in four conditions
Under FDR cutoff at 0.01
list_eTerm <- list(eTerm_EF_Naive, eTerm_EF_LPS2, eTerm_EF_LPS24, eTerm_EF_IFN24)
names(list_eTerm) <- c('Naive cis-eQTLs', 'LPS2 cis-eQTLs', 'LPS24 cis-eQTLs', 'IFN24 cis-eQTLs')
bp_FDR_001 <- xEnrichCompare(list_eTerm, displayBy="zscore", FDR.cutoff=0.01, bar.label.size=2.5)
bp_FDR_001 + theme(axis.text.y=element_text(size=10))
DAGplot of enriched terms in the context of the EF tree, with nodes/terms colored according to how many times they are called significant. The name of each significant term is also shown, prefixed in the form ‘x1-x2-x3-x4’, where x1 refers to ‘Naive cis-eQTLs’, x2 to ‘LPS2 cis-eQTLs’, x3 to ‘LPS24 cis-eQTLs’ and x4 to ‘IFN24 cis-eQTLs’. Each of x1 to x4 is ‘1’ or ‘0’, denoting whether or not the term is called significant in that analysis.
xEnrichDAGplotAdv(bp_FDR_001, displayBy="nSig", colormap="white-lightcyan-cyan", layout.orientation="left_right", node.info=c("term_name"), graph.node.attrs=list(fontsize=30), newpage=F)
Similarity analysis adds a new dimension to interpreting eQTL SNPs: it shows not only their relevance to GWAS traits but also how similar they are to each other in terms of their trait profiles (i.e. ontology annotation profiles).
It is demonstrated using cis-eQTLs induced by 24-hour interferon gamma. SNP-based similarity analysis uses GWAS Catalog traits mapped to the disease part of EFO, so the SNP annotation profiles being compared are restricted to the disease part of EFO.
df_cis <- df_cis_IFN24
data <- df_cis$variant
This package is needed for visualisations
library(RCircos)
ig_SNP_BM_complete <- xSocialiserSNPs(data=data, ontology="EF_disease", measure="BM.complete", method.term="Resnik", rescale=F, RData.location=RData.location)
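The call above combines a term-level measure (Resnik) with a SNP-level measure (BM.complete). A minimal base-R sketch of how such a combination can work is given below. The tree, the information content (IC) values and the term sets are made up, and the `bm_complete` definition (the smallest of all best-match similarities, in the spirit of complete linkage) is an assumption about what the measure means rather than XGR's code.

```r
# Toy tree: child -> parent (made-up term names), with made-up IC values
parent_of <- c(t1 = "root", t2 = "root", t3 = "t1", t4 = "t1", t5 = "t2")
ic <- c(root = 0, t1 = 1.0, t2 = 1.2, t3 = 2.5, t4 = 2.0, t5 = 3.0)

# A term together with all of its ancestors
with_ancestors <- function(term) {
  out <- term
  while (term %in% names(parent_of)) {
    term <- parent_of[[term]]
    out <- c(out, term)
  }
  out
}

# Resnik similarity: IC of the most informative common ancestor of two terms
resnik <- function(a, b) {
  common <- intersect(with_ancestors(a), with_ancestors(b))
  max(ic[common])
}

# Assumed 'best-match complete' similarity between two sets of terms:
# take each term's best match in the other set, then keep the smallest of these
bm_complete <- function(terms1, terms2) {
  sim <- outer(terms1, terms2, Vectorize(resnik))
  min(apply(sim, 1, max), apply(sim, 2, max))
}

# Two hypothetical SNPs annotated with two terms each
snp1_terms <- c("t3", "t5")
snp2_terms <- c("t4", "t5")
sim_12 <- bm_complete(snp1_terms, snp2_terms)
```

The two SNPs share `t5` exactly, but the overall similarity is limited by the weaker best match (`t3` with `t4`, which meet only at `t1`).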
Circos plot illustrating the top 100 similarities:
xCircos(g=ig_SNP_BM_complete, entity="SNP", top_num=100, entity.label.cex=0.8, verbose=F, RData.location=RData.location)
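Restricting a plot to the strongest similarities amounts to ranking SNP pairs by weight and keeping the first few. The sketch below does this for a made-up symmetric similarity matrix; how xCircos itself selects the pairs is not shown here and is an assumption.

```r
# Made-up symmetric similarity matrix for three SNPs
snps <- c("rs1", "rs2", "rs3")
sim <- matrix(c(0.0, 0.9, 0.2,
                0.9, 0.0, 0.5,
                0.2, 0.5, 0.0),
              nrow = 3, dimnames = list(snps, snps))

# List each unordered pair once, using the upper triangle
idx <- which(upper.tri(sim), arr.ind = TRUE)
pairs <- data.frame(
  from = rownames(sim)[idx[, "row"]],
  to = colnames(sim)[idx[, "col"]],
  weight = sim[idx],
  stringsAsFactors = FALSE
)

# Keep the two most similar pairs
top_pairs <- head(pairs[order(pairs$weight, decreasing = TRUE), ], 2)
```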
Circos plot involving an SNP ‘rs11150589’
xCircos(g=ig_SNP_BM_complete, entity="SNP", nodes.query='rs11150589', entity.label.cex=0.8, verbose=F, RData.location=RData.location)
Circos plot involving the SNP ‘rs11150589’, with labels restricted to SNPs of interest, such as ‘rs11150589’, ‘rs10500264’, ‘rs4072037’, ‘rs3957148’ and ‘rs2066807’
xCircos(g=ig_SNP_BM_complete, entity="SNP", nodes.query='rs11150589', entity.label.cex=0.8, entity.label.side="in", entity.label.query=c('rs11150589','rs10500264','rs4072037','rs3957148','rs2066807'), verbose=F, RData.location=RData.location)
>          variant Symbol     IFN_t      IFN_fdr
> 2723  rs11150589  ITGAL 13.843583 2.270000e-27
> 1114  rs10500264  CEBPA  4.001606 8.848898e-03
> 12711  rs4072037    GBA  4.035999 7.881209e-03
> 12677  rs3957148 AGPAT1  3.488521 4.207953e-02
> 8546   rs2066807  CNPY2  4.719715 6.555550e-04
DAG plot of terms used to annotate an SNP
Nodes/terms in a directed acyclic graph (DAG) are colored according to information content (IC). Original terms are shown as box-shaped nodes and inherited terms as ellipse-shaped nodes.
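Information content is commonly defined as the negative logarithm of the fraction of annotated entities that carry a term, so that general terms score low and specific terms score high. The sketch below uses made-up counts and a base-10 logarithm; the log base is an assumption for illustration.

```r
# Made-up numbers of SNPs annotated to each term (including inherited annotations)
n_annotated <- c(root = 1000, t1 = 100, t3 = 10)

# Fraction of all annotated SNPs carrying each term
freq <- n_annotated / n_annotated[["root"]]

# Information content: rarer (more specific) terms are more informative
ic <- -log10(freq)
```

The root term, which annotates every SNP, has an IC of 0, while a term carried by 1% of SNPs has an IC of 2.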
Terms used to annotate the SNP ‘rs11150589’:
xSocialiserDAGplot(g=ig_SNP_BM_complete, query='rs11150589', displayBy="IC", zlim=c(0,4), node.info=c("full_term_name"), graph.node.attrs=list(fontsize=25,color="transparent"), newpage=F)
Terms used to annotate the SNP ‘rs10500264’:
xSocialiserDAGplot(g=ig_SNP_BM_complete, query='rs10500264', displayBy="IC", zlim=c(0,4), node.info=c("full_term_name"), graph.node.attrs=list(fontsize=20,color="transparent"), newpage=F)
Terms used to annotate the SNP ‘rs4072037’:
xSocialiserDAGplot(g=ig_SNP_BM_complete, query='rs4072037', displayBy="IC", zlim=c(0,4), node.info=c("full_term_name"), graph.node.attrs=list(fontsize=15,color="transparent"), newpage=F)
Terms used to annotate the SNP ‘rs3957148’:
xSocialiserDAGplot(g=ig_SNP_BM_complete, query='rs3957148', displayBy="IC", zlim=c(0,4), node.info=c("full_term_name"), graph.node.attrs=list(fontsize=15,color="transparent"), newpage=F)
Terms used to annotate the SNP ‘rs2066807’:
xSocialiserDAGplot(g=ig_SNP_BM_complete, query='rs2066807', displayBy="IC", zlim=c(0,4), node.info=c("full_term_name"), graph.node.attrs=list(fontsize=15,color="transparent"), newpage=F)
Exploring the basis of similarity between two SNPs
Comparing terms used to annotate two SNPs ‘rs11150589’ and ‘rs10500264’
DAGplot of terms in the context of EFO tree, with nodes/terms colored according to information content (IC). Also shown is the term name prefixed in the form of ‘x1-x2’. In this case, x1 for ‘rs11150589’, x2 for ‘rs10500264’. The value of x1 (or x2) can be ‘0’, ‘1’ or ‘2’, respectively denoting no annotation, inherited annotation, original annotation.
xSocialiserDAGplotAdv(g=ig_SNP_BM_complete, query1='rs11150589', query2='rs10500264', displayBy="IC", zlim=c(0,4), graph.node.attrs=list(fontsize=25), newpage=F)
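The ‘x1-x2’ coding with values 0, 1 and 2 can be reproduced on toy data as follows. The SNP names, terms and annotations are made up, and the code only illustrates the encoding described above.

```r
# Made-up original and inherited annotations for two hypothetical SNPs
original <- list(snpA = c("t3"), snpB = c("t4", "t1"))
inherited <- list(snpA = c("t1", "root"), snpB = c("root"))
terms <- c("root", "t1", "t3", "t4")

# 2 = original annotation, 1 = inherited annotation, 0 = no annotation
code_for <- function(snp) {
  ifelse(terms %in% original[[snp]], 2,
         ifelse(terms %in% inherited[[snp]], 1, 0))
}

# Prefix of the form 'x1-x2' for every term
prefix <- paste(code_for("snpA"), code_for("snpB"), sep = "-")
names(prefix) <- terms
```

A term prefixed ‘1-2’, such as `t1` here, is inherited by the first SNP but is an original annotation of the second.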
Comparing terms used to annotate two SNPs ‘rs11150589’ and ‘rs4072037’
DAGplot of terms in the context of EFO tree, with nodes/terms colored according to information content (IC). Also shown is the term name prefixed in the form of ‘x1-x2’. In this case, x1 for ‘rs11150589’, x2 for ‘rs4072037’. The value of x1 (or x2) can be ‘0’, ‘1’ or ‘2’, respectively denoting no annotation, inherited annotation, original annotation.
xSocialiserDAGplotAdv(g=ig_SNP_BM_complete, query1='rs11150589', query2='rs4072037', displayBy="IC", zlim=c(0,4), graph.node.attrs=list(fontsize=25), newpage=F)
Comparing terms used to annotate two SNPs ‘rs11150589’ and ‘rs3957148’
DAGplot of terms in the context of EFO tree, with nodes/terms colored according to information content (IC). Also shown is the term name prefixed in the form of ‘x1-x2’. In this case, x1 for ‘rs11150589’, x2 for ‘rs3957148’. The value of x1 (or x2) can be ‘0’, ‘1’ or ‘2’, respectively denoting no annotation, inherited annotation, original annotation.
xSocialiserDAGplotAdv(g=ig_SNP_BM_complete, query1='rs11150589', query2='rs3957148', displayBy="IC", zlim=c(0,4), graph.node.attrs=list(fontsize=25), newpage=F)
Comparing terms used to annotate two SNPs ‘rs11150589’ and ‘rs2066807’
DAGplot of terms in the context of EFO tree, with nodes/terms colored according to information content (IC). Also shown is the term name prefixed in the form of ‘x1-x2’. In this case, x1 for ‘rs11150589’, x2 for ‘rs2066807’. The value of x1 (or x2) can be ‘0’, ‘1’ or ‘2’, respectively denoting no annotation, inherited annotation, original annotation.
xSocialiserDAGplotAdv(g=ig_SNP_BM_complete, query1='rs11150589', query2='rs2066807', displayBy="IC", zlim=c(0,4), graph.node.attrs=list(fontsize=25), newpage=F)
Here is the output of sessionInfo() on the system on which this user manual was built:
> R version 3.2.4 (2016-03-10)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.11.4 (El Capitan)
>
> locale:
> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>
> attached base packages:
> [1] grid stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] RCircos_1.1.3 VennDiagram_1.6.16 futile.logger_1.4.1
> [4] XGR_1.0.2 ggplot2_2.1.0 dnet_1.0.9
> [7] supraHex_1.11.1 hexbin_1.27.1 igraph_1.0.1
> [10] rmarkdown_0.9.5
>
> loaded via a namespace (and not attached):
> [1] Rcpp_0.12.5 formatR_1.3
> [3] highr_0.5.1 plyr_1.8.3
> [5] GenomeInfoDb_1.4.3 XVector_0.8.0
> [7] futile.options_1.0.0 bitops_1.0-6
> [9] zlibbioc_1.14.0 tools_3.2.4
> [11] digest_0.6.9 evaluate_0.8.3
> [13] nlme_3.1-125 gtable_0.2.0
> [15] lattice_0.20-33 Matrix_1.2-4
> [17] graph_1.46.0 Rgraphviz_2.12.0
> [19] yaml_2.1.13 parallel_3.2.4
> [21] rtracklayer_1.28.10 stringr_1.0.0
> [23] knitr_1.12.3 Biostrings_2.36.4
> [25] S4Vectors_0.6.6 IRanges_2.2.9
> [27] stats4_3.2.4 Biobase_2.28.0
> [29] BiocParallel_1.2.22 XML_3.98-1.4
> [31] reshape2_1.4.1 lambda.r_1.1.7
> [33] magrittr_1.5 GenomicAlignments_1.4.2
> [35] Rsamtools_1.20.5 scales_0.4.0
> [37] htmltools_0.3 BiocGenerics_0.14.0
> [39] GenomicRanges_1.20.8 ape_3.5
> [41] colorspace_1.2-6 labeling_0.3
> [43] stringi_1.1.1 RCurl_1.95-4.8
> [45] munsell_0.4.3