1 Web-app (linkto)

2 Summary

This demo showcases the analytic power of XGR in making sense of eQTL SNPs summarised from eQTL mappings, covering enrichment analysis, similarity analysis and network analysis. The SNP-level enrichment analysis supported in XGR is unique in being ontology tree-aware, in contrast to conventional analysis that uses the list of GWAS Catalog traits. Similarity analysis adds a new dimension to interpreting eQTL SNPs: it not only shows their relevance to GWAS traits but also measures how similar they are to each other in terms of their trait profiles (i.e. ontology annotation profiles).

3 Package

library(XGR)

# Specify the locations of built-in data
RData.location <- "http://galahad.well.ox.ac.uk/bigdata"

4 Data

Here we illustrate the functionality in XGR for interpreting cis-eQTLs (SNPs) that are induced by 24-hour interferon gamma (IFN24), 24-hour LPS (LPS24) or 2-hour LPS (LPS2), or that are found in the naive state.

# Load cis-eQTL mapping results
cis <- xRDataLoader(RData.customised='JKscience_TS2A', RData.location=RData.location)

# Create a data frame for cis-eQTLs significant in the naive state
ind <- which(cis$Naive_t>0 & cis$Naive_fdr<0.05)
df_cis_Naive <- cis[ind, c('variant','Symbol','Naive_t','Naive_fdr')]

# Create a data frame for cis-eQTLs significantly induced by LPS2
ind <- which(cis$LPS2_t>0 & cis$LPS2_fdr<0.05)
df_cis_LPS2 <- cis[ind, c('variant','Symbol','LPS2_t','LPS2_fdr')]

# Create a data frame for cis-eQTLs significantly induced by LPS24
ind <- which(cis$LPS24_t>0 & cis$LPS24_fdr<0.05)
df_cis_LPS24 <- cis[ind, c('variant','Symbol','LPS24_t','LPS24_fdr')]

# Create a data frame for cis-eQTLs significantly induced by IFN24
ind <- which(cis$IFN_t>0 & cis$IFN_fdr<0.05)
df_cis_IFN24 <- cis[ind, c('variant','Symbol','IFN_t','IFN_fdr')]
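
The four blocks above follow one pattern: keep rows with a positive t-statistic and an FDR below 0.05 for a condition, then keep that condition's columns. A minimal sketch of this pattern as a helper is given below. The function `subset_induced` and the toy data frame are hypothetical (neither is part of XGR), and the sketch assumes columns named `<prefix>_t` and `<prefix>_fdr`, as used for `cis` above.

```r
# Hypothetical helper (not part of XGR): select cis-eQTLs for one condition.
# Assumes columns named '<prefix>_t' and '<prefix>_fdr', as in 'cis' above.
subset_induced <- function(df, prefix, fdr.cutoff=0.05) {
    col_t <- paste0(prefix, "_t")
    col_fdr <- paste0(prefix, "_fdr")
    # Positive t-statistic and FDR below the cutoff
    ind <- which(df[[col_t]] > 0 & df[[col_fdr]] < fdr.cutoff)
    df[ind, c("variant", "Symbol", col_t, col_fdr)]
}

# Toy data for illustration only (made-up values, not from JKscience_TS2A)
toy <- data.frame(
    variant = c("rsA", "rsB", "rsC"),
    Symbol = c("G1", "G2", "G3"),
    Naive_t = c(5.1, -3.2, 0.4),
    Naive_fdr = c(0.001, 0.002, 0.80),
    IFN_t = c(0.2, 6.3, 4.4),
    IFN_fdr = c(0.90, 0.0001, 0.03),
    stringsAsFactors = FALSE
)
toy_Naive <- subset_induced(toy, "Naive")
toy_IFN24 <- subset_induced(toy, "IFN")
```

With the real data, a call such as `subset_induced(cis, "LPS2")` would be expected to give the same rows and columns as the explicit LPS2 block above.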

5 Showcase

5.1 Enrichment analysis: necessity of using Experimental Factor Ontology (EFO) and respecting ontology tree structure

Conventionally, SNP-based enrichment analysis uses only the traits reported in GWAS studies. However, these GWAS traits can be mapped onto EFO terms. Using EFO enables us to look at a general term (representing a group of related traits) and its annotated SNPs, which include GWAS-reported SNPs (‘original annotations’) and SNPs inherited from its child terms (‘inherited annotations’).
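
To make the distinction between original and inherited annotations concrete, here is a minimal base-R sketch of propagating SNP annotations up a toy ontology. The term and SNP names are made up, the toy ontology is a simple tree (the real EFO allows a term to have several parents), and this is an illustration of the idea rather than XGR's internal code.

```r
# Toy ontology: child -> parent edges (hypothetical terms, not real EFO identifiers)
parent_of <- c(T2="T1", T3="T1", T4="T2")

# Original annotations: SNPs reported directly for each term (made-up SNP names)
original <- list(T1=character(0), T2=c("rsA"), T3=c("rsB"), T4=c("rsC", "rsD"))

# All ancestors of a term, found by walking up the child -> parent edges
ancestors <- function(term) {
    res <- character(0)
    while (term %in% names(parent_of)) {
        term <- parent_of[[term]]
        res <- c(res, term)
    }
    res
}

# Propagate each term's original SNPs to all of its ancestors
full <- original
for (term in names(original)) {
    for (anc in ancestors(term)) {
        full[[anc]] <- union(full[[anc]], original[[term]])
    }
}

# Inherited annotations: SNPs a term has only via its descendants
inherited <- Map(setdiff, full, original)
```

In this toy example the general term `T1` has no original annotations of its own but inherits all four SNPs, which is why a general term can become testable only once the ontology is used.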

As a routine, SNP-based enrichment analysis also considers SNPs in linkage disequilibrium (LD) with the input SNPs.

5.1.1 Conventional vs Ontology analysis

Enrichment analysis is done using the disease part of EFO. The necessity of using EFO is justified by comparing the following 2 scenarios:

  1. EFO (-): without using EFO
  2. EFO (+): using EFO

This is demonstrated using cis-eQTLs induced by 24-hour interferon gamma.
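
For orientation, the kind of over-representation test that underlies an enriched term, together with the z-score and FDR used for display below, can be sketched in base R. This is one common formulation (a hypergeometric test with Benjamini-Hochberg adjustment) and is not claimed to match XGR's internals; all counts and term names are made up.

```r
# Made-up counts for three hypothetical terms (illustration only)
N <- 10000                                # background SNPs
n <- 200                                  # input SNPs
K <- c(termA=300, termB=150, termC=500)   # background SNPs annotated to each term
k <- c(termA=20, termB=3, termC=11)       # input SNPs annotated to each term

# Hypergeometric tail probability of seeing at least k annotated input SNPs
pvalue <- phyper(k - 1, K, N - K, n, lower.tail=FALSE)

# z-score: observed overlap against its hypergeometric mean and standard deviation
expected <- n * K / N
sdev <- sqrt(n * (K/N) * (1 - K/N) * (N - n) / (N - 1))
zscore <- (k - expected) / sdev

# Benjamini-Hochberg adjustment, then an FDR cutoff
adjp <- p.adjust(pvalue, method="BH")
significant <- names(k)[adjp < 0.05]
```

Only `termA`, whose observed overlap (20) is far above its expectation (6), passes the cutoff here.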

df_cis <- df_cis_IFN24
data <- df_cis$variant

EFO (-)

eTerm_noEF <- xEnricherSNPs(data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, true.path.rule=F, RData.location=RData.location)
xEnrichViewer(eTerm_noEF, 10)

Barplot of enriched terms

bp_noEF <- xEnrichBarplot(eTerm_noEF, top_num="auto", displayBy="zscore")
bp_noEF

EFO (+)

eTerm_EF <- xEnricherSNPs(data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, true.path.rule=T, RData.location=RData.location)
xEnrichViewer(eTerm_EF, 10)

Barplot of enriched terms

bp_EF <- xEnrichBarplot(eTerm_EF, top_num="auto", displayBy="zscore")
bp_EF

Comparing enrichment results with and without using EFO

Under FDR cutoff at 0.01

list_eTerm <- list(eTerm_noEF, eTerm_EF)
names(list_eTerm) <- c('EFO (-)', 'EFO (+)')
bp_EF_FDR_001 <- xEnrichCompare(list_eTerm, displayBy="zscore", FDR.cutoff=0.01, bar.label.size=3)
bp_EF_FDR_001 + theme(axis.text.y=element_text(size=10))

DAGplot of enriched terms in the context of the EFO tree, with nodes/terms colored according to the number of analyses in which they are called significant. Term names (if significant) are prefixed in the form ‘x1-x2’, where x1 is for ‘EFO (-)’ and x2 for ‘EFO (+)’. Each of x1 and x2 is ‘1’ or ‘0’, denoting whether or not the term is called significant.

xEnrichDAGplotAdv(bp_EF_FDR_001, displayBy="nSig", colormap="white-lightcyan-cyan", layout.orientation="left_right", node.info=c("term_name"), graph.node.attrs=list(fontsize=20), newpage=F)
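
The ‘x1-x2’ prefix and the significance count used for node colors can be illustrated with a small base-R sketch. The adjusted p-values and term names below are made up, and the code only mimics the labelling convention described above; it is not how `xEnrichCompare` or `xEnrichDAGplotAdv` is implemented.

```r
# Made-up adjusted p-values per term for two analyses (NA: term not reported)
adjp <- data.frame(
    "EFO (-)" = c(0.20, 0.004, NA),
    "EFO (+)" = c(0.001, 0.002, 0.008),
    row.names = c("term A", "term B", "term C"),
    check.names = FALSE
)

# 1 if called significant at the cutoff, 0 otherwise (missing counts as not significant)
FDR.cutoff <- 0.01
called <- (!is.na(adjp) & adjp < FDR.cutoff) * 1

# Number of analyses calling each term significant, and the 'x1-x2' prefix
nSig <- rowSums(called)
prefix <- apply(called, 1, paste, collapse="-")
```

Here `term B` would be labelled ‘1-1’ (significant in both analyses), while `term A` and `term C` would be labelled ‘0-1’.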

Under FDR cutoff at 0.05

list_eTerm <- list(eTerm_noEF, eTerm_EF)
names(list_eTerm) <- c('EFO (-)', 'EFO (+)')
bp_EF_FDR_005 <- xEnrichCompare(list_eTerm, displayBy="zscore", FDR.cutoff=0.05, bar.label.size=3)
bp_EF_FDR_005 + theme(axis.text.y=element_text(size=10))

DAGplot of enriched terms in the context of the EFO tree, with nodes/terms colored according to the number of analyses in which they are called significant. Term names (if significant) are prefixed in the form ‘x1-x2’, where x1 is for ‘EFO (-)’ and x2 for ‘EFO (+)’. Each of x1 and x2 is ‘1’ or ‘0’, denoting whether or not the term is called significant.

xEnrichDAGplotAdv(bp_EF_FDR_005, displayBy="nSig", colormap="white-lightcyan-cyan", layout.orientation="left_right", node.info=c("term_name"), graph.node.attrs=list(fontsize=30), newpage=F)

5.1.2 Ontology tree-aware analysis

Enrichment analysis is done using the disease part of EFO. The necessity of using EFO and respecting the ontology tree structure is justified by comparing the following 3 scenarios:

  1. EFO (-): without using EFO
  2. EFO (+) & Tree (-): using EFO but without respecting ontology tree
  3. EFO (+) & Tree (+): using EFO and also respecting ontology tree

This is demonstrated using cis-eQTLs induced by 24-hour interferon gamma.

df_cis <- df_cis_IFN24
data <- df_cis$variant

EFO (-)

eTerm_noEF <- xEnricherSNPs(data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, true.path.rule=F, ontology.algorithm="none", RData.location=RData.location)
xEnrichViewer(eTerm_noEF, 10)

Barplot of enriched terms

bp_noEF <- xEnrichBarplot(eTerm_noEF, top_num="auto", displayBy="zscore")
bp_noEF

DAGplot of enriched terms

xEnrichDAGplot(eTerm_noEF, top_num="auto", displayBy="fdr", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=25,fontcolor="blue"), newpage=F)

EFO (+) & Tree (-)

eTerm_EF_noTree <- xEnricherSNPs(data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, true.path.rule=T, ontology.algorithm="none", RData.location=RData.location)
xEnrichViewer(eTerm_EF_noTree, 10)

Barplot of enriched terms

bp_EF_noTree <- xEnrichBarplot(eTerm_EF_noTree, top_num="auto", displayBy="zscore")
bp_EF_noTree

EFO (+) & Tree (+)

eTerm_EF_Tree <- xEnricherSNPs(data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, true.path.rule=T, ontology.algorithm="lea", RData.location=RData.location)
xEnrichViewer(eTerm_EF_Tree, 10)

Barplot of enriched terms

bp_EF_Tree <- xEnrichBarplot(eTerm_EF_Tree, top_num="auto", displayBy="zscore")
bp_EF_Tree

DAGplot of enriched terms

xEnrichDAGplot(eTerm_EF_Tree, top_num="auto", displayBy="fdr", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=25,fontcolor="blue"), newpage=F)

Comparing enrichment results with and without using EFO and respecting the ontology tree

Under FDR cutoff at 0.01

list_eTerm <- list(eTerm_noEF, eTerm_EF_noTree, eTerm_EF_Tree)
names(list_eTerm) <- c('EFO (-)', 'EFO (+) & Tree (-)', 'EFO (+) & Tree (+)')
bp_FDR_001 <- xEnrichCompare(list_eTerm, displayBy="zscore", FDR.cutoff=0.01, bar.label.size=3)
bp_FDR_001 + theme(axis.text.y=element_text(size=10))

DAGplot of enriched terms in the context of the EFO tree, with nodes/terms colored according to the number of analyses in which they are called significant. Term names (if significant) are prefixed in the form ‘x1-x2-x3’, where x1 is for ‘EFO (-)’, x2 for ‘EFO (+) & Tree (-)’ and x3 for ‘EFO (+) & Tree (+)’. Each of x1 to x3 is ‘1’ or ‘0’, denoting whether or not the term is called significant.

xEnrichDAGplotAdv(bp_FDR_001, displayBy="nSig", colormap="white-lightcyan-cyan", layout.orientation="left_right", node.info=c("term_name"), graph.node.attrs=list(fontsize=20), newpage=F)

5.2 Comparative enrichment analysis: across conditions

This is demonstrated using cis-eQTL SNPs induced by 24-hour interferon gamma (IFN24), 24-hour LPS (LPS24) or 2-hour LPS (LPS2), and those found in the naive state. All analyses use the disease part of EFO, consider LD SNPs and respect the ontology tree.

EFO enrichments for cis-eQTLs significantly induced in naive state

data <- df_cis_Naive$variant
eTerm_EF_Naive <- xEnricherSNPs(data=data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, ontology.algorithm="lea", RData.location=RData.location)
xEnrichViewer(eTerm_EF_Naive, 10)

Barplot of enriched terms

bp_EF_Naive <- xEnrichBarplot(eTerm_EF_Naive, top_num="auto", displayBy="zscore")
bp_EF_Naive

DAGplot of enriched terms

xEnrichDAGplot(eTerm_EF_Naive, top_num="auto", displayBy="fdr", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=25,fontcolor="blue"), newpage=F)

EFO enrichments for cis-eQTLs significantly induced by LPS2

data <- df_cis_LPS2$variant
eTerm_EF_LPS2 <- xEnricherSNPs(data=data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, ontology.algorithm="lea", RData.location=RData.location)
xEnrichViewer(eTerm_EF_LPS2, 10)

Barplot of enriched terms

bp_EF_LPS2 <- xEnrichBarplot(eTerm_EF_LPS2, top_num="auto", displayBy="zscore")
bp_EF_LPS2

DAGplot of enriched terms

xEnrichDAGplot(eTerm_EF_LPS2, top_num="auto", displayBy="fdr", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=30,fontcolor="blue"), newpage=F)

EFO enrichments for cis-eQTLs significantly induced by LPS24

data <- df_cis_LPS24$variant
eTerm_EF_LPS24 <- xEnricherSNPs(data=data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, ontology.algorithm="lea", RData.location=RData.location)
xEnrichViewer(eTerm_EF_LPS24, 10)

Barplot of enriched terms

bp_EF_LPS24 <- xEnrichBarplot(eTerm_EF_LPS24, top_num="auto", displayBy="zscore")
bp_EF_LPS24

DAGplot of enriched terms

xEnrichDAGplot(eTerm_EF_LPS24, top_num="auto", displayBy="fdr", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=30,fontcolor="blue"), newpage=F)

EFO enrichments for cis-eQTLs significantly induced by IFN24

data <- df_cis_IFN24$variant
eTerm_EF_IFN24 <- xEnricherSNPs(data=data, ontology="EF_disease", include.LD="EUR", LD.r2=0.8, ontology.algorithm="lea", RData.location=RData.location)
xEnrichViewer(eTerm_EF_IFN24, 10)

Barplot of enriched terms

bp_EF_IFN24 <- xEnrichBarplot(eTerm_EF_IFN24, top_num="auto", displayBy="zscore")
bp_EF_IFN24

DAGplot of enriched terms

xEnrichDAGplot(eTerm_EF_IFN24, top_num="auto", displayBy="adjp", layout.orientation="left_right", node.info=c("full_term_name"), graph.node.attrs=list(fontsize=25,fontcolor="blue"), newpage=F)

Comparing EFO-based enrichment results for cis-eQTLs induced in four conditions

Under FDR cutoff at 0.01

list_eTerm <- list(eTerm_EF_Naive, eTerm_EF_LPS2, eTerm_EF_LPS24, eTerm_EF_IFN24)
names(list_eTerm) <- c('Naive cis-eQTLs', 'LPS2 cis-eQTLs', 'LPS24 cis-eQTLs', 'IFN24 cis-eQTLs')
bp_FDR_001 <- xEnrichCompare(list_eTerm, displayBy="zscore", FDR.cutoff=0.01, bar.label.size=2.5)
bp_FDR_001 + theme(axis.text.y=element_text(size=10))

DAGplot of enriched terms in the context of the EFO tree, with nodes/terms colored according to the number of conditions in which they are called significant. Term names (if significant) are prefixed in the form ‘x1-x2-x3-x4’, where x1 is for ‘Naive cis-eQTLs’, x2 for ‘LPS2 cis-eQTLs’, x3 for ‘LPS24 cis-eQTLs’ and x4 for ‘IFN24 cis-eQTLs’. Each of x1 to x4 is ‘1’ or ‘0’, denoting whether or not the term is called significant.

xEnrichDAGplotAdv(bp_FDR_001, displayBy="nSig", colormap="white-lightcyan-cyan", layout.orientation="left_right", node.info=c("term_name"), graph.node.attrs=list(fontsize=30), newpage=F)

5.3 Similarity analysis: new dimension for interpretation

Similarity analysis adds a new dimension to interpreting eQTL SNPs: it not only shows their relevance to GWAS traits but also measures how similar they are to each other in terms of their trait profiles (i.e. ontology annotation profiles).

It is demonstrated using cis-eQTLs induced by 24-hour interferon gamma. SNP-based similarity analysis uses GWAS Catalog traits mapped to the disease part of EFO; that is, the SNP annotation profiles are restricted to the disease part of EFO.
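
As a rough guide to what the similarity scores mean, the sketch below computes information content (IC), Resnik similarity between terms, and a best-matching similarity between SNPs on a toy ontology. Everything here is an assumption made for illustration: the terms, frequencies and SNP annotations are made up, the base-10 logarithm is assumed, and `bm_complete` is a complete-linkage reading of the "BM.complete" measure rather than the package's actual code.

```r
# Toy ontology (hypothetical terms): ancestors of each term, including the term itself
anc <- list(
    root = "root",
    T1 = c("T1", "root"),
    T2 = c("T2", "root"),
    T3 = c("T3", "T1", "root")
)

# Made-up fraction of background SNPs annotated to each term (after propagation)
freq <- c(root=1, T1=0.1, T2=0.01, T3=0.001)

# Information content: rarer terms are more informative (log base is an assumption)
IC <- -log10(freq)

# Resnik similarity of two terms: IC of their most informative common ancestor
resnik <- function(a, b) max(IC[intersect(anc[[a]], anc[[b]])])

# Best-matching similarity of two SNPs from their annotated terms.
# Complete-linkage flavour (assumed reading of "BM.complete"): the weakest best match.
bm_complete <- function(termsA, termsB) {
    sim <- outer(termsA, termsB, Vectorize(resnik))
    min(apply(sim, 1, max), apply(sim, 2, max))
}

snp_terms <- list(rsA=c("T3", "T2"), rsB=c("T3"), rsC=c("T1"))
sim_AB <- bm_complete(snp_terms$rsA, snp_terms$rsB)
sim_BC <- bm_complete(snp_terms$rsB, snp_terms$rsC)
```

Under this reading, `rsA` and `rsB` score 0 despite sharing `T3`, because the other term of `rsA` (`T2`) has no informative match in `rsB`; `rsB` and `rsC` score 1 through their common ancestor `T1`.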

df_cis <- df_cis_IFN24
data <- df_cis$variant

The RCircos package is needed for the Circos visualisations below

library(RCircos)

5.3.1 SNP similarity network

data <- df_cis$variant
ig_SNP_BM_complete <- xSocialiserSNPs(data=data, ontology="EF_disease", measure="BM.complete", method.term="Resnik", rescale=F, RData.location=RData.location)

Circos plot illustrating the 100 most similar SNP pairs:

xCircos(g=ig_SNP_BM_complete, entity="SNP", top_num=100, entity.label.cex=0.8, verbose=F, RData.location=RData.location)

5.3.2 Basis of SNP similarity

5.3.2.1 rs11150589

Circos plot involving an SNP ‘rs11150589’

xCircos(g=ig_SNP_BM_complete, entity="SNP", nodes.query='rs11150589', entity.label.cex=0.8, verbose=F, RData.location=RData.location)

Circos plot involving the SNP ‘rs11150589’, with labels restricted to SNPs of interest: ‘rs11150589’, ‘rs10500264’, ‘rs4072037’, ‘rs3957148’ and ‘rs2066807’

xCircos(g=ig_SNP_BM_complete, entity="SNP", nodes.query='rs11150589', entity.label.cex=0.8, entity.label.side="in", entity.label.query=c('rs11150589','rs10500264','rs4072037','rs3957148','rs2066807'), verbose=F, RData.location=RData.location)

>          variant Symbol     IFN_t      IFN_fdr
> 2723  rs11150589  ITGAL 13.843583 2.270000e-27
> 1114  rs10500264  CEBPA  4.001606 8.848898e-03
> 12711  rs4072037    GBA  4.035999 7.881209e-03
> 12677  rs3957148 AGPAT1  3.488521 4.207953e-02
> 8546   rs2066807  CNPY2  4.719715 6.555550e-04
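
The code that printed the table above is not shown here. One way such a table could be obtained is a row lookup by variant, sketched below on a small stand-in data frame. `df_toy` is hypothetical: three of its rows are copied from the table above and one row is made up.

```r
# Stand-in for df_cis (three rows copied from the table above, plus one made-up row)
df_toy <- data.frame(
    variant = c("rs10500264", "rsMADEUP", "rs11150589", "rs2066807"),
    Symbol = c("CEBPA", "GENEX", "ITGAL", "CNPY2"),
    IFN_t = c(4.001606, 5.5, 13.843583, 4.719715),
    IFN_fdr = c(8.848898e-03, 1e-4, 2.27e-27, 6.555550e-04),
    stringsAsFactors = FALSE
)

# Look up SNPs of interest, returned in the order they are listed
snps <- c("rs11150589", "rs10500264", "rs2066807")
df_sub <- df_toy[match(snps, df_toy$variant), ]
```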

DAG plot of terms used to annotate an SNP

Nodes/terms in a directed acyclic graph (DAG) are colored according to information content (IC). Original terms are shown as box-shaped nodes and inherited terms as ellipse-shaped nodes.

Terms used to annotate the SNP ‘rs11150589’:

xSocialiserDAGplot(g=ig_SNP_BM_complete, query='rs11150589', displayBy="IC", zlim=c(0,4), node.info=c("full_term_name"), graph.node.attrs=list(fontsize=25,color="transparent"), newpage=F)

Terms used to annotate the SNP ‘rs10500264’:

xSocialiserDAGplot(g=ig_SNP_BM_complete, query='rs10500264', displayBy="IC", zlim=c(0,4), node.info=c("full_term_name"), graph.node.attrs=list(fontsize=20,color="transparent"), newpage=F)

Terms used to annotate the SNP ‘rs4072037’:

xSocialiserDAGplot(g=ig_SNP_BM_complete, query='rs4072037', displayBy="IC", zlim=c(0,4), node.info=c("full_term_name"), graph.node.attrs=list(fontsize=15,color="transparent"), newpage=F)

Terms used to annotate the SNP ‘rs3957148’:

xSocialiserDAGplot(g=ig_SNP_BM_complete, query='rs3957148', displayBy="IC", zlim=c(0,4), node.info=c("full_term_name"), graph.node.attrs=list(fontsize=15,color="transparent"), newpage=F)

Terms used to annotate the SNP ‘rs2066807’:

xSocialiserDAGplot(g=ig_SNP_BM_complete, query='rs2066807', displayBy="IC", zlim=c(0,4), node.info=c("full_term_name"), graph.node.attrs=list(fontsize=15,color="transparent"), newpage=F)

Exploring the basis of similarity between two SNPs

Comparing terms used to annotate two SNPs ‘rs11150589’ and ‘rs10500264’

DAGplot of terms in the context of the EFO tree, with nodes/terms colored according to information content (IC). Term names are prefixed in the form ‘x1-x2’, where x1 is for ‘rs11150589’ and x2 for ‘rs10500264’. Each of x1 and x2 is ‘0’, ‘1’ or ‘2’, denoting no annotation, inherited annotation and original annotation, respectively.

xSocialiserDAGplotAdv(g=ig_SNP_BM_complete, query1='rs11150589', query2='rs10500264', displayBy="IC", zlim=c(0,4), graph.node.attrs=list(fontsize=25), newpage=F)
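
The 0/1/2 coding used in these pairwise plots can be sketched on a toy ontology as follows. The terms, SNP names and annotations are made up, and `annotation_code` is a hypothetical helper written for this illustration, not a function from XGR.

```r
# Toy ontology: ancestors of each term, excluding the term itself (hypothetical terms)
anc <- list(root=character(0), T1="root", T2="root", T3=c("T1", "root"))

# Original (directly reported) terms per SNP; made-up annotations
original <- list(rsA=c("T3"), rsB=c("T1", "T2"))

# Code each term for one SNP: 2 = original, 1 = inherited from a descendant, 0 = none
annotation_code <- function(snp) {
    orig <- original[[snp]]
    inh <- setdiff(unique(unlist(anc[orig])), orig)
    code <- setNames(rep(0, length(anc)), names(anc))
    code[inh] <- 1
    code[orig] <- 2
    code
}

# 'x1-x2' prefix per term for the SNP pair
prefix <- paste(annotation_code("rsA"), annotation_code("rsB"), sep="-")
names(prefix) <- names(anc)
```

In this toy case `T1` gets the prefix ‘1-2’: it is inherited by `rsA` (through `T3`) and is an original annotation of `rsB`, so it is the kind of shared term that drives similarity between the two SNPs.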

Comparing terms used to annotate two SNPs ‘rs11150589’ and ‘rs4072037’

DAGplot of terms in the context of the EFO tree, with nodes/terms colored according to information content (IC). Term names are prefixed in the form ‘x1-x2’, where x1 is for ‘rs11150589’ and x2 for ‘rs4072037’. Each of x1 and x2 is ‘0’, ‘1’ or ‘2’, denoting no annotation, inherited annotation and original annotation, respectively.

xSocialiserDAGplotAdv(g=ig_SNP_BM_complete, query1='rs11150589', query2='rs4072037', displayBy="IC", zlim=c(0,4), graph.node.attrs=list(fontsize=25), newpage=F)

Comparing terms used to annotate two SNPs ‘rs11150589’ and ‘rs3957148’

DAGplot of terms in the context of the EFO tree, with nodes/terms colored according to information content (IC). Term names are prefixed in the form ‘x1-x2’, where x1 is for ‘rs11150589’ and x2 for ‘rs3957148’. Each of x1 and x2 is ‘0’, ‘1’ or ‘2’, denoting no annotation, inherited annotation and original annotation, respectively.

xSocialiserDAGplotAdv(g=ig_SNP_BM_complete, query1='rs11150589', query2='rs3957148', displayBy="IC", zlim=c(0,4), graph.node.attrs=list(fontsize=25), newpage=F)

Comparing terms used to annotate two SNPs ‘rs11150589’ and ‘rs2066807’

DAGplot of terms in the context of the EFO tree, with nodes/terms colored according to information content (IC). Term names are prefixed in the form ‘x1-x2’, where x1 is for ‘rs11150589’ and x2 for ‘rs2066807’. Each of x1 and x2 is ‘0’, ‘1’ or ‘2’, denoting no annotation, inherited annotation and original annotation, respectively.

xSocialiserDAGplotAdv(g=ig_SNP_BM_complete, query1='rs11150589', query2='rs2066807', displayBy="IC", zlim=c(0,4), graph.node.attrs=list(fontsize=25), newpage=F)

6 Session Info

Here is the output of sessionInfo() on the system on which this user manual was built:

> R version 3.2.4 (2016-03-10)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X 10.11.4 (El Capitan)
> 
> locale:
> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
> 
> attached base packages:
> [1] grid      stats     graphics  grDevices utils     datasets  methods  
> [8] base     
> 
> other attached packages:
>  [1] RCircos_1.1.3       VennDiagram_1.6.16  futile.logger_1.4.1
>  [4] XGR_1.0.2           ggplot2_2.1.0       dnet_1.0.9         
>  [7] supraHex_1.11.1     hexbin_1.27.1       igraph_1.0.1       
> [10] rmarkdown_0.9.5    
> 
> loaded via a namespace (and not attached):
>  [1] Rcpp_0.12.5             formatR_1.3            
>  [3] highr_0.5.1             plyr_1.8.3             
>  [5] GenomeInfoDb_1.4.3      XVector_0.8.0          
>  [7] futile.options_1.0.0    bitops_1.0-6           
>  [9] zlibbioc_1.14.0         tools_3.2.4            
> [11] digest_0.6.9            evaluate_0.8.3         
> [13] nlme_3.1-125            gtable_0.2.0           
> [15] lattice_0.20-33         Matrix_1.2-4           
> [17] graph_1.46.0            Rgraphviz_2.12.0       
> [19] yaml_2.1.13             parallel_3.2.4         
> [21] rtracklayer_1.28.10     stringr_1.0.0          
> [23] knitr_1.12.3            Biostrings_2.36.4      
> [25] S4Vectors_0.6.6         IRanges_2.2.9          
> [27] stats4_3.2.4            Biobase_2.28.0         
> [29] BiocParallel_1.2.22     XML_3.98-1.4           
> [31] reshape2_1.4.1          lambda.r_1.1.7         
> [33] magrittr_1.5            GenomicAlignments_1.4.2
> [35] Rsamtools_1.20.5        scales_0.4.0           
> [37] htmltools_0.3           BiocGenerics_0.14.0    
> [39] GenomicRanges_1.20.8    ape_3.5                
> [41] colorspace_1.2-6        labeling_0.3           
> [43] stringi_1.1.1           RCurl_1.95-4.8         
> [45] munsell_0.4.3