Now we are almost ready to run a two-class analysis with
MOSClip
! First of all, we can load the necessary libraries
and the pre-processed
ovarian cancer datasets that we downloaded from TCGA in the previous
tutorials. We also set a seed, in order to have reproducible results
in case of future repetition of this analysis, and we create a directory
where we will save all the results generated by this tutorial.
library(org.Hs.eg.db)
library(EDASeq)
library(graphite)
library(MOSClip)
library(kableExtra)
load("downloadTCGA/TCGA-OV-pre-processed.RData")
set.seed(1234)
dirname <- "MOSresults/twoClass/"
if (!dir.exists(dirname)){
dir.create(dirname)
}
We need to prepare data in order to run MOSClip
. The
first step is to modify all the multi-omic matrices assigning the type
of gene identifier used. Since we’ll be using the graphite
package to download pathways and their graphical structures, gene names
across all omics need to be compatible with graphite
. For
this analysis, we’ll use EntrezIDs. Thus, each gene ID in our data must
be prefixed with “ENTREZID:” for compatibility.
expression <- expAvg
row.names(expression) <- paste0("ENTREZID:", row.names(expression))
mutation <- ov.mutations$data
row.names(mutation) <- paste0("ENTREZID:", row.names(mutation))
names(metClustValues$eMap) <- paste0("ENTREZID:", row.names(metClustValues$eMap))
row.names(ov.cnv) <- paste0("ENTREZID:", row.names(ov.cnv))
Moving to patient selection, we keep only patients whose class annotation is available and their intersection with available patients across the 4 omics. Finally, our class annotation data frame is filtered to keep only the selected patients.
# select common patients
patients <- row.names(classes)
patients <- intersect(patients, colnames(expression))
patients <- intersect(patients, colnames(metClustValues$MET_Cancer_Clustered))
patients <- intersect(patients, colnames(mutation))
patients <- intersect(patients, colnames(ov.cnv))
classAnnot <- classes[patients, , drop=FALSE]
table(classAnnot)
## classes
## Differentiated Immunoreactive Mesenchymal Proliferative
## 74 67 64 63
At this point, we need to extract selected patients for each
multi-omic matrix. After patient selection, we can normalize (upper
quartile normalization from EDASeq
package) and
log-transform expression data.
# normalize expression data
expression <- expression[, patients, drop = FALSE]
keep = apply(expression >= 100, 1, any)
expNorm <- betweenLaneNormalization(expression[keep, , drop = FALSE], which = "upper")
pseudoExpNorm <- log2(expNorm + 1)
methylation <- metClustValues
methylation$MET_Cancer_Clustered <- methylation$MET_Cancer_Clustered[, patients, drop = FALSE]
mutation <- mutation[, patients, drop = FALSE]
cnv <- ov.cnv[, patients, drop = FALSE]
We are now ready to generate an object of class Omics
using the function makeOmics
. This object is required to
run each type of MOSClip
analysis. It is based on a
MultiAssayExperiment
object, containing an
ExperimentList
with matrices for each omic (we suggest to
use standard names for each experiment as shown in this example).
colData
in this case will contain class annotation for
patients. Additionally, specific slots for MOSClip
analysis
exist, including modelInfo
, where the user must specify the
desired method for data reduction for each omic, and
specificArgs
, with specific parameters to be used by
reduction functions. Both these slots must have the same dimension as
ExperimentList
.
The list of available methods for data dimensionality reduction can
be easily retrieved with availableOmicsMethods()
function.
multiOmics <- makeOmics(experiments = list(exp = pseudoExpNorm,
met = methylation$MET_Cancer_Clustered,
mut = mutation,
cnv = as.matrix(cnv)),
colData = classAnnot,
modelInfo = c("summarizeWithPca", "summarizeInCluster",
"summarizeToNumberOfEvents", "summarizeToNumberOfDirectionalEvents"),
specificArgs = list(pcaArgs = list(name = "exp", shrink = "FALSE", method = "sparse", maxPCs = 3),
clusterArgs = list(name = "met", dict = methylation$eMap, max_cluster_number = 3),
countEvent = list(name = "mut", min_prop = 0.05),
cnvAgv = list(name = "cnv", min_prop = 0.05)))
To run a MOSClip
analysis we also need a list of
pathways that we want to test, as well as their graphical structures if
we want to exploit the topological method implemented in the
package.
We decide to use pathways collected in Reactome database. They can be
downloaded using graphite
that is also able to convert gene
identifiers (here we use EntrezID). Since this download and conversion
process can take several minutes, we will save the resulting
PathwayList object for easy access in future analyses.
if (file.exists("downloadTCGA/reactome-entrez-2024-05-27.RData")) {
load("downloadTCGA/reactome-entrez-2024-05-27.RData")
} else {
reactome <- pathways("hsapiens", "reactome")
reactome <- convertIdentifiers(reactome, "entrez")
file = paste0("downloadTCGA", "/reactome-entrez-", as.character(Sys.Date()), ".RData")
save(reactome, file = file)
}
To keep analyses efficient and avoid redundancy, we select Reactome
pathways based on their number of nodes, considering only those nodes
that are present at least in the expression matrix. We prepare two
distinct PathwayList
objects:
nodesLength <- sapply(reactome, function(g) {length(intersect(graphite::nodes(g),
row.names(pseudoExpNorm)))})
reactSmall <- reactome[nodesLength >= 20 & nodesLength <= 100]
reactHuge <- reactome[nodesLength >= 10 & nodesLength <= 700]
We define the patient groups to compare, using the subtypes established for TCGA ovarian cancer patients. For this tutorial, we’ll focus on a multi-omic comparison between immunoreactive and mesenchymal subtypes, which should provide interesting insights. The user can decide which subtypes to compare.
We then filter accordingly our class annotation data frame and
multiOmics
object.
class1 <- "Immunoreactive"
class2 <- "Mesenchymal"
classAnnotation <- classAnnot[classAnnot$classes %in% c(class1, class2), , drop=FALSE]
multiOmics <- multiOmics[,row.names(classAnnotation)]
Now we are ready for MOSClip
two-class analysis.
We start from the analysis on modules using the function
multiOmicsTwoClassModuleTest
. Required inputs are an
Omics
object, a graph, and a class annotation data frame.
Patients in class annotation should have the same order as in
colData
. Reactome database is built with a hierarchical
organization: different pathways represent the same process with more or
less details, going from very specific and small pathways to very large
and general ones. To avoid redundant results, we choose to focus on a
subset of pathways, specifically those having between 20 and 100 nodes
(reactSmall
list); this way, we aim to limit the overlap in
gene testing.
This step will take some minutes. For this reason, we’ll save the final output in the directory “MOSresults”, and in case of future analyses we’ll load it from there.
if (file.exists(paste0(dirname, "twoClassM.rds"))){
twoClassM <- readRDS(paste0(dirname, "twoClassM.rds"))
} else {
twoClassM <- lapply(reactSmall, function(g) {
res <- multiOmicsTwoClassModuleTest(multiOmics, g, classAnnotation,
useTheseGenes = row.names(pseudoExpNorm))
res
})
saveRDS(twoClassM, file = paste0(dirname, "twoClassM.rds"))
}
The result is a list of MultiOmicsModules
objects. This
list can be used within the function
multiPathwayModuleReport
to return a tabular summary of
modules, sorted by p-values. Besides the p-value for each module,
p-values for tested covariates are shown. The user can specify which
covariates to visualize first in the data frame columns, giving the omic
names to priority_to
argument.
moduleSummary <- multiPathwayModuleReport(twoClassM,
priority_to = c("exp", "met", "cnv"))
module | pvalue | expPC1 | expPC2 | expPC3 | met2k2 | met3k2 | met3k3 | cnvNEG | cnvPOS | mut | |
---|---|---|---|---|---|---|---|---|---|---|---|
Elastic fibre formation.5 | 5 | 1.0e-07 | 0.0000001 | 0.0430093 | NA | 0.5044956 | NA | NA | NA | 0.5472875 | NA |
SUMOylation of transcription factors.4 | 4 | 1.0e-07 | 0.0000000 | NA | NA | NA | NA | NA | NA | 0.5666869 | NA |
Elastic fibre formation.2 | 2 | 1.0e-07 | 0.0000000 | NA | NA | 0.4479830 | NA | NA | NA | NA | NA |
O-glycosylation of TSR domain-containing proteins.4 | 4 | 2.0e-07 | 0.0000000 | NA | NA | 0.9925894 | NA | NA | NA | 0.2557571 | NA |
Gamma carboxylation, hypusinylation, hydroxylation, and arylsulfatase activation.4 | 4 | 2.0e-07 | 0.0000000 | NA | NA | 0.6992146 | NA | NA | NA | 0.0453129 | NA |
Glycosphingolipid metabolism.11 | 11 | 2.0e-07 | 0.0000000 | NA | NA | 0.6992146 | NA | NA | NA | 0.0453129 | NA |
Sphingolipid metabolism.15 | 15 | 2.0e-07 | 0.0000000 | NA | NA | 0.6992146 | NA | NA | NA | 0.0453129 | NA |
Glycosphingolipid catabolism.8 | 8 | 2.0e-07 | 0.0000000 | NA | NA | 0.6992146 | NA | NA | NA | 0.0453129 | NA |
ECM proteoglycans.2 | 2 | 3.0e-07 | 0.1654198 | 0.1413438 | 0.0000425 | 0.0845938 | NA | NA | NA | 0.4293656 | NA |
RUNX2 regulates osteoblast differentiation.4 | 4 | 3.0e-07 | 0.0000001 | NA | NA | NA | NA | NA | 0.8990975 | 0.6534472 | NA |
RUNX2 regulates bone development.5 | 5 | 3.0e-07 | 0.0000001 | NA | NA | NA | NA | NA | 0.8990975 | 0.6534472 | NA |
Nitric oxide stimulates guanylate cyclase.2 | 2 | 3.0e-07 | 0.2998266 | 0.0000000 | NA | NA | NA | NA | NA | 0.0574532 | NA |
O-glycosylation of TSR domain-containing proteins.23 | 23 | 3.0e-07 | 0.0000000 | NA | NA | 0.2906849 | NA | NA | NA | NA | NA |
ECM proteoglycans.1 | 1 | 4.0e-07 | 0.0241493 | 0.0093753 | 0.0000246 | NA | 0.5571007 | 0.3716193 | NA | 0.3592535 | NA |
Assembly of collagen fibrils and other multimeric structures.1 | 1 | 5.0e-07 | 0.0000002 | NA | NA | 0.0526327 | NA | NA | NA | 0.3959735 | NA |
Collagen chain trimerization.1 | 1 | 5.0e-07 | 0.0000002 | NA | NA | 0.0526327 | NA | NA | NA | 0.3959735 | NA |
Chondroitin sulfate biosynthesis.2 | 2 | 7.0e-07 | 0.9921860 | 0.0000002 | 0.3845546 | 0.4312105 | NA | NA | 0.6242335 | 0.1703916 | NA |
Bacterial Infection Pathways.1 | 1 | 8.0e-07 | 0.0000000 | NA | NA | NA | 0.5747115 | 0.9897028 | NA | NA | NA |
Molecules associated with elastic fibres.1 | 1 | 8.0e-07 | 0.0000231 | 0.0519096 | 0.0007252 | 0.3218417 | NA | NA | 0.7926718 | 0.1662188 | NA |
Non-integrin membrane-ECM interactions.4 | 4 | 8.0e-07 | 0.0000004 | 0.0000123 | NA | NA | 0.4979362 | 0.9528540 | NA | 0.7731634 | NA |
Signaling by TGF-beta Receptor Complex.12 | 12 | 9.0e-07 | 0.0000000 | NA | NA | 0.0370336 | NA | NA | 0.6218599 | 0.0552526 | NA |
Transcriptional activity of SMAD2/SMAD3:SMAD4 heterotrimer.5 | 5 | 9.0e-07 | 0.0000000 | NA | NA | 0.0370336 | NA | NA | 0.6218599 | 0.0552526 | NA |
SMAD2/SMAD3:SMAD4 heterotrimer regulates transcription.4 | 4 | 9.0e-07 | 0.0000000 | NA | NA | 0.0370336 | NA | NA | 0.6218599 | 0.0552526 | NA |
Assembly of collagen fibrils and other multimeric structures.4 | 4 | 9.0e-07 | 0.0000164 | 0.7065204 | NA | 0.0888073 | NA | NA | NA | 0.1326165 | 0.7739622 |
Collagen chain trimerization.4 | 4 | 9.0e-07 | 0.0000164 | 0.7065204 | NA | 0.0888073 | NA | NA | NA | 0.1326165 | 0.7739622 |
Unfolded Protein Response (UPR).23 | 23 | 1.0e-06 | 0.0000004 | NA | NA | NA | 0.5221858 | 0.8471660 | NA | NA | NA |
O-glycosylation of TSR domain-containing proteins.34 | 34 | 1.1e-06 | 0.0000000 | NA | NA | 0.8321512 | NA | NA | NA | 0.5098868 | NA |
Signaling by ALK in cancer.10 | 10 | 1.2e-06 | 0.2136126 | 0.0000001 | 0.0254404 | 0.1086051 | NA | NA | NA | 0.5563069 | NA |
Signaling by ALK fusions and activated point mutants.10 | 10 | 1.2e-06 | 0.2136126 | 0.0000001 | 0.0254404 | 0.1086051 | NA | NA | NA | 0.5563069 | NA |
Cell junction organization.6 | 6 | 1.3e-06 | 0.0000000 | NA | NA | 0.6829094 | NA | NA | NA | 0.8181169 | NA |
O-glycosylation of TSR domain-containing proteins.28 | 28 | 1.4e-06 | 0.0000000 | NA | NA | 0.7965990 | NA | NA | NA | 0.7398864 | NA |
Chondroitin sulfate biosynthesis.3 | 3 | 1.4e-06 | 0.5245592 | 0.0000006 | 0.1625020 | NA | 0.5076500 | 0.5248538 | 0.3964371 | 0.3221811 | NA |
Collagen formation.2 | 2 | 1.4e-06 | 0.0000010 | 0.0050478 | 0.7565458 | 0.1563999 | NA | NA | 0.3859780 | 0.3054711 | 0.8282875 |
Immunoregulatory interactions between a Lymphoid and a non-Lymphoid cell.14 | 14 | 1.5e-06 | 0.0332499 | 0.0000000 | NA | 0.9115779 | NA | NA | NA | 0.8072930 | NA |
Signaling by FGFR.10 | 10 | 1.6e-06 | 0.4196746 | 0.0000000 | 0.0020458 | 0.3525830 | NA | NA | 0.0782642 | 0.7629884 | NA |
Signaling by FGFR1.8 | 8 | 1.6e-06 | 0.4196746 | 0.0000000 | 0.0020458 | 0.3525830 | NA | NA | 0.0782642 | 0.7629884 | NA |
Collagen formation.3 | 3 | 1.7e-06 | 0.0000010 | 0.0043916 | 0.7007774 | 0.2173066 | NA | NA | 0.3931109 | 0.4241760 | 0.9316167 |
Collagen biosynthesis and modifying enzymes.2 | 2 | 1.7e-06 | 0.0000010 | 0.0043916 | 0.7007774 | 0.2173066 | NA | NA | 0.3931109 | 0.4241760 | 0.9316167 |
Activation of Matrix Metalloproteinases.2 | 2 | 1.7e-06 | 0.0000233 | 0.5657766 | 0.0000687 | 0.0710426 | NA | NA | NA | 0.6188975 | NA |
XBP1(S) activates chaperone genes.25 | 25 | 1.8e-06 | 0.0000000 | NA | NA | NA | 0.1578483 | 0.4096951 | NA | 0.0716312 | NA |
IRE1alpha activates chaperones.27 | 27 | 1.8e-06 | 0.0000000 | NA | NA | NA | 0.1578483 | 0.4096951 | NA | 0.0716312 | NA |
Unfolded Protein Response (UPR).31 | 31 | 1.8e-06 | 0.0000000 | NA | NA | NA | 0.1578483 | 0.4096951 | NA | 0.0716312 | NA |
Heparan sulfate/heparin (HS-GAG) metabolism.7 | 7 | 1.9e-06 | 0.0000001 | 0.0000002 | 0.5249690 | 0.2119964 | NA | NA | NA | 0.6004117 | NA |
A tetrasaccharide linker sequence is required for GAG synthesis.12 | 12 | 1.9e-06 | 0.0000001 | 0.0000002 | 0.5249690 | 0.2119964 | NA | NA | NA | 0.6004117 | NA |
Signaling by FGFR in disease.6 | 6 | 2.0e-06 | 0.2932271 | 0.0000001 | 0.0116695 | 0.0145332 | NA | NA | 0.8262102 | 0.7253705 | 0.1078824 |
Signaling by FGFR2 in disease.1 | 1 | 2.0e-06 | 0.2932271 | 0.0000001 | 0.0116695 | 0.0145332 | NA | NA | 0.8262102 | 0.7253705 | 0.1078824 |
MAPK6/MAPK4 signaling.6 | 6 | 2.0e-06 | 0.0000000 | NA | NA | NA | 0.0008409 | 0.1546065 | NA | 0.3455090 | NA |
Interleukin-10 signaling.35 | 35 | 2.1e-06 | 0.0000000 | NA | NA | NA | NA | NA | NA | 0.3584628 | NA |
Signaling by BMP.1 | 1 | 2.5e-06 | 0.0000001 | 0.9396654 | 0.3636854 | 0.3992086 | NA | NA | 0.8247345 | 0.0344677 | NA |
Chondroitin sulfate/dermatan sulfate metabolism.7 | 7 | 2.5e-06 | 0.4790755 | 0.0000000 | 0.4828802 | 0.9397531 | NA | NA | 0.2800561 | 0.4183288 | 0.8148043 |
Chondroitin sulfate biosynthesis.4 | 4 | 2.6e-06 | 0.3936016 | 0.7777561 | 0.0000001 | 0.0252156 | NA | NA | 0.0562418 | 0.6133604 | NA |
Collagen formation.1 | 1 | 2.6e-06 | 0.0000007 | 0.0020706 | 0.6738027 | 0.1884023 | NA | NA | 0.0822127 | 0.1264153 | 0.8214458 |
Collagen biosynthesis and modifying enzymes.1 | 1 | 2.6e-06 | 0.0000007 | 0.0020706 | 0.6738027 | 0.1884023 | NA | NA | 0.0822127 | 0.1264153 | 0.8214458 |
Diseases associated with glycosaminoglycan metabolism.4 | 4 | 2.6e-06 | 0.0887587 | 0.2683040 | 0.0000001 | NA | NA | NA | 0.7528869 | 0.9907462 | 0.8913413 |
Collagen formation.6 | 6 | 2.7e-06 | 0.0000117 | 0.4137637 | 0.0598106 | 0.8867219 | NA | NA | 0.5096098 | 0.3257258 | NA |
Assembly of collagen fibrils and other multimeric structures.13 | 13 | 2.7e-06 | 0.0000117 | 0.4137637 | 0.0598106 | 0.8867219 | NA | NA | 0.5096098 | 0.3257258 | NA |
Platelet homeostasis.2 | 2 | 2.7e-06 | 0.4200899 | 0.0000002 | 0.0070693 | 0.1039861 | NA | NA | NA | 0.3347172 | NA |
Syndecan interactions.3 | 3 | 2.8e-06 | 0.0001821 | 0.0000002 | NA | NA | 0.0325411 | 0.3753025 | NA | 0.3075261 | NA |
Non-integrin membrane-ECM interactions.5 | 5 | 2.8e-06 | 0.0001821 | 0.0000002 | NA | NA | 0.0325411 | 0.3753025 | NA | 0.3075261 | NA |
Signaling by FGFR in disease.5 | 5 | 2.9e-06 | 0.3606012 | 0.0000004 | 0.0011555 | 0.2837853 | NA | NA | 0.9731297 | 0.9571113 | NA |
Assembly of collagen fibrils and other multimeric structures.7 | 7 | 2.9e-06 | 0.0000000 | NA | NA | 0.9918623 | NA | NA | NA | NA | NA |
Collagen chain trimerization.7 | 7 | 2.9e-06 | 0.0000000 | NA | NA | 0.9918623 | NA | NA | NA | NA | NA |
O-glycosylation of TSR domain-containing proteins.31 | 31 | 2.9e-06 | 0.0000000 | NA | NA | 0.9906175 | NA | NA | NA | 0.0072528 | NA |
Signaling by FGFR in disease.3 | 3 | 3.1e-06 | 0.3575306 | 0.0000004 | 0.0011815 | 0.2877414 | NA | NA | 0.9248441 | 0.9168705 | NA |
Signaling by FGFR in disease.4 | 4 | 3.1e-06 | 0.3225506 | 0.0000001 | 0.0010588 | 0.3536394 | NA | NA | 0.9670913 | 0.7881869 | NA |
Syndecan interactions.7 | 7 | 3.2e-06 | 0.0071035 | 0.8405068 | 0.0000001 | NA | 0.0099464 | 0.7105744 | 0.5835903 | 0.3219886 | NA |
Non-integrin membrane-ECM interactions.9 | 9 | 3.2e-06 | 0.0071035 | 0.8405068 | 0.0000001 | NA | 0.0099464 | 0.7105744 | 0.5835903 | 0.3219886 | NA |
Signaling by FGFR2 in disease.2 | 2 | 3.6e-06 | 0.1727524 | 0.0000001 | 0.0111432 | 0.0438023 | NA | NA | 0.4370921 | 0.8553761 | NA |
ECM proteoglycans.8 | 8 | 3.6e-06 | 0.0000000 | NA | NA | 0.0888882 | NA | NA | 0.4096324 | 0.8958364 | 0.3549886 |
Activation of Matrix Metalloproteinases.3 | 3 | 3.6e-06 | 0.0000001 | 0.1796756 | 0.3827438 | 0.0586231 | NA | NA | NA | 0.4440382 | NA |
FRS-mediated FGFR2 signaling.1 | 1 | 3.6e-06 | 0.1566822 | 0.0000001 | 0.0118130 | 0.0447017 | NA | NA | 0.4599664 | 0.3875140 | NA |
Diseases associated with glycosaminoglycan metabolism.1 | 1 | 3.6e-06 | 0.0000063 | 0.2435271 | 0.2465458 | 0.0203295 | NA | NA | 0.4735686 | 0.8517177 | 0.9448151 |
Defective B4GALT7 causes EDS, progeroid type.1 | 1 | 3.6e-06 | 0.0000063 | 0.2435271 | 0.2465458 | 0.0203295 | NA | NA | 0.4735686 | 0.8517177 | 0.9448151 |
FGFR2 mutant receptor activation.1 | 1 | 3.7e-06 | 0.1855737 | 0.0000001 | 0.0115202 | 0.1384827 | NA | NA | 0.7255324 | 0.5102733 | NA |
Diseases associated with glycosaminoglycan metabolism.3 | 3 | 3.9e-06 | 0.0000070 | 0.2229971 | 0.2492377 | 0.0197037 | NA | NA | 0.4518418 | 0.9549673 | 0.9406503 |
Defective B3GAT3 causes JDSSDHD.1 | 1 | 3.9e-06 | 0.0000070 | 0.2229971 | 0.2492377 | 0.0197037 | NA | NA | 0.4518418 | 0.9549673 | 0.9406503 |
Diseases associated with glycosaminoglycan metabolism.10 | 10 | 3.9e-06 | 0.0000000 | NA | NA | 0.9883884 | NA | NA | NA | 0.2846400 | NA |
Diseases associated with glycosaminoglycan metabolism.2 | 2 | 4.0e-06 | 0.0000077 | 0.2174316 | 0.2502998 | 0.0198502 | NA | NA | 0.4427936 | 0.8482583 | 0.9372121 |
Defective B3GALT6 causes EDSP2 and SEMDJL1.1 | 1 | 4.0e-06 | 0.0000077 | 0.2174316 | 0.2502998 | 0.0198502 | NA | NA | 0.4427936 | 0.8482583 | 0.9372121 |
SHC-mediated cascade:FGFR2.1 | 1 | 4.0e-06 | 0.1582481 | 0.0000001 | 0.0114429 | 0.0805351 | NA | NA | 0.4707031 | 0.5287020 | NA |
Chondroitin sulfate biosynthesis.1 | 1 | 4.0e-06 | 0.4138422 | 0.0000000 | 0.9567405 | 0.1275922 | NA | NA | 0.1305718 | 0.2464359 | NA |
Collagen formation.5 | 5 | 4.1e-06 | 0.0000241 | 0.1134608 | 0.7455172 | 0.5377312 | NA | NA | 0.2002979 | 0.3300364 | 0.1098756 |
Assembly of collagen fibrils and other multimeric structures.12 | 12 | 4.1e-06 | 0.0000241 | 0.1134608 | 0.7455172 | 0.5377312 | NA | NA | 0.2002979 | 0.3300364 | 0.1098756 |
O-glycosylation of TSR domain-containing proteins.1 | 1 | 4.1e-06 | 0.0000000 | NA | NA | 0.8003887 | NA | NA | NA | 0.9922676 | NA |
COPI-dependent Golgi-to-ER retrograde traffic.1 | 1 | 4.1e-06 | 0.0734397 | 0.0000003 | 0.0172880 | NA | 0.0065869 | 0.2067050 | 0.2089244 | 0.2319105 | 0.4636113 |
Binding and Uptake of Ligands by Scavenger Receptors.6 | 6 | 4.1e-06 | 0.0000098 | 0.4437830 | 0.0102223 | NA | 0.2350263 | 0.9940394 | 0.3271237 | 0.7874115 | 0.0133496 |
Heparan sulfate/heparin (HS-GAG) metabolism.3 | 3 | 4.2e-06 | 0.0000000 | 0.0000704 | 0.8200092 | 0.3476560 | NA | NA | 0.2220251 | 0.8520836 | NA |
A tetrasaccharide linker sequence is required for GAG synthesis.7 | 7 | 4.2e-06 | 0.0000000 | 0.0000704 | 0.8200092 | 0.3476560 | NA | NA | 0.2220251 | 0.8520836 | NA |
Downstream signaling of activated FGFR2.2 | 2 | 4.2e-06 | 0.1429566 | 0.0000001 | 0.0115034 | 0.0836176 | NA | NA | 0.4721888 | 0.2036482 | NA |
Signaling by FGFR2.3 | 3 | 4.2e-06 | 0.1429566 | 0.0000001 | 0.0115034 | 0.0836176 | NA | NA | 0.4721888 | 0.2036482 | NA |
Negative regulation of FGFR2 signaling.1 | 1 | 4.2e-06 | 0.1551455 | 0.0000001 | 0.0094478 | 0.3014689 | NA | NA | 0.3678876 | 0.4365714 | NA |
Signaling by FGFR in disease.2 | 2 | 4.2e-06 | 0.3600873 | 0.0000003 | 0.0018168 | 0.3913534 | NA | NA | 0.8828437 | 0.8824586 | 0.1272349 |
Post-translational protein phosphorylation.1 | 1 | 4.3e-06 | 0.0000069 | 0.1293828 | 0.0094200 | 0.6388067 | NA | NA | 0.0646589 | 0.4812814 | 0.0415555 |
Integrin cell surface interactions.1 | 1 | 4.6e-06 | 0.0000054 | 0.7184853 | 0.0007483 | 0.3736606 | NA | NA | 0.2060560 | 0.3940942 | 0.2297422 |
Elastic fibre formation.1 | 1 | 4.8e-06 | 0.0003040 | 0.0668308 | 0.0001162 | NA | 0.8977608 | 0.3369076 | 0.1993385 | 0.2497101 | NA |
NCAM signaling for neurite out-growth.1 | 1 | 4.8e-06 | 0.0000004 | 0.4002290 | 0.0001503 | 0.5139677 | NA | NA | 0.4569624 | 0.0966194 | 0.6087424 |
Transcriptional Regulation by NPAS4.6 | 6 | 4.8e-06 | 0.0000534 | 0.0843128 | 0.0000060 | NA | NA | NA | NA | 0.0332949 | NA |
NPAS4 regulates expression of target genes.5 | 5 | 4.8e-06 | 0.0000534 | 0.0843128 | 0.0000060 | NA | NA | NA | NA | 0.0332949 | NA |
Diseases associated with glycosaminoglycan metabolism.9 | 9 | 4.9e-06 | 0.0000000 | NA | NA | 0.4304541 | NA | NA | NA | 0.8091153 | NA |
Diseases associated with glycosaminoglycan metabolism.8 | 8 | 5.0e-06 | 0.0000000 | NA | NA | 0.4318175 | NA | NA | NA | 0.9967769 | NA |
So far, we have identified some modules that are significant. We can test the robustness of our findings using a resampling procedure. With this strategy, we can repeat the two-class test on a subset of patients, removing 3 random patients at each iteration. Modules are prioritized if they show a resampling success score greater than 80.
We will run the analysis on a subset of pathways, considering only those pathways whose modules were found significant.
Since this step will take much time, results are saved.
useThisPathways <- unique(moduleSummary$pathway[moduleSummary$pvalue <= 0.05])
sModule <- moduleSummary[moduleSummary$pathway %in% useThisPathways, , drop = TRUE]
if (file.exists(paste0(dirname, "permsM.RData"))){
load(paste0(dirname, "permsM.RData"))
}else{
perms <- resamplingModulesTwoClass(fullMultiOmics = multiOmics,
classAnnotation, reactSmall, nperm = 100,
nPatients = 3, pathwaySubset = useThisPathways,
genesToConsider = row.names(pseudoExpNorm))
save(perms, file = paste0(dirname, "permsM.RData"))
}
The function selectStablePathwaysModules
will retrieve
the tabular summary of the results for those pathway modules whose
success score is greater than 80. It will also print a dotplot showing
the resampling success of modules based on their significance level. The
resampling success count can be retrieved with
getPathwaysModulesSuccess
and counts can be appended as a
new column to the tabular summary of module results with
addResamplingCounts
.
stableModulesSummary <- selectStablePathwaysModules(perms = perms, moduleSummary = sModule, success = 80)
resamplingSuccessCount <- getPathwaysModulesSuccess(perms = perms, moduleSummary = sModule)
moduleSummary <- addResamplingCounts(moduleSummary, resamplingSuccessCount)
The tabular results can also be visualized in a plot, that helps
comparing the contribution of each covariate to the significance of a
module. This is done with plotModuleReport
. The user must
provide the MultiOmicsModules
object for the pathway of
interest; for this tutorial, we choose to focus on the pathway “Syndecan
interactions”.
plotModuleReport(twoClassM[["Syndecan interactions"]],
MOcolors = c(exp = "red", met = "green", cnv = "yellow"),
priority_to = c("exp", "met"))
The most significant module for this pathway is the third one. The omics that are mainly involved are expression and methylation, as we can see from the p-values of the model covariates (expPC1, expPC2, met3k2).
We can have a look at the structure of the pathway graph and the module position in the pathway. The genes of module 3 are colored in red and and the contribution of each omic inside the module is highlighted with different colors.
plotModuleInGraph(twoClassM[["Syndecan interactions"]], reactSmall, 3,
paletteNames = c(exp="red", met="green", cnv="yellow"))
Using a heatmap, we can visualize the profiles of predictive genes, the top 3 genes for each omic. Above the heatmap, we can also show patient additional annotations, in this case their class annotation, and the summarized value of each covariate for each patient.
plotModuleHeat(twoClassM[["Syndecan interactions"]], 3,
additionalAnnotations = classAnnotation,
additionalPaletteNames = list(classes="violet"))
In second instance, we can ask if two or more omics are significant
in the same module simultaneously and if this omic interaction is more
frequent than those expected by chance. To perform this test, we use the
runSupertest
function. We plot only modules that are
significant and have a success score greater than 80. A circle plot is
returned with the frequency of all significant omic combinations and
their significance levels, represented by the height and the color of
the outer layer.
runSupertest(moduleSummary, pvalueThr = 0.05, zscoreThr = 0.05, resampligThr = 80,
excludeColumns = c("pathway", "module", "resamplingCount"))
As you can see, the combination of cnv and expression is significant, as well as the combination of expression and methylation. These combinations of omics co-regulate the same pathway modules more often than what expected by chance.
Finally, with plotFrequencies
it is possible to show the
frequency distribution of pathways aggregated into macro-categories,
generated using Reactome hierarchical structure, separately for each
omic combination. This plot suggests biological processes that may be
impacted by the omics and their cross-talk.
pathHierarchyGraph <- igraph::graph_from_data_frame(d = downloadPathwayRelationFromReactome(), directed = TRUE)
omicsClasses2pathways <- computeOmicsIntersections(moduleSummary,
pvalueThr = 0.05, zscoreThr = 0.05, resampligThr = 80,
excludeColumns = c("pathway", "module", "resamplingCount"))
omicsClasses2pathways <- lapply(omicsClasses2pathways, stripModulesFromPathways)
omicsClasses2fathers <- lapply(omicsClasses2pathways, annotePathwayToFather,
graphiteDB = reactome, hierarchy = pathHierarchyGraph)
MOMfreqDataframe <- computeFreqs(omicsClasses2fathers)
combiClass <- grep(";", MOMfreqDataframe$class)
MOMfreqDataframe.multi <- MOMfreqDataframe[combiClass, , drop = FALSE]
plotFrequencies(MOMfreqDataframe.multi, minSize = 6, maxSize = 9, width = 10, lineSize = 1)
The same analysis can be run on pathways rather than modules. In this
case, since we are using pathway graph structures, we suggest to adopt
the topological method for PCA, as implemented in MOSClip
,
as well as a shrinkage approach. To do this, we change parameters for
PCA in the secificArgs
slot of our Omics
object.
multiOmics@specificArgs$pcaArgs$shrink = TRUE
multiOmics@specificArgs$pcaArgs$method = "topological"
We are now ready to run the two-class analysis on pathways with the
function multiOmicsTwoClassPathwayTest
. The required input
are the same shown for modules. In this case, we choose to test a
greater amount of pathways (reactHuge
list). Again, since
this may take some minutes, we save the results
for future usage.
if (file.exists(paste0(dirname, "twoClassP.rds"))){
twoClassP <- readRDS(paste0(dirname, "twoClassP.rds"))
} else {
twoClassP <- lapply(reactHuge, function(g) {
res <- multiOmicsTwoClassPathwayTest(multiOmics, g, classAnnotation,
useTheseGenes = row.names(pseudoExpNorm))
res
})
saveRDS(twoClassP, file = paste0(dirname, "twoClassP.rds"))
}
The summary of the results is obtained with
multiPathwayReport
.
pathwaySummary <- multiPathwayReport(twoClassP)
pvalue | cnvNEG | cnvPOS | expPC1 | expPC2 | expPC3 | met2k2 | met3k2 | met3k3 | mut | |
---|---|---|---|---|---|---|---|---|---|---|
Extracellular matrix organization | 1.0e-07 | 0.0240316 | 0.5647068 | 0.0002849 | 0.0188057 | 0.0205115 | 0.0406744 | NA | NA | 0.5778667 |
Collagen formation | 2.0e-07 | 0.7902942 | 0.7921634 | 0.0000008 | 0.0178320 | 0.8616522 | 0.1180423 | NA | NA | 0.3335339 |
GPCR downstream signalling | 3.0e-07 | 0.0374317 | 0.3842342 | 0.0002943 | 0.0263907 | 0.0017945 | 0.9299399 | NA | NA | 0.8891513 |
Signaling by GPCR | 3.0e-07 | 0.0419602 | 0.3749801 | 0.0004191 | 0.0428657 | 0.0012833 | 0.9416765 | NA | NA | 0.9452117 |
Post-translational protein phosphorylation | 3.0e-07 | 0.1997894 | 0.5651028 | 0.0001413 | 0.0880231 | 0.2291858 | 0.4748054 | NA | NA | 0.0370344 |
Collagen biosynthesis and modifying enzymes | 4.0e-07 | 0.8417231 | 0.6308409 | 0.0000012 | 0.1424838 | 0.7534697 | 0.7997129 | NA | NA | 0.1864303 |
Defective B4GALT7 causes EDS, progeroid type | 5.0e-07 | 0.3727263 | 0.9881543 | 0.0000402 | 0.2699674 | 0.1647628 | 0.0462787 | NA | NA | 0.9291883 |
Defective B3GALT6 causes EDSP2 and SEMDJL1 | 5.0e-07 | 0.3383496 | 0.7703166 | 0.0000366 | 0.2803397 | 0.1423496 | 0.0430582 | NA | NA | 0.9092040 |
Defective B3GAT3 causes JDSSDHD | 5.0e-07 | 0.3638384 | 0.8173551 | 0.0000343 | 0.2582386 | 0.1577693 | 0.0435254 | NA | NA | 0.9043942 |
RHO GTPase cycle | 5.0e-07 | 0.0972397 | 0.6429130 | 0.0006720 | 0.0006680 | 0.0005676 | NA | 0.4587927 | 0.1976975 | 0.5002610 |
Elastic fibre formation | 7.0e-07 | 0.1596668 | 0.0761210 | 0.0001490 | 0.0104862 | 0.5504124 | NA | 0.8148059 | 0.7130069 | 0.2814824 |
Transcriptional regulation by RUNX2 | 7.0e-07 | 0.0565744 | 0.5169548 | 0.0000111 | 0.0113064 | 0.0013627 | 0.0300799 | NA | NA | 0.2717516 |
Vesicle-mediated transport | 7.0e-07 | 0.0143504 | 0.6007670 | 0.0000049 | 0.0000001 | 0.1329811 | 0.3858350 | NA | NA | 0.8800403 |
Integrin cell surface interactions | 8.0e-07 | 0.1817341 | 0.6212187 | 0.0000026 | 0.0175951 | 0.0074610 | NA | 0.0662815 | 0.8565121 | 0.4474758 |
Regulation of IGF Activity by IGFBP | 8.0e-07 | 0.2302804 | 0.5468850 | 0.0137079 | 0.2038044 | 0.4112719 | NA | 0.3086629 | 0.5264721 | 0.0769803 |
Chondroitin sulfate/dermatan sulfate metabolism | 9.0e-07 | 0.1695981 | 0.8043619 | 0.0000262 | 0.3117320 | 0.0471199 | 0.0186205 | NA | NA | 0.5189792 |
A tetrasaccharide linker sequence is required for GAG synthesis | 9.0e-07 | 0.4078676 | 0.9006614 | 0.0000254 | 0.0127004 | 0.6253993 | 0.0093022 | NA | NA | 0.4875072 |
G alpha (q) signalling events | 1.1e-06 | 0.1554920 | 0.5929441 | 0.0000023 | 0.4096661 | 0.8192870 | 0.3156041 | NA | NA | 0.5945333 |
Crosslinking of collagen fibrils | 1.1e-06 | 0.2215087 | 0.6214850 | 0.0000256 | 0.0214078 | 0.1356813 | NA | 0.0215082 | 0.5161131 | 0.3401271 |
Specification of primordial germ cells | 1.3e-06 | NA | 0.9096986 | 0.5943235 | 0.0065019 | 0.0001941 | NA | 0.2214385 | 0.0335146 | NA |
Axon guidance | 1.3e-06 | 0.0292805 | 0.4144901 | 0.0012508 | 0.0009121 | 0.1040659 | NA | 0.0771731 | 0.1453722 | 0.7220393 |
ERK1/ERK2 pathway | 1.4e-06 | 0.0434861 | 0.4085110 | 0.0000024 | 0.1184959 | 0.0188482 | 0.8953570 | NA | NA | 0.3097556 |
Chondroitin sulfate biosynthesis | 1.4e-06 | 0.2028251 | 0.7094322 | 0.0000039 | 0.0033934 | 0.5919170 | 0.0438022 | NA | NA | 0.2142623 |
Defective EXT2 causes exostoses 2 | 1.5e-06 | 0.4151082 | 0.8819938 | 0.0000001 | 0.4510110 | 0.0004920 | NA | NA | NA | 0.9024605 |
Defective EXT1 causes exostoses 1, TRPS2 and CHDS | 1.5e-06 | 0.4151082 | 0.8819938 | 0.0000001 | 0.4510110 | 0.0004920 | NA | NA | NA | 0.9024605 |
PTEN Regulation | 1.5e-06 | 0.0817627 | 0.0829696 | 0.0000149 | 0.0000083 | 0.0002185 | 0.8295500 | NA | NA | 0.3700005 |
Nervous system development | 1.5e-06 | 0.0314403 | 0.4149122 | 0.0012943 | 0.0010803 | 0.0773709 | NA | 0.0808487 | 0.1760271 | 0.7283126 |
Diseases associated with glycosaminoglycan metabolism | 1.6e-06 | 0.2914071 | 0.7717944 | 0.8139382 | 0.0001257 | 0.3898244 | NA | 0.7609713 | 0.3264045 | 0.9912342 |
RAF/MAP kinase cascade | 1.7e-06 | 0.0338995 | 0.3580731 | 0.0000033 | 0.1162157 | 0.0142060 | 0.6604954 | NA | NA | 0.3503695 |
Carbohydrate metabolism | 1.8e-06 | 0.9130043 | 0.7679701 | 0.0000148 | 0.5077062 | 0.0261936 | 0.7523325 | NA | NA | 0.2194696 |
RHOB GTPase cycle | 1.9e-06 | 0.0468311 | 0.5121054 | 0.0000001 | 0.0000096 | 0.7671123 | 0.8077366 | NA | NA | 0.8167321 |
Diseases of glycosylation | 2.0e-06 | 0.2383750 | 0.5069620 | 0.0001157 | 0.0793444 | 0.2916914 | 0.1534655 | NA | NA | 0.6423183 |
Defective B3GALTL causes PpS | 2.1e-06 | 0.2593327 | 0.8951813 | 0.0000001 | 0.1905794 | 0.1812853 | 0.4831839 | NA | NA | 0.3251967 |
FGFR2 ligand binding and activation | 2.2e-06 | 0.4558820 | 0.5733507 | 0.0000002 | 0.0000013 | 0.7373796 | 0.0269213 | NA | NA | NA |
Formation of the ureteric bud | 2.2e-06 | 0.1681610 | 0.1361971 | 0.0000081 | 0.0000116 | 0.0001122 | 0.6744045 | NA | NA | 0.3130224 |
Signaling by ROBO receptors | 2.2e-06 | 0.1517295 | 0.8897721 | 0.0008698 | 0.0000002 | 0.6001271 | 0.1489777 | NA | NA | 0.2662645 |
RHOC GTPase cycle | 2.4e-06 | 0.6613835 | 0.6087212 | 0.0000138 | 0.0000130 | 0.0000005 | NA | 0.6975819 | 0.3603495 | 0.0998455 |
RHOQ GTPase cycle | 2.5e-06 | 0.0962134 | 0.9000713 | 0.0440529 | 0.0000082 | 0.0000003 | 0.7646989 | NA | NA | 0.4050893 |
Degradation of DVL | 2.5e-06 | 0.1501040 | 0.9367314 | 0.0000039 | 0.0000002 | 0.5887583 | 0.6675524 | NA | NA | 0.8054636 |
MAPK family signaling cascades | 2.6e-06 | 0.0264043 | 0.5517982 | 0.0000044 | 0.0939433 | 0.0186880 | 0.3147759 | NA | NA | 0.1095113 |
Diseases associated with O-glycosylation of proteins | 2.6e-06 | 0.4844854 | 0.4118794 | 0.0000001 | 0.7626720 | 0.9848898 | 0.5830885 | NA | NA | 0.4395531 |
Signaling by FGFR2 in disease | 2.7e-06 | 0.6846049 | 0.7008530 | 0.0000004 | 0.0000007 | 0.5825747 | 0.0208173 | NA | NA | 0.1173073 |
Phospholipase C-mediated cascade; FGFR2 | 2.7e-06 | 0.7688547 | 0.1901028 | 0.0000004 | 0.0000042 | 0.7770023 | 0.3852312 | NA | NA | NA |
SHC-mediated cascade:FGFR2 | 2.8e-06 | 0.6254439 | 0.6463056 | 0.0000006 | 0.0000006 | 0.8483130 | 0.0346019 | NA | NA | NA |
Diseases of signal transduction by growth factor receptors and second messengers | 2.8e-06 | 0.0327482 | 0.3048264 | 0.0000078 | 0.0129148 | 0.0002951 | NA | 0.8535519 | 0.5617953 | 0.5480578 |
Binding and Uptake of Ligands by Scavenger Receptors | 2.9e-06 | 0.5140621 | 0.7285174 | 0.0001240 | 0.0000109 | 0.0038444 | 0.0292073 | NA | NA | 0.0463858 |
Regulation of RUNX2 expression and activity | 3.1e-06 | 0.0647823 | 0.4974112 | 0.0000019 | 0.0000680 | 0.0010635 | NA | 0.6550267 | 0.0847859 | 0.0308011 |
FGFR2c ligand binding and activation | 3.1e-06 | NA | 0.2072975 | 0.0000000 | 0.0012765 | NA | 0.7345982 | NA | NA | NA |
FGFRL1 modulation of FGFR1 signaling | 3.3e-06 | 0.1028123 | 0.8411326 | 0.0000574 | 0.0000000 | 0.1129987 | 0.4314953 | NA | NA | NA |
Formation of paraxial mesoderm | 3.6e-06 | 0.0633514 | 0.5488533 | 0.0009316 | 0.0000001 | 0.8402069 | NA | 0.0441185 | 0.9937696 | 0.8711378 |
Scavenging by Class A Receptors | 3.6e-06 | 0.4051526 | 0.8268658 | 0.0000061 | 0.4822013 | 0.0693799 | 0.7863883 | NA | NA | 0.0162194 |
Molecules associated with elastic fibres | 3.7e-06 | 0.6220088 | 0.0552455 | 0.0000171 | 0.1844808 | 0.3359382 | NA | 0.9083798 | 0.9522363 | 0.6443183 |
FGFR2 mutant receptor activation | 3.7e-06 | 0.6594481 | 0.5442979 | 0.0000001 | 0.0000011 | 0.6924788 | 0.1157368 | NA | NA | NA |
Signaling by PDGF | 3.7e-06 | 0.3355684 | 0.6935265 | 0.0000002 | 0.0354588 | 0.0162812 | 0.2345838 | NA | NA | 0.2626418 |
Activated point mutants of FGFR2 | 3.9e-06 | 0.7531018 | 0.3331602 | 0.0000004 | 0.0000049 | 0.7920151 | 0.3723379 | NA | NA | NA |
Dermatan sulfate biosynthesis | 4.1e-06 | 0.2975360 | 0.6157101 | 0.0000068 | 0.1866780 | 0.7230372 | 0.1467363 | NA | NA | 0.5966076 |
Negative regulation of the PI3K/AKT network | 4.1e-06 | 0.4258167 | 0.0403261 | 0.0000012 | 0.0000498 | 0.7633871 | 0.6932552 | NA | NA | 0.3108647 |
Trafficking and processing of endosomal TLR | 4.2e-06 | 0.2640648 | 0.8714498 | 0.0000030 | 0.0000134 | 0.0001465 | NA | 0.0014300 | 0.2877000 | 0.5197535 |
Insulin receptor signalling cascade | 4.2e-06 | 0.2579374 | 0.5197450 | 0.0000013 | 0.0000001 | 0.6301606 | 0.7184197 | NA | NA | 0.6850994 |
Metabolism of fat-soluble vitamins | 4.3e-06 | 0.0948774 | 0.1201581 | 0.0010215 | 0.0000033 | 0.7597878 | NA | 0.4470069 | 0.0781666 | 0.1167091 |
Retinoid metabolism and transport | 4.3e-06 | 0.0948774 | 0.1201581 | 0.0010215 | 0.0000033 | 0.7597878 | NA | 0.4470069 | 0.0781666 | 0.1167091 |
Signaling by FGFR | 4.3e-06 | 0.1288624 | 0.7941618 | 0.0000013 | 0.0000005 | 0.3588054 | NA | 0.6409580 | 0.6371550 | 0.2793489 |
FRS-mediated FGFR2 signaling | 4.3e-06 | 0.6239538 | 0.5797569 | 0.0000004 | 0.0000004 | 0.9727561 | 0.0660525 | NA | NA | NA |
Signaling by BMP | 4.4e-06 | 0.4740980 | 0.1027097 | 0.0000012 | 0.0819171 | 0.4171192 | 0.4312054 | NA | NA | 0.5480481 |
G alpha (i) signalling events | 4.5e-06 | 0.0371717 | 0.4248922 | 0.0000001 | 0.2293518 | 0.0149683 | 0.7057816 | NA | NA | 0.2622769 |
RHOA GTPase cycle | 4.5e-06 | 0.1024748 | 0.7408258 | 0.0000028 | 0.0000038 | 0.8234946 | NA | 0.6065774 | 0.6057045 | 0.7455201 |
PI5P, PP2A and IER3 Regulate PI3K/AKT Signaling | 4.5e-06 | 0.3882359 | 0.0312559 | 0.0000020 | 0.0000315 | 0.3688101 | 0.9931072 | NA | NA | 0.2296315 |
Signaling by Insulin receptor | 4.5e-06 | 0.1166131 | 0.5369190 | 0.0000010 | 0.0000002 | 0.6203859 | 0.8348461 | NA | NA | 0.9889899 |
Signaling by FGFR2 | 4.6e-06 | 0.2017734 | 0.9313176 | 0.0000007 | 0.0000009 | 0.5888631 | 0.3775783 | NA | NA | 0.2674864 |
Diseases of metabolism | 4.7e-06 | 0.0839464 | 0.1909300 | 0.0002900 | 0.0676150 | 0.2902320 | NA | 0.0552276 | 0.1123250 | 0.1485337 |
Negative regulation of FGFR2 signaling | 4.8e-06 | 0.1552913 | 0.7096466 | 0.0000003 | 0.0000007 | 0.9034642 | 0.3689606 | NA | NA | NA |
Glycosaminoglycan metabolism | 4.8e-06 | 0.7939859 | 0.6189523 | 0.0000023 | 0.1512943 | 0.4365784 | NA | 0.1545235 | 0.3594032 | 0.2511728 |
PI3K/AKT Signaling in Cancer | 5.0e-06 | 0.2583795 | 0.0314752 | 0.0000021 | 0.0000395 | 0.3146645 | 0.7016615 | NA | NA | 0.5626591 |
O-glycosylation of TSR domain-containing proteins | 5.0e-06 | 0.3406162 | 0.8548108 | 0.0000004 | 0.1818224 | 0.3893177 | 0.6655576 | NA | NA | 0.3315786 |
NCAM1 interactions | 5.0e-06 | 0.8840654 | 0.5568446 | 0.0000002 | 0.5580963 | 0.3819905 | 0.4849704 | NA | NA | 0.3705564 |
The activation of arylsulfatases | 5.1e-06 | 0.5695370 | 0.2168604 | 0.1535106 | 0.0015707 | 0.0000130 | NA | 0.0414831 | 0.4770026 | NA |
Downstream signal transduction | 5.6e-06 | 0.0177311 | 0.1178433 | 0.0000011 | 0.4141545 | 0.0757079 | 0.6324902 | NA | NA | 0.5933000 |
PI3K/AKT Signaling | 5.6e-06 | 0.0217046 | 0.1839324 | 0.0000025 | 0.0003501 | 0.5299693 | NA | 0.9728739 | 0.4837122 | 0.3928189 |
Constitutive Signaling by Aberrant PI3K in Cancer | 5.7e-06 | 0.1164392 | 0.0528870 | 0.0000021 | 0.0000327 | 0.3732082 | 0.9936136 | NA | NA | 0.2683390 |
Membrane Trafficking | 5.7e-06 | 0.0153472 | 0.3398972 | 0.0000039 | 0.0000001 | 0.0196522 | 0.9839172 | NA | NA | 0.9822560 |
NCAM signaling for neurite out-growth | 5.8e-06 | 0.4656544 | 0.3854618 | 0.0000004 | 0.1505118 | 0.8890633 | 0.0947353 | NA | NA | 0.8202409 |
Intra-Golgi and retrograde Golgi-to-ER traffic | 6.0e-06 | 0.1623534 | 0.8255326 | 0.0002684 | 0.1105520 | 0.0000001 | 0.1264353 | NA | NA | 0.4503369 |
Signaling by TGFB family members | 6.0e-06 | 0.1067845 | 0.9043556 | 0.0000041 | 0.9116613 | 0.9070655 | NA | 0.8827745 | 0.0473354 | 0.1125412 |
Phospholipase C-mediated cascade; FGFR3 | 6.4e-06 | NA | 0.9029737 | 0.0000000 | 0.0000206 | 0.5856938 | 0.2500034 | NA | NA | NA |
Synthesis, secretion, and inactivation of Glucagon-like Peptide-1 (GLP-1) | 6.5e-06 | 0.3199954 | 0.9447202 | 0.0001082 | 0.0000143 | 0.0000002 | 0.3221776 | NA | NA | 0.9733780 |
Regulation of CDH11 Expression and Function | 6.6e-06 | 0.0147802 | 0.4265107 | 0.0000003 | 0.2319873 | 0.7594590 | NA | 0.1508183 | 0.1809470 | 0.4937914 |
PI-3K cascade:FGFR2 | 6.7e-06 | 0.4293015 | 0.6454894 | 0.0000004 | 0.0000007 | 0.9867624 | 0.4245599 | NA | NA | NA |
Kidney development | 6.7e-06 | 0.1142341 | 0.8976835 | 0.0000342 | 0.0000038 | 0.0000428 | 0.0600717 | NA | NA | 0.4535685 |
Integrin signaling | 7.0e-06 | 0.4770905 | 0.6001199 | 0.0000181 | 0.0002779 | 0.0000002 | 0.5051000 | NA | NA | 0.0743639 |
Visual phototransduction | 7.2e-06 | 0.0656432 | 0.1455824 | 0.1349745 | 0.0000004 | 0.7496279 | 0.0445484 | NA | NA | 0.2201235 |
RUNX2 regulates osteoblast differentiation | 7.2e-06 | 0.5134977 | 0.8174828 | 0.0000151 | 0.0959687 | 0.0081984 | NA | 0.0357893 | 0.7365349 | 0.5343903 |
Metabolism of vitamins and cofactors | 7.4e-06 | 0.4158396 | 0.4196352 | 0.0636036 | 0.0000001 | 0.7882397 | 0.7812248 | NA | NA | 0.1809245 |
FGFR3 ligand binding and activation | 7.5e-06 | NA | 0.8093056 | 0.0000000 | 0.0000381 | 0.7198278 | 0.8511350 | NA | NA | NA |
FGFR3c ligand binding and activation | 7.5e-06 | NA | 0.8093056 | 0.0000000 | 0.0000381 | 0.7198278 | 0.8511350 | NA | NA | NA |
Heparan sulfate/heparin (HS-GAG) metabolism | 7.5e-06 | 0.2962408 | 0.8376337 | 0.0000007 | 0.6038985 | 0.0081590 | 0.0164833 | NA | NA | 0.0287701 |
TCF dependent signaling in response to WNT | 7.5e-06 | 0.0662995 | 0.4546861 | 0.4548073 | 0.0000000 | 0.0440621 | 0.9567258 | NA | NA | 0.5836250 |
Signaling by Receptor Tyrosine Kinases | 7.8e-06 | 0.0297181 | 0.3435254 | 0.0000003 | 0.0392486 | 0.4509775 | NA | 0.3453504 | 0.9727211 | 0.7256416 |
Regulation of Homotypic Cell-Cell Adhesion | 7.8e-06 | 0.0359923 | 0.4072230 | 0.2973926 | 0.0000004 | 0.1190335 | NA | 0.2201077 | 0.2469475 | 0.6290851 |
Regulation of Expression and Function of Type II Classical Cadherins | 7.8e-06 | 0.0359923 | 0.4072230 | 0.2973926 | 0.0000004 | 0.1190335 | NA | 0.2201077 | 0.2469475 | 0.6290851 |
Syndecan interactions | 7.8e-06 | 0.7773001 | 0.9600632 | 0.0000000 | 0.0011420 | 0.0028720 | NA | 0.7759527 | 0.0289862 | 0.7676395 |
Again, we can repeat the test on significant pathways applying a resampling strategy for a total of 100 iteration and we save the results. We can see from the dotplot that the vast majority of pathways is found significant in more than 80 iterations. We can extract stable pathways and add a column with the resampling success count on the pathway summary.
useThisPathways <- unique(row.names(pathwaySummary)[pathwaySummary$pvalue <= 0.05])
sPathway <- pathwaySummary[row.names(pathwaySummary) %in% useThisPathways, , drop = TRUE]
if (file.exists(paste0(dirname, "permsP.RData"))){
load(paste0(dirname, "permsP.RData"))
} else{
perms <- resamplingPathwayTwoClass(fullMultiOmics = multiOmics,
classAnnotation, reactHuge,
nperm = 100, nPatients = 3,
pathwaySubset = useThisPathways,
genesToConsider = row.names(pseudoExpNorm))
save(perms, file = paste0(dirname, "permsP.RData"))
}
stablePathwaysSummary <- selectStablePathwaysModules(perms = perms, moduleSummary = sPathway, success = 80)
resamplingSuccessCount <- getPathwaysModulesSuccess(perms = perms, moduleSummary = sPathway)
pathwaySummary <- addResamplingCounts(pathwaySummary, resamplingSuccessCount)
With the function plotMultiPathwayReport
we can plot the
test summary for a subset of pathways.
plotMultiPathwayReport(twoClassP[1:20],
MOcolors = c(exp = "red", mut = "blue",
cnv = "yellow", met = "green"),
priority_to = c("exp", "met"),
fontsize = 5)
We select “Adherens junctions interactions” pathway and plot the heatmap of prioritized genes for each omic.
plotPathwayHeat(twoClassP[["Adherens junctions interactions"]],
additionalAnnotations = classAnnotation,
additionalPaletteNames = list(classes = "violet"))
As found for module tests, the combination of expression and CNV, as well the combination of expression and methylation, co-regulate pathways more often than what expected by chance.
runSupertest(pathwaySummary,
pvalueThr = 0.05, zscoreThr = 0.05, resampligThr = 80,
excludeColumns = "resamplingCount")
The frequency plot shows that the combination of CNV and expression mainly affects signaling pathways, disease and immune system; the same happens with the combination of expression and methylation, with the addition in this case of gene expression macrocategory.
omicsClasses2pathways <- computeOmicsIntersections(pathwaySummary,
pvalueThr = 0.05, zscoreThr = 0.05, resampligThr = 80,
excludeColumns = "resamplingCount")
omicsClasses2fathers <- lapply(omicsClasses2pathways, annotePathwayToFather,
graphiteDB = reactome, hierarchy = pathHierarchyGraph)
freqDataframe <- computeFreqs(omicsClasses2fathers)
combiClass <- grep(";", freqDataframe$class)
freqDataframe.multi <- freqDataframe[combiClass, , drop = FALSE]
plotFrequencies(freqDataframe.multi, minSize = 6, maxSize = 9, width = 10, lineSize = 1)
The list of Reactome pathways and the pre-processed datasets used in this tutorial are available here.
Results obtained by running this tutorial are available here.
sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Europe/Rome
## tzcode source: system (glibc)
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils
## [7] datasets methods base
##
## other attached packages:
## [1] MethylMix_2.36.0 doParallel_1.0.17
## [3] iterators_1.0.14 foreach_1.5.2
## [5] impute_1.80.0 maftools_2.22.0
## [7] TCGAbiolinks_2.34.0 devtools_2.4.5
## [9] usethis_3.0.0 kableExtra_1.4.0
## [11] MOSClip_0.99.5 graphite_1.52.0
## [13] EDASeq_2.40.0 ShortRead_1.64.0
## [15] GenomicAlignments_1.42.0 SummarizedExperiment_1.36.0
## [17] MatrixGenerics_1.18.0 matrixStats_1.4.1
## [19] Rsamtools_2.22.0 GenomicRanges_1.58.0
## [21] Biostrings_2.74.0 GenomeInfoDb_1.42.0
## [23] XVector_0.46.0 BiocParallel_1.40.0
## [25] org.Hs.eg.db_3.20.0 AnnotationDbi_1.68.0
## [27] IRanges_2.40.0 S4Vectors_0.44.0
## [29] Biobase_2.66.0 BiocGenerics_0.52.0
##
## loaded via a namespace (and not attached):
## [1] fs_1.6.5 bitops_1.0-9
## [3] httr_1.4.7 RColorBrewer_1.1-3
## [5] SuperExactTest_1.1.0 Rgraphviz_2.50.0
## [7] profvis_0.4.0 tools_4.4.2
## [9] backports_1.5.0 utf8_1.2.4
## [11] R6_2.5.1 DT_0.33
## [13] GetoptLong_1.0.5 urlchecker_1.0.1
## [15] withr_3.0.2 prettyunits_1.2.0
## [17] gridExtra_2.3 cli_3.6.3
## [19] gRbase_2.0.3 Cairo_1.6-2
## [21] flashClust_1.01-2 sandwich_3.1-1
## [23] labeling_0.4.3 sass_0.4.9
## [25] mvtnorm_1.3-2 survMisc_0.5.6
## [27] readr_2.1.5 qpgraph_2.40.0
## [29] systemfonts_1.1.0 yulab.utils_0.1.7
## [31] svglite_2.1.3 R.utils_2.12.3
## [33] sessioninfo_1.2.2 rstudioapi_0.17.1
## [35] RSQLite_2.3.7 generics_0.1.3
## [37] gridGraphics_0.5-1 shape_1.4.6.1
## [39] BiocIO_1.16.0 hwriter_1.3.2.1
## [41] car_3.1-3 dplyr_1.1.4
## [43] qtl_1.70 lars_1.3
## [45] leaps_3.2 Matrix_1.7-1
## [47] interp_1.1-6 fansi_1.0.6
## [49] abind_1.4-8 R.methodsS3_1.8.2
## [51] lifecycle_1.0.4 scatterplot3d_0.3-44
## [53] multcomp_1.4-26 yaml_2.3.10
## [55] carData_3.0-5 SparseArray_1.6.0
## [57] BiocFileCache_2.14.0 grid_4.4.2
## [59] blob_1.2.4 promises_1.3.0
## [61] crayon_1.5.3 pwalign_1.2.0
## [63] miniUI_0.1.1.1 lattice_0.22-6
## [65] GenomicFeatures_1.58.0 annotate_1.84.0
## [67] KEGGREST_1.46.0 magick_2.8.5
## [69] pillar_1.9.0 knitr_1.48
## [71] ComplexHeatmap_2.22.0 rjson_0.2.23
## [73] TCGAbiolinksGUI.data_1.26.0 estimability_1.5.1
## [75] corpcor_1.6.10 codetools_0.2-20
## [77] glue_1.8.0 downloader_0.4
## [79] remotes_2.5.0 data.table_1.16.2
## [81] MultiAssayExperiment_1.32.0 vctrs_0.6.5
## [83] png_0.1-8 coxrobust_1.0.1
## [85] gtable_0.3.6 cachem_1.1.0
## [87] aroma.light_3.36.0 xfun_0.49
## [89] mime_0.12 S4Arrays_1.6.0
## [91] coda_0.19-4.1 survival_3.7-0
## [93] pheatmap_1.0.12 KMsurv_0.1-5
## [95] ellipsis_0.3.2 TH.data_1.1-2
## [97] bit64_4.5.2 progress_1.2.3
## [99] filelock_1.0.3 bslib_0.8.0
## [101] elasticnet_1.3 colorspace_2.1-1
## [103] DBI_1.2.3 DNAcopy_1.80.0
## [105] tidyselect_1.2.1 emmeans_1.10.5
## [107] bit_4.5.0 compiler_4.4.2
## [109] curl_5.2.3 rvest_1.0.4
## [111] httr2_1.0.6 graph_1.84.0
## [113] xml2_1.3.6 DelayedArray_0.32.0
## [115] rtracklayer_1.66.0 checkmate_2.3.2
## [117] scales_1.3.0 multcompView_0.1-10
## [119] rappdirs_0.3.3 stringr_1.5.1
## [121] digest_0.6.37 rmarkdown_2.29
## [123] htmltools_0.5.8.1 pkgconfig_2.0.3
## [125] jpeg_0.1-10 highr_0.11
## [127] FactoMineR_2.11 dbplyr_2.5.0
## [129] fastmap_1.2.0 rlang_1.1.4
## [131] GlobalOptions_0.1.2 htmlwidgets_1.6.4
## [133] UCSC.utils_1.2.0 shiny_1.9.1
## [135] farver_2.1.2 jquerylib_0.1.4
## [137] zoo_1.8-12 jsonlite_1.8.9
## [139] R.oo_1.27.0 RCurl_1.98-1.16
## [141] magrittr_2.0.3 Formula_1.2-5
## [143] GenomeInfoDbData_1.2.13 ggplotify_0.1.2
## [145] NbClust_3.0.1 munsell_0.5.1
## [147] Rcpp_1.0.13-1 stringi_1.8.4
## [149] zlibbioc_1.52.0 MASS_7.3-61
## [151] pkgbuild_1.4.5 plyr_1.8.9
## [153] ggrepel_0.9.6 deldir_2.0-4
## [155] survminer_0.5.0 splines_4.4.2
## [157] hms_1.1.3 circlize_0.4.16
## [159] igraph_2.1.1 ggpubr_0.6.0
## [161] ggsignif_0.6.4 pkgload_1.4.0
## [163] biomaRt_2.62.0 XML_3.99-0.17
## [165] evaluate_1.0.1 latticeExtra_0.6-30
## [167] tzdb_0.4.0 httpuv_1.6.15
## [169] tidyr_1.3.1 purrr_1.0.2
## [171] reshape_0.8.9 km.ci_0.5-6
## [173] clue_0.3-65 ggplot2_3.5.1
## [175] BiocBaseUtils_1.8.0 broom_1.0.7
## [177] xtable_1.8-4 restfulr_0.0.15
## [179] rstatix_0.7.2 later_1.3.2
## [181] viridisLite_0.4.2 tibble_3.2.1
## [183] memoise_2.0.1 cluster_2.1.6