MOSClip Two-Class Analysis on TCGA ovarian cancer patients

Prepare data for MOSClip analysis

Now we are almost ready to run a two-class analysis with MOSClip! First of all, we can load the necessary libraries and the pre-processed ovarian cancer datasets that we downloaded from TCGA in the previous tutorials. We also set a seed, in order to have reproducible results in case of future repetition of this analysis, and we create a directory where we will save all the results generated by this tutorial.

library(org.Hs.eg.db)
library(EDASeq)
library(graphite)
library(MOSClip)
library(kableExtra)


load("downloadTCGA/TCGA-OV-pre-processed.RData")

set.seed(1234)

dirname <- "MOSresults/twoClass/"
if (!dir.exists(dirname)){
    dir.create(dirname, recursive = TRUE)
}

We need to prepare data in order to run MOSClip. The first step is to modify all the multi-omic matrices assigning the type of gene identifier used. Since we’ll be using the graphite package to download pathways and their graphical structures, gene names across all omics need to be compatible with graphite. For this analysis, we’ll use EntrezIDs. Thus, each gene ID in our data must be prefixed with “ENTREZID:” for compatibility.

expression <- expAvg
row.names(expression) <- paste0("ENTREZID:", row.names(expression))
mutation <- ov.mutations$data
row.names(mutation) <- paste0("ENTREZID:", row.names(mutation))
names(metClustValues$eMap) <- paste0("ENTREZID:", row.names(metClustValues$eMap))
row.names(ov.cnv) <- paste0("ENTREZID:", row.names(ov.cnv))
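As a quick illustration of the prefixing step, the toy matrix below shows how prefixed row names line up with identifiers written in the "ENTREZID:" form. This is only a sketch: the gene IDs, patient names, and object names (toyExpr, toyNodes) are invented and are not part of the tutorial data.

```r
# Toy expression matrix; gene IDs and patient names are invented for illustration
toyExpr <- matrix(1:6, nrow = 3,
                  dimnames = list(c("7157", "672", "675"), c("P1", "P2")))

# Add the identifier type so that row names match "ENTREZID:<id>" node names
row.names(toyExpr) <- paste0("ENTREZID:", row.names(toyExpr))

# Hypothetical pathway nodes written in the same form
toyNodes <- c("ENTREZID:7157", "ENTREZID:675", "ENTREZID:1956")

# Genes shared between the matrix and the pathway
sharedGenes <- intersect(row.names(toyExpr), toyNodes)
print(sharedGenes)
```

Without the prefix, the intersection would be empty, which is why every omic must use the same identifier format.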

Moving to patient selection, we keep only patients who have a class annotation and who are present in all 4 omics. Finally, our class annotation data frame is filtered to keep only the selected patients.

# select common patients
patients <- row.names(classes)
patients <- intersect(patients, colnames(expression))
patients <- intersect(patients, colnames(metClustValues$MET_Cancer_Clustered))
patients <- intersect(patients, colnames(mutation))
patients <- intersect(patients, colnames(ov.cnv))

classAnnot <- classes[patients, , drop=FALSE]

table(classAnnot)

## classes
## Differentiated Immunoreactive    Mesenchymal  Proliferative 
##             74             67             64             63
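The chain of intersect calls above can also be written compactly with Reduce. The sketch below uses invented patient IDs and a hypothetical list patientSets, and additionally reports how many patients each data source loses; it is an illustration of the idea, not a replacement for the tutorial code.

```r
# Toy patient IDs per data source (all names are invented)
patientSets <- list(
  classes = c("P1", "P2", "P3", "P4", "P5"),
  exp     = c("P2", "P3", "P4", "P5", "P6"),
  met     = c("P1", "P2", "P3", "P5"),
  mut     = c("P2", "P3", "P5", "P7"),
  cnv     = c("P5", "P3", "P2")
)

# Successively intersect all sets; the order follows the first set
commonPatients <- Reduce(intersect, patientSets)
print(commonPatients)

# Number of patients in each source that are not in the common set
lost <- sapply(patientSets, function(ids) length(setdiff(ids, commonPatients)))
print(lost)
```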

At this point, we need to extract the selected patients from each multi-omic matrix. After patient selection, we keep only genes with at least 100 counts in at least one patient, then normalize (upper quartile normalization from the EDASeq package) and log-transform the expression data.

# normalize expression data
expression <- expression[, patients, drop = FALSE]
keep = apply(expression >= 100, 1, any)
expNorm <- betweenLaneNormalization(expression[keep, , drop = FALSE], which = "upper")
pseudoExpNorm <- log2(expNorm + 1)

methylation <- metClustValues
methylation$MET_Cancer_Clustered <- methylation$MET_Cancer_Clustered[, patients, drop = FALSE]

mutation <- mutation[, patients, drop = FALSE]
cnv <- ov.cnv[, patients, drop = FALSE]
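To make the filtering and normalization steps more concrete, here is a simplified base-R sketch of the idea behind upper quartile scaling. It is an approximation for illustration only and is not EDASeq's implementation of betweenLaneNormalization; the count values and object names are invented.

```r
# Toy count matrix: 4 genes x 3 patients (values are invented)
counts <- matrix(c(100, 200, 400, 800,
                   50, 100, 200, 400,
                   200, 400, 800, 1600),
                 nrow = 4,
                 dimnames = list(paste0("g", 1:4), paste0("P", 1:3)))

# Keep genes reaching at least 100 counts in at least one patient
keepToy <- apply(counts >= 100, 1, any)
counts <- counts[keepToy, , drop = FALSE]

# Upper quartile (75th percentile) of each patient's counts
uq <- apply(counts, 2, quantile, probs = 0.75)

# Rescale each column so that all patients share the same upper quartile
scaled <- sweep(counts, 2, uq / mean(uq), FUN = "/")

# Log-transform with a pseudocount of 1, as done for pseudoExpNorm
logScaled <- log2(scaled + 1)
print(round(logScaled, 2))
```

In this toy example the three patients differ only by sequencing depth, so after scaling their columns become identical.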

We are now ready to generate an object of class Omics using the function makeOmics. This object is required to run every type of MOSClip analysis. It is based on a MultiAssayExperiment object, containing an ExperimentList with one matrix per omic (we suggest using standard names for each experiment, as shown in this example). Here, colData contains the class annotation for patients. Additionally, there are slots specific to MOSClip analysis: modelInfo, where the user must specify the desired data reduction method for each omic, and specificArgs, which holds the parameters used by the reduction functions. Both slots must have the same length as ExperimentList.

The list of available methods for data dimensionality reduction can be retrieved with the availableOmicsMethods() function.

multiOmics <- makeOmics(experiments = list(exp = pseudoExpNorm,
                                           met = methylation$MET_Cancer_Clustered,
                                           mut = mutation,
                                           cnv = as.matrix(cnv)),
                        colData = classAnnot,
                        modelInfo = c("summarizeWithPca", "summarizeInCluster",
                                      "summarizeToNumberOfEvents", "summarizeToNumberOfDirectionalEvents"),
                        specificArgs = list(pcaArgs = list(name = "exp", shrink = FALSE, method = "sparse", maxPCs = 3),
                                            clusterArgs = list(name = "met", dict = methylation$eMap, max_cluster_number = 3),
                                            countEvent = list(name = "mut", min_prop = 0.05),
                                            cnvAgv = list(name = "cnv", min_prop = 0.05)))
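Because modelInfo and specificArgs are matched to the experiments by position, a small sanity check before calling makeOmics can catch misaligned inputs. The sketch below uses stand-in objects (toyExperiments, toyModelInfo, toySpecificArgs) and assumes the convention used in the example above, where the name in each argument list matches the experiment in the same position.

```r
# Stand-in experiment list; only the names matter for this check
toyExperiments <- list(exp = matrix(0, 2, 2), met = matrix(0, 2, 2),
                       mut = matrix(0, 2, 2), cnv = matrix(0, 2, 2))

toyModelInfo <- c("summarizeWithPca", "summarizeInCluster",
                  "summarizeToNumberOfEvents", "summarizeToNumberOfDirectionalEvents")

toySpecificArgs <- list(pcaArgs = list(name = "exp"),
                        clusterArgs = list(name = "met"),
                        countEvent = list(name = "mut"),
                        cnvAgv = list(name = "cnv"))

# One reduction method and one argument list per experiment
stopifnot(length(toyModelInfo) == length(toyExperiments),
          length(toySpecificArgs) == length(toyExperiments))

# The 'name' in each argument list should match the experiment at that position
argNames <- vapply(toySpecificArgs, function(a) a$name, character(1))
namesAligned <- identical(unname(argNames), names(toyExperiments))
print(namesAligned)
```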

Download Reactome pathways

To run a MOSClip analysis we also need a list of pathways that we want to test, as well as their graphical structures if we want to exploit the topological method implemented in the package.

We use pathways collected in the Reactome database. They can be downloaded with graphite, which can also convert gene identifiers (here we use EntrezID). Since this download and conversion process can take several minutes, we save the resulting PathwayList object for easy access in future analyses.

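# The cached file name carries its download date: update the date below
# so that it matches the file written by a previous run
if (file.exists("downloadTCGA/reactome-entrez-2024-05-27.RData")) {
  load("downloadTCGA/reactome-entrez-2024-05-27.RData")
} else {
  reactome <- pathways("hsapiens", "reactome")
  reactome <- convertIdentifiers(reactome, "entrez")
  # save the converted pathways under today's date
  reactomeFile <- paste0("downloadTCGA", "/reactome-entrez-", as.character(Sys.Date()), ".RData")
  save(reactome, file = reactomeFile)
}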
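Since the saved file name includes the download date, one way to reuse whichever copy is most recent is to look it up by pattern instead of hard-coding the date. The helper below (latestCache) is hypothetical and not part of MOSClip or graphite; it is demonstrated on empty stand-in files in a temporary directory.

```r
# Hypothetical helper: return the most recent dated cache file, or NULL if none
latestCache <- function(dir, prefix = "reactome-entrez-") {
  pattern <- paste0("^", prefix, "[0-9]{4}-[0-9]{2}-[0-9]{2}\\.RData$")
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(files) == 0) return(NULL)
  # ISO dates (YYYY-MM-DD) sort correctly as plain strings
  sort(files, decreasing = TRUE)[1]
}

# Demonstration in a temporary directory with two empty stand-in files
cacheDir <- file.path(tempdir(), "cacheDemo")
dir.create(cacheDir, showWarnings = FALSE)
file.create(file.path(cacheDir, c("reactome-entrez-2024-05-27.RData",
                                  "reactome-entrez-2024-11-02.RData")))

newest <- latestCache(cacheDir)
print(basename(newest))
```

In the tutorial setting, the returned path could be passed to load() when it is not NULL, and the download branch run otherwise.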

To keep analyses efficient and avoid redundancy, we select Reactome pathways based on their number of nodes, counting only nodes that are present in the expression matrix. We prepare two distinct PathwayList objects:

A smaller list with pathways containing between 20 and 100 nodes.
A larger list with pathways containing between 10 and 700 nodes.

nodesLength <- sapply(reactome, function(g) {length(intersect(graphite::nodes(g), 
                                                              row.names(pseudoExpNorm)))})
reactSmall <- reactome[nodesLength >= 20 & nodesLength <= 100]
reactHuge <- reactome[nodesLength >= 10 & nodesLength <= 700]
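The size filter can be illustrated on toy data. In the sketch below, pathways are represented as plain character vectors of invented gene IDs rather than graphite objects, so intersect is applied to the vectors directly; all object names are hypothetical.

```r
# Toy pathways represented as character vectors of gene IDs (invented)
toyPathways <- list(
  pathA = paste0("ENTREZID:", 1:5),
  pathB = paste0("ENTREZID:", 3:30),
  pathC = paste0("ENTREZID:", 1:150)
)

# Genes measured in the toy expression matrix
measuredGenes <- paste0("ENTREZID:", 1:120)

# Count only the nodes that are actually measured
toyLength <- sapply(toyPathways, function(genes) length(intersect(genes, measuredGenes)))
print(toyLength)

# Same thresholds as the smaller list: between 20 and 100 measured nodes
toySmall <- toyPathways[toyLength >= 20 & toyLength <= 100]
print(names(toySmall))
```

Note that pathC has 150 nodes but is counted as 120, because only measured genes contribute to the pathway size.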

Prepare class annotations

We define the patient groups to compare, using the subtypes established for TCGA ovarian cancer patients. For this tutorial, we’ll focus on a multi-omic comparison between immunoreactive and mesenchymal subtypes, which should provide interesting insights. The user can decide which subtypes to compare.

We then filter our class annotation data frame and multiOmics object accordingly.

class1 <- "Immunoreactive"
class2 <- "Mesenchymal"

classAnnotation <- classAnnot[classAnnot$classes %in% c(class1, class2), , drop=FALSE]
multiOmics <- multiOmics[,row.names(classAnnotation)]
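Since the class annotation and the omics data must list patients in the same order, it can be useful to check the alignment explicitly. The sketch below works on an invented annotation data frame and a stand-in vector of patient names (omicsPatients) in place of the real Omics object.

```r
# Toy class annotation for five patients (names and classes are invented)
toyAnnot <- data.frame(classes = c("Immunoreactive", "Mesenchymal", "Differentiated",
                                   "Mesenchymal", "Immunoreactive"),
                       row.names = paste0("P", 1:5))

# Keep only the two classes under comparison
toySel <- toyAnnot[toyAnnot$classes %in% c("Immunoreactive", "Mesenchymal"), , drop = FALSE]

# Stand-in for the patient names of the filtered omics object, in another order
omicsPatients <- c("P5", "P1", "P4", "P2")

# Reorder the annotation to follow the omics data, then verify the alignment
toySel <- toySel[omicsPatients, , drop = FALSE]
stopifnot(identical(row.names(toySel), omicsPatients))
print(table(toySel$classes))
```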

Now we are ready for MOSClip two-class analysis.

Two-class analysis on modules

We start with the analysis on modules, using the function multiOmicsTwoClassModuleTest. Required inputs are an Omics object, a graph, and a class annotation data frame. Patients in the class annotation should be in the same order as in colData. The Reactome database has a hierarchical organization: different pathways can represent the same process in more or less detail, ranging from very small, specific pathways to very large, general ones. To avoid redundant results, we focus on a subset of pathways, specifically those with between 20 and 100 nodes (the reactSmall list); this limits the overlap among the genes tested.

This step will take a few minutes. For this reason, we save the final output in the directory “MOSresults/twoClass/”, and in future analyses we load it from there.

if (file.exists(paste0(dirname, "twoClassM.rds"))){
  twoClassM <- readRDS(paste0(dirname, "twoClassM.rds"))
} else { 
    twoClassM <- lapply(reactSmall, function(g) {
        res <- multiOmicsTwoClassModuleTest(multiOmics, g, classAnnotation, 
                                            useTheseGenes = row.names(pseudoExpNorm))
        res
    })
    saveRDS(twoClassM, file = paste0(dirname, "twoClassM.rds"))
}

The result is a list of MultiOmicsModules objects. This list can be passed to the function multiPathwayModuleReport to obtain a tabular summary of modules, sorted by p-value. Besides the p-value for each module, the p-values of the tested covariates are shown. The user can specify which covariates to show first among the data frame columns by passing the omic names to the priority_to argument.

moduleSummary <- multiPathwayModuleReport(twoClassM, 
                                          priority_to = c("exp", "met", "cnv"))

	module	pvalue	expPC1	expPC2	expPC3	met2k2	met3k2	met3k3	cnvNEG	cnvPOS	mut
Elastic fibre formation.5	5	1.0e-07	0.0000001	0.0430093	NA	0.5044956	NA	NA	NA	0.5472875	NA
SUMOylation of transcription factors.4	4	1.0e-07	0.0000000	NA	NA	NA	NA	NA	NA	0.5666869	NA
Elastic fibre formation.2	2	1.0e-07	0.0000000	NA	NA	0.4479830	NA	NA	NA	NA	NA
O-glycosylation of TSR domain-containing proteins.4	4	2.0e-07	0.0000000	NA	NA	0.9925894	NA	NA	NA	0.2557571	NA
Gamma carboxylation, hypusinylation, hydroxylation, and arylsulfatase activation.4	4	2.0e-07	0.0000000	NA	NA	0.6992146	NA	NA	NA	0.0453129	NA
Glycosphingolipid metabolism.11	11	2.0e-07	0.0000000	NA	NA	0.6992146	NA	NA	NA	0.0453129	NA
Sphingolipid metabolism.15	15	2.0e-07	0.0000000	NA	NA	0.6992146	NA	NA	NA	0.0453129	NA
Glycosphingolipid catabolism.8	8	2.0e-07	0.0000000	NA	NA	0.6992146	NA	NA	NA	0.0453129	NA
ECM proteoglycans.2	2	3.0e-07	0.1654198	0.1413438	0.0000425	0.0845938	NA	NA	NA	0.4293656	NA
RUNX2 regulates osteoblast differentiation.4	4	3.0e-07	0.0000001	NA	NA	NA	NA	NA	0.8990975	0.6534472	NA
RUNX2 regulates bone development.5	5	3.0e-07	0.0000001	NA	NA	NA	NA	NA	0.8990975	0.6534472	NA
Nitric oxide stimulates guanylate cyclase.2	2	3.0e-07	0.2998266	0.0000000	NA	NA	NA	NA	NA	0.0574532	NA
O-glycosylation of TSR domain-containing proteins.23	23	3.0e-07	0.0000000	NA	NA	0.2906849	NA	NA	NA	NA	NA
ECM proteoglycans.1	1	4.0e-07	0.0241493	0.0093753	0.0000246	NA	0.5571007	0.3716193	NA	0.3592535	NA
Assembly of collagen fibrils and other multimeric structures.1	1	5.0e-07	0.0000002	NA	NA	0.0526327	NA	NA	NA	0.3959735	NA
Collagen chain trimerization.1	1	5.0e-07	0.0000002	NA	NA	0.0526327	NA	NA	NA	0.3959735	NA
Chondroitin sulfate biosynthesis.2	2	7.0e-07	0.9921860	0.0000002	0.3845546	0.4312105	NA	NA	0.6242335	0.1703916	NA
Bacterial Infection Pathways.1	1	8.0e-07	0.0000000	NA	NA	NA	0.5747115	0.9897028	NA	NA	NA
Molecules associated with elastic fibres.1	1	8.0e-07	0.0000231	0.0519096	0.0007252	0.3218417	NA	NA	0.7926718	0.1662188	NA
Non-integrin membrane-ECM interactions.4	4	8.0e-07	0.0000004	0.0000123	NA	NA	0.4979362	0.9528540	NA	0.7731634	NA
Signaling by TGF-beta Receptor Complex.12	12	9.0e-07	0.0000000	NA	NA	0.0370336	NA	NA	0.6218599	0.0552526	NA
Transcriptional activity of SMAD2/SMAD3:SMAD4 heterotrimer.5	5	9.0e-07	0.0000000	NA	NA	0.0370336	NA	NA	0.6218599	0.0552526	NA
SMAD2/SMAD3:SMAD4 heterotrimer regulates transcription.4	4	9.0e-07	0.0000000	NA	NA	0.0370336	NA	NA	0.6218599	0.0552526	NA
Assembly of collagen fibrils and other multimeric structures.4	4	9.0e-07	0.0000164	0.7065204	NA	0.0888073	NA	NA	NA	0.1326165	0.7739622
Collagen chain trimerization.4	4	9.0e-07	0.0000164	0.7065204	NA	0.0888073	NA	NA	NA	0.1326165	0.7739622
Unfolded Protein Response (UPR).23	23	1.0e-06	0.0000004	NA	NA	NA	0.5221858	0.8471660	NA	NA	NA
O-glycosylation of TSR domain-containing proteins.34	34	1.1e-06	0.0000000	NA	NA	0.8321512	NA	NA	NA	0.5098868	NA
Signaling by ALK in cancer.10	10	1.2e-06	0.2136126	0.0000001	0.0254404	0.1086051	NA	NA	NA	0.5563069	NA
Signaling by ALK fusions and activated point mutants.10	10	1.2e-06	0.2136126	0.0000001	0.0254404	0.1086051	NA	NA	NA	0.5563069	NA
Cell junction organization.6	6	1.3e-06	0.0000000	NA	NA	0.6829094	NA	NA	NA	0.8181169	NA
O-glycosylation of TSR domain-containing proteins.28	28	1.4e-06	0.0000000	NA	NA	0.7965990	NA	NA	NA	0.7398864	NA
Chondroitin sulfate biosynthesis.3	3	1.4e-06	0.5245592	0.0000006	0.1625020	NA	0.5076500	0.5248538	0.3964371	0.3221811	NA
Collagen formation.2	2	1.4e-06	0.0000010	0.0050478	0.7565458	0.1563999	NA	NA	0.3859780	0.3054711	0.8282875
Immunoregulatory interactions between a Lymphoid and a non-Lymphoid cell.14	14	1.5e-06	0.0332499	0.0000000	NA	0.9115779	NA	NA	NA	0.8072930	NA
Signaling by FGFR.10	10	1.6e-06	0.4196746	0.0000000	0.0020458	0.3525830	NA	NA	0.0782642	0.7629884	NA
Signaling by FGFR1.8	8	1.6e-06	0.4196746	0.0000000	0.0020458	0.3525830	NA	NA	0.0782642	0.7629884	NA
Collagen formation.3	3	1.7e-06	0.0000010	0.0043916	0.7007774	0.2173066	NA	NA	0.3931109	0.4241760	0.9316167
Collagen biosynthesis and modifying enzymes.2	2	1.7e-06	0.0000010	0.0043916	0.7007774	0.2173066	NA	NA	0.3931109	0.4241760	0.9316167
Activation of Matrix Metalloproteinases.2	2	1.7e-06	0.0000233	0.5657766	0.0000687	0.0710426	NA	NA	NA	0.6188975	NA
XBP1(S) activates chaperone genes.25	25	1.8e-06	0.0000000	NA	NA	NA	0.1578483	0.4096951	NA	0.0716312	NA
IRE1alpha activates chaperones.27	27	1.8e-06	0.0000000	NA	NA	NA	0.1578483	0.4096951	NA	0.0716312	NA
Unfolded Protein Response (UPR).31	31	1.8e-06	0.0000000	NA	NA	NA	0.1578483	0.4096951	NA	0.0716312	NA
Heparan sulfate/heparin (HS-GAG) metabolism.7	7	1.9e-06	0.0000001	0.0000002	0.5249690	0.2119964	NA	NA	NA	0.6004117	NA
A tetrasaccharide linker sequence is required for GAG synthesis.12	12	1.9e-06	0.0000001	0.0000002	0.5249690	0.2119964	NA	NA	NA	0.6004117	NA
Signaling by FGFR in disease.6	6	2.0e-06	0.2932271	0.0000001	0.0116695	0.0145332	NA	NA	0.8262102	0.7253705	0.1078824
Signaling by FGFR2 in disease.1	1	2.0e-06	0.2932271	0.0000001	0.0116695	0.0145332	NA	NA	0.8262102	0.7253705	0.1078824
MAPK6/MAPK4 signaling.6	6	2.0e-06	0.0000000	NA	NA	NA	0.0008409	0.1546065	NA	0.3455090	NA
Interleukin-10 signaling.35	35	2.1e-06	0.0000000	NA	NA	NA	NA	NA	NA	0.3584628	NA
Signaling by BMP.1	1	2.5e-06	0.0000001	0.9396654	0.3636854	0.3992086	NA	NA	0.8247345	0.0344677	NA
Chondroitin sulfate/dermatan sulfate metabolism.7	7	2.5e-06	0.4790755	0.0000000	0.4828802	0.9397531	NA	NA	0.2800561	0.4183288	0.8148043
Chondroitin sulfate biosynthesis.4	4	2.6e-06	0.3936016	0.7777561	0.0000001	0.0252156	NA	NA	0.0562418	0.6133604	NA
Collagen formation.1	1	2.6e-06	0.0000007	0.0020706	0.6738027	0.1884023	NA	NA	0.0822127	0.1264153	0.8214458
Collagen biosynthesis and modifying enzymes.1	1	2.6e-06	0.0000007	0.0020706	0.6738027	0.1884023	NA	NA	0.0822127	0.1264153	0.8214458
Diseases associated with glycosaminoglycan metabolism.4	4	2.6e-06	0.0887587	0.2683040	0.0000001	NA	NA	NA	0.7528869	0.9907462	0.8913413
Collagen formation.6	6	2.7e-06	0.0000117	0.4137637	0.0598106	0.8867219	NA	NA	0.5096098	0.3257258	NA
Assembly of collagen fibrils and other multimeric structures.13	13	2.7e-06	0.0000117	0.4137637	0.0598106	0.8867219	NA	NA	0.5096098	0.3257258	NA
Platelet homeostasis.2	2	2.7e-06	0.4200899	0.0000002	0.0070693	0.1039861	NA	NA	NA	0.3347172	NA
Syndecan interactions.3	3	2.8e-06	0.0001821	0.0000002	NA	NA	0.0325411	0.3753025	NA	0.3075261	NA
Non-integrin membrane-ECM interactions.5	5	2.8e-06	0.0001821	0.0000002	NA	NA	0.0325411	0.3753025	NA	0.3075261	NA
Signaling by FGFR in disease.5	5	2.9e-06	0.3606012	0.0000004	0.0011555	0.2837853	NA	NA	0.9731297	0.9571113	NA
Assembly of collagen fibrils and other multimeric structures.7	7	2.9e-06	0.0000000	NA	NA	0.9918623	NA	NA	NA	NA	NA
Collagen chain trimerization.7	7	2.9e-06	0.0000000	NA	NA	0.9918623	NA	NA	NA	NA	NA
O-glycosylation of TSR domain-containing proteins.31	31	2.9e-06	0.0000000	NA	NA	0.9906175	NA	NA	NA	0.0072528	NA
Signaling by FGFR in disease.3	3	3.1e-06	0.3575306	0.0000004	0.0011815	0.2877414	NA	NA	0.9248441	0.9168705	NA
Signaling by FGFR in disease.4	4	3.1e-06	0.3225506	0.0000001	0.0010588	0.3536394	NA	NA	0.9670913	0.7881869	NA
Syndecan interactions.7	7	3.2e-06	0.0071035	0.8405068	0.0000001	NA	0.0099464	0.7105744	0.5835903	0.3219886	NA
Non-integrin membrane-ECM interactions.9	9	3.2e-06	0.0071035	0.8405068	0.0000001	NA	0.0099464	0.7105744	0.5835903	0.3219886	NA
Signaling by FGFR2 in disease.2	2	3.6e-06	0.1727524	0.0000001	0.0111432	0.0438023	NA	NA	0.4370921	0.8553761	NA
ECM proteoglycans.8	8	3.6e-06	0.0000000	NA	NA	0.0888882	NA	NA	0.4096324	0.8958364	0.3549886
Activation of Matrix Metalloproteinases.3	3	3.6e-06	0.0000001	0.1796756	0.3827438	0.0586231	NA	NA	NA	0.4440382	NA
FRS-mediated FGFR2 signaling.1	1	3.6e-06	0.1566822	0.0000001	0.0118130	0.0447017	NA	NA	0.4599664	0.3875140	NA
Diseases associated with glycosaminoglycan metabolism.1	1	3.6e-06	0.0000063	0.2435271	0.2465458	0.0203295	NA	NA	0.4735686	0.8517177	0.9448151
Defective B4GALT7 causes EDS, progeroid type.1	1	3.6e-06	0.0000063	0.2435271	0.2465458	0.0203295	NA	NA	0.4735686	0.8517177	0.9448151
FGFR2 mutant receptor activation.1	1	3.7e-06	0.1855737	0.0000001	0.0115202	0.1384827	NA	NA	0.7255324	0.5102733	NA
Diseases associated with glycosaminoglycan metabolism.3	3	3.9e-06	0.0000070	0.2229971	0.2492377	0.0197037	NA	NA	0.4518418	0.9549673	0.9406503
Defective B3GAT3 causes JDSSDHD.1	1	3.9e-06	0.0000070	0.2229971	0.2492377	0.0197037	NA	NA	0.4518418	0.9549673	0.9406503
Diseases associated with glycosaminoglycan metabolism.10	10	3.9e-06	0.0000000	NA	NA	0.9883884	NA	NA	NA	0.2846400	NA
Diseases associated with glycosaminoglycan metabolism.2	2	4.0e-06	0.0000077	0.2174316	0.2502998	0.0198502	NA	NA	0.4427936	0.8482583	0.9372121
Defective B3GALT6 causes EDSP2 and SEMDJL1.1	1	4.0e-06	0.0000077	0.2174316	0.2502998	0.0198502	NA	NA	0.4427936	0.8482583	0.9372121
SHC-mediated cascade:FGFR2.1	1	4.0e-06	0.1582481	0.0000001	0.0114429	0.0805351	NA	NA	0.4707031	0.5287020	NA
Chondroitin sulfate biosynthesis.1	1	4.0e-06	0.4138422	0.0000000	0.9567405	0.1275922	NA	NA	0.1305718	0.2464359	NA
Collagen formation.5	5	4.1e-06	0.0000241	0.1134608	0.7455172	0.5377312	NA	NA	0.2002979	0.3300364	0.1098756
Assembly of collagen fibrils and other multimeric structures.12	12	4.1e-06	0.0000241	0.1134608	0.7455172	0.5377312	NA	NA	0.2002979	0.3300364	0.1098756
O-glycosylation of TSR domain-containing proteins.1	1	4.1e-06	0.0000000	NA	NA	0.8003887	NA	NA	NA	0.9922676	NA
COPI-dependent Golgi-to-ER retrograde traffic.1	1	4.1e-06	0.0734397	0.0000003	0.0172880	NA	0.0065869	0.2067050	0.2089244	0.2319105	0.4636113
Binding and Uptake of Ligands by Scavenger Receptors.6	6	4.1e-06	0.0000098	0.4437830	0.0102223	NA	0.2350263	0.9940394	0.3271237	0.7874115	0.0133496
Heparan sulfate/heparin (HS-GAG) metabolism.3	3	4.2e-06	0.0000000	0.0000704	0.8200092	0.3476560	NA	NA	0.2220251	0.8520836	NA
A tetrasaccharide linker sequence is required for GAG synthesis.7	7	4.2e-06	0.0000000	0.0000704	0.8200092	0.3476560	NA	NA	0.2220251	0.8520836	NA
Downstream signaling of activated FGFR2.2	2	4.2e-06	0.1429566	0.0000001	0.0115034	0.0836176	NA	NA	0.4721888	0.2036482	NA
Signaling by FGFR2.3	3	4.2e-06	0.1429566	0.0000001	0.0115034	0.0836176	NA	NA	0.4721888	0.2036482	NA
Negative regulation of FGFR2 signaling.1	1	4.2e-06	0.1551455	0.0000001	0.0094478	0.3014689	NA	NA	0.3678876	0.4365714	NA
Signaling by FGFR in disease.2	2	4.2e-06	0.3600873	0.0000003	0.0018168	0.3913534	NA	NA	0.8828437	0.8824586	0.1272349
Post-translational protein phosphorylation.1	1	4.3e-06	0.0000069	0.1293828	0.0094200	0.6388067	NA	NA	0.0646589	0.4812814	0.0415555
Integrin cell surface interactions.1	1	4.6e-06	0.0000054	0.7184853	0.0007483	0.3736606	NA	NA	0.2060560	0.3940942	0.2297422
Elastic fibre formation.1	1	4.8e-06	0.0003040	0.0668308	0.0001162	NA	0.8977608	0.3369076	0.1993385	0.2497101	NA
NCAM signaling for neurite out-growth.1	1	4.8e-06	0.0000004	0.4002290	0.0001503	0.5139677	NA	NA	0.4569624	0.0966194	0.6087424
Transcriptional Regulation by NPAS4.6	6	4.8e-06	0.0000534	0.0843128	0.0000060	NA	NA	NA	NA	0.0332949	NA
NPAS4 regulates expression of target genes.5	5	4.8e-06	0.0000534	0.0843128	0.0000060	NA	NA	NA	NA	0.0332949	NA
Diseases associated with glycosaminoglycan metabolism.9	9	4.9e-06	0.0000000	NA	NA	0.4304541	NA	NA	NA	0.8091153	NA
Diseases associated with glycosaminoglycan metabolism.8	8	5.0e-06	0.0000000	NA	NA	0.4318175	NA	NA	NA	0.9967769	NA

Permutations

So far, we have identified some modules that are significant. We can test the robustness of our findings using a resampling procedure. With this strategy, we can repeat the two-class test on a subset of patients, removing 3 random patients at each iteration. Modules are prioritized if they show a resampling success score greater than 80.

We will run the analysis on a subset of pathways, considering only those pathways whose modules were found significant.

Since this step is time-consuming, we save the results.

useThisPathways <- unique(moduleSummary$pathway[moduleSummary$pvalue <= 0.05])
sModule <- moduleSummary[moduleSummary$pathway %in% useThisPathways, , drop = FALSE]

if (file.exists(paste0(dirname, "permsM.RData"))){
    load(paste0(dirname, "permsM.RData"))
}else{
    perms <- resamplingModulesTwoClass(fullMultiOmics = multiOmics, 
                                       classAnnotation, reactSmall, nperm = 100, 
                                       nPatients = 3, pathwaySubset = useThisPathways, 
                                       genesToConsider = row.names(pseudoExpNorm))
    save(perms, file = paste0(dirname, "permsM.RData"))
}
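The logic of the resampling procedure can be sketched in base R. In the example below, a Welch t-test on one simulated covariate stands in for MOSClip's two-class module test, so this only illustrates the idea of removing random patients and counting how often the result stays significant; it is not what resamplingModulesTwoClass does internally, and all object names are hypothetical.

```r
set.seed(1234)

# Toy data: one summarized covariate for 40 patients in two classes (simulated)
toyClass <- rep(c("A", "B"), each = 20)
toyValue <- c(rnorm(20, mean = 0), rnorm(20, mean = 2))
names(toyValue) <- paste0("P", 1:40)

nIter <- 100      # number of resampling iterations
nRemove <- 3      # patients removed at each iteration

# For each iteration, drop random patients and repeat a two-class test
pvals <- vapply(seq_len(nIter), function(i) {
  dropped <- sample(seq_along(toyValue), nRemove)
  t.test(toyValue[-dropped] ~ toyClass[-dropped])$p.value
}, numeric(1))

# Success score: number of iterations in which the test stays significant
successScore <- sum(pvals <= 0.05)
print(successScore)
```

A module whose significance depends on a handful of patients would obtain a low score under this scheme, which is the motivation for the threshold of 80 used in this tutorial.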

The function selectStablePathwaysModules will retrieve the tabular summary of the results for those pathway modules whose success score is greater than 80. It will also print a dotplot showing the resampling success of modules based on their significance level. The resampling success count can be retrieved with getPathwaysModulesSuccess and counts can be appended as a new column to the tabular summary of module results with addResamplingCounts.

stableModulesSummary <- selectStablePathwaysModules(perms = perms, moduleSummary = sModule, success = 80)

resamplingSuccessCount <- getPathwaysModulesSuccess(perms = perms, moduleSummary = sModule)
moduleSummary <- addResamplingCounts(moduleSummary, resamplingSuccessCount)
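Conceptually, appending the counts and selecting stable modules amounts to a name-based lookup followed by a threshold filter. The sketch below reproduces that idea on an invented summary table; it does not show the internals of addResamplingCounts or selectStablePathwaysModules, and all values are made up.

```r
# Toy module summary (pathway names, p-values and counts are invented)
toySummary <- data.frame(
  pathway = c("pathA", "pathA", "pathB"),
  module  = c(1, 2, 1),
  pvalue  = c(0.001, 0.030, 0.004),
  row.names = c("pathA.1", "pathA.2", "pathB.1")
)

# Hypothetical resampling counts (significant iterations out of 100), named by module
toyCounts <- c(pathB.1 = 95, pathA.1 = 88, pathA.2 = 42)

# Match counts to rows by name rather than by position
toySummary$resamplingCount <- toyCounts[row.names(toySummary)]

# Keep modules whose success score exceeds the threshold
stableToy <- toySummary[toySummary$resamplingCount > 80, , drop = FALSE]
print(stableToy)
```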

Plots

The tabular results can also be visualized in a plot that helps compare the contribution of each covariate to the significance of a module. This is done with plotModuleReport. The user must provide the MultiOmicsModules object for the pathway of interest; for this tutorial, we focus on the pathway “Syndecan interactions”.

plotModuleReport(twoClassM[["Syndecan interactions"]],
                 MOcolors = c(exp = "red", met = "green", cnv = "yellow"),
                 priority_to = c("exp", "met"))

The most significant module for this pathway is the third one. The omics that are mainly involved are expression and methylation, as we can see from the p-values of the model covariates (expPC1, expPC2, met3k2).

We can have a look at the structure of the pathway graph and the position of the module in the pathway. The genes of module 3 are colored in red, and the contribution of each omic inside the module is highlighted with different colors.

plotModuleInGraph(twoClassM[["Syndecan interactions"]], reactSmall, 3,
                  paletteNames = c(exp="red", met="green", cnv="yellow"))

Using a heatmap, we can visualize the profiles of the predictive genes, showing the top 3 genes for each omic. Above the heatmap, we can also show additional patient annotations, in this case their class annotation, and the summarized value of each covariate for each patient.

plotModuleHeat(twoClassM[["Syndecan interactions"]], 3, 
               additionalAnnotations = classAnnotation, 
               additionalPaletteNames = list(classes="violet"))

Next, we can ask whether two or more omics are significant in the same module simultaneously, and whether this co-occurrence is more frequent than expected by chance. To perform this test, we use the runSupertest function. We plot only modules that are significant and have a success score greater than 80. A circle plot is returned showing the frequency of all significant omic combinations and their significance levels, represented by the height and the color of the outer layer.

runSupertest(moduleSummary, pvalueThr = 0.05, zscoreThr = 0.05, resampligThr = 80,
             excludeColumns = c("pathway", "module", "resamplingCount"))

As you can see, the combination of cnv and expression is significant, as is the combination of expression and methylation. These combinations of omics co-regulate the same pathway modules more often than expected by chance.
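To give an intuition for what "more often than expected by chance" means, the two-omic case can be illustrated with a hypergeometric test in base R. This is a simplification with invented numbers and is not necessarily the computation performed by runSupertest, which also handles combinations of more than two omics.

```r
# Toy setting: 200 tested modules (all numbers are invented)
nModules <- 200
sigExp <- 1:60          # modules where an expression covariate is significant
sigMet <- 41:80         # modules where a methylation covariate is significant

# Observed overlap and the overlap expected if the two sets were independent
observed <- length(intersect(sigExp, sigMet))
expected <- length(sigExp) * length(sigMet) / nModules

# P(overlap >= observed) from the hypergeometric distribution
pOverlap <- phyper(observed - 1, m = length(sigExp), n = nModules - length(sigExp),
                   k = length(sigMet), lower.tail = FALSE)

print(c(observed = observed, expected = expected))
print(pOverlap)
```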

Finally, with plotFrequencies it is possible to show the frequency distribution of pathways aggregated into macro-categories, generated using the Reactome hierarchical structure, separately for each omic combination. This plot suggests biological processes that may be impacted by the omics and their cross-talk.

pathHierarchyGraph <-  igraph::graph_from_data_frame(d = downloadPathwayRelationFromReactome(), directed = TRUE)

omicsClasses2pathways <- computeOmicsIntersections(moduleSummary, 
                                                   pvalueThr = 0.05, zscoreThr = 0.05, resampligThr = 80, 
                                                   excludeColumns = c("pathway", "module", "resamplingCount"))
omicsClasses2pathways <- lapply(omicsClasses2pathways, stripModulesFromPathways)
omicsClasses2fathers <- lapply(omicsClasses2pathways, annotePathwayToFather, 
                               graphiteDB = reactome, hierarchy = pathHierarchyGraph)

MOMfreqDataframe <- computeFreqs(omicsClasses2fathers)

combiClass <- grep(";", MOMfreqDataframe$class)
MOMfreqDataframe.multi <- MOMfreqDataframe[combiClass, , drop = FALSE]

plotFrequencies(MOMfreqDataframe.multi, minSize = 6, maxSize = 9, width = 10, lineSize = 1)

Two-class analysis on pathways

The same analysis can be run on pathways rather than modules. In this case, since we are using pathway graph structures, we suggest adopting the topological method for PCA, as implemented in MOSClip, together with a shrinkage approach. To do this, we change the PCA parameters in the specificArgs slot of our Omics object.

multiOmics@specificArgs$pcaArgs$shrink = TRUE
multiOmics@specificArgs$pcaArgs$method = "topological"

We are now ready to run the two-class analysis on pathways with the function multiOmicsTwoClassPathwayTest. The required inputs are the same as for modules. In this case, we choose to test a larger number of pathways (the reactHuge list). Again, since this may take a few minutes, we save the results for future use.

if (file.exists(paste0(dirname, "twoClassP.rds"))){
  twoClassP <- readRDS(paste0(dirname, "twoClassP.rds"))
} else {
  twoClassP <- lapply(reactHuge, function(g) {
    res <- multiOmicsTwoClassPathwayTest(multiOmics, g, classAnnotation, 
                                         useTheseGenes = row.names(pseudoExpNorm))
    res
  })
  saveRDS(twoClassP, file = paste0(dirname, "twoClassP.rds"))
}

The summary of the results is obtained with multiPathwayReport.

pathwaySummary <- multiPathwayReport(twoClassP)

	pvalue	cnvNEG	cnvPOS	expPC1	expPC2	expPC3	met2k2	met3k2	met3k3	mut
Extracellular matrix organization	1.0e-07	0.0240316	0.5647068	0.0002849	0.0188057	0.0205115	0.0406744	NA	NA	0.5778667
Collagen formation	2.0e-07	0.7902942	0.7921634	0.0000008	0.0178320	0.8616522	0.1180423	NA	NA	0.3335339
GPCR downstream signalling	3.0e-07	0.0374317	0.3842342	0.0002943	0.0263907	0.0017945	0.9299399	NA	NA	0.8891513
Signaling by GPCR	3.0e-07	0.0419602	0.3749801	0.0004191	0.0428657	0.0012833	0.9416765	NA	NA	0.9452117
Post-translational protein phosphorylation	3.0e-07	0.1997894	0.5651028	0.0001413	0.0880231	0.2291858	0.4748054	NA	NA	0.0370344
Collagen biosynthesis and modifying enzymes	4.0e-07	0.8417231	0.6308409	0.0000012	0.1424838	0.7534697	0.7997129	NA	NA	0.1864303
Defective B4GALT7 causes EDS, progeroid type	5.0e-07	0.3727263	0.9881543	0.0000402	0.2699674	0.1647628	0.0462787	NA	NA	0.9291883
Defective B3GALT6 causes EDSP2 and SEMDJL1	5.0e-07	0.3383496	0.7703166	0.0000366	0.2803397	0.1423496	0.0430582	NA	NA	0.9092040
Defective B3GAT3 causes JDSSDHD	5.0e-07	0.3638384	0.8173551	0.0000343	0.2582386	0.1577693	0.0435254	NA	NA	0.9043942
RHO GTPase cycle	5.0e-07	0.0972397	0.6429130	0.0006720	0.0006680	0.0005676	NA	0.4587927	0.1976975	0.5002610
Elastic fibre formation	7.0e-07	0.1596668	0.0761210	0.0001490	0.0104862	0.5504124	NA	0.8148059	0.7130069	0.2814824
Transcriptional regulation by RUNX2	7.0e-07	0.0565744	0.5169548	0.0000111	0.0113064	0.0013627	0.0300799	NA	NA	0.2717516
Vesicle-mediated transport	7.0e-07	0.0143504	0.6007670	0.0000049	0.0000001	0.1329811	0.3858350	NA	NA	0.8800403
Integrin cell surface interactions	8.0e-07	0.1817341	0.6212187	0.0000026	0.0175951	0.0074610	NA	0.0662815	0.8565121	0.4474758
Regulation of IGF Activity by IGFBP	8.0e-07	0.2302804	0.5468850	0.0137079	0.2038044	0.4112719	NA	0.3086629	0.5264721	0.0769803
Chondroitin sulfate/dermatan sulfate metabolism	9.0e-07	0.1695981	0.8043619	0.0000262	0.3117320	0.0471199	0.0186205	NA	NA	0.5189792
A tetrasaccharide linker sequence is required for GAG synthesis	9.0e-07	0.4078676	0.9006614	0.0000254	0.0127004	0.6253993	0.0093022	NA	NA	0.4875072
G alpha (q) signalling events	1.1e-06	0.1554920	0.5929441	0.0000023	0.4096661	0.8192870	0.3156041	NA	NA	0.5945333
Crosslinking of collagen fibrils	1.1e-06	0.2215087	0.6214850	0.0000256	0.0214078	0.1356813	NA	0.0215082	0.5161131	0.3401271
Specification of primordial germ cells	1.3e-06	NA	0.9096986	0.5943235	0.0065019	0.0001941	NA	0.2214385	0.0335146	NA
Axon guidance	1.3e-06	0.0292805	0.4144901	0.0012508	0.0009121	0.1040659	NA	0.0771731	0.1453722	0.7220393
ERK1/ERK2 pathway	1.4e-06	0.0434861	0.4085110	0.0000024	0.1184959	0.0188482	0.8953570	NA	NA	0.3097556
Chondroitin sulfate biosynthesis	1.4e-06	0.2028251	0.7094322	0.0000039	0.0033934	0.5919170	0.0438022	NA	NA	0.2142623
Defective EXT2 causes exostoses 2	1.5e-06	0.4151082	0.8819938	0.0000001	0.4510110	0.0004920	NA	NA	NA	0.9024605
Defective EXT1 causes exostoses 1, TRPS2 and CHDS	1.5e-06	0.4151082	0.8819938	0.0000001	0.4510110	0.0004920	NA	NA	NA	0.9024605
PTEN Regulation	1.5e-06	0.0817627	0.0829696	0.0000149	0.0000083	0.0002185	0.8295500	NA	NA	0.3700005
Nervous system development	1.5e-06	0.0314403	0.4149122	0.0012943	0.0010803	0.0773709	NA	0.0808487	0.1760271	0.7283126
Diseases associated with glycosaminoglycan metabolism	1.6e-06	0.2914071	0.7717944	0.8139382	0.0001257	0.3898244	NA	0.7609713	0.3264045	0.9912342
RAF/MAP kinase cascade	1.7e-06	0.0338995	0.3580731	0.0000033	0.1162157	0.0142060	0.6604954	NA	NA	0.3503695
Carbohydrate metabolism	1.8e-06	0.9130043	0.7679701	0.0000148	0.5077062	0.0261936	0.7523325	NA	NA	0.2194696
RHOB GTPase cycle	1.9e-06	0.0468311	0.5121054	0.0000001	0.0000096	0.7671123	0.8077366	NA	NA	0.8167321
Diseases of glycosylation	2.0e-06	0.2383750	0.5069620	0.0001157	0.0793444	0.2916914	0.1534655	NA	NA	0.6423183
Defective B3GALTL causes PpS	2.1e-06	0.2593327	0.8951813	0.0000001	0.1905794	0.1812853	0.4831839	NA	NA	0.3251967
FGFR2 ligand binding and activation	2.2e-06	0.4558820	0.5733507	0.0000002	0.0000013	0.7373796	0.0269213	NA	NA	NA
Formation of the ureteric bud	2.2e-06	0.1681610	0.1361971	0.0000081	0.0000116	0.0001122	0.6744045	NA	NA	0.3130224
Signaling by ROBO receptors	2.2e-06	0.1517295	0.8897721	0.0008698	0.0000002	0.6001271	0.1489777	NA	NA	0.2662645
RHOC GTPase cycle	2.4e-06	0.6613835	0.6087212	0.0000138	0.0000130	0.0000005	NA	0.6975819	0.3603495	0.0998455
RHOQ GTPase cycle	2.5e-06	0.0962134	0.9000713	0.0440529	0.0000082	0.0000003	0.7646989	NA	NA	0.4050893
Degradation of DVL	2.5e-06	0.1501040	0.9367314	0.0000039	0.0000002	0.5887583	0.6675524	NA	NA	0.8054636
MAPK family signaling cascades	2.6e-06	0.0264043	0.5517982	0.0000044	0.0939433	0.0186880	0.3147759	NA	NA	0.1095113
Diseases associated with O-glycosylation of proteins	2.6e-06	0.4844854	0.4118794	0.0000001	0.7626720	0.9848898	0.5830885	NA	NA	0.4395531
Signaling by FGFR2 in disease	2.7e-06	0.6846049	0.7008530	0.0000004	0.0000007	0.5825747	0.0208173	NA	NA	0.1173073
Phospholipase C-mediated cascade; FGFR2	2.7e-06	0.7688547	0.1901028	0.0000004	0.0000042	0.7770023	0.3852312	NA	NA	NA
SHC-mediated cascade:FGFR2	2.8e-06	0.6254439	0.6463056	0.0000006	0.0000006	0.8483130	0.0346019	NA	NA	NA
Diseases of signal transduction by growth factor receptors and second messengers	2.8e-06	0.0327482	0.3048264	0.0000078	0.0129148	0.0002951	NA	0.8535519	0.5617953	0.5480578
Binding and Uptake of Ligands by Scavenger Receptors	2.9e-06	0.5140621	0.7285174	0.0001240	0.0000109	0.0038444	0.0292073	NA	NA	0.0463858
Regulation of RUNX2 expression and activity	3.1e-06	0.0647823	0.4974112	0.0000019	0.0000680	0.0010635	NA	0.6550267	0.0847859	0.0308011
FGFR2c ligand binding and activation	3.1e-06	NA	0.2072975	0.0000000	0.0012765	NA	0.7345982	NA	NA	NA
FGFRL1 modulation of FGFR1 signaling	3.3e-06	0.1028123	0.8411326	0.0000574	0.0000000	0.1129987	0.4314953	NA	NA	NA
Formation of paraxial mesoderm	3.6e-06	0.0633514	0.5488533	0.0009316	0.0000001	0.8402069	NA	0.0441185	0.9937696	0.8711378
Scavenging by Class A Receptors	3.6e-06	0.4051526	0.8268658	0.0000061	0.4822013	0.0693799	0.7863883	NA	NA	0.0162194
Molecules associated with elastic fibres	3.7e-06	0.6220088	0.0552455	0.0000171	0.1844808	0.3359382	NA	0.9083798	0.9522363	0.6443183
FGFR2 mutant receptor activation	3.7e-06	0.6594481	0.5442979	0.0000001	0.0000011	0.6924788	0.1157368	NA	NA	NA
Signaling by PDGF	3.7e-06	0.3355684	0.6935265	0.0000002	0.0354588	0.0162812	0.2345838	NA	NA	0.2626418
Activated point mutants of FGFR2	3.9e-06	0.7531018	0.3331602	0.0000004	0.0000049	0.7920151	0.3723379	NA	NA	NA
Dermatan sulfate biosynthesis	4.1e-06	0.2975360	0.6157101	0.0000068	0.1866780	0.7230372	0.1467363	NA	NA	0.5966076
Negative regulation of the PI3K/AKT network	4.1e-06	0.4258167	0.0403261	0.0000012	0.0000498	0.7633871	0.6932552	NA	NA	0.3108647
Trafficking and processing of endosomal TLR	4.2e-06	0.2640648	0.8714498	0.0000030	0.0000134	0.0001465	NA	0.0014300	0.2877000	0.5197535
Insulin receptor signalling cascade	4.2e-06	0.2579374	0.5197450	0.0000013	0.0000001	0.6301606	0.7184197	NA	NA	0.6850994
Metabolism of fat-soluble vitamins	4.3e-06	0.0948774	0.1201581	0.0010215	0.0000033	0.7597878	NA	0.4470069	0.0781666	0.1167091
Retinoid metabolism and transport	4.3e-06	0.0948774	0.1201581	0.0010215	0.0000033	0.7597878	NA	0.4470069	0.0781666	0.1167091
Signaling by FGFR	4.3e-06	0.1288624	0.7941618	0.0000013	0.0000005	0.3588054	NA	0.6409580	0.6371550	0.2793489
FRS-mediated FGFR2 signaling	4.3e-06	0.6239538	0.5797569	0.0000004	0.0000004	0.9727561	0.0660525	NA	NA	NA
Signaling by BMP	4.4e-06	0.4740980	0.1027097	0.0000012	0.0819171	0.4171192	0.4312054	NA	NA	0.5480481
G alpha (i) signalling events	4.5e-06	0.0371717	0.4248922	0.0000001	0.2293518	0.0149683	0.7057816	NA	NA	0.2622769
RHOA GTPase cycle	4.5e-06	0.1024748	0.7408258	0.0000028	0.0000038	0.8234946	NA	0.6065774	0.6057045	0.7455201
PI5P, PP2A and IER3 Regulate PI3K/AKT Signaling	4.5e-06	0.3882359	0.0312559	0.0000020	0.0000315	0.3688101	0.9931072	NA	NA	0.2296315
Signaling by Insulin receptor	4.5e-06	0.1166131	0.5369190	0.0000010	0.0000002	0.6203859	0.8348461	NA	NA	0.9889899
Signaling by FGFR2	4.6e-06	0.2017734	0.9313176	0.0000007	0.0000009	0.5888631	0.3775783	NA	NA	0.2674864
Diseases of metabolism	4.7e-06	0.0839464	0.1909300	0.0002900	0.0676150	0.2902320	NA	0.0552276	0.1123250	0.1485337
Negative regulation of FGFR2 signaling	4.8e-06	0.1552913	0.7096466	0.0000003	0.0000007	0.9034642	0.3689606	NA	NA	NA
Glycosaminoglycan metabolism	4.8e-06	0.7939859	0.6189523	0.0000023	0.1512943	0.4365784	NA	0.1545235	0.3594032	0.2511728
PI3K/AKT Signaling in Cancer	5.0e-06	0.2583795	0.0314752	0.0000021	0.0000395	0.3146645	0.7016615	NA	NA	0.5626591
O-glycosylation of TSR domain-containing proteins	5.0e-06	0.3406162	0.8548108	0.0000004	0.1818224	0.3893177	0.6655576	NA	NA	0.3315786
NCAM1 interactions	5.0e-06	0.8840654	0.5568446	0.0000002	0.5580963	0.3819905	0.4849704	NA	NA	0.3705564
The activation of arylsulfatases	5.1e-06	0.5695370	0.2168604	0.1535106	0.0015707	0.0000130	NA	0.0414831	0.4770026	NA
Downstream signal transduction	5.6e-06	0.0177311	0.1178433	0.0000011	0.4141545	0.0757079	0.6324902	NA	NA	0.5933000
PI3K/AKT Signaling	5.6e-06	0.0217046	0.1839324	0.0000025	0.0003501	0.5299693	NA	0.9728739	0.4837122	0.3928189
Constitutive Signaling by Aberrant PI3K in Cancer	5.7e-06	0.1164392	0.0528870	0.0000021	0.0000327	0.3732082	0.9936136	NA	NA	0.2683390
Membrane Trafficking	5.7e-06	0.0153472	0.3398972	0.0000039	0.0000001	0.0196522	0.9839172	NA	NA	0.9822560
NCAM signaling for neurite out-growth	5.8e-06	0.4656544	0.3854618	0.0000004	0.1505118	0.8890633	0.0947353	NA	NA	0.8202409
Intra-Golgi and retrograde Golgi-to-ER traffic	6.0e-06	0.1623534	0.8255326	0.0002684	0.1105520	0.0000001	0.1264353	NA	NA	0.4503369
Signaling by TGFB family members	6.0e-06	0.1067845	0.9043556	0.0000041	0.9116613	0.9070655	NA	0.8827745	0.0473354	0.1125412
Phospholipase C-mediated cascade; FGFR3	6.4e-06	NA	0.9029737	0.0000000	0.0000206	0.5856938	0.2500034	NA	NA	NA
Synthesis, secretion, and inactivation of Glucagon-like Peptide-1 (GLP-1)	6.5e-06	0.3199954	0.9447202	0.0001082	0.0000143	0.0000002	0.3221776	NA	NA	0.9733780
Regulation of CDH11 Expression and Function	6.6e-06	0.0147802	0.4265107	0.0000003	0.2319873	0.7594590	NA	0.1508183	0.1809470	0.4937914
PI-3K cascade:FGFR2	6.7e-06	0.4293015	0.6454894	0.0000004	0.0000007	0.9867624	0.4245599	NA	NA	NA
Kidney development	6.7e-06	0.1142341	0.8976835	0.0000342	0.0000038	0.0000428	0.0600717	NA	NA	0.4535685
Integrin signaling	7.0e-06	0.4770905	0.6001199	0.0000181	0.0002779	0.0000002	0.5051000	NA	NA	0.0743639
Visual phototransduction	7.2e-06	0.0656432	0.1455824	0.1349745	0.0000004	0.7496279	0.0445484	NA	NA	0.2201235
RUNX2 regulates osteoblast differentiation	7.2e-06	0.5134977	0.8174828	0.0000151	0.0959687	0.0081984	NA	0.0357893	0.7365349	0.5343903
Metabolism of vitamins and cofactors	7.4e-06	0.4158396	0.4196352	0.0636036	0.0000001	0.7882397	0.7812248	NA	NA	0.1809245
FGFR3 ligand binding and activation	7.5e-06	NA	0.8093056	0.0000000	0.0000381	0.7198278	0.8511350	NA	NA	NA
FGFR3c ligand binding and activation	7.5e-06	NA	0.8093056	0.0000000	0.0000381	0.7198278	0.8511350	NA	NA	NA
Heparan sulfate/heparin (HS-GAG) metabolism	7.5e-06	0.2962408	0.8376337	0.0000007	0.6038985	0.0081590	0.0164833	NA	NA	0.0287701
TCF dependent signaling in response to WNT	7.5e-06	0.0662995	0.4546861	0.4548073	0.0000000	0.0440621	0.9567258	NA	NA	0.5836250
Signaling by Receptor Tyrosine Kinases	7.8e-06	0.0297181	0.3435254	0.0000003	0.0392486	0.4509775	NA	0.3453504	0.9727211	0.7256416
Regulation of Homotypic Cell-Cell Adhesion	7.8e-06	0.0359923	0.4072230	0.2973926	0.0000004	0.1190335	NA	0.2201077	0.2469475	0.6290851
Regulation of Expression and Function of Type II Classical Cadherins	7.8e-06	0.0359923	0.4072230	0.2973926	0.0000004	0.1190335	NA	0.2201077	0.2469475	0.6290851
Syndecan interactions	7.8e-06	0.7773001	0.9600632	0.0000000	0.0011420	0.0028720	NA	0.7759527	0.0289862	0.7676395

Permutations

Again, we can repeat the test on the significant pathways, applying a resampling strategy for a total of 100 iterations, and save the results. The dot plot shows that the vast majority of pathways are found significant in more than 80 iterations. We can then extract the stable pathways and add a column with the resampling success count to the pathway summary.

useThisPathways <- unique(row.names(pathwaySummary)[pathwaySummary$pvalue <= 0.05])
# drop = FALSE keeps a data frame even if only one pathway is selected
sPathway <- pathwaySummary[row.names(pathwaySummary) %in% useThisPathways, , drop = FALSE]

if (file.exists(paste0(dirname, "permsP.RData"))){
    load(paste0(dirname, "permsP.RData"))
} else{
    perms <- resamplingPathwayTwoClass(fullMultiOmics = multiOmics, 
                                       classAnnotation, reactHuge, 
                                       nperm = 100, nPatients = 3,
                                       pathwaySubset = useThisPathways, 
                                       genesToConsider = row.names(pseudoExpNorm))
    save(perms, file = paste0(dirname, "permsP.RData"))
}


stablePathwaysSummary <- selectStablePathwaysModules(perms = perms, moduleSummary = sPathway, success = 80)

resamplingSuccessCount <- getPathwaysModulesSuccess(perms = perms, moduleSummary = sPathway)
pathwaySummary <- addResamplingCounts(pathwaySummary, resamplingSuccessCount)
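
To make the idea of a resampling success count concrete, here is a minimal base R sketch on simulated p-values. It is not MOSClip's internal implementation: the matrix pvals, the pathway names and the rule "p-value <= 0.05 in at least 80 iterations" are assumptions made only for illustration.

```r
# Toy sketch (hypothetical data, not MOSClip internals): count in how many
# resampling iterations each pathway stays significant.
set.seed(1)
nIter <- 100

# One row per pathway, one column per resampling iteration (simulated p-values).
pvals <- rbind(
    pathA = runif(nIter, 0, 0.04),  # always below 0.05
    pathB = runif(nIter, 0, 0.10),  # below 0.05 in roughly half of the iterations
    pathC = runif(nIter, 0.2, 1)    # never below 0.05
)

# Success count: number of iterations in which the p-value is <= 0.05.
successCount <- rowSums(pvals <= 0.05)

# Stable pathways: significant in at least 80 iterations (assumed rule).
stable <- names(successCount)[successCount >= 80]

print(successCount)
print(stable)
```

In this toy setting only pathA passes the stability filter, which mirrors the role of the success = 80 argument used above.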

Plots

With the function plotMultiPathwayReport we can plot the test summary for a subset of pathways.

plotMultiPathwayReport(twoClassP[1:20],
                       MOcolors = c(exp = "red", mut = "blue", 
                                    cnv = "yellow", met = "green"),
                       priority_to = c("exp", "met"),
                       fontsize = 5)
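
Positional indexing such as [1:20] takes the first 20 elements in list order, which may or may not be the most significant pathways. If a ranking by p-value is wanted, one possible approach is sketched below on toy objects. The names results and summaryTable are hypothetical stand-ins, and the sketch assumes that the names of the result list match the row names of the summary table.

```r
# Toy stand-ins (hypothetical): a named list of results and a summary table
# whose row names match the names of the list.
results <- list(p1 = "res1", p2 = "res2", p3 = "res3", p4 = "res4")
summaryTable <- data.frame(pvalue = c(0.03, 0.001, 0.2, 0.01),
                           row.names = names(results))

# Rank pathway names by increasing p-value and keep the top N.
topN <- 2
ranked <- row.names(summaryTable)[order(summaryTable$pvalue)]
topResults <- results[head(ranked, topN)]

print(names(topResults))
```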

We select the “Adherens junctions interactions” pathway and plot a heatmap of the prioritized genes for each omic.

plotPathwayHeat(twoClassP[["Adherens junctions interactions"]], 
                additionalAnnotations = classAnnotation,
                additionalPaletteNames = list(classes = "violet"))

As found for the module tests, the combination of expression and CNV, as well as the combination of expression and methylation, co-regulate pathways more often than expected by chance.

runSupertest(pathwaySummary, 
             pvalueThr = 0.05, zscoreThr = 0.05, resampligThr = 80,
             excludeColumns = "resamplingCount")
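
To give an intuition of what "more often than expected by chance" means, the following base R sketch tests the overlap of two sets with a hypergeometric tail probability. All the numbers are invented for illustration, and this is not a reproduction of what runSupertest computes; the SuperExactTest package, listed in the session info, generalizes this kind of test to intersections of more than two sets.

```r
# Hypothetical numbers: a universe of N pathways, sizeA of them significant
# for one omic, sizeB for another, and observedOverlap shared by both.
N <- 1000
sizeA <- 100
sizeB <- 80
observedOverlap <- 20

# Overlap expected if the two sets were drawn independently at random.
expectedOverlap <- sizeA * sizeB / N

# P(overlap >= observedOverlap) under the hypergeometric null model.
pOverlap <- phyper(observedOverlap - 1, sizeA, N - sizeA, sizeB,
                   lower.tail = FALSE)

print(expectedOverlap)
print(pOverlap)
```

An observed overlap well above the expected one gives a small tail probability, which is the sense in which two omics are said to co-regulate pathways.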

The frequency plot shows that the combination of CNV and expression mainly affects the signaling pathway, disease and immune system macrocategories; the same holds for the combination of expression and methylation, with the addition, in this case, of the gene expression macrocategory.

omicsClasses2pathways <- computeOmicsIntersections(pathwaySummary, 
                                                   pvalueThr = 0.05, zscoreThr = 0.05, resampligThr = 80,
                                                   excludeColumns = "resamplingCount")
omicsClasses2fathers <- lapply(omicsClasses2pathways, annotePathwayToFather, 
                               graphiteDB = reactome, hierarchy = pathHierarchyGraph)
freqDataframe <- computeFreqs(omicsClasses2fathers)

combiClass <- grep(";", freqDataframe$class)
freqDataframe.multi <- freqDataframe[combiClass, , drop = FALSE]

plotFrequencies(freqDataframe.multi, minSize = 6, maxSize = 9, width = 10, lineSize = 1)
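
The steps above can be illustrated on toy data. The sketch below computes, for each omics class, the relative frequency of each macrocategory and then keeps only the classes that combine several omics, identified by the ";" separator. The input list and the column names category and frequency are assumptions; the actual output format of computeFreqs is not shown here and may differ.

```r
# Hypothetical mapping from omics classes to pathway macrocategories.
classes2fathers <- list(
    "exp"     = c("Signal Transduction", "Disease"),
    "cnv;exp" = c("Signal Transduction", "Signal Transduction", "Immune System"),
    "exp;met" = c("Gene expression", "Disease")
)

# Relative frequency of each macrocategory within each omics class.
freqList <- lapply(names(classes2fathers), function(cls) {
    counts <- table(classes2fathers[[cls]])
    data.frame(class = cls,
               category = names(counts),
               frequency = as.vector(counts) / sum(counts),
               stringsAsFactors = FALSE)
})
freqs <- do.call(rbind, freqList)

# Keep only classes made of more than one omic (names containing ";").
multi <- freqs[grep(";", freqs$class), , drop = FALSE]
print(multi)
```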

Data availability

The list of Reactome pathways and the pre-processed datasets used in this tutorial are available here.
Results obtained by running this tutorial are available here.

sessionInfo()

## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Rome
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils    
## [7] datasets  methods   base     
## 
## other attached packages:
##  [1] MethylMix_2.36.0            doParallel_1.0.17          
##  [3] iterators_1.0.14            foreach_1.5.2              
##  [5] impute_1.80.0               maftools_2.22.0            
##  [7] TCGAbiolinks_2.34.0         devtools_2.4.5             
##  [9] usethis_3.0.0               kableExtra_1.4.0           
## [11] MOSClip_0.99.5              graphite_1.52.0            
## [13] EDASeq_2.40.0               ShortRead_1.64.0           
## [15] GenomicAlignments_1.42.0    SummarizedExperiment_1.36.0
## [17] MatrixGenerics_1.18.0       matrixStats_1.4.1          
## [19] Rsamtools_2.22.0            GenomicRanges_1.58.0       
## [21] Biostrings_2.74.0           GenomeInfoDb_1.42.0        
## [23] XVector_0.46.0              BiocParallel_1.40.0        
## [25] org.Hs.eg.db_3.20.0         AnnotationDbi_1.68.0       
## [27] IRanges_2.40.0              S4Vectors_0.44.0           
## [29] Biobase_2.66.0              BiocGenerics_0.52.0        
## 
## loaded via a namespace (and not attached):
##   [1] fs_1.6.5                    bitops_1.0-9               
##   [3] httr_1.4.7                  RColorBrewer_1.1-3         
##   [5] SuperExactTest_1.1.0        Rgraphviz_2.50.0           
##   [7] profvis_0.4.0               tools_4.4.2                
##   [9] backports_1.5.0             utf8_1.2.4                 
##  [11] R6_2.5.1                    DT_0.33                    
##  [13] GetoptLong_1.0.5            urlchecker_1.0.1           
##  [15] withr_3.0.2                 prettyunits_1.2.0          
##  [17] gridExtra_2.3               cli_3.6.3                  
##  [19] gRbase_2.0.3                Cairo_1.6-2                
##  [21] flashClust_1.01-2           sandwich_3.1-1             
##  [23] labeling_0.4.3              sass_0.4.9                 
##  [25] mvtnorm_1.3-2               survMisc_0.5.6             
##  [27] readr_2.1.5                 qpgraph_2.40.0             
##  [29] systemfonts_1.1.0           yulab.utils_0.1.7          
##  [31] svglite_2.1.3               R.utils_2.12.3             
##  [33] sessioninfo_1.2.2           rstudioapi_0.17.1          
##  [35] RSQLite_2.3.7               generics_0.1.3             
##  [37] gridGraphics_0.5-1          shape_1.4.6.1              
##  [39] BiocIO_1.16.0               hwriter_1.3.2.1            
##  [41] car_3.1-3                   dplyr_1.1.4                
##  [43] qtl_1.70                    lars_1.3                   
##  [45] leaps_3.2                   Matrix_1.7-1               
##  [47] interp_1.1-6                fansi_1.0.6                
##  [49] abind_1.4-8                 R.methodsS3_1.8.2          
##  [51] lifecycle_1.0.4             scatterplot3d_0.3-44       
##  [53] multcomp_1.4-26             yaml_2.3.10                
##  [55] carData_3.0-5               SparseArray_1.6.0          
##  [57] BiocFileCache_2.14.0        grid_4.4.2                 
##  [59] blob_1.2.4                  promises_1.3.0             
##  [61] crayon_1.5.3                pwalign_1.2.0              
##  [63] miniUI_0.1.1.1              lattice_0.22-6             
##  [65] GenomicFeatures_1.58.0      annotate_1.84.0            
##  [67] KEGGREST_1.46.0             magick_2.8.5               
##  [69] pillar_1.9.0                knitr_1.48                 
##  [71] ComplexHeatmap_2.22.0       rjson_0.2.23               
##  [73] TCGAbiolinksGUI.data_1.26.0 estimability_1.5.1         
##  [75] corpcor_1.6.10              codetools_0.2-20           
##  [77] glue_1.8.0                  downloader_0.4             
##  [79] remotes_2.5.0               data.table_1.16.2          
##  [81] MultiAssayExperiment_1.32.0 vctrs_0.6.5                
##  [83] png_0.1-8                   coxrobust_1.0.1            
##  [85] gtable_0.3.6                cachem_1.1.0               
##  [87] aroma.light_3.36.0          xfun_0.49                  
##  [89] mime_0.12                   S4Arrays_1.6.0             
##  [91] coda_0.19-4.1               survival_3.7-0             
##  [93] pheatmap_1.0.12             KMsurv_0.1-5               
##  [95] ellipsis_0.3.2              TH.data_1.1-2              
##  [97] bit64_4.5.2                 progress_1.2.3             
##  [99] filelock_1.0.3              bslib_0.8.0                
## [101] elasticnet_1.3              colorspace_2.1-1           
## [103] DBI_1.2.3                   DNAcopy_1.80.0             
## [105] tidyselect_1.2.1            emmeans_1.10.5             
## [107] bit_4.5.0                   compiler_4.4.2             
## [109] curl_5.2.3                  rvest_1.0.4                
## [111] httr2_1.0.6                 graph_1.84.0               
## [113] xml2_1.3.6                  DelayedArray_0.32.0        
## [115] rtracklayer_1.66.0          checkmate_2.3.2            
## [117] scales_1.3.0                multcompView_0.1-10        
## [119] rappdirs_0.3.3              stringr_1.5.1              
## [121] digest_0.6.37               rmarkdown_2.29             
## [123] htmltools_0.5.8.1           pkgconfig_2.0.3            
## [125] jpeg_0.1-10                 highr_0.11                 
## [127] FactoMineR_2.11             dbplyr_2.5.0               
## [129] fastmap_1.2.0               rlang_1.1.4                
## [131] GlobalOptions_0.1.2         htmlwidgets_1.6.4          
## [133] UCSC.utils_1.2.0            shiny_1.9.1                
## [135] farver_2.1.2                jquerylib_0.1.4            
## [137] zoo_1.8-12                  jsonlite_1.8.9             
## [139] R.oo_1.27.0                 RCurl_1.98-1.16            
## [141] magrittr_2.0.3              Formula_1.2-5              
## [143] GenomeInfoDbData_1.2.13     ggplotify_0.1.2            
## [145] NbClust_3.0.1               munsell_0.5.1              
## [147] Rcpp_1.0.13-1               stringi_1.8.4              
## [149] zlibbioc_1.52.0             MASS_7.3-61                
## [151] pkgbuild_1.4.5              plyr_1.8.9                 
## [153] ggrepel_0.9.6               deldir_2.0-4               
## [155] survminer_0.5.0             splines_4.4.2              
## [157] hms_1.1.3                   circlize_0.4.16            
## [159] igraph_2.1.1                ggpubr_0.6.0               
## [161] ggsignif_0.6.4              pkgload_1.4.0              
## [163] biomaRt_2.62.0              XML_3.99-0.17              
## [165] evaluate_1.0.1              latticeExtra_0.6-30        
## [167] tzdb_0.4.0                  httpuv_1.6.15              
## [169] tidyr_1.3.1                 purrr_1.0.2                
## [171] reshape_0.8.9               km.ci_0.5-6                
## [173] clue_0.3-65                 ggplot2_3.5.1              
## [175] BiocBaseUtils_1.8.0         broom_1.0.7                
## [177] xtable_1.8-4                restfulr_0.0.15            
## [179] rstatix_0.7.2               later_1.3.2                
## [181] viridisLite_0.4.2           tibble_3.2.1               
## [183] memoise_2.0.1               cluster_2.1.6