Preface

This tutorial will walk you to perform a complete analysis of multi-omics data associated with survival using MOSClip R package.

MOSClip is a method to combines survival analysis and graphical model theory to test the survival association of pathways or of their connected components that we called modules in a multi-omic framework. Multi-omic gene measurements are tested as covariates of a Cox proportional hazard model after dimensionality reduction of data. The final goal is to find the biological processes impacting the patient’s survival.

MOSClip has a modular structure, allowing the use of one or multiple different omics, as well as different data reduction strategies and tests.

In this tutorial we will focus on the integration of four omics: methylome, transcriptome, genomic mutations and genomic copy number variations, testing if these omics can be sinergically involved in pathways with survival prognostication power.

Furthermore, in MOSClip multiple efforts have been dedicated to the implementation of specific graphical tools to browse, manage and provide help in the interpretation of results. In this tutorial we will also exploit these tools to represent analysis results.

Preparing the enviornment for the analysis

First we need to load the necessary libraries:

# Loading libraries
library(org.Hs.eg.db)
library(EDASeq)
library(MOSClip)
library(graphite)

Data retrieving and data analysis will depend on your computational resources. Generally, it needs time and disk space. To speed up this tutorial, we pre-processed the TCGA data for you, which can be downloaded through this link (INSERT HeRE). Details on how to pre-process and format the data are available here.

The provided dataset includes matrices, genes per patients, of methylation status, somatic mutations, CNVs, and transcript expression levels of the TCGA ovarian cancer samples.

Move the dataset file inside a directory called “downloadTCGA” in your working directory. Now, we can load it:

# Loading pre-processed data
load("downloadTCGA/TCGA-OV-pre-processed.RData")

Next, it is recommended to create a directory to save the analysis results.

dirname <- "MOSresults/survival/"
if (!file.exists(dirname)) { # Checks whether the directory exists
    dir.create(dirname) # If it doesn't, creates a new directory 
}

The next step is to modifiy all the multi-omics matrices assigning the type of gene identifier. Here, we will work with Entrez Gene ID. It is indicated with the prefix tag “ENTREZID:”, compliant with Bioconductor org.dbi, as used in the latest graphite R package version. This will allow us to be able to match the omics data to graphite pathways.

# Renaming the matrix and then adding the prefix to the gene identifiers
expression <- expAvg
row.names(expression) <- paste0("ENTREZID:", row.names(expression))
mutation <- ov.mutations$data
row.names(mutation) <- paste0("ENTREZID:", row.names(mutation))
names(metClustValues$eMap) <- paste0("ENTREZID:", row.names(metClustValues$eMap))
row.names(ov.cnv) <- paste0("ENTREZID:", row.names(ov.cnv))

The TCGA dataset came along with the survival annotations from the table by Liu et al, Cell, 2018. Among the data loaded we can find the object ‘fup’ (short of ‘followup’) that represents our survival annotations.

We extract the slot for the “progression free survival” (pfs) and we save it in a variable named survAnnotations. Then,we want to select only the patients that have samples for all the four omics of our interest.

# Getting survival data 
survAnnotations <- fup$pfs

# Selecting the patients with samples for all omics
survAnnot <- na.omit(survAnnotations)
patients <- row.names(survAnnot)
patients <- intersect(patients, colnames(expression))
patients <- intersect(patients, colnames(metClustValues$MET_Cancer_Clustered))
patients <- intersect(patients, colnames(mutation))
patients <- intersect(patients, colnames(ov.cnv))

survAnnot <- survAnnot[patients, , drop = F]

# If the survAnnot RData file does not exist, create and save it.
if (!file.exists(paste0(dirname, "survAnnot.RData"))) {
    save(survAnnot, file = paste0(dirname, "survAnnot.RData"))
}

# Selecting the patients for the omics:
methylation <- metClustValues
methylation$MET_Cancer_Clustered <- methylation$MET_Cancer_Clustered[, patients, 
    drop = F]
mutation <- mutation[, patients, drop = F]
cnv <- ov.cnv[, patients, drop = F]

If you was enough attentive, you noticed that we didn’t perform the patient selection for the expression matrix. That is because we are going to perfom some extra steps after dropping the patients of this matrix. We need to filter the genes to avoid data sparsity, keeping only those genes with at least 100 counts in at least one patients.

After the selection of patients and genes of the expression matrix, we perform normalization and log2 of the (counts+1) transformation. This will ensure us to work with expression data that is very similar to a normal distribution, which is the most suitable distribution for the subsequent MOSClip tests. The normalization of the data is performed according to the data provided, which can change when the patients and/or samples change. That is why this step is performed after the patient selection and it is present in this tutorial, and not in the previous tutorial (how to format the dataset for MOSClip)

# Keeping the patients
expression <- expression[, patients, drop = F]

# Filtering the counts 
keep = apply(expression >= 100, 1, any)

# Normalizing the counts
expNorm <- betweenLaneNormalization(expression[keep, , drop = F], which = "upper")
pseudoExpNorm <- log2(expNorm + 1)

At this point, we need the a pathway knowledge-base. We will use the Reactome pathway graphs available at graphite R package. Reactome is natively distributed in Uniprot, our analysis will be in Entrez gene ID, thus we need to convert the pathway identifiers.

Since we are downloading and converting all the Reactom pathways, this step may take a while. To avoid the boredom of waiting for this step again, we will save the object for future analysis.

# If the file doesn't exist, create it
if (file.exists("downloadTCGA/reactome-entrez.RData")) {
    load("downloadTCGA/reactome-entrez.RData")
} else {
    reactome <- pathways("hsapiens", "reactome") # getting the Homo sapiens pathways from Reactome
    reactome <- convertIdentifiers(reactome, "entrez") 
    save(reactome, file = "downloadTCGA/reactome-entrez.RData")
}

The Reactome database has a hierarchical structure, and for this tutorial we are going to analyze only subsets of the pathways. We will cut the pathways according to their sizes.

For the pathway analysis, we are going to use reactHuge, a subset that have all the pathways with more (or equal) than 10 nodes. For module analysis, we are going to use reactSmall, so the pathways that are bigger than 20 nodes but smaller that 100 nodes.

This will ensure a sufficient level of specificity of the pathway/modules and will avoid unusefull reaction redundancies.

nodesLength <- sapply(reactome, function(g) {
    length(intersect(graphite::nodes(g), row.names(pseudoExpNorm)))
})
reactSmall <- reactome[nodesLength >= 20 & nodesLength <= 100]
reactHuge <- reactome[nodesLength >= 10]

We also need the pathway hierarchy for the MOSClip summary plots, so it is convenient to download it now.

pathHierarchy <- downloadPathwayRelationFromReactome()
pathHierarchyGraph <- igraph::graph_from_data_frame(d = pathHierarchy, directed = TRUE)

Creating the Omics object

We are almost ready to run MOSClip! First, we just need to organize the omics matrices in the specific MOSClip object: the Omics object. This object wraps together all the omics matrices, containing also the survival annotation (or any other type of “colData”), and specific arguments for the dimentionality reduction step.

Then, we need to indicate the data reduction strategy we want to apply for each matrix/omics. In this tutorial, we chose to use PCA for expression data, cluster analysis for methylation data, vote counting for mutations and CNVs (for detail see MOSClip paper). This data transformations are easily applied calling MOSClip functions, thus here we need only to provide the name of the needed function.

Specifically for the methylation data, MOSClip provides the possibility to include a dictionary to associate the methylation level of multiple CpG. This is because it is expected to have more than one CpG cluster associated to a gene. Thus, in the methylation specific arguments you need to provide the dictionary to convert cluster names into genes.

multiOmics <- makeOmics(experiments = list(exp = pseudoExpNorm,
                                           met = methylation$MET_Cancer_Clustered,
                                           mut = mutation,
                                           cnv = as.matrix(cnv)),
                        colData = survAnnot,
                        modelInfo = c("summarizeWithPca", "summarizeInCluster",
                                      "summarizeToNumberOfEvents",
                                      "summarizeToNumberOfDirectionalEvents"),
                        specificArgs = list(pcaArgs = list(name = "exp",
                                                           shrink = "FALSE",
                                                           method = "sparse",
                                                           maxPCs = 3),
                                            clusterArgs = list(name = "met",
                                                               dict = methylation$eMap,
                                                               max_cluster_number = 3),
                                            countEvent = list(name = "mut", min_prop = 0.05),
                                            cnvAgv = list(name = "cnv", min_prop = 0.05)))

save(multiOmics, file = paste0(dirname, "multiOmics.RData"))

As you can see, in this object we specified the four omic data, we chose four methods for data reduction (one for each omic), and four lists of additional parameters (as described in the help of each reduction function).

Running MOSClip analysis

MOSClip analysis can be performed at the pathway or at the module level, where modules are sub-parts of pathways. A pathway analysis give a general overview of the involved processes, the module analysis can highlights more precisely the mechanism involved.

For both type of analysis, we perform another a priori filter, using the genes that are present at least in the expression data, but this is not mandatory.

genesToConsider <- row.names(pseudoExpNorm)

Module Analysis

Here we are going to perofm the multi-omics survival analysis on pathway modules.

This step will create a list of MultiOmicsModules objects (aka “MOM” objects), in which each of them corresponds to one pathway.

The analysis is quite long, so we will also save the analysis results for future usage.

if (file.exists(paste0(dirname, "momSurvivalList.RData"))) {
    load(paste0(dirname, "momSurvivalList.RData"))
} else {
    momSurvivalList <- lapply(reactSmall, function(g) {
        #print(g@title)  # uncomment it so you can see which pathway is being analyzed
        # for each pathway contained in reactSmall, create the MOM object
        res <- multiOmicsSurvivalModuleTest(multiOmics, graph = g,
                                            useTheseGenes = genesToConsider)
        res
    })
    save(momSurvivalList, file = paste0(dirname, "momSurvivalList.RData"))
}

Pathway analysis

In pathway test, the topology of the pathways (in and out gene connections) can be exploited to guide the data reduction step. For this analysis, we suggest to use the topological PCA instead of the sparse PCA, which can be performed by changing the settings in the Omics object.

multiOmicsPathway <- multiOmics
multiOmicsPathway@specificArgs$pcaArgs$method = "topological"
multiOmicsPathway@specificArgs$pcaArgs$shrink = TRUE

Then we can run the analysis using the function multiOmicsSurvivalPathwayTest():

if (file.exists(paste0(dirname, "mopSurvivalList"))) {
    load(paste0(dirname, "mopSurvivalList.RData"))
} else {
    mopSurvivalList <- lapply(reactSmall, function(g) {
        #print(g@title)  # uncomment it so you can see which pathway is being analyzed
        #for each pathway contained in reactSmall, create the MOM object
        res <- multiOmicsSurvivalPathwayTest(multiOmicsPathway, graph = g,
                                             useTheseGenes = genesToConsider)
        res
    })
    save(mopSurvivalList, file = paste0(dirname, "mopSurvivalList.RData"))
}

This step created a list of MultiOmicsPathway objects (“MOP” objects for short), in which each of them corresponds to one pathway. This step also takes a while, so it is better to save the list for future analyses.

Now the analyses are done and we are ready to check it! MOSClip has plenty of functions to explore the results. We will show you some examples in the following part.

Exploring the results

Results summary

Using the function multiPathwayModuleReport() or multiPathwayReport() we can plot the tabular summary of the top 10 modules or pathways, selected by p-value of the Cox proportional hazard model.

moduleSummary <- multiPathwayModuleReport(momSurvivalList)
module pvalue cnvNEG cnvPOS expPC1 expPC2 expPC3 met2k2 met3k2 met3k3 mut
Interleukin-12 family signaling.2 2 0.0000641 0.0018125 0.3771929 0.0131640 0.0000084 0.0771730 NA 0.8627496 0.9468694 0.1074710
Interleukin-12 family signaling.3 3 0.0001780 0.0017586 0.4915304 0.0832916 0.0000170 0.9396365 0.8151653 NA NA 0.1093456
Apoptotic cleavage of cellular proteins.22 22 0.0001797 NA 0.1374085 0.2464061 NA NA 0.0000258 NA NA NA
Apoptotic execution phase.25 25 0.0001797 NA 0.1374085 0.2464061 NA NA 0.0000258 NA NA NA
FOXO-mediated transcription.21 21 0.0001839 NA 0.0556559 0.0210130 NA NA NA 0.0635954 0.0054845 NA
Interleukin-12 family signaling.1 1 0.0002170 0.0028624 0.2372187 0.0164954 0.0000244 0.1085322 NA 0.8574025 0.8829020 0.4882825
Interleukin-12 signaling.10 10 0.0003126 0.0293561 0.3075713 0.4085160 0.0002131 0.0070165 0.9237480 NA NA 0.6448038
FOXO-mediated transcription.8 8 0.0005220 0.9270558 0.0088464 0.2767392 0.0618636 0.5131571 NA 0.2452002 0.0285626 0.6465475
FOXO-mediated transcription.14 14 0.0005986 NA 0.2781266 0.0369473 NA NA NA 0.0855733 0.0098327 NA
FOXO-mediated transcription.16 16 0.0007577 NA 0.9678402 0.0521183 NA NA NA 0.0968579 0.0089905 NA
pathwaySummary <- multiPathwayReport(mopSurvivalList)
pvalue cnvNEG cnvPOS expPC1 expPC2 expPC3 met2k2 met3k2 met3k3 mut
Binding and Uptake of Ligands by Scavenger Receptors 0.0027281 0.9687245 0.0724609 0.0186142 0.1348834 0.0003696 0.2631560 NA NA 0.9533780
Activation of ATR in response to replication stress 0.0032654 0.1977267 0.0386159 0.0002652 0.0015228 0.7338686 NA 0.5533990 0.1054757 0.8821458
Cell-cell junction organization 0.0057893 0.8395458 0.0320409 0.6422265 0.0548040 0.0009511 0.7312096 NA NA 0.1251044
Downstream signal transduction 0.0073110 0.0807805 0.1463223 0.1494291 0.0292674 0.4850052 NA 0.0191119 0.9145165 0.3154155
Cell junction organization 0.0073670 0.9843566 0.0400655 0.4076975 0.0921129 0.0015507 0.6126347 NA NA 0.1938999
Adherens junctions interactions 0.0115703 0.6371739 0.0272854 0.4497246 0.0550522 0.0015842 NA 0.8560170 0.2584908 0.3325399
Activation of Matrix Metalloproteinases 0.0146996 0.0722402 0.0619855 0.1647940 0.3742310 0.0075057 0.2012347 NA NA 0.1622310
Assembly of collagen fibrils and other multimeric structures 0.0169185 0.3738750 0.5849960 0.1384777 0.3904844 0.0112785 0.0280128 NA NA 0.9033935
Acyl chain remodelling of PS 0.0171948 0.1453951 0.3153253 0.3554507 0.0009633 0.1588579 0.9022454 NA NA NA
Synthesis of PC 0.0180271 0.0214546 0.6891071 0.0164082 0.2971871 0.9016076 0.6097319 NA NA 0.0234343

Graphical exploration of MOSClip results

MOSClip have a function that can plot a heatmap of the report of the results. The heatmap is sorted by the p-value of the Cox proportional hazard model, using all the omics as covariates. The leftmost column is the p-value of the model, and the other columns are the p-values for each covariate. The color gradient also corresponds to the p-values. In this way, the plot can help to understand the involvment of different omics in the survival.

Report as heatmap

For the module analysis, the function is plotModuleReport() and it will plot all the modules of a chosen pathway. Please note that this function takes as input a list of MOM objects.

plotModuleReport(momSurvivalList[["Activation of Matrix Metalloproteinases"]])

As you can see in this heatmap, the module 5 of the pathway “Activation of Matrix Metalloproteinases” (which in plotted in the previous example) is significant. Furthermore, we can infer the omics that drives this survival behavior from the pvalues of the model covariates: expression (expPC1 and PC2), and methylation (met2k2).

For the pathway analysis, the function plotMultiPathwayReport() plots the first n pathways in a list of MOP objects:

plotMultiPathwayReport(mopSurvivalList, 10)

Here we can see that the same pathway we chose before (the “Activation of Matrix Metalloproteinases”) also remains signficiant when performing the pathway-type analysis.

Kaplan-Meier plots

Now that we could identify significant pathway/modules and the covariates that are implicated with survival, we can also plot Kaplan-Meier curves, dividing patients in groups with different omics patterns:

plotModuleKM(momSurvivalList[["Activation of Matrix Metalloproteinases"]], 5,
             formula = "Surv(days, status) ~ expPC1 + met2k",
             paletteNames = "Paired", inYears = TRUE)

plotPathwayKM(mopSurvivalList[["Activation of ATR in response to replication stress"]], 
             formula = "Surv(days, status) ~ cnvPOS + expPC1",
             paletteNames = "Paired", inYears = TRUE)

Heatmap

Finally, we can look at the predictive genes using heatmap and patient additional annotations. For this step, let’s perform some additional formatting of the survival annotation data so we have a better heatmap:

additionalA <- survAnnot
additionalA$status[additionalA$status == 1] <- "event"
additionalA$status[additionalA$status == 0] <- "no_event"
additionalA$PFS <- as.factor(additionalA$status)
additionalA$status <- NULL
additionalA$years <- round(additionalA$days/365.24, 0)
additionalA$days <- NULL

Then we can finally plot it:

plotModuleHeat(momSurvivalList[["Activation of Matrix Metalloproteinases"]], 5,
               sortBy = c("expPC1", "met2k", "status", "days"),
               additionalAnnotations = survAnnot,
               additionalPaletteNames = list(status = "teal", days = "violet"),
               withSampleNames = F)

plotPathwayHeat(mopSurvivalList[["Activation of ATR in response to replication stress"]],
               sortBy = c("cnvPOS", "expPC1", "status", "days"),
               additionalAnnotations = survAnnot,
               additionalPaletteNames = list(status = "teal", days = "violet"),
               withSampleNames = F)

When looking to predictive genes, it is extremely useful to correlate them to survival behavior (death event or survival time), as well as other types of annotation (e.g. tumor grade). That’s why MOSClip allows to plot heatmaps with additional custom annotations.

Super Exact test

With MOSClip is possible to perform a exact test, which is done by implementing theoretical framework using the SuperExactTest R package. It provides efficient computation of the statistical distributions of multi-omic pathways/module set intersections. Our function runSupertest will perfom this analysis and automatically provide a circular plot with the frequency of all significant omic combinations and their significance levels.

# For module analysis results
moduleST <- runSupertest(moduleSummary, pvalueThr = 0.05, zscoreThr = 0.05,
                         excludeColumns = c("pathway", "module"))

# For pathway-type analysis results
pathwayST <- runSupertest(pathwaySummary, pvalueThr = 0.05, zscoreThr = 0.05)

Here we can see that we have 28 modules with their expression and methylation significantly associated with survival. In the pathways graph, we have instead 3 significant pathways with the combination of expression and methylation.

Frequency distribution with radial plot

This plot shows the distribution of the pathways frequencies aggregated into macro categories. It uses Reactome or KEGG hierarchical structure, separately for each omic combinations. This plot can provide insights into prognostic biological processes that may be impacted by the omics and their cross-talk.

To plot this, we first need to compute the omics intersections and then annotate the pathways to their fathers’ nodes. Once this steps are done, we can finally compute the frequencies.

# For module analysis
modulesIntersections <- computeOmicsIntersections(moduleSummary,
                                                   pvalueThr = 0.05,
                                                   zscoreThr = 0.05,
                                                   excludeColumns = 
                                                     c("pathway", "module"))

#This step is exclusive to module-type results, we are just removing the module number from the end of the row names
modulesIntersections <- lapply(modulesIntersections, stripModulesFromPathways)

modules2fathers <- lapply(modulesIntersections, annotePathwayToFather,
                          graphiteDB = reactome,
                          hierarchy = pathHierarchyGraph)
MOMfreqDf <- computeFreqs(modules2fathers)

# We can also create a dataframe to have an annotation of the pathways fathers 
correspondence <- lapply(names(modulesIntersections), function(omicClass) {
    data.frame(path = modulesIntersections[[omicClass]], father = modules2fathers[[omicClass]], stringsAsFactors = F)
})

plotFrequencies(MOMfreqDf)

Doing this plot for the results of the pathway-type analysis is pretty much the same:

omicsClasses2pathways <- computeOmicsIntersections(pathwaySummary)
omicsClasses2fathers <- lapply(omicsClasses2pathways, annotePathwayToFather, graphiteDB = reactome, hierarchy = pathHierarchyGraph)
pathwayFreqDf <- computeFreqs(omicsClasses2fathers)

plotFrequencies(pathwayFreqDf)

Note that here is not necessary to perfom the step to trip the module number from the pathway names, since we are not dealing with modules.

Module in graph plot

The function plotModuleInGraph() is specific for the module-type analysis, and it allows the visualization of the position of a chosen module inside its pathway. The function takes as input a MOM object, a “PathwayList” object from the graphite R package, which in our case is the variable reactSmall, and the module number we want to visualize:

plotModuleInGraph(momSurvivalList[["Activation of Matrix Metalloproteinases"]],
                  reactSmall, moduleNumber = 5)
## 'select()' returned 1:1 mapping between keys and columns

Resampling strategy

So far, we have identified some modules or pathways that are significant. However, MOSClip gives the possibility to prioritize the most important and stable results with the resampling strategy. In this way, modules or pathways with a p-value <= 0.05 are resampled n times (customized) and later on select those successful above a certain threshold. In practice, this is done as below:

useThesePathwaysM <- unique(moduleSummary$pathway[moduleSummary$pvalue <= 0.05])

if (file.exists(paste0(dirname, "permsModules.RData"))) {
    load(paste0(dirname, "permsModules.RData"))
} else {
    permsModules <- resamplingModulesSurvival(fullMultiOmics = multiOmics,
                                              reactSmall, nperm = 100,
                                              pathwaySubset = useThesePathwaysM,
                                              genesToConsider = genesToConsider)
    save(permsModules, file = paste0(dirname, "permsModules.RData"))
}

In our case we used 100 as number of permutations, and by default this function remove 3 patients per permutation (modifiable). Note that depending on the number of permutation set, this step could take a while to run.

For the pathway-type analysis is pretty much the same, only that tthe resampling function is different:

Exploring resampling results

Once the resampling analyses are completed, we can explore the results. First, we can plot the distribution modules/pathways according to the success count and pvalue:

sModule <- moduleSummary[moduleSummary$pathway %in% useThesePathwaysM, ,
                         drop = T]
stableModulesSummary <- selectStablePathwaysModules(perms = permsModules,
                                                    moduleSummary = sModule,
                                                    success = 80)

sucessCountModules <- getPathwaysModulesSuccess(perms = permsModules,
                                                moduleSummary = sModule)
moduleSummary <- addResamplingCounts(moduleSummary, sucessCountModules)

# For the pathway-type results:
sPathway <- pathwaySummary[row.names(pathwaySummary) %in% useThesePathwaysP, ,
                           drop = T]
stablePathwaysSummary <- selectStablePathwaysModules(perms = permsPathways,
                                                    moduleSummary = sPathway,
                                                    success = 80)

sucessCountPathway <- getPathwaysModulesSuccess(perms = permsPathways,
                                                moduleSummary = sPathway)
pathwaySummary <- addResamplingCounts(pathwaySummary, sucessCountPathway)

Here we can see the distribution of the resampling success counts and the p-value of the modules or pathways. Additionally, we created a new column in the results summary table to include the success counts of the resampling.

Plotting results after resampling

Super exact test plot after resampling

The super exact test and the frequency distribution plot of MOSClip offers an additional filter specific for the resampling strategy. These two plots can plot only the modules or pathways that have a sucess count above a certain threshold:

runSupertest(moduleSummary, pvalueThr = 0.05, zscoreThr = 0.05,
             resampligThr = 80, excludeColumns = c("pathway", "module",
                                                   "resamplingCount"))

Frequency plot after resampling

We can also plot a frequency plot with pathway-type analysis results, after performing the resampling strategy:

omicsClasses2pathways <- computeOmicsIntersections(pathwaySummary,
                                                   resampligThr = 80,
                                                   excludeColumns =
                                                     c("resamplingCount"))
omicsClasses2fathers <- lapply(omicsClasses2pathways, annotePathwayToFather, graphiteDB = reactome, hierarchy = pathHierarchyGraph)
pathwayFreqDf <- computeFreqs(omicsClasses2fathers)
plotFrequencies(pathwayFreqDf)

Other plots for resampling results

Even if the plot do not have a specific parameter for resampling, we can still plot the modules/pathways with above a certain threshold of success count:

resamplingTrue <- names(which(sucessCountPathway >= 80))
plotMultiPathwayReport(mopSurvivalList[names(mopSurvivalList) %in%
                                         resamplingTrue], 10)

Conclusion

In last ten years we have witnessed a dramatic change in the clinical treatment of patients thanks to molecular and personalized medicine. In fact, many medical institutes are starting to adopt routine genome wide screening to complement and help diagnosis and treatment choices. As the number of datasets grows, we need to adapt and improve the methods to cope with the complexity, amount and multi-level structure of available information. That is why we need analytical methods that effectively integrate multi-omic dimensions of this issue.

MOSClip can deal with this complexity, allowing multi-omic data integration through survival pathway analyses. In brief, MOSClip comprises three main components: pathway topology, multi-omic data and survival analysis.

In this tutorial, you learned how to perform a complete analysis of multi-omics integration to identify pathway or modules that are significant correlated with survival. Starting from the data matrices of four omics (expression, cnv, mutation, and methylation) we ended it up with multiple graphs that dissect every single aspect of the results.

If you did not check yet, please check our other tutorials