This tutorial walks you through a complete analysis of multi-omics data associated with survival using the MOSClip R package.
MOSClip is a method that combines survival analysis and graphical model theory to test the survival association of pathways, or of their connected components (which we call modules), in a multi-omic framework. Multi-omic gene measurements are tested as covariates of a Cox proportional hazards model after dimensionality reduction of the data. The final goal is to find the biological processes that impact patient survival.
MOSClip has a modular structure, allowing the use of one or multiple different omics, as well as different data reduction strategies and tests.
In this tutorial we focus on the integration of four omics: methylome, transcriptome, genomic mutations and genomic copy number variations (CNVs). We test whether these omics are synergistically involved in pathways with prognostic power for survival.
MOSClip also provides dedicated graphical tools to browse, manage and interpret the results. In this tutorial we use these tools to represent the analysis results.
First we need to load the necessary libraries:
# Loading libraries
library(org.Hs.eg.db)
library(EDASeq)
library(MOSClip)
library(graphite)
Data retrieval and analysis depend on your computational resources and generally require time and disk space. To speed up this tutorial, we pre-processed the TCGA data for you; it can be downloaded through this link (INSERT HeRE). Details on how to pre-process and format the data are available here.
The provided dataset includes gene-by-patient matrices of methylation status, somatic mutations, CNVs, and transcript expression levels for the TCGA ovarian cancer samples.
Move the dataset file inside a directory called “downloadTCGA” in your working directory. Now, we can load it:
# Loading pre-processed data
load("downloadTCGA/TCGA-OV-pre-processed.RData")
Next, it is recommended to create a directory to save the analysis results.
dirname <- "MOSresults/survival/"
if (!dir.exists(dirname)) { # Checks whether the directory exists
  # If it doesn't, creates it; recursive = TRUE also creates the parent "MOSresults"
  dir.create(dirname, recursive = TRUE)
}
The next step is to modify all the multi-omics matrices by assigning the type of gene identifier. Here, we work with Entrez Gene IDs, indicated by the prefix tag “ENTREZID:”, compliant with Bioconductor org.dbi, as used in the latest graphite R package version. This allows us to match the omics data to graphite pathways.
# Renaming the matrix and then adding the prefix to the gene identifiers
expression <- expAvg
row.names(expression) <- paste0("ENTREZID:", row.names(expression))
mutation <- ov.mutations$data
row.names(mutation) <- paste0("ENTREZID:", row.names(mutation))
# Assumes eMap is a named list keyed by gene identifier; adapt this line if your eMap is a table
names(metClustValues$eMap) <- paste0("ENTREZID:", names(metClustValues$eMap))
row.names(ov.cnv) <- paste0("ENTREZID:", row.names(ov.cnv))
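To see why the prefix matters, here is a small self-contained sketch. The matrix, gene IDs and pathway node names are invented for illustration and are not part of the TCGA dataset; the point is only that identifiers match pathway nodes once they carry the same prefix.

```r
# Toy expression matrix: rows are genes (Entrez-like IDs), columns are patients.
# All IDs and values are invented for illustration.
toy <- matrix(1:6, nrow = 3,
              dimnames = list(c("7157", "672", "675"), c("P1", "P2")))

# Node names as a pathway graph might report them, with the identifier-type prefix
pathwayNodes <- c("ENTREZID:7157", "ENTREZID:672", "ENTREZID:9999")

# Without the prefix, no gene matches a pathway node
matchedBefore <- intersect(row.names(toy), pathwayNodes)

# Add the prefix, as done for the real matrices above, and match again
row.names(toy) <- paste0("ENTREZID:", row.names(toy))
matchedAfter <- intersect(row.names(toy), pathwayNodes)

print(length(matchedBefore))
print(matchedAfter)
```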
The TCGA dataset comes with the survival annotations from the table by Liu et al, Cell, 2018. Among the loaded data we find the object ‘fup’ (short for ‘follow-up’), which holds our survival annotations.
We extract the slot for “progression free survival” (pfs) and save it in a variable named survAnnotations. Then, we select only the patients that have samples for all four omics of interest.
# Getting survival data
survAnnotations <- fup$pfs
# Selecting the patients with samples for all omics
survAnnot <- na.omit(survAnnotations)
patients <- row.names(survAnnot)
patients <- intersect(patients, colnames(expression))
patients <- intersect(patients, colnames(metClustValues$MET_Cancer_Clustered))
patients <- intersect(patients, colnames(mutation))
patients <- intersect(patients, colnames(ov.cnv))
survAnnot <- survAnnot[patients, , drop = F]
# If the survAnnot RData file does not exist, create and save it.
if (!file.exists(paste0(dirname, "survAnnot.RData"))) {
save(survAnnot, file = paste0(dirname, "survAnnot.RData"))
}
# Selecting the patients for the omics:
methylation <- metClustValues
methylation$MET_Cancer_Clustered <- methylation$MET_Cancer_Clustered[, patients,
drop = F]
mutation <- mutation[, patients, drop = F]
cnv <- ov.cnv[, patients, drop = F]
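The same selection logic can be sketched on toy data. The sample IDs below are hypothetical, and Reduce() is used here as a compact alternative to the chain of intersect() calls; it is an illustration, not part of the MOSClip workflow.

```r
# Hypothetical sample IDs available for each omic and in the survival table
omicsSamples <- list(
  exp = c("P1", "P2", "P3", "P4"),
  met = c("P2", "P3", "P4", "P5"),
  mut = c("P1", "P2", "P3"),
  cnv = c("P2", "P3", "P6")
)
survSamples <- c("P1", "P2", "P3", "P4", "P5")

# Reduce() applies intersect() pairwise across the whole list
common <- Reduce(intersect, c(list(survSamples), omicsSamples))
print(common)

# Subset a toy matrix; drop = FALSE keeps it a matrix even if one column remains
toyCnv <- matrix(0, nrow = 2, ncol = 3,
                 dimnames = list(c("g1", "g2"), omicsSamples$cnv))
toyCnv <- toyCnv[, common, drop = FALSE]
```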
If you were paying attention, you noticed that we did not perform the patient selection for the expression matrix. That is because the expression matrix needs some extra steps after the patients are selected. We filter the genes to avoid data sparsity, keeping only those genes with at least 100 counts in at least one patient.
After selecting patients and genes in the expression matrix, we normalize the counts and apply a log2(counts + 1) transformation. This brings the expression data closer to a normal distribution, which is the most suitable distribution for the subsequent MOSClip tests. The normalization depends on the data provided and changes when the set of patients and/or samples changes. That is why this step is performed after the patient selection and appears in this tutorial rather than in the previous one (how to format the dataset for MOSClip).
# Keeping the patients
expression <- expression[, patients, drop = F]
# Filtering the counts
keep <- apply(expression >= 100, 1, any)
# Normalizing the counts
expNorm <- betweenLaneNormalization(expression[keep, , drop = F], which = "upper")
pseudoExpNorm <- log2(expNorm + 1)
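The following base-R sketch walks through the same three steps on an invented count matrix. The upper-quartile scaling shown here is a simplified stand-in written for illustration; it is not EDASeq's implementation and may differ from it in details such as rounding.

```r
# Toy count matrix (genes x patients); all values are invented
counts <- matrix(c(0,   5,    2,
                   150, 300,  90,
                   20,  40,   10,
                   500, 1000, 250),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("g", 1:4), paste0("P", 1:3)))

# Keep genes with at least 100 counts in at least one patient
keep <- apply(counts >= 100, 1, any)
filtered <- counts[keep, , drop = FALSE]

# Simplified upper-quartile scaling: divide each sample by its 75th percentile,
# rescaled by the mean of those percentiles to stay on a count-like scale
uq <- apply(filtered, 2, quantile, probs = 0.75)
scaled <- sweep(filtered, 2, uq / mean(uq), "/")

# log2 of (value + 1), so that zeros remain defined
logScaled <- log2(scaled + 1)
print(round(logScaled, 2))
```

After scaling, every sample has the same upper quartile, which is what makes the samples comparable before the log transformation.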
At this point, we need a pathway knowledge base. We will use the Reactome pathway graphs available in the graphite R package. Reactome is natively distributed with UniProt identifiers, while our analysis uses Entrez Gene IDs, so we need to convert the pathway identifiers.
Since we are downloading and converting all the Reactome pathways, this step may take a while. To avoid waiting again, we save the object for future analyses.
# If the file doesn't exist, create it
if (file.exists("downloadTCGA/reactome-entrez.RData")) {
  load("downloadTCGA/reactome-entrez.RData")
} else {
  reactome <- pathways("hsapiens", "reactome") # getting the Homo sapiens pathways from Reactome
  reactome <- convertIdentifiers(reactome, "entrez")
  save(reactome, file = "downloadTCGA/reactome-entrez.RData")
}
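This load-or-compute pattern recurs throughout the tutorial, so here is a minimal standalone sketch of it. The file name, the helper getResult() and the "expensive" computation are all hypothetical; the cache lives in the session's temporary directory so the example leaves nothing behind.

```r
# Hypothetical cache location inside the session's temporary directory
cacheFile <- file.path(tempdir(), "slow-result.RData")
if (file.exists(cacheFile)) file.remove(cacheFile)  # start clean for the demo

computeCalls <- 0
getResult <- function() {
  if (file.exists(cacheFile)) {
    load(cacheFile)                 # restores the object named 'slowResult'
  } else {
    computeCalls <<- computeCalls + 1
    slowResult <- sum(1:10)         # stand-in for an expensive computation
    save(slowResult, file = cacheFile)
  }
  slowResult
}

first <- getResult()   # computes and saves
second <- getResult()  # loads from disk, no recomputation
```

Remember to delete the saved file whenever the inputs change, otherwise the stale object is loaded silently.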
The Reactome database has a hierarchical structure, and in this tutorial we analyze only subsets of its pathways, selected according to their size. The size of a pathway is counted as the number of its nodes that are measured in the expression matrix.
For the pathway analysis, we use reactHuge, the subset of all pathways with at least 10 nodes. For the module analysis, we use reactSmall, the subset of pathways with between 20 and 100 nodes (inclusive).
This ensures a sufficient level of specificity of the pathways/modules and avoids redundant reactions.
nodesLength <- sapply(reactome, function(g) {
  length(intersect(graphite::nodes(g), row.names(pseudoExpNorm)))
})
reactSmall <- reactome[nodesLength >= 20 & nodesLength <= 100]
reactHuge <- reactome[nodesLength >= 10]
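The size filter can be illustrated without graphite by treating each pathway as a plain vector of gene identifiers. The pathways and the list of measured genes below are invented; only the thresholds mirror the ones used above.

```r
# Hypothetical pathways as plain character vectors of gene IDs
toyPathways <- list(
  tiny   = paste0("ENTREZID:", 1:5),
  medium = paste0("ENTREZID:", 1:30),
  large  = paste0("ENTREZID:", 1:150)
)
# Hypothetical set of genes measured in the expression matrix
measuredGenes <- paste0("ENTREZID:", 1:120)

# Count only the genes that are actually measured
sizes <- vapply(toyPathways,
                function(genes) length(intersect(genes, measuredGenes)),
                integer(1))

smallSet <- toyPathways[sizes >= 20 & sizes <= 100]
hugeSet <- toyPathways[sizes >= 10]
print(sizes)
```

Note that the "large" pathway counts 120 nodes rather than 150, because 30 of its genes are not measured.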
We also need the pathway hierarchy for the MOSClip summary plots, so it is convenient to download it now.
pathHierarchy <- downloadPathwayRelationFromReactome()
pathHierarchyGraph <- igraph::graph_from_data_frame(d = pathHierarchy, directed = TRUE)
We are almost ready to run MOSClip! First, we need to organize the omics matrices in the dedicated MOSClip object: the Omics object. This object wraps together all the omics matrices, the survival annotation (or any other type of “colData”), and the specific arguments for the dimensionality reduction step.
Then, we need to indicate the data reduction strategy to apply to each matrix/omic. In this tutorial, we use PCA for expression data, cluster analysis for methylation data, and vote counting for mutations and CNVs (for details see the MOSClip paper). These data transformations are applied by MOSClip functions, so here we only need to provide the name of the function for each omic.
For methylation data specifically, MOSClip can take a dictionary that associates the methylation levels of multiple CpG clusters with a gene, because more than one CpG cluster is expected per gene. Thus, in the methylation-specific arguments you need to provide the dictionary that converts cluster names into genes.
multiOmics <- makeOmics(experiments = list(exp = pseudoExpNorm,
                                           met = methylation$MET_Cancer_Clustered,
                                           mut = mutation,
                                           cnv = as.matrix(cnv)),
                        colData = survAnnot,
                        modelInfo = c("summarizeWithPca", "summarizeInCluster",
                                      "summarizeToNumberOfEvents",
                                      "summarizeToNumberOfDirectionalEvents"),
                        specificArgs = list(pcaArgs = list(name = "exp",
                                                           shrink = FALSE,
                                                           method = "sparse",
                                                           maxPCs = 3),
                                            clusterArgs = list(name = "met",
                                                               dict = methylation$eMap,
                                                               max_cluster_number = 3),
                                            countEvent = list(name = "mut", min_prop = 0.05),
                                            cnvAgv = list(name = "cnv", min_prop = 0.05)))
save(multiOmics, file = paste0(dirname, "multiOmics.RData"))
As you can see, in this object we specified the four omics, four data reduction methods (one for each omic), and four lists of additional parameters (as described in the help page of each reduction function).
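To give an intuition of what "reduce each omic, then test the reduced variables as Cox covariates" means, here is a minimal sketch on simulated data using the survival package. It does not call MOSClip: the PCA is plain prcomp() rather than sparse PCA, and the mutation summary is simplified to "any mutation in the pathway", both assumptions made only for illustration.

```r
library(survival)  # recommended package, shipped with standard R installations

set.seed(1)
nPatients <- 60
# Toy log-expression for 5 genes of one hypothetical pathway (genes x patients)
expr <- matrix(rnorm(5 * nPatients), nrow = 5,
               dimnames = list(paste0("g", 1:5), paste0("P", seq_len(nPatients))))
# Toy binary mutation calls for the same genes
mut <- matrix(rbinom(5 * nPatients, 1, 0.1), nrow = 5, dimnames = dimnames(expr))

# Reduction 1: first principal component of expression (patients as observations)
expPC1 <- prcomp(t(expr), center = TRUE, scale. = TRUE)$x[, 1]
# Reduction 2: simplified event summary, 1 if the patient carries any mutation
mutAny <- as.integer(colSums(mut) > 0)

# Simulated follow-up (days) and event indicator, unrelated to the covariates
covariates <- data.frame(days = rexp(nPatients, rate = 1 / 1000),
                         status = rbinom(nPatients, 1, 0.7),
                         expPC1 = expPC1, mut = mutAny)

# One Cox model with the reduced omics as covariates
fit <- coxph(Surv(days, status) ~ expPC1 + mut, data = covariates)
pvals <- summary(fit)$coefficients[, "Pr(>|z|)"]
print(pvals)
```

Each covariate gets its own p-value, which is the kind of per-omic information reported in the summary tables later in this tutorial.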
MOSClip analysis can be performed at the pathway level or at the module level, where modules are sub-parts of pathways. A pathway analysis gives a general overview of the processes involved, while the module analysis can highlight more precisely the mechanisms involved.
For both types of analysis, we apply another a priori filter, keeping only the genes that are present in the expression data. This filter is not mandatory.
genesToConsider <- row.names(pseudoExpNorm)
Here we perform the multi-omics survival analysis on pathway modules.
This step creates a list of MultiOmicsModules objects (“MOM” objects for short), each of which corresponds to one pathway.
The analysis takes quite a long time, so we also save the results for future use.
if (file.exists(paste0(dirname, "momSurvivalList.RData"))) {
  load(paste0(dirname, "momSurvivalList.RData"))
} else {
  momSurvivalList <- lapply(reactSmall, function(g) {
    # print(g@title) # uncomment it so you can see which pathway is being analyzed
    # for each pathway contained in reactSmall, create the MOM object
    res <- multiOmicsSurvivalModuleTest(multiOmics, graph = g,
                                        useTheseGenes = genesToConsider)
    res
  })
  save(momSurvivalList, file = paste0(dirname, "momSurvivalList.RData"))
}
In the pathway test, the topology of the pathways (in and out gene connections) can be exploited to guide the data reduction step. For this analysis, we suggest using topological PCA instead of sparse PCA, which can be done by changing the settings in the Omics object.
multiOmicsPathway <- multiOmics
multiOmicsPathway@specificArgs$pcaArgs$method = "topological"
multiOmicsPathway@specificArgs$pcaArgs$shrink = TRUE
Then we can run the analysis using the function multiOmicsSurvivalPathwayTest():
if (file.exists(paste0(dirname, "mopSurvivalList.RData"))) {
  load(paste0(dirname, "mopSurvivalList.RData"))
} else {
  mopSurvivalList <- lapply(reactSmall, function(g) {
    # print(g@title) # uncomment it so you can see which pathway is being analyzed
    # for each pathway contained in reactSmall, create the MOP object
    res <- multiOmicsSurvivalPathwayTest(multiOmicsPathway, graph = g,
                                         useTheseGenes = genesToConsider)
    res
  })
  save(mopSurvivalList, file = paste0(dirname, "mopSurvivalList.RData"))
}
This step creates a list of MultiOmicsPathway objects (“MOP” objects for short), each of which corresponds to one pathway. This step also takes a while, so it is better to save the list for future analyses.
Now the analyses are done and we are ready to inspect them! MOSClip has plenty of functions to explore the results, and we show some examples in the following part.
Using the function multiPathwayModuleReport() or multiPathwayReport() we can obtain a tabular summary of the modules or pathways, sorted by the p-value of the Cox proportional hazards model. The top 10 rows of each table are shown below.
moduleSummary <- multiPathwayModuleReport(momSurvivalList)
(row name) | module | pvalue | cnvNEG | cnvPOS | expPC1 | expPC2 | expPC3 | met2k2 | met3k2 | met3k3 | mut |
---|---|---|---|---|---|---|---|---|---|---|---|
Interleukin-12 family signaling.2 | 2 | 0.0000641 | 0.0018125 | 0.3771929 | 0.0131640 | 0.0000084 | 0.0771730 | NA | 0.8627496 | 0.9468694 | 0.1074710 |
Interleukin-12 family signaling.3 | 3 | 0.0001780 | 0.0017586 | 0.4915304 | 0.0832916 | 0.0000170 | 0.9396365 | 0.8151653 | NA | NA | 0.1093456 |
Apoptotic cleavage of cellular proteins.22 | 22 | 0.0001797 | NA | 0.1374085 | 0.2464061 | NA | NA | 0.0000258 | NA | NA | NA |
Apoptotic execution phase.25 | 25 | 0.0001797 | NA | 0.1374085 | 0.2464061 | NA | NA | 0.0000258 | NA | NA | NA |
FOXO-mediated transcription.21 | 21 | 0.0001839 | NA | 0.0556559 | 0.0210130 | NA | NA | NA | 0.0635954 | 0.0054845 | NA |
Interleukin-12 family signaling.1 | 1 | 0.0002170 | 0.0028624 | 0.2372187 | 0.0164954 | 0.0000244 | 0.1085322 | NA | 0.8574025 | 0.8829020 | 0.4882825 |
Interleukin-12 signaling.10 | 10 | 0.0003126 | 0.0293561 | 0.3075713 | 0.4085160 | 0.0002131 | 0.0070165 | 0.9237480 | NA | NA | 0.6448038 |
FOXO-mediated transcription.8 | 8 | 0.0005220 | 0.9270558 | 0.0088464 | 0.2767392 | 0.0618636 | 0.5131571 | NA | 0.2452002 | 0.0285626 | 0.6465475 |
FOXO-mediated transcription.14 | 14 | 0.0005986 | NA | 0.2781266 | 0.0369473 | NA | NA | NA | 0.0855733 | 0.0098327 | NA |
FOXO-mediated transcription.16 | 16 | 0.0007577 | NA | 0.9678402 | 0.0521183 | NA | NA | NA | 0.0968579 | 0.0089905 | NA |
pathwaySummary <- multiPathwayReport(mopSurvivalList)
(row name) | pvalue | cnvNEG | cnvPOS | expPC1 | expPC2 | expPC3 | met2k2 | met3k2 | met3k3 | mut |
---|---|---|---|---|---|---|---|---|---|---|
Binding and Uptake of Ligands by Scavenger Receptors | 0.0027281 | 0.9687245 | 0.0724609 | 0.0186142 | 0.1348834 | 0.0003696 | 0.2631560 | NA | NA | 0.9533780 |
Activation of ATR in response to replication stress | 0.0032654 | 0.1977267 | 0.0386159 | 0.0002652 | 0.0015228 | 0.7338686 | NA | 0.5533990 | 0.1054757 | 0.8821458 |
Cell-cell junction organization | 0.0057893 | 0.8395458 | 0.0320409 | 0.6422265 | 0.0548040 | 0.0009511 | 0.7312096 | NA | NA | 0.1251044 |
Downstream signal transduction | 0.0073110 | 0.0807805 | 0.1463223 | 0.1494291 | 0.0292674 | 0.4850052 | NA | 0.0191119 | 0.9145165 | 0.3154155 |
Cell junction organization | 0.0073670 | 0.9843566 | 0.0400655 | 0.4076975 | 0.0921129 | 0.0015507 | 0.6126347 | NA | NA | 0.1938999 |
Adherens junctions interactions | 0.0115703 | 0.6371739 | 0.0272854 | 0.4497246 | 0.0550522 | 0.0015842 | NA | 0.8560170 | 0.2584908 | 0.3325399 |
Activation of Matrix Metalloproteinases | 0.0146996 | 0.0722402 | 0.0619855 | 0.1647940 | 0.3742310 | 0.0075057 | 0.2012347 | NA | NA | 0.1622310 |
Assembly of collagen fibrils and other multimeric structures | 0.0169185 | 0.3738750 | 0.5849960 | 0.1384777 | 0.3904844 | 0.0112785 | 0.0280128 | NA | NA | 0.9033935 |
Acyl chain remodelling of PS | 0.0171948 | 0.1453951 | 0.3153253 | 0.3554507 | 0.0009633 | 0.1588579 | 0.9022454 | NA | NA | NA |
Synthesis of PC | 0.0180271 | 0.0214546 | 0.6891071 | 0.0164082 | 0.2971871 | 0.9016076 | 0.6097319 | NA | NA | 0.0234343 |
MOSClip has a function that plots a heatmap of the results report. The heatmap is sorted by the p-value of the Cox proportional hazards model that uses all the omics as covariates. The leftmost column is the p-value of the model, and the other columns are the p-values of each covariate. The color gradient also corresponds to the p-values. In this way, the plot helps to understand the involvement of the different omics in survival.
For the module analysis, the function is plotModuleReport(), which plots all the modules of a chosen pathway. Please note that this function takes as input a single MOM object, that is, one element of the list of MOM objects.
plotModuleReport(momSurvivalList[["Activation of Matrix Metalloproteinases"]])
As you can see in this heatmap, module 5 of the pathway “Activation of Matrix Metalloproteinases” (plotted above) is significant. Furthermore, from the p-values of the model covariates we can infer the omics that drive this survival behavior: expression (expPC1 and expPC2) and methylation (met2k2).
For the pathway analysis, the function plotMultiPathwayReport() plots the first n pathways in a list of MOP objects:
plotMultiPathwayReport(mopSurvivalList, 10)
Here we can see that the same pathway we chose before (“Activation of Matrix Metalloproteinases”) remains significant in the pathway-type analysis.
Now that we have identified significant pathways/modules and the covariates associated with survival, we can also plot Kaplan-Meier curves, dividing the patients into groups with different omics patterns:
plotModuleKM(momSurvivalList[["Activation of Matrix Metalloproteinases"]], 5,
formula = "Surv(days, status) ~ expPC1 + met2k",
paletteNames = "Paired", inYears = TRUE)
plotPathwayKM(mopSurvivalList[["Activation of ATR in response to replication stress"]],
formula = "Surv(days, status) ~ cnvPOS + expPC1",
paletteNames = "Paired", inYears = TRUE)
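To make concrete what a Kaplan-Meier comparison between patient groups involves, here is a minimal sketch with the survival package on simulated data. The median split, the colors and all variable names are assumptions for illustration; MOSClip's plotting functions may define the patient groups differently.

```r
library(survival)

set.seed(2)
n <- 80
score <- rnorm(n)   # stand-in for a continuous covariate such as expPC1
group <- ifelse(score > median(score), "high", "low")
toy <- data.frame(days = rexp(n, rate = 1 / 1000),
                  status = rbinom(n, 1, 0.7),
                  group = factor(group))

# One Kaplan-Meier curve per group
kmFit <- survfit(Surv(days, status) ~ group, data = toy)
plot(kmFit, col = c("steelblue", "tomato"), xscale = 365.25,
     xlab = "Years", ylab = "Survival probability")
legend("topright", legend = levels(toy$group),
       col = c("steelblue", "tomato"), lty = 1)

# Log-rank test comparing the two curves
lr <- survdiff(Surv(days, status) ~ group, data = toy)
print(lr)
```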
Finally, we can look at the predictive genes using heatmaps with additional patient annotations. For this step, let’s perform some additional formatting of the survival annotation data so that we get a better heatmap:
additionalA <- survAnnot
additionalA$status[additionalA$status == 1] <- "event"
additionalA$status[additionalA$status == 0] <- "no_event"
additionalA$PFS <- as.factor(additionalA$status)
additionalA$status <- NULL
additionalA$years <- round(additionalA$days/365.24, 0)
additionalA$days <- NULL
Then we can finally plot it:
plotModuleHeat(momSurvivalList[["Activation of Matrix Metalloproteinases"]], 5,
sortBy = c("expPC1", "met2k", "status", "days"),
additionalAnnotations = survAnnot,
additionalPaletteNames = list(status = "teal", days = "violet"),
withSampleNames = F)
plotPathwayHeat(mopSurvivalList[["Activation of ATR in response to replication stress"]],
sortBy = c("cnvPOS", "expPC1", "status", "days"),
additionalAnnotations = survAnnot,
additionalPaletteNames = list(status = "teal", days = "violet"),
withSampleNames = F)
When looking at predictive genes, it is extremely useful to relate them to survival behavior (death event or survival time), as well as to other types of annotation (e.g. tumor grade). That’s why MOSClip allows plotting heatmaps with additional custom annotations.
With MOSClip it is possible to perform an exact test on the intersections of omics, based on the theoretical framework implemented in the SuperExactTest R package. It provides efficient computation of the statistical distributions of multi-omic pathway/module set intersections. Our function runSupertest performs this analysis and automatically produces a circular plot with the frequency of all significant omic combinations and their significance levels.
# For module analysis results
moduleST <- runSupertest(moduleSummary, pvalueThr = 0.05, zscoreThr = 0.05,
excludeColumns = c("pathway", "module"))
# For pathway-type analysis results
pathwayST <- runSupertest(pathwaySummary, pvalueThr = 0.05, zscoreThr = 0.05)
Here we can see that we have 28 modules with their expression and methylation significantly associated with survival. In the pathways graph, we have instead 3 significant pathways with the combination of expression and methylation.
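The idea behind testing a set intersection can be sketched for the simplest case of two omics with a hypergeometric test in base R. This is a two-set analogue written for illustration, with invented module names and set sizes; it is not the multi-set computation performed by SuperExactTest or runSupertest.

```r
# Hypothetical universe of 200 tested modules
allModules <- paste0("m", 1:200)
# Invented sets of modules significant for each omic
sigExp <- allModules[1:40]
sigMet <- allModules[c(21:40, 101:120)]

# Observed size of the intersection
overlap <- length(intersect(sigExp, sigMet))

# P(intersection >= overlap) if the two sets were drawn independently
pOverlap <- phyper(overlap - 1,
                   m = length(sigExp),
                   n = length(allModules) - length(sigExp),
                   k = length(sigMet),
                   lower.tail = FALSE)
print(overlap)
print(pOverlap)
```

With 40 modules in each set out of 200, about 8 shared modules would be expected by chance, so an overlap of 20 yields a very small p-value.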
The frequency plot shows the distribution of pathway frequencies aggregated into macro categories. It uses the Reactome or KEGG hierarchical structure, separately for each omic combination. This plot can provide insights into prognostic biological processes that may be impacted by the omics and their cross-talk.
To draw this plot, we first need to compute the omics intersections and then annotate the pathways to their father nodes. Once these steps are done, we can finally compute the frequencies.
# For module analysis
modulesIntersections <- computeOmicsIntersections(moduleSummary,
                                                  pvalueThr = 0.05,
                                                  zscoreThr = 0.05,
                                                  excludeColumns = c("pathway", "module"))
# This step is exclusive to module-type results: it removes the module number
# from the end of the row names
modulesIntersections <- lapply(modulesIntersections, stripModulesFromPathways)
modules2fathers <- lapply(modulesIntersections, annotePathwayToFather,
                          graphiteDB = reactome,
                          hierarchy = pathHierarchyGraph)
MOMfreqDf <- computeFreqs(modules2fathers)
# We can also create a data frame annotating each pathway with its father
correspondence <- lapply(names(modulesIntersections), function(omicClass) {
  data.frame(path = modulesIntersections[[omicClass]],
             father = modules2fathers[[omicClass]],
             stringsAsFactors = FALSE)
})
plotFrequencies(MOMfreqDf)
Doing this plot for the results of the pathway-type analysis is pretty much the same:
omicsClasses2pathways <- computeOmicsIntersections(pathwaySummary)
omicsClasses2fathers <- lapply(omicsClasses2pathways, annotePathwayToFather, graphiteDB = reactome, hierarchy = pathHierarchyGraph)
pathwayFreqDf <- computeFreqs(omicsClasses2fathers)
plotFrequencies(pathwayFreqDf)
Note that here it is not necessary to strip the module number from the pathway names, since we are not dealing with modules.
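The aggregation behind the frequency plot can be sketched in base R. The pathway names, parent categories and omic combinations below are invented, and the data frame layout is an assumption for illustration; it is not necessarily the structure returned by computeFreqs().

```r
# Invented mapping from pathways to their top-level parent category
parentOf <- c(pathA = "Category 1", pathB = "Category 1",
              pathC = "Category 2", pathD = "Category 3")

# Invented lists of significant pathways for each omic combination
omicsToPathways <- list("exp;met" = c("pathA", "pathB", "pathC"),
                        "cnv;exp" = c("pathA", "pathD"))

# Replace each pathway with its parent category
omicsToParents <- lapply(omicsToPathways, function(p) unname(parentOf[p]))

# Count categories separately for each omic combination
freqDf <- do.call(rbind, lapply(names(omicsToParents), function(cls) {
  counts <- table(omicsToParents[[cls]])
  data.frame(omics = cls,
             category = names(counts),
             frequency = as.integer(counts),
             stringsAsFactors = FALSE)
}))
print(freqDf)
```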
The function plotModuleInGraph() is specific for the module-type analysis, and it allows the visualization of the position of a chosen module inside its pathway. The function takes as input a MOM object, a “PathwayList” object from the graphite R package, which in our case is the variable reactSmall, and the module number we want to visualize:
plotModuleInGraph(momSurvivalList[["Activation of Matrix Metalloproteinases"]],
reactSmall, moduleNumber = 5)
## 'select()' returned 1:1 mapping between keys and columns
So far, we have identified some significant modules or pathways. MOSClip also makes it possible to prioritize the most important and stable results with a resampling strategy. Modules or pathways with a p-value <= 0.05 are re-tested n times (n is customizable) on resampled sets of patients, and only those that are successful in more than a given number of resamplings are selected. In practice, this is done as below:
useThesePathwaysM <- unique(moduleSummary$pathway[moduleSummary$pvalue <= 0.05])
if (file.exists(paste0(dirname, "permsModules.RData"))) {
  load(paste0(dirname, "permsModules.RData"))
} else {
  permsModules <- resamplingModulesSurvival(fullMultiOmics = multiOmics,
                                            reactSmall, nperm = 100,
                                            pathwaySubset = useThesePathwaysM,
                                            genesToConsider = genesToConsider)
  save(permsModules, file = paste0(dirname, "permsModules.RData"))
}
In our case we used 100 permutations, and by default this function removes 3 patients per permutation (this number can be changed). Note that, depending on the number of permutations, this step can take a while to run.
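The logic of the resampling strategy can be sketched on simulated data with the survival package. Here a single covariate with a genuine effect is re-tested after removing 3 random patients each time. Defining success as p <= 0.05 and stability as at least 80 successes are assumptions that mirror the thresholds used in this tutorial, not a description of MOSClip's internals.

```r
library(survival)

set.seed(3)
n <- 100
x <- rnorm(n)
# Simulated survival times whose hazard increases with x (a genuine association)
toy <- data.frame(days = rexp(n, rate = exp(0.8 * x) / 1000),
                  status = 1, x = x)

nResample <- 100   # number of resampling rounds
nDrop <- 3         # patients removed in each round

pvals <- vapply(seq_len(nResample), function(i) {
  kept <- toy[-sample(n, nDrop), ]            # drop 3 random patients
  fit <- coxph(Surv(days, status) ~ x, data = kept)
  summary(fit)$coefficients[1, "Pr(>|z|)"]    # p-value of the covariate
}, numeric(1))

successCount <- sum(pvals <= 0.05)   # rounds in which the test stays significant
isStable <- successCount >= 80
print(successCount)
```

A result driven by a handful of patients would lose significance in many rounds, whereas a robust association keeps a high success count.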
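The pathway-type analysis proceeds in much the same way, only the resampling function is different. The code below assumes that its results are stored in permsPathways and that the names of the tested pathways are stored in useThesePathwaysP.
Once the resampling analyses are completed, we can explore the results. First, we can plot the distribution of modules/pathways according to their success count and p-value: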
sModule <- moduleSummary[moduleSummary$pathway %in% useThesePathwaysM, ,
drop = T]
stableModulesSummary <- selectStablePathwaysModules(perms = permsModules,
moduleSummary = sModule,
success = 80)
sucessCountModules <- getPathwaysModulesSuccess(perms = permsModules,
moduleSummary = sModule)
moduleSummary <- addResamplingCounts(moduleSummary, sucessCountModules)
# For the pathway-type results:
sPathway <- pathwaySummary[row.names(pathwaySummary) %in% useThesePathwaysP, ,
drop = T]
stablePathwaysSummary <- selectStablePathwaysModules(perms = permsPathways,
moduleSummary = sPathway,
success = 80)
sucessCountPathway <- getPathwaysModulesSuccess(perms = permsPathways,
moduleSummary = sPathway)
pathwaySummary <- addResamplingCounts(pathwaySummary, sucessCountPathway)
Here we can see the distribution of the resampling success counts and the p-value of the modules or pathways. Additionally, we created a new column in the results summary table to include the success counts of the resampling.
The super exact test and the frequency distribution plot of MOSClip offer an additional filter specific to the resampling strategy. Both can be restricted to the modules or pathways that have a success count above a certain threshold:
runSupertest(moduleSummary, pvalueThr = 0.05, zscoreThr = 0.05,
resampligThr = 80, excludeColumns = c("pathway", "module",
"resamplingCount"))
We can also plot a frequency plot with pathway-type analysis results, after performing the resampling strategy:
omicsClasses2pathways <- computeOmicsIntersections(pathwaySummary,
resampligThr = 80,
excludeColumns =
c("resamplingCount"))
omicsClasses2fathers <- lapply(omicsClasses2pathways, annotePathwayToFather, graphiteDB = reactome, hierarchy = pathHierarchyGraph)
pathwayFreqDf <- computeFreqs(omicsClasses2fathers)
plotFrequencies(pathwayFreqDf)
Even for plots that do not have a specific parameter for resampling, we can still plot only the modules/pathways with a success count above a certain threshold:
resamplingTrue <- names(which(sucessCountPathway >= 80))
plotMultiPathwayReport(mopSurvivalList[names(mopSurvivalList) %in%
resamplingTrue], 10)
In the last ten years we have witnessed a dramatic change in the clinical treatment of patients thanks to molecular and personalized medicine. In fact, many medical institutes are starting to adopt routine genome-wide screening to complement and support diagnosis and treatment choices. As the number of datasets grows, we need to adapt and improve our methods to cope with the complexity, amount and multi-level structure of the available information. That is why we need analytical methods that effectively integrate the multi-omic dimensions of these data.
MOSClip can deal with this complexity, allowing multi-omic data integration through survival pathway analyses. In brief, MOSClip comprises three main components: pathway topology, multi-omic data and survival analysis.
In this tutorial, you learned how to perform a complete multi-omics integration analysis to identify pathways or modules that are significantly associated with survival. Starting from the data matrices of four omics (expression, CNV, mutation, and methylation), we ended up with multiple plots that dissect every aspect of the results.
If you have not done so yet, please check our other tutorials.