Implement mofa2 wrapper for MAE #144

RiboRings · 2024-01-19T09:23:34Z

The current method for MultiAssayExperiments requires some manual work which the end user may not be familiar with. As in this example, after transforming assays you also need to:

remove unnecessary assays (like assay(mae[[1]], "counts") <- NULL)
run create_mofa_from_MultiAssayExperiment
Get default options and modify them if necessary (like model_opts$num_factors <- 5)
run prepare_mofa

As an alternative to performing those 4 steps separately, I propose to include them in a wrapper function, whose result can be directly passed to run_mofa.

RiboRings · 2024-01-19T09:28:04Z

The R script of interest is create_mofa_from_mae.R. I left run_mofa out of the wrapper in case the user would like to make additional changes prior to running, but run_mofa could also be included in the wrapper.

antagomir · 2024-01-19T10:58:00Z

Could you more clearly show here with an example how something can be done more efficiently with this new contribution? E.g. show parallel examples of current & new way. This will make it easier to see how this will be useful.

RiboRings · 2024-01-19T19:43:44Z

Sure! The following example is inspired by Chapter 15 of the OMA book.

Import and preprocess dataset:

library(mia)
data("HintikkaXOData", package = "mia")
mae <- HintikkaXOData

# For simplicity, classify all high-fat diets as high-fat, and all the low-fat 
# diets as low-fat diets
colData(mae)$Diet <- ifelse(colData(mae)$Diet == "High-fat" | 
                              colData(mae)$Diet == "High-fat + XOS", 
                            "High-fat", "Low-fat")

# Transforming microbiome data with rclr
mae[[1]] <- transformAssay(mae[[1]], method = "relabundance")
mae[[1]] <- transformAssay(mae[[1]], assay.type = "relabundance", method = "rclr")

# Transforming metabolomic data with log10
mae[[2]] <- transformAssay(mae[[2]], assay.type = "nmr",
                            MARGIN = "samples",
                            method = "log10")

# Transforming biomarker data with z-transform
mae[[3]] <- transformAssay(mae[[3]], assay.type = "signals",
                           MARGIN = "features",
                           method = "z", pseudocount = 1)

Example with wrapper:

prep_model <- mofa2(mae,
                    assay.types = c("rclr", "log10", "z"),
                    groups = "Diet",
                    num_factors = 5,
                    extract_metadata = TRUE)

Same example without wrapper:

# Removing assays no longer needed
assay(mae[[1]], "counts") <- NULL
assay(mae[[1]], "log10") <- NULL
assay(mae[[2]], "nmr") <- NULL
assay(mae[[3]], "signals") <- NULL

# Building our mofa model
model <- create_mofa_from_MultiAssayExperiment(mae,
                                               groups = "Diet", 
                                               extract_metadata = TRUE)

# Specifying custom model options
model_opts <- get_default_model_options(model)
model_opts$num_factors <- 5

# Preparing model
prep_model <- prepare_mofa(
  object = model,
  model_options = model_opts
)

Then, the prep_model object can be passed to run_mofa for training:

trained_model <- run_mofa(prep_model, use_basilisk = TRUE)

antagomir · 2024-01-20T13:14:47Z

Hmm seems more straightfwd and easy to implement.

antagomir · 2024-03-13T15:47:46Z

Up.

antagomir · 2024-04-11T19:30:30Z

@rargelaguet any thoughts on when this PR could be considered..?

artur-sannikov · 2024-05-10T10:54:41Z

Hi, I'm also interested in this feature. Are you planning to merge?

antagomir · 2024-06-10T08:38:34Z

Hi @rargelaguet - any thoughts on when this PR could be considered..?

rargelaguet · 2024-06-12T07:56:20Z

Hi @antagomir and @RiboRings, thanks a lot for the pull request and I am very sorry for the delay on my side.

I like the idea of having a general wrapper called [mofa2] that handles all input format and simplifies the process of creating the MOFA object. But the mofa2 function in the pull request only handles the MultiAssayExperiment format, which I think can be a bit misleading. I can extend the mofa2 function in your pull request to handle the other formats, and then proceed to merge. What do you think?

Thanks again for you feedback and contribution!

RiboRings · 2024-06-12T08:32:49Z

Thank you for the reply!

It sounds good to me. Feel free to let me know if anything should be changed/improved in the method for MultiAssayExperiment.

antagomir · 2024-06-12T12:09:26Z

Sounds good.

It just came to my mind that SingleCellExperiment/TreeSummarizedExperiment objects have alternative experiments (altExp) that are also commonly used to store multiple data tables. If it is welcome, we could add support for that format. It is probably as important in practice as the MultiAssayExperiment.

rargelaguet · 2024-06-29T16:38:22Z

Hi @antagomir and @RiboRings,

thanks again for your suggestions, very much appreciated.

I have been exploring the option to implement a one-line wrapper function that creates a MOFA2 object from multiple input formats and that it also parses the data/model/training options. This is now implemented a function called mofa2 in the test_improved_mae branch. Could you test if this function works well for you and then I will proceed to merge to the main branch

Note in this function I have incorporated an argument called assays that should solve the issue of removing unnecessary assays in a MultiAssayExperiment objects

RiboRings · 2024-06-30T06:42:13Z

Hi @rargelaguet!

Thank you for getting back to this discussion.

For me, the mofa2 wrapper gives the expected output with MAE as we originally discussed. However, I find it might be a bit confusing to have both the arguments assays and assay for the case of a single cell experiment, especially because at most one assay can be selected without error. For example:

library(mia)
data("Tengeler2020", package = "mia")
tse <- Tengeler2020
tse <- transformAssay(tse, method = "relabundance")

# Using assay argument
create_mofa(tse, assay = "counts")
> Creating MOFA object from a SingleCellExperiment object...

# Using assays argument
create_mofa(tse, assays = "counts")
> Creating MOFA object from a SingleCellExperiment object...

# Giving two assays as input
create_mofa(tse, assays = c("counts", "relabundance"))
> Creating MOFA object from a SingleCellExperiment object...
> Error in value[[3L]](cond) :
> assay(<TreeSummarizedExperiment>, i="character", ...)' invalid subscript 'i'
> attempt to extract more than one element

Maybe one option would be to just keep only assays argument and clarify in the vignettes that it should be one assay for SCE.

antagomir · 2024-06-30T07:41:27Z

Great. I made a quick test following the example on the manpage.

library(mia)
data("HintikkaXOData", package = "mia")
mae <- HintikkaXOData
experiments(mae)
mofa_obj <- mofa2(mae, assays = c("counts", "nmr"))

This throws the error:

Error in names(assays) <- names(experiments(mae)) :
'names' attribute [3] must be the same length as the vector [2]

When I drop the third object from MAE it works:

experiments(mae)$biomarkers <- NULL
mofa_obj <- mofa2(mae, assays = c("counts", "nmr"))

Checking data options...
Checking training options...
Checking model options...
Warning messages:
1: In prepare_mofa(object = new("MOFA", data = list(microbiota = list( :
Some view(s) have less than 15 features, MOFA will have little power to to learn meaningful factors for these view(s)....
2: In prepare_mofa(object = new("MOFA", data = list(microbiota = list( :
The total number of samples is very small for learning 15 factors.
Try to reduce the number of factors to obtain meaningful results. It should not exceed ~10.

It might be less error-prone if the user was required to specify assay for all availble experiments, for instance like

experiments(mae)$biomarkers <- NULL
mofa_obj <- mofa2(mae, assays = c("counts", "nmr", NULL))

or

experiments(mae)$biomarkers <- NULL
mofa_obj <- mofa2(mae, assays = c("counts", NULL, "signals"))

using NULL to indicate experiments that should be dropped out from the analysis?

antagomir · 2024-06-30T07:44:11Z

If support for altExps could be added that would be great. The logic could be the same than for MAE (i.e. specifying the vector of assays, one for each altExp). Often altExp is the recommended choice when samples match one-to-one.

I noticed that mofa2 function does not yet have @export tag

RiboRings · 2024-09-08T17:55:06Z

Hi @rargelaguet! Any news when this could be merged?

antagomir · 2024-09-09T17:57:10Z

I think we could also help to finalize this by making a PR against the development branch, if useful.

RiboRings · 2024-10-25T07:18:25Z

Hi @rargelaguet! Any update on this?

artur-sannikov · 2024-11-22T09:16:43Z

I second @RiboRings, any updates @rargelaguet?

TuomasBorman · 2024-11-22T19:22:13Z

R/create_mofa_from_mae.R

+.select_assays <- function(mae, assay.types) {
+  # Give corresponding experiment names to assay.types
+  names(assay.types) <- names(experiments(mae))
+  # For every experiment in MAE
+  for ( exp in names(experiments(mae)) ){
+    # Keep only selected assay.type from a given experiment
+    assays(mae[[exp]]) <- list(assay(mae[[exp]], assay.types[[exp]]))
+    # Update assay names
+    assayNames(mae[[exp]]) <- assay.types[[exp]]
+  }
+  return(mae)
+}


These can lead to an errors that might be hard for user to solve:

MAE can have also other datatypes than SEs. The object just has to have rows and columns.

What if the length of assay.types does not match with length of experiments?

TuomasBorman · 2024-11-22T20:07:52Z

Something like this might be more explicit with the assay names and experiments

MOFA2 <- function(mae, assay.types, experiments = names(mae), ...){
    if( !(length(experiments) == length(assay.types) ) ) stop(".")
    if( !(all(experiments %in% names(mae) ) || all(is.numeric(experiments) & experiments > 0 & experiments <= length(experiments(mae)) )) ) stop(".")
    mae <- mae[, , experiments]
    # Loop through experiments
    for( i in seq_len(length(mae)) ){
        tse <- mae[[i]]
        assay.type <- assay.types[[i]]
        tse <- .subset_experiment(tse, assay.type, i, ...)
        mae[[i]] <- tse
    }
    return(mae)
}

.subset_experiment <- function(tse, assay.type, i, altexps = NULL, ...){
    if( !(is.null(altexps) || length(altexps) >= i) ) stop(".")
    altexp <- altexps[[i]]
    
    if( !is(tse, "SummarizedExperiment") ) stop(".")
    if( !is.null(altexp) && !is.na(altexp) && !is(tse, "SingleCellExperiment")) stop("Object must be SCE to use altExp.")
    if( !(is.null(altexp) || is.na(altexp) || altexp %in% altExpNames(tse) )  ) stop(".")
    if( !is.null(altexp) && !is.na(altexp) ){
        tse <- altExp(tse, altexp)
    }
    if( !assay.type %in% assayNames(tse) ) stop(".")
    assays(tse) <- assays(tse)[assay.type]
    return(tse)
}

library(mia)
data("HintikkaXOData")

# Prepare data
mae <- HintikkaXOData
altExp(mae[[1]], "asd") <- mae[[1]]

MOFA2(
    mae,
    experiments = c(1, 2),
    assay.types = c("counts", "nmr"),
    altexps = c("asd", NA)
)

TuomasBorman · 2024-11-23T08:12:19Z

R/create_mofa_from_mae.R

+  # For every option in a set (data, model, train, ...)
+  for ( opt in names(default) ){
+    # If that option is found among arguments
+    if ( opt %in% names(list(...)) ){
+      # Replace default with value specified in arguments
+      default[[opt]] <- list(...)[[opt]]
+    }
+  }


Came to my mind while resting, does this work to avoid loops?

user_options <- list(...) set_options <- intersect(names(default), names(user_options)) default[set_options] <- user_options[set_options]

artur-sannikov · 2025-01-13T08:23:24Z

Any updates @rargelaguet?

Also pinging @RiboRings.

Initialise mofa2 wrapper for MAE

6486a98

RiboRings mentioned this pull request Jan 19, 2024

Extended support for MultiAssayExperiment #133

Open

RiboRings mentioned this pull request Jun 10, 2024

Make create_mofa_from_MultiAssayExperiment() accept alternative experiments microbiome/mia#412

Open

TuomasBorman reviewed Nov 22, 2024

View reviewed changes

TuomasBorman reviewed Nov 23, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement mofa2 wrapper for MAE #144

Implement mofa2 wrapper for MAE #144

RiboRings commented Jan 19, 2024

RiboRings commented Jan 19, 2024

antagomir commented Jan 19, 2024

RiboRings commented Jan 19, 2024 •

edited

Loading

antagomir commented Jan 20, 2024

antagomir commented Mar 13, 2024

antagomir commented Apr 11, 2024

artur-sannikov commented May 10, 2024

antagomir commented Jun 10, 2024

rargelaguet commented Jun 12, 2024

RiboRings commented Jun 12, 2024

antagomir commented Jun 12, 2024

rargelaguet commented Jun 29, 2024

RiboRings commented Jun 30, 2024

antagomir commented Jun 30, 2024 •

edited

Loading

antagomir commented Jun 30, 2024 •

edited

Loading

RiboRings commented Sep 8, 2024

antagomir commented Sep 9, 2024 •

edited

Loading

RiboRings commented Oct 25, 2024 •

edited

Loading

artur-sannikov commented Nov 22, 2024

TuomasBorman Nov 22, 2024

TuomasBorman commented Nov 22, 2024

TuomasBorman Nov 23, 2024

artur-sannikov commented Jan 13, 2025

Implement mofa2 wrapper for MAE #144

Are you sure you want to change the base?

Implement mofa2 wrapper for MAE #144

Conversation

RiboRings commented Jan 19, 2024

RiboRings commented Jan 19, 2024

antagomir commented Jan 19, 2024

RiboRings commented Jan 19, 2024 • edited Loading

antagomir commented Jan 20, 2024

antagomir commented Mar 13, 2024

antagomir commented Apr 11, 2024

artur-sannikov commented May 10, 2024

antagomir commented Jun 10, 2024

rargelaguet commented Jun 12, 2024

RiboRings commented Jun 12, 2024

antagomir commented Jun 12, 2024

rargelaguet commented Jun 29, 2024

RiboRings commented Jun 30, 2024

antagomir commented Jun 30, 2024 • edited Loading

antagomir commented Jun 30, 2024 • edited Loading

RiboRings commented Sep 8, 2024

antagomir commented Sep 9, 2024 • edited Loading

RiboRings commented Oct 25, 2024 • edited Loading

artur-sannikov commented Nov 22, 2024

TuomasBorman Nov 22, 2024

Choose a reason for hiding this comment

TuomasBorman commented Nov 22, 2024

TuomasBorman Nov 23, 2024

Choose a reason for hiding this comment

artur-sannikov commented Jan 13, 2025

RiboRings commented Jan 19, 2024 •

edited

Loading

antagomir commented Jun 30, 2024 •

edited

Loading

antagomir commented Jun 30, 2024 •

edited

Loading

antagomir commented Sep 9, 2024 •

edited

Loading

RiboRings commented Oct 25, 2024 •

edited

Loading