Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

getNMF #692

Open
antagomir opened this issue Feb 8, 2025 · 4 comments
Open

getNMF #692

antagomir opened this issue Feb 8, 2025 · 4 comments

Comments

@antagomir
Copy link
Member

Currently getNMF returns the samples x components score matrix for NMF, with attributes.

If the output must be in this format (a single matrix, rather than a named list including all attributes), then the manpage examples could show for clarity's sake how to get the other elements out from getNMF output. The "attributes" command may not be very familiar for an average R user:

x <- getNMF(tse, ...)
res <- attributes(x)
names(res)

@antagomir
Copy link
Member Author

Might also make sense to have a method to pick those elements, or just return a list:

Now we can pick loadings with:
t(attr(reducedDim(tse, "NMF"), "loadings"))

If getNMF (and addNMF into reducedDim) would instead return a list, we could do something like:
reducedDim(tse, "NMF")[["loadings"]] # for addNMF output
res[["loadings"]] # for getNMF output

Or if we have a specific method we could do:
getElement(reducedDim(tse, "NMF"), "loadings")

Do we have a special reason to stick to using attributes?

@TuomasBorman
Copy link
Contributor

The value in reducedDim() must be a matrix-like object with nrow(value) == ncol(tse). That is why it cannot be a list.

We use attributes() because that is the only way to store the loadings and other results with the matrix.

In SCE, there has been some discussion about adding some kind of better support for loadings. I think the consensus was that the support is not added, because not all ordination methods generate loadings.

As reducedDim() is only sensible place to store NMF results, we do not have any better option than using attributes().

Adding getReducedDimElement(x, "loadings") function could be possible, and should be considered. The function could be easier to use and find the loadings, and we can provide informative errors to help users with non-existing attributes.

However, the function might make the TreeSE object more ambiguous. We also need to consider this from a development perspective: should we create "phyloseq-like" utility functions if the same results can be easily achieved without them?

@antagomir
Copy link
Member Author

These are essential questions.

IMO what primarily matters is that users can save time and have clear, understandable workflows but this has to be balanced with the development & maintenance effort. I think it is also a bit case-by-case, and this case is particularly non-standard so perhaps non-standard solutions are needed as long as there is no better way?

At the minimum the documentation (manpage examples & OMA) should be developed so that the key use cases are well covered and easy to find. I bumped into this while trying to do something with loadings, and it took quite some time to figure out how to find all relevant elements and make sure they represent what I expected, even though I'm more familiar with the system than an average user.

@TuomasBorman
Copy link
Contributor

Yes, it should be improved. I think there are more advantages than disadvantages in getReducedDimElement(), and it would solve this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants