Reading H5 files directly #124

Open
boooooogey opened this issue Jun 8, 2023 · 3 comments

Comments

@boooooogey

Can I provide h5 files instead of a list of methylation files to the read.bismark function or another compatible function?

When using HDF5Array as the BACKEND, it saves data as two files: se.rds and assays.h5. I can successfully read se.rds using readRDS after moving assays.h5 to the current path. It would be more convenient if I could directly provide the paths of se.rds and assays.h5.

Is it currently possible to achieve this with the existing code version? If not, would it be challenging to implement?

@PeteHaitch
Contributor

Once you have an HDF5-backed BSseq object (i.e. you've run HDF5Array::saveHDF5SummarizedExperiment() and you have the se.rds and assays.h5 files), you can load it back into R using HDF5Array::loadHDF5SummarizedExperiment().
There's no need for read.bismark() once you're at this point, so I don't really understand what you're trying to do.
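
For anyone landing here, the save/load round trip described above can be sketched roughly as follows. This is a minimal illustration on a made-up two-locus object, not output from a real Bismark run; the directory name `bsseq_h5` and all data values are assumptions chosen for the example.

```r
library(bsseq)
library(HDF5Array)

# Hypothetical output directory (a temp location so the sketch cleans up after itself)
h5_dir <- file.path(tempdir(), "bsseq_h5")

# Tiny made-up BSseq object: 2 loci on chromosome "1", 2 samples
M   <- matrix(c(1L, 2L, 3L, 4L), ncol = 2)
Cov <- matrix(c(2L, 4L, 6L, 8L), ncol = 2)
bs  <- BSseq(chr = c("1", "1"), pos = c(100L, 200L),
             M = M, Cov = Cov, sampleNames = c("s1", "s2"))

# Writes se.rds and assays.h5 into h5_dir
saveHDF5SummarizedExperiment(bs, dir = h5_dir, replace = TRUE)

# Point the loader at the directory holding both files; no read.bismark() needed
bs_loaded <- loadHDF5SummarizedExperiment(h5_dir)
print(list.files(h5_dir))
```

Note that the loader takes the directory, so se.rds and assays.h5 are expected to sit together there rather than being passed as two separate paths.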

@sahuno

sahuno commented Jun 9, 2023

This is a good question! Thanks @PeteHaitch for the response!
As a follow-up question:
If you subset a loaded, HDF5-backed bsseq object in an R session, do you need to manually re-save it to disk before using the modified object with BSmooth()?
Here's an example after removing chromosomes Y and MT, where BSmooth() doesn't seem to recognize the modified bsseq object. How can I resolve this? Thanks!

library(bsseq)
library(BiocParallel)  # provides MulticoreParam()

# Row indices of loci on the mitochondrial and Y chromosomes
chrMT_loci <- which(bismark_bsseq@rowRanges@seqnames == "MT")
chrY_loci <- which(bismark_bsseq@rowRanges@seqnames == "Y")
chr_loci_rm <- c(chrMT_loci, chrY_loci)
bismark_bsseq <- bismark_bsseq[-chr_loci_rm,]

# Smooth the subsetted object in parallel
message("\n performing bsmoothing \n")
bismark_bsseq.fit <- BSmooth(BSseq = bismark_bsseq,
                            BPPARAM = MulticoreParam(workers = 24, progressbar = TRUE),
                            verbose = TRUE)

@PeteHaitch
Contributor

> If you subset a loaded, HDF5-backed bsseq object in an R session, do you need to manually re-save it to disk before using the modified object with BSmooth()?

No, that shouldn't be necessary.

> Here's an example after removing chromosomes Y and MT, where BSmooth() doesn't seem to recognize the modified bsseq object. How can I resolve this?

I don't understand what you mean and the code you've pasted in doesn't show any output.
If you're having a problem please post a reproducible example so we can help you.
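
As a starting point for such a reproducible example, here is a rough sketch that drops chromosomes Y and MT from a small made-up BSseq object and prints what remains. The toy data and object names are assumptions for illustration only. It uses a logical index rather than negative positions, which is one way to sidestep a general R pitfall: if `which()` matches nothing, indexing with `-integer(0)` selects zero rows instead of keeping everything.

```r
library(bsseq)

# Toy data (made-up values): 6 loci on chromosomes "1", "Y" and "MT", 2 samples
chr <- c("1", "1", "1", "Y", "MT", "MT")
pos <- c(100L, 200L, 300L, 100L, 50L, 150L)
Cov <- matrix(10L, nrow = 6, ncol = 2)
M   <- matrix(5L, nrow = 6, ncol = 2)
bs  <- BSseq(chr = chr, pos = pos, M = M, Cov = Cov,
             sampleNames = c("s1", "s2"))

# Logical index: TRUE for loci to keep. Safe even if "Y" or "MT" is absent,
# because an all-TRUE index simply keeps every row.
keep   <- !(as.character(seqnames(bs)) %in% c("Y", "MT"))
bs_sub <- bs[keep, ]

# Show the output that a bug report would need
print(dim(bs_sub))
print(table(as.character(seqnames(bs_sub))))
```

The toy object is far too small to smooth meaningfully, so the sketch stops before BSmooth(). In a real report, the printed dimensions and seqnames of the subsetted object, plus the exact BSmooth() message, would show whether the subsetting or the smoothing step is at fault.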
