
stereoscope implementation #851

Closed
giovp opened this issue Nov 10, 2020 · 14 comments · Fixed by #852

@giovp
Member

giovp commented Nov 10, 2020

Hi @romain-lopez ,

Following almaan/stereoscope#18, I finally got time to check out your implementation of stereoscope in the romain/spatial branch. I don't have much to comment on, it's great!
I really like the decoupling between the sc model and the spatial model; it makes it super convenient to fit one model on e.g. an atlas and reuse it multiple times for Visium slides (sketched at the end of this comment).
In terms of the default n_epochs, the values also made a lot of sense for the data I tried it on. Maybe I would be more relaxed with the spatial model (I increased it to 7k), but of course this argument is very dependent on data/settings.

Do you still plan to merge it to master? I'm sure many would use it.
Thank you again! 🎉
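
To make the fit-once / reuse-many-times point concrete, here is roughly the workflow I mean. This is only a sketch: apart from stStereoscope, the class name and the get_params/get_proportions accessors are hypothetical stand-ins, not the actual API of the branch.

# Sketch of the decoupled workflow (hypothetical names marked below).
sc_model = scStereoscope(sc_adata)                  # hypothetical sc model class; fit once on the atlas
sc_model.train(n_epochs=100)
params = sc_model.get_params()                      # hypothetical accessor for the learned cell type signatures

for st_adata in (visium_slide_1, visium_slide_2):   # reuse the same signatures across Visium slides
    spatial_model = stStereoscope(st_adata, params)
    spatial_model.train(lr=0.01, n_epochs=7000, train_size=1)
    proportions = spatial_model.get_proportions()   # hypothetical; per-spot cell type proportions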

@giovp giovp added the question label Nov 10, 2020
@romain-lopez
Member

Hi @giovp,

Thanks for your comment 👍

We are actually working on a v1.0 version of scvi-tools, in which we will feature this reimplementation of stereoscope. The front end probably won't change, but we are doing more work on the backend (@adamgayoso and @galenxing are working on the interaction with PyTorch Lightning). Ideally, it will become even more straightforward to implement other models in the codebase.

For the specific case of stereoscope, I expect this feature will be merged into master in a month or so? @adamgayoso, what do you think?

@adamgayoso
Member

Yeah a month sounds about right, though an official release might take a bit longer. Excited to have this method in the codebase!

@cartal

cartal commented Nov 18, 2020

Hi, this is great!

For the lesser mortals who have lots of CPU and time but no access to a GPU, is there a way to run this without one? I somehow thought that by passing use_cuda = False I would override the GPU use.

At the moment I have tried to run this step:

spatial_model = stStereoscope(st_adata, params, use_cuda = False)
spatial_model.train(lr = 0.01, n_epochs = 3000, train_size = 1, frequency = 5)

But get the following error:

INFO     Training for 3000 epochs                                                            
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-50-fdee515c64dc> in <module>
      1 spatial_model = stStereoscope(st_adata, params, use_cuda = False)
----> 2 spatial_model.train(lr = 0.01, n_epochs = 3000, train_size = 1, frequency = 5)

~/tools/scvi-tools/scvi/model/stereoscope.py in train(self, n_epochs, train_size, test_size, lr, frequency, train_fun_kwargs, **kwargs)
    201         if "lr" not in train_fun_kwargs:
    202             train_fun_kwargs["lr"] = lr
--> 203         self.trainer.train(**train_fun_kwargs)
    204         self.is_trained_ = True
    205 

~/tools/scvi-tools/scvi/core/trainers/trainer.py in train(self, n_epochs, lr, eps, params, **extras_kwargs)
    183         self.compute_metrics_time = 0
    184         self.n_epochs = n_epochs
--> 185         self.compute_metrics()
    186 
    187         self.on_training_begin()

~/miniconda3/lib/python3.8/site-packages/torch/autograd/grad_mode.py in decorate_no_grad(*args, **kwargs)
     47         def decorate_no_grad(*args, **kwargs):
     48             with self:
---> 49                 return func(*args, **kwargs)
     50         return decorate_no_grad
     51 

~/tools/scvi-tools/scvi/core/trainers/trainer.py in compute_metrics(self)
    149                                 computed.add(out_str)
    150                     for metric in self.metrics_to_monitor:
--> 151                         result = getattr(scdl, metric)()
    152                         out_str = metric + "_" + name
    153                         self.history[out_str] += [result]

~/tools/scvi-tools/scvi/core/data_loaders/scvi_data_loader.py in elbo(self)
    364             ind_x = tensors["ind_x"]
    365             labels = tensors[_CONSTANTS.LABELS_KEY]
--> 366             reconst_loss, local_kl, global_kl = self.model(sample_batch, labels, ind_x)
    367             log_lkl += torch.sum(reconst_loss + local_kl).item()
    368         n_samples = len(self.indices)

~/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~/tools/scvi-tools/scvi/core/modules/ldeconv.py in forward(self, x, y, ind_x)
    315         """
    316         # Parameters for z latent distribution
--> 317         outputs = self.inference(x, ind_x)
    318         px_rate = outputs["px_rate"]
    319         px_o = outputs["px_o"]

~/tools/scvi-tools/scvi/core/modules/ldeconv.py in inference(self, x, ind_x)
    276 
    277         if self.use_cuda:
--> 278             w = w.cuda()
    279             px_o = self.px_o.cuda()
    280             eps = eps.cuda()

~/miniconda3/lib/python3.8/site-packages/torch/cuda/__init__.py in _lazy_init()
    194             raise RuntimeError(
    195                 "Cannot re-initialize CUDA in forked subprocess. " + msg)
--> 196         _check_driver()
    197         torch._C._cuda_init()
    198         _cudart = _load_cudart()

~/miniconda3/lib/python3.8/site-packages/torch/cuda/__init__.py in _check_driver()
     96         if torch._C._cuda_getDriverVersion() == 0:
     97             # found no NVIDIA driver on the system
---> 98             raise AssertionError("""
     99 Found no NVIDIA driver on your system. Please check that you
    100 have an NVIDIA GPU and installed a driver from

AssertionError: 
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

@romain-lopez
Member

Hi!

Thanks for pointing this out. We are working through API changes that are pretty foundational for the back end. I will address this in the next couple of weeks, once Adam and Galen have merged these back-end changes (PyTorch Lightning!).

Happy that you're interested in using this 👍 It'll come out soon with the CPU version!
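
In the meantime, for reference: the CPU support essentially amounts to the standard device-agnostic PyTorch pattern instead of the hard-coded .cuda() calls in inference(). A rough sketch of the idea (not the actual patch):

import torch

# Pick the device once instead of branching on use_cuda inside the module.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

w = torch.randn(10, 5)   # stand-ins for the module tensors in this sketch
px_o = torch.randn(5)

# .to(device) is a no-op on CPU-only machines, so the same code path runs with or without a GPU.
w = w.to(device)
px_o = px_o.to(device)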

Best,
Romain

@romain-lopez
Member

If you really want to run this, Google Colab should work perfectly though :)

@cartal

cartal commented Nov 18, 2020

This is super exciting, thank you!

@romain-lopez
Member

Some more progress on this: I have found that adding a per-cell scaling factor to the deconvolution model gives better performance on simulations (a 25% improvement in MSE for predicting cell type proportions) AND speeds up convergence (between 5x and 10x faster to reach the local minimum of the objective function). These results are on a separate branch for now, and we will have to discuss whether we want to provide the vanilla version or my slight modification in the codebase.
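
To be concrete about what the per-cell (per-spot) scaling factor does, here is a rough sketch with made-up tensor names, not the actual branch code:

import torch

n_spots, n_types, n_genes = 100, 8, 2000
v = torch.rand(n_spots, n_types)                    # cell type abundances per spot (learned)
w = torch.rand(n_types, n_genes)                    # cell type signatures from the sc model
beta = torch.ones(n_spots, 1, requires_grad=True)   # extra learnable per-spot scaling factor

px_rate_vanilla = v @ w                             # vanilla stereoscope-style mean expression per spot
px_rate_scaled = beta * (v @ w)                     # modified mean with the per-spot scaling factor

The intuition is that beta absorbs per-spot variation (e.g. depth) that the abundances would otherwise have to soak up, which is where the MSE and convergence gains seem to come from.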

@vitkl
Contributor

vitkl commented Nov 23, 2020

Interesting, I also see an improvement in the accuracy of estimating cell proportions on simulated data when adding a per-location scaling factor (in cell2location). However, it breaks inference of absolute cell abundance on simulated data (which cell2location estimates as well as proportions; Fig 1E, https://www.biorxiv.org/content/10.1101/2020.11.15.378125v1.full), and on real data too. So I would be concerned about over-normalising by adding that parameter and breaking the mapping for some cell types.
By the way, it would be great if you could add cell2location to your benchmarks.

@romain-lopez
Member

That's interesting, thanks for mentioning it! I had not actually considered estimating the absolute cell abundance.

@vitkl
Contributor

vitkl commented Nov 23, 2020

I think it is not possible to do that without informative priors, because the per-gene and per-location scaling factors are non-identifiable with respect to the cell type abundance parameters. We also see a minor improvement when using absolute cell abundance as a predictor for detecting presence/absence of cell types (a Fig 1C-type analysis; area 0.83 -> 0.84). But more importantly, you can do downstream analysis on the estimates directly, without the usual challenges of proportional data.
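
To spell out the degeneracy with a toy numpy check (my own notation, not either codebase): scaling the abundances up and the per-location factor down by the same constant leaves the modelled mean unchanged, so only informative priors can pin down the absolute scale.

import numpy as np

rng = np.random.default_rng(0)
mu = rng.gamma(2.0, 1.0, size=(3, 50))      # cell type x gene reference signatures
w = rng.gamma(2.0, 1.0, size=(10, 3))       # absolute cell abundances per location
beta = rng.gamma(2.0, 1.0, size=(10, 1))    # per-location scaling factor

rate_a = beta * (w @ mu)                    # parameterisation (beta, w)
rate_b = (beta / 5.0) * ((5.0 * w) @ mu)    # parameterisation (beta/5, 5*w)
print(np.allclose(rate_a, rate_b))          # True -> the absolute abundance scale is not identifiable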

@vitkl
Contributor

vitkl commented Jan 20, 2021 via email

@romain-lopez
Member

No, it is not; this is an almost line-by-line reimplementation of stereoscope. It only differs in a couple of minor points, which we will detail in the docs. Thanks for pointing this out.

@vitkl
Contributor

vitkl commented Jan 20, 2021

Thanks for clarifying! It is quite easy to use, so thank you for the reimplementation!

@adamgayoso
Member

adamgayoso commented Jan 22, 2021

Thanks @vitkl -- we might end up changing a few things to make it a tiny bit more user-friendly before the official release.
