Help with the installation on colab #241

Open
bitcometz opened this issue Aug 16, 2024 · 2 comments

Comments

@bitcometz

Hello, and thanks for this great tool!

I followed the installation tutorial and got some errors:

import os
import sys

if "google.colab" in sys.modules:
    print("Running on Google Colab")
    print("Installing dependencies...")
    !pip install -U scgpt
    # the optional flash-attn dependency is skipped on Colab
    !pip install wandb louvain

    # NOTE: May need to restart runtime after the installation

    print("Downloading data and model ckpt...")
    !pip install -q -U gdown
    import gdown

import scvi
adata = scvi.data.pbmc_dataset()

Errors:

INFO     File data/gene_info_pbmc.csv already downloaded                                                           
INFO     File data/pbmc_metadata.pickle already downloaded                                                         
INFO     File data/pbmc8k/filtered_gene_bc_matrices.tar.gz already downloaded                                      
INFO     Extracting tar file                                                                                       
INFO     Removing extracted data at data/pbmc8k/filtered_gene_bc_matrices                                          
INFO     File data/pbmc4k/filtered_gene_bc_matrices.tar.gz already downloaded                                      
INFO     Extracting tar file                                                                                       
INFO     Removing extracted data at data/pbmc4k/filtered_gene_bc_matrices                                          
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-16-2b688d4bab92> in <cell line: 1>()
      1 if dataset_name == "PBMC_10K":
----> 2     adata = scvi.data.pbmc_dataset()  # 11990 × 3346
      3     ori_batch_col = "batch"
      4     adata.obs["celltype"] = adata.obs["str_labels"].astype("category")
      5     adata.var = adata.var.set_index("gene_symbols")

2 frames
/usr/local/lib/python3.10/dist-packages/numpy/__init__.py in __getattr__(attr)
    322 
    323         if attr in __former_attrs__:
--> 324             raise AttributeError(__former_attrs__[attr])
    325 
    326         if attr == 'testing':

AttributeError: module 'numpy' has no attribute 'str'.
`np.str` was a deprecated alias for the builtin `str`. To avoid this error in existing code, use `str` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.str_` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

Could you help with this problem? Thanks!

Best
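A note on the traceback above: `np.str` was removed in NumPy 1.24, so any code path that still reads it raises this `AttributeError`. One possible workaround, sketched below, is to put the removed aliases back before calling `scvi.data.pbmc_dataset()`. The helper name `restore_removed_aliases` is hypothetical (it is not part of NumPy, scvi-tools, or scGPT), and pinning an older NumPy (for example `numpy<1.24`) is another option that may or may not fit the other packages in the environment.

```python
import warnings

# Aliases that NumPy deprecated in 1.20 and removed in 1.24, mapped to the
# builtins they used to point at.
_REMOVED_ALIASES = {
    "str": str,
    "int": int,
    "float": float,
    "bool": bool,
    "object": object,
    "complex": complex,
}


def restore_removed_aliases(module):
    """Hypothetical helper: re-add missing aliases to `module`, return their names."""
    restored = []
    for name, builtin in _REMOVED_ALIASES.items():
        with warnings.catch_warnings():
            # Older NumPy versions only warn on access; silence that here.
            warnings.simplefilter("ignore")
            missing = not hasattr(module, name)
        if missing:
            setattr(module, name, builtin)
            restored.append(name)
    return restored


try:
    import numpy as np
except ImportError:  # NumPy not installed; nothing to patch
    np = None

if np is not None:
    # Run this before the call that fails, e.g. scvi.data.pbmc_dataset().
    restore_removed_aliases(np)
```

Existing attributes are never overwritten, so the shim should be harmless on NumPy versions that still (or again) define some of these names. It is a stopgap; upgrading the package that still uses `np.str` is the cleaner fix when a newer release is available.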

@kocemir

kocemir commented Aug 17, 2024

Which task is this for? I can try to help you with the annotation task.

First, upload the datasets and pretrained models to a folder in your Google Drive. Then mount the Drive in Colab so that you can point your paths at that folder. After that, run the following:

# !pip install scgpt  # use this line instead if you don't want flash-attn
!pip install scgpt "flash-attn<1.0.5"  # this takes a long time

!pip install wandb

import copy
import gc
import json
import os
from pathlib import Path
import shutil
import sys
import time
import traceback
from typing import List, Tuple, Dict, Union, Optional
import warnings
import pandas as pd

import pickle
import torch
from anndata import AnnData
import scanpy as sc
import scvi
import seaborn as sns
import numpy as np
import wandb
from scipy.sparse import issparse
import matplotlib.pyplot as plt
from torch import nn
from torch.nn import functional as F
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score
from torchtext.vocab import Vocab
from torchtext._torchtext import (
    Vocab as VocabPybind,
)
from sklearn.metrics import confusion_matrix

sys.path.insert(0, "../")
import scgpt as scg
from scgpt.model import TransformerModel, AdversarialDiscriminator
from scgpt.tokenizer import tokenize_and_pad_batch, random_mask_value
from scgpt.loss import (
    masked_mse_loss,
    masked_relative_error,
    criterion_neg_log_bernoulli,
)
from scgpt.tokenizer.gene_tokenizer import GeneVocab
from scgpt.preprocess import Preprocessor
from scgpt import SubsetsBatchSampler
from scgpt.utils import set_seed, category_str2int, eval_scib_metrics

sc.set_figure_params(figsize=(6, 6))
os.environ["KMP_WARNINGS"] = "off"
warnings.filterwarnings('ignore')

This will probably work.
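The Drive step described at the start of this comment could look roughly like the sketch below. The folder names (`scgpt`, `data`, `scGPT_human`), the helper `resolve_paths`, and the list of expected checkpoint files are all assumptions for illustration; adjust them to whatever you actually uploaded.

```python
import sys
from pathlib import Path


def resolve_paths(base_dir, data_subdir="data", model_subdir="scGPT_human"):
    """Hypothetical helper: build data/model paths and list missing checkpoint files."""
    base = Path(base_dir)
    data_dir = base / data_subdir
    model_dir = base / model_subdir
    # Assumed checkpoint layout; change this list to match your download.
    expected = ["args.json", "best_model.pt", "vocab.json"]
    missing = [name for name in expected if not (model_dir / name).exists()]
    return data_dir, model_dir, missing


if "google.colab" in sys.modules:
    # Only available inside a Colab runtime.
    from google.colab import drive

    drive.mount("/content/drive")
    # "MyDrive/scgpt" is an assumed upload location.
    data_dir, model_dir, missing = resolve_paths("/content/drive/MyDrive/scgpt")
    if missing:
        print(f"Missing files in {model_dir}: {missing}")
```

Checking for missing files right after mounting tends to surface a wrong folder name early, before a long-running cell fails on a bad path.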

@bitcometz
Author

@kocemir, thanks for your help!

Yes, I want to do the annotation task!

You are right that adding "flash-attn<1.0.5" takes a really long time!
I am using the free Colab GPU resources, so I cannot finish the installation when flash-attn is included.

Best
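Since flash-attn is optional, one way to keep a single notebook working with or without it is to detect at runtime whether the package is importable and branch on the result. This is only a sketch: the helper `module_available` and the variable `has_flash_attn` are hypothetical names, and how the result gets passed on depends on the tutorial code being run.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


# flash-attn installs a top-level module named "flash_attn".
has_flash_attn = module_available("flash_attn")

if has_flash_attn:
    print("flash-attn found; the fast attention path can be enabled.")
else:
    print("flash-attn not found; falling back to the standard attention path.")
```

If the notebook has a switch for the fast transformer path, `has_flash_attn` could feed that switch, so the slow install can be skipped on free Colab runtimes.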
