mc.pl.extract_clean_data(full, name="hca_bm.one-pass.clean") crashes jupyter notebook every time #66
Comments
Can you follow the steps here #5 (comment) - that is, verify which instruction is at fault? That said, if you have compiled using --native, I don't see why the compiler would have generated anything that isn't supported...
grep flags /proc/cpuinfo | head -1
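For anyone who prefers to run the same check from inside the Python environment that crashes, here is a minimal sketch of the `grep` above. The function names and the `wanted` list are illustrative assumptions; which flags actually matter depends on how the metacells extensions were compiled.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted=("avx", "avx2", "fma")):
    """List the wanted flags the CPU does not advertise.

    The default `wanted` tuple is only an example, not a list taken
    from the metacells build configuration.
    """
    have = cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in have]


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the Linux cpuinfo file, returning '' if it is unavailable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""
```

Calling `missing_flags(read_cpuinfo())` on the affected machine should return an empty list if every flag in `wanted` is present.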
I'm a bit confused about the gdb command you used and by its output. Can you tell me what you are trying to look for?
(metacells) lindseydudley@pop-os: For help, type "help".
The thing is, you need to tell Normally this is called Sometimes it is just called Chalk all this up to the royal mess that is the migration from Python 2.* to 3.*, which still messes things up even after all these years. What we are trying to see is the actual cause of the crash, that is, which binary instruction is not supported. Then we'll have to backtrack to see how this instruction came to be generated into the binary extensions compiled by the metacells package. On its face, compiling with
Hi Oren, Sorry for the confusion, as I have never used gdb before. This is the output I get when I try to debug the specific Jupyter notebook that keeps crashing.
(gdb) r /home/lindseydudley/Desktop/Metacell/metacells-vignettes/notebooks/one-pass.ipynb
Hi Oren, After playing around with gdb and creating a test script based on the Jupyter notebook, this is the output I get. Please let me know if this is helpful or if there is anything else I can do to help find the error.
I'm a bit confused. This error seems to have nothing to do with issue #5. There the error was something like Here the error is
Here is my test_op.py script. I just took the code from the one-pass Jupyter notebook and put it into a Python script.
# Use SVG for scalable low-element-count diagrams.
#config InlineBackend.figure_formats = ["svg"]
# A matter of personal preference.
sb.set_style("white")
# Running operations on an inefficient layout can make code much slower.
# For example, summing the columns of a row-major matrix.
# By default this will just be a warning.
# We set it to be an error here to make sure the vignette does not lead you astray.
# Note that this only affects the Metacells package.
# Numpy will happily and silently take 100x longer for running such inefficient operations.
# At least, there's no way I can tell to create a warning or error for this;
# also, the implementation for "inefficient" operations could be much faster.
# The workaround in either case is to explicitly re-layout the 2D matrix before the operations.
# This turns out to be much faster, especially when the matrix can be reused.
# Note that numpy is also very slow when doing matrix re-layout,
# so the metacells package provides a function for doing it more efficiently.
# Sigh.
mc.ut.allow_inefficient_layout(False)
shutil.rmtree("../output/one-pass", ignore_errors=True)
full = ad.read_h5ad("../blobs/hca_bm.full.h5ad")
PROPERLY_SAMPLED_MIN_CELL_TOTAL = 800
total_umis_per_cell = mc.ut.get_o_numpy(full, "x", sum=True)
plot.refline(x=PROPERLY_SAMPLED_MIN_CELL_TOTAL, color="darkgreen")
plt.savefig("../output/one-pass/preliminary/figures/cell_total_umis.svg")
too_small_cells_count = np.sum(total_umis_per_cell < PROPERLY_SAMPLED_MIN_CELL_TOTAL)
total_umis_per_cell = mc.ut.get_o_numpy(full, name="x", sum=True)
print(
EXCLUDED_GENE_NAMES = [
mc.pl.exclude_genes(
mc.tl.compute_excluded_gene_umis(full)
PROPERLY_SAMPLED_MAX_EXCLUDED_GENES_FRACTION = 0.25
excluded_umis_fraction_regularization = 1e-3 # Avoid 0 values in log scale plot.
excluded_umis_fraction_per_cell += excluded_umis_fraction_regularization
plot.set(xlabel="Fraction of excluded gene UMIs", ylabel="Density", yticks=[])
plt.savefig("../output/one-pass/preliminary/figures/cell_excluded_umis_fraction.svg")
too_excluded_cells_count = np.sum(
mc.pl.exclude_cells(
clean = mc.pl.extract_clean_data(full, name="hca_bm.one-pass.clean")
full.write_h5ad("../output/one-pass/preliminary/hca_bm.full.h5ad")
clean.write_h5ad("../output/one-pass/preliminary/hca_bm.clean.h5ad")
cells = clean
LATERAL_GENE_NAMES = [
# This will mark as "lateral_gene" any genes that match the above, if they exist in the clean dataset.
mc.pl.mark_lateral_genes(
NOISY_GENE_NAMES = [
mc.pl.mark_noisy_genes(cells, noisy_gene_names=NOISY_GENE_NAMES)
# Either use the guesstimator:
max_parallel_piles = mc.pl.guess_max_parallel_piles(cells)
# Or, if running out of memory manually override:
# max_paralle_piles = ...
print(max_parallel_piles)
with mc.ut.progress_bar():
metacells =
# Assign a single value for each metacell based on the cells.
mc.tl.convey_obs_to_group(
# Compute the fraction of cells with each possible value in each metacell:
mc.tl.convey_obs_fractions_to_group(
with mc.ut.progress_bar():
min_long_edge_size = 4
cells.write_h5ad("../output/one-pass/preliminary/hca_bm.cells.h5ad")
metacells.write_h5ad("../output/one-pass/preliminary/hca_bm.metacells.h5ad")
When I run it line by line, this is the specific error that pops out.
File "/home/lindseydudley/anaconda3/envs/mcell/lib/python3.12/site-packages/metacells/utilities/logging.py", line 384, in wrapper
It seems you are missing cell 5 of the notebook https://tanaylab.github.io/metacells-vignettes/one-pass.html
Hi Oren, Thank you so much for your response! This seems to have just been a copy-paste error, and I'm so sorry. Here's my gdb output with the corrected script. Please let me know if you know what to do about this error.
[New Thread 0x7ffff306c640 (LWP 1492911)]
Program terminated with signal SIGKILL, Killed.
Hmmm - there's nothing there other than SIGKILL. Assuming this is on Linux, one possibility is that the program ran out of memory? That would be surprising unless you are running this on a machine with very little memory, since you haven't even got to the divide-and-conquer part yet. You can check this if you run
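One way to probe the out-of-memory hypothesis without gdb is to run the script as a child process and record how it ended and how much memory it peaked at. This is a minimal sketch assuming a POSIX system; `describe_exit` and `run_and_report` are illustrative names, and `ru_maxrss` is reported in KiB on Linux (other platforms may use different units).

```python
import resource
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a readable outcome."""
    if returncode < 0:
        # A negative code means the child was terminated by that signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


def run_and_report(argv):
    """Run argv as a child; return (outcome, peak RSS of children so far)."""
    completed = subprocess.run(argv)
    # Largest resident set size among waited-for children (KiB on Linux).
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return describe_exit(completed.returncode), peak
```

For example, `run_and_report([sys.executable, "test_op.py"])` would report `killed by SIGKILL` for the crash described here, together with the peak resident size to compare against the machine's RAM.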
I have attached my top output while running the program. Please let me know if anything jumps out at you that could help me debug! I appreciate all of the help because I definitely want to use your program.
436067 lindsey+ 26 6 52.1g 9.2g 123648 D 18.5 7.3 0:10.34 python |
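The row above can be read field by field. The sketch below assumes top's default column order (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND) and that unsuffixed sizes are in KiB; the helper names are illustrative.

```python
TOP_COLUMNS = ["PID", "USER", "PR", "NI", "VIRT", "RES", "SHR",
               "S", "%CPU", "%MEM", "TIME+", "COMMAND"]


def parse_top_row(row):
    """Map one row of top's task list onto the assumed default columns."""
    return dict(zip(TOP_COLUMNS, row.split(None, len(TOP_COLUMNS) - 1)))


def size_to_gib(text):
    """Convert a top size field such as '9.2g' or '123648' to GiB."""
    factors = {"k": 1 / 1024 ** 2, "m": 1 / 1024, "g": 1.0, "t": 1024.0}
    suffix = text[-1].lower()
    if suffix in factors:
        return float(text[:-1]) * factors[suffix]
    return float(text) / 1024 ** 2  # assumed: plain numbers are KiB


row = "436067 lindsey+ 26 6 52.1g 9.2g 123648 D 18.5 7.3 0:10.34 python"
fields = parse_top_row(row)
resident_gib = size_to_gib(fields["RES"])
# RES divided by the %MEM fraction gives a rough estimate of total RAM.
estimated_total_gib = resident_gib / (float(fields["%MEM"]) / 100)
```

Under those assumptions, this snapshot shows about 9.2 GiB resident (7.3% of memory) and the process in state `D` (uninterruptible sleep, typically waiting on I/O). The implied total of roughly 126 GiB is consistent with a 128 GB machine, so this single frame does not by itself show memory exhaustion.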
This is my top output when running it interactively. My computer actually has 128 GB of RAM and a CPU, so I don't think that memory is the limiting factor.
"top when doing it interactively" is more of a movie than a picture. |
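To get the "movie" rather than a single frame, the resident size of the process can be sampled repeatedly until it disappears. This is a minimal Linux-only sketch that reads `VmRSS` from `/proc/<pid>/status`; the function names and the one-second default interval are illustrative choices.

```python
import time


def parse_vm_rss_kib(status_text):
    """Extract the VmRSS value (in kB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def sample_rss(pid, interval_s=1.0, max_samples=None):
    """Collect (timestamp, rss_kib) pairs until the process goes away."""
    samples = []
    while max_samples is None or len(samples) < max_samples:
        try:
            with open("/proc/%d/status" % pid) as f:
                rss = parse_vm_rss_kib(f.read())
        except OSError:
            break  # the process has exited (or /proc is unavailable)
        if rss is None:
            break  # no resident-size line to read for this process
        samples.append((time.time(), rss))
        if max_samples is None or len(samples) < max_samples:
            time.sleep(interval_s)
    return samples
```

Running `sample_rss(pid)` in a second terminal while the notebook executes gives a time series; the last few samples before the kill are the interesting ones.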
Looking at my logs, rfkill seems to be the issue, but I'm not sure why. This is the log:
Apr 19 13:18:04 pop-os kernel: [ 144.258322] rfkill: input handler disabled
rfkill is some tool for dis/enabling wireless devices, I don't think it is relevant. I also don't see any "killed" log messages. Perhaps something in https://stackoverflow.com/questions/726690/what-killed-my-process-and-why can help? |
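When the Linux out-of-memory killer terminates a process, the kernel log normally contains a line with the text `Killed process <pid> (<name>)`. The sketch below scans log text for such lines; the pattern and function name are illustrative, and the exact wording of the surrounding message varies between kernel versions.

```python
import re

# Matches the "Killed process 1234 (python)" part of an OOM-killer message.
OOM_PATTERN = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return (pid, command) pairs for every OOM kill found in the log."""
    return [(int(m.group(1)), m.group(2))
            for m in OOM_PATTERN.finditer(log_text)]
```

Feeding it the output of `dmesg` or `journalctl -k` (saved to a file, for instance) returns an empty list when no process was killed for memory reasons, as it does for the rfkill line quoted earlier in the thread.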
Hi Oren,
Thank you so much for creating this program! I am running into a little trouble running the one-pass vignette in my conda environment. My Jupyter notebook crashes every time I run the mc.pl.extract_clean_data command. I believe this is similar to closed issue 5, but none of the workarounds in it ended up helping me. I have tried installing the dependencies using conda as was suggested, and I have used pip to install metacells both regularly and with the native flag, but it doesn't seem to help at all. I would really appreciate your help debugging. I am running metacells 0.9.4 and conda 24.1.2, and my operating system is the latest release of Pop!_OS, which is a Linux system. I also grepped to make sure I had the avx flags, and I did. Thank you so much for your help, and please let me know if you need any more information to try to solve this.