Memory error of adj_mat.todense() #3

Open
LucaGiudice opened this issue Jun 25, 2018 · 1 comment

@LucaGiudice

Dear programmer, sorry to interrupt you, but I have a problem with a large gene interaction network and its corresponding adjacency matrix. The following error occurs:

Traceback (most recent call last):
  File "/home/pyNBS/network_propagation.py", line 14, in normalize_network
    subgraph_norm = normalize_network(network, symmetric_norm=False)
  File "/home/pyNBS/network_propagation.py", line 17, in normalize_network
    tmp_dense = adj_mat.todense()
  File "/home/anaconda2/lib/python2.7/site-packages/scipy/sparse/base.py", line 721, in todense
    return np.asmatrix(self.toarray(order=order, out=out))
  File "/home/anaconda2/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 964, in toarray
    return self.tocoo(copy=False).toarray(order=order, out=out)
  File "/home/anaconda2/lib/python2.7/site-packages/scipy/sparse/coo.py", line 252, in toarray
    B = self._process_toarray_args(order, out)
  File "/home/anaconda2/lib/python2.7/site-packages/scipy/sparse/base.py", line 1039, in _process_toarray_args
    return np.zeros(self.shape, dtype=self.dtype, order=order)
MemoryError

Using the commands:

print type(adj_mat)
print adj_mat.shape

I found that adj_mat is:

<class 'scipy.sparse.csr.csr_matrix'>
(119290, 119290)

The relevant packages are the following (the full list of packages is attached):
packages_conda.txt

scipy 0.19.0 np112py27_0
numpy 1.12.1 py27_0
networkx 1.11 py27_0

I think the dense matrix returned by todense() would require at least 113 GB of memory (119290 × 119290 entries, assuming 8 bytes per entry), which I do not have. If the problem cannot be solved with at most 64 GB of RAM, would it be possible to use hard disk space instead?
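
The 113 GB figure can be checked with a short calculation. This is only a sketch: it assumes the matrix holds 8-byte float64 values, which the output above does not confirm, and `dense_bytes` is a hypothetical helper name, not part of pyNBS or SciPy.

```python
import numpy as np

def dense_bytes(shape, dtype=np.float64):
    """Bytes needed to hold a dense array of the given shape and dtype."""
    n_rows, n_cols = shape
    return n_rows * n_cols * np.dtype(dtype).itemsize

# Shape reported by adj_mat.shape; float64 is an assumption.
shape = (119290, 119290)
needed = dense_bytes(shape)
print("%.1f GB (%.1f GiB)" % (needed / 1e9, needed / 2.0**30))
# prints: 113.8 GB (106.0 GiB)
```

Under the same assumption, a float32 copy would need half of that (about 57 GB), which is still close to the 64 GB limit before any intermediate arrays are counted.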

@justinkhuang
Collaborator

Hi Luca,
I am sorry about this issue. We convert the matrix to a dense array here because, on a machine with multiple threads available, it can speed up some of the matrix multiplications performed in the algorithm. I believe the multiplications could be performed with the matrix kept as a sparse object, but I am unsure whether the matrix inversion that needs to take place is possible on a sparse matrix. Also, while the sparse format suits the adjacency matrix at first (biological networks are generally sparse), the matrix becomes dense after the propagation step.

I know this is probably not the answer you are looking for. My best suggestion at this time is, unfortunately, to find a bigger machine, unless someone else has a better solution...
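
For readers exploring the sparse route mentioned above, one possible direction is to replace the explicit inversion with a fixed-point iteration that only needs sparse-times-dense products. The sketch below is not pyNBS code. It assumes a random-walk-with-restart style update, F <- alpha * F.A + (1 - alpha) * F0, on a row-normalized adjacency matrix, and the names `row_normalize`, `propagate`, `alpha`, `tol` and `max_iter` are hypothetical. It saves memory only if the propagated matrix F0 has far fewer rows than the network has nodes (for example, one row per sample); propagating a full identity matrix would produce the same dense n-by-n result as before.

```python
import numpy as np
import scipy.sparse as sp

def row_normalize(adj):
    """Divide each row of a sparse matrix by its row sum, staying sparse."""
    adj = sp.csr_matrix(adj, dtype=np.float64)
    degree = np.asarray(adj.sum(axis=1)).ravel()
    degree[degree == 0] = 1.0  # isolated nodes keep an all-zero row
    return sp.diags(1.0 / degree).dot(adj).tocsr()

def propagate(adj_norm, f0, alpha=0.7, tol=1e-8, max_iter=500):
    """Iterate F <- alpha * F.A + (1 - alpha) * F0 until F stops changing.

    adj_norm: sparse (n_nodes, n_nodes) normalized adjacency matrix.
    f0: dense (n_rows, n_nodes) starting values, n_rows << n_nodes assumed.
    """
    f0 = np.asarray(f0, dtype=np.float64)
    f = f0.copy()
    adj_t = adj_norm.T.tocsr()
    for _ in range(max_iter):
        # F.A is computed as (A^T . F^T)^T so the sparse matrix is on the left.
        f_next = alpha * adj_t.dot(f.T).T + (1.0 - alpha) * f0
        if np.abs(f_next - f).max() < tol:
            return f_next
        f = f_next
    return f  # not converged within max_iter; returned as-is

# Tiny 4-node example; the real network would be loaded elsewhere.
adj = sp.csr_matrix(np.array([[0, 1, 1, 0],
                              [1, 0, 1, 0],
                              [1, 1, 0, 1],
                              [0, 0, 1, 0]], dtype=np.float64))
adj_norm = row_normalize(adj)
f0 = np.eye(4)[:2]              # two rows to propagate
f = propagate(adj_norm, f0)     # dense (2, 4) result
```

With `alpha` below 1 and a row-normalized matrix, the iteration converges to the same values as the closed form (1 - alpha) * F0 * inv(I - alpha * A), so no n-by-n dense matrix is ever allocated. Whether this matches the normalization and propagation used in `network_propagation.py` would need to be checked against the package source.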
