Wrong shape after indexing a single element with a slice #754

Closed
ben-bou opened this issue Apr 8, 2021 · 4 comments · Fixed by #758

@ben-bou
Collaborator

ben-bou commented Apr 8, 2021

Description
As seen in the code below, after a __getitem__ that pulls only one element from one process and leaves the other processes empty, the gshape is wrong.

To Reproduce
Steps to reproduce the behavior:

  1. Which module/class/function is affected?
    __getitem__
  2. What are the circumstances under which the bug appears?
    one process has one element, the other process is empty.
  3. What is the exact error message / erroneous behavior?
    wrong gshape and lshape

Illustrative

import heat as ht

arr = ht.ones((1,10), split=1)

res = arr[:,0:1]

print('rank:', ht.MPI_WORLD.rank, 'gshape:', res.gshape, 'lshape:', res.lshape, 'split:', res.split)

Run with one process, the output is as expected:

rank: 0 gshape: (1, 1) lshape: (1, 1) split: 1

Run with 2 processes:

rank: 0 gshape: (1,) lshape: (1, 1) split: 1
rank: 1 gshape: (1,) lshape: (0,) split: 1
/home/b.bourgart/anaconda3/lib/python3.7/site-packages/heat/core/dndarray.py:1598: ResourceWarning: This process (rank: 1) is without data after slicing, running the .balance_() function is recommended
  ResourceWarning,

The gshape has only one dimension, but should have two because a slice was used for indexing. The lshape of rank 1 should also have two dimensions (#656).
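The expected shapes above follow the usual NumPy-style rule for basic indexing: an integer index removes a dimension, while a slice keeps it, even when it selects a single element. A minimal sketch of that rule in plain Python, assuming only integers and slices as keys (`expected_shape` is a hypothetical helper for illustration, not part of Heat):

```python
def expected_shape(shape, key):
    """Global shape after basic indexing: ints drop a dimension, slices keep it."""
    if not isinstance(key, tuple):
        key = (key,)
    result = []
    for dim, size in enumerate(shape):
        if dim >= len(key):
            # Dimensions not covered by the key are left untouched.
            result.append(size)
            continue
        k = key[dim]
        if isinstance(k, slice):
            # A slice keeps the dimension; its length may be 1 or even 0.
            result.append(len(range(*k.indices(size))))
        elif isinstance(k, int):
            # An integer index removes the dimension entirely.
            continue
        else:
            raise TypeError("only int and slice keys are handled in this sketch")
    return tuple(result)


# arr[:, 0:1] on a (1, 10) array keeps both dimensions.
print(expected_shape((1, 10), (slice(None), slice(0, 1))))  # (1, 1)
# arr[:, 0] drops the second dimension.
print(expected_shape((1, 10), (slice(None), 0)))  # (1,)
```

Under this rule, `arr[:,0:1]` should report a gshape of `(1, 1)` regardless of the number of processes.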

Expected behavior
See above.

Version Info
0.5.1; also on master branch


@ben-bou ben-bou changed the title from "Wrong shape after __getitem__ (with empty process)" to "Wrong shape after indexing a single element with a slice" on Apr 8, 2021
@ben-bou
Collaborator Author

ben-bou commented Apr 12, 2021

Additionally, consider the following code (using the current master branch):

import heat as ht
mask3D = ht.random.randn(2,2,2, split=0)
print('before where |', 'rank:', ht.MPI_WORLD.rank, 'gshape:', mask3D[0].gshape, 'lshape:', mask3D[0].lshape, 'split:', mask3D[0].split, flush=True); ht.MPI_WORLD.Barrier()  # organize output
mask2D = ht.where(mask3D[0] > 0, 1., 0.)
print(' after where |', 'rank:', ht.MPI_WORLD.rank, 'gshape:', mask2D.gshape, 'lshape:', mask2D.lshape, 'split:', mask2D.split, flush=True)

With one process, this results in:

before where | rank: 0 gshape: (2, 2) lshape: (2, 2) split: 0
 after where | rank: 0 gshape: (2, 2) lshape: (2, 2) split: 0

With two processes:

before where | rank: 0 gshape: (2, 2) lshape: (2, 2) split: 0
before where | rank: 1 gshape: (2, 2) lshape: (0,) split: 0

Traceback (most recent call last):
  File "chunks.py", line 13, in <module>
    mask2D = ht.where(mask3D[0] > 0, 1., 0.)
  File "/home/b.bourgart/anaconda3/lib/python3.7/site-packages/heat/core/indexing.py", line 148, in where
    return cond.dtype(cond == 0) * y + cond * x
  File "/home/b.bourgart/anaconda3/lib/python3.7/site-packages/heat/core/types.py", line 98, in __new__
    array, dtype=cls, is_split=value[0].split, comm=comm, device=device
  File "/home/b.bourgart/anaconda3/lib/python3.7/site-packages/heat/core/factories.py", line 416, in array
    raise ValueError("unable to construct tensor, shape of local data chunk does not match")
ValueError: unable to construct tensor, shape of local data chunk does not match

My guess is that because of the lshape (0,), this belongs to this issue. Is that correct or should I open a new issue?
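To make the suspicion concrete: an empty local chunk would still be expected to have the same number of dimensions as the global array and to match it in every dimension except the split one. The following is a hypothetical consistency check written for illustration only; it is not the check that Heat performs in factories.py.

```python
def chunk_matches_global(gshape, lshape, split):
    """Return True if a local chunk shape is compatible with the global shape.

    Assumption for this sketch: the chunk must have the same number of
    dimensions as the global array, agree with it on every non-split axis,
    and hold at most the global extent along the split axis.
    """
    if len(lshape) != len(gshape):
        return False
    for axis, (local, glob) in enumerate(zip(lshape, gshape)):
        if axis == split:
            if local > glob:
                return False
        elif local != glob:
            return False
    return True


# Shapes reported in the two-process run above, with split=0.
print(chunk_matches_global((2, 2), (2, 2), 0))  # True  (rank 0)
print(chunk_matches_global((2, 2), (0,), 0))    # False (rank 1 as reported)
print(chunk_matches_global((2, 2), (0, 2), 0))  # True  (empty chunk with two dimensions)
```

An lshape of `(0, 2)` on rank 1 would pass such a check, whereas the reported `(0,)` does not.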

@coquelin77
Member

unfortunately, some things from traditional python don't translate into heat. in the additional example, the issue comes from the slicing (mask3D[0]).

when slicing a distributed object along the dimension along which it is distributed (i.e. slicing on dimension 0 with split == 0), the result is not balanced between the processes. to correct this, all one needs to do is call the balance_() function.

import heat as ht
mask3D = ht.random.randn(2,2,2, split=0)
print('before where |', 'rank:', ht.MPI_WORLD.rank, 'gshape:', mask3D[0].gshape, 'lshape:', mask3D[0].lshape, 'split:', mask3D[0].split, flush=True); ht.MPI_WORLD.Barrier()  # organize output

sliced = mask3D[0]
sliced.balance_()

mask2D = ht.where(sliced > 0, 1., 0.)
print(' after where |', 'rank:', ht.MPI_WORLD.rank, 'gshape:', mask2D.gshape, 'lshape:', mask2D.lshape, 'split:', mask2D.split, flush=True)

an in-line (out of place) balance function is in the works as well
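A rough illustration of why slicing along the split dimension leaves the data unbalanced, and what balancing restores. This is a plain-Python sketch that only counts elements per rank; it assumes an even block distribution with the remainder going to the first ranks, and both helper names are hypothetical rather than Heat functions.

```python
def block_counts(n, nprocs):
    """Elements per rank for an even block distribution (assumed layout)."""
    base, rem = divmod(n, nprocs)
    return [base + (1 if rank < rem else 0) for rank in range(nprocs)]


def counts_after_slice(n, nprocs, sl):
    """Elements each rank keeps when a slice is applied locally, without communication."""
    selected = range(*sl.indices(n))
    kept = []
    start = 0
    for count in block_counts(n, nprocs):
        owned = range(start, start + count)  # global indices held by this rank
        kept.append(sum(1 for i in selected if i in owned))
        start += count
    return kept


# Ten elements along the split axis on two ranks, then [0:4] along that axis.
unbalanced = counts_after_slice(10, 2, slice(0, 4))
print(unbalanced)  # [4, 0] -> rank 1 is left without data

# Balancing redistributes the surviving elements evenly again.
print(block_counts(sum(unbalanced), 2))  # [2, 2]
```

Note that a result with a single element, as in the original report (`arr[:,0:1]`), stays at `[1, 0]` even after balancing, so an empty rank cannot always be avoided.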

as for the main body of this issue, this requires more searching. it appears to be a mismatch between the actual shape of single-element arrays and their defined shape.

@ben-bou
Collaborator Author

ben-bou commented Apr 12, 2021

Hi, thanks for your reply!
The additional example is a reference to @ClaudiaComito's comment that the lshape being (0,) is also an issue.

Balancing sounds great. However, do you also recommend it when setting a single layer, as in the following:

import heat as ht
something3D = ht.zeros((2,2,2), split=0)
mask3D = ht.random.randn(2,2,2, split=0)
print('before where |', 'rank:', ht.MPI_WORLD.rank, 'gshape:', mask3D[0].gshape, 'lshape:', mask3D[0].lshape, 'split:', mask3D[0].split, flush=True); ht.MPI_WORLD.Barrier()  # organize output

something3D[0] = ht.where(mask3D[0] > 0, 1., 0.)
print(' after where |', 'rank:', ht.MPI_WORLD.rank, 'gshape:', something3D.gshape, 'lshape:', something3D.lshape, 'split:', something3D.split, flush=True)

Of course, splitting along a different dimension than the one I'm slicing would be the best solution. Unfortunately, I have to slice along every dimension eventually. Currently I'm using:

something3D[0] = ht.where(mask3D > 0, 1., 0.)[0]

Which I think is still faster than balancing and redistributing or resplitting.

@ClaudiaComito ClaudiaComito self-assigned this Apr 12, 2021
@ClaudiaComito
Contributor

ClaudiaComito commented Apr 14, 2021

@ben-bou could you try out this branch and see if you get what you expect out of parallel slicing?

Update: single-node is fixed as well. (Earlier note, since retracted: the tests don't run yet, I'm afraid I've neglected the single-node part.)

@ClaudiaComito ClaudiaComito linked a pull request Apr 15, 2021 that will close this issue