Bugfix: _like factories now default to same device as input DNDarray (#1443)

* bugfix

* removed print

* set appropriate default device in __factory_like

* undo explicit device setting

* fix cond dtype setting

* fix typo

* deleting benchmarks 2020

* pacify pre-commit

* reinstate original dtype call

* bypass numpy() call in test_random

* debug test_random

* Update CISupport.yml

* Update CIBase.yml

---------

Co-authored-by: Hoppe <[email protected]>
Co-authored-by: Michael Tarnawa <[email protected]>
Co-authored-by: Claudia Comito <[email protected]>
4 people authored May 7, 2024
1 parent f465ef0 commit 421c868
Showing 12 changed files with 45 additions and 38 deletions.
6 changes: 5 additions & 1 deletion .github/workflows/CIBase.yml
@@ -18,11 +18,15 @@ jobs:
with:
egress-policy: audit

- name: Get branch names
id: branch-names
uses: tj-actions/branch-names@v8
- name: 'start test'
run: |
curl -s -X POST \
--fail \
-F token=${{ secrets.CB_PIPELINE }} \
-F "ref=heat/base" \
-F "ref=heat/support" \
-F "variables[SHA]=$GITHUB_SHA" \
-F "variables[GHBRANCH]=${{ steps.branch-names.outputs.current_branch }}" \
https://codebase.helmholtz.cloud/api/v4/projects/7605/trigger/pipeline -o /dev/null
4 changes: 4 additions & 0 deletions .github/workflows/CISupport.yml
@@ -9,11 +9,15 @@ jobs:
starter:
runs-on: ubuntu-latest
steps:
- name: Get branch names
id: branch-names
uses: tj-actions/branch-names@v8
- name: 'start test'
run: |
curl -s -X POST \
--fail \
-F token=${{ secrets.CB_PIPELINE }} \
-F "ref=heat/support" \
-F "variables[SHA]=$GITHUB_SHA" \
-F "variables[GHBRANCH]=${{ steps.branch-names.outputs.current_branch }}" \
https://codebase.helmholtz.cloud/api/v4/projects/7605/trigger/pipeline -o /dev/null
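
The `curl` call in both workflows posts a token, a ref, and two pipeline variables to a GitLab pipeline trigger endpoint. As a rough illustration, the same fields can be assembled in Python. This is only a sketch: `build_trigger_request` and the dummy argument values are hypothetical, the body is URL-encoded here rather than sent as multipart form data as `curl -F` does, and the request is built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint copied from the workflow files in this commit.
TRIGGER_URL = "https://codebase.helmholtz.cloud/api/v4/projects/7605/trigger/pipeline"


def build_trigger_request(token: str, ref: str, sha: str, branch: str) -> Request:
    """Assemble (but do not send) a POST carrying the same fields as the curl call.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    fields = {
        "token": token,
        "ref": ref,
        "variables[SHA]": sha,          # commit under test
        "variables[GHBRANCH]": branch,  # branch name, newly forwarded by this commit
    }
    body = urlencode(fields).encode("ascii")
    return Request(TRIGGER_URL, data=body, method="POST")


# Dummy values only; a real run would use the workflow's secrets and outputs.
request = build_trigger_request("dummy-token", "heat/support", "421c868", "main")
```

Building the request separately from sending it makes the field names easy to inspect, which is the only purpose of the sketch.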
Empty file removed benchmarks/2020/__init__.py
Empty file.
Empty file.
2 changes: 1 addition & 1 deletion benchmarks/2020/distance_matrix/numpy-cpu.py
@@ -24,4 +24,4 @@
start = time.perf_counter()
dist = cdist(data, data)
end = time.perf_counter()
print(f"\t{end-start}s")
print(f"\t{end - start}s")
2 changes: 1 addition & 1 deletion benchmarks/2020/distance_matrix/torch-cpu.py
@@ -22,4 +22,4 @@
start = time.perf_counter()
dist = torch.cdist(data, data)
end = time.perf_counter()
print(f"\t{end-start}s")
print(f"\t{end - start}s")
Empty file removed benchmarks/2020/kmeans/__init__.py
Empty file.
Empty file removed benchmarks/2020/lasso/__init__.py
Empty file.
Empty file.
47 changes: 25 additions & 22 deletions heat/core/factories.py
@@ -598,13 +598,13 @@ def empty_like(
) -> DNDarray:
"""
Returns a new uninitialized :class:`~heat.core.dndarray.DNDarray` with the same type, shape and data distribution
of given object. Data type and data distribution strategy can be explicitly overriden.
of given object. Data type, data distribution axis, and device can be explicitly overridden.
Parameters
----------
a : DNDarray
The shape and data-type of ``a`` define these same attributes of the returned array. Uninitialized array with
the same shape, type and split axis as ``a`` unless overriden.
The shape, data-type, split axis and device of ``a`` define these same attributes of the returned array. Uninitialized array with
the same shape, type, split axis and device as ``a`` unless overriden.
dtype : datatype, optional
Overrides the data type of the result.
split: int or None, optional
@@ -794,8 +794,7 @@ def __factory_like(
factory : function
Function that creates a DNDarray.
device : str
Specifies the :class:`~heat.core.devices.Device` the array shall be allocated on, defaults to globally set
default device.
Specifies the :class:`~heat.core.devices.Device` the array shall be allocated on, defaults to the same device as ``a``.
comm: Communication
Handle to the nodes holding distributed parts or copies of this array.
order: str, optional
@@ -834,6 +833,13 @@ def __factory_like(
# do not split at all
pass

# infer the device, otherwise default to a.device
if device is None:
try:
device = a.device
except AttributeError:
device = devices.get_device()

# use the default communicator, if not set
comm = sanitize_comm(comm)

@@ -1057,21 +1063,20 @@ def full_like(
order: str = "C",
) -> DNDarray:
"""
Return a full :class:`~heat.core.dndarray.DNDarray` with the same shape and type as a given array.
Return a full :class:`~heat.core.dndarray.DNDarray` with the same shape and type as a given array. Data type, data distribution axis, and device can be explicitly overridden.
Parameters
----------
a : DNDarray
The shape and data-type of ``a`` define these same attributes of the returned array.
The shape, data-type, split axis and device of ``a`` define these same attributes of the returned array.
fill_value : scalar
Fill value.
dtype : datatype, optional
Overrides the data type of the result.
The data type of the result, defaults to `a.dtype`.
split: int or None, optional
The axis along which the array is split and distributed; ``None`` means no distribution.
The axis along which the array is split and distributed; defaults to `a.split`.
device : str or Device, optional
Specifies the :class:`~heat.core.devices.Device` the array shall be allocated on, defaults to globally set
default device.
Specifies the :class:`~heat.core.devices.Device` the array shall be allocated on, defaults to `a.device`.
comm: Communication, optional
Handle to the nodes holding distributed parts or copies of this array.
order: str, optional
@@ -1386,19 +1391,18 @@ def ones_like(
) -> DNDarray:
"""
Returns a new :class:`~heat.core.dndarray.DNDarray` filled with ones with the same type,
shape and data distribution of given object. Data type and data distribution strategy can be explicitly overriden.
shape, data distribution and device of the input object. Data type, data distribution axis, and device can be explicitly overridden.
Parameters
----------
a : DNDarray
The shape and data-type of ``a`` define these same attributes of the returned array.
The shape, data-type, split axis and device of ``a`` define these same attributes of the returned array.
dtype : datatype, optional
Overrides the data type of the result.
split: int or None, optional
The axis along which the array is split and distributed; ``None`` means no distribution.
The axis along which the array is split and distributed; defaults to `a.split`.
device : str or Device, optional
Specifies the :class:`~heat.core.devices.Device` the array shall be allocated on, defaults to globally set
default device.
Specifies the :class:`~heat.core.devices.Device` the array shall be allocated on, defaults to `a.device`.
comm: Communication, optional
Handle to the nodes holding distributed parts or copies of this array.
order: str, optional
@@ -1482,20 +1486,19 @@ def zeros_like(
order: str = "C",
) -> DNDarray:
"""
Returns a new :class:`~heat.core.dndarray.DNDarray` filled with zeros with the same type, shape and data
distribution of given object. Data type and data distribution strategy can be explicitly overriden.
Returns a new :class:`~heat.core.dndarray.DNDarray` filled with zeros with the same type, shape, data
distribution, and device of the input object. Data type, data distribution axis, and device can be explicitly overridden.
Parameters
----------
a : DNDarray
The shape and data-type of ``a`` define these same attributes of the returned array.
The shape, data-type, split axis, and device of ``a`` define these same attributes of the returned array.
dtype : datatype, optional
Overrides the data type of the result.
split: int or None, optional
The axis along which the array is split and distributed; ``None`` means no distribution.
The axis along which the array is split and distributed; defaults to `a.split`.
device : str or Device, optional
Specifies the :class:`~heat.core.devices.Device` the array shall be allocated on, defaults to globally set
default device.
Specifies the :class:`~heat.core.devices.Device` the array shall be allocated on, defaults to `a.device`.
comm: Communication, optional
Handle to the nodes holding distributed parts or copies of this array.
order: str, optional
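
The device handling added to `__factory_like` in this file follows a simple precedence: an explicit `device` argument wins, otherwise the input's own `device` is used, and only inputs without a `device` attribute fall back to the global default. The sketch below mirrors that precedence without importing Heat. `HypotheticalArray`, `resolve_device`, and `GLOBAL_DEFAULT_DEVICE` are stand-ins invented for illustration (the last one plays the role of `devices.get_device()`), and plain strings stand in for device objects.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical stand-in for the value returned by devices.get_device().
GLOBAL_DEFAULT_DEVICE = "cpu"


@dataclass
class HypotheticalArray:
    """Minimal stand-in for a DNDarray: only the device attribute matters here."""
    device: str


def resolve_device(a: Any, device: Optional[str] = None) -> str:
    """Mirror the precedence used in the patched __factory_like."""
    if device is None:
        try:
            # Inherit the device of the input array.
            device = a.device
        except AttributeError:
            # Input has no device (e.g. a plain list): use the global default.
            device = GLOBAL_DEFAULT_DEVICE
    return device


inherited = resolve_device(HypotheticalArray(device="gpu"))
fallback = resolve_device([1, 2, 3])
overridden = resolve_device(HypotheticalArray(device="gpu"), device="cpu")
```

Under these assumptions, `inherited` is `"gpu"`, `fallback` is `"cpu"`, and `overridden` is `"cpu"`, which matches the updated docstrings: the result defaults to `a.device` unless a device is passed explicitly.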
20 changes: 8 additions & 12 deletions heat/core/tests/test_random.py
@@ -350,10 +350,9 @@ def test_randn(self):
shape = (5, 10, 13, 23, 15, 20)
a = ht.random.randn(*shape, split=0, dtype=ht.float64)
self.assertEqual(a.dtype, ht.float64)
a = a.numpy()
mean = np.mean(a)
median = np.median(a)
std = np.std(a)
mean = ht.mean(a)
median = ht.median(a)
std = ht.std(a)
self.assertTrue(-0.01 < mean < 0.01)
self.assertTrue(-0.01 < median < 0.01)
self.assertTrue(0.99 < std < 1.01)
@@ -362,18 +361,18 @@ def test_randn(self):
ht.random.seed(54321)
elements = np.prod(shape)
b = ht.random.randn(elements, split=0, dtype=ht.float64)
b = b.numpy()
a = a.flatten()
self.assertTrue(np.allclose(a, b))
self.assertTrue(ht.allclose(a, b))

# Creating the same array two times without resetting seed results in different elements
c = ht.random.randn(elements, split=0, dtype=ht.float64)
c = c.numpy()
self.assertEqual(c.shape, b.shape)
self.assertFalse(np.allclose(b, c))
self.assertFalse(ht.allclose(b, c))

# All the created values should be different
d = np.concatenate((b, c))
d = ht.concatenate((b, c))
d.resplit_(None)
d = d.numpy()
_, counts = np.unique(d, return_counts=True)
self.assertTrue((counts == 1).all())

@@ -383,9 +382,6 @@ def test_randn(self):
ht.random.seed(12345)
b = ht.random.randn(*shape, split=5, dtype=ht.float64)
self.assertTrue(ht.equal(a, b))
a = a.numpy()
b = b.numpy()
self.assertTrue(np.allclose(a, b))

# Tests with float32
ht.random.seed(54321)
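
The checks in `test_randn` follow a common pattern for seeded generators: reseeding reproduces the same values, continuing without reseeding yields different values, no value repeats, and the sample mean and standard deviation are close to 0 and 1. The sketch below shows that pattern with Python's standard library only. It does not use Heat; the helper `draw`, the sample size, and the looser tolerances are assumptions chosen for a small, fast example.

```python
import random
import statistics


def draw(rng: random.Random, n: int) -> list:
    """Draw n standard-normal samples from the given generator."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


n = 20_000  # far smaller than the array in the real test, hence looser tolerances
rng = random.Random(54321)
a = draw(rng, n)

rng.seed(54321)   # reset the seed: the stream should start over
b = draw(rng, n)
c = draw(rng, n)  # no reseed: the stream continues with new values

same_after_reseed = a == b
differs_without_reseed = b != c
all_unique = len(set(b + c)) == len(b) + len(c)

mean = statistics.fmean(a)
std = statistics.pstdev(a)
moments_plausible = abs(mean) < 0.05 and 0.95 < std < 1.05
```

The tolerance of 0.05 is deliberately wider than the 0.01 used in the test above, because the standard error of the mean grows as the sample shrinks.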
2 changes: 1 addition & 1 deletion scripts/numpy_coverage_tables.py
@@ -548,7 +548,7 @@
# create Table of Contents
f.write("## Table of Contents\n")
for i, header in enumerate(headers):
f.write(f"{i+1}. [{headers[header]}](#{headers[header].lower().replace(' ', '-')})\n")
f.write(f"{i + 1}. [{headers[header]}](#{headers[header].lower().replace(' ', '-')})\n")
f.write("\n")

for i, function_list in enumerate(numpy_functions):

1 comment on commit 421c868

@github-actions

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 2.

| Benchmark suite | Current: 421c868 | Previous: f465ef0 | Ratio |
|---|---|---|---|
| matmul_split_0_N1_GPU - RUNTIME | 0.007676383946090937 s (0.020674996078014374) | 0.0032328602392226458 s (0.008287741802632809) | 2.37 |

This comment was automatically generated by workflow using github-action-benchmark.
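
The reported ratio can be reproduced from the two runtimes in the alert. The snippet below is a sketch of that arithmetic only. It assumes the alert fires when current divided by previous exceeds the stated threshold of 2, and the variable names are illustrative rather than taken from the benchmark tooling.

```python
# Runtimes in seconds, copied from the alert for matmul_split_0_N1_GPU.
current_s = 0.007676383946090937    # commit 421c868
previous_s = 0.0032328602392226458  # commit f465ef0
alert_threshold = 2.0               # threshold stated in the alert

# Assumption: the ratio is current / previous, so values above 1 mean "slower".
ratio = current_s / previous_s
is_regression = ratio > alert_threshold

print(f"ratio = {ratio:.2f}, regression flagged: {is_regression}")
```

Rounded to two decimals, the computed ratio agrees with the 2.37 shown in the alert.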

CC: @web-flow
