1 parent 4ac3f5f, commit b614d9f
Showing 6 changed files with 232 additions and 9 deletions.
@@ -1 +1,100 @@
# compressed-tensors

This repository extends the [safetensors](https://github.com/huggingface/safetensors) format to efficiently store sparse and/or quantized tensors on disk. The `compressed-tensors` format supports multiple compression types to minimize disk space and facilitate tensor manipulation.

## Motivation

### Reduce disk space by saving sparse tensors in a compressed format

The compressed format stores data much more efficiently by taking advantage of two properties of tensors:

- Sparsity: a large number of entries are equal to zero.
- Quantization: values are stored in a low-precision representation.

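As a rough illustration of the saving from sparsity, the sketch below estimates the on-disk size of a dense tensor against a bitmask-style layout that keeps only the nonzero values plus one mask bit per entry. The function names and the 4-byte value size are assumptions made for this example and are not part of `compressed-tensors`; the estimate also ignores metadata such as shapes and row offsets.

```python
import math


def dense_bytes(num_elements: int, bytes_per_value: int = 4) -> int:
    """Size of a dense tensor that stores every entry explicitly."""
    return num_elements * bytes_per_value


def bitmask_bytes(num_elements: int, sparsity: float, bytes_per_value: int = 4) -> int:
    """Estimated size when only nonzeros are stored, plus a 1-bit-per-entry mask."""
    num_nonzero = round(num_elements * (1.0 - sparsity))
    mask_bytes = math.ceil(num_elements / 8)  # eight mask bits per byte
    return num_nonzero * bytes_per_value + mask_bytes


# Hypothetical 1024 x 1024 tensor with 4-byte values and half of its entries zero
n = 1024 * 1024
dense = dense_bytes(n)
compressed = bitmask_bytes(n, sparsity=0.5)
ratio = compressed / dense
```

Under these assumptions the half-zero tensor takes about 53% of its dense size. The mask costs a fixed 1/32 of the dense size for 4-byte values, so this kind of layout only pays off once sparsity exceeds roughly 3%.
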
### Introduce an elegant interface to save/load compressed tensors

The library lets the user compress and decompress tensors. Tensor properties are defined by human-readable configs, so users can understand the compression format at a glance.

## Installation

### Pip

```bash
pip install compressed-tensors
```

### From source

```bash
git clone https://github.com/neuralmagic/compressed-tensors
cd compressed-tensors
pip install -e .
```

## Getting started

### Saving

The function `save_compressed` returns a `compression_config` if compression has been applied, and `None` otherwise. The returned config can be used to inspect the applied compression.

```python
from typing import Dict, Optional

from compressed_tensors import save_compressed
from torch import Tensor

tensors: Dict[str, Tensor] = ...
# None is returned when no compression has been applied
compression_config: Optional[Dict] = save_compressed(tensors, "model.safetensors")
```

### Loading

```python
from typing import Dict

from compressed_tensors import load_compressed
from torch import Tensor

tensors: Dict[str, Tensor] = load_compressed("model.safetensors", device="cpu")
```

## Benefits
TODO

## SafeTensors File Format

For each parameter in the uncompressed state_dict, the compressed state_dict stores the following attributes needed for decompression:

- Compressed tensor
- Bitmask
- Uncompressed shape
- Row offsets

```python
# Dense
{
    PARAM_NAME: uncompressed_tensor
}

# Compressed
{
    PARAM_NAME.compressed: compressed_tensor,  # 1d tensor
    PARAM_NAME.bitmask: value,  # 2d bitmask tensor (nrows x (ncols / 8))
    PARAM_NAME.shape: value,  # Uncompressed shape tensor
    PARAM_NAME.row_offsets: value  # 1d offsets tensor
}
```

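To make the four stored entries concrete, here is a minimal sketch of one way such a bitmask layout could be produced and reversed. It uses plain Python lists instead of torch tensors so that it runs without dependencies, and several details are assumptions rather than the library's actual behavior: the bit order inside each mask byte (low bit first), the meaning of `row_offsets` (the index in the 1d value list where each row's nonzeros begin), and the helper names `pack_row`, `compress`, and `decompress`.

```python
from typing import Dict, List

Matrix = List[List[float]]


def pack_row(bits: List[int]) -> List[int]:
    """Pack a row of 0/1 flags into bytes, eight flags per byte (low bit first)."""
    packed = []
    for start in range(0, len(bits), 8):
        byte = 0
        for offset, bit in enumerate(bits[start:start + 8]):
            byte |= bit << offset
        packed.append(byte)
    return packed


def compress(name: str, dense: Matrix) -> Dict[str, list]:
    """Split a non-empty dense 2D matrix into values, bitmask, shape, and row offsets."""
    values, bitmask, row_offsets = [], [], []
    for row in dense:
        row_offsets.append(len(values))  # where this row's values begin
        bitmask.append(pack_row([int(v != 0) for v in row]))
        values.extend(v for v in row if v != 0)
    return {
        f"{name}.compressed": values,
        f"{name}.bitmask": bitmask,
        f"{name}.shape": [len(dense), len(dense[0])],
        f"{name}.row_offsets": row_offsets,
    }


def decompress(name: str, state: Dict[str, list]) -> Matrix:
    """Rebuild the dense matrix from the four stored entries."""
    nrows, ncols = state[f"{name}.shape"]
    values = state[f"{name}.compressed"]
    dense = []
    for r in range(nrows):
        cursor = state[f"{name}.row_offsets"][r]
        row = []
        for c in range(ncols):
            byte = state[f"{name}.bitmask"][r][c // 8]
            if (byte >> (c % 8)) & 1:
                row.append(values[cursor])  # next stored nonzero of this row
                cursor += 1
            else:
                row.append(0.0)
        dense.append(row)
    return dense


original = [[0.0, 0.0, 0.0], [1.0, 2.0, 0.0]]
state = compress("tensor_1", original)
restored = decompress("tensor_1", state)
```

In this sketch the row offsets let each row be decoded independently of the rows before it, which is one plausible reason for storing them alongside the bitmask.
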
The library provides pathways to automatically add the config information to the Hugging Face (HF) config file.

```json
// config.json
{
    "sparsity_config": {
        "format": "sparse_bitmask", // "dense_sparsity" for the original tensor format

        // Informational
        "sparsity_structure": "unstructured", // Or 2:4, 8:16, etc.
        "global_sparsity": "0.5"
    }
}
```
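As an illustration of how the informational fields could be derived, the sketch below computes a global sparsity value and builds a dictionary with the same keys as the config above. The helpers `global_sparsity` and `build_sparsity_config` are hypothetical and are not the library's API; the sketch also stores `global_sparsity` as a number rather than the string shown above, which is a choice made for this example.

```python
import json
from typing import Dict, List

Matrix = List[List[float]]


def global_sparsity(tensors: Dict[str, Matrix]) -> float:
    """Fraction of entries equal to zero across all tensors."""
    total = 0
    zeros = 0
    for matrix in tensors.values():
        for row in matrix:
            total += len(row)
            zeros += sum(1 for v in row if v == 0)
    return zeros / total


def build_sparsity_config(tensors: Dict[str, Matrix], structure: str = "unstructured") -> dict:
    """Assemble a config dictionary with the keys used in the example above."""
    return {
        "sparsity_config": {
            "format": "sparse_bitmask",
            "sparsity_structure": structure,
            "global_sparsity": global_sparsity(tensors),
        }
    }


tensors = {"tensor_1": [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]}
config = build_sparsity_config(tensors)
config_text = json.dumps(config, indent=4)
```

Standard JSON does not allow `//` comments, so the serialized text contains only the keys and values; the comments in the example above are annotations for the reader.
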
@@ -0,0 +1,79 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import pytest
import torch
from compressed_tensors import save_compressed
from compressed_tensors.config import BitmaskConfig


@pytest.fixture
def tensors_and_config_sparse():
    tensors = {"tensor_1": torch.Tensor([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])}
    expected_config_json = {
        "compression_config": {
            "format": "sparse_bitmask",
            # sum / numel is the fraction of ones; it equals the fraction of
            # zeros (0.5) only because exactly half of this tensor is zero
            "global_sparsity": (
                tensors["tensor_1"].sum() / tensors["tensor_1"].numel()
            ).item(),
            "sparsity_structure": "unstructured",
        }
    }
    return tensors, expected_config_json


@pytest.fixture
def tensors_dense():
    tensors = {"tensor_1": torch.Tensor([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]])}
    return tensors


def test_save_compressed_sparse(tmp_path, tensors_and_config_sparse):
    tensors, expected_config_json = tensors_and_config_sparse

    config_json = save_compressed(
        tensors,
        compression_config=BitmaskConfig(
            format=expected_config_json["compression_config"]["format"],
            global_sparsity=expected_config_json["compression_config"][
                "global_sparsity"
            ],
            sparsity_structure=expected_config_json["compression_config"][
                "sparsity_structure"
            ],
        ),
        save_path=tmp_path / "model.safetensors",
    )
    assert (tmp_path / "model.safetensors").exists()
    assert config_json == expected_config_json


def test_save_compressed_dense(tmp_path, tensors_dense):
    tensors = tensors_dense

    config_json = save_compressed(
        tensors,
        save_path=tmp_path / "model.safetensors",
    )
    assert (tmp_path / "model.safetensors").exists()
    assert config_json is None


def test_save_compressed_empty():
    # make sure function raises error
    with pytest.raises(Exception):
        save_compressed({}, "")

    with pytest.raises(Exception):
        save_compressed(None, "")