updates to .geo file format
Hypertools now saves DataGeometry objects using the pickle file format internally, rather than HDF5. With improvements made to the built-in pickle module since Hypertools's initial release, this now generally results in smaller files that save and load more quickly. It also lets us drop the dependency on deepdish, which has compatibility issues with various pandas objects, doesn't offer pre-built wheels for more recent Python versions, and is largely no longer maintained.
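For readers unfamiliar with the pickle module, the round trip below is a minimal sketch of the general save/load mechanism. It is not Hypertools's actual implementation: the dictionary payload, the save_payload and load_payload helpers, and the toy.geo filename are all hypothetical stand-ins.

```python
import pickle
import tempfile
from pathlib import Path


def save_payload(obj, path):
    """Serialize obj to path using the newest pickle protocol available."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_payload(path):
    """Deserialize an object from path. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for the contents of a DataGeometry object.
payload = {"data": [[1.0, 2.0], [3.0, 4.0]], "reduce": "IncrementalPCA"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "toy.geo"
    save_payload(payload, path)
    restored = load_payload(path)
```

Because unpickling can execute arbitrary code, files in this format should only be loaded from sources you trust.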
If you need to load .geo files from the old format, hypertools.load now accepts a keyword-only argument, legacy. Install deepdish if necessary, and pass legacy=True to load older DataGeometry objects. You can then .save() them to convert them to the new format.
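The conversion described above might look like the following sketch. The convert_legacy_geo helper and its load parameter are hypothetical (the parameter exists only so the function can be exercised without Hypertools installed), and passing the output path as the first argument to .save() is an assumption about the method's signature.

```python
from pathlib import Path


def convert_legacy_geo(old_path, new_path, load=None):
    """Load an old-format .geo file and re-save it in the new format.

    `load` defaults to hypertools.load; any callable with the same
    (path, legacy=...) interface can be substituted.
    """
    if load is None:
        # Imported lazily; deepdish must also be installed for legacy=True.
        import hypertools
        load = hypertools.load
    geo = load(str(old_path), legacy=True)  # read the old HDF5-based file
    geo.save(str(new_path))                 # write the new pickle-based file
    return Path(new_path)
```

With Hypertools v0.8.0 and deepdish installed, a call such as convert_legacy_geo("old_analysis.geo", "new_analysis.geo") would perform the conversion; both filenames here are placeholders.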
improvements to example datasets
All example data files have been upgraded to the new file format. Additionally, the three pre-trained scikit-learn Pipelines Hypertools provides (wiki_model, nips_model, and sotus_model) have been recreated from scratch using a newer scikit-learn version, better text preprocessing, and updated CountVectorizer and LatentDirichletAllocation parameters that result in overall better models.
The example DataGeometry objects associated with these three models (wiki, nips, and sotus) have been updated accordingly, and additionally now use IncrementalPCA as their default reducers, resulting in faster, deterministic transform outputs.
To use the new models and datasets, upgrade Hypertools to v0.8.0 (pip install -U hypertools) and remove the local cache of old versions ([[ -d ~/hypertools_data ]] && rm ~/hypertools_data/*). Older versions of Hypertools will continue to use the old example data.
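If you would rather clear the cache from Python (for example, on a system without a bash-compatible shell), the sketch below approximates the shell command above. The clear_hypertools_cache function is a hypothetical helper, not part of Hypertools; like the shell glob, it skips hidden files and leaves subdirectories alone.

```python
from pathlib import Path


def clear_hypertools_cache(cache_dir=None):
    """Delete cached example data files and return the names removed."""
    if cache_dir is None:
        cache_dir = Path.home() / "hypertools_data"
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []  # nothing cached yet, mirroring the [[ -d ... ]] guard
    removed = []
    for entry in cache_dir.iterdir():
        # Only regular, non-hidden files are removed.
        if entry.is_file() and not entry.name.startswith("."):
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)
```

Calling clear_hypertools_cache() with no arguments targets ~/hypertools_data, the cache location named above.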
Other improvements
- Hypertools is now compatible with Python 3.9! This release is also compatible in principle with Python 3.10, but numba does not yet support Python 3.10, so certain dependencies will fail to install.
- Hypertools now works with newer scikit-learn versions! The updates above to the example datasets make Hypertools fully compatible with recent scikit-learn releases (>=0.24). This should make Hypertools easier to use in Colaboratory notebooks and more flexible in general. If you need to use an older scikit-learn version, pip-install hypertools<0.8.0.
- Hypertools now works with newer Matplotlib versions! Recent updates to matplotlib's plotting backends were causing Hypertools's plotting interface to fail on import. We've fixed these bugs and maintained backwards compatibility with older (deprecated) interactive plotting backends as well.
Other assorted changes
- failures when loading example datasets and .geo files now raise HypertoolsIOError with clearer error messages
- specifying a compression when saving a DataGeometry object raises a FutureWarning
- CI tests now run with Python 3.6 -- 3.9, use mamba for faster environment setup, and generate more verbose output
- dependencies and code required for Python 2/3 compatibility have been removed
- various code causing RuntimeWarnings has been fixed