Releases: alibaba/graphlearn-for-pytorch
Release v0.2.4
We have released GraphLearn for PyTorch v0.2.4 with the following new features:
- Introduced support for range partition books, which reduce the memory footprint of the partition data structure (a minimal sketch follows this list)
- Added an experimental integration with the PyG remote backend
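The idea behind a range partition book is to store only the boundaries of each partition's contiguous ID range instead of a per-node partition-ID tensor. The class below is a minimal sketch of that idea in plain PyTorch; the name `RangePartitionBook` and its interface are illustrative, not the GLT API.

```python
import torch

class RangePartitionBook:
    """Illustrative sketch (not the GLT API): map node IDs to partitions by
    storing only the exclusive upper bound of each partition's contiguous ID
    range, instead of a partition-ID tensor of size num_nodes."""

    def __init__(self, partition_bounds: torch.Tensor):
        # e.g. [100, 250, 400] => partition 0 owns IDs [0, 100),
        # partition 1 owns [100, 250), partition 2 owns [250, 400)
        self.bounds = partition_bounds

    def __getitem__(self, node_ids: torch.Tensor) -> torch.Tensor:
        # searchsorted with right=True returns, for each ID, the index of the
        # first bound strictly greater than it, which is the owning partition.
        return torch.searchsorted(self.bounds, node_ids, right=True)

# Memory is O(num_partitions) instead of O(num_nodes).
book = RangePartitionBook(torch.tensor([100, 250, 400]))
print(book[torch.tensor([5, 120, 399])])  # tensor([0, 1, 2])
```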
What's Changed
- fix: avoid acquiring the lock when feature already initialized by @Zhanghyi in #145
- [Feat] Support PyG remote backend by @Yi-Eaaa in #144
- fix: fix pyg remote backend ut by @Zhanghyi in #147
- [Feat] range partition book by @Zhanghyi in #146
- Bump version to 0.2.4 by @Zhanghyi in #148
Full Changelog: v0.2.3...v0.2.4
Release v0.2.3
We are thrilled to announce the release of GraphLearn for PyTorch v0.2.3. This update includes enhancements focusing on:
- Distributed support for vineyard as an integration with GraphScope.
- Optimizations such as graph caching, along with experimental features including bf16 precision support (a usage sketch follows this list) and all-to-all communication.
- Various bug fixes.
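The bf16 path targets Intel AMX through PyTorch's CPU autocast. The snippet below is a hedged sketch of the general pattern, using a stand-in linear model rather than GLT's trainer; it is not the exact option GLT exposes.

```python
import torch
import torch.nn.functional as F

# Run the forward pass in bfloat16 on CPU via torch.autocast; on recent Intel
# CPUs PyTorch can dispatch these ops to AMX kernels. Model and data here are
# placeholders, not GLT APIs.
model = torch.nn.Linear(128, 16)  # stand-in for a GNN model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(x, y):
    optimizer.zero_grad()
    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        out = model(x)
        loss = F.cross_entropy(out, y)
    loss.backward()
    optimizer.step()
    return loss.item()

loss = train_step(torch.randn(32, 128), torch.randint(0, 16, (32,)))
```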
What's Changed
- IGBH: Add Dockerfile and some minors by @LiSu in #128
- IGBH: adjust mllogger tag position by @LiSu in #130
- Distributed support for v6d GraphScope by @husimplicity in #116
- add graph caching support for distributed training by @kaixuanliu in #132
- add bf16 precision support to utilize Intel's AMX accelerations by @kaixuanliu in #133
- fix: data path for s/c case by @husimplicity in #135
- add all2all support to replace p2p rpc, using gloo as backend temporarily by @kaixuanliu in #134
- Fix table dataset init graph by @husimplicity in #136
- Verify hops as dict with 0 by @husimplicity in #138
- fix: several bugs on distributed mode by @husimplicity in #139
- [fix] explicitly call share_memory_ in Feature in cpu mode by @Zhanghyi in #142
- upgrade pytorch and cuda versions by @Zhanghyi in #141
Full Changelog: v0.2.2...v0.2.3
Release v0.2.2
We're excited to announce the release of GraphLearn for PyTorch v0.2.2. This update brings numerous fixes and feature enhancements that improve the framework's functionality, performance, and user experience. We extend our gratitude to all contributors who made this release possible.
What's Changed
- [Fix] ensure consistency between the seeds added to and retrieved from the multiprocessing queue using the put and get methods by @Zhanghyi in #65
- [Fix] skip sampling empty inputs by @LiSu in #67
- [Fix] try to fix tensor.nbr by @husimplicity in #71
- Fix igbh example by using proper parameters by @LiSu in #70
- [Feat] put input data on server and allow for n to n connection between servers and clients by @Zhanghyi in #59
- [Build] adjust setup.py and create an ext_module util function by @Zhanghyi in #73
- [Feat] add "trim_to_layer" support to igbh example by @kaixuanliu in #74
- [Feat] Add edge weight sample for cpu by @husimplicity in #72
- [Fix] Skip mem-sharing the feature tensor in cpu mode by @LiSu in #75
- [Fix] fix empty out_cols calling torch.cat by @husimplicity in #78
- [Fix] enable gc when evaluation by @husimplicity in #81
- [Feat] supports node split for both Dataset and DistDataset by @Zhanghyi in #82
- [Feat] load dataset from vineyard by @Zhanghyi in #80
- [Feat] Refactor RPC connection in server-client mode by @Jia-zb in #83
- [Feat] Add fields parsing for GraphScope side by @Jia-zb in #84
- [Build] split building of glt and glt_v6d by @Zhanghyi in #85
- [Feat] Multithread partition by @Jia-zb in #88
- [Fix] fix get GRAPHSCOPE_HOME in os.environ by @Zhanghyi in #89
- [CI] glt v6d ci by @Zhanghyi in #90
- [Feat] add "trim_to_layer" support to igbh distributed training by @kaixuanliu in #87
- [Fix] fix test: check partition dir exists before test by @LiSu in #91
- [Feat] Update IGBH example by @LiSu in #92
- [Fix] Fixes the build failure on MacOS and the compile flags settings in CMakeLists.txt by @sighingnow in #93
- [Feat] enable continuous downloading for large dataset by @kaixuanliu in #94
- [Feat] support two-stage partitioning by @LiSu in #95
- [Fix] update the label index range for igbh-full dataset by @LiSu in #96
- IGBH: synchronize after evaluation completes by @LiSu in #97
- IGBH updates by @LiSu in #98
- IGBH: persist feature when using FP16 by @LiSu in #99
- Fp16 support by @kaixuanliu in #100
- [Fix] fix GPU allocation while splitting training and sampling in distributed training by @LiSu in #101
- [Fix] Large file process by @kaixuanliu in #103
- [Feat] Refine IGBH preprocessing by @LiSu in #105
- [Feat] Expose random seed configuration for single-node and distributed training by @LiSu in #106
- [Fix] ML Perf code freeze minors by @LiSu in #108
- [Fix] Fixes include path resolution on MacOS by @sighingnow in #109
- [Fix] Use a lock to protect the critical path of sampler initialization in neighbor sampler by @LiSu in #110
- [Fix] adjust the lock location by @kaixuanliu in #111
- [Fix] add argument of channel size by @kaixuanliu in #113
- [Feat] IGBH: add MLPerf logging and control of evaluation frequency by @LiSu in #114
- [Feat] Add gpt example by @husimplicity in #115
- [Feat] IGBH: support specifying the fraction of validation seeds by @LiSu in #117
- [Fix] delete unused code by @kaixuanliu in #121
- [Feat] Separate training batch size and validation batch size in IGBH by @LiSu in #122
- [Feat] Add mechanism of save/load checkpoint by @LiSu in #123
- [Fix] add random seed para for link and subgraph loader by @LiSu in #124
- [Fix] properly handle drop last in distributed sampler by @LiSu in #125
New Contributors
- @sighingnow made their first contribution in #93
Full Changelog: v0.2.1...v0.2.2
Release v0.2.1
We are delighted to bring a number of improvements to GLT, building on the 0.2.0 release. This release contains many new features, improvements, bug fixes, and examples, summarized as follows:
- Added support for single-node and distributed inbound sampling, giving users the choice of both inbound and outbound sampling (illustrated after this list).
- Added chunk partitioning for graphs with large feature files, reducing the memory consumption of feature partitioning.
- Added examples for the IGBH dataset.
- Fixed bugs and improved system stability.
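Inbound versus outbound sampling only changes which direction of an edge is followed from a seed node. The toy snippet below illustrates the distinction in plain PyTorch; it is a conceptual sketch, not GLT's sampler API.

```python
import torch

# One directed edge list, two sampling directions: "outbound" follows edges
# leaving a seed node, "inbound" follows edges pointing at it.
edge_index = torch.tensor([[0, 0, 1, 2, 3],   # source nodes
                           [1, 2, 2, 3, 0]])  # destination nodes

def neighbors(edge_index, seed, inbound=False):
    src, dst = edge_index
    if inbound:
        # inbound: which nodes have an edge into the seed?
        return src[dst == seed]
    # outbound: which nodes does the seed point to?
    return dst[src == seed]

print(neighbors(edge_index, seed=2))                # outbound: tensor([3])
print(neighbors(edge_index, seed=2, inbound=True))  # inbound:  tensor([0, 1])
```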
What's Changed
- fix: clone the id chunk before pickle dump to avoid dumping the entire tensor by @LiSu in #44
- Update figure by @husimplicity in #45
- Feature: In bound sampling of single machine by @Jia-zb in #48
- fix bug 'index out of bounds for partition book List' for igbh-large … by @kaixuanliu in #49
- Fix igbh rgnn example by @LiSu in #50
- Add distributed in-sample functions by @husimplicity in #51
- [Example] clarify the setting of the number of servers and clients by @Zhanghyi in #52
- Fix igbh rgnn example by @Jia-zb in #53
- [bug] fix "invalid configuration argument" when samplers return torch.empty by @husimplicity in #54
- [Example] Separate server and client launch scripts for server-client mode distributed training by @Zhanghyi in #56
- Add IGBH multi-card single-node example & bug fix when mem-sharing graphs by @LiSu in #62
- Update the igbh readme of single-node multi-GPU training by @LiSu in #63
- Bump version to 0.2.1 by @LiSu in #64
Full Changelog: v0.2.0...v0.2.1
GraphLearn for PyTorch v0.2.0 Release Notes
We are pleased to announce the first open release of GLT v0.2.0!
- GLT provides both CPU-based and GPU-based graph operators, including neighbor sampling, negative sampling, and feature lookup. The GPU-based graph operations significantly accelerate computation and reduce data movement, making it suitable for GPU training.
- For distributed training, GLT implements multi-processing asynchronous sampling, pinned memory buffers, and a hot-feature cache, and it uses fast networking technologies (PyTorch RPC with RDMA support) to speed up distributed sampling and reduce communication.
- GLT is also easy to use: most of its APIs are compatible with PyG/PyTorch, and complete documentation and usage examples are available (a PyG-style training loop is sketched below). GLT focuses on real-world scenarios and provides distributed GNN training examples on large-scale graphs.
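Because GLT's loaders mirror the PyG interface, they drop into a standard PyG-style mini-batch loop. The sketch below uses PyG's own NeighborLoader and GraphSAGE on Cora purely to illustrate that loop shape; the dataset, model, and hyperparameters are illustrative choices, not part of GLT.

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import GraphSAGE

# Standard PyG mini-batch training loop; a GLT neighbor loader is designed to
# slot into the same loop, with GPU-based sampling and feature lookup.
dataset = Planetoid(root="/tmp/Cora", name="Cora")
data = dataset[0]
loader = NeighborLoader(data, num_neighbors=[15, 10],
                        batch_size=128, input_nodes=data.train_mask)

model = GraphSAGE(in_channels=dataset.num_features, hidden_channels=64,
                  num_layers=2, out_channels=dataset.num_classes)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()
for batch in loader:
    optimizer.zero_grad()
    out = model(batch.x, batch.edge_index)
    # Seed nodes come first in the mini-batch; compute the loss on them only.
    loss = F.cross_entropy(out[:batch.batch_size], batch.y[:batch.batch_size])
    loss.backward()
    optimizer.step()
```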