10 Jan 09:30

fyrestone

dcc090d

v0.10.0 Latest

What's Changed

Optimize tile of DataFrame.setitem by reducing time of generating chunk meta by @qinxuye in #3140
Increase the default value of alru cache max size by @zhongchun in #3146
Support scipy special function with tuple output by @RandomY-2 in #3139
Fix DAG.to_dot when reducers have multiple outputs by @chaokunyang in #3150
Fix deserializing RandomStateField when its value is None by @chaokunyang in #3149
Patch pandas magic functions to allow reverse operands by @wjsi in #3155
Run flaky test test_load_third_party_modules separately by @chaokunyang in #3162
Manually install cri-dockerd before installing kubernetes by @wjsi in #3166
[Shuffle] Add n_mappers and n_reducers to ShuffleProxy by @chaokunyang in #3160
[Ray] Task-based shuffle for Ray by @chaokunyang in #3040
Add support for {DataFrame,Series}.align by @wjsi in #3147
Integrate remaining error functions and fresnel integrals except fresnel_zeros by @RandomY-2 in #3172
Improve numexpr fusion by @fyrestone in #3177
Ensure key is a valid Python identifier by @fyrestone in #3190
Bump terser from 5.7.1 to 5.14.2 in web component by @dependabot in #3194
Implement airy functions (except the ai_zeros and bi_zeros functions) by @shantam-8 in #3195
Disable version updates for dependabot by @wjsi in #3203
[Ray] Fix ray memory leak by @fyrestone in #3184
[Ray] Support reducers whose inputs are not mappers by @chaokunyang in #3206
Refine UT and logs by @fyrestone in #3204
Release actor lock when set_subtask_result is called by @chaokunyang in #3210
Refine apply key generation by @chaokunyang in #3208
Fix removal of mapper data by @chaokunyang in #3214
[Ray] Configurable subtask num_cpus by @fyrestone in #3207
Fix versioneer compatibility with PEP 600 by @chaokunyang in #3223
Support get mappers data without index/mapperids by @chaokunyang in #3222
[Ray] RayExecutionContext.get_chunk_meta from meta service by @fyrestone in #3212
[Ray] Share RayTaskState across tasks by @fyrestone in #3219
[Shuffle] Support shuffle operands mapper whose outputs aren't mapper blocks by @chaokunyang in #3228
Apply Operand Closure clean up by @vcfgv in #3205
Fix dataframe sort_values with multiple ascendings bug in pandas < 1.4 by @fyrestone in #3234
Lifecycle gc task service by @fyrestone in #3230
Fix dataframe loc with slice returns incorrect results by @fyrestone in #3241
Fix dataframe setitem bugs when partial indexes exist in target dataframe by @fyrestone in #3240
[Shuffle] isolate mappers in different subtasks for fetch_by_index mode by @chaokunyang in #3239
TypeDispatcher support one type multiple serializers by @fyrestone in #3242
[Shuffle] Skip store shuffle object refs to reduce meta overhead by @chaokunyang in #3209
[Ray] Support scheduling Ray tasks in Ray Oscar deploy backend by @chaokunyang in #3165
Dump subtask graph for all backends by @fyrestone in #3245
[Metrics] Fix metrics and docs by @zhongchun in #3233
Remove storage service from supervisor by @vcfgv in #3254
Fix optimization rule memory leak by @fyrestone in #3246
fsspec integration by @hekaisheng in #3253
[Ray] Enable CI of mars/dataframe for Ray DAG by @fyrestone in #3250
Fix minikube installation by @hekaisheng in #3244
Implements scipy.stats.rankdata by @shantam-8 in #3218
Add S3 support by @fyrestone in #3258
Fix tensor frexp by @fyrestone in #3259
Optimize the display of the task progress bar by @zhongchun in #3264
[Ray] Optimize ray executor submit subtask by @fyrestone in #3271
[Ray] Enable CI of mars/learn for Ray DAG by @fyrestone in #3261
[Ray] Enable CI of mars/tensor for Ray DAG by @fyrestone in #3275
Compatible with pandas 1.5.0 by @hekaisheng in #3276
Remove skip_ray_dag mark for raydataset tests by @vcfgv in #3255
MapChunk Operand Closure and Callable cleanup by @vcfgv in #3238
[Ray] Spread scheduling subtasks with empty dependencies by @fyrestone in #3281
Speed up Mars deserialization by using __new__ by @chaokunyang in #3283
A cython-based ordered_set to speed up the discard operation by @chaokunyang in #3277
Optimize concat by @fyrestone in #3286
Fix md.concat error when there are same fetch chunk data by @zhongchun in #3285
[Ray] Improve Ray executor GC by @fyrestone in #3287
Fix some CI issues by @hekaisheng in #3296
[Ray] Implement Ray executor subtask GC by @fyrestone in #3294
[Ray] Add metrics for Ray executor by @fyrestone in #3295
Bump up required vineyard version to address the CI failure by @sighingnow in #3298
[Operand] support loc setitem by @chaokunyang in #3291
[Ray] Support worker_mem for ray executor by @fyrestone in #3300
Fix duplicate execution by @fyrestone in #3301
Fix CI by @hekaisheng in #3306
[Ray] Basic slow subtask detection by @fyrestone in #3305
Fix stats tests and pin sphinx version by @hekaisheng in #3313
Fix s3 client kwargs by @fyrestone in #3316
Update Mars on Ray doc by @fyrestone in #3311
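
The notes for #3190 do not say how the identifier check is performed. As an illustration only, here is a minimal sketch that uses nothing beyond the Python standard library. The helper names `is_valid_identifier_key` and `sanitize_key` are hypothetical and are not taken from the Mars code base.

```python
import keyword


def is_valid_identifier_key(key: str) -> bool:
    """Return True if `key` can be used as a plain Python name."""
    # str.isidentifier() also accepts keywords such as "class",
    # so keywords are excluded explicitly.
    return key.isidentifier() and not keyword.iskeyword(key)


def sanitize_key(key: str) -> str:
    """Hypothetical helper: map an arbitrary key to an ASCII identifier."""
    # Replace every character that is not an ASCII letter, digit, or "_".
    cleaned = "".join(
        ch if (ch.isascii() and ch.isalnum()) or ch == "_" else "_"
        for ch in key
    )
    # Identifiers cannot be empty, start with a digit, or be a keyword.
    if not cleaned or cleaned[0].isdigit() or keyword.iskeyword(cleaned):
        cleaned = "_" + cleaned
    return cleaned
```

Note that two different keys can map to the same sanitized name (for example `a-b` and `a.b`), so a real implementation would also need a way to keep keys unique.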

Full Changelog: v0.10.0a1...v0.10.0

Contributors

qinxuye, zhongchun, and 9 other contributors

12 Jun 10:56

qinxuye

v0.10.0a1

424cfb9

v0.10.0a1 Pre-release

These are the release notes for v0.10.0a1. See here for the complete list of solved issues and merged PRs.

New Features

Oscar
- Stop importing main module when starting Mars local cluster (#3110)
Tensor
- Integrate special error functions (#3060)
- Integrate part of scipy elliptic functions and integrals (#3111)
DataFrame
- Support sort=True for Groupby (#2959, thanks @sak2002!)

Enhancements

Disable bloom filter in merge for now (#2967)
[Ray] Implement ray task executor progress (#3008)
Dump remote tracebacks to make local ones more friendly (#3028)
Use tell when removing mapper data after execution (#3027)
Optimize import speed for Mars package (#3022)
Do not aggressively choose tree method in tile of groupby for distributed setting (#3032)
[Ray] Implements get_chunks_result for Ray execution context (#3023)
Refine ThreadedServiceContext.get_chunks_meta usage (#3037)
Shuffle both sides at the same time for md.merge (#3041)
Assign reducer ops in task assigner to make them more balanced across cluster (#3048)
[Ray] Destroy Ray executor when the task finishes (#3049)
[Ray] Implements get_chunks_meta for Ray execution context (#3052)
[Ray] Support basic subtask retry and lineage reconstruction (#2969)
Combine tree and shuffle methods in DataFrameGroupBy.agg tile (#3051)
[Ray] Implements get_total_n_cpu for Ray execution context (#3059)
[Ray] Implement cancel method on Ray task executor (#3044)
Use OS-designated ports instead of random ports to create sub pools (#3053)
Unify DataFrameGroupByAgg's tile logic for auto method (#3084)
Simplify router clean up when pools or clusters ends (#3086)
Call immutable web API only once when previous call blocks (#3085)
[Ray] Create RayTaskState actor as needed by default (#3081)
[Ray] Implement gc for ray task executor context (#3061)
Simplify argument passing in actor batch calls (#3098)
Optimize performance of transfer (#3091)
Add n_reducers and reducer_ordinal to shuffle operands (#3055)
Optimize serializable memory (#3120)
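
The change in #3053 ("Use OS-designated ports instead of random ports") corresponds to a standard socket technique: binding to port 0 asks the operating system to pick a free port. The following is a minimal sketch of that technique, assuming a plain TCP socket. `get_os_assigned_port` is a hypothetical helper and is not Mars's API.

```python
import socket


def get_os_assigned_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently free TCP port on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 means "let the operating system choose a free port".
        sock.bind((host, 0))
        # getsockname() returns (address, port) for AF_INET sockets.
        return sock.getsockname()[1]


port = get_os_assigned_port()
```

Compared with guessing a random port number, this avoids choosing a port that is already in use at bind time. The port is released when the socket closes, so another process could still claim it before it is reused; keeping the socket open and handing it over avoids that window.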

Bug fixes

Fix errors when deleting mapper data (#3018)
Fix recursive_tile, which could tile the same tileable more than once (#3021)
Fix error message when sparse data format not supported (#3046)
Patch pandas to make pickle compatible between 1.2 and 1.3 (#3047)
Fix chunk index error in auto_merge_chunks (#3057)
[Ray] Fix ray worker failover (#3080)
[Metric] Fix prometheus metric backend (#3124)
Fix mt.{cumsum, cumprod} when the first chunk is empty (#3134)

Tests

Check initialization of serializables on CI (#3007)
Use @pytest_asyncio.fixture instead of @pytest.fixture for async fixtures (#3025)
Change code owners to Mars PMC maintainers (#3031)
[Ray] Fix ray executor progress test (#3033)
[Ray] Optimize Ray CI execution time and stability (#3102)
Make test_session_set_progress more stable under Ray tests (#3103)
Update pytest imports for test_special.py (#3129)
[Ray] Fix flaky test test_optional_supervisor_node (#3133)

Others

Build web code before CIBW when deploying to PyPI (#3014)
Make PyPI user name configurable (#3130)

Contributors

sak2002

12 Jun 11:19

qinxuye

v0.9.0

5908922

These are the release notes for v0.9.0. See here for the complete list of solved issues and merged PRs.

These notes cover only the changes since v0.9.0rc3; for all highlights and changes, refer to the release notes of the pre-releases:

alpha1
alpha2
beta1
beta2
rc1
rc2
rc3

Changes that break compatibility

Starting with v0.9, Python 3.6 is no longer supported.

Highlights

Performance has been optimized throughout this release; feedback is welcome.

New Features

Oscar
- Stop importing main module when starting Mars local cluster (#3113)
Tensor
- Integrate special error functions (#3062)
- Integrate part of scipy elliptic functions and integrals (#3112)
DataFrame
- Support sort=True for Groupby (#3063, thanks @sak2002!)

Enhancements

Dump remote tracebacks to make local ones more friendly (#3030)
Optimize import speed for Mars package (#3035)
[Ray] Implement ray task executor progress (#3065)
Shuffle both sides at the same time for md.merge (#3066)
Refine ThreadedServiceContext.get_chunks_meta usage (#3067)
Do not aggressively choose tree method in tile of groupby for distributed setting (#3070)
Disable bloom filter in merge for now (#3071)
[Ray] Implements get_chunks_result for Ray execution context (#3072)
Use tell when removing mapper data after execution (#3073)
Assign reducer ops in task assigner to make them more balanced across cluster (#3075)
[Ray] Destroy Ray executor when the task finishes (#3074)
Combine tree and shuffle methods in DataFrameGroupBy.agg tile (#3077)
[Ray] Implements get_chunks_meta for Ray execution context (#3076)
Use OS-designated ports instead of random ports to create sub pools (#3087)
Call immutable web API only once when previous call blocks (#3088)
Unify DataFrameGroupByAgg's tile logic for auto method (#3094)
[Ray] Support basic subtask retry and lineage reconstruction (#3097)
Simplify argument passing in actor batch calls (#3100)
[Ray] Implements get_total_n_cpu for Ray execution context (#3104)
Optimize performance of transfer (#3105)
Add n_reducers and reducer_ordinal to shuffle operands (#3107)
[Ray] Implement cancel method on Ray task executor (#3093)
[Ray] Create RayTaskState actor as needed by default (#3114)
[Ray] Implement gc for ray task executor context (#3116)
Optimize serializable memory (#3126)

Bug fixes

Patch pandas to make pickle compatible between 1.2 and 1.3 (#3050)
Fix errors when deleting mapper data (#3064)
Fix chunk index error in auto_merge_chunks (#3068)
Fix recursive_tile, which could tile the same tileable more than once (#3069)
[Ray] Fix ray worker failover (#3115)
[Ray] Fix pandas schema parsing when reading Ray dataset (#3117)
[Ray] Fix auto scale-in hang (#3125)
[Metric] Fix prometheus metric backend (#3127)
Fix mt.{cumsum, cumprod} when the first chunk is empty (#3136)

Tests

Check initialization of serializables on CI (#3013)
[Ray] Optimize Ray CI execution time and stability (#3121)
Update pytest imports for test_special.py (#3131)
[Ray] Fix flaky test test_optional_supervisor_node (#3135)

Others

Build web code before CIBW when deploying to PyPI (#3016)

Contributors

sak2002

10 May 02:49

wjsi

v0.8.7

e9b8e79

These are the release notes for v0.8.7.

Bug fixes

Fixes missing web packages in Linux wheels (#3014)

07 May 16:48

qinxuye

v0.9.0rc3

03ed810

v0.9.0rc3 Pre-release

These are the release notes for v0.9.0rc3. See here for the complete list of solved issues and merged PRs.

New Features

Tensor
- Implementing Ellipsoidal Harmonics Functions (#2891, thanks @shantam-8!)
Services
- Support worker meta service (#2909)
- Basic Ray execution backend (#2921)

Enhancements

Add execution API to enable customization of Mars Task Service (#2894)
Optimize serialization performance (#2914)
Skip adding band in meta when fetch shuffle data (#2922)
Store complete meta on worker and update supervisor meta via fetching from workers (#2912)
Use cython to accelerate core serialization (#2924)
Refine lifecycle api to support incref or decref with ref counts (#2926)
Ignore fetch operands when assign initial nodes (#2929)
Use cython to accelerate message serialization (#2932)
Ignore broadcaster's locality when assign subtasks (#2943)
Allow spawning serialization to threads for large objects (#2944)
Add metrics and event report for Ray channels (#2936)
Add more logs about execution info (#2940)
Add support for dask.persist (#2953, thanks @loopyme!)
Remove should_be_monotonic property (#2949)
Add metrics on operand and subtask executions (#2947, thanks @zhongchun!)
[Ray] Optimize Ray fetcher by querying on the remote node (#2957)
Improve deploy backend (#2958)
Support reporting tile progress (#2954)
Add logic key for tileable graph (#2961, thanks @zhongchun!)
[Ray] Loads the subtask inputs from meta (#2976)
New ExecutionConfig API (#2968)
Fix speculative execution compatibility with coloring (#2995)
Make functions that may take long run in thread for lifecycle tracker (#2992)
Optimize metric configs (#2996, thanks @zhongchun!)
Expand the ability of resource evaluator (#2997, thanks @zhongchun!)
Optimize gen subtask graph (#3004)
[Ray] Ray execution state (#3002)
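
The lifecycle change in #2926 mentions incref and decref calls that carry explicit reference counts. The notes do not describe the API, so the following is only a generic sketch of counted reference tracking. The class `RefCounter` and its method signatures are hypothetical and are not the Mars lifecycle API.

```python
class RefCounter:
    """Track reference counts per key and report keys that reach zero."""

    def __init__(self):
        self._counts = {}

    def count(self, key):
        # Keys that were never referenced (or fully released) count as 0.
        return self._counts.get(key, 0)

    def incref(self, keys, counts=None):
        # Without explicit counts, every key is incremented by one.
        counts = counts if counts is not None else [1] * len(keys)
        for key, n in zip(keys, counts):
            self._counts[key] = self.count(key) + n

    def decref(self, keys, counts=None):
        """Decrease counts and return the keys whose count dropped to zero."""
        counts = counts if counts is not None else [1] * len(keys)
        released = []
        for key, n in zip(keys, counts):
            remaining = self.count(key) - n
            if remaining < 0:
                raise ValueError(f"reference count of {key!r} would go negative")
            if remaining == 0:
                # Drop the entry so the object can be garbage collected.
                self._counts.pop(key, None)
                released.append(key)
            else:
                self._counts[key] = remaining
        return released
```

Passing counts in bulk lets a caller apply several references to a key in one call instead of issuing one call per reference.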

Bug fixes

Fix parameter issue of worker actor pool (#2911, thanks @zhongchun!)
Fix default config to ensure storage backends configured (#2935)
Wrap errors in operand execution to protect scheduling service (#2964)
Fix dtype of series result for DataFrame.apply (#2978)
Fix potential data leak for shuffle tasks (#2975)
Fix potential empty chunks when creating DataFrame from pandas (#2987)
[Ray] Support new ray cluster through ray client (#2981)
Fix missing extra_params when constructing operands (#2999)
Fix msg_to_simple_str in Ray backend and add tests (#3003)
Fix incorrect result for df.sort_values when specifying multiple ascending (#2984)

Documentation

Add development documents for metrics (#2955, thanks @zhongchun!)

Tests

Add TPC-H benchmarks (#2937)
Fix Ray cases (#2983)
Fix version mismatch between kubernetes and minikube (#2986)
Allow selecting TPC queries (#3005)

Contributors

zhongchun, loopyme, and shantam-8

07 May 16:56

qinxuye

v0.8.6

e550ae4

These are the release notes for v0.8.6. See here for the complete list of solved issues and merged PRs.

New Features

Tensor
- Implementing Ellipsoidal Harmonics Functions (#2927, thanks @shantam-8!)

Enhancements

Add support for dask.persist (#2990, thanks @loopyme!)
Optimize gen subtask graph (#3006)
Ignore broadcaster's locality when assign subtasks (#2994)

Bug fixes

Fix task hang when error object cannot be pickled (#2913)
Fix potential KeyError in actor_ref calls when running with multiple processes (#2962)
Wrap errors in operand execution to protect scheduling service (#2971)
Fix dtype of series result for DataFrame.apply (#2979)
Fix default config to ensure storage backends configured (#2989)
Fix potential empty chunks when creating DataFrame from pandas (#2991)
Fix incorrect result for df.sort_values when specifying multiple ascending (#3006)
Fix missing extra_params when constructing operands (#3006)

Tests

Fix version mismatch between kubernetes and minikube (#2988)

Contributors

loopyme and shantam-8

09 Apr 15:53

qinxuye

v0.9.0rc2

dc93f88

v0.9.0rc2 Pre-release

These are the release notes for v0.9.0rc2. See here for the complete list of solved issues and merged PRs.

New Features

Web
- Add stack display page on Mars Web (#2876)

Enhancements

Avoid printing too many messages in Oscar (#2871)
Expand slot scheduler to resource scheduler (#2846, thanks @zhongchun!)
Optimized iterative tiling by pruning unrelated chunks (#2874)
Optimize DataFrameIsin's tile (#2864)
Add benchmark for serialization (#2901)
[Ray] Ray client channel get recv when first complied (#2740, thanks @Catch-Bull!)
Use bloom filter to optimize df.merge execution (#2895)
Stop recording all mapper meta (#2900)
[Ray] Use main pool as owner when autoscale disabled (#2878)
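
The bloom filter optimization for df.merge (#2895) is not described further in these notes. As a generic illustration of the idea, the sketch below builds a Bloom filter over the join keys of one side and uses it to drop rows from the other side that cannot match before any data is shuffled. The `BloomFilter` class, the `prefilter` helper, and the sizes chosen are assumptions for illustration and are not Mars's implementation.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, n_bits=1 << 16, n_hashes=4):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8 + 1)

    def _positions(self, item):
        data = repr(item).encode("utf-8")
        for seed in range(self.n_hashes):
            # A different salt per seed gives independent hash functions.
            digest = hashlib.blake2b(
                data, digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def prefilter(rows, key_index, bloom):
    """Keep only rows whose join key might exist on the other side."""
    return [row for row in rows if row[key_index] in bloom]


left_keys = [1, 2, 3]
right_rows = [(2, "b"), (3, "c"), (40, "x"), (50, "y")]

bloom = BloomFilter()
for key in left_keys:
    bloom.add(key)

# Rows with keys 2 and 3 always survive; others are very likely dropped.
candidates = prefilter(right_rows, 0, bloom)
```

Because a Bloom filter can report false positives, the surviving rows still go through the real join; the filter only reduces how much data has to be moved.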

Bug fixes

Fix XGBoost when some workers do not have evals data (#2861)
Fix duplicate node iteration in GraphAssigner (#2857)
Raise ActorNotExist when no supervisors available (#2859)
Fix dtype infer in DataFrame arithmetic on datetime consts (#2879)
Fix timeout for wait_task (#2883)
Make sure error can be raised in Actor.__pre_destroy__ (#2887)

Tests

Upgrade azure-pipelines to Python 3.9 (#2862)
Adapt to official cancel of GitHub Actions (#2902)

Contributors

zhongchun and Catch-Bull

09 Apr 16:00

qinxuye

v0.8.5

ed300c5

These are the release notes for v0.8.5. See here for the complete list of solved issues and merged PRs.

New Features

Web
- Add stack display page on Mars Web (#2881)

Enhancements

Avoid printing too many messages in Oscar (#2880)
[Ray] Use main pool as owner when autoscale disabled (#2903)

Bug fixes

Fix XGBoost when some workers do not have evals data (#2863)
Raise ActorNotExist when no supervisors available (#2869)
Fix dtype infer in DataFrame arithmetic on datetime consts (#2880)
Fix duplicate node iteration in GraphAssigner (#2880)
Fix timeout for wait_task (#2890)
Make sure errors can be raised in Actor.__pre_destroy__ (#2892)

Tests

Upgrade azure-pipelines to Python 3.9 (#2886)
Adapt to official cancel of GitHub Actions (#2903)

23 Mar 01:03

wjsi

v0.9.0rc1

96af4fa

v0.9.0rc1 Pre-release

These are the release notes for v0.9.0rc1. See here for the complete list of solved issues and merged PRs.

New Features

Tensor
- Implements mars.tensor.setdiff1d (#2823)
Learn
- Added support for mars.learn.metrics.roc_auc_score (#2832)
Services
- A speculative execution based task scheduler (#2576)
Metric
- [Ray] Add metric for Ray object store (#2776, thanks @Catch-Bull!)
Others
- Use versioneer to manage release versions (#2806)

Enhancements

Support generating a DOT file for subtask graph (#2803)
Support generating dtypes, index_value etc lazily for DataFrame chunks (#2756)
[Ray] Enable fault tolerance for Ray by default (#2801)
Improve subtask details in logs (#2836)
Accurate resource management for global slot manager (#2732)
Configure nthread of XGBoost jobs (#2844)
Improved performance of mars.learn.metrics.{roc_curve, roc_auc_score} (#2838)
Bump minimist and nanoid in Mars UI due to security alerts (#2849)
Fix store duplicate chunk and meta per subtask (#2845)
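
For readers unfamiliar with the DOT format mentioned in #2803, the sketch below shows how a directed graph can be written out as DOT text that Graphviz can render. It is a generic example, not the Mars implementation: `graph_to_dot` is a hypothetical helper, and node names are assumed to contain no double quotes.

```python
def graph_to_dot(edges, name="subtask_graph"):
    """Render a list of (source, target) edges as Graphviz DOT text."""
    lines = [f"digraph {name} {{"]
    # Declare each node once, in a stable order.
    nodes = sorted({node for edge in edges for node in edge})
    for node in nodes:
        lines.append(f'  "{node}";')
    # One "a -> b" statement per dependency edge.
    for src, dst in edges:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


dot_text = graph_to_dot([("read", "map"), ("map", "reduce")])
```

The resulting text can be saved to a `.dot` file and rendered with the Graphviz `dot` command-line tool.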

Bug fixes

Fix default value of gpu property for some operands (#2811)
Fix the failure on Vineyard CI by ensuring the input tensor chunk is a NumPy ndarray (#2817)
Fix race condition of set_subtask_result (#2784)
Fix duplicate subtask submit (#2815)
Change StorageHandlerActor to stateful (#2824)
Fix running xgboost on Ray cluster (#2826)
Fix FileSystem.ls for OSS (#2837)
Stop fetching data when pure dependencies specified (#2840)
Fix dirty version number caused by versioneer when building with cibuildwheel (#2855)

Tests

[Ray] Refine ray tests (#2793)
Build Docker images on a cron schedule (#2804)
Introduce asv benchmark (#2798)

Contributors

Catch-Bull

23 Mar 01:02

wjsi

v0.8.4

9a83bb8

These are the release notes for v0.8.4. See here for the complete list of solved issues and merged PRs.

New Features

Tensor
- Implements mars.tensor.setdiff1d (#2829)
Learn
- Added support for mars.learn.metrics.roc_auc_score (#2841)
Others
- Use versioneer to manage release versions (#2807)
- Use cibuildwheel to release wheels (#2854)

Enhancements

Support generating a DOT file for subtask graph (#2818)
Enhance subtask details in logs (#2842)
Configure cores of XGBoost jobs (#2847)
Improved performance of mars.learn.metrics.{roc_curve, roc_auc_score} (#2850)
Fix store duplicate chunk and meta per subtask (#2851)
Bump minimist and nanoid in Mars UI due to security alerts (#2851)

Bug fixes

Fix race condition of set_subtask_result (#2819)
Fix duplicate subtask submit (#2819)
Fix the failure on Vineyard CI by ensuring the input tensor chunk is a NumPy ndarray (#2819)
Fix default value of gpu property for some operands (#2820)
Fix running xgboost on Ray cluster (#2830)
Change StorageHandlerActor to stateful (#2830)
Fix FileSystem.ls for OSS (#2842)
Stop fetching data when pure dependencies specified (#2843)

Tests

[Ray] Refine ray tests (#2810)
Build Docker images on a cron schedule (#2807)

Releases: mars-project/mars
