Merge pull request #42 from mlcommons/master

sync
flexaihq · Jan 15, 2025 · 290f8d2
2 parents e6a88d1 + 05a79cb
commit 290f8d2

Showing 45 changed files with 543 additions and 106 deletions.
diff --git a/.github/workflows/test-cm.yml b/.github/workflows/test-cm.yml
@@ -16,7 +16,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.7", "3.8", "3.9", "3.10", "3.11", "3.12"]
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
         on: [ubuntu-latest, windows-latest, macos-latest]
         exclude:
           - python-version: "3.7"

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -14,11 +14,11 @@ Modify the project in your own fork and issue a pull request once you want other
 to take a look at what you have done and discuss the proposed changes. 
 Ensure that cla-bot and other checks pass for your Pull requests.
 
-Collective Knowledge (CK) and Collective Mind (CM)
+Collective Knowledge (CK), Collective Mind (CM) and Common Metadata eXchange (CMX)
 were created by [Grigori Fursin](https://arxiv.org/abs/2406.16791),
 sponsored by cKnowledge.org and cTuning.org, and donated to MLCommons 
 to benefit everyone. Since then, this open-source automation technology
-(CM/CMX, CM4MLOps/CM4MLPerf, CM4ABTF, CM4Research, etc) is being extended 
+(CM/CMX, CM4MLOps/MLPerf automations, CM4ABTF, CM4Research, etc) is being extended 
 as a community effort thanks to all our volunteers, collaborators 
 and contributors listed here in alphabetical order:
 

diff --git a/COPYRIGHT.txt b/COPYRIGHT.txt
@@ -1,5 +1,5 @@
-Copyright (c) 2021-2024 MLCommons
+© 2021-2025 MLCommons. All Rights Reserved.
 
-Grigori Fursin, the cTuning foundation and OctoML donated this project to MLCommons to benefit everyone.
+Grigori Fursin, the cTuning foundation and OctoML donated the CK and CM projects to MLCommons to benefit everyone and continue development as a community effort.
 
 Copyright (c) 2014-2021 cTuning foundation
diff --git a/HISTORY.CM.md b/HISTORY.CM.md
@@ -0,0 +1,127 @@
+This document narrates the history of the creation and design of CM and CM4MLOps (also known as CK2) 
+by [Grigori Fursin](https://cKnowledge.org/gfursin). It also highlights the donation of this open-source technology to MLCommons, 
+aimed at benefiting the broader community and fostering its ongoing development as a collaborative, community-driven initiative:
+
+* Jan 28, 2021: After delivering an invited ACM TechTalk'21 about the Collective Knowledge framework (CK1) 
+  and reproducibility initiatives for conferences, as well as CK-MLOps and MLPerf automations, 
+  Grigori received lots of positive feedback and suggestions for improvements to workflow automations:
+  https://learning.acm.org/techtalks/reproducibility. 
+
+  Following this, Grigori began prototyping CK2 (later CM) to streamline CK1, CK-MLOps and MLPerf benchmarking. 
+  The goal was to dramatically simplify CK1 workflows by introducing just a few core and portable automations, 
+  which eventually evolved into `CM script` and `CM cache`.
+
+  At that time, the cTuning foundation hosted CK1 and all the prototypes for the CM framework at https://github.com/ctuning/ck:
+  [ref1](https://github.com/mlcommons/ck/commit/9e57934f4999db23052531e92160772ab831463a), 
+  [ref2](https://github.com/mlcommons/ck/tree/9e57934f4999db23052531e92160772ab831463a),
+  [ref3](https://github.com/mlcommons/ck/tree/9e57934f4999db23052531e92160772ab831463a/incubator).
+
+* Sep 23, 2021: donated CK1, CK-MLOps, MLPerf automations and early prototypes of CM from the cTuning repository to MLCommons:
+  [ref1](https://web.archive.org/web/20240803140223/https://octo.ai/blog/octoml-joins-the-community-effort-to-democratize-mlperf-inference-benchmarking),
+  [ref2](https://github.com/mlcommons/ck/tree/228f80b0bf44610c8244ff0c3f6bec5bbd25aa6c/incubator),
+  [ref3](https://github.com/mlcommons/ck/tree/695c3843fd8121bbdde6c453cd6ec9503986b0c6?tab=readme-ov-file#author-and-coordinator),
+  [ref4](https://github.com/mlcommons/ck/tree/master/ck),
+  [ref5](https://github.com/mlcommons/ck-mlops).
+
+  Prepared MLCommons proposal for the creation of the [MLCommons Task Force on Automation and Reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md),
+  aimed at fostering community-driven support for CK and CM developments to benefit everyone.
+
+* Jan, 2022: hired Arjun Suresh at OctoML to support and maintain the CK1 framework and help prepare OctoML's MLPerf submissions using CK1.
+  Meanwhile, transitioned to focusing on CM and CM-MLOps development, building upon the prototypes created in 2021.
+
+* Mar 1, 2022: started developing cm-mlops: [ref](https://github.com/octoml/cm-mlops/commit/0ae94736a420dfa84f7417fc62d323303b8760c6).
+
+* Mar 24, 2022: after successfully stabilizing the initial prototype of CM, donated it to MLCommons to benefit the entire community:
+  [ref1](https://github.com/mlcommons/ck/tree/c7918ad544f26b6c499c2fc9c07431a9640fca5a/ck2), 
+  [ref2](https://github.com/mlcommons/ck/tree/c7918ad544f26b6c499c2fc9c07431a9640fca5a/ck2#coordinators),
+  [ref3](https://github.com/mlcommons/ck/commit/3c146cb3c75a015363f7a96758adf6dcc43032d6),
+  [ref4](https://github.com/mlcommons/ck/commit/3c146cb3c75a015363f7a96758adf6dcc43032d6#diff-d97f0f6f5a32f16d6ed18b9600ffc650f7b25512685f7a2373436c492c6b52b3R48).
+
+* Apr 6, 2022: started transitioning previous MLOps and MLPerf automations from the mlcommons/ck-mlops format 
+  to the new CM format using the cm-mlops repository (later renamed to cm4mlops):
+  [ref1](https://github.com/octoml/cm-mlops/commit/d1efdc30fb535ce144020d4e88f3ed768c933176),
+  [ref2](https://github.com/octoml/cm-mlops/blob/d1efdc30fb535ce144020d4e88f3ed768c933176/CONTRIBUTIONS).
+
+* Apr 22, 2022: began architecting "Intelligent Components" in the CM-MLOps repository, 
+  which was later renamed to `CM Script`:
+  [ref1](https://github.com/octoml/cm-mlops/commit/b335c609c47d2c547afe174d9df232652d57f4f8),
+  [ref2](https://github.com/octoml/cm-mlops/tree/b335c609c47d2c547afe174d9df232652d57f4f8),
+  [ref3](https://github.com/octoml/cm-mlops/blob/b335c609c47d2c547afe174d9df232652d57f4f8/CONTRIBUTIONS).
+
+  At the same time, prototyped other core CM automations, including IC, Docker, and Experiment:
+  [ref1](https://github.com/octoml/cm-mlops/tree/b335c609c47d2c547afe174d9df232652d57f4f8/automation),
+  [ref2](https://github.com/mlcommons/ck/commits/master/?before=7f66e2438bfe21b4ce2d08326a5168bb9e3132f6+7001).
+
+* Apr 28, 2022: donated CM-MLOps to MLCommons, which was later renamed to CM4MLOps:
+  [ref](https://github.com/mlcommons/ck/commit/456e4861056c0e39c4d689c03da91f90a44be058).
+
+* May 9, 2022: developed the initial set of core IC automations for MLOps (aka CM scripts):
+ [ref1](https://github.com/octoml/cm-mlops/commit/4a4a027f4088ce7e7abcec29c39d98981bf09d4c),
+ [ref2](https://github.com/octoml/cm-mlops/tree/4a4a027f4088ce7e7abcec29c39d98981bf09d4c),
+ [ref3](https://github.com/octoml/cm-mlops/blob/7692240becd6397a96c3975388913ea082002e7a/CONTRIBUTIONS).
+
+* May 11, 2022: After successfully prototyping CM and CM-MLOps, deprecated the CK1 framework in favor of CM. 
+  Transferred Arjun Suresh to the CM project as a maintainer and tester for CM and CM-MLOps:
+  [ref](https://github.com/octoml/cm-mlops/blob/17405833665bc1e93820f9ff76deb28a0f543bdb/CONTRIBUTIONS).
+
+  Created a [file](https://github.com/mlcommons/ck/blob/master/cm-mlops/CHANGES.md) 
+  to document and track our public developments at MLCommons.
+
+* Jun 8, 2022: renamed the 'IC' automation to the more intuitive 'CM script' automation. 
+  [ref1](https://github.com/mlcommons/ck/tree/5ca4e2c33e58a660ac20a545d8aa5143ab6e8e81/cm-devops/automation/script),
+  [ref2](https://github.com/mlcommons/ck/tree/5ca4e2c33e58a660ac20a545d8aa5143ab6e8e81),
+  [ref3](https://github.com/octoml/cm-mlops/commit/7910fb7ffc62a617d987d2f887d6f9981ff80187).
+
+* Jun 16, 2022: prototyped the `CM cache` automation to facilitate caching and reuse of the outputs from CM scripts:
+  [ref1](https://github.com/mlcommons/ck/commit/1f81aae8cebd5567ec4ca55f693beaf32b49fb48),
+  [ref2](https://github.com/mlcommons/ck/tree/1f81aae8cebd5567ec4ca55f693beaf32b49fb48),
+  [ref3](https://github.com/mlcommons/ck/tree/1f81aae8cebd5567ec4ca55f693beaf32b49fb48?tab=readme-ov-file#contacts).
+
+* Sep 6, 2022: delivered CM demo to run MLPerf while deprecating CK1 automations for MLPerf:
+  [ref1](https://github.com/mlcommons/ck/commit/2c5d5c5c944ae5f252113c62af457c7a4c5e877a#diff-faac2c4ecfd0bfb928dafc938d3dad5651762fbb504a2544752a337294ee2573R224),
+  [ref2](https://github.com/mlcommons/ck/blob/2c5d5c5c944ae5f252113c62af457c7a4c5e877a/CONTRIBUTING.md#author-and-coordinator).
+
+  Welcomed Arjun Suresh as a contributor to CM automations for MLPerf: [ref](https://github.com/mlcommons/ck/blob/2c5d5c5c944ae5f252113c62af457c7a4c5e877a/CONTRIBUTING.md#contributors-in-alphabetical-order).
+
+* From September 2022: coordinated community development of CM and CM4MLOps 
+  to [modularize and automate MLPerf benchmarks](https://docs.mlcommons.org/inference)
+  and support [reproducibility initiatives at ML and Systems conferences](https://cTuning.org/ae)
+  through the [MLCommons Task Force on Automation and Reproducibility](https://github.com/mlcommons/ck/blob/master/docs/taskforce.md).
+
+  * Directed and financed the creation of (CM) automations to streamline the MLPerf power measurement processes.
+
+  * Proposed to use MLPerf benchmarks for the Student Cluster Competition, led the developments 
+    and prepared a tutorial to run MLPerf inference at SCC'22 via CM: [ref](https://github.com/mlcommons/ck/blob/master/docs/tutorials/sc22-scc-mlperf.md)
+
+* April 2023: departed OctoML to focus on the development of the [CK playground](https://access.cKnowledge.org) and CM automations 
+  to make MLPerf accessible to everyone. Hired Arjun Suresh to help with developments.
+
+  * Initiated and funded development of the [MLPerf explorer](https://github.com/ctuning/q2a-mlperf-visualizer)
+    to improve visualization of results
+
+* August 2023: organized the 1st mass-scale MLPerf community submission of 12217 inference benchmark v3.1 results 
+   out of a total of 13351 results (including 90% of all power results) across diverse models, software and hardware
+   from different vendors via [open challenges](https://access.cknowledge.org/playground/?action=challenges) funded by cTuning.org:
+   [LinkedIn article](https://www.linkedin.com/pulse/new-milestone-make-mlperf-benchmarks-accessible-everyone-fursin/) 
+   with results visualized by the [MLPerf explorer](https://github.com/ctuning/q2a-mlperf-visualizer),
+   [CM4MLOps challenges at GitHub](https://github.com/mlcommons/cm4mlops/tree/main/challenge). 
+
+* February, 2024: proposed to use CM to automate [MLPerf automotive benchmark (ABTF)](https://mlcommons.org/working-groups/benchmarks/automotive/).
+
+  * moved the prototypes of the CM automation for ABTF to the cm4abtf repo: [ref](https://github.com/mlcommons/cm4abtf/commit/f92b9f464de89a38a4bde149290dede2d94c8631)
+  * led further CM4ABTF developments funded by cTuning.org.
+
+* Starting in April 2024, began the gradual transfer of ongoing maintenance and enhancement 
+  responsibilities for CM and CM4MLOps, including MLPerf automations, to MLCommons.
+  Welcomed Anandhu Sooraj as a maintainer and contributor to CM4MLOps with MLPerf automations.
+
+* Took a break from all development activities.
+
+* July 2024: started prototyping the next generation of CM (CMX and CMX4MLOps) with simpler interfaces 
+  based on user feedback while maintaining backward compatibility.
+
+* 2025: continue developing CMX and CMX4MLOps to make it easier to run and customize MLPerf inference, training
+  and other benchmarks across diverse models, datasets, software and hardware.
+
+For more details, please refer to the [white paper](https://arxiv.org/abs/2406.16791) 
+and the [ACM REP'23 keynote](https://doi.org/10.5281/zenodo.8105339).
diff --git a/README.md b/README.md
@@ -18,7 +18,14 @@ in the most efficient and cost-effective way across diverse models, data sets, s
 
 It includes the following sub-projects.
 
-### Collective Mind (CM)
+### Collective Minds (CM)
+
+The Collective Mind (CM) project, or Collective Minds, facilitates the
+decomposition of complex software systems into portable, reusable, and
+interconnected automation recipes. These recipes are developed and
+continuously improved by the community.
+
+#### CM automation framework
 
 The [Collective Mind framework](https://github.com/mlcommons/ck/tree/master/cm) 
 is a lightweight, Python-based toolset featuring 
@@ -30,12 +37,21 @@ and other applications across diverse and continuously changing models, data, so
 Collective Mind is continuously enhanced through public and private Git repositories
 with CM automation recipes and artifacts accessible via unified CM interface.
 
-The CM architecture diagram is available for viewing 
+#### CMX automation framework
+
+[CMX](https://github.com/mlcommons/ck/tree/master/cmx) is the next evolution 
+of the Collective Mind framework designed to enhance simplicity, flexibility, and extensibility of automations 
+based on user feedback. It is backwards compatible with CM, released along with CM 
+in the [cmind package](https://pypi.org/project/cmind/) and can serve as a drop-in replacement for CM.
+
+The CM/CMX architecture diagram is available for viewing 
 [here](https://github.com/mlcommons/ck/tree/master/docs/specs/cm-diagram-v3.5.1.png).
 
-### Notable Collective Mind repositories
 
-#### CM4MLOps
+
+### Notable CM use cases
+
+#### MLOps and MLPerf automations
 
 [CM4MLOPS repository powered by CM](https://github.com/mlcommons/ck/tree/master/cm-mlops) - 
 a collection of portable, extensible and technology-agnostic automation recipes
@@ -57,27 +73,19 @@ while keeping backward compatibility.
 See the [online documentation](https://docs.mlcommons.org/inference) 
 at MLCommons to run MLPerf inference benchmarks across diverse systems using CM.
 
-#### CM4ABTF
+#### MLCommons ABTF automation
 
 [CM4ABTF repository powered by CM](https://github.com/mlcommons/cm4abtf) - 
 a collection of portable automations and CM scripts to run the upcoming 
 automotive MLPerf benchmark across different models, data sets, software 
 and hardware from different vendors.
 
-#### CM4MLPerf-results
+#### MLPerf results visualization
 
 [CM4MLPerf-results powered by CM](https://github.com/mlcommons/cm4mlperf-results) - 
 a simplified and unified representation of the past MLPerf results 
 in the CM format for further visualization and analysis using [CK graphs](https://access.cknowledge.org/playground/?action=experiments).
 
-#### CM4Research
-
-[CM4Research repository powered by CM](https://github.com/ctuning/cm4research) - 
-a unified interface designed to streamline the preparation, execution, and reproduction of experiments in research projects.
-
-
-### Projects powered by Collective Mind
-
 #### Collective Knowledge Playground
 
 [Collective Knowledge Playground](https://access.cKnowledge.org) - 
@@ -97,18 +105,23 @@ collaboratively enhance the efficiency and cost-effectiveness of AI systems.
 leveraging the Collective Mind framework to automate artifact evaluation 
 and support reproducibility efforts at ML and systems conferences.
 
+* [CM4Research repository powered by CM](https://github.com/ctuning/cm4research) - 
+a unified interface designed to streamline the preparation, execution, and reproduction of experiments in research projects.
+
 
-## Incubator
+## Legacy projects 
 
-[CMX](https://github.com/mlcommons/ck/tree/master/cmx) - the next evolution of the Collective Mind framework,
-designed to enhance simplicity, flexibility, and extensibility of automations 
-based on user feedback. Follow the project's progress [here]( https://github.com/orgs/mlcommons/projects/46 ).
+### CM-MLOps (now CM4MLOps)
 
+You can find the original CM-MLOps dev directory [here](https://github.com/mlcommons/ck/tree/master/cm-mlops).
+We moved it to [CM4MLOps](https://github.com/mlcommons/ck/tree/master/cm4mlops) in 2024.
+In 2025, we aggregate all CM and CMX automations in the [new CMX4MLOps repository](https://github.com/mlcommons/ck/tree/master/cmx4mlops).
 
-## Archived projects 
+### CK automation framework v1 and v2
 
-* [CM-MLOps](https://github.com/mlcommons/ck/tree/master/cm-mlops) - now [CM4MLOps](https://github.com/mlcommons/ck/tree/master/cm4mlops)
-* [CK automation framework v1 and v2](https://github.com/mlcommons/ck/tree/master/ck) - now [CM](https://github.com/mlcommons/ck/tree/master/cm)
+You can find the original CK automation framework v1 and v2 directory [here](https://github.com/mlcommons/ck/tree/master/ck).
+It was deprecated for the [CM framework](https://github.com/mlcommons/ck/tree/master/cm)
+and later for the [CMX workflow automation framework (backwards compatible with CM)](https://github.com/mlcommons/ck/tree/master/cmx).
 
 
 ## License
@@ -117,17 +130,17 @@ based on user feedback. Follow the project's progress [here]( https://github.com
 
 ## Copyright
 
-* Copyright (c) 2021-2024 MLCommons
-* Copyright (c) 2014-2021 cTuning foundation
+Copyright (c) 2021-2024 MLCommons
+
+Grigori Fursin, the cTuning foundation and OctoML donated this project to MLCommons to benefit everyone.
+
+Copyright (c) 2014-2021 cTuning foundation
 
 ## Author
 
 * [Grigori Fursin](https://cKnowledge.org/gfursin) (FlexAI, cTuning)
 
-## Citing Collective Mind and Collective Knowledge
-
-If you found the CM automation framework helpful, kindly reference this article:
-[ [ArXiv](https://arxiv.org/abs/2406.16791) ], [ [BibTex](https://github.com/mlcommons/ck/blob/master/citation.bib) ].
+## Long-term vision
 
 To learn more about the motivation behind CK and CM technology, please explore the following presentations:
 
@@ -136,10 +149,9 @@ To learn more about the motivation behind CK and CM technology, please explore t
 * ACM TechTalk'21 about Collective Knowledge project: [ [YouTube](https://www.youtube.com/watch?v=7zpeIVwICa4) ] [ [slides](https://learning.acm.org/binaries/content/assets/leaning-center/webinar-slides/2021/grigorifursin_techtalk_slides.pdf) ]
 * Journal of Royal Society'20: [ [paper](https://royalsocietypublishing.org/doi/10.1098/rsta.2020.0211) ]
 
+## Documentation
 
-## CM Documentation
-
-* [Collective Mind white paper](https://arxiv.org/abs/2406.16791)
+* [White paper](https://arxiv.org/abs/2406.16791)
 * [CM/CMX architecture](https://github.com/mlcommons/ck/tree/master/docs/specs/cm-diagram-v3.5.1.png)
 * [CM/CMX installation GUI](https://access.cknowledge.org/playground/?action=install)
 * [CM Getting Started Guide and FAQ](https://github.com/mlcommons/ck/tree/master/docs/getting-started.md)
@@ -149,13 +161,12 @@ To learn more about the motivation behind CK and CM technology, please explore t
   * [Other CM tutorials](https://github.com/mlcommons/ck/tree/master/docs/tutorials)
 * [Full documentation](https://github.com/mlcommons/ck/tree/master/docs/README.md)
 * [CM taskforce](https://github.com/mlcommons/ck/tree/master/docs/taskforce.md)
-* [CK, CM and CMX history](https://github.com/mlcommons/ck/tree/master/docs/history.md)
+* History: [CK](https://github.com/mlcommons/ck/tree/master/docs/history.md), [CM and CM automations for MLOps and MLPerf](https://github.com/mlcommons/ck/blob/master/HISTORY.CM.md)
 
 
 ### Acknowledgments
 
-The open-source Collective Knowledge project (CK, CM, CM4MLOps/CM4MLPerf, 
-CM4Research and CMX) was created by [Grigori Fursin](https://cKnowledge.org/gfursin)
+This open-source project was created by [Grigori Fursin](https://cKnowledge.org/gfursin)
 and sponsored by cTuning.org, OctoAI and HiPEAC.
 Grigori donated CK to MLCommons to benefit the community
 and to advance its development as a collaborative, community-driven effort.
@@ -164,3 +175,6 @@ We thank [MLCommons](https://mlcommons.org), [FlexAI](https://flex.ai)
 and [cTuning](https://cTuning.org) for supporting this project,
 as well as our dedicated [volunteers and collaborators](https://github.com/mlcommons/ck/blob/master/CONTRIBUTING.md)
 for their feedback and contributions!
+
+If you found the CM automations helpful, kindly reference this article:
+[ [ArXiv](https://arxiv.org/abs/2406.16791) ], [ [BibTex](https://github.com/mlcommons/ck/blob/master/citation.bib) ].
diff --git a/cm-mlops/COPYRIGHT.md b/cm-mlops/COPYRIGHT.md
@@ -0,0 +1,5 @@
+# Copyright Notice
+
+© 2022-2025 MLCommons. All Rights Reserved.
+
+Grigori Fursin, the cTuning foundation and OctoML donated the CK and CM projects to MLCommons to benefit everyone and continue development as a community effort.