update smartsim API with core changes #721

juliaputko · 2024-09-25T23:18:36Z

No description provided.

@ashao

The release of watchdog v5 introduced new types which caused further errors with mypy. To mitigate these errors for now, we pin the watchdog version to 4.x and will resolve these errors in the future. [ committed by @ashao ] [ reviewed by @al-rigazzi ]

@juliaputko

Allow specifying Model and Ensemble parameters with number-like types. The constructors for parameters on Model and Ensemble now validate that each input is number-like and convert it to a string. [ committed by @juliaputko ] [ reviewed by @ashao ]
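A minimal sketch of the validate-then-stringify behaviour described in this commit message. It is an illustration only, not SmartSim's actual implementation: the helper name `stringify_params`, the flat `dict` input, and the decision to reject `bool` values are all assumptions made for this example.

```python
from numbers import Number


def stringify_params(params):
    """Return a copy of `params` with every value converted to a string.

    Hypothetical helper for illustration; not part of SmartSim's API.
    Values must be strings or number-like (int, float, Fraction, ...).
    """
    converted = {}
    for name, value in params.items():
        # Assumption: bool is rejected even though it subclasses int,
        # so True/False are not silently accepted as numbers.
        if isinstance(value, bool) or not isinstance(value, (str, Number)):
            raise TypeError(
                f"Parameter {name!r} must be a string or number-like, "
                f"got {type(value).__name__}"
            )
        converted[name] = str(value)
    return converted
```

Under these assumptions, `stringify_params({"steps": 10, "dt": 0.5, "tag": "a"})` returns `{"steps": "10", "dt": "0.5", "tag": "a"}`, while a list or `None` value raises `TypeError`.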

@ashao

- The RedisAIBuilder class was completely overhauled to allow users to express a wider range of support for hardware/software stacks. This will be extended to support ROCm, CUDA-11, and CUDA-12.
- Versions for each of these packages are no longer specified in an internal class. Instead a default set of JSON files specifies the sources and versions. Users can specify their own custom specifications at `smart build` time.

[ committed by @ashao ] [ reviewed by @MattToast @juliaputko ]

Co-authored-by: Matt Drozt <[email protected]>
Co-authored-by: Julia Putko <[email protected]>

@ashao

After discussing with admins at OLCF, miniforge is the preferred solution for creating virtual environments on Frontier. The instructions for installing SmartSim have been updated accordingly. Additionally, the Perlmutter instructions did not have a step for compiling the SmartRedis libraries. This has been rectified to bring the two systems to parity. [ committed by @ashao ] [ reviewed by @MattToast @AlyssaCote ]

@ashao

On Frontier, the recommended way to activate conda environments is to use ``source activate``. This also means that ``conda init`` is not needed. The instructions for Frontier have been updated to reflect this. [ committed by @ashao ] [ reviewed by @MattToast ]

@MattToast

Bump the version number for the release, last minute actions and docs fixes [ committed by @MattToast ] [ reviewed by @ashao ]

codecov · 2024-09-25T23:22:58Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 47.32%. Comparing base (cce16e6) to head (8db1ff1).
Report is 19 commits behind head on smartsim-refactor.

Additional details and impacted files

@@                  Coverage Diff                  @@
##           smartsim-refactor     #721      +/-   ##
=====================================================
+ Coverage              40.45%   47.32%   +6.87%     
=====================================================
  Files                    110      109       -1     
  Lines                   7326     6573     -753     
=====================================================
+ Hits                    2964     3111     +147     
+ Misses                  4362     3462     -900

| Files with missing lines | Coverage Δ |
|---|---|
| smartsim/builders/ensemble.py | `93.04% <ø> (ø)` |

... and 15 files with indirect coverage changes

@ashao

Based on feedback from OLCF, users may need to set the MIOPEN cache prior to running `smart validate`. The installation instructions for Frontier have been updated accordingly. [ committed by @ashao ] [ reviewed by @MattToast ]

@MattToast

Removes the use of CI Build Wheel now that SmartSim is a pure python package. [ committed by @MattToast ] [ reviewed by @ashao ]

@MattToast

Merge develop to master for release [ committed by @MattToast ] [ reviewed by @al-rigazzi ]

This PR brings develop up to date with master for release.

… autosummary

@ashao

Scylla is in a preliminary state and so needs some specific instructions to help install SmartSim with CUDA support. The directions included here are preliminary and will be updated as needed. [ committed by @ashao ] [ reviewed by @MattToast @amandarichardsonn ]

@ashao

In libtensorflow, the `input` argument to `TF_SessionRun` appears to be typed as `TF_Output` instead of `TF_Input`. These two types differ only in name. GCC-14 catches this and throws an error, even though earlier versions allow it. To solve this problem, patches are applied to the TensorFlow backend in RedisAI. Future versions of TensorFlow may fix this problem, but for now this seems to be the best workaround. [ committed by @ashao ] [ reviewed by @MattToast ]

Create a v1.0 branch to combine ongoing efforts in `mli-feature` and `smartsim-refactor` feature branches

Co-authored-by: Alyssa Cote <[email protected]>
Co-authored-by: Al Rigazzi <[email protected]>

Combine the `core-refactor` feature branch with `mli-feature` in `v1.0` branch

Co-authored-by: Amanda Richardson <[email protected]>
Co-authored-by: Amanda Richardson <[email protected]>
Co-authored-by: Matt Drozt <[email protected]>
Co-authored-by: Julia Putko <[email protected]>
Co-authored-by: amandarichardsonn <[email protected]>
Co-authored-by: Alyssa Cote <[email protected]>
Co-authored-by: Al Rigazzi <[email protected]>
Co-authored-by: Julia Putko <[email protected]>
Co-authored-by: Matt Drozt <[email protected]>

@AlyssaCote

Logs are improved during dragon install when there is a platform and asset type mismatch. [ committed by @AlyssaCote ] [ reviewed by @ankona ]

ashao and others added 7 commits September 2, 2024 11:47

Bump version number to 0.8.0 (CrayLabs#718)

e8eaa2b

Bump the version number for the release, last minute actions and docs fixes [ committed by @MattToast ] [ reviewed by @ashao ]

update smartsim API with core changes

a4cea3d

ashao and others added 20 commits September 26, 2024 15:46

Make a user-specific db cache (CrayLabs#727)

10bdeac

Based on feedback from OLCF, users may need to set the MIOPEN cache prior to running `smart validate`. The installation instructions for Frontier have been updated accordingly. [ committed by @ashao ] [ reviewed by @MattToast ]

Update release action to remove CI Build Wheel (CrayLabs#728)

0bab07c

Removes the use of CI Build Wheel now that SmartSim is a pure python package. [ committed by @MattToast ] [ reviewed by @ashao ]

Release v0.8.0 (CrayLabs#730)

528c1ae

Merge develop to master for release [ committed by @MattToast ] [ reviewed by @al-rigazzi ]

Merge master into develop (CrayLabs#731)

342a67f

This PR brings develop up to date with master for release.

Merge branch 'smartsim-refactor' of github.com:CrayLabs/SmartSim into…

0d7034e

… autosummary

updated smartsimapi

d72a97a

Merge MLI feature branch into v1.0 branch (CrayLabs#754)

8a19dee

Create a v1.0 branch to combine ongoing efforts in `mli-feature` and `smartsim-refactor` feature branches

Co-authored-by: Alyssa Cote <[email protected]>
Co-authored-by: Al Rigazzi <[email protected]>

import settings, and title fixes in api docs

7ea1557

path changes for docs

d6902cb

docs fix

392a95a

readthedocs error fixes

e618758

readthedocs error fixes

2b30334

readthedocs error fixes

0756a60

fix spacing issue

8db1ff1

Fix logging bug in dragon_install (CrayLabs#761)

879b96e

Logs are improved during dragon install when there is a platform and asset type mismatch. [ committed by @AlyssaCote ] [ reviewed by @ankona ]

resolve merge conflicts

abad5a8

unpin sphinx versions

328ea86

juliaputko closed this Oct 31, 2024
