Release 0.6.1 #490
Merged
This PR brings develop up to date with master before releasing v0.6.0 [ committed by @al-rigazzi ] [ reviewed by @MattToast ]
This PR adds concurrency groups to GitHub's CI/CD workflows, preventing multiple workflows from the same PR from being launched concurrently. [ committed by @al-rigazzi ] [ reviewed by @MattToast ]
The sphinx-tabs documentation extension uses a white background for the tabs component. This causes readability issues with the theme that we have chosen. A custom CSS has been added to override those components to inherit the overall theme color. [ committed by @mellis13 ] [ reviewed by @al-rigazzi ]
Adds infrastructure to fetch RedisAI's dependencies. This removes the need to call RedisAI's `get_deps.sh` script so that we can fetch newer versions of our machine learning backends than the ones officially supported by RedisAI. Additionally, this upgrades the machine learning python packages required by SmartSim so that they stay up to date with the backends. This in turn allows us to add Python3.10+ONNX support. [ committed by @MattToast ] [ reviewed by @ashao ]
The implementation makes use of Python's `contextvars.ContextVar` to store experiment-specific state. The state is used to dynamically modify experiment-level logging. For example, this driver:

```py
exp1 = smartsim.Experiment('exp-1')
rs1 = exp1.create_runsettings(...)
model1 = exp1.create_model(..., rs1)

exp2 = smartsim.Experiment('other-exp')
rs2 = exp2.create_runsettings(...)
model2 = exp2.create_model(..., rs2)

exp1.start(model1)
exp2.start(model2)
```

results in each experiment dynamically registering `logging.FileHandler` instances that write logs to separate files:

- `/exp-1/.telemetry/smartsim/smartsim.out`
- `/other-exp/.telemetry/smartsim/smartsim.out`

### Key changes:

1. Decorated experiment API w/contextualizer to enrich log context
2. Create/Use `ContextThread` to ensure threads include current context information
3. Create/Use `ContextAwareLogger` to dynamically add file handlers for experiment logs
4. Updated manifest serialization to include paths to experiment-specific log files
5. Added `LowPassFilter` to enable splitting experiment logs across `xxx.out` and `xxx.err`

### Additional minor changes:

1. Moved `serialize.TELMON_SUBDIR` constant to `Config.telemetry_subdir` to make it more universally available

---------

Co-authored-by: Matt Drozt

[ committed by @ankona ] [ reviewed by @al-rigazzi @MattToast ]
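The context-propagation and log-splitting ideas above can be sketched with the standard library alone. This is a minimal illustration, not SmartSim's actual code: the class names are borrowed from the change list, but their bodies, the `ctx_exp_name` variable, and the `read_name_in_thread` helper are assumptions made for the example.

```python
import contextvars
import logging
import threading

# Hypothetical context variable holding the active experiment's name.
ctx_exp_name = contextvars.ContextVar("exp_name", default="")


class ContextThread(threading.Thread):
    """Thread that runs its target inside a copy of the creator's context."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Snapshot the context at construction time, in the parent thread.
        self._ctx = contextvars.copy_context()

    def run(self):
        # Execute the normal Thread.run logic with the captured context active.
        self._ctx.run(super().run)


class LowPassFilter(logging.Filter):
    """Let through only records at or below a maximum level."""

    def __init__(self, maximum_level=logging.INFO):
        super().__init__()
        self.max_level = maximum_level

    def filter(self, record):
        return record.levelno <= self.max_level


def read_name_in_thread():
    """Return the experiment name as seen from inside a ContextThread."""
    seen = []
    thread = ContextThread(target=lambda: seen.append(ctx_exp_name.get()))
    thread.start()
    thread.join()
    return seen[0]
```

Attaching a `LowPassFilter` to the handler for `xxx.out` and setting the `xxx.err` handler's level to `logging.WARNING` would be one way to route each record to exactly one of the two files.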
As we are not aware of any system still using the Cobalt workload manager, its support in SmartSim was terminated. [ committed by @al-rigazzi ] [ reviewed by @MattToast @ashao ]
This PR updates GitHub CI/CD actions to latest versions, as some of those used in the workflows were outdated. [ committed by @al-rigazzi ] [ reviewed by @ashao ]
Quality of life `smart validate` improvements:

- Set `CUDA_VISIBLE_DEVICES` environment variable within `smart validate` prior to importing any ML deps to prevent false negatives on multi-GPU systems
- Move SmartRedis logs from standard out to dedicated log file in the validation temporary directory
- Suppress `sklearn` deprecation warning by pinning `KMeans` constructor argument
- Move TF test to last as TF may reserve the GPUs it uses

[ committed by @MattToast ] [ reviewed by @al-rigazzi @ashao ]
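The first item depends on ordering: the variable has to be set before any ML framework is imported. A minimal sketch of that ordering follows, where `pin_visible_gpus` is a hypothetical helper and not part of the `smart` CLI.

```python
import os


def pin_visible_gpus(device_ids, env=None):
    """Set CUDA_VISIBLE_DEVICES to a comma-separated list of device ids.

    `env` defaults to the real process environment; passing a plain dict
    makes the helper easy to exercise without touching global state.
    """
    env = os.environ if env is None else env
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return env["CUDA_VISIBLE_DEVICES"]


# Intended ordering (assumption): pin first, import the ML backends after.
#   pin_visible_gpus([0])
#   import torch  # imported only after the variable is set
```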
Add Python 3.11 to SmartSim [ committed by @MattToast ] [ reviewed by @ashao ]
Relax the required version of `typing_extensions` [ committed by @MattToast ] [ reviewed by @ankona ]
This PR merges in GitHub Actions for running the black and isort checks. [ reviewed by @MattToast ] [ committed by @amandarichardsonn ]
This PR adds Python type hinting to RunSettings.colocated_db_settings. [ reviewed by @MattToast ] [ committed by @amandarichardsonn ]
This PR fixes the `test_logs.py::test_context_leak` test that was erroneously creating a directory named `some value` in SmartSim's root directory.
Add and ship `py.typed` marker to expose inline type hints. Fix type errors related to SmartRedis. [ committed by @MattToast ] [ reviewed by @al-rigazzi ]
This PR merges in functionality to validate the timing format when requesting a Slurm allocation. Previously, no check was performed, leaving it to the WLM to throw an error. With the new code, SmartSim catches the invalid format and throws the error itself. [ reviewed by @MattToast ] [ committed by @amandarichardsonn ]
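A check of this kind can be sketched as follows. The `hh:mm:ss` format, the function name `validate_walltime`, and the choice of `ValueError` are all assumptions for illustration; the PR text does not state which format or exception SmartSim uses.

```python
import re

# Assumed format: one or more hour digits, then two-digit minutes and seconds.
_WALLTIME_PATTERN = re.compile(r"\d+:[0-5]\d:[0-5]\d")


def validate_walltime(time_str):
    """Return `time_str` unchanged if it looks like hh:mm:ss, else raise."""
    if not _WALLTIME_PATTERN.fullmatch(time_str):
        raise ValueError(
            f"Invalid walltime {time_str!r}; expected the format hh:mm:ss"
        )
    return time_str
```

Failing early in the client gives the user a Python error that names the bad value, instead of a workload-manager message after the request has been submitted.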
The Torch eval() function is invoked in the tests to resolve warnings related to model tracing. [ reviewed by @MattToast @ashao ] [ committed by @mellis13 ]
With the new ml_lib_builder repository we can now ship a version of libtorch that is compiled for Mac OSX on Apple Silicon (arm64). Here the RedisAIBuilder method is reworked to detect this platform and retrieve the appropriate version of libtorch. Some additional refactoring was done to improve the internals of this class. [ committed by @ashao ] [ reviewed by @MattToast ]
Refactor logic of `Manifest.has_db_objects` to remove excess branching and improve readability/maintainability. [ committed by @MattToast ] [ reviewed by @ankona ]
This PR makes several patch changes to prepare for a SmartSim release including:

- Set the default value of the "enable telemetry" flag to on. Currently this will enable telemetry system wide until finer grain control can be established with #460
- Bump the output `manifest.json` version number to match that of `smartdashboard`
- Pin a watchdog version to avoid build errors

[ committed by @MattToast @ankona ] [ reviewed by @ankona ]

---------

Co-authored-by: Christopher McBride
This PR fixes a bug which prevented the expected behavior when the `SMARTSIM_LOG_LEVEL` environment variable was set to `developer`. [ committed by @al-rigazzi ] [ reviewed by @MattToast @ankona ]
Updates `Copyright (c) 2021-2023` to `Copyright (c) 2021-2024` in all of the necessary files.
This PR prevents ML models and scripts from being added to an Ensemble member if another model or script with the same name already exists. The checks are performed for `Ensemble.add_ml_model()`, `Ensemble.add_model()`, `Ensemble.add_script()` and `Ensemble.add_function()`. [ reviewed by @ankona @MattToast ] [ committed by @amandarichardsonn ]
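The shape of such a check can be illustrated with a small stand-in class. `EnsembleSketch`, its attributes, and the use of `ValueError` are hypothetical and only mirror the behaviour described above; they are not SmartSim's `Ensemble` implementation.

```python
class EnsembleSketch:
    """Hypothetical stand-in that rejects duplicate ML model and script names."""

    def __init__(self):
        self._ml_model_names = set()
        self._script_names = set()

    def add_ml_model(self, name):
        # Refuse a second ML model registered under an existing name.
        if name in self._ml_model_names:
            raise ValueError(f"An ML model named {name!r} already exists")
        self._ml_model_names.add(name)

    def add_script(self, name):
        # Scripts are tracked separately from ML models in this sketch.
        if name in self._script_names:
            raise ValueError(f"A script named {name!r} already exists")
        self._script_names.add(name)
```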
SmartSim support for MacOS with Apple Silicon is still fragile for common configurations and does not have full feature parity with MacOS on Intel. The docs now call out that MacOS on Apple Silicon with Clang 15 does not build correctly and offer a solution. The docs also highlight that only PyTorch is supported on MacOS for now. [ committed by @ashao ] [ reviewed by @ankona ]
Cloning Redis on Apple Silicon results in some of the Redis build scripts having Windows-style line endings. This leads to errors because the interpreter line of these scripts cannot be parsed correctly (e.g. `/bin/sh^M`). To solve this, we now modify the `git clone` for both Redis and RedisAI to set the line endings to unix-style line endings when using MacOS on ARM. [ committed by @ashao and @MattToast ] [ reviewed by @al-rigazzi ] Co-authored-by: Matt Drozt
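One way to express this is to add git's line-ending settings to the clone command only on the affected platform. The sketch below builds the command without running it; `build_clone_cmd` is a hypothetical helper, and the specific settings (`core.autocrlf=false`, `core.eol=lf`) are an assumption about how the line endings are forced, since the text above does not name them.

```python
import platform


def build_clone_cmd(url, dest, system=None, machine=None):
    """Build a `git clone` command, forcing LF endings on macOS/arm64.

    `system` and `machine` default to the values reported by `platform`;
    they are parameters so the logic can be checked on any host.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine

    cmd = ["git", "clone"]
    if system == "Darwin" and machine == "arm64":
        # `git clone --config key=value` sets the option in the new repository.
        cmd += ["--config", "core.autocrlf=false", "--config", "core.eol=lf"]
    return cmd + [url, dest]
```

The resulting list can be passed to `subprocess.run` as is.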
This PR updates the changelog to prepare for release. [ reviewed by @MattToast ] [ committed by @amandarichardsonn ]
Update version number to 0.6.1 [ committed by @amandarichardsonn @MattToast ] [ reviewed by @al-rigazzi ]
MattToast
approved these changes
Feb 15, 2024
LGTM!! 🎉
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #490      +/-   ##
==========================================
+ Coverage   90.28%   90.61%   +0.32%
==========================================
  Files          60       60
  Lines        3748     3826      +78
==========================================
+ Hits         3384     3467      +83
+ Misses        364      359       -5
This PR brings master up to date with develop before releasing v0.6.1.