Merge core refactor into v1.0 feature branch #756

Merged 118 commits on Oct 28, 2024
Changes from 114 commits
Commits
85a022e
pushing just to update
amandarichardsonn May 2, 2024
ffaa9b3
updates
amandarichardsonn May 3, 2024
8210991
pushing updates
amandarichardsonn May 7, 2024
6e595fb
pushing mypy errors
amandarichardsonn May 8, 2024
821ffe3
Merge branch 'develop' into smartsim-refactor
amandarichardsonn May 15, 2024
f2ba3a7
Merge branch 'develop' of https://github.com/CrayLabs/SmartSim into s…
MattToast May 20, 2024
919e65a
Merge branch 'develop' of https://github.com/CrayLabs/SmartSim into s…
amandarichardsonn May 21, 2024
0c0ab7b
Merge branch 'smartsim-refactor' of https://github.com/CrayLabs/Smart…
amandarichardsonn May 21, 2024
f3a58d4
Merge remote-tracking branch 'upstream/develop' into smartsim-refactor
MattToast May 23, 2024
3ed3ef7
Merge branch 'develop' of https://github.com/CrayLabs/SmartSim into s…
amandarichardsonn May 23, 2024
e96f893
Merge branch 'smartsim-refactor' of https://github.com/CrayLabs/Smart…
amandarichardsonn May 23, 2024
e31c837
Move SmartSim v0.X.Y test to `tests/_legacy` dir (#575)
MattToast May 24, 2024
2cefbcf
Merge branch 'develop' into smartsim-refactor
amandarichardsonn May 28, 2024
ce35746
Merge branch 'smartsim-refactor' of https://github.com/CrayLabs/Smart…
amandarichardsonn May 28, 2024
d105df6
Fix Inconsistency in run_arg Formatting (#588)
juliaputko May 28, 2024
1cf41b1
Ignore build docs on the smartsim-refactor branch (#604)
juliaputko May 29, 2024
89214d6
Renaming Model to Application (#579)
juliaputko May 30, 2024
f21b6f8
Removal of Factory Methods from Experiment API (#581)
juliaputko Jun 3, 2024
883824b
Renaming Orchestrator and Database to Feature Store (#597)
juliaputko Jun 4, 2024
b64af80
Job and MPMD Job Implementation (#603)
juliaputko Jun 7, 2024
d058213
BaseJobGroup, JobGroup, ColocatedJobGroup skeleton (#599)
amandarichardsonn Jun 11, 2024
d2fd6a7
Initial MLI schemas and MessageHandler class (#607)
AlyssaCote Jun 11, 2024
edb8364
BatchSettings, LaunchSettings, Command, CommandList and LaunchCommand…
amandarichardsonn Jun 13, 2024
3c9915c
Merge branch 'develop' into mli-feature
ankona Jun 14, 2024
38081da
ML Worker Manager MVP (#608)
ankona Jun 20, 2024
ab900b8
Remove device attribute from schemas (#619)
AlyssaCote Jun 25, 2024
e9d8eca
Merge remote-tracking branch 'upstream/develop' into smartsim-refactor
MattToast Jun 26, 2024
a9ffb14
Merge branch 'develop' into mli-feature
ankona Jul 2, 2024
ee2c110
Merge branch 'develop' into mli-feature
ankona Jul 2, 2024
8a2f173
Add model metadata to request schema (#624)
AlyssaCote Jul 3, 2024
7e3e92c
Add Type Check Back to Refactor CI (#630)
MattToast Jul 8, 2024
52abd32
Enable environment variable based configuration for ML Worker Manager…
AlyssaCote Jul 10, 2024
c2164ca
Compound Entity and Ensemble (#605)
MattToast Jul 10, 2024
eace71e
FLI-based Worker Manager (#622)
al-rigazzi Jul 15, 2024
5fac3e2
Add ability to specify hardware policies on dragon run requests (#631)
ankona Jul 17, 2024
0030a4a
Revert "Add ability to specify hardware policies on dragon run reques…
ankona Jul 17, 2024
b6c2f2b
Merge latest develop into mli-feature (#640)
ankona Jul 18, 2024
272a1d7
Improve error handling in worker manager (#629)
AlyssaCote Jul 18, 2024
7169f1c
Schema performance improvements (#632)
AlyssaCote Jul 18, 2024
84101b3
New develop merger (#645)
al-rigazzi Jul 19, 2024
9b511f1
Command Generation (#615)
juliaputko Jul 24, 2024
b965c3d
Link up SmartSim Refactor CI (#646)
MattToast Jul 24, 2024
df6ed98
Merge remote-tracking branch 'upstream/develop' into smartsim-refactor
MattToast Jul 25, 2024
e225c07
merging develop
ankona Jul 26, 2024
bd21957
Dispatch Jobs to Launchers based on Settings (#620)
MattToast Jul 27, 2024
a2c1251
Fix test overwriting a test configuration file (#648)
MattToast Jul 31, 2024
9f482b1
Merge branch 'develop' into mli-feature
ankona Jul 31, 2024
263e3c7
Fix dragon installation issues (#652)
ankona Aug 2, 2024
0453b8b
Add FeatureStore descriptor to tensor & model keys (#633)
ankona Aug 7, 2024
99ed41c
Merge branch 'develop' into mli-feature
ankona Aug 8, 2024
74d6e78
Use `torch.from_numpy` instead of `torch.tensor` to reduce a copy (#661)
AlyssaCote Aug 8, 2024
77eaf4d
Simple Status Checking (#655)
MattToast Aug 9, 2024
e44e099
Rename Status Enums (#663)
MattToast Aug 9, 2024
52cd8ec
Refactor of Generation class and inject path into Launch process (#650)
amandarichardsonn Aug 13, 2024
391784c
MLI environment variables updated using new naming convention (#665)
AlyssaCote Aug 14, 2024
ddde9c5
Move `smartsim.settings.dispatch` module under `smartsim._core` (#659)
juliaputko Aug 14, 2024
347458f
merge develop into smartsim-refactor
juliaputko Aug 14, 2024
bf348a0
remove where install, build_py and Distribution were used in setup.py
juliaputko Aug 15, 2024
7ddb12f
Merge develop into smartsim-refactor (#668)
juliaputko Aug 15, 2024
f7ef49b
Remove pydantic dependency from MLI code (#667)
AlyssaCote Aug 20, 2024
0569c05
Move `ShellLauncher` (#658)
juliaputko Aug 21, 2024
a0cc447
Removed Slurm specific methods from LaunchArguments (#662)
juliaputko Aug 22, 2024
f6928e5
Redis removal (#656)
juliaputko Aug 22, 2024
ef034d5
Enable specification of target hostname for a dragon task (#660)
ankona Aug 26, 2024
6d5518b
fix init reordering bug (#675)
ankona Aug 26, 2024
f9a86d9
Unit tests for ShellLauncher & additional shell launch context (#671)
amandarichardsonn Aug 28, 2024
5d85995
Queue-based Worker Manager (#647)
al-rigazzi Aug 28, 2024
ec7677a
Remove dead attributes in application (#673)
juliaputko Aug 29, 2024
128598b
Refactor `exception_handler` to avoid unnecessary building and serial…
AlyssaCote Aug 29, 2024
5611a16
Poll Based Waiting for Job Completion (#670)
MattToast Aug 29, 2024
8aa990c
SmartSim environment variables updated using new naming convention (#…
AlyssaCote Aug 29, 2024
4a4f43a
Refine SmartSimEntity Interface (#688)
amandarichardsonn Aug 29, 2024
c1faadd
Experiment Can Stop Launched Jobs (#677)
MattToast Aug 30, 2024
f6d55d8
MLI file names conform to snake case (#689)
AlyssaCote Aug 30, 2024
0175b6b
Ensemble Refactor (#686)
juliaputko Sep 5, 2024
c169878
Remove Problematic Type Union (#694)
MattToast Sep 5, 2024
a3795e7
Add event broadcasting capability (#672)
ankona Sep 10, 2024
28bfd8f
Update MLI docstrings part 1 (#692)
AlyssaCote Sep 10, 2024
61ab71d
Update MLI docstrings part 2 (#699)
AlyssaCote Sep 10, 2024
3e9bffa
Parametrize dragon install (#703)
ankona Sep 12, 2024
b4798da
Merge branch 'develop' into mli-feature
ankona Sep 12, 2024
b0b1db6
Ensure forks build latest public dragon in CI actions (#706)
ankona Sep 18, 2024
f49fde3
Camel Case to Snake Case (#711)
amandarichardsonn Sep 18, 2024
0ebd5ab
Clean up error handling in MLI (#698)
AlyssaCote Sep 19, 2024
3bfdff9
Removals of EntityList, EntitySequence, JobManager, and Controller (#…
juliaputko Sep 19, 2024
d43f7c7
MLI helper methods (#709)
AlyssaCote Sep 19, 2024
5ec287c
Bug fix (#715)
AlyssaCote Sep 20, 2024
f748789
Short task to update BatchSettings (#704)
amandarichardsonn Sep 20, 2024
4faf95c
Ensemble Move short-task (#714)
amandarichardsonn Sep 23, 2024
4d9ab27
Replace TaggedFilesHierarchy with os.walk and implement configure_dir…
amandarichardsonn Sep 27, 2024
dbf7b72
Allow for sequences into exp.start(), and unpack iterables (#712)
juliaputko Oct 4, 2024
2cbd3be
Runtime value checking of public API methods (#707)
juliaputko Oct 7, 2024
ca01cb1
Add integration of dragon-based event broadcasting (#710)
ankona Oct 10, 2024
c433db9
Merge branch 'develop' into mli-feature
ankona Oct 15, 2024
5c65cc3
post-merge cleanup
ankona Oct 16, 2024
8fe3594
Remove deprecated onnx cli param
ankona Oct 16, 2024
96678ba
Fixes for updated pydantic
ankona Oct 16, 2024
e1849be
remove use of deprecated Dragon policy property
ankona Oct 16, 2024
6e49f97
skip test hanging on build agent.
ankona Oct 16, 2024
a39246e
Application.files refactor (#732)
amandarichardsonn Oct 16, 2024
e33646b
Merge branch 'ssr-merge' into mli-merge
ankona Oct 17, 2024
fd63209
post-merge fixes & move tests/dragon
ankona Oct 17, 2024
6f0beb7
Merge branch 'develop' into mli-merge
ankona Oct 17, 2024
e296acf
skip ml test if no tf/torch found
ankona Oct 17, 2024
4ffc569
if at first you dont succeed
ankona Oct 17, 2024
243fcc5
Update dragon default version to 0.10, add appropriate tests
ankona Oct 18, 2024
bfd1459
fix incorrect number of arguments bug
ankona Oct 18, 2024
01d2eea
revert removal of dbcli option
ankona Oct 18, 2024
59f908e
remove smart build --keydb argument and test
ankona Oct 18, 2024
86a23d9
Replace comment w/TODO. Appropriate ticket created
ankona Oct 18, 2024
bfc36e3
merge core-refactor into v1.0
ankona Oct 21, 2024
30c7d36
Merge branch 'smartsim-refactor' into v1.0-merge-core
ankona Oct 21, 2024
0eded7d
fixes for tests breaking on agent due to paths
ankona Oct 21, 2024
f9fc777
remove merge carry-over file that was moved
ankona Oct 21, 2024
31529d4
remove obsolete comment
ankona Oct 24, 2024
b967ce3
merge cleanup, enable mypy, remove @deprecated methods
ankona Oct 24, 2024
9f3a68f
more merge cleanup
ankona Oct 24, 2024
629baa5
remove refs to `installed_redisai_backends`
ankona Oct 24, 2024
2 changes: 2 additions & 0 deletions .github/workflows/build_docs.yml
@@ -32,6 +32,8 @@ on:
push:
branches:
- develop
branches-ignore:
- smartsim-refactor

jobs:
build_docs:
2 changes: 2 additions & 0 deletions .github/workflows/changelog.yml
@@ -30,6 +30,8 @@ name: enforce_changelog

on:
pull_request:
branches-ignore:
- smartsim-refactor
push:
branches:
- develop
21 changes: 15 additions & 6 deletions .github/workflows/run_tests.yml
@@ -69,6 +69,14 @@ jobs:
with:
python-version: ${{ matrix.py_v }}

- name: Check Test Files are Marked
run: |
diff <(find tests -path tests/_legacy -prune -o -type f -name 'test_*.py' -print \
| xargs grep -l 'pytestmark' \
| sort) \
<(find tests -path tests/_legacy -prune -o -type f -name 'test_*.py' -print \
| sort)

- name: Install build-essentials for Ubuntu
if: contains( matrix.os, 'ubuntu' )
run: |
@@ -127,12 +135,13 @@ jobs:
LLP=$(cat $SP | grep LD_LIBRARY_PATH | awk '{split($0, array, "="); print array[2]}')
echo "LD_LIBRARY_PATH=$LLP:$LD_LIBRARY_PATH" >> $GITHUB_ENV

- name: Run mypy
run: |
make check-mypy
# - name: Run mypy
# run: |
# make check-mypy

- name: Run Pylint
run: make check-lint
# TODO: Re-enable static analysis once API is firmed up
# - name: Run Pylint
# run: make check-lint

# Run isort/black style check
- name: Run isort
@@ -163,7 +172,7 @@ jobs:
run: |
echo "SMARTSIM_LOG_LEVEL=debug" >> $GITHUB_ENV
py.test -s --import-mode=importlib -o log_cli=true --cov=$(smart site) --cov-report=xml --cov-config=./tests/test_configs/cov/local_cov.cfg --ignore=tests/full_wlm/ -m ${{ matrix.subset }} ./tests

# Upload artifacts on failure, ignoring binary files
- name: Upload Artifact
if: failure()
2 changes: 1 addition & 1 deletion .pylintrc
@@ -167,7 +167,7 @@ max-module-lines=1000

# Allow the body of a class to be on the same line as the declaration if body
# contains single statement.
single-line-class-stmt=no
single-line-class-stmt=yes

# Allow the body of an if to be on the same line as the test if there is no
# else.
10 changes: 5 additions & 5 deletions Makefile
@@ -164,22 +164,22 @@ tutorials-prod:
# help: test - Run all tests
.PHONY: test
test:
@python -m pytest --ignore=tests/full_wlm/ --ignore=tests/dragon
@python -m pytest --ignore=tests/full_wlm/ --ignore=tests/dragon_wlm

# help: test-verbose - Run all tests verbosely
.PHONY: test-verbose
test-verbose:
@python -m pytest -vv --ignore=tests/full_wlm/ --ignore=tests/dragon
@python -m pytest -vv --ignore=tests/full_wlm/ --ignore=tests/dragon_wlm

# help: test-debug - Run all tests with debug output
.PHONY: test-debug
test-debug:
@SMARTSIM_LOG_LEVEL=developer python -m pytest -s -o log_cli=true -vv --ignore=tests/full_wlm/ --ignore=tests/dragon
@SMARTSIM_LOG_LEVEL=developer python -m pytest -s -o log_cli=true -vv --ignore=tests/full_wlm/ --ignore=tests/dragon_wlm

# help: test-cov - Run all tests with coverage
.PHONY: test-cov
test-cov:
@python -m pytest -vv --cov=./smartsim --cov-config=${COV_FILE} --ignore=tests/full_wlm/ --ignore=tests/dragon
@python -m pytest -vv --cov=./smartsim --cov-config=${COV_FILE} --ignore=tests/full_wlm/ --ignore=tests/dragon_wlm


# help: test-full - Run all WLM tests with Python coverage (full test suite)
@@ -196,4 +196,4 @@ test-wlm:
# help: test-dragon - Run dragon-specific tests
.PHONY: test-dragon
test-dragon:
@dragon pytest tests/dragon
@dragon pytest tests/dragon_wlm