From 10bdeac1f9ced57d1114b0e61878c93ac7f9a3aa Mon Sep 17 00:00:00 2001
From: Andrew Shao
Date: Thu, 26 Sep 2024 15:46:52 -0700
Subject: [PATCH] Make a user-specific db cache (#727)

Based on feedback from OLCF, users may need to set the MIOPEN cache
prior to running `smart validate`. The installation instructions for
Frontier have been updated accordingly.

[ committed by @ashao ]
[ reviewed by @MattToast ]
---
 doc/changelog.md          |  5 +++++
 .../platform/frontier.rst | 16 +++++++++++-----
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/doc/changelog.md b/doc/changelog.md
index f6e50b5a7..3e4c9d164 100644
--- a/doc/changelog.md
+++ b/doc/changelog.md
@@ -15,6 +15,7 @@ Released on 25 September, 2024
 
 Description
 
+- Add instructions for Frontier to set the MIOPEN cache
 - Refine Frontier documentation for proper use of miniforge3
 - Refactor to the RedisAI build to allow more flexibility in versions
   and sources of ML backends
@@ -40,6 +41,10 @@ Description
 
 Detailed Notes
 
+- On Frontier, the MIOPEN cache may need to be set prior to using
+  RedisAI in the ``smart validate``. The instructions for Frontier
+  have been updated accordingly.
+  ([SmartSim-PR727](https://github.com/CrayLabs/SmartSim/pull/727))
 - On Frontier, the recommended way to activate conda environments is
   to go through source activate. This also means that ``conda init``
   is not needed. The instructions for Frontier have been updated to
diff --git a/doc/installation_instructions/platform/frontier.rst b/doc/installation_instructions/platform/frontier.rst
index 149df58da..9b05061fe 100644
--- a/doc/installation_instructions/platform/frontier.rst
+++ b/doc/installation_instructions/platform/frontier.rst
@@ -69,6 +69,13 @@ these instructions, being sure to set the following variables
 
 .. code:: bash
 
+    # Optimizations for inference
+    export MIOPEN_USER_DB_PATH="/tmp/${USER}/my-miopen-cache"
+    export MIOPEN_CUSTOM_CACHE_DIR=$MIOPEN_USER_DB_PATH
+    rm -rf $MIOPEN_USER_DB_PATH
+    mkdir -p $MIOPEN_USER_DB_PATH
+
+    # Run the install validation utility
     smart validate --device gpu
 
 The following output indicates a successful install:
@@ -96,11 +103,10 @@ build, and some variables should be set to optimize performance:
     source activate smartsim
 
     # Optimizations for inference
-    export SCRATCH=/lustre/orion/$PROJECT_NAME/scratch/$USER/
-    export MIOPEN_USER_DB_PATH=/tmp/miopendb/
-    export MIOPEN_SYSTEM_DB_PATH=$MIOPEN_USER_DB_PATH
-    mkdir -p $MIOPEN_USER_DB_PATH
-    export MIOPEN_DISABLE_CACHE=1
+    export MIOPEN_USER_DB_PATH="/tmp/${USER}/my-miopen-cache"
+    export MIOPEN_CUSTOM_CACHE_DIR=${MIOPEN_USER_DB_PATH}
+    rm -rf ${MIOPEN_USER_DB_PATH}
+    mkdir -p ${MIOPEN_USER_DB_PATH}
 
 Binding DBs to Slingshot
 ------------------------