
Update install notes #749

Open
wants to merge 6 commits into develop from transfer_redis_artifacts
Conversation

@ashao (Member) commented Oct 16, 2024

Some of the instructions for installing SmartSim were stale. Additionally, a user had asked how SmartSim could be installed on a machine that is airgapped from the internet. The install notes have been updated and slightly reorganized to aid discovery.


codecov bot commented Oct 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.86%. Comparing base (d7d979e) to head (4952a13).
Report is 16 commits behind head on develop.

Additional details and impacted files


@@             Coverage Diff             @@
##           develop     #749      +/-   ##
===========================================
- Coverage    83.91%   81.86%   -2.05%     
===========================================
  Files           83       84       +1     
  Lines         6284     7075     +791     
===========================================
+ Hits          5273     5792     +519     
- Misses        1011     1283     +272     

see 48 files with indirect coverage changes

need to retrieve all of the build dependencies themselves. Some machines
have specific environment variables and/or configuration settings that need
to be set for optimal performance. The below machines have vetted
instructions, please feel free to contribute instructions for your own
Contributor:

I would make the last sentence two sentences. I would also link to how to contribute.

@ashao force-pushed the transfer_redis_artifacts branch from d6bc0a4 to 1a66edb on October 31, 2024 16:20
@ashao force-pushed the transfer_redis_artifacts branch from 1a66edb to 4952a13 on October 31, 2024 16:22
@amandarichardsonn (Contributor) left a comment:

By the bolts of Frankenstein’s lab, these installation notes are a masterpiece! This will undoubtedly electrify the installation process for everyone. Well done!

this seems to be hardcoded to `gcc` and `g++` in the Redis build so ensure that
`which gcc g++` do not point to Apple Clang.
We suggest using GCC to build Redis, RedisAI, and the ML backends. For specific
version requirements see the :ref:`Requirements <requirements>` section.
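Since the Redis build is hardcoded to `gcc`/`g++`, a quick sanity check on macOS (where `gcc` is usually an alias for Apple Clang) can save a failed build. This is a sketch, not part of the documented instructions:

```shell
# Sketch: warn if `gcc`/`g++` actually resolve to Apple Clang.
# Prints one status line per compiler checked.
check_compiler() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "$1: not found"
  elif "$1" --version 2>/dev/null | grep -qi clang; then
    echo "$1: Apple Clang detected; point CC/CXX at a GNU toolchain instead"
  else
    echo "$1: GNU toolchain"
  fi
}
check_compiler gcc
check_compiler g++
```

If Clang is detected, installing GCC via Homebrew and exporting `CC`/`CXX` accordingly is one common workaround.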
Contributor:

it does not look like there is a Requirements section, and it instead takes you to the ML Library Support/Linux section -> should this link instead point to the ML Library Support section?

for these GPUs often depends on the version of the CUDA or ROCm stack that is availble on your
machine. In _most_ cases, the versions backwards compatible. If you encounter problems, please
contact us and we can build the backend libraries for your desired version of CUDA and ROCm.
SmartSim supports using Nvidia and AMD GPUs when using RedisAI for GPU
Contributor:

SmartSim supports using Nvidia and AMD GPUs when using RedisAI for GPU inference - what are your thoughts on rephrasing it to SmartSim enables the use of Nvidia and AMD GPUs for GPU inference with RedisAI?

contact us and we can build the backend libraries for your desired version of CUDA and ROCm.
SmartSim supports using Nvidia and AMD GPUs when using RedisAI for GPU
inference. GPU support often depends on the version of the CUDA or ROCm stack
that is available on your machine. In _most_ cases, the versions of the ML
Contributor:

It looks like most is not rendering correctly in readthedocs and is showing up exactly as is - instead try *most*

inference. GPU support often depends on the version of the CUDA or ROCm stack
that is available on your machine. In _most_ cases, the versions of the ML
frameworks are backwards compatible. If you encounter problems, please contact
us at (smartsim at hpe dot com) and we can build the backend libraries for your
Contributor:

For pointing them to how to contact us, what are your thoughts on pointing them to this area of the docs?

https://www.craylabs.org/docs/contributing.html#how-to-connect

so they have multiple options?

Also! What are your thoughts on changing (smartsim at hpe dot com) to `[email protected] <mailto:[email protected]>`_ instead?

@@ -64,7 +65,7 @@ Linux

Contributor:

I notice in the Linux tabs there are additional requirements for CUDA 11 and CUDA 12 but not for ROCm or CPU - just raising this in case!

Contributor:

oh also why does ROCm 6 have N/A for two columns?

Contributor:

CPU seems a little out of place, what are your thoughts on splitting this table into two with

  1. GPU Configurations
  2. CPU Configurations

wget https://developer.download.nvidia.com/compute/cuda/12.5.0/local_installers/cuda_12.5.0_555.42.02_linux.run
sh ./cuda_12.5.0_555.42.02_linux.run --toolkit --silent --toolkitpath=$CUDA_TOOLKIT_INSTALL_PATH
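The quoted commands assume `CUDA_TOOLKIT_INSTALL_PATH` is already set; a minimal sketch, with an illustrative path that is not prescribed by the docs:

```shell
# Illustrative only: choose a writable prefix for the CUDA toolkit install.
export CUDA_TOOLKIT_INSTALL_PATH="$HOME/opt/cuda-12.5"
mkdir -p "$CUDA_TOOLKIT_INSTALL_PATH"
```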

**Step 3:** Download cuDNN:
Contributor:

add a blank line after line 46 so the sentence on line 47 starts on a new line

mkdir -p $CUDNN_INSTALL_PATH
tar -xf cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar -C $CUDNN_INSTALL_PATH --strip-components 1
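Similarly, the `tar` step assumes `CUDNN_INSTALL_PATH` exists; a sketch with an assumed path, plus the loader-path export the extracted libraries typically need:

```shell
# Illustrative only: prefix for the extracted cuDNN archive.
export CUDNN_INSTALL_PATH="$HOME/opt/cudnn-8.9.7-cuda12"
mkdir -p "$CUDNN_INSTALL_PATH"
# Make the extracted libraries visible to the dynamic loader.
export LD_LIBRARY_PATH="$CUDNN_INSTALL_PATH/lib:${LD_LIBRARY_PATH:-}"
```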

Option 1: Environment Variables
Contributor:

What are your thoughts on wrapping Option 1 and Option 2 into their own section? With an overview on both, I think it might better help the user understand what they are choosing between?



The easiest way to accomplish this assumes that you have the following
- A source machine connected to the internet with SmartSim built (referred to as Machine A).
Contributor:

The formatting of this shows up oddly in readthedocs -> add a blank line after line 16 to resolve the issue


The easiest way to accomplish this assumes that you have the following
- A source machine connected to the internet with SmartSim built (referred to as Machine A).
- A target machine not connected to the Internet
Contributor:

Do you need to note this as Machine B, or is that too much?
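The Machine A to Machine B transfer being discussed could be sketched roughly as follows; the archive name and paths are assumptions, not the documented procedure:

```shell
# On Machine A (internet-connected): bundle the already-built SmartSim
# environment for transfer. SMARTSIM_BUILD is a placeholder for wherever
# the build artifacts actually live.
SMARTSIM_BUILD="${SMARTSIM_BUILD:-$HOME/smartsim-build}"
mkdir -p "$SMARTSIM_BUILD"
tar -czf smartsim-artifacts.tar.gz -C "$SMARTSIM_BUILD" .
# Move smartsim-artifacts.tar.gz to Machine B by an approved medium
# (internal network copy, portable drive, etc.), then on Machine B:
#   mkdir -p "$SMARTSIM_BUILD" && tar -xzf smartsim-artifacts.tar.gz -C "$SMARTSIM_BUILD"
```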

Custom ML backends
------------------

The ML backends (Torch, ONNX Runtime, and Tensorflow) and their associated
Contributor:

supported based on the intended device (CPU, ROCm, CUDA-11, or CUDA-12) - I think the only device in the list is CPU right? Maybe say device or platform

3 participants