diff --git a/README.md b/README.md
index 878e68679..3b1adcce4 100644
--- a/README.md
+++ b/README.md
@@ -1,34 +1,49 @@
-# TensorFlow Profiler
-The profiler includes a suite of tools. These tools help you understand, debug and optimize TensorFlow programs to run on CPUs, GPUs and TPUs.
+# TensorBoard Profiler Plugin
+The profiler includes a suite of tools for [JAX](https://jax.readthedocs.io/), [TensorFlow](https://www.tensorflow.org/), and [PyTorch/XLA](https://github.com/pytorch/xla). These tools help you understand, debug and optimize programs to run on CPUs, GPUs and TPUs.
+
+The profiler plugin offers a number of tools to analyze and visualize the
+performance of your model across multiple devices. Some of the tools include:
+
+* **Overview**: A high-level overview of the performance of your model. This
+  is an aggregated overview for your host and all devices. It includes:
+  * Performance summary and breakdown of step times.
+  * A graph of individual step times.
+  * A table of the top 10 most expensive operations.
+* **Trace Viewer**: Displays a timeline of the execution of your model that shows:
+  * The duration of each op.
+  * Which part of the system (host or device) executed an op.
+  * The communication between devices.
+* **Memory Profile Viewer**: Monitors the memory usage of your model.
+* **Graph Viewer**: A visualization of the HLO graph structure of your model.
 
 ## Demo
 First time user? Come and check out this [Colab Demo](https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras).
 
 ## Prerequisites
-* TensorFlow >= 2.2.0
-* TensorBoard >= 2.2.0
-* tensorboard-plugin-profile >= 2.2.0
+* TensorFlow >= 2.18.0
+* TensorBoard >= 2.18.0
+* tensorboard-plugin-profile >= 2.18.0
 
-Note: The TensorFlow Profiler requires access to the Internet to load the [Google Chart library](https://developers.google.com/chart/interactive/docs/basic_load_libs#basic-library-loading).
+Note: The TensorBoard Profiler Plugin requires access to the Internet to load the [Google Chart library](https://developers.google.com/chart/interactive/docs/basic_load_libs#basic-library-loading).
 Some charts and tables may be missing if you run TensorBoard entirely offline on your local machine, behind a corporate firewall, or in a datacenter.
 
 To profile on a **single GPU** system, the following NVIDIA software must be installed on your system:
 
 1. NVIDIA GPU drivers and CUDA Toolkit:
-    * CUDA 10.1 requires 418.x and higher.
+    * CUDA 12.5 requires 525.60.13 and higher.
-2. Ensure that CUPTI 10.1 exists on the path.
+2. Ensure that CUPTI 12.5 exists on the path.
 
    ```shell
   $ /sbin/ldconfig -N -v $(sed 's/:/ /g' <<< $LD_LIBRARY_PATH) | grep libcupti
   ```
 
-   If you don't see `libcupti.so.10.1` on the path, prepend its installation directory to the $LD_LIBRARY_PATH environmental variable:
+   If you don't see `libcupti.so.12.5` on the path, prepend its installation directory to the $LD_LIBRARY_PATH environment variable:
 
   ```shell
   $ export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
   ```
-   Run the ldconfig command above again to verify that the CUPTI 10.1 library is found.
+   Run the ldconfig command above again to verify that the CUPTI 12.5 library is found.
 
   If this doesn't work, try:
   ```shell
@@ -42,17 +57,33 @@ To profile multi-worker GPU configurations, profile individual workers independe
 To profile cloud TPUs, you must have access to Google Cloud TPUs.
 
 ## Quick Start
-Install nightly version of profiler by downloading and running the `install_and_run.py` script from this directory.
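+Install the stable release of the plugin from PyPI (the `tensorboard-plugin-profile`
+package listed in the prerequisites above):
+
+```
+$ pip install tensorboard-plugin-profile
+```
+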
+The profiler plugin follows the TensorFlow versioning scheme. As a result, the
+`tensorboard-plugin-profile` PyPI package can be behind the `tbp-nightly` PyPI
+package. To get the latest version of the profiler plugin, you can install the
+nightly package.
+
+To install the nightly version of the profiler:
+
 ```
-$ git clone https://github.com/tensorflow/profiler.git profiler
-$ mkdir profile_env
-$ python3 profiler/install_and_run.py --envdir=profile_env --logdir=profiler/demo
+$ pip uninstall tensorboard-plugin-profile
+$ pip install tbp-nightly
 ```
+
+Run TensorBoard, pointing `--logdir` at your profile data (the `demo` directory
+from a checkout of this repository is used here):
+
+```
+$ tensorboard --logdir=profiler/demo
+```
+If you are behind a corporate firewall, you may need to include the `--bind_all`
+tensorboard flag.
+
 Go to `localhost:6006/#profile` of your browser, you should now see the demo overview page show up.
 
 ![Overview Page](docs/images/overview_page.png)
 
 Congratulations! You're now ready to capture a profile.
 
 ## Next Steps
-* GPU Profiling Guide: https://tensorflow.org/guide/profiler
+* JAX Profiling Guide: https://jax.readthedocs.io/en/latest/profiling.html
+* TensorFlow Profiling Guide: https://tensorflow.org/guide/profiler
 * Cloud TPU Profiling Guide: https://cloud.google.com/tpu/docs/cloud-tpu-tools
 * Colab Tutorial: https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras
+* MiniGPT Example: https://docs.jaxstack.ai/en/latest/JAX_for_LLM_pretraining.html
diff --git a/docs/profile_multi_gpu.md b/docs/profile_multi_gpu.md
index 8ff14d258..420fa0f5f 100644
--- a/docs/profile_multi_gpu.md
+++ b/docs/profile_multi_gpu.md
@@ -4,18 +4,18 @@ NVIDIA® `CUDA 10.2` must be installed on your system:
-* [NVIDIA® GPU drivers](https://www.nvidia.com/drivers) —`CUDA 10.2` requires `440.33 (Linux) / 441.22 (Windows)` and higher.
-* [CUDA® Toolkit 10.2](https://developer.nvidia.com/cuda-toolkit-archive)
+* [NVIDIA® GPU drivers](https://www.nvidia.com/drivers) —`CUDA 12.x` requires `525.60.13 (Linux) / 527.41 (Windows)` and higher.
+* [CUDA® Toolkit 12.5](https://developer.nvidia.com/cuda-toolkit-archive)
 * CUPTI ships with the CUDA Toolkit.
 
 ## Linux setup
 
-1. Install the [CUDA® Toolkit 10.2](https://developer.nvidia.com/cuda-downloads), select the target platform.
-    Here's the an example to install cuda 10.2 on Ubuntu 16.04 with nvidia driver and cupti included.
+1. Install the [CUDA® Toolkit 12.5](https://developer.nvidia.com/cuda-downloads) and select the target platform.
+    Here's an example of installing CUDA 12.5 on Ubuntu 20.04 with the NVIDIA driver and CUPTI included.
 
    ```shell
-    $ wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
-    $ sudo sh cuda_10.2.89_440.33.01_linux.run # Select NVIDIA driver and CUPTI.
+    $ wget https://developer.download.nvidia.com/compute/cuda/12.5.0/local_installers/cuda_12.5.0_555.42.02_linux.run
+    $ sudo sh cuda_12.5.0_555.42.02_linux.run # Select NVIDIA driver and CUPTI.
   ```
 
 2. Ensure CUPTI exists on the path:
@@ -23,24 +23,20 @@ NVIDIA® `CUDA 10.2` must be installed on your system:
    $ /sbin/ldconfig -N -v $(sed 's/:/ /g' <<< $LD_LIBRARY_PATH) | grep libcupti
   ```
   You should see a string like
-   `libcupti.so.10.2 -> libcupti.so.10.2.75`
+   `libcupti.so.12.5 -> libcupti.so.12.5.75`
 
   If you don't have CUPTI on the path, prepend its installation directory to the $LD_LIBRARY_PATH environmental variable:
 
   ```shell
   $ export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
   ```
-   Run the ldconfig command above again to verify that the `CUPTI 10.2` library is found.
+   Run the ldconfig command above again to verify that the `CUPTI 12.5` library is found.
 
-3. Make symbolic link to `libcudart.so.10.1` and `libcupti.so.10.1`.
-    TensorFlow 2.2 looks for those strings unless you build your own pip package with [TF_CUDA_VERSION=10.2](https://raw.githubusercontent.com/tensorflow/tensorflow/34bec1ebd4c7a2bc2cea5ea0491acf7615f8875e/tensorflow/tools/ci_build/release/ubuntu_16/gpu_py36_full/pip.sh).
+3. Make symbolic links to `libcudart.so.12.5` and `libcupti.so.12.5` if your installed CUDA 12.x toolkit version is not exactly 12.5.
+    TensorFlow 2.18 looks for those strings unless you build your own pip package with [TF_CUDA_VERSION=12.5](https://raw.githubusercontent.com/tensorflow/tensorflow/6f43bd412b4aa6c2b23eeb7f4f71b557f14dc8a7/tensorflow/tools/ci_build/linux/rocm/rocm_py38_pip.sh#L25). Replace `12.x` in the commands below with the version you installed.
 
   ```shell
-    $ sudo ln -s /usr/local/cuda/lib64/libcudart.so.10.2 /usr/local/cuda/lib64/libcudart.so.10.1
-    $ sudo ln -s /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.10.2 /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.10.1
+    $ sudo ln -s /usr/local/cuda/lib64/libcudart.so.12.x /usr/local/cuda/lib64/libcudart.so.12.5
+    $ sudo ln -s /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.12.x /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.12.5
   ```
 
-4. Run the model again and look for `Successfully opened dynamic library libcupti.so.10.1` in the logs. Your setup is now complete.
-
-## Known issues
-* Multi-GPU Profiling does not work with `CUDA 10.1`. While `CUDA 10.2` is not officially supported by TF, profiling on `CUDA 10.2` is known to work on some configurations.
-* Faking the symbolic links IS NOT a suggested way of using CUDA per NVIDIA's standard (the suggested way is to recompile TF with `CUDA 10.2` toolchain). But that gives a simple and easy way to try whether things work without spending a lot of time figuring out the compilation steps.
+4. Run the model again and look for `Successfully opened dynamic library libcupti.so.12.5` in the logs. Your setup is now complete.
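+
+To confirm that profiling works end to end, you can capture a short programmatic
+profile while training. The snippet below is a minimal sketch, assuming
+TensorFlow >= 2.18 with GPU support; the model, data, and `logs/` directory are
+placeholders for your own training setup.
+
+```python
+import tensorflow as tf
+
+# Tiny stand-in model and synthetic data, just enough to run a few training steps.
+model = tf.keras.Sequential([
+    tf.keras.layers.Dense(64, activation="relu"),
+    tf.keras.layers.Dense(1),
+])
+model.compile(optimizer="adam", loss="mse")
+x = tf.random.normal((1024, 32))
+y = tf.random.normal((1024, 1))
+
+# Profile batches 2-4; trace files are written under logs/ for the
+# TensorBoard Profile tab to display.
+tb_callback = tf.keras.callbacks.TensorBoard(log_dir="logs", profile_batch=(2, 4))
+model.fit(x, y, epochs=1, batch_size=64, callbacks=[tb_callback])
+```
+
+Then run `tensorboard --logdir=logs` and open the Profile tab; the
+`Successfully opened dynamic library libcupti.so.12.5` message should appear in
+the training logs if CUPTI was found.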