Update README and docs to include JAX and Pytorch/XLA
PiperOrigin-RevId: 715850458
Matt-Hurd authored and copybara-github committed Jan 15, 2025
1 parent e525a14 commit 6f68129
Showing 2 changed files with 57 additions and 30 deletions.
59 changes: 45 additions & 14 deletions README.md
# TensorBoard Profiler Plugin
The profiler includes a suite of tools for [JAX](https://jax.readthedocs.io/), [TensorFlow](https://www.tensorflow.org/), and [PyTorch/XLA](https://github.com/pytorch/xla). These tools help you understand, debug, and optimize programs to run on CPUs, GPUs, and TPUs.

The profiler plugin offers a number of tools to analyze and visualize the
performance of your model across multiple devices. The tools include:

* **Overview**: A high-level overview of the performance of your model. This
is an aggregated overview for your host and all devices. It includes:
* Performance summary and breakdown of step times.
* A graph of individual step times.
* A table of the top 10 most expensive operations.
* **Trace Viewer**: Displays a timeline of the execution of your model that shows:
* The duration of each op.
* Which part of the system (host or device) executed an op.
* The communication between devices.
* **Memory Profile Viewer**: Monitors the memory usage of your model.
* **Graph Viewer**: A visualization of the graph structure of HLOs of your model.
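As an illustration of what feeds these tools, the sketch below captures a JAX trace that the plugin can then display. It is not part of this repository; it assumes JAX is installed and uses an arbitrary `/tmp/profile-logs` log directory.

```python
# Illustrative only: capture a trace for the tools listed above.
# Assumes JAX is installed; "/tmp/profile-logs" is an arbitrary log directory.
import jax
import jax.numpy as jnp

jax.profiler.start_trace("/tmp/profile-logs")  # begin collecting events
x = jnp.ones((1000, 1000))
y = jnp.dot(x, x).block_until_ready()          # force execution so it is traced
jax.profiler.stop_trace()                      # write the trace under the log dir
```

Pointing TensorBoard's `--logdir` at the same directory makes the capture show up under the Profile tab.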

## Demo
First time user? Come and check out this [Colab Demo](https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras).

## Prerequisites
* TensorFlow >= 2.18.0
* TensorBoard >= 2.18.0
* tensorboard-plugin-profile >= 2.18.0

Note: The TensorBoard Profiler Plugin requires access to the Internet to load the [Google Chart library](https://developers.google.com/chart/interactive/docs/basic_load_libs#basic-library-loading).
Some charts and tables may be missing if you run TensorBoard entirely offline on
your local machine, behind a corporate firewall, or in a datacenter.

To profile a **single GPU** system, the following NVIDIA software must be installed:

1. NVIDIA GPU drivers and CUDA Toolkit:
    * CUDA 12.5 requires driver version 525.60.13 or higher.
2. Ensure that CUPTI 12.5 exists on the path.

```shell
$ /sbin/ldconfig -N -v $(sed 's/:/ /g' <<< $LD_LIBRARY_PATH) | grep libcupti
```

If you don't see `libcupti.so.12.5` on the path, prepend its installation directory to the `$LD_LIBRARY_PATH` environment variable:

```shell
$ export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
```
Run the ldconfig command above again to verify that the CUPTI 12.5 library is found.

If this doesn't work, try:
```shell
…
```

To profile multi-worker GPU configurations, profile individual workers independently.
To profile cloud TPUs, you must have access to Google Cloud TPUs.

## Quick Start
The profiler plugin follows the TensorFlow versioning scheme. As a result, the
`tensorboard-plugin-profile` PyPI package can lag behind the `tbp-nightly` PyPI
package. To get the latest version of the profiler plugin, install the nightly
package.

To install the nightly version of the profiler:

```shell
$ pip uninstall tensorboard-plugin-profile
$ pip install tbp-nightly
```

Run TensorBoard, pointing `--logdir` at a directory that contains profile data (the repository's `demo` directory, for example):

```
$ tensorboard --logdir=profiler/demo
```
If you are behind a corporate firewall, you may need to include the `--bind_all`
TensorBoard flag.

Open `localhost:6006/#profile` in your browser; you should see the demo overview page.
![Overview Page](docs/images/overview_page.png)
Congratulations! You're now ready to capture a profile.
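To capture a profile of your own program, one option is TensorFlow's programmatic profiler API, sketched below. This is an illustration, not part of this repository; it assumes TensorFlow is installed and uses an arbitrary `/tmp/tf-profile` log directory.

```python
# Illustrative sketch: programmatic capture with TensorFlow's profiler API.
# "/tmp/tf-profile" is an arbitrary log directory, not part of this repository.
import tensorflow as tf

tf.profiler.experimental.start("/tmp/tf-profile")  # begin tracing
a = tf.random.normal((500, 500))
b = tf.matmul(a, a)                                # the work to be profiled
tf.profiler.experimental.stop()                    # write the trace to the log dir
```

Pass the same directory to `tensorboard --logdir` to inspect the capture.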

## Next Steps
* JAX Profiling Guide: https://jax.readthedocs.io/en/latest/profiling.html
* TensorFlow Profiling Guide: https://tensorflow.org/guide/profiler
* Cloud TPU Profiling Guide: https://cloud.google.com/tpu/docs/cloud-tpu-tools
* Colab Tutorial: https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras
* MiniGPT Example: https://docs.jaxstack.ai/en/latest/JAX_for_LLM_pretraining.html
28 changes: 12 additions & 16 deletions docs/profile_multi_gpu.md

NVIDIA® `CUDA 12.5` must be installed on your system:

* [NVIDIA® GPU drivers](https://www.nvidia.com/drivers): `CUDA 12.x` requires `525.60.13 (Linux) / 527.41 (Windows)` or higher.
* [CUDA® Toolkit 12.5](https://developer.nvidia.com/cuda-toolkit-archive)
* CUPTI ships with the CUDA Toolkit.

## Linux setup

1. Install the [CUDA® Toolkit 12.5](https://developer.nvidia.com/cuda-downloads) and select the target platform.
Here's an example that installs CUDA 12.5 on Ubuntu 20.04 with the NVIDIA driver and CUPTI included.

```shell
$ wget https://developer.download.nvidia.com/compute/cuda/12.5.0/local_installers/cuda_12.5.0_555.42.02_linux.run
$ sudo sh cuda_12.5.0_555.42.02_linux.run  # Select NVIDIA driver and CUPTI.
```

2. Ensure CUPTI exists on the path:
```shell
$ /sbin/ldconfig -N -v $(sed 's/:/ /g' <<< $LD_LIBRARY_PATH) | grep libcupti
```
You should see a string like
`libcupti.so.12.5 -> libcupti.so.12.5.75`

If you don't have CUPTI on the path, prepend its installation directory to the `$LD_LIBRARY_PATH` environment variable:

```shell
$ export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
```
Run the ldconfig command above again to verify that the `CUPTI 12.5` library is found.

3. Make a symbolic link to `libcudart.so.12.5` and `libcupti.so.12.5`.
TensorFlow 2.18 looks for those names unless you build your own pip package with [TF_CUDA_VERSION=12.5](https://raw.githubusercontent.com/tensorflow/tensorflow/6f43bd412b4aa6c2b23eeb7f4f71b557f14dc8a7/tensorflow/tools/ci_build/linux/rocm/rocm_py38_pip.sh#L25).

```shell
$ # Create the .so.12.5 names from the toolkit's .so.12 symlinks (skip if they already exist).
$ sudo ln -s /usr/local/cuda/lib64/libcudart.so.12 /usr/local/cuda/lib64/libcudart.so.12.5
$ sudo ln -s /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.12 /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.12.5
```
4. Run the model again and look for `Successfully opened dynamic library libcupti.so.12.5` in the logs. Your setup is now complete.
