Update README and docs to include JAX and Pytorch/XLA
PiperOrigin-RevId: 715850458
Matt-Hurd authored and copybara-github committed Jan 15, 2025
1 parent e525a14 commit 6f68129
Showing 2 changed files with 57 additions and 30 deletions.
59 changes: 45 additions & 14 deletions README.md
# TensorBoard Profiler Plugin
The profiler includes a suite of tools for [JAX](https://jax.readthedocs.io/), [TensorFlow](https://www.tensorflow.org/), and [PyTorch/XLA](https://github.com/pytorch/xla). These tools help you understand, debug, and optimize programs to run on CPUs, GPUs, and TPUs.

The profiler plugin offers a number of tools to analyze and visualize the
performance of your model across multiple devices. The tools include:

* **Overview**: A high-level overview of the performance of your model. This
is an aggregated overview for your host and all devices. It includes:
* Performance summary and breakdown of step times.
* A graph of individual step times.
* A table of the top 10 most expensive operations.
* **Trace Viewer**: Displays a timeline of the execution of your model that shows:
* The duration of each op.
* Which part of the system (host or device) executed an op.
* The communication between devices.
* **Memory Profile Viewer**: Monitors the memory usage of your model.
* **Graph Viewer**: A visualization of the graph structure of HLOs of your model.
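As an illustration of what feeds these tools, the sketch below captures a JAX trace that the plugin can then display. It is not part of this repository; it assumes JAX is installed and uses an arbitrary `/tmp/profile-logs` log directory.

```python
# Illustrative only: capture a trace for the tools listed above.
# Assumes JAX is installed; "/tmp/profile-logs" is an arbitrary log directory.
import jax
import jax.numpy as jnp

jax.profiler.start_trace("/tmp/profile-logs")  # begin collecting events
x = jnp.ones((1000, 1000))
y = jnp.dot(x, x).block_until_ready()          # force execution so it is traced
jax.profiler.stop_trace()                      # write the trace under the log dir
```

Pointing TensorBoard's `--logdir` at the same directory makes the capture show up under the Profile tab.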

## Demo
First time user? Come and check out this [Colab Demo](https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras).

## Prerequisites
* TensorFlow >= 2.18.0
* TensorBoard >= 2.18.0
* tensorboard-plugin-profile >= 2.18.0

Note: The TensorBoard Profiler Plugin requires access to the Internet to load the [Google Chart library](https://developers.google.com/chart/interactive/docs/basic_load_libs#basic-library-loading).
Some charts and tables may be missing if you run TensorBoard entirely offline on
your local machine, behind a corporate firewall, or in a datacenter.

To profile a **single GPU** system, the following NVIDIA software must be installed:

1. NVIDIA GPU drivers and CUDA Toolkit:
    * CUDA 12.5 requires driver version 525.60.13 or higher.
2. Ensure that CUPTI 12.5 exists on the path.

```shell
$ /sbin/ldconfig -N -v $(sed 's/:/ /g' <<< $LD_LIBRARY_PATH) | grep libcupti
```

If you don't see `libcupti.so.12.5` on the path, prepend its installation directory to the `$LD_LIBRARY_PATH` environment variable:

```shell
$ export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
```
Run the ldconfig command above again to verify that the CUPTI 12.5 library is found.

If this doesn't work, try:
```shell
…
```

To profile multi-worker GPU configurations, profile individual workers independently.
To profile cloud TPUs, you must have access to Google Cloud TPUs.

## Quick Start
The profiler plugin follows the TensorFlow versioning scheme. As a result, the
`tensorboard-plugin-profile` PyPI package can lag behind the `tbp-nightly` PyPI
package. To get the latest version of the profiler plugin, install the nightly
package.

To install the nightly version of the profiler:

```shell
$ pip uninstall tensorboard-plugin-profile
$ pip install tbp-nightly
```

Run TensorBoard, pointing `--logdir` at a directory that contains profile data (the repository's `demo` directory, for example):

```
$ tensorboard --logdir=profiler/demo
```
If you are behind a corporate firewall, you may need to include the `--bind_all`
TensorBoard flag.

Open `localhost:6006/#profile` in your browser; you should see the demo overview page.
![Overview Page](docs/images/overview_page.png)
Congratulations! You're now ready to capture a profile.
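To capture a profile of your own program, one option is TensorFlow's programmatic profiler API, sketched below. This is an illustration, not part of this repository; it assumes TensorFlow is installed and uses an arbitrary `/tmp/tf-profile` log directory.

```python
# Illustrative sketch: programmatic capture with TensorFlow's profiler API.
# "/tmp/tf-profile" is an arbitrary log directory, not part of this repository.
import tensorflow as tf

tf.profiler.experimental.start("/tmp/tf-profile")  # begin tracing
a = tf.random.normal((500, 500))
b = tf.matmul(a, a)                                # the work to be profiled
tf.profiler.experimental.stop()                    # write the trace to the log dir
```

Pass the same directory to `tensorboard --logdir` to inspect the capture.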

## Next Steps
* JAX Profiling Guide: https://jax.readthedocs.io/en/latest/profiling.html
* TensorFlow Profiling Guide: https://tensorflow.org/guide/profiler
* Cloud TPU Profiling Guide: https://cloud.google.com/tpu/docs/cloud-tpu-tools
* Colab Tutorial: https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras
* MiniGPT Example: https://docs.jaxstack.ai/en/latest/JAX_for_LLM_pretraining.html
28 changes: 12 additions & 16 deletions docs/profile_multi_gpu.md

NVIDIA® `CUDA 12.5` must be installed on your system:

* [NVIDIA® GPU drivers](https://www.nvidia.com/drivers): `CUDA 12.x` requires `525.60.13 (Linux) / 527.41 (Windows)` or higher.
* [CUDA® Toolkit 12.5](https://developer.nvidia.com/cuda-toolkit-archive)
* CUPTI ships with the CUDA Toolkit.

## Linux setup

1. Install the [CUDA® Toolkit 12.5](https://developer.nvidia.com/cuda-downloads) and select the target platform.
Here's an example that installs CUDA 12.5 on Ubuntu 20.04 with the NVIDIA driver and CUPTI included.

```shell
$ wget https://developer.download.nvidia.com/compute/cuda/12.5.0/local_installers/cuda_12.5.0_555.42.02_linux.run
$ sudo sh cuda_12.5.0_555.42.02_linux.run  # Select NVIDIA driver and CUPTI.
```

2. Ensure CUPTI exists on the path:
```shell
$ /sbin/ldconfig -N -v $(sed 's/:/ /g' <<< $LD_LIBRARY_PATH) | grep libcupti
```
You should see a string like
`libcupti.so.12.5 -> libcupti.so.12.5.75`

If you don't have CUPTI on the path, prepend its installation directory to the `$LD_LIBRARY_PATH` environment variable:

```shell
$ export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
```
Run the ldconfig command above again to verify that the `CUPTI 12.5` library is found.

3. Make a symbolic link to `libcudart.so.12.5` and `libcupti.so.12.5`.
TensorFlow 2.18 looks for those names unless you build your own pip package with [TF_CUDA_VERSION=12.5](https://raw.githubusercontent.com/tensorflow/tensorflow/6f43bd412b4aa6c2b23eeb7f4f71b557f14dc8a7/tensorflow/tools/ci_build/linux/rocm/rocm_py38_pip.sh#L25).

```shell
$ # Create the .so.12.5 names from the toolkit's .so.12 symlinks (skip if they already exist).
$ sudo ln -s /usr/local/cuda/lib64/libcudart.so.12 /usr/local/cuda/lib64/libcudart.so.12.5
$ sudo ln -s /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.12 /usr/local/cuda/extras/CUPTI/lib64/libcupti.so.12.5
```
4. Run the model again and look for `Successfully opened dynamic library libcupti.so.12.5` in the logs. Your setup is now complete.
