Update XLA revision #111

Merged
merged 33 commits into from
Jun 16, 2025
68 changes: 36 additions & 32 deletions .github/workflows/release.yml
@@ -4,14 +4,17 @@ on:
tags:
- "v*.*.*"

env:
USE_BAZEL_VERSION: 7.4.1

jobs:
create_draft_release:
if: github.ref_type == 'tag'
permissions:
contents: write
runs-on: ubuntu-20.04
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Create draft release
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
@@ -23,83 +26,84 @@ jobs:
linux:
name: "x86_64-linux-gnu-{cpu,tpu}"
needs: [create_draft_release]
# We intentionally build on ubuntu 20 to compile against
# We intentionally build on ubuntu 22 to compile against
# an older version of glibc
runs-on: ubuntu-20.04
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- uses: erlef/setup-beam@v1
with:
otp-version: "24"
elixir-version: "1.15.8"
# Setup the compilation environment
- uses: abhinavsingh/setup-bazel@v3
with:
version: "6.5.0"
- uses: actions/setup-python@v2
- uses: bazel-contrib/[email protected]
- uses: actions/setup-python@v5
with:
python-version: "3.9"
python-version: "3.11"
- run: python -m pip install --upgrade pip numpy
- run: |
clang_version="18"
sudo apt-get update
sudo apt-get install -y wget gnupg software-properties-common lsb-release
wget -qO- https://apt.llvm.org/llvm.sh | sudo bash -s -- $clang_version
sudo update-alternatives --install /usr/bin/clang clang /usr/bin/clang-$clang_version 200
sudo update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-$clang_version 200
# Build and upload the archives
- run: mix deps.get
- run: .github/scripts/compile_and_upload.sh ${{ github.ref_name }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
XLA_TARGET: cpu
CC: gcc-9
- run: .github/scripts/compile_and_upload.sh ${{ github.ref_name }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
XLA_TARGET: tpu
CC: gcc-9

macos:
name: "x86_64-darwin-cpu"
needs: [create_draft_release]
runs-on: macos-12
runs-on: macos-13
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- run: brew install elixir
- run: mix local.hex --force
# Setup the compilation environment
- uses: abhinavsingh/setup-bazel@v3
- uses: bazel-contrib/[email protected]
- uses: actions/setup-python@v5
with:
version: "6.5.0"
- uses: actions/setup-python@v2
with:
python-version: "3.9"
python-version: "3.11"
- run: python -m pip install --upgrade pip numpy
# Build and upload the archive
- run: mix deps.get
- run: .github/scripts/compile_and_upload.sh ${{ github.ref_name }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
XLA_TARGET: cpu
CC: gcc-9
# This runner comes with Clang 14, which does not support the -mavxvnniint8
# CLI flag. We can install newer Clang, however at some point Bazel toolchains
# invoke xcrun clang, which always uses the system version from Xcode, ignoring
# whichever version we installed ourselves. With the flag below, we make sure
# this flag is not passed in the first place.
# See https://github.com/tensorflow/tensorflow/pull/87514
BUILD_FLAGS: "--define=xnn_enable_avxvnniint8=false"

macos_arm:
name: "aarch64-darwin-cpu (cross-compiled)"
name: "aarch64-darwin-cpu"
needs: [create_draft_release]
runs-on: macos-12
runs-on: macos-14
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- run: brew install elixir
- run: mix local.hex --force
# Setup the compilation environment
- uses: abhinavsingh/setup-bazel@v3
with:
version: "6.5.0"
- uses: actions/setup-python@v2
- uses: bazel-contrib/[email protected]
- uses: actions/setup-python@v5
with:
python-version: "3.9"
python-version: "3.11"
- run: python -m pip install --upgrade pip numpy
# Build and upload the archive
- run: mix deps.get
- run: .github/scripts/compile_and_upload.sh ${{ github.ref_name }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
XLA_TARGET: cpu
XLA_TARGET_PLATFORM: "aarch64-darwin"
# Explicitly cross-compile for arm64
BUILD_FLAGS: "--config=macos_arm64"
CC: gcc-9
2 changes: 2 additions & 0 deletions .tool-versions
@@ -0,0 +1,2 @@
bazel 7.4.1
python 3.11.11
2 changes: 1 addition & 1 deletion Makefile
@@ -9,7 +9,7 @@
BUILD_MODE ?= opt # can also be dbg
OPENXLA_GIT_REPO ?= https://github.com/openxla/xla.git

OPENXLA_GIT_REV ?= fd58925adee147d38c25a085354e15427a12d00a
OPENXLA_GIT_REV ?= 870d90fd098c480fb8a426126bd02047adb2bc20

# Private configuration
BAZEL_FLAGS = --define "framework_shared_object=false" -c $(BUILD_MODE)
24 changes: 12 additions & 12 deletions README.md
@@ -51,8 +51,8 @@ In addition to building in a local environment, you can build the ROCm binary us
the Docker-based scripts in [`builds/`](https://github.com/elixir-nx/xla/tree/main/builds). You may want to adjust the ROCm
version in `rocm.Dockerfile` accordingly.

When you encounter errors at runtime, you may want to set `ROCM_PATH=/opt/rocm-5.7.0`
and `LD_LIBRARY_PATH="/opt/rocm-5.7.0/lib"` (with your respective version). For further
When you encounter errors at runtime, you may want to set `ROCM_PATH=/opt/rocm-6.0.0`
and `LD_LIBRARY_PATH="/opt/rocm-6.0.0/lib"` (with your respective version). For further
issues, feel free to open an issue.
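
As a minimal sketch of the environment setup described above, assuming ROCm is installed under `/opt/rocm-<version>` (the `rocm_version` variable is illustrative and should match your installation):

```shell
# Illustrative value; replace with the ROCm version installed on your system.
rocm_version="6.0.0"

export ROCM_PATH="/opt/rocm-${rocm_version}"
# Prepend the ROCm libraries while keeping any existing search path entries.
export LD_LIBRARY_PATH="${ROCM_PATH}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

Prepending instead of overwriting `LD_LIBRARY_PATH` is a choice made in this sketch; the literal form in the text sets it to the ROCm `lib` directory only.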

#### `XLA_BUILD`
@@ -93,29 +93,29 @@ Keep in mind that the compilation usually takes a very long time.
You will need the following installed in your system for the compilation:

* [Git](https://git-scm.com/) for fetching XLA source
* [Bazel v6.5.0](https://bazel.build/) for compiling XLA
* [Bazel v7.4.1](https://bazel.build/) for compiling XLA
* [Clang 18](https://clang.llvm.org/) for compiling XLA
* [Python3](https://python.org) with NumPy installed for compiling XLA

### Common issues

#### Bazel version

Use `bazel --version` to check your Bazel version, make sure you are using v6.5.0.
Use `bazel --version` to check your Bazel version, make sure you are using v7.4.1.
Most binaries are available on [Github](https://github.com/bazelbuild/bazel/releases),
but it can also be installed with `asdf`:

```shell
asdf plugin-add bazel
asdf install bazel 6.5.0
asdf global bazel 6.5.0
asdf install bazel 7.4.1
asdf global bazel 7.4.1
```
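
The version check can also be scripted. The sketch below compares the Bazel on `PATH` against the version pinned in an asdf-style `.tool-versions` file. The helper names `pinned_version`, `bazel_version_from_output`, and `check_bazel_version` are hypothetical and not part of this repository; the sketch assumes `bazel --version` prints a line of the form `bazel 7.4.1`.

```shell
# Hypothetical helpers for illustration; they are not part of this repository.

# Print the version pinned for a tool in an asdf-style .tool-versions file.
pinned_version() {
  awk -v tool="$1" '$1 == tool { print $2; exit }' "${2:-.tool-versions}"
}

# Read `bazel --version` output (for example "bazel 7.4.1") on stdin
# and print only the version number.
bazel_version_from_output() {
  awk '$1 == "bazel" { print $2; exit }'
}

# Succeed only if the Bazel on PATH matches the pinned version.
check_bazel_version() {
  expected="$(pinned_version bazel "${1:-.tool-versions}")"
  actual="$(bazel --version | bazel_version_from_output)"
  if [ "$expected" != "$actual" ]; then
    echo "expected Bazel ${expected}, found ${actual:-none}" >&2
    return 1
  fi
}
```

Running `check_bazel_version` from the project root before starting a build would surface a mismatch early, since the compilation itself takes a long time.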

#### GCC
#### Clang

You may have issues with newer and older versions of GCC. XLA builds are known to work
with GCC versions between 7.5 and 9.3. If your system uses a newer GCC version, you can
install an older version and tell Bazel to use it with `export CC=/path/to/gcc-{version}`
where version is the GCC version you installed.
XLA builds are known to work with Clang 18. On macOS clang comes as part of Xcode SDK
and the version may be older, though for macOS we have precompiled archives, so you
most likely don't need to worry about it.
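
To see which Clang a build would pick up, something like the following may help. It is only a sketch: `clang_major_version` is a hypothetical helper, and it assumes the first line of `clang --version` contains the text `clang version <major>.<minor>...`, as it does for the upstream, Ubuntu, and Apple builds.

```shell
# Hypothetical helper for illustration; not part of this repository.
# Reads `clang --version` output on stdin and prints the major version.
clang_major_version() {
  sed -n 's/.*clang version \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Usage, assuming clang is on PATH:
#   major="$(clang --version | clang_major_version)"
#   [ "$major" = "18" ] || echo "found Clang ${major:-none}, expected 18" >&2
```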

#### Python and asdf

@@ -136,7 +136,7 @@ There are two known workarounds:
`direnv` along with the `asdf-direnv` plugin will explicitly set the paths for any binary specified
in your project's `.tool-versions` files.

If you still get the error, you can also try setting `PYTHON_BIN_PATH`, like `export PYTHON_BIN_PATH=/usr/bin/python3.9`.
If you still get the error, you can also try setting `PYTHON_BIN_PATH`, like `export PYTHON_BIN_PATH=/usr/bin/python3.11`.

After doing any of the steps above, it may be necessary to clear the build cache by removing ` ~/.cache/xla_build`
(or the corresponding OS-specific cache location).
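
The last two steps can be sketched as follows. The helper `clear_xla_build_cache` is hypothetical and not part of this repository; it defaults to the `~/.cache/xla_build` location named above and accepts a different path for other platforms.

```shell
# Hypothetical helper for illustration; not part of this repository.
# Removes the build cache directory, defaulting to the location named above.
clear_xla_build_cache() {
  cache_dir="${1:-$HOME/.cache/xla_build}"
  rm -rf -- "$cache_dir"
}

# Usage, assuming python3.11 is on PATH:
#   export PYTHON_BIN_PATH="$(command -v python3.11)"
#   clear_xla_build_cache
```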
2 changes: 2 additions & 0 deletions build.log
@@ -0,0 +1,2 @@
ERROR: Cannot connect to the Docker daemon at unix:///Users/jonatanklosko/.docker/run/docker.sock. Is the docker daemon running?
0.34 real 0.08 user 0.08 sys
49 changes: 29 additions & 20 deletions builds/Dockerfile
@@ -1,26 +1,29 @@
ARG VARIANT
ARG BASE_IMAGE="hexpm/elixir:1.15.8-erlang-24.3.4.17-ubuntu-focal-20240427"
ARG BASE_IMAGE="hexpm/elixir:1.15.8-erlang-24.3.4.17-ubuntu-jammy-20250404"

# Pre-stages for base image variants

FROM ${BASE_IMAGE} AS base-cpu

FROM ${BASE_IMAGE} AS base-cuda

ARG CUDA_VERSION
ARG CUDNN_VERSION
# Now we use hermetic CUDA, so we no longer need to install it. We
# leave this commented out for now, for reference.

ARG DEBIAN_FRONTEND=noninteractive
# ARG CUDA_VERSION
# ARG CUDNN_VERSION

RUN distro="ubuntu$(. /etc/lsb-release; echo "$DISTRIB_RELEASE" | tr -d '.')" && \
# Official Docker images use the sbsa packages when targetting arm64.
# See https://gitlab.com/nvidia/container-images/cuda/-/blob/85f465ea3343a2d7f7753a0a838701999ed58a01/dist/12.5.1/ubuntu2204/base/Dockerfile#L12
arch="$(if [ "$(uname -m)" = "aarch64" ]; then echo "sbsa"; else echo "x86_64"; fi)" && \
apt-get update && apt-get install -y ca-certificates wget && \
wget -qO /tmp/cuda-keyring.deb https://developer.download.nvidia.com/compute/cuda/repos/$distro/$arch/cuda-keyring_1.1-1_all.deb && \
dpkg -i /tmp/cuda-keyring.deb && apt-get update && \
apt-get install -y git cuda-toolkit-${CUDA_VERSION} libcudnn9-cuda-12=${CUDNN_VERSION}-1 libcudnn9-dev-cuda-12=${CUDNN_VERSION}-1 && \
apt-get clean -y && rm -rf /var/lib/apt/lists/*
# ARG DEBIAN_FRONTEND=noninteractive

# RUN distro="ubuntu$(. /etc/lsb-release; echo "$DISTRIB_RELEASE" | tr -d '.')" && \
# # Official Docker images use the sbsa packages when targetting arm64.
# # See https://gitlab.com/nvidia/container-images/cuda/-/blob/85f465ea3343a2d7f7753a0a838701999ed58a01/dist/12.5.1/ubuntu2204/base/Dockerfile#L12
# arch="$(if [ "$(uname -m)" = "aarch64" ]; then echo "sbsa"; else echo "x86_64"; fi)" && \
# apt-get update && apt-get install -y ca-certificates wget && \
# wget -qO /tmp/cuda-keyring.deb https://developer.download.nvidia.com/compute/cuda/repos/$distro/$arch/cuda-keyring_1.1-1_all.deb && \
# dpkg -i /tmp/cuda-keyring.deb && apt-get update && \
# apt-get install -y git cuda-toolkit-${CUDA_VERSION} libcudnn9-cuda-12=${CUDNN_VERSION}-1 libcudnn9-dev-cuda-12=${CUDNN_VERSION}-1 && \
# apt-get clean -y && rm -rf /var/lib/apt/lists/*

FROM ${BASE_IMAGE} AS base-rocm

@@ -42,25 +45,29 @@ ENV ROCM_PATH "/opt/rocm-${ROCM_VERSION}.0"
FROM base-${VARIANT}

# Set the missing UTF-8 locale, otherwise Elixir warns
ENV LC_ALL C.UTF-8
ENV LC_ALL=C.UTF-8

# Make sure installing packages (like tzdata) doesn't prompt for configuration
ARG DEBIAN_FRONTEND=noninteractive

# We need to install "add-apt-repository" first
RUN apt-get update && apt-get install -y software-properties-common && \
# Add repository with the latest git version
add-apt-repository ppa:git-core/ppa && \
RUN apt-get update && \
# Install basic system dependencies
apt-get update && apt-get install -y ca-certificates curl git unzip wget && \
# Install Clang
clang_version="18" && \
apt-get install -y wget gnupg software-properties-common lsb-release && \
wget -qO- https://apt.llvm.org/llvm.sh | bash -s -- $clang_version && \
update-alternatives --install /usr/bin/clang clang /usr/bin/clang-$clang_version 100 && \
update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-$clang_version 100 && \
apt-get clean -y && rm -rf /var/lib/apt/lists/*

# Install Bazel using Bazelisk (works for both amd and arm)
RUN wget -O bazel "https://github.com/bazelbuild/bazelisk/releases/download/v1.18.0/bazelisk-linux-$(dpkg --print-architecture)" && \
RUN wget -O bazel "https://github.com/bazelbuild/bazelisk/releases/download/v1.26.0/bazelisk-linux-$(dpkg --print-architecture)" && \
chmod +x bazel && \
mv bazel /usr/local/bin/bazel

ENV USE_BAZEL_VERSION 6.5.0
ENV USE_BAZEL_VERSION=7.4.1

# Install Python and the necessary global dependencies
RUN apt-get update && apt-get install -y python3 python3-pip && \
@@ -70,6 +77,8 @@ RUN apt-get update && apt-get install -y python3 python3-pip && \

# Setup project files

WORKDIR /xla

ARG XLA_TARGET

ENV XLA_TARGET=${XLA_TARGET}
@@ -80,7 +89,7 @@ COPY mix.exs mix.lock ./
RUN mix deps.get

COPY lib lib
COPY Makefile ./
COPY README.md Makefile ./
COPY extension extension

CMD [ "mix", "compile" ]
4 changes: 2 additions & 2 deletions builds/build.sh
@@ -33,10 +33,10 @@ case "$target" in
;;

"cuda12")
# Note that the versions are configured with HERMETIC_CUDA_VERSION
# in lib/xla.ex.
docker build -t xla-cuda12 -f builds/Dockerfile \
--build-arg VARIANT=cuda \
--build-arg CUDA_VERSION=12-3 \
--build-arg CUDNN_VERSION=9.1.1.17 \
--build-arg XLA_TARGET=cuda12 \
.
;;