CI: 04/23/25 upstream sync #379

rocm-repo-management-api-2 · 2025-04-23T06:02:14Z

Daily sync with upstream

…functions About half of the tracing-cache-miss explanations in a large benchmark end up being from JAX-internal functions, such as `jax.numpy` functions. These cache misses are not what the JAX user wants to see, so we filter them out, using the same mechanism used for filtering tracebacks.

…4.34. As of today it has been 180 days since the release of 0.4.34, in which the following legacy LAPACK kernels were no longer used when lowering:

* getrf
* geqrf / orgqr
* potrf
* gesdd
* syevd
* geev
* gehrd

Following our compatibility policy, these are now safe to remove. PiperOrigin-RevId: 746388529

PiperOrigin-RevId: 746397395

…ernals PiperOrigin-RevId: 746397452

PiperOrigin-RevId: 746402180

Previously, `jax.jit` returned a function with extra attributes, e.g., `trace` and `lower`, so that we can write:

```
jax.jit(f).trace(...)
```

These attributes create problems when `jax.jit` is used together with `functools.wraps`. Essentially, `functools.wraps(jax.jit(f))(wrapper)` is supposed to produce a function that, when invoked, calls `wrapper` and then presumably `jax.jit(f)`. This works as expected if you just call the result, but if you use it with `lower` or `trace`, the `wrapper` is bypassed entirely, because `wraps` copies the attributes `trace` and `lower` from `jax.jit(f)` onto the resulting function. See jax-ml#27829 and jax-ml#27825.

The solution proposed here is to make `trace` and `lower` class attributes, so that they are not copied by `functools.wraps`. Thus, if you try to use `lower` or `trace` on the result of `functools.wraps(jax.jit(f))()`, you will get an error. That is better than silently ignoring the wrapper. The workaround is to apply `jax.jit` last among your wrappers.

Fixes: jax-ml#27829
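The copying behavior described above can be reproduced without JAX. The following is a minimal sketch using only the standard library; `InstanceAttrJit` and `ClassAttrJit` are hypothetical stand-ins for the old and new behavior of the object returned by `jax.jit`, not JAX's actual classes. It relies on the fact that `functools.wraps` updates the wrapper's `__dict__` from the wrapped object's instance `__dict__`, which does not include attributes defined on the class.

```python
import functools

calls = []  # records every invocation of the wrapper


def f(x):
    return x + 1


class InstanceAttrJit:
    """Toy stand-in (hypothetical): `lower` is stored in the instance __dict__."""

    def __init__(self, fun):
        self.fun = fun
        self.lower = lambda *args: ("lowered", fun.__name__)

    def __call__(self, *args):
        return self.fun(*args)


class ClassAttrJit:
    """Toy stand-in (hypothetical): `lower` is defined on the class."""

    def __init__(self, fun):
        self.fun = fun

    def __call__(self, *args):
        return self.fun(*args)

    def lower(self, *args):
        return ("lowered", self.fun.__name__)


def wrap(jitted):
    @functools.wraps(jitted)  # copies jitted.__dict__ onto `wrapper`
    def wrapper(*args):
        calls.append(args)
        return jitted(*args)

    return wrapper


old_style = wrap(InstanceAttrJit(f))
new_style = wrap(ClassAttrJit(f))

old_style(1)        # goes through the wrapper, so `calls` grows
old_style.lower(1)  # copied instance attribute: the wrapper is bypassed silently
print(calls)
print(hasattr(new_style, "lower"))  # False: class attributes are not copied
```

With the class-attribute variant, `new_style.lower(...)` raises `AttributeError` instead of silently skipping the wrapper, which mirrors the trade-off the commit message describes.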

PiperOrigin-RevId: 746425307

So far Mosaic was implicitly relying on XLA to register the NVPTX target, which caused problems in cases where only a Mosaic kernel gets compiled and XLA didn't initialize the LLVM NVPTX target. PiperOrigin-RevId: 746433654

The skip decorator being used here only worked for test methods, not test classes, so it accidentally had the effect of skipping all the tests. But we don't really need a special decorator here anyway. PiperOrigin-RevId: 746434607

Follow-up from jax-ml#27916. jax-fixit PiperOrigin-RevId: 746442635

stdout redirection is inherently racy; mark test cases doing it as thread unsafe. PiperOrigin-RevId: 746443039

…loop` lowering PiperOrigin-RevId: 746444372

http://github.com/openxla/xla/commit/ca9011742bb84b3d2158feb262ddca221957ccc9. PiperOrigin-RevId: 746448816

…k capsules to jax.dlpack.from_dlpack(). to_dlpack() is not needed in the current version of the dlpack protocol. The from_dlpack() method accepts an object that implements __dlpack__(). In most cases, a JAX array can be passed directly to functions like torch.dlpack.from_dlpack(), and vice versa for other frameworks. The main exception is TensorFlow which does not implement the current protocol. PiperOrigin-RevId: 746464890
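As a rough illustration of the protocol-based exchange described above, the sketch below passes a JAX array directly to consumers that call `__dlpack__()` themselves, with no explicit capsule. It assumes a JAX installation with the CPU backend and a NumPy version that provides `np.from_dlpack`; the array is placed on CPU explicitly so NumPy can consume it.

```python
import jax
import jax.numpy as jnp
import numpy as np

# Keep the array on CPU so that NumPy is able to import it.
cpu = jax.devices("cpu")[0]
x = jax.device_put(jnp.arange(4, dtype=jnp.float32), cpu)

# NumPy consumes the JAX array through `x.__dlpack__()`;
# no call to `jax.dlpack.to_dlpack` is involved.
x_np = np.from_dlpack(x)

# `jax.dlpack.from_dlpack` likewise accepts any object implementing
# `__dlpack__`. A JAX array is used here only to avoid extra dependencies.
y = jax.dlpack.from_dlpack(x)

print(x_np)
print(y)
```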

PiperOrigin-RevId: 746490665

jax-fixit PiperOrigin-RevId: 746496570

PiperOrigin-RevId: 746520758

…s the same as the order of arguments received in `jit` API and make it keyword-only PiperOrigin-RevId: 746527807

This is not needed under the newer DLPack protocol for users, and there's an equivalent (`__dlpack__`). PiperOrigin-RevId: 746530351

PiperOrigin-RevId: 746543312

The new `METADATA` specification disallows underscores in extra names and automatically converts them to dashes. https://packaging.python.org/en/latest/specifications/core-metadata/#provides-extra-multiple-use This should prevent the following error from appearing in future JAX releases: jax-ml#27874 PiperOrigin-RevId: 746546162
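For reference, the conversion mentioned above follows the usual Python packaging name normalization (runs of `-`, `_` and `.` become a single `-`, and the result is lowercased). A minimal sketch, where `normalize_extra` and the sample names are illustrative and not part of any JAX build script:

```python
import re


def normalize_extra(name: str) -> str:
    """Normalize an extra name the way packaging tools normalize names."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Hypothetical extra names, used only to show the effect.
for extra in ["some_extra", "Another.Extra", "already-normal"]:
    print(extra, "->", normalize_extra(extra))
```

Declaring extras in their normalized form up front means the name in the built metadata matches what was written in the package configuration.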

PiperOrigin-RevId: 746546870

Use a count of chips (or omit it if 1) rather than specifying an ICI topology. Examples:

* tpu_v5e_1x1 -> tpu_v5e
* tpu_v5e_4x2 -> tpu_v5e_x8

PiperOrigin-RevId: 746547477
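The renaming rule can be sketched as a small helper. This is an illustration inferred from the two examples above, assuming the chip count is the product of the topology dimensions; `rename_tpu_tag` is a hypothetical name and not part of the JAX build files.

```python
import math
import re


def rename_tpu_tag(old_tag: str) -> str:
    """Convert `<prefix>_<A>x<B>` into `<prefix>` or `<prefix>_x<chips>`."""
    match = re.fullmatch(r"(?P<prefix>.+)_(?P<topology>\d+(?:x\d+)+)", old_tag)
    if match is None:
        return old_tag  # no topology suffix: leave the tag unchanged
    chips = math.prod(int(dim) for dim in match["topology"].split("x"))
    prefix = match["prefix"]
    # The count is omitted for a single chip.
    return prefix if chips == 1 else f"{prefix}_x{chips}"


print(rename_tpu_tag("tpu_v5e_1x1"))
print(rename_tpu_tag("tpu_v5e_4x2"))
```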

PiperOrigin-RevId: 746554582

PiperOrigin-RevId: 746564071

…thon as a patch, rolling back. Reverts b1c96d4 PiperOrigin-RevId: 746565341

Missing space in '..math::' meant that the math wasn't rendering correctly.

These APIs are already broken on GPU and TPU by virtue of not being implemented in the PJRT C API, so it seems unlikely that they have any users. PiperOrigin-RevId: 746595857

This parameter is available from jax-ml#23040 and documented in https://docs.jax.dev/en/latest/_autosummary/jax.numpy.isin.html. PiperOrigin-RevId: 746606206

PiperOrigin-RevId: 750226360

PiperOrigin-RevId: 750226731

PiperOrigin-RevId: 750229300

PiperOrigin-RevId: 750230068

…e block size is too small. PiperOrigin-RevId: 750244014

PiperOrigin-RevId: 750262738

Co-authored-by: Matthew Johnson

PiperOrigin-RevId: 750282747

…if they are identical PiperOrigin-RevId: 750284947

PiperOrigin-RevId: 750287933

… pipeline emitter if there's nothing to wait for. Also enforce that `arrival_count` is always > 0. PiperOrigin-RevId: 750294068

PiperOrigin-RevId: 750296702

PiperOrigin-RevId: 750299496

PiperOrigin-RevId: 750302979

PiperOrigin-RevId: 750309885

PiperOrigin-RevId: 750342878

The signature is:

```
jax.shard_map(f, /, *, out_specs, axis_names=set(), in_specs=None, mesh=None, check_vma=True)
```

This API is a drop-in replacement for the experimental `shard_map` endpoint with just two small changes: `check_rep` is renamed to `check_vma`, and all arguments to `shard_map` except `f` are keyword-only, while `f` is positional-only.

**But why are mesh and in_specs optional? And what is the new `axis_names` argument?**

* `mesh` is optional because it can be inferred from the context if the user sets the mesh via `jax.sharding.use_mesh(mesh)`.
* `in_specs` is optional because it can be inferred from the arguments passed to `shard_map` if all mesh axes are `Explicit`.
* `axis_names` tells `shard_map` which axes are `Manual`. If empty, the `shard_map` is `Manual` over all mesh axes. In the experimental endpoint of `shard_map`, this argument was called `auto`. But since the advent of `sharding_in_types`, mesh axes can be `Auto`, `Explicit` or `Manual`, so `auto` was no longer enough. That's why `jax.shard_map` flips the argument to `axis_names`.

**If `in_specs` is optional, why is `out_specs` compulsory?**

Because we still need to know which dimension to concatenate over. It can't be inferred automatically since the choice can be anything.

PiperOrigin-RevId: 750343135
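A minimal usage sketch of the keyword-only call style follows. It passes `mesh` and `in_specs` explicitly rather than relying on the inference described above, uses a one-device mesh so it runs on any machine, and falls back to the experimental endpoint when the installed JAX predates `jax.shard_map` (the fallback is an assumption about the reader's environment, not part of this change).

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, PartitionSpec as P

# Prefer the new top-level endpoint; fall back on older JAX versions.
try:
    shard_map = jax.shard_map
except AttributeError:
    from jax.experimental.shard_map import shard_map

# One-device mesh with a single axis named "x" (illustrative choice).
mesh = Mesh(np.array(jax.devices()[:1]), ("x",))


def double(block):
    # Runs per shard; with one device the shard is the whole array.
    return block * 2


# `f` is passed positionally, everything else by keyword.
doubled = shard_map(double, mesh=mesh, in_specs=P("x"), out_specs=P("x"))
result = doubled(jnp.arange(4.0))
print(result)
```

Here `out_specs=P("x")` states that the per-shard outputs are concatenated along the first dimension, which is the information the text says cannot be inferred.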

…REG spill. Take new factors into account for auto-tuning:

- q_dtype_name
- kv_dtype_name
- num_q_heads_per_blk
- num_kv_heads_per_blk
- head_dim
- page_size
- max_num_batched_tokens
- max_model_len = page_size * pages_per_seq

We only have 32 SREGs in TensorCore. If the page size is small, we can easily spill SREGs. This CL suggests using `page_size = max_model_len // 16`, which makes sure that at most 16 SREGs are used for KV page indices per sequence. PiperOrigin-RevId: 750370022
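The arithmetic behind the suggestion can be checked with a short sketch. It assumes one SREG per KV page index and a `max_model_len` that is divisible by 16; the function names are illustrative and do not come from the kernel's tuning code.

```python
MAX_PAGE_INDEX_SREGS = 16  # budget for KV page indices per sequence


def suggested_page_size(max_model_len: int) -> int:
    """page_size = max_model_len // 16, as suggested above."""
    return max_model_len // MAX_PAGE_INDEX_SREGS


def pages_per_seq(max_model_len: int, page_size: int) -> int:
    """From max_model_len = page_size * pages_per_seq."""
    return max_model_len // page_size


max_model_len = 2048
page_size = suggested_page_size(max_model_len)
print(page_size, pages_per_seq(max_model_len, page_size))

# A smaller page size needs more page indices for the same model length.
print(pages_per_seq(max_model_len, 32))
```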

PiperOrigin-RevId: 750374339

PiperOrigin-RevId: 750385718

…ings PiperOrigin-RevId: 750390355

PiperOrigin-RevId: 750400956

…ard_map.py to `jax/_src`

The signature is: `jax.shard_map(f, /, *, out_specs, axis_names=set(), in_specs=None, mesh=None, check_vma=True)`

This API is a drop-in replacement for the experimental `shard_map` endpoint with just two small changes: `check_rep` is renamed to `check_vma`, and all arguments to `shard_map` except `f` are keyword-only, while `f` is positional-only.

**But why are mesh and in_specs optional? And what is the new axis_names argument?**

* `mesh` is optional because it can be inferred from the context if the user sets the mesh via `jax.sharding.use_mesh(mesh)`.
* `in_specs` is optional because it can be inferred from the arguments passed to `shard_map` if all mesh axes are `Explicit`.
* `axis_names` tells `shard_map` which axes are `Manual`. If empty, the `shard_map` is `Manual` over all mesh axes. In the experimental endpoint of `shard_map`, this argument was called `auto`. But since the advent of `sharding_in_types`, mesh axes can be `Auto`, `Explicit` or `Manual`, so `auto` was no longer enough. That's why `jax.shard_map` flips the argument to `axis_names`.

**If in_specs is optional, why is out_specs compulsory?**

Because we still need to know which dimension to concatenate over. It can't be inferred automatically since the choice can be anything.

PiperOrigin-RevId: 750401402

…oadcast in lower_to_llo. - To make fold_in non-trivial, in Pallas the key is now represented as a (1, 2)-shaped key. - 2 new primitives were added for wrapping/unwrapping the key from scalars. This is needed because JAX's wrap/unwrap return to and from vectors, whereas in Pallas we need to return a list of scalars. PiperOrigin-RevId: 750422791

gnecula and others added 30 commits April 11, 2025 12:53

Merge pull request jax-ml#27685 from Cjkkkk:return_cudnn_sdpa_residual

ac285a1

PiperOrigin-RevId: 746397395

Merge pull request jax-ml#27916 from gnecula:tracing_cache_ignore_int…

1035c9a

…ernals PiperOrigin-RevId: 746397452

Merge pull request jax-ml#27876 from gnecula:aot_compute_on

c9cbf82

PiperOrigin-RevId: 746402180

Merge pull request jax-ml#27873 from gnecula:aot_wraps2

a1c06fc

PiperOrigin-RevId: 746425307

Register NVPTX LLVM backend from Mosaic custom call

896557f

So far Mosaic was implicitly relying on XLA to register the NVPTX target, which caused problems in cases where only a Mosaic kernel gets compiled and XLA didn't initialize the LLVM NVPTX target. PiperOrigin-RevId: 746433654

Fix api_test on persistent cache enabled platform

8082186

Follow-up from jax-ml#27916. jax-fixit PiperOrigin-RevId: 746442635

Fix test flakiness in tpu_pallas_test when JAX_TEST_NUM_THREADS > 1.

614ef37

stdout redirection is inherently racy; mark test cases doing it as thread unsafe. PiperOrigin-RevId: 746443039

[pallas:mosaic_gpu] Added support for unroll=True to the `lax.fori_…

d543df1

…loop` lowering PiperOrigin-RevId: 746444372

Update XLA dependency to use revision

b3c0ec0

http://github.com/openxla/xla/commit/ca9011742bb84b3d2158feb262ddca221957ccc9. PiperOrigin-RevId: 746448816

Bump the JAX version to v0.6.0, which will be the next release version.

3736e5b

PiperOrigin-RevId: 746490665

Fix the printing of the function name in tracing-cache-miss explanations

5adac1c

jax-fixit PiperOrigin-RevId: 746496570

[Pallas] Fix potential race condition in Pallas TPU docs

88dae18

Remove unused execute_sharded_* functions.

b1c96d4

PiperOrigin-RevId: 746520758

Make sure the order passed to make_jit and _parse_jit_arguments i…

a39b623

…s the same as the order of arguments received in `jit` API and make it keyword-only PiperOrigin-RevId: 746527807

Deprecate jax.dlpack.to_dlpack.

ab88273

This is not needed under the newer DLPack protocol for users, and there's an equivalent (`__dlpack__`). PiperOrigin-RevId: 746530351

document SPMD pipeline parallelism

8e9fca1

PiperOrigin-RevId: 746543312

[Pallas] Allow 1D iota

27c07f7

PiperOrigin-RevId: 746546870

Rename TPU bazel test tags.

904419c

Use a count of chips (or omit it if 1) rather than specifying an ICI topology. Examples:

* tpu_v5e_1x1 -> tpu_v5e
* tpu_v5e_4x2 -> tpu_v5e_x8

PiperOrigin-RevId: 746547477

Reverts 907725d

e9364f4

PiperOrigin-RevId: 746554582

Deprecate PositionalSharding and GSPMDSharding

6efcf44

PiperOrigin-RevId: 746564071

Removed type annotations appear to be used and actually defined in py…

c0d97a6

…thon as a patch, rolling back. Reverts b1c96d4 PiperOrigin-RevId: 746565341

Fix typo in jax.lax.linalg.symmetric_product description

c90751b

Missing space in '..math::' meant that the math wasn't rendering correctly.

Deprecate jax.lax.infeed and jax.lax.outfeed.

6fc78a5

These APIs are already broken on GPU and TPU by virtue of not being implemented in the PJRT C API, so it seems unlikely that they have any users. PiperOrigin-RevId: 746595857

Add the method argument to jax.numpy.isin stub.

b2a8df7

This parameter is available from jax-ml#23040 and documented in https://docs.jax.dev/en/latest/_autosummary/jax.numpy.isin.html. PiperOrigin-RevId: 746606206

Google-ML-Automation and others added 28 commits April 22, 2025 09:43

Merge pull request jax-ml#27574 from jburnim:jburnim_pallas_core_map

3c65029

PiperOrigin-RevId: 750226360

Merge pull request jax-ml#28092 from dfm:gh25847

437d7d8

PiperOrigin-RevId: 750226731

Merge pull request jax-ml#24074 from scottstanie:fix-polyfit-cov

5b839ff

PiperOrigin-RevId: 750229300

Merge pull request jax-ml#28126 from jakevdp:scatter-annotations

711ba9a

PiperOrigin-RevId: 750230068

[Pallas][Mosaic TPU] Improve error message for 1D block specs when th…

720488f

…e block size is too small. PiperOrigin-RevId: 750244014

[CI] Add tpu v6e-8 to nightly test and release test.

91deb25

PiperOrigin-RevId: 750262738

[Pallas] Update TPU pipelining docs

64c645d

Fix handling of SymbolicZero output when batching custom_jvp.

82a79c5

Co-authored-by: Matthew Johnson

Merge pull request jax-ml#28100 from justinjfu:pipe_docs_v2

c7417ce

PiperOrigin-RevId: 750282747

[Pallas Fuser] Allow multiple BlockSpec inputs to select_n push rule …

e81dae6

…if they are identical PiperOrigin-RevId: 750284947

DOC: link to ai-stack tutorials from JAX's front page

5c22694

Merge pull request jax-ml#28150 from dfm:gh28144

0e25620

PiperOrigin-RevId: 750287933

[Mosaic GPU] Do not create a barrier with arrival_count == 0 in the…

3d470f4

… pipeline emitter if there's nothing to wait for. Also enforce that `arrival_count` is always > 0. PiperOrigin-RevId: 750294068

[Pallas Fuser] Change physicalize to resolve_fusion_dtypes

5486296

PiperOrigin-RevId: 750296702

[Pallas] Propagate Jaxpr effects through pl.fusible

1727657

PiperOrigin-RevId: 750299496

Set core_index to default for tpu_pallas_async_test

1c9ef1d

PiperOrigin-RevId: 750302979

Merge pull request jax-ml#28158 from jakevdp:fix-ldexp

0dd8e97

PiperOrigin-RevId: 750309885

Update tsan requirements patch after lockfile update.

9425e1c

Merge pull request jax-ml#28185 from hawkinsp:tsan4

36af683

PiperOrigin-RevId: 750342878

Account for versioned clang binaries

14399f3

Merge pull request jax-ml#28182 from jakevdp:ai-stack-link

1a95e27

PiperOrigin-RevId: 750374339

Skip unary_ops_accuracy test for TPU version 7 and above.

57d1df4

PiperOrigin-RevId: 750385718

[Mosaic:TPU][Relayout] Row shifts for packed types and non-native til…

3c86982

…ings PiperOrigin-RevId: 750390355

Merge pull request jax-ml#27914 from ROCm:vers-clang-fix

23b63a2

PiperOrigin-RevId: 750400956

rocm-repo-management-api-2 bot requested a review from a team as a code owner April 23, 2025 06:02

rocm-repo-management-api-2 bot enabled auto-merge (rebase) April 23, 2025 06:02
