Generate MTL and MPS structs and enums with Clang.jl #492

Merged: 12 commits merged into main from wrapperexplore, Dec 20, 2024

christiangnrd (Contributor) commented Dec 10, 2024

Currently works for structs and enums in Metal.framework, with a few hacks and with this branch of Clang.jl dev'ed: https://github.com/christiangnrd/Clang.jl/tree/objectiveC

Any advice, suggestions, or help welcome!

maleadt (Member) commented Dec 10, 2024

Even if only for enums etc, this would be pretty useful! I imagined having to write our own Clang.jl-based generator in order to generate ObjC calls, but this would probably be a good first step.

christiangnrd force-pushed the wrapperexplore branch 3 times, most recently from 21b2dd0 to 4b4ed49 (December 10, 2024 19:22)
christiangnrd (Contributor, Author) commented

> I imagined having to write our own Clang.jl-based generator in order to generate ObjC calls, but this would probably be a good first step.

I think most of it can be upstreamed to Clang.jl. The changes I've made on my branch are either bug fixes or definitions of the Generators types for Objective-C constructs.

github-actions bot (Contributor) left a comment

Metal Benchmarks

| Benchmark suite | Current: a419a2b | Previous: 84447c4 | Ratio |
| --- | --- | --- | --- |
| private array/construct | 27380.85714285714 ns | 26538.166666666668 ns | 1.03 |
| private array/broadcast | 458833 ns | 455500 ns | 1.01 |
| private array/random/randn/Float32 | 824917 ns | 810750 ns | 1.02 |
| private array/random/randn!/Float32 | 650459 ns | 641541 ns | 1.01 |
| private array/random/rand!/Int64 | 544250 ns | 547500 ns | 0.99 |
| private array/random/rand!/Float32 | 581917 ns | 581708 ns | 1.00 |
| private array/random/rand/Int64 | 784083.5 ns | 754708 ns | 1.04 |
| private array/random/rand/Float32 | 622833.5 ns | 576250 ns | 1.08 |
| private array/copyto!/gpu_to_gpu | 642708 ns | 678208 ns | 0.95 |
| private array/copyto!/cpu_to_gpu | 817417 ns | 821375 ns | 1.00 |
| private array/copyto!/gpu_to_cpu | 624000 ns | 674333 ns | 0.93 |
| private array/accumulate/1d | 1354667 ns | 1347708 ns | 1.01 |
| private array/accumulate/2d | 1386542 ns | 1384500 ns | 1.00 |
| private array/iteration/findall/int | 2104209 ns | 2102291.5 ns | 1.00 |
| private array/iteration/findall/bool | 1809042 ns | 1807917 ns | 1.00 |
| private array/iteration/findfirst/int | 1693208 ns | 1692020.5 ns | 1.00 |
| private array/iteration/findfirst/bool | 1666500 ns | 1656666 ns | 1.01 |
| private array/iteration/scalar | 3546833 ns | 3920667 ns | 0.90 |
| private array/iteration/logical | 3217125 ns | 3200708 ns | 1.01 |
| private array/iteration/findmin/1d | 1767146 ns | 1765458 ns | 1.00 |
| private array/iteration/findmin/2d | 1354354 ns | 1340875 ns | 1.01 |
| private array/reductions/reduce/1d | 1036000 ns | 1037000 ns | 1.00 |
| private array/reductions/reduce/2d | 662500 ns | 666542 ns | 0.99 |
| private array/reductions/mapreduce/1d | 1034937.5 ns | 1027813 ns | 1.01 |
| private array/reductions/mapreduce/2d | 663792 ns | 712062.5 ns | 0.93 |
| private array/permutedims/4d | 2551354 ns | 2564583 ns | 0.99 |
| private array/permutedims/2d | 1020479 ns | 1019958 ns | 1.00 |
| private array/permutedims/3d | 1590396 ns | 1593584 ns | 1.00 |
| private array/copy | 548479.5 ns | 541583 ns | 1.01 |
| latency/precompile | 5775550229 ns | 5246979375 ns | 1.10 |
| latency/ttfp | 6740591145.5 ns | 6653773250 ns | 1.01 |
| latency/import | 1167900875 ns | 1164822625 ns | 1.00 |
| integration/metaldevrt | 718291.5 ns | 713667 ns | 1.01 |
| integration/byval/slices=1 | 1565708.5 ns | 1535999.5 ns | 1.02 |
| integration/byval/slices=3 | 8698562.5 ns | 9724167 ns | 0.89 |
| integration/byval/reference | 1566791.5 ns | 1540854.5 ns | 1.02 |
| integration/byval/slices=2 | 2619229 ns | 2638000 ns | 0.99 |
| kernel/indexing | 461104 ns | 476334 ns | 0.97 |
| kernel/indexing_checked | 474312.5 ns | 477542 ns | 0.99 |
| kernel/launch | 8666 ns | 8084 ns | 1.07 |
| metal/synchronization/stream | 14166 ns | 14625 ns | 0.97 |
| metal/synchronization/context | 15042 ns | 15041 ns | 1.00 |
| shared array/construct | 27638.833333333332 ns | 27142.333333333332 ns | 1.02 |
| shared array/broadcast | 466937.5 ns | 461000 ns | 1.01 |
| shared array/random/randn/Float32 | 810812 ns | 817792 ns | 0.99 |
| shared array/random/randn!/Float32 | 647166 ns | 636042 ns | 1.02 |
| shared array/random/rand!/Int64 | 546000 ns | 542833 ns | 1.01 |
| shared array/random/rand!/Float32 | 583375 ns | 577542 ns | 1.01 |
| shared array/random/rand/Int64 | 755354 ns | 775874.5 ns | 0.97 |
| shared array/random/rand/Float32 | 617729 ns | 573604 ns | 1.08 |
| shared array/copyto!/gpu_to_gpu | 90667 ns | 88375 ns | 1.03 |
| shared array/copyto!/cpu_to_gpu | 87792 ns | 85375 ns | 1.03 |
| shared array/copyto!/gpu_to_cpu | 81209 ns | 84084 ns | 0.97 |
| shared array/accumulate/1d | 1338458 ns | 1353166.5 ns | 0.99 |
| shared array/accumulate/2d | 1386812 ns | 1386916 ns | 1.00 |
| shared array/iteration/findall/int | 1815750 ns | 1831541.5 ns | 0.99 |
| shared array/iteration/findall/bool | 1589542 ns | 1577750 ns | 1.01 |
| shared array/iteration/findfirst/int | 1391083 ns | 1387770.5 ns | 1.00 |
| shared array/iteration/findfirst/bool | 1361875 ns | 1340584 ns | 1.02 |
| shared array/iteration/scalar | 151584 ns | 158625 ns | 0.96 |
| shared array/iteration/logical | 2959875 ns | 2945562.5 ns | 1.00 |
| shared array/iteration/findmin/1d | 1464729.5 ns | 1456666 ns | 1.01 |
| shared array/iteration/findmin/2d | 1350417 ns | 1357833 ns | 0.99 |
| shared array/reductions/reduce/1d | 721500 ns | 730167 ns | 0.99 |
| shared array/reductions/reduce/2d | 662104.5 ns | 669417 ns | 0.99 |
| shared array/reductions/mapreduce/1d | 731042 ns | 743312.5 ns | 0.98 |
| shared array/reductions/mapreduce/2d | 661834 ns | 667000 ns | 0.99 |
| shared array/permutedims/4d | 2559146 ns | 2570541.5 ns | 1.00 |
| shared array/permutedims/2d | 1022750 ns | 1026187.5 ns | 1.00 |
| shared array/permutedims/3d | 1594438 ns | 1596125 ns | 1.00 |
| shared array/copy | 243604.5 ns | 245812 ns | 0.99 |

This comment was automatically generated by workflow using github-action-benchmark.
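
Reading note on the benchmark figures: the Ratio column appears to be the Current time divided by the Previous time, rounded to two decimals, so values above 1.00 mean the PR commit ran slower on that benchmark. The sketch below checks that reading against three rows copied from the report. It is not part of the benchmark workflow, and the function name `ratio` is a hypothetical helper introduced only for illustration.

```python
def ratio(current_ns: float, previous_ns: float) -> str:
    """Return current/previous formatted to two decimals (assumed Ratio definition)."""
    return f"{current_ns / previous_ns:.2f}"


# (current, previous) timings in nanoseconds, copied from the benchmark report.
rows = {
    "private array/construct": (27380.85714285714, 26538.166666666668),
    "private array/iteration/scalar": (3546833, 3920667),
    "latency/precompile": (5775550229, 5246979375),
}

for name, (current, previous) in rows.items():
    print(f"{name}: {ratio(current, previous)}")
```

These three rows reproduce the reported ratios of 1.03, 0.90 and 1.10, which is consistent with the assumed definition.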

christiangnrd changed the title from "Explore wrapper generation with Clang.jl" to "Generate MTL and MPS structs and enums with Clang.jl" (Dec 11, 2024)
christiangnrd force-pushed the wrapperexplore branch 8 times, most recently from 0d8492d to 2fa5f8a (December 16, 2024 17:08)
maleadt (Member) commented Dec 17, 2024

Please ping when this is ready for review; you've marked it as such, but more commits keep appearing 🙂

Also, were the necessary changes to Clang.jl merged?

christiangnrd (Contributor, Author) commented

> Please ping when this is ready for review; you've marked it as such, but more commits keep appearing 🙂

I do have a terrible habit of refining my PRs after marking them ready. I'll be more mindful of that in the future.

> Also, were the necessary changes to Clang.jl merged?

Not yet. JuliaInterop/Clang.jl#519 and JuliaInterop/Clang.jl#522 are general improvements/bug fixes to handle the kinds of enums that Apple's Objective-C headers love so much (elaborated types and attributes). I just created JuliaInterop/Clang.jl#524, which shouldn't be too hard to review once the other two are merged.

I just rebased; any further changes, other than those addressing review comments, will go in a future PR.

maleadt force-pushed the main branch 2 times, most recently from 5a8daaa to b9610e3 (December 19, 2024 08:12)
Also comment out MTL enums and structs
maleadt (Member) left a comment

LGTM. We could probably also remove some enums from compiler/library.jl, but that can happen in a different PR.

res/wrap/README.md (review thread: outdated, resolved)
Co-authored-by: Tim Besard
christiangnrd merged commit ea1d6ad into main (Dec 20, 2024)
2 checks passed
christiangnrd deleted the wrapperexplore branch (December 20, 2024 14:01)