Generate MTL and MPS structs and enums with Clang.jl #492
Even if only for enums etc, this would be pretty useful! I imagined having to write our own Clang.jl-based generator in order to generate ObjC calls, but this would probably be a good first step.
I think most of it can be upstreamed to Clang.jl. The changes I've done on my branch are either bug fixes or definitions of the |
Metal Benchmarks
| Benchmark suite | Current: a419a2b | Previous: 84447c4 | Ratio |
|---|---|---|---|
| private array/construct | 27380.85714285714 ns | 26538.166666666668 ns | 1.03 |
| private array/broadcast | 458833 ns | 455500 ns | 1.01 |
| private array/random/randn/Float32 | 824917 ns | 810750 ns | 1.02 |
| private array/random/randn!/Float32 | 650459 ns | 641541 ns | 1.01 |
| private array/random/rand!/Int64 | 544250 ns | 547500 ns | 0.99 |
| private array/random/rand!/Float32 | 581917 ns | 581708 ns | 1.00 |
| private array/random/rand/Int64 | 784083.5 ns | 754708 ns | 1.04 |
| private array/random/rand/Float32 | 622833.5 ns | 576250 ns | 1.08 |
| private array/copyto!/gpu_to_gpu | 642708 ns | 678208 ns | 0.95 |
| private array/copyto!/cpu_to_gpu | 817417 ns | 821375 ns | 1.00 |
| private array/copyto!/gpu_to_cpu | 624000 ns | 674333 ns | 0.93 |
| private array/accumulate/1d | 1354667 ns | 1347708 ns | 1.01 |
| private array/accumulate/2d | 1386542 ns | 1384500 ns | 1.00 |
| private array/iteration/findall/int | 2104209 ns | 2102291.5 ns | 1.00 |
| private array/iteration/findall/bool | 1809042 ns | 1807917 ns | 1.00 |
| private array/iteration/findfirst/int | 1693208 ns | 1692020.5 ns | 1.00 |
| private array/iteration/findfirst/bool | 1666500 ns | 1656666 ns | 1.01 |
| private array/iteration/scalar | 3546833 ns | 3920667 ns | 0.90 |
| private array/iteration/logical | 3217125 ns | 3200708 ns | 1.01 |
| private array/iteration/findmin/1d | 1767146 ns | 1765458 ns | 1.00 |
| private array/iteration/findmin/2d | 1354354 ns | 1340875 ns | 1.01 |
| private array/reductions/reduce/1d | 1036000 ns | 1037000 ns | 1.00 |
| private array/reductions/reduce/2d | 662500 ns | 666542 ns | 0.99 |
| private array/reductions/mapreduce/1d | 1034937.5 ns | 1027813 ns | 1.01 |
| private array/reductions/mapreduce/2d | 663792 ns | 712062.5 ns | 0.93 |
| private array/permutedims/4d | 2551354 ns | 2564583 ns | 0.99 |
| private array/permutedims/2d | 1020479 ns | 1019958 ns | 1.00 |
| private array/permutedims/3d | 1590396 ns | 1593584 ns | 1.00 |
| private array/copy | 548479.5 ns | 541583 ns | 1.01 |
| latency/precompile | 5775550229 ns | 5246979375 ns | 1.10 |
| latency/ttfp | 6740591145.5 ns | 6653773250 ns | 1.01 |
| latency/import | 1167900875 ns | 1164822625 ns | 1.00 |
| integration/metaldevrt | 718291.5 ns | 713667 ns | 1.01 |
| integration/byval/slices=1 | 1565708.5 ns | 1535999.5 ns | 1.02 |
| integration/byval/slices=3 | 8698562.5 ns | 9724167 ns | 0.89 |
| integration/byval/reference | 1566791.5 ns | 1540854.5 ns | 1.02 |
| integration/byval/slices=2 | 2619229 ns | 2638000 ns | 0.99 |
| kernel/indexing | 461104 ns | 476334 ns | 0.97 |
| kernel/indexing_checked | 474312.5 ns | 477542 ns | 0.99 |
| kernel/launch | 8666 ns | 8084 ns | 1.07 |
| metal/synchronization/stream | 14166 ns | 14625 ns | 0.97 |
| metal/synchronization/context | 15042 ns | 15041 ns | 1.00 |
| shared array/construct | 27638.833333333332 ns | 27142.333333333332 ns | 1.02 |
| shared array/broadcast | 466937.5 ns | 461000 ns | 1.01 |
| shared array/random/randn/Float32 | 810812 ns | 817792 ns | 0.99 |
| shared array/random/randn!/Float32 | 647166 ns | 636042 ns | 1.02 |
| shared array/random/rand!/Int64 | 546000 ns | 542833 ns | 1.01 |
| shared array/random/rand!/Float32 | 583375 ns | 577542 ns | 1.01 |
| shared array/random/rand/Int64 | 755354 ns | 775874.5 ns | 0.97 |
| shared array/random/rand/Float32 | 617729 ns | 573604 ns | 1.08 |
| shared array/copyto!/gpu_to_gpu | 90667 ns | 88375 ns | 1.03 |
| shared array/copyto!/cpu_to_gpu | 87792 ns | 85375 ns | 1.03 |
| shared array/copyto!/gpu_to_cpu | 81209 ns | 84084 ns | 0.97 |
| shared array/accumulate/1d | 1338458 ns | 1353166.5 ns | 0.99 |
| shared array/accumulate/2d | 1386812 ns | 1386916 ns | 1.00 |
| shared array/iteration/findall/int | 1815750 ns | 1831541.5 ns | 0.99 |
| shared array/iteration/findall/bool | 1589542 ns | 1577750 ns | 1.01 |
| shared array/iteration/findfirst/int | 1391083 ns | 1387770.5 ns | 1.00 |
| shared array/iteration/findfirst/bool | 1361875 ns | 1340584 ns | 1.02 |
| shared array/iteration/scalar | 151584 ns | 158625 ns | 0.96 |
| shared array/iteration/logical | 2959875 ns | 2945562.5 ns | 1.00 |
| shared array/iteration/findmin/1d | 1464729.5 ns | 1456666 ns | 1.01 |
| shared array/iteration/findmin/2d | 1350417 ns | 1357833 ns | 0.99 |
| shared array/reductions/reduce/1d | 721500 ns | 730167 ns | 0.99 |
| shared array/reductions/reduce/2d | 662104.5 ns | 669417 ns | 0.99 |
| shared array/reductions/mapreduce/1d | 731042 ns | 743312.5 ns | 0.98 |
| shared array/reductions/mapreduce/2d | 661834 ns | 667000 ns | 0.99 |
| shared array/permutedims/4d | 2559146 ns | 2570541.5 ns | 1.00 |
| shared array/permutedims/2d | 1022750 ns | 1026187.5 ns | 1.00 |
| shared array/permutedims/3d | 1594438 ns | 1596125 ns | 1.00 |
| shared array/copy | 243604.5 ns | 245812 ns | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
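The Ratio column appears to be the Current timing divided by the Previous timing, rounded to two decimals, so values above 1 mean the current commit was slower on that benchmark. A minimal sketch checking that reading against three rows of the table (the helper name `ratio` is purely illustrative and not part of github-action-benchmark):

```python
def ratio(current_ns: float, previous_ns: float) -> float:
    """Current / previous timing, rounded to two decimals like the Ratio column."""
    return round(current_ns / previous_ns, 2)


# (current, previous) timings in nanoseconds, copied from the table.
rows = {
    "private array/broadcast": (458833, 455500),
    "private array/iteration/scalar": (3546833, 3920667),
    "latency/precompile": (5775550229, 5246979375),
}

for name, (current, previous) in rows.items():
    print(f"{name}: {ratio(current, previous):.2f}")
```

Under this assumption the three rows reproduce the reported ratios of 1.01, 0.90, and 1.10.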
Please ping when this is ready for review; you've marked it as such, but more commits keep appearing 🙂 Also, were the necessary changes to Clang.jl merged?
I do have a terrible habit of refining my PRs after marking ready. I'll be more mindful of that in the future.
Not yet. JuliaInterop/Clang.jl#519 and JuliaInterop/Clang.jl#522 are general improvements/bug fixes to handle the types of enums that Apple's Objective-C headers love so much (elaborated and attributes). I just created JuliaInterop/Clang.jl#524, which shouldn't be too hard to review once the other two are merged. I just rebased, and any further changes other than review changes will be for a future PR.
Also comment out MTL enums and structs
Revert if function split doesn't end up getting merged
LGTM. We could probably also remove some enums from compiler/library.jl, but that can happen in a different PR.
Co-authored-by: Tim Besard <[email protected]>
Currently works for structs and enums in Metal.framework with a few hacks and this branch of Clang devved: https://github.com/christiangnrd/Clang.jl/tree/objectiveC
Any advice, suggestions, or help welcome!