[L0] Phase 2 of Counter-Based Event Implementation #1698
This does not compile with the L0 adapter enabled. Also, feel free to add a relevant benchmark scenario to https://github.com/oneapi-src/unified-runtime/blob/main/.github/scripts/compute_benchmarks.py, or just run the existing benchmark with whatever environment variables are needed. You can run these from: https://github.com/oneapi-src/unified-runtime/actions/workflows/benchmarks_compute.yml You can reach out to me if you need help or advice.
@pbalcer It should compile now; I am still working out some of the e2e tests that are failing.
@winstonzhang-intel, please link the intel/llvm PR related to this issue so we can see the full e2e test results.
c63a309 to caeceb4
PR in UR: oneapi-src/unified-runtime#1698 Signed-off-by: Winston Zhang <[email protected]>
Compute Benchmarks level_zero run (with params: --env UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1 --env UR_L0_USE_DRIVER_INORDER_LISTS=1):
Benchmark Results
api_overhead_benchmark_sycl, mean execution time per 10 kernels (μs)
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0) Imm-CmdLists-OFF: This PR 38.675 μs, baseline 38.357 μs
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0) Imm-CmdLists-OFF: This PR 36.082 μs, baseline 36.972 μs
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0): This PR 40.549 μs, baseline 41.505 μs
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0): This PR 40.023 μs, baseline 41.129 μs
Details

SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0) Imm-CmdLists-OFF
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
Command: /home/test-user/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0) Imm-CmdLists-OFF
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
Command: /home/test-user/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
Command: /home/test-user/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
Environment Variables: UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
Command: /home/test-user/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders
Output: TestCase,Mean,Median,StdDev,Min,Max,Type
dfb7663 to 1768740
d1e1ed0 to a7249fa
lgtm
Compute Benchmarks level_zero run (--env UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1 --env UR_L0_USE_DRIVER_INORDER_LISTS=1):
Summary: result is better
Charts

api_overhead_benchmark_sycl SubmitKernel out of order
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
This PR: 48.362 μs; baseline: 50.631 μs

api_overhead_benchmark_sycl SubmitKernel in order
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
This PR: 47.024 μs; baseline: 49.385 μs

api_overhead_benchmark_ur SubmitKernel out of order
SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
This PR: 31.312 μs; baseline: 31.93 μs

api_overhead_benchmark_ur SubmitKernel in order
SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
This PR: 25.546 μs; baseline: 28.586 μs

memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100)
This PR: 424.685 μs; baseline: 423.457 μs

memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100)
This PR: 261.384 μs; baseline: 253.906 μs

memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024
QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB)
This PR: 10.089 μs; baseline: 9.179 μs

memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240
StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device)
This PR: 3.002 μs; baseline: 1.854 μs

api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0)
This PR: 2.143 μs; baseline: 4.506 μs

api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1)
This PR: 2.096 μs; baseline: 3.613 μs

miscellaneous_benchmark_sycl VectorSum
VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256)
This PR: 858.416 μs; baseline: 863.651 μs

Velocity-Bench Hashtable
hashtable
This PR: 207.852567 M keys/sec; baseline: 178.291413 M keys/sec

Velocity-Bench Bitcracker
bitcracker
This PR: 35.6076 s; baseline: 35.8407 s

Velocity-Bench CudaSift
cudaSift
This PR: 256.843 ms; baseline: 283.294 ms

Velocity-Bench Easywave
easywave
This PR: 446 ms; baseline: 457.0 ms

Velocity-Bench QuickSilver
QuickSilver
This PR: 90.08 MMS/CTT; baseline: 115.63 MMS/CTT

Velocity-Bench Sobel Filter
sobel_filter
This PR: 985.857 ms; baseline: 934.963 ms
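The charts above report a "This PR" value and a "baseline" value per benchmark. A minimal sketch of how the relative change could be computed from three of those pairs follows. The `Result` class and the `lower_is_better` flag are hypothetical names introduced here, and the direction of "better" is an assumption: lower for the time-based results, higher for the Hashtable throughput in M keys/sec.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Result:
    """One chart entry: the PR value and the baseline value (hypothetical type)."""
    name: str
    this_pr: float
    baseline: float
    # Assumption: times are lower-is-better, throughput is higher-is-better.
    lower_is_better: bool = True

    def relative_change(self) -> float:
        """Signed fractional change of this PR relative to the baseline."""
        return (self.this_pr - self.baseline) / self.baseline

    def improved(self) -> bool:
        change = self.relative_change()
        return change < 0 if self.lower_is_better else change > 0


# Values copied from the charts above.
RESULTS = [
    Result("api_overhead_benchmark_ur SubmitKernel in order (μs)", 25.546, 28.586),
    Result("memory_benchmark_sycl StreamMemory Triad (μs)", 3.002, 1.854),
    Result("Velocity-Bench Hashtable (M keys/sec)", 207.852567, 178.291413,
           lower_is_better=False),
]


def summarize(results: List[Result]) -> List[str]:
    """Format each result as 'name: +x.x% (better | not better)'."""
    lines = []
    for r in results:
        verdict = "better" if r.improved() else "not better"
        lines.append(f"{r.name}: {r.relative_change():+.1%} ({verdict})")
    return lines


if __name__ == "__main__":
    for line in summarize(RESULTS):
        print(line)
```

Single-run differences of a few percent may be within measurement noise, so a comparison like this is only a first reading of the numbers.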
Details

SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Device --destinationPlacement=Device --size=1024 --count=100
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Host --destinationPlacement=Device --size=1024 --count=100
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueMemcpy --csv --noHeaders --iterations=10000 --sourcePlacement=Device --destinationPlacement=Device --size=1024
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=StreamMemory --csv --noHeaders --iterations=10000 --type=Triad --size=10240 --memoryPlacement=Device --useEvents=0 --contents=Zeros
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=0 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Device --dst=Device --size=1024
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=1 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Host --dst=Host --size=1024
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256)
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/compute-benchmarks-build/bin/miscellaneous_benchmark_sycl --test=VectorSum --csv --noHeaders --iterations=1000 --numberOfElementsX=512 --numberOfElementsY=256 --numberOfElementsZ=256
Output: TestCase,Mean,Median,StdDev,Min,Max,Type

hashtable
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/hashtable/hashtable_sycl --no-verify
Output: hashtable - total time for whole calculation: 0.645735 s

bitcracker
Environment Variables: UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS=1
Command: /home/test-user/bench_workdir/bitcracker/bitcracker -f /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000
Output: ---------> BitCracker: BitLocker password cracking tool <--------- ==================================
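Each Details entry above pairs an environment variable with a benchmark command whose CSV output uses the columns `TestCase,Mean,Median,StdDev,Min,Max,Type`. As a rough sketch of how such a run could be reproduced locally, the helpers below launch a binary with extra environment variables and parse the CSV rows. The function names (`build_env`, `parse_rows`, `run_benchmark`) are hypothetical and are not taken from `compute_benchmarks.py`. The sketch also assumes one result row per line and that a repeated header line can be skipped.

```python
import csv
import io
import os
import subprocess
from typing import Dict, List, Optional

# Column names as shown in the "Output:" lines of the benchmark report.
CSV_FIELDS = ["TestCase", "Mean", "Median", "StdDev", "Min", "Max", "Type"]


def build_env(overrides: Dict[str, str],
              base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the base environment with the overrides applied."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def parse_rows(stdout: str) -> List[Dict[str, str]]:
    """Parse CSV result rows, skipping the header and malformed lines."""
    rows = []
    for row in csv.reader(io.StringIO(stdout)):
        if len(row) != len(CSV_FIELDS) or row == CSV_FIELDS:
            continue
        rows.append(dict(zip(CSV_FIELDS, row)))
    return rows


def run_benchmark(binary: str, args: List[str],
                  overrides: Dict[str, str]) -> List[Dict[str, str]]:
    """Run one benchmark binary with extra env vars and return parsed rows."""
    proc = subprocess.run([binary, *args], env=build_env(overrides),
                          capture_output=True, text=True, check=True)
    return parse_rows(proc.stdout)
```

A call might look like `run_benchmark("./api_overhead_benchmark_ur", ["--test=SubmitKernel", "--csv", "--noHeaders", "--Ioq=1"], {"UR_L0_USE_DRIVER_COUNTER_BASED_EVENTS": "1", "UR_L0_USE_DRIVER_INORDER_LISTS": "1"})`, where the binary path is a placeholder and the flags and variables are the ones shown in the report.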
- enable counter-based events for regular command lists
- counter-based events may be reused even though they are not done
- when an event's ref count drops to the value that means it is not used by external clients, the event may be reused by subsequent calls
- move events that are no longer externally visible to a reusable pool and reuse those more aggressively
Signed-off-by: Winston Zhang <[email protected]>
0059bb6 to 35de324
35de324 to ad11182
PR in UR: oneapi-src/unified-runtime#1698 Signed-off-by: Winston Zhang <[email protected]>
- enable counter-based events for regular command lists
- counter-based events may be reused even though they are not done
- when an event's ref count drops to the value that means it is not used by external clients, the event may be reused by subsequent calls
- move events that are no longer externally visible to a reusable pool and reuse those more aggressively
intel/llvm PR: intel/llvm#14754
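The reuse rule in the description can be illustrated with a small model. This is only a sketch in Python and does not reflect the adapter's actual C++ code: `Event`, `EventPool`, `acquire`, and `release` are hypothetical names, and "not used by external clients" is simplified to an external reference count of zero. It encodes one rule from the description, that a counter-based event returns to the reusable pool as soon as it is no longer externally referenced, even if it is not done. That a regular event must also have completed before reuse is an assumption added here for contrast.

```python
from collections import deque


class Event:
    """Toy event record (hypothetical, not the adapter's event type)."""

    def __init__(self, event_id: int, counter_based: bool):
        self.event_id = event_id
        self.counter_based = counter_based
        self.external_refs = 0   # references held by external clients
        self.completed = False   # whether the device has signaled the event


class EventPool:
    """Keeps released events and hands them out again when allowed."""

    def __init__(self):
        # Separate reusable queues for counter-based and regular events.
        self._reusable = {True: deque(), False: deque()}
        self._next_id = 0

    def acquire(self, counter_based: bool = True) -> Event:
        queue = self._reusable[counter_based]
        if queue:
            event = queue.popleft()          # reuse before creating
        else:
            event = Event(self._next_id, counter_based)
            self._next_id += 1
        event.external_refs = 1
        event.completed = False
        return event

    def release(self, event: Event) -> None:
        if event.external_refs <= 0:
            raise ValueError("event released more times than acquired")
        event.external_refs -= 1
        if event.external_refs > 0:
            return                           # still externally visible
        # Counter-based events may be reused even though they are not done;
        # regular events are only recycled here once they have completed.
        if event.counter_based or event.completed:
            self._reusable[event.counter_based].append(event)


if __name__ == "__main__":
    pool = EventPool()
    first = pool.acquire(counter_based=True)
    pool.release(first)                      # not completed, still reusable
    second = pool.acquire(counter_based=True)
    print(second is first)
```

In this model, reusing events more aggressively means the pool hands back an existing event whenever one is available, so fewer new events are created per submission.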