Releases: utahplt/gtp-benchmarks
v9.3
v9.2 minor take5 changes
- Replace the module+ main with a plain expression. Having the submodule is a problem for tools like the contract profiler (a minor problem, but it's easier to drop the submodule).
- Add an assert around the call to random because its type no longer guarantees nonnegative numbers. (The old type was unsound but fine to use here.)
v9.0
Substantially revise acquire and take5. Before, acquire ran a game with AI players that all raised exceptions and take5 ignored an input list of AI players. After, the acquire players make valid moves and take5 uses its input. These changes do not affect the typed/untyped overhead.
v8.0
Remove racket/sandbox dependency from acquire and remove the player AI that times out.
Performance is similar before and after in a first test. But in general, this change should make acquire measurements more stable. We care about the cost of types, not of system calls.
Data:
acquire-sandbox.tar.gz
v7.0
Fix a return value in lnm. Before, it was a port. After, it's void.
Affects the module benchmarks/lnm/untyped/modulegraph.rkt and the function ensure-tikz.
There is no change in performance.
Original issue report: https://github.com/bennn/gtp-benchmarks/issues/25
v6.0
Major Changes
Edited all benchmarks so that typed and untyped code are very similar.
If you compare any two typed/A.rkt and untyped/A.rkt files, the only differences should be the requires and the type annotations.
Example: gregor
In at least one place, the untyped gregor code had an extra assert. It's gone now.
diff --git a/benchmarks/gregor/untyped/date.rkt b/benchmarks/gregor/untyped/date.rkt
index a3102a9..6ceccb7 100644
--- a/benchmarks/gregor/untyped/date.rkt
+++ b/benchmarks/gregor/untyped/date.rkt
@@ -63,7 +64,6 @@
(define date->ymd Date-ymd)
;(: date->jdn (-> Any Integer))
(define (date->jdn d)
- (unless (Date? d) (error "date->jdn type error"))
(Date-jdn d))
Example: lnm
Typed lnm now uses asserts instead of casts to validate input data. Untyped lnm uses the same asserts.
diff --git a/benchmarks/lnm/typed/spreadsheet.rkt b/benchmarks/lnm/typed/spreadsheet.rkt
index dd2dcbc..b869929 100644
--- a/benchmarks/lnm/typed/spreadsheet.rkt
+++ b/benchmarks/lnm/typed/spreadsheet.rkt
@@ -62,7 +62,7 @@
(void)
;; For each row, print the config ID and all the values
(for ([(row n) (in-indexed vec)])
- (void (natural->bitstring (cast n Index) #:pad (log2 num-configs)))
+ (void (natural->bitstring (assert n index?) #:pad (log2 num-configs)))
(for ([v row]) (void "~a~a" sep v))
(void)))
@@ -71,8 +71,18 @@
(define (rktd->spreadsheet input-filename
#:output [output #f]
#:format [format 'tab])
- (define vec (cast (file->value input-filename) (Vectorof (Listof Index))))
+ (define vec
+ (for/vector : (Vectorof (Listof Index)) ((x (in-vector (assert (file->value input-filename) vector?))))
+ (listof-index x)))
(define suffix (symbol->extension format))
(define out (or output (path-replace-suffix input-filename suffix)))
(define sep (symbol->separator format))
(vector->spreadsheet vec out sep))
+
+(: listof-index (-> Any (Listof Index)))
+(define (listof-index x)
+ (if (and (list? x)
+ (andmap index? x))
+ x
+ (error 'listof-index)))
diff --git a/benchmarks/lnm/untyped/spreadsheet.rkt b/benchmarks/lnm/untyped/spreadsheet.rkt
index 18be330..6466fb0 100644
--- a/benchmarks/lnm/untyped/spreadsheet.rkt
+++ b/benchmarks/lnm/untyped/spreadsheet.rkt
@@ -14,6 +14,7 @@
;; ----------------------------------------------------------------------------
(require
+ "../base/untyped.rkt"
(only-in racket/file file->value)
(only-in "bitstring.rkt" log2 natural->bitstring)
)
@@ -55,7 +56,7 @@
(void)
;; For each row, print the config ID and all the values
(for ([(row n) (in-indexed vec)])
- (void (natural->bitstring n #:pad (log2 num-configs)))
+ (void (natural->bitstring (assert n index?) #:pad (log2 num-configs)))
(for ([v row]) (void "~a~a" sep v))
(void)))
@@ -64,8 +65,16 @@
(define (rktd->spreadsheet input-filename
#:output [output #f]
#:format [format 'tab])
- (define vec (file->value input-filename))
+ (define vec
+ (for/vector ((x (in-vector (assert (file->value input-filename) vector?))))
+ (listof-index x)))
(define suffix (symbol->extension format))
(define out (or output (path-replace-suffix input-filename suffix)))
(define sep (symbol->separator format))
(vector->spreadsheet vec out sep))
+
+(define (listof-index x)
+ (if (and (list? x)
+ (andmap index? x))
+ x
+ (error 'listof-index)))
results (on Racket 7.7 BC release)
For most benchmarks, performance is the same before & after. But:
- lnm has lower overhead
- quadT has higher overhead
- quadU has higher overhead
lnm typed code is much faster now (down from ~4.5s to 0.7s) because it uses assert instead of cast. The vector casts in spreadsheet.rkt and summary.rkt cost a little --- putting them back adds 1.5s and 0.5s, respectively. But the big savings comes from replacing (cast .... Index) with (assert .... index?) in bitstring.rkt --- reverting adds almost 2.5s.
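The two forms compared above can be sketched side by side in Typed Racket. This is a minimal illustration, not code from the benchmark: the helper names to-index/cast and to-index/assert are hypothetical, and only cast, assert, Index, and index? come from the notes above. cast protects its argument with a contract generated from the type, while assert simply applies the predicate and lets the type checker refine the result.

```racket
#lang typed/racket

;; Hypothetical helpers for illustration; they are not part of gtp-benchmarks.

(: to-index/cast (-> Any Index))
(define (to-index/cast n)
  ;; cast checks n at run time against a contract built from the type Index
  (cast n Index))

(: to-index/assert (-> Any Index))
(define (to-index/assert n)
  ;; assert calls the predicate index? directly and raises an error on #f
  (assert n index?))
```

Both helpers return their argument unchanged when it is a valid index and raise an exception otherwise; the difference is only in how the check is performed.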
Both the untyped and fully-typed quad configurations run faster now, which likely makes the mixed configurations look worse. One reason for the change is that quad? is a simple function instead of a define-predicate ... but things are harder to tease apart. (There are few changes to the main files, so things must be happening related to the base/ context, and that's hard to swap out & test.)
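For readers unfamiliar with the distinction, here is a rough sketch of a generated predicate next to a handwritten one. The notes do not show the real definition of quad?, so the type Pair-Quad and both predicate names below are hypothetical stand-ins chosen only to make the example self-contained.

```racket
#lang typed/racket

;; Hypothetical stand-in type; the benchmark's actual quad type is not shown here.
(define-type Pair-Quad (Pairof Symbol (Listof Any)))

;; Generated predicate: Typed Racket derives the run-time check from the type.
(define-predicate generated-quad? Pair-Quad)

;; Handwritten predicate: a plain function performing the same first-order check.
(: manual-quad? (-> Any Boolean))
(define (manual-quad? v)
  (and (pair? v)
       (symbol? (car v))
       (list? (cdr v))))
```

The two predicates accept the same values in this sketch; whether they cost the same at run time depends on the code Typed Racket generates for the type, which is the kind of difference the paragraph above points to.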
Full data & plots here: gtp-benchmarks-v5-vs-v6.tar.gz
Raw gtp-measure output: manifest-v6.tar.gz
v5.0
Fix one bug in lnm and one bug in zordoz.
lnm
The typed lnm code performs an extra cast to satisfy the type checker, BUT the code doing the cast had a use-before-definition bug. That bug is fixed, and now the typed & untyped code compute the same plots.
Pull request, with more details on the issue:
https://github.com/bennn/gtp-benchmarks/pull/19
This change improves performance a little. I guess plot throwing & handling an exception is more expensive than computing the next point to draw.
zordoz
The typed zordoz contained an unused call to format. This call is gone now, so (hopefully) the typed & untyped benchmarks are now running the same code.
Pull request:
https://github.com/bennn/gtp-benchmarks/pull/20
Unfortunately this change has BIG implications for performance. That format call must have been executed often and suffered from runtime checks / wrappers.
- old typed/untyped ratio = 10.91x
- new typed/untyped ratio = 1.36x
The new zordoz now has worst-case <4x overhead. Before, things went up to 14x. Many thanks to @camoy for finding this small-looking error that introduced large overhead in typed code.
data for plots
Thank you Cameron Moy
v4.0
Replace a cast in the typed version of zombie with a predicate test.
The untyped code now uses the same predicate.
zombie is now a better gradual typing benchmark because less of its typed/untyped performance changes are explained by a call to cast.
EDIT: here's some data collected with Racket 7.4
- old typed/untyped ratio = 4.37x
- new typed/untyped ratio = 1.83x
plot of old (zombie-3) vs new (zombie-4) showing that the new version has MORE configurations that suffer LESS overhead
full data behind the plot:
zombie-v4.tar.gz
Thank you Sam Tobin-Hochstadt and Cameron Moy
v3.0
Fix an issue with the untyped zordoz code.
Before, two untyped modules imported from a typed library. After, the untyped code imports the untyped library.
This change removes an unnecessary boundary, making the untyped code a more realistic baseline for measuring Typed Racket's overhead.
The following plot compares the overhead in zordoz for version 2 (zordoz-v2) and version 3 (zordoz-v3) of the GTP benchmarks. Version 3 is significantly worse:
Full results:
zordoz-gtp-2-vs-3.tar.gz
Thank you Cameron Moy
v2.0
Fix a difference between the typed and untyped mbta code. Both are the same now.
The fix does not appear to affect performance.
Attached data:
- mbta2-vs-orig.tar.gz: output from a gtp-measure run comparing 0-mbta (after the change) to 1-mbtaorig (before). Also a tab-separated file with 95% confidence intervals for each configuration
- picture of overhead before (1-mbtaorig) and after (0-mbta)
Thank you Robby Findler and Sam Sundar