Releases: diku-dk/futhark
0.15.2
Fixed
- Fixed a REPL regression that made it unable to handle overloaded
  types (such as numeric literals, oops).
- The uniqueness of a record is now the minimum of the uniqueness of
  any of its elements (#870).
- Bug in causality checking has been fixed (#872).
- Invariant memory allocations in scan/reduce operators are now
  supported.
- `futhark run` now performs more type checking on entry point input
  (#876).
- Compiled Futhark programs now check for EOF after the last input
  argument has been read (#877).
- Fixed a bug in `loop` type checking that prevented the result from
  ever aliasing the initial parameter values (#879).
0.15.1
Added
- Futhark now type-checks size annotations using a size-dependent
  type system.
- The parallel code generators can now handle bounds checking and
  other safety checks.
- Integer division by zero is now properly safety-checked and
  produces an error message.
- Integer exponentiation with negative exponent is now properly
  safety-checked and produces an error message.
- Serious effort has been put into improving type errors.
- `reduce_by_index` may be somewhat faster for complex operators on
  histograms that barely fit in local memory.
- Improved handling of in-place updates of multidimensional arrays
  nested in `map`. These are now properly parallelised.
- Added `concat_to` and `flatten_to` functions to prelude.
- Added `indices` function to the prelude.
- `futhark check` and all compilers now take a `-w` option for
  disabling warnings.
- `futhark bench` now accepts `--pass-compiler-option`.
- The integer modules now have `mad_hi` and `mul_hi` functions for
  getting the upper part of multiplications. Thanks to @porcuquine
  for the contribution!
- The `f32` and `f64` modules now also define `sinh`, `cosh`, `tanh`,
  `asinh`, `acosh`, and `atanh` functions.
- The `f32` and `f64` modules now also define `fma` and `mad`
  functions.
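To illustrate what "the upper part of multiplications" means, here is a minimal Python reference model. It is a sketch, not Futhark's implementation: it assumes unsigned operands, a default width of 32 bits, and that `mad_hi a b c` behaves like `mul_hi a b + c` with wraparound (as in OpenCL). The `bits` parameter is hypothetical and exists only for this illustration.

```python
def mul_hi(a: int, b: int, bits: int = 32) -> int:
    """Upper `bits` bits of the 2*bits-wide product of two unsigned integers."""
    mask = (1 << bits) - 1
    # Reduce both inputs to the unsigned range, form the full-width
    # product, and keep only its high half.
    return ((a & mask) * (b & mask)) >> bits


def mad_hi(a: int, b: int, c: int, bits: int = 32) -> int:
    """mul_hi(a, b) + c, wrapped to `bits` bits (assumed semantics)."""
    mask = (1 << bits) - 1
    return (mul_hi(a, b, bits) + c) & mask
```

For small operands the high half is zero, so `mul_hi` only becomes interesting once the full product no longer fits in the original width.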
Removed
- Removed `update`, `split2`, `intersperse`, `intercalate`, `pick`,
  `steps`, and `range` from the prelude.
Changed
- `"futlib"` is now called `"prelude"`, and it is now an error to
  import it explicitly.
Fixed
- Corrected address calculations in `csharp` backend.
- The C backends are now more careful about generating overflowing
  integer operations (since this is undefined behaviour in C, but
  defined in Futhark).
- `futhark dataset` no longer crashes uncontrollably when used
  incorrectly (#849).
0.14.1
Added
- The optimiser is now somewhat better at removing unnecessary
  copies of array slices.
- `futhark bench` and `futhark test` now take a `--concurrency`
  option for limiting how many threads are used for housekeeping
  tasks. Set this to a low value if you run out of memory.
- `random` test blocks are now allowed to contain integer literals
  with type suffixes.
- `:frame <n>` command for `futhark repl` for inspecting the stack.
- `e :> t` notation, which means the same as `e : t` for now, but
  will have looser constraints in the future.
- Size-lifted type abbreviations can be declared with `type~` and
  size-lifted type parameters with `'~`. These currently have no
  significant difference from fully lifted types.
Changed
- Tuples are now 0-indexed (#821, which also includes a conversion
  script).
- Invalid ranges like `1..<0` now produce a run-time error instead
  of an empty array.
- Record updates (`r with f = e`) now require `r` to have a
  completely known type up to `f`. This is a restriction that will
  hopefully be lifted in the future.
- The backtrace format has changed to be innermost-first, like
  pretty much all other languages.
- Value specs must now explicitly quantify all sizes of function
  parameters. Instead of `val sum: []t -> t` you must write
  `val sum [n]: [n]t -> t`.
- `futhark test` now once again numbers un-named data sets from 0
  rather than from 1. This fits a new general principle of always
  numbering from 0 in Futhark.
- Type abbreviations declared with `type` may no longer contain
  functions or anonymous sizes in their definition. Use `type^` for
  these cases. This is just a warning for now, but will be an error
  in the future.
Fixed
- Work around (probable) AMD OpenCL compiler bug for
  `reduce_by_index` operations with complex operators that require
  locking.
- Properly handle another ICE on parse errors in test stanzas (#819).
- `futhark_context_new_with_command_queue()` now actually works. Oops.
- Different scopes are now properly addressed during type inference
  (#838). Realistically, there will still be some missing cases.
0.13.2
Added
- New subcommand, `futhark query`, for looking up information about
  the name at some position in a file. Intended for editor
  integration.
- (Finally) automatic support for compute model 7.5 in the CUDA
  backend.
- Somewhat better performance for very large target arrays for
  `reduce_by_index`.
Fixed
0.13.1
Added
- Stack traces are now multiline for better legibility.
Changed
- The `empty(t)` notation now specifies the type of the entire
  value (not just the element type), and requires dimension sizes
  when `t` is an array (e.g. `empty(i32)` is no longer allowed, you
  need for example `empty([0]i32)`).
- All input files are now assumed to be in UTF-8.
Fixed
0.12.3
Added
- Character literals can now be any integer type.
- The integer modules now have `popc` and `clz` functions.
- Tweaked inlining so that larger programs may now compile faster
  (observed about 20%).
- Pattern-matching on large sum-typed values taken from arrays may
  be a bit faster.
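As a rough illustration of what `popc` (population count) and `clz` (count leading zeros) compute, the following Python sketch models them on fixed-width integers. The `bits` parameter and its default of 32 are assumptions made for this example; in Futhark the width comes from the integer module being used.

```python
def popc(x: int, bits: int = 32) -> int:
    """Number of set bits in the `bits`-wide representation of x."""
    mask = (1 << bits) - 1
    return bin(x & mask).count("1")


def clz(x: int, bits: int = 32) -> int:
    """Number of zero bits above the most significant set bit."""
    mask = (1 << bits) - 1
    # int.bit_length() is the position of the highest set bit, so the
    # remaining high bits are all zero. For x == 0 this yields `bits`.
    return bits - (x & mask).bit_length()
```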
Fixed
- Various small fixes to type errors.
- All internal functions used in generated C code are now properly
  declared `static`.
- Fixed bugs when handling dimensions and aliases in type
  ascriptions.
0.12.2
Added
- New tool: `futhark autotune`, for tuning the threshold parameters
  used by incremental flattening. Based on work by Svend Lund
  Breddam, Simon Rotendahl, and Carl Mathias Graae Larsen.
- New tool: `futhark dataget`, for extracting test input data. Most
  will probably never use this.
- Programs compiled with the `cuda` backend now take options
  `--default-group-size`, `--default-num-groups`, and
  `--default-tile-size`.
- Segmented `reduce_by_index` operations are now substantially
  faster for small histograms.
- New functions: `f32.lerp` and `f64.lerp`, for linear interpolation.
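Linear interpolation can be sketched in a few lines of Python. This is one common formulation, and the argument order (`v0`, `v1`, then the interpolation factor `t`) is an assumption here rather than something the release notes state.

```python
def lerp(v0: float, v1: float, t: float) -> float:
    """Point a fraction t of the way from v0 to v1 (t is expected in [0, 1])."""
    # t = 0 yields v0, t = 1 yields v1, values in between blend linearly.
    return v0 + (v1 - v0) * t
```

For example, `lerp(0.0, 10.0, 0.25)` lands a quarter of the way from 0 to 10.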
Fixed
- Fixes to aliasing of record updates.
- Fixed unnecessary array duplicates after coalescing optimisations.
- `reduce_by_index` nested in `map`s will no longer sometimes
  require huge amounts of memory.
- Source location now correct for unknown infix operators.
- Function parameters are no longer in scope of themselves (#798).
- Fixed a nasty out-of-bounds error in handling of irregular
  allocations.
- The `floor`/`ceil` functions in `f32`/`f64` now handle infinities
  correctly (and are also faster).
- Using `%` on floats now computes fmod instead of crashing the
  compiler.
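For readers unfamiliar with fmod, the sketch below shows the C-style behaviour using Python's `math.fmod`. Whether Futhark's `%` on floats matches C's `fmod` exactly for negative operands is an assumption here; the helper name `float_rem` is made up for this example.

```python
import math


def float_rem(x: float, y: float) -> float:
    """C-style fmod: the remainder takes the sign of the dividend x."""
    return math.fmod(x, y)
```

Note that this differs from Python's own `%` operator on floats, which takes the sign of the divisor: `float_rem(-7.5, 2.0)` is negative, while `-7.5 % 2.0` in Python is positive.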
0.12.1
Added
- The internal representation of parallel constructs has been
  overhauled and many optimisations rewritten. The overall
  performance impact should be neutral on aggregate, but there may
  be changes for some programs (please report if so).
- Futhark now supports structurally typed sum types and pattern
  matching! This work was done by Robert Schenck. There remain
  some problems with arrays of sum types that themselves contain
  arrays.
- Significant reduction in compile time for some large programs.
- Manually specified type parameters need no longer be exhaustive.
- Mapped `rotate` is now simplified better. This can be
  particularly helpful for stencils with wraparound.
Removed
- The `~` prefix operator has been removed. `!` has been extended
  to perform bitwise negation when applied to integers.
Changed
- The `--futhark` option for `futhark bench` and `futhark test` now
  defaults to the binary being used for the subcommands themselves.
- The legacy `futhark -t` option (which did the same as
  `futhark check`) has been removed.
- Lambdas now bind less tightly than type ascription.
- `stream_map` is now `map_stream` and `stream_red` is now
  `reduce_stream`.
Fixed
- `futhark test` now understands `--no-tuning` as it was always
  supposed to.
- `futhark bench` and `futhark test` now interpret `--exclude` in
  the same way.
- The Python and C# backends can now properly read binary boolean
  input.
0.11.2
Fixed
- Entry points whose types are opaque due to module ascription, yet
  whose representation is simple (scalars or arrays of scalars) were
  mistakenly made non-opaque when compiled with `--library`. This
  has been fixed.
- The CUDA backend now supports default sizes in `.tuning` files.
- Loop interchange across multiple dimensions was broken in some
  cases (#767).
- The sequential C# backend now generates code that compiles (#772).
- The sequential Python backend now generates code that runs (#765).
0.11.1
Added
- Segmented scans are a good bit faster.
- `reduce_by_index` has received a new implementation that uses
  local memory, and is now often a good bit faster when the target
  array is not too large.
- The `f32` and `f64` modules now contain `gamma` and `lgamma`
  functions. At present these do not work in the C# backend.
- Some instances of `reduce` with vectorised operators (e.g.
  `map2 (+)`) are orders of magnitude faster than before.
- Memory usage is now lower on some programs (specifically the ones
  that have large `map`s with internal intermediate arrays).
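Since `reduce_by_index` appears throughout these notes, a sequential Python reference model of its semantics (a generalised histogram) may help. This is only a sketch: the real operation runs in parallel and relies on `op` being associative and commutative with neutral element `ne`, which the sequential loop below does not need. The assumption that out-of-bounds indices are ignored is stated in the code.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def reduce_by_index(dest: Sequence[T], op: Callable[[T, T], T], ne: T,
                    indices: Sequence[int], values: Sequence[T]) -> List[T]:
    """Combine each value into the bucket of `dest` named by its index."""
    out = list(dest)  # work on a copy rather than updating in place
    for i, v in zip(indices, values):
        # Assumption: out-of-bounds indices are silently ignored.
        if 0 <= i < len(out):
            out[i] = op(out[i], v)
    # `ne` is unused here; a parallel implementation would use it to
    # initialise per-thread partial histograms.
    return out
```

With addition as the operator and a zero-initialised `dest`, this computes an ordinary histogram of weighted counts.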
Removed
- Size parameters (not annotations) are no longer permitted
  directly in `let` and `loop` bindings, nor in lambdas. You are
  likely not affected (except for the `stream` constructs; see
  below). Few people used this.
Changed
- The array creation functions exported by generated C code now take
  `int64_t` arguments for the shape, rather than `int`. This is in
  line with what the shape functions return.
- The types for `stream_map`, `stream_map_per`, `stream_red`, and
  `stream_red_per` have been changed, such that the chunk function
  now takes the chunk size as the first argument.
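The new calling convention for the chunk function can be pictured with a sequential Python model of `stream_map`. This is a loose sketch under stated assumptions: in Futhark the chunking is chosen by the compiler and runtime, whereas here a hypothetical `chunk_size` parameter makes it explicit, and the result is assumed to be the concatenation of the per-chunk results.

```python
from typing import Callable, List, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def stream_map(f: Callable[[int, Sequence[A]], Sequence[B]],
               xs: Sequence[A], chunk_size: int) -> List[B]:
    """Apply f to consecutive chunks of xs and concatenate the results."""
    out: List[B] = []
    for start in range(0, len(xs), chunk_size):
        chunk = xs[start:start + chunk_size]
        # The chunk size is passed as the first argument, followed by
        # the chunk itself; the final chunk may be shorter.
        out.extend(f(len(chunk), chunk))
    return out
```

A well-behaved chunk function gives the same overall result however the input is split, which is what allows the chunking to be chosen freely.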
Fixed
- Fixes to reading values under Python 3.
- The type of a variable can now be deduced from its use as a size
  annotation.
- The code generated by the C-based backends is now also compilable
  as C++.
- Fix memory corruption bug that would occur on very large segmented
  reductions (large segments, and many of them).