Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Expand documentation, add discussion on counterintuitive behavior (#188)
* Standardizing markdown sections in README converting ### to ##
* splitting README material into documenter sections
* add StaticArray example
* fix typo
* mentioning on-the-fly construction of StructArray entries in overview.md
* discussing mutability for counterintuitive behaviors
* adding counterintuitive behavior docs to make.jl
* adding an extra initialization section
* setting "Overview" as the default doc homepage moving index.md to reference.md, moving overview.md to index.md (and deleting overview.md)
* Apply suggestions from code review
  Co-authored-by: Pietro Vertechi <[email protected]>
* removing make.jl TODOs
* Update docs/src/counterintuitive.md
  Co-authored-by: Pietro Vertechi <[email protected]>

Co-authored-by: Jesse Chan <[email protected]>
Co-authored-by: Pietro Vertechi <[email protected]>
1 parent e0b70ac, commit 0a0032c. Showing 7 changed files with 524 additions and 43 deletions.
@@ -0,0 +1,136 @@
# Advanced techniques

## Structures with non-standard data layout

StructArrays supports structures with custom data layout. To define one, overload `staticschema` to describe the custom layout, `component` to access fields of the custom layout, and `createinstance(T, fields...)` to create an instance of type `T` from its custom fields `fields`. In other words, given `x::T`, `createinstance(T, (component(x, f) for f in fieldnames(staticschema(T)))...)` should successfully return an instance of type `T`.

Here is an example of a type `MyType` whose custom fields are its field `data` together with the fields of its field `rest` (which is a named tuple):

```julia
using StructArrays

struct MyType{T, NT<:NamedTuple}
    data::T
    rest::NT
end

# convenience constructor: keyword arguments become the named tuple `rest`
MyType(x; kwargs...) = MyType(x, values(kwargs))

# custom layout: `data` followed by the fields of `rest`
function StructArrays.staticschema(::Type{MyType{T, NamedTuple{names, types}}}) where {T, names, types}
    return NamedTuple{(:data, names...), Base.tuple_type_cons(T, types)}
end

# access a field of the custom layout
function StructArrays.component(m::MyType, key::Symbol)
    return key === :data ? getfield(m, 1) : getfield(getfield(m, 2), key)
end

# generate an instance of MyType from its custom fields
function StructArrays.createinstance(::Type{MyType{T, NT}}, x, args...) where {T, NT}
    return MyType(x, NT(args))
end

s = [MyType(rand(), a=1, b=2) for i in 1:10]
StructArray(s)
```
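
The round-trip requirement stated above can be checked directly. The following is a minimal sketch on a smaller, hypothetical type `Interval` (not part of StructArrays) that stores a tuple but exposes the assumed custom fields `lo` and `hi`; it only uses the three functions named in this section.

```julia
using StructArrays

# Hypothetical type for illustration: one tuple field, two custom fields.
struct Interval{T}
    bounds::Tuple{T, T}
end

# Custom layout: expose the tuple entries as `lo` and `hi`.
StructArrays.staticschema(::Type{Interval{T}}) where {T} =
    NamedTuple{(:lo, :hi), Tuple{T, T}}

# Access a field of the custom layout.
StructArrays.component(iv::Interval, key::Symbol) =
    key === :lo ? iv.bounds[1] : iv.bounds[2]

# Rebuild an instance from its custom fields.
StructArrays.createinstance(::Type{Interval{T}}, lo, hi) where {T} =
    Interval{T}((lo, hi))

# Round trip: split into components, then rebuild.
iv = Interval((1, 2))
T = typeof(iv)
names = fieldnames(StructArrays.staticschema(T))
rebuilt = StructArrays.createinstance(T, (StructArrays.component(iv, f) for f in names)...)

# The same overloads drive storage in a StructArray.
ivs = StructArray([Interval((i, 2i)) for i in 1:3])
```

If the round trip fails for a type, indexing into a `StructArray` of that type should be expected to fail as well, so this check is a cheap first test when writing the three overloads.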

In the above example, `MyType` was composed of `data` of type `Float64` and `rest` of type `NamedTuple`. In many practical cases involving custom types, the elements are heterogeneous and it is hard for StructArrays to widen the types automatically. The following example demonstrates one way to widen them in that scenario.

```julia
using Tables

# add a source of custom type data
struct Location{U}
    x::U
    y::U
end
struct Region{V}
    area::V
end

s1 = MyType(Location(1, 0), place = "Delhi", rainfall = 200)
s2 = MyType(Location(2.5, 1.9), place = "Mumbai", rainfall = 1010)
s3 = MyType(Region([Location(1, 0), Location(2.5, 1.9)]), place = "North India", rainfall = missing)

s = [s1, s2, s3]
# Now if we try to do StructArray(s)
# we will get an error

function meta_table(iter)
    cols = Tables.columntable(iter)
    meta_table(first(cols), Base.tail(cols))
end

function meta_table(data, rest::NT) where NT<:NamedTuple
    F = MyType{eltype(data), StructArrays.eltypes(NT)}
    return StructArray{F}(; data=data, rest...)
end

meta_table(s)
```

The above strategy has been tested and implemented in [GeometryBasics.jl](https://github.com/JuliaGeometry/GeometryBasics.jl).

## Mutate-or-widen style accumulation

StructArrays provides a function `StructArrays.append!!(dest, src)` (unexported) for "mutate-or-widen" style accumulation. This function can be used via [`BangBang.append!!`](https://juliafolds.github.io/BangBang.jl/dev/#BangBang.append!!) and [`BangBang.push!!`](https://juliafolds.github.io/BangBang.jl/dev/#BangBang.push!!) as well.

`StructArrays.append!!` works like `append!(dest, src)` if `dest` can hold every element type produced by the `src` iterator; i.e., it _mutates_ `dest` in-place:

```julia
julia> dest = StructVector((a=[1], b=[2]))
1-element StructArray(::Array{Int64,1}, ::Array{Int64,1}) with eltype NamedTuple{(:a, :b),Tuple{Int64,Int64}}:
 (a = 1, b = 2)

julia> StructArrays.append!!(dest, [(a = 3, b = 4)])
2-element StructArray(::Array{Int64,1}, ::Array{Int64,1}) with eltype NamedTuple{(:a, :b),Tuple{Int64,Int64}}:
 (a = 1, b = 2)
 (a = 3, b = 4)

julia> ans === dest
true
```

Unlike `append!`, `append!!` can also _widen_ the element type of the `dest` array:

```julia
julia> StructArrays.append!!(dest, [(a = missing, b = 6)])
3-element StructArray(::Array{Union{Missing, Int64},1}, ::Array{Int64,1}) with eltype NamedTuple{(:a, :b),Tuple{Union{Missing, Int64},Int64}}:
 NamedTuple{(:a, :b),Tuple{Union{Missing, Int64},Int64}}((1, 2))
 NamedTuple{(:a, :b),Tuple{Union{Missing, Int64},Int64}}((3, 4))
 NamedTuple{(:a, :b),Tuple{Union{Missing, Int64},Int64}}((missing, 6))

julia> ans === dest
false
```

Since the original array `dest` cannot hold the input, a new array is created (`ans !== dest`).

Combined with [function barriers](https://docs.julialang.org/en/latest/manual/performance-tips/#kernel-functions-1), `append!!` is a useful building block for implementing `collect`-like functions.
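
As a rough illustration of that last point, here is a minimal sketch of a `collect`-like function built on `StructArrays.append!!`. The names `collect_rows` and `collect_rows!` are hypothetical helpers, not part of StructArrays, and the sketch assumes a non-empty iterator of named tuples. Whenever `append!!` returns a new (widened) array, the loop re-enters `collect_rows!` so that the compiler can specialize on the new array type.

```julia
using StructArrays

# Hypothetical helper: collect an iterator of named tuples into a StructArray,
# widening column element types when a row does not fit.
function collect_rows(rows)
    y = iterate(rows)
    y === nothing && throw(ArgumentError("collect_rows needs at least one row"))
    row, state = y
    dest = StructArray([row])
    return collect_rows!(dest, rows, state)
end

# Function barrier: `dest` has a concrete type inside this function.
function collect_rows!(dest, rows, state)
    while true
        y = iterate(rows, state)
        y === nothing && return dest
        row, state = y
        new_dest = StructArrays.append!!(dest, [row])
        # `append!!` mutated `dest` if the row fit; otherwise it returned a
        # widened copy, and we continue in a freshly specialized call.
        new_dest === dest || return collect_rows!(new_dest, rows, state)
    end
end

rows = Any[(a = 1, b = 2), (a = 3, b = 4), (a = missing, b = 6)]
result = collect_rows(rows)
```

Appending one row at a time through a temporary one-element vector keeps the sketch short; a real implementation would more likely append in larger chunks.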

## Using StructArrays in CUDA kernels

It is possible to combine StructArrays with [CUDAnative](https://github.com/JuliaGPU/CUDAnative.jl), in order to create CUDA kernels that work on StructArrays directly on the GPU. Make sure you are familiar with the CUDAnative documentation (esp. kernels with plain `CuArray`s) before experimenting with kernels based on `StructArray`s.

```julia
using CUDAnative, CuArrays, StructArrays
d = StructArray(a = rand(100), b = rand(100))

# move to GPU
dd = replace_storage(CuArray, d)
de = similar(dd)

# a simple kernel, to copy the content of `dd` onto `de`
function kernel!(dest, src)
    i = (blockIdx().x-1)*blockDim().x + threadIdx().x
    if i <= length(dest)
        dest[i] = src[i]
    end
    return nothing
end

threads = 1024
blocks = cld(length(dd),threads)

@cuda threads=threads blocks=blocks kernel!(de, dd)
```

@@ -0,0 +1,70 @@
# Some counterintuitive behaviors

StructArrays doesn't explicitly store any structs; rather, it materializes a struct element on the fly when `getindex` is called. This is typically very efficient; for example, if all the struct fields are `isbits`, then materializing a new struct does not allocate. However, this can lead to counterintuitive behavior when modifying entries of a StructArray.

## Modifying the field of a struct element

```julia
julia> mutable struct Foo{T}
           a::T
           b::T
       end

julia> x = StructArray([Foo(1,2) for i = 1:5])

julia> x[1].a = 10

julia> x # remains unchanged
5-element StructArray(::Vector{Int64}, ::Vector{Int64}) with eltype Foo{Int64}:
 Foo{Int64}(1, 2)
 Foo{Int64}(1, 2)
 Foo{Int64}(1, 2)
 Foo{Int64}(1, 2)
 Foo{Int64}(1, 2)
```
The assignment `x[1].a = 10` first calls `getindex(x,1)`, then sets property `a` of the accessed element. However, since StructArrays constructs `Foo(x.a[1],x.b[1])` on the fly when accessing `x[1]`, setting `x[1].a = 10` modifies the materialized struct rather than the StructArray `x`.

Note that one can modify a field of a StructArray entry via `x.a[1] = 10` (the order of `getproperty` and `getindex` matters). As an added benefit, this does not require that the struct `Foo` is mutable, as it modifies the underlying component array `x.a` directly.

For mutable structs, it is possible to write code that works for both regular `Array`s and `StructArray`s by reading the element, modifying it, and writing it back:
```julia
el = x[1]
el.a = 10
x[1] = el
```

`x[1]` returns the element (for a StructArray, a newly materialized `Foo`), and `el.a = 10` modifies its field `a`. Assigning `el` back to `x[1]` then unpacks `a` and `b` from the modified struct and assigns entries of the component arrays `x.a[1] = a`, `x.b[1] = b`. Note that the chained form `x[1] = x[1].a = 10` does not achieve this: in Julia an assignment expression evaluates to its right-hand side, so it would attempt to store `10` itself in `x[1]`.
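
The behaviors described in this section can be reproduced with a short script. This is a minimal sketch: the type `MutablePoint` and the helper `set_a!` are hypothetical names introduced only for illustration.

```julia
using StructArrays

# Hypothetical mutable element type.
mutable struct MutablePoint
    a::Int
    b::Int
end

# Hypothetical helper: read the element, modify it, write it back.
# Works for any indexable container of mutable elements with a field `a`.
function set_a!(xs, i, value)
    el = xs[i]
    el.a = value
    xs[i] = el
    return xs
end

sa = StructArray([MutablePoint(1, 2) for _ in 1:3])

# Mutates a temporary struct; the StructArray is unchanged.
sa[1].a = 10
unchanged = sa.a[1]

# Writes into the component array; the StructArray changes.
sa.a[1] = 10

# The read-modify-write helper behaves the same on both containers.
set_a!(sa, 2, 20)
v = [MutablePoint(1, 2) for _ in 1:3]
set_a!(v, 2, 20)
```

For a regular `Array`, the final `xs[i] = el` stores the object that is already there, so the helper costs little; for a `StructArray`, it is the step that actually updates the component arrays.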

## Broadcasted assignment for array entries

Broadcasted in-place assignment can also behave counterintuitively for StructArrays.
```julia
julia> using StructArrays, StaticArrays

julia> mutable struct Bar{T} <: FieldVector{2,T}
           a::T
           b::T
       end

julia> x = StructArray([Bar(1,2) for i = 1:5])
5-element StructArray(::Vector{Int64}, ::Vector{Int64}) with eltype Bar{Int64}:
 [1, 2]
 [1, 2]
 [1, 2]
 [1, 2]
 [1, 2]

julia> x[1] .= 1
2-element Bar{Int64} with indices SOneTo(2):
 1
 1

julia> x
5-element StructArray(::Vector{Int64}, ::Vector{Int64}) with eltype Bar{Int64}:
 [1, 2]
 [1, 2]
 [1, 2]
 [1, 2]
 [1, 2]
```
Because `x[1] .= 1` first materializes a `Bar` struct, the broadcasted assignment modifies this new struct rather than the StructArray `x`. Note, however, that `x[1] = x[1] .= 1` works, since `x[1] .= 1` returns the modified struct, which is then assigned to the first entry of `x`.

## Mutable struct types

Each of these counterintuitive behaviors occurs when using StructArrays with mutable elements. However, since the component arrays of a StructArray are generally mutable even if its entries are immutable, a StructArray with immutable elements will in many cases behave identically to (but be more efficient than) a StructArray with mutable elements. Thus, it is recommended to use immutable structs with StructArray whenever possible.
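
To make the recommendation concrete, here is a minimal sketch with an immutable element type. `Particle` is a hypothetical struct used only for illustration; the updates go through the component arrays or replace whole elements, so no mutable struct is needed.

```julia
using StructArrays

# Hypothetical immutable element type.
struct Particle
    pos::Float64
    vel::Float64
end

ps = StructArray([Particle(0.0, 1.0) for _ in 1:4])

# Update one field for every element through its component array.
ps.vel .*= 2.0

# Update one field of one element through its component array.
ps.pos[3] = 7.5

# Replace a whole element by constructing a new immutable struct.
ps[1] = Particle(5.0, ps[1].vel)
```

Since `Particle` has only `isbits` fields, materializing `ps[i]` does not allocate, which is the efficiency advantage mentioned above.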

@@ -0,0 +1,87 @@
## Example usage to store complex numbers

```julia
julia> using StructArrays, Random

julia> Random.seed!(4);

julia> s = StructArray{ComplexF64}((rand(2,2), rand(2,2)))
2×2 StructArray(::Array{Float64,2}, ::Array{Float64,2}) with eltype Complex{Float64}:
 0.680079+0.625239im   0.92407+0.267358im
 0.874437+0.737254im  0.929336+0.804478im

julia> s[1, 1]
0.680079235935741 + 0.6252391193298537im

julia> s.re
2×2 Array{Float64,2}:
 0.680079  0.92407
 0.874437  0.929336

julia> StructArrays.components(s) # obtain all field arrays as a named tuple
(re = [0.680079 0.92407; 0.874437 0.929336], im = [0.625239 0.267358; 0.737254 0.804478])
```

Note that the same approach can be used directly from an `Array` of complex numbers:

```julia
julia> StructArray([1+im, 3-2im])
2-element StructArray(::Array{Int64,1}, ::Array{Int64,1}) with eltype Complex{Int64}:
 1 + 1im
 3 - 2im
```

## Example usage to store a data table

```julia
julia> t = StructArray((a = [1, 2], b = ["x", "y"]))
2-element StructArray(::Array{Int64,1}, ::Array{String,1}) with eltype NamedTuple{(:a, :b),Tuple{Int64,String}}:
 (a = 1, b = "x")
 (a = 2, b = "y")

julia> t[1]
(a = 1, b = "x")

julia> t.a
2-element Array{Int64,1}:
 1
 2

julia> push!(t, (a = 3, b = "z"))
3-element StructArray(::Array{Int64,1}, ::Array{String,1}) with eltype NamedTuple{(:a, :b),Tuple{Int64,String}}:
 (a = 1, b = "x")
 (a = 2, b = "y")
 (a = 3, b = "z")
```

## Example usage with StaticArray elements

```julia
julia> using StructArrays, StaticArrays

julia> x = StructArray([SVector{2}(1,2) for i = 1:5])
5-element StructArray(::Vector{Tuple{Int64, Int64}}) with eltype SVector{2, Int64}:
 [1, 2]
 [1, 2]
 [1, 2]
 [1, 2]
 [1, 2]

julia> A = StructArray([SMatrix{2,2}([1 2;3 4]) for i = 1:5])
5-element StructArray(::Vector{NTuple{4, Int64}}) with eltype SMatrix{2, 2, Int64, 4}:
 [1 2; 3 4]
 [1 2; 3 4]
 [1 2; 3 4]
 [1 2; 3 4]
 [1 2; 3 4]

julia> B = StructArray([SArray{Tuple{2,2,2}}(reshape(1:8,2,2,2)) for i = 1:5]); B[1]
2×2×2 SArray{Tuple{2, 2, 2}, Int64, 3, 8} with indices SOneTo(2)×SOneTo(2)×SOneTo(2):
[:, :, 1] =
 1  3
 2  4

[:, :, 2] =
 5  7
 6  8
```