Rename Assertion/assert to Validator/validate #86

Merged
merged 1 commit into from
Oct 19, 2020
2 changes: 1 addition & 1 deletion docs/make.jl
@@ -11,7 +11,7 @@ makedocs(
],
"API" => [
"Impute" => "api/impute.md",
"Assertions" => "api/assertions.md",
"Validators" => "api/validators.md",
"Filter" => "api/filter.md",
"Imputors" => "api/imputors.md",
"Chain" => "api/chain.md",
2 changes: 1 addition & 1 deletion docs/src/api/functional.md
@@ -1,6 +1,6 @@
# Functional

To reduce verbosity, Impute.jl also provides a functional interface to its `Assertion`s, `Filter`s, `Imputor`s, etc.
To reduce verbosity, Impute.jl also provides a functional interface to its `Validator`s, `Filter`s, `Imputor`s, etc.

Ex)

4 changes: 2 additions & 2 deletions docs/src/api/assertions.md → docs/src/api/validators.md
@@ -1,8 +1,8 @@
# Assertions
# Validators

```@autodocs
Modules = [Impute]
Pages = ["assertions.jl"]
Pages = ["validators.jl"]
Order = [:module, :constant, :type, :function]
```

3 changes: 2 additions & 1 deletion src/Impute.jl
@@ -19,11 +19,12 @@ using LinearAlgebra
using LinearAlgebra: Diagonal

include("utils.jl")
include("assertions.jl")
include("imputors.jl")
include("filter.jl")
include("validators.jl")
include("chain.jl")
include("deprecated.jl")

include("functional.jl")
include("data.jl")

14 changes: 7 additions & 7 deletions src/chain.jl
@@ -1,13 +1,13 @@
const Transform = Union{Assertion, Filter, Imputor}
const Transform = Union{Validator, Filter, Imputor}

"""
Chain{T<:Tuple{Vararg{Transform}}} <: Function

Runs multiple `Assertions`, `Filter` or `Imputor`s on the same data in the order they're
Runs multiple `Validators`, `Filter` or `Imputor`s on the same data in the order they're
provided.

# Fields
* `transforms::Vector{Union{Assertion, Filter, Imputor}}`
* `transforms::Vector{Union{Validator, Filter, Imputor}}`
"""
struct Chain{T<:Tuple{Vararg{Transform}}} <: Function
transforms::T
@@ -16,7 +16,7 @@ end
Chain(transforms::Vector{<:Transform}) = Chain(Tuple(transforms))

"""
Chain(transforms::Union{Assertion, Filter, Imputor}...) -> Chain
Chain(transforms::Union{Validator, Filter, Imputor}...) -> Chain

Creates a Chain using the transforms provided (ordering matters).
"""
@@ -66,9 +66,9 @@ function (C::Chain)(data; kwargs...)
X = trycopy(data)

for t in C.transforms
if isa(t, Assertion)
# Assertions just return the input
assert(X, t; kwargs...)
if isa(t, Validator)
# Validators just return the input
validate(X, t; kwargs...)
elseif isa(t, Filter)
# Filtering doesn't always work in-place
X = apply(X, t; kwargs...)
8 changes: 4 additions & 4 deletions src/functional.jl
@@ -1,4 +1,4 @@
# Generate a functional interface from the Assertion and Imputor types.
# Generate a functional interface from the Validator and Imputor types.
"""
_splitkwargs(::Type{T}, kwargs...) where T -> (imp, rem)

@@ -35,7 +35,7 @@ function _splitkwargs(::Type{Substitute}, kwargs...)
return (Substitute(; kwdef...), rem)
end

const global assertion_methods = (
const global validation_methods = (
threshold = Threshold,
)

@@ -55,12 +55,12 @@ const global imputation_methods = (
knn = KNN,
)

for (func, type) in pairs(assertion_methods)
for (func, type) in pairs(validation_methods)
typename = nameof(type)
@eval begin
function $func(data; kwargs...)
a, rem = _splitkwargs($typename, kwargs...)
return assert(data, a; rem...)
return validate(data, a; rem...)
end
end
end
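
The loop above generates one top-level function per entry in `validation_methods`. A minimal, self-contained sketch of that pattern follows. The `Threshold` and `validate` definitions here are simplified stand-ins (assumptions for illustration, not the package's code), and the keyword splitting done by `_splitkwargs` is omitted.

```julia
# Simplified stand-ins for illustration only; not Impute.jl's actual definitions.
abstract type Validator end

struct Threshold <: Validator
    ratio::Float64
end
Threshold(; ratio=0.1) = Threshold(ratio)

# Throw if the fraction of missing values exceeds the ratio, else return the input.
function validate(data, t::Threshold)
    mratio = count(ismissing, data) / length(data)
    mratio > t.ratio && throw(ArgumentError("missing ratio $mratio exceeds $(t.ratio)"))
    return data
end

const validation_methods = (threshold = Threshold,)

# Define `threshold(data; kwargs...)` by forwarding kwargs to the type's constructor.
for (func, type) in pairs(validation_methods)
    typename = nameof(type)
    @eval function $func(data; kwargs...)
        return validate(data, $typename(; kwargs...))
    end
end

x = [1.0, 2.0, missing]
threshold(x; ratio=0.5)   # returns x: one of three values is missing
```

Keeping the name-to-type mapping in a single `NamedTuple` means a rename like this one only has to touch the table and the function being called inside the generated body.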
54 changes: 27 additions & 27 deletions src/assertions.jl → src/validators.jl
@@ -1,25 +1,25 @@
"""
Assertion
Validator

An Assertion stores settings for checking the validity of a `AbstractArray` or `Tables.table` containing missing values.
New assertions are expected to subtype `Impute.Assertion` and, at minimum,
implement the `_assert(data::AbstractArray{Union{T, Missing}}, ::<MyAssertion>)` method.
A Validator stores settings for checking the validity of an `AbstractArray` or `Tables.table` containing missing values.
New validators are expected to subtype `Impute.Validator` and, at minimum,
implement the `_validate(data::AbstractArray{Union{T, Missing}}, ::<MyValidator>)` method.
"""
abstract type Assertion end
abstract type Validator end

"""
assert(data::AbstractArray, a::Assertion; dims=:)
validate(data::AbstractArray, v::Validator; dims=:)

If the assertion `a` fails then an `ThresholdError` is thrown, otherwise the `data`
provided is returned without mutation. See [`Assertion`](@ref) for the minimum internal
`_assert` call requirements.
If the validator `v` fails then an error is thrown, otherwise the `data`
provided is returned without mutation. See [`Validator`](@ref) for the minimum internal
`_validate` call requirements.

# Arguments
* `data::AbstractArray`: the data to validate along dimensions `dims`
* `a::Assertion`: the assertion to apply
* `v::Validator`: the validator to apply

# Keywords
* `dims`: The dimension to apply the `_assert` along (default is `:`)
* `dims`: The dimension to apply the `_validate` along (default is `:`)

# Returns
* the input `data` if no error is thrown.
@@ -28,38 +28,38 @@ provided is returned without mutation. See [`Assertion`](@ref) for the minimum i
* An error when the test fails

```jldoctest
julia> using Test; using Impute: Threshold, ThresholdError, assert
julia> using Test; using Impute: Threshold, ThresholdError, validate

julia> M = [1.0 2.0 missing missing 5.0; 1.1 2.2 3.3 missing 5.5]
2×5 Array{Union{Missing, Float64},2}:
1.0 2.0 missing missing 5.0
1.1 2.2 3.3 missing 5.5

julia> @test_throws ThresholdError assert(M, Threshold())
julia> @test_throws ThresholdError validate(M, Threshold())
Test Passed
Thrown: ThresholdError
```
"""
function assert(data::AbstractArray, a::Assertion; dims=:, kwargs...)
dims === Colon() && return _assert(data, a; kwargs...)
function validate(data::AbstractArray, a::Validator; dims=:, kwargs...)
dims === Colon() && return _validate(data, a; kwargs...)
d = Impute.dim(data, dims)

for d in eachslice(data; dims=d)
_assert(d, a; kwargs...)
_validate(d, a; kwargs...)
end
return data
end

"""
assert(table, a::Assertion; cols=nothing)
validate(table, v::Validator; cols=nothing)

Applies the assertion `a` to the `table` 1 column at a time; if this is not the desired
behaviour custom `assert` methods should overload this method. See [`Assertion`](@ref) for
the minimum internal `_assert` call requirements.
Applies the validator `v` to the `table` 1 column at a time; if this is not the desired
behaviour custom `validate` methods should overload this method. See [`Validator`](@ref) for
the minimum internal `_validate` call requirements.

# Arguments
* `table`: the data to impute
* `a`: the assertion to apply
* `v`: the validator to apply

# Keyword Arguments
* `cols`: The columns to impute along (default is to impute all columns)
@@ -72,7 +72,7 @@ the minimum internal `_assert` call requirements.

# Example
```jldoctest
julia> using DataFrames, Test; using Impute: Threshold, ThresholdError, assert
julia> using DataFrames, Test; using Impute: Threshold, ThresholdError, validate

julia> df = DataFrame(:a => [1.0, 2.0, missing, missing, 5.0], :b => [1.1, 2.2, 3.3, missing, 5.5])
5×2 DataFrame
@@ -85,21 +85,21 @@ julia> df = DataFrame(:a => [1.0, 2.0, missing, missing, 5.0], :b => [1.1, 2.2,
│ 4 │ missing │ missing │
│ 5 │ 5.0 │ 5.5 │

julia> @test_throws ThresholdError assert(df, Threshold())
julia> @test_throws ThresholdError validate(df, Threshold())
Test Passed
Thrown: ThresholdError
```
"""
function assert(table, a::Assertion; cols=nothing, kwargs...)
istable(table) || throw(MethodError(assert, (table, a)))
function validate(table, v::Validator; cols=nothing, kwargs...)
istable(table) || throw(MethodError(validate, (table, v)))
columntable = Tables.columns(table)

cnames = cols === nothing ? propertynames(columntable) : cols
for cname in cnames
_assert(getproperty(columntable, cname), a; kwargs...)
_validate(getproperty(columntable, cname), v; kwargs...)
end

return table
end

include("assertions/threshold.jl")
include("validators/threshold.jl")
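
The `Validator` docstring in this file says a new validator needs to subtype `Validator` and implement `_validate`. A hedged sketch of what that could look like: `MaxRun` is a hypothetical validator invented for this example, and the abstract type and `validate` wrapper are simplified stand-ins so the snippet runs without the package.

```julia
# Stand-in definitions so the sketch runs on its own; in practice these come from Impute.
abstract type Validator end

# Hypothetical validator: fail when more than `maxrun` consecutive values are missing.
struct MaxRun <: Validator
    maxrun::Int
end

function _validate(data::AbstractArray{Union{T, Missing}}, v::MaxRun) where T
    streak = 0
    for x in data
        streak = ismissing(x) ? streak + 1 : 0
        streak > v.maxrun && throw(ArgumentError("more than $(v.maxrun) consecutive missing values"))
    end
    return data
end

# Simplified wrapper: validate the whole array, or each slice along an integer `dims`.
function validate(data::AbstractArray, v::Validator; dims=:)
    dims === Colon() && return _validate(data, v)
    for slice in eachslice(data; dims=dims)
        _validate(slice, v)
    end
    return data
end

x = [1.0, missing, missing, 4.0]
validate(x, MaxRun(2))   # returns x unchanged
```

As in the diff, the public `validate` returns its input on success, which is what lets a `Chain` call it purely for its side effect of throwing.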
4 changes: 2 additions & 2 deletions src/assertions/threshold.jl → src/validators/threshold.jl
@@ -31,14 +31,14 @@ If a weights array is provided then the ratio will be calculated as the
* `weights::AbstractWeights`: A set of statistical weights to use when evaluating the importance
of each observation. If present a weighted ratio of missing values will be calculated.
"""
struct Threshold <: Assertion
struct Threshold <: Validator
ratio::Float64
weights::Union{AbstractWeights, Nothing}
end

Threshold(; ratio=0.1, weights=nothing) = Threshold(ratio, weights)

function _assert(data::AbstractArray{Union{T, Missing}}, t::Threshold) where T
function _validate(data::AbstractArray{Union{T, Missing}}, t::Threshold) where T
mratio = if t.weights === nothing
count(ismissing, data) / length(data)
else
6 changes: 3 additions & 3 deletions test/runtests.jl
@@ -36,18 +36,18 @@ using Impute:
Threshold,
ThresholdError,
apply,
assert,
impute,
impute!,
interp,
run,
threshold
threshold,
validate


@testset "Impute" begin
include("testutils.jl")

include("assertions.jl")
include("validators.jl")
include("chain.jl")
include("data.jl")
include("deprecated.jl")
36 changes: 18 additions & 18 deletions test/assertions.jl → test/validators.jl
@@ -1,4 +1,4 @@
@testset "Assertions" begin
@testset "Validators" begin
# Defining our missing datasets
a = allowmissing(1.0:1.0:20.0)
a[[2, 3, 7]] .= missing
@@ -22,14 +22,14 @@

@testset "Base" begin
t = Threshold(; ratio=0.1)
@test_throws ThresholdError assert(a, t)
@test_throws ThresholdError assert(m, t)
@test_throws ThresholdError assert(aa, t)
@test_throws ThresholdError assert(table, t)
@test_throws ThresholdError validate(a, t)
@test_throws ThresholdError validate(m, t)
@test_throws ThresholdError validate(aa, t)
@test_throws ThresholdError validate(table, t)

# Test showerror
msg = try
assert(a, t)
validate(a, t)
catch e
sprint(showerror, e)
end
@@ -38,34 +38,34 @@

t = Threshold(; ratio=0.8)
# Use isequal because we expect the results to contain missings
@test isequal(assert(a, t), a)
@test isequal(assert(m, t), m)
@test isequal(assert(aa, t), aa)
@test isequal(assert(table, t), table)
@test isequal(validate(a, t), a)
@test isequal(validate(m, t), m)
@test isequal(validate(aa, t), aa)
@test isequal(validate(table, t), table)
end

@testset "Weighted" begin
# If we use an exponentially weighted context then we won't pass the limit
# because missing earlier observations is less important than later ones.
t = Threshold(; ratio=0.8, weights=eweights(20, 0.3))
@test isequal(assert(a, t), a)
@test isequal(assert(table, t), table)
@test isequal(validate(a, t), a)
@test isequal(validate(table, t), table)

@test isequal(threshold(m; ratio=0.8, weights=eweights(5, 0.3), dims=:cols), m)
@test isequal(threshold(m; ratio=0.8, weights=eweights(5, 0.3), dims=:cols), aa)

# If we reverse the weights such that earlier observations are more important
# then our previous limit of 0.2 won't be enough to succeed.
t = Threshold(; ratio=0.1, weights=reverse!(eweights(20, 0.3)))
@test_throws ThresholdError assert(a, t)
@test_throws ThresholdError assert(table, t)
@test_throws ThresholdError validate(a, t)
@test_throws ThresholdError validate(table, t)

t = Threshold(; ratio=0.1, weights=reverse!(eweights(5, 0.3)))
@test_throws ThresholdError assert(m, t; dims=:cols)
@test_throws ThresholdError assert(aa, t; dims=:cols)
@test_throws ThresholdError validate(m, t; dims=:cols)
@test_throws ThresholdError validate(aa, t; dims=:cols)

@test_throws DimensionMismatch assert(a[1:10], t)
@test_throws DimensionMismatch assert(m[1:3, :], t; dims=:cols)
@test_throws DimensionMismatch validate(a[1:10], t)
@test_throws DimensionMismatch validate(m[1:3, :], t; dims=:cols)
end

@testset "functional" begin