Allow arbitrary precision number formats (e.g., Float32) #43

Closed
wants to merge 2 commits from rh/issue-42

Conversation


@rikhuijzer (Member) commented Jan 31, 2024

While trying to solve #42, it turned out not to be so easy. I fixed two things along the way, but ran into a third situation: many defaults are set to Float64. For example,

julia> Bernoulli()
Bernoulli{Float64}(p=0.5)
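
For comparison, here is a sketch of what an explicitly typed parameter gives, assuming Distributions.jl's parametric constructors (the output is written out from the usual show format, not copied from a session):

julia> Bernoulli(0.5f0)  # a Float32 parameter gives a Float32 distribution
Bernoulli{Float32}(p=0.5f0)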

This is the third situation that @testset "Issue 42" is now stuck on:

MethodError: no method matching delbeta!(::GLM.DensePredChol{Float64, CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}, ::Vector{Float32}, ::Vector{Float32})

Closest candidates are:
  delbeta!(::GLM.SparsePredChol{T}, ::Vector{T}, ::Vector{T}) where T
   @ GLM ~/.julia/packages/GLM/vM20T/src/linpred.jl:213
  delbeta!(::GLM.DensePredChol{T, <:Cholesky}, ::Vector{T}, ::Vector{T}) where T<:Union{Float32, Float64}
   @ GLM ~/.julia/packages/GLM/vM20T/src/linpred.jl:151
  delbeta!(::GLM.SparsePredChol{T}, ::Vector{T}) where T
   @ GLM ~/.julia/packages/GLM/vM20T/src/linpred.jl:220
  ...

Stacktrace:
 [1] _fit!(m::GLM.GeneralizedLinearModel{GLM.GlmResp{Vector{Float32}, Bernoulli{Float32}, GLM.LogitLink}, GLM.DensePredChol{Float64, CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, verbose::Bool, maxiter::Int64, minstepfac::Float64, atol::Float64, rtol::Float64, start::Nothing)
   @ GLM ~/.julia/packages/GLM/vM20T/src/glmfit.jl:372
 [2] #fit!#19
   @ ~/.julia/packages/GLM/vM20T/src/glmfit.jl:462 [inlined]
 [3] fit!
   @ ~/.julia/packages/GLM/vM20T/src/glmfit.jl:429 [inlined]
 [4] fit(::Type{GLM.GeneralizedLinearModel}, X::Matrix{Float64}, y::Vector{Float32}, d::Bernoulli{Float32}, l::GLM.LogitLink; dropcollinear::Bool, dofit::Bool, wts::Vector{Float32}, offset::Vector{Float32}, fitargs::@Kwargs{maxiter::Int64, atol::Float64, rtol::Float64, minstepfac::Float64})
   @ GLM ~/.julia/packages/GLM/vM20T/src/glmfit.jl:584
 [5] fit(::Type{GLM.GeneralizedLinearModel}, ::StatsModels.FormulaTerm{StatsModels.Term, Tuple{StatsModels.Term, StatsModels.Term, StatsModels.ConstantTerm{Int64}}}, ::Tables.MatrixTable{Matrix{Float32}}, ::Bernoulli{Float32}, ::Vararg{Any}; contrasts::Dict{Symbol, Any}, kwargs::@Kwargs{offset::Vector{Float32}, maxiter::Int64, atol::Float64, rtol::Float64, minstepfac::Float64, wts::Vector{Float32}})
   @ StatsModels ~/.julia/packages/StatsModels/syVEq/src/statsmodel.jl:88
 [6] glm(::StatsModels.FormulaTerm{StatsModels.Term, Tuple{StatsModels.Term, StatsModels.Term, StatsModels.ConstantTerm{Int64}}}, ::Tables.MatrixTable{Matrix{Float32}}, ::Bernoulli{Float32}, ::Vararg{Any}; kwargs::@Kwargs{offset::Vector{Float32}, maxiter::Int64, atol::Float64, rtol::Float64, minstepfac::Float64, wts::Vector{Float32}})
   @ GLM ~/.julia/packages/GLM/vM20T/src/glmfit.jl:604
 [7] fit(model::LinearBinaryClassifier, verbosity::Int64, X::@NamedTuple{a::Vector{Float32}, b::Vector{Float32}}, y::CategoricalArrays.CategoricalVector{Bool, UInt32, Bool, CategoricalArrays.CategoricalValue{Bool, UInt32}, Union{}}, w::Nothing)
   @ MLJGLMInterface ~/git/MLJGLMInterface.jl/src/MLJGLMInterface.jl:445
 [8] fit(model::LinearBinaryClassifier, verbosity::Int64, X::@NamedTuple{a::Vector{Float32}, b::Vector{Float32}}, y::CategoricalArrays.CategoricalVector{Bool, UInt32, Bool, CategoricalArrays.CategoricalValue{Bool, UInt32}, Union{}})
   @ MLJGLMInterface ~/git/MLJGLMInterface.jl/src/MLJGLMInterface.jl:433
 [9] top-level scope
   @ REPL[65]:1

So I propose to wait for @tiemvanderdeure's answer to the question of how much 32-bit support is needed, and otherwise to just change the code to promote to 64-bit automatically. Performance and memory efficiency are usually not critical for GLM.jl, I would guess. R's core math library also defaults to Float64 (see for example https://github.com/SurajGupta/r-source/blob/master/src/nmath/dnt.c, which shows that the distribution logic operates on doubles by default).
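
For concreteness, a minimal sketch of what automatic promotion could look like; promote_inputs is a hypothetical helper, not existing GLM.jl or MLJGLMInterface.jl API:

julia> promote_inputs(v::AbstractVector{<:AbstractFloat}) = Float64.(v);  # promote float columns

julia> promote_inputs(v::AbstractVector) = v;  # pass non-float columns through unchanged

julia> promote_inputs(Float32[0.5, 0.25])
2-element Vector{Float64}:
 0.5
 0.25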

@rikhuijzer changed the title from "Trigger CI" to "Allow arbitrary precision number formats (e.g., Float32)" on Jan 31, 2024
@tiemvanderdeure (Contributor)

For my use case, promoting to Float64 automatically would be a fine solution.

@rikhuijzer (Member, Author)

I've thought a bit more about it, and it's probably better for clients to promote to Float64 on their side. Automatic promotion could later get in the way of people who do not want their numbers promoted: lower-precision formats are becoming more and more common in machine learning, so it is not unreasonable to expect that Julia will support them more over time. That scenario is unlikely, yes, but I don't think the tradeoff is worth it.
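
A minimal sketch of that client-side promotion (the column names a and b are made up for illustration; any Tables.jl-compatible column table works the same way):

julia> X32 = (a = rand(Float32, 100), b = rand(Float32, 100));  # Float32 features

julia> X64 = map(col -> Float64.(col), X32);  # promote each column before fitting

julia> eltype(X64.a)
Float64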

I'll close this PR. People who disagree are free to pick this one up.

@rikhuijzer closed this Feb 7, 2024
@rikhuijzer deleted the rh/issue-42 branch February 7, 2024 07:28