Allow arbitrary precision number formats (e.g., Float32) #43

Closed
wants to merge 2 commits from rh/issue-42

Conversation


@rikhuijzer (Member) commented Jan 31, 2024

While trying to solve #42, it turned out not to be so easy. I fixed two things along the way, but ran into a third situation: many defaults are set to Float64. For example,

julia> Bernoulli()
Bernoulli{Float64}(p=0.5)
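
For comparison, here is a sketch of what an explicitly typed parameter gives, assuming Distributions.jl's parametric constructors (the output is written out from the usual show format, not copied from a session):

julia> Bernoulli(0.5f0)  # a Float32 parameter gives a Float32 distribution
Bernoulli{Float32}(p=0.5f0)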

This is the third situation that @testset "Issue 42" is now stuck on:

MethodError: no method matching delbeta!(::GLM.DensePredChol{Float64, CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}, ::Vector{Float32}, ::Vector{Float32})

Closest candidates are:
  delbeta!(::GLM.SparsePredChol{T}, ::Vector{T}, ::Vector{T}) where T
   @ GLM ~/.julia/packages/GLM/vM20T/src/linpred.jl:213
  delbeta!(::GLM.DensePredChol{T, <:Cholesky}, ::Vector{T}, ::Vector{T}) where T<:Union{Float32, Float64}
   @ GLM ~/.julia/packages/GLM/vM20T/src/linpred.jl:151
  delbeta!(::GLM.SparsePredChol{T}, ::Vector{T}) where T
   @ GLM ~/.julia/packages/GLM/vM20T/src/linpred.jl:220
  ...

Stacktrace:
 [1] _fit!(m::GLM.GeneralizedLinearModel{GLM.GlmResp{Vector{Float32}, Bernoulli{Float32}, GLM.LogitLink}, GLM.DensePredChol{Float64, CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, verbose::Bool, maxiter::Int64, minstepfac::Float64, atol::Float64, rtol::Float64, start::Nothing)
   @ GLM ~/.julia/packages/GLM/vM20T/src/glmfit.jl:372
 [2] #fit!#19
   @ ~/.julia/packages/GLM/vM20T/src/glmfit.jl:462 [inlined]
 [3] fit!
   @ ~/.julia/packages/GLM/vM20T/src/glmfit.jl:429 [inlined]
 [4] fit(::Type{GLM.GeneralizedLinearModel}, X::Matrix{Float64}, y::Vector{Float32}, d::Bernoulli{Float32}, l::GLM.LogitLink; dropcollinear::Bool, dofit::Bool, wts::Vector{Float32}, offset::Vector{Float32}, fitargs::@Kwargs{maxiter::Int64, atol::Float64, rtol::Float64, minstepfac::Float64})
   @ GLM ~/.julia/packages/GLM/vM20T/src/glmfit.jl:584
 [5] fit(::Type{GLM.GeneralizedLinearModel}, ::StatsModels.FormulaTerm{StatsModels.Term, Tuple{StatsModels.Term, StatsModels.Term, StatsModels.ConstantTerm{Int64}}}, ::Tables.MatrixTable{Matrix{Float32}}, ::Bernoulli{Float32}, ::Vararg{Any}; contrasts::Dict{Symbol, Any}, kwargs::@Kwargs{offset::Vector{Float32}, maxiter::Int64, atol::Float64, rtol::Float64, minstepfac::Float64, wts::Vector{Float32}})
   @ StatsModels ~/.julia/packages/StatsModels/syVEq/src/statsmodel.jl:88
 [6] glm(::StatsModels.FormulaTerm{StatsModels.Term, Tuple{StatsModels.Term, StatsModels.Term, StatsModels.ConstantTerm{Int64}}}, ::Tables.MatrixTable{Matrix{Float32}}, ::Bernoulli{Float32}, ::Vararg{Any}; kwargs::@Kwargs{offset::Vector{Float32}, maxiter::Int64, atol::Float64, rtol::Float64, minstepfac::Float64, wts::Vector{Float32}})
   @ GLM ~/.julia/packages/GLM/vM20T/src/glmfit.jl:604
 [7] fit(model::LinearBinaryClassifier, verbosity::Int64, X::@NamedTuple{a::Vector{Float32}, b::Vector{Float32}}, y::CategoricalArrays.CategoricalVector{Bool, UInt32, Bool, CategoricalArrays.CategoricalValue{Bool, UInt32}, Union{}}, w::Nothing)
   @ MLJGLMInterface ~/git/MLJGLMInterface.jl/src/MLJGLMInterface.jl:445
 [8] fit(model::LinearBinaryClassifier, verbosity::Int64, X::@NamedTuple{a::Vector{Float32}, b::Vector{Float32}}, y::CategoricalArrays.CategoricalVector{Bool, UInt32, Bool, CategoricalArrays.CategoricalValue{Bool, UInt32}, Union{}})
   @ MLJGLMInterface ~/git/MLJGLMInterface.jl/src/MLJGLMInterface.jl:433
 [9] top-level scope
   @ REPL[65]:1

So I propose to wait for @tiemvanderdeure's answer to the question of how much 32-bit support is needed, and otherwise to just change the code to promote to 64-bit automatically. Performance and memory efficiency are usually not critical for GLM.jl, I would guess. R's core math library also defaults to Float64 (see for example https://github.com/SurajGupta/r-source/blob/master/src/nmath/dnt.c, which shows that the distribution logic operates on doubles by default).
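
For concreteness, a minimal sketch of what automatic promotion could look like; promote_inputs is a hypothetical helper, not existing GLM.jl or MLJGLMInterface.jl API:

julia> promote_inputs(v::AbstractVector{<:AbstractFloat}) = Float64.(v);  # promote float columns

julia> promote_inputs(v::AbstractVector) = v;  # pass non-float columns through unchanged

julia> promote_inputs(Float32[0.5, 0.25])
2-element Vector{Float64}:
 0.5
 0.25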

@rikhuijzer changed the title from "Trigger CI" to "Allow arbitrary precision number formats (e.g., Float32)" on Jan 31, 2024
@tiemvanderdeure (Contributor)

For my use case, promoting to Float64 automatically would be a fine solution.

@rikhuijzer (Member, Author)

I've thought a bit more about it, and it's probably better for clients to promote to Float64 on their side. Automatic promotion could later get in the way of people who do not want their numbers promoted: lower-precision formats are becoming more and more common in machine learning, so it is not unreasonable to expect that Julia will support them more over time. That scenario is unlikely, yes, but I don't think the tradeoff is worth it.
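
A minimal sketch of that client-side promotion (the column names a and b are made up for illustration; any Tables.jl-compatible column table works the same way):

julia> X32 = (a = rand(Float32, 100), b = rand(Float32, 100));  # Float32 features

julia> X64 = map(col -> Float64.(col), X32);  # promote each column before fitting

julia> eltype(X64.a)
Float64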

I'll close this PR. People who disagree are free to pick this one up.

@rikhuijzer closed this Feb 7, 2024
@rikhuijzer deleted the rh/issue-42 branch February 7, 2024 07:28