Use KKT conditions to speed up mode search #12
I did a first assessment with a 'medium-sized' data set (24k covariates, 450k rows), which was not a big success:
Well, that certainly seems disappointing, and it runs counter to the small examples I played with, where using the swindle never took longer. Can you share the data files? Or provide R commands that I can run against cdm_sim4 to generate something similar?
The following shows an approximately 2-fold speed-up:

library(Cyclops)
library(testthat) # for expect_equal()

seed <- 666
set.seed(seed) # make the simulated data reproducible
tolerance <- 1E-4

data <- simulateCyclopsData(nstrata = 1,
                            nrows = 10000,
                            ncovars = 20000,
                            zeroEffectSizeProp = 0.99,
                            model = "logistic")

cyclopsData <- convertToCyclopsData(data$outcomes,
                                    data$covariates,
                                    modelType = "lr",
                                    addIntercept = TRUE)

# Baseline fit without the KKT swindle
slowFit <- fitCyclopsModel(cyclopsData,
                           forceNewObject = TRUE, # Cold start for fair comparison
                           prior = createPrior("laplace", variance = 0.01, exclude = c(0)))

# Same model and prior, with the KKT swindle enabled
fastFit <- fitCyclopsModel(cyclopsData,
                           forceNewObject = TRUE, # Cold start for fair comparison
                           prior = createPrior("laplace", variance = 0.01, exclude = c(0)),
                           control = createControl(noiseLevel = "silent",
                                                   useKKTSwindle = TRUE,
                                                   tuneSwindle = 100))

slowFit$timeFit
fastFit$timeFit
slowFit$timeLog # An apparent inefficiency in moving coefficient estimates back into R
fastFit$timeLog #

# Both fits should reach the same mode
expect_equal(coef(slowFit), coef(fastFit), tolerance = tolerance)
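As background for the `useKKTSwindle` option above, here is a minimal sketch of the kind of check that the Karush-Kuhn-Tucker (KKT) conditions allow under a Laplace prior. It illustrates the general idea in base R and is not the Cyclops implementation: `kktViolations` and its arguments are hypothetical names. Assuming a penalty of lambda = sqrt(2 / variance), a coefficient may stay at zero only if the absolute gradient of the log-likelihood for that covariate does not exceed lambda. A search can therefore fit a small active set, check the excluded covariates, and add back only those that violate the condition.

```r
# Hypothetical helper (not part of Cyclops): return the indices of penalized
# coefficients that are currently zero but violate the KKT condition
# |g_j| <= lambda for an L1-penalized (Laplace prior) logistic regression.
kktViolations <- function(X, y, beta, lambda,
                          penalized = rep(TRUE, length(beta))) {
  p <- 1 / (1 + exp(-as.vector(X %*% beta))) # fitted probabilities
  g <- as.vector(crossprod(X, y - p))        # gradient of the log-likelihood
  which(penalized & beta == 0 & abs(g) > lambda)
}

# Tiny hand-checkable example: covariate 1 separates the outcomes,
# covariate 2 is unrelated to them.
X <- cbind(c(1, 1, 1, 1, 0, 0, 0, 0),
           c(1, 0, 1, 0, 1, 0, 1, 0))
y <- c(1, 1, 1, 1, 0, 0, 0, 0)
beta <- c(0, 0) # start with every covariate excluded

# At beta = 0 the gradient is (2, 0), so only covariate 1 violates with lambda = 1
print(kktViolations(X, y, beta, lambda = 1)) # 1

# With a stronger penalty no covariate needs to enter the active set
print(kktViolations(X, y, beta, lambda = 3)) # integer(0)

# Penalty implied by a Laplace prior with variance 0.01, as in the fits above
lambdaFromVariance <- sqrt(2 / 0.01)
```

Covariates returned by the check would be added to the active set and the model refit, repeating until no violations remain.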
I am still wondering about this ....
Hi @schuemie and @pbr6cornell,
If you can generate a dataset with 100,000 or more covariates and many, many rows, I just implemented (in a15bb66) a mode search strategy that should be much faster than before. Please let me know your mileage, so I can tinker a bit with performance. The R commands are: