Normalize input for Choice Models #208
This sounds like a promising feature! I am cautiously enthusiastic.

Some points in favor: I believe that widely divergent coefficients are a problem not just for interpretation, but also for the speed and accuracy of parameter estimation. (The search for optimal values is harder if some are far from the starting point and if the sensitivities vary.) The best-practice advice I've heard is to manually scale the input data so that the fitted coefficients are of similar magnitude. But this is not convenient, especially in a semi-automated context like building an UrbanSim model. Automatically normalizing the input data would help.

Some points of caution: we'd need to be very clear in the documentation and in the output that the fitted parameters apply to transformed data. I don't think this is a common approach, and it should be an optional setting.

Our roadmap is to move the statistics logic out of the UrbanSim repository and into ChoiceModels, but it seems fine to implement this feature here and include it in a point release. ChoiceModels is a ways off from being ready, and the shift will be disruptive enough that we should save it for a major version bump of UrbanSim.
I know I have heard Andrew Gelman express that viewpoint in the past, but a quick Google search turned up a more nuanced blog post of his.
OK, so if we make it an optional setting, one way to make it clearer may be to have the section in the yaml be called …
A number of times we have accidentally compared the magnitude of coefficients in the yaml files that represent `MNLDiscreteChoiceModel` instances. This is of course a mistake, as 0.001 is a large coefficient for `nonres_sqft` and a small coefficient for `frac_developed`.

There is also the "magic 3s" problem: the code puts a hard cutoff on coefficients at -3 and 3. This is a great default for normalized variables, i.e. ones with std ≈ 1 and mean ≈ 0, but far too small or large for other columns. If coefficients are made comparable, then we can also consider adding L1 or L2 regularization.

My proposal is that when fitting a model we subtract the mean and divide by the std for each column. In the yaml file we store the training mean, training std, and the coefficients of the transformed columns. Then when predicting with a model we transform with the stored mean and std. Use of the models will be unchanged, but the stored coefficients will be comparable with each other.
Thoughts?