FMWithSGD default constructor parameters are inconsistent/too small #11

Open
Hydrotoast opened this issue May 23, 2016 · 0 comments

Hydrotoast commented May 23, 2016

From the FMWithSGD file:

  /**
    * Construct an object with default parameters: {task: 0, stepSize: 1.0, numIterations: 100,
    * dim: (true, true, 8), regParam: (0, 0.01, 0.01), miniBatchFraction: 1.0}.
    */
  def this() = this(0, 1.0, 100, (true, true, 8), (0, 1e-3, 1e-4), 1e-5)

The Scaladoc is inconsistent with the values actually passed: it documents regParam as (0, 0.01, 0.01) and miniBatchFraction as 1.0, while the constructor passes (0, 1e-3, 1e-4) and 1e-5.
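Until the defaults are reconciled, a caller can sidestep the ambiguity by spelling out every parameter. A minimal sketch, assuming the primary constructor is accessible and takes its parameters in the same order as the auxiliary constructor quoted above (task, stepSize, numIterations, dim, regParam, miniBatchFraction):

```scala
// Illustrative only; parameter order assumed from the auxiliary constructor above.
val fm = new FMWithSGD(
  0,                 // task
  1.0,               // stepSize
  100,               // numIterations
  (true, true, 8),   // dim
  (0.0, 1e-3, 1e-4), // regParam
  1.0                // miniBatchFraction: sample (in expectation) the full data set each iteration
)
```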

It is also worth noting that 1e-5 may be too small a mini-batch fraction to train all of the parameters. Since the GradientDescent implementation in Scala performs numIterations iterations of mini-batch SGD, each iteration sampling a miniBatchFraction of the data, at most roughly numIterations * miniBatchFraction of the labeled points are ever visited. For numIterations = 100 and miniBatchFraction = 1e-5, that means at most a 1e-3 fraction (0.1%) of the labeled points is actually used during training!
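A quick back-of-the-envelope check of that claim (plain Scala, no Spark needed; the data set size below is just an example):

```scala
// Expected fraction of the data set touched across all iterations, assuming each
// iteration samples a fresh miniBatchFraction of the data.
val numIterations = 100
val miniBatchFraction = 1e-5
val maxFractionUsed = numIterations * miniBatchFraction
println(f"At most $maxFractionUsed%.5f of the training data is ever sampled") // 0.00100

// For a data set of, say, one million labeled points:
val numExamples = 1000000L
println(s"i.e. at most ${(numExamples * maxFractionUsed).toLong} of $numExamples points") // 1000
```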

Further implications: since the model keeps a set of parameters per feature, any feature that goes unseen during training simply retains its default initialization: a latent vector drawn from a Normal distribution and a weight of 0.0.
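To make that concrete, here is a hypothetical sketch (not the library's actual code) of the per-feature state described above; the initialization scale is an assumption for illustration only:

```scala
import scala.util.Random

// If a feature never appears in any sampled mini-batch, SGD never updates these values.
val numFactors = 8                    // matches dim = (true, true, 8)
val initStd = 0.01                    // assumed scale, illustration only
val rng = new Random(42)

val latentVector = Array.fill(numFactors)(rng.nextGaussian() * initStd) // ~ Normal
val linearWeight = 0.0                                                  // stays 0.0 if unseen
```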

asfgit pushed a commit to apache/spark that referenced this issue May 25, 2016
## What changes were proposed in this pull request?

Add a warning log for the case that `numIterations * miniBatchFraction < 1.0` during gradient descent. If the product of those two numbers is less than `1.0`, then not all training examples will be used during optimization. To put this concretely, suppose that `numExamples = 100`, `miniBatchFraction = 0.2` and `numIterations = 3`. Then 3 iterations will occur, each sampling approximately 20 examples. In the best case all of the sampled examples are distinct, so at most 60 of the 100 examples are used.
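A sketch of the kind of check this describes, assuming it sits inside `GradientDescent.runMiniBatchSGD` where `numIterations`, `miniBatchFraction`, and Spark's `logWarning` are in scope (a paraphrase, not necessarily the exact patch):

```scala
// Sketch: warn when the expected coverage of the training set is below 100%.
if (numIterations * miniBatchFraction < 1.0) {
  logWarning("Not all examples will be used if numIterations * miniBatchFraction < 1.0: " +
    s"numIterations=$numIterations, miniBatchFraction=$miniBatchFraction")
}
```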

This may be counter-intuitive to most users, and it led to an issue during the development of another Spark ML model: zhengruifeng/spark-libFM#11. If a user actually does not need the full training data set, it would be easier and more intuitive to subsample it explicitly with `RDD.sample`.
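For reference, explicit subsampling with the standard RDD API looks like this (`trainingData` is a hypothetical `RDD[LabeledPoint]`):

```scala
// Draw an explicit ~20% subsample once, up front, rather than relying on a tiny
// miniBatchFraction to implicitly discard most of the data during optimization.
val subsample = trainingData.sample(withReplacement = false, fraction = 0.2, seed = 11L)
```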

## How was this patch tested?

`build/mvn -DskipTests clean package` build succeeds

Author: Gio Borje <[email protected]>

Closes #13265 from Hydrotoast/master.

(cherry picked from commit 589cce9)
Signed-off-by: Sean Owen <[email protected]>