This repository has been archived by the owner on Oct 8, 2019. It is now read-only.
Makoto YUI edited this page Jun 12, 2014 · 14 revisions
OOM in mappers
In some configurations, the default input split size is too large for Hivemall. As a result, an OutOfMemoryError can occur on mappers in the middle of training.
In that case, first revise your Hadoop settings (mapred.child.java.opts / mapred.map.child.java.opts) to give the task JVMs as large a heap as possible.
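As a minimal sketch of the heap adjustment described above, the settings can be changed per session from Hive. The 2 GB value is an assumption chosen for illustration, not a recommendation from this page; pick a value that your cluster nodes can actually spare.

```sql
-- Assumed value: 2 GB of heap per task JVM (illustrative only).
SET mapred.child.java.opts=-Xmx2048m;

-- Map-side setting named above; applies only on Hadoop versions that support it.
SET mapred.map.child.java.opts=-Xmx2048m;
```

Whether these per-session overrides take effect depends on the cluster configuration, since administrators can mark such properties as final.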
If an OOM error still occurs after that, set a smaller mapred.max.split.size value before training.
SET mapred.max.split.size=67108864;
A smaller split size increases the number of mappers, so each trainer processes fewer training examples and the trained model is more likely to fit in memory.
OOM in shuffle/merge
If an OOM error occurs during the merge step, try setting a larger mapred.reduce.tasks value before training, and revise the shuffle/reduce parameters.
SET mapred.reduce.tasks=64;
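The shuffle/reduce parameters are not named on this page. As a hedged sketch, the following assumes the classic (Hadoop 1.x style) property names and uses illustrative values; check the property names and defaults against your Hadoop version before relying on them.

```sql
-- Assumption: Hadoop 1.x style property names; values are illustrative only.

-- Fraction of the reducer heap used to buffer map outputs during the shuffle.
-- Lowering it leaves more heap headroom at the cost of more spills to disk.
SET mapred.job.shuffle.input.buffer.percent=0.5;

-- Fill level of that buffer at which an in-memory merge is started.
SET mapred.job.shuffle.merge.percent=0.5;
```

Lowering these values trades shuffle speed for memory safety, so it is worth trying only after increasing the number of reducers has not helped.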
If the OOM error occurs when using amplify(), try using rand_amplify() instead.
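The switch can be sketched as follows. The table name `training`, its columns (`rowid`, `label`, `features`), the view names, the amplification factor of 3, and the buffer size of 1000 are all assumptions made for illustration.

```sql
-- Before: amplify() emits each row 3 times; CLUSTER BY rand() then shuffles
-- the amplified rows through the reducers, which is where memory pressure arises.
CREATE OR REPLACE VIEW training_x3 AS
SELECT * FROM (
  SELECT amplify(3, *) AS (rowid, label, features)
  FROM training
) t
CLUSTER BY rand();

-- After: rand_amplify() amplifies and shuffles map-side using a bounded buffer
-- (assumed here to hold 1000 rows), so no reduce-side shuffle is needed.
CREATE OR REPLACE VIEW training_x3_rand AS
SELECT rand_amplify(3, 1000, *) AS (rowid, label, features)
FROM training;
```

Because rand_amplify() shuffles only within its buffer, the resulting order is less random than a full CLUSTER BY rand() shuffle; a larger buffer improves the mixing but uses more mapper memory.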