Training speed of Regression Forest #45

GoogleCodeExporter · 2016-04-11T15:59:36Z

First, thank you very much for this wonderful software!

I notice that for the same number of samples and features, when the only 
difference is the label type (so that one problem is classification and the 
other is regression), constructing the regression forest takes considerably 
longer than constructing the classification forest. Both runs use the default 
msplit and the same ntrees, and both estimate variable importance along the 
way. Is there a reason for this?

Thanks a lot!

Original issue reported on code.google.com by [email protected] on 27 Sep 2012 at 8:15


GoogleCodeExporter · 2016-04-11T15:59:36Z

Hi Kang

Yes, there is a difference between the regression and classification code. When 
creating a tree you need to split the data, but before splitting you need to 
sort the data falling into a node. The classification code uses a pre-sorted 
array, which makes it scale as O(n) in the number of examples n, whereas the 
regression code sorts on the fly, which makes it scale as O(n log(n)), the best 
scaling a sort can achieve.
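
To make the difference concrete, here is a minimal Python sketch of the two strategies. It is not the package's actual code; the function names (presort, split_presorted, sort_on_the_fly) and the toy data are made up for illustration. It only shows the general idea: a pre-sorted order can be handed down to child nodes with a linear pass, while the on-the-fly approach sorts again at every node.

```python
import random

def presort(values):
    # One O(n log n) sort per feature, done once before any node is split.
    return sorted(range(len(values)), key=lambda i: values[i])

def split_presorted(order, goes_left):
    # Linear pass: each child inherits its samples already in sorted order.
    left = [i for i in order if goes_left[i]]
    right = [i for i in order if not goes_left[i]]
    return left, right

def sort_on_the_fly(values, node_samples):
    # Re-sort the samples that fall into a node: O(m log m) for m samples.
    return sorted(node_samples, key=lambda i: values[i])

random.seed(0)
n = 1000
feature_a = [random.random() for _ in range(n)]
feature_b = [random.random() for _ in range(n)]

# Hypothetical split on feature_b sends each sample to the left or right child.
goes_left = [b < 0.5 for b in feature_b]

# Strategy 1: sort feature_a once, then partition the sorted order.
order_a = presort(feature_a)
left_presorted, right_presorted = split_presorted(order_a, goes_left)

# Strategy 2: collect the left child's samples, then sort them again.
left_samples = [i for i in range(n) if goes_left[i]]
left_resorted = sort_on_the_fly(feature_a, left_samples)

# Both strategies should give the same ordering of the left child's samples.
print(left_presorted == left_resorted)
```

Both routes end with the node's samples ordered by feature value; the difference is only in how much work is repeated at each node, which matters more as the number of examples grows.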

I am guessing you have lots of examples, and that is one reason regression 
might be slower.

The other reason might be that the regression trees are split fully (i.e. leaf 
nodes hold the minimum number of examples), whereas your classification trees 
might be much simpler (a low VC dimension).

Calculate the mean number of nodes in the models created; that might give you 
some more idea:
mean(modelRf.ndbigtree)  % classification
mean(modelRf.ndtree)     % regression

Original comment by abhirana on 27 Sep 2012 at 10:41

GoogleCodeExporter added Priority-Medium Type-Defect auto-migrated labels Apr 11, 2016
