Title | Forests of Stumps |
Authors | Alharthi, Amirah and Taylor, Charles C. and Voss, Jochen |
Year | 2018 |
Volume | Archives of Data Science, Series A 5(1) / 2018 |
Abstract | Many numerical studies (Hansen and Salamon, 1990; Schapire, 1990) indicate that bagged decision stumps are more accurate than a single stump. In this work, we investigate two approaches to creating a forest of stumps for classification. The first method is bagging with stumps, that is, growing each stump on a different bootstrap sample drawn from the training dataset. The second method is Gini-sampled stumps, where we sample split points with probability proportional to the Gini index. These two methods are combined with two aggregation methods: majority vote and weighted vote. We use simulation studies to compare the performance and computing time of these two methods. Generating split points with Gini-sampled stumps takes less than half the computing time needed to generate split points from bootstrap samples. Also, weighted vote aggregation is more accurate than majority vote aggregation. |
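
The two constructions and two aggregation rules named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not settle: a single numeric feature, candidate split points taken as midpoints between adjacent distinct feature values, "Gini index" read as the decrease in Gini impurity produced by a split, and vote weights set to each stump's training accuracy. All function names are hypothetical.

```python
import random
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split(xs, ys, t):
    """Labels falling on the left (x <= t) and right (x > t) of threshold t."""
    left = [y for x, y in zip(xs, ys) if x <= t]
    right = [y for x, y in zip(xs, ys) if x > t]
    return left, right

def gini_gain(xs, ys, t):
    """Assumed reading of 'Gini index': impurity decrease from splitting at t."""
    left, right = split(xs, ys, t)
    child = (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / len(ys)
    return max(gini_impurity(ys) - child, 0.0)  # clamp floating-point noise

def candidate_thresholds(xs):
    """Midpoints between adjacent distinct feature values."""
    vals = sorted(set(xs))
    return [(a + b) / 2 for a, b in zip(vals, vals[1:])]

def fit_stump(xs, ys, t):
    """A stump is (threshold, left label, right label), labels by majority class."""
    left, right = split(xs, ys, t)
    return (t, Counter(left).most_common(1)[0][0], Counter(right).most_common(1)[0][0])

def bagged_stumps(xs, ys, n_stumps, rng):
    """Method 1: grow the best-gain stump on each bootstrap sample."""
    stumps, n = [], len(xs)
    while len(stumps) < n_stumps:
        idx = [rng.randrange(n) for _ in range(n)]
        bx, by = [xs[i] for i in idx], [ys[i] for i in idx]
        if len(set(bx)) < 2:  # no split possible on this sample; redraw
            continue
        t = max(candidate_thresholds(bx), key=lambda c: gini_gain(bx, by, c))
        stumps.append(fit_stump(bx, by, t))
    return stumps

def gini_sampled_stumps(xs, ys, n_stumps, rng):
    """Method 2: sample thresholds with probability proportional to the gain.

    Gains are computed once on the full training set, which is where a
    saving in computing time would come from in this sketch.
    """
    cands = candidate_thresholds(xs)
    gains = [gini_gain(xs, ys, t) for t in cands]
    chosen = rng.choices(cands, weights=gains, k=n_stumps)
    return [fit_stump(xs, ys, t) for t in chosen]

def predict_stump(stump, x):
    t, left_label, right_label = stump
    return left_label if x <= t else right_label

def majority_vote(stumps, x):
    """Each stump casts one vote; the most frequent label wins."""
    return Counter(predict_stump(s, x) for s in stumps).most_common(1)[0][0]

def training_accuracy(stump, xs, ys):
    """Assumed vote weight: fraction of training points the stump gets right."""
    return sum(predict_stump(stump, x) == y for x, y in zip(xs, ys)) / len(ys)

def weighted_vote(stumps, weights, x):
    """Each stump's vote counts with its weight; the largest total wins."""
    totals = Counter()
    for s, w in zip(stumps, weights):
        totals[predict_stump(s, x)] += w
    return max(totals, key=totals.get)
```

In this sketch the bagged variant recomputes the gain of every candidate split on every bootstrap sample, whereas the Gini-sampled variant computes the gains once and then only draws thresholds. Other weighting schemes for the weighted vote would slot into `weighted_vote` unchanged.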