Ensemble Learning

Ensemble Learning in Regression Applications

Gaurav Mishra, Karan Praharaj, Madan Gopal

This project assesses the performance and usability of ensemble regression techniques against other machine learning algorithms, using test mean squared error (MSE) as the metric for comparison. The goal of ensemble regression is to combine several models in order to improve prediction accuracy on learning problems with a numerical target variable. Many methods for constructing ensembles have been developed. This project uses methods that manipulate the training examples to generate multiple hypotheses. For experimental evaluation, we used a house value prediction problem and a stock index prediction problem. Specifically, the bagging, boosting, and random forest ensemble methods were used in addition to standard machine learning algorithms. In terms of test MSE, boosting (AdaBoost) and bagging performed best of all the algorithms on the housing problem. We expect that what we learned from these results will provide a useful platform for our further work on the stock index prediction problem.
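
To make "manipulating the training examples to generate multiple hypotheses" concrete, here is a minimal from-scratch sketch of bagging for regression, scored by test MSE. It is not the project's implementation: the synthetic sine data, the one-split stump base learner, and every name in it (`fit_stump`, `fit_bagging`, and so on) are assumptions chosen only for illustration.

```python
import math
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def fit_stump(xs, ys):
    """Fit a one-split regression stump on a single feature.

    Returns (threshold, left_mean, right_mean) minimizing squared error.
    """
    pairs = sorted(zip(xs, ys))
    overall = mean(ys)
    best = (float("inf"), pairs[0][0], overall, overall)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no valid split between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagging(xs, ys, n_models, rng):
    """Train each stump on a bootstrap resample (drawn with replacement)."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Average the predictions of all base models."""
    return mean([predict_stump(m, x) for m in models])


# Hypothetical synthetic data: y = sin(x) + Gaussian noise (not the project's datasets).
rng = random.Random(0)
xs = [rng.uniform(0.0, 6.0) for _ in range(120)]
ys = [math.sin(x) + rng.gauss(0.0, 0.3) for x in xs]
x_train, y_train = xs[:80], ys[:80]
x_test, y_test = xs[80:], ys[80:]

single = fit_stump(x_train, y_train)
models = fit_bagging(x_train, y_train, n_models=25, rng=rng)

single_mse = mse(y_test, [predict_stump(single, x) for x in x_test])
bagged_mse = mse(y_test, [predict_bagging(models, x) for x in x_test])
print(f"single stump test MSE: {single_mse:.4f}")
print(f"bagged stumps test MSE: {bagged_mse:.4f}")
```

Each base model sees a different bootstrap resample of the same training set, and the ensemble prediction is the average of the individual predictions. Boosting and random forests also build several hypotheses from one training set, but by different mechanisms (reweighting examples, and combining bootstrap resampling with random feature selection, respectively).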
