GitHub Project: r-predictive-analysis-template

The main objective of this project is to find the most promising predictive models for a given dataset, using a shotgun (i.e. spot-check) approach: trying many different models with reasonable defaults.

The intent is not to skip the thinking process, but to get a lot of information in a relatively short amount of time.

This information helps determine which candidate models are worth spending time on and optimizing further.

There are two types of projects: regression and classification.

The regression code can be found here; the classification code is TBD.


The following plots show the main outputs for the regression analysis.

  • shotgun approach: plot showing cross-validated RMSE and R-Squared for a variety of models (with reasonable defaults) trained on the training set.
    • For example:

[Figure: spot_check]

  • final models: plot showing RMSE, MAE, and correlation for the top x (e.g. 5) models, which have been retrained on the entire training set (as opposed to cross-validated) and then tested on the test set (data points that the models have not seen).
    • For example:

[Figure: final_models]
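
The two-stage workflow described above (cross-validated spot check, then retraining the top models and scoring them on a held-out test set) can be sketched in a few lines. This is only an illustration of the idea, not the project's code: the project itself is written in R, while the sketch below uses plain Python with no dependencies. The data is synthetic, the three toy "models" are stand-ins for the real candidates, R-Squared is assumed here to be 1 - SS_res/SS_tot, and every function and variable name is hypothetical.

```python
import math
import random

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def correlation(actual, predicted):
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Toy candidate models (assumed stand-ins). Each fit function takes
# training data and returns a predict(x) function.
def fit_mean(xs, ys):
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_knn(xs, ys, k=5):
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda pair: abs(pair[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict

def cross_validate(fit, xs, ys, folds=5):
    """Return mean RMSE and R-Squared over the held-out folds."""
    rmses, r2s = [], []
    for i in range(folds):
        held_out = set(range(i, len(xs), folds))
        x_fit = [x for j, x in enumerate(xs) if j not in held_out]
        y_fit = [y for j, y in enumerate(ys) if j not in held_out]
        x_val = [xs[j] for j in sorted(held_out)]
        y_val = [ys[j] for j in sorted(held_out)]
        predict = fit(x_fit, y_fit)
        preds = [predict(x) for x in x_val]
        rmses.append(rmse(y_val, preds))
        r2s.append(r_squared(y_val, preds))
    return {"rmse": sum(rmses) / folds, "r_squared": sum(r2s) / folds}

# Synthetic data (an assumption for the sketch): y = 2x + 1 plus noise.
random.seed(42)
data = [(i / 10, 2.0 * (i / 10) + 1.0 + random.gauss(0, 0.5)) for i in range(100)]
random.shuffle(data)
train, test = data[:80], data[80:]
x_train, y_train = [x for x, _ in train], [y for _, y in train]
x_test, y_test = [x for x, _ in test], [y for _, y in test]

# Stage 1, spot check: cross-validate every candidate on the training set.
candidates = {"mean_baseline": fit_mean, "linear": fit_linear, "knn_5": fit_knn}
cv_results = {name: cross_validate(fit, x_train, y_train)
              for name, fit in candidates.items()}
ranked = sorted(cv_results, key=lambda name: cv_results[name]["rmse"])

# Stage 2, final models: retrain the top candidates on the entire
# training set, then score them on the unseen test set.
final_results = {}
for name in ranked[:2]:
    predict = candidates[name](x_train, y_train)
    preds = [predict(x) for x in x_test]
    final_results[name] = {"rmse": rmse(y_test, preds),
                           "mae": mae(y_test, preds),
                           "correlation": correlation(y_test, preds)}

for name, metrics in final_results.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Ranking by cross-validated RMSE is one possible selection rule; the test set is touched only in the second stage, so the final metrics are not influenced by the model selection.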