Fitting the Perfect Model

Now that you've had a taste of how predictive modeling works in Azure ML Studio, let's look in greater detail at how to select variables for your model. You may have noticed that the prior chapter never examined coefficient p-values. By contrast, when we created the Excel prediction calculator, we relied on the p-values of the linear regression coefficients to determine which variables to include in our model. Why doesn't Azure ML Studio's evaluation of linear regression give us those results? Well, the p-value is starting to fall out of favor with "big data" scientists because it is so strongly affected by sample size. With enough data, you can find a "statistically significant" p-value for almost any relationship, however weak. Therefore, in this chapter you'll learn how to reduce your independent variables to a smaller, more relevant set using metrics other than the p-value.
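To see why sample size matters so much, here is a minimal Python sketch (not something Azure ML Studio or Excel produces). It holds a very weak correlation fixed at r = 0.02 and changes only the number of observations. The function name `approx_p_value` and the chosen values are illustrative assumptions, and the p-value uses a normal approximation to the t distribution, which is reasonable only for large samples.

```python
import math


def approx_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r on n rows.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and a normal approximation
    to the t distribution (an assumption that is reasonable for large n).
    """
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return math.erfc(abs(t) / math.sqrt(2))


r = 0.02  # hypothetical, very weak relationship (r squared = 0.0004)
for n in (100, 1_000, 10_000, 100_000, 1_000_000):
    print(f"n = {n:>9,}  p = {approx_p_value(r, n):.4g}")
```

The strength of the relationship never changes, yet the p-value falls from well above 0.05 at 100 rows to far below it at 100,000 rows. The variable explains the same tiny fraction of the variance either way; only the amount of data differs.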