Top Ten Confusing Stats Terms Explained in “Plain English” (#8: Residual)
Jeremy Taylor | Monday, October 25, 2010 at 12:19PM
When I hear the word "residual", the pulp left over after I drink my orange juice pops into my brain, or perhaps the film left on the car after a heavy rain. However, when my regression model spits out an estimate of my model's residual, I'm fairly confident it isn't referring to OJ or automobile gunk...right? Not so fast: that imagery is closer to its statistical meaning than you might initially think.
In statistics, a residual refers to the amount of variability in a dependent variable (DV) that is "left over" after accounting for the variability explained by the predictors in your analysis (often a regression). Right about now you are probably thinking: "This guy likes the word 'variability' way too much; he should buy a thesaurus already!"
Let me try again: when you include predictors (independent variables, or IVs) in a regression, you are making a guess (or prediction) that they are associated with the DV; a residual is a numeric value for how wrong that prediction was, namely the observed value of the DV minus the value your model predicted. The smaller the residuals (the closer they are to zero), the more accurate the predictions in your regression are, indicating your IVs are related to (predictive of) the DV.
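If seeing the arithmetic helps, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the hours-studied and exam-score numbers are hypothetical, and `fit_line` and `residuals` are helper names I invented, not functions from any stats package. It fits a one-predictor least-squares line, then subtracts each prediction from the observed DV to get the residuals.

```python
# Hypothetical data: hours studied (IV) and exam score (DV).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 70.0, 72.0]

def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residuals(x, y, intercept, slope):
    """Observed DV minus predicted DV, one value per observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

intercept, slope = fit_line(hours, scores)
resid = residuals(hours, scores, intercept, slope)

for h, s, r in zip(hours, scores, resid):
    print(f"hours={h:.0f} observed={s:.1f} predicted={s - r:.1f} residual={r:+.1f}")

# The "left over" variability: the sum of squared residuals,
# compared with the total variability of the DV around its mean.
mean_score = sum(scores) / len(scores)
total_ss = sum((s - mean_score) ** 2 for s in scores)
residual_ss = sum(r ** 2 for r in resid)
print(f"total SS={total_ss:.1f} residual SS={residual_ss:.1f}")
```

With these made-up numbers, most of the variability in the scores is accounted for by the fitted line, and the residual sum of squares is the small part left over (the pulp in the glass, if you like).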

Reader Comments (4)
THANK YOU FOR BEING SO NICE AND SMART.
-Northwestern PhD student who feels really, really stupid in stats class
Glad I could be helpful, DudeBro!
Thanks for this series of plain-English explanations/definitions of statistical concepts. I was wondering though if you stopped at No. 8? Or are the 7 others somewhere else? I can't see them in this blog. Would want to know and read more of the items in your list.
Thanks for your post, Mark! To be honest, I got distracted from the list by user feedback and requests to produce tutorial videos. However, I've received a few comments and questions about the list in the last few weeks in particular, so I've been considering resuming that series!
I've also toyed with the idea of posting an interactive poll to elicit users' help with choosing the remaining stats terms (and the order they appear in)... What does everyone think?