What are residuals in regression? How can we find residuals? (Raja, location unknown)
Great question(s), Raja! "Residual" is a buzzword that is often used in statistics, which can make things very confusing if you aren't clear on what it means. Before I answer the question here (which I will), I'll also refer you to a blog post I wrote in October on this very topic, called "Top Ten Confusing Stats Terms Explained in Plain English (#8 Residual)". While I'm happy to address the question here, you might find that the blog post offers more detail and examples that are useful for understanding this complex topic.
That said, here is my attempt to answer this question concisely:
In general, a residual is the difference between the actual value of a dependent variable (DV) and the value of that variable predicted by a statistical model (that is, residual = actual value minus predicted value). In the context of a regression, a residual is how far the predicted value (as determined by the predictors in the model) is from the actual value of the DV. This is also loosely called an "error term" (because it represents how much your regression model was in error, in terms of its ability to predict the value of the DV), although strictly speaking the residual is the sample estimate of the model's unobservable error term.
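To make the "how can we find residuals" part concrete, here is a minimal sketch in Python using NumPy. The data (hours studied and exam scores) are made up purely for illustration, and a simple straight-line regression is assumed; the same subtraction applies to any regression model.

```python
import numpy as np

# Hypothetical data: hours studied (predictor) and exam score (DV).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 55.0, 61.0, 64.0, 68.0])

# Fit a straight line by ordinary least squares.
# np.polyfit returns coefficients from the highest degree down,
# so for deg=1 the order is (slope, intercept).
slope, intercept = np.polyfit(x, y, deg=1)

# Predicted value of the DV for each person.
predicted = intercept + slope * x

# Residual for each person: actual value minus predicted value.
residuals = y - predicted

for actual, fitted, resid in zip(y, predicted, residuals):
    print(f"actual={actual:.1f}  predicted={fitted:.1f}  residual={resid:+.1f}")
```

A positive residual means the model under-predicted that person's score, and a negative residual means it over-predicted. When the model includes an intercept, least-squares residuals sum to zero (up to rounding).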
In any statistical model, residuals can be thought of at different levels. For example, each person will have their own "residual" score (which is simply the difference between the actual and predicted value of the DV), while the model as a whole also has a residual score (which represents how much variability in the DV remained "unexplained" by the predictors in the model). Residuals serve several purposes, including: 1) determining the accuracy of your model (how much variability is explained by the model) and 2) testing the assumptions inherent in the regression analysis (such as the assumption that the residuals are normally distributed, or that their variance is equal across all levels of the predicted value of the DV).
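As a rough illustration of the model-level idea, the sketch below sums the squared individual residuals to get the "unexplained" variability and compares it with the total variability in the DV. The helper name residual_summary and the data are hypothetical, chosen only for this example; a simple straight-line fit is assumed.

```python
import numpy as np

def residual_summary(y, predicted):
    """Return per-person residuals, unexplained variability, and R-squared."""
    y = np.asarray(y, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    residuals = y - predicted                      # one residual per person
    rss = float(np.sum(residuals ** 2))            # variability left unexplained
    tss = float(np.sum((y - y.mean()) ** 2))       # total variability in the DV
    r_squared = 1.0 - rss / tss                    # proportion explained
    return residuals, rss, r_squared

# Hypothetical data: hours studied (predictor) and exam score (DV).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 55.0, 61.0, 64.0, 68.0])

slope, intercept = np.polyfit(x, y, deg=1)
residuals, rss, r_squared = residual_summary(y, intercept + slope * x)

print(f"residual sum of squares = {rss:.2f}")
print(f"R-squared = {r_squared:.3f}")
```

The smaller the residual sum of squares is relative to the total, the more of the DV's variability the predictors account for. The same per-person residuals are what you would inspect (for example, by plotting them against the predicted values) when checking the assumptions mentioned above.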
Again, please visit the blog linked above for more in-depth information. Good luck!
Reader Comments (4)
I have heard of people using residuals themselves as dependent variables in a subsequent regression analysis. When might this be helpful?
I am very thankful for the answer.
But there is one thing which has been worrying me continuously for the last 3 weeks: the term "Residual Analysis".
Now I know what residuals are, but sir, I want to know about the analysis of residuals. Which methods are used for the analysis of residuals?
Great question, Alyssa! Sometimes residuals are used as the outcome (dependent variable) in a subsequent regression when a researcher would like to control for or "partial out" the effect of the predictors of the original regression (from which the residuals were obtained).
Thanks for your answer. It is very useful. I got a lot of useful and significant information. Thank you so much.
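One way to see what "partialling out" does is the sketch below. It is not a procedure described in the reply itself; it illustrates a standard result (often called the Frisch-Waugh-Lovell theorem) under the assumption of ordinary least squares. The variables x, y and z are simulated, and the helper name ols_residuals is hypothetical.

```python
import numpy as np

def ols_residuals(y, X):
    """Residuals from regressing y on X (with an intercept) by least squares."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return y - X1 @ coef

# Simulated data: z is the variable to control for, x is the predictor of interest.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)
y = 2.0 * x + 1.5 * z + rng.normal(size=n)

# Step 1: remove the part of y and the part of x that z can predict.
y_res = ols_residuals(y, z)
x_res = ols_residuals(x, z)

# Step 2: regress the y residuals on the x residuals (both have mean zero).
slope_partial = np.sum(x_res * y_res) / np.sum(x_res ** 2)

# For comparison: the coefficient on x when y is regressed on x and z together.
X_full = np.column_stack([np.ones(n), x, z])
coef_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)

print(f"slope from residual regression: {slope_partial:.4f}")
print(f"slope on x in the full model:   {coef_full[1]:.4f}")
```

The two slopes agree, which is the sense in which the residuals carry the DV "with z controlled for". Note that the match is exact only when the predictor of interest is residualized on z as well; regressing the y residuals on the raw x generally gives a different slope.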