
STATISTICS QUESTIONS FROM YOU FOR THE STATS MAKE ME CRY GUY!


This page features questions previously submitted by users on the "Ask the Stats Make Me Cry Guy" page. Although we now use the forum for these questions instead, I decided to leave these posted so that the information remains available!

Entries in regression (3)

Monday
Jan 03, 2011

Which methods are used for analysis of Residuals? (Raja, location unknown)

Another great question, Raja! There are several ways to analyze residuals, and the one an analyst chooses depends on their purpose for analyzing them. For example, if someone is testing whether the variance of their residuals is equal across levels of the predicted value of their model's DV (an assumption of regression), they would use a scatterplot, placing the predicted value of the DV on the x-axis and the residual scores from the regression model on the y-axis.

NOTE: most statistical software packages will allow you to save both the predicted scores and residual scores of a regression model (in SPSS, simply place a checkmark in the desired boxes in the "save" dialogue of the "regression" analysis).
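
For readers working outside SPSS, here is a minimal sketch of the same idea in plain Python. The data and the helper name fit_line are made up for illustration, and the "low half versus high half" variance comparison is only a crude stand-in for looking at the scatterplot, not a formal test.

```python
from statistics import mean, pvariance

def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: one predictor (x) and one DV (y).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.5, 11.4, 14.9, 15.2]

intercept, slope = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]        # "saved" predicted scores
residuals = [yi - pi for yi, pi in zip(y, predicted)]   # "saved" residual scores

# These (predicted, residual) pairs are the points of the scatterplot:
# predicted value on the x-axis, residual on the y-axis.
points = sorted(zip(predicted, residuals))

# Crude numeric check: compare the residual variance in the lower and
# upper halves of the predicted values. Very different values hint that
# the equal-variance assumption may not hold.
half = len(points) // 2
low_var = pvariance([r for _, r in points[:half]])
high_var = pvariance([r for _, r in points[half:]])
print(f"residual variance, low predicted values:  {low_var:.3f}")
print(f"residual variance, high predicted values: {high_var:.3f}")
```

If the two variances are in the same ballpark, the plot would likely show an even band of points around zero; a funnel shape in the plot would show up here as one variance being much larger than the other.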

Conversely, if the residuals are being analyzed in an effort to control for confounds, the residuals of one regression model (a model with the variable/variables that are to be controlled as predictors) might be used as the DV in another model (a model that would feature your target variable of interest as the predictor).
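
The two-step approach described above can be sketched as follows. The variable names and numbers are hypothetical, and this follows the description literally (only the DV is residualized), so the slope it produces will generally not be identical to the one you would get by entering both predictors in a single model.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def residuals_from(x, y):
    """Residuals (actual minus predicted) from regressing y on x."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical data: a confound to control, a target predictor, and a DV.
confound = [1, 2, 3, 4, 5, 6]
target = [2, 1, 4, 3, 6, 5]
dv = [3.0, 4.5, 7.5, 8.0, 12.0, 12.5]

# Model 1: the variable to be controlled predicts the DV; keep the residuals.
dv_resid = residuals_from(confound, dv)

# Model 2: the residuals from Model 1 become the DV, with the target
# variable of interest as the predictor.
intercept, slope = fit_line(target, dv_resid)
print(f"slope of target on the residualized DV: {slope:.3f}")
```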

In yet another analysis (of the normality of the residuals; also an assumption of regression), residual scores may be examined with a histogram to determine if their distribution is "normal."
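
As a rough illustration of that last check, here is a tiny text histogram in plain Python. The residual values and the bin width are made up; in practice you would use the residuals saved from your own model and whatever plotting tool you prefer.

```python
import math
from collections import Counter

# Hypothetical residual scores saved from a regression model.
residuals = [-2.1, -1.4, -0.9, -0.6, -0.3, -0.1, 0.0,
             0.2, 0.4, 0.5, 0.8, 1.1, 1.6, 2.3]

bin_width = 1.0  # assumed bin width, chosen only for this example

def bin_start(value, width):
    """Lower edge of the bin that a value falls into."""
    return math.floor(value / width) * width

# Count how many residuals land in each bin.
counts = Counter(bin_start(r, bin_width) for r in residuals)

# Print one row per bin; a roughly bell-shaped stack of '#' marks,
# centered near zero, is what a "normal" distribution would look like.
for start in sorted(counts):
    print(f"{start:5.1f} to {start + bin_width:5.1f} | {'#' * counts[start]}")
```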

I hope that is helpful!

 

Tuesday
Dec 28, 2010

What are residuals in regression? How we can find residuals? (Raja, location unknown)

Great question(s), Raja! "Residual" is a buzzword that is often used in statistics, which can make things very confusing if you aren't clear on what it means. Before I go further in answering this question here (which I will), I'll also refer you to a blog post I wrote in October on this very topic, called "Top Ten Confusing Stats Terms Explained in Plain English (#8 Residual)". While I'm happy to address this question here, you might find that the blog post offers more detail and examples that are useful for understanding this complex topic.

That said, here is my attempt to answer this question concisely:

In general, a residual is the difference between the actual value of a dependent variable (DV) and the value of that variable predicted by a statistical model. In the context of a regression, a residual is how far a predicted value (as determined by the predictors in the model) is from the actual value of the dependent variable. This is also called an "error term" (because it represents how much your regression model was in error, in terms of its ability to predict the value of the DV).

In any statistical model, a residual can be thought of in various contexts. For example, each person will have their own "residual" score (which is simply the difference between the actual and predicted value of the DV), while the model as a whole also has a residual score (which represents how much variability in the DV remained "unexplained" by the predictors in the model). A residual score serves several purposes, including: 1) determining the accuracy of your model (how much variability is explained by the model) and 2) testing the assumptions inherent in the regression analysis (such as the assumption that the residuals are normally distributed or that their variance is equal across all levels of the predicted value of the DV).
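
To make both senses of "residual" concrete, here is a small worked sketch in plain Python. The five data points and the helper name fit_line are invented for illustration; the person-level residuals are actual minus predicted, and the model-level figure is summarized as the residual sum of squares and the share of variability explained.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: five people, one predictor (x) and one DV (y).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]

# Each person's residual: actual value of the DV minus the predicted value.
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Model-level residual: variability in the DV left unexplained.
ss_residual = sum(r ** 2 for r in residuals)
ss_total = sum((yi - mean(y)) ** 2 for yi in y)
r_squared = 1 - ss_residual / ss_total  # proportion of variability explained

for yi, pi, ri in zip(y, predicted, residuals):
    print(f"actual={yi}  predicted={pi:.1f}  residual={ri:+.1f}")
print(f"unexplained (SS residual) = {ss_residual:.2f}, R-squared = {r_squared:.2f}")
```

With these made-up numbers the fitted line is predicted = 2.2 + 0.6 * x, the residual sum of squares is 2.4 out of a total of 6, and so the model explains 60% of the variability in the DV.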

Again, please visit the blog linked above for more in-depth information. Good luck!

Monday
Apr 26, 2010

Regression Analysis Question: What is the difference between a mediator and a moderator? (anonymous)

Great Question! The difference is basically parallel to the difference between explaining something and changing something. In statistics, a mediator is a variable that explains the relationship between two other variables. For example, age may be hypothetically related to having a higher income, but that may be explained by age's association with work experience, which may itself be related to a higher income. If work experience accounts for a significant portion of the variability in income that is explained by age, then work experience is a mediator of that relationship (there are different kinds of mediators also, such as full or partial, but we won't get into that here).
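
The age, work experience, and income example can be sketched with three small regressions in plain Python. All numbers and names here are hypothetical, and the sketch only compares coefficients; it does not run any significance test, which a real mediation analysis would need.

```python
from statistics import mean

def cross(a, b):
    """Sum of cross-products of deviations from the means of a and b."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))

# Hypothetical data: age (years), work experience (years), income (thousands).
age = [25, 30, 35, 40, 45, 50, 55, 60]
experience = [2, 8, 10, 18, 20, 27, 31, 36]
income = [32, 41, 45, 58, 60, 72, 77, 85]

# Regression 1: income on age alone (the total effect of age).
c = cross(age, income) / cross(age, age)

# Regression 2: experience on age (does age predict the proposed mediator?).
a = cross(age, experience) / cross(age, age)

# Regression 3: income on age AND experience together (two-predictor OLS).
denom = cross(age, age) * cross(experience, experience) - cross(age, experience) ** 2
c_prime = (cross(experience, experience) * cross(age, income)
           - cross(age, experience) * cross(experience, income)) / denom
b = (cross(age, age) * cross(experience, income)
     - cross(age, experience) * cross(age, income)) / denom

print(f"age -> income, alone:                 {c:.3f}")
print(f"age -> income, controlling for exp.:  {c_prime:.3f}")
print(f"portion carried through experience:   {a * b:.3f}")
```

If the age coefficient shrinks noticeably once experience is in the model, that is the pattern described above: experience is accounting for part of the relationship between age and income. In least-squares regression the total effect splits exactly into the remaining direct part plus the part carried through the mediator (c = c_prime + a * b).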

While a mediator explains, a moderator changes. When the strength of the relationship between two variables is dependent on the value of a third variable, that variable is called a moderator. With respect to moderators, the easiest example is often gender. For example, let's pretend that age was associated with liking of ice cream, but that was only true for boys, while girls' liking of ice cream did not tend to vary with age. In this case, one's gender determines whether their age is related to their liking of ice cream. In an entire sample of people, a researcher might say that knowing an individual's gender would CHANGE the extent to which they are able to predict one's liking of ice cream from their age.
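
Here is a minimal sketch of the ice cream example in plain Python, using invented numbers. It fits the age and liking relationship separately for boys and girls and compares the slopes; the difference between the two slopes is the same quantity that an age-by-gender interaction term would estimate in a single combined model.

```python
from statistics import mean

def fit_slope(x, y):
    """Slope from an ordinary least squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data: age in years and liking of ice cream on a 1-10 scale.
age = [5, 6, 7, 8, 9, 10]
liking_boys = [3.0, 4.2, 4.9, 6.1, 7.0, 8.1]   # rises with age
liking_girls = [6.0, 5.8, 6.1, 5.9, 6.2, 6.0]  # roughly flat across age

slope_boys = fit_slope(age, liking_boys)
slope_girls = fit_slope(age, liking_girls)

# How much the age-liking relationship CHANGES with gender.
slope_difference = slope_boys - slope_girls

print(f"slope for boys:   {slope_boys:.2f}")
print(f"slope for girls:  {slope_girls:.2f}")
print(f"difference:       {slope_difference:.2f}")
```

A large difference between the slopes is the signature of moderation; whether that difference is statistically reliable would still need a proper test of the interaction.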

This topic is so commonly confused that I made a video about it (which also includes information about suppressors)! Check out my video about mediators, moderators, and suppressors HERE! Thanks for the question and keep them coming!