Missing Data/Imputation Discussion > SPSS 19 Repeated Measures analysis with missing data
Hi Ellen! Thanks for the great question. What you are seeing is called "listwise deletion", and it is the default for most analyses in SPSS. Listwise deletion means that the analysis only uses cases that have valid values for every variable in the analysis. Some analyses have an option to switch to pairwise deletion, which uses all available data for each pair of variables, but unfortunately I don't think repeated measures GLM is one of them.
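To make the difference concrete, here is a minimal sketch in plain Python (not SPSS syntax). The toy data and the variable names t1 to t3 are made up for illustration; None stands in for a missing value.

```python
# Toy data: one row per participant; None marks a missing value.
# The ids and the variable names t1..t3 are hypothetical.
rows = [
    {"id": 1, "t1": 4, "t2": 5, "t3": 6},
    {"id": 2, "t1": 3, "t2": None, "t3": 5},
    {"id": 3, "t1": 2, "t2": 3, "t3": 4},
    {"id": 4, "t1": None, "t2": 4, "t3": 4},
]
variables = ["t1", "t2", "t3"]


def listwise(rows, variables):
    """Keep only participants with a valid value on every analysis variable."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


def pairwise_n(rows, a, b):
    """Count participants with valid values on both variable a and variable b."""
    return sum(1 for r in rows if r[a] is not None and r[b] is not None)


complete = listwise(rows, variables)
print("listwise N:", len(complete))                      # participants 1 and 3 remain
print("pairwise N (t1, t3):", pairwise_n(rows, "t1", "t3"))  # participants 1, 2 and 3
```

Under listwise deletion a single missing value removes the whole participant, which is why the N drops from 26 to 24 in the original question.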
Normally, my suggestion for dealing with this would be an imputation technique, such as "multiple imputation", but your sample size is too small to use those more complex techniques. With the sample size that you have available, your options are probably "mean replacement", "regression imputation", or keeping the listwise deletion and accepting the sample size of 24. I hope this helps!
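As a rough illustration of mean replacement, here is a short sketch in plain Python. The scores are invented, and the function name mean_replace is hypothetical. It also shows a known side effect: filling in the mean shrinks the variance of the variable.

```python
from statistics import mean, variance


def mean_replace(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Invented scores for one variable; two participants are missing.
scores = [2.0, 4.0, None, 6.0, None]
observed = [v for v in scores if v is not None]
filled = mean_replace(scores)

print(filled)
print("variance of observed values:", variance(observed))
print("variance after mean replacement:", variance(filled))
```

Because every imputed value sits exactly at the mean, the spread is understated, so standard errors from the filled-in data will tend to be too small.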
Hi Jeremy. I have the same problem and I have been struggling to find an answer. I've done some googling and found two other possible solutions. I wonder what you think of them.
First, going back to "multiple imputation", here is an example that uses this technique with a relatively small sample size (http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Missing_Data/MissingDataSPSS.html). Do you think this example is OK?
Second, someone suggests that we can use linear mixed models (http://forums.stat.ucla.edu/read.php?7,317,317). I only know that mixed models are models that incorporate random effects; what do they have to do with missing data? Put another way, what is the relationship between random effects and missing data in a repeated measures dataset?
Thank you very much!
Kingchi
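One way to see the connection, as a sketch only: mixed models are usually fitted to long-format data (one row per participant per measurement occasion), so a participant who is missing one occasion still contributes the rows that were observed, whereas repeated measures GLM works on wide data and drops the whole participant. The estimates still rely on an assumption about why the data are missing (typically missing at random). The toy data and the helper name to_long below are hypothetical.

```python
# Wide format: one row per participant, one column per time point.
# Participant 2 is missing the second measurement (None).
wide = [
    {"id": 1, "group": "A", "t1": 4, "t2": 5},
    {"id": 2, "group": "B", "t1": 3, "t2": None},
]


def to_long(wide_rows, time_vars):
    """Reshape to one row per participant per observed time point."""
    long_rows = []
    for r in wide_rows:
        for t in time_vars:
            if r[t] is not None:  # skip only the missing measurement
                long_rows.append(
                    {"id": r["id"], "group": r["group"], "time": t, "score": r[t]}
                )
    return long_rows


long_rows = to_long(wide, ["t1", "t2"])
for row in long_rows:
    print(row)
```

Participant 2 keeps one row in the long data, so a mixed model fitted to it uses that observation instead of discarding the participant.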
Hi Jeremy, one more follow-up question. I've tried multiple imputation and it generated five datasets. The problem comes when I run the statistical analysis. Usually SPSS generates a "pooled" coefficient table if you run a regression. But what I need is a mixed-design ANOVA (2x2 plus some other between-subject factors/covariates), and SPSS doesn't provide pooled statistics for it. What should I do? Thank you very much!
Hi, sorry, I just realized there is an entire thread about my last post. Is it still true that for SPSS I have to install a macro to generate pooled F-/p-statistics? Could you please provide some references on how we should calculate the pooled stats? I guess if this approach is not standard, reviewers will ask for references... Thanks a million!
I would typically use R to do a multiple imputation analysis and then use the R package manual as my reference. A few good packages for multiple imputation analysis in R include "mi" and "Zelig". I often use "mi" to do the actual imputation and then use Zelig to run the analysis on the imputed data (it creates the pooled stats for you).
Here is a link for Zelig:
http://cran.r-project.org/web/packages/Zelig/index.html
Here is a link for the "mi" package:
http://cran.r-project.org/web/packages/mi/index.html
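For a single estimate such as one regression coefficient, the pooling step can be sketched with Rubin's rules. This is plain Python, not the API of "mi" or Zelig, and the numbers are invented. Note that it pools one estimate and its standard error only; pooling F-statistics from an ANOVA needs a different procedure that is not shown here.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one estimate across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)                        # average of the m estimates
    within = mean([se ** 2 for se in std_errors])   # average sampling variance
    between = variance(estimates)                   # variance between imputations
    total = within + (1 + 1 / m) * between          # total variance
    return pooled, sqrt(total)


# Invented coefficient and standard error from each of five imputed datasets.
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
std_errors = [0.5, 0.5, 0.5, 0.5, 0.5]

pooled_est, pooled_se = pool_rubin(estimates, std_errors)
print("pooled estimate:", round(pooled_est, 3))
print("pooled standard error:", round(pooled_se, 3))
```

The pooled standard error is larger than any single-dataset standard error because the between-imputation variance reflects the uncertainty added by the missing values.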
I have a data set in Excel with 26 participants and five ordinal factors. One ordinal factor has missing data for two participants. I imported all the data into SPSS and ran a repeated measures analysis with a within-subjects factor (five samplings over time) and a between-subjects factor (two different types) for all five ordinal factors. The output shows that SPSS drops all five factors for the two participants who have a missing value on one factor, so I got a total N of 24. I tried entering the missing values as 9999 and defining 9999 as a missing value in SPSS, but that didn't work. What I had to do was run the analysis on four factors, which gave me a total N of 26, and then run a separate analysis in a separate data file for just the one factor. I wanted to do the analysis on the data set containing all five factors. Any suggestions or help would be appreciated.