    Sunday, July 4, 2010

    Bonferroni Correction In Regression: Fun To Say, Important To Do...

    The Bonferroni correction is only one way to guard against the bias introduced by repeated hypothesis testing, but it is probably the most common method, and it is definitely the most fun to say. I've come to consider it as critical to the accuracy of my analyses as selecting the correct type of analysis or entering the data accurately. Unfortunately, adjusting for repeated hypothesis tests remains something that researchers often overlook, and the consequences may very well be inaccurate results and misleading inferences. In this Independence Day blog, I'll discuss why the Bonferroni correction should be as important as apple pie on the 4th of July.


    The Bonferroni correction is a procedure that adjusts a researcher's threshold for significance according to how many analyses are being run and how many hypotheses are being tested. Here is an example:

    Let's say that I am seeking to identify what factors are most predictive of one's 4th of July enthusiasm, as measured by a hypothetical continuous scale. To determine this, I might test several potential predictors in a regression model, such as "love of fireworks," "love of apple pie," "enjoyment of being off of work," or good ol' "pride in being an American", along with 16 other potential predictors, for a total of 20 hypothesized predictors.


    Normally we might just toss all of our predictors into a regression and see what we come up with. However, by doing so we'd be overlooking something. When we run a regression we choose an "alpha," and by doing so, choose a percentage of error we are willing to live with. The most commonly accepted amount of error is 5% (as in p < .05). That is to say, we expect that 19 out of 20 significant effects we find will be genuine rather than error. However, now that we've tested 20 different potential predictors, the likelihood of finding at least one erroneous significant effect (purely by random chance) has ballooned to approximately 64% (see Berkeley's stats website for more on the math behind this).
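    That inflation is easy to verify yourself. Here's a quick back-of-the-envelope check in Python (a sketch that assumes the 20 tests are independent):

```python
# Chance of at least one false positive across m independent tests,
# each run at alpha = .05: 1 - (1 - alpha)^m
alpha = 0.05
m = 20  # number of hypothesized predictors

familywise_error = 1 - (1 - alpha) ** m
print(round(familywise_error, 2))  # -> 0.64
```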

    To avoid this inflated likelihood of error, we must use an adjusted p-value to test for significance. To calculate this using Bonferroni's method, we simply divide our desired alpha by the number of hypotheses being tested. In our example, we divide .05 by 20 (.05/20 = .0025), giving us our new threshold of significance (p < .0025) and maintaining our 95% confidence in the set of analyses as a whole (controlling what is known as the family-wise error rate).
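    The arithmetic can be sketched in a few lines (the p-values below are made up purely for illustration):

```python
alpha = 0.05
n_tests = 20  # number of hypothesized predictors

bonferroni_alpha = alpha / n_tests
print(bonferroni_alpha)  # -> 0.0025

# Each predictor's p-value must now fall below .0025 to count as significant.
p_values = [0.001, 0.004, 0.030, 0.200]
significant = [p < bonferroni_alpha for p in p_values]
print(significant)  # -> [True, False, False, False]
```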

    SPSS Screen Example (screenshot of the GLM dialogue)

    For more information about the Bonferroni correction and other approaches to making these adjustments, check out Berkeley's stats site. Despite its simplicity, Bonferroni remains a good option for guarding against inflated family-wise error. Additionally, most modern stats packages offer it as an option in their calculations. SPSS (PASW), for example, offers the Bonferroni adjustment as an option in its General Linear Model (GLM) dialogue (see figure to left). No matter what your method, guarding against the pitfalls of repeated hypothesis testing may save you a lot of time later, trying to explain inexplicable findings that arose by random error.

    Editorial Note: Stats Make Me Cry is owned and operated by Jeremy J. Taylor. The site offers many free statistical resources (e.g. a blog, SPSS video tutorials, R video tutorials, and a discussion forum), as well as fee-based statistical consulting and dissertation consulting services to individuals from a variety of disciplines all over the world.






    Reader Comments (20)

    When doing a stepwise linear regression, do you set the Bonferroni correction to be p<0.05/number of total variables tested, or <0.05/number of variables ultimately included?

    THX

    February 24, 2011 | Unregistered CommenterA

    That is a great question, and one that no one has ever asked! My thought would be to use the number of variables tested, since the repeated testing of variables is where the error can occur, which could then influence whether a variable is even included in the final model. That is a very conservative approach, but it makes sense to me.

    February 24, 2011 | Registered CommenterJeremy Taylor

    Great Page! I have a multiple linear regression with 3 numerical variables and 1 categorical variable. Dummy variables are assigned to the categorical variable, ten in total. Does that mean I have to divide my alpha level with 3 + 10 = 13 variables? Or do I just divide by 4? Thanks!

    March 16, 2011 | Unregistered CommenterPP

    Great question PP, dummy codes are considered one variable in most cases, so I'd probably only divide by 4. Best of luck!

    March 17, 2011 | Registered CommenterJeremy Taylor

    I am doing two multiple hierarchical regression models using the same 7 variables in 3 steps to predict 3 dependent variables. The regression model is also "flipped", as in the steps are ordered differently in the two separate models. What do I divide by to get the corrected alpha level?

    May 2, 2011 | Unregistered CommenterS

    To clarify, you are saying that you have different DVs at each of the three stages of the hierarchy? I don't think that's possible, unless I'm misunderstanding something... Could you please elaborate?

    May 2, 2011 | Registered CommenterJeremy Taylor

    Please excuse my lack of clarity! I will try to be more clear. I am testing the same two models... which have the same independent variables in them, just ordered differently, on three different DVs... in separate analyses. In my earlier description I also forgot to mention the interaction (step 4).


    Each model is run separately for DV1, DV2, and DV3:

    Model 1              Model 2
    Step 1: A            Step 1: A
    Step 2: B, C, D      Step 2: E, F, G
    Step 3: E, F, G      Step 3: B, D, D
    Step 4: H (the interaction)

    Thank you so much for your response!

    May 4, 2011 | Unregistered CommenterS

    Hi S!
    Sorry for the delay in my reply, but I've been on my honeymoon for the past few weeks. In general, to calculate a Bonferroni adjustment you simply divide alpha by the number of simultaneous comparisons. For example, if you have 7 predictors and one interaction (comparing 8 effects), then an adjusted alpha of .05 would be: .05/8 = .00625. However, it is important to remember that an adjustment is not necessary to calculate the significance of the overall model (the significance of R-squared). The adjusted alpha only needs to be used when comparing the individual predictive ability of your IVs.

    For more great information about using Bonferroni Adjustment in Multiple Regression, check out THIS ARTICLE.

    June 2, 2011 | Registered CommenterJeremy Taylor

    Hi,

    I know that PASW can automatically compute the Bonferroni adjustment for General Linear Models. Can it do this for Multiple Linear Regressions or must you do this by hand?

    Also, due to the nature of my analysis, I have a very large number of variables in an initial Spearman's correlation analysis, the results of which I used to determine which variables I was entering into separate grouped multiple (backward) regressions. The significant variables in each of these separate regression analyses were then combined into an overall analysis. Is it appropriate to use a Bonferroni correction in a correlation analysis, or is it best used at the regression stage?

    This is a great resource - thanks!

    July 18, 2011 | Unregistered CommenterRebecca

    Hi Rebecca! First, thanks for your question.

    To my knowledge, there isn't an easy way to produce Bonferroni corrections in SPSS for multiple regression. However, you can adjust the p-value, based on the number of predictors (as I discuss in my Bonferroni blog HERE).

    In terms of your question about correlations, it is absolutely appropriate to use corrections with them. Anytime you are doing multiple comparisons, it is useful. The more comparisons you conduct, the greater the likelihood that some of them will be significant, due to chance alone (unless corrected). I hope this helps!
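    As a sketch of what that looks like for a correlation screen (the variable names and p-values below are invented for illustration; only the division by the number of correlations is the actual method):

```python
# Hypothetical p-values from a correlation screen of 10 candidate
# predictors against one outcome (names and values made up).
screen_pvalues = {
    "love_of_fireworks": 0.0009,
    "love_of_apple_pie": 0.0120,
    "time_off_work":     0.0400,
    "national_pride":    0.0030,
}
n_correlations = 10  # total correlations screened, including ones not shown

corrected_alpha = 0.05 / n_correlations  # 0.005

# Only correlations that survive the corrected threshold move forward.
survivors = [name for name, p in screen_pvalues.items() if p < corrected_alpha]
print(survivors)  # -> ['love_of_fireworks', 'national_pride']
```

    Note that uncorrected, all four of these would have looked "significant" at p < .05.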

    July 20, 2011 | Registered CommenterJeremy Taylor

    Hi there,

    Thanks for an informative site.

    I'm having trouble working what my denominator should be to apply Bonferroni correction for post-hoc tests of an effect I found in a repeated measures ANOVA conducted using PASW's GLM command.

    I have an experiment with two within-subjects factors: width (3 levels) and trial number (1-20). The omnibus ANOVA reveals an interaction of trial number and width, so I've gone on to conduct three independent ANOVAs, each with trial number (1-20) as a within-subjects factor, to ascertain at which width(s) the effect of trial is significant. I presume this is a defensible thing to do post-hoc, because it doesn't seem to be of any theoretical interest to compute paired-sample t tests between each width at every single trial number - essentially what I'm trying to determine is whether there is a significant linear effect across the course of the experiment in each condition. That being the case, however, I'm not sure what to use as the denominator to apply the Bonferroni correction (presuming I need to do this).

    Any clarification you can offer would be very much appreciated.

    Cheers

    Jo

    April 15, 2012 | Unregistered CommenterJo Robertson

    Hey Jo!

    First, thanks for posting! I think there is a post-hoc option in SPSS that will calculate these comparisons (including the Bonferroni correction) for you. Have you explored that already?

    April 17, 2012 | Registered CommenterJeremy Taylor

    Hey, thank you for the great info!

    If I am doing 3 hierarchical multiple regressions, each with a different DV, but with the same IV and control variables, do I need to do a Bonferroni correction? And if so, would it be .05/3 = 0.0167?
    For example: DV = x, y, z; IV = a; control variables = c, d
    HMRA 1 would include x, c, d, and a
    HMRA 2 would include y, c, d, and a
    HMRA 3 would include z, c, d, and a

    Thank you for your help!

    May 7, 2012 | Unregistered CommenterCam

    Cam,

    Thanks for your question! I do think you could make an argument for using a Bonferroni correction here, and it appears to me that your understanding of how to apply it is correct!

    May 8, 2012 | Registered CommenterJeremy Taylor

    Love this blog! But I'm still a bit lost.

    I am doing three binary logistic regressions to test a choice between (1) A and a control, (2) B and a control, and (3) A1 and A2, each regression using a subset of 13 variables: (1) 10, (2) 6, (3) 8. The variables consisted of one variable we were actually testing, plus other variables and covariates that we couldn't control but whose influence on the outcome we needed to account for.

    Test 1 produced significance only for the test variable (p < 0.001) and for none of the random variables or covariates. Test 2 produced, as expected, no significance for any variable. Test 3 produced significance for the test variable only (p = 0.007).

    Is this the kind of experiment that would require a Bonferroni correction?

    Should each regression have its own alpha or should there be an experiment-wide alpha? How would that be calculated?

    In test #1, I am testing 2 populations. If I run separate regressions for each of the populations to look at the influence of variables in that test, would I then have 4 tests and need to change the alpha level again?

    Thanks for any help you can give me.

    Catherine

    May 17, 2012 | Unregistered CommenterCatherine

    Thanks for this wonderful post! A related question, below...

    I'm planning on using a regression model to explain seed set of plants (my response) using some sort of predictor based on temperature. Basically, I have a number of temperature variables calculated from the same set of data (hourly temperatures for July and August, converted to variables such as average temperature, maximum temperature, minimum temperature, degree-days above zero Celsius, degree days above ten Celsius, etc...), and I want to decide which one should be included in my model. I know that I would ideally select one based on "prior knowledge" of the system (e.g. so-called "planned comparisons" or a temperature threshold that is known to be important for the development of seeds), but not much is known about this system.

    I've been warned against testing the significance of multiple predictors using p-values, unless I use Bonferroni correction as you have recommended above. Unfortunately, using Bonferroni correction would result in something like p = 0.05/7 (for seven different temperature variables); a rather small value for detecting anything! I was wondering whether it would be appropriate to instead use likelihood-based techniques (direct comparisons of log-likelihoods or AIC scores) to simply compare a series of models using each of the alternative predictors in turn, and choose the most relevant temperature variable (i.e. predictor) based on that.

    Thoughts on the validity of this approach? Would any adjustments have to be made for multiple comparisons if I used this strategy?

    May 24, 2012 | Unregistered CommenterJason

    Hi Jason,

    While I like your creativity, I'm afraid the dangers of inflating Type I error cannot be avoided by using a different estimator. No matter what kind of model you run, the need for adjustment remains unchanged. I would also be wary of running multiple models using variations of the same variable/construct. Even if prior evidence doesn't compel your choice, I'd make a conceptual rationale for one of the measurements and go with it (i.e., why do you think one way of measuring would be best, based on what you know of the topic?). After all, a wise person once told me that a truly robust finding shouldn't completely disappear based on a nuance of measurement. If your finding is robust, it should emerge no matter which unit of measurement you choose.

    With that said, there are exceptions to that rule, and there are instances when the method of measurement is more critical. But if your case is one of those instances, then it should become clear to you which method of measurement should be used (at the very least, you should be able to narrow it down).

    I hope this helps!

    May 30, 2012 | Registered CommenterJeremy Taylor

    This is a very useful page Jeremy! I'm glad I came across this. I have a question about bonferroni correction that is slightly different from previous posts.

    I need to run numerous linear regression analyses, separately for each of my two groups of subjects as well as for different conditions. Together, I would have to run approximately 40 regression analyses across the 3 conditions for each group. I'm aware that we should correct for multiple comparisons among the regression coefficients within a model. But in my case, should I also do a similar correction for the p-value associated with the regression model itself (i.e., the F test)?

    For example, if in one condition, I have to run 25 regression analyses for each group, would it make sense to divide alpha=.05 by 25?

    I've been going through different websites and textbooks, but can't seem to find an answer. Any suggestions/help would be appreciated!

    March 31, 2013 | Unregistered CommenterJulia

    Since you are increasing the likelihood that you will find a significant effect by random chance as the number of regressions you run increases, I would say that it would be a good idea to do the adjustment.

    Think of it this way: if you run 20 regression models, you are likely to find one significant result by random chance if your alpha is .05.
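    To make that "1 in 20" concrete, here's a small simulation sketch (plain Python, using an approximate critical t-value rather than an exact one). Each run regresses pure noise on pure noise, so every "significant" slope is, by construction, a false positive:

```python
import math
import random

random.seed(0)

def null_regression_significant(n=50):
    """Regress pure noise on pure noise; return True if the slope
    looks 'significant' at alpha = .05 (i.e., a false positive)."""
    x = [random.gauss(0, 1) for _ in range(n)]
    y = [random.gauss(0, 1) for _ in range(n)]
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)           # Pearson correlation
    t = r * math.sqrt((n - 2) / (1 - r ** 2))  # t-statistic for the slope
    return abs(t) > 2.01  # approximate two-tailed critical t for df = 48

trials = 2000
false_positive_rate = sum(null_regression_significant() for _ in range(trials)) / trials
print(false_positive_rate)  # roughly 0.05, i.e. about 1 in every 20 null models
```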

    I hope this helps!

    April 11, 2013 | Registered CommenterJeremy Taylor

    I am using the Bonferroni adjustment for a chi-square comparison in which I have 3 groups and a binary IV, wanting to see which of my three categories are above/below the expected observation (the omnibus chi-square is significant). The question is: since the IV is binary, should my denominator for Bonferroni be 3 or 6? When 6, none of the individual groups are significant, but when 3, one is. Thanks in advance!

    May 13, 2015 | Unregistered CommenterJohn
