Non-parametric Analysis Discussion > Distribution Free Tests
Greetings Pooven!
You pose a FANTASTIC question, although an extremely complex one. There are many types of models that have relaxed assumptions about normality (or sometimes no assumptions about normality at all), but determining which would be applicable to you would require a thorough examination of your data, including the nature of your distributions (if not normal). Most of these models involve structural equation modeling techniques, which you can conduct in software such as Mplus, AMOS, and/or LISREL. I prefer Mplus, but AMOS has the most user-friendly graphical interface.
Another thing to consider is transformations (e.g. log transformations, square root transformations, square transformations, inverse transformations, etc.). Can your data be transformed to approximate normality? If so, you could then use more typical time series analyses (such as growth modeling, survival analysis, etc., depending on the specific nature of your research questions and data).
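As a rough illustration of what I mean (a sketch only, assuming Python with NumPy/SciPy is available; the Weibull draw is just a stand-in for your own series), you could compare skewness and a normality test before and after each candidate transformation:

```python
import numpy as np
from scipy import stats

# Stand-in for your positive-valued series; replace with your own data
y = np.random.weibull(1.5, size=500) * 10

candidates = {
    "raw": y,
    "log": np.log(y),        # requires strictly positive values
    "sqrt": np.sqrt(y),
    "inverse": 1.0 / y,      # requires non-zero values
}

for name, z in candidates.items():
    stat, p = stats.shapiro(z)   # Shapiro-Wilk test of normality
    print(f"{name:8s} skew = {stats.skew(z): .2f}   Shapiro-Wilk p = {p:.3f}")
```

A larger Shapiro-Wilk p-value (and skewness closer to zero) after a transformation would suggest that transformation brings your data closer to normality.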
With regard to testing for differences between regression models, I'm not sure I completely understand what you're asking. For the purposes of analyzing time series data, I don't think regression would be my first choice, as I'd prefer an analysis that explicitly models time and covariance over time. In that circumstance, there is no need to compare different regressions. Am I misunderstanding your question?
I hope this is helpful, and feel free to follow up with additional questions. Also, if you'd prefer more in-depth assistance, you can consider consultation services; in that case, you can complete a service request form HERE.
Dear Jeremy,
Thank you for replying. I haven't properly considered transformations, and it certainly is worth investigating. From various research papers, it seems my data comes from a population that follows a Weibull distribution; the data is skewed to the left.
Perhaps I've misled myself and I can't really use hypothesis testing. I was reading an example from Probability and Statistical Inference by Hogg & Tanis where they described a process with a failure rate of p = 0.06 and a new process with some other failure rate. They discussed which values of p would lead us to conclude that the new process is actually an improvement; this was the introduction to hypothesis testing.
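As I understood it, that example boils down to something like the following (my own sketch, not from the book; the counts are invented for illustration):

```python
from scipy import stats

p0 = 0.06             # failure rate of the old process
failures, n = 7, 200  # invented: failures observed with the new process

# One-sided test: is the new failure rate below 0.06?
result = stats.binomtest(failures, n, p=p0, alternative="less")
print(result.pvalue)  # a small p-value would suggest a real improvement
```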
So I have several models that try to predict the next step in the time series, and they all appear to give me roughly the same accuracy. I therefore believe that, despite the variations, there isn't a significant performance difference between the models, but is this just a chance occurrence? How can I confidently make any statement about this? I thought hypothesis testing might be able to give me a definite answer with a certain level of confidence. I suspect that taking the mean absolute difference won't work, so what I've done is counted the number of times the model was successful (when the difference between the actual and predicted value is < 1), which gives me a rough probability of success based on no other evidence.
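Roughly, here is what I'm doing (a sketch with placeholder arrays standing in for my real series and predictions; I count a prediction as a success when its error is below 1):

```python
import numpy as np

# Placeholders: the actual series and the predictions from two models
actual = np.random.weibull(1.5, size=300) * 10
pred_a = actual + np.random.normal(0.0, 1.2, size=300)
pred_b = actual + np.random.normal(0.0, 1.3, size=300)

# A prediction counts as a success when its absolute error is below 1
success_a = np.abs(actual - pred_a) < 1
success_b = np.abs(actual - pred_b) < 1

# Rough probability of success for each model
print(success_a.mean(), success_b.mean())
```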
Can I use hypothesis testing to determine whether the performance difference is a chance occurrence or likely to hold in general?
I'm sorry Pooven, but I'm afraid I don't completely understand the question. I think what you are referring to is a correction of your p-value to guard against "Type I" error, which can occur when you are doing repeated analyses or several iterations of analysis. For example, if I am doing 100 analyses with an alpha of .05, then 5 of those analyses are likely to be found significant purely by chance. To deal with this, we can use corrections (such as Bonferroni). I'm not sure that I understand your question, though, so let me know if it seems like I'm confused.
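To make the Bonferroni idea concrete (a quick sketch in Python; the p-values are invented):

```python
import numpy as np

alpha = 0.05
p_values = np.array([0.004, 0.012, 0.030, 0.047, 0.210])  # invented results from 5 tests

# Bonferroni: compare each p-value against alpha divided by the number of tests
adjusted_alpha = alpha / len(p_values)
print(adjusted_alpha)             # 0.01
print(p_values < adjusted_alpha)  # which results survive the correction
```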
Dear Jeremy,
I have several non-linear regression models (specifically, I'm using a temporal neural network to model a time series). The predictors (the same variable taken at different time steps) follow a non-normal distribution, which doesn't meet the assumptions of the T statistic. What tests can I use to answer the question: is there a significant difference between the performance of the regression models? Performance is measured in several ways; I'm hoping that using the mean absolute error (MAE) is okay for testing.
The T statistic compares the response variable with a certain mean; that is, one regression model is compared with another. Is there a way to compare multiple regression models whose input variables do not follow a normal distribution?
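For instance, would something like a Wilcoxon signed-rank test on the paired absolute errors of two models be an acceptable distribution-free way to do this? A sketch of what I have in mind (placeholder arrays standing in for the real errors):

```python
import numpy as np
from scipy import stats

# Placeholders for the per-observation absolute errors of two models
# evaluated on the same test set
abs_err_a = np.abs(np.random.normal(0.0, 1.2, size=300))
abs_err_b = np.abs(np.random.normal(0.0, 1.3, size=300))

# Paired, distribution-free comparison of the two models' errors
stat, p = stats.wilcoxon(abs_err_a, abs_err_b)
print(stat, p)  # a small p-value would suggest a systematic difference

# For more than two models on the same observations, perhaps something
# like the Friedman test (stats.friedmanchisquare) would apply?
```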
Thank you for your time and consideration.
Kind regards
Pooven