shenflow
Obsidian | Level 7

I want to compare two paired short time series (say of length 10) by comparing their means.

 

However, two significant issues currently prevent me from doing it:

(1) The data are not normally distributed and the sample sizes are very small, so relying on the CLT seems like quite a strong assumption.

 

(2) The data is autocorrelated.

 

Both prevent me from using, e.g., a paired t-test, since the assumptions of normality and i.i.d. observations are violated. I know that I can, for example, perform a non-parametric test such as the Wilcoxon rank-sum test to deal with the first issue. I also know that I can deal with the second issue by, for example, computing the paired t-test with robust standard errors. However, the Wilcoxon rank-sum test still requires independence, and calculating robust standard errors still requires normality.

 

Put differently, I do not know how to deal with both issues at once. I would be grateful if anyone could point me towards a procedure that deals with both issues.

1 ACCEPTED SOLUTION
SteveDenham
Jade | Level 19

Here are some potential ways to think about this:

 

A quick and dirty way to deal with autocorrelation is to apply a difference operator to each series in the hope of inducing stationarity.  You could then look at differences in differences (DiD) to compare the two series, waving your hands to treat the differenced values as iid. Then a straight t test or a Wilcoxon test might have some validity.  You would not be able to say that the two series were different (as they may have level differences), but you could make inferences regarding the shape of the series relative to one another.  However, if the difference between the series is multiplicative, this may lead to an incorrect inference (i.e. the means of the differences do not differ, but the variances are not equal). As an example, consider that series 2 is simply 2*series 1 at each point. A DiD analysis will tell you that the means are not different.  If the series are now stationary, both series of differences should have mean=0.
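To make that pitfall concrete, here is a minimal sketch (in Python with made-up data, rather than SAS) of differencing a pair of series where series 2 is a 2x multiplicative copy of series 1: the means of the differences barely differ, but the variances differ by a factor of 4.

```python
import statistics

def diff(series):
    """First-difference operator: d[t] = y[t] - y[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]

# Hypothetical toy data: series2 is a multiplicative (2x) copy of series1.
series1 = [10, 11, 9, 12, 10, 11, 9, 12, 10, 11]
series2 = [2 * x for x in series1]

d1, d2 = diff(series1), diff(series2)

# The means of the differenced series are both close to 0, so a DiD
# comparison of means finds no real difference ...
print(statistics.mean(d1), statistics.mean(d2))
# ... even though the variances clearly differ (by a factor of 4 here).
print(statistics.variance(d1), statistics.variance(d2))
```

Since differencing a scaled series just scales the differences, any mean-based comparison of the differenced series is blind to multiplicative level changes.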

 

If you had more than 2 series, you might look at PROC COPULA, but once again there is an assumption regarding normality (or in this case multivariate normality).

 

If you had multiple measures at each time point in the series, you could fit a generalized linear mixed model or a GEE model with correlated errors and an appropriate distribution. With only a single measure at each time point, you could only fit a main-effects model using these tools, and would have to assume that the time component was identical for the two series (no interaction).

 

My last suggestion would be to try to bootstrap this, but with only 2 measures at each point, you can't really generate a lot of samples. You probably could bootstrap the differences between the series to get an approximation to the distribution of differences, and see where your sample falls in that distribution.
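A minimal sketch of that bootstrap idea (Python with made-up numbers, not SAS): resample the pointwise differences between the paired series and look at where zero falls in the resulting distribution of means. Note that the plain iid resample shown here ignores the autocorrelation; a block bootstrap (resampling short contiguous blocks) would respect it better.

```python
import random
import statistics

random.seed(1)

# Hypothetical paired series of length 10.
series1 = [10, 11, 9, 12, 10, 11, 9, 12, 10, 11]
series2 = [12, 13, 10, 14, 11, 13, 10, 14, 12, 13]
diffs = [b - a for a, b in zip(series1, series2)]

# Bootstrap the distribution of the mean difference by resampling the
# pointwise differences with replacement (iid assumption -- see caveat above).
boot_means = []
for _ in range(5000):
    resample = random.choices(diffs, k=len(diffs))
    boot_means.append(statistics.mean(resample))

boot_means.sort()
lo, hi = boot_means[int(0.025 * 5000)], boot_means[int(0.975 * 5000)]
print(f"observed mean diff = {statistics.mean(diffs):.2f}")
print(f"bootstrap 95% interval runs from {lo:.2f} to {hi:.2f}")
```

If the interval excludes 0, the observed mean difference is unusual relative to the resampled distribution, which is the kind of "see where your sample falls" check described above.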

 

If none of these seem to fit, then you should follow @Ksharp's suggestion and post this in the Forecasting community, and see what those guys suggest.

 

SteveDenham 


4 REPLIES
Ksharp
Super User

I am not sure. Could you try a TREND analysis?
proc freq;
  table group*time / trend;
  exact .......;
run;

 

 

Or post it in the Forecasting forum, since it is about time series; maybe some ETS guys could point you in the right direction.

shenflow
Obsidian | Level 7

I am not sure how to use this when comparing two time series. Are you suggesting that I should compare the trends of both series? If so, what test statistic would I employ? How do I test the significance of the difference between the trends?

shenflow
Obsidian | Level 7

Moreover, how does the trend of a time series substitute for its mean? (I am interested in comparing the means of two time series.)

