Statistical Procedures

Testing for significant differences of time series

scarico · Posted 08-23-2013 03:18 AM

Hello!

I have two time series of average values, one for a low portfolio and one for a high portfolio.

For example:

Low portfolio:  0.1 0.5 0.4 0.6 0.1 0.5 0.6 .... (average: 0.40)

High portfolio: 1.1 0.4 1.4 0.6 0.2 0.2 0.8 .... (average: 0.67)

I want to test whether the difference (0.67 - 0.40) between the two time series is statistically significant.

Thanks a lot!

Murray_Court · Posted 08-23-2013 04:23 AM

It looks like analysis of variance (ANOVA) is your solution here.

The test makes the following assumptions about your data:

- Observations are independent (has your data been collected properly?)

- Errors are normally distributed

- Both groups have equal response variances

You have given a very small sample. The following program runs a homogeneity-of-variance test, which tells me that the difference in variances between the groups is not significant at conventional levels (only just).

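/* Read the two portfolios side by side, one row per period */
data temp;
   input low high;
   datalines;
0.1 1.1
0.5 0.4
0.4 1.4
0.6 0.6
0.1 0.2
0.5 0.2
0.6 0.8
;
run;

/* Low portfolio becomes group 1 */
data temp_1;
   team=1;
   set temp (keep=low rename=(low=score));
run;

/* High portfolio becomes group 2 */
data temp_2;
   team=2;
   set temp (keep=high rename=(high=score));
run;

/* Stack both groups: one row per observation */
data temp;
   set temp_1 temp_2;
run;

/* One-way ANOVA; HOVTEST requests the homogeneity-of-variance test */
proc glm data=temp;
   class team;
   model score=team;
   means team / hovtest;
run;
quit;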

With such a small sample size it is very hard to say whether the errors are normally distributed.

Having done our best to verify the assumptions (if you have more data you will be able to do this properly: plot a histogram of the residuals, that is, the differences between each observation and its group mean, and check that it looks like a bell curve), we can now proceed with the analysis of variance itself, which is also included in the output of the GLM procedure in the above program.
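
The residual check described in the parentheses could be sketched roughly as follows. This is only an illustration: the data set names temp_long and resid_out and the variable name resid are made up for the example, and with seven observations per group the histogram will not tell you much.

```sas
/* Example values from the question, stacked so that team identifies the portfolio */
data temp_long;
   input team score @@;
   datalines;
1 0.1  1 0.5  1 0.4  1 0.6  1 0.1  1 0.5  1 0.6
2 1.1  2 0.4  2 1.4  2 0.6  2 0.2  2 0.2  2 0.8
;
run;

/* Fit the one-way model and save the residuals (observation minus group mean) */
proc glm data=temp_long;
   class team;
   model score = team;
   output out=resid_out r=resid;   /* resid is a hypothetical variable name */
run;
quit;

/* Histogram with a fitted normal curve, plus formal normality tests */
proc univariate data=resid_out normal;
   var resid;
   histogram resid / normal;
run;
```

Adding a CLASS team statement to the PROC UNIVARIATE step would give one histogram per group, which is closer to what is described above but leaves only seven points in each plot.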

The test yields an F-statistic (the ratio of the variation between the group means to the variation within the groups) of 2.02. The probability of seeing an F-statistic at least this large by chance, assuming there is no difference between the groups, is around 18.1%. This is above the common threshold of 5%, so we cannot conclude from these observations that the two groups are different.
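
For reference, the F-statistic can be reproduced by hand from the seven pairs of values in the program (group means 0.4000 and 0.6714, grand mean 0.5357). The figures below are rounded hand arithmetic, not copied from the GLM output:

```latex
\begin{aligned}
SS_{\text{between}} &= 7(0.4000 - 0.5357)^2 + 7(0.6714 - 0.5357)^2 \approx 0.258, & df &= 1 \\
SS_{\text{within}}  &= 0.280 + 1.254 \approx 1.534, & df &= 12 \\
F &= \frac{SS_{\text{between}}/1}{SS_{\text{within}}/12} \approx \frac{0.258}{0.128} \approx 2.02
\end{aligned}
```

Here 0.280 and 1.254 are the sums of squared deviations from the group mean for the low and high portfolios respectively.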

Adding more observations will help us determine with greater accuracy what is really the case here. With only seven observations per group, the test has little power to detect anything but a large difference.

Hope this helps,

-Murray


scarico · Posted 08-23-2013 04:05 AM

OK, I think it is:

proc ttest data=zyx;
   class xyz;
   var x;
run;

But please correct me if it is wrong.
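
Applied to the example values from the question, that PROC TTEST call might look like the sketch below. The data set and variable names (temp_wide, temp_long, team, score) are placeholders chosen for illustration. For two groups, the pooled t-test is equivalent to a one-way ANOVA (the squared t-statistic equals the F-statistic). The second call is a different test, a paired t-test, and only makes sense if each row holds the two portfolios' values for the same period, which is an assumption here.

```sas
/* Example values from the question, one row per period (assumed) */
data temp_wide;
   input low high;
   datalines;
0.1 1.1
0.5 0.4
0.4 1.4
0.6 0.6
0.1 0.2
0.5 0.2
0.6 0.8
;
run;

/* Long format: team identifies the portfolio, score holds the value */
data temp_long;
   set temp_wide;
   team = 1; score = low;  output;
   team = 2; score = high; output;
   keep team score;
run;

/* Two-sample t-test: reports pooled and Satterthwaite results */
proc ttest data=temp_long;
   class team;
   var score;
run;

/* Paired t-test on the period-by-period differences (high minus low) */
proc ttest data=temp_wide;
   paired high*low;
run;
```

Both versions still treat the observations (or the paired differences) as independent over time.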

SteveDenham · Posted 08-23-2013 10:58 AM

For time series comparisons, I would really look at some of the procedures in SAS/ETS, or at mixed modeling techniques that address autocorrelation. The usual methods offered for this kind of analysis (TTEST, GLM) fail horribly on the assumption that the observations are independent. I would recommend PROC MIXED with the REPEATED statement, or PROC PANEL in SAS/ETS.
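
As a rough sketch of the PROC MIXED route (not a recommendation of this exact specification), the two portfolios can be treated as subjects observed over the same periods, with a first-order autoregressive structure for the errors within each portfolio. The data set name ts_long, the variables team, period, and score, and the choice of TYPE=AR(1) are all assumptions made for illustration; with only seven periods the covariance parameters will be poorly estimated.

```sas
/* Example values from the question; period is an assumed, equally spaced time index */
data ts_long;
   input low high;
   period = _n_;
   team = 1; score = low;  output;
   team = 2; score = high; output;
   keep team period score;
   datalines;
0.1 1.1
0.5 0.4
0.4 1.4
0.6 0.6
0.1 0.2
0.5 0.2
0.6 0.8
;
run;

/* Keep each portfolio's observations together and in time order */
proc sort data=ts_long;
   by team period;
run;

/* Compare portfolio means while allowing AR(1) correlation over time within a portfolio */
proc mixed data=ts_long;
   class team period;
   model score = team / solution ddfm=kr;
   repeated period / subject=team type=ar(1);
run;
```

The test for team in the "Type 3 Tests of Fixed Effects" table is then the comparison of the two portfolio means under that correlation structure.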

Steve Denham
