Programming the statistical procedures from SAS

How to do a trend or regression analysis if mean and standard deviation values are known

New Contributor
Posts: 3

Dear,

I would like to do a trend or regression analysis. The following data are known:

year     average      standard deviation
1996     200          10
1997     210          15
1998     220          15

etc.

I'm familiar with the normal proc reg analysis.

All I want to do now is to see whether there is a significant trend, but first I need to know how to read in these data (average, standard deviation).

Thank you in advance,

BJ

Trusted Advisor
Posts: 1,418

Re: How to do a trend or regression analysis if mean and standard deviation values are known

Ordinary least squares, as performed by PROC REG, will still provide you with the least squares estimate of the slope and intercept.

The t-tests and F-tests will not be correct, as they apply only to the case where the errors are i.i.d. normally distributed.

If you know the standard deviation of each data point, then you could use weighted least squares to obtain statistical tests.
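As a rough sketch of what weighted least squares does in this setting: write $x_i$ for the year, $\bar{y}_i$ for the yearly average and $s_i$ for its standard deviation (this notation is introduced here for illustration). The sketch assumes $s_i$ is the standard deviation of the average itself, which has not been confirmed.

```latex
% Weighted least squares criterion with inverse-variance weights
\min_{\beta_0,\,\beta_1} \; \sum_i w_i \,\bigl(\bar{y}_i - \beta_0 - \beta_1 x_i\bigr)^2,
\qquad w_i = \frac{1}{s_i^2}

% Resulting slope and intercept, using weighted means
\hat{\beta}_1 = \frac{\sum_i w_i \,(x_i - \bar{x}_w)(\bar{y}_i - \bar{y}_w)}
                     {\sum_i w_i \,(x_i - \bar{x}_w)^2},
\qquad
\hat{\beta}_0 = \bar{y}_w - \hat{\beta}_1 \bar{x}_w,
\qquad
\bar{x}_w = \frac{\sum_i w_i x_i}{\sum_i w_i},
\quad
\bar{y}_w = \frac{\sum_i w_i \bar{y}_i}{\sum_i w_i}
```

If instead $s_i$ is the standard deviation of $n_i$ raw observations within a year, the variance of the average is $s_i^2 / n_i$, and the corresponding weight would be $w_i = n_i / s_i^2$.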

By the way, when you say average of 200 and standard deviation of 10, what is this the average and standard deviation of?

New Contributor
Posts: 3

Re: How to do a trend or regression analysis if mean and standard deviation values are known

Thank you for the quick answer!

It was just a hypothetical example. The goal is to analyse the trend in air quality impact (expressed in disability-adjusted life years, or DALYs).

I'm familiar with PROC REG, which I used many years ago for my PhD, but at that time we always started from the raw data.

Now I only have the average and the standard deviation. How do you put these into SAS?

Thank you,

J

Respected Advisor
Posts: 2,655

Re: How to do a trend or regression analysis if mean and standard deviation values are known

Try this (untested code) working from your example:


data have;
  input year average standard_deviation;
  cards;
1996     200                         10
1997     210                         15
1998     220                         15
...
...
...
;

data want;
  set have;
  /* Makes the weight proportional to the reciprocal of variance, so the estimates are BLUE */
  wt = 1/(standard_deviation*standard_deviation);
run;

proc reg data=want;
  model average=year;
  weight wt; /* WEIGHT takes the variable name directly, with no "=" */
run;

Steve Denham

Message was edited by: Steve Denham

Trusted Advisor
Posts: 1,418

Re: How to do a trend or regression analysis if mean and standard deviation values are known

"It was just a hypothetical example. Goal is to analyse the trend analysis of air quality impact (expressed in disability adjusted life years or DALYs).

I'm familiar with the PROC REG procedure which I used many years ago for my PhD but in that time we always started from the raw data"

I'm not convinced that this standard deviation that you are describing is meaningful here (or perhaps it is indeed meaningful, but in ways that are not what you are implying).

The condition needed to generate valid t-tests and F-tests is that the ERRORs around the regression line are independent and identically distributed as a normal distribution. The condition has NOTHING to do with the standard deviation of the air quality impact numbers during the year.

Thus, I'm also skeptical that the weighted regression is appropriate here, and so I would not (yet) recommend using the code above.

In fact, I would advise fitting the regression to the averages (ignoring the standard deviations), and then plotting the residuals in any one of a number of ways to see if they are normally distributed, and to see if they (somewhat) systematically get larger or smaller as the average increases.
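The residual check described above could be sketched roughly as follows. This assumes a dataset named have with variables year and average, as created earlier in the thread, holding the full series of years; the output dataset name resid and the variable names residual and predicted are made up for illustration. With only the three example rows, which lie exactly on a straight line, the residuals are all zero and these diagnostics say nothing.

```sas
/* Unweighted fit to the yearly averages; save residuals and fitted values */
proc reg data=have;
  model average = year;
  output out=resid r=residual p=predicted;
run;
quit;

/* Normality tests and a normal Q-Q plot of the residuals */
proc univariate data=resid normal;
  var residual;
  qqplot residual / normal(mu=est sigma=est);
run;

/* Residuals against fitted values: look for spread that grows or shrinks */
proc sgplot data=resid;
  scatter x=predicted y=residual;
  refline 0 / axis=y;
run;
```

A fan shape in the last plot would be one sign that a weighted fit is worth considering.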

Now, it may be true that a weighted regression is needed, because the ERRORs are not i.i.d. normal, but nowhere has that claim been made or implied.

Esteemed Advisor
Posts: 7,056

Re: How to do a trend or regression analysis if mean and standard deviation values are known

One correction to Steve's suggested code: in the weight statement, change the '=' sign to a space.

Here is a more brute-force solution that will provide the same parameter estimates, but possibly more meaningful numbers in the rest of the output:

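data have;
  input year average sd;
  cards;
1996 200 10
1997 210 15
1998 220 15
;

/* Ten placeholder scores; PROC STANDARD rescales them to each year's mean and std */
data base;
  input score;
  cards;
1
2
3
4
5
6
7
8
9
10
;

/* Write one block of generated code per year to a temporary file */
filename doit temp;

data _null_;
  file doit;
  set have;
  /* Rescale the base scores to this year's mean and standard deviation */
  stmt=catx(' ','proc standard data=base mean=',average,' std=',sd,
              ' out=stndized;run;');
  put stmt;
  /* Tag the rescaled scores with the year */
  stmt=catx(' ','data stndized; retain year',year,'; set stndized;run;');
  put stmt;
  /* Stack this year's scores onto WANT */
  stmt=catt('proc append base=want data=stndized;run;');
  put stmt;
run;

/* Run the generated code, then fit the regression to the simulated scores */
%include doit;

proc reg data=want;
  model score=year;
run;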

Respected Advisor
Posts: 2,655

Re: How to do a trend or regression analysis if mean and standard deviation values are known

Thanks, Art.  I should just put the hex on cut and paste...

Steve Denham

Respected Advisor
Posts: 3,773

Re: How to do a trend or regression analysis if mean and standard deviation values are known

You may be able to adapt this one-way ANOVA example to your application.

25020 - One-way ANOVA on summary data

New Contributor
Posts: 3

Re: How to do a trend or regression analysis if mean and standard deviation values are known

Thank you for the update. I'll play around with the data today!

Kind regards,

J
