Simulation--please help

Occasional Contributor
Posts: 6

Hi, I am new to SAS. I am doing a simulation study with 100 replications, which generate 100 outputs. There are 50 observations in each of the 100 outputs. I want to calculate the mean of each observation over the 100 replications. How can I do this?

Super User
Posts: 5,516

ZhenLi,

Many posters could solve this, with a little more information.

Do you already have 100 SAS data sets?  As a follow-up question, would it be easy for you to assemble them into a single SAS data set, with a new variable that takes on values from 1 to 100 indicating the "replication"?

For each variable, do you want to get the mean separately for each replication, and then get the mean of those 100 means?

Occasional Contributor
Posts: 6

Posted in reply to Astounding

Hi,

Thank you.

I already have 100 outputs; below are the first 10 of the 50 observations in one output. I need to calculate the mean of each observation over the 100 outputs, for each of 11 variables.

1  0.27189  -1.63333  0.21903  0.348507 -1.250858  0.319263 -1.296945  0.343462 -1.216949  0.352209 -1.198076

2  0.36986  -1.48808  0.19807  0.474084 -1.137540  0.434303 -1.173247  0.467221 -1.101967  0.479120 -1.085949

3  0.33389  -1.32761  0.00000  0.427978 -1.012348  0.392066 -1.036588  0.421783 -0.974936  0.432524 -0.962073

4  0.37094  -2.03541  0.20296  0.475468 -1.564543  0.435571 -1.639363  0.468586 -1.535242  0.480519 -1.508464

5  0.23689  -1.26799  0.22502  0.303644 -0.965835  0.278165 -0.985815  0.299249 -0.927740  0.306869 -0.916049

6  0.51489  -0.60272  0.20248  0.659983 -0.446820  0.604603 -0.419260  0.650429 -0.401102  0.666993 -0.402489

7  0.36227  -1.08792  0.18017  0.464355 -0.825352  0.425391 -0.832464  0.457633 -0.785194  0.469288 -0.777042

8  0.38185  -2.14096  0.20688  0.489453 -1.646889  0.448382 -1.729251  0.482368 -1.618797  0.494652 -1.589944

9  0.31068  -1.13604  0.19791  0.398228 -0.862894  0.364812 -0.873444  0.392463 -0.823286  0.402458 -0.814189

10  0.38501  0.07645  0.21471  0.493503  0.083039  0.452093  0.159133  0.486360  0.136540  0.498745  0.121801

.

Partial code is in the attached file

Super User
Posts: 5,516

ZhenLi,

So far, so good.  But the objective is still a little hazy.

Do you need one mean for each observation in each data set, getting the average of all 11 variables within a single observation?

Do you need 11 means per data set (the mean of each variable for all 50 observations in each data set)?

Describe the formulas you would like to apply.

Good luck.

Occasional Contributor
Posts: 6

Posted in reply to Astounding

Astounding,

Sorry for the confusion.

There are 11 variables and 50 observations in each of 100 outputs.

I need to calculate 11 means for the 11 variables for each of the 50 observations. It will be the mean over 100 outputs for each variable of that observation. 

In my code, I was trying to read the same observation (e.g., observation 1) from all 100 outputs, save those rows in one data file, and then get summary statistics.

I used append but it does not work very well.

Z.

Super User
Posts: 5,516

ZhenLi,

OK, just to make sure that I understand ...

It sounds like you need 550 means in total.

The mean of the first variable, based on 100 observations (1st observation from data set 1, plus first observation from data set 2, ... first observation from data set 100).

The mean of the first variable, based on 100 observations (2nd observation from data set 1, plus second observation from data set 2, ... second observation from data set 100).

The last mean would be:

The mean of the 11th variable, based on 100 observations (50th observation from data set 1, plus 50th observation from data set 2, ... 50th observation from data set 100).

Does this sound right?

Occasional Contributor
Posts: 6

Posted in reply to Astounding

yes. Thank you. Z

Super User
Posts: 5,516

ZhenLi,

OK, here's an approach that is easily adaptable to using macro language, but you'll have to write the macro.

Good luck.

data combine_all_100;

   rownum=0;

   do until (done1);

        set first_data_set end=done1;

        rownum + 1;

        output;

   end;

   rownum=0;

   do until (done2);

       set second_data_set end=done2;

       rownum + 1;

       output;

   end;

   ....

   rownum=0;

   do until (done100);

       set hundredth_data_set end=done100;

       rownum + 1;

       output;

   end;

   stop;

run;

proc means data=combine_all_100;

   var list of numerics excluding rownum;

   class rownum;

run;
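
One possible way to write the macro mentioned above is sketched here. It generates one "rownum=0; do until ... end;" block per data set, so the DATA step it builds has the same shape as the hand-written one. The data set names rep1-rep100, the variable names v1-v11, and the macro names make_reps and combine are assumptions for illustration only; the first macro just creates random test data so the sketch can run on its own.

```sas
/* Hypothetical test data: rep1-rep100, each with 50 rows and v1-v11 */
%macro make_reps(n=100);
   %local r;
   %do r = 1 %to &n;
      data rep&r;
         array v{11};
         do k = 1 to 50;
            do i = 1 to dim(v);
               v{i} = rand('normal');
            end;
            output;
         end;
         drop k i;
      run;
   %end;
%mend make_reps;

/* Build one DATA step that reads each data set in turn.        */
/* rownum restarts at 0 before each data set, so it records the */
/* row position (1-50) within that replication.                 */
%macro combine(n=100, prefix=rep);
   %local r;
   data combine_all_&n;
      %do r = 1 %to &n;
         rownum = 0;
         do until (done&r);
            set &prefix&r end=done&r;
            rownum + 1;
            output;
         end;
      %end;
      stop;
   run;
%mend combine;

%make_reps(n=100)
%combine(n=100, prefix=rep)

/* One mean per variable per row position: 50 x 11 = 550 means */
proc means data=combine_all_100 mean;
   class rownum;
   var v1-v11;
run;
```

With real data, the call to make_reps would be dropped and the prefix= and var names changed to match the actual data sets.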

Regular Contributor
Posts: 241

Here is one way. hth

  /* test data */
  proc plan seed=12345678;
     factors obs=50 ordered rep=100 ordered x=1 of 1000 random;
     output out=sim;
  run;

  /* mean of x over 100 reps for each obs */
  proc means data=sim;
     var x;
     by obs;
  run;
  /* on lst
  obs=1

  The MEANS Procedure

                         Analysis Variable : x

     N            Mean         Std Dev         Minimum         Maximum
  --------------------------------------------------------------------
   100     503.3600000     311.3526864      14.0000000         1000.00
  --------------------------------------------------------------------


  obs=2

                         Analysis Variable : x

     N            Mean         Std Dev         Minimum         Maximum
  --------------------------------------------------------------------
   100     483.4900000     290.8708796       8.0000000     979.0000000
  --------------------------------------------------------------------

  ...

  obs=50

                         Analysis Variable : x

     N            Mean         Std Dev         Minimum         Maximum
  --------------------------------------------------------------------
   100     467.2800000     273.5758747       2.0000000     999.0000000
  --------------------------------------------------------------------
  */
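
To apply the same BY-group idea to 100 separate data sets, they first have to be stacked into one data set that carries an observation index. The following is only a sketch under stated assumptions: the data sets are named rep1-rep100, the variables are v1-v11, and every data set has exactly 50 rows (as described earlier in the thread). Those names, and the make_reps helper that creates test data, are hypothetical.

```sas
/* Hypothetical test data: rep1-rep100, each with 50 rows and v1-v11 */
%macro make_reps(n=100);
   %local r;
   %do r = 1 %to &n;
      data rep&r;
         array v{11};
         do k = 1 to 50;
            do i = 1 to dim(v);
               v{i} = rand('normal');
            end;
            output;
         end;
         drop k i;
      run;
   %end;
%mend make_reps;
%make_reps(n=100)

/* Stack the data sets; rep1 is read first, then rep2, and so on. */
/* Because each one has exactly 50 rows, _n_ identifies both the  */
/* replication and the row position within it.                    */
data sim;
   set rep1-rep100;
   rep = ceil(_n_ / 50);
   obs = mod(_n_ - 1, 50) + 1;
run;

/* BY-group processing needs the data sorted by the BY variable */
proc sort data=sim;
   by obs rep;
run;

/* Save the 11 means for each of the 50 observations in obs_means */
proc means data=sim noprint;
   by obs;
   var v1-v11;
   output out=obs_means(drop=_type_ _freq_) mean=;
run;
```

The output data set obs_means has one row per obs, with v1-v11 holding the means over the 100 replications. If the row counts differ between data sets, the _n_ arithmetic no longer holds and a per-data-set counter would be needed instead.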

Occasional Contributor
Posts: 6

Posted in reply to chang_y_chung_hotmail_com

Hi,

Thank you.

How do I apply it to my question?

Super User
Posts: 10,044

Oh. Boy. You need the third dimension to calculate the mean.

I only create 4 tables for the demo. You need to change it to 100.

data a1(keep=a:);
array a{11};
do k=1 to 50;
do i=1 to dim(a);
 a{i}=ranuni(-1);
end;
 output;
end;
run;
data a2(keep=a: rename=(a1-a11=b1-b11));
array a{11};
do k=1 to 50;
do i=1 to dim(a);
 a{i}=ranuni(-1);
end;
 output;
end;
run;
data a3(keep=a: rename=(a1-a11=c1-c11));
array a{11};
do k=1 to 50;
do i=1 to dim(a);
 a{i}=ranuni(-1);
end;
 output;
end;
run;
data a4(keep=a: rename=(a1-a11=d1-d11));
array a{11};
do k=1 to 50;
do i=1 to dim(a);
 a{i}=ranuni(-1);
end;
 output;
end;
run;




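data want(keep=mean:);
 merge a1-a4;                /* one-to-one merge: row k of every table side by side */
 array _a{*} _numeric_;      /* all 44 merged variables, 11 per table */
 array mean{11};
 do i=1 to 11;
  sum=0;
  do j=i to dim(_a) by 11;   /* variable i from each table */
   sum+_a{j};
  end;
  mean{i}=sum/4;             /* 4 = number of tables merged */
 end;
run;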




Ksharp

Occasional Contributor
Posts: 6

Ksharp, thank you very much. Is it possible to write this as a macro?

PROC Star
Posts: 7,487

You can typically wrap most code within a macro and, since Ksharp's code didn't include a cards or datalines statement, I don't see why you couldn't.  What do you want to pass into the macro?  Just identify those values as parameters in your macro declaration.

Super User
Posts: 10,044

NO. You don't need a macro.

What you need to do is rename the variables in these 100 data sets so that every variable name is unique across all of them.

Then use this code:

data want(keep=mean:);

merge a1-a100;

array _a{*} _numeric_;

array mean{11};

do i=1 to 11;

  sum=0;

  do j=i to dim(_a) by 11;

  sum+_a{j};

  end;

  mean{i}=sum/100;

end;

run;

P.S. a1-a100 are your one hundred data sets.

Ksharp
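
Renaming 1,100 variables by hand is tedious, so one option is to let a short macro generate the rename= data set options inside the MERGE statement. This is only a sketch, not part of the posted solution: it assumes the data sets are named rep1-rep100 and each holds variables v1-v11, and the macro names make_reps and merge_means are made up for illustration. The merged variables are renamed r1_1-r1_11, r2_1-r2_11, and so on, which keeps them in the 11-per-table order that the stride-11 loop relies on.

```sas
/* Hypothetical test data: rep1-rep100, each with 50 rows and v1-v11 */
%macro make_reps(n=100);
   %local r;
   %do r = 1 %to &n;
      data rep&r;
         array v{11};
         do k = 1 to 50;
            do i = 1 to dim(v);
               v{i} = rand('normal');
            end;
            output;
         end;
         drop k i;
      run;
   %end;
%mend make_reps;

/* Merge all data sets row by row, renaming on the way in so that */
/* no variable name collides, then average variable i over tables */
%macro merge_means(n=100, nvars=11);
   %local r;
   data want(keep=mean:);
      merge
      %do r = 1 %to &n;
         rep&r(rename=(v1-v&nvars = r&r._1-r&r._&nvars))
      %end;
      ;
      array _a{*} _numeric_;     /* &n x &nvars merged variables */
      array mean{&nvars};
      do i = 1 to &nvars;
         sum = 0;
         do j = i to dim(_a) by &nvars;
            sum + _a{j};
         end;
         mean{i} = sum / &n;
      end;
   run;
%mend merge_means;

%make_reps(n=100)
%merge_means(n=100, nvars=11)
```

The result has 50 rows and mean1-mean11. Because the sum statement skips missing values while the divisor stays fixed at &n, this sketch assumes there are no missing values in the inputs.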
