DATA Step, Macro, Functions and more

What is the best way to calculate means

Accepted Solution Solved
Reply
Contributor HG
Contributor
Posts: 23
Accepted Solution

What is the best way to calculate means

Hi All,

I have a lot of variables in my dataset.

variables in my dataset:

id w1 w2 w3 h1 h2 h3 age1 age2 ag3 .... plus a lot other variables ...rate1 rate2 rat3.

I need to calculate means of:

mean_w=mean(of,w1-w3);

mean_h=mean(of,h1-h3);

...

mean_rate=mean(of,rate1-rate3);

what is the best way to do it?

Thanks!


Accepted Solutions
Solution
‎08-24-2012 12:38 PM
PROC Star
Posts: 7,468

Re: What is the best way to calculate means

Slight modification of my previous suggestion to account for variable ranges that go beyond 1 digit.  I think this is a lot more direct than dealing with arrays in this case as the functions can deal directly with a variable prefix:

data have;

  input id $ this1-this3 that1-that3 x1-x10 y;

  cards;

1 1 2 1 4 3 2 5 6 7 1 2 3 4 5 6 7 9

2 5 6 7 3 2 4 8 6 7 1 2 3 4 5 6 7 9

;

proc sql noprint;

    select "mean_"||

           strip(name)||

           "=mean(of "||

           strip(name)||

           "Smiley Happy;"

      into :getmeans separated by " "

        from (select distinct substr(name,1,length(name)-

          anyalpha(reverse(strip(name)))+1) as name

            from dictionary.columns

             where libname="WORK" and

                memname="HAVE" and

                type="num" and

                anydigit(substr(name,length(name),1))

                  )

  ;

quit;

data want;

  set have;

  &getmeans.;

run;

View solution in original post


All Replies
Super User
Super User
Posts: 7,039

Re: What is the best way to calculate means

Other than restructuring that data so that the repetitions are rows?

How about arrays?

data have;

  input id w1-w3 h1-h3 ;

cards;

1 1 2 3 10 11 12

2 4 5 6 13 14 15

run;

data want;

  set have;

  array reps (2,3) w1--h3;

  array means mean_w mean_h ;

  do i=1 to dim(means);

    means(i) = mean(of reps(i,1),reps(i,2),reps(i,3));

  end;

run;

Super Contributor
Posts: 1,636

Re: What is the best way to calculate means

Hi Tom,

I like your array solution.

Since the OP stated there are a lot of variables, putting the variables into macro variables may save some typing:

data have;
  input id w1-w3 h1-h3 ag1-ag3 rate1-rate3;
cards;
1 1 2 3 10 11 12 20 21 23 90 91 92
2 4 5 6 13 14 15 20 21 23 90 . 92
;
run;
proc contents data=have out=vars(keep=name varnum ) noprint; run;
proc sql noprint ;
           select cats('mean_',substr(name,1,length(name)-1)) into : names separated by ' '
              from vars
                 where upcase(name) ne 'ID' and substr(name,length(name))='3'
                     order by varnum;
     select name into : allvar separated by ' '
         from vars
            where substr(name,length(name)) in ('1','2','3')
                order by varnum;
quit;

%let n=%sysfunc(countw(&names));

data want;
  set have;
  array reps (&n,3) &allvar;
  array means &names ;
  do i=1 to dim(means);
    means(i) = mean(of reps(i,1),reps(i,2),reps(i,3));
  end;
  drop i;
run;
proc print;run;

PROC Star
Posts: 7,468

Re: What is the best way to calculate means

If none of your variable ranges go beyond a single digit you might be able to do it with the following.  However, you'd have to adjust the code if any go beyond a single digit:

data have;

  input id $ this1-this3 that1-that3 x1-x3 y;

  cards;

1 1 2 1 4 3 2 5 6 7 9

2 5 6 7 3 2 4 8 6 7 9

;

proc sql noprint;

    select "mean_"||

           strip(name)||

           "=mean(of "||

           strip(name)||

           "Smiley Happy;"

      into :getmeans separated by " "

        from (select distinct substr(name,1,length(name)-1) as name

            from dictionary.columns

             where libname="WORK" and

                memname="HAVE" and

                type="num" and

                anydigit(substr(name,length(name),1))

                  )

  ;

quit;

data want;

  set have;

  &getmeans.;

run;

Solution
‎08-24-2012 12:38 PM
PROC Star
Posts: 7,468

Re: What is the best way to calculate means

Slight modification of my previous suggestion to account for variable ranges that go beyond 1 digit.  I think this is a lot more direct than dealing with arrays in this case as the functions can deal directly with a variable prefix:

data have;

  input id $ this1-this3 that1-that3 x1-x10 y;

  cards;

1 1 2 1 4 3 2 5 6 7 1 2 3 4 5 6 7 9

2 5 6 7 3 2 4 8 6 7 1 2 3 4 5 6 7 9

;

proc sql noprint;

    select "mean_"||

           strip(name)||

           "=mean(of "||

           strip(name)||

           "Smiley Happy;"

      into :getmeans separated by " "

        from (select distinct substr(name,1,length(name)-

          anyalpha(reverse(strip(name)))+1) as name

            from dictionary.columns

             where libname="WORK" and

                memname="HAVE" and

                type="num" and

                anydigit(substr(name,length(name),1))

                  )

  ;

quit;

data want;

  set have;

  &getmeans.;

run;

Super Contributor
Posts: 1,636

Re: What is the best way to calculate means

Hi Art,

If I want to see the value of your macro variable &getmeans and add "%put &getmeans;" to your code, is there a way to avoid the error messages in log file?

Thank you!

PROC Star
Posts: 7,468

Re: What is the best way to calculate means

: How about;

options symbolgen;

title "&getmeans." ;

Super Contributor
Posts: 1,636

Re: What is the best way to calculate means

Thank you Art! - Linlin

Contributor HG
Contributor
Posts: 23

Re: What is the best way to calculate means

All worked great.  Thank you very much for your help!!!Smiley Happy

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 340 views
  • 10 likes
  • 4 in conversation