DATA Step, Macro, Functions and more

Multiple by variables in proc means

Accepted Solution Solved
Reply
Occasional Contributor mgm
Occasional Contributor
Posts: 10
Accepted Solution

Multiple by variables in proc means

I would like to rank across multiple  grouping variables independently and then combine the results. Is there an easy way to do it (outside of using macros) .

proc RANK data = a ; 
by var1;
output out  = t1 mean(ret);
run;

proc RANK data = a ; 
by var2;
output out  = t2 mean(ret);
run;

and then combine t1 and t 2 . ... 

Accepted Solutions
Solution
‎09-16-2016 10:52 AM
Super User
Posts: 10,500

Re: Multiple by variables in proc means

The question may be do you want dataset that feeds into some other analysis or is this a report?

if a report some thing like this perhaps:

data have;
infile datalines;
input rownum Rank1 Rank2	Var1;
datalines;
1 1 3 1.2
2 1 4 1.1
3 2 5 0.9
4 3 1 0.766667
5 4 2 0.616667
6 4 1 0.466667
7 4 2 0.316667
8 5 2 0.166667
9 2 1 0.016667
10 4 4 -0.13333
11 5 5 -0.28333
12 4 5 -0.43333
13 3 2 -0.58333
;
run;

data trans (keep=Var1 RankVar RankVal);
   set have;
   array r rank: ;
   do _i_ = 1 to dim(r);
      if not missing (r(_i_)) then do;
         Rankvar = cat('Rank',_i_);
         RankVal = r(_i_);
         output;
      end;
   end;
run;

proc tabulate data=trans;
   class RankVar RankVal;
   var Var1;
   table RankVal='',
         RankVar='Rank Variable'*Var1*mean
         /box='Ranks'
   ;
run;



View solution in original post


All Replies
Super User
Super User
Posts: 7,401

Re: Multiple by variables in proc means

Please post test data (as a datatstep) an drequired output, it really helps illustrate what you want.  I imageine its something along the lines of:

VAR1    V1RANK  VAR2  V2RANK

ABC      1              DEF    1

ABC      1              RFG    2

GHJ      2              XYZ     1

?

 

If so you can sort the data, and retain a count for each by group - in the above case:

data want;
  set have;
  retain v1rank v2rank;
  by var1 var2;    /* Assumes sorted */
  if first.var1 then v1rank=sum(v1rank,1);
  if first.var2 then v2rank=sum(v2rank,1);
run;
Occasional Contributor mgm
Occasional Contributor
Posts: 10

Re: Multiple by variables in proc means

Sorry I botched up my request earlier.  

 

Here is the sample data : 

rownumRank1Rank 2Var1
1131.2
2141.1
3250.9
4310.766667
5420.616667
6410.466667
7420.316667
8520.166667
9210.016667
1044-0.13333
1155-0.28333
1245-0.43333
1332-0.58333

 

 Output : 

RanksAverage of Var1 by RAnk1Average of Var1 by RAnk2
11.150.417
20.460.129
30.091.2
40.170.483
5-0.060.061

Proc rank should read proc means earlier.

Super Contributor
Posts: 305

Re: Multiple by variables in proc means

hello,

 


data have;
infile datalines;
input rownum Rank1 Rank2	Var1;
datalines;
1 1 3 1.2
2 1 4 1.1
3 2 5 0.9
4 3 1 0.766667
5 4 2 0.616667
6 4 1 0.466667
7 4 2 0.316667
8 5 2 0.166667
9 2 1 0.016667
10 4 4 -0.13333
11 5 5 -0.28333
12 4 5 -0.43333
13 3 2 -0.58333
;

proc means data=have ;
class Rank1 Rank2;
Var  Var1;
ways 1;
output out=int(drop=_type_ _freq_) mean()=;
run;

data want;

set int(drop=rank2 rename=(var1=Average_of_Var1_by_RAnk1 rank1=rank) where=(missing(rank)=0) );
set int(drop=rank1 rename=(var1=Average_of_Var1_by_RAnk2 rank2=rank) where=(missing(rank)=0) );

run;
Occasional Contributor mgm
Occasional Contributor
Posts: 10

Re: Multiple by variables in proc means

 

Thanks it seems to work

 

But I was wondering if there is a scalable solution ? I have almost  20 Class variables So I would have to repeat this statement 20 times with a drop list that doesn't contain one variable at a time... 

set int(drop=rank2 rename=(var1=Average_of_Var1_by_RAnk1 rank1=rank) where=(missing(rank)=0) );

 

 

Solution
‎09-16-2016 10:52 AM
Super User
Posts: 10,500

Re: Multiple by variables in proc means

The question may be do you want dataset that feeds into some other analysis or is this a report?

if a report some thing like this perhaps:

data have;
infile datalines;
input rownum Rank1 Rank2	Var1;
datalines;
1 1 3 1.2
2 1 4 1.1
3 2 5 0.9
4 3 1 0.766667
5 4 2 0.616667
6 4 1 0.466667
7 4 2 0.316667
8 5 2 0.166667
9 2 1 0.016667
10 4 4 -0.13333
11 5 5 -0.28333
12 4 5 -0.43333
13 3 2 -0.58333
;
run;

data trans (keep=Var1 RankVar RankVal);
   set have;
   array r rank: ;
   do _i_ = 1 to dim(r);
      if not missing (r(_i_)) then do;
         Rankvar = cat('Rank',_i_);
         RankVal = r(_i_);
         output;
      end;
   end;
run;

proc tabulate data=trans;
   class RankVar RankVal;
   var Var1;
   table RankVal='',
         RankVar='Rank Variable'*Var1*mean
         /box='Ranks'
   ;
run;



Super User
Posts: 9,681

Re: Multiple by variables in proc means

I think you need make a macro like CALL EXECUTE():


data have;
infile datalines;
input rownum Rank1 Rank2	Var1;
datalines;
1 1 3 1.2
2 1 4 1.1
3 2 5 0.9
4 3 1 0.766667
5 4 2 0.616667
6 4 1 0.466667
7 4 2 0.316667
8 5 2 0.166667
9 2 1 0.016667
10 4 4 -0.13333
11 5 5 -0.28333
12 4 5 -0.43333
13 3 2 -0.58333
;
run;
proc sql;
create table temp as
 select rank1 as r,mean(var1) as mean,'rank1' as id length=20
  from have
   group by rank1
union all
 select rank2 as r,mean(var1) as mean,'rank2'
  from have
   group by rank2
order by r;
quit;
proc transpose data=temp out=want;
 by r;
 id id;
 var mean;
run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 354 views
  • 1 like
  • 5 in conversation