BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AD
Calcite | Level 5 AD
Calcite | Level 5

Suppose I've a got a dataset with a some "V" variables (V1 - V3) that each can take on values A, B, or C (which I call "Options").  I've also got avariable called "Score" that is numeric. Here is an example of the data:

Data new;

Input V1$ V2$ V3$ Score;

Cards;

A B A 24

C C B 66

B A C 25

C A B 40

;

Run;

I want to compute the average "Score" for all those who have a value of A on V1, a value of B on V1, a value of C on V1, a value of A on V2, a value of B on V2, a value of C on V2, a value of A on V3 a value of B on V3, and a value of C on V3, and I want to save all this information in a single dataset called "Final".

The following code works and gets me the output dataset in the format I want:

%Macro OptionStats;
   %Do i = 1 %To 3;
     Proc Sort Data=new; By V&i; Run;
     Proc Means Data=New Mean Noprint; Var Score;  By V&i; 
       Output Out=Score_V&i (Drop=_TYPE_ _FREQ_)  Mean()=;
     Run;
     Data Score_V&i; Set Score_V&i; Rename V&i=Option Score=V&i; Run;
   %End;
%Mend;
%OptionStats;

Data Final;  Merge Score_V1 Score_V2 Score_V3;  Run;

   

My question is whether there is a better, more efficient way to do this.  In the end, I will have a large dataset with hundreds of "V" variables (instead of just 3) and thousands of rows (instead of just 4), and it seems like it will be slow to have it go through an iterative process with a Proc Sort and a Proc Means for each V variable.

Any ideas?

1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

Maybe you should test it.  I think the OP wants 1 ways for many CLASS variables that should not "exceed" the limits maybe.  Otherwise just transpose and do the means by _NAME_ and level.

View solution in original post

4 REPLIES 4
data_null__
Jade | Level 19

Maybe this will work.

Data new;
   Input V1$ V2$ V3$ Score;
   Cards;
A B A 24
C C B 66
B A C 25
C A B 40
;
  
Run;
proc summary data=new missing chartype;
  
class v:;
   ways 1;
  
output out=ways1 mean(score)=;
   run;
ballardw
Super User

With "hundreds" of V variables the combinations will likely exceed the combinations that Proc Summary or means will handle. You may want to examine the CLASSDATA= procedure option to identify the the combinations of V variables that you want output. It may still take a couple of passes through the data depending on the numbers of combinations you request but you would only need to change the Classdata data set and output set.

data_null__
Jade | Level 19

Maybe you should test it.  I think the OP wants 1 ways for many CLASS variables that should not "exceed" the limits maybe.  Otherwise just transpose and do the means by _NAME_ and level.

AD
Calcite | Level 5 AD
Calcite | Level 5

Thanks to you both for your helpful suggestions.  As it turns out, the data comes out of the query in a different format (untransposed), then it is transposed into the format described in my question above for some other additional analyses.  So it made by far the most sense to simply do the Proc means by _NAME_ and level as a separate step before the data was transposed the first time.

Thanks again for the advice...

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 504 views
  • 6 likes
  • 3 in conversation