turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- Most efficient way to get a bunch of means

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-31-2013 01:47 PM

Suppose I've a got a dataset with a some "V" variables (V1 - V3) that each can take on values A, B, or C (which I call "Options"). I've also got avariable called "Score" that is numeric. Here is an example of the data:

Data new;

Input V1$ V2$ V3$ Score;

Cards;

A B A 24

C C B 66

B A C 25

C A B 40

;

Run;

I want to compute the average "Score" for all those who have a value of A on V1, a value of B on V1, a value of C on V1, a value of A on V2, a value of B on V2, a value of C on V2, a value of A on V3 a value of B on V3, and a value of C on V3, and I want to save all this information in a single dataset called "Final".

The following code works and gets me the output dataset in the format I want:

%Macro OptionStats;

%Do i = 1 %To 3;

Proc Sort Data=new; By V&i; Run;

Proc Means Data=New Mean Noprint; Var Score; By V&i;

Output Out=Score_V&i (Drop=_TYPE_ _FREQ_) Mean()=;

Run;

Data Score_V&i; Set Score_V&i; Rename V&i=Option Score=V&i; Run;

%End;

%Mend;

%OptionStats;

Data Final; Merge Score_V1 Score_V2 Score_V3; Run;

My question is whether there is a better, more efficient way to do this. In the end, I will have a large dataset with hundreds of "V" variables (instead of just 3) and thousands of rows (instead of just 4), and it seems like it will be slow to have it go through an iterative process with a Proc Sort and a Proc Means for each V variable.

Any ideas?

Accepted Solutions

Solution

05-31-2013
02:25 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-31-2013 02:25 PM

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-31-2013 02:02 PM

Maybe this will work.

Input V1$ V2$ V3$ Score;

Cards;

A B A 24

C C B 66

B A C 25

C A B 40

;

class v:;

ways

output out=ways1 mean(score)=;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-31-2013 02:09 PM

With "hundreds" of V variables the combinations will likely exceed the combinations that Proc Summary or means will handle. You may want to examine the CLASSDATA= procedure option to identify the the combinations of V variables that you want output. It may still take a couple of passes through the data depending on the numbers of combinations you request but you would only need to change the Classdata data set and output set.

Solution

05-31-2013
02:25 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

05-31-2013 02:25 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-04-2013 05:12 PM

Thanks to you both for your helpful suggestions. As it turns out, the data comes out of the query in a different format (untransposed), then it is transposed into the format described in my question above for some other additional analyses. So it made by far the most sense to simply do the Proc means by _NAME_ and level as a separate step before the data was transposed the first time.

Thanks again for the advice...