BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DougHold
Obsidian | Level 7

I'm using the sashelp.cars dataset here.

 

1) The following are 3 subgroup analyses using PROC MEANS. This isn't a lot of repetition, but for the purpose of learning, how can I best combine multiple subgroup analyses to minimize the amount of repetition in code?

 

PROC MEANS DATA = sashelp.cars MAXDEC=1 N mean Median Q1 Q3 Qrange;
	Var MSRP;
		WHERE Type = 'Sports';
RUN; 

PROC MEANS DATA = sashelp.cars MAXDEC=1 N mean Median Q1 Q3 Qrange;
	Var Invoice;
		WHERE Origin = 'Europe';
RUN;

PROC MEANS DATA = sashelp.cars MAXDEC=1 N mean Median Q1 Q3 Qrange;
	Var Horsepower;
		WHERE DriveTrain = 'Rear';
RUN;

If I run the following I get different results. I think what's happening is the last WHERE (DriveTrain = 'Rear') is being used for all the Vars, which is not what I want. I want it to look like the previous except all together.

 

PROC MEANS DATA = sashelp.cars MAXDEC=1 N mean Median Q1 Q3 Qrange;
	Var MSRP;
		WHERE Type = 'Sports';
	Var Invoice;
		WHERE Origin = 'Europe';
	Var Horsepower;
		WHERE DriveTrain = 'Rear';
RUN;

 

2) Here are 3 subgroup analyses using PROC FREQ. Again, I would like to know how to combine multiple subgroup analyses to minimize the amount of repetition in code.

PROC FREQ DATA = sashelp.cars;
	TABLES Type;
		WHERE Origin = 'USA';
RUN;

PROC FREQ DATA = sashelp.cars;
	TABLES Origin;
		WHERE DriveTrain = 'Front';
RUN;

PROC FREQ DATA = sashelp.cars;
	TABLES DriveTrain;
		WHERE Horsepower > 200;
RUN;

If I run the following I get different results. Again, I think what's happening is the last WHERE (Horsepower > 200) is being used for all of them, which is not what I want. I want it to look like the previous except all together.

 

PROC FREQ DATA = sashelp.cars;
	TABLES Type;
		WHERE Origin = 'USA';
	TABLES Origin;
		WHERE DriveTrain = 'Front';
	TABLES DriveTrain;
		WHERE Horsepower > 200;
RUN;

Thanks,

 

 

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

 

PROC MEANS DATA = sashelp.cars noprint;
    class drivetrain type origin;
    ways 1;
    var Horsepower msrp invoice;
    output out=stats n= mean= median= q1= q3= qrange=/autoname;
run;

The above is a one-step approach to the problem. It gives you all the statistics you want, plus others you didn't ask for, and is probably a more real-world situation where you want means for all three variables, medians for all three variables, etc. It uses the power of PROC MEANS to pass through the data once, instead of what you originally programmed, which has to pass through the data three times; and if your example goes from needing three WHERE clauses to (for example) 20 WHERE clauses, you can see that it would be a huge amount of typing, but the method using PROC MEANS simply requires you to add more variables to the VAR statement, that's all.

 

By the way, there are many different forms of output from PROC MEANS, in case you want a report to include in a document, or in case you want a SAS data set (as I have shown).

 

I recommend you learn to use the power of PROC MEANS/PROC SUMMARY. It will help you a lot more in the long run, as I feel more real world problems are better handled this way rather than with specific WHERE statements to produce the results.

 

--
Paige Miller

View solution in original post

7 REPLIES 7
Reeza
Super User
If the subgroups are split by the same variable or the variables do not overlap there are ways of doing this. But if they're entirely different variables there's not a more efficient way of doing this.

You could do all and then just filter the results at the end with a single WHERE if you'd like instead though.

PaigeMiller
Diamond | Level 26

 

PROC MEANS DATA = sashelp.cars noprint;
    class drivetrain type origin;
    ways 1;
    var Horsepower msrp invoice;
    output out=stats n= mean= median= q1= q3= qrange=/autoname;
run;

The above is a one-step approach to the problem. It gives you all the statistics you want, plus others you didn't ask for, and is probably a more real-world situation where you want means for all three variables, medians for all three variables, etc. It uses the power of PROC MEANS to pass through the data once, instead of what you originally programmed, which has to pass through the data three times; and if your example goes from needing three WHERE clauses to (for example) 20 WHERE clauses, you can see that it would be a huge amount of typing, but the method using PROC MEANS simply requires you to add more variables to the VAR statement, that's all.

 

By the way, there are many different forms of output from PROC MEANS, in case you want a report to include in a document, or in case you want a SAS data set (as I have shown).

 

I recommend you learn to use the power of PROC MEANS/PROC SUMMARY. It will help you a lot more in the long run, as I feel more real world problems are better handled this way rather than with specific WHERE statements to produce the results.

 

--
Paige Miller
DougHold
Obsidian | Level 7

Thank you for this. I will need to be doing a lot of descriptive statistics, so i'm definitely going to have to learn more about PROC MEANS and PROC SUMMARY.

Reeza
Super User
There's an example in the documentation that shows some of the different ways summaries are generated and read into the TYPE/WAYS statements for their functionality.
Tom
Super User Tom
Super User

Since the WHERE statement applies to the whole step you might need to learn how to programmatically generate the SAS code you want to run.  That is what the macro processor is for.

 

For example you could create a macro that generates your PROC MEANS step.  Define it to take a parameter you can use to build the WHERE statement.

 

%macro mymeans(where=);
PROC MEANS   DATA = sashelp.cars MAXDEC=1 N mean Median Q1 Q3 Qrange;
  var MSRP;
  WHERE &where;
RUN; 
%mend mymeans;

%mymeans(where=(Type = 'Sports'));
%mymeans(where=(Origin = 'Europe'));
%mymeans(where=(DriveTrain = 'Rear'));

 

 

DougHold
Obsidian | Level 7
Thanks, this is a good idea too!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1425 views
  • 2 likes
  • 5 in conversation