Hi all,
So I am currently running the below proc tabulate query,
proc tabulate data=sashelp.cars (where =(make in("Acura","Audi"))) out=cars; class make; var msrp; table (msrp * (N MEAN MEDIAN MIN MAX)), ALL make; run;
And the output dataset from OUT statement looks like this(first 5 rows):
Obs | Make | _TYPE_ | _PAGE_ | _TABLE_ | MSRP_N | MSRP_Mean | MSRP_Median | MSRP_Min | MSRP_Max |
1 | 0 | 1 | 1 | 26 | . | . | . | . | |
2 | Acura | 1 | 1 | 1 | 7 | . | . | . | . |
3 | Audi | 1 | 1 | 1 | 19 | . | . | . | . |
4 | 0 | 1 | 1 | . | 43208.46 | . | . | . | |
5 | Acura | 1 | 1 | 1 | . | 42938.57 | . | . | . |
6 | Audi | 1 | 1 | 1 | . | 43307.89 | . | . | . |
But, I'd like to create an ADS that looks like this,
Type | All | Acura | Audi |
MSRP_N | 26 | 7 | 19 |
MSRP_Mean | 43208.46 | 42938.57 | 43307.89 |
MSRP_Median | 40115 | 36945 | 40590 |
MSRP_Min | 23820 | 23820 | 25940 |
MSRP_Max | 89765 | 89765 | 84600 |
I have tried using proc transpose, I'm unable to figure out how to deal with the nulls and also label them with "Make"
proc transpose data=cars out=cars2;
var msrp_:;
Any sort of help would be appreciated. Thanks in advance! 🙂
What is an ADS?
This makes a data set similar to that desired content from that starting point(likely not the best)
data temp; set cars; if _type_=0 then make='All'; run; proc sort data=temp; by make; run; proc transpose data=temp out=trans; by make; var msrp_: ; run; data temp2; set trans; col1 = coalesce(of col1-col5); drop col2-col5; run; proc sort data=temp2; by _name_; run; proc transpose data=temp2 out=want; by _name_; var col1; id make; run;
If I needed something in that layout as a data set this is the way I would start because of the extra stuff in a tabulate data set. Also, depending on the complexity of a "real" tabulate call there can be much more work involved in restructuring. I do so for some projects but they are very specific and each is a one-off based on the actual table(s) structures created by tabulate. Summary/ Means with class variables often results in a somewhat cleaner data set to work with unless you are using any of the row/col percent calculations.
proc summary data=sashelp.cars N MEAN MEDIAN MIN MAX ; where make in("Acura","Audi"); class make; var msrp ; output out=summary N= MEAN= MEDIAN= MIN= MAX= /autoname; run; data othertemp; set summary; if _type_=0 then Make='All'; run; proc transpose data=othertemp out=want2; var msrp: ; id make; run;
@sam_sas2 wrote:
OMG
you are a genius!!!
Thank you so much! I think I can work around this.
I'm not sure of proc summary as I will be needing different stats based on the variable.
For example, frequency for categorical and descriptive statistics for continuous variables.
P.S. Sorry for using acronyms. ADS stands for analytical data set.
Proc Summary/means will do counts of variables on the CLASS statement if NO var statement is used but the _type_ value needs to be used depending on specific combinations of interest.
Often breaking a problem apart such a continuous variables in one step then categorical in another then combine for report or analysis may be appropriate. Summary/means will do all the same statistics that Tabulate (or Report). The percents are the only thing that tabulate actually does.
You can control the statistics for each variable in PROC MEANS similar to how you can with PROC TABULATE and the output is closer to what you'd like by default.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.