Re: Create a dataset using proc tabulate

sam_sas2 · Posted 06-16-2021 09:53 AM

Hi all,
So I am currently running the below proc tabulate query,

proc tabulate data=sashelp.cars (where =(make in("Acura","Audi"))) out=cars;
class make;
var msrp;
table (msrp * (N MEAN MEDIAN MIN MAX)),
	  ALL make;
run;

And the output dataset from OUT statement looks like this(first 5 rows):

Obs	Make	_TYPE_	_PAGE_	_TABLE_	MSRP_N	MSRP_Mean	MSRP_Median	MSRP_Min	MSRP_Max
1		0	1	1	26	.	.	.	.
2	Acura	1	1	1	7	.	.	.	.
3	Audi	1	1	1	19	.	.	.	.
4		0	1	1	.	43208.46	.	.	.
5	Acura	1	1	1	.	42938.57	.	.	.
6	Audi	1	1	1	.	43307.89	.	.	.

But, I'd like to create an ADS that looks like this,

Type	All	Acura	Audi
MSRP_N	26	7	19
MSRP_Mean	43208.46	42938.57	43307.89
MSRP_Median	40115	36945	40590
MSRP_Min	23820	23820	25940
MSRP_Max	89765	89765	84600

I have tried using proc transpose, I'm unable to figure out how to deal with the nulls and also label them with "Make"

proc transpose data=cars out=cars2;
var msrp_:;

Any sort of help would be appreciated. Thanks in advance! 🙂

ballardw · Posted 06-16-2021 10:58 AM

What is an ADS?

This makes a data set similar to that desired content from that starting point(likely not the best)

data temp;
   set cars;
   if _type_=0 then make='All';
run;

proc sort data=temp;
   by make;
run;

proc transpose data=temp out=trans;
   by make;
   var msrp_: ;
run;

data temp2;
   set trans;
   col1 = coalesce(of col1-col5);
   drop col2-col5;
run;
proc sort data=temp2;
   by _name_;
run;

proc transpose data=temp2 out=want;
   by _name_;
   var col1;
   id make;
run;

If I needed something in that layout as a data set this is the way I would start because of the extra stuff in a tabulate data set. Also, depending on the complexity of a "real" tabulate call there can be much more work involved in restructuring. I do so for some projects but they are very specific and each is a one-off based on the actual table(s) structures created by tabulate. Summary/ Means with class variables often results in a somewhat cleaner data set to work with unless you are using any of the row/col percent calculations.

proc summary data=sashelp.cars N MEAN MEDIAN MIN MAX ;
   where make in("Acura","Audi");
   class make;
   var msrp ;
   output out=summary N= MEAN= MEDIAN= MIN= MAX= /autoname; 
run;
data othertemp;
   set summary;
   if _type_=0 then Make='All';
run;

proc transpose data=othertemp out=want2;
  var msrp: ;
  id make;
run;

sam_sas2 · Posted 06-16-2021 11:13 AM

OMG
you are a genius!!!
Thank you so much! I think I can work around this.
I'm not sure of proc summary as I will be needing different stats based on the variable.
For example, frequency for categorical and descriptive statistics for continuous variables.

P.S. Sorry for using acronyms. ADS stands for analytical data set.

ballardw · Posted 06-16-2021 11:26 AM

@sam_sas2 wrote:
OMG
you are a genius!!!
Thank you so much! I think I can work around this.
I'm not sure of proc summary as I will be needing different stats based on the variable.
For example, frequency for categorical and descriptive statistics for continuous variables.

P.S. Sorry for using acronyms. ADS stands for analytical data set.

Proc Summary/means will do counts of variables on the CLASS statement if NO var statement is used but the _type_ value needs to be used depending on specific combinations of interest.

Often breaking a problem apart such a continuous variables in one step then categorical in another then combine for report or analysis may be appropriate. Summary/means will do all the same statistics that Tabulate (or Report). The percents are the only thing that tabulate actually does.

Reeza · Posted 06-16-2021 11:27 AM

You can control the statistics for each variable in PROC MEANS similar to how you can with PROC TABULATE and the output is closer to what you'd like by default.

Create a dataset using proc tabulate