Solved: Re: Compact Descriptive Stats

DavidKaib · Posted 03-14-2023 01:28 PM

I have a series of fields each with a different set of value options. So A1-A9, B1-B9, etc. Where the answer options for A1-A9 are F, O, J, S, for B1-B9 they are AS, BU, PA, IS, etc. I need to display frequencies for each field along with the total N. Using a proc freq like this generates many pages of results.

proc freq;
tables A1-A9;
run;

I would like something more compact.

Test data, current output and ideal output in the attachments.

PGStats · Posted 03-14-2023 04:17 PM

Could be done with proc tabulate :

data have;
input Record (A1 A2 A3 A4 A5 A6 A7 A8 A9 B1 B2 B3 B4 B5 B6 B7 B8 B9) (:$2.);
datalines;
1 F F O J J F S S O IS PA BR PA BR PA BR AS IS
2 S O J F O J J S S AS BR PA BR IS IS IS IS AS
3 F S O F S S S O F BR BR BR IS AS PA BR BR BR
4 O J J O F F J J F AS BR IS BR IS BR IS IS IS
5 J F F J F F S S J PA PA IS BR AS AS AS PA BR
6 J O J F J O J S O IS PA AS IS PA PA BR AS BR
7 . F F F S S O J F AS BR BR IS AS IS AS BR AS
8 S O S O J J F S O BR IS AS PA AS PA PA AS BR
9 J J F S F S S O F AS IS PA AS PA AS AS PA AS
10 F F O J S F J J F PA PA BR IS BR PA BR BR PA
;

proc transpose data=have out=temp;
by record;
var _character_;
run;

data temp2 / view=temp2;
set temp;
val = not missing(col1);
test = substr(_name_, 1, 1);
run;

proc tabulate data=temp2;
var val;
class test _name_ col1;
table test, _name_="Question", n col1="Answer"*rowpctn=""*format=best5.;
run;

PG

View solution in original post

SASJedi · Posted 03-14-2023 02:31 PM

Well, someone may have a more succinct approach, but this should work:

/* Get frequency data into a table */
ods output OneWayFreqs=FREQS(drop=freq: cum:);
proc freq data=have(drop=record);
	tables _all_;
run;
ods output close;

/* Get a list of variable names to coalesce */
proc sql;
select Name into :list separated by ','
	from dictionary.columns
	where libname='WORK' and memname='FREQS'
		and name not contains '_'
		and name not in ('Table','Percent')
;
quit;

/* Prep the data for transposition */
proc sql;
create table freqs2 as
select scan(table,-1) length=2 as Column
		,coalescec(&list) as Value
		,Percent
	from freqs
	order by 1,2
;
quit;

/* Transpose the data to get the desired output */
proc transpose data=freqs2 out=want(drop=_:) let;
	by column;
	ID value;
run;

And this is the result:

Obs	Column	F	J	O	S	AS	BR	IS	PA
1	A1	33.33	33.33	11.11	22.22	.	.	.	.
2	A2	40.00	20.00	30.00	10.00	.	.	.	.
3	A3	30.00	30.00	30.00	10.00	.	.	.	.
4	A4	40.00	30.00	20.00	10.00	.	.	.	.
5	A5	30.00	30.00	10.00	30.00	.	.	.	.
6	A6	40.00	20.00	10.00	30.00	.	.	.	.
7	A7	10.00	40.00	10.00	40.00	.	.	.	.
8	A8	.	30.00	20.00	50.00	.	.	.	.
9	A9	50.00	10.00	30.00	10.00	.	.	.	.
10	B1	.	.	.	.	40.00	20.00	20.00	20.00
11	B2	.	.	.	.	.	40.00	20.00	40.00
12	B3	.	.	.	.	20.00	40.00	20.00	20.00
13	B4	.	.	.	.	10.00	30.00	40.00	20.00
14	B5	.	.	.	.	40.00	20.00	20.00	20.00
15	B6	.	.	.	.	20.00	10.00	20.00	50.00
16	B7	.	.	.	.	30.00	40.00	20.00	10.00
17	B8	.	.	.	.	30.00	30.00	20.00	20.00
18	B9	.	.	.	.	30.00	40.00	20.00	10.00

Hope that gets you started, anyway.

Mark

Check out my Jedi SAS Tricks for SAS Users

PGStats · Posted 03-14-2023 04:17 PM

Could be done with proc tabulate :

data have;
input Record (A1 A2 A3 A4 A5 A6 A7 A8 A9 B1 B2 B3 B4 B5 B6 B7 B8 B9) (:$2.);
datalines;
1 F F O J J F S S O IS PA BR PA BR PA BR AS IS
2 S O J F O J J S S AS BR PA BR IS IS IS IS AS
3 F S O F S S S O F BR BR BR IS AS PA BR BR BR
4 O J J O F F J J F AS BR IS BR IS BR IS IS IS
5 J F F J F F S S J PA PA IS BR AS AS AS PA BR
6 J O J F J O J S O IS PA AS IS PA PA BR AS BR
7 . F F F S S O J F AS BR BR IS AS IS AS BR AS
8 S O S O J J F S O BR IS AS PA AS PA PA AS BR
9 J J F S F S S O F AS IS PA AS PA AS AS PA AS
10 F F O J S F J J F PA PA BR IS BR PA BR BR PA
;

proc transpose data=have out=temp;
by record;
var _character_;
run;

data temp2 / view=temp2;
set temp;
val = not missing(col1);
test = substr(_name_, 1, 1);
run;

proc tabulate data=temp2;
var val;
class test _name_ col1;
table test, _name_="Question", n col1="Answer"*rowpctn=""*format=best5.;
run;

PG

DavidKaib · Posted 03-14-2023 05:19 PM

This proc tabulate is perfect. One follow up questions - would there be a way to do this if the fields did not begin with the same letter--where I would need to list out the fields?

ballardw · Posted 03-14-2023 05:52 PM

@DavidKaib wrote:

This proc tabulate is perfect. One follow up questions - would there be a way to do this if the fields did not begin with the same letter--where I would need to list out the fields?

Are you asking about the Proc Transpose step? _character_ is a list, short hand for all character variables. If you only want some specific variables you would list them on the Var statement. There are other list shortcuts available: A: would get all variables with names starting with A, A1-A9 would use A1, A2, A3 ... A9. You can mix lists such as : Var A: B1-B9 ; to get all A variables and just B1 through B9.

DavidKaib · Posted 03-15-2023 12:34 PM

Perfect. This is really great.

When I tried this with some variables that are not just A, B, etc., it worked but the order came out odd - so for example Test Other was first, Test Acad was second (in the original data the variables the begin with Acad are first.) Is there a way to reorder those? I could also rename those variables as A1-A9, B2-, etc,, but if there is a way to set the order in the code that would be helpful.

PGStats · Posted 03-15-2023 02:02 PM

The ORDER= option in the CLASS statement of the TABULATE proc gives you some control on the ordering of the table lines and columns:

data have;
input Record (Z9 Z8 Z7 Z6 Z5 Z4 Z3 Z2 Z1 B1 B2 B3 B4 B5 B6 B7 B8 B9) (:$2.);
datalines;
1 F F O J J F S S O IS PA BR PA BR PA BR AS IS
2 S O J F O J J S S AS BR PA BR IS IS IS IS AS
3 F S O F S S S O F BR BR BR IS AS PA BR BR BR
4 O J J O F F J J F AS BR IS BR IS BR IS IS IS
5 J F F J F F S S J PA PA IS BR AS AS AS PA BR
6 J O J F J O J S O IS PA AS IS PA PA BR AS BR
7 . F F F S S O J F AS BR BR IS AS IS AS BR AS
8 S O S O J J F S O BR IS AS PA AS PA PA AS BR
9 J J F S F S S O F AS IS PA AS PA AS AS PA AS
10 F F O J S F J J F PA PA BR IS BR PA BR BR PA
;

proc transpose data=have out=temp;
by record;
var _character_;
run;

data temp2 / view=temp2;
set temp;
val = not missing(col1);
length test $10;
select (_name_);
    when ("Z1", "Z2", "Z3", "Z4", "Z5", "Z6", "Z7", "Z8", "Z9") test = "The First";
    when ("B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9") test = "Second";
end;
run;

proc tabulate data=temp2;
var val;
class test _name_ / order=data;
class col1 / order=formatted;
table test="Test:", _name_="Question", n col1="Answer"*rowpctn=""*format=best5.;
run;

PG

PGStats · Posted 03-15-2023 01:41 PM

As long as the same name doesn't occur in more than one test, you could replace

test = substr(_name_, 1, 1);

with

select (_name_);
when ("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9") test = "First";
when ("B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9") test = "Second";
end;

PG

SASJedi · Posted 03-15-2023 07:40 AM

Nice!

Check out my Jedi SAS Tricks for SAS Users

SAS Innovate 2025: Register Now

SAS Training: Just a Click Away