I have a series of fields each with a different set of value options. So A1-A9, B1-B9, etc. Where the answer options for A1-A9 are F, O, J, S, for B1-B9 they are AS, BU, PA, IS, etc. I need to display frequencies for each field along with the total N. Using a proc freq like this generates many pages of results.
proc freq;
tables A1-A9;
run;
I would like something more compact.
Test data, current output and ideal output in the attachments.
Could be done with proc tabulate :
data have;
input Record (A1 A2 A3 A4 A5 A6 A7 A8 A9 B1 B2 B3 B4 B5 B6 B7 B8 B9) (:$2.);
datalines;
1 F F O J J F S S O IS PA BR PA BR PA BR AS IS
2 S O J F O J J S S AS BR PA BR IS IS IS IS AS
3 F S O F S S S O F BR BR BR IS AS PA BR BR BR
4 O J J O F F J J F AS BR IS BR IS BR IS IS IS
5 J F F J F F S S J PA PA IS BR AS AS AS PA BR
6 J O J F J O J S O IS PA AS IS PA PA BR AS BR
7 . F F F S S O J F AS BR BR IS AS IS AS BR AS
8 S O S O J J F S O BR IS AS PA AS PA PA AS BR
9 J J F S F S S O F AS IS PA AS PA AS AS PA AS
10 F F O J S F J J F PA PA BR IS BR PA BR BR PA
;
proc transpose data=have out=temp;
by record;
var _character_;
run;
data temp2 / view=temp2;
set temp;
val = not missing(col1);
test = substr(_name_, 1, 1);
run;
proc tabulate data=temp2;
var val;
class test _name_ col1;
table test, _name_="Question", n col1="Answer"*rowpctn=""*format=best5.;
run;
Well, someone may have a more succinct approach, but this should work:
/* Get frequency data into a table */
ods output OneWayFreqs=FREQS(drop=freq: cum:);
proc freq data=have(drop=record);
	tables _all_;
run;
ods output close;
/* Get a list of variable names to coalesce */
proc sql;
select Name into :list separated by ','
	from dictionary.columns
	where libname='WORK' and memname='FREQS'
		and name not contains '_'
		and name not in ('Table','Percent')
;
quit;
/* Prep the data for transposition */
proc sql;
create table freqs2 as
select scan(table,-1) length=2 as Column
		,coalescec(&list) as Value
		,Percent
	from freqs
	order by 1,2
;
quit;
/* Transpose the data to get the desired output */
proc transpose data=freqs2 out=want(drop=_:) let;
	by column;
	ID value;
run;
And this is the result:
| Obs | Column | F | J | O | S | AS | BR | IS | PA | 
|---|---|---|---|---|---|---|---|---|---|
| 1 | A1 | 33.33 | 33.33 | 11.11 | 22.22 | . | . | . | . | 
| 2 | A2 | 40.00 | 20.00 | 30.00 | 10.00 | . | . | . | . | 
| 3 | A3 | 30.00 | 30.00 | 30.00 | 10.00 | . | . | . | . | 
| 4 | A4 | 40.00 | 30.00 | 20.00 | 10.00 | . | . | . | . | 
| 5 | A5 | 30.00 | 30.00 | 10.00 | 30.00 | . | . | . | . | 
| 6 | A6 | 40.00 | 20.00 | 10.00 | 30.00 | . | . | . | . | 
| 7 | A7 | 10.00 | 40.00 | 10.00 | 40.00 | . | . | . | . | 
| 8 | A8 | . | 30.00 | 20.00 | 50.00 | . | . | . | . | 
| 9 | A9 | 50.00 | 10.00 | 30.00 | 10.00 | . | . | . | . | 
| 10 | B1 | . | . | . | . | 40.00 | 20.00 | 20.00 | 20.00 | 
| 11 | B2 | . | . | . | . | . | 40.00 | 20.00 | 40.00 | 
| 12 | B3 | . | . | . | . | 20.00 | 40.00 | 20.00 | 20.00 | 
| 13 | B4 | . | . | . | . | 10.00 | 30.00 | 40.00 | 20.00 | 
| 14 | B5 | . | . | . | . | 40.00 | 20.00 | 20.00 | 20.00 | 
| 15 | B6 | . | . | . | . | 20.00 | 10.00 | 20.00 | 50.00 | 
| 16 | B7 | . | . | . | . | 30.00 | 40.00 | 20.00 | 10.00 | 
| 17 | B8 | . | . | . | . | 30.00 | 30.00 | 20.00 | 20.00 | 
| 18 | B9 | . | . | . | . | 30.00 | 40.00 | 20.00 | 10.00 | 
Hope that gets you started, anyway.
Mark
Could be done with proc tabulate :
data have;
input Record (A1 A2 A3 A4 A5 A6 A7 A8 A9 B1 B2 B3 B4 B5 B6 B7 B8 B9) (:$2.);
datalines;
1 F F O J J F S S O IS PA BR PA BR PA BR AS IS
2 S O J F O J J S S AS BR PA BR IS IS IS IS AS
3 F S O F S S S O F BR BR BR IS AS PA BR BR BR
4 O J J O F F J J F AS BR IS BR IS BR IS IS IS
5 J F F J F F S S J PA PA IS BR AS AS AS PA BR
6 J O J F J O J S O IS PA AS IS PA PA BR AS BR
7 . F F F S S O J F AS BR BR IS AS IS AS BR AS
8 S O S O J J F S O BR IS AS PA AS PA PA AS BR
9 J J F S F S S O F AS IS PA AS PA AS AS PA AS
10 F F O J S F J J F PA PA BR IS BR PA BR BR PA
;
proc transpose data=have out=temp;
by record;
var _character_;
run;
data temp2 / view=temp2;
set temp;
val = not missing(col1);
test = substr(_name_, 1, 1);
run;
proc tabulate data=temp2;
var val;
class test _name_ col1;
table test, _name_="Question", n col1="Answer"*rowpctn=""*format=best5.;
run;
This proc tabulate is perfect. One follow up questions - would there be a way to do this if the fields did not begin with the same letter--where I would need to list out the fields?
@DavidKaib wrote:
This proc tabulate is perfect. One follow up questions - would there be a way to do this if the fields did not begin with the same letter--where I would need to list out the fields?
Are you asking about the Proc Transpose step? _character_ is a list, short hand for all character variables. If you only want some specific variables you would list them on the Var statement. There are other list shortcuts available: A: would get all variables with names starting with A, A1-A9 would use A1, A2, A3 ... A9. You can mix lists such as : Var A: B1-B9 ; to get all A variables and just B1 through B9.
Perfect. This is really great.
When I tried this with some variables that are not just A, B, etc., it worked but the order came out odd - so for example Test Other was first, Test Acad was second (in the original data the variables the begin with Acad are first.) Is there a way to reorder those? I could also rename those variables as A1-A9, B2-, etc,, but if there is a way to set the order in the code that would be helpful.
The ORDER= option in the CLASS statement of the TABULATE proc gives you some control on the ordering of the table lines and columns:
data have;
input Record (Z9 Z8 Z7 Z6 Z5 Z4 Z3 Z2 Z1 B1 B2 B3 B4 B5 B6 B7 B8 B9) (:$2.);
datalines;
1 F F O J J F S S O IS PA BR PA BR PA BR AS IS
2 S O J F O J J S S AS BR PA BR IS IS IS IS AS
3 F S O F S S S O F BR BR BR IS AS PA BR BR BR
4 O J J O F F J J F AS BR IS BR IS BR IS IS IS
5 J F F J F F S S J PA PA IS BR AS AS AS PA BR
6 J O J F J O J S O IS PA AS IS PA PA BR AS BR
7 . F F F S S O J F AS BR BR IS AS IS AS BR AS
8 S O S O J J F S O BR IS AS PA AS PA PA AS BR
9 J J F S F S S O F AS IS PA AS PA AS AS PA AS
10 F F O J S F J J F PA PA BR IS BR PA BR BR PA
;
proc transpose data=have out=temp;
by record;
var _character_;
run;
data temp2 / view=temp2;
set temp;
val = not missing(col1);
length test $10;
select (_name_);
    when ("Z1", "Z2", "Z3", "Z4", "Z5", "Z6", "Z7", "Z8", "Z9") test = "The First";
    when ("B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9") test = "Second";
end;
run;
proc tabulate data=temp2;
var val;
class test _name_ / order=data;
class col1 / order=formatted;
table test="Test:", _name_="Question", n col1="Answer"*rowpctn=""*format=best5.;
run;
As long as the same name doesn't occur in more than one test, you could replace
test = substr(_name_, 1, 1);
with
select (_name_);
when ("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9") test = "First";
when ("B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9") test = "Second";
end;
Nice!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
