I have a series of fields each with a different set of value options. So A1-A9, B1-B9, etc. Where the answer options for A1-A9 are F, O, J, S, for B1-B9 they are AS, BU, PA, IS, etc. I need to display frequencies for each field along with the total N. Using a proc freq like this generates many pages of results.
proc freq;
tables A1-A9;
run;
I would like something more compact.
Test data, current output and ideal output in the attachments.
Could be done with proc tabulate :
data have;
input Record (A1 A2 A3 A4 A5 A6 A7 A8 A9 B1 B2 B3 B4 B5 B6 B7 B8 B9) (:$2.);
datalines;
1 F F O J J F S S O IS PA BR PA BR PA BR AS IS
2 S O J F O J J S S AS BR PA BR IS IS IS IS AS
3 F S O F S S S O F BR BR BR IS AS PA BR BR BR
4 O J J O F F J J F AS BR IS BR IS BR IS IS IS
5 J F F J F F S S J PA PA IS BR AS AS AS PA BR
6 J O J F J O J S O IS PA AS IS PA PA BR AS BR
7 . F F F S S O J F AS BR BR IS AS IS AS BR AS
8 S O S O J J F S O BR IS AS PA AS PA PA AS BR
9 J J F S F S S O F AS IS PA AS PA AS AS PA AS
10 F F O J S F J J F PA PA BR IS BR PA BR BR PA
;
proc transpose data=have out=temp;
by record;
var _character_;
run;
data temp2 / view=temp2;
set temp;
val = not missing(col1);
test = substr(_name_, 1, 1);
run;
proc tabulate data=temp2;
var val;
class test _name_ col1;
table test, _name_="Question", n col1="Answer"*rowpctn=""*format=best5.;
run;
Well, someone may have a more succinct approach, but this should work:
/* Get frequency data into a table */
ods output OneWayFreqs=FREQS(drop=freq: cum:);
proc freq data=have(drop=record);
tables _all_;
run;
ods output close;
/* Get a list of variable names to coalesce */
proc sql;
select Name into :list separated by ','
from dictionary.columns
where libname='WORK' and memname='FREQS'
and name not contains '_'
and name not in ('Table','Percent')
;
quit;
/* Prep the data for transposition */
proc sql;
create table freqs2 as
select scan(table,-1) length=2 as Column
,coalescec(&list) as Value
,Percent
from freqs
order by 1,2
;
quit;
/* Transpose the data to get the desired output */
proc transpose data=freqs2 out=want(drop=_:) let;
by column;
ID value;
run;
And this is the result:
Obs | Column | F | J | O | S | AS | BR | IS | PA |
---|---|---|---|---|---|---|---|---|---|
1 | A1 | 33.33 | 33.33 | 11.11 | 22.22 | . | . | . | . |
2 | A2 | 40.00 | 20.00 | 30.00 | 10.00 | . | . | . | . |
3 | A3 | 30.00 | 30.00 | 30.00 | 10.00 | . | . | . | . |
4 | A4 | 40.00 | 30.00 | 20.00 | 10.00 | . | . | . | . |
5 | A5 | 30.00 | 30.00 | 10.00 | 30.00 | . | . | . | . |
6 | A6 | 40.00 | 20.00 | 10.00 | 30.00 | . | . | . | . |
7 | A7 | 10.00 | 40.00 | 10.00 | 40.00 | . | . | . | . |
8 | A8 | . | 30.00 | 20.00 | 50.00 | . | . | . | . |
9 | A9 | 50.00 | 10.00 | 30.00 | 10.00 | . | . | . | . |
10 | B1 | . | . | . | . | 40.00 | 20.00 | 20.00 | 20.00 |
11 | B2 | . | . | . | . | . | 40.00 | 20.00 | 40.00 |
12 | B3 | . | . | . | . | 20.00 | 40.00 | 20.00 | 20.00 |
13 | B4 | . | . | . | . | 10.00 | 30.00 | 40.00 | 20.00 |
14 | B5 | . | . | . | . | 40.00 | 20.00 | 20.00 | 20.00 |
15 | B6 | . | . | . | . | 20.00 | 10.00 | 20.00 | 50.00 |
16 | B7 | . | . | . | . | 30.00 | 40.00 | 20.00 | 10.00 |
17 | B8 | . | . | . | . | 30.00 | 30.00 | 20.00 | 20.00 |
18 | B9 | . | . | . | . | 30.00 | 40.00 | 20.00 | 10.00 |
Hope that gets you started, anyway.
Mark
Could be done with proc tabulate :
data have;
input Record (A1 A2 A3 A4 A5 A6 A7 A8 A9 B1 B2 B3 B4 B5 B6 B7 B8 B9) (:$2.);
datalines;
1 F F O J J F S S O IS PA BR PA BR PA BR AS IS
2 S O J F O J J S S AS BR PA BR IS IS IS IS AS
3 F S O F S S S O F BR BR BR IS AS PA BR BR BR
4 O J J O F F J J F AS BR IS BR IS BR IS IS IS
5 J F F J F F S S J PA PA IS BR AS AS AS PA BR
6 J O J F J O J S O IS PA AS IS PA PA BR AS BR
7 . F F F S S O J F AS BR BR IS AS IS AS BR AS
8 S O S O J J F S O BR IS AS PA AS PA PA AS BR
9 J J F S F S S O F AS IS PA AS PA AS AS PA AS
10 F F O J S F J J F PA PA BR IS BR PA BR BR PA
;
proc transpose data=have out=temp;
by record;
var _character_;
run;
data temp2 / view=temp2;
set temp;
val = not missing(col1);
test = substr(_name_, 1, 1);
run;
proc tabulate data=temp2;
var val;
class test _name_ col1;
table test, _name_="Question", n col1="Answer"*rowpctn=""*format=best5.;
run;
This proc tabulate is perfect. One follow up questions - would there be a way to do this if the fields did not begin with the same letter--where I would need to list out the fields?
@DavidKaib wrote:
This proc tabulate is perfect. One follow up questions - would there be a way to do this if the fields did not begin with the same letter--where I would need to list out the fields?
Are you asking about the Proc Transpose step? _character_ is a list, short hand for all character variables. If you only want some specific variables you would list them on the Var statement. There are other list shortcuts available: A: would get all variables with names starting with A, A1-A9 would use A1, A2, A3 ... A9. You can mix lists such as : Var A: B1-B9 ; to get all A variables and just B1 through B9.
Perfect. This is really great.
When I tried this with some variables that are not just A, B, etc., it worked but the order came out odd - so for example Test Other was first, Test Acad was second (in the original data the variables the begin with Acad are first.) Is there a way to reorder those? I could also rename those variables as A1-A9, B2-, etc,, but if there is a way to set the order in the code that would be helpful.
The ORDER= option in the CLASS statement of the TABULATE proc gives you some control on the ordering of the table lines and columns:
data have;
input Record (Z9 Z8 Z7 Z6 Z5 Z4 Z3 Z2 Z1 B1 B2 B3 B4 B5 B6 B7 B8 B9) (:$2.);
datalines;
1 F F O J J F S S O IS PA BR PA BR PA BR AS IS
2 S O J F O J J S S AS BR PA BR IS IS IS IS AS
3 F S O F S S S O F BR BR BR IS AS PA BR BR BR
4 O J J O F F J J F AS BR IS BR IS BR IS IS IS
5 J F F J F F S S J PA PA IS BR AS AS AS PA BR
6 J O J F J O J S O IS PA AS IS PA PA BR AS BR
7 . F F F S S O J F AS BR BR IS AS IS AS BR AS
8 S O S O J J F S O BR IS AS PA AS PA PA AS BR
9 J J F S F S S O F AS IS PA AS PA AS AS PA AS
10 F F O J S F J J F PA PA BR IS BR PA BR BR PA
;
proc transpose data=have out=temp;
by record;
var _character_;
run;
data temp2 / view=temp2;
set temp;
val = not missing(col1);
length test $10;
select (_name_);
when ("Z1", "Z2", "Z3", "Z4", "Z5", "Z6", "Z7", "Z8", "Z9") test = "The First";
when ("B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9") test = "Second";
end;
run;
proc tabulate data=temp2;
var val;
class test _name_ / order=data;
class col1 / order=formatted;
table test="Test:", _name_="Question", n col1="Answer"*rowpctn=""*format=best5.;
run;
As long as the same name doesn't occur in more than one test, you could replace
test = substr(_name_, 1, 1);
with
select (_name_);
when ("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9") test = "First";
when ("B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9") test = "Second";
end;
Nice!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.