dataset have:
dataset want:
Variable A and B are two check-all-apply variables in original dataset.
I want to get a frequency table which explicite all combinations of A,B, and C, and count the frequency of each combinations.
Here is one way but may not be extensible easily if you are looking to do this with many more or groups of variables.
data have; input A1-A3 B1-B4 C $; datalines; 1 1 0 0 0 1 1 C1 0 1 1 0 0 1 0 C1 0 0 1 1 0 0 1 C3 ; run; data trans; set have; array as a1-a3; array bs b1-b4; do i= 1 to dim(as); do j= 1 to dim(bs); if as[i]=1 and bs[j]=1 then do; A= vname(as[i]); B= vname(bs[j]); output; end; end; end; keep a b c; run; proc freq data=trans noprint; tables A*B*C/list out=want (drop=Percent); run;
Note that the data is in a data step to test code with.
Also I believe your example output is incorrect as A2 = 1 on rows 1 and 2 and B3=1 on rows 1 and two. So the count for A2 B3 is 2 since the value of C is the same on both rows.
Here is one way but may not be extensible easily if you are looking to do this with many more or groups of variables.
data have; input A1-A3 B1-B4 C $; datalines; 1 1 0 0 0 1 1 C1 0 1 1 0 0 1 0 C1 0 0 1 1 0 0 1 C3 ; run; data trans; set have; array as a1-a3; array bs b1-b4; do i= 1 to dim(as); do j= 1 to dim(bs); if as[i]=1 and bs[j]=1 then do; A= vname(as[i]); B= vname(bs[j]); output; end; end; end; keep a b c; run; proc freq data=trans noprint; tables A*B*C/list out=want (drop=Percent); run;
Note that the data is in a data step to test code with.
Also I believe your example output is incorrect as A2 = 1 on rows 1 and 2 and B3=1 on rows 1 and two. So the count for A2 B3 is 2 since the value of C is the same on both rows.
This works perfectly! Much more efficient than PROC TRANSPOSE that I was using.
Really appreciate!
data have;
input A1-A3 B1-B4 C $;
datalines;
1 1 0 0 0 1 1 C1
0 1 1 0 0 1 0 C1
0 0 1 1 0 0 1 C3
;
run;
data want;
retain A B C;
set have;
array t(*) a1--b4;
do i=1 to dim(t)-1;
A=vname(t(i));
do j=i+1 to dim(t);
B=vname(t(j));
if first(vname(t(i))) ne first(vname(t(j))) then if t(i)+t(j)=2 then do;count=1;output;end;
end;
end;
keep A B C Count;
run;
Just for fun, here's a little different version. It sets up a reference file of all of the possible responses, and then matches the input data against it.
Tom
/* Set up a couple of formats */
proc format;
value Af
1 = "A1"
2 = "A2"
3 = "A3"
. = "None"
;
value Bf
1 = "B1"
2 = "B2"
3 = "B3"
4 = "B4"
. = "None"
;
run;
/* Create a dataset that contains the desired results for different input patterns */
data Patterns;
format A Af. B Bf.;
keep Pattern A B;
/* Get the A values with no B values */
do APattern = 0 to 7;
BPattern = 0;
Pattern = APattern * 16 + BPattern;
AChar = put(APattern, binary3.);
do ACount = 1 to 3;
call missing(A, B);
if substr(AChar, ACount, 1) = "1" then
A = ACount;
if ^missing(A) then
output;
end;
end;
/* Get the B values with no A values */
APattern = 0;
do BPattern = 0 to 15;
Pattern = APattern * 16 + BPattern;
BChar = put(BPattern, binary4.);
do BCount = 1 to 4;
call missing(A, B);
if substr(BChar, BCount, 1) = "1" then
B = BCount;
if ^missing(B) then
output;
end;
end;
/* Get the combinations of A values with B values */
do APattern = 0 to 7;
do BPattern = 0 to 15;
Pattern = APattern * 16 + BPattern;
AChar = put(APattern, binary3.);
BChar = put(BPattern, binary4.);
do ACount = 1 to 3;
call missing(A);
if substr(AChar, ACount, 1) = "1" then
A = ACount;
do BCount = 1 to 4;
call missing(B);
if substr(BChar, BCount, 1) = "1" then
B = BCount;
if ^missing(A) & ^missing(B) then
output;
end;
end;
end;
end;
run;
/* Get the data */
data Have;
length A1 A2 A3 B1 B2 B3 B4 8 C $2;
input A1 A2 A3 B1 B2 B3 B4 C;
SeqNo = _n_;
cards;
1 1 0 0 0 1 1 C1
0 1 1 0 0 1 0 C1
0 0 1 1 0 0 1 C3
run;
/* Create the pattern to match with */
data Inter01;
set Have;
Pattern = B4*1 + B3*2 + B2*4 + B1*8 + A3*16 + A2*32 + A1*64;
run;
/* Match the input data to the patterns */
proc sql noprint;
create table Want as
select i.A1, i.A2, i.A3, i.B1, i.B2, i.B3, i.B4, p.A, p.B, i.C
from Inter01 i inner join Patterns p on i.Pattern = p.Pattern
order by SeqNo, A, B;
quit;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.