SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

How to count the number of each combination of two check-all-apply variables

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

How to count the number of each combination of two check-all-apply variables

[ Edited ]

 

 

dataset have:have.PNG

 

dataset  want:want.PNG

 

Variable A and B are two check-all-apply variables in original dataset.

I want to get a frequency table which explicite all combinations of A,B, and C, and count the frequency of each combinations.

 

 

 


Accepted Solutions
Solution
‎06-15-2018 06:40 PM
Super User
Posts: 13,947

Re: How to count the number of each combination of two check-all-apply variables

Posted in reply to zhouxysherry

Here is one way but may not be extensible easily if you are looking to do this with many more or groups of variables.

data have;
   input A1-A3 B1-B4 C $;
datalines;
1 1 0 0 0 1 1 C1
0 1 1 0 0 1 0 C1
0 0 1 1 0 0 1 C3
;
run;

data trans;
   set have;
   array as a1-a3;
   array bs b1-b4;
   do i= 1 to dim(as);
      do j= 1 to dim(bs);
         if as[i]=1 and bs[j]=1 then do;
            A= vname(as[i]);
            B= vname(bs[j]);
            output;
         end;
      end;
   end;
   keep a b c;
run;

proc freq data=trans noprint;
  tables A*B*C/list out=want (drop=Percent);
run;

Note that the data is in a data step to test code with.

 

Also I believe your example output is incorrect as A2 = 1 on rows 1 and 2 and B3=1 on rows 1 and two. So the count for A2 B3 is 2 since the value of C is the same on both rows.

View solution in original post


All Replies
Solution
‎06-15-2018 06:40 PM
Super User
Posts: 13,947

Re: How to count the number of each combination of two check-all-apply variables

Posted in reply to zhouxysherry

Here is one way but may not be extensible easily if you are looking to do this with many more or groups of variables.

data have;
   input A1-A3 B1-B4 C $;
datalines;
1 1 0 0 0 1 1 C1
0 1 1 0 0 1 0 C1
0 0 1 1 0 0 1 C3
;
run;

data trans;
   set have;
   array as a1-a3;
   array bs b1-b4;
   do i= 1 to dim(as);
      do j= 1 to dim(bs);
         if as[i]=1 and bs[j]=1 then do;
            A= vname(as[i]);
            B= vname(bs[j]);
            output;
         end;
      end;
   end;
   keep a b c;
run;

proc freq data=trans noprint;
  tables A*B*C/list out=want (drop=Percent);
run;

Note that the data is in a data step to test code with.

 

Also I believe your example output is incorrect as A2 = 1 on rows 1 and 2 and B3=1 on rows 1 and two. So the count for A2 B3 is 2 since the value of C is the same on both rows.

Occasional Contributor
Posts: 5

Re: How to count the number of each combination of two check-all-apply variables

This works perfectly! Much more efficient than PROC TRANSPOSE that I was using.

 

Really appreciate!

Super User
Posts: 2,075

Re: How to count the number of each combination of two check-all-apply variables

Posted in reply to zhouxysherry
data have;
   input A1-A3 B1-B4 C $;
datalines;
1 1 0 0 0 1 1 C1
0 1 1 0 0 1 0 C1
0 0 1 1 0 0 1 C3
;
run;

data want;
retain A B C;
set have;
array t(*) a1--b4;
do i=1 to dim(t)-1;
A=vname(t(i));
do j=i+1 to dim(t);
B=vname(t(j));
if first(vname(t(i))) ne first(vname(t(j))) then if t(i)+t(j)=2 then do;count=1;output;end;
end;
end;
keep A B C Count;
run;
Occasional Contributor
Posts: 5

Re: How to count the number of each combination of two check-all-apply variables

Posted in reply to novinosrin
Thank you novinosrin! It's amazing that this could be finished in one step!
It works very well.
PROC Star
Posts: 1,334

Re: How to count the number of each combination of two check-all-apply variables

Posted in reply to zhouxysherry

Just for fun, here's a little different version. It sets up a reference file of all of the possible responses, and then matches the input data against it.

 

Tom

 

/* Set up a couple of formats */
proc format;
	value Af
		1 = "A1"
		2 = "A2"
		3 = "A3"
		. = "None"
	;
	value Bf
		1 = "B1"
		2 = "B2"
		3 = "B3"
		4 = "B4"
		. = "None"
	;
run;

/* Create a dataset that contains the desired results for different input patterns */
data Patterns;
	format A Af. B Bf.;
	keep Pattern A B;

	/* Get the A values with no B values */
	do APattern = 0 to 7;
		BPattern = 0;
		Pattern = APattern * 16 + BPattern;
		AChar = put(APattern, binary3.);

		do ACount = 1 to 3;
			call missing(A, B);

			if substr(AChar, ACount, 1) = "1" then
				A = ACount;

			if ^missing(A) then
				output;
		end;
	end;

	/* Get the B values with no A values */
	APattern = 0;

	do BPattern = 0 to 15;
		Pattern = APattern * 16 + BPattern;
		BChar = put(BPattern, binary4.);

		do BCount = 1 to 4;
			call missing(A, B);

			if substr(BChar, BCount, 1) = "1" then
				B = BCount;

			if ^missing(B) then
				output;
		end;
	end;

	/* Get the combinations of A values with B values */
	do APattern = 0 to 7;
		do BPattern = 0 to 15;
			Pattern = APattern * 16 + BPattern;
			AChar = put(APattern, binary3.);
			BChar = put(BPattern, binary4.);

			do ACount = 1 to 3;
				call missing(A);

				if substr(AChar, ACount, 1) = "1" then
					A = ACount;

				do BCount = 1 to 4;
					call missing(B);

					if substr(BChar, BCount, 1) = "1" then
						B = BCount;

					if ^missing(A) & ^missing(B) then
						output;
				end;
			end;
		end;
	end;
run;

/* Get the data */
data Have;
	length A1 A2 A3 B1 B2 B3 B4 8 C $2;
	input A1 A2 A3 B1 B2 B3 B4 C;
	SeqNo = _n_;
	cards;
1 1 0 0 0 1 1 C1
0 1 1 0 0 1 0 C1
0 0 1 1 0 0 1 C3
run;

/* Create the pattern to match with */
data Inter01;
	set Have;
	Pattern = B4*1 + B3*2 + B2*4 + B1*8 + A3*16 + A2*32 + A1*64;
run;

/* Match the input data to the patterns */
proc sql noprint;
	create table Want as
		select i.A1, i.A2, i.A3, i.B1, i.B2, i.B3, i.B4, p.A, p.B, i.C
			from Inter01 i inner join Patterns p on i.Pattern = p.Pattern
				order by SeqNo, A, B;
quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 268 views
  • 3 likes
  • 4 in conversation