SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

How do I define and count mutually exclusive groups?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

How do I define and count mutually exclusive groups?

Hi,

 

I have a dataset that contains three ID variables: FieldA, Field, FieldC.  An individual's ID can appear in one, two, or all three of these fields.  Also, the number of observations per variable varies (i.e. there are different numbers of obs in FieldA, FieldB,and Field C). Below is a snapshot of the data (there are thousands of records overal).  

 

FieldAFieldBFieldC
100128843100028331100028331
100153363100038471100038471
100265327100042435100042435
100281941100046405100046405
100533731100055129100055129
100590741100062169100062169
100682161100065289100065289
100691903100071483100071483
100696993100071671100071671
100780833100078509100078509
100874907100083405100083405
100875239100084351100084351

 

I'm trying to determine how many IDs appear in:

 

Only FieldA

Only FieldB

Only FieldC

Only FieldA + FieldB

Only FieldA + FieldC

Only FieldB + FieldC

Only Field A + FieldB + Field C

 

I'd like for these groups to be mutually exclusive.  Any advice?  Thanks!


Accepted Solutions
Solution
‎07-19-2017 01:51 PM
Super User
Posts: 5,099

Re: How do I define and count mutually exclusive groups?

For simplicity, I would go for a multistep approach:

 

proc freq data=have noprint;

tables FieldA / out=A_counts (keep=fielda);

tables FieldB / out=B_counts (keep=fieldb);

tables FieldC / out=C_counts (keep=fieldc);

run;

 

data want;

merge a_counts (in=ina rename=(fielda=field))

  b_counts (in=inb rename=(fieldb=field))

  c_counts (in=inc rename=(fieldc=field));

by field;

length match $ 3;

if ina then match='A';

if inb then substr(match, 2, 1) = 'B';

if inc then substr(match, 3, 1) = 'C';

run;

 

proc freq data=want;

tables match;

run;

View solution in original post


All Replies
Solution
‎07-19-2017 01:51 PM
Super User
Posts: 5,099

Re: How do I define and count mutually exclusive groups?

For simplicity, I would go for a multistep approach:

 

proc freq data=have noprint;

tables FieldA / out=A_counts (keep=fielda);

tables FieldB / out=B_counts (keep=fieldb);

tables FieldC / out=C_counts (keep=fieldc);

run;

 

data want;

merge a_counts (in=ina rename=(fielda=field))

  b_counts (in=inb rename=(fieldb=field))

  c_counts (in=inc rename=(fieldc=field));

by field;

length match $ 3;

if ina then match='A';

if inb then substr(match, 2, 1) = 'B';

if inc then substr(match, 3, 1) = 'C';

run;

 

proc freq data=want;

tables match;

run;

Occasional Contributor
Posts: 6

Re: How do I define and count mutually exclusive groups?

That did it - thank you!!!

Super User
Posts: 9,691

Re: How do I define and count mutually exclusive groups?

data have;
input FieldA	FieldB	FieldC;
cards;
100128843	100028331	100028331
100153363	100038471	100038471
100265327	100042435	100042435
100281941	100046405	100046405
100533731	100055129	100055129
100590741	100062169	100062169
100682161	100065289	100065289
100691903	100071483	100071483
100696993	100071671	100071671
100780833	100078509	100078509
100874907	100083405	100083405
100875239	100084351	100084351
;
run;
proc sql;
create table only_in_a as
select FieldA from have
except
select FieldB from have
except
select FieldC from have ;

create table only_in_a_and_b as
select FieldA from have
intersect
select FieldB from have
except
select FieldC from have ;

create table only_in_a_and_b_and_c as
select FieldA from have
intersect
select FieldB from have
intersect
select FieldC from have ;
quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 229 views
  • 0 likes
  • 3 in conversation