BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
acrowther
Calcite | Level 5

Hi,

 

I have a dataset that contains three ID variables: FieldA, Field, FieldC.  An individual's ID can appear in one, two, or all three of these fields.  Also, the number of observations per variable varies (i.e. there are different numbers of obs in FieldA, FieldB,and Field C). Below is a snapshot of the data (there are thousands of records overal).  

 

FieldAFieldBFieldC
100128843100028331100028331
100153363100038471100038471
100265327100042435100042435
100281941100046405100046405
100533731100055129100055129
100590741100062169100062169
100682161100065289100065289
100691903100071483100071483
100696993100071671100071671
100780833100078509100078509
100874907100083405100083405
100875239100084351100084351

 

I'm trying to determine how many IDs appear in:

 

Only FieldA

Only FieldB

Only FieldC

Only FieldA + FieldB

Only FieldA + FieldC

Only FieldB + FieldC

Only Field A + FieldB + Field C

 

I'd like for these groups to be mutually exclusive.  Any advice?  Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

For simplicity, I would go for a multistep approach:

 

proc freq data=have noprint;

tables FieldA / out=A_counts (keep=fielda);

tables FieldB / out=B_counts (keep=fieldb);

tables FieldC / out=C_counts (keep=fieldc);

run;

 

data want;

merge a_counts (in=ina rename=(fielda=field))

  b_counts (in=inb rename=(fieldb=field))

  c_counts (in=inc rename=(fieldc=field));

by field;

length match $ 3;

if ina then match='A';

if inb then substr(match, 2, 1) = 'B';

if inc then substr(match, 3, 1) = 'C';

run;

 

proc freq data=want;

tables match;

run;

View solution in original post

3 REPLIES 3
Astounding
PROC Star

For simplicity, I would go for a multistep approach:

 

proc freq data=have noprint;

tables FieldA / out=A_counts (keep=fielda);

tables FieldB / out=B_counts (keep=fieldb);

tables FieldC / out=C_counts (keep=fieldc);

run;

 

data want;

merge a_counts (in=ina rename=(fielda=field))

  b_counts (in=inb rename=(fieldb=field))

  c_counts (in=inc rename=(fieldc=field));

by field;

length match $ 3;

if ina then match='A';

if inb then substr(match, 2, 1) = 'B';

if inc then substr(match, 3, 1) = 'C';

run;

 

proc freq data=want;

tables match;

run;

acrowther
Calcite | Level 5

That did it - thank you!!!

Ksharp
Super User
data have;
input FieldA	FieldB	FieldC;
cards;
100128843	100028331	100028331
100153363	100038471	100038471
100265327	100042435	100042435
100281941	100046405	100046405
100533731	100055129	100055129
100590741	100062169	100062169
100682161	100065289	100065289
100691903	100071483	100071483
100696993	100071671	100071671
100780833	100078509	100078509
100874907	100083405	100083405
100875239	100084351	100084351
;
run;
proc sql;
create table only_in_a as
select FieldA from have
except
select FieldB from have
except
select FieldC from have ;

create table only_in_a_and_b as
select FieldA from have
intersect
select FieldB from have
except
select FieldC from have ;

create table only_in_a_and_b_and_c as
select FieldA from have
intersect
select FieldB from have
intersect
select FieldC from have ;
quit;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 4367 views
  • 0 likes
  • 3 in conversation