fengyuwuzu
Pyrite | Level 9
proc sort data=A out=B dupout=C nodupkey; by var1 var2 var3; run;

With the code above, I get a dataset that is unique on the BY variables (dataset B) and a dataset holding the observations dropped as duplicates on the BY variables (dataset C).

Sometimes I want to compare the duplicates (the kept observations in B against the dropped ones in C) to see which variables, other than the BY variables, differ. But how do I put them together? In other words, how do I extract from B the observations whose BY values also appear in dataset C?

 

For example, I have dataset A as:

ID age sex win lost
1  20  F   200 120
2  22  M   150 130
2  22  M   150 80
3  25  M   110 90
3  25  M   110 210
4  27  F   105 85

 

If I run

proc sort data=A out=B dupout=C nodupkey; by ID age sex win; run;

I will get B as:

ID age sex win lost
1  20  F   200 120
2  22  M   150 130
3  25  M   110 90
4  27  F   105 85

 

and C:

ID age sex win lost
2  22  M   150 80
3  25  M   110 210

 

Now I want to see how the other variables differ between observations that share identical BY values, so I want the PAIRS of duplicates, like this:

 

ID age sex win lost
2  22  M   150 130
2  22  M   150 80
3  25  M   110 90
3  25  M   110 210

 

This means I need to extract from dataset B the observations whose BY values match an observation in dataset C. How can I do that? Thanks in advance.

ACCEPTED SOLUTION
mkeintz
PROC Star

 

data bygroups_having_duplicates;
  set b (in=inb) c;
  by id age sex win;                    /* same key as the PROC SORT that created B and C */
  if not (first.win=1 and last.win=1);  /* drop keys that occur only once */
  if inb then source='B';
  else source='C';
run;

 

No singletons will pass the subsetting IF statement. Because B is listed first in the SET statement, the first record of each BY group comes from dataset B, and all subsequent records in that BY group come from C.
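As a rough end-to-end sketch of the interleave described above, the following builds dataset A from the question's sample rows and runs both steps. The output dataset name `pairs` is made up for illustration, and the BY list uses the full sort key (`ID age sex win`), on the assumption that the DATA step should group on exactly the key PROC SORT used.

```sas
/* Sample data copied from the question */
data A;
  input ID age sex $ win lost;
  datalines;
1 20 F 200 120
2 22 M 150 130
2 22 M 150 80
3 25 M 110 90
3 25 M 110 210
4 27 F 105 85
;
run;

/* B keeps the first observation per key; C receives the dropped duplicates */
proc sort data=A out=B dupout=C nodupkey;
  by ID age sex win;
run;

/* Interleave B and C by the same key and keep only keys seen more than once.
   "pairs" is a hypothetical name, not one used in the thread. */
data pairs;
  set B (in=inb) C;
  by ID age sex win;
  if not (first.win and last.win);   /* a group of one is a singleton */
  if inb then source='B';
  else source='C';
run;

proc print data=pairs noobs;
run;
```

With this sample data, the printed result should contain only the rows for ID 2 and ID 3, each B record followed by its C counterpart, with `source` showing where each row came from.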

 

PROC SQL alternative from @Kurt_Bremser:

 

proc sql;
create table d as
select * from a
group by id, age, sex, win
having count(*) >= 2
;
quit;

4 REPLIES 4
Miracle
Barite | Level 11

How about this? HTH.

 

proc sort data=A out=dup_rec nouniquekey; by id age sex win; run;
proc print data=dup_rec noobs; run;
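For context, NOUNIQUEKEY tells PROC SORT to drop every observation whose BY key occurs only once, so `dup_rec` holds all members of each duplicated key in a single step. If the singletons are also wanted, the UNIQUEOUT= option can capture them. This is a small sketch assuming a SAS release that supports both options (9.3 or later); the dataset name `singles` is made up for illustration.

```sas
/* dup_rec: every observation whose key appears at least twice
   singles: observations whose key appears exactly once (hypothetical name) */
proc sort data=A out=dup_rec uniqueout=singles nouniquekey;
  by id age sex win;
run;

proc print data=dup_rec noobs; run;
proc print data=singles noobs; run;
```

Unlike the B/C interleave, this route does not record which row NODUPKEY would have kept, so it fits best when only the side-by-side comparison matters.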

 

fengyuwuzu
Pyrite | Level 9

Thank you very much, all of you! It is so nice to have multiple solutions! 
