deleted_user
Not applicable
Hi,
My input dataset is X, and I need two output datasets, Y and Z, as shown below:
Y contains every observation that is a complete duplicate of another observation, and Z contains only the unique observations.

data X;
input id age sex$;
cards;
1 22 M
1 22 M
1 23 M
1 24 M
1 24 M
2 12 F
2 13 F
2 14 F
2 14 F
3 24 M
3 24 M
3 24 M
4 25 F
5 26 M
6 24 M
7 26 M
;
run;

Data Y:
1 22 M
1 22 M
1 24 M
1 24 M
2 14 F
2 14 F
3 24 M
3 24 M
3 24 M

Data Z:
1 23 M
2 12 F
2 13 F
4 25 F
5 26 M
6 24 M
7 26 M

Thanks & Regards
Sam
sbb
Lapis Lazuli | Level 10
Consider the DUPOUT= keyword with PROC SORT.

Scott Barry
SBBWorks, Inc.
sbb
Lapis Lazuli | Level 10
Sorry, DUPOUT= is not the right technique given your desired output. There is a near-identical post in the SAS PROCEDURES forum, with the subject "Reg :Duplicates", for your reference.

Scott Barry
SBBWorks, Inc.
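For reference, a sketch of why DUPOUT= on its own does not match the request, assuming the X dataset from the question (the output names first_copies and extra_copies are made up for illustration). With NODUPKEY and every variable in the BY statement, PROC SORT keeps the first copy of each combination in OUT= and sends only the later copies to DUPOUT=, so neither dataset equals Y or Z. In SAS 9.3 and later, the NOUNIQUEKEY option with UNIQUEOUT= should produce the requested split directly; treat that second step as an assumption to check against your release.

```sas
/* DUPOUT= with NODUPKEY: first copy goes to OUT=, later copies to DUPOUT= */
proc sort data=X out=first_copies nodupkey dupout=extra_copies;
   by id age sex;
run;
/* For X: first_copies has 11 rows (one per combination), extra_copies has 5 */

/* Assumes SAS 9.3 or later: NOUNIQUEKEY removes BY groups that occur once */
/* from OUT=, and UNIQUEOUT= collects those unique observations            */
proc sort data=X out=Y nouniquekey uniqueout=Z;
   by id age sex;
run;
```

Neither first_copies nor extra_copies contains all nine rows wanted in Y, which is why the first step is not a fit here.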
GertNissen
Barite | Level 11
[pre]data X;
input id age sex$;
cards;
1 22 M
1 22 M
1 23 M
1 24 M
1 24 M
2 12 F
2 13 F
2 14 F
2 14 F
3 24 M
3 24 M
3 24 M
4 25 F
5 26 M
6 24 M
7 26 M
;
run;

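/* Sort so that identical observations are adjacent */
proc sort data=X; by id age sex; run;

/* Count observations per id/age/sex combination; keep combinations that occur once */
proc means data=X n;
by id age sex;
output out=Z(where=(_freq_=1));
run;

/* Y = observations of X whose combination is not in Z, i.e. the duplicates */
data Y;
merge X
Z(in=in_z keep=id age sex);
by id age sex;
if not in_z then output;
run;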
run;[/pre]
deleted_user
Not applicable
Thank you very much for your reply
data_null__
Jade | Level 19
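In a DATA step, with an obvious limitation on variable names: the arrays pick up every variable whose name begins with FIRST or LAST, so the input must not contain variables named that way. The input must also already be sorted by all of its variables.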

[pre]
data dups unique;
   set;
   by _all_;
   array f[*] first:;
   array l[*] last:;
   if f[dim(f)] and l[dim(l)] then do;
      output unique;
      return;
      end;
   output dups;
   run;
proc print data=dups;
proc print data=unique;
   run;
[/pre]
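The step above uses a bare SET statement, which reads the most recently created dataset, and BY _ALL_, which needs that dataset sorted by every variable. A minimal sketch of the preparation, assuming the X dataset from the question is the intended input:

```sas
/* Sort X in place by all of its variables so that BY _ALL_ is valid. */
/* After this step X is also the most recently created dataset, so    */
/* the bare SET; in the following DATA step reads it.                 */
proc sort data=X;
   by _all_;
run;
```

Writing set X; instead of set; in the DATA step would remove the dependence on step order.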
DanielSantos
Barite | Level 11
Another simple way of doing this would be:

(And assuming that X is already sorted by ID AGE SEX)

data Y Z;
set X;
by ID AGE SEX; /* assume that X is sorted this way */

/* if the last variable in the BY list (SEX) is both first and last in its group, then the observation is unique */
if (first.SEX and last.SEX) then output Z;
else output Y; /* else, duplicate */
run;

Greetings from Portugal.

Daniel Santos at www.cgd.pt
