Hi,
My input dataset is X, and I need two output datasets, Y and Z, as below:
Y contains every observation that has a complete duplicate, and Z contains only the unique observations.
data X;
input id age sex$;
cards;
1 22 M
1 22 M
1 23 M
1 24 M
1 24 M
2 12 F
2 13 F
2 14 F
2 14 F
3 24 M
3 24 M
3 24 M
4 25 F
5 26 M
6 24 M
7 26 M
run;
Data Y:
1 22 M
1 22 M
1 24 M
1 24 M
2 14 F
2 14 F
3 24 M
3 24 M
3 24 M
Data Z:
1 23 M
2 12 F
2 13 F
4 25 F
5 26 M
6 24 M
7 26 M
Sorry, DUPOUT= is not the right technique for the output you want: it captures only the second and later copies of a duplicated observation, while your Y needs every copy. There is also a near-identical post over in the SAS PROCEDURES forum, with the subject "Reg :Duplicates", for your reference.
[pre]data X;
input id age sex$;
cards;
1 22 M
1 22 M
1 23 M
1 24 M
1 24 M
2 12 F
2 13 F
2 14 F
2 14 F
3 24 M
3 24 M
3 24 M
4 25 F
5 26 M
6 24 M
7 26 M
;
run;
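/* Sort so that identical observations are adjacent */
proc sort data=X; by id age sex; run;
/* Count observations per ID/AGE/SEX group; keep only the groups that occur once */
proc means data=X n noprint;
by id age sex;
output out=Z(where=(_freq_=1));
run;
/* Y = observations of X whose ID/AGE/SEX group is not in Z, i.e. the duplicates */
data Y;
merge X
Z(in=in_z keep=id age sex);
by id age sex;
if not in_z then output;
run;[/pre]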
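For comparison, here is a rough sketch of what DUPOUT= would give on the same X, which may help show why it does not match the requested Y and Z. The dataset names X_first and X_extra are made up for this illustration, and the observation counts in the comments are worked out by hand from the sample data above.

```sas
/* Sketch only: X_first and X_extra are hypothetical dataset names */
proc sort data=X out=X_first nodupkey dupout=X_extra;
by id age sex; /* every variable is a BY key, so a key match is a complete duplicate */
run;
/* X_first: one observation per distinct ID/AGE/SEX group (11 obs for the sample X) */
/* X_extra: only the second and later copies of each group (5 obs), not the 9 wanted in Y */
```

So the first copy of each duplicated observation stays in the OUT= dataset, mixed in with the truly unique ones, and never reaches the DUPOUT= dataset.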
Alternatively, assuming that X is already sorted by ID AGE SEX, a single DATA step can split it into both datasets:
data Y Z;
set X;
by ID AGE SEX; /* assume that X is sorted this way */
/* if the observation is both first and last in its ID/AGE/SEX group (tested on the last BY variable, SEX), it is unique */
if (first.SEX and last.SEX) then output Z;
else output Y; /* else, duplicate */
run;