Re: Proc sort Nodupkey..but I want to keep the first occurance

Ray_H · Posted 10-29-2008 06:56 PM

I have a data set in which I have duplicates (rcode, pcodes, and prov). I want to create a new table in which these are deduped, but retaining the first time they were experienced as unique by (keydt). I cant use proc sort nodupkey because I need to include keydate in the sort, and will yield no dups by that particular date. If I don’t use the keydate the sort will then sort by rcode, pcode and prov; and will not yield the first time the unique combination was observed.

data xx;
format keydt date. rcode $4. pcode $5.;
input ID keydt :date. rcode pcode pr;
cards;
8001 20-Jan-06 634 J0886 26
8001 20-Jan-06 636 J1270 26
8001 20-Jan-06 821 90999 26
8001 06-Feb-06 634 J0886 26
8001 06-Feb-06 636 J1270 26
8001 06-Feb-06 821 90999 26
8001 23-Feb-06 634 J0886 26
8001 23-Feb-06 636 J1270 26
8001 23-Feb-06 821 90935 26
8001 03-Mar-06 634 J0886 26
8001 03-Mar-06 636 J1270 26
8001 03-Mar-06 821 90935 26
8001 10-Apr-06 636 J1270 26
8001 10-Apr-06 821 90935 26
8001 20-Apr-06 634 J0886 26
8001 20-Apr-06 636 J1270 26
8001 20-Apr-06 821 90935 26
8001 13-May-06 270 A4657 26
8001 13-May-06 634 J0886 26
8001 13-May-06 636 J1270 26
8001 13-May-06 821 90935 26
8001 16-May-06 270 A4657 26
8001 16-May-06 634 J0886 26
8001 16-May-06 636 J1270 26
8001 16-May-06 821 90935 26
8001 22-May-06 634 J0886 26
8001 22-May-06 636 J1270 26
8001 22-May-06 821 90935 26
8001 07-Jun-06 634 J0886 26
8001 07-Jun-06 636 J1270 26
8001 07-Jun-06 821 90935 26
8001 20-Jun-06 634 J0886 26
8001 20-Jun-06 636 J1270 26
8001 20-Jun-06 821 90935 26
8001 07-Jul-06 270 A4657 26
8001 07-Jul-06 634 J0886 26
8001 07-Jul-06 636 J1270 26
8001 07-Jul-06 821 90935 26
8001 14-Jul-06 634 J0886 26

;
run;

sbb · Posted 10-29-2008 10:18 PM

Consider using multiple sorts, first without and then with the NODUPSKEY and also including the EQUALS parameter for your BY key variables.

Scott Barry
SBBWorks, Inc.

sbb · Posted 10-29-2008 10:34 PM

Another option if you want more "control" is to consider using a DATA step with a SET statement, a BY statement (input file must be sorted by the variable list), and use IF FIRST.variable_name and if necessary IF LAST.variable_name, and the OUTPUT statement to create the file as needed.

Scott Barry
SBBWorks, Inc.

Proc sort Nodupkey..but I want to keep the first occurance

Re: Proc sort Nodupkey..but I want to keep the first occurance

Re: Proc sort Nodupkey..but I want to keep the first occurance

Registration is open

SAS Training: Just a Click Away