- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Posted 10-29-2008 06:56 PM
(2476 views)
I have a data set in which I have duplicates (rcode, pcodes, and prov). I want to create a new table in which these are deduped, but retaining the first time they were experienced as unique by (keydt). I cant use proc sort nodupkey because I need to include keydate in the sort, and will yield no dups by that particular date. If I don’t use the keydate the sort will then sort by rcode, pcode and prov; and will not yield the first time the unique combination was observed.
data xx;
format keydt date. rcode $4. pcode $5.;
input ID keydt :date. rcode pcode pr;
cards;
8001 20-Jan-06 634 J0886 26
8001 20-Jan-06 636 J1270 26
8001 20-Jan-06 821 90999 26
8001 06-Feb-06 634 J0886 26
8001 06-Feb-06 636 J1270 26
8001 06-Feb-06 821 90999 26
8001 23-Feb-06 634 J0886 26
8001 23-Feb-06 636 J1270 26
8001 23-Feb-06 821 90935 26
8001 03-Mar-06 634 J0886 26
8001 03-Mar-06 636 J1270 26
8001 03-Mar-06 821 90935 26
8001 10-Apr-06 636 J1270 26
8001 10-Apr-06 821 90935 26
8001 20-Apr-06 634 J0886 26
8001 20-Apr-06 636 J1270 26
8001 20-Apr-06 821 90935 26
8001 13-May-06 270 A4657 26
8001 13-May-06 634 J0886 26
8001 13-May-06 636 J1270 26
8001 13-May-06 821 90935 26
8001 16-May-06 270 A4657 26
8001 16-May-06 634 J0886 26
8001 16-May-06 636 J1270 26
8001 16-May-06 821 90935 26
8001 22-May-06 634 J0886 26
8001 22-May-06 636 J1270 26
8001 22-May-06 821 90935 26
8001 07-Jun-06 634 J0886 26
8001 07-Jun-06 636 J1270 26
8001 07-Jun-06 821 90935 26
8001 20-Jun-06 634 J0886 26
8001 20-Jun-06 636 J1270 26
8001 20-Jun-06 821 90935 26
8001 07-Jul-06 270 A4657 26
8001 07-Jul-06 634 J0886 26
8001 07-Jul-06 636 J1270 26
8001 07-Jul-06 821 90935 26
8001 14-Jul-06 634 J0886 26
;
run;
data xx;
format keydt date. rcode $4. pcode $5.;
input ID keydt :date. rcode pcode pr;
cards;
8001 20-Jan-06 634 J0886 26
8001 20-Jan-06 636 J1270 26
8001 20-Jan-06 821 90999 26
8001 06-Feb-06 634 J0886 26
8001 06-Feb-06 636 J1270 26
8001 06-Feb-06 821 90999 26
8001 23-Feb-06 634 J0886 26
8001 23-Feb-06 636 J1270 26
8001 23-Feb-06 821 90935 26
8001 03-Mar-06 634 J0886 26
8001 03-Mar-06 636 J1270 26
8001 03-Mar-06 821 90935 26
8001 10-Apr-06 636 J1270 26
8001 10-Apr-06 821 90935 26
8001 20-Apr-06 634 J0886 26
8001 20-Apr-06 636 J1270 26
8001 20-Apr-06 821 90935 26
8001 13-May-06 270 A4657 26
8001 13-May-06 634 J0886 26
8001 13-May-06 636 J1270 26
8001 13-May-06 821 90935 26
8001 16-May-06 270 A4657 26
8001 16-May-06 634 J0886 26
8001 16-May-06 636 J1270 26
8001 16-May-06 821 90935 26
8001 22-May-06 634 J0886 26
8001 22-May-06 636 J1270 26
8001 22-May-06 821 90935 26
8001 07-Jun-06 634 J0886 26
8001 07-Jun-06 636 J1270 26
8001 07-Jun-06 821 90935 26
8001 20-Jun-06 634 J0886 26
8001 20-Jun-06 636 J1270 26
8001 20-Jun-06 821 90935 26
8001 07-Jul-06 270 A4657 26
8001 07-Jul-06 634 J0886 26
8001 07-Jul-06 636 J1270 26
8001 07-Jul-06 821 90935 26
8001 14-Jul-06 634 J0886 26
;
run;
2 REPLIES 2
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Consider using multiple sorts, first without and then with the NODUPSKEY and also including the EQUALS parameter for your BY key variables.
Scott Barry
SBBWorks, Inc.
Scott Barry
SBBWorks, Inc.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another option if you want more "control" is to consider using a DATA step with a SET statement, a BY statement (input file must be sorted by the variable list), and use IF FIRST.variable_name and if necessary IF LAST.variable_name, and the OUTPUT statement to create the file as needed.
Scott Barry
SBBWorks, Inc.
Scott Barry
SBBWorks, Inc.