BookmarkSubscribeRSS Feed
Ray_H
Calcite | Level 5
I have a data set in which I have duplicates (rcode, pcodes, and prov). I want to create a new table in which these are deduped, but retaining the first time they were experienced as unique by (keydt). I cant use proc sort nodupkey because I need to include keydate in the sort, and will yield no dups by that particular date. If I don’t use the keydate the sort will then sort by rcode, pcode and prov; and will not yield the first time the unique combination was observed.


data xx;
format keydt date. rcode $4. pcode $5.;
input ID keydt :date. rcode pcode pr;
cards;
8001 20-Jan-06 634 J0886 26
8001 20-Jan-06 636 J1270 26
8001 20-Jan-06 821 90999 26
8001 06-Feb-06 634 J0886 26
8001 06-Feb-06 636 J1270 26
8001 06-Feb-06 821 90999 26
8001 23-Feb-06 634 J0886 26
8001 23-Feb-06 636 J1270 26
8001 23-Feb-06 821 90935 26
8001 03-Mar-06 634 J0886 26
8001 03-Mar-06 636 J1270 26
8001 03-Mar-06 821 90935 26
8001 10-Apr-06 636 J1270 26
8001 10-Apr-06 821 90935 26
8001 20-Apr-06 634 J0886 26
8001 20-Apr-06 636 J1270 26
8001 20-Apr-06 821 90935 26
8001 13-May-06 270 A4657 26
8001 13-May-06 634 J0886 26
8001 13-May-06 636 J1270 26
8001 13-May-06 821 90935 26
8001 16-May-06 270 A4657 26
8001 16-May-06 634 J0886 26
8001 16-May-06 636 J1270 26
8001 16-May-06 821 90935 26
8001 22-May-06 634 J0886 26
8001 22-May-06 636 J1270 26
8001 22-May-06 821 90935 26
8001 07-Jun-06 634 J0886 26
8001 07-Jun-06 636 J1270 26
8001 07-Jun-06 821 90935 26
8001 20-Jun-06 634 J0886 26
8001 20-Jun-06 636 J1270 26
8001 20-Jun-06 821 90935 26
8001 07-Jul-06 270 A4657 26
8001 07-Jul-06 634 J0886 26
8001 07-Jul-06 636 J1270 26
8001 07-Jul-06 821 90935 26
8001 14-Jul-06 634 J0886 26

;
run;
2 REPLIES 2
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Consider using multiple sorts, first without and then with the NODUPSKEY and also including the EQUALS parameter for your BY key variables.

Scott Barry
SBBWorks, Inc.
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Another option if you want more "control" is to consider using a DATA step with a SET statement, a BY statement (input file must be sorted by the variable list), and use IF FIRST.variable_name and if necessary IF LAST.variable_name, and the OUTPUT statement to create the file as needed.

Scott Barry
SBBWorks, Inc.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 1898 views
  • 0 likes
  • 2 in conversation