lboyd
Calcite | Level 5

How can I delete duplicates of one variable based on a start date using NODUPKEY?

Thanks!

ACCEPTED SOLUTION
Reeza
Super User

Sort it twice. First time with email address and date, second with email address and the NODUPKEY option.


7 REPLIES
Reeza
Super User

 

Provide more details.


@lboyd wrote:

How can I delete duplicates of one variable based on a start date using NODUPKEY?

Thanks!


 

lboyd
Calcite | Level 5
I've been using this:
PROC SORT DATA=CMpre DUPOUT=results NODUPKEY ;
BY QID25;
RUN ;

Here QID25 is an email address. I need to get rid of duplicate email addresses and keep the row with the earliest start date. The startdate variable looks something like this:
22JUN17:00:00:00
Reeza
Super User

Sort it twice. First time with email address and date, second with email address and the NODUPKEY option.
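
A minimal sketch of the two-sort approach, using the dataset and variable names from the post above. The output dataset names CMsorted and CMdedup are hypothetical, and the sketch assumes startdate is a numeric SAS datetime value (displayed with a DATETIME format) rather than a character string, so that it sorts chronologically. It also relies on PROC SORT's default EQUALS behavior, which preserves the relative order of rows within each BY group on the second sort.

```sas
/* Step 1: order rows by email, then by start date, so the
   earliest start date comes first within each email address. */
proc sort data=CMpre out=CMsorted;
    by QID25 startdate;
run;

/* Step 2: NODUPKEY keeps the first row it meets in each BY group,
   which is now the earliest one. Discarded rows go to DUPOUT=. */
proc sort data=CMsorted out=CMdedup dupout=results nodupkey;
    by QID25;
run;
```

Note that rows with a missing startdate sort before all other dates, so such a row would be the one kept for its email address.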

lboyd
Calcite | Level 5
This code gets rid of missing e-mails. Is there a way to prevent that?
lboyd
Calcite | Level 5
Also, some of the e-mails start with an uppercase letter while others start with a lowercase letter. Is there any way to delete based on both? For instance, if someone entered Sam123@gmail.com and also sam123@gmail.com, I'd want one of those deleted from the database.
Reeza
Super User
@lboyd wrote:

For instance, if someone said Sam123@gmail.com and also sam123@gmail.com, I'd want one of those deleted from the database.

 

To fix this you need to clean your data first.
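
One possible way to do that cleaning is to build a lower-cased copy of the email and de-duplicate on the copy. This is only a sketch: email_key and CMclean are hypothetical names, while CMpre and QID25 come from the earlier post.

```sas
/* Build a case-insensitive key for de-duplication. */
data CMclean;
    set CMpre;
    /* left() left-aligns the value (leading blanks move to the end);
       lowcase() converts all letters to lowercase. */
    email_key = lowcase(left(QID25));
run;
```

Sorting and de-duplicating BY email_key instead of QID25 then treats Sam123@gmail.com and sam123@gmail.com as the same address, while the original QID25 value is left untouched.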

 

@lboyd wrote:

This code gets rid of missing e-mails, is there a way to prevent that?

 

To deal with this, you will likely need to do the de-duplication manually rather than with NODUPKEY.

First sort, then use a DATA step with FIRST./LAST. BY-group processing, coding an exception for the missing emails.

 

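/* Sort so that the rows for each group are adjacent. */
proc sort data=have;
    by group_var;
run;

data want;
    set have;
    by group_var;
    /* Keep the first row of each group, plus every row where group_var is missing. */
    if first.group_var or missing(group_var);
run;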
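
Applied to the data described earlier in the thread, the same pattern might look like the sketch below. It assumes startdate is a numeric SAS datetime value. CMsorted, CMdedup, and CMdups are hypothetical dataset names, and CMdups only stands in for what DUPOUT= did in the original PROC SORT.

```sas
/* Sort by email, then by start date, so the earliest row leads each email group. */
proc sort data=CMpre out=CMsorted;
    by QID25 startdate;
run;

data CMdedup CMdups;
    set CMsorted;
    by QID25;
    /* Keep the earliest row per email, and keep every row with a missing email. */
    if first.QID25 or missing(QID25) then output CMdedup;
    /* All other rows are duplicates of an email that was already kept. */
    else output CMdups;
run;
```

Because FIRST.QID25 is evaluated on data sorted by startdate within QID25, the row kept for each non-missing email is the one with the earliest start date.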



lboyd
Calcite | Level 5
Thank you!
