
Deduplicate dataset

Contributor
Posts: 39

How can I deduplicate a dataset by ID1 and ID2, given the following structure?

ID1   ID2  DATE       VALUE
1000  10   31JAN2016  5
1000  10   26FEB2016  6
1000  10   31MAR2016  7

The result I expect is:

ID1   ID2  DATE       VALUE
1000  10   31MAR2016  7

(That is, the row with the most recent DATE for each ID1/ID2 combination.)


Accepted Solutions
Solution
‎11-15-2016 07:00 PM
Super User
Posts: 17,894

Re: Deduplicate dataset

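Use PROC SORT, then keep the last row of each BY group.

/* Sort so the most recent date comes last within each ID1/ID2 group */
proc sort data=have;
  by id1 id2 date;
run;

/* Keep only the last observation of each ID1/ID2 group */
data want;
  set have;
  by id1 id2;
  if last.id2;
run;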

 

 



All Replies
Contributor
Posts: 39

Re: Deduplicate dataset

Thank you, it works very well. Just one question about this line:
if last.id2;
Why the variable id2? Could it be any variable?
Super User
Posts: 17,894

Re: Deduplicate dataset

No, it has to be a variable listed in the BY statement.

LAST.ID2 is true on the last row of each ID1/ID2 group. Because the data is sorted by date within each group, that row is the most recent one, which is what you asked for. You can modify it to fit your needs, for example if you have more BY variables.

You can review the documentation on how BY groups are processed and how the FIRST. and LAST. variables are calculated.

I find the examples and illustrations here helpful:

http://support.sas.com/documentation/cdl/en/lrcon/68089/HTML/default/viewer.htm#p0xu93fy5eemkyn1p6mj...
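
If it helps to see the flags directly, here is a rough sketch using the three rows from the question. The dataset name flags and the variables first_id1, last_id1, first_id2 and last_id2 are names I made up for illustration. FIRST. and LAST. variables are temporary and are not written to the output dataset, so the step copies them into ordinary variables before printing.

```sas
/* Sample data copied from the question */
data have;
  input id1 id2 date :date9. value;
  format date date9.;
  datalines;
1000 10 31JAN2016 5
1000 10 26FEB2016 6
1000 10 31MAR2016 7
;
run;

/* BY-group processing requires the data to be sorted by the BY variables */
proc sort data=have;
  by id1 id2 date;
run;

/* Copy the temporary FIRST./LAST. flags into regular (hypothetical) variables */
data flags;
  set have;
  by id1 id2;
  first_id1 = first.id1;
  last_id1  = last.id1;
  first_id2 = first.id2;
  last_id2  = last.id2;
run;

proc print data=flags noobs;
run;
```

With this sample, last_id2 is 1 only on the 31MAR2016 row, which is why the subsetting IF keeps that row. A variable that is not in the BY statement, such as VALUE, has no FIRST. or LAST. flag at all.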

 
