Identifying deleted duplicates using a lag statement.

Contributor
Posts: 31

data finals.palmbeach17;
set finals.palmbeachsales17;
if lag (parcel_id)=parcel_id and lag (sale_prc)=sale_prc and lag (sale_mo)=sale_mo then delete;
run;

I am trying to look at the observations that are deleted by the lag statement above. Is there any way I can print or capture the deleted observations?

PROC Star
Posts: 1,296

Re: Identifying deleted duplicates using a lag statement.

Posted in reply to andrewfau
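data finals.palmbeach17 check;
  set finals.palmbeachsales17;
  /* Rows that match the previous row go to CHECK instead of being deleted */
  if lag(parcel_id)=parcel_id and lag(sale_prc)=sale_prc and lag(sale_mo)=sale_mo then do;
    output check;
  end;
  /* Everything else is kept, as in the original step */
  else output finals.palmbeach17;
run;
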
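
LAG returns values from a queue that is updated only when the function actually executes, so a more cautious variant calls it once per variable on every observation, before any IF. The sketch below is an illustration, not a drop-in replacement: `work.sales`, `work.kept`, `work.check`, and the `prev_` variables are hypothetical names, and the sample values are invented to stand in for finals.palmbeachsales17.

```sas
/* Hypothetical sample data standing in for finals.palmbeachsales17 */
data work.sales;
  input parcel_id sale_prc sale_mo;
  datalines;
101 250000 3
101 250000 3
102 180000 5
103 320000 7
103 320000 7
103 320000 8
;
run;

data work.kept work.check;
  set work.sales;
  /* Call LAG unconditionally so each queue is updated on every row */
  prev_id  = lag(parcel_id);
  prev_prc = lag(sale_prc);
  prev_mo  = lag(sale_mo);
  /* A row identical to the previous one is what the original step deleted */
  if prev_id = parcel_id and prev_prc = sale_prc and prev_mo = sale_mo
    then output work.check;
  else output work.kept;
  drop prev_:;
run;

/* Print the captured duplicates */
proc print data=work.check;
run;
```

With this sample, the second and fifth input rows land in work.check, and the other four go to work.kept.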
Trusted Advisor
Posts: 1,822

Re: Identifying deleted duplicates using a lag statement.

Posted in reply to andrewfau

It is better to sort the data and use first./last. BY-group processing to separate the duplicates from the last observation in each group,

for example:

proc sort data=have;
  by parcel_id sale_mo /* month ? */ sale_prc;
run;

/* The last row of each BY group goes to ONES; earlier repeats go to DUPS */
data ones dups;
  set have;
  by parcel_id sale_mo sale_prc;
  if last.sale_prc then output ones;
  else output dups;
run;
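
If keeping the first observation of each group (rather than the last) is acceptable, PROC SORT can probably do the same split in one step with the NODUPKEY and DUPOUT= options. This is a sketch under that assumption; the sample rows are invented, and `have`, `ones`, and `dups` are the same placeholder names used above.

```sas
/* Hypothetical sample data; note the duplicates are not adjacent */
data have;
  input parcel_id sale_prc sale_mo;
  datalines;
101 250000 3
102 180000 5
101 250000 3
103 320000 7
103 320000 8
103 320000 7
;
run;

/* NODUPKEY keeps one row per BY group in ONES;
   DUPOUT= writes the discarded rows to DUPS */
proc sort data=have out=ones nodupkey dupout=dups;
  by parcel_id sale_mo sale_prc;
run;

proc print data=dups;
run;
```

Because the comparison happens after sorting, this also catches duplicates that are not on consecutive rows, which a LAG comparison on unsorted data would miss.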