Difference between values of a same variable

Contributor
Posts: 25

Hi,

Could anyone tell me how to delete the rows where the difference in dates for the same ID is more than 2 years?

id    date
1001  02july2012
1001  02june2011
1001  08april2009
1002  02july2011

Thanks in advance!

Rk


Accepted Solutions
Solution
‎03-07-2013 04:34 PM
Respected Advisor
Posts: 4,934

Re: Difference between values of a same variable

It can be done this way:

/* Read the date as text, cut the month name down to three letters, then apply DATE9. */
data have(drop=datetxt);
  format date yymmdd10.;
  input id datetxt :$15.;
  date = input(prxchange("s/(\d+)(\D{3})\D*(\d+)/\1\2\3/", 1, datetxt), date9.);
datalines;
1001  02july2012
1001  02june2011
1001  08april2009
1001  03may2009
1002  02july2011
;

/* Latest date first within each id */
proc sort data=have; by id descending date; run;

/* Keep a row only when it is less than 2 full years older than the last row kept */
data want(drop=lastDate);
  retain lastDate;
  set have; by id;
  if first.id then lastDate = date;
  if intck("YEAR", date, lastDate, "CONTINUOUS") < 2 then do;
    output;
    lastDate = date;
  end;
run;

proc print; run;
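
The "CONTINUOUS" method is what makes INTCK count completed years from the start date rather than calendar-year boundaries crossed. A minimal sketch of the difference, using two dates from the sample data plus one illustrative pair of dates that is not in the thread:

```sas
data _null_;
  /* Default (discrete) method: counts 1 January boundaries crossed */
  discrete   = intck("YEAR", "31dec2011"d, "01jan2012"d);
  /* Continuous method: counts whole years elapsed since the first date */
  continuous = intck("YEAR", "31dec2011"d, "01jan2012"d, "CONTINUOUS");
  /* Two dates from the sample data, measured the same way */
  sample     = intck("YEAR", "03may2009"d, "02jun2011"d, "CONTINUOUS");
  put discrete= continuous= sample=;
run;
```

The log should show discrete=1 continuous=0 sample=2, which is why the 03may2009 row fails the "< 2" test in the step above.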

PG



All Replies
Respected Advisor
Posts: 4,934

Re: Difference between values of a same variable

Which ones do you wish to keep in the following set?

id    date
1001  02july2012
1001  02june2011
1001  08april2009
1001  03may2009
1002  02july2011

PG

PROC Star
Posts: 7,492

Re: Difference between values of a same variable

In your example which record(s) do you want to delete?

Contributor
Posts: 25

Re: Difference between values of a same variable

Hi, sorry for not being clear.

I want to delete observations whose date is more than 2 years from the latest date for the same ID. For example, I want to delete

1001 08april2009
1001 03may2009

since they are more than 2 years from 02july2012.

Thanks,

Rk

PROC Star
Posts: 7,492

Re: Difference between values of a same variable

You will have two problems to solve. First, I don't think SAS has an informat that can read the kind of dates that you have, so you will have to parse the dates.

Second, if you have SAS 9.3, the calculation is easy:

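data have;
  informat id $4.;
  informat date date9.;
  format date date9.;
  input @;
  /* Rebuild the line as "id ddmonyyyy": id, day plus first 3 letters of the month, 4-digit year */
  _infile_=substr(_infile_,1,5)||
           substr(strip(substr(_infile_,5)),1,5)||
           substr(_infile_,length(_infile_)-3);
  input id date;
  cards;
1001 02july2012
1001 02june2011
1001 08april2009
1001 03may2009
1002 02july2011
;

/* The next step treats the first row of each id as the latest date, so sort that way first */
proc sort data=have;
  by id descending date;
run;

data want (drop=start_date);
  set have;
  retain start_date;
  by id;
  if first.id then start_date=date;
  /* Delete rows more than 2 years older than the latest date for the id */
  else if yrdif(date, start_date, 'AGE') gt 2 then delete;
run;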
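
The same rule (measure every row against the latest date for its id) can also be sketched in PROC SQL without sorting or RETAIN. This is only an alternative under stated assumptions: it expects the `have` data set created above with a numeric SAS date in `date`, and it reads "more than 2 years" strictly, so a row exactly 2 years older than the latest date is kept.

```sas
proc sql;
  create table want as
  select id, date
  from have
  group by id
  /* max(date) is remerged onto every row of the id group;     */
  /* intnx(..., 'same') gives the same calendar day 2 years earlier */
  having date >= intnx('year', max(date), -2, 'same')
  order by id, date desc;
quit;
```

SAS writes a note to the log that the query requires remerging summary statistics with the original data; that is expected here. For the sample rows this keeps 02july2012 and 02june2011 for id 1001 and the single row for id 1002.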
