data have;
infile datalines missover;
input lag24 ddmmyy12. @12 id @14 date ddmmyy12. @25 score;
format date ddmmyy10.;
format lag24 ddmmyy10.;
datalines;
1 21-09-2009 1.4
25-09-2007 1 25-09-2009 1.4
16-11-2007 1 16-11-2009 2.2
03-02-2008 1 03-02-2010 2.6
2 21-09-2009 1.4
25-09-2007 2 25-09-2009 1.4
16-11-2007 2 16-11-2009 2.2
03-02-2012 2 03-02-2014 2.6
;
run;
I need the following data. We start with the last row within the group variable "ID", and check if the previous dates are within two years of the last date. Lets say the last date in the ID=1 is 03-02-2010 and the previous date is 16-11-2009 which is between 03-02-2008 and 03-02-2010, so take we sum of score (2.6+2.2). Then again check if the date before the previous date, ie. 2nd row from top 25-09-2009 is between 03-02-2008 and 03-02-2010, so take we sum of score (2.6+2.2+1.4) and again for the first row (2.6+2.2+1.41.4=7.6) which is diplayed on the last row. Similarly we take the 2nd last row as "index row" and check if the previous dates are between the date of index date and lag24=(index date-24 months).
data want;
infile datalines missover;
input lag24 ddmmyy12. @12 id @14 date ddmmyy12. @25 score @29 sum;
format date ddmmyy10.;
format lag24 ddmmyy10.;
datalines;
1 21-09-2009 1.4
25-09-2007 1 25-09-2009 1.4 2.8
16-11-2007 1 16-11-2009 2.2 6.2
03-02-2008 1 03-02-2010 2.6 7.6
2 21-09-2009 1.4
25-09-2007 2 25-09-2009 1.4 2.8
16-11-2007 2 16-11-2009 2.2 5
03-02-2012 2 03-02-2014 2.6
;
run;
Try this:
proc sql;
create table want as
select
a.id,
a.date,
a.score,
case
when sum(b.score) = a.score
then .
else sum(b.score)
end as sum
from have a
left join have b
on a.id = b.id and b.date between a.lag24 and a.date
group by
a.id,
a.date,
a.score
;
quit;
Try this:
proc sql;
create table want as
select
a.id,
a.date,
a.score,
case
when sum(b.score) = a.score
then .
else sum(b.score)
end as sum
from have a
left join have b
on a.id = b.id and b.date between a.lag24 and a.date
group by
a.id,
a.date,
a.score
;
quit;
Here is a data step solution:
data want;
set have;
by id;
if first.id then do;
_first_=_N_;
_sum_=score;
end;
else do;
_sum_+score;
do _first_=_first_ to _N_;
set have(keep=date score rename=(date=_date_ score=_score_)) point=_first_;
if _date_>=lag24 then leave;
_sum_=sum-_score_;
end;
sum=_sum_;
end;
retain _:;
drop _:;
run;
BTW, I think there is an error in your WANT data step, I think the SUM in row 3 should be 5, not 6.2.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.