BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
A_SAS_Man
Pyrite | Level 9

I am trying to look at different lines of data and count how many times a line falls within 30 days of previous lines for the same id and code. I have been trying to do this with a data step but haven't been able to figure it out and am open to any solutions.

 

Here is my starting data.

 

data have;
input id code date :date9.;
format date date9.;
datalines;
4350 1 '01JUN2021'd
4350 1 '13JUN2021'd
4456 3 '27JUL2021'd
4456 4 '27NOV2021'd
1234 1 '01JAN2021'd
1234 1 '10JAN2021'd
1234 1 '21JAN2021'd
;
run;

Here is what I'm trying to generate.

 

data want;
input id code date :date9. count;
format date date9.;
datalines;
4350 1 '01JUN2021'd 0
4350 1 '13JUN2021'd 1
4456 3 '27JUL2021'd 0
4456 4 '27NOV2021'd 0
1234 1 '01JAN2021'd 0
1234 1 '10JAN2021'd 1
1234 1 '21JAN2021'd 2
;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

So if I interpret you right, we need to keep a starting date for the group, and start a new group when we go past 30 days of that:

data want;
set have;
by id notsorted code;
retain start_date;
l_date = lag(date);
if first.code then start_date = date;
if not first.code and date - l_date le 30
then do;
  if date - start_date le 30
  then count + 1;
  else do;
    count = 1;
    start_date = l_date;
  end;
end;
else count = 0;
drop l_date start_date;
run;

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

In a data step, use BY processing and LAG to compare with the previous value.

data want;
set have;
by id notsorted code;
if not first.code and date - lag(date) le 30
then count + 1;
else count = 0;
run;

The SUM statement (count + 1) implies an automatic RETAIN.

A_SAS_Man
Pyrite | Level 9

This is very close, but it seems to "overcount" in some instances. See below modified example to demonstrate.

data have;
input id code date :date9.;
format date date9.;
datalines;
4350 1 '01JUN2021'd
4350 1 '13JUN2021'd
4456 3 '27JUL2021'd
4456 4 '27NOV2021'd
1234 1 '01JAN2021'd
1234 1 '10JAN2021'd
1234 1 '21JAN2021'd
1234 1 '15FEB2021'd
;
run;

Your code would produce this:

data code_output;
input id code date :date9. count;
format date date9.;
datalines;
4350 1 '01JUN2021'd 0
4350 1 '13JUN2021'd 1
4456 3 '27JUL2021'd 0
4456 4 '27NOV2021'd 0
1234 1 '01JAN2021'd 0
1234 1 '10JAN2021'd 1
1234 1 '21JAN2021'd 2
1234 1 '15FEB2021'd 3
;
run;

What I'm trying to get is this:

data want;
input id code date :date9. count;
format date date9.;
datalines;
4350 1 '01JUN2021'd 0
4350 1 '13JUN2021'd 1
4456 3 '27JUL2021'd 0
4456 4 '27NOV2021'd 0
1234 1 '01JAN2021'd 0
1234 1 '10JAN2021'd 1
1234 1 '21JAN2021'd 2
1234 1 '15FEB2021'd 1
;
run;

This is because the last line is only within 30 days of one of the other lines with matching codes and ids.

Kurt_Bremser
Super User

So if I interpret you right, we need to keep a starting date for the group, and start a new group when we go past 30 days of that:

data want;
set have;
by id notsorted code;
retain start_date;
l_date = lag(date);
if first.code then start_date = date;
if not first.code and date - l_date le 30
then do;
  if date - start_date le 30
  then count + 1;
  else do;
    count = 1;
    start_date = l_date;
  end;
end;
else count = 0;
drop l_date start_date;
run;
ballardw
Super User

You can save yourself a bit of typing for the data lines in your code:

data have;
input id code date :date9.;
format date date9.;
datalines;
4350 1 01JUN2021
4350 1 13JUN2021
4456 3 27JUL2021
4456 4 27NOV2021
1234 1 01JAN2021
1234 1 10JAN2021
1234 1 21JAN2021
;
run;

the quotes and d are needed to define a literal date in code and not needed, and potentially can cause errors when reading the data.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1440 views
  • 1 like
  • 3 in conversation