BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
A_SAS_Man
Pyrite | Level 9

I am trying to look at different lines of data and count how many times a line falls within 30 days of previous lines for the same id and code. I have been trying to do this with a data step but haven't been able to figure it out and am open to any solutions.

 

Here is my starting data.

 

data have;
input id code date :date9.;
format date date9.;
datalines;
4350 1 '01JUN2021'd
4350 1 '13JUN2021'd
4456 3 '27JUL2021'd
4456 4 '27NOV2021'd
1234 1 '01JAN2021'd
1234 1 '10JAN2021'd
1234 1 '21JAN2021'd
;
run;

Here is what I'm trying to generate.

 

data want;
input id code date :date9. count;
format date date9.;
datalines;
4350 1 '01JUN2021'd 0
4350 1 '13JUN2021'd 1
4456 3 '27JUL2021'd 0
4456 4 '27NOV2021'd 0
1234 1 '01JAN2021'd 0
1234 1 '10JAN2021'd 1
1234 1 '21JAN2021'd 2
;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

So if I interpret you right, we need to keep a starting date for the group, and start a new group when we go past 30 days of that:

data want;
set have;
by id notsorted code;
retain start_date;
l_date = lag(date);
if first.code then start_date = date;
if not first.code and date - l_date le 30
then do;
  if date - start_date le 30
  then count + 1;
  else do;
    count = 1;
    start_date = l_date;
  end;
end;
else count = 0;
drop l_date start_date;
run;

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

In a data step, use BY processing and LAG to compare with the previous value.

data want;
set have;
by id notsorted code;
if not first.code and date - lag(date) le 30
then count + 1;
else count = 0;
run;

The SUM statement (count + 1) implies an automatic RETAIN.

A_SAS_Man
Pyrite | Level 9

This is very close, but it seems to "overcount" in some instances. See below modified example to demonstrate.

data have;
input id code date :date9.;
format date date9.;
datalines;
4350 1 '01JUN2021'd
4350 1 '13JUN2021'd
4456 3 '27JUL2021'd
4456 4 '27NOV2021'd
1234 1 '01JAN2021'd
1234 1 '10JAN2021'd
1234 1 '21JAN2021'd
1234 1 '15FEB2021'd
;
run;

Your code would produce this:

data code_output;
input id code date :date9. count;
format date date9.;
datalines;
4350 1 '01JUN2021'd 0
4350 1 '13JUN2021'd 1
4456 3 '27JUL2021'd 0
4456 4 '27NOV2021'd 0
1234 1 '01JAN2021'd 0
1234 1 '10JAN2021'd 1
1234 1 '21JAN2021'd 2
1234 1 '15FEB2021'd 3
;
run;

What I'm trying to get is this:

data want;
input id code date :date9. count;
format date date9.;
datalines;
4350 1 '01JUN2021'd 0
4350 1 '13JUN2021'd 1
4456 3 '27JUL2021'd 0
4456 4 '27NOV2021'd 0
1234 1 '01JAN2021'd 0
1234 1 '10JAN2021'd 1
1234 1 '21JAN2021'd 2
1234 1 '15FEB2021'd 1
;
run;

This is because the last line is only within 30 days of one of the other lines with matching codes and ids.

Kurt_Bremser
Super User

So if I interpret you right, we need to keep a starting date for the group, and start a new group when we go past 30 days of that:

data want;
set have;
by id notsorted code;
retain start_date;
l_date = lag(date);
if first.code then start_date = date;
if not first.code and date - l_date le 30
then do;
  if date - start_date le 30
  then count + 1;
  else do;
    count = 1;
    start_date = l_date;
  end;
end;
else count = 0;
drop l_date start_date;
run;
ballardw
Super User

You can save yourself a bit of typing for the data lines in your code:

data have;
input id code date :date9.;
format date date9.;
datalines;
4350 1 01JUN2021
4350 1 13JUN2021
4456 3 27JUL2021
4456 4 27NOV2021
1234 1 01JAN2021
1234 1 10JAN2021
1234 1 21JAN2021
;
run;

the quotes and d are needed to define a literal date in code and not needed, and potentially can cause errors when reading the data.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1321 views
  • 1 like
  • 3 in conversation