BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
A_SAS_Man
Pyrite | Level 9

I am trying to look at different lines of data and count how many times a line falls within 30 days of previous lines for the same id and code. I have been trying to do this with a data step but haven't been able to figure it out and am open to any solutions.

 

Here is my starting data.

 

data have;
input id code date :date9.;
format date date9.;
datalines;
4350 1 '01JUN2021'd
4350 1 '13JUN2021'd
4456 3 '27JUL2021'd
4456 4 '27NOV2021'd
1234 1 '01JAN2021'd
1234 1 '10JAN2021'd
1234 1 '21JAN2021'd
;
run;

Here is what I'm trying to generate.

 

data want;
input id code date :date9. count;
format date date9.;
datalines;
4350 1 '01JUN2021'd 0
4350 1 '13JUN2021'd 1
4456 3 '27JUL2021'd 0
4456 4 '27NOV2021'd 0
1234 1 '01JAN2021'd 0
1234 1 '10JAN2021'd 1
1234 1 '21JAN2021'd 2
;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

So if I interpret you right, we need to keep a starting date for the group, and start a new group when we go past 30 days of that:

data want;
set have;
by id notsorted code;
retain start_date;
l_date = lag(date);
if first.code then start_date = date;
if not first.code and date - l_date le 30
then do;
  if date - start_date le 30
  then count + 1;
  else do;
    count = 1;
    start_date = l_date;
  end;
end;
else count = 0;
drop l_date start_date;
run;

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

In a data step, use BY processing and LAG to compare with the previous value.

data want;
set have;
by id notsorted code;
if not first.code and date - lag(date) le 30
then count + 1;
else count = 0;
run;

The SUM statement (count + 1) implies an automatic RETAIN.

A_SAS_Man
Pyrite | Level 9

This is very close, but it seems to "overcount" in some instances. See below modified example to demonstrate.

data have;
input id code date :date9.;
format date date9.;
datalines;
4350 1 '01JUN2021'd
4350 1 '13JUN2021'd
4456 3 '27JUL2021'd
4456 4 '27NOV2021'd
1234 1 '01JAN2021'd
1234 1 '10JAN2021'd
1234 1 '21JAN2021'd
1234 1 '15FEB2021'd
;
run;

Your code would produce this:

data code_output;
input id code date :date9. count;
format date date9.;
datalines;
4350 1 '01JUN2021'd 0
4350 1 '13JUN2021'd 1
4456 3 '27JUL2021'd 0
4456 4 '27NOV2021'd 0
1234 1 '01JAN2021'd 0
1234 1 '10JAN2021'd 1
1234 1 '21JAN2021'd 2
1234 1 '15FEB2021'd 3
;
run;

What I'm trying to get is this:

data want;
input id code date :date9. count;
format date date9.;
datalines;
4350 1 '01JUN2021'd 0
4350 1 '13JUN2021'd 1
4456 3 '27JUL2021'd 0
4456 4 '27NOV2021'd 0
1234 1 '01JAN2021'd 0
1234 1 '10JAN2021'd 1
1234 1 '21JAN2021'd 2
1234 1 '15FEB2021'd 1
;
run;

This is because the last line is only within 30 days of one of the other lines with matching codes and ids.

Kurt_Bremser
Super User

So if I interpret you right, we need to keep a starting date for the group, and start a new group when we go past 30 days of that:

data want;
set have;
by id notsorted code;
retain start_date;
l_date = lag(date);
if first.code then start_date = date;
if not first.code and date - l_date le 30
then do;
  if date - start_date le 30
  then count + 1;
  else do;
    count = 1;
    start_date = l_date;
  end;
end;
else count = 0;
drop l_date start_date;
run;
ballardw
Super User

You can save yourself a bit of typing for the data lines in your code:

data have;
input id code date :date9.;
format date date9.;
datalines;
4350 1 01JUN2021
4350 1 13JUN2021
4456 3 27JUL2021
4456 4 27NOV2021
1234 1 01JAN2021
1234 1 10JAN2021
1234 1 21JAN2021
;
run;

the quotes and d are needed to define a literal date in code and not needed, and potentially can cause errors when reading the data.

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1899 views
  • 1 like
  • 3 in conversation