BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
A_SAS_Man
Pyrite | Level 9

I am attempting to repeat events within a certain time. I asked a similar question at the link below, and received an answer that worked for that problem. I'm trying to edit it for a slightly different issue and I'm running into problems.

https://communities.sas.com/t5/SAS-Programming/How-to-Count-Dates-within-Certain-Range/m-p/742739#M2...

 

I want to count the number of "repeat events" for a certain line of information within 30 days. For example here is the data I have and the data I want.

 

data have;
input id code date1 :date9. date2 :date9.;
format date1 date9. date2 date9.;
datalines;
4350 1 '01JUN2021'd '03JUN2021'd
4350 1 '02JUl2021'd '14JUl2021'd
4350 1 '16JUl2021'd '17JUl2021'd
4350 1 '20JUl2021'd '22JUl2021'd
4456 3 '20AUG2020'd '21OCT2020'd
4456 4 '19MAR2020'd '23MAR2020'd
1234 1 '01JAN2021'd '02JAN2021'd
1234 1 '10JAN2021'd '21JAN2021'd
1234 1 '24FEB2021'd '25FEB2021'd
1234 1 '01MAR2021'd '11MAR2021'd
;
run;

 

data want;
input id code date1 :date9. date2 :date9. count;
format date date9.;
datalines;
4350 1 '01JUN2021'd '03JUN2021'd 0
4350 1 '02JUl2021'd '14JUl2021'd 1
4350 1 '16JUl2021'd '17JUl2021'd 1
4350 1 '20JUl2021'd '22JUl2021'd 2
4456 3 '20AUG2020'd '21OCT2020'd 0
4456 4 '19MAR2020'd '23MAR2020'd 0
1234 1 '01JAN2021'd '02JAN2021'd 0
1234 1 '10JAN2021'd '21JAN2021'd 1
1234 1 '24FEB2021'd '25FEB2021'd 0
1234 1 '01MAR2021'd '11MAR2021'd 1
;
run;

For every line, I want to see how many previous lines date2 value is within 30 days of the current lines date1 for the same ID and code. I can't figure out how to alter the previous solution to account for comparing between two different date fields now.

 

Edit: Below is the edited code I've been unable to get to work.

data want_test;
set have;
by id notsorted code;
retain start_date;
if not first.code then l_date = lag(date2);
if first.code then start_date = date1;
if not first.code and date1 - l_date le 30
then do;
  test_date=date1 - l_date;
  if date1 - l_date le 30
  then count + 1;
  else do;
    count = 1;
    start_date = date1;
  end;
end;
else count = 0;
format l_date date9. start_date date9.;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi again @A_SAS_Man  Sorry I failed to notice your response to @PaigeMiller  explaining the logic. I read it just now. Right, I think I should be close this time around? Please test-


data have;
input id code date1 :date9. date2 :date9.;
format date1 date9. date2 date9.;
datalines;
4350 1 '01JUN2021'd '03JUN2021'd
4350 1 '02JUl2021'd '14JUl2021'd
4350 1 '16JUl2021'd '17JUl2021'd
4350 1 '20JUl2021'd '22JUl2021'd
4456 3 '20AUG2020'd '21OCT2020'd
4456 4 '19MAR2020'd '23MAR2020'd
1234 1 '01JAN2021'd '02JAN2021'd
1234 1 '10JAN2021'd '21JAN2021'd
1234 1 '24FEB2021'd '25FEB2021'd
1234 1 '01MAR2021'd '11MAR2021'd
;
run;

data want ;
  do _n_=1 by 1 until(last.code) ;
    set have ;
	by id code notsorted ;
	array t(999) _temporary_;
    if _n_>1 then  do __n=1 to _n_-1 ;
	 if intck('day',t(__n),date1)<30 then count+1 ;
	end ;
	output ;
	count=0 ;
    t(_n_)=date2 ;
  end ;
  call missing(of t(*)) ;
  drop _: ;
run ;

proc print noobs ; run ;
id code date1 date2 count
4350 1 01JUN2021 03JUN2021 0
4350 1 02JUL2021 14JUL2021 1
4350 1 16JUL2021 17JUL2021 1
4350 1 20JUL2021 22JUL2021 2
4456 3 20AUG2020 21OCT2020 0
4456 4 19MAR2020 23MAR2020 0
1234 1 01JAN2021 02JAN2021 0
1234 1 10JAN2021 21JAN2021 1
1234 1 24FEB2021 25FEB2021 0
1234 1 01MAR2021 11MAR2021 1

 

 

 

View solution in original post

10 REPLIES 10
Reeza
Super User
Please include your edited code as well.
A_SAS_Man
Pyrite | Level 9
Thank you, I've included it now.
PaigeMiller
Diamond | Level 26

Could you explain how to go from the input data to the output data?

 

Why is COUNT a 0 on the first line, a 1 on lines 2 and 3, and a two on line 4?

--
Paige Miller
A_SAS_Man
Pyrite | Level 9

It is a count of how many previous lines had a date2 within 30 days of the current lines date1. It is kind of confusing so if I've made an error somewhere let me know and I can correct that.

 

To add a little more detail, it is 0 on the first one because it is the first line with no previous lines. It is 1 on the second because it was within range of the first. The third is 1 because it was within range of the second, but not the first. The fourth was within range of 2 and 3, but not 1.

novinosrin
Tourmaline | Level 20

Hi @A_SAS_Man  Methinks @PaigeMiller 's point is spot on in catching the anomaly in your WANT for the ID= 4350 code=1.  Please review the Count in your WANT for the 1st ID group. Your description of logic makes me lean to the following results if I understand you which I believe what Paige is thinking too- 

 


data have;
input id code date1 :date9. date2 :date9.;
format date1 date9. date2 date9.;
datalines;
4350 1 '01JUN2021'd '03JUN2021'd
4350 1 '02JUl2021'd '14JUl2021'd
4350 1 '16JUl2021'd '17JUl2021'd
4350 1 '20JUl2021'd '22JUl2021'd
4456 3 '20AUG2020'd '21OCT2020'd
4456 4 '19MAR2020'd '23MAR2020'd
1234 1 '01JAN2021'd '02JAN2021'd
1234 1 '10JAN2021'd '21JAN2021'd
1234 1 '24FEB2021'd '25FEB2021'd
1234 1 '01MAR2021'd '11MAR2021'd
;
run;


data want ;
  do until(last.code) ;
    set have ;
	by id code notsorted ;
	if _d then if intck('day', _d, date1)>30 then count=0 ;
	else count+1 ;
	output;
	_d=date2 ;
  end ;
  count=0 ;
  drop _d ;
run ;

proc print noobs ; run ;

Results;

id code date1 date2 count
4350 1 01JUN2021 03JUN2021 0
4350 1 02JUL2021 14JUL2021 1
4350 1 16JUL2021 17JUL2021 2
4350 1 20JUL2021 22JUL2021 3
4456 3 20AUG2020 21OCT2020 0
4456 4 19MAR2020 23MAR2020 0
1234 1 01JAN2021 02JAN2021 0
1234 1 10JAN2021 21JAN2021 1
1234 1 24FEB2021 25FEB2021 0
1234 1 01MAR2021 11MAR2021 1
A_SAS_Man
Pyrite | Level 9

So this isn't quite what I need. The third line should be 1 because July 16 and June 3 are more than 30 days apart and July 16 and July 14 are less than 30 days apart. Similarly, line 4 should be 2 because July 20 is less than 30 days from July 17 and July 14, but more than 30 from June 3.

novinosrin
Tourmaline | Level 20

Hi again @A_SAS_Man  Sorry I failed to notice your response to @PaigeMiller  explaining the logic. I read it just now. Right, I think I should be close this time around? Please test-


data have;
input id code date1 :date9. date2 :date9.;
format date1 date9. date2 date9.;
datalines;
4350 1 '01JUN2021'd '03JUN2021'd
4350 1 '02JUl2021'd '14JUl2021'd
4350 1 '16JUl2021'd '17JUl2021'd
4350 1 '20JUl2021'd '22JUl2021'd
4456 3 '20AUG2020'd '21OCT2020'd
4456 4 '19MAR2020'd '23MAR2020'd
1234 1 '01JAN2021'd '02JAN2021'd
1234 1 '10JAN2021'd '21JAN2021'd
1234 1 '24FEB2021'd '25FEB2021'd
1234 1 '01MAR2021'd '11MAR2021'd
;
run;

data want ;
  do _n_=1 by 1 until(last.code) ;
    set have ;
	by id code notsorted ;
	array t(999) _temporary_;
    if _n_>1 then  do __n=1 to _n_-1 ;
	 if intck('day',t(__n),date1)<30 then count+1 ;
	end ;
	output ;
	count=0 ;
    t(_n_)=date2 ;
  end ;
  call missing(of t(*)) ;
  drop _: ;
run ;

proc print noobs ; run ;
id code date1 date2 count
4350 1 01JUN2021 03JUN2021 0
4350 1 02JUL2021 14JUL2021 1
4350 1 16JUL2021 17JUL2021 1
4350 1 20JUL2021 22JUL2021 2
4456 3 20AUG2020 21OCT2020 0
4456 4 19MAR2020 23MAR2020 0
1234 1 01JAN2021 02JAN2021 0
1234 1 10JAN2021 21JAN2021 1
1234 1 24FEB2021 25FEB2021 0
1234 1 01MAR2021 11MAR2021 1

 

 

 

A_SAS_Man
Pyrite | Level 9
Thanks so much, I'm away from my workstation right now, but will test first thing tomorrow and accept the answer after I confirm.
novinosrin
Tourmaline | Level 20

No biggie, Should you come across a situation where you have more than 999 records for a single Id Code group combination, please make sure you resize the array something bigger like-

array t(999) _temporary_; /*existing 999*/

array t(9999) _temporary_; /*make it larger 9999*/

Test thoroughly and then accept the answer. 

Ksharp
Super User
data have;
input id code date1 :date9. date2 :date9.;
format date1 date9. date2 date9.;
datalines;
4350 1 '01JUN2021'd '03JUN2021'd
4350 1 '02JUl2021'd '14JUl2021'd
4350 1 '16JUl2021'd '17JUl2021'd
4350 1 '20JUl2021'd '22JUl2021'd
4456 3 '20AUG2020'd '21OCT2020'd
4456 4 '19MAR2020'd '23MAR2020'd
1234 1 '01JAN2021'd '02JAN2021'd
1234 1 '10JAN2021'd '21JAN2021'd
1234 1 '24FEB2021'd '25FEB2021'd
1234 1 '01MAR2021'd '11MAR2021'd
;
run;
proc sql;
create table want as
select a.*,(select count(*) from have where id=a.id and code=a.code and 
date1<a.date1 and date2 between a.date1-30 and a.date1) as count
 from have as a;
quit;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2421 views
  • 3 likes
  • 5 in conversation