Alireza_Boloori
Fluorite | Level 6

Hello everyone,

 

I have data like this:

 

x    date
1    01/01/2008
1    01/10/2008
1    02/15/2008
1    03/25/2008
2    01/17/2008
2    04/28/2008
2    05/07/2008

 

and I want to (1) obtain the difference in days between consecutive dates within each value of x, and (2) flag observations that fall less than 30 days after the previous one. In other words, I want this data:

 

x    date          difference    identifier
1    01/01/2008    NA            0
1    01/10/2008    9             1
1    02/15/2008    5             1
1    03/25/2008    40            0
2    01/17/2008    NA            0
2    04/28/2008    63            0
2    05/07/2008    10            1

 

Any idea/help is really appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisBrooks
Ammonite | Level 13

This is easy enough, except that, unless I'm misunderstanding what you want, some of the differences in your output aren't correct.

Here is my code:

 

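/* Read the sample data: comma-delimited, one observation per line */
data have;
	infile datalines dlm=",";
	/* the colon modifier applies the informat to the delimited field */
	input x date :mmddyy10.;
	format date mmddyy10.;
	datalines;
1,01/01/2008
1,01/10/2008
1,02/15/2008
1,03/25/2008
2,01/17/2008
2,04/28/2008
2,05/07/2008
;
run;

/* BY-group processing needs the data sorted by x; sorting by date
   as well keeps the rows in chronological order within each x */
proc sort data=have;
	by x date;
run;

data want(drop=ldate);
	set have;
	by x;
	/* call LAG on every row so its queue always holds the previous date */
	ldate=lag(date);
	if first.x then do;
		/* the first row of a group has no previous date */
		difference=.;
		identifier=0;
	end;
	else do;
		difference=date-ldate;
		if difference<30 then identifier=1;
		else identifier=0;
	end;
run;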

Also, SAS shows a missing numeric value as a period (.) instead of NA, so my output looks like this:

 

Differences.png
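Since the screenshot is not reproduced here, the following is a minimal sketch of an equivalent step that prints the result. It swaps in the DIF function (which returns a value minus its LAG) for the explicit LAG call. The data set name `want_dif` is made up for illustration, and the sketch assumes the same sample data as above.

```sas
/* Sample data, repeated so the sketch runs on its own */
data have;
    infile datalines dlm=",";
    input x date :mmddyy10.;
    format date mmddyy10.;
    datalines;
1,01/01/2008
1,01/10/2008
1,02/15/2008
1,03/25/2008
2,01/17/2008
2,04/28/2008
2,05/07/2008
;
run;

proc sort data=have;
    by x date;
run;

/* want_dif is a hypothetical name, not from the original post */
data want_dif;
    set have;
    by x;
    /* DIF(date) is date - LAG(date); call it on every row so the
       internal queue stays in step with the data */
    difference = dif(date);
    /* the first row of each x group has no previous date */
    if first.x then difference = .;
    /* 1 when a non-missing difference is under 30 days, otherwise 0 */
    identifier = (not missing(difference) and difference < 30);
run;

proc print data=want_dif noobs;
run;
```

Working the dates by hand (2008 is a leap year), the differences should come out as ., 9, 36, 39 for x=1 and ., 102, 9 for x=2, so only the 01/10/2008 and 05/07/2008 rows get identifier=1.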

 

 

 

2 REPLIES 2
Shmuel
Garnet | Level 18
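data have;
       /* colon modifier: read the date from the next non-blank field
          with the informat, rather than from fixed columns */
       input x date :mmddyy10.;
       format date mmddyy10.;
cards;
1    01/01/2008
1    01/10/2008
1    02/15/2008
1    03/25/2008
2    01/17/2008
2    04/28/2008
2    05/07/2008
;
run;

data want(drop=prev_dt);
 set have;
  by x;
       prev_dt = lag(date);
       if first.x then prev_dt = .;   /* no previous date in a new x group */
       if not missing(prev_dt) then difference = date - prev_dt;
       /* a missing difference compares as less than 30, so test for it explicitly */
       identifier = (not missing(difference) and difference < 30);
run;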
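
A note on comparisons such as `difference < 30`: SAS orders a missing numeric value below every number, so the test is true when the difference is missing, which is the case for the first row of each x group. The small sketch below (a throwaway `data _null_` step, not part of either posted solution) writes the result of both forms of the test to the log.

```sas
data _null_;
    difference = .;   /* numeric missing, as on the first row of a group */

    /* plain comparison: missing sorts below 30, so this branch runs */
    if difference < 30 then put "difference < 30 is true for a missing value";

    /* guarded comparison: evaluates to 0 when difference is missing */
    identifier = (not missing(difference) and difference < 30);
    put identifier=;
run;
```

The log should show the message from the first test followed by identifier=0, which is why the first row of a group needs either an explicit first.x branch or a MISSING() guard to get an identifier of 0.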
