BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rakeshvvv
Quartz | Level 8

I have dataset with three varaibles id startdate enddate. Dates are numerical sas formats.

I would like to pull out the records for id where the difference between the two succesive START dates is more then 28 days. There will be few missing dates too.

example:

ID    STARTDATE ENDDATE

001   19583      19604

001   19589      19609

001   19600      19610

001   19628      19638

001   19480      19520

002     .           .

002   19620      19624

002   19630      19634

1 ACCEPTED SOLUTION

Accepted Solutions
pradeepalankar
Obsidian | Level 7

change if condition to

if id=lag(id) and startdate-lag(startdate)>=28;

View solution in original post

11 REPLIES 11
pradeepalankar
Obsidian | Level 7

Hi Try this,

data test;

input ID $3.   STARTDATE ENDDATE;

if startdate-lag(startdate)>=28;

cards;

001   19583      19604

001   19589      19609

001   19600      19610

001   19628      19638

001   19480      19520

002     .           .

002   19620      19624

002   19630      19634

;

run;

output:

001   19628      19638

->use sum function if you want dates after missing dates to be displayed

if sum(startdate,-lag(startdate))>=28;

rakeshvvv
Quartz | Level 8

Thanks for your reply. It answered my query partially but I would like to calculate diff between dates for ID Specific. The logic you gave is working but it working across all the id's and it outputting even when the difference between two dates is greater then 28 for two different id.

PaigeMiller
Diamond | Level 26

So, using the LAG function, you test to see if the ID of the current observation matches the ID of the previous observation, and also using the LAG function test to see if the dates are greater than 28.

--
Paige Miller
pradeepalankar
Obsidian | Level 7

change if condition to

if id=lag(id) and startdate-lag(startdate)>=28;

rakeshvvv
Quartz | Level 8

Thanks....its working.....is there any other option....we can use like(Group by ID as we do SQL and use lag function ) ......my query is more about finding the discrepancies of id's with dates greater then 28.....any query with group by id and using the lag function would be prefect fit for me......

PaigeMiller
Diamond | Level 26

rakeshvvv wrote:

Thanks....its working.....is there any other option....we can use like(Group by ID as we do SQL and use lag function ) ......my query is more about finding the discrepancies of id's with dates greater then 28.....any query with group by id and using the lag function would be prefect fit for me......

I don't understand what you are asking for ... or how it is different than what we have already explained ... an example would definitely help. It also sounds like (although you don't specifically state this) that you are asking for a PROC SQL solution. Are you asking for a PROC SQL solution? If so, why does it have to be PROC SQL when a data step works perfectly well?

--
Paige Miller
rakeshvvv
Quartz | Level 8

Hi,

I have data set with  ID and with multiple VISITDATES......my query should output the records where difference between two successive visit dates is greater then 28 within a ID. hope it helps......

PaigeMiller
Diamond | Level 26

I have data set with  ID and with multiple VISITDATES......my query should output the records where difference between two successive visit dates is greater then 28 within a ID. hope it helps......

But that's what the code above has provided

--
Paige Miller
Reeza
Super User

This is similar to a 30 day readmission problem in medical studies. Search on here and you'll find SQL solutions to that effect.

It's a common request/issue.

Astounding
PROC Star

If you are saying that you only want one record per ID, just to identify the offending IDs, this would be a way.  It assumes your SAS data set is sorted by ID STARTDATE:

data want;

   do until (last.id);

      prior_start = startdate;

      set have;

      by id;

      if first.id=0 and startdate - prior_start >= 28 then wanted='Y';

   end;

   if wanted='Y';

   keep id;

run;

SQL would likely not be an option to accomplish this, since it doesn't guarantee the order of the incoming records.

pradeepalankar
Obsidian | Level 7

how you want your output data to look like...

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1611 views
  • 0 likes
  • 5 in conversation