BookmarkSubscribeRSS Feed
appleorange
Obsidian | Level 7

I have a dataset with data that's like the following.  Basically flag1 just keeps track if date1 was before an anchor date (begin) or date2 was before another anchor (end) date.  The reason that it can get spread out over rows is because the table I am comparing date1 and date2 to have date ranges that are sporadic.

    ID  Begin     End         date1       date2       flag1
    A   5/1/2007  11/30/2011  6/14/2011   5/14/2012   10
    A   12/1/2011 12/31/2011  6/14/2011   5/14/2012   00
    A   1/1/2012  9/30/2012   6/14/2011   5/14/2012   01
    B                                                 00
    C                                                 10
    C                                                 00
    E                                                 00
    E                                                 01
    F                                                 11

I didn't want to write out all the dates, the first one is just an example showing how flag1 works.  But basically there could be 5 situations:

1. Both date1 and date2 are within Begin and End (11).
2. Neither date1 and date2 are within Begin and End (00).
3. Date1 is within Begin, but Date2 isn't until x rows later.
4. Date1 is within Begin, but Date2 never will be x rows later.
5. Date1 will never be within Begin, but Date2 will be x rows later.

Those first two are easy to deal with, but a situation like A, there could be several rows of 00 between the first.ID and last.ID.  Similarly for 4. and 5., there could be several rows before the criteria is met.

The first step would be to only keep 1. and 3. only where both dates are within Begin/End on the same row or different rows.  If they are on different rows, I need to find the difference between the previous End date and the next Begin for as many rows necessary and sum the days (if applicable).  So for my example

    A:
   
    11/30/2011 - 12/1/2011
    12/31/2011 - 1/1/2012

I get that in this example there are no gaps, but there could be for other dates in my dataset.  I have an inkling using a do loop of some sort along with first.recip_id or last.recip_id would work but I'm at a loss how to do all of this efficienctly.

5 REPLIES 5
Reeza
Super User

A few things:

1. I feel like I've seen this question before, so either you've reposted or its been on here recently.

2. You need to expand your example to clarify your rules - more specifically a HAVE and WANT dataset are great.

3. It seems like data structure might be a bit of a problem here, is this your original data structure or did you restructure it this way via a merge or join?

appleorange
Obsidian | Level 7

Hi Reeza,

I haven't reposted this question on here but I did ask another one about first.variable and last.variable.  I managed to cobble together a "method" but it requires a lot of datasteps - I don't have the code with me at the moment but I will edit my question to be clearer.

To answer the data structure question, it's kind of complicated but originally I had a dataset like this:

ID   Anchor_Date

A    1/1/2011

B    1/2/2011

C    1/3/2011

There is a specified date range I have to compare the Anchor_Date to (example 30 days before Anchor_Date and 90 days after Anchor_Date).  The dataset that contains this information has sporadic eligibility dates like this:

ID   Begin       End

A    1/1/2009  1/1/2010

A    1/1/2011   1/1/2012

There is no rule for how the Begin and End dates are organized.  It's the way the data is compiled by something I can't control and the information could be spread out over several rows.  So the only solution was to establish a date1 (Anchor_date - 30 days) and date2 (Anchor_date + 90 days) then check it against each row of Begin/End date.  So I did merge all of this together, and I agree the data structure is iffy.

Ksharp
Super User

You can change the table

ID   Begin       End

A    1/1/2009  1/1/2010

A    1/1/2011   1/1/2012

as

ID Anchor_Date

A 1/1/2009 

A 1/2/2009 

A 1/3/2009 

A 1/4/2009 

..........

Then use Hash Table or proc format to check if it is falling the range you want .

appleorange
Obsidian | Level 7

Hi,

Thanks for the help...but how do I change the table from a range horizontally to vertically?  I'm assuming that it would go through each day between 1/1/2009 until 1/1/2010 as a whole column?

Ksharp
Super User

Yes. You are right. That will expand your small into a big table ..

"change the table from a range horizontally to vertically"

do Anchor_Date=Begin    to   End ;

  output;

end;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1689 views
  • 0 likes
  • 3 in conversation