DATA Step, Macro, Functions and more

Retrieving data from a prior line

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 8
Accepted Solution

Retrieving data from a prior line

I have a bit of a head scratcher where I think I know what to use, but I'm not seeing the results I expect.  Here's what the data looks like:

 

Provider     Key     From_Date     Thru_Date     Origin

1234          585      20150213       20150227      A

1234          585      20150313       20150327      A

1234          585      20150831       20150905      A

1235          585      20150905       20150930      B

1235          585      20150905       20151005      B

2234          979      20150916       20150921      A

2234          979      20150917       20150921      A

2235          979      20150922       20151007      B

 

Generally, the Origin=B row occurs when Origin=A ends.  Hence, the same date.  Maybe that's what I need to be matching...

 

What I am trying to do is to start with an Origin of 'B', go back one line and return the Thru_Date from the Origin 'A'.  I have tried FIRST.ORIGIN and I get sorting errors (It needs to be sorted by Key and From_Date). I've tried IF Origin = 'B' then initial_thru_date=lag(Thru_Date), then I only get the lag to occur when Origin='B'.  And I understand why that is happening, but not sure how to start with Origin='B' and then have SAS run a process that doesn't just find the rows where Origin='B'.  This is what I would like the result to resemble:

 

Provider     Key     From_Date     Thru_Date     Origin     Initial_Thru_Date

1234          585      20150213       20150227      A

1234          585      20150313       20150327      A

1234          585      20150831       20150905      A

1235          585      20150905       20150930      B            20150905

1235          585      20150905       20151005      B

2234          979      20150916       20150921      A

2234          979      20150917       20150921      A

2235          979      20150922       20151007      B            20150921

 

I'm struggling to determine if there is a mix of FIRST. or LAG, or something else I can utilize.

 

Any assistance would be greatly appreciated. Even if it is the correct syntax for a Google search. Smiley Wink

 

As always, thank you.

 

Brian


Accepted Solutions
Solution
‎09-28-2016 02:18 PM
Super User
Posts: 5,071

Re: Retrieving data from a prior line

Well, you've identified the issue with LAG.  That's half the battle.  The next question is what can you do about it?  In general, the solution is to use LAG on every observation.  Then wipe out the value assigned to your new variable, if you don't want it.  For example:

 

data want;

set have;

by origin notsorted;

initial_thru_date = lag(thru_date);

if (origin ne 'B') or (first.origin=0) then initial_thru_date=.;

run;

 

There can be complications if you need to begin the logic over again when KEY changes.  But even in that case this program will hold up as long as each KEY always begins with an "A" value for ORIGIN.

View solution in original post


All Replies
Solution
‎09-28-2016 02:18 PM
Super User
Posts: 5,071

Re: Retrieving data from a prior line

Well, you've identified the issue with LAG.  That's half the battle.  The next question is what can you do about it?  In general, the solution is to use LAG on every observation.  Then wipe out the value assigned to your new variable, if you don't want it.  For example:

 

data want;

set have;

by origin notsorted;

initial_thru_date = lag(thru_date);

if (origin ne 'B') or (first.origin=0) then initial_thru_date=.;

run;

 

There can be complications if you need to begin the logic over again when KEY changes.  But even in that case this program will hold up as long as each KEY always begins with an "A" value for ORIGIN.

Super User
Posts: 10,466

Re: Retrieving data from a prior line

If it is acceptable to assume that your data is sorted by Key and origin

 

data have;
   input Provider     Key     From_Date     Thru_Date     Origin $ ;
datalines;
1234          585      20150213       20150227      A
1234          585      20150313       20150327      A
1234          585      20150831       20150905      A
1235          585      20150905       20150930      B
1235          585      20150905       20151005      B
2234          979      20150916       20150921      A
2234          979      20150917       20150921      A
2235          979      20150922       20151007      B
;
run;

data want;
   set have;
   by key origin;
   ldate = lag(Thru_date);
   if first.origin and origin='B' then initial_thru_date=ldate;
   drop ldate;
run;

You could use

 

By origin notsorted;

for your example data as well but it is not clear if the KEY value is important or not for deciding "first" status.

Occasional Contributor
Posts: 8

Re: Retrieving data from a prior line

I wanted to accept both of these as a solution, so thank you both SO much for taking the time look over my query. I was all around it in my attempts, I just didn't think to put all of the attempts together. lol

 

Once again, thank you both!

 

Brian

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 263 views
  • 2 likes
  • 3 in conversation