BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
shounster
Obsidian | Level 7

I have a bit of a head scratcher where I think I know what to use, but I'm not seeing the results I expect.  Here's what the data looks like:

 

Provider     Key     From_Date     Thru_Date     Origin

1234          585      20150213       20150227      A

1234          585      20150313       20150327      A

1234          585      20150831       20150905      A

1235          585      20150905       20150930      B

1235          585      20150905       20151005      B

2234          979      20150916       20150921      A

2234          979      20150917       20150921      A

2235          979      20150922       20151007      B

 

Generally, the Origin=B row occurs when Origin=A ends.  Hence, the same date.  Maybe that's what I need to be matching...

 

What I am trying to do is to start with an Origin of 'B', go back one line and return the Thru_Date from the Origin 'A'.  I have tried FIRST.ORIGIN and I get sorting errors (It needs to be sorted by Key and From_Date). I've tried IF Origin = 'B' then initial_thru_date=lag(Thru_Date), then I only get the lag to occur when Origin='B'.  And I understand why that is happening, but not sure how to start with Origin='B' and then have SAS run a process that doesn't just find the rows where Origin='B'.  This is what I would like the result to resemble:

 

Provider     Key     From_Date     Thru_Date     Origin     Initial_Thru_Date

1234          585      20150213       20150227      A

1234          585      20150313       20150327      A

1234          585      20150831       20150905      A

1235          585      20150905       20150930      B            20150905

1235          585      20150905       20151005      B

2234          979      20150916       20150921      A

2234          979      20150917       20150921      A

2235          979      20150922       20151007      B            20150921

 

I'm struggling to determine if there is a mix of FIRST. or LAG, or something else I can utilize.

 

Any assistance would be greatly appreciated. Even if it is the correct syntax for a Google search. 😉

 

As always, thank you.

 

Brian

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Well, you've identified the issue with LAG.  That's half the battle.  The next question is what can you do about it?  In general, the solution is to use LAG on every observation.  Then wipe out the value assigned to your new variable, if you don't want it.  For example:

 

data want;

set have;

by origin notsorted;

initial_thru_date = lag(thru_date);

if (origin ne 'B') or (first.origin=0) then initial_thru_date=.;

run;

 

There can be complications if you need to begin the logic over again when KEY changes.  But even in that case this program will hold up as long as each KEY always begins with an "A" value for ORIGIN.

View solution in original post

3 REPLIES 3
Astounding
PROC Star

Well, you've identified the issue with LAG.  That's half the battle.  The next question is what can you do about it?  In general, the solution is to use LAG on every observation.  Then wipe out the value assigned to your new variable, if you don't want it.  For example:

 

data want;

set have;

by origin notsorted;

initial_thru_date = lag(thru_date);

if (origin ne 'B') or (first.origin=0) then initial_thru_date=.;

run;

 

There can be complications if you need to begin the logic over again when KEY changes.  But even in that case this program will hold up as long as each KEY always begins with an "A" value for ORIGIN.

ballardw
Super User

If it is acceptable to assume that your data is sorted by Key and origin

 

data have;
   input Provider     Key     From_Date     Thru_Date     Origin $ ;
datalines;
1234          585      20150213       20150227      A
1234          585      20150313       20150327      A
1234          585      20150831       20150905      A
1235          585      20150905       20150930      B
1235          585      20150905       20151005      B
2234          979      20150916       20150921      A
2234          979      20150917       20150921      A
2235          979      20150922       20151007      B
;
run;

data want;
   set have;
   by key origin;
   ldate = lag(Thru_date);
   if first.origin and origin='B' then initial_thru_date=ldate;
   drop ldate;
run;

You could use

 

By origin notsorted;

for your example data as well but it is not clear if the KEY value is important or not for deciding "first" status.

shounster
Obsidian | Level 7

I wanted to accept both of these as a solution, so thank you both SO much for taking the time look over my query. I was all around it in my attempts, I just didn't think to put all of the attempts together. lol

 

Once again, thank you both!

 

Brian

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 961 views
  • 2 likes
  • 3 in conversation