BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rlafond
Obsidian | Level 7

I'm getting a single instance of duplication in a DATA step that I cannot figure out.

 

Input data:

PatIDFromDateThruDateProvID
10003/3/20173/3/2017222
10003/6/20173/6/2017222
10003/7/20173/7/2017222
10003/8/20173/8/2017222
10003/9/20173/9/2017222
10003/10/20173/10/2017222
10005/8/20175/8/2017222
10005/9/20175/9/2017222
10005/10/20175/10/2017222
10005/11/20175/11/2017222
10005/12/20175/12/2017222
10005/15/20175/15/2017222
10006/27/20176/27/2017999

 

Code:

DATA STEP2;

KEEP PATID FromDate_NEW ThruDate ProvID;

RENAME FromDate_NEW = FromDate;

SET STEP1;

BY PATID ProvID;

MERGE STEP1

STEP1 (FIRSTOBS=2 KEEP=FromDate RENAME=(FromDate=_NEXT_FROM));

RETAIN FromDate_NEW ThruDate_LAST;

FORMAT FromDate_NEW MMDDYY10.

ThruDate_LAST MMDDYY10.;

IF FIRST.PATID or first.ProvID THEN DO;

ThruDate_LAST = .;

 

FromDate_NEW=FromDate;

END;

ELSE DO;

IF FromDate - ThruDate_LAST > 2 THEN FromDate_NEW = FromDate;

END;

 

ThruDate_LAST=ThruDate;

IF LAST.PATID or last.ProvID THEN OUTPUT;

if LAST.PATID=0 AND _next_from - ThruDate > 2 THEN OUTPUT;

RUN;

 

Output:

 

PatIDThruDateProvIDFromDate
10003/3/20172223/3/2017
10003/10/20172223/6/2017
10005/12/20172225/8/2017
10005/15/20172225/15/2017
10005/15/20172225/15/2017
10006/27/2017999

6/27/2017

 

Desired output:

PatIDThruDateProvIDFromDate
10003/3/20172223/3/2017
10003/10/20172223/6/2017
10005/12/20172225/8/2017
10005/15/20172225/15/2017
10006/27/20179996/27/2017

 

Any idea why I'm getting duplication when FromDate = 5/15/17 and ProvID = 222?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

I can tell you the cause, but only you can figure out the solution.  You're the one who knows what the program is supposed to do.

 

You have two OUTPUT statements.  and you have an observation (the one that repeats an extra time) that meets the IF conditions for both OUTPUT statements.  So it gets output twice.  It's up to you to change the logic about when to output.  It might be as simple as adding the word ELSE after the first OUTPUT statement, or it might be more complex than that.

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

I guess because you have a set and a merge statement. Since your code is extremely hard to read because of all the shouting and no visual formatting, it's just a guess.

Astounding
PROC Star

I can tell you the cause, but only you can figure out the solution.  You're the one who knows what the program is supposed to do.

 

You have two OUTPUT statements.  and you have an observation (the one that repeats an extra time) that meets the IF conditions for both OUTPUT statements.  So it gets output twice.  It's up to you to change the logic about when to output.  It might be as simple as adding the word ELSE after the first OUTPUT statement, or it might be more complex than that.

rlafond
Obsidian | Level 7
This was helpful, thanks. I thought the vector cleared and the code started over after the first output. ELSE solved it.
ballardw
Super User

@rlafond wrote:
This was helpful, thanks. I thought the vector cleared and the code started over after the first output. ELSE solved it.

OUTPUT is executable. Which means each time the instruction is encountered it executes.

 

This is intentional behavior as you can specify a specific output put data set as an argument to the statement (assuming the data set name appears on the DATA statement) to direct observations as needed.

 

Consider as an example:

data work.boys work.girls work.older;
   set sashelp.class;
   if sex='F' then output work.girls;
   if sex='M' then output work.boys;
   if age ge 14 then output work.older;
run;

I will not say that splitting the data is the best option but it is one.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 730 views
  • 1 like
  • 4 in conversation