2X DWL or DUL dilemma?

Respected Advisor
Posts: 3,124
Dear All,

It just occurred to me that I still have this question about 2x DO loops, regardless of how many times I have used them.

For example:

data have;
   infile cards missover;
   input YEAR patient $ Mortality_in365days $;
cards;
1990     221
1991     221
1991     221
1993     221
1995     221     *
2000     789
2001     789
2002     789     *
2001     965
2005     965     *
;

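data want;
   /* Pass 1: read one patient's rows; the year on the last row is the death year */
   do until (last.patient);
      set have;
      by patient year;
      Death_year=ifn(last.patient,year,death_year);
      retain death_year;
   end;
   /* Pass 2: re-read the same patient's rows through a second SET statement */
   do until (last.patient);
      set have;
      by patient year;
      Years_Before_death=death_year-year;
      output;
   end;
run;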

My question is: by the time the first loop reaches the end of 'have', the DATA step should stop, so the final output should be missing the last group of data; that is, patient=965 should not appear in the final output. Instead, the DATA step keeps moving forward and finishes only after the second SET reaches its end. I am really glad that the DATA step acts smart on this one, so the 2xDUL gets to keep its charm. But why?

Thanks in advance for all of your inputs and Answers!

Haikuo


Accepted Solutions
Solution
‎02-07-2012 10:58 AM
PROC Star
Posts: 7,363


I can't write a full paper here to explain the concept but, fortunately, Paul Dorfman and Koen Vyverman already have. Take a good look at http://support.sas.com/resources/papers/proceedings09/038-2009.pdf .

I think it will provide all of the explanations you are seeking.



All Replies

Respected Advisor
Posts: 3,124


Wow, it is a GREAT paper! I probably need more time to dwell on it, but Thanks, Art!

Super Contributor
Posts: 1,636


Hi Hai.kuo,

I looked at the paper Art recommended. It is a very good paper. I made some changes to one of your posts for practice.  Thank you!

data have;
   input naics4 $ taxable1-taxable5;
cards;
1 20 30 40 50 60
1 25 35 45 55 65
1 30 40 50 60 70
2 20 30 40 50 60
2 25 35 45 55 65
3 30 40 50 60 70
;
run;

data want (drop=tax: );
   /* One DATA step iteration per naics4 group; sum_tax1-sum_tax5 start out missing each time */
   do until (last.naics4);
      set have;
      by naics4;
      array tax(*) taxable1-taxable5;
      array st(*) sum_tax1-sum_tax5;
      do _n_=1 to dim(tax);
         st(_n_)=sum(st(_n_),tax(_n_));
      end;
   end;
   /* The implicit OUTPUT at the end of the step writes one row of totals per group */
run;

proc print;run;

Trusted Advisor
Posts: 2,113


Haikuo,

The problem is not with the DO loop.  It has to do with the behavior of the SET statement. 

"What SET Does

Each time the SET statement is executed, SAS reads one observation into the program data vector. SET reads all variables and all observations from the input data sets unless you tell SAS to do otherwise. A SET statement can contain multiple data sets; a DATA step can contain multiple SET statements. See Combining and Modifying SAS Data Sets: Examples."

By design, each SET statement reads the data set independently, so the second SET statement starts at the beginning of the data set rather than where the first one left off. There are examples in the documentation that cover the behavior that you observed.

Doc Muhlbaier

Duke
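
The point about SET can be traced in the log with a small sketch. The data set demo and its variables id and val are made up for illustration and are not from the thread. The sketch assumes that each SET statement keeps its own position in the input, and that the DATA step stops only when a SET statement tries to read past the last observation.

```sas
/* Hypothetical three-row data set with two BY groups (A and B) */
data demo;
   input id $ val;
cards;
A 1
A 2
B 3
;

data _null_;
   /* First SET statement: has its own read position in demo */
   do until (last.id);
      set demo;
      by id;
      put 'pass 1: ' _n_= id= val=;
   end;
   /* Second SET statement: a separate read position, so it reads
      the same BY group again from its first row */
   do until (last.id);
      set demo;
      by id;
      put 'pass 2: ' _n_= id= val=;
   end;
run;
```

If those assumptions hold, the log should show group A in pass 1 and then in pass 2 during the first iteration of the step, followed by group B in both passes during the second iteration. When the first loop reads the last row of B, it has not yet tried to read beyond the end of the data, so the second loop still runs. The step ends on the third iteration, when the first SET statement finds nothing left to read.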
