DOW LOOP


Hi Team,

I was reading a paper on DOW loops:

http://analytics.ncsu.edu/sesug/2010/BB13.Dorfman.pdf

The example is from page 2 of this paper.

"If VAR is missing CONTINUE passes the control straight to the bottom of the loop".

What does "the bottom of the loop" mean? Does it mean that Mcount, Prod, and Sum are not calculated for that record, and that execution moves on to the next record while still staying within the loop?

Secondly,

"PROD and COUNT are set to 1, and the non-retained SUM, MEAN, and MCOUNT are set to missing by the default action of the implied loop (program control at the top of the implied loop)."

This sentence says that SUM, MEAN, and MCOUNT are set to missing. But which lines in the code set these values to missing?

Any help, ideally with an example, would be greatly appreciated.

Thanks

Data B ( Keep = Id Prod Sum Count Mean) ;
  Prod = 1 ;
  Do Count = 1 By 1 Until ( Last.Id ) ;
    Set A ;
    By Id ;
    If Missing (Var) Then Continue ;
    Mcount = Sum (Mcount, 1) ;
    Prod = Prod * Var ;
    Sum = Sum (Sum, Var) ;
  End ;
  Mean = Sum / Mcount ;
Run ;

Message was edited by: Karun Diri

Also, I feel the KEEP= option would have been better placed on the SET statement rather than on Data B. Any suggestions?


Accepted Solutions
Solution
‎10-07-2012 12:32 PM

Re: DOW LOOP

Posted in reply to robertrao

Hi, Karun

You're doing a great job of reading through papers and picking up SAS.

Here's a couple of methods that I like to use when I'm trying to figure out code. Try it yourself, and see what you think.

First, from looking at the code, it looks like there's a dataset a that contains a variable Id that is some kind of "key", and a variable Var that is continuous, and it gets treated differently if it's missing. So I created the following SAS program:

data a(drop=i);
  do i = 1 to 50;
    Id = floor(rand('uniform')*10);
    Var = rand('uniform') * 100;

    if rand('uniform') < .1 then
      call missing(Var);
    output;
  end;
run;

proc sort data=a;
  by Id;
run;

You should be familiar with most of it. As for the unusual bits, the do...end will loop 50 times, and the output statement before the end writes a record on each pass, so it'll write 50 records to SAS dataset a.

The rand('uniform') function will generate a random number uniformly distributed between 0 and 1. Multiplying it by 10 will uniformly distribute it between 0 and 10 (actually 9.9...), and the floor function will remove the decimal places, so Id will be a random integer between 0 and 9.

Same idea for Var, but we'll make it between 0 and 100, and we won't remove the decimal places.

The if statement will set Var to missing about 10% of the time.

And because there's a "by" statement in the code to be examined, we need to sort our data by the variable in the "by" statement.
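If you want to check that description against actual data, here is a minimal sketch. The data set name a_check and the seed passed to call streaminit are my own choices for illustration and are not part of the program above; the seed is only there so that repeated runs produce the same random values.

```sas
/* Sketch only: a_check and the seed 12345 are hypothetical choices. */
data a_check(drop=i);
  call streaminit(12345);          /* fix the seed so runs are repeatable */
  do i = 1 to 50;
    Id  = floor(rand('uniform') * 10);
    Var = rand('uniform') * 100;
    if rand('uniform') < .1 then
      call missing(Var);
    output;
  end;
run;

/* Id should only take integer values from 0 through 9 */
proc freq data=a_check;
  tables Id;
run;

/* NMISS shows how many Var values were set to missing;
   MIN and MAX should stay between 0 and 100 */
proc means data=a_check n nmiss min max;
  var Var;
run;
```

With only 50 records the share of missing values will bounce around, so don't expect exactly 5 of them.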

Here's the code under examination, with the changes I made:

Data B ( Keep = Id Prod Sum Count Mean) ;
  putlog 'After Data statement ' _all_;
  Prod = 1 ;
  Do Count = 1 By 1 Until ( Last.Id ) ;
    putlog 'After Do statement ' _all_;
    Set A ;
    By Id ;
    putlog 'After Set statement ' _all_;
    If Missing (Var) Then Continue ;
    Mcount = Sum (Mcount, 1) ;
    Prod = Prod * Var ;
    Sum = Sum (Sum, Var) ;
    putlog 'Before End statement ' _all_;
  End ;
  Mean = Sum / Mcount ;
  putlog 'Before Run statement ' _all_;
Run ;

The only thing I did was to add five "putlog" statements. Each one prints a label saying where in the code we are, and then the _all_ keyword prints the values of all of the variables in the program, including automatic ones such as _N_, First.Id and Last.Id. As a result, you get a log listing that pretty much lets you track the logic of the program.
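If the log from 50 random records is too much to follow, a small hand-made data set can make the trace easier to read. This is only a sketch: the names a_small and b_small and the four input records are made up for illustration, and the single putlog at the end is a cut-down version of the tracing idea above.

```sas
/* Hypothetical four-record input: Id 1 has one missing Var */
data a_small;
  input Id Var;
  datalines;
1 2
1 .
1 5
2 3
;
run;

data b_small (keep = Id Prod Sum Count Mean);
  Prod = 1;
  do Count = 1 by 1 until (last.Id);
    set a_small;
    by Id;
    /* Missing Var: skip the three assignments, jump to END */
    if missing(Var) then continue;
    Mcount = sum(Mcount, 1);
    Prod = Prod * Var;
    Sum = sum(Sum, Var);
  end;
  Mean = Sum / Mcount;
  /* Expected for Id=1: Count=3 Mcount=2 Prod=10 Sum=7 Mean=3.5 */
  /* Expected for Id=2: Count=1 Mcount=1 Prod=3  Sum=3 Mean=3   */
  putlog 'End of group: ' Id= Count= Mcount= Prod= Sum= Mean=;
run;
```

For Id 1, Count reaches 3 because the record with the missing Var still passes through the bottom of the loop, while Mcount only reaches 2 because CONTINUE skipped the statement that increments it.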

Play with it, and see what you think.

Tom



All Replies

Re: DOW LOOP

Posted in reply to robertrao

1) CONTINUE transfers control to the bottom of the loop, that is, to the END statement. Consequently, the three assignment statements (Mcount, Prod, and Sum) are not executed when VAR is missing, and the loop carries on with its next iteration, which reads the next observation. Note that since you are using a DOW loop, the implied DATA step loop iterates only once for each unique ID.

2) During the execution phase, non-retained variables are set to missing at the top of each iteration of the implied DATA step loop; no explicit statement in the code does this. In this case, these are the variables that do not come from the incoming data set (A), since variables read by SET are automatically retained.

3) The KEEP= data set option applies to the data set it is attached to. To control incoming variables, use it on the data set in the SET statement. When used with the data set(s) on the DATA statement, KEEP= applies to the new data set.
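Points 2 and 3 can be seen in a small sketch. The data set demo and the variables NotRetained and Kept are hypothetical names invented for this illustration; the putlog sits at the top of the step so that it shows the values as they stand at the start of each iteration of the implied loop.

```sas
/* Hypothetical three-record input, with an extra variable to filter out */
data demo;
  input Id Var Extra;
  datalines;
1 10 100
1 20 200
2 30 300
;
run;

data _null_;
  /* Top of the implied loop: NotRetained has been reset to missing,
     Kept still holds its previous value, and Var (read by SET)
     still holds the value from the previous observation */
  putlog 'Top of iteration: ' _n_= Var= NotRetained= Kept=;

  /* KEEP= on the SET data set limits what is read in; Extra is not read */
  set demo (keep = Id Var);

  retain Kept;        /* explicitly retained */
  NotRetained = Var;  /* plain assignment, so not retained */
  Kept = Var;
run;
```

On every line after the first, NotRetained should show as missing while Kept and Var carry the values from the previous observation. Nothing in the code assigns a missing value; the reset happens as part of the implied loop.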
