robertrao
Quartz | Level 8


Hi Team,

I was reading a paper on DOW loops.

http://analytics.ncsu.edu/sesug/2010/BB13.Dorfman.pdf

The example is from page 2 of this paper.

"If VAR is missing CONTINUE passes the control straight to the bottom of the loop".

What does "the bottom of the loop" mean? Does it mean that Mcount, Prod, and Sum are not calculated for that observation, and that processing goes on to the next observation while still staying within the loop?

Secondly,

"PROD and COUNT are set to 1, and the non-retained SUM, MEAN, and MCOUNT are set to missing by the default action of the implied loop (program control at the top of the implied loop)."

This sentence says that SUM, MEAN, and MCOUNT are set to missing. But which lines in the code set these values to missing?

Any help is greatly appreciated, and an example would be great.

Thanks

Data B ( Keep = Id Prod Sum Count Mean ) ;
   Prod = 1 ;
   Do Count = 1 By 1 Until ( Last.Id ) ;
      Set A ;
      By Id ;
      If Missing (Var) Then Continue ;
      Mcount = Sum (Mcount, 1) ;
      Prod = Prod * Var ;
      Sum = Sum (Sum, Var) ;
   End ;
   Mean = Sum / Mcount ;
Run ;

Message was edited by: Karun Diri

Also, I feel the KEEP= would have been better on the SET statement rather than on Data B. Any suggestions?

Accepted Solutions
TomKari
Onyx | Level 15

Hi, Karun

You're doing a great job of reading through papers and picking up SAS.

Here are a couple of methods that I like to use when I'm trying to figure out code. Try it yourself, and see what you think.

First, from looking at the code, it looks like there's a dataset a that contains a variable Id that is some kind of "key", and a variable Var that is continuous, and it gets treated differently if it's missing. So I created the following SAS program:

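data a(drop=i);
  do i = 1 to 50;
    Id = floor(rand('uniform')*10);   /* random integer key, 0 to 9 */
    Var = rand('uniform') * 100;      /* continuous value, 0 to 100 */
    if rand('uniform') < .1 then
      call missing(Var);              /* blank out roughly 10% of values */
    output;                           /* write one record per iteration */
  end;
run;

proc sort data=a;
  by Id;
run;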

You should be familiar with most of it. In terms of the unusual bits, the do...end will loop 50 times, and the output before the end writes a record, so it'll write 50 records to SAS dataset a.

The rand('uniform') function will generate a random number uniformly distributed between 0 and 1. Multiplying it by 10 will uniformly distribute it between 0 and 10 (actually 9.9...), and the floor function will remove the decimal places, so Id will be a random integer between 0 and 9.

Same idea for Var, but we'll make it between 0 and 100, and we won't remove the decimal places.

The if statement will set Var to missing about 10% of the time.

And because there's a "by" statement in the code to be examined, we need to sort our data by the variable in the "by" statement.

Here's the code under examination, with the changes I made:

Data B ( Keep = Id Prod Sum Count Mean ) ;
   putlog 'After Data statement ' _all_;
   Prod = 1 ;
   Do Count = 1 By 1 Until ( Last.Id ) ;
      putlog 'After Do statement ' _all_;
      Set A ;
      By Id ;
      putlog 'After Set statement ' _all_;
      If Missing (Var) Then Continue ;
      Mcount = Sum (Mcount, 1) ;
      Prod = Prod * Var ;
      Sum = Sum (Sum, Var) ;
      putlog 'Before End statement ' _all_;
   End ;
   Mean = Sum / Mcount ;
   putlog 'Before Run statement ' _all_;
Run ;

The only thing I did was to add five "putlog" statements in. These will print out the comment of where in the code we are, and then the _all_ clause prints out the values of all of the variables in the program. As a result, you get a log listing that pretty much lets you track the logic of the program.
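If the random data makes the log hard to follow, the same tracing idea can be tried on a tiny hand-made input. What follows is only a sketch: the five rows of data set a are made up for illustration, and the two putlog lines are a cut-down version of the ones above, printing just a few variables at the top and at the bottom of each pass through the DATA step.

```sas
/* Hypothetical five-row input: two Id groups, each with one missing Var */
data a;
  input Id Var;
  datalines;
1 2
1 .
1 3
2 .
2 5
;
run;

data b (keep = Id Prod Sum Count Mean);
  /* Top of the implied loop: non-retained variables have just been reset */
  putlog 'Top:    ' Id= Mcount= Sum= Mean=;
  Prod = 1;
  do Count = 1 by 1 until (last.Id);
    set a;
    by Id;
    if missing(Var) then continue;   /* jump to END, skipping the three updates */
    Mcount = sum(Mcount, 1);
    Prod = Prod * Var;
    Sum = sum(Sum, Var);
  end;
  Mean = Sum / Mcount;
  /* Bottom of the DATA step: one summary record per Id is about to be written */
  putlog 'Bottom: ' Id= Count= Mcount= Prod= Sum= Mean=;
run;
```

Working through it by hand, the "Bottom" line for Id 1 should show Count=3, Mcount=2, Prod=6, Sum=5 and Mean=2.5, and for Id 2 it should show Count=2, Mcount=1, Prod=5, Sum=5 and Mean=5. The second "Top" line should show Mcount, Sum and Mean back at missing, while Id still holds 1 because variables read by SET are retained.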

Play with it, and see what you think.

Tom


2 REPLIES
ArtC
Rhodochrosite | Level 12

1) The CONTINUE statement transfers control to the bottom of the loop, which is the END statement. Consequently, the three assignment statements are not executed when VAR is missing, but the UNTIL condition is still evaluated and the loop carries on from there. Note that since you are using a DOW loop, the DATA statement will only be executed once for each unique ID.

2) During the execution phase, non-retained variables are set to missing each time the DATA statement is executed, that is, at the top of each iteration of the implied DATA step loop. No explicit line of code does this. In this case these will be any variables that are not on the incoming data set (A).

3) The KEEP= data set option is applied to the data set it is associated with. To control incoming variables, use it on the data set on the SET statement. When used with the data set(s) on the DATA statement, KEEP= applies to the new data set.
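To illustrate point 3, here is a minimal sketch that uses KEEP= in both places. The input data set a and its extra variable Other are made up for the example; only the two KEEP= options matter.

```sas
/* Hypothetical input with an extra variable (Other) that the step never uses */
data a;
  input Id Var Other;
  datalines;
1 2 10
1 . 20
1 3 30
2 5 40
;
run;

data b (keep = Id Prod Sum Count Mean);   /* controls what is written to B */
  Prod = 1;
  do Count = 1 by 1 until (last.Id);
    set a (keep = Id Var);                /* controls what is read from A */
    by Id;
    if missing(Var) then continue;
    Mcount = sum(Mcount, 1);
    Prod = Prod * Var;
    Sum = sum(Sum, Var);
  end;
  Mean = Sum / Mcount;
run;
```

In this sketch KEEP= on the SET statement only stops Other from being read. It cannot replace the KEEP= on the DATA statement, because Var has to be read for the calculations, and Var and Mcount would both end up in B unless they are excluded on the output side.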

