Was the variable "RunsToDate" a new created variable?

Frequent Contributor
Posts: 112

Hello everyone,

This topic is from my assignment, and after checking the provided answer I still have a question.

 

The data set is shown below.

The labels are:

Month Date Team Hits Runs Status

6-19 Columbia Peaches      8  3 Complete
6-20 Columbia Peaches     10  5 Complete
6-23 Plains Peanuts        3  4 Complete
6-24 Plains Peanuts        7  2 Complete
6-25 Plains Peanuts       12  8 Complete
6-30 Gilroy Garlics        .  . No Data
7-1  Gilroy Garlics        .  . No Data
7-4  Sacramento Tomatoes  15  9 Complete
7-4  Sacramento Tomatoes  10 10 Complete
7-5  Sacramento Tomatoes   2  3 Complete

The question was:

  1. You want to accumulate the maximum number of runs that you know about. In this case, for example, record 6 should list MaxRuns=8.
  2. You only want to accumulate the total number of runs to date until you don’t have information; when this happens, you want to set RunsToDate to a missing value. In this case, for example, record 6 should list RunsToDate=.;

The code was as below:

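data mydata;
   infile "&dirdata/Week_5/Games_Plus.dat" truncover;
   input Month 1 Day 3-4 Team $6-24 Hits 27-28 Runs 30-31 Status $9.;
   /* RETAIN names MaxRuns and RunsToDate, starts both at 0, and keeps their values across iterations */
   retain MaxRuns RunsToDate 0;
   /* MAX ignores missing arguments, so MaxRuns survives records with no data */
   MaxRuns = max(MaxRuns, Runs);
   /* the + operator propagates missing, so RunsToDate turns missing at the first record with no data and stays missing */
   RunsToDate = RunsToDate + Runs;
run;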

proc print data=mydata;
title "Season's Record to Date, with Missing Values";
run;

My question is about the "Retain" statement:

At that point in the code, I had not yet created variables named "MaxRuns" and "RunsToDate".

However, SAS seems to know about them anyway.

I also do not understand MaxRuns: why is it written as MaxRuns=Max(MaxRuns,Runs) instead of simply MaxRuns=Max(Runs)?

And I think "RunsToDate=RunsToDate+Runs" works because RunsToDate of record 3 = RunsToDate of record 2 + Runs of record 3, and so on...

 

I guess I do not really understand this exercise.

If anyone understands it, I would appreciate a little guidance.

Thanks a lot!

Super User
Posts: 10,280

Re: Was the variable "RunsToDate" a new created variable?

Contrary to SQL, where a summary function like max() can work over all rows, a data step (and a data step function) always deals with the current observation only. So you need to compare the current value with the retained summary value.
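
As a minimal sketch of that idea (the data set names runs_demo and running_demo are made up for illustration, and the inline values simply mirror the Runs column from the question), the step below compares each observation with the retained value from the previous iteration:

```sas
/* Hypothetical input: the Runs column from the question, typed inline */
data runs_demo;
   input Runs;
   datalines;
3
5
4
2
8
.
.
9
10
3
;
run;

data running_demo;
   set runs_demo;
   /* both variables start at 0 and keep their values between iterations */
   retain MaxRuns RunsToDate 0;
   /* compare the current Runs with the maximum seen so far; max() skips missing arguments */
   MaxRuns = max(MaxRuns, Runs);
   /* add the current Runs to the total so far; the + operator returns missing if either operand is missing */
   RunsToDate = RunsToDate + Runs;
run;

proc print data=running_demo;
run;
```

Because max() skips missing arguments while the + operator does not, MaxRuns should stay at 8 on the two records with no data, and RunsToDate should turn missing there and remain missing afterwards, which is what the assignment asks for.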

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Frequent Contributor
Posts: 112

Re: Was the variable "RunsToDate" a new created variable?

Posted in reply to KurtBremser

Thank you very much! Now I understand. Thanks~

Super User
Posts: 10,280

Re: Was the variable "RunsToDate" a new created variable?

Retained variables are not stored in a different section of memory; they are part of the PDV like all other non-automatic variables. The data step always handles variables in the PDV as follows:

  • variables from input datasets are retained (so when one observation of dataset A is merged with several observations of B, the values of A persist)
  • newly created variables are always set to missing at the start of a new datastep iteration, unless they are named in a retain statement or a summation statement of the form 
    x + n;
    (x will be retained, n is any numeric expression)
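
A small sketch of these rules, using sashelp.class only as a convenient input; the variable names plain, kept, and total are hypothetical:

```sas
data _null_;
   set sashelp.class(obs=4);
   /* kept is named in a RETAIN statement, so it carries over between iterations */
   retain kept 0;
   /* plain is a new variable and is reset to missing each iteration, so this is always 1 */
   plain = sum(plain, 1);
   /* kept still holds the value from the previous iteration: 1, 2, 3, 4 */
   kept = kept + 1;
   /* summation statement: total is retained implicitly and starts at 0 */
   total + 1;
   put _n_= plain= kept= total=;
run;
```

The log should show plain stuck at 1 while kept and total count up together, which is the difference between a variable that is reset and one that is retained.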

HTH

Trusted Advisor
Posts: 1,346

Re: Was the variable "RunsToDate" a new created variable?

Posted in reply to KurtBremser

@KurtBremser

 

Retained variables are indeed part of the PDV, but that does not mean they have the same memory address as when they are not retained.  This is what I meant by "section" of memory.

 

For simplicity let me restrict my example to numeric variables that are newly created in the data step.

 

Consider the impact on the address of variable W below.  W is retained in the second data step but not the first, and as a result has a different memory address.   In fact, if you have a number of new variables, and retain a subset, I have never seen a retained variable in a memory location contiguous to the non-retained vars.  Instead they are contiguous to each other (separated by 8 bytes needed for numeric variables).   And the non-retained vars are similarly contiguous to each other. 

 

This is why I believe it is a useful paradigm to consider the retain statement as a memory-location assignment statement.

 

Even so, as you have noted, they are in the PDV, and non-retained and retained variables can be logically contiguous, which is handy for programming statements such as array declarations.

 

 

 

 

data _null_;
   set sashelp.class;
   a=age;
   w=weight;
   h=height;

   ada=addrlong(a);
   adw=addrlong(w);
   adh=addrlong(h);
   put (ad:) (=$hex16. /);
   stop;
run;

data _null_;
   set sashelp.class;
   a=age;
   w=weight;
   h=height;
   retain w;
   ada=addrlong(a);
   adw=addrlong(w);
   adh=addrlong(h);
   put (ad:) (=$hex16. /);
   stop;
run;

 

 

My system is Windows, which is "little endian" (the least significant byte of an address is shown first), so addresses such as

   6840750600000000

   7040750600000000

are "contiguous", i.e.

     location 6840750600000000  is followed by

     location 6940750600000000  is followed by

     location 6A40750600000000  is followed by

     location 6B40750600000000  is followed by

     location 6C40750600000000  is followed by

     location 6D40750600000000  is followed by

     location 6E40750600000000  is followed by

     location 6F40750600000000  is followed by

     location 7040750600000000

providing 8 bytes for the numeric variable at the first address.

Trusted Advisor
Posts: 1,346

Re: Was the variable "RunsToDate" a new created variable?

The retain statement can precede the corresponding value assignment statement (although the retain statement also has the option of assigning an initial value).  Note that because of this, you can declare a retained variable even though not only its value, but also its type (numeric vs. character), is not evident until a subsequent statement.
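
A minimal sketch of that ordering, assuming sashelp.class as input; first_name is a hypothetical variable name chosen for illustration:

```sas
data _null_;
   /* no initial value here, so the type of first_name is not yet determined */
   retain first_name;
   set sashelp.class(obs=3);
   /* this assignment is what makes first_name a character variable, with the length of name */
   if _n_ = 1 then first_name = name;
   /* first_name keeps the value from the first observation on later iterations */
   put _n_= name= first_name=;
run;
```

Had the retain statement supplied an initial value, for example retain first_name 0;, the type would have been fixed as numeric at that point instead.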

 

Frequent Contributor
Posts: 112

Re: Was the variable "RunsToDate" a new created variable?

Thank you. A bit complicated, but I think I will figure it out once I am more familiar with it. Thanks!

Trusted Advisor
Posts: 1,346

Re: Was the variable "RunsToDate" a new created variable?

Think of it this way.  When the SAS data step encounters a retain statement, the retained variables turn out to be stored in a different part of memory than non-retained variables.  This can be demonstrated with the ADDRLONG function, which I don't propose to describe here.

 

So in a way, all the retain statement apparently does is assign a variable's location in memory to a region which the data step does not reset to missing with each new record.
