03-10-2018 12:20 PM
This topic came from my assignment,
and after checking the answer provided,
I still had a question.
The data-set is as below:
The labels are:
Month Date Team Hits Runs Status
6-19 Columbia Peaches 8 3 Complete
6-20 Columbia Peaches 10 5 Complete
6-23 Plains Peanuts 3 4 Complete
6-24 Plains Peanuts 7 2 Complete
6-25 Plains Peanuts 12 8 Complete
6-30 Gilroy Garlics . . No Data
7-1 Gilroy Garlics . . No Data
7-4 Sacramento Tomatoes 15 9 Complete
7-4 Sacramento Tomatoes 10 10 Complete
7-5 Sacramento Tomatoes 2 3 Complete
The question was:
The code was as below:
data mydata;
  infile "&dirdata/Week_5/Games_Plus.dat" truncover;
  input Month 1 Day 3-4 Team $6-24 Hits 27-28 Runs 30-31 Status $9.;
  retain MaxRuns RunstoDate 0 ;
  MaxRuns=Max(MaxRuns, Runs);
  RunsToDate=RunsToDate+Runs;
run;
proc print data=mydata;
  title "Season's Record to Date, with Missing Values";
run;
My question is about the "Retain" statement:
At this point, I had not yet created variables named "MaxRuns" and "RunsToDate".
However, SAS seemed to know about them.
I also did not understand MaxRuns: why is it written as MaxRuns=Max(MaxRuns,Runs) instead of simply MaxRuns=Max(Runs)?
And I think "RunsToDate=RunsToDate+Runs" works like this: RunsToDate of record 3 = RunsToDate of record 2 + Runs of record 3, and so on...
I guess I did not really understand this question.
I wonder if anyone understands this and would be willing to guide me a little.
Thanks a lot!
03-10-2018 03:21 PM
Contrary to SQL, where a summary function like max() can work over all rows, a data step (and a data step function) always deals with the current observation only. So you need to compare the current value with the retained summary value.
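As a minimal sketch of that comparison, the step below reads a few Runs values inline (the data set name "running" and the datalines are made up for illustration, not the assignment file). It uses sum() rather than "+" for the running total, on the assumption that missing Runs should be skipped; with "+", a missing Runs would make RunsToDate missing, and because the variable is retained it would stay missing on every later record.

```sas
data running;
  input Runs;
  /* Both variables keep their values across iterations and start at 0 */
  retain MaxRuns RunsToDate 0;
  /* Compare the current record with the best value seen so far;
     max() ignores missing arguments */
  MaxRuns = max(MaxRuns, Runs);
  /* sum() ignores missing arguments, unlike RunsToDate + Runs */
  RunsToDate = sum(RunsToDate, Runs);
  datalines;
3
5
.
9
;
run;

proc print data=running;
run;
```

With these four records, MaxRuns should read 3, 5, 5, 9 and RunsToDate should read 3, 8, 8, 17.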
03-11-2018 04:58 AM
The retained variables are not stored in a different section of memory, they are part of the PDV like all other non-automatic variables, but the data step always does this with variables in the PDV:
(x will be retained, n is any numeric expression)
x + n;
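The sum statement above can be tried with a short sketch like the following (the data set name "totals" and the inline values are illustrative only). The accumulator in a sum statement is retained automatically and starts at 0, and a missing value on the right-hand side is skipped rather than propagated.

```sas
data totals;
  input Runs;
  /* Sum statement: no RETAIN needed, RunsToDate starts at 0,
     and a missing Runs leaves the total unchanged */
  RunsToDate + Runs;
  datalines;
3
.
9
;
run;

proc print data=totals;
run;
```

Here RunsToDate should come out as 3, 3, 12.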
03-11-2018 12:41 PM
Retained variables are indeed part of the PDV, but that does not mean they have the same memory address as when they are not retained. This is what I meant by "section" of memory.
For simplicity let me restrict my example to numeric variables that are newly created in the data step.
Consider the impact on the address of variable W below. W is retained in the second data step but not the first, and as a result has a different memory address. In fact, if you have a number of new variables, and retain a subset, I have never seen a retained variable in a memory location contiguous to the non-retained vars. Instead they are contiguous to each other (separated by 8 bytes needed for numeric variables). And the non-retained vars are similarly contiguous to each other.
This is why I believe it is a useful paradigm to consider the retain statement as a memory-location assignment statement.
Even so, as you have noted, they are in the PDV, and non-retained and retained variables can be logically contiguous, which is handy for programming logic statements such as array declarations.
data _null_;
  set sashelp.class;
  a=age;
  w=weight;
  h=height;
  ada=addrlong(a);
  adw=addrlong(w);
  adh=addrlong(h);
  put (ad:) (=$hex16. /);
  stop;
run;

data _null_;
  set sashelp.class;
  a=age;
  w=weight;
  h=height;
  retain w;
  ada=addrlong(a);
  adw=addrlong(w);
  adh=addrlong(h);
  put (ad:) (=$hex16. /);
  stop;
run;
My system is Windows, which has "little endian" addresses, so addresses such as
are "contiguous" (i.e.
location 6840750600000000 is followed by
location 6940750600000000 is followed by
location 6A40750600000000 is followed by
location 6B40750600000000 is followed by
location 6C40750600000000 is followed by
location 6D40750600000000 is followed by
location 6E40750600000000 is followed by
location 6F40750600000000,
providing 8 bytes for the numeric variable at the first address).
03-10-2018 04:23 PM
The retain statement can precede the corresponding value assignment statement (although the retain statement has the option of assigning an initial value). Note that because of this, you can declare a retained variable even though neither its value nor its type (numeric vs. character) is evident until a subsequent statement.
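A small sketch of this ordering, using made-up names (the data set "demo" and the variable LastTeam are hypothetical): LastTeam is named in the retain statement before anything tells SAS its type, and it only becomes a character variable through the later assignment from Team.

```sas
data demo;
  /* Declared as retained before its type is known */
  retain LastTeam;
  input Team $ Runs;
  /* The assignment makes LastTeam character, taking its length from Team;
     it is only updated when Runs is present */
  if Runs ne . then LastTeam = Team;
  datalines;
Peaches 3
Garlics .
Tomatoes 9
;
run;

proc print data=demo;
run;
```

On the second record LastTeam should still show Peaches, since the retained value is carried over when no assignment happens.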
03-10-2018 10:02 PM
Think of it this way. When the SAS data step encounters a retain statement, the retained variables are stored in a different part of memory than non-retained variables. This can be demonstrated by use of the ADDRLONG function, which I don't propose to describe here.
So in a way, all the retain statement apparently does is assign a variable's location in memory to a region which the data step does not reset to missing with each new record.
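The visible effect of that can be sketched as follows (Kept and NotKept are hypothetical names chosen for the illustration). Both variables are assigned only on the first iteration, and the log shows which one survives into later iterations.

```sas
data _null_;
  input Runs;
  retain Kept;
  /* Assign both variables on the first record only */
  if _n_ = 1 then do;
    Kept = Runs;
    NotKept = Runs;
  end;
  /* Kept holds its value on later records; NotKept is reset to missing */
  put _n_= Kept= NotKept=;
  datalines;
3
5
4
;
run;
```

The log should show Kept=3 on all three iterations, while NotKept is 3 on the first and missing afterwards.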