Can someone please explain nicely and with examples the difference between RETAIN and RETAIN 0? For example,
RETAIN var1; vs. RETAIN var1 0;
Many thanks!
DATA demo;
  LENGTH a b c d ac bc cc dc 8;
  RETAIN a;      * same as RETAIN a .;
  RETAIN b 0;
  RETAIN c 10;
  * NOTE: nothing is done to variable d (default behavior);
  /* Values seen by each assignment, by iteration */
  ac = a + 1;    * _N_ = 1: a = .  (the initial value, missing by default);
                 * _N_ = 2: a = .  (read by INPUT in the previous iteration and retained);
                 * _N_ = 3: a = 5  (read by INPUT in the previous iteration and retained);
  bc = b + 1;    * _N_ = 1: b = 0  (the initial value you provided);
                 * _N_ = 2: b = .  (read by INPUT in the previous iteration and retained);
                 * _N_ = 3: b = 5  (read by INPUT in the previous iteration and retained);
  cc = c + 1;    * _N_ = 1: c = 10 (the initial value you provided);
                 * _N_ = 2: c = .  (read by INPUT in the previous iteration and retained);
                 * _N_ = 3: c = 5  (read by INPUT in the previous iteration and retained);
  dc = d + 1;    * _N_ = 1: d = .  (default behavior: reset to . at the start of every iteration);
                 * _N_ = 2: d = .  (default behavior: reset to . at the start of every iteration);
                 * _N_ = 3: d = .  (default behavior: reset to . at the start of every iteration);
  INPUT a b c d;
DATALINES;
. . . .
5 5 5 5
. . . .
;
RUN;
PROC PRINT DATA = demo NOOBS;
RUN;
@pink_poodle wrote:
Can someone please explain nicely and with examples the difference between RETAIN and RETAIN 0? For example,
RETAIN var1; vs. RETAIN var1 0;
Many thanks!
Your second version provides an initial value of 0. Without one, the variable is initialized to a missing value (blanks for character variables, . for numerics):
data _null_;
length x $5;
retain x;
retain y 0;
retain z;
put (_all_) (=);
run;
When you use
retain var1 ;
SAS retains var1 and sets its initial value to missing.
If you want to set the initial value to 0 you would use:
retain var1 0 ;
If you code:
data want1 ;
set sashelp.class ;
retain counter ;
put _n_= "before incrementing" counter= ;
counter=sum(counter,1) ;
put _n_= "after incrementing" counter= ;
run ;
You will see that the first PUT statement shows counter has a missing value.
If you code:
data want2 ;
set sashelp.class ;
retain counter 0;
put _n_= "before incrementing" counter= ;
counter=sum(counter,1) ;
put _n_= "after incrementing" counter= ;
run ;
The first PUT statement will show counter has the value 0.
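For comparison, a sum statement is a common shorthand for this kind of counter. The following is a minimal sketch, assuming the same SASHELP.CLASS input; the data set name want3 is made up for illustration. A sum statement of the form variable + expression retains the variable automatically and initializes it to 0, so it behaves like the RETAIN counter 0 version without an explicit RETAIN statement.

```sas
data want3 ;
  set sashelp.class ;
  /* Sum statement: counter is retained automatically and starts at 0, */
  /* so no RETAIN statement is needed.                                  */
  counter + 1 ;
  put _n_= "after incrementing" counter= ;
run ;
```

Here counter should track _n_, because it starts at 0 and is incremented once per iteration.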
RETAIN on its own does not set any attributes of the variable. RETAIN 0 defines it as numeric and initializes it to zero.
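A small sketch of that point, assuming SASHELP.CLASS is available; the variable names n and flag are hypothetical. When RETAIN supplies an initial value and is the first reference to a variable, the initial value determines the type (and, for character variables, the length). When RETAIN supplies no initial value, the attributes come from wherever the variable is defined later, here the SET statement.

```sas
data _null_ ;
  retain n 0 ;       /* first reference: n is defined as numeric, starts at 0       */
  retain flag 'N' ;  /* first reference: flag is defined as character, length 1     */
  retain name ;      /* no initial value: type and length come from the SET below   */
  set sashelp.class (obs=1) ;
  put n= flag= name= ;
run ;
```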
You should normally avoid using RETAIN for variables that you are actually READING in with an INPUT statement. The only case where that makes sense is when the INPUT statement is executed CONDITIONALLY, so that values are read only some of the time and the retained value is used in the other cases.
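As an illustration of the conditional case, here is a minimal sketch that reads a file with header and detail records. Everything in it (the data set name visits, the variables type, patient and amount, and the H/D record layout) is invented for the example. The patient value is read only from header records, and RETAIN carries it forward onto the detail records that follow.

```sas
data visits ;
  length type $1 patient $8 ;
  retain patient ;              /* keep the last header value across iterations */
  input type $ @ ;              /* read the record type and hold the line       */
  if type = 'H' then do ;
    input patient $ ;           /* header record: read a new patient id         */
    delete ;                    /* do not output header records                 */
  end ;
  else input amount ;           /* detail record: patient keeps retained value  */
datalines ;
H P001
D 10
D 20
H P002
D 30
;
run ;
```

Without the RETAIN statement, patient would be reset to missing at the start of each iteration and the detail rows would lose it.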
A better way to see what is happening is to look at the values during the data step itself.
DATA demo;
  LENGTH a b c d ac bc cc dc 8;
  RETAIN a;      * same as RETAIN a .;
  RETAIN b 0;
  RETAIN c 10;
  * NOTE: nothing is done to variable d (default behavior);
  if _n_=1 then put '  A  B  C  D AC BC CC DC';
  put _n_=;
  put (_all_) (3.) ' <- START';
  /* Values seen by each assignment, by iteration */
  ac = a + 1;    * _N_ = 1: a = .  (the initial value, missing by default);
                 * _N_ = 2: a = .  (read by INPUT in the previous iteration and retained);
                 * _N_ = 3: a = 5  (read by INPUT in the previous iteration and retained);
  bc = b + 1;    * _N_ = 1: b = 0  (the initial value you provided);
                 * _N_ = 2: b = .  (read by INPUT in the previous iteration and retained);
                 * _N_ = 3: b = 5  (read by INPUT in the previous iteration and retained);
  cc = c + 1;    * _N_ = 1: c = 10 (the initial value you provided);
                 * _N_ = 2: c = .  (read by INPUT in the previous iteration and retained);
                 * _N_ = 3: c = 5  (read by INPUT in the previous iteration and retained);
  dc = d + 1;    * _N_ = 1: d = .  (default behavior: reset to . at the start of every iteration);
                 * _N_ = 2: d = .  (default behavior: reset to . at the start of every iteration);
                 * _N_ = 3: d = .  (default behavior: reset to . at the start of every iteration);
  put (_all_) (3.) ' <- AFTER IF';
  INPUT a b c d;
  put (_all_) (3.) ' <- AFTER INPUT';
DATALINES;
. . . .
5 5 5 5
. . . .
;
Result
  A  B  C  D AC BC CC DC
_N_=1
  .  0 10  .  .  .  .  . <- START
  .  0 10  .  .  1 11  . <- AFTER IF
  .  .  .  .  .  1 11  . <- AFTER INPUT
_N_=2
  .  .  .  .  .  .  .  . <- START
  .  .  .  .  .  .  .  . <- AFTER IF
  5  5  5  5  .  .  .  . <- AFTER INPUT
_N_=3
  5  5  5  .  .  .  .  . <- START
  5  5  5  .  6  6  6  . <- AFTER IF
  .  .  .  .  6  6  6  . <- AFTER INPUT
_N_=4
  .  .  .  .  .  .  .  . <- START
  .  .  .  .  .  .  .  . <- AFTER IF
Notice also that the data step ends on the 4th iteration, when the INPUT statement reads past the end of the datalines.