imputing missing values with retain statement

Regular Contributor
Posts: 234

The first data step gives the desired result while the second does not.

In the second data step, why is the variable LastX not retained even though it is explicitly named in the RETAIN statement?

 

data initial;
input x $ y;
datalines;
A 2
. 4
. 6
. 8
. 10
B 1 
. 2 
C 1 
. 2 
. 3 
. 4
;
run;

/* Impute missing X values from previous valid values */
data imputed1;
	set initial;
	retain LastX " ";

	if X="" then
		X=LastX;
	else LastX=X;
	drop LastX;
run;

data imputed2;
	set initial;
	retain LastX " ";
	LastX=X;  /* assign the value of LastX */

	if X="" then
		X=LastX;
	drop LastX;
run;

proc print data=imputed1;
run;
proc print data=imputed2;
run;

 



All Replies
Super User
Posts: 10,521

Re: imputing missing values with retain statement

The first thing you do in the second data step is assign the value of the CURRENT x to LastX, thereby removing the effect of retaining it.
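One way to keep the layout of the second step and still get the fill is to test X before overwriting LastX. This is only a sketch: it assumes `initial` is the data set from the question, the output name `imputed2_fixed` is made up for illustration, and the LENGTH statement is an addition so that LastX matches the default 8-byte length of X.

```sas
data imputed2_fixed;
	set initial;
	length LastX $ 8;   /* assumption: same length as X from list input */
	retain LastX;

	if X="" then
		X=LastX;        /* fill from the value kept from the previous record */
	LastX=X;            /* only now remember the (possibly filled) value */
	drop LastX;
run;
```

Because the assignment to LastX comes after the test, the retained value is still available when a missing X is read.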

 

Regular Contributor
Posts: 234

Re: imputing missing values with retain statement

@ballardw, so assigning a value to a retained variable nullifies the retaining effect?

I got the following code from a SAS paper (http://support.sas.com/resources/papers/proceedings11/091-2011.pdf). Variable XPREVDT is assigned a value (didate) on the first record for each patient and is still retained for the other records for that patient.

 

data diary1;
	set diary1;
	by patient date;
	format XPREVDT date9.;
	retain XPREVDT;

	if first.patient then
		do;
			XPREVDT = didate;
		end;
	else
		do;
			if nmiss(didate - XPREVDT) = 0 then
				lapse = didate - XPREVDT;
			else lapse = .;
			XPREVDT = didate; * reset to current date;
		end;
run;

  

Also here, LastX is assigned the value of X in the ELSE branch, and it is still retained.

 

data imputed1;
	set initial;
	retain LastX " ";

	if X="" then
		X=LastX;
	else LastX=X;
	drop LastX;
run;
Super User
Posts: 17,869

Re: imputing missing values with retain statement


SAS_inquisitive wrote:

@ballardw, so assigning a value to a retained variable nullifies the retaining effect?

 


Yes, assigning it a value, even missing, overwrites whatever value was in the variable. 
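A small trace can make this visible. The sketch below assumes the `initial` data set from the question; it writes to the log instead of creating a data set, and the LENGTH statement is an addition so that LastX has the same default 8-byte length as X.

```sas
data _null_;
	set initial;
	length LastX $ 8;   /* assumption: match the length of X */
	retain LastX;

	put "before: " _n_= X= LastX=;   /* LastX still holds the previous record's X */
	LastX=X;                         /* unconditional assignment, as in imputed2 */
	put "after:  " _n_= X= LastX=;   /* LastX now equals the current X, blank or not */
run;
```

On the second record the "before" line shows LastX=A, and the "after" line shows LastX blank, because the missing X has just been copied over it.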

 

The quoted code has some flaws beyond the retain issue, which is a danger of user papers: a missing END for the last DO and a weird use of NMISS.

It may also be a copy/paste error; I didn't read the reference. Also, different datasets and situations require different logic. Yours may not be the same.

Solution
‎08-02-2016 10:34 PM
Super User
Posts: 10,521

Re: imputing missing values with retain statement

A retained variable is just like any other variable, with one special property: it is not reset to missing at the start of the next iteration of the data step, so the last value assigned carries over. The value can still be changed just like any other variable. So when you use

 

data example;
   set have;
   retain lastx;
   lastx=x;   /* overwrites the retained value on every iteration */
run;

 

what happens is that LastX holds the value from the previous record only until you reassign it; after the assignment it holds the X of the current record.

 

Similar behaviour occurs if you attempt to RETAIN a variable that is in the input data set. The value of the variable from the incoming record overwrites the retained value.
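As a quick illustration, assuming the `initial` data set from the question (the output name `retain_input_var` is made up), retaining X itself does not fill anything in:

```sas
data retain_input_var;
	set initial;
	retain X;   /* no visible effect: SET overwrites X on every iteration */
run;

proc print data=retain_input_var;
run;
```

The printed X values are the same as in `initial`, including the missing ones, because each record read by SET replaces whatever X held from the previous iteration.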

 

Note that in your example with XPREVDT the value is set conditionally: IF First.patient, so that assignment occurs when the first record for that patient is read. FIRST. is a very specific type of operation. Note that the example does not have:

   retain XPREVDT;

   XPREVDT = didate;

 

as sequential lines of code.
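Here is a minimal sketch of that pattern, not the paper's code. The data set `visits`, its values, and the variable `prevdt` are invented for illustration; the point is only that the retained variable is used before it is reassigned.

```sas
data visits;   /* hypothetical sample data, already sorted by patient */
	input patient didate :date9.;
	format didate date9.;
	datalines;
1 01JAN2016
1 04JAN2016
1 09JAN2016
2 02FEB2016
2 03FEB2016
;
run;

data lapses;
	set visits;
	by patient;
	retain prevdt;

	if first.patient then
		lapse = .;                 /* no earlier visit for this patient */
	else lapse = didate - prevdt;  /* prevdt still holds the previous record's date */
	prevdt = didate;               /* reassign only after the old value has been used */
	drop prevdt;
run;
```

With this sample data, lapse comes out as ., 3, 5 for patient 1 and ., 1 for patient 2. Moving the `prevdt = didate;` line above the IF would make every non-missing lapse zero, which is the same problem as in the second data step of the question.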

 
