Masande
Fluorite | Level 6

According to the Prep Guide:

 

When PROC IMPORT reads raw data, SAS sets the value of each variable in the DATA step to missing at the beginning of each cycle of execution, with these exceptions:


Variables that are named in a RETAIN statement
Variables that are created in a sum statement
Automatic variables


In contrast, when reading variables from a SAS data set, SAS sets the values to missing only before the first cycle of execution of the DATA step. Therefore, the variables retain their values until new values become available (for example, through an assignment statement or through the next execution of a SET or MERGE statement). Variables that are created with options in a SET or MERGE statement also retain their values from one cycle of execution to the next.

 

I don't understand this part of DATA step execution. In my experience, with the following code, total is reset to missing at each new iteration unless I write "retain total;" or "total+var1;":

 

 

data dataset.new;
	set dataset.old;
	total=sum(total,var1);
run;

 

This is confirmed when I execute the code. However, the passage about reading variables from a SAS data set (bolded in my original post) seems to say the opposite: since dataset.old is a SAS data set, total shouldn't be reset to missing, and var1 would be added to it.

 

The book probably has no error, and the code executes as I expect, so what am I missing about its explanation?

Accepted Solutions
Tom
Super User

@Masande wrote:
If I understand it correctly, the variables i and j are coming from a different data set and their values are retained. sum(i,j) is newly created, hence its value is set to missing.

Is that what I'm supposed to see?

I and J are variables being read from a dataset, so they are retained. But in a normal, simple data step where the SET statement is the first thing that executes, it really doesn't matter whether they are "retained" or not, since whatever value they had is immediately overwritten when the SET statement executes.

 

SUM(I,J) is a function call and so has nothing to do with the point of the question. The difference between the two steps is that in one the value is being assigned to a variable that is coming from an input dataset, and in the other it is being assigned to a variable that is NOT coming from an input dataset. So in one, the value BEFORE the SET statement reflects the value at the end of the previous iteration. In the other, the value is missing until the assignment statement gives it a value.

 

In the simple data step generated by PROC IMPORT, none of the variables are coming from an input dataset, so none of them are "retained". Also, since each iteration of the data step includes an INPUT statement that sets the values of the variables, it doesn't really matter whether the variables are retained. That is why it seems strange to mention this issue in the context of PROC IMPORT.
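
As a rough illustration of the raw-data case described above, here is a minimal sketch. The variable names x and y and the inline data are made up for the example; this is not the step PROC IMPORT would generate. Neither variable comes from an input dataset, so both should show as missing in every "before" line of the log, even though y was assigned a value on the second iteration.

```sas
/* Sketch: variables read with INPUT, or created by assignment,  */
/* are reset to missing at the top of each DATA step iteration.  */
data _null_;
    put 'before' +2 x= y= _n_=;  /* expected to be missing every time      */
    input x;                     /* x gets its value from the raw data     */
    if x = 2 then y = 99;        /* y is assigned on one iteration only    */
    put 'after' +2 x= y= _n_=;
    datalines;
1
2
3
;
run;
```

On the third iteration the "after" line should show y as missing again, because nothing retained the 99 from the previous pass.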

 


7 REPLIES
novinosrin
Tourmaline | Level 20

Can you please run this and check the log to see if it helps your understanding?

 

/* Build a small input dataset with two variables, i and j */
data test_data;
    do i=1 to 10;
        j=i;
        output;
    end;
run;

/* j comes from test_data, so "before" shows the value left by the previous iteration */
data _null_;
    put 'before' +2  j= _n_=; /*before*/
    set test_data;
    j=sum(i,j);
    put 'after' +2 j= _n_=;	/*after*/
run;

Compare the above with the below

 

/*Now testing with a newly assigned variable jj, which is not in test_data*/
data _null_;
    put 'before' +2  jj= _n_=; /*before*/
    set test_data;
    jj=sum(i,j);
    put 'after' +2 jj= _n_=;	/*after*/
run;

 

Masande
Fluorite | Level 6
If I understand it correctly, the variables i and j are coming from a different data set and their values are retained. sum(i,j) is newly created, hence its value is set to missing.

Is that what I'm supposed to see?
Tom
Super User

What does PROC IMPORT have to do with this question about data steps?

Masande
Fluorite | Level 6
According to the Prep Guide, PROC IMPORT runs a DATA step to read the data. Hence the relation with the first post (which is a quote from the book).
Tom
Super User

@Masande wrote:
According to the Prep Guide, PROC IMPORT runs a DATA step to read the data. Hence the relation with the first post (which is a quote from the book).

PROC IMPORT will generate and run a DATA step when used to read a delimited text file.  But it does not generate a data step to read from structured data, like Excel files.
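
One way to see this for a delimited file is to import a tiny CSV and read the log. The sketch below is only an illustration: the fileref csvdemo, the output dataset work.csv_demo and the two-column data are hypothetical names chosen for the example. For DBMS=CSV, the log should contain the DATA step (with INFILE and INPUT statements) that PROC IMPORT generated and ran.

```sas
/* Sketch: write a small comma-delimited file to a temporary location */
filename csvdemo temp;

data _null_;
    file csvdemo;
    put 'id,amount';
    put '1,10';
    put '2,20';
run;

/* Import it; check the log for the generated DATA step */
proc import datafile=csvdemo
            out=work.csv_demo
            dbms=csv
            replace;
    getnames=yes;
run;
```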

 

Who wrote that guide?

Kurt_Bremser
Super User

total is NOT read from the dataset, but a newly created variable, and therefore set to missing at the start of each data step iteration.
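
A minimal sketch of the three variants discussed in this thread, for comparison. The WORK datasets old, new_reset, new_retain and new_sum are stand-in names for illustration (the original post uses dataset.old and dataset.new), and var1 simply takes the values 1, 2, 3.

```sas
/* Stand-in input dataset: var1 = 1, 2, 3 */
data old;
    do var1 = 1 to 3;
        output;
    end;
run;

/* total is a new variable: reset to missing each iteration, */
/* so sum(total, var1) is just var1 on every row             */
data new_reset;
    set old;
    total = sum(total, var1);
run;

/* RETAIN keeps total across iterations: running total */
data new_retain;
    set old;
    retain total;
    total = sum(total, var1);
run;

/* Sum statement: total starts at 0 and is retained automatically */
data new_sum;
    set old;
    total + var1;
run;
```

Printing the three outputs side by side (for example with PROC PRINT) should show total equal to var1 in new_reset, and an accumulating total in the other two.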

 
