Hi !
I would like to understand the logic behind this simple data step.
data test ;
put _all_ ;
input x ;
y = 2 * x ;
cards ;
1
;
In the log it is displayed :
x=. y=. _n_=1
x=. y=. _n_=2
I am probably missing a few details :
1. Why are there two iterations ?
2. Is the first iteration done for reading the data descriptor ?
3. Why then, at the second iteration (when reading INPUT, I guess), are X and Y still missing ?
Thanks in advance
saskap
The data step stops iterating when a SET or INPUT statement reads past the end of the file. Any statements before that point in the step (in this case the PUT statement) are still executed in that final iteration.
Note that your output dataset still has only one observation with X=1 and Y=2, as expected.
- Jan.
What you see happens because the put _all_ statement comes before the input. The data step halts when the INPUT statement tries to read past the end of the data in the CARDS block. That only happens after _all_ has already been printed to the log.
Also, because the PUT statement comes before any other statement, all variables are missing when it runs. At the start of each iteration the PDV is reset to missing for all non-retained variables.
For illustration, you could move the PUT statement to the bottom of the data step, after the assignment of Y.
Hope this clarifies things a bit.
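A minimal sketch of that rearrangement, using the same step as in the question with only the PUT statement moved:

```sas
data test ;
  input x ;      /* reads the single data line; stops the step at end of data */
  y = 2 * x ;
  put _all_ ;    /* now runs only after X and Y have been populated */
cards ;
1
;
```

With this ordering the log should show a single line, with x=1 and y=2. The second iteration still begins, but it stops at the INPUT statement before the PUT statement is reached.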
- Jan
For the basics have a good look at How the DATA Step Works: A Basic Introduction. It is essential learning.
- Jan.
Thanks Jan... Yes, of course, the reinitialisation of the PDV! However, why two iterations? My only explanation would be that the first iteration creates the data descriptor (compilation phase, with _n_=1) and that execution only starts with the second iteration. What do you think?
@saskapa wrote:
Thanks Jan... Yes, of course, the reinitialisation of the PDV! However, why two iterations? My only explanation would be that the first iteration creates the data descriptor (compilation phase, with _n_=1) and that execution only starts with the second iteration. What do you think?
No. All data steps are compiled before they start.
The reason there are two lines of PUT output is that the implied data step loop runs twice. The second time, it stops when the INPUT statement reads past the end of the input, so it never reaches the end of the step, where the observation is written. That is why you only have one output record. This is how all simple data steps work. The same would happen if you were using a SET statement instead of reading raw data with an INPUT statement: when the SET statement reads past the end of the data, the step stops.
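To make the two passes visible, here is a small sketch that brackets the read with PUT statements, first for INPUT and then for SET. The message texts and the dataset names ONE and TWO are only illustrative choices, not anything from the original step.

```sas
/* Raw data version: PUT before and after the INPUT statement */
data test ;
  put 'Before INPUT: ' _n_= x= y= ;
  input x ;                      /* second pass stops the step here */
  y = 2 * x ;
  put 'After INPUT:  ' _n_= x= y= ;
cards ;
1
;

/* Same idea with SET: ONE is a hypothetical one-row input dataset */
data one ;
  x = 1 ;
run ;

data two ;
  put 'Before SET: ' _n_= ;
  set one ;                      /* second pass stops the step here */
  put 'After SET:  ' _n_= ;
run ;
```

In both steps the "Before" message should appear twice (for _n_=1 and _n_=2) and the "After" message only once, because the second iteration ends at the read statement. Each output dataset still contains a single observation.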
Thanks Tom for the clarification