Hello:
In the log of the first program, every observation is printed three times, but in the second program only the first observation is. The output data set is the same in both cases. I think the difference comes from the OUTPUT statement in the first program. I just need verification from the experts.
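/* First program: explicit OUTPUT */
data test;
  put _all_;                        /* before SET: values from the previous iteration */
  set sashelp.class;
  put _all_;                        /* values of the observation just read */
  if name = 'Alfred' then output;   /* writes the row; the iteration continues */
  put _all_;                        /* executes for every observation */
run;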
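/* Second program: subsetting IF */
data test;
  put _all_;                        /* before SET: values from the previous iteration */
  set sashelp.class;
  put _all_;                        /* values of the observation just read */
  if name = 'Alfred';               /* a false condition ends the iteration here */
  put _all_;                        /* executes only when the condition is true */
run;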
I agree with @SuryaKiran, but the observed behavior isn't limited to just subsetting if statements. The following is equivalent:
data test;
  put _all_;
  set sashelp.class;
  put _all_;
  if name NE 'Alfred' then delete;
  put _all_;
run;
Art, CEO, AnalystFinder.com
The put statements in your datastep only print the values from the PDV (program data vector) to your log. They have no effect on the file you are creating with the datastep.
In your first datastep you are explicitly only outputting the record that has the name 'Alfred', thus you only get one record.
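In your second datastep there is no explicit output statement, so SAS outputs a record implicitly at the end of every iteration that reaches the bottom of the step. The subsetting if stops every iteration except the one for 'Alfred', so again you only get one record.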
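To check that the two forms really write the same rows, a minimal sketch along these lines could be run. The data set names explicit_out and implicit_out are made up for illustration and are not from the original programs.

```sas
/* Explicit OUTPUT: a row is written only when the statement executes */
data explicit_out;
  set sashelp.class;
  if name = 'Alfred' then output;
run;

/* Subsetting IF: the implicit output at the bottom of the step is */
/* reached only when the condition is true                         */
data implicit_out;
  set sashelp.class;
  if name = 'Alfred';
run;

/* Compare the two results; no differences are expected */
proc compare base=explicit_out compare=implicit_out;
run;
```

If the reasoning above holds, both data sets should contain only the row for Alfred.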
If you look at your log, you'll notice that the first put statement produces a line that shows all missing values. That is because no record had yet been read.
The last record also appears three times with non-missing values in the log because (1) the first put statement of each iteration lists the values retained from the previous record, (2) the next two put statements list the values of the record just read, and (3) after the last record SAS starts one more iteration, lists the values of the last record again with the first put statement, encounters the end of file upon executing the set statement, and stops processing at that point.
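The extra pass at the end of the file can be made visible with the automatic variable _N_. The following is only a sketch; last_obs is an illustrative variable name created by the END= option and is not part of the original programs.

```sas
data _null_;
  /* Runs at the top of every iteration, including the final one */
  /* in which SET finds no more observations                     */
  put 'Top of iteration: ' _n_=;
  set sashelp.class end=last_obs;
  /* END= sets last_obs to 1 while the last observation is being processed */
  if last_obs then put 'Last observation read in iteration: ' _n_=;
run;
```

The "Top of iteration" line should appear once more than there are observations in the input, because the step only stops when the set statement hits the end of file.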
Art, CEO, AnalystFinder.com
@art297 My question was why the first data step program has three representations of every observation in the log, while the second data step program has three representations of the first observation and two representations of the rest.
My intuition is that the IF condition is true for the first observation, so the last PUT statement executes in the second program. For the rest, the IF condition is false and the last PUT statement never executes.
In the first program, the last PUT statement executes for all observations, even though the IF condition is true for the first observation only. The values are still available after the OUTPUT statement.
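One way to test this intuition is to reduce the first program to a single labelled PUT placed after the IF-THEN. This is a sketch, not the original program; the label text is arbitrary.

```sas
data test;
  set sashelp.class;
  /* OUTPUT writes the current row but does not end the iteration */
  if name = 'Alfred' then output;
  /* So this line should appear in the log for every observation */
  put 'Reached the bottom of the step: ' _n_= name=;
run;
```

The log should show one line per observation in sashelp.class, while test ends up with the single row for Alfred.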
Your second program uses a subsetting IF (an IF without a THEN clause): the rest of the iteration runs only if the condition is true. The condition is true only for the first record, so all three PUT statements execute for it; for the other records, the third PUT statement does not execute.
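Labelling the PUT statements makes it easier to see which ones run. A minimal sketch, with arbitrary labels A and B that are not in the original code:

```sas
data _null_;
  set sashelp.class;
  put 'A: before the subsetting IF ' _n_= name=;
  /* A false condition returns control to the top of the step */
  if name = 'Alfred';
  put 'B: after the subsetting IF  ' _n_= name=;
run;
```

Line A should appear for every observation, and line B only for Alfred.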