DATA Step, Macro, Functions and more

repeated obs

Contributor
Posts: 64


Hi Experts,

 

I am unable to understand how this program produces 8 observations in dataset ONE, with observations 1, 2 and 5 repeated.

 

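/* Build the 5-row input dataset */
data input;
  infile datalines;
  input Var1 $ Var2 $;
  datalines;
A one
A two
B three
C four
A five
;
run;

/* Create two datasets from the same input */
data one two;
  set WORK.INPUT;
  count = _n_;                          /* iteration number = input row number */
  if Var1 = 'A' then output WORK.ONE;   /* conditional: writes to ONE only */
  output;                               /* unconditional: writes to ONE and TWO */
run;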

Regular Contributor
Posts: 233

Re: repeated obs

Posted in reply to Rahul_SAS

Hello,

 

A DATA step processes the input dataset row by row.

If the code contains no explicit OUTPUT statement, the row currently being processed (*) is written automatically at the end of each iteration, when execution reaches the RUN statement. If rows remain to be processed, the implicit loop executes the statements of the DATA step again for the next row of the input dataset.

(*) More precisely, the program data vector (PDV), which is the SAS term for the data being processed (the row read from the input dataset plus any newly created variables).

 

 

On the other hand, when the DATA step contains one or more OUTPUT statements, those statements alone determine when rows are written to the created datasets, and reaching the RUN statement no longer writes a row implicitly.
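The difference between implicit and explicit output can be seen in a minimal sketch like the one below. The dataset names DEMO, IMPLICIT and EXPLICIT and the variables x and y are made up for illustration and do not come from the original post.

```sas
/* Hypothetical 3-row input, for illustration only */
data demo;
  input x;
  datalines;
1
2
3
;
run;

/* No OUTPUT statement: the PDV is written once at the end of each iteration */
data implicit;
  set demo;
  y = x * 10;
run;
/* IMPLICIT has 3 rows, one per input row */

/* An OUTPUT statement anywhere in the step turns off the implicit output */
data explicit;
  set demo;
  if x = 2 then output;   /* only the row with x=2 is written */
run;
/* EXPLICIT has 1 row */
```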

 

So in your code, for each input row where Var1='A', the statement

output WORK.ONE

writes a row to table ONE only.

The OUTPUT statement that follows names no dataset, so it writes a row to both output datasets, ONE and TWO, for every input row.

With 3 rows where Var1='A' and 2 rows where Var1 <> 'A', the ONE dataset gets 3x2+2=8 rows, and the TWO dataset gets 5 rows (one per input row).
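To see which statement wrote each row, one option is to tag the rows before each OUTPUT. This is only a sketch: it assumes WORK.INPUT from the question already exists, and the names ONE_TRACE, TWO_TRACE and SOURCE are hypothetical.

```sas
/* Assumes WORK.INPUT from the question has been created.          */
/* ONE_TRACE, TWO_TRACE and SOURCE are illustrative names only.    */
data one_trace two_trace;
  set work.input;
  count = _n_;
  length source $ 11;
  if Var1 = 'A' then do;
    source = 'conditional';
    output one_trace;       /* extra copy, written for A rows only */
  end;
  source = 'all rows';
  output;                   /* written for every row, to both datasets */
run;

proc print data=one_trace noobs;
  var count Var1 Var2 source;
run;
/* count values in ONE_TRACE: 1 1 2 2 3 4 5 5 (8 rows) */
```

The rows with count 1, 2 and 5 appear twice in ONE_TRACE, once tagged 'conditional' and once tagged 'all rows', while TWO_TRACE holds the 5 input rows once each.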

Super User
Posts: 19,772

Re: repeated obs

Posted in reply to Rahul_SAS

There are two output statements. 

 

One executes for ALL records. The other is conditional and executes only if Var1 is 'A', which is records 1, 2 and 5 in your case.
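If the repeated rows are not wanted, naming a dataset on each OUTPUT statement avoids them. The sketch below assumes the intent was "A rows into ONE, every row into TWO", which the original post does not state, and it relies on WORK.INPUT from the question.

```sas
/* Assumption: ONE should hold only the A rows, TWO should hold all rows */
data one two;
  set work.input;
  count = _n_;
  if Var1 = 'A' then output work.one;   /* A rows only: 3 rows in ONE */
  output work.two;                      /* every row: 5 rows in TWO   */
run;
```

If the goal was instead to split the input, with A rows in ONE and the remaining rows in TWO, the second statement would become `else output work.two;`.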
