DOW: do until(last.var) and do _n_=1 by 1 until (last.var)?

Contributor
Posts: 46

Hi SAS experts,

What happens in the PDV when a SET statement is used inside a DO loop? And are do until(last.var) and do _n_=1 by 1 until (last.var) different from each other? With _n_ as the index variable, does that mean _n_ can be incremented to 2 and beyond while the first observation of a SAS dataset is being processed? Please let me know the significance of using _n_ as the index variable, and whether some other index variable could be used instead of _n_ in a similar example.

I have read the long papers on the DOW loop but still haven't grasped the essential processing. I'd appreciate it if anybody could explain it with a simpler example than the illustrations in the SUGI papers.

Thanks so much in advance if you lend me your valuable time.

God bless

PROC Star
Posts: 7,468

Posted in reply to Allaluiah

An easy way to see what is going on in the PDV is to place 'put _all_;' statements at strategic points in your code, e.g. immediately before and after each SET statement:

data a;
  input id var;
  cards;
1 1
1 2
1 3
2 1
2 2
2 3
2 4
;

Data b;
  Do _n_ = 1 By 1 Until ( Last.Id ) ;
    put 'before set 1st Iteration:'_all_;
    Set A ;
    By Id ;
    Sum = Sum (Sum, Var) ;
    put 'after set 1st Iteration:'_all_;
  End ;
  Mean = Sum / _n_ ;
  Do _n_ = 1 By 1 Until ( Last.Id ) ;
    put 'before set 2nd Iteration:'_all_;
    Set A ;
    By Id ;
    put 'after set 2nd Iteration:'_all_;
    output;
  End ;
Run ;

Yes, you can increment _n_ by whatever value you want. However, since the index is usually there to serve as a counter, you would typically increment it by 1.

One of the benefits of using _n_, rather than some other variable, is that you don't have to drop it from your output, as _n_ is never included in the output dataset.
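
To illustrate the point about the index variable, here is a minimal sketch that uses an ordinary variable as the counter instead of _n_. It assumes dataset A from the example above (variables id and var); the names c, cnt and total are made up for illustration and do not come from the original code.

```sas
/* Sketch only: group mean with an ordinary counter variable.      */
/* c, cnt and total are hypothetical names chosen for this example. */
data c (drop=cnt total);
  do cnt = 1 by 1 until (last.id);
    set a;
    by id;
    total = sum(total, var);  /* total starts missing in each step iteration */
  end;
  /* UNTIL is checked at the bottom of the loop, before cnt is incremented, */
  /* so cnt equals the number of rows read for the current id.              */
  mean = total / cnt;
run;
```

With the sample data this should give one row per id, with mean = 2 for id 1 and mean = 2.5 for id 2. The only practical difference from the _n_ version is the DROP= option: without it, cnt would be written to the output dataset.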

Super Contributor
Posts: 340

Posted in reply to Allaluiah

Regarding ".. simpler example than the illustration in SUGI papers ..": maybe these little programs are a good place to start.

The difference between "Do I=1 By 1 Until (Last.Var)" and "Do Until (Last.Var)" is that you get a counter "I" which can be useful, for example to calculate a mean (see code 4).


Data A;
  Input N $ X Y;
  Datalines;
A 378 334
A 362 .
A 221 93
B 210 12
B 100 .
B 389 28
B 723 92
C 23 .
C 98 .
C 12 239
;
Run;

* Code 1: Fill in missing Y values with the predecessor - NO by-processing;
Data Y_Complete (Drop=dummy);
  Do Until (Eof);
    Set A End=Eof;
    If not Missing (Y) Then dummy=Y;
    Else If Missing (Y) Then Y=dummy;
    Output;
  End;
Run;

* Code 2: Calculate sum and mean of all X values;
Data Calculate_Total_Mean (Keep=Mean_X Total_X);
  Do _N_=1 By 1 Until (Eof);
    Set A End=Eof;
    Total_X=Sum(X,Total_X);
  End;
  Mean_X=Total_X/_N_;
Run;

* Code 3: Mean values by group;
Data Instead_of_proc_means (Keep=N Mean_X);
  Do i=1 By 1 Until (Last.N);
    Set A;
    By N;
    Total=Sum(X,Total);
  End;
  Mean_X=Total/i;
Run;

* Code 4: Mean values by group, keep all observations;
Data Add_Mean_to_All_Xes;
  Do _N_=1 By 1 Until (Last.N); * "_N_=1 By 1" .. you need a counter to calculate the mean;
    Set A;
    By N;
    Total=Sum(X,Total);
  End;
  Mean_X=Total/_N_;
  Do Until (Last.N); * you only want to loop and output the data lines;
    Set A;
    By N;
    Output;
  End;
Run;
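
To make the difference between the two loop forms concrete, the sketch below recomputes the group means from code 3 using a plain "Do Until (Last.N)" and a counter that is incremented by hand. It assumes dataset A as defined above; the names Manual_Counter and cnt are hypothetical and chosen only for this illustration.

```sas
/* Sketch only: Do Until (Last.N) with a hand-maintained counter.   */
/* Manual_Counter and cnt are hypothetical names for this example.   */
Data Manual_Counter (Keep=N Mean_X);
  Do Until (Last.N);
    Set A;
    By N;
    cnt=Sum(cnt,1);      /* the job "Do i=1 By 1" does for you */
    Total=Sum(X,Total);
  End;
  Mean_X=Total/cnt;
Run;
```

This should produce the same Mean_X values as code 3. The iterative form simply saves the extra counting statement.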
