SAS Procedures

Help using Base SAS procedures
raivester
Quartz | Level 8

I am reviewing someone's code and came across a SET statement with an END= option. I've read the SAS documentation on this combination, but I am still not clear on why it is necessary. To my understanding, SAS waits until the last observation of the set has been read before performing the subsequent action, but I don't understand the benefit of this. In case it is helpful, the code is below:

 

data _null_;
 set e_gen_count end=eof;
 if eof then do;
 	call symput("macro_counter",put(trim(left(macro_counter)),2.0));
end;
run;

%put &macro_counter;
Accepted Solutions
Quentin
Super User

Hi,

 

A key point is that your understanding "SAS waits until the last observation of the set has been read before performing the subsequent action" is wrong.

 

The DATA step is an iterative loop. Consider the following step:

data _null_ ;
  set sashelp.class end=eof ;
  put (_n_ name eof)(=) ;
run ;

When the step executes, the SET statement reads the first record and the PUT statement executes. The step then loops back to the top, the SET statement reads the second record, and the PUT statement executes again. Each time the SET statement executes, it reads only one record. It does not read all of the records at once. The step ends when the SET statement tries to read past the last record.

 

The log shows the values read on each iteration of the loop, and the EOF variable created by the end= option:

 

5    data _null_;
6      set sashelp.class end=eof;
7      put (_n_ name eof)(=) ;
8    run ;

_N_=1 Name=Alfred eof=0
_N_=2 Name=Alice eof=0
_N_=3 Name=Barbara eof=0
_N_=4 Name=Carol eof=0
_N_=5 Name=Henry eof=0
_N_=6 Name=James eof=0
_N_=7 Name=Jane eof=0
_N_=8 Name=Janet eof=0
_N_=9 Name=Jeffrey eof=0
_N_=10 Name=John eof=0
_N_=11 Name=Joyce eof=0
_N_=12 Name=Judy eof=0
_N_=13 Name=Louise eof=0
_N_=14 Name=Mary eof=0
_N_=15 Name=Philip eof=0
_N_=16 Name=Robert eof=0
_N_=17 Name=Ronald eof=0
_N_=18 Name=Thomas eof=0
_N_=19 Name=William eof=1

Now suppose you wanted to calculate the total weight of all students and assign that value to a macro variable. You could do it like this:

 

data _null_;
  set sashelp.class end=eof;
  totalweight+weight ;
  put (_n_ name totalweight eof)(=) ;
  call symputx("totalweight",totalweight) ;
run ;

%put &totalweight ;

And that code works, but there's an inefficiency. That CALL SYMPUTX statement will execute 19 times. You only need it to execute once, after you have read the last record and calculated totalweight for all 19 records. You can add that efficiency with an IF statement:

data _null_;
  set sashelp.class end=eof;
  totalweight+weight ;
  put (_n_ name totalweight eof)(=) ;

  if eof then do ;
    call symputx("totalweight",totalweight) ;
  end ;
run ;
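
The same pattern applies to the code in the question. Here is a minimal sketch, assuming macro_counter is a numeric variable in e_gen_count. The real data set is not shown in this thread, so the first step builds a made-up stand-in with three records. CALL SYMPUTX converts a numeric value to text and strips leading and trailing blanks on its own, so the PUT(TRIM(LEFT(...))) wrapper should not be needed:

```sas
/* Hypothetical stand-in for e_gen_count; the real data set is not shown in the thread */
data e_gen_count;
  do macro_counter = 1 to 3;
    output;
  end;
run;

data _null_;
  set e_gen_count end=eof;
  /* Executes only on the iteration that read the last observation */
  if eof then call symputx("macro_counter", macro_counter);
run;

/* Writes 3 to the log for the stand-in data above */
%put &macro_counter;
```

Without the IF, the macro variable would end up with the same value, because each later iteration overwrites the earlier one. The END= test just avoids the redundant calls.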

The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in January 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our email list.


4 REPLIES 4
ed_sas_member
Meteorite | Level 14

Hi @raivester 

To me, it is quite like the notion of first. and last. internal variables when you have a BY statement.

first.variable will be equal to 1 for the first occurrence of each BY group and 0 otherwise.

Similarly, the variable named in the END= option will be equal to 1 for the very last observation of the data set and 0 otherwise.

It can be useful, for example, if you want to output only the very last record, or to control some DO loops.

In your case, END=eof creates an indicator variable named eof that is equal to 1 only when the last observation has been read, so the CALL SYMPUT statement that creates the macro variable executes only for that record.
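
As a rough sketch of the two uses mentioned above (writing out only the last record, and using the END= variable to control a DO loop), assuming sashelp.class is available. The output data set names last_record and total are made up for illustration:

```sas
/* Keep only the very last observation of the input data set */
data last_record;
  set sashelp.class end=eof;
  if eof then output;   /* an explicit OUTPUT turns off the implicit one */
run;

/* Use the END= variable as the stopping condition of an explicit DO loop */
data total (keep=totalweight);
  do until (eof);
    set sashelp.class end=eof;
    totalweight = sum(totalweight, weight);   /* SUM ignores the initial missing value */
  end;
  /* All observations have been read; the implicit OUTPUT writes one record */
run;
```

In the second step the whole data set is read within a single iteration of the DATA step, so total should contain one observation. The END= variable itself is temporary and is not written to either output data set.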

Best,

novinosrin
Tourmaline | Level 20

Sir @Quentin  Very elegant and neat explanation. Just class! Thank you. I have just copied to my notes. Hmm sounds like you have a lot of free time today. lol

Quentin
Super User

Thanks @novinosrin .  I'm listening in the background to a not so engaging corporate webinar. : )

