BrahmanandaRao
Lapis Lazuli | Level 10
data d;
  do until(age in (16,15,11));
    set sashelp.class;
  end;
  put _ALL_;
run;

Is this a correct method to retrieve the observations with these ages?

JosvanderVelden
SAS Super FREQ
Why do you have the set statement in the do-loop? What do you want to achieve by that?

If you just want to subset the class dataset you can use a subsetting if statement or a where statement.
BrahmanandaRao
Lapis Lazuli | Level 10

I am practicing DO UNTIL and DO WHILE loops. Is the above method also correct?

Quentin
Super User

I would say yes, it's one possible method. 

 

If I saw this code in production, I would be confused, because the simple approach of using subsetting IF or WHERE is easier and clearer.

 

That said, since you said you are practicing, code like this is a GREAT way to practice and really learn SAS.  Playing with code like this allows you to develop a deep understanding of how the DATA step works, how the Program Data Vector works, etc.  For example, consider how you would write the same step with DO WHILE, and whether DO WHILE is riskier than DO UNTIL in this setting.

 

Also, the pattern of placing a SET statement inside a DO UNTIL() loop is so useful that it has been given a colloquial name ("DoW-loop"), and several papers have been written about why it is a useful coding pattern. A good starting point would be Paul Dorfman's paper: https://support.sas.com/resources/papers/proceedings12/156-2012.pdf .
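
For readers who have not met the pattern before, here is a minimal sketch of a DoW-loop in its more common BY-group form. It is not taken from the paper; the data set names class_sorted and mean_by_sex and the variables n, total, and mean_height are made up for illustration.

```sas
/* BY-group processing needs sorted input; class_sorted is a hypothetical name */
proc sort data=sashelp.class out=class_sorted;
  by sex;
run;

data mean_by_sex;
  /* One DATA step iteration reads one whole BY group.                */
  /* UNTIL is checked at the bottom of the loop, before n is          */
  /* incremented, so on exit n equals the number of rows in the group */
  do n = 1 by 1 until(last.sex);
    set class_sorted;
    by sex;
    total = sum(total, height);   /* total is reset to missing at the top of each step iteration */
  end;
  mean_height = total / n;
  keep sex n mean_height;
  /* implicit OUTPUT here: one row per value of sex */
run;
```

Because the loop ends on the last row of each group, the implicit OUTPUT at the bottom of the step writes one summary row per group without any RETAIN statement or explicit reset logic.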

 

If you have come up with this idea independent of seeing someone else use this approach, then I would congratulate you.  For me, learning approaches like this (mostly from SAS-L), was what helped to move me from simply 'using' the DATA step to programming in the DATA step.

 

The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in January 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our email list.
Kurt_Bremser
Super User

Maxim 4. Run it and look at the results and the log.

Hint: a SET statement in the DATA step means that the data step will run until either the SET tries to read past the last observation, or a STOP statement is executed.

And the pointer of the SET will not be reset when a new data step iteration begins (and the next DO loop starts).
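
One way to act on this hint is to trace every read in the log. The step below is only a sketch: it moves the PUT inside the loop and adds a counter, reads, which is a hypothetical name that does not appear in the original code.

```sas
data _null_;                      /* no output data set, we only want the log */
  do until(age in (16,15,11));
    set sashelp.class;
    reads + 1;                    /* sum statement: retained across step iterations */
    put _N_= reads= Name= Age=;   /* one log line per observation read */
  end;
run;
```

The first eight log lines all carry _N_=1 (Alfred through Janet), and the line for Jeffrey is the first with _N_=2, which shows that the SET pointer carries on from where the previous iteration stopped. The step ends when SET is executed once more after William and finds no further observation.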

Quentin
Super User

@Kurt_Bremser wrote (in part):

Hint: a SET statement in the DATA step means that the data step will run until either the SET tries to read past the last observation, or a STOP statement is executed.


Or until the data step iterates and does not execute the SET statement, which is what makes the difference in DO WHILE vs DO UNTIL interesting for this example.

 

As a quiz question, if you gave a SAS programmer a printout of sashelp.class:

Name       Age

Alfred      14
Alice       13
Barbara     13
Carol       14
Henry       14
James       12
Jane        12
Janet       15
Jeffrey     13
John        12
Joyce       11
Judy        14
Louise      12
Mary        15
Philip      16
Robert      12
Ronald      15
Thomas      11
William     15

And asked them which records are output by the DO UNTIL step vs DO WHILE:

data want1;
  do until(age in(16,15,11));
    set sashelp.class;
  end;
run;

data want2;
  do while(age NOT in(16,15,11));
    set sashelp.class;
  end;
run;

I suspect the success rate would be low.  It could be an interesting interview question for an intermediate/advanced SAS programmer.  Even if a candidate couldn't answer correctly, it would be insightful to see how they worked through the problem.
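
For anyone who wants to check their answer, a traced version of the DO WHILE step might look like the sketch below. The data set name want2_trace and the counter n_read are hypothetical. My understanding is that the second step iteration finds the retained age already in the list, skips the loop, and therefore never executes SET, so I would expect a "DATA STEP stopped due to looping" note in the log, but please run it and verify rather than take that on trust.

```sas
data want2_trace;
  /* WHILE is checked at the top of the loop, so the body (and the SET) */
  /* can be skipped entirely when age already matches the list          */
  do while(age not in (16,15,11));
    set sashelp.class;
    n_read + 1;                   /* retained count of SET executions */
  end;
  put _N_= n_read= Name= Age=;    /* one log line per step iteration */
run;
```

Comparing n_read between consecutive log lines shows whether SET ran at all during a given iteration, which is the difference between DO WHILE and DO UNTIL that matters here.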

 

yabwon
Onyx | Level 15

Definitely for an _advanced_ SAS programmer. 😉

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Quentin
Super User

Yeah, I would probably get this question wrong. : ) But even for the intermediate candidate, the discussion is useful.  I had a great boss once, and we talked about how to interview SAS folks.  And he said, when he interviewed beginner or even intermediate programmers, he didn't care so much if they got questions right.  He wanted to see if they got interested / excited when he introduced them to new approaches, and what sort of questions they asked.

 

When he interviewed me, he asked whether I preferred to use positional parameters or keyword parameters in my macros, and why.  My answer was acceptable enough to show that I knew what a macro parameter was, but it wasn't great.  When he explained his preferences and reasoning, I was interested / excited enough to demonstrate that I wanted to learn.

yabwon
Onyx | Level 15

Great approach! Working with such a manager must have been a pleasure.

Bart

 

P.S. Side note: I can see, in my mind's eye, how this thread would bloom into a SAS-L discussion 🙂

( @DonH @hashman @data_null__ @mkeintz @RichardDeVen @rogerjdeangelis )




yabwon
Onyx | Level 15

If "correct" means "gives the same result as other (more popular) approaches", then I would say yes, it is a correct method. But I dare say it is quite "nonstandard".

I bet $5 that in 90% of cases you would instead see a subsetting IF or a WHERE statement:

 

data d2;
  set sashelp.class;
  where age in (16,15,11);
  put _ALL_;
run;

data d3;
  set sashelp.class;
  if age in (16,15,11);
  put _ALL_;
run;

 

 

The result of your code is "somewhere between" those two classic approaches: in the log you will see that 19 observations were read (as with the subsetting IF), but the maximum value of _N_ is 7 (as with WHERE):

 

 

1    data d;
2      do until(age in (16,15,11));
3        set sashelp.class;
4      end;
5      put _ALL_;
6    run;

age=15 Name=Janet Sex=F Height=62.5 Weight=112.5 _ERROR_=0 _N_=1
age=11 Name=Joyce Sex=F Height=51.3 Weight=50.5 _ERROR_=0 _N_=2
age=15 Name=Mary Sex=F Height=66.5 Weight=112 _ERROR_=0 _N_=3
age=16 Name=Philip Sex=M Height=72 Weight=150 _ERROR_=0 _N_=4
age=15 Name=Ronald Sex=M Height=67 Weight=133 _ERROR_=0 _N_=5
age=11 Name=Thomas Sex=M Height=57.5 Weight=85 _ERROR_=0 _N_=6
age=15 Name=William Sex=M Height=66.5 Weight=112 _ERROR_=0 _N_=7
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.D has 7 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds


7
8    data d2;
9      set sashelp.class;
10     where age in (16,15,11);
11     put _ALL_;
12   run;

Name=Janet Sex=F Age=15 Height=62.5 Weight=112.5 _ERROR_=0 _N_=1
Name=Joyce Sex=F Age=11 Height=51.3 Weight=50.5 _ERROR_=0 _N_=2
Name=Mary Sex=F Age=15 Height=66.5 Weight=112 _ERROR_=0 _N_=3
Name=Philip Sex=M Age=16 Height=72 Weight=150 _ERROR_=0 _N_=4
Name=Ronald Sex=M Age=15 Height=67 Weight=133 _ERROR_=0 _N_=5
Name=Thomas Sex=M Age=11 Height=57.5 Weight=85 _ERROR_=0 _N_=6
Name=William Sex=M Age=15 Height=66.5 Weight=112 _ERROR_=0 _N_=7
NOTE: There were 7 observations read from the data set SASHELP.CLASS.
      WHERE age in (11, 15, 16);
NOTE: The data set WORK.D2 has 7 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


13
14   data d3;
15     set sashelp.class;
16     if age in (16,15,11);
17     put _ALL_;
18   run;

Name=Janet Sex=F Age=15 Height=62.5 Weight=112.5 _ERROR_=0 _N_=8
Name=Joyce Sex=F Age=11 Height=51.3 Weight=50.5 _ERROR_=0 _N_=11
Name=Mary Sex=F Age=15 Height=66.5 Weight=112 _ERROR_=0 _N_=14
Name=Philip Sex=M Age=16 Height=72 Weight=150 _ERROR_=0 _N_=15
Name=Ronald Sex=M Age=15 Height=67 Weight=133 _ERROR_=0 _N_=17
Name=Thomas Sex=M Age=11 Height=57.5 Weight=85 _ERROR_=0 _N_=18
Name=William Sex=M Age=15 Height=66.5 Weight=112 _ERROR_=0 _N_=19
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.D3 has 7 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

 

 

My question/comment is: interesting approach, but what are its advantages?

The first thing that came to my mind: with this approach we can calculate the distance between the listed values in the data set (and, for the first one, from the beginning of the data set):

 

data test;
  input x;
cards;
1
2
3
4
5
6
7
8
9
;
run;

data d;
  DISTANCE_BETWEEN=0;
  do until(x in (2,6,9));
    set test;
    DISTANCE_BETWEEN + 1;
  end;
  put _ALL_;
run;

Log:

 

DISTANCE_BETWEEN=2 x=2 _ERROR_=0 _N_=1
DISTANCE_BETWEEN=4 x=6 _ERROR_=0 _N_=2
DISTANCE_BETWEEN=3 x=9 _ERROR_=0 _N_=3

What can I say, interesting use of conditional DoW-loop!

 

 

Bart



