GurmeetKaur_23
Calcite | Level 5

Hi All,

I was brushing up on my SAS concepts and wrote a code snippet that has confused me. I would like to understand how the output of the snippet below is produced and what is happening in the background.

 

data test;
if age <14;
set sashelp.class;
run;

proc print;
run;

Now, if I run this code snippet, it gives me the first obs from the sashelp.class dataset, and the log window says:
"DATA step stopped due to looping". The log also prints the first observation, with _ERROR_=0.
How? What is happening in the background? Kindly help.

1 ACCEPTED SOLUTION

Astounding
PROC Star

There are several DATA step principles at work here.

 

The statements within the DATA step execute in order.  They execute many times, not just once.

 

Any variable that comes from a SAS data set is automatically retained.

 

Most DATA steps end by having the SET statement fail, because there are no more observations to read.
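As a rough illustration of these three principles (not part of the original reply), the sketch below limits the read to two observations with the OBS= data set option and prints the values at the top and bottom of each iteration. The variable name flag is made up for the example.

```sas
/* Illustrative sketch: flag is a hypothetical variable, not from the thread. */
data _null_;
    /* Runs at the start of every iteration, before anything is read. */
    put "top of loop: " _n_= age= flag=;

    /* Read at most two observations so the log stays short. */
    set sashelp.class(obs=2);

    /* flag comes from an assignment, not from a data set. */
    flag = 1;

    put "bottom of loop: " _n_= age= flag=;
run;
```

If the principles hold as described, the log should show three "top of loop" lines. AGE is missing on the first pass and carries the previous observation's value afterwards, while flag, which is created by an assignment, is reset to missing at the top of every pass. The third pass ends when SET finds no more observations to read.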

 

If we apply those ideas to your DATA step ...

 

The first statement to execute checks AGE.  At that point, no data has been read in, so AGE has a missing value.  Missing is less than 14, so the SET statement executes and reads in the first observation from SASHELP.CLASS.
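The comparison with a missing value can be checked on its own. This is a minimal sketch, separate from the original step, that assigns a missing value to AGE by hand; the message texts are made up for the example.

```sas
/* Illustrative check: how a missing numeric value behaves in a comparison. */
data _null_;
    age = .;

    /* A missing numeric sorts below every number, so this branch is taken. */
    if age < 14 then put "missing is treated as less than 14";
    else put "missing is not treated as less than 14";

    /* One possible guard for ordinary comparisons: also require a non-missing value. */
    if not missing(age) and age < 14 then put "non-missing and under 14";
    else put "guard rejected the missing value";
run;
```

Because this step has no SET statement, it executes once and stops. The guard is shown only as a general habit for comparisons; it is not a fix for the step in the question.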

 

That first observation gets output, and the DATA step continues. 

 

The IF statement executes for a second time.  At that point, AGE is no longer missing: whatever value was read from the first observation in SASHELP.CLASS is automatically retained.  That value is 14, which is not less than 14.

 

So the subsetting IF deletes the observation, and the SET statement never executes.

 

Now SAS gets worried.  Your DATA step contains a SET statement, yet the programming logic caused SAS to reach the end of the programming statements without actually executing the SET statement.  It is conceivable that this situation would continue, and the SET statement would never execute and the DATA step would never end.  To protect against the possibility that this DATA step would never end, SAS ends it for you with the message about possible looping.
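For comparison, here are two common ways to write the subset so that the SET statement executes on every iteration. This is a sketch only; the output data set names test_if and test_where are made up for the example.

```sas
/* Option 1: read first, then subset. The subsetting IF comes after SET. */
data test_if;
    set sashelp.class;
    if age < 14;
run;

/* Option 2: filter while reading, using the WHERE= data set option. */
data test_where;
    set sashelp.class(where=(age < 14));
run;

proc print data=test_where;
run;
```

In both steps the SET statement runs on every pass, so the step ends the usual way, when SET finds no more observations, and no looping note should appear.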

AlanC
Barite | Level 11

Try this:

 

data test;
    set sashelp.class;
    where age>14;
run;

proc print data=test;
run;

https://github.com/savian-net
GurmeetKaur_23
Calcite | Level 5

Hey Alan, thanks for the reply. I did try your code, and I also tried this:

data test;
set sashelp.class;
if age < 14;
run;

proc print;
run;

That also gives me the proper output. But my question is: why does the code snippet I originally posted work at all? Why does it not give an error? What is happening in the background that makes it return the first obs of the sashelp.class dataset?

 

Please help

AlanC
Barite | Level 11

I don't have SAS currently so I cannot test. When you put that IF statement at the beginning, it initialized the PDV with the AGE variable as a numeric. Once the condition was met, with the first obs, it stopped. You essentially said, 1 pass, not all obs.

The dataset should always be specified first, unless it is preceded by one of the modifiers (such as ATTRIB), so that the PDV is set correctly. The PDV determines everything. See Don Henderson/Merry Rabb's seminal paper on it:

 

http://www.lexjansen.com/nesug/nesug88/sas_supervisor.pdf
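One visible effect of statement order on the PDV can be checked with PROC CONTENTS. The sketch below reuses the step from the question and lists the variables of TEST in creation order via the VARNUM option. It assumes that variables enter the PDV in the order the compiler first meets them.

```sas
/* The step from the question: AGE is referenced before SET. */
data test;
    if age < 14;
    set sashelp.class;
run;

/* VARNUM lists variables by their position in the data set. */
proc contents data=test varnum;
run;
```

Under that assumption, AGE should be listed as variable 1, ahead of the variables that SET contributes. The step still stops with the looping note, as in the original question.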

 

 

https://github.com/savian-net
GurmeetKaur_23
Calcite | Level 5

Hey Alan,

 

I agree that the IF statement initialised the variable AGE with a missing value, BUT the first obs it prints is the record with AGE = 14. That is not less than 14, and then the step stops.

I am confused at this point and want to understand the logic behind it. No problem if you do not have SAS right now. Once you have some time to look into this, kindly share your findings on this program. As I understand it, it is very strange that the step executes without any error and outputs the very first record from sashelp.class, which has AGE=14.

 

Thanks in advance!!!

AlanC
Barite | Level 11

Ok, think through what is happening here. SAS will read, check the condition after the read, not before. 

 

On the 1st observation in the sashelp.class dataset, it reads in the record, does an automatic output, then terminates because the record has now been read. The output still happened since the output implicitly happens at the run statement. Try it with varying values of the age and you will see that it fails the condition right after it outputs the first record that failed it.

 

You are essentially doing an until loop.

https://github.com/savian-net
Reeza
Super User

@GurmeetKaur_23 You asked the question here, and there is a fairly thorough answer. 

https://stackoverflow.com/questions/45211790/how-to-this-dataset-is-working-in-the-background

 

Is there something there that doesn't make sense to you?

 

Excellent question! It's important to learn how the DATA step works, and part of that is to know when it stops.

The typical way a DATA step stops is the SET statement tries to read the next record in a dataset and hits the end of the file.

Another way a step will stop is if it has a SET statement in it, and it goes one full iteration of the DATA step loop without a SET statement executing. When it stops for this reason, you get the "stopped due to looping" message. It's basically protection against an infinite loop.

Look at your code, with some PUT statements added:

data test;
put "top of loop " _n_= age=;
if age<14;
set sashelp.class;
put "bottom of loop " _n_= age=;
run;

top of loop _N_=1 age=.
bottom of loop _N_=1 age=14
top of loop _N_=2 age=14
NOTE: DATA STEP stopped due to looping.
NOTE: There were 1 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.TEST has 1 observations and 5 variables.

At the top of the first iteration of the loop, age=., so `if age<14` is true. The SET statement executes and the first record is read. At the bottom of the loop age=14.

At the top of the second iteration of the loop (`_n_=2`), age=14 because it is automatically retained. The subsetting IF statement is false. Control goes to the bottom of the loop. The DATA step sees that during the second iteration of the loop, no records were read. It stops, with the note that it stopped "due to looping."

If you change your subsetting IF to be AFTER the SET statement, the step will not stop due to looping, because on every iteration of the DATA step loop a record will be read.

data test;
put "top of loop " _n_= age=;
set sashelp.class;
if age<14;
put "bottom of loop " _n_= age=;
run;

top of loop _N_=1 age=.
top of loop _N_=2 age=14
bottom of loop _N_=2 age=13
top of loop _N_=3 age=13
bottom of loop _N_=3 age=13
top of loop _N_=4 age=13
top of loop _N_=5 age=14
top of loop _N_=6 age=14
bottom of loop _N_=6 age=12
top of loop _N_=7 age=12
bottom of loop _N_=7 age=12
top of loop _N_=8 age=12
top of loop _N_=9 age=15
bottom of loop _N_=9 age=13
top of loop _N_=10 age=13
bottom of loop _N_=10 age=12
top of loop _N_=11 age=12
bottom of loop _N_=11 age=11
top of loop _N_=12 age=11
top of loop _N_=13 age=14
bottom of loop _N_=13 age=12
top of loop _N_=14 age=12
top of loop _N_=15 age=15
top of loop _N_=16 age=16
bottom of loop _N_=16 age=12
top of loop _N_=17 age=12
top of loop _N_=18 age=15
bottom of loop _N_=18 age=11
top of loop _N_=19 age=11
top of loop _N_=20 age=15
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.TEST has 10 observations and 5 variables.

 

GurmeetKaur_23
Calcite | Level 5

Hey There,

 

Yes, I asked the question there on Stack Overflow as well. But don't you think there could be some reason why I posted it again here? It might be that I had some more queries, or that nobody had posted an answer on Stack Overflow when I posted my query here. There must have been a reason; otherwise why would I have put the same question here? No?

I think one should ask for the reason before pointing this out on such a public forum.

 

 

Reeza
Super User

@GurmeetKaur_23 wrote:

Hey There,

 

Yes, I asked the question there on Stack Overflow as well. But don't you think there could be some reason why I posted it again here? It might be that I had some more queries, or that nobody had posted an answer on Stack Overflow when I posted my query here. There must have been a reason; otherwise why would I have put the same question here? No?

I think one should ask for the reason before pointing this out on such a public forum.

 

 


That's exactly what I did: I asked the reason.

Is there something there that doesn't make sense to you?

 

@GurmeetKaur_23 Considering you hadn't asked for any clarification or acknowledged the answer on the other post, where someone took the time to answer your question, I'm fine with pointing out that you posted it elsewhere and asking why that post did not meet your needs or what was unclear.

 

 

GurmeetKaur_23
Calcite | Level 5

But anyway, thanks for the answer.


GurmeetKaur_23
Calcite | Level 5

Hey There,

 

Thanks a lot for clearing up the concept :)
