Solved: Re: Program step count

DavidBrown · Posted 03-29-2020 11:28 AM

Working through the SAS Certified Specialist Guide. The book says the last PROC PRINT is the 3rd and last step. From reading the textbook, I thought this would have 6 steps, with each RUN; being counted as a step. The explanation in the textbook doesn't make sense to me. Can anyone please explain why this code example only has 3 steps?

data user.tables; 
  set work.jobs;run;  
proc sort data=user.tables;  
  by name;  run;  
proc print data=user.tables;  
run;

Kurt_Bremser · Posted 03-29-2020 11:48 AM

The run; statement creates a step boundary, unless you have a procedure that supports run-group processing, like DATASETS or DS2.

A data or proc statement also creates a step boundary if a step is "active" (not yet finished). Since you have run statements for every step, the following data or proc statements do not create a new boundary, and you have three steps.

proc print data=sashelp.class;
run;
run;
run;

is a single step, followed by two run statements that do nothing.

A datalines block in a data step also creates a step boundary, as you can't have any data step statements after the datalines block.
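To illustrate the implied boundaries described above, here is a minimal sketch (not from the book). It assumes the SASHELP.CLASS sample table is available; WORK.CLASS is a hypothetical name chosen for this example. The first two steps have no run; statement, so each one is closed by the proc statement that follows it, and the program still has exactly three steps.

```sas
/* Step 1: no RUN; the PROC SORT statement below ends this DATA step */
data work.class;
  set sashelp.class;

/* Step 2: no RUN; the PROC PRINT statement below ends this PROC step */
proc sort data=work.class;
  by name;

/* Step 3: closed by an explicit RUN */
proc print data=work.class;
run;
```

Writing the run; statements explicitly, as in the original question, changes nothing about the step count; it only makes the boundaries visible.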

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

DavidBrown · Posted 03-29-2020 12:57 PM

@Kurt_Bremser Thank you for the explanation. Your breakdown makes sense. However, I was confused by the example below from the text, in which it labels the "run" after "data" as a separate step (Step #3). Why isn't that "run" considered part of the "data" step? In essence, why is this considered to have 4 steps and not 3, when my original post is said to have 3 steps? In the code in my original post, the "data" step is the first step and includes the "run". This seems contradictory to my SAS beginner's mind.

title1 'June Billing';        /*#1*/
data work.junefee;            /*#2*/  
set cert.admitjune;  
where age>39;
run;                          /*#3*/
proc print data=work.junefee; /*#4*/
run;

Kurt_Bremser · Posted 03-29-2020 01:11 PM

From my POV, that is flat wrong. These are two(!) steps and a single Global Statement.
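As a sketch of that reading, the same program can be annotated by what each statement belongs to, rather than by the book's callout numbers. The comments below are an interpretation based on the replies in this thread, not the book's own labels.

```sas
title1 'June Billing';        /* global statement, not a step */
data work.junefee;            /* step 1 begins (DATA step)    */
  set cert.admitjune;
  where age>39;
run;                          /* step 1 ends                  */
proc print data=work.junefee; /* step 2 begins (PROC step)    */
run;                          /* step 2 ends                  */
```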

DavidBrown · Posted 03-29-2020 01:27 PM

@Kurt_Bremser Thank you. I thought I was going crazy. In essence they are saying there is a difference between these 2 steps below, when there isn't.

**Example #1 (one step)**;
data user.tables;   
  set work.jobs;run; 

**Example #2 (two steps)**;
data work.junefee;            
  set cert.admitjune;  
  where age>39;
run;

FreelanceReinh · Posted 03-29-2020 12:58 PM

Actually it's quite simple (just to rephrase Kurt Bremser's reply):

Roughly speaking, a DATA step looks like this

data ...;
...
run;

and a PROC step looks like this

proc ...;
...
run;

or, for a few procedures, like this

proc ...;
...
quit;

So, this simplified ("first-lesson like") statement already provides the answer to your question.

Notable exceptions (not applicable to your example) are

DATA steps with datalines: The RUN statement is optional for them, not required.
DATA or PROC steps whose closing RUN (or QUIT) statement is omitted for brevity because it's implied by the beginning of a new step (i.e., a subsequent DATA or PROC statement).
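A minimal sketch of the datalines exception, assuming made-up names and values (WORK.SCORES, name, score are hypothetical): the DATA step has no run; statement, because the line containing only a semicolon ends the data lines and with them the step. The program therefore has two steps.

```sas
/* Step 1: DATA step with in-stream data; RUN is optional here */
data work.scores;
  input name $ score;
  datalines;
Ann 90
Bob 85
;

/* Step 2: PROC step closed by an explicit RUN */
proc print data=work.scores;
run;
```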

DavidBrown · Posted 03-29-2020 01:06 PM

@FreelanceReinhard Thank you. I'm starting to get it, but I still don't get why, in my second example, "run;" in step 3 is considered an individual step. Why is it not part of the data step? Sorry if I am being obtuse.

FreelanceReinh · Posted 03-29-2020 01:26 PM

@Kurt_Bremser wrote:

From my POV, that is flat wrong. These are two(!) steps and a single Global Statement.

Exactly. The assignment of a separate number to the global TITLE1 statement clearly indicates that these numbers do not count steps.

Also, the fact that an omitted RUN (or QUIT) statement can be "implied" by the beginning of a new step (as I mentioned earlier) does not mean that the combination of a mere RUN (or QUIT) statement and the beginning of a new step constitutes two steps.

DavidBrown · Posted 03-29-2020 01:32 PM

@FreelanceReinhard The text considers the global statement as an "outside" step, but a step nonetheless.
Per the SAS Certified Specialist Prep Guide Chapter 2:
1 The TITLE statement is a global statement. Global statements are typically outside steps and do not require a RUN statement.

Here is their explanation of why my first example has 3 steps. However, from my POV I still don't see why:

When it encounters a DATA, PROC, or RUN statement, SAS stops reading statements and executes the previous step in the program. This program contains one DATA step and two PROC steps, for a total of three program steps.

Kurt_Bremser · Posted 03-29-2020 01:38 PM

A global statement is NEVER a step. See the definition of "step" by SAS themselves:

What are the components of a SAS program?

A SAS program is a sequence of steps that you submit to SAS for execution. Each step in the program performs a specific task. Only two kinds of steps make up SAS programs: DATA steps and PROC steps.

(from https://support.sas.com/software/products/university-edition/faq/programs_components.htm, emphasis by me)

DavidBrown · Posted 03-29-2020 01:49 PM

Thanks for taking the time to answer my question. Initially, my answer to my initial question was that there were 3 steps. But when I referred back to the chapter content to be sure, I saw them label "run;" as its own step. Then they called a Global Statement a step, albeit an "outside step".

Appreciate your input very much.

FreelanceReinh · Posted 03-29-2020 02:00 PM

@DavidBrown wrote:
(...) Global statements are typically outside steps ...

Well, English is not my native language, but I understand this sentence like "Global statements are typically (used) outside of steps." (But for some global statements, e.g., TITLE statements, it's not uncommon to use them inside [of] steps.) There is no such thing as an "outside step" -- only DATA and PROC steps.

DavidBrown · Posted 03-29-2020 02:20 PM

@FreelanceReinhard I think you are correct here and before about the numbering. That example was straight from the text. They were explaining steps, then gave that example. I assumed the numbers were steps, when they were not. The devil was in the details here, which is what I am finding in SAS documentation. It may be technically correct, but it is not very clear. It appears to be written as a reference for somebody who already knows the language, rather than for someone learning it.
