what is the purpose of if _N_=1 then do in this data step?

Frequent Contributor
Posts: 133

In the hash table chapter of the advanced certification prep book, there is the following DATA step.

Can someone help me understand what exactly "if _N_=1 then do" does?

I really don't see the need of using this statement.

data work.difference (drop= goalamount);
     length goalamount 8;
     if _N_ = 1 then do;
          declare hash goal( );
          goal.definekey("QtrNum");
          goal.definedata("GoalAmount");
          goal.definedone( );
          call missing(qtrnum, goalamount);
          goal.add(key:'qtr1', data:10 );
          goal.add(key:'qtr2', data:15 );
          goal.add(key:'qtr3', data: 5 );
          goal.add(key:'qtr4', data:15 );
     end;
     set sasuser.contrib;
     goal.find();
     Diff = amount - goalamount;
run;

Frequent Contributor
Posts: 138

Hi ,

This says that on the first iteration of the DATA step (when the first observation is being read), the hash object is declared and its properties are set, including the key variables and data variables.

Frequent Contributor
Posts: 133

Can I remove this line? What effect would removing it have?

Super User
Posts: 9,681

You must keep it. If you remove it:

At every data loop, SAS will re-build this hash table; this is not what you need.

Ksharp

Respected Advisor
Posts: 3,124

Try removing it and then compare your logs; you will see what Ksharp means.

Haikuo

Frequent Contributor
Posts: 133

Ksharp, you said: "At every data loop, SAS will re-build this hash table"

I do not understand, where does the loop come from?

I removed the line, nothing happened, I still do not see the magic of this line here.

Respected Advisor
Posts: 3,124

Ok, notice this line in your code:

set sasuser.contrib;

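The number of iterations is the total number of obs in 'sasuser.contrib' plus 1. It comes from the implicit loop of the DATA step, which is driven by 'set': the step runs once per observation, and the final extra pass ends when 'set' tries to read past the last observation. Unless you stop (abort) or skip the loop somewhere in your downstream code, it will be n+1, n being the number of obs in 'sasuser.contrib'.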
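A minimal sketch to see the n+1 count in the log. The data set work.demo and the variable x are made up for illustration; they are not from the book.

```sas
/* Hypothetical three-row input; work.demo and x are illustrative names. */
data work.demo;
   do x = 1 to 3;
      output;
   end;
run;

data _null_;
   put 'Top of iteration ' _N_=;   /* runs 4 times: 3 obs plus 1 */
   set work.demo;                  /* the 4th read hits end of file and ends the step */
   put '   after SET ' x=;         /* runs 3 times, once per observation */
run;
```

The PUT before SET runs once more than the PUT after it, which is the extra pass described above.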

Having said that, it would be somewhat different if you wrap the 'set' statement in a DOW loop, such as:

do until (your conditions);

   set sasuser.contrib;

blah blah;

end;

Then the number of DATA step iterations will be the number of times the DOW loop runs to completion, plus 1.
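As a rough sketch of that DOW pattern, assuming a small made-up input (work.demo, x, and total are illustrative names only):

```sas
/* Hypothetical three-row input; all names here are illustrative. */
data work.demo;
   do x = 1 to 3;
      output;
   end;
run;

data _null_;
   /* The explicit loop reads every row within a single DATA step iteration. */
   do until (last);
      set work.demo end=last;
      total + x;                  /* sum statement: total is retained */
   end;
   put _N_= total=;               /* executes once, while _N_ is still 1 */
run;
```

Here the DOW loop completes once, so the step goes through two iterations: the one shown, and a second that ends as soon as SET tries to read past the end of the file.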

Haikuo

Edit: if you remove the _n_ line, you will NOT see errors if your original code has no errors. You will see a bunch of notes telling you that your hash object has been initiated, then a couple of lines later, initiated again, and again.

PROC Star
Posts: 7,363

Hai.kuo: Not on 9.2!  Thus, it is a good question.  The result looks like it will stay the same but, without the if statement, the processing time will increase dramatically.

Respected Advisor
Posts: 3,124

I see, Art. Probably that is why OP is so confused. Thanks for pointing it out! Learned!

In addition to the increased processing time, without the _n_=1 check the step won't work if the hash object needs to be dynamically modified during the run, because every iteration would start again from a freshly built table.
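One way to picture the dynamic case is a hash that collects keys as the step runs. This is only a sketch under assumed names (work.dups, work.first_seen, seen, x are all hypothetical), not code from the book.

```sas
/* Hypothetical input with repeated keys; all names are illustrative. */
data work.dups;
   input x @@;
   datalines;
1 2 1 3 2
;
run;

data work.first_seen;
   if _N_ = 1 then do;
      declare hash seen();        /* created once, then kept for the whole step */
      seen.definekey('x');
      seen.definedone();
   end;
   set work.dups;
   if seen.check() ne 0 then do;  /* nonzero return: key not in the hash yet */
      rc = seen.add();            /* remember the key for later iterations */
      output;                     /* keep only the first occurrence */
   end;
   drop rc;
run;
```

With the check in place, work.first_seen keeps one row per distinct x. If the declare block ran on every iteration, each row would be checked against a new, empty table, so nothing added earlier would still be there.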

Frequent Contributor
Posts: 101

ZRick wrote:

Ksharp, you said: "At every data loop, SAS will re-build this hash table"

I do not understand, where does the loop come from?

I removed the line, nothing happened, I still do not see the magic of this line here.

ZRick,

You make a comment like this which leads one to believe you do not understand how a data step works. So Art provided a link that exactly shows how a data step loops and where _n_ comes from, but you totally dismissed his help. Take 5 minutes to read the link, then maybe you will understand where the looping occurs and why. From that point, maybe you will get insight into why the hash object only needs to be declared and populated once when _n_ = 1.

Frequent Contributor
Posts: 133

I am still trying to understand this _N_ better, so what does _N_=1 contain?

In addition, under _N_=1, does it only loop once? Is that it?

PROC Star
Posts: 7,363

If you are preparing for the certification exam, you will probably want to read (at least):

http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001290590.htm

Frequent Contributor
Posts: 133

Thank you for pointing me to the interesting link, but I am more focused on understanding the logic behind the code.

Super User
Posts: 6,500

One way to think about how SAS processes a data step is to consider a simple step to calculate a new variable.

data new;

   set old;

   y= x*x ;

run;

Now if there are 100 observations in OLD then SAS must execute the assignment statement that creates Y 100 times.  So the implied loop over all input data is what lets that happen.

This concept is one of the things that makes creating SAS programs so much simpler than the old FORTRAN or PL/I programs we had to use before SAS was developed, or for that matter more modern languages such as Java or Excel.

Super User
Posts: 3,106

_N_ is an automatic SAS counter that can be used to find out how many times the DATA step has looped.

The purpose for it in your example is to only create and load the hash table once, at the start of the first loop through the step. It only needs to be done once.

If you removed this check the hash table would be created and loaded for every record the step is processing!
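A small sketch that counts how often a guarded statement runs compared with an unguarded one. The data set work.demo and the counters guarded and unguarded are made-up names for illustration.

```sas
/* Hypothetical three-row input; all names are illustrative. */
data work.demo;
   do x = 1 to 3;
      output;
   end;
run;

data _null_;
   set work.demo end=last;
   if _N_ = 1 then guarded + 1;   /* runs only on the first iteration */
   unguarded + 1;                 /* runs once for every observation */
   if last then put guarded= unguarded=;
run;
```

With three input rows the log should show guarded=1 and unguarded=3. The hash setup in the original step behaves like the guarded counter when the _N_ check is present and like the unguarded one when it is removed.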
