What is the purpose of _N_ syntax in the codes?

The following code is from a textbook:


DATA flowersales;
   INFILE '*****\MyRawData\TropicalFlowers.dat';
   INPUT CustomerID $4. @6 SaleDate MMDDYY10. @17 Variety $9. SaleQuantity SaleAmount;

* Sort the data by SaleAmount in descending order;
PROC SORT DATA = flowersales;
   BY DESCENDING SaleAmount;
RUN;

* Find the biggest order and pass the customer ID to a macro variable;
DATA _NULL_;
   SET flowersales;
   IF _N_ = 1 THEN CALL SYMPUT("selectedcustomer", CustomerID);
   ELSE STOP;
RUN;

PROC PRINT DATA = flowersales;
   WHERE CustomerID = "&selectedcustomer";
   FORMAT SaleDate WORDDATE18. SaleAmount DOLLAR7.;
   TITLE "Customer &selectedcustomer Had the Single Largest Order";
RUN;

 

 

Since I wanted to see the purpose of IF _N_ = 1 THEN in

IF _N_ = 1 THEN CALL SYMPUT( "selectedcustomer",CustomerID);

I changed the statement above to

IF _N_ = 2 THEN CALL SYMPUT( "selectedcustomer",CustomerID);

or

IF _N_ = 9 THEN CALL SYMPUT( "selectedcustomer",CustomerID);

or deleted the IF _N_ = 1 THEN part entirely:

CALL SYMPUT( "selectedcustomer",CustomerID);

All four versions produce the same result (see the attachment, which also shows the dataset), so it seems that IF _N_ = 1 is not necessary.

Could anyone tell me whether that is true? If not, what is the actual purpose of IF _N_ = 1 in this situation?

 

 

 

 

 

 


sss.png

Accepted Solutions
Solution
‎09-07-2017 02:00 PM

Re: What is the purpose of _N_ syntax in the codes?


Editor's Note:  Thanks also to @mohamed_zaki for his additional insight into the issue.

 

 

 

The 

if _n_ = 1 then call symput(); else stop;

construct makes the data step create the macro variable in the first iteration of the data step (which happens to be when the first record of the dataset in the set; statement is read), and in the second iteration (where _n_ = 2) the execution of the data step stops. Basically, this means "get me the first record of the data set".
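To watch the construct at work, here is a minimal sketch that writes each iteration to the log. It uses sashelp.class instead of flowersales so it runs without the textbook file. The macro variable name demo_first is made up for this illustration, and CALL SYMPUTX is used in place of CALL SYMPUT only because it trims trailing blanks.

```sas
data _null_;
   set sashelp.class;
   /* Log the iteration counter and the record just read */
   put _n_= Name=;
   /* Iteration 1: store the value. Iteration 2: end the step. */
   if _n_ = 1 then call symputx("demo_first", Name);   /* demo_first is a hypothetical name */
   else stop;
run;

%put demo_first=&demo_first;
```

The log should show two PUT lines: the second record is read before STOP ends the step, but it never reaches CALL SYMPUTX.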

Another method would have been

data _null_ ;                   
set flowersales (obs=1);  
call symput( "selectedcustomer",CustomerID);                       
run;

as this makes the data step read only the first observation of flowersales, and no explicit stop is needed.

 

Now, in your original example, the following may have happened:

- you ran the original code, which set &selectedcustomer to the intended value

- you changed the number (2, then 9) in the if _n_ = condition

- you ran the code, but it entered the else stop; branch in the first iteration (_n_ being 1) and did not set &selectedcustomer at all

- since &selectedcustomer was already set from the first run, nothing seemed to change
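One way to check this explanation is to delete the macro variable before rerunning the modified step and then test whether it exists. A minimal sketch, again on sashelp.class with a hypothetical macro variable name demo_first:

```sas
/* Remove any value left over from an earlier run */
%symdel demo_first / nowarn;

data _null_;
   set sashelp.class;
   /* With _n_ = 2 the first iteration goes straight to STOP */
   if _n_ = 2 then call symputx("demo_first", Name);
   else stop;
run;

/* %symexist returns 1 if the macro variable exists, 0 otherwise */
%put exists=%symexist(demo_first);
```

If the step stops in its first iteration as described, the macro variable is never created and the check should report 0.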


All Replies

Re: What is the purpose of _N_ syntax in the codes?

That's not right...

As it's being used here, _N_ acts as a record counter (strictly speaking, it counts DATA step iterations), so the step gets the first record in the data set and assigns its CustomerID to a macro variable.

First the data is sorted in descending order so that the largest value is at the top of the data set, and then the step gets the customer ID of the customer who had the highest sale. The macro variable is then used in the next step to filter the report and to include the ID in the title.

 

 

data _null_;
set sashelp.class;

if _n_=1 then put "Record 1";
else if _n_=2 then put "Record 2";

else if _n_=12 then put "Record 12";
run;

 

 

 


Re: What is the purpose of _N_ syntax in the codes?

Thank you @Reeza. I am still a bit confused. I don't understand which statement selects all the observations where CustomerID = 356W. Could you tell me what to change if I wanted to select the observations where CustomerID = 240W? Maybe that will make it clearer.

Re: What is the purpose of _N_ syntax in the codes?


If you're looking for the usage of a variable, look for where it occurs in your text. 

 

In this case, search and see where the macro variable, selectedCustomer, is being used later on in the report.
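For the 240W question, one option is to assign the macro variable directly with %LET instead of deriving it in a DATA _NULL_ step, since the WHERE statement only cares about its value. A minimal sketch, assuming a small hypothetical dataset demo_sales with made-up rows rather than the textbook file:

```sas
/* Made-up rows for illustration only; not the textbook data */
data demo_sales;
   input CustomerID $ SaleAmount;
   datalines;
240W 100
356W 250
240W 75
;
run;

/* Set the macro variable by hand instead of reading it from the data */
%let selectedcustomer = 240W;

proc print data = demo_sales;
   /* The macro variable resolves inside double quotes */
   where CustomerID = "&selectedcustomer";
   title "Orders for Customer &selectedcustomer";
run;

title;   /* clear the title afterwards */
```

The double quotes matter: a macro variable reference inside single quotes is not resolved.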


Re: What is the purpose of _N_ syntax in the codes?

Thanks for the explanation. Now it is clear.

Re: What is the purpose of _N_ syntax in the codes?

This all comes down to the statement

ELSE STOP ;

Try deleting it, and you will get different results depending on the _N_ value you test, following the sorted order.

 

What happened is that you first ran the code as written, so the macro variable got the CustomerID of the first observation in sort order.

Then you changed the code, but each time the STOP executed in the first iteration, before CALL SYMPUT could assign a new value.

 

This is not a problem with the code. The code does exactly what it is meant to do:

1- Get the value from the first observation

2- Terminate in the second iteration, since there is no point in any further iterations.

 

Hope that helps.
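The effect of removing ELSE STOP can be sketched as follows, using sashelp.class and a hypothetical macro variable name demo_second. Without STOP the step reads every observation, so the IF condition is eventually met and the value comes from the record at that position.

```sas
%symdel demo_second / nowarn;   /* start from a clean state */

data _null_;
   set sashelp.class;
   /* No ELSE STOP: iteration 1 does nothing, iteration 2 sets the value */
   if _n_ = 2 then call symputx("demo_second", Name);
run;

%put demo_second=&demo_second;
```

Changing the 2 to another position within the data set should change the stored value accordingly, which is the behavior the original poster expected to see.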


Re: What is the purpose of _N_ syntax in the codes?

Posted in reply to mohamed_zaki
Thank you! It helps!