01-21-2016 05:39 PM - edited 01-21-2016 05:47 PM
The following code is from a textbook:
DATA flowersales;
   INFILE '*****\MyRawData\TropicalFlowers.dat';
   INPUT CustomerID $4. @6 SaleDate MMDDYY10. @17 Variety $9. SaleQuantity SaleAmount;

* Sorting the data by saleamount in descending order;
PROC SORT DATA = flowersales;
   BY DESCENDING SaleAmount;
RUN;

* Find biggest order and pass the customer id to a macro variable;
DATA _NULL_;
   SET flowersales;
   IF _N_ = 1 THEN CALL SYMPUT("selectedcustomer", CustomerID);
   ELSE STOP;
RUN;

PROC PRINT DATA = flowersales;
   WHERE CustomerID = "&selectedcustomer";
   FORMAT SaleDate WORDDATE18. SaleAmount DOLLAR7.;
   TITLE "Customer &selectedcustomer Had the Single Largest Order";
RUN;
To see what IF _N_ = 1 THEN does in this statement,
IF _N_ = 1 THEN CALL SYMPUT( "selectedcustomer",CustomerID);
I changed it to each of the following:
IF _N_ = 2 THEN CALL SYMPUT( "selectedcustomer",CustomerID);
IF _N_ = 9 THEN CALL SYMPUT( "selectedcustomer",CustomerID);
and also removed the IF _N_ = ... THEN condition altogether:
CALL SYMPUT( "selectedcustomer",CustomerID);
All four versions produce the same result (see the attachment, which also contains the dataset), so it seems that IF _N_ = 1 is not necessary.
Could anyone tell me whether that is true? If not, what is the actual purpose of IF _N_ = 1 in this situation?
01-21-2016 05:53 PM
That's not right...
As it's being used here, _N_ acts as a record counter: the step takes the first record in the data set and assigns its CustomerID to a macro variable.
First the data is sorted in descending order so the largest sale is at the top of the data set, and then the step gets the ID of the customer who placed that largest order. The macro variable is then used in the next step, in the WHERE statement and in the title.
data _null_;
   set sashelp.class;
   if _n_=1 then put "Record 1";
   else if _n_=2 then put "Record 2";
   else if _n_=12 then put "Record 12";
run;
01-21-2016 06:09 PM
01-21-2016 06:14 PM - edited 01-21-2016 06:22 PM
If you're looking for the usage of a variable, look for where it occurs in your text.
In this case, search and see where the macro variable, selectedCustomer, is being used later on in the report.
01-21-2016 05:56 PM - edited 01-21-2016 06:00 PM
All of this happens because of the statement
ELSE STOP ;
Try deleting it, and you will get different results depending on the sort order.
What is happening is that you first ran the code as it was, so the macro variable got the CustomerID of the first observation in the sorted data.
Then you changed the code, but each time the STOP executes before CALL SYMPUT is ever reached, so the macro variable never gets a new value.
This is not a problem with the code; the code does exactly what it is meant to do, which is
1- Get the value from the first observation
2- Then terminate on the second iteration, as there is no point in any further iterations.
Hope that helps.
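To see the effect of removing ELSE STOP, here is a minimal sketch. It assumes sashelp.class as a stand-in for flowersales, and the macro variable name pickedname is made up for illustration. Without the STOP, the step reaches the second iteration and the macro variable should pick up the value from the second record.

```sas
/* Remove any value left over from an earlier run */
/* (pickedname is a hypothetical name used only in this sketch) */
%SYMDEL pickedname / NOWARN;

DATA _NULL_;
    /* sashelp.class stands in for the sorted flowersales data set */
    SET sashelp.class;
    /* No ELSE STOP, so the step continues past the first iteration */
    IF _N_ = 2 THEN CALL SYMPUT("pickedname", Name);
RUN;

/* Write the resolved value to the log */
%PUT pickedname resolves to: &pickedname;
```

Changing the 2 to another record number should change the value written to the log, which is the behaviour the original test was looking for.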
01-22-2016 02:35 AM
The
if _n_ = 1 then call symput(); else stop;
construct makes the data step create the macro variable in its first iteration (which is when the first record of the dataset named in the set statement is read), and in the second iteration (where _n_ = 2) the execution of the data step stops. Basically, this means "get me the first record of the data set".
Another method would have been
data _null_;
   set flowersales (obs=1);
   call symput("selectedcustomer", CustomerID);
run;
as this makes the data step read only the first observation of flowersales, and no explicit stop is needed.
Now, in your original example, the following may have happened:
- you ran the original code, which set &selectedcustomer to the intended value
- you changed the number (2, then 9) in the if _n_ = condition
- you ran the code, but it entered the else stop; branch in the first iteration (_n_ being 1) and did not set &selectedcustomer at all
- since &selectedcustomer was already set from the first run, nothing seemed to change
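One way to check this explanation is to delete the macro variable before re-running the modified step and then ask whether it exists afterwards. This is only a sketch: it assumes sashelp.class and its Name variable as stand-ins for flowersales and CustomerID, so the values differ from the textbook example.

```sas
/* Remove any value left over from an earlier run */
%SYMDEL selectedcustomer / NOWARN;

DATA _NULL_;
    /* sashelp.class and Name stand in for flowersales and CustomerID */
    SET sashelp.class;
    IF _N_ = 2 THEN CALL SYMPUT("selectedcustomer", Name);
    ELSE STOP;   /* taken on the first iteration, where _N_ is 1 */
RUN;

/* %SYMEXIST returns 1 if the macro variable exists and 0 otherwise */
%PUT selectedcustomer exists: %SYMEXIST(selectedcustomer);
```

If the explanation is right, the log should report 0 here, because the step stops before CALL SYMPUT is ever executed. Changing the condition back to _N_ = 1 should make it report 1.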