Hi,
Is it normal for SAS to set the initial value of the first. and last. variables to 1?
Here is an example.
data temp;
store_id=101;
sale_eff_date = '01may2015'd;
sale_end_date = '15may2015'd;
total_amount = 100;
output;
store_id=101;
sale_eff_date = '17may2015'd;
sale_end_date = '30may2015'd;
total_amount = 200;
output;
format sale_eff_date sale_end_date mmddyy10.;
run;
data temp1;
put "before exec " _all_;
set temp;
by store_id;
put "after exec " _all_;
run;
before exec store_id=. sale_eff_date=. sale_end_date=. total_amount=. FIRST.store_id=1 LAST.store_id=1 _ERROR_=0 _N_=1
after exec store_id=101 sale_eff_date=05/01/2015 sale_end_date=05/15/2015 total_amount=100 FIRST.store_id=1 LAST.store_id=0 _ERROR_=0 _N_=1
Why are first.store_id and last.store_id set to 1 before execution? Shouldn't they be set to missing? Why does SAS set the initial value of the first. and last. variables to 1?
Thanks
Here's a fun experiment to try ...
/* create empty dataset */
data temp2;
var1 = 0;
delete;
run;
data temp3;
put "before exec " _all_;
set temp2;
by var1;
put "after exec " _all_;
run;
Log output:
18   /* create empty dataset */
19   data temp2;
20   var1 = 0;
21   delete;
22   run;
NOTE: The data set WORK.TEMP2 has 0 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
23
24   data temp3;
25   put "before exec " _all_;
26   set temp2;
27   by var1;
28   put "after exec " _all_;
29   run;
before exec var1=. FIRST.var1=1 LAST.var1=1 _ERROR_=0 _N_=1
NOTE: There were 0 observations read from the data set WORK.TEMP2.
NOTE: The data set WORK.TEMP3 has 0 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
FIRST. and LAST. variables are flags and Boolean by nature. They should only ever hold two values (true or false, i.e. 1 or 0).
Your PUT statement comes before the SET statement, so I would assume what you see is how SAS initializes these temporary variables.
It's interesting to know which values SAS assigns during initialization, but I can't think of any use case where this really matters or would cause issues.
It matters when using them in a loop.
data temp;
store_id=101;
sale_eff_date = '01may2015'd;
sale_end_date = '15may2015'd;
total_amount = 100;
output;
store_id=101;
sale_eff_date = '17may2015'd;
sale_end_date = '30may2015'd;
total_amount = 200;
output;
store_id=102;
sale_eff_date = '17may2015'd;
sale_end_date = '30may2015'd;
total_amount = 200;
output;
format sale_eff_date sale_end_date mmddyy10.;
run;
data temp1;
put "before exec " _all_;
do until(last.store_id);
set temp;
by store_id;
total = sum(total,total_amount);
first_sd=first.store_id;
last_id=last.store_id;
output;
put "after exec " _all_;
end;
run;
data temp2;
put "before exec " _all_;
do while(last.store_id);
set temp;
by store_id;
total = sum(total,total_amount);
first_sd=first.store_id;
last_id=last.store_id;
output;
put "after exec " _all_;
end;
run;
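For comparison, here is a minimal sketch of the DO UNTIL pattern that writes one row per store instead of one row per observation. DO UNTIL tests its condition at the bottom of the loop, so the body always runs at least once and the initial value of last.store_id does not affect it. The data set name totals is made up for illustration; temp is the three-row data set created above.

```sas
data totals;
  /* total is not retained, so it is reset to missing at the top
     of each DATA step iteration, i.e. once per store_id group */
  do until (last.store_id);
    set temp;
    by store_id;
    total = sum(total, total_amount);
  end;
  output;                /* one row per store_id, after the group is read */
  keep store_id total;
run;
```

With the temp data above this should give a total of 300 for store 101 and 200 for store 102.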
It may matter then, but the real question is why you are doing it that way in the first place. To get the end result, this is how I would code it, with a RETAIN statement:
data temp1;
set temp;
by store_id;
retain total;
if first.store_id then total=0;
total=sum(total,total_amount);
run;
Hi, you could also use a SUM statement and make use of the automatic retain of variables created in that statement (plus initialization at zero, not missing). The following makes use of the value of FIRST.STORE_ID ...
data temp;
input store_id total_amount @@;
datalines;
1 10 1 30 1 50 2 40 2 70 2 90 3 10 3 20
;
data temp1;
set temp;
by store_id;
total + total_amount - (first.store_id * total);
run;
data set TEMP1 ...
                   total_
Obs    store_id    amount    total
 1         1         10        10
 2         1         30        40
 3         1         50        90
 4         2         40        40
 5         2         70       110
 6         2         90       200
 7         3         10        10
 8         3         20        30
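If the arithmetic with FIRST.STORE_ID reads as too terse, the same running total can be written with an explicit reset. This is only a sketch of an equivalent form, and the data set name temp1_alt is made up for illustration. It still relies on the SUM statement retaining total automatically.

```sas
data temp1_alt;
  set temp;
  by store_id;
  /* start each store_id group from zero */
  if first.store_id then total = 0;
  /* SUM statement: total is retained across observations */
  total + total_amount;
run;
```

The reset has to come before the SUM statement, otherwise the first amount in each group would be lost.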
Tips:Between and Within Group Counters
http://www.sascommunity.org/wiki/Tips:Between_and_Within_Group_Counters