I came across this program:
data pd4;
  set pb3;
  by region product sub_product;
  if first.product then do;
    cum_active=0;
    cum_default=0;
  end;
  cum_active+active;
  cum_default+Default;
  cum_active_percent = cum_active / total_active;
  cum_default_percent = cum_Default / total_Default;
run;
Hi, I read the documentation but couldn't understand the significance of first. and last. being set to 1 or 0 in the program above. Could you kindly explain?
Also, could you explain these statements:
cum_active+active;
cum_default+Default;
Are they assigning values to some variable? Are they Boolean? What are they doing?
if first.product then do;
cum_active=0;
cum_default=0;
end;
This initializes the variables used by the sum statements, resetting them to 0 at the start of each product group.
cum_active+active;
cum_default+Default;
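These are sum statements, not Boolean expressions. Each one adds the value on the right to the variable on the left, and that variable keeps its value from one observation to the next, so it holds a running total within the group.
For more information on the sum statement, see: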
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/n1dfiqj146yi2cn1maeju9wo7ijs.htm
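As a minimal sketch of what the sum statement does, the step below computes the same running total twice: once with the sum statement and once written out with RETAIN and the SUM function. The data set names (have, want), the toy values, and cum_active2 are made up for illustration and are not from the original program.

```sas
/* Hypothetical toy data, already sorted by product */
data have;
  input product $ active;
  datalines;
A 10
A 20
A 5
B 7
B 3
;
run;

data want;
  set have;
  by product;

  /* Reset the running total on the first row of each product */
  if first.product then cum_active = 0;

  /* Sum statement: adds active to cum_active, keeps cum_active
     across observations, and ignores a missing value of active */
  cum_active + active;

  /* Roughly the same logic written out longhand (cum_active2 is
     a hypothetical name used only for comparison) */
  retain cum_active2;
  if first.product then cum_active2 = 0;
  cum_active2 = sum(cum_active2, active);
run;

proc print data=want;
run;
```

With this input, cum_active and cum_active2 should both read 10, 30, 35 for product A and 7, 10 for product B.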
The FIRST. and LAST. DATA step variables are described here:
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lrcon/n01a08zkzy5igbn173zjz82zsi1s.htm
I also have a couple of requests.
Please do not mix your question text into the code block, as it makes the post difficult to read.
And please mark questions you have already posted as resolved once they have been answered.
For information on using the FIRST and LAST indicator variables in a BY-group analysis, see "How to use FIRST.variable and LAST.variable in a BY-group analysis in SAS."
The first. and last. automatic variables generated by the BY statement are just indicators of how the current observation relates to the preceding and following observations. When first.x=1, the current obs has a different value for x than the previous obs (or it is the very first obs); otherwise first.x=0. Similarly, last.x=1 means the current value of x differs from the next obs (i.e. SAS does a look-ahead for you) or that this is the final obs. It is possible for a record to have both first.x=1 and last.x=1 (the current obs is a singleton) or both first.x=0 and last.x=0 (the current obs is in the middle of a run of constant x values).
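To see these flags directly, here is a small sketch. Because first.x and last.x are temporary and are not written to the output data set, the step copies them into ordinary variables. The data set names and the names first_x and last_x are hypothetical.

```sas
/* Hypothetical toy data, already sorted by x */
data have;
  input x $;
  datalines;
A
A
A
B
C
C
;
run;

data flags;
  set have;
  by x;
  /* Copy the temporary indicators into regular variables
     so they show up in the output */
  first_x = first.x;
  last_x  = last.x;
run;

proc print data=flags;
run;
```

The three A rows should come out as (1,0), (0,0), (0,1), the single B row as (1,1), and the two C rows as (1,0), (0,1), which covers the start, middle, end, and singleton cases described above.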
When you have BY a b c d, there will be multiple first. and multiple last. automatic variables. The (slightly) non-intuitive behavior is that if, in the middle of a series of constant b values, you have a change in a, both first.a and first.b are set to 1. In other words, whenever a first. variable becomes 1, all the first. variables for BY variables to its right also become 1. The same holds for the set of last. variables.
So let me revise my first statement: first.x=1 means that the current x differs from the preceding x, or that the current value of some earlier variable in the BY list differs from its predecessor.
Edited to add: the above refers to use of the BY statement in a DATA step.
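A short sketch of the multi-variable case, using two BY variables and made-up data. The data set names and the f_ and l_ variable names are hypothetical.

```sas
/* Hypothetical toy data, already sorted by a then b */
data have2;
  input a b $;
  datalines;
1 X
1 X
2 X
2 Y
;
run;

data flags2;
  set have2;
  by a b;
  /* Copy the temporary indicators so they can be printed */
  f_a = first.a;  l_a = last.a;
  f_b = first.b;  l_b = last.b;
run;

proc print data=flags2;
run;
```

The third row (a=2, b=X) is the interesting one: b has the same value as on the row before, yet f_b should still be 1 because a changed. Likewise the second row should show l_b=1 even though the next row also has b=X.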