input clm clmv;
value = first.clmv;
value1 = first.clmv;
My question is: why is there a difference when I write "first." before the set statement and after the set statement? Can anyone explain that?
The correct usage is to put any statements that refer to variables read by the SET statement AFTER the SET statement. Otherwise, those variables are still undefined (missing) when the statement first executes.
In your program, the VALUE1 variable is correct. The VALUE variable is looking at the PREVIOUS observation to determine the value of first.clmv, except for the first time, which sees missing values.
If this is not clear, look at the following simpler example. Do you see how the VALUE variable is missing the first time and then has the value from the PREVIOUS observation? In symbols, VALUE=LAG(VALUE1).
data c;
    value = clm + clmv;
    set a;
    by clmv;
    value1 = clm + clmv;
run;
proc print;
run;
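The example above reads a data set A whose contents are not shown in this thread. As a minimal self-contained sketch, the step can be run against a small made-up input; the three observations below are hypothetical sample values, chosen only so that the data are already sorted by clmv.

```sas
/* Hypothetical sample data, sorted by clmv (not from the original post) */
data a;
    input clm clmv;
    datalines;
1 10
2 10
3 20
;
run;

data c;
    value = clm + clmv;    /* runs BEFORE the read: uses values retained from the previous observation */
    set a;                 /* reads the current observation */
    by clmv;
    value1 = clm + clmv;   /* runs AFTER the read: uses the current observation */
run;

proc print data=c;
run;

/* With the sample data above:
   obs 1: value = .   value1 = 11
   obs 2: value = 11  value1 = 12
   obs 3: value = 12  value1 = 23  */
```

VALUE is missing in the first observation because clm and clmv have not been read yet, and afterwards it always trails VALUE1 by one observation, because variables read by SET keep their values until the next read.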
The SET statement has a double nature:
And it is the "read" that populates dependent variables such as your first.clmv.
The values of variable value1 are explained in section "How SAS Identifies the Beginning and End of a BY Group" of the documentation of the BY statement:
SAS sets the value of FIRST.variable to 1 when it reads the first observation in a BY group
and in subsection "How SAS Determines FIRST.variable and LAST.variable" of the section "FIRST. and LAST. DATA Step Variables" in the documentation "BY-Group Processing in the DATA Step":
- For all other observations in the BY group, the value of FIRST.variable is 0.
It is the SET statement that reads the observations, so first.clmv is updated for the current observation only after the SET statement has executed.
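To watch this happen, the two assignments from the question can be written to the log for every observation. This is only a sketch: the input values are hypothetical (the original data are not shown in this thread), and the step is assumed to have the same statement order as the one in the question.

```sas
/* Hypothetical sample data, sorted by clmv (not from the original post) */
data a;
    input clm clmv;
    datalines;
1 10
2 10
3 20
;
run;

data b;
    value = first.clmv;    /* evaluated before the SET statement has read the current observation */
    set a;
    by clmv;
    value1 = first.clmv;   /* evaluated after the read: 1 at the start of each BY group, else 0 */
    put _n_= clmv= value= value1=;   /* one log line per observation */
run;
```

With these values, VALUE1 should be 1, 0, 1 (a new BY group starts at the first and third observations). From the second observation on, VALUE repeats the VALUE1 of the previous observation, since first.clmv has not yet been refreshed when that assignment runs.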
The values of variable value are explained by the facts that variables first.variable and last.variable are
(which seem to be not clearly stated in the documentation linked above).