Please provide data in the form of a data step. That way we don't have to make guesses about the actual types of variables.
One way, note the data step of what I am guessing are the variable types for your data.
data have;
input year city $ variable ;
datalines;
2000 Allen 5000
2001 Allen 5000
2002 Allen 5000
2003 Allen 5100
2004 Allen 5100
2005 Allen 5100
2000 Frisco 6300
2001 Frisco 6300
2002 Frisco 6000
2003 Frisco 6000
2004 Frisco 6000
2005 Frisco 6000
;
proc sort data=have;
by city year ;
run;
data want;
set have;
by city year ;
retain var_2;
lv=lag(variable);
if first.city then var_2=0;
else if variable ne lv then var_2+1;
drop lv;
run;
I am sorting the data by City and year just in case your data is not in order and to allow use of BY group processing in the follow up data step that creates what you request.
The BY statement creates automatic variables First. and Last. for each variable on the by statement that are 1/0 valued for true/false indicating if the current observation is the first or last of the by group. These values can be used to conditionally execute code such as resetting the value of Var_2 to 0. RETAIN tells SAS to keep the values of the variable across boundaries of the data step so can accumulate values. The LAG function, when used correctly, has the value of the previous record(s) for a variable. So this allows comparing the value of Variable for the current record with the previous to increment Var_2.
Note: if you have more than 1 change this will increment at each change, getting 2, 3, etc.
The lagged value is dropped from the data set as not needed.
@pablocasso wrote:
I have the following dataset. Year, city and variable. The variable changes infrequently. I would like to create another variable (variable_2) that tracks variable. Prior to a change in variable, the value of variable_2 will be 0 and after the change the value of variable_2 changes to 1. Here is the original data.
year
city
variable
2000
Allen
5000
2001
Allen
5000
2002
Allen
5000
2003
Allen
5100
2004
Allen
5100
2005
Allen
5100
2000
Frisco
6300
2001
Frisco
6300
2002
Frisco
6000
2003
Frisco
6000
2004
Frisco
6000
2005
Frisco
6000
After the transformation
year
city
variable
Variable_2
2000
Allen
5000
0
2001
Allen
5000
0
2002
Allen
5000
0
2003
Allen
5100
1
2004
Allen
5100
1
2005
Allen
5100
1
2000
Frisco
6300
0
2001
Frisco
6300
0
2002
Frisco
6000
1
2003
Frisco
6000
1
2004
Frisco
6000
1
2005
Frisco
6000
1
... View more