BookmarkSubscribeRSS Feed
pablocasso
Calcite | Level 5

I have the following dataset. Year, city and variable. The variable changes infrequently. I would like to create another variable (variable_2) that tracks variable. Prior to a change in variable, the value of variable_2 will be 0 and after the change the value of variable_2 changes to 1. Here is the original data.

yearcityvariable
2000Allen5000
2001Allen5000
2002Allen5000
2003Allen5100
2004Allen5100
2005Allen5100
2000Frisco6300
2001Frisco6300
2002Frisco6000
2003Frisco6000
2004Frisco6000
2005Frisco6000

 

After the transformation

yearcityvariableVariable_2
2000Allen50000
2001Allen50000
2002Allen50000
2003Allen51001
2004Allen51001
2005Allen51001
2000Frisco63000
2001Frisco63000
2002Frisco60001
2003Frisco60001
2004Frisco60001
2005Frisco60001
1 REPLY 1
ballardw
Super User

Please provide data in the form of a data step. That way we don't have to make guesses about the actual types of variables.

 

One way, note the data step of what I am guessing are the variable types for your data.

 

data have;
   input year	city $	variable ;
datalines;
2000	Allen	5000
2001	Allen	5000
2002	Allen	5000
2003	Allen	5100
2004	Allen	5100
2005	Allen	5100
2000	Frisco	6300
2001	Frisco	6300
2002	Frisco	6000
2003	Frisco	6000
2004	Frisco	6000
2005	Frisco	6000
;

proc sort data=have;
   by city year ;
run;

data want;
  set have;
  by city year ;
  retain var_2;
  lv=lag(variable);
  if first.city then var_2=0;
  else if variable ne lv then var_2+1;
  drop lv;
run;

I am sorting the data by City and year just in case your data is not in order and to allow use of BY group processing in the follow up data step that creates what you request.

The BY statement creates automatic variables First. and Last. for each variable on the by statement that are 1/0 valued for true/false indicating if the current observation is the first or last of the by group. These values can be used to conditionally execute code such as resetting the value of Var_2 to 0. RETAIN tells SAS to keep the values of the variable across boundaries of the data step so can accumulate values. The LAG function, when used correctly, has the value of the previous record(s) for a variable. So this allows comparing the value of Variable for the current record with the previous to increment Var_2.

Note: if you have more than 1 change this will increment at each change, getting 2, 3, etc.

The lagged value is dropped from the data set as not needed.

 


@pablocasso wrote:

I have the following dataset. Year, city and variable. The variable changes infrequently. I would like to create another variable (variable_2) that tracks variable. Prior to a change in variable, the value of variable_2 will be 0 and after the change the value of variable_2 changes to 1. Here is the original data.

year city variable
2000 Allen 5000
2001 Allen 5000
2002 Allen 5000
2003 Allen 5100
2004 Allen 5100
2005 Allen 5100
2000 Frisco 6300
2001 Frisco 6300
2002 Frisco 6000
2003 Frisco 6000
2004 Frisco 6000
2005 Frisco 6000

 

After the transformation

year city variable Variable_2
2000 Allen 5000 0
2001 Allen 5000 0
2002 Allen 5000 0
2003 Allen 5100 1
2004 Allen 5100 1
2005 Allen 5100 1
2000 Frisco 6300 0
2001 Frisco 6300 0
2002 Frisco 6000 1
2003 Frisco 6000 1
2004 Frisco 6000 1
2005 Frisco 6000 1

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 472 views
  • 0 likes
  • 2 in conversation