Contributor
Posts: 39

# Lag function in a by group processing?

Hi SAS forum,

I have this dataset, which shows the income of two cities on different dates.

data have;
input City $ 1-5 date income;
cards;
tokyo 20150324 10000000
tokyo 20150412 75000000
tokyo 20150522 65000000
miami 20150225 50000000
miami 20150312 60000000
;
run;

Q: I want to calculate the income difference in each city between two adjoining months.

My attempt is this.

data want;
set have;
lag_income = lag(income);
monthly_income_dif= income - lag_income;
run;

Output table:

| City | date | income | lag_income | monthly_income_dif |
|---|---|---|---|---|
| tokyo | 20150324 | 10000000 | | |
| tokyo | 20150412 | 75000000 | 10000000 | 65000000 |
| tokyo | 20150522 | 65000000 | 75000000 | -10000000 |
| miami | 20150225 | 50000000 | 65000000 | -15000000 |
| miami | 20150312 | 60000000 | 50000000 | 10000000 |

Problem: The value -15000000 in the monthly_income_dif variable is not correct, because it subtracts the last Tokyo income from the first Miami income.

How can I avoid this problem and calculate the income differences only within each city?

Thanks

Mirisa

Accepted Solutions
Solution
‎08-30-2016 01:25 PM
Contributor
Posts: 20

## Re: Lag function in a by group processing?

Hi Mirisa

A quick solution: sort the data by city and date, then set the difference to 0 for the first observation of each city.

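proc sort data = have;
    by city date;
run;
data want;
    set have;
    by city;
    /* LAG runs on every observation, so its queue stays in step with the data */
    lag_income = lag(income);
    if first.city then
        monthly_income_dif = 0;   /* first row of a city has no previous income */
    else
        monthly_income_dif = income - lag_income;
run;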
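One detail worth keeping when adapting this: LAG is called on every observation, outside the IF. LAG only updates its internal queue when it actually executes, so moving the call into the ELSE branch would return stale values. Below is a minimal sketch of a variant, assuming you would rather leave the first difference of each city missing than set it to 0. The dataset name `want_missing` is hypothetical, and `have` is assumed to be already sorted by city and date.

```sas
data want_missing;                 /* hypothetical output name */
    set have;                      /* assumes have is sorted by city and date */
    by city;
    /* Call LAG unconditionally so its queue is updated on every row */
    lag_income = lag(income);
    if first.city then
        lag_income = .;            /* do not carry income across cities */
    else
        monthly_income_dif = income - lag_income;
    /* monthly_income_dif is not retained, so it stays missing on first.city */
run;
```

Whether 0 or missing is the better placeholder depends on how the differences are used later: a 0 is included in means and sums as a real "no change", while a missing value is skipped.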

HTH
Chris

All Replies
Super User
Posts: 9,599

## Re: Lag function in a by group processing?

Add a condition to the IF statement so the difference is only computed after the first observation of each city. This version carries the previous income forward with RETAIN instead of LAG:

data have;
input City $ 1-5 date income;
cards;
tokyo 20150324 10000000
tokyo 20150412 75000000
tokyo 20150522 65000000
miami 20150225 50000000
miami 20150312 60000000
;
run;
proc sort data=have;
    by city date;
run;
data want;
    set have;
    by city;
    retain lst_income;
    if not first.city then monthly_income_dif= income - lst_income;
    lst_income=income;
run;
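A further option, not suggested in the thread, is the DIF function, which returns the current value minus its lagged value in one call. The sketch below assumes `have` is already sorted by city and date. The dataset name `want_dif` is hypothetical.

```sas
data want_dif;                          /* hypothetical output name */
    set have;                           /* assumes have is sorted by city and date */
    by city;
    /* DIF(income) is income - LAG(income); call it on every row */
    monthly_income_dif = dif(income);
    /* Discard the difference that spans two cities */
    if first.city then monthly_income_dif = .;
run;

proc print data=want_dif noobs;
run;
```

With the sample data, the Miami rows should show a missing value and then 10000000, and the Tokyo rows a missing value, 65000000 and -10000000.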