BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dunga
Obsidian | Level 7

Hi SAS forum,

 

I have this dataset which shows the income of two cities in different dates.

 

data have;

input City $ 1-5 date income;

cards;

tokyo 20150324 10000000

tokyo 20150412 75000000

tokyo 20150522 65000000

miami 20150225 50000000

miami 20150312 60000000

;

run;

 

Q: I wanted to calculate the income differnece in each city between two adjoining months.

 

My attempt is this.

 

data want;

set have;

lag_income = lag(income);

monthly_income_dif= income - lag_income;

run;

 

Output table:

 

Citydateincomelag_incomemonthly_income_dif
tokyo2015032410000000  
tokyo20150412750000001000000065000000
tokyo201505226500000075000000-10000000
miami201502255000000065000000-15000000
miami20150312600000005000000010000000

 

Problem: The value       -15000000 in "Monthly_income_dif" variable is not correct because it substract tokyo income from Miami income which is not correct.

How to avoid this problem and calculate the income differences within a city?

Thanks

Mirisa

 

1 ACCEPTED SOLUTION

Accepted Solutions
rivieralad
Obsidian | Level 7
Hi Mirisa

A quick solution, sort the data by city and date and then set the difference to 0 for the first occurrence of each city...

proc sort data = have;
by city date;
run;
data want;
set have;
by city;
lag_income = lag(income);
if first.city then
monthly_income_dif=0;
else
monthly_income_dif= income - lag_income;
run;

HTH
Chris

View solution in original post

3 REPLIES 3
rivieralad
Obsidian | Level 7
Hi Mirisa

A quick solution, sort the data by city and date and then set the difference to 0 for the first occurrence of each city...

proc sort data = have;
by city date;
run;
data want;
set have;
by city;
lag_income = lag(income);
if first.city then
monthly_income_dif=0;
else
monthly_income_dif= income - lag_income;
run;

HTH
Chris
grace999
Obsidian | Level 7

I just ran the above code, and got the lag not by city, but just by the whole data. why? The 3rd observation should have missing as the value of lag_income since it's another city, tokyo, but the result got from the income from miami. Why can't I get the lag by group? 

 

Citydateincomelag_incomemonthly_income_dif
miami2015022550000000.0
miami20150312600000005000000010000000
tokyo2015032410000000600000000
tokyo20150412750000001000000065000000
tokyo201505226500000075000000-10000000
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Add to the if statement:

data have;
input City $ 1-5 date income;
cards;
tokyo 20150324 10000000
tokyo 20150412 75000000
tokyo 20150522 65000000
miami 20150225 50000000
miami 20150312 60000000
;
run;
proc sort data=have;
by city date;
run;
data want;
set have;
by city;
retain lst_income;
if not first.city then monthly_income_dif= income - lst_income;
lst_income=income;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 58162 views
  • 6 likes
  • 4 in conversation