BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dunga
Obsidian | Level 7

Hi SAS forum,

 

I have this dataset which shows the income of two cities in different dates.

 

data have;

input City $ 1-5 date income;

cards;

tokyo 20150324 10000000

tokyo 20150412 75000000

tokyo 20150522 65000000

miami 20150225 50000000

miami 20150312 60000000

;

run;

 

Q: I wanted to calculate the income differnece in each city between two adjoining months.

 

My attempt is this.

 

data want;

set have;

lag_income = lag(income);

monthly_income_dif= income - lag_income;

run;

 

Output table:

 

Citydateincomelag_incomemonthly_income_dif
tokyo2015032410000000  
tokyo20150412750000001000000065000000
tokyo201505226500000075000000-10000000
miami201502255000000065000000-15000000
miami20150312600000005000000010000000

 

Problem: The value       -15000000 in "Monthly_income_dif" variable is not correct because it substract tokyo income from Miami income which is not correct.

How to avoid this problem and calculate the income differences within a city?

Thanks

Mirisa

 

1 ACCEPTED SOLUTION

Accepted Solutions
rivieralad
Obsidian | Level 7
Hi Mirisa

A quick solution, sort the data by city and date and then set the difference to 0 for the first occurrence of each city...

proc sort data = have;
by city date;
run;
data want;
set have;
by city;
lag_income = lag(income);
if first.city then
monthly_income_dif=0;
else
monthly_income_dif= income - lag_income;
run;

HTH
Chris

View solution in original post

3 REPLIES 3
rivieralad
Obsidian | Level 7
Hi Mirisa

A quick solution, sort the data by city and date and then set the difference to 0 for the first occurrence of each city...

proc sort data = have;
by city date;
run;
data want;
set have;
by city;
lag_income = lag(income);
if first.city then
monthly_income_dif=0;
else
monthly_income_dif= income - lag_income;
run;

HTH
Chris
grace999
Obsidian | Level 7

I just ran the above code, and got the lag not by city, but just by the whole data. why? The 3rd observation should have missing as the value of lag_income since it's another city, tokyo, but the result got from the income from miami. Why can't I get the lag by group? 

 

Citydateincomelag_incomemonthly_income_dif
miami2015022550000000.0
miami20150312600000005000000010000000
tokyo2015032410000000600000000
tokyo20150412750000001000000065000000
tokyo201505226500000075000000-10000000
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Add to the if statement:

data have;
input City $ 1-5 date income;
cards;
tokyo 20150324 10000000
tokyo 20150412 75000000
tokyo 20150522 65000000
miami 20150225 50000000
miami 20150312 60000000
;
run;
proc sort data=have;
by city date;
run;
data want;
set have;
by city;
retain lst_income;
if not first.city then monthly_income_dif= income - lst_income;
lst_income=income;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 57866 views
  • 6 likes
  • 4 in conversation