Hi SAS forum,
I have this dataset which shows the income of two cities in different dates.
data have;
input City $ 1-5 date income;
cards;
tokyo 20150324 10000000
tokyo 20150412 75000000
tokyo 20150522 65000000
miami 20150225 50000000
miami 20150312 60000000
;run;
I want to produce results below using the lag function by city group :
City | date | income | lag_income |
miami | 20150225 | 50000000 | . |
miami | 20150312 | 60000000 | 50000000 |
tokyo | 20150324 | 10000000 | . |
tokyo | 20150412 | 75000000 | 10000000 |
tokyo | 20150522 | 65000000 | 75000000 |
proc sort data = have;
by city date;
run;
data want;
set have;
by city;
lag_income = lag(income);
run;
After using the above code, I got the lagged income not by city, but just by the whole data. why? The 3rd observation should have missing as the value of lag_income, but the result got from the income from miami. Why can't I get the lag by group?
City | date | income | lag_income |
miami | 20150225 | 50000000 | . |
miami | 20150312 | 60000000 | 50000000 |
tokyo | 20150324 | 10000000 | 60000000 |
tokyo | 20150412 | 75000000 | 10000000 |
tokyo | 20150522 | 65000000 | 75000000 |
lag() does not take care of any by-group processing on its own, you need to do that yourself:
data want;
set have;
by city;
lag_income = lag(income);
if first.city then lag_income = .;
run;
data want;
set have;
by city;
prev_income=lag(income);
if first.city then lag_income=.;
else lag_income=prev_income;
drop prev_income;
run;
lag() does not take care of any by-group processing on its own, you need to do that yourself:
data want;
set have;
by city;
lag_income = lag(income);
if first.city then lag_income = .;
run;
I understand that I can use
if first.city then lag_income=.;
to get the results I want. SAS lag function can't do that itself? is there better way to do it?
data want;
set have;
by city;
lag_income = lag(income);
if first.city then lag_income=.;
run;
Try:
data have;
input City $ 1-5 date income;
cards;
tokyo 20150324 10000000
tokyo 20150412 75000000
tokyo 20150522 65000000
miami 20150225 50000000
miami 20150312 60000000
;
run;
proc sort data=have;
by city;
run;
data want;
set have;
by city;
lag_income=ifn(first.city,.,lag(income));
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.