Hello,
I have a large data set with customer IDs and daily cumulative usage readings. I'm looking to calculate the daily usage per customer from the cumulative usage and add it as a variable in the data set. Any suggestions on how to even begin doing this? Thank you!
HAVE
ID date cumulative_reading
1 10/1/19 100
1 10/2/19 150
1 10/3/19 200
2 10/1/19 100
2 10/2/19 500
2 10/3/19 900
WANT
ID Date Cumulative_reading Daily Usage
1 10/1/19 100 50
1 10/2/19 150 50
1 10/3/19 200
2 10/1/19 100 400
2 10/2/19 500 400
2 10/3/19 900
Much easier if you put the value on the current record instead of the previous one.
data want;
set have ;
by id ;
daily_usage=dif(cumulative_reading);
if first.id then daily_usage=.;
run;
Results:
cumulative_ daily_ Obs id date reading usage 1 1 2019-10-01 100 . 2 1 2019-10-02 150 50 3 1 2019-10-03 200 50 4 2 2019-10-01 100 . 5 2 2019-10-02 500 400 6 2 2019-10-03 900 400
You can even decide if you want the first record's value to be missing or set it to the first value.
if first.id then daily_usage=cumulative_reading;
Results:
cumulative_ daily_ Obs id date reading usage 1 1 2019-10-01 100 100 2 1 2019-10-02 150 50 3 1 2019-10-03 200 50 4 2 2019-10-01 100 100 5 2 2019-10-02 500 400 6 2 2019-10-03 900 400
data want;
set have;
by id;
prev_cumulative=lag(cumulative_reading);
if first.id then daily=cumulative_reading;
else daily=cumulative_reading-prev_cumulative;
drop prev_cumulative;
run;
This assumes the data is properly sorted.
Also, you might want to re-think your example, there is no logical way to get a daily of 50 for the first row.
Much easier if you put the value on the current record instead of the previous one.
data want;
set have ;
by id ;
daily_usage=dif(cumulative_reading);
if first.id then daily_usage=.;
run;
Results:
cumulative_ daily_ Obs id date reading usage 1 1 2019-10-01 100 . 2 1 2019-10-02 150 50 3 1 2019-10-03 200 50 4 2 2019-10-01 100 . 5 2 2019-10-02 500 400 6 2 2019-10-03 900 400
You can even decide if you want the first record's value to be missing or set it to the first value.
if first.id then daily_usage=cumulative_reading;
Results:
cumulative_ daily_ Obs id date reading usage 1 1 2019-10-01 100 100 2 1 2019-10-02 150 50 3 1 2019-10-03 200 50 4 2 2019-10-01 100 100 5 2 2019-10-02 500 400 6 2 2019-10-03 900 400
You need to do a "look-ahead":
data have;
input ID date :mmddyy10. cumulative_reading;
format date yymmddd10.;
datalines;
1 10/1/19 100
1 10/2/19 150
1 10/3/19 200
2 10/1/19 100
2 10/2/19 500
2 10/3/19 900
;
options mergenoby=nowarn;
data want;
merge
have
have (
firstobs=2
keep=id cumulative_reading
rename=(id=nextid cumulative_reading=nextval)
)
;
if id = nextid then daily_usage = nextval - cumulative_reading;
drop nextid nextval;
run;
proc print data=want noobs;
run;
Result:
cumulative_ daily_ ID date reading usage 1 2019-10-01 100 50 1 2019-10-02 150 50 1 2019-10-03 200 . 2 2019-10-01 100 400 2 2019-10-02 500 400 2 2019-10-03 900 .
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.