Hello,
I have a large data set with customer IDs and daily cumulative usage readings. I'm looking to calculate the daily usage per customer from the cumulative usage and add it as a variable in the data set. Any suggestions on how to even begin doing this? Thank you!
HAVE
ID date cumulative_reading
1 10/1/19 100
1 10/2/19 150
1 10/3/19 200
2 10/1/19 100
2 10/2/19 500
2 10/3/19 900
WANT
ID Date Cumulative_reading Daily Usage
1 10/1/19 100 50
1 10/2/19 150 50
1 10/3/19 200
2 10/1/19 100 400
2 10/2/19 500 400
2 10/3/19 900
Much easier if you put the value on the current record instead of the previous one.
data want;
set have ;
by id ;
daily_usage=dif(cumulative_reading);
if first.id then daily_usage=.;
run;
Results:
cumulative_ daily_ Obs id date reading usage 1 1 2019-10-01 100 . 2 1 2019-10-02 150 50 3 1 2019-10-03 200 50 4 2 2019-10-01 100 . 5 2 2019-10-02 500 400 6 2 2019-10-03 900 400
You can even decide if you want the first record's value to be missing or set it to the first value.
if first.id then daily_usage=cumulative_reading;
Results:
cumulative_ daily_ Obs id date reading usage 1 1 2019-10-01 100 100 2 1 2019-10-02 150 50 3 1 2019-10-03 200 50 4 2 2019-10-01 100 100 5 2 2019-10-02 500 400 6 2 2019-10-03 900 400
data want;
set have;
by id;
prev_cumulative=lag(cumulative_reading);
if first.id then daily=cumulative_reading;
else daily=cumulative_reading-prev_cumulative;
drop prev_cumulative;
run;
This assumes the data is properly sorted.
Also, you might want to re-think your example, there is no logical way to get a daily of 50 for the first row.
Much easier if you put the value on the current record instead of the previous one.
data want;
set have ;
by id ;
daily_usage=dif(cumulative_reading);
if first.id then daily_usage=.;
run;
Results:
cumulative_ daily_ Obs id date reading usage 1 1 2019-10-01 100 . 2 1 2019-10-02 150 50 3 1 2019-10-03 200 50 4 2 2019-10-01 100 . 5 2 2019-10-02 500 400 6 2 2019-10-03 900 400
You can even decide if you want the first record's value to be missing or set it to the first value.
if first.id then daily_usage=cumulative_reading;
Results:
cumulative_ daily_ Obs id date reading usage 1 1 2019-10-01 100 100 2 1 2019-10-02 150 50 3 1 2019-10-03 200 50 4 2 2019-10-01 100 100 5 2 2019-10-02 500 400 6 2 2019-10-03 900 400
You need to do a "look-ahead":
data have;
input ID date :mmddyy10. cumulative_reading;
format date yymmddd10.;
datalines;
1 10/1/19 100
1 10/2/19 150
1 10/3/19 200
2 10/1/19 100
2 10/2/19 500
2 10/3/19 900
;
options mergenoby=nowarn;
data want;
merge
have
have (
firstobs=2
keep=id cumulative_reading
rename=(id=nextid cumulative_reading=nextval)
)
;
if id = nextid then daily_usage = nextval - cumulative_reading;
drop nextid nextval;
run;
proc print data=want noobs;
run;
Result:
cumulative_ daily_ ID date reading usage 1 2019-10-01 100 50 1 2019-10-02 150 50 1 2019-10-03 200 . 2 2019-10-01 100 400 2 2019-10-02 500 400 2 2019-10-03 900 .
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.