I have one problem and I think there is not much to correct to work right. I have table:
data have;
input id opt $ t_purchase : datetime32. t_spent :datetime32. bonus usage ;
format time datetime32. ;
cards;
1 a1 10NOV2017:12:02:00 10NOV2017:14:05:00 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:15:07:33 100 0
1 a1 10NOV2017:12:02:00 10NOV2017:13:24:50 100 6
2 a1 10NOV2017:13:54:00 10NOV2017:14:02:58 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:20:22:07 100 12
2 a1 10NOV2017:13:54:00 10NOV2017:13:57:12 100 7
So, I need to sum all usage values from time_purchase (for one id, opt combination (group by id, opt) there is just one unique t_purchase) until t_spent. Also, I have about million of rows, so hash table would be the best solution. I've tried with:
data want;
if _n_=1 then do;
if 0 then set have(rename=(usage=_usage));
declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
h.definekey('id','opt', 't_purchase', 't_spent');
h.definedata('_usage');
h.definedone();
end;
set have;
sum_usage=0;
do i=intck('second', t_purchase, t_spent) to t_spent ;
if h.find(key:user,key:id_option,key:i)=0 then sum_usage-_usage;
end;
drop _usage i;
run;
The fifth line from the bottom is not correct for sure,(do i=intck('second', t_purchase, t_spent)
but have no idea how to approach this, how to select all records between two datetime values. So, the main problem is how to set up time interval to calculate this. I have already one function in this hash table but with the slightly different time interval, so it would be pretty good to write this one too, but it's not necessary.
Desired output:
id opt t_purchase t_spent bonus usage sum_usage
1 a1 10NOV2017:12:02:00 10NOV2017:14:05:00 100 3 9
1 a1 10NOV2017:12:02:00 10NOV2017:15:07:33 100 0 9
1 a1 10NOV2017:12:02:00 10NOV2017:13:24:50 100 6 6
2 a1 10NOV2017:13:54:00 10NOV2017:14:02:58 100 3 10
1 a1 10NOV2017:12:02:00 10NOV2017:20:22:07 100 12 21
2 a1 10NOV2017:13:54:00 10NOV2017:13:57:12 100 7 . 7
OK. If I understand your question.
data have;
input id opt $ t_purchase : datetime32. t_spent :datetime32. bonus usage ;
format t_purchase t_spent datetime32. ;
cards;
1 a1 10NOV2017:12:02:00 10NOV2017:14:05:00 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:15:07:33 100 0
1 a1 10NOV2017:12:02:00 10NOV2017:13:24:50 100 6
2 a1 10NOV2017:13:54:00 10NOV2017:14:02:58 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:20:22:07 100 12
2 a1 10NOV2017:13:54:00 10NOV2017:13:57:12 100 7
;
run;
data want;
if _n_=1 then do;
if 0 then set have(rename=(usage=_usage));
declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
h.definekey('id','opt', 't_purchase', 't_spent');
h.definedata('_usage');
h.definedone();
end;
set have;
sum_usage=0;
do i=t_purchase to t_spent ;
if h.find(key:id,key:opt,key:t_purchase,key:i)=0 then sum_usage+_usage;
end;
drop _usage i;
run;
What do you expect the "fifth line from the bottom" to be? You do mean the second record, with t_spent=10NOV2017:15:07:33, right?
Both record 1 (usage=3), and record 3 (usage=6) fall within the time-spanned in record 2. That's a total of sum_usage=9 (since record 2 has usage=0). Records 4 and 6 have a different id/opt combination, and record 5 is outside the record 2 time span, so their usage is not part of record 2 sum_usage.
Does this do what you are looking for?
proc sort data=have;
BY id opt t_purchase t_spent;
run;
DATA want;
DO UNTIL(LAST.id);
SET work.HAVE;
BY id opt t_purchase t_spent;
IF (FIRST.id) THEN sum_usage=0;
sum_usage + usage;
OUTPUT;
END;
RUN;
Ahmed
This can be a bit more compact. In particular you don't need the "do until ..." group, and you don't need the explicit OUTPUT statement:
proc sort data=have;
BY id opt t_purchase t_spent;
run;
DATA want;
SET work.HAVE;
BY id opt t_purchase;
IF (FIRST.purchase) THEN sum_usage=0;
sum_usage + usage;
RUN;
If you can sort the input data, this should do it:
proc sort data=have; by id opt t_purchase t_spent; run; data want; set have; by id opt t_purchase t_spent; if first.t_purchase then sum_usage=usage; else sum_usage+usage; run;
OK. If I understand your question.
data have;
input id opt $ t_purchase : datetime32. t_spent :datetime32. bonus usage ;
format t_purchase t_spent datetime32. ;
cards;
1 a1 10NOV2017:12:02:00 10NOV2017:14:05:00 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:15:07:33 100 0
1 a1 10NOV2017:12:02:00 10NOV2017:13:24:50 100 6
2 a1 10NOV2017:13:54:00 10NOV2017:14:02:58 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:20:22:07 100 12
2 a1 10NOV2017:13:54:00 10NOV2017:13:57:12 100 7
;
run;
data want;
if _n_=1 then do;
if 0 then set have(rename=(usage=_usage));
declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
h.definekey('id','opt', 't_purchase', 't_spent');
h.definedata('_usage');
h.definedone();
end;
set have;
sum_usage=0;
do i=t_purchase to t_spent ;
if h.find(key:id,key:opt,key:t_purchase,key:i)=0 then sum_usage+_usage;
end;
drop _usage i;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.