I have one problem and I think there is not much to correct to work right. I have table:
data have;
input id opt $ t_purchase : datetime32. t_spent :datetime32. bonus usage ;
format time datetime32. ;
cards;
1 a1 10NOV2017:12:02:00 10NOV2017:14:05:00 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:15:07:33 100 0
1 a1 10NOV2017:12:02:00 10NOV2017:13:24:50 100 6
2 a1 10NOV2017:13:54:00 10NOV2017:14:02:58 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:20:22:07 100 12
2 a1 10NOV2017:13:54:00 10NOV2017:13:57:12 100 7
So, I need to sum all usage values from time_purchase (for one id, opt combination (group by id, opt) there is just one unique t_purchase) until t_spent. Also, I have about million of rows, so hash table would be the best solution. I've tried with:
data want;
if _n_=1 then do;
if 0 then set have(rename=(usage=_usage));
declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
h.definekey('id','opt', 't_purchase', 't_spent');
h.definedata('_usage');
h.definedone();
end;
set have;
sum_usage=0;
do i=intck('second', t_purchase, t_spent) to t_spent ;
if h.find(key:user,key:id_option,key:i)=0 then sum_usage-_usage;
end;
drop _usage i;
run;
The fifth line from the bottom is not correct for sure,(do i=intck('second', t_purchase, t_spent)
but have no idea how to approach this, how to select all records between two datetime values. So, the main problem is how to set up time interval to calculate this. I have already one function in this hash table but with the slightly different time interval, so it would be pretty good to write this one too, but it's not necessary.
Desired output:
id opt t_purchase t_spent bonus usage sum_usage
1 a1 10NOV2017:12:02:00 10NOV2017:14:05:00 100 3 9
1 a1 10NOV2017:12:02:00 10NOV2017:15:07:33 100 0 9
1 a1 10NOV2017:12:02:00 10NOV2017:13:24:50 100 6 6
2 a1 10NOV2017:13:54:00 10NOV2017:14:02:58 100 3 10
1 a1 10NOV2017:12:02:00 10NOV2017:20:22:07 100 12 21
2 a1 10NOV2017:13:54:00 10NOV2017:13:57:12 100 7 . 7
OK. If I understand your question.
data have;
input id opt $ t_purchase : datetime32. t_spent :datetime32. bonus usage ;
format t_purchase t_spent datetime32. ;
cards;
1 a1 10NOV2017:12:02:00 10NOV2017:14:05:00 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:15:07:33 100 0
1 a1 10NOV2017:12:02:00 10NOV2017:13:24:50 100 6
2 a1 10NOV2017:13:54:00 10NOV2017:14:02:58 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:20:22:07 100 12
2 a1 10NOV2017:13:54:00 10NOV2017:13:57:12 100 7
;
run;
data want;
if _n_=1 then do;
if 0 then set have(rename=(usage=_usage));
declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
h.definekey('id','opt', 't_purchase', 't_spent');
h.definedata('_usage');
h.definedone();
end;
set have;
sum_usage=0;
do i=t_purchase to t_spent ;
if h.find(key:id,key:opt,key:t_purchase,key:i)=0 then sum_usage+_usage;
end;
drop _usage i;
run;
What do you expect the "fifth line from the bottom" to be? You do mean the second record, with t_spent=10NOV2017:15:07:33, right?
Both record 1 (usage=3), and record 3 (usage=6) fall within the time-spanned in record 2. That's a total of sum_usage=9 (since record 2 has usage=0). Records 4 and 6 have a different id/opt combination, and record 5 is outside the record 2 time span, so their usage is not part of record 2 sum_usage.
Does this do what you are looking for?
proc sort data=have;
BY id opt t_purchase t_spent;
run;
DATA want;
DO UNTIL(LAST.id);
SET work.HAVE;
BY id opt t_purchase t_spent;
IF (FIRST.id) THEN sum_usage=0;
sum_usage + usage;
OUTPUT;
END;
RUN;
Ahmed
This can be a bit more compact. In particular you don't need the "do until ..." group, and you don't need the explicit OUTPUT statement:
proc sort data=have;
BY id opt t_purchase t_spent;
run;
DATA want;
SET work.HAVE;
BY id opt t_purchase;
IF (FIRST.purchase) THEN sum_usage=0;
sum_usage + usage;
RUN;
If you can sort the input data, this should do it:
proc sort data=have; by id opt t_purchase t_spent; run; data want; set have; by id opt t_purchase t_spent; if first.t_purchase then sum_usage=usage; else sum_usage+usage; run;
OK. If I understand your question.
data have;
input id opt $ t_purchase : datetime32. t_spent :datetime32. bonus usage ;
format t_purchase t_spent datetime32. ;
cards;
1 a1 10NOV2017:12:02:00 10NOV2017:14:05:00 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:15:07:33 100 0
1 a1 10NOV2017:12:02:00 10NOV2017:13:24:50 100 6
2 a1 10NOV2017:13:54:00 10NOV2017:14:02:58 100 3
1 a1 10NOV2017:12:02:00 10NOV2017:20:22:07 100 12
2 a1 10NOV2017:13:54:00 10NOV2017:13:57:12 100 7
;
run;
data want;
if _n_=1 then do;
if 0 then set have(rename=(usage=_usage));
declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
h.definekey('id','opt', 't_purchase', 't_spent');
h.definedata('_usage');
h.definedone();
end;
set have;
sum_usage=0;
do i=t_purchase to t_spent ;
if h.find(key:id,key:opt,key:t_purchase,key:i)=0 then sum_usage+_usage;
end;
drop _usage i;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.