BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jovic92
Obsidian | Level 7

I have one problem and I think there is not much to correct to work right. I have table:

 

data have;
input id  opt  $   t_purchase   : datetime32. t_spent :datetime32.  bonus  usage ;
format  time   datetime32. ;
cards; 
1    a1  10NOV2017:12:02:00  10NOV2017:14:05:00   100     3 
1    a1  10NOV2017:12:02:00  10NOV2017:15:07:33   100     0  
1    a1  10NOV2017:12:02:00  10NOV2017:13:24:50   100     6 
2    a1  10NOV2017:13:54:00  10NOV2017:14:02:58   100     3 
1    a1  10NOV2017:12:02:00  10NOV2017:20:22:07   100    12    
2    a1  10NOV2017:13:54:00  10NOV2017:13:57:12   100     7 

So, I need to sum all usage values from time_purchase (for one id, opt combination (group by id, opt) there is just one unique t_purchase) until t_spent. Also, I have about million of rows, so hash table would be the best solution. I've tried with:

 

data want;
 if _n_=1 then do;
  if 0 then set have(rename=(usage=_usage));
  declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
  h.definekey('id','opt', 't_purchase', 't_spent');
  h.definedata('_usage');
  h.definedone();
 end;
set have;
sum_usage=0;
do i=intck('second', t_purchase, t_spent) to t_spent ;
 if h.find(key:user,key:id_option,key:i)=0 then sum_usage-_usage;
end;
drop _usage i;
run;

 The fifth line from the bottom is not correct for sure,(do i=intck('second', t_purchase, t_spent) but have no idea how to approach this, how to select all records between two datetime values. So, the main problem is how to set up time interval to calculate this. I have already one function in this hash table but with the slightly different time interval, so it would be pretty good to write this one too, but it's not necessary.

 

Desired output:

id  opt    t_purchase            t_spent       bonus   usage sum_usage
1    a1  10NOV2017:12:02:00  10NOV2017:14:05:00   100     3        9
1    a1  10NOV2017:12:02:00  10NOV2017:15:07:33   100     0        9
1    a1  10NOV2017:12:02:00  10NOV2017:13:24:50   100     6        6
2    a1  10NOV2017:13:54:00  10NOV2017:14:02:58   100     3        10
1    a1  10NOV2017:12:02:00  10NOV2017:20:22:07   100    12        21 
2    a1  10NOV2017:13:54:00  10NOV2017:13:57:12   100     7 .      7
1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

OK. If I understand your question.

 

data have;
input id  opt  $   t_purchase   : datetime32. t_spent :datetime32.  bonus  usage ;
format  t_purchase  t_spent datetime32. ;
cards; 
1    a1  10NOV2017:12:02:00  10NOV2017:14:05:00   100     3 
1    a1  10NOV2017:12:02:00  10NOV2017:15:07:33   100     0  
1    a1  10NOV2017:12:02:00  10NOV2017:13:24:50   100     6 
2    a1  10NOV2017:13:54:00  10NOV2017:14:02:58   100     3 
1    a1  10NOV2017:12:02:00  10NOV2017:20:22:07   100    12    
2    a1  10NOV2017:13:54:00  10NOV2017:13:57:12   100     7 
;
run;

data want;
 if _n_=1 then do;
  if 0 then set have(rename=(usage=_usage));
  declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
  h.definekey('id','opt', 't_purchase', 't_spent');
  h.definedata('_usage');
  h.definedone();
 end;
set have;
sum_usage=0;
do i=t_purchase to t_spent ;
 if h.find(key:id,key:opt,key:t_purchase,key:i)=0 then sum_usage+_usage;
end;
drop _usage i;
run;

View solution in original post

6 REPLIES 6
mkeintz
PROC Star

What do you expect the "fifth line from the bottom" to be?  You do mean the second record, with t_spent=10NOV2017:15:07:33, right?

 

Both record 1 (usage=3), and record 3 (usage=6) fall within the time-spanned in record 2.  That's a total of sum_usage=9 (since record 2 has usage=0).  Records 4 and 6 have a different id/opt combination, and record 5 is outside the record 2 time span, so their usage is not part of record 2 sum_usage.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
jovic92
Obsidian | Level 7
I thought fifth line from bottom in code, not table. Intck function can't be used, i guess this way in this solution, but I have no idea how to solve this.
AhmedAl_Attar
Ammonite | Level 13

Does this do what you are looking for?

proc sort data=have;
    BY id opt t_purchase t_spent;
run;

DATA want;
    
    DO UNTIL(LAST.id);
        SET work.HAVE;
        BY id opt t_purchase t_spent;
        IF (FIRST.id) THEN sum_usage=0;
        sum_usage + usage;
        OUTPUT;
    END;
RUN;

Ahmed

mkeintz
PROC Star

This can be a bit more compact.  In particular you don't need the "do until ..." group, and you don't need the explicit OUTPUT statement:

 

 

proc sort data=have;
    BY id opt t_purchase t_spent;
run;

DATA want;
  SET work.HAVE;
  BY id opt t_purchase;
  IF (FIRST.purchase) THEN sum_usage=0;
  sum_usage + usage;
RUN;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
s_lassen
Meteorite | Level 14

If you can sort the input data, this should do it:

proc sort data=have;
  by id opt t_purchase t_spent;
run;
data want;
  set have;
  by id opt t_purchase t_spent;
  if first.t_purchase then
    sum_usage=usage;
  else
    sum_usage+usage;
run;

 

Ksharp
Super User

OK. If I understand your question.

 

data have;
input id  opt  $   t_purchase   : datetime32. t_spent :datetime32.  bonus  usage ;
format  t_purchase  t_spent datetime32. ;
cards; 
1    a1  10NOV2017:12:02:00  10NOV2017:14:05:00   100     3 
1    a1  10NOV2017:12:02:00  10NOV2017:15:07:33   100     0  
1    a1  10NOV2017:12:02:00  10NOV2017:13:24:50   100     6 
2    a1  10NOV2017:13:54:00  10NOV2017:14:02:58   100     3 
1    a1  10NOV2017:12:02:00  10NOV2017:20:22:07   100    12    
2    a1  10NOV2017:13:54:00  10NOV2017:13:57:12   100     7 
;
run;

data want;
 if _n_=1 then do;
  if 0 then set have(rename=(usage=_usage));
  declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
  h.definekey('id','opt', 't_purchase', 't_spent');
  h.definedata('_usage');
  h.definedone();
 end;
set have;
sum_usage=0;
do i=t_purchase to t_spent ;
 if h.find(key:id,key:opt,key:t_purchase,key:i)=0 then sum_usage+_usage;
end;
drop _usage i;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2081 views
  • 0 likes
  • 5 in conversation