Partition by equivalent in SAS base or proc sql

Accepted Solution Solved
Reply
Contributor
Posts: 23
Accepted Solution

Partition by equivalent in SAS base or proc sql

Hi, everyone,

What is needed is to calculate for every record (every row) the last 3hr sum of usage (Usage is one of the columns in dataset) grouped by User and ID_option. 

Every line(row) represent one record (within 3 min time interval). For example (including desired column sum_usage):

 

 

User  ID_option          time             usage        sum_usage_3hr
1         a1        12OCT2017:11:20:32       3             10
1         a1        12OCT2017:10:23:24       7             14
1         b1        12OCT2017:09:34:55       12            12
2         b1        12OCT2017:08:55:06       4              6
1         a1        12OCT2017:07:59:53       7              7
2         b1        12OCT2017:06:59:12       2              2

I have tried with something like this code below, and it return me sum of all time, not just the last 3HR. I'm not surprised, but have not so much idea how I'm going to do this in sas. 

proc sql:
CREATE table my_table
SELECT *, SUM(usage) AS sum_usage_3hr
FROM prev_table WHERE time BETWEEN TIME and intnx('second', time, -3*3600)
GROUP BY User, ID_option;
RUN;

 Any help is welcomed, thanks. 


Accepted Solutions
Solution
‎11-29-2017 07:42 AM
Super User
Posts: 10,698

Re: Partition by equivalent in SAS base or proc sql

OK. Here is . Hope you have big enough memory to run Hash Table.

 

data have;
input User  ID_option  $        time   : datetime32.          usage   ;
format  time   datetime32. ;
cards; 
1         a1        12OCT2017:11:20:32       3             
1         a1        12OCT2017:10:23:24       7             
1         b1        12OCT2017:09:34:55       12            
2         b1        12OCT2017:08:55:06       4             
1         a1        12OCT2017:07:59:53       7             
2         b1        12OCT2017:06:59:12       2             
;
run;
data want;
 if _n_=1 then do;
  if 0 then set have(rename=(usage=_usage));
  declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
  h.definekey('user','id_option','time');
  h.definedata('_usage');
  h.definedone();
 end;
set have;
sum_usage_3hr=0;
do i=time-3*3600 to time ;
 if h.find(key:user,key:id_option,key:i)=0 then sum_usage_3hr+_usage;
end;
drop _usage i;
run;

View solution in original post


All Replies
Super User
Posts: 10,698

Re: Partition by equivalent in SAS base or proc sql

data have;
input User  ID_option  $        time   : datetime32.          usage   ;
format  time   datetime32. ;
cards; 
1         a1        12OCT2017:11:20:32       3             
1         a1        12OCT2017:10:23:24       7             
1         b1        12OCT2017:09:34:55       12            
2         b1        12OCT2017:08:55:06       4             
1         a1        12OCT2017:07:59:53       7             
2         b1        12OCT2017:06:59:12       2             
;
run;
proc sql;
select *,(select sum(usage) from have as b where b.user=a.user and b.ID_option=a.ID_option and
b.time between a.time-3*3600 and a.time ) as want
 from have as a;
quit;
Contributor
Posts: 23

Re: Partition by equivalent in SAS base or proc sql

I'm testing this code now, but it works about 1 hour, and still counting, I don't know why, it looks good. 

Maybe there is need for some sorting data before aggregation? 

Thank you for your reply. 

Super User
Posts: 10,698

Re: Partition by equivalent in SAS base or proc sql

Do you have a big table ? If you have, then I suggest to split your big table into many small table, each contains only ONE id . and run my code . Or maybe you need other fast algorithm like Hash Table .

Contributor
Posts: 23

Re: Partition by equivalent in SAS base or proc sql

Yes, it is a big table, but I think it should be done till now. I will try that, thanks. 

Contributor
Posts: 23

Re: Partition by equivalent in SAS base or proc sql

[ Edited ]

Now I've made table a smaller, I've take just 5 of 24 loader files and still have the same problem. 

Solution
‎11-29-2017 07:42 AM
Super User
Posts: 10,698

Re: Partition by equivalent in SAS base or proc sql

OK. Here is . Hope you have big enough memory to run Hash Table.

 

data have;
input User  ID_option  $        time   : datetime32.          usage   ;
format  time   datetime32. ;
cards; 
1         a1        12OCT2017:11:20:32       3             
1         a1        12OCT2017:10:23:24       7             
1         b1        12OCT2017:09:34:55       12            
2         b1        12OCT2017:08:55:06       4             
1         a1        12OCT2017:07:59:53       7             
2         b1        12OCT2017:06:59:12       2             
;
run;
data want;
 if _n_=1 then do;
  if 0 then set have(rename=(usage=_usage));
  declare hash h(dataset:'have(rename=(usage=_usage))',hashexp:20);
  h.definekey('user','id_option','time');
  h.definedata('_usage');
  h.definedone();
 end;
set have;
sum_usage_3hr=0;
do i=time-3*3600 to time ;
 if h.find(key:user,key:id_option,key:i)=0 then sum_usage_3hr+_usage;
end;
drop _usage i;
run;
Contributor
Posts: 23

Re: Partition by equivalent in SAS base or proc sql

I have got an error: 

Invalid DO loop control information, either the INITIAL or TO expression is missing or the BY expression is missing, zero, or invalid. 

Super User
Posts: 10,698

Re: Partition by equivalent in SAS base or proc sql

And don't forget round your TIME variable, if it has decimal number.

 

data have;
input User  ID_option  $        time   : datetime32.          usage   ;
time=int(time);
format  time   datetime32. ;
Contributor
Posts: 23

Re: Partition by equivalent in SAS base or proc sql

Time is in format I have posted in question. 

Thank you, I'm going to try that in a few minutes. 

Super User
Posts: 10,698

Re: Partition by equivalent in SAS base or proc sql

If you don't have big memory to run Hash Table, Try this one .

NOTE: these code is assuming there are not duplicate TIME for the same user and id_option.

If there duplicated TIME(e.g.  two or more obs have the same TIME under the same user and same id_option),

use PROC SUMMARY to sum USAGE.

 

data have;
input User  ID_option  $        time   : datetime32.          usage   ;
format  time   datetime32. ;
cards; 
1         a1        12OCT2017:11:20:32       3             
1         a1        12OCT2017:10:23:24       7             
1         b1        12OCT2017:09:34:55       12            
2         b1        12OCT2017:08:55:06       4             
1         a1        12OCT2017:07:59:53       7             
2         b1        12OCT2017:06:59:12       2             
;
run;
proc sort data=have;
by User  ID_option ;
run;
data want;
 if _n_=1 then do;
  if 0 then set have;
  declare hash h(hashexp:20);
  h.definekey('user','id_option','time');
  h.definedata('_usage');
  h.definedone();
 end;

do until(last.id_option);
set have;
by user id_option;
 _usage=usage;
 h.ref();
end;

do until(last.id_option);
set have;
by user id_option;
sum_usage_3hr=0;
do i=time-3*3600 to time ;
 if h.find(key:user,key:id_option,key:i)=0 then sum_usage_3hr+_usage;
end;
output;
end;
h.clear();
drop _usage i;
run;
Contributor
Posts: 23

Re: Partition by equivalent in SAS base or proc sql

I think I have enough memory, 32 GB RAM an 8-core processor. 

Just don't get it why is there problem with "No mathing DO/SELECT statement now. 

 

I'm going to try new code. 

Super User
Posts: 10,698

Re: Partition by equivalent in SAS base or proc sql

NOTE: If you have duplicated obs , you need run PROC SUMMARY firstly 

i.e.

1 a1 12OCT2017:11:20:32 3
1 a1 12OCT2017:10:23:24 7

1 a1 12OCT2017:10:23:24 2

1 a1 12OCT2017:10:23:24 4

 

 

proc summary data=have;

by  User  ID_option         time ;

var usage;

output out=have1 sum=;

run;

Contributor
Posts: 23

Re: Partition by equivalent in SAS base or proc sql

I have done that, but again is the problem with DO loop control. Same as I mentioned earlier. 

I appreciate your help, either we find the solution or not. 

Super User
Posts: 10,698

Re: Partition by equivalent in SAS base or proc sql

It is really weird. I didn't run into any ERROR when running my code . Can you post your LOG ?

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 21 replies
  • 1552 views
  • 9 likes
  • 3 in conversation