Average across pattern

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

Average across pattern

I need help in finding the average of time for repeating pattern. So in the below example, Time represents the seconds on each click and Total time represents the total time for that pattern. The first pattern ABCone got repeated twice so its average will be 11 and similarly DEF just appeared once so its average will be 12. I also want the number of clicks in each pattern. For example:

Clicks    Group  TotalTime   Time

  A         1                 5
  B         1                 3
  C         1                 2
  one       1       10        0
  D         2                 6 
  E         2                 6
  F         2       12        0
  A         3                 3
  B         3                 5
  C         3                 4
  one       3       12        0
  H         4                10
  I         4                 8
  J         4       18        0

Currently, I got the results which gives me the average time for each click for each pattern using this code:

data have;
set work.book12;
run;
data have2 / view=have2;
  length pattern $30;
  do _n_ = 1 by 1 until (last.group);
    set have;
    by group;
    pattern = cats(pattern,clicks);
  end;
  if length (pattern) = lengthc(pattern) then do;
    put 'WARNING: pattern needs more length';
    stop;
  end;
  do _n_ = 1 to _n_;
    set have;
    output;
  end;
run;

proc means noprint data=have2;
  class pattern clicks;
var time;
  ways 2;
  output out=have_means mean=mean ;
run;

data want (keep=pattern time_summary _freq_);
  do until (last.pattern);
    set have_means;
    by pattern;
    length time_summary $100;
    time_summary = catx(',',time_summary,catx('-',clicks,mean));
  end;
run;

Output expected: In this ' ABCone' got repeated twice so A’s total average is 4 seconds i.e 5 seconds in pattern 1 and 3 seconds in 2nd repeated pattern similarly for B ,C and other clicks seconds under Time column.

Pattern  Freq   Time                 Average    Number of Clicks
ABCone    2     A-4,B-4,C-3,one-0       11            4
DEF       1     D-6,E-6,F-0             12            3
HIJ       1     H-10,I-8,J-0            18            3

 


Accepted Solutions
Solution
2 weeks ago
Super User
Posts: 10,215

Re: Average across pattern

Do us a big favor and post example data in a working data step, so we don't have to fiddle around beating text into usable data.

See this:

data book12;
infile cards dsd dlm=' ';
input clicks $ group totaltime time;
cards;
A 1  5
B 1  3
C 1  2
one 1 10 0
D 2  6 
E 2  6
F 2 12 0
A 3  3
B 3  5
C 3  4
one 3 12 0
H 4  10
I 4  8
J 4 18 0
;
run;

data have;
  length pattern $30;
  do until (last.group);
    set book12;
    by group;
    pattern = cats(pattern,clicks);
  end;
  if length (pattern) = lengthc(pattern) then do;
    put 'WARNING: pattern needs more length';
    stop;
  end;
  seq = 0;
  do until (last.group);
    set book12;
    by group;
    seq + 1;
    output;
  end;
run;

proc sort data=have;
by pattern seq;
run;

data want;
label
  pattern = 'Pattern'
  freq = 'Freq'
  timestr = 'Time'
  average = 'Average'
  num_clicks = 'Number of Clicks'
;
set have;
by pattern seq;
length timestr $50;
retain timestr sum_time sum_total num_clicks;
if first.pattern
then do;
  timestr = '';
  sum_total = 0;
  num_clicks = 0;
end;
if first.seq
then do;
  sum_time = 0;
  freq = 0;
  num_clicks + 1;
end;
sum_time + time;
sum_total = sum(sum_total,totaltime);
freq + 1;
if last.seq then timestr = catx(',',timestr,strip(clicks)!!'-'!!strip(put(sum_time/freq,best.)));
if last.pattern
then do;
  average = sum_total / freq;
  output;
end;
keep pattern freq timestr average num_clicks;
run;

proc print data=want noobs label;
run;

Note that I have omitted the unnecessary creation of the view, and simplified the double DOW loop.

The result:

                                                   Number
                                                     of
Pattern    Freq    Time                 Average    Clicks

ABCone       2     A-4,B-4,C-3,one-0       11         4  
DEF          1     D-6,E-6,F-0             12         3  
HIJ          1     H-10,I-8,J-0            18         3  
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

View solution in original post


All Replies
Solution
2 weeks ago
Super User
Posts: 10,215

Re: Average across pattern

Do us a big favor and post example data in a working data step, so we don't have to fiddle around beating text into usable data.

See this:

data book12;
infile cards dsd dlm=' ';
input clicks $ group totaltime time;
cards;
A 1  5
B 1  3
C 1  2
one 1 10 0
D 2  6 
E 2  6
F 2 12 0
A 3  3
B 3  5
C 3  4
one 3 12 0
H 4  10
I 4  8
J 4 18 0
;
run;

data have;
  length pattern $30;
  do until (last.group);
    set book12;
    by group;
    pattern = cats(pattern,clicks);
  end;
  if length (pattern) = lengthc(pattern) then do;
    put 'WARNING: pattern needs more length';
    stop;
  end;
  seq = 0;
  do until (last.group);
    set book12;
    by group;
    seq + 1;
    output;
  end;
run;

proc sort data=have;
by pattern seq;
run;

data want;
label
  pattern = 'Pattern'
  freq = 'Freq'
  timestr = 'Time'
  average = 'Average'
  num_clicks = 'Number of Clicks'
;
set have;
by pattern seq;
length timestr $50;
retain timestr sum_time sum_total num_clicks;
if first.pattern
then do;
  timestr = '';
  sum_total = 0;
  num_clicks = 0;
end;
if first.seq
then do;
  sum_time = 0;
  freq = 0;
  num_clicks + 1;
end;
sum_time + time;
sum_total = sum(sum_total,totaltime);
freq + 1;
if last.seq then timestr = catx(',',timestr,strip(clicks)!!'-'!!strip(put(sum_time/freq,best.)));
if last.pattern
then do;
  average = sum_total / freq;
  output;
end;
keep pattern freq timestr average num_clicks;
run;

proc print data=want noobs label;
run;

Note that I have omitted the unnecessary creation of the view, and simplified the double DOW loop.

The result:

                                                   Number
                                                     of
Pattern    Freq    Time                 Average    Clicks

ABCone       2     A-4,B-4,C-3,one-0       11         4  
DEF          1     D-6,E-6,F-0             12         3  
HIJ          1     H-10,I-8,J-0            18         3  
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Occasional Contributor
Posts: 7

Re: Average across pattern

[ Edited ]
Posted in reply to KurtBremser

 Thank you for the solution I have one more doubt:   If a pattern contains the consecutive keyword "one" then I want it to be merged and considered as single keyword "one" but get the average of total number of times "one" appeared. (explained in the example) For example:

Clicks    Group  TotalTime   Time

  A         1                 5
  B         1                 3
  C         1                 2
  one       1       10        0
  D         2                 6 
  E         2                 6
  F         2       12        0
  A         3                 3
  B         3                 5
  C         3                 4
  one       3                 6
  one       3                 2
  one       3       20        0
  H         4                10
  I         4                 8
  J         4       18        0

 

output expected:

Pattern  Freq   Time                 Average    Number of Clicks
ABCone    2     A-4,B-4,C-3,one-2       15            4
DEF       1     D-6,E-6,F-0             12            3
HIJ       1     H-10,I-8,J-0            18            3

 

Output which I received from the code is not right so can you help me in correcting where I am going wrong ?

code:

data have2 / view=have2;
length pattern $30;

pattern = '|'; * prepare for bounded token search via INDEX();

do _n_ = 1 by 1 until (last.group);
set have;
by group;

* use this line if all items in group are known to be distinct ;
* pattern = cats(pattern,clicks);

* track observed clicks by searching the growing pattern of the group;
bounded_token = cats( '|', clicks, '|' );
if index (pattern, trim(bounded_token) ) = 0 then
pattern = cats (pattern, clicks, '|');
end;
if length (pattern) = lengthc(pattern) then do;
put 'WARNING: pattern needs more length';
stop;
end;

* remove token bounders;
pattern = compress(pattern,'|');

do _n_ = 1 to _n_;
set have;
output;
end;
run;

proc means noprint data=have2;
class pattern clicks;
var time;
ways 2;
output out=have_means mean=mean ;
run;

data want (keep=pattern time_summary _freq_);
do until (last.pattern);
set have_means;
by pattern;
length time_summary $100;
time_summary = catx(',',time_summary,catx('-',clicks,mean));
end;
run;
 

 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 96 views
  • 1 like
  • 2 in conversation