I need help in finding the average of time for repeating pattern. So in the below example, Time represents the seconds on each click and Total time represents the total time for that pattern. The first pattern ABCone got repeated twice so its average will be 11 and similarly DEF just appeared once so its average will be 12. I also want the number of clicks in each pattern. For example:
Clicks Group TotalTime Time
A 1 5
B 1 3
C 1 2
one 1 10 0
D 2 6
E 2 6
F 2 12 0
A 3 3
B 3 5
C 3 4
one 3 12 0
H 4 10
I 4 8
J 4 18 0
Currently, I got the results which gives me the average time for each click for each pattern using this code:
data have;
set work.book12;
run;
data have2 / view=have2;
length pattern $30;
do _n_ = 1 by 1 until (last.group);
set have;
by group;
pattern = cats(pattern,clicks);
end;
if length (pattern) = lengthc(pattern) then do;
put 'WARNING: pattern needs more length';
stop;
end;
do _n_ = 1 to _n_;
set have;
output;
end;
run;
proc means noprint data=have2;
class pattern clicks;
var time;
ways 2;
output out=have_means mean=mean ;
run;
data want (keep=pattern time_summary _freq_);
do until (last.pattern);
set have_means;
by pattern;
length time_summary $100;
time_summary = catx(',',time_summary,catx('-',clicks,mean));
end;
run;
Output expected: In this ' ABCone' got repeated twice so A’s total average is 4 seconds i.e 5 seconds in pattern 1 and 3 seconds in 2nd repeated pattern similarly for B ,C and other clicks seconds under Time column.
Pattern Freq Time Average Number of Clicks
ABCone 2 A-4,B-4,C-3,one-0 11 4
DEF 1 D-6,E-6,F-0 12 3
HIJ 1 H-10,I-8,J-0 18 3
Do us a big favor and post example data in a working data step, so we don't have to fiddle around beating text into usable data.
See this:
data book12;
infile cards dsd dlm=' ';
input clicks $ group totaltime time;
cards;
A 1 5
B 1 3
C 1 2
one 1 10 0
D 2 6
E 2 6
F 2 12 0
A 3 3
B 3 5
C 3 4
one 3 12 0
H 4 10
I 4 8
J 4 18 0
;
run;
data have;
length pattern $30;
do until (last.group);
set book12;
by group;
pattern = cats(pattern,clicks);
end;
if length (pattern) = lengthc(pattern) then do;
put 'WARNING: pattern needs more length';
stop;
end;
seq = 0;
do until (last.group);
set book12;
by group;
seq + 1;
output;
end;
run;
proc sort data=have;
by pattern seq;
run;
data want;
label
pattern = 'Pattern'
freq = 'Freq'
timestr = 'Time'
average = 'Average'
num_clicks = 'Number of Clicks'
;
set have;
by pattern seq;
length timestr $50;
retain timestr sum_time sum_total num_clicks;
if first.pattern
then do;
timestr = '';
sum_total = 0;
num_clicks = 0;
end;
if first.seq
then do;
sum_time = 0;
freq = 0;
num_clicks + 1;
end;
sum_time + time;
sum_total = sum(sum_total,totaltime);
freq + 1;
if last.seq then timestr = catx(',',timestr,strip(clicks)!!'-'!!strip(put(sum_time/freq,best.)));
if last.pattern
then do;
average = sum_total / freq;
output;
end;
keep pattern freq timestr average num_clicks;
run;
proc print data=want noobs label;
run;
Note that I have omitted the unnecessary creation of the view, and simplified the double DOW loop.
The result:
Number of Pattern Freq Time Average Clicks ABCone 2 A-4,B-4,C-3,one-0 11 4 DEF 1 D-6,E-6,F-0 12 3 HIJ 1 H-10,I-8,J-0 18 3
Do us a big favor and post example data in a working data step, so we don't have to fiddle around beating text into usable data.
See this:
data book12;
infile cards dsd dlm=' ';
input clicks $ group totaltime time;
cards;
A 1 5
B 1 3
C 1 2
one 1 10 0
D 2 6
E 2 6
F 2 12 0
A 3 3
B 3 5
C 3 4
one 3 12 0
H 4 10
I 4 8
J 4 18 0
;
run;
data have;
length pattern $30;
do until (last.group);
set book12;
by group;
pattern = cats(pattern,clicks);
end;
if length (pattern) = lengthc(pattern) then do;
put 'WARNING: pattern needs more length';
stop;
end;
seq = 0;
do until (last.group);
set book12;
by group;
seq + 1;
output;
end;
run;
proc sort data=have;
by pattern seq;
run;
data want;
label
pattern = 'Pattern'
freq = 'Freq'
timestr = 'Time'
average = 'Average'
num_clicks = 'Number of Clicks'
;
set have;
by pattern seq;
length timestr $50;
retain timestr sum_time sum_total num_clicks;
if first.pattern
then do;
timestr = '';
sum_total = 0;
num_clicks = 0;
end;
if first.seq
then do;
sum_time = 0;
freq = 0;
num_clicks + 1;
end;
sum_time + time;
sum_total = sum(sum_total,totaltime);
freq + 1;
if last.seq then timestr = catx(',',timestr,strip(clicks)!!'-'!!strip(put(sum_time/freq,best.)));
if last.pattern
then do;
average = sum_total / freq;
output;
end;
keep pattern freq timestr average num_clicks;
run;
proc print data=want noobs label;
run;
Note that I have omitted the unnecessary creation of the view, and simplified the double DOW loop.
The result:
Number of Pattern Freq Time Average Clicks ABCone 2 A-4,B-4,C-3,one-0 11 4 DEF 1 D-6,E-6,F-0 12 3 HIJ 1 H-10,I-8,J-0 18 3
Thank you for the solution I have one more doubt: If a pattern contains the consecutive keyword "one" then I want it to be merged and considered as single keyword "one" but get the average of total number of times "one" appeared. (explained in the example) For example:
Clicks Group TotalTime Time
A 1 5
B 1 3
C 1 2
one 1 10 0
D 2 6
E 2 6
F 2 12 0
A 3 3
B 3 5
C 3 4
one 3 6
one 3 2
one 3 20 0
H 4 10
I 4 8
J 4 18 0
output expected:
Pattern Freq Time Average Number of Clicks
ABCone 2 A-4,B-4,C-3,one-2 15 4
DEF 1 D-6,E-6,F-0 12 3
HIJ 1 H-10,I-8,J-0 18 3
Output which I received from the code is not right so can you help me in correcting where I am going wrong ?
code:
data have2 / view=have2;
length pattern $30;
pattern = '|'; * prepare for bounded token search via INDEX();
do _n_ = 1 by 1 until (last.group);
set have;
by group;
* use this line if all items in group are known to be distinct ;
* pattern = cats(pattern,clicks);
* track observed clicks by searching the growing pattern of the group;
bounded_token = cats( '|', clicks, '|' );
if index (pattern, trim(bounded_token) ) = 0 then
pattern = cats (pattern, clicks, '|');
end;
if length (pattern) = lengthc(pattern) then do;
put 'WARNING: pattern needs more length';
stop;
end;
* remove token bounders;
pattern = compress(pattern,'|');
do _n_ = 1 to _n_;
set have;
output;
end;
run;
proc means noprint data=have2;
class pattern clicks;
var time;
ways 2;
output out=have_means mean=mean ;
run;
data want (keep=pattern time_summary _freq_);
do until (last.pattern);
set have_means;
by pattern;
length time_summary $100;
time_summary = catx(',',time_summary,catx('-',clicks,mean));
end;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.