## Finding Pattern

Occasional Contributor
Posts: 7

I have a dataset that records every click made on a website in a single column. I want to find the click patterns that repeat across the data, which contains more than 1 million rows and about 17,000 different patterns. I also want to know the average time spent on each click within each pattern. I have written SAS code that assigns each pattern occurrence to a group and computes the time difference between consecutive clicks, but the output is not in the form I want. For example, my code currently produces this output:

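| Clicks | Group | Time (seconds) |
|--------|-------|----------------|
| A      | 1     | 6              |
| B      | 1     | 2              |
| C      | 1     | 0              |
| D      | 2     | 12             |
| E      | 2     | 5              |
| F      | 2     | 0              |
| A      | 3     | 9              |
| B      | 3     | 6              |
| C      | 3     | 0              |
| H      | 4     | 8              |
| I      | 4     | 9              |
| J      | 4     | 0              |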

Output expected:

| Clicks | AverageTime     | Count |
|--------|-----------------|-------|
| ABC    | A-7.5, B-4, C-0 | 2     |
| DEF    | D-12, E-5, F-0  | 1     |
| HIJ    | H-8, I-9, J-0   | 1     |

Super User
Posts: 6,785

## Re: Finding Pattern

Are we guaranteed that every group contains exactly 3 clicks?

Here's an approach that does the heavy lifting. It assumes at most 3 clicks per group; if longer patterns are allowed, you can extend it by enlarging the array and the KEEP list:

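    data want;
        length pattern $ 100;
        array duration {3};          /* creates duration1-duration3 */
        recnum = 0;                  /* click position within the current group */
        /* Read one whole group per DATA step iteration */
        do until (last.group);
            set have;
            by group;                /* HAVE must be sorted by group */
            /* Build the pattern string, e.g. A|B|C */
            if first.group then pattern = clicks;
            else pattern = catx('|', pattern, clicks);
            recnum + 1;
            duration{recnum} = time; /* time recorded for this click */
        end;
        /* One output row per group is written at the end of the iteration */
        keep group pattern duration1-duration3;
    run;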

This will at least give you the pieces to work with.  The rest of the programming would use simple steps like PROC FREQ and PROC MEANS.
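
As a rough sketch of those remaining steps, assuming the `want` dataset produced above: PROC FREQ counts how often each pattern occurs, PROC MEANS averages each click position within a pattern, and a merge combines the two. The names `pattern_counts`, `pattern_means`, `summary`, and `avg1`-`avg3` are made up for illustration and are not part of the original code.

```sas
/* Count how many times each pattern occurs (output variable: COUNT). */
proc freq data=want noprint;
    tables pattern / out=pattern_counts (drop=percent);
run;

/* Average the time at each click position within each pattern. */
/* NWAY keeps only the per-pattern rows, not the overall total. */
proc means data=want noprint nway;
    class pattern;
    var duration1 duration2 duration3;
    output out=pattern_means (drop=_type_ _freq_)
           mean=avg1 avg2 avg3;
run;

/* Both outputs are ordered by pattern, so they can be merged directly. */
data summary;
    merge pattern_counts pattern_means;
    by pattern;
run;
```

For the sample data, the row for pattern `A|B|C` should show a count of 2 with averages of 7.5, 4, and 0. The `_FREQ_` variable that PROC MEANS writes to its output dataset also holds the per-pattern row count, so the PROC FREQ step could be dropped by keeping and renaming `_FREQ_` instead.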
