## how do I group data and calculate sum using 'like' operator?

Solved
Occasional Contributor
Posts: 19

# how do I group data and calculate sum using 'like' operator?

I have a dataset like

 words c1 c2 c3 c4 ally 21 32 45 65 ally traitor 56 76 45 67 barriar overcome 87 98 21 76 barriar overcome instruction 32 42 12 32 cat 67 36 43 85 cat pat 18 92 18 27 cat pa 82 54 21 34

I want to group these data by words if they have similar words and calculate their sum of c1 c2 c3 c4 seperately.

For example, I want 'the first level' group calculation if the first word in the first column are similar. that is to say, something like:

 first level c1 c2 c3 c4 ally 77 108 90 132 barriar 119 140 33 108 cat 167 182 82 146

How do I do this? I come up with some ideas about 'like' operator, but there are thousands of words in the first column, I don't know whether there is some loop that I can do with this process, can some one help me out?

Accepted Solutions
Solution
‎11-21-2016 12:50 PM
Occasional Contributor
Posts: 9

## Re: how do I group data and calculate sum using 'like' operator?

[ Edited ]

Hi hx,

I have a solution that fits the data you have presented, however it is dependent on you wanting to group your data based on a match of the first word in the "words" variable (you could edit the wordmtch variable to include the first two words, or write a macro to change this for you, if you also require that) and I'm sure there is a better way to go about doing this, but maybe it will help you.

``````data have;
input words \$ 1-28 c1 c2 c3 c4;
wordmtch = scan(words,1);
cards;
ally                         21 32 45 65
ally traitor                 56 76 45 67
barriar overcome             87 98 21 76
barriar overcome instruction 32 42 12 32
cat                          67 36 43 85
cat pat                      18 92 18 27
cat pa                       82 54 21 34
;
run;

data want;
set have;
by wordmtch;
retain c1tot c2tot c3tot c4tot;
if first.wordmtch then do
c1tot = c1;
c2tot = c2;
c3tot = c3;
c4tot = c4;
end;
else do;
c1tot = c1tot+c1;
c2tot = c2tot+c2;
c3tot = c3tot+c3;
c4tot = c4tot+c4;
end;
if last.wordmtch then output;
drop c1 c2 c3 c4 words;
rename c1tot = c1 c2tot = c2 c3tot = c3 c4tot = c4 wordmtch = words;
run;``````

Hope this helps,

Lee

All Replies
Solution
‎11-21-2016 12:50 PM
Occasional Contributor
Posts: 9

## Re: how do I group data and calculate sum using 'like' operator?

[ Edited ]

Hi hx,

I have a solution that fits the data you have presented, however it is dependent on you wanting to group your data based on a match of the first word in the "words" variable (you could edit the wordmtch variable to include the first two words, or write a macro to change this for you, if you also require that) and I'm sure there is a better way to go about doing this, but maybe it will help you.

``````data have;
input words \$ 1-28 c1 c2 c3 c4;
wordmtch = scan(words,1);
cards;
ally                         21 32 45 65
ally traitor                 56 76 45 67
barriar overcome             87 98 21 76
barriar overcome instruction 32 42 12 32
cat                          67 36 43 85
cat pat                      18 92 18 27
cat pa                       82 54 21 34
;
run;

data want;
set have;
by wordmtch;
retain c1tot c2tot c3tot c4tot;
if first.wordmtch then do
c1tot = c1;
c2tot = c2;
c3tot = c3;
c4tot = c4;
end;
else do;
c1tot = c1tot+c1;
c2tot = c2tot+c2;
c3tot = c3tot+c3;
c4tot = c4tot+c4;
end;
if last.wordmtch then output;
drop c1 c2 c3 c4 words;
rename c1tot = c1 c2tot = c2 c3tot = c3 c4tot = c4 wordmtch = words;
run;``````

Hope this helps,

Lee

Occasional Contributor
Posts: 19

## Re: how do I group data and calculate sum using 'like' operator?

Thank you Lee!

You gave me a great idea on how to dealing with this issue and it worked. I will try some macro or adjustment based on your code to fit my original data.

Super Contributor
Posts: 285

## Re: how do I group data and calculate sum using 'like' operator?

Using PROC SUMMARY will also work.

data have;
input words \$ 1-28 c1 c2 c3 c4;
wordmtch = scan(words,1);
cards;
ally 21 32 45 65
ally traitor 56 76 45 67
barriar overcome 87 98 21 76
barriar overcome instruction 32 42 12 32
cat 67 36 43 85
cat pat 18 92 18 27
cat pa 82 54 21 34
;
run;
proc sort;
by wordmtch;
proc summary data=have;
by wordmtch;
var c1 c2 c3 c4;
output out=have2
sum=;
run;
proc print data=have2;
run;

Occasional Contributor
Posts: 19

## Re: how do I group data and calculate sum using 'like' operator?

thank you greggB, the thing is I have almost 10 thousands observations for the 'words', and many of them have some typos. for example, the word 'cat' may be typed as 'catt' and I still want them to be treated as the same, should I include 'like' operator somewhere?
☑ This topic is solved.