DATA Step, Macro, Functions and more

how do I group data and calculate sum using 'like' operator?

Accepted Solution Solved
Reply
Occasional Contributor hx
Occasional Contributor
Posts: 17
Accepted Solution

how do I group data and calculate sum using 'like' operator?

I have a dataset like 

wordsc1c2c3c4
ally21324565
ally traitor56764567
barriar overcome87982176
barriar overcome instruction32421232
cat67364385
cat pat18921827
cat pa82542134

I want to group these data by words if they have similar words and calculate their sum of c1 c2 c3 c4 seperately. 

For example, I want 'the first level' group calculation if the first word in the first column are similar. that is to say, something like:

first levelc1c2c3c4
ally7710890132
barriar11914033108
cat16718282146

How do I do this? I come up with some ideas about 'like' operator, but there are thousands of words in the first column, I don't know whether there is some loop that I can do with this process, can some one help me out?


Accepted Solutions
Solution
‎11-21-2016 12:50 PM
Occasional Contributor
Posts: 7

Re: how do I group data and calculate sum using 'like' operator?

[ Edited ]

Hi hx,

 

I have a solution that fits the data you have presented, however it is dependent on you wanting to group your data based on a match of the first word in the "words" variable (you could edit the wordmtch variable to include the first two words, or write a macro to change this for you, if you also require that) and I'm sure there is a better way to go about doing this, but maybe it will help you.

 

data have;
input words $ 1-28 c1 c2 c3 c4;
wordmtch = scan(words,1);
cards;
ally                         21 32 45 65
ally traitor                 56 76 45 67
barriar overcome             87 98 21 76
barriar overcome instruction 32 42 12 32
cat                          67 36 43 85
cat pat                      18 92 18 27
cat pa                       82 54 21 34
;
run;


data want;
set have;
by wordmtch;
retain c1tot c2tot c3tot c4tot;
if first.wordmtch then do
c1tot = c1;
c2tot = c2;
c3tot = c3;
c4tot = c4;
end;
else do;
c1tot = c1tot+c1;
c2tot = c2tot+c2;
c3tot = c3tot+c3;
c4tot = c4tot+c4;
end;
if last.wordmtch then output;
drop c1 c2 c3 c4 words;
rename c1tot = c1 c2tot = c2 c3tot = c3 c4tot = c4 wordmtch = words;
run;

Hope this helps,

 

Lee

View solution in original post


All Replies
Solution
‎11-21-2016 12:50 PM
Occasional Contributor
Posts: 7

Re: how do I group data and calculate sum using 'like' operator?

[ Edited ]

Hi hx,

 

I have a solution that fits the data you have presented, however it is dependent on you wanting to group your data based on a match of the first word in the "words" variable (you could edit the wordmtch variable to include the first two words, or write a macro to change this for you, if you also require that) and I'm sure there is a better way to go about doing this, but maybe it will help you.

 

data have;
input words $ 1-28 c1 c2 c3 c4;
wordmtch = scan(words,1);
cards;
ally                         21 32 45 65
ally traitor                 56 76 45 67
barriar overcome             87 98 21 76
barriar overcome instruction 32 42 12 32
cat                          67 36 43 85
cat pat                      18 92 18 27
cat pa                       82 54 21 34
;
run;


data want;
set have;
by wordmtch;
retain c1tot c2tot c3tot c4tot;
if first.wordmtch then do
c1tot = c1;
c2tot = c2;
c3tot = c3;
c4tot = c4;
end;
else do;
c1tot = c1tot+c1;
c2tot = c2tot+c2;
c3tot = c3tot+c3;
c4tot = c4tot+c4;
end;
if last.wordmtch then output;
drop c1 c2 c3 c4 words;
rename c1tot = c1 c2tot = c2 c3tot = c3 c4tot = c4 wordmtch = words;
run;

Hope this helps,

 

Lee

Occasional Contributor hx
Occasional Contributor
Posts: 17

Re: how do I group data and calculate sum using 'like' operator?

Thank you Lee!

You gave me a great idea on how to dealing with this issue and it worked. I will try some macro or adjustment based on your code to fit my original data. 

Super Contributor
Posts: 268

Re: how do I group data and calculate sum using 'like' operator?

Using PROC SUMMARY will also work.

 

data have;
input words $ 1-28 c1 c2 c3 c4;
wordmtch = scan(words,1);
cards;
ally 21 32 45 65
ally traitor 56 76 45 67
barriar overcome 87 98 21 76
barriar overcome instruction 32 42 12 32
cat 67 36 43 85
cat pat 18 92 18 27
cat pa 82 54 21 34
;
run;
proc sort;
by wordmtch;
proc summary data=have;
by wordmtch;
var c1 c2 c3 c4;
output out=have2
sum=;
run;
proc print data=have2;
run;

Occasional Contributor hx
Occasional Contributor
Posts: 17

Re: how do I group data and calculate sum using 'like' operator?

thank you greggB, the thing is I have almost 10 thousands observations for the 'words', and many of them have some typos. for example, the word 'cat' may be typed as 'catt' and I still want them to be treated as the same, should I include 'like' operator somewhere?
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 230 views
  • 0 likes
  • 3 in conversation