Hi, I'm trying to figure out how I can sum across columns and check to see which is highest (or sort them).
Here's what I mean.
I have the data
Acct Prov Field_1 Field_2 Msg1 Msg2 Msg3 ..... Msg999
123 ON A 123 1 1 0 0
234 ON B 124 0 1 1 1
235 BC A 154 1 1 1 0
256 YK D 146 0 0 1 1
938 MB F 156 0 1 1 1
Pretend I have 100,000 records, I want to see by addition which message pops up to most from Msg1 to Msg999 (may be some missing like Msg 903, etc. is not a columns.
for Example Msg1 : 234 times
Msg2 : 3484 times,
Msg3 : 23847 times,
Msg4 : 1 time
and so on...
Thanks.
Use STACKODS in combination with PROC MEANS:
proc means data=fake stackods sum mean;
var msg1-msg100;
ods output summary=want;
run;
proc sort data=want;
by descending sum;
run;
proc means data=yourdataset sum;
var msg: ;
run;
the msg: the : denotes a variable list of all variables whose names start with msg.
One of the very nice things of 0/1 coded variables is that the sum = count of values = 1. The mean gives you a percentage if values=1.
With so many variables, you probably want the machine to find the largest values. For example:
proc summary data=have;
var msg: ;
output out=sums (keep=msg: ) sum=;
run;
data want;
set sums;
array counts {*} msg: ;
do i=1 to dim(counts);
variable = vname(counts{i});
count = counts{i};
output;
end;
keep variable count;
run;
proc sort data=want;
by descending count;
run;
proc print data=want;
var variable count;
run;
Use STACKODS in combination with PROC MEANS:
proc means data=fake stackods sum mean;
var msg1-msg100;
ods output summary=want;
run;
proc sort data=want;
by descending sum;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.