Hi, I'm trying to figure out how I can sum across columns and check to see which is highest (or sort them).
Here's what I mean.
I have the data
Acct Prov Field_1 Field_2 Msg1 Msg2 Msg3 ..... Msg999
123 ON A 123 1 1 0 0
234 ON B 124 0 1 1 1
235 BC A 154 1 1 1 0
256 YK D 146 0 0 1 1
938 MB F 156 0 1 1 1
Pretend I have 100,000 records, I want to see by addition which message pops up to most from Msg1 to Msg999 (may be some missing like Msg 903, etc. is not a columns.
for Example Msg1 : 234 times
Msg2 : 3484 times,
Msg3 : 23847 times,
Msg4 : 1 time
and so on...
Thanks.
Use STACKODS in combination with PROC MEANS:
proc means data=fake stackods sum mean;
var msg1-msg100;
ods output summary=want;
run;
proc sort data=want;
by descending sum;
run;
proc means data=yourdataset sum;
var msg: ;
run;
the msg: the : denotes a variable list of all variables whose names start with msg.
One of the very nice things of 0/1 coded variables is that the sum = count of values = 1. The mean gives you a percentage if values=1.
With so many variables, you probably want the machine to find the largest values. For example:
proc summary data=have;
var msg: ;
output out=sums (keep=msg: ) sum=;
run;
data want;
set sums;
array counts {*} msg: ;
do i=1 to dim(counts);
variable = vname(counts{i});
count = counts{i};
output;
end;
keep variable count;
run;
proc sort data=want;
by descending count;
run;
proc print data=want;
var variable count;
run;
Use STACKODS in combination with PROC MEANS:
proc means data=fake stackods sum mean;
var msg1-msg100;
ods output summary=want;
run;
proc sort data=want;
by descending sum;
run;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.