Re: How to count in multiple columns with Proc Freq

muchkin · Posted 06-15-2015 10:17 PM

Hi, I am using SAS EG 5.1 at work. I have a data base like below, with 14 columns. 1st column is unique IDs , and the rest are Instructions, which are in form of codes (numbers between 0 - 200) . Is there a way I can calculate the frequency of occurrence of a specific code value (for eg 30) in the whole instruction 1 : instruction 13 grid collectively. I have tried Proc freq, but that gives me tables for single columns separately.

I have previously done this in excel using the countif (range, criteria) code where I have set the range to the whole grid, and criteria as the value of the specific code.

Is there a SAS EG/BASE equivalent to the above function , where I can find the collective frequency of a specific number in a range of columns.

ID

instruction 1

instruction2

instruction 3

instruction 4

instruction 10

instruction 11

instruction 12

instruction 13

Thank you for the help

hbi · Posted 09-10-2015 04:29 PM

Give this a try. It does not use PROC FREQ, but I believe this technique performs more efficiently than using multiple PROC FREQs.

/* stack the output first and get counts for each column */
PROC SQL;
  CREATE TABLE instruction_merged AS 
  SELECT           instruction_1  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_1
  UNION ALL SELECT instruction_2  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_2
  UNION ALL SELECT instruction_3  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_3
  UNION ALL SELECT instruction_4  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_4
  UNION ALL SELECT instruction_5  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_5
  UNION ALL SELECT instruction_6  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_6
  UNION ALL SELECT instruction_7  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_7
  UNION ALL SELECT instruction_8  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_8
  UNION ALL SELECT instruction_9  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_9
  UNION ALL SELECT instruction_10 AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_10
  UNION ALL SELECT instruction_11 AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_11
  UNION ALL SELECT instruction_12 AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_12
  UNION ALL SELECT instruction_13 AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_13;
QUIT;
 
 
/* get sum of sums */
PROC SQL;
  CREATE TABLE instruction_counts AS 
  SELECT instruction_code, SUM(group_frequency) AS total_frequency
  FROM instruction_merged
  GROUP BY instruction_code
  ORDER BY instruction_code;
QUIT;

MikeZdeb · Posted 09-10-2015 05:32 PM

Hi. If you don't mind creating another data set, you could transpose your data prior to PROC FREQ ...

* create some data, 100 observations, 14 variables (an ID plus 13 instructions)

data x;
array ins(13);
do id=1 to 100;
do j=1 to 13;
ins(j) = int(200*ranuni(1234));
end;
output;
end;
keep i: ;
run;

* create a new data set with only two variables, INSTR (instructions 1 to 13) and COL1 (value of instructions 1 to 13);
proc transpose data=x out=y (drop=id) name=instr ;
var ins1-ins13;
by id;
run;

* use PROC FREQ on the new data set;

proc freq data=y;
table col1*instr / norow nocol nopercent;
run;

Hopefully you won't mind that instructions 1 and then 10 through 13 appear in the first 5 columns. If it's an issue, you could always change the values of variable INSTR in data set Y (make INSTR1 -> INSTR01, INSTR2 -> INSTR02, etc.).

Reeza · Posted 09-10-2015 05:43 PM

Same method as above, but using EG Tasks:

In EG use the Transpose Task to flip the table so that you have:

ID Instruction Value

1 1 1

1 2 23

1 3 23

Use One Way Freq Task on the resulting table

MikeZdeb · Posted 09-10-2015 05:54 PM

Hi, sugestion for renaming the variables in my previous post (insert just after PROC TRANSPOSE)

data y;
set y;
if length(instr) eq 4 then instr = catt('ins0',char(instr,4));
run;

That would convert INS1 to INS01, INS2 to INS02, etc. and the table columns would be in variable number order.

How to count in multiple columns with Proc Freq