DATA Step, Macro, Functions and more

How to count in multiple columns with Proc Freq

Reply
New Contributor
Posts: 3

How to count in multiple columns with Proc Freq

Hi, I am using SAS EG 5.1 at work. I have a data base like below, with 14 columns. 1st column is unique IDs , and the rest are Instructions, which are in form of codes (numbers between 0 - 200) . Is there a way I can calculate the frequency of occurrence of a specific code value (for eg 30) in the whole instruction 1 : instruction 13 grid collectively. I have tried Proc freq, but that gives me tables for single columns separately.

I have previously done this in excel using the countif (range, criteria) code where I have set the range to the whole grid, and criteria as the value of the specific code.

Is there a SAS EG/BASE equivalent to the above function , where I can find the collective frequency of a specific number in a range of columns.

IDinstruction 1instruction2instruction 3instruction 4instruction 10
instruction 11
instruction 12
instruction 13

Thank you for the help Smiley Happy

Contributor hbi
Contributor
Posts: 66

Re: How to count in multiple columns with Proc Freq

Give this a try. It does not use PROC FREQ, but I believe this technique performs more efficiently than using multiple PROC FREQs.

/* stack the output first and get counts for each column */
PROC SQL;
  CREATE TABLE instruction_merged AS 
  SELECT           instruction_1  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_1
  UNION ALL SELECT instruction_2  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_2
  UNION ALL SELECT instruction_3  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_3
  UNION ALL SELECT instruction_4  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_4
  UNION ALL SELECT instruction_5  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_5
  UNION ALL SELECT instruction_6  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_6
  UNION ALL SELECT instruction_7  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_7
  UNION ALL SELECT instruction_8  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_8
  UNION ALL SELECT instruction_9  AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_9
  UNION ALL SELECT instruction_10 AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_10
  UNION ALL SELECT instruction_11 AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_11
  UNION ALL SELECT instruction_12 AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_12
  UNION ALL SELECT instruction_13 AS instruction_code, COUNT(id) AS group_frequency 
  FROM work.your_dataset GROUP BY instruction_13;
QUIT;
 
 
/* get sum of sums */
PROC SQL;
  CREATE TABLE instruction_counts AS 
  SELECT instruction_code, SUM(group_frequency) AS total_frequency
  FROM instruction_merged
  GROUP BY instruction_code
  ORDER BY instruction_code;
QUIT;
Valued Guide
Posts: 765

Re: How to count in multiple columns with Proc Freq

Hi.  If you don't mind creating another data set, you could transpose your data prior to PROC FREQ ...

 

* create some data, 100 observations, 14 variables (an ID plus 13 instructions)

data x;
array ins(13);
do id=1 to 100;
do j=1 to 13;
ins(j) = int(200*ranuni(1234));
end;
output;
end;
keep i: ;
run;

 

* create a new data set with only two variables, INSTR (instructions 1 to 13) and COL1 (value of instructions 1 to 13);
proc transpose data=x out=y (drop=id) name=instr ;
var ins1-ins13;
by id;
run;

 

* use PROC FREQ on the new data set;

proc freq data=y;
table col1*instr / norow nocol nopercent;
run;

 

Hopefully you won't mind that instructions 1 and then 10 through 13 appear in the first 5 columns.  If it's an issue, you could always change the values of variable INSTR in data set Y (make INSTR1 -> INSTR01, INSTR2 -> INSTR02, etc.).

Super User
Posts: 17,774

Re: How to count in multiple columns with Proc Freq

Same method as above, but using EG Tasks:

 

In EG use the Transpose Task to flip the table  so that you have:

 

ID Instruction Value

1 1 1

1 2 23

1 3 23

 

Use One Way Freq Task on the resulting table

Valued Guide
Posts: 765

Re: How to count in multiple columns with Proc Freq

Hi, sugestion for renaming the variables in my previous post (insert just after PROC TRANSPOSE)

 

data y;
set y;
if length(instr) eq 4 then instr = catt('ins0',char(instr,4));
run;

 

That would convert INS1 to INS01, INS2 to INS02, etc. and the table columns would be in variable number order.

Ask a Question
Discussion stats
  • 4 replies
  • 216 views
  • 0 likes
  • 4 in conversation