Solved: Re: frequency count across variables

Angi · Posted 02-26-2015 04:48 AM

I have a dataset that contains policy violations by students. The type of violation is coded from 1 to 40. Some students have only one violation while others have up to 13 violations.

I am trying to get a frequency count across variables for the different types of violations that have occurred.

Dataset Have

id	violation1	violation2	violation3	violation4	violation5
A1	1	2	6
A2	5	3	1	1	3
A3	7	1	4	1	5
A4	2	8	1	2
A5	1	7	2	8	6

Output Want

Violation_type	frequency
1	7
2	4
3	2
4	1
5	2
6	2
7	2
8	2

Thanks in advance!

RW9 · Posted 02-26-2015 05:11 AM

Hi,

Well, several ways of doing this. You could for instance have an array of vio1-vio7 and set them to 1 where they exist and freq that. My preferred method would be to normalise (vertical) the data, then do a simple count on that:

data have;

id="A1"; violation1=1; violation2=2; violation3=6; output;

id="A2"; violation1=5; violation2=3; violation3=1; output;

id="A3"; violation1=7; violation2=1; violation3=4; output;

run;

proc transpose data=have out=inter;

by id;

var violation1-violation3;

run;

proc sql;

create table WANT as

select COL1 as VIOLATION_TYPE,

COUNT(COL1) as FREQUENCY

from WORK.INTER

group by COL1;

quit;

View solution in original post

RW9 · Posted 02-26-2015 05:11 AM

Hi,

Well, several ways of doing this. You could for instance have an array of vio1-vio7 and set them to 1 where they exist and freq that. My preferred method would be to normalise (vertical) the data, then do a simple count on that:

data have;

id="A1"; violation1=1; violation2=2; violation3=6; output;

id="A2"; violation1=5; violation2=3; violation3=1; output;

id="A3"; violation1=7; violation2=1; violation3=4; output;

run;

proc transpose data=have out=inter;

by id;

var violation1-violation3;

run;

proc sql;

create table WANT as

select COL1 as VIOLATION_TYPE,

COUNT(COL1) as FREQUENCY

from WORK.INTER

group by COL1;

quit;

Angi · Posted 02-26-2015 08:33 PM

Thank you RW9, Kurt, and Hai!

Kurt_Bremser · Posted 02-26-2015 05:26 AM

A different approach, verticalizing the data in a data step and using freq for the count:

data have;

infile cards;

input

id :$2.

violation1

violation2

violation3

violation4

violation5

;

cards;

A1 1 2 6 . .

A2 5 3 1 1 3

A3 7 1 4 1 5

A4 2 8 1 2 .

A5 1 7 2 8 6

;

run;

data int (keep=violation_type);

set have;

array viols {*} _numeric_;

do i = 1 to dim(viols);

if viols{i} ne .

then do;

violation_type = viols{i};

output;

end;

run;

proc freq data=int noprint;

tables violation_type /out=want(drop=percent);

run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Haikuo · Posted 02-26-2015 08:53 AM

As for a one-step approach, Hash can be an option: (Raw data stole from Kurt)

data have;

infile cards;

input

id :$2.

violation1

violation2

violation3

violation4

violation5

;

cards;

A1 1 2 6 . .

A2 5 3 1 1 3

A3 7 1 4 1 5

A4 2 8 1 2 .

A5 1 7 2 8 6

;

run;

data _null_;

if _n_=1 then

do;

dcl hash h(ordered:'a');

h.definekey('vtype');

h.definedata('vtype', 'freq');

h.definedone();

call missing (vtype, freq);

end;

set have end=last;

array v violation:;

do over v;

if not missing(v) then

do;

vtype=v;

if h.find() ne 0 then

do;

freq=1;

rc=h.replace();

end;

else

do;

freq+1;

rc=h.replace();

end;

if last then

rc=h.output(dataset:'want');

run;

frequency count across variables

Re: frequency count across variables

Re: frequency count across variables

Re: frequency count across variables

Re: frequency count across variables

Re: frequency count across variables

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away