BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Angi
Obsidian | Level 7

I have a dataset that contains policy violations by students. The type of violation is coded from 1 to 40. Some students have only one violation while others have up to 13 violations.

I am trying to get a frequency count across variables for the different types of violations that have occurred.

Dataset Have

id

violation1

violation2

violation3

violation4

violation5

A1

1

2

6

A2

5

3

1

1

3

A3

7

1

4

1

5

A4

2

8

1

2

A5

1

7

2

8

6

Output Want

Violation_type

frequency

1

7

2

4

3

2

4

1

5

2

6

2

7

2

8

2

Thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Well, several ways of doing this.  You could for instance have an array of vio1-vio7 and set them to 1 where they exist and freq that.  My preferred method would be to normalise (vertical) the data, then do a simple count on that:

data have;

  id="A1"; violation1=1; violation2=2; violation3=6; output;

  id="A2"; violation1=5; violation2=3; violation3=1; output;

  id="A3"; violation1=7; violation2=1; violation3=4; output;

run;

proc transpose data=have out=inter;

  by id;

  var violation1-violation3;

run;

proc sql;

  create table WANT as

  select  COL1 as VIOLATION_TYPE,

          COUNT(COL1) as FREQUENCY

  from    WORK.INTER

  group by COL1;

quit;

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Well, several ways of doing this.  You could for instance have an array of vio1-vio7 and set them to 1 where they exist and freq that.  My preferred method would be to normalise (vertical) the data, then do a simple count on that:

data have;

  id="A1"; violation1=1; violation2=2; violation3=6; output;

  id="A2"; violation1=5; violation2=3; violation3=1; output;

  id="A3"; violation1=7; violation2=1; violation3=4; output;

run;

proc transpose data=have out=inter;

  by id;

  var violation1-violation3;

run;

proc sql;

  create table WANT as

  select  COL1 as VIOLATION_TYPE,

          COUNT(COL1) as FREQUENCY

  from    WORK.INTER

  group by COL1;

quit;

Angi
Obsidian | Level 7

Thank you RW9, Kurt, and Hai!

Kurt_Bremser
Super User

A different approach, verticalizing the data in a data step and using freq for the count:

data have;

infile cards;

input

  id :$2.

  violation1

  violation2

  violation3

  violation4

  violation5

;

cards;

A1 1 2 6 . .

A2 5 3 1 1 3

A3 7 1 4 1 5

A4 2 8 1 2 .

A5 1 7 2 8 6

;

run;

data int (keep=violation_type);

set have;

array viols {*} _numeric_;

do i = 1 to dim(viols);

  if viols{i} ne .

  then do;

    violation_type = viols{i};

    output;

  end;

end;

run;

proc freq data=int noprint;

tables violation_type /out=want(drop=percent);

run;

Haikuo
Onyx | Level 15

As for a one-step approach, Hash can be an option: (Raw data stole from Kurt)

data have;

     infile cards;

     input

           id :$2.

           violation1

           violation2

           violation3

           violation4

           violation5

     ;

     cards;

A1 1 2 6 . .

A2 5 3 1 1 3

A3 7 1 4 1 5

A4 2 8 1 2 .

A5 1 7 2 8 6

;

run;

data _null_;

     if _n_=1 then

           do;

                dcl hash h(ordered:'a');

                h.definekey('vtype');

                h.definedata('vtype', 'freq');

                h.definedone();

                call missing (vtype, freq);

           end;

     set have end=last;

     array v violation:;

     do over v;

           if not missing(v) then

                do;

                     vtype=v;

                     if h.find() ne 0 then

                           do;

                                freq=1;

                                rc=h.replace();

                           end;

                     else

                           do;

                                freq+1;

                                rc=h.replace();

                           end;

                end;

     end;

     if last then

           rc=h.output(dataset:'want');

run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 3362 views
  • 12 likes
  • 4 in conversation