BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Angi
Obsidian | Level 7

I have a dataset that contains policy violations by students. The type of violation is coded from 1 to 40. Some students have only one violation while others have up to 13 violations.

I am trying to get a frequency count across variables for the different types of violations that have occurred.

Dataset Have

id

violation1

violation2

violation3

violation4

violation5

A1

1

2

6

A2

5

3

1

1

3

A3

7

1

4

1

5

A4

2

8

1

2

A5

1

7

2

8

6

Output Want

Violation_type

frequency

1

7

2

4

3

2

4

1

5

2

6

2

7

2

8

2

Thanks in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Well, several ways of doing this.  You could for instance have an array of vio1-vio7 and set them to 1 where they exist and freq that.  My preferred method would be to normalise (vertical) the data, then do a simple count on that:

data have;

  id="A1"; violation1=1; violation2=2; violation3=6; output;

  id="A2"; violation1=5; violation2=3; violation3=1; output;

  id="A3"; violation1=7; violation2=1; violation3=4; output;

run;

proc transpose data=have out=inter;

  by id;

  var violation1-violation3;

run;

proc sql;

  create table WANT as

  select  COL1 as VIOLATION_TYPE,

          COUNT(COL1) as FREQUENCY

  from    WORK.INTER

  group by COL1;

quit;

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Well, several ways of doing this.  You could for instance have an array of vio1-vio7 and set them to 1 where they exist and freq that.  My preferred method would be to normalise (vertical) the data, then do a simple count on that:

data have;

  id="A1"; violation1=1; violation2=2; violation3=6; output;

  id="A2"; violation1=5; violation2=3; violation3=1; output;

  id="A3"; violation1=7; violation2=1; violation3=4; output;

run;

proc transpose data=have out=inter;

  by id;

  var violation1-violation3;

run;

proc sql;

  create table WANT as

  select  COL1 as VIOLATION_TYPE,

          COUNT(COL1) as FREQUENCY

  from    WORK.INTER

  group by COL1;

quit;

Angi
Obsidian | Level 7

Thank you RW9, Kurt, and Hai!

Kurt_Bremser
Super User

A different approach, verticalizing the data in a data step and using freq for the count:

data have;

infile cards;

input

  id :$2.

  violation1

  violation2

  violation3

  violation4

  violation5

;

cards;

A1 1 2 6 . .

A2 5 3 1 1 3

A3 7 1 4 1 5

A4 2 8 1 2 .

A5 1 7 2 8 6

;

run;

data int (keep=violation_type);

set have;

array viols {*} _numeric_;

do i = 1 to dim(viols);

  if viols{i} ne .

  then do;

    violation_type = viols{i};

    output;

  end;

end;

run;

proc freq data=int noprint;

tables violation_type /out=want(drop=percent);

run;

Haikuo
Onyx | Level 15

As for a one-step approach, Hash can be an option: (Raw data stole from Kurt)

data have;

     infile cards;

     input

           id :$2.

           violation1

           violation2

           violation3

           violation4

           violation5

     ;

     cards;

A1 1 2 6 . .

A2 5 3 1 1 3

A3 7 1 4 1 5

A4 2 8 1 2 .

A5 1 7 2 8 6

;

run;

data _null_;

     if _n_=1 then

           do;

                dcl hash h(ordered:'a');

                h.definekey('vtype');

                h.definedata('vtype', 'freq');

                h.definedone();

                call missing (vtype, freq);

           end;

     set have end=last;

     array v violation:;

     do over v;

           if not missing(v) then

                do;

                     vtype=v;

                     if h.find() ne 0 then

                           do;

                                freq=1;

                                rc=h.replace();

                           end;

                     else

                           do;

                                freq+1;

                                rc=h.replace();

                           end;

                end;

     end;

     if last then

           rc=h.output(dataset:'want');

run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 3092 views
  • 12 likes
  • 4 in conversation