## How to count frequencies between multiple variables

Solved
Occasional Contributor
Posts: 15

# How to count frequencies between multiple variables

Hi everyone,

I am trying to generate a table of total count of values across multiple values (top 100 most frequent values).  It will be a count of a couple of multiple variables. For example, this could be an example of the data:

 ID Dx1 Dx2 Dx3 Dx4 Dx5 Dx6 1 A B C D 2 B A D 3 C D A 4 E F A D 5 G B C D E F

I would like to tally up all values between Dx1 - Dx6, and I would like to get the following frequency table:

 Dx Frequency % D 5 25 A 4 20 B 3 15 C 3 15 F 2 10 E 2 10 G 1 5

LW

Accepted Solutions
Solution
‎07-07-2015 02:25 AM
Super Contributor
Posts: 355

## Re: How to count frequencies between multiple variables

You could also try:

data wide;
infile datalines missover ;
input ID  Dx1 \$ Dx2 \$ Dx3 \$ Dx4 \$ Dx5 \$ Dx6 \$;
datalines;
1 A B C D
2 B A D
3 C D A
4 E F A D
5 G B C D E F
;
run;

Data Want (Keep=ID Dx);
Set wide;
Array DA Dx:;
Do over DA;
Dx=DA;
If not Missing (Dx);
Output;
End;
Run;

Proc Freq Data=Want;
Tables Dx/NoPrint Out=Want;
Run;

All Replies
Regular Contributor
Posts: 249

## Re: How to count frequencies between multiple variables

perhaps you could try this?

data wide;

infile datalines missover ;

input ID  Dx1 \$ Dx2 \$ Dx3 \$ Dx4 \$ Dx5 \$ Dx6 \$;

datalines;

1 A B C D

2 B A D

3 C D A

4 E F A D

5 G B C D E F

;

run;

DATA long;

SET wide;

DO i = 1 to 6 ;

else do;

OUTPUT;

END;

END;

DROP dx1 - dx6 ;

RUN;

proc freq data=long; tables dx; run;

Super User
Posts: 9,599

## Re: How to count frequencies between multiple variables

An alternative to the array:

proc transpose data=have out=inter;

var dx1-dx6;

run;

This would give you a long list of the dx variables, you can then use freq or sql counts to get the required output table.

Occasional Contributor
Posts: 15

## Re: How to count frequencies between multiple variables

Hello to both of you,

Thank you so much for your help. Is there a way to do it without having to transpose? This is a play set, but my actual dataset is over 300,000 with over 21 columns per entry to transpose.

Thank you so much!

Super User
Posts: 23,663

## Re: How to count frequencies between multiple variables

I think you'll find that many of the diagnosis will be empty so transposing isn't actually much of an issue. SAS should process 1 million rows in a few seconds on a crappy system.

Here's a way to do it with proc freq. Run proc freq with all variables and capture that table to run a proc freq on again. The final table should be what you're after.

data wide;

infile datalines missover ;

input ID  Dx1 \$ Dx2 \$ Dx3 \$ Dx4 \$ Dx5 \$ Dx6 \$;

datalines;

1 A B C D

2 B A D

3 C D A

4 E F A D

5 G B C D E F

;

run;

ods table onewayfreqs=summary1;

proc freq data=wide;

table dx:;

run;

data summary2;

set summary1;

diagnosis=coalescec(of dx1-dx6);

keep diagnosis frequency;

run;

proc freq data=summary2 order=freq;

table diagnosis/out=want;

weight frequency;

run;

proc print data=want;

run;

Solution
‎07-07-2015 02:25 AM
Super Contributor
Posts: 355

## Re: How to count frequencies between multiple variables

You could also try:

data wide;
infile datalines missover ;
input ID  Dx1 \$ Dx2 \$ Dx3 \$ Dx4 \$ Dx5 \$ Dx6 \$;
datalines;
1 A B C D
2 B A D
3 C D A
4 E F A D
5 G B C D E F
;
run;

Data Want (Keep=ID Dx);
Set wide;
Array DA Dx:;
Do over DA;
Dx=DA;
If not Missing (Dx);
Output;
End;
Run;

Proc Freq Data=Want;
Tables Dx/NoPrint Out=Want;
Run;

Super User
Posts: 9,599

## Re: How to count frequencies between multiple variables

As mentioned above, 300,00 records is nothing.  Also factor into any approach you take what you want to do with the data later.  For instance if you want to do any further summarizing, expanding of the data etc. as the normalised form is generally easier to work with.

Occasional Contributor
Posts: 15

## Re: How to count frequencies between multiple variables

Thank you so much to everyone for your help and for the different ways to solve this problem. Thank you again!

🔒 This topic is solved and locked.