Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- Programming
- /
- Create proc freq crosstab for all column combinations in a dataset

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

☑ This topic is **solved**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 02-27-2024 01:12 PM
(522 views)

Hi fellow SAS users,

I am trying to create 2x2 tables with Cramer's V to assess correlation of categorical variables. It is a very large table with many columns, but for this example, lets say:

I have a table consisting of 5 columns called A, B, C, D, and E. I want to output a proc freq with Cramer's V for each combination. I know I can do something like:

`proc freq data=dataset; table (A--E)*(A--E) / chisq; run;`

But I don't want duplicates (i.e., I do not need both the A*B and B*A crosstabs). I do not have much experience writing arrays, but I think it should be possible to accomplish this task using one. The real table has 30+ columns, so typing out all the combinations to avoid duplications isn't practical.

Thanks in advance for any help the community can provide! 😁

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I would re-arrange the data such that you can use a BY statement in PROC FREQ. (Alternative is to use a macro)

```
data have;
input a b c d e;
cards;
7 10 11 12 14
7 11 10 12 13
8 10 12 13 14
;
data re_arrange;
set have;
array v a--e;
do i=1 to dim(v)-1;
do j=i+1 to dim(v);
value1=v(i);
value2=v(j);
type = cats(vname(v(i)),'*',vname(v(j)));
output;
end;
end;
drop i j;
run;
proc sort data=re_arrange;
by type;
run;
proc freq data=re_arrange;
by type;
table value1*value2/chisq;
run;
```

Sending this to a text or html output would produce a very long report where you need just a tiny fraction of the results to get Cramer's V. Better would be sending the statistics to a data set, called CHISQ and then filtering the row so you get only the cases where STATISTIC="Cramer's V". Replace the PROC FREQ with this:

```
ods select none;
ods output chisq=chisq(where=(statistic=:'Cram') drop=table);
proc freq data=re_arrange;
by type;
table value1*value2/chisq;
run;
```

If all the variables have labels (which I would strongly recommend), you could then use the VLABEL() function instread of the VNAME() function.

--

Paige Miller

Paige Miller

6 REPLIES 6

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I would re-arrange the data such that you can use a BY statement in PROC FREQ. (Alternative is to use a macro)

```
data have;
input a b c d e;
cards;
7 10 11 12 14
7 11 10 12 13
8 10 12 13 14
;
data re_arrange;
set have;
array v a--e;
do i=1 to dim(v)-1;
do j=i+1 to dim(v);
value1=v(i);
value2=v(j);
type = cats(vname(v(i)),'*',vname(v(j)));
output;
end;
end;
drop i j;
run;
proc sort data=re_arrange;
by type;
run;
proc freq data=re_arrange;
by type;
table value1*value2/chisq;
run;
```

Sending this to a text or html output would produce a very long report where you need just a tiny fraction of the results to get Cramer's V. Better would be sending the statistics to a data set, called CHISQ and then filtering the row so you get only the cases where STATISTIC="Cramer's V". Replace the PROC FREQ with this:

```
ods select none;
ods output chisq=chisq(where=(statistic=:'Cram') drop=table);
proc freq data=re_arrange;
by type;
table value1*value2/chisq;
run;
```

If all the variables have labels (which I would strongly recommend), you could then use the VLABEL() function instread of the VNAME() function.

--

Paige Miller

Paige Miller

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@irvinery wrote:

Hi fellow SAS users,

I am trying to create 2x2 tables with Cramer's V to assess correlation of categorical variables. It is a very large table with many columns, but for this example, lets say:

I have a table consisting of 5 columns called A, B, C, D, and E. I want to output a proc freq with Cramer's V for each combination. I know I can do something like:

`proc freq data=dataset; table (A--E)*(A--E) / chisq; run;`

But I don't want duplicates (i.e., I do not need both the A*B and B*A crosstabs). I do not have much experience writing arrays, but I think it should be possible to accomplish this task using one. The real table has 30+ columns, so typing out all the combinations to avoid duplications isn't practical.

Thanks in advance for any help the community can provide! 😁

Since Arrays only exist for the duration of a data step or Proc Iml I am not sure how you are thinking of using this.

Unless you are thinking of placing the the names of the variables into an array and using one of the combinatorial functions to write the code.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@irvinery it seems to me it is wrong to mark that as the correct answer, as I have already shown how to use arrays in this problem.

--

Paige Miller

Paige Miller

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi Paige,

I didn't select that as the right answer, so I"m not sure how that happened.

I didn't select that as the right answer, so I"m not sure how that happened.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Done! Thanks again for the help.

**Available on demand!**

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.