AliRKM
Obsidian | Level 7

Hi all,

 

I want to create a single table comparing variables between 2 groups, including the results of a chi-square test.

This code from @ballardw solves part of the problem:

proc tabulate data=sashelp.cars;
class make ;
class type;
class origin;
tables type origin,
(all='All makes' make)*( rowpctn n)
;
run;

Is there a way to add a chisq test directly to the output - without having to run a proc freq and ods output OneWayChiSq=chisq?

 

ballardw
Super User

Proc tabulate has no chi-square statistic available. So no on that part.

 

And unless your data are very clean with no missing values, Proc Tabulate would likely be a poor choice if you have more than 2 class variables, because by default a missing value for any class variable removes the record from the whole table. And use of the / MISSING option could create categories that you don't want in some of the table. (Ask me how I know this).
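To see the missing-value behaviour described above, here is a minimal sketch. It assumes the SASHELP.HEART sample data, where Smoking_Status and Weight_Status contain some missing values; the choice of data set and variables is mine, not from the original post.

```sas
/* Default: a record with a missing value in ANY class variable
   is dropped from the entire table. */
proc tabulate data=sashelp.heart;
   class sex smoking_status weight_status;
   tables smoking_status weight_status, (all sex)*(n colpctn);
run;

/* With / MISSING on the CLASS statement, missing becomes its own
   category, so no records are dropped, but a "missing" row now
   appears for each variable listed on that statement. */
proc tabulate data=sashelp.heart;
   class sex;
   class smoking_status weight_status / missing;
   tables smoking_status weight_status, (all sex)*(n colpctn);
run;
```

Comparing the N values in the All column between the two runs shows how many records the default behaviour removed.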

 

You can create multiple outputs of chi-square tests in proc freq.

One way:

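/* Capture the crosstab cells and the chi-square results as data sets */
ods output crosstabfreqs = freqtables;
ods output chisq = chisqtables;
proc freq data = sashelp.cars;
   /* make*(type origin) requests two tables: make*type and make*origin */
   tables make *(type origin) /chisq;
run;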

If you look closely at those output sets you'll see LOTS of missing values because of the way the set is structured.

You might be able to merge the chi-sq statistics you want with your existing data and display them as a VAR variable with an appropriate statistic, like Max, but the actual details might be fun.

 

Or combine the counts from the crosstab output with the chi-square statistic using the Table variable as a matching variable. Caution: be careful about exactly which rows you want from either set.
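A rough sketch of that matching step follows. It is not the code from my project, and it assumes the usual variable names in the ODS output sets (Table, _TYPE_, Frequency and RowPercent in CrossTabFreqs; Table, Statistic, DF, Value and Prob in ChiSq). Run proc contents on your own output sets to confirm before relying on it. The variables origin, type and drivetrain are just convenient choices from sashelp.cars.

```sas
ods output crosstabfreqs=freqtables chisq=chisqtables;
proc freq data=sashelp.cars;
   tables origin*(type drivetrain) / chisq;
run;

proc sql;
   create table cells_with_chisq as
   select f.*,
          c.DF,
          c.Value as ChiSq,
          c.Prob  as PValue
   from freqtables as f
        left join chisqtables as c
        /* keep only the Pearson chi-square row, not LR, MH, Phi, etc. */
        on f.Table = c.Table and c.Statistic = 'Chi-Square'
   /* _TYPE_ = '11' should be the interior cells; other values are
      row, column and grand totals */
   where f._type_ = '11';
quit;

proc print data=cells_with_chisq (obs=10);
run;
```

Each cell row now carries the statistic, DF and p-value of its table, so a report step can pick them up with something like a Max statistic or a summary line.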

 

This is one of the cases where you get to think about what you actually need and then assemble the bits yourself.

 

I have recently done something along these lines: I created a bunch of CALL Execute statements to build a display table without the chi-square statistics, and then added text afterwards to display the statistic, df and p-value as desired. You could probably generate Proc Report code with a LINE statement to display the chi-sq information.

 

AliRKM
Obsidian | Level 7
Thanks. I do indeed have lots of missing values.
I did this originally using code similar to what you have shown here
but I have about 40 variables so manually creating the output I want from
the freqtables and chisqtables is very onerous. Was hoping for a shortcut.

My best bet is probably proc tabulate output matched to proc freq output.

CALL Execute statements are above my paygrade 😞
ballardw
Super User

I am attaching part of a program that matches the chi-sq statistic only with a subset of the proc freq crosstabs, creating a data set that can be used to drive call execute.

 

The multiple tables statements in proc freq were grouping concepts needed in the report. In case you haven't run into the Tables syntax for multiple crosses: A*(B C) produces tables A*B and A*C. I use that shorthand to get one variable crossed with a specific subset of other variables, so I only get the tables I want. If you do something like Tables (A B C) * (X Y Z) then you get each element in the first parentheses crossed with every element of the second set. You can also use any of the SAS variable list types as shorthand. These can create a lot of output. You have been warned.
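A small illustration of the two grouping forms, using variables from sashelp.cars as stand-ins for A, B, C and X, Y, Z:

```sas
proc freq data=sashelp.cars;
   /* one variable crossed with a chosen subset:
      produces origin*type and origin*drivetrain */
   tables origin*(type drivetrain) / chisq;

   /* every variable in the first set crossed with every variable
      in the second: produces 2 x 2 = 4 tables */
   tables (origin drivetrain)*(type cylinders) / chisq;
run;
```

With 40 variables in the second set, the first form alone gives 40 tables, which is where the ODS OUTPUT data sets become more useful than the printed results.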

 

If you need a three or more dimension analysis you are on your own as this could take a lot of time to customize.

 

The data step with all the call execute statements could be changed to PUT instead of call execute. If you have a FILE statement then the put would create a code file you could examine and run bits at a time instead of the whole thing.
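As a sketch of the PUT approach (not the attached program): the data set pairs and its variables rowvar and colvar are made up for this example, and the generated code is deliberately trivial.

```sas
/* Hypothetical driver data set: one row per table wanted */
data pairs;
   length rowvar colvar $32;
   input rowvar colvar;
   datalines;
origin type
origin drivetrain
;
run;

filename gen temp;   /* temporary file to hold the generated code */

data _null_;
   set pairs;
   file gen;
   /* write one proc freq step per row of the driver data set */
   put 'proc freq data=sashelp.cars;';
   put '   tables ' rowvar '*' colvar '/ chisq;';
   put 'run;';
   /* The CALL EXECUTE equivalent would be:
      call execute(catx(' ', 'proc freq data=sashelp.cars; tables',
                        rowvar, '*', colvar, '/ chisq; run;'));      */
run;

/* Run the generated code; SOURCE2 echoes it to the log so it can
   be inspected */
%include gen / source2;
```

Pointing the FILENAME at a permanent .sas file instead of TEMP lets you open the generated program and run pieces of it by hand.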

However, the code is not targeted at your data or output needs. It may give you enough hints to modify it as needed.

In my case I only wanted the cell frequencies and the row percentages for the tables I built. And only used the Chi-sq statistic row, none of the MH or other stats.

The display driver data set parses the elements of the data to pull out the variable names and labels (you do provide meaningful labels for variables, don't you?) for use in the proc tabulate code that displays the results for each pair of variables. I build the lines of code in a variable called Lstr (long string) because some of the pieces are easier to put together outside of a call execute statement and just plain could not be done with a Put statement.

The Proc ODSTEXT is to create a paragraph after each table with the Chi-square p-value. The DF and statistic would be available if you want them.
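For a single pair of variables, the table-plus-paragraph idea could look roughly like this. It is a hand-written sketch, not the generated code; the macro variable name pval and the format used for the p-value are my choices, and it assumes the ChiSq output set has Statistic and Prob variables.

```sas
/* Get the chi-square results without printing the proc freq tables */
ods select none;
ods output chisq=chisq_one;
proc freq data=sashelp.cars;
   tables origin*type / chisq;
run;
ods select all;

/* Store the Pearson chi-square p-value in a macro variable */
proc sql noprint;
   select put(Prob, pvalue6.4) into :pval trimmed
   from chisq_one
   where Statistic = 'Chi-Square';
quit;

/* Display table */
proc tabulate data=sashelp.cars;
   class type origin;
   tables type, (all='All origins' origin)*(n colpctn);
run;

/* Paragraph placed directly after the table */
proc odstext;
   p "Chi-square p-value: &pval";
run;
```

The DF and Value columns of the same row can be pulled into additional macro variables in the same select if you want to show them too.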

 

This sort of restructuring is not a trivial exercise and there are many ways to approach it. Experience gives you some idea of approaches for what you need in a specific instance but you have to do a lot of problems to get that sort of experience.

 

Note that I sandwich all of the call execute created statements in an ODS RTF output as that is what my project wanted. You may not need that, or you may want PDF instead. Warning: sending this output to Excel has a high likelihood of ugly output in the form of word-wrapped narrow columns and possibly a very large number of tabs.
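The sandwich itself is just an ODS destination opened before the generated steps and closed after them. A minimal sketch, with a made-up file name and a single hand-written table standing in for the generated code:

```sas
/* Hypothetical file name; add a path you can write to */
ods rtf file="chisq_tables.rtf";

proc tabulate data=sashelp.cars;
   class type origin;
   tables type, (all='All origins' origin)*(n colpctn);
run;

ods rtf close;
```

For PDF, replace both ods rtf statements with ods pdf and change the file extension.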

AliRKM
Obsidian | Level 7

Thank you. I did not realise how complicated this is. I am really new to SAS so it will probably take me longer to understand the code you kindly provided than to create the table manually 😞

So much to learn.
