## Contingency tables and output

Occasional Learner
Posts: 1

# Contingency tables and output

Hi I need help with generating contingency tables, and then using ODS to output everything. I have included the question and also what I have so far, Thanks

Problem:

The SAS data set Houston.sas7bdat was obtained from the Behavioral Risk Factor Surveillance System (BRFSS) in 2012. The samples are people ≥ 18 year-old in Houston. This data set contains the following categorical variables:

sex = (1, 2) = (male, female)
race = (1, 2, 3, 4) = (White, Black, Hispanic, Others)
bmicat = (1, 2, 3, 4) = (underweight, normal, overweight, obese)
smoking = (1, 2, 3) = (non-smoker, former smoker, current smoker)
insurance = (1, 2) = (have insurance, no insurance)

1. Need to generate a data set which can exactly fill out the table below.
 Never had asthma Ever had asthma N %† N %† χ2 value P-value Adult Adult Elder Sex Male Female Race White Black Hispanic Others

† % = Row percentage

Step 1) Use a PROC FREQ to generate contingency tables with asthma in columns and adult, sex, and race in rows. Each cell only shows counts and row percentages. Have the Chi-square test for each variable, and then use ODS to generate two data sets (named freq and chisq) for cross-tables and statistics separately.

Step 2) Use a data step to clean the freq data set. Remove total frequencies and percentages (Hint: use _TYPE_ variable). Then use the IF-THEN-ELSE statement to output the value of frequency and row percentage into three different sub-datasets for adult, sex, and race, respectively.

Step 3) Use an appropriate procedure to transpose frequency and row percentage by adult/sex/race in those sub-datasets.

Step 4) Use the MERGE statement to combine the same transposed data set twice. One data set only contains frequency and another one only contains row percentage. Rearrange the position of variables as shown in the table.

Step 5) Use a data step to exact the chi-square value and p-value for adult. (Hint: Use the scan function on the variable Table to keep wanted rows)

Step 6) Combine the merged data set and chi-square data set for adult, which looks like:

Step 7) Repeat step 3~6 for sex and race.

Step 8) Combine three clean data sets.

Step 9) Export the data set to a csv file and directly copy all values from the excel file to the table, like:

 Never had asthma Ever had asthma N %† N %† χ2 value P-value Adult Adult 695 87.09 103 12.91 0.1339 0.7144 Elder 262 87.92 36 12.08 . . Sex Male Female Race White Black Hispanic Others

1. b)  Use PROC FORMAT to create customized formats for variables asthma, bmicat, smoking, and insurance according to the description on page 3. Draw frequency plots for bmicat, smoking and insurance with grouped bars by asthma

THIS IS THE CODE I HAVE SO FAR, BUT GETTING STUCK AT OUTPUTTING THE FIRST STEP:

ODS LISTING;
DATA houston ;
set 'C:\Data\houston.sas7bdat' ;
RUN;

PROC FORMAT ;
value asthmalbl 1 = 'never had asthma' 0 = 'ever had asthma';
value sexlbl 1 = 'male' 2 = 'female' ;
value racelbl 1 = 'White' 2 = 'Black' 3 = 'Hispanic' 4 = 'Others';
value bmicatlbl 1 = 'underweight' 2 = 'normal' 3 = 'overweight' 4 = 'obese' ;
value smokinglbl 1 = 'non-smoker' 2 = 'former smoker' 3 = 'current smoker' ;
value insurancelbl 1 = 'have insurance' 2 = 'no insurance' ;
RUN;
PROC PRINT DATA = houston ;
smoking smokinglbl. insurance insurancelbl. ;
RUN;

PROC FREQ DATA = houston ;
TABLES race adult sex * Asthma / NOROW NOCOL NOFREQ CHISQ CL;
ODS OUTPUT CrosstabFreqs = houston2
TITLE ;
smoking smokinglbl. insurance insurancelbl. ;
RUN;

This is a sample of the data in case you do not want to use the attachment

Obs asthma adult sex race bmicat educat smoking insurance

1 1 1 2 1 3 3 3 1
2 0 1 2 1 2 3 3 1
3 0 1 1 3 2 2 2 2
4 0 2 2 4 2 3 1 1
5 0 2 2 3 3 1 3 1
6 0 2 1 1 3 3 3 1
7 0 2 1 1 2 2 2 1
8 0 1 2 2 2 2 3 1
9 0 1 1 1 3 3 3 1
10 0 1 1 1 2 3 1 1
11 0 2 2 1 3 3 2 1
12 0 2 2 1 4 2 2 1
13 0 1 1 1 2 3 3 1
14 0 1 1 1 2 3 2 1
15 0 2 2 1 3 3 3 1
16 0 1 1 1 2 3 1 1
17 0 2 2 1 2 3 3 1
18 0 1 1 1 4 3 2 1
19 0 1 2 2 2 2 3 1
20 0 1 2 1 4 3 3 1

I appreciate any help, thank you!

Super User
Posts: 13,523

## Re: Contingency tables and output

This sounds so much like a classroom exercise I'm going ask if it is.

The main reason I want to know if it is a school exercise is, having worked with BRFSS data for approaching 20 years, the complex weighting schemes generally mean that procedures that do not account for the sample design generally are not suitable for analysis.

As a minimum you should account for the weights associated. That is what the Weight statement is for.

Easiest is to have an ODS OUTPUT statement for each table that you want to output, if that's the assignment.

This line:

TABLES race adult sex * Asthma / NOROW NOCOL NOFREQ CHISQ CL;

possibly should be

TABLES (race adult sex) * Asthma / NOROW NOCOL NOFREQ CHISQ CL;

if you want to cross the first 3 variables with the asthma variable. Since you have arequirement to include a row percent you might want to consider dropping that NOROW. That generates 3 tables of output. (HINT)

Discussion stats