Hello, I'm working with the NHANES 2011-2018 complex survey dataset. I've been coding all week, so it's possible I'm just overlooking a simple mistake. I recoded several variables into categories, for example race and age. Below is an example of what I coded (this is after concatenating the datasets):

if race=3 then raceCat=1;
else if race=4 then raceCat=2;
else if race=6 then raceCat=3;
else if race=1 or race=2 then raceCat=4;
else if race=7 then raceCat=5;

I am now trying to check for normality among some of my variables, using PROC UNIVARIATE. Below is the code:

proc sort;
by gender racecat;
run;
PROC UNIVARIATE data=datasetn plot normal;
where age >= 20;
by gender and racecat;
VAR waistcirc;
freq wt8yr_ng; *This is the weighting variable;
FORMAT gender SEXFMT. racecat RACEFMT. ;
title "Distribution of waist circumference gender and race: NHANES 2011-2018";
run;

I noticed that the output does not cover all combinations of gender and race category. For this particular code, results were generated only for gender 1 (male) and race category 1 (Non-Hispanic White). [The screenshot is from the same program, but with age category also included in the BY statement. As you can see, only one age category (20 to 39 years old) is selected, and no results for any other category combination appear after that one.]

I have noticed the same problem when running a simple PROC FREQ cross-tabulation with a BY statement: only the first category of the variable in the BY statement is used and the rest are ignored.

Is there something I need to change in my settings? I'm very confused about why this is occurring. Thank you in advance for your help!