About akbarlam

akbarlam · 12-03-2021

Hello, I'm working on the NHANES 2011-2018 complex survey dataset. I've been coding all week, so it's possible I'm just overlooking a simple mistake.

I recoded several variables into categories, for example race and age. Below is an example of what I coded (this is after concatenating the datasets):

    if race=3 then raceCat=1;
    else if race=4 then raceCat=2;
    else if race=6 then raceCat=3;
    else if race=1 or race=2 then raceCat=4;
    else if race=7 then raceCat=5;

I am now trying to check some of my variables for normality, using PROC UNIVARIATE. Below is the code:

    proc sort; by gender racecat; run;

    PROC UNIVARIATE data=datasetn plot normal;
    where age >= 20;
    by gender and racecat;
    VAR waistcirc;
    freq wt8yr_ng; *This is the weighting variable;
    FORMAT gender SEXFMT. racecat RACEFMT. ;
    title "Distribution of waist circumference gender and race: NHANES 2011-2018";
    run;

In the output, the results do not cover all combinations of gender and race category. For this particular code, results were generated only for gender 1 (male) and race category 1 (Non-Hispanic White). [The screenshot is from the same program, but with 'age categories' also included in the BY statement. As you can see, the program selects only one age category, 20 to 39 years old, and there are no results for any other combination after this one.]

I have noticed the same problem when running a simple PROC FREQ cross-tabulation with a BY statement: only the first category of the BY variable is used and the rest are ignored.

Is there something I need to change in my settings? I'm very confused about why this is occurring. Thank you in advance for your help!
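The following is a minimal sketch of one way the step above could be restructured, not a confirmed diagnosis. It rests on two assumptions: that the PROC SORT without data= sorted the most recently created dataset rather than datasetn, and that the word "and" in the BY statement was read as a variable name rather than a separator. BY-group processing on data that is not actually sorted typically stops after the first group with a "not sorted" error in the log, which would match the symptom described. The output dataset name datasetn_sorted is hypothetical.

```sas
/* Sort the analysis dataset explicitly; without data=, PROC SORT
   uses the most recently created dataset, which may not be datasetn.
   datasetn_sorted is a hypothetical name used for illustration. */
proc sort data=datasetn out=datasetn_sorted;
  where age >= 20;
  by gender racecat;
run;

proc univariate data=datasetn_sorted plot normal;
  /* BY variables are separated by spaces only, with no "and" */
  by gender racecat;
  var waistcirc;
  /* Kept from the original post. Note that FREQ truncates
     non-integer values, so it may not behave like a survey weight. */
  freq wt8yr_ng;
  format gender SEXFMT. racecat RACEFMT.;
  title "Distribution of waist circumference by gender and race: NHANES 2011-2018";
run;
```

After running it, the log is the first place to look: an ERROR about the BY variables not being sorted, or about a variable named AND not being found, would point to which assumption applies.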

akbarlam · 10-05-2020

@ballardw Thank you so much, that worked. I see what I did wrong there!

akbarlam · 10-05-2020

Hi everyone, I'm a new SAS user and researcher. I'm working with environmental contaminant data and trying to impute values at 0.5*LOD (limit of detection).

For example, I have a contaminant variable, pcb99. For this variable, I'd like to impute 0.5*0.03 for all sample values that fall below the detection limit (DL) of 0.03. All values above the DL would remain the same. Originally I was going to use PROC MI (multiple imputation), but I believe that's only for missing values.

I was wondering if I could get some help with the code for creating a *new* variable that imputes 0.015 for all values below the DL (0.03). I've tried:

    data set2;
    set set1;
    pcb99=pcb99DL;
    if pcb99DL<0.03 then pcb99DL=0.015;
    run;

This sets every value to 0.015. I would greatly appreciate any help, thank you!
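A minimal sketch of a corrected step, assuming the problem is the direction of the assignment (the accepted fix is not shown in this thread). In the attempt above, pcb99=pcb99DL copies the new, still-missing variable over the original one. SAS treats a missing numeric value as smaller than any number, so the condition pcb99DL<0.03 is then true on every row and every value becomes 0.015.

```sas
data set2;
  set set1;
  /* Start the new variable as a copy of the original measurement */
  pcb99DL = pcb99;
  /* Replace only non-missing values below the detection limit (0.03)
     with 0.5 * 0.03 = 0.015; missing values stay missing */
  if not missing(pcb99) and pcb99 < 0.03 then pcb99DL = 0.015;
run;
```

The original pcb99 is left untouched, so the two columns can be compared afterwards, for instance with PROC MEANS, to confirm that only values below the DL changed.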

Proc Univariate - program ignoring groups in by statement

Re: Creating a new variable

Creating a new variable
