Hello,
I am working with a data set called PRAMS, which uses a complex sampling design. It has several racial categories and years. I would like to include one year of data and two racial categories (both non-Hispanic). However, from what I have read, I cannot delete cases because doing so will mess up the weighting. How can I exclude cases for the purposes of my analysis? Also, how can I identify non-Hispanic Black and non-Hispanic White respondents when race and ethnicity are in two separate variables? I have a MAT_RACE variable with categories 1, 2, 3 (White, Black, other, each including both Hispanic and non-Hispanic ethnicity). I also have Hispanic BC, coded 1 = yes, Hispanic and 2 = not Hispanic. What code can I use to create a variable for non-Hispanic Black, White, and other? (I would like to use this as my analytic sample.)
Finally, I have non-PRAMS data that I have merged with the PRAMS data. Do I have to create weights for that data as well? My dependent variables (outcomes) come from PRAMS, and my independent variables come from another data set that I created and merged with the PRAMS data. I am doing a state-level analysis, so the outcomes are nested within states. I know that will be a multilevel model. To run the statistical models I have to apply weights to the PRAMS data. Should I also create weights for the other (independent) variables?
Using a single year is not likely to be an issue as long as you are careful to use the entire year and the data were originally weighted for that year. A multiyear data set may have been weighted differently, so read and understand the weighting methodology for the set you are using. If you extract a single year from a multiyear set, it is extremely likely that any statistic you project to a population total (as opposed to a rate) will be far too low.
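One quick way to see how the file was weighted is to sum the analysis weight within each year and compare the totals with the population you expect each year to represent. This is only a sketch: the data set name PRAMS, the weight variable WTANAL, and the year variable YEAR are assumptions, so substitute the names from your own file.

```sas
/* Sketch only: PRAMS, WTANAL and YEAR are assumed names. */
/* The SUM column shows the population total that each    */
/* year's weights project to; N is the unweighted count.  */
proc means data=prams n sum;
   class year;
   var wtanal;
run;
```

If the per-year sums are far below the annual population you expect, the weights were probably scaled for the combined file rather than for each year on its own.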
You don't mention what type of analysis you want. If you want results grouped by the values of a categorical variable, you may be looking for a DOMAIN statement.
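As a rough illustration of the DOMAIN approach, the sketch below keeps every record in the data set and flags the analytic sample instead of deleting cases, so the full design is still available for variance estimation. All names are assumptions rather than anything confirmed in this thread: PRAMS (data set), SUD_NEST (strata), WTANAL (weight), YEAR, RACE_ETH (a combined race/ethnicity code with 1 = non-Hispanic White and 2 = non-Hispanic Black), and OUTCOME. The year value 2015 is also just a placeholder. No finite population correction is specified here.

```sas
/* Sketch only: every data set, variable name and value    */
/* below is an assumption; replace with your own.          */
data prams_flag;
   set prams;
   /* 1 = record is in the analytic sample, 0 = it is not. */
   /* No rows are dropped, so the weights stay intact.     */
   in_sample = (year = 2015 and race_eth in (1, 2));
run;

proc surveymeans data=prams_flag mean;
   strata sud_nest;     /* assumed stratification variable */
   weight wtanal;       /* assumed analysis weight         */
   domain in_sample;    /* estimates reported per domain   */
   var outcome;
run;
```

The rows of the output where in_sample = 1 are the estimates for the analytic sample. PROC SURVEYREG and PROC SURVEYLOGISTIC accept a DOMAIN statement in the same way.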
Code to create non-Hispanic White and non-Hispanic Black (and it should handle non-Hispanic other as well) depends to an extent on what you want as a result. Do you want 3 levels, 4 levels, 5 ...?
/* create Hispanic as a separate level, with the non-Hispanic groups as the others */
/* hispanic, race, white, black and other are placeholders */
if hispanic then newvar=1; /* or whatever code */
else select (race);
   when (white) newvar=2;
   when (black) newvar=3;
   when (other) newvar=4;
   otherwise;
end;
Use your own variables and values.
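With the coding described in the question, the recode might look something like the sketch below. It assumes the ethnicity variable is actually named HISP_BC (the question calls it "Hispanic BC") and that the data set is named PRAMS. The new variable RACE_ETH and its codes are made up for illustration.

```sas
/* Sketch only: PRAMS and HISP_BC are assumed names;       */
/* RACE_ETH and its codes are hypothetical.                */
/* MAT_RACE: 1 = White, 2 = Black, 3 = other               */
/* HISP_BC:  1 = Hispanic, 2 = not Hispanic                */
data prams_recode;
   set prams;
   if hisp_bc = 1 then race_eth = 4;                        /* Hispanic, any race */
   else if hisp_bc = 2 and mat_race = 1 then race_eth = 1;  /* non-Hispanic White */
   else if hisp_bc = 2 and mat_race = 2 then race_eth = 2;  /* non-Hispanic Black */
   else if hisp_bc = 2 and mat_race = 3 then race_eth = 3;  /* non-Hispanic other */
   else race_eth = .;                                       /* race or ethnicity missing */
run;

/* Cross-check the new variable against its two sources.   */
proc freq data=prams_recode;
   tables race_eth*mat_race*hisp_bc / list missing;
run;
```

The PROC FREQ listing is there to confirm that each combination of MAT_RACE and HISP_BC landed in the intended category before the variable is used to define the analytic sample.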
What sort of data did you merge with the PRAMS data set? If the merge changed the number of records, then some sort of reweighting is more likely to be needed, and you may have moved your project into the MIXED model world.
That description really sounds like some sort of mixed model, which is not my strong suit.
I would look closely at your incarceration data. You may want to exclude some other states because their Black population is small and the one-year rate can fluctuate drastically. I say that because I live and work in such a state, and I got funny looks at some meetings when I mentioned that we have more Basques, Native Americans and, at that time, Asians than Blacks. So ratios of anything related to the Black population were subject to wide annual variation. This might also be the reason one of your states reports "other than white": to have a sample reliable enough to mean something for policy planning and decision making.
If a variable on a MODEL statement has a missing value, most procedures exclude that record from the analysis, which may not be quite the same as discarding the record, for example with a WHERE statement.
But I am not sure how to handle this case; as I said, mixed models are outside my experience.