@ballardw wrote:
You say:
"when I produce one for Race, it looks weird and just doesn't look right".
Why doesn't it look right? Some code (generated or otherwise) might help.
If you try to predict race from other variables you might be looking at the equivalent of trying to predict package color from the contents of a package.
I can see a model using race as an independent variable, in which case "looking right" can depend a lot on how well the other data, as well as "race", are collected and used. Race should almost never be a dependent variable.
Actually, I don't find this to be a problem at all. For example, suppose you have found an actual skeleton and, by measuring the bones, you want to determine gender, or age, or race. One famous case: bones found on a South Pacific island in 1940, near the known flight path of Amelia Earhart, were determined to likely be from a female of European descent of approximately the same height and age as Earhart, and no other females of European descent are known to have been lost in this area of the South Pacific.
These are all real-world problems that use discriminant analysis (PROC DISCRIM) to build a model from skeletons of known origin, which can then be applied to other skeletons, whether found in the future or already sitting in a collection. And of course, the problem isn't really limited to skeletons. Whether it makes sense to do a logistic regression or decision tree or discriminant analysis in the EXACT situation that @b_smsha faces, well, I don't know, but I don't have a problem with the concept.
Your example of predicting the color of a package by knowing the contents is somewhat spurious because the color of the package is likely uncorrelated with the contents. The race of a skeleton may be (I don't know, I'm not an anthropologist) correlated with the physical dimensions of a skeleton.
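To make the skeleton example concrete, here is a minimal sketch of how such a model might be set up in PROC DISCRIM. The data set names (bones_known, bones_new, bones_scored) and the measurement variables are hypothetical, made up purely for illustration; they are not from @b_smsha's data, and the options shown are one reasonable choice, not a recommendation for that exact situation.

```sas
/* Hypothetical example: all data set and variable names are assumptions. */
/* bones_known = skeletons with a recorded race plus bone measurements.   */
/* bones_new   = newly found skeletons with the same measurements.        */
proc discrim data=bones_known      /* training data with known classes    */
             testdata=bones_new    /* observations to classify            */
             testout=bones_scored  /* predicted class and posterior probs */
             method=normal         /* parametric (normal-theory) rule     */
             pool=yes              /* pooled covariance: linear rule      */
             crossvalidate;        /* leave-one-out error estimate        */
   class race;                     /* the variable being predicted        */
   var femur_length skull_width pelvis_width;  /* hypothetical predictors */
run;
```

The cross-validated error rates in the output are what I would look at first to judge whether the measurements carry any real information about the class. If they are no better than guessing, you are in the package-color situation after all.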