06-20-2016 01:39 PM
I was wondering how you go about showing that categorical variables are normal. For instance, I have married and not married, but how do I show that my distribution is normal for these two categories? A P-P or Q-Q plot will not work, as those are meant for continuous variables. Can anyone help me? What are some visual ways to show that it is normal? I am only asking since we have to show that our data come from a normal distribution in order to use them.
06-20-2016 02:56 PM - edited 06-20-2016 02:56 PM
The Normal distribution is for continuous quantities (weight, height, blood pressure). It is not appropriate for ordinal (age group, produce grade, education level) or nominal (gender, race) variables.
06-20-2016 02:58 PM
When we are doing a significance test or a difference in proportions, we only have the numerical test, and no graphs to judge normality and independence for categorical variables.
06-20-2016 03:10 PM
*95% confidence interval of the difference between not married and married;
proc freq data=project.termlifepartb;
   tables marcat*policy / norow riskdiff(cl=(wald mn));
run;

*95% significance (hypothesis test) of married and unmarried by yes or no policy;
proc freq data=project.termlifepartb order=data;
   tables marcat*policy / nopercent norow chisq relrisk;
run;
I am computing a 95% confidence interval for the difference between the marriage categories in the proportion having an insurance policy, and the second bit of code is a hypothesis test of that difference in proportions. I was hypothesizing that people who are married will buy more insurance. In the output from my second bit of code, the chi-square p-value is less than .05, so I reject the hypothesis that there is no difference and say that someone who is married is almost 2.5 times as likely to possess term life insurance.
My issue is that there are two conditions for a difference in proportions: the sample proportions must come from independent, randomly selected samples from two populations, and np and n(1-p) must both be greater than or equal to 10. My question is: how do I know the samples are random if I have no way to test categorical variables?
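For the np and n(1-p) condition, a rough check might look like the sketch below. It assumes the sample proportions stand in for p, in which case n times the sample proportion and n times one minus it are simply the cell counts of the marcat by policy table. The names cellcounts, cellcheck, and meets_condition are made up for illustration; only the data set and the two variables come from my code above.

```sas
/* Sketch only: save the 2x2 cell counts to a data set.            */
/* cellcounts is a hypothetical name chosen for this example.      */
proc freq data=project.termlifepartb;
   /* sparse keeps cells with a zero count in the OUT= data set */
   tables marcat*policy / norow nocol nopercent sparse out=cellcounts;
run;

/* Flag each cell: 1 if the count is at least 10, 0 otherwise.     */
data cellcheck;
   set cellcounts;
   meets_condition = (count >= 10);
run;

proc print data=cellcheck noobs;
   var marcat policy count meets_condition;
run;
```

If every row shows meets_condition = 1, the count condition looks satisfied for both groups. This says nothing about whether the samples were randomly selected.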
I attached my file as well.