Hi there,
I want to create a dummy variable for gender (m=0, f=1) so that I can use proc reg for a bivariate analysis. However, the error message keeps indicating that my variable doesn't exist. Any insight would be useful 🙂
data eva.cohort;
set eva.finalcohort;
if sex="m" then sexm=0;
if sex="f" then sexf=1;
avgvol1=sum(avg_2009+avg_2011+avg_2012+avg_2014)/4;
avgvol2=sum(bee_avg_2009+bee_avg_2011+bee_avg_2012+bee_avg_2014)/4;
run;
proc means data=eva.cohort n nmiss mean median max min;
var avgvol1 avgvol2;
run;
proc reg data=eva.cohort;
model avgvol1=sexm sexf;
run;
Accepted Solutions
Perhaps your actual data doesn't contain "m" and "f". Perhaps it contains "M" and "F" instead.
At any rate, you would be well advised to treat SEX as a CLASS variable within PROC REG. Most regression procedures will automatically create the proper dummy variables when you use a CLASS statement.
Note for the future: instead of posting the program, post the log so we can see what message applies to what step.
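One quick way to check which values SEX actually holds is a frequency table. This is only a sketch; it assumes the input data set is eva.finalcohort and the variable is named sex, as in the posted program. Character comparisons in SAS are case-sensitive, so "M" will not match "m".

```sas
/* List every distinct value of sex, including missing values, */
/* to see whether the data holds "m"/"f", "M"/"F", or something else. */
proc freq data=eva.finalcohort;
  tables sex / missing;
run;
```

If the table shows uppercase values, the IF conditions in the DATA step never match and the dummy variables stay missing for every row.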
@Astounding wrote:
At any rate, you would be well advised to treat SEX as a CLASS variable within PROC REG. Most regression procedures will automatically create the proper dummy variables when you use a CLASS statement.
That should be PROC GLM, not PROC REG. PROC REG has no CLASS statement.
To @kthartma: there is no need to create your own dummy variables. PROC GLM will create them for you, and also avoid the programming error you are having.
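A minimal sketch of what that could look like, assuming the data set eva.cohort and the variables avgvol1 and sex from the original post (the SOLUTION option is added here only to print the parameter estimates):

```sas
/* CLASS tells PROC GLM to build the dummy variables for sex itself. */
proc glm data=eva.cohort;
  class sex;
  model avgvol1 = sex / solution;  /* SOLUTION prints the parameter estimates */
run;
quit;
```

With this approach the sexm and sexf variables from the DATA step are not needed at all.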
Paige Miller
Thanks so much! I ended up using proc glm.
You're not using the SUM() function as usually intended either.
It's usually used when you want missing values treated as 0. Your approach won't do that, because you've joined the items with + rather than separating them with commas. The + expression is evaluated first, so a single missing value makes the whole result missing.
Test your code with the following:
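/* Original: the + expression is missing if any term is missing */
avgvol1 = sum(avg_2009+avg_2011+avg_2012+avg_2014)/4;
/* Comma-separated arguments: SUM ignores missing values */
avgvol1_check0 = sum(avg_2009, avg_2011, avg_2012, avg_2014)/4;
/* Same result using an explicit variable list; a range such as avg_2009-avg_2014 would also pull in avg_2010 and avg_2013 */
avgvol1_check1 = sum(of avg_2009 avg_2011 avg_2012 avg_2014)/4;
/* MEAN divides by the number of non-missing values rather than by 4 */
avgvol1_check2 = mean(of avg_2009 avg_2011 avg_2012 avg_2014);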
If any of these calculations give different results, you have an issue, most likely missing values in one or more of the yearly variables.
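To see the difference on a small scale, here is a sketch with made-up numbers. The data set name sum_demo and the two data rows are purely illustrative; only the variable names come from the original program.

```sas
/* Hypothetical two-row example: the second row has a missing 2011 value. */
data sum_demo;
  input avg_2009 avg_2011 avg_2012 avg_2014;
  /* + inside SUM: the expression is missing as soon as one term is missing */
  plus_version  = sum(avg_2009 + avg_2011 + avg_2012 + avg_2014) / 4;
  /* commas: SUM skips missing arguments, then the total is divided by 4 */
  comma_version = sum(avg_2009, avg_2011, avg_2012, avg_2014) / 4;
  /* MEAN skips missing arguments and divides by the non-missing count */
  mean_version  = mean(avg_2009, avg_2011, avg_2012, avg_2014);
  datalines;
10 20 30 40
10 . 30 40
;
run;

proc print data=sum_demo;
run;
```

For the complete row all three versions agree. For the row with the missing value, plus_version is missing, comma_version divides the total of the three remaining values by 4, and mean_version divides it by 3, so which one is right depends on how you want missing years handled.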
If you insist on using proc reg, your code should use a single dummy variable and read:
data eva.cohort;
set eva.finalcohort;
if sex="m" then sexDum=0;
if sex="f" then sexDum=1;
avgvol1 = (avg_2009 + avg_2011 + avg_2012 + avg_2014) / 4;
avgvol2 = (bee_avg_2009 + bee_avg_2011 + bee_avg_2012 + bee_avg_2014) / 4;
run;
proc means data=eva.cohort n nmiss mean median max min;
var avgvol1 avgvol2;
run;
proc reg data=eva.cohort;
model avgvol1 = sexDum;
run;