tuntun
Calcite | Level 5

 

I want to compare different groups within one variable (class).

For example, I want to compare students in class 6, class 7 and class 8 on another variable, the students' scores, so I coded this in a new data set:

if class = 6 then class_6 = 1;  (class_6 is a new variable)
else if class = 7 then class_6 = 0;
and I did the same for class 7 and class 8.

My PROC REG step:
proc reg data=labla;
model score = class_6 class_7 class_8;
run;

However, I get the note below when I use PROC REG. I think it appears because the variable class has many values besides 6, 7 and 8, and I only want to compare these three:

Model is not full rank. Least-squares solutions for the parameters are not unique. Some statistics will be misleading. A reported DF of 0 or B means that the estimate is biased.

 

What can I do to solve this problem? Is there a better way to compare the scores of classes 6, 7 and 8 besides PROC REG? Thank you.

PaigeMiller
Diamond | Level 26

This is expected behavior of PROC REG, and really is NOT a problem.

 

To eliminate the message, remove one of your dummy variables from the model. Or use PROC GLM with a CLASS statement, and then you don't have to create the dummy variables in the first place.
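A minimal sketch of the first option, assuming class is numeric and that class 8 is chosen as the reference level. The data set name labla_d and the choice of reference are illustrative and do not come from the original post:

```sas
/* Build two indicators for three levels; class 8 is the reference (assumed). */
data labla_d;                      /* labla_d is a hypothetical name */
   set labla;
   where class in (6 7 8);         /* keep only the three classes of interest */
   class_6 = (class = 6);          /* 1 if class 6, otherwise 0 */
   class_7 = (class = 7);          /* 1 if class 7, otherwise 0 */
run;

proc reg data=labla_d;
   model score = class_6 class_7;  /* intercept estimates the class 8 mean */
run;
quit;
```

With this coding, each slope estimates the difference in mean score between that class and class 8.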

--
Paige Miller
ballardw
Super User

You can restrict analysis to a subset of data using where:

 

proc reg data=labla;
   where class in (6 7 8);   /* keep only classes 6, 7 and 8 */
   /* your MODEL statement goes here */
run;

 

I suspect that you will get odd results.

I think that you may want PROC GLM, with your variable named class on a CLASS statement:

 

proc glm data=labla;
   where class in (6 7 8);
   class class;             /* treat class as a categorical variable */
   model score = class;
run;
quit;

If you are interested in testing whether the mean score is the same across the classes, then perhaps add a MEANS statement:

means class </ options for specific types of comparison tests>;
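For instance, a sketch assuming class is numeric and that Tukey's pairwise comparison is the test wanted. The TUKEY option is one choice among several and is not something specified in this thread:

```sas
proc glm data=labla;
   where class in (6 7 8);
   class class;                 /* treat class as categorical */
   model score = class;         /* one-way ANOVA of score on class */
   means class / tukey;         /* pairwise comparisons of the class means */
run;
quit;
```

The overall F test in the ANOVA table addresses whether the three class means are equal; the MEANS output then shows which pairs differ.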

tuntun
Calcite | Level 5
I tried:

proc glm data=labla;
where class in (6 7 8);
model score = class

but it gives an error in the WHERE statement :((

PaigeMiller
Diamond | Level 26

Please be kind enough to share the SAS log with us so we can see the exact code you used, and the exact error message.

 

Click on the {i} icon and then paste the log into that window. Do not paste the log directly into your message.


Advice for future interactions: saying you got errors, but not telling us what the error is, always results in more questions, rather than how to fix the error.

--
Paige Miller
tuntun
Calcite | Level 5
I tried:

proc glm data=labla;

where class = 'six' or class='seven' or class='eight';

class = class;

model score = class;

run;

PaigeMiller
Diamond | Level 26

This is not the LOG. This does not show the error you get. This is not pasted into the window that appears when you click on the {i} icon. Please follow the instructions and provide the information we requested.

--
Paige Miller
Reeza
Super User

When you create dummy variables, you need to create one fewer than the number of levels. If you know two of the indicators, you can always work out the third, so it adds no information. In linear algebra terms, when the model has an intercept, the third indicator is a linear combination of the intercept and the first two indicators, so you need to drop one variable.
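To make the dependence concrete, here is a short sketch, assuming the data are restricted to classes 6, 7 and 8 and the model includes an intercept (the PROC REG default):

```latex
\begin{align}
\text{class\_6}_i + \text{class\_7}_i + \text{class\_8}_i &= 1
  \quad \text{for every student } i, \\
\text{class\_8}_i &= 1 - \text{class\_6}_i - \text{class\_7}_i .
\end{align}
```

The constant 1 is exactly the intercept column, so one column of the design matrix is redundant, which is what the "not full rank" note reports.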

 

I would recommend switching to PROC GLM and using a CLASS statement instead though. 

proc glm data=labla (rename=(class=grade_level));
   where grade_level in (6, 7, 8);
   class grade_level (ref=first);          /* grade 6 is the reference level */
   model score = grade_level / solution;   /* SOLUTION prints the parameter estimates */
run;
quit;

Assuming the code you've shown is correct and runs without errors in the log, the above code should work and provide what you want. Note that it compares grades 7 and 8 to grade 6.


tuntun
Calcite | Level 5
Thanks. I used PROC GLM but I don't know how to read the output, and I cannot attach a screenshot of it to this reply. Can I basically use the MEANS statement in PROC GLM to compare scores for classes 6, 7 and 8?
Reeza
Super User
I'm not familiar with the MEANS statement. I would compare parameter estimates, but it depends on your hypothesis. Are you comparing those three classes against each other, or against the rest of the population?
