Ariel
Calcite | Level 5

Hi, all,

 

I ran into a problem recently. In my data, events fall on Good (G), Bad (B), or Fair (F) dates, and I fitted a general logistic regression. The results show only "Good" and "Fair", but I want Good and Bad (maybe the reference is wrong?). My code and results are below. Could anyone suggest how to change the model? Thank you very much!

reg_image.PNG

data multi_logistic_data_dummy;
set logistic_data_original_multi;
IF multi_Y='Bad'  then Y=1;
IF multi_Y='Fair'  then Y=2;
IF multi__Y='Good' then Y=3;
run;

proc logistic descending data = multi_logistic_data_dummy;
model multi_Y= Size ROA Cement Semiconductor HM  Optoelectronic  othElectronic  Financial  Building  Textiles  Network  Trade /RSQ;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

You're fitting a cumulative logit model; I think you need the generalized logit form. If you add the following option to your MODEL statement, it should work as expected. It does in my test code. I also removed the DESCENDING option.

link=glogit

proc logistic data = multi_logistic_data_dummy;

model multi_Y(event='Fair')= Size ROA Cement Semiconductor HM  Optoelectronic  othElectronic  Financial  Building  Textiles  Network  Trade /RSQ link=glogit;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;

 


14 REPLIES
Reeza
Super User
Model multi_y (event = 'Fair') = ... Rest of code;

Specify the event that's modelled. 

Ariel
Calcite | Level 5
Thank you for your reply, but I already tried this and it did not work.

stat_sas
Ammonite | Level 13

The problem is in your multi_Y variable.

 

multi__Y is different from multi_Y

 

data multi_logistic_data_dummy;
set logistic_data_original_multi;
IF multi_Y='Bad'  then Y=1;
IF multi_Y='Fair'  then Y=2;
IF multi__Y='Good' then Y=3;
run;

 

 

Reeza
Super User

@stat_sas I don't think that matters, because the variable being created never gets used; the original multi_Y does.

Post your code and log; it should work. Make sure the value in EVENT= matches your data exactly, since the comparison is case sensitive.

Ariel
Calcite | Level 5

Hello,

 

My code and data are as follows, and it still doesn't work.

 

Thank you,

 

 

 

proc logistic descending data = multi_logistic_data_dummy;
model multi_Y (Event='Fair')= Size ROA Cement Semiconductor HM  Optoelectronic  othElectronic  Financial  Building  Textiles  Network  Trade /RSQ;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;

 

Reeza
Super User

I don't download files. 

 

Post your log and code.

Ariel
Calcite | Level 5

Hi Reeza,

 

Thank you!

 

Ariel

 

____log___________

 

112 proc logistic descending data = multi_logistic_data_dummy;
113 model multi_Y (Event='Fair')= Size ROA Cement Semiconductor HM Optoelectronic
113! othElectronic Financial Building Textiles Network Trade /RSQ;
114 title "multi_logistic_Good";
115 output out=multi_logistic_Good_0;
116 quit;

NOTE: Option EVENT= is ignored since LINK=CLOGIT.
NOTE: PROC LOGISTIC is fitting the cumulative logit model. The probabilities modeled are
summed over the responses having the lower Ordered Values in the Response Profile table.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 488 observations read from the data set ARIEL.MULTI_LOGISTIC_DATA_DUMMY.
NOTE: The data set ARIEL.MULTI_LOGISTIC_GOOD_0 has 488 observations and 38 variables.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.21 seconds
cpu time 0.06 seconds


117 run;

 

 

PROC IMPORT
DATAFILE="C:\Users\USER\Desktop\paper_2\data\logistic_data_original_multi.xlsx"
OUT= logistic_data_original_multi
DBMS=EXCELCS REPLACE; 
RUN;


data multi_logistic_data_dummy;
set logistic_data_original_multi;
IF Family=1 THEN DO; FAM=1; END;
IF Family=0 THEN DO;  FAM=0; END;
IF Light='B' THEN DO;   Light_YB=0; Light_G=0;  Light_YR=0; Light_R=0; END;
IF Light='YB' THEN DO;  Light_YB=1; Light_G=0;  Light_YR=0; Light_R=0; END;
IF Light='G' THEN DO;  Light_YB=0; Light_G=1;  Light_YR=0; Light_R=0; END;
IF Light='YR' THEN DO;  Light_YB=0; Light_G=0;  Light_YR=1; Light_R=0; END;
IF Light='R' THEN DO;  Light_YB=0; Light_G=0;  Light_YR=0; Light_R=1; END;
IF Industry='水泥工業' THEN DO; Cement=1;  Semiconductor=0; HM=0;  Optoelectronic=0;  othElectronic=0; Financial=0; Building=0; Textiles=0;  Network=0;  Trade=0; END;
IF Industry='半導體' THEN DO; Cement=0; Semiconductor=1; HM=0;  Optoelectronic=0;  othElectronic=0; Financial=0; Building=0; Textiles=0;  Network=0;  Trade=0; END;
IF Industry='生技醫療' THEN DO;Cement=0;  Semiconductor=0; HM=1;  Optoelectronic=0;  othElectronic=0; Financial=0; Building=0; Textiles=0;  Network=0;  Trade=0; END;
IF Industry='光電業' THEN DO; Cement=0; Semiconductor=0; HM=0;  Optoelectronic=1;  othElectronic=0; Financial=0; Building=0; Textiles=0;  Network=0;  Trade=0; END;
IF Industry='其他' THEN DO;  Cement=0; Semiconductor=0; HM=0;  Optoelectronic=0; othElectronic=0; Financial=0; Building=0; Textiles=0;  Network=0;  Trade=0; END;
IF Industry='其他電子業' THEN DO;Cement=0;   Semiconductor=0; HM=0;  Optoelectronic=0;  othElectronic=1; Financial=0; Building=0; Textiles=0;  Network=0;  Trade=0; END;
IF Industry='金融業' THEN DO; Cement=0;  Semiconductor=0; HM=0;  Optoelectronic=0;  othElectronic=0; Financial=1; Building=0; Textiles=0;  Network=0;  Trade=0; END;
IF Industry='建材營造' THEN DO; Cement=0; Semiconductor=0; HM=0;  Optoelectronic=0;  othElectronic=0; Financial=0; Building=1; Textiles=0;  Network=0;  Trade=0; END;
IF Industry='紡織纖維' THEN DO; Cement=0;  Semiconductor=0; HM=0;  Optoelectronic=0;  othElectronic=0; Financial=0; Building=0; Textiles=1;  Network=0;  Trade=0; END;
IF Industry='通信網路業' THEN DO;Cement=0;  Semiconductor=0; HM=0;  Optoelectronic=0;  othElectronic=0; Financial=0; Building=0; Textiles=0;  Network=1;  Trade=0; END;
IF Industry='貿易百貨' THEN DO;  Cement=0; Semiconductor=0; HM=0;  Optoelectronic=0;  othElectronic=0; Financial=0; Building=0; Textiles=0;  Network=0;  Trade=1; END;
run;


proc logistic descending data = multi_logistic_data_dummy;
model multi_Y (Event='Fair')= Size ROA Cement Semiconductor HM  Optoelectronic  othElectronic  Financial  Building  Textiles  Network  Trade /RSQ;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;
Reeza
Super User

If you read the log, the following line indicates that EVENT= is not honoured:

 

NOTE: Option EVENT= is ignored since LINK=CLOGIT.

 

 

Are you aware that you don't need to create dummy variables for categorical variables in SAS? You can list them in the CLASS statement instead. Try changing the ref value in the CLASS statement.

 

 

proc logistic descending data = multi_logistic_data_dummy;
class multi_Y (ref='Fair')/param=ref;
model multi_Y= Size ROA Cement Semiconductor HM  Optoelectronic  othElectronic  Financial  Building  Textiles  Network  Trade /RSQ;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;
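
As a minimal sketch of the CLASS suggestion applied to a predictor rather than the response: the ten industry dummies could be replaced by the original Industry variable, assuming it is still present in multi_logistic_data_dummy (the DATA step posted earlier keeps it). The reference level '其他' is chosen here only because it is the category coded as all zeros in that DATA step, and link=glogit is taken from the accepted solution. This has not been run against the poster's data.

```sas
/* Sketch only: assumes Industry is a character variable still in the data set. */
proc logistic data=multi_logistic_data_dummy;
  /* PARAM=REF requests reference-cell (dummy) coding; '其他' is the baseline level */
  class Industry (ref='其他') / param=ref;
  /* The response reference level is set in the MODEL statement, not in CLASS */
  model multi_Y (ref='Fair') = Size ROA Industry / link=glogit rsq;
run;
```

One difference to keep in mind: any Industry value not handled by the hand-written IF statements leaves the dummies missing, so those rows drop out of the dummy-variable model, whereas CLASS would treat such a value as its own level.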

 

Ariel
Calcite | Level 5

Hi Reeza,

 

Yes, I am aware of that. This version also has a problem. Thank you.

 

log------

132 proc logistic descending data = multi_logistic_data_dummy;
133 class multi_Y (ref='Fair')/param=ref;
134 model multi_Y= Size ROA Cement Semiconductor HM Optoelectronic othElectronic Financial
134! Building Textiles Network Trade /RSQ;
135 title "multi_logistic_Good";
136 output out=multi_logistic_Good_0;
137 quit;

NOTE: The REF= option for the response variable is ignored.
NOTE: PROC LOGISTIC is fitting the cumulative logit model. The probabilities modeled are
summed over the responses having the lower Ordered Values in the Response Profile table.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 488 observations read from the data set ARIEL.MULTI_LOGISTIC_DATA_DUMMY.
NOTE: The data set ARIEL.MULTI_LOGISTIC_GOOD_0 has 488 observations and 38 variables.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.25 seconds
cpu time 0.11 seconds


138 run;

 

 

 

Reeza
Super User

You're fitting a cumulative logit model; I think you need the generalized logit form. If you add the following option to your MODEL statement, it should work as expected. It does in my test code. I also removed the DESCENDING option.

link=glogit

 

proc logistic data = multi_logistic_data_dummy;

model multi_Y(event='Fair')= Size ROA Cement Semiconductor HM  Optoelectronic  othElectronic  Financial  Building  Textiles  Network  Trade /RSQ link=glogit;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;

 

Ariel
Calcite | Level 5
If I use "event='Fair'", the output does not show this outcome, but with "ref='Fair'" it does. I don't know why.
Reeza
Super User

If I'm understanding your question, which is very brief, that's how regression with categorical variables works: you get estimates relative to another level, so one level is always missing from the output, namely your reference level.
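
To make the reference-level point concrete, here is a sketch of the equations a generalized logit model fits for this three-level response, assuming 'Fair' is the reference. The symbols (x for the predictor vector, alpha and beta for intercepts and slopes) are notation introduced here, not taken from the SAS output.

```latex
\begin{align*}
\log\frac{P(Y=\text{Good}\mid x)}{P(Y=\text{Fair}\mid x)} &= \alpha_{\text{Good}} + x^{\top}\beta_{\text{Good}},\\
\log\frac{P(Y=\text{Bad}\mid x)}{P(Y=\text{Fair}\mid x)} &= \alpha_{\text{Bad}} + x^{\top}\beta_{\text{Bad}}.
\end{align*}
```

Fair has no equation of its own because it sits in every denominator, which is why estimates are reported only for Good and Bad.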

 

 

 

stat_sas
Ammonite | Level 13
model multi_Y (ref='Fair')= Size ROA Cement Semiconductor HM  Optoelectronic  othElectronic  Financial  Building  Textiles  Network  Trade /RSQ;

 

 

Ariel
Calcite | Level 5

Hi Reeza,

 

I used your suggestion and changed event='Fair' to ref='Fair'. It seems to work. The result is as follows.

But I wonder: what is the difference between the generalized form and the cumulative form?

Thank you very much for your time.

 

Ariel

 

 

reg_image.PNG

 

proc logistic descending data = multi_logistic_data_dummy;
class multi_Y (ref='Fair')/param=ref;
model multi_Y(ref='Fair')= Size ROA Cement Semiconductor HM  Optoelectronic  othElectronic  Financial  Building  Textiles  Network  Trade /link = glogit;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;
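
On the question of how the two forms differ, a brief sketch in equations may help. The notation (x, alpha, beta) is introduced here for illustration. The cumulative logit model treats the response as ordered and models cumulative probabilities with a single shared slope vector. With the DESCENDING option the ordering is Good, Fair, Bad, so the two fitted equations are:

```latex
\begin{align*}
\operatorname{logit} P(Y=\text{Good}\mid x) &= \alpha_{1} + x^{\top}\beta,\\
\operatorname{logit} P(Y\in\{\text{Good},\text{Fair}\}\mid x) &= \alpha_{2} + x^{\top}\beta.
\end{align*}
```

Only the intercepts differ between these two equations, which would explain output labelled with Good and Fair but not Bad. The generalized logit model (link=glogit) instead treats the levels as unordered and fits a separate intercept and a separate slope vector for each non-reference level, each compared against the reference level.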
