Hi all,
I ran into a problem recently. In my data, events happen on Good (G), Bad (B), or Fair (F) dates, and I used a general logistic regression. The results only show "Good" and "Fair", but I want Good and Bad (maybe the reference level is wrong?). My code and results are below. Could anyone suggest how to change the model? Thank you very much!
data multi_logistic_data_dummy;
set logistic_data_original_multi;
IF multi_Y='Bad' then Y=1;
IF multi_Y='Fair' then Y=2;
IF multi__Y='Good' then Y=3;
run;
proc logistic descending data = multi_logistic_data_dummy;
model multi_Y= Size ROA Cement Semiconductor HM Optoelectronic othElectronic Financial Building Textiles Network Trade /RSQ;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;
Model multi_y (event = 'Fair') = ... Rest of code;
Specify the event that's modelled.
The problem is in your multi_Y variable.
multi__Y (two underscores) is different from multi_Y:
data multi_logistic_data_dummy;
set logistic_data_original_multi;
IF multi_Y='Bad' then Y=1;
IF multi_Y='Fair' then Y=2;
IF multi__Y='Good' then Y=3;
run;
@stat_sas I don't think that matters, because the variable being created never gets used; the original multi_Y does.
Post your code and log; it should work. Make sure the EVENT= value matches your data exactly, because it is case sensitive.
Hello,
My code and data are as follows, and it still doesn't work.
Thank you,
proc logistic descending data = multi_logistic_data_dummy;
model multi_Y (Event='Fair')= Size ROA Cement Semiconductor HM Optoelectronic othElectronic Financial Building Textiles Network Trade /RSQ;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;
I don't download files.
Post your log and code.
Hi Reeza,
Thank you!
Ariel
____log___________
112 proc logistic descending data = multi_logistic_data_dummy;
113 model multi_Y (Event='Fair')= Size ROA Cement Semiconductor HM Optoelectronic
113! othElectronic Financial Building Textiles Network Trade /RSQ;
114 title "multi_logistic_Good";
115 output out=multi_logistic_Good_0;
116 quit;
NOTE: Option EVENT= is ignored since LINK=CLOGIT.
NOTE: PROC LOGISTIC is fitting the cumulative logit model. The probabilities modeled are
summed over the responses having the lower Ordered Values in the Response Profile table.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 488 observations read from the data set ARIEL.MULTI_LOGISTIC_DATA_DUMMY.
NOTE: The data set ARIEL.MULTI_LOGISTIC_GOOD_0 has 488 observations and 38 variables.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.21 seconds
cpu time 0.06 seconds
117 run;
PROC IMPORT
DATAFILE="C:\Users\USER\Desktop\paper_2\data\logistic_data_original_multi.xlsx"
OUT= logistic_data_original_multi
DBMS=EXCELCS REPLACE;
RUN;
data multi_logistic_data_dummy;
set logistic_data_original_multi;
IF Family=1 THEN DO; FAM=1; END;
IF Family=0 THEN DO; FAM=0; END;
IF Light='B' THEN DO; Light_YB=0; Light_G=0; Light_YR=0; Light_R=0; END;
IF Light='YB' THEN DO; Light_YB=1; Light_G=0; Light_YR=0; Light_R=0; END;
IF Light='G' THEN DO; Light_YB=0; Light_G=1; Light_YR=0; Light_R=0; END;
IF Light='YR' THEN DO; Light_YB=0; Light_G=0; Light_YR=1; Light_R=0; END;
IF Light='R' THEN DO; Light_YB=0; Light_G=0; Light_YR=0; Light_R=1; END;
IF Industry='水泥工業' THEN DO; Cement=1; Semiconductor=0; HM=0; Optoelectronic=0; othElectronic=0; Financial=0; Building=0; Textiles=0; Network=0; Trade=0; END;
IF Industry='半導體' THEN DO; Cement=0; Semiconductor=1; HM=0; Optoelectronic=0; othElectronic=0; Financial=0; Building=0; Textiles=0; Network=0; Trade=0; END;
IF Industry='生技醫療' THEN DO;Cement=0; Semiconductor=0; HM=1; Optoelectronic=0; othElectronic=0; Financial=0; Building=0; Textiles=0; Network=0; Trade=0; END;
IF Industry='光電業' THEN DO; Cement=0; Semiconductor=0; HM=0; Optoelectronic=1; othElectronic=0; Financial=0; Building=0; Textiles=0; Network=0; Trade=0; END;
IF Industry='其他' THEN DO; Cement=0; Semiconductor=0; HM=0; Optoelectronic=0; othElectronic=0; Financial=0; Building=0; Textiles=0; Network=0; Trade=0; END;
IF Industry='其他電子業' THEN DO;Cement=0; Semiconductor=0; HM=0; Optoelectronic=0; othElectronic=1; Financial=0; Building=0; Textiles=0; Network=0; Trade=0; END;
IF Industry='金融業' THEN DO; Cement=0; Semiconductor=0; HM=0; Optoelectronic=0; othElectronic=0; Financial=1; Building=0; Textiles=0; Network=0; Trade=0; END;
IF Industry='建材營造' THEN DO; Cement=0; Semiconductor=0; HM=0; Optoelectronic=0; othElectronic=0; Financial=0; Building=1; Textiles=0; Network=0; Trade=0; END;
IF Industry='紡織纖維' THEN DO; Cement=0; Semiconductor=0; HM=0; Optoelectronic=0; othElectronic=0; Financial=0; Building=0; Textiles=1; Network=0; Trade=0; END;
IF Industry='通信網路業' THEN DO;Cement=0; Semiconductor=0; HM=0; Optoelectronic=0; othElectronic=0; Financial=0; Building=0; Textiles=0; Network=1; Trade=0; END;
IF Industry='貿易百貨' THEN DO; Cement=0; Semiconductor=0; HM=0; Optoelectronic=0; othElectronic=0; Financial=0; Building=0; Textiles=0; Network=0; Trade=1; END;
run;
proc logistic descending data = multi_logistic_data_dummy;
model multi_Y (Event='Fair')= Size ROA Cement Semiconductor HM Optoelectronic othElectronic Financial Building Textiles Network Trade /RSQ;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;
If you read the log, the following line indicates that the EVENT= option is not honoured:
NOTE: Option EVENT= is ignored since LINK=CLOGIT.
Are you aware that SAS doesn't need hand-made dummy variables for categorical variables? You can list them in the CLASS statement instead. Try changing the ref value in the CLASS statement.
proc logistic descending data = multi_logistic_data_dummy;
class multi_Y (ref='Fair')/param=ref;
model multi_Y= Size ROA Cement Semiconductor HM Optoelectronic othElectronic Financial Building Textiles Network Trade /RSQ;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;
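As a rough illustration of the CLASS statement remark above, the hand-built industry dummies could be replaced by the original Industry column. This is only a sketch: it assumes Industry takes no values other than those handled in the DATA step, and it uses '其他' ("Other") as the reference level because that is the level whose dummies were all set to 0. The response handling is deliberately left alone here.

```sas
/* Sketch only: let PROC LOGISTIC build the industry dummies itself.  */
/* Assumes Industry holds only the values listed in the DATA step.    */
proc logistic data = multi_logistic_data_dummy;
  /* '其他' ("Other") had all hand-made dummies equal to 0, so it     */
  /* serves as the reference level for the predictor.                 */
  class Industry (ref='其他') / param=ref;
  model multi_Y = Size ROA Industry / rsq;
run;
```

With PARAM=REF, each Industry estimate is read relative to the reference level, which should correspond to how the ten 0/1 columns were coded.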
Hi Reeza,
Yes, I am aware of that. This version also has a problem... Thank you.
log------
132 proc logistic descending data = multi_logistic_data_dummy;
133 class multi_Y (ref='Fair')/param=ref;
134 model multi_Y= Size ROA Cement Semiconductor HM Optoelectronic othElectronic Financial
134! Building Textiles Network Trade /RSQ;
135 title "multi_logistic_Good";
136 output out=multi_logistic_Good_0;
137 quit;
NOTE: The REF= option for the response variable is ignored.
NOTE: PROC LOGISTIC is fitting the cumulative logit model. The probabilities modeled are
summed over the responses having the lower Ordered Values in the Response Profile table.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 488 observations read from the data set ARIEL.MULTI_LOGISTIC_DATA_DUMMY.
NOTE: The data set ARIEL.MULTI_LOGISTIC_GOOD_0 has 488 observations and 38 variables.
NOTE: PROCEDURE LOGISTIC used (Total process time):
real time 0.25 seconds
cpu time 0.11 seconds
138 run;
You're fitting a cumulative logit model; I think you need the generalized logit form. If you add the following option to your MODEL statement, it should work as expected. It does in my test code. I also removed the DESCENDING option.
link=glogit
proc logistic data = multi_logistic_data_dummy;
model multi_Y(event='Fair')= Size ROA Cement Semiconductor HM Optoelectronic othElectronic Financial Building Textiles Network Trade /RSQ link=glogit;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;
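To make the distinction between the two links concrete, here is a sketch of the two model forms in standard textbook notation. The symbols ($J$ response levels, intercepts $\alpha_j$, slopes $\beta$, reference level $r$) are my own notation, not something taken from the SAS output.

```latex
% Cumulative logit (LINK=CLOGIT, the default for an ordered response):
% one common slope vector, one intercept per cut point.
\log\frac{P(Y \le j \mid x)}{1 - P(Y \le j \mid x)} = \alpha_j + x^{\top}\beta,
\qquad j = 1, \dots, J-1

% Generalized logit (LINK=GLOGIT): each non-reference level j is compared
% with the reference level r and gets its own intercept and slopes.
\log\frac{P(Y = j \mid x)}{P(Y = r \mid x)} = \alpha_j + x^{\top}\beta_j,
\qquad j \ne r
```

With three levels and 'Fair' as the reference, the second form gives two sets of estimates, Good versus Fair and Bad versus Fair, whereas the first form treats the levels as ordered and shares one set of slopes across the cut points.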
If I'm understanding your question, which is very brief, that's how regression with categorical variables works: you get estimates relative to another level, so one level is always missing from the output. That level is your reference level.
model multi_Y (ref='Fair')= Size ROA Cement Semiconductor HM Optoelectronic othElectronic Financial Building Textiles Network Trade /RSQ;
Hi Reeza,
I used your suggestion and changed event='Fair' to ref='Fair'. It seems to work. The code is as follows.
But I wonder, what is the difference between the generalized form and the cumulative form?
Thank you very much for your time.
Ariel
proc logistic descending data = multi_logistic_data_dummy;
class multi_Y (ref='Fair')/param=ref;
model multi_Y(ref='Fair')= Size ROA Cement Semiconductor HM Optoelectronic othElectronic Financial Building Textiles Network Trade /link = glogit;
title "multi_logistic_Good";
output out=multi_logistic_Good_0;
quit;
run;