Demographer
Pyrite | Level 9

Hi,

I am trying to fit a conditional logit with SAS. Suppose that z is a characteristic of the alternative, coded 0/1, and x is a characteristic of the individual, also coded 0/1. The dependent variable CHOICE is of course binary too. I want to model the interaction between z and x, so I create 3 dummy variables as follows:

xz1=0; if x=0 and z=1 then xz1=1;

xz2=0; if x=1 and z=0 then xz2=1;

xz3=0; if x=1 and z=1 then xz3=1;

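The data set then looks like this:

SUBJECT  MODE  CHOICE  X  Z  XZ1  XZ2  XZ3
      1     1       0  0  1    1    0    0
      1     2       0  0  1    1    0    0
      1     3       0  0  1    1    0    0
      1     4       0  0  1    1    0    0
      1     5       0  0  0    0    0    0
      1     6       0  0  0    0    0    0
      1     7       0  0  1    1    0    0
      1     8       1  0  1    1    0    0
      1     9       0  0  1    1    0    0
      1    10       0  0  0    0    0    0
      2     1       0  1  1    0    0    1
      2     2       1  1  1    0    0    1
      2     3       0  1  1    0    0    1
    ...
   1000     9       0  0  1    1    0    0
   1000    10       0  0  0    0    0    0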

When I use PROC MDC, everything seems correct:

proc mdc;
   id SUBJECT;
   model CHOICE = XZ1 XZ2 XZ3 / type=clogit choice=(MODE);
run;

The results are as follows:

Parameter   DF   Estimate   Standard Error   t Value   Approx Pr > |t|
XZ1          1    -0.8324           0.0996     -8.36            <.0001
XZ2          1    -0.2845           0.0546     -5.21            <.0001
XZ3          1     0.4596           0.0546      8.42            <.0001

However, when I use PROC LOGISTIC, the results are strange:

proc logistic;
   model CHOICE(event='1') = XZ1 XZ2 XZ3;
   strata SUBJECT;
run;

Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
XZ1          1    -0.8324           0.0996           69.8063       <.0001
XZ2          1    -0.7442           0.1092           46.4685       <.0001
XZ3          0          0                .                 .            .

As you can see, the parameter of the last interaction variable is set to 0. This message appears in the log:

Note: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown. XZ3=-XZ2

At first I thought it was a problem with the data set, but since PROC MDC gives correct results with the same variables, I think my data are fine. In addition, I notice that the PROC LOGISTIC parameter for XZ2 equals the difference between the PROC MDC parameters for XZ2 and XZ3 (-0.2845 - 0.4596 = -0.7441, i.e. -0.7442 up to rounding).

The same problem as in PROC LOGISTIC also occurs when I try PROC PHREG.

Does anyone have an idea what causes this problem?

5 REPLIES
Reeza
Super User

It's saying that your model has two variables that are perfectly collinear, i.e. XZ3 = -XZ2. You only need one of the two variables in the model, because the second provides no extra information and can be derived from the first.

Demographer
Pyrite | Level 9

Thanks for your answer, but I'm pretty sure this message from SAS is wrong. I need 3 variables since I want the interaction of 2 binary variables (so 4 possible cases, one of which is the reference). Moreover, it works with 3 variables in PROC MDC or Stata.

Reeza
Super User

xz2=0; if x=1 and z=0 then xz2=1;

xz3=0; if x=1 and z=1 then xz3=1;

For any subject with x=1, those variables are the inverse of each other: when xz2=1 then xz3=0, and when xz2=0 then xz3=1.

Run a PROC FREQ on xz2*xz3 to see.
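The check suggested above could be sketched as follows. This is only an illustration: it assumes the stacked choice data sit in a data set named have (the name used elsewhere in this thread) with the variables SUBJECT, X, XZ2 and XZ3 as in the question. The second step is an extra check that is not part of the original suggestion: it counts the subjects for which XZ2 + XZ3 is not constant across alternatives.

```sas
/* Assumption: the data set is called HAVE and holds one row per  */
/* subject-alternative pair, as in the table from the question.   */
proc freq data=have;
   tables x*xz2*xz3 / list missing;   /* cross-tabulate the dummies by x */
run;

/* Within each subject, does XZ2 + XZ3 ever vary across alternatives? */
proc sql;
   select count(*) as n_subjects_with_variation
   from (select subject,
                min(xz2 + xz3) as lo,
                max(xz2 + xz3) as hi
         from have
         group by subject)
   where lo ne hi;
quit;
```

If x is constant within each subject, the count should be 0: XZ2 + XZ3 then equals x for every row, so inside a stratum XZ3 differs from -XZ2 only by a constant. A stratified (conditional) analysis treats that as a linear dependence.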

If you want the interaction in SAS you can use the class statement instead:

proc logistic data=have;
   class x z / param=ref;
   model choice(event='1') = x z x*z;
run;

Demographer
Pyrite | Level 9

You're right, thanks a lot...

But there is something I still don't get. How can I estimate the parameter for xz3? There are 4 possibilities and only 2 parameters.
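One way to see why only two parameters can be estimated is sketched below, assuming (as in the data shown in the question) that x is constant across all alternatives of a given subject. The symbols V_{ij} (utility of alternative j for subject i) and \beta_1, \beta_2, \beta_3 (coefficients of XZ1, XZ2, XZ3) are notation introduced here for illustration.

```latex
% Utility of alternative j for subject i with the three dummies
V_{ij} = \beta_1\,\mathrm{XZ1}_{ij} + \beta_2\,\mathrm{XZ2}_{ij} + \beta_3\,\mathrm{XZ3}_{ij}

% Subject with x_i = 1: XZ1 = 0, XZ2 = 1 - z_{ij}, XZ3 = z_{ij}
V_{ij} = \beta_2\,(1 - z_{ij}) + \beta_3\, z_{ij}
       = \beta_2 + (\beta_3 - \beta_2)\, z_{ij}

% The subject-level constant \beta_2 cancels in the conditional likelihood
P(\text{subject } i \text{ chooses } j)
  = \frac{\exp(V_{ij})}{\sum_k \exp(V_{ik})}
  = \frac{\exp\{(\beta_3 - \beta_2)\, z_{ij}\}}{\sum_k \exp\{(\beta_3 - \beta_2)\, z_{ik}\}}
```

Under that assumption the likelihood depends on \beta_2 and \beta_3 only through their difference, so the identified quantities are \beta_1 (the effect of z when x=0) and \beta_3 - \beta_2 (the effect of z when x=1). Fixing \beta_3 = 0, as PROC LOGISTIC does, turns the XZ2 coefficient into \beta_2 - \beta_3. With the PROC MDC estimates this gives -0.2845 - 0.4596 = -0.7441, which agrees with the reported -0.7442 up to rounding.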

Demographer
Pyrite | Level 9

Reeza wrote:

proc logistic data=have;
   class x z / param=ref;
   model choice(event='1') = x z x*z;
run;

I'm not totally sure, but I don't think I can do that, since it's a conditional logit (with a STRATA statement). The individual-level variable (x) can't enter on its own; it has to be interacted with an alternative-level variable.
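A possible way to keep the interaction in a stratified PROC LOGISTIC without a stand-alone x term is sketched below. It is an illustration, not a tested solution: it assumes x and z are numeric 0/1 variables in a data set named have, and it leaves out the CLASS statement so that x*z is the plain product of the two variables (equal to XZ3).

```sas
/* Assumptions: HAVE is the stacked choice data; X and Z are numeric 0/1. */
proc logistic data=have;
   /* z   : effect of z for subjects with x=0                     */
   /* x*z : additional effect of z for subjects with x=1          */
   /* No main effect of x: it is constant within each stratum.    */
   model choice(event='1') = z x*z;
   strata subject;
run;
```

In this parameterization the effect of z is the z coefficient when x=0 and the sum of the two coefficients when x=1, so both identified quantities can be read from the output.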

