Programming the statistical procedures from SAS

Different Somers' D from Freq and Logistic procedures

N/A
Posts: 0

Different Somers' D from Freq and Logistic procedures

Hi,

Can someone help with this, please? I ran the FREQ and LOGISTIC procedures on the same data, but the Somers' D values they produce are different. My code is below:

proc freq data=develop noprint;
tables p6*new_rating / measures;
output out=somersd(keep=_SMDCR_ ) smdcr;
run;

Somers' D = -0.42526

proc logistic data=develop desc;
class p6;
model new_rating=p6;
run;

Somers' D = 0.17017


Thanks
Super Contributor
Posts: 365

Re: Different Somers' D from Freq and Logistic procedures

Hello Mansoor,

PROC FREQ produces two versions of Somers' D. Did you try comparing the other one?

SPR
N/A
Posts: 0

Re: Different Somers' D from Freq and Logistic procedures

Yes, I did compare _SMDRC_, and they are the same. But since 'new_rating' is the independent variable in this example, I think the _SMDCR_ from the FREQ procedure should match the one from the LOGISTIC procedure?
Regular Contributor
Posts: 169

Re: Different Somers' D from Freq and Logistic procedures

The FREQ procedure documentation indicates that both row and column measures must have ordinal properties for interpretation of Somers' D computed by the FREQ procedure. Note that a binary variable can always be assumed to have ordinal properties. So, the variable NEW_RATING meets the requirements for use in computing Somers' D with the FREQ procedure. But in specifying the predictor, p6, on a CLASS statement in the LOGISTIC procedure, you are indicating that p6 is a nominal variable. Thus, there is a mismatch of assumptions here.

It should be noted that a predictor variable in the LOGISTIC procedure can have either nominal or interval measurement level, but not ordinal. Somers' D returned by the LOGISTIC procedure is not, and indeed cannot be, based on an assumption of ordinality of all variables. When you have a predictor variable which has more than two levels, you should rarely, if ever, obtain the same Somers' D from the FREQ and LOGISTIC procedures.

Which Somers' D computation is correct depends on what your assumptions are about the measurement level of the variable p6. However, my guess is that the Somers' D returned by the FREQ procedure is NOT the correct statistic if only because the value of Somers' D returned by PROC FREQ is negative. The fact that you specified p6 as a categorical variable in the logistic regression model also indicates that it would not be appropriate to assume that p6 is ordinal.
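
If p6 can in fact be treated as ordinal, one way to put the two procedures on the same footing is to drop the CLASS statement so that p6 enters the model as a single numeric predictor. The sketch below assumes that p6 is stored as a numeric variable coded in its ordinal order and that new_rating is binary; neither is confirmed in this thread. Under those assumptions I would expect Somers' D from PROC LOGISTIC to agree, in absolute value, with the FREQ statistic whose denominator counts pairs that differ on new_rating (_SMDRC_ for this TABLES statement), apart from small differences caused by how PROC LOGISTIC groups predicted probabilities.

```sas
/* Request both asymmetric versions of Somers' D from PROC FREQ */
proc freq data=develop noprint;
  tables p6*new_rating / measures;
  output out=somersd(keep=_SMDCR_ _SMDRC_) smdcr smdrc;
run;

proc print data=somersd;
run;

/* No CLASS statement: p6 is treated as numeric, so its ordering is used. */
/* Assumes p6 is a numeric variable coded in ordinal order.               */
proc logistic data=develop desc;
  model new_rating=p6;
run;
```

PROC LOGISTIC reports a non-negative Somers' D whenever the fitted model orders the observations better than chance, so only the magnitude is comparable with a negative value from PROC FREQ.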
N/A
Posts: 0

Re: Different Somers' D from Freq and Logistic procedures

Thanks Dale. It's really helpful.

However, given how I wrote the TABLES statement in the FREQ procedure, is it correct to compute _SMDCR_ and not _SMDRC_?

Can you also please elaborate the following statement a bit?
"However, my guess is that the Somers' D returned by the FREQ procedure is NOT the correct statistic if only because the value of Somers' D returned by PROC FREQ is negative."

Thanks
Regular Contributor
Posts: 169

Re: Different Somers' D from Freq and Logistic procedures

Unless you can assume that BOTH variables are ordinal, it would not be appropriate to compute either version of Somers' D using the FREQ procedure. If you can assume that both variables are ordinal, then _SMDRC_ is the appropriate statistic if the row variable represents the predictor variable and the column variable represents the response. If row and column interpretations as to predictor and response are turned around, then you would want to compute and interpret _SMDCR_.

Please disregard the comment about the negative Somers' D. I was thinking in terms of Somers' D from a logistic regression model where we would always expect a positive value. But for ordinal variables employed in PROC FREQ, the value of Somers' D can be negative if the frequency table has large values in the lower left and upper right portions of the table.
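
A small illustration of that last point, using made-up counts (the data set name FLIPPED and the frequencies are hypothetical, chosen only so that the large cells sit in the upper right and lower left):

```sas
/* Hypothetical 2x2 table with the large counts off the main diagonal */
data flipped;
  row=1; col=1; freq=5;   output;
  row=1; col=2; freq=120; output;
  row=2; col=1; freq=80;  output;
  row=2; col=2; freq=15;  output;
run;

/* MEASURES prints Somers' D C|R and R|C along with the other ordinal measures */
proc freq data=flipped;
  weight freq;
  tables row*col / measures;
run;
```

Working the formulas by hand for these counts, both versions of Somers' D come out negative, at roughly -0.80 for C|R and -0.83 for R|C.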
N/A
Posts: 0

Re: Different Somers' D from Freq and Logistic procedures

Thanks a lot for the detailed answer.

One more question, please. The following statement is an extract from the SAS Help file explaining Somers' D:

"Somers' D(C|R) and Somers' D(R|C) are asymmetric modifications of tau-b. C|R denotes that the row variable X is regarded as an independent variable, while the column variable Y is regarded as dependent. Similarly, R|C denotes that the column variable Y is regarded as an independent variable, while the row variable X is regarded as dependent."

It could be just me, but what do you think of the above statement?
Regular Contributor
Posts: 169

Re: Different Somers' D from Freq and Logistic procedures

The SAS documentation appears to have things turned around. We can examine this by obtaining the two variants of Somers' D for an asymmetric 2x2 frequency table. We can then compute Somers' D from PROC LOGISTIC using the row variable as the response and the column variable as predictor. Then try using the column variable as the response and the row variable as the predictor. What is reported by the FREQ procedure as Somers' D C|R is the same as Somers' D returned by the LOGISTIC procedure when the row variable is employed as the response and the column variable is the predictor. Somers' D R|C is the same as Somers' D returned by the LOGISTIC procedure when the column variable is the response and the row variable is the predictor. The code below demonstrates:

data test;
  row=1; col=1; freq=120; output;
  row=1; col=2; freq=5; output;
  row=2; col=1; freq=15; output;
  row=2; col=2; freq=80; output;
run;


proc freq data=test;
  weight freq;
  tables row*col / measures;
run;

proc logistic data=test;
  freq freq;
  model row=col;
run;

proc logistic data=test;
  freq freq;
  model col=row;
run;
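
As a cross-check on the output of the three steps above, the two FREQ statistics can be reproduced by hand. The DATA step below is a sketch that assumes the formulas given in the PROC FREQ documentation: D(C|R) = (P - Q)/w_r and D(R|C) = (P - Q)/w_c, where P and Q are twice the numbers of concordant and discordant pairs, and w_r and w_c are n squared minus the sum of the squared row or column totals. The variable names are my own.

```sas
data _null_;
  /* Cell counts copied from the TEST data set */
  n11 = 120; n12 = 5;
  n21 = 15;  n22 = 80;
  n = n11 + n12 + n21 + n22;

  /* PROC FREQ counts every pair twice */
  P = 2 * n11 * n22;   /* concordant pairs */
  Q = 2 * n12 * n21;   /* discordant pairs */

  /* Pairs that are not tied on the row (w_r) or column (w_c) variable */
  w_r = n**2 - ((n11 + n12)**2 + (n21 + n22)**2);
  w_c = n**2 - ((n11 + n21)**2 + (n12 + n22)**2);

  d_cr = (P - Q) / w_r;   /* Somers' D C|R */
  d_rc = (P - Q) / w_c;   /* Somers' D R|C */
  put d_cr= 8.5 d_rc= 8.5;
run;
```

This gives roughly 0.802 for C|R and 0.830 for R|C. PROC LOGISTIC divides the same concordant-minus-discordant count by the number of pairs with different response values, which here is w_r/2 when row is the response and w_c/2 when col is the response, so the first LOGISTIC run should line up with C|R and the second with R|C.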