- Different Somer's D from Freq and Logistic procedu...

10-08-2010 12:50 PM

Hi,

Can someone help with this, please? I ran the FREQ and LOGISTIC procedures on the same data, but the Somers' D values they produce are different. My code is below:

proc freq data=develop noprint;
   tables p6*new_rating / measures;
   output out=somersd(keep=_SMDCR_) smdcr;
run;

Somers' D = -0.42526

proc logistic data=develop desc;
   class p6;
   model new_rating=p6;
run;

Somers' D = 0.17017

Thanks

10-08-2010 03:34 PM

Hello Mansoor,

PROC FREQ produces two versions of Somers' D. Did you try comparing the other one?

SPR

10-11-2010 04:04 AM

Yes, I did compare the _SMDRC_ value and they are the same. But since 'new_rating' is the independent variable in this example, I think the _SMDCR_ from the FREQ procedure should match the one from the LOGISTIC procedure?
Message was edited by: mansoor

10-11-2010 12:34 PM

The FREQ procedure documentation indicates that both row and column measures must have ordinal properties for interpretation of Somers' D computed by the FREQ procedure. Note that a binary variable can always be assumed to have ordinal properties. So, the variable NEW_RATING meets the requirements for use in computing Somers' D with the FREQ procedure. But in specifying the predictor, p6, on a CLASS statement in the LOGISTIC procedure, you are indicating that p6 is a nominal variable. Thus, there is a mismatch of assumptions here.

It should be noted that a predictor variable in the LOGISTIC procedure can have either nominal or interval measurement level, but not ordinal. Somers' D returned by the LOGISTIC procedure does not, indeed cannot, be based on an assumption of ordinality of all variables. When you have a predictor variable which has more than two levels, you should rarely, if ever, obtain the same Somers' D from the FREQ and LOGISTIC procedures.

Which Somers' D computation is correct depends on what you assume about the measurement level of the variable p6. However, my guess is that the Somers' D returned by the FREQ procedure is NOT the correct statistic, if only because the value returned by PROC FREQ is negative. The fact that you specified p6 as a categorical variable in the logistic regression model also suggests that it would not be appropriate to assume that p6 is ordinal.
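
To see how much the CLASS statement matters, one option is to refit the model with p6 entered as a single numeric predictor and set the result beside both FREQ variants. This is only a sketch: it assumes p6 is stored as a numeric variable in DEVELOP and that treating it as interval-level is defensible for your data.

```sas
/* Assumption: p6 is numeric. Without a CLASS statement, LOGISTIC */
/* treats p6 as one interval-level predictor, not a nominal one.  */
proc logistic data=develop desc;
   model new_rating = p6;
run;

/* Print both asymmetric Somers' D values for comparison */
proc freq data=develop;
   tables p6*new_rating / measures;
run;
```

With a single numeric predictor the predicted probabilities are a monotone function of p6, so the concordant and discordant pairs in the LOGISTIC association table follow the ordering of p6. Small differences from the FREQ values can still appear because LOGISTIC groups predicted probabilities into narrow bins by default before counting pairs.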

10-12-2010 04:09 AM

Thanks Dale. It's really helpful.

However, given how I wrote the TABLES statement in the FREQ procedure, is it correct to compute _SMDCR_ and not _SMDRC_?

Can you also please elaborate the following statement a bit?

"However, my guess is that the Somers' D returned by the FREQ procedure is NOT the correct statistic if only because the value of Somers' D returned by PROC FREQ is negative."

Thanks

10-12-2010 03:35 PM

Unless you can assume that BOTH variables are ordinal, it would not be appropriate to compute either version of Somers' D using the FREQ procedure. If you can assume that both variables are ordinal, then _SMDRC_ is the appropriate statistic if the row variable represents the predictor variable and the column variable represents the response. If row and column interpretations as to predictor and response are turned around, then you would want to compute and interpret _SMDCR_.
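
As a minimal sketch of how to obtain both versions in one pass, the OUTPUT statement can request SMDCR and SMDRC together. It reuses the data set and variable names from the original post; with TABLES p6*new_rating, p6 is the row variable and new_rating is the column variable.

```sas
proc freq data=develop noprint;
   tables p6*new_rating / measures;
   /* _SMDCR_ holds Somers' D C|R, _SMDRC_ holds Somers' D R|C */
   output out=somersd(keep=_SMDCR_ _SMDRC_) smdcr smdrc;
run;

proc print data=somersd;
run;
```

Having both values in one data set makes it easier to see which of them, if either, lines up with the LOGISTIC output.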

Please disregard the comment about the negative Somers' D. I was thinking in terms of Somers' D from a logistic regression model where we would always expect a positive value. But for ordinal variables employed in PROC FREQ, the value of Somers' D can be negative if the frequency table has large values in the lower left and upper right portions of the table.
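
A small made-up table illustrates the sign. The counts below are hypothetical and chosen only so that the upper right and lower left cells dominate.

```sas
/* Hypothetical 2x2 table: large counts in the upper right */
/* and lower left cells, small counts on the diagonal.     */
data neg;
   row=1; col=1; freq=5;   output;
   row=1; col=2; freq=120; output;
   row=2; col=1; freq=80;  output;
   row=2; col=2; freq=15;  output;
run;

proc freq data=neg;
   weight freq;
   tables row*col / measures;
run;
```

Here the discordant pairs (120*80 = 9600) far outnumber the concordant pairs (5*15 = 75), so both Somers' D values should come out negative, at about -0.80 for C|R and -0.83 for R|C.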

10-13-2010 06:44 AM

Thanks a lot for detailed answer.

One more question, please. The following statement is an extract from the SAS Help file, explaining Somers' D:

"Somers' D(C|R) and Somers' D(R|C) are asymmetric modifications of tau-b. **C|R denotes that the row variable X is regarded as an independent variable, while the column variable Y is regarded as dependent. Similarly, R|C denotes that the column variable Y is regarded as an independent variable, while the row variable X is regarded as dependent.**"

It could be only me, but what do you think of the above statement?

10-13-2010 12:31 PM

The SAS documentation appears to have things turned around. We can examine this by obtaining the two variants of Somers' D for an asymmetric 2x2 frequency table. We can then compute Somers' D from PROC LOGISTIC using the row variable as the response and the column variable as predictor. Then try using the column variable as the response and the row variable as the predictor. What is reported by the FREQ procedure as Somers' D C|R is the same as Somers' D returned by the LOGISTIC procedure when the row variable is employed as the response and the column variable is the predictor. Somers' D R|C is the same as Somers' D returned by the LOGISTIC procedure when the column variable is the response and the row variable is the predictor. The code below demonstrates:

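/* Asymmetric 2x2 frequency table */
data test;
   row=1; col=1; freq=120; output;
   row=1; col=2; freq=5;   output;
   row=2; col=1; freq=15;  output;
   row=2; col=2; freq=80;  output;
run;

/* Both variants of Somers' D from PROC FREQ */
proc freq data=test;
   weight freq;
   tables row*col / measures;
run;

/* Row variable as response, column variable as predictor */
proc logistic data=test;
   freq freq;
   model row=col;
run;

/* Column variable as response, row variable as predictor */
proc logistic data=test;
   freq freq;
   model col=row;
run;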
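
The values for this 2x2 table can also be checked by hand. The sketch below is not part of the original test; it assumes the documented formulas, namely that FREQ divides P - Q (twice the concordant minus discordant pairs) by n**2 minus the sum of squared row totals for C|R, or squared column totals for R|C, and that LOGISTIC divides concordant minus discordant pairs by the number of pairs with different responses. The variable names are illustrative.

```sas
data _null_;
   n11=120; n12=5; n21=15; n22=80;
   n = n11 + n12 + n21 + n22;

   /* FREQ: P - Q is twice (concordant - discordant) pairs */
   pq   = 2*(n11*n22 - n12*n21);
   d_cr = pq / (n**2 - (n11+n12)**2 - (n21+n22)**2);  /* Somers' D C|R */
   d_rc = pq / (n**2 - (n11+n21)**2 - (n12+n22)**2);  /* Somers' D R|C */

   /* LOGISTIC: (nc - nd) / pairs with different responses */
   d_row_resp = (n11*n22 - n12*n21) / ((n11+n12)*(n21+n22));  /* model row=col */
   d_col_resp = (n11*n22 - n12*n21) / ((n11+n21)*(n12+n22));  /* model col=row */

   put d_cr=8.4 d_rc=8.4 d_row_resp=8.4 d_col_resp=8.4;
run;
```

The log should show about 0.8021 for both d_cr and d_row_resp and about 0.8301 for both d_rc and d_col_resp, which is the pairing described above.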
