- Proc discrim: how to interpret generalized squared...

11-20-2015 02:59 PM

Hi,

When running PROC DISCRIM with unequal priors, say 0.9 and 0.1, the output includes a generalized squared distance matrix. Although I understand (computationally) how those values are computed (the SAS manual shows this too), I was wondering how to INTERPRET a nonzero distance from a group to itself, and how to INTERPRET the asymmetry in the distances.

Any idea?

Thanks

Accepted Solutions

Solution

11-25-2015
12:17 PM

11-25-2015 09:29 AM

The GENERALIZED squared distance between groups is composed of the squared distance plus two other terms. The squared distance is symmetric and the distance from a group to itself is zero. So it is the other two terms that provide the asymmetry and the nonzero distance from a group to itself, because they depend only on the group that the distance is measured to.

The formula is in the documentation under "Parametric Methods". It includes the terms

1. ln| S_t |, which is the log of the determinant of the covariance matrix within the t-th group

2. -2 ln(q_t), where q_t is the prior probability of membership in the t-th group. (Note that q_t < 1, so this term is actually positive.)

The second paragraph of the "Overview" section cites Rao (1973) for the generalized squared distance.

As for interpretation, I don't have a reference, but I'll take a guess. From the definition, it looks like the generalized distance from Group t to itself increases when the variance within the group increases (through the determinant of S_t). It also increases as the prior probability decreases. So I'd interpret the terms as giving information about how much variance is in the group and how rare the group is. Groups that are "more rare and have more variance" have a greater "distance to themselves" than groups that are less variable and have more members. The generalized distance from a group to itself is zero only when the two extra terms cancel, that is, when ln| S_t | = 2 ln(q_t); the simplest such case is | S_t | = 1 and "membership probability=1."
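To make the asymmetry concrete, here is a minimal Python sketch of the between-group calculation. It assumes the form D^2(i|j) = (m_i - m_j)' S_j^{-1} (m_i - m_j) + ln| S_j | - 2 ln(q_j), which is my reading of the formula when within-group covariance matrices are used (with a pooled covariance matrix the ln| S_j | term would drop out). The function name and all of the numbers are made up for illustration and do not come from the attached output; the code handles only two variables so that the 2x2 inverse can be written out by hand.

```python
import math

def generalized_sq_distance(mean_i, mean_j, cov_j, prior_j):
    """Assumed form of D^2(i|j) for 2-D data:
    squared Mahalanobis distance + ln|S_j| - 2 ln(q_j)."""
    (a, b), (c, d) = cov_j
    det = a * d - b * c
    # Closed-form inverse of the 2x2 covariance matrix of group j
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = mean_i[0] - mean_j[0]
    dy = mean_i[1] - mean_j[1]
    # Symmetric part: (m_i - m_j)' S_j^{-1} (m_i - m_j)
    mahalanobis = (dx * (inv[0][0] * dx + inv[0][1] * dy)
                   + dy * (inv[1][0] * dx + inv[1][1] * dy))
    # The last two terms depend only on group j, not on group i
    return mahalanobis + math.log(det) - 2.0 * math.log(prior_j)

# Hypothetical two-group setup: group 2 is rarer and more variable
means = {1: (0.0, 0.0), 2: (2.0, 1.0)}
covs = {1: [[1.0, 0.0], [0.0, 1.0]], 2: [[4.0, 0.0], [0.0, 4.0]]}
priors = {1: 0.9, 2: 0.1}

D = {(i, j): generalized_sq_distance(means[i], means[j], covs[j], priors[j])
     for i in means for j in means}

for (i, j), value in sorted(D.items()):
    print(f"D^2({i}|{j}) = {value:.4f}")
```

In this setup the diagonal entries are nonzero even though the Mahalanobis part is zero there, the rarer and more variable group 2 has the larger distance to itself, and D^2(1|2) differs from D^2(2|1) because each entry picks up the covariance and prior terms of the group it is measured to.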

All Replies

11-23-2015 07:59 PM

Could you point us to an example in the doc that has this output?

11-24-2015 11:00 AM

Thank you for your kind reply.

I have attached an example of the output.

X1 is my response variable, which has 3 classes (1, 2, 3).

When I run PROC DISCRIM with unequal priors I get output like this. Could you kindly help me interpret it? The SAS manual explains the computation, but says nothing about how to interpret the nonzero distance of a class to itself or the asymmetry in the matrix.

Thanks a lot,

D.

11-25-2015 09:43 AM

Thank you again for your kindness!

Hence, I guess that this output is merely "computational".

Your suggestion on how to interpret those values definitely makes sense and I agree. I would say, though, that such information can be read (more clearly) from other parts of the SAS output, which is why I was wondering whether this output was trying to tell me something more or something different. Unfortunately, it often happens (of course not only with SAS) that some irrelevant output is provided.

I appreciate your kind support,

Thank you again,

Best.

D

11-25-2015 11:53 AM

Great. When you think no further comments are necessary, mark the post as "answered" so that others know that the discussion can be closed.