turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- Calculating Dissimilarity Index

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-18-2016 03:12 PM

Hello,

I would like to calculate dissimilarity index with SAS. The formula is the following:

where

bi is the value of variable b in area i

B is the summation of all bi

w is the value of variable w in area i

W is the summation of all wi.

I was doing the long way, using proc means, output out, etc. but I'm pretty sure there is a easily and quicky way to do this with proc SQL.

I try this codes, but it doesn't work (I don't really know how works the SLQ procedure):

proc sql;

select sum(abs((b/sum(b))-(w/sum(w)))/2 as d1

from data.data;

quit;

Can someone please help me?

Accepted Solutions

Solution

03-18-2016
05:26 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-18-2016 04:57 PM

You can nest this so no intermediate table as;

proc sql;

select 0.5*sum(abs(var1-var2)) as d1

from (select b/sum(b) as var1, w/sum(w) as var2

from data.data

);

quit;

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-18-2016 04:45 PM

Nevermind, I found the solution:

proc sql;

create table table1 as

select*, b/sum(b) as var1, w/sum(w) as var2

from data.data;

select 0.5*sum(abs(var1-var2)) as d1

from table1;

quit;

run;

Solution

03-18-2016
05:26 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-18-2016 04:57 PM

You can nest this so no intermediate table as;

proc sql;

select 0.5*sum(abs(var1-var2)) as d1

from (select b/sum(b) as var1, w/sum(w) as var2

from data.data

);

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-18-2016 05:26 PM - edited 03-18-2016 06:43 PM

Thanks. And what should I do if I want to add a class statement to calculate many index?

I tried simply adding "group by X" to your codes, but the result is not relevent.

Also, is there a way to store the d1 into an excel file?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-18-2016 06:50 PM

I would need data to test anything.

And are you lookin to have a "dissimilarity index" within each level of the variable (resulting in multiple indices) or to calculate the index between summarized levels (a single index)?

With a code that doesn't provide what you want it helps to provide the code ran, some input data and the desired result for the example data. If there were errors or warnings in the log those may help as well.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-18-2016 06:56 PM - edited 03-18-2016 06:57 PM

Here is a short sample of the data

ADIDU | County | b | w |

24550081 | 1 | 5,63 | 80 |

24550075 | 1 | 0 | 55 |

24550080 | 1 | 298,33 | 130 |

24550079 | 1 | 61,05 | 70 |

24550082 | 2 | 38,73 | 95 |

24550077 | 2 | 88,13 | 65 |

24550078 | 2 | 188,55 | 40 |

24550088 | 2 | 64,07 | 45 |

24550083 | 2 | 18,01 | 190 |

24550085 | 3 | 0 | 135 |

24550076 | 3 | 0 | 140 |

24550087 | 3 | 0 | 145 |

24550084 | 3 | 160,45 | 45 |

24550086 | 3 | 0 | 115 |

24570213 | 3 | 35,75 | 65 |

24570175 | 3 | 0 | 75 |

24570214 | 4 | 0 | 190 |

24570178 | 4 | 0 | 110 |

24570177 | 4 | 0 | 80 |

24570174 | 4 | 0 | 120 |

24570066 | 4 | 196,25 | 65 |

24570255 | 4 | 145,73 | 55 |

I would like to compute a dissimilatiry index for each county (there is much more observations in county than what I show there) and one global dissimilarity index.