Earth mover's distance (EMD)


Posted 12-05-2017 01:47 PM
(2646 views)

Has anyone used PROC KDE, or any other procedure, to perform "Earth Mover's Distance" (EMD) calculations?

There is a "Do Loop" blog post on the topic from 2013, and I cannot find anything published since then.

While the blog post is helpful and the procedure relatively straightforward, some of the nuances are application dependent, and I'm hoping to find others who have performed EMD calculations.

Thank you


4 REPLIES


I'm not sure what metric you are using, but perhaps it is another name for the L1 or "city block" metric.

What sort of calculations are you trying to perform? PROC KDE is for density estimation. Its Gaussian kernel function estimates the density from the squared Euclidean distance between two points. In theory you could compute the density by using another metric, but I haven't seen that done. It's also not clear how you would select an optimal bandwidth, since the automated bandwidth selection algorithms in PROC KDE are based on the Gaussian kernel.
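To make the role of the squared Euclidean distance concrete, here is a minimal one-dimensional sketch in Python. It is not PROC KDE's implementation: the function name `gaussian_kde`, the sample values, and the hand-picked bandwidth are all assumptions made for illustration, and no automated bandwidth selection is attempted.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function x -> Gaussian kernel density estimate at x.

    Illustrative sketch only; `bandwidth` must be supplied by the caller.
    """
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Each sample contributes exp(-(x - xi)^2 / (2 h^2)), so the data
        # enter only through the squared Euclidean distance to x.
        return norm * sum(
            math.exp(-((x - xi) ** 2) / (2.0 * bandwidth ** 2))
            for xi in samples
        )

    return density


samples = [1.0, 1.2, 1.9, 2.1, 2.3, 4.0]  # made-up data
f = gaussian_kde(samples, bandwidth=0.5)  # bandwidth chosen by hand
for x in (1.0, 2.0, 3.0, 4.0):
    print(x, round(f(x), 4))
```

Swapping in a different metric would mean replacing the `(x - xi) ** 2` term, at which point the normalizing constant and any Gaussian-based bandwidth rule would no longer apply as written.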

If you say more about the "nuances ... that are application dependent," perhaps we can say more.


Hi Rick, thanks for the reply.

I'm preparing for the calculations and don't have all the details yet, so I may need to follow up again in a few days. I'm just trying to find anything I can on the topic in preparation.

My understanding is that the peaks to be compared are densities of "globules" in a solution that have been sorted by size and density via an analytical method.

There will be multiple peaks to compare. I expect a reference peak will be chosen and then the other peaks will be compared to it. I see that this is possible in PROC KDE.

I have not settled the bandwidth question yet. Thanks again!
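As a rough illustration of comparing each peak against a reference peak, here is a Python sketch of the earth mover's distance for one-dimensional histograms. It assumes that every peak is binned on the same equally spaced grid and is normalized to unit mass, in which case the EMD reduces to the summed absolute difference of the cumulative distributions. The function names and the peak values are made up, and this is not the method from the blog post or anything PROC KDE does. Comparing peaks jointly in size and density would need a general transportation solver instead of this shortcut.

```python
def emd_1d(weights_p, weights_q, bin_width=1.0):
    """Earth mover's distance between two histograms on a shared 1-D grid.

    Assumes equally spaced bins; both histograms are normalized to unit mass.
    """
    if len(weights_p) != len(weights_q):
        raise ValueError("histograms must share the same bins")
    total_p, total_q = sum(weights_p), sum(weights_q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    carried = 0.0  # mass that still has to move past the current bin
    work = 0.0
    for p, q in zip(weights_p, weights_q):
        carried += p / total_p - q / total_q
        work += abs(carried) * bin_width
    return work


def l1_distance(weights_p, weights_q):
    """Bin-by-bin L1 ("city block") distance after normalization."""
    total_p, total_q = sum(weights_p), sum(weights_q)
    return sum(abs(p / total_p - q / total_q)
               for p, q in zip(weights_p, weights_q))


reference = [1, 2, 1, 0, 0, 0, 0, 0]  # hypothetical reference peak
near_peak = [0, 1, 2, 1, 0, 0, 0, 0]  # same shape, shifted by one bin
far_peak = [0, 0, 0, 0, 0, 1, 2, 1]   # same shape, shifted by five bins

for name, peak in (("near", near_peak), ("far", far_peak)):
    print(name, emd_1d(reference, peak), l1_distance(reference, peak))
```

In this toy data the EMD grows with the size of the shift (1.0, then 5.0 bin widths), while the bin-by-bin L1 distance cannot exceed 2.0 once the peaks stop overlapping, so the two metrics are not interchangeable.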

ACCEPTED SOLUTION

1. Please provide references/citations for the L1 density computations so that we can understand what you are trying to do.

2. If your data represent (size, density) pairs, it seems like you can estimate the centers of concentration (="peaks") in a conventional way. Why do you think you need to use an L1 distance for these data?
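One possible reading of estimating the peaks "in a conventional way" is sketched below in Python: take the size at which the density is largest (the mode) and the density-weighted mean size. The function name `peak_summary` and the data are hypothetical, and other conventional estimators would be equally valid.

```python
def peak_summary(pairs):
    """Return (mode_size, weighted_mean_size) for (size, density) pairs."""
    total = sum(density for _, density in pairs)
    if total <= 0:
        raise ValueError("densities must have positive total")
    # Mode: the size with the largest observed density.
    mode_size = max(pairs, key=lambda pair: pair[1])[0]
    # Center of mass: sizes averaged with densities as weights.
    mean_size = sum(size * density for size, density in pairs) / total
    return mode_size, mean_size


pairs = [(1.0, 1), (2.0, 4), (3.0, 3), (4.0, 2)]  # made-up (size, density) data
mode_size, mean_size = peak_summary(pairs)
print(mode_size, mean_size)
```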


Rick,

Thank you. I will get back to you with those answers, and in the interim will mark this as resolved, since it may take a day or two.

Best,

Robert
