
Zip Code Euclidean Distance

Contributor
Posts: 23

Zip Code Euclidean Distance

All,

I am trying to figure out the distance between doctors' office locations. I have zip code and lat/long data. For background, I want to find the doctors closest to each other so I can compare their prescribing methods. It is easy to compute the distance between the first and second zips and then autofill this down in Excel, but I want the distance between every pair. Let A = latitude and B = longitude; I want the distance between (Ai,Bi) and (Ak,Bk) for all i,k. I am using the Euclidean distance formula (a refresher for those who have been out of math class too long): D = SQRT((Ai-Ak)^2 + (Bi-Bk)^2). I have over 32 thousand zip codes, so I am not going to compute individual columns in EG. I welcome any solution, even programming, as I can just add the programming solution to my EG flow. Thanks!




All Replies
Contributor
Posts: 23

Re: Zip Code Euclidean Distance

Thanks for the reply, Arthur. I like the function in link two. I was just going to go off the scale of distance between lat/long, but having it in miles would be nice. However, the functions work on two specific zip codes. Is there a way to get the distance between every pair of zip codes in the whole dataset?

Esteemed Advisor
Posts: 7,288

Re: Zip Code Euclidean Distance

There are a number of ways to automate the process to calculate all pairs. For example, take a look at the examples shown in the following paper: http://analytics.ncsu.edu/sesug/2010/RIV03.Okerson.pdf
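In case the link is unreachable, one common way to get all pairs is a PROC SQL self-join. The following is only a rough sketch, not taken from the paper: it assumes a data set named cityzips with a numeric variable zip, and the output names zip_pairs, start_zip and end_zip are made up for illustration.

```sas
/* Sketch: every unordered pair of zip codes via a self-join.      */
/* Assumes data set CITYZIPS with a numeric variable ZIP.          */
/* ZIP_PAIRS, START_ZIP and END_ZIP are hypothetical names.        */
proc sql;
  create table zip_pairs as
  select a.zip as start_zip,
         b.zip as end_zip,
         zipcitydistance(a.zip, b.zip) as distance  /* miles */
  from cityzips as a, cityzips as b
  where a.zip < b.zip   /* keep each pair once and skip self-pairs */
  order by start_zip, distance;
quit;
```

Keep in mind that n zip codes produce n*(n-1)/2 pairs, so a file of roughly 32,000 zips would yield on the order of half a billion rows. Restricting the join (for example to zips within the same state) may be worth considering before running it on the full file.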

Contributor
Posts: 23

Re: Zip Code Euclidean Distance

I can't open this link. I googled it and it won't open from Google either. Hmmm, hopefully my internet starts working properly. Thanks for the help!

Esteemed Advisor
Posts: 7,288

Re: Zip Code Euclidean Distance

Works for me.  Here is one of the example methods shown in that paper.  The author didn't use the function, as it turned out, but the method is still appropriate.  She also doesn't indicate where &n came from, but I presume it was just the number of records in the file.

data city_distance;
  keep startcity endcity distance startprojx startprojy endprojx endprojy;
  set locations;
  /* atan(1)/45 = pi/180, which converts degrees to radians */
  startx=atan(1)/45*long;
  starty=atan(1)/45*lat;
  startcity=city;
  /* Get the projected values for annotate */
  startprojx=x;
  startprojy=y;
  /* Get the observations for each of the cities */
  /* &n is presumably the number of records in locations */
  do i=1 to &n;
    set locations point=i;
    endx=atan(1)/45*long;
    endy=atan(1)/45*lat;
    endcity=city;
    endprojx=x;
    endprojy=y;
    /* If start and end are the same, delete the observation */
    /* Note: DELETE also ends the current iteration, so later cities are not paired with this start city */
    if startcity = endcity then delete;
    /* Calculate distance between cities with Great Circle Distance Formula */
    else Distance = round(3949.99*arcos(sin(starty)*sin(endy)+cos(starty)*cos(endy)*cos(startx-endx)));
    output;
  end;
run;

Solution
‎11-01-2012 09:06 PM
Esteemed Advisor
Posts: 7,288

Re: Zip Code Euclidean Distance

Here is a simplified (?) version:

data cityzips;
  input zip;
  cards;
15217
12209
44101
;

/* Count the records so the array can be sized */
proc sql noprint;
  select count(*)
    into :nrecs
      from cityzips
  ;
quit;

data city_distance (drop=i j k zips:);
  array zips(&nrecs.);
  /* Load every zip into the array */
  do i=1 to &nrecs.;
    set cityzips;
    zips(i)=zip;
  end;
  /* Compare each zip with every other zip */
  do j=1 to &nrecs.;
    set cityzips;
    do k=1 to &nrecs.;
      if zip ne zips(k) then do;
        compare_zip=zips(k);
        distance=zipcitydistance(zip,compare_zip);
        output;
      end;
    end;
  end;
run;
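Since the original goal was to find the closest offices, a possible follow-up step is to keep only the nearest compare_zip for each zip. This is a sketch under the assumption that city_distance was built as above; the data set names sorted_pairs and nearest are hypothetical.

```sas
/* Sketch: nearest neighbour per zip from CITY_DISTANCE.            */
/* SORTED_PAIRS and NEAREST are hypothetical names.                 */
/* Missing distances (e.g. a zip the function could not look up)    */
/* are filtered out, because missing values sort before numbers.    */
proc sort data=city_distance(where=(distance ne .)) out=sorted_pairs;
  by zip distance;
run;

data nearest;
  set sorted_pairs;
  by zip;
  if first.zip;   /* first row per zip has the smallest distance */
run;
```

If two zips are tied for closest, this keeps only one of them, so it may need adjusting if ties matter.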

Contributor
Posts: 23

Re: Zip Code Euclidean Distance

Thank you! Sorry it took so long, just got back on the forum now.  I really appreciate your help!

Grand Advisor
Posts: 10,210

Re: Zip Code Euclidean Distance

If you have actual lat and long, you might look at the GEODIST function, which should be more precise than ZIPCITYDISTANCE. And if your lat/long measurements are in degrees, such as 38.45, no conversion of units is needed.
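As a rough illustration, GEODIST can be dropped into an all-pairs DATA step like this. The data set offices and its variables zip, lat and long (decimal degrees) are assumptions for the sketch, as are the output names; the 'DM' option asks for degrees in and miles out.

```sas
/* Sketch: all unordered pairs with GEODIST.                        */
/* Assumes a hypothetical data set OFFICES with ZIP, LAT and LONG   */
/* in decimal degrees. OFFICE_PAIRS and the START_ and END_ names   */
/* are made up for illustration.                                    */
data office_pairs;
  /* Read one "start" office per DATA step iteration */
  set offices(rename=(zip=start_zip lat=start_lat long=start_long));
  /* Pair it with every office after it, so each pair appears once */
  do k = _n_ + 1 to n;
    set offices(keep=zip lat long
                rename=(zip=end_zip lat=end_lat long=end_long))
        point=k nobs=n;
    miles = geodist(start_lat, start_long, end_lat, end_long, 'DM');
    output;
  end;
  keep start_zip end_zip miles;
run;
```

Using 'DK' instead of 'DM' should give kilometers.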

Contributor
Posts: 23

Re: Zip Code Euclidean Distance


Thanks, I have actual lat/long...I appreciate it

Esteemed Advisor
Posts: 7,288

Re: Zip Code Euclidean Distance

FWIW, the reference for GEODIST is the third link in my original response, and the function can be used with the same methodology suggested in my last post.

Contributor
Posts: 23

Re: Zip Code Euclidean Distance

Yes I saw that link....just thanking everyone.
