BookmarkSubscribeRSS Feed
art297
Opal | Level 21

Mike already provided the code in his paper.  I would still start with proc sql to create a file with all combinations of the zip codes, then wrap Mike's code in a Macro that loops through the file you created with proc sql.  All you have to pass into the macro are the zip code pairs and run it for each pair.

That would simply replace the suggested datastep that is using the SAS zip/distance function.

The rest of the code, whichever method you choose, would remain the same.

Darrell_sas
SAS Employee

You should carefully read the Google Maps terms of use documents before using the drive time code to make sure it isn't in violations of their terms.  At one time, they specified that 'services' could only be used if the information was being put on a Google map.

MikeZdeb
Rhodochrosite | Level 12

hi ... some other ideas ...

#1 this also uses the ZIPCITYDISTANCE function ... basically Robert's ideas in one use of PROC SQL, though I took out the condition 

where usa1.zip^=metro_rank.metro_zip;

since I think it'd be OK if the closest USA zip to a METRO zip was the same zip, yes/no?

the first part cleans up USA1

it replaces spaces in ADDZIP with asterisks then creates a numeric variable UZIP using the 1st five characters of ADDZIP

(you get some LOG messages about INVALID strings)

it also checks to see if UZIP is in SASHELP.ZIPCODE since ZIPCITYDISTANCE uses that data set

the second part grabs the top 50 observations in METRO_RANK

the third part computes the distance and grabs the minimum distance between a TIC and a top 50 CITY

you get some LOG messages about INVALID  strings

proc sql;

create view usa as

select unique tic, input(translate(addzip,'*',' '), 5.) as uzip format=z5.

from z.usa1, sashelp.zipcode (keep=zip) as zc

having uzip eq zc.zip;

create view metro as

select city, sum_rank, input(zipcode, 5.) as mzip format=z5.

from z.metro_rank (where=(sum_rank between 1 and 50));

create table pairs as

select *, zipcitydistance(uzip, mzip) as dist

from usa, metro

group tic

having dist eq min(dist);

quit;

#2  this would be too big a problem to for the SAS and Google Maps drive distance/time (not really intended for 500K+ calls to Google Maps)

#3  the code in the original SGF paper no longer works ... a link to the most recent code canbe found at ...

http://www.sascommunity.org/wiki/User:Msz03

(but remember, your problem is TOO BIG)

#4  Darrell's comment of "... Google Maps terms of use ..." ... I looked (again) at ...

http://www.google.com/help/terms_maps.html

and see the two possible sticking points as how one would interpret ...

(a) access or use the Products or any Content through any technology or means other than those provided in the Products, or through other explicitly authorized means Google may designate (such as through the Google Maps/Google Earth APIs);

and

(e) use the Products in a manner that gives you or any other person access to mass downloads or bulk feeds of any Content, including but not limited to numerical latitude or longitude coordinates, imagery, and visible map data;

I see at ... http://blog.programmableweb.com/2011/10/27/google-maps-usage-fees-how-many-developers-will-have-to-p...http://code.google.com/apis/maps/faq.html#usagelimits

that folks using Google APIs for map access have a 25,000 free maps/day, after that there's a charge

#5  take a look at ... http://www.tandfonline.com/doi/abs/10.1080/00330124.2011.583586#preview

Boscoe FP, Henry KA, Zdeb MS.

A Nationwide Comparison of Driving Distance Versus Straight Line Distance to Hospitals.

Professional Geographer 2012

the conclusion in that paper is that in most instances, the added precision of driving distance over straight line distance isn't worth the effort

mspak
Quartz | Level 8

Thanks for suggestions,

I think I need time to figure out the correct ways to calculate the drive time efficiently. Some locations may not be accessible by roads. This task is much more complicated as compared to merely calculating the straight line distance between locations as it involves the use of google maps and a big country - US.

There might be many different possible routes on road from one location to another. I will have to choose, the lowest  drive time among the possible routes.

As requested by art297, I will share some useful tips if I found any good way to handle this complicated calculations efficiently.

Thank you.

Darrell_sas
SAS Employee

One performance suggestion to Mike's program.  Use the 'geodist' function instead of 'zipcitydistance'.   ZipCityDistance is being passed the ZIP Code of every city times every tic ZIP Code.  It must find the location of the ZIP Code for each ZIP Code every time it is called.  It would be faster to just lookup each location once. 

The first view in Mike's code can be tweaked to save the Y, X (Lat, Long) values from the SASHELP.ZIPCODE data set.

This change would replace his code:

     select unique tic, zc.x, zc.y, input(translate(addzip,'*',' '), 5.) as uzip format=z5.

     from z.usa1, sashelp.zipcode (keep=zip x y) as zc

The metro_rank data set already has the Latitude and Longitude for each city (I'm assuming it is correct).  This just needs to be kept in the data with the following change:

     select city, sum_rank, latitude, longitude, input(zipcode, 5.) as mzip format=z5.

Finally, the line with zipcitydistance should be changed to use geodist:

      select *, geodist(y, x, latitude, longitude, "M") as dist


A couple of notes:

- The metro_rank data set has a location in Puerto Rico in the top 50.  Not sure if that matters.

- The code allows for a tie - 2 or more cities the same distance away.

mspak
Quartz | Level 8

Hi again,

The metro_rank data is extracted from a US Zip code database and US 2010 Census data. I assume that it is correct. My longitude and latitude data, are in decimals. Therefore, I think they must be converted into radians before calculations.

I notice that  zipcitydistance (1 decimal point) output is very close to the geodist. I think,  zipcitydistance is more than enough to fit my purpose. If not mistaken, the zipcitydistance function is not applicable for US but not for other countries. If I were to calculate the distances in other countries, I must noit use the zipcitydistance function.

Thank you anyway - it is useful for my other projects.Smiley Happy

art297
Opal | Level 21

While we all obviously have to respect terms of agreement, and I'm not familiar with what the following authors did, I just noticed an interesting paper title scheduled to be presented at this year's SGF:

Batch Production of Driving Distances and Times Using SAS® and Web Map APIs

Ash Roy, Canadian Institute for Health Information

Yingbo Na, Canadian Institute for Health Information

http://sgf2012.confnav.com/sgf2012/web/sessions/date/view.html#20111103194323199460000000

I can't attend their presentation, as I'm presenting a paper at the same time that those authors are presenting, but you can probably contact them now and get some additional answers to your question.

If you discover anything useful I, for one, would appreciate your posting a summary here.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 21 replies
  • 3949 views
  • 6 likes
  • 7 in conversation