BookmarkSubscribeRSS Feed
jesspurse
Obsidian | Level 7

I'm trying to find the distance between 3,600 pairs of zipcodes, patient and their provider. I attached 50 pairs in a CSV file. Patient zipcode is patz, and provider zipcode is provz.

 

I run SAS 9.4 on Windows 10 in a VM using Parallels, so I'm used to things taking a little bit longer than they would when I run it on a Windows computer. Except I left it to run overnight last night, and it wasn't done seven hours later.

 

Is there a problem with my code?

 

libname ob "\\Mac\Home\Desktop\SAS";

proc import file="\\Mac\Home\Desktop\zip.csv"
dbms=csv
out=zip
replace;
run;

data _null_;
call symputx('nzips',obs);
stop;
set zip nobs=obs;
run;

%macro distance_time (ds=, z1=, z2=, out=);
%do j=1 %to &nzips;
DATA _null_;
nrec = &j;
set zip point=nrec;
call symputx('patz',&z1);
call symputx('provz',&z2);
RUN;

filename x url "https://www.google.com/maps/dir/&z1/&z2/?force=lite";
filename z temp;

data _null_; 
infile x recfm=f lrecl=1 end=eof; 
file z recfm=f lrecl=1;
input @1 x $char1.; 
put @1 x $char1.;
if eof;
call symputx('filesize',_n_);
run;

data temp;
keep z1 z2 distance units time;
z1="&z1";
z2="&z2";
infile z recfm=f lrecl=&filesize. eof=done;
input @ 'miles' +(-15) @ '"' distance :comma12. text $30.;
units    = scan(text,1,'"');
text     = scan(text,3,'"');
 
  select;
   when (find(text,'d') ne 0)  time = sum(86400*input(scan(text,1,' '),best.), 
                                        3600*input(scan(text,3,' '),best.));
   when (find(text,'h') ne 0)  time = sum(3600*input(scan(text,1,' '),best.), 
                                        60*input(scan(text,3,' '),best.));
   otherwise                   time = 60*input(scan(text,1,' '),best.);
  end;
 
output; 
keep z1 z2 distance units time;
done:
output;
run;
 
filename x clear;
filename z clear;

proc append base=distance_time data=temp force;
run;
%end;
%mend;
 
%distance_time (ds=zip, z1=patz, z2=provz, out=dist_time);

data ob.distance_time;
set dist_time;
run;

proc export data=ob.distance_time
dbms=xlsx
outfile="\\Mac\Home\Desktop\distance.xlsx"
replace;
run;

Thanks!!!

4 REPLIES 4
Rick_SAS
SAS Super FREQ

The main thing wrong with your code is the structure. The program uses a macro loop to loop over pairs and computes the distance between pairs and then uses a formula to compute the distance and uses PROC APPEND to append the results.

 

Just create a data set that contains the pairs that you want to analyze and then process them all by using a single DATA step.

 

PS> If you want to know why the macro loop + PROC APPEND method is so slow, see the article "Simulation in SAS: The slow way or the BY way", which analyzes macro loops in the context of simulation studies.

Tom
Super User Tom
Super User

Take the macro out of the problem.  Does it run for one pair of zip codes?

 

Do you really expect the TEMP data step to generate more than one observation?  If not then insert a STOP statement after the OUTPUT statement.

ballardw
Super User

And after you get this to work for  few pairs you might want to look into the SAS ZIPCITYDISTANCE function and see if the results differ by enough to be a problem.

 

data example;
   input patz provz;
   distance = zipcitydistance(patz,provz);
datalines;
28110 29403
28110 29440
28110 29520
;

The distance returned is in miles.

 

The ZIPCITYDISTANCE function returns the geodetic distance in miles between two ZIP code locations. The centroid of each ZIP code is used in the calculation.
The Sashelp.Zipcode data set must be present when you use this function. If you remove the data set, then ZIPCITYDISTANCE returns unexpected results.
Tom
Super User Tom
Super User

Note that Google is probably not happy with so many hits on its service from the same location.  It might well be slowing down its response or ignoring you all together.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 454 views
  • 2 likes
  • 4 in conversation