I'd be grateful for help figuring out how to get the shortest driving distance and driving time between a list of ~300 patients' zip codes and a separate list of >100 providers' cities. I've gotten the latitude and longitude for each zip code and city. This post was helpful in figuring out that Google Maps can help, but I'm having trouble getting it to work. FYI, I modified the macro slightly because I have concatenated the latitude and longitude into a single variable (first data step). Here are both the code I used and the resulting log: *Convert latitude and longitude into character variables and then concatenate them into a single variable;
data CBTI.PatientProviderLatLongOnly;
set CBTI.PatientProviderDistance;
yc=input(y,$15.); drop y; rename yc=y;
xc=input(x,$15.); drop x; rename xc=x;
latc=input(lat,$15.); drop lat; rename latc=lat;
longc=input(long,$15.); drop long; rename longc=long;
addr1=catx(',',y,x);
addr2=catx(',',lat,long);
run;
*Create version of the dataset with only ID and coordinates;
data CBTI.PatientProviderLatLongOnly;
set CBTI.PatientProviderLatLongOnly;
keep id addr1 addr2;
run;
*Check the dataset;
proc print data=CBTI.PatientProviderLatLongOnly;
run;
*Verify coordinate variables are character;
proc contents data=CBTI.PatientProviderLatLongOnly;
run;
data CBTI;
set CBTI.PatientProviderLatLongOnly;
run;
*Place number of observations in a macro variable;
data _null_;
call symputx('Observations',obs);
stop;
set CBTI nobs=obs;
run;
*Delete any data set named DISTANCE_TIME that might exist in the WORK library;
proc datasets lib=work nolist;
delete distance_time;
quit;
*Create a macro that contains a loop to access Google Maps multiple times;
* create a macro that contains a loop to access Google Maps multiple time;
%macro distance_time;
* delete any data set named DISTANCE_TIME that might exist in the WORK library;
proc datasets lib=work nolist;
delete distance_time;
quit;
%do j=1 %to 5/*&Observations*/;
data _null_;
nrec = &j;
set CBTI point=nrec;
%let ll1=addr1;
%let ll2=addr2;
stop;
run;
* lat/long of centroid of zip 12203 hard-coded as part of the URL;
filename x url "https://www.google.com/maps/dir/&ll1/&ll2/?force=lite";
filename z temp;
* same technique used in the example with a pair of lat/long coodinates;
data _null_;
infile x recfm=f lrecl=1 end=eof;
file z recfm=f lrecl=1;
input @1 x $char1.;
put @1 x $char1.;
if eof;
call symputx('filesize',_n_);
run;
* drive time as a numeric variable;
data temp;
infile z recfm=f lrecl=&filesize. eof=done;
input @ 'miles' +(-15) @ '"' distance :comma12. text $30.;
units = scan(text,1,'"');
text = scan(text,3,'"');
* convert times to seconds;
select;
* combine days and hours;
when (find(text,'d') ne 0) time = sum(86400*input(scan(text,1,' '),best.),
3600*input(scan(text,3,' '),best.));
* combine hours and minutes;
when (find(text,'h') ne 0) time = sum(3600*input(scan(text,1,' '),best.),
60*input(scan(text,3,' '),best.));
* just minutes;
otherwise time = 60*input(scan(text,1,' '),best.);
end;
output;
keep distance time;
stop;
done:
output;
run;
filename x clear;
filename z clear;
* add an observation to the data set DISTANCE_TIME;
proc append base=distance_time data=temp;
run;
%end;
%mend;
* use the macro;
%distance_time;
*
add variables from original data set to new data set distance_time
use geodist function to calculate straight line distance
;
data distance_time;
set distance_time;
set CBTI point=_n_;
straight_line = round(geodist(&ll1/&ll2,'DM'), 0.01);
run;
proc print data=distance_time noobs label;
var x y time distance straight_line zip city statecode;
format zip z5. time time6. ;
run; Here's the log: 136 *Convert latitude and longitude into character variables and then concatenate them into a
136! single variable;
137 data CBTI.PatientProviderLatLongOnly;
138 set CBTI.PatientProviderDistance;
139
140 yc=input(y,$15.); drop y; rename yc=y;
141 xc=input(x,$15.); drop x; rename xc=x;
142 latc=input(lat,$15.); drop lat; rename latc=lat;
143 longc=input(long,$15.); drop long; rename longc=long;
144
145 addr1=catx(',',y,x);
146 addr2=catx(',',lat,long);
147 run;
NOTE: Numeric values have been converted to character values at the places given by:
(Line):(Column).
140:10 141:10 142:12 143:13
NOTE: There were 697 observations read from the data set CBTI.PATIENTPROVIDERDISTANCE.
NOTE: The data set CBTI.PATIENTPROVIDERLATLONGONLY has 697 observations and 13 variables.
NOTE: DATA statement used (Total process time):
real time 0.09 seconds
cpu time 0.01 seconds
148
149 *Create version of the dataset with only ID and coordinates;
150 data CBTI.PatientProviderLatLongOnly;
151 set CBTI.PatientProviderLatLongOnly;
152 keep id addr1 addr2;
153 run;
NOTE: There were 697 observations read from the data set CBTI.PATIENTPROVIDERLATLONGONLY.
NOTE: The data set CBTI.PATIENTPROVIDERLATLONGONLY has 697 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.03 seconds
154
155 *Check the dataset;
156 proc print data=CBTI.PatientProviderLatLongOnly;
157 run;
NOTE: There were 697 observations read from the data set CBTI.PATIENTPROVIDERLATLONGONLY.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
158
159 *Verify coordinate variables are character;
160 proc contents data=CBTI.PatientProviderLatLongOnly;
161 run;
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
162
163 data CBTI;
164 set CBTI.PatientProviderLatLongOnly;
165 run;
NOTE: There were 697 observations read from the data set CBTI.PATIENTPROVIDERLATLONGONLY.
NOTE: The data set WORK.CBTI has 697 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
166
167 *Place number of observations in a macro variable;
168 data _null_;
169 call symputx('Observations',obs);
170 stop;
171 set CBTI nobs=obs;
172 run;
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
173
174 *Delete any data set named DISTANCE_TIME that might exist in the WORK library;
175 proc datasets lib=work nolist;
176 delete distance_time;
177 quit;
NOTE: Deleting WORK.DISTANCE_TIME (memtype=DATA).
NOTE: PROCEDURE DATASETS used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
178
179 *Create a macro that contains a loop to access Google Maps multiple times;
180 * create a macro that contains a loop to access Google Maps multiple time;
181 %macro distance_time;
182
183 * delete any data set named DISTANCE_TIME that might exist in the WORK library;
184 proc datasets lib=work nolist;
185 delete distance_time;
186 quit;
187
188 %do j=1 %to 5/*&Observations*/;
189 data _null_;
190 nrec = &j;
191 set CBTI point=nrec;
192 %let ll1=addr1;
193 %let ll2=addr2;
194 stop;
195 run;
196
197 * lat/long of centroid of zip 12203 hard-coded as part of the URL;
198 filename x url "https://www.google.com/maps/dir/&ll1/&ll2/?force=lite";
199 filename z temp;
200
201 * same technique used in the example with a pair of lat/long coodinates;
202 data _null_;
203 infile x recfm=f lrecl=1 end=eof;
204 file z recfm=f lrecl=1;
205 input @1 x $char1.;
206 put @1 x $char1.;
207 if eof;
208 call symputx('filesize',_n_);
209 run;
210
211 * drive time as a numeric variable;
212 data temp;
213 infile z recfm=f lrecl=&filesize. eof=done;
214 input @ 'miles' +(-15) @ '"' distance :comma12. text $30.;
215 units = scan(text,1,'"');
216 text = scan(text,3,'"');
217 * convert times to seconds;
218 select;
219 * combine days and hours;
220 when (find(text,'d') ne 0) time = sum(86400*input(scan(text,1,' '),best.),
221 3600*input(scan(text,3,' '),best.));
222 * combine hours and minutes;
223 when (find(text,'h') ne 0) time = sum(3600*input(scan(text,1,' '),best.),
224 60*input(scan(text,3,' '),best.));
225 * just minutes;
226 otherwise time = 60*input(scan(text,1,' '),best.);
227 end;
228 output;
229 keep distance time;
230 stop;
231 done:
232 output;
233 run;
234
235 filename x clear;
236 filename z clear;
237
238 * add an observation to the data set DISTANCE_TIME;
239 proc append base=distance_time data=temp;
240 run;
241 %end;
242 %mend;
243
244 * use the macro;
245 %distance_time;
NOTE: The file WORK.DISTANCE_TIME (memtype=DATA) was not found, but appears on a DELETE
statement.
NOTE: PROCEDURE DATASETS used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: The infile X is:
Filename=https://www.google.com/maps/dir/addr1/addr2/?force=lite,
Local Host Name=VMW7-GONZALBD03,
Local Host IP addr=fe80::255e:1b35:74e2:2b4f%12,
Service Hostname Name=mia07s47-in-f4.1e100.net,
Service IP addr=172.217.8.68,Service Name=N/A,
Service Portno=443,Lrecl=1,Recfm=Fixed
NOTE: The file Z is:
Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary
Files\_TD6520_VMW7-GONZALBD03_\#LN00106,
RECFM=F,LRECL=1,File Size (bytes)=0,
Last Modified=03Oct2019:08:13:56,
Create Time=03Oct2019:08:13:56
NOTE: 50233 records were read from the infile X.
NOTE: 50233 records were written to the file Z.
NOTE: DATA statement used (Total process time):
real time 0.46 seconds
cpu time 0.04 seconds
NOTE: The infile Z is:
Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary
Files\_TD6520_VMW7-GONZALBD03_\#LN00106,
RECFM=F,LRECL=50233,File Size (bytes)=50233,
Last Modified=03Oct2019:08:13:56,
Create Time=03Oct2019:08:13:56
NOTE: 1 record was read from the infile Z.
NOTE: SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line.
NOTE: The data set WORK.TEMP has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
NOTE: Fileref X has been deassigned.
NOTE: Fileref Z has been deassigned.
NOTE: Appending WORK.TEMP to WORK.DISTANCE_TIME.
NOTE: BASE data set does not exist. DATA file is being copied to BASE file.
NOTE: There were 1 observations read from the data set WORK.TEMP.
NOTE: The data set WORK.DISTANCE_TIME has 1 observations and 2 variables.
NOTE: PROCEDURE APPEND used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: The infile X is:
Filename=https://www.google.com/maps/dir/addr1/addr2/?force=lite,
Local Host Name=VMW7-GONZALBD03,
Local Host IP addr=fe80::255e:1b35:74e2:2b4f%12,
Service Hostname Name=mia07s47-in-f4.1e100.net,
Service IP addr=172.217.8.68,Service Name=N/A,
Service Portno=443,Lrecl=1,Recfm=Fixed
NOTE: The file Z is:
Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary
Files\_TD6520_VMW7-GONZALBD03_\#LN00107,
RECFM=F,LRECL=1,File Size (bytes)=0,
Last Modified=03Oct2019:08:13:57,
Create Time=03Oct2019:08:13:57
NOTE: 50236 records were read from the infile X.
NOTE: 50236 records were written to the file Z.
NOTE: DATA statement used (Total process time):
real time 0.70 seconds
cpu time 0.03 seconds
NOTE: The infile Z is:
Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary
Files\_TD6520_VMW7-GONZALBD03_\#LN00107,
RECFM=F,LRECL=50236,File Size (bytes)=50236,
Last Modified=03Oct2019:08:13:57,
Create Time=03Oct2019:08:13:57
NOTE: 1 record was read from the infile Z.
NOTE: SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line.
NOTE: The data set WORK.TEMP has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
NOTE: Fileref X has been deassigned.
NOTE: Fileref Z has been deassigned.
NOTE: Appending WORK.TEMP to WORK.DISTANCE_TIME.
NOTE: There were 1 observations read from the data set WORK.TEMP.
NOTE: 1 observations added.
NOTE: The data set WORK.DISTANCE_TIME has 2 observations and 2 variables.
NOTE: PROCEDURE APPEND used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: The infile X is:
Filename=https://www.google.com/maps/dir/addr1/addr2/?force=lite,
Local Host Name=VMW7-GONZALBD03,
Local Host IP addr=fe80::255e:1b35:74e2:2b4f%12,
Service Hostname Name=mia07s47-in-f4.1e100.net,
Service IP addr=172.217.8.68,Service Name=N/A,
Service Portno=443,Lrecl=1,Recfm=Fixed
NOTE: The file Z is:
Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary
Files\_TD6520_VMW7-GONZALBD03_\#LN00108,
RECFM=F,LRECL=1,File Size (bytes)=0,
Last Modified=03Oct2019:08:13:57,
Create Time=03Oct2019:08:13:57
NOTE: 50236 records were read from the infile X.
NOTE: 50236 records were written to the file Z.
NOTE: DATA statement used (Total process time):
real time 0.42 seconds
cpu time 0.06 seconds
NOTE: The infile Z is:
Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary
Files\_TD6520_VMW7-GONZALBD03_\#LN00108,
RECFM=F,LRECL=50236,File Size (bytes)=50236,
Last Modified=03Oct2019:08:13:57,
Create Time=03Oct2019:08:13:57
NOTE: 1 record was read from the infile Z.
NOTE: SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line.
NOTE: The data set WORK.TEMP has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
NOTE: Fileref X has been deassigned.
NOTE: Fileref Z has been deassigned.
NOTE: Appending WORK.TEMP to WORK.DISTANCE_TIME.
NOTE: There were 1 observations read from the data set WORK.TEMP.
NOTE: 1 observations added.
NOTE: The data set WORK.DISTANCE_TIME has 3 observations and 2 variables.
NOTE: PROCEDURE APPEND used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: The infile X is:
Filename=https://www.google.com/maps/dir/addr1/addr2/?force=lite,
Local Host Name=VMW7-GONZALBD03,
Local Host IP addr=fe80::255e:1b35:74e2:2b4f%12,
Service Hostname Name=mia07s47-in-f4.1e100.net,
Service IP addr=172.217.8.68,Service Name=N/A,
Service Portno=443,Lrecl=1,Recfm=Fixed
NOTE: The file Z is:
Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary
Files\_TD6520_VMW7-GONZALBD03_\#LN00109,
RECFM=F,LRECL=1,File Size (bytes)=0,
Last Modified=03Oct2019:08:13:58,
Create Time=03Oct2019:08:13:58
NOTE: 50252 records were read from the infile X.
NOTE: 50252 records were written to the file Z.
NOTE: DATA statement used (Total process time):
real time 0.40 seconds
cpu time 0.00 seconds
NOTE: The infile Z is:
Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary
Files\_TD6520_VMW7-GONZALBD03_\#LN00109,
RECFM=F,LRECL=50252,File Size (bytes)=50252,
Last Modified=03Oct2019:08:13:58,
Create Time=03Oct2019:08:13:58
NOTE: 1 record was read from the infile Z.
NOTE: SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line.
NOTE: The data set WORK.TEMP has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
NOTE: Fileref X has been deassigned.
NOTE: Fileref Z has been deassigned.
NOTE: Appending WORK.TEMP to WORK.DISTANCE_TIME.
NOTE: There were 1 observations read from the data set WORK.TEMP.
NOTE: 1 observations added.
NOTE: The data set WORK.DISTANCE_TIME has 4 observations and 2 variables.
NOTE: PROCEDURE APPEND used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: The infile X is:
Filename=https://www.google.com/maps/dir/addr1/addr2/?force=lite,
Local Host Name=VMW7-GONZALBD03,
Local Host IP addr=fe80::255e:1b35:74e2:2b4f%12,
Service Hostname Name=mia07s47-in-f4.1e100.net,
Service IP addr=172.217.8.68,Service Name=N/A,
Service Portno=443,Lrecl=1,Recfm=Fixed
NOTE: The file Z is:
Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary
Files\_TD6520_VMW7-GONZALBD03_\#LN00110,
RECFM=F,LRECL=1,File Size (bytes)=0,
Last Modified=03Oct2019:08:13:58,
Create Time=03Oct2019:08:13:58
NOTE: 50240 records were read from the infile X.
NOTE: 50240 records were written to the file Z.
NOTE: DATA statement used (Total process time):
real time 0.49 seconds
cpu time 0.07 seconds
NOTE: The infile Z is:
Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary
Files\_TD6520_VMW7-GONZALBD03_\#LN00110,
RECFM=F,LRECL=50240,File Size (bytes)=50240,
Last Modified=03Oct2019:08:13:58,
Create Time=03Oct2019:08:13:58
NOTE: 1 record was read from the infile Z.
NOTE: SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line.
NOTE: The data set WORK.TEMP has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
NOTE: Fileref X has been deassigned.
NOTE: Fileref Z has been deassigned.
NOTE: Appending WORK.TEMP to WORK.DISTANCE_TIME.
NOTE: There were 1 observations read from the data set WORK.TEMP.
NOTE: 1 observations added.
NOTE: The data set WORK.DISTANCE_TIME has 5 observations and 2 variables.
NOTE: PROCEDURE APPEND used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
246
247 *
248 add variables from original data set to new data set distance_time
249 use geodist function to calculate straight line distance
250 ;
251 data distance_time;
252 set distance_time;
253 set CBTI point=_n_;
254 straight_line = round(geodist(&ll1/&ll2,'DM'), 0.01);
-
22
WARNING: Apparent symbolic reference LL1 not resolved.
WARNING: Apparent symbolic reference LL2 not resolved.
254 straight_line = round(geodist(&ll1/&ll2,'DM'), 0.01);
-
22
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string,
a numeric constant, a datetime constant, a missing value, ), INPUT, PUT.
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string,
a numeric constant, a datetime constant, a missing value, INPUT, PUT.
WARNING: Apparent symbolic reference LL2 not resolved.
254 straight_line = round(geodist(&ll1/&ll2,'DM'), 0.01);
-------
71
ERROR 71-185: The GEODIST function call does not have enough arguments.
255 run;
NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
254:41
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.DISTANCE_TIME may be incomplete. When this step was stopped there
were 0 observations and 8 variables.
WARNING: Data set WORK.DISTANCE_TIME was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
256
257 proc print data=distance_time noobs label;
258 var x y time distance straight_line zip city statecode;
ERROR: Variable X not found.
ERROR: Variable Y not found.
ERROR: Variable STRAIGHT_LINE not found.
ERROR: Variable ZIP not found.
ERROR: Variable CITY not found.
ERROR: Variable STATECODE not found.
259 format zip z5. time time6. ;
260 run;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds And here's what I get when I try to print the distance_time dataset: Obs distance time 1 . . 2 . . 3 . . 4 . . 5 . .
... View more