- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'd be grateful for help figuring out how to get the shortest driving distance and driving time between a list of ~300 patients' zip codes and a separate list of >100 providers' cities. I've gotten the latitude and longitude for each zip code and city. This post was helpful in figuring out that Google Maps can help, but I'm having trouble getting it to work. FYI, I modified the macro slightly because I have concatenated the latitude and longitude into a single variable (first data step).
Here are both the code I used and the resulting log:
*Convert latitude and longitude into character variables and then concatenate them into a single variable;
data CBTI.PatientProviderLatLongOnly;
set CBTI.PatientProviderDistance;
yc=input(y,$15.); drop y; rename yc=y;
xc=input(x,$15.); drop x; rename xc=x;
latc=input(lat,$15.); drop lat; rename latc=lat;
longc=input(long,$15.); drop long; rename longc=long;
addr1=catx(',',y,x);
addr2=catx(',',lat,long);
run;
*Create version of the dataset with only ID and coordinates;
data CBTI.PatientProviderLatLongOnly;
set CBTI.PatientProviderLatLongOnly;
keep id addr1 addr2;
run;
*Check the dataset;
proc print data=CBTI.PatientProviderLatLongOnly;
run;
*Verify coordinate variables are character;
proc contents data=CBTI.PatientProviderLatLongOnly;
run;
data CBTI;
set CBTI.PatientProviderLatLongOnly;
run;
*Place number of observations in a macro variable;
data _null_;
call symputx('Observations',obs);
stop;
set CBTI nobs=obs;
run;
*Delete any data set named DISTANCE_TIME that might exist in the WORK library;
proc datasets lib=work nolist;
delete distance_time;
quit;
*Create a macro that contains a loop to access Google Maps multiple times;
* create a macro that contains a loop to access Google Maps multiple time;
%macro distance_time;
* delete any data set named DISTANCE_TIME that might exist in the WORK library;
proc datasets lib=work nolist;
delete distance_time;
quit;
%do j=1 %to 5/*&Observations*/;
data _null_;
nrec = &j;
set CBTI point=nrec;
%let ll1=addr1;
%let ll2=addr2;
stop;
run;
* lat/long of centroid of zip 12203 hard-coded as part of the URL;
filename x url "https://www.google.com/maps/dir/&ll1/&ll2/?force=lite";
filename z temp;
* same technique used in the example with a pair of lat/long coodinates;
data _null_;
infile x recfm=f lrecl=1 end=eof;
file z recfm=f lrecl=1;
input @1 x $char1.;
put @1 x $char1.;
if eof;
call symputx('filesize',_n_);
run;
* drive time as a numeric variable;
data temp;
infile z recfm=f lrecl=&filesize. eof=done;
input @ 'miles' +(-15) @ '"' distance :comma12. text $30.;
units = scan(text,1,'"');
text = scan(text,3,'"');
* convert times to seconds;
select;
* combine days and hours;
when (find(text,'d') ne 0) time = sum(86400*input(scan(text,1,' '),best.),
3600*input(scan(text,3,' '),best.));
* combine hours and minutes;
when (find(text,'h') ne 0) time = sum(3600*input(scan(text,1,' '),best.),
60*input(scan(text,3,' '),best.));
* just minutes;
otherwise time = 60*input(scan(text,1,' '),best.);
end;
output;
keep distance time;
stop;
done:
output;
run;
filename x clear;
filename z clear;
* add an observation to the data set DISTANCE_TIME;
proc append base=distance_time data=temp;
run;
%end;
%mend;
* use the macro;
%distance_time;
*
add variables from original data set to new data set distance_time
use geodist function to calculate straight line distance
;
data distance_time;
set distance_time;
set CBTI point=_n_;
straight_line = round(geodist(&ll1/&ll2,'DM'), 0.01);
run;
proc print data=distance_time noobs label;
var x y time distance straight_line zip city statecode;
format zip z5. time time6. ;
run;
Here's the log:
136 *Convert latitude and longitude into character variables and then concatenate them into a 136! single variable; 137 data CBTI.PatientProviderLatLongOnly; 138 set CBTI.PatientProviderDistance; 139 140 yc=input(y,$15.); drop y; rename yc=y; 141 xc=input(x,$15.); drop x; rename xc=x; 142 latc=input(lat,$15.); drop lat; rename latc=lat; 143 longc=input(long,$15.); drop long; rename longc=long; 144 145 addr1=catx(',',y,x); 146 addr2=catx(',',lat,long); 147 run; NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column). 140:10 141:10 142:12 143:13 NOTE: There were 697 observations read from the data set CBTI.PATIENTPROVIDERDISTANCE. NOTE: The data set CBTI.PATIENTPROVIDERLATLONGONLY has 697 observations and 13 variables. NOTE: DATA statement used (Total process time): real time 0.09 seconds cpu time 0.01 seconds 148 149 *Create version of the dataset with only ID and coordinates; 150 data CBTI.PatientProviderLatLongOnly; 151 set CBTI.PatientProviderLatLongOnly; 152 keep id addr1 addr2; 153 run; NOTE: There were 697 observations read from the data set CBTI.PATIENTPROVIDERLATLONGONLY. NOTE: The data set CBTI.PATIENTPROVIDERLATLONGONLY has 697 observations and 3 variables. NOTE: DATA statement used (Total process time): real time 0.07 seconds cpu time 0.03 seconds 154 155 *Check the dataset; 156 proc print data=CBTI.PatientProviderLatLongOnly; 157 run; NOTE: There were 697 observations read from the data set CBTI.PATIENTPROVIDERLATLONGONLY. NOTE: PROCEDURE PRINT used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 158 159 *Verify coordinate variables are character; 160 proc contents data=CBTI.PatientProviderLatLongOnly; 161 run; NOTE: PROCEDURE CONTENTS used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 162 163 data CBTI; 164 set CBTI.PatientProviderLatLongOnly; 165 run; NOTE: There were 697 observations read from the data set CBTI.PATIENTPROVIDERLATLONGONLY. NOTE: The data set WORK.CBTI has 697 observations and 3 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 166 167 *Place number of observations in a macro variable; 168 data _null_; 169 call symputx('Observations',obs); 170 stop; 171 set CBTI nobs=obs; 172 run; NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.00 seconds 173 174 *Delete any data set named DISTANCE_TIME that might exist in the WORK library; 175 proc datasets lib=work nolist; 176 delete distance_time; 177 quit; NOTE: Deleting WORK.DISTANCE_TIME (memtype=DATA). NOTE: PROCEDURE DATASETS used (Total process time): real time 0.01 seconds cpu time 0.01 seconds 178 179 *Create a macro that contains a loop to access Google Maps multiple times; 180 * create a macro that contains a loop to access Google Maps multiple time; 181 %macro distance_time; 182 183 * delete any data set named DISTANCE_TIME that might exist in the WORK library; 184 proc datasets lib=work nolist; 185 delete distance_time; 186 quit; 187 188 %do j=1 %to 5/*&Observations*/; 189 data _null_; 190 nrec = &j; 191 set CBTI point=nrec; 192 %let ll1=addr1; 193 %let ll2=addr2; 194 stop; 195 run; 196 197 * lat/long of centroid of zip 12203 hard-coded as part of the URL; 198 filename x url "https://www.google.com/maps/dir/&ll1/&ll2/?force=lite"; 199 filename z temp; 200 201 * same technique used in the example with a pair of lat/long coodinates; 202 data _null_; 203 infile x recfm=f lrecl=1 end=eof; 204 file z recfm=f lrecl=1; 205 input @1 x $char1.; 206 put @1 x $char1.; 207 if eof; 208 call symputx('filesize',_n_); 209 run; 210 211 * drive time as a numeric variable; 212 data temp; 213 infile z recfm=f lrecl=&filesize. eof=done; 214 input @ 'miles' +(-15) @ '"' distance :comma12. text $30.; 215 units = scan(text,1,'"'); 216 text = scan(text,3,'"'); 217 * convert times to seconds; 218 select; 219 * combine days and hours; 220 when (find(text,'d') ne 0) time = sum(86400*input(scan(text,1,' '),best.), 221 3600*input(scan(text,3,' '),best.)); 222 * combine hours and minutes; 223 when (find(text,'h') ne 0) time = sum(3600*input(scan(text,1,' '),best.), 224 60*input(scan(text,3,' '),best.)); 225 * just minutes; 226 otherwise time = 60*input(scan(text,1,' '),best.); 227 end; 228 output; 229 keep distance time; 230 stop; 231 done: 232 output; 233 run; 234 235 filename x clear; 236 filename z clear; 237 238 * add an observation to the data set DISTANCE_TIME; 239 proc append base=distance_time data=temp; 240 run; 241 %end; 242 %mend; 243 244 * use the macro; 245 %distance_time; NOTE: The file WORK.DISTANCE_TIME (memtype=DATA) was not found, but appears on a DELETE statement. NOTE: PROCEDURE DATASETS used (Total process time): real time 0.01 seconds cpu time 0.01 seconds NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds NOTE: The infile X is: Filename=https://www.google.com/maps/dir/addr1/addr2/?force=lite, Local Host Name=VMW7-GONZALBD03, Local Host IP addr=fe80::255e:1b35:74e2:2b4f%12, Service Hostname Name=mia07s47-in-f4.1e100.net, Service IP addr=172.217.8.68,Service Name=N/A, Service Portno=443,Lrecl=1,Recfm=Fixed NOTE: The file Z is: Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary Files\_TD6520_VMW7-GONZALBD03_\#LN00106, RECFM=F,LRECL=1,File Size (bytes)=0, Last Modified=03Oct2019:08:13:56, Create Time=03Oct2019:08:13:56 NOTE: 50233 records were read from the infile X. NOTE: 50233 records were written to the file Z. NOTE: DATA statement used (Total process time): real time 0.46 seconds cpu time 0.04 seconds NOTE: The infile Z is: Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary Files\_TD6520_VMW7-GONZALBD03_\#LN00106, RECFM=F,LRECL=50233,File Size (bytes)=50233, Last Modified=03Oct2019:08:13:56, Create Time=03Oct2019:08:13:56 NOTE: 1 record was read from the infile Z. NOTE: SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line. NOTE: The data set WORK.TEMP has 1 observations and 2 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.00 seconds NOTE: Fileref X has been deassigned. NOTE: Fileref Z has been deassigned. NOTE: Appending WORK.TEMP to WORK.DISTANCE_TIME. NOTE: BASE data set does not exist. DATA file is being copied to BASE file. NOTE: There were 1 observations read from the data set WORK.TEMP. NOTE: The data set WORK.DISTANCE_TIME has 1 observations and 2 variables. NOTE: PROCEDURE APPEND used (Total process time): real time 0.00 seconds cpu time 0.00 seconds NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds NOTE: The infile X is: Filename=https://www.google.com/maps/dir/addr1/addr2/?force=lite, Local Host Name=VMW7-GONZALBD03, Local Host IP addr=fe80::255e:1b35:74e2:2b4f%12, Service Hostname Name=mia07s47-in-f4.1e100.net, Service IP addr=172.217.8.68,Service Name=N/A, Service Portno=443,Lrecl=1,Recfm=Fixed NOTE: The file Z is: Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary Files\_TD6520_VMW7-GONZALBD03_\#LN00107, RECFM=F,LRECL=1,File Size (bytes)=0, Last Modified=03Oct2019:08:13:57, Create Time=03Oct2019:08:13:57 NOTE: 50236 records were read from the infile X. NOTE: 50236 records were written to the file Z. NOTE: DATA statement used (Total process time): real time 0.70 seconds cpu time 0.03 seconds NOTE: The infile Z is: Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary Files\_TD6520_VMW7-GONZALBD03_\#LN00107, RECFM=F,LRECL=50236,File Size (bytes)=50236, Last Modified=03Oct2019:08:13:57, Create Time=03Oct2019:08:13:57 NOTE: 1 record was read from the infile Z. NOTE: SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line. NOTE: The data set WORK.TEMP has 1 observations and 2 variables. NOTE: DATA statement used (Total process time): real time 0.03 seconds cpu time 0.01 seconds NOTE: Fileref X has been deassigned. NOTE: Fileref Z has been deassigned. NOTE: Appending WORK.TEMP to WORK.DISTANCE_TIME. NOTE: There were 1 observations read from the data set WORK.TEMP. NOTE: 1 observations added. NOTE: The data set WORK.DISTANCE_TIME has 2 observations and 2 variables. NOTE: PROCEDURE APPEND used (Total process time): real time 0.00 seconds cpu time 0.00 seconds NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds NOTE: The infile X is: Filename=https://www.google.com/maps/dir/addr1/addr2/?force=lite, Local Host Name=VMW7-GONZALBD03, Local Host IP addr=fe80::255e:1b35:74e2:2b4f%12, Service Hostname Name=mia07s47-in-f4.1e100.net, Service IP addr=172.217.8.68,Service Name=N/A, Service Portno=443,Lrecl=1,Recfm=Fixed NOTE: The file Z is: Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary Files\_TD6520_VMW7-GONZALBD03_\#LN00108, RECFM=F,LRECL=1,File Size (bytes)=0, Last Modified=03Oct2019:08:13:57, Create Time=03Oct2019:08:13:57 NOTE: 50236 records were read from the infile X. NOTE: 50236 records were written to the file Z. NOTE: DATA statement used (Total process time): real time 0.42 seconds cpu time 0.06 seconds NOTE: The infile Z is: Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary Files\_TD6520_VMW7-GONZALBD03_\#LN00108, RECFM=F,LRECL=50236,File Size (bytes)=50236, Last Modified=03Oct2019:08:13:57, Create Time=03Oct2019:08:13:57 NOTE: 1 record was read from the infile Z. NOTE: SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line. NOTE: The data set WORK.TEMP has 1 observations and 2 variables. NOTE: DATA statement used (Total process time): real time 0.03 seconds cpu time 0.01 seconds NOTE: Fileref X has been deassigned. NOTE: Fileref Z has been deassigned. NOTE: Appending WORK.TEMP to WORK.DISTANCE_TIME. NOTE: There were 1 observations read from the data set WORK.TEMP. NOTE: 1 observations added. NOTE: The data set WORK.DISTANCE_TIME has 3 observations and 2 variables. NOTE: PROCEDURE APPEND used (Total process time): real time 0.01 seconds cpu time 0.00 seconds NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds NOTE: The infile X is: Filename=https://www.google.com/maps/dir/addr1/addr2/?force=lite, Local Host Name=VMW7-GONZALBD03, Local Host IP addr=fe80::255e:1b35:74e2:2b4f%12, Service Hostname Name=mia07s47-in-f4.1e100.net, Service IP addr=172.217.8.68,Service Name=N/A, Service Portno=443,Lrecl=1,Recfm=Fixed NOTE: The file Z is: Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary Files\_TD6520_VMW7-GONZALBD03_\#LN00109, RECFM=F,LRECL=1,File Size (bytes)=0, Last Modified=03Oct2019:08:13:58, Create Time=03Oct2019:08:13:58 NOTE: 50252 records were read from the infile X. NOTE: 50252 records were written to the file Z. NOTE: DATA statement used (Total process time): real time 0.40 seconds cpu time 0.00 seconds NOTE: The infile Z is: Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary Files\_TD6520_VMW7-GONZALBD03_\#LN00109, RECFM=F,LRECL=50252,File Size (bytes)=50252, Last Modified=03Oct2019:08:13:58, Create Time=03Oct2019:08:13:58 NOTE: 1 record was read from the infile Z. NOTE: SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line. NOTE: The data set WORK.TEMP has 1 observations and 2 variables. NOTE: DATA statement used (Total process time): real time 0.03 seconds cpu time 0.01 seconds NOTE: Fileref X has been deassigned. NOTE: Fileref Z has been deassigned. NOTE: Appending WORK.TEMP to WORK.DISTANCE_TIME. NOTE: There were 1 observations read from the data set WORK.TEMP. NOTE: 1 observations added. NOTE: The data set WORK.DISTANCE_TIME has 4 observations and 2 variables. NOTE: PROCEDURE APPEND used (Total process time): real time 0.01 seconds cpu time 0.01 seconds NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds NOTE: The infile X is: Filename=https://www.google.com/maps/dir/addr1/addr2/?force=lite, Local Host Name=VMW7-GONZALBD03, Local Host IP addr=fe80::255e:1b35:74e2:2b4f%12, Service Hostname Name=mia07s47-in-f4.1e100.net, Service IP addr=172.217.8.68,Service Name=N/A, Service Portno=443,Lrecl=1,Recfm=Fixed NOTE: The file Z is: Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary Files\_TD6520_VMW7-GONZALBD03_\#LN00110, RECFM=F,LRECL=1,File Size (bytes)=0, Last Modified=03Oct2019:08:13:58, Create Time=03Oct2019:08:13:58 NOTE: 50240 records were read from the infile X. NOTE: 50240 records were written to the file Z. NOTE: DATA statement used (Total process time): real time 0.49 seconds cpu time 0.07 seconds NOTE: The infile Z is: Filename=C:\Users\gonzalbd\AppData\Local\Temp\SAS Temporary Files\_TD6520_VMW7-GONZALBD03_\#LN00110, RECFM=F,LRECL=50240,File Size (bytes)=50240, Last Modified=03Oct2019:08:13:58, Create Time=03Oct2019:08:13:58 NOTE: 1 record was read from the infile Z. NOTE: SAS went to a new line when INPUT @'CHARACTER_STRING' scanned past the end of a line. NOTE: The data set WORK.TEMP has 1 observations and 2 variables. NOTE: DATA statement used (Total process time): real time 0.03 seconds cpu time 0.01 seconds NOTE: Fileref X has been deassigned. NOTE: Fileref Z has been deassigned. NOTE: Appending WORK.TEMP to WORK.DISTANCE_TIME. NOTE: There were 1 observations read from the data set WORK.TEMP. NOTE: 1 observations added. NOTE: The data set WORK.DISTANCE_TIME has 5 observations and 2 variables. NOTE: PROCEDURE APPEND used (Total process time): real time 0.01 seconds cpu time 0.01 seconds 246 247 * 248 add variables from original data set to new data set distance_time 249 use geodist function to calculate straight line distance 250 ; 251 data distance_time; 252 set distance_time; 253 set CBTI point=_n_; 254 straight_line = round(geodist(&ll1/&ll2,'DM'), 0.01); - 22 WARNING: Apparent symbolic reference LL1 not resolved. WARNING: Apparent symbolic reference LL2 not resolved. 254 straight_line = round(geodist(&ll1/&ll2,'DM'), 0.01); - 22 ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, a numeric constant, a datetime constant, a missing value, ), INPUT, PUT. ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, a numeric constant, a datetime constant, a missing value, INPUT, PUT. WARNING: Apparent symbolic reference LL2 not resolved. 254 straight_line = round(geodist(&ll1/&ll2,'DM'), 0.01); ------- 71 ERROR 71-185: The GEODIST function call does not have enough arguments. 255 run; NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column). 254:41 NOTE: The SAS System stopped processing this step because of errors. WARNING: The data set WORK.DISTANCE_TIME may be incomplete. When this step was stopped there were 0 observations and 8 variables. WARNING: Data set WORK.DISTANCE_TIME was not replaced because this step was stopped. NOTE: DATA statement used (Total process time): real time 0.03 seconds cpu time 0.03 seconds 256 257 proc print data=distance_time noobs label; 258 var x y time distance straight_line zip city statecode; ERROR: Variable X not found. ERROR: Variable Y not found. ERROR: Variable STRAIGHT_LINE not found. ERROR: Variable ZIP not found. ERROR: Variable CITY not found. ERROR: Variable STATECODE not found. 259 format zip z5. time time6. ; 260 run; NOTE: The SAS System stopped processing this step because of errors. NOTE: PROCEDURE PRINT used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
And here's what I get when I try to print the distance_time dataset:
Obs distance time
1 . .
2 . .
3 . .
4 . .
5 . .
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
You can use
options mlogic mprint symbolgen;
to get additional debug info in the log for macros.
In your macro there is the following data step :
data _null_;
nrec = &j;
set CBTI point=nrec;
%let ll1=addr1;
%let ll2=addr2;
stop;
run;
Here, you symply create two macrovariables ll1 and ll2 which resolve respectively to addr1, addr2, so the instruction :
filename x url "https://www.google.com/maps/dir/&ll1/&ll2/?force=lite";
will become
filename x url "https://www.google.com/maps/dir/addr1/addr2/?force=lite";
Is that really what you want ? If you want to use the values of dataset columns addr1 and addr2, use instead :
call symputx('ll1',addr1,'g');
Note that i used the 'g' parameter to ensure that the resulting macrovariable is global.
Indeed, the error you obtain in the log comes from the fact that you use macrovariables ll1 and ll2 outside the
macro that defined them.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
You can use
options mlogic mprint symbolgen;
to get additional debug info in the log for macros.
In your macro there is the following data step :
data _null_;
nrec = &j;
set CBTI point=nrec;
%let ll1=addr1;
%let ll2=addr2;
stop;
run;
Here, you symply create two macrovariables ll1 and ll2 which resolve respectively to addr1, addr2, so the instruction :
filename x url "https://www.google.com/maps/dir/&ll1/&ll2/?force=lite";
will become
filename x url "https://www.google.com/maps/dir/addr1/addr2/?force=lite";
Is that really what you want ? If you want to use the values of dataset columns addr1 and addr2, use instead :
call symputx('ll1',addr1,'g');
Note that i used the 'g' parameter to ensure that the resulting macrovariable is global.
Indeed, the error you obtain in the log comes from the fact that you use macrovariables ll1 and ll2 outside the
macro that defined them.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks very much for clarifying the global vs. local macro. That seemed to do the trick.