BookmarkSubscribeRSS Feed

 

When you call the zipcitydist() function and supply an zipcode that doesn't exist in sashelp.zipcode like below:

 

option notes;

data zip;
  zip = 12345;
run;

proc sql noprint;
  select zipcitydist(00000,99999) from zip;
%runquit;

Part of the log displayed is this note:

 

NOTE: ZIP code value 0 could not be found in SASHELP.ZIPCODE.

Not necessarily a bad note, unless your dataset is millions of rows, and you get a note for each one.

 

Can the default behavior of this note be changed so that only the first 10 or so examples are displayed and then we get a note saying further examples were omitted from the log?  Currently the only alternative is to use 'option nonote' which is overkill.

 

3 Comments
ballardw
Super User

Are you getting "random" bad zips or are all of these either '00000' or '99999'?

If you are getting a lot of '00000' because of missing data then perhaps a simple test that neither of the zips are equal to 00000 or 99999 before calling the function.

 

 

RobP
Quartz | Level 8

@ballardw Random bad zips from user input data.  The above code was just an example to illustrate the log message.

 

In hindsight, the example I gave was still overly complicated, I could have demonstrated it with a simple data step instead.

ballardw
Super User

I wonder if a modified sashelp.zipcode data set with missing values for X and y for all of the zips not assigned would address this? Or just add a bunch of missing coordinates errors instead?

 

I suspect that the instructions on updating the sashelp.zipcode data set could be modified to use a modified data set.

 

I know it is possible to create a validation format and only call the function when both values are valid but with the numbers of records you are using I'm not sure of the performance hit.

 

proc sql;
   create table zipcntlin as
   select distinct zip as start,'Valid' as label, 'Validzip' as fmtname
         ,'N' as type
   from sashelp.zipcode
   ;
quit;

proc format library=work cntlin=zipcntlin;
run;

Use would be something like:

 

if put(zip1,validfmt.)='Valid' and Put(zip2,validfmt.)='Valid' then distance=zipcitydist(zip1, zip2);

 

or an equivalent CASE statement in Proc SQL.