BookmarkSubscribeRSS Feed
Badjuju
Fluorite | Level 6

I know how to do this fairly easy in Python pandas using lookup() but want to see how to do it in SAS. The problem is straight forward. I have this master list show the zone associated with any zip-code pair. For instance, if the origin zip code is 95826 and the destination zip code is 73864, then the corresponding zone is 6. This master list has 3000 rows and columns. I would appreciate keeping the code macro free. I just need the fundamental base SAS code or ideas and I can take it from there. Many people have suggested transposing then doing a join/merge. I don't think that will work here. Currently my office uses Excel VBA to do this then feeds the results back into SAS. Considering what we pay for SAS, I'm certain there's a way to do this in SAS. In python, the solution involves re-indexing the "Master List" data set so the zip codes along the vertical are the new row indices instead of the default 1,2,3,4,....

Thanks in advance.

 

For our discussion, let's say I got the data into SAS and it looks like the following

Master List of zones:

 

1       FILL      73864    37463    37465    72958     20276

2    95826              6            1            5            8             2

3    98574              2            9            1            9             1

4    23847              1            3            8            4             7            

5    95867              3            2            5            8             1

6    39487              .             .             .             .              .

7    39483              .             .             .             .              .

 

 

Desired Output:

 

Package_Weight    Package_Height    Package_Length    Entry_zip    Destination_zip    zone

                          2                          10                         32          95867                   20276          1

                          6                            3                          12         23847                   37465          8

11 REPLIES 11
Astounding
PROC Star

In SAS, you would transform the data set into from-to pairs.  For example, the first few observations might look like this:

 

From   To     Value

 

95826  73864     6

95826  37463     1

95826  37465     5

 

That data set would be useful within a DATA step as a hash table, where the FROM TO combination is the index to the hash table, used to retrieve the VALUE.

 

If you need to retrieve the values in a procedure rather than a DATA step, there are other possibilities.  But those possibilities still begin with transforming the data.

Badjuju
Fluorite | Level 6
Astounding, how do I get it in the form you suggest? Transpose will not do that.
Astounding
PROC Star

Can you get the data into a SAS data set in any form at all?  Clearly what you posted is prior to that step.  So get the data into SAS, and describe the structure of what it looks like.

Badjuju
Fluorite | Level 6
I've posted what the data will look like in SAS
Ksharp
Super User

What kind of style did your data store? A sas table or a Excel ? 12345 is not a valid sas name.

Here are some ways:

1) best way is using SAS/IML code.

2) Hash Table

3)Proc format

4) Array Skill .

Badjuju
Fluorite | Level 6

The master zip code list is an Excel doc. 12345 is not the sas file name; I'm saying the row observations are named sequentially as is normally the case 1 thru (# of observations).

Ksharp
Super User

Then you need offer this excel file.

You need import this excel file into sas before doing look-up operator .

Badjuju
Fluorite | Level 6
I've updated the question to what the data set would look like in SAS.
Ksharp
Super User

OK. Here is an example.

The best way is using IML, if you have SAS/IML .

 

option validvarname=any;
data have;
input FILL      '73864'n    '37463'n    '37465'n    '72958'n     '20276'n;
cards;
   95826              6            1            5            8             2
    98574              2            9            1            9             1
    23847              1            3            8            4             7            
    95867              3            2            5            8             1
;
run;

proc sort data=have out=temp;
by fill;
run;
proc transpose data=temp out=lookup(where=(_NAME_ ne 'FILL'));
by fill;
var _all_;
run;
data lookup;
 set lookup;
 name=input(_name_,best32.);
run;
data want;
 if _n_=1 then do;
  if 0 then set lookup;
  declare hash h(dataset:'lookup');
  h.definekey('fill','name');
  h.definedata('col1');
  h.definedone();
 end;

Entry_zip=95867;    Destination_zip=20276; 
if h.find(key:Entry_zip,key:Destination_zip)=0 then zone=col1;
output;


Entry_zip=23847  ;    Destination_zip=37465  ; 
if h.find(key:Entry_zip,key:Destination_zip)=0 then zone=col1;
output;


run;
Badjuju
Fluorite | Level 6
Thanks for this Ksharp. I need to read-up on this hash stuff and digest this code.
Tom
Super User Tom
Super User

Note that postal codes are NOT numbers. For example the code 04032 for Freeport Maine has a leading zero.

 

How to start depends on what format your original list is in.  If it is a CSV or other type of text file then just read it in originally as from/to pairs.  For example if the file looks like this:

FILL 73864 37463 37465 72958 20276
95826 6 1 5 8 2
98574 2 9 1 9 1
23847 1 3 8 4 7
95867 3 2 5 8 1

you could read the first line and remember the codes for the columns.

data zones ;
  infile zone truncover ;
  array cols (3000) $7 _temporary_;
  length from to $7 zone 8;
  if _n_=1 then do;
    input to @;
    do col=1 by 1 until(to=' ') ;
      input to @ ;
      cols(col)=to;
    end;
    ncols=col-1;
    retain ncols;
    input ;
  end;
  input from @;
  do col=1 to ncols ;
    to=cols(col);
    input zone @;
    output;
  end;
  keep from to zone;
run;

Now you can either use the data in a hash and join or merge or create a format or informat.

You could easily transform the FROM/TO/ZONE variables into START/LABEL values to use with PROC FORMAT's CNTLIN= option.

data infmt;
  set zones ;
  retain fmtname 'ZONE' type 'I' ;
  length start $14 ;
  start = cat(from,to);
  rename zone=label;
run;
proc format cntlin=infmt;
run;

Now if we create your example data.

data have;
  input Package_Weight Package_Height Package_Length Entry_zip :$7.  Destination_zip :$7. zone_expect ;
cards;
 2 10 32 95867 20276 1
 6 3 12 23847 37465 8
;

It is easy enough to use the INFORMAT to convert the from/to pair into a zone.

data want ;
 set have ;
 zone = input(cat(entry_zip,destination_zip),zone.);
run;
       Package_    Package_    Package_    Entry_    Destination_     zone_
Obs     Weight      Height      Length      zip          zip         expect    zone

 1         2          10          32       95867        20276           1        1
 2         6           3          12       23847        37465           8        8

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1288 views
  • 8 likes
  • 4 in conversation