BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Bennettr99
Fluorite | Level 6

I am new to SAS and I using SAS 9.4. I am trying to convert data using data from a different data set. The first data set is called data and contains a variable for the first 3 digits of a zip code plus 14 other variables. The second data is called zip and contains the first three digits of a zipcodes along with the corresponding county and city names. For example:

 

Data                     Zip

zip_3 (variable name)          zip_3             County         State

   005                                    005                Holtsville      NY

   005                                    006                Adjuntas      PR

   006                                    006                Aquada        PR

 

I need to find some way to merge the data so that the county and state name are in the same observation as the matching zipcodes and if there are any other counties it could be, it is also in the observation. Example:

Data

zip_3             County         State      County         State

005                 Holtsville      NY

005                 Holtsville      NY

006                 Adjuntas      PR         Aquada        PR   

 

I've tried merging the data, but I have had no luck. I haven't used SAS much outside of merging data with identical variables, so I am at a loss. 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

There are two problems. First you are trying to do a many to many merge.  There can be more than one loan per ZIP3 and there are more than one "lookup" value per ZIP3.  The second problem is you want transpose you many rows of detailed zipcode into a single observation. If you solve the second one first then the first one disappears.

 

First convert your ZIP dataset to have one observation per ZIP3 value.  For example using PROC TRANSPOSE.

proc transpose data=zip out=zip_name prefix=name ;
  by zip_3 ;
  var name ;
run;
proc tranpose data=zip out=zip_code prefix=zipcode ;
  by zip_3;
  var zipcode ;
run;
data zip_single;
  merge zip_name zip_code ;
  by zip_3;
run;

Then you can merge this with your loan data and you will be doing a many to one merge.

data want ;
   merge loan_data (in=in1) zip_single ;
   by zip_3;
   if in1 ;
run;

 

View solution in original post

6 REPLIES 6
Reeza
Super User

Your approach of merging is correct, can you explain how it didn't work? 

Feel free to include your exact code and error messages. 

 

Otherwise, the approach I'd recommend would be either a merge (data step or SQL) and/or a format. I'm partial to formats because I find them reusable versus merges and keep a library of formats that I can commonly pull from when needed. 

 


@Bennettr99 wrote:

I am new to SAS and I using SAS 9.4. I am trying to convert data using data from a different data set. The first data set is called data and contains a variable for the first 3 digits of a zip code plus 14 other variables. The second data is called zip and contains the first three digits of a zipcodes along with the corresponding county and city names. For example:

 

Data                     Zip

zip_3 (variable name)          zip_3             County         State

   005                                    005                Holtsville      NY

   005                                    006                Adjuntas      PR

   006                                    006                Aquada        PR

 

I need to find some way to merge the data so that the county and state name are in the same observation as the matching zipcodes and if there are any other counties it could be, it is also in the observation. Example:

Data

zip_3             County         State      County         State

005                 Holtsville      NY

005                 Holtsville      NY

006                 Adjuntas      PR         Aquada        PR   

 

I've tried merging the data, but I have had no luck. I haven't used SAS much outside of merging data with identical variables, so I am at a loss. 


 

Bennettr99
Fluorite | Level 6

I just realized I made a mistake with the merging and I should have been more specific about my problem. This was the code I was using. 

 

DATA home.merge;
MERGE home.data home.zip;
BY zip_3;
RUN;

This is the result (with some variables dropped):

datasample.PNG

The loan_id is from home.data. I need to merge the data without duplicating the data from home.data like it does in this data. I need to find a way to have one ID correspond to multiple Counties or States if the 3 digits of the zip code correspond to multiple zip codes. I also need to remove any observation that does not include data from home.data.

 

So if the zip_3 in home.data is 025, all of the five digit zip codes and county names would be listed in the same observation.

Tom
Super User Tom
Super User

There are two problems. First you are trying to do a many to many merge.  There can be more than one loan per ZIP3 and there are more than one "lookup" value per ZIP3.  The second problem is you want transpose you many rows of detailed zipcode into a single observation. If you solve the second one first then the first one disappears.

 

First convert your ZIP dataset to have one observation per ZIP3 value.  For example using PROC TRANSPOSE.

proc transpose data=zip out=zip_name prefix=name ;
  by zip_3 ;
  var name ;
run;
proc tranpose data=zip out=zip_code prefix=zipcode ;
  by zip_3;
  var zipcode ;
run;
data zip_single;
  merge zip_name zip_code ;
  by zip_3;
run;

Then you can merge this with your loan data and you will be doing a many to one merge.

data want ;
   merge loan_data (in=in1) zip_single ;
   by zip_3;
   if in1 ;
run;

 

Bennettr99
Fluorite | Level 6

This works! Thanks for the help. 

ballardw
Super User

You may also want to be aware that in many parts of the US the first three digits of a Zip code  will not determine county.

 

For instance in my state we have 44 counties and only 7 different values of first three digits in the Zip, the county counts ranging from 1 to 11 per 3-digit code.

Reeza
Super User
And a small FYI, though not helpful if you only have the 3 digit zip, SAS has a bunch of functions that convert ZIP codes to state names and such. http://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=p1en5dzvubdabmn1czrqdcq9ozv3.htm&...

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 845 views
  • 3 likes
  • 4 in conversation