Converting Data Using Another Data Set

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

Converting Data Using Another Data Set

I am new to SAS and I using SAS 9.4. I am trying to convert data using data from a different data set. The first data set is called data and contains a variable for the first 3 digits of a zip code plus 14 other variables. The second data is called zip and contains the first three digits of a zipcodes along with the corresponding county and city names. For example:

 

Data                     Zip

zip_3 (variable name)          zip_3             County         State

   005                                    005                Holtsville      NY

   005                                    006                Adjuntas      PR

   006                                    006                Aquada        PR

 

I need to find some way to merge the data so that the county and state name are in the same observation as the matching zipcodes and if there are any other counties it could be, it is also in the observation. Example:

Data

zip_3             County         State      County         State

005                 Holtsville      NY

005                 Holtsville      NY

006                 Adjuntas      PR         Aquada        PR   

 

I've tried merging the data, but I have had no luck. I haven't used SAS much outside of merging data with identical variables, so I am at a loss. 


Accepted Solutions
Solution
a month ago
Super User
Super User
Posts: 7,399

Re: Converting Data Using Another Data Set

[ Edited ]
Posted in reply to Bennettr99

There are two problems. First you are trying to do a many to many merge.  There can be more than one loan per ZIP3 and there are more than one "lookup" value per ZIP3.  The second problem is you want transpose you many rows of detailed zipcode into a single observation. If you solve the second one first then the first one disappears.

 

First convert your ZIP dataset to have one observation per ZIP3 value.  For example using PROC TRANSPOSE.

proc transpose data=zip out=zip_name prefix=name ;
  by zip_3 ;
  var name ;
run;
proc tranpose data=zip out=zip_code prefix=zipcode ;
  by zip_3;
  var zipcode ;
run;
data zip_single;
  merge zip_name zip_code ;
  by zip_3;
run;

Then you can merge this with your loan data and you will be doing a many to one merge.

data want ;
   merge loan_data (in=in1) zip_single ;
   by zip_3;
   if in1 ;
run;

 

View solution in original post


All Replies
Super User
Posts: 21,546

Re: Converting Data Using Another Data Set

Posted in reply to Bennettr99

Your approach of merging is correct, can you explain how it didn't work? 

Feel free to include your exact code and error messages. 

 

Otherwise, the approach I'd recommend would be either a merge (data step or SQL) and/or a format. I'm partial to formats because I find them reusable versus merges and keep a library of formats that I can commonly pull from when needed. 

 


Bennettr99 wrote:

I am new to SAS and I using SAS 9.4. I am trying to convert data using data from a different data set. The first data set is called data and contains a variable for the first 3 digits of a zip code plus 14 other variables. The second data is called zip and contains the first three digits of a zipcodes along with the corresponding county and city names. For example:

 

Data                     Zip

zip_3 (variable name)          zip_3             County         State

   005                                    005                Holtsville      NY

   005                                    006                Adjuntas      PR

   006                                    006                Aquada        PR

 

I need to find some way to merge the data so that the county and state name are in the same observation as the matching zipcodes and if there are any other counties it could be, it is also in the observation. Example:

Data

zip_3             County         State      County         State

005                 Holtsville      NY

005                 Holtsville      NY

006                 Adjuntas      PR         Aquada        PR   

 

I've tried merging the data, but I have had no luck. I haven't used SAS much outside of merging data with identical variables, so I am at a loss. 


 

Occasional Contributor
Posts: 6

Re: Converting Data Using Another Data Set

I just realized I made a mistake with the merging and I should have been more specific about my problem. This was the code I was using. 

 

DATA home.merge;
MERGE home.data home.zip;
BY zip_3;
RUN;

This is the result (with some variables dropped):

The loan_id is from home.data. I need to merge the data without duplicating the data from home.data like it does in this data. I need to find a way to have one ID correspond to multiple Counties or States if the 3 digits of the zip code correspond to multiple zip codes. I also need to remove any observation that does not include data from home.data.

 

So if the zip_3 in home.data is 025, all of the five digit zip codes and county names would be listed in the same observation.

Solution
a month ago
Super User
Super User
Posts: 7,399

Re: Converting Data Using Another Data Set

[ Edited ]
Posted in reply to Bennettr99

There are two problems. First you are trying to do a many to many merge.  There can be more than one loan per ZIP3 and there are more than one "lookup" value per ZIP3.  The second problem is you want transpose you many rows of detailed zipcode into a single observation. If you solve the second one first then the first one disappears.

 

First convert your ZIP dataset to have one observation per ZIP3 value.  For example using PROC TRANSPOSE.

proc transpose data=zip out=zip_name prefix=name ;
  by zip_3 ;
  var name ;
run;
proc tranpose data=zip out=zip_code prefix=zipcode ;
  by zip_3;
  var zipcode ;
run;
data zip_single;
  merge zip_name zip_code ;
  by zip_3;
run;

Then you can merge this with your loan data and you will be doing a many to one merge.

data want ;
   merge loan_data (in=in1) zip_single ;
   by zip_3;
   if in1 ;
run;

 

Occasional Contributor
Posts: 6

Re: Converting Data Using Another Data Set

This works! Thanks for the help. 

Super User
Posts: 12,148

Re: Converting Data Using Another Data Set

Posted in reply to Bennettr99

You may also want to be aware that in many parts of the US the first three digits of a Zip code  will not determine county.

 

For instance in my state we have 44 counties and only 7 different values of first three digits in the Zip, the county counts ranging from 1 to 11 per 3-digit code.

Super User
Posts: 21,546

Re: Converting Data Using Another Data Set

Posted in reply to Bennettr99
And a small FYI, though not helpful if you only have the 3 digit zip, SAS has a bunch of functions that convert ZIP codes to state names and such. http://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=p1en5dzvubdabmn1czrqdcq9ozv3.htm&...
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 132 views
  • 3 likes
  • 4 in conversation