BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mspak
Quartz | Level 8

Dear all,

 

I would like to merge two datasets with different informat of a variable. I tried to change the variable "STATE" from a numeric variable to a character variable using PUT function prior to merging but I failed. 

 

Could anyone recommend a better way to solve the problem?

 

Thank you.

 

MSPAK

 

1 ACCEPTED SOLUTION

Accepted Solutions
MikeZdeb
Rhodochrosite | Level 12

Hi.  I think that you want to CONCATENATE (not MERGE) the two data sets you posted.

 

STATE is a state name in data set SET1 and a state code in data set SET2.  

So ... make two new variables for the new data set, STATE_CODE and STATE_NAME.

 

NAME has a name and state in data set SET1 but only a name in data set SET2.

So ... get rid of the state portion of NAME (each observation will have a STATE_CODE and STATE_NAME)

 

data set SET1 has no county numbers

So ... use the FIPS code to add a county number

 

 Assuming that I'm correct, try this ...

 

* first ... remove the formats and informats from both data sets (useless);
proc datasets lib=z;
modify set1;
format _all_;
informat _all_;
modify set2;
format _all_;
informat _all_;
quit;

 

* second ... combine (CONCATENATE) the two data sets with SET statements;

data one_two;
length name $36;
format state_code z2. county z3. fips z5.;
set z.set1 (in=one rename=(state=state_name)) z.set2 (rename=(state=state_code));
if one then do;
   state_code = input(substr(put(fips,z5.),1,2),2.);
   county = input(substr(put(fips,z5.),3,3),3.);
end;
else do;
   state_name = fipname(state_code);
   name = scan(name,1,',');
end;
run;

 

The combined data set ONE_TWO has 15,714 observations.  PROC CONTENTS shows ...

 

# Variable   Type Len Format Label

7 FYEAR      Num   8
6 INHERENT   Num   8
3 county     Num   8   Z3.  3-digit county fips number
4 fips       Num   8   Z5.  FIPS Code
1 name       Char 36        County Name
2 state_code Num   8   Z2.  State Census Code
5 state_name Char 20        State Name

 

View solution in original post

9 REPLIES 9
KachiM
Rhodochrosite | Level 12

Unable to understand your question. Better to show the program and the Log. 

Tom
Super User Tom
Super User

The INFORMAT attached to a variable has no impact on whether you can merge by it.

If the variables are different TYPE then you will not be able combine the datasets at all, let alone merge by that variable.

You cannot change the type of variable. You can make a new variable and rename it.

newvar=put(state,z2.);
rename newvar=state state=oldstate;

You could also merge directly using SQL.

select * from a,b 
where a.state = put(b.state,Z2.)
;
mspak
Quartz | Level 8

Hi all,

 

These two files attached. I try to merge these two files together but the format of the variable "STATE" are different.

 

TQ.

mspak
Quartz | Level 8

Hi,

 

I am using SAS9.4 Windows64. The SAS files uploaded were in imported from (stata) dta files.

ToddB
SAS Employee

This is how I would solve the problem assuming that you want to get all of the inherent and fyear records from both set1 and set2.

 

/* get dataset of distinct states and counties */
proc sql;
 create table state_county as
 select distinct a.fips, a.county as county_num, b.state, b.name as county_name
 from set2 as a join
	set1 as b on a.fips = b.fips;
run;

/* concatenate the tables together to get inherent and fyear data from each set */
/* keep state and county information                                            */
proc sql;
 create table combined as
 select a.*, b.inherent, b.fyear
 from state_county as a join
 	set2 as b on a.fips = b.fips
union
select a.*, b.inherent, b.fyear
 from state_county as a join
 	set1 as b on a.fips = b.fips
	;
quit;

/* sort for fips and year */
proc sort data=combined;
 by fips fyear;
run;
Reeza
Super User

SET1:

STATE - Type=TEXT, $20

SET2:

STATE - Type=NUMERIC, Format=STATE.

 

In one data set you have a character variable for state and in the second you have a numeric variable that is mapped using a format.

@Kurt_Bremser you can use the option noftmerr to view the datasets.

 

@mspak something like the following should work - rename old variable state to state_code and create a new variable state that is character. Then merge the datasets. This won't work exactly as I suspect you have more than STATE in your merge condition but it's a starting point.

 

Hope this helps.

 

Data SET2_NEW;
set SET2(rename=state=state_code);
state=put(state_code, state.);
run;

*Appropriate sorts...;

data want;
merge set1 set2;
by state;
run;

 

MikeZdeb
Rhodochrosite | Level 12

Hi.  I think that you want to CONCATENATE (not MERGE) the two data sets you posted.

 

STATE is a state name in data set SET1 and a state code in data set SET2.  

So ... make two new variables for the new data set, STATE_CODE and STATE_NAME.

 

NAME has a name and state in data set SET1 but only a name in data set SET2.

So ... get rid of the state portion of NAME (each observation will have a STATE_CODE and STATE_NAME)

 

data set SET1 has no county numbers

So ... use the FIPS code to add a county number

 

 Assuming that I'm correct, try this ...

 

* first ... remove the formats and informats from both data sets (useless);
proc datasets lib=z;
modify set1;
format _all_;
informat _all_;
modify set2;
format _all_;
informat _all_;
quit;

 

* second ... combine (CONCATENATE) the two data sets with SET statements;

data one_two;
length name $36;
format state_code z2. county z3. fips z5.;
set z.set1 (in=one rename=(state=state_name)) z.set2 (rename=(state=state_code));
if one then do;
   state_code = input(substr(put(fips,z5.),1,2),2.);
   county = input(substr(put(fips,z5.),3,3),3.);
end;
else do;
   state_name = fipname(state_code);
   name = scan(name,1,',');
end;
run;

 

The combined data set ONE_TWO has 15,714 observations.  PROC CONTENTS shows ...

 

# Variable   Type Len Format Label

7 FYEAR      Num   8
6 INHERENT   Num   8
3 county     Num   8   Z3.  3-digit county fips number
4 fips       Num   8   Z5.  FIPS Code
1 name       Char 36        County Name
2 state_code Num   8   Z2.  State Census Code
5 state_name Char 20        State Name

 

mspak
Quartz | Level 8

Dear all,

 

Thank you very much for all the solutions and suggestions provided. I truly appreciate your contributions. I finally manage to combine 2 datasets.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 3027 views
  • 2 likes
  • 7 in conversation