Dear all,
I would like to merge two datasets with different informat of a variable. I tried to change the variable "STATE" from a numeric variable to a character variable using PUT function prior to merging but I failed.
Could anyone recommend a better way to solve the problem?
Thank you.
MSPAK
Hi. I think that you want to CONCATENATE (not MERGE) the two data sets you posted.
STATE is a state name in data set SET1 and a state code in data set SET2.
So ... make two new variables for the new data set, STATE_CODE and STATE_NAME.
NAME has a name and state in data set SET1 but only a name in data set SET2.
So ... get rid of the state portion of NAME (each observation will have a STATE_CODE and STATE_NAME)
data set SET1 has no county numbers
So ... use the FIPS code to add a county number
Assuming that I'm correct, try this ...
* first ... remove the formats and informats from both data sets (useless);
proc datasets lib=z;
modify set1;
format _all_;
informat _all_;
modify set2;
format _all_;
informat _all_;
quit;
* second ... combine (CONCATENATE) the two data sets with SET statements;
data one_two;
length name $36;
format state_code z2. county z3. fips z5.;
set z.set1 (in=one rename=(state=state_name)) z.set2 (rename=(state=state_code));
if one then do;
state_code = input(substr(put(fips,z5.),1,2),2.);
county = input(substr(put(fips,z5.),3,3),3.);
end;
else do;
state_name = fipname(state_code);
name = scan(name,1,',');
end;
run;
The combined data set ONE_TWO has 15,714 observations. PROC CONTENTS shows ...
# Variable Type Len Format Label
7 FYEAR Num 8
6 INHERENT Num 8
3 county Num 8 Z3. 3-digit county fips number
4 fips Num 8 Z5. FIPS Code
1 name Char 36 County Name
2 state_code Num 8 Z2. State Census Code
5 state_name Char 20 State Name
Unable to understand your question. Better to show the program and the Log.
The INFORMAT attached to a variable has no impact on whether you can merge by it.
If the variables are different TYPE then you will not be able combine the datasets at all, let alone merge by that variable.
You cannot change the type of variable. You can make a new variable and rename it.
newvar=put(state,z2.);
rename newvar=state state=oldstate;
You could also merge directly using SQL.
select * from a,b
where a.state = put(b.state,Z2.)
;
Hi all,
These two files attached. I try to merge these two files together but the format of the variable "STATE" are different.
TQ.
With which SAS version and on which platform did you create those datasets? I can't open them with 9.2 on Win64.
Hi,
I am using SAS9.4 Windows64. The SAS files uploaded were in imported from (stata) dta files.
This is how I would solve the problem assuming that you want to get all of the inherent and fyear records from both set1 and set2.
/* get dataset of distinct states and counties */
proc sql;
create table state_county as
select distinct a.fips, a.county as county_num, b.state, b.name as county_name
from set2 as a join
set1 as b on a.fips = b.fips;
run;
/* concatenate the tables together to get inherent and fyear data from each set */
/* keep state and county information */
proc sql;
create table combined as
select a.*, b.inherent, b.fyear
from state_county as a join
set2 as b on a.fips = b.fips
union
select a.*, b.inherent, b.fyear
from state_county as a join
set1 as b on a.fips = b.fips
;
quit;
/* sort for fips and year */
proc sort data=combined;
by fips fyear;
run;
SET1:
STATE - Type=TEXT, $20
SET2:
STATE - Type=NUMERIC, Format=STATE.
In one data set you have a character variable for state and in the second you have a numeric variable that is mapped using a format.
@Kurt_Bremser you can use the option noftmerr to view the datasets.
@mspak something like the following should work - rename old variable state to state_code and create a new variable state that is character. Then merge the datasets. This won't work exactly as I suspect you have more than STATE in your merge condition but it's a starting point.
Hope this helps.
Data SET2_NEW;
set SET2(rename=state=state_code);
state=put(state_code, state.);
run;
*Appropriate sorts...;
data want;
merge set1 set2;
by state;
run;
Hi. I think that you want to CONCATENATE (not MERGE) the two data sets you posted.
STATE is a state name in data set SET1 and a state code in data set SET2.
So ... make two new variables for the new data set, STATE_CODE and STATE_NAME.
NAME has a name and state in data set SET1 but only a name in data set SET2.
So ... get rid of the state portion of NAME (each observation will have a STATE_CODE and STATE_NAME)
data set SET1 has no county numbers
So ... use the FIPS code to add a county number
Assuming that I'm correct, try this ...
* first ... remove the formats and informats from both data sets (useless);
proc datasets lib=z;
modify set1;
format _all_;
informat _all_;
modify set2;
format _all_;
informat _all_;
quit;
* second ... combine (CONCATENATE) the two data sets with SET statements;
data one_two;
length name $36;
format state_code z2. county z3. fips z5.;
set z.set1 (in=one rename=(state=state_name)) z.set2 (rename=(state=state_code));
if one then do;
state_code = input(substr(put(fips,z5.),1,2),2.);
county = input(substr(put(fips,z5.),3,3),3.);
end;
else do;
state_name = fipname(state_code);
name = scan(name,1,',');
end;
run;
The combined data set ONE_TWO has 15,714 observations. PROC CONTENTS shows ...
# Variable Type Len Format Label
7 FYEAR Num 8
6 INHERENT Num 8
3 county Num 8 Z3. 3-digit county fips number
4 fips Num 8 Z5. FIPS Code
1 name Char 36 County Name
2 state_code Num 8 Z2. State Census Code
5 state_name Char 20 State Name
Dear all,
Thank you very much for all the solutions and suggestions provided. I truly appreciate your contributions. I finally manage to combine 2 datasets.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.