Good morning,
In a simple way, how could I correct this mistake?
Thanks!
-----------------------------------------------------------------------------------------------
data indwts_and_empratios0;
merge inds_empout rims_both3(in=inb);
by pubIndCode_id;
if inb;
1270 data indwts_and_empratios0;
1271 merge inds_empout rims_both3(in=inb);
ERROR: Variable pubIndCode_id has been defined as both character and numeric.1272 by pubIndCode_id;
1273 if inb;
It means that your "by" variable is defined in the incoming datasets with different types (character and numeric). You need to properly convert it to the correct type in one of the datasets for the merge to work.
You need to decide if pubIndCode_id should be a number or a character string. Since it ends with _ID it probably should be a character variable. You don't need to perform arithmetic on ID values.
Then figure out which of the two datasets has the variable defined as a number and fix that. Preferable by changing the process that created the dataset.
So this makes sense, but is there a way to manipulate the code I pasted and force those columns to be either character or numeric?
Thanks a lot
You cannot do it easily if it is the BY variable. You will need to make an updated version of one of the datasets and possible re-sort it since numbers and character strings could sort in different orders. Although it is normally better to use character for ID varaibles it is probably easier to convert the character version to numeric since then leading zeros or leading spaces will not cause confusion.
data fix ;
set oldA (rename=(old_var=old_var_char));
old_var = input(old_var_char,32.);
run;
proc sort data=fix; by old_var; run;
data want;
merge fix oldB;
by old_var;
run;
If it is some other variable then here is a work around that does not require that you know which dataset has the numeric and which has the character variables.
data want;
merge A (in=ina rename=(old_var=old_var_A))
B(in=inb rename=(old_var=old_var_b));
by some_other_variable;
old_var = input(coalescec(cats(old_var_b),cats(old_var_a)),32.);
run;
SQL allows you to change types in the "Merge/Join" step, but a data step does not.
I don't know which is character or which is numeric but you can use PUT to convert the numeric variable to a character though you'll want to test that.
proc sql;
create table want as
select *
from rims_both as rb
left join inds_empout as ie
on put(rb.pubIndCode_ID, $8. -l) = ie.pubIndCode_ID;
quit;
@Afor910327 wrote:
So this makes sense, but is there a way to manipulate the code I pasted and force those columns to be either character or numeric?
Thanks a lot
@Afor910327 wrote:
So this makes sense, but is there a way to manipulate the code I pasted and force those columns to be either character or numeric?
Thanks a lot
It is not uncommon for variables used in an identification role, like bank accounts, to have leading zeroes. You may have to spend some time deciding if that is the case for your data as there may be more complications getting the correct number of leading zeroes than you expect depending on your actual values. If the lengths of your character values vary and have leading zeroes you may find it easier to re-read the external data source with the matching length.
@Afor910327 wrote:
So this makes sense, but is there a way to manipulate the code I pasted and force those columns to be either character or numeric?
Thanks a lot
Ideally, you do this when the data enters the SAS realm. So you should look at that part of the process.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.