Hi, I have two datasets saved in sasuser library. Both are mostly different, however some of variables have the same names but they mean something different. Please, let me know how to merge them with automatic new variables names in one of datasets (for repeated variables names).
My code at the moment is:
data a(rename=(E3=E33)); /*(here is the try for renaming one of known repeated names of variable)
set sasuser.datA(rename=(E3=E3num E3num=E33 F1=F111 UW1=UW11));
E3 = put(E3num, 7.);
run;
data datBA;
merge sasuser.datB datA;
by id;
run;
if age_group = '55 - 59 y' then delete;
Moreover, the log says as well that an error occur when i try to delete some observations "ERROR 180-322: Statement is not valid or it is used out of proper order".
Can you see any issues in my code?
Thank you in advance for any response.
There is no AUTOMATIC renaming. You could write some code to compare the variable names from the two datasets and then use that to generate code. But for two datasets it is not worth the effort.
You second error is probably because you tried to run a IF statement outside of any DATA step.
For what your describe the easiest way is to use the RENAME= dataset option on INCOMING datasets so the name conflict is fixed BEFORE the data step tries to combine the data.
So for example if both datasets had variables E3 and F1 you might do something like this:
data want;
merge
sasuser.datA (rename=( e3 = A_e3 f1=A_f1 ))
sasuser.datB (rename=( e3 = B_e3 f1=B_f1 ))
;
by id;
run;
I rarely use the RENAME= dataset option on an outgoing dataset from a DATA step. For that you can just use the RENAME statement instead. The RENAME= dataset option is more useful on the outgoing dataset from a PROC step where you cannot use the RENAME statement.
For the renaming part the code could look as below:
data datBA;
merge
sasuser.datB (rename=(e3=datb_e3 e4=datb_e4))
datA
;
by id;
run;
You can now write code like:
e3=put(datab_e3, 7.)
...but make sure that you don't have a variable e3 coming from source table datA.
If you just want to change the variable e3 then you could also use code as below:
For the error you observe:
Hard to say without the full SAS log.
There is no AUTOMATIC renaming. You could write some code to compare the variable names from the two datasets and then use that to generate code. But for two datasets it is not worth the effort.
You second error is probably because you tried to run a IF statement outside of any DATA step.
For what your describe the easiest way is to use the RENAME= dataset option on INCOMING datasets so the name conflict is fixed BEFORE the data step tries to combine the data.
So for example if both datasets had variables E3 and F1 you might do something like this:
data want;
merge
sasuser.datA (rename=( e3 = A_e3 f1=A_f1 ))
sasuser.datB (rename=( e3 = B_e3 f1=B_f1 ))
;
by id;
run;
I rarely use the RENAME= dataset option on an outgoing dataset from a DATA step. For that you can just use the RENAME statement instead. The RENAME= dataset option is more useful on the outgoing dataset from a PROC step where you cannot use the RENAME statement.
@Skillside wrote:
Thank You guys for the answers. Well, isn't it a limitation of SAS then? Because overwriting the databases because of same variables names is kinda not comfortable? Once again, thank you.
No database will let you have 2 variables/columns with the same name in the same table.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.