I have to upload various tables(xlsx files) and join them together via a query. I,e upload table A1, and join it with A2, A3.....etc.
Can I make it such that my query is dynamic enough to ignore join statements if my table does not have certain columns.
For example, uploading table A1, A2, A3, I might join them with primary Key "ID" and 'name'. However, if I upload table B1, B2, B3, it might not have Primary Key "name."
So I want to have a query that has all the join staements for name and ID, but be able to ignore the 'name' join if these columns aren't available.
Here is an example:
data a1; id=1; var1=123; output; id=2; var1=345; output; run; data a2; id=1; name="abc"; var2=12; output; id=3; name="def"; var2=90; output; run; data a3; id=3; name="tmp"; var3=20; output; id=4; name="rtt"; var3=34; output; run; %macro Join_ID (t1=,t2=,use_name=N,tresult=); proc sql; create table &tresult. as select * from &t1. t1 left join &t2. t2 on t1.id=t2.id %if &use_name.=Y %then %do; and t1.name=t2.name %end; ; quit; %mend Join_ID; %Join_ID (t1=a1,t2=a2,tresult=first); %Join_ID (t1=first,t2=a3,use_name=Y,tresult=second);
You can switch the call of Join_ID based on what is in sashelp.vcolumn.
However, once again, I strongly recommend you don't go down that route as you will have problems, not just logical ones, but inaccurate joins and such like. Know your data, program to your data, this is the way to write robust clean code which can repeatedly produce the same accurate result.
Whilst there might be, after some major code work, a way of doing something similar to what you want the real question is why you don't know your data. Programming is 95% data and documentation, with a small amount of coding at the end. Writing code based off data you don't know and trying to generalise things will just cause you far more work. Know the data, know the linking variables - this is the data model and fundamental to any programming endeavor.
Sure. Use dictionary.columns to check to see if the field(s) exist and then use macro code (%if) to bypass joins that wouldn't work (make the whole thing a macro to use things like %if).
something like:
PROC SQL noprint ;
SELECT *
FROM DICTIONARY.COLUMNS
WHERE LIBNAME='5XX' and memname='B1' and name='name' ;
QUIT;
%if &sqlobs>0 %then... (do your join by name)
I considered that, but I didn't see anything that needed to be saved...just the existence of the field itself is all that's needed.
Thank you. Im' still new to SAS and macros, and still need help understanding. Would appreciate some assistance if possible.
Here's an example full code that I have.
proc sql;
select *
from
work.table1 t1
left join work.table2 t2 on t1.ID = t2.ID and t1.name = t2.name;
quit;
In my tables, t2 does not have a 'name' column, so running the query like this, it'll error out. I want my query to be able to ignore the name joins so it'll run as if there wasn't a name join
I understand the first part of you code where it's filtering the dictionary columns to the 'Name' column. However, I dont understand the rest of it.
proc sql;
select * from dictionary.columns
where LIBNAME = 'WORK'
AND memname = 'TABLE2'
AND name = 'Name';
quit;
is &sqlobs a predefined macro? and is this already referencing the above script? I'm actually still very new to sas and macros, so I'm still unsure how to apply what you gave me to my code above. my extent to macros at the moment is %let = variable, and then referencing it throughout the code
Here is an example:
data a1; id=1; var1=123; output; id=2; var1=345; output; run; data a2; id=1; name="abc"; var2=12; output; id=3; name="def"; var2=90; output; run; data a3; id=3; name="tmp"; var3=20; output; id=4; name="rtt"; var3=34; output; run; %macro Join_ID (t1=,t2=,use_name=N,tresult=); proc sql; create table &tresult. as select * from &t1. t1 left join &t2. t2 on t1.id=t2.id %if &use_name.=Y %then %do; and t1.name=t2.name %end; ; quit; %mend Join_ID; %Join_ID (t1=a1,t2=a2,tresult=first); %Join_ID (t1=first,t2=a3,use_name=Y,tresult=second);
You can switch the call of Join_ID based on what is in sashelp.vcolumn.
However, once again, I strongly recommend you don't go down that route as you will have problems, not just logical ones, but inaccurate joins and such like. Know your data, program to your data, this is the way to write robust clean code which can repeatedly produce the same accurate result.
@mrdlau wrote:
So I want to have a query that has all the join staements for name and ID, but be able to ignore the 'name' join if these columns aren't available.
Would those tables still be joined with ID or ignored entirely?
If the key variables are the only variables they have in common the consider using NATURAL joins. SAS will automatically join based on variables with the same names.
proc sql ;
create table want as
select *
from b2
natural join b1
;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.