Hi,
So I have multiple tables (about 30) and about 13 of them have a variable activeIndicator while the other 17 have active_indicator. For tables 1 to 13, it's activeIndicator and for 14-30 it's activeIndicator.
I need this variable, because I also want to look at it when the active indicator variable is equal to 1.
How do I merge on variables using proc sql and choose what variables to merge on depending on a condition?
The case when statement won't work because I think it looks at the inputTable_&i. and searches for BOTH active_Indicator and activeIndicator. I prefer not changing the loop too much, but if it's necessary. Thank you!
I prefer a loop, so something like
%macro tableLoop(start,end);
%do i = &start. %to &end.;
proc sql;
create table output_&i. as select
case when &i. <= 13 then activeIndicator else active_Indicator end as variable_want
from inputTable_&i.
where variable_want = 1
;quit;
%tableLoop(1,30);
@yellowyellowred wrote:
Hi,
So I have multiple tables (about 30) and about 13 of them have a variable activeIndicator while the other 17 have active_indicator. For tables 1 to 13, it's activeIndicator and for 14-30 it's activeIndicator.
Was it intentional to have two such similar named but different variables? I would think it might be much easier to pick one name for the variable, rename it when needed and then none of this loop junk is needed. In fact, since your description above doesn't even have different spellings between tables 1 to 13 and 14-30 renaming seems like a much better approach.
Proc datasets will let you rename variables in data sets in place. Assuming you actually have some named active_indicator as in your code and you would prefer them to be activeIndicator
proc datasets library=work;/* or whatever the actual name of your library may be*/ modify table14; rename active_Indicator=activeIndicator; modify table15; rename active_Indicator=activeIndicator; modify table16; rename active_Indicator=activeIndicator; /* continue until all tables that need modification are included*/ Quit; /*Datasets requires Quit to end */
You don't show any join code to know what you might want but with all the sets having the same variable name you then join, or merge on that variable.
Yes, that was just an outdated name (someone else made those tables). The intention was to capture this active indicator and then filter on whenever the entry is "active". The tables are on the actual data server, so I can't rename them, so I was hoping there was another approach
The names you are showing appear in your WORK library. That means they are yours. You don't need to carry on stupidity others generated.
OR go to who ever manages the data and discuss a quality improvement activity of making the variables the same in all the data sets.
Show some of what the next step looks like. As I mentioned, you don't show anything "merging". It sounds like, as a minimum, you need to have 4 separate pieces of code, both sets with activeIndicator, both sets with active_indicator, one where the first set has activeIndicator and the second with active_Indicator and the 4th with the other order. Then need to examine the two sets for which variable to select the code to actually run. If more than two sets are involved this get much more complicated quickly. Really FIX the problem instead of kludging around it is a much better approach in the long run.
Those source tables are actually tables that I take from a shared server (I just named it inputTable_ for convenience here). I agree that it should be renamed, but alas.
The goal of the code I want to write is to:
Have a final output containing all "active" people. Each person has a unique "ID" variable which is present in each table. With that, they have an indicator saying whether or not they are "active".
Each source table is named inputTable_n (n is an integer from 1 to 30). For n <= 13, the input table inputTable_n has an ID column and an activeIndicator column (=0 for inactive, =1 for active).
For n>= 14, the input table inputTable_n has an ID column and an active_Indicator column (=0 for inactive, =1 for active).
This is why I looped through the tables, from 1 to 30 - the revised code hopefully makes it clearer on the goal of the code:
%let LIB1 = ACR;
%macro tableLoop(start,end);
%do i = &start. %to &end.;
proc sql;
create table output_&i. as select
ID
,case when &i. <= 13 then activeIndicator else active_Indicator end as variable_want
from &LIB1..inputTable_&i.
where variable_want = 1
;quit;
%mend;
%tableLoop(1,30);
I don't have any other merge step because this is the part I'm having trouble with - I just want the proc sql to say: OK, so we are looking at n <= 13, let's pull the activeIndicator variable - and if it's n >= 14, we look for "active_Indicator" instead. I attempt this in the "case when" statement and it's also why I have the "variable_want = 1" line.
Apologies for mentioning "merging" in the title, I refer to it as pulling the variable from the inputTable.
I think I can run a loop instead to rename, something like...
%macro renameVar(start,end);
%do i = &start. %to *end.;
if &i.<= 13 then do;
data WRK_inputTable_&i.;
set &LIB1..inputTable_&i.;
rename activeIndicator=active_Indicator;
run;
else if &i >= 14 then do;
data WRK_inputTable_&i.;
set &LIB1..inputTable_&i.;
run;
end;
end;
%mend;
%renameVar(1,30);
Would code along the line of below do what you're after?
data work.output; length _source_table $41; set <source libref>.inputTable_: indsname=_source_table; source_table=_source_table; active_indicator=coalesce(active_indicator,activeIndicator); drop activeIndicator; run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.