I'm executing the following code in most of the programs.. For the programs which don't have ENTITY_ID from the querying table, I want to create the variable ENTITY_ID (character) and assign the missing value instead of receiving the error 'ERROR: The following columns were not found in the contributing tables: ENTITY_ID. '
/*extract unique entity_id*/
proc sql;
create table dis_entity_id as select distinct ENTITY_ID format=$ENTITY. from &library..&Input_table.;
quit;
You might want to just use normal SAS code instead of SQL.
data dis_entity_id ;
set &library..&input_table ;
format ENTITY_ID $ENTITY. ;
keep entity_id;
run;
proc sort data=dis_entity_id nodupkey;
by entity;
run;
/*extract unique entity_id*/
proc sql;
create table dis_entity_id as
select *,' ' as ENTITY_ID
from &library..&Input_table.;
quit
So, this is how you plug your sql in that macro
%if %sysfunc(varnum(&dsid, &var)) > 0 %then %do;
/*extract unique entity_id*/
proc sql;
create table dis_entity_id as
select distinct ENTITY_ID
from &library..&Input_table.;
quit;
%end;
%else %do;
proc sql;
create table dis_entity_id as
select *,' ' ENTITY_ID
from &library..&Input_table.;
quit;
%end;
Yes, please read the comments, you will understand.
Basically, first, these are steps
1. Check if the variable already exists
2. If exists, run a plain extract query
3. If not, assign a new variable with that name with "missing" value
Now, using the above 3, think how can we translate that into SAS syntax. Gradually, you will build the code. Try
You might want to just use normal SAS code instead of SQL.
data dis_entity_id ;
set &library..&input_table ;
format ENTITY_ID $ENTITY. ;
keep entity_id;
run;
proc sort data=dis_entity_id nodupkey;
by entity;
run;
Proc sort ending with error if there is no value for ENTITY_ID. Yes, it is right, but I'd like tweak the sort code to work only if there is any Observation from the dataset 'dis_entity_id'. Your proposed Data step works well for both the cases.
33 proc sort data=dis_entity_id nodupkey;
34 by entity;
ERROR: Variable ENTITY not found.
35 run;
Make sure to use the right variable name in the BY statement.
You don't mind the uninitialized note thrown by:
data dis_entity_id ;
set &library..&input_table ;
format ENTITY_ID $ENTITY. ;
keep entity_id;
run;
?
Generally I tend to consider that an error message. (Actually, I tend to make it an error message with VARINITCHK=error).
@Quentin wrote:
You don't mind the uninitialized note thrown by:
data dis_entity_id ; set &library..&input_table ; format ENTITY_ID $ENTITY. ; keep entity_id; run;
?
Generally I tend to consider that an error message. (Actually, I tend to make it an error message with VARINITCHK=error).
Nope. It is useful information. It lets us know that ENTITY_ID did not exist in the source data.
One way would be to make a view which creates the variable. Something like below should work, if you don't mind side effect of changing the order of variables:
data vInputTable/view=vInputTable ;
length Entity_ID $8 ;
call missing(Entity_ID) ;
set &library..&Input_table ;
run ;
Then read that view in your PROC SQL step.
It's often easier to fix data structures rather than code around them.
You can check if the column exist with SQL, e.g.
proc sql noprint;
%let name=' ' as ENTITY_ID;
select name into :name from
dictionary.columns
where libname="%upcase(&library)" and memname="%upcase(&Input_table)" and name='ENTITY_ID';
create table dis_entity_id as select distinct &name format=$ENTITY. from &library..&Input_table.;
quit;
It's a bit fast and dirty, but I think it should work. The idea is that if the column is not found, the macro variable NAME will not be overwritten.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.