BookmarkSubscribeRSS Feed
trevand
Obsidian | Level 7

Is it possible to ignore an error if I select a variable in proc sql that that does not exists? I'm running a loop over many years and in some years the variable exist in some it doesn't.

6 REPLIES 6
ballardw
Super User

Could you show us the code you are using to "loop over years"?

 

It may be possible to insert a check before you use it.

Are there actually multiple variables involved?

Tom
Super User Tom
Super User

My first reaction would be to switch to a data step.

 

But please provide details on what you are doing to get better advice.

Patrick
Opal | Level 21

If SQL throws an error then you can’t ignore it. You could implement a check prior to the SQL and then only use the column in the SQL if it exists. 

You could also concatenate all the source tables (could even be a data step view) and then process all the data in a single SQL. If you concatenate the source tables with a data step then the column will exist and be missing for the source tables where it didn’t exists.

The only challenge with creating such a intermediary single source table: If in the source tables same named variables got different attributes (lengths or type) the you will still need to implement some further pre-processing to deal with it in order to avoid string truncation or even errors due to type mismatch. 

trevand
Obsidian | Level 7

@Patrick I did exactly what you suggested. I appended all annual files and then saved them as a single year file since at the end I need single year files. How would you  implement a check prior to the SQL and then only use the column in the SQL if it exists?

 

@ballardw you also suggested to do a check. Maybe you could also show an example of a check? There are indeed different number of variables in some of the years. I have different number of id variables that appear based on number of duplicates in each year. And the number of duplicates is not the same across the years.

 

@Tom I am doing a proc sql distinct group by. I'm not aware of a better alternative using data step.

ballardw
Super User

@trevand wrote:

@Patrick I did exactly what you suggested. I appended all annual files and then saved them as a single year file since at the end I need single year files. How would you  implement a check prior to the SQL and then only use the column in the SQL if it exists?

 

@ballardw you also suggested to do a check. Maybe you could also show an example of a check? There are indeed different number of variables in some of the years. I have different number of id variables that appear based on number of duplicates in each year. And the number of duplicates is not the same across the years.

 

@Tom I am doing a proc sql distinct group by. I'm not aware of a better alternative using data step.


How your code flows can make a difference on what approaches work for checking on variables.

Since it sound like you have a list of variables that may or may not exist this is an approach that checks the names in a list a against the contents of the data set and returns values in macro variable that only exist in the data set.

This sort of code would have to be used when the library and data set name are known along with the list of variables. So how it might be called depends on your code.

Caution: this has no error checking, so if you call the macro without a list of variables, memname (data set) or libname parameter it doesn't tell you. Also if you happen to pass a list with NO variables in common with the data set the resulting macro variable will be blank. You would have to test for that.

 

This places the result into a macro variable named Newvarlist. An exercise for the interested reader to add a parameter to create a differently named variable as a parameter..

%macro selectstr(varlist=,memname=, libname=,aliasvar=);
%let varlist=%upcase(&varlist.);
%put &varlist.;

proc sql noprint; 
   select catx('.', "&aliasvar.",name)  into :newvarlist separated by ","
   from dictionary.columns 
   where libname=upcase("&libname.") and memname=upcase("&memname.")
      and findw("&varlist.",upcase(strip(name)),' ','i')>0
   ;
quit;

%global newvarlist;

%mend;

 

You have not provided any actual example of how your are using the problem variable names so I don't know if a data set name or alias needs to be included as is common in an SQL select list. i.e T1.var, T1.thatvar, t2.anothervar . The name would go in the Aliasvar. If it is missing or called as blank then no alias or dot appears.

An example call:

%selectstr(varlist=Name sex age mm nn,memname=class, libname=sashelp,aliasvar=t1);

and check the result with to see an example result

%put &newvarlist.;

Which in my case shows:

t1.Name,t1.Sex,t1.Age

So in use might look something like :

%selectstr(varlist=Name sex age mm nn,memname=class, libname=sashelp,aliasvar=t1);

proc sql;
   create table work.example as
   select &newvarlist, t1.height
   from Sashelp.class as t1
   where sex='F'
   ;
quit;

I show an example selecting an additional variable as your question has implied that you may have a smallish list that changes and the rest of the variables are present.

See that this has to be called completely before the Proc SQL that uses the Newvarlist and assumes that at least one variable remains.

 

 

Patrick
Opal | Level 21

@trevand wrote:

@Patrick I did exactly what you suggested. I appended all annual files and then saved them as a single year file since at the end I need single year files. How would you  implement a check prior to the SQL and then only use the column in the SQL if it exists?

 


You could take an approach that creates stable table structures which will make downstream processing easier. If you concatenate multiple SAS table via a SAS data step then the concatenated table will contain any variable that exists in at least one source table. The variable will just be missing for the years where it didn't exist in source.

You could for example concatenate ALL the yearly tables you've got to create the table layout (without any rows in it) and then use this layout file as part of your concatenation. Below some sample code to illustrate the approach.

/* create sample data */
data year_2020(drop=var1 var2) year_2021(drop=var2 var3) year_20203(drop=var4);
  array vars {*} var1-var4 (4*100);
  do i=1 to 10;
    output;
  end;
run;

/* create mapping file */
%let save_obs=%sysfunc(getoption(obs,keyword));
options obs=0;
data mapping;
  stop;
  set year_:;
run;
options &save_obs;

/* create concatenated table for selecte years with stable structure */
data selected_years;
  set mapping year_2020 year_2021;
run;

/* list table structure */
proc contents data=selected_years;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 436 views
  • 1 like
  • 4 in conversation