Structured Query Language (SQL) is a powerful tool for data file creation, retrieval, update and deletion but it can also be used to maintain the data tables in which the data resides. You can use it to maintain indexes and constraints, alter column lengths or formats or any one of a number of other features of your data. However, if you need to make changes to multiple tables it can be burdensome to have to write an SQL statement for every file you need to update. This article demonstrates a way of using Proc SQL to dynamically generate a group of SQL statements which will then automatically be executed, saving you time and ensuring accuracy.
For our example, we will take a scenario where we are responsible for maintaining an old and sprawling system which is poorly documented. Many of the data sets have a variable called “name”, which is obviously a character field, but may have different lengths in different data sets and we need to standardise it as a character field with a length of 20 (which we know to be the maximum length we need for the variable). This presents us with several problems:
In order to generate some test data for our scenario we will run the following piece of code to produce 10 copies of SASHELP.CLASS where each one has a different length for the name field.
libname sqlgen '/folders/myshortcuts/Dropbox/Articles/SAS Communities Library/Generated SQL/Data'; %macro buildtables; %do i = 0 %to 9; %let varlength=%eval(9+&i); data sqlgen.class&i; length Name $&varlength; set sashelp.class; run; %end; %mend buildtables; %buildtables;
It may seem counterintuitive but the best way to generate the SQL statements we need is with Proc SQL itself. The alter table statement we want is of the form
Alter Table <table-name> Modify <column-definition>
In order to determine which tables we need to modify we will use dictionary tables. If you haven’t come across these before they are read-only tables created automatically by SAS during each session and are held in a “hidden library” called DICTIONARY. The particular table we will need is called DICTIONARY.COLUMNS which looks like this:
If you want to learn more about dictionary tables there are several excellent SAS Global Forum papers covering them in more depth.
So, in order to generate the required SQL statements and store them in a data set we can run the following Proc SQL statement:
proc sql; create table sqlcmnds as select "alter table sqlgen."||compress(memname)||" modify name char(20);" as cmnd from dictionary.columns where libname="SQLGEN" and name="Name"; quit;
The resulting table then looks like this:
Having the Alter Table statements in a data step is, as previously mentioned, only part of the solution. We now have to build some code to execute those statements.
Firstly, we will write a short macro to execute a Proc SQL statement like so:
%macro run_sql(); %let incode=%sysfunc(dequote(&incode)); proc sql; &incode; quit; %mend run_sql;
Note that this will execute any valid SQL statement held in the macro variable called incode. Now we need a way to call that macro passing each SQL statement to it in turn.
Next, we are going to create a short custom function with Proc FCMP. Here is the code:
proc fcmp outlib=work.funcs.sql; function run_sql(incode $); rc = run_macro('run_sql',incode); return (rc); endsub;
We can now use this function inside a data step just as we would any inbuilt function – one thing to note is that the second parameter to the run_macro function isn’t a macro parameter in quite the way you might expect. It creates a macro variable which can be used in the called macro.
Finally, we can link all this together through a data step which calls the run_sql function which then calls the run_sql macro for each statement in the table:
options cmplib=work.funcs; data _null_; set sqlcmnds; rc=run_sql(cmnd); run;
Here is a partial listing of the updated DICTIONARY.COLUMNS table showing all of the Name columns now having the same length:
We’ve seen how we can use Proc SQL to construct and execute a large number of similar SQL statements but of course we are not limited to building SQL with this technique. We could just as easily use it to generate multiple Proc Freq statements or Proc Sorts – so the next time you have to generate a large number of similar statements it might be worth considering this simple but effective technique.