Hi,
I am trying to build a program whose output is a list of the variables mentioned in a SAS program.
I have a general sense of processing each line of the program and doing lookups, but I'm not sure how specifically to approach it and was hoping to get some good ideas from this group.
I do have a list of variables that I'm specifically interested in looking for, so if that's easier to program than just trying to find general variable names, I can definitely do that.
Does anyone have any useful ideas for me?
Thanks!
Another way is to declare a long character variable:
length STR $ 32767;
and then use
STR = catx(',', STR, Var_name);
to collect all the variables of interest. STR will then hold the variable names separated by commas, or by any other delimiter you like.
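For the string to build up across rows, STR also has to be retained. Here is a minimal sketch, assuming the names of interest already sit one per row in a data set; the data set names found and varlist and the sample values are made up for illustration:

```sas
/* Hypothetical input: one variable name per row */
data found;
    input Var_name :$32.;
    datalines;
cpr
chargednumber
connectednumber
;
run;

data varlist (keep=STR);
    set found end=last;
    length STR $ 32767;
    retain STR;                       /* keep the value of STR from row to row */
    STR = catx(',', STR, Var_name);   /* CATX skips the still-blank STR on the first row */
    if last then output;              /* write one row holding the complete list */
run;
```

With these three sample rows, varlist ends up with a single row whose STR holds the three names separated by commas.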
I would approach this as follows:
Read the code and test each token to see whether it is a valid variable name, recording the line number and the "word". (If you chose to use VALIDVARNAME=ANY, the whole thing gets a lot more complicated.) The NVALID function tests whether a string is valid as a variable name.
Then compare that list to the data set of interest.
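A rough sketch of those two steps, under some assumptions: the "program" is three sample lines read inline, the data set of interest is SASHELP.CLASS, and punctuation is simply blanked out before splitting. The names tokens and mentioned are made up for illustration.

```sas
/* Step 1: split each line into words and keep those that are valid V7 names */
data tokens (keep=lineno word);
    infile datalines4 truncover;
    length word $32;
    input line $char200.;
    lineno = _n_;                                   /* line number within the program */
    line = translate(line, ' ', '()[]{};/\+-*?.,=');  /* blank out common punctuation */
    do i = 1 to countw(line, ' ');
        word = scan(line, i, ' ');
        if nvalid(word, 'v7') then output;
    end;
    datalines4;
data teens; set sashelp.class;
  if age > 12 then bmi = weight / height**2;
run;
;;;;
run;

/* Step 2: keep only the words that are column names in the data set of interest */
proc sql;
    create table mentioned as
    select distinct t.lineno, t.word
    from tokens as t
         inner join dictionary.columns as c
         on upcase(t.word) = upcase(c.name)
    where c.libname = 'SASHELP' and c.memname = 'CLASS';
quit;
```

Note that keywords such as data, set and run also pass NVALID, which is why the comparison in step 2 is needed: only words that are also columns of the data set survive the join.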
Hi @Walternate
I tried to make some working code based on @ballardw's suggestions. Note that it will find explicitly mentioned variables only, not shortcuts etc. There is room for many refinements, so take it as a proof of concept.
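* Get a list of SAS program files;
%let datafolder = Y:\Diverse Windows\AdHocProd;
filename df pipe "dir ""&datafolder\*.sas""";
data pgmlist (drop=line c);
    length pgm $255;
    infile df truncover end=eod;
    input line $char255.;
    * File rows in the DIR listing start with a date while header and summary rows start with a blank;
    if substr(line,1,1) ne '' then do;
        * Assumes the file name starts in column 37 of the listing, which depends on the Windows locale;
        pgm = "&datafolder"||'\'||substr(line,37);
        output;
        c + 1;
    end;
    if eod then call symputx('pgmcnt',c);
run;
%put &=pgmcnt;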
* Delete previous results;
proc datasets lib=work nolist;
    delete allpgmwords;
quit;
* Read all files and split in words that might be variables;
%macro readfiles;
    %do i=1 %to &pgmcnt;
        data _null_;
            set pgmlist(firstobs=&i obs=&i);
            call symput('thispgm',trim(pgm));
        run;
        data pgmwords (keep=program word);
            infile "&thispgm" truncover;
            length program $255 word $32;
            program = "&thispgm";
            input line $char255.;
            line = translate(line,' ','()[]{};/\+-*?.,=');
            do i = 1 to countw(line,' ');
                word = scan(line,i,' ');
                if nvalid(word,'v7') then output;
            end;
        run;
        proc append base=allpgmwords data=pgmwords;
        run;
    %end;
%mend readfiles;
%readfiles;
/* reduce to a distinct list */
proc sql;
    create table uniquepgmwords as
    select distinct
        program,
        word
    from allpgmwords;
quit;
* Get list of interesting variables;
data reflist;
    input refvar $32.;
    cards;
chargednumber
connectednumber
cpr
;
run;
* Find programs where these variables are mentioned;
proc sql;
    create table result as
    select
        uniquepgmwords.program,
        uniquepgmwords.word as variable_found
    from uniquepgmwords, reflist
    where upcase(reflist.refvar) = upcase(uniquepgmwords.word)
    order by program, variable_found;
quit;
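One possible refinement, offered as a sketch rather than a tested replacement: a wildcard in the INFILE path together with the FILENAME= option reads all the programs in a single data step, so neither the DIR listing nor the macro loop is needed. To keep the example self-contained it first writes two made-up demo programs to the WORK folder; for real use the INFILE path would be "&datafolder\*.sas" instead. The macro variable demo and the demo file names are assumptions for illustration.

```sas
/* Demo setup (hypothetical files): write two small programs to the WORK folder */
%let demo = %sysfunc(pathname(work));
data _null_;
    file "&demo/demo_a.sas";
    put 'data a; set b; cpr = 1; run;';
    file "&demo/demo_b.sas";
    put 'proc print data=a; var chargednumber; run;';
run;

/* Read every matching file in one step */
data allpgmwords (keep=program word);
    length fname program $255 word $32;
    /* FILENAME= puts the name of the file currently being read into fname */
    infile "&demo/demo_*.sas" filename=fname truncover;
    input line $char255.;
    program = fname;          /* copy it, as the FILENAME= variable is not written out */
    line = translate(line, ' ', '()[]{};/\+-*?.,=');
    do i = 1 to countw(line, ' ');
        word = scan(line, i, ' ');
        if nvalid(word, 'v7') then output;
    end;
run;
```

The resulting allpgmwords has the same two columns as in the program above, so the distinct list and the lookup against reflist can follow unchanged.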