Hi Everyone,
I am parsing SAS-Code in SAS, which results in creating a table that has a variable named "Source_Code". For every observation in this table, the value of "Source_Code" is a code line of a SAS program.
In particular, I am interested in how to find the x commands that were used in the parsed SAS code.
Let's imagine I have a program called prog1.sas which consists of the following lines of code:
data _NULL_;
set sashelp.class;
x "rm /this/is/a/path/to/a/file.txt" ;
   x "rm /this/is/another/path/to/a/file2.txt";
run;
Now, as you can see, the second x command is preceded by a couple of blanks.
When I run my SAS parsing program on the above prog1.sas, everything works fine, in the sense that each line is parsed and written into a table called "parsed_code".
What I want is for the program to indicate whether any x command line has been used in the parsed program (which is the case for lines 3 and 4!).
Therefore, I tried using a Perl regular expression and also the %STR quoting function.
data test;
set work.parsed_code;
if _N_=1 then do;
RETAIN x_cmd_patternID;
x_cmd_pattern = "/x[[:blank:]]/";
x_cmd_patternID=prxparse(x_cmd_pattern);
end;
reg_ex_flag_x_cmd= prxmatch(x_cmd_patternID,Source_Code);
flag_x_cmd = index(upcase(Source_Code), ("%STR(X %")")) gt 0;
run;
Regarding the Perl regular expression: how can I "tell" it to look for the string 'x "', i.e. <x, blank, one unmatched double quote>, regardless of whether there are leading blanks in front of the "x", and regardless of how many blanks are between the "x" and the double quote?
I was trying to make sense of the https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=lefunctionsref&docsetTarg... but so far, without successful results...
The problem I have when using the %STR() function in EG is that all subsequent code changes color, indicating that EG thinks it is all part of an unmatched quote, even though I escaped the unmatched double quote with a percent sign.
Any tips on how to tell EG to mask the unmatched double quote?
Cheers,
FK1
You don't need %STR
data test;
input source_code $char80.;
*set work.parsed_code;
if _N_=1 then do;
RETAIN x_cmd_patternID;
x_cmd_pattern = "/x[[:blank:]]/";
x_cmd_patternID=prxparse(x_cmd_pattern);
end;
reg_ex_flag_x_cmd= prxmatch(x_cmd_patternID,Source_Code);
flag_x_cmd = index(upcase(Source_Code), 'X "') gt 0;
cards4;
data _NULL_;
set sashelp.class;
x "rm /this/is/a/path/to/a/file.txt" ;
   x "rm /this/is/another/path/to/a/file2.txt";
run;
;;;;
run;
proc print;
run;
🙂 sometimes it can be so easy....
@data_null__ : do you also have any suggestions for the REG EX command?
It is important to note that the expression
index(upcase(Source_Code), 'X "')
is not adequate to identify all X command syntax
Consider.
x "cmd";
x 'cmd';
x %sysfunc(quote(&cmd));
To name just three
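As a minimal sketch of the point above, the step below applies the INDEX test to the three variants plus two extra lines that are assumptions added for illustration: an x command with two blanks before the quote, and an assignment that merely contains the characters x, blank, double quote. The table name work.index_check and the variable flag_index are hypothetical.

```sas
data work.index_check;
   /* $char80. keeps any leading blanks in the sample lines */
   input Source_Code $char80.;
   /* Same test as above: literal X, one blank, one double quote */
   flag_index = index(upcase(Source_Code), 'X "') > 0;
   datalines4;
x "cmd";
x 'cmd';
x %sysfunc(quote(&cmd));
x  "cmd";
text = 'max "value"';
;;;;
run;

proc print data=work.index_check;
run;
```

Only the first and the last line should be flagged: the single-quoted, macro-quoted and two-blank variants are missed, while the assignment is a false positive.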
Thanks for your remarks, @data_null__ !
This is exactly why I was trying to use regular expressions: to cover cases such as trailing or leading blanks, double quotes vs. single quotes, etc.
Unfortunately, I am not well equipped when it comes to PRX expressions.
So far, I came up with this:
x_cmd_pattern = '/x[[:blank:]]"/';
How do I modify this expression to cover trailing or leading blanks, double quotes vs. single quotes, etc.?
Any ideas?
Something like this should find most macro-less calls you describe:
FLAG=prxmatch('/\A\s*x\s+[''"]/i',STRING);
If you want to catch calls that use the macro language, you need to define the perimeter.
There is almost no limit to how complex a macro expression can be.
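To try the pattern out, here is a small sketch that runs it against a few sample lines. The sample lines, the table name work.x_cmd_test and the variable flag_x_cmd are made up for illustration; in the real case the input would be the parsed_code table.

```sas
data work.x_cmd_test;
   /* $char80. keeps leading blanks so \A\s* has something to skip */
   input Source_Code $char80.;
   /* A constant pattern passed to PRXMATCH is compiled only once */
   flag_x_cmd = prxmatch('/\A\s*x\s+[''"]/i', Source_Code) > 0;
   datalines4;
data _null_;
x "rm /this/is/a/path/to/a/file.txt" ;
   x 'rm /this/is/another/path/to/a/file2.txt';
X    "ls";
x = 5;
text = 'max "value"';
;;;;
run;

proc print data=work.x_cmd_test;
run;
```

The three x command lines should be flagged, and the assignments should not. Note that \A anchors the match at the start of the line, so an x command that follows another statement on the same line is not caught by this sketch.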
Thanks @ChrisNZ !
Do you also maybe have a tip how to "catch" filename pipe commands?
filename <some arbitrary text of variable length> pipe;
How can I tell a Perl Regular Expression to interleave any number of letters between the word "filename" and the word "pipe"?
FLAG=prxmatch('/\s*filename\s[[:alpha:]]*pipe/i',STRING);
Would it be sufficient to search the string for the word PIPE, e.g. with findw(upcase(source), 'PIPE'), after you find a FILENAME statement?
Also note that you can have an INFILE statement with the PIPE option that does not use a FILENAME statement:
data _null_;
command = '/usr/bin/find ....';
INFILE DUMMY PIPE filevar=command end=eof;
do while(not eof);
input;
end;
stop;
run;
Like this?
FLAG=prxmatch('/\s*filename\s+[[:alpha:]]{1,8}\s+pipe\s/i',STRING);
Another expression to catch some infile statements as @data_null__ showed, could be:
FLAG=prxmatch('/[\s;]*(?:filename|infile)\s+[[:alpha:]]{1,8}\s+pipe\s/i',STRING);
This will also catch a filename preceded by a ; but will not catch a filename created using the filename function, or a filename spanning several lines.
Also note that filename _A1 '.'; is valid, so looking for letters only is insufficient.
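Taking the remarks above into account, one possible variant is sketched below. It assumes the statement fits on one line and uses a plain fileref (a letter or underscore followed by up to seven letters, digits or underscores). The sample lines, the table name work.pipe_test and the variable flag_pipe are hypothetical.

```sas
data work.pipe_test;
   input Source_Code $char80.;
   /* Statement starts the line or follows a semicolon;          */
   /* (?:filename|infile) is an alternation, not a character set */
   flag_pipe = prxmatch(
      '/(?:\A|;)\s*(?:filename|infile)\s+[a-z_]\w{0,7}\s+pipe\b/i',
      Source_Code) > 0;
   datalines4;
filename lst pipe "ls -l";
   filename _A1 pipe 'dir';
infile dummy pipe filevar=command end=eof;
data a; infile cmd pipe;
filename _A1 '.';
filename out "/tmp/pipe.txt";
;;;;
run;

proc print data=work.pipe_test;
run;
```

The first four lines should be flagged and the last two should not. As noted above, a fileref assigned with the FILENAME function or a statement spanning several lines is still missed.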