Hi there SASsers,
I would like to know how to read in a string programmatically and change all of its text to lower case EXCEPT for text within quotation marks (single or double), which should remain unchanged. I suspect this could be a job for Perl regular expressions (e.g. the PRXCHANGE function).
The background: we have many (100+) very long (100-5000 lines) .sas programs. They were written entirely in upper case, which makes them hard to read. Many of the program statements refer to case-sensitive values, e.g.:
/*Have*/
IF NAME='Rob' AND ADDRESS='21a High Street' THEN CHECKED="True";
/*Want*/
if name='Rob' and address='21a High Street' then checked="True";
I can write some code to read in a .sas file, work on each line as a text string, and write the result to a new .sas file:
filename oldprog 'C:\temp\mySASprog.sas' ;
filename newprog 'C:\temp\mySASprogNEW.sas' ;
data _null_;
file newprog ;
infile oldprog missover lrecl=32767;
informat fulline $300. ;
input @1 fulline ;
newline=lowcase(_infile_);
put newLine;
run;
However, how do I get newline converted to lower case while keeping any quoted text untouched?
As I said earlier, sounds like a job for prxchange function or some combination of Perl / Regex functions?
Regards, Rob
Here is a program that does what you want, more or less:
data want;
set have;
prx_squote=prxparse('/''[^'']*''/');
prx_dquote=prxparse('/"[^"]*"/');
pos_squote=1;
pos_dquote=1;
pos=1;
pos1=1;
pos2=1;
len_squote=0;
len_dquote=0;
call prxnext(prx_squote,pos1,-1,fullline,pos_squote,len_squote);
call prxnext(prx_dquote,pos2,-1,fullline,pos_dquote,len_dquote);
if pos_squote=0 and pos_dquote=0 then
fullline=lowcase(fullline);
else do while(pos_squote or pos_dquote);
if pos_squote and pos_dquote then do;
if pos_squote<pos_dquote then do;
substr(fullline,pos,pos_squote-pos)=lowcase(substr(fullline,pos,pos_squote-pos));
pos=pos1;
if pos_dquote<pos then do;
pos2=pos;
call prxnext(prx_dquote,pos2,-1,fullline,pos_dquote,len_dquote);
end;
call prxnext(prx_squote,pos1,-1,fullline,pos_squote,len_squote);
end;
else do;
substr(fullline,pos,pos_dquote-pos)=lowcase(substr(fullline,pos,pos_dquote-pos));
pos=pos2;
if pos_squote<pos2 then do;
pos1=pos;
call prxnext(prx_squote,pos1,-1,fullline,pos_squote,len_squote);
end;
call prxnext(prx_dquote,pos2,-1,fullline,pos_dquote,len_dquote);
end;
end;
else if pos_squote then do;
substr(fullline,pos,pos_squote-pos)=lowcase(substr(fullline,pos,pos_squote-pos));
pos=pos1;
call prxnext(prx_squote,pos1,-1,fullline,pos_squote,len_squote);
end;
else do;
substr(fullline,pos,pos_dquote-pos)=lowcase(substr(fullline,pos,pos_dquote-pos));
pos=pos2;
call prxnext(prx_dquote,pos2,-1,fullline,pos_dquote,len_dquote);
end;
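end;
/* the loop stops at the last quoted string, so lower-case the rest of the line too */
if pos<=length(fullline) then substr(fullline,pos)=lowcase(substr(fullline,pos));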
run;
It is a bit complicated because it has to account for quoted quotes (e.g. "Rob's Pizza Shop"). It does not check for unmatched quotes, though: each line is processed on its own, so if a string literal spans two lines as below, both lines end up in lower case:
WHERE a>'
GGG';
Also, it does not check for CARDS or DATALINES statements and macro quoted quotes (e.g. %STR(%')), and it will also put most comments in all lower case, so I would be a bit wary about just letting it loose on a large batch of programs.
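For comparison, here is a minimal sketch of a different technique: a character-by-character scan with no regular expressions, which remembers whether it is currently inside a quoted string. It is not the PRXNEXT program above, and I have not run it against real programs. The filerefs are the ones from the question; the variable names (line, ch, in_quote, i, len) are made up for illustration.

```sas
filename oldprog 'C:\temp\mySASprog.sas';
filename newprog 'C:\temp\mySASprogNEW.sas';

data _null_;
  infile oldprog lrecl=32767 truncover;
  file newprog lrecl=32767;
  length line $32767 ch in_quote $1;
  /* in_quote holds the quote character that opened the current string; */
  /* blank means "not inside a string". RETAIN carries it across lines. */
  retain in_quote ' ';
  input;
  line = _infile_;
  do i = 1 to lengthn(line);
    ch = char(line, i);
    if in_quote = ' ' then do;
      if ch in ("'", '"') then in_quote = ch;    /* a string starts here */
      else substr(line, i, 1) = lowcase(ch);     /* ordinary code: lower-case it */
    end;
    else if ch = in_quote then in_quote = ' ';   /* the matching quote ends the string */
  end;
  len = lengthn(line);
  /* write the line without trailing blanks; an empty line stays empty */
  if len = 0 then put;
  else put line $varying32767. len;
run;
```

Because only the quote character that opened a string can close it, "Rob's Pizza Shop" stays intact, and a doubled quote such as 'Rob''s' simply closes and reopens the string. Retaining in_quote is what lets a literal span several lines, but it is also a risk: a lone apostrophe in a comment or in %STR(%') leaves the scan "inside a string" until the next apostrophe turns up. Removing the RETAIN statement resets the state on every line and trades that risk for the multi-line problem described above. Comments and CARDS/DATALINES data are still lower-cased, so the same caution applies.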
data have;
input have $80.;
cards4;
IF NAME='Rob' AND ADDRESS='21a High Street' THEN CHECKED="True";
IF NAME='Rob';
IF;
;;;;
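data want;
set have;
length want $ 200;
/* Split on both quote characters. The 'm' modifier keeps empty pieces, so  */
/* odd-numbered pieces are code and even-numbered pieces are quoted text,   */
/* as long as a string never contains the other kind of quote character.    */
do i=1 to countw(have,"'""",'m') ;
temp=scan(have,i,"'""",'m');
if mod(i,2)=1 then temp=lowcase(temp);
/* quoted text is always written back in double quotes, even if it was in single quotes */
else temp=cats('"',temp,'"');
/* catx puts a blank between pieces, so the original spacing is not preserved */
want=catx(' ',want,temp);
end;
drop i temp;
run;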