robAs
Fluorite | Level 6

Hi there SASsers,
I would like to know how I can programmatically read a string and change all text to lower case EXCEPT for text within quotation marks (single or double). The quoted text should remain unchanged. I suspect this could be a job for Perl regular expressions (e.g. the PRXCHANGE function).

The background: we have many (100+) very long (100-5000 lines) .sas programs. They were written entirely in upper case, which makes them hard to read. Many of the program statements refer to case-sensitive values, e.g.:
/*Have*/

IF NAME='Rob' AND ADDRESS='21a High Street' THEN CHECKED="True";

/*Want*/

if name='Rob' and address='21a High Street' then checked="True";

 

I can write some code to read in a .sas file, work on each line as a text string, and write the result to a new .sas file:

filename oldprog 'C:\temp\mySASprog.sas' ;
filename newprog 'C:\temp\mySASprogNEW.sas' ;

 

data _null_;
  file newprog ;
  infile oldprog lrecl=32767;
  input ;                      /* load the whole line into _infile_ */
  newline=lowcase(_infile_);   /* lower-cases everything, quoted text included */
  put newline;
run;

 

However, how do I get newline converted to lower case while keeping any quoted text untouched?
As I said earlier, this sounds like a job for the PRXCHANGE function or some combination of Perl regex functions.
Regards,  rob,






 

s_lassen
Meteorite | Level 14

Here is a program that does what you want, more or less:

data want;                                                                                                                              
  set have;                                                                                                                             
  prx_squote=prxparse('/''[^'']*''/');                                                                                                  
  prx_dquote=prxparse('/"[^"]*"/');                                                                                                     
  pos_squote=1;                                                                                                                         
  pos_dquote=1;                                                                                                                         
  pos=1;                                                                                                                                
  pos1=1;                                                                                                                               
  pos2=1;                                                                                                                               
  len_squote=0;                                                                                                                         
  len_dquote=0;                                                                                                                         
  call prxnext(prx_squote,pos1,-1,fullline,pos_squote,len_squote);                                                                      
  call prxnext(prx_dquote,pos2,-1,fullline,pos_dquote,len_dquote);                                                                      
  if pos_squote=0 and pos_dquote=0 then                                                                                                 
    fullline=lowcase(fullline);                                                                                                         
  else do while(pos_squote or pos_dquote);                                                                                              
    if pos_squote and pos_dquote then do;                                                                                               
      if pos_squote<pos_dquote then do;                                                                                                 
        substr(fullline,pos,pos_squote-pos)=lowcase(substr(fullline,pos,pos_squote-pos));                                               
        pos=pos1;                                                                                                                       
        if pos_dquote<pos then do;                                                                                                      
          pos2=pos;                                                                                                                     
          call prxnext(prx_dquote,pos2,-1,fullline,pos_dquote,len_dquote);                                                              
          end;                                                                                                                          
        call prxnext(prx_squote,pos1,-1,fullline,pos_squote,len_squote);                                                                
        end;                                                                                                                            
      else do;                                                                                                                          
        substr(fullline,pos,pos_dquote-pos)=lowcase(substr(fullline,pos,pos_dquote-pos));                                               
        pos=pos2;                                                                                                                       
        if pos_squote<pos2 then do;                                                                                                     
          pos1=pos;                                                                                                                     
          call prxnext(prx_squote,pos1,-1,fullline,pos_squote,len_squote);                                                              
          end;                                                                                                                          
        call prxnext(prx_dquote,pos2,-1,fullline,pos_dquote,len_dquote);                                                                
        end;                                                                                                                            
      end;                                                                                                                              
    else if pos_squote then do;                                                                                                         
      substr(fullline,pos,pos_squote-pos)=lowcase(substr(fullline,pos,pos_squote-pos));                                                 
      pos=pos1;                                                                                                                         
      call prxnext(prx_squote,pos1,-1,fullline,pos_squote,len_squote);                                                                  
      end;                                                                                                                              
    else do;                                                                                                                            
      substr(fullline,pos,pos_dquote-pos)=lowcase(substr(fullline,pos,pos_dquote-pos));                                                 
      pos=pos2;                                                                                                                         
      call prxnext(prx_dquote,pos2,-1,fullline,pos_dquote,len_dquote);                                                                  
      end;                                                                                                                              
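    end;
  /* Lower-case any text that follows the last quoted string */
  if pos<=length(fullline) then
    substr(fullline,pos)=lowcase(substr(fullline,pos));
run;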

It is a bit complicated, because we have to account for quoted quotes (e.g. "Rob's Pizza Shop"). It also works one line at a time and does not check for unmatched quotes, so a quoted string that spans lines is not protected. If your program looks like this, you will get both lines in lower case:

WHERE a>'
GGG';

Also, it does not check for CARDS or DATALINES statements or for macro-quoted quotes (e.g. %STR(%')), and it will put most comments in all lower case, so I would be a bit wary about just letting it loose on a large batch of programs.
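
A shorter variant of the same idea, offered as a sketch rather than a tested replacement: a single pattern with alternation matches either kind of quoted string, and because the regex engine takes the leftmost match, an apostrophe inside a double-quoted string (or the reverse) is consumed as part of that string. The data set name have, the variable line, and the sample rows are assumptions made for illustration. The limitations described above (strings that span lines, CARDS/DATALINES, macro quoting, comments) still apply.

```sas
/* Sample input (hypothetical), including a quote nested in the other style */
data have;
  input line $char80.;
  cards4;
IF NAME='Rob' AND ADDRESS='21a High Street' THEN CHECKED="True";
IF SHOP="Rob's Pizza Shop" THEN DO;
IF;
;;;;

data want;
  set have;
  /* One pattern for both quote styles: '...' or "..." */
  rx = prxparse('/''[^'']*''|"[^"]*"/');
  start = 1;   /* where PRXNEXT resumes searching    */
  prev  = 1;   /* first character not yet processed  */
  pos   = 0;
  len   = 0;
  call prxnext(rx, start, -1, line, pos, len);
  do while (pos > 0);
    /* Lower-case the unquoted text in front of this quoted string */
    if pos > prev then
      substr(line, prev, pos - prev) = lowcase(substr(line, prev, pos - prev));
    prev = pos + len;   /* skip over the quoted string itself */
    call prxnext(rx, start, -1, line, pos, len);
  end;
  /* Lower-case the text after the last quoted string */
  if prev <= length(line) then
    substr(line, prev) = lowcase(substr(line, prev));
  drop rx start prev pos len;
run;
```

To run this against a program file instead, the have step could be replaced by an INFILE/INPUT step that reads each line with a $CHAR informat so that indentation is kept.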

Ksharp
Super User
data have;
input have $80.;
cards4;
IF NAME='Rob' AND ADDRESS='21a High Street' THEN CHECKED="True";
IF NAME='Rob';
IF;
;;;;

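data want;
set have;
length want $ 200;
  /* Split on both quote characters; the 'm' modifier keeps empty pieces, */
  /* so odd-numbered pieces are outside quotes and even-numbered inside.  */
  do i=1 to countw(have,"'""",'m') ;
    temp=scan(have,i,"'""",'m');
    if mod(i,2)=1 then temp=lowcase(temp);
    else temp=cats('"',temp,'"');   /* note: always re-quotes with double quotes */
    want=catx(' ',want,temp);       /* note: pieces are rejoined with single blanks */
  end;
  drop i temp;
run;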
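
The odd/even idea above can also be applied in place, so that the original quote characters and spacing stay exactly as written. The following is a minimal sketch, assuming the same have data set and variable as above. It uses CALL SCAN to get the position and length of each piece instead of rebuilding the string. Like any parity-based split, it assumes the two quote characters are never nested, so a value such as "Rob's Pizza Shop" would throw the count off.

```sas
data have;
  input have $char80.;
  cards4;
IF NAME='Rob' AND ADDRESS='21a High Street' THEN CHECKED="True";
IF NAME='Rob';
IF;
;;;;

data want;
  set have;
  length want $ 80;
  want = have;     /* start from an exact copy, quotes and spacing included */
  pos = 0;
  len = 0;
  /* With the 'm' modifier, pieces 1, 3, 5, ... lie outside the quotes */
  do i = 1 to countw(have, "'""", 'm') by 2;
    call scan(have, i, pos, len, "'""", 'm');
    /* Overwrite only the unquoted piece with its lower-case form */
    if len > 0 then
      substr(want, pos, len) = lowcase(substr(have, pos, len));
  end;
  drop i pos len;
run;
```

Because only the unquoted pieces are overwritten, single-quoted values keep their single quotes, which matters when a program relies on single quotes to stop macro resolution.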
