DATA Step, Macro, Functions and more

CASE select alphanumeric data

Reply
Frequent Contributor
Posts: 75

CASE select alphanumeric data

I'm trying to assign a new variable based off of CASE select results. Some old code that I am trying to convert/rewrite:

IF VOL IN:('TKDA' 'TKDB' 'TKDP' 'TKFR' 'TKDU' 'TKGI' 'TKM '   
           'TKOL' 'TKOM' 'TKOS' 'TKPL' 'TKPM' 'TKPS' 'TKWL'   
           'TKWM' 'TKWS' 'RS'   'RS41' 'TKDR' 'SFSP' 'SP00'   
           'LMD'  'TMM' 'VM' 'LHFS' 'SHFS' 'RE' 'NITC')       
   THEN TYPEVOL = 'CUST';                                     
ELSE                                                          
IF VOL IN:('CAT0' 'SFCK' 'SF2A' 'SN29' 'NF2A' 'SAG0' 'NF'     
           'TLSA' 'TLTA' 'CAT1' 'CAT2' 'CAT3' 'OF' 'ZVM' '430'
           'CAT4' 'CAT9' 'SHCD' 'SS2A' 'SS29' 'SYD1' 'SYD2'   
           'SYD3' 'SYD4' 'SYD9' 'SYI1' 'SYI2' 'SYI3' 'SYI4'   
           'SYI9' 'SYO2' 'SYPG' 'SYPL' 'SYR1' 'SYR2' 'SYI4'   
           'SYR3' 'SYR4' 'SYR9' 'SY39' 'SYR1' 'SYR2' 'SYI4'   
           'SYS'  'SF1' 'SS1' 'CV' 'SS3P' 'SYDM' 'SS4' 'DSKP')
   THEN TYPEVOL = 'SYST';                                     
ELSE                                                          
   TYPEVOL = 'UNKN';                                          

The following is what I'm trying to do in SQL, though I am running into issue with alphnumeric comparisons:

PROC SQL;                                                           
 SELECT DCVVOLSR,                                                   
        CASE                                                        
         WHEN DCVVOLSR IN:('TKDA' 'TKDB' 'TKDP' 'TKFR' 'TKDU' 'TKGI'
           'TKM' 'TKOL' 'TKOM' 'TKOS' 'TKPL' 'TKPM' 'TKPS' 'TKWL'   
           'TKWM' 'TKWS' 'RS' 'RS41' 'TKDR' 'SFSP' 'SP00'           
           'LMD' 'TMM' 'VM' 'LHFS' 'SHFS' 'RE' 'NITC')              
         THEN VOLTYPE = 'CUST'                                      
         WHEN DCVVOLSR IN:('CAT0' 'SFCK' 'SF2A' 'SN29' 'NF2A' 'SAG0'
           'NF' 'TLSA' 'TLTA' 'CAT1' 'CAT2' 'CAT3' 'OF' 'ZVM'       
           'CAT4' 'CAT9' 'SHCD' 'SS2A' 'SS29' 'SYD1' 'SYD2'         
           'SYD3' 'SYD4' 'SYD9' 'SYI1' 'SYI2' 'SYI3' 'SYI4'         
           'SYI9' 'SYO2' 'SYPG' 'SYPL' 'SYR1' 'SYR2' 'SYI4'         
           'SYR3' 'SYR4' 'SYR9' 'SY39' 'SYR1' 'SYR2' 'SYI4'         
           'SYS' 'SF1' 'SS1' 'CV' 'SS3P' 'SYDM' 'SS4'               
           '430' 'DSKP')                                            
         THEN VOLTYPE = 'SYST'                                      
         ELSE VOLTYPE = 'UNKN'                                      
         END AS VOLTYPE,                                            
        SUM(DCVALLOC) AS ALLOCATED_SPACE FORMAT MGBYTES.,           
        SUM(DCVFRESP) AS FREE_SPACE FORMAT MGBYTES.,                
        SUM(DCVVLCAP) AS TOTAL_SPACE FORMAT MGBYTES.                
   FROM INPT.DCOLVOLS                                               
 GROUP BY DCVVOLSR;                                                 
QUIT;                                                               

I've tried using the LIKE clause as well.

Super User
Posts: 7,791

Re: CASE select alphanumeric data

Try this:

PROC SQL;                                                           
 SELECT DCVVOLSR,                                                   
        CASE                                                        
         WHEN DCVVOLSR IN ('TKDA','TKDB','TKDP','TKFR','TKDU','TKGI',
           'TKM','TKOL','TKOM','TKOS','TKPL','TKPM','TKPS','TKWL',
           'TKWM','TKWS','RS','RS41','TKDR','SFSP','SP00',
           'LMD','TMM','VM','LHFS','SHFS','RE','NITC')
         THEN 'CUST'
         WHEN DCVVOLSR IN ('CAT0','SFCK','SF2A','SN29','NF2A','SAG0',
           'NF','TLSA','TLTA','CAT1','CAT2','CAT3','OF','ZVM',
           'CAT4','CAT9','SHCD','SS2A','SS29','SYD1','SYD2',
           'SYD3','SYD4','SYD9','SYI1','SYI2','SYI3','SYI4',
           'SYI9','SYO2','SYPG','SYPL','SYR1','SYR2','SYI4',
           'SYR3','SYR4','SYR9','SY39','SYR1','SYR2','SYI4',
           'SYS','SF1','SS1','CV','SS3P','SYDM','SS4',
           '430','DSKP')
         THEN 'SYST'                                      
         ELSE 'UNKN'                                      
         END AS VOLTYPE,                                            
        SUM(DCVALLOC) AS ALLOCATED_SPACE FORMAT MGBYTES.,           
        SUM(DCVFRESP) AS FREE_SPACE FORMAT MGBYTES.,                
        SUM(DCVVLCAP) AS TOTAL_SPACE FORMAT MGBYTES.                
   FROM INPT.DCOLVOLS                                               
 GROUP BY DCVVOLSR;                                                 
QUIT;                        
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 75

Re: CASE select alphanumeric data

Posted in reply to KurtBremser

Kurt,

 

Thanks, but that produces all volumes as UNKN. I need LIKE matching on alphanumeric against variable DCVVOLSR.

 

Jeff

Super User
Posts: 19,815

Re: CASE select alphanumeric data

You only have a few lengths (3/4) so why not take the the substr() of the first 3 or 4 characters and do your tests?

 

case when substr(....)  in () 
    or substr(..., , 4) in () then ..

Respected Advisor
Posts: 4,925

Re: CASE select alphanumeric data

There are also two other techniques

 

1) Use regular expressions

 

data test;
input DCVVOLSR $;
datalines;
TKD
TKDA
TKDA123
V
VM
VM123
430
43
430x
;

PROC SQL;                                                           
 SELECT DCVVOLSR,                                                   
        CASE                                                        
         WHEN prxmatch("/^(TKDA|TKDB|TKDP|TKFR|TKDU|TKGI|
TKM|TKOL|TKOM|TKOS|TKPL|TKPM|TKPS|TKWL|TKWM|TKWS|RS|RS41|TKDR|SFSP|SP00|
LMD|TMM|VM|LHFS|SHFS|RE|NITC)/o", DCVVOLSR)
         THEN 'CUST'
         WHEN prxmatch("/^(CAT0|SFCK|SF2A|SN29|NF2A|SAG0|
NF|TLSA|TLTA|CAT1|CAT2|CAT3|OF|ZVM|CAT4|CAT9|SHCD|SS2A|SS29|SYD1|SYD2|
SYD3|SYD4|SYD9|SYI1|SYI2|SYI3|SYI4|SYI9|SYO2|SYPG|SYPL|SYR1|SYR2|SYI4|
SYR3|SYR4|SYR9|SY39|SYR1|SYR2|SYI4|SYS|SF1|SS1|CV|SS3P|SYDM|SS4|430|DSKP)/o", DCVVOLSR)
         THEN 'SYST'                                      
         ELSE 'UNKN'                                      
         END AS VOLTYPE
from test;
quit;

 

or 2) Use an auxiliary dataset

 

data types;
input DCVVOLSR $ VOLTYPE $;
datalines;
TKDA CUST
VM CUST
430 SYST
;

proc sql;
select test.DCVVOLSR, coalesce(VOLTYPE, "UNKN") as VOLTYPE
from test left join types on find(test.DCVVOLSR, types.DCVVOLSR, "T") = 1;
quit;
PG
Ask a Question
Discussion stats
  • 4 replies
  • 61 views
  • 4 likes
  • 4 in conversation