Okay Art, I will change it slightly to make it more efficient by splitting the second PRX call into two, and, why not, let's write it in DS2 as well:

2083  data x.have (drop=i);
2084    input string $40.;
2085    do i=1 to 1e6;
2086      output;
2087    end;
2088  cards;

NOTE: The data set X.HAVE has 4000000 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.14 seconds
      cpu time            0.14 seconds

2093  ;
2094  run;
2095
2096  data want_fe;
2097    set x.have;
2098    array v[14] $ 2;
2099    v1 = scan(string, 1, ',');
2100    v2 = ',';
2101    bar = scan(string, 2, ',');
2102    bar = prxchange('s/([a-z])([a-z])/\11\2/io', -1, bar);
2103    bar = prxchange('s/([a-z]+)(?!\s)/\1_/io', -1, bar);
2104    bar = prxchange('s/([0-9]+)/\1_/o', -1, bar);
2105    do i=3 to dim(v) by 1 while(1);
2106      v[i]=scan(bar, i-2, '_');
2107      if missing(v[i]) then leave;
2108    end;
2109    keep v:;
2110  run;

NOTE: There were 4000000 observations read from the data set X.HAVE.
NOTE: The data set WORK.WANT_FE has 4000000 observations and 14 variables.
NOTE: DATA statement used (Total process time):
      real time           15.04 seconds
      cpu time            15.06 seconds

2111
2112  data want_art (drop=string i);
2113    set x.have;
2114    string=prxchange('s/([0-9])([,A-Za-z])/$1_$2/', -1, STRING);
2115    string=prxchange('s/(,)([0-9A-Za-z])/$1_$2/', -1, STRING);
2116    string=prxchange('s/([0-9A-Za-z])(,)/$1_$2/', -1, STRING);
2117    string=prxchange('s/([A-Za-z])([0-9])/$1_$2/', -1, STRING);
2118    string=prxchange('s/([A-Za-z])([A-Za-z])/$1_1_$2/', -1, STRING);
2119    array var(14) $;
NOTE: The array var has the same name as a SAS-supplied or user-defined function.
      Parentheses following this name are treated as array references and not
      function references.
2120    i=1;
2121    do while(scan(string,i,'_') ne '');
2122      var(i)=scan(string,i,'_');
2123      i+1;
2124    end;
2125  run;

NOTE: There were 4000000 observations read from the data set X.HAVE.
NOTE: The data set WORK.WANT_ART has 4000000 observations and 14 variables.
NOTE: DATA statement used (Total process time):
      real time           20.46 seconds
      cpu time            20.39 seconds

2126
2127  proc ds2;
2128
2129  thread break_string / overwrite=yes;
2130    vararray nchar(2) v[14];
2131    keep v:;
2132    method run();
2133      declare nchar(32) foo;
2134      declare int i;
2135      set x.have;
2136      v[1]=scan(string, 1, ',');
2137      v[2]=',';
2138      foo=scan(string, 2, ',');
2139      foo=prxchange('s/([a-z])([a-z])/\11\2/io', -1, foo);
2140      foo=prxchange('s/([a-z]+)(?!\s)/\1_/io', -1, foo);
2141      foo=prxchange('s/([0-9]+)/\1_/o', -1, foo);
2142      do i=3 to dim(v);
2143        v[i]=scan(foo, i-2, '_');
2144        if missing(v[i]) then leave;
2145      end;
2146    end;
2147  endthread;
2148  run;

NOTE: Created thread break_string in data set work.break_string.
NOTE: Execution succeeded. No rows affected.

2149
2150  data want_feds2 (overwrite=yes);
2151    declare thread break_string bsthread;
2152    method run();
2153      set from bsthread threads=4;
2154    end;
2155  enddata;
2156  run;

NOTE: BASE driver, creation of a NCHAR column has been requested, the table encoding is set to UTF-8.
NOTE: Execution succeeded. 4000000 rows affected.

2157
2158  quit;

NOTE: PROCEDURE DS2 used (Total process time):
      real time           5.11 seconds
      cpu time            20.57 seconds
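
For anyone following the loop that closes each step: it walks an underscore-delimited string with scan() and stops at the first empty token. Here is a minimal sketch of just that part. The value assigned to bar is made up for illustration (the real input lines are not shown in the log), and the 5-element array is likewise hypothetical.

```sas
/* Sketch only: fill array elements from an underscore-delimited string. */
data _null_;
  length bar $ 40;
  array v[5] $ 8;
  bar = 'a_1_b_22_';                /* hypothetical value, not from the real data */
  do i = 1 to dim(v);
    v[i] = scan(bar, i, '_');       /* i-th token, blank once tokens run out */
    if missing(v[i]) then leave;    /* stop at the first empty token */
  end;
  put v1= v2= v3= v4= v5=;          /* write the result to the log */
run;
```

With this input, v1 through v4 receive a, 1, b and 22, and v5 stays blank because scan() returns a blank once it runs past the last token.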