I discovered another way using PRX expressions:
data new;
set old;
*declare the length before dose_new is first assigned so that it takes effect;
length dose_new $32;
*remove the unit mg (compress drops every m and g character), then collapse repeated blanks;
dose_new=compbl(compress(dose, "mg"));
*delete spaces around hyphens;
dose_new=tranwrd(dose_new, " -", "-");
dose_new=tranwrd(dose_new, "- ", "-");
*prx expression to locate the digit string. ? makes the preceding item optional, \ escapes special characters such as the period, \d* matches zero or more digits. Reads as: optional decimal point, followed by digits, followed by an optional comma, followed by digits, followed by an optional hyphen, then the sequence repeats;
digit_string=prxparse("/\.?\d*,?\d*-?\.?\d*,?\d*-?\.?\d*,?\d*-?\.?\d*,?\d*-?\.?\d*,?\d*-?\.?\d*,?\d*-?\.?\d*,?\d*-?\.?\d*,?\d*/");
*locate the digit string with the prx expression. call prxsubstr creates two new variables, position and length, that describe the match;
call prxsubstr(digit_string, dose_new, position, length);
*extract the digit string based on position and length;
new = substrn(dose_new, position, length);
run;
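
A possible shorter variant, as a sketch only: every element of the pattern above is optional, so it can match an empty string at the first character when the value does not begin with the digit string. The version below requires at least one digit and then takes any run of digits, periods, commas and hyphens. The names new_compact, re and digits are my own placeholders, not part of the original step, and I am assuming the dose values look like "2.5 - 5 mg". Note that a sentence-ending period directly after the number would be picked up as well.

```sas
data new_compact;                 /* hypothetical output data set name */
    set old;
    length dose_new $32 digits $32;

    /* same cleanup as above: drop m and g, collapse blanks, close up hyphens */
    dose_new = compbl(compress(dose, "mg"));
    dose_new = tranwrd(dose_new, " -", "-");
    dose_new = tranwrd(dose_new, "- ", "-");

    /* optional leading period, one required digit, then digits . , or - */
    re = prxparse("/\.?\d[\d.,-]*/");

    /* position is 0 when nothing matches, so digits stays blank in that case */
    call prxsubstr(re, dose_new, position, length);
    if position > 0 then digits = substrn(dose_new, position, length);

    drop re;
run;
```

Because the pattern is a constant, SAS compiles it once rather than on every row, so no retain logic is needed here.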