@Xinxin There are probably easier ways, but here's one way:
data have;
input id name $50.;
datalines;
1 sodium chloride 0.9 % bolus 2,250 mL
1 sodium chloride 0.9 % bolus 1,000 mL
1 sodium chloride 0.9 % bolus 1,000 mL
2 sodium chloride 0.9 % bolus 500 mL
3 sodium chloride 0.9 % bolus 1,000 mL
3 sodium chloride 0.9 % bolus 250 mL
3 sodium chloride 0.9 % bolus 1,551 mL
4 lactated ringers bolus 1,000 mL
4 sodium chloride 0.9 % bolus 1,000 mL
4 PLASMA-LYTE A bolus 1,000 mL
5 sodium chloride 0.9 % bolus 30 mL/kg/dose
5 sodium chloride 0.9 % bolus 1,000 mL
;
run;
data want;
set have;
reverse_name = upcase(strip(reverse(name)));
mL = index(reverse_name, 'LM');
space = find(reverse_name, ' ', mL + 3);
difference = space - (mL + 3);
numbers = input(reverse(substr(reverse_name, mL + 3, difference)), comma12.);
keep id name numbers;
run;
Result:
The code above assumes the desired number is always preceded by a space (" ") and followed by a space and mL (i.e. " mL").
Also, this puts all those numbers into a single variable. Did you need the mL/kg/dose number in a separate variable? It might be easier to have the number in one column, and then have another column for the units (e.g. mL or mL/kg/dose).
... View more