I have data as shown below. I need to add the numbers preceding "mL" for each ID, into 1 variable, and add the numbers preceding mL/kg per ID into another variable.
I tried removing the spaces but I don't know how many digits the number will be, to count back.
I tried extracting the numeric part but then other numbers like the 9 from the % also gets extracted.
Any help on how to do it in SAS (or Excel) will be appreciated. Thank you!
ID | DISPLAY_NAME |
1 | sodium chloride 0.9 % bolus 2,250 mL |
1 | sodium chloride 0.9 % bolus 1,000 mL |
1 | sodium chloride 0.9 % bolus 1,000 mL |
2 | sodium chloride 0.9 % bolus 500 mL |
3 | sodium chloride 0.9 % bolus 1,000 mL |
3 | sodium chloride 0.9 % bolus 250 mL |
3 | sodium chloride 0.9 % bolus 1,551 mL |
4 | lactated ringers bolus 1,000 mL |
4 | sodium chloride 0.9 % bolus 1,000 mL |
4 | PLASMA-LYTE A bolus 1,000 mL |
5 | sodium chloride 0.9 % bolus 30 mL/kg/dose |
5 | sodium chloride 0.9 % bolus 1,000 mL |
@Xinxin There are probably easier ways, but here's one way:
data have;
input id name $50.;
datalines;
1 sodium chloride 0.9 % bolus 2,250 mL
1 sodium chloride 0.9 % bolus 1,000 mL
1 sodium chloride 0.9 % bolus 1,000 mL
2 sodium chloride 0.9 % bolus 500 mL
3 sodium chloride 0.9 % bolus 1,000 mL
3 sodium chloride 0.9 % bolus 250 mL
3 sodium chloride 0.9 % bolus 1,551 mL
4 lactated ringers bolus 1,000 mL
4 sodium chloride 0.9 % bolus 1,000 mL
4 PLASMA-LYTE A bolus 1,000 mL
5 sodium chloride 0.9 % bolus 30 mL/kg/dose
5 sodium chloride 0.9 % bolus 1,000 mL
;
run;
data want;
set have;
reverse_name = upcase(strip(reverse(name)));
mL = index(reverse_name, 'LM');
space = find(reverse_name, ' ', mL + 3);
difference = space - (mL + 3);
numbers = input(reverse(substr(reverse_name, mL + 3, difference)), comma12.);
keep id name numbers;
run;
Result:
The code above assumes the desired number is always preceded by a space (" ") and followed by a space and mL (i.e. " mL").
Also, this puts all those numbers into a single variable. Did you need the mL/kg/dose number in a separate variable? It might be easier to have the number in one column, and then have another column for the units (e.g. mL or mL/kg/dose).
data want;
set have;
c=max(index(upcase(disp), 'ML/KG'), index(upcase(disp),'ML'));
if c then res=input(compress(scan(substr(disp,1,c-1),-1, ' '),,'kd'),best.);
run;
This also worked....thank you very much!
data want;
set have;
/*Regex Pattern*/
/*(?<=bolus\s)(.*)(?=\smL)*/
/*Positive Lookbehind (?<=bolus\s) - Assert that the Regex below matches bolus matches the characters bolus literally*/
/*\s matches any whitespace character */
/*1st Capturing Group (.*)*/
/*.* matches any character (except for line terminators)*/
/*Positive Lookahead (?=\smL) - Assert that the Regex below matches*/
/*\s matches any whitespace character (equal to [\r\n\t\f\v ])*/
/*mL matches the characters mL literally (case sensitive)*/
pat_id=prxparse('/(?<=bolus\s)(.*)(?=\smL)/');
call prxsubstr(pat_id,name,pos,len);
num_=input(substr(name,pos,len),comma8.);
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.