Have a bunch of records as below:
INJECT 35-40 UNITS INTO THE SKIN DAILY
INJECT 45-70 UNITS INTO THE SKIN DAILY.
10-12 UNITS AFTER BREAKFAST, LUNCH AND DINNER
VIA INSULIN PUMP AROUND 100-150 UNITS DAILY
Created two additional variables to place strings to the immediate LEFT ("LOWERNG")
& RIGHT ( "UPPRNG" ) of where "-" is found in the string.
I've identified where the "-" is in the variable with the index command. Can be any position in the variable, usually near the beginning of the string, but not always. Using
IF HYPHEN >0 THEN LOWERNG = substr(INSTRUCTIONS,HYPHEN,HYPHEN-5); - five characters to the LEFT
IF HYPHEN >0 THEN UPPRNG = substr(INSTRUCTIONS,HYPHEN,HYPHEN+5); - five characters to the RIGHT
intent is to grab five characters to the immediate right ( "UPPRNG" ) of "-" and five characters to the immediate left ("LOWERNG").
Or so I thought.
Here are some results: (HYPHEN is the position of the "-")
INSTRUCTION | HYPHEN | UPPRNG | LOWERNG |
INJECT 0-8 UNITS INTO THE SKIN 3 TIMES DAILY. | 9 | -8 UNIT | -8 U |
INJECT 10-20 UNITS INTO THE SKIN DAILY. | 10 | -20 UNI | -20 U |
INJECT 40-55 UNITS INTO THE SKIN DAILY. | 10 | -55 UNI | -55 U |
UPPERNG & LOWERNG are identical (more or less), both variables containing strings from the RIGHT of where the hyphen is. Not what I expected at all - below is what I thought I was going to receive:
INSTRUCTION | HYPHEN | LOWERNG | UPPRNG |
INJECT 0-8 UNITS INTO THE SKIN 3 TIMES DAILY. | 9 | ECT 0 | 8 UN |
INJECT 10-20 UNITS INTO THE SKIN DAILY. | 10 | ECT 10 | 20 UN |
INJECT 40-55 UNITS INTO THE SKIN DAILY. | 10 | ECT 40 | 55 UN |
I can scrub out any characters later with
COMPRESS(UPPRNG,'0123456789','k') ;
COMPRESS(LOWERNG,'0123456789','k') ;
After all the characters are scrubbed out, am going to add LOWERNG & UPPRNG, divide by two, (will need to use a put command in there somewhere) so I get an average of the upper and lower ranges. I would like to exclude the hyphen in either LOWERNG or UPPRNG results.
But I'm stuck.
Thanx.
There are probably more elegant ways to do this but here is one approach. I've kept it fairly simple so you can see what each statement does:
data have;
input @1 instructions $50.;
pos = findw(instructions, 'UNITS');
dose = substr(instructions, 1, pos - 2);
words = countw(dose);
lowerng = input(scan(dose, words -1), best12.);
upperng = input(scan(dose, words), best12.);
average = (lowerng + upperng)/2;
datalines;
INJECT 35-40 UNITS INTO THE SKIN DAILY
INJECT 45-70 UNITS INTO THE SKIN DAILY.
10-12 UNITS AFTER BREAKFAST, LUNCH AND DINNER
VIA INSULIN PUMP AROUND 100-150 UNITS DAILY
;
run;
There are probably more elegant ways to do this but here is one approach. I've kept it fairly simple so you can see what each statement does:
data have;
input @1 instructions $50.;
pos = findw(instructions, 'UNITS');
dose = substr(instructions, 1, pos - 2);
words = countw(dose);
lowerng = input(scan(dose, words -1), best12.);
upperng = input(scan(dose, words), best12.);
average = (lowerng + upperng)/2;
datalines;
INJECT 35-40 UNITS INTO THE SKIN DAILY
INJECT 45-70 UNITS INTO THE SKIN DAILY.
10-12 UNITS AFTER BREAKFAST, LUNCH AND DINNER
VIA INSULIN PUMP AROUND 100-150 UNITS DAILY
;
run;
If "UNITS" is always preceded by an expression like xxx-yyy, and if that is always the first instance of a dash in INSTRUCTIONS, then you could:
data have;
input instructions $50.;
datalines;
INJECT 35-40 UNITS INTO THE SKIN DAILY
INJECT 45-70 UNITS INTO THE SKIN DAILY.
10-12 UNITS AFTER BREAKFAST, LUNCH AND DINNER
VIA INSULIN PUMP AROUND 100-150 UNITS DAILY
run;
data want;
set have;
lowerng= input(scan(scan(instructions,1,'-'),-1,' '),best12.);
upperng= input(scan(scan(instructions,2,'-'),+1,' '),best12.);
run;
Function substr() does not work like what you made up.
Read the documentation.
The third parameter is the length. You always want 5 in your case.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.