BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jumboshrimps
Obsidian | Level 7

Have a bunch of records as below:

INJECT 35-40 UNITS INTO THE SKIN DAILY 

INJECT 45-70 UNITS INTO THE SKIN DAILY.

10-12 UNITS AFTER BREAKFAST, LUNCH AND DINNER

VIA INSULIN PUMP AROUND 100-150 UNITS DAILY

 

Created two additional variables  to place strings to the immediate LEFT ("LOWERNG")

& RIGHT ( "UPPRNG" ) of where "-" is found in the string. 

I've identified where the "-" is in the variable with the index command. Can be any position in the variable, usually near the beginning of the string, but not always.  Using                                                                                                               

IF HYPHEN >0 THEN LOWERNG = substr(INSTRUCTIONS,HYPHEN,HYPHEN-5); - five characters to the LEFT
IF HYPHEN >0 THEN UPPRNG = substr(INSTRUCTIONS,HYPHEN,HYPHEN+5); - five characters to the RIGHT

intent is  to grab five characters to the immediate right ( "UPPRNG" ) of "-" and  five characters to the immediate left ("LOWERNG"). 

Or so I thought.

 

Here are some results: (HYPHEN is the position of the "-")

INSTRUCTIONHYPHENUPPRNGLOWERNG
INJECT 0-8 UNITS INTO THE SKIN 3 TIMES DAILY.9-8 UNIT-8 U
INJECT 10-20 UNITS INTO THE SKIN DAILY.10-20 UNI-20 U
INJECT 40-55 UNITS INTO THE SKIN DAILY. 10-55 UNI-55 U

 

UPPERNG & LOWERNG are identical (more or less), both variables containing strings from the RIGHT of where the hyphen is. Not what I expected at all - below is what I thought I was going to receive:

 

INSTRUCTIONHYPHENLOWERNGUPPRNG
INJECT 0-8 UNITS INTO THE SKIN 3 TIMES DAILY.9ECT 08 UN
INJECT 10-20 UNITS INTO THE SKIN DAILY.10ECT 1020 UN
INJECT 40-55 UNITS INTO THE SKIN DAILY. 10ECT 4055 UN

 

I can scrub out any characters later with                                                                                             

COMPRESS(UPPRNG,'0123456789','k') ;                                                                                             

COMPRESS(LOWERNG,'0123456789','k') ;                                                                                                                

After all the characters are scrubbed out, am going to add LOWERNG & UPPRNG, divide by two, (will need to use a put command in there somewhere) so I get an average of the upper and lower ranges.  I would like to exclude the hyphen in either  LOWERNG or UPPRNG results.

                                                    

But I'm stuck.

 

Thanx.

 

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

There are probably more elegant ways to do this but here is one approach. I've kept it fairly simple so you can see what each statement does:

data have;
  input @1 instructions $50.;
  pos = findw(instructions, 'UNITS');
  dose = substr(instructions, 1, pos - 2);
  words = countw(dose);
  lowerng = input(scan(dose, words -1), best12.);
  upperng = input(scan(dose, words), best12.);
  average = (lowerng + upperng)/2;
datalines;
INJECT 35-40 UNITS INTO THE SKIN DAILY 
INJECT 45-70 UNITS INTO THE SKIN DAILY.
10-12 UNITS AFTER BREAKFAST, LUNCH AND DINNER
VIA INSULIN PUMP AROUND 100-150 UNITS DAILY
;
run;

View solution in original post

4 REPLIES 4
SASKiwi
PROC Star

There are probably more elegant ways to do this but here is one approach. I've kept it fairly simple so you can see what each statement does:

data have;
  input @1 instructions $50.;
  pos = findw(instructions, 'UNITS');
  dose = substr(instructions, 1, pos - 2);
  words = countw(dose);
  lowerng = input(scan(dose, words -1), best12.);
  upperng = input(scan(dose, words), best12.);
  average = (lowerng + upperng)/2;
datalines;
INJECT 35-40 UNITS INTO THE SKIN DAILY 
INJECT 45-70 UNITS INTO THE SKIN DAILY.
10-12 UNITS AFTER BREAKFAST, LUNCH AND DINNER
VIA INSULIN PUMP AROUND 100-150 UNITS DAILY
;
run;
Jumboshrimps
Obsidian | Level 7
That works.
Thanx.
Had a bad experience with scan function on this data and best10. - don't remember what the error was - but I should have persevered.
mkeintz
PROC Star

If "UNITS" is always preceded by an expression like  xxx-yyy, and if that is always the first instance of a dash in INSTRUCTIONS, then you could:

  1. Get the first "word" of instructions, where the only word separator is a '-':  
      i.e.    scan(instructions,1,'-')      which extracts
           INJECT 35    from INJECT 35-40 UNITS INTO THE SKIN DAILY

  2. From the results of #1, get the last word (where words are now separated by blanks).  Note the -1 below means to get the last word of the expression)
           scan(....,-1,' ')         which extracts   35 from   INJECT 35

  3. The output of #2, when converted from text to numeric, produces the lower range
          lowerng=input(...,best12.)

  4. Do the analogous operation to get the upper range.

 

 

data have;
  input instructions $50.;
datalines;
INJECT 35-40 UNITS INTO THE SKIN DAILY 
INJECT 45-70 UNITS INTO THE SKIN DAILY.
10-12 UNITS AFTER BREAKFAST, LUNCH AND DINNER
VIA INSULIN PUMP AROUND 100-150 UNITS DAILY
run;
data want;
  set have;
  lowerng= input(scan(scan(instructions,1,'-'),-1,' '),best12.);
  upperng= input(scan(scan(instructions,2,'-'),+1,' '),best12.);
run;

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ChrisNZ
Tourmaline | Level 20

Function substr() does not work like what you made up.

Read the documentation.

The third parameter is the length. You always want 5 in your case.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 3975 views
  • 0 likes
  • 4 in conversation