BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Aayushi_17
Quartz | Level 8

Hi 

 

i want to separate character from numeric  value 

 

for eg :

Txt 
 
50mg 
slicing data 
100/50 mg 
mg 
unknown 
0.25 mmmgb 

 

how to get the output like this:

 

cTxt_b
  
50mg 
 slicing data 
100/50mg 
 mg 
 unknown 
0.25mmmgb
1 ACCEPTED SOLUTION

Accepted Solutions
heffo
Pyrite | Level 9

Try this:

data want;
	txt = "100/50 mg medicin";
	c = compress(txt,"0123456789.,/ ","K"); *K is to keep the chars you specified.;
	txt_b = compress(txt,"0123456789.,/");
run;

View solution in original post

5 REPLIES 5
heffo
Pyrite | Level 9

Try this:

data want;
	txt = "100/50 mg medicin";
	c = compress(txt,"0123456789.,/ ","K"); *K is to keep the chars you specified.;
	txt_b = compress(txt,"0123456789.,/");
run;
tsap
Pyrite | Level 9

If the accepted solution works to achieve your expected results, one recommendation I would make is regarding the "txt_b" line of logic.

 

As it stands the logic used to populate the "txt_b" field excludes the characters pulled in the "C" field above it, except for the blank space. This in turn causes any spaces between the numeric value and the character value to remain. So when you have "100/50 mg medicine", the blank between '50' and 'mg' remains. So to clean this up without incorrectly removing the blank between the words 'mg' and 'medicine', we can use the STRIP FUNCTION.

 

DATA WORK.Want;
	SET WORK.Have;
	c 	  = COMPRESS(Txt,"0123456789.,/ ","K"); *K is to keep the chars you specified.;
	txt_b = STRIP(COMPRESS(Txt,"0123456789.,/"));
RUN;

 

SASKiwi
PROC Star

100/50 isn't a valid numeric because of the "/". You could split it into two numerics - 100 and 50. How do you want to handle this? 

tsap
Pyrite | Level 9

Here's an alternative option:

 

Obviously not as simple as the accepted solution, but still works regardless.

DATA WORK.WANT (DROP=Re Position Length Start Stop);
	SET WORK.HAVE;
	FORMAT C $10. Txt_B $15.;
	IF _N_=1 THEN RE= PRXPARSE("m/0|1|2|3|4|5|6|7|8|9/oi");
	RETAIN Re;
	Start=1;
	Stop=LENGTH(Txt);
		 IF PRXMATCH(Re,Txt) = 0 THEN DO;
			 C		= ' ';
			 Txt_B	= Txt;
		 END;
	ELSE IF PRXMATCH(Re,Txt) > 0 THEN DO;
		CALL PRXNEXT(Re, Start, Stop, Txt, Position, Length);
		DO WHILE (Position > 0);
			 C	   = STRIP(SUBSTR(Txt, 1, Start-1));
	         Txt_B = STRIP(SUBSTR(Txt, Start, Stop));
	         CALL PRXNEXT(Re, Start, Stop, Txt, Position, Length);
		END;
	END;
RUN;
tsap
Pyrite | Level 9

The main benefit to this option, over the other solution, is that depending on any other values in the text field, some issues may arise.

 

For example:

If there is a value of "15/100 mg/d", this is something completely made up just to show what could occur with the accepted solution logic.

As it stands that logic would produce C="15/100/" Txt_B=" mgd". So there would be an incorrect slash at the end of the value in the C field and it would be removed from the Txt_B field. Even if you removed the '/' from the Txt_B logic you would still have a bad value of "/mg/d" with the incorrect slash in the front now. If you will never have starting values like this, then you will be fine with the accepted solution.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 7823 views
  • 4 likes
  • 4 in conversation