BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Aayushi_17
Quartz | Level 8

Hi 

 

i want to separate character from numeric  value 

 

for eg :

Txt 
 
50mg 
slicing data 
100/50 mg 
mg 
unknown 
0.25 mmmgb 

 

how to get the output like this:

 

cTxt_b
  
50mg 
 slicing data 
100/50mg 
 mg 
 unknown 
0.25mmmgb
1 ACCEPTED SOLUTION

Accepted Solutions
heffo
Pyrite | Level 9

Try this:

data want;
	txt = "100/50 mg medicin";
	c = compress(txt,"0123456789.,/ ","K"); *K is to keep the chars you specified.;
	txt_b = compress(txt,"0123456789.,/");
run;

View solution in original post

5 REPLIES 5
heffo
Pyrite | Level 9

Try this:

data want;
	txt = "100/50 mg medicin";
	c = compress(txt,"0123456789.,/ ","K"); *K is to keep the chars you specified.;
	txt_b = compress(txt,"0123456789.,/");
run;
tsap
Pyrite | Level 9

If the accepted solution works to achieve your expected results, one recommendation I would make is regarding the "txt_b" line of logic.

 

As it stands the logic used to populate the "txt_b" field excludes the characters pulled in the "C" field above it, except for the blank space. This in turn causes any spaces between the numeric value and the character value to remain. So when you have "100/50 mg medicine", the blank between '50' and 'mg' remains. So to clean this up without incorrectly removing the blank between the words 'mg' and 'medicine', we can use the STRIP FUNCTION.

 

DATA WORK.Want;
	SET WORK.Have;
	c 	  = COMPRESS(Txt,"0123456789.,/ ","K"); *K is to keep the chars you specified.;
	txt_b = STRIP(COMPRESS(Txt,"0123456789.,/"));
RUN;

 

SASKiwi
PROC Star

100/50 isn't a valid numeric because of the "/". You could split it into two numerics - 100 and 50. How do you want to handle this? 

tsap
Pyrite | Level 9

Here's an alternative option:

 

Obviously not as simple as the accepted solution, but still works regardless.

DATA WORK.WANT (DROP=Re Position Length Start Stop);
	SET WORK.HAVE;
	FORMAT C $10. Txt_B $15.;
	IF _N_=1 THEN RE= PRXPARSE("m/0|1|2|3|4|5|6|7|8|9/oi");
	RETAIN Re;
	Start=1;
	Stop=LENGTH(Txt);
		 IF PRXMATCH(Re,Txt) = 0 THEN DO;
			 C		= ' ';
			 Txt_B	= Txt;
		 END;
	ELSE IF PRXMATCH(Re,Txt) > 0 THEN DO;
		CALL PRXNEXT(Re, Start, Stop, Txt, Position, Length);
		DO WHILE (Position > 0);
			 C	   = STRIP(SUBSTR(Txt, 1, Start-1));
	         Txt_B = STRIP(SUBSTR(Txt, Start, Stop));
	         CALL PRXNEXT(Re, Start, Stop, Txt, Position, Length);
		END;
	END;
RUN;
tsap
Pyrite | Level 9

The main benefit to this option, over the other solution, is that depending on any other values in the text field, some issues may arise.

 

For example:

If there is a value of "15/100 mg/d", this is something completely made up just to show what could occur with the accepted solution logic.

As it stands that logic would produce C="15/100/" Txt_B=" mgd". So there would be an incorrect slash at the end of the value in the C field and it would be removed from the Txt_B field. Even if you removed the '/' from the Txt_B logic you would still have a bad value of "/mg/d" with the incorrect slash in the front now. If you will never have starting values like this, then you will be fine with the accepted solution.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 7734 views
  • 4 likes
  • 4 in conversation