SASPhile
Quartz | Level 8
10MG/2ML
5MG/ML
12MG KIT
0.4MG
24MG KIT
5MG
0.6MG
10MG/2ML
5MGFLEX
10MG

For the above data I used the following regular expression:

if _N_ = 1 then RE = PRXPARSE ("/ \d{1,5}\.?\d{0,4}\?mg/i");
retain RE;
call PRXSUBSTR(RE,Drug_Strength_Name,START,LENGTH);
if START GT 0 then do;
str = SUBSTRN(Drug_Strength_Name,START+1 ,LENGTH-1 );
output;
end;
run;


It doesn't return any values at all.
5 REPLIES
sbb
Lapis Lazuli | Level 10
I suggest adding some SAS diagnostic statements (increment "nn" in each one so that log lines are easy to correlate with the code):

PUTLOG '>DIAGnn ' _all_;

Hopefully this additional info will help you diagnose the problem systematically.
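
As a minimal sketch of that suggestion (not sbb's own code), the step below adds two numbered PUTLOG statements around the original pattern. The inline datalines, the DIAG01/DIAG02 labels, and the choice of variables to print are assumptions made for illustration; _all_ could be used in their place.

[pre]
data _null_;
retain re;
infile datalines;
input Drug_Strength_Name $ 1-9;

if _N_ = 1 then do;
/* original pattern from the question, unchanged */
re = prxparse("/ \d{1,5}\.?\d{0,4}\?mg/i");
/* re should be a positive number if the pattern compiled */
putlog '>DIAG01 ' re=;
end;

call prxsubstr(re, Drug_Strength_Name, start, length);
/* one line per observation: start=0 means no match was found */
putlog '>DIAG02 ' _N_= Drug_Strength_Name= start= length=;
datalines;
10MG/2ML
0.4MG
5MGFLEX
;;;;
[/pre]

If START stays at 0 for every row in the log, the pattern is the likely culprit rather than the SUBSTRN call that follows it.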

Scott Barry
SBBWorks, Inc.
Tim_SAS
Barite | Level 11
This program works for the data set you posted:
[pre]
data _null_;
retain re;
infile datalines;
input Drug_Strength_Name $ 1-9;

if _N_ = 1 then
re = prxparse("/\d+(\.\d+)?MG((\/\dML)| KIT|FLEX)?/i");
call prxsubstr(re,Drug_Strength_Name, start, length);
if start gt 0 then do;
str = substrn(Drug_Strength_Name, start, length);
put str;
end;
datalines;
10MG/2ML
5MG/ML
12MG KIT
0.4MG
24MG KIT
5MG
0.6MG
10MG/2ML
5MGFLEX
10MG
;;;;
[/pre]
Log:
[pre]
704 data _null_;
705 retain re;
706 infile datalines;
707 input Drug_Strength_Name $ 1-9;
708
709 if _N_ = 1 then
710 re = prxparse("/\d+(\.\d+)?MG((\/\dML)| KIT|FLEX)?/i");
711 call prxsubstr(re,Drug_Strength_Name, start, length);
712 if start gt 0 then do;
713 str = substrn(Drug_Strength_Name, start, length);
714 put str;
715 end;
716 datalines;

10MG/2ML
5MG
12MG KIT
0.4MG
24MG KIT
5MG
0.6MG
10MG/2ML
5MGFLEX
10MG
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


727 ;;;;
[/pre]
SushilNayak
Obsidian | Level 7
Hi SASPhile,
Tim@SAS gave the perfect regular expression.
I came up with re = PRXPARSE ("/[0-9]+(\.[0-9]*)?MG(\/[0-9]*ML| KIT|FLEX)?/i");
and then realized it's similar to what Tim@SAS posted 😞 .

Just to add more on the logic behind the regular expression: whenever you are trying to create a regular expression, try to break the pattern down into parts. For the data you provided:
1) Integer part, hence [0-9]+ <--- I come from a Perl background and am used to [0-9] instead of \d, so don't mind that.
2) Decimal part (may or may not be present, so it is optional), hence (\.[0-9]*)?
3) MG always comes right after the integer or decimal part, hence MG
4) An optional suffix: a slash, an optional integer and ML; or KIT with a leading space; or FLEX. Any of the three cases may occur, or none at all (e.g. 10MG), so the whole group is optional and the regular expression for this part is ----> (\/[0-9]*ML| KIT|FLEX)?
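
The four parts above can be put together piece by piece, which makes each one easy to change on its own. This is only a sketch: the variable names int_part, dec_part, unit_part and suffix are made up for illustration, and CATS is used here simply to glue the pieces together without padding.

[pre]
data _null_;
retain re;
infile datalines;
input Drug_Strength_Name $ 1-9;

if _N_ = 1 then do;
int_part = '[0-9]+'; /* 1) integer part */
dec_part = '(\.[0-9]*)?'; /* 2) optional decimal part */
unit_part = 'MG'; /* 3) MG is always present */
suffix = '(\/[0-9]*ML| KIT|FLEX)?'; /* 4) optional suffix */
/* the blank before KIT sits inside the string, so CATS keeps it */
re = prxparse(cats('/', int_part, dec_part, unit_part, suffix, '/i'));
end;

call prxsubstr(re, Drug_Strength_Name, start, length);
if start gt 0 then do;
str = substrn(Drug_Strength_Name, start, length);
put str;
end;
datalines;
10MG/2ML
5MG/ML
12MG KIT
0.4MG
5MGFLEX
10MG
;;;;
[/pre]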

Now, going back to the original regular expression that you coded:
RE = PRXPARSE ("/ \d{1,5}\.?\d{0,4}\?mg/i"). For the datalines written by Tim@SAS, your regular expression would not work. The reasons are:
1) PRXPARSE ("/ \d ... if you look closely, the regular expression requires a space before the integer. Maybe your data has that space; I'm not sure, which is why I'm pointing it out.
2) \d{0,4}\? matches 0.4?MG and not 0.4MG, because \? means a literal question mark. It should have been \d{0,4}? (or simply \d{0,4}, since {0,4} already makes those digits optional).
Even after fixing these two things, you would only match integer + decimal + MG (e.g. 10.4MG, 10MG and 0.4MG); the KIT, /2ML, /ML etc. suffixes would still not be matched. For that, check out the regular expression posted by me or Tim@SAS.
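
To see both points in the log, here is a rough comparison of the original pattern with a version that drops the leading space and the \? part. The names re_orig and re_fixed are made up for this sketch, and the last data line (with a leading blank and a literal question mark) is a hypothetical value added only to show what the original pattern does accept.

[pre]
data _null_;
length value $ 9;
retain re_orig re_fixed;
infile datalines;
/* $CHAR9. keeps a leading blank instead of left-aligning the value */
input value $char9.;

if _N_ = 1 then do;
re_orig = prxparse("/ \d{1,5}\.?\d{0,4}\?mg/i"); /* as posted */
re_fixed = prxparse("/\d{1,5}\.?\d{0,4}mg/i"); /* no space, no \? */
end;

/* PRXMATCH returns the position of the match, or 0 if there is none */
pos_orig = prxmatch(re_orig, value);
pos_fixed = prxmatch(re_fixed, value);
putlog value= pos_orig= pos_fixed=;
datalines;
10MG/2ML
12MG KIT
0.4MG
5MGFLEX
 0.4?MG
;;;;
[/pre]

The original pattern should report 0 for the real strengths and match only the artificial last line, while the adjusted one should find the number + MG part of the others but still stop before /2ML, KIT or FLEX.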

Enjoy understanding the patterns! 🙂
SASPhile
Quartz | Level 8
Thanks Guys!
Tim_SAS
Barite | Level 11
Actually I missed one: 5MG/ML. The RE needs to accept a slash followed by 0 or more digits followed by ML: "\/\d*ML". Here's the corrected version:

[pre]
data _null_;
retain re;
infile datalines;
input Drug_Strength_Name $ 1-9;

if _N_ = 1 then
re = prxparse("/\d+(\.\d+)?MG((\/\d*ML)| KIT|FLEX)?/i");
call prxsubstr(re,Drug_Strength_Name, start, length);
if start gt 0 then do;
str = substrn(Drug_Strength_Name, start, length);
put str;
end;
datalines;
10MG/2ML
5MG/ML
12MG KIT
0.4MG
24MG KIT
5MG
0.6MG
10MG/2ML
5MGFLEX
10MG
;;;;
[/pre]
