GreenTriangle
Calcite | Level 5

Hello,

I'm trying to search for multiple alphanumeric strings in a variable. There are two types of searches I want to do: 1. Search for only the specific string (e.g. E45 only, not any string that contains E45) and 2. search for any string that starts with a specified string of characters (e.g. any string that begins with E45). Once I identify any of these alphanumeric strings I want to flag it as 1 and if the strings are not present, I want the variable to be 0.

 

I can't just copy and paste the data here for privacy reasons but it looks like this:

 

data test;
input alphanum $ ;
datalines;
G85.0;Z58.6;;
;;
Z00.125;F80.14;F82.0;
;;;E66.09;H54.0;E66.9;E66.01;
Z68.55
F20.96
F32O
F331;Z68.55;Z67.56;Z74.89;
run;

 

I wrote code like this for each of the flags:

if prxmatch('/^H5G|\bH5X.66\b|\bH5F.1G\b|^H5D\b/', icd10)>0 then test_flag=1;
else test_flag=0;

 

If there is a ^ then it should pull in any strings that begin with those letters and numbers. If the string is bordered by \b, then it should only flag it if that exact string appears.

 

Some of the flags worked, but others are flagging completely unrelated strings that they should not.

 

I did get this as a warning on some of the flags but am not sure how to fix it:

NOTE: The quoted string currently being processed has become more than 262 characters long. You
might have unbalanced quotation marks.

 

Could this cause it? Is prxmatch just not meant to work with this kind of data, and if so, is there another way to search for multiple alphanumeric strings?

3 REPLIES
ballardw
Super User

You use searching for E45 as your example, but your sample data does not include any values containing E45. So what exactly are you searching for? Please do not make us guess.

You should include at least one example of each type of search and show the result.

Any time you have a question about an error or warning message, best practice is to copy from the log the entire procedure or data step code generating the message, along with all the notes, warnings, messages, and errors. Then, on the forum, open a text box using the </> icon above the message window and paste all of that text. The text box will preserve the formatting of many of the diagnostic messages SAS provides.

 

 

Your particular warning is about 90% of the time caused by either a missing quote or mismatched quote (one single quote and one double) somewhere. Which is why you should provide the entire code as the message may not appear until several lines after the problem starts.
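If the quotes do turn out to be balanced, the NOTE is only informational: SAS issues it for any quoted string longer than 262 characters. A minimal sketch of two ways to avoid it follows, assuming a hypothetical input data set TEST with the variable alphanum and reusing two of the prefix terms from the posted pattern. The pattern is assembled from short literals with CATS, and the NOQUOTELENMAX system option silences the NOTE.

```sas
options noquotelenmax;   /* silences the length NOTE; does not fix unbalanced quotes */

data want;
    set test;            /* hypothetical data set holding the variable alphanum */
    retain rx;
    if _n_ = 1 then
        /* pattern built from short pieces instead of one long quoted literal */
        rx = prxparse(cats('/', '^H5G', '|', '^H5D', '/'));
    test_flag = prxmatch(rx, strip(alphanum)) > 0;   /* 1 if matched, else 0 */
    drop rx;
run;

options quotelenmax;     /* restore the default behavior */
```

Neither change affects which rows match; both only deal with the message in the log.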

 

It appears that you are searching in the middle, so 'any string starts with' is apparently a misdirection unless you define what starts a 'string'.
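To make that concrete, here is a rough sketch under the assumption that each value is a semicolon-delimited list and that a 'string' means one item in that list. In the posted pattern, ^ anchors only at the start of the whole value, an unescaped . matches any character, and \b also matches next to a dot, so \bE45\b would match inside E45.1. The patterns below use (^|;) and (;|$) as item boundaries instead. The code E66.9 and the prefix E66 are taken from the sample rows purely for illustration, and the flag names are made up.

```sas
data test;
    infile datalines truncover;
    input alphanum $60.;
    /* exact: E66.9 as a whole list item, with the dot escaped */
    exact_flag  = prxmatch('/(^|;)E66\.9(;|$)/', strip(alphanum)) > 0;
    /* prefix: any list item that begins with E66 */
    prefix_flag = prxmatch('/(^|;)E66/', strip(alphanum)) > 0;
datalines4;
G85.0;Z58.6;;
;;;E66.09;H54.0;E66.9;E66.01;
Z68.55
;;;;
run;
```

DATALINES4 is used because the data lines themselves contain semicolons, and STRIP removes the trailing blanks that would otherwise keep $ from matching at the end of the last code.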

Oligolas
Barite | Level 11

...and... you're not even searching for E45 in your regex (!) ...and... your dataset does not work

F44.0? 😶

 

________________________

- Cheers -

PaigeMiller
Diamond | Level 26

Of course, you don't need to use PRXMATCH

 

Search 1

 

if string = 'E45' then ... ;

 

Search 2

 

if string =: 'E45' then ... ;

 

Both of the searches you describe can be collapsed into the code I provide for Search 2, so Search 1 is not needed.
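The comparisons above assume that string holds a single code. If the variable holds a semicolon-delimited list, as in the sample data, one possible sketch is to loop over the items with COUNTW and SCAN and apply the same comparisons to each. The data set name TEST, the variable alphanum, and the flag names are assumptions for illustration.

```sas
data want;
    set test;                       /* assumes TEST holds the variable alphanum */
    length code $20;
    exact_flag  = 0;
    prefix_flag = 0;
    /* walk through each semicolon-delimited item; empty items are skipped */
    do i = 1 to countw(alphanum, ';');
        code = scan(alphanum, i, ';');
        if code =  'E45' then exact_flag  = 1;   /* the code is exactly E45 */
        if code =: 'E45' then prefix_flag = 1;   /* the code begins with E45 */
    end;
    drop i code;
run;
```

With the default modifiers, COUNTW and SCAN treat consecutive semicolons as one delimiter, so rows such as ;; simply leave both flags at 0.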

--
Paige Miller
