Cruise
Ammonite | Level 13

I'm trying to flag 'codes' into these broad categories:

 

- all characters

- all numeric

- any alphanumeric

 

I find categorizing codes by generic conditions like these harder than matching the specific conditions shown further below.

 

Any help please?  What am I doing wrong? Thanks in advance.

 


data have;
input codes $; 
datalines;
0D7Q8ZZ
XHRPXL2
0090T99
0090THJ
123456788
23456
234
0090T
0987F
HYDHDJH
;
/* Help getting the code below to work would be appreciated */
data x;
  set have;
  if prxMatch("/^D+\s*$/o",codes) then flag = "all character";
  if prxMatch("/^[a-z]+\s*$/o",codes) then flag = "all character";
  if prxMatch("/^[a-z]*\s*$/o",codes) then flag = "all character";
  if prxMatch("/^w*\s*$/o",codes) then flag = "any alphanumeric";
  if prxMatch("/^d+\s*$/o",codes) then flag = "all_numeric";
run;

proc freq data=x;
  tables flag;
run;

/* This code worked fine */

if prxMatch("/^\d{5}.{2}\s*$/o",codes) then flag="CPT1";
else if prxMatch("/^\d{4}F.{2}\s*$/o",codes) then flag = "CPT2";
else if prxMatch("/^\d{4}T.{2}\s*$/o",codes) then flag = "CPT3";

else if prxMatch("/^V\d+\s*$/o",codes) then flag="VCODE";
else if prxMatch("/^E\d+\s*$/o",codes) then flag="ECODE";
else if prxMatch("/^\d{1}\s*$/o",codes) or
        prxMatch("/^\d{2}\s*$/o",codes) or
        prxMatch("/^\d{3}\s*$/o",codes) or  
        prxMatch("/^\d{4}\s*$/o",codes) then flag = "ICD9";

 

Accepted Solutions
kiranv_
Rhodochrosite | Level 12

Something like the code below should work:

data have;
input codes $; 
datalines;
0D7Q8ZZ
XHRPXL2
0090T99
0090THJ
123456788
23456
234
0090T
0987F
HYDHDJH
;

data x; set have;
length flag $30.;
if prxMatch("/^[a-z]+$/i",trim(codes)) then flag = "all character"; 
 else if prxMatch("/^[0-9]+$/o",trim(codes)) then flag = "all numeric"; 
 else if prxMatch("/^[a-z0-9]+$/i",trim(codes)) then flag = "any alphanumeric"; 
run;
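
For context on why the original attempt fell short, a few things stand out. `D`, `w`, and `d` without a backslash match those literal letters rather than the `\D`, `\w`, and `\d` classes. `[a-z]` without the `i` modifier misses the uppercase codes. The independent IF statements (no ELSE) let a later match overwrite an earlier one. And without a LENGTH statement, `flag` takes its length from the first assignment ("all character", 13 characters), so "any alphanumeric" would be truncated. The following is only a sketch of the same classification that keeps the `\s*$` style from the question instead of calling TRIM; the data set name `x2` is made up for illustration.

```sas
/* Sketch: classify codes, letting \s*$ absorb the trailing blanks */
data x2;                        /* x2 is a hypothetical output name */
  set have;
  length flag $30;              /* long enough for the longest label */
  if      prxmatch("/^[a-z]+\s*$/io",   codes) then flag = "all character";
  else if prxmatch("/^\d+\s*$/o",       codes) then flag = "all numeric";
  else if prxmatch("/^[a-z\d]+\s*$/io", codes) then flag = "any alphanumeric";
run;

proc freq data=x2;
  tables flag;
run;
```

The order of the tests matters: the alphanumeric pattern also matches pure-letter and pure-digit codes, so it has to come last in the IF/ELSE chain.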

Cruise
Ammonite | Level 13
Btw, why is the trim() function needed here? I'm just trying to get a full grasp of what you're doing. Thanks.
kiranv_
Rhodochrosite | Level 12

Just to remove the trailing blanks (TRIM removes only trailing blanks; STRIP removes both leading and trailing blanks).
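
A small sketch of why the blanks matter, assuming an 8-character variable like the one created by `input codes $;`. The variable names `padded` and `trimmed` are made up for illustration.

```sas
data _null_;
  length codes $8;
  codes = "234";    /* stored as "234" followed by five blanks */
  /* Without TRIM, the blanks sit between the digits and the $ anchor */
  padded  = prxmatch("/^[0-9]+$/", codes);
  /* With TRIM, the string passed to the regex ends right after the digits */
  trimmed = prxmatch("/^[0-9]+$/", trim(codes));
  put padded= trimmed=;   /* writes padded=0 trimmed=1 to the log */
run;
```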

acordes
Rhodochrosite | Level 12

Thanks @kiranv_.

Do you or @Rick_SAS know whether I can vectorize this with IML?

It works when nested in a loop, but not otherwise.

PROC IML;
USE HAVE;
READ ALL VAR _ALL_ INTO X [COLNAME=VARNAMES];
CLOSE;

/*doesn't work*/
FLAG= prxMatch("m/^[a-z]+$/imx",trim(X[,"codes"]));
PRINT X FLAG;


/*works*/
FLAG=J(NROW(X),1,.);

DO I=1 TO NROW(X);
FLAG[I] = prxMatch("m/^[a-z]+$/si",trim(X[I,"codes"]));
END;

PRINT X FLAG;

 

Rick_SAS
SAS Super FREQ

In IML, all character vectors are a fixed length, which is the length of the longest element. Thus when you execute

T = trim(X);

the TRIM function removes trailing blanks from each element of X, but the assignment operator essentially pads shorter elements with blanks so that every element of T has the same number of characters.

 

In other words, for vectors, the TRIM function is not doing what you think it is, but it does for scalars.
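
To see that padding directly, here is a minimal sketch with a made-up two-element vector. It assumes the IML functions NLENG (storage length shared by all elements) and LENGTH (per-element length without trailing blanks) behave as documented; the names `storage` and `visible` are illustrative only.

```sas
proc iml;
  x = {"ab", "abcdef"};   /* hypothetical 2 x 1 character vector */
  t = trim(x);            /* elements are re-padded on assignment */
  storage = nleng(t);     /* storage length shared by every element */
  visible = length(t);    /* per-element length, ignoring trailing blanks */
  print storage visible;
quit;
```

If the padding works as described above, `storage` equals the length of the longest element, even though the first element has only two visible characters.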

 

Anyway, you can tell the regular expression to ignore white space (or at least ignore it at the end of the string), which is probably what I'd suggest:

FLAG= prxMatch("m/^[a-z]+\s*$/imx", X);
PRINT FLAG X;

Another option would be to replace the regular expression with the ANY* functions in SAS. The general idea is

hasDigit = anydigit(X);
hasPunct = anypunct(X);
hasAlpha = anyalpha(X);

allChar = hasAlpha & ^hasDigit;
allNum = hasDigit & ^hasAlpha;
PRINT allChar allNum X;
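
The same idea can be sketched in a DATA step for the three categories from the original question. This is one possible arrangement rather than a definitive one: it adds a NOTALNUM check so that codes containing punctuation or embedded blanks are left unflagged, and the data set name `x3` is made up for illustration.

```sas
data x3;                               /* x3 is a hypothetical output name */
  set have;
  length flag $30;
  hasAlpha = (anyalpha(codes) > 0);    /* 1 if any letter is present */
  hasDigit = (anydigit(codes) > 0);    /* 1 if any digit is present */
  /* NOTALNUM returns 0 when every character is a letter or a digit; */
  /* TRIM keeps the trailing blanks from counting as non-alphanumeric */
  if notalnum(trim(codes)) = 0 then do;
    if      not hasDigit then flag = "all character";
    else if not hasAlpha then flag = "all numeric";
    else                      flag = "any alphanumeric";
  end;
run;
```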
