BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rajdeep
Pyrite | Level 9

Hi Team,

 

I need a pattern of 1 character and 11 numeric values, but the challenge is the 1st character should be the first letter of name column postfix with 11 numeric values and sex='M'.

data test4;
input Name $ Sex $ ID $12.;
datalines;
Alice M A10294354554
Barbara M C02943545549
Carol F C02943545543
Henry M H02943545540
James M K02943545545
;
run;

In the above sample I need to validate the ID column with a flag value 'Valid' or 'Invalid'.

If the sex='M'  and ID value contains the first character of Name column and 11 numeric values then it's a valid ID else it's an invalid ID.

I tried to take the first character of Name column into a macro variable and tried to write the program.

%macro test;
data want;
set test4;
%if Sex='M' %then %do;
proc sql;
select strip(substr(name,1,1)) into :First_Char from test4;
quit;
%end;
match=prxmatch("/^&First_Char\d{11}$/",ID);
%if match>1 %then %do;
flag='Invalid';
%end;
%else %do;
flag='Valid';
%end;
run; %mend; %test;

I am not getting any match, though 'Alice' and 'Henry' ID are valid, both IDs contains first character of name and 11 numbers beside sex='M' for both of them.

 

Please help me correct the logic.

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

@rajdeep wrote:

hi  FreelanceReinhard,

 

Thanks a lot for the quick reply.

 

I am just wondering what if the ID value length is not standard. Means, suppose one of the ID values has13 or 15 digit characters, but other values has 12 characters. Like below.

 

input Name $ Sex $ ID $13.;
datalines;
Alice M A102943545547
Barbara M C02943545549
Carol F C02943545543
Henry M H02943545540
James M K02943545545
;

In the above sample you can see Alice has 13 char ID value, but Henry is having 12 char value with a trailing space, so in that case Henry's ID value is valid. will your code work in that case!

 

Thanks.

 

With the amended structure of the input dataset (defined length of ID is now 13) the condition length(id)=12 is no longer redundant and trailing blanks must be considered in the third argument of the FINDC function:

data want;
set test5;
flag=(sex='M' & id=:first(name) & length(id)=12 & ~findc(id,,'kdt',2));
run;

View solution in original post

7 REPLIES 7
FreelanceReinh
Jade | Level 19

Hi @rajdeep,

 

There's no need for macro coding. Even regular expressions can be avoided:

data test5;
input Name $ Sex $ ID $12.;
datalines;
Alice M A10294354554
Barbara M C02943545549
Carol F C02943545543
Henry M H02943545540
James M K02943545545
Robert M R029435A5540
Timothy M T0294355540
;

data want;
set test5;
flag=(sex='M' & id=:first(name) & ~findc(id,,'kd',2));
run;

[Edit: Removed the redundant criterion length(id)=12.]

I leave it to you to create character values 'Valid' and 'Invalid' from the 1s and 0s produced by the code above if needed. Personally, I prefer these 0-1 flags because they are easier to handle and character labels can be assigned with a simple format:

proc format;
value valflag
0='Invalid'
1='Valid';
run;

proc print data=want;
format flag valflag.;
run;

 

rajdeep
Pyrite | Level 9

hi  FreelanceReinhard,

 

Thanks a lot for the quick reply.

 

I am just wondering what if the ID value length is not standard. Means, suppose one of the ID values has13 or 15 digit characters, but other values has 12 characters. Like below.

 

input Name $ Sex $ ID $13.;
datalines;
Alice M A102943545547
Barbara M C02943545549
Carol F C02943545543
Henry M H02943545540
James M K02943545545
;

In the above sample you can see Alice has 13 char ID value, but Henry is having 12 char value with a trailing space, so in that case Henry's ID value is valid. will your code work in that case!

 

Thanks.

 
FreelanceReinh
Jade | Level 19

@rajdeep wrote:

hi  FreelanceReinhard,

 

Thanks a lot for the quick reply.

 

I am just wondering what if the ID value length is not standard. Means, suppose one of the ID values has13 or 15 digit characters, but other values has 12 characters. Like below.

 

input Name $ Sex $ ID $13.;
datalines;
Alice M A102943545547
Barbara M C02943545549
Carol F C02943545543
Henry M H02943545540
James M K02943545545
;

In the above sample you can see Alice has 13 char ID value, but Henry is having 12 char value with a trailing space, so in that case Henry's ID value is valid. will your code work in that case!

 

Thanks.

 

With the amended structure of the input dataset (defined length of ID is now 13) the condition length(id)=12 is no longer redundant and trailing blanks must be considered in the third argument of the FINDC function:

data want;
set test5;
flag=(sex='M' & id=:first(name) & length(id)=12 & ~findc(id,,'kdt',2));
run;
rajdeep
Pyrite | Level 9

Wow, this is something great. 

 

Thanks a lot.

r_behata
Barite | Level 11
data want;
	set test4;
	if Sex='M' then	flag=ifc(ifc(first(Name)eq first(ID)   ,1,0) eq 1 and prxmatch("m/[a-zA-Z]\d{11}/oi",ID),'Valid','Invalid');
run;
                                          Obs     Name      Sex         ID          flag

                                           1     Alice       M     A10294354554    Valid  
                                           2     Barbara     M     C02943545549    Invalid
                                           3     Carol       F     C02943545543           
                                           4     Henry       M     H02943545540    Valid  
                                           5     James       M     K02943545545    Invalid
rajdeep
Pyrite | Level 9

Hi Team,

 

I am want to output specific records based on condition, but it seems it's not happening for some reason.

 


data test;
input CCI_STATE $2. CCI_ID_NUMBER $14. CCI_ID_TYPE $21.;
datalines;
CA	CAID C6307718	DRIVERS LIC/STATE ID
CA	N31790267	    DRIVERS LIC/STATE ID
CA	A6118782	    DRIVERS LIC/STATE ID
CA	B3765104	    DRIVERS LIC/STATE ID
CA	Y5652061	    DRIVERS LIC/STATE ID
;
run;

data valid invalid;
set test;
if CCI_STATE='CA' and CCI_ID_Type='DRIVERS LIC/STATE ID' and prxmatch("/^(?<!\w)\w{1}\d{7}(?!\d)$/",compress(CCI_ID_Number, ,'s'))=0  then output invalid;
else  output valid;
run;

so, basically out of 5 records I would like to output the (1 alpha+7numeric) ID numbers to valid dataset and remaining 2 records to invalid dataset.

 

While running the code it's only outputting the 'N31790267' ID to invalid, not the 'CAID C6307718' ID.

 

if I am checking the prxmatch result for the particular 'CAID C6307718' ID it's giving 0 only, but it's not outputting the same record to invalid dataset based on the if condition.

 

I would request please suggest something with regular expression facility.

 

Please help.

 

Thanks

FreelanceReinh
Jade | Level 19

Hi @rajdeep,

 

I would prefer a logic that requires certain criteria for a classification as "valid" and rejects everything else as "invalid."

 

In your example:

if ... and prxmatch('/^[A-Z]\d{7}$/',trim(CCI_ID_Number)) then output valid;
else output invalid;

To allow also lowercase letters (as in "a1234567") insert the i modifier after the closing slash of the regular expression. Of course, the IF conditions regarding CCI_STATE and CCI_ID_TYPE would need to be adapted if, e.g., data from other states were to be classified.

 

In the future, please open a new thread for a new question. This improves the chances of getting a quick answer because only very few people will notice a late post in an old thread.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 668 views
  • 1 like
  • 3 in conversation