
How to flag variables based on string

Frequent Contributor
Posts: 88

Hi Everyone, 

 

New to SAS

 

I am trying to flag all values of a variable that contain the string "comprehensive assessment". 

 

This is part of my code:

 

ca_only = 0;
			if prxmatch ("m/Comprehensive Assessment|comprehensive ax|HW - Comprehensive Ax|FA - Comprehensive Ax|FA - Comprehensive Ax TM|HW - Comprehensive Ax TM|FA - Reassessment|HW - Reassessment|Reassessment|Dominance|Comp Ax Telemedicine|Clinic Comprehensive Assessment|Comprehensive Ax - Telemedicine|Comprehensive Ax - TM/oi", description)> 0 then
			if prxmatch("m/Comprehensive Assessment - OT\/PT|Dominance Transfer Training|conc|psych|LIBYA - AX - COMPREHENSIVE ASSESSMENT|LIBYA - AX - COMPREHENSIVE ASSESSMENT OT\/PT|/oi", description) = 0 then ca_only = 1;

However, my code now seems to be too long, and I am getting an error message saying I have exceeded 262 characters.

 

How else can I find all values that CONTAIN "comprehensive assessment" as well as the other variations outlined in my code above (e.g., "comprehensive ax", "comprehensive ax-tm", "comp ax")?

 

Thanks in advance!

Super User
Posts: 19,815

Re: How to flag variables based on string

Posted in reply to christinagting0

Your current code is 

 

If <CONDITION1> THEN 

    if <CONDITION2> then ca_only=1;

 

This isn't correct SAS syntax. 

 

You may want:

 

If <CONDITION1>  AND  <CONDITION2> then ca_only=1;

Super User
Posts: 5,509

Re: How to flag variables based on string

Posted in reply to christinagting0

On a side note, SAS does permit IF/THEN/IF.  However, nobody uses it because it doesn't save you anything.  (All it really does is make the ELSE statement more difficult to interpret.)  For example, here's a test program:

 

data _null_;
do i=1 to 100000000;
   if i > 0 and 5=4 then x=2;
end;
run;

data _null_;
do i=1 to 100000000;
   if 5=4 and i > 0 then x=2;
end;
run;

data _null_;
do i=1 to 100000000;
   if i > 0 then if 5=4 then x=2;
end;
run;

 

The middle step runs faster because 5=4 is always false.  The software is smart enough to figure out that it doesn't need to check the second condition when the first condition is false.

 

When it comes to your application, why do you need to check all these combinations?  If you find "Comprehensive Assessment", isn't that enough, so that you don't need to check for the variations that would follow?  Are there any strings that contain "Comprehensive Assessment" plus other characters for which it would be incorrect to set CA_ONLY to 1?  I realize there are other strings that need to be checked (upper vs. lower case, "ax" vs. "Assessment"), but why do they all need to be checked?

 

Finally, strings that become lengthy should not generate an error.  They should at most generate a warning because the software suspects you have done something wrong.  If you know you haven't done anything wrong, you can usually ignore this particular warning.  (Very bad advice in general, but applicable in this case.)
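
As a rough sketch of two ways around that message, assuming the quotes really are balanced: the NOQUOTELENMAX system option suppresses the warning, and a long pattern can also be assembled from shorter literals so that no single quoted string exceeds 262 characters. The data set names HAVE and WANT, the sample rows, and the shortened pattern are placeholders for illustration, not the actual data or the full rule.

```sas
/* Hypothetical sample data for illustration only */
data have;
   infile datalines truncover;
   input description $60.;
   datalines;
HW - Comprehensive Ax TM
Physio follow-up
;
run;

/* Option 1: suppress the "more than 262 characters" warning */
options noquotelenmax;

/* Option 2: build the pattern from shorter literals, compiled once */
data want;
   set have;
   retain re;
   if _n_ = 1 then re = prxparse(
      '/Comprehensive Assessment|Comprehensive Ax|Comp Ax' ||
      '|Reassessment|Dominance/i');
   ca_flag = (prxmatch(re, description) > 0);   /* 1 if any alternative is found */
   drop re;
run;

options quotelenmax;   /* restore the default */
```

Either option should be enough on its own; compiling the pattern once with PRXPARSE also keeps the PRXMATCH call short and readable.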

 

 

Respected Advisor
Posts: 4,173

Re: How to flag variables based on string

Posted in reply to Astounding

@Reeza and @Astounding

Actually, I seem to remember that in some very outdated SAS publication (for SAS V6, I believe) the IF...THEN...IF...THEN construct was documented as a way to improve performance. This was at a time when SAS always evaluated the full expression in an AND construct, even if the first part had already resolved to False.

 

Super User
Posts: 5,509

Re: How to flag variables based on string

Patrick,

 

That goes back a long way, but I think you're correct.  I think it was closer to the end of the version 6 releases (once initial bugs were worked out, efficiency became more important).  But I would never swear to it.

Frequent Contributor
Posts: 88

Re: How to flag variables based on string

Posted in reply to Astounding

Thanks for your response!

 

I need to flag all values of the variable DESCRIPTION that contain "comprehensive assessment" BUT ALSO other variations of it, like "comprehensive ax", "comp ax", etc.

 

Just including "Comprehensive Assessment" in my code doesn't capture the other variations. I need to do this because this is how we set up our flagging rule: the values all mean the same thing but are written differently, which is why we are trying to flag them all the same way.

 

Is there a better way for me to accomplish this?

 

The actual error that I am getting is the following:

 

The quoted string currently being processed has become more than 262 characters long.
         You might have unbalanced quotation marks.

So I don't think it's actually processing the code that I posted above, which means that I can't ignore it.

Super User
Posts: 5,509

Re: How to flag variables based on string

Posted in reply to christinagting0

Well, I don't think I can help with PRXMATCH.  But you may not need it.  Consider:

 

test_string = upcase(description);
if index(test_string, 'COMPREHENSIVE ASSESSMENT')
   or index(test_string, 'COMPREHENSIVE AX')
   or index(test_string, 'COMP AX')
then ca_flag=1;

 

I'm not sure if I covered every possible case here, but this might be enough.  My real point is that you may not need to look for every possible variation of text that might occur.  Just locating certain key words may be enough.

Frequent Contributor
Posts: 88

Re: How to flag variables based on string

Posted in reply to Astounding

Oh I see what you mean...

 

So searching for just "COMPREHENSIVE" should capture all variations that include that string? For example, it should also capture "COMPREHENSIVE AX", "COMPREHENSIVE ASSESSMENT" and "COMPREHENSIVE ASSESS"?

 

I might have tried that before, but I don't think it captured all variations (at least not with prxmatch), which is why I had to spell out all the variations one by one.

 

Thanks for trying to help though!

Super User
Posts: 5,509

Re: How to flag variables based on string

Posted in reply to christinagting0

Yes, just COMPREHENSIVE would capture all of those strings.  If it didn't last time, it might be the result of different capitalization.  But just COMPREHENSIVE might also capture other strings that you don't want as well.

Respected Advisor
Posts: 4,173

Re: How to flag variables based on string

Posted in reply to christinagting0

The FIND() function with the 'i' modifier for case-insensitive search might already suffice.
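
A minimal sketch of that idea, assuming the three key phrases mentioned earlier in the thread are the ones that matter. The data set names HAVE and WANT and the sample rows are made up for illustration.

```sas
/* Hypothetical sample data for illustration only */
data have;
   infile datalines truncover;
   input description $60.;
   datalines;
Clinic Comprehensive Assessment
comp ax telemedicine
Physio follow-up
;
run;

data want;
   set have;
   /* The 'i' modifier makes FIND ignore case, so no UPCASE() is needed */
   ca_flag = (find(description, 'comprehensive assessment', 'i') > 0
           or find(description, 'comprehensive ax', 'i') > 0
           or find(description, 'comp ax', 'i') > 0);
run;
```

Each search string stays short, so the quoted-string length message should not come up with this approach.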

 

You can certainly use a RegEx that matches all the strings; you just need to know what pattern you're searching for. Can you provide some sample data with the different cases?
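
Until sample data is available, here is only a rough sketch of what such a RegEx could look like. The "keep" pattern is my guess at a compact form of the variations listed in the original post, the "drop" pattern copies the exclusions from that post, and HAVE, WANT, RE_KEEP and RE_DROP are hypothetical names. Note that the exclusion pattern in the original code ends with a trailing `|`, which adds an empty alternative that matches every string, so that second PRXMATCH could never return 0; the sketch leaves it out.

```sas
/* Hypothetical sample data for illustration only */
data have;
   infile datalines truncover;
   input description $60.;
   datalines;
HW - Comprehensive Ax TM
Comp Ax Telemedicine
Comprehensive Assessment - OT/PT
FA - Reassessment
Dominance Transfer Training
;
run;

data want;
   set have;
   retain re_keep re_drop;
   if _n_ = 1 then do;
      /* Compile both patterns once; the /i option ignores case */
      re_keep = prxparse('/comp(rehensive)? (assessment|ax)|reassessment|dominance/i');
      re_drop = prxparse('/Comprehensive Assessment - OT\/PT|Dominance Transfer Training|conc|psych|LIBYA - AX - COMPREHENSIVE ASSESSMENT/i');
   end;
   /* Flag rows that match the keep pattern and do not match the drop pattern */
   ca_only = (prxmatch(re_keep, description) > 0
          and prxmatch(re_drop, description) = 0);
   drop re_keep re_drop;
run;
```

Because PRXMATCH searches anywhere in the string, prefixes such as "HW - " or "FA - " and suffixes such as "TM" do not need their own alternatives, which keeps both patterns well under 262 characters.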

 

 
