DATA Step, Macro, Functions and more

Looking for a SAS function that can help identify a string that contains all the valid characters!

Reply
Occasional Contributor
Posts: 17

Looking for a SAS function that can help identify a string that contains all the valid characters!

[ Edited ]

Hi everyone, I am wondering if there is a function that can help identify observations for a variable that contain all the valid charaters.

 

For example, I have a dataset called test which is comprised of two variables. 1. observation, 2 code. Below is the table of info:

 

Observation Code

1                  abccccccccccc

2                  aaaaaaaaaaaa

3                  bbbcccccccccc

4                  aaaaacccccccc

5                  bbbbbbbbbbbb

6                  bbbbbbbaaaaa

7                  cccccbbbbbbbc

 

I would like to identify the oberversation 3 and observation 7  that contain all the characters "b" and "c"  only NOT "b" or "c" only. What function(s) can help me get that?

 

I tried to use verify (trim(Code),"bc"), and it ends up giving me the observation 3, observation 5  and observation 7, but I only want to have observation 3 and  observation 7 in my result.

 

Note:  I know I could use find( trim(code),"bbbcccccccccc") to find obersavtion 3 and find(trim(code),"cccccbbbbbbbc") to find obersavtion 7, but that requires the exact match not just in terms of all characters in a string, but also in terms of their positions in a string.

 

Is there a more efficient function that can help identify the observation 3 and observation 7 without specifying the entire characters in a string as well as the exact positioning of those characters in a string?

 

Any help on this would be greatly appreciated. Smiley Happy

 

Thank you!

PROC Star
Posts: 7,487

Re: Looking for a SAS function that can help identify a string that contains all the valid character

Posted in reply to Dancer_on_data

There is probably an easier way, but I think that the following will do what you want:

 

data want;
  set have;
  test=verify(strip(code),'acdefghijklmnopqrstuvwxyz')>0 and
    verify(strip(code),'abdefghijklmnopqrstuvwxyz') > 0 and
    verify(strip(code),'bc') eq 0;
run;

Art, CEO, AnalystFinder.com

 

Super User
Posts: 5,516

Re: Looking for a SAS function that can help identify a string that contains all the valid character

Posted in reply to Dancer_on_data

Somebody will come up with a pxparse solution that I won't understand, but here is a reasonable way:

 

data want;

set have;

test_code = compress(code, 'b');

if 0 < lengthn(test_code) < lengthn(code);

if compress(test_code, 'c') = ' ';

drop test_code;

run;

Frequent Contributor
Posts: 75

Re: Looking for a SAS function that can help identify a string that contains all the valid character

Posted in reply to Dancer_on_data

Here's another option

data want;
set have;
where indexc(code,'b')>0 and indexc(code,'c')>0 and indexc(code,'adefghijklmnopqrstuvwxyz') = 0;
run;

:

 

Trusted Advisor
Posts: 1,022

Re: Looking for a SAS function that can help identify a string that contains all the valid character

[ Edited ]
Posted in reply to Dancer_on_data

Just use the function (compress, not trim) twice:

 

 

data want;

   bcode=compress(code,'b');

   if bcode=' ' then delete;

   if compress(code,'c')=' ';

   if compress(bcode,'c')=' ';

run;

 

Editted, thanks to Art's observation.

PROC Star
Posts: 7,487

Re: Looking for a SAS function that can help identify a string that contains all the valid character

I'd have to think that @mkeintz meant:

 

data want (drop=bcode);
  set have;
  bcode=compress(code,'b');
  if bcode=' ' then delete;
  if compress(bcode,'c')=' ';
run;

Art, CEO, AnalystFinder.com

 

Trusted Advisor
Posts: 1,022

Re: Looking for a SAS function that can help identify a string that contains all the valid character

@art297

 

I think @mkeintz meant that too.  Will edit.

Regular Contributor
Posts: 218

Re: Looking for a SAS function that can help identify a string that contains all the valid character

Posted in reply to Dancer_on_data

Hi,

 

Here is my solution to your question

 

data have;
length Observation 3 Code $20;
input Observation Code;
datalines;
1                  abccccccccccc
2                  aaaaaaaaaaaa
3                  bbbcccccccccc
4                  aaaaacccccccc
5                  bbbbbbbbbbbb
6                  bbbbbbbaaaaa
7                  cccccbbbbbbbc
;
run;

 

data want;
    set have(where=(prxmatch('/[^bc]/',strip(code))=0));  /* Only read observation with nothing but b/c/both */
    if (findc(code,'c') AND findc(code,'b')); /* Only keep observations with both b&c */
run;

Super User
Super User
Posts: 7,074

Re: Looking for a SAS function that can help identify a string that contains all the valid character

Posted in reply to Dancer_on_data

Contains both B and C

indexc(string,'b') and indexc(string,'c')

But doesn't have something other than b and c.

and not verify(string,'b c')

Try it.

data test;
  input string $10. ;
  ok = indexc(string,'b') and indexc(string,'c') and not verify(string,' bc') ;
  put (_all_) (=);
cards;
abccccccccccc
aaaaaaaaaaaa
bbbcccccccccc
aaaaacccccccc
bbbbbbbbbbbb
bbbbbbbaaaaa
cccccbbbbbbbc
abc
bbbcc
bc
.
;
Occasional Contributor
Posts: 17

Re: Looking for a SAS function that can help identify a string that contains all the valid character

Posted in reply to Dancer_on_data

Thank you so much for all the ideas, which really helped stimulate my thinking, I built upon some of your ideas, and arriived at the following codes for the task.

 

data test;
  input Obs Code $20. ;
cards;
1 abccccccccccc
2 aaaaaaaaaaaa
3 bbbcccccccccc
4 aaaaacccccccc
5 bbbbbbbbbbbb
6 bbbbbbbbaaaaa
7 cccccbbbbbbbc
;
run;

data final;
set test;
if verify(strip(code),"b")>0 and
verify(strip(code),"c")>0 and
verify(strip(code),"bc")=0;
run;

 

 

Once again, thanks so much for all your help on this task.Smiley Happy

 

Best,

Respected Advisor
Posts: 4,930

Re: Looking for a SAS function that can help identify a string that contains all the valid character

Posted in reply to Dancer_on_data
data have;
input Observation Code :$20.;
datalines;
1                  abccccccccccc
2                  aaaaaaaaaaaa
3                  bbbcccccccccc
4                  aaaaacccccccc
5                  bbbbbbbbbbbb
6                  bbbbbbbaaaaa
7                  cccccbbbbbbbc
;

data want;
check = "bc";
set have;
ok = verify(check, trim(code))=0 and verify(trim(code), check)=0;
run;

proc print; run;
PG
Contributor
Posts: 54

Re: Looking for a SAS function that can help identify a string that contains all the valid character

Posted in reply to Dancer_on_data
I am reading all creative solutions admiringly.
Thanks a lot for all.
Ask a Question
Discussion stats
  • 11 replies
  • 307 views
  • 9 likes
  • 9 in conversation