Find Character values in variable

Occasional Contributor
Posts: 11

Hi,

I have a variable whose values contain both numeric and alphabetic characters (e.g. 1675ag157). I want to put the observations that contain letters into a separate dataset. Is there a function I can use to identify these records, other than a lengthy IF statement?

Super Contributor
Posts: 259

You can use the ANYALPHA function:

...

if anyalpha(variable) then output alphas;

else output numerics;

...
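A minimal, self-contained sketch of this approach. The dataset name have, the variable name key, and the second sample value are placeholders chosen for illustration; only 1675ag157 comes from the question.

```sas
/* Hypothetical sample data: one value with letters, one without */
data have;
  length key $ 9;
  input key $;
  datalines;
1675ag157
123476
;
run;

data alphas numerics;
  set have;
  /* ANYALPHA returns the position of the first letter, or 0 if there is none */
  if anyalpha(key) then output alphas;
  else output numerics;
run;
```

Note that numerics receives every value without a letter, so blank values or values containing only punctuation would land there as well.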

Frequent Contributor
Posts: 140


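* Sample data: one value with digits and one without;
data l;
input x $;
cards;
rt56y
polio
;
run;

* ANYDIGIT returns the position of the first digit, or 0 if there is none;
data l2(drop=c) l3(drop=c);
set l;
c=anydigit(x);
if c=0 then output l2; * no digits in the value;
else output l3;        * at least one digit in the value;
run;
proc print data=l2;
run;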

Super Contributor
Posts: 268

Did any of these replies answer your question?

Trusted Advisor
Posts: 1,300

*solution using regular expression;
data digits alphas;
input key $;
if prxmatch('/[a-zA-Z]+/',key) then output alphas; else output digits;
cards;
1675ag157
ag09912
simpton
123476
32246
12345h
;
run;

PROC Star
Posts: 7,363

Matt, just FWIW, Andreas' proposed ANYALPHA solution ran twice as fast as the regular expression. The code I ran, on 100,000 replications of your example data, was:

data digits alphas;
  set have;
  if prxmatch('/[a-zA-Z]+/',key) then output alphas;
  else output digits;
run;

data digits alphas;
  set have;
  if anyalpha(key) then output alphas;
  else output digits;
run;
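The post does not show how the input dataset have was built. One way it could be constructed, as a sketch only: the loop variable rep and the length of $ 9 are assumptions, and the six values are the ones from the regular expression example earlier in the thread.

```sas
/* Hypothetical construction of the test data: 100,000 copies of the six sample values */
data have;
  length key $ 9;
  do rep = 1 to 100000;
    do key = '1675ag157', 'ag09912', 'simpton', '123476', '32246', '12345h';
      output;
    end;
  end;
  drop rep;
run;
```

Running each of the two data steps above against this dataset and comparing the real time and CPU time reported in the SAS log would give a comparison of this kind. Actual timings will vary by system.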

Trusted Advisor
Posts: 1,300

Yes, in my opinion SAS does not implement regular expressions efficiently; in most cases a native function will outperform them. The benefit of regular expressions generally comes from more specific patterns and cases that fall outside what the INDEX and FIND functions (and ANYALPHA and similar functions) can do.

Super User
Posts: 9,681

FriedEgg

You can use the 'o' modifier to prevent SAS from parsing the Perl regular expression on every iteration of the DATA step.

That will be a lot faster.

if prxmatch('/[a-zA-Z]+/o',key) then output alphas; else output digits;

Ksharp

Trusted Advisor
Posts: 1,300

Thanks Ksharp, I did not know that hint. Another option is to call PRXPARSE first and then retain the returned pattern ID. This does save a substantial amount of processing time; however, I still find regular expressions typically less efficient.
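A minimal sketch of the PRXPARSE-and-retain approach, assuming the same have dataset and key variable used in the timing test. The variable name re is a placeholder.

```sas
data digits alphas;
  set have;
  retain re;                                    /* keep the pattern ID across iterations */
  if _n_ = 1 then re = prxparse('/[a-zA-Z]/');  /* compile the pattern once, on the first row */
  if prxmatch(re, key) then output alphas;
  else output digits;
  drop re;                                      /* the pattern ID is not needed in the output */
run;
```

As far as I recall from the SAS documentation, a pattern given as a constant string is also compiled only once, so the gain from this form may be most noticeable when the pattern is held in a variable. Treat that as something to verify with your own timings.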
