Regular Expression to find more than one occurrence

Occasional Contributor
Posts: 5

Dear all.

I have a string variable in a dataset that contains manually entered text. I am trying to split the text into chunks, where the delimiter consists of more than one character and can occur one or more times. Example:

"0 - Nothing entered or found. 1 - One or more expression- or else - entered. 99 - Missing"

Should become

Field1  Field2
0       Nothing entered or found.
1       One or more expression- or else - entered
99      Missing

I find the "0 - " part with the regular expression (\d+ ?)-

Now I am looking for a way to find every occurrence of the expression, not only the first. Any help is appreciated.

Best Regards

Super User
Posts: 7,430

Hi,

You could use scan:

data have;
  a="0 - Nothing entered or found. 1 - One or more expression- or else - entered. 99 - Missing";
run;

data want;
  set have;
  i=1;
  do while (scan(a,i,".") ne "");
    word=scan(a,i,".");
    output;
    i=i+1;
  end;
run;

Occasional Contributor
Posts: 5

Hi RW9.

Sorry, but no. The dot at the end of each sentence is not guaranteed. That is why I am thinking about the PRX functions. I have to treat the expression ([0-9] - ) as the delimiter.

Thanks

Occasional Contributor
Posts: 5

I found one way:

Using PRXCHANGE('s/(\d+ ?)-/$1=/', -1, mytextstring); I change the expression (0 - ) into (0 = ). Now I can use the SCAN function.
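
A minimal sketch of this substitution, using the example string from the first post. The variable name changed and the lengths are assumptions made for illustration; note that the digit class is written as \d, and that -1 asks PRXCHANGE to replace every match rather than only the first.

```sas
data _null_;
  length mytextstring changed $ 200;  /* lengths chosen for the example only */
  mytextstring = "0 - Nothing entered or found. 1 - One or more expression- or else - entered. 99 - Missing";
  /* Replace the hyphen that directly follows a number (and an optional blank) with "=" */
  changed = prxchange('s/(\d+ ?)-/$1=/', -1, mytextstring);
  put changed=;
run;
```

The hyphens inside "expression- or else - entered" are left alone because no digit precedes them, so only the three numbered prefixes should change.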

Trusted Advisor
Posts: 1,300

Assuming the text parts contain no embedded digits (as is the case in your example), you can use the following pattern:

(\d+) - ([^\d]+)

Item     Text                                            Start  Length
Match 1  0 - Nothing entered or found.                       0      30
Group 1  0                                                   0       1
Group 2  Nothing entered or found.                           4      26
Match 2  1 - One or more expression- or else - entered.     30      47
Group 1  1                                                  30       1
Group 2  One or more expression- or else - entered.         34      43
Match 3  99 - Missing                                       77      12
Group 1  99                                                 77       2
Group 2  Missing                                            82       7
Super User
Posts: 9,691

You could also use SCAN(), as in the program below, although I would prefer a Perl regular expression. Check CALL PRXNEXT().

Code: Program

data have;
  a="0 - Nothing entered or found. 1 - One or more expression- or else - entered. 99 - Missing";
run;
data want;
set have;
length var $ 100;
do i=1 to countw(a,,'d');
  var=catx(' ',scan(a,i,,'kd'),scan(a,i,,'d'));output;
end;
run;
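
For the CALL PRXNEXT() route, a minimal sketch might look like the following. It borrows the pattern (\d+) - ([^\d]+) suggested earlier in the thread, so it carries the same assumption that the text parts contain no digits. The names field1 and field2 come from the original question; re, start, stop, pos and len are illustrative.

```sas
data have;
  a="0 - Nothing entered or found. 1 - One or more expression- or else - entered. 99 - Missing";
run;

data want;
  set have;
  length field1 $ 8 field2 $ 100;   /* lengths are assumptions for this example */
  retain re;
  /* Compile the pattern once; it assumes no digits occur inside the text parts */
  if _n_ = 1 then re = prxparse('/(\d+) - ([^\d]+)/');
  start = 1;
  stop  = length(a);
  /* Each call finds the next match and moves start past it */
  call prxnext(re, start, stop, a, pos, len);
  do while (pos > 0);
    field1 = prxposn(re, 1, a);          /* the number */
    field2 = strip(prxposn(re, 2, a));   /* the text, trailing blank removed */
    output;
    call prxnext(re, start, stop, a, pos, len);
  end;
  keep field1 field2;
run;
```

With the example string this should write one row per numbered entry, with the number in field1 and the text in field2. If the text can contain digits, the pattern would need to be adapted.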

Xia Keshan