Perl in macro question

Contributor
Posts: 40

I created a macro variable, numlist, with the following value:

ONE|TWO|THREE|FOUR|FIVE|SIX|SEVEN|EIGHT|NINE|TEN|ELEVEN|TWELVE|THIRTEEN|FOURTEEN|FIFTEEN|SIXTEEN|SEVENTEEN|EIGHTEEN|NINETEEN|TWENTY|TWENTY-ONE|TWENTY-TWO|TWENTY-THREE|and so on.


The format $wordnum converts ONE to 1, TWO to 2, and so on.

Now I need to use a regular expression to capture the number words, like this:

data have;
  var="abcd efgh ijkl WITH THREE mnop qrst"; output;
  var="abcd efgh ijkl W/ TWENTY-FIVE qrst"; output;
run;

%macro try;
  data want;
     set have;
           %local re1;
           %let re1=%sysfunc(prxparse(/(\s|\(|\)|-|#|[HW]\/ ?)(\d{1,3}|&numlist)\W*(mnop|qrst)/));
           %if &re1 > 0 %then %do;
                %let pt_comments=%sysfunc(prxposn(&re1,2,var));
                var2 = put("&pt_comments",$wordnum.);
           %end;
  run;
%mend;

%try;

proc print data=want; run;

I'm expecting var2 to be 3 for the first row and 25 for the second, as shown below. What did I do wrong in the program that creates want? Thanks!


obs  var                                  var2
1    abcd efgh ijkl WITH THREE mnop qrst  3
2    abcd efgh ijkl W/ TWENTY-FIVE qrst   25



All Replies
Contributor
Posts: 40

Re: Perl in macro question

Here is the program to create the macro variable and the format:

data seed;
  length start $100 num $10000;
  retain fmtname 'wordnum' type 'c' num '';
  do label=1 to 100;
    start=upcase(put(label,words100.));
    output;
    num = catx('|', num, upcase(put(label,words100.)));
  end;
  call symputx('numlist', num);
run;

proc format library=work cntlin=seed; run;

PROC Star
Posts: 7,363

Re: Perl in macro question

You can also do it without either a regular expression or a macro:

data seed;
  length start $100;
  retain fmtname 'wordnum' type 'i';
  do label=1 to 100;
    start=upcase(put(label,words100.));
    output;
  end;
run;

proc format library=work cntlin=seed; run;

data have;
  var="abcd efgh ijkl WITH THREE mnop qrst"; output;
  var="abcd efgh ijkl WITH JUNK mnop qrst"; output;
  var="abcd efgh ijkl W/ TWENTY-FIVE qrst"; output;
run;

data want;
  set have;
  _n_=1;
  do while (scan(var,_n_,' ') ne '');
    if input(scan(var,_n_,' '),?? wordnum.) gt 0 then
     var2=input(scan(var,_n_,' '),wordnum.);
    _n_+1;
  end;
run;

Contributor
Posts: 40

Re: Perl in macro question

Thank you. The thing is, there are more conditions that need to be considered: for example, the number has to have WITH or W/ in front of it and mnop behind it, and so on. I can't list all of them.

Can you explain what ?? does in the first INPUT function call?

Thanks!

Solution
‎04-08-2015 01:35 PM
PROC Star
Posts: 7,363

Re: Perl in macro question

The following is from the documentation's explanation of the ?? modifier: "? or ?? specifies the optional question mark (?) and double question mark (??) modifiers that suppress the printing of both the error messages and the input lines when invalid data values are read. The ? modifier suppresses the invalid data message. The ?? modifier also suppresses the invalid data message and, in addition, prevents the automatic variable _ERROR_ from being set to 1 when invalid data are read."
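To make the difference between the two modifiers concrete, here is a minimal sketch that reads the same invalid value three ways. The literal 'ABC' and the variable names a, b, and c are only illustrative, and the comments describe the behavior the quoted documentation leads one to expect rather than captured log output.

```sas
data _null_;
  /* 'ABC' cannot be read as a number, so every call below returns missing. */
  a = input('ABC', 8.);      /* no modifier: invalid-data NOTE in the log, _ERROR_ set to 1 */
  put 'no modifier: ' _error_=;
  _error_ = 0;               /* reset so the next call can be judged on its own */
  b = input('ABC', ? 8.);    /* ? modifier: NOTE suppressed, _ERROR_ still set to 1 */
  put '? modifier:  ' _error_=;
  _error_ = 0;
  c = input('ABC', ?? 8.);   /* ?? modifier: NOTE suppressed and _ERROR_ left at 0 */
  put '?? modifier: ' _error_=;
run;
```

In the code in this thread, ?? is what lets the loop test every word against the wordnum. informat without filling the log with notes for the words that are not numbers.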

While I've tried to find the time to learn regular expressions, I still haven't. Thus, I'm sure the following is more convoluted than it has to be. However, if I correctly understand what you want to do, then the following appears to accomplish the task:

/* Build a CNTLIN data set for a numeric informat: 'ONE' -> 1 ... 'ONE HUNDRED' -> 100 */
data seed;
  length start $100;
  retain fmtname 'wordnum' type 'i';
  do label=1 to 100;
    start=upcase(put(label,words100.));
    output;
  end;
run;

proc format library=work cntlin=seed; run;

data have;
  var="abcd efgh ijkl WITH THREE mnop qrst"; output;
  var="abcd efgh ijkl WITH THREE mxop qrst"; output;
  var="abcd efgh ijkl WITH THIRTY mnop qrst"; output;
  var="abcd efgh ijkl W/ TWENTY-FIVE qrst"; output;
  var="abcd efgh ijkl W TWENTY-FIVE qrst"; output;
  var="abcd efgh ijkl With/ TWENTY-FIVE qrst"; output;
run;

data want (drop=newvar re);
  set have;
  length newvar $200;
  var=upcase(var);
  /* Walk the blank-delimited words; NEWVAR is VAR with each number word replaced by its digits */
  _n_=1;
  do while (scan(var,_n_,' ') ne '');
    if input(scan(var,_n_,' '),?? wordnum.) gt 0 then do;
      var2=input(scan(var,_n_,' '),wordnum.);
      newvar=catx(' ',newvar,input(scan(var,_n_,' '),wordnum.));
    end;
    else newvar=catx(' ',newvar,scan(var,_n_,' '));
    _n_+1;
  end;
  /* Keep VAR2 only when the digits follow WITH or W/ and come right before MNOP or QRST */
  re=prxparse('/(WITH\s|W\/\s)\d{1,3}(\sMNOP|\sQRST)/');
  if not prxmatch(re,newvar) then call missing(var2);
run;
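
For comparison, here is a rough sketch of the regular-expression route the question started from, written with DATA step functions instead of %sysfunc. Macro statements such as %let and %if run once, while the step is being generated and before any observation is read, so they never see the value of var. The sketch assumes that the macro variable numlist from the earlier post and the wordnum. informat built above both exist; the prefix part of the pattern is simplified to WITH or W/, and the variable name word is only illustrative.

```sas
data want (drop=re word);
  set have;
  length word $40;
  retain re;
  /* Compile the pattern once; &numlist resolves inside the double quotes. */
  if _n_ = 1 then
    re = prxparse("/(WITH\s|W\/\s?)(\d{1,3}|&numlist)\W*(MNOP|QRST)/i");
  if prxmatch(re, var) then do;
    /* Capture buffer 2 holds either the digits or the number word. */
    word = upcase(prxposn(re, 2, var));
    var2 = input(word, ?? wordnum.);
    /* Fall back to a plain numeric read in case the informat did not return a value. */
    if missing(var2) then var2 = input(word, ?? 3.);
  end;
run;
```

Because PRXMATCH and PRXPOSN are called on every observation here, var2 is set row by row, which is what the macro version could not do.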

Contributor
Posts: 40

Re: Perl in macro question

Thank you so much, Arthur!
