Dimax
Fluorite | Level 6

Hello, I am learning SAS and trying to write a macro that takes a file path as its first parameter and, as its second parameter, a search term to look for in that file. However, I can't get it to work because I get this error:

ERROR: The regular expression passed to the function PRXMATCH contains a syntax error.

This is the code:

%macro matcher_prxmatch(path, search);
data prxmatch;
  length pos count zeilen_nr 8 line $32767 regex pattern $100 lrecl 8 search $100;
  infile "&path" lrecl=32767 truncover;
  input line $varying32767. lrecl;
  search="&search";
  pattern = cats('/', search, '/i');
  regex = prxparse(pattern);
  if missing(regex) then do;
    put "ERROR: Ungültiger regulärer Ausdruck.";
    stop;
  end;
  zeilen_nr = _N_;
  count = 0;
  pos = prxmatch(regex, line);
  if pos > 0 then do;
    count + 1;
    temp_line = substr(line, pos + 1);
    do while (prxmatch(regex, temp_line) > 0);
      count + 1;
      temp_line = substr(temp_line, prxmatch(regex, temp_line) + 1);
    end;
  end;
  output;
run;
proc print data=prxmatch;
  var zeilen_nr count;
run;
%mend matcher_prxmatch;

%matcher_prxmatch(_____a.txt, SYS);

Can someone explain to me why the error occurs?

8 REPLIES
Tom
Super User

Run the same code without the macro and you will be able to clearly see the LINE that is causing the error.

1    data prxmatch;
2      length pos count zeilen_nr 8 line $32767 regex pattern $100 lrecl 8
2  ! search $100;
3      infile text lrecl=32767 truncover;
4      input line $varying32767. lrecl;
5      search="SYS";
6      pattern = cats('/', search, '/i');
7      regex = prxparse(pattern);
8      if missing(regex) then do;
9        put "ERROR: Ungültiger regulärer Ausdruck.";
10       stop;
11     end;
12     zeilen_nr = _N_;
13     count = 0;
14     pos = prxmatch(regex, line);
15     if pos > 0 then do;
16       count + 1;
17       temp_line = substr(line, pos + 1);
NOTE: Variable "temp_line" was given a default length of 32767 as the result
      of a function call.  If you do not like this, please use a LENGTH
      statement to declare "temp_line".
18       do while (prxmatch(regex, temp_line) > 0);
19         count + 1;
20         temp_line = substr(temp_line, prxmatch(regex,temp_line) + 1);
21       end;
22     end;
23     output;
24   run;

NOTE: Numeric values have been converted to character
      values at the places given by: (Line):(Column).
      7:11
NOTE: Variable lrecl is uninitialized.

So you are defining REGEX as a CHARACTER variable, but it needs to be a NUMERIC variable.

You are also trying to use LRECL in the INPUT statement to specify the length for the $VARYING informat when you never set LRECL to any value.

data prxmatch;
  length pos count zeilen_nr 8 line temp_line $32767 regex 8 pattern $100 search $100;
  infile text lrecl=32767 truncover;
  input ;
  line = _infile_;
  search="SYS";
  pattern = cats('/', search, '/i');
  regex = prxparse(pattern);
  if missing(regex) then do;
    put "ERROR: Ungültiger regulärer Ausdruck.";
    stop;
  end;
  zeilen_nr = _N_;
  count = 0;
  pos = prxmatch(regex, line);
  if pos > 0 then do;
    count + 1;
    temp_line = substr(line, pos + 1);
    do while (prxmatch(regex, temp_line) > 0);
      count + 1;
      temp_line = substr(temp_line, prxmatch(regex,temp_line) + 1);
    end;
  end;
  output;
run;
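
The step above sidesteps $VARYING by copying _INFILE_. If the $VARYING informat is wanted instead, its length variable has to hold the record length before the informat uses it. A minimal sketch, assuming the LENGTH= option of the INFILE statement; the temporary fileref `text`, the two sample records, and the variable name `reclen` are illustrative and not taken from the thread:

```sas
/* Build a small throwaway input file (sample content is made up). */
filename text temp;
data _null_;
  file text;
  put 'first line with SYS';
  put 'sys twice: SYS';
run;

data lines;
  length line $32767;
  /* LENGTH= names a variable that receives the length of the current record. */
  infile text lrecl=32767 length=reclen truncover;
  input @;                             /* load the record and hold it, so RECLEN is set */
  input line $varying32767. reclen;    /* read exactly RECLEN characters */
run;
```

The variable named in LENGTH= is not written to the output data set, so LINES should contain only LINE.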
Dimax
Fluorite | Level 6
regex 8 is defined as numeric, because prxparse(pattern) returns a pattern ID.
NOTE: Numeric values have been converted to character
values at the places given by: (Line):(Column).
7:11
This is because the file path contains numbers.
PaigeMiller
Diamond | Level 26

@Dimax wrote:
regex 8 is defined as numeric, because prxparse(pattern) returns a pattern ID.
NOTE: Numeric values have been converted to character
values at the places given by: (Line):(Column).
7:11
This is because the file path contains numbers.

Show us the ENTIRE log for this macro. Showing us tiny little bits of the log really isn't helpful. Please click on the </> icon and paste the log into the window that appears.


--
Paige Miller
Tom
Super User

@Dimax wrote:
regex 8 is defined as numeric, because prxparse(pattern) returns a pattern ID.
NOTE: Numeric values have been converted to character
values at the places given by: (Line):(Column).
7:11
This is because the file path contains numbers.

Actually, it is the reverse. REGEX is defined as CHARACTER because of the LENGTH statement. The PRXPARSE() function returns a numeric value, which is then converted into a character string to be stored in REGEX. So REGEX ends up with a value like '          1'.

This then causes the later errors when REGEX is used as the input to the PRXMATCH() function call. Because REGEX is character instead of numeric, PRXMATCH() attempts to interpret its value as a new regular expression instead of using the one previously compiled by the PRXPARSE() function call.
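
The conversion described above can be made visible in a small standalone step. This is only a sketch: the variable names `regex_num` and `regex_char` and the sample text are made up for illustration, and the step deliberately does not pass the character copy to PRXMATCH(), since that is the call that fails.

```sas
data _null_;
  length regex_num 8 regex_char $100;
  line = 'Name=SYS sys';

  regex_num  = prxparse('/SYS/i');   /* numeric ID of the compiled pattern */
  regex_char = regex_num;            /* implicit numeric-to-character conversion;
                                        the log shows the same NOTE as in the thread */

  /* $CHAR keeps leading blanks, so the padded digits are visible in the log. */
  put regex_num= / regex_char= $char20.;

  /* Passing the numeric ID reuses the compiled pattern. */
  pos = prxmatch(regex_num, line);
  put pos=;                          /* first match starts at position 6 */
run;
```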

Dimax
Fluorite | Level 6

This is really complicated... I had "length regex 8", which I thought meant numeric, 8 bytes. I changed it to "length regex 8." (with a trailing period) and the error disappeared. Then I changed it back to "length regex 8", and I still didn't get an error. Thanks, everyone, for your help. 👍

Tom
Super User
Accepted Solution

@Dimax wrote:

This is really complicated... I had "length regex 8", which I thought meant numeric, 8 bytes. I changed it to "length regex 8." (with a trailing period) and the error disappeared. Then I changed it back to "length regex 8", and I still didn't get an error. Thanks, everyone, for your help. 👍

The period makes no difference.  Lengths are always integers.

You had this:

length ... regex pattern $100 ..;

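That defines both REGEX and PATTERN as character variables of length 100, because every variable listed before a length specification takes that specification.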
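
How the LENGTH statement groups variables can be checked with VTYPE() and VLENGTH(). A short sketch; the names `regex2` and `pattern2` are hypothetical and only there to put the corrected declaration next to the one from the question:

```sas
data _null_;
  length regex pattern $100;        /* as in the question: both become character */
  length regex2 8 pattern2 $100;    /* each variable gets its own specification  */
  call missing(regex, pattern, regex2, pattern2);

  type_regex  = vtype(regex);   len_regex  = vlength(regex);
  type_regex2 = vtype(regex2);  len_regex2 = vlength(regex2);

  /* Log should show: type_regex=C len_regex=100
                      type_regex2=N len_regex2=8 */
  put type_regex= len_regex= / type_regex2= len_regex2=;
run;
```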

Dimax
Fluorite | Level 6
Yeah, this was my mistake.
Tom
Super User

Get the SAS code to work before trying to convert it into a macro.

%let search=12;
data prxmatch;
  length zeilen_nr pos count 8 search $100 line temp_line $32767 regex 8 ;
  infile text lrecl=32767 truncover;
  input line $char32767.;
  zeilen_nr + 1;
  if zeilen_nr=1 then do;
    search="&search";
    regex=prxparse("/&search/i");
    if missing(regex) then do;
      put "ERROR: Ungültiger regulärer Ausdruck.";
      stop;
    end;
  end;
  retain search regex;
  count = 0;
  temp_line = line;
  do until(pos=0);
    pos = prxmatch(regex, temp_line);
    if pos > 0 then do;
      count + 1;
      temp_line = substr(temp_line, pos + 1);
    end;
  end;
  drop regex pos temp_line;
run;
proc print;
run;

Result

     zeilen_
Obs     nr    count  search                        line

  1      1      1      12    Name=Alfred Sex=M Age=14 Height=69 Weight=112.5
  2      2      0      12    Name=Alice Sex=F Age=13 Height=56.5 Weight=84
  3      3      0      12    Name=Barbara Sex=F Age=13 Height=65.3 Weight=98
  4      4      0      12    Name=Carol Sex=F Age=14 Height=62.8 Weight=102.5
  5      5      0      12    Name=Henry Sex=M Age=14 Height=63.5 Weight=102.5
  6      6      1      12    Name=James Sex=M Age=12 Height=57.3 Weight=83
  7      7      1      12    Name=Jane Sex=F Age=12 Height=59.8 Weight=84.5
  8      8      1      12    Name=Janet Sex=F Age=15 Height=62.5 Weight=112.5
  9      9      0      12    Name=Jeffrey Sex=M Age=13 Height=62.5 Weight=84
 10     10      1      12    Name=John Sex=M Age=12 Height=59 Weight=99.5
 11     11      0      12    Name=Joyce Sex=F Age=11 Height=51.3 Weight=50.5
 12     12      0      12    Name=Judy Sex=F Age=14 Height=64.3 Weight=90
 13     13      1      12    Name=Louise Sex=F Age=12 Height=56.3 Weight=77
 14     14      1      12    Name=Mary Sex=F Age=15 Height=66.5 Weight=112
 15     15      0      12    Name=Philip Sex=M Age=16 Height=72 Weight=150
 16     16      2      12    Name=Robert Sex=M Age=12 Height=64.8 Weight=128
 17     17      0      12    Name=Ronald Sex=M Age=15 Height=67 Weight=133
 18     18      0      12    Name=Thomas Sex=M Age=11 Height=57.5 Weight=85
 19     19      1      12    Name=William Sex=M Age=15 Height=66.5 Weight=112
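
Once the open-code step works, wrapping it back into the macro from the question is mostly mechanical. The following is a sketch under assumptions, not a tested drop-in: it assumes the search term is a valid Perl regular expression that contains no `/` and no double quotes, it swaps the German error text for an English one, and it keeps the placeholder path from the original call unchanged.

```sas
%macro matcher_prxmatch(path, search);
  data prxmatch;
    /* REGEX gets its own numeric specification. */
    length zeilen_nr pos count regex 8 line temp_line $32767;
    retain regex;
    infile "&path" lrecl=32767 truncover;
    input line $char32767.;
    zeilen_nr + 1;

    /* Compile the pattern once, on the first record. */
    if zeilen_nr = 1 then do;
      regex = prxparse("/&search/i");
      if missing(regex) then do;
        put "ERROR: Invalid regular expression.";
        stop;
      end;
    end;

    /* Count matches by restarting the search one character past each hit. */
    count = 0;
    temp_line = line;
    do until (pos = 0);
      pos = prxmatch(regex, temp_line);
      if pos > 0 then do;
        count + 1;
        temp_line = substr(temp_line, pos + 1);
      end;
    end;
    drop regex pos temp_line;
  run;

  proc print data=prxmatch;
    var zeilen_nr count;
  run;
%mend matcher_prxmatch;

/* Call as in the question; the path is the placeholder used there. */
%matcher_prxmatch(_____a.txt, SYS);
```

Because the search restarts one character after each match, overlapping matches are counted separately.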
