SAS Programming

PamG · Posted 03-15-2024 06:59 PM

I have to search for set of strings that when used in PRXMATCH comes to more than 262 characters long and SAS gives an unbalanced quotation marks warning.

NOTE: The quoted string currently being processed has become more than 262 characters long.
You might have unbalanced quotation marks.

The run finishes inspite of the warning and I am wondering if I should break the PRXMATCH into several statements. A sample code is given below. The code runs fine and I am wondering if I should ignore the warning.

data _null_;
var="24900|24901|24910|24911|24920|24921|24930|24931|24940|24941|24950|24951|24960|24961|24970|24971|24980|24981|24990|24991|25000|25001|25002|25003|25010|25011|25012|25013|25020|25021|25022|25023|25030|25031|25032|25033|25040|25041|25042|25043|25050|25051|25052|25053|25060|25061|25062|25063|25070|25071|25072|25073|25080|25081|25082|25083|25090|25091|25092|25093|36641|V5391|V5867";
varlength=Length(var);
%put var;
%put varlength;
run;

mkeintz · Posted 03-15-2024 07:46 PM

As is often the case, "there is an option for that".

Try

option noquotelenmax;

See QUOTELENMAX System Option

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

mkeintz · Posted 03-15-2024 07:46 PM

As is often the case, "there is an option for that".

Try

option noquotelenmax;

See QUOTELENMAX System Option

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Patrick · Posted 03-16-2024 12:16 AM

Besides of noquotelenmax you could also just create your macro variable differently - which should also make it easier to maintain your list of values.

data values;
  infile datalines truncover;
  input var:$16.;
  datalines;
24900
24901
24910
24911
24920
24921
24930
;
run;

proc sql noprint;
  select var into :var_list separated by '|'
  from values;
quit;

%put &=var_list;

PamG · Posted 03-16-2024 09:22 AM

Thanks. That will not work for me as I have 50 such variables.

Patrick · Posted 03-17-2024 02:26 AM

@PamG wrote:

Thanks. That will not work for me as I have 50 such variables.

@PamG The more you show and tell us what you have, the better we can provide advice.

Below sample code accepts as many lists (variables) as you have to then dynamically process these without the need for any further code change.

data work.var_val;
  infile datalines truncover;
  input name:$32. value:$16.;
  datalines;
var1 24900
var1 24901
var1 24910
var1 24911
var1 24920
var1 24921
var1 24930
var2 25070
var2 25071
var2 25072
var2 25073
var2 25080
var2 25081
var2 25082
var2 25083
;
run;

proc sort data=work.var_val nodupkey;
  by name value;
run;

data work.master;
  infile datalines truncover;
  length id source_str $40. match_flg_expected 3;
  input id match_flg_expected source_str $40.;
  datalines;
1 1 agvd98475249014374 25080 24930 25080 
2 1 99999250729999
3 0 111112507911111
;

proc sql noprint;
  select count(name), max(cnt_value) into :n_names trimmed, :n_values trimmed
  from
    (
      select name, count(distinct value) as cnt_value
      from work.var_val
      group by name
    )
  ;
quit;

/*data demo(keep=id search_var match_val pos len);*/
data work.demo(drop=_:);
 if _n_=1 then
  do;
    _names_ind =1;
    _values_ind=0;
    do until(_done);
      set work.var_val(keep=name value rename=(name=_name value=_value)) end=_done;
      by _name;
      array _values{&n_values} $16 _temporary_;
      array _names {&n_names}  $32 _temporary_;
      array _prxid {&n_names}   8  _temporary_;
      _values_ind+1;
      _values[_values_ind]=_value;
      if last._name then
        do;
          _names[_names_ind]=_name;
          _prxid[_names_ind]=prxparse(cats('/(',catx(')|(',of _values[*]),')/'));
          _names_ind+1;
          _values_ind=0;
          call missing(of _values[*]);
        end;
    end;
  end;

  set work.master;
  length search_var $32 match_val $16;
  _stop = length(source_str);
  do _i=1 to &n_names;
    _start = 1;
    call prxnext(_prxid[_i], _start, _stop, trim(source_str), pos, len);
    if pos>0 then search_var=_names[_i];
    do while (pos > 0);
       match_val = substr(trim(source_str), pos, len);
       output;
       call prxnext(_prxid[_i], _start, _stop, trim(source_str), pos, len);
    end;
  end;
  /* below line only if also neew to output source rows with any match */
  if missing(search_var) then output;
run;

proc print data=work.demo;
run;

FreelanceReinh · Posted 03-16-2024 09:49 AM

Hello @PamG,

@PamG wrote:

var="24900|24901|24910|24911|24920|24921|24930|24931|24940|24941|24950|24951|24960|24961|24970|24971|24980|24981|24990|24991|25000|25001|25002|25003|25010|25011|25012|25013|25020|25021|25022|25023|25030|25031|25032|25033|25040|25041|25042|25043|25050|25051|25052|25053|25060|25061|25062|25063|25070|25071|25072|25073|25080|25081|25082|25083|25090|25091|25092|25093|36641|V5391|V5867";

Note that the length of this particular regular expression can be reduced from 377 to merely 44 characters:

249[0-9][01]|250[0-9][0-3]|36641|V5391|V5867

This makes it not only more readable, but also about seven times faster (on my workstation) when used with PRXMATCH. And, of course, it avoids the "... more than 262 characters ..." note in the log.

SAS Programming

Unbalanced quotation in variable assignment and PRXMATCH

Re: Unbalanced quotation in variable assignment and PRXMATCH

Re: Unbalanced quotation in variable assignment and PRXMATCH

Re: Unbalanced quotation in variable assignment and PRXMATCH

Re: Unbalanced quotation in variable assignment and PRXMATCH

Re: Unbalanced quotation in variable assignment and PRXMATCH

Re: Unbalanced quotation in variable assignment and PRXMATCH

Final line of code to stop unbalanced quotation marks ??

Assigning values based on another variable

Prxmatch function for values with decimal point and without

IF prxmatch('/prx-pattern/', catx(' ', of charvar1-charvar100)) vs. W...

Split a variable and assign the contribution of IDs to that variable v...

Follow Us

What is...

SAS Programming

Join us for our biggest event of the year!

SAS Training: Just a Click Away

Follow Us

What is...