DATA Step, Macro, Functions and more

CATS function and buffer length

Reply
Respected Advisor
Posts: 3,887

CATS function and buffer length

Hi all

I've run today into below problem (more than one OS and SAS version - below log from SAS 9.3 M2 under Win7).

What I try do do is to calculate a digest value. The code used is very close to what DI Studio generates when using the SCD Type 2 loader. This digest value is used to determine if rows matched by a key differ in any of the variables used to construct the digest value.

When having a lot of variables and using a data step everything works. But when using the exactly same nested functions in a SQL it returns a warning - and it actually does what the warning says: Not the full variable string is used to construct the digest value so the values differ between the data step version and the SQL version.

I haven't found a way around this issue. Any ideas of how to solve this? (eg. how to increase the buffer?).

This is an issue I've encountered during some "mocking around with code". So I'm not after a "work-around" but after a solution/explanation. It's more about getting a deeper understanding of what's going on here.

Sample code

data have;
  array vars {100} $10 (100*'abc');
run;

proc sql noprint;
  select name into :varlist separated by ','
  from dictionary.columns
  where libname='WORK' and memname='HAVE'
  ;
quit;

/* returns a warning */
proc sql;
  create table Has_Issues as
  select put(md5(cats(&varlist)),$hex32.) as DigestValue length=32
  from have
  ;
quit;

/* no warning. seems to work */
data No_Issues (drop=varsSmiley Happy;
  set have;
  length DigestValue $32.;
  DigestValue=put(md5(cats(&varlist)),$hex32.);
run;

LOG:

27         proc sql;

28           create table Has_Issues as

29           select put(md5(cats(&varlist)),$hex32.) as DigestValue length=32

30           from have

31           ;

WARNING: In a call to the CATS function, the buffer allocated for the result was not long enough to contain the concatenation of

         all the arguments. The correct result would contain 300 characters, but the actual result might either be truncated to 200

         character(s) or be completely blank, depending on the calling environment. The following note indicates the left-most

         argument that caused truncation.

NOTE: Invalid argument 67 to function CATS. Missing values may be generated.

NOTE: Table WORK.HAS_ISSUES created, with 1 rows and 1 columns.

Thanks

Patrick

Respected Advisor
Posts: 3,777

Re: CATS function and buffer length

I think this will work.

proc sql;
 
create table Has_Issues2(drop=vlist) as
 
select cats(&varlist) as vlist length=1024,
      put(md5(calculated vlist),
$hex32.) as DigestValue length=32
 
from have
  ;
quit;

Respected Advisor
Posts: 3,887

Re: CATS function and buffer length

Hi data _null_;

Thanks for that. The only thing I needed to add was a strip() function as md5() uses the full variable length inclusive trailing blanks for calculation. Without strip() the digest values from the data step and the SQL would differ.

proc sql;

  create table Has_Issues2(drop=vlist) as

  select cats(&varlist) as vlist length=1024,

      put(md5(strip(calculated vlist)),$hex32.) as DigestValue length=32

  from have

  ;

quit;

I wasn't aware that I can use a calculated variable this way in a SAS SQL so that it's used immediately without any re-merging.

Given your answer I assume there is no way to actually increase buffer size for an intermediate cats() result. But then the nested function works in a data step so I'm still a bit confused about the different behavior of data step and SQL.

Thanks

Patrick

Respected Advisor
Posts: 3,777

Re: CATS function and buffer length

Length of Returned Variable: Special Cases


The CATS function returns a value to a variable, or returns a value in a temporary buffer. The value that is returned from the CATS function has the following length:


•up to 200 characters in WHERE clauses and in PROC SQL

•up to 32767 characters in the DATA step except in WHERE clauses

•up to 65534 characters when CATS is called from the macro processor

Super User
Posts: 9,662

Re: CATS function and buffer length

Hi. Patrick.

"I assume there is no way to actually increase buffer size for an intermediate cats() result. But then the nested function works in a data step so I'm still a bit confused about the different behavior of data step and SQL."

Me, either.

I also am shocked that  cats() can worked in a data step, since is defaultly limited 200 long .

Or you can try to use another string concatenate operator ' || ' ,which has no such limitation.

data have;
  array vars {100} $10 (100*'abc');
run;

proc sql noprint;
  select cats('strip(',name,')') into :varlist separated by '||'
  from dictionary.columns
  where libname='WORK' and memname='HAVE'
  ;
quit;



proc sql;
  create table Has_Issues as
  select put(md5(&varlist),$hex32.) as DigestValue length=32
  from have
  ;
quit;


data No_Issues (drop=vars:);
  set have;
  length DigestValue $32.;
  DigestValue=put(md5(&varlist),$hex32.);
run;

Ksharp

Super User
Posts: 9,662

Re: CATS function and buffer length

I tested also. Without strip(), data step and sql is also matched.

The reason might be  in data step there is no limitation 200 for cats(), but for SQL.

data have;
  array vars {100} $10 (100*'abc');
run;

proc sql noprint;
  select name into :varlist separated by '||'
  from dictionary.columns
  where libname='WORK' and memname='HAVE'
  ;
quit;



proc sql;
  create table Has_Issues as
  select put(md5(&varlist),$hex32.) as DigestValue length=32
  from have
  ;
quit;


data No_Issues (drop=vars:);
  set have;
  length DigestValue $32.;
  DigestValue=put(md5(&varlist),$hex32.);
run;

Ksharp

Ask a Question
Discussion stats
  • 5 replies
  • 4364 views
  • 6 likes
  • 3 in conversation