- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi all
I've run today into below problem (more than one OS and SAS version - below log from SAS 9.3 M2 under Win7).
What I try do do is to calculate a digest value. The code used is very close to what DI Studio generates when using the SCD Type 2 loader. This digest value is used to determine if rows matched by a key differ in any of the variables used to construct the digest value.
When having a lot of variables and using a data step everything works. But when using the exactly same nested functions in a SQL it returns a warning - and it actually does what the warning says: Not the full variable string is used to construct the digest value so the values differ between the data step version and the SQL version.
I haven't found a way around this issue. Any ideas of how to solve this? (eg. how to increase the buffer?).
This is an issue I've encountered during some "mocking around with code". So I'm not after a "work-around" but after a solution/explanation. It's more about getting a deeper understanding of what's going on here.
Sample code
data have;
array vars {100} $10 (100*'abc');
run;
proc sql noprint;
select name into :varlist separated by ','
from dictionary.columns
where libname='WORK' and memname='HAVE'
;
quit;
/* returns a warning */
proc sql;
create table Has_Issues as
select put(md5(cats(&varlist)),$hex32.) as DigestValue length=32
from have
;
quit;
/* no warning. seems to work */
data No_Issues (drop=vars:);
set have;
length DigestValue $32.;
DigestValue=put(md5(cats(&varlist)),$hex32.);
run;
LOG:
27 proc sql;
28 create table Has_Issues as
29 select put(md5(cats(&varlist)),$hex32.) as DigestValue length=32
30 from have
31 ;
WARNING: In a call to the CATS function, the buffer allocated for the result was not long enough to contain the concatenation of
all the arguments. The correct result would contain 300 characters, but the actual result might either be truncated to 200
character(s) or be completely blank, depending on the calling environment. The following note indicates the left-most
argument that caused truncation.
NOTE: Invalid argument 67 to function CATS. Missing values may be generated.
NOTE: Table WORK.HAS_ISSUES created, with 1 rows and 1 columns.
Thanks
Patrick
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I think this will work.
proc sql;
create table Has_Issues2(drop=vlist) as
select cats(&varlist) as vlist length=1024,
put(md5(calculated vlist),$hex32.) as DigestValue length=32
from have
;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi data _null_;
Thanks for that. The only thing I needed to add was a strip() function as md5() uses the full variable length inclusive trailing blanks for calculation. Without strip() the digest values from the data step and the SQL would differ.
proc sql;
create table Has_Issues2(drop=vlist) as
select cats(&varlist) as vlist length=1024,
put(md5(strip(calculated vlist)),$hex32.) as DigestValue length=32
from have
;
quit;
I wasn't aware that I can use a calculated variable this way in a SAS SQL so that it's used immediately without any re-merging.
Given your answer I assume there is no way to actually increase buffer size for an intermediate cats() result. But then the nested function works in a data step so I'm still a bit confused about the different behavior of data step and SQL.
Thanks
Patrick
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Length of Returned Variable: Special Cases
The CATS function returns a value to a variable, or returns a value in a temporary buffer. The value that is returned from the CATS function has the following length:
•up to 200 characters in WHERE clauses and in PROC SQL
•up to 32767 characters in the DATA step except in WHERE clauses
•up to 65534 characters when CATS is called from the macro processor
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi. Patrick.
"I assume there is no way to actually increase buffer size for an intermediate cats() result. But then the nested function works in a data step so I'm still a bit confused about the different behavior of data step and SQL."
Me, either.
I also am shocked that cats() can worked in a data step, since is defaultly limited 200 long .
Or you can try to use another string concatenate operator ' || ' ,which has no such limitation.
data have; array vars {100} $10 (100*'abc'); run; proc sql noprint; select cats('strip(',name,')') into :varlist separated by '||' from dictionary.columns where libname='WORK' and memname='HAVE' ; quit; proc sql; create table Has_Issues as select put(md5(&varlist),$hex32.) as DigestValue length=32 from have ; quit; data No_Issues (drop=vars:); set have; length DigestValue $32.; DigestValue=put(md5(&varlist),$hex32.); run;
Ksharp
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I tested also. Without strip(), data step and sql is also matched.
The reason might be in data step there is no limitation 200 for cats(), but for SQL.
data have; array vars {100} $10 (100*'abc'); run; proc sql noprint; select name into :varlist separated by '||' from dictionary.columns where libname='WORK' and memname='HAVE' ; quit; proc sql; create table Has_Issues as select put(md5(&varlist),$hex32.) as DigestValue length=32 from have ; quit; data No_Issues (drop=vars:); set have; length DigestValue $32.; DigestValue=put(md5(&varlist),$hex32.); run;
Ksharp