Hi. I have two datasets of similar content. One dataset has imb_code defined in Oracle as 31 char, the second dataset has imb_code defined as 93 char (but no instances longer than 31). When I Proc Append these datasets I get the message that BASE is 31 and DATA is 124. Where is this 124 coming from? Clearly it's 31+93, but why is this failing my Proc Append?
702 proc append base=FinalData data=QueryData;
703 run;
NOTE: Appending WORK.QUERYDATA to WORK.FINALDATA.
WARNING: Variable IMB_CODE has different lengths on BASE and DATA files (BASE 31 DATA 124).
ERROR: No appending done because of anomalies listed above. Use FORCE option to append these files.
NOTE: 0 observations added.
The variable IMB_Code in the data set QueryData has an assigned length of 124 characters.
This may be longer than you expect if you are using some string functions that will combine the lengths of variables to ensure that the resutl is always long enough to hold the longest combination.
Run the following code and then examine the assigned lengths of the variabls Str and Str2 for a brief example of what might cause this.
data junk;
length a b c $ 8.;
a='1';
b='3';
c='5';
str= a||b||c;
str2= cats(a,b,c);
run;
Either run proc contents on data set Junk or open column properties in the SAS explorer.
The variable IMB_Code in the data set QueryData has an assigned length of 124 characters.
This may be longer than you expect if you are using some string functions that will combine the lengths of variables to ensure that the resutl is always long enough to hold the longest combination.
Run the following code and then examine the assigned lengths of the variabls Str and Str2 for a brief example of what might cause this.
data junk;
length a b c $ 8.;
a='1';
b='3';
c='5';
str= a||b||c;
str2= cats(a,b,c);
run;
Either run proc contents on data set Junk or open column properties in the SAS explorer.
So strange (see attachment). Why does it do this? Especially STR2.
@buechler66 wrote:
So strange (see attachment). Why does it do this? Especially STR2.
For STR the rationale is: The concatenation of three strings of length (up to) 8 can be up to 24 characters long. SAS does not check the actual lengths of the variables A, B, C in all observations (again, these could be billions). It simply takes and adds the defined lengths.
For STR2 the explanation can be found in section "Length of Returned Variable" of the documentation of the CATS function: "In a DATA step, if the CATS function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes."
proc append base=FinalData data=QueryData FORCE;
run;
@buechler66 wrote:
(...) Where is this 124 coming from? Clearly it's 31+93, but why is this failing my Proc Append?
Thanks for the details. Much appreciated. But It's still confusing where the 124 comes from since imb_code is only defined as 31 in one table and as 93 in the other.
@buechler66 wrote:
Thanks for the details. Much appreciated. But It's still confusing where the 124 comes from since imb_code is only defined as 31 in one table and as 93 in the other.
Did you happen to note that 124 = 31 + 93? AND that 93 = 3*31?
I suspect some form of concatenating your 31 length variable, possible multiple SUBSTR.
In this code:
data junk;
length a b c $ 5.;
a='aaa';
b='bbb';
c='ccc';
str= substr(a,1,1)||substr(b,1,1)||substr(b,1,1);
run;
Note that STR ends up with a length of 15 (changed to get away from common default length of character variables). You might assume that since you only pulled 1 character from each that the resulting variable would have a length of 3.
@buechler66 wrote:
I wonder then what would happen if I instead just Set the datasets together rather then Proc Append them.
Probably you would see this warning:
WARNING: Multiple lengths were specified for the variable IMB_CODE by input data set(s). This can cause truncation of data.
Nevertheless, the data step would concatenate the datasets using the defined length of IMB_CODE in the first dataset listed in the SET statement as the length of this variable in the output dataset, be it sufficient or not.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.