BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
buechler66
Barite | Level 11

Hi.  I have two datasets of similar content.  One dataset has imb_code defined in Oracle as 31 char, the second dataset has imb_code defined as 93 char (but no instances longer than 31).  When I Proc Append these datasets I get the message that BASE is 31 and DATA is 124.  Where is this 124 coming from?  Clearly it's 31+93, but why is this failing my Proc Append?

 

702  proc append base=FinalData data=QueryData;
703  run;

NOTE: Appending WORK.QUERYDATA to WORK.FINALDATA.
WARNING: Variable IMB_CODE has different lengths on BASE and DATA files (BASE 31 DATA 124).
ERROR: No appending done because of anomalies listed above. Use FORCE option to append these files.
NOTE: 0 observations added.
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

The variable IMB_Code in the data set QueryData has an assigned length of 124 characters.

This may be longer than you expect if you are using some string functions that will combine the lengths of variables to ensure that the resutl is always long enough to hold the longest combination.

 

Run the following code and then examine the assigned lengths of the variabls Str and Str2 for a brief example of what might cause this.

data junk;
   length a b c $ 8.;
   a='1';
   b='3';
   c='5';
   str= a||b||c;
   str2= cats(a,b,c);
run;

Either run proc contents on data set Junk or open column properties in the SAS explorer.

 

View solution in original post

10 REPLIES 10
ballardw
Super User

The variable IMB_Code in the data set QueryData has an assigned length of 124 characters.

This may be longer than you expect if you are using some string functions that will combine the lengths of variables to ensure that the resutl is always long enough to hold the longest combination.

 

Run the following code and then examine the assigned lengths of the variabls Str and Str2 for a brief example of what might cause this.

data junk;
   length a b c $ 8.;
   a='1';
   b='3';
   c='5';
   str= a||b||c;
   str2= cats(a,b,c);
run;

Either run proc contents on data set Junk or open column properties in the SAS explorer.

 

buechler66
Barite | Level 11

So strange (see attachment).  Why does it do this?  Especially STR2.


Capture.PNG
FreelanceReinh
Jade | Level 19

@buechler66 wrote:

So strange (see attachment).  Why does it do this?  Especially STR2.


For STR the rationale is: The concatenation of three strings of length (up to) 8 can be up to 24 characters long. SAS does not check the actual lengths of the variables A, B, C in all observations (again, these could be billions). It simply takes and adds the defined lengths.

 

For STR2 the explanation can be found in section "Length of Returned Variable" of the documentation of the CATS function: "In a DATA step, if the CATS function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes."

buechler66
Barite | Level 11
Thanks so much.
paulkaefer
Lapis Lazuli | Level 10
The warning mentions that you have a variable IMB_CODE that has different lengths. The SAS documentation for this suggests using the FORCE option with PROC APPEND. Try this:

proc append base=FinalData data=QueryData FORCE;
run;
FreelanceReinh
Jade | Level 19

@buechler66 wrote:

(...) Where is this 124 coming from?  Clearly it's 31+93, but why is this failing my Proc Append?


  1. The reason for 124 must be found in the code creating dataset QueryData (or the dataset where IMB_CODE got length 124 for the first time).
  2. PROC APPEND (without FORCE option) fails, because SAS does not look through the observations (these could be billions!) to find out the actual lengths of the values of IMB_CODE. It just looks at the defined lengths of the variables in the meta data. In this case, the variable length in the second dataset is greater than the corresponding length in the base dataset, so that truncation could occur (depending on the unknown data). By using the FORCE option you confirm that you are aware of this risk and accept it.
buechler66
Barite | Level 11

Thanks for the details. Much appreciated.  But It's still confusing where the 124 comes from since imb_code is only defined as 31 in one table and as 93 in the other.

ballardw
Super User

@buechler66 wrote:

Thanks for the details. Much appreciated.  But It's still confusing where the 124 comes from since imb_code is only defined as 31 in one table and as 93 in the other.


Did you happen to note that 124 = 31 + 93? AND that 93 = 3*31?

I suspect some form of concatenating your 31 length variable, possible multiple SUBSTR.

In this code:

data junk;
   length a b c $ 5.;
   a='aaa';
   b='bbb';
   c='ccc';
   str= substr(a,1,1)||substr(b,1,1)||substr(b,1,1);
run;

Note that STR ends up with a length of 15 (changed to get away from common default length of character variables). You might assume that since you only pulled 1 character from each that the resulting variable would have a length of 3.

 

buechler66
Barite | Level 11
I wonder then what would happen if I instead just Set the datasets together rather then Proc Append them.
FreelanceReinh
Jade | Level 19

@buechler66 wrote:
I wonder then what would happen if I instead just Set the datasets together rather then Proc Append them.

Probably you would see this warning:

WARNING: Multiple lengths were specified for the variable IMB_CODE by input data set(s). This can cause truncation of data.

Nevertheless, the data step would concatenate the datasets using the defined length of IMB_CODE in the first dataset listed in the SET statement as the length of this variable in the output dataset, be it sufficient or not.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 10706 views
  • 5 likes
  • 4 in conversation