An alternative is to capture the lengths of all the character variables with PROC CONTENTS, determine the maximum length of each, and write out a subroutine (a file of LENGTH statements) that you %include before your SET statement. That way you keep the dataset as small as possible without truncating any character variables.
Let's say you have three datasets (creatively named A, B, and C).
1) Run PROC CONTENTS on each dataset, keeping just the name, type, and length, and only for the character variables (type=2).
2) Sort each contents dataset, renaming the length field to a unique name.
3) Merge the contents together, and determine the maximum length.
4) Write out a subroutine (in this case, "charlen.sas") holding one LENGTH statement per character variable.
5) %include the subroutine (I love doing this. Your data helps you out.)
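To make step 4 concrete: charlen.sas ends up holding one LENGTH statement per character variable, in name order. Assuming two made-up variables, CITY (longest at 25 in one of the datasets) and STATE (2 everywhere), the generated file would look something like this:

```sas
/* hypothetical contents of charlen.sas; variable names and lengths are made up */
length CITY $25;
length STATE $2;
```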
*** STEPS 1 & 2 ***;
proc contents data=A noprint out=Acon(keep=name type length where=(type=2)); run;
proc sort data=Acon(keep=name length) out=Acon2(rename=(length=lengthA)); by name; run;
proc contents data=B noprint out=Bcon(keep=name type length where=(type=2)); run;
proc sort data=Bcon(keep=name length) out=Bcon2(rename=(length=lengthB)); by name; run;
proc contents data=C noprint out=Ccon(keep=name type length where=(type=2)); run;
proc sort data=Ccon(keep=name length) out=Ccon2(rename=(length=lengthC)); by name; run;
/* repeat as necessary for your datasets */
*** STEP 3 ***;
filename charlen "C:\Documents and Settings\mystuff\charlen.sas";
data _null_; merge Acon2 Bcon2 Ccon2; by name;
/* MAX ignores missing values, so a variable absent from one dataset is not a problem */
maxlen=max(lengthA,lengthB,lengthC);
*** STEP 4 ***;
file charlen recfm=v;
/* +(-1) backs up over the blank that list output writes after maxlen */
put "length " name "$" maxlen +(-1) ";";
run;
*** STEP 5 ***;
data ABC;
%include charlen;
set A B C;
run;
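If running PROC CONTENTS once per dataset gets tedious, steps 1 through 4 can probably be collapsed into one query against DICTIONARY.COLUMNS. This is only a sketch under a few assumptions: A, B, and C live in the WORK library, each variable name is spelled with the same case in every dataset (the merge above makes the same assumption), and the table name maxlens is made up for illustration.

```sas
/* Sketch only: assumes A, B, and C are in WORK; maxlens is a hypothetical table name */
filename charlen "C:\Documents and Settings\mystuff\charlen.sas";

proc sql noprint;
  create table maxlens as
  select name, max(length) as maxlen
  from dictionary.columns
  where libname = 'WORK'            /* libname and memname are stored in uppercase */
    and memname in ('A', 'B', 'C')
    and type = 'char'               /* character variables only */
  group by name;
quit;

/* Write one LENGTH statement per character variable, as in step 4 */
data _null_;
  set maxlens;
  file charlen;
  put "length " name "$" maxlen +(-1) ";";
run;
```

Step 5 stays the same: %include charlen ahead of the SET statement.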
Mike