DATA Step, Macro, Functions and more

Combining Files - setting length of variables

Reply
N/A
Posts: 0

Combining Files - setting length of variables

I am concatenating several datasets together. However, some of the character variables have different lengths and SAS determines the length of a variable based upon the first dataset in the SET statement. Therefore, data could be truncated.

What I want to do is set the length of all the character variables to 200 as a default. Any ideas how I can do this? I know I can use the LENGTH statement prior to the SET statement, but I’d have to name each variable and I don’t want to do that.

Thanks
Super Contributor
Super Contributor
Posts: 365

Re: Combining Files - setting length of variables

Posted in reply to deleted_user
Hello Liz,

You can use _CHARACTER_ in the LENGTH statement to get this:
[pre]
data o;
data i;
length _character_ $200;
run;
[/pre]
Sincerely,
SPR
SAS Super FREQ
Posts: 8,862

Re: Combining Files - setting length of variables

Hi:
LENGTH used in this fashion will give a WARNING:
[pre]
615 data newclass;
616 set sashelp.class;
617 length _character_ $200;
WARNING: Length of character variable Name has already been set.
Use the LENGTH statement as the very first statement in the DATA STEP to declare the
length of a character variable.
WARNING: Length of character variable Sex has already been set.
Use the LENGTH statement as the very first statement in the DATA STEP to declare the
length of a character variable.
618 run;
[/pre]

And, I do not believe that using _character_ in a LENGTH statement before the SET statement will work as desired...the names of the variables have to be used.

cynthia
Super Contributor
Super Contributor
Posts: 365

Re: Combining Files - setting length of variables

Posted in reply to Cynthia_sas
Hello Cynthia,

"And, I do not believe that using _character_ in a LENGTH statement before the SET statement will work as desired...the names of the variables have to be used."

You are 100% right: it does not work and indicates some issue in SAS.

Sincerely,
SPR
Valued Guide
Posts: 634

Re: Combining Files - setting length of variables

Posted in reply to deleted_user
You will first need to determine the names of the character variables. There are several ways to do this. The following uses an SQL step to build a macro variable which is then used in the length statement.
[pre]/*proc sql ;*/
/*describe table dictionary.columns;*/
/*select libname, memname, name, type*/
/* from dictionary.columns*/
/* where (libname='SASHELP' and memname='CLASS'); */
/*quit;*/
proc sql noprint;
select name
into :charlist separated by ' '
from dictionary.columns
where (libname='SASHELP' and memname='CLASS' and type='char');
quit;
data long;
length &charlist $200;
set sashelp.class;
run;
proc contents data=long;
run;[/pre]
N/A
Posts: 0

Re: Combining Files - setting length of variables

Posted in reply to deleted_user
An alternative is to capture the lengths of all of the character variables via PROC CONTENTS, then determine the maximum length of each and write out a subroutine that you then call before your SET statement. That way you optimize your dataset size yet don't truncate any character variables.


Let's say you have three datasets (creatively named A, B, and C).
1) Run a contents, but keep just the name, type, and length, stripping off just the character variables.
2) Sort each contents dataset, renaming the length field to a unique name.
3) Merge the contents together, and determine the maximum length.
4) Write out a subroutine (in this case, "charlen.sas")
5) %include the subroutine (I love doing this. Your data helps you out.)


*** STEPS 1 & 2 ***;
proc contents data=A noprint out=Acon(keep=name type length where=(type=2));
proc sort data=Acon(keep=name length) out=Acon2(rename=(length=lengthA)); by name; run;

proc contents data=B noprint out=Bcon(keep=name type length where=(type=2));
proc sort data=Bcon(keep=name length) out=Bcon2(rename=(length=lengthB)); by name; run;

proc contents data=C noprint out=Ccon(keep=name type length where=(type=2));
proc sort data=Ccon(keep=name length) out=Ccon2(rename=(length=lengthC)); by name; run;
/* repeat as necessary for your datasets */

*** STEP 3 ***;
filename charlen "C:\Documents and Settings\mystuff\charlen.sas";
data _null_; merge Acon2 Bcon2 Ccon2; by name;
maxlen=max(lengthA,lengthB,lengthC);

*** STEP 4 ***;
file charlen recfm=v;
put "length " name "$" maxlen +(-1) ";";
run;

*** STEP 5 ***;
data ABC;
%include charlen;
set A B C;
run;

Mike
Ask a Question
Discussion stats
  • 5 replies
  • 208 views
  • 0 likes
  • 4 in conversation