set multiple datasets with different variable lengths

Contributor
Posts: 36

Hi,

I have two datasets which both contain a 'Rec_Name' variable. In the first dataset Rec_Name has length $17. and in the second dataset it has length $100. When I combine both datasets with a SET statement, the Rec_Name values from the second dataset are truncated. I already specified

length Rec_Name $100.;

before the SET statement.

The first dataset, data1, looks like

Rec_Name

a0Oa000000Eto9m
a0Oa000000MCQQF
a0Oa000000JbpAi

The second dataset, data2, looks like

Rec_Name

abcde Needed BPO Obj
Copyright (c) 2000-2014 xxx, inc. All rights reserved.
Confidential Information - Do Not Distribute

My code is

data data3;
     length Rec_Name $100.;
     set data1 data2;
run;

The result, data3, looks like

a0Oa000000Eto9m
a0Oa000000MCQQF
a0Oa000000JbpAi

abcde Needed BPO
Copyright (c) 200
Confidential Info

Does anybody know what the problem is? Thanks.

Trusted Advisor
Posts: 1,228

Re: set multiple datasets with different variable lengths

data data1;
input Rec_Name $17.;
datalines;
a0Oa000000Eto9m
a0Oa000000MCQQF
a0Oa000000JbpAi
;

data data2;
infile datalines truncover;
input Rec_Name $60.;
datalines;
abcde Needed BPO Obj
Copyright (c) 2000-2014 xxx, inc. All rights reserved.
Confidential Information - Do Not Distribute
;

data data3;
     length Rec_Name $100.;
     set data1 data2;
run;

Super User
Posts: 7,038

Re: set multiple datasets with different variable lengths

Most likely the data is there, but you have a format attached to the variable. Here is a simple example to demonstrate the issue.

data one ;
  attrib x length=$5 format=$5.;
  x='abcde';
run;

data two;
  attrib x length=$10 format=$10.;
  x='1234567890';
run;

data both ;
  length x $10;
  set one two;
  put x ;
run;

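What is happening is that the format is set by the first non-blank format that the data step sees for the variable. Here that is the $5. format from ONE, so the values from TWO are displayed with only 5 characters even though the variable is stored with length 10.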
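
One way to check this, as a rough sketch that assumes the datasets ONE, TWO and BOTH from the example above exist in WORK: look at the stored attributes, then print the value with an explicit format that overrides the attached one. The variable LEN is only introduced here for illustration.

```sas
/* Show the stored length and the attached format of X in BOTH */
proc contents data=both;
run;

/* Print X with an explicit $10. format instead of the attached one.
   LENGTHN gives the number of characters stored, ignoring trailing blanks. */
data _null_;
  set both;
  len = lengthn(x);
  put x= $10. len=;
run;
```

If the full value shows up in the log here, the data was never truncated and only the display format is at fault.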

The simplest solution is to add a FORMAT statement AFTER the SET statement to remove the unwanted format from the variable.

data both ;
  length x $10;
  set one two;
  format x;
  put x ;
run;
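
Applied to the original question, the same idea might look like the sketch below. It assumes that Rec_Name in data1 carries a $17. format, which has not been confirmed in this thread, and that the datasets live in WORK.

```sas
/* Rebuild data3, removing whatever format Rec_Name inherits from data1 */
data data3;
  length Rec_Name $100;   /* stored length, set before SET */
  set data1 data2;
  format Rec_Name;        /* no format name given: removes the attached format */
run;

/* Alternative: strip the format from an existing data3 without rereading the data */
proc datasets lib=work nolist;
  modify data3;
  format Rec_Name;
quit;
```

The PROC DATASETS route only changes the variable's attributes, so it helps only if the length of Rec_Name in data3 is already $100.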

This type of problem is why people should not be using a FORMAT statement as a substitute for a LENGTH statement to define their variables. It is also why SAS should NOT be automatically attaching $xxx. formats to character variables, as it currently does with PROC IMPORT and SAS/Access to XXXX.  There is almost no situation where the function of SAS is improved by having $ formats permanently attached to character variables.  It is much more likely to result in this type of issue.

Super User
Posts: 10,018

Re: set multiple datasets with different variable lengths

Try swapping the order of the two tables in the SET statement, or use PROC SQL, which gives you everything.

data want;
set second first ;
run;

proc sql;
create table want as
select * from first
union all corr
select * from second
;
quit;

Xia Keshan