BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Emma_at_SAS
Lapis Lazuli | Level 10

I want to merge multiple datasets with multiple variables but the length of the character variables are different in each dataset. I created an example that works but I want to know how I can improve it to make it easier to apply when the number of datasets is 5-10 with 20 variables to check for length.

Thanks for your thoughts

data d1;
	length var1 $10. var2 $10.;
	input var1 var2;
	cards;
apple carrots
orange cabbage
;

data d2;
	length var1 $30. var2 $30.;
	input var1 var2;
	datalines;
apple-pie-recipe  carrot-soup
orange-juice-for-breakfast cabbage-soup-for-supper
;

data d1_i (drop= var1_old var2_old);
	set d1(rename= (var1=var1_old var2=var2_old));
	length var1 $30.;
	var1=var1_old;
	length var2 $30.;
	var2=var2_old;
	run;

data want;
	set d1_i d2;
	run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

The best way is using PROC SQL.

 

data d1;
	length var1 $10. var2 $10.;
	input var1 var2;
	cards;
apple carrots
orange cabbage
;

data d2;
	length var1 $30. var2 $30.;
	input var1 var2;
	datalines;
apple-pie-recipe  carrot-soup
orange-juice-for-breakfast cabbage-soup-for-supper
;

proc sql;
create table want as
select * from d1 
union all corr
select * from d2;
quit;

View solution in original post

15 REPLIES 15
Tom
Super User Tom
Super User

Find the maximum lengths and just add a length statement before the SET.

So something like:

data want;
  length var1 $30 var2 $30 ;
  set d1 d2;
run;

You can generate that by querying the metadata for the source tables.

proc sql noprint;
select distinct catx(' ',name,cats('$',length))
  into :lengths separated by ' '
from
  (select upcase(name) as unique_name,min(name) as name,max(length) as length
  from dictionary.columns
  where libname='WORK' 
    and memname in ('D1' 'D2')
    and type='char'
  group by unique_name
  )
;
%let nchar=&sqlobs;
quit;

And then you can add that LENGTH statement to your step that combines the dataset.

You might also want to remove any formats that are attached to prevent things like a variable with length 200 using a format of only $30. which will cause the values to appear truncated even when they are not.

data want;
%if &nchar %then %do;
  length &lengths;
  format _character_ ;
%end;
  set d1 d2;
run;

PS  You don't need to include a period when specifying the length of a variable. Variables can only have integer lengths.

 

Tom
Super User Tom
Super User

It would probably be better to create the original dataset with consistent lengths for the variables.

What is the source of those dataset and why does the same variable end up having a different length in different datasets?

Emma_at_SAS
Lapis Lazuli | Level 10
I agree! The datasets were created from different sources and by different individuals. then when I imported them to SAS from SPSS/EXCELL the length is assigned by SAS. Is there any way I can prevent this in the future? Thanks
Tom
Super User Tom
Super User

PROC IMPORT of an SPSS files should replicate the structure in the SPSS file pretty well.

But PROC IMPORT of an EXCEL file cannot since there is no concept of a variable in EXCEL.  Every cell can be totally independent from each other cell.

 

If you are getting files from EXCEL you will have more control if they files are delivered as delimited text files.  Then you can write your own data step in SAS to read file so that you have full control over the type, length, name and other attributes of the variables.

Emma_at_SAS
Lapis Lazuli | Level 10
Thank you KurtBremser. I changed the title so that is right for the future users.
Ksharp
Super User

The best way is using PROC SQL.

 

data d1;
	length var1 $10. var2 $10.;
	input var1 var2;
	cards;
apple carrots
orange cabbage
;

data d2;
	length var1 $30. var2 $30.;
	input var1 var2;
	datalines;
apple-pie-recipe  carrot-soup
orange-juice-for-breakfast cabbage-soup-for-supper
;

proc sql;
create table want as
select * from d1 
union all corr
select * from d2;
quit;
Emma_at_SAS
Lapis Lazuli | Level 10

Woe, thank you @Ksharp ! I am glad I asked. It is much easier than changing the length one by one!

Am I right that 

union all corr

does the job of increasing the length to the longer one automatically? Thanks!

Emma_at_SAS
Lapis Lazuli | Level 10

@Ksharp May I please ask one more question. How do I modify the code to stack 5 datasets? I tried adding lines like

select * from d3;
select * from d4;
select * from d5;

but that did not work. Any hint to avoid adding datasets altogether instead of one at a time? Thanks!

Ksharp
Super User
Stack these dataset by UNION operator.

proc sql;
create table want as
select * from d1
union all corr
select * from d2
union all corr
select * from d3
union all corr
select * from d4
union all corr
select * from d5
;
quit;
Emma_at_SAS
Lapis Lazuli | Level 10

@Ksharp I hope this is the last question. How can I keep the variables in each dataset that do not match (other than var1 and var2) but I want to keep in the stacked dataset? Thanks!!!

data d1;
	length var1 $10. var2 $10. var3 8;
	input var1 var2 var3;
	cards;
apple carrots 99
orange cabbage 88
;

data d2;
	length var1 $30. var2 $30. var4 8 var5 8;
	input var1 var2 var4 var5;
	datalines;
apple-pie-recipe  carrot-soup 100 111
orange-juice-for-breakfast cabbage-soup-for-supper 200 222
;

proc sql;
create table want as
select * from d1 
union all corr
select * from d2;
quit;

 

 

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 15 replies
  • 2881 views
  • 9 likes
  • 4 in conversation