DATA Step, Macro, Functions and more

Choose maximum variable length with set

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 80
Accepted Solution

Choose maximum variable length with set

Hi all,

 

I have two datasets which the same variables but differing lengths.  I am using a set statement to combine them.  Is there any way for SAS to pick the maximum length for each variable in the new dataset rather than just picking the length of the variable from the first set dataset listed?  There are numerous variables, and the lengths will not be consistent nor one set consistently longer than the others.

 

For instance, on one day, the length of ID might be 6 in the first dataset and 8 in the second, but on the next day it might be 9 in the first dataset and 5 in the second.

 

I would like to avoid having to scan the datasets each time and pick out the maximum.  Is there any option in the set statement to do this?

 

Thanks!

Michael


Accepted Solutions
Solution
‎01-18-2017 04:25 PM
Respected Advisor
Posts: 3,887

Re: Choose maximum variable length with set

If both tables have the same variables then a Proc SQL Union does what you're after.

data have1;
  set sashelp.class;
run;

data have2;
  length sex $5;
  set sashelp.class;
run;

proc sql;
  create table want as
  select * from have1
  union corr all
  select * from have2
  ;
quit;

proc contents data=want;
run;

If that's an ongoing issue for a defined table structure then another option would be to create an empty "mapping" dataset with defined variable attributes (so with all the max. lengths you can reasonably expect) and then use this mapping dataset first in a set statement.

data class_mapping;
  length sex $5;
  set sashelp.class;
  stop;
run;

data want2;
  set class_mapping sashelp.class;
run;

proc contents data=want;
run;

View solution in original post


All Replies
Super User
Posts: 17,784

Re: Choose maximum variable length with set

I don't think so.

Is it possible to standardize your process ahead of time, ensure the data is read in a standard manner?

Or create a master dataset with the maximum lengths and variables definitions that are standard? Then you can use that each time as the 'master' data and it will set the lengths.

PROC Star
Posts: 7,360

Re: Choose maximum variable length with set

Not an option, but you could write some code (e.g., using proc sql to get the max length for each variable by comparing the values in dictionary.columns and have proc sql create a macro variable that includes entries like "length variablename x;"), run that code before running your current code, and include the macro variable before your set statement.

 

Art, CEO, AnalystFinder.com

 

Super User
Posts: 10,483

Re: Choose maximum variable length with set

The approach to expand slightly on @art297's is to include one or more Length statements before the SET statement with the desired length.

Data want;

    length thisvar $ 25;

    set data1 data2;

run;

would set the length of the variable Thisvar to 25 characters regardless of which set.

 

If you are going to be combining these data sets, or others similar then I suggest going back further into your process and determine why these variables are different lengths, especially the data structure and contents is otherwise the same.

 

If you are using data that is brought into SAS using Proc Import, a frequent cause of mismatched length, then you might consider a different approach for bringing the data into SAS. Another method would be to create a data set with the appropriate characteristics for all of the variables involved that has no observations. Then make that the first data set listed in the datasets on the set statement.

Solution
‎01-18-2017 04:25 PM
Respected Advisor
Posts: 3,887

Re: Choose maximum variable length with set

If both tables have the same variables then a Proc SQL Union does what you're after.

data have1;
  set sashelp.class;
run;

data have2;
  length sex $5;
  set sashelp.class;
run;

proc sql;
  create table want as
  select * from have1
  union corr all
  select * from have2
  ;
quit;

proc contents data=want;
run;

If that's an ongoing issue for a defined table structure then another option would be to create an empty "mapping" dataset with defined variable attributes (so with all the max. lengths you can reasonably expect) and then use this mapping dataset first in a set statement.

data class_mapping;
  length sex $5;
  set sashelp.class;
  stop;
run;

data want2;
  set class_mapping sashelp.class;
run;

proc contents data=want;
run;
Frequent Contributor
Posts: 80

Re: Choose maximum variable length with set

Thanks everyone.  I like this SQL solution as it's simple code, should work fairly quickly, and doesn't require any assumptions.

 

Yes, unfortunately, I am not permitted to standardize the inputs (clueless bosses...).

Super User
Posts: 6,928

Re: Choose maximum variable length with set

You are dealing with symptoms, but not with the root cause. The root cause is a flawed import process, probably using proc import.

Change to explicitly written data steps to read external data, and use the description provided by the data provider(s) as a base.

The improperly imported datasets can be fixed by one of the methods suggested.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 612 views
  • 12 likes
  • 6 in conversation