BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kastchei
Pyrite | Level 9

Hi all,

 

I have two datasets which the same variables but differing lengths.  I am using a set statement to combine them.  Is there any way for SAS to pick the maximum length for each variable in the new dataset rather than just picking the length of the variable from the first set dataset listed?  There are numerous variables, and the lengths will not be consistent nor one set consistently longer than the others.

 

For instance, on one day, the length of ID might be 6 in the first dataset and 8 in the second, but on the next day it might be 9 in the first dataset and 5 in the second.

 

I would like to avoid having to scan the datasets each time and pick out the maximum.  Is there any option in the set statement to do this?

 

Thanks!

Michael

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

If both tables have the same variables then a Proc SQL Union does what you're after.

data have1;
  set sashelp.class;
run;

data have2;
  length sex $5;
  set sashelp.class;
run;

proc sql;
  create table want as
  select * from have1
  union corr all
  select * from have2
  ;
quit;

proc contents data=want;
run;

If that's an ongoing issue for a defined table structure then another option would be to create an empty "mapping" dataset with defined variable attributes (so with all the max. lengths you can reasonably expect) and then use this mapping dataset first in a set statement.

data class_mapping;
  length sex $5;
  set sashelp.class;
  stop;
run;

data want2;
  set class_mapping sashelp.class;
run;

proc contents data=want;
run;

View solution in original post

6 REPLIES 6
Reeza
Super User

I don't think so.

Is it possible to standardize your process ahead of time, ensure the data is read in a standard manner?

Or create a master dataset with the maximum lengths and variables definitions that are standard? Then you can use that each time as the 'master' data and it will set the lengths.

art297
Opal | Level 21

Not an option, but you could write some code (e.g., using proc sql to get the max length for each variable by comparing the values in dictionary.columns and have proc sql create a macro variable that includes entries like "length variablename x;"), run that code before running your current code, and include the macro variable before your set statement.

 

Art, CEO, AnalystFinder.com

 

ballardw
Super User

The approach to expand slightly on @art297's is to include one or more Length statements before the SET statement with the desired length.

Data want;

    length thisvar $ 25;

    set data1 data2;

run;

would set the length of the variable Thisvar to 25 characters regardless of which set.

 

If you are going to be combining these data sets, or others similar then I suggest going back further into your process and determine why these variables are different lengths, especially the data structure and contents is otherwise the same.

 

If you are using data that is brought into SAS using Proc Import, a frequent cause of mismatched length, then you might consider a different approach for bringing the data into SAS. Another method would be to create a data set with the appropriate characteristics for all of the variables involved that has no observations. Then make that the first data set listed in the datasets on the set statement.

Patrick
Opal | Level 21

If both tables have the same variables then a Proc SQL Union does what you're after.

data have1;
  set sashelp.class;
run;

data have2;
  length sex $5;
  set sashelp.class;
run;

proc sql;
  create table want as
  select * from have1
  union corr all
  select * from have2
  ;
quit;

proc contents data=want;
run;

If that's an ongoing issue for a defined table structure then another option would be to create an empty "mapping" dataset with defined variable attributes (so with all the max. lengths you can reasonably expect) and then use this mapping dataset first in a set statement.

data class_mapping;
  length sex $5;
  set sashelp.class;
  stop;
run;

data want2;
  set class_mapping sashelp.class;
run;

proc contents data=want;
run;
Kastchei
Pyrite | Level 9

Thanks everyone.  I like this SQL solution as it's simple code, should work fairly quickly, and doesn't require any assumptions.

 

Yes, unfortunately, I am not permitted to standardize the inputs (clueless bosses...).

Kurt_Bremser
Super User

You are dealing with symptoms, but not with the root cause. The root cause is a flawed import process, probably using proc import.

Change to explicitly written data steps to read external data, and use the description provided by the data provider(s) as a base.

The improperly imported datasets can be fixed by one of the methods suggested.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 7938 views
  • 13 likes
  • 6 in conversation