deleted_user
Not applicable
Hi,

I have two CHARACTER variables, say A and B, in two datasets, say 1 and 2. The lengths of A and B are 4 and 5 respectively in dataset 1, and 5 and 4 respectively in dataset 2. I use this code:

Data combine;
Length A $5;
Length B $5;
Set data1 data2;
Run;

But then the length of variable A gets reduced to 4, because data1 comes first in the SET statement. I want both variables to have length 5. I don't know what code will do this.
sbb
Lapis Lazuli | Level 10
What you have explained is not logically possible with a SAS CHARACTER variable because the LENGTH statement, when it precedes a SET statement, dictates the output variable length on WORK.COMBINE for A and B as CHARACTER, length 5.

You will need to share more information, such as an actual SAS log and a PROC CONTENTS listing that demonstrates the behavior you describe.
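A minimal sketch of such a check follows. The datasets data1 and data2 here are hypothetical stand-ins built with explicit lengths to match the question; they are not the poster's real data.

```sas
/* Hypothetical test data: A is $4 and B is $5 in data1, reversed in data2 */
data data1;
   length A $4 B $5;
   A = 'shiv';  B = 'ramas';
run;

data data2;
   length A $5 B $4;
   A = 'Ravis'; B = 'Rani';
run;

/* LENGTH precedes SET, so it fixes both variables at $5 */
data combine;
   length A $5 B $5;
   set data1 data2;
run;

/* List the stored attributes (type, length, format, informat) */
proc contents data=combine;
run;
```

The PROC CONTENTS listing reports the stored length of each variable, which is the attribute in question here, separately from any format or informat.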

Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
I tried this:--

data data1_OUT;
input a $ 4. b $ 5.;
cards;
shiv ramas
;
run;


data data2_IN;
input a $ 5. b $ 4.;
cards;
Ravis Rani
;
run;


Data combine;
Length A $5;
Length B $5;
Set data1_Out data2_IN;
Run;

Proc print data=combine;
run;


The output that I want is

A B
shiv ramas
Ravis Rani

but this code does not give that

Kindly suggest
data_null__
Jade | Level 19
Your problem is with the INPUT statements not doing what you think.

You need to RTM.

You probably wanted to use list input, so put a colon (:) before the $ sign in the INPUT statements.
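As a sketch of that suggestion, here are the two example steps from the previous post rewritten with the colon modifier. Only the INPUT statements differ from the original code.

```sas
/* The colon modifier reads each value up to the next blank,
   then applies the informat, so fields no longer run together */
data data1_OUT;
   input a :$4. b :$5.;
   cards;
shiv ramas
;
run;

data data2_IN;
   input a :$5. b :$4.;
   cards;
Ravis Rani
;
run;
```

Without the colon, `$ 4.` followed by `$ 5.` is formatted input: the second field starts reading at column 5, which is the blank between the two values, so the last character of the second value is lost.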
deleted_user
Not applicable
Hello,

The datasets 1 & 2 are just examples. I didn't actually use an INPUT statement for either of these datasets, as they were imported from Excel.

Kind Regards,
Kriti
sbb
Lapis Lazuli | Level 10
I recommend some desk-checking of your own with PROC CONTENTS on your output file. If you then see a variable truncated to something other than what's coded in the LENGTH statement (which must occur before the SET), come back to the forum with a reply for further feedback.

For what it's worth, PROC IMPORT generates a DATA step with an INPUT statement, so you likely had data rows in the beginning of your file(s) that were inconsistent on the max field length. Suggest you review the SAS-generated log for each of your IMPORT operations and inspect the DATA step / INPUT statement results.

Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
Hello Scott,

I changed the length of the variables from 4 to 5 by creating two new datasets from the existing ones (dataset one had variable A of length 4 and variable B of length 5). I used this code:

data one_n;
length A $5;
set one;
run;

and

data two_n;
length B $5;
set two;
run;

Then I appended these new datasets as:

data comb;
set one_n two_n;
run;

This last code also truncates variable B's length to 4.

Will I need to change the format of these variables?

Kind Regards
polingjw
Quartz | Level 8
How are you determining that the "length" of B is four?

This is just a guess, but maybe you successfully changed the length of the variable but left the format unchanged. Try adding a couple of format statements to the code and see if anything changes. For example:

data comb;
set one_n two_n;
format A B $5.;
run;
sbb
Lapis Lazuli | Level 10
SAS treats a CHARACTER variable's LENGTH differently from its assigned INFORMAT and FORMAT. Within a DATA step, the length is fixed by the first-encountered attribute for a named variable, either from a LENGTH statement or from a SET statement (the first SAS member's attributes are used). A FORMAT or INFORMAT that you do not code yourself is likewise inherited from the first-encountered source, but SAS will honor the assigned attribute when you code a FORMAT/INFORMAT after a SET statement.

You should be able to define a couple of simple DATA steps with multiple files and a single variable, with LENGTH, FORMAT and INFORMAT statements at varying points in the program, and then analyze the results of PROC CONTENTS to learn about the SAS system's behavior.

So, the answer to your question - yes, if you do not want the "first occurring" FORMAT/INFORMAT to be associated with a new file, you will need to re-declare a FORMAT/INFORMAT statement after the SET in the DATA step that concatenates your files.
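A small sketch of that re-declaration is below. The contents of one_n and two_n are hypothetical: both variables already have length 5, but each dataset carries a 4-wide format on one of them, which is the situation guessed at earlier in the thread.

```sas
/* Hypothetical inputs: lengths already match, but the formats differ */
data one_n;
   length A $5 B $5;
   format A $4. B $5.;
   A = 'Ravis'; B = 'ramas';
run;

data two_n;
   length A $5 B $5;
   format A $5. B $4.;
   A = 'shiv';  B = 'Rani';
run;

/* FORMAT and INFORMAT after SET replace the attributes inherited
   from one_n, the first dataset listed */
data comb;
   set one_n two_n;
   format A B $5.;
   informat A B $5.;
run;

proc contents data=comb;
run;
```

A `$4.` format on a length-5 variable displays only four characters in PROC PRINT even though all five are stored, so a value can look truncated when it is not.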

Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
Hello Scott,

Just to continue my last post: when I created the two new datasets "one_n" & "two_n", the lengths of the variables did change from 4 to 5 (for both A & B), but the format and informat did not. That is, the format and informat of A & B still differ between the new datasets.

Do I need to change the formats of A & B and make them the same in order to use SET on the new datasets?

Kind Regards.
stateworker
Fluorite | Level 6
Not the most efficient, but this could work for you... basically set up two new variables and specify the format and length of them to $5 like you want. Then you're copying var1 and var2 into them. The second data step is just to get your new variables named back to your original variables so you can continue working with your data - if that matters to you that they are named the same as originally.

Data combine (drop var1 var2 ) ;
Set data1 data2;
format var3 var4 $5.;
var3=var1;
var4=var2;
Run;
data combine;
set combine (rename=(var3=var1 var4=var2));
run;

(I didn't test this, but I've had to do this sort of thing before because I couldn't figure out a better way to do it.)
Ksharp
Super User
Hi.
First, you should understand that once a character variable enters the PDV, its length is fixed (i.e. you can't modify the length afterwards, even with a LENGTH statement).
So a SET statement that lists data1 first means the length of A is 4 and the length of B is 5 (i.e. the lengths in data1). You can't change them after that point.

You can rename A in data1 and make a new variable A in data1, such as:

[pre]
data data1;
set data1(rename=(A=_A));
length A $ 5;
A=_A;
drop _A;
run;

data whole;
set data1 data2;
run;
[/pre]

or

[pre]
Data combine;
length A $ 5;
Set data1_Out data2_IN;
Run;
[/pre]
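To see why the position of the LENGTH statement matters, here is a rough side-by-side sketch. It assumes data1_Out and data2_IN exist as created earlier in the thread; the output name too_late is made up for illustration.

```sas
/* LENGTH after SET: A is already in the PDV with the length stored in
   data1_Out, so this request comes too late and A stays at $4 */
data too_late;
   set data1_Out data2_IN;
   length A $5;
run;

/* LENGTH before SET: A enters the PDV at $5 */
data combine;
   length A $5;
   set data1_Out data2_IN;
run;

/* Compare the stored length of A in the two results */
proc contents data=too_late; run;
proc contents data=combine;  run;
```

In the first step SAS typically also writes a warning to the log saying that the length of the character variable has already been set.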




Ksharp

Message was edited by: Ksharp
