BookmarkSubscribeRSS Feed
hjjijkkl
Pyrite | Level 9

I have a data with 100 + variables and each variable have a very long character values. How can I increase the length of all variables all together ( I want to have the max length for all) instead of changing the length of the variables one by one. How can I do that in data step?

7 REPLIES 7
PaigeMiller
Diamond | Level 26

Example:

data want;
    length charvar1-charvar100 $ 6000;
    set have;
run;
--
Paige Miller
hjjijkkl
Pyrite | Level 9

do the "charvar1-charvar100" represent the first and last variables in the data? Am asking this b/c i am getting  error when i use the first and last variables in the data. Does it need to have 1 - 100 at the end of the variable names as well?

charvar1-charvar100

 this is the error I am getting in the log

 

ERROR: Variable referral1 cannot be found on the list of previously defined variables.

PaigeMiller
Diamond | Level 26

@hjjijkkl wrote:

do the "charvar1-charvar100" represent the first and last variables in the data? Am asking this b/c i am getting  error when i use the first and last variables in the data. Does it need to have 1 - 100 at the end of the variable names as well?

charvar1-charvar100

 this is the error I am getting in the log

 

ERROR: Variable referral1 cannot be found on the list of previously defined variables.


You never gave specific information, so I can't be specific and thus I used generic variable names. If you want specific code for your specific case, you have to give specific variable names. Important rule: the more information you give us about the problem, the more likely our solution will fit. And the converse is true as well.

 

If your variable names are

 

dog gorilla giraffe rabbit chicken horta buffalo

then you have to use those names in the LENGTH statement.

--
Paige Miller
jklaverstijn
Rhodochrosite | Level 12

That will work. Note that the dataset will have to be recreated; you cannot change the length of variables without that.

 

Also, the max length of a char is $32767 and not 6000. If you have 100+ of those the disk space requirements can rapidly explode and burden your I/O subsystem. It may become unwieldy. Consider adding COMPRESS=YES to deal with some of the downsides of a large amount of long characters, especially if you have a fair percentage of missing values.

 

Another approach could be to reconsider the data modelling. A table with a large amount of character variables can often be transposed into a version with one long char plus an identifying variable. Of course that depends entirely on the case at hand.

 

Hope this helps,

-- Jan.

hjjijkkl
Pyrite | Level 9
I dont want to change the length of the variables. I want to change the length of the character within the variables.
Can you please provide me an example ?
Tom
Super User Tom
Super User

@hjjijkkl wrote:
I dont want to change the length of the variables. I want to change the length of the character within the variables.
Can you please provide me an example ?

That make no sense.  SAS has two types of variables, floating point numbers and fixed length character strings.

The length of character variable is the maximum number of bytes that can be stored in the variable. (Note with non-single byte encodings like UTF-8 some "characters" might take more than one byte to store).

 

If you want to replace individual characters, like 'X', with something longer, like 'ABC', then use the TRANWRD() function.

If you want to make that change to every character variable in the dataset then use an array.

data want;
  set have;
  array _C _character_;
  do index=1 to dim(_C);
     _c[index] = tranwrd(_c[index],'X','ABC');
  end;
run;

But note that this could make the value too long to fit into the original maximum number of bytes that the variable was defined to allow.  

 

Which is back to the original answers you got when the question was interpeted as how the change definition of the variable.

 

To change the length of all of the character variables in an existing dataset you will want to re-define the variable BEFORE SAS looks at the variable's definition in the source dataset.  So set hte length BEFORE the SET statement.  This means you need to know the NAMES of the variables you want to change.  You could query the metadata of the existing dataset to see the names of the character variables. And then use that information to generate a LENGTH statement.  Here is one way. In this version any character variable will be defined with a length of 6000 and any numeric variable defined as length 8 (SAS floating point numbers use 8 bytes).

proc contents data=have out=contents noprint; run;
proc sort data=contents; by varnum; run;

filename length temp;
data _null_;
  length name $80;
  set contents end=eof;
  file length;
  if _n_=1 then put 'length';
  name=nliteral(name);
  put name @;
  if type=1 then put length ;
  else put '$6000' ;
  if eof then put ';';
run;

Now just run a data step to copy the data and include that LENGTH statement before the SET statement.  Make sure to remove an $xxx formats attached to the variables (SAS has a bad habit of attaching these when you read data from external databases).

data want;
%Include length / source2;
  set have;
  format _character_ ;
run;
ballardw
Super User

@hjjijkkl wrote:
I dont want to change the length of the variables. I want to change the length of the character within the variables.
Can you please provide me an example ?

You need to provide the example as what you ask can be interpreted several ways. One is to change font size so individual letters occupy more space when seen: "W" vs "W" for instance. Fonts however are not properties of data set variables.

Another would be to change language encoding from a single byte system such as ASCII where you have a numeric value of up to 255 (2 to the 8th power) or one byte of storage or to a language character set that uses a Double Byte character set because there are to many characters to be represented by one byte of storage which is related to many issues of National Language Support and file encoding.

Or just replacing an "A" with "ABC".

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 11823 views
  • 1 like
  • 5 in conversation