BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Uknown_user
Quartz | Level 8

I am trying to conditionally change the length of any varible that exceeds 50 characters. Is it possible to make this work?:

 

data demo;
length a $100;
a = '1111';
b = '1111111111';
run;

data reformat;
set demo;
if length _all_ > $50 then length = $10;
run;

Thanks for any suggestion!

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

You cannot change the length of a variable inside a data step. The length of variables is set when the step is compiled, and stays fixed throughout.

You need to determine the new length beforehand, and set it before it is defined by the contributing dataset.

EG

proc sql;
select max(length(var)) into :maxlen from have;
quit;

data want;
length var &maxlen.;
set have;
run;

/* or */

data want;
set have (rename=(var=_var));
length var &maxlen.;
var = _var;
drop _var;
run;

View solution in original post

10 REPLIES 10
Kurt_Bremser
Super User

You cannot change the length of a variable inside a data step. The length of variables is set when the step is compiled, and stays fixed throughout.

You need to determine the new length beforehand, and set it before it is defined by the contributing dataset.

EG

proc sql;
select max(length(var)) into :maxlen from have;
quit;

data want;
length var &maxlen.;
set have;
run;

/* or */

data want;
set have (rename=(var=_var));
length var &maxlen.;
var = _var;
drop _var;
run;
Shmuel
Garnet | Level 18

If you are woried from disk usage - you can add either to the configuration file or to the autoexec file 

or before any relevant step the: OPTION Compress=yes;

Uknown_user
Quartz | Level 8

I tried to use this solution OPTION Compress=yes; but it did not work. My issue is that the procedure I am executing fails because some variables are kept with $1500 length thought.

 

If I add this option right before the second data step, it still does not work and the variable is not reformated:

 

data demo;
length a $100;
a = '1111';
b = '1111111111';
run;

OPTION COMPRESS=yes;
data reformat;
set demo;
if length _all_ > $50 then length = $10;
run;
Kurt_Bremser
Super User

@Uknown_user wrote:

I tried to use this solution OPTION Compress=yes; but it did not work. My issue is that the procedure I am executing fails because some variables are kept with $1500 length thought.

 

If I add this option right before the second data step, it still does not work and the variable is not reformated:

 

data demo;
length a $100;
a = '1111';
b = '1111111111';
run;

OPTION COMPRESS=yes;
data reformat;
set demo;
if length _all_ > $50 then length = $10;
run;

See my other post. Changing the length of a variable like this is NOT possible.

length is a declarative statement that cannot be executed conditionally (as it is being acted upon while the data step is compiled).

And if the length of a variable has already been set (eg because the variable is part of a dataset in the previous set statement), the length statement will have no effect.

 

The data step compiler creates a variable in the PDV as soon as the variable is encountered; in the case of your code that is when the set statement is compiled.

 

 

But this statement:

if length _all_ > $50 then length = $10;

is syntactically bogus and only leads to a ERROR message. The length() function (note the brackets) can only work on ONE variable at any time. And length = $10 is an invalid assigment.

 

Maxim 2: Read the log.

Uknown_user
Quartz | Level 8

Thanks, is it possible to format all of the variables in given dataset without looping one by one?

Kurt_Bremser
Super User

@Uknown_user wrote:

Thanks, is it possible to format all of the variables in given dataset without looping one by one?


Yes. Draw the metadata for the columns from DICTIONARY.COLUMNS (in SQL, or SASHELP.VCOLUMN for data steps).

EG

proc sql;
create table varnames as
select name
from dictionary.columns
where libname = 'YOURLIB' and memname = 'YOURDATA' and type = 'char' and length > 50;
quit;

Now you can determine the maximum used length for each variable in a dynamically created SQL:

data _null_;
call execute('proc sql noprint; select ');
do until(done);
  set varnames end=done;
  call execute('max(length(' !! trim(name) !! '))');
  if not done then call execute(',');
end;
call execute(' into ');
do until (done1);
  set varnames end=done1;
  call execute(':' !! trim(name));
  if not done1 then call execute(',');
end;
call execute('from YOURLIB.YOURDATA; quit;');
stop;
run;

You will now have a macro variable with the name of each column, that contains the max length for that column.

From that we create the final data step:

data _null_;
call execute('data want; set YOURLIB.YOURDATA (rename=(');
do until (done);
  set varnames end=done;
  call execute(trim(name) !! '=_' !! trim(name) !! ' ');
end;
call execute(')); length ');
do until (done);
  set varnames end=done;
  call execute(trim(name) !! ' $&' !! trim(name) !! '. ');
end;
call execute(';');
do until (done1);
  set varnames end=done1;
  call execute(trim(name) !! '=_' !! trim(name) !! ';');
end;
call execute('drop ');
do until (done2);
  set varnames end=done2;
  call execute('_' !! trim(name) !! ' ');
end;
call execute('; run;');
stop;
run;

Note that it might be necessary to use a macro quoting function to prevent premature resolution of the macro variables with the maximum lengths.

Since I have nothing to test against, this code was written entirely "on the fly" and is untested.

 

Edit: after testing with some fake data, added stop; statements and used multiple variables for the end= condition.

Should work now.

Kurt_Bremser
Super User

Note that I did some editing; macro quoting is not necessary, BTW.

See this as an exercise in the usage of call execute, and some advanced programming. Feel free to ask for aspects that puzzle you.

Uknown_user
Quartz | Level 8

Thanks again, I decided to use this code to format all char vars to its minimum len then run the procedure I want:

 

data first;
do i=1 to 10;
output;
end;
run;

data DEMO;
set first;
drop i;
length a $100 b 8. c $100;
a = '111';
b = 1111111;
c = 'aaaaaa';
run;

data tmp;
variable = 'afadfadsfa';
nvarnum = 2343242432;
run;

data demo;
merge demo tmp;
run;


proc sql noprint;
select name into: cvar1 - :cvar100000 from dictionary.columns where libname=upcase('WORK') and memname=upcase('demo') and type = 'char';
quit;

%macro help();

%do z = 1 %to &sqlobs.;

proc sql noprint;
select max(length(&&cvar&z)) into: maxlen from DEMO;
quit;

data _t_&&cvar&z (keep=&&cvar&z);
set demo (rename=(&&cvar&z = _&&cvar&z));
length &&cvar&z $&maxlen.;
&&cvar&z = _&&cvar&z;
drop _&&cvar&z;
run;

data demo;
set demo (drop=&&cvar&z);
run;

%end;

data fin;
merge _t_: demo;
run;

proc datasets noprint;
	delete _t_:;
run;

%mend help;
%help;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 2710 views
  • 4 likes
  • 3 in conversation