BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
xxformat_com
Barite | Level 11

Hi,

Is there a smart way of reducing the length of multiple character variables based on the maximal observed lengths?

 

Here is an example. The maximal observed length of variable sex is stored in a macro variable

i.e. we would as many variables as variable to update which is not really smart.

 

*prepare the data for the demo;
proc copy in=sashelp out=work; select class; run; proc sql; alter table class modify name char(20), sex char(10); quit;
*current solution;
proc sql noprint; select max(lengthn(sex)) into : chk from class; reset print; alter table class modify sex char(&chk.); quit; proc contents data=class varnum; run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
7 REPLIES 7
Tom
Super User Tom
Super User

Here is an updated version of that macro that uses more modern SAS functions such as CALL SYMPUTX() and CATS() to make the code simpler.

%macro change(dsn);
%local nchar ;
%let nchar=0;
data _null_;
  set &dsn end=done;
  array __C $1 _character_ __C2;
  array __L [5000] _temporary_;
  do i = 1 to dim(__C)-1;
    __L[i]=max(__L[i],length(__C[i]));
  end;
  if done then do;
    call symputx('nchar',dim(__C)-1);
    do i = 1 to dim(__C)-1;
      call symputx(cats('l',i)
         ,catx(' ',nliteral(vname(__C[i])),cats('$',__L[i]))
         ,'L');
    end;
  end;
run;
data &dsn._;
  length 
%do i = 1 %to &nchar;
  &&l&i
%end;
  ;
  set &dsn;
  format _character_ ;
run;
%mend;
data test;
  infile datalines truncover;
  input aa $3. bb :$15.;
datalines;
123 4567
;
options mprint;
%change(work.test)
565  options mprint;
566  %change(work.test)
MPRINT(CHANGE):   data _null_;
MPRINT(CHANGE):   set work.test end=done;
MPRINT(CHANGE):   array __C $1 _character_ __C2;
MPRINT(CHANGE):   array __L [5000] _temporary_;
MPRINT(CHANGE):   do i = 1 to dim(__C)-1;
MPRINT(CHANGE):   __L[i]=max(__L[i],length(__C[i]));
MPRINT(CHANGE):   end;
MPRINT(CHANGE):   if done then do;
MPRINT(CHANGE):   call symputx('nchar',dim(__C)-1);
MPRINT(CHANGE):   do i = 1 to dim(__C)-1;
MPRINT(CHANGE):   call symputx(cats('l',i),catx(' ',nliteral(vname(__C[i])),cats('$',__L[i])),'L');
MPRINT(CHANGE):   end;
MPRINT(CHANGE):   end;
MPRINT(CHANGE):   run;

NOTE: There were 1 observations read from the data set WORK.TEST.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds


MPRINT(CHANGE):   data work.test_;
MPRINT(CHANGE):   length aa $3 bb $4 ;
MPRINT(CHANGE):   set work.test;
MPRINT(CHANGE):   format _character_ ;
MPRINT(CHANGE):   run;

WARNING: Multiple lengths were specified for the variable bb by input data set(s). This can cause truncation of data.
NOTE: There were 1 observations read from the data set WORK.TEST.
NOTE: The data set WORK.TEST_ has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds
ballardw
Super User

I am not sure exactly what you are requesting.

Something like this?

proc sql noprint;
    select max(lengthn(sex)),max(lengthn(name)) into : schk ,: nchk
    from class;

    reset print;
    alter table class
        modify sex char(&schk.),
               name char(&nchk.);
quit;

I would also refer to documentation of your data sources. I have external file sources that provide data and the values are often not the defined length of the variable field BUT every so often there is a file with the maximum length value. A process like this turned loose on multiple data sets without consideration of such can result in different length variables which will cause as a minimum notes about different length variables and at worst lost of data from truncation.

 

If you are trying to get a list of the variables then look at Dictionary.Tables for your source data set to get the variables and use that to generate desired code.

xxformat_com
Barite | Level 11

Hi @ballardw,

 

I was looking for something like what was suggested by Tom.

But having said that, I don't find the solution ideal.

It is another macro to compile somewhere first.

 

Plus, we cannot use something like that:

 

proc sql;
select sex_new length=max(lengthn(sex))
from sashelp.class;
quit;

Tom
Super User Tom
Super User

@xxformat_com wrote:

Hi @ballardw,

 

I was looking for something like what was suggested by Tom.

But having said that, I don't find the solution ideal.

It is another macro to compile somewhere first.

 

Plus, we cannot use something like that:

 

proc sql;
select sex_new length=max(lengthn(sex))
from sashelp.class;
quit;


Not sure I understand what you want.

 

Your example SQL is not valid code.  The LENGTH= option needs an actual constant value.  So you will need to do some code generation to generate valid SQL syntax.

 

There are many ways to do code generation.  The macro language is just one.

xxformat_com
Barite | Level 11

Hi,

Yes I know it is not a valid code. My point is just to show that I'm looking for easy ways to set the minimum needed length without multiple steps.

 

Another example would be this one where it would be great to have a way to set the length 10 for lbtest without having to know the data:

 

proc format;
    value $lbtestcd 'HCT'     ='Hematocrit'
                    'WBC'     ='Leukocytes'
                    'HDLCCHOL'='HDL Cholesterol/Total Cholesterol';
run;

data lb;
    length lbtestcd $40;
    lbtestcd='HCT'; output;
    lbtestcd='WBC'; output;
run;

data lb;
    set lb;
    length lbtest $10;
    lbtest=put(lbtestcd,$lbtestcd.);
run;
Ksharp
Super User
/*原始数据的 路径 - the path of original sas dataset*/
libname in v9  "D:\XiaKeShan\数据管理\批量修改数据集的名称_LABEL_变量的名称_LABEL_长度2\原始数据"    access=readonly;

/*处理后数据的 路径 - the path of processed sas dataset*/
libname out v9  "D:\XiaKeShan\数据管理\批量修改数据集的名称_LABEL_变量的名称_LABEL_长度2\处理后的数据" ;



/*清空导出库的 数据集*/
proc datasets library=out kill nolist nodetails;
quit;
/*清空work库的 数据集*/
proc datasets library=work kill nolist nodetails;
quit;
/*将 数据集名 统一为 小写*/
proc copy in=in out=out memtype=data;
run;

/********************更改变量存储长度 **********************************/
%macro change_var_len;
/*计算所有数据集中字符变量的长度*/
data _null_;
 set sashelp.vcolumn(keep=libname memname name type where=(libname='OUT' and upcase(type)='CHAR')) end=last;
 by memname;
 if _n_=1 then call execute('proc sql;');
 if first.memname then call execute(catt('create table _',memname,' as select '));
 call execute(cat('max(length(',strip(name),')) as ',name));
 if not last.memname then call execute(',');
  else call execute(catt('from out.',memname,';'));
 if last then call execute('quit;');
run;
/*将上面产生的数据集转置 wide -> long*/
data _null_;
 set sashelp.vtable(keep=libname memname where=(libname='WORK' and memname =: '_'));
 call execute(catt('proc transpose data=',memname,' out=_',memname,';run;'));
run;
/*合并转置的数据集*/
data change_var_len;
 length _name_ $ 40;
 set __: indsname=indsname;
 dsn=indsname;
run;
/*计算变量在所有数据集中的 最长长度*/
proc sql;
create table change_var_len_1 as
 select *,substr(dsn,8) as new_dsn length=40,max(col1) as max_len
  from change_var_len
   where col1 is not missing
    group by _NAME_
     order by dsn;
quit;
/*按变量的最长长度 修改数据集中所有变量的长度*/
data _null_;
 set change_var_len_1 end=last;
 by dsn;
 if _n_=1 then call execute('proc sql;');
 if first.dsn then call execute(catt('alter table out.',new_dsn,' modify '));
 call execute(catt(_name_,' char(',max_len,')'));
 if not last.dsn then call execute(',');
  else call execute(catt(';'));
 if last then call execute('quit;');
run;
%mend;
%change_var_len

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2218 views
  • 1 like
  • 4 in conversation