BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
EinarRoed
Pyrite | Level 9

Imagine that you have 1 table with roughly 150 character variables.

 

You need to find the maximum value length that occurs within each variable.

 

What's a good way to determine this?

 

I'm mostly working in DI Studio, and don't want to manually type "max(length(varname))" 150 times. Would appreciate code suggestions.

1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

This is the code that I actually use in my resize program.  I would save the hash for a more complex task.

   /*-------------------------------------------------------------------------------------------------*
    | Generate code to measure each character variable.
    *-------------------------------------------------------------------------------------------------*/
   filename FT33F001 temp;
   data _null_;
      file FT33F001;
      set __contents__;
      where libname eq "&libname" and memname eq "&memname" and type eq 2;
      put +3 'Retain _LEN_' name '1;';
      put +3 '_LEN_' name '=max(_len_' name ',length(' name '));';
      run;
   /*-------------------------------------------------------------------------------------------------*
    | OPT_LENGTH
    *-------------------------------------------------------------------------------------------------*/
   data __SIZE__&memname(keep=libname memname name opt_length);
      set &libname..&memname(keep=_character_) end=eof INDSNAME=INDSNAME;
      %inc FT33F001;
      if eof then do;
         length libname $8 memname $32 name $32;
         libname=scan(indsname,1,'.');
         memname=scan(indsname,2,'.');
         array _LEN _len_:;
         do _n_ = 1 to dim(_len);
            name = substrn(vname(_len[_n_]),6);
            opt_length = _len[_N_];
            output;
            end;
         stop;
         end;

 

A hash of character variables constructed from an array of all _CHARACTER_ variables can be used to keep a running MAX length that is output at the end.

 

%let file=sashelp.heart;

/*------------------------------------------------------------------------*
 | data set _OPTLEN will have one obs for each character variable with
 | name and length as max length of each.
 *------------------------------------------------------------------------*/
data _null_;
   if 0 then set &file(keep=_character_);
   array _c[*] _character_;
   declare hash h();
   h.definekey('_n_');
   length _memname_ _name_ $32 _l_ 8;
   retain _memname_ "%scan(&file,-1,.)";
   h.definedata('_memname_','_name_','_l_');
   h.definedone();
   _l_=0;
   do _n_ = 1 to dim(_c);
      _name_ = vname(_c[_n_]);
      _rc_ = h.add();
      end;
   do until(eof);
      set &file(keep=_character_) end=eof;
      do _n_ = 1 to dim(_c);
         _rc_ = h.find();
         _l_ = _l_ max length(_c[_n_]);
         _rc_ = h.replace();
         end;
      end;
   length _dataset_ $256;
   _dataset_ = cats('_optlen','(rename=(_memname_=memname _name_=name _l_=length))');
   h.output(dataset:strip(_dataset_));
   stop;
   run;
proc print data=_optlen;
   run;

Capture.PNG

View solution in original post

5 REPLIES 5
LinusH
Tourmaline | Level 20

Not knowing what kind of data you are talking about, but is this i DW data, 150 variables is just too much in one table.

By transposing/normalizing the data, you get much better control of it, and operations like this is much simpler to develop/maintain.

 

Is this an ad-hoc thing that you want to use as a basis for your DI Studio/ETL design, or do you wish to incorporate this logic into your job? If the later - how do you intend to use it?

Data never sleeps
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Am with @LinusH, working with lots of variables is not the best idea.  The simple way to think about tables for Programming purpose is that variables are Structure - this should be minimal, not changing.  Rows are data, these can expand and shrink as needed.

Once you get this concept, and also that the data you display in output reports is not necessarily the same as the data you program with, you will find your problem is a very simple one:

Data looks like:

<other vars>   VAR_NAME        VAR_TEXT

...                  CHARACTER1   ABC

...                  CHARACTER2   DEFTY

...

So your original columns are now rows, then:

proc sql;
  create table LENGTHS as
  select VAR_NAME,
            max(lengthn(VAR_TEXT)) as MAX_LEN
  from   HAVE
  group by VAR_NAME;
quit;
data_null__
Jade | Level 19

This is the code that I actually use in my resize program.  I would save the hash for a more complex task.

   /*-------------------------------------------------------------------------------------------------*
    | Generate code to measure each character variable.
    *-------------------------------------------------------------------------------------------------*/
   filename FT33F001 temp;
   data _null_;
      file FT33F001;
      set __contents__;
      where libname eq "&libname" and memname eq "&memname" and type eq 2;
      put +3 'Retain _LEN_' name '1;';
      put +3 '_LEN_' name '=max(_len_' name ',length(' name '));';
      run;
   /*-------------------------------------------------------------------------------------------------*
    | OPT_LENGTH
    *-------------------------------------------------------------------------------------------------*/
   data __SIZE__&memname(keep=libname memname name opt_length);
      set &libname..&memname(keep=_character_) end=eof INDSNAME=INDSNAME;
      %inc FT33F001;
      if eof then do;
         length libname $8 memname $32 name $32;
         libname=scan(indsname,1,'.');
         memname=scan(indsname,2,'.');
         array _LEN _len_:;
         do _n_ = 1 to dim(_len);
            name = substrn(vname(_len[_n_]),6);
            opt_length = _len[_N_];
            output;
            end;
         stop;
         end;

 

A hash of character variables constructed from an array of all _CHARACTER_ variables can be used to keep a running MAX length that is output at the end.

 

%let file=sashelp.heart;

/*------------------------------------------------------------------------*
 | data set _OPTLEN will have one obs for each character variable with
 | name and length as max length of each.
 *------------------------------------------------------------------------*/
data _null_;
   if 0 then set &file(keep=_character_);
   array _c[*] _character_;
   declare hash h();
   h.definekey('_n_');
   length _memname_ _name_ $32 _l_ 8;
   retain _memname_ "%scan(&file,-1,.)";
   h.definedata('_memname_','_name_','_l_');
   h.definedone();
   _l_=0;
   do _n_ = 1 to dim(_c);
      _name_ = vname(_c[_n_]);
      _rc_ = h.add();
      end;
   do until(eof);
      set &file(keep=_character_) end=eof;
      do _n_ = 1 to dim(_c);
         _rc_ = h.find();
         _l_ = _l_ max length(_c[_n_]);
         _rc_ = h.replace();
         end;
      end;
   length _dataset_ $256;
   _dataset_ = cats('_optlen','(rename=(_memname_=memname _name_=name _l_=length))');
   h.output(dataset:strip(_dataset_));
   stop;
   run;
proc print data=_optlen;
   run;

Capture.PNG

EinarRoed
Pyrite | Level 9

That worked perfectly, @data_null__! Will store it for future use. Appreciate it. Smiley Happy

Ksharp
Super User
data _null_;
 set sashelp.vcolumn
    (keep=libname memname name type
     where=(libname='SASHELP' and memname='HEART' and type='char')) end=last;
if _n_ eq 1 then call execute('proc sql;create table temp as select ');
call execute(cat('max(length(',name,')) as ',name));
if last then call execute('from sashelp.heart;quit;');   
 else call execute(',');
run;
proc transpose data=temp out=want name=name prefix=length;
run;

x.png

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 10453 views
  • 7 likes
  • 5 in conversation