DATA Step, Macro, Functions and more

Determining max and min length of variables

Reply
Contributor
Posts: 39

Determining max and min length of variables

[ Edited ]

I need some help determining the maximum and minimum length of Character and Numeric variables. Right now, the program never ends, and it seems the program is in an endless loop, with the log populating with the same note n number of times. Is there another efficient way of achieving this?

 

Code:

 

%let len= %sysfunc(length(Variable));

 

data _null_;

set Libref.TableA;

 

If vtype(Variable)="C" then do;

   call execute ("proc sql;

                          Create table Max_Min_Length as select Max(length(Variable)) as MxL, Min(length(Variable)) as MnL

                             from Libref.TableA;

                                      quit;");

 end;

 

Else If vtype(Variable)="N" then do;

  call execute ("proc sql;

                           Create table Max_Min_Length as select Max(length(put(Variable, &len..))) as MxL,

                              Min(length(put(Variable, &len..)) as MnL from Libref.TableA;

                                  quit;");

end;

run;

 

Respected Advisor
Posts: 4,173

Re: Determining max and min length of variables

What you want to do only makes sense for character variables. Numeric variables use 8 Bytes for storage. How long the printed string is depends on the format assigned. 

 

Here a code version of how to do this.

data inter;
  set sashelp.class;
  array chars _character_;
  length __varname $32;
  do over chars;
    __varname=vname(chars);
    __length=lengthn(chars);
    output;
  end;
run;

proc sql;
  create table want as
  select __varname as varname, max(__length) as max_length, min(__length) as min_length
  from inter
  group by varname
  ;
quit;
Contributor
Posts: 39

Re: Determining max and min length of variables

Hello Patrick,

Does that mean the max and min length of numeric variables will be same?
Respected Advisor
Posts: 4,173

Re: Determining max and min length of variables

@zoomzoom

It means that using a character function for a numeric variable is not the right thing to do. 

 

If you do so then you will get a NOTE like below which is something you should always try to avoid.

NOTE: Numeric values have been converted to character values at the places given by: (Line)Smiley SadColumn).

 

And yes, max and min would be the same - question is what's the meaning of such a max and min for numeric variables. 

Contributor
Posts: 39

Re: Determining max and min length of variables

Thank you, Patrick.


So I have dropped the idea of length check for numeric, and kept the code only for character.


Now, when I am trying to limit the input variable to character, it is not working properly. As the variables come in from different tables, in case of numeric ones, I get three columns in the output, i.e. max, min and a a blank one in the name of the numeric variable. vtype(var) is not limiting variables to char.


Code:

data _null_;

set Libref.TableA;



If vtype(Variable)="C" then do;

call execute ("proc sql;

Create table Max_Min_Length as select Max(length(Variable)) as MxL, Min(length(Variable)) as MnL

from Libref.TableA;

quit;");

end;

run;
Respected Advisor
Posts: 4,173

Re: Determining max and min length of variables

Your code as posted won't return what you're after.

 

Why can't you use the code version I've posted?

Contributor
Posts: 39

Re: Determining max and min length of variables

I am trying to understand how do I restrict input variable to only Character variable, in your code.
Respected Advisor
Posts: 4,173

Re: Determining max and min length of variables

[ Edited ]

@zoomzoom

The code as posted already restricts "itself" to character variables only via the following statement:

array chars _character_;

 

Contributor
Posts: 39

Re: Determining max and min length of variables

@Patrick

I understand that the code restricts to char. I was not clear in my answer. It seemed that the code will pull all character variables from the source table. In my case, I have multiple tables with one or two character variables from those tables.

I did something else, I am now selecting based on the 'type' from dictionary columns, after I have put in the table names and column names in a temporary table.
Respected Advisor
Posts: 4,173

Re: Determining max and min length of variables

@zoomzoom

So pretty much what @AskoLötjönen suggested. Sounds like a good solution and one I only didn't suggest because I wasn't sure on which level of SAS coding you are.

Contributor
Posts: 39

Re: Determining max and min length of variables

@Patrick,

I am back to programming after a long break, and it feels rusty!!

One thing I still cannot resolve is the note you have mentioned, i.e. "Character values have been converted to numeric values at the places.."
Contributor
Posts: 44

Re: Determining max and min length of variables

This makes a new dataset with min and max lengths of character variables. You can modify it by adding libname and memname into classvars datasets so then you dont need to "hard code" from-table.

 

create table classvars as
  select name ,
         type
  from dictionary.columns
  where LIBNAME = 'SASHELP'
    and memname = 'CLASS'
    and type = 'char';
quit;

data _null_  ;
  set classvars end=last;
  if _n_ = 1 then Call execute ('proc sql; create table varlens as select ');
  call execute('Max(length('||name||')) as '||strip(name)||'_max , Min(length('||name||')) as '||strip(name)||'_min' );
  if last then call execute ('from sashelp.class; quit');
  else call execute (',');
run;
 

Valued Guide
Posts: 765

Re: Determining max and min length of variables

[ Edited ]

Hi, here's another solution (examines SASHELP.HEART, a heart exam).  Since you need a data step to get the variable lengths in ecah observation, use the same data step to determine the MIN and MAX lengths ...

 

proc sql noprint;
select count(*) into :nc trimmed from sashelp.vcolumn
where libname='SASHELP' and memname='HEART' and type='char';
quit;


data minmax (keep=var l_min l_max);
array maxx(&nc);
array minn(&nc);

do until(last);
  set sashelp.heart end=last;
  array ch(&nc) _character_;  
do j=1 to &nc;
  maxx(j) = max(maxx(j),lengthn(ch(j)));
  minn(j) = min(minn(j),lengthn(ch(j)));
end;
end;
length var $32;
do j=1 to &nc;
  var   = vname(ch(j));
  l_min = minn(j);
  l_max = maxx(j);
  output;
end;
run;

 

data set MINMAX ..

Obs    var               l_min    l_max

 1     Status              4         5
 2     DeathCause          0        25
 3     Sex                 4         6
 4     Chol_Status         0        10
 5     BP_Status           4         7
 6     Weight_Status       0        11
 7     Smoking_Status      0        17

Ask a Question
Discussion stats
  • 12 replies
  • 3565 views
  • 6 likes
  • 4 in conversation