Help using Base SAS procedures

Shorting variable length

Reply
Frequent Contributor
Posts: 99

Shorting variable length

Hi—

I need to shorten the length of a bunch of variables at once, over 100. Most of them at up to 32 characters and I need to shorten them all by around 6 to 8

characters. So variables with a title like this:

Quality_of_FASP__Family_Child_As

I need to shorten to this

Quality_of_FASP__Family_

Any help is greatly aperciated.


Thanks!

PROC Star
Posts: 7,363

Re: Shorting variable length

Do you want to specify the new names or jjust cut the old names off at X characters?

Frequent Contributor
Posts: 99

Re: Shorting variable length

Cut off at X charters. Thanks! 

Respected Advisor
Posts: 3,124

Re: Shorting variable length

Well, what if you have extreme long tailing character , such as Quality_of_FASP__Family_arthurtabachneckhaikuobian, will you cut it even it is more than 8 characters?

Here is an your liberal approach, the conservative one can easily tweaked to:


data have;

a='Quality_of_FASP__Family_Child_As';

run;

data want;

set have;

do i=1 to countw(a,'_');

  b=scan(a,-i,'_');

  lc=lengthn(b)+1;

  lt+lc;

  if lt>=6 then leave;

end;

new_a=substr(a,1,lengthn(a)-lt);

call missing(lt);

keep a new_a;

run;

Haikuo

Frequent Contributor
Posts: 99

Re: Shorting variable length

Thanks Hai. kuo. Its not exatlly wat I am looking for though. I'm looking to shorten the actually variable names not the cell contents. Do you know how to do that?

Super User
Posts: 5,083

Re: Shorting variable length

You can pick the length by altering the 3rd parameter within SUBSTR.  Of course you have to adjust WORK and HAVE to reflect your data:

proc sql noprint;

   select name || '=' || substr(name,1,25) into : rename_list separated by ' '

   from dictionary.columns where libname='WORK' and memname='HAVE';

quit;

proc datasets library=work;

   modify have;

   rename &rename_list;

run;

Good luck.

Super User
Posts: 5,083

Re: Shorting variable length

Whoops, needs to be modified.  This will generate errors for any names not being truncated.  Add to the WHERE clause:

where libname='WORK' and memname='HAVE' and length(name) > 25;

Sorry about that.

Super Contributor
Posts: 474

Re: Shorting variable length

Whoops, needs to be modified.  This will generate errors for any names not being truncated.  Add to the WHERE clause:

where libname='WORK' and memname='HAVE' and length(name) > 25;

Sorry about that

.

Or simply replace the SUBSTR function with the less used SUBSTRN function which won't generate any errors.

More on SUBSTRN here:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002255112.htm

Cheers from Portugal.

Daniel Santos @ www.cgd.pt

Super User
Posts: 5,083

Re: Shorting variable length

Daniel,

Sorry, the issue is a little different.  I don't think SUBSTRN can solve it.

Within dictionary.columns, NAME has a length of 32 (although there may be trailing blanks).  There are always enough characters to read.  The problem comes about if the program were to generate this sort of statement:

rename whoops=whoops;

In that case, PROC DATASETS generates an error because the name whoops is already being used.  Maybe it should be this way, maybe not, but that's the result I'm getting.  I'm using 9.1 so perhaps things have changed in more up-to-date releases.

Super Contributor
Posts: 474

Re: Shorting variable length

Hmmm. OK, didn't saw that. You're absolutely right.

My mistake, sorry. SUBSTRN won't help there, and it's actually not needed, as I was forgetting the sufficient length of the INPUT var.

Cheers from Portugal.

Daniel Santos @ www.cgd.pt

Respected Advisor
Posts: 3,124

Re: Shorting variable length

Here is another try borrowing some ideas from Astounding's post:

data have;

Quality_of_FASP__Family_Child_As=1;

Quality_of_FASP_arthurmikemarkks=2;

run;

proc sql;

create table name as select distinct name from dictionary.columns where LIBNAME='WORK' AND MEMNAME='HAVE';QUIT;

data NAME;

set name;

do i=1 to countw(name,'_');

  b=scan(name,-i,'_');

  lc=lengthn(b)+1;

  lt+lc;

  if lt>=6 then leave;

end;

new_name=substr(name,1,lengthn(name)-lt);

call missing(lt);

keep name new_name;

run;

proc sql;

select cats(name,'=',new_name) into :rename separated by ' ' from name;quit;

data want;

set have;

rename &rename;

run;

proc print;run;

Haikuo

Respected Advisor
Posts: 3,777

Re: Shorting variable length

You might have a better result if you create a table of old an new names.  Instead of just truncating you could abbreviate words like "Quality" to "Qua" perhaps and Family.  Remove double underscores etc.  Then you can use that data set to gen a rename statement.

Of course you could do all the transformations to new names with SAS code PRX or TRNSTRN etc. but it might be just as easy to do it by hand.  You will only need it once.

New Contributor
Posts: 4

Re: Shorting variable length

This is not pretty but it works. I am a sucker for itterative macro loops. The new dataset will be the old dataset's name with an "s_" prefix in the work library. the function %shortnam() takes the first argument as the length for substrn() the second argument is the lib name the third is the data set name.

/*--BE SURE TO USE CAPITALS I DID NOT BOTHER WITH AN UPCASE FUNCTION*/

/*------------------------------------------------------------------------------------------*/

option symbolgen mlogic mprint;

%macro shortname(length,libname,dsn);

proc sql;

select count(*) into :tempcnt from dictionary.columns where libname="&libname" and memname="&dsn";

%let varcnt=%sysfunc(compress(&tempcnt)); /*there might be a better way to do this*/

select name into :name_1-:name_&varcnt from dictionary.columns where libname="&libname" and memname="&dsn";

select substrn(name,1,&length) into :s_name_1-:s_name_&varcnt from dictionary.columns where libname="&libname" and memname="&dsn";

select compress(format,'CHAR') into :format_1-:format_&varcnt from dictionary.columns where libname="&libname" and memname="&dsn";

quit;

data s_&dsn (keep= %do i=1 %to &varcnt; &&s_name_&i %endSmiley Wink;

set &libname..&dsn;

%do i=1 %to &varcnt;

format &&s_name_&i &&format_&i;

&&s_name_&i=&&name_&i ;

%end;

run;

%mend shortname;

%shortname(6,MYLIB,EGYPT_201107_201207)

Super Contributor
Posts: 474

Re: Shorting variable length

Hi John, good point, but not a very performance wise solution.

Your code implies duplicating columns and rewriting the whole dataset.

Astounding solution uses dataset procedure which only reads/writes the descriptor portion of the dataset.

I really think that's the best solution you can have. Short code and effective.

No need to rewrite the whole data portion of the file when you just want to edit the metadata of the table.

Cheers from Portugal.

Daniel Santos @ www.cgd.pt

Super User
Posts: 5,083

Re: Shorting variable length

One additional issue to think about:  what happens if two variable names get shortened, and become identical as a result?  The results vary, depending on the methods you choose.

Ask a Question
Discussion stats
  • 14 replies
  • 573 views
  • 0 likes
  • 7 in conversation