Hi guys,
I have an Excel file that has different worksheets (one sheet for each month). On every worksheet are different "column-variablenames" e.g. " in bn. $" (with leading space and dollar sign) , "car manufacturer" and "number of cars sold in tsd/1000".
It looks like that:
in bn. $ | car manufacturer | number of cars sold in tsd |
13 | audi | 2133 |
12 | vauxhall | 3121 |
16 | toyota | 13131 |
... | ... | ... |
I imported the excel data into SAS using the libname xlsx command. My problem is, that i can't change the column variables that SAS has assigned automatically. Especially the " in bn. $" (with leading space and $ sign) causes huge problems. Is there any way to delete all the headings and replace them with A, B and C for example. The colum names are not necessarily identical in the other months (sometimes its also "number of cars in 1000")
I would be very grateful for your help.
Thank you very much
If you are getting variable names like that from the column headers in your Excel sheet then you must be using this SAS option:
options validvarname=ANY ;
In that case you will need to use NAME literals to rename the variables whose names are not valid SAS names.
rename " in bn. $"n=A ;
If you want to automate renaming the variables you could try something like this to generate OLD=NEW name pairs that you could use in a RENAME statement or RENAME= dataset option.
proc sql noprint;
select catx('=',nliteral(name),cats('VAR',varnum))
into :renames separated by ' '
from dictionary.columns
where libname='MYEXCEL' and memname='MYSHEET'
;
quit;
data mysheet;
set myexcel.mysheet(rename=(&renames));
run;
While you are at it you might want to attach those original names as the labels for the variables.
proc sql noprint;
select catx('=',nliteral(name),cats('VAR',varnum))
, catx('=',nliteral(name),quote(strip(name)))
into :renames separated by ' '
, :labels separated by ' '
from dictionary.columns
where libname='MYEXCEL' and memname='MYSHEET'
;
quit;
data mysheet;
set myexcel.mysheet;
rename &renames;
label &labels;
run;
If you are getting variable names like that from the column headers in your Excel sheet then you must be using this SAS option:
options validvarname=ANY ;
In that case you will need to use NAME literals to rename the variables whose names are not valid SAS names.
rename " in bn. $"n=A ;
If you want to automate renaming the variables you could try something like this to generate OLD=NEW name pairs that you could use in a RENAME statement or RENAME= dataset option.
proc sql noprint;
select catx('=',nliteral(name),cats('VAR',varnum))
into :renames separated by ' '
from dictionary.columns
where libname='MYEXCEL' and memname='MYSHEET'
;
quit;
data mysheet;
set myexcel.mysheet(rename=(&renames));
run;
While you are at it you might want to attach those original names as the labels for the variables.
proc sql noprint;
select catx('=',nliteral(name),cats('VAR',varnum))
, catx('=',nliteral(name),quote(strip(name)))
into :renames separated by ' '
, :labels separated by ' '
from dictionary.columns
where libname='MYEXCEL' and memname='MYSHEET'
;
quit;
data mysheet;
set myexcel.mysheet;
rename &renames;
label &labels;
run;
An option that @Tom missed would be make sure that when using Proc Import that you use
options validvarname=V7;
The V7 refers to SAS version 7 where the longer variable names were acceptable. That option will turn all non-letter and non-digit characters into underscore and reduce the complexity of the rename code removing the need for the NLITERAL bits, as @PaigeMiller's example shows.
Perhaps a more important question, at least in my line of work, is why each of these sheets has a different set of column names at all if this is all related to a single project.
Awesome.
proc sql noprint;
select catx('=',nliteral(name),cats('VAR',varnum))
into :renames separated by ' '
from dictionary.columns
where libname='MYEXCEL' and memname='MYSHEET'
;
quit;
data mysheet;
set myexcel.mysheet(rename=(&renames));
run;
That's exactly what i wanted. Thanks a lot 🙂
My problem is, that i can't change the column variables that SAS has assigned automatically.
I would have to disagree. You can name these columns anything you want. Example:
proc import datafile="example1.xlsx" out=a(rename=(_in_bn___=billion_dollars));
run;
options validvarname=any validmemname=extend;
libname x xlsx 'c:\temp\text.xlsx';
proc datasets library=work kill;
quit;
proc copy in=x out=work noclone;
run;
%macro rename(dsn=);
proc transpose data="&dsn."n(obs=0) out=_temp_;
var _all_;
run;
data _temp_;
set _temp_;
new_name=cats('_',compress(_name_,,'kad'));
run;
proc sql noprint;
select catx('=',nliteral(_name_),new_name) into : rename separated by ' '
from _temp_;
quit;
proc datasets library=work nolist nodetails;
modify "&dsn."n;
rename &rename.;
quit;
%mend;
data _null_;
set sashelp.vtable(where=(libname='X'));
call execute(catt('%nrstr(%rename)(dsn=',memname,')'));
run;
libname x clear;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.